Wednesday May 06, 2026
"We are updating legal frameworks on personal data protection and cyber security, continuing to invest in digital public infrastructure and strengthening institutions that safeguard public trust while encouraging innovation." – President Anura Kumara Dissanayake, Artificial Intelligence (AI) Impact Summit 2026
The above is an important observation by the President of Sri Lanka. Governments across the globe are keen to adopt AI systems to promote innovation and economic growth. Sri Lanka’s draft National AI Strategy emphasises the need to integrate AI into public services and businesses to accelerate development. In this context, the laws that affect AI deployment become more relevant than ever. To enable efficient and responsible AI adoption, regulating the various “AI layers”, for instance the technical/algorithmic layer or the application/use-case layer, is essential. But the first question concerns the foundational, or data, layer: what data will power these systems, and how will that data be governed?
AI systems do not operate in a vacuum. They learn from vast quantities of data—text, images, documents, and digital records collected from across the internet and other databases. The quality, accessibility, and legality of that data will be critical in determining whether AI adoption is responsible and sustainable.
The personal data behind AI
Large language models (LLMs) are trained on data scraped from the internet, which may include personally identifiable information. Copious amounts of personal data are publicly available online. A common assumption is that because much of the data used to train AI models comes from “publicly available” sources, it is exempt from personal data protection laws. However, the fact that information is publicly accessible does not mean it can be used freely, without constraints.
The applicable law in Sri Lanka, the Personal Data Protection Act, No. 9 of 2022 (as amended) (PDPA), does not exempt “publicly available data”, or data obtained from public sources, from its application. In principle, individuals should be informed when their personal data is collected or used. The PDPA also requires this when personal data has been collected by means “other than direct interaction”. This raises an obvious challenge in the context of AI development: training LLMs involves collecting information from millions, or even billions, of webpages, and providing individual notice in such circumstances may be practically impossible. The PDPA does contain a provision under which an exemption from the notice requirement can be claimed, where fulfilling it would involve “disproportionate effort”. This in itself is not sufficient, and in the author’s view, regulatory guidance is required on how the PDPA applies to web scraping and other large-scale data collection. This would not only bring legal certainty but also reassure data subjects about how their personal data is processed.
This challenge is not unique to Sri Lanka. The European Commission is considering simplifying the General Data Protection Regulation through its Digital Omnibus proposal, which provides that AI developers may rely on “legitimate interests” as the legal basis for AI development, provided certain safeguards are met and there is an unconditional right to opt out. It is important to note that Sri Lanka’s PDPA also allows processing on the ground of “legitimate interests”. Similarly, it has been reported that Japan is considering a bill to revise its personal information protection law to promote AI development by easing regulations on the acquisition of personal data; the proposed bill would eliminate the consent requirement when training AI on certain types of personal information.
Copyright and AI training
Leaving aside the issue of AI authorship (whether a work created by AI can be granted authorship under copyright law), the concerns arising from the use of copyrighted material to train LLMs warrant consideration. In recent years, legal challenges have arisen in other jurisdictions over datasets containing copyrighted material used without consent to train AI models. No such cases have yet arisen in Sri Lanka, but they may in future, for example over the use of copyrighted Sinhala or Tamil language material to train AI models.
Under the Intellectual Property Act, No. 36 of 2003 (as amended) (“IP Act”) in Sri Lanka, an author’s copyright is protected during the author’s lifetime and for a further period of seventy years from the date of his or her death. A key question is whether the use of copyrighted material to train LLMs could fall within the scope of “fair use” under section 11 of the IP Act. That provision sets out a widely worded four-part test: reproduction of a work for purposes such as research is not an infringement of copyright, and in deciding whether a use is “fair”, the factors considered include whether the work is used for commercial or non-profit educational purposes, the portion of the work used in relation to the whole copyrighted work, and the effect of the use on the potential market value of the work. While the “fair use” exemption is widely worded, there is no certainty as to how it would apply in the context of AI training; judicial interpretation would be required for certainty.
It is recommended that a regime through which copyrighted material can lawfully be used, such as licensing, be considered. This is also essential to encourage homegrown AI models.
A fragmented data governance landscape
The above issues point to a broader challenge. The governance of data (both personal and non-personal) in Sri Lanka is currently shaped by laws, policies, and draft frameworks developed at different times and for different purposes.
For instance, the proposed National Data Sharing Policy has been in draft form for more than a decade. Similarly, a draft Information Classification Policy and an accompanying framework for information classification, which were previously available online, remain unadopted years later.
Sri Lanka does not lack policy frameworks. What is needed is a review of existing data governance policies and procedures, including how data is stored and structured. This is critical to leveraging the benefits of AI, as AI systems require machine-readable data.
Sri Lanka’s draft National AI Strategy recognises the importance of developing a data governance framework. This is a crucial step. AI innovation cannot succeed without a reliable data infrastructure, clear rules, and strong public trust.
Before shifting focus to algorithms and applications, policymakers need to address the foundational data layer: how data is collected, shared, structured, and protected.
A conversation that must begin now
Sri Lanka is prioritising the deployment of AI across various sectors, including digital public infrastructure. The use of AI presents many opportunities, but they can only be realised if the country develops appropriate governance frameworks to support them. This conversation must, at its foundation, begin with data.
A coherent national approach to data governance will not only support AI development but also strengthen public trust.
The above is based on the comprehensive study titled “Data Governance Framework in Sri Lanka”. The full report is available on lirneasia.net.
(The author is a Research Fellow at LIRNEasia and a legal consultant specialising in Technology, Media, and Telecommunications Law.)