The Legal Landscape: Balancing Act in the Digital Realm

A pivotal moment in the legal sphere occurred in 2015, when a U.S. appeals court ruled that Google's scanning of millions of books for Google Books, which displays only limited excerpts of copyrighted content, constituted "fair use." The court found that the scanning of these books is highly transformative, the public display of the text is limited, and the display is not a market substitute for the original. Generative AI, however, transcends these boundaries, delving into uncharted territory where legal frameworks struggle to keep pace. Lawsuits have emerged, raising pertinent questions about compensating content creators whose work fuels the algorithms of Large Language Model (LLM) producers. OpenAI, Microsoft, GitHub, and Meta have found themselves entangled in legal disputes, especially concerning the reproduction of computer code from copyrighted open-source software. Content creators on social platforms already monetize their content, and this legal conundrum underscores the urgency of defining the scope of fair use in the realm of Generative AI.

Privacy Predicaments: no-llm-index

The nuances of generative algorithms, coupled with the vastness of the data they process, create a potential minefield of privacy breaches. A quick look at Meta's paper, LLaMA: Open and Efficient Foundation Language Models, shows that the bulk of the training data for LLMs is a mix of CommonCrawl, Wikipedia, StackExchange, and other publicly sourced data. Are LLMs compliant with Requests to Delete under the California Consumer Privacy Act ("CCPA") or the GDPR's "right to erasure"? As Generative AI generates content, the question of how to ethically source and utilize data becomes paramount. The noindex rule, set either with a meta tag or an HTTP response header, asks search engines to drop a page from their indexes. Perhaps a similar option (no-llm-index) should be available for content creators to opt out of LLM processing. Striking a balance between technological innovation and privacy safeguards is pivotal to fostering public trust in these systems.
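As a sketch, the existing noindex rule and the proposed opt-out might sit side by side in a page's markup like this. The noindex meta tag and the X-Robots-Tag header are real, documented mechanisms honored by major search engines; the no-llm-index tag is the hypothetical proposal made above, not a recognized standard, and no crawler currently honors that name:

```html
<!-- Standard noindex rule: asks search engines to drop this page
     from their indexes. The same directive can instead be sent as
     an HTTP response header:  X-Robots-Tag: noindex -->
<meta name="robots" content="noindex">

<!-- Hypothetical analog for LLMs (the article's proposal, not a
     recognized standard): would ask LLM training crawlers not to
     ingest this page -->
<meta name="no-llm-index" content="true">
```

A partial, voluntary version of this opt-out already exists at the crawler level: some operators document dedicated training-crawler user agents (for example, OpenAI's GPTBot) that site owners can disallow in robots.txt, though this relies entirely on the crawler choosing to comply.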

Unraveling Ethical Quandaries

Biases embedded in algorithms and the data they learn from can perpetuate societal inequalities. Recognizing and mitigating these biases are imperative steps toward ethical AI deployment.

Developing ethical software should not be discretionary, but mandatory.

The responsibility falls not only on developers but also on policymakers and organizations to ensure the fair and unbiased use of Generative AI. Mistral recently released its 7B model under the Apache 2.0 license; however, in the absence of explicit constraints, there is concern about the potential for misuse of such models.

Who is liable?

Large Language Models have undergone training on extensive datasets, the quality of which can vary substantially. This variance means that these models can generate information that is inaccurate or misleading. The repercussions of such inaccuracies extend beyond the virtual realm and can significantly impact the real world. For instance, Alphabet shares plummeted after Google's Bard chatbot incorrectly claimed that the James Webb Space Telescope had captured the world's first images of a planet outside our solar system. The terms of service for generative AI tools neither guarantee accuracy nor assume liability, relying instead on user discretion. According to a Pew Research Center report, many users of these services are already using them to learn something new or for tasks at work, and they may not be equipped to differentiate between credible and hallucinated content. The application landscape of these models is continuously evolving, with some already driving solutions that involve substantial decision-making. In the event of an error, the question of liability arises: should the responsibility fall on the provider of the LLM itself, the entity offering value-added services built on the LLM, or the user for a potential lack of discernment?

Navigating the Future

In this transformative era, a harmonious coexistence between technology and legal frameworks is imperative. Everyone who has used a smartphone for a while has come to live with the fact that applications crash; as reported by AppDynamics, applications experience a 2% crash rate. Expectations regarding Large Language Models, however, are still being recalibrated. Unlike app crashes, which are tangible events, determining when an AI model breaks down or hallucinates is considerably more challenging because of the abstract nature of these failures.

As Generative AI continues to push the boundaries of innovation, the intersection of legal, ethical, and technological realms beckons comprehensive frameworks. Striking a delicate balance between fostering innovation and preserving fundamental rights is the clarion call for policymakers, technologists, and society at large. Courts and governments play a pivotal role in settling disputes, defining fair use, and safeguarding privacy. China's National Information Security Standardization Technical Committee has already released a draft document, on October 11, 2023, proposing detailed rules for addressing the issues associated with generative AI. Simultaneously, organizations must proactively devise internal policies that uphold ethical standards, ensuring responsible AI development and deployment. Only through collaborative efforts can we pave the way for a future where Generative AI thrives ethically, respecting privacy, ownership, and the principles of fairness.

Sources of Article

Amit Verma
