The Influence of Legal Action and Regulation on AI in 2024

Several lawsuits are poised to influence AI development in 2024. Here’s what to anticipate.

The initial excitement surrounding artificial intelligence is waning, as the implications of training large language models (LLMs) become more apparent. The use of personal data in LLM training has intensified existing concerns over data privacy, which were already highlighted by regulations like the EU’s GDPR and California’s CCPA, aimed at regulating data usage in business.

Moreover, the practice of freely using public web data to build AI-based products and services has raised significant copyright and data ownership issues. This led to multiple legal disputes in 2023, including a copyright infringement case against OpenAI over the alleged use of 300,000 books in its model training, a lawsuit against Stability AI over the use of images and metadata owned by Getty, and a case against GitHub’s Copilot over the reproduction of code, among others.

Some of the unresolved questions arising from these cases may find clarity in court rulings as early as 2024. Lawsuits targeting AI companies and web data collection providers are expected to shed light on these contentious areas.

The resulting changes in regulation, influenced by both case law and public scrutiny, will undoubtedly shape the future development of AI and automated data gathering practices. Already, certain trends and trajectories are emerging in response to these developments.

What Are the Main Legal and Ethical Questions Facing AI?

Ongoing legal disputes in the realm of web data collection bring to light a host of complex issues, primarily centered around data ownership. At the core of these battles lies the fundamental question of who has the rightful claim to the data. Social media platforms contend that since users willingly share their information on these platforms, it is incumbent upon the platforms to safeguard this data from unauthorized collection and resale. Conversely, scraping services argue that publicly available data is fair game for all, asserting that it is the users themselves who should have the ultimate authority over the collection and utilization of their data.

However, this debate is limited to data that is publicly accessible without requiring user authentication, leading to the next point of contention. Accessing data through user accounts typically entails agreeing to the platform’s terms and conditions, which often explicitly prohibit bot activity and scraping. Social media companies maintain that breaching these accepted terms and conditions constitutes a violation, which has historically served as a key argument against scraping activities.

This, in turn, raises the question of whether social media platforms are justified in restricting access to data in this manner, given that they do not own the data their users generate. Users themselves are divided on whether these platforms are safeguarding their data or appropriating it by limiting public access.

Furthermore, there are specific concerns surrounding the data of minors, as highlighted by a proposed class action lawsuit against a data provider in Israel, where the sale of minors’ data is explicitly prohibited.

Given that the training of large language models (LLMs) and AI development heavily rely on scraped data, AI faces similar legal challenges. These issues revolve around data privacy, ownership, and the question of whether data creators deserve compensation for their contributions to training machine learning algorithms that subsequently generate other products.

Will AI Development Halt in 2024?

In March 2023, more than 1,000 tech leaders and researchers signed an open letter calling for a pause of at least six months on the training of the most powerful AI systems, citing concerns over the lack of regulation and the many uncertainties surrounding how AI works and advances. While the legal and ethical issues discussed earlier may appear to answer that appeal to some extent, a complete halt to AI development in the near future is highly improbable.

While legal battles might momentarily impede the progression of generative AI and machine learning-based tools, comprehensive regulation should ultimately provide a roadmap for the field, fostering focused advancement.

Furthermore, there are emerging techniques in the AI domain that offer promising technological breakthroughs. Among these are federated machine learning and causal AI, which have the potential to propel AI beyond superficially intelligent generative systems.

Federated learning is a framework for training machine learning models without centralizing users’ personal data: each device or organization trains on its own data and shares only model updates. This helps address concerns surrounding both data privacy and the fragmentation of data.
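To make the idea concrete, here is a minimal sketch of one common approach, federated averaging, written in plain Python. It is an illustration under stated assumptions rather than a description of any production system: the three clients, their synthetic data (generated from y = 3x + 1), the learning rate, and every function name are hypothetical. The point to notice is that the server only ever sees each client's model parameters, never the underlying data points.

```python
import random

def local_update(w, b, data, lr=0.05, epochs=20):
    """Run gradient descent on ONE client's private data, starting from the
    current global model, and return only the updated parameters."""
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y      # prediction error for a linear model
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def federated_average(updates, sizes):
    """Server step: average client parameters, weighted by dataset size."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b

def make_client_data(rng, n, lo, hi):
    """Hypothetical private dataset: y = 3x + 1 plus a little noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

def train(rounds=50, seed=0):
    rng = random.Random(seed)
    # Each client holds a different slice of the data; slices are never pooled.
    clients = [
        make_client_data(rng, 30, -1, 0),
        make_client_data(rng, 50, 0, 1),
        make_client_data(rng, 20, -1, 1),
    ]
    w, b = 0.0, 0.0  # global model kept by the server
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in clients]
        w, b = federated_average(updates, [len(d) for d in clients])
    return w, b

if __name__ == "__main__":
    print(train())  # parameters should land close to w = 3, b = 1
```

Real deployments typically add protections this sketch omits, such as secure aggregation or added noise, because model updates can themselves leak information about the data they were trained on.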

Causal AI, meanwhile, holds promise in addressing the problem of predictive models producing aberrant outcomes because they fail to capture causal relationships. Unlike generative AI, which often conflates correlation with causation, causal AI operates more like human reasoning, exploring “what if” scenarios and investigating potential cause-and-effect relationships. This could lead to more reliable AI systems.
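The gap between correlation and causation can be shown with a toy simulation. The sketch below is purely illustrative and does not represent any particular causal AI product: the hidden confounder, the probabilities, and the effect sizes are all invented for the example. A hidden factor drives both the "treatment" and the outcome, so a naive comparison of treated and untreated groups overstates the treatment's effect. Asking the "what if" question directly, by assigning treatment with a coin flip, recovers something close to the true effect.

```python
import random

TRUE_EFFECT = 1.0  # assumed causal effect of the treatment on the outcome

def estimate_effect(n, intervene, seed=0):
    """Return mean(outcome | treated) - mean(outcome | untreated).

    intervene=False: observational data, where a hidden confounder
                     influences who gets treated.
    intervene=True:  "what if" data, where treatment is set by a coin flip,
                     independent of the confounder.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        z = rng.random() < 0.5                      # hidden confounder
        if intervene:
            x = rng.random() < 0.5                  # treatment forced at random
        else:
            x = rng.random() < (0.9 if z else 0.1)  # confounder drives treatment
        # Outcome depends on the treatment AND (more strongly) on the confounder.
        y = TRUE_EFFECT * x + 5.0 * z + rng.gauss(0, 1)
        (treated if x else untreated).append(y)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

if __name__ == "__main__":
    print("observational estimate: ", estimate_effect(20000, intervene=False))
    print("interventional estimate:", estimate_effect(20000, intervene=True))
```

In this setup the observational estimate comes out several times larger than the assumed true effect, while the interventional estimate stays near it. A purely correlational model trained on the observational data would inherit the inflated figure.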

These advancements pave the way for the successful deployment of such models across a wide array of applications, such as analyzing health data for diagnosis and prevention. They underscore the notion that AI continues to evolve and innovate, even as regulatory frameworks undergo expansion.

Is General Web Scraping Regulation Coming?

In response to the challenges of personal data protection in an increasingly digital society, the EU introduced the General Data Protection Regulation (GDPR) as a comprehensive framework. After years of development and in light of the emergence of generative AI models in 2022, the EU appears to have finalized the foundational rules for its first comprehensive AI law as 2023 draws to a close.

This legislation will categorize AI models based on their risk levels, prohibiting those deemed unacceptable and mandating transparency from tools like ChatGPT. In the U.S., President Joe Biden issued an executive order outlining new standards for AI safety and innovation, potentially shaping the policy trajectory for federal agencies on AI.

However, the question remains: will this pave the way for a general regulation on web scraping? Such regulation is unlikely anytime soon, and especially not in 2024. Web scraping serves numerous essential purposes: it powers search engines such as Google, supports investigative journalism, and facilitates research across various domains, including AI.

Instead, what we may witness is the implementation of laws targeting specific abusive uses of web scraping technology. For instance, there have been discussions about expanding the BOTS Act, which prohibits using automated software to circumvent purchasing limits on online ticket sales.

Furthermore, there may be a broader adoption of tools and regulations akin to California’s Delete Act, which takes effect on January 1, 2024. This act requires regulators to develop a tool that lets consumers submit a single request asking every registered data broker in the state to delete their personal data.

How Will These Lawsuits Impact AI in 2024?

While regulatory changes can have an impact within months, lawsuits often take years to resolve, with no guaranteed outcome. In many instances, we will need to wait and see exactly what changes these developments bring to the data gathering industry, to the AI applications that rely on such data, and to the end users of these services.

What can be anticipated is that case law and well-defined regulations will identify and address abuses of web scraping, or compel those engaged in such practices to adapt. Although this process may initially create disruptions for some service providers, it is hoped that in the long term it will foster stability in both the AI and data collection sectors.
