Tech companies like Microsoft, NVIDIA, and Apple trade trust for data and talent

Jul 16, 2024 - 13:00

In the race to dominate the artificial intelligence landscape, tech giants are pushing ethical boundaries and testing the limits of public trust. 

Recent revelations have exposed a pattern of behavior that raises alarm bells about data privacy, fair competition, and the concentration of power in the tech industry. 

First off, an investigation by Proof News and WIRED uncovered that Apple, NVIDIA, Anthropic, and Salesforce have been using a dataset containing subtitles from over 170,000 YouTube videos to train their AI models. 

This dataset, known as “YouTube Subtitles,” was compiled without the consent of content creators, potentially violating YouTube’s terms of service.

The scale of this data mining operation is staggering. It includes content from educational institutions like Harvard, popular YouTubers such as MrBeast and PewDiePie, and even major news outlets like The Wall Street Journal and the BBC. 

YouTube has yet to respond, but back in April, CEO Neal Mohan told Bloomberg that if OpenAI had used YouTube videos to train its text-to-video model Sora, it would be a “clear violation” of the platform’s terms of service.

OpenAI isn’t among the accused on this occasion, and it remains unclear whether YouTube will take action if the new allegations are proven correct.

This isn’t the first time tech companies have come under fire for how they acquire data.

In 2018, Facebook faced intense scrutiny over the Cambridge Analytica scandal, where millions of users’ data was harvested without consent for political advertising. 

More pertinent to AI, it was discovered in 2023 that a dataset called Books3, containing over 180,000 copyrighted books, had been used to train AI models without authors’ permission. This led to a wave of lawsuits against AI companies, with authors claiming copyright infringement.

That’s just one example from an ever-growing stack of lawsuits emanating from every corner of the creative industries. Universal Music Group, Sony Music, and Warner Records recently added their names to the pile.

In their rush to build more advanced AI models, tech companies seem to have adopted an “ask for forgiveness, not permission” approach to data acquisition.

The Microsoft-Inflection merger

While the YouTube scandal unfolds, Microsoft’s recent hiring spree from AI startup Inflection has caught the eye of UK regulators. 

The Competition and Markets Authority (CMA) has launched a phase one merger investigation, probing whether this mass hiring constitutes a de facto merger that could stifle competition in the AI sector.

Microsoft’s move, which included hiring Inflection’s co-founder Mustafa Suleyman (a former Google DeepMind executive) and a significant portion of the startup’s staff, was swift and decisive.

It takes on added weight when considering Microsoft’s existing partnerships in the AI field. The company has already invested roughly $13 billion in OpenAI, raising questions about market concentration.

It also comes after Microsoft gave up its non-voting observer seat on OpenAI’s board. Experts say this was likely a decision to rein in its oversight to appease antitrust authorities.

Alex Haffner, a competition partner at law firm Fladgate, said, “It is hard not to conclude that Microsoft’s decision has been heavily influenced by the ongoing competition/antitrust scrutiny of its (and other major tech players) influence over emerging AI players such as OpenAI.”

Critics and regulators alike say this threatens to deepen an oligopoly in the AI sector, potentially stifling innovation and limiting consumer choice. 

A trust deficit?

Both the YouTube data mining scandal and Microsoft’s hiring practices contribute to a growing trust deficit between Big Tech and the public. 

Content creators have become more guarded about their work for fear of exploitation.

This could have a knock-on effect on content creation and sharing, ultimately impoverishing the very platforms that tech companies rely on for data.

Similarly, the concentration of AI talent in a few major companies risks homogenizing AI development and limiting the diversity of approaches.

For tech companies, rebuilding trust will likely require more than just compliance with future regulations and antitrust investigations. 

The question lingers: can we harness the potential of AI while preserving ethics, fair competition, and public trust?

The post Tech companies like Microsoft, NVIDIA, and Apple trade trust for data and talent appeared first on DailyAI.