DAI#45 – New top model, lawsuit blues, and puzzled AI

Jun 28, 2024 - 12:00

Welcome to this week’s roundup of hand-assembled bespoke AI news.

This week Anthropic knocked OpenAI off pole position.

AI audio generators face the music in court.

And the top LLMs struggle with a puzzle your kids can solve.

Let’s dig in.

Claude vs GPT-4o

After months of AI models claiming to be ‘almost as good as GPT-4’, we’ve finally got a model that pushes OpenAI off its top spot on the leaderboards.

Anthropic released Claude 3.5 Sonnet, an upgraded version of its mid-size Claude model. Benchmark tests show it beating GPT-4o and Google’s Gemini 1.5 Pro in almost every test.

With an even more powerful Claude 3.5 Opus expected soon, what will OpenAI’s response be?

Claude 3.5 Sonnet is not like the other LLMs
11 impressive demos of the new model: pic.twitter.com/2oHZdArz6J
— Proper (@ProperPrompter) June 26, 2024

After Meta called off its launch of Meta AI in the EU, Apple is doing the same due to strict laws in the region.

Apple has delayed the rollout of its Apple Intelligence features there as EU tech fans watch the rest of the world get first dibs.

Sounds familiar…

AI companies are getting sued, and for a change, it’s not OpenAI or Meta.

Text-to-audio platforms Suno and Udio generate impressive music, but how did they get so good?

The Recording Industry Association of America is suing the companies, saying they “stole copyrighted sound recordings” to train their AI. When the judge listens to these sample clips, it might be a short day in court.

An AI company using copyrighted material to train its models without paying the creators? We’re as unsurprised as you are.

Recreating copyrighted music isn’t the worst thing AI is being used for, though. A DeepMind study says that the leading form of AI misuse is bad guys creating deepfakes for opinion manipulation.

The rest of the AI misuse list makes for interesting reading.

Are you sure that’s right?

AI models are really good at generating very plausible but completely wrong information.

AI scientists say hallucinations can’t be fixed, but a University of Oxford study identified when AI hallucinations are more likely to occur.

“Semantic entropy” measures how much an AI model’s answers to the same question vary in meaning, which signals how confident the model is. It’s also my new polite way to say someone is talking BS.
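For the curious, here is a rough Python sketch of the idea: ask the model the same question several times, group the answers that mean the same thing, and compute the entropy over those groups. The function names and the sample answers are made up for illustration, and the `normalize` helper is a crude stand-in (an assumption on our part) for a real check of whether two answers share a meaning. This is not the study’s actual code.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Hypothetical stand-in for a semantic-equivalence check.

    It only lowercases and strips punctuation, so answers count as
    'the same meaning' when their cleaned-up text matches exactly.
    """
    cleaned = "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def semantic_entropy(answers: list[str]) -> float:
    """Entropy (in nats) over clusters of answers that share a meaning."""
    clusters = Counter(normalize(a) for a in answers)
    total = sum(clusters.values())
    # p * log(1/p) summed over clusters; 0.0 when every answer agrees.
    return sum((n / total) * math.log(total / n) for n in clusters.values())


# Made-up sampled answers to the same question.
consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille", "Paris"]

print(semantic_entropy(consistent))  # low: the answers agree
print(semantic_entropy(scattered))   # higher: the answers disagree
```

A score near zero means the answers agree with each other, while a higher score means the model keeps changing its story, which is when you should double-check what it tells you.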

Even the most advanced LLMs make stuff up when presented with surprisingly simple puzzles. This week users on X posted examples of how the smartest models can’t solve a simple river crossing puzzle.

Is it evidence that LLMs aren’t good at reasoning, or is something else happening here?

AI might struggle with some riddles but it knows you better than you think. A new study found that an AI system can predict how anxious you are from how you react to photos.

The ability of these models to infer human emotions could be very helpful, but might be a source of human anxiety too.

AI open season

When AI companies use the word “open” to describe their models it rarely means what you think it does.

How “open” are these AI models? Sam took a closer look at which AI models are truly open and why some companies keep certain aspects very much closed.

This week saw an exciting development in the open model space. EvolutionaryScale’s ESM3 is a generative model for biology that turns prompts into proteins.

Previously, scientists looking for a novel protein would have to wait for nature to come up with it or try a hit-or-miss approach in the lab.

Now ESM3 enables scientists to program biology and create proteins beyond nature.

AI events

If you want to level up your marketing efforts, check out the MarTech Summit Hong Kong 2024 happening on 9 July.

The AI Accelerator Institute presents the Generative AI Summit Austin 2024 on 10 July. The agenda sees industry leaders discuss the latest trends in real-world generative AI applications.

In other news…

Here are some other clickworthy AI stories we enjoyed this week:

Meta is incorrectly marking real photos as ‘Made by AI’.
SoftBank CEO says AI that is 10,000 times smarter than humans will come out in 10 years.
OpenAI delays the launch of GPT-4o’s voice assistant to address safety issues.
Anthropic debuts collaboration tools for its Claude AI assistant.
Chinese AI firms woo OpenAI users as the US company plans API restrictions.
OpenAI acquires collaborative screen sharing tool creator Multi.
Toys “R” Us sparks an online backlash after releasing an ad created with OpenAI’s Sora.

this toys r us commercial is made entirely with AI which means the kid is disgusting and ghoulish, the sentiment hollow, and the toys r us brand is dead for at least the third time pic.twitter.com/IRprWZKN8O
— Chris Alsikkan (@AlsikkanTV) June 25, 2024

And that’s a wrap.

Have you tried out the upgraded Claude? The Artifacts window is seriously cool. It’s a sure bet that ChatGPT will get a similar feature very soon.

I love playing with Udio and Suno but there’s no denying they rip off copyrighted music. Is this the price of progress or is it a showstopper?

I’m still surprised that AI models struggle with a simple river crossing puzzle. We should probably fix that before letting AI control really important stuff like power grids or hospitals.

Let us know what you think and keep sending us links to interesting AI news and research we may have missed.

The post DAI#45 – New top model, lawsuit blues, and puzzled AI appeared first on DailyAI.