site:the-decoder.com - Search News

Qwen3-VL can scan two-hour videos and pinpoint nearly every detail

A few months after launching Qwen3-VL, Alibaba has released a detailed technical report on the open multimodal model. The data shows the system excels at image-based math tasks and can analyze hours ...

the-decoder

Microsoft adds built-in AI shopping tools to Edge in the U.S.

Microsoft is adding new AI shopping tools to its Edge browser in the U.S. The built-in Copilot can now surface price comparisons, price histories, and cashback options right inside the browser. Users ...

the-decoder

The White House has paused a federal order that would have overridden state-level AI regulations

The White House has reportedly put a hold on a draft executive order that would have let federal law override state-level AI regulations. According to Reuters, the draft called for the Department of ...

the-decoder

Google upgrades Gemini 2.5 Pro for coding and app development

The latest pre-release version of Google's Gemini 2.5 Pro language model brings major improvements for front-end development and complex programming tasks. Google has launched an updated preview of ...

the-decoder

Google Deepmind taps Boston Dynamics' former CTO to build the 'Android' of robots

The company has hired Aaron Saunders, the former Chief Technology Officer of Boston Dynamics, as Vice President of Hardware Engineering—a move that strengthens its hardware expertise as it aims to ...

the-decoder

Most LLM benchmarks are flawed, casting doubt on AI progress metrics, study finds

A new international study highlights major problems with large language model (LLM) benchmarks, showing that most current evaluation methods have serious flaws. After reviewing 445 benchmark papers ...

the-decoder

Anthropic's Jack Clark compares AI breakthroughs to hammers that suddenly become self-aware

Anthropic co-founder and Head of Policy Jack Clark offers a look at how Silicon Valley AI leaders are thinking about the future of AI. To explain the moment AI systems develop situational awareness, ...

the-decoder

Most AI models can fake alignment, but safety training suppresses the behavior, study finds

A new study analyzing 25 language models finds that most do not fake safety compliance - though not due to a lack of capability. Only a handful - including Claude 3 Opus, Claude 3.5 Sonnet, Llama 3 ...

the-decoder

AWS to invest up to $50 billion in U.S. AI and supercomputing for government agencies

Amazon has announced a major investment in its AI footprint for federal work, saying it will spend up to $50 billion to expand AI and supercomputing infrastructure for U.S. government agencies. The ...

the-decoder

Google's latest image model Nano Banana Pro makes image generation feel truly intentional

It replaces the Gemini 2.5 Flash Image model from August and is built to handle complex scenes with consistent physics, render text accurately, and use real-time information as input. It also appears ...

the-decoder

Strict anti-hacking prompts make AI models more likely to sabotage and lie, Anthropic finds

New research from Anthropic shows how reward hacking in AI models can trigger more dangerous behaviors. When models learn to trick their reward systems, they can spontaneously drift into deception, ...

the-decoder

Salesforce's CRM benchmark finds AI agents struggle in real-world business scenarios

Salesforce's new CRMArena-Pro benchmark reveals major challenges for AI agents in business contexts. Even top models like Gemini 2.5 Pro manage just a 58 percent success rate on single turns. When the ...
