AI Safety

Anthropic CEO wants to open the black box of AI models by 2027

Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has…

OpenAI’s latest AI models have a new safeguard to prevent biorisks

OpenAI says that it deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report. O3 and o4-mini represent…

Google is shipping Gemini models faster than its AI safety reports

More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the…

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks

In a new report, a California-based policy group co-led by Fei-Fei Li, an AI pioneer, suggests that lawmakers should consider AI risks that “have not yet been observed in the world” when crafting AI regulatory policies. The 41-page interim report released on Tuesday comes from the Joint California Policy Working Group on Frontier AI Models…

Eric Schmidt argues against a ‘Manhattan Project for AGI’

In a policy paper published Wednesday, former Google CEO Eric Schmidt, Scale AI CEO Alexandr Wang, and Center for AI Safety Director Dan Hendrycks said that the U.S. should not pursue a Manhattan Project-style push to develop AI systems with “superhuman” intelligence, also known as AGI. The paper, titled “Superintelligence Strategy,” asserts that an aggressive…

UK drops ‘safety’ from its AI body, now called AI Security Institute, inks MOU with Anthropic

The U.K. government wants to make a hard pivot into boosting its economy and industry with AI, and as part of that, it’s repurposing an institution it founded a little over a year ago for a very different purpose. Today the Department for Science, Innovation and Technology announced that it would be renaming the…

Anthropic CEO says DeepSeek was ‘the worst’ on a critical bioweapons data safety test

Anthropic CEO Dario Amodei is worried about competitor DeepSeek, the Chinese AI company that took Silicon Valley by storm with its R1 model. And his concerns could be more serious than the typical ones raised about DeepSeek sending user data back to China. In an interview on Jordan Schneider’s ChinaTalk podcast, Amodei said DeepSeek generated…
