AI Safety

California lawmaker behind SB 1047 reignites push for mandated AI safety reports

California State Senator Scott Wiener on Wednesday introduced new amendments to his latest bill, SB 53, which would require the world’s largest AI companies to publish safety and security protocols and issue reports when safety incidents occur. If signed into law, California would be the first state to impose meaningful transparency requirements on leading AI […]

Anthropic says most AI models, not just Claude, will resort to blackmail

Several weeks after Anthropic released research claiming that its Claude Opus 4 AI model resorted to blackmailing engineers who tried to turn the model off in controlled test scenarios, the company is out with new research suggesting the problem is more widespread among leading AI models. On Friday, Anthropic published new safety research testing 16 […]

ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims

Former OpenAI research leader Steven Adler published a new independent study on Wednesday claiming that, in certain scenarios, his former employer’s AI models will go to great lengths to try to avoid being shut down. In a blog post, Adler describes a series of experiments he ran on OpenAI’s latest GPT-4o model, the default model […]

Yoshua Bengio launches LawZero, a nonprofit AI safety lab

Turing Award winner Yoshua Bengio is launching a nonprofit AI safety lab called LawZero to build safer AI systems, he told the Financial Times on Monday. LawZero raised $30 million in philanthropic contributions from Skype founding engineer Jaan Tallinn, former Google chief Eric Schmidt, Open Philanthropy, and the Future of Life Institute, among others. […]

Anthropic CEO wants to open the black box of AI models by 2027

Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027. Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has […]

OpenAI’s latest AI models have a new safeguard to prevent biorisks

OpenAI says that it deployed a new system to monitor its latest AI reasoning models, o3 and o4-mini, for prompts related to biological and chemical threats. The system aims to prevent the models from offering advice that could instruct someone on carrying out potentially harmful attacks, according to OpenAI’s safety report. O3 and o4-mini represent […]

Google is shipping Gemini models faster than its AI safety reports

More than two years after Google was caught flat-footed by the release of OpenAI’s ChatGPT, the company has dramatically picked up the pace. In late March, Google launched an AI reasoning model, Gemini 2.5 Pro, that leads the industry on several benchmarks measuring coding and math capabilities. That launch came just three months after the […]
