AI

OpenAI partner says it had relatively little time to test the company’s o3 AI model

An organization OpenAI frequently partners with to probe the capabilities of its AI models and evaluate them for safety, Metr, suggests that it wasn’t given much time to test one of the company’s highly capable new releases, o3. In a blog post published Wednesday, Metr writes that one red teaming benchmark of o3 was “conducted […]

OpenAI partner says it had relatively little time to test the company’s o3 AI model Read More »

OpenAI launches a pair of AI reasoning models, o3 and o4-mini

OpenAI announced on Thursday the launch of o3 and o4-mini, new AI reasoning models designed to pause and work through questions before responding. The company calls o3 its most advanced reasoning model ever, outperforming the company’s previous models on tests measuring math, coding, reasoning, science, and visual understanding capabilities. Meanwhile, o4-mini offers what OpenAI says

OpenAI launches a pair of AI reasoning models, o3 and o4-mini Read More »

OpenAI debuts Codex CLI, an open source coding tool for terminals

In a bid to inject AI into more of the programming process, OpenAI is launching Codex CLI, a coding “agent” designed to run locally from terminal software. Announced on Wednesday alongside OpenAI’s newest AI models, o3 and o4-mini, Codex CLI links OpenAI’s models with local code and computing tasks, OpenAI says. Via Codex CLI, OpenAI’s

OpenAI debuts Codex CLI, an open source coding tool for terminals Read More »

Startup funding hit records in Q1. But the outlook for 2025 is still awful.

Startups attracted $91.5 billion in venture capital funding in Q1, according to the latest report from data provider PitchBook. This figure not only exceeds the previous quarter’s allocation by 18.5% but also represents the second-highest quarterly investment in the last decade. Despite this seemingly positive news, Kyle Stanford, lead U.S. venture capital analyst at PitchBook,

Startup funding hit records in Q1. But the outlook for 2025 is still awful. Read More »

Microsoft researchers say they’ve developed a hyper-efficient AI model that can run on CPUs

Microsoft researchers claim they’ve developed the largest-scale 1-bit AI model, also known as a “bitnet,” to date. Called BitNet b1.58 2B4T, it’s openly available under an MIT license and can run on CPUs, including Apple’s M2. Bitnets are essentially compressed models designed to run on lightweight hardware. In standard models, weights, the values that define

Microsoft researchers say they’ve developed a hyper-efficient AI model that can run on CPUs Read More »

A dev built a test to see how AI chatbots respond to controversial topics

A pseudonymous developer has created what they’re calling a “free speech eval,” SpeechMap, for the AI models powering chatbots like OpenAI’s ChatGPT and X’s Grok. The goal is to compare how different models treat sensitive and controversial subjects, the developer told TechCrunch, including political criticism and questions about civil rights and protest. AI companies have

A dev built a test to see how AI chatbots respond to controversial topics Read More »

OpenAI may ‘adjust’ its safeguards if rivals release ‘high-risk’ AI

In an update to its Preparedness Framework, the internal framework OpenAI uses to decide whether AI models are safe and what safeguards, if any, are needed during development and release, OpenAI said that it may “adjust” its requirements if a rival AI lab releases a “high-risk” system without comparable safeguards. The change reflects the increasing

OpenAI may ‘adjust’ its safeguards if rivals release ‘high-risk’ AI Read More »