AI isn’t very good at history, new paper finds
AI might excel at certain tasks like coding or generating a podcast. But it struggles to pass a high-level history exam, a new paper has found. A team of researchers has created a new benchmark to test three top large language models (LLMs) — OpenAI’s GPT-4, Meta’s Llama, and Google’s Gemini — on historical questions. […]