multimodal Archives - GenixPlay Studios

Gemini 2.0, Google’s newest flagship AI, can generate text, images, and speech

Google’s next major AI model has arrived to combat a slew of new offerings from OpenAI. On Wednesday, Google announced Gemini 2.0 Flash, which the company says can natively generate images and audio in addition to text. 2.0 Flash can also use third-party apps and services, allowing it to tap into Google Search, execute code, […]


Meta’s Llama AI models get multimodal

Enterprise / Sasandara Dilmina

Benjamin Franklin once wrote that nothing is certain except death and taxes. Let me amend that phrase to reflect the current AI gold rush: Nothing is certain except death, taxes, and new AI models, with the last of those three arriving at an ever-accelerating pace. Earlier this week, Google released upgraded Gemini models, and, earlier in


Meta’s Llama AI models now support images, too

Apps / Sasandara Dilmina

Benjamin Franklin once wrote that nothing is certain except death and taxes. Let me amend that phrase to reflect the current AI gold rush: Nothing is certain except death, taxes, and new AI models, with the last of those three arriving at an ever-accelerating pace. Earlier this week, Google released upgraded Gemini models, and, earlier


Mistral releases Pixtral, its first multimodal model

AI / Sasandara Dilmina

French AI startup Mistral has released its first model that can process images as well as text. Called Pixtral 12B, the 12-billion-parameter model is roughly 24GB in size. (Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.) Available on GitHub as well as the AI and machine
