
Google Launches PaliGemma 2 with Advanced Vision-Language Capabilities


Google has unveiled PaliGemma 2, an upgraded version of PaliGemma, the vision-language model (VLM) it announced earlier in 2024.

Building on the capabilities of the original PaliGemma, which focused on tasks like image captioning, object detection, and visual question answering, PaliGemma 2 introduces new features like long captioning.

Long captioning lets the model generate detailed, context-aware captions that go beyond simple object identification, describing actions, emotions, and the overall scene.

The model also improves on optical character recognition and document table structure comprehension, and it excels at tasks such as chemical formula recognition, music score interpretation, spatial reasoning, and chest X-ray report generation.

Available in three sizes (3B, 10B, and 28B parameters) and three input resolutions (224px, 448px, and 896px), PaliGemma 2 is designed to give developers an easy upgrade from the original model, offering immediate performance improvements with minimal code changes.

The pre-trained models and code are available today on platforms like Kaggle, Hugging Face, and Ollama.
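
For readers who want to see how the size and resolution options combine, here is a minimal Python sketch that enumerates the nine pre-trained variants. The sizes and resolutions come from the article; the "google/paligemma2-<size>-pt-<resolution>" naming pattern and the helper names checkpoint_id and all_checkpoints are assumptions made for illustration, so check the exact identifiers on the platform you download from.

```python
from itertools import product

# Sizes and resolutions as listed in the article.
SIZES = ("3b", "10b", "28b")
RESOLUTIONS = (224, 448, 896)


def checkpoint_id(size: str, resolution: int, org: str = "google") -> str:
    """Build a hub-style id for one pre-trained variant.

    The "<org>/paligemma2-<size>-pt-<resolution>" pattern is an assumption
    for illustration; confirm the exact names on the model hub before use.
    """
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return f"{org}/paligemma2-{size}-pt-{resolution}"


def all_checkpoints() -> list[str]:
    """Enumerate every size/resolution combination (3 x 3 = 9)."""
    return [checkpoint_id(s, r) for s, r in product(SIZES, RESOLUTIONS)]


if __name__ == "__main__":
    for name in all_checkpoints():
        print(name)
```

If the naming assumption holds, moving between variants is a matter of changing a single identifier string, which is roughly what "minimal code changes" would look like in practice. Larger sizes and higher resolutions can be expected to need more memory and compute.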

Written by
Sazid Kabir

