Google has unveiled PaliGemma 2, an upgraded version of the PaliGemma vision-language model (VLM) it first announced earlier in 2024.
Building on the capabilities of the original PaliGemma, which focused on tasks like image captioning, object detection, and visual question answering, PaliGemma 2 introduces new features like long captioning.
This allows the model to generate detailed, context-aware captions that go beyond simple object identification, describing actions, emotions, and the overall scene.
The model also improves on optical character recognition and document table structure comprehension, and excels at tasks such as chemical formula recognition, music score interpretation, spatial reasoning, and chest X-ray report generation.
Available in multiple sizes (3B, 10B, 28B parameters) and resolutions (224px, 448px, 896px), PaliGemma 2 is designed to provide developers with an easy upgrade from the original model, offering immediate performance improvements with minimal code changes.
The pre-trained models and code are available today on platforms like Kaggle, Hugging Face, and Ollama.
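Because checkpoints are published per size and resolution, picking the right one amounts to assembling a model id. The sketch below illustrates this, assuming the `google/paligemma2-<size>-<variant>-<resolution>` naming used for the Hugging Face checkpoints; the helper function itself is hypothetical, not part of any official API.

```python
def paligemma2_model_id(size: str = "3b", resolution: int = 224,
                        variant: str = "pt") -> str:
    """Build a Hugging Face model id for a PaliGemma 2 checkpoint.

    Checkpoints are published per parameter count (3b/10b/28b) and
    input resolution (224/448/896); "pt" denotes pre-trained weights.
    The id pattern is an assumption based on the published checkpoint
    names on Hugging Face.
    """
    if size not in {"3b", "10b", "28b"} or resolution not in {224, 448, 896}:
        raise ValueError(f"unknown checkpoint: {size} @ {resolution}px")
    return f"google/paligemma2-{size}-{variant}-{resolution}"


print(paligemma2_model_id("3b", 224))   # google/paligemma2-3b-pt-224
print(paligemma2_model_id("28b", 896))  # google/paligemma2-28b-pt-896

# Loading then mirrors the original PaliGemma flow (assumes the
# `transformers` PaliGemma classes, which PaliGemma 2 reuses):
# from transformers import PaliGemmaForConditionalGeneration, AutoProcessor
# model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
# processor = AutoProcessor.from_pretrained(model_id)
```

This is the sense in which the upgrade requires minimal code changes: swapping the model id in an existing PaliGemma pipeline is, in most cases, the whole migration.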