Alibaba’s Qwen Team Unveils AI Models with PC and Phone Control Capabilities

Alibaba’s Qwen team has released the Qwen2.5-VL family of AI models, marking a significant advancement in AI’s ability to interact with software.

These models are capable of performing various text and image analysis tasks, including video understanding, document analysis, and object counting.

The models are also designed to control PCs and mobile devices, similar to OpenAI’s Operator.

Performance and Capabilities

According to the Qwen team's own benchmarks, the Qwen2.5-VL models, especially the flagship Qwen2.5-VL-72B, outperform OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 2.0 Flash in areas such as math, document analysis, and question answering. They can also parse charts, extract data from invoices and forms, and comprehend lengthy videos.

Controversial Restrictions

However, these models come with certain restrictions due to China’s internet regulations. For instance, when asked about politically sensitive topics, such as Xi Jinping’s mistakes, the model returned an error message instead of an answer. This is consistent with China’s regulatory requirement that AI responses adhere to core socialist values.

PC and Mobile Device Control

One of the most striking features of Qwen2.5-VL is its ability to control software on both PCs and mobile devices. In a demonstration, the AI successfully launched the Booking.com app on an Android phone and booked a flight. However, its performance on a Linux desktop was less impressive, as it struggled to do more than switch tabs.

Licensing and Availability

While the Qwen2.5-VL-3B and Qwen2.5-VL-7B models are available under a permissive license, the flagship Qwen2.5-VL-72B is released under a custom license.

Companies with over 100 million monthly active users must seek permission from Alibaba to deploy the model commercially.

Written by
Sazid Kabir

