Groq has introduced LLaVA v1.5 7B, a new visual model now available on its Developer Console. This launch makes GroqCloud multimodal and broadens its support to include image, audio, and text modalities.
“New *multi-modal* model dropped on @Groqinc! Llava v1.5 7b is a visual model that can take images as input. Try it now in API or console as “llava-v1.5-7b-4096-preview”! Developers can now build applications on Groq with all three modalities: image, audio, and text!” — Benjamin Klieger (@BenjaminKlieger), September 4, 2024
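The model is served through GroqCloud’s OpenAI-compatible chat completions API. As a minimal sketch of a vision request (assuming the official groq Python client, an API key in the GROQ_API_KEY environment variable, and a placeholder image URL), a call might look like this:

```python
from groq import Groq

# The client reads the API key from the GROQ_API_KEY environment variable.
client = Groq()

# Ask the vision model a question about an image, referenced by URL
# (the URL below is a placeholder).
completion = client.chat.completions.create(
    model="llava-v1.5-7b-4096-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(completion.choices[0].message.content)
```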
LLaVA, short for Large Language and Vision Assistant, combines language and vision capabilities. It builds on OpenAI’s CLIP and Meta’s Llama 2 7B model, utilising visual instruction tuning to improve its ability to follow natural-language instructions about images and to reason visually.
This enables LLaVA to excel in tasks such as visual question answering, caption generation, optical character recognition, and multimodal dialogue.
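As an illustration of the optical character recognition use case, a local image can typically be passed inline as a base64-encoded data URL instead of a hosted link. The sketch below assumes the same client setup as above; the file name and prompt are placeholders:

```python
import base64

from groq import Groq

client = Groq()

# Read a local image (placeholder file name) and encode it as base64
# so it can be sent inline as a data URL.
with open("receipt.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

completion = client.chat.completions.create(
    model="llava-v1.5-7b-4096-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe any text visible in this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(completion.choices[0].message.content)
```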
“Groq is hosting LLaVA-v1.5-7B which supports vision/image inputs and in our initial benchmarking response times were >4X faster than GPT-4o on OpenAI,” said Artificial Analysis.
“Groq has launched their first multi-modal endpoint! Groq is hosting LLaVA-v1.5-7B which supports vision/image inputs and in our initial benchmarking response times were >4X faster than GPT-4o on OpenAI. We have conducted initial benchmarking comparing the response speed of…” — Artificial Analysis (@ArtificialAnlys), September 4, 2024
The new model unlocks numerous practical applications. Retailers can use it for inventory tracking, social media platforms can improve accessibility with automatic image descriptions, and customer service chatbots can handle both text- and image-based interactions. It can also help automate tasks across industries such as manufacturing, finance, retail, and education.
Developers and businesses can try LLaVA v1.5 7B in Preview Mode on GroqCloud.
Groq recently partnered with Meta, making the latest Llama 3.1 models—including 405B Instruct, 70B Instruct, and 8B Instruct—available to the community at Groq speed.
Former OpenAI researcher Andrej Karpathy praised Groq’s inference speed, saying, “This is so cool. It feels like AGI—you just talk to your computer and it does stuff instantly. Speed really makes AI so much more pleasing.”
Founded in 2016 by Jonathan Ross, Groq distinguishes itself by eschewing GPUs in favour of its proprietary hardware, the Language Processing Unit (LPU).