Microsoft Launches Phi-4 multimodal and Phi-4-mini, Matches OpenAI’s GPT-4o

Microsoft has launched Phi-4-multimodal and Phi-4-mini, the latest additions to its Phi family of small language models (SLMs). These models are now available on Azure AI Foundry, Hugging Face, and the NVIDIA API Catalog.

Phi-4-multimodal is a 5.6 billion-parameter model that integrates speech, vision, and text processing. “By leveraging advanced cross-modal learning techniques, this model enables more natural and context-aware interactions, allowing devices to understand and reason across multiple input modalities simultaneously,” said Weizhu Chen, vice president of generative AI at Microsoft.

Last year, Microsoft launched Phi-4, with 14 billion parameters. The model excels at complex reasoning.

The Phi-4-multimodal model supports applications including document analysis and speech recognition. On multimodal audio and visual benchmarks, it surpasses Google Gemini 2 Flash and Gemini 1.5 Pro. Microsoft claims that it is comparable to OpenAI’s GPT-4o.

The company said the model has demonstrated strong performance in speech-related tasks, surpassing models such as WhisperV3 and SeamlessM4T-v2-Large in automatic speech recognition and speech translation. It also ranks first on the Hugging Face OpenASR leaderboard with a word error rate of 6.14%. The model shows competitive results in document and chart understanding, optical character recognition (OCR), and visual science reasoning.
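
For readers unfamiliar with the metric, word error rate is the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The following is a minimal sketch of that calculation, not the leaderboard's own scoring code. The function name `word_error_rate` is illustrative, and the sketch assumes both strings are already normalised (leaderboards typically lowercase text and strip punctuation before scoring).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes both strings are already normalised and whitespace-tokenisable.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the")
# against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

On this definition, a word error rate of 6.14% corresponds to roughly six word-level errors per hundred reference words.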

Phi-4-mini, on the other hand, is a 3.8 billion-parameter text-based model for reasoning, coding, and long-context tasks. It supports sequences of up to 128,000 tokens and offers efficient processing with reduced computational requirements. It also supports function calling, allowing integration with external tools and APIs.
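
In general terms, function calling means the model emits a structured request naming a tool and its arguments, and the host application runs the tool and passes the result back. The sketch below illustrates only the application-side dispatch step. The JSON shape, the `get_exchange_rate` tool, and its hard-coded rate are all hypothetical, chosen for illustration; Phi-4-mini's actual tool-call format is not described in this article.

```python
import json


def get_exchange_rate(base: str, quote: str) -> float:
    """Hypothetical tool with a hard-coded placeholder value, not real data."""
    placeholder_rates = {("USD", "EUR"): 0.5}
    return placeholder_rates[(base, quote)]


# Registry of tools the application is willing to expose to the model.
TOOLS = {"get_exchange_rate": get_exchange_rate}


def dispatch_tool_call(model_output: str):
    """Parse an assumed JSON tool call and run the matching function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"model requested an unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Assumed model output; a real model's format may differ.
example_output = (
    '{"name": "get_exchange_rate", '
    '"arguments": {"base": "USD", "quote": "EUR"}}'
)
result = dispatch_tool_call(example_output)
```

Restricting execution to a fixed registry, as here, keeps the model from invoking anything the application has not explicitly exposed.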

Both models are suitable for deployment in constrained computing environments. They can be optimised using ONNX Runtime for cross-platform availability and lower latency.

Microsoft is incorporating these models into its ecosystem, including Windows applications and Copilot+ PCs. “Copilot+ PCs will build upon Phi-4-multimodal’s capabilities, delivering the power of Microsoft’s advanced SLMs without the energy drain,” said Vivek Pradeep, vice president and distinguished engineer of Windows Applied Sciences.

Developers can access Phi-4-multimodal and Phi-4-mini on multiple platforms and explore their applications in various industries, including finance, healthcare, and automotive technology.

The post Microsoft Launches Phi-4 multimodal and Phi-4-mini, Matches OpenAI’s GPT-4o appeared first on Analytics India Magazine.
