ServiceNow Researchers Launch Foundational Model to Generate SVG from Text and Video

On Thursday, a group of researchers from ServiceNow launched a new foundational model, StarVector, that generates Scalable Vector Graphics (SVG) from text and image inputs.

Juan A. Rodriguez, an AI researcher at ServiceNow Research, announced the model launch and its code on X.

StarVector is a multimodal large language model (MLLM) designed for Scalable Vector Graphics (SVG) generation from images or text instructions. It addresses the limitations of earlier SVG generation methods, which often produced artifacts and struggled with SVG primitives beyond path curves.

I’m excited to announce that StarVector has been accepted at CVPR 2025! Over a year in the making, StarVector opens a new paradigm for Scalable Vector Graphics (SVG) generation by harnessing multimodal LLMs to generate SVG code that aesthetically mirrors input images and text.… pic.twitter.com/zTquIdq3n9

— Juan A. Rodríguez (@joanrod_ai) March 20, 2025

The research paper states that StarVector works directly in the SVG code space, leveraging visual understanding to apply accurate SVG primitives for compact, precise outputs.
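To illustrate why choosing the right SVG primitive yields more compact code, here is a minimal Python sketch that writes the same circle twice: once with the `<circle>` primitive and once as a `<path>` built from four cubic Bézier segments (a standard approximation). The helper names `wrap`, `circle_as_primitive`, and `circle_as_path` are hypothetical and chosen for this example only; nothing here is taken from the StarVector code.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def wrap(body: str) -> str:
    """Wrap an SVG fragment in a root <svg> element (hypothetical helper)."""
    return f'<svg xmlns="{SVG_NS}" viewBox="0 0 100 100">{body}</svg>'


def circle_as_primitive(cx: float, cy: float, r: float) -> str:
    """Express the circle with the dedicated <circle> primitive."""
    return wrap(f'<circle cx="{cx}" cy="{cy}" r="{r}"/>')


def circle_as_path(cx: float, cy: float, r: float) -> str:
    """Approximate the same circle with four cubic Bezier segments in a <path>."""
    k = 0.5522847498 * r  # control-point offset for a quarter-circle arc
    d = (
        f"M {cx + r:.2f} {cy:.2f} "
        f"C {cx + r:.2f} {cy + k:.2f} {cx + k:.2f} {cy + r:.2f} {cx:.2f} {cy + r:.2f} "
        f"C {cx - k:.2f} {cy + r:.2f} {cx - r:.2f} {cy + k:.2f} {cx - r:.2f} {cy:.2f} "
        f"C {cx - r:.2f} {cy - k:.2f} {cx - k:.2f} {cy - r:.2f} {cx:.2f} {cy - r:.2f} "
        f"C {cx + k:.2f} {cy - r:.2f} {cx + r:.2f} {cy - k:.2f} {cx + r:.2f} {cy:.2f} Z"
    )
    return wrap(f'<path d="{d}"/>')


if __name__ == "__main__":
    primitive = circle_as_primitive(50, 50, 40)
    path = circle_as_path(50, 50, 40)
    # Both strings are well-formed XML; the primitive form is far shorter.
    ET.fromstring(primitive)
    ET.fromstring(path)
    print(len(primitive), "characters with <circle>")
    print(len(path), "characters with <path>")
```

The primitive version also keeps the shape's meaning visible in the markup (a centre and a radius), whereas the path version only records curve coordinates.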

To train StarVector, the researchers created SVG-Stack, a large-scale dataset of two million samples. They also introduced SVG-Bench, a benchmark spanning ten datasets and three tasks: image-to-SVG generation, text-to-SVG generation, and diagram generation.

StarVector’s architecture integrates an image encoder that projects images into visual tokens and a transformer language model that learns the relationships between instructions, visual features, and SVG code sequences. This enables StarVector to perform image vectorisation and text-driven SVG generation, producing more compact and semantically rich SVGs.
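As a rough picture of what "projecting images into visual tokens" can mean, the toy Python sketch below splits a small grayscale image into patches and applies a linear projection to each one. This is an illustrative assumption, not StarVector's encoder: the article does not describe the encoder's internals, and the function names `patchify` and `project`, the patch size, and the random weights are all hypothetical.

```python
import random


def patchify(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patch x patch tiles.

    Assumes H and W are both divisible by `patch`.
    """
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([
                image[top + dy][left + dx]
                for dy in range(patch)
                for dx in range(patch)
            ])
    return tiles


def project(tiles, weights):
    """Map each flattened tile to a token vector with one dot product per output dimension.

    `weights` holds one weight vector per output dimension, each as long as a tile.
    """
    return [
        [sum(x * w for x, w in zip(tile, column)) for column in weights]
        for tile in tiles
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.random() for _ in range(8)] for _ in range(8)]        # toy 8x8 image
    weights = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(3)]  # 16 -> 3 dims
    tokens = project(patchify(image, patch=4), weights)
    print(len(tokens), "visual tokens of dimension", len(tokens[0]))
```

In a model of the kind the article describes, a sequence of such vectors would be fed to the language model alongside the instruction and SVG code tokens; here the projection is random rather than learned.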

StarVector demonstrates strong performance compared to existing approaches in image-to-SVG and text-to-SVG tasks. As per the benchmark results, the model outperformed baselines such as GPT-4 Vision (2023) and Potrace.

Rodriguez mentioned that, even with these advancements, the model hallucinates, sometimes producing inaccurate details. He added that the team is actively working on addressing such challenges.

The model is available on Hugging Face, and its code is open-sourced on GitHub under the Apache 2.0 licence.

The post ServiceNow Researchers Launch Foundational Model to Generate SVG from Text and Video appeared first on Analytics India Magazine.
