On Thursday, a group of researchers from ServiceNow launched a new foundational model, StarVector, that generates Scalable Vector Graphics (SVG) from text and image inputs.
Juan A. Rodriguez, an AI researcher at ServiceNow Research, announced the model's release and its code on X.
StarVector is a multimodal large language model (MLLM) designed for SVG generation from images or text instructions. It addresses the limitations of earlier SVG generation methods, which often produced artifacts and struggled with SVG primitives beyond path curves.
I'm excited to announce that StarVector has been accepted at CVPR 2025! Over a year in the making, StarVector opens a new paradigm for Scalable Vector Graphics (SVG) generation by harnessing multimodal LLMs to generate SVG code that aesthetically mirrors input images and text.… pic.twitter.com/zTquIdq3n9
— Juan A. Rodríguez
(@joanrod_ai) March 20, 2025
The research paper states that StarVector works directly in the SVG code space, leveraging visual understanding to apply accurate SVG primitives for compact, precise outputs.
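To make the point about primitives concrete, here is a minimal sketch (not taken from the StarVector paper or code) that writes the same circle twice: once with the dedicated `<circle>` primitive and once as a `<path>` built from two arcs. The shape, coordinates, and the `wrap` helper are illustrative assumptions; the sketch only shows why primitive-aware output tends to be shorter and easier to read.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def wrap(body: str) -> str:
    """Wrap an SVG fragment in a 100x100 root element (hypothetical helper)."""
    return f'<svg xmlns="{SVG_NS}" viewBox="0 0 100 100">{body}</svg>'


# A circle written with the dedicated <circle> primitive.
circle_primitive = wrap('<circle cx="50" cy="50" r="40" fill="teal"/>')

# The same circle written as a path made of two half-circle arcs.
circle_path = wrap(
    '<path d="M 10 50 A 40 40 0 1 0 90 50 A 40 40 0 1 0 10 50 Z" fill="teal"/>'
)

for name, svg in [("primitive", circle_primitive), ("path", circle_path)]:
    root = ET.fromstring(svg)  # raises ParseError if the markup is malformed
    shape = root[0].tag.split("}")[1]  # drop the XML namespace prefix
    print(f"{name}: <{shape}>, {len(svg)} characters")
```

Both documents describe the same shape, but the primitive version states the intent (a circle with a centre and radius) directly, while the path version encodes it as drawing commands.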
To train StarVector, the researchers created SVG-Stack, a large-scale dataset of two million samples. They also introduced SVG-Bench, a benchmark spanning ten datasets and three tasks: image-to-SVG generation, text-to-SVG generation, and diagram generation.
StarVector's architecture integrates an image encoder that projects images into visual tokens and a transformer language model that learns the relationships between instructions, visual features, and SVG code sequences. This allows StarVector to perform image vectorisation and text-driven SVG generation, producing more compact and semantically rich SVGs.
StarVector demonstrates strong performance compared to existing models in image-to-SVG and text-to-SVG tasks. According to the benchmark results, the model outperformed baselines such as GPT-4 Vision (2023) and Potrace.
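The data flow described above can be sketched in a few lines of plain Python. This is a toy illustration, not StarVector's actual implementation: the 4x4 image, the 2x2 patch size, the fixed projection weights, and the `patchify` and `project` helpers are all assumptions made up for the example. It shows only the general idea of turning image patches into "visual tokens" and placing them ahead of the SVG tokens a language model would continue.

```python
from typing import List


def patchify(image: List[List[float]], patch: int) -> List[List[float]]:
    """Split a square image into non-overlapping patch x patch tiles, each flattened."""
    size = len(image)
    return [
        [image[r][c] for r in range(top, top + patch) for c in range(left, left + patch)]
        for top in range(0, size, patch)
        for left in range(0, size, patch)
    ]


def project(tile: List[float], weights: List[List[float]]) -> List[float]:
    """Linearly project one flattened tile into a toy embedding space."""
    return [sum(w * x for w, x in zip(row, tile)) for row in weights]


# Toy 4x4 grayscale image; with 2x2 patches this yields 4 tiles of 4 pixels each.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]

# Hypothetical fixed weights mapping 4 pixel values to a 2-dimensional token.
weights = [
    [0.25, 0.25, 0.25, 0.25],
    [0.5, -0.5, 0.5, -0.5],
]

visual_tokens = [project(tile, weights) for tile in patchify(image, patch=2)]

# A language model would see the visual tokens followed by the SVG code
# produced so far, and predict the next SVG token.
svg_so_far = ["<svg", ">", "<rect"]
sequence = [("visual", t) for t in visual_tokens] + [("svg", t) for t in svg_so_far]
print(len(visual_tokens), "visual tokens,", len(sequence), "sequence items")
```

In a real system the projection weights are learned and the language model is a full transformer; the sketch keeps only the shape of the pipeline.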

Rodriguez mentioned that despite these advances, the model still hallucinates, sometimes producing inaccurate details. He added that the team is actively working on improving the model and tackling such challenges.
The model is available on Hugging Face, and its code is open-sourced on GitHub under the Apache 2.0 licence.
The post ServiceNow Researchers Launch Foundational Model to Generate SVG from Text and Video appeared first on Analytics India Magazine.