London-based Stability AI has released SDXL 0.9 (Stable Diffusion XL), the newest addition to the company’s suite of products, which includes Stable Diffusion. The model is currently accessible through Clipdrop, with an API release to follow, and its public open release is scheduled for mid-July, following the beta release in April.
Despite running on modern consumer GPUs, SDXL 0.9 can generate hyper-realistic imagery for cinema, television, music, and instructional videos, as well as for design and industrial applications — capabilities that position SDXL at the forefront of practical AI image generation.
The SDXL series encompasses a wide array of functionalities beyond basic text prompting, including image-to-image prompting (using one image to obtain variations of it), inpainting (reconstructing missing parts of an image), and outpainting (seamlessly extending an existing image).
SDXL 0.9 boasts one of the largest parameter counts among open-source image models, comprising a 3.5 billion parameter base model (up from 3.1 billion parameters in the beta) within a 6.6 billion parameter model ensemble pipeline. The final output is generated by running the input through two models and combining their results: the second-stage model of the pipeline adds finer details to the output produced by the first stage. The model is built on two CLIP text encoders, including one of the largest OpenCLIP models trained to date (OpenCLIP ViT-G/14), producing images with greater depth at a resolution of 1024×1024.
In March, Stability AI acquired Init ML, the creator of Clipdrop.
The new SDXL 0.9 can run on modern consumer GPUs: a Windows 10 or 11 or Linux system with 16GB of RAM, plus an Nvidia GeForce RTX 20-series graphics card (or a higher-end equivalent) with a minimum of 8GB of VRAM. Linux users can instead use a compatible AMD card with 16GB of VRAM.
An Array of New Products
SDXL 0.9 will be provided for research purposes only during a limited period, to collect feedback and fully refine the model before its general open release. The code to run it will be publicly available on GitHub. In April, Stability AI introduced StableLM, a collection of open-source large language models. These models, currently in the “alpha” stage, can be found on GitHub and Hugging Face.
Additionally, Stability AI unveiled a new software development kit (SDK) that works with Stable Diffusion 2.0 and Stable Diffusion XL, allowing users to control the output by adjusting various parameters, including style presets, cadence, frames per second (FPS), colours, 3D depth, and post-processing effects.
Mitigating the Threat of Copyright Infringement
Back in January, artists filed a class action lawsuit against Stability AI, alleging that their artwork was used without authorisation to train models and generate new images, endangering their professions. Getty Images has also sued Stability AI for copyright infringement.
During an on-stage interview at the Bloomberg Technology Summit in San Francisco, Stability AI’s CEO Emad Mostaque acknowledged concerns about lifelike deepfakes created by AI. Mostaque revealed that the company had possessed “photo-realistic models” earlier this year but chose not to release them due to timing considerations. He emphasised the need to incorporate features like watermarking to establish standards, enabling tracking and appropriate utilisation of AI-generated content.
The post Stability AI Unveils SDXL 0.9 to Enhance Image Generation appeared first on Analytics India Magazine.