Hong Kong Researchers Introduce Diffusion Model-Based Editing Tool

A group of researchers from the Hong Kong University of Science and Technology (HKUST) has introduced MagicQuill, an advanced and interactive image editing system based on diffusion models. It is designed to provide a user-friendly experience by combining AI-driven suggestions with precise local editing capabilities.

The system builds on Stable Diffusion v1.5, enhanced with specialised modules: a dual-branch architecture, comprising inpainting and control branches, and a fine-tuned LLaVA-1.5, a multimodal large language model.

The paper challenges existing tools such as SmartEdit for instruction-based editing, BrushNet for mask and prompt-guided inpainting, and GAN-based tools like SketchEdit. Even Adobe’s Palette is a competitor in this space.

Compared with similar advanced image editing systems such as DragDiffusion or Zone, MagicQuill's strength lies in real-time intent recognition via multimodal large language models (MLLMs). It is also meant for general-purpose, streamlined editing rather than the precision-oriented editing offered by other systems.

Researchers from Ant Group, Zhejiang University (ZJU), and The University of Hong Kong (HKU) have also contributed to this paper.

What does MagicQuill solve?

According to their research paper, MagicQuill addresses limitations in existing tools by improving precision, reducing complexity, and making sophisticated image editing accessible to users of all skill levels.

MagicQuill’s Painting Assistor employs a multimodal large language model (MLLM) to predict editing prompts in real time from brushstrokes, eliminating the need for manual input. The user-friendly Idea Collector interface streamlines interaction, allowing iterative edits, stroke management, and result previews.

However, the system has limitations, including quality degradation when sketches deviate from prompts, loss of fine details during colour adjustments, and misinterpretation of simple or ambiguous sketches. It requires high-end hardware (15 GB of VRAM) and lacks advanced features such as reference-based editing, which is planned for future updates.

The post Hong Kong Researchers Introduce Diffusion Model-Based Editing Tool appeared first on Analytics India Magazine.
