★ CVPR 2026 (Main Conference)

InverFill

One-Step Inversion for Enhanced Few-Step Diffusion Inpainting

¹ Qualcomm AI Research^† · ² Posts and Telecommunications Institute of Technology, Vietnam

^† Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.

* Equal Contribution

Paper (arXiv) · Code (coming soon) · Demo

Abstract

Results at a Glance

≤0.06 s: Negligible overhead (at most, A100 40 GB)

Bounded by 0.06 s on SANA-Sprint, 0.04 s on SDXL-Turbo

+22.7%: Peak IR gain (MagicBrush, SANA 4-step)

Consistent quality boost in every backbone, NFE, and benchmark we evaluated

4 NFEs: Matches 30-step (quality parity)

Reaches near multi-step quality at a fraction of the compute

Architecture-Agnostic: Validated on both DiT (SANA-Sprint) and UNet (SDXL-Turbo) backbones

Method Overview

Figure 1. Inversion Network Training. We train an inversion network F_θ to map a masked image to a noise latent z^T, which is then blended with random Gaussian noise to form z^T_blend, enabling high-fidelity, well-harmonized inpainting reconstruction.
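
The blending step described in Figure 1 can be sketched as below. This is a minimal illustration, not the authors' implementation: the page does not state the blending rule, so the variance-preserving mix, the weight `alpha`, and the function name `blend_latents` are all assumptions, and the random array `z_inv` only stands in for the output of F_θ.

```python
import numpy as np

def blend_latents(z_inv, alpha, rng):
    """Mix a predicted noise latent with fresh Gaussian noise.

    Hypothetical rule (an assumption, not taken from the paper):
        z_blend = sqrt(alpha) * z_inv + sqrt(1 - alpha) * eps
    With eps independent of z_inv, a unit-variance z_inv yields a
    unit-variance z_blend.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    eps = rng.standard_normal(z_inv.shape)
    return np.sqrt(alpha) * z_inv + np.sqrt(1.0 - alpha) * eps

rng = np.random.default_rng(0)
z_inv = rng.standard_normal((4, 32, 32))  # placeholder for F_theta(masked image)
z_blend = blend_latents(z_inv, alpha=0.7, rng=rng)
print(z_blend.shape, float(z_blend.std()))
```

Under this assumed rule, `alpha = 1` returns the inverted latent unchanged and `alpha = 0` falls back to pure Gaussian noise, so the weight trades fidelity to the masked input against sampling diversity.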

Demo:

Timing reflects actual model latency — see InverFill's minimal overhead

InverFill adds only 0.04 – 0.06 s overhead while delivering significantly better inpainting quality
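
An overhead figure of this kind is the latency difference between sampling with and without the extra inversion pass. A minimal timing sketch follows. The callables `baseline` and `with_inversion` are hypothetical stand-ins (a short sleep plays the role of the inversion forward pass), and the measurement protocol used for the numbers above is not described on this page.

```python
import statistics
import time

def median_latency(fn, repeats=5):
    """Median wall-clock time of fn() in seconds over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        # For GPU work, synchronize the device here before reading the clock,
        # since kernel launches return before the work finishes.
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical stand-ins: a few-step sampler with and without the extra pass.
def baseline():
    sum(range(10_000))

def with_inversion():
    time.sleep(0.005)  # placeholder for the one-step inversion forward pass
    sum(range(10_000))

overhead = median_latency(with_inversion) - median_latency(baseline)
print(f"overhead: {overhead:.3f} s")
```

Taking the median rather than the mean keeps a single slow run (for example a warm-up iteration) from dominating the estimate.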

Table 1: Quantitative Comparison

IR (ImageReward ×10) · HPS (Human Preference ×10²) · AS (Aesthetic Score) · CLIP (×100). Bold = best in backbone group.

Backbone	Method	NFEs	BrushBench IR	BrushBench HPS	BrushBench AS	BrushBench CLIP	MagicBrush IR	MagicBrush HPS	MagicBrush AS	MagicBrush CLIP	Runtime
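
The multipliers in the Table 1 legend can be applied as in the short sketch below. The helper name `to_table_units` is hypothetical, the raw values are placeholders rather than results from the paper, and it is an assumption that each metric's raw output is on the unscaled range the legend implies.

```python
# Multipliers taken from the Table 1 legend; AS is reported unscaled.
TABLE_SCALE = {"IR": 10.0, "HPS": 100.0, "AS": 1.0, "CLIP": 100.0}

def to_table_units(raw_scores):
    """Scale raw metric values into the units used in Table 1."""
    return {name: value * TABLE_SCALE[name] for name, value in raw_scores.items()}

# Placeholder raw scores for illustration only, not results from the paper.
raw = {"IR": 0.5, "HPS": 0.25, "AS": 6.0, "CLIP": 0.25}
scaled = to_table_units(raw)
print(scaled)
```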

Citation

If you find InverFill useful in your research, please consider citing our paper.