GlobalPaint: Spatiotemporal Coherent Video Outpainting with Global Feature Guidance

Yueming Pan¹,²*, Ruoyu Feng³, Jianmin Bao², Chong Luo²†, Nanning Zheng¹†
¹State Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University    ²Microsoft Research Asia    ³University of Science and Technology of China
*This work was performed during Yueming Pan's internship at MSRA. †Corresponding author

Abstract

Video outpainting extends a video beyond its original boundaries by synthesizing missing border content. Compared with image outpainting, it requires not only per-frame spatial plausibility but also long-range temporal coherence, especially when outpainted content becomes visible across time under camera or object motion. We propose GlobalPaint, a diffusion-based framework for spatiotemporally coherent video outpainting. Our approach adopts a hierarchical pipeline that first outpaints key frames and then completes intermediate frames via an interpolation model conditioned on the completed boundaries, reducing error accumulation in sequential processing. At the model level, we augment a pretrained image inpainting backbone with (i) an Enhanced Spatial-Temporal module featuring 3D windowed attention for stronger spatiotemporal interaction, and (ii) global feature guidance that distills OpenCLIP features from observed regions across all frames into compact global tokens using a dedicated extractor. Comprehensive evaluations on benchmark datasets demonstrate improved reconstruction quality and more natural motion compared to prior methods.
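To make the two-stage ordering in the hierarchical pipeline concrete, here is a minimal scheduling sketch. It only plans which frames would be handled in each stage; it does not run any model. The function name `plan_hierarchical_schedule`, the `stride` parameter, and the choice of evenly spaced key frames are illustrative assumptions, since the abstract does not state how key frames are selected.

```python
from typing import List, Tuple

Segment = Tuple[int, int, List[int]]  # (left key frame, right key frame, frames in between)


def plan_hierarchical_schedule(num_frames: int, stride: int) -> Tuple[List[int], List[Segment]]:
    """Plan a key-frame-first schedule (hypothetical helper, not the authors' code).

    Stage 1 would outpaint the returned key frames.
    Stage 2 would fill each segment's intermediate frames, conditioned on the
    two already completed key frames that bound it.
    """
    if num_frames < 1 or stride < 1:
        raise ValueError("num_frames and stride must be positive")

    # Assumption: key frames are evenly spaced. The last frame is always
    # included so every intermediate frame lies between two completed frames.
    key_frames = list(range(0, num_frames, stride))
    if key_frames[-1] != num_frames - 1:
        key_frames.append(num_frames - 1)

    # One segment per pair of adjacent key frames; skip pairs with no gap.
    segments: List[Segment] = []
    for left, right in zip(key_frames, key_frames[1:]):
        inner = list(range(left + 1, right))
        if inner:
            segments.append((left, right, inner))
    return key_frames, segments


if __name__ == "__main__":
    keys, segs = plan_hierarchical_schedule(num_frames=10, stride=4)
    print("key frames:", keys)
    for left, right, inner in segs:
        print(f"interpolate {inner} between {left} and {right}")
```

Because each segment depends only on its two bounding key frames, errors in one segment cannot propagate into the next, which is the intuition behind the reduced error accumulation mentioned above.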


Demo Results

[Videos: seven examples, each pairing a Source Video with the result Outpainted by GlobalPaint.]



Comparison

[Videos: two comparisons, each showing Source Video, M3DDM, MOTIA, and GlobalPaint.]

[Videos: two comparisons, each showing Source Video, MagicEdit, and GlobalPaint.]