Skip to yearly menu bar Skip to main content


Poster

Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

Peng Li · Yuan Liu · Xiaoxiao Long · Feihu Zhang · Cheng Lin · Mengfei Li · Xingqun Qi · Shanghang Zhang · Wei Xue · Wenhan Luo · Ping Tan · Wenping Wang · Qifeng Liu · Yike Guo

East Exhibit Hall A-C #2809
[ ] [ Project Page ]
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

In this paper, we introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image. Despite significant advancements in multiview generation, existing methods still suffer from camera prior mismatch, inefficacy, and low resolution, resulting in poor-quality multiview images. Specifically, these methods assume that the input images should comply with a predefined camera type, e.g. a perspective camera with a fixed focal length, leading to distorted shapes when the assumption fails. Moreover, the full-image or dense multiview attention they employ leads to a dramatic explosion of computational complexity as image resolution increases, resulting in prohibitively expensive training costs. To bridge the gap between assumption and reality, Era3D first proposes a diffusion-based camera prediction module to estimate the focal length and elevation of the input image, which allows our method to generate images without shape distortions. Furthermore, a simple but efficient attention layer, named row-wise attention, is used to enforce epipolar priors in the multiview diffusion, facilitating efficient cross-view information fusion. Consequently, compared with state-of-the-art methods, Era3D generates high-quality multiview images with up to a 512×512 resolution while reducing computation complexity of multiview attention by 12x times. Comprehensive experiments demonstrate the superior generation power of Era3D- it can reconstruct high-quality and detailed 3D meshes from diverse single-view input images, significantly outperforming baseline multiview diffusion methods.

Live content is unavailable. Log in and register to view live content