We study recent progress on dynamic view synthesis (DVS) from monocular video. Though existing approaches have demonstrated impressive results, we show a discrepancy between the practical capture process and the existing experimental protocols, which effectively leak multi-view signals into training. We define effective multi-view factors (EMFs) to quantify the amount of multi-view signal present in the input capture sequence based on the relative camera-scene motion. We introduce two new metrics, co-visibility masked image metrics and correspondence accuracy, which overcome the issue in existing protocols. We also propose a new iPhone dataset that includes more diverse real-life deformation sequences. Using our proposed experimental protocol, we show that state-of-the-art approaches suffer a 1-2 dB drop in masked PSNR in the absence of multi-view cues and a 4-5 dB drop when modeling complex motion. Code and data can be found at http://hangg7.com/dycheck.
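To make the idea of a masked image metric concrete, here is a minimal sketch of PSNR restricted to a boolean pixel mask. It is an illustration only, not the authors' implementation: the function name `masked_psnr`, its parameters, and the assumption that a co-visibility mask is already available as an H×W boolean array are all hypothetical. How the mask itself is derived is not described in the abstract and is not attempted here.

```python
import numpy as np


def masked_psnr(pred, target, mask, max_val=1.0):
    """PSNR over only the pixels where `mask` is True (hypothetical helper).

    pred, target: float arrays of shape (H, W, C) with values in [0, max_val].
    mask: boolean array of shape (H, W); True marks pixels to evaluate.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)

    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    if mask.shape != pred.shape[:2]:
        raise ValueError("mask must match the spatial shape of the images")
    if not mask.any():
        raise ValueError("mask selects no pixels")

    # Boolean indexing keeps only masked pixels: result has shape (N, C).
    sq_err = (pred - target) ** 2
    mse = sq_err[mask].mean()

    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Because the error is averaged only over masked pixels, regions outside the mask cannot raise or lower the score, which is the property a masked metric is meant to provide.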
Author Information
Hang Gao (University of California, Berkeley)
Ruilong Li (University of California, Berkeley)
Shubham Tulsiani (Carnegie Mellon University)
Bryan Russell (Adobe)
Angjoo Kanazawa (University of California, Berkeley)
More from the Same Authors
- 2023 Poster: Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives
  Tom Monnier · Jake Austin · Angjoo Kanazawa · Alexei Efros · Mathieu Aubry
- 2019 Poster: Learning elementary structures for 3D shape generation and matching
  Theo Deprelle · Thibault Groueix · Matthew Fisher · Vladimir Kim · Bryan Russell · Mathieu Aubry
- 2016 Poster: SURGE: Surface Regularized Geometry Estimation from a Single Image
  Peng Wang · Xiaohui Shen · Bryan Russell · Scott Cohen · Brian Price · Alan Yuille