Local Light Field Fusion

Practical View Synthesis with Prescriptive Sampling Guidelines

Example Results

Abstract

We present a practical and robust deep learning solution for capturing and rendering novel views of complex real-world scenes for virtual exploration. Previous approaches either require intractably dense view sampling or provide little to no guidance for how users should sample views of a scene to reliably render high-quality novel views. Instead, we propose an algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields. We extend traditional plenoptic sampling theory to derive a bound that specifies precisely how densely users should sample views of a given scene when using our algorithm. In practice, we apply this bound to capture and render views of real-world scenes that achieve the perceptual quality of Nyquist rate view sampling while using up to 4000x fewer views. We demonstrate our approach's practicality with an augmented reality smartphone app that guides users to capture input images of a scene and viewers that enable real-time virtual exploration on desktop and mobile platforms.

Fast and easy handheld capture with guideline: closest object moves at most D pixels between views
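
To make the capture guideline concrete, the following minimal sketch converts a pixel budget into a camera spacing. It assumes a simple pinhole camera translating parallel to the image plane, for which a point at depth z shifts by f * b / z pixels when the camera moves a distance b (with f the focal length in pixels). The function names and the example numbers are hypothetical, and D is treated as a given pixel budget rather than derived here.

```python
def disparity_px(focal_px, baseline, depth):
    """Pixel shift of a point at `depth` when a pinhole camera
    translates by `baseline` parallel to the image plane."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline / depth


def max_camera_spacing(focal_px, z_min, max_disparity_px):
    """Largest spacing between adjacent views such that the closest
    object (at depth z_min) moves at most max_disparity_px pixels.

    Assumption: pinhole model, sideways translation only.
    """
    if focal_px <= 0 or z_min <= 0:
        raise ValueError("focal_px and z_min must be positive")
    return max_disparity_px * z_min / focal_px


if __name__ == "__main__":
    # Hypothetical numbers: 1000 px focal length, nearest object 2 m away,
    # and a budget of D = 64 pixels between neighboring views.
    spacing = max_camera_spacing(focal_px=1000.0, z_min=2.0, max_disparity_px=64.0)
    print(f"Move the camera at most {spacing:.3f} m between views")
```

Because disparity falls off with depth, only the closest object needs to be checked: everything farther away moves less than the budget.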

Promote sampled views to local light field via layered scene representation
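
A layered representation such as an MPI is typically rendered by compositing its RGBA planes from back to front with the standard "over" operator. The sketch below shows that step for a single pixel in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the per-plane warp into the novel view is omitted, and `composite_over` is a hypothetical helper name.

```python
def composite_over(layers):
    """Composite RGBA samples for one pixel, ordered back to front.

    layers: list of ((r, g, b), alpha) with all values in [0, 1].
    Returns the composited color (against a black background) and the
    accumulated alpha.
    """
    out_rgb = [0.0, 0.0, 0.0]
    out_alpha = 0.0
    for rgb, alpha in layers:
        # The nearer layer covers a fraction `alpha` of what lies behind it.
        out_rgb = [alpha * c + (1.0 - alpha) * o for c, o in zip(rgb, out_rgb)]
        out_alpha = alpha + (1.0 - alpha) * out_alpha
    return tuple(out_rgb), out_alpha


if __name__ == "__main__":
    # An opaque red plane behind a half-transparent blue plane.
    pixel_layers = [((1.0, 0.0, 0.0), 1.0), ((0.0, 0.0, 1.0), 0.5)]
    color, alpha = composite_over(pixel_layers)
    print(color, alpha)
```

In a full renderer, each plane would first be reprojected into the target camera before this per-pixel compositing is applied.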

Blend neighboring local light fields to render novel views
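
One plausible way to blend renderings from neighboring local light fields is a weighted average in which closer source views count more and pixels a source view could not explain (low accumulated alpha) count less. The sketch below illustrates this for one pixel. The inverse-distance weighting, the use of alpha as a confidence term, and the name `blend_renderings` are assumptions made for illustration; the page itself does not spell out the exact blending weights.

```python
def blend_renderings(renderings, eps=1e-6):
    """Blend per-view renderings of one pixel into a single color.

    renderings: list of ((r, g, b), alpha, distance), where `distance` is
    the distance from the novel view to the source view that produced the
    rendering. Assumed weight: alpha / (distance + eps).
    """
    total_weight = 0.0
    blended = [0.0, 0.0, 0.0]
    for rgb, alpha, distance in renderings:
        weight = alpha / (distance + eps)
        total_weight += weight
        blended = [b + weight * c for b, c in zip(blended, rgb)]
    if total_weight == 0.0:
        # No view contributed any content for this pixel.
        return (0.0, 0.0, 0.0)
    return tuple(b / total_weight for b in blended)


if __name__ == "__main__":
    # Two equally distant neighbors; the second has no content at this pixel.
    views = [((1.0, 0.0, 0.0), 1.0, 0.5), ((0.0, 0.0, 1.0), 0.0, 0.5)]
    print(blend_renderings(views))
```

Weighting by alpha lets a view that does see a region fill in areas that are missing from its neighbor, while the distance term keeps the result continuous as the novel view moves between source views.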

Technical Video

Acknowledgements

We thank the SIGGRAPH reviewers for their constructive comments. The technical video was created with help from Julius Santiago, Milos Vlaski, Endre Ajandi, and Christopher Schnese. The augmented reality app was developed by Alex Trevor. The web viewer and WebGL version were developed by Pantelis Kalogiros.

BM is funded by a Hertz Foundation Fellowship. PS is funded by an NSF Graduate Fellowship. RR was supported in part by NSF grant 1617234, ONR grant N000141712687, and Google Research Awards. RN was supported in part by NSF grant 1617794 and an Alfred P. Sloan Foundation fellowship.