NSF: Neural Surface Fields for Human Modeling from Monocular Depth

Abstract

Creating personalized, animatable 3D avatars has real-world applications in gaming, virtual try-on, animation, and VR/XR. However, inferring cloth geometry and dynamics from sparse, monocular-view data is a complex problem. Existing methods for modeling 3D humans from depth data are limited in computational efficiency, mesh coherency, and flexibility in resolution and topology. Reconstructing shapes with implicit functions and extracting an explicit mesh per frame is computationally expensive and cannot ensure coherent meshes across frames. Conversely, predicting per-vertex deformations on a pre-designed human template with a discrete surface lacks flexibility in resolution and topology.

To overcome these limitations, we propose NSF (Neural Surface Fields), a novel method for modeling 3D clothed humans. A distinctive aspect of NSF is that it defines a neural field solely on the base surface, enabling it to predict a continuous displacement field over that surface. To determine the shape of the base surface, our method fuses depth observations in a canonical space and learns a coarse geometry without high-frequency pose-dependent deformations. Compared to existing approaches, our method eliminates the expensive per-frame surface extraction while maintaining mesh coherency, and it can reconstruct meshes at arbitrary resolution without retraining.
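To make the idea of a field defined only on the base surface more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the toy tetrahedron, the per-vertex feature vectors, the barycentric interpolation, the pose code, and the randomly initialised two-layer MLP (`vert_feats`, `sample_surface`, `surface_field`, `FEAT_DIM`, `POSE_DIM`, `HIDDEN`) are all illustrative assumptions. It only shows how a displacement can be queried at any point on a surface, so the number of output points is chosen at query time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy base surface (assumption): a tetrahedron with 4 vertices and 4 faces.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])

FEAT_DIM, POSE_DIM, HIDDEN = 8, 4, 16  # hypothetical sizes

# Per-vertex features; these would be learned in a real system.
vert_feats = rng.normal(size=(len(verts), FEAT_DIM))

# Randomly initialised MLP standing in for a trained displacement decoder.
W1 = rng.normal(scale=0.1, size=(FEAT_DIM + POSE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, 3))
b2 = np.zeros(3)


def sample_surface(n):
    """Pick n faces uniformly (not area-weighted) and a random point in each."""
    face_idx = rng.integers(len(faces), size=n)
    u, v = rng.random(n), rng.random(n)
    flip = u + v > 1.0  # reflect samples that fall outside the triangle
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
    bary = np.stack([1.0 - u - v, u, v], axis=1)
    return face_idx, bary


def surface_field(face_idx, bary, pose):
    """Return base points and displaced points for the given surface samples."""
    tri = faces[face_idx]                                   # (n, 3) vertex ids
    base = np.einsum("nk,nkd->nd", bary, verts[tri])        # points on surface
    feat = np.einsum("nk,nkd->nd", bary, vert_feats[tri])   # interpolated features
    pose_rep = np.broadcast_to(pose, (len(feat), POSE_DIM))
    x = np.concatenate([feat, pose_rep], axis=1)
    disp = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2           # ReLU MLP -> 3D offset
    return base, base + disp


pose = np.zeros(POSE_DIM)  # placeholder pose code
for n in (10, 1000):       # the same field queried at two resolutions
    face_idx, bary = sample_surface(n)
    base, deformed = surface_field(face_idx, bary, pose)
    print(n, deformed.shape)
```

Because the field is evaluated from barycentric coordinates rather than stored per vertex, the same weights serve both the 10-point and the 1000-point query in this sketch; no retraining or per-frame surface extraction step appears anywhere in it.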

Reconstruction

Reconstruction of subject 03223 shortlong of the BuFF dataset.

Animation

Animation of subject 03223 shortlong on the AIST dataset.

Reconstruction of subject 03223 shortshort of the BuFF dataset.

Animation of subject 03223 shortshort on the AIST dataset.

Reconstruction of subject 00032 shortshort of the BuFF dataset.

Animation of subject 00032 shortshort on the AIST dataset.

Reconstruction of subject 00032 shortlong of the BuFF dataset.

Animation of subject 00032 shortlong on the AIST dataset.

Reconstruction of subject 00096 shortshort of the BuFF dataset.

Animation of subject 00096 shortshort on the AIST dataset.

Reconstruction of subject 00096 shortlong of the BuFF dataset.

Animation of subject 00096 shortlong on the AIST dataset.

Acknowledgement

This work was made possible by funding from the Carl Zeiss Foundation. This work is also funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 409792180 (Emmy Noether Programme, project: Real Virtual Humans) and the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A. G. Pons-Moll is a member of the Machine Learning Cluster of Excellence, EXC number 2064/1 – Project number 390727645. The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Y. Xue. For this project, R. Marin has been supported by an Alexander von Humboldt Foundation Research Fellowship and partially by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 101109330.

BibTeX

@article{xue2023nsf,
  author    = {Xue, Yuxuan and Bhatnagar, Bharat Lal and Marin, Riccardo and Sarafianos, Nikolaos and Xu, Yuanlu and Pons-Moll, Gerard and Tung, Tony},
  title     = {NSF: Neural Surface Fields for Human Modeling from Monocular Depth},
  journal   = {ICCV},
  year      = {2023},
}