Project Overview
This thesis presents a method for meta-learning Neural Radiance Fields (NeRF), a deep-learning approach for synthesizing novel views of a scene. The method improves NeRF's efficiency by meta-learning hyperparameters such as learning rates and loss regularization terms, yielding faster convergence and requiring fewer input views. It outperforms prior meta-learning work on the ShapeNet dataset.
Project Summary
Neural Radiance Fields (NeRFs)
Scene representation is a crucial aspect of computer graphics and computer vision. With the advent of deep learning, researchers have developed Neural Radiance Fields (NeRF) as a form of implicit neural representation for capturing scenes. NeRF maps a 3D coordinate and a viewing direction to an RGB color and a volume density, and by optimizing on images of the scene taken from different viewpoints, it can accurately recover the scene's underlying representation.
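The mapping described above can be sketched as a tiny coordinate-based MLP. This is a minimal illustrative toy, not the thesis's actual architecture: the layer sizes, the single hidden layer, and names like tiny_nerf are assumptions made for the sketch, and real NeRF implementations use a deeper network and volume rendering on top of this mapping.

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Map each coordinate to sin/cos features at several frequencies,
    which NeRF uses to help the MLP represent high-frequency detail."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    feats = [x]
    for f in freqs:
        feats.append(np.sin(f * x))
        feats.append(np.cos(f * x))
    return np.concatenate(feats, axis=-1)

def tiny_nerf(params, xyz, view_dir):
    """Toy NeRF mapping: (3D position, viewing direction) -> (RGB, density)."""
    h = np.tanh(positional_encoding(xyz) @ params["W1"] + params["b1"])
    # Softplus keeps the volume density non-negative.
    sigma = np.log1p(np.exp(h @ params["w_sigma"]))
    # Color is conditioned on the viewing direction (view-dependent effects).
    h_dir = np.concatenate([h, view_dir], axis=-1)
    rgb = 1.0 / (1.0 + np.exp(-(h_dir @ params["W_rgb"])))  # sigmoid -> [0, 1]
    return rgb, sigma

rng = np.random.default_rng(0)
d_in, hidden = 3 * (1 + 2 * 4), 32  # 3 coords, identity + 4 sin/cos pairs
params = {
    "W1": rng.normal(0, 0.1, (d_in, hidden)),
    "b1": np.zeros(hidden),
    "w_sigma": rng.normal(0, 0.1, (hidden,)),
    "W_rgb": rng.normal(0, 0.1, (hidden + 3, 3)),
}
rgb, sigma = tiny_nerf(params, np.array([0.1, 0.2, 0.3]), np.array([0.0, 0.0, 1.0]))
```

Training NeRF then amounts to adjusting `params` so that volume-rendered colors match the observed images from each viewpoint.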
Limitation of NeRFs
The main advantage of NeRF over traditional representations like meshes or voxel grids is that it can render the scene at arbitrary resolution, since storage does not grow with resolution. However, optimizing NeRF on a given set of observations usually requires many gradient descent steps, so convergence is slow and can take hours or days. Additionally, NeRF requires a sufficient number of input images from different viewpoints, so it performs poorly when only a few observations are available.
Meta-Learning NeRF
Methods
This study focuses on improving the performance of Neural Radiance Fields (NeRF) by meta-learning hyperparameters such as learning rates and loss regularization terms. Meta-learning is a process where a model learns to learn from experience, rather than starting from scratch each time. This approach extends the idea of meta-learning to NeRF, aiming to find a better initialization and achieve faster convergence. In contrast to standard random initialization, meta-learning introduces prior knowledge into the optimization, which makes few-shot learning possible.
In this study, the meta-learning process consists of two stages: meta-training and meta-testing. During the meta-training stage, the meta initialization is optimized over a particular training domain, such as the 3D chairs from ShapeNet. During the meta-testing stage, the meta-learned model weights and hyperparameters are used to initialize NeRF for a new scene. This study primarily focuses on meta-learning the learning rates, but it also demonstrates that other kinds of hyperparameters, such as a parametric loss regularization, can be learned as well.
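The two-stage recipe can be sketched on a toy one-parameter task. This is a minimal sketch of the idea of jointly meta-learning an initialization and a learning rate, not the thesis's actual NeRF training code: the quadratic per-task loss, the single inner step, and the analytic meta-gradients are simplifying assumptions, and each random target here stands in for one training scene.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_loss(w, t):
    return (w - t) ** 2          # toy per-task objective

def inner_grad(w, t):
    return 2.0 * (w - t)

# Meta-parameters: a shared initialization w0 and a learned learning rate alpha.
w0, alpha = 0.0, 0.1
meta_lr = 0.01

# Meta-training: sample a task, take one inner gradient step from w0 with
# step size alpha, then update (w0, alpha) to reduce the post-adaptation loss.
for _ in range(2000):
    t = rng.normal(0.0, 1.0)                # sample a task (one "scene")
    w1 = w0 - alpha * inner_grad(w0, t)     # inner adaptation step
    # Analytic meta-gradients of the post-step loss (w1 - t)^2:
    d_alpha = -2.0 * (w1 - t) * inner_grad(w0, t)
    d_w0 = 2.0 * (w1 - t) * (1.0 - 2.0 * alpha)
    alpha -= meta_lr * d_alpha
    w0 -= meta_lr * d_w0

# Meta-testing: a new task is solved in a single step from the learned
# initialization using the learned learning rate.
t_new = 0.7
w_adapted = w0 - alpha * inner_grad(w0, t_new)
final_loss = inner_loss(w_adapted, t_new)
```

On this quadratic task the optimal learned step size is 0.5, at which a single inner step lands exactly on the task optimum; the same principle, applied to NeRF's weights and per-parameter learning rates, is what allows fast test-time adaptation to a new scene.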
Results
The experiments were performed on the ShapeNet dataset, and the results show that the method outperforms prior meta-learning work on the novel view synthesis task. This meta-learning approach provides a powerful tool for improving the performance of NeRF and can be applied to other forms of implicit neural representation. By meta-learning the hyperparameters, NeRF achieves faster convergence and better performance with fewer input views, making it a useful tool for capturing the underlying representations of scenes and other low-dimensional signals.
Thesis showcase
The thesis has been chosen to be featured in the 2022 Computer Science and Engineering (CSE) Thesis Showcase.