Instant 3D Reconstruction with Python and Instant-NGP

Papers in 100 Lines of Code
4 min read · Aug 5, 2024

Instant Neural Graphics Primitives in 100 lines of pure PyTorch code

Instant 3D Reconstruction with Python and Instant-NGP | Pure Python

With the advent of Instant Neural Graphics Primitives (Instant-NGP), it is now possible to create detailed implicit 3D models quickly and efficiently. However, many implementations of Instant-NGP rely on CUDA code to enhance performance, which can complicate understanding and accessibility. In this article, we present a pure Python implementation of Instant-NGP that strips away the complexity of CUDA, allowing you to grasp the underlying principles and techniques. By leveraging just about 100 lines of PyTorch code, this approach provides a clear and straightforward way to achieve 3D reconstruction with Instant-NGP.

Instant-NGP

Constructor

In Instant Neural Graphics Primitives (Instant-NGP), the architecture cleverly balances speed and expressivity by combining a neural network (NN) with lookup tables. Neural networks, while powerful at modelling complex functions, are often slow due to their computational demands. To address this, Instant-NGP replaces the “large” neural networks typically used in models like NeRF with smaller, more efficient ones. Instead of relying solely on the NN, Instant-NGP uses lookup tables to store and quickly access learned features, allowing for constant-time O(1) queries. This approach maintains the expressivity of the model while dramatically improving performance, making Instant-NGP both fast and effective for 3D reconstruction.

The NGP class constructor in the code below takes several key parameters to configure the model. T defines the number of entries in each lookup table, which bounds how many distinct feature vectors a level can store. Nl is a list of grid resolutions, one per lookup table, allowing features to be stored at both coarse and fine scales. L is the number of components for the positional encoding of directions. aabb_scale scales the axis-aligned bounding box (cube) delimiting the scene. Finally, F sets the number of features stored per lookup-table entry, defaulting to 2. Together, these parameters shape the model’s performance and capability in 3D reconstruction.

Instant Neural Graphics Primitives (instant-ngp) | Python Constructor
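Since the embedded code is not reproduced here, the snippet below is a minimal sketch of what such a constructor could look like. The attribute names (lookup_tables, density_MLP, color_MLP), the MLP widths, and the initialization scale are illustrative assumptions rather than the exact published code; the hash primes are the ones reported in the Instant-NGP paper.

```python
import torch
import torch.nn as nn

class NGP(nn.Module):
    # Minimal sketch of an Instant-NGP-style model (illustrative, not the exact article code).
    def __init__(self, T, Nl, L, device, aabb_scale, F=2):
        super().__init__()
        self.T = T                      # entries per hash lookup table
        self.Nl = Nl                    # grid resolution of each level
        self.F = F                      # features stored per table entry
        self.L = L                      # components of the directional positional encoding
        self.aabb_scale = aabb_scale    # side length of the scene's bounding cube
        # One small learnable lookup table per resolution level, initialized near zero
        self.lookup_tables = nn.ParameterDict({
            str(i): nn.Parameter((torch.rand((T, F), device=device) * 2 - 1) * 1e-4)
            for i in range(len(Nl))
        })
        # Large primes used by the spatial hash (values from the Instant-NGP paper)
        self.pi1, self.pi2, self.pi3 = 1, 2_654_435_761, 805_459_861
        # Two small MLPs: one for density, one for view-dependent color
        self.density_MLP = nn.Sequential(
            nn.Linear(F * len(Nl), 64), nn.ReLU(), nn.Linear(64, 16)
        ).to(device)
        self.color_MLP = nn.Sequential(
            nn.Linear(16 + 3 + 6 * L, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 3), nn.Sigmoid()
        ).to(device)
```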

Positional Encoding

While the original paper on Instant Neural Graphics Primitives employs spherical harmonics for encoding directions, our implementation opts for a simpler positional encoding approach. Such encodings allow the learning of high-frequency functions by expanding the input into a higher-dimensional space through sine and cosine transformations.

Instant Neural Graphics Primitives (instant-ngp) | Positional encoding
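As a reference, here is one way such an encoding can be written. The function name and the choice to keep the raw input alongside the sine/cosine terms are assumptions consistent with common NeRF-style implementations; for a 3D input the output has 3 + 6L columns, matching the color MLP sketched above.

```python
import torch

def positional_encoding(x, L):
    # Expand each coordinate with sin/cos at L increasing frequencies,
    # keeping the raw input as well.
    out = [x]
    for j in range(L):
        out.append(torch.sin(2 ** j * x))
        out.append(torch.cos(2 ** j * x))
    return torch.cat(out, dim=1)
```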

Forward pass

In the forward method, the input coordinates x are first normalized and shifted so that the scene’s bounding box falls within the [0, 1] range. A mask keeps only the points that lie inside that bounding box. For each lookup table, the vertices of the grid cell surrounding each point are computed, and these vertices are hashed to efficiently query the learned features, which are then interpolated according to the point’s position within the cell. The resulting features are passed through a small neural network to predict the log density (direction independent), while another neural network predicts the color (direction dependent).

Instant Neural Graphics Primitives (instant-ngp) | PyTorch forward pass
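Below is a sketch of how such a forward pass might be written as a method of the NGP class sketched earlier, reusing the positional_encoding helper. For readability it loops over the eight corners of each cell and interpolates explicitly, which is slower than a fully vectorized version but follows the same logic.

```python
def forward(self, x, d):
    # x: (B, 3) points, d: (B, 3) view directions. Sketch, written for clarity over speed.
    x = x / self.aabb_scale + 0.5                     # map the scene box to [0, 1]^3
    mask = ((x >= 0) & (x <= 1)).all(dim=1)           # keep only points inside the box
    color = torch.zeros((x.shape[0], 3), device=x.device)
    log_sigma = torch.full((x.shape[0],), -1e5, device=x.device)
    if mask.any():
        xm = x[mask]
        features = []
        for i, N in enumerate(self.Nl):
            pos = xm * N
            floor = torch.floor(pos).long()
            frac = pos - floor.float()                # position inside the cell, in [0, 1)
            level_feat = 0.
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        corner = floor + torch.tensor([dx, dy, dz], device=x.device)
                        # Spatial hash: xor of (coordinate * large prime), modulo the table size
                        h = (corner[:, 0] * self.pi1) ^ (corner[:, 1] * self.pi2) ^ (corner[:, 2] * self.pi3)
                        h = torch.remainder(h, self.T)
                        # Trilinear weight of this corner
                        w = ((frac[:, 0] if dx else 1 - frac[:, 0])
                             * (frac[:, 1] if dy else 1 - frac[:, 1])
                             * (frac[:, 2] if dz else 1 - frac[:, 2]))
                        level_feat = level_feat + w.unsqueeze(1) * self.lookup_tables[str(i)][h]
            features.append(level_feat)
        h_out = self.density_MLP(torch.cat(features, dim=1))
        log_sigma[mask] = h_out[:, 0]
        color[mask] = self.color_MLP(torch.cat((h_out, positional_encoding(d[mask], self.L)), dim=1))
    return torch.exp(log_sigma), color
```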

Rendering

The compute_accumulated_transmittance function is a common component of NeRF implementations, playing a crucial role in volumetric rendering by calculating the accumulated transmittance along a ray. This function helps compute the visibility of 3D points in a volume. If you are interested in learning more about volumetric rendering, I offer a comprehensive course that covers volumetric rendering and NeRF extensively. Check it out here if you want to deepen your understanding.

Instant Neural Graphics Primitives (instant-ngp) | 3d rendering
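A typical implementation of this helper looks like the following; it is usually called with 1 - alpha, so that the cumulative product gives the transmittance accumulated in front of each sample.

```python
import torch

def compute_accumulated_transmittance(alphas):
    # Cumulative product along the ray, shifted by one so the first sample
    # sees full transmittance (nothing in front of it yet).
    accumulated = torch.cumprod(alphas, dim=1)
    return torch.cat((torch.ones((accumulated.shape[0], 1), device=alphas.device),
                      accumulated[:, :-1]), dim=1)
```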

The rendering function takes as input the NGP model together with a batch of rays from the camera, and returns the color associated with each ray using volumetric rendering. The initial portion of the code uses stratified sampling to select 3D points along the rays. The NGP model is then queried at those points (together with the ray directions) to obtain density and color information. The model’s outputs are then used to approximate the volume-rendering integral along each ray by numerical (Monte Carlo) integration.

Instant Neural Graphics Primitives (instant-ngp) | PyTorch rendering
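The sketch below shows one way to write such a rendering function. The name render_rays, the near/far bounds hn and hf, the number of bins, and the white-background correction at the end are assumptions chosen to match common NeRF tutorials rather than the exact article code.

```python
import torch

def render_rays(model, ray_origins, ray_directions, hn=0., hf=1., nb_bins=192):
    device = ray_origins.device
    # Stratified sampling: one random depth per bin along each ray
    t = torch.linspace(hn, hf, nb_bins, device=device).expand(ray_origins.shape[0], nb_bins)
    mid = (t[:, :-1] + t[:, 1:]) / 2.
    lower = torch.cat((t[:, :1], mid), dim=-1)
    upper = torch.cat((mid, t[:, -1:]), dim=-1)
    t = lower + (upper - lower) * torch.rand(t.shape, device=device)
    delta = torch.cat((t[:, 1:] - t[:, :-1],
                       torch.tensor([1e10], device=device).expand(ray_origins.shape[0], 1)), dim=-1)

    # Query the model at the sampled 3D points, with the repeated ray directions
    x = ray_origins.unsqueeze(1) + t.unsqueeze(2) * ray_directions.unsqueeze(1)
    d = ray_directions.expand(nb_bins, -1, -1).transpose(0, 1).reshape(-1, 3)
    sigma, colors = model(x.reshape(-1, 3), d)
    sigma = sigma.reshape(x.shape[:-1])
    colors = colors.reshape(x.shape)

    # Numerical quadrature of the volume-rendering integral
    alpha = 1 - torch.exp(-sigma * delta)
    weights = compute_accumulated_transmittance(1 - alpha).unsqueeze(2) * alpha.unsqueeze(2)
    c = (weights * colors).sum(dim=1)            # per-ray color
    weight_sum = weights.sum(-1).sum(-1)
    return c + 1 - weight_sum.unsqueeze(-1)      # assumes a white background
```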

Training & Testing

Training Instant-NGP

The implementation of the training loop is straightforward, as it boils down to supervised learning: for each ray, the target color is known from the 2D images, so we can directly minimize the L2 loss between the predicted and ground-truth colors.

Instant Neural Graphics Primitives (instant-ngp) | Python training
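A minimal training loop in that spirit could look like this; the assumed batch layout is one ray per row, [origin(3), direction(3), target RGB(3)], and the hyper-parameters are placeholders.

```python
import torch

def train(model, optimizer, data_loader, device='cuda', hn=0., hf=1., nb_epochs=1, nb_bins=192):
    # Supervised loop over batches of rays (sketch; reuses render_rays from above).
    for _ in range(nb_epochs):
        for batch in data_loader:
            ray_origins = batch[:, :3].to(device)
            ray_directions = batch[:, 3:6].to(device)
            target = batch[:, 6:].to(device)
            pred = render_rays(model, ray_origins, ray_directions, hn=hn, hf=hf, nb_bins=nb_bins)
            loss = ((pred - target) ** 2).mean()   # L2 photometric loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```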

Testing Instant-NGP

Once the training process is complete, the NGP model can be used to generate 2D images from any viewpoint. The testing function takes as input a dataset of rays from test images, and generates novel views for those rays.

Instant Neural Graphics Primitives (instant-ngp) | Python testing
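The following sketch renders one test image chunk by chunk to keep memory bounded. The dataset layout (the rays of an H x W image stored contiguously, one row per ray) and the output file name are assumptions.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def test(model, dataset, img_index, H=400, W=400, hn=0., hf=1., nb_bins=192,
         chunk_size=10, device='cuda'):
    # Render chunk_size rows of rays at a time, then assemble the full image.
    rows = []
    for i in range(int(np.ceil(H / chunk_size))):
        rays = dataset[img_index * H * W + i * W * chunk_size:
                       img_index * H * W + (i + 1) * W * chunk_size]
        o = rays[:, :3].to(device)
        d = rays[:, 3:6].to(device)
        rows.append(render_rays(model, o, d, hn=hn, hf=hf, nb_bins=nb_bins).cpu())
    img = torch.cat(rows).reshape(H, W, 3).clamp(0, 1).numpy()
    plt.imsave(f'novel_view_{img_index}.png', img)   # hypothetical output path
```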

Putting it all together

Finally, all the pieces can easily be combined. To keep the implementation short, the rays are precomputed and stored in a dataset that can simply be loaded. If you want to learn more about how the rays are computed, I invite you to have a look at this video, or to follow my course on Udemy.

Instant Neural Graphics Primitives in 100 lines of PyTorch code
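To give an idea of how the pieces sketched above fit together, here is a possible main script. The file names, grid resolutions, near/far bounds, bounding-box scale, and training hyper-parameters are all placeholders to adapt to your own ray dataset.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader

if __name__ == '__main__':
    device = 'cuda' if torch.cuda.is_available() else 'cpu'

    # Pre-computed rays, one row per ray: [origin(3), direction(3), rgb(3)].
    # File names and layout are assumptions; adapt them to your own data.
    training_rays = torch.from_numpy(np.load('training_rays.npy')).float()
    testing_rays = torch.from_numpy(np.load('testing_rays.npy')).float()

    # 16 levels of geometrically increasing grid resolution (roughly 16 to ~2000)
    Nl = [int(16 * 1.38 ** i) for i in range(16)]
    model = NGP(T=2 ** 19, Nl=Nl, L=4, device=device, aabb_scale=3, F=2)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    data_loader = DataLoader(training_rays, batch_size=1024, shuffle=True)

    train(model, optimizer, data_loader, device=device, hn=2., hf=6., nb_epochs=1, nb_bins=192)
    for img_index in range(3):
        test(model, testing_rays, img_index, H=400, W=400, hn=2., hf=6.,
             nb_bins=192, device=device)
```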

I hope this story was helpful to you. If it was, consider clapping for it, and do not forget to subscribe for more tutorials related to NeRF and Machine Learning.

[Full Code] | [Udemy Course] | [NeRF Consulting] | [Career & Internships]
