Nvidia researchers have developed an approach to reconstructing a 3D scene from a handful of 2D images "almost instantly." A new blog post describes (opens in new tab) the feat, which leverages a popular new technology called neural radiance fields (NeRF), which is accelerated up to 1,000x compared to rival implementations. Nvidia's processing speed is largely due to its AI acceleration leveraging Tensor Cores which speed up both model training and scene rendering. If you are interested but want a TLDR, take a peek at the short video embedded directly below.
Providing some context to its demo, Nvidia says that previous NeRF techniques could take hours to train for a scene and then minutes to render target scenes. Though the results of previous slower implementations were good, Nvidia researchers leveraging AI technology have put a rocket into the performance, and hence Nvidia has the confidence to describe its tech as "Instant NeRF."
You probably have already guessed, but this NeRF tech uses neural networks to represent and render realistic 3D scenes based on an input collection of 2D images. The above video implies that just four snaps were required to create the 3D representation we see in motion. However, the blog might be more realistic in explaining the "the neural network requires a few dozen images taken from multiple positions around the scene, as well as the camera position of each of those shots." The neural network fills in the blanks of the full 360-degree scene and can predict the color of light radiating in any direction, from any point in 3D space for added realism. Nvidia says the technique can work around occlusions.
With Instant NeRF, Nvidia facilitates the rendering of a full 3D scene as described above in tens of milliseconds. That is impressive, but what use could it be? David Luebke, VP for graphics research at Nvidia has big hopes for the technology. "If traditional 3D representations like polygonal meshes are akin to vector images, NeRFs are like bitmap images: they densely capture the way light radiates from an object or within a scene," said Luebke. "In that sense, Instant NeRF could be as important to 3D as digital cameras and JPEG compression have been to 2D photography — vastly increasing the speed, ease and reach of 3D capture and sharing."
Other uses foreseen for Instant NeRF tech include providing help for robots and autonomous vehicles to understand the size and shape of real-world objects from incomplete data. Instant NeRF might also be of value where speed is of the essence for virtual environment design or even architecture. It also will surely be of interest to gaming and VR / metaverse developers.
If you want to read about Instant NeRF in more depth, Nvidia research has published a paper called Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. Last but not least, you can also download, train and run the demo code, available via that same GitHub link.
Though as far as their demonstration video went, it wouldn't exactly be practical to have a subject pose perfectly still while you take numerous photos of them from every direction. And for that matter, I suspect the system would get confused if you tried to do that outside a controlled environment, with other people walking around and trees blowing in the wind, and so on. AI algorithms might be able to filter those things out, but it's probably not going to be happening in near-real-time on standard computer hardware, at least not yet.
but it's finicky to get the images captured properly and is slow to recontruct. What that means to the worlflow is you capture images in the field and hope you got enough of the right areas in the right ways. Then you bring them back to the lab to rescontruct the scene which takes minutes to hours.. only to find out something wasn't right with the capture. Sometimes it's not possible to go back to the scene or the scene has changed enough that you have to recapture the entire thing again, and sometimes you don't recapture but instead spend days or weeks manually tweaking to get the render right.
If this tech can render instantly in the field on a commodity laptop, then that would be a game changer and very valuable to the right industries.
Most people watch porn, where this tech has obvious application.
3D someone else's face in place of hers is priority one.