Immersive Media
Gaussian Splatting: Painting Immersive Scenes With Reality
The state of the art of immersive media is evolving so rapidly that it’s hard to keep up! It often feels as if you can choose a random limitation in the latest research, wait a few weeks and find a new paper that’s solved that problem. Here, we’ll look at an example of exactly that, one that raises the bar for faster, higher-quality immersive content.
If you didn’t catch our previous post, we looked at Neural Radiance Fields (NeRF) for their ability to “memorize” beautiful, photorealistic 3D snapshots of the real world. In the past three years, a lively community has sprung up around NeRF. Developers and artists have created content, built tools and pushed boundaries on all the ways NeRF can be used.
One of NeRF’s biggest remaining limitations is that real-time interactive viewing of NeRF-based content generally requires reducing image quality, which can cause fog-like visual artifacts and color inaccuracies in the scene.
As it turns out, the answer to this problem came in a paper at SIGGRAPH 2023: 3D Gaussian Splatting for Real-Time Radiance Field Rendering. Despite having little to no conceptual connection to the original NeRF methodology, Gaussian Splatting dramatically improved both visual fidelity and performance of real-time viewing. The results speak for themselves: In just the few months since the paper was released, we’ve seen dozens of product enhancements and launches incorporating Gaussian Splatting functionality.
The Bridge at Argenteuil, Claude Monet, 1874 (Collection of Mr. and Mrs. Paul Mellon)
How It Works: Computer-Generated Impressionism
If you’re a fan of Monet or Renoir, you’re likely familiar with Impressionism. This 19th-century art movement is known for large, distinct brushstrokes and an emphasis on larger forms, as you can see in the above example. Look too closely and you’ll mostly see brushstrokes; the full scene comes together when you gaze at it from far enough away.
As it turns out, Impressionism is a useful analogy for Gaussian Splatting. Creating a scene with Gaussian Splatting is like making an Impressionist painting, but in 3D. The scene is composed of millions of “splats,” also known as 3D Gaussians. Each splat is like a soft, volumetric cloud painted into empty 3D space, and each splat can show different colors from different angles to mimic view-dependent effects like reflections. When you build a scene from lots of small splats, the result can be amazingly photorealistic!
Here’s an example. I recorded this cellphone video at the Duke Gardens in Durham, North Carolina:
Here’s the result as an interactive Gaussian Splatting scene via Luma AI. You can click and drag to move the scene around.
You can view a couple more examples from my visit to the gardens here and here.
From a technical perspective, 3D Gaussians are a unique variant of point clouds, where each point encodes spherical harmonics for view-dependent color and a covariance matrix that describes its shape (in effect an ellipsoid: a sphere stretched by a different amount along each axis and then rotated). Although splat-based rendering has existed for a long time, the Gaussian Splatting paper was the first to show that 3D Gaussians serve as an excellent scene representation, and it describes new methods to create and efficiently render these scenes. For more details, refer to the SIGGRAPH paper or the video overview provided on the authors’ website.
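To make the covariance idea a little more concrete, here is a minimal sketch in plain Python. It assumes the factorization described in the SIGGRAPH paper, in which a splat's covariance is built from three per-axis scales and a rotation quaternion as Sigma = R S Sᵀ Rᵀ. The function names and the example values are my own, chosen for illustration; this is not code from the paper's implementation.

```python
import math

def quat_to_rotation(w, x, y, z):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize so the result is a pure rotation
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def splat_covariance(scale, quat):
    """Build Sigma = R S S^T R^T from per-axis scales and a rotation quaternion."""
    R = quat_to_rotation(*quat)
    # M = R @ S with S = diag(scale): column j of R is multiplied by scale[j].
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T, which is symmetric by construction.
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Illustrative splat: stretched along x, squashed along z.
scale = (2.0, 1.0, 0.5)
upright = splat_covariance(scale, (1.0, 0.0, 0.0, 0.0))  # identity rotation
# The same splat turned 90 degrees about the z axis.
turned = splat_covariance(scale, (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)))

print(upright)
print(turned)
```

With no rotation, the diagonal of the matrix holds the squared scales, so the splat is longest along x. After the quarter turn about z, the large entry moves to the y position (up to floating-point rounding): the same ellipsoid, now pointing in a different direction.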
Gaussian Splatting scenes tend to be large compared to other scene formats, on the order of hundreds of megabytes to gigabytes. Each splat is 248 bytes, and a scene is typically composed of millions of splats. However, programmer Aras Pranckevičius has a great technical deep dive showing that Gaussian Splatting is ripe for compression, bringing sizes under a gigabyte with little to no visual impact, or smaller if you can accept “lossy” visuals.
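As a rough back-of-the-envelope check on those sizes, the sketch below multiplies out the 248-byte figure. The per-field breakdown is an assumption on my part: it is one layout of 32-bit floats that adds up to 248 bytes, and the field names are illustrative rather than taken from any particular file format specification.

```python
# Assumed per-splat layout: 62 values stored as 32-bit floats (field names are illustrative).
FLOATS_PER_FIELD = {
    "position": 3,
    "normal": 3,       # assumed to be stored even if a renderer ignores it
    "sh_dc": 3,        # base color term of the spherical harmonics
    "sh_rest": 45,     # higher-order spherical harmonics coefficients
    "opacity": 1,
    "scale": 3,
    "rotation": 4,     # quaternion
}
BYTES_PER_FLOAT = 4

def bytes_per_splat():
    """Uncompressed size of one splat under the assumed layout."""
    return sum(FLOATS_PER_FIELD.values()) * BYTES_PER_FLOAT

def scene_size_bytes(num_splats):
    """Uncompressed size of a scene with the given number of splats."""
    return num_splats * bytes_per_splat()

for count in (1_000_000, 5_000_000):
    megabytes = scene_size_bytes(count) / 1_000_000  # decimal megabytes
    print(f"{count:,} splats -> {megabytes:,.0f} MB uncompressed")
```

Under this assumption, one million splats come to 248 MB and five million to about 1.24 GB, which lines up with the "hundreds of megabytes to gigabytes" range. Most of each splat is spherical harmonics data, which hints at where compression has room to work.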
Network Traffic of the Future
With all this said about Gaussian Splatting, where are we going next?
The dust hasn’t settled on immersive scene representations. A new research preprint already proposes combining the strengths of NeRF and Gaussian Splatting into a hybrid approach. Still, everything is moving so fast that the state of the art could change any day. When things do settle, the next step will be standardization.
If Gaussian Splatting is here to stay, we should expect file sizes to grow with the scale of the use cases at play. For example, a real estate agent selling a house may want to deliver an online virtual tour that allows viewers to experience granular details like the sparkle of a fine granite countertop while also walking through the rooms and seeing the house from the outside.
Going even bigger, consider a power transmission/distribution company that constructs a visual digital twin of its entire power grid across a city, then syncs it across cloud simulations and user interfaces. Whereas previously we discussed scenes on the order of millions of splats, eventually we’ll need billions and beyond.
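To get a feel for what that scale could mean on the network, here is a small hedged calculation. It reuses the uncompressed 248 bytes per splat quoted earlier; the link speed and the compression ratio are hypothetical inputs I picked for illustration, and the estimate ignores protocol overhead, streaming and level-of-detail techniques.

```python
BYTES_PER_SPLAT = 248  # uncompressed per-splat size quoted earlier in this post

def transfer_seconds(num_splats, link_bits_per_second, compression_ratio=1.0):
    """Idealized time to move a whole scene over a link, ignoring overhead.

    compression_ratio is a hypothetical knob: 10.0 means the scene is
    ten times smaller than its uncompressed size.
    """
    payload_bits = num_splats * BYTES_PER_SPLAT * 8 / compression_ratio
    return payload_bits / link_bits_per_second

one_billion = 1_000_000_000
gigabit_link = 1e9  # hypothetical 1 Gbps link

raw = transfer_seconds(one_billion, gigabit_link)
compressed = transfer_seconds(one_billion, gigabit_link, compression_ratio=10.0)
print(f"uncompressed: {raw / 60:.1f} minutes")
print(f"10x compressed: {compressed / 60:.1f} minutes")
```

Under these assumptions, a billion-splat scene is 248 GB uncompressed and would take roughly half an hour to move over a fully utilized 1 Gbps link, which is why compression and smarter delivery matter as scenes grow.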
CableLabs’ Immersive Media Experiences team engages with immersive standards activities and monitors the state of the art of immersive media to understand and communicate key trends and their impact on the cable industry. Subscribe to our blog for more updates from the Immersive Media Team and other activities at CableLabs.