FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction

Jiale Xu¹, Shenghua Gao², Ying Shan¹
¹ARC Lab, Tencent PCG    ²The University of Hong Kong

Reconstruct 3D Gaussians from unposed sparse-view images and recover their camera parameters.

Abstract

Existing sparse-view reconstruction models heavily rely on accurate known camera poses. However, deriving camera extrinsics and intrinsics from sparse-view images presents significant challenges. In this work, we present FreeSplatter, a highly scalable, feed-forward reconstruction framework capable of generating high-quality 3D Gaussians from uncalibrated sparse-view images and recovering their camera parameters in mere seconds. FreeSplatter is built upon a streamlined transformer architecture, comprising sequential self-attention blocks that facilitate information exchange among multi-view image tokens and decode them into pixel-wise 3D Gaussian primitives. The predicted Gaussian primitives are situated in a unified reference frame, allowing for high-fidelity 3D modeling and instant camera parameter estimation using off-the-shelf solvers. To cater to both object-centric and scene-level reconstruction, we train two model variants of FreeSplatter on extensive datasets. In both scenarios, FreeSplatter outperforms state-of-the-art baselines in terms of reconstruction quality and pose estimation accuracy. Furthermore, we showcase FreeSplatter's potential in enhancing the productivity of downstream applications, such as text/image-to-3D content creation.

Pipeline. Given N input views with no known camera extrinsics or intrinsics, we first patchify them into image tokens and then feed all tokens into a sequence of self-attention blocks to exchange information across views. Finally, we decode the output image tokens into N Gaussian maps, from which we can render novel views and recover the camera focal length and poses with simple iterative solvers.
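
The first and last steps of this pipeline can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names `patchify` and `estimate_focal`, the patch size, and the solver are all assumptions made for illustration. `patchify` flattens N views into one shared token sequence, so that self-attention over the sequence can mix information across views. `estimate_focal` shows one possible simple iterative solver (Weiszfeld-style reweighted least squares) for the focal length, assuming a pixel-aligned map of 3D points expressed in that view's own camera frame, positive depths, and a principal point at the image center.

```python
import numpy as np


def patchify(views, patch=8):
    """Split N views of shape (N, H, W, C) into one sequence of patch tokens.

    Returns an array of shape (N * H/patch * W/patch, patch * patch * C).
    The patch size is an illustrative assumption.
    """
    n, h, w, c = views.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be multiples of patch"
    gh, gw = h // patch, w // patch
    x = views.reshape(n, gh, patch, gw, patch, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (N, gh, gw, patch, patch, C)
    return x.reshape(n * gh * gw, patch * patch * c)


def estimate_focal(points, iters=10):
    """Estimate a single focal length from a pixel-aligned point map.

    points: (H, W, 3) array of 3D points in the camera frame, with z > 0.
    Assumes the principal point is the image center (hypothetical setup).
    Minimizes sum_i || pix_i - f * (x_i / z_i, y_i / z_i) || by iteratively
    reweighted least squares.
    """
    h, w, _ = points.shape
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)
    # Pixel coordinates relative to the assumed principal point.
    pix = np.stack([u - (w - 1) / 2, v - (h - 1) / 2], axis=-1).reshape(-1, 2)
    pts = points.reshape(-1, 3).astype(np.float64)
    q = pts[:, :2] / pts[:, 2:3]  # normalized image-plane coordinates
    weights = np.ones(len(q))
    f = 0.0
    for _ in range(iters):
        # Weighted closed-form least-squares update for f.
        f = np.sum(weights * np.sum(pix * q, axis=1)) / np.sum(
            weights * np.sum(q * q, axis=1)
        )
        # Down-weight pixels with large reprojection residuals.
        resid = np.linalg.norm(pix - f * q, axis=1)
        weights = 1.0 / np.maximum(resid, 1e-8)
    return f
```

With the focal length fixed, camera poses could then be recovered by a standard PnP solver applied to the pixel-to-point correspondences; that step is omitted here.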

Object-level Reconstruction

Scene-level Reconstruction

Image-to-3D Generation

Examples (each with input images, a 3DGS viewer, and a pose viewer):

Multi-view Generator: Zero123++ v1.2
house, stitch, camera

Multi-view Generator: Hunyuan3D-1
crab, dragon, sign

BibTeX

@article{xu2024freesplatter,
    title={FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction},
    author={Xu, Jiale and Gao, Shenghua and Shan, Ying},
    journal={arXiv preprint arXiv:2412.09573},
    year={2024},
}