Overview of UGS-Loc. Our proposed framework refines camera pose by incorporating pose prior uncertainty via Monte Carlo sampling and geometric uncertainty via uncertainty fields, achieving robust localization without retraining.
3D Gaussian Splatting (3DGS) has recently emerged as a powerful scene representation and is increasingly used for visual localization and pose refinement. However, despite its high-quality differentiable rendering, the robustness of 3DGS-based pose refinement remains highly sensitive to both the initial camera pose and the reconstructed geometry.
We identify two major sources of uncertainty: (i) pose prior uncertainty, arising from inaccurate initial pose estimates, and (ii) geometric uncertainty, arising from imperfections in the reconstructed geometry such as noisy depth.
Such uncertainties can distort reprojection geometry and destabilize optimization, even when the rendered appearance still looks plausible. To address these issues, we introduce UGS-Loc, a relocalization framework that combines Monte Carlo pose sampling with Fisher Information-based PnP optimization. Our method explicitly accounts for both pose and geometric uncertainty and requires no retraining or additional supervision. Across diverse indoor and outdoor benchmarks, UGS-Loc consistently improves localization accuracy and significantly increases stability under pose and depth noise.
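The Monte Carlo pose sampling step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `score_fn` callable stands in for rendering the 3DGS scene at a candidate pose and comparing it against the query image, and all function names, parameters, and defaults are hypothetical.

```python
import numpy as np

def sample_pose_candidates(t_prior, sigma_t=0.05, n=64, rng=None):
    """Monte Carlo candidates around a (possibly unreliable) pose prior.

    Only translation is perturbed here for brevity; rotation would be
    perturbed analogously with small-angle axis-angle noise. The prior
    itself is kept as the first candidate, so selection never does
    worse than the input pose under the chosen score."""
    rng = np.random.default_rng(0) if rng is None else rng
    perturbed = t_prior + rng.normal(scale=sigma_t, size=(n, 3))
    return np.vstack([t_prior[None, :], perturbed])

def select_best_pose(candidates, score_fn):
    """Pick the candidate that minimises a rendering-error score."""
    scores = np.array([score_fn(c) for c in candidates])
    return candidates[int(np.argmin(scores))]
```

Because the prior is retained among the candidates, the selected pose is guaranteed to score at least as well as the input prior, which is what makes the sampling robust to poor pose initializations.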
Left: UGS-Loc overview. The framework refines camera pose by combining Monte Carlo pose sampling with uncertainty-aware PnP optimization for robust localization. Right: Comparison between previous 3DGS-based pose refinement and UGS-Loc. Our method uses Monte Carlo pose sampling and Fisher-information-based uncertainty to focus refinement on reliable regions, improving robustness under poor pose priors.
Distributions of error–confidence and error–uncertainty. For each frame, we plot translation or rotation error against the aggregated matcher confidence or depth uncertainty. The trend lines show that higher confidence and lower uncertainty correlate with smaller localization errors.
Effect of uncertainty-based PnP. High-uncertainty correspondences are suppressed, yielding more reliable inliers and improved pose estimates.
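The suppression described above can be sketched as a simple pre-filter, assuming per-correspondence depth uncertainties are available; the weights follow the Fisher-information intuition that confidence scales as 1/&sigma;&sup2;. Function names and the `keep_quantile` parameter are illustrative, not from the paper.

```python
import numpy as np

def fisher_weights(depth_sigma, eps=1e-8):
    """Fisher-information-style confidence: weight proportional to 1 / sigma^2."""
    return 1.0 / (np.asarray(depth_sigma) ** 2 + eps)

def suppress_uncertain_matches(pts3d, pts2d, depth_sigma, keep_quantile=0.7):
    """Drop the least-certain 2D-3D correspondences before running PnP.

    Returns the surviving 3D points, 2D points, and their weights,
    which could then be passed to a weighted PnP solver."""
    w = fisher_weights(depth_sigma)
    thresh = np.quantile(w, 1.0 - keep_quantile)
    keep = w >= thresh
    return pts3d[keep], pts2d[keep], w[keep]
```

Weighting by inverse variance rather than hard-thresholding alone also lets the surviving correspondences contribute unequally to the PnP objective, so residual uncertainty is still accounted for after filtering.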
Visualization of localization quality on the 7-Scenes dataset. Each triplet compares the ground-truth view (top-left) with the view rendered from (i) the input pose prior, (ii) the baseline GS-CPR refinement, and (iii) our UGS-Loc refinement (bottom-right). A closer alignment along the diagonal boundary indicates better pose accuracy.
Visualization of localization quality. Each pair of images compares the ground-truth view (top-left) with the view rendered from (i) the repeated GS-CPR refinement and (ii) our proposed UGS-Loc refinement (bottom-right). A closer alignment along the diagonal boundary indicates better pose accuracy. IR denotes image retrieval. The IR + UGS-Loc pipeline is entirely scene-agnostic, requiring no scene-specific optimization or customization.