Yujiao Shi

I am an assistant professor at ShanghaiTech University. Previously, I was a research fellow at the Australian National University, where I also completed my Ph.D. under the supervision of Prof. Hongdong Li. My research focuses on camera localization, 3D reconstruction, view synthesis, and scene generation, particularly from aerial perspectives.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github  /  Linkedin

Potential Students:

  • Looking for highly self-motivated undergraduate and master's students with strong learning and problem-solving abilities to work on challenging computer vision research problems!
  • Postdoc, RA, and visiting student (undergraduate/master's/PhD) positions are also available.

News

  • 02/2025: I will serve as an Area Chair for NeurIPS 2025
  • 01/2025: One paper accepted to ICLR 2025. Congrats, Xianghui!
  • 11/2024: Two papers accepted to 3DV 2025
  • 07/2024: One paper accepted to SIGGRAPH Asia 2024
  • 07/2024: Two papers accepted to ECCV 2024
  • 05/2024: I am serving as an Area Chair for NeurIPS 2024
  • 04/2024: Our workshop proposal "UAVs in Multimedia" was accepted at ACM MM 2024
  • 02/2024: I am serving as a co-organizer of the Women in Computer Vision Workshop at ECCV 2024
  • 01/2024: One paper accepted to ICRA 2024
  • 10/2023: Co-organizer of workshop "UAVs in Multimedia" at ACM MM 2023
  • 09/2023: One paper accepted to NeurIPS 2023
  • 06/2023: One paper accepted to IROS 2023
  • 06/2023: Tutorial speaker on cross-model camera localization at CVPR 2023

Selected Publications
BevSplat: Resolving Height Ambiguity via Feature-Based Gaussian Primitives for Weakly-Supervised Cross-View Localization
Qiwei Wang, Shaoxun Wu, and Yujiao Shi *
2025

AerialGo: Walking-through City View Generation from Aerial Perspectives
Fuqiang Zhao, Yijing Guo, Siyuan Yang, Xi Chen, Luo Wang, Lan Xu, Yingliang Zhang, Yujiao Shi *, and Jingyi Yu *
2025

Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Xianghui Ze, Zhenbo Song, Qiwei Wang, Jianfeng Lu, and Yujiao Shi *
ICLR, 2025

LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives
Jiadi Cui, Junming Cao, Fuqiang Zhao, Zhipeng He, Yifan Chen, Yuhui Zhong, Lan Xu, Yujiao Shi *, Yingliang Zhang *, and Jingyi Yu *
ACM Transactions on Graphics (TOG), 2024

FastGrasp: Efficient Grasp Synthesis with Diffusion
Xiaofei Wu, Tao Liu, Caoji Li, Yuexin Ma, Yujiao Shi *, and Xuming He *
3DV, 2025

Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis
Tao Jun Lin, Wenqing Wang, Yujiao Shi, Akhil Perincherry, Ankit Vora, and Hongdong Li
3DV, 2025

Weakly-Supervised Camera Localization by Ground-to-Satellite Image Registration
Yujiao Shi, Akhil Perincherry, Ankit Vora, and Hongdong Li
ECCV, 2024

Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth
Zimin Xia, Yujiao Shi, Hongdong Li, and Julian FP Kooij
ECCV, 2024

Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer
Yujiao Shi, Fei Wu, Akhil Perincherry, Ankit Vora, and Hongdong Li
ICCV, 2023
paper / code

This work proposes a decoupled rotation and translation estimation method for cross-view image matching, achieving significant performance improvement.

CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization
Yujiao Shi, Xin Yu, Shan Wang, and Hongdong Li
ACCV, 2022
paper / code

This work addresses city-scale satellite image-based camera localization by using a sequence of ground-view images.

Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image
Yujiao Shi and Hongdong Li
CVPR, 2022
paper / code

We introduce a new pose optimization method to accurately pinpoint which pixel in a satellite image corresponds to the query camera location.

Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching
Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, and Hongdong Li
TPAMI, 2022
paper / code

We propose a projective transform, which (1) complements the polar transform to achieve better coarse localization performance and (2) provides a novel handcrafted method to accurately localize the query camera on its matching satellite image.

Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery
Yujiao Shi, Dylan Campbell, Xin Yu, and Hongdong Li
TPAMI, 2022
paper / code

This work synthesizes street-view panoramas from satellite images, implicitly estimating a height map of the satellite image in the process.

Self-Supervised Visibility Learning for Novel View Synthesis
Yujiao Shi, Hongdong Li, and Xin Yu
CVPR, 2021
paper / code

We estimate target-view depth and source-view visibility in an end-to-end manner.

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching
Yujiao Shi, Xin Yu, Dylan Campbell, and Hongdong Li
CVPR, 2020
paper / code

The first 3-DoF camera pose estimation framework via ground-to-satellite image matching.

Optimal Feature Transport for Cross-View Image Geo-Localization
Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, and Hongdong Li
AAAI, 2020
paper / code

Motivated by optimal transport, we introduce a cross-view feature transport module to bridge the cross-view domain gap.

Spatial-Aware Feature Aggregation for Cross-View Image based Geo-Localization
Yujiao Shi, Liu Liu, Xin Yu, and Hongdong Li
NeurIPS, 2019
paper / code

A polar transform is introduced to bridge the ground-and-satellite domain gap, which significantly improves on state-of-the-art cross-view localization performance.

Academic Service

Conference Program Committee Member/Reviewer:

  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • International Conference on Computer Vision (ICCV)
  • European Conference on Computer Vision (ECCV)
  • Asian Conference on Computer Vision (ACCV)
  • Conference on Neural Information Processing Systems (NeurIPS)
  • International Conference on Learning Representations (ICLR)
  • AAAI Conference on Artificial Intelligence (AAAI)
  • International Joint Conference on Artificial Intelligence (IJCAI)
  • International Conference on 3D Vision (3DV)

Journal Reviewer:

  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Image Processing (TIP)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  • IEEE Robotics and Automation Letters (RAL)
  • IEEE Transactions on Artificial Intelligence
  • IEEE Transactions on Geoscience and Remote Sensing
  • ISPRS Journal of Photogrammetry and Remote Sensing
  • International Journal of Digital Earth
  • Knowledge-Based Systems
