Embodied AI / Vision-and-Language Navigation / Agent Intelligence

Liuyi Wang

I build vision-language navigation systems that move from benchmark reasoning to embodied execution: multimodal perception, causal decision-making, efficient VLN models, and real-world robot evaluation.

Contact / Google Scholar

Language Intent → Multimodal Perception → Causal / Hierarchical Decision → Embodied Execution

26 AI and robotics papers

9 first-author papers in TPAMI, CVPR, ICCV, IJCAI, TCSVT, EAAI, IROS

8 authorized or disclosed patents

700+ GitHub stars for an embodied navigation engine

55K+ dataset / engine downloads

3x real-world success gain in large-small hierarchical VLN decision-making

I am pursuing my Ph.D. in Control Science and Engineering at Tongji University, advised by Prof. Qijun Chen and Prof. Chengju Liu in the Robotics and Artificial Intelligence Lab (RAIL). I led the RAIL-VLN team and interned at the Center for Embodied AI, Shanghai AI Laboratory, where I worked on a next-generation VLN foundation engine for high-fidelity embodied interaction across legged and wheeled robots.

My research focuses on vision-and-language navigation, embodied AI, vision-language large models, multimodal perception and decision-making, causal learning, knowledge distillation, and trajectory-instruction generation. My broader goal is to make embodied agents robust enough to understand language, perceive complex scenes, reason under uncertainty, and act reliably in the physical world.

News

2026.07: One paper accepted by TASE.
2026.06: Two papers accepted by ECCV 2026.
2026.05: One paper accepted by TPAMI.
2026.04: One paper accepted by CVPR 2026.
2026.03: Honored as an Outstanding Graduate of Shanghai Municipality.
2026.01: Received the Shanghai AI Society Young Excellent Paper Award and the Shanghai AI Lab “Xingqi” Intern Award (top 5%).
2025.11: Honored as Tongji University’s Academic Pioneer, the university’s highest academic distinction for graduate students.
2025.08: One paper accepted by TCSVT.
2025.06: Received the Best Paper Award at the Chinese Intelligent Automation Conference.
2025.05: Two papers accepted by ICCV 2025.
2025.04: Received Tongji University’s Pursuit of Excellence Scholarship, the university’s highest comprehensive student honor.
2024.10: Received a Best Cognitive Robotics Paper nomination at IROS 2024.
2024.06: Two papers accepted by IROS 2024.
2024.02: One paper accepted by CVPR 2024 and one by RAL.

Research Focus

Physical VLN Evaluation

Building high-fidelity embodied navigation platforms and real-world protocols to expose the gap between simulated success and deployable robot behavior.

Large-Small Model Collaboration

Designing uncertainty-aware hierarchical policies where reactive small planners and deliberative large reasoning models collaborate online.

Efficient VLN Agents

Distilling meta-abilities into compact modules, reaching strong navigation performance with only 5% learnable parameters in key settings.

Causal and Multimodal Reasoning

Reducing spurious correlations in VLN through structural causal modeling, intervention learning, semantic factor fusion, and memory representation.

Featured Projects

Shanghai AI Lab / ICCV 2025

Embodied VLN Engine and Physical Evaluation

Core contributor to a new VLN physical-engine platform based on Isaac Sim, supporting humanoid, quadruped, and wheeled robot navigation. The related open-source project has gained about 700 GitHub stars and 55K+ downloads.

Code / Paper

TPAMI 2026

MAGIC: Meta-Ability Guided Distillation

Introduced a chain-of-distillation framework that decomposes embodied navigation ability into meta-knowledge modules and improves efficient VLN learning with 11M learnable parameters.

Paper / Code

CVPR 2024

VLN-GOAT: Causal Learning for Navigation

Built a structural causal view of VLN and combined back-door and front-door adjustment to improve generalization in unseen environments.

Paper / Code

IROS 2024

Semantic-Depth Cross-Modal Navigation

Designed panoramic semantic depth perception and cross-modal fusion for language-guided robot navigation in continuous environments.

Paper / Code

Selected Publications

TPAMI 2026 MAGIC: Efficient Vision-and-Language Navigation through Meta-Ability Guided Interactive Chain-of-Distillation Learning, Liuyi Wang, Zongtao He, Mengjiao Shen, et al.
ICCV 2025 Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities, Liuyi Wang, Xinyuan Xia, Hui Zhao, et al.
CVPR 2024 Vision-and-Language Navigation via Causal Learning, Liuyi Wang, Zongtao He, Ronghao Dang, et al.
TCSVT 2023 RES-StS: Referring Expression Speaker via Self-Training With Scorer for Goal-Oriented Vision-Language Navigation, Liuyi Wang, Zongtao He, Ronghao Dang, et al.
IJCAI 2023 A Dual Semantic-Aware Recurrent Global-Adaptive Network for Vision-and-Language Navigation, Liuyi Wang, Zongtao He, Jiagui Tang, et al.
EAAI 2024 PASTS: Progress-Aware Spatio-Temporal Transformer Speaker for Vision-and-Language Navigation, Liuyi Wang, Chengju Liu, Zongtao He, et al.
IROS 2024 Enhanced Language-guided Robot Navigation with Panoramic Semantic Depth Perception and Cross-modal Fusion, Liuyi Wang, Jiagui Tang, Zongtao He, et al.
TASE Under Review A Comprehensive Survey and Systematic Real-World Evaluation of Embodied Vision-and-Language Navigation, Liuyi Wang, Zongtao He, Kaixuan Sheng, et al.
TRO Under Review CLASH: Collaborative Large-Small Hierarchical Framework for Continuous Vision-and-Language Navigation, Liuyi Wang, Zongtao He, Jiahao Li, et al.
TCSVT 2025 NavComposer: Composing Language Instructions for Navigation Trajectories through Action-Scene-Object Modularization, Zongtao He, Liuyi Wang, Lu Chen, et al.
ICCV 2025 CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation, Xinyu Lin, Yilin Peng, Liuyi Wang, et al.
RAL 2025 Temporal Scene-Object Graph Learning for Object Navigation, Lu Chen, Zongtao He, Liuyi Wang, et al.
IROS 2024 Multimodal Evolutionary Encoder for Continuous Vision-Language Navigation, Zongtao He, Liuyi Wang, Lu Chen, et al.
ICCV 2023 Search for or Navigate to? Dual Adaptive Thinking for Object Navigation, Ronghao Dang, Liuyi Wang, Zongtao He, et al.
ICML 2023 Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation, Ronghao Dang, Lu Chen, Liuyi Wang, et al.
ACMMM 2022 Unbiased Directed Object Attention Graph for Object Navigation, Ronghao Dang, Zhuofan Shi, Liuyi Wang, et al.

Honors and Awards

Tongji University Pursuit of Excellence Scholarship, the university’s highest comprehensive student honor, awarded to only 3 graduate students each year.
Tongji University Academic Pioneer, the university’s highest academic honor for graduate students.
Outstanding Graduate of Shanghai Municipality, awarded in 2026 and 2020.
Principal investigator of a National Natural Science Foundation of China youth student basic research project.
Selected for the 2024 Youth Talent Support Program for PhD Students by the China Association for Science and Technology.
Best Paper Award at the 2025 Chinese Intelligent Automation Conference and Young Excellent Paper Award from Shanghai AI Society.
Best Cognitive Robotics Paper Nomination at IROS 2024 and finalist for the 2025 World AI Conference Young Scholar Best Paper Award.
Shanghai AI Laboratory “Xingqi” Intern Award, top 5% intern recognition.
Huawei Scholarship, Baosteel Excellent Student Award, Tongji Excellent Doctoral Scholarship, and Tongji Outstanding Communist Party Member.
Three-time champion of RoboCup@Home China Open.

Experience

2024.07 - 2026.03, Research Intern, Center for Embodied AI, Shanghai AI Laboratory. Led and contributed to an embodied VLN foundation engine for high-fidelity robot interaction, supporting research on VLA navigation foundation models.
2020.09 - 2026.03, Ph.D., Control Science and Engineering, Tongji University. Dissertation: Mobile Robot Vision-and-Language Navigation: Causal Reasoning and Collaborative Decision-Making via Meta-Ability Decoupling.
2016.09 - 2020.07, B.Eng., Automation, Tongji University.

Academic Service and Skills

Reviewer for TPAMI, TCSVT, CVPR, ICCV, ACMMM, ICRA, IROS, and related venues.
Invited or contributed talks at embodied intelligence, robotics, and AI forums including CAA, IROS, IJCAI, and industrial AI events.
Technical stack: Python, PyTorch, Linux, ROS, Habitat, Isaac Sim, vLLM, LlamaFactory, and Transformers.