Advanced 3D Scene Understanding

in Realistic and Dynamic Scenes




In Conjunction with ICME 2026

July 5 - July 9, Bangkok, Thailand

News

  • Dec 17, 2026: The website is live. The call for papers is open.





Overview


   Understanding complex 3D scenes in realistic and dynamic environments is a fundamental challenge for next-generation multimedia systems. Recent advances in 3D/4D reconstruction, neural rendering (e.g., NeRF and 3D Gaussian Splatting), multimodal foundation models, and embodied AI have significantly improved our ability to capture and represent geometry, appearance, and semantics. However, achieving robust, scalable, and interaction-aware 3D scene understanding in real-world settings (characterized by clutter, temporal changes, physical interactions, and human involvement) remains an open problem.

   This workshop focuses on advanced 3D scene understanding in realistic and dynamic scenes, aiming to bridge research across computer vision, graphics, multimedia, robotics, and AR/VR. We emphasize holistic scene intelligence that integrates high-fidelity geometry, semantic and relational reasoning, temporal consistency, physics awareness, and multimodal perception. Topics of interest include, but are not limited to, dynamic and 4D scene modeling, open-vocabulary and language-grounded 3D understanding, human-object interaction modeling, physics-aware scene interpretation, and generative approaches for scene synthesis, completion, and refinement.

   The workshop will serve as a forum for discussing emerging methodologies, datasets, benchmarks, and system-level solutions that enable interactive, scalable, and realistic 3D scene understanding. By bringing together researchers and practitioners from diverse communities, it aims to foster cross-disciplinary collaboration and identify key challenges and future directions. Ultimately, the workshop seeks to advance the foundations of 3D scene understanding that are critical for immersive multimedia applications, embodied agents, digital twins, and intelligent interactive systems.




Call for papers

   We invite submissions to the ICME 2026 workshop on Advanced 3D Scene Understanding in Realistic and Dynamic Scenes, which brings researchers together to discuss new technologies for 3D scene understanding. We solicit original research and survey papers of no more than 6 pages (including all text, figures, and references). All accepted papers will be presented as either oral or poster presentations, and a best paper award will be given. Papers that violate anonymity or do not use the ICME submission template will be rejected without review. By submitting a manuscript to this workshop, the authors acknowledge that no paper substantially similar in content has been submitted to another workshop or conference during the review period. Authors should prepare their manuscript according to the Guide for Authors of ICME. For detailed instructions, see here. The submission address is here.
  The scope of this workshop includes, but is not limited to, the following topics:

  • High-Fidelity 3D/4D Scene Understanding
  • Dynamic and Temporally Consistent Scene Understanding
  • Semantic and Relational Scene Understanding
  • Physics-Aware Scene Understanding
  • Multimodal and Cross-Modal Scene Understanding
  • Scalable and Diverse Scene Modeling
  • Human-Centric and Interaction-Centric Scene Understanding
  • Generative Models for Scene Understanding
  • Benchmarking and Evaluation for 3D Scene Understanding



Organizers

Xin Tan
East China Normal University & Shanghai AI Laboratory, China
Xuequan Lu
The University of Western Australia, Australia
Shuquan Ye
The Chinese University of Hong Kong, China
Jiaying Lin
The Hong Kong University of Science and Technology, China
Qijian Tian
Shanghai Jiao Tong University, China
Ruijie Xu
East China Normal University, China


If you have any questions, feel free to contact xtan AT cs <DOT> ecnu <DOT> edu <DOT> cn