Introduction
The 4th PVUW challenge will be held in conjunction with CVPR 2025 at the Music City Center, Nashville, TN. Pixel-level scene understanding is one of the fundamental problems in computer vision; it aims to recognize the object class, mask, and semantics of each pixel in a given image. Since the real world is dynamic and video-based rather than static, learning to perform video segmentation is more reasonable and practical for realistic applications. To advance the segmentation task from images to videos, we will present new datasets and competitions in this workshop, targeting the challenging yet practical task of Pixel-level Video Understanding in the Wild (PVUW). This workshop will cover, but is not limited to, the following topics:
- Semantic/panoptic segmentation for images/videos
- Referring image/video comprehension/segmentation
- Video object/instance segmentation
- Video understanding in complex environments
- Language-guided video understanding
- Audio-guided video segmentation
- Efficient computation for video scene parsing
- Semi-supervised recognition in videos
- New metrics to evaluate the quality of video scene parsing results
- Real-world video applications, e.g., autonomous driving, indoor robotics, and visual navigation
Challenge Tracks & Submission
Track 1: Complex Video Object Segmentation (MOSE) Track
MOSE aims to track and segment objects in videos of complex environments. MOSE submission server [click here].
Track 2: Motion Expression guided Video Segmentation (MeViS) Track
MeViS focuses on segmenting objects in video based on a sentence describing the motion of the objects. MeViS submission server [click here].
Challenge Timeline
Event | Date |
---|---|
Challenge release | Mar 01, 2025 |
Validation server online [click here] | Mar 01, 2025 |
Test server online | Mar 15, 2025 |
Submission deadline | Mar 25, 2025 |
Notification | Mar 27, 2025 |
Call for Papers
[Update] PVUW 2025 will be using [OpenReview] to manage submissions. We are looking forward to your work and engaging discussions at the workshop!
We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) through the paper submission portal.
Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.
Paper Submission Timeline
[Update] We just received an update from IEEE that the metadata submission deadline has been extended, so we can keep the original submission timeline unchanged. Please follow the timeline in the table and submit your paper before Mar 25. All accepted papers will be published in the official CVPR Workshops Proceedings. Sorry for the confusion!
Event | Date |
---|---|
Regular paper submission deadline | Mar 25, 2025 |
Supplemental material deadline | Mar 25, 2025 |
Notification of paper acceptance | Mar 31, 2025 |
Challenge paper submission deadline | Apr 1, 2025 |
Camera ready deadline | Apr 7, 2025 |
Speakers

Xiangtai Li
ByteDance
Hengshuang Zhao
The University of Hong Kong
Organizers

Henghui Ding
Fudan University
Nikhila Ravi
Meta AI
Chang Liu
Nanyang Technological University
Yunchao Wei
Beijing Jiaotong University
Jiaxu Miao
Sun Yat-Sen University
Shuting He
Shanghai University of Finance and Economics
Zuxuan Wu
Fudan University
Zongxin Yang
Harvard University
Yi Yang
Zhejiang University
Si Liu
Beihang University
Yi Zhu
Amazon
Elisa Ricci
University of Trento
Cees Snoek
University of Amsterdam
Song Bai
ByteDance
Philip Torr
University of Oxford
Contact
Feel free to contact us:
henghui.ding@gmail.com
liuc0058@e.ntu.edu.sg