Fast open-vocabulary semantic segmentation (pure vision work)

Currently open to new students?

Yes
No

Description

This is a simple, well-defined project. Our goal is a fast, open-vocabulary, zero-shot encoder that runs in a single forward pass and can be queried to produce semantic segmentation. We will evaluate both 2D and 3D segmentation performance. Current leaders in this space are:
  •  RayFronts  
  •  ResCLIP 
  •  Trident 
  •  NACLIP 
Improving on RayFronts and Trident is easy. We need someone to drive a large amount of benchmarking. The resulting model will be used in many downstream robotics applications.
Target venues: CVPR / ECCV / ICCV
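
To make the "queried to get semantic segmentation" step concrete, here is a minimal, dependency-free sketch of one common way such a query can work: each pixel feature is compared to a set of class text embeddings by cosine similarity, and the best-matching class is taken as the label. Everything in it is an assumption for illustration only: the function names, the toy 3-dimensional features, and the class names are hypothetical, and the project's actual encoder and query mechanism may differ. In practice this would be a single batched matrix product in PyTorch rather than Python loops.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def segment(pixel_features, text_embeddings):
    """Assign each pixel the index of the most similar text embedding.

    pixel_features:  H x W x D nested lists (hypothetical encoder output).
    text_embeddings: K x D nested lists, one row per queried class name.
    Returns an H x W nested list of class indices.
    """
    text = [normalize(t) for t in text_embeddings]
    label_map = []
    for row in pixel_features:
        label_row = []
        for feature in row:
            f = normalize(feature)
            # Cosine similarity between this pixel and every class embedding.
            sims = [sum(a * b for a, b in zip(f, t)) for t in text]
            label_row.append(max(range(len(sims)), key=sims.__getitem__))
        label_map.append(label_row)
    return label_map


# Toy data: two classes and a 2 x 2 feature map (all values are made up).
class_names = ["floor", "chair"]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
pixel_features = [
    [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]],
    [[0.7, 0.3, 0.2], [0.1, 0.9, 0.0]],
]

label_map = segment(pixel_features, text_embeddings)
named = [[class_names[i] for i in row] for row in label_map]
print(named)
```

Because the class list is only a set of text embeddings, changing the vocabulary means swapping those rows; the image features do not need to be recomputed.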

Skills Required

  • Strong coding skills in Python, particularly PyTorch.
  • Ability to read, understand, and replicate results of papers.
  • Proactive.

Classes Accepted into Project

Junior
Senior
Graduate Student

Compensation

Units 9
Units 12
Pay

Contact

Scientist, postdoc or student to contact about the project: