Tanner Schmidt

I am a computer vision researcher interested in real-time vision, unsupervised visual representation learning, geometry-based computer vision, and vision for robotics. Currently I'm working as a Research Scientist at Meta Reality Labs.

E-mail | Scholar | GitHub | Twitter

First-Author Publications:

Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation

Tanner Schmidt, Richard Newcombe
CVPR 2025

Project | Paper | Code | Video

A modification of the Segment Anything Model (SAM) that increases efficiency through foveated tokenization.

Self-Supervised Visual Descriptor Learning for Dense Correspondence

Tanner Schmidt, Richard Newcombe, Dieter Fox
IEEE Robotics and Automation Letters 2 (2016)

Paper | Video

Dense visual descriptors are trained using a contrastive loss and similarity labels provided by a 3D reconstruction system.

Depth-based Tracking with Physical Constraints for Robot Manipulation

Tanner Schmidt, Katharina Hertkorn, Richard Newcombe, Zoltan Marton, Michael Suppa, Dieter Fox
ICRA 2015

Paper | Video

A depth-based tracker provides real-time estimates of the pose of robot hands and target objects for tele-operated grasping.

DART: Dense Articulated Real-Time Tracking

Tanner Schmidt, Richard Newcombe, Dieter Fox
RSS 2014

Paper | Code | Video

A CUDA-accelerated model-based tracker using a signed distance function representation of target objects.

Datasets:

YCB-Video

A dataset of RGB-D videos of static configurations of YCB objects. Annotations of object poses are provided.

Jaech Gallery Dataset

A dataset of RGB-D videos capturing the same scene in a variety of lighting conditions and furniture configurations.