Research Projects
Object and Scene Segmentation
|
We address the problem of joint detection and segmentation
of multiple object instances in an image, a key step towards
scene understanding. Inspired by data-driven methods,
we propose an exemplar-based approach to the task
of instance segmentation, in which a set of reference image/shape
masks is used to find multiple objects. We design
a novel CRF framework that jointly models object appearance,
shape deformation, and object occlusion.
|
Xuming He, Stephen Gould,
An Exemplar-based CRF for Multi-instance Object Segmentation,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
[pdf] [Dataset]
Buyu Liu, Xuming He, Stephen Gould,
Multi-class Semantic Video Segmentation with Exemplar-based Object Reasoning,
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015
[pdf]
Holistic Video Understanding
|
We address the problem of integrating object reasoning
with supervoxel labeling in multiclass semantic video
segmentation. To this end, we first propose an object-augmented
CRF in spatio-temporal domain, which
captures long-range dependency between supervoxels, and
imposes consistency between object and supervoxel labels.
We develop an efficient inference algorithm to
jointly infer the supervoxel labels, object activations and
their occlusion relations for a large number of object
hypotheses.
|
Buyu Liu, Xuming He,
Multiclass Semantic Video Segmentation With Object-Level Active Inference,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
[pdf] [suppl zip]
Buyu Liu, Xuming He, Stephen Gould,
Joint Semantic and Geometric Segmentation of Videos with a Stage Model,
IEEE Winter Conference on Applications of Computer Vision (WACV), 2014
[pdf]
Depth Prediction from Images
|
We tackle the problem of single image depth estimation,
which, without additional knowledge, suffers from many
ambiguities. We introduce a hierarchical
representation of the scene and
formulate single image depth estimation as inference in a
graphical model whose edges let us encode the interactions
within and across the different layers of our hierarchy. Our
method therefore still produces detailed depth estimates, but
also leverages higher-level information about the scene.
|
Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu,
Indoor Scene Structure Analysis for Single Image Depth Estimation,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
[pdf]
Miaomiao Liu, Mathieu Salzmann, Xuming He,
Discrete-Continuous Depth Estimation from a Single Image,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
[pdf]
Object Detection in Context
|
Exploring contextual relations is one of the key factors to improve object detection under challenging viewing condition and to scale up recognition to large numbers of object classes. We consider two effective approaches that incorporate contextual information: object codetection, which jointly detects object instances in a set of related images, and structural Hough voting, which models the context from 2.5D perspective for object localization under heavy occlusion.
|
Zeeshan Hayder, Mathieu Salzmann, Xuming He,
Object Co-Detection via Efficient Inference in a Fully-Connected CRF,
European Conference on Computer Vision (ECCV), 2014
[pdf]
Tao Wang, Xuming He, Nick Barnes,
Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013
[pdf] [Dataset]
From Image to Concept
|
We study basic-level categories for describing visual concepts, and empirically observe context-dependant basic-level names across thousands of concepts.
We propose methods for predicting basic-level names using a series of classification and ranking tasks, producing the first large-scale catalogue of basic-level names for hundreds of thousands of images depicting thousands of visual concepts. We also demonstrate the usefulness of our method with a picture-to-word task.
|
Alexander Mathews, Lexing Xie, Xuming He,
Choosing Basic-Level Concept Names using Visual and Language Context,
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015
[pdf] [suppl pdf]
Lexing Xie, Xuming He,
Picture Tags and World Knowledge: Learning Tag Relations from Visual Semantic Sources,
The 21st ACM International Conference on Multimedia (ACM MM), 2013
[pdf]
Past Projects
Contour Detection and Completion
|
Yansheng Ming, Hongdong Li, Xuming He,
Winding Number for Region-Boundary Consistent Salient Contour Extraction,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013
[pdf]
Yansheng Ming, Hongdong Li, Xuming He,
Connected Contours: a Contour Completion Model That Respects Closure-Effect,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
[pdf]
|
Image Understanding for Bionic Eye
|
Xuming He, Junae Kim, Nick Barnes,
An Face-based Visual Fixation System for Prosthetic Vision,
Annual International Conference of the Engineering in Medicine and Biology Society (EMBC), 2012, USA
Tao Wang, Xuming He, Nick Barnes,
Glass Object Localization by Joint Inference of Boundary and Depth,
International Conference on Pattern Recognition (ICPR), 2012
[pdf] [Dataset]
|
Motion Anlaysis
|
Shuang Wu, Xuming He, Hongjing Lu, and Alan Yuille,
A Unified Model of Short-range and Long-range Motion Perception,
Annual Conference on Neural Information Processing Systems (NIPS), 2010, Vancouver, Canada
[pdf]
Xuming He and Alan Yuille,
Occlusion Boundary Detection using Pseudo-Depth,
European Conference on Computer Vision (ECCV), 2010, Greece
[pdf]
|
Image/Scene Labeling
|
Xuming He, and Richard S. Zemel,
Latent Topic Random Fields: Learning Using a Taxonomy of Labels,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008
[pdf (with Appendix)]
Xuming He, Richard Zemel, and Deb Ray,
Learning and Incorporating Top-down Cues in Image Segmentation,
European Conference on Computer Vision (ECCV), 2006.
[pdf] [Dataset]
Xuming He, Richard Zemel, and Miguel Carreira-Perpinan,
Multiscale Conditional Random Fields for Image Labelling,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004
[pdf] [Dataset]
|
|