We study basic-level categories for describing visual
concepts, and empirically observe context-dependent
basic-level names across thousands of concepts. We propose methods for predicting basic-level names using a series of classification and ranking tasks, producing the first
large-scale catalogue of basic-level names for hundreds
of thousands of images depicting thousands of visual concepts. We also demonstrate the usefulness of our method
with a picture-to-word task, showing strong improvement
(0.17 precision at slightly higher recall) over recent work
by Ordonez et al., and observing significant effects of incorporating both visual and language context for classification.
Moreover, our study suggests that a model for naming visual
concepts is an important part of any automatic image/video
captioning and visual story-telling system.
basic-level names catalogue (csv), name benchmark (zip), comparison data (zip)
Published: WACV 2015