I am currently a research fellow at the Australian National University, where I completed my PhD in 2018 under the supervision of Dr Lexing Xie and Dr Xuming He. My PhD research focused on computer vision and natural language processing,
with the goal of joining image and text modalities to produce natural language captions for images and videos.
Alexander Mathews, Lexing Xie, Xuming He
We design a system to describe an image with emotions,
and present a model that automatically generates captions with
positive or negative sentiments.
We propose a novel switching recurrent neural network with word-level regularization,
which is able to produce emotional image captions using
only 2000+ training sentences containing sentiments.
We evaluate the captions with different
automatic and crowd-sourcing metrics.
Our model compares favourably in common quality metrics for image captioning.
In 84.6% of cases the generated positive captions were judged as being at
least as descriptive as the factual captions. Of these positive captions, 88% were
confirmed by the crowd-sourced workers as having the appropriate sentiment.
Download: Supplement pdf, ANP list txt
To appear: AAAI'16
Alexander Mathews, Lexing Xie, Xuming He
We study basic-level categories for describing visual
concepts, and empirically observe context-dependent
basic-level names across thousands of concepts. We propose methods for predicting basic-level names using a series of classification and ranking tasks, producing the first
large-scale catalogue of basic-level names for hundreds
of thousands of images depicting thousands of visual concepts. We also demonstrate the usefulness of our method
with a picture-to-word task, showing strong improvement
(0.17 precision at slightly higher recall) over recent work
by Ordonez et al., and observing significant effects of incorporating both visual and language context for classification.
Moreover, our study suggests that a model for naming visual
concepts is an important part of any automatic image/video
captioning and visual story-telling system.
Download: pdf, basic-level names catalogue (csv, pickle), name benchmark zip, comparison data zip
Published: WACV2015