Learning to make better mistakes: Semantics-aware visual food recognition

Hui Wu; Michele Merler; Rosario Uceda-Sosa; John Smith

doi:10.1145/2964284.2967205

MM 2016

Conference paper

01 Oct 2016

Abstract

We propose a visual food recognition framework that integrates the inherent semantic relationships among fine-grained classes. Our method learns semantics-aware features by formulating a multi-task loss function on top of a convolutional neural network (CNN) architecture. It then refines the CNN predictions using a random-walk-based smoothing procedure, which further exploits the rich semantic information. We evaluate our algorithm on a large "food-in-the-wild" benchmark [3], as well as a challenging dataset of restaurant food dishes with very few training images. The proposed method achieves higher classification accuracy than a baseline that directly fine-tunes a deep learning network on the target dataset. Furthermore, we analyze the consistency of the learned model with the inherent semantic relationships among food categories. Results show that the proposed approach provides more semantically meaningful results than the baseline method, even in cases of mispredictions.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Storage and Retrieval - Content Analysis and Indexing; I.4 [Computing Methodologies]: Image Processing and Computer Vision

General Terms: Hierarchical Deep Learning
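The random-walk refinement mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this is not the authors' procedure: it assumes a random walk with restart over a class-similarity graph, and the function name `random_walk_smooth`, the `alpha` restart weight, the iteration count, the three class names, and the similarity values are all hypothetical choices made for illustration.

```python
def random_walk_smooth(scores, similarity, alpha=0.8, n_iter=50):
    """Smooth class scores over a semantic graph (illustrative assumption,
    not the paper's exact method).

    scores:     non-negative per-class scores, e.g. CNN softmax outputs
    similarity: square matrix of non-negative semantic similarities
    alpha:      weight kept on the original prediction at every step
    """
    n = len(scores)
    total = sum(scores)
    p0 = [s / total for s in scores]  # normalise scores to a distribution

    # Row-normalise the similarities into transition probabilities.
    transition = []
    for row in similarity:
        row_sum = sum(row)
        transition.append([w / row_sum for w in row])

    p = list(p0)
    for _ in range(n_iter):
        # One walk step: probability mass flows from class i to class j.
        walked = [sum(transition[i][j] * p[i] for i in range(n))
                  for j in range(n)]
        # Restart: blend the walked mass with the original prediction.
        p = [alpha * p0[j] + (1 - alpha) * walked[j] for j in range(n)]
    return p


# Hypothetical example: two semantically close noodle soups and one dessert.
classes = ["ramen", "pho", "cheesecake"]
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
scores = [0.40, 0.15, 0.45]
smoothed = random_walk_smooth(scores, similarity)
```

In this sketch, semantically related classes share probability mass, so a confident prediction for one dish also raises the scores of its neighbours in the graph. Setting `alpha=1.0` disables the walk and returns the normalised input scores unchanged.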