A large number of food photos are taken in restaurants for diverse reasons. This dish recognition problem is very challenging, due to different cuisines, cooking styles and the intrinsic difficulty of modeling food from its visual appearance. Contextual knowledge is crucial to improving recognition in such a scenario. In particular, geocontext has been widely exploited for outdoor landmark recognition. Similarly, we exploit knowledge about menus and the geolocation of restaurants and test images.
We first adapt a framework based on discarding unlikely categories located far from the test image. We then reformulate the problem using a probabilistic model connecting dishes, restaurants and geolocations. We apply that model to three different tasks: dish recognition, restaurant recognition and geolocation refinement. Experiments on a dataset including 187 restaurants and 701 dishes show that combining multiple sources of evidence (visual, geolocation, and external knowledge) can boost performance in all tasks.
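
As an illustration of the first step (discarding categories located far from the test image), the following is a minimal sketch and not the implementation evaluated in the experiments. The function names, the haversine distance, the 0.5 km radius, the dictionary layout for restaurants, and the renormalization of visual scores are all assumptions made for this example; the abstract does not specify any of them.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def shortlist_dishes(visual_scores, restaurants, photo_loc, radius_km=0.5):
    """Keep only dishes served by restaurants near the photo and renormalize.

    visual_scores: dict mapping dish name -> non-negative classifier score
    restaurants:   list of dicts with keys "loc" (lat, lon) and "menu" (dish names)
    photo_loc:     (lat, lon) of the test image
    radius_km:     hypothetical cut-off radius, chosen here for illustration
    """
    # Restaurants whose location lies within the assumed radius of the photo.
    nearby = [
        r for r in restaurants
        if haversine_km(photo_loc[0], photo_loc[1], r["loc"][0], r["loc"][1]) <= radius_km
    ]

    # Candidate dishes are the union of the nearby restaurants' menus.
    candidates = set()
    for r in nearby:
        candidates.update(r["menu"])

    # Discard dishes that no nearby restaurant serves, then renormalize.
    kept = {dish: score for dish, score in visual_scores.items() if dish in candidates}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {dish: score / total for dish, score in kept.items()}
```

With a hard cut-off of this kind, a dish that scores highly on visual appearance alone is removed whenever no restaurant within the radius lists it on its menu. The probabilistic model replaces this hard decision with a joint treatment of dishes, restaurants and geolocations.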