Trademark retrieval (TR) is the problem of retrieving trademarks (logos) similar to a query; its main aim is to detect trademark infringement. With millions of companies worldwide, automatically retrieving similar trademarks has become an important problem, yet infringement checks are still performed mostly by hand. Although there have been many attempts at automated TR, the problem remains largely unsolved, as the community itself acknowledges.
One of the main reasons is the lack of a publicly available, comprehensive dataset that covers the various challenges of the TR problem. In this article, we introduce a large dataset of more than 930,000 trademarks and evaluate existing approaches from the literature on it. We show that existing methods are far from useful on such a challenging dataset, and we hope that the dataset will facilitate the development of better trademark retrieval methods.