Source
IEEE Access
Publication year
2023
Authors
Ilya Makarov, Mikhail Shulgin

Scalable Zero-Shot Logo Recognition

Abstract

Brand logo recognition is a task focused on the identification and classification of logos, with various applications such as brand protection and market discovery. Since the number of brands is dynamic and evolves over time, we propose a two-step, zero-shot framework for this problem. In the first step, we train a Scaled-YOLOv4 single-stage universal logo detector to identify regions containing logo-like objects. Our results indicate that this detector achieves generalizability comparable to the two-stage Faster-RCNN model. In the second step, we employ an enhanced CLIP model for zero-shot classification of the identified regions. Our experiments demonstrate that the CLIP model outperforms state-of-the-art few-shot classifiers in terms of accuracy. Additionally, we adopt test-time augmentation to improve the model’s resistance against false positives. We also present a proof-of-concept for fine-tuning the CLIP model, which enhances its cosine similarity measures. Our proposed end-to-end solution is scalable in terms of the number of brands and requires only the brand names for detection. The logo detector achieves superior performance on the FlickrLogos-32 dataset without the need for additional fine-tuning.
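The second step described above, zero-shot classification of a detected region by cosine similarity against brand-name embeddings, can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are toy NumPy vectors standing in for CLIP image and text features, the function names `l2_normalize` and `classify_region` are hypothetical, and both the averaging of similarities over augmented views and the rejection `threshold` are assumptions, since the abstract does not specify how test-time augmentation is combined or how false positives are rejected.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def classify_region(view_embeddings, text_embeddings, brand_names, threshold=0.25):
    """Assign a brand to one detected region, or reject it.

    view_embeddings: (n_views, d) image embeddings of augmented views of the region
                     (assumed to come from an image encoder such as CLIP).
    text_embeddings: (n_brands, d) embeddings of the brand names
                     (assumed to come from the matching text encoder).
    threshold:       hypothetical minimum mean cosine similarity to accept a match.
    """
    views = l2_normalize(np.asarray(view_embeddings, dtype=float))
    texts = l2_normalize(np.asarray(text_embeddings, dtype=float))

    # Cosine similarity of every augmented view against every brand name.
    sims = views @ texts.T  # shape: (n_views, n_brands)

    # Assumed test-time augmentation rule: average similarities over the views.
    mean_sims = sims.mean(axis=0)

    best = int(np.argmax(mean_sims))
    score = float(mean_sims[best])
    if score < threshold:
        return None, score  # treated as a false positive from the detector
    return brand_names[best], score


if __name__ == "__main__":
    brands = ["adidas", "nike", "puma"]
    toy_text = np.eye(3)  # toy stand-ins for brand-name embeddings
    toy_views = np.array([[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]])
    print(classify_region(toy_views, toy_text, brands))
```

Because only the list of brand names and their text embeddings change when a brand is added, a scheme of this shape needs no retraining to cover new brands, which is the sense in which the abstract calls the approach scalable.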
