QuantNAS for Super Resolution: Searching for Efficient Quantization-Friendly Architectures Against Quantization Noise
Abstract
This work aims to develop an automated procedure for discovering new, efficient architectures that can be quantized in mixed-precision mode with minimal degradation. While our primary focus is on Super-Resolution (SR), the proposed procedure is applicable beyond this domain. To achieve our goals, we first develop an efficient Neural Architecture Search (NAS) procedure for full-precision models (in this paper, “full-precision” or FP refers to the 32-bit floating-point data format), surpassing existing NAS solutions for SR. We then adapt this procedure for quantization-aware search. By introducing Quantization Noise (QN) during the search phase, we approximate the model degradation after quantization. Additionally, we improve search performance by implementing entropy regularization, which prioritizes operations and their precision within each search space block. Our experiments confirm the superiority of quantization-aware NAS over the two-step process of NAS followed by quantization. Furthermore, approximating quantization with QN offers a 30% speed improvement over direct weight quantization. We validate our approach by developing and applying it to two search spaces inspired by state-of-the-art SR models. Our code is publicly available (github.com/On-Point-RND/QuantNAS).
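To make the two ingredients named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a uniform min-max quantizer, models QN as additive uniform noise of one quantization step in width, and assumes the entropy is computed over a softmax of per-block architecture parameters. All function names are hypothetical and do not come from the QuantNAS repository.

```python
import math
import random


def quant_step(weights, bits):
    """Step size of a uniform quantizer spanning the weight range (assumed scheme)."""
    lo, hi = min(weights), max(weights)
    return (hi - lo) / (2 ** bits - 1)


def quantize(weights, bits):
    """Direct quantization: snap each weight to the nearest of 2**bits levels."""
    lo = min(weights)
    step = quant_step(weights, bits)
    if step == 0:  # constant tensor, nothing to quantize
        return list(weights)
    return [lo + round((w - lo) / step) * step for w in weights]


def add_quantization_noise(weights, bits, rng):
    """QN surrogate (assumption): perturb each weight by uniform noise in
    [-step/2, step/2], which mimics the rounding error without rounding."""
    step = quant_step(weights, bits)
    return [w + rng.uniform(-step / 2, step / 2) for w in weights]


def softmax(logits):
    """Numerically stable softmax over a list of architecture parameters."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats. Adding it to the search loss as a penalty
    (assumption) pushes each block toward one operation and bit width."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```

In this sketch both `quantize` and `add_quantization_noise` move each weight by at most half a quantization step, which is why the noise can stand in for the quantizer during search. The entropy term is largest when all candidate operations in a block are weighted equally and shrinks as one candidate dominates.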