FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

Shibo Jie, Zhi-Hong Deng. Proceedings of the AAAI Conference on Artificial Intelligence 2023 – 54 citations

AAAI Applications Compositional Generalization Efficiency Evaluation Few Shot Fine Tuning Interdisciplinary Approaches Model Architecture Productivity Enhancement Tools

Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, called parameter-efficient transfer learning (PETL). Current PETL methods have shown that by tuning only 0.5% of the parameters, ViT can be adapted to downstream tasks with even better performance than full fine-tuning. In this paper, we aim to further promote the efficiency of PETL to meet the extreme storage constraints of real-world applications. To this end, we propose a tensorization-decomposition framework to store the weight increments, in which the weights of each ViT are tensorized into a single 3D tensor, and their increments are then decomposed into lightweight factors. In the fine-tuning process, only the factors need to be updated and stored, termed Factor-Tuning (FacT). On the VTAB-1K benchmark, our method performs on par with NOAH, the state-of-the-art PETL method, while being 5x more parameter-efficient. We also present a tiny version that uses only 8K trainable parameters (0.01% of ViT’s parameters) but outperforms full fine-tuning and many other PETL methods such as VPT and BitFit. In few-shot settings, FacT also beats all PETL baselines while using the fewest parameters, demonstrating its strong capability in the low-data regime.
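
The tensorization-decomposition idea can be illustrated with a small NumPy sketch. The abstract does not spell out the decomposition, so the low-rank form used here (two factors `U` and `V` shared by all matrices, plus one small `r x r` core per matrix) is an assumption chosen for illustration, as are the toy sizes and the names `fact_increment` and `factor_param_count`. It is not the authors' implementation.

```python
import numpy as np


def fact_increment(U, Sigma, V, scale=1.0):
    """Rebuild the stacked weight increments from the factors.

    U:     (d, r)    factor shared by every matrix (assumed form)
    Sigma: (n, r, r) one small core per stacked matrix
    V:     (d, r)    factor shared by every matrix (assumed form)
    Returns an (n, d, d) tensor whose n-th slice is scale * U @ Sigma[n] @ V.T.
    """
    return scale * np.einsum("ir,nrs,js->nij", U, Sigma, V)


def factor_param_count(n, d, r):
    """Number of values stored when only the factors are kept."""
    return 2 * d * r + n * r * r


rng = np.random.default_rng(0)
n, d, r = 6, 16, 2  # toy sizes, not the ViT configuration

# Frozen pre-trained weights, stacked ("tensorized") into one 3D tensor.
W0 = rng.normal(size=(n, d, d))

# Trainable factors. V starts at zero so the increment is zero before training
# (an initialization choice assumed for this sketch).
U = rng.normal(size=(d, r))
Sigma = rng.normal(size=(n, r, r))
V = np.zeros((d, r))

# Adapted weights: frozen tensor plus the factorized increment.
W = W0 + fact_increment(U, Sigma, V)

dense_params = n * d * d
compact_params = factor_param_count(n, d, r)
print(f"dense increment: {dense_params} values, factors: {compact_params} values")
```

Only `U`, `Sigma`, and `V` would need to be updated and stored per downstream task in this sketch; the frozen tensor `W0` is shared across tasks, which is where the storage saving comes from.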
