
TinyLlama: An Open-source Small Language Model

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu. No Venue 2024

[Code] [Paper]   Search on Google Scholar   Search on Semantic Scholar
Efficiency · Has Code · Model Architecture

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.
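Since the checkpoints are publicly released, a minimal way to try the model is through Hugging Face `transformers`. The sketch below is illustrative only: the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` Hub ID and the generation settings are assumptions, not details taken from the abstract above.

```python
# Minimal sketch: loading a released TinyLlama checkpoint for inference.
# The model ID and sampling parameters are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed Hub ID of a released checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # at 1.1B parameters, fp16 fits on a single consumer GPU
    device_map="auto",
)

prompt = "The advantages of small language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The small parameter count is the point of the model: it can run without sharding or quantization on commodity hardware, which is what makes the efficiency claims above practically relevant.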

Discussion on Hugging Face: https://huggingface.co/discussions/paper/6597627202a265cc802c016c

Similar Work