Falcon-40B is a 40-billion-parameter causal decoder-only model trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. It tops Hugging Face's Open LLM Leaderboard, outperforming LLaMA, MPT, RedPajama, and StableLM, among others. Built with custom tooling, Falcon-40B relies on a dedicated data pipeline that extracts training data from the public web: to assemble the initial pre-training dataset, the pipeline works from CommonCrawl dumps and applies extensive filtering (including the removal of machine-generated text) and deduplication.
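Since the weights are publicly released on the Hugging Face Hub, here is a minimal sketch of loading the model for generation, assuming the transformers library and the tiiuae/falcon-40b checkpoint (the full model needs substantial GPU memory, on the order of 80+ GB even in bfloat16):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut the memory footprint
    device_map="auto",           # shard across available GPUs
    trust_remote_code=True,      # Falcon originally shipped custom modeling code
)

inputs = tokenizer("Falcon-40B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

To make the filtering step concrete, below is a hypothetical illustration (not Falcon's actual pipeline; the function name and thresholds are invented for this sketch) of the kind of heuristic document filtering typically applied to web-crawl text: dropping documents that are too short, dominated by symbols or markup, or heavily line-duplicated.

```python
def keep_document(text: str,
                  min_words: int = 50,
                  max_symbol_ratio: float = 0.3,
                  max_dup_line_ratio: float = 0.3) -> bool:
    """Return True if a raw web document passes simple quality heuristics."""
    words = text.split()
    if len(words) < min_words:  # too little content to be useful
        return False

    # Reject documents that are mostly punctuation, digits, or markup debris.
    alpha = sum(ch.isalpha() or ch.isspace() for ch in text)
    if 1 - alpha / max(len(text), 1) > max_symbol_ratio:
        return False

    # Reject documents with heavy line-level repetition (boilerplate, spam).
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if lines and 1 - len(set(lines)) / len(lines) > max_dup_line_ratio:
        return False

    return True

# Usage: filter a batch of pages extracted from a crawl dump.
docs = ["raw page text extracted from a CommonCrawl WARC record ..."]
clean = [d for d in docs if keep_document(d)]
```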
#CausalDecoder #LargeModel #Falcon40B