Meta AI has launched LLaMA, a collection of 7B to 65B parameter foundation language models.
The name is an acronym for Large Language Model Meta AI. The models are smaller than their contemporaries because they are designed for research communities with limited access to large-scale infrastructure.
LLaMA-13B is more than ten times smaller than OpenAI's GPT-3 (175B), while LLaMA-65B is competitive in performance with DeepMind's Chinchilla-70B and Google's PaLM-540B.
The study differs from others in showing that state-of-the-art performance can be reached by training on publicly available data alone, without proprietary datasets. Smaller models trained on more tokens (word fragments) are also easier to retrain and fine-tune for specific applications. LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, was trained on one trillion tokens.
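To make "tokens (word fragments)" concrete, here is a minimal sketch of what a tokenizer does. LLaMA itself uses a SentencePiece BPE tokenizer; the GPT-2 tokenizer below is a stand-in, chosen only because it is freely downloadable via the Hugging Face transformers library.

```python
# A minimal sketch of tokenization: text is split into subword pieces,
# not whole words. GPT-2's tokenizer is used here as a stand-in for
# LLaMA's (which is SentencePiece-based and gated behind an application).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
pieces = tokenizer.tokenize("Tokenization splits uncommon words into fragments.")
# Prints a list of subword pieces; GPT-2 marks word-initial pieces with
# a leading 'Ġ', e.g. ['Token', 'ization', 'Ġsplits', ...]
print(pieces)
```

Counting such pieces, rather than words, is how the 1.0 and 1.4 trillion training-set figures above are measured.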
Like any other LLM, LLaMA takes a sequence of words as input and predicts the next word, generating text one step at a time in a loop. The team trained the model on text from the 20 languages with the most speakers, focusing on those written in the Latin and Cyrillic alphabets.
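That generation loop can be sketched in a few lines. The snippet below is a simplified illustration, not Meta's code: it uses a small public model ("gpt2" is a placeholder, since LLaMA's weights require an approved application) and greedy decoding, whereas production systems typically sample.

```python
# A minimal sketch of autoregressive generation: feed the sequence in,
# take the model's most likely next token, append it, and repeat.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; any causal LM works the same way

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

prompt = "Large language models are"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                      # generate 20 tokens, one per step
        logits = model(input_ids).logits     # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()     # greedy: pick the single most likely token
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```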
Researchers at Meta note that access to full-scale large language models remains limited because of the resources required to train and run them.
"This limited access has made it harder for researchers to understand how and why these large language models work. It has slowed down efforts to make them more reliable and fix problems like bias, toxicity, and the possibility that they could spread false information," Meta says.
Meta aims to make LLaMA more accessible by offering it in smaller sizes and releasing it under a non-commercial licence focused on research use.
Access to the models will be granted case by case to academic researchers and to those affiliated with government, civil society, and academic organisations. You can apply for access to LLaMA here.
Like ChatGPT and other language models, LLaMA struggles with generating biased, toxic, or nonsensical answers. Meta's announcement acknowledges this, noting that by sharing the model, researchers can "more easily test new ways to limit or get rid of these problems in large language models."
Meta's research team also released a set of benchmark evaluations of the model's biases and toxicity, intended to show the model's limitations and encourage further research in this critical area.
For more information, read the research paper here.