Fine-Tuned Indic Llamas are ‘Utter Garbaggio’ (2024)

  • Last updated June 19, 2024

Models built on top of Llama 2 with 2 billion tokens of another language cannot be called products.


  • By Mohit Pandey

It has been reiterated several times that the existing AI models from Google, Meta, and OpenAI are not inherently good at dealing with Indian language data, or indeed data in any language other than English. Worse, even expanding the models’ capabilities by feeding them Indic language data does not necessarily improve quality.

Raj Dabre, a prominent researcher at NICT in Kyoto, adjunct faculty at IIT Madras and a visiting professor at IIT Bombay, recently posted similar thoughts on X. “People be taking llama2, expanding vocabulary, pretraining on 2B tokens of a language and calling it a product,” he said, adding that he has already trained around 50 such models.

People be taking llama2, expanding vocabulary, pretraining on 2B tokens of a language and calling it a product. Bruh I have trained like 50 such models but I can tell you that outside of being useful to answer some research questions they are utter garbaggio.

— Raj Dabre (@prajdabre1) June 18, 2024

He further added that, apart from answering some research questions, such models built on top of existing ones such as Mistral, Gemma, or Llama 2 are “utter garbaggio”.

Much of this criticism is aimed at the rise of open-source Indic language models such as Tamil Llama, Telugu Llama, and Kannada Llama, among many similar offerings, all built on top of open-source English-first models.
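The recipe Dabre describes — expand the tokenizer’s vocabulary, then continue pretraining on a couple of billion tokens — hinges on initialising embedding rows for the new tokens. A minimal NumPy sketch of that initialisation step, using the common mean-of-existing-embeddings heuristic (the sizes and function name here are illustrative, not taken from any actual Llama checkpoint):

```python
import numpy as np

def expand_embeddings(embed: np.ndarray, n_new: int, rng=None) -> np.ndarray:
    """Append rows for new vocabulary tokens, initialised to the mean of the
    existing embeddings plus small noise -- a common heuristic when extending
    a pretrained model's vocabulary before continued pretraining."""
    rng = rng or np.random.default_rng(0)
    mean = embed.mean(axis=0)
    noise = rng.normal(scale=0.02, size=(n_new, embed.shape[1]))
    return np.vstack([embed, mean + noise])

# Toy "pretrained" embedding table: a real Llama 2 table is 32,000 x 4,096;
# we use 100 x 16 here so the sketch runs instantly.
old = np.random.default_rng(1).normal(size=(100, 16))
new = expand_embeddings(old, n_new=20)
print(new.shape)  # (120, 16)
```

The original rows are left untouched; only the appended rows are new, which is why such models start out fluent in English and must learn the new language almost entirely from the continued-pretraining tokens.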

The sentiment that India is not innovating in the AI space and merely building on top of existing models from the West has been echoed several times. Discussing why India’s future in AI looks bleak, several Indian AI experts have said that most LLMs produced in India are built on top of already-available LLMs and cannot be called fundamental research.

There are others, though, such as Pratik Desai from Kissan AI or Anubhav Sabharwal from CoRover.ai, who believe that building on top of existing open-source models is good enough for creating proprietary models and serving specialised use cases.

Though startups like Sarvam AI are planning to build foundational models in Indic languages, its current OpenHathi model is built on top of Meta’s Llama 2. Meanwhile, Soket AI Labs has already launched the open-source foundational model Pragna-1B, but it too has yet to see much adoption.

Much of this stems from the lack of adoption of Indic language models in the country: everyone wants Indian models, but the industry is not adopting them readily.

Researchers, too, are content with fine-tuning English-based models on new languages and trying them out for specific use cases, since training a frontier foundational model would be a waste of resources for them.

The Creators Agree About the Adoption Problem

There is a widespread view that open source is a good enough start for India, given how low the adoption rate is. As Nandan Nilekani recently said, India’s focus should be on using AI to make a difference in people’s lives. “We are not in the arms race to build the next LLM, let people with capital, let people who want to pedal ships do all that stuff… We are here to make a difference, and our aim is to put this technology in the hands of people.”

Cost definitely plays a big role, along with the flexibility of the open-source models created by big-tech companies. Speaking with AIM, Adarsh Shirawalmath, the creator of Kannada Llama, agreed that much of the problem in the country is that the industry is unwilling to adopt models that are built on top of existing ones.

In a recent podcast with AIM, Arjun Rao, the founding partner of Speciale Invest, said that he would not be interested in investing in a company that is not doing foundational research and is merely building wrappers or models on top of existing open-source offerings.

Earlier, in a conversation with AIM, Dabre also discussed the complexities of building models for Indic languages. “These models [GPT-3] have seen close to tens of trillions of tokens or words in English. Unless you have seen the entirety of the web, or more or less all of it, none of these models will be able to actually solve the generative AI problem for that [Indian] language,” said Dabre.

Dabre rued that chatbots for Indian languages are still a dream. “You will see a lot of people claiming that they can make a chatbot or LLM for Indian languages, but 99% of those things are transient. They are not going to be too useful in production, because nobody has solved the data problem yet,” said Dabre. The biggest missing link is Indic language data, a gap that still needs to be closed.

For now, even if these fine-tuned models cannot be classified as products, they are ideal for research by university students. If companies such as Sarvam AI, Kissan AI, and Krutrim are still struggling to build foundational Indic language models, the individuals experimenting with such models should definitely be encouraged — just not under the label of products.

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words.
