In the startup and VC world, Generative AI (Gen AI) is certainly creating a big wave. If you are an entrepreneur who would like to build a business on the power of Large Language Models (LLMs), or even just on their name, what is the battle plan? What areas should you bear in mind?
I’ve been speaking to leading experts in the field. One of them is Dr. Dan Roth, a Distinguished Professor of Computer and Information Science at the University of Pennsylvania with decades of experience in technology, software and AI innovation. With one foot in the technical world and the other in the entrepreneurial world, Dr. Roth has a bird’s-eye view of the Gen AI wave, as well as its nuances.
Here are the learnings:
Utilise existing language models
Building your own model from scratch requires significant investment and expertise. Leveraging existing models can save time and resources, allowing you to focus on fine-tuning for specific applications.
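To make this concrete, here is a minimal sketch of reusing an existing open model through the Hugging Face transformers library instead of training one from scratch. The checkpoint and prompt are illustrative placeholders, not recommendations from Dr. Roth:

```python
# Minimal sketch: reuse an existing open model rather than training from scratch.
# The checkpoint name is illustrative; swap in whichever model suits your task.
from transformers import pipeline

generator = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

prompt = "List three risks of deploying LLMs in a customer-facing product:"
result = generator(prompt, max_new_tokens=120, do_sample=False)
print(result[0]["generated_text"])
```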
Identify your differentiators
Determine what sets your approach apart. This could involve using better or unique data, or applying data in more innovative ways. Fine-tuning models with high-quality, application-specific data can significantly enhance performance.
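As a hedged example of what fine-tuning on application-specific data can look like, the sketch below adapts a small existing model to a hypothetical set of proprietary support tickets. The file name, label count and hyperparameters are assumptions:

```python
# Minimal fine-tuning sketch on your own data. "support_tickets.csv" is a
# hypothetical file with "text" and "label" (0-2) columns.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # small, inexpensive starting point
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Your proprietary, high-quality data is the differentiator.
dataset = load_dataset("csv", data_files={"train": "support_tickets.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
)
trainer.train()
```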
Save costs
- Use smaller models and optimise inference: Large models like GPT-4 are powerful but costly. Smaller models, even as compact as 3B or 7B parameters, can be highly effective and far more economical, and fine-tuning them on your own data can reduce costs further. Investing in efficient inference technologies, such as model quantisation, can also cut expenses significantly (a quantisation sketch follows this list).
- Distill models for cost-effective inference: Distilling a large model into a smaller one can make inference markedly cheaper, and several startups are already taking this route to improve efficiency in production (a distillation sketch also follows the list).
- Consider simpler models when appropriate: Not all problems require large language models (LLMs). For specific tasks like information extraction, smaller, fine-tuned models can outperform even the largest LLMs. Understanding the tasks your application needs to perform will help you choose the most appropriate and cost-effective model (an extraction example follows the list as well).
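On the first point, here is a minimal sketch of loading a compact model in 4-bit precision with bitsandbytes quantisation. The checkpoint, prompt and generation settings are illustrative assumptions:

```python
# Minimal sketch: cut inference cost by loading a small model in 4-bit precision.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",                 # a ~2.7B-parameter model, used here as an example
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

inputs = tokenizer("Classify the sentiment of: 'Great product, slow delivery.'",
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```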
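On distillation, the toy loop below trains a small "student" classifier to match a larger "teacher" by minimising the KL divergence between their softened output distributions. The models, temperature and single batch are placeholders, not a production recipe:

```python
# Minimal knowledge-distillation sketch: student learns the teacher's distribution.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

teacher = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2).eval()
student = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")  # shared vocabulary

optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)
temperature = 2.0

batch = tokenizer(["example input one", "example input two"],
                  return_tensors="pt", padding=True)

with torch.no_grad():
    teacher_logits = teacher(**batch).logits

student_logits = student(**batch).logits

# Soften both distributions and minimise the KL divergence between them.
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

optimizer.zero_grad()
loss.backward()
optimizer.step()
```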
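And on using simpler models where they fit, a small fine-tuned named-entity-recognition model can handle information extraction without calling an LLM at all. The public checkpoint below is just one example, not an endorsement from the article:

```python
# Minimal sketch: task-specific extraction with a small NER model instead of an LLM.
from transformers import pipeline

extractor = pipeline("token-classification",
                     model="dslim/bert-base-NER",
                     aggregation_strategy="simple")

text = "Acme Corp raised US$12M led by Sequoia in Singapore."
for entity in extractor(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```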
Develop a robust evaluation protocol
Establish a comprehensive evaluation protocol that includes both automatic metrics and human assessments. This builds trust with investors by demonstrating a thorough understanding of your technology’s capabilities and limitations.
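A minimal sketch of what such a protocol can look like: an automatic metric over a held-out set, with disagreements routed to human reviewers. The stub model call, examples and metric are illustrative assumptions:

```python
# Minimal evaluation harness: automatic exact-match scoring plus a human-review
# queue for cases the metric cannot settle. All names and data are placeholders.
def my_model(question: str) -> str:
    # Stand-in for your actual model or API call.
    return "Paris"

def exact_match(prediction: str, reference: str) -> bool:
    return prediction.strip().lower() == reference.strip().lower()

test_set = [
    {"question": "What is the capital of France?", "reference": "Paris"},
    {"question": "What is the capital of Australia?", "reference": "Canberra"},
]

needs_human_review = []
correct = 0
for example in test_set:
    prediction = my_model(example["question"])
    if exact_match(prediction, example["reference"]):
        correct += 1
    else:
        # Automatic metrics miss paraphrases; send disagreements to annotators.
        needs_human_review.append({"input": example["question"], "output": prediction})

print(f"Exact match: {correct / len(test_set):.0%}; queued for human review: {len(needs_human_review)}")
```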
Address hallucinations
Implement systems to evaluate and mitigate hallucinations in your models, focusing on both factual inaccuracies and reasoning errors. Utilise metrics like accuracy and F1 score, and ensure human evaluation is part of your assessment process.
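One common pattern, sketched below, is to compare claims extracted from a model answer against a trusted reference set and report F1. The claim sets are placeholders, not real data:

```python
# Minimal factuality scoring: precision/recall/F1 over atomic claims.
def f1_score(predicted: set, reference: set) -> float:
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)

reference_claims = {"founded in 2015", "headquartered in singapore", "series b funded"}
model_claims = {"founded in 2015", "headquartered in jakarta"}  # second claim is hallucinated

print(f"Factual F1: {f1_score(model_claims, reference_claims):.2f}")
```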
Be mindful of misinformation
Consider the potential for your models to generate toxic information or misinformation. Implement safeguards to prevent misuse and minimise the risk of information pollution.
“You have to think about who will use your tools and whether they will be careful or not. […] It could be a PR disaster if someone generates toxic information or misinformation, which could find its way to [X]. You need to think about this, and it’s a function of who you’re giving the models to. […] Information pollution doesn’t get enough visibility. It’s really a scary space. […] You have to think about whether you care about your model generating toxic information.” — Dr. Roth, Professor of Computer and Information Science at the University of Pennsylvania.
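As one concrete, hedged example of such a safeguard, generated text can be screened by an open toxicity classifier before it reaches users. The checkpoint, label handling and threshold below are assumptions that would need tuning, and real deployments typically layer several checks:

```python
# Minimal pre-publication safeguard: screen generated text with a toxicity classifier.
from transformers import pipeline

# Multi-label toxicity model; sigmoid gives an independent score per category.
toxicity_checker = pipeline("text-classification", model="unitary/toxic-bert",
                            top_k=None, function_to_apply="sigmoid")

def safe_to_publish(generated_text: str, threshold: float = 0.5) -> bool:
    # Block the text if any toxicity category exceeds the (assumed) threshold.
    scores = toxicity_checker([generated_text])[0]
    return all(entry["score"] < threshold for entry in scores)

draft = "Here is the generated marketing copy for your product launch."
if safe_to_publish(draft):
    print(draft)
else:
    print("Blocked: flagged by the toxicity filter for human review.")
```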