Intro to RAG architectures, including Vertex AI Search
By Victor Asuquo
Machine Learning Engineer at Start Innovation Hub
Abstract:
This workshop is a hands-on introduction to building Retrieval-Augmented Generation (RAG) architectures and explores how they address the "grounding problem" in Large Language Models (LLMs).
Workshop Content
- Potential approaches:
- We'll examine existing solutions like model fine-tuning, human checking, and prompt engineering.
- We'll discuss the drawbacks of these methods, including:
- Data preparation effort: Expensive and time-consuming
- Online learning and data updates: Can be challenging
- Limited effectiveness: May not always work
- Human involvement: Can be costly, time-consuming, and unreliable
- Let's see it in action:
- We'll delve into practical implementations of RAG architectures, demonstrating how to build data-aware generative AI applications.
- Vertex AI Search:
- Learn how to leverage Vertex AI Search for efficient information retrieval.
- Multimodal RAG using Gemini:
- Explore the possibilities of using Google's Gemini for multimodal RAG architectures, handling text, images, and more.
- What next?:
- We'll discuss future directions and potential applications of RAG in generative AI.
Key Benefits of RAG:
- Factuality & Grounding: Provides accurate responses grounded in evidence, beyond the capabilities of traditional LLMs.
- Better Context: Uses more contextually relevant data than what's available in general LLMs.
- Fresher Data: Allows access to more recent information compared to the training data used for traditional LLMs.
- Quick Data Updates: Easily updates data in RAG without significant costs or retraining.
- Cheaper: RAG implementations are more affordable and quicker to implement than fine-tuning.
- Governance: Enhances control over LLM responses by implementing access control and entitlements.
RAG Workflow for Building a QA System:
- Data Ingestion / Parsing: Split documents into chunks. Each chunk is a piece of text that is assigned an embedding (a numerical representation of its meaning) and stored in a vector database.
- Querying: Generate an embedding for the user query and find the top-k most similar chunks in the vector database based on that embedding.
- Retrieval: Retrieve the relevant chunks from the vector database.
- Synthesis: Plug the retrieved chunks into an LLM to generate a comprehensive response. A minimal end-to-end sketch follows this list.
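To make the workflow concrete, here is a minimal sketch of all four steps in Python. It assumes a Google Cloud project with Vertex AI enabled; the model names, the toy documents, and the in-memory NumPy "vector database" are illustrative assumptions, not the workshop's exact setup.

```python
# Minimal RAG QA sketch (assumes Vertex AI is enabled; project ID, model names,
# and documents below are placeholders for illustration).
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
llm = GenerativeModel("gemini-1.5-flash")

# 1. Ingestion / parsing: split documents into chunks and embed each chunk.
documents = [
    "RAG grounds LLM answers in retrieved evidence. It reduces hallucinations.",
    "Vertex AI Search is a managed retrieval service. It indexes your documents.",
]
chunks = [c.strip() for doc in documents for c in doc.split(".") if c.strip()]
chunk_vectors = np.array([e.values for e in embedder.get_embeddings(chunks)])

# 2. Querying: embed the user question and score chunks by cosine similarity.
query = "How does RAG reduce hallucinations?"
q = np.array(embedder.get_embeddings([query])[0].values)
scores = chunk_vectors @ q / (np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q))

# 3. Retrieval: keep the top-k most similar chunks.
top_k = [chunks[i] for i in np.argsort(scores)[::-1][:3]]

# 4. Synthesis: let the LLM answer using only the retrieved context.
prompt = "Answer using only this context:\n" + "\n".join(top_k) + f"\n\nQuestion: {query}"
print(llm.generate_content(prompt).text)
```

In practice the in-memory array would be replaced by a real vector store or by a managed retriever such as Vertex AI Search, but the four steps stay the same.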
Improving Performance:
- Better retrieval (finding the most relevant chunks) leads to better results.
Let's see it in action:
- We'll showcase practical examples using Vertex AI Search and Multimodal RAG with Google's Gemini.
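As a preview of the Vertex AI Search demo, a minimal query against an existing data store might look like the sketch below; the project, location, and data store IDs are placeholders and the workshop's actual configuration may differ.

```python
# Query an existing Vertex AI Search (Discovery Engine) data store.
from google.cloud import discoveryengine_v1 as discoveryengine

client = discoveryengine.SearchServiceClient()
serving_config = client.serving_config_path(
    project="my-project",          # placeholder project ID
    location="global",
    data_store="my-data-store",    # placeholder data store ID
    serving_config="default_config",
)
request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="What is our refund policy?",
    page_size=5,
)
response = client.search(request)
for result in response.results:
    # Each result wraps a matched document from the indexed data store.
    print(result.document.name)
```

And a minimal multimodal call to Gemini, combining an image with a text instruction, could look like this sketch; the bucket path and model name are assumptions:

```python
# Ask Gemini to reason over an image plus a text prompt.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-flash")
image = Part.from_uri("gs://my-bucket/invoice.png", mime_type="image/png")  # placeholder image
response = model.generate_content([image, "Summarize the key fields in this document."])
print(response.text)
```

In a multimodal RAG pipeline, retrieved passages or images would be passed to Gemini as grounding context, which is the same synthesis step shown in the workflow sketch above.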
Building Generative AI Applications:
- Learn how to leverage LLMs to interact with external systems like databases, APIs, and public data sources.
- We'll explore patterns such as:
- Converting natural language to SQL, running the SQL against a database, and analyzing and presenting the results.
- Calling an external webhook or API based on the user query.
- Synthesizing outputs from multiple models, or chaining models in a specific order to achieve complex tasks. A sketch of the first pattern follows this list.
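Here is a hedged sketch of the natural-language-to-SQL pattern over a toy SQLite table; the schema, prompt, and model name are illustrative assumptions, and generated SQL should always be validated before running it against a real database.

```python
# Natural language -> SQL -> results -> summary, sketched over an in-memory SQLite table.
import sqlite3
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-flash")

# Toy database standing in for a real analytics source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("Lagos", 120.0), ("Accra", 95.5)])

schema = "Table sales(region TEXT, amount REAL)"
question = "What is the total sales amount per region?"
prompt = (
    f"Given the schema: {schema}\n"
    f"Write a single SQLite query answering: {question}\n"
    "Return only the SQL, with no explanation."
)

# Strip any markdown fences the model may wrap around the SQL.
sql = model.generate_content(prompt).text.replace("```sql", "").replace("```", "").strip()

# Run the generated SQL, then hand the rows back to the model to present the answer.
rows = conn.execute(sql).fetchall()
summary = model.generate_content(
    f"Question: {question}\nResult rows: {rows}\nSummarize the answer in one sentence."
)
print(sql, rows, summary.text, sep="\n")
```

The same loop generalizes to the other patterns: swap the SQL step for an API or webhook call, or chain several model calls where each one consumes the previous output.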
This workshop will provide you with the practical knowledge and tools to build sophisticated and accurate Generative AI applications using RAG.
We'll delve into the common use cases for RAG, highlighting how it can be applied to improve question answering, chatbots, and more. Through practical demonstrations, we'll showcase the implementation of RAG architectures using Vertex AI Search and explore the capabilities of multimodal RAG with Google's Gemini. This session equips attendees with the knowledge and tools to build data-aware generative AI applications that provide reliable, grounded responses.