[2024-Mar-06] A Pragmatic Introduction to Retrieval-Augmented-Generation (RAG)

Institute of Information Systems and Applications


Wei-Fang Sun (孫偉芳), NVIDIA




13:20-15:00 Wednesday 06-Mar-2024

Delta 103

Hosted by:

Prof. Chun-Yi Lee


Pre-trained large language models (LLMs) have demonstrated great potential in language generation. However, certain challenges persist, such as providing the sources that influence their decisions, staying current with the latest knowledge, handling knowledge-intensive tasks, and mitigating "hallucinations". To tackle these challenges, Retrieval-Augmented-Generation (RAG) has emerged as a promising paradigm, where a parametric LLM specialized in language generation collaborates with traditional non-parametric vector databases.

RAG involves two key processes: (1) ingestion of documents from external repositories, databases, or APIs that lie beyond the foundation model's knowledge, and (2) retrieval of relevant document data and response generation during inference. In the first process, documents from the knowledge base are encoded by an embedding model, which converts their content into vectors. These vector embeddings are then stored in a dedicated vector database. In the second process, the user's query is converted into a vector by the same embedding model used during ingestion. The vector database then performs a similarity search to find the stored vectors closest to the query, and the corresponding document content is passed to the large language model (LLM) as additional context for generating the response.
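The two processes described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the material used in the hands-on session: the `embed` function is a toy bag-of-words stand-in for a real embedding model, `VectorStore` is a hypothetical in-memory replacement for a vector database, and the final prompt is printed rather than sent to an LLM. All names and sample documents are invented for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, original text) pairs

    def add(self, text):
        # Process 1 (ingestion): encode the document and store its vector.
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        # Process 2 (retrieval): encode the query with the same embedding
        # function and return the k most similar documents.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt as context for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Invented sample documents for illustration only.
store = VectorStore()
for doc in [
    "RAG pairs a language model with a vector database.",
    "Google Colab runs Python notebooks in the browser.",
    "Reinforcement learning trains agents with rewards.",
]:
    store.add(doc)

query = "What does RAG pair with a vector database?"
passages = store.search(query, k=1)
prompt = build_prompt(query, passages)
print(prompt)  # in a real system, this prompt would be sent to an LLM
```

In a real system the toy pieces would be swapped for an actual embedding model, a vector database, and an LLM endpoint; the overall flow of ingest, retrieve, and augment the prompt stays the same.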

After introducing the core concepts and components of RAG in the main presentation, attendees will be guided through a hands-on session featuring a simplified RAG example. The only prerequisite for attendees is a laptop with an internet connection. During this session, participants will execute code on Google Colab, leveraging free API endpoints for accessing LLMs. This exercise aims to offer participants practical experience in building a basic RAG system using popular open-source libraries.

Lastly, a curated list of resources is provided for those interested in delving deeper into RAG. Attendees can use these materials to learn about the latest research trends in RAG and follow the documentation to develop their own RAG applications.


Wei-Fang Sun is currently a Solution Architect at the NVIDIA AI Technology Center (NVAITC), where his main research areas include Reinforcement Learning and Generative Modeling.

All faculty and students are welcome to join.