Build RAG from Scratch

Phil Nash
Developer relations engineer at DataStax

twitter.com/philnash

linkedin.com/in/philnash

philna.sh/links

The problem with large language models

Demo

Retrieval-Augmented Generation

RAG: the idea

Store our data
User makes query
Retrieve relevant data
Provide data as context to model with the original query
Model generates response
???
Profit!

Retrieval = Search

Augmentation = Prompt

Generation = Model
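
The three parts can be sketched as one small function. This is a minimal illustration, not the code from the talk: `retrieve` and `generate` are hypothetical functions passed in by the caller (one searches the stored data, the other calls a model), and the prompt wording is an assumption.

```javascript
// Augmentation: put the retrieved documents and the original query in one prompt.
function buildPrompt(query, documents) {
  const context = documents.map((doc, i) => `${i + 1}. ${doc}`).join("\n");
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}

// `retrieve` and `generate` are hypothetical, caller-supplied async functions.
async function ragAnswer(query, { retrieve, generate }) {
  const documents = await retrieve(query); // Retrieval = Search
  const prompt = buildPrompt(query, documents); // Augmentation = Prompt
  return generate(prompt); // Generation = Model
}
```

Passing the search and the model in as functions keeps the flow visible and lets either piece be swapped out later.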

Search

Not keyword search

Search by similarity

How do we capture meaning in a way that lets us search for similar meaning?

Vector embeddings

A vector embedding is a list of numbers that represents the meaning of a body of text.

Let's create our own vector embedding

Hypothesis

For a conference bot

Titles and descriptions contain the meaning of the talks

Those are made up of words

Titles and descriptions that share words are similar

We can collect all the words and represent each talk as a count of each word

When we get a user query, we can do the same thing and compare
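
The hypothesis above can be sketched in a few functions. This is an illustrative version, assuming a simple lowercase word split; the talk titles are made up for the example and the function names are not from the original demo.

```javascript
// Split text into lowercase words, ignoring punctuation.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Collect every distinct word across all texts, in first-seen order.
function buildVocabulary(texts) {
  const vocabulary = new Set();
  for (const text of texts) {
    for (const word of tokenize(text)) vocabulary.add(word);
  }
  return [...vocabulary];
}

// Represent a text as a count of each vocabulary word.
// Words outside the vocabulary are ignored.
function toVector(text, vocabulary) {
  const counts = new Map();
  for (const word of tokenize(text)) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return vocabulary.map((word) => counts.get(word) ?? 0);
}

// Hypothetical talk titles, for illustration only.
const talks = ["Build RAG from scratch", "Build a search engine from scratch"];
const vocabulary = buildVocabulary(talks);
const talkVectors = talks.map((talk) => toVector(talk, vocabulary));

// The user query goes through the same function, against the same vocabulary.
const queryVector = toVector("how do I build search", vocabulary);
```

Every vector has one number per vocabulary word, so talks and queries can be compared position by position.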

Demo

Comparing Vectors

Vector search

Cosine similarity

Let's see that in code
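
A minimal sketch of cosine similarity and a brute-force vector search built on it. The `search` helper and the `{ vector }` item shape are assumptions for illustration, not the code shown in the talk.

```javascript
// Cosine similarity: the dot product of two vectors divided by the product
// of their lengths. For word counts, 1 means the same mix of words and
// 0 means no words in common.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // an all-zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vector search: score every stored item against the query, keep the best.
// Each item is assumed to look like { title, vector }.
function search(queryVector, items, limit = 5) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Because the score depends on direction rather than length, a long description and a short query can still match closely.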

It works!

Sort of...

We needed to know all the data up front

The query is sensitive to the vocabulary

More words means more calculations

The same word can mean different things
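
Two of these limits are easy to reproduce. The sketch below is a compact word-count similarity written for this illustration, with a made-up talk title and made-up queries.

```javascript
// Word-count cosine similarity between a stored text and a query, using only
// the words of the stored text as the vocabulary.
function wordCountSimilarity(stored, query) {
  const words = (text) => text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  const vocabulary = [...new Set(words(stored))];
  const count = (text) =>
    vocabulary.map((v) => words(text).filter((w) => w === v).length);
  const a = count(stored);
  const b = count(query);
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0);
  const length = (v) => Math.sqrt(v.reduce((sum, value) => sum + value * value, 0));
  const denominator = length(a) * length(b);
  return denominator === 0 ? 0 : dot / denominator;
}

// Hypothetical talk title, for illustration only.
const talk = "Speeding up your web application";

// Same meaning, no shared words: the query vector is all zeros, so no match.
const synonymScore = wordCountSimilarity(talk, "make my site faster");

// Different meaning, one shared word: "web" here is a spider's web, yet it matches.
const homonymScore = wordCountSimilarity(talk, "spider web");
```

Counting words measures overlap in spelling, not in meaning, which is the gap embedding models are meant to close.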

Embedding models

import OpenAI from "openai";

// With no arguments, the client reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI();

const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text string goes here",
  encoding_format: "float",
});

console.log(embedding);
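
The response is an object that wraps the vector, not the vector itself. A minimal sketch of pulling the vectors out for several texts at once, assuming `client` is an OpenAI client like the `openai` object above; the `embedTexts` name is made up for this example.

```javascript
// Sketch: turn several texts into vectors with one request.
// `client` is assumed to be an OpenAI client instance.
async function embedTexts(client, texts) {
  const response = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: texts, // the API accepts a single string or an array of strings
  });
  // Each item in `data` holds one vector plus the index of its input text.
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

Usage would look like `const [talkVector, queryVector] = await embedTexts(openai, [talkText, queryText]);`, where `talkText` and `queryText` are placeholder strings. The two vectors can then be compared with cosine similarity, exactly as with the word counts.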

Vector databases

Astra DB

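import { DataAPIClient } from "@datastax/astra-db-ts";

// astraDb: config object (token, endpoint, collection name) defined elsewhere
export const client = new DataAPIClient(astraDb.token);
export const db = client.db(astraDb.endpoint);
export const collection = db.collection(astraDb.collection);

// find() returns a cursor; toArray() resolves it to the matching documents
const results = await collection.find({}, { sort: { $vectorize: query }, limit: 5 }).toArray();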

Demo

RAG

More RAG

Highly Accurate Retrieval for your RAG Application with ColBERT and Astra DB
https://dtsx.io/3Y7D6eE
Better LLM Integration and Relevancy with Content-Centric Knowledge Graphs
https://dtsx.io/4fbrfme
Generate Related Posts for Your Astro Blog with Astra DB Vector Search
https://dtsx.io/4d2SzRy

Thanks

twitter.com/philnash

linkedin.com/in/philnash

philna.sh/links