Build RAG from Scratch

Phil Nash
Developer relations engineer at DataStax

twitter.com/philnash

linkedin.com/in/philnash

philna.sh/links

The problem with large language models

Demo

Retrieval-Augmented Generation

RAG: the idea

  • Store our data
  • User makes query
  • Retrieve relevant data
  • Provide data as context to model with the original query
  • Model generates response
  • ???
  • Profit!
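
The steps above can be sketched as a small pipeline. This is a minimal illustration, not the code from the talk: `buildPrompt` and `answerWithRag` are hypothetical names, and `retrieve` and `generate` are assumed to be functions you supply (a search over your stored data and a call to a model, respectively).

```javascript
// Augmentation: combine the retrieved data with the original query.
function buildPrompt(query, documents) {
  const context = documents.map((doc, i) => `${i + 1}. ${doc}`).join("\n");
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}

// retrieve and generate are hypothetical, injected functions:
//   retrieve(query) resolves to an array of strings (the relevant data)
//   generate(prompt) resolves to the model's text response
async function answerWithRag(query, { retrieve, generate }) {
  const documents = await retrieve(query); // Retrieve relevant data
  const prompt = buildPrompt(query, documents); // Provide data as context
  return generate(prompt); // Model generates response
}
```

Passing `retrieve` and `generate` in as arguments keeps the flow independent of any particular database or model, so each part can be swapped out or stubbed while experimenting.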

Retrieval = Search

Augmentation = Prompt

Generation = Model

Search

Not keyword search

Search by similarity

How do we capture meaning in a way that lets us search for similar meaning?

Vector embeddings

A vector embedding is a list of numbers that represents the meaning of a body of text.

Let's create our own vector embedding

Hypothesis

For a conference bot

Titles and descriptions contain the meaning of the talks

Those are made up of words

Titles and descriptions that share words are similar

We can collect all the words and represent each talk as a count of each word

When we get a user query, we can do the same thing and compare
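
A word-count embedding along these lines might look like the sketch below. The function names (`tokenize`, `buildVocabulary`, `embed`) and the two sample titles are made up for illustration; the talk's demo may differ.

```javascript
// Split text into lowercase words, dropping punctuation.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Collect every distinct word across all texts, in first-seen order.
// Maps each word to its position in the vector.
function buildVocabulary(texts) {
  const vocabulary = new Map();
  for (const text of texts) {
    for (const word of tokenize(text)) {
      if (!vocabulary.has(word)) vocabulary.set(word, vocabulary.size);
    }
  }
  return vocabulary;
}

// Represent a text as a count of each vocabulary word.
// Words that are not in the vocabulary are ignored.
function embed(text, vocabulary) {
  const vector = new Array(vocabulary.size).fill(0);
  for (const word of tokenize(text)) {
    const index = vocabulary.get(word);
    if (index !== undefined) vector[index] += 1;
  }
  return vector;
}

// Made-up sample data standing in for talk titles and descriptions.
const talks = ["Build RAG from scratch", "Search from scratch"];
const vocabulary = buildVocabulary(talks); // build, rag, from, scratch, search
const talkVectors = talks.map((talk) => embed(talk, vocabulary));
const queryVector = embed("RAG search", vocabulary);

console.log(talkVectors); // [ [ 1, 1, 1, 1, 0 ], [ 0, 0, 1, 1, 1 ] ]
console.log(queryVector); // [ 0, 1, 0, 0, 1 ]
```

Every talk and every query is embedded against the same vocabulary, so all vectors have the same length and can be compared position by position.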

Demo

Comparing Vectors

Vector search

[Figure: plot with origin at 0,0]

Cosine similarity
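
For reference, the standard definition of cosine similarity between two vectors $A$ and $B$ of the same length is the cosine of the angle $\theta$ between them. The symbols here are the usual textbook ones rather than anything taken from the slides.

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2} \, \sqrt{\sum_{i} B_i^2}}

% Worked example with A = (1, 0) and B = (1, 1):
\cos\theta
  = \frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{1^2 + 0^2} \, \sqrt{1^2 + 1^2}}
  = \frac{1}{\sqrt{2}} \approx 0.707
```

Vectors pointing in the same direction score 1 and perpendicular vectors score 0. Word-count vectors have no negative components, so their scores always fall between 0 and 1.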

Let's see that in code
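
A minimal sketch of cosine similarity, plus a helper that ranks items against a query vector. The helper name `rankBySimilarity` and the `{ title, vector }` item shape are assumptions for illustration, not the code shown in the talk.

```javascript
// Sum of the products of matching components.
function dotProduct(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Length of a vector.
function magnitude(vector) {
  return Math.sqrt(dotProduct(vector, vector));
}

// Cosine of the angle between two vectors of the same length.
// Returns 0 when either vector is all zeros, to avoid dividing by zero.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  const denominator = magnitude(a) * magnitude(b);
  return denominator === 0 ? 0 : dotProduct(a, b) / denominator;
}

// Score every item against the query and sort from most to least similar.
// Each item is assumed to look like { title, vector }.
function rankBySimilarity(queryVector, items) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score);
}
```

Comparing the query against every stored vector like this is a brute-force search: simple, but the work grows with both the number of items and the length of the vectors.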

It works!

Sort of...

We needed to know all the data up front

The query is sensitive to the vocabulary

More words means more calculations

The same word can mean different things

Embedding models

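import OpenAI from "openai";

// By default the client reads the API key from the OPENAI_API_KEY environment variable
const openai = new OpenAI();

// Request an embedding for a single string
const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text string goes here",
  encoding_format: "float",
});

// The vector itself is at embedding.data[0].embedding
console.log(embedding);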

Vector databases

Astra DB

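import { DataAPIClient } from "@datastax/astra-db-ts";

// astraDb is assumed to be a config object holding the token, API endpoint and collection name
export const client = new DataAPIClient(astraDb.token);
export const db = client.db(astraDb.endpoint);
export const collection = db.collection(astraDb.collection);

// Sort by similarity to the query text; find() returns a cursor, so collect the matches with toArray()
const results = await collection.find({}, { sort: { $vectorize: query }, limit: 5 }).toArray();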

Demo

RAG

More RAG

Thanks
