Not keyword search
Search by similarity
For a conference bot
Titles and descriptions contain the meaning of the talks
Those are made up of words
Titles and descriptions that share words are similar
We can collect all the words and represent each talk as a count of each word
When we get a user query, we can do the same thing and compare
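The word-count idea above can be sketched in a few lines. This is a minimal illustration, not the bot's actual code: the helper names (`tokenize`, `buildVocabulary`, `toCountVector`, `cosineSimilarity`) and the two sample talk titles are made up for the example, and cosine similarity is assumed as the way to "compare".

```typescript
// Split text into lowercase word tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Collect every distinct word across all talks.
function buildVocabulary(docs: string[]): string[] {
  const words = new Set<string>();
  for (const doc of docs) {
    for (const word of tokenize(doc)) words.add(word);
  }
  return Array.from(words).sort();
}

// Represent a text as a count of each vocabulary word.
function toCountVector(text: string, vocabulary: string[]): number[] {
  const counts = new Map<string, number>();
  for (const word of tokenize(text)) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return vocabulary.map((word) => counts.get(word) ?? 0);
}

// Compare two count vectors; returns 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical talk titles, for illustration only.
const talks = [
  "Building search with vector embeddings",
  "Accessible forms in React",
];
const vocabulary = buildVocabulary(talks);
const talkVectors = talks.map((talk) => toCountVector(talk, vocabulary));

// Treat the user query the same way, then score it against each talk.
const queryVector = toCountVector("vector search", vocabulary);
const scores = talkVectors.map((v) => cosineSimilarity(queryVector, v));
console.log(scores);
```

The first talk shares two words with the query and scores above zero; the second shares none and scores exactly zero.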
We need to know all the data up front
The query is sensitive to the vocabulary: words we have not seen are ignored
More words means more calculations
The same word can mean different things
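Two of these limits are easy to see in a small sketch. The vocabulary, the helper `countKnownWords`, and the sample phrases are all hypothetical, chosen only to show the failure modes.

```typescript
// Hypothetical fixed vocabulary, built earlier from the talk descriptions.
const knownWords = ["javascript", "talk", "testing", "python"];

// Count only the words that already exist in the vocabulary.
function countKnownWords(text: string, vocab: string[]): number[] {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return vocab.map((word) => tokens.filter((t) => t === word).length);
}

// Sensitive to vocabulary: "js" and "session" mean roughly the same as
// "javascript" and "talk", but they are unknown words, so the query
// vector is all zeros and matches nothing.
const synonymQuery = countKnownWords("JS session", knownWords);
console.log(synonymQuery);

// Same word, different meanings: the snake and the programming language
// produce identical vectors, so word counts cannot tell them apart.
const snake = countKnownWords("caring for a pet python", knownWords);
const language = countKnownWords("python for beginners", knownWords);
console.log(snake, language);
```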
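import OpenAI from "openai";

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: "Your text string goes here",
  encoding_format: "float",
});

// The vector itself is at embedding.data[0].embedding.
console.log(embedding);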
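import { DataAPIClient } from "@datastax/astra-db-ts";

// astraDb is a config object (token, endpoint, collection name) defined elsewhere.
export const client = new DataAPIClient(astraDb.token);
export const db = client.db(astraDb.endpoint);
export const collection = db.collection(astraDb.collection);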
// find() returns a cursor; $vectorize embeds the query and sorts by similarity.
const results = await collection.find({}, { sort: { $vectorize: query }, limit: 5 }).toArray();