How to work with
generative AI in JavaScript

Phil Nash
Developer relations engineer at DataStax

Phil Nash

twitter.com/philnash

linkedin.com/in/philnash

philna.sh/links


How to work with generative AI in JavaScript

Why?

GenAI is
✨ brand new ✨

Some good news

LLMs

LLM APIs

Gemini · Anthropic · OpenAI

https://artificialanalysis.ai/

Gemini


import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash"});

const prompt = "Write a whole presentation on getting started with AI and JavaScript";
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();

Demo

Pretty cool
not very useful
...yet

OK, how do these things actually work?

Tokens

https://platform.openai.com/tokenizer

Fancy autocomplete

The model generates the next token in the sequence, e.g.

"Once upon a time" → ", in"

"Once upon a time, in" → " a"

"Once upon a time, in a" → " land"
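The "fancy autocomplete" loop above can be sketched in a few lines. The probability table here is hand-written for illustration — a real model derives these probabilities from billions of learned parameters:

```javascript
// Toy "fancy autocomplete": repeatedly pick the most likely next token.
// The probability table is made up for illustration.
const nextTokenProbs = {
  "Once upon a time": { ", in": 0.6, ". The": 0.3, " there": 0.1 },
  "Once upon a time, in": { " a": 0.7, " the": 0.3 },
  "Once upon a time, in a": { " land": 0.5, " galaxy": 0.4, " town": 0.1 },
};

function generate(prompt, steps) {
  let text = prompt;
  for (let i = 0; i < steps; i++) {
    const probs = nextTokenProbs[text];
    if (!probs) break;
    // Greedy decoding: always take the single most likely next token
    const [token] = Object.entries(probs).sort((a, b) => b[1] - a[1])[0];
    text += token;
  }
  return text;
}

console.log(generate("Once upon a time", 3));
// => "Once upon a time, in a land"
```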

Changing the behaviour

topK: consider only the K most likely tokens when generating the next token

topP: consider only the most likely tokens whose cumulative probability reaches P

temperature: how much randomness is applied when choosing from the selected tokens

Factual

Keep temperature down or restrict topK or topP

Creative

Turn the temperature up and don't restrict topK and topP
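These settings can be sketched with a toy sampler. The probability table is made up; the `p ** (1 / temperature)` rescaling mirrors how temperature divides the model's logits before softmax:

```javascript
// Toy sampler showing how topK and temperature shape the choice of the
// next token. The probability table is made up for illustration.
function sample(probs, { topK = Infinity, temperature = 1 } = {}) {
  // topK: only the K most likely tokens stay in the running
  const entries = Object.entries(probs)
    .sort((a, b) => b[1] - a[1])
    .slice(0, topK);
  // temperature: low values sharpen the distribution (near-deterministic),
  // high values flatten it (more random)
  const scaled = entries.map(([token, p]) => [token, p ** (1 / temperature)]);
  const total = scaled.reduce((sum, [, p]) => sum + p, 0);
  // Weighted random pick from what's left
  let r = Math.random() * total;
  for (const [token, p] of scaled) {
    r -= p;
    if (r <= 0) return token;
  }
  return scaled[scaled.length - 1][0];
}

const probs = { " land": 0.5, " galaxy": 0.3, " town": 0.15, " swamp": 0.05 };
console.log(sample(probs, { topK: 1 }));          // always " land"
console.log(sample(probs, { temperature: 0.2 })); // almost always " land"
console.log(sample(probs, { temperature: 2 }));   // much more varied
```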

Beyond the model

The models don't know

So we need to tell them

Prompt engineering

Zero-shot prompting

Instruction

"Tell me a story about a friendly goblin"

Grobnar wasn't your typical goblin. While his brethren were known for their grubby habits, their insatiable greed, and their fondness for mischief, Grobnar was different...

Question

"What is the capital of France?"

The capital of France is Paris.

One/few-shot prompting

"Australia: Canberra. France: Paris. UK: London. USA:"

You got it! The answer is Washington D.C.

Let me know if you want to try another country! 😊

System prompt

System:
"You are a capital city assistant. Only respond with the name of the capital city for the country or state you are provided."

Prompt:
"Australia: Canberra. France: Paris. UK: London. USA:"

Washington, D.C.

Chain of thought

System:
"You are a riddle-solving assistant. You will be told a riddle; break down your chain of thought, then come up with an answer."

Prompt:
"You see a boat filled with people. It has not sunk, but when you look again you don’t see a single person on the boat. Why?"

Chain of thought

Here's my thought process for solving this riddle:

1. The riddle focuses on the boat and people: This suggests the answer lies in the relationship between the two.

2. The boat is full of people, then empty: This implies a change in state or perspective.

3. The boat hasn't sunk: This rules out any event that would physically remove the people.

Answer:

The most likely answer is that everyone on the boat is married. You initially saw a boat full of "people," but after you looked again, you realized that all the people on the boat were married, meaning they were no longer single people.

Chat

To chat with a model, we give the model the conversation context.


import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash"});
const chat = model.startChat(config);

let userInput;

// getUserInput is a stand-in for however your app reads user input
while ((userInput = getUserInput())) {
  const result = await chat.sendMessage(userInput);
  const response = await result.response;
  const text = response.text();
  console.log(text);
}
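The startChat helper hides what is really happening: the whole conversation is resent with every request. A sketch of the role-tagged history the SDK maintains — the shape matches Gemini's user/model roles, but maintaining it by hand like this is only for illustration:

```javascript
// What "chat" means under the hood: an ever-growing, role-tagged history
// that is sent to the model in full with every new message.
const history = [];

function addTurn(role, text) {
  history.push({ role, parts: [{ text }] });
}

addTurn("user", "What is the capital of France?");
addTurn("model", "The capital of France is Paris.");
// The earlier turns are what give this follow-up question its meaning
addTurn("user", "And what about Australia?");

console.log(history.length); // => 3
console.log(history[2].parts[0].text); // => "And what about Australia?"
```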

Demo

Context

Retrieval-Augmented Generation

RAG: the idea

  • Store our data
  • User makes query
  • Retrieve relevant data
  • Provide data as context to model with the original query
  • Model generates response
  • ???
  • Profit!

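The steps above can be sketched as one pipeline. Here retrieve and generate are hypothetical stand-ins for a real vector search and a real model call:

```javascript
// Hypothetical RAG pipeline. retrieve() and generate() are stand-ins for
// a real vector search and a real model call.
async function retrieve(query) {
  // Pretend vector search: return data relevant to the query
  return ["Phil Nash is speaking about generative AI in JavaScript."];
}

async function generate(prompt) {
  // Pretend model call: echo the prompt so we can see what the model receives
  return `Model saw:\n${prompt}`;
}

async function answer(query) {
  const docs = await retrieve(query);
  // Provide the retrieved data as context alongside the original query
  const prompt = `Context:\n${docs.join("\n")}\n\nQuestion: ${query}`;
  return generate(prompt);
}

console.log(await answer("What is Phil Nash speaking about?"));
```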

Vector search

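Vector search ranks stored documents by how close their embedding vectors sit to the query vector. A common closeness measure is cosine similarity — a minimal sketch:

```javascript
// Cosine similarity: 1 for vectors pointing the same way, 0 for unrelated
// (orthogonal) vectors. Real vector databases do this at scale with
// approximate-nearest-neighbour indexes rather than a brute-force loop.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

console.log(cosineSimilarity([1, 2], [2, 4])); // => ~1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // => 0 (orthogonal)
```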

How do we turn content into vectors?


import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.API_KEY);

const model = genAI.getGenerativeModel({ model: "text-embedding-004"});
const text = "How to work with generative AI in JavaScript...";

const result = await model.embedContent(text);
const embedding = result.embedding;
console.log(embedding.values);
// => [-0.0018657326, -0.00950177, -0.062905475, 0.011513614, -0.043369178, ... ]

How do we turn content into vectors?

import { DataAPIClient } from "@datastax/astra-db-ts";

export const client = new DataAPIClient(astraDb.token);
export const db = client.db(astraDb.endpoint);
export const collection = db.collection(astraDb.collection);

const privateData = ["How to work with generative AI in JavaScript...", ...];

await collection.insertMany(
  privateData.map((data) => ({ $vectorize: data }))
);

How do we turn a user query into vectors?


import { DataAPIClient } from "@datastax/astra-db-ts";

export const client = new DataAPIClient(astraDb.token);
export const db = client.db(astraDb.endpoint);
export const collection = db.collection(astraDb.collection);

const query = "What is Phil Nash speaking about?";
const context = await collection
  .find(
    {},
    {
      sort: { $vectorize: query },
      projection: { $vectorize: 1 },
    }
  )
  .toArray();
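Each document comes back with its text in the `$vectorize` field, so the results can be folded into the prompt sent to the model. buildPrompt here is a hypothetical helper, not part of any SDK:

```javascript
// Hypothetical helper: ground the model's answer in the retrieved documents
function buildPrompt(context, query) {
  const docs = context.map((doc) => doc.$vectorize).join("\n---\n");
  return `Answer the question using only the context below.

Context:
${docs}

Question: ${query}`;
}

// e.g. with one retrieved document:
const prompt = buildPrompt(
  [{ $vectorize: "Phil Nash is speaking about generative AI in JavaScript." }],
  "What is Phil Nash speaking about?"
);
// prompt is then passed to model.generateContent(prompt) as before
console.log(prompt);
```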
              

Demo

UX

Models are slow

Streaming responses


import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash"});
const chat = model.startChat(config);

const query = "What is Phil Nash speaking about?";
const result = await chat.sendMessageStream(query);

for await (const chunk of result.stream) {
  console.log(chunk.text());
}
            

Streaming with fetch

const response = await fetch("/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ query }),
});

const decoder = new TextDecoder();

for await (const chunk of response.body) {
  console.log(decoder.decode(chunk));
}
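On the other side of that fetch there needs to be a handler that writes chunks as they become available. A minimal sketch with Node's built-in http module — the fixed chunks array is a stand-in for the chunks coming out of chat.sendMessageStream:

```javascript
import http from "node:http";

// Write each chunk as soon as it's available instead of buffering the
// whole response. The chunks array stands in for a model's stream.
const server = http.createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  const chunks = ["Phil Nash ", "is speaking ", "about AI."];
  for (const chunk of chunks) {
    res.write(chunk); // flushed to the client immediately
    await new Promise((resolve) => setTimeout(resolve, 100));
  }
  res.end();
});

server.listen(3000);
```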
              

Demo

Extras

Libraries

npm install ai

Vercel's AI SDK

Unified interface to AI providers

Simplifies streaming responses for Next.js, Svelte, Nuxt, Node

Libraries

npm install langchain
npm install llamaindex

Popular (originally Python) libraries that provide many tools for AI engineers

Including: unified interface to AI providers, strategies for data ingestion, useful patterns

Visual builders

Demo

Function calling!

⚠️ Experimental ⚠️

In-browser models

Demo

Now is the
time to build

Thanks
