How to Create OpenAI Embeddings & Why You Need to Know About Them
Text embeddings are an NLP technique that converts text into numerical vectors that machine learning algorithms, especially large models, can process. These vector representations are designed to capture the semantic meaning and context of the words they represent.
The technique involves taking a chunk of text and turning it into a vector: a long list of floating-point numbers.
The output represents a point in high-dimensional space. Think of the Cartesian coordinate system, but instead of 2 dimensions (the x- and y-axes) there are thousands of them. Words that are closer to each other in semantic meaning are placed closer to each other in that space.
For example:
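Here is a minimal sketch of what "closer to each other" means in practice. The three-dimensional vectors are made up purely for illustration (real embeddings come from a model and have thousands of dimensions), and `euclideanDistance` and `toyVectors` are hypothetical names, not part of any library.

```javascript
// Straight-line (Euclidean) distance between two points with the same number of dimensions.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let sumOfSquares = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sumOfSquares += diff * diff;
  }
  return Math.sqrt(sumOfSquares);
}

// Made-up coordinates, for illustration only (not real model output).
const toyVectors = {
  cat: [0.9, 0.8, 0.1],
  kitten: [0.85, 0.75, 0.2],
  car: [0.1, 0.2, 0.95],
};

const catToKitten = euclideanDistance(toyVectors.cat, toyVectors.kitten);
const catToCar = euclideanDistance(toyVectors.cat, toyVectors.car);

// "cat" lands much nearer to "kitten" than to "car" in this toy space.
console.log(catToKitten < catToCar); // true
```

With real embeddings the idea is the same: texts with related meanings end up near each other, unrelated ones end up far apart.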
One of the best uses for embeddings is semantic search, which returns not the exact matches for words but the best results in terms of semantic meaning.
Using OpenAI's models is one way of generating embeddings; you can also use other models, such as BERT.
To get started, you need an API key from OpenAI. Create an account if you don't already have one, then navigate to this link inside your dashboard, where you can create a new key.
We will install LangChain, a framework for building LLM-based applications. I have found it helpful for streamlining a lot of AI-related functionality and tasks when building apps.
Inside your project you can run the command:
npm install langchain @langchain/openai
And create a new function:
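import { OpenAIEmbeddings } from "@langchain/openai";

// Returns an array with one embedding vector per input text.
const generateEmbeddings = async (text) => {
  const embeddings = new OpenAIEmbeddings({
    // Assumes the key is stored in the OPENAI_API_KEY environment variable;
    // you can also pass your API key here directly.
    openAIApiKey: process.env.OPENAI_API_KEY,
    modelName: "text-embedding-3-large",
  });
  const vectors = await embeddings.embedDocuments([text]);
  return vectors;
};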
The function generates embeddings that you can use for search, summarization, or with your vector database. You can call it like this:
const text = "Your text here";
generateEmbeddings(text).then((vectors) => {
  // do stuff here
  // vectors holds one embedding (an array of numbers) per input text
  console.log(vectors); // examine results
});
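To tie this back to semantic search, here is a rough sketch of ranking a few documents against a query. `semanticSearch` and `cosineSimilarity` are hypothetical helpers written for this post, not LangChain functions. The sketch assumes the `embedder` you pass in exposes `embedDocuments` and `embedQuery`, which the `OpenAIEmbeddings` class does, and that an in-memory comparison is good enough (for larger collections you would likely want a vector database instead).

```javascript
// Cosine similarity: close to 1 when two vectors point the same way, close to 0 when unrelated.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: embeds the documents and the query, then returns
// the topK documents ordered from most to least similar to the query.
async function semanticSearch(embedder, documents, query, topK = 3) {
  const [docVectors, queryVector] = await Promise.all([
    embedder.embedDocuments(documents),
    embedder.embedQuery(query),
  ]);
  return documents
    .map((text, i) => ({ text, score: cosineSimilarity(queryVector, docVectors[i]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Usage (commented out because it needs a valid API key and the import from earlier):
// const embeddings = new OpenAIEmbeddings({ modelName: "text-embedding-3-large" });
// semanticSearch(embeddings, ["How to bake bread", "Fixing a flat tire"], "bicycle repair")
//   .then((results) => console.log(results));
```

Note that this re-embeds every document on each call, which is fine for a quick experiment but wasteful in a real app; storing the document vectors once and only embedding the query is the usual next step.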
If you prefer a video format, where I also create a Vector Database and build a document Summarization Tool, you can check out the following: