Built-in vector database
How to use Relevance AI’s built in dataset
Relevance AI has built-in datasets that are NoSQL-based documents and support storing vector embeddings and vector search.
Install the SDK via:
npm install --save @relevanceai/dataset
What is a dataset?
A dataset is a collection of documents that are stored in a NoSQL database. Each document has a unique ID and a set of fields. Each field can be a string, number, date, or a vector. A vector is a list of numbers that represent a point in a multi-dimensional space.
You can use this dataset like you would a Table in SQL or Collection in Mongo. It has CRUD with support for endpoints like:
- Get documents
- Get documents where
- Insert documents
- Update documents
- Update documents where
- Delete documents
- Delete documents where
- Search
Every document in the dataset has an _id
string field that acts as a unique identifier.
Support for vector embeddings
Each dataset can store one or more vector fields that have a _vector_
suffix with up to 2048 dimensions. Relevance AI’s API can automatically generate the embeddings upon insert and during search by specifying the model.
const dataset = client.dataset('1000-movies');
const movies = [
{ title: 'Lord of the Rings: The Fellowship of the Ring', genre: 'action', budget: 100 }, ...
];
await dataset.insertDocuments(movies, [
{ model_name: 'text-embedding-ada-002', field: 'title' }
]);
const query = QueryBuilder().vector('title_vector_',
{
query: 'LOTR',
model: 'text-embeddings-ada-002'
});
const { results } = await dataset.search(query);
Was this page helpful?