This page provides an overview of APIs related to embeddings persistence, management, and deletion.
Object Overview
Embeddings documents contain vector embeddings and relevant metadata about the source document from which the embeddings were generated.
Endpoint, URL, and Supported Methods
Embeddings documents are managed via the View Vector API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/documents
By default, View Vector is accessible on port 8311.
Supported methods include: PUT, POST, DELETE
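As a minimal sketch of how the URL template above can be filled in, the following Python helper assembles the documents endpoint. The function name, the default scheme, and the `localhost` hostname are illustrative assumptions and not part of the View Vector API; only the path layout and default port come from this page.

```python
from urllib.parse import quote

DEFAULT_PORT = 8311  # default View Vector port, as stated on this page


def documents_url(hostname: str, tenant_guid: str,
                  port: int = DEFAULT_PORT, scheme: str = "http") -> str:
    """Build the documents endpoint URL for a tenant (hypothetical helper)."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    # Percent-encode the tenant GUID so it is safe to place in a URL path.
    tenant = quote(tenant_guid, safe="")
    return f"{scheme}://{hostname}:{port}/v1.0/tenants/{tenant}/documents"


# "localhost" and "default" are placeholder values.
print(documents_url("localhost", "default"))
# prints: http://localhost:8311/v1.0/tenants/default/documents
```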
Structure
Objects have the following structure:
{
"GUID": "ac16a21a-88a4-4083-8fcd-75f49bd02384",
"TenantGUID": "default",
"CollectionGUID": "default",
"SourceDocumentGUID": "94dcfa65-adec-4ddf-bea7-2e4f290ace24",
"BucketGUID": "example-data-bucket",
"VectorRepositoryGUID": "example-vector-repository",
"GraphRepositoryGUID": "ac16a21a-88a4-4083-8fcd-75f49bd02384",
"GraphNodeIdentifier": "94dcfa65-adec-4ddf-bea7-2e4f290ace24",
"ObjectGUID": "2462d84b-dba8-4deb-ba3b-f1dee2106376",
"ObjectKey": "1.pdf",
"ObjectVersion": "1",
"Model": "all-MiniLM-L6-v2",
"Score": 1.3349401950836182,
"SemanticCells": [
{
"GUID": "0e20c037-7eeb-407d-b2fc-b4b3db422e22",
"CellType": "Text",
"MD5Hash": "A382429550056CCFA2BFFC602EC86605",
"SHA1Hash": "1BFE6FBDFC07B9217E3DC03D10D89551EBD3BAFE",
"SHA256Hash": "539821158F217EB540C1DC83E9FDD7DC48BAE9874BD3A7807613C6F8FED0F064",
"Position": 0,
"Length": 0,
"Chunks": [
{
"GUID": "e45dc62b-e662-47b7-a1b9-d98f112813f0",
"MD5Hash": "A382429550056CCFA2BFFC602EC86605",
"SHA1Hash": "1BFE6FBDFC07B9217E3DC03D10D89551EBD3BAFE",
"SHA256Hash": "539821158F217EB540C1DC83E9FDD7DC48BAE9874BD3A7807613C6F8FED0F064",
"Position": 0,
"Start": 0,
"End": 0,
"Length": 0,
"Content": "The quick brown fox jumped over the lazy dog",
"Embeddings": [
-0.013084393,
...
]
},
...
],
"Children": []
},
...
],
"CreatedUtc": "2024-10-25T02:14:08.000000Z"
}
Properties:
GUID (string): globally unique identifier for the object
TenantGUID (string): globally unique identifier for the tenant
CollectionGUID (string): globally unique identifier for the collection
SourceDocumentGUID (string): globally unique identifier for the source document
BucketGUID (string): globally unique identifier for the bucket where the object is stored
DataRepositoryGUID (string): globally unique identifier for the data repository where the object is stored
VectorRepositoryGUID (string): globally unique identifier for the vector repository where the embeddings are stored
GraphRepositoryGUID (string): globally unique identifier for the graph repository where relationship metadata is stored
GraphNodeIdentifier (string): globally unique identifier for the graph node where relationship metadata is stored
ObjectGUID (string): globally unique identifier for the object
ObjectKey (string): key for the object
ObjectVersion (string): version of the object
Model (string): model from which embeddings were generated
Score (float): score for the document
SemanticCells (array): an array of semantic cells within the document
CreatedUtc (datetime): timestamp from creation, in UTC time
Create
To write an embeddings document, call POST /v1.0/tenants/[tenant-guid]/documents with a fully populated embeddings document.
An example request appears as follows:
{
"TenantGUID": "default",
"BucketGUID": "data",
"CollectionGUID": "default",
"SourceDocumentGUID": "default",
"ObjectGUID": "default",
"VectorRepositoryGUID": "example-vector-repository",
"ObjectKey": "hello.json",
"ObjectVersion": "1",
"CreatedUtc": "2024-06-01",
"SemanticCells": [
{
"GUID": "example-semantic-cell-1",
"CellType": "Text",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 0,
"Chunks": [
{
"GUID": "example-semantic-chunk-1",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 0,
"Content": "This is a sample chunk",
"Embeddings": [0.16624743426880373,...]
},
{
"GUID": "example-semantic-chunk-2",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 1,
"Content": "This is a sample chunk",
"Embeddings": [0.16624743426880373,...]
}
],
"Children": [ ]
},
{
"GUID": "example-semantic-cell-2",
"CellType": "Text",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 1,
"Chunks": [
{
"GUID": "example-semantic-chunk-3",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 0,
"Content": "This is a sample chunk",
"Embeddings": [0.16624743426880373,...]
},
{
"GUID": "example-semantic-chunk-4",
"MD5Hash": "000",
"SHA1Hash": "111",
"SHA256Hash": "222",
"Position": 1,
"Content": "This is a sample chunk",
"Embeddings": [0.16624743426880373,...]
}
],
"Children": [ ]
}
]
}
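A request like the one above could be prepared from Python roughly as follows. This is a sketch under stated assumptions: the helper name, the `localhost` base URL, the `Content-Type` header, and the shortened placeholder embeddings are illustrative, and any authentication your deployment requires is not shown because this page does not describe it. The request is built but not sent.

```python
import json
import urllib.request


def build_create_request(base_url: str, tenant_guid: str,
                         document: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying an embeddings document."""
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/documents"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="POST",
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )


# Abbreviated document; the embeddings here are placeholders, not real vectors.
document = {
    "TenantGUID": "default",
    "VectorRepositoryGUID": "example-vector-repository",
    "ObjectKey": "hello.json",
    "ObjectVersion": "1",
    "SemanticCells": [{
        "GUID": "example-semantic-cell-1",
        "CellType": "Text",
        "Position": 0,
        "Chunks": [{
            "GUID": "example-semantic-chunk-1",
            "Position": 0,
            "Content": "This is a sample chunk",
            "Embeddings": [0.1, 0.2, 0.3],
        }],
        "Children": [],
    }],
}

request = build_create_request("http://localhost:8311", "default", document)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```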
When writing embeddings, each semantic chunk is split into a separate database row, with the encapsulating semantic cell and document metadata attached. The result will be an array.
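The row-splitting described above can be pictured with a short Python sketch. It is only an illustration of the idea, not View Vector's actual storage code: the row column names (`CellGUID`, `ChunkGUID`, `ChunkPosition`) are hypothetical, and treating cells under `Children` the same way as top-level cells is an assumption.

```python
from typing import Any, Dict, Iterator, List


def flatten_document(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per semantic chunk, carrying cell and document metadata."""
    # Document-level metadata is every top-level field except the nested cells.
    doc_meta = {k: v for k, v in doc.items() if k != "SemanticCells"}

    def walk(cells: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
        for cell in cells:
            for chunk in cell.get("Chunks", []):
                row = dict(doc_meta)
                row["CellGUID"] = cell.get("GUID")            # hypothetical column
                row["CellType"] = cell.get("CellType")
                row["ChunkGUID"] = chunk.get("GUID")          # hypothetical column
                row["ChunkPosition"] = chunk.get("Position")  # hypothetical column
                row["Content"] = chunk.get("Content")
                row["Embeddings"] = chunk.get("Embeddings")
                yield row
            # Assumption: nested cells are flattened like top-level ones.
            yield from walk(cell.get("Children", []))

    return list(walk(doc.get("SemanticCells", [])))


def make_cell(cell_guid: str, chunk_guids: List[str]) -> Dict[str, Any]:
    """Build a small sample cell with placeholder embeddings."""
    chunks = [{"GUID": g, "Position": i, "Content": "This is a sample chunk",
               "Embeddings": [0.1, 0.2]} for i, g in enumerate(chunk_guids)]
    return {"GUID": cell_guid, "CellType": "Text", "Chunks": chunks, "Children": []}


doc = {
    "TenantGUID": "default",
    "ObjectKey": "hello.json",
    "SemanticCells": [
        make_cell("example-semantic-cell-1",
                  ["example-semantic-chunk-1", "example-semantic-chunk-2"]),
        make_cell("example-semantic-cell-2",
                  ["example-semantic-chunk-3", "example-semantic-chunk-4"]),
    ],
}

rows = flatten_document(doc)
print(len(rows))  # prints: 4
```

With two cells of two chunks each, the sketch yields four rows, each repeating the document and cell metadata alongside one chunk's content and embeddings.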
Search
To search embeddings, call PUT /v1.0/tenants/[tenant-guid]/search with a vector search request body. View Vector currently supports inner product search (InnerProduct), cosine distance search (CosineDistance), and L2 distance search (L2Distance). An example request body follows:
{
"SearchType": "CosineDistance",
"VectorRepositoryGUID": "example-vector-repository",
"MaxResults": 5,
"Embeddings": [0.16624743426880373,...]
}
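The search body above could be assembled and validated in Python along these lines. The helper name, the `localhost` base URL, the `Content-Type` header, and the three-element query vector are assumptions for illustration; a real query vector would need the same dimensionality as the stored embeddings. The request is built but not sent.

```python
import json
import urllib.request
from typing import Sequence

# Search types named on this page.
SEARCH_TYPES = ("InnerProduct", "CosineDistance", "L2Distance")


def build_search_request(base_url: str, tenant_guid: str, repository_guid: str,
                         embeddings: Sequence[float],
                         search_type: str = "CosineDistance",
                         max_results: int = 5) -> urllib.request.Request:
    """Prepare (but do not send) a PUT vector search request."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unsupported SearchType: {search_type!r}")
    body = {
        "SearchType": search_type,
        "VectorRepositoryGUID": repository_guid,
        "MaxResults": max_results,
        "Embeddings": list(embeddings),
    }
    return urllib.request.Request(
        f"{base_url}/v1.0/tenants/{tenant_guid}/search",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )


search = build_search_request("http://localhost:8311", "default",
                              "example-vector-repository", [0.1, 0.2, 0.3])
print(search.get_method(), search.full_url)
# To send it: urllib.request.urlopen(search)
```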
The result will be an array of distinct embeddings documents including document metadata, semantic cells, semantic chunks, and the embeddings for each.
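For reference, the three search types correspond to standard vector measures, sketched below in plain Python using their textbook definitions. How View Vector computes or ranks by these measures internally is not documented on this page, so treat this only as an aid to interpreting the names.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; smaller means more similar."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus cosine similarity; smaller means more similar."""
    _check(a, b)
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / norm


print(inner_product([3, 4], [3, 4]))  # prints: 25
print(l2_distance([0, 0], [3, 4]))    # prints: 5.0
print(cosine_distance([1, 0], [0, 1]))  # prints: 1.0
```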
Delete
To delete document metadata from View Vector, call DELETE /v1.0/tenants/[tenant-guid]/documents with a populated delete request body as follows. Any populated parameters are ANDed together, so only documents matching every populated parameter are deleted. This operation is destructive and cannot be undone.
{
"VectorRepositoryGUID": "example-vector-repository",
"TenantGUID": "default",
"CollectionGUID": null,
"DataRepositoryGUID": null,
"BucketGUID": "data",
"ObjectGUID": null,
"Key": "2.txt",
"Version": "1"
}
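The AND behavior of the delete body can be illustrated with a small local predicate. This is a sketch of the matching idea only, not server code: it assumes that `null` parameters are ignored, and that `Key` and `Version` correspond to the `ObjectKey` and `ObjectVersion` document properties.

```python
from typing import Any, Dict, Optional

# Assumed mapping from delete parameters to document properties.
FILTER_TO_FIELD = {"Key": "ObjectKey", "Version": "ObjectVersion"}


def matches(delete_request: Dict[str, Optional[Any]],
            document: Dict[str, Any]) -> bool:
    """True only if the document satisfies every populated parameter."""
    for key, wanted in delete_request.items():
        if wanted is None:
            continue  # assumption: unpopulated parameters do not filter
        field = FILTER_TO_FIELD.get(key, key)
        if document.get(field) != wanted:
            return False  # one mismatch is enough, since filters are ANDed
    return True


delete_request = {
    "VectorRepositoryGUID": "example-vector-repository",
    "TenantGUID": "default",
    "CollectionGUID": None,
    "DataRepositoryGUID": None,
    "BucketGUID": "data",
    "ObjectGUID": None,
    "Key": "2.txt",
    "Version": "1",
}

base = {
    "VectorRepositoryGUID": "example-vector-repository",
    "TenantGUID": "default",
    "BucketGUID": "data",
    "ObjectVersion": "1",
}
print(matches(delete_request, {**base, "ObjectKey": "2.txt"}))  # prints: True
print(matches(delete_request, {**base, "ObjectKey": "1.pdf"}))  # prints: False
```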
To truncate a vector table entirely, call DELETE /v1.0/tenants/[tenant-guid]/documents?truncate with a JSON object containing the globally unique identifier of the vector repository object. This operation is destructive and cannot be undone.
{
"VectorRepositoryGUID": "example-vector-repository"
}
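Both delete forms could be prepared from Python with one helper, sketched below. The helper name, the `localhost` base URL, and the `Content-Type` header are assumptions for illustration. The request is deliberately built without being sent, since both operations are irreversible.

```python
import json
import urllib.request


def build_delete_request(base_url: str, tenant_guid: str, body: dict,
                         truncate: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a DELETE against the documents endpoint."""
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/documents"
    if truncate:
        url += "?truncate"  # truncates the whole vector table
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="DELETE",
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )


truncate_request = build_delete_request(
    "http://localhost:8311", "default",
    {"VectorRepositoryGUID": "example-vector-repository"},
    truncate=True,
)
print(truncate_request.get_method(), truncate_request.full_url)
# Destructive: only call urllib.request.urlopen(truncate_request) deliberately.
```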