Comprehensive guide to managing vector documents in the View Vector Database platform.
Overview
Vector documents contain metadata about processed documents with their associated vector embeddings and semantic analysis data. They serve as the foundation for AI-powered search, semantic analysis, and vector-based document retrieval within the View Vector Database platform.
Vector documents are managed via the View Vector API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents and support document creation, retrieval, existence checking, and deletion.
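As a minimal sketch of how the endpoint path above is assembled, the hypothetical helper below (not part of any View SDK) builds the collection URL and, when a document GUID is supplied, the per-document URL used by the read, existence, and delete operations. The host, port, and GUID values are placeholders.

```python
from typing import Optional
from urllib.parse import quote


def documents_url(
    base_url: str,
    tenant_guid: str,
    vector_repository_guid: str,
    document_guid: Optional[str] = None,
) -> str:
    """Build the documents endpoint URL (hypothetical helper, not an SDK call)."""
    parts = [
        base_url.rstrip("/"),  # avoid a double slash when joining
        "v1.0",
        "tenants",
        quote(tenant_guid, safe=""),
        "vectorrepositories",
        quote(vector_repository_guid, safe=""),
        "documents",
    ]
    # Append the document GUID only for per-document operations.
    if document_guid is not None:
        parts.append(quote(document_guid, safe=""))
    return "/".join(parts)


zero = "00000000-0000-0000-0000-000000000000"
print(documents_url("http://localhost:8000/", zero, zero))
print(documents_url("http://localhost:8000/", zero, zero, zero))
```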
Vector Document Object Structure
A vector document object has the following structure:
{
"Success": true,
"GUID": "8b5b2c5d-bf03-4bb4-8888-fd1c2a3645ec",
"DocumentGUID": "d826c2ce-ed73-49d8-beed-b487fb37ae80",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"CollectionGUID": "00000000-0000-0000-0000-000000000000",
"SourceDocumentGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"GraphNodeIdentifier": "",
"ObjectGUID": "00000000-0000-0000-0000-000000000000",
"ObjectKey": "hello.json",
"ObjectVersion": "1",
"Model": "sentence-transformers/all-MiniLM-L6-v2",
"SemanticCells": [],
"CreatedUtc": "2024-06-01T12:00:00.000000Z"
}
Field Descriptions
- Success (boolean): Indicates whether the operation that created this data was successful
- GUID (GUID): Globally unique identifier for the current metadata record
- DocumentGUID (GUID): Globally unique identifier for the processed document
- TenantGUID (GUID): Globally unique identifier for the tenant
- CollectionGUID (GUID): Globally unique identifier for the collection this document belongs to
- SourceDocumentGUID (GUID): Globally unique identifier for the original source document
- BucketGUID (GUID): Globally unique identifier for the storage bucket
- VectorRepositoryGUID (GUID): Globally unique identifier for the vector store or repository
- GraphNodeIdentifier (string): String representing a node identifier within a semantic or knowledge graph
- ObjectGUID (GUID): Globally unique identifier for the object stored (usually in S3 or similar)
- ObjectKey (string): The key (filename) of the object
- ObjectVersion (string): Version of the stored object
- Model (string): The machine learning model used for semantic processing (e.g., embedding)
- SemanticCells (array): An array of semantic cell representations (typically empty until populated by processing)
- CreatedUtc (datetime): Timestamp for when this metadata record was created, in UTC time
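To illustrate how a client might consume this object, here is a small sketch that parses the JSON shown above into a typed container. The `VectorDocument` class and `parse_vector_document` function are illustrative names, not SDK types, and only a subset of the fields is mapped.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class VectorDocument:
    """Illustrative container for a subset of the fields described above."""
    guid: str
    document_guid: str
    object_key: str
    object_version: str
    model: str
    created_utc: datetime
    semantic_cells: List[Dict[str, Any]] = field(default_factory=list)


def parse_vector_document(raw: str) -> VectorDocument:
    data = json.loads(raw)
    # Swap the trailing "Z" for an explicit offset so fromisoformat()
    # also accepts the timestamp on Python versions before 3.11.
    created = datetime.fromisoformat(data["CreatedUtc"].replace("Z", "+00:00"))
    return VectorDocument(
        guid=data["GUID"],
        document_guid=data["DocumentGUID"],
        object_key=data["ObjectKey"],
        object_version=data["ObjectVersion"],
        model=data.get("Model", ""),
        created_utc=created.astimezone(timezone.utc),
        semantic_cells=data.get("SemanticCells") or [],
    )


sample = """{
  "GUID": "8b5b2c5d-bf03-4bb4-8888-fd1c2a3645ec",
  "DocumentGUID": "d826c2ce-ed73-49d8-beed-b487fb37ae80",
  "ObjectKey": "hello.json",
  "ObjectVersion": "1",
  "Model": "sentence-transformers/all-MiniLM-L6-v2",
  "SemanticCells": [],
  "CreatedUtc": "2024-06-01T12:00:00.000000Z"
}"""
doc = parse_vector_document(sample)
print(doc.object_key, doc.created_utc.isoformat())
```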
Write Vector Document
Creates a new vector document with semantic processing data using POST /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents. Stores document metadata along with vector embeddings and semantic analysis information for AI-powered search and retrieval.
Request Parameters
Required Parameters
- BucketGUID (GUID, Body, Required): GUID of the storage bucket containing the source object
- CollectionGUID (GUID, Body, Required): GUID of the collection this document belongs to
- SourceDocumentGUID (GUID, Body, Required): GUID of the original source document
- ObjectGUID (GUID, Body, Required): GUID of the object stored in the bucket
- VectorRepositoryGUID (GUID, Body, Required): GUID of the vector repository for embeddings storage
- ObjectKey (string, Body, Required): Key/filename of the object
- ObjectVersion (string, Body, Required): Version of the stored object
- CreatedUtc (datetime, Body, Required): Timestamp when the document was created
Optional Parameters
- GraphNodeIdentifier (string, Body, Optional): Node identifier within a semantic or knowledge graph
- Model (string, Body, Optional): Machine learning model used for semantic processing
- SemanticCells (array, Body, Optional): Array of semantic cell representations with embeddings
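A client can check the required fields listed above before sending the request. The sketch below is one possible pre-flight check; `missing_required_fields` is a hypothetical helper, and the server's own validation rules may differ.

```python
from typing import Any, Dict, List

# Required body fields, taken from the parameter list above.
REQUIRED_FIELDS = (
    "BucketGUID",
    "CollectionGUID",
    "SourceDocumentGUID",
    "ObjectGUID",
    "VectorRepositoryGUID",
    "ObjectKey",
    "ObjectVersion",
    "CreatedUtc",
)


def missing_required_fields(body: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent or empty in the body."""
    return [name for name in REQUIRED_FIELDS if body.get(name) in (None, "")]


zero = "00000000-0000-0000-0000-000000000000"
body = {
    "BucketGUID": zero,
    "CollectionGUID": zero,
    "SourceDocumentGUID": zero,
    "ObjectGUID": zero,
    "VectorRepositoryGUID": zero,
    "ObjectKey": "hello.json",
    "ObjectVersion": "1",
    "SemanticCells": [],
}
print(missing_required_fields(body))  # CreatedUtc was left out on purpose
```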
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"CollectionGUID": "00000000-0000-0000-0000-000000000000",
"SourceDocumentGUID": "00000000-0000-0000-0000-000000000000",
"ObjectGUID": "00000000-0000-0000-0000-000000000000",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"ObjectKey": "hello.json",
"ObjectVersion": "1",
"CreatedUtc": "2024-06-01",
"SemanticCells": []
}'
import { ViewVectorProxySdk } from "view-sdk";
const api = new ViewVectorProxySdk(
"http://localhost:8000/", //endpoint
"<tenant-guid>", //tenant Id
"default" //access key
);
const writeDocument = async () => {
try {
const response = await api.Document.write({
BucketGUID: "<bucket-guid>",
CollectionGUID: "<collection-guid>",
SourceDocumentGUID: "<source-document-guid>",
ObjectGUID: "<object-guid>",
VectorRepositoryGUID: "<vector-repository-guid>",
ObjectKey: "hello.json",
ObjectVersion: "1",
CreatedUtc: "2024-06-01",
SemanticCells: [],
});
console.log(response, "Write document response");
} catch (err) {
console.log("Error write document:", err);
}
};
writeDocument();
import view_sdk
from view_sdk import vector
from view_sdk.sdk_configuration import Service

# Configure the SDK for the Vector service
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="<tenant-guid>",
    service_ports={Service.VECTOR: 8000},
)

def createDocument():
    response = vector.Documents.create(
        "<tenant-guid>",
        BucketGUID="<bucket-guid>",
        CollectionGUID="<collection-guid>",
        SourceDocumentGUID="<source-document-guid>",
        ObjectGUID="<object-guid>",
        VectorRepositoryGUID="<vector-repository-guid>",
        ObjectKey="hello.json",
        ObjectVersion="1",
        CreatedUtc="2024-06-01",
        # One text cell containing a single chunk with no embeddings yet
        SemanticCells=[
            {
                "CellType": "Text",
                "MD5Hash": "000",
                "SHA1Hash": "111",
                "SHA256Hash": "222",
                "Position": 0,
                "Chunks": [
                    {
                        "MD5Hash": "000",
                        "SHA1Hash": "111",
                        "SHA256Hash": "222",
                        "Position": 0,
                        "Content": "This is a sample chunk",
                        "Embeddings": [],
                    }
                ],
            }
        ],
    )
    print(response)

createDocument()
using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;
ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
SemanticChunk chunk = new SemanticChunk
{
MD5Hash = "000",
SHA1Hash = "111",
SHA256Hash = "222",
Position = 0,
Content = "This is a sample chunk",
Embeddings = new List<float> { 0.16624743f, -0.01494671f }
};
SemanticCell cell = new SemanticCell
{
CellType = SemanticCellTypeEnum.Text,
MD5Hash = "000",
SHA1Hash = "111",
SHA256Hash = "222",
Position = 0,
Chunks = new List<SemanticChunk> { chunk, chunk }
};
EmbeddingsDocument document = new EmbeddingsDocument
{
BucketGUID = Guid.Parse("<bucket-guid>"),
CollectionGUID = Guid.Parse("<collection-guid>"),
SourceDocumentGUID = Guid.Parse("<source-document-guid>"),
ObjectGUID = Guid.Parse("<object-guid>"),
VectorRepositoryGUID = Guid.Parse("<vector-repository-guid>"),
ObjectKey = "hello.json",
ObjectVersion = "1",
CreatedUtc = DateTime.Parse("2024-06-01"),
SemanticCells = new List<SemanticCell> { cell, cell }
};
EmbeddingsDocument result = await sdk.Document.Write(document);
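The samples above use placeholder hash values ("000", "111", "222"). As a hedged sketch, a chunk dictionary with real digests could be built as follows. It assumes the hashes are lowercase hex digests of the chunk content encoded as UTF-8; the source does not state how the platform computes or validates them, and `build_chunk` is a hypothetical helper.

```python
import hashlib
from typing import Any, Dict, List, Optional


def build_chunk(
    content: str,
    position: int,
    embeddings: Optional[List[float]] = None,
) -> Dict[str, Any]:
    """Build a semantic chunk dictionary with digests of its content."""
    # Assumption: digests cover the UTF-8 bytes of the content.
    data = content.encode("utf-8")
    return {
        "MD5Hash": hashlib.md5(data).hexdigest(),
        "SHA1Hash": hashlib.sha1(data).hexdigest(),
        "SHA256Hash": hashlib.sha256(data).hexdigest(),
        "Position": position,
        "Content": content,
        "Embeddings": embeddings if embeddings is not None else [],
    }


chunk = build_chunk("This is a sample chunk", 0)
print(chunk["SHA256Hash"])
```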
Response
Returns the created vector document object with all metadata and processing information:
{
"Success": true,
"GUID": "8b5b2c5d-bf03-4bb4-8888-fd1c2a3645ec",
"DocumentGUID": "d826c2ce-ed73-49d8-beed-b487fb37ae80",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"CollectionGUID": "00000000-0000-0000-0000-000000000000",
"SourceDocumentGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"GraphNodeIdentifier": "",
"ObjectGUID": "00000000-0000-0000-0000-000000000000",
"ObjectKey": "hello.json",
"ObjectVersion": "1",
"Model": "sentence-transformers/all-MiniLM-L6-v2",
"SemanticCells": [],
"CreatedUtc": "2024-06-01T12:00:00.000000Z"
}
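For clients that do not use an SDK, the write call can be sketched with the Python standard library alone. The function below only constructs the request; it assumes a bearer-token `Authorization` header, which is a guess because the header value is redacted in the curl sample. `build_write_request` is a hypothetical name.

```python
import json
import urllib.request
from typing import Any, Dict


def build_write_request(
    base_url: str,
    tenant_guid: str,
    vector_repository_guid: str,
    token: str,
    body: Dict[str, Any],
) -> urllib.request.Request:
    """Construct (but do not send) the POST request for a new vector document."""
    url = (
        f"{base_url.rstrip('/')}/v1.0/tenants/{tenant_guid}"
        f"/vectorrepositories/{vector_repository_guid}/documents"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer scheme; adjust to your deployment.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_write_request(
    "http://localhost:8000/",
    "00000000-0000-0000-0000-000000000000",
    "00000000-0000-0000-0000-000000000000",
    "default",
    {"ObjectKey": "hello.json", "ObjectVersion": "1", "SemanticCells": []},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` would send it; a successful call should return the JSON object shown above.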
Read Vector Document
Retrieves a specific vector document by GUID using GET /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]. Returns the complete document metadata including vector embeddings and semantic processing information.
Request Parameters
- vector-repository-guid (string, Path, Required): GUID of the vector repository containing the document
- document-guid (string, Path, Required): GUID of the vector document to retrieve
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/8b5b2c5d-bf03-4bb4-8888-fd1c2a3645ec' \
--header 'Authorization: ••••••'
import { ViewVectorProxySdk } from "view-sdk";
const api = new ViewVectorProxySdk(
"http://localhost:8000/", //endpoint
"<tenant-guid>", //tenant Id
"default" //access key
);
const readDocument = async () => {
try {
const response = await api.Document.read(
"<vector-repository-guid>",
"<document-guid>"
);
console.log(response, "Read document response");
} catch (err) {
console.log("Error read document:", err);
}
};
readDocument();
import view_sdk
from view_sdk import vector
from view_sdk.sdk_configuration import Service

# Configure the SDK for the Vector service
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="<tenant-guid>",
    service_ports={Service.VECTOR: 8000},
)

def readDocument():
    response = vector.Documents.retrieve("<vector-repository-guid>", "<document-guid>")
    print(response)

readDocument()
using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;
ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid = Guid.Parse("<document-guid>");
EmbeddingsDocument result = await sdk.Document.Retrieve(vectorRepoGuid, documentGuid);
Response
Returns the vector document object with all metadata, embeddings, and semantic processing information if found, or a 404 Not Found error if the document doesn't exist.
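The two outcomes described above map naturally onto an optional return value. The following sketch, using a hypothetical `parse_read_response` helper that receives the HTTP status code and raw body from whatever HTTP client you use, returns the parsed document on 200 and `None` on 404. Treating every other status as an error is an assumption of this example.

```python
import json
from typing import Any, Dict, Optional


def parse_read_response(status: int, body: bytes) -> Optional[Dict[str, Any]]:
    """Interpret the result of a GET on the per-document endpoint."""
    if status == 200:
        return json.loads(body)
    if status == 404:
        # The document does not exist in this vector repository.
        return None
    raise RuntimeError(f"Unexpected status {status} while reading vector document")


found = parse_read_response(200, b'{"ObjectKey": "hello.json", "SemanticCells": []}')
missing = parse_read_response(404, b"")
print(found, missing)
```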
Check Vector Document Existence
Checks whether a vector document exists using HEAD /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]. Returns only HTTP status codes without response body for efficient existence verification.
Request Parameters
- vector-repository-guid (string, Path, Required): GUID of the vector repository containing the document
- document-guid (string, Path, Required): GUID of the vector document to check
curl --location --head 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewVectorProxySdk } from "view-sdk";
const api = new ViewVectorProxySdk(
"http://localhost:8000/", //endpoint
"<tenant-guid>", //tenant Id
"default" //access key
);
const documentExist = async () => {
try {
const response = await api.Document.exists(
"<vector-repository-guid>",
"<document-guid>"
);
console.log(response, "Document exist response");
} catch (err) {
console.log("Error document exist:", err);
}
};
documentExist();
import view_sdk
from view_sdk import vector
from view_sdk.sdk_configuration import Service

# Configure the SDK for the Vector service
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="<tenant-guid>",
    service_ports={Service.VECTOR: 8000},
)

def existsDocument():
    response = vector.Documents.exists("<vector-repository-guid>", "<document-guid>")
    print(response)

existsDocument()
using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;
ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid = Guid.Parse("<document-guid>");
bool exists = await sdk.Document.Exists(vectorRepoGuid, documentGuid);
Response
- 200 OK: Vector document exists
- 404 Not Found: Vector document does not exist
Note: HEAD requests do not return a response body; only the HTTP status code indicates whether the vector document exists.
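Because only the status code carries information, an existence check reduces to issuing a HEAD request and mapping the code to a boolean. The sketch below uses Python's `http.client`. The `head_status` and `document_exists` names are hypothetical, the bearer-token header is an assumption (the sample header value is redacted), and raising on any status other than 200 or 404 is a choice made for this example.

```python
import http.client


def head_status(host: str, port: int, path: str, token: str) -> int:
    """Send a HEAD request and return only the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # Assumption: bearer scheme; adjust to your deployment.
        conn.request("HEAD", path, headers={"Authorization": f"Bearer {token}"})
        return conn.getresponse().status
    finally:
        conn.close()


def document_exists(status: int) -> bool:
    """Map the documented status codes to a boolean."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"Unexpected status {status} from existence check")


print(document_exists(200), document_exists(404))
```

In practice the two functions would be combined as `document_exists(head_status(host, port, path, token))`.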
Delete Vector Document
Deletes a vector document by GUID using DELETE /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]. Removes the document from the vector repository along with all associated embeddings and semantic processing data.
Request Parameters
- vector-repository-guid (string, Path, Required): GUID of the vector repository containing the document
- document-guid (string, Path, Required): GUID of the vector document to delete
curl --location --request DELETE 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/00000000-0000-0000-0000-000000000000' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••'
import { ViewVectorProxySdk } from "view-sdk";
const api = new ViewVectorProxySdk(
"http://localhost:8000/", //endpoint
"<tenant-guid>", //tenant Id
"default" //access key
);
const deleteDocument = async () => {
try {
const response = await api.Document.delete(
"<vector-repository-guid>",
"<document-guid>"
);
console.log(response, "Delete document response");
} catch (err) {
console.log("Error delete document:", err);
}
};
deleteDocument();
import view_sdk
from view_sdk import vector
from view_sdk.sdk_configuration import Service

# Configure the SDK for the Vector service
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="<tenant-guid>",
    service_ports={Service.VECTOR: 8000},
)

def deleteDocument():
    response = vector.Documents.delete("<vector-repository-guid>", "<document-guid>")
    print(response)

deleteDocument()
using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;
ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid = Guid.Parse("<document-guid>");
bool deleted = await sdk.Document.Delete(vectorRepoGuid, documentGuid);
Response
- 200 OK: Vector document deleted successfully
- 404 Not Found: Vector document does not exist
Note: Deleting a vector document removes it permanently from the vector repository and all associated search indexes. This operation cannot be undone.
Best Practices
When managing vector documents in the View Vector Database, consider the following recommendations for optimal document processing, search performance, and vector management:
- Document Organization: Organize vector documents within logical collections and repositories based on content type, domain, or search requirements
- Embedding Quality: Ensure high-quality vector embeddings are generated using appropriate models for your specific use case and content types
- Semantic Processing: Use comprehensive semantic cell and chunk processing to maximize search accuracy and content understanding
- Model Selection: Choose appropriate embedding models based on your content types and search requirements for optimal vector similarity
- Performance Optimization: Monitor document processing times and optimize semantic cell structures for efficient vector operations
Next Steps
After successfully managing vector documents, you can:
- Vector Search: Implement advanced vector search operations using inner product, cosine distance, and L2 distance algorithms
- Semantic Analysis: Analyze and manage semantic cells and chunks for detailed content understanding and processing
- Embeddings Management: Work with embeddings documents for comprehensive vector storage and retrieval operations
- Search Integration: Integrate vector document management with search functionality for AI-powered document discovery
- Performance Monitoring: Monitor vector document processing and search performance to optimize your AI-powered applications
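The three similarity measures named under Vector Search have standard textbook definitions, sketched here in plain Python for two equal-length vectors. This illustrates the mathematics only; how the platform computes, normalizes, or indexes these measures is not described in this guide.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus cosine similarity; 0 for parallel vectors, 1 for orthogonal."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
candidate = [0.0, 1.0]
print(inner_product(query, candidate))
print(l2_distance(query, candidate))
print(cosine_distance(query, candidate))
```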