Semantic Chunk Management

Comprehensive guide to managing semantic chunks in the View Vector Database platform for content analysis and vector embeddings.

Overview

Semantic chunks represent the smallest units of content within semantic cells, containing actual text content and their associated vector embeddings. They serve as the foundation for vector-based search, content analysis, and AI-powered document processing within the View Vector Database platform.

Semantic chunks are managed via the View Vector API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]/cells/[cell-guid]/chunks and support comprehensive operations including chunk retrieval, existence checking, and vector embedding analysis.
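
The path above nests five identifiers, so it can help to assemble it in one place. The following is a minimal sketch rather than part of the View SDK: the function name `build_chunks_url` and its parameters are hypothetical, and only the path layout is taken from the endpoint shown above.

```python
from urllib.parse import quote


def build_chunks_url(base_url, tenant_guid, repo_guid, document_guid,
                     cell_guid, chunk_guid=None):
    """Build the chunk collection URL, or a single-chunk URL if chunk_guid is given.

    Hypothetical helper: only the path layout comes from the documentation.
    """
    segments = [
        "v1.0",
        "tenants", tenant_guid,
        "vectorrepositories", repo_guid,
        "documents", document_guid,
        "cells", cell_guid,
        "chunks",
    ]
    if chunk_guid is not None:
        segments.append(chunk_guid)
    # Percent-encode each segment so an unexpected character cannot alter the path.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return base_url.rstrip("/") + "/" + path
```

Calling `build_chunks_url("http://localhost:8000/", tenant, repo, document, cell)` yields the collection URL; passing a sixth argument appends the chunk GUID for the single-chunk endpoints.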

Structure

{
    "GUID": "de45fa91-35f0-4c71-829e-9f4a59a4cd92",
    "MD5Hash": "d41d8cd98f00b204e9800998ecf8427e",
    "SHA1Hash": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
    "SHA256Hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "Position": 1,
    "Start": 0,
    "End": 58,
    "Length": 58,
    "Content": "This is a semantic chunk representing a piece of text.",
    "Embeddings": [0.134, -0.092, 0.238, ...]
}

Properties:

  • GUID (string): unique identifier for the semantic chunk (may be auto-generated if not provided)
  • MD5Hash (string): MD5 hash of the chunk content
  • SHA1Hash (string): SHA1 hash of the chunk content
  • SHA256Hash (string): SHA256 hash of the chunk content
  • Position (number): index of the chunk in the document
  • Start (number): start character offset of the chunk in the source content
  • End (number): end character offset of the chunk in the source content
  • Length (number): total length of the chunk in characters
  • Content (string): textual content of the chunk
  • Embeddings (array<number>): array of floating-point numbers representing the semantic embedding of the content
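
As an illustration of how these properties relate to one another, the sketch below loads a chunk object into a small Python type and reports fields that disagree with the content. It rests on two assumptions the documentation does not state: that the hashes are lowercase hex digests of the UTF-8 encoded Content, and that Length equals End minus Start. The class and function names are hypothetical and not part of the View SDK. The hash values in the sample object above are the digests of empty input, so they appear to be placeholders and this check would flag them.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SemanticChunk:
    """Hypothetical client-side mirror of the chunk structure."""
    guid: str
    md5_hash: str
    sha1_hash: str
    sha256_hash: str
    position: int
    start: int
    end: int
    length: int
    content: str
    embeddings: List[float]

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SemanticChunk":
        # Keys follow the property names listed above.
        return cls(
            guid=data["GUID"],
            md5_hash=data["MD5Hash"],
            sha1_hash=data["SHA1Hash"],
            sha256_hash=data["SHA256Hash"],
            position=data["Position"],
            start=data["Start"],
            end=data["End"],
            length=data["Length"],
            content=data["Content"],
            embeddings=list(data.get("Embeddings") or []),
        )


def inconsistent_fields(chunk: SemanticChunk) -> List[str]:
    """Return the names of fields that do not match the chunk content."""
    raw = chunk.content.encode("utf-8")  # assumption: hashes cover UTF-8 bytes
    computed = {
        "MD5Hash": hashlib.md5(raw).hexdigest(),
        "SHA1Hash": hashlib.sha1(raw).hexdigest(),
        "SHA256Hash": hashlib.sha256(raw).hexdigest(),
    }
    stored = {
        "MD5Hash": chunk.md5_hash,
        "SHA1Hash": chunk.sha1_hash,
        "SHA256Hash": chunk.sha256_hash,
    }
    problems = [name for name in computed if computed[name] != stored[name].lower()]
    if chunk.end - chunk.start != chunk.length:  # assumption: Length = End - Start
        problems.append("Length")
    return problems
```

An empty list from `inconsistent_fields` means the stored hashes and offsets agree with the content under the assumptions above.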

Read semantic chunks

To read all semantic chunks in a semantic cell, call GET /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]/cells/[cell-guid]/chunks

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/00000000-0000-0000-0000-000000000000/cells/00000000-0000-0000-0000-000000000000/chunks' \
--header 'Authorization: ••••••'

import { ViewVectorProxySdk } from "view-sdk";

const api = new ViewVectorProxySdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const readSemanticChunks = async () => {
  try {
    const response = await api.SemanticChunk.readAll(
      "<vector-repository-guid>", // vector repository guid
      "<document-guid>", // document guid
      "<cell-guid>" // semantic cell guid
    );
    console.log(response, "Read semantic chunks response");
  } catch (err) {
    console.log("Error reading semantic chunks:", err);
  }
};

readSemanticChunks();

import view_sdk
from view_sdk import vector

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readSemanticChunks():
    # Retrieve every chunk that belongs to the given semantic cell
    response = vector.SemanticChunks.retrieve_all("<vector-repository-guid>", "<document-guid>", "<cell-guid>")
    print(response)

readSemanticChunks()

using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;

ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid   = Guid.Parse("<document-guid>");
Guid cellGuid = Guid.Parse("<cell-guid>");

List<SemanticChunk> chunks = await sdk.SemanticChunk.ReadMany(vectorRepoGuid, documentGuid, cellGuid);

Response

Returns an array of semantic chunk objects with complete structure, embeddings, and metadata.
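
Because each returned chunk carries its Embeddings, the array can be compared against a query vector on the client. The sketch below ranks chunk objects by cosine similarity in plain Python. It is an illustration only and does not describe how the View platform performs vector search; the function names are hypothetical, and it assumes each chunk is a dictionary with the "GUID" and "Embeddings" keys from the structure above.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_chunks(chunks: List[Dict[str, Any]],
                query_embedding: Sequence[float],
                top_k: int = 3) -> List[Tuple[float, Dict[str, Any]]]:
    """Return the top_k (score, chunk) pairs, most similar first."""
    scored = [
        (cosine_similarity(chunk["Embeddings"], query_embedding), chunk)
        for chunk in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

The query vector must come from the same embedding model as the stored chunks, otherwise the scores are not meaningful.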

Read semantic chunk

To read a single semantic chunk, call GET /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]/cells/[cell-guid]/chunks/[chunk-guid]

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/00000000-0000-0000-0000-000000000000/cells/00000000-0000-0000-0000-000000000000/chunks/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'

import { ViewVectorProxySdk } from "view-sdk";

const api = new ViewVectorProxySdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const readSemanticChunk = async () => {
  try {
    const response = await api.SemanticChunk.read(
      "<vector-repository-guid>", // vector repository guid
      "<document-guid>", // document guid
      "<cell-guid>", // semantic cell guid
      "<chunk-guid>" // semantic chunk guid
    );
    console.log(response, "Read semantic chunk response");
  } catch (err) {
    console.log("Error reading semantic chunk:", err);
  }
};

readSemanticChunk();

import view_sdk
from view_sdk import vector

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readSemanticChunk():
    # Retrieve a single chunk by its GUID
    response = vector.SemanticChunks.retrieve("<vector-repository-guid>", "<document-guid>", "<cell-guid>", "<chunk-guid>")
    print(response)

readSemanticChunk()

using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;

ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid   = Guid.Parse("<document-guid>");
Guid cellGuid = Guid.Parse("<cell-guid>");
Guid chunkGuid = Guid.Parse("<chunk-guid>");

SemanticChunk chunk = await sdk.SemanticChunk.Read(vectorRepoGuid, documentGuid, cellGuid, chunkGuid);

Response

Returns the semantic chunk object with complete structure, embeddings, and metadata if found, or a 404 Not Found error if the chunk doesn't exist.
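
A caller usually wants to tell "chunk not found" apart from other failures. The sketch below uses only the Python standard library and returns None on 404 while re-raising any other HTTP error. The function name `read_chunk` and the injectable `opener` parameter are hypothetical conveniences, not part of the View SDK, and the Authorization value is passed through unchanged because its format is not shown in this document.

```python
import json
import urllib.error
import urllib.request


def read_chunk(url, authorization, opener=urllib.request.urlopen):
    """GET a single semantic chunk; return the parsed object, or None on 404.

    `opener` defaults to urllib's urlopen and can be swapped out in tests.
    """
    request = urllib.request.Request(
        url,
        headers={"Authorization": authorization},
        method="GET",
    )
    try:
        with opener(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # the chunk does not exist
        raise  # any other status is a genuine failure
```

Returning None for a missing chunk keeps the calling code free of exception handling for the expected "not found" case.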

Semantic chunk exists

To check whether a semantic chunk exists, call HEAD /v1.0/tenants/[tenant-guid]/vectorrepositories/[vector-repository-guid]/documents/[document-guid]/cells/[cell-guid]/chunks/[chunk-guid]

curl --location --head 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/vectorrepositories/00000000-0000-0000-0000-000000000000/documents/00000000-0000-0000-0000-000000000000/cells/00000000-0000-0000-0000-000000000000/chunks/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'

import { ViewVectorProxySdk } from "view-sdk";

const api = new ViewVectorProxySdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const existSemanticChunk = async () => {
  try {
    const response = await api.SemanticChunk.exists(
      "<vector-repository-guid>", // vector repository guid
      "<document-guid>", // document guid
      "<cell-guid>", // semantic cell guid
      "<chunk-guid>" // semantic chunk guid
    );
    console.log(response, "Semantic chunk exists response");
  } catch (err) {
    console.log("Error semantic chunk exists:", err);
  }
};
existSemanticChunk();

import view_sdk
from view_sdk import vector

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def existsSemanticChunk():
    # Check whether the chunk exists without retrieving its content
    response = vector.SemanticChunks.exists("<vector-repository-guid>", "<document-guid>", "<cell-guid>", "<chunk-guid>")
    print(response)

existsSemanticChunk()

using View.Sdk;
using View.Sdk.Vector;
using View.Sdk.Embeddings;
using View.Sdk.Semantic;

ViewVectorSdk sdk = new ViewVectorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

Guid vectorRepoGuid = Guid.Parse("<vector-repository-guid>");
Guid documentGuid   = Guid.Parse("<document-guid>");
Guid cellGuid = Guid.Parse("<cell-guid>");
Guid chunkGuid = Guid.Parse("<chunk-guid>");

bool exists = await sdk.SemanticChunk.Exists(vectorRepoGuid, documentGuid, cellGuid, chunkGuid);

Response

  • 200 OK: Semantic chunk exists
  • 404 Not Found: Semantic chunk does not exist

Note: HEAD requests do not return a response body, only the HTTP status code indicating whether the semantic chunk exists.
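
Since the answer is carried entirely by the status code, a client can reduce the HEAD call to a boolean. The following standard-library sketch maps 200 to True and 404 to False, and treats any other status as an error. The function names and the injectable `opener` parameter are hypothetical, and the Authorization value is passed through as given because its format is not shown in this document.

```python
import urllib.error
import urllib.request


def status_to_exists(status):
    """Translate the documented HEAD status codes into a boolean."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status code: {status}")


def chunk_exists(url, authorization, opener=urllib.request.urlopen):
    """Send a HEAD request and report whether the semantic chunk exists."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": authorization},
        method="HEAD",
    )
    try:
        with opener(request) as response:
            return status_to_exists(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx, so 404 arrives here rather than above.
        return status_to_exists(err.code)
```

Raising on unexpected statuses avoids reporting a chunk as missing when the real problem is, for example, a rejected credential.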