Embeddings Rules

Comprehensive guide to View's embeddings rule management system, including configuration of AI-powered text processing, chunking, and vector generation workflows.

Overview

Embeddings rules define how text content is processed, chunked, tokenized, and converted into vector embeddings for storage and retrieval within the View platform. Each rule ties a source bucket and a content type filter to the processing, chunking, and embeddings services, and to the graph and vector repositories that receive the results.

Key Features

  • AI-Powered Processing: Configure advanced text processing using HuggingFace models and LCProxy generators
  • Intelligent Chunking: Define content chunking strategies with configurable size limits and processing parameters
  • Vector Generation: Set up automated vector embedding generation with batch processing and retry logic
  • Multi-Repository Storage: Store processed data in both graph and vector repositories
  • Content Type Filtering: Apply rules based on specific content types or use wildcards for broad matching
  • Scalable Processing: Configure parallel processing with configurable task limits and batch sizes
  • Error Handling: Built-in retry mechanisms and failure tolerance for robust processing

Supported Operations

  • Create: Define new embeddings processing rules with comprehensive configuration
  • Read: Retrieve embeddings rule configurations and metadata
  • Enumerate: List all embeddings rules with pagination support
  • Update: Modify existing embeddings rule configurations
  • Delete: Remove embeddings rules from the system
  • Existence Check: Verify embeddings rule presence without retrieving details

API Endpoints

Embeddings rules are managed via the Configuration server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/embeddingsrules

Supported HTTP Methods: GET, HEAD, PUT, DELETE
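
As a minimal sketch of how the URL template above expands, the helper below builds the collection URL and, when a rule GUID is supplied, the URL of a single rule. The function name `embeddings_rules_url` and its parameters are illustrative assumptions, not part of the View SDK; only the path layout is taken from the template.

```python
from typing import Optional
from urllib.parse import quote


def embeddings_rules_url(
    hostname: str,
    port: int,
    tenant_guid: str,
    rule_guid: Optional[str] = None,
    scheme: str = "http",
    version: str = "v1.0",
) -> str:
    """Build an embeddings rules URL from the documented template.

    Hypothetical helper for illustration only; not part of the View SDK.
    """
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    # Percent-encode the path segments so unexpected characters cannot alter the path.
    tenant = quote(tenant_guid, safe="")
    url = f"{scheme}://{hostname}:{port}/{version}/tenants/{tenant}/embeddingsrules"
    if rule_guid is not None:
        url += "/" + quote(rule_guid, safe="")
    return url


tenant_guid = "00000000-0000-0000-0000-000000000000"
print(embeddings_rules_url("localhost", 8000, tenant_guid))
print(embeddings_rules_url("localhost", 8000, tenant_guid, rule_guid=tenant_guid))
```

The `version` parameter exists because the enumeration endpoint in this document is served under /v2.0, while the other operations use /v1.0.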

Embeddings Rule Object Structure

Embeddings rule objects contain comprehensive configuration for AI-powered text processing workflows. Here's the complete structure:

{
    "GUID": "embeddings-rule",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "My embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "example-graph-repository",
    "VectorRepositoryGUID": "example-vector-repository",
    "ProcessingEndpoint": "http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
    "ProcessingAccessKey": "***ault",
    "ChunkingServerUrl": "http://nginx-chunker:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/chunking",
    "ChunkingServerApiKey": "***ault",
    "MaxChunkingTasks": 16,
    "MinChunkContentLength": 1,
    "MaxChunkContentLength": 1,
    "MaxTokensPerChunk": 1,
    "TokenOverlap": 32,
    "TokenizationModel": "sentence-transformers/all-MiniLM-L6-v2",
    "HuggingFaceApiKey": "***ault",
    "EmbeddingsServerUrl": "http://nginx-embeddings:8000/",
    "EmbeddingsServerApiKey": "***ault",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://nginx-lcproxy:8000/",
    "EmbeddingsGeneratorApiKey": "***ault",
    "EmbeddingsBatchSize": 16,
    "MaxEmbeddingsTasks": 16,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://nginx-vector:8311/",
    "VectorStoreAccessKey": "***ault",
    "MaxContentLength": 16777216,
    "CreatedUtc": "2024-07-10T05:09:32.000000Z"
}

Field Descriptions

  • GUID (GUID): Globally unique identifier for the embeddings rule object
  • TenantGUID (GUID): Globally unique identifier for the tenant
  • BucketGUID (GUID): GUID of the source bucket containing data to process
  • OwnerGUID (GUID): GUID of the user who owns this embeddings rule
  • Name (string): Display name of the embeddings rule
  • ContentType (string): Content type filter (use "*" for all types)
  • CreatedUtc (datetime): UTC timestamp when the rule was created
  • GraphRepositoryGUID (GUID): GUID of the graph repository for storing processed data
  • VectorRepositoryGUID (GUID): GUID of the vector repository for storing embeddings
  • ProcessingEndpoint (string): URL of the data processing endpoint
  • ProcessingAccessKey (string): Access key for the processing endpoint
  • ChunkingServerUrl (string): URL of the text chunking server
  • ChunkingServerApiKey (string): Access key for the chunking server
  • MaxChunkingTasks (integer): Maximum number of parallel chunking tasks
  • MinChunkContentLength (integer): Minimum length for content chunks
  • MaxChunkContentLength (integer): Maximum length for content chunks
  • MaxTokensPerChunk (integer): Maximum tokens per chunk for processing
  • TokenOverlap (integer): Number of overlapping tokens between consecutive chunks
  • TokenizationModel (string): HuggingFace model for text tokenization
  • HuggingFaceApiKey (string): API key for HuggingFace model access
  • EmbeddingsServerUrl (string): URL of the embeddings generation server
  • EmbeddingsServerApiKey (string): Access key for the embeddings server
  • EmbeddingsGenerator (enum): Generator type (currently only "LCProxy")
  • EmbeddingsGeneratorUrl (string): URL of the embeddings generator service
  • EmbeddingsGeneratorApiKey (string): API key for the embeddings generator
  • EmbeddingsBatchSize (integer): Number of chunks to process per batch
  • MaxEmbeddingsTasks (integer): Maximum parallel embeddings generation tasks
  • MaxEmbeddingsRetries (integer): Maximum retry attempts for failed batches
  • MaxEmbeddingsFailures (integer): Maximum failures before job termination
  • VectorStoreUrl (string): URL of the vector storage server
  • VectorStoreAccessKey (string): Access key for vector storage
  • MaxContentLength (integer): Maximum content length to process (in bytes)
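
To make the relationship between MaxTokensPerChunk and TokenOverlap concrete, here is a rough sliding-window sketch. It assumes that each chunk holds at most MaxTokensPerChunk tokens and that the last TokenOverlap tokens of one chunk are repeated at the start of the next; the actual chunking server may behave differently, and `chunk_tokens` is a hypothetical name used only for illustration.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[str],
    max_tokens_per_chunk: int,
    token_overlap: int,
) -> List[List[str]]:
    """Split tokens into windows that share token_overlap tokens with their neighbor.

    Illustrative only; this is an assumed reading of the two settings,
    not the chunking server's implementation.
    """
    if max_tokens_per_chunk <= 0:
        raise ValueError("max_tokens_per_chunk must be positive")
    if not 0 <= token_overlap < max_tokens_per_chunk:
        raise ValueError("token_overlap must be >= 0 and smaller than max_tokens_per_chunk")

    step = max_tokens_per_chunk - token_overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_tokens_per_chunk]))
        if start + max_tokens_per_chunk >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


for chunk in chunk_tokens("a b c d e f g h i j".split(), max_tokens_per_chunk=4, token_overlap=1):
    print(chunk)
```

Under this reading the overlap has to be smaller than the chunk size, otherwise the window would never advance.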

Create Embeddings Rule

Creates a new embeddings rule with comprehensive AI processing configuration using PUT /v1.0/tenants/[tenant-guid]/embeddingsrules. This endpoint allows you to define complete text processing workflows including chunking, tokenization, and vector generation.

Request Parameters

Required Parameters

  • BucketGUID (string, Body, Required): GUID of the source bucket containing data to process
  • OwnerGUID (string, Body, Required): GUID of the user who owns this embeddings rule
  • Name (string, Body, Required): Display name for the embeddings rule
  • ContentType (string, Body, Required): Content type filter (use "*" for all types)
  • GraphRepositoryGUID (string, Body, Required): GUID of the graph repository for storing processed data
  • VectorRepositoryGUID (string, Body, Required): GUID of the vector repository for storing embeddings
  • EmbeddingsGenerator (string, Body, Required): Generator type (currently only "LCProxy")

Processing Configuration

  • ProcessingEndpoint (string, Body, Required): URL of the data processing endpoint
  • ProcessingAccessKey (string, Body, Required): Access key for the processing endpoint

Chunking Configuration

  • ChunkingServerUrl (string, Body, Optional): URL of the text chunking server
  • ChunkingServerApiKey (string, Body, Optional): Access key for the chunking server
  • MaxChunkingTasks (integer, Body, Optional): Maximum number of parallel chunking tasks
  • MinChunkContentLength (integer, Body, Optional): Minimum length for content chunks
  • MaxChunkContentLength (integer, Body, Optional): Maximum length for content chunks

AI Model Configuration

  • TokenizationModel (string, Body, Optional): HuggingFace model for text tokenization
  • HuggingFaceApiKey (string, Body, Optional): API key for HuggingFace model access

Embeddings Generation

  • EmbeddingsGeneratorUrl (string, Body, Required): URL of the embeddings generator service
  • EmbeddingsGeneratorApiKey (string, Body, Optional): API key for the embeddings generator
  • EmbeddingsBatchSize (integer, Body, Optional): Number of chunks to process per batch
  • MaxEmbeddingsTasks (integer, Body, Optional): Maximum parallel embeddings generation tasks
  • MaxTokensPerChunk (integer, Body, Optional): Maximum tokens per chunk for processing
  • MaxEmbeddingsRetries (integer, Body, Optional): Maximum retry attempts for failed batches
  • MaxEmbeddingsFailures (integer, Body, Optional): Maximum failures before job termination
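
The batch, retry, and failure settings above can be pictured with the following sketch. It assumes that a batch is attempted once plus up to MaxEmbeddingsRetries retries, that a batch which still fails counts as one failure, and that the job stops once MaxEmbeddingsFailures failures have accumulated. These semantics are an assumption made for illustration, and `embed_in_batches` and `embed_batch` are hypothetical names rather than View APIs.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 16,    # plays the role of EmbeddingsBatchSize
    max_retries: int = 3,    # plays the role of MaxEmbeddingsRetries
    max_failures: int = 3,   # plays the role of MaxEmbeddingsFailures
) -> List[Optional[Vector]]:
    """Embed chunks batch by batch; chunks of a failed batch map to None.

    Assumed semantics for illustration, not the server's actual algorithm.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    results: List[Optional[Vector]] = []
    failed_batches = 0
    for start in range(0, len(chunks), batch_size):
        batch = list(chunks[start:start + batch_size])
        for attempt in range(max_retries + 1):
            try:
                results.extend(embed_batch(batch))
                break
            except Exception as exc:
                if attempt < max_retries:
                    continue  # retry the same batch
                failed_batches += 1
                if failed_batches >= max_failures:
                    raise RuntimeError("failure limit reached; stopping the job") from exc
                results.extend([None] * len(batch))  # keep positions aligned with chunks
    return results


def fake_embedder(batch: List[str]) -> List[Vector]:
    # Stand-in for a real embeddings generator call.
    return [[float(len(text))] for text in batch]


print(embed_in_batches(["alpha", "beta", "gamma"], fake_embedder, batch_size=2))
```

Returning None for the chunks of a failed batch keeps the result list aligned with the input, which makes it easy to see which chunks still need embeddings.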

Vector Storage

  • VectorStoreUrl (string, Body, Required): URL of the vector storage server
  • VectorStoreAccessKey (string, Body, Required): Access key for vector storage
  • MaxContentLength (integer, Body, Optional): Maximum content length to process (in bytes)
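
Before sending a create request, it can help to check the body against the parameters marked Required above. The sketch below does only that; `REQUIRED_FIELDS` and `missing_required_fields` are hypothetical names, and whether the server enforces each of these fields is not confirmed here.

```python
from typing import Any, Dict, List

# Fields marked "Required" in the parameter lists above.
REQUIRED_FIELDS = (
    "BucketGUID",
    "OwnerGUID",
    "Name",
    "ContentType",
    "GraphRepositoryGUID",
    "VectorRepositoryGUID",
    "EmbeddingsGenerator",
    "ProcessingEndpoint",
    "ProcessingAccessKey",
    "EmbeddingsGeneratorUrl",
    "VectorStoreUrl",
    "VectorStoreAccessKey",
)


def missing_required_fields(rule: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent, None, or an empty string."""
    return [name for name in REQUIRED_FIELDS if rule.get(name) in (None, "")]


draft = {
    "BucketGUID": "00000000-0000-0000-0000-000000000000",
    "Name": "Embeddings rule",
    "ContentType": "*",
    "EmbeddingsGenerator": "LCProxy",
}
print("Missing:", missing_required_fields(draft))
```
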
curl --location --request PUT 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules' \
--header 'content-type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "BucketGUID": "00000000-0000-0000-0000-000000000000",
    "Name": "Embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "ProcessingEndpoint": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
    "ProcessingAccessKey": "default",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
    "TokenizationModel": "sentence-transformers/all-MiniLM-L6-v2",
    "HuggingFaceApiKey": "default",
    "EmbeddingsGeneratorApiKey": "default",
    "EmbeddingsBatchSize": 512,
    "MaxEmbeddingsTasks": 32,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://localhost:8000/",
    "VectorStoreAccessKey": "default",
    "MaxContentLength": 16777216
}'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const createEmbeddingRules = async () => {
  try {
    const response = await api.EmbeddingRule.create({
      BucketGUID: "<bucket-guid>",
      Name: "Embeddings rule test ash",
      ContentType: "*",
      GraphRepositoryGUID: "<graph-repository-guid>",
      VectorRepositoryGUID: "<vector-repository-guid>",
      ProcessingEndpoint:
        "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
      ProcessingAccessKey: "default",
      EmbeddingsGenerator: "LCProxy",
      EmbeddingsGeneratorUrl:
        "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
      EmbeddingsGeneratorApiKey: "",
      EmbeddingsBatchSize: 512,
      MaxEmbeddingsTasks: 32,
      MaxEmbeddingsRetries: 3,
      MaxEmbeddingsFailures: 3,
      VectorStoreUrl: "http://localhost:8000/",
      VectorStoreAccessKey: "default",
      MaxContentLength: 16777216,
    });
    console.log(response, "Embedding rule created successfully");
  } catch (err) {
    console.log("Error creating Embedding rule:", err);
  }
};

createEmbeddingRules();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def createEmbeddingRules():
    embeddingRules = configuration.EmbeddingsRule.create(
        BucketGUID="00000000-0000-0000-0000-000000000000",
        Name="Embeddings rule",
        ContentType="*",
        GraphRepositoryGUID="00000000-0000-0000-0000-000000000000",
        VectorRepositoryGUID="00000000-0000-0000-0000-000000000000",
        ProcessingEndpoint="http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
        ProcessingAccessKey="default",
        EmbeddingsGenerator="LCProxy",
        EmbeddingsGeneratorUrl="http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
        EmbeddingsGeneratorApiKey="",
        EmbeddingsBatchSize=512,
        MaxEmbeddingsTasks=32,
        MaxEmbeddingsRetries=3,
        MaxEmbeddingsFailures=3,
        VectorStoreUrl="http://localhost:8000/",
        VectorStoreAccessKey="default",
        MaxContentLength=16777216,
    )
    print(embeddingRules)


createEmbeddingRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
            
EmbeddingsRule request = new EmbeddingsRule
{
   BucketGUID = Guid.Parse("<bucket-guid>"),
   Name = "Embeddings rule test ash",
   ContentType = "*",
   GraphRepositoryGUID = Guid.Parse("<graph-repository-guid>"),
   VectorRepositoryGUID = Guid.Parse("<vector-repository-guid>"),
   ProcessingEndpoint = "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
   ProcessingAccessKey = "default",
   EmbeddingsGenerator = EmbeddingsGeneratorEnum.LCProxy,
   EmbeddingsGeneratorUrl = "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
   EmbeddingsGeneratorApiKey = "",
   BatchSize = 512,
   MaxGeneratorTasks = 32,
   MaxRetries = 3,
   MaxFailures = 3,
   VectorStoreUrl = "http://localhost:8000/",
   VectorStoreAccessKey = "default",
   MaxContentLength = 16777216,
};

EmbeddingsRule response = await sdk.EmbeddingsRule.Create(request);

Response

Returns the created embeddings rule object with all configuration details:

{
    "GUID": "6046481e-5682-462a-a1a2-a7e1e242b1ff",
    "TenantGUID": "00000000-0000-0000-0000-000000000000",
    "BucketGUID": "00000000-0000-0000-0000-000000000000",
    "OwnerGUID": "00000000-0000-0000-0000-000000000000",
    "Name": "Embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "ProcessingEndpoint": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
    "ProcessingAccessKey": "***ault",
    "ChunkingServerUrl": "http://nginx-chunker:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/chunking",
    "ChunkingServerApiKey": "***ault",
    "MaxChunkingTasks": 16,
    "MinChunkContentLength": 1,
    "MaxChunkContentLength": 1,
    "MaxTokensPerChunk": 1,
    "TokenOverlap": 32,
    "TokenizationModel": "sentence-transformers/all-MiniLM-L6-v2",
    "HuggingFaceApiKey": "***ault",
    "EmbeddingsServerUrl": "http://nginx-embeddings:8000/",
    "EmbeddingsServerApiKey": "***ault",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://nginx-lcproxy:8000/",
    "EmbeddingsGeneratorApiKey": "***ault",
    "EmbeddingsBatchSize": 16,
    "MaxEmbeddingsTasks": 16,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://nginx-vector:8311/",
    "VectorStoreAccessKey": "***ault",
    "MaxContentLength": 16777216,
    "CreatedUtc": "2024-10-21T14:33:55.000000Z"
}

Enumerate Embeddings Rules

Retrieves a paginated list of all embeddings rule objects in the tenant using GET /v2.0/tenants/[tenant-guid]/embeddingsrules. This endpoint provides comprehensive enumeration with pagination support for managing multiple embeddings rules.

Response Structure

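The enumeration response includes pagination metadata (MaxResults, IterationsRequired, EndOfResults, RecordsRemaining, and ContinuationToken) alongside the embeddings rule objects, which are returned in the Objects array.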

curl --location 'http://localhost:8000/v2.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/' \
--header 'Authorization: ••••••'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);


const enumerateEmbeddingRules = async () => {
  try {
    const response = await api.EmbeddingRule.enumerate();
    console.log(response, "Embedding rules fetched successfully");
  } catch (err) {
    console.log("Error fetching Embedding rules:", err);
  }
};

enumerateEmbeddingRules();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def enumerateEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.enumerate()
    print(embeddingsRules)

enumerateEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
            
EnumerationResult<EmbeddingsRule> response = await sdk.EmbeddingsRule.Enumerate();

Response

Returns a paginated list of embeddings rule objects:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-21T02:36:37.677751Z",
        "TotalMs": 23.58,
        "Messages": {}
    },
    "MaxResults": 10,
    "IterationsRequired": 1,
    "EndOfResults": true,
    "RecordsRemaining": 0,
    "Objects": [
        {
            "GUID": "6046481e-5682-462a-a1a2-a7e1e242b1ff",
            "TenantGUID": "00000000-0000-0000-0000-000000000000",
            "BucketGUID": "00000000-0000-0000-0000-000000000000",
            "OwnerGUID": "00000000-0000-0000-0000-000000000000",
            "Name": "Embeddings rule",
            "ContentType": "*",
            "GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
            "VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
            "ProcessingEndpoint": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
            "ProcessingAccessKey": "***ault",
            "MaxTokensPerChunk": 1,
            "TokenOverlap": 32,
            "EmbeddingsGenerator": "LCProxy",
            "EmbeddingsGeneratorUrl": "http://nginx-lcproxy:8000/",
            "EmbeddingsGeneratorApiKey": "***ault",
            "EmbeddingsBatchSize": 16,
            "MaxEmbeddingsTasks": 16,
            "MaxEmbeddingsRetries": 3,
            "MaxEmbeddingsFailures": 3,
            "VectorStoreUrl": "http://nginx-vector:8311/",
            "VectorStoreAccessKey": "***ault",
            "MaxContentLength": 16777216,
            "CreatedUtc": "2024-10-21T14:33:55.000000Z"
        }
    ],
    "ContinuationToken": null
}
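
A client that needs every rule can keep requesting pages until EndOfResults is true, passing along the ContinuationToken from the previous page. The sketch below shows that loop in isolation. How the token is sent back to the server is not shown in this document, so the loop takes a hypothetical `fetch_page(token)` callable that hides that detail; `iter_all_rules` is likewise an illustrative name.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_all_rules(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every embeddings rule across all pages of an enumeration.

    fetch_page is a hypothetical callable that returns one parsed
    enumeration response for the given continuation token (None for
    the first page).
    """
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("Objects") or []
        token = page.get("ContinuationToken")
        # Stop when the server reports the end, or when there is no token to continue with.
        if page.get("EndOfResults", True) or not token:
            break


# Canned pages standing in for real HTTP responses.
fake_pages = {
    None: {"Objects": [{"GUID": "rule-1"}], "EndOfResults": False, "ContinuationToken": "page-2"},
    "page-2": {"Objects": [{"GUID": "rule-2"}], "EndOfResults": True, "ContinuationToken": None},
}

for rule in iter_all_rules(lambda token: fake_pages[token]):
    print(rule["GUID"])
```
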

Read Embeddings Rule

Retrieves an embeddings rule by GUID using GET /v1.0/tenants/[tenant-guid]/embeddingsrules/[embeddingsrule-guid]. The response contains the complete rule configuration, including all processing parameters. If the rule does not exist, the server returns a 404 error.

Request Parameters

  • embeddingsrule-guid (string, Path, Required): GUID of the embeddings rule object to retrieve
curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'

import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const readEmbeddingRule = async () => {
  try {
    const response = await api.EmbeddingRule.read(
      "<embeddingsrule-guid>"
    );
    console.log(response, "Embedding rule fetched successfully");
  } catch (err) {
    console.log("Error fetching Embedding rule:", err);
  }
};

readEmbeddingRule();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def readEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.retrieve("<embeddingsrule-guid>")
    print(embeddingsRules)

readEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
            
EmbeddingsRule response = await sdk.EmbeddingsRule.Retrieve(Guid.Parse("<embeddingsrule-guid>"));

Response

Returns the complete embeddings rule configuration:

{
    "GUID": "59f7c18d-7342-4aef-9308-86083459dd81",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "Embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "example-graph-repository",
    "VectorRepositoryGUID": "example-vector-repository",
    "ProcessingEndpoint": "http://localhost:8501/processor",
    "ProcessingAccessKey": "***ault",
    "ChunkingServerUrl": "http://nginx-chunker:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/chunking",
    "ChunkingServerApiKey": "***ault",
    "MaxChunkingTasks": 16,
    "MinChunkContentLength": 1,
    "MaxChunkContentLength": 1,
    "MaxTokensPerChunk": 1,
    "TokenOverlap": 32,
    "TokenizationModel": "sentence-transformers/all-MiniLM-L6-v2",
    "HuggingFaceApiKey": "***ault",
    "EmbeddingsServerUrl": "http://nginx-embeddings:8000/",
    "EmbeddingsServerApiKey": "***ault",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://localhost:8301/",
    "EmbeddingsGeneratorApiKey": "***ault",
    "EmbeddingsBatchSize": 16,
    "MaxEmbeddingsTasks": 16,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://localhost:8311/",
    "VectorStoreAccessKey": "***ault",
    "MaxContentLength": 16777216,
    "CreatedUtc": "2024-10-21T15:19:09.000000Z"
}

Note: the HEAD method can be used as an alternative to GET to check whether the object exists. A HEAD request returns 200/OK if the object exists or 404/Not Found if it does not. No response body is returned with a HEAD request.
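
The existence check can be sketched with Python's standard library alone. The example assumes the Authorization header carries a bearer token, which this document does not confirm because the header value is masked in the curl examples; `build_head_request` and `rule_exists` are hypothetical helper names.

```python
import urllib.error
import urllib.request


def build_head_request(rule_url: str, token: str) -> urllib.request.Request:
    """Create a HEAD request for one embeddings rule URL.

    The "Bearer" scheme is an assumption; adjust it to match your deployment.
    """
    return urllib.request.Request(
        rule_url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )


def rule_exists(rule_url: str, token: str, timeout: float = 10.0) -> bool:
    """Return True on 200, False on 404, and re-raise any other HTTP error."""
    try:
        with urllib.request.urlopen(build_head_request(rule_url, token), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


# Example call (requires a reachable server, so it is left commented out):
# rule_exists(
#     "http://localhost:8000/v1.0/tenants/<tenant-guid>/embeddingsrules/<embeddingsrule-guid>",
#     "<access-key>",
# )
```
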

Read All Embeddings Rules

Retrieves all embeddings rule objects in the tenant using GET /v1.0/tenants/[tenant-guid]/embeddingsrules/. Returns an array of embeddings rule objects with complete configuration details for all rules in the tenant.

Request Parameters

No additional parameters required beyond authentication.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/' \
--header 'Authorization: ••••••'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const readAllEmbeddingRules = async () => {
  try {
    const response = await api.EmbeddingRule.readAll();
    console.log(response, "All embedding rules fetched successfully");
  } catch (err) {
    console.log("Error fetching Embedding rules:", err);
  }
};
readAllEmbeddingRules();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def readAllEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.retrieve_all()
    print(embeddingsRules)

readAllEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
            
List<EmbeddingsRule> response = await sdk.EmbeddingsRule.RetrieveMany();

Response

Returns an array of all embeddings rule objects:

[
    {
        "GUID": "59f7c18d-7342-4aef-9308-86083459dd81",
        "TenantGUID": "default",
        "BucketGUID": "example-data-bucket",
        "OwnerGUID": "default",
        "Name": "Embeddings rule",
        "ContentType": "*",
        "GraphRepositoryGUID": "example-graph-repository",
        "VectorRepositoryGUID": "example-vector-repository",
        "ProcessingEndpoint": "http://localhost:8501/processor",
        "ProcessingAccessKey": "***ault",
        "MaxTokensPerChunk": 1,
        "TokenOverlap": 32,
        "EmbeddingsGenerator": "LCProxy",
        "EmbeddingsGeneratorUrl": "http://localhost:8301/",
        "EmbeddingsGeneratorApiKey": "***ault",
        "EmbeddingsBatchSize": 16,
        "MaxEmbeddingsTasks": 16,
        "MaxEmbeddingsRetries": 3,
        "MaxEmbeddingsFailures": 3,
        "VectorStoreUrl": "http://localhost:8311/",
        "VectorStoreAccessKey": "***ault",
        "MaxContentLength": 16777216,
        "CreatedUtc": "2024-10-21T15:19:09.000000Z"
    },
    {
        "GUID": "another-embeddings-rule-guid",
        "TenantGUID": "default",
        "BucketGUID": "another-data-bucket",
        "OwnerGUID": "default",
        "Name": "Another embeddings rule",
        "ContentType": "text/plain",
        "GraphRepositoryGUID": "another-graph-repository",
        "VectorRepositoryGUID": "another-vector-repository",
        "ProcessingEndpoint": "http://localhost:8501/processor",
        "ProcessingAccessKey": "***ault",
        "MaxTokensPerChunk": 256,
        "TokenOverlap": 32,
        "EmbeddingsGenerator": "LCProxy",
        "EmbeddingsGeneratorUrl": "http://localhost:8301/",
        "EmbeddingsGeneratorApiKey": "***ault",
        "EmbeddingsBatchSize": 32,
        "MaxEmbeddingsTasks": 8,
        "MaxEmbeddingsRetries": 5,
        "MaxEmbeddingsFailures": 2,
        "VectorStoreUrl": "http://localhost:8311/",
        "VectorStoreAccessKey": "***ault",
        "MaxContentLength": 8388608,
        "CreatedUtc": "2024-10-21T16:45:30.000000Z"
    }
]

Update Embeddings Rule

Updates an existing embeddings rule configuration using PUT /v1.0/tenants/[tenant-guid]/embeddingsrules/[embeddingsrule-guid]. This endpoint allows you to modify embeddings rule parameters while preserving certain immutable fields.

Request Parameters

  • embeddingsrule-guid (string, Path, Required): GUID of the embeddings rule object to update

Updateable Fields

All configuration parameters can be updated except for:

  • GUID: Immutable identifier
  • TenantGUID: Immutable tenant association
  • CreatedUtc: Immutable creation timestamp

Important Notes

  • Field Preservation: GUID, TenantGUID, and CreatedUtc cannot be modified; their existing values are preserved across updates
  • Complete Object: Provide a fully populated object in the request body
  • Validation: All updated parameters will be validated before applying changes
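
Because the update expects a fully populated object, one workable pattern is to read the current rule, overlay the intended changes, and send the merged result. The sketch below covers only the merge step and rejects changes to the immutable fields listed above; `build_update_body` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Dict

IMMUTABLE_FIELDS = ("GUID", "TenantGUID", "CreatedUtc")


def build_update_body(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a full update body: the current rule with changes applied.

    Raises ValueError if the changes try to alter an immutable field.
    """
    blocked = sorted(
        name for name in changes
        if name in IMMUTABLE_FIELDS and changes[name] != current.get(name)
    )
    if blocked:
        raise ValueError("immutable fields cannot be changed: " + ", ".join(blocked))
    body = dict(current)  # copy, so the caller's object is left untouched
    body.update(changes)
    return body


current_rule = {
    "GUID": "59f7c18d-7342-4aef-9308-86083459dd81",
    "TenantGUID": "default",
    "Name": "Embeddings rule",
    "EmbeddingsBatchSize": 16,
    "CreatedUtc": "2024-10-21T15:19:09.000000Z",
}
print(build_update_body(current_rule, {"Name": "My updated embeddings rule"}))
```

Access keys appear masked in the response examples in this document (for example "***ault"), so a body built from a previously read rule would probably need the real key values supplied again before it is sent. That is an inference from the examples, not confirmed behavior.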

Request body:

curl --location --request PUT 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/00000000-0000-0000-0000-000000000000' \
--header 'content-type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "BucketGUID": "00000000-0000-0000-0000-000000000000",
    "Name": "An updated embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
    "ProcessingEndpoint": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
    "ProcessingAccessKey": "default",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
    "EmbeddingsGeneratorApiKey": "",
    "EmbeddingsBatchSize": 512,
    "MaxEmbeddingsTasks": 32,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://localhost:8000/",
    "VectorStoreAccessKey": "default",
    "MaxContentLength": 16777216
}'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const updateEmbeddingRule = async () => {
  try {
    const response = await api.EmbeddingRule.update({
      GUID: "<embeddingsrule-guid>",
      TenantGUID: "<tenant-guid>",
      BucketGUID: "<bucket-guid>",
      OwnerGUID: "<owner-guid>",
      Name: "Embeddings rule test updated",
      ContentType: "*",
      GraphRepositoryGUID: "<graph-repository-guid>",
      VectorRepositoryGUID: "<vector-repository-guid>",
      ProcessingEndpoint:
        "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
      ProcessingAccessKey: "***ault",
      EmbeddingsServerUrl:
        "http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
      EmbeddingsServerApiKey: "***ault",
      EmbeddingsGenerator: "LCProxy",
      EmbeddingsGeneratorUrl:
        "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
      EmbeddingsGeneratorApiKey: "",
      EmbeddingsBatchSize: 512,
      MaxEmbeddingsTasks: 32,
      MaxEmbeddingsRetries: 3,
      MaxEmbeddingsFailures: 3,
      VectorStoreUrl: "http://localhost:8000/",
      VectorStoreAccessKey: "***ault",
      MaxContentLength: 16777216,
      CreatedUtc: "2025-03-26T09:37:15.386Z",
    });
    console.log(response, "Embedding rule updated successfully");
  } catch (err) {
    console.log("Error updating Embedding rule:", err);
  }
};

updateEmbeddingRule();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def updateEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.update(
        "<embeddingsrule-guid>",
        BucketGUID="<bucket-guid>",
        Name="Embeddings rule [updated]",
        ContentType="*",
        GraphRepositoryGUID="<graph-repository-guid>",
        VectorRepositoryGUID="<vector-repository-guid>",
        ProcessingEndpoint="http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
        ProcessingAccessKey="default",
        EmbeddingsGenerator="LCProxy",
        EmbeddingsGeneratorUrl="http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
        HuggingFaceApiKey="",
        EmbeddingsBatchSize=512,
        MaxEmbeddingsTasks=32,
        MaxEmbeddingsRetries=3,
        MaxEmbeddingsFailures=3,
        VectorStoreUrl="http://localhost:8000/",
        VectorStoreAccessKey="default",
        MaxContentLength=16777216,
    )
    print(embeddingsRules)

updateEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
            
var request = new EmbeddingsRule
{
     GUID = Guid.Parse("<embeddingsrule-guid>"),
     TenantGUID = Guid.Parse("<tenant-guid>"),
     BucketGUID = Guid.Parse("<bucket-guid>"),
     OwnerGUID = Guid.Parse("<owner-guid>"),
     Name = "Embeddings rule test updated",
     ContentType = "*",
     GraphRepositoryGUID = Guid.Parse("<graph-repository-guid>"),
     VectorRepositoryGUID = Guid.Parse("<vector-repository-guid>"),
     ProcessingEndpoint = "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
     ProcessingAccessKey = "***ault",
     EmbeddingsServerUrl = "http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
     EmbeddingsServerApiKey = "***ault",
     EmbeddingsGenerator = EmbeddingsGeneratorEnum.LCProxy,
     EmbeddingsGeneratorUrl = "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddings",
     EmbeddingsGeneratorApiKey = "",
     BatchSize = 512,
     MaxGeneratorTasks = 32,
     MaxRetries = 3,
     MaxFailures = 3,
     VectorStoreUrl = "http://localhost:8000/",
     VectorStoreAccessKey = "***ault",
     MaxContentLength = 16777216,
};

EmbeddingsRule response = await sdk.EmbeddingsRule.Update(request);

Response

Returns the updated embeddings rule object with all configuration details:

{
    "GUID": "59f7c18d-7342-4aef-9308-86083459dd81",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "My updated embeddings rule",
    "ContentType": "*",
    "GraphRepositoryGUID": "example-graph-repository",
    "VectorRepositoryGUID": "example-vector-repository",
    "ProcessingEndpoint": "http://localhost:8501/processor",
    "ProcessingAccessKey": "***ault",
    "ChunkingServerUrl": "http://nginx-chunker:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/chunking",
    "ChunkingServerApiKey": "***ault",
    "MaxChunkingTasks": 16,
    "MinChunkContentLength": 1,
    "MaxChunkContentLength": 1,
    "MaxTokensPerChunk": 1,
    "TokenOverlap": 32,
    "TokenizationModel": "sentence-transformers/all-MiniLM-L6-v2",
    "HuggingFaceApiKey": "***ault",
    "EmbeddingsServerUrl": "http://nginx-embeddings:8000/",
    "EmbeddingsServerApiKey": "***ault",
    "EmbeddingsGenerator": "LCProxy",
    "EmbeddingsGeneratorUrl": "http://localhost:8301/",
    "EmbeddingsGeneratorApiKey": "***ault",
    "EmbeddingsBatchSize": 16,
    "MaxEmbeddingsTasks": 16,
    "MaxEmbeddingsRetries": 3,
    "MaxEmbeddingsFailures": 3,
    "VectorStoreUrl": "http://localhost:8311/",
    "VectorStoreAccessKey": "***ault",
    "MaxContentLength": 16777216,
    "CreatedUtc": "2024-10-21T15:19:09.000000Z"
}
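
Because the update call returns the full rule object, one way to confirm that an update took effect is to compare the fields you submitted with the fields that came back. The following is a minimal sketch, assuming both objects are plain dictionaries parsed from JSON; `changed_fields` is a hypothetical helper and not part of the View SDK.

```python
def changed_fields(submitted: dict, returned: dict) -> dict:
    """Return {field: (submitted_value, returned_value)} for fields that differ.

    Only keys present in `submitted` are compared, so server-populated
    fields such as GUID or CreatedUtc are ignored.
    """
    return {
        key: (value, returned.get(key))
        for key, value in submitted.items()
        if returned.get(key) != value
    }


# Illustrative values taken from the sample response above.
submitted = {
    "Name": "My updated embeddings rule",
    "ContentType": "*",
    "MaxContentLength": 16777216,
}
returned = {
    "GUID": "59f7c18d-7342-4aef-9308-86083459dd81",
    "Name": "My updated embeddings rule",
    "ContentType": "*",
    "MaxContentLength": 16777216,
    "CreatedUtc": "2024-10-21T15:19:09.000000Z",
}

mismatches = changed_fields(submitted, returned)
print(mismatches or "update applied as submitted")
```

An empty result means every submitted field was echoed back unchanged; any entry in the result points at a field worth inspecting.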

Delete Embeddings Rule

Deletes an embeddings rule object by GUID using DELETE /v1.0/tenants/[tenant-guid]/embeddingsrules/[embeddingsrule-guid]. This operation permanently removes the embeddings rule and stops all associated processing workflows. Use it with caution, because the deletion cannot be undone.

Request Parameters

  • embeddingsrule-guid (string, Path, Required): GUID of the embeddings rule object to delete
curl -X DELETE 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const deleteEmbeddingRule = async () => {
  try {
    const response = await api.EmbeddingRule.delete(
      "<embeddingsrule-guid>"
    );
    console.log(response, "Embedding rule deleted successfully");
  } catch (err) {
    console.log("Error deleting Embedding rule:", err);
  }
};

deleteEmbeddingRule();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def deleteEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.delete("<embeddingsrule-guid>")
    print(embeddingsRules)

deleteEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
        
bool deleted = await sdk.EmbeddingsRule.Delete(Guid.Parse("<embeddingsrule-guid>"));

Response

Returns 204 No Content on successful deletion. No response body is returned.
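
For callers that do not use an SDK, the request can be sketched with Python's standard library. This is a minimal illustration rather than the SDK's implementation: `build_delete_request` and `deletion_succeeded` are hypothetical helper names, and the `Authorization` header value is passed through unchanged because its exact format is not shown here.

```python
from urllib.request import Request


def build_delete_request(base_url: str, tenant_guid: str, rule_guid: str, auth_value: str) -> Request:
    """Build (but do not send) a DELETE request for one embeddings rule."""
    url = (
        f"{base_url.rstrip('/')}/v1.0/tenants/{tenant_guid}"
        f"/embeddingsrules/{rule_guid}"
    )
    return Request(url, method="DELETE", headers={"Authorization": auth_value})


def deletion_succeeded(status_code: int) -> bool:
    """A successful delete is reported as 204 No Content."""
    return status_code == 204


request = build_delete_request(
    "http://localhost:8000/",
    "00000000-0000-0000-0000-000000000000",
    "00000000-0000-0000-0000-000000000000",
    "<authorization-value>",  # placeholder, supply your own credential
)
print(request.get_method(), request.full_url)
```

The request object can then be passed to `urllib.request.urlopen`, and the response's `status` attribute checked with `deletion_succeeded`.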

Check Embeddings Rule Existence

Verifies if an embeddings rule object exists without retrieving its configuration using HEAD /v1.0/tenants/[tenant-guid]/embeddingsrules/[embeddingsrule-guid]. This is an efficient way to check embeddings rule presence before performing operations.

Request Parameters

  • embeddingsrule-guid (string, Path, Required): GUID of the embeddings rule object to check
curl --location --head 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/embeddingsrules/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewConfigurationSdk } from "view-sdk";

const api = new ViewConfigurationSdk(
  "http://localhost:8000/", //endpoint
  "default", //tenant Id
  "default" //access key
);

const embeddingRuleExists = async () => {
  try {
    const response = await api.EmbeddingRule.exists(
      "<embeddingsrule-guid>"
    );
    console.log(response, "Embedding rule exists");
  } catch (err) {
    console.log("Error checking Embedding rule:", err);
  }
};

embeddingRuleExists();
import view_sdk
from view_sdk import configuration
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.DEFAULT: 8000},
)

def existsEmbeddingsRules():
    embeddingsRules = configuration.EmbeddingsRule.exists("<embeddingsrule-guid>")
    print(embeddingsRules)

existsEmbeddingsRules()
using View.Sdk;
using View.Sdk.Configuration;

ViewConfigurationSdk sdk = new ViewConfigurationSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
        
bool exists = await sdk.EmbeddingsRule.Exists(Guid.Parse("<embeddingsrule-guid>"));

Response

  • 200 OK: Embeddings rule exists
  • 404 Not Found: Embeddings rule does not exist
  • No response body: Only HTTP status code is returned

Note: HEAD requests do not return a response body, only the HTTP status code indicating whether the embeddings rule exists.
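
Since the result is carried entirely by the status code, client code has to map that code to a boolean itself. A minimal sketch follows, assuming the status code has already been obtained from whatever HTTP client you use; `rule_exists` is a hypothetical helper, not an SDK function.

```python
def rule_exists(status_code: int) -> bool:
    """Translate the status of a HEAD existence check into a boolean.

    200 means the rule exists and 404 means it does not. Any other status
    (for example an authentication or server error) is raised rather than
    silently reported as "does not exist".
    """
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    raise RuntimeError(f"unexpected status {status_code} from existence check")


for status in (200, 404):
    print(status, rule_exists(status))
```

Raising on unexpected statuses is a design choice in this sketch: it keeps a failed request from being mistaken for a missing rule.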

Best Practices

When configuring embeddings rules in the View platform, consider the following recommendations for optimal performance and reliability:

  • Content Type Filtering: Use specific content types instead of "*" when possible to improve processing efficiency
  • Chunk Size Optimization: Configure appropriate MinChunkContentLength and MaxChunkContentLength based on your content characteristics
  • Batch Size Tuning: Adjust EmbeddingsBatchSize based on your server capacity and processing requirements
  • API Key Security: Store and manage API keys securely, especially for HuggingFace and external services
  • Testing and Validation: Test embeddings rules with sample content before deploying to production
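
Some of these recommendations can be checked mechanically before a rule is submitted. The sketch below is one possible pre-flight check, assuming the rule is held as a dictionary that uses the field names from the object structure; `lint_rule` and its specific checks are illustrative assumptions, not validation performed by the View platform.

```python
def lint_rule(rule: dict) -> list:
    """Return human-readable warnings for a candidate embeddings rule."""
    warnings = []

    # Content Type Filtering: a wildcard matches every content type.
    if rule.get("ContentType") == "*":
        warnings.append("ContentType is '*': the rule applies to all content types")

    # Chunk Size Optimization: the minimum must not exceed the maximum.
    min_len = rule.get("MinChunkContentLength")
    max_len = rule.get("MaxChunkContentLength")
    if min_len is not None and max_len is not None and min_len > max_len:
        warnings.append(
            f"MinChunkContentLength ({min_len}) exceeds MaxChunkContentLength ({max_len})"
        )

    # Batch Size Tuning: a batch must contain at least one item.
    batch = rule.get("EmbeddingsBatchSize")
    if batch is not None and batch < 1:
        warnings.append(f"EmbeddingsBatchSize must be at least 1, got {batch}")

    return warnings


candidate = {
    "ContentType": "*",
    "MinChunkContentLength": 512,
    "MaxChunkContentLength": 256,
    "EmbeddingsBatchSize": 16,
}
for warning in lint_rule(candidate):
    print("warning:", warning)
```

Running a check like this against sample configurations is a cheap complement to testing the rule with real content.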

Next Steps

After successfully configuring embeddings rules, you can:

  • Content Processing: Upload content to configured buckets to trigger automated embeddings generation
  • Vector Search: Implement semantic search capabilities using generated vector embeddings
  • AI Applications: Build AI-powered applications that leverage processed text embeddings
  • Integration: Connect embeddings rules with other View platform services and APIs
  • Monitoring: Set up monitoring for embeddings processing workflows and performance metrics