Comprehensive guide to managing Lexi embeddings processing in the View Processing platform for AI-powered content analysis and vector generation.
Overview
Lexi embeddings processing provides AI-powered content analysis and vector generation within the View Processing platform. It automates the generation of vector embeddings from processed documents, semantic cells, and content chunks for search, similarity matching, and AI-powered content understanding.
Lexi embeddings processing is accessible via the View Processing API at [http|https]://[hostname]:[port]/[apiversion]/tenants/[tenantguid]/processing/lexiprocessing and supports vector generation and embeddings management.
API Endpoints
- POST /v1.0/tenants/[tenant-guid]/processing/lexiprocessing - Process Lexi embeddings for documents and content
Process Lexi Embeddings
Processes Lexi embeddings for documents and content using POST /v1.0/tenants/[tenant-guid]/processing/lexiprocessing. This operation generates vector embeddings from processed documents, semantic cells, and content chunks for AI-powered analysis and search.
Request Parameters
Required Parameters
- Results (object, Body, Required): Search results containing documents and content for embeddings processing
- EmbeddingsRule (object, Body, Required): Embeddings rule configuration for vector generation
- VectorRepository (object, Body, Required): Vector repository configuration for embeddings storage
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing/lexiprocessing' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
"Results": {
"Success": true,
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"CollectionGUID": "00000000-0000-0000-0000-000000000000",
"Query": {
},
"Documents": [
{
"GUID": "00000000-0000-0000-0000-000000000001",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"CollectionGUID": "00000000-0000-0000-0000-000000000000",
"ObjectGUID": "00000000-0000-0000-0000-000000000000",
"ObjectKey": "1.json",
"ObjectVersion": "1",
"ContentType": "application/json",
"DocumentType": "Json",
"ContentLength": 100,
"Created": "2024-05-22 00:00:00.000000Z",
"UdrDocument": {
"GUID": "00000000-0000-0000-0000-000000000001",
"Success": true,
"Key": "1.json",
"TypeResult": {
"MimeType": "application/json",
"Extension": "json",
"Type": "Json"
},
"Terms": [
"quick",
"brown",
"fox"
],
"SemanticCells": [
{
"GUID": "00000000-0000-0000-0000-000000000000",
"MD5Hash": "md5",
"SHA1Hash": "sha1",
"SHA256Hash": "sha256",
"Position": 0,
"Length": 13,
"Chunks": [
{
"GUID": "00000000-0000-0000-0000-000000000000",
"MD5hash": "md5",
"SHA1Hash": "sha1",
"SHA256Hash": "sha256",
"Position": 0,
"Start": 0,
"End": 12,
"Length": 13,
"Content": "Hello, world!"
}
]
}
]
}
}
]
},
"EmbeddingsRule": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My embeddings rule",
"ContentType": "text/plain",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"EmbeddingsServerUrl": "http://nginx-embeddings:8000/",
"EmbeddingsServerApiKey": "default",
"EmbeddingsGenerator": "LCProxy",
"EmbeddingsGeneratorUrl": "http://nginx-lcproxy:8000/",
"EmbeddingsGeneratorApiKey": "default",
"VectorStoreUrl": "http://nginx-vector:8000/",
"VectorStoreAccessKey": "default"
},
"VectorRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My vector repository",
"RepositoryType": "Pgvector",
"VectorStoreUrl": "http://viewdemo:8000/",
"VectorStoreAccessKey": "default",
"Model": "all-MiniLM-L6-v2",
"Dimensionality": 384,
"DatabaseHostname": "localhost",
"DatabaseName": "vectordb",
"DatabaseTable": "vectors",
"DatabasePort": 5432,
"DatabaseUser": "postgres",
"DatabasePassword": "password"
}
}
'
import { ViewProcessorSdk } from "view-sdk";
const processor = new ViewProcessorSdk(
"<tenant-guid>", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const lexiEmbeddings = async () => {
try {
const response = await processor.generateLexiEmbeddings({
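// Results: search output containing the documents, semantic cells, and chunks to embed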
Results: {
Success: true,
TenantGUID: "<tenant-guid>",
CollectionGUID: "<collection-guid>",
Query: {},
Documents: [
{
GUID: "<document-guid>",
TenantGUID: "<tenant-guid>",
BucketGUID: "<bucket-guid>",
CollectionGUID: "<collection-guid>",
ObjectGUID: "<object-guid>",
ObjectKey: "1.json",
ObjectVersion: "1",
ContentType: "application/json",
DocumentType: "Json",
ContentLength: 100,
Created: "2024-05-22 00:00:00.000000Z",
UdrDocument: {
GUID: "<document-guid>",
Success: true,
Key: "1.json",
TypeResult: {
MimeType: "application/json",
Extension: "json",
Type: "Json",
},
Terms: ["quick", "brown", "fox"],
SemanticCells: [
{
GUID: "<cell-guid>",
MD5Hash: "md5",
SHA1Hash: "sha1",
SHA256Hash: "sha256",
Position: 0,
Length: 13,
Chunks: [
{
GUID: "<chunk-guid>",
MD5Hash: "md5",
SHA1Hash: "sha1",
SHA256Hash: "sha256",
Position: 0,
Start: 0,
End: 12,
Length: 13,
Content: "Hello, world!",
},
],
},
],
},
},
],
},
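// EmbeddingsRule: embeddings generator and vector store configuration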
EmbeddingsRule: {
GUID: "<embeddingrule-guid>",
TenantGUID: "<tenant-guid>",
BucketGUID: "<bucket-guid>",
OwnerGUID: "<owner-guid>",
Name: "My embeddings rule",
ContentType: "text/plain",
VectorRepositoryGUID: "<vector-repository-guid>",
EmbeddingsServerUrl: "http://nginx-embeddings:8000/",
EmbeddingsServerApiKey: "default",
EmbeddingsGenerator: "LCProxy",
EmbeddingsGeneratorUrl: "http://nginx-lcproxy:8000/",
EmbeddingsGeneratorApiKey: "default",
VectorStoreUrl: "http://nginx-vector:8000/",
VectorStoreAccessKey: "default",
},
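// VectorRepository: pgvector store that will hold the generated embeddings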
VectorRepository: {
GUID: "<vector-repository-guid>",
TenantGUID: "<tenant-guid>",
Name: "My vector repository",
RepositoryType: "Pgvector",
VectorStoreUrl: "http://viewdemo:8000/",
VectorStoreAccessKey: "default",
Model: "all-MiniLM-L6-v2",
Dimensionality: 384,
DatabaseHostname: "localhost",
DatabaseName: "vectordb",
DatabaseTable: "vectors",
DatabasePort: 5432,
DatabaseUser: "postgres",
DatabasePassword: "password",
},
});
console.log(response);
} catch (err) {
console.log("Error", err);
}
};
lexiEmbeddings();
using View.Sdk;
using View.Sdk.Processor;
ViewProcessorSdk sdk = new ViewProcessorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
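// Search results payload: the documents, semantic cells, and chunks to embed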
SearchResult results = new SearchResult
{
Success = true,
TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
CollectionGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
Query = new Query(),
SourceDocument = new List<SourceDocument>
{
new SourceDocument
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000001"),
TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
BucketGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
CollectionGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
ObjectGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
ObjectKey = "1.json",
ObjectVersion = "1",
ContentType = "application/json",
DocumentType = DocumentTypeEnum.Json,
ContentLength = 100,
Created = DateTime.Parse("2024-05-22 00:00:00.000000Z"),
UdrDocument = new UdrDocument
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000001"),
Success = true,
Key = "1.json",
TypeResult = new TypeResult
{
MimeType = "application/json",
Extension = "json",
Type = "Json"
},
Terms = new List<string> { "quick", "brown", "fox" },
SemanticCells = new List<SemanticCell>
{
new SemanticCell
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
MD5Hash = "md5",
SHA1Hash = "sha1",
SHA256Hash = "sha256",
Position = 0,
Length = 13,
Chunks = new List<SemanticChunk>
{
new SemanticChunk
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
MD5Hash = "md5",
SHA1Hash = "sha1",
SHA256Hash = "sha256",
Position = 0,
Start = 0,
End = 12,
Length = 13,
Content = "Hello, world!"
}
}
}
}
}
}
}
};
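// Embeddings rule: specifies the embeddings generator and vector store endpoints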
EmbeddingsRule embeddingsRule = new EmbeddingsRule
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
BucketGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
OwnerGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
Name = "My embeddings rule",
ContentType = "text/plain",
VectorRepositoryGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
EmbeddingsServerUrl = "http://nginx-embeddings:8000/",
EmbeddingsServerApiKey = "default",
EmbeddingsGenerator = "LCProxy",
EmbeddingsGeneratorUrl = "http://nginx-lcproxy:8000/",
EmbeddingsGeneratorApiKey = "default",
VectorStoreUrl = "http://nginx-vector:8000/",
VectorStoreAccessKey = "default"
};
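// Vector repository: the pgvector store that will hold the generated embeddings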
VectorRepository vectorRepository = new VectorRepository
{
GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
Name = "My vector repository",
RepositoryType = "Pgvector",
VectorStoreUrl = "http://viewdemo:8000/",
VectorStoreAccessKey = "default",
Model = "all-MiniLM-L6-v2",
Dimensionality = 384,
DatabaseHostname = "localhost",
DatabaseName = "vectordb",
DatabaseTable = "vectors",
DatabasePort = 5432,
DatabaseUser = "postgres",
DatabasePassword = "password"
};
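// Submit the Lexi embeddings processing request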
LexiEmbeddingsResult response = await sdk.LexiEmbeddings.Process(results, embeddingsRule, vectorRepository);
Response
Returns Lexi embeddings processing results with execution status and timing information.
{
"GUID": "ba3b9a65-0104-46dd-ab81-195dd89452ad",
"Success": true,
"Async": true,
"Timestamp": {
"Start": "2025-04-30T13:38:08.260832Z",
"TotalMs": 15.37,
"Messages": {}
}
}
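Because the request can be accepted asynchronously (Async is true in the response above), callers typically check the Success flag and the timing block before moving on. The following is a minimal sketch using the JavaScript SDK call shown earlier; the payload variable is a placeholder for the same request body built in that example.
// Inside an async function, after building the request body shown above:
const result = await processor.generateLexiEmbeddings(payload); // payload: the request body from the example
if (result && result.Success) {
  // Async: true means embeddings generation continues in the background
  console.log(`Accepted (Async=${result.Async}) in ${result.Timestamp.TotalMs} ms`);
} else {
  console.error("Lexi embeddings processing failed", result);
}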
Best Practices
When managing Lexi embeddings processing in the View Processing platform, consider the following recommendations for optimal vector generation, AI-powered analysis, and embeddings management:
- Content Preparation: Ensure documents and content are properly processed and structured before embeddings generation for optimal vector quality
- Vector Configuration: Configure appropriate vector repository settings (dimensionality, model selection) based on your content types and search requirements; see the sketch after this list for keeping dimensionality in sync with the selected model
- Embeddings Quality: Use high-quality embeddings generators and models to maximize search accuracy and content understanding
- Performance Optimization: Monitor embeddings processing performance and optimize vector generation parameters for large-scale content processing
- Search Integration: Integrate generated embeddings with search capabilities to enable AI-powered content discovery and similarity matching
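For the vector configuration recommendation, the repository's Dimensionality must match the output size of the selected embeddings model; all-MiniLM-L6-v2, used in the examples above, produces 384-dimensional vectors. The sketch below shows one way to keep the two in sync. The MODEL_DIMENSIONS map and buildVectorRepository helper are illustrative assumptions, not part of the View SDK.
// Illustrative mapping of embedding models to their output dimensionality.
const MODEL_DIMENSIONS = {
  "all-MiniLM-L6-v2": 384, // model used in the examples above
  // add the other models your embeddings generator exposes
};

// Build the VectorRepository portion of the request, deriving Dimensionality from the model.
function buildVectorRepository(model) {
  const dimensionality = MODEL_DIMENSIONS[model];
  if (!dimensionality) {
    throw new Error(`Unknown model "${model}": set Dimensionality explicitly`);
  }
  return {
    Name: "My vector repository",
    RepositoryType: "Pgvector",
    Model: model,
    Dimensionality: dimensionality, // must match the model's embedding size
  };
}
Keeping this mapping in one place helps avoid mismatches between the model named in the embeddings rule and the schema of the pgvector table that stores the vectors.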
Next Steps
After successfully processing Lexi embeddings, you can:
- Vector Search: Implement advanced vector search operations using generated embeddings for AI-powered content discovery
- Similarity Matching: Use vector embeddings for similarity matching and content recommendation systems
- AI-Powered Analysis: Leverage embeddings for advanced content analysis, classification, and understanding
- Search Optimization: Optimize search performance using vector embeddings for enhanced content discovery
- Content Processing: Continue processing additional content and documents to build comprehensive embeddings databases