Collection Management - Lexi Metadata Database

Overview

Collections provide a logical grouping of source documents within the Lexi data catalog and search platform. They serve as containers for organizing, searching, and managing processed documents with their associated metadata, terms, and semantic information. Collections enable efficient document discovery, content analysis, and search operations across large document repositories.

Collections are managed via the Lexi server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/collections and support comprehensive operations including document enumeration, search with various filters, top terms analysis, and statistical reporting.
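
As a minimal sketch of how the endpoint template above expands, the following Python helper assembles a collections URL from its parts. The function name collections_url and its parameters are assumptions made for illustration; they are not part of the Lexi SDK.

```python
from typing import Optional


def collections_url(scheme: str, hostname: str, port: int,
                    tenant_guid: str,
                    collection_guid: Optional[str] = None) -> str:
    """Build a collections URL following the template in this section.

    Hypothetical helper for illustration; not part of view_sdk.
    """
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    url = f"{scheme}://{hostname}:{port}/v1.0/tenants/{tenant_guid}/collections"
    # Append the collection GUID only when addressing a single collection.
    if collection_guid is not None:
        url += f"/{collection_guid}"
    return url


print(collections_url("http", "localhost", 8000, "default"))
```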

Collection Object Structure

Collections have the following structure:

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}

Field Descriptions

GUID (GUID): Globally unique identifier for the collection object
TenantGUID (GUID): Globally unique identifier for the tenant that owns this collection
Name (string): Display name for the collection
AllowOverwrites (boolean): Specifies whether source documents can be overwritten when they already exist
AdditionalData (string): User-supplied notes, descriptions, or additional metadata for the collection
CreatedUtc (datetime): Timestamp indicating when the collection was created, in UTC time
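
The field list above can be mirrored in client code. The sketch below parses the sample collection object into a typed Python structure; the Collection dataclass and parse_collection function are illustrative assumptions, not SDK types, and the timestamp format is inferred from the examples in this document.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Collection:
    """Illustrative model of a collection object (not an SDK class)."""
    guid: str
    tenant_guid: str
    name: str
    allow_overwrites: bool
    additional_data: str
    created_utc: datetime


def parse_collection(raw: str) -> Collection:
    data = json.loads(raw)
    # Timestamps in the examples use microsecond precision with a trailing Z.
    created = datetime.strptime(data["CreatedUtc"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return Collection(
        guid=data["GUID"],
        tenant_guid=data["TenantGUID"],
        name=data["Name"],
        allow_overwrites=data["AllowOverwrites"],
        additional_data=data.get("AdditionalData") or "",
        created_utc=created.replace(tzinfo=timezone.utc),
    )


sample = """{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}"""
print(parse_collection(sample))
```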

Create Collection

Creates a new collection in the Lexi metadata database using PUT /v1.0/tenants/[tenant-guid]/collections. The request path in the examples below does not include a collection GUID; the created collection object, including its GUID, is returned in the response.

Request Parameters

Required Parameters

Name (string, Body, Required): Display name for the collection
AllowOverwrites (boolean, Body, Required): Whether source documents can be overwritten when they already exist

Optional Parameters

AdditionalData (string, Body, Optional): User-supplied notes or additional metadata for the collection
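
The required and optional parameters above can be checked on the client before a request is sent. The following sketch serializes a request body and rejects missing required fields; build_create_body is a hypothetical helper, and the validation rules are assumptions drawn from the parameter list rather than documented server behavior.

```python
import json
from typing import Optional


def build_create_body(name: str, allow_overwrites: bool,
                      additional_data: Optional[str] = None) -> str:
    """Serialize a create-collection request body (hypothetical helper)."""
    if not name:
        raise ValueError("Name is required")
    if not isinstance(allow_overwrites, bool):
        raise ValueError("AllowOverwrites is required and must be a boolean")
    body = {"Name": name, "AllowOverwrites": allow_overwrites}
    # AdditionalData is optional, so include it only when supplied.
    if additional_data is not None:
        body["AdditionalData"] = additional_data
    return json.dumps(body)


print(build_create_body("My collection", True, "My notes"))
```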

curl -X PUT http://localhost:8601/v1.0/tenants/[tenant-guid]/collections \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer [accesskey]" \
     -d '
{
    "Name": "My collection",
    "AllowOverwrites": true,
    "AdditionalData": "My notes"
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const createCollection = async () => {
  try {
    const response = await api.collectionsSdk.create({
      Name: "My second collection [ASH]",
      AllowOverwrites: true,
      AdditionalData: "Yet another collection",
    });
    console.log(response, "Collection created successfully");
  } catch (err) {
    console.log("Error creating Collection:", err);
  }
};

createCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI: 8000},
)

def createCollection():
    collection = lexi.Collection.create(  
        Name="My second collection",
        AllowOverwrites=True,
        AdditionalData="Yet another collection"
    )
    print(collection)

createCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
Collection newCollection = new Collection
{
   Name = "My second collection [ASH]",
   AllowOverwrites = true,
   AdditionalData = "Yet another collection",
};

Collection response = await sdk.Collection.Create(newCollection);

Response

Returns the created collection object with all configuration details:

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My second collection [ASH]",
    "AllowOverwrites": true,
    "AdditionalData": "Yet another collection",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}

Read Collection

Retrieves a specific collection by GUID using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Returns the complete collection configuration including metadata and settings.

Request Parameters

collection-guid (string, Path, Required): GUID of the collection object to retrieve

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollection = async () => {
  try {
    const response = await api.collectionsSdk.read(
      "<collection-guid>"
    );
    console.log(response, "Collection fetched successfully");
  } catch (err) {
    console.log("Error fetching Collection:", err);
  }
};

retrieveCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def readCollection():
    collection = lexi.Collection.retrieve("<collection-guid>")
    print(collection)

readCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
Collection response = await sdk.Collection.Retrieve(Guid.Parse("<collection-guid>"));

Response

Returns the collection object with all configuration details if found, or a 404 Not Found error if the collection does not exist:

{
    "GUID": "oneminute",
    "TenantGUID": "default",
    "Name": "Every minute",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:21:00.000000Z"
}

Read All Collections

Retrieves all collections for a tenant using GET /v1.0/tenants/[tenant-guid]/collections. Returns a JSON array containing all collection objects with their complete configuration details.

Request Parameters

No additional parameters required beyond authentication.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollections = async () => {
  try {
    const response = await api.collectionsSdk.readAll();
    console.log(response, "Collections fetched successfully");
  } catch (err) {
    console.log("Error fetching Collections:", err);
  }
};

retrieveCollections();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI: 8000},
)

def readAllCollections():
    collections = lexi.Collection.retrieve_all()
    print(collections)

readAllCollections()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
List<Collection> response = await sdk.Collection.RetrieveMany();

Response

Returns an array of all collection objects for the tenant, or a 404 Not Found error if no collections exist:

[
    {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    },
    {
        "GUID": "another-collection-guid",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My second collection",
        "AllowOverwrites": false,
        "AdditionalData": "Production collection",
        "CreatedUtc": "2025-03-26T10:15:30.123456Z"
    }
]

Enumerate Collections

Enumerates all collections for a tenant using GET /v2.0/tenants/[tenant-guid]/collections. Returns an enumerated JSON array containing all collection objects with their complete configuration details.

curl --location 'http://localhost:8000/v2.0/tenants/00000000-0000-0000-0000-000000000000/collections' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const enumerateCollections = async () => {
  try {
    const collections = await api.collections.enumerate();
    console.log(collections, 'Collections enumerated successfully');
  } catch (error) {
    console.log(error, 'Error enumerating collections');
  }
};

enumerateCollections();

Enumerate Collection Documents

Enumerates all documents within a collection using POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?enumerate. Supports pagination, filtering, and ordering for efficient document discovery and management.

Request Parameters

Required Parameters

collection-guid (string, Path, Required): GUID of the collection to enumerate documents from

Optional Parameters

MaxResults (integer, Body, Optional): Maximum number of documents to return (default: 1000)
Skip (integer, Body, Optional): Number of documents to skip for pagination (default: 0)
ContinuationToken (string, Body, Optional): Token for continuing pagination from previous request
Ordering (string, Body, Optional): Sort order for results (e.g., "CreatedDescending", "CreatedAscending")
Filters (array, Body, Optional): Array of filter objects for document filtering
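
The pagination parameters above suggest a loop that resubmits the request with the returned ContinuationToken until EndOfResults is true. The sketch below shows that loop under stated assumptions: post is a caller-supplied function that sends the body to the enumerate endpoint and returns the parsed JSON, and iterate_documents and fake_post are hypothetical names used only to keep the example self-contained.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iterate_documents(post: Callable[[Dict[str, Any]], Dict[str, Any]],
                      max_results: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every document by following ContinuationToken across pages.

    `post` is a caller-supplied function (an assumption of this sketch) that
    sends the request body to the enumerate endpoint and returns parsed JSON.
    """
    token: Optional[str] = None
    while True:
        body = {
            "MaxResults": max_results,
            "ContinuationToken": token,
            "Ordering": "CreatedDescending",
        }
        page = post(body)
        yield from page.get("Objects", [])
        token = page.get("ContinuationToken")
        # Stop when the server reports the end or offers no token to resume.
        if page.get("EndOfResults", True) or not token:
            break


# Stand-in for a real HTTP call: two fake pages keyed by continuation token.
_pages = {
    None: {"Objects": [{"ObjectKey": "1.pdf"}], "ContinuationToken": "t1",
           "EndOfResults": False},
    "t1": {"Objects": [{"ObjectKey": "2.pdf"}], "ContinuationToken": None,
           "EndOfResults": True},
}


def fake_post(body: Dict[str, Any]) -> Dict[str, Any]:
    return _pages[body["ContinuationToken"]]


print([d["ObjectKey"] for d in iterate_documents(fake_post)])
```

Passing the HTTP call in as a function keeps the paging logic independent of any particular client library.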

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?enumerate=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 100,
    "Skip": 0,   
    "ContinuationToken": null, 
    "Ordering": "CreatedDescending",
    "Filters": [
        {
            "Field": "ObjectKey",
            "Condition": "IsNotNull",
            "Value": ""
        }
    ]
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const enumerateCollection = async () => {
  try {
    const response = await api.sourceDocumentSdk.enumerate(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: null,
        Ordering: "CreatedDescending",
        Filters: [
          {
            Field: "ObjectKey",
            Condition: "IsNotNull",
            Value: "",
          },
        ],
      }
    );
    console.log(response, "Documents enumerated successfully");
  } catch (err) {
    console.log("Error enumerating documents:", err);
  }
};

enumerateCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def enumerateCollectionDocuments():
    documents = lexi.Collection.enumerate_documents("<collection-guid>",{
            "MaxResults": 100,
            "Skip": 0,   
            "ContinuationToken": None, 
            "Ordering": "CreatedDescending",
            "Filters": [
                {
                    "Field": "ObjectKey",
                    "Condition": "IsNotNull",
                    "Value": ""
                }
            ]
        }
        )
    print(documents)

enumerateCollectionDocuments()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
EnumerationResult<Collection> response = await sdk.Collection.Enumerate();

Response

Returns a paginated enumeration result with document metadata, continuation tokens, and performance metrics:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 91.21,
        "Messages": {
            "Info": "Enumeration completed successfully",
            "Debug": "Processed 100 documents"
        }
    },
    "MaxResults": 1000,
    "IterationsRequired": 1,
    "ContinuationToken": "3135b2ba-7939-4cc3-8849-bff23b27bc9a",
    "EndOfResults": false,
    "RecordsRemaining": 46,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "5.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/5.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z"
        }
    ]
}

Read Top Terms

Retrieves the most frequently occurring terms across all documents in a collection using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/topterms?max-keys=10. Useful for content analysis, trending topics, and search optimization.

Request Parameters

collection-guid (string, Path, Required): GUID of the collection to analyze
max-keys (integer, Query, Optional): Maximum number of top terms to return (default: 10)
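
Because the top terms come back as a JSON object of term-to-count pairs, a client that needs a ranking may want to sort explicitly rather than rely on key order. The following is a minimal sketch using counts from the sample response in this section; rank_terms is a hypothetical helper, not an SDK function.

```python
from typing import Dict, List, Tuple


def rank_terms(top_terms: Dict[str, int], limit: int = 3) -> List[Tuple[str, int]]:
    """Return the `limit` most frequent terms, highest count first."""
    ranked = sorted(top_terms.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Counts taken from the sample response in this section.
sample = {"2023": 19689, "apple": 20362, "yes": 152463, "answered": 48717}
print(rank_terms(sample))
```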

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/topterms?max-keys=10' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveTopTerms = async () => {
  try {
    const response = await api.collectionsSdk.readTopTerms(
      "<collection-guid>"
    );
    console.log(response, "Top terms fetched successfully");
  } catch (err) {
    console.log("Error fetching top terms:", err);
  }
};

retrieveTopTerms();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def readTopTerms():
    terms = lexi.Collection.retrieve_top_terms("<collection-guid>")
    print(terms)

readTopTerms()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
CollectionTopTerms response = await sdk.Collection.RetrieveTopTerms(Guid.Parse("<collection-guid>"), 5);

Response

Returns a JSON object with terms as keys and their frequency counts as values, or a 404 Not Found error if the collection does not exist:

{
  "yes": 152463,
  "answered": 48717,
  "apple": 20362,
  "2023": 19689,
  "compliance": 17300,
  "2022": 11238,
  "2024": 5169,
  "b2b": 2808,
  "digital": 2801,
  "products": 2800
}

Read Statistics

Retrieves comprehensive statistics for a collection using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]?stats=null. Provides document counts, byte totals, term counts, and other analytical metrics for collection analysis and monitoring.

Request Parameters

collection-guid (string, Path, Required): GUID of the collection to analyze
stats (string, Query, Required): Must be set to "null" to retrieve statistics
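
The statistics fields can be combined into derived metrics for monitoring. As one hedged example, the sketch below computes the mean document size from DocumentCount and TotalBytes; average_document_bytes is a hypothetical helper, and the input values are copied from the sample statistics response in this section.

```python
from typing import Any, Dict


def average_document_bytes(stats: Dict[str, Any]) -> float:
    """Mean document size in bytes, or 0.0 for an empty collection."""
    count = stats.get("DocumentCount", 0)
    if count == 0:
        return 0.0
    return stats["TotalBytes"] / count


# Values copied from the sample statistics response.
stats = {"DocumentCount": 227, "TotalBytes": 197530629,
         "TermCount": 200697, "KeyValueCount": 8}
print(round(average_document_bytes(stats)))
```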

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000?stats=null' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollectionStatistics = async () => {
  try {
    const response = await api.collectionsSdk.readStatistics(
      "<collection-guid>"
    );
    console.log(response, "Statistics fetched successfully");
  } catch (err) {
    console.log("Error fetching Statistics:", err);
  }
};

retrieveCollectionStatistics();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def readCollectionStatistics():
    statistics = lexi.Collection.retrieve_statistics("<collection-guid>")
    print(statistics)

readCollectionStatistics()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
CollectionStatistics response = await sdk.Collection.RetrieveStatistics(Guid.Parse("<collection-guid>"));

Response

Returns a JSON object containing the collection metadata and its statistics (document count, total bytes, term count, and key-value count), or a 404 Not Found error if the collection does not exist:

{
    "Collection": {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    },
    "DocumentCount": 227,
    "TotalBytes": 197530629,
    "TermCount": 200697,
    "KeyValueCount": 8
}

Check Collection Existence

Checks whether a collection exists using HEAD /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Returns only HTTP status codes without response body for efficient existence verification.

Request Parameters

collection-guid (string, Path, Required): GUID of the collection object to check

curl --location --head 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const collectionExists = async () => {
  try {
    const response = await api.collectionsSdk.exists(
      "<collection-guid>"
    );
    console.log(response, "collection exists");
  } catch (err) {
    console.log("Error fetching collection:", err);
  }
};

collectionExists();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def existsCollection():
    exists = lexi.Collection.exists("<collection-guid>")
    print(exists)

existsCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
bool exists = await sdk.Collection.Exists(Guid.Parse("<collection-guid>"));

Response

200 OK: Collection exists
404 Not Found: Collection does not exist
No response body: Only HTTP status code is returned

Note: HEAD requests do not return a response body, only the HTTP status code indicating whether the collection exists.
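
Since the answer is carried entirely by the status code, a client only needs to map that code to a boolean. The sketch below does so for the two documented outcomes; collection_exists is a hypothetical function, and treating every other status as an error is an assumption of this example rather than documented behavior.

```python
def collection_exists(status_code: int) -> bool:
    """Interpret the HTTP status of a HEAD request on a collection."""
    if status_code == 200:
        return True
    if status_code == 404:
        return False
    # Anything else (for example 401 or 500) is not an existence answer.
    raise RuntimeError(f"unexpected status code: {status_code}")


print(collection_exists(200), collection_exists(404))
```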

Search Collection

Searches for documents within a collection using POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search. Supports advanced filtering, term matching, date ranges, and content type filtering for comprehensive document discovery.

Request Parameters

Required Parameters

collection-guid (string, Path, Required): GUID of the collection to search within

Optional Parameters

MaxResults (integer, Body, Optional): Maximum number of search results to return
Skip (integer, Body, Optional): Number of results to skip for pagination
ContinuationToken (string, Body, Optional): Token for continuing pagination from previous search
Ordering (string, Body, Optional): Sort order for search results
Filter (object, Body, Optional): Search filter object containing various filtering criteria
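
A search body can be assembled programmatically from these parameters. The following sketch builds one from Python datetime values, formatting the date range the way the request examples in this section do; build_search_body and FILTER_TIME_FORMAT are hypothetical names, and the empty list fields simply mirror the curl example.

```python
import json
from datetime import datetime
from typing import Any, Dict, List

# Timestamp layout used by the CreatedAfter/CreatedBefore examples.
FILTER_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def build_search_body(terms: List[str], created_after: datetime,
                      created_before: datetime,
                      max_results: int = 100) -> Dict[str, Any]:
    """Assemble a search request body (hypothetical helper, not SDK code)."""
    if created_after >= created_before:
        raise ValueError("created_after must precede created_before")
    return {
        "MaxResults": max_results,
        "Skip": 0,
        "Ordering": "CreatedDescending",
        "Filter": {
            "CreatedAfter": created_after.strftime(FILTER_TIME_FORMAT),
            "CreatedBefore": created_before.strftime(FILTER_TIME_FORMAT),
            "Terms": terms,
            # Empty lists mirror the curl example; they apply no extra filter.
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    }


body = build_search_body(["foo"], datetime(2024, 1, 1), datetime(2025, 1, 1))
print(json.dumps(body, indent=4))
```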

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      }
    );
    console.log(response, "Collection searched successfully");
  } catch (err) {
    console.log("Error searching Collection:", err);
  }
};

searchCollection()

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def searchCollection():
    search = lexi.Collection.search("<collection-guid>",   
        MaxResults= 2,
        Skip= 1,
        ContinuationToken= "",
        Ordering= "CreatedDescending",
        Filter= {
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": [
                "foo"
            ],
            "MimeTypes": [
            ],
            "Prefixes": [
            ],
            "Suffixes": [
            ],
            "SchemaFilters": [
            ]
    })
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.Search(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata, relevance scores, and pagination information:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 45.32,
        "Messages": {
            "Info": "Search completed successfully",
            "Debug": "Found 15 matching documents"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95
        }
    ]
}

Search Collection and Include Data

To search a collection and include each document's processed data in the results, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&incldata
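
The search variants differ only in the query flags appended to the documents URL. A minimal sketch of building those URLs follows, using the flag=null form shown in the curl examples; search_url is a hypothetical helper, and whether both optional flags can be combined in one request is an assumption not confirmed by this document.

```python
def search_url(documents_url: str, include_data: bool = False,
               include_top_terms: bool = False) -> str:
    """Append the search flags shown in the curl examples to a documents URL."""
    flags = ["search=null"]
    if include_data:
        flags.append("incldata=null")
    if include_top_terms:
        flags.append("incltopterms=null")
    return documents_url + "?" + "&".join(flags)


base = "http://localhost:8000/v1.0/tenants/default/collections/default/documents"
print(search_url(base, include_data=True))
```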

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&incldata=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      true
    );
    console.log(response, "Collection searched successfully");
  } catch (err) {
    console.log("Error searching Collection:", err);
  }
};

searchCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def searchCollection():
    search = lexi.Collection.search("<collection-guid>", 
        include_data=True,
        MaxResults= 2,
        Skip= 1,
        ContinuationToken= "",
        Ordering= "CreatedDescending",
        Filter= {
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": [
                "foo"
            ],
            "MimeTypes": [
            ],
            "Prefixes": [
            ],
            "Suffixes": [
            ],
            "SchemaFilters": [
            ]
    })
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchIncludeData(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata, relevance scores, and processed content data:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 78.45,
        "Messages": {
            "Info": "Search with data completed successfully",
            "Debug": "Found 15 matching documents with content"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95,
            "UdrDocument": {
                "Success": true,
                "Key": "document.pdf:1",
                "Terms": ["view", "platform", "search", "document"],
                "TopTerms": {
                    "view": 15,
                    "platform": 12,
                    "search": 8,
                    "document": 6
                }
            }
        }
    ]
}
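
As a minimal sketch of consuming this response, the snippet below walks the Objects array and pulls out each document's key, score, and most frequent UdrDocument.TopTerms entry. The helper name summarize_documents is hypothetical and not part of the SDK, and the sample payload is a trimmed copy of the response above. The code works on the parsed JSON only and makes no network calls.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE = json.loads("""
{
    "Success": true,
    "TotalResults": 15,
    "EndOfResults": true,
    "Objects": [
        {
            "ObjectKey": "document.pdf",
            "Score": 0.95,
            "UdrDocument": {
                "Success": true,
                "TopTerms": {"view": 15, "platform": 12, "search": 8, "document": 6}
            }
        }
    ]
}
""")


def summarize_documents(response: dict) -> list:
    """Return one summary dict per document in a search-with-data response.

    Hypothetical helper for illustration; not part of the View SDK.
    """
    summaries = []
    for doc in response.get("Objects") or []:
        # Treat UdrDocument as optional so the helper also tolerates plain search results.
        top_terms = (doc.get("UdrDocument") or {}).get("TopTerms") or {}
        # Pick the term with the highest count, or None when no terms are present.
        best = max(top_terms, key=top_terms.get) if top_terms else None
        summaries.append(
            {"key": doc.get("ObjectKey"), "score": doc.get("Score"), "top_term": best}
        )
    return summaries


if __name__ == "__main__":
    for row in summarize_documents(SAMPLE):
        print(row)
```

Treating UdrDocument as optional keeps the same helper usable for searches that do not request document data.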

Search collection and include top terms

To search a collection and include top terms in the results, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&incltopterms

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&incltopterms=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      undefined,
      true
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        include_top_terms=True,
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");

CollectionSearchRequest request = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchIncludeTopTerms(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata, relevance scores, and top terms analysis:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 62.18,
        "Messages": {
            "Info": "Search with top terms completed successfully",
            "Debug": "Found 15 matching documents with term analysis"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "TopTerms": {
        "view": 152463,
        "platform": 48717,
        "search": 20362,
        "document": 19689,
        "analysis": 17300
    },
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95
        }
    ]
}
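
The collection-level TopTerms map in this response can be ranked client-side. The sketch below sorts the terms by count and reports each term's share of the total across the returned terms. The function name rank_top_terms is hypothetical, and the counts are copied from the sample response above.

```python
# TopTerms copied from the sample response above.
SAMPLE = {
    "TopTerms": {
        "view": 152463,
        "platform": 48717,
        "search": 20362,
        "document": 19689,
        "analysis": 17300,
    }
}


def rank_top_terms(response: dict, limit: int = 3) -> list:
    """Return (term, count, share) tuples sorted by descending count.

    Hypothetical helper for illustration; share is the term's count divided
    by the sum of all counts present in the TopTerms map.
    """
    top_terms = response.get("TopTerms") or {}
    total = sum(top_terms.values())
    ranked = sorted(top_terms.items(), key=lambda item: item[1], reverse=True)
    return [
        (term, count, count / total if total else 0.0)
        for term, count in ranked[:limit]
    ]


if __name__ == "__main__":
    for term, count, share in rank_top_terms(SAMPLE):
        print(f"{term}: {count} ({share:.1%})")
```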

Search collection and emit result

To search a collection and emit the result, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&async

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&async=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 2,
        Skip: 1,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2024-01-01 00:00:00.000000",
          CreatedBefore: "2025-01-01 00:00:00.000000",
          Terms: ["foo"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      undefined,
      undefined,
      true
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        emit_results=True,
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");

CollectionSearchRequest request = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchAsync(Guid.Parse("<collection-guid>"), request);

Response

Returns an async search result with operation tracking:

{
    "Success": true,
    "AsyncOperationId": "async-search-operation-guid",
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 12.34,
        "Messages": {
            "Info": "Async search initiated successfully",
            "Debug": "Search operation queued for processing"
        }
    },
    "Status": "Queued",
    "EstimatedCompletionTime": "2024-10-27T06:02:00.000000Z",
    "ProgressUrl": "/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/search/async/async-search-operation-guid"
}
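
Because ProgressUrl in this response is a server-relative path, a client needs to resolve it against the Lexi endpoint before using it. The sketch below does only that, using the standard library. The helper name progress_endpoint is hypothetical, and how the progress endpoint is polled or what it returns is not described here, so the sketch stops at building the URL.

```python
from typing import Optional
from urllib.parse import urljoin

# Fields copied from the sample async response above.
SAMPLE = {
    "Success": True,
    "AsyncOperationId": "async-search-operation-guid",
    "Status": "Queued",
    "ProgressUrl": (
        "/v1.0/tenants/00000000-0000-0000-0000-000000000000"
        "/collections/00000000-0000-0000-0000-000000000000"
        "/search/async/async-search-operation-guid"
    ),
}


def progress_endpoint(base_url: str, response: dict) -> Optional[str]:
    """Return the absolute progress URL, or None if none is available.

    Hypothetical helper for illustration; not part of the View SDK.
    """
    # Without a successful response and a ProgressUrl there is nothing to track.
    if not response.get("Success") or not response.get("ProgressUrl"):
        return None
    # urljoin replaces the path of base_url with the server-relative ProgressUrl.
    return urljoin(base_url, response["ProgressUrl"])


if __name__ == "__main__":
    print(progress_endpoint("http://localhost:8000/", SAMPLE))
```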

Delete Collection

Deletes a collection by GUID using DELETE /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Collections must be empty before deletion to prevent data loss.

Request Parameters

collection-guid (string, Path, Required): GUID of the collection object to delete

Response

200 OK: Collection deleted successfully
400 Bad Request: Collection is not empty and cannot be deleted
404 Not Found: Collection does not exist

Note: Collections must be empty (contain no documents) before they can be deleted. If the collection contains documents, a 400 Bad Request error will be returned.

curl --location --request DELETE 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data ''

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const deleteCollection = async () => {
  try {
    const response = await api.collectionsSdk.delete(
      "<collection-guid>"
    );
    console.log(response, "Collection deleted successfully");
  } catch (err) {
    console.log("Error deleting Collection:", err);
  }
};

deleteCollection();

import view_sdk
from view_sdk import lexi
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.LEXI:8000},
)

def deleteCollection():
    response = lexi.Collection.delete("<collection-guid>")
    print(response)

deleteCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");
        
bool deleted = await sdk.Collection.Delete(Guid.Parse("<collection-guid>"));

Response

Returns 204 No Content on successful deletion. No response body is returned.
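
The status codes listed for this operation can be mapped to client-side outcomes as in the sketch below. Since both 200 OK and 204 No Content are mentioned for a successful delete, it accepts any 2xx status as success. The function name and the outcome labels are hypothetical and chosen only for illustration.

```python
def describe_delete_status(status_code: int) -> str:
    """Map an HTTP status from the delete call to a short outcome label.

    Hypothetical helper for illustration; labels are not defined by the API.
    """
    if 200 <= status_code < 300:
        # Covers both 200 OK and 204 No Content.
        return "deleted"
    if status_code == 400:
        # Documented as: collection is not empty and cannot be deleted.
        return "not-empty"
    if status_code == 404:
        # Documented as: collection does not exist.
        return "not-found"
    return "unexpected"


if __name__ == "__main__":
    for code in (200, 204, 400, 404, 500):
        print(code, describe_delete_status(code))
```

A caller that receives "not-empty" would need to remove the collection's documents before retrying the delete.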

Best Practices

When managing collections in the Lexi metadata database, consider the following recommendations for optimal document organization, search performance, and data management:

Collection Organization: Create logical collections based on document types, sources, or business domains for better organization
Search Optimization: Use appropriate filters and ordering parameters to optimize search performance and reduce response times
Document Management: Regularly monitor collection statistics and clean up unused or outdated documents to maintain performance
Security Configuration: Implement proper access controls and ensure collections are properly secured within tenant boundaries
Performance Monitoring: Monitor collection statistics, document counts, and search performance to identify optimization opportunities
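
To illustrate the search optimization point, the sketch below builds a narrowly filtered search request body in the shape used by the search examples in this document. The helper name build_search_request and its validation rules are assumptions, as is the choice to omit CreatedAfter and CreatedBefore when no date bound is given; check the server's behavior before relying on that.

```python
import json
from datetime import datetime
from typing import Optional, Sequence

# Timestamp layout used by the request examples in this document.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def build_search_request(
    terms: Sequence[str],
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
    mime_types: Sequence[str] = (),
    max_results: int = 100,
    ordering: str = "CreatedDescending",
) -> dict:
    """Build a search request body (hypothetical helper, not part of the SDK)."""
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    if (
        created_after is not None
        and created_before is not None
        and created_after >= created_before
    ):
        raise ValueError("created_after must be earlier than created_before")

    filter_body = {
        "Terms": list(terms),
        "MimeTypes": list(mime_types),
        "Prefixes": [],
        "Suffixes": [],
        "SchemaFilters": [],
    }
    # Assumption: date bounds may be omitted when the caller does not set them.
    if created_after is not None:
        filter_body["CreatedAfter"] = created_after.strftime(TIMESTAMP_FORMAT)
    if created_before is not None:
        filter_body["CreatedBefore"] = created_before.strftime(TIMESTAMP_FORMAT)

    return {
        "MaxResults": max_results,
        "Skip": 0,
        "ContinuationToken": "",
        "Ordering": ordering,
        "Filter": filter_body,
    }


if __name__ == "__main__":
    body = build_search_request(
        ["foo"],
        created_after=datetime(2024, 1, 1),
        created_before=datetime(2025, 1, 1),
        mime_types=["application/pdf"],
        max_results=2,
    )
    print(json.dumps(body, indent=4))
```

The resulting dictionary can be serialized with json.dumps and sent as the body of the POST search requests shown earlier.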

Next Steps

After successfully managing collections, you can:

Source Documents: Upload and manage source documents within your collections for content processing
Search Operations: Implement advanced search functionality using the comprehensive filtering and search capabilities
Metadata Analysis: Analyze collection statistics and top terms to gain insights into your document content
Integration: Integrate collections with other View platform services for comprehensive data processing workflows
Document Processing: Set up automated document processing pipelines using collections as organizational containers