Collection Management

Comprehensive guide to managing collections in the Lexi metadata database and search platform.

Overview

Collections provide a logical grouping of source documents within the Lexi data catalog and search platform. They serve as containers for organizing, searching, and managing processed documents with their associated metadata, terms, and semantic information. Collections enable efficient document discovery, content analysis, and search operations across large document repositories.

Collections are managed via the Lexi server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/collections and support comprehensive operations including document enumeration, search with various filters, top terms analysis, and statistical reporting.

Collection Object Structure

Collections have the following structure:

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}

Field Descriptions

  • GUID (GUID): Globally unique identifier for the collection object
  • TenantGUID (GUID): Globally unique identifier for the tenant that owns this collection
  • Name (string): Display name for the collection
  • AllowOverwrites (boolean): Specifies whether source documents can be overwritten when they already exist
  • AdditionalData (string): User-supplied notes, descriptions, or additional metadata for the collection
  • CreatedUtc (datetime): Timestamp indicating when the collection was created, in UTC time
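
The field list above maps naturally onto a small typed record. The following is a minimal Python sketch, not part of the View SDK: the `Collection` dataclass and the `parse_collection` helper are hypothetical names, and the timestamp parsing assumes the `CreatedUtc` format shown in the sample object.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Collection:
    """Hypothetical local model of a Lexi collection object."""
    guid: str
    tenant_guid: str
    name: str
    allow_overwrites: bool
    additional_data: str
    created_utc: datetime


def parse_collection(raw: str) -> Collection:
    """Parse the JSON form of a collection into the dataclass above."""
    data = json.loads(raw)
    # Assumes timestamps look like "2024-07-10T05:11:51.000000Z" (UTC).
    created = datetime.strptime(data["CreatedUtc"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return Collection(
        guid=data["GUID"],
        tenant_guid=data["TenantGUID"],
        name=data["Name"],
        allow_overwrites=data["AllowOverwrites"],
        additional_data=data.get("AdditionalData", ""),
        created_utc=created.replace(tzinfo=timezone.utc),
    )


sample = """{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}"""

collection = parse_collection(sample)
print(collection.name, collection.created_utc.isoformat())
```

Keeping the wire names (`GUID`, `TenantGUID`) confined to the parser lets the rest of a client use ordinary Python attribute names.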

Create Collection

Creates a new collection in the Lexi metadata database using PUT /v1.0/tenants/[tenant-guid]/collections, with the collection's settings supplied in the JSON request body. Collections serve as logical containers for organizing and managing source documents with their associated metadata and search capabilities.

Request Parameters

Required Parameters

  • Name (string, Body, Required): Display name for the collection
  • AllowOverwrites (boolean, Body, Required): Whether source documents can be overwritten when they already exist

Optional Parameters

  • AdditionalData (string, Body, Optional): User-supplied notes or additional metadata for the collection

curl -X PUT http://localhost:8601/v1.0/tenants/[tenant-guid]/collections \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer [accesskey]" \
     -d '
{
    "Name": "My collection",
    "AllowOverwrites": true,
    "AdditionalData": "My notes"
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const createCollection = async () => {
  try {
    const response = await api.collectionsSdk.create({
      Name: "My second collection [ASH]",
      AdditionalData: "Yet another collection",
    });
    console.log(response, "Collection created successfully");
  } catch (err) {
    console.log("Error creating Collection:", err);
  }
};

createCollection();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def createCollection():
    collection = lexi.Collection.create(
        Name="My second collection",
        AdditionalData="Yet another collection",
    )
    print(collection)

createCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
Collection newCollection = new Collection
{
   Name = "My second collection [ASH]",
   AdditionalData = "Yet another collection",
};

Collection response = await sdk.Collection.Create(newCollection);

Response

Returns the created collection object with all configuration details:

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My second collection [ASH]",
    "AllowOverwrites": true,
    "AdditionalData": "Yet another collection",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}
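
Since `Name` and `AllowOverwrites` are listed as required, a client may want to assemble and check the request body before sending it. The sketch below is only an illustration: `build_create_body` is a hypothetical helper, not an SDK function, and it prepares the JSON payload without contacting the server.

```python
import json
from typing import Optional


def build_create_body(name: str, allow_overwrites: bool,
                      additional_data: Optional[str] = None) -> str:
    """Return the JSON body for a create-collection request (hypothetical helper)."""
    if not name:
        raise ValueError("Name is required")
    if not isinstance(allow_overwrites, bool):
        raise TypeError("AllowOverwrites must be a boolean")
    body = {"Name": name, "AllowOverwrites": allow_overwrites}
    # AdditionalData is optional, so include it only when supplied.
    if additional_data is not None:
        body["AdditionalData"] = additional_data
    return json.dumps(body, indent=4)


print(build_create_body("My collection", True, "My notes"))
```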

Read Collection

Retrieves a specific collection by GUID using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Returns the complete collection configuration including metadata and settings.

Request Parameters

  • collection-guid (string, Path, Required): GUID of the collection object to retrieve

Response

Returns the collection object with all configuration details if found, or a 404 Not Found error if the collection doesn't exist.

{
    "GUID": "oneminute",
    "TenantGUID": "default",
    "Name": "Every minute",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:21:00.000000Z"
}

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollection = async () => {
  try {
    const response = await api.collectionsSdk.read(
      "<collection-guid>"
    );
    console.log(response, "Collection fetched successfully");
  } catch (err) {
    console.log("Error fetching Collection:", err);
  }
};

retrieveCollection();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readCollection():
    collection = lexi.Collection.retrieve("<collection-guid>")
    print(collection)

readCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
Collection response = await sdk.Collection.Retrieve(Guid.Parse("<collection-guid>"));

Response

Returns the complete collection configuration:

{
    "GUID": "oneminute",
    "TenantGUID": "default",
    "Name": "Every minute",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:21:00.000000Z"
}

Read All Collections

Retrieves all collections for a tenant using GET /v1.0/tenants/[tenant-guid]/collections. Returns a JSON array containing all collection objects with their complete configuration details.

Request Parameters

No additional parameters required beyond authentication.

Response

Returns an array of all collection objects for the tenant, or a 404 Not Found error if no collections exist.

[
    {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    }
]

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollections = async () => {
  try {
    const response = await api.collectionsSdk.readAll();
    console.log(response, "Collection fetched successfully");
  } catch (err) {
    console.log("Error fetching Collection:", err);
  }
};

retrieveCollections();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readAllCollections():
    collections = lexi.Collection.retrieve_all()
    print(collections)

readAllCollections()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
List<Collection> response = await sdk.Collection.RetrieveMany();

Response

Returns an array of all collection objects:

[
    {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    },
    {
        "GUID": "another-collection-guid",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My second collection",
        "AllowOverwrites": false,
        "AdditionalData": "Production collection",
        "CreatedUtc": "2025-03-26T10:15:30.123456Z"
    }
]

Enumerate Collection Documents

Enumerates all documents within a collection using POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?enumerate. Supports pagination, filtering, and ordering for efficient document discovery and management.

Request Parameters

Required Parameters

  • collection-guid (string, Path, Required): GUID of the collection to enumerate documents from

Optional Parameters

  • MaxResults (integer, Body, Optional): Maximum number of documents to return (default: 1000)
  • Skip (integer, Body, Optional): Number of documents to skip for pagination (default: 0)
  • ContinuationToken (string, Body, Optional): Token for continuing pagination from previous request
  • Ordering (string, Body, Optional): Sort order for results (e.g., "CreatedDescending", "CreatedAscending")
  • Filters (array, Body, Optional): Array of filter objects for document filtering

Response

Returns a paginated enumeration result with document metadata, continuation tokens, and performance metrics.

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 91.21,
        "Messages": {
            ... log messages ...
        }
    },
    "MaxResults": 1000,
    "IterationsRequired": 1,
    "ContinuationToken": "3135b2ba-7939-4cc3-8849-bff23b27bc9a",
    "EndOfResults": false,
    "RecordsRemaining": 46,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "5.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/5.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z"
        },
        { ... }
    ]
}

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?enumerate=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 100,
    "Skip": 0,   
    "ContinuationToken": null, 
    "Ordering": "CreatedDescending",
    "Filters": [
        {
            "Field": "ObjectKey",
            "Condition": "IsNotNull",
            "Value": ""
        }
    ]
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const enumerateCollection = async () => {
  try {
    const response = await api.sourceDocumentSdk.enumerate(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: null,
        Ordering: "CreatedDescending",
        Filters: [
          {
            Field: "ObjectKey",
            Condition: "IsNotNull",
            Value: "",
          },
        ],
      }
    );
    console.log(response, "Documents enumerated successfully");
  } catch (err) {
    console.log("Error enumerating documents:", err);
  }
};

enumerateCollection();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def enumerateCollectionDocuments():
    documents = lexi.Collection.enumerate_documents(
        "<collection-guid>",
        {
            "MaxResults": 100,
            "Skip": 0,
            "ContinuationToken": None,
            "Ordering": "CreatedDescending",
            "Filters": [
                {
                    "Field": "ObjectKey",
                    "Condition": "IsNotNull",
                    "Value": "",
                }
            ],
        },
    )
    print(documents)

enumerateCollectionDocuments()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
EnumerationResult<Collection> response = await sdk.Collection.Enumerate();

Response

Returns a paginated enumeration result with document metadata:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 91.21,
        "Messages": {
            "Info": "Enumeration completed successfully",
            "Debug": "Processed 100 documents"
        }
    },
    "MaxResults": 1000,
    "IterationsRequired": 1,
    "ContinuationToken": "3135b2ba-7939-4cc3-8849-bff23b27bc9a",
    "EndOfResults": false,
    "RecordsRemaining": 46,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "5.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/5.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z"
        }
    ]
}
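
The `ContinuationToken`, `EndOfResults`, and `Objects` fields in the response suggest a simple paging loop. The following Python sketch assumes that passing the returned `ContinuationToken` in the next request body resumes the enumeration. `fetch_page` is a hypothetical callable standing in for whatever client performs the POST, and `fake_fetch` only simulates two pages so the loop can run offline.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iterate_documents(fetch_page: Callable[[Dict], Dict],
                      max_results: int = 100) -> Iterator[Dict]:
    """Yield every document by following continuation tokens (assumed semantics)."""
    token: Optional[str] = None
    while True:
        page = fetch_page({"MaxResults": max_results, "ContinuationToken": token})
        yield from page.get("Objects", [])
        token = page.get("ContinuationToken")
        # Stop when the server reports the end, or gives no token to continue with.
        if page.get("EndOfResults", True) or not token:
            break


def fake_fetch(body: Dict) -> Dict:
    """Simulated server: two pages keyed by continuation token."""
    pages = {
        None: {"Objects": [{"ObjectKey": "1.pdf"}, {"ObjectKey": "2.pdf"}],
               "ContinuationToken": "page-2", "EndOfResults": False},
        "page-2": {"Objects": [{"ObjectKey": "3.pdf"}],
                   "ContinuationToken": None, "EndOfResults": True},
    }
    return pages[body["ContinuationToken"]]


keys: List[str] = [doc["ObjectKey"] for doc in iterate_documents(fake_fetch)]
print(keys)
```

Writing the loop as a generator lets callers stop early without fetching pages they do not need.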

Read Top Terms

Retrieves the most frequently occurring terms across all documents in a collection using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/topterms?max-keys=10. Useful for content analysis, trending topics, and search optimization.

Request Parameters

  • collection-guid (string, Path, Required): GUID of the collection to analyze
  • max-keys (integer, Query, Optional): Maximum number of top terms to return (default: 10)

Response

Returns a JSON object with terms as keys and their frequency counts as values, or a 404 Not Found error if the collection doesn't exist.

{
  "yes": 152463,
  "answered": 48717,
  "apple": 20362,
  "2023": 19689,
  "compliance": 17300,
  "2022": 11238,
  "2024": 5169,
  "b2b": 2808,
  "digital": 2801,
  "products": 2800
}

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/topterms?max-keys=10' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveTopTerms = async () => {
  try {
    const response = await api.collectionsSdk.readTopTerms(
      "<collection-guid>"
    );
    console.log(response, "Top terms fetched successfully");
  } catch (err) {
    console.log("Error fetching top terms:", err);
  }
};

retrieveTopTerms();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readTopTerms():
    terms = lexi.Collection.retrieve_top_terms("<collection-guid>")
    print(terms)

readTopTerms()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
CollectionTopTerms response = await sdk.Collection.RetrieveTopTerms(Guid.Parse("<collection-guid>"), 5);

Response

Returns the most frequently occurring terms across all documents:

{
  "yes": 152463,
  "answered": 48717,
  "apple": 20362,
  "2023": 19689,
  "compliance": 17300,
  "2022": 11238,
  "2024": 5169,
  "b2b": 2808,
  "digital": 2801,
  "products": 2800
}
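
Because the response is a flat term-to-count mapping, ranking or filtering it on the client takes only a few lines. This Python sketch is illustrative: `rank_terms` is a hypothetical helper, and dropping purely numeric terms such as years is an assumption about what a caller might want, not something the API does.

```python
from typing import Dict, List, Tuple


def rank_terms(top_terms: Dict[str, int], skip_numeric: bool = False) -> List[Tuple[str, int]]:
    """Return (term, count) pairs sorted by descending count."""
    items = list(top_terms.items())
    if skip_numeric:
        # Optionally drop terms made up only of digits, e.g. "2023".
        items = [(term, count) for term, count in items if not term.isdigit()]
    return sorted(items, key=lambda pair: pair[1], reverse=True)


response = {
    "yes": 152463, "answered": 48717, "apple": 20362, "2023": 19689,
    "compliance": 17300, "2022": 11238, "2024": 5169, "b2b": 2808,
    "digital": 2801, "products": 2800,
}

for term, count in rank_terms(response, skip_numeric=True)[:3]:
    print(f"{term}: {count}")
```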

Read Statistics

Retrieves comprehensive statistics for a collection using GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]?stats=null. Provides document counts, byte totals, term counts, and other analytical metrics for collection analysis and monitoring.

Request Parameters

  • collection-guid (string, Path, Required): GUID of the collection to analyze
  • stats (string, Query, Required): Must be set to "null" to retrieve statistics

Response

Returns a JSON object containing collection metadata and comprehensive statistics including document counts, byte totals, and term analysis, or a 404 Not Found error if the collection doesn't exist.

{
    "Collection": {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    },
    "DocumentCount": 227,
    "TotalBytes": 197530629,
    "TermCount": 200697,
    "KeyValueCount": 8
}

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000?stats=null' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const retrieveCollectionStatistics = async () => {
  try {
    const response = await api.collectionsSdk.readStatistics(
      "<collection-guid>"
    );
    console.log(response, "Statistics fetched successfully");
  } catch (err) {
    console.log("Error fetching Statistics:", err);
  }
};

retrieveCollectionStatistics();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def readCollectionStatistics():
    statistics = lexi.Collection.retrieve_statistics("<collection-guid>")
    print(statistics)

readCollectionStatistics()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
CollectionStatistics response = await sdk.Collection.RetrieveStatistics(Guid.Parse("<collection-guid>"));

Response

Returns comprehensive collection statistics:

{
    "Collection": {
        "GUID": "00000000-0000-0000-0000-000000000000",
        "TenantGUID": "00000000-0000-0000-0000-000000000000",
        "Name": "My first collection",
        "AllowOverwrites": true,
        "AdditionalData": "Created by setup",
        "CreatedUtc": "2025-03-25T21:12:32.461527Z"
    },
    "DocumentCount": 227,
    "TotalBytes": 197530629,
    "TermCount": 200697,
    "KeyValueCount": 8
}
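
The counters in this response can be combined into simple derived metrics for monitoring. The Python sketch below computes the average document size from the sample values. `summarize_statistics` is a hypothetical helper, and the derived metric names are not part of the API.

```python
from typing import Dict


def summarize_statistics(stats: Dict) -> Dict[str, float]:
    """Derive simple size metrics from a statistics response (hypothetical helper)."""
    count = stats["DocumentCount"]
    total = stats["TotalBytes"]
    # Guard against dividing by zero for an empty collection.
    average = total / count if count else 0.0
    return {
        "AverageBytesPerDocument": average,
        "TotalMegabytes": total / 1_000_000,
    }


stats = {"DocumentCount": 227, "TotalBytes": 197530629,
         "TermCount": 200697, "KeyValueCount": 8}
summary = summarize_statistics(stats)
print(f"{summary['AverageBytesPerDocument']:.0f} bytes per document on average")
```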

Check Collection Existence

Checks whether a collection exists using HEAD /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Returns only HTTP status codes without response body for efficient existence verification.

Request Parameters

  • collection-guid (string, Path, Required): GUID of the collection object to check

Response

  • 200 OK: Collection exists
  • 404 Not Found: Collection does not exist
  • No response body: Only HTTP status code is returned

Note: HEAD requests do not return a response body, only the HTTP status code indicating whether the collection exists.

curl --location --head 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: Bearer [accesskey]'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const collectionExists = async () => {
  try {
    const response = await api.collectionsSdk.exists(
      "<collection-guid>"
    );
    console.log(response, "collection exists");
  } catch (err) {
    console.log("Error fetching collection:", err);
  }
};

collectionExists();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def existsCollection():
    exists = lexi.Collection.exists("<collection-guid>")
    print(exists)

existsCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
            
bool exists = await sdk.Collection.Exists(Guid.Parse("<collection-guid>"));

Response

  • 200 OK: Collection exists
  • 404 Not Found: Collection does not exist
  • No response body: Only HTTP status code is returned

Note: HEAD requests do not return a response body, only the HTTP status code indicating whether the collection exists.
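
For callers not using an SDK, the check reduces to sending a HEAD request and mapping the status code to a boolean. This Python sketch uses only the standard library. `build_exists_request` and `status_to_exists` are hypothetical helpers, the base URL and GUIDs are placeholders, and the Bearer header mirrors the create example rather than a confirmed requirement.

```python
import urllib.request


def build_exists_request(base_url: str, tenant_guid: str,
                         collection_guid: str, access_key: str) -> urllib.request.Request:
    """Build (but do not send) a HEAD request for a collection."""
    url = f"{base_url.rstrip('/')}/v1.0/tenants/{tenant_guid}/collections/{collection_guid}"
    return urllib.request.Request(
        url, method="HEAD", headers={"Authorization": f"Bearer {access_key}"}
    )


def status_to_exists(status: int) -> bool:
    """Map the documented status codes to a boolean."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"Unexpected status code: {status}")


ZERO_GUID = "00000000-0000-0000-0000-000000000000"  # placeholder GUID
request = build_exists_request("http://localhost:8000/", ZERO_GUID, ZERO_GUID, "default")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`. A 404 is raised as `urllib.error.HTTPError`, whose `code` attribute can be handed to `status_to_exists`.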

Search Collection

Searches for documents within a collection using POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search. Supports advanced filtering, term matching, date ranges, and content type filtering for comprehensive document discovery.

Request Parameters

Required Parameters

  • collection-guid (string, Path, Required): GUID of the collection to search within

Optional Parameters

  • MaxResults (integer, Body, Optional): Maximum number of search results to return
  • Skip (integer, Body, Optional): Number of results to skip for pagination
  • ContinuationToken (string, Body, Optional): Token for continuing pagination from previous search
  • Ordering (string, Body, Optional): Sort order for search results
  • Filter (object, Body, Optional): Search filter object containing various filtering criteria

Response

Returns search results with document metadata, relevance scores, and pagination information.

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      }
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.Search(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata and relevance scores:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 45.32,
        "Messages": {
            "Info": "Search completed successfully",
            "Debug": "Found 15 matching documents"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95
        }
    ]
}
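
The filter object in the request examples uses timestamps of the form `YYYY-MM-DD HH:MM:SS.ffffff`. The following Python sketch builds such a body from `datetime` values. `build_search_body` is a hypothetical helper, and its defaults are illustrative rather than server defaults.

```python
import json
from datetime import datetime
from typing import Dict, List, Optional

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # matches the request examples


def build_search_body(terms: List[str], created_after: datetime,
                      created_before: datetime, max_results: int = 100,
                      skip: int = 0, mime_types: Optional[List[str]] = None) -> Dict:
    """Assemble a search request body shaped like the examples (hypothetical helper)."""
    if created_after >= created_before:
        raise ValueError("created_after must be earlier than created_before")
    return {
        "MaxResults": max_results,
        "Skip": skip,
        "ContinuationToken": "",
        "Ordering": "CreatedDescending",
        "Filter": {
            "CreatedAfter": created_after.strftime(TIMESTAMP_FORMAT),
            "CreatedBefore": created_before.strftime(TIMESTAMP_FORMAT),
            "Terms": terms,
            "MimeTypes": mime_types or [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    }


body = build_search_body(["foo"], datetime(2024, 1, 1), datetime(2025, 1, 1),
                         max_results=2, skip=1)
print(json.dumps(body, indent=4))
```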

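Search Collection and Include Data

To search a collection and include each matching document's processed data in the results, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&incldata with the same request body as a plain search.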

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&incldata=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer [accesskey]' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'

import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      true
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();

import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        include_data=True,
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()

using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchIncludeData(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata, relevance scores, and processed content data:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 78.45,
        "Messages": {
            "Info": "Search with data completed successfully",
            "Debug": "Found 15 matching documents with content"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95,
            "UdrDocument": {
                "Success": true,
                "Key": "document.pdf:1",
                "Terms": ["view", "platform", "search", "document"],
                "TopTerms": {
                    "view": 15,
                    "platform": 12,
                    "search": 8,
                    "document": 6
                }
            }
        }
    ]
}
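
With `incldata`, each result carries a `UdrDocument` object alongside its metadata. As a rough illustration, this Python sketch aggregates the per-document `TopTerms` across a result set. `merge_top_terms` is a hypothetical helper, it assumes every `UdrDocument` has the shape shown in the sample response, and the second and third entries in `result` are made-up test data.

```python
from collections import Counter
from typing import Dict


def merge_top_terms(search_result: Dict) -> Counter:
    """Sum per-document TopTerms counts across all returned objects."""
    totals: Counter = Counter()
    for obj in search_result.get("Objects", []):
        # Results without document data are skipped rather than treated as errors.
        udr = obj.get("UdrDocument") or {}
        totals.update(udr.get("TopTerms", {}))
    return totals


result = {
    "Objects": [
        {"ObjectKey": "document.pdf",
         "UdrDocument": {"TopTerms": {"view": 15, "platform": 12, "search": 8, "document": 6}}},
        {"ObjectKey": "other.pdf",
         "UdrDocument": {"TopTerms": {"view": 5, "search": 2}}},
        {"ObjectKey": "no-data.pdf"},
    ]
}

print(merge_top_terms(result).most_common(2))
```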

Search Collection and Include Top Terms

To search a collection and include top terms in the results, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&incltopterms with the same request body as a plain search.

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&incltopterms=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'
import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 100,
        Skip: 0,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2025-01-01 00:00:00.000000",
          CreatedBefore: "2026-01-01 00:00:00.000000",
          Terms: ["view"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      undefined,
      true
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();
import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        include_top_terms=True,
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()
using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchIncludeTopTerms(Guid.Parse("<collection-guid>"), request);

Response

Returns search results with document metadata, relevance scores, and top terms analysis:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 62.18,
        "Messages": {
            "Info": "Search with top terms completed successfully",
            "Debug": "Found 15 matching documents with term analysis"
        }
    },
    "MaxResults": 100,
    "TotalResults": 15,
    "ContinuationToken": "search-continuation-token",
    "EndOfResults": true,
    "TopTerms": {
        "view": 152463,
        "platform": 48717,
        "search": 20362,
        "document": 19689,
        "analysis": 17300
    },
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "document.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/document.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z",
            "Score": 0.95
        }
    ]
}
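
The collection-level TopTerms map in the response above lends itself to a quick ranking. The sketch below computes each term's share of the returned counts; it assumes the response is already parsed into a dictionary, and the helper name term_shares is hypothetical. The shares are relative to the returned top terms only, not to every term in the collection.

```python
from typing import Any


def term_shares(result: dict[str, Any], limit: int = 3) -> list[tuple[str, float]]:
    """Return the `limit` most frequent terms with their share of the returned counts.

    Hypothetical helper: reads only the top-level TopTerms field of a search response.
    """
    top = result.get("TopTerms") or {}
    total = sum(top.values())
    if total == 0:
        return []
    ranked = sorted(top.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    return [(term, count / total) for term, count in ranked]


# TopTerms copied from the sample response above.
response = {
    "TopTerms": {
        "view": 152463,
        "platform": 48717,
        "search": 20362,
        "document": 19689,
        "analysis": 17300,
    }
}

for term, share in term_shares(response):
    print(f"{term}: {share:.1%}")
```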

Search collection and emit result

To search a collection and emit the result asynchronously, call POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?search&async

curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/documents?search=null&async=null' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "MaxResults": 2,
    "Skip": 1,
    "ContinuationToken": "",
    "Ordering": "CreatedDescending",
    "Filter": {
        "CreatedAfter": "2024-01-01 00:00:00.000000",
        "CreatedBefore": "2025-01-01 00:00:00.000000",
        "Terms": [
            "foo"
        ],
        "MimeTypes": [
        ],
        "Prefixes": [
        ],
        "Suffixes": [
        ],
        "SchemaFilters": [
        ]
    }
}'
import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const searchCollection = async () => {
  try {
    const response = await api.searchCollectionSdk.searchDocuments(
      "<collection-guid>",
      {
        MaxResults: 2,
        Skip: 1,
        ContinuationToken: "",
        Ordering: "CreatedDescending",
        Filter: {
          CreatedAfter: "2024-01-01 00:00:00.000000",
          CreatedBefore: "2025-01-01 00:00:00.000000",
          Terms: ["foo"],
          MimeTypes: [],
          Prefixes: [],
          Suffixes: [],
          SchemaFilters: [],
        },
      },
      undefined,
      undefined,
      true
    );
    console.log(response, "Collections searched successfully");
  } catch (err) {
    console.log("Error searching Collections:", err);
  }
};

searchCollection();
import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def searchCollection():
    search = lexi.Collection.search(
        "<collection-guid>",
        emit_results=True,
        MaxResults=2,
        Skip=1,
        ContinuationToken="",
        Ordering="CreatedDescending",
        Filter={
            "CreatedAfter": "2024-01-01 00:00:00.000000",
            "CreatedBefore": "2025-01-01 00:00:00.000000",
            "Terms": ["foo"],
            "MimeTypes": [],
            "Prefixes": [],
            "Suffixes": [],
            "SchemaFilters": [],
        },
    )
    print(search)

searchCollection()
using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

CollectionSearchRequest request  = new CollectionSearchRequest
{
  MaxResults = 100,
  Skip = 0,
  ContinuationToken = string.Empty,
  Ordering = EnumerationOrderEnum.CreatedDescending,
  Filter = new QueryFilter
  {
    CreatedAfter = DateTime.Parse("2025-01-01T00:00:00Z"),
    CreatedBefore = DateTime.Parse("2026-01-01T00:00:00Z"),
    Terms = new List<string> { "view" },
    MimeTypes = new List<string>(),
    Prefixes = new List<string>(),
    Suffixes = new List<string>(),
    SchemaFilters = new List<string>()
  }
};
       
SearchResult response = await sdk.Search.SearchAsync(Guid.Parse("<collection-guid>"), request);

Response

Returns an async search result with operation tracking:

{
    "Success": true,
    "AsyncOperationId": "async-search-operation-guid",
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 12.34,
        "Messages": {
            "Info": "Async search initiated successfully",
            "Debug": "Search operation queued for processing"
        }
    },
    "Status": "Queued",
    "EstimatedCompletionTime": "2024-10-27T06:02:00.000000Z",
    "ProgressUrl": "/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000/search/async/async-search-operation-guid"
}
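
The ProgressUrl in the response above is a server-relative path. A client that wants to follow up on the operation would first need an absolute URL; the sketch below resolves one against the server's base URL using the standard library. The helper name progress_url is hypothetical, and whether the resulting endpoint can be polled, and what it returns, is an assumption that this guide does not confirm.

```python
from typing import Any, Optional
from urllib.parse import urljoin


def progress_url(base_url: str, response: dict[str, Any]) -> Optional[str]:
    """Resolve the relative ProgressUrl of an async search response against base_url.

    Hypothetical helper: returns None when the request failed or no ProgressUrl is present.
    """
    if not response.get("Success") or "ProgressUrl" not in response:
        return None
    return urljoin(base_url, response["ProgressUrl"])


# Trimmed copy of the sample response above, with placeholder GUIDs.
queued = {
    "Success": True,
    "AsyncOperationId": "async-search-operation-guid",
    "Status": "Queued",
    "ProgressUrl": "/v1.0/tenants/<tenant-guid>/collections/<collection-guid>/search/async/async-search-operation-guid",
}

print(progress_url("http://localhost:8000/", queued))
```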

Delete Collection

Deletes a collection by GUID using DELETE /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. Collections must be empty before deletion to prevent data loss.

Request Parameters

  • collection-guid (string, Path, Required): GUID of the collection object to delete

Response

  • 200 OK: Collection deleted successfully
  • 400 Bad Request: Collection is not empty and cannot be deleted
  • 404 Not Found: Collection does not exist

Note: Collections must be empty (contain no documents) before they can be deleted. If the collection contains documents, a 400 Bad Request error will be returned.
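
Client code typically needs to distinguish these three outcomes. The sketch below maps the documented status codes to short messages; the function name describe_delete is hypothetical, and any status code not listed above is reported as unexpected rather than interpreted.

```python
# Status codes documented for DELETE on a collection.
DELETE_OUTCOMES = {
    200: "Collection deleted successfully",
    400: "Collection is not empty and cannot be deleted",
    404: "Collection does not exist",
}


def describe_delete(status_code: int) -> str:
    """Translate a DELETE response status into a message (hypothetical helper)."""
    return DELETE_OUTCOMES.get(status_code, f"Unexpected status code: {status_code}")


print(describe_delete(400))
```

On a 400 response, remove the documents in the collection first and then retry the delete.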

curl --location --request DELETE 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/collections/00000000-0000-0000-0000-000000000000' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data ''
import { ViewLexiSdk } from "view-sdk";

const api = new ViewLexiSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const deleteCollection = async () => {
  try {
    const response = await api.collectionsSdk.delete(
      "<collection-guid>"
    );
    console.log(response, "Collection deleted successfully");
  } catch (err) {
    console.log("Error deleting Collection:", err);
  }
};

deleteCollection();
import view_sdk
from view_sdk import lexi

sdk = view_sdk.configure(access_key="default", base_url="localhost", tenant_guid="<tenant-guid>")

def deleteCollection():
    response = lexi.Collection.delete("<collection-guid>")
    print(response)

deleteCollection()
using View.Sdk;
using View.Sdk.Lexi;

ViewLexiSdk sdk = new ViewLexiSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
        
bool deleted = await sdk.Collection.Delete(Guid.Parse("<collection-guid>"));

Response

Returns 200 OK on successful deletion. No response body is returned.

Best Practices

When managing collections in the Lexi metadata database, consider the following recommendations for optimal document organization, search performance, and data management:

  • Collection Organization: Create logical collections based on document types, sources, or business domains for better organization
  • Search Optimization: Use appropriate filters and ordering parameters to optimize search performance and reduce response times
  • Document Management: Regularly monitor collection statistics and clean up unused or outdated documents to maintain performance
  • Security Configuration: Implement proper access controls and ensure collections are properly secured within tenant boundaries
  • Performance Monitoring: Monitor collection statistics, document counts, and search performance to identify optimization opportunities

Next Steps

After successfully managing collections, you can:

  • Source Documents: Upload and manage source documents within your collections for content processing
  • Search Operations: Implement advanced search functionality using the comprehensive filtering and search capabilities
  • Metadata Analysis: Analyze collection statistics and top terms to gain insights into your document content
  • Integration: Integrate collections with other View platform services for comprehensive data processing workflows
  • Document Processing: Set up automated document processing pipelines using collections as organizational containers