This page provides an overview of collection-related APIs.

Object Overview

Collections provide a logical grouping of source documents within the Lexi data catalog and search platform.

Endpoint, URL, and Supported Methods

Collections are managed via the Lexi server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/collections

By default, the Lexi server is accessible on port 8201.

Supported methods include: GET, HEAD, PUT, and DELETE.
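As an illustration of the URL pattern above, the following Python sketch assembles the collections URL from its parts. The function name `collections_url` and its parameters are hypothetical helpers, not part of the Lexi API; only the path layout and the default port 8201 come from this page.

```python
from typing import Optional
from urllib.parse import quote


def collections_url(hostname: str, tenant_guid: str,
                    collection_guid: Optional[str] = None,
                    port: int = 8201, scheme: str = "http") -> str:
    """Build the collections URL, optionally for a single collection.

    Hypothetical helper: the path layout follows this page, the
    function itself is only an illustration.
    """
    base = (f"{scheme}://{hostname}:{port}"
            f"/v1.0/tenants/{quote(tenant_guid, safe='')}/collections")
    if collection_guid is None:
        return base
    return f"{base}/{quote(collection_guid, safe='')}"


# URL for all collections in the tenant "default"
all_collections = collections_url("localhost", "default")
# URL for the single collection whose GUID is "default"
one_collection = collections_url("localhost", "default", "default")
```

The GUIDs are percent-encoded so that an unusual identifier cannot alter the path.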

Structure

Objects have the following structure:

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}

Properties:

  • GUID (string): globally unique identifier for the object
  • TenantGUID (string): globally unique identifier for the tenant
  • Name (string): name of the object
  • AllowOverwrites (bool): whether a source document can be overwritten
  • AdditionalData (string): user-supplied notes or additional data
  • CreatedUtc (datetime): creation timestamp, in UTC
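One way a client might model this object is sketched below in Python. The `Collection` dataclass and `parse_collection` function are hypothetical names chosen for illustration; the field names and the timestamp layout are taken from the sample object above.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Collection:
    """Client-side view of a collection (illustrative, not an official type)."""
    guid: str
    tenant_guid: str
    name: str
    allow_overwrites: bool
    additional_data: str
    created_utc: datetime


def parse_collection(raw: str) -> Collection:
    """Parse a collection JSON document into a Collection instance."""
    data = json.loads(raw)
    # Timestamp layout as shown in the sample: 2024-07-10T05:11:51.000000Z
    created = datetime.strptime(data["CreatedUtc"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return Collection(
        guid=data["GUID"],
        tenant_guid=data["TenantGUID"],
        name=data["Name"],
        allow_overwrites=data["AllowOverwrites"],
        # Assumption: AdditionalData may be absent, so default to empty
        additional_data=data.get("AdditionalData", ""),
        created_utc=created.replace(tzinfo=timezone.utc),
    )


sample = """
{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}
"""
collection = parse_collection(sample)
```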

Create

To create a collection, call PUT /v1.0/tenants/[tenant-guid]/collections on the Lexi server with the following properties: Name, AllowOverwrites, and AdditionalData.

curl -X PUT http://localhost:8201/v1.0/tenants/[tenant-guid]/collections \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer [accesskey]" \
     -d '
{
    "Name": "My collection",
    "AllowOverwrites": true,
    "AdditionalData": "My notes"
}'
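The same request can be sketched in Python with the standard library. This mirrors the curl command above (PUT to the collections URL, JSON body, bearer token) and assumes the default port 8201 stated earlier. The function `build_create_request` and the access key value are hypothetical. The request is only constructed here; sending it is left as a commented step because the response format is not described on this page.

```python
import json
import urllib.request


def build_create_request(base_url: str, tenant_guid: str, access_key: str,
                         name: str, allow_overwrites: bool = True,
                         additional_data: str = "") -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a collection."""
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/collections"
    body = json.dumps({
        "Name": name,
        "AllowOverwrites": allow_overwrites,
        "AdditionalData": additional_data,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_key}",
        },
    )


request = build_create_request(
    "http://localhost:8201",  # assumed local server on the default port
    "default",                # tenant GUID
    "my-access-key",          # placeholder access key
    "My collection",
    allow_overwrites=True,
    additional_data="My notes",
)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```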

Read

To read all collections, call GET /v1.0/tenants/[tenant-guid]/collections. This API will return a JSON array.

To read a collection by GUID, call GET /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. If the object exists, it will be returned as a JSON object in the response body. If it does not exist, a 404 will be returned with a NotFound error response.

{
    "GUID": "default",
    "TenantGUID": "default",
    "Name": "My first collection",
    "AllowOverwrites": true,
    "AdditionalData": "Created by setup",
    "CreatedUtc": "2024-07-10T05:11:51.000000Z"
}

Note: the HEAD method can be used as an alternative to GET to check whether the object exists. A HEAD request returns 200/OK if the object exists or 404/Not Found if it does not. No response body is returned with a HEAD request.
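The existence check can be sketched in Python as follows. The function names are hypothetical, and sending the same bearer token as in the create example is an assumption. The status handling follows the note above: 200 means the object exists, 404 means it does not, and anything else is treated as unexpected.

```python
import urllib.error
import urllib.request


def exists_from_status(status: int) -> bool:
    """Map a HEAD response status to an existence flag."""
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected HEAD status: {status}")


def collection_exists(url: str, access_key: str) -> bool:
    """Send a HEAD request to a collection URL and report whether it exists."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        # Assumption: reads use the same bearer token as the create example
        headers={"Authorization": f"Bearer {access_key}"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return exists_from_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the status is in error.code
        return exists_from_status(error.code)
```

Keeping the status mapping in its own function makes it possible to exercise the logic without a running server.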

Enumerate Documents

Enumerate the documents in a collection by calling POST /v1.0/tenants/[tenant-guid]/collections/[collection-guid]/documents?enumerate. The response will appear as:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-27T06:01:17.502560Z",
        "TotalMs": 91.21,
        "Messages": {
            ... log messages ...
        }
    },
    "MaxResults": 1000,
    "IterationsRequired": 1,
    "ContinuationToken": "3135b2ba-7939-4cc3-8849-bff23b27bc9a",
    "EndOfResults": false,
    "RecordsRemaining": 46,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "TenantGUID": "default",
            "BucketGUID": "example-data-bucket",
            "CollectionGUID": "default",
            "ObjectGUID": "f615ac92-d1d1-4b46-8cc5-acf721131067",
            "ObjectKey": "5.pdf",
            "ObjectVersion": "1",
            "ContentType": "application/pdf",
            "DocumentType": "Pdf",
            "SourceUrl": "http://dcc249eaaf06:8001/v1.0/tenants/default/buckets/example-data-bucket/objects/5.pdf",
            "ContentLength": 31811,
            "MD5Hash": "DC477A85FF3882BBFDEB03D7B79ECC9E",
            "SHA1Hash": "CC5D85073F193A578F97D46B8A6E4CE946270B5F",
            "SHA256Hash": "E5285C6023A46E4E8917C67CCB56B91FED2E578A7AA3129680012C029868B321",
            "CreatedUtc": "2024-10-25T14:14:22.000000Z"
        },
        { ... }
    ]
}
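A client will typically want the documents on the current page and an indication of whether more remain. The Python sketch below reads those from an enumeration result. The function `read_page` is hypothetical, and the meanings of `Success`, `EndOfResults`, and `ContinuationToken` are inferred from their names. How the token is supplied on a follow-up request is not described on this page, so the sketch stops at extracting it.

```python
from typing import Any, Dict, List, Optional, Tuple


def read_page(result: Dict[str, Any]) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Return the page's documents and the token for the next page, if any."""
    if not result.get("Success", False):
        raise RuntimeError("enumeration did not succeed")
    documents = result.get("Objects", [])
    # Assumption: EndOfResults true means there is nothing left to fetch
    if result.get("EndOfResults", True):
        return documents, None
    return documents, result.get("ContinuationToken")


# Trimmed version of the sample response above
page = {
    "Success": True,
    "MaxResults": 1000,
    "ContinuationToken": "3135b2ba-7939-4cc3-8849-bff23b27bc9a",
    "EndOfResults": False,
    "RecordsRemaining": 46,
    "Objects": [
        {
            "GUID": "1fdbe0c8-8b85-4b0e-ac42-dd4757684a9f",
            "ObjectKey": "5.pdf",
            "ContentType": "application/pdf",
            "ContentLength": 31811,
        }
    ],
}

documents, next_token = read_page(page)
```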

Delete

To delete a collection by GUID, call DELETE /v1.0/tenants/[tenant-guid]/collections/[collection-guid]. If the collection is not empty, a 400 will be returned.
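A minimal Python sketch of the delete call follows. The names `CollectionNotEmptyError`, `check_delete_status`, and `delete_collection` are hypothetical. Treating any 2xx status as success and sending a bearer token are assumptions. Only the 400 response for a non-empty collection comes from this page.

```python
import urllib.error
import urllib.request


class CollectionNotEmptyError(Exception):
    """Raised when the server rejects a delete with a 400."""


def check_delete_status(status: int) -> None:
    """Raise if a DELETE response status indicates failure."""
    if status == 400:
        raise CollectionNotEmptyError("collection is not empty")
    # Assumption: any 2xx status means the delete succeeded
    if not 200 <= status < 300:
        raise RuntimeError(f"unexpected DELETE status: {status}")


def delete_collection(url: str, access_key: str) -> None:
    """Send a DELETE request to a collection URL."""
    request = urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {access_key}"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            check_delete_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the status is in error.code
        check_delete_status(error.code)
```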