This page covers configuration and management of View metadata rule objects.

Object Overview

Metadata rules define how metadata is generated and where resultant metadata is stored within View.

Endpoint, URL, and Supported Methods

Objects are managed via the configuration server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/metadatarules

By default, the configuration server is accessible on port 8601.

Supported methods include: GET, HEAD, PUT, and DELETE.
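
As an illustration, the URL pattern above can be assembled in a few lines of Python. This is a minimal sketch rather than an official client: the function name metadata_rules_url and its parameters are hypothetical, and only the path layout and the default port 8601 come from this page.

```python
from urllib.parse import quote


def metadata_rules_url(hostname, tenant_guid, rule_guid=None,
                       scheme="http", port=8601, version="v1.0"):
    """Build the metadata rules URL for a tenant, optionally for one rule.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Percent-encode the GUIDs so unusual characters cannot alter the path.
    url = (f"{scheme}://{hostname}:{port}/{version}"
           f"/tenants/{quote(tenant_guid, safe='')}/metadatarules")
    if rule_guid is not None:
        url += "/" + quote(rule_guid, safe="")
    return url


print(metadata_rules_url("localhost", "default"))
print(metadata_rules_url("localhost", "default", "example-metadata-rule"))
```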

Structure

Objects have the following structure:

{
    "GUID": "example-metadata-rule",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "example-metadata-rule",
    "ContentType": "*",
    "MaxContentLength": 134217728,
    "ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
    "CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
    "TypeDetectorEndpoint": "http://nginx-orchestrator:8501/processor/typedetector",
    "SemanticCellEndpoint": "http://nginx-semcell:8341/",
    "MinChunkContentLength": 2,
    "MaxChunkContentLength": 2048,
    "ShiftSize": 1920,
    "UdrEndpoint": "http://nginx-processor:8321/",
    "TopTerms": 25,
    "CaseInsensitive": true,
    "IncludeFlattened": true,
    "DataCatalogEndpoint": "http://nginx-lexi:8201/",
    "DataCatalogType": "Lexi",
    "DataCatalogCollection": "default",
    "GraphRepositoryGUID": "example-graph-repository",
    "TargetBucketGUID": "example-udr-bucket",
    "CreatedUtc": "2024-07-10T05:09:32.000000Z"
}

Properties:

  • GUID (string): globally unique identifier for the object
  • TenantGUID (string): globally unique identifier for the tenant
  • BucketGUID (string): GUID of the bucket to which this metadata rule should be associated
  • OwnerGUID (string): GUID of the user to which this rule should be attributed
  • Name (string): name of the object
  • ContentType (string): content-type on which this rule should match
  • MaxContentLength (int): maximum content length on which this rule should match
  • ProcessingEndpoint (string): the URL to be used to generate metadata for matching objects
  • CleanupEndpoint (string): the URL to be used should a matching object be deleted
  • TypeDetectorEndpoint (string): the URL to be used to identify the content-type of data to be processed
  • SemanticCellEndpoint (string): the URL to be used to extract semantic cells from the data to be processed
  • MinChunkContentLength (int): the minimum chunk content length
  • MaxChunkContentLength (int): the maximum chunk content length
  • ShiftSize (int): the number of bytes to shift while extracting content
  • UdrEndpoint (string): the URL to use while generating UDR for content
  • TopTerms (int): the number of top terms to extract
  • CaseInsensitive (bool): Boolean flag to indicate whether or not case-insensitive text extraction should be used
  • IncludeFlattened (bool): Boolean flag to indicate whether or not a flattened representation of the object should be produced
  • DataCatalogEndpoint (string): the URL for the data catalog, typically Lexi, to which results should be emitted
  • DataCatalogType (enum): type of data catalog, one of: Lexi
  • DataCatalogCollection (string): the name of the data catalog collection
  • GraphRepositoryGUID (string): the GUID of the graph repository to which document metadata should be emitted
  • TargetBucketGUID (string): the GUID of the bucket to which metadata results should be emitted
  • CreatedUtc (datetime): timestamp from creation, in UTC time
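
The relationship between MaxChunkContentLength, MinChunkContentLength, and ShiftSize can be pictured as a sliding window over the content. The Python sketch below is only one plausible reading of the property descriptions above, not View's actual chunking logic: the function chunk_windows and the rule of discarding chunks shorter than the minimum are assumptions made for illustration.

```python
def chunk_windows(data, max_len, shift, min_len=1):
    """Split bytes into windows of at most max_len, advancing by shift bytes.

    Assumed behavior for illustration only; View's real chunking may differ.
    """
    if max_len <= 0 or shift <= 0:
        raise ValueError("max_len and shift must be positive")
    chunks = []
    for start in range(0, len(data), shift):
        chunk = data[start:start + max_len]
        # Assumption: chunks shorter than the minimum length are skipped.
        if len(chunk) >= min_len:
            chunks.append(chunk)
        # Stop once a window has reached the end of the data.
        if start + max_len >= len(data):
            break
    return chunks


sample = bytes(range(10))
for window in chunk_windows(sample, max_len=4, shift=3, min_len=2):
    print(list(window))
```

Under this reading, the values in the example object (MaxChunkContentLength of 2048 and ShiftSize of 1920) would make consecutive chunks overlap by 128 bytes.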

Create

To create, call PUT /v1.0/tenants/[tenant-guid]/metadatarules using the configuration server, with the following properties in the request body: BucketGUID, OwnerGUID, Name, ContentType, ProcessingEndpoint, CleanupEndpoint, TypeDetectorEndpoint, UdrEndpoint, SemanticCellEndpoint, MaxChunkContentLength, MinChunkContentLength, ShiftSize, TopTerms, CaseInsensitive, IncludeFlattened, DataCatalogEndpoint, DataCatalogType, DataCatalogCollection, and MaxContentLength.

curl -X PUT http://localhost:8601/v1.0/tenants/[tenant-guid]/metadatarules \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer [accesskey]" \
     -d '
{
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "Metadata rule",
    "ContentType": "*",
    "ProcessingEndpoint": "http://localhost:8501/processor",
    "CleanupEndpoint": "http://localhost:8501/processor/cleanup",
    "TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
    "UdrEndpoint": "http://localhost:8321/",
    "SemanticCellEndpoint": "http://localhost:8341/",
    "MaxChunkContentLength": 512,
    "ShiftSize": 512,
    "TopTerms": 25,
    "CaseInsensitive": true,
    "IncludeFlattened": true,
    "DataCatalogEndpoint": "http://localhost:8201/",
    "DataCatalogType": "Lexi",
    "DataCatalogCollection": "default",
    "TargetBucketGUID": "example-udr-bucket",
    "MaxContentLength": 16777216
}'
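
The same request can be prepared from Python using only the standard library. This is a sketch under assumptions: build_create_request is a hypothetical helper, the base URL, tenant GUID, and access key are placeholders, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_create_request(base_url, tenant_guid, access_key, rule):
    """Prepare (but do not send) the PUT request that creates a metadata rule.

    Hypothetical helper; `rule` is a dict holding the properties listed above.
    """
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/metadatarules"
    body = json.dumps(rule).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_key}",
        },
    )


request = build_create_request(
    "http://localhost:8601",
    "default",        # placeholder tenant GUID
    "my-access-key",  # placeholder access key
    {"BucketGUID": "example-data-bucket", "Name": "Metadata rule", "ContentType": "*"},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), which raises HTTPError on 4xx/5xx.
```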

Enumerate

Refer to the Enumeration page in REST API for details about the use of enumeration APIs.

Enumerate objects by using GET /v2.0/tenants/[tenant-guid]/metadatarules. The resultant object will appear as:

{
    "Success": true,
    "Timestamp": {
        "Start": "2024-10-21T02:36:37.677751Z",
        "TotalMs": 23.58,
        "Messages": {}
    },
    "MaxResults": 10,
    "IterationsRequired": 1,
    "EndOfResults": true,
    "RecordsRemaining": 16,
    "Objects": [
        {
            "GUID": "example-metadatarule",
            ... metadatarule details ...
        },
        { ... }
    ],
    "ContinuationToken": "[continuation-token]"
}
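
A client typically walks the result pages until EndOfResults is true, carrying the ContinuationToken forward. The Python sketch below shows that loop in isolation. How the token is passed back to the server is not specified on this page, so the hypothetical fetch_page callable is left to the caller; it takes the previous token (or None for the first page) and returns the parsed response.

```python
def enumerate_all(fetch_page):
    """Collect objects from every page of an enumeration result.

    fetch_page(token) is a caller-supplied (hypothetical) function that
    returns one parsed enumeration response as a dict.
    """
    objects = []
    token = None
    while True:
        page = fetch_page(token)
        if not page.get("Success", False):
            raise RuntimeError("enumeration request reported failure")
        objects.extend(page.get("Objects", []))
        if page.get("EndOfResults", True):
            return objects
        token = page.get("ContinuationToken")
        if not token:
            # No token to continue with; stop rather than loop forever.
            return objects
```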

Read

To read an object by GUID, call GET /v1.0/tenants/[tenant-guid]/metadatarules/[metadatarule-guid]. If the object exists, it will be returned as a JSON object in the response body. If it does not exist, a 404 will be returned with a NotFound error response.

{
    "GUID": "example-metadata-rule",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "example-metadata-rule",
    "ContentType": "*",
    "MaxContentLength": 134217728,
    "ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
    "CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
    "TypeDetectorEndpoint": "http://nginx-orchestrator:8501/processor/typedetector",
    "SemanticCellEndpoint": "http://nginx-semcell:8341/",
    "MinChunkContentLength": 2,
    "MaxChunkContentLength": 2048,
    "ShiftSize": 1920,
    "UdrEndpoint": "http://nginx-processor:8321/",
    "TopTerms": 25,
    "CaseInsensitive": true,
    "IncludeFlattened": true,
    "DataCatalogEndpoint": "http://nginx-lexi:8201/",
    "DataCatalogType": "Lexi",
    "DataCatalogCollection": "default",
    "GraphRepositoryGUID": "example-graph-repository",
    "TargetBucketGUID": "example-udr-bucket",
    "CreatedUtc": "2024-07-10T05:09:32.000000Z"
}

Note: the HEAD method can be used instead of GET to check only whether the object exists. A HEAD request returns 200/OK if the object exists, or 404/Not Found if it does not. No response body is returned with a HEAD request.
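
The existence check described above might look like the following in Python. This is a sketch, not an official client: build_head_request and rule_exists are hypothetical names, and sending the same Authorization header used in the create example is an assumption.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_head_request(base_url, tenant_guid, rule_guid, access_key):
    """Prepare a HEAD request for a single metadata rule (hypothetical helper)."""
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/metadatarules/{rule_guid}"
    # Assumption: HEAD requests carry the same bearer token as other calls.
    return Request(url, method="HEAD",
                   headers={"Authorization": f"Bearer {access_key}"})


def rule_exists(request, opener=urlopen):
    """Return True on 200, False on 404, and re-raise any other HTTP error."""
    try:
        with opener(request) as response:
            return response.status == 200
    except HTTPError as err:
        if err.code == 404:
            return False
        raise
```

The opener parameter defaults to urllib's urlopen and exists only so the function can be exercised without a running server.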

Update

To update an object by GUID, call PUT /v1.0/tenants/[tenant-guid]/metadatarules/[metadatarule-guid] with a fully populated object in the request body. The updated object will be returned to you.

Note: certain fields cannot be modified and will be preserved across updates.

Request body:

{
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "Updated metadata rule",
    "ContentType": "*",
    "MaxContentLength": 134217728,
    "ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
    "CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
    "TypeDetectorEndpoint": "http://nginx-orchestrator:8501/processor/typedetector",
    "SemanticCellEndpoint": "http://nginx-semcell:8341/",
    "MinChunkContentLength": 2,
    "MaxChunkContentLength": 2048,
    "ShiftSize": 1920,
    "UdrEndpoint": "http://nginx-processor:8321/",
    "TopTerms": 25,
    "CaseInsensitive": true,
    "IncludeFlattened": true,
    "DataCatalogEndpoint": "http://nginx-lexi:8201/",
    "DataCatalogType": "Lexi",
    "DataCatalogCollection": "default",
    "GraphRepositoryGUID": "example-graph-repository",
    "TargetBucketGUID": "example-udr-bucket",
    "CreatedUtc": "2024-07-10T05:09:32.000000Z"
}

Response body:

{
    "GUID": "example-metadata-rule",
    "TenantGUID": "default",
    "BucketGUID": "example-data-bucket",
    "OwnerGUID": "default",
    "Name": "Updated metadata rule",
    "ContentType": "*",
    "MaxContentLength": 134217728,
    "ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
    "CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
    "TypeDetectorEndpoint": "http://nginx-orchestrator:8501/processor/typedetector",
    "SemanticCellEndpoint": "http://nginx-semcell:8341/",
    "MinChunkContentLength": 2,
    "MaxChunkContentLength": 2048,
    "ShiftSize": 1920,
    "UdrEndpoint": "http://nginx-processor:8321/",
    "TopTerms": 25,
    "CaseInsensitive": true,
    "IncludeFlattened": true,
    "DataCatalogEndpoint": "http://nginx-lexi:8201/",
    "DataCatalogType": "Lexi",
    "DataCatalogCollection": "default",
    "GraphRepositoryGUID": "example-graph-repository",
    "TargetBucketGUID": "example-udr-bucket",
    "CreatedUtc": "2024-07-10T05:09:32.000000Z"
}
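
Because the update call expects a fully populated object, a common pattern is to read the current object, change only the fields of interest, and send the whole result back. The Python sketch below covers the merge step only. build_update_body is a hypothetical helper, and rejecting unknown property names is an illustrative safeguard rather than documented server behavior.

```python
import json


def build_update_body(current, changes):
    """Return a JSON body containing every current field plus the changes.

    Hypothetical helper; `current` is the object returned by a prior read.
    """
    unknown = set(changes) - set(current)
    if unknown:
        # Guard against typos in property names (illustrative choice).
        raise KeyError(f"unknown properties: {sorted(unknown)}")
    updated = {**current, **changes}  # leaves `current` untouched
    return json.dumps(updated, indent=4)


current = {"GUID": "example-metadata-rule", "Name": "example-metadata-rule", "TopTerms": 25}
print(build_update_body(current, {"Name": "Updated metadata rule"}))
```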

Delete

To delete an object by GUID, call DELETE /v1.0/tenants/[tenant-guid]/metadatarules/[metadatarule-guid].