This page covers configuration and management of View crawl operation objects.
Object Overview
A crawl operation holds the metadata for a single invocation of a crawl plan.
Endpoint, URL, and Supported Methods
Objects are managed via the crawler server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/crawloperations
By default, the crawler server is accessible on port 8101.
Supported methods include: GET, HEAD, DELETE
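As an illustration of the URL pattern above, the following minimal Python sketch assembles the collection URL and, optionally, the URL of a single object. The function name, its parameters, and the `localhost` hostname are hypothetical and chosen only for this example; the path and the default port come from this page.

```python
from urllib.parse import quote

DEFAULT_CRAWLER_PORT = 8101  # default crawler server port stated on this page


def crawl_operations_url(hostname, tenant_guid, port=DEFAULT_CRAWLER_PORT,
                         scheme="http", operation_guid=None):
    """Build the collection URL, or a single-object URL when a GUID is given."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    # Percent-encode path segments so unusual GUID strings cannot break the path.
    url = (f"{scheme}://{hostname}:{port}"
           f"/v1.0/tenants/{quote(tenant_guid, safe='')}/crawloperations")
    if operation_guid is not None:
        url += "/" + quote(operation_guid, safe="")
    return url


# "localhost" is a placeholder hostname for this example.
print(crawl_operations_url("localhost", "default"))
# prints http://localhost:8101/v1.0/tenants/default/crawloperations
```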
Structure
Objects have the following structure:
{
"GUID": "9ced1af3-2e19-4cd6-81ce-a8c90a9ed32d",
"TenantGUID": "default",
"CrawlPlanGUID": "e9a7d61e-7cbd-46e4-9956-533e22008978",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"DataRepositoryGUID": "1a56c067-9e6d-4f7b-85bf-eb6b04aeda3f",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
"CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
"Name": "Alienware CIFS (started 2024-10-25T22:29:31 UTC)",
"ObjectsEnumerated": 123,
"BytesEnumerated": 61052490,
"ObjectsAdded": 0,
"BytesAdded": 0,
"ObjectsUpdated": 0,
"BytesUpdated": 0,
"ObjectsDeleted": 0,
"BytesDeleted": 0,
"ObjectsSuccess": 0,
"BytesSuccess": 0,
"ObjectsFailed": 2,
"BytesFailed": 16384,
"State": "Success",
"CreatedUtc": "2024-10-25T22:29:31.000000Z",
"StartUtc": "2024-10-25T22:29:31.000000Z",
"StartEnumerationUtc": "2024-10-25T22:29:31.000000Z",
"StartRetrievalUtc": "2024-10-25T22:29:39.000000Z",
"FinishEnumerationUtc": "2024-10-25T22:29:39.000000Z",
"FinishRetrievalUtc": "2024-10-25T22:29:39.000000Z",
"FinishUtc": "2024-10-25T22:29:39.000000Z"
}
Properties:
GUID (string): globally unique identifier for the object
TenantGUID (string): globally unique identifier for the tenant
CrawlPlanGUID (string): globally unique identifier for the crawl plan
CrawlScheduleGUID (string): globally unique identifier for the crawl schedule
CrawlFilterGUID (string): globally unique identifier for the crawl filter
DataRepositoryGUID (string): globally unique identifier for the data repository
MetadataRuleGUID (string): globally unique identifier for the metadata rule
EmbeddingsRuleGUID (string): globally unique identifier for the embeddings rule
ProcessingEndpoint (string): URL to use to process new and changed objects
CleanupEndpoint (string): URL to use to process deleted objects
Name (string): the name of the object
ObjectsEnumerated (int): the number of objects enumerated
BytesEnumerated (int): the number of bytes enumerated
ObjectsAdded (int): the number of objects added since the latest enumeration
BytesAdded (int): the number of bytes added since the latest enumeration
ObjectsUpdated (int): the number of objects updated since the latest enumeration
BytesUpdated (int): the number of bytes updated since the latest enumeration
ObjectsDeleted (int): the number of objects deleted from the latest enumeration
BytesDeleted (int): the number of bytes deleted from the latest enumeration
ObjectsSuccess (int): the number of objects succeeded
BytesSuccess (int): the number of bytes succeeded
ObjectsFailed (int): the number of objects failed
BytesFailed (int): the number of bytes failed
State (enum): the state of the crawl operation, values include NotStarted, Starting, Stopped, Canceled, Enumerating, Retrieving, Deleting, Success, Failed
CreatedUtc (datetime): timestamp from creation, in UTC time
StartUtc (datetime): timestamp from start, in UTC time
StartEnumerationUtc (datetime): timestamp from beginning of enumeration, in UTC time
StartRetrievalUtc (datetime): timestamp from beginning of retrieval, in UTC time
FinishEnumerationUtc (datetime): timestamp from completion of enumeration, in UTC time
FinishRetrievalUtc (datetime): timestamp from completion of retrieval, in UTC time
FinishUtc (datetime): timestamp from completion, in UTC time
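The timestamp properties can be combined to see how long each phase of a crawl operation took. The sketch below is only an illustration: the helper names are hypothetical, the timestamp format is inferred from the sample object on this page, and it assumes a finished operation in which every start and finish timestamp is populated.

```python
from datetime import datetime, timezone

# Format inferred from sample values such as 2024-10-25T22:29:31.000000Z
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_utc(value):
    """Parse a timestamp string from a crawl operation into an aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def phase_durations(operation):
    """Return seconds spent overall, enumerating, and retrieving.

    Assumes all six start/finish timestamps are present (a finished operation).
    """
    def seconds(start_key, finish_key):
        delta = parse_utc(operation[finish_key]) - parse_utc(operation[start_key])
        return delta.total_seconds()

    return {
        "total": seconds("StartUtc", "FinishUtc"),
        "enumeration": seconds("StartEnumerationUtc", "FinishEnumerationUtc"),
        "retrieval": seconds("StartRetrievalUtc", "FinishRetrievalUtc"),
    }


# Timestamps copied from the sample object in the Structure section.
sample = {
    "StartUtc": "2024-10-25T22:29:31.000000Z",
    "StartEnumerationUtc": "2024-10-25T22:29:31.000000Z",
    "StartRetrievalUtc": "2024-10-25T22:29:39.000000Z",
    "FinishEnumerationUtc": "2024-10-25T22:29:39.000000Z",
    "FinishRetrievalUtc": "2024-10-25T22:29:39.000000Z",
    "FinishUtc": "2024-10-25T22:29:39.000000Z",
}
print(phase_durations(sample))
# prints {'total': 8.0, 'enumeration': 8.0, 'retrieval': 0.0}
```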
Enumerate
Refer to the Enumeration page in REST API for details about the use of enumeration APIs.
Enumerate objects by using GET /v2.0/tenants/[tenant-guid]/crawloperations. The response has the following form:
{
"Success": true,
"Timestamp": {
"Start": "2024-10-21T02:36:37.677751Z",
"TotalMs": 23.58,
"Messages": {}
},
"MaxResults": 10,
"IterationsRequired": 1,
"EndOfResults": true,
"RecordsRemaining": 16,
"Objects": [
{
"GUID": "example-crawloperation",
... crawloperation details ...
},
{ ... }
],
"ContinuationToken": "[continuation-token]"
}
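A client that wants every object typically follows the continuation token until the server reports the end of the results. The following Python sketch shows one way to structure that loop. It makes no claim about how the token is passed to the server, which is covered by the Enumeration page: `fetch_page` is a hypothetical caller-supplied function that performs the actual request, and the interpretation of `EndOfResults` and `ContinuationToken` is inferred from the field names in the response above.

```python
def iterate_all(fetch_page):
    """Yield every object across all pages of an enumeration.

    fetch_page is a hypothetical callable supplied by the caller. It receives
    the continuation token from the previous page (None for the first page)
    and returns the parsed JSON enumeration result as a dict.
    """
    token = None
    while True:
        page = fetch_page(token)
        if not page.get("Success", False):
            raise RuntimeError("enumeration request reported failure")
        yield from page.get("Objects", [])
        token = page.get("ContinuationToken")
        # Stop when the server says it is done, or when there is no token to follow.
        if page.get("EndOfResults", True) or not token:
            break


# Stand-in pages keyed by the token that requests them; no network is involved.
fake_pages = {
    None: {"Success": True, "EndOfResults": False,
           "Objects": [{"GUID": "a"}], "ContinuationToken": "t1"},
    "t1": {"Success": True, "EndOfResults": True,
           "Objects": [{"GUID": "b"}], "ContinuationToken": None},
}
print([obj["GUID"] for obj in iterate_all(fake_pages.get)])
# prints ['a', 'b']
```

Passing the fetch logic in as a function keeps the loop independent of the HTTP client and of the request format.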
Read
To read an object by GUID, call GET /v1.0/tenants/[tenant-guid]/crawloperations/[crawloperation-guid]. If the object exists, it will be returned as a JSON object in the response body. If it does not exist, a 404 will be returned with a NotFound error response.
{
"GUID": "9ced1af3-2e19-4cd6-81ce-a8c90a9ed32d",
"TenantGUID": "default",
"CrawlPlanGUID": "e9a7d61e-7cbd-46e4-9956-533e22008978",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"DataRepositoryGUID": "1a56c067-9e6d-4f7b-85bf-eb6b04aeda3f",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"ProcessingEndpoint": "http://nginx-orchestrator:8501/processor",
"CleanupEndpoint": "http://nginx-orchestrator:8501/processor/cleanup",
"Name": "Alienware CIFS (started 2024-10-25T22:29:31 UTC)",
"ObjectsEnumerated": 123,
"BytesEnumerated": 61052490,
"ObjectsAdded": 0,
"BytesAdded": 0,
"ObjectsUpdated": 0,
"BytesUpdated": 0,
"ObjectsDeleted": 0,
"BytesDeleted": 0,
"ObjectsSuccess": 0,
"BytesSuccess": 0,
"ObjectsFailed": 2,
"BytesFailed": 16384,
"State": "Success",
"CreatedUtc": "2024-10-25T22:29:31.000000Z",
"StartUtc": "2024-10-25T22:29:31.000000Z",
"StartEnumerationUtc": "2024-10-25T22:29:31.000000Z",
"StartRetrievalUtc": "2024-10-25T22:29:39.000000Z",
"FinishEnumerationUtc": "2024-10-25T22:29:39.000000Z",
"FinishRetrievalUtc": "2024-10-25T22:29:39.000000Z",
"FinishUtc": "2024-10-25T22:29:39.000000Z"
}
Note: the HEAD method can be used as an alternative to GET to simply check the existence of the object. HEAD requests return either a 200/OK in the event the object exists, or a 404/Not Found if not. No response body is returned with a HEAD request.
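The GET and HEAD behavior described above could be wrapped as follows using only the Python standard library. This is a sketch under stated assumptions, not an official client: the function names and the `base_url` and `headers` parameters are hypothetical, and `headers` is there only in case a deployment requires authentication, which this page does not describe.

```python
import json
import urllib.error
import urllib.request


def build_request(method, base_url, tenant_guid, operation_guid, headers=None):
    """Create (but do not send) a GET or HEAD request for one crawl operation."""
    if method not in ("GET", "HEAD"):
        raise ValueError("method must be GET or HEAD")
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/crawloperations/{operation_guid}"
    return urllib.request.Request(url, method=method, headers=headers or {})


def read_crawl_operation(base_url, tenant_guid, operation_guid, headers=None):
    """Return the crawl operation as a dict, or None if the server answers 404."""
    request = build_request("GET", base_url, tenant_guid, operation_guid, headers)
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise  # any other HTTP error is left to the caller


def crawl_operation_exists(base_url, tenant_guid, operation_guid, headers=None):
    """Return True on 200, False on 404; HEAD responses carry no body."""
    request = build_request("HEAD", base_url, tenant_guid, operation_guid, headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise
```

A call such as `crawl_operation_exists("http://localhost:8101", "default", "9ced1af3-2e19-4cd6-81ce-a8c90a9ed32d")` would then check for the sample object against a local crawler server; the hostname here is a placeholder.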
Delete
To delete an object by GUID, call DELETE /v1.0/tenants/[tenant-guid]/crawloperations/[crawloperation-guid].
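A deletion can be issued with the Python standard library along these lines. The function names and the `base_url` and `headers` parameters are hypothetical. Because this page does not state which status code a successful delete returns, the sketch simply hands back whatever status the server sends.

```python
import urllib.request


def build_delete_request(base_url, tenant_guid, operation_guid, headers=None):
    """Create (but do not send) the DELETE request for one crawl operation."""
    url = f"{base_url}/v1.0/tenants/{tenant_guid}/crawloperations/{operation_guid}"
    return urllib.request.Request(url, method="DELETE", headers=headers or {})


def delete_crawl_operation(base_url, tenant_guid, operation_guid, headers=None):
    """Send the DELETE and return the HTTP status code of a successful response.

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so a
    missing object or a server error surfaces as an exception to the caller.
    """
    request = build_delete_request(base_url, tenant_guid, operation_guid, headers)
    with urllib.request.urlopen(request) as response:
        return response.status
```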