This page covers configuration and management of View crawl plan objects.
Object Overview
Crawl plans map a data repository to a crawl schedule and a crawl filter, telling View the parameters by which a data repository should be crawled.
Endpoint, URL, and Supported Methods
Objects are managed via the crawler server API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/crawlplans
Supported methods include: GET, HEAD, PUT, and DELETE.
Structure
Objects have the following structure:
{
"GUID": "4292118d-3397-4090-88c6-90f1886a3e35",
"TenantGUID": "default",
"DataRepositoryGUID": "c854f5f2-68f6-44c4-813e-9c1dea51676a",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"Name": "Local files",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 16,
"MaxDrainTasks": 4,
"ProcessAdditions": true,
"ProcessDeletions": true,
"ProcessUpdates": true,
"CreatedUtc": "2024-10-23T15:14:26.000000Z"
}
Properties:
- GUID (GUID): globally unique identifier for the object
- TenantGUID (GUID): globally unique identifier for the tenant
- DataRepositoryGUID (GUID): globally unique identifier for the data repository
- CrawlScheduleGUID (GUID): globally unique identifier for the crawl schedule
- CrawlFilterGUID (GUID): globally unique identifier for the crawl filter
- MetadataRuleGUID (GUID): globally unique identifier for the metadata rule
- EmbeddingsRuleGUID (GUID): globally unique identifier for the embeddings rule
- Name (string): the name of the object
- EnumerationDirectory (string): directory in which previous enumerations of the repository are stored
- EnumerationsToRetain (int): the number of enumerations to retain
- MaxDrainTasks (int): the maximum number of objects to emit in parallel
- ProcessAdditions (bool): whether or not new files should be processed
- ProcessDeletions (bool): whether or not deleted files should be processed
- ProcessUpdates (bool): whether or not updated files should be processed
- CreatedUtc (datetime): timestamp from creation, in UTC time
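Before submitting a crawl plan, the typed fields above can be sanity-checked on the client side. A minimal sketch, assuming the field names and types listed above; `validateCrawlPlan` is a hypothetical helper and not part of the View SDK:

```javascript
// Hypothetical client-side check that a crawl plan object carries the
// reference GUIDs, counters, and processing flags with the expected types.
function validateCrawlPlan(plan) {
  const errors = [];
  for (const f of ["DataRepositoryGUID", "CrawlScheduleGUID", "CrawlFilterGUID"]) {
    if (typeof plan[f] !== "string" || plan[f].length === 0) {
      errors.push(`${f} must be a non-empty string`);
    }
  }
  for (const f of ["EnumerationsToRetain", "MaxDrainTasks"]) {
    if (!Number.isInteger(plan[f]) || plan[f] < 1) {
      errors.push(`${f} must be a positive integer`);
    }
  }
  for (const f of ["ProcessAdditions", "ProcessDeletions", "ProcessUpdates"]) {
    if (typeof plan[f] !== "boolean") {
      errors.push(`${f} must be a boolean`);
    }
  }
  return errors; // empty array means the object passed the checks
}
```

A returned non-empty array lists what to correct before calling the API.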
Create
To create a crawl plan, call PUT /v1.0/tenants/[tenant-guid]/crawlplans
with the following properties:
DataRepositoryGUID
CrawlScheduleGUID
CrawlFilterGUID
MetadataRuleGUID
EmbeddingsRuleGUID
EnumerationDirectory
EnumerationsToRetain
MaxDrainTasks
ProcessAdditions
ProcessDeletions
ProcessUpdates
curl -X PUT http://localhost:8601/v1.0/tenants/[tenant-guid]/crawlplans \
-H "Content-Type: application/json" \
-H "Authorization: Bearer [accesskey]" \
-d '
{
"DataRepositoryGUID": "e9068089-4c90-4ef7-b4bb-bafccb771a9c",
"CrawlScheduleGUID": "default",
"CrawlFilterGUID": "default",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "example-embeddings-rule",
"Name": "My crawl plan",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 30,
"MaxDrainTasks": 4,
"ProcessAdditions": true,
"ProcessDeletions": true,
"ProcessUpdates": true
}'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const createCrawlPlan = async () => {
try {
const response = await crawler.createCrawlPlan({
DataRepositoryGUID: "00000000-0000-0000-0000-000000000000",
CrawlScheduleGUID: "00000000-0000-0000-0000-000000000000",
CrawlFilterGUID: "00000000-0000-0000-0000-000000000000",
Name: "My crawl plan [ASH]",
EnumerationDirectory: "./enumerations/",
EnumerationsToRetain: 30,
MetadataRuleGUID: "00000000-0000-0000-0000-000000000000",
ProcessingEndpoint:
"http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
ProcessingAccessKey: "default",
CleanupEndpoint:
"http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing/cleanup",
CleanupAccessKey: "default",
});
console.log(response, "Crawl plan created successfully");
} catch (err) {
console.log("Error creating Crawl plan:", err);
}
};
createCrawlPlan();
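In environments without the SDK, the same PUT can be issued with any HTTP client. The sketch below only assembles the request pieces (sending it is left to the caller); `buildCreateRequest` is a hypothetical helper, and the host and port are assumptions:

```javascript
// Assemble the PUT request used to create a crawl plan against the
// crawler server API. The returned object can be passed to fetch or
// any other HTTP client.
function buildCreateRequest(baseUrl, tenantGuid, accessKey, plan) {
  return {
    method: "PUT",
    url: `${baseUrl}/v1.0/tenants/${tenantGuid}/crawlplans`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessKey}`,
    },
    body: JSON.stringify(plan),
  };
}

const req = buildCreateRequest(
  "http://localhost:8601", // assumed crawler server endpoint
  "00000000-0000-0000-0000-000000000000",
  "default",
  { Name: "My crawl plan", EnumerationsToRetain: 30 }
);
console.log(req.url);
// → http://localhost:8601/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans
```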
Enumerate
Refer to the Enumeration page in REST API for details about the use of enumeration APIs.
Enumerate objects by using GET /v2.0/tenants/[tenant-guid]/crawlplans
. The resultant object will appear as:
{
"Success": true,
"Timestamp": {
"Start": "2024-10-21T02:36:37.677751Z",
"TotalMs": 23.58,
"Messages": {}
},
"MaxResults": 10,
"IterationsRequired": 1,
"EndOfResults": true,
"RecordsRemaining": 16,
"Objects": [
{
"GUID": "example-crawlplan",
... crawlplan details ...
},
{ ... }
],
"ContinuationToken": "[continuation-token]"
}
curl --location 'http://view.homedns.org:8000/v2.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/' \
--header 'Authorization: ••••••'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const enumerateCrawlPlans = async () => {
try {
const response = await crawler.enumerateCrawlPlans();
console.log(response, "Crawl plans fetched successfully");
} catch (err) {
console.log("Error fetching Crawl plans:", err);
}
};
enumerateCrawlPlans();
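When EndOfResults is false, the ContinuationToken from one response can be supplied to the next enumeration call to page through the remaining records. A sketch of that loop, with the page-fetching function injected so the pattern stays independent of any particular HTTP client; `fetchPage` is an assumed callback, not an SDK method:

```javascript
// Collect all objects across enumeration pages by following the
// continuation token until the server reports EndOfResults.
async function enumerateAll(fetchPage) {
  const objects = [];
  let token = null;
  while (true) {
    // fetchPage(token) should return one enumeration response,
    // shaped like the example above (Objects, EndOfResults, ContinuationToken).
    const page = await fetchPage(token);
    objects.push(...page.Objects);
    if (page.EndOfResults) break;
    token = page.ContinuationToken;
  }
  return objects;
}
```

Injecting `fetchPage` makes the loop easy to test and keeps authentication details out of the paging logic.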
Read
To read an object by GUID, call GET /v1.0/tenants/[tenant-guid]/crawlplans/[crawlplan-guid]
. If the object exists, it will be returned as a JSON object in the response body. If it does not exist, a 404 will be returned with a NotFound
error response.
{
"GUID": "4292118d-3397-4090-88c6-90f1886a3e35",
"TenantGUID": "default",
"DataRepositoryGUID": "c854f5f2-68f6-44c4-813e-9c1dea51676a",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"Name": "Local files",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 16,
"MaxDrainTasks": 4,
"ProcessAdditions": true,
"ProcessDeletions": true,
"ProcessUpdates": true,
"CreatedUtc": "2024-10-23T15:14:26.000000Z"
}
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const readCrawlPlan = async () => {
try {
const response = await crawler.retrieveCrawlPlan(
"418cd284-4a30-4a9b-9e2a-b36645cbc6d7"
);
console.log(response, "Crawl plan fetched successfully");
} catch (err) {
console.log("Error fetching Crawl plan:", err);
}
};
readCrawlPlan();
Note: the HEAD method can be used as an alternative to GET to simply check the existence of the object. HEAD requests return either a 200/OK in the event the object exists, or a 404/Not Found if not. No response body is returned with a HEAD request.
Read all
To read all objects, call GET /v1.0/tenants/[tenant-guid]/crawlplans/
. If objects exist, they will be returned as an array of JSON objects in the response body.
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/' \
--header 'Authorization: ••••••'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const readAllCrawlPlans = async () => {
try {
const response = await crawler.retrieveCrawlPlans();
console.log(response, "All crawl plans fetched successfully");
} catch (err) {
console.log("Error fetching All crawl plans:", err);
}
};
readAllCrawlPlans();
Update
To update an object by GUID, call PUT /v1.0/tenants/[tenant-guid]/crawlplans/[crawlplan-guid]
with a fully populated object in the request body. The updated object will be returned to you.
Note: certain fields cannot be modified and will be preserved across updates.
Request body:
{
"GUID": "4292118d-3397-4090-88c6-90f1886a3e35",
"TenantGUID": "default",
"DataRepositoryGUID": "c854f5f2-68f6-44c4-813e-9c1dea51676a",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"Name": "My updated local files",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 16,
"MaxDrainTasks": 4,
"ProcessAdditions": true,
"ProcessDeletions": true,
"ProcessUpdates": true,
"CreatedUtc": "2024-10-23T15:14:26.000000Z"
}
curl --location --request PUT 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/00000000-0000-0000-0000-000000000000' \
--header 'content-type: application/json' \
--header 'Authorization: ••••••' \
--data '{
"DataRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"CrawlScheduleGUID": "00000000-0000-0000-0000-000000000000",
"CrawlFilterGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My updated crawl plan",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 30,
"MetadataRuleGUID": "00000000-0000-0000-0000-000000000000",
"ProcessingEndpoint": "http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
"ProcessingAccessKey": "default",
"CleanupEndpoint": "http://nginx-processor:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing/cleanup",
"CleanupAccessKey": "default"
}'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const updateCrawlPlan = async () => {
try {
const response = await crawler.updateCrawlPlan({
GUID: "418cd284-4a30-4a9b-9e2a-b36645cbc6d7",
TenantGUID: "00000000-0000-0000-0000-000000000000",
DataRepositoryGUID: "2dc3ae2f-200c-4f5f-8c5a-9bedd7b6447c",
CrawlScheduleGUID: "00000000-0000-0000-0000-000000000001",
CrawlFilterGUID: "00000000-0000-0000-0000-000000000000",
MetadataRuleGUID: "00000000-0000-0000-0000-000000000000",
EmbeddingsRuleGUID: "00000000-0000-0000-0000-000000000001",
Name: "Traeger Recipe Forums [UPDATED]",
EnumerationDirectory: "./enumerations/",
EnumerationsToRetain: 16,
MaxDrainTasks: 4,
ProcessAdditions: true,
ProcessDeletions: true,
ProcessUpdates: true,
CreatedUtc: "2025-03-25T21:50:09.230321Z",
});
console.log(response, "Crawl plan updated successfully");
} catch (err) {
console.log("Error updating Crawl plan:", err);
}
};
updateCrawlPlan();
Response body:
{
"GUID": "4292118d-3397-4090-88c6-90f1886a3e35",
"TenantGUID": "default",
"DataRepositoryGUID": "c854f5f2-68f6-44c4-813e-9c1dea51676a",
"CrawlScheduleGUID": "oneminute",
"CrawlFilterGUID": "default",
"MetadataRuleGUID": "example-metadata-rule",
"EmbeddingsRuleGUID": "crawler-embeddings-rule",
"Name": "My updated local files",
"EnumerationDirectory": "./enumerations/",
"EnumerationsToRetain": 16,
"MaxDrainTasks": 4,
"ProcessAdditions": true,
"ProcessDeletions": true,
"ProcessUpdates": true,
"CreatedUtc": "2024-10-23T15:14:26.000000Z"
}
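Because updates require a fully populated object, a safe pattern is read-modify-write: fetch the current object, overlay the changes, and resubmit. The merge below keeps server-managed fields from the original; exactly which fields are preserved is an assumption here (the note above only says certain fields cannot be modified), and `mergeForUpdate` is a hypothetical helper:

```javascript
// Read-modify-write helper: overlay user changes on the current object
// while carrying server-managed fields over from the original.
function mergeForUpdate(current, changes) {
  const preserved = ["GUID", "TenantGUID", "CreatedUtc"]; // assumed immutable
  const merged = { ...current, ...changes };
  for (const f of preserved) {
    merged[f] = current[f]; // ignore any attempt to change these
  }
  return merged;
}
```

The merged object can then be sent as the PUT request body shown above.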
Delete
To delete an object by GUID, call DELETE /v1.0/tenants/[tenant-guid]/crawlplans/[crawlplan-guid]
.
curl --location --request DELETE 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const deleteCrawlPlan = async () => {
try {
const response = await crawler.deleteCrawlPlan(
"418cd284-4a30-4a9b-9e2a-b36645cbc6d7"
);
console.log(response, "Crawl plan deleted successfully");
} catch (err) {
console.log("Error deleting Crawl plan:", err);
}
};
deleteCrawlPlan();
Check Existence
HEAD requests return either a 200/OK in the event the object exists, or a 404/Not Found if not. No response body is returned with a HEAD request.
curl --location --head 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/crawlplans/00000000-0000-0000-0000-000000000000' \
--header 'Authorization: ••••••'
import { ViewCrawlerSdk } from "view-sdk";
const crawler = new ViewCrawlerSdk(
"00000000-0000-0000-0000-000000000000", //tenant Id
"default", //access token
"http://localhost:8000/" //endpoint
);
const existsCrawlPlan = async () => {
try {
const response = await crawler.existsCrawlPlan(
"418cd284-4a30-4a9b-9e2a-b36645cbc6d7"
);
console.log(response, "Crawl plan exists");
} catch (err) {
console.log("Error checking Crawl plan:", err);
}
};
existsCrawlPlan();
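Since HEAD returns no body, the existence check reduces to interpreting the HTTP status code: 200 means the object exists, 404 means it does not, and anything else signals an error. A small sketch of that mapping; `existsFromStatus` is a hypothetical helper for use with any HTTP client:

```javascript
// Map a HEAD response status code to an existence result, following the
// 200/OK and 404/Not Found semantics described above.
function existsFromStatus(status) {
  if (status === 200) return true;   // object exists
  if (status === 404) return false;  // object does not exist
  throw new Error(`Unexpected status code: ${status}`);
}
```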