Guide to running processing pipeline operations in the View Processing platform for automated data processing and content analysis.
Overview
The processing pipeline automates data processing and content analysis within the View Processing platform. It processes data assets through metadata generation, semantic cell extraction, embeddings generation, and content analysis workflows.
Processing pipeline operations are accessible via the View Processing API at [http|https]://[hostname]:[port]/[apiversion]/tenants/[tenantguid]/processing and support both storage-based and crawler-based processing workflows.
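As a minimal sketch of how the URL template above expands, the helper below fills in each bracketed placeholder. The function name `processing_url` and its parameter names are illustrative only and are not part of any View SDK.

```python
def processing_url(scheme: str, hostname: str, port: int,
                   api_version: str, tenant_guid: str) -> str:
    """Expand the documented URL template for the processing endpoint."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    return (f"{scheme}://{hostname}:{port}/{api_version}"
            f"/tenants/{tenant_guid}/processing")


# Host, port, API version, and tenant GUID are the values used in this guide's examples.
url = processing_url("http", "localhost", 8000, "v1.0",
                     "00000000-0000-0000-0000-000000000000")
print(url)
```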
API Endpoints
- POST /v1.0/tenants/[tenant-guid]/processing: Execute processing pipeline operations for storage or crawler data
Processing Pipeline (Storage)
Runs the processing pipeline on storage-based data using POST /v1.0/tenants/[tenant-guid]/processing. Objects are processed through metadata generation, semantic cell extraction, embeddings generation, and content analysis.
Request Parameters
Required Parameters
- Async (boolean, Body, Required): Whether to execute the processing operation asynchronously
- Tenant (object, Body, Required): Tenant metadata for the processing operation
- Collection (object, Body, Required): Collection metadata for the processing operation
- Bucket (object, Body, Required): Bucket metadata for the processing operation
- Pool (object, Body, Required): Storage pool metadata for the processing operation
- Object (object, Body, Required): Object metadata for the processing operation
- MetadataRule (object, Body, Required): Metadata rule configuration for processing
- EmbeddingsRule (object, Body, Required): Embeddings rule configuration for processing
- VectorRepository (object, Body, Required): Vector repository configuration for processing
- GraphRepository (object, Body, Required): Graph repository configuration for processing
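Because every field above is required, it can be convenient to check a request body on the client before sending it. The sketch below assumes the body is a plain dictionary; `STORAGE_REQUIRED_FIELDS` and `missing_storage_fields` are hypothetical names introduced here, and the server's own validation behavior is not described in this guide.

```python
from typing import Any, Dict, List

# Top-level fields listed as required for storage-based processing.
STORAGE_REQUIRED_FIELDS = (
    "Async", "Tenant", "Collection", "Bucket", "Pool", "Object",
    "MetadataRule", "EmbeddingsRule", "VectorRepository", "GraphRepository",
)


def missing_storage_fields(body: Dict[str, Any]) -> List[str]:
    """Return the required top-level fields that are absent or null."""
    # "Async": False is a valid value, so test for None rather than truthiness.
    return [name for name in STORAGE_REQUIRED_FIELDS if body.get(name) is None]


# A deliberately incomplete body, for illustration.
incomplete = {"Async": True, "Tenant": {"GUID": "<tenant-guid>"}}
print(missing_storage_fields(incomplete))
```

An empty list means all ten top-level fields are present; the check does not inspect the nested objects.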
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
"Async": true,
"Tenant": {
"GUID": "00000000-0000-0000-0000-000000000000",
"Name": "Default Tenant",
"Region": "us-west-1",
"S3BaseDomain": "localhost",
"DefaultPoolGUID": "00000000-0000-0000-0000-000000000000",
"Active": true
},
"Collection": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My first collection",
"AllowOverwrites": true,
"AdditionalData": "Created by setup"
},
"Bucket": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"PoolGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Category": "Data",
"Name": "example-data-bucket",
"RegionString": "us-west-1",
"Versioning": true,
"MaxMultipartUploadSeconds": 604800
},
"Pool": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "default",
"Provider": "Disk",
"WriteMode": "GUID",
"UseSsl": false,
"DiskDirectory": "./disk/",
"Compress": "None",
"EnableReadCaching": false
},
"Object": {
"GUID": "00000000-0000-0000-0000-000000000000",
"ParentGUID": null,
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"TenantName": "My default tenant",
"PoolGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"BucketName": "data",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Key": "hello1.txt",
"Version": "1",
"ContentType": "text/plain",
"DocumentType": "Text",
"ContentLength": 13
},
"MetadataRule": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "example-metadata-rule",
"ContentType": "*",
"MaxContentLength": 16777216,
"DataFlowEndpoint": "http://localhost:8501/processor",
"TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
"SemanticCellEndpoint": "http://localhost:8341/",
"MaxChunkContentLength": 512,
"ShiftSize": 448,
"UdrEndpoint": "http://localhost:8321/",
"TopTerms": 25,
"CaseInsensitive": true,
"IncludeFlattened": true,
"DataCatalogEndpoint": "http://localhost:8201/",
"DataCatalogType": "Lexi",
"DataCatalogCollection": "default",
"GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"TargetBucketGUID": "00000000-0000-0000-0000-000000000000"
},
"EmbeddingsRule": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My storage server embeddings rule",
"ContentType": "*",
"GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"DataFlowEndpoint": "http://localhost:8501/processor",
"EmbeddingsGenerator": "LCProxy",
"GeneratorUrl": "http://localhost:8301/",
"GeneratorApiKey": "",
"VectorStoreUrl": "http://localhost:8311/",
"MaxContentLength": 16777216
},
"VectorRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"Name": "My vector repository",
"RepositoryType": "Pgvector",
"Model": "all-MiniLM-L6-v2",
"Dimensionality": 384,
"DatabaseHostname": "localhost",
"DatabaseName": "vectordb",
"DatabaseTable": "minilm",
"DatabasePort": 5432,
"DatabaseUser": "postgres",
"DatabasePassword": "password"
},
"GraphRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My LiteGraph instance",
"RepositoryType": "LiteGraph",
"EndpointUrl": "http://localhost:8701/",
"ApiKey": "default",
"GraphIdentifier": "00000000-0000-0000-0000-000000000000"
}
}
'
import { ViewProcessorSdk } from "view-sdk";
const api = new ViewProcessorSdk(
"http://localhost:8000/", //endpoint
"<tenant-guid>", //tenant Id
"default" //access key
);
const processingPipeline = async () => {
try {
const response = await api.processSdk.processingPipeline({
"Async": true,
"Tenant": {
"GUID": "<tenant-guid>",
"Name": "Default Tenant",
"Region": "us-west-1",
"S3BaseDomain": "localhost",
"DefaultPoolGUID": "<pool-guid>",
"Active": true
},
"Collection": {
"GUID": "<collection-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "My first collection",
"AllowOverwrites": true,
"AdditionalData": "Created by setup"
},
"Bucket": {
"GUID": "<bucket-guid>",
"TenantGUID": "<tenant-guid>",
"PoolGUID": "<pool-guid>",
"OwnerGUID": "<owner-guid>",
"Category": "Data",
"Name": "example-data-bucket",
"RegionString": "us-west-1",
"Versioning": true,
"MaxMultipartUploadSeconds": 604800
},
"Pool": {
"GUID": "<pool-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "default",
"Provider": "Disk",
"WriteMode": "GUID",
"UseSsl": false,
"DiskDirectory": "./disk/",
"Compress": "None",
"EnableReadCaching": false
},
"Object": {
"GUID": "<object-guid>",
"ParentGUID": null,
"TenantGUID": "<tenant-guid>",
"TenantName": "My default tenant",
"PoolGUID": "<pool-guid>",
"BucketGUID": "<bucket-guid>",
"BucketName": "data",
"OwnerGUID": "<owner-guid>",
"Key": "hello1.txt",
"Version": "1",
"ContentType": "text/plain",
"DocumentType": "Text",
"ContentLength": 13
},
"MetadataRule": {
"GUID": "<metadatarule-guid>",
"TenantGUID": "<tenant-guid>",
"BucketGUID": "<bucket-guid>",
"OwnerGUID": "<owner-guid>",
"Name": "example-metadata-rule",
"ContentType": "*",
"MaxContentLength": 16777216,
"DataFlowEndpoint": "http://localhost:8501/processor",
"TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
"SemanticCellEndpoint": "http://localhost:8341/",
"MaxChunkContentLength": 512,
"ShiftSize": 448,
"UdrEndpoint": "http://localhost:8321/",
"TopTerms": 25,
"CaseInsensitive": true,
"IncludeFlattened": true,
"DataCatalogEndpoint": "http://localhost:8201/",
"DataCatalogType": "Lexi",
"DataCatalogCollection": "default",
"GraphRepositoryGUID": "<graph-repository-guid>",
"TargetBucketGUID": "<target-bucket-guid>"
},
"EmbeddingsRule": {
"GUID": "<embeddingrule-guid>",
"TenantGUID": "<tenant-guid>",
"BucketGUID": "<bucket-guid>",
"OwnerGUID": "<owner-guid>",
"Name": "My storage server embeddings rule",
"ContentType": "*",
"GraphRepositoryGUID": "<graph-repository-guid>",
"VectorRepositoryGUID": "<vector-repository-guid>",
"DataFlowEndpoint": "http://localhost:8501/processor",
"EmbeddingsGenerator": "LCProxy",
"GeneratorUrl": "http://localhost:8301/",
"GeneratorApiKey": "",
"VectorStoreUrl": "http://localhost:8311/",
"MaxContentLength": 16777216
},
"VectorRepository": {
"GUID": "<vector-repository-guid>",
"Name": "My vector repository",
"RepositoryType": "Pgvector",
"Model": "all-MiniLM-L6-v2",
"Dimensionality": 384,
"DatabaseHostname": "localhost",
"DatabaseName": "vectordb",
"DatabaseTable": "minilm",
"DatabasePort": 5432,
"DatabaseUser": "postgres",
"DatabasePassword": "password"
},
"GraphRepository": {
"GUID": "<graph-repository-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "My LiteGraph instance",
"RepositoryType": "LiteGraph",
"EndpointUrl": "http://localhost:8701/",
"ApiKey": "default",
"GraphIdentifier": "<graph-identifier>"
}
}
);
console.log(response);
} catch (err) {
console.log("Error", err);
}
};
processingPipeline();
import view_sdk
from view_sdk import processor
from view_sdk.sdk_configuration import Service
# Configure the SDK once before calling any service.
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="default",
    service_ports={Service.LEXI: 8000},
)
def processingPipeline():
    # Object content is passed inline, base64-encoded, in the "Data" field.
    result = processor.Processor.processing_pipeline(
        MetadataRuleGUID="00000000-0000-0000-0000-000000000000",
        EmbeddingsRuleGUID="00000000-0000-0000-0000-000000000000",
        Object={
            "GUID": "00000000-0000-0000-0000-000000000000",
            "ParentGUID": None,
            "TenantGUID": "00000000-0000-0000-0000-000000000000",
            "TenantName": "My default tenant",
            "PoolGUID": "00000000-0000-0000-0000-000000000000",
            "BucketGUID": "00000000-0000-0000-0000-000000000000",
            "BucketName": "data",
            "OwnerGUID": "00000000-0000-0000-0000-000000000000",
            "Key": "hello1.txt",
            "Version": "1",
            "ContentType": "text/plain",
            "DocumentType": "Text",
            "ContentLength": 85,
            "Data": "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
        }
    )
    print(result)
processingPipeline()
using View.Sdk;
using View.Sdk.Processor;
ViewProcessorSdk sdk = new ViewProcessorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
// Describe the object to process; its content is passed inline as base64 in Data.
ObjectMetadata obj = new ObjectMetadata
{
    GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    ParentGUID = null,
    TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    TenantName = "My default tenant",
    OwnerGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    DataRepositoryGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    CrawlPlanGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    CrawlOperationGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    Key = "hello1.txt",
    Version = "1",
    ContentType = "text/plain",
    DocumentType = "Text",
    ContentLength = 85,
    Data = "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
};
ProcessorResult response = await sdk.Processor.Process(
    Guid.Parse("<metadata-rule-guid>"),
    Guid.Parse("<embeddings-rule-guid>"),
    obj);
Response
Returns processing pipeline operation results with execution status and timing information.
{
"GUID": "3292d8eb-642b-40f4-a2de-9b81e66de288",
"Success": true,
"Async": true,
"Timestamp": {
"Start": "2025-04-30T13:19:30.096373Z",
"TotalMs": 34.2,
"Messages": {}
}
}
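A caller typically needs only a few fields from this response. The sketch below parses the sample payload shown above with the standard `json` module. `summarize_result` is a hypothetical helper, and field meanings beyond what the sample shows (for example, that `TotalMs` is elapsed milliseconds) are assumptions.

```python
import json

# The sample response from this guide, reproduced verbatim.
SAMPLE_RESPONSE = """
{
  "GUID": "3292d8eb-642b-40f4-a2de-9b81e66de288",
  "Success": true,
  "Async": true,
  "Timestamp": {
    "Start": "2025-04-30T13:19:30.096373Z",
    "TotalMs": 34.2,
    "Messages": {}
  }
}
"""


def summarize_result(payload: str) -> dict:
    """Extract the operation GUID, status flags, and timing from a response."""
    data = json.loads(payload)
    timestamp = data.get("Timestamp") or {}
    return {
        "guid": data.get("GUID"),
        "success": bool(data.get("Success")),
        "async": bool(data.get("Async")),
        "started": timestamp.get("Start"),
        "total_ms": timestamp.get("TotalMs"),
    }


summary = summarize_result(SAMPLE_RESPONSE)
print(summary)
```

Keeping the `GUID` is likely useful when `Async` is true, since the response then reports only that the operation was accepted, not that processing finished; how to look up a finished operation is not covered in this guide.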
Processing Pipeline (Crawler)
Runs the processing pipeline on crawler-based data using POST /v1.0/tenants/[tenant-guid]/processing. Objects from data repositories are processed through metadata generation, semantic cell extraction, embeddings generation, and content analysis.
Request Parameters
Required Parameters
- Async (boolean, Body, Required): Whether to execute the processing operation asynchronously
- Tenant (object, Body, Required): Tenant metadata for the processing operation
- Collection (object, Body, Required): Collection metadata for the processing operation
- DataRepository (object, Body, Required): Data repository metadata for the processing operation
- Object (object, Body, Required): Object metadata for the processing operation
- MetadataRule (object, Body, Required): Metadata rule configuration for processing
- EmbeddingsRule (object, Body, Required): Embeddings rule configuration for processing
- VectorRepository (object, Body, Required): Vector repository configuration for processing
- GraphRepository (object, Body, Required): Graph repository configuration for processing
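Comparing the two parameter lists, the crawler request replaces `Bucket` and `Pool` with `DataRepository`; the remaining required fields are the same. The sketch below classifies a request body on that basis. It is a client-side convenience only: `detect_workflow` is a hypothetical helper, and this guide does not state how the server itself distinguishes the two workflows.

```python
def detect_workflow(body: dict) -> str:
    """Classify a request body as 'storage', 'crawler', or 'unknown'."""
    has_storage = "Bucket" in body and "Pool" in body
    has_crawler = "DataRepository" in body
    if has_storage and not has_crawler:
        return "storage"
    if has_crawler and not has_storage:
        return "crawler"
    # Neither set, or a mix of both, cannot be classified from the keys alone.
    return "unknown"


storage_body = {"Async": True, "Bucket": {}, "Pool": {}, "Object": {}}
crawler_body = {"Async": True, "DataRepository": {}, "Object": {}}
print(detect_workflow(storage_body), detect_workflow(crawler_body))
```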
curl --location 'http://view.homedns.org:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
"Async": true,
"Tenant": {
"GUID": "00000000-0000-0000-0000-000000000000",
"Name": "Default Tenant",
"Region": "us-west-1",
"S3BaseDomain": "localhost",
"DefaultPoolGUID": "00000000-0000-0000-0000-000000000000",
"Active": true
},
"Collection": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My first collection",
"AllowOverwrites": true,
"AdditionalData": "Created by setup"
},
"DataRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My disk data repository",
"RepositoryType": "File",
"DiskDirectory": "./disk/"
},
"Object": {
"GUID": "00000000-0000-0000-0000-000000000001",
"ParentGUID": null,
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"TenantName": "My default tenant",
"NodeGUID": null,
"PoolGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"BucketName": "data",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Key": "hello2.txt",
"Version": "1",
"ContentType": "text/plain",
"DocumentType": "Text",
"ContentLength": 85,
"Data": "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
},
"MetadataRule": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "example-metadata-rule",
"ContentType": "*",
"MaxContentLength": 16777216,
"DataFlowEndpoint": "http://localhost:8501/processor",
"TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
"SemanticCellEndpoint": "http://localhost:8341/",
"MaxChunkContentLength": 512,
"ShiftSize": 448,
"UdrEndpoint": "http://localhost:8321/",
"TopTerms": 25,
"CaseInsensitive": true,
"IncludeFlattened": true,
"DataCatalogEndpoint": "http://localhost:8201/",
"DataCatalogType": "Lexi",
"DataCatalogCollection": "00000000-0000-0000-0000-000000000000",
"GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000"
},
"EmbeddingsRule": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"BucketGUID": "00000000-0000-0000-0000-000000000000",
"OwnerGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My storage server embeddings rule",
"ContentType": "*",
"GraphRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"VectorRepositoryGUID": "00000000-0000-0000-0000-000000000000",
"DataFlowEndpoint": "http://localhost:8501/processor",
"EmbeddingsGenerator": "LCProxy",
"GeneratorUrl": "http://localhost:8301/",
"GeneratorApiKey": "",
"VectorStoreUrl": "http://localhost:8311/",
"MaxContentLength": 16777216
},
"VectorRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My vector repository",
"RepositoryType": "Pgvector",
"Model": "all-MiniLM-L6-v2",
"Dimensionality": 384,
"DatabaseHostname": "localhost",
"DatabaseName": "vectordb",
"DatabaseTable": "minilm",
"DatabasePort": 5432,
"DatabaseUser": "postgres",
"DatabasePassword": "password"
},
"GraphRepository": {
"GUID": "00000000-0000-0000-0000-000000000000",
"TenantGUID": "00000000-0000-0000-0000-000000000000",
"Name": "My LiteGraph instance",
"RepositoryType": "LiteGraph",
"EndpointUrl": "http://localhost:8701/",
"ApiKey": "default",
"GraphIdentifier": "00000000-0000-0000-0000-000000000000"
}
}
'
import { ViewProcessorSdk } from "view-sdk";
const api = new ViewProcessorSdk(
"http://localhost:8000/", //endpoint
"00000000-0000-0000-0000-000000000000", //tenant Id
"default" //access key
);
const processingPipeline = async () => {
try {
const response = await api.processSdk.processingPipeline({
"Async": true,
"Tenant": {
"GUID": "<tenant-guid>",
"Name": "Default Tenant",
"Region": "us-west-1",
"S3BaseDomain": "localhost",
"DefaultPoolGUID": "<pool-guid>",
"Active": true
},
"Collection":{
"GUID": "<collection-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "My first collection",
"AllowOverwrites": true,
"AdditionalData": "Created by setup"
},
"DataRepository": {
"GUID": "<data-repository-guid>",
"TenantGUID": "<tenant-guid>",
"OwnerGUID": "<owner-guid>",
"Name": "My disk data repository",
"RepositoryType": "File",
"DiskDirectory": "./disk/"
},
"Object": {
"GUID": "<object-guid>",
"ParentGUID": null,
"TenantGUID": "<tenant-guid>",
"TenantName": "My default tenant",
"PoolGUID": "<pool-guid>",
"BucketGUID": "<bucket-guid>",
"BucketName": "data",
"OwnerGUID": "<owner-guid>",
"Key": "hello2.txt",
"Version": "1",
"ContentType": "text/plain",
"DocumentType": "Text",
"ContentLength": 85,
"Data": "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
},
"MetadataRule": {
"GUID": "<metadatarule-guid>",
"TenantGUID": "<tenant-guid>",
"BucketGUID": "<bucket-guid>",
"OwnerGUID": "<owner-guid>",
"Name": "example-metadata-rule",
"ContentType": "*",
"MaxContentLength": 16777216,
"DataFlowEndpoint": "http://localhost:8501/processor",
"TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
"SemanticCellEndpoint": "http://localhost:8341/",
"MaxChunkContentLength": 512,
"ShiftSize": 448,
"UdrEndpoint": "http://localhost:8321/",
"TopTerms": 25,
"CaseInsensitive": true,
"IncludeFlattened": true,
"DataCatalogEndpoint": "http://localhost:8201/",
"DataCatalogType": "Lexi",
"GraphRepositoryGUID": "<graph-repository-guid>",
"TargetBucketGUID": "<target-bucket-guid>"
},
"EmbeddingsRule": {
"GUID": "<embeddingrule-guid>",
"TenantGUID": "<tenant-guid>",
"BucketGUID": "<bucket-guid>",
"OwnerGUID": "<owner-guid>",
"Name": "My storage server embeddings rule",
"ContentType": "*",
"GraphRepositoryGUID": "<graph-repository-guid>",
"VectorRepositoryGUID": "<vector-repository-guid>",
"DataFlowEndpoint": "http://localhost:8501/processor",
"EmbeddingsGenerator": "LCProxy",
"GeneratorUrl": "http://localhost:8301/",
"GeneratorApiKey": "",
"VectorStoreUrl": "http://localhost:8311/",
"MaxContentLength": 16777216
},
"VectorRepository": {
"GUID": "<vector-repository-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "My vector repository",
"RepositoryType": "Pgvector",
"Model": "all-MiniLM-L6-v2",
"Dimensionality": 384,
"DatabaseHostname": "localhost",
"DatabaseName": "vectordb",
"DatabaseTable": "minilm",
"DatabasePort": 5432,
"DatabaseUser": "postgres",
"DatabasePassword": "password"
},
"GraphRepository": {
"GUID": "<graph-repository-guid>",
"TenantGUID": "<tenant-guid>",
"Name": "My LiteGraph instance",
"RepositoryType": "LiteGraph",
"EndpointUrl": "http://localhost:8701/",
"ApiKey": "default",
"GraphIdentifier": "<graph-identifier>"
}
}
);
console.log(response);
} catch (err) {
console.log("Error", err);
}
};
processingPipeline();
import view_sdk
from view_sdk import processor
from view_sdk.sdk_configuration import Service
# Configure the SDK once before calling any service.
sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="default",
    service_ports={Service.LEXI: 8000},
)
def processingPipeline():
    result = processor.Processor.process_crawler(
        Async=True,
        Tenant={
            "GUID": "<tenant-guid>",
            "Name": "Default Tenant",
            "Region": "us-west-1",
            "S3BaseDomain": "localhost",
            "DefaultPoolGUID": "<pool-guid>",
            "Active": True
        },
        Collection={
            "GUID": "<collection-guid>",
            "TenantGUID": "<tenant-guid>",
            "Name": "My first collection",
            "AllowOverwrites": True,
            "AdditionalData": "Created by setup"
        },
        DataRepository={
            "GUID": "<data-repository-guid>",
            "TenantGUID": "<tenant-guid>",
            "OwnerGUID": "<owner-guid>",
            "Name": "My disk data repository",
            "RepositoryType": "File",
            "DiskDirectory": "./disk/"
        },
        # Object content is passed inline, base64-encoded, in the "Data" field.
        Object={
            "GUID": "<object-guid>",
            "ParentGUID": None,
            "TenantGUID": "<tenant-guid>",
            "TenantName": "My default tenant",
            "PoolGUID": "<pool-guid>",
            "BucketGUID": "<bucket-guid>",
            "BucketName": "data",
            "OwnerGUID": "<owner-guid>",
            "Key": "hello2.txt",
            "Version": "1",
            "ContentType": "text/plain",
            "DocumentType": "Text",
            "ContentLength": 85,
            "Data": "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
        },
        MetadataRule={
            "GUID": "<metadatarule-guid>",
            "TenantGUID": "<tenant-guid>",
            "BucketGUID": "<bucket-guid>",
            "OwnerGUID": "<owner-guid>",
            "Name": "example-metadata-rule",
            "ContentType": "*",
            "MaxContentLength": 16777216,
            "DataFlowEndpoint": "http://localhost:8501/processor",
            "TypeDetectorEndpoint": "http://localhost:8501/processor/typedetector",
            "SemanticCellEndpoint": "http://localhost:8341/",
            "MaxChunkContentLength": 512,
            "ShiftSize": 448,
            "UdrEndpoint": "http://localhost:8321/",
            "TopTerms": 25,
            "CaseInsensitive": True,
            "IncludeFlattened": True,
            "DataCatalogEndpoint": "http://localhost:8201/",
            "DataCatalogType": "Lexi",
            "GraphRepositoryGUID": "<graph-repository-guid>",
            "TargetBucketGUID": "<target-bucket-guid>"
        },
        EmbeddingsRule={
            "GUID": "<embeddingrule-guid>",
            "TenantGUID": "<tenant-guid>",
            "BucketGUID": "<bucket-guid>",
            "OwnerGUID": "<owner-guid>",
            "Name": "My storage server embeddings rule",
            "ContentType": "*",
            "GraphRepositoryGUID": "<graph-repository-guid>",
            "VectorRepositoryGUID": "<vector-repository-guid>",
            "DataFlowEndpoint": "http://localhost:8501/processor",
            "EmbeddingsGenerator": "LCProxy",
            "GeneratorUrl": "http://localhost:8301/",
            "GeneratorApiKey": "",
            "VectorStoreUrl": "http://localhost:8311/",
            "MaxContentLength": 16777216
        },
        VectorRepository={
            "GUID": "<vector-repository-guid>",
            "TenantGUID": "<tenant-guid>",
            "Name": "My vector repository",
            "RepositoryType": "Pgvector",
            "Model": "all-MiniLM-L6-v2",
            "Dimensionality": 384,
            "DatabaseHostname": "localhost",
            "DatabaseName": "vectordb",
            "DatabaseTable": "minilm",
            "DatabasePort": 5432,
            "DatabaseUser": "postgres",
            "DatabasePassword": "password"
        },
        GraphRepository={
            "GUID": "<graph-repository-guid>",
            "TenantGUID": "<tenant-guid>",
            "Name": "My LiteGraph instance",
            "RepositoryType": "LiteGraph",
            "EndpointUrl": "http://localhost:8701/",
            "ApiKey": "default",
            "GraphIdentifier": "<graph-identifier>"
        }
    )
    print(result)
processingPipeline()
using View.Sdk;
using View.Sdk.Processor;
ViewProcessorSdk sdk = new ViewProcessorSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");
// Describe the object to process; its content is passed inline as base64 in Data.
ObjectMetadata obj = new ObjectMetadata
{
    GUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    ParentGUID = null,
    TenantGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    TenantName = "My default tenant",
    PoolGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    BucketGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    BucketName = "data",
    OwnerGUID = Guid.Parse("00000000-0000-0000-0000-000000000000"),
    Key = "hello1.txt",
    Version = "1",
    ContentType = "text/plain",
    DocumentType = "Text",
    ContentLength = 85,
    Data = "VGhpcyBpcyBhIHNhbXBsZSBkb2N1bWVudCB3aXRoIGp1c3QgYSBoYW5kZnVsIG9mIHdvcmRzIHRoYXQgd2lsbCBiZSBwcm9jZXNzZWQgYnkgVmlldw=="
};
ProcessorResult response = await sdk.Processor.Process(
    Guid.Parse("<metadata-rule-guid>"),
    Guid.Parse("<embeddings-rule-guid>"),
    obj);
Response
Returns processing pipeline operation results with execution status and timing information.
{
"GUID": "3292d8eb-642b-40f4-a2de-9b81e66de288",
"Success": true,
"Async": true,
"Timestamp": {
"Start": "2025-04-30T13:19:30.096373Z",
"TotalMs": 34.2,
"Messages": {}
}
}
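The crawler examples above carry the object content inline in `Object.Data`. That value decodes from base64 to an 85-byte sentence, which matches the `ContentLength` of 85 used in the SDK samples, so a reasonable reading is that `Data` holds base64-encoded bytes and `ContentLength` counts the raw bytes. Under that assumption, the two fields can be derived together; `inline_object_fields` is a hypothetical helper name.

```python
import base64


def inline_object_fields(content: bytes) -> dict:
    """Build the ContentLength and Data fields for inline object content."""
    return {
        # Assumed to be the size of the raw content, not of the base64 text.
        "ContentLength": len(content),
        "Data": base64.b64encode(content).decode("ascii"),
    }


# The sentence that the sample "Data" value in this guide decodes to.
SAMPLE = (b"This is a sample document with just a handful of words "
          b"that will be processed by View")
fields = inline_object_fields(SAMPLE)
print(fields["ContentLength"])
print(fields["Data"])
```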
Best Practices
When running processing pipeline operations in the View Processing platform, consider the following recommendations:
- Processing Strategy: Implement systematic processing strategies based on content types, processing requirements, and performance optimization needs
- Resource Management: Monitor and manage processing resources including metadata rules, embeddings rules, and repository configurations
- Content Analysis: Configure appropriate content analysis settings for metadata generation, semantic cell extraction, and embeddings generation
- Performance Optimization: Use asynchronous processing operations for large-scale data processing to optimize performance and resource utilization
- Workflow Integration: Integrate processing pipeline operations with data ingestion, storage, and search workflows for comprehensive data management
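To illustrate the asynchronous recommendation above, the sketch below builds (but does not send) a POST request with `Async` set to true, using only the Python standard library. `build_processing_request` is a hypothetical helper. The `Authorization` value is passed through verbatim because its format is redacted in this guide, and the URL and body values are placeholders.

```python
import json
import urllib.request


def build_processing_request(url: str, body: dict, authorization: str,
                             run_async: bool = True) -> urllib.request.Request:
    """Prepare a POST to the processing endpoint without sending it."""
    # Copy the body so the caller's dictionary is not modified.
    payload = dict(body, Async=run_async)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": authorization,
        },
        method="POST",
    )


request = build_processing_request(
    "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/processing",
    {"Object": {"Key": "hello1.txt"}},  # abbreviated body, for illustration only
    authorization="<authorization-value>",
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is omitted here because it requires a running View instance and a complete request body.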
Next Steps
After successfully managing processing pipeline operations, you can:
- Cleanup Operations: Implement cleanup pipeline operations for resource management and data lifecycle management
- Metadata Management: Generate and manage UDR metadata for enhanced search capabilities and content analysis
- Semantic Processing: Extract semantic cells and generate embeddings for AI-powered content understanding
- Type Detection: Implement automated type detection for various document formats and content types
- Search Integration: Integrate processed content with search capabilities for enhanced document discovery and analysis