This page provides an overview of the types of objects you will encounter while configuring, managing, operating, and integrating View.
Tenants and Nodes
View is a natively multi-tenant system with two top-level objects: tenant and node. A tenant represents an operational, security, and data boundary within a deployment, meaning a single physical deployment can host multiple isolated virtual deployments. A node represents an individual backend microservice, such as storage, lexi, or crawler. Nodes perform specific functions within the deployment and are not explicitly tied to a given tenant.
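For instance, a deployment's tenants and nodes can be enumerated over its REST API. The sketch below is a minimal illustration; the routes, headers, and property names are assumptions rather than documented endpoints.

```python
import requests

BASE_URL = "http://localhost:8000"                 # assumed deployment endpoint
HEADERS = {"Authorization": "Bearer <accesskey>"}  # assumed auth scheme

# List tenants and nodes; these routes are illustrative assumptions.
tenants = requests.get(f"{BASE_URL}/v1.0/tenants", headers=HEADERS).json()
nodes = requests.get(f"{BASE_URL}/v1.0/nodes", headers=HEADERS).json()

for tenant in tenants:
    print("tenant:", tenant.get("GUID"), tenant.get("Name"))
for node in nodes:
    print("node:", node.get("GUID"), node.get("Type"))  # e.g. storage, lexi, crawler
```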
Users and Credentials
Configuring View and ingesting data requires that a user authenticate with a credential. The user is associated with a tenant, and a credential is associated with a user. The credential object contains both an accesskey and a secretkey.
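As a rough sketch of how these objects relate, the following creates a user under a tenant and then a credential for that user; every route and field name here is a hypothetical illustration, not the documented API.

```python
import requests

BASE_URL = "http://localhost:8000"                   # assumed endpoint
HEADERS = {"Authorization": "Bearer <admin-token>"}  # assumed admin auth

# Hypothetical payloads; field names are illustrative only.
user = requests.post(
    f"{BASE_URL}/v1.0/tenants/<tenant-guid>/users",
    headers=HEADERS,
    json={"FirstName": "Ada", "Email": "ada@example.com"},
).json()

credential = requests.post(
    f"{BASE_URL}/v1.0/tenants/<tenant-guid>/credentials",
    headers=HEADERS,
    json={"UserGUID": user.get("GUID")},             # credential tied to the user
).json()

print(credential.get("AccessKey"), credential.get("SecretKey"))
```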
Processing Rules
Ingested data is processed according to two primary processing rules: a metadatarule object, which specifies how metadata is generated and where it is stored, and an embeddings rule, which specifies how embeddings are generated and where they are stored. Like other objects, these are tenant-specific; alongside their configuration properties, they reference other data repositories by GUID and certain services by URL.
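To make the shape of these rules concrete, the sketch below shows what such payloads might look like; all field names are assumptions for illustration, not the actual schema.

```python
# Hypothetical rule payloads; every field name is illustrative only.
metadata_rule = {
    "Name": "default-metadata-rule",
    "TenantGUID": "<tenant-guid>",                    # rules are tenant-specific
    "GraphRepositoryGUID": "<graph-repo-guid>",       # repository referenced by GUID
    "DataCatalogEndpoint": "http://localhost:8201/",  # service referenced by URL
}

embeddings_rule = {
    "Name": "default-embeddings-rule",
    "TenantGUID": "<tenant-guid>",
    "VectorRepositoryGUID": "<vector-repo-guid>",
    "EmbeddingsGeneratorUrl": "http://localhost:8301/",
}
```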
Processing Repositories
View uses multiple repositories to store processed and prepared data in the form of metadata, graph representations, and embeddings. Lexi is a data catalog and search platform that stores sourcedocument objects inside of collections. Source documents are built using Universal Data Representation (UDR), a key step in the processing pipeline. A graphrepository object contains configuration and metadata for graphs, in which relationships among source data and metadata are stored. Finally, a vectorrepository object contains configuration-related information about vector repositories.
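The repository objects themselves are small configuration records; the sketch below is purely illustrative and the field names are assumptions.

```python
# Hypothetical repository configuration objects; field names are illustrative.
graph_repository = {
    "Name": "primary-graph",
    "RepositoryType": "LiteGraph",        # graph backend named on this page
    "Endpoint": "http://localhost:8701/",
}

vector_repository = {
    "Name": "primary-vectors",
    "RepositoryType": "pgvector",         # default vector persistence per this page
    "Dimensionality": 384,                # assumed embedding dimensionality
}
```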
Ingestion (S3, REST)
Ingestion refers to the process of making data available for processing by View. When ingesting through S3 or REST, a storagepool must first be created, defining where data is physically stored. A bucket is then created, mapping to a storage pool, and once created, objects can be uploaded into the bucket.
View provides a complete native object storage API and a separate interface for using the S3 API.
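Because an S3-compatible interface is exposed, a standard S3 client can upload objects once a storage pool and bucket exist. The sketch below uses boto3; the endpoint URL is an assumption and should be replaced with your deployment's S3 interface.

```python
import boto3

# Assumed S3-compatible endpoint exposed by the View deployment.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:8002",
    aws_access_key_id="<accesskey>",       # the credential's access key
    aws_secret_access_key="<secretkey>",   # the credential's secret key
)

# The bucket must already exist and map to a storage pool.
with open("report.pdf", "rb") as f:
    s3.put_object(Bucket="my-bucket", Key="report.pdf", Body=f)
```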
Ingestion (Crawler)
Objects can also be made available for processing by using View crawlers. A crawler is pointed at a datarepository object describing the source (e.g., local file system, CIFS, NFS, S3, or Azure BLOB). A crawlschedule defines how frequently a given operation should run, and is a reusable object that can be applied across multiple jobs. A crawlfilter defines which objects should be processed, and like crawl schedules, is reusable across jobs.
A crawlplan ties these together, defining which data repository is crawled, on what schedule, and using what filter. Once created, jobs run according to the defined schedule. Each invocation of a crawl plan produces a crawloperation object, which provides details and statistics about the state of the operation.
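A rough sketch of how these crawler objects fit together follows; the field names are hypothetical and only meant to show the relationships between the objects.

```python
# Hypothetical crawler objects; all field names are illustrative only.
data_repository = {
    "Name": "team-share",
    "RepositoryType": "CIFS",              # e.g. local, CIFS, NFS, S3, Azure BLOB
    "Hostname": "fileserver",
    "Share": "documents",
}

crawl_schedule = {"Name": "nightly", "IntervalMinutes": 1440}            # reusable
crawl_filter = {"Name": "office-docs", "IncludeExtensions": [".docx", ".pdf"]}

# The crawl plan ties repository, schedule, and filter together by GUID.
crawl_plan = {
    "Name": "nightly-team-share",
    "DataRepositoryGUID": "<data-repo-guid>",
    "CrawlScheduleGUID": "<schedule-guid>",
    "CrawlFilterGUID": "<filter-guid>",
}
```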
Metadata Search
Metadata searches are performed against sourcedocument objects (containing Universal Data Representation, or UDR, metadata) stored inside of collections within Lexi. These source documents are generated during data processing and persisted inside of collections according to the supplied metadata rule.
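A metadata search might look like the sketch below; the route, query schema, and property names are assumptions, not the documented Lexi API.

```python
import requests

# Hypothetical Lexi search request; route and schema are assumptions.
response = requests.post(
    "http://localhost:8201/v1.0/tenants/<tenant-guid>/collections/<collection-guid>/search",
    headers={"Authorization": "Bearer <accesskey>"},
    json={"MaxResults": 10, "Filter": {"Key": "ContentType", "Value": "application/pdf"}},
)

for doc in response.json().get("SourceDocuments", []):
    print(doc.get("GUID"), doc.get("ObjectKey"))
```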
Vector Search
Vector search is performed against View Vector, which by default uses pgvector for persistence. Vectors are automatically generated during data processing and persisted as embeddingsdocument objects within View Vector, alongside metadata that indicates the source of the data, its relative position, hash information, and the original chunk data from which the vectors were created.
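A vector search request might resemble the sketch below, where the query embeddings come from your own model; the endpoint, payload shape, and property names are assumptions.

```python
import requests

# Hypothetical vector search; endpoint and payload shape are assumptions.
response = requests.post(
    "http://localhost:8301/v1.0/tenants/<tenant-guid>/vectorrepositories/<repo-guid>/search",
    headers={"Authorization": "Bearer <accesskey>"},
    json={
        "SearchType": "CosineSimilarity",      # assumed similarity metric name
        "Embeddings": [0.012, -0.094, 0.301],  # query vector from your model
        "MaxResults": 5,
    },
)

for doc in response.json():
    print(doc.get("Score"), doc.get("Content"))  # chunk text stored alongside the vector
```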
Data Processing
Data processing involves a series of steps including type detection, semantic cell extraction, generation of embeddings, persistence within the data catalog (Lexi), persistence within graph storage (LiteGraph), and persistence within vector storage (pgvector). The entire processing pipeline runs as a dataflow within Orchestrator; alternatively, you can build your own processing pipelines running on Orchestrator or outside of it.
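The ordering of those stages can be pictured with the stub pipeline below; the function bodies are trivial placeholders, and none of the names correspond to actual View APIs.

```python
# Illustrative outline of the stages described above; each stub stands in
# for a real pipeline step, and the names are placeholders, not View APIs.
def detect_type(obj: bytes) -> str:
    return "application/pdf" if obj.startswith(b"%PDF") else "application/octet-stream"

def extract_semantic_cells(obj: bytes, doc_type: str) -> list[str]:
    return [obj.decode(errors="ignore")]       # real extraction is type-aware

def generate_embeddings(cells: list[str]) -> list[list[float]]:
    return [[float(len(c))] for c in cells]    # stand-in for a real embeddings model

def process(obj: bytes) -> None:
    doc_type = detect_type(obj)                    # 1. type detection
    cells = extract_semantic_cells(obj, doc_type)  # 2. semantic cell extraction
    vectors = generate_embeddings(cells)           # 3. embeddings generation
    # 4-6. persist to Lexi (catalog), LiteGraph (graph), and pgvector (vectors)
    print(doc_type, len(cells), len(vectors))

process(b"%PDF-1.7 example")
```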
Orchestration
View Orchestrator is a function-as-a-service (FaaS) platform that runs code as independent units based on the invocation of a trigger. Orchestrator currently supports functions written in C# (net8.0) and Python (3.9+).
Within Orchestrator, a trigger is the means by which an operation is invoked. Currently, HTTP triggers are supported, allowing you to specify the HTTP method and URL that must match. A dataflow is a decision tree of independent steps, where each step is an independent unit of code. The dataflow object contains a starting step and a dataflowmap, which specifies which step to invoke next based on the success, failure, or exception of the preceding step.
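To visualize the trigger, dataflow, and step relationship, a hypothetical definition follows; the field names illustrate the concepts above and are not the actual Orchestrator schema.

```python
# Hypothetical trigger and dataflow definitions; field names are illustrative.
trigger = {"Type": "HTTP", "HttpMethod": "POST", "Url": "/process"}

dataflow = {
    "Name": "custom-pipeline",
    "TriggerGUID": "<trigger-guid>",
    "StartStepGUID": "<detect-type-step>",
    # The map decides the next step based on the preceding step's outcome.
    "DataFlowMap": {
        "<detect-type-step>": {
            "Success": "<extract-cells-step>",
            "Failure": "<notify-step>",
            "Exception": "<notify-step>",
        },
        "<extract-cells-step>": {
            "Success": "<generate-embeddings-step>",
            "Failure": "<notify-step>",
            "Exception": "<notify-step>",
        },
    },
}
```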
Assistant
View Assistant is both a built-in conversational AI experience with industry-leading retrieval augmented generation (RAG) capabilities and an API platform for building rich conversational experiences using data processed and prepared by View.
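As a sketch of what a call to that API platform might look like (the route and payload are assumptions, not documented endpoints):

```python
import requests

# Hypothetical Assistant chat request; route and payload are assumptions.
response = requests.post(
    "http://localhost:8401/v1.0/tenants/<tenant-guid>/assistant/chat",
    headers={"Authorization": "Bearer <accesskey>"},
    json={
        "Question": "Summarize last quarter's incident reports.",
        "CollectionGUID": "<collection-guid>",  # scope retrieval to one collection
        "MaxResults": 5,                        # number of chunks used for RAG
    },
)
print(response.json().get("Answer"))
```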