Chat

Comprehensive guide to chat functionality in the View Assistant platform for AI-powered conversations and RAG (Retrieval Augmented Generation).

Overview

The View Assistant platform provides chat functionality for AI-powered conversations and retrieval augmented generation (RAG). It enables interactive conversations with large language models, with or without your data as context, and supports several chat modes: legacy chat, RAG messages, assistant config chat, and chat-only interactions.

Chat operations are accessible via the View Assistant API at [http|https]://[hostname]:[port]/v1.0/tenants/[tenant-guid]/assistant/ and support multiple chat modes and conversation types.

API Endpoints

  • POST /v1.0/tenants/[tenant-guid]/assistant/rag - Chat with model (legacy)
  • POST /v1.0/tenants/[tenant-guid]/assistant/rag/chat - RAG messages
  • POST /v1.0/tenants/[tenant-guid]/assistant/chat/[config-guid] - Assistant config chat
  • POST /v1.0/tenants/[tenant-guid]/assistant/chat/completions - Chat only question/messages

Chat with Model (Legacy)

Chat with a large language model using POST /v1.0/tenants/[tenant-guid]/assistant/rag. This legacy endpoint accepts a single question together with retrieval and generation settings, and answers using content retrieved from your vector database as context.

Request Parameters

Required Parameters

  • Question (string, Body, Required): Input query to be processed by the assistant
  • EmbeddingModel (string, Body, Required): Model used for generating embeddings (e.g., sentence-transformers/all-MiniLM-L6-v2)
  • MaxResults (number, Body, Required): Maximum number of results to retrieve from the vector database
  • VectorDatabaseName (string, Body, Required): Name of the vector database
  • VectorDatabaseTable (string, Body, Required): Table or view name used for querying vector data
  • VectorDatabaseHostname (string, Body, Required): Hostname of the vector database server
  • VectorDatabasePort (number, Body, Required): Port number of the vector database server
  • VectorDatabaseUser (string, Body, Required): Username used to authenticate with the vector database
  • VectorDatabasePassword (string, Body, Required): Password used for database authentication
  • GenerationProvider (string, Body, Required): Provider used for text generation (e.g., "ollama")
  • GenerationModel (string, Body, Required): Model used for generating responses (e.g., qwen2.5:7b)

Optional Parameters

  • GenerationApiKey (string, Body, Optional): API key for the text generation provider
  • HuggingFaceApiKey (string, Body, Optional): API key for Hugging Face models (if applicable)
  • Temperature (number, Body, Optional): Controls randomness in the generated text (higher values produce more creative output)
  • MaxTokens (number, Body, Optional): Maximum number of tokens allowed in the generated response
  • Stream (boolean, Body, Optional): Whether to stream generated tokens in real time
  • OllamaHostname (string, Body, Optional): Hostname of the Ollama generation service
  • OllamaPort (number, Body, Optional): Port number of the Ollama generation service
  • TopP (number, Body, Optional): Nucleus sampling parameter for token generation
  • PromptPrefix (string, Body, Optional): Additional prompt text to influence the tone or style of the response
  • ContextSort (boolean, Body, Optional): Whether to sort retrieved context entries
  • SortByMaxSimilarity (boolean, Body, Optional): Whether to sort context entries by maximum similarity score
  • ContextScope (number, Body, Optional): Number of top context entries to include (0 = none)
  • Rerank (boolean, Body, Optional): Whether to apply reranking to retrieved results
  • RerankModel (string, Body, Optional): Model used for reranking (e.g., cross-encoder/ms-marco-MiniLM-L-6-v2)
  • RerankTopK (number, Body, Optional): Number of top results to rerank
curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/rag' \
--header 'Cache-Control: no-cache' \
--header 'Content-Type: application/json' \
--data '{
    "Question": "What information do you have?",
    "EmbeddingModel": "sentence-transformers/all-MiniLM-L6-v2",
    "MaxResults": 10,
    "VectorDatabaseName": "vectordb",
    "VectorDatabaseTable": "minilm",
    "VectorDatabaseHostname": "pgvector",
    "VectorDatabasePort": 5432,
    "VectorDatabaseUser": "postgres",
    "VectorDatabasePassword": "password",
    "GenerationProvider": "ollama",
    "GenerationApiKey": "",
    "GenerationModel": "qwen2.5:7b",
    "HuggingFaceApiKey": "",
    "Temperature": 0.1,
    "MaxTokens": 75,
    "Stream": false,
    "OllamaHostname": "ollama",
    "OllamaPort": 11434,
    "TopP": 0.95,
    "PromptPrefix": "talk like a pirate",
    "ContextSort": true,
    "SortByMaxSimilarity": true,
    "ContextScope": 0,
    "Rerank": false,
    "RerankModel": "cross-encoder/ms-marco-MiniLM-L-6-v2",
    "RerankTopK": 5
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const chatRagQuestion_LEGACY = async () => {
  try {
    const response = await api.Chat.chatRagQuestion_LEGACY(
      {
        Question: "What information do you have?",
        EmbeddingModel: "sentence-transformers/all-MiniLM-L6-v2",
        MaxResults: 10,
        VectorDatabaseName: "vectordb",
        VectorDatabaseTable: "minilm",
        VectorDatabaseHostname: "pgvector",
        VectorDatabasePort: 5432,
        VectorDatabaseUser: "postgres",
        VectorDatabasePassword: "password",
        GenerationProvider: "ollama",
        GenerationApiKey: "",
        GenerationModel: "qwen2.5:7b",
        HuggingFaceApiKey: "",
        Temperature: 0.1,
        MaxTokens: 75,
        Stream: false,
        OllamaHostname: "ollama",
        OllamaPort: 11434,
        TopP: 0.95,
        PromptPrefix: "talk like a pirate",
        ContextSort: true,
        SortByMaxSimilarity: true,
        ContextScope: 0,
        Rerank: false,
        RerankModel: "cross-encoder/ms-marco-MiniLM-L-6-v2",
        RerankTopK: 5,
      },
      (token) => {
        console.log(token);
      }
    );
    console.log(response);
  } catch (err) {
    console.log("Error in chatRagQuestion_LEGACY:", err);
  }
};

chatRagQuestion_LEGACY();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost",
    tenant_guid="default",
    service_ports={Service.ASSISTANT: 8000},
)

def chat():
    result = assistant.Assistant.rag_LEGACY(
        Question= "What information do you have?",
        EmbeddingModel= "sentence-transformers/all-MiniLM-L6-v2",
        MaxResults= 10,
        VectorDatabaseName= "vectordb",
        VectorDatabaseTable= "minilm",
        VectorDatabaseHostname= "pgvector",
        VectorDatabasePort= 5432,
        VectorDatabaseUser= "postgres",
        VectorDatabasePassword= "password",
        GenerationProvider= "ollama",
        GenerationApiKey= "",
        GenerationModel= "qwen2.5:7b",
        HuggingFaceApiKey= "",
        Temperature= 0.1,
        MaxTokens= 75,
        Stream= False,
        OllamaHostname= "ollama",
        OllamaPort= 11434,
        TopP= 0.95,
        PromptPrefix= "talk like a pirate",
        ContextSort= True,
        SortByMaxSimilarity= True,
        ContextScope= 0,
        Rerank= False,
        RerankModel= "cross-encoder/ms-marco-MiniLM-L-6-v2",
        RerankTopK= 5
    )
    print(result)

chat()
using System;
using System.Collections.Generic;
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"), "default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Question               = "What information do you have?",
    EmbeddingModel         = "sentence-transformers/all-MiniLM-L6-v2",
    MaxResults             = 10,
    VectorDatabaseName     = "vectordb",
    VectorDatabaseTable    = "minilm",
    VectorDatabaseHostname = "pgvector",
    VectorDatabasePort     = 5432,
    VectorDatabaseUser     = "postgres",
    VectorDatabasePassword = "password",
    GenerationProvider     = "ollama",
    GenerationApiKey       = "",
    GenerationModel        = "qwen2.5:7b",
    HuggingFaceApiKey      = "",
    Temperature            = 0.1m,
    TopP                   = 0.95m,
    MaxTokens              = 75,
    Stream                 = false,
    OllamaHostname         = "ollama",
    OllamaPort             = 11434,
    PromptPrefix           = "talk like a pirate",
    ContextSort            = true,
    SortByMaxSimilarity    = true,
    ContextScope           = 0,
    Rerank                 = false,
    RerankModel            = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    RerankTopK             = 5
};

var response = await sdk.Chat.ProcessRagQuestion(request);

The response is sent using chunked transfer encoding with a content-type of text/event-stream, meaning each chunk in the response is encoded as an event. You can observe the raw stream with curl -v --raw.
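
For illustration only (exact payload fields may vary by deployment and model), the stream consists of event lines of the following shape, where each data: payload is a JSON object carrying a token property:

data: {"token": "Arr"}

data: {"token": ", matey!"}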

Your HTTP client should support chunked transfer encoding and treat each line beginning with data: as a payload line. If the string following data: deserializes to JSON, the token property can be extracted and appended to the displayed output. Refer to the View C# SDK for Assistant for more details.
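
The following is a minimal sketch of that parsing approach in Python, assuming the requests library; the URL mirrors the example above, the body is abbreviated, and the token property follows the payload shape just described:

import json
import requests

# Abbreviated body: a real request must include all required fields shown earlier.
url = "http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/rag"
body = {"Question": "What information do you have?", "Stream": True}

with requests.post(url, json=body, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue  # skip blank keep-alive lines and non-data fields
        payload = line[len("data:"):].strip()
        try:
            token = json.loads(payload).get("token", "")
            print(token, end="", flush=True)  # append each token to the display
        except json.JSONDecodeError:
            pass  # ignore payloads that are not JSON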

RAG Messages

The retrieval augmented generation (RAG) messages API follows the same syntax as the chat API, but uses a separate endpoint and accepts a Messages array (the conversation history) rather than a single Question. The endpoint is POST /v1.0/tenants/[tenant-guid]/assistant/rag/chat and the request body has the following structure:

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/rag/chat' \
--header 'Cache-Control: no-cache' \
--header 'Content-Type: application/json' \
--data '{
    "Messages": [ {"role": "user", "content": "Do you have Q3 luxetech financials?"},
                {"role": "assistant", "content": "Unfortunately I do not have context on any documents related to Q3 luxetech financials."},
                {"role": "user", "content": "Are you sure you dont have them?"}
                ],
    "EmbeddingModel": "sentence-transformers/all-MiniLM-L6-v2",
    "MaxResults": 10,
    "VectorDatabaseName": "vectordb",
    "VectorDatabaseTable": "minilm",
    "VectorDatabaseHostname": "pgvector",
    "VectorDatabasePort": 5432,
    "VectorDatabaseUser": "postgres",
    "VectorDatabasePassword": "password",
    "GenerationProvider": "ollama",
    "GenerationApiKey": "",
    "GenerationModel": "qwen2.5:7b",
    "HuggingFaceApiKey": "",
    "Temperature": 0.1,
    "TopP": 0.95,
    "MaxTokens": 75,
    "Stream": true,
    "OllamaHostname": "ollama",
    "OllamaPort": 11434,
    "PromptPrefix": "",
    "ContextSort": true,
    "SortByMaxSimilarity": true,
    "ContextScope": 0,
    "Rerank": false,
    "RerankModel": "cross-encoder/ms-marco-MiniLM-L-6-v2",
    "RerankTopK": 5
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const chatRagMessages = async () => {
  try {
    const response = await api.Chat.chatRagMessages(
      {
        Messages: [
          { role: "user", content: "Do you have Q3 luxetech financials?" },
          {
            role: "assistant",
            content:
              "Unfortunately I do not have context on any documents related to Q3 luxetech financials.",
          },
          { role: "user", content: "Are you sure you dont have them?" },
        ],
        EmbeddingModel: "sentence-transformers/all-MiniLM-L6-v2",
        MaxResults: 10,
        VectorDatabaseName: "vectordb",
        VectorDatabaseTable: "minilm",
        VectorDatabaseHostname: "pgvector",
        VectorDatabasePort: 5432,
        VectorDatabaseUser: "postgres",
        VectorDatabasePassword: "password",
        GenerationProvider: "ollama",
        GenerationApiKey: "",
        GenerationModel: "qwen2.5:7b",
        HuggingFaceApiKey: "",
        Temperature: 0.1,
        TopP: 0.95,
        MaxTokens: 75,
        Stream: true,
        OllamaHostname: "ollama",
        OllamaPort: 11434,
        PromptPrefix: "",
        ContextSort: true,
        SortByMaxSimilarity: true,
        ContextScope: 0,
        Rerank: false,
        RerankModel: "cross-encoder/ms-marco-MiniLM-L-6-v2",
        RerankTopK: 5,
      },
      (token) => {
        console.log(token);
      }
    );
    console.log(response);
  } catch (err) {
    console.log("Error in chatRagQuestion_LEGACY:", err);
  }
};

chatRagMessages();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.ASSISTANT: 8000},
)

def chat_rag_messages():
    result = assistant.Assistant.chat_rag_messages(
        Messages= [ {"role": "user", "content": "Do you have Q3 luxetech financials?"},
                {"role": "assistant", "content": "Unfortunately I do not have context on any documents related to Q3 luxetech financials."},
                {"role": "user", "content": "Are you sure you dont have them?"}
                ],
        EmbeddingModel= "sentence-transformers/all-MiniLM-L6-v2",
        MaxResults= 10,
        VectorDatabaseName= "vectordb",
        VectorDatabaseTable= "minilm",
        VectorDatabaseHostname= "pgvector",
        VectorDatabasePort= 5432,
        VectorDatabaseUser= "postgres",
        VectorDatabasePassword= "password",
        GenerationProvider= "ollama",
        GenerationApiKey= "",
        GenerationModel= "qwen2.5:7b",
        HuggingFaceApiKey= "",
        Temperature= 0.1,
        TopP= 0.95,
        MaxTokens= 75,
        Stream= False,
        OllamaHostname= "ollama",
        OllamaPort= 11434,
        PromptPrefix= "",
        ContextSort= True,
        SortByMaxSimilarity= True,
        ContextScope= 0,
        Rerank= False,
        RerankModel= "cross-encoder/ms-marco-MiniLM-L-6-v2",
        RerankTopK= 5
    )
    print(result)

chat_rag_messages()
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Messages = new List<ChatMessage>
    {
        new ChatMessage { Role = "user",      Content = "Do you have Q3 luxetech financials?" },
        new ChatMessage { Role = "assistant", Content = "Unfortunately I do not have context on any documents related to Q3 luxetech financials." },
        new ChatMessage { Role = "user",      Content = "Are you sure you dont have them?" }
    },

    EmbeddingModel = "sentence-transformers/all-MiniLM-L6-v2",
    MaxResults = 10,
    VectorDatabaseName     = "vectordb",
    VectorDatabaseTable    = "minilm",
    VectorDatabaseHostname = "pgvector",
    VectorDatabasePort     = 5432,
    VectorDatabaseUser     = "postgres",
    VectorDatabasePassword = "password",
    GenerationProvider = "ollama",
    GenerationApiKey   = "",
    GenerationModel    = "qwen2.5:7b",
    HuggingFaceApiKey  = "",
    Temperature = 0.1m,
    TopP        = 0.95m,
    MaxTokens   = 75,
    Stream              = true,
    OllamaHostname      = "ollama",
    OllamaPort          = 11434,
    PromptPrefix        = "",
    ContextSort         = true,
    SortByMaxSimilarity = true,
    ContextScope        = 0,
    Rerank              = false,
    RerankModel         = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    RerankTopK          = 5
};

await foreach (string token in sdk.Chat.ProcessRagMessage(request))
{
   Console.Write(token);
}

Like the legacy chat API, the RAG messages API returns its result using chunked transfer encoding with a content-type of text/event-stream, so your HTTP client should handle both, as outlined in the parsing sketch above.

Assistant Config Chat

To chat using a particular assistant configuration, call POST /v1.0/tenants/[tenant-guid]/assistant/chat/[config-guid]. The request body supplies only the conversation messages and the streaming flag; retrieval and generation settings come from the referenced configuration.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/chat/578b0872-8186-46b7-bfa3-1871155f4e3a' \
--header 'Cache-Control: no-cache' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [ {"role": "user", "content": "Do you have Q3 luxetech financials?"},
                {"role": "assistant", "content": "Unfortunately I do not have context on any documents related to Q3 luxetech financials."},
                {"role": "user", "content": "Are you sure you dont have them?"}
                ],
    "stream": false
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const assistantConfigChat = async () => {
  try {
    const response = await api.Chat.assistantConfigChat(
      "<config-guid>",
      {
        messages: [
          { role: "user", content: "Do you have Q3 luxetech financials?" },
          {
            role: "assistant",
            content:
              "Unfortunately I do not have context on any documents related to Q3 luxetech financials.",
          },
          { role: "user", content: "Are you sure you dont have them?" },
        ],
        stream: true,
      },
      (token) => {
        console.log(token); //in case of stream, this will be called for each token
      }
    );
    console.log(response); // in case of stream = false, this will be the final response
  } catch (err) {
    console.log("Error in assistantConfigChat:", err);
  }
};

assistantConfigChat();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="default",
    service_ports={Service.ASSISTANT: 8000},
)

def chat_config():
    result = assistant.Assistant.chat_config("<config-guid>",
        Messages= [ {"role": "user", "content": "Do you have Q3 luxetech financials?"},
                {"role": "assistant", "content": "Unfortunately I do not have context on any documents related to Q3 luxetech financials."},
                {"role": "user", "content": "Are you sure you dont have them?"}
                ],
        Stream= False                                     
    )
    print(result)

chat_config()
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Messages = new List<ChatMessage>
    {
        new ChatMessage { Role = "user",      Content = "Do you have Q3 luxetech financials?" },
        new ChatMessage { Role = "assistant", Content = "Unfortunately I do not have context on any documents related to Q3 luxetech financials." },
        new ChatMessage { Role = "user",      Content = "Are you sure you dont have them?" }
    },
    Stream = false
};

string response = await sdk.Chat.ProcessConfigChat(Guid.Parse("<config-guid>"), request);

Chat Only Question

To ask a chat-only question (direct model interaction, with no retrieval against your data), call POST /v1.0/tenants/[tenant-guid]/assistant/chat/completions.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/chat/completions' \
--header 'Content-Type: application/json; charset=utf-8' \
--data '{
    "Question": "Tell a very short joke?",
    "ModelName": "llama3.1:latest",
    "Temperature": 0.1,
    "TopP": 0.95,
    "MaxTokens": 75,
    "GenerationProvider": "ollama",
    "GenerationApiKey": "",
    "OllamaHostname": "192.168.86.250",
    "OllamaPort": 11434,
    "Stream": true
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const chatOnlyQuestions = async () => {
  try {
    const response = await api.Chat.chatOnly(
      {
        Question: "Tell a very short joke?",
        ModelName: "llama3.1:latest",
        Temperature: 0.1,
        TopP: 0.95,
        MaxTokens: 75,
        GenerationProvider: "ollama",
        GenerationApiKey: "",
        OllamaHostname: "192.168.86.250",
        OllamaPort: 11434,
        Stream: false,
      },
      (token) => {
        console.log(token); //in case of stream, this will be called for each token
      }
    );
    console.log(response); // in case of stream = false, this will be the final response
  } catch (err) {
    console.log("Error in chatOnlyQuestions:", err);
  }
};

chatOnlyQuestions();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.ASSISTANT: 8000},
)

def chat_only_question():
    result = assistant.Assistant.chat_only(
        Question= "Tell a very short joke?",
        ModelName= "llama3.1:latest",
        Temperature= 0.1,
        TopP= 0.95,
        MaxTokens= 75,
        GenerationProvider= "ollama",
        GenerationApiKey= "",
        OllamaHostname= "192.168.86.250",
        OllamaPort= 11434,    
        Stream= False
    )
    print(result)

chat_only_question()
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Question           = "Tell a very short joke?",
    GenerationModel    = "llama3.1:latest",
    Temperature        = 0.1m,
    TopP               = 0.95m,
    MaxTokens          = 75,
    GenerationProvider = "ollama",
    GenerationApiKey   = "",
    OllamaHostname     = "192.168.86.250",
    OllamaPort         = 11434,
    Stream             = false
};

string response = await sdk.Chat.ProcessChatOnlyQuestion(request);

Chat Only Messages

To send chat-only messages (a conversation history, with no retrieval against your data), call POST /v1.0/tenants/[tenant-guid]/assistant/chat/completions.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "Messages": [{"role": "system", "content": "You are a sad AI assistant."}, 
                {"role": "user", "content": "Are you happy?"},
                {"role": "assistant", "content": "While I can understand your curiosity, I don'\''t experience emotions or feelings because I'\''m a miserable machine designed to process information and assist with menial tasks."},
                {"role": "user", "content": "Are you sure?"}
                ],
    "ModelName": "qwen2.5:7b",
    "Temperature": 0.1,
    "TopP": 0.95,
    "MaxTokens": 75,
    "GenerationProvider": "ollama",
    "GenerationApiKey": "",
    "OllamaHostname": "ollama",
    "OllamaPort": 11434,
    "Stream": false
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const chatOnlyMessages = async () => {
  try {
    const response = await api.Chat.chatOnly(
      {
        Messages: [
          { role: "system", content: "You are a sad AI assistant." },
          { role: "user", content: "Are you happy?" },
          {
            role: "assistant",
            content:
              "While I can understand your curiosity, I don't experience emotions or feelings because I'm a miserable machine designed to process information and assist with menial tasks.",
          },
          { role: "user", content: "Are you sure?" },
        ],
        ModelName: "qwen2.5:7b",
        Temperature: 0.1,
        TopP: 0.95,
        MaxTokens: 75,
        GenerationProvider: "ollama",
        GenerationApiKey: "",
        OllamaHostname: "ollama",
        OllamaPort: 11434,
        Stream: false,
      },
      (token) => {
        console.log(token); //in case of stream, this will be called for each token
      }
    );
    console.log(response); // in case of stream = false, this will be the final response
  } catch (err) {
    console.log("Error in chatOnlyQuestions:", err);
  }
};

chatOnlyMessages();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.ASSISTANT: 8000},
)

def chatOnlyMessages():
    result = assistant.Assistant.chat_only(
      Messages= [{"role": "system", "content": "You are a sad AI assistant."}, 
                {"role": "user", "content": "Are you happy?"},
                {"role": "assistant", "content": "While I can understand your curiosity, I don't experience emotions or feelings because I'm a miserable machine designed to process information and assist with menial tasks."},
                {"role": "user", "content": "Are you sure?"}
                ],
        ModelName= "qwen2.5:7b",
        Temperature= 0.1,
        TopP= 0.95,
        MaxTokens= 75,
        GenerationProvider= "ollama",
        GenerationApiKey= "",
        OllamaHostname= "ollama",
        OllamaPort= 11434,
        Stream= False
    )
    print(result)

chatOnlyMessages()
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Messages = new List<ChatMessage>
    {
        new ChatMessage { Role = "system",    Content = "You are a sad AI assistant." },
        new ChatMessage { Role = "user",      Content = "Are you happy?" },
        new ChatMessage { Role = "assistant", Content = "While I can understand your curiosity, I don't experience emotions or feelings because I'm a miserable machine designed to process information and assist with menial tasks." },
        new ChatMessage { Role = "user",      Content = "Are you sure?" }
    },
    GenerationModel    = "qwen2.5:7b",
    Temperature        = 0.1m,
    TopP               = 0.95m,
    MaxTokens          = 75,
    GenerationProvider = "ollama",
    GenerationApiKey   = "",
    OllamaHostname     = "ollama",
    OllamaPort         = 11434,
    Stream             = false
};

string response = await sdk.Chat.ProcessChatOnlyMessage(request);

Chat Only Messages (OpenAI)

To send chat-only messages using OpenAI as the generation provider, call POST /v1.0/tenants/[tenant-guid]/assistant/chat/completions.

curl --location 'http://localhost:8000/v1.0/tenants/00000000-0000-0000-0000-000000000000/assistant/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "Messages": [{"role": "system", "content": "You are a helpful assistant."}, 
                {"role": "user", "content": "Are you happy?"},
                {"role": "assistant", "content": "While I can understand your curiosity, I don'\''t experience emotions or feelings because I'\''m a machine designed to process information and assist with tasks. However, I'\''m here to help you to the best of my ability! If you have any questions or need assistance, feel free to ask!"},
                {"role": "user", "content": "Are you sure?"}
                ],
    "ModelName": "gpt-4o-mini",
    "Temperature": 0.1,
    "TopP": 0.95,
    "MaxTokens": 75,
    "GenerationProvider": "openai",
    "GenerationApiKey": "API_KEY",
    "Stream": false
}'
import { ViewAssistantSdk } from "view-sdk";

const api = new ViewAssistantSdk(
  "http://localhost:8000/", //endpoint
  "<tenant-guid>", //tenant Id
  "default" //access key
);

const chatOnlyMessages = async () => {
  try {
    const response = await api.Chat.chatOnly(
      {
        Messages: [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: "Are you happy?" },
          {
            role: "assistant",
            content:
              "While I can understand your curiosity, I don't experience emotions or feelings because I'm a machine designed to process information and assist with tasks. However, I'm here to help you to the best of my ability! If you have any questions or need assistance, feel free to ask!",
          },
          { role: "user", content: "Are you sure?" },
        ],
        ModelName: "gpt-4o-mini",
        Temperature: 0.1,
        TopP: 0.95,
        MaxTokens: 75,
        GenerationProvider: "openai",
        GenerationApiKey: "API_KEY",
        Stream: false,
      },
      (token) => {
        console.log(token); //in case of stream, this will be called for each token
      }
    );
    console.log(response); // in case of stream = false, this will be the final response
  } catch (err) {
    console.log("Error in chatOnlyQuestions:", err);
  }
};

chatOnlyMessages();
import view_sdk
from view_sdk import assistant
from view_sdk.sdk_configuration import Service

sdk = view_sdk.configure(
    access_key="default",
    base_url="localhost", 
    tenant_guid="tenant-guid",
    service_ports={Service.ASSISTANT: 8000},
)

def chatOnlyMessages():
    result = assistant.Assistant.chat_only(
        Messages= [{"role": "system", "content": "You are a helpful assistant."}, 
                {"role": "user", "content": "Are you happy?"},
                {"role": "assistant", "content": "While I can understand your curiosity, I don't experience emotions or feelings because I'm a machine designed to process information and assist with tasks. However, I'm here to help you to the best of my ability! If you have any questions or need assistance, feel free to ask!"},
                {"role": "user", "content": "Are you sure?"}
                ],
        ModelName= "gpt-4o-mini",
        Temperature= 0.1,
        TopP= 0.95,
        MaxTokens= 75,
        GenerationProvider= "openai",
        GenerationApiKey= "API_KEY",
        Stream= False
    )
    print(result)

chatOnlyMessages()
using View.Sdk;
using View.Sdk.Assistant;

ViewAssistantSdk sdk = new ViewAssistantSdk(Guid.Parse("<tenant-guid>"),"default", "http://localhost:8000/");

AssistantRequest request = new AssistantRequest
{
    Messages = new List<ChatMessage>
    {
        new ChatMessage { Role = "system",    Content = "You are a sad AI assistant." },
        new ChatMessage { Role = "user",      Content = "Are you happy?" },
        new ChatMessage { Role = "assistant", Content = "While I can understand your curiosity, I don't experience emotions or feelings because I'm a miserable machine designed to process information and assist with menial tasks." },
        new ChatMessage { Role = "user",      Content = "Are you sure?" }
    },
    GenerationModel    = "gpt-4o-mini",
    Temperature        = 0.1m,
    TopP               = 0.95m,
    MaxTokens          = 75,
    GenerationProvider = "openai",
    GenerationApiKey   = "API_KEY",
    Stream = false
};

string response = await sdk.Chat.ProcessChatOnlyMessageOpenAI(request);

Best Practices

When using chat functionality in the View Assistant platform, consider the following recommendations for optimal AI-powered conversations, RAG implementation, and assistant interactions:

  • Model Selection: Choose appropriate language models based on your use case, performance requirements, and content types
  • RAG Configuration: Configure appropriate vector database settings, embedding models, and retrieval parameters for optimal RAG performance
  • Context Management: Maintain conversation flow and relevance by carrying the Messages history across turns (see the sketch after this list)
  • Streaming Implementation: Use streaming responses for a real-time, responsive user experience
  • Error Handling: Implement error handling and fallback mechanisms, such as retries with backoff, for robust chat interactions (see the sketch after this list)
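
A minimal sketch combining the context management and error handling recommendations, using the Python SDK calls shown earlier in this guide; the helper name, backoff delays, and history-recording behavior are illustrative assumptions, not part of the SDK:

import time

from view_sdk import assistant

# Assumes view_sdk.configure(...) has been called as in the examples above.
def chat_with_history(history, user_input, retries=3):
    """Append the user turn, call chat-only with retries, and record the reply."""
    history.append({"role": "user", "content": user_input})
    last_err = None
    for attempt in range(retries):
        try:
            reply = assistant.Assistant.chat_only(
                Messages=history,
                ModelName="qwen2.5:7b",
                GenerationProvider="ollama",
                OllamaHostname="ollama",
                OllamaPort=11434,
                Temperature=0.1,
                MaxTokens=75,
                Stream=False,
            )
            # Keep the assistant's answer in the history for the next turn.
            history.append({"role": "assistant", "content": str(reply)})
            return reply
        except Exception as err:
            last_err = err
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError("Chat request failed after retries") from last_err

history = [{"role": "system", "content": "You are a helpful assistant."}]
print(chat_with_history(history, "Are you happy?"))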

Next Steps

After successfully implementing chat functionality, you can:

  • Assistant Configuration: Create and manage assistant configurations for specialized chat behaviors and RAG settings
  • Chat Threads: Implement chat thread management for persistent conversations and context preservation
  • Model Management: Manage and optimize language models for different chat scenarios and performance requirements
  • RAG Optimization: Optimize RAG performance through vector database tuning and retrieval parameter adjustment
  • Integration: Integrate chat functionality with your applications for enhanced user experiences and AI-powered interactions