Training your replica

This guide explains how to train a replica using the Sensay API. Training supplies the content a replica draws on, so that it can give accurate and relevant responses based on your own material.

What is a knowledge base?

A knowledge base is a collection of information that your replica uses to answer questions. It's the foundation of your replica's ability to provide accurate and contextually relevant responses. All training in Sensay relies on knowledge base entries.

Knowledge base workflow

When training a replica, each knowledge base entry goes through three stages:

Raw text stage: The initial, unprocessed content you provide (such as documents, articles, or custom text). This is the information you want your replica to learn from.
Processed text stage: The system optimizes your content for better understanding and retrieval.
Vector stage: The processed content is converted into a mathematical representation (vectors) that allows the replica to quickly find and retrieve relevant information when answering questions.

Adding content to the knowledge base

There are two methods to add content to your replica's knowledge base:

Method 1: Adding text content

Create a knowledge base entry

curl -X POST https://api.sensay.io/v1/replicas/$REPLICA_UUID/training \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json" \
 -d '{}'

Example response:

{
  "success": true,
  "knowledgeBaseID": 12345
}

This creates a new empty knowledge base entry. The response includes a knowledgeBaseID that you'll need for the next step.
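The returned ID has to be carried into the next request. The following is a minimal sketch of capturing it in a shell variable, assuming the response has the shape shown above. The helper names `extract_kb_id` and `create_entry` are hypothetical, and the `sed` pattern is only a stand-in for a proper JSON parser.

```shell
# Pull the numeric knowledgeBaseID out of a JSON response read from stdin.
# Assumes the field is a plain integer, as in the example response.
extract_kb_id() {
  sed -n 's/.*"knowledgeBaseID"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Create an empty entry and print its ID (hypothetical wrapper).
create_entry() {
  curl -sS -X POST "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training" \
    -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
    -H "Content-Type: application/json" \
    -d '{}' | extract_kb_id
}

# Usage: KNOWLEDGE_BASE_ID=$(create_entry)
```

If the variable comes back empty, the request probably failed and the raw response is worth inspecting before continuing.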

Add text to the knowledge base entry

curl -X PUT https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json" \
 -d '{
   "rawText": "Your training text content goes here. This can be any text you want your replica to learn from, such as product information, company policies, or specialized knowledge."
 }'

After adding text, the system automatically processes it and makes it available for your replica to use when answering questions.
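Real training text often contains quotes or line breaks, which would break a hand-written `-d '{...}'` body. The sketch below escapes the text before sending it. The function names `json_escape`, `raw_text_body`, and `add_text` are hypothetical, and only the most common characters are escaped.

```shell
# Escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, tabs, and newlines only; other control
# characters would still need escaping. A trailing newline is dropped.
json_escape() {
  tab=$(printf '\t')
  printf '%s' "$1" |
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e "s/$tab/\\\\t/g" |
    awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Print a request body of the form {"rawText":"..."}.
raw_text_body() {
  printf '{"rawText":"%s"}' "$(json_escape "$1")"
}

# Send the text to an existing entry (hypothetical wrapper).
add_text() {  # usage: add_text "$KNOWLEDGE_BASE_ID" "$text"
  raw_text_body "$2" |
    curl -sS -X PUT "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$1" \
      -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
      -H "Content-Type: application/json" \
      --data-binary @-
}
```

The body is piped to curl with `--data-binary @-` so that it is sent exactly as built. For large or unusual input, a JSON-aware tool is a safer choice than this hand-rolled escaping.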

Method 2: Uploading text-based files

Get a signed URL for file upload

curl -X GET "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/files/upload?filename=your_file.pdf" \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json"

Example response:

{
  "success": true,
  "signedURL": "https://storage.googleapis.com/...",
  "knowledgeBaseID": 12345
}

This registers the upload and returns a signed URL to which you upload your file, along with the knowledge base ID for tracking the entry. Files up to 50 MB are supported.
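Two things can go wrong before the upload even starts: the file may exceed the size limit, or its name may contain characters that are not valid in a query string. The sketch below guards against both. The helper names are hypothetical, and reading the limit as 50,000,000 bytes is an assumption, chosen as the conservative interpretation of 50 MB.

```shell
# Largest file accepted, in bytes. Treating the 50 MB limit as
# 50 * 1000 * 1000 bytes is an assumption, not a documented value.
MAX_UPLOAD_BYTES=50000000

# Succeed only if the file exists and is within the limit.
within_upload_limit() {
  [ -f "$1" ] || return 1
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$MAX_UPLOAD_BYTES" ]
}

# Ask for a signed URL, letting curl URL-encode the filename.
request_upload_url() {  # usage: request_upload_url /path/to/file.pdf
  within_upload_limit "$1" || { echo "missing or too large: $1" >&2; return 1; }
  curl -sS -G "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/files/upload" \
    -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
    --data-urlencode "filename=$(basename "$1")"
}
```

With `-G`, curl appends the `--data-urlencode` value to the URL as a query parameter, so a name such as `annual report.pdf` is sent correctly encoded.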

Upload the file to the signed URL

curl -X PUT "$SIGNED_URL" \
 -H "Content-Type: application/octet-stream" \
 --data-binary @/path/to/your/file.pdf

After uploading, the system automatically extracts text from your file, processes it, and makes it available for your replica to use.
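The two upload steps can be chained by reading `signedURL` from the first response and passing it to the second request. This is a sketch with hypothetical helper names. It assumes the URL arrives without JSON escape sequences, and it uses `sed` where a real JSON parser would be more robust.

```shell
# Read the signedURL string from a JSON response on stdin.
# Assumes the URL contains no JSON escape sequences (for example \u0026).
extract_signed_url() {
  sed -n 's/.*"signedURL"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Upload a local file to the signed URL; --fail turns HTTP error
# responses into a non-zero exit status.
upload_file() {  # usage: upload_file "$SIGNED_URL" /path/to/file.pdf
  curl -sS --fail -X PUT "$1" \
    -H "Content-Type: application/octet-stream" \
    --data-binary "@$2"
}
```

The signed URL is always passed in double quotes, because it typically contains `?` and `&`, which the shell would otherwise interpret.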

Managing knowledge base entries

List all knowledge base entries

curl -X GET https://api.sensay.io/v1/replicas/$REPLICA_UUID/training \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json"

Get a specific knowledge base entry

curl -X GET https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json"

Example response:

{
  "success": true,
  "id": 12345,
  "replica_uuid": "12345678-1234-1234-1234-123456789abc",
  "type": "text",
  "filename": null,
  "status": "READY",
  "raw_text": "Your training text content...",
  "processed_text": "Optimized version of your content...",
  "created_at": "2025-04-15T08:11:00.093761+00:00",
  "updated_at": "2025-04-15T08:11:05.299349+00:00",
  "title": null,
  "description": null
}

Delete a knowledge base entry

curl -X DELETE https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
 -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
 -H "Content-Type: application/json"

Example response:

{
  "success": true
}
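In a script, it helps to confirm the `success` flag rather than assume the deletion worked. The following sketch does that with hypothetical helpers `is_success` and `delete_entry`. The `grep` check is a simple text match, not a full JSON parse.

```shell
# Succeed if the JSON on stdin contains "success": true.
is_success() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Delete an entry and report the outcome (hypothetical wrapper).
delete_entry() {  # usage: delete_entry "$KNOWLEDGE_BASE_ID"
  if curl -sS -X DELETE "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$1" \
       -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" | is_success; then
    echo "deleted $1"
  else
    echo "delete failed for $1" >&2
    return 1
  fi
}
```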

Understanding knowledge base status values

The status field in a knowledge base entry indicates its current processing state:

BLANK: Initial state for a newly created text entry
AWAITING_UPLOAD: Initial state for a file entry before upload
SUPABASE_ONLY: File has been uploaded but not yet processed
PROCESSING: Entry is being processed
READY: Entry has been fully processed and is available for retrieval
SYNC_ERROR: An error occurred during synchronization
ERR_FILE_PROCESSING: An error occurred during file processing
ERR_TEXT_PROCESSING: An error occurred during text processing
ERR_TEXT_TO_VECTOR: An error occurred during vector conversion

If an entry ends up in one of the error states (SYNC_ERROR or any of the ERR_ values), you may need to delete it and create it again.
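Because processing is asynchronous, a script usually has to poll an entry until it reaches READY or an error state. The sketch below groups the status values listed above and polls using the "get a specific knowledge base entry" request. The function names, the grouping into ready, pending, and error, and the default attempt count and interval are all assumptions made for illustration.

```shell
# Map a status value to a coarse state: ready, pending, error, or unknown.
classify_status() {
  case "$1" in
    READY) echo ready ;;
    BLANK|AWAITING_UPLOAD|SUPABASE_ONLY|PROCESSING) echo pending ;;
    SYNC_ERROR|ERR_FILE_PROCESSING|ERR_TEXT_PROCESSING|ERR_TEXT_TO_VECTOR) echo error ;;
    *) echo unknown ;;
  esac
}

# Fetch the current status of one entry (sed stands in for a JSON parser).
get_entry_status() {
  curl -sS "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$1" \
    -H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll until the entry is ready (exit 0), fails (exit 1), or the attempts
# run out (exit 2). Unknown statuses are treated like pending ones.
wait_until_ready() {  # usage: wait_until_ready ID [attempts] [seconds]
  attempts=${2:-30}
  interval=${3:-2}
  while [ "$attempts" -gt 0 ]; do
    status=$(get_entry_status "$1")
    case "$(classify_status "$status")" in
      ready) return 0 ;;
      error) echo "entry $1 failed: $status" >&2; return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "entry $1 still not ready" >&2
  return 2
}
```

An entry that stays in BLANK or AWAITING_UPLOAD will never become ready on its own, so an exit status of 2 may simply mean that the text or file was never supplied.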