Training your replica
This guide explains how to train your replicas using the Sensay API. Training is essential for creating personalized replicas that provide accurate and relevant responses based on your specific content.
What is a knowledge base?
A knowledge base is a collection of information that your replica uses to answer questions. It's the foundation of your replica's ability to provide accurate and contextually relevant responses. All training in Sensay relies on knowledge base entries.
Knowledge base workflow
When training a replica, each knowledge base entry goes through three stages:
- Raw text stage: The initial, unprocessed content you provide (such as documents, articles, or custom text). This is the information you want your replica to learn from.
- Processed text stage: The system optimizes your content for better understanding and retrieval.
- Vector stage: The processed content is converted into a mathematical representation (vectors) that allows the replica to quickly find and retrieve relevant information when answering questions.
Adding content to the knowledge base
There are two methods to add content to your replica's knowledge base:
Method 1: Adding text content
- Create a knowledge base entry
curl -X POST https://api.sensay.io/v1/replicas/$REPLICA_UUID/training \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json" \
-d '{}'
Example response:
{
"success": true,
"knowledgeBaseID": 12345
}
This creates a new, empty knowledge base entry. The response includes a knowledgeBaseID that you'll need for the next step.
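The same step can be sketched in Python with only the standard library. This is a minimal illustration rather than an official client: the function names `build_create_entry_request` and `extract_knowledge_base_id` are hypothetical, and the endpoint, headers, and response shape are taken from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.sensay.io/v1"  # base URL taken from the curl examples


def build_create_entry_request(replica_uuid: str, organization_secret: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an empty entry."""
    return urllib.request.Request(
        url=f"{API_BASE}/replicas/{replica_uuid}/training",
        data=json.dumps({}).encode("utf-8"),  # empty JSON object, as in the curl example
        headers={
            "X-ORGANIZATION-SECRET": organization_secret,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_knowledge_base_id(response_body: str) -> int:
    """Read knowledgeBaseID from a response shaped like the example response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError(f"Request was not successful: {payload}")
    return payload["knowledgeBaseID"]
```

To send the request, pass it to `urllib.request.urlopen(...)` and feed the decoded response body to `extract_knowledge_base_id`.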
- Add text to the knowledge base entry
curl -X PUT https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json" \
-d '{
"rawText": "Your training text content goes here. This can be any text you want your replica to learn from, such as product information, company policies, or specialized knowledge."
}'
After adding text, the system automatically processes it and makes it available for your replica to use when answering questions.
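When the training text contains quotes or line breaks, hand-writing the JSON body inside a shell string becomes error-prone. The sketch below builds the same PUT request in Python and lets `json.dumps` handle the escaping. The function name `build_add_text_request` is hypothetical; the endpoint, headers, and the `rawText` field come from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.sensay.io/v1"  # base URL taken from the curl examples


def build_add_text_request(
    replica_uuid: str,
    knowledge_base_id: int,
    raw_text: str,
    organization_secret: str,
) -> urllib.request.Request:
    """Build (but do not send) the PUT request that attaches text to an entry."""
    # json.dumps escapes quotes, backslashes, and newlines in raw_text.
    body = json.dumps({"rawText": raw_text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/replicas/{replica_uuid}/training/{knowledge_base_id}",
        data=body,
        headers={
            "X-ORGANIZATION-SECRET": organization_secret,
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```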
Method 2: Uploading text-based files
- Get a signed URL for file upload
curl -X GET "https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/files/upload?filename=your_file.pdf" \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json"
Example response:
{
"success": true,
"signedURL": "https://storage.googleapis.com/...",
"knowledgeBaseID": 12345
}
This prepares the system for your file upload and returns a signed URL where you can upload your file, along with the knowledge base ID for tracking the entry. Files up to 50 MB are supported.
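File names with spaces or other special characters need to be URL-encoded before they go into the `filename` query parameter. The following sketch shows one way to do that in Python. The function names are hypothetical, percent-encoding the file name is an assumption about what the server accepts, and the response fields are those shown in the example response.

```python
import json
import urllib.parse
import urllib.request
from typing import Tuple

API_BASE = "https://api.sensay.io/v1"  # base URL taken from the curl examples


def build_signed_url_request(
    replica_uuid: str, filename: str, organization_secret: str
) -> urllib.request.Request:
    """Build (but do not send) the GET request that asks for a signed upload URL."""
    # Percent-encode the file name so spaces and '&' cannot break the query string.
    query = urllib.parse.urlencode({"filename": filename}, quote_via=urllib.parse.quote)
    return urllib.request.Request(
        url=f"{API_BASE}/replicas/{replica_uuid}/training/files/upload?{query}",
        headers={
            "X-ORGANIZATION-SECRET": organization_secret,
            "Content-Type": "application/json",
        },
        method="GET",
    )


def parse_signed_url_response(response_body: str) -> Tuple[str, int]:
    """Return (signedURL, knowledgeBaseID) from a response shaped like the example."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError(f"Request was not successful: {payload}")
    return payload["signedURL"], payload["knowledgeBaseID"]
```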
- Upload the file to the signed URL
curl -X PUT "$SIGNED_URL" \
-H "Content-Type: application/octet-stream" \
--data-binary @/path/to/your/file.pdf
After uploading, the system automatically extracts text from your file, processes it, and makes it available for your replica to use.
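As a rough Python equivalent of the upload step, the sketch below checks the file size locally before building the PUT request. The name `build_upload_request` is hypothetical, and treating the 50 MB limit as 50 × 1024 × 1024 bytes is an assumption, since the exact byte count is not stated here.

```python
import urllib.request

# Assumption: "50MB" is interpreted as 50 MiB; the exact limit is not specified.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def build_upload_request(signed_url: str, file_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request that uploads a file to the signed URL."""
    if len(file_bytes) > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"File is {len(file_bytes)} bytes; the limit assumed here is {MAX_UPLOAD_BYTES}"
        )
    # As in the curl example, no organization secret is sent with this request.
    return urllib.request.Request(
        url=signed_url,
        data=file_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )
```

A typical call would read the file with `open(path, "rb").read()` and pass the resulting request to `urllib.request.urlopen(...)`.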
Managing knowledge base entries
List all knowledge base entries
curl -X GET https://api.sensay.io/v1/replicas/$REPLICA_UUID/training \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json"
Get a specific knowledge base entry
curl -X GET https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json"
Example response:
{
"success": true,
"id": 12345,
"replica_uuid": "12345678-1234-1234-1234-123456789abc",
"type": "text",
"filename": null,
"status": "READY",
"raw_text": "Your training text content...",
"processed_text": "Optimized version of your content...",
"created_at": "2025-04-15T08:11:00.093761+00:00",
"updated_at": "2025-04-15T08:11:05.299349+00:00",
"title": null,
"description": null
}
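To work with such a response in code, it can help to parse it into a typed structure. The sketch below is one possible shape, not an official schema: the `KnowledgeBaseEntry` class and `parse_entry` function are hypothetical, and which fields may be null is inferred only from the example response.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class KnowledgeBaseEntry:
    id: int
    replica_uuid: str
    type: str
    status: str
    filename: Optional[str]        # null for text entries in the example
    raw_text: Optional[str]
    processed_text: Optional[str]
    created_at: datetime
    updated_at: datetime


def parse_entry(response_body: str) -> KnowledgeBaseEntry:
    """Parse a response shaped like the example into a KnowledgeBaseEntry."""
    payload = json.loads(response_body)
    return KnowledgeBaseEntry(
        id=payload["id"],
        replica_uuid=payload["replica_uuid"],
        type=payload["type"],
        status=payload["status"],
        filename=payload.get("filename"),
        raw_text=payload.get("raw_text"),
        processed_text=payload.get("processed_text"),
        # Timestamps in the example are ISO 8601 with a UTC offset.
        created_at=datetime.fromisoformat(payload["created_at"]),
        updated_at=datetime.fromisoformat(payload["updated_at"]),
    )
```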
Delete a knowledge base entry
curl -X DELETE https://api.sensay.io/v1/replicas/$REPLICA_UUID/training/$KNOWLEDGE_BASE_ID \
-H "X-ORGANIZATION-SECRET: $ORGANIZATION_SECRET" \
-H "Content-Type: application/json"
Example response:
{
"success": true
}
Understanding knowledge base status values
The status field in a knowledge base entry indicates its current processing state:
- BLANK: Initial state for a newly created text entry
- AWAITING_UPLOAD: Initial state for a file entry before upload
- SUPABASE_ONLY: File has been uploaded but not yet processed
- PROCESSING: Entry is being processed
- READY: Entry has been fully processed and is available for retrieval
- SYNC_ERROR: An error occurred during synchronization
- ERR_FILE_PROCESSING: An error occurred during file processing
- ERR_TEXT_PROCESSING: An error occurred during text processing
- ERR_TEXT_TO_VECTOR: An error occurred during vector conversion
If an entry ends up in one of the error states, you may need to delete the entry and try again.
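Because processing is asynchronous, a client usually has to poll an entry until it reaches READY or one of the error states. The following sketch groups the status values listed above and polls through a caller-supplied function. All names here (`classify_status`, `wait_until_ready`, `fetch_status`) are hypothetical, and the polling interval and attempt limit are arbitrary assumptions rather than documented recommendations.

```python
import time
from typing import Callable

ERROR_STATUSES = {
    "SYNC_ERROR",
    "ERR_FILE_PROCESSING",
    "ERR_TEXT_PROCESSING",
    "ERR_TEXT_TO_VECTOR",
}
PENDING_STATUSES = {"BLANK", "AWAITING_UPLOAD", "SUPABASE_ONLY", "PROCESSING"}


def classify_status(status: str) -> str:
    """Map a status value to 'ready', 'error', 'pending', or 'unknown'."""
    if status == "READY":
        return "ready"
    if status in ERROR_STATUSES:
        return "error"
    if status in PENDING_STATUSES:
        return "pending"
    return "unknown"


def wait_until_ready(
    fetch_status: Callable[[], str],
    interval_seconds: float = 2.0,   # assumed value, not from the documentation
    max_attempts: int = 30,          # assumed value, not from the documentation
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until the entry is READY or in an error state.

    Returns the final status; raises TimeoutError if neither is reached.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if classify_status(status) in ("ready", "error"):
            return status
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"Entry not ready after {max_attempts} attempts")
```

Here `fetch_status` would typically call the "Get a specific knowledge base entry" endpoint and return the `status` field; if the returned status is an error state, the caller can delete the entry and retry.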