Knowledge Bases API

Manage knowledge bases and documents for Retrieval-Augmented Generation (RAG).

Knowledge Base Management

List Knowledge Bases

GET /orgs/{organizationId}/knowledgebases

Query Parameters:

  • limit (optional): Number of results (default: 20)
  • cursor (optional): Pagination offset
  • orderBy (optional): Sort field
  • forAgentId (optional): Get or create the personal knowledge base for a specific agent
  • includePersonalKBs (optional): Set to "true" to include personal knowledge bases

Response:

{
  "items": [
    {
      "knowledgeBaseId": "kb-123",
      "organizationId": "org-456",
      "type": "ORGANIZATION",
      "title": "Product Documentation",
      "llmDescription": "Product docs and FAQs for our software",
      "creatorUserId": "user-789",
      "isPublic": false,
      "createdAt": "2024-01-01T00:00:00Z",
      "updatedAt": "2024-01-01T00:00:00Z"
    }
  ],
  "totalRows": 1,
  "offset": 0
}
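
Because the response reports totalRows and offset, and cursor is described as a pagination offset, paging through all knowledge bases can be sketched as follows. This is a minimal sketch, assuming cursor takes the numeric offset of the first row wanted; fetch_page and fake_fetch are hypothetical stand-ins for an authenticated GET to the endpoint above and are not part of the API.

```python
from typing import Callable, Dict, Iterator


def iter_knowledge_bases(
    fetch_page: Callable[[Dict[str, str]], dict],
    limit: int = 20,
) -> Iterator[dict]:
    """Yield every knowledge base by walking offset-based pages.

    fetch_page is a hypothetical callable that performs
    GET /orgs/{organizationId}/knowledgebases with the given query
    parameters and returns the decoded JSON body.
    """
    offset = 0
    while True:
        # Assumption: `cursor` is the numeric offset of the first row wanted.
        page = fetch_page({"limit": str(limit), "cursor": str(offset)})
        items = page.get("items", [])
        yield from items
        offset += len(items)
        # Stop on an empty page or once every reported row has been seen.
        if not items or offset >= page.get("totalRows", 0):
            break


def fake_fetch(params: Dict[str, str]) -> dict:
    """In-memory stand-in for the API, used only to exercise the loop."""
    rows = [{"knowledgeBaseId": f"kb-{i}"} for i in range(5)]
    start, size = int(params["cursor"]), int(params["limit"])
    return {"items": rows[start:start + size], "totalRows": len(rows), "offset": start}


all_ids = [kb["knowledgeBaseId"] for kb in iter_knowledge_bases(fake_fetch, limit=2)]
print(all_ids)
```

Passing the fetch function in keeps the paging logic independent of any particular HTTP client or authentication scheme.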

Create Knowledge Base

POST /orgs/{organizationId}/knowledgebases

Required Fields:

  • title: Knowledge base name
  • llmDescription: Description for LLM to understand when to use this KB
  • type: Either "ORGANIZATION" or "PERSONAL"

Request Body:

{
  "title": "Product Documentation",
  "llmDescription": "Contains all product documentation, user guides, FAQs, and technical specifications for our software platform",
  "type": "ORGANIZATION"
}

Response: Returns the created KnowledgeBase object, including its knowledgeBaseId

ℹ️
The llmDescription field is crucial: it tells the LLM when this knowledge base should be used. Be specific and descriptive.
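
A create call could be prepared as in the sketch below, which checks the three required fields before building the request. The host in BASE_URL and the Bearer token header are assumptions for illustration; this section does not state the API host or the authentication scheme.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; replace with the real one


def build_create_kb_request(org_id: str, token: str, title: str,
                            llm_description: str,
                            kb_type: str = "ORGANIZATION") -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a knowledge base."""
    if kb_type not in ("ORGANIZATION", "PERSONAL"):
        raise ValueError("type must be ORGANIZATION or PERSONAL")
    if not title or not llm_description:
        raise ValueError("title and llmDescription are required")
    body = json.dumps({
        "title": title,
        "llmDescription": llm_description,
        "type": kb_type,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/orgs/{org_id}/knowledgebases",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        },
    )


req = build_create_kb_request(
    "org-456", "my-token", "Product Documentation",
    "Contains all product documentation, user guides, FAQs, and technical "
    "specifications for our software platform",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
print(req.get_method(), req.full_url)
```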

Get Knowledge Base

GET /orgs/{organizationId}/knowledgebases/{id}

Response: Returns KnowledgeBase object

Update Knowledge Base

PUT /orgs/{organizationId}/knowledgebases/{id}

Required Fields:

  • title: Knowledge base title
  • llmDescription: LLM description

Request Body:

{
  "title": "Updated Product Documentation",
  "llmDescription": "Updated description with more context about what this KB contains"
}

Delete Knowledge Base

DELETE /orgs/{organizationId}/knowledgebases/{id}

Permanently deletes the knowledge base and all its files.

Response:

{
  "message": "File deleted",
  "knowledgeBaseId": "kb-123"
}

Copy Knowledge Base

POST /orgs/{organizationId}/knowledgebases/{id}/copy-to

Copies the entire knowledge base to another organization.

File Management

List Files

GET /orgs/{organizationId}/knowledgebases/{id}/files

Query Parameters:

  • query (optional): Search query to filter files
  • status (optional): Filter by status (WAITING-UPLOAD, ENQUEUED, PROCESSING, READY, ERROR)
  • limit (optional): Number of results (default: 20)
  • cursor (optional): Pagination offset

Response:

{
  "items": [
    {
      "fileId": "file-123",
      "knowledgeBaseId": "kb-456",
      "organizationId": "org-789",
      "fileName": "product-guide.pdf",
      "key": "s3-key-path",
      "status": "READY",
      "embeddingMode": "TEXT",
      "customInstructions": null,
      "statusMessage": null,
      "externalFileUrl": null,
      "creatorUserId": "user-101",
      "createdAt": "2024-01-01T00:00:00Z",
      "updatedAt": "2024-01-01T00:05:00Z"
    }
  ],
  "totalRows": 1,
  "offset": 0
}
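
The query parameters above can be assembled as in this sketch, which also rejects status values outside the documented set before any request is made. BASE_URL is a hypothetical host, and treating cursor as a numeric offset is an assumption based on its description.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host, not from the docs

FILE_STATUSES = {"WAITING-UPLOAD", "ENQUEUED", "PROCESSING", "READY", "ERROR"}


def list_files_url(org_id: str, kb_id: str, status: Optional[str] = None,
                   query: Optional[str] = None, limit: int = 20,
                   cursor: int = 0) -> str:
    """Build the List Files URL, rejecting statuses the API does not list."""
    if status is not None and status not in FILE_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    params = {"limit": limit, "cursor": cursor}
    if status is not None:
        params["status"] = status
    if query:
        params["query"] = query
    return f"{BASE_URL}/orgs/{org_id}/knowledgebases/{kb_id}/files?{urlencode(params)}"


url = list_files_url("org-789", "kb-456", status="ERROR")
print(url)
```

Filtering on ERROR this way is a quick route to the files whose statusMessage needs attention.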

Upload File

Step 1: Get upload URL

POST /orgs/{organizationId}/knowledgebases/{id}/files/upload-url

Request Body:

{
  "fileName": "document.pdf",
  "contentType": "application/pdf",
  "createDownloadLink": false
}

Required Fields:

  • fileName: Name of the file
  • contentType: MIME type (e.g., application/pdf, image/png, text/plain)

Optional Fields:

  • createDownloadLink: If true, creates a 7-day temporary download link

Response:

{
  "fileId": "file-123",
  "knowledgeBaseId": "kb-456",
  "organizationId": "org-789",
  "fileName": "document.pdf",
  "key": "orgs/org-789/kbs/kb-456/files/file-123",
  "status": "WAITING-UPLOAD",
  "embeddingMode": "TEXT",
  "creatorUserId": "user-101",
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z",
  "uploadUrl": "https://s3.amazonaws.com/signed-url...",
  "downloadLink": "https://short.link/abc123"
}

Step 2: Upload file to S3

curl -X PUT "${uploadUrl}" \
  -H "Content-Type: application/pdf" \
  --data-binary "@document.pdf"

Step 3: Ingest file

POST /orgs/{organizationId}/knowledgebases/{id}/files/{fileId}/s3-ingest

⚠️
This endpoint is marked as “DO NOT USE IN PRODUCTION” in the code. File ingestion happens automatically via background processing after upload. This endpoint is for development/testing only.

The file status will move through WAITING-UPLOAD → ENQUEUED → PROCESSING → READY as it’s processed.
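
Steps 1 and 2 can be chained as in the sketch below; step 3 is left out because ingestion is described as automatic. BASE_URL, the Bearer token header, and the send parameter are assumptions for illustration: send defaults to urllib.request.urlopen and exists only so the flow can be exercised without a network.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not from the docs


def upload_url_request(org_id: str, kb_id: str, token: str,
                       file_name: str, content_type: str) -> urllib.request.Request:
    """Step 1: ask the API for a signed upload URL."""
    body = json.dumps({"fileName": file_name, "contentType": content_type})
    return urllib.request.Request(
        f"{BASE_URL}/orgs/{org_id}/knowledgebases/{kb_id}/files/upload-url",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        },
    )


def s3_put_request(upload_url: str, payload: bytes,
                   content_type: str) -> urllib.request.Request:
    """Step 2: PUT the raw bytes to the signed URL with the same Content-Type."""
    return urllib.request.Request(
        upload_url, data=payload, method="PUT",
        headers={"Content-Type": content_type},
    )


def upload_bytes(org_id: str, kb_id: str, token: str, file_name: str,
                 payload: bytes, content_type: str,
                 send=urllib.request.urlopen) -> str:
    """Run steps 1 and 2 and return the fileId of the new file record."""
    request = upload_url_request(org_id, kb_id, token, file_name, content_type)
    with send(request) as resp:
        record = json.load(resp)
    with send(s3_put_request(record["uploadUrl"], payload, content_type)):
        pass  # the PUT response body is not needed
    return record["fileId"]
```

A caller would read the file in binary mode and pass the bytes along with the same contentType it declared in step 1, then watch the file's status until it reaches READY or ERROR.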

Supported File Types:

  • PDF files: .pdf
  • Images: .png, .jpg, .jpeg, .gif, .webp
  • Text and document files: .txt, .md, .csv, .json, .xml, .html, .docx
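
Since a mismatched contentType can make an upload fail, it may help to derive it from the file extension before requesting an upload URL. The sketch below maps the supported extensions to their standard MIME types; whether the API expects exactly these strings is an assumption, and content_type_for is a hypothetical helper.

```python
import os

# Standard registered MIME types for the supported extensions listed above.
# Assumption: the API accepts these exact values for contentType.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".txt": "text/plain",
    ".md": "text/markdown",
    ".csv": "text/csv",
    ".json": "application/json",
    ".xml": "application/xml",
    ".html": "text/html",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def content_type_for(file_name: str) -> str:
    """Return the MIME type for a supported file, or raise ValueError."""
    ext = os.path.splitext(file_name)[1].lower()
    try:
        return CONTENT_TYPES[ext]
    except KeyError:
        raise ValueError(f"unsupported file type: {ext or file_name!r}") from None


print(content_type_for("product-guide.pdf"))
```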

Delete File

DELETE /orgs/{organizationId}/knowledgebases/{id}/files/{fileId}

Permanently deletes the file and all its embeddings.

Response:

{
  "message": "File deleted",
  "fileId": "file-123"
}

Reindex File

POST /orgs/{organizationId}/knowledgebases/{id}/files/{fileId}/reindex

Reprocesses and re-embeds the file content. Useful if the embedding algorithm changes or the file needs updating.

Download File Link

POST /orgs/{organizationId}/knowledgebases/{id}/files/{fileId}/download-link

Generates a temporary signed download URL for the file.

Copy File

POST /orgs/{organizationId}/knowledgebases/{id}/file/{fileId}/copy-to

Copies a file to another knowledge base.

Request Body:

{
  "targetKnowledgeBaseId": "kb-456",
  "targetOrganizationId": "org-789"
}

Search Knowledge Base

POST /orgs/{organizationId}/knowledgebases/{id}/search

Performs semantic search across all files in the knowledge base.

Request Body:

{
  "query": "How do I reset my password?",
  "limit": 10
}

Response:

{
  "results": [
    {
      "vectorId": "vec-123",
      "fileId": "file-456",
      "fileName": "user-guide.pdf",
      "key": "s3-key",
      "pages": "5-7",
      "text": "To reset your password, navigate to Settings > Account > Reset Password...",
      "similarity": 0.92,
      "externalFileUrl": null
    }
  ]
}
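
Search results can be post-processed on the client, for example to keep only strong matches before passing text to an agent. This is a minimal sketch, assuming a higher similarity means a closer match; the 0.8 threshold and both function names are illustrative choices, not values from the API.

```python
import json
from typing import List


def build_search_body(query: str, limit: int = 10) -> bytes:
    """Encode the JSON request body for the search endpoint."""
    return json.dumps({"query": query, "limit": limit}).encode("utf-8")


def confident_passages(response: dict, min_similarity: float = 0.8) -> List[str]:
    """Return passages at or above the threshold, best match first."""
    hits = [r for r in response.get("results", [])
            if r["similarity"] >= min_similarity]
    hits.sort(key=lambda r: r["similarity"], reverse=True)
    return [f'{r["fileName"]} (pages {r["pages"]}): {r["text"]}' for r in hits]


# Sample response shaped like the one above, plus one weak match.
sample = {
    "results": [
        {"fileName": "changelog.md", "pages": "1", "similarity": 0.41,
         "text": "Version 2.3 adds dark mode."},
        {"fileName": "user-guide.pdf", "pages": "5-7", "similarity": 0.92,
         "text": "To reset your password, navigate to Settings > Account > Reset Password..."},
    ]
}

passages = confident_passages(sample)
print(passages)
```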

Knowledge Base ETL

ETL (Extract, Transform, Load) configurations allow automatic processing of files when they’re added to a knowledge base.

List ETL Configurations

GET /orgs/{organizationId}/knowledgebases/{id}/etls

Response: Array of KnowledgeBaseETLView objects

Create ETL Configuration

POST /orgs/{organizationId}/knowledgebases/{id}/etls

Request Body:

{
  "etlType": "agent",
  "agentId": "agent-123",
  "prompt": "Extract key information from this document and summarize it",
  "conditions": [
    {
      "field": "fileName",
      "comparison": "ENDS_WITH",
      "value": ".pdf",
      "mergeOperator": "AND"
    },
    {
      "field": "status",
      "comparison": "EQUALS",
      "value": "READY"
    }
  ]
}

ETL Types:

  • agent: Run an agent on the file
  • action-chain: Execute an action chain

Condition Comparisons:

  • EQUALS - Exact match
  • CONTAINS - Contains substring
  • STARTS_WITH - Starts with value
  • ENDS_WITH - Ends with value
  • IS_EMPTY - Field is empty
  • IS_NOT_EMPTY - Field is not empty
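
A configuration can be checked against the ETL types and comparisons listed above before it is sent, which helps when an ETL does not trigger because of a misspelled value. The sketch below validates only what this section documents; etl_config_problems is a hypothetical helper, and it makes no assumption about how mergeOperator combines conditions.

```python
from typing import List

ETL_TYPES = {"agent", "action-chain"}
COMPARISONS = {"EQUALS", "CONTAINS", "STARTS_WITH", "ENDS_WITH",
               "IS_EMPTY", "IS_NOT_EMPTY"}


def etl_config_problems(config: dict) -> List[str]:
    """Return human-readable problems found in an ETL config (empty if none)."""
    problems = []
    if config.get("etlType") not in ETL_TYPES:
        problems.append(f"etlType must be one of {sorted(ETL_TYPES)}")
    for index, cond in enumerate(config.get("conditions", [])):
        if not cond.get("field"):
            problems.append(f"conditions[{index}]: missing field")
        if cond.get("comparison") not in COMPARISONS:
            problems.append(
                f"conditions[{index}]: unknown comparison {cond.get('comparison')!r}"
            )
    return problems


config = {
    "etlType": "agent",
    "agentId": "agent-123",
    "prompt": "Extract key information from this document and summarize it",
    "conditions": [
        {"field": "fileName", "comparison": "ENDS_WITH", "value": ".pdf",
         "mergeOperator": "AND"},
        {"field": "status", "comparison": "EQUALS", "value": "READY"},
    ],
}
print(etl_config_problems(config))
```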

Get ETL Configuration

GET /orgs/{organizationId}/knowledgebases/{id}/etls/{etlId}

Update ETL Configuration

PUT /orgs/{organizationId}/knowledgebases/{id}/etls/{etlId}

Delete ETL Configuration

DELETE /orgs/{organizationId}/knowledgebases/{id}/etls/{etlId}

File Status Lifecycle

Understanding file statuses:

  1. WAITING-UPLOAD: File record created, waiting for S3 upload
  2. ENQUEUED: File uploaded to S3, queued for processing
  3. PROCESSING: File is being parsed and embedded
  4. READY: File processed successfully, embeddings available
  5. ERROR: Processing failed (check statusMessage for details)
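
The lifecycle above suggests polling a file until it reaches READY or ERROR. This is a minimal sketch: get_file is a hypothetical callable that returns the current file record (for instance by locating the file in a List Files response, since no single-file GET is documented here), and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"READY", "ERROR"}


def wait_until_processed(get_file: Callable[[], dict],
                         interval: float = 2.0,
                         timeout: float = 300.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll get_file until the status is READY or ERROR, then return the record.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        record = get_file()
        if record["status"] in TERMINAL_STATUSES:
            return record
        if clock() >= deadline:
            raise TimeoutError(f"file still {record['status']} after {timeout}s")
        sleep(interval)


# Offline demonstration with a scripted sequence of statuses.
statuses = iter(["ENQUEUED", "PROCESSING", "READY"])
final = wait_until_processed(lambda: {"status": next(statuses)},
                             sleep=lambda seconds: None)
print(final["status"])
```

When the returned record has status ERROR, its statusMessage is the place to look for details.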

Best Practices

  1. Descriptive LLM Descriptions - Make llmDescription specific so agents know when to use the KB
  2. File Organization - Use separate knowledge bases for different domains/topics
  3. File Formats - PDF and text files work best; images use OCR
  4. ETL for Automation - Set up ETL configs to automatically process new files
  5. Monitor Status - Check file status after upload to ensure processing completed
  6. Search Testing - Test semantic search to verify content is retrievable
  7. Public vs Private - Use isPublic: false for sensitive organizational data

Common Issues

Issue                     | Solution
File stuck in PROCESSING  | Check statusMessage for error details
Search returns no results | Verify file status is READY and embeddings were created
Upload fails              | Verify contentType matches actual file type
ETL not triggering        | Check condition field names and comparison values