ai_services app

Multi-provider AI: chatbot, embeddings for RAG, AI-generated reports, conversational onboarding.

Models (8)

  • AIConfig — per-org provider config (chat + embedding); encrypted API keys; monthly_query_cap; is_logging_enabled
  • PlatformAIConfig — platform-default fallback used when an org has not configured its own provider
  • ContentEmbedding — vector index of indexable content; pgvector column; FK to org
  • EmbeddingJob — long-running batch indexing job state
  • ReportRequest — AI-generated report (e.g., "summary of last quarter"); status, output URL
  • OnboardingTemplate — conversational onboarding flow definition
  • OnboardingStep — individual step in a template
  • OnboardingProgress — per-user state in an onboarding flow

Key endpoints

  • GET /api/organizations/<id>/ai/config/ — Org AI config
  • PATCH /api/organizations/<id>/ai/config/ — Update provider / model / keys
  • GET /api/organizations/<id>/ai/config/providers/ — Available providers
  • GET /api/organizations/<id>/ai/config/providers/<provider>/models/ — Models for a provider
  • POST /api/ai/embeddings/jobs/ — Trigger reindex
  • GET /api/ai/embeddings/jobs/<id>/ — Job status
  • POST /api/ai/reports/ — Request a report
  • GET /api/ai/reports/<id>/ — Report status + download URL
  • WS /ws/ai/chat/ — Chatbot WebSocket connection

Permissions

  • IsNationalAdmin — config + report management
  • IsAuthenticated — chat + onboarding (per-user)
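The split above can be sketched as a plain predicate (a hedged illustration — the real checks are DRF permission classes; the action names here are hypothetical):

```python
# Illustrative model of the permission split: config/report management
# requires a national admin, chat/onboarding only an authenticated user.
# Action names are invented for this sketch.
CONFIG_ACTIONS = {"update_config", "manage_reports"}

def is_allowed(action, *, is_authenticated, is_national_admin):
    if action in CONFIG_ACTIONS:
        return is_authenticated and is_national_admin
    return is_authenticated
```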

Background tasks

  • generate_embeddings(content_type, content_id, org_id) — embeds a single content row
  • reindex_org_content(org_id) — full reindex; partition by content type for parallelism
  • process_report_request(report_id) — runs the report job, uploads PDF / XLSX to S3
  • auto_assign_courses_to_users (cross-app helper) — uses AI to suggest course assignments based on profile
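The fan-out shape of reindex_org_content might look like the sketch below (assumptions: in production enqueue would be a Celery .delay() call; here it is a plain callable so the partitioning logic stands alone):

```python
from collections import defaultdict

def reindex_org_content(rows, enqueue):
    """Partition an org's indexable rows by content type, then enqueue one
    generate_embeddings call per row. rows: (content_type, content_id, org_id).
    One partition per content type lets workers process types in parallel."""
    partitions = defaultdict(list)
    for content_type, content_id, org_id in rows:
        partitions[content_type].append((content_id, org_id))
    for content_type, items in partitions.items():
        for content_id, org_id in items:
            enqueue(content_type, content_id, org_id)
    # Return per-type counts, useful for EmbeddingJob progress tracking.
    return {ct: len(items) for ct, items in partitions.items()}
```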

External integrations

  • Anthropic (Claude — default chat model)
  • OpenAI (GPT, text-embedding-3-*)
  • Google (Gemini, text-embedding-004)
  • Voyage AI (Voyage-2 embeddings, used with Anthropic chat)

Notable patterns

Provider abstraction

apps/ai_services/providers/:

  • base.py — BaseProvider interface (chat(messages, stream=True), embed(text), count_tokens(text))
  • anthropic_provider.py
  • openai_provider.py
  • google_provider.py

apps/ai_services/services/router.py picks the provider based on org.ai_config.chat_provider.
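A minimal sketch of the interface plus the router lookup, assuming a name→class registry (the stub provider behavior and registry shape are illustrative, not the real implementation):

```python
from abc import ABC, abstractmethod

class BaseProvider(ABC):
    """Mirror of the base.py interface named above."""
    @abstractmethod
    def chat(self, messages, stream=True): ...
    @abstractmethod
    def embed(self, text): ...
    @abstractmethod
    def count_tokens(self, text): ...

class AnthropicProvider(BaseProvider):
    def chat(self, messages, stream=True):
        return iter(["(stubbed Claude reply)"])   # real impl streams SDK chunks
    def embed(self, text):
        raise NotImplementedError("Anthropic chat pairs with Voyage embeddings")
    def count_tokens(self, text):
        return len(text.split())                  # crude stand-in for a tokenizer

REGISTRY = {"anthropic": AnthropicProvider}

def get_chat_provider(chat_provider):
    """router.py-style resolution from org.ai_config.chat_provider."""
    try:
        return REGISTRY[chat_provider]()
    except KeyError:
        raise ValueError(f"unknown chat provider: {chat_provider}")
```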

Variable-dim embeddings

Different providers produce different embedding dimensions (Voyage 1024, OpenAI 1536, Google 768). The ContentEmbedding.embedding column is a pgvector(1536) (most common); rows from other providers either pad / truncate or use a separate column. The HNSW index is built per-dim — see migration files for current state.
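The pad / truncate option mentioned above can be sketched as follows (function name is ours; note that zero-padding keeps vectors queryable in one column but cosine distances are only meaningful between vectors from the same provider):

```python
TARGET_DIM = 1536  # the ContentEmbedding.embedding pgvector column width

def fit_to_column(vec, dim=TARGET_DIM):
    """Normalize a provider embedding to the fixed column dimension:
    truncate if too long, zero-pad if too short (e.g. 768-dim Google)."""
    if len(vec) >= dim:
        return vec[:dim]
    return vec + [0.0] * (dim - len(vec))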

RAG pipeline

(diagram: AI / RAG pipeline)

Chat consumer

apps/ai_services/consumers.py — ChatConsumer(AsyncWebsocketConsumer):

  1. Authenticate user (resolve from cookie / token)
  2. Receive question
  3. Embed question
  4. Vector search filtered by org_id + user access scope
  5. Construct prompt with retrieved snippets
  6. Stream LLM response chunk-by-chunk back to client
  7. Persist ChatTurn (if logging enabled)

If the user disconnects, cancel the LLM stream to save tokens.
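The disconnect rule can be modeled with a toy asyncio sketch (all names are illustrative; the real code is the ChatConsumer above — the point is that cancelling the pump task stops token generation mid-stream):

```python
import asyncio

async def fake_llm_stream(chunks, generated):
    """Stand-in for a provider's streaming response."""
    for chunk in chunks:
        await asyncio.sleep(0)    # yield control, like a network read
        generated.append(chunk)   # token actually produced (i.e. paid for)
        yield chunk

async def run_chat(chunks, sent, generated, disconnect_after):
    async def send_to_client(chunk):
        sent.append(chunk)

    async def pump():
        async for chunk in fake_llm_stream(chunks, generated):
            await send_to_client(chunk)

    task = asyncio.create_task(pump())
    # Simulate the client dropping after a few chunks have arrived.
    while len(sent) < disconnect_after and not task.done():
        await asyncio.sleep(0)
    task.cancel()                 # disconnect handler: stop paying for tokens
    try:
        await task
    except asyncio.CancelledError:
        pass
```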

Cost control

  • monthly_query_cap per org; checked before each chat / report
  • Token-usage tracking on every call (stored regardless of is_logging_enabled for billing)
  • Output token cap per response (default 1024)
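The pre-call cap check might reduce to something like this sketch (the None-means-unlimited convention is an assumption, not documented behavior):

```python
def check_query_allowed(queries_this_month, monthly_query_cap):
    """Run before each chat/report call. monthly_query_cap mirrors the
    AIConfig field; None is treated here as unlimited (assumption)."""
    if monthly_query_cap is None:
        return True
    return queries_this_month < monthly_query_cap
```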

BYOM (Bring Your Own Model)

Org admins can override chat_api_key / embedding_api_key to use their own provider account; otherwise the platform's keys are used and billed via subscription tier.
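Key resolution under BYOM amounts to a fallback chain, sketched here (function name and return shape are ours):

```python
def resolve_chat_api_key(org_chat_api_key, platform_chat_api_key):
    """Prefer the org's own key (BYOM, billed to their provider account);
    fall back to the platform key (billed via subscription tier).
    Returns (key, billed_to_platform)."""
    if org_chat_api_key:
        return org_chat_api_key, False
    return platform_chat_api_key, True
```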

Code paths

  • Models: backend/apps/ai_services/models.py
  • Providers: backend/apps/ai_services/providers/
  • Consumers: backend/apps/ai_services/consumers.py
  • Retrieval: backend/apps/ai_services/services/retrieval.py
  • Tasks: backend/apps/ai_services/tasks.py