AI Agent Platform

Internal • April 2026
Assisted • Sonnet • Haiku • pgvector
01 — Overview
Multi-Agent GTM Platform
A specialized AI agent platform powering Spekit's Go-To-Market. 7 core agents + 7 sub-agents orchestrated through Atlas, backed by semantic memory, a durable job queue, and human governance.
7
Core Agents
7
Sub-Agents
12+
PostgreSQL Tables
3
LLM Backends
4
Graduation Stages
1024d
pgvector Embeddings
Slack
Socket Mode Bot
Open WebUI
Chat Interface
OpenAI-compatible API — /v1/chat/completions
API BRIDGE — FastAPI
Authentication • Routing • Streaming • Feedback • Health
Each agent exposed as a selectable "model"
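Because each agent is just a "model" behind a standard chat-completions endpoint, calling one is a matter of setting the model field. A minimal sketch of the request payload, assuming the usual OpenAI chat-completions shape (the endpoint path and agent names come from this document; the exact fields the bridge accepts beyond these are an assumption):

```python
import json

def build_agent_request(agent: str, message: str, stream: bool = False) -> dict:
    """Build an OpenAI-compatible chat-completions payload.

    Each agent is exposed as a selectable "model", so routing to a
    specific agent only requires setting the model field.
    """
    return {
        "model": agent,  # e.g. "atlas", "ellegentic", "oracle"
        "messages": [{"role": "user", "content": message}],
        "stream": stream,
    }

payload = build_agent_request("ellegentic", "Draft a LinkedIn post about sales enablement.")
body = json.dumps(payload)  # POST this to /v1/chat/completions with a bearer token
```

Open WebUI builds the same payload internally, which is why it connects with no custom integration.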
ATLAS
Router • Haiku • Intent Analysis
Routes to the specialized agent based on intent
ELLEGENTIC
Content
INTELLIGENCE
Competitive
PROSPECTOR
Leads
CLOSER
Sales
HERALD
Distribution
ORACLE
Knowledge
PostgreSQL
pgvector • Jobs • Events • Memory
Content Brain
Semantic Search • 1024d Vectors
External APIs
fal.ai • VoyageAI • Remotion
02 — Routing
Atlas — The System's Brain
Atlas is the single entry point. It analyzes intent and decides: respond directly, search the Content Brain, or route to a specialized agent.
User Message
ATLAS analyzes intent
Claude Haiku • max 1024 tokens • <500ms
3 possible paths
Direct Response
Lightweight questions, greetings
Responds without routing
Content Brain
Semantic search first
Searches existing knowledge
Route to Agent
New content, heavy work
Forwards brief + context
Context forwarded to routed agents
audience • tone • keywords • goal • platform • desired_length • cta • user_id • user_profile • relevant_memories • conversation_history
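The forwarded brief can be pictured as a plain dictionary. The keys mirror the list above; every value here is an invented example:

```python
# Illustrative shape of the brief Atlas forwards to a routed agent.
# Keys come from the context list above; all values are made up.
routed_context = {
    "audience": "VP Sales at mid-market SaaS",
    "tone": "professional",
    "keywords": ["sales enablement", "AI"],
    "goal": "generate leads",
    "platform": "linkedin",
    "desired_length": "short",
    "cta": "Book a demo",
    "user_id": "U123",
    "user_profile": {"role": "marketer", "team": "GTM"},
    "relevant_memories": [],
    "conversation_history": [],
}
```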
03 — Core Agents
The 7 Specialized Agents
Each agent is a domain expert. Tools and configuration are listed on each card below.
🧭
ATLAS — Router
Haiku • Assisted • Phase 1
Single entry point. Analyzes intent, searches Content Brain, routes to agents.
Tools
route_to_agent • kill_switch • dm_oussama • schedule_task • schema_apply • ops_queue_inspector • ops_audit_viewer • ops_approval_viewer • ops_metrics_viewer • web_search • fetch_url
Targets: ellegentic, intelligence, prospector, herald, oracle, closer
Max tokens: 1024 • Internal router only
🎨
ELLEGENTIC — Content
Sonnet • Assisted • Phase 1
Master content orchestrator. Supervises 7 sub-agents for full content pipeline.
Tools
dispatch_to_copywriter • dispatch_to_videographer • dispatch_to_narrator • dispatch_to_reporter • dispatch_to_formatter • dispatch_to_visualist • dispatch_to_seo_engineer • search_content_brain • check_brand_score • get_style_rules • web_search
Max tokens: 2048 • Quality gates: Brand score ≥80
🔍
INTELLIGENCE — Competitive
Sonnet • Assisted • Phase 1
Competitor monitoring, battle cards, market signals. Tracks Seismic, Highspot, Mindtickle, Guru, Showpad, Allego, Gong.
Tools
monitor_competitor • generate_battle_card • search_market_signals • search_knowledge • store_finding • web_search • fetch_url
Max tokens: 4096
📈
PROSPECTOR — Lead Gen
Sonnet • Assisted • Phase 1
Prospect identification, enrichment, and ICP qualification. Mid-market B2B SaaS, 100-2000 employees, sales-led.
Tools
search_prospects • enrich_lead • find_contacts • qualify_lead • search_knowledge • store_finding • web_search
Max tokens: 4096 • ICP signals: Hiring, Series B/C, Competitor usage
🤝
CLOSER — Sales Support
Sonnet • Assisted • Phase 2
Pre-call deal support: account prep, objection handling with proof points, content recommendations.
Tools
prep_account • handle_objection • recommend_content • search_knowledge • web_search • fetch_url
Max tokens: 4096
📢
HERALD — Distribution
Sonnet • Assisted • Phase 2
Multi-platform publishing: Webflow, LinkedIn, Twitter, Email, Reddit. Human approval required.
Tools
prepare_publication • validate_assets • search_knowledge • web_search • fetch_url
Max tokens: 4096 • Preview → Validate → Approve → Publish
📚
ORACLE — Knowledge
Sonnet • Assisted • Phase 2
Knowledge manager. Answers questions about products, processes, competitive intel, customer stories.
Tools
search_knowledge • search_semantic_memory • web_search • fetch_url
Max tokens: 4096 • Sources: Content Brain, Semantic Memory
04 — Sub-Agents
ELLEGENTIC's 7 Workers
Each sub-agent has its own system prompt, tools, and quality gates.
✍️
Copywriter
Sonnet
Blog posts, emails, social copy. Spekit Style Guide.
Brand score ≥80 • 2048 tokens
🎬
Videographer
Sonnet
Video scene definitions (JSON) → Remotion → MP4.
Scenes: text_overlay, stats, quote, outro • 4096 tokens
🎙️
Narrator
Haiku
TTS voiceover via Qwen3-TTS MLX (Apple Silicon).
Backends: MLX, OpenAI, ElevenLabs, Polly • 512 tokens
📊
Reporter
Sonnet
Branded PDF reports with data and charts.
PDF validation, brand compliance • 4096 tokens
🔄
Formatter
Sonnet
Repurposing: blog → LinkedIn, Twitter, Email, Reddit.
Platform-specific validation • 2048 tokens
🖼️
Visualist
Sonnet
Image/video/audio via fal.ai (Flux Pro, Kling, Minimax).
fal.ai API • Prompt-only fallback • 2048 tokens
🌐
SEO Engineer
Sonnet
SEO/AEO: meta tags, slugs, FAQs, internal linking.
SEO score ≥70 • 2048 tokens
05 — How to Use
Agent Usage Guides
Step-by-step instructions for each agent. Access via Open WebUI (model dropdown) or Slack (message Atlas).
Using ELLEGENTIC — Content Creation
1
In Open WebUI, select ellegentic from the model dropdown, or message Atlas in Slack.
2
Describe what you need. Specify:
Content type: blog, email, social, video, report
Audience: e.g., "VP Sales at mid-market SaaS"
Tone: professional, conversational, authoritative
Goal: generate leads, educate, nurture, convert
3
ELLEGENTIC auto-dispatches to sub-agents. Blog: Copywriter → SEO Engineer → Formatter for social variants.
4
Review output. If approval required, it enters the queue with a Slack notification.
Expected Outputs
  • Full blog post (1500-3000 words) with SEO meta tags and slug
  • Social variants: 3-5 LinkedIn posts, Twitter thread, email section
  • Brand score ≥80 with style compliance report
  • SEO score ≥70 with keyword analysis
  • Optional: hero image, voiceover, video
Example: "Write a blog post about how AI is transforming sales enablement for mid-market SaaS. Target VP Sales. Professional but approachable. Include 3 LinkedIn posts and a Twitter thread."
Using INTELLIGENCE — Competitive Research
1
Select intelligence or ask Atlas: "Generate a battle card for Seismic."
2
Searches Content Brain for existing intel, then live web search for fresh data.
3
Results stored via store_finding for cross-agent access.
Expected Outputs
  • Battle card: strengths/weaknesses, pricing, win themes
  • Market signal report with competitor moves and trends
  • Competitive positioning with Spekit differentiation
Using PROSPECTOR — Lead Generation
1
Select prospector or ask: "Find 10 companies matching our ICP."
2
Searches for ICP matches: mid-market B2B SaaS, 100-2000 employees. Pain signals: rapid hiring, CRM switch, competitor usage.
3
Use enrich_lead, find_contacts, qualify_lead for deep-dive.
Expected Outputs
  • Qualified prospect list with ICP fit scores
  • Company profiles: size, funding, tech stack, pain signals
  • Key contacts with titles and LinkedIn
Using CLOSER — Sales Support
1
Select closer or ask: "Prep me for a call with Acme Corp."
2
Combines Content Brain + web research for pre-call briefs.
3
For objections, pulls proof points from case studies and intel.
Expected Outputs
  • Pre-call brief with background and recent news
  • Objection responses with customer evidence
  • Content recommendations by deal stage
Using HERALD — Distribution
1
Herald receives approved content from ELLEGENTIC, or select herald directly.
2
Validates against platform requirements and prepares previews.
3
Human approval required. Review in Slack before publication.
Expected Outputs
  • Platform-specific previews (Webflow, LinkedIn, Twitter, Email)
  • Asset validation report
  • Publication schedule
Using ORACLE — Knowledge Q&A
1
Select oracle or ask: "What's our enterprise pricing?"
2
Searches Content Brain and Semantic Memory, ranked by similarity + freshness.
3
Falls back to web search if internal knowledge is insufficient.
Expected Outputs
  • Sourced answers with document references
  • Product info, customer stories, process docs
06 — Memory
Multi-Layer Memory System
4 complementary layers: Content Brain, Semantic Memory, Agent Memory, Feedback Learning.
Content Brain
content_chunks • pgvector • ivfflat cosine
Documents → ~500-word chunks → 1024d embeddings → cosine similarity. 85% similarity + 15% freshness.
id • source_document • category • chunk_text • embedding (1024d) • metadata
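The 85/15 blend above can be sketched as a score function: 0.85 × cosine similarity + 0.15 × freshness. The weights come from this document; the exponential half-life decay used for freshness is an illustrative assumption, not the platform's documented formula.

```python
import math

def rank_score(cosine_sim: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Blend semantic similarity with recency: 85% similarity, 15% freshness.

    Freshness decays exponentially: 1.0 for a brand-new chunk, 0.5 at the
    half-life. The decay curve is an assumption for illustration.
    """
    freshness = math.exp(-math.log(2) * age_days / half_life_days)
    return 0.85 * cosine_sim + 0.15 * freshness

# A fresh, slightly-less-similar chunk can outrank a stale one.
fresh = rank_score(0.80, age_days=1)
stale = rank_score(0.83, age_days=365)
assert fresh > stale
```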
Semantic Memory
semantic_memory • pgvector • ivfflat cosine
Per-agent + shared scope. Agent isolation with NULL scope for cross-agent access. Importance scoring (0-1).
id • agent_name • content • embedding (1024d) • importance_score • access_count
Agent Memory
agent_memory • JSONB key-value • TTL-based
Short-term (24h) and long-term (permanent). recall(key), recall_with_user(key, user_id), recall_recent(n).
id • agent_name • key • value (JSONB) • memory_type • expires_at
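The short-term/long-term split can be sketched with an in-memory stand-in for the agent_memory table. recall(key) is named above; the store method name and the lazy-eviction behavior are assumptions for illustration:

```python
import time

class AgentMemory:
    """In-memory sketch of agent_memory semantics: short-term entries
    expire after 24h, long-term entries are permanent."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at or None)

    def store(self, key, value, memory_type="short_term"):
        # 24h TTL for short-term; None marks a permanent long-term entry.
        expires_at = time.time() + 24 * 3600 if memory_type == "short_term" else None
        self._store[key] = (value, expires_at)

    def recall(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() > expires_at:
            del self._store[key]  # lazily evict expired short-term entries
            return None
        return value

mem = AgentMemory()
mem.store("last_brief", {"goal": "educate"}, memory_type="long_term")
assert mem.recall("last_brief") == {"goal": "educate"}
assert mem.recall("missing") is None
```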
Feedback & Learning
feedback_log + feedback_patterns • LLM consolidation
User corrections → nightly consolidation → extracted patterns. Improves future outputs with confidence scoring.
original_output • correction • embedding • pattern_type • confidence
Memory Flow
Document Upload
PDF, Docs, Web
Chunking
~500 words
Embedding
VoyageAI 1024d
pgvector
IVFFlat cosine
Agent Query
Cosine Similarity
85% + 15% freshness
Ranked Results
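The chunking step in the flow above can be sketched as a plain word splitter. Real pipelines typically respect sentence boundaries and add overlap between chunks; this minimal version splits on whitespace only:

```python
def chunk_words(text: str, chunk_size: int = 500) -> list[str]:
    """Split a document into ~chunk_size-word chunks for embedding.

    A simplified sketch: no sentence awareness, no overlap.
    """
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

doc = "word " * 1200
chunks = chunk_words(doc)
assert len(chunks) == 3                  # 500 + 500 + 200 words
assert len(chunks[0].split()) == 500
assert len(chunks[-1].split()) == 200
```

Each chunk would then be embedded (VoyageAI, 1024d) and inserted into content_chunks.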
07 — Job Queue
Durable PostgreSQL Job Queue
Exponential retry, heartbeat, dead-letter handling. No Celery or Temporal.
PENDING
Waiting for claim
CLAIMED
Worker has lease
COMPLETED
Success
FAILED
Retry or dead-letter
DEAD LETTER
Max retries exhausted
Retry Policy
Base delay: 5s • Max delay: 300s • Multiplier: 2.0x exponential • Lease: 900s (15 min) • Heartbeat: 60s • Poll: 2s
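The policy above works out to delays of 5s, 10s, 20s, ... capped at 300s. A one-line sketch (whether jitter is added on top is not stated in this document):

```python
def retry_delay(attempt: int, base: float = 5.0,
                multiplier: float = 2.0, max_delay: float = 300.0) -> float:
    """Exponential backoff per the policy above: 5s base, 2.0x, 300s cap."""
    return min(base * multiplier ** attempt, max_delay)

# Attempts 0..6 produce: 5, 10, 20, 40, 80, 160, then capped at 300.
assert [retry_delay(a) for a in range(7)] == [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 300.0]
```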
Job Types
publication_pipeline: Post-approval prep
content_generation: Sub-agent dispatch
competitive_report: Intelligence reports
lead_enrichment: Prospector enrichment
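A common way to get claim semantics on plain PostgreSQL (no Celery, no Temporal) is FOR UPDATE SKIP LOCKED, which lets many workers poll concurrently without blocking on each other's row locks. A sketch of the claim query; table and column names come from the schema section, while the lease_expires_at column and exact SQL are assumptions:

```python
# Hypothetical claim query for the agent_jobs table. The 900s lease
# matches the retry policy above; lease_expires_at is an assumed column.
CLAIM_JOB_SQL = """
UPDATE agent_jobs
SET status = 'claimed',
    lease_expires_at = now() + interval '900 seconds'
WHERE id = (
    SELECT id FROM agent_jobs
    WHERE status = 'pending'
    ORDER BY priority DESC, id
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING id, job_type, agent_name;
"""
```

Workers run this on every poll tick (2s); a job whose lease expires without a heartbeat goes back to pending or, after max retries, to the dead-letter state.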
08 — Graduation
Progressive Deployment Stages
4 autonomy levels. All agents currently in ASSISTED mode.
SHADOW
Observe only. No real actions.
ASSISTED ✓
Produces artifacts. Human approval required.
Current stage
SUPERVISED
Acts in scope. Humans spot-check.
AUTONOMOUS
Full automation within guardrails.
09 — Approval
Governance & Approval Workflow
Brand scoring, per-content-type rules, Slack-based human review.
Agent generates content
Brand Score
≥80
SEO Score
≥70
Auto-Approve
seo_technical, social_media
Approval Queue
blog, email, ceo_content
Slack Review
Approved
Publication Pipeline
Herald prepares distribution
Content Type | Auto-Approve | Min Score | Approvers
seo_technical | Yes | 80 | (none)
social_media | Yes | 85 | (none)
blog_article | No | 80 | elle
email_campaign | No | 80 | elle
comparison_page | No | 80 | elle + ian
ceo_content | No | 80 | elle + mel
video_content | No | 80 | elle
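The table above is effectively a routing function. A sketch, with the rules transcribed from the table; the "regenerate" branch for below-threshold scores is an assumption based on the scoring flow described elsewhere in this document:

```python
# Per-content-type approval rules, transcribed from the table above.
APPROVAL_RULES = {
    "seo_technical":   {"auto": True,  "min_score": 80, "approvers": []},
    "social_media":    {"auto": True,  "min_score": 85, "approvers": []},
    "blog_article":    {"auto": False, "min_score": 80, "approvers": ["elle"]},
    "email_campaign":  {"auto": False, "min_score": 80, "approvers": ["elle"]},
    "comparison_page": {"auto": False, "min_score": 80, "approvers": ["elle", "ian"]},
    "ceo_content":     {"auto": False, "min_score": 80, "approvers": ["elle", "mel"]},
    "video_content":   {"auto": False, "min_score": 80, "approvers": ["elle"]},
}

def route_content(content_type: str, brand_score: int) -> str:
    """Decide whether content auto-approves or enters the human queue."""
    rule = APPROVAL_RULES[content_type]
    if brand_score < rule["min_score"]:
        return "regenerate"          # below threshold: back to the agent
    return "auto_approve" if rule["auto"] else "approval_queue"

assert route_content("seo_technical", 92) == "auto_approve"
assert route_content("social_media", 82) == "regenerate"    # min is 85 here
assert route_content("blog_article", 90) == "approval_queue"
```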
10 — Database
PostgreSQL + pgvector Schema
Table | Function | Key Fields | Index
content_chunks | Content Brain | source_document, chunk_text, embedding(1024d) | ivfflat (20)
semantic_memory | Agent vector memory | agent_name, content, embedding(1024d) | ivfflat (20)
agent_memory | KV store + TTL | agent_name, key, value(JSONB), expires_at | idx_lookup
agent_events | Event bus | source_agent, target_agent, event_type, payload | (none)
agent_jobs | Job queue | job_type, agent_name, status, priority | idx_pending
approval_queue | Review queue | content_type, brand_score, seo_score | idx_status
feedback_log | Corrections | original_output, correction, embedding | (none)
feedback_patterns | Patterns | pattern_type, corrected_pattern, confidence | ivfflat (10)
agent_actions | Audit trail | action_type, tokens_used, duration_ms | (none)
user_profiles | User data | user_id, role, team, preferences(JSONB) | (none)
scheduled_tasks | Cron tasks | schedule(cron), next_run, enabled | (none)
release_signoffs | Beta promotion | signoff_status, approved_by | (none)
11 — Integrations
Integration Ecosystem
💬
Slack
Socket Mode bot. Approval workflow, review digests, auto-ingestion.
Approvers: Oussama, Elle, Ian, Mel
Features: Auto-ingestion, review digest, kill_switch
🖼
fal.ai
Media: Flux Pro (images), Kling (video), Minimax (audio).
Agent: Visualist • Fallback: prompt-only mode
🎬
Remotion
React video engine → MP4. Agent: Videographer.
Location: /video_engine/spekit-video/
Output: MP4 + metadata JSON
🎙
TTS
Qwen3-TTS MLX (local), OpenAI, ElevenLabs, Polly.
Primary: MLX on Apple Silicon • Agent: Narrator
🌊
VoyageAI
1024d embeddings. Fallback: Ollama qwen3-embedding:0.6b.
Config: VOYAGE_API_KEY, EMBEDDING_BACKEND
🌐
Planned
Webflow, Ayrshare, SerpAPI, HubSpot.
CMS publishing, social scheduling, CRM integration
12 — LLM Models
Multi-Backend LLM Strategy
Anthropic API
Development (active)
Direct SDK. ANTHROPIC_API_KEY
AWS Bedrock
Production (planned)
IAM auth, VPC. AWS_ACCESS_KEY_ID
Ollama Local
Fallback (free)
Qwen3 localhost:11434. OLLAMA_BASE_URL
Tier | Model | Usage | Agents
Tier 1 | Claude Opus | Deep reasoning (rare) | Reserved
Tier 2 | Claude Sonnet | Content generation | ELLEGENTIC, Intelligence, Prospector, Herald, Oracle, Closer
Tier 3 | Claude Haiku | Fast routing | Atlas, Narrator
Local | Qwen3 | Free dev | Fallback
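The tier table reduces to a small lookup. A sketch; the model strings are illustrative aliases, not exact API model IDs, and the backend-switching behavior is an assumption based on the fallback description above:

```python
# Agent-to-tier map, from the table above. Model names are aliases.
MODEL_TIERS = {
    "atlas": "claude-haiku",
    "narrator": "claude-haiku",
    "ellegentic": "claude-sonnet",
    "intelligence": "claude-sonnet",
    "prospector": "claude-sonnet",
    "herald": "claude-sonnet",
    "oracle": "claude-sonnet",
    "closer": "claude-sonnet",
}

def model_for(agent: str, backend: str = "anthropic") -> str:
    """Pick a model for an agent; fall back to local Qwen3 via Ollama."""
    if backend == "ollama":
        return "qwen3"               # free local fallback
    return MODEL_TIERS.get(agent, "claude-sonnet")

assert model_for("atlas") == "claude-haiku"
assert model_for("oracle", backend="ollama") == "qwen3"
```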
13 — Events
Inter-Agent Event Bus
Async communication via PostgreSQL. No RabbitMQ or Kafka.
Event Types
chain
Sequential handoff
request
Targeted request
broadcast
Fan-out
Example: Content Pipeline
ELLEGENTIC → emit("content_ready") → publication_pipeline job queued → HERALD distributes
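On the wire this is just a row on the agent_events table. Column names come from the schema section; the payload contents are illustrative:

```python
# Shape of an agent_events row for the pipeline above.
event = {
    "source_agent": "ellegentic",
    "target_agent": "herald",
    "event_type": "chain",           # sequential handoff
    "payload": {
        "event": "content_ready",
        "job_type": "publication_pipeline",
        "content_id": 42,            # made-up example id
    },
}

assert event["event_type"] in {"chain", "request", "broadcast"}
```

Because the bus is a PostgreSQL table, the emit is transactional with the content write: either both commit or neither does.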
14 — Architecture Decision
Why Custom — Not a Framework
We evaluated 7 frameworks and chose to build custom. Here's why.
Frameworks Evaluated & Rejected
Framework | Why Rejected
LangChain | Generic abstraction, massive overhead, no content pipeline or brand checking
CrewAI | Too opinionated, no governance/approval workflow, no brand enforcement
OpenClaw | Critical security risk. 512 vulnerabilities identified (8 critical). 135K+ instances exposed to the internet. ClawJacked attack enables full remote takeover via a single link. 1-in-5 ClawHub plugins contain malware. Anthropic banned flat-rate subscriptions (Apr 2026). Meta banned it from corporate machines. A single instance can consume $1K–$5K/day in API costs. Not suitable for enterprise with company data.
NemoClaw (NVIDIA) | Alpha software, not production-ready. NVIDIA's security layer on top of OpenClaw, announced at GTC March 2026. Improves containment (sandbox) but does NOT address behavioral governance: it can't verify whether agent actions are correct or aligned with business goals. 391 open issues on GitHub. APIs subject to change without notice. Optimized for NVIDIA hardware (vendor lock-in). No content pipeline, no brand checking, no approval workflows.
Hermes | Not mature for production. Limited multi-agent support, no sub-agent hierarchy. No content pipeline or brand enforcement. Small community, limited documentation. Not designed for the specialized agent orchestration (routing + delegation + governance) our use case requires.
Temporal | Complex self-hosting (Cassandra + Elasticsearch), $200+/month, overkill. Good for durable execution but doesn't solve content generation, brand voice, or multi-agent routing.
Inngest | Good for event-driven workflows, but no content pipeline, no brand voice, no multi-agent hierarchy or governance layer.
n8n (as brain) | We use n8n for GTM workflow orchestration (Clay, Apollo, Instantly), but agents do the reasoning. n8n = arms, agents = brain. Clean separation of concerns.
What Custom Gives Us (That No Framework Has)
Content Pipeline
Research → Create → Brand Check → SEO Check → Approval → Publish. No framework has this end-to-end.
Graduation Stages
SHADOW → ASSISTED → SUPERVISED → AUTONOMOUS. Agents earn autonomy over time. Unique to our system.
Content Brain
75K chunks of shared semantic memory via pgvector. All agents search and write to the same knowledge base.
Brand Voice Enforcement
AI-stink detection (score 0–100), SEO evaluator, output contracts. Zero-tolerance for generic AI content.
Multi-LLM Failover
Claude API → Ollama → Bedrock. Circuit breaker auto-switches. No provider lock-in.
n8n = GTM Orchestration
n8n handles Clay, Apollo, Apify, Instantly workflows. Agents handle reasoning. Clean separation.
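The brand-voice card above describes a programmatic 0–100 "AI-stink" score. A sketch of the idea (regex penalties, no LLM call); the actual phrase list and penalty weights used in production are not documented here and are invented for illustration:

```python
import re

# Illustrative sketch of programmatic brand scoring: start at 100 and
# subtract penalties for "AI-stink" phrases. Phrases and weights are
# made-up examples, not the production rule set.
AI_STINK_PATTERNS = [
    (r"\bdelve\b", 10),
    (r"\bin today's fast-paced world\b", 15),
    (r"\bunlock the power of\b", 15),
    (r"\bgame.?changer\b", 10),
]

def brand_score(text: str) -> int:
    score = 100
    for pattern, penalty in AI_STINK_PATTERNS:
        score -= penalty * len(re.findall(pattern, text, flags=re.IGNORECASE))
    return max(score, 0)

assert brand_score("Spekit helps reps find answers in the flow of work.") == 100
assert brand_score("In today's fast-paced world, unlock the power of AI.") == 70
```

Because this is regex rather than an LLM call, checking costs nothing; tokens are only spent when a score below 80 forces a regeneration.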
Why NOT OpenClaw — Security Deep Dive
512
Vulnerabilities identified (8 critical)
135K+
Instances exposed to the internet
1 in 5
ClawHub plugins contain malware
$5K/day
Potential API cost per instance
Key incidents: ClawJacked attack — full remote takeover via a single malicious link. Supply chain compromise — npm package silently installed OpenClaw on dev machines. Anthropic banned flat-rate subscriptions (April 4, 2026). Meta banned it from corporate machines. China restricted it from state agencies.
NemoClaw (NVIDIA’s fix): Adds sandbox containment but still alpha. Does NOT solve behavioral governance (correctness of actions). 391 open issues. Not production-ready.
Our approach: Zero external agent frameworks. No MCP server exposed. No plugin marketplace. All tools are hardcoded and audited. Bearer token auth, CORS whitelist, rate limiting, SSRF protection. Human approval on every output (ASSISTED stage).
Why Open WebUI as Interface
Familiar ChatGPT-like UI
Every team member already knows how to use a chat interface. Zero training needed. Select an agent from the dropdown, type, get results.
OpenAI-Compatible API
Our API Bridge exposes all 13 agents as standard /v1/chat/completions models. Open WebUI connects natively — no custom integration needed.
Self-Hosted & Private
Runs in our Docker stack. No data leaves our infrastructure. No third-party SaaS dependency for the UI. Full control over user accounts and permissions.
Multi-Agent Selection
Users pick the right agent for the job: Atlas for general routing, ELLEGENTIC for content, Intelligence for competitor research, Prospector for leads, Closer for deals.
15 — Cost Efficiency
Token Cost Optimization
Our architecture is designed to minimize LLM token usage at every layer.
How We Save Tokens
Optimization | How It Works | Savings
Tiered Model Selection | Haiku ($0.25/MTok) for routing, Sonnet ($3/MTok) for content, Opus only for complex reasoning. Most calls use the cheapest model. | ~70%
Single-Router Architecture | Atlas (Haiku) routes to 1 specialist agent. No multi-agent chain. LangChain/CrewAI create 3–5 LLM calls per request — we do 2. | ~60%
Content Brain Pre-Fill | 75K chunks of Spekit data injected as context. Agents don't need to "learn" from scratch — they already know the product, competitors, and customers. | ~40%
Brand Check = No Regen | Programmatic brand scoring (regex, not LLM). Catches AI-stink without burning tokens. Only regenerates if score < 80. | ~30%
Refinement Skip Research | When the user says "make it shorter", we reuse previous research context. No web search, no Content Brain re-query. | ~50%
Tool Loop Cap | Max 10 iterations, 40-message cap, 3-error abort. Prevents runaway token consumption from stuck agents. | Safety
Sub-Agent Delegation | ELLEGENTIC dispatches to specialized sub-agents with focused prompts. Each sub-agent has a narrow scope → shorter prompts → fewer tokens. | ~25%
Cost Comparison: Custom vs Framework
Our Architecture
~2
LLM calls per user request
Atlas (Haiku) routes → 1 Agent (Sonnet) generates
Typical Framework (LangChain/CrewAI)
5–8
LLM calls per user request
Chain of prompts, memory summarization, tool planning, execution, reflection
Estimated monthly cost for 1000 requests/day:
~$150
Our architecture
~$500–800
LangChain/CrewAI
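The 2-call vs 5–8-call difference can be checked with a back-of-envelope cost model. Per-MTok prices come from the optimization table above (Haiku $0.25, Sonnet $3); the per-request token counts below are assumptions for illustration, not measured figures:

```python
# Back-of-envelope LLM cost model. Prices per MTok from the table above;
# token counts per call are illustrative assumptions.
def monthly_cost(requests_per_day: int, calls: list[tuple[float, int]],
                 days: int = 30) -> float:
    """calls = [(price_per_mtok, tokens_per_call), ...] for one request."""
    per_request = sum(price * tokens / 1_000_000 for price, tokens in calls)
    return requests_per_day * days * per_request

# Our path: one Haiku routing call + one Sonnet generation call.
ours = monthly_cost(1000, [(0.25, 1024), (3.0, 3000)])
# Framework path: ~6 Sonnet calls (planning, memory, tools, reflection...).
framework = monthly_cost(1000, [(3.0, 3000)] * 6)
assert framework > 3 * ours   # the call chain multiplies per-request cost
```

Absolute dollar figures depend heavily on the assumed token counts; the structural point is that per-request cost scales with the number of LLM calls in the chain.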
Spekit
Architecture Summary
Custom multi-agent system. No external frameworks. Full control over agent lifecycle, job orchestration, and human governance.
0
External Frameworks
13
Total Agents
98
Total Tools
~$150/mo
Est. LLM Cost