Architecture
The four layers behind Solen, and how they work together.
Solen runs as four coordinated layers. From your perspective, everything is "Solen", but understanding what each layer does helps when reading run logs, debugging integrations, or interpreting heal events.
Runtime Kernel: Intelligence & Control Plane
The Runtime Kernel is the agent runtime core. Every withSolen() call routes through it. It ingests steps and checkpoints, watches for failure patterns, coordinates the heal loop, and surfaces results to the dashboard and API.
Owns: run lifecycle, step ingestion, stall detection, heal orchestration, API responses
Produces: run records, step timelines, heal incidents, checkpoint stores, audit entries
Visible in: all API responses, the /agents dashboard, live run SSE, log prefix [RUNTIME]
LearnShift: Pattern Store
LearnShift stores failure patterns and known fixes across all customers. When a failure matches a stored pattern, the fix applies in under 100ms with zero LLM calls. Novel failures escalate to AI diagnosis. Every resolved incident sharpens the pattern store for the next occurrence.
Owns: failure pattern matching, fix recipes, confidence scoring, cross-customer learning
Produces: pattern matches, heal proposals, confidence scores, stored lessons
Visible in: heal history, pattern match lines in incident timelines, log prefix [LEARNSHIFT]
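The match-or-escalate behaviour described above can be sketched as a keyed lookup with a fallback. This is a minimal illustration, not the LearnShift implementation: the FixRecipe shape, the PatternStore class, the signature format, and the confidence value are all hypothetical names chosen for the example.

```typescript
// Hypothetical types: none of these names come from the Solen SDK.
interface FixRecipe {
  patternId: string;
  signature: string;  // normalized failure signature used as the lookup key
  fix: string;        // human-readable description of the fix
  confidence: number; // assumed to be a score in [0, 1]
}

type Diagnosis =
  | { kind: "pattern_match"; recipe: FixRecipe }
  | { kind: "escalate_to_ai"; signature: string };

class PatternStore {
  private readonly recipes = new Map<string, FixRecipe>();

  // Record a resolved incident so the next occurrence is a direct lookup.
  learn(recipe: FixRecipe): void {
    this.recipes.set(recipe.signature, recipe);
  }

  // Known signature: return the stored fix. Unknown: hand off to AI diagnosis.
  diagnose(signature: string): Diagnosis {
    const recipe = this.recipes.get(signature);
    return recipe !== undefined
      ? { kind: "pattern_match", recipe }
      : { kind: "escalate_to_ai", signature };
  }
}

const patternStore = new PatternStore();
const firstSeen = patternStore.diagnose("tool_loop:search");
patternStore.learn({
  patternId: "p-1",
  signature: "tool_loop:search",
  fix: "cap retries and resume from checkpoint",
  confidence: 0.9,
});
const secondSeen = patternStore.diagnose("tool_loop:search");
console.log(firstSeen.kind, "->", secondSeen.kind);
```

In this sketch the first occurrence of a failure escalates and the second is served from the store, which mirrors how a resolved incident makes the next occurrence cheaper to heal.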
Stratum: Compute OS
Stratum is the compute layer beneath agent workloads. It places agent runs on isolated nodes with per-tenant resource limits, network policy, and encrypted checkpoint storage. When Stratum is enabled, runs can be scheduled as workloads with CPU and memory bounds instead of running only in your own infrastructure.
Owns: workload placement, resource isolation, network policy, node health
Produces: workload IDs, node assignments, placement metadata on run records
Visible in: run metadata (stratumWorkloadId), Stratum dashboard, log prefix [STRATUM]
Nexus: Enterprise Control Plane
Nexus is the fleet-wide governance layer for enterprise workspaces. It provides org management, SAML SSO, API key policies, budget caps, DORA rollups across agents, and the immutable audit trail. Autonomy mode and heal approval policy live here for org-scoped enforcement.
Owns: org governance, fleet dashboards, SAML SSO, budget enforcement, audit trail
Produces: org-level DORA metrics, audit entries, policy decisions, fleet alerts
Visible in: Nexus dashboard, Settings → Workspace → Agent Policy, audit log
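To make org-scoped enforcement concrete, the sketch below gates a heal proposal on an autonomy mode and a confidence threshold. Everything here is an assumption for illustration: the mode names ("observe", "approve", "auto"), the minConfidence setting, and the decision labels are hypothetical and are not taken from the Nexus settings.

```typescript
// Hypothetical policy shape; the real Nexus settings are not documented here.
type AutonomyMode = "observe" | "approve" | "auto";

interface HealPolicy {
  mode: AutonomyMode;
  minConfidence: number; // assumed threshold in [0, 1]
}

type PolicyDecision = "apply" | "needs_approval" | "record_only";

// Decide what may happen to a heal proposal under the org policy.
function decide(policy: HealPolicy, confidence: number): PolicyDecision {
  if (policy.mode === "observe") return "record_only";
  if (policy.mode === "approve") return "needs_approval";
  // "auto": apply only when the proposal clears the confidence threshold.
  return confidence >= policy.minConfidence ? "apply" : "needs_approval";
}

const orgPolicy: HealPolicy = { mode: "auto", minConfidence: 0.8 };
console.log(decide(orgPolicy, 0.93)); // "apply"
console.log(decide(orgPolicy, 0.41)); // "needs_approval"
```

Keeping the decision in one pure function is a design choice for the example: the same policy can be evaluated identically for every agent in the org, and each decision is easy to record in an audit trail.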
How they work together
withSolen() starts a run
↓
Runtime Kernel: ingests steps, checkpoints state
↓
Stratum: places workload (if enabled)
↓
Runtime Kernel: watches step cadence every 60s
↓ [on stall / tool loop / step failure]
LearnShift: pattern match (or AI diagnosis)
↓
Runtime Kernel: applies fix, resumes from checkpoint
↓
LearnShift stores the lesson
↓
Nexus: audit entry + fleet metrics updated
For known patterns, the full loop from failure detection to agent resumption takes under 60 seconds.
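Two steps in this flow, detecting a stall from step cadence and choosing the checkpoint to resume from, can be sketched as small pure functions. This is a rough illustration under stated assumptions: the Checkpoint shape, the function names, and the 60-second gap used in the usage lines are hypothetical, and the actual stall criteria used by the Runtime Kernel are not specified here.

```typescript
// Hypothetical shapes and thresholds; not taken from the Solen SDK.
interface Checkpoint {
  stepIndex: number;
  state: string; // serialized agent state
}

// A run counts as stalled when no step has arrived within maxGapMs.
function isStalled(lastStepAtMs: number, nowMs: number, maxGapMs: number): boolean {
  return nowMs - lastStepAtMs > maxGapMs;
}

// Pick the most recent checkpoint at or before the failed step, if any.
function resumePoint(checkpoints: Checkpoint[], failedStep: number): Checkpoint | undefined {
  return checkpoints
    .filter((c) => c.stepIndex <= failedStep)
    .sort((a, b) => b.stepIndex - a.stepIndex)[0];
}

const checkpoints: Checkpoint[] = [
  { stepIndex: 2, state: "a" },
  { stepIndex: 5, state: "b" },
  { stepIndex: 9, state: "c" },
];

// Last step arrived at t=0; the check runs at t=120s with an assumed 60s gap.
const stalled = isStalled(0, 120000, 60000);
// The run failed at step 7, so the checkpoint from step 5 is the resume point.
const resumeFrom = resumePoint(checkpoints, 7);
console.log(stalled, resumeFrom?.stepIndex); // true 5
```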
What this means for integrations
- Webhook payloads include a source field indicating which layer fired the event (for example runtime.heal.resolved, learnshift.pattern.matched).
- SDK telemetry routes to the Runtime Kernel via /api/agent-runtime/* endpoints.
- Autonomy mode is a Nexus/Workspace setting: changing it in Settings → Workspace → Agent Policy applies immediately to all new heal incidents.
- Confidence scores on heal proposals come from LearnShift pattern matching or Runtime Kernel AI diagnosis, evaluated against governance policy before any fix is applied.
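A webhook consumer might route events by the layer prefix of the source field. The sketch below assumes the source value is a dot-separated string whose first segment names the layer, as in the two examples above. The "stratum" and "nexus" prefixes and the parseSource helper are hypothetical and should be checked against real payloads.

```typescript
// Only "runtime" and "learnshift" appear in the documented examples;
// "stratum" and "nexus" are assumed prefixes for the other two layers.
type Layer = "runtime" | "learnshift" | "stratum" | "nexus";

const LAYERS: readonly Layer[] = ["runtime", "learnshift", "stratum", "nexus"];

// Split a source such as "runtime.heal.resolved" into layer and event name.
// Returns undefined when the prefix is missing or not a known layer.
function parseSource(source: string): { layer: Layer; eventName: string } | undefined {
  const dot = source.indexOf(".");
  if (dot <= 0) return undefined;
  const prefix = source.slice(0, dot);
  const layer = LAYERS.find((l) => l === prefix);
  if (layer === undefined) return undefined;
  return { layer, eventName: source.slice(dot + 1) };
}

const parsed = parseSource("learnshift.pattern.matched");
console.log(parsed?.layer, parsed?.eventName); // learnshift pattern.matched
```

Returning undefined for unrecognised prefixes lets a handler log and skip events from sources it does not know about rather than failing.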