
Atlas Development Guide


type: guide
title: Atlas Development Guide
status: Draft
status_detail: "DG1 S3: front-door doc for building on Atlas, proven against a real worked-example plugin (dev-guide-proof); author-from-guide gaps folded back. Map, not rewrite."
author: devops-lead
drafted: 2026-06-09
updated: 2026-06-09
summary: "The one doc a Codex or Claude developer reads before touching Atlas: how the dev loop and runtimes work, where plugins/skills/code live, how to use the SDK, how to write tests, and the documentation and delivery standards to follow."
source: "Workspace SoT: knowledge/atlas/development-guide.md. The BookStack copy on docs.netos.io is a rendered mirror, not a second source of truth."
related:



This is the single page you read before you build on Atlas. It is a map, not an encyclopedia: it tells you where everything lives and how the pieces fit, and it points you at the authoritative doc for each piece rather than repeating it. When this guide and a linked design doc disagree, the design doc wins for its own subject and this guide should be corrected.

How capabilities are cited. Atlas describes itself through the Capability Spine — a grouped index of every Atlas capability, each citable by a stable id of the form group.slug. Throughout this guide, a capability is cited by that id and linked to the live Capabilities index. The ids are stable, so cite them freely; to jump to one, open the index and filter by the id (the page has a filter box that matches on id, name, surface, and owner). The spine itself is gates.capability-spine.
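As a minimal sketch of the group.slug convention, the snippet below splits a cited id into its two parts. The character rules in the pattern (lowercase letters, digits, hyphens) are an assumption inferred from the ids cited in this guide, not a rule taken from the spine itself, and parseCapabilityId is a hypothetical helper name.

```typescript
// Assumed shape: "<group>.<slug>", lowercase letters, digits and hyphens.
// The exact character rules are an assumption, not taken from the spine.
interface CapabilityId {
  group: string;
  slug: string;
}

const CAPABILITY_ID_PATTERN = /^([a-z][a-z0-9-]*)\.([a-z][a-z0-9-]*)$/;

// Hypothetical helper: returns the parts of a well-formed id, or null.
function parseCapabilityId(raw: string): CapabilityId | null {
  const match = CAPABILITY_ID_PATTERN.exec(raw.trim());
  if (match === null) return null;
  return { group: match[1], slug: match[2] };
}

const citedIds = ["gates.capability-spine", "dev.codex-runtime", "Not An Id"];
for (const citedId of citedIds) {
  console.log(citedId, "->", parseCapabilityId(citedId));
}
```

A check like this could be used to lint citations in a doc before publishing, since a malformed id will not match anything in the index filter box.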

Source of truth. This file — knowledge/atlas/development-guide.md in the workspace — is authoritative. It renders in Atlas at /file?path=knowledge%2Fatlas%2Fdevelopment-guide.md. The copy published to BookStack (docs.netos.io) is a rendered mirror for the wiki audience; never edit the wiki and expect it to flow back.


1. What Atlas is, for a developer

Atlas v2 is an agent platform assembled from a small kernel plus plugins, driven by LLM runtimes, deployed by Ansible-pull from Git. As a developer you work across five repositories, all cloned under netos-gitlab/netos-agents/ in this workspace and originating on uks-git01.prod.netos.io:

| Repo (clone path under netos-gitlab/netos-agents/) | Owns |
| --- | --- |
| netos-atlas | The kernel (packages/kernel/), the web app (apps/web/, Next.js), and the API (apps/api/, Fastify). Kernel modules live in modules/ (files, playbook-runtime, scheduler). |
| netos-atlas-plugins | Every plugin under plugins/<id>/. This is where most feature code lands. |
| netos-atlas-sdk | The published TypeScript SDK, @netos/atlas-sdk (packages/ts/). Plugins import their types and helpers from here. |
| netos-atlas-deploy | The deploy repo (Ansible). Symlinked into the workspace at platforms/ansible/playbooks/atlas-deploy/. Stages workspace files onto hosts, fetches plugin source, builds, flips the release. |
| netos-mcp | The MCP server exposing the netbox.* / support.* data tools and the offline bin/dryrun-playbook harness. |

The kernel / plugin / SDK boundary is the thing to internalise first:

For the why behind this shape — capability as the primary object, the in-Atlas dev loop, the authoring kit — read the Atlas Operating Foundation design (section 8 is the dev loop, section 6 the authoring kit). This guide does not restate it.


2. The dev loop — author → test → publish → deploy

This is the one section that is genuinely new connective tissue; everything else links out. The loop has four legs, and a change is not done until it has been round all four.

flowchart LR
  A["Author<br/>(Codex / Claude runtime)"] --> B["Test<br/>(colocated suites + CI)"]
  B --> C["Publish<br/>(commit → main → push to uks-git01)"]
  C --> D["Deploy<br/>(atlas-deploy → host-verify)"]
  D -->|gap found| A
  classDef leg fill:#1f2937,color:#e5e7eb,stroke:#374151;
  class A,B,C,D leg;

Author. Code is written by one of two dev runtimes. The dev.codex-runtime runs Codex on lab02 atlas-dev via a ChatGPT account login over HTTP dispatch; the dev.claude-runtime is the second authoring runtime, driven through the model router. Both reach their backend models through dev.model-router, the local proxy that selects bound models behind one endpoint (design: Goose runtime & model-router). Which runtime, model, credential, and bot identity a given task uses is selected by a Runtime Profile — see dev.runtime-profiles and the Development Agency gate-chain plan. The end-to-end intent → runtime routing is described in the dev-loop intent/runtime routing design, and the in-Atlas loop overall in the in-Atlas dev-loop design. The runtimes, profiles, and the per-host/role bots that drive them run on the dev.dev-lab-fleet (eus-az2-atlas-*), with lab01 as the build/control host and lab02 as the active dev/UAT surface; see the internal dev-lab strategy design and the dev-lab fleet host-vars design.
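To make the Runtime Profile idea concrete, here is a rough sketch of what such a record might look like as a data structure. Only the four things a profile selects (runtime, model, credential, bot identity) come from this guide; every field name, profile id, and value below is hypothetical, and the authoritative shape is whatever dev.runtime-profiles defines.

```typescript
// Hypothetical shape: only the four selected things (runtime, model,
// credential, bot identity) come from the guide. Field names are assumptions.
type DevRuntime = "dev.codex-runtime" | "dev.claude-runtime";

interface RuntimeProfile {
  profileId: string;
  runtime: DevRuntime;
  model: string; // reached through dev.model-router
  credentialRef: string; // a reference to a credential, never the secret
  botIdentity: string;
}

// Illustrative values only; none of these names exist in Atlas.
const exampleProfiles: RuntimeProfile[] = [
  {
    profileId: "example-codex",
    runtime: "dev.codex-runtime",
    model: "example-model-a",
    credentialRef: "example-credential-a",
    botIdentity: "example-bot-a",
  },
  {
    profileId: "example-claude",
    runtime: "dev.claude-runtime",
    model: "example-model-b",
    credentialRef: "example-credential-b",
    botIdentity: "example-bot-b",
  },
];

// Look a profile up by id and refuse to guess when it is missing.
function selectProfile(profiles: RuntimeProfile[], profileId: string): RuntimeProfile {
  const found = profiles.find((profile) => profile.profileId === profileId);
  if (found === undefined) {
    throw new Error(`unknown runtime profile: ${profileId}`);
  }
  return found;
}

console.log(selectProfile(exampleProfiles, "example-claude").runtime);
```

Throwing on an unknown profile, rather than falling back to a default, keeps the choice of runtime and credential explicit for every task.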

Test. Write tests with the code (section 6). CI runs them on every change and fails red on a real failure.

Publish. Commit to main. Workspace files (KB, spine YAML, skills) sync via source.workspace-mirror; repo clones push to uks-git01 via source.gitlab-sync (the git-repo wrapper). The deploy repo is its own clone, source.atlas-deploy-repo, and is not workspace-mirrored — it must be pushed explicitly.

Deploy. Run the gate's named atlas-deploy command against the target host, then verify ground truth on the host; never self-report (section 7). For multi-step gated work, gates.gate-runner drives the steps and gates.gated-delivery is the process they follow.
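The four legs and the single "gap found" back-edge from the flowchart can be sketched as a tiny state machine. This is an illustration of the loop's shape only; the names nextLeg and walkLoop are hypothetical and nothing in Atlas is claimed to be implemented this way.

```typescript
type Leg = "author" | "test" | "publish" | "deploy";

const LEG_ORDER: readonly Leg[] = ["author", "test", "publish", "deploy"];

// Returns the next leg, or "done" once deploy has verified clean.
// A gap found at deploy sends the change back to author.
function nextLeg(current: Leg, gapFound: boolean): Leg | "done" {
  if (current === "deploy") {
    return gapFound ? "author" : "done";
  }
  return LEG_ORDER[LEG_ORDER.indexOf(current) + 1];
}

// Walk the loop; deployGaps[i] says whether the i-th deploy finds a gap.
function walkLoop(deployGaps: boolean[]): Leg[] {
  const visited: Leg[] = [];
  let leg: Leg | "done" = "author";
  let deployCount = 0;
  while (leg !== "done") {
    visited.push(leg);
    const gap = leg === "deploy" ? (deployGaps[deployCount++] ?? false) : false;
    leg = nextLeg(leg, gap);
  }
  return visited;
}

// One gap on the first deploy means two full passes round the loop.
console.log(walkLoop([true, false]).join(" -> "));
```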


3. Where code lives

Four homes, by kind of thing:

The "two kinds of skills" distinction

The word "skill" names two different layers; keep them separate (Operating Foundation section 4.4):


4. Using the SDK

Plugins build against @netos/atlas-sdk (the FND1 deliverable that consolidated 59 local type mirrors down to one published package — the story is in the SDK mirror re-audit, and consumer docs are in the SDK README).

Import types — never re-declare them locally. The SDK is types + helpers, not a manifest builder: there is no defineManifest/definePlugin export. The manifest is hand-authored JSON (plugin.json, below); from the SDK you import the runtime contract your init() builds against:

import type { PluginContext, ModuleInitResult } from '@netos/atlas-sdk';

PluginContext is the single object the kernel hands your init(ctx). Its surface (from packages/ts/src/index.ts):

| Field | What you use it for |
| --- | --- |
| ctx.routes | Register HTTP routes (ctx.routes.get/post(...)), with a requirePerm guard. |
| ctx.actions | Expose and call cross-plugin actions (the provides_api surface). |
| ctx.settings / ctx.secrets | Read plugin config and secrets. Keys must be lowercase: the seeder lowercases env keys before schema match, so an upper-case key seeds as unknown_key. |
| ctx.db | Scoped SQL handles, but only for the bindings you opted into via requires.db (see allowlist below). |
| ctx.audit / ctx.capture | Emit audit records; resolve capture level. |
| ctx.health | Register a health probe (surfaced on the host health endpoint). |
| ctx.flow | Ambient flow-frame helpers for orchestration-driven plugins. |
| ctx.log | Structured logging. |
| ctx.api | The stable cross-plugin read API surface. |
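The lowercase-key rule for ctx.settings / ctx.secrets is easier to see with a small model of the seeder. The sketch below is a stand-in written for this guide, not the real seeder: it assumes env keys are lowercased and then matched exactly against the schema's declared keys, which is the behaviour the table describes. The key names are made up.

```typescript
type SeedResult = "seeded" | "unknown_key";

// Stand-in for the seeder behaviour described above (an assumption about the
// mechanism): lowercase each env key, then match it exactly against the schema.
function seedSettings(envKeys: string[], schemaKeys: string[]): Record<string, SeedResult> {
  const results: Record<string, SeedResult> = {};
  for (const envKey of envKeys) {
    const lowered = envKey.toLowerCase();
    results[lowered] = schemaKeys.includes(lowered) ? "seeded" : "unknown_key";
  }
  return results;
}

// "scan_interval" is declared lowercase, so it matches.
// "Api_Url" is declared with capitals, so the lowered env key never matches it.
const seedOutcome = seedSettings(["SCAN_INTERVAL", "Api_Url"], ["scan_interval", "Api_Url"]);
console.log(seedOutcome);
// -> { scan_interval: 'seeded', api_url: 'unknown_key' }
```

The failure is on the schema side: once the env key has been lowercased, a schema key containing capitals has nothing left to match.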

The manifest (plugin.json)

Authored at plugins/<id>/plugin.json. The enforced gate is the kernel's zod ManifestSchema (netos-atlas/packages/kernel/src/manifest.ts) — the module loader runs ManifestSchema.parse() on it before the plugin is loaded, and the schema is .strict(), so an unknown top-level field or an out-of-enum category rejects the whole manifest and the plugin silently never loads. The companion JSON Schema is netos-atlas-sdk/schemas/plugin.schema.json (in the SDK repo, not the plugins repo), hand-kept in lockstep with the zod schema. The fields that matter most:

For the worked manifest, read plugins/system-capability-spine/plugin.json — a read-only, requires.db: [], single-UI-page plugin that is the cleanest minimal example.
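To illustrate why a strict schema rejects the whole manifest on one stray field, here is a hand-rolled stand-in for that behaviour. It is not the kernel's ManifestSchema and does not use zod; the allowlist contains only field names this guide mentions, and the real field list and category enum live in netos-atlas/packages/kernel/src/manifest.ts. The plugin id and values in the example are hypothetical.

```typescript
// Hypothetical allowlist built from field names this guide mentions.
// The authoritative list is the kernel's ManifestSchema.
const KNOWN_TOP_LEVEL_FIELDS: readonly string[] = [
  "id",
  "name",
  "pack",
  "category",
  "routes",
  "provides_api",
  "requires",
];

// Collect every top-level key that is not on the allowlist.
function unknownTopLevelFields(manifest: Record<string, unknown>): string[] {
  return Object.keys(manifest).filter((key) => !KNOWN_TOP_LEVEL_FIELDS.includes(key));
}

// Strict parse: one unknown key rejects the whole manifest.
function parseStrict(manifest: Record<string, unknown>): Record<string, unknown> {
  const extras = unknownTopLevelFields(manifest);
  if (extras.length > 0) {
    throw new Error(`unrecognized key(s) in manifest: ${extras.join(", ")}`);
  }
  return manifest;
}

// "provide_api" is a typo for "provides_api", so the manifest is rejected.
const typoManifest = { id: "your-id", name: "Your plugin", pack: "core", provide_api: [] };
try {
  parseStrict(typoManifest);
} catch (error) {
  console.log((error as Error).message);
}
```

Because the loader treats a rejected manifest as a plugin that never loads, running a check like this locally surfaces a typo before it turns into a silent absence on the host.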


5. Writing a plugin

The reliable path is scaffold from a proven read-only plugin, then adapt:

  1. Clone the template. Copy plugins/system-capability-spine/ to plugins/<your-id>/. Rename id, name, routes, and provides_api in plugin.json. Set pack to the pack you intend (core for always-on).
  2. Write src/index.ts. Export async function init(ctx: PluginContext). Register routes with an explicit requirePerm (e.g. admin.plugins.read), and return any actions you expose. Keep the handler thin; put logic in sibling modules (src/scan.ts, etc.).
  3. Settings & secrets. Declare a settings.schema.json and read values via ctx.settings.get(...) / ctx.secrets. Lowercase every key.
  4. Pack allowlist. If the plugin is new, add its id under the matching atlas_pack_catalog.<pack> list in netos-atlas-deploy/group_vars/all/packs.yml (mirroring its plugin.json pack). This is a deploy-repo edit and must be pushed to uks-git01 (source.atlas-deploy-repo).
  5. Provide an API only if another plugin needs it. provides_api plus ctx.actions is how plugins call each other; don't expose internals you don't have to.
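Step 2 can be sketched as follows. So that the snippet runs on its own, it declares minimal stand-in types; in a real plugin PluginContext and ModuleInitResult are imported from @netos/atlas-sdk and must not be re-declared. The ctx.routes.get argument order, the route path, the shape of the returned actions, and the names scanSummary and initSketch are all assumptions made for illustration; check packages/ts/src/index.ts for the real signatures.

```typescript
// Stand-in types so the sketch runs standalone. In a real plugin use:
//   import type { PluginContext, ModuleInitResult } from '@netos/atlas-sdk';
// The method signatures below are assumptions, not the SDK's contract.
interface SketchRouteOptions {
  requirePerm: string;
}
type SketchHandler = () => Promise<unknown>;
interface SketchContext {
  routes: { get(path: string, opts: SketchRouteOptions, handler: SketchHandler): void };
  log: { info(message: string): void };
}
interface SketchInitResult {
  actions: Record<string, SketchHandler>;
}

// Logic lives outside the handler (src/scan.ts in a real plugin).
async function scanSummary(): Promise<{ items: number }> {
  return { items: 0 };
}

// Thin init: register a guarded route, return the exposed actions.
async function initSketch(ctx: SketchContext): Promise<SketchInitResult> {
  ctx.routes.get("/your-id/summary", { requirePerm: "admin.plugins.read" }, scanSummary);
  ctx.log.info("your-id: routes registered");
  return { actions: { "your-id.summary": scanSummary } };
}

// A fake context that records registrations, useful in a colocated test.
const registeredRoutes: string[] = [];
const fakeContext: SketchContext = {
  routes: {
    get: (path, opts) => {
      registeredRoutes.push(`${path} [${opts.requirePerm}]`);
    },
  },
  log: { info: (message) => console.log(message) },
};

initSketch(fakeContext).then((result) => {
  console.log(registeredRoutes, Object.keys(result.actions));
});
```

Passing a recording fake in place of the kernel's context is one way to test init() without booting the kernel.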

The DG1 worked example — the dev-guide-proof plugin authored solely by following this guide — is the concrete "now you do it" exhibit; see section 9.


6. Writing tests

Tests are real and gating now (the FND2 gate made CI execute the colocated suites and fail on red). Two patterns coexist:

Local checks before you push:

CI is the backstop, not the substitute: a green local run plus a green pipeline on uks-git01 is the bar before deploy.


7. Publishing & deploying

Publish

Deploy

Verify on the host — never self-report

A deploy step is "done" only when ground truth on the host confirms it (safety.fail-closed-verify): read the live release with readlink current, confirm the plugin loaded, check the journal, and probe the real route with a host-minted admin cookie. Build-then-flip means a broken redeploy leaves the prior release current, so a failed verify is safe to retry. Gate verify scripts must be fail-closed and must avoid echo | grep -q under pipefail: grep -q exits at the first match, echo can then be killed by SIGPIPE, and pipefail reports the pipeline as failed, which produces a racy false-negative pause. Use here-strings instead.
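The fail-closed idea can be sketched independently of any shell tooling. In the snippet below a verification passes only when every check explicitly returns true; a thrown error, a false, or an empty check list all count as failure. The check bodies are stubs standing in for the four host checks named above, and verifyFailClosed is a hypothetical name, not an Atlas helper.

```typescript
interface VerifyCheck {
  label: string;
  run: () => boolean;
}

// Fail-closed: the default outcome of every check is failure, and only an
// explicit `true` flips it. No checks at all is also a failure.
function verifyFailClosed(checks: VerifyCheck[]): { ok: boolean; failed: string[] } {
  if (checks.length === 0) {
    return { ok: false, failed: ["no checks ran"] };
  }
  const failed: string[] = [];
  for (const check of checks) {
    let passed = false;
    try {
      passed = check.run() === true;
    } catch {
      passed = false; // an error is never treated as success
    }
    if (!passed) failed.push(check.label);
  }
  return { ok: failed.length === 0, failed };
}

// Stubbed checks; real ones would read ground truth from the host.
const hostChecks: VerifyCheck[] = [
  { label: "live release is the new one", run: () => true },
  { label: "plugin loaded", run: () => true },
  { label: "journal clean", run: () => { throw new Error("journal unreadable"); } },
  { label: "route probe with admin cookie", run: () => true },
];

console.log(verifyFailClosed(hostChecks));
// -> { ok: false, failed: [ 'journal clean' ] }
```

The design choice is that uncertainty pauses the gate: a check that could not run is reported the same way as a check that ran and failed.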


8. Documentation & delivery standards


9. Worked example — dev-guide-proof

The proof that this guide is sufficient is a minimal plugin authored solely by following it: netos-atlas-plugins/plugins/dev-guide-proof/. It is a read-only TypeScript plugin scaffolded from system-capability-spine, with:

It exercises the highest-value path the guide documents: importing PluginContext / ModuleInitResult from @netos/atlas-sdk with no local mirror, hand-authoring a strict-schema-valid manifest, adding the slug to atlas_pack_catalog.core in packs.yml, and running a colocated test. It is additive and disposable — it can stay as a permanent reference exhibit or be removed with no schema or DB impact.

Validated (DG1 S3), all from inside the plugin dir:

Gaps found while authoring solely from this guide, folded back above: the SDK has no defineManifest helper (section 4 import corrected); the manifest JSON Schema lives in the SDK repo and the enforced gate is the kernel's strict zod ManifestSchema, not a schemas/plugin.schema.json in the plugins repo (section 4); and the pack catalog is nested under atlas_pack_catalog: with the plugin.json pack field as source of truth (sections 4 and 5). The guide and the real path are now in sync.
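The pack-catalog gap above lends itself to a quick consistency check: the plugin id should appear under atlas_pack_catalog.<pack> for exactly the pack named in plugin.json. The sketch below assumes both files have already been parsed into plain objects (parsing is out of scope), and the "extras" pack and packCatalogProblems helper are hypothetical names used only for illustration.

```typescript
// Assumed in-memory shapes after parsing plugin.json and packs.yml.
interface ManifestPackInfo {
  id: string;
  pack: string;
}
// atlas_pack_catalog.<pack> -> list of plugin ids
type PackCatalog = Record<string, string[]>;

// plugin.json's pack field is the source of truth; report any mismatch.
function packCatalogProblems(manifest: ManifestPackInfo, catalog: PackCatalog): string[] {
  const problems: string[] = [];
  const declared = catalog[manifest.pack] ?? [];
  if (!declared.includes(manifest.id)) {
    problems.push(`${manifest.id} is missing from atlas_pack_catalog.${manifest.pack}`);
  }
  for (const pack of Object.keys(catalog)) {
    if (pack !== manifest.pack && catalog[pack].includes(manifest.id)) {
      problems.push(`${manifest.id} is listed under ${pack}, but plugin.json says ${manifest.pack}`);
    }
  }
  return problems;
}

const proofManifest: ManifestPackInfo = { id: "dev-guide-proof", pack: "core" };

// In sync: the id sits under the pack that plugin.json names.
console.log(packCatalogProblems(proofManifest, { core: ["dev-guide-proof"], extras: [] }));
// Out of sync: the id was added under a different (hypothetical) pack.
console.log(packCatalogProblems(proofManifest, { core: [], extras: ["dev-guide-proof"] }));
```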


Capabilities cited in this guide