
SpecKit Autopilot

A durable, multi-agent orchestration overlay for complex software development projects. This system extends GitHub's SpecKit with autonomous execution, human-in-the-loop governance, and resilient long-running processes.

Features

  • Multi-Agent Architecture: Specialized agents for planning, implementation, validation, review, merging, and spec evolution
  • Durable Execution: SQLite-based state persistence with crash recovery and resumable workflows
  • Human-in-the-Loop (HITL): Tiered approval system (REQUIRED/OPTIONAL/NONE) based on code criticality
  • Local LLM Support: Works with Ollama, LM Studio, or cloud APIs
  • Secure Execution: Sandboxed environment with configurable guardrails
  • SpecKit Integration: Consumes existing SpecKit artifacts (spec.yaml, plan.md, tasks.md)

Quick Start

1. Prerequisites

  • Python 3.11+
  • LM Studio (recommended) or Ollama
  • Git repository with SpecKit artifacts

2. Setup

# Install dependencies
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Configure your LLM backend
cp autopilot/config.example.json autopilot/config.json
# Edit config.json to point to your local LLM

3. Start Your Local LLM Server

Option A: LM Studio (Recommended)

# 1. Download and install LM Studio from https://lmstudio.ai
# 2. Open LM Studio → Server tab
# 3. Download a coding model like:
#    - Qwen2.5-Coder-7B-Instruct
#    - Llama-3.1-8B-Instruct
#    - CodeLlama-7B-Instruct
# 4. Click "Start Server" (usually runs on http://localhost:1234/v1)
# 5. Note the exact model name shown in LM Studio

Option B: Ollama

# Start Ollama
ollama serve

# Pull required models
ollama pull qwen2.5-coder:7b
ollama pull llama3.1:8b

4. Configure the Model Names

Edit autopilot/config.json and update the model names to match exactly what your LLM server shows:

{
  "agents": {
    "planner": {
      "backend": "lmstudio",
      "model": "YOUR-EXACT-MODEL-NAME-HERE"
    }
  }
}

5. Run the Interactive Chat Interface

# Start the chat interface (recommended)
python -m autopilot.server
# Open http://127.0.0.1:5173/ in your browser

# Or run the orchestrator directly (command-line only)
python -m autopilot.orchestrator

🤖 Interactive Chat Interface (NEW!)

The primary way to use SpecKit Autopilot is now the conversational chat interface.

Instead of writing YAML files by hand, simply chat with the AI assistant to build your specification naturally through conversation.

Features:

  • πŸ—£οΈ Natural Language Spec Building - Describe your project in plain English
  • πŸ“Š Real-time Progress Tracking - Watch your specification grow as you chat
  • πŸ”„ Live Spec Updates - See your YAML specification update in real-time
  • 🎯 Smart Suggestions - Get contextual prompts to help define your project
  • πŸš€ Integrated Execution - Start building your project with one click
  • πŸ“± Beautiful Interface - Modern, responsive design that works everywhere

Quick Start with Chat:

  1. Start the server: python -m autopilot.server
  2. Open your browser: http://127.0.0.1:5173/
  3. Start chatting: "I want to build a web application for managing tasks"
  4. Watch the magic: Your specification builds itself as you describe your project!

Traditional GUI (Dashboard)

You can also access the traditional task management dashboard at http://127.0.0.1:5173/dashboard

Start the GUI server

Option 1: Quick Start Scripts

# Windows
.\start-gui.bat

# Linux/macOS  
./start-gui.sh

Option 2: Manual Launch

# Make sure dependencies are installed
pip install -r requirements.txt

# Start the FastAPI server (serves API + static UI)
python -m autopilot.server

# Open the UI in your browser
http://127.0.0.1:5173/

What the GUI provides

  • Real-time task visualization with status indicators
  • One-click orchestrator control (start/stop)
  • HITL approval interface - approve tasks directly from the browser
  • Live system logs with auto-refresh
  • Task dependency visualization
  • Beautiful, responsive interface with modern design

The GUI automatically refreshes every 3 seconds when the orchestrator is running, providing real-time visibility into your autonomous development workflow.

Configuration

LLM Backend (autopilot/config.json)

For LM Studio:

{
  "llm_backends": {
    "lmstudio": {
      "type": "openai",
      "base_url": "http://localhost:1234/v1",
      "api_key": "lm-studio-placeholder"
    }
  },
  "agents": {
    "planner": { "backend": "lmstudio", "model": "Qwen2.5-Coder-7B-Instruct", "temperature": 0.2 },
    "implementer": { "backend": "lmstudio", "model": "Qwen2.5-Coder-7B-Instruct", "temperature": 0.1 },
    "reviewer": { "backend": "lmstudio", "model": "Qwen2.5-Coder-7B-Instruct", "temperature": 0.0 }
  }
}

For Ollama:

{
  "llm_backends": {
    "ollama": {
      "type": "openai",
      "base_url": "http://localhost:11434/v1",
      "api_key": "ollama-placeholder"
    }
  },
  "agents": {
    "planner": { "backend": "ollama", "model": "qwen2.5-coder:7b", "temperature": 0.2 },
    "implementer": { "backend": "ollama", "model": "qwen2.5-coder:7b", "temperature": 0.1 },
    "reviewer": { "backend": "ollama", "model": "qwen2.5-coder:7b", "temperature": 0.0 }
  }
}
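
To make the mapping between the two JSON sections concrete, here is a minimal sketch (the function name and return shape are illustrative assumptions, not the actual llm.py API) of resolving an agent's effective settings from config.json, using the field names shown above:

```python
# Hypothetical helper: look up an agent's entry, follow its "backend"
# reference into "llm_backends", and merge the two into one settings dict.
def resolve(config: dict, agent: str) -> dict:
    spec = config["agents"][agent]
    backend = config["llm_backends"][spec["backend"]]
    return {
        "base_url": backend["base_url"],   # OpenAI-compatible endpoint
        "api_key": backend["api_key"],     # placeholder value for local servers
        "model": spec["model"],            # must match the server's name exactly
        "temperature": spec.get("temperature", 0.0),
    }
```

Any OpenAI-compatible client can then be constructed from the returned base_url, api_key, and model.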

Guardrails (policies/guardrails.yaml)

Configure which paths require human approval:

hitl:
  lanes:
    HITL_REQUIRED:
      paths: ["src/core/**", "src/kernel/**"]
      timeout_h: 72
    HITL_OPTIONAL:
      paths: ["src/tools/**", "docs/**"]
      timeout_h: 24
    HITL_NONE:
      paths: ["tests/**", "scripts/**"]

Human-in-the-Loop Approvals

Local Approvals

Create approval files for tasks requiring human review:

# List pending approvals
ls .autopilot/hitl/*.awaiting.json

# Approve a task
touch .autopilot/hitl/TASK_ID.approved

GitHub PR Approvals

  1. Enable the HITL workflow in .github/workflows/hitl-approve.yml
  2. Comment /approve TASK_ID on PRs
  3. Or add the HITL-APPROVED label

Directory Structure

├── autopilot/              # Orchestration engine
│   ├── orchestrator.py     # Main loop
│   ├── dag.py              # Task dependency management
│   ├── llm.py              # LLM client (OpenAI-compatible APIs)
│   ├── policy.py           # HITL lane routing
│   ├── hitl.py             # Human approval gates
│   ├── memory.py           # Project context (future: vector DB)
│   ├── config.json         # LLM backend configuration
│   └── agents/             # Specialized agents
│       ├── planner.py      # Task planning
│       ├── implementer.py  # Code generation
│       ├── validator.py    # Testing & validation
│       ├── reviewer.py     # Code review
│       ├── merger.py       # PR merging
│       └── steward.py      # Spec evolution
├── .autopilot/             # Runtime state
│   ├── state.db            # Task database
│   ├── hitl/               # Approval tokens
│   └── runs/               # Execution logs
├── policies/
│   └── guardrails.yaml     # Security & approval policies
├── .github/workflows/      # CI & HITL workflows
├── spec/                   # SpecKit artifacts
│   └── spec.yaml           # Project specification
└── requirements.txt

Agent Workflow

  1. Planner: Analyzes spec.yaml and breaks down objectives into tasks
  2. Implementer: Generates code and creates branches/PRs
  3. Validator: Runs tests, static analysis, and domain-specific checks
  4. Reviewer: Performs code review and security analysis
  5. Merger: Merges approved changes or rolls back on failure
  6. Steward: Proposes spec changes when implementation reality diverges
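
The per-task flow above can be sketched as a simple pipeline (illustrative only; the real orchestrator drives this from the task DAG, and the Planner runs earlier to produce the tasks rather than processing one):

```python
# Each stage must accept the task before the next runs; a rejection
# short-circuits the task so later agents never see it.
PIPELINE = ["implementer", "validator", "reviewer", "merger"]

def run_task(task: dict, agents: dict) -> str:
    for stage in PIPELINE:
        if not agents[stage](task):
            return f"failed at {stage}"
    return "merged"
```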

Development Workflow

  1. Define project in spec/spec.yaml
  2. Run python -m autopilot.orchestrator
  3. System creates task DAG and begins execution
  4. Review and approve tasks in HITL lanes as needed
  5. Monitor progress in .autopilot/runs/ logs
  6. System continues until all objectives are completed

Advanced Features

For OS Development

  • Add QEMU validation in autopilot/agents/validator.py
  • Configure kernel-specific HITL lanes
  • Enable hardware-in-the-loop testing

Cost Controls

Set API budgets in policies/guardrails.yaml:

cost_controls:
  daily_budget: 50.0
  max_tokens_per_task: 4096
  alert_threshold: 0.8
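
A minimal sketch of how these three fields could be enforced (the class and its return values are illustrative assumptions; actual spend tracking would persist in the orchestrator's state):

```python
# Tracks cumulative daily spend against the guardrails fields above:
# "alert" fires at alert_threshold * daily_budget, "halt" at the budget.
class BudgetGuard:
    def __init__(self, daily_budget: float, alert_threshold: float):
        self.daily_budget = daily_budget
        self.alert_threshold = alert_threshold
        self.spent = 0.0

    def record(self, cost: float) -> str:
        self.spent += cost
        if self.spent >= self.daily_budget:
            return "halt"    # stop issuing LLM calls for the day
        if self.spent >= self.alert_threshold * self.daily_budget:
            return "alert"   # warn the operator, keep running
        return "ok"
```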

Spec Evolution

The Steward agent can propose changes to spec.yaml when:

  • Implementation challenges are encountered
  • New requirements are discovered
  • Architecture decisions need documentation

Troubleshooting

LM Studio Connection Issues

# Check if LM Studio server is running
curl http://localhost:1234/v1/models

# Test chat completion
curl -X POST http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR-MODEL-NAME",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 50
  }'

Common LM Studio Issues:

  • Model not found: Use the exact model name shown in LM Studio's interface
  • Port conflict: LM Studio usually uses port 1234; check the Server tab for the actual URL
  • Model not loaded: Make sure you've clicked "Load Model" in LM Studio

Ollama Connection Issues

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Test model availability
ollama list

# Test Ollama OpenAI-compatible endpoint
curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Task Stuck in Approval

# Check pending approvals
sqlite3 .autopilot/state.db "SELECT * FROM tasks WHERE status='pending';"

# Manual approval
touch .autopilot/hitl/TASK_ID.approved

Reset State

# Clear all tasks and start fresh
rm .autopilot/state.db
rm -f .autopilot/hitl/*  # removes .awaiting.json files and .approved tokens

Contributing

This is a research prototype. To extend:

  1. Enhance agent implementations in autopilot/agents/
  2. Add domain-specific validators
  3. Improve spec parsing from SpecKit artifacts
  4. Add vector-based memory in memory.py

License

MIT License - see LICENSE file for details.

Source: README.autopilot.md, updated 2025-09-09