Quick Start

5-Minute Setup

Get from zero to managing your first incident in just 5 minutes.

Step 1: Start the Platform (1 minute)

# Using Go (fastest)
go run ./cmd/im serve

# Or using Docker
docker run -d -p 8080:8080 incident-management:latest

# Platform available at http://localhost:8080

Step 2: Create Your First Incident (2 minutes)

Using the Web Interface

  1. Open http://localhost:8080 in your browser
  2. Complete the setup wizard (if first time)
  3. Click “Declare Incident”
  4. Fill in the incident details
  5. Click “Create Incident”

Using the CLI

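# If you started the server with `go run` or Docker, build the CLI binary first
go build -o im ./cmd/im

# Declare an incident
./im declare --sev SEV-2 --title "Database connection issues" --service api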

# Output: Created incident INC-12345678
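When scripting around the CLI, it can help to capture the incident ID from the declare output so later commands can reuse it. The following is a minimal Go sketch, assuming the output looks like the example above; the ID pattern (`INC-` followed by letters or digits) and the helper name `extractIncidentID` are assumptions, not part of the platform.

```go
package main

import (
	"fmt"
	"regexp"
)

// incidentIDPattern matches IDs shaped like the example output above.
// Assumption: an ID is "INC-" followed by one or more letters or digits.
var incidentIDPattern = regexp.MustCompile(`INC-[0-9A-Za-z]+`)

// extractIncidentID returns the first incident ID found in the CLI output.
// The boolean is false when no ID is present.
func extractIncidentID(output string) (string, bool) {
	id := incidentIDPattern.FindString(output)
	return id, id != ""
}

func main() {
	id, ok := extractIncidentID("Created incident INC-12345678")
	if !ok {
		fmt.Println("no incident ID found")
		return
	}
	fmt.Println(id)
}
```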

Step 3: Manage the Incident (2 minutes)

# Acknowledge the incident (substitute the ID printed in Step 2)
./im ack --incident INC-12345678

# Add an update
./im update --incident INC-12345678 --note "Investigating MySQL cluster"

# Resolve the incident
./im resolve --incident INC-12345678 --note "Fixed by restarting MySQL cluster"

# Export timeline for documentation
./im timeline export --incident INC-12345678 --format md > postmortem.md

Congratulations! 🎉 You’ve successfully managed your first incident.

Understanding the Interface

Web Dashboard

The web dashboard gives you a single view of your incidents:

  • Incident List: Overview of all incidents with real-time updates
  • Incident Timeline: Chronological view of all incident events
  • Severity Badges: Color-coded severity levels (SEV-1 to SEV-4)
  • Status Tracking: Visual status indicators (Open, Mitigated, Resolved, Closed)
  • Auto-Refresh: Timeline updates every 5 seconds automatically

Key Components

Timeline View

Every action creates an immutable event in the incident timeline:

{
  "event_type": "incident.acknowledged",
  "timestamp": "2025-08-23T18:45:00Z",
  "actor": "user@example.com",
  "incident_id": "INC-12345678",
  "data": {
    "note": "Investigating database issues"
  }
}
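If you consume these events from Go, a struct that mirrors the example fields is usually enough. This is a sketch based only on the sample event above; the type name `TimelineEvent` and the helper `parseTimelineEvent` are hypothetical, and real events may carry additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sampleEvent is the example event from the Timeline View section.
const sampleEvent = `{
  "event_type": "incident.acknowledged",
  "timestamp": "2025-08-23T18:45:00Z",
  "actor": "user@example.com",
  "incident_id": "INC-12345678",
  "data": {
    "note": "Investigating database issues"
  }
}`

// TimelineEvent mirrors the fields shown in the example event.
// The shape of Data varies by event type, so it is kept as a generic map.
type TimelineEvent struct {
	EventType  string                 `json:"event_type"`
	Timestamp  time.Time              `json:"timestamp"`
	Actor      string                 `json:"actor"`
	IncidentID string                 `json:"incident_id"`
	Data       map[string]interface{} `json:"data"`
}

// parseTimelineEvent decodes one JSON event; RFC 3339 timestamps are
// parsed into time.Time by encoding/json.
func parseTimelineEvent(raw []byte) (TimelineEvent, error) {
	var ev TimelineEvent
	err := json.Unmarshal(raw, &ev)
	return ev, err
}

func main() {
	ev, err := parseTimelineEvent([]byte(sampleEvent))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s: %s by %s\n", ev.IncidentID, ev.EventType, ev.Actor)
}
```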

Status Flow

Incidents follow this lifecycle:

Open → Mitigated → Resolved → Closed
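The lifecycle can be modelled as an ordered list of statuses. The sketch below assumes transitions only move forward and may skip a step (the Quick Start resolves an incident without mitigating it first); whether the platform allows reopening or enforces stricter rules is not stated here, so treat `canTransition` as illustrative.

```go
package main

import "fmt"

// Status is one of the lifecycle states listed above.
type Status string

const (
	Open      Status = "Open"
	Mitigated Status = "Mitigated"
	Resolved  Status = "Resolved"
	Closed    Status = "Closed"
)

// lifecycle lists the statuses in the order given in this guide.
var lifecycle = []Status{Open, Mitigated, Resolved, Closed}

// rank returns the position of s in the lifecycle, or -1 if unknown.
func rank(s Status) int {
	for i, v := range lifecycle {
		if v == s {
			return i
		}
	}
	return -1
}

// canTransition reports whether moving from one status to another goes
// forward in the lifecycle. Assumption: skipping steps is allowed and
// moving backward is not.
func canTransition(from, to Status) bool {
	f, t := rank(from), rank(to)
	return f >= 0 && t > f
}

func main() {
	fmt.Println(canTransition(Open, Resolved))
	fmt.Println(canTransition(Closed, Open))
}
```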

Severity Levels

  • SEV-1: Critical system down, customer impact
  • SEV-2: Major functionality impacted
  • SEV-3: Minor functionality impacted
  • SEV-4: Informational, no customer impact
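Tools that wrap the CLI may want to validate a severity string before calling `./im declare`. A small sketch, assuming the only valid values are the four levels listed above; `ParseSeverity` is a hypothetical helper, not a platform API.

```go
package main

import (
	"fmt"
	"strings"
)

// Severity is the numeric part of a SEV-n label; lower means more severe.
type Severity int

// ParseSeverity accepts exactly "SEV-1" through "SEV-4".
func ParseSeverity(s string) (Severity, error) {
	rest := strings.TrimPrefix(s, "SEV-")
	if rest == s || len(rest) != 1 || rest[0] < '1' || rest[0] > '4' {
		return 0, fmt.Errorf("invalid severity %q: want SEV-1 to SEV-4", s)
	}
	return Severity(rest[0] - '0'), nil
}

// String renders the severity back in its SEV-n form.
func (s Severity) String() string {
	return fmt.Sprintf("SEV-%d", int(s))
}

func main() {
	sev, err := ParseSeverity("SEV-2")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(sev, "is more severe than SEV-3:", sev < 3)
}
```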

Essential CLI Commands

Incident Management

# Create incidents with different severities
./im declare --sev SEV-1 --title "Payment system down" --service payments
./im declare --sev SEV-3 --title "Slow API responses" --service api

# List all open incidents
./im list

# Get incident details
./im status --incident INC-12345678

# Add timeline updates
./im update --incident INC-12345678 --note "Found root cause in database query"

Server Operations

# Start server with custom port
./im serve --port 9090

# Start with PostgreSQL
DATABASE_URL="postgres://user:pass@localhost/incidents" ./im serve

# Enable debug logging
LOG_LEVEL=debug ./im serve

Data Export

# Export incident timeline as Markdown
./im timeline export --incident INC-12345678 --format md

# Export as JSON for analysis
./im timeline export --incident INC-12345678 --format json

# Export all incidents
./im export --format csv > incidents.csv

Development Workflow

Hot Reload Development

For active development with instant feedback:

# Install air for hot reload
go install github.com/air-verse/air@latest

# Start with hot reload (watches Go files)
air

# Server restarts automatically on code changes

Running Tests

# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run specific package tests
go test ./internal/server/...

Database Management

# Database is created automatically as incidents.db (SQLite)
ls -la incidents.db

# For PostgreSQL, set environment variable
export DATABASE_URL="postgres://incidents:password@localhost:5432/incidents?sslmode=disable"

# Verify database connection
./im db ping
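Before pointing the server at PostgreSQL, it can be useful to check what a `DATABASE_URL` actually refers to without printing the password. A sketch using Go's standard `net/url` package; `describeDatabaseURL` and `dbTarget` are hypothetical names, and the platform's own validation may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dbTarget holds the non-secret parts of a PostgreSQL connection URL.
type dbTarget struct {
	Host    string // host and port, for example "localhost:5432"
	Name    string // database name taken from the URL path
	SSLMode string // value of the sslmode query parameter, if any
}

// describeDatabaseURL parses a postgres:// URL and returns the parts
// that are safe to log. Credentials are deliberately left out.
func describeDatabaseURL(raw string) (dbTarget, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return dbTarget{}, err
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return dbTarget{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	return dbTarget{
		Host:    u.Host,
		Name:    strings.TrimPrefix(u.Path, "/"),
		SSLMode: u.Query().Get("sslmode"),
	}, nil
}

func main() {
	raw := "postgres://incidents:password@localhost:5432/incidents?sslmode=disable"
	target, err := describeDatabaseURL(raw)
	if err != nil {
		fmt.Println("invalid DATABASE_URL:", err)
		return
	}
	fmt.Printf("host=%s db=%s sslmode=%s\n", target.Host, target.Name, target.SSLMode)
}
```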

Configuration

Environment Variables

Create a .env file for consistent configuration:

# Server configuration
PORT=8080
LOG_LEVEL=info

# Database (optional - defaults to SQLite)
DATABASE_URL=postgres://incidents:password@localhost:5432/incidents

# JWT secret for production
JWT_SECRET=your-secret-key-here

# External integrations
REDIS_URL=redis://localhost:6379
GORUSH_URL=http://localhost:8088

Service Configuration

The platform creates sensible defaults automatically, but you can customize them:

# View current configuration
./im config show

# Set custom service mappings
./im config set services.api "API Service"
./im config set services.db "Database Cluster"

Integration Quick Setup

SCIM 2.0 User Provisioning

Enable automated user provisioning from your identity provider:

# Create SCIM service account
./im scim create-account --name "Okta Integration"

# Configure in your IdP:
# SCIM URL: https://your-domain/scim/v2
# Bearer Token: (from create-account output)
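To confirm the endpoint and token before configuring your IdP, you can build a SCIM list request by hand. The `/Users` resource path and the `application/scim+json` media type come from the SCIM 2.0 protocol (RFC 7644); that this platform serves them under the base URL above is an assumption, and `newSCIMListUsersRequest` is a hypothetical helper. The sketch only constructs the request and does not send it.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newSCIMListUsersRequest builds a GET request for the SCIM Users
// resource, authenticated with a bearer token.
func newSCIMListUsersRequest(baseURL, token string) (*http.Request, error) {
	endpoint := strings.TrimRight(baseURL, "/") + "/Users"
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/scim+json")
	return req, nil
}

func main() {
	// Placeholder values: substitute your domain and the token printed
	// by the create-account command.
	req, err := newSCIMListUsersRequest("https://your-domain/scim/v2", "example-token")
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it, pass req to http.DefaultClient.Do and close the body.
}
```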

Webhook Integrations

Set up real-time integrations with external systems:

# Add PagerDuty webhook
./im webhook add pagerduty --url https://your-domain/webhooks/pagerduty/v3

# Add Slack webhook
./im webhook add slack --url https://your-domain/webhooks/slack

Architecture Overview

Understanding the platform’s architecture helps you use it effectively:

Event-Sourced Timeline

  • Every incident action creates an immutable CloudEvent
  • Complete audit trail with timeline reconstruction capability
  • Events follow CloudEvents v1.0 specification
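For reference, CloudEvents v1.0 requires four context attributes on every event: `specversion`, `id`, `source`, and `type`. The sketch below models an envelope with those attributes and checks that they are present. The sample values (event ID, source path) are hypothetical, and how the platform maps its timeline fields onto the envelope is not documented here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// CloudEvent models the required CloudEvents v1.0 context attributes
// plus a few optional ones.
type CloudEvent struct {
	SpecVersion     string          `json:"specversion"`
	ID              string          `json:"id"`
	Source          string          `json:"source"`
	Type            string          `json:"type"`
	Time            string          `json:"time,omitempty"`
	DataContentType string          `json:"datacontenttype,omitempty"`
	Data            json.RawMessage `json:"data,omitempty"`
}

// Validate checks that the four required attributes are set and that
// the spec version is 1.0.
func (e CloudEvent) Validate() error {
	switch {
	case e.SpecVersion != "1.0":
		return fmt.Errorf("specversion must be \"1.0\", got %q", e.SpecVersion)
	case e.ID == "":
		return errors.New("id is required")
	case e.Source == "":
		return errors.New("source is required")
	case e.Type == "":
		return errors.New("type is required")
	}
	return nil
}

func main() {
	ev := CloudEvent{
		SpecVersion:     "1.0",
		ID:              "evt-0001",                // hypothetical event ID
		Source:          "/incidents/INC-12345678", // hypothetical source
		Type:            "incident.acknowledged",
		Time:            "2025-08-23T18:45:00Z",
		DataContentType: "application/json",
		Data:            json.RawMessage(`{"note":"Investigating database issues"}`),
	}
	if err := ev.Validate(); err != nil {
		fmt.Println("invalid event:", err)
		return
	}
	out, _ := json.MarshalIndent(ev, "", "  ")
	fmt.Println(string(out))
}
```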

Tech Stack

  • Backend: Go with Gorilla Mux
  • Frontend: HTMX + Tailwind CSS v4 + Alpine.js
  • Database: SQLite (development) → PostgreSQL (production)
  • Real-time: WebSocket and Server-Sent Events

Key Features

✅ Server-side rendered with progressive enhancement
✅ Auto-refreshing incident timeline (every 5s)
✅ Modal forms with smooth transitions
✅ Optimistic UI updates
✅ Policy-based authorization with OPA
✅ Group-based access control

Next Steps

Now that you’re up and running:

  1. Create Your First Incident - Detailed walkthrough of the incident management workflow

  2. Set Up Integrations - Connect PagerDuty, Slack, and other tools

  3. Configure Security - Set up authentication and fine-grained permissions

  4. Deploy to Production - Scale for enterprise use

  5. Explore the API - Automate with the comprehensive REST API and CLI

Common Workflows

Daily Incident Management

# Check for open incidents
./im list

# Acknowledge new incidents
./im ack --incident INC-12345678

# Add status updates
./im update --incident INC-12345678 --note "Applied hotfix, monitoring"

# Resolve when fixed
./im resolve --incident INC-12345678 --note "Issue resolved, services restored"

Post-Incident Activities

# Export timeline for postmortem
./im timeline export --incident INC-12345678 --format md > postmortem-inc-12345678.md

# Generate incident metrics
./im metrics --start "2025-01-01" --end "2025-01-31" --format csv

Team Coordination

Use the web interface for collaborative incident management:

  • Real-time timeline updates across all team members
  • Interactive severity and status changes
  • Built-in audit trail for compliance
  • Integration with chat platforms for team notifications

The platform is designed to grow with your team, from simple CLI usage to comprehensive enterprise incident management.