Timeline Service Guide

The Timeline Service is the event-sourced core of the incident management platform, providing an append-only ledger that stores all incident-related events in CloudEvents v1.0 format.

Overview

The Timeline Service acts as a single source of truth for incident events across all integrated systems (ServiceNow, Jira, PagerDuty, Slack, etc.). Every action taken during an incident lifecycle is captured as an immutable event with full traceability.

Key Features

  • Idempotent Writes: Duplicate events are automatically detected and are not stored a second time
  • Rich Querying: Filter events by type, time range, actor, and correlation keys
  • Policy Enforcement: RBAC/ABAC controls with automatic field redaction for sensitive data
  • Multiple Export Formats: Export timelines as JSON (CloudEvents) or Markdown reports

Getting Started

Prerequisites

  • Incidents platform installed and running
  • Valid authentication token or CLI configured
  • At least one incident declared in the system

Basic CLI Usage

The incidents timeline command group provides all timeline operations:

# List all timeline events for an incident
incidents timeline list --incident INC-1234

# Get details of a specific event
incidents timeline get evt_550e8400

# Export timeline as Markdown report
incidents timeline export --incident INC-1234 --format md --output report.md

# Export timeline as JSON (CloudEvents format)
incidents timeline export --incident INC-1234 --format json --output events.json

Querying Timelines

Filtering by Event Type

Use glob patterns to match event types:

# Only incident state changes
incidents timeline list --incident INC-1234 --type "im.incident.*"

# Only alert events
incidents timeline list --incident INC-1234 --type "im.alert.*"

# Specific event type
incidents timeline list --incident INC-1234 --type "im.incident.acknowledge"
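
To preview which event types a pattern would select, the matching can be approximated locally. This is a minimal sketch that assumes the --type patterns behave like shell-style globs where * also matches dots; the service's exact matching rules are not specified in this guide, and match_types is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Event types taken from this guide.
EVENT_TYPES = [
    "im.incident.declare",
    "im.incident.acknowledge",
    "im.alert.trigger",
    "im.alert.acknowledge",
    "im.note.add",
]


def match_types(pattern: str, event_types: list[str]) -> list[str]:
    """Return the event types matching a shell-style glob (case-sensitive)."""
    return [t for t in event_types if fnmatchcase(t, pattern)]


incident_events = match_types("im.incident.*", EVENT_TYPES)
acknowledgements = match_types("im.*.acknowledge", EVENT_TYPES)
print(incident_events)
print(acknowledgements)
```

A pattern without wildcards, such as "im.incident.acknowledge", matches only that exact type.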

Time Range Filtering

Query events within a specific time window:

# Events from the last hour (BSD/macOS date syntax;
# with GNU date, use: date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
incidents timeline list --incident INC-1234 \
  --from "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)"

# Events between specific timestamps
incidents timeline list --incident INC-1234 \
  --from "2025-12-10T00:00:00Z" \
  --to "2025-12-10T23:59:59Z"
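
Because date flags differ between platforms, scripts may prefer to compute the window in code. The sketch below builds --from and --to values in the same UTC timestamp format used above; utc_timestamp and last_hours_window are hypothetical helper names, not part of the CLI.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp with second precision."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def last_hours_window(hours: int, now: Optional[datetime] = None) -> tuple[str, str]:
    """Return (from, to) timestamps covering the last `hours` hours."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return utc_timestamp(start), utc_timestamp(end)


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2025, 12, 10, 12, 0, 0, tzinfo=timezone.utc)
window = last_hours_window(1, now=fixed_now)
print(window)
```

The two strings can be passed directly as the --from and --to arguments.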

Actor Filtering

Find events performed by specific users or systems:

# Events by specific user
incidents timeline list --incident INC-1234 --actor "alice@example.com"

# Events from automated systems
incidents timeline list --incident INC-1234 --actor "system:pagerduty"

Output Formats

The CLI supports multiple output formats:

# Table format (default)
incidents timeline list --incident INC-1234

# JSON format
incidents timeline list --incident INC-1234 --output json

# JSON with pretty printing
incidents timeline list --incident INC-1234 --output json --pretty

Understanding Event Types

The platform uses standardized CloudEvents types with the im.* namespace:

Incident Lifecycle Events

  • im.incident.declare - New incident created
  • im.incident.acknowledge - Incident acknowledged by responder
  • im.incident.escalate - Incident escalated to higher tier
  • im.incident.resolve - Incident marked as resolved
  • im.incident.close - Incident formally closed

Alert Events

  • im.alert.trigger - New alert received
  • im.alert.acknowledge - Alert acknowledged
  • im.alert.resolve - Alert resolved

Collaboration Events

  • im.note.add - Note added to timeline
  • im.status_update.post - Status update posted
  • im.runbook.execute - Runbook execution started

Integration Events

  • im.servicenow.sync - ServiceNow ticket synchronized
  • im.jira.sync - Jira issue synchronized
  • im.slack.message - Slack message linked

API Usage

Append Events (HTTP API)

Create new timeline events via the REST API:

curl -X POST https://incidents.example.com/api/v1/timeline/events \
  -H "Content-Type: application/cloudevents+json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "specversion": "1.0",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "source": "im://api",
    "type": "im.incident.declare",
    "subject": "incident/INC-1234",
    "time": "2025-12-11T10:30:00Z",
    "data": {
      "title": "Database Connection Pool Exhausted",
      "severity": "SEV-2",
      "service": "payments-api"
    },
    "im.actor": "oncall@example.com",
    "im.origin_id": "pagerduty-alert-123"
  }'

Response (201 Created):

{
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "created": true,
  "incident_id": "INC-1234"
}

Response (200 OK - Duplicate Detected):

{
  "event_id": "550e8400-e29b-41d4-a716-446655440000",
  "created": false,
  "incident_id": "INC-1234",
  "message": "Event already exists with this dedupe key"
}
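
The two responses above can be modeled with a small in-memory ledger. This is an illustrative sketch, not the service's implementation: the dedupe key composition (source plus im.origin_id, falling back to the event id) and the way incident_id is derived from subject are assumptions, and InMemoryLedger is a hypothetical class.

```python
from typing import Any


class InMemoryLedger:
    """Toy append-only ledger that ignores repeated dedupe keys."""

    def __init__(self) -> None:
        self._events: list[dict[str, Any]] = []       # append-only storage
        self._by_key: dict[str, dict[str, Any]] = {}  # dedupe key -> first stored event

    def __len__(self) -> int:
        return len(self._events)

    @staticmethod
    def dedupe_key(event: dict[str, Any]) -> str:
        # Assumption: key is source + origin ID, falling back to the event ID.
        origin = event.get("im.origin_id") or event["id"]
        return f'{event["source"]}|{origin}'

    def append(self, event: dict[str, Any]) -> dict[str, Any]:
        """Store the event once; report created=False for a repeated key."""
        key = self.dedupe_key(event)
        # Assumption: subject has the form "incident/<incident id>".
        incident_id = event["subject"].split("/", 1)[1]
        existing = self._by_key.get(key)
        if existing is not None:
            return {"event_id": existing["id"], "created": False, "incident_id": incident_id}
        self._events.append(event)
        self._by_key[key] = event
        return {"event_id": event["id"], "created": True, "incident_id": incident_id}


ledger = InMemoryLedger()
event = {
    "specversion": "1.0",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "source": "im://api",
    "type": "im.incident.declare",
    "subject": "incident/INC-1234",
    "im.origin_id": "pagerduty-alert-123",
}
first = ledger.append(event)
second = ledger.append(dict(event))  # a retry of the same event
print(first["created"], second["created"], len(ledger))
```

Retrying a failed or timed-out request is therefore safe: the second call changes nothing and reports created as false.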

Query Events (HTTP API)

Retrieve timeline events with filters:

# Simple query
curl "https://incidents.example.com/api/v1/incidents/INC-1234/timeline" \
  -H "Authorization: Bearer $TOKEN"

# Query with filters
curl "https://incidents.example.com/api/v1/incidents/INC-1234/timeline?\
type=im.incident.*&\
from=2025-12-11T00:00:00Z&\
to=2025-12-11T23:59:59Z&\
limit=100" \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "events": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "incident_id": "INC-1234",
      "event_time": "2025-12-11T10:30:00Z",
      "event_type": "im.incident.declare",
      "actor": "oncall@example.com",
      "title": "Database Connection Pool Exhausted",
      "content": "Payment API experiencing connection timeouts",
      "source": "im://api",
      "created_at": "2025-12-11T10:30:01Z"
    }
  ],
  "next_cursor": "MjAyNS0xMi0xMVQxMDozMDowMFo6NTUwZTg0MDA="
}

Pagination

The API uses cursor-based pagination for efficient traversal of large timelines:

# First page
curl "https://incidents.example.com/api/v1/incidents/INC-1234/timeline?limit=50" \
  -H "Authorization: Bearer $TOKEN"

# Next page using cursor from previous response
curl "https://incidents.example.com/api/v1/incidents/INC-1234/timeline?\
limit=50&\
cursor=MjAyNS0xMi0xMVQxMDozMDowMFo6NTUwZTg0MDA=" \
  -H "Authorization: Bearer $TOKEN"
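
A client typically wraps this in a loop that follows next_cursor. The sketch below assumes that the last page omits next_cursor (or returns it empty), which this guide does not state explicitly; timeline_url and iter_events are hypothetical helpers, and the HTTP call is replaced by a stand-in so the example runs offline. The cursor is treated as an opaque string and percent-encoded when placed in the URL.

```python
from typing import Any, Callable, Iterator, Optional
from urllib.parse import urlencode

BASE_URL = "https://incidents.example.com/api/v1/incidents"


def timeline_url(incident_id: str, limit: int, cursor: Optional[str] = None) -> str:
    """Build a timeline query URL, percent-encoding the cursor value."""
    params: dict[str, Any] = {"limit": limit}
    if cursor:
        params["cursor"] = cursor
    return f"{BASE_URL}/{incident_id}/timeline?{urlencode(params)}"


def iter_events(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
) -> Iterator[dict[str, Any]]:
    """Yield events from every page, following next_cursor until it is absent."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("events", [])
        cursor = page.get("next_cursor")
        if not cursor:  # assumed end-of-results signal
            break


# Stand-in for real HTTP responses, keyed by the cursor that was requested.
FAKE_PAGES = {
    None: {"events": [{"id": "a"}, {"id": "b"}], "next_cursor": "cursor-2"},
    "cursor-2": {"events": [{"id": "c"}]},
}

event_ids = [e["id"] for e in iter_events(lambda cursor: FAKE_PAGES[cursor])]
url = timeline_url("INC-1234", 50, "MjAyNS0xMi0xMVQxMDozMDowMFo6NTUwZTg0MDA=")
print(event_ids)
print(url)
```

In real use, fetch_page would issue the authenticated GET request for timeline_url(...) and return the parsed JSON body.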

Export Formats

JSON Export (CloudEvents Format)

Export the complete timeline as a CloudEvents v1.0 array:

incidents timeline export --incident INC-1234 --format json > timeline.json

Output Structure:

[
  {
    "specversion": "1.0",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "source": "im://api",
    "type": "im.incident.declare",
    "subject": "incident/INC-1234",
    "time": "2025-12-11T10:30:00Z",
    "datacontenttype": "application/json",
    "data": {
      "title": "Database Connection Pool Exhausted",
      "severity": "SEV-2"
    },
    "im.actor": "oncall@example.com",
    "im.incident_id": "INC-1234"
  }
]
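
An exported file can be sanity-checked before it is handed to other tools. The sketch below verifies the four attributes that CloudEvents v1.0 requires on every event (specversion, id, source, type) and counts events per type; missing_attributes and summarize are hypothetical helpers, and the sample data mirrors the structure shown above.

```python
import json
from collections import Counter

# Attributes that CloudEvents v1.0 marks as REQUIRED on every event.
REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def missing_attributes(event: dict) -> list[str]:
    """Return the required CloudEvents attributes that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not event.get(name)]


def summarize(export_text: str) -> Counter:
    """Parse an exported timeline and count events per type."""
    events = json.loads(export_text)
    for index, event in enumerate(events):
        missing = missing_attributes(event)
        if missing:
            raise ValueError(f"event {index} is missing {missing}")
    return Counter(event["type"] for event in events)


sample_export = json.dumps([
    {
        "specversion": "1.0",
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "source": "im://api",
        "type": "im.incident.declare",
        "subject": "incident/INC-1234",
        "time": "2025-12-11T10:30:00Z",
        "data": {"title": "Database Connection Pool Exhausted", "severity": "SEV-2"},
    }
])
counts = summarize(sample_export)
print(counts)
```

To check a real export, read timeline.json from disk and pass its contents to summarize.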

Markdown Export (Report Format)

Generate a human-readable incident report:

incidents timeline export --incident INC-1234 --format md > report.md

Output Structure:

# Incident Timeline: INC-1234

**Generated**: 2025-12-11T15:30:00Z
**Events**: 42

## Events

### 2025-12-11T10:30:00Z - Incident Declared

**Actor**: oncall@example.com
**Type**: im.incident.declare

Database Connection Pool Exhausted

Payment API experiencing connection timeouts affecting 15% of requests.

---

### 2025-12-11T10:32:15Z - Incident Acknowledged

**Actor**: alice@example.com
**Type**: im.incident.acknowledge

Acknowledged by on-call SRE. Investigating connection pool settings.

---

Policy Enforcement and Permissions

Required Permissions

Timeline operations require specific roles:

  • Read Access: timeline.reader role
  • Write Access: timeline.writer role
  • Export Access: timeline.exporter role

Visibility Control

Events can have visibility restrictions via the im.visibility extension:

  • public - Visible to all users with the timeline.reader role
  • restricted - Visible only to users with the restricted_access attribute
  • confidential - Visible only to users with the confidential_access attribute
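
The visibility rules can be sketched as a simple check. This is not the platform's policy code: it assumes that events without im.visibility are public, that restricted and confidential events also require the timeline.reader role, and that holding one access attribute does not imply the other. can_view is a hypothetical helper.

```python
# Assumption: each non-public level maps to exactly one user attribute.
REQUIRED_ATTRIBUTE = {
    "restricted": "restricted_access",
    "confidential": "confidential_access",
}


def can_view(event: dict, roles: set[str], attributes: set[str]) -> bool:
    """Decide whether a user may see an event based on im.visibility."""
    if "timeline.reader" not in roles:
        return False
    visibility = event.get("im.visibility", "public")  # assumed default
    if visibility == "public":
        return True
    needed = REQUIRED_ATTRIBUTE.get(visibility)
    # Unknown visibility values are denied rather than guessed at.
    return needed is not None and needed in attributes


reader = {"timeline.reader"}
print(can_view({"im.visibility": "public"}, reader, set()))
print(can_view({"im.visibility": "restricted"}, reader, set()))
print(can_view({"im.visibility": "restricted"}, reader, {"restricted_access"}))
```

Denying unknown visibility values is a deliberately conservative choice for this sketch.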

Data Classification and Redaction

Events containing sensitive data use im.data_class tagging:

  • pii - Personally Identifiable Information (email, name, phone)
  • phi - Protected Health Information
  • financial - Financial data
  • legal_hold - Data under legal preservation

Automatic Redaction: Users without appropriate data access attributes will see redacted fields:

{
  "event_type": "im.incident.declare",
  "title": "Customer Account Issue",
  "data": {
    "customer_email": "[REDACTED]",
    "customer_name": "[REDACTED]",
    "account_id": "12345"
  },
  "im.redaction_applied": true
}
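
The effect of redaction can be illustrated with a short sketch. How the service maps individual fields to data classes is not described in this guide, so the field_classes mapping below is an assumption, as is the redact helper itself.

```python
import copy

REDACTED = "[REDACTED]"


def redact(event: dict, field_classes: dict[str, str], allowed_classes: set[str]) -> dict:
    """Return a copy of the event with fields in disallowed data classes masked."""
    result = copy.deepcopy(event)  # never mutate the stored event
    redacted_any = False
    for field, data_class in field_classes.items():
        if field in result.get("data", {}) and data_class not in allowed_classes:
            result["data"][field] = REDACTED
            redacted_any = True
    if redacted_any:
        result["im.redaction_applied"] = True
    return result


event = {
    "event_type": "im.incident.declare",
    "title": "Customer Account Issue",
    "data": {
        "customer_email": "bob@example.com",
        "customer_name": "Bob",
        "account_id": "12345",
    },
}
# Hypothetical mapping of data fields to im.data_class values.
field_classes = {"customer_email": "pii", "customer_name": "pii"}

masked = redact(event, field_classes, allowed_classes=set())
unmasked = redact(event, field_classes, allowed_classes={"pii"})
print(masked["data"])
```

Working on a copy keeps the underlying ledger entry intact, so users with the right attributes still see the original values.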

Best Practices

Event Design

  1. Use Descriptive Titles: Event titles should be concise but informative
  2. Include Context in Data: Store structured data for filtering and analysis
  3. Set Appropriate Visibility: Default to public; use restricted or confidential only when the content requires it
  4. Tag Sensitive Data: Always mark PII/PHI with im.data_class

Querying Efficiently

  1. Use Time Ranges: Always constrain queries to relevant time windows
  2. Filter Early: Apply type and actor filters to reduce result sets
  3. Paginate Large Results: Use cursor-based pagination for timelines with more than 100 events
  4. Cache Export Results: Store exported timelines rather than regenerating

Integration Patterns

  1. Idempotency Keys: Always provide im.origin_id from source systems
  2. Correlation Keys: Use im.correlation_key to group related events
  3. Fingerprinting: Set im.fingerprint for automatic aggregation
  4. Related Events: Link parent/child events with im.related_ids
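
The idempotency-key advice can be sketched as a small builder that always includes the source system's own identifier. The colon-separated layout follows the "pagerduty:alert:P123456" example used in this guide; make_origin_id is a hypothetical helper, and the rule that parts must not contain colons is an assumption made to keep IDs unambiguous.

```python
def make_origin_id(source: str, kind: str, native_id: str) -> str:
    """Build an origin ID such as 'pagerduty:alert:P123456'."""
    parts = [source, kind, native_id]
    if any(not part or ":" in part for part in parts):
        raise ValueError("each part must be non-empty and must not contain ':'")
    return ":".join(parts)


origin_id = make_origin_id("pagerduty", "alert", "P123456")
print(origin_id)
```

Rejecting an empty native_id prevents the non-unique form ("pagerduty:alert") that would make every alert from that source collide.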

Troubleshooting

Common Issues

Permission Denied Errors

Problem: API returns 403 Forbidden

Solution: Verify that the user has the required role, and assign it if it is missing:

incidents user roles list --email user@example.com
incidents user roles add --email user@example.com --role timeline.reader

Duplicate Event Conflicts

Problem: An event arrives with the same dedupe key as an existing event but a different payload

Solution: Check your im.origin_id generation logic. Each origin ID must uniquely identify one event within its source system:

# Good: origin_id includes unique alert ID
"im.origin_id": "pagerduty:alert:P123456"

# Bad: origin_id is not unique
"im.origin_id": "pagerduty:alert"

Slow Query Performance

Problem: Timeline queries take longer than 1 second

Solutions:

  • Add time range filters to constrain results
  • Use more specific event type patterns
  • Check that the required database indexes are present
  • Consider read replicas for high query loads

PDP Unavailable (503 Error)

Problem: The Policy Decision Point (PDP), served by OPA, is not responding

Solution:

  1. Check OPA service health: curl http://opa:8181/health
  2. Verify policy bundles are loaded
  3. Check network connectivity to OPA
  4. Review OPA logs for errors

Monitoring and Observability

Key Metrics

Monitor these metrics for timeline service health:

  • timeline_events_total - Total events appended
  • timeline_dedupe_hits_total - Duplicate event detections
  • timeline_query_duration_seconds - Query latency (p50, p95, p99)
  • timeline_export_duration_seconds - Export operation latency

Distributed Tracing

All timeline operations emit OpenTelemetry spans:

View traces in your tracing backend (Jaeger, Tempo, etc.) and look for spans with these names:

  • timeline.append
  • timeline.query
  • timeline.export
  • timeline.policy.evaluate

Next Steps