PagerDuty Integration

Overview

The PagerDuty connector provides bi-directional integration between the incident management (IM) platform and PagerDuty, so that incidents, status changes, assignments, and notes created in one system appear in the other.

Key Benefits:

  • Unified incident timeline across both platforms
  • Automatic on-call assignment via PagerDuty escalation policies
  • Real-time webhook synchronization
  • Rich incident context with PagerDuty metadata
  • Reduced context switching for response teams

Features

  • Bi-directional Sync: Create, update, and sync incidents between both systems
  • Real-time Webhook Integration: Instant updates from PagerDuty events
  • Service Mapping: Map PagerDuty services to incident management services
  • On-call Integration: Automatic assignment based on escalation policies
  • Rich Incident Data: Full PagerDuty incident context including urgency and assignments
  • SLA Alignment: Synchronize SLA timers between platforms

Prerequisites

Before setting up the integration, make sure you have:

  1. PagerDuty Account with appropriate permissions
  2. API Key from Account Settings → API Access Keys (Admin role required)
  3. Service ID for the PagerDuty service where incidents will be created
  4. Valid Email Address for incident creation (must be a PagerDuty user)
  5. Webhook Access (if using real-time synchronization)

Required Permissions

  • Admin role for API key generation
  • Manager or Admin role for service configuration
  • Responder role minimum for incident management

Quick Setup

Step 1: Add PagerDuty Connector

Interactive Configuration

# Add connector with interactive prompts
./im connector add pagerduty prod-pagerduty

# You'll be prompted for:
# - API Key: Your PagerDuty API key
# - API Base URL: https://api.pagerduty.com (default)
# - Default Service ID: PagerDuty service ID (e.g., PXXXXXX)
# - From Email: Email of a valid PagerDuty user
# - Escalation Policy ID: Optional escalation policy
# - Sync Direction: bidirectional, outbound, or inbound

JSON Configuration

./im connector add pagerduty prod-pagerduty --config '{
  "base_url": "https://api.pagerduty.com",
  "api_key": "your-api-key-here",
  "sync_direction": "bidirectional",
  "sync_interval": "5m",
  "enable_webhook": true,
  "custom": {
    "service_id": "PXXXXXX",
    "from_email": "ops@company.com",
    "escalation_policy_id": "PXXXXXX"
  }
}'

Step 2: Test the Connection

./im connector test prod-pagerduty

# Output:
✓ Connected to PagerDuty API
✓ Service PXXXXXX accessible
✓ User ops@company.com valid
✓ Escalation policy PXXXXXX found
✓ Webhook endpoint configured

Step 3: Configure Webhooks (Optional)

# Enable real-time synchronization
./im connector webhook prod-pagerduty --enable

# Webhook URL will be: https://your-domain.com/connectors/prod-pagerduty/webhook

Configuration Parameters

Required Settings

Parameter           Description                    Example
api_key             PagerDuty API key              u+xxxxxxxxxxxxxxxxxxxxxx
custom.service_id   Default PagerDuty service ID   PXXXXXX
custom.from_email   Email for incident creation    ops@company.com

Optional Settings

Parameter                     Description                 Default
base_url                      PagerDuty API URL           https://api.pagerduty.com
sync_direction                Sync mode                   outbound
sync_interval                 Sync frequency              5m
enable_webhook                Real-time updates           true
custom.escalation_policy_id   Default escalation policy   -
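
The required and optional settings above can be checked before a connector is added. The following is a minimal sketch, not the connector's actual validation logic: `normalize_config` is a hypothetical helper, and the defaults and allowed sync modes are copied from the tables and prompts in this document.

```python
import copy

# Defaults taken from the Optional Settings table.
DEFAULTS = {
    "base_url": "https://api.pagerduty.com",
    "sync_direction": "outbound",
    "sync_interval": "5m",
    "enable_webhook": True,
}

# Sync modes listed in the interactive prompts.
SYNC_DIRECTIONS = {"bidirectional", "outbound", "inbound"}


def normalize_config(raw: dict) -> dict:
    """Hypothetical helper: fill in defaults and check required settings."""
    config = {**DEFAULTS, **copy.deepcopy(raw)}
    custom = config.get("custom") or {}

    missing = []
    if not config.get("api_key"):
        missing.append("api_key")
    for key in ("service_id", "from_email"):
        if not custom.get(key):
            missing.append(f"custom.{key}")
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    if config["sync_direction"] not in SYNC_DIRECTIONS:
        raise ValueError(f"unknown sync_direction: {config['sync_direction']!r}")

    config["custom"] = custom
    return config


config = normalize_config({
    "api_key": "your-api-key-here",
    "custom": {"service_id": "PXXXXXX", "from_email": "ops@company.com"},
})
print(config["sync_direction"])  # prints "outbound"
```

Running a check like this in CI catches a missing `custom.from_email` before the first sync attempt fails.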

Advanced Configuration

{
  "base_url": "https://api.pagerduty.com",
  "api_key": "your-api-key",
  "sync_direction": "bidirectional",
  "sync_interval": "2m",
  "enable_webhook": true,
  "custom": {
    "service_id": "PXXXXXX",
    "from_email": "ops@company.com",
    "escalation_policy_id": "PXXXXXX",
    "urgency_mapping": {
      "SEV-1": "high",
      "SEV-2": "high",
      "SEV-3": "low",
      "SEV-4": "low"
    },
    "service_mappings": {
      "api": "PAPI001",
      "db": "PDAT001",
      "web": "PWEB001"
    },
    "auto_resolve": true,
    "sync_notes": true
  }
}

Field Mappings

Status Synchronization

IM Status   PagerDuty Status   Description
open        triggered          New incident, not yet acknowledged
mitigated   acknowledged       Someone is working on the issue
resolved    resolved           Issue is fixed and confirmed
closed      resolved           Final closure (PagerDuty auto-resolves)
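
The status table can be expressed as a pair of lookup tables. This is an illustrative sketch only; the function names are hypothetical. Because both `resolved` and `closed` map to PagerDuty's `resolved`, the reverse direction is ambiguous, and the sketch assumes an inbound `resolved` maps back to `resolved`.

```python
# Outbound mapping, copied from the Status Synchronization table.
IM_TO_PD_STATUS = {
    "open": "triggered",
    "mitigated": "acknowledged",
    "resolved": "resolved",
    "closed": "resolved",
}

# Inbound mapping. Assumption: PagerDuty "resolved" becomes IM "resolved",
# never "closed", since closing is a separate step in the IM platform.
PD_TO_IM_STATUS = {
    "triggered": "open",
    "acknowledged": "mitigated",
    "resolved": "resolved",
}


def to_pagerduty_status(im_status: str) -> str:
    """Translate an IM status into the PagerDuty status it syncs to."""
    return IM_TO_PD_STATUS[im_status]


def to_im_status(pd_status: str) -> str:
    """Translate a PagerDuty status into an IM status."""
    return PD_TO_IM_STATUS[pd_status]


print(to_pagerduty_status("closed"))   # prints "resolved"
print(to_im_status("acknowledged"))    # prints "mitigated"
```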

Severity/Urgency Mapping

IM Severity   PagerDuty Urgency   Impact
SEV-1         high                Critical system impact
SEV-2         high                Major functionality impact
SEV-3         low                 Minor functionality impact
SEV-4         low                 Informational only

Additional Field Mappings

IM Field          PagerDuty Field   Notes
Title             Summary           Direct mapping
Description       Incident Body     Formatted as HTML
Service           Service ID        Via service mapping configuration
Assignee          Assignments       Based on escalation policy
Timeline Events   Notes             Synchronized as incident notes
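
To make the field mappings concrete, the sketch below builds (but does not send) a request for PagerDuty's REST API incident creation endpoint. It is an assumption about how the mappings could be applied, not the connector's implementation: `build_create_request` and the shape of the `incident` dict are hypothetical, and the description is passed through unchanged rather than converted to HTML.

```python
import json

# Default urgency mapping from the Severity/Urgency table.
URGENCY_BY_SEVERITY = {"SEV-1": "high", "SEV-2": "high", "SEV-3": "low", "SEV-4": "low"}


def build_create_request(incident: dict, config: dict) -> dict:
    """Hypothetical helper: translate an IM incident into a PagerDuty request."""
    custom = config["custom"]
    mappings = custom.get("service_mappings", {})
    # Fall back to the connector's default service when no mapping matches.
    service_id = mappings.get(incident.get("service"), custom["service_id"])

    pd_incident = {
        "type": "incident",
        "title": incident["title"],
        "urgency": URGENCY_BY_SEVERITY[incident["severity"]],
        "service": {"id": service_id, "type": "service_reference"},
    }
    if incident.get("description"):
        pd_incident["body"] = {"type": "incident_body", "details": incident["description"]}
    if custom.get("escalation_policy_id"):
        pd_incident["escalation_policy"] = {
            "id": custom["escalation_policy_id"],
            "type": "escalation_policy_reference",
        }

    return {
        "method": "POST",
        "url": config["base_url"].rstrip("/") + "/incidents",
        "headers": {
            "Authorization": f"Token token={config['api_key']}",
            "Accept": "application/vnd.pagerduty+json;version=2",
            "Content-Type": "application/json",
            "From": custom["from_email"],  # must be a valid PagerDuty user
        },
        "body": json.dumps({"incident": pd_incident}),
    }


request = build_create_request(
    {"title": "Database performance issue", "severity": "SEV-2", "service": "db"},
    {
        "base_url": "https://api.pagerduty.com",
        "api_key": "your-api-key",
        "custom": {
            "service_id": "PXXXXXX",
            "from_email": "ops@company.com",
            "service_mappings": {"db": "PDAT001"},
        },
    },
)
print(request["url"])  # prints "https://api.pagerduty.com/incidents"
```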

Usage Examples

Basic Operations

List Connectors

./im connector list

# Output:
Name            Type        Status    Direction     Last Sync
prod-pagerduty  pagerduty   active    bidirectional 2m ago

Sync Incidents Manually

# Force immediate synchronization
./im connector sync prod-pagerduty

# Sync specific incident
./im connector sync prod-pagerduty --incident INC-12345

View Connector Status

./im connector status prod-pagerduty

# Output:
Connector: prod-pagerduty (pagerduty)
Status: Active
Direction: Bidirectional
Last Sync: 2025-08-23 15:30:00
Incidents Synced: 45 (last 24h)
Sync Errors: 0
Webhook: Enabled

Incident Workflows

Creating Incidents

When you create an incident in the platform, it automatically creates a corresponding incident in PagerDuty:

# Create incident (syncs to PagerDuty)
./im declare --sev SEV-2 --title "Database performance issue" --service db

# PagerDuty incident is created with:
# - Summary: "Database performance issue"
# - Urgency: high (SEV-2 mapping)
# - Service: PDAT001 (db service mapping)
# - Escalation: Uses configured escalation policy

Acknowledging Incidents

# Acknowledge in IM platform
./im ack --incident INC-12345 --note "Investigating database locks"

# Automatically updates PagerDuty:
# - Status: acknowledged
# - Note: "Investigating database locks"
# - Assignee: Based on who acknowledged

Resolving Incidents

# Resolve incident
./im resolve --incident INC-12345 --note "Database locks cleared after index optimization"

# PagerDuty incident updated:
# - Status: resolved
# - Resolution note added
# - Timeline synchronized

Webhook Configuration

Enable Real-time Synchronization

  1. Configure Webhook in PagerDuty:

    • Go to Integrations → Generic Webhooks (v3)
    • Add webhook URL: https://your-domain.com/connectors/prod-pagerduty/webhook
    • Select events: incident.triggered, incident.acknowledged, incident.resolved
  2. Test Webhook:

    ./im connector webhook prod-pagerduty --test
    
    # Triggers test webhook event
    # Check logs for successful reception

Supported Webhook Events

PagerDuty Event         Action                        IM Platform Response
incident.triggered      New incident created          Create corresponding incident
incident.acknowledged   Someone took ownership        Update status to mitigated
incident.escalated      Escalation policy triggered   Update assignee and add note
incident.resolved       Incident resolved             Update status to resolved
incident.reopened       Incident reopened             Reopen incident with note
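
The event table amounts to a dispatch on the event type. The sketch below is illustrative only: the action names are hypothetical labels, and it assumes the v3 webhook envelope nests the payload under an `event` key carrying `event_type` and `data.id`.

```python
import json
from typing import Optional, Tuple

# Event type -> hypothetical IM-side action, following the table above.
EVENT_ACTIONS = {
    "incident.triggered": "create_incident",
    "incident.acknowledged": "set_status_mitigated",
    "incident.escalated": "update_assignee_and_add_note",
    "incident.resolved": "set_status_resolved",
    "incident.reopened": "reopen_incident_with_note",
}


def route_event(raw_body: bytes) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action, PagerDuty incident ID), or None for unsupported events."""
    envelope = json.loads(raw_body)
    event = envelope.get("event", {})
    action = EVENT_ACTIONS.get(event.get("event_type"))
    if action is None:
        return None  # ignore event types the connector does not handle
    return action, event.get("data", {}).get("id")


sample = json.dumps({
    "event": {
        "event_type": "incident.acknowledged",
        "data": {"id": "PINC123", "type": "incident"},
    }
}).encode()
print(route_event(sample))  # prints ('set_status_mitigated', 'PINC123')
```

Returning `None` for unknown event types keeps the endpoint tolerant of events that were subscribed in PagerDuty but are not listed in the table.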

Webhook Security

  • IP Whitelisting: Configure firewall rules for PagerDuty IPs
  • Signature Validation: Webhook payload signature verification
  • HTTPS Only: Secure webhook endpoints required
  • Rate Limiting: Webhook requests are rate-limited
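
Signature validation can be sketched as follows. This assumes PagerDuty's v3 webhook scheme, in which the `X-PagerDuty-Signature` header carries one or more comma-separated `v1=<hex>` values, each an HMAC-SHA256 of the raw request body keyed with the webhook's signing secret. `verify_signature` is a hypothetical helper, not part of the `im` CLI.

```python
import hashlib
import hmac


def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Check the raw body against every signature in the header value."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    expected = ("v1=" + digest).encode()
    candidates = [part.strip().encode() for part in signature_header.split(",")]
    # compare_digest runs in constant time to avoid leaking match position.
    return any(hmac.compare_digest(expected, candidate) for candidate in candidates)


secret = "example-signing-secret"
body = b'{"event": {"event_type": "incident.triggered"}}'
header = "v1=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, header))         # prints True
print(verify_signature(secret, body + b" ", header))  # prints False
```

The check must run on the raw bytes as received; re-serializing parsed JSON can change whitespace and invalidate the signature.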

Service Mapping

Configure Service Mappings

Map incident management services to specific PagerDuty services:

# Add service mapping
./im connector map-service prod-pagerduty --im-service api --pd-service PAPI001

# View current mappings
./im connector list-mappings prod-pagerduty

# Output:
IM Service    PagerDuty Service  Service Name
api           PAPI001           API Service
database      PDAT001           Database Service
web           PWEB001           Web Service
default       PGEN001           General Service

Service Mapping Configuration

{
  "service_mappings": {
    "api": "PAPI001",
    "database": "PDAT001",
    "web": "PWEB001",
    "payments": "PPAY001",
    "default": "PGEN001"
  }
}
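
A lookup over this configuration might work as sketched below. The fallback to the `default` entry is an assumption based on its presence in the mapping; `resolve_pd_service` is a hypothetical name.

```python
def resolve_pd_service(im_service: str, service_mappings: dict) -> str:
    """Return the PagerDuty service ID for an IM service name."""
    if im_service in service_mappings:
        return service_mappings[im_service]
    # Assumption: unmapped services route to the "default" entry if present.
    if "default" in service_mappings:
        return service_mappings["default"]
    raise KeyError(f"no PagerDuty service mapped for {im_service!r}")


service_mappings = {
    "api": "PAPI001",
    "database": "PDAT001",
    "web": "PWEB001",
    "payments": "PPAY001",
    "default": "PGEN001",
}

print(resolve_pd_service("payments", service_mappings))  # prints "PPAY001"
print(resolve_pd_service("search", service_mappings))    # prints "PGEN001"
```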

Dynamic Service Mapping

Services can be mapped dynamically based on incident attributes:

# Declare incident with specific service
./im declare --sev SEV-1 --title "API Gateway Down" --service api

# Automatically creates PagerDuty incident in service PAPI001
# Uses escalation policy configured for API service
# Notifies appropriate on-call engineers

On-Call Integration

Escalation Policy Integration

The connector integrates with PagerDuty escalation policies to ensure appropriate team members are notified:

# Configure default escalation policy
./im connector config prod-pagerduty --escalation-policy PXXXXXX

# Service-specific escalation policies
./im connector map-escalation prod-pagerduty \
  --service api --escalation-policy PAPI_ESCALATION

# View escalation mappings
./im connector list-escalations prod-pagerduty

Automatic Assignment

When incidents are created or acknowledged:

  1. PagerDuty determines who should be notified based on escalation policy
  2. On-call engineer is automatically assigned in both systems
  3. Assignment changes are synchronized bidirectionally
  4. Escalation timeline is reflected in incident timeline

On-Call Schedule Sync

# Check current on-call schedule
./im oncall --service api

# Output:
Service: api
Current On-Call: john.doe@company.com (until 2025-08-24 09:00)
Next On-Call: jane.smith@company.com (2025-08-24 09:00)
Escalation: After 15 minutes → team-lead@company.com

Advanced Features

Multi-Service Incidents

Handle incidents affecting multiple services:

# Create incident affecting multiple services
./im declare --sev SEV-1 --title "Database cluster failure" \
  --service database,api,payments

# Creates separate PagerDuty incidents for each service
# Each incident routes to appropriate escalation policy
# Incidents are linked with correlation ID
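
The fan-out described above could look like the following sketch. The record layout and the use of a UUID as the correlation ID are assumptions made for illustration; `fan_out` is a hypothetical name.

```python
import uuid


def fan_out(title: str, severity: str, services_arg: str, service_mappings: dict) -> list:
    """Split a comma-separated --service value into one record per service."""
    correlation_id = str(uuid.uuid4())  # shared by every incident in the group
    services = [name.strip() for name in services_arg.split(",") if name.strip()]
    return [
        {
            "im_service": name,
            "pd_service_id": service_mappings[name],
            "title": title,
            "severity": severity,
            "correlation_id": correlation_id,
        }
        for name in services
    ]


records = fan_out(
    "Database cluster failure",
    "SEV-1",
    "database,api,payments",
    {"database": "PDAT001", "api": "PAPI001", "payments": "PPAY001"},
)
for record in records:
    print(record["im_service"], record["pd_service_id"])
```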

Custom Field Synchronization

Sync custom fields between platforms:

{
  "custom_field_mappings": {
    "im_field": "pd_custom_field",
    "environment": "custom_details.environment",
    "customer_impact": "custom_details.customer_impact",
    "root_cause": "custom_details.root_cause"
  }
}
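
The dotted paths on the right-hand side suggest nested fields on the PagerDuty side. A minimal sketch of applying such a mapping, assuming each dot introduces one level of nesting and that IM fields without a value are skipped (`apply_field_mappings` is a hypothetical helper):

```python
def apply_field_mappings(im_fields: dict, mappings: dict) -> dict:
    """Build a nested dict from IM fields using dotted target paths."""
    result = {}
    for im_name, pd_path in mappings.items():
        if im_name not in im_fields:
            continue  # nothing to sync for this field
        *parents, leaf = pd_path.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = im_fields[im_name]
    return result


mapped = apply_field_mappings(
    {"environment": "production", "root_cause": "index lock contention"},
    {
        "environment": "custom_details.environment",
        "customer_impact": "custom_details.customer_impact",
        "root_cause": "custom_details.root_cause",
    },
)
print(mapped)
```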

SLA Alignment

Synchronize SLA timers between platforms:

# Configure SLA mapping
./im connector config prod-pagerduty --sla-mapping '{
  "SEV-1": {"ack": "15m", "resolve": "1h"},
  "SEV-2": {"ack": "30m", "resolve": "4h"},
  "SEV-3": {"ack": "2h", "resolve": "24h"},
  "SEV-4": {"ack": "8h", "resolve": "72h"}
}'

# SLA breaches trigger notifications in both systems
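
The SLA mapping translates into concrete deadlines once the duration strings are parsed. The sketch below assumes durations use only the `m` (minutes) and `h` (hours) suffixes seen in the example; `parse_duration` and `sla_deadlines` are hypothetical helpers.

```python
import re
from datetime import datetime, timedelta

UNIT_KEYWORDS = {"m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "15m" or "4h" into a timedelta."""
    match = re.fullmatch(r"(\d+)([mh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{UNIT_KEYWORDS[unit]: amount})


def sla_deadlines(severity: str, declared_at: datetime, sla_mapping: dict) -> dict:
    """Return the acknowledge and resolve deadlines for an incident."""
    targets = sla_mapping[severity]
    return {name: declared_at + parse_duration(value) for name, value in targets.items()}


sla_mapping = {
    "SEV-1": {"ack": "15m", "resolve": "1h"},
    "SEV-2": {"ack": "30m", "resolve": "4h"},
    "SEV-3": {"ack": "2h", "resolve": "24h"},
    "SEV-4": {"ack": "8h", "resolve": "72h"},
}

deadlines = sla_deadlines("SEV-1", datetime(2025, 8, 23, 14, 30), sla_mapping)
print(deadlines["ack"])      # prints 2025-08-23 14:45:00
print(deadlines["resolve"])  # prints 2025-08-23 15:30:00
```

An incident is in breach when the current time passes a deadline before the matching status change has been recorded.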

Monitoring & Troubleshooting

Health Checks

# Check connector health
./im connector health prod-pagerduty

# Output:
Health: Healthy
API Connectivity: ✓ Connected (45ms response time)
Authentication: ✓ Valid API key
Service Access: ✓ PXXXXXX accessible
Webhook: ✓ Receiving events (last: 2m ago)
Sync Status: ✓ Up to date (last sync: 30s ago)
Error Rate: 0% (0 errors in 24h)

Common Issues

API Authentication Errors

Symptoms: 401 Unauthorized responses

Solutions:

# Verify API key is correct and has admin permissions
curl -H "Authorization: Token token=YOUR_API_KEY" \
     -H "Accept: application/vnd.pagerduty+json;version=2" \
     https://api.pagerduty.com/users

# Regenerate API key if needed
# Update connector configuration with new key
./im connector update prod-pagerduty --api-key new-api-key

Service Not Found Errors

Symptoms: 404 errors when creating incidents

Solutions:

# List available services
curl -H "Authorization: Token token=YOUR_API_KEY" \
     -H "Accept: application/vnd.pagerduty+json;version=2" \
     https://api.pagerduty.com/services

# Update service ID in configuration
./im connector config prod-pagerduty --service-id CORRECT_SERVICE_ID

Webhook Delivery Issues

Symptoms: Delayed or missing real-time updates

Solutions:

# Check webhook configuration in PagerDuty
# Verify webhook URL is accessible
curl -X POST https://your-domain.com/connectors/prod-pagerduty/webhook \
     -H "Content-Type: application/json" \
     -d '{"test": "ping"}'

# Check webhook logs
./im logs --component webhook --filter pagerduty

Debug Mode

Enable detailed logging for troubleshooting:

# Enable debug logging
export PAGERDUTY_DEBUG=true
./im serve

# Or configure in settings
{
  "connectors": {
    "pagerduty": {
      "debug": true,
      "log_requests": true,
      "log_responses": true
    }
  }
}

Metrics & Analytics

# View connector metrics
./im metrics connector prod-pagerduty

# Output:
Connector: prod-pagerduty
Total Incidents: 156
Sync Success Rate: 99.4%
Average Sync Time: 1.2s
API Calls (24h): 1,247
Rate Limit Usage: 23%
Webhook Events (24h): 89

Best Practices

Configuration

  • Use dedicated API keys for each environment (dev, staging, prod)
  • Configure separate services for different applications/teams
  • Set appropriate escalation policies based on severity
  • Enable webhooks for real-time synchronization

Incident Management

  • Use descriptive titles that make sense in both platforms
  • Include relevant context in incident descriptions
  • Leverage service mapping for proper routing
  • Follow consistent severity guidelines

Monitoring

  • Monitor sync performance and error rates
  • Set up alerts for connector health issues
  • Review webhook delivery regularly
  • Track SLA alignment between platforms

Security

  • Rotate API keys regularly (quarterly recommended)
  • Use HTTPS only for all webhook communications
  • Configure IP whitelisting for webhook endpoints
  • Monitor API usage for unusual patterns

Integration Examples

Automated Response Workflows

# Incident declared in IM platform
./im declare --sev SEV-1 --title "Payment API Down" --service payments

# Automatically triggers:
# 1. PagerDuty incident creation in payments service
# 2. On-call engineer notification via PagerDuty
# 3. Slack/Teams notification via ChatOps
# 4. War room creation for SEV-1
# 5. Management notification for critical incident

Cross-Platform Coordination

# Incident timeline shows unified view:
# 14:30 - Incident declared (IM Platform)
# 14:31 - PagerDuty incident created (PINC-123)
# 14:32 - John Doe paged (PagerDuty)
# 14:33 - Incident acknowledged by John (both platforms)
# 14:45 - Status update added (IM Platform)
# 14:45 - Note synchronized to PagerDuty
# 15:15 - Incident resolved (both platforms)

Compliance & Reporting

# Generate unified incident reports
./im report --start 2025-08-01 --end 2025-08-31 \
  --include-pagerduty --format pdf

# Includes data from both platforms:
# - Incident timelines with PagerDuty correlation
# - Response times from both systems
# - SLA compliance across platforms
# - On-call effectiveness metrics

With the connector configured, incidents, assignments, and timelines stay consistent between the IM platform and PagerDuty, and each platform keeps its own strengths.