escalation Package

Overview

Package escalation provides automated incident escalation based on SLA breaches and configurable rules.

The package monitors incidents for SLA breaches, evaluates configurable escalation rules against them, and executes the resulting escalation actions so that incidents are resolved on time and remain visible to management. It integrates with the SLA monitoring, push notification, and timeline services.

Key Features:

Automated SLA breach monitoring with configurable thresholds and time windows
Rule-based escalation with flexible conditions and multiple action types
Multiple escalation actions: notifications, assignments, severity increases, and paging
Comprehensive escalation history tracking and audit trails
Integration with push notification system for immediate alert delivery
Timeline integration for escalation event logging and incident context
Real-time escalation monitoring with configurable check intervals

Architecture:

The escalation system follows a monitoring and action-based architecture:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ SLA Monitoring  │───►│ Escalation       │───►│ Action          │
│ (Threshold)     │    │ Rules Engine     │    │ Execution       │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                        │                        │
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Active          │    │ Rule             │    │ Notifications   │
│ Incidents       │    │ Evaluation       │    │ & Timeline      │
└─────────────────┘    └──────────────────┘    └─────────────────┘

Escalation Types:

Time-based: Escalation triggered by SLA time thresholds (acknowledge, resolve)
Threshold-based: Escalation at configurable percentages of SLA time limits
Rule-based: Complex escalation conditions with multiple criteria and actions
Severity-based: Different escalation paths based on incident severity levels

Escalation Actions:

severity_increase: Automatically increase incident severity to draw attention
notify: Send push notifications to specific users, teams, or all responders
assign: Reassign incidents to different users or escalation teams
page: Trigger paging systems for critical escalations requiring immediate attention

Rule Configuration: Escalation rules support flexible configuration including:

SLA type targeting (acknowledge or resolve SLAs)
Percentage thresholds for escalation timing (e.g., 80% of SLA time)
Multiple escalation actions per rule with different targets
Rule activation/deactivation for dynamic escalation management
Historical tracking to prevent duplicate escalations
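One way to read a percentage threshold is as the share of the SLA window that has already elapsed. The sketch below assumes that reading; thresholdReached is a hypothetical helper written for illustration, not a function exported by this package, and the 15-minute SLA is an assumed value.

```go
package main

import (
	"fmt"
	"time"
)

// thresholdReached reports whether the elapsed share of the SLA window has
// reached thresholdPercent (for example 80 for "80% of SLA time").
// Hypothetical helper for illustration; not the package's implementation.
func thresholdReached(elapsed, sla time.Duration, thresholdPercent float64) bool {
	if sla <= 0 {
		return false // no usable SLA window, so never escalate on it
	}
	used := float64(elapsed) / float64(sla) * 100
	return used >= thresholdPercent
}

func main() {
	sla := 15 * time.Minute // assumed acknowledge SLA
	for _, elapsed := range []time.Duration{10 * time.Minute, 13 * time.Minute} {
		fmt.Printf("elapsed=%v escalate=%v\n", elapsed, thresholdReached(elapsed, sla, 80))
	}
}
```

With an 80% threshold, 10 of 15 minutes is below the threshold and 13 of 15 minutes is above it.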

Example usage:

// Create escalation service with dependencies
escalationService := escalation.NewService(db, slaService, timelineService, pushService)

// Start escalation monitoring with 5-minute check interval
err := escalationService.Start(5 * time.Minute)
if err != nil {
	log.Fatal(err)
}

// Add escalation rule for SEV-1 incidents
rule := &escalation.EscalationRule{
	Name:             "SEV-1 Acknowledge Escalation",
	Description:      "Escalate SEV-1 incidents at 80% of acknowledge SLA",
	SLAType:          "acknowledge",
	ThresholdPercent: 80.0,
	EscalationActions: []escalation.EscalationAction{
		{
			Type:   "severity_increase",
			Target: "",
		},
		{
			Type:   "notify",
			Target: "incident-manager",
			Parameters: map[string]interface{}{
				"priority": "urgent",
				"message":  "SEV-1 incident requires immediate attention",
			},
		},
		{
			Type:   "assign",
			Target: "escalation-team",
		},
	},
	Active: true,
}

err = escalationService.AddRule(rule)
if err != nil {
	log.Fatal(err)
}

// Get escalation history for an incident
history, err := escalationService.GetEscalationHistory("INC-123")
if err != nil {
	log.Fatal(err)
}

for _, event := range history {
	fmt.Printf("Escalation: %s - %s (%s)\n",
		event.RuleName, event.Reason, event.ExecutedAt)
}

Integration Points: The escalation service integrates with multiple platform components:

SLA Service: Monitors SLA status and breach conditions for escalation triggers
Push Notification Service: Delivers escalation alerts to mobile devices and teams
Timeline Service: Records all escalation events in incident timelines for audit trails
Database: Stores escalation rules, history, and configuration for persistence
Incident Management: Updates incident severity and assignments based on escalation actions

Monitoring and Analytics: The escalation system provides comprehensive operational visibility:

Escalation frequency and effectiveness tracking across incident types
Rule performance analysis to optimize escalation thresholds and actions
SLA breach correlation to identify systemic issues and improvement opportunities
User and team escalation load balancing for optimal incident distribution
Historical trending for escalation policy refinement and process improvement

Performance and Reliability: The escalation service is designed for reliable, high-performance operation:

Efficient incident polling with configurable check intervals
Rule evaluation optimization to minimize database load
Concurrent-safe rule management with read-write locking
Graceful error handling to prevent escalation system failures
Comprehensive logging and monitoring for operational visibility

Import Path: github.com/systmms/incidents/internal/escalation

Types

CentrifugePublisher

CentrifugePublisher is the interface for publishing real-time escalation events.

EscalationAction

EscalationAction defines specific actions to execute during incident escalation.

This structure specifies individual escalation actions including action type, target specification, and action-specific parameters for flexible escalation behavior. Multiple actions can be configured per rule to provide comprehensive escalation response including notifications, assignments, and system updates.

Action Types:

"notify": Send push notifications or alerts to specified targets
"assign": Reassign incident to different users, teams, or escalation groups
"page": Trigger paging systems for urgent escalations requiring immediate attention
"severity_increase": Automatically increase incident severity level

Target Specification: Targets vary by action type and support flexible addressing:

User IDs for individual notifications or assignments
Team names for group notifications and team assignments
"all" for broadcast notifications to all responders
Service endpoints for paging system integration
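The action types and target rules above can be sketched as a small validator. The Action struct is a local stand-in that mirrors the fields used in the package example (Type, Target, Parameters), and checkAction is hypothetical. It assumes that every action type except "severity_increase" needs a non-empty target, which matches the example's empty Target for that action but is not confirmed by the package source.

```go
package main

import (
	"errors"
	"fmt"
)

// Action mirrors the fields used in the package example. It is a local
// stand-in, not the package's EscalationAction type.
type Action struct {
	Type       string
	Target     string
	Parameters map[string]interface{}
}

// checkAction is a hypothetical validator. Assumption: only
// "severity_increase" may omit its target.
func checkAction(a Action) error {
	switch a.Type {
	case "severity_increase":
		return nil
	case "notify", "assign", "page":
		if a.Target == "" {
			return fmt.Errorf("action %q requires a target", a.Type)
		}
		return nil
	default:
		return errors.New("unknown action type: " + a.Type)
	}
}

func main() {
	actions := []Action{
		{Type: "severity_increase"},
		{Type: "notify", Target: "all"},
		{Type: "assign"},                 // rejected: missing target
		{Type: "reboot", Target: "db-1"}, // rejected: unknown type
	}
	for _, a := range actions {
		fmt.Printf("%-18s -> %v\n", a.Type, checkAction(a))
	}
}
```

Rejecting unknown types at rule creation keeps a typo in a rule from surfacing only when an incident is already breaching its SLA.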

EscalationEvent

EscalationEvent represents a completed escalation execution with comprehensive context.

This structure captures the complete record of an escalation event including the triggering rule, execution timestamp, escalation reasoning, and all executed actions. It provides comprehensive audit trails and historical tracking for escalation analysis, policy optimization, and compliance documentation.

Event Tracking: Events are automatically created for every escalation execution:

Rule identification and escalation reasoning
Complete action list with execution details
Precise execution timestamps for timing analysis
Incident association for historical correlation

Analytics and Reporting: Events support comprehensive escalation analytics:

Escalation frequency and effectiveness tracking
Rule performance analysis and optimization insights
Time-to-escalation metrics for SLA policy tuning
User and team escalation load distribution analysis

EscalationRule

EscalationRule defines comprehensive escalation conditions and actions for automated incident management.

This structure encapsulates the complete specification for when and how incidents should be escalated, including SLA thresholds, escalation timing, and multiple action types. Rules provide flexible, configurable escalation policies that can be tailored to different incident types, severity levels, and organizational response requirements.

Rule Evaluation: Rules are continuously evaluated against active incidents to determine escalation needs:

SLA threshold monitoring based on configurable percentage of SLA time
Historical escalation tracking to prevent duplicate escalations
Active/inactive rule management for dynamic escalation policy control
Rule priority and ordering for complex escalation scenarios
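The evaluation criteria above (active flag, percentage threshold, and duplicate prevention) can be combined in one check. This is a minimal sketch under stated assumptions: History is an in-memory stand-in for the persisted escalation history, and keying duplicates by incident ID plus rule name is an assumption, not documented package behavior.

```go
package main

import "fmt"

// firedKey identifies one (incident, rule) pair.
type firedKey struct {
	IncidentID string
	RuleName   string
}

// History remembers which rules already fired for which incidents.
// Hypothetical in-memory stand-in for the stored escalation history.
type History struct {
	fired map[firedKey]bool
}

func NewHistory() *History {
	return &History{fired: make(map[firedKey]bool)}
}

// ShouldEscalate returns true only the first time an active rule reaches its
// threshold for a given incident, and records that it fired.
func (h *History) ShouldEscalate(incidentID, ruleName string, active bool, usedPercent, thresholdPercent float64) bool {
	if !active || usedPercent < thresholdPercent {
		return false
	}
	key := firedKey{IncidentID: incidentID, RuleName: ruleName}
	if h.fired[key] {
		return false // already escalated by this rule
	}
	h.fired[key] = true
	return true
}

func main() {
	h := NewHistory()
	// Two consecutive checks of the same incident, both past the threshold.
	fmt.Println(h.ShouldEscalate("INC-123", "sev1-ack", true, 85, 80))
	fmt.Println(h.ShouldEscalate("INC-123", "sev1-ack", true, 90, 80))
}
```

The first check escalates and the second does not, so a rule fires at most once per incident even though the monitor re-evaluates it on every interval.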

Action Execution: When escalation conditions are met, rules execute multiple actions simultaneously:

Notification delivery to specified users, teams, or all responders
Incident reassignment to escalation teams or managers
Severity increases to draw attention and prioritize response
Paging system integration for critical escalations requiring immediate response

Methods

CreateEscalationRuleFromTemplate

CreateEscalationRuleFromTemplate creates a customizable rule from a template

GetDefaultRules

GetDefaultRules returns a set of common escalation rules

PushNotifier

PushNotifier is the interface for sending push notifications during escalation actions.

This interface abstracts the push notification system for escalation delivery, enabling flexible notification backends while providing consistent escalation alert functionality. It supports both targeted user notifications and broadcast alerts for comprehensive escalation communication.

RealtimeEscalationService

RealtimeEscalationService extends the base escalation service with real-time capabilities

Methods

NewRealtimeEscalationService

NewRealtimeEscalationService creates a new real-time escalation service

Service

Service handles automated incident escalation based on SLA breaches and configurable rules.

The Service provides comprehensive escalation management including real-time SLA monitoring, rule-based escalation execution, and integration with notification systems. It continuously monitors active incidents against configured escalation rules and executes appropriate actions when escalation conditions are met.

Core Responsibilities:

Continuous monitoring of active incidents for SLA breach conditions
Rule-based escalation evaluation with configurable thresholds and conditions
Multi-action escalation execution including notifications, assignments, and system updates
Comprehensive escalation history tracking and audit trail maintenance
Integration with push notification system for immediate alert delivery
Timeline integration for escalation event logging and incident context

Service Architecture: The service operates as a background monitoring system with:

Configurable check intervals for incident SLA evaluation
Thread-safe rule management with concurrent read/write access
Integrated notification delivery through push notification service
Comprehensive logging and error handling for operational visibility
Database persistence for rule storage and escalation history

Concurrency and Safety: The service is designed for safe concurrent operation:

Read-write mutex protection for rule management operations
Thread-safe incident evaluation and escalation execution
Graceful shutdown handling with proper resource cleanup
Isolated error handling to prevent cascade failures
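Read-write mutex protection for a rule map typically looks like the sketch below. RuleStore and Rule are hypothetical names chosen for illustration; the Service's actual fields and methods are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// Rule is a reduced stand-in for an escalation rule.
type Rule struct {
	Name   string
	Active bool
}

// RuleStore guards a rule map with a read-write mutex: writers take the
// exclusive lock, readers share the read lock.
type RuleStore struct {
	mu    sync.RWMutex
	rules map[string]Rule
}

func NewRuleStore() *RuleStore {
	return &RuleStore{rules: make(map[string]Rule)}
}

// Add inserts or replaces a rule under the write lock.
func (s *RuleStore) Add(r Rule) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rules[r.Name] = r
}

// ActiveCount counts active rules under the read lock.
func (s *RuleStore) ActiveCount() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	n := 0
	for _, r := range s.rules {
		if r.Active {
			n++
		}
	}
	return n
}

func main() {
	store := NewRuleStore()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			store.Add(Rule{Name: fmt.Sprintf("rule-%d", i), Active: i%2 == 0})
			_ = store.ActiveCount() // concurrent read while others write
		}(i)
	}
	wg.Wait()
	fmt.Println("active rules:", store.ActiveCount())
}
```

A read-write lock suits this workload because rule evaluation reads the map on every check interval, while rule changes are comparatively rare.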

Methods

NewService

NewService creates a new escalation service with comprehensive dependency integration.

This constructor initializes the escalation service with all required dependencies for SLA monitoring, escalation rule management, notification delivery, and audit logging. The service is immediately ready for rule configuration and monitoring startup.

Dependencies:

db: Database connection for rule storage and escalation history persistence
slaService: SLA monitoring service for breach detection and threshold evaluation
timelineService: Timeline service for escalation event logging and audit trails
pushService: Push notification service for escalation alert delivery

Initialization: The constructor performs comprehensive service initialization:

Empty rule map creation with thread-safe access control
Stop channel creation for graceful service shutdown
Structured logger initialization with escalation service identification
Service state preparation for rule loading and monitoring startup

Service Lifecycle: After creation, the service follows this lifecycle:

Rule loading from database with loadRules()
Monitoring startup with Start() and configured check interval
Continuous operation with automatic escalation evaluation
Graceful shutdown with Stop() and resource cleanup

Returns a fully initialized Service ready for rule configuration and monitoring.
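The Start/Stop lifecycle with a check interval and a stop channel can be sketched with time.Ticker. Monitor and StartMonitor are hypothetical names used for illustration only; the Service's own Start and Stop may differ in signature and behavior.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Monitor runs a check function on a fixed interval until stopped.
// Hypothetical stand-in for the service lifecycle, not the package's Service.
type Monitor struct {
	stop chan struct{}
	done chan struct{}
	once sync.Once
}

// StartMonitor launches the background loop and returns immediately.
// The interval must be positive because time.NewTicker panics otherwise.
func StartMonitor(interval time.Duration, check func()) *Monitor {
	m := &Monitor{stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(m.done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				check()
			case <-m.stop:
				return
			}
		}
	}()
	return m
}

// Stop signals the loop to exit and waits for it to finish.
// Calling it more than once is safe.
func (m *Monitor) Stop() {
	m.once.Do(func() { close(m.stop) })
	<-m.done
}

func main() {
	checks := 0
	m := StartMonitor(10*time.Millisecond, func() { checks++ })
	time.Sleep(55 * time.Millisecond)
	m.Stop() // after Stop returns, the loop no longer touches checks
	fmt.Println("checks run:", checks)
}
```

Waiting on the done channel in Stop means the caller knows no check is still in flight once Stop returns, which is the property a graceful shutdown needs.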

Functions

ValidateEscalationRule

ValidateEscalationRule validates that an escalation rule is properly configured

calculateUrgency

getSuggestedSeverity

getWarningLevel

validateEscalationAction

validateEscalationAction validates a single escalation action

Generated automatically from Go source code. Last updated: 2025-08-25T07:51:05-04:00
