sla Package

Overview

Package sla provides Service Level Agreement (SLA) monitoring and enforcement for the incident management platform.

The package tracks response times, resolution targets, and compliance metrics so that service level commitments are met. It provides real-time monitoring, automated alerting, and reporting across all incident severity levels.

Key Features:

  • Automatic SLA initialization for new incidents based on severity
  • Real-time acknowledgment and resolution time tracking
  • Configurable SLA thresholds per severity level and service
  • Proactive deadline monitoring with approaching deadline alerts
  • Comprehensive SLA breach detection and reporting
  • Statistical analysis and compliance rate calculations
  • Integration with timeline service for accurate time tracking
  • Automated SLA event generation for audit trails

Architecture:

The SLA system follows a time-based monitoring approach with proactive alerting:

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ Incident Events │───►│ SLA Tracking     │───►│ Compliance      │
│ (Timeline)      │    │ (Thresholds)     │    │ (Metrics)       │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                      
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ SLA Events      │    │ Deadline         │    │ Reporting       │
│ (Audit Trail)   │    │ Monitoring       │    │ (Analytics)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘

SLA Categories:

  • Acknowledgment SLA: Time to first response/acknowledgment
  • Resolution SLA: Time to incident resolution/closure
  • Custom SLA: Service-specific or severity-specific targets

Severity-Based Thresholds:

  • SEV-1 (Critical): 5 min acknowledge, 1 hour resolve
  • SEV-2 (High): 15 min acknowledge, 4 hours resolve
  • SEV-3 (Medium): 30 min acknowledge, 24 hours resolve
  • SEV-4 (Low): 1 hour acknowledge, 72 hours resolve

Example usage:

// Create SLA service (db and timelineService are assumed to be initialized elsewhere)
ctx := context.Background()
slaService := sla.NewService(db, timelineService)

// Initialize SLA tracking for new incident
incident := &models.Incident{
	ID:        "INC-123",
	Severity:  models.SeveritySEV1,
	CreatedAt: time.Now(),
}

status, err := slaService.InitializeSLA(ctx, incident)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("SLA initialized - Ack deadline: %v, Resolve deadline: %v\n",
	status.AckDeadline, status.ResolveDeadline)

// Update SLA when incident is acknowledged
err = slaService.UpdateAcknowledgment(ctx, "INC-123", time.Now())
if err != nil {
	log.Fatal(err)
}

// Get SLA metrics
since := time.Now().AddDate(0, 0, -30) // Last 30 days
metrics, err := slaService.GetMetrics(ctx, since)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("Acknowledgment SLA compliance: %.2f%%\n", metrics.AckComplianceRate*100)

// Check for SLA breaches and approaching deadlines
err = slaService.CheckSLAs(ctx)
if err != nil {
	log.Fatal(err)
}

SLA States:

  • on_track: All SLAs are within acceptable timeframes
  • at_risk: SLA deadlines are approaching (within 20% of threshold)
  • breached: One or more SLA deadlines have been exceeded

Automated Monitoring:

The service provides automated SLA monitoring that:

  • Tracks acknowledgment and resolution deadlines continuously
  • Generates proactive alerts when deadlines are approaching
  • Records SLA breach events with detailed context
  • Updates incident status based on SLA performance
  • Integrates with timeline service for comprehensive audit trails

Compliance Reporting:

The system generates SLA compliance reports including:

  • Overall compliance rates across all incidents
  • Severity-specific compliance breakdown
  • Team performance metrics and trends
  • Historical compliance tracking for trend analysis
  • Detailed breach analysis with root cause context
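
The overall and severity-specific rates listed above reduce to simple ratios. The sketch below shows one way to compute them. The Record and Rates types and the Summarize function are hypothetical; only the field name AckComplianceRate, expressed as a fraction between 0 and 1, is taken from the example usage.

```go
package main

import "fmt"

// Record is a hypothetical per-incident summary of whether each SLA was met.
type Record struct {
	Severity   string
	AckMet     bool
	ResolveMet bool
}

// Rates holds compliance rates as fractions in [0, 1].
type Rates struct {
	Total                 int
	AckComplianceRate     float64
	ResolveComplianceRate float64
}

type tally struct{ total, ackMet, resolveMet int }

// rates converts counts to fractions; an empty tally yields zero rates.
func (t tally) rates() Rates {
	if t.total == 0 {
		return Rates{}
	}
	return Rates{
		Total:                 t.total,
		AckComplianceRate:     float64(t.ackMet) / float64(t.total),
		ResolveComplianceRate: float64(t.resolveMet) / float64(t.total),
	}
}

// Summarize computes overall and per-severity compliance rates.
func Summarize(records []Record) (Rates, map[string]Rates) {
	var all tally
	per := map[string]*tally{}
	for _, r := range records {
		t, ok := per[r.Severity]
		if !ok {
			t = &tally{}
			per[r.Severity] = t
		}
		// Count the record in both the overall and the per-severity tally.
		for _, x := range []*tally{&all, t} {
			x.total++
			if r.AckMet {
				x.ackMet++
			}
			if r.ResolveMet {
				x.resolveMet++
			}
		}
	}
	bySeverity := make(map[string]Rates, len(per))
	for sev, t := range per {
		bySeverity[sev] = t.rates()
	}
	return all.rates(), bySeverity
}

func main() {
	overall, bySeverity := Summarize([]Record{
		{Severity: "SEV-1", AckMet: true, ResolveMet: true},
		{Severity: "SEV-1", AckMet: false, ResolveMet: true},
		{Severity: "SEV-2", AckMet: true, ResolveMet: false},
		{Severity: "SEV-2", AckMet: true, ResolveMet: true},
	})
	fmt.Printf("overall ack compliance: %.2f%%\n", overall.AckComplianceRate*100)
	fmt.Printf("SEV-1 ack compliance: %.2f%%\n", bySeverity["SEV-1"].AckComplianceRate*100)
}
```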

Configuration:

SLA thresholds are configurable per severity level and can be customized for specific services or incident types, so that SLA targets follow business requirements and service level commitments.

Import Path: github.com/systmms/incidents/internal/sla

Types

Adapter

Adapter implements the timeline.SLATracker interface.


Methods

NewAdapter

NewAdapter creates a new SLA adapter.


SLAConfig

SLAConfig defines the SLA thresholds for incidents.


Methods

DefaultSLAConfig

DefaultSLAConfig returns the default SLA thresholds.


SLAEvent

SLAEvent represents an SLA-related event.


SLAMetrics

SLAMetrics aggregates SLA performance metrics.


SLAStatus

SLAStatus represents the current SLA status of an incident.


Service

Service provides comprehensive SLA monitoring, tracking, and compliance management.

Service is the central SLA component of the incident management platform. It monitors, tracks, and enforces service level commitments across all incident severity levels, and provides real-time SLA status tracking, proactive deadline monitoring, and compliance reporting.

Core Responsibilities:

  • Automatic SLA initialization based on incident severity and service type
  • Real-time acknowledgment and resolution time tracking with precise measurements
  • Configurable threshold management for different severity levels and services
  • Proactive deadline monitoring with automated alerting for approaching deadlines
  • SLA breach detection and comprehensive event logging for audit trails
  • Statistical analysis and compliance rate calculation for reporting
  • Integration with timeline service for accurate time tracking and correlation

The service operates continuously to ensure SLA compliance and provides the data foundation for performance reporting, compliance dashboards, and operational improvement initiatives.


Methods

NewService

NewService creates a new SLA service with database and timeline integration.

This constructor initializes the SLA tracking service with database access for SLA status storage and timeline service integration for comprehensive audit trails. The service is configured with default SLA thresholds that can be customized after creation using the SetConfig method.

Default SLA Configuration:

The service initializes with the following default SLA thresholds:

  • SEV-1: 5 minutes acknowledge, 1 hour resolve
  • SEV-2: 15 minutes acknowledge, 4 hours resolve
  • SEV-3: 30 minutes acknowledge, 24 hours resolve
  • SEV-4: 1 hour acknowledge, 72 hours resolve

Parameters:

  • db: Database connection for SLA status persistence and metrics calculation
  • timeline: Timeline service for SLA event logging and audit trail integration

Integration:

The service integrates with:

  • Database for persistent SLA status tracking and historical metrics
  • Timeline service for comprehensive audit logging of SLA events
  • Configurable threshold system for flexibility across different service types

Returns a configured Service that is ready for SLA tracking immediately and can be given custom thresholds afterwards as needed.


SeverityMetrics

SeverityMetrics contains SLA metrics for a specific severity.


Functions

formatDuration


Generated automatically from Go source code. Last updated: 2025-08-25T07:51:05-04:00