ITSM Bi-Directional Sync Guide
This guide covers the concepts, configuration, and best practices for bi-directional synchronization between the Incidents platform and external ITSM systems.
Overview
Bi-directional sync enables real-time incident synchronization between the Incidents platform and external ITSM systems like ServiceNow, Jira Service Management, and PagerDuty.
Key Features
- Real-time Updates: Changes propagate within 5 seconds
- Field Ownership: Designate authoritative sources per field
- Conflict Resolution: Handle simultaneous edits gracefully
- Drift Detection: Reconciliation to catch missed updates
- Dead Letter Queue: Retry failed syncs automatically
Supported Platforms
| Platform | Outbound | Inbound | Bi-directional |
|---|---|---|---|
| ServiceNow | Yes | Yes | Yes |
| Jira SM | Yes | Yes | Yes |
| PagerDuty | Yes | Yes | Yes |
Architecture
Data Flow
┌─────────────────┐                    ┌─────────────────┐
│    Incidents    │◄──── Webhook ──────│    External     │
│    Platform     │                    │   ITSM System   │
│                 │──── Outbox ───────►│                 │
└─────────────────┘                    └─────────────────┘
         │                                      │
         │        Field Ownership Rules         │
         │        Conflict Detection            │
         │        Drift Reconciliation          │
         └──────────────────────────────────────┘
Components
- Integration Configuration: Stores connection settings and credentials
- Field Ownership Engine: Determines authoritative source per field
- Outbound Sync Service: Pushes changes to external systems
- Inbound Webhook Handler: Processes updates from external systems
- Conflict Detector: Identifies simultaneous updates
- Drift Detector: Reconciles state between systems
- Dead Letter Queue: Manages failed sync attempts
Sync Directions
Outbound Only
Platform changes are pushed to the external system; inbound webhooks are ignored:
im integration configure prod-snow --direction outbound
Inbound Only
Webhooks update the platform; platform changes are not synced out:
im integration configure prod-snow --direction inbound
Bi-directional
Full two-way sync with conflict detection:
im integration configure prod-snow --direction bidirectional
Field Ownership
Field ownership determines which system is authoritative for each incident field.
Ownership Types
| Owner | Behavior |
|---|---|
| platform | Platform changes sync out; inbound updates are ignored |
| external | External changes sync in; outbound updates are skipped |
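The two ownership behaviors in the table can be sketched as a filter applied to every update before it is synced. This is a minimal illustration, not the platform's implementation: the function name `filter_update`, the dict-based ownership map, and the choice to drop fields that have no ownership rule are all assumptions made for the example.

```python
def filter_update(changes: dict, ownership: dict, source: str) -> dict:
    """Keep only the fields that `source` is allowed to change.

    `ownership` maps field name -> "platform" or "external".
    `source` is the system the update came from.
    Fields without a rule are dropped here (an assumption of this sketch).
    """
    return {
        field: value
        for field, value in changes.items()
        if ownership.get(field) == source
    }


# Hypothetical ownership map mirroring the table above.
ownership = {"severity": "platform", "assigned_to": "external"}
incoming = {"severity": "high", "assigned_to": "alice"}

# An inbound webhook (source "external") may only touch externally owned fields.
inbound_applied = filter_update(incoming, ownership, "external")

# An outbound push (source "platform") only carries platform-owned fields.
outbound_sent = filter_update(incoming, ownership, "platform")
```

With this model, an inbound change to severity is silently ignored and an outbound change to assigned_to is skipped, which matches the two rows of the table.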
Configuration
# Set platform as owner of severity (priority 10)
im integration field-ownership set prod-snow severity --owner platform --priority 10
# Set external as owner of assigned_to (priority 5)
im integration field-ownership set prod-snow assigned_to --owner external --priority 5
# List all ownership rules
im integration field-ownership list prod-snow
Default Ownership Rules
Each integration type has sensible defaults:
ServiceNow:
- Platform owns: severity, title, description
- External owns: assigned_to, work_notes
Jira SM:
- Platform owns: severity, priority
- External owns: assignee, comments
PagerDuty:
- Platform owns: title, description
- External owns: escalation_policy, assigned_to
Conflict Resolution
When both systems update the same field within 5 seconds, a conflict is detected.
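The detection rule above can be sketched as a comparison of two field updates. This is an illustrative model only: the `FieldUpdate` type and `is_conflict` function are hypothetical names, and treating a gap of exactly 5 seconds as a conflict is an assumption the guide does not settle.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Window taken from the guide; the inclusive boundary is an assumption.
CONFLICT_WINDOW = timedelta(seconds=5)


@dataclass(frozen=True)
class FieldUpdate:
    source: str          # "platform" or "external"
    field: str           # incident field name, e.g. "severity"
    value: str
    updated_at: datetime


def is_conflict(a: FieldUpdate, b: FieldUpdate,
                window: timedelta = CONFLICT_WINDOW) -> bool:
    """True when two different systems changed the same field within `window`."""
    return (
        a.field == b.field
        and a.source != b.source
        and abs(a.updated_at - b.updated_at) <= window
    )


t0 = datetime(2024, 1, 1, 12, 0, 0)
platform_edit = FieldUpdate("platform", "severity", "high", t0)
external_edit = FieldUpdate("external", "severity", "low", t0 + timedelta(seconds=3))
late_edit = FieldUpdate("external", "severity", "low", t0 + timedelta(seconds=6))
```

Here `platform_edit` and `external_edit` conflict because they are 3 seconds apart, while `late_edit` falls outside the window and would be handled as an ordinary update.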
Resolution Strategies
Last-Write-Wins (Default)
The most recent update takes precedence:
im integration configure prod-snow --conflict-strategy last-write-wins
Ownership-Priority
The system with higher ownership priority for the field wins:
im integration configure prod-snow --conflict-strategy ownership-priority
Use this when field ownership should always be respected, even during simultaneous edits.
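To make the difference between last-write-wins and ownership-priority concrete, the sketch below resolves the same conflict both ways. The `Candidate` type and both function names are hypothetical, and it assumes that a larger priority number means higher priority and that ties go to the first argument; the guide does not specify either point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str        # "platform" or "external"
    value: str
    updated_at: float  # seconds since epoch
    priority: int      # ownership priority of this source for the field (assumed model)


def last_write_wins(a: Candidate, b: Candidate) -> Candidate:
    """Pick the most recent update; ties go to `a` (an assumption)."""
    return a if a.updated_at >= b.updated_at else b


def ownership_priority(a: Candidate, b: Candidate) -> Candidate:
    """Pick the update from the higher-priority owner; ties go to `a` (an assumption)."""
    return a if a.priority >= b.priority else b


# Platform edits first with priority 10; external edits 2 seconds later with priority 5.
platform = Candidate("platform", "high", updated_at=100.0, priority=10)
external = Candidate("external", "low", updated_at=102.0, priority=5)

lww_winner = last_write_wins(platform, external)      # the external edit is newer
owner_winner = ownership_priority(platform, external)  # the platform has higher priority
```

The two strategies disagree on this input, which is the situation where the choice of strategy matters.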
Manual Review
Conflicts are queued for human review:
im integration configure prod-snow --conflict-strategy manual-review
Managing Conflicts
# List pending conflicts
im conflict list --status pending
# View conflict details
im conflict show conflict-123
# Resolve using platform value
im conflict resolve conflict-123 --use-platform
# Resolve using external value
im conflict resolve conflict-123 --use-external
# Ignore the conflict (keep current state)
im conflict resolve conflict-123 --ignore
Drift Detection
Drift occurs when systems become out of sync due to missed webhooks, network issues, or direct database edits.
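Conceptually, detecting drift means comparing the two systems' view of the same incident field by field. The sketch below shows that comparison on plain dicts; `find_drift` and the shape of the discrepancy records are assumptions for illustration and do not describe the reconciler's actual output.

```python
def find_drift(platform: dict, external: dict, fields: list) -> list:
    """Return one record per synced field whose value differs between systems.

    A field missing on one side is reported with None for that side.
    """
    discrepancies = []
    for field in fields:
        platform_value = platform.get(field)
        external_value = external.get(field)
        if platform_value != external_value:
            discrepancies.append({
                "field": field,
                "platform": platform_value,
                "external": external_value,
            })
    return discrepancies


# Hypothetical snapshots of one incident as seen by each system.
platform_state = {"severity": "high", "title": "DB outage", "assigned_to": "alice"}
external_state = {"severity": "high", "title": "DB outage", "assigned_to": "bob"}

drift = find_drift(platform_state, external_state,
                   ["severity", "title", "assigned_to"])
```

In this example only assigned_to has drifted, for instance because a webhook carrying the reassignment was missed.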
On-Demand Reconciliation
# Check for drift without making changes
im integration reconcile prod-snow --dry-run
# Auto-heal detected drift
im integration reconcile prod-snow --auto-heal
# View what would change
im integration reconcile prod-snow --dry-run --format json
Scheduled Reconciliation
Configure automatic reconciliation:
curl -X POST http://localhost:8080/api/v1/integrations/{id}/reconcile/schedule \
  -H "Content-Type: application/json" \
  -d '{
    "interval": "1h",
    "auto_heal": true,
    "enabled": true
  }'
Reconciliation History
im integration reconcile history prod-snow
Dead Letter Queue
A sync is moved to the dead letter queue (DLQ) after 5 failed attempts.
Managing the DLQ
# List failed entries
im integration dlq prod-snow
# Retry a specific entry
im integration dlq-retry prod-snow entry-123
# Discard permanently failed entry
im integration dlq-discard prod-snow entry-456
Automatic Retry
The system automatically retries with exponential backoff:
- Attempt 1: Immediate
- Attempt 2: 30 seconds
- Attempt 3: 2 minutes
- Attempt 4: 8 minutes
- Attempt 5: 30 minutes
- After 5 failures: Move to DLQ
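The delays listed above are consistent with a 30-second base delay that is multiplied by 4 on each retry and capped at 30 minutes. Those three parameters are inferred from the numbers, not documented values, and the function name is hypothetical; the sketch only shows that such a formula reproduces the schedule.

```python
BASE_DELAY_S = 30        # inferred: delay before attempt 2
MULTIPLIER = 4           # inferred: 30s -> 2min -> 8min
MAX_DELAY_S = 30 * 60    # inferred cap: 30 minutes
MAX_ATTEMPTS = 5         # from the guide: after 5 failures, move to DLQ


def delay_before_attempt(attempt: int) -> int:
    """Seconds to wait before the given attempt (1-based)."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt == 1:
        return 0  # the first attempt is immediate
    return min(BASE_DELAY_S * MULTIPLIER ** (attempt - 2), MAX_DELAY_S)


schedule = [delay_before_attempt(n) for n in range(1, MAX_ATTEMPTS + 1)]
print(schedule)  # [0, 30, 120, 480, 1800]
```

Without the cap, the fifth delay would be 1920 seconds (32 minutes) rather than the listed 30 minutes.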
Monitoring
Sync Status
# View integration health
im integration sync-status prod-snow
# Get detailed metrics
im integration sync-status prod-snow --format json
Metrics Available
| Metric | Description |
|---|---|
| outbound_success | Successful outbound syncs |
| outbound_failed | Failed outbound syncs |
| inbound_success | Successful inbound syncs |
| inbound_failed | Failed inbound syncs |
| conflicts_detected | Total conflicts detected |
| conflicts_resolved | Conflicts auto-resolved |
| drift_discrepancies | Drift items found |
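These counters are most useful as ratios. The sketch below derives a failure rate per direction, assuming the counters are available as a flat mapping of metric name to integer; the exact shape of the JSON output is not specified in this guide, and `failure_rate` is a hypothetical helper.

```python
def failure_rate(metrics: dict, direction: str) -> float:
    """Fraction of failed syncs for "outbound" or "inbound"; 0.0 when there is no traffic."""
    succeeded = metrics.get(f"{direction}_success", 0)
    failed = metrics.get(f"{direction}_failed", 0)
    total = succeeded + failed
    return failed / total if total else 0.0


# Hypothetical counter values for illustration.
sample = {
    "outbound_success": 95,
    "outbound_failed": 5,
    "inbound_success": 40,
    "inbound_failed": 0,
}

outbound_rate = failure_rate(sample, "outbound")
inbound_rate = failure_rate(sample, "inbound")
```

A rising outbound rate alongside a flat inbound rate would point at the external system's API rather than at webhook delivery.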
OpenTelemetry Tracing
All sync operations emit OpenTelemetry spans with attributes:
- sync.integration_id: Integration identifier
- sync.integration_type: servicenow, jira, pagerduty
- sync.direction: outbound, inbound
- sync.incident_id: Platform incident ID
- sync.external_id: External system ID
Security
Credential Storage
Credentials are encrypted at rest and never returned in API responses.
Webhook Validation
Inbound webhooks are validated with a platform-specific signature method:
| Platform | Method |
|---|---|
| ServiceNow | HMAC-SHA256 |
| Jira | JWT (HS256) |
| PagerDuty | HMAC-SHA256 |
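For the HMAC-SHA256 rows, validation generally means recomputing the signature over the raw request body with the shared secret and comparing it in constant time. The sketch below shows that general pattern with Python's standard library. It assumes a hex-encoded signature; the header that carries it and its exact encoding differ per platform and are not specified in this guide, and `verify_signature` is a hypothetical name.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        expected.encode("ascii"),
        received_hex.strip().lower().encode("utf-8"),
    )


# Illustrative values only.
secret = b"example-shared-secret"
body = b'{"incident_id": "INC-1", "severity": "high"}'
good_signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

accepted = verify_signature(secret, body, good_signature)
rejected = verify_signature(secret, body + b" ", good_signature)
```

The signature must be computed over the bytes exactly as received; re-serializing parsed JSON before verifying usually changes the bytes and makes valid webhooks fail.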
RBAC
Access is controlled via OPA policies:
| Role | Permissions |
|---|---|
| platform.administrator | Full access to all integrations |
| integration.manager | Manage integrations in assigned scope |
| incident_commander | Read sync status, resolve conflicts for assigned incidents |
| viewer | Read-only access to sync status |
Best Practices
1. Start with Outbound Only
Test outbound sync before enabling bi-directional to verify field mappings.
2. Define Clear Ownership
Explicitly set field ownership before enabling bi-directional sync.
3. Monitor Conflicts
Start with the manual-review strategy to understand your conflict patterns before switching to an automatic strategy.
4. Schedule Regular Reconciliation
Run drift detection at least hourly to catch missed updates.
5. Handle DLQ Promptly
Monitor the dead letter queue and investigate failures before they accumulate.
Troubleshooting
Sync Not Working
- Check integration is enabled:
  im integration get prod-snow
- Verify webhook is configured in external system
- Test connectivity:
  im integration test prod-snow
- Check logs for errors
Field Not Syncing
- Verify field ownership:
  im integration field-ownership list prod-snow
- Check field mapping exists
- Ensure field is included in webhook payload
Rate Limiting
- Check current rate limit settings
- Reduce sync frequency
- Enable rate limit backoff
Webhook Failures
- Verify signature/JWT secret
- Check network connectivity
- Validate payload format