
    AWS Personal Health Dashboard is a centralized notification system for service events that directly affect your AWS resources. Unlike the general AWS Service Health Dashboard, Personal Health Dashboard delivers account-specific alerts about events affecting your particular infrastructure, enabling targeted response and proactive management. Troubleshooting Personal Health events requires understanding how events are classified and choosing an appropriate response for each class. This documentation explains how to interpret Personal Health notifications and offers actionable recommendations for resolving them.

    How Monitoring Works


    We monitor AWS Personal Health Dashboard events that could impact your infrastructure, tracking:

    • Service disruptions affecting your resources
    • Scheduled maintenance windows
    • Account-specific notifications
    • Resource-specific health events

    While Amazon CloudWatch monitors metrics and Personal Health Dashboard provides event notifications, Blue Matador correlates health events with your actual infrastructure state and performance metrics. With intelligent filtering and proactive alerting, Blue Matador surfaces the Personal Health events that require action rather than overwhelming teams with routine notifications. By integrating Personal Health monitoring with broader infrastructure observability, administrators can respond to issues efficiently and maintain system availability.

     

    Understanding Event Categories


    Event Types

    • Issue Events: Active problems currently affecting your resources, requiring immediate investigation and potential mitigation.
    • Scheduled Change Events: Planned maintenance activities that may impact resource availability during specified timeframes.
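
    As a rough illustration of the two categories above, the sketch below groups events by category. It assumes each event is a plain dict carrying an eventTypeCategory field, with "issue" and "scheduledChange" as the values for the two categories described here. The function name and the sample events are hypothetical and trimmed to the fields the example needs.

```python
from collections import defaultdict

# Category strings assumed to appear in the eventTypeCategory field.
ISSUE = "issue"
SCHEDULED_CHANGE = "scheduledChange"


def group_by_category(events):
    """Group event dicts by their eventTypeCategory value.

    Events without the field are collected under "unknown" so that
    nothing is silently dropped.
    """
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get("eventTypeCategory", "unknown")].append(event)
    return dict(grouped)


# Hypothetical sample events, for illustration only.
sample_events = [
    {"eventTypeCode": "AWS_EC2_OPERATIONAL_ISSUE", "eventTypeCategory": ISSUE},
    {"eventTypeCode": "AWS_RDS_MAINTENANCE_SCHEDULED", "eventTypeCategory": SCHEDULED_CHANGE},
]

grouped = group_by_category(sample_events)
for category, items in grouped.items():
    print(category, [e["eventTypeCode"] for e in items])
```

    Keeping an "unknown" bucket is a deliberate choice in this sketch: an event that does not match either category is still visible for manual review.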

     

    Responding to Issue Events

    When AWS Personal Health Dashboard reports an active issue affecting your resources, assess it immediately and mitigate where possible to minimize service impact.

    Possible Solutions

    • Identify Affected Resources
      • Review event details to determine which specific resources are impacted, noting the provided ARNs or resource identifiers.
      • Cross-reference affected resources with your application architecture to understand potential downstream impacts.
    • Assess Service Impact
      • Determine whether the issue is causing user-facing service degradation or if redundancy mechanisms are maintaining availability.
      • Consult monitoring dashboards and application metrics to validate the scope and severity of impact.
    • Implement Mitigation Strategies
      • For resources configured with redundancy, consider failing over to healthy instances or regions.
      • Evaluate whether deploying replacement resources in unaffected availability zones or regions can restore full functionality.
      • Document incident timeline and mitigation actions for post-incident analysis.
    • Maintain Communication
      • Update stakeholders regarding the AWS service issue and your mitigation strategies.
      • Reference the specific event in incident communications for clarity.

    Effective response requires distinguishing between problems that can be mitigated through infrastructure adjustments and those where AWS service restoration is necessary. Where failover or resource redeployment can restore service, implement those changes promptly. When resolution depends on AWS service restoration, monitor the event for updates and keep your escalation channels open.
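
    The first step, identifying affected resources and their downstream impact, can be sketched as follows. This assumes the event arrives as a dict whose detail section carries an affectedEntities list with an entityValue per resource. The dependency map is a hypothetical structure you would maintain yourself, and the resource identifiers and application names are made up for the example.

```python
def affected_entity_ids(event):
    """Return the resource identifiers listed on an event.

    Assumes the event detail has an affectedEntities list whose
    items carry an entityValue field; entries without it are skipped.
    """
    entities = event.get("detail", {}).get("affectedEntities", [])
    return [e["entityValue"] for e in entities if "entityValue" in e]


def downstream_impact(entity_ids, dependency_map):
    """Map each affected resource to the applications that depend on it.

    dependency_map is a hypothetical dict: resource identifier -> list
    of application names. Resources missing from the map get an empty list.
    """
    return {rid: dependency_map.get(rid, []) for rid in entity_ids}


# Hypothetical event and dependency map, for illustration only.
sample_event = {
    "detail": {
        "eventTypeCategory": "issue",
        "affectedEntities": [
            {"entityValue": "i-0123456789abcdef0"},
            {"entityValue": "vol-0123456789abcdef0"},
        ],
    }
}
dependency_map = {"i-0123456789abcdef0": ["checkout-api", "admin-console"]}

ids = affected_entity_ids(sample_event)
impact = downstream_impact(ids, dependency_map)
for resource, apps in impact.items():
    print(resource, "->", apps or "no known dependents")
```

    A resource with no known dependents is still reported, since a missing entry may mean the dependency map is incomplete rather than that nothing relies on the resource.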


     

    Managing Scheduled Maintenance


    Scheduled change events provide advance notification of planned AWS maintenance activities that may affect your resources during specified timeframes.

    Possible Solutions

    • Evaluate Maintenance Impact
      • Review the scheduled maintenance timeframe and assess potential conflicts with business-critical operations.
      • Determine whether maintenance affects resources with built-in redundancy or presents risk of service interruption.
    • Plan Resource Adjustments
      • For maintenance affecting non-redundant resources, consider migrating workloads to unaffected resources before the maintenance window.
      • Deploy replacement resources in advance and transition traffic gradually to minimize disruption.
    • Request Schedule Modification (When Applicable)
      • Some scheduled maintenance can be rescheduled through AWS Support if the timing presents significant business impact.
      • Document business justification when requesting maintenance window adjustments.
    • Prepare Validation Procedures
      • Ensure rollback procedures are documented and tested in case post-maintenance issues arise.
      • Establish monitoring and validation protocols to verify system health following maintenance completion.

    Scheduled maintenance notifications provide the opportunity to implement proactive measures that minimize service disruption. Decide whether to let the maintenance proceed as scheduled or to migrate resources beforehand in order to maintain service availability during the maintenance window.
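
    Checking a maintenance timeframe against business-critical operations comes down to an interval overlap test. The sketch below assumes you already have the maintenance start and end times as timezone-aware datetimes. The critical windows, their names, and the dates are hypothetical.

```python
from datetime import datetime, timezone


def overlaps(start_a, end_a, start_b, end_b):
    """True when two half-open time ranges [start, end) share any instant."""
    return start_a < end_b and start_b < end_a


def conflicting_windows(maintenance_start, maintenance_end, critical_windows):
    """Return the names of critical windows that overlap the maintenance.

    critical_windows is a hypothetical dict: name -> (start, end).
    """
    return [
        name
        for name, (start, end) in critical_windows.items()
        if overlaps(maintenance_start, maintenance_end, start, end)
    ]


def utc(year, month, day, hour):
    """Small helper to build timezone-aware UTC datetimes."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


# Hypothetical maintenance window and business-critical windows.
maintenance = (utc(2024, 3, 10, 2), utc(2024, 3, 10, 4))
critical_windows = {
    "nightly-billing-run": (utc(2024, 3, 10, 3), utc(2024, 3, 10, 5)),
    "product-launch": (utc(2024, 3, 12, 9), utc(2024, 3, 12, 17)),
}

conflicts = conflicting_windows(*maintenance, critical_windows)
print(conflicts)
```

    Any window named in the result is a candidate for migrating workloads ahead of time or for requesting a schedule change.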