Outage category: Blackboard, myMason
Location: All campuses
Status: Closed
Resolved Alert:
Initial Symptoms
Blackboard was unavailable several times on October 21 and 22.
Root Cause Analysis
Cause
The root cause of the incident was an ActiveMQ issue in which the ActiveMQ broker appears only on instances that have already been terminated. This occurs during automated restarts, when AWS restarts the instances without the broker first being terminated. As a result, the database does not remove the broker instance, and when new apps are brought up, the broker appears to be running already. It was also noted that GMU was running its SIS feed file during the service interruptions, and this caused the outage. In addition, the What’s New Module and custom language were observed to exacerbate the performance issue.
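The stale-broker condition described above can be illustrated with a minimal sketch. It is not the vendor's code or the Blackboard schema: the `BrokerRecord` type, its field names, and the instance IDs are hypothetical, chosen only to show how a broker record left behind by a terminated instance can make the broker look as if it is running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokerRecord:
    """Hypothetical database row saying which instance hosts the broker."""
    broker_id: str
    instance_id: str


def find_stale_brokers(records, live_instance_ids):
    """Return broker records that point at instances which no longer exist."""
    live = set(live_instance_ids)
    return [r for r in records if r.instance_id not in live]


def broker_appears_running(records):
    """What a new app sees if it only checks that a record exists."""
    return len(records) > 0


def broker_actually_running(records, live_instance_ids):
    """True only if at least one record points at a live instance."""
    live = set(live_instance_ids)
    return any(r.instance_id in live for r in records)


# Assumed example: the broker was registered on an instance that was
# later terminated during an automated restart, and the record was
# never removed from the database.
records = [BrokerRecord(broker_id="broker-1", instance_id="i-old")]
live_instances = ["i-new-1", "i-new-2"]

stale = find_stale_brokers(records, live_instances)
appears = broker_appears_running(records)
actual = broker_actually_running(records, live_instances)
```

In this sketch a new app that checks only for the existence of a record concludes that the broker is running, while a check against the list of live instances shows that it is not.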
Resolution
The vendor is working on a permanent solution, expected in mid-December.
Prevention
Until the issue is fixed (expected in mid-December), workarounds have been put in place to keep it from recurring: nodes will not be restarted except manually, and building blocks will not be installed or updated.