Information Technology Services

Webex Outage

Outage category: Webex

Location: All locations

Status: Open

Resolved Alert:

Initial Symptoms

Users were unable to join meetings, and the website gmu.webex.com was unavailable.

Root Cause Analysis

Cause

Cisco Engineering identified a service latency condition that occurred when high volumes of users connected to the Webex sites for web page joins, meeting launches, and Webex desktop/client updates. The root cause was higher-than-normal I/O wait on data access for the Webex web servers when they queried for meeting client update availability. Because these traffic sources combined during a high-traffic period, the web server pools were unable to handle the additional requests, which led to a small percentage of meeting join failures.

Resolution

During the March 26th maintenance window, Cisco Engineering made a configuration change that applies additional traffic shaping to the random client upgrade paths.
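For readers unfamiliar with traffic shaping, the general idea can be illustrated with a token bucket, one common way to cap how fast a class of requests (here, client update checks) is admitted. This is only a minimal sketch: the class name, the rate, and the capacity are hypothetical, and Cisco has not published the actual configuration it changed.

```python
class TokenBucket:
    """Illustrative token-bucket limiter (hypothetical; not Cisco's implementation)."""

    def __init__(self, rate_per_s: float, capacity: float, now: float = 0.0):
        self.rate_per_s = rate_per_s   # tokens added per second
        self.capacity = capacity       # maximum burst size
        self.tokens = capacity         # start with a full bucket
        self.last = now                # time of the last refill

    def allow(self, now: float) -> bool:
        """Return True if one request may proceed at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Assumed values: 2 update checks per second, bursts of up to 5.
bucket = TokenBucket(rate_per_s=2.0, capacity=5.0)
burst = [bucket.allow(0.0) for _ in range(8)]
```

In this sketch, the first five requests in the burst are admitted and the remaining three are refused until the bucket refills, which keeps update traffic from crowding out meeting joins.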

Additional web service storage was deployed to production over the March 27th weekend maintenance window.

Prevention

During the March 26th maintenance window, Cisco Engineering made a configuration change that applies additional traffic shaping to the random client upgrade paths.

Additional web service storage was deployed to production over the March 27th weekend maintenance window.

A client code change is also being deployed to add further load balancing and randomization to the update process.
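The effect of randomizing the update process can be sketched as follows. The example assumes, purely for illustration, that each client waits a base interval plus or minus a random jitter before checking for updates; the function names, the interval, and the jitter fraction are hypothetical and do not describe the actual Webex client logic.

```python
import random


def update_check_delay(base_interval_s: float, jitter_fraction: float,
                       rng: random.Random) -> float:
    """Seconds to wait before the next update check, spread around the base interval."""
    if not 0.0 <= jitter_fraction <= 1.0:
        raise ValueError("jitter_fraction must be between 0 and 1")
    low = base_interval_s * (1.0 - jitter_fraction)
    high = base_interval_s * (1.0 + jitter_fraction)
    return rng.uniform(low, high)


def peak_checks_per_window(delays, window_s: float) -> int:
    """Largest number of clients whose checks land in the same time window."""
    buckets = {}
    for delay in delays:
        key = int(delay // window_s)
        buckets[key] = buckets.get(key, 0) + 1
    return max(buckets.values())


# Assumed scenario: 10,000 clients, hourly checks, 60-second windows.
rng = random.Random(0)
synchronized = [3600.0] * 10_000
jittered = [update_check_delay(3600.0, 0.5, rng) for _ in range(10_000)]

peak_without_jitter = peak_checks_per_window(synchronized, 60.0)
peak_with_jitter = peak_checks_per_window(jittered, 60.0)
```

Without jitter, every client in this scenario queries the web servers in the same window; with jitter, the same checks are spread across roughly an hour, so the peak load on the web server pools is far lower.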