Kaltura MyMedia Unavailable

Outage category: 
Blackboard
Location: 
All Kaltura Users
Status: 
Closed
Resolved alert: 
01/23/2023 12:04 pm

Users were not able to access their MyMedia library or upload video. Many videos in Blackboard courses would not play.

Initial symptoms: 

No videos played

Duration: 
01/23/2023 9:58 am - 01/23/2023 12:04 pm
Impact to Mason: 

Users unable to view or upload videos to Kaltura/Blackboard. Students unable to view videos in courses.

Affected Services: 
Kaltura
ROOT CAUSE ANALYSIS
Cause: 

Kaltura’s system experienced service degradation due to a high load on its cached connections. Users intermittently experienced failures and access issues while trying to use certain applications and APIs. During a recent geolocation database update on Kaltura’s production environment, a new backend package was deployed. The package included a code change that should not have been included because the version had already been frozen. The new code used wrong cache paths for thumbnails, which placed significant load on the cache layer.
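
The report does not describe the defective code itself. As a rough illustration only of how a wrong cache path can translate into cache-layer load, the sketch below compares a stable thumbnail path with a hypothetical faulty one in which a per-request value leaks into the path. All names here (CountingCache, stable_path, unstable_path, simulate) are assumptions made for this example and are not Kaltura code.

```python
from itertools import count


class CountingCache:
    """Tiny in-memory cache that records hits and misses."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get_or_render(self, key, render):
        # Return the cached value if present; otherwise render and store it.
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = render()
        self.store[key] = value
        return value


def stable_path(entry_id, width, height):
    # The same thumbnail request always maps to the same cache path.
    return f"thumb/{entry_id}/{width}x{height}"


_request_counter = count()


def unstable_path(entry_id, width, height):
    # Hypothetical defect: a per-request value leaks into the path,
    # so no two requests ever share a cache entry.
    return f"thumb/{next(_request_counter)}/{entry_id}/{width}x{height}"


def simulate(path_fn, requests=1000):
    # Request the same thumbnail repeatedly and report
    # (hits, misses, number of entries stored).
    cache = CountingCache()
    for _ in range(requests):
        key = path_fn("entry_1", 120, 90)
        cache.get_or_render(key, lambda: b"thumbnail-bytes")
    return cache.hits, cache.misses, len(cache.store)


if __name__ == "__main__":
    print("stable path:  ", simulate(stable_path))
    print("unstable path:", simulate(unstable_path))
```

With the stable path, only the first request misses and a single entry is stored. With the faulty path, every request misses, re-renders the thumbnail, and writes a new entry, which is one plausible way a path error could multiply traffic to a cache layer.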

Resolution: 

Kaltura first mitigated the high load on the caching layer, the main source of impact, by separating the thumbnails and captions services from the rest of the system. After further investigation, the defective code was traced and reverted, a new cache layer component was added, and the system stabilized and fully recovered.

Prevention: 

Enable GitHub branch protection on all deployed branches to prevent accidental merges of code into frozen versions. Research and implement optimizations for cache-layer resilience and scale (move to a distributed memcached, EVCache, or Redis-based cache layer). Optimize the thumbnail generation code to reduce cache layer connections (ongoing).

STATISTICS
Service Team: 
OLR