App down

Incident Report for Wisembly

Resolved

This incident has been resolved.
Posted Feb 04, 2016 - 12:16 CET

Monitoring

We deployed a new configuration for our logging system and will see whether it improves performance and limits the message queue size. If so, we will roll this configuration out to our production environments in a few days.
Posted Feb 03, 2016 - 11:36 CET

Identified

Our log management system, rsyslog, appears to have filled its entire buffer while having trouble sending logs to our various log visualisation tools. This froze the network stack of our front-end servers, making our solution unreachable until the buffer pool emptied.

We are now closely monitoring rsyslog and are considering either upgrading it to a newer, more robust version or collecting and sending our logs another way, in a non-blocking manner.
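
To illustrate what "non-blocking" means here, below is a minimal Python sketch of a bounded log buffer that drops messages when it is full instead of making the caller wait. It is an illustration only, not our rsyslog setup or the fix we will deploy; the class name DroppingLogBuffer and its methods are hypothetical.

```python
import queue


class DroppingLogBuffer:
    """Bounded log buffer that never blocks the caller (hypothetical sketch)."""

    def __init__(self, max_messages):
        # A fixed capacity keeps memory use bounded when the log sink is slow.
        self._queue = queue.Queue(maxsize=max_messages)
        self.dropped = 0

    def submit(self, message):
        """Enqueue a message; if the buffer is full, drop it and return False."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            # Losing a log line is preferred over stalling the application.
            self.dropped += 1
            return False

    def drain(self):
        """Return all buffered messages, e.g. for a background sender."""
        messages = []
        while True:
            try:
                messages.append(self._queue.get_nowait())
            except queue.Empty:
                return messages
```

The trade-off in this sketch is deliberate: when the destination cannot keep up, some log lines are lost and counted in `dropped`, but the code producing the logs keeps running.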
Posted Jan 22, 2016 - 15:57 CET

Investigating

We had a service disruption from 2:25pm to 2:37pm (12 minutes). We suspect that an internal logging system caused an overflow in the network stack, making our application unreachable during this time. We'll post more information here as we investigate further. Sorry for the inconvenience.
Posted Jan 21, 2016 - 14:47 CET