High severity incidents, first thing on Monday morning, haven’t even had a coffee yet – not what you want!
Well, thanks to our deployment of Log Insight, at least the troubleshooting and report writing were quick. There was a high sev incident the other day after an HA event caused hosts with mailbox servers to reboot. Due to an ‘issue’ on the failed hosts, during which they lost access to local disk and went down without even a PSOD, no logs were saved on the hosts themselves. Basically, dead in the water with no reason why they went down. Log Insight to the rescue. The BAU team jumped on to the right Log Insight instance, queried it with the names of the failed hosts, used a few filters and bingo – they pinpointed the cause of the issue. It took all of 30 minutes. Within an hour or two, they were able to write up a report on why the incident happened.
Log Insight captured every log entry, from right up to the moment the hosts were lost through to the point where the VMs were restarted on surviving hosts – in customizable detail. Perfect!
Before I put in Log Insight, troubleshooting was cumbersome, hard to customize and sometimes just plain annoying. Log Insight – you rock!