Indian telecom giant struggled to offer more than 97% system availability.
Bharat Sanchar Nigam Limited (BSNL), the world’s 7th largest telecom company, depended on the smooth functioning of its data centers. As its four zonal data centers steadily grew to 258 servers and 45 databases, with 10+ TB of data each, managing their increasing complexity and criticality was becoming challenging. System availability hovered at 97%, and BSNL was unable to push it higher. As an industry leader, BSNL needed to improve efficiencies and streamline processes to avoid unplanned outages. A 360° view of the health of its data centers and IT systems was critical to its operations and success.
TCS rolls out data center health monitoring portal with real-time monitoring and alerts.
TCS adopted a holistic approach, treating the data center as one system. We created a health monitoring portal to keep track of the status of each component of the data center in real time. We measured nearly 30 distinct parameters and key performance indicators (KPIs) for each application, database, and operating system, with the option for further customization. Each component was shown in one of 3 distinct states to indicate system health: green (healthy), amber (minor alert), or red (imminent failure).
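The three-state model can be illustrated with a minimal Python sketch. It is not the portal's actual logic: the function names, the threshold values, and the rule that a component takes the state of its worst parameter are all assumptions made for illustration.

```python
from enum import Enum


class Health(Enum):
    GREEN = "healthy"
    AMBER = "minor alert"
    RED = "imminent failure"


# Severity order used to compare states (least to most severe).
SEVERITY = [Health.GREEN, Health.AMBER, Health.RED]


def classify(value, amber_at, red_at):
    """Map one reading to a state; assumes higher values are worse."""
    if value >= red_at:
        return Health.RED
    if value >= amber_at:
        return Health.AMBER
    return Health.GREEN


def overall(states):
    """Assumed roll-up rule: a component is as healthy as its worst parameter."""
    return max(states, key=SEVERITY.index, default=Health.GREEN)


# Hypothetical readings for one server: (value, amber threshold, red threshold).
readings = {
    "cpu_percent": (62, 75, 90),
    "disk_used_percent": (81, 80, 95),
    "db_sessions": (140, 200, 250),
}
states = {name: classify(*r) for name, r in readings.items()}
print(overall(states.values()).name)
```

With these made-up readings, disk usage crosses its amber threshold while nothing reaches red, so the component as a whole is reported as amber.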
Our solution was designed to cater to the data center complexity, scale of operations and maintenance, need for 24x7 monitoring, and report synchronization. We created an SMS Tracker to send data from the portal to the data center team so that system problems could be resolved quickly and proactively, before they worsened. Our system is designed for secure access, with only registered mobile numbers being authorized to send or receive SMS notifications.
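The registered-number restriction can be pictured as a simple allowlist check applied before any SMS is sent or accepted. The sketch below is an illustration under assumptions, not the SMS Tracker's implementation: the phone numbers, the function names, and the normalization rule are hypothetical, and no real SMS gateway is called.

```python
# Hypothetical allowlist of registered mobile numbers.
REGISTERED_NUMBERS = {"+910000000001", "+910000000002"}


def normalize(number):
    """Strip spaces, dashes, and other separators so formats compare equal."""
    return "".join(ch for ch in number if ch.isdigit() or ch == "+")


def is_authorized(number, registered=REGISTERED_NUMBERS):
    """Return True only for numbers on the allowlist."""
    return normalize(number) in registered


def build_notifications(message, recipients, registered=REGISTERED_NUMBERS):
    """Pair the alert text with each authorized recipient; skip the rest."""
    return [
        (normalize(n), message)
        for n in recipients
        if is_authorized(n, registered)
    ]


outbox = build_notifications(
    "RED: db-zone-2 tablespace nearly full",  # hypothetical alert text
    ["+91 00000-00001", "+919999999999"],
)
print(outbox)
```

Only the first recipient is on the allowlist, so the outbox holds a single message; the unregistered number is dropped without being contacted.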
Storing alerts and the actions taken in response enables detailed analysis. BSNL employees can now receive alerts in their mailboxes as well as on their mobile phones.