About the Client
The client is a learning and upskilling platform that connects learners, talent developers, and internal mobility opportunities in one place. It helps individuals and organizations discover, measure, and recognize all kinds of learning, from formal courses to informal activities like articles, podcasts, videos, and more. The client empowers learners to build their skills and advance their careers while enabling organizations to align their talent strategy with their business goals. The company has over five million users and 350 clients.
With a user base exceeding five million, ensuring an exceptional customer experience was paramount. The demanding Azure environment required vigilant monitoring and swift response to alerts. Any downtime or performance drop could affect the customer experience, necessitating continuous availability and peak performance around the clock.
However, the absence of a dedicated support structure led to an ad-hoc approach to Azure monitoring, causing backlogs and performance hiccups. This approach also disrupted the productivity and efficiency of DevOps and Engineering teams, forcing them to divert considerable time and resources from their core responsibilities to monitor Azure instances.
The client invested in Azure monitoring and partnered with Trigent to augment their existing capabilities. Trigent formed a dedicated monitoring and analysis team responsible for several crucial tasks, including continuously monitoring Azure resources and analyzing performance statistics.
The team established alert systems to detect performance issues, device reboots, and unresponsiveness, enabling a proactive approach to maintaining system health. In addition, Trigent used tools like Datadog to create informative dashboards that visualized performance metrics. By identifying patterns and trends in this data, the team could differentiate between normal and abnormal system behavior, which supported informed decision-making and better resource allocation.
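As a rough illustration of separating normal from abnormal behavior, the sketch below flags a metric reading that falls far outside a recent baseline. It is not the team's actual alerting logic, which the case study does not describe: the function name `is_abnormal`, the three-standard-deviation threshold, and the sample CPU readings are all assumptions made for the example.

```python
from statistics import mean, stdev

def is_abnormal(history, value, threshold=3.0):
    """Return True if `value` deviates from the baseline in `history`
    by more than `threshold` standard deviations (hypothetical rule)."""
    if len(history) < 2:
        # Not enough data to estimate a baseline; treat the reading as normal.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change counts as abnormal.
        return value != mu
    return abs(value - mu) > threshold * sigma

# Hypothetical CPU utilization readings (percent) forming the baseline.
baseline_cpu = [41.0, 43.5, 40.2, 42.8, 44.1, 39.9, 42.0]

print(is_abnormal(baseline_cpu, 43.0))  # reading close to the baseline
print(is_abnormal(baseline_cpu, 95.0))  # reading far above the baseline
```

In practice a monitoring platform would typically evaluate rules like this itself; the point here is only the idea of comparing new readings against an observed baseline.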
The team also developed and improved Datadog dashboards to track specific services, such as databases and FTP services, and to display the uptime of Azure services. The core team managed incident tracking and response in Jira, ensuring that errors were logged and addressed in order of priority.
Trigent also monitored Azure system availability, reviewed Datadog statistics, updated and executed runbooks, analyzed patterns in Azure performance, read error logs, conducted Root Cause Analysis (RCA), and followed up on escalated tickets that required collaboration with other teams, such as DevOps and Infrastructure.
A dedicated monitoring team ensured round-the-clock attention to the Azure environment. This meant that issues could be detected and addressed promptly, minimizing downtime and performance degradation.
With Trigent’s cloud expertise, alerts were analyzed and, when necessary, escalated to the appropriate teams, such as DevOps or Engineering. This streamlined the response process and ensured that the right experts addressed each issue, contributing to a well-rounded strategy for monitoring and maintaining the client’s Azure environment.
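The escalation flow described above can be sketched as a simple triage step that orders alerts by priority and maps each one to an owning team. This is only an illustration under stated assumptions: the `Alert` fields, the category names, and the `ROUTES` table are hypothetical and are not taken from the client's actual setup.

```python
from dataclasses import dataclass

# Hypothetical mapping from alert category to the team that should handle it.
ROUTES = {
    "deployment": "DevOps",
    "application": "Engineering",
    "network": "Infrastructure",
}
DEFAULT_TEAM = "Monitoring"  # alerts with no matching route stay with the monitoring team

@dataclass
class Alert:
    summary: str
    category: str
    priority: int  # 1 is the most urgent

def triage(alerts):
    """Return (summary, team) pairs, most urgent alert first."""
    ordered = sorted(alerts, key=lambda alert: alert.priority)
    return [(a.summary, ROUTES.get(a.category, DEFAULT_TEAM)) for a in ordered]

# Hypothetical alerts for demonstration.
alerts = [
    Alert("Disk usage warning", "storage", 3),
    Alert("API error rate spike", "application", 1),
    Alert("Release pipeline failed", "deployment", 2),
]

for summary, team in triage(alerts):
    print(f"{team}: {summary}")
```

Keeping the routing table separate from the triage function means ownership can change without touching the ordering logic.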
The implementation of this solution enabled the organization to
The collaboration between the client and Trigent in establishing a dedicated monitoring team led to significant improvements in Azure environment management. It reduced downtime, improved efficiency, and allowed the onshore team to focus on their core intellectual property while Trigent managed their cloud infrastructure, a win-win for internal teams and customers.