Application Monitoring Subject Matter Expert

Location KUALA LUMPUR, MALAYSIA
Experience level Experienced Hire
Job details sector Information Technology
  • Responsible for monitoring, analyzing, troubleshooting and reporting on the operational performance of AXA applications. This includes, but is not limited to, Infrastructure, Application, Network and Security.
  • Responsible for driving performance enhancements, and leading targeted process improvement initiatives.
  • Responsible for defining the metrics, data collection methods, and reporting mechanisms as well as implementation of an overall performance management strategy.
  • Ensures that logging and monitoring effectively capture all aspects of system and application behavior, to facilitate fast detection and resolution of application availability issues.
  • Acts as SME in troubleshooting performance issues across the Enterprise. This role works closely with IT, Application Development, Project Management and external vendors to ensure consistent tracking and reporting of metrics and performance data across the Enterprise.
  • Responsible for supporting cost transparency efforts and helping to develop mature cost metrics and cost optimization.

Key Responsibilities:

  • Define and maintain IT’s performance monitoring and reporting strategy (processes, tools, & templates); develop enhanced reporting capabilities through standardization and automation
  • Proactively analyze trends in performance across IT; collaborate with process owners and stakeholders to identify and implement process improvements to increase operational efficiency and Application availability
  • Analyze and recommend performance improvements for capacity, availability, performance, support and security.
  • Stays informed of production changes that could affect functionality and alerting.
  • Coordinate across teams, working closely with peers to ensure the appropriate focus and sense of urgency are applied to all issues.
  • Troubleshoot using logs, alerts and external data sources to identify network, application or security issues, correlating data to determine root cause.
  • Accurately troubleshoots, reproduces, and documents issues and other pertinent information in Incident or Problem tickets.
  • Handles the incident queue, performs tasks as assigned and determines business impact.
  • Handles ad hoc requests and takes on new procedures as required.
  • When working on projects, identify and track project issues and dependencies, ensure follow-through, and make sure appropriate actions are taken to complete the project on time.
  • Recommend, implement and manage cloud automation using native cloud tools.
  • A minimum of five years of experience in performance analysis and monitoring across multiple areas, including Infrastructure, Application, Network and Security, for medium- to large-scale companies.
  • Bachelor’s degree in computer science or information systems or an equivalent combination of education, work experience and/or applicable certifications.
  • Expert knowledge of IT performance metrics. Experience with data management, report design, data visualization and presentation techniques.
  • Hands-on experience using open source and commercial tools such as LoadRunner/Performance Center, JMeter, Gatling and Locust, and APM tools such as Dynatrace, AppDynamics, New Relic and Splunk.
  • Ability to troubleshoot application performance and monitoring issues and provide detailed analysis.
  • Ability to provide documentation that other Performance Operations Engineers can use.
  • Provide runbooks for other departments to execute.
  • Recommend ideas to streamline and improve operations, and create processes to proactively identify potential issues.
  • Provide training and mentoring to other team members.
  • Ability to work independently.
  • Experience with one or more cloud platforms (Microsoft Azure, Amazon Web Services (AWS), Google Cloud or IBM Cloud) as it relates to performance, monitoring and cost management.
  • Expert experience with application and network performance management tools
  • Knowledge and understanding of microservices and web application protocols
  • Thorough understanding of throughput, latency, memory and CPU utilization
  • Knowledge of CI/CD technologies such as Jenkins, Ansible and Docker containers
  • Excellent communication, collaboration, reporting, analytical and problem-solving skills
  • Design, implementation and integration experience with Azure Monitor, New Relic, Splunk and infrastructure monitoring tools such as Nagios
  • Scripting expertise in one or more languages such as Python, PowerShell or Perl
  • Integration experience with third-party monitoring (log/event triggers), ticketing (event/workflow triggers) and orchestration/automation (event/workflow triggers) tools
  • Support the resolution of complex performance issues, including event correlation, resource optimization, tuning and triaging of performance problems across on-premises and cloud environments
  • Collaborate and work with other senior staff to recommend and design systems architecture and topology from both general and specific perspectives.
  • Interact with IT Operations teams to understand and communicate monitoring requirements, and provide support on an on-call rotation model

Would you like to wake up every day driven and inspired by our noble mission, and to work together as one global team to empower people to live a better life? Here at AXA we strive to lead the transformation of our industry. We are looking for talented individuals who come from varied backgrounds, think differently and want to be part of this exciting transformation by challenging the status quo, so we can push AXA, a leading global brand and one of the most innovative companies in our industry, on to even greater things.

In a fast-evolving world and with a presence in 64 countries, our 166,000 employees and exclusive distributors anticipate change to offer services and solutions tailored to the current and future needs of our 103 million customers.



We’re an integral part of AXA, serving 100+ million people worldwide.
Our goal is to support AXA entities around the world in empowering people to live better lives. By embracing technology, data and innovation, we’re helping AXA become a customer-focused, tech-led company. 

At AXA, our HR policy encourages diversity, supports the balance between your professional and private life, and accelerates skills and career development: promotion of diversity, remuneration policy, training programs, ... Discover everything that makes AXA an employer of choice.

Whatever your job is, we strive to offer you career opportunities. Our goal is to develop your skills to support the transformation of our changing business.