Senior IT Operations & Applications – Dubai
Key Responsibilities:
• Ensure the stability, availability, and performance of digital platforms including mobile applications, websites, APIs, CRM, and ERP systems.
• Lead proactive monitoring across applications, databases, infrastructure, and network environments using enterprise monitoring solutions such as Dynatrace and ManageEngine.
• Detect potential service degradation at an early stage, implement corrective measures, and minimize business impact.
• Continuously evaluate operational health and identify architectural or solution gaps affecting long-term scalability and sustainability.
• Establish robust observability frameworks with meaningful service metrics, thresholds, and alerts aligned to business KPIs.
• Drive continuous improvement initiatives to transition operations from reactive support to predictive and proactive service management.
• Identify unstable or inefficient solutions and lead initiatives to enhance resilience, scalability, and operational efficiency.
• Promote operational excellence by leveraging incident trends, performance analytics, and lessons learned to improve services.
• Lead major incident response activities, ensuring timely containment, communication, resolution, and service restoration.
• Own end-to-end Root Cause Analysis (RCA) processes, ensuring accurate identification of root causes and contributing factors.
• Drive remediation plans through to completion and implement preventive controls to avoid recurring incidents.
• Monitor incident patterns and systemic risks, escalating critical issues and recommending long-term corrective actions.
• Act as the production governance lead for all changes, releases, and deployments to ensure production stability.
• Define and enforce governance processes around production readiness, risk validation, and change management controls.
• Oversee deployment activities to ensure minimal disruption, rollback preparedness, and uninterrupted service delivery.
• Continuously enhance release and change management practices to balance agility with operational reliability.
• Own Disaster Recovery (DR) strategies, runbooks, and recovery procedures across all B2B services.
• Ensure DR frameworks, RTOs, and RPOs align with business priorities, regulatory standards, and contractual commitments.
• Plan and execute regular DR drills, documenting outcomes, risks, and improvement opportunities.
• Embed lessons learned from DR exercises and production incidents into operational procedures and system architecture.
• Maintain DR documentation and evidence required for audits, compliance reviews, and risk management activities.
• Prepare and present regular reports on service performance, availability, incidents, and operational risks to business and management stakeholders.
• Translate technical metrics into business-focused insights and actionable recommendations.
• Ensure clear and transparent communication during incidents and service disruptions to maintain stakeholder confidence.
• Collaborate closely with digital, infrastructure, delivery, and security teams throughout the service lifecycle.
• Manage operational engagement with external technology vendors and digital service partners.
• Ensure vendor deliverables consistently meet agreed SLAs, quality benchmarks, and operational standards.
• Challenge vendors on incident root causes, remediation effectiveness, and long-term service improvements.
• Align vendor operations with organizational governance, resilience, and operational expectations.
• Lead remediation efforts for audit findings related to application operations, service availability, and operational controls.
• Coordinate with internal teams and vendors to ensure timely closure of audit observations.
• Strengthen operational processes through preventive controls to minimize recurring audit findings.
• Maintain operational documentation, runbooks, escalation procedures, and audit evidence to support compliance requirements.
• Drive adoption of innovative technologies and operational improvements to modernize service operations.
• Champion automation initiatives including Agentic AI, AIOps, and RPA to improve efficiency and reduce manual intervention.
• Assess emerging operational technologies and lead controlled implementation into production environments.
• Support the evolution of operations toward intelligent, scalable, and self-healing service models.
Requirements
- Minimum of 10 years of experience in IT Operations & Applications
- Good understanding of database systems (Azure Cosmos DB, Azure SQL Database)
- Good understanding of networking & firewalls (F5, Palo Alto, Fortinet, Azure Front Door, Azure Networks)
- Strong understanding of Azure cloud infrastructure (Azure IaaS, Azure Blob Storage, Azure Compute, Azure PaaS, Azure Web App, Azure Kubernetes Service (AKS), Azure Functions, Azure Logic Apps, Azure DevOps)
- Strong understanding of digital solutions, iOS apps, Android apps, web/portal technologies, and API gateways (MuleSoft, Azure API Management)
- Good understanding of MS Dynamics CRM and ERP (required)
- Posted: May 20, 2026
- Type: Full-time
- Level: Mid-Senior
- Location: Dubai
- Company: Kingston Stanley