Senior Engineer - Site Reliability
Inception
Date: 10 hours ago
City: Abu Dhabi
Contract type: Full time

Overview
As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability, availability, and performance of Azure services by designing scalable, secure systems, automating operations, managing incidents, and collaborating across teams on continuous improvement and robust disaster recovery.
The opportunity
Inception is the UAE’s national-scale enabler in AI Research and Development. Partnering with Microsoft's AI SaaS, we offer domain-specific Agentic AI Orchestrator platforms utilizing reasoning agents for precise and cost-effective services. Our focus includes AI incubation, IP creation, applied AI R&D, and AI investment products. By creating models tailored to specific domains and languages, we ensure superior accuracy and efficiency. We collaborate with top universities and industry giants to drive significant advancements in AI technology within the region.
Responsibilities
- Ensure the reliability, availability, and performance of Azure-based services and infrastructure, meeting strict SLAs and business requirements.
- Design, implement, and maintain highly scalable, resilient, and secure systems within Azure environments.
- Automate repetitive operational and deployment tasks using scripting (Python, Go, Bash), infrastructure-as-code (Terraform, Bicep, Ansible), and CI/CD pipelines to streamline processes and reduce manual intervention.
- Monitor system performance using advanced tools (Azure Monitor, Prometheus, Grafana), proactively identify issues, and implement solutions to prevent service disruptions.
- Lead incident response, perform root cause analysis, and manage post-incident reviews to ensure continuous improvement and reliability.
- Develop, document, and enforce best practices for system operations, security, and compliance within Azure environments.
- Work closely with development, security, and operations teams to enhance system design, implement security controls, and support modern application platforms (Docker, Kubernetes).
- Participate in on-call rotations to provide rapid response and resolution for critical incidents.
- Utilize IT Service Management tools (ServiceNow, Jira) for incident tracking, change management, and security automation.
- Collaborate with cross-functional teams to analyze trends, resolve persistent issues, and implement enhancements to products and processes.
- Demonstrated experience in team leadership and mentoring is required.
- Must possess knowledge of Scrum, ITIL, Agile methodologies, and ISO 27001 ISMS processes and standards, and have experience interfacing with external auditors.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Minimum of 10 years of experience as a Site Reliability Engineer, with significant expertise in Azure cloud environments.
- Strong knowledge of Azure cloud services, networking, and security.
- Proficiency in scripting languages (Python, Go, Bash) and infrastructure-as-code tools (Terraform, Bicep, Ansible).
- Experience with CI/CD pipelines, Docker, and Kubernetes for deployment automation.
- Hands-on experience with monitoring tools (Azure Monitor, Prometheus, Grafana).
- Proven track record in incident management and troubleshooting.
- Excellent problem-solving, communication, and collaboration skills; attention to detail and a commitment to continuous learning.
If you have a performance-driven, inquisitive mind and the agility to adapt to ambiguity, you will fit right in. You should be eager to explore opportunities to build meaningful collaborations with stakeholders and aspire to create unique customer-centric solutions. A bias for action and a passion for conquering new frontiers in the AI space are at the heart of the Inception community.
What working at Inception offers
Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.
Career: Outstanding learning, development & growth opportunities via structured training programs and innovative, high-tech projects.
Work-Life: A hybrid work policy to strike the perfect balance between office and home.
Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.
If you can confidently demonstrate that you meet the criteria above, please contact us as soon as possible.