Lead Site Reliability Engineer

Presight


Date: 20 hours ago
City: Abu Dhabi
Contract type: Full time
Overview

Role: Lead Engineer – Site Reliability

Location: Abu Dhabi

About Presight

Presight, an ADX-listed public company limited by shares whose majority shareholder is Abu Dhabi company G42, is the region’s leading big data analytics company powered by Artificial Intelligence (“AI”). It combines big data, analytics, and AI expertise to serve every sector, of every scale, to create business and positive societal impact. With its world-class computer vision, AI and omni-analytics platform as its engine, Presight leverages all-source data to support insight-driven decision making that shapes policy and creates safer, healthier, happier, and more sustainable societies.

The Opportunity

We are seeking a meticulous, expert Lead Engineer – Site Reliability to build and support the Presight delivery model, which empowers product and technology teams to develop and deliver high-quality products, improve platform infrastructure, and strengthen the reliability of products and solutions. You will play a key role in defining and establishing the delivery model used to develop cutting-edge, next-generation analytics solutions and services at Presight.

Responsibilities

Key Responsibilities:

As a Lead Engineer – Site Reliability, you will work with relevant stakeholders to drive reliability, performance, and scalability across our infrastructure. You will own the SRE roadmap and guide its implementation through mentorship, code contributions, and hands-on infrastructure work. You will also partner closely with Engineering, Data Science, and Product teams to embed reliability into the development lifecycle.

Functional

  • Architect and lead reliability strategies across services and environments.
  • Define and enforce SLOs, SLIs, and error budgets with engineering leadership.
  • Lead incident response and root cause analysis.
  • Implement automation to reduce toil and improve system resilience.
  • Manage capacity planning, traffic forecasting, and cost optimization.
  • Mentor junior and senior SREs in technical and process excellence.
  • Collaborate with MLOps, DevSecOps, and CloudOps teams to enforce best practices.
  • Champion observability, metrics-driven decisions, and platform maturity.
  • Deploy monitoring tools such as Prometheus and Grafana to track system performance.
  • Ensure system reliability adheres to security and compliance standards, particularly within regulated sectors.
  • Comply with QHSE (Quality Health Safety and Environment), Business Continuity, Information Security, Privacy, Risk, Compliance Management and Governance of Organizations policies, procedures, plans and related risk assessments.

Qualifications

Required Skills:

  • Bachelor's Degree in Computer Engineering or related field.
  • Minimum 10 years of experience in site reliability with 2 years in people management.
  • Expertise in Kubernetes, CI/CD (e.g., GitLab), and infrastructure-as-code (Terraform/Helm).
  • Strong experience in cloud (Azure, AWS, or GCP).
  • Experience with multi-tenant systems or high-throughput data platforms.
  • Exposure to AI/ML infrastructure or MLOps pipelines.
  • Proven background in SRE principles, SLIs/SLOs, and reliability-focused engineering.
  • Programming proficiency in Python or Shell (nice to have).
  • Deep understanding of distributed systems, networking, and incident management.

Ideally, you’ll also have

  • A highly detail-oriented and methodical approach to problem solving.
  • A passion for technology, troubleshooting and customer service.
  • A strongly analytical mind.
  • Great verbal and written communication skills.

What We Look For

Join us at Presight, where we offer a culture of innovation, outstanding career growth opportunities, and competitive rewards. If you're eager to conquer new frontiers in AI and thrive in a dynamic environment, we welcome you to our community.

What Working At Presight Offers

Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.

Career: Accelerate your career through high-impact projects and access to resources for continuous growth and learning opportunities.

Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.

