Senior Engineer - Site Reliability Engineering - Emirati Talent

KATIM


Date: 2 weeks ago
City: Abu Dhabi
Contract type: Full time
About KATIM

KATIM is a leader in the development of innovative secure communication products and solutions for governments and businesses. As part of the Electronic Warfare & Cyber Technologies cluster at EDGE, one of the world’s leading advanced technology groups, KATIM delivers trust in a world where cyber risks are a constant threat, and fulfils the increasing demand for advanced cyber capabilities by delivering robust, secure, end-to-end solutions centered on four core business units: Networks, Ultra Secure Mobile Devices, Applications, and Satellite Communications.

The Senior SRE Engineer is responsible for ensuring the reliability, scalability, and performance of mission-critical systems and services. This role combines software engineering and operations expertise to automate processes, optimize infrastructure, and reduce toil. Acting as a bridge between development and operations, the Senior SRE Engineer drives continuous improvement in availability, observability, and incident response, while mentoring junior team members and promoting a culture of reliability across the organization.

Key Responsibilities:

  • Design, implement, and maintain highly available, scalable, and resilient infrastructure and services.
  • Develop automation frameworks and tools to improve deployment, monitoring, and operational processes.
  • Lead incident response and root cause analysis (RCA), and implement permanent fixes to improve system reliability.
  • Collaborate with development and infrastructure teams to embed reliability and performance best practices into the product lifecycle.
  • Define and monitor SLOs/SLAs to ensure service quality and client satisfaction.
  • Drive capacity planning, performance tuning, and cost optimization initiatives.
  • Mentor junior engineers and contribute to knowledge sharing, standards, and documentation.
  • Stay current with industry trends and emerging technologies to propose innovative solutions.

Experience and Education:

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 7–10 years of overall IT experience, with at least 4–5 years in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.
  • 3+ years of operations experience in a production, customer-facing environment.
  • Hands-on experience managing large-scale distributed systems and production environments.
  • Proven experience in incident management, performance tuning, and capacity planning.
  • Strong expertise in Linux/Unix administration and scripting (Python, Bash, Go preferred).
  • Proficiency with containerization and orchestration technologies (Docker, Kubernetes, Helm).
  • Experience with cloud platforms (AWS, Azure, GCP) and on-prem hybrid environments.
  • Knowledge of CI/CD pipelines and automation frameworks (Jenkins, GitLab CI, ArgoCD, Terraform, Ansible).
  • Solid understanding of networking, security, and load balancing.
  • Experience with observability stacks (Prometheus, Grafana, ELK/EFK, OpenTelemetry).
  • Knowledge of database operations (PostgreSQL, MySQL, NoSQL).

Key Skills:

  • Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
  • Kubernetes certification: Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).
  • Cloud certifications (AWS Solutions Architect, Azure Administrator, or GCP Professional Cloud Engineer).
  • Proven track record of working in Agile/Scrum environments and using tools like Jira and Confluence.
  • Exceptional communication and collaboration skills, with the ability to work effectively in cross-functional teams.
