Engineer - Platform Operations

Core42


Date: 3 weeks ago
City: Abu Dhabi
Contract type: Full time
Overview

The opportunity

A Platform Engineer is responsible for designing, building, and maintaining the infrastructure that supports high-performance computing tasks and AI workloads. They ensure the scalability, reliability, and efficiency of computing platforms, integrating hardware and software to optimize performance. Additionally, they collaborate with data scientists and developers to troubleshoot and enhance platform capabilities, enabling advanced computational tasks and innovations.

Core42 is the UAE’s national-scale enabler for cloud and generative AI, combining G42 Group’s expertise across multiple technology disciplines into a single platform for public sector and large enterprise transformations. Building on our capabilities as a sovereign cloud and HPC specialist, we bring generative AI, cybersecurity, and professional and managed services expertise to enable national-scale program deployments across industries.

Responsibilities

Objectives of this role:

  • Develop and deploy scalable and efficient computing platforms to support AI and HPC workloads, ensuring they meet performance, reliability, and security requirements.
  • Continuously optimize system performance by tuning hardware configurations, software parameters, and network settings to maximize throughput and minimize latency for AI and HPC applications.
  • Integrate various tools and technologies to streamline workflows, automate repetitive tasks, and enhance overall system efficiency and manageability.
  • Implement monitoring solutions to track system health and performance, promptly identifying and resolving issues to ensure minimal downtime and optimal functionality.
  • Work closely with data scientists, researchers, and developers to understand their needs, provide technical support, and make adjustments to the platform to accommodate evolving requirements.

Key Responsibilities

  • Design, deploy, and maintain the underlying hardware and software infrastructure necessary for AI and HPC applications, ensuring it is scalable and robust.
  • Monitor and optimize system performance by fine-tuning configurations, managing resources, and implementing best practices to achieve maximum efficiency.
  • Develop and implement automation scripts and tools to streamline repetitive tasks, deployment processes, and system updates.
  • Integrate various technologies, including cloud services, databases, and AI frameworks, to create cohesive and effective computing environments.
  • Diagnose and resolve technical issues related to the platform, providing support to developers and data scientists to address performance bottlenecks and system failures.
  • Ensure that the computing platform adheres to security standards and compliance requirements, implementing measures to protect data and infrastructure.
  • Maintain detailed documentation of system configurations, processes, and procedures, and generate reports on system performance and resource utilization.
  • Work closely with cross-functional teams, including data scientists, researchers, and software engineers, to understand their needs and provide solutions that support their objectives.

Qualifications

Required skills and qualifications

  • A bachelor's degree in Computer Science, Engineering (such as Electrical or Software Engineering), Information Technology, or any related field.
  • 5 or more years of experience in platform engineering, systems administration, or a related field, with a focus on high-performance computing or large-scale infrastructure management.
  • Hands-on experience with AI or HPC environments, including managing and optimizing computational resources, for example on HPC clusters, cloud computing platforms, or AI frameworks.
  • Demonstrated experience with relevant technologies such as Linux/Unix systems, cloud platforms (e.g., AWS, Azure), scripting languages, and performance tuning tools.
  • Proven track record of working on projects involving the design, implementation, and optimization of complex computing platforms, ideally with examples of successfully managed AI or HPC workloads.

Preferred Skills And Qualifications

  • Knowledge of security best practices and tools for protecting infrastructure and data, including experience with identity management and access controls.
  • Strong analytical and troubleshooting skills to quickly identify and resolve technical issues that impact system performance or stability.
  • Effective verbal and written communication skills for collaborating with cross-functional teams and documenting technical processes.
  • Several years of experience in platform engineering, systems administration, or related

What We Look For

If you are a performance-driven, inquisitive mind with the agility to adapt to ambiguity, you will fit right in. You should be eager to explore opportunities to build meaningful collaborations with stakeholders and aspire to create unique customer-centric solutions. A bias for action and a passion to conquer new frontiers in the AI space are at the heart of the Core42 community.

What Working At Core42 Offers

Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.

Career: Outstanding learning, development & growth opportunities via structured training programs and innovative, high-tech projects.

Work-Life: A hybrid work policy to strike the perfect balance between office and home.

Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.

If you can confidently demonstrate that you meet the criteria above, please contact us as soon as possible.
