AIML – Software Engineer (ML Efficiency), Machine Learning Platform & Infrastructure – 200571396 – Cupertino, California, United States

Apple

Do you want to shape the platform that enables the next generation of intelligent experiences on Apple products & services? In Apple's Machine Learning Platform Technology & Infra team, we have built the platform that Apple uses for developing machine learning, artificial intelligence, and computer vision applications. As a team, we have a variety of technical backgrounds, from machine learning PhDs to builders of large-scale production systems.

Specifically, in this role you will optimize the end-to-end system performance of distributed machine learning workloads. This is a highly collaborative role, and you will work with key partners across the company.

We are seeking highly motivated and experienced engineers to join our team. The ideal candidate will have a deep understanding of machine learning systems and cloud computing infrastructure. Key responsibilities in this role are:

Engage with ML researchers to optimize end-to-end performance of large-scale distributed ML workloads
Analyze workload metrics to identify sources of inefficiency and work with users to understand and optimize ML workloads
Conduct workload analysis based on benchmarking key workloads on deployed systems
Improve large-scale training resiliency by optimizing applications and frameworks for improved recovery from failures and preemptions
Influence architecture, design, development, and operations of next-generation ML accelerator systems based on workload insights

Experience working with large-scale parallel and distributed accelerator-based systems
Experience optimizing performance and AI workloads at scale
Experience developing code in one or more training frameworks (such as PyTorch, TensorFlow, or JAX)
Experience in performance analysis and optimization on cloud accelerators
Deep understanding of computer systems and the interactions between HW and SW
Strong communicator with the ability to analyze complex and ambiguous problems
Programming and software design skills (proficiency in C/C++ and/or Python)
Experience working in a highly collaborative environment and promoting a teamwork mentality

 

Job Overview