Would you like to play a part in shipping groundbreaking technology for large-scale systems, natural language, and artificial intelligence? Join the Siri ML Systems Evaluation Engineering (MLSEE) team at Apple! We interact with various engineering teams to identify issues and test requirements, drive the quality of Siri features, and create frameworks that support evaluation and qualification. You will work with the people who craft the intelligent assistant that helps millions of people get things done — just by asking!
As a senior manager, you will play a critical role in driving the evaluation methods of various platforms and product areas for Siri. You will drive coverage and fidelity of testing and evaluation for pre-ship hill climbing and readiness for a wide range of AIML products, including Apple Intelligence features. You will be crafting new and leveraging existing methodologies and frameworks to validate the end-to-end quality of new features. Working alongside SW and ML engineering product teams as a critical partner, you will utilize a suite of available libraries and frameworks – test automation, LLM-based evaluators, and failure analysis tools – and drive requirements for improving those. You will be responsible for benchmarking performance for new features/platforms and driving improvement recommendations.
We are seeking an expert senior engineering manager to lead evaluation and qualification of Siri’s intelligent services and platforms (iPhone, HomePod, CarPlay, Apple Watch, iPad, MacBook). You will be responsible for partnerships with product engineering that drive the test and evaluation strategy, the evaluation and performance-testing frameworks, and solutions that can scale. You will work alongside SW and ML engineering product teams as a critical partner. You will need to drive efficiency through the adoption of LLMs for testing and evaluation, including evaluators, auto-triage of issues, and auto-error-attribution.
This role will collaborate with Siri Engineering teams, create a quality strategy in partnership with the product roadmap, work on client integration and performance automation frameworks, analyze results, and reach consensus as we assess the impact of changes, new features, and overall readiness for each release. This is a fast-paced role with high visibility, impact, and influence in building one of the most advanced AI systems in the industry.
5+ years of leadership experience
Experience with Machine Learning development and evaluation
Experience with building and shipping products at scale
Proven and consistent track record of forming partnerships to solve complex technical problems at scale, ideally for products with a global customer base
MS/PhD in Machine Learning, Computer Science, or a related field