Site Reliability Engineer (SRE) – Object Storage – 200562944 – Seattle, Washington, United States

Apple

People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it.

Apple Cloud infrastructure is BIG. The storage SRE teams of Apple Cloud are building and running the next-generation distributed storage systems to support Apple’s most critical services. Operating at our scale, across multiple geographically dispersed data centers, and serving users with vast data needs presents unique challenges. As a Storage SRE at Apple, you’ll need to tackle these problems using your deep understanding of storage, data analysis, and programming, along with teamwork and expertise in Linux system internals. Storage SREs at Apple involve themselves across the full infrastructure stack, from tuning the block storage layer to content delivery network traffic management!

We are looking for seasoned software and systems engineers to join the Object Storage SRE team at Apple. The role involves a tremendous amount of individual responsibility and influence over the direction of the platform, shaping its use by many critical Apple Cloud services for years to come. You are solution-oriented and have a passion for software delivered as a service to improve reuse, efficiency, and simplicity. Your work will affect hundreds of millions of users and be essential to the success of some of the most visible current and future Apple features.

The role involves understanding the team’s priorities, taking ownership of projects or deliverables, crafting solutions and building buy-in for those designs, and successfully delivering those designs to meet the project goal. It also involves giving technical feedback to colleagues to assist them in the delivery of their designs, features, and projects, as well as driving technical standards across the two-site team in collaboration with other senior members of the team.

The team has an on-call rota that includes weekends, and the successful candidate should expect to handle alerts and other critical issues in order to maintain a high level of availability and functionality for the services we provide. The team is divided between two locations, and cross-timezone meetings are a core feature of how our team collaborates, reaches agreements, and executes to deliver projects.

At Apple Cloud, we run a mix of open source, vendor licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software development & deployment, logging, and monitoring. You’ll learn these tools and have opportunities to improve them. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

The candidate may be expected to travel to different Apple locations nationally or internationally.

7+ years experience in building, operating, and scaling distributed storage systems in a private, public, or hybrid cloud environment, or working with other large-scale, stateful systems such as distributed databases.

The ability to design, author, understand, and release code in languages like Go (preferred), Java, Python, or Rust.

Good understanding of block, object, and file storage solutions in Linux (such as LVM, XFS, ext4, S3, Ceph, Gluster, NFS).

Understanding of Linux internals, standard networking protocols, and distributed systems.

Experience with provisioning, data migration, backup & recovery, at-scale testing, disaster recovery, and capacity planning.

Job Overview