Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or Apple Store experience we deliver is the result of us making each other’s ideas stronger. That happens because every one of us shares a belief that we can make something wonderful and share it with the world, changing lives for the better. It’s the diversity of our people and their thinking that inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something — you’ll add something.
People at Apple don’t just build products; they craft the kind of experience that has revolutionized entire industries. The diversity of our people and their ideas encourages innovation in everything we do. Bring passion and dedication to your job and there’s no telling what you could accomplish. Join Apple’s Cloud Network Infrastructure team as a Site Reliability Engineer to help support and scale cloud services.
The Apple Networking team builds software-defined cloud network infrastructure as part of Apple Cloud. Our infrastructure is a critical foundation for delivering Apple’s services (such as iCloud, iTunes, Siri, and Maps) to billions of customers. We are a fast-paced organization where drive and collaboration are the keys to success. Teams across Apple rely on us for infrastructure that helps them build services that scale globally, are highly available, and “just work”.
As a Site Reliability Engineer, you will be responsible for maintaining the high availability, scale, and resilience of the cloud network services. The successful candidate is expected to be highly self-motivated, with a passion for excellence, quality, and detail.
In this role, you will engage in the entire service lifecycle, from product release through deployment, operation, and refinement. The key responsibilities are as follows:
– As part of launch readiness, support activities such as system design engineering, developing software tools and platforms, planning and managing capacity, and conducting launch reviews.
– Maintain service quality by monitoring and improving availability, performance, and health.
– Proactively design and implement processes that mitigate risk and reduce impact radius, incident detection time, and resolution time. Deliver sustainable incident response practices, learning from experience through blameless postmortems.
– Collaborate with cross-functional teams in driving service integrations, resolving dependencies and representing the service offerings.
– Experience in crafting and operationalizing large-scale, distributed, fault-tolerant, multi-tenant services
– Experience with operating systems and network fundamentals
– Experience in API design and interface technologies (JSON, ProtoBuf, REST, RPC, XML, etc.)