Staff Software Engineer, Infrastructure (SRE)
Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest.
Affirm’s Infrastructure Platform team is building a large-scale, massively distributed, fault-tolerant global infrastructure shared across multiple financial products, merchants and vendors. Ensuring that our infrastructure is openly available to engineers is a critical part of Affirm’s success story. We pride ourselves on an engineering culture of careful design and architecture, writing detailed tech specs, and capturing feedback before making large changes to systems.
We are looking for a Staff Site Reliability Engineer to join our Site Reliability Engineering team: someone with deep technical knowledge, a passion for Linux, networking, microservices and distributed architectures, and experience running large-scale services. Our goal is to make Affirm's global, service-oriented product and infrastructure stack observable, highly resilient, scalable and fault tolerant while meeting our high uptime SLAs. You will excel if you have a passion for digging deep and a flair for sharp technical communication, prioritization, and organization. You will work directly with our Platform / Infrastructure and Product Development teams to build our next-generation “always up” cloud-based platform.
Our work spans Observability/Telemetry Engineering, Reliability and Scalability Engineering, Chaos Engineering, Performance Engineering, Capacity Engineering and Disaster Recovery Engineering, as well as working closely with the security team on application-level security.
Site Reliability Engineers are hybrid System, Software, Data and Network Engineers who are responsible and accountable for building and scaling reliable systems that impress our customers.
What you'll do
- Own the end-to-end availability, reliability and performance of mission-critical services
- Troubleshoot various issues around reliability, resiliency, scalability and availability.
- Define and measure SLIs, SLOs and SLAs
- Augment instrumentation to build a cohesive dependency map, with special attention to points of failure
- Build command-and-control automation to quickly fail away, reducing time to repair (TTR), cutting manual work and eliminating toil
- Assist with on-call and triage rotations
What we look for
- Linux, Networking and AWS experience
- Experience with containerization and container platforms (e.g., Docker, Kubernetes)
- Familiarity with Elasticsearch, Kibana/Grafana, Logstash and Kafka, and with ways to scale these systems
- Incident management that leads to lower time to repair
- Fostering reliability engineering culture
- Experience with automation systems (Ansible, Puppet, Terraform) is a plus; SaltStack preferred
- Software development experience in Python/Kotlin/Go is a plus
- Experience with high-performance networking (QUIC, network layer optimization) or real-time transaction protocols/methods (HTTP/2, Server-Sent Events, MQTT, WebSockets)
- Ability to recommend or help architect an entire system; expertise in capturing and interpreting network traffic with tcpdump, snoop, and other network sniffers; working knowledge of most protocols (TCP/IP, HTTP, UDP, etc.)
USA Pacific base pay range (CA, WA, NY, NJ, CT): $190,000-$284,900
Sapphire base pay range (all other U.S. states): $171,000-$256,500
Affirm is proud to be a remote-first company! The majority of our roles are remote and you can work almost anywhere within the country of employment. Affirmers in proximal roles have the flexibility to work remotely, but will occasionally be required to work out of their assigned Affirm office. A limited number of roles remain office-based due to the nature of their job responsibilities.
We have a simple and transparent remote-first grade-based compensation structure. Offer amounts within the range are based on a number of factors including but not limited to job-related skills, experience, and relevant education or training. Across the broader organization, certain roles are eligible for equity awards upon hire, promotion, tenure milestones and for performance.
We’re extremely proud to offer competitive benefits that are anchored to our core value of people come first. Some key highlights of our benefits package include:
- Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents
- Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
- Time off - competitive vacation and holiday schedules allowing you to take time off to rest and recharge
- ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount
We believe It’s On Us to provide an inclusive interview experience for all, including people with disabilities. We are happy to provide reasonable accommodations to candidates in need of individualized support during the hiring process.