Senior Site Reliability Engineer - Platform
Coalition is the world's first Active Insurance provider designed to help prevent digital risk before it strikes. Founded in 2017, Coalition combines broad insurance coverage with a digital risk assessment and continuous security monitoring to help organizations protect themselves in today’s hyper-connected world.
Coalition offers its Active Insurance products in the U.S., U.K., and Canada through relationships with leading global insurers including Allianz, Arch Insurance, Lloyd’s of London, Swiss Re and Zurich, as well as cyber capacity through its own carrier, Coalition Insurance Company. Coalition's Active Risk Platform provides automated security alerts, threat intelligence, expert guidance, and cybersecurity tools to help businesses worldwide remain resilient against cyber attacks.
Coalition comprises a team of cybersecurity and technology experts, as well as experienced insurance professionals, who have come together to build a world-class organization with a massive technological advantage. Our secret sauce is bringing this expertise together behind one mission: to protect the unprotected as the world digitizes. Today, Coalition is one of the world’s largest commercial insurtechs, serving hundreds of thousands of customers worldwide.
Since its founding, Coalition has raised $755 million in equity funding, including $250 million in June 2022, affirming its ability to deliver profitable growth and cementing its position as a long-term business with a clear competitive advantage.
Coalition’s exceptional growth stems from its ability to address real-world problems for organizations of all sizes and from remaining true to our founding values of character, humility, responsibility, purpose, authenticity, and inclusion. We are proud to have been named among Inc.’s Best Workplaces in 2021 and 2023, and one of Fast Company’s Most Innovative Companies in 2022.
About The Role
We are looking for a Senior Site Reliability Engineer (Remote) who has the experience and ability to instrument and monitor the breadth of our full platform stack (hosts, applications, and performance). In this role you will work closely with our engineering and information security teams to enhance the automated system provisioning and deployment subsystems within codified infrastructure. You will work with developers to create more scalable services and help us build self-service paved roads that simplify writing services and provisioning infrastructure. You will help isolate, trap, and respond to inevitable system failures, and develop strategies for continuous monitoring and analysis that reduce both downtime and the need for manual intervention. You will participate in an on-call rotation to maintain platform SLAs.
Our core platform is written mostly in Python, with some services in Java and Go. We prefer to use the right tool for the job and make pragmatic decisions about how to scale and decouple systems as we continue to grow. We’re looking for someone who can navigate a cloud environment (AWS) with many moving pieces and systems, and help the team understand how they fit into the broader puzzle. If you are a software engineer looking to move into SRE, this may be for you.
- 5+ years of experience in SRE/DevOps/Cloud engineering or Software Development roles in a full stack engineering environment
- Must have experience with a customer-facing production environment using containerization and orchestration tools such as ECS, Kubernetes, or Swarm
- Solid development experience in Go or Python, writing production-grade shared libraries and tooling
- Experience working with fault-tolerant services and the iterative development of highly available systems
- Experience with running a production environment in one or more Infrastructure as a Service cloud providers (AWS/Azure/Google Cloud)
- Prior experience with full-stack monitoring from system level metrics to SLOs, failure-based testing approaches, and monitoring strategies
- Understanding of CI/CD pipelines to accelerate deployments and improve both security and auditability (e.g., GitHub Actions, Jenkins, Travis, or CircleCI)
- Experience soliciting systems requirements, designing, and implementing new platform components leveraging infrastructure or SaaS services
- Some knowledge of software engineering design patterns, agile development, and architecture principles
- Excellent organizational, verbal, and written communication skills
- Ability to mentor junior engineers in SRE best practices and software engineering
- Experience working in an agile methodology development lifecycle
- Bachelor’s or Master’s degree in Computer Science, related field, or equivalent experience
- Experience with converting monolithic applications to microservices and service discovery technology
- Experience automating system provisioning, configuration, and Infrastructure as Code (CloudFormation, Terraform, Ansible, etc.)
- Exposure to systems security requirements, information assurance techniques, and system hardening
- Exposure to Kafka, AMQP, Kinesis, job queues, and other pub/sub queuing systems
To learn more, check out our featured press releases:
- Coalition Closes $250 Million in Series F Funding, Valuing the Cyber Insurance Provider at $5 Billion
- Coalition Named to Fast Company’s Annual List of the World’s Most Innovative Companies for 2022
- Coalition Launches Active Insurance, Reaches $650M Run Rate GWP
- Coalition launches tech-powered executive risks products with personalized risk assessment for all US small-businesses