Senior Site Reliability Developer
This is a flexible position: you can work from our Toronto office full time, on a hybrid schedule, or entirely remotely.
About the Role:
Vena is looking for a Senior SRE to join our SaaS Technology and Operations (STO) team. This role is a match for you if you love building highly scalable, resilient and automated services.
We are an innovative team which aims to provide exceptional customer experience by leveraging best-in-class automation and orchestration practices for Vena's SaaS platform. As a Senior Site Reliability Developer, you will utilize your software and systems engineering background to build and run large-scale, distributed, fault-tolerant systems and services. We strive to hire people who are looking to make an impact and thrive in a flexible work environment driven by business objectives.
Your role is to ensure that our systems, both internal and external facing, are designed to maximize resiliency and uptime. Our team focuses on optimizing existing systems, building infrastructure and reducing toil through automation. Practices such as limiting time spent on manual operational work, running post-mortems and proactively identifying potential outages feed the iterative improvement that is key to both product quality and technical standards.
What you will do:
- Help Vena's technology organization build scalable systems, using best practices around automation (reliability) and developer self-service (velocity)
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, planning and reviews
- Define and document runbooks and standard operating procedures
- Own and lead projects as part of the STO team’s delivery strategy
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health
- Provide mentorship and training on emerging technologies and new processes to other Vena SREs and to members of the Product and Technology organization, and drive education and knowledge transfer of design patterns and technical practices
- Drive high standards around incident response practices and policies with a focus on automated response and remediation
- Participate in influencing and shaping the overall STO team culture
- Serve as a subject-matter expert in one or more technologies leveraged by the Vena platform
- Participate in technical interviews for positions within STO and, occasionally, for other positions within Vena’s product and technology organization
- Participate in the on-call rotation
What we use:
Please note this reflects only a portion of our current technical stack, and we are constantly evolving and revisiting our stack as we grow:
- A modern AWS cloud infrastructure managed through infrastructure-as-code (Terraform), configuration-as-code (Ansible), and CI/CD (Jenkins)
- RDS MySQL, Redshift, Redshift Spectrum, MongoDB, and Elasticsearch
- Kinesis, SQS, and RabbitMQ
- DevOps tools written in Python
- Monitoring with Datadog and CloudWatch
Does this sound like you?
- 6+ years of experience in an IT Operations, DevOps, Site Reliability Engineering, or Software Engineering role.
- You ideally possess an Associate or Professional level certification from AWS or Microsoft Azure.
- You are adept at core SRE concepts such as SLOs, SLIs and error budgets, and have direct experience implementing them.
- In-depth knowledge of cloud computing platforms (AWS and Azure) and solid experience setting up and managing cloud infrastructure using IaC and orchestration tools.
- You can write code, in any language. You have implemented your work in a production environment and can back it up with examples.
- In-depth experience with tools and platforms such as AWS, Azure, Ansible, artifact storage (such as Artifactory or ECR), build/release pipelines (such as Jenkins, GitLab, GitHub Actions, or equivalents), Docker, GitHub, Kubernetes, and Terraform.
- Direct experience with large-scale distributed systems in the cloud using observability and telemetry for oversight of code deployments.
- Experience with the operational aspects of software systems using telemetry, centralized logging, and alerting with tools such as CloudWatch, Datadog, and Prometheus.
Not checking every box?
Sounds like the job for you, but you don’t think you have what it takes on paper? Reach out to us anyway! We’re aware that members of marginalized groups typically apply only when they check every box. Vena is an inclusive workplace that considers all applicants. We value diversity—in professional backgrounds and in experiences—and are committed to providing equal opportunity and a sense of belonging for all employees and applicants. Let’s discover together whether you could be a great fit at Vena.
Why choose Vena?
- Total Rewards: Grow with Vena and celebrate its success with our Employee Stock Option Program (ESOP). We look ahead and invest in your future with our Retirement Savings Matching Program. We also provide comprehensive health benefits through our employer group plan effective from day one.
- Unique Culture: Join us in our ongoing commitment to build a diverse and inclusive workplace. Every voice, action and idea matters at Vena.
- Career Growth: We invest in your job training, professional development and continuing education and offer an Education Subsidy. Pursue your interests and chart your growth towards a new position on your current team or a new one. Vena had 240+ employee promotions and internal moves to new roles in 2022!
- Executive Leadership: Be inspired by our executive leadership as they lead and motivate our team.
- Read what employees say about working at Vena!
Please note: All interviews will be conducted using Zoom. We believe everyone should get to work from the location that best suits their job and lifestyle, whether that is at home, in the office, or a hybrid of the two.