Site Reliability Engineer II, Infrastructure Services

HashiCorp
United States, California, San Francisco
Feb 13, 2025

Our Team

The Infrastructure Services team builds and maintains the backbone of HashiCorp's cloud products. We focus on creating reliable, scalable, and secure infrastructure services that enable engineering teams to move quickly without breaking things. Instead of just keeping the lights on, we're constantly improving automation, reducing toil, and making infrastructure more self-service and developer-friendly. We work with Nomad, Consul, Vault, Terraform, and AWS services to power HashiCorp's cloud offerings. Our mission is to provide infrastructure that's easy to use, resilient, and secure by default so product teams can focus on delivering great experiences to customers.

About this Role

As a Site Reliability Engineer II on the Infrastructure Services team, you will help build, maintain, and improve the infrastructure that supports all HashiCorp cloud products. You will work alongside skilled engineers to ensure our systems are reliable, scalable, and secure while gaining hands-on experience in operating and automating cloud infrastructure. This role is ideal for an engineer looking to deepen their expertise in site reliability engineering, learn from senior engineers, and take on increasing responsibility over time.

In this role, you can expect to:

- Contribute to the development and maintenance of core infrastructure services, ensuring reliability, scalability, and security
- Implement automation to improve operational efficiency and reduce manual toil
- Assist in monitoring, alerting, and logging improvements to enhance system observability
- Debug and address medium-complexity infrastructure issues with guidance from senior engineers
- Participate in on-call rotations after an initial onboarding period, learning incident response best practices
- Work within established team practices, exercising self-directed judgment on tasks while seeking guidance when necessary
- Propose and implement improvements to existing infrastructure components and deployment processes
- Write and maintain documentation for infrastructure configurations, procedures, and troubleshooting guides
- Collaborate with other teams to understand infrastructure needs and contribute to solutions
- Shadow interviews for entry-level candidates and participate in discussions on hiring evaluations

You may be a good fit for our team if you:

- Have experience in site reliability engineering, cloud infrastructure management, or systems administration
- Are familiar with cloud platforms such as AWS and infrastructure as code tools like Terraform
- Have some experience with observability tools such as Datadog, Prometheus, or Grafana
- Enjoy problem-solving and working through operational challenges
- Are comfortable writing scripts or simple automation in languages such as Python, Go, or Bash
- Communicate skillfully and collaborate well in a team environment
- Are interested in growing into a senior SRE role and learning from skilled engineers
- Have a growth mindset and seek continuous improvement in processes and technical skills

Individual pay within the range will be determined based on job-related factors such as skills, experience, and education or training.

The base pay range for this role in the SF Bay Area / NYC area is: $151,300 - $178,000 USD

The base pay range for this role in California (excluding SF Bay Area), New York (excluding NYC), Seattle Metro, Denver / Boulder Metro, Washington D.C., or Maryland is: $138,600 - $163,100 USD

The base pay range for this role in Colorado (excluding Denver / Boulder Metro), Illinois, Minnesota, or Washington (excluding Seattle Metro) is: $126,100 - $148,300 USD