Site Reliability Engineer


Sysdig is the secure DevOps company, and we’re at the forefront of the container and Kubernetes revolution. We are passionate, technical problem-solvers, continually innovating and delivering powerful solutions to secure and operate cloud-native applications in production. Our consistent contributions to open source software projects reflect our commitment to the open cloud movement.


As a Site Reliability Engineer on our Infrastructure team, you will help improve Sysdig's provisioning, monitoring, and cloud platform management. You have an aptitude for analytical and creative problem solving, and you are excited to use automation to manage the stability, availability, and scale of our infrastructure.

Your Responsibilities:

You will join a highly skilled and globally distributed team of SREs, and you can expect to:

Build solutions to enhance the observability, availability, performance, and resilience of the Sysdig SaaS and On-Premise products
Implement reliability improvement initiatives, including performance tuning and infrastructure optimization
Maintain and support the production environments and communicate directly with customer stakeholders
Participate in an on-call rotation with other SREs

Your Background

Experience managing Kubernetes clusters in a production environment
Solid understanding of Linux systems and networking
Proficiency with infrastructure as code/configuration management tools. We love Terraform, but you may have experience with Ansible, Chef, Puppet or SaltStack
Familiarity with monitoring tools such as Sysdig, Prometheus, Nagios, Icinga, Zabbix
Experience managing multi-tenant solutions with Cassandra, Elasticsearch, Kafka or Redis
Proficiency with SQL relational databases, preferably PostgreSQL and MySQL
Command of a scripting language such as Python or Bash
Knowledge of CI/CD concepts; hands-on experience is a strong plus
Experience supporting a customer-facing product hosted in a public or private cloud ecosystem
Experience diagnosing and troubleshooting complex problems in high-throughput web applications and network services
Strong sense of ownership and a focus on customer delight

Job Information

Status: Open
Job type: Full Time
Salary: Negotiable
Publish date: 21 Nov 2020
Expire in: 4 weeks
