Site Reliability Engineer
The vacancy has expired
- Location: Bangalore, India
- Industry: Engineering & Automotive
Job Description
Roles
- Build a fully automated observability platform with continuous metric instrumentation and SRE KPIs (SLI/SLO/error budget)
- Build automated platforms with scale, reliability and performance for multi-cloud, multi-environment deployments
- Manage projects such as multi-environment and/or self-customised environments, and alerting and monitoring platforms
- Handle and manage service quality – Incident Management
- Handle and manage problem management (RCA, postmortems, and driving the remediation backlog to zero)
- Handle reporting for improved visibility of tech issues across companies and execs
- Understand product workflows and improve their scale and reliability through instrumentation, analysis and reporting
- Proactively promote and adopt SRE principles and culture in the organisation
What you will need
- 5-8 years of work experience
- Graduate degree in Computer Science or a related field or equivalent experience
- Experience working with multiple teams – both external and internal
- Hands-on experience in building and running SRE practices
- Knowledge of Kubernetes, Docker, microservices, REST APIs, and multi-cloud (AWS/GCP)
- Proficiency in coding (Python, Golang) and tooling (Ansible, Terraform)
- Experience with containers and virtual instances in the public cloud
- A keen eye for reliability, scalability, and performance of the platform, end to end
- Proficiency in agile development, code reviews, design documentation, debugging and troubleshooting
- Extensive experience running high-scale, cloud-native services
- Ability to develop high-performing teams fast, while maintaining a positive team culture
