- Location: Bangalore, India
- Industry: Insurance
What will your role be?
The Site Reliability Engineering (SRE) team is at the heart of Turtlemint’s server infrastructure. The team ensures that production services are available and scale with variable load, while keeping latencies within set benchmarks at the optimal cost point. To achieve this, the team relies on talented engineers, who are complemented by a set of cutting-edge tools and technologies. As an engineer on the SRE team, you will create and maintain software and infrastructure to keep the system highly available and operations seamless.
Key Responsibilities
As a Site Reliability Engineer, you will:
● Design, implement, and manage cutting-edge deployment and automation of cloud resources.
● Maintain a highly available and reliable production service.
● Set up monitoring and alerts for production environments.
● Look for ways to relieve platform or resourcing constraints through automation and new technologies.
● Be an AWS platform expert in the “core” infrastructure areas (VPC, EC2, RDS, S3, etc.)
● Deploy and maintain AWS infrastructure in client environments
● Contribute to the codebase with Terraform, Ansible, or potentially other automation and scripting languages
● Contribute to the organization’s ever-growing knowledge base
What are we looking for?
Required
● 3 to 5 years of hands-on experience with AWS or another public cloud.
● Experience deploying and managing infrastructure on public clouds such as AWS or Azure
● Experience implementing containerized solutions using Docker, Amazon EKS, AKS, Amazon Elastic Container Service (ECS), etc.
● Proficient in Unix/Linux tools and systems
● Scripting and automation skills (Python, Bash or similar)
● Working experience with Jenkins, Git, etc. Experience with monitoring tools (Prometheus stack) and logging tools (ELK) is a plus
● Solid understanding of Continuous Integration and Continuous Delivery best practices
● Hands-on experience with IaC and configuration management tools such as CloudFormation, Terraform, Ansible, Packer, etc.
● Agile, fast learner with an optimistic attitude who brings the team along toward a common goal.
