Senior Site Reliability Engineer (General Rodríguez)
General Rodríguez, Argentina
Posted: less than a week ago
Description
We are looking for a Senior Site Reliability Engineer (SRE) – Azure to drive service health, reliability, and performance as we launch and scale services for our client. This critical role requires expertise in incident response, troubleshooting, and advancing cloud reliability practices in high‑stakes environments with minimal process maturity.
Responsibilities
- Develop and automate processes to enhance system reliability, scalability, and performance
- Collaborate with teams to integrate reliability best practices into the software development lifecycle
- Respond quickly to service incidents in the Azure environment and resolve them to minimize downtime
- Lead root cause analysis and post‑incident reviews, ensuring actionable improvements are implemented
- Design and maintain robust monitoring, alerting, and observability solutions for critical services
- Identify reliability risks proactively and address them before they impact customers
- Establish and refine SRE practices such as incident management and service level objectives (SLOs)
- Mentor team members in adopting SRE principles and leveraging Azure tools
- Analyze recurring incidents and outages to drive systemic improvements
- Advocate for a culture of reliability and continuous learning
Requirements
- 3+ years in SRE, DevOps, or similar roles with proven experience in cloud environments, including Azure
- Expertise in troubleshooting distributed systems, networking, and cloud‑native architectures
- Hands‑on experience with Azure tools like Monitor, Log Analytics, Application Insights, ARM, Bicep, and Terraform
- Proficiency in scripting or programming languages such as Python, PowerShell, or Bash
- Understanding of incident management workflows and post‑incident evaluations
- Experience implementing observability solutions and defining service level indicators (SLIs)
- Strong communication skills with the ability to collaborate effectively under pressure
- English proficiency at B2 level or higher
Apply at Kit Empleo: kitempleo.com.ar/empleo/qemcr
Key information
- Company name: EPAM Systems
- Job title: Senior Site Reliability Engineer (General Rodríguez)