- Country
- Israel

Site Reliability Engineer
An excellent opening at a stable company with a strong corporate culture and a focus on modern technologies (AI, Kubernetes). The high share of managers promoted from within points to good career prospects.
Job difficulty
The role requires deep knowledge of cloud platforms (GCP/AWS), container orchestration, and automation. A high level of responsibility for incidents and participation in on-call rotations add to the difficulty of the position.
Salary analysis
The posting does not state a salary, but for an SRE with 4+ years of experience in Tel Aviv, market rates are among the highest in the world. The suggested range reflects current conditions in the Israeli high-tech market for mid-level and senior specialists.
Cover letter
I am writing to express my strong interest in the Site Reliability Engineer position at Optimove. With over four years of experience in managing large-scale cloud infrastructure and a deep passion for automation, I am drawn to Optimove’s 'Positionless' culture and your innovative approach to AI-powered marketing. My background in Kubernetes orchestration and Terraform-based infrastructure provisioning aligns perfectly with your technical requirements.
In my previous roles, I have successfully implemented intelligent monitoring and automated incident response workflows, which resonates with your focus on using AI tools to enhance SRE processes. I am particularly impressed by Optimove's commitment to internal growth and its recognition in Gartner’s Magic Quadrant. I am eager to bring my expertise in GCP/AWS and observability tools like Datadog and Prometheus to help scale your platform and ensure the highest levels of system resilience.
Join the Optimove team and become an architect of reliability at a company recognized by Gartner as a Visionary!
Job description
At Optimove, we believe people are capable of more than a single job description. You’re not hired just to fill a position: you’re empowered to shape it, grow it, and make it your own.
We call this being Positionless.
And Positionless isn’t just our culture. It’s our product.
Optimove is the creator of Positionless Marketing, an AI-powered platform that gives every marketer the power to analyze, create, launch, and optimize independently. The result is faster execution, deeper personalization, and 88% greater campaign efficiency.
Recognized as a Visionary in Gartner’s Magic Quadrant, we partner with leading brands like Sephora, Staples, and Entain. Today, more than 550 Optimovers across NYC, London, Tel Aviv, Scotland, Brazil, Estonia, and beyond are building the future of marketing together, in an environment that actively encourages ownership and growth, with two out of every three managers promoted from within.
If you’re looking for a place where you can do more, be more, come grow with us.
Are you passionate about ensuring system reliability, scalability, and performance? Do you thrive in a dynamic environment where automation and operational excellence are key?
Optimove is looking for a Site Reliability Engineer (SRE) to join our team and play a crucial role in designing, implementing, and maintaining our cloud-based infrastructure. In this role, you will collaborate across teams to drive automation, improve system resilience, and optimize performance while fostering a culture of reliability.
Responsibilities:
- System Reliability: Ensure high availability and performance of services through effective monitoring, incident management, and root cause analysis.
- Automation & Tooling: Develop and maintain automation for infrastructure provisioning, configuration management, and application deployment.
- Performance Optimization: Analyze and enhance system performance, including load balancing, caching, and database tuning. Conduct regular capacity planning.
- Incident Response & Troubleshooting: Lead incident response efforts, participate in on-call rotations, and troubleshoot complex infrastructure issues.
- Security & Compliance: Collaborate with security teams to implement best practices and ensure compliance with relevant standards (ISO 27001, SOC 2, etc.).
- Collaboration & Mentorship: Work closely with developers, DevOps, Support, and product teams to enhance application reliability and implement SRE best practices.
Requirements:
- 4+ years in Site Reliability Engineering, DevOps, or related roles.
- Proven experience managing large-scale, cloud-based infrastructure in GCP, AWS, or Azure.
- Expertise in container orchestration (Kubernetes, Docker) and microservices architecture.
- Strong proficiency in scripting and programming languages (Python, Go, Bash, etc.).
- Experience with CI/CD pipelines, infrastructure as code (Terraform, CloudFormation), and configuration management (Ansible, Puppet, Chef).
- Hands-on experience with monitoring and observability tools (Datadog, Prometheus, Grafana, ELK Stack).
- Experience using AI tools to enhance SRE processes, such as intelligent monitoring, incident prediction, and automation of incident response.
- Deep understanding of networking concepts, DNS, load balancing, and distributed systems.
- Strong problem-solving skills, excellent communication, and a proactive mindset.
Advantages:
- Certifications: AWS Certified Solutions Architect, GCP Professional Cloud Architect, or Kubernetes certifications (CKA, CKAD).
Why Join Us?
In this role, you will have the opportunity to work on cutting-edge technology, solve challenging problems, and make a tangible impact on the reliability and scalability of our systems. Join a team that values collaboration, innovation, and continuous learning, and be part of an exciting journey as we scale our platform to new heights!
Skills
- Google Cloud Platform
- Amazon Web Services
- Microsoft Azure
- Kubernetes
- Docker
- Python
- Go
- Bash
- Terraform
- AWS CloudFormation
- Ansible
- Puppet
- Chef
- Datadog
- Prometheus
- Grafana
- ELK stack
- CI/CD
- DNS
- Microservices
Possible interview questions
Tests experience with high-load systems and understanding of architecture.
Describe the most difficult incident you have handled: how did you diagnose it, and what did you do to prevent a recurrence?
Assesses automation skills and command of IaC tools.
How do you structure Terraform code so that it scales and can be reused across environments?
Tests understanding of SRE specifics and of working with metrics.
How do you define SLIs and SLOs for a critical microservice? Give examples.
Assesses experience running Kubernetes in production.
What performance problems have you encountered in Kubernetes clusters, and how did you optimize resource consumption?
Tests the innovative approach mentioned in the posting.
In your view, how can AI tools genuinely improve monitoring and incident prediction in SRE?
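For the SLI/SLO question, a small worked example can help structure an answer. The sketch below is not taken from the posting: the class name `AvailabilitySLO`, the 99.9% target, and the request counts are all illustrative assumptions. It treats the SLI as the fraction of successful requests and derives the error budget from the SLO target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilitySLO:
    """Request-based availability objective (hypothetical helper for illustration)."""

    target: float  # e.g. 0.999 means 99.9% of requests should succeed

    def sli(self, good: int, total: int) -> float:
        """SLI: fraction of requests that succeeded in the window."""
        if total == 0:
            return 1.0  # no traffic, so nothing has failed
        return good / total

    def allowed_failures(self, total: int) -> float:
        """Error budget expressed as the number of requests allowed to fail."""
        return (1.0 - self.target) * total

    def budget_remaining(self, good: int, total: int) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        allowed = self.allowed_failures(total)
        if allowed == 0:
            return 1.0 if good == total else float("-inf")
        return 1.0 - (total - good) / allowed


if __name__ == "__main__":
    # Assumed numbers: one million requests in the window, 500 of them failed.
    slo = AvailabilitySLO(target=0.999)
    print("SLI:", slo.sli(good=999_500, total=1_000_000))
    print("Budget remaining:", slo.budget_remaining(good=999_500, total=1_000_000))
```

In an interview answer, the same structure extends to latency SLIs (for example, the share of requests served under a chosen threshold), with the threshold and window picked from what users of the service actually notice.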