Site Reliability Operations Manager
An excellent position at a fast-growing international company with a strong corporate culture and a modern tech stack. The benefits package is attractive and includes hybrid work, bonuses, and health insurance for the whole family.
Job difficulty
The high difficulty stems from the need to manage critical incidents around the clock (24/7) and to coordinate distributed teams in a high-load GameTech environment. The role requires a combination of deep technical knowledge of network protocols and infrastructure with strong leadership skills.
Salary analysis
Salaries for this position in Athens are typically above the market average because of the specifics of the GameTech industry and the high level of responsibility. Kaizen Gaming offers competitive pay and a bonus scheme, which corresponds to the top decile of the local market for SRE leadership roles.
Cover letter
Dear Hiring Team at Kaizen Gaming,
I am writing to express my strong interest in the Site Reliability Operations Manager position. With extensive experience in leading technical operations and managing 24/7 shift-based teams in high-availability environments, I am confident in my ability to enhance the operational maturity of your production systems. My background in refining incident management frameworks and driving improvements in MTTA/MTTR metrics aligns perfectly with your mission to provide a seamless experience for millions of customers.
Throughout my career, I have focused on transforming reactive monitoring into proactive, engineered reliability. I am particularly impressed by Kaizen Gaming's scale and its commitment to a 'one team' culture. I look forward to the possibility of bringing my expertise in infrastructure stability and stakeholder communication to your Athens-based team and contributing to the continued success of Betano.
Job description
We are Kaizen Gaming
Kaizen Gaming, the team powering Betano, is one of the biggest GameTech companies in the world, operating in 20 markets. We always aim to leverage cutting-edge technology, providing the best experience to our millions of customers who trust us for their entertainment.
We are a diverse team of more than 2,700 Kaizeners of 40+ nationalities, spread across 3 continents.
Our #oneteam is proud to be among the Best Workplaces in Europe and certified Great Place to Work across our offices. Here, there’ll be no average day for you. Ready to Press Play on Potential?
Let’s start with the role
As a Site Reliability Operations Manager, you will lead the operational reliability layer of our production environment, ensuring 24/7 service stability across networks, applications, and infrastructure.
You will own the performance and evolution of our Site Reliability Operations function. This includes managing shift-based teams, strengthening incident response practices, and driving measurable improvements in uptime, response time, and operational maturity, as well as directly handling and overseeing the end-to-end incident flow.
You will be responsible for ensuring that incidents are properly triaged, escalated, coordinated, and resolved, while continuously improving our incident management processes.
This role sits at the intersection of Infrastructure, Platform, Security, and Product, ensuring that reliability is not reactive, but engineered and continuously improved.
Reliability at scale in a high-traffic, real-time gaming environment demands precision, discipline, and strong leadership. This role is critical to that mission.
As a Site Reliability Operations Manager, you will:
- Lead and develop the Site Reliability Operations team, ensuring high performance across 24/7 shift coverage.
- Own incident management processes, including severity classification, escalation paths, communication standards, and post-incident reviews.
- Ensure proactive monitoring of production systems with meaningful alerting that minimizes noise and maximizes actionability.
- Track and improve key operational metrics such as MTTA, MTTR, uptime, and SLA adherence.
- Establish and refine standard operating procedures for monitoring, escalation, and vendor coordination.
- Drive structured communication during incidents, ensuring clear updates to technical and business stakeholders.
- Collaborate closely with SRE, Infrastructure, Security, and Engineering teams to eliminate recurring incidents through root cause analysis and systemic improvements.
- Oversee relationships with external vendors and providers during both routine operations and major outages.
- Promote a culture of operational excellence, accountability, and continuous improvement.
- Participate in capacity planning and operational readiness reviews for new launches and major changes.
What you will bring
- Proven experience leading technical operations or NOC/SRE Operations teams in high-availability environments.
- Strong understanding of production monitoring, alerting systems, and incident management frameworks.
- Solid knowledge of networking fundamentals (TCP/IP), infrastructure components, and cloud or hybrid environments.
- Experience working in 24/7 operational models with shift-based teams.
- Hands-on familiarity with ticketing systems and operational reporting.
- Ability to analyze operational data and translate it into improvement initiatives.
- Strong stakeholder communication skills, especially under pressure.
- Structured thinker with close attention to detail and strong execution discipline.
- Experience in gaming, fintech, e-commerce, or other real-time, high-scale digital environments is considered a strong plus.
Kaizen Gaming Perks
- 🕑 Hybrid way of working
- 🏃 A buddy will support you with your onboarding
- 💸 Competitive pay & bonus scheme
- ⭐ Developmental 360° feedback framework
- 💰 Monthly meal allowance
- 👩‍⚕️ Private health insurance for you and your family
- 📚 Unlimited access to Udemy & continuous training
- 👨‍👩‍👧‍👦 Family support
Recruitment Privacy Notice
Regarding the data you share with us, you may find and read our recruitment privacy notice here.
We are an equal opportunity employer committed to fostering a diverse and inclusive workplace. We welcome applications from individuals of all backgrounds, regardless of race, gender, religion, sexual orientation, or age.
Skills
- SRE
- Incident Management
- Networking
- TCP/IP
- Infrastructure
- Cloud Computing
- Monitoring
- Alerting
- SLA
- Root Cause Analysis
- Capacity Planning
Possible interview questions
Tests experience managing critical situations and the ability to stay calm under pressure.
Describe the most difficult incident you have led. What steps did you take to coordinate the teams, and how did you keep stakeholders informed?
Assesses analytical ability and skill in working with performance metrics.
Which key performance indicators (KPIs) do you consider most important for SRE Operations, and how have you used data to improve MTTR?
Tests people-management skills in a shift-based schedule.
How do you approach sustaining motivation and preventing burnout in a team that works 24/7?
Assesses technical vision and drive toward automation.
How do you balance operational work ("firefighting") against implementing systemic improvements that prevent recurrence?
Tests the ability to work with external partners.
Describe your experience working with external vendors during major infrastructure outages. How do you ensure SLA compliance?