- Country
- Germany
Engineering Manager - Observability & Reliability Engineering Obsession (x/f/m)
A strong position at a stable HealthTech unicorn with a solid engineering culture. It offers an extended benefits package, work with a modern stack (K8s, OTel, Vault), and real influence on the reliability of a critical service.
Job difficulty
The difficulty is high because the role combines deep technical expertise in SRE (Observability, IaC) with strong leadership skills. The selection process has 5 stages, including system design and a review of management case studies.
Salary analysis
An Engineering Manager role in Berlin typically pays in the range of EUR 100,000–130,000 per year. Doctolib is known for competitive salaries, in line with the top decile of the market for experienced managers.
Cover letter
I am writing to express my strong interest in the Engineering Manager position for the OREO team at Doctolib. With over 8 years of experience in SRE and infrastructure, including the last 4 years in engineering management, I have a proven track record of building high-performing platform teams that prioritize observability and system reliability. My background in scaling telemetry pipelines and managing critical transversal services like HashiCorp Vault and Terraform Enterprise aligns perfectly with Doctolib's current technical roadmap.
Throughout my career, I have focused on fostering a culture of psychological safety and operational excellence. I am particularly impressed by Doctolib's commitment to ethical AI and its mission to improve healthcare access. I am confident that my technical expertise in OpenTelemetry and Prometheus, combined with my experience in mentoring senior SREs and driving strategic architectural decisions, will allow me to contribute significantly to the evolution of your observability platform.
I am eager to bring my experience in managing complex incident response processes and reducing technical debt to the OREO team. Thank you for considering my application. I look forward to the possibility of discussing how my leadership style and technical background can support Doctolib's continued growth and reliability goals.
Join Doctolib to lead the OREO team and transform the reliability culture at one of Europe's leading HealthTech unicorns!
Job description
We are looking for an Engineering Manager to join the OREO (Observability Reliability Engineering Obsession) team in Platform Engineering.
As an Engineering Manager, your mission will be to lead the Reliability & Observability team and drive the evolution of Doctolib's observability platform, supporting the exponential growth of Doctolib services while building and empowering a world-class SRE team.
Working in the tech team at Doctolib involves building innovative products and features to improve the daily lives of care teams and patients. We work in feature teams in an agile environment, while collaborating with product, design, and business teams.
You will lead a team of Site Reliability Engineers who are responsible for shaping Doctolib's observability strategy and ensuring our platform remains reliable, debuggable, and scalable. This role sits at the intersection of people management, technical leadership, and strategic planning with a particular focus on building organizational capabilities around logging, metrics, tracing, and alerting.
Your team also owns and operates critical transversal services that enable secure, scalable infrastructure management across the organization, including HashiCorp Vault for secrets management and Terraform Enterprise for infrastructure as code.
Your responsibilities include but are not limited to:
People Leadership:
- Lead, coach, and grow a team of Site Reliability Engineers, supporting their technical development and career progression
- Create a culture of operational excellence, continuous improvement, and psychological safety within the team
- Conduct regular 1:1s, performance reviews, and career development conversations
- Recruit, onboard, and retain top SRE talent aligned with Doctolib's mission and values
Technical Strategy:
- Partner with SREs and senior engineers to define and evolve the observability strategy across the platform, focusing on logging, metrics, tracing, and alerting
- Own the strategy and evolution of critical transversal services including HashiCorp Vault and Terraform Enterprise
- Drive prioritization and roadmap planning for large-scale reliability and observability initiatives
- Ensure alignment between team objectives and broader engineering and business goals
- Advocate for and allocate resources toward reducing technical debt and improving developer experience
Operational Excellence:
- Own the team's on-call experience and contribute to the incident response processes, ensuring sustainable practices and continuous improvement
- Ensure high availability and reliability of transversal services that are critical to the entire engineering organization
- Lead postmortem reviews and drive systemic improvements to prevent recurring issues
Cross-functional Collaboration:
- Work closely with Product Managers, Engineering Managers, and architects to align observability capabilities with product and platform needs
- Partner with security and infrastructure teams to evolve secrets management and IaC practices across the organization
- Represent the OREO team in engineering leadership forums, architectural reviews, and strategic planning sessions
- Foster strong partnerships with software engineering teams to improve instrumentation quality and adoption of observability best practices
About our tech environment
- Our solutions are built on a single, fully cloud-native platform that supports web and mobile app interfaces and multiple languages, and is adapted to country and healthcare specialty requirements. To address these challenges, we are modularizing our platform, which runs in a distributed architecture, through reusable components.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals. Discover our AI vision here and learn about our first AI hackathon here!
Who you are
Before you read on — if you don't have the exact profile described below, but you feel this job description matches your skill set, we still encourage you to apply.
- You have at least 5 years of software engineering or SRE experience, with a strong technical background in cloud-native environments (preferably AWS, GCP, and/or Kubernetes-based)
- You have 3+ years of engineering management experience, leading technical teams (ideally SRE, platform, or infrastructure teams)
- You have a deep understanding of observability tooling and architecture (Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Prometheus, Thanos, Datadog)
- You have experience with infrastructure as code (Terraform, OpenTofu) and secrets management systems (Vault, AWS Secrets Manager)
- You have a proven ability to balance technical depth with people leadership: you can mentor engineers, review technical designs, and guide architectural decisions
Now it would be fantastic if you:
- Have experience scaling SRE or platform teams in fast-growing, high-traffic environments
- Have a background in designing and operating high-scale telemetry pipelines
- Have hands-on experience with HashiCorp Vault and Terraform Enterprise in production environments
- Have hands-on experience with backend programming languages (e.g., Go, Python, Ruby)
- Have experience driving cultural and technical transformations
What we offer
- Free comprehensive health insurance for you and your children
- Parent Care Program: receive one additional month of leave on top of the legal parental leave
- Free mental health and coaching services through our partner Moka.care
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
- Work council subsidy to refund part of a sports club membership or creative class
- Up to 14 days of RTT
- Lunch voucher with Swile card
The interview process
- 30-min phone screen with a Tech Recruiter
- 1h30 technical interview (SRE System Design & Architecture)
- 1h15 behavioral interview (Leadership & People Management)
- 1h30 Engineering Management case study (team scenarios, prioritization, and conflict resolution)
- 1h manager interview with Senior Engineering Leadership
- At least one reference check
Job details
- Permanent position
- Full-time
- Berlin
- Start date: as soon as possible
If you would like to find out more about tech life at Doctolib, feel free to read our latest Medium blog articles!
At Doctolib, we are committed to improving access to healthcare for everyone. This translates into our recruitment process. We evaluate candidates based solely on qualifications and motivation, without any form of discrimination.
The more diverse ideas are heard, the more our product will truly improve healthcare for all. You are welcome to apply to Doctolib, regardless of your gender, religion, age, sexual orientation, ethnicity, disability.
To ensure equal opportunities, we invite you to exclude personal information (e.g. pictures, age) from your applications. If you require any accommodation, please let us know for support during the hiring process.
Join us in building the healthcare we all dream of!
All information provided is processed by Doctolib for application management. For data processing details, click here.
Please contact hr.dataprivacy(at)doctolib.com for inquiries or to exercise your rights.
Skills
- AWS
- Python
- Terraform
- Kubernetes
- Prometheus
- OpenTelemetry
- SRE
- Ruby
- Observability
- Go
- Elasticsearch
- Datadog
- HashiCorp Vault
- Loki
- Fluent Bit
Possible interview questions
Tests the candidate's ability to think strategically and design scalable monitoring systems.
How would you design a metrics and log collection architecture for a distributed system that processes billions of events per day while keeping the total cost of ownership reasonable?
Assesses leadership qualities and the ability to develop a team under high load.
Describe your approach to the professional development of SRE engineers. How do you help them grow from Middle to Senior/Staff level?
Tests experience in incident management and in establishing a postmortem culture.
Tell us about the most difficult incident you have led. What systemic changes did you introduce afterwards to prevent a recurrence?
Assesses the ability to prioritize between product work and technical debt.
How do you make the case to the business for allocating resources to reducing infrastructure technical debt or upgrading monitoring systems?
Tests experience with specific security and infrastructure tools.
What main problems have you encountered when scaling HashiCorp Vault, and how did you handle secrets management in a large organization?