Country: India

Remote, full-time

Site Reliability Engineer II

Backblaze is a stable public company with a strong engineering culture. The role offers remote work and the chance to work on distributed systems at global scale.

Listing from Quick Offer Global, a list of international companies

Job difficulty

The role requires solid Linux knowledge, experience with Kubernetes, and programming skills in Python or Go. The difficulty is medium (SRE II), since 2 to 4 years of experience with high-load systems is expected.

Salary analysis

Median: $35,000

Market range: $25,000 – $45,000

The SRE II role in Bangalore is in line with the market level for international technology companies. Salaries for mid-level specialists in this region typically start at 1.8 million INR and can reach 3.5 million INR, depending on bonuses.

I am writing to express my interest in the Site Reliability Engineer II position at Backblaze. With over three years of experience in systems engineering and a strong focus on Linux administration and automation, I have developed a deep understanding of maintaining high-availability distributed systems. My background in implementing SLIs/SLOs and managing incident response aligns perfectly with Backblaze's commitment to reliability and customer success.

In my previous roles, I have successfully reduced operational toil by developing custom automation tools in Python and Go, and I am proficient in managing infrastructure using Terraform and Ansible. I am particularly drawn to Backblaze because of its unique position as a leader in the open cloud movement and its impressive scale of managing over three billion gigabytes of data. I am eager to bring my technical skills and reliability-focused mindset to your Production Systems team to help ensure the continued stability and scalability of your services.

Job description

About Backblaze

Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets, unburden administrators, and unleash innovators. Together with our partners, we’re helping customers break free from the restrictive, overpriced legacy solutions that hold them back, and blaze forward with the full power of the open cloud in their hands.

Founded in 2007, we scaled the business with less than $3 million in outside funding until 2021, when we did a traditional IPO on the Nasdaq stock exchange. Today, Backblaze generates over $100M in revenue and is the leading specialized storage cloud, managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries, including businesses, developers, IT professionals, and individuals.

About the Role

We are seeking a Site Reliability Engineer II (SRE II) to help ensure the stability, scalability, and reliability of our services and infrastructure. This role focuses on building automation, maintaining observability, and supporting incident response to keep customer-facing systems performing at their best. The SRE will collaborate with engineering, product, and operations teams to embed reliability practices into day-to-day development and operations while contributing to tools and processes that improve efficiency and reduce manual effort.

Key Responsibilities

Service Reliability & Operations

Support the availability and durability of critical services across production environments.
Monitor service health using SLIs, SLOs, and error budgets, and escalate issues when thresholds are at risk.
Participate in on-call rotations, incident response, and post-incident reviews to drive service improvements.
Follow established ITIL/OSS processes (incident, change, problem, and capacity management).
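As an illustration of the SLI/SLO/error-budget item above, here is a minimal sketch of how an error budget could be tracked for a request-based SLO. The class name `ErrorBudget`, its fields, and the 25% escalation threshold are assumptions made for this example; the posting does not describe Backblaze's actual tooling or thresholds.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical helper for a request-based availability SLO."""

    slo_target: float     # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int   # requests observed in the SLO window
    failed_requests: int  # requests that violated the SLI

    @property
    def allowed_failures(self) -> float:
        # The error budget is the share of requests the SLO lets us fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        allowed = self.allowed_failures
        if allowed == 0:
            # No budget at all: any failure overspends it.
            return 1.0 if self.failed_requests == 0 else float("-inf")
        return 1.0 - self.failed_requests / allowed

    def at_risk(self, threshold: float = 0.25) -> bool:
        # Assumed policy: escalate once less than `threshold` of the budget is left.
        return self.budget_remaining < threshold


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"budget remaining: {budget.budget_remaining:.0%}")
    print(f"escalate: {budget.at_risk()}")
```

In practice the request counts would come from a monitoring system rather than being passed in by hand.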

Automation & Tooling

Develop automation for common operational tasks, reducing manual intervention and toil.
Contribute to monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK).
Work with CI/CD pipelines, configuration management, and infrastructure as code tools (Terraform, Ansible, Jenkins).
Write scripts (Bash, Python, Go, etc.) to improve system reliability and efficiency.

Collaboration

Partner with engineering, product, and operations teams to support resilient system design and operations.
Assist in capacity planning and disaster recovery exercises.
Work with vendors and service providers to troubleshoot service issues and track SLA performance.
Document systems, share learnings, and help grow a reliability-minded engineering culture.

Continuous Improvement

Contribute to playbooks, runbooks, and operational documentation.
Identify recurring issues and propose long-term improvements.
Promote reliability-focused practices within development and operations teams.

Qualifications

Education & Experience

Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
2–4 years of experience in site reliability, systems engineering, or operations.
Exposure to large-scale, production-grade systems.

Technical Skills

Solid Linux systems administration and troubleshooting skills.
Familiarity with service reliability concepts: monitoring, alerting, incident response, and root cause analysis.
Proficiency in at least one scripting language (Python, Bash, or Go).
Understanding of containers (Kubernetes, Docker) and microservices concepts.
Knowledge of incident response and operational best practices.

Preferred Attributes

Experience in a SaaS, service provider, or distributed systems environment.
Familiarity with ITIL/OSS practices and SLOs/SLAs.
Strong problem-solving skills and willingness to learn new technologies.
Experience with cloud platforms (AWS, GCP, or Azure).
Ability to work independently, take ownership, and drive projects from problem discovery through resolution.

At this point, we hope you're feeling excited about the job description you're reading. Even if you don't meet every requirement, we still encourage you to apply. Learning, developing, and growing are key parts of our culture. We're eager to meet people who believe in our mission and can contribute to our team in various ways. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.

At Backblaze, we value being fair and good to our customers, partners, and employees. That’s why diversity, equity, and inclusion are at the core of our values. We are committed to fostering a workforce where all employees feel a sense of belonging regardless of race, ethnicity, nationality, gender, sexual orientation, age, religion, socio-economic status, ability, veteran status, and education. We believe that our dedication to cultivating a diverse workspace not only allows us to better serve our customers in over 175 countries, but further reinforces our commitment to doing the right thing. We are proud to be an Equal Opportunity Employer.

To understand more about the data we collect and process as part of your application, please view our Backblaze Employee Privacy Notice.

Skills

Linux
Python
Go
Bash
Kubernetes
Docker
Terraform
Ansible
Prometheus
Grafana
Jenkins
ELK stack
SaaS
Cloud Computing

Possible interview questions

Tests practical automation skills and how the candidate measures the impact of their own work.

Describe a time you automated a routine task (toil). Which tool did you use, and how did it affect the team's metrics?

An SRE must be able to work under uncertainty and minimize downtime.

Describe your process for responding to a critical production incident. How do you prioritize while searching for the root cause?

Understanding SRE fundamentals (SLIs/SLOs) is critical for this position at Backblaze.

How would you design monitoring and alerting for a new microservice so as to avoid alert fatigue?
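One common way to approach this question is to page on error-budget burn rate over two windows rather than on raw error counts, so that short blips do not wake anyone up. The sketch below is only an illustration: the function names are hypothetical, and the default threshold of 14.4 is a commonly cited value for a 1-hour/5-minute window pair against a 30-day SLO, not something the posting specifies.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the error budget is being spent (1.0 = exactly on budget)."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed default, tune per service
) -> bool:
    """Page only if both the long and the short window burn too fast.

    The long window filters out brief spikes; the short window makes the
    alert clear quickly once the problem has stopped.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) >= threshold
        and burn_rate(short_window_error_ratio, slo_target) >= threshold
    )


if __name__ == "__main__":
    # Sustained 2-3% errors against a 99.9% SLO: both windows burn fast.
    print(should_page(0.02, 0.03, slo_target=0.999))
    # The spike has already ended: the short window is healthy again.
    print(should_page(0.02, 0.0005, slo_target=0.999))
```

Requiring both windows to exceed the threshold is the design choice that reduces noise: an alert fires only for problems that are both significant and still ongoing.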

Tests depth of Linux knowledge, since it is a key requirement of the role.

How would you debug a high load average on a Linux server when CPU utilization stays low?
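A useful starting point for this question: on Linux, the load average counts tasks in uninterruptible sleep (state `D`, typically waiting on disk or network I/O) as well as runnable ones, so load can be high while the CPU sits idle. The sketch below lists such tasks by reading `/proc/<pid>/stat`. The function names are hypothetical, and it inspects processes only, not individual threads under `/proc/<pid>/task`.

```python
from __future__ import annotations

import os


def parse_stat_state(stat_line: str) -> tuple[int, str, str]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren].strip())
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the file was unreadable mid-scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(f"{pid}\t{comm}")
```

If this turns up a cluster of `D`-state processes, the next step is usually to look at what they are blocked on, for example with standard tools such as `iostat` or `vmstat`.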

Backblaze works with huge volumes of data, so an understanding of scalability matters.

What scalability problems have you run into with Docker/Kubernetes, and how did you solve them?
