Country: USA


Hybrid, Full-time

Staff Software Engineer, Stream Compute


Stripe is one of the best employers in fintech, with unique engineering challenges. The position offers a high degree of autonomy, work with a cutting-edge stack (Flink, Kafka), and the opportunity to influence global financial infrastructure.

Listing from Quick Offer Global, a list of international companies


Job difficulty

A Staff-level role requires more than 10 years of experience and deep knowledge of distributed systems and Apache Flink. The high difficulty stems from the need to solve scaling and reliability problems for business-critical financial data.

Salary analysis

Median: $250,000

Market range: $210,000 to $320,000

A Staff-level role at Stripe typically comes with above-market compensation, including a significant share in the form of stock (RSUs). In major technology hubs in the US and Canada, such positions are valued in the top decile of the market.

Cover letter

I am writing to express my strong interest in the Staff Software Engineer position within the Stream Compute team at Stripe. With over a decade of experience in building and evolving large-scale distributed systems, I have developed a deep expertise in Apache Flink and Kafka, which aligns perfectly with Stripe's mission to provide robust financial infrastructure. My background includes leading technical initiatives that prioritize state integrity and exactly-once processing, ensuring that critical financial data is handled with the highest level of reliability.

In my previous roles, I have successfully driven the transition from manual operations to automated, self-healing systems, significantly reducing operational toil. I am particularly drawn to Stripe's commitment to innovation and its 'Flink-first' approach to infrastructure. I am eager to bring my experience in multi-region strategies and disaster recovery to help scale Stripe's stream processing capabilities while maintaining the high availability targets that your global users depend on.


Job description

Who we are

About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world's largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone's reach while doing the most important work of your career.

About the team

The Stream Compute team at Stripe builds and operates the infrastructure, tooling, and systems behind our Flink-powered stream processing systems. We're at the heart of several core asynchronous workflows, operating at significant scale and handling vast amounts of sensitive financial data. Our work powers intricate processes involving various critical financial operations and real-time analytics. We run globally distributed systems with high reliability and performance to meet Stripe's scaling, availability, and product needs, and we continually reduce operational toil by investing in automation and self-service tooling for upgrades, maintenance, and day-to-day operations. The team is distributed between Seattle, Toronto and remote locations.

What makes our team truly exciting is our commitment to our users: we ensure that no event is dropped and that state integrity is preserved, and we support exactly-once processing as a first-class feature. Working at the intersection of real-time data processing and fintech innovation, we continuously push the boundaries of what's possible. Our focus on innovation, user experience, reliability, and compliance drives increased ROI and operational excellence, making us a crucial part of Stripe's success.

What you'll do

You'll help define and deliver the next generation of Stripe's Flink-first stream compute infrastructure—driving innovation to meet extremely high availability targets at global scale. Partnering with infrastructure engineers, adjacent platform teams, and the product orgs that depend on Flink every day, you'll set a long-term technical direction that scales with Stripe's growth while enabling reliable, efficient operations for years to come. You'll work on the hardest problems in operating Flink in production—state management, exactly-once processing, performance isolation, and automated recovery—so teams across Stripe can confidently build stateful stream processing applications on top of it.

Responsibilities

Design, build, and operate stream compute infrastructure with Apache Flink at the center, alongside technologies like Kafka, Temporal, and AWS services
Partner with product and platform teams across Stripe to understand requirements, unblock Flink adoption, and improve how stream processing infrastructure is used end-to-end
Define and implement operational best practices (e.g., shuffle sharding, cellular architecture, load shedding, automated state recovery) to improve resilience and reliability at scale
Drive fleet-level automation and standardization ("pets" to "cattle") through self-service workflows, safer rollouts, and self-healing systems that reduce manual operations
Lead initiatives that raise the bar on Flink availability and state durability (e.g., multi-region strategies, disaster recovery readiness, operational readiness reviews, incident learning)
Evaluate and productionize Flink ecosystem capabilities (e.g., SQL, connectors, state backends) to improve developer experience and scalability without compromising reliability
Work closely with the open source community to identify opportunities for adopting new open source features as well as contribute back to OSS
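
Of the operational practices named in the responsibilities, shuffle sharding is the easiest to show in a few lines. The sketch below is only an illustration of the general idea, not Stripe's implementation: each tenant is mapped to a small, deterministic, pseudo-random subset of workers, so a tenant that overloads its shard fully overlaps with few other tenants. The function name `shuffle_shard`, the worker names, and the shard size are all hypothetical.

```python
import hashlib
import random


def shuffle_shard(tenant_id: str, workers: list[str], shard_size: int) -> list[str]:
    """Return a deterministic pseudo-random subset of workers for one tenant.

    Hypothetical helper for illustration only.
    """
    if not 0 < shard_size <= len(workers):
        raise ValueError("shard_size must be between 1 and len(workers)")
    # Derive a stable seed from the tenant id so the same tenant always
    # lands on the same shard, regardless of process or host.
    seed = int.from_bytes(hashlib.sha256(tenant_id.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    # Sort first so the result does not depend on the caller's list order.
    return sorted(rng.sample(sorted(workers), shard_size))


if __name__ == "__main__":
    fleet = [f"worker-{i}" for i in range(8)]
    for tenant in ("tenant-a", "tenant-b", "tenant-c"):
        print(tenant, shuffle_shard(tenant, fleet, shard_size=2))
```

With 8 workers and a shard size of 2 there are 28 possible shards, so two tenants chosen at random share both workers only occasionally; a larger fleet lowers that probability further.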

Who you are

We're looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements

This is a Staff-level role, which typically means 10+ years of experience building, operating, and evolving large-scale production systems
Experience as a technical lead for team(s) working on distributed systems, including scaling them in fast-moving environments
Hands-on experience with big data technologies such as Flink, Spark, Kafka, Pulsar, or Pinot
Experience developing, maintaining and debugging distributed systems built with open source tools
Experience building and scaling infrastructure as a product
Strong software engineering skills and a passion for Big Data Distributed Systems
Ability to write high-quality code (in programming languages like Go, Java, Scala, etc.)
Comfortable operating with high autonomy and ownership
Growth mindset and a willingness to learn quickly, explore ambiguous problem spaces, and dive deep when needed
Strong written and verbal communication skills, including the ability to produce clear technical documentation

Preferred qualifications

Experience operating streaming infrastructure as a platform (e.g., Flink clusters, Kafka, Pulsar) for internal customers at scale
Deep hands-on experience authoring, optimizing, and operating real-time processing frameworks such as Flink, Spark Streaming, Storm, or Kafka Streams in production
Experience building or operating control planes for managing large-scale infrastructure
Open source contributions to data processing or big data systems (Hadoop, Spark, Celeborn, Flink, etc.)


Skills

AWS
Apache Spark
Distributed Systems
Java
Apache Flink
Big Data
Temporal
Apache Kafka
Go
Scala

Possible interview questions

Tests deep understanding of Flink internals for ensuring data reliability.

How would you implement exactly-once semantics in a distributed system using Flink and Kafka, and what potential bottlenecks do you see?
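
A toy model can help frame an answer to the exactly-once question. The sketch below does not use Flink or Kafka APIs; it assumes a replayable log (standing in for a Kafka partition) and shows the core idea that the source offset and the operator state are snapshotted together, so that after a crash the job rewinds to the snapshot and replays without double counting. The names `Checkpoint` and `run` and the `fail_at` parameter are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Offset and operator state captured together as one atomic snapshot."""
    offset: int = 0
    counts: dict = field(default_factory=dict)


def run(log, checkpoint_every, fail_at=None):
    """Count keys in `log`; optionally crash once after reaching offset `fail_at`."""
    durable = Checkpoint()          # last completed snapshot
    offset, counts = 0, {}          # in-memory (volatile) position and state
    failed = False
    while offset < len(log):
        key = log[offset]
        counts[key] = counts.get(key, 0) + 1
        offset += 1
        if fail_at is not None and not failed and offset == fail_at:
            failed = True
            # Simulated crash: volatile state is lost, so restore both the
            # offset and the counts from the last snapshot and replay.
            offset, counts = durable.offset, dict(durable.counts)
            continue
        if offset % checkpoint_every == 0:
            durable = Checkpoint(offset, dict(counts))
    return counts


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a"]
    print(run(events, checkpoint_every=2))
    print(run(events, checkpoint_every=2, fail_at=3))
```

Both runs produce the same counts because replayed events are applied to the restored state rather than to the state that already included them. This covers only state inside the job; writes to external systems additionally need a transactional or idempotent sink, and the checkpoint interval trades recovery time against snapshot overhead, which is one place bottlenecks tend to appear.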

Assesses experience designing fault-tolerant systems.

Describe your approach to designing a disaster recovery strategy for a large-scale stateful Flink cluster.

Tests performance optimization skills.

What performance problems have you encountered with state management in Flink, and how did you solve them?

Assesses leadership and the ability to treat infrastructure as a product.

How do you balance adopting new features from the open source community against keeping the internal platform stable for other teams?

Tests architectural thinking.

How would you introduce a cellular architecture for Stripe's streaming infrastructure to minimize the blast radius of failures?
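
For the cellular architecture question, a minimal sketch of the routing layer may be a useful starting point. It assumes each tenant is pinned to exactly one independent cell (for example, a separate cluster), so a cell failure affects only the tenants routed to it. The names `cell_for` and `blast_radius` and the cell identifiers are hypothetical, and nothing here reflects how Stripe actually partitions its infrastructure.

```python
import hashlib


def cell_for(tenant_id: str, cells: list[str]) -> str:
    """Pin a tenant to one cell using a stable hash of its id."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return cells[int.from_bytes(digest[:8], "big") % len(cells)]


def blast_radius(tenants: list[str], cells: list[str], failed_cell: str) -> list[str]:
    """Return the tenants affected when `failed_cell` goes down."""
    return [t for t in tenants if cell_for(t, cells) == failed_cell]


if __name__ == "__main__":
    cells = ["cell-1", "cell-2", "cell-3", "cell-4"]
    tenants = [f"tenant-{i}" for i in range(20)]
    hit = blast_radius(tenants, cells, "cell-2")
    print(f"{len(hit)} of {len(tenants)} tenants affected: {hit}")
```

Modulo hashing is used only to keep the example short: it remaps most tenants whenever the number of cells changes, so a production router would more likely keep an explicit tenant-to-cell mapping that can be migrated deliberately.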
