snorkelai
Country
Canada
Salary
$150,000 – $180,000
Remote · Full-time

Applied Research Engineer – Training Infra

AI assessment

An excellent position at a top startup with a strong research foundation, a competitive salary, and the option to work remotely on cutting-edge technology.


Listing from Quick Offer Global, a list of international companies
Job difficulty

AI assessment

The role requires deep knowledge of distributed systems, GPU cluster management, and the specifics of LLM training, which makes the barrier to entry fairly high.

Salary analysis

Median: $175,000
Market: $145,000 – $210,000
AI assessment

The offered salary ($150k–$180k) is within the market norm for Senior/Staff levels in the US, although the upper bound can be higher for top AI engineers in Silicon Valley.

Cover letter

I am writing to express my strong interest in the Applied Research Engineer – Training Infra position at Snorkel AI. With a solid background in managing GPU clusters and optimizing distributed training pipelines, I am excited about the opportunity to contribute to a team that prioritizes data-centric AI development. My experience in orchestrating large-scale workloads using Kubernetes and Slurm aligns perfectly with your mission to provide robust infrastructure for enterprise-grade AI.

In my previous roles, I have focused on bridging the gap between research requirements and production-ready systems. I have a proven track record of implementing fault-tolerant training environments and optimizing inter-node communication to ensure high resource utilization. I am particularly drawn to Snorkel AI's unique approach to programmatic data labeling and its impact on the generative AI landscape. I am confident that my technical expertise in cloud-native ML infrastructure will help unblock experiments and accelerate your research cycles.


Job description

About Snorkel

At Snorkel, we believe meaningful AI doesn’t start with the model, it starts with the data.

We’re on a mission to help enterprises transform expert knowledge into specialized AI at scale. The AI landscape has gone through incredible changes from 2015, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!

THE ROLE

As an Applied Research Engineer at Snorkel AI, you will own the infrastructure that powers our model training and evaluation work. This is a hands-on role where you will build and operate GPU cluster infrastructure, training pipelines, and the tooling that allows our research and engineering teams to run experiments reliably and at scale. You will work closely with research scientists and engineers, translating training requirements into robust, reproducible systems—and proactively removing infrastructure blockers before they slow down the work that matters most.

Snorkel AI operates in a fast-paced, high-impact environment. We are looking for someone who takes pride in operational excellence, loves solving complex distributed systems problems, and thrives when given real ownership.

Location: Redwood City or San Francisco — OR REMOTE

MAIN RESPONSIBILITIES

  • Set up and manage GPU cluster infrastructure on major cloud providers (e.g., AWS HyperPod) for distributed model training, including networking, provisioning, and cost tracking.
  • Build and operate job orchestration and scheduling systems (e.g., Kubernetes, Slurm, or cloud-native equivalents) to reliably launch and manage training, rollout, and evaluation jobs across multi-node clusters.
  • Integrate and maintain ML training frameworks and post-training pipelines, ensuring they run stably and reproducibly at scale.
  • Set up and maintain experiment tracking, dataset versioning, and model artifact management to support fast iteration.
  • Monitor and optimize cluster health, inter-node communication, and resource utilization; implement fault tolerance and auto-recovery so long-running jobs survive node failures.
  • Work closely with research scientists and ML engineers to understand requirements, unblock experiments, and evolve infrastructure as our training workload needs change.

PREFERRED QUALIFICATIONS

  • Hands-on experience managing GPU clusters on major cloud providers, including provisioning, network configuration, and cost management.
  • Experience with distributed compute orchestration tools such as Kubernetes, Slurm, or equivalent cluster management systems.
  • Working knowledge of distributed training concepts: parallelism strategies, memory optimization techniques, and inter-node communication.
  • Experience setting up, managing, and integrating ML experiment tracking and data/model versioning tools.
  • Strong Python proficiency and solid software engineering fundamentals such as version control, modular design, and automation.
  • Ability to work in a fast-moving, iterative environment and take end-to-end ownership of ambiguous infrastructure problems.
  • Hands-on experience with post-training workflows such as supervised fine-tuning (SFT) or reinforcement learning (RLHF, GRPO, or similar) is a strong plus, but not required.

The salary range is $150,000.00 – $180,000.00.

This role is a great fit for engineers who love building reliable systems close to the frontier of AI research. We welcome applicants from a wide range of backgrounds—whether your experience comes from industry, research labs, or direct hands-on work with distributed infrastructure at scale.

BE YOUR BEST AT SNORKEL

Joining Snorkel AI means becoming part of a company that has market-proven solutions and robust funding and is scaling rapidly, offering a unique combination of stability and the excitement of high growth. As a member of our team, you’ll have meaningful opportunities to shape priorities and initiatives, influence key strategic decisions, and directly impact our ongoing success. Whether you’re looking to deepen your technical expertise, explore leadership opportunities, or learn new skills across multiple functions, you’re fully supported in building your career in an environment designed for growth, learning, and shared success.

Snorkel AI is proud to be an Equal Employment Opportunity employer and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. Snorkel AI embraces diversity and provides equal employment opportunities to all employees and applicants for employment. Snorkel AI prohibits discrimination and harassment of any type on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local law. All employment is decided on the basis of qualifications, performance, merit, and business need.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Salary Range

$150,000—$180,000 USD

Skills

  • Python
  • GPU
  • Kubernetes
  • Slurm
  • AWS HyperPod
  • Distributed Training
  • Machine Learning Infrastructure
  • SFT
  • RLHF
  • Docker
  • PyTorch

Possible interview questions

Checks experience with the specific orchestration tools named in the posting.

Describe your experience configuring and scaling Kubernetes or Slurm clusters specifically for neural network training workloads.

It is important to understand how the candidate handles typical problems in multi-node training.

How do you approach debugging inter-node communication problems in distributed training?

The role involves working with expensive resources.

What cost-optimization and GPU utilization monitoring strategies have you implemented in past projects?

Checks understanding of modern training techniques.

What is your experience with parallelism techniques (data, pipeline, and tensor parallelism), and how do they affect infrastructure requirements?

Assesses the ability to keep long-running processes going without interruption.

How have you implemented fault-tolerance mechanisms for long-running training jobs to minimize losses when a single node fails?

Similar jobs

roku
$135,000 – $185,000

Software Engineer, Machine Learning

Hybrid · USA
Machine Learning · Deep Learning · Python · C++ · TensorFlow · PyTorch · CNN · RNN · Computer Vision · Firmware Development · Edge Computing
+11 skills
lucidmotors
$180,900 – $265,320

Staff Machine Learning Engineer – (ADAS/Autonomous Driving)

On-site · USA
C++ · Python · PyTorch · CUDA · TensorRT · ROS 2 · OpenCV · Docker · CI/CD · ADAS · Autonomous Vehicles · ISO 26262 · LiDAR
+13 skills
evolutioniq
$225,000 – $270,000

AI Engineering Manager (Medhub)

Remote · USA
Python · Data Pipelines · Cloud Platforms · Agile · Scrum · Kanban · Project Management
+7 skills
evolutioniq
Not specified

Auto P&C Claims Subject Matter Expert (SME Consultant) - AI SaaS

Remote · USA
Insurance Claims · Property & Casualty Insurance · Data Labeling · Natural Language Processing · Artificial Intelligence · Subject Matter Expertise · User Experience (UX) Feedback · Medical Document Analysis
+8 skills
levio
Not specified

Conseiller.ère en architecture AI

Remote · Canada
Python · Databricks · Azure · MLOps · Software Architecture · Infrastructure as Code · Automation · Big Data
+8 skills