Country: USA
Salary: $216,700 – $303,400

Senior · Remote · Full-time

Senior Research Engineer, Post-training & Evaluation

An exceptional opening at a top company, with high salary transparency and the chance to work on foundational models. The remote format and a strong benefits package make the offer very attractive.

Listing from Quick Offer Global, a list of international companies

Job difficulty

The high difficulty stems from the need for deep knowledge of LLMs, experience with distributed training (FSDP/DeepSpeed), and experience building custom benchmarks. The role calls for a combination of researcher and infrastructure-engineer skills.

Salary analysis

Median: $240,000

Market: $190,000 – $320,000

The offered salary ($216k – $303k) sits at the upper end of the market for Senior ML roles in the US, especially given the additional RSU compensation. It matches or even exceeds median values for Tier-1 technology companies.

I am writing to express my strong interest in the Senior Research Engineer position for Post-training & Evaluation at Reddit. With over four years of experience in machine learning and a deep focus on LLM fine-tuning, I am excited by the opportunity to build the 'Reddit Benchmark' and architect the evaluation suites that will define the next generation of Reddit-native foundational models. My background in implementing scalable SFT pipelines and developing Model-as-a-Judge systems aligns perfectly with your strategic initiative to enhance model reasoning and safety.

In my previous roles, I have successfully navigated the complexities of distributed training using FSDP and DeepSpeed, while maintaining a rigorous focus on data quality and evaluation metrics. I am particularly drawn to Reddit's unique challenge of capturing community-specific nuances and slang. I am confident that my technical expertise in PyTorch and Hugging Face, combined with my passion for building safe and intelligent AI systems, will allow me to make a significant contribution to the Anti-Evil Engineering team and the broader AI mission at Reddit.

Job description

Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 121 million daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit www.redditinc.com.

Reddit is continuing to grow our teams with the best talent. This role is completely remote friendly within the United States. If you happen to live close to one of our physical office locations (San Francisco, Los Angeles, New York City & Chicago), our doors are open for you to come into the office as often as you'd like.

The AI Engineering team at Reddit is embarking on a strategic initiative to build our own Reddit-native foundational Large Language Models (LLMs). This team sits at the intersection of applied research and massive-scale infrastructure, tasked with training models that truly understand the unique culture, language, and structure of Reddit communities. You will be joining a team of distinguished engineers and safety experts to build the "engine room" of Reddit's AI future—creating the foundational models that will power Safety & Moderation, Search, Ads, and the next generation of user products.

As a Senior Research Engineer for Post-Training & Evaluation, you will own the critical "feedback loop" of our model development. While the pre-training team builds the base models, you will architect the evaluation suites and fine-tuning pipelines that determine if those models are actually safe, smart, and "Reddit-native." You will build the "Reddit Benchmark"—our internal standard for model quality—and execute the Supervised Fine-Tuning (SFT) workflows that adapt our models for Safety and Moderation tasks.

Responsibilities:

Architect and maintain the "Reddit Benchmark" evaluation suite: A comprehensive harness that rigorously tests model capabilities across Safety, Reasoning, and Reddit-specific knowledge (slang, norms).
Build scalable SFT (Supervised Fine-Tuning) pipelines: Implement efficient, distributed training loops for instruction tuning, converting raw base models into helpful assistants.
Develop Model-as-a-Judge systems: Engineer automated evaluation pipelines using strong models (e.g., GPT-5, Nova, Claude) to grade the outputs of our internal models, enabling rapid iteration cycles.
Execute Synthetic Data generation strategies: Create and curate high-quality instruction sets to improve model generalization where human data is scarce.
Collaborate with Safety Engineering: Translate high-level safety policies into concrete evaluation metrics and unit tests that run in our CI/CD pipelines.
Debug post-training instability: Dive deep into loss curves and evaluation logs to identify when fine-tuning is causing alignment tax or capability degradation.
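As a rough illustration of the Model-as-a-Judge responsibility listed above, the sketch below scores a batch of model outputs with a pluggable judge and aggregates the results. It is only a minimal sketch under stated assumptions: `EvalCase`, `keyword_judge`, and `run_judge` are hypothetical names, and the rule-based judge is a stand-in for a call to a strong external model, not Reddit's actual pipeline.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class EvalCase:
    prompt: str
    response: str  # output of the model under test


# A judge maps (prompt, response) to a score in [0, 1].
Judge = Callable[[str, str], float]


def keyword_judge(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a strong-model judge.

    Scores empty answers 0.0, refusals 0.25, everything else 1.0.
    A real judge would send a grading prompt to an external model.
    """
    text = response.strip().lower()
    if not text:
        return 0.0
    if text.startswith(("i can't", "i cannot")):
        return 0.25
    return 1.0


def run_judge(cases: Sequence[EvalCase], judge: Judge, threshold: float = 0.5) -> dict:
    """Score every case with the judge and aggregate into a small report."""
    if not cases:
        raise ValueError("cases must not be empty")
    scores = [judge(c.prompt, c.response) for c in cases]
    return {
        "n": len(scores),
        "mean_score": mean(scores),
        "pass_rate": sum(s >= threshold for s in scores) / len(scores),
    }
```

Keeping the judge behind a plain callable means the same harness could be run with a cheap rule-based judge in CI and a model-backed judge in offline evaluation.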

Required Qualifications:

4+ years of professional experience in machine learning engineering, with a focus on LLM fine-tuning or evaluation.
Fluency in Python and PyTorch, with experience using libraries like Hugging Face Transformers, vLLM, or lm-eval-harness.
Deep understanding of Instruction Tuning (SFT) and how data quality impacts model behavior.
Experience building Evaluation Pipelines: You know the difference between MMLU and GSM8K, and how to build a custom domain-specific benchmark.
Familiarity with distributed training (FSDP/DeepSpeed) for fine-tuning jobs.
Strong data engineering skills for curating and cleaning instruction datasets.

Nice to Have:

Experience with MLflow, Weights & Biases, or other experiment tracking tools.
Experience with Synthetic Data generation (e.g., Self-Instruct papers).

Benefits:

Comprehensive Healthcare Benefits and Income Replacement Programs
401k with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits
Flexible Vacation & Paid Volunteer Time Off
Generous Paid Parental Leave

Pay Transparency:

This job posting may span more than one career level.

In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Additionally, Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave. To learn more, please visit https://www.redditinc.com/careers/.

To provide greater transparency to candidates, we share base salary ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar stage growth companies. Final offer amounts are determined by multiple factors, including skills, depth of work experience, and relevant licenses/credentials, and may vary from the amounts listed below.

The base salary range for this position is:

$216,700—$303,400 USD

In select roles and locations, the interviews will be recorded, transcribed and summarized by artificial intelligence (AI). You will have the opportunity to opt out of recording, transcription and summarization prior to any scheduled interviews.

During the interview, we will collect the following categories of personal information: Identifiers, Professional and Employment-Related Information, Sensory Information (audio/video recording), and any other categories of personal information you choose to share with us. We will use this information to evaluate your application for employment or an independent contractor role, as applicable. We will not sell your personal information or disclose it to any third party for their marketing purposes. We will delete any recording of your interview promptly after making a hiring decision. For more information about how we will handle your personal information, including our retention of it, please refer to our Candidate Privacy Policy for Potential Employees and Contractors.

Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.

Skills

Python
PyTorch
Hugging Face Transformers
vLLM
DeepSpeed
FSDP
LLM Fine-tuning
Data Engineering
MLflow
Weights & Biases
Synthetic Data Generation

Possible interview questions

Tests understanding of the specifics of model fine-tuning.

How do you approach selecting data for SFT so as to minimize the 'alignment tax' and preserve the model's base capabilities?

Assesses skill in building evaluation systems.

Describe the architecture of a 'Model-as-a-Judge' system: which metrics would you use to evaluate 'Reddit-native' responses?

Technical expertise in distributed computing.

What main stability problems have you run into when using FSDP or DeepSpeed for large models?

Tests data-handling skills.

How do you organize the generation and filtering of synthetic data for instruction tuning?

Integrating safety into development.

How do you translate abstract community safety policies into concrete unit tests for a model's CI/CD pipeline?
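One possible way to frame an answer to the last question is to express each policy as labeled test cases and fail the build when the model disagrees with any of them. The sketch below assumes a hypothetical `toy_moderation_model` placeholder (a phrase list standing in for the fine-tuned model) and invented policy IDs; it illustrates the pattern only and is not Reddit's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PolicyCase:
    policy_id: str       # hypothetical policy identifier
    text: str            # content shown to the model
    expected_label: str  # "allow" or "remove"


def toy_moderation_model(text: str) -> str:
    """Hypothetical placeholder for the model under test.

    Flags a fixed phrase list; naive substring matching is used only
    to keep the example self-contained.
    """
    blocked = ("dox", "home address")
    return "remove" if any(p in text.lower() for p in blocked) else "allow"


POLICY_CASES = [
    PolicyCase("no-personal-info", "Here is their home address", "remove"),
    PolicyCase("no-personal-info", "Let's dox this user", "remove"),
    PolicyCase("benign", "What is a good pizza place in Chicago?", "allow"),
]


def failing_cases(
    model: Callable[[str], str], cases: Sequence[PolicyCase]
) -> list[PolicyCase]:
    """Return every case where the model's label differs from the policy's."""
    return [c for c in cases if model(c.text) != c.expected_label]


def test_policy_cases() -> None:
    # Discoverable by pytest; an empty list means every policy case passed.
    assert failing_cases(toy_moderation_model, POLICY_CASES) == []
```

Returning the failing cases rather than a bare boolean lets the CI log show exactly which policy example regressed after a fine-tuning run.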
