Country: USA

Hybrid, full-time

Member of Technical Staff - Post Training, Applied (Vision)

An exceptional opportunity to work at a unicorn startup founded by people from MIT. The role offers a rare combination of work on foundation models and their real-world application, an excellent benefits package, and a high degree of responsibility.

Listing from Quick Offer Global, a list of international companies

Job difficulty

The role requires deep expertise in Vision-Language Models (VLM), experience training models (SFT, RL), and the ability to work with data. The high difficulty stems from the need to combine deep technical expertise with direct interaction with large enterprise customers.

Salary analysis

Median: $230,000

Market range: $190,000 to $280,000

For a unicorn-level company, this role in San Francisco typically comes with a base salary above the market average, supplemented by a substantial stock option package. Market estimates for Senior/Staff ML roles in this region start at $200k.

I am writing to express my strong interest in the Member of Technical Staff - Post Training, Applied (Vision) position at Liquid AI. With a deep background in fine-tuning vision-language models and a passion for bridging the gap between foundational research and enterprise-grade deployment, I am excited by Liquid AI's mission to build efficient, general-purpose AI systems. My experience in designing visual data curation pipelines and executing SFT/RLHF workflows for VLMs aligns perfectly with your need for someone who can own the post-training stack end-to-end.

In my previous work, I have focused on optimizing visual data quality and developing task-specific evaluations for complex multimodal capabilities like OCR and document understanding. I pride myself on being a pragmatic engineer who prioritizes model performance and customer outcomes. I am particularly drawn to Liquid AI because of the unique opportunity to contribute to core multimodal model development while simultaneously solving high-impact problems for global enterprises. I look forward to the possibility of bringing my expertise in multimodal alignment and evaluation to your rapidly scaling team.

Join Liquid AI to build cutting-edge multimodal models at the intersection of science and real-world business!

Job description

About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

This is a rare chance to sit at the intersection of frontier vision-language models and real-world deployment. You'll own applied post-training work for VLMs end-to-end for some of the world's largest enterprises, while still contributing directly to Liquid's core multimodal model development.

Unlike most roles that force a trade-off between customer impact and foundational work, this role gives you both: deep ownership over how vision-language models are adapted, evaluated, and shipped, and a direct line into the evolution of Liquid's multimodal post-training stack.

If you care about visual understanding, data quality, evaluation, and making VLMs actually work in production, this is a chance to shape how applied multimodal AI is done at a foundation model company.

What We're Looking For

We need someone who:

Takes ownership: Owns VLM post-training projects end-to-end, from customer requirements through delivery and evaluation.
Thinks end-to-end: Can reason across visual data curation, training, alignment, and evaluation as a single system.
Is pragmatic: Optimizes for model quality and customer outcomes over publications or theory.
Communicates clearly: Can translate between customer needs and internal technical teams, and push back when needed.

The Work

Act as the technical owner for enterprise customer VLM post-training engagements.
Translate customer requirements into concrete multimodal post-training specifications and workflows.
Design and execute visual data generation, filtering, and quality assessment processes, including image-text pair curation, annotation pipelines, and synthetic data generation for visual tasks.
Run supervised fine-tuning, preference alignment, and reinforcement learning workflows for vision-language models.
Design task-specific evaluations for visual understanding, grounding, OCR, document parsing, and other multimodal capabilities. Interpret results and feed learnings back into core post-training pipelines.

Desired Experience

Must-have:

Hands-on experience with data generation and evaluation for VLM or multimodal post-training.
Experience training or fine-tuning vision-language models using SFT, preference alignment, and/or RL.
Strong intuition for visual data quality, annotation design, and multimodal evaluation.
Familiarity with vision encoders, image-text architectures, and how visual representations interact with language model backbones.

Nice-to-have:

Experience with visual grounding, document understanding, OCR, or video understanding tasks.
Experience contributing to shared or general-purpose multimodal post-training infrastructure.
Prior exposure to customer-facing or applied ML delivery environments.
Familiarity with alignment or RL techniques beyond basic supervised fine-tuning in the multimodal setting.

What Success Looks Like (Year One)

Independently owns and delivers enterprise VLM post-training projects with minimal oversight.
Is trusted by customers as the technical owner, demonstrating strong judgment and delivery quality on multimodal workloads.
Has made durable contributions to Liquid's general-purpose multimodal post-training pipelines by feeding applied learnings back into baseline model development.

What We Offer

Real ML work: You will fine-tune vision-language models, generate multimodal data, and ship solutions, not configure API calls. Your work feeds directly back into our core model development.
Compensation: Competitive base salary with equity in a unicorn-stage company.
Health: We pay 100% of medical, dental, and vision premiums for employees and dependents.
Financial: 401(k) matching up to 4% of base pay.
Time Off: Unlimited PTO plus company-wide Refill Days throughout the year.

Skills

Python
PyTorch
Vision Language Models
Multimodal Learning
Supervised Fine-Tuning
Reinforcement Learning
Computer Vision
Natural Language Processing
OCR
Data Curation
Machine Learning Evaluation

Possible interview questions

Tests hands-on experience with multimodal data.

Describe your approach to filtering and assessing the quality of image-text pairs for VLM training. Which metrics do you consider most informative?

Assesses fine-tuning and model alignment skills.

What are the main challenges of applying preference alignment methods (for example, DPO or PPO) to multimodal models compared with text-only ones?

Tests the ability to solve specific computer vision problems.

How would you design an evaluation system for a visual grounding task in the context of an enterprise document?

Assesses architectural knowledge.

How does the choice of visual encoder affect subsequent fine-tuning of the language model within a VLM? Give examples of trade-offs.

Tests customer interaction and prioritization skills.

Describe a situation where a customer's requirements conflicted with the model's technical capabilities. How did you argue your position, and what solution did you propose?
