Country: Canada

Senior, on-site, full-time

Senior ML Systems Engineer

An exceptional opportunity to work with some of the most powerful AI hardware in the world at a company that partners with OpenAI. The role offers high growth potential and work on technology that is described as an order of magnitude faster than current cloud solutions.

This vacancy comes from Quick Offer Global, a list of international companies.

Job difficulty

The role requires a rare combination of skills: a deep understanding of neural network architectures (LLM, MoE) and expert knowledge of systems programming (C++, LLVM, MLIR). Working with unique wafer-scale hardware adds difficulty because there are no standard, widely used optimization patterns for it.

Salary analysis

Median: $215,000

Market range: $175,000 to $260,000

For Senior ML Systems Engineer positions in Silicon Valley and Toronto, the market range is $180k-$250k in base salary plus a significant equity package (RSUs). Cerebras, as a fast-growing "unicorn", typically offers competitive terms at the upper end of the market.

Cover letter

I am writing to express my strong interest in the Senior ML Systems Engineer position at Cerebras Systems. With over five years of experience in deep learning frameworks and low-level optimization, I have closely followed Cerebras' breakthroughs in wafer-scale integration. My background in C++ and compiler development, specifically working with LLVM and MLIR, aligns perfectly with your SOTA Training Platform team's mission to bridge the gap between high-level model architectures and hardware-specific execution.

In my previous roles, I have successfully optimized large-scale models like LLaMA and implemented custom kernels to improve hardware utilization. I am particularly excited about the opportunity to work on the CSX systems and contribute to the end-to-end bring-up of proprietary and open-source models. My experience in debugging complex numerical accuracy issues and performance bottlenecks across the entire software stack makes me a strong candidate to help Cerebras maintain its lead in generative AI inference and training speeds.

Job description

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

About the Role

We are seeking a versatile and experienced engineer to join our SOTA Training Platform team. This team is responsible for rapidly bringing up state-of-the-art open-source models (like LLaMA, Qwen, etc.) or customer-provided proprietary models on our Cerebras CSX systems. Success in this role requires a system-minded generalist who thrives in fast-paced bring-up environments and is comfortable working across the entire Cerebras software stack.

Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.

Responsibilities

Contribute to the end-to-end bring-up of ML models on Cerebras CSX systems.
Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning.
Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
Propose and prototype improvements across tools, APIs, or automation flows to accelerate future bring-ups.
Study emerging training and post-training algorithms and map them to Cerebras software architecture and hardware.

Skills & Qualifications

Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field.
5+ years of relevant industry experience (internship/co-op experience included).
Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
Strong debugging skills across performance, numerical accuracy, and runtime integration.
Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion).
Proficiency in C/C++ programming and experience with low-level optimization.
Proven experience in compiler development, particularly with LLVM and/or MLIR.
Strong background in optimization techniques, particularly those involving NP-hard problems.
Familiarity with large-scale ML systems and state-of-the-art algorithms, including model training and reinforcement learning.

What We Offer

Competitive salary and benefits package.
Opportunities for professional growth and career advancement.
A dynamic and innovative work environment.
The chance to work on cutting-edge technologies and make a significant impact on the future of AI.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open source their cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Enjoy a simple, non-corporate work culture that respects individual beliefs.

Read our blog: Five Reasons to Join Cerebras in 2026.

Apply today and become part of the forefront of groundbreaking advancements in AI!

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.

Skills

Python
C++
PyTorch
TensorFlow
LLVM
MLIR
Deep Learning
Large Language Models
Compilers
Optimization
Reinforcement Learning

Possible interview questions

Tests understanding of how Cerebras hardware differs from traditional GPUs.

How would you adapt a distributed training algorithm for an architecture with a single giant (wafer-scale) chip, compared with a cluster of several GPUs?

Assesses experience with the compiler technologies listed in the vacancy.

Describe your experience with MLIR. How have you used dialects to optimize computation graphs?

Tests debugging skills at the boundary between software and hardware.

Tell us about a time you encountered a numerical accuracy discrepancy between a CPU and an accelerator. How did you localize the problem?

Assesses knowledge of modern LLM architectures.

What are the main difficulties in optimizing Mixture of Experts (MoE) models at the compiler and runtime level?

Tests low-level optimization skills.

How do you approach NP-hard problems in resource allocation or instruction scheduling in a compiler?
