Country: USA
Salary: $138,000 – $206,000

Senior · On-site · Full-time

Senior Engineer, AI Systems

An exceptional position in the R&D division of a global leader. Work with advanced technologies (AGI, Triton, LLM) and a competitive salary make this vacancy very attractive to top engineers.

This vacancy comes from Quick Offer Global, a list of international companies.

Job difficulty

The high difficulty stems from the need for deep knowledge in narrow areas: Triton kernel development, accelerator (GPU) architecture, and low-level LLM optimization. Experience at the boundary of hardware and software is required.

Salary analysis

Median: $185,000

Market: $150,000 – $220,000

The offered range of $138k – $206k is broadly consistent with market standards for Senior positions in Silicon Valley. The upper bound of $206k is competitive for large technology companies (Big Tech), although at growth-stage startups total compensation can be higher because of stock options.
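As a rough check on the figures above, the short Python sketch below compares the offered range with the quoted market range and median. The variable names and the "overlap share" measure are illustrative assumptions, not the method used to produce the analysis on this page.

```python
# Figures quoted in the salary analysis above (USD per year).
offered_low, offered_high = 138_000, 206_000
market_low, market_high = 150_000, 220_000
market_median = 185_000

# Midpoint of the offered range.
offered_mid = (offered_low + offered_high) // 2

# Portion of the offered range that falls inside the market range.
overlap_low = max(offered_low, market_low)
overlap_high = min(offered_high, market_high)
overlap = max(0, overlap_high - overlap_low)
overlap_share = overlap / (offered_high - offered_low)

# Does the market median fall within the offered range?
median_in_offer = offered_low <= market_median <= offered_high

print(f"Offered midpoint: ${offered_mid:,}")
print(f"Overlap with market range: ${overlap:,} ({overlap_share:.0%} of the offer)")
print(f"Market median inside offered range: {median_in_offer}")
```

Under these assumptions the market median sits inside the offered range, while the bottom of the offer ($138k) falls below the quoted market floor ($150k).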

I am writing to express my strong interest in the Senior AI Systems Engineer position at Samsung Semiconductor’s AGI Computing Lab. With a robust background in high-performance computing and deep expertise in developing Triton kernels, I am eager to contribute to your mission of revolutionizing LLM inference and training through hardware-software co-design. My experience in optimizing memory access patterns and tiling strategies aligns perfectly with your team's focus on maximizing performance for next-generation AI workloads.

Throughout my career, I have developed a deep understanding of accelerator architectures, including HBM and SRAM hierarchies, and have a proven track record of diagnosing complex performance bottlenecks using advanced profiling tools. I am particularly drawn to this role because of the opportunity to work on the Triton-based software stack and influence future hardware designs. I am confident that my technical skills in Python and low-level performance programming, combined with my passion for AGI, will make me a valuable asset to your research-driven systems lab.

Join the Samsung AGI Lab team to build the future of AI systems at the intersection of hardware and software!

Job description

Please Note:

To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.

Advancing the World’s Technology Together

Our technology solutions power the tools you use every day--including smartphones, electric vehicles, hyperscale data centers, IoT devices, and so much more. Here, you’ll have an opportunity to be part of a global leader whose innovative designs are pushing the boundaries of what’s possible and powering the future.

We believe innovation and growth are driven by an inclusive culture and a diverse workforce. We’re dedicated to empowering people to be their true selves. Together, we’re building a better tomorrow for our employees, customers, partners, and communities.

The AGI (Artificial General Intelligence) Computing Lab is dedicated to solving the complex system-level challenges posed by the growing demands of future AI/ML workloads. Our team is committed to designing and developing scalable platforms that can effectively handle the computational and memory requirements of these workloads while minimizing energy consumption and maximizing performance. To achieve this goal, we collaborate closely with both hardware and software engineers to identify and address the unique challenges posed by AI/ML workloads and to explore new computing abstractions that can provide a better balance between the hardware and software components of our systems. Additionally, we continuously conduct research and development in emerging technologies and trends across memory, computing, interconnect, and AI/ML, ensuring that our platforms are always equipped to handle the most demanding workloads of the future. By working together as a dedicated and passionate team, we aim to revolutionize the way AI/ML applications are deployed and executed, ultimately contributing to the advancement of AGI in an affordable and sustainable manner. Join us in our passion to shape the future of computing!

This role is being offered under the AGICL lab as a part of DSRA. We are a research-driven systems lab working at the intersection of large language models, accelerator hardware, and high-performance software stacks. Our mission is to design, prototype, and optimize next-generation AI systems through tight hardware–software co-design.

Our team works hands-on with cutting-edge accelerator hardware, experimental memory systems, and emerging domain-specific languages (DSLs). We build and optimize a Triton-based software stack that pushes the limits of performance, efficiency, and scalability for modern LLM workloads.

We are looking for a Senior AI Systems Engineer with deep experience in high-performance Triton kernel development on modern accelerators. In this role, you will design, analyze, and optimize performance-critical kernels used in large-scale LLM inference and training pipelines. You will work closely with hardware architects, compiler engineers, and ML researchers to identify performance bottlenecks, interpret profiling data, and co-design solutions that span software and hardware boundaries.

This role is ideal for engineers who enjoy working close to the hardware stack while still reasoning deeply about model-level abstractions.

Location: Daily onsite presence at our San Jose, CA office / U.S. headquarters in alignment with our Flexible Work policy.

What You’ll Do

Design, implement, and optimize high-performance Triton kernels for LLM workloads on existing accelerators.
Analyze kernel performance using profiling tools; interpret metrics such as latency, throughput, occupancy, memory bandwidth, and compute utilization.
Identify performance bottlenecks in kernel design (e.g., memory access patterns, synchronization, tiling strategies) and propose concrete optimizations.
Work across the stack, from model architecture to kernel implementation, to ensure end-to-end performance efficiency.
Collaborate with hardware and compiler teams on hardware–software co-design, providing feedback that influences future accelerator and DSL designs.
Prototype and evaluate kernel optimizations using upcoming DSLs and experimental compiler flows.
Contribute to the evolution of a Triton-based software stack used for cutting-edge research and production-grade experimentation.
Document design decisions, performance trade-offs, and optimization strategies clearly for internal and external stakeholders.

What You Bring

Bachelor’s with 5+ years, or Master’s with 3+ years, or PhD with 0+ years of industry experience.
Strong experience writing high-performance Triton kernels for GPUs or other accelerators.
Solid understanding of LLM fundamentals, including attention mechanisms, transformer architectures, and inference/training workflows.
Deep knowledge of accelerator hardware architecture, including memory hierarchies (HBM, SRAM, caches).
Proven ability to read and interpret profiling data and performance counters.
Experience diagnosing and resolving performance bottlenecks in kernel-level code.
Strong systems programming skills in Python and low-level performance-oriented programming paradigms.
Experience with hardware–software co-design or compiler-assisted optimization.
Familiarity with FlashAttention, fused kernels, MoE kernels, and different attention mechanisms.
Experience working with emerging or experimental domain-specific languages (DSLs) for accelerator programming.
Background in ML systems, compilers, or performance engineering.
Prior experience working with different accelerator backends (including but not limited to CUDA).
Opportunity to work on cutting-edge accelerator hardware and experimental software stacks.
Direct impact on the performance and design of next-generation AI systems.
A highly collaborative environment spanning hardware, systems, and ML research.
Flexibility to publish, prototype, and influence future hardware and software directions.
Ability to work effectively in cross-functional, research-oriented environments.
Strong analytical and problem-solving skills.
You’re inclusive, adapting your style to the situation and diverse global norms of our people.
An avid learner, you approach challenges with curiosity and resilience, seeking data to help build understanding.
You’re collaborative, building relationships, humbly offering support and openly welcoming approaches.
Innovative and creative, you proactively explore new ideas and adapt quickly to change.

What We Offer

The pay range below is for all roles at this level across all US locations and functions. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. We also offer incentive opportunities that reward employees based on individual and company performance.

This is in addition to our diverse package of benefits centered around the wellbeing of our employees and their loved ones. In addition to the usual Medical/Dental/Vision/401k, our inclusive rewards plan empowers our people to care for their whole selves. An investment in your future is an investment in ours.

Give Back: With a charitable giving match and frequent opportunities to get involved, we take an active role in supporting the community.

Enjoy Time Away: You’ll start with 4+ weeks of paid time off a year, plus holidays and sick leave, to rest and recharge.

Care for Family: Whatever family means to you, we want to support you along the way, including a stipend for fertility care or adoption, medical travel support, and virtual vet care for your fur babies.

Prioritize Emotional Wellness: With on-demand apps and free confidential therapy sessions, you’ll have support no matter where you are.

Stay Fit: Eating well and being active are important parts of a healthy life. Our onsite Café and gym, plus virtual classes, make it easier.

Embrace Flexibility: Benefits are best when you have the space to use them. That’s why we facilitate a flexible environment so you can find the right balance for you.

Base Pay Range

$138,000—$206,000 USD

Equal Opportunity Employment Policy

Samsung Semiconductor takes pride in being an equal opportunity workplace dedicated to fostering an environment where all individuals feel valued and empowered to excel, regardless of race, religion, color, age, disability, sex, gender identity, sexual orientation, ancestry, genetic information, marital status, national origin, political affiliation, or veteran status.

When selecting team members, we prioritize talent and qualities such as humility, kindness, and dedication. We extend comprehensive accommodations throughout our recruiting processes for candidates with disabilities, long-term conditions, neurodivergent individuals, or those requiring pregnancy-related support. All candidates scheduled for an interview will receive guidance on requesting accommodations.

Recruiting Agency Policy

We do not accept unsolicited resumes. Only authorized recruitment agencies that have a current and valid agreement with Samsung Semiconductor, Inc. are permitted to submit resumes for any job openings.

Applicant AI Use Policy

At Samsung Semiconductor, we support innovation and technology. However, to ensure a fair and authentic assessment, we prohibit the use of generative AI tools to misrepresent a candidate's true skills and qualifications. Permitted uses are limited to basic preparation, grammar, and research, but all submitted content and interview responses must reflect the candidate’s genuine abilities and experience. Violation of this policy may result in immediate disqualification from the hiring process.

Applicant Privacy Policy

https://semiconductor.samsung.com/about-us/careers/us/privacy/

Skills

Triton
Python
CUDA
LLM
FlashAttention
GPU Architecture
Performance Engineering
Compilers
HBM
SRAM
Machine Learning Systems

Possible interview questions

Testing hands-on experience with performance optimization in Triton.

Describe the most difficult memory-bandwidth problem you have solved while writing a Triton kernel. Which tiling strategies did you use?

Assessing understanding of modern accelerator architecture.

How would you design a custom FlashAttention kernel to minimize HBM accesses and maximize SRAM utilization?

Testing profiling and data-analysis skills.

Which profiler metrics (for example, occupancy or SM efficiency) do you consider most critical when optimizing kernels for LLM inference?

Assessing knowledge of modern neural network architectures.

What are the main challenges in implementing efficient kernels for Mixture of Experts (MoE) models compared with standard transformers?

Testing the ability to collaborate cross-functionally.

Describe your experience working with hardware architects: what changes to the accelerator design would you propose to improve how the Triton stack performs?
