- Country: USA
- Salary: $219,000 – $351,000
Principal Engineer, AI Serving Framework Architect (Software)
An exceptional opening in the R&D division of a global market leader, with very high compensation and a chance to influence the industry. Offers an excellent benefits package and work on cutting-edge technologies (LLM, RAG, AI agents).
Role Complexity
This is a position of the highest level of complexity, requiring a PhD, more than 15 years of experience, and rare expertise in memory architecture and LLM inference. The candidate must be able to provide technical leadership to international teams and have a deep understanding of hardware bottlenecks.
Salary Analysis
The offered range of $219k–$351k is fully in line with market rates for Principal/Staff Engineer positions in Silicon Valley. The upper bound of the range is competitive even against top Big Tech companies.
Cover Letter
I am writing to express my strong interest in the Principal AI Serving Framework Architect position at Samsung Semiconductor's Architecture Research Lab. With over 15 years of experience in high-performance computing and a deep specialization in LLM inference stacks, I have successfully led large-scale projects that deliver AI services to hundreds of thousands of users. My expertise in optimizing vLLM and managing KVCache in hierarchical memory systems aligns perfectly with your lab's focus on overcoming system-level bottlenecks.
Throughout my career, I have focused on bridging the gap between complex AI workloads and hardware design. I am particularly excited about Samsung's work in memory-centric systems and the opportunity to lead research teams in developing dynamic scheduling methodologies for multi-rack scale environments. My proficiency in C++, Python, and PyTorch, combined with my experience in heterogeneous compute environments, positions me to contribute immediately to ARL’s strategic goals.
Job Description
Please Note:
To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.
Advancing the World’s Technology Together
Our technology solutions power the tools you use every day, including smartphones, electric vehicles, hyperscale data centers, IoT devices, and so much more. Here, you’ll have an opportunity to be part of a global leader whose innovative designs are pushing the boundaries of what’s possible and powering the future.
We believe innovation and growth are driven by an inclusive culture and a diverse workforce. We’re dedicated to empowering people to be their true selves. Together, we’re building a better tomorrow for our employees, customers, partners, and communities.
Job Title: Principal Engineer, AI Serving Framework Architect (Software)
What You’ll Do
The Architecture Research Lab (ARL) focuses on addressing fundamental system-level bottlenecks in modern AI, particularly in memory capacity/bandwidth and system-scale communication. By leveraging Samsung’s world-class memory technologies, ARL explores and defines next-generation AI system architectures that deliver step-function improvements in performance, efficiency, and scalability.
We are seeking a Principal AI System Architect who will play a key role in bridging AI workloads, system architecture, and hardware design. In this role, you will develop system-level performance models, drive architecture-level design decisions, and propose forward-looking AI system architectures that shape Samsung’s long-term AI platform strategy.
Location: Daily onsite presence at our San Jose office in alignment with our Flexible Work policy
Job ID: 42853
- As a Tech Lead, leading research teams in Korea and proposing technical direction
- Research on dynamic scheduling methodologies for maximizing AI inference performance in multi-rack scale memory-centric systems, composed of heterogeneous compute-capable memory and hierarchical memory
- Investigating methods to accelerate search operations in RAG’s vector DB and AI Agent’s knowledge-graph by leveraging compute-capable memory
- Studying strategies for optimally placing KVCache and a vector DB in hierarchical memory to minimize frequent SSD accesses and reduce IO stalls
- Proposing SW design for implementing the derived optimization algorithms on open-source platforms such as vLLM
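The KVCache/vector-DB placement work described above can be illustrated with a toy tiering policy: the hottest cache blocks go to the fastest tier that still has capacity, and colder blocks spill toward SSD. This is a minimal sketch under assumed tier names, capacities, and access statistics, not Samsung's or vLLM's actual algorithm.

```python
from dataclasses import dataclass

# Hypothetical tiers, fastest first; capacities are toy numbers (in blocks).
TIERS = ["HBM", "DDR", "SSD"]
CAPACITY = {"HBM": 2, "DDR": 4, "SSD": 1_000_000}

@dataclass
class KVBlock:
    block_id: int
    access_count: int  # how often this KV block was read recently

def place_blocks(blocks):
    """Greedy hotness-based placement: hottest blocks land in the
    fastest tier with free capacity; colder blocks spill downward."""
    placement = {}
    used = {tier: 0 for tier in TIERS}
    for blk in sorted(blocks, key=lambda b: b.access_count, reverse=True):
        for tier in TIERS:
            if used[tier] < CAPACITY[tier]:
                placement[blk.block_id] = tier
                used[tier] += 1
                break
    return placement

blocks = [KVBlock(0, 90), KVBlock(1, 75), KVBlock(2, 40),
          KVBlock(3, 10), KVBlock(4, 5), KVBlock(5, 2), KVBlock(6, 1)]
print(place_blocks(blocks))  # blocks 0-1 -> HBM, 2-5 -> DDR, 6 -> SSD
```

A production policy would also have to weigh block size, eviction cost, and predicted reuse, but this greedy shape is the usual starting point.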
What You Bring
- PhD in Computer Science or a related field with 15+ years of experience in AI serving frameworks for large-scale computing, with a focus on AI workloads.
- Led a project to build and optimize a Large Language Model (LLM) Inference Software Stack on a multi-rack scale system to deliver AI Inference services to over 100,000 users.
- Extensive experience in designing AI Inference Software Stacks for heterogeneous devices.
- In-depth understanding of the internal architecture and operation mechanisms of inference engines such as vLLM.
- Proficiency in AI Inference System Profiling and optimization.
- Knowledge and practical experience with future AI workloads, including reasoning models, multi-modal solutions, AI agents, and world models.
- Strong understanding of compute, memory, and networking bottlenecks in AI systems.
- Required skillsets: PyTorch, Python, and C++
- A collaborative mindset, curiosity, and resilience in solving complex challenges.
- Excellent verbal, presentation, and written communication skills.
- (Nice to have) Native or fluent Korean language skills.
- You’re inclusive, adapting your style to the situation and diverse global norms of our people.
- You approach challenges with curiosity and resilience, seeking data to help build understanding.
- You’re collaborative, building relationships, humbly offering support, and openly welcoming others’ approaches.
- Innovative and creative, you proactively explore new ideas and adapt quickly to change.
What We Offer
The pay range below is for all roles at this level across all US locations and functions. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. We also offer incentive opportunities that reward employees based on individual and company performance.
This is in addition to our diverse package of benefits centered around the wellbeing of our employees and their loved ones. In addition to the usual Medical/Dental/Vision/401k, our inclusive rewards plan empowers our people to care for their whole selves. An investment in your future is an investment in ours.
Give Back: With a charitable giving match and frequent opportunities to get involved, we take an active role in supporting the community.
Enjoy Time Away: You’ll start with 4+ weeks of paid time off a year, plus holidays and sick leave, to rest and recharge.
Care for Family: Whatever family means to you, we want to support you along the way, including a stipend for fertility care or adoption, medical travel support, and virtual vet care for your fur babies.
Prioritize Emotional Wellness: With on-demand apps and free confidential therapy sessions, you’ll have support no matter where you are.
Stay Fit: Eating well and being active are important parts of a healthy life. Our onsite Café and gym, plus virtual classes, make it easier.
Embrace Flexibility: Benefits are best when you have the space to use them. That’s why we facilitate a flexible environment so you can find the right balance for you.
Base Pay Range
$219,000—$351,000 USD
Equal Opportunity Employment Policy
Samsung Semiconductor takes pride in being an equal opportunity workplace dedicated to fostering an environment where all individuals feel valued and empowered to excel, regardless of race, religion, color, age, disability, sex, gender identity, sexual orientation, ancestry, genetic information, marital status, national origin, political affiliation, or veteran status.
When selecting team members, we prioritize talent and qualities such as humility, kindness, and dedication. We extend comprehensive accommodations throughout our recruiting processes for candidates with disabilities, long-term conditions, neurodivergent individuals, or those requiring pregnancy-related support. All candidates scheduled for an interview will receive guidance on requesting accommodations.
Recruiting Agency Policy
We do not accept unsolicited resumes. Only authorized recruitment agencies that have a current and valid agreement with Samsung Semiconductor, Inc. are permitted to submit resumes for any job openings.
Applicant AI Use Policy
At Samsung Semiconductor, we support innovation and technology. However, to ensure a fair and authentic assessment, we prohibit the use of generative AI tools to misrepresent a candidate's true skills and qualifications. Permitted uses are limited to basic preparation, grammar, and research, but all submitted content and interview responses must reflect the candidate’s genuine abilities and experience. Violation of this policy may result in immediate disqualification from the hiring process.
Applicant Privacy Policy
https://semiconductor.samsung.com/about-us/careers/us/privacy/
Skills
- C++
- Python
- PyTorch
- LLM
- RAG
- System Architecture
- Distributed Systems
- Vector Databases
- vLLM
- Performance Modeling
- AI Inference
Possible Interview Questions
Tests deep knowledge of the internals of the inference engines named in the posting.
Describe the memory-management mechanisms in vLLM (e.g., PagedAttention) and propose ways to optimize them for Samsung's hierarchical memory systems.
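To prepare an answer, it helps to model the indirection PagedAttention relies on: a per-sequence block table maps logical token positions to fixed-size physical KV blocks, so memory is allocated on demand rather than reserved for the maximum sequence length. The toy below is my own simplification; vLLM's real implementation differs in many details (its default block size is 16 tokens, shrunk here for readability).

```python
BLOCK_SIZE = 4  # tokens per KV block (toy value; vLLM defaults to 16)

class PagedKVCache:
    """Toy model of paged KV-cache management: a per-sequence block
    table maps logical positions to physical blocks allocated from a
    shared free list. Assumes tokens are appended in order."""
    def __init__(self, num_physical_blocks):
        self.free_blocks = list(range(num_physical_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids

    def append_token(self, seq_id, pos):
        table = self.block_tables.setdefault(seq_id, [])
        if pos // BLOCK_SIZE >= len(table):      # crossed a block boundary
            table.append(self.free_blocks.pop(0))
        return table[pos // BLOCK_SIZE], pos % BLOCK_SIZE

cache = PagedKVCache(num_physical_blocks=8)
for pos in range(6):  # sequence 0 grows token by token
    print(cache.append_token(0, pos))
# (block, offset): (0,0) (0,1) (0,2) (0,3) then (1,0) (1,1)
```

Optimizing this for hierarchical memory then becomes a question of which tier each physical block lives in, and when blocks migrate between tiers.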
The posting emphasizes solving bandwidth and latency problems.
How would you design a KVCache placement strategy in a system with multi-tier memory (HBM, DDR, SSD) to minimize IO stalls while serving 100k+ users?
The role involves leading research teams in Korea.
Tell us about your experience leading distributed R&D teams and how you translate complex architectural decisions into concrete tasks for engineers.
The work involves novel search-acceleration methods.
How can compute capabilities inside memory (compute-in-memory) accelerate search in vector databases for RAG systems?
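One concrete way to frame an answer: brute-force similarity search is data-movement bound, so compute-capable memory can score vectors where they reside and ship only per-bank winners across the bus. Below is a minimal host-side sketch of that pattern; the bank layout and function names are hypothetical, and real systems would use approximate indexes rather than exhaustive scoring.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cim_search(banks, query):
    """Near-memory search sketch: each 'bank' scores its resident
    vectors locally and returns only its best candidate, so full
    vectors never cross the memory bus; the host merges winners."""
    candidates = []
    for bank_id, vectors in enumerate(banks):
        best = max(range(len(vectors)), key=lambda i: dot(vectors[i], query))
        candidates.append((dot(vectors[best], query), bank_id, best))
    return max(candidates)  # global winner: (score, bank, index)

banks = [
    [[3, 2], [1, 0]],  # vectors resident in bank 0
    [[2, 1], [0, 1]],  # vectors resident in bank 1
]
print(cim_search(banks, [1, 1]))  # -> (5, 0, 0)
```

The interesting part of the answer is the traffic ratio: with this pattern, bytes moved scale with the number of banks, not the number of stored vectors.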
Assesses the candidate's ability to see the big picture of technology trends.
What are the main networking and scaling bottlenecks you see when moving from single-node inference to multi-rack AI systems?