Country: Cyprus

Hybrid, full-time

Machine Learning Engineer, AI Models

An exceptional opportunity to work at one of the most innovative AI hardware companies under the leadership of industry legends. The posting scores highly for the uniqueness of its tasks, work with RISC-V, and career growth potential in a hardware startup.

Listing from Quick Offer Global, a list of international companies

Job difficulty

The high difficulty stems from the need for a deep understanding not only of ML frameworks but also of systems programming and custom hardware. The candidate will work on problems at the boundary of software and hardware, which demands exceptional debugging skills.

Salary analysis

Median: €75,000

Market range: €55,000 to €100,000

The posting does not state a salary, but Tenstorrent describes its compensation as highly competitive. In Cyprus's deep tech and AI sector, compensation for engineers with this profile is typically well above the IT market average.

Cover letter

I am writing to express my strong interest in the Machine Learning Engineer position within the AI Models team at Tenstorrent. With a solid background in developing and optimizing transformer-based architectures using PyTorch, I am particularly drawn to Tenstorrent's mission of unifying software models with custom RISC-V hardware. My experience in debugging complex ML workloads and my passion for performance tuning align perfectly with your goal of chasing down every millisecond of execution time.

In my previous work, I have focused on the practical implementation of state-of-the-art models, often encountering the limitations of standard API-based deployments. The opportunity to work at the silicon scale and collaborate directly with compiler and hardware teams is exactly the challenge I am looking for. I am eager to bring my skills in model analysis and software engineering to help Tenstorrent build the most efficient AI platform on the market.

Join Tenstorrent to bring cutting-edge LLMs to custom hardware and shape the future of AI computing!

Job description

Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch, and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

Join Tenstorrent’s AI Models team and work at the layer most ML engineers never see: bringing advanced models to life on custom AI hardware. You’ll own real workloads end-to-end, including porting, tuning, and validating LLMs and vision models on our accelerator, and chasing down every last millisecond and percentage point of accuracy. This role is for people who love the craft of ML engineering and want their work to matter at silicon scale, not just behind another API.

This role is hybrid, based in Cyprus.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Who You Are

Bring up, run, and debug modern ML models (e.g., transformers) using PyTorch or TensorFlow.
Analyze model behavior and performance, and identify bottlenecks across the stack.
Improve efficiency, correctness, and scalability of model execution in real systems.
Work closely with compiler, kernel, and hardware teams to drive performance and system-level improvements.
Help translate state-of-the-art model architectures into production-grade, high-performance deployments.

What We Need

Strong experience building and working with ML models in PyTorch or TensorFlow.
Strong understanding of modern ML model architectures (e.g., transformers).
Solid software engineering fundamentals with strong debugging and problem-solving skills.
Comfort working in a fast-moving, research-meets-engineering environment.
Bonus, not required: experience with profiling or performance tuning, or familiarity with quantization, flash attention, kernel fusion, memory hierarchies, C++, CUDA, or systems programming.

What You Will Learn

How to bring state‑of‑the‑art LLMs and vision models to high performance on a custom AI accelerator.
How to trace and fix performance bottlenecks from PyTorch code down to kernels and memory systems.
How to turn research‑grade models into reliable, production deployments on new hardware.
The practical trade‑offs between techniques like quantization, FlashAttention, and kernel fusion when you’re optimizing real throughput, latency, and memory.
How your findings can drive changes across compiler, kernel, and hardware teams in a full-stack co-design loop.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

Skills

PyTorch
TensorFlow
Transformers
LLM
Computer Vision
C++
CUDA
Quantization
FlashAttention
Kernel Fusion
RISC-V
Systems Programming
Debugging

Possible interview questions

Tests understanding of the transformer architecture, which is central to this role.

Can you describe the attention mechanism in detail and how it affects memory usage as sequence length scales?

The role involves optimizing models for specific hardware.

What strategies would you use to identify performance bottlenecks in a PyTorch model running on a new accelerator?

The description mentions quantization and FlashAttention as important skills.

What are the main trade-offs of using 8-bit quantization (INT8) compared with FP16 for LLM weights?

Working at Tenstorrent requires close collaboration between software and hardware teams.

Describe a case where you had to optimize code at a low level (for example, through custom kernels or memory management) to achieve a speedup.

Tests debugging skills in a complex environment.

How do you approach debugging accuracy discrepancies between a model on a standard GPU and the same model ported to a custom accelerator?
