- Country
- Canada

Senior Lead Machine Learning Engineer, Agentic AI
The high score reflects work at a top technology company on a cutting-edge area (Agentic AI). The role offers significant product impact and the chance to shape standards in a new field of AI.
Job difficulty
The role requires an exceptional combination of deep LLM knowledge (SFT, RLHF, agent frameworks) and skill in designing high-load distributed systems. The posting asks for 8–12+ years of experience and the ability to lead cross-functional initiatives.
Salary analysis
The expected salary is in line with the Senior Lead level in major Canadian tech hubs such as Toronto. Given the Generative AI specialization, compensation may be above the market average because experts at this level are scarce.
Cover letter
I am writing to express my strong interest in the Senior Lead Machine Learning Engineer position for Agentic AI at Upwork. With over a decade of experience in applied machine learning and a proven track record of deploying LLM-powered products, I am particularly drawn to Upwork’s mission of creating a robust platform for multi-agent systems. My background in architecting low-latency inference services and implementing advanced alignment techniques like DPO and RLHF aligns perfectly with your goal of building reliable, production-grade agentic workflows.
In my previous roles, I have led the development of complex orchestration layers and evaluation harnesses that bridge the gap between research and scalable engineering. I am excited by the prospect of defining the protocols and guardrails that will power the next generation of AI agents on Upwork. I look forward to the opportunity to bring my expertise in distributed systems and LLM adaptation to your talented team and help drive the evolution of the Upwork Marketplace.
Join Upwork to design the future of agentic AI and set industry standards on a global scale!
Job description
Upwork Inc.’s (Nasdaq: UPWK) family of companies connects businesses with global, AI-enabled talent across every contingent work type including freelance, fractional, and payrolled. This portfolio includes the Upwork Marketplace, which connects businesses with on-demand access to highly skilled talent across the globe, and Lifted, which provides a purpose-built solution for enterprise organizations to source, contract, manage, and pay talent across the full spectrum of contingent work. From Fortune 100 enterprises to entrepreneurs, businesses rely on Upwork Inc. to find and hire expert talent, leverage AI-powered work solutions, and drive business transformation. With access to professionals spanning more than 10,000 skills across AI & machine learning, software development, sales & marketing, customer support, finance & accounting, and more, the Upwork family of companies enables businesses of all sizes to scale, innovate, and transform their workforces for the age of AI and beyond.
Since its founding, Upwork Inc. has facilitated more than $30 billion in total transactions and services as it fulfills its purpose to create opportunity in every era of work. Learn more about the Upwork Marketplace at Upwork.com and follow us on LinkedIn, Facebook, Instagram, TikTok, and X; and learn more about Lifted at Go-Lifted and follow on LinkedIn.
We’re seeking a Senior Lead Machine Learning Engineer to architect, ship, and scale the next generation of agentic intelligence across Upwork. You will lead end‑to‑end development of AI agents and the platform that powers them—from LLM training and evaluation to runtime orchestration, safety, and developer APIs. This is a hands‑on, high‑impact role at the intersection of applied research and platform engineering, enabling internal teams and external developers to build reliable, safe, and high‑performing agents on Upwork.
Responsibilities
- Build Agentic Intelligence. Design and implement multi‑agent systems (planning, tool‑use, memory, debate/critique, reflection) with robust guardrails and recovery strategies.
- Develop protocol‑aware agents and services that interoperate cleanly with developer tooling (e.g., agent frameworks and protocols such as MCP).
- Own reliability at scale: deterministic execution where needed, idempotency, timeouts/retries, and evaluation‑driven iteration on agent behavior.
- Train, Align, and Evaluate LLMs for Agents. Lead data strategy and curation for agent tasks; drive SFT, DPO, RLHF/RLAIF, and safety tuning tailored to multi‑tool, multi‑step workflows.
- Stand up evaluation harnesses for functional, task, and longitudinal metrics (success rate, time‑to‑completion, hallucination/escape rates, cost/latency).
- Build policy‑driven guardrails; partner with Legal/Security on data governance and privacy.
- Engineer Agentic Platform Backend Infrastructure. Architect low‑latency inference, retrieval, and orchestration services (streaming, event‑driven pipelines; scalable queues; caching; batching) with strong SLOs.
- Ship production‑grade services (APIs/SDKs, auth, rate limiting, observability) that make agent features easy to integrate for internal and external developers.
- Optimize cost/performance via quantization, distillation, model‑routing, and autoscaling; integrate evaluation signals directly into runtime and CI/CD.
- Lead, Partner, and Uplevel the Ecosystem. Provide technical leadership across research, product, and platform teams; mentor senior ICs; influence roadmaps with clear metrics and trade‑offs.
- Publish internal guidance and exemplar implementations; contribute to technical content, samples, and reference architectures for our agent platform.
- Define and track KPIs for data/quality/throughput, and drive continuous improvement using experiment results and production telemetry.
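To make the reliability items above more concrete (idempotency and retries), here is a minimal Python sketch. It is not taken from the posting: the names `IdempotentToolRunner` and `TransientToolError` are hypothetical, and the in-memory dictionary stands in for whatever durable store a production system would use.

```python
import time
from typing import Any, Callable, Dict


class TransientToolError(Exception):
    """Hypothetical error type marking a failure that is safe to retry."""


class IdempotentToolRunner:
    """Runs a tool call at most once per idempotency key, retrying transient failures."""

    def __init__(self, max_attempts: int = 3, base_delay_s: float = 0.1) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be >= 1")
        self.max_attempts = max_attempts
        self.base_delay_s = base_delay_s
        # Assumption: an in-memory dict stands in for a durable result store.
        self._results: Dict[str, Any] = {}

    def run(self, key: str, tool: Callable[[], Any]) -> Any:
        # A repeated key returns the stored result instead of re-executing the tool.
        if key in self._results:
            return self._results[key]
        attempt = 1
        while True:
            try:
                result = tool()
            except TransientToolError:
                if attempt == self.max_attempts:
                    raise  # retry budget exhausted, surface the error
                # Exponential backoff: base, 2 * base, 4 * base, ...
                time.sleep(self.base_delay_s * 2 ** (attempt - 1))
                attempt += 1
            else:
                self._results[key] = result
                return result
```

A caller would derive the key from the logical operation (for example, a task ID plus step number), so that a replayed step does not trigger the side effect twice.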
What it takes to catch our eye
- 8–12+ years in applied ML/ML systems with 4+ years building LLM‑powered products; proven delivery of agentic workflows in production.
- Hands‑on mastery of LLM adaptation (prompting, tool/function calling), data curation, and safety/guardrails.
- Strong software fundamentals (distributed systems, transactions, consistency, resiliency) and experience building high‑throughput microservices/APIs/SDKs.
- Fluency in Python; proficiency in one of Go/Java/JavaScript is a plus. Experience with container orchestration, messaging/streaming, and observability stacks.
- Experience designing eval suites for agents (task/rubric‑based, offline/online) and closing the loop from evals → training → runtime policy.
- Comfort with cost, latency, and reliability trade‑offs; you use metrics to make crisp decisions under ambiguity.
- Familiarity with agent frameworks and protocols (e.g., MCP; API/SDK design for developer productivity).
- Track record of leading cross‑functional initiatives and mentoring senior engineers; excellent written communication and bias for measurable results.
Come change how the world works.
This position will initially be employed through a partner to ensure a seamless hiring process while we establish the hub. Once the hub is established, there may be opportunities to transition to employment with Upwork depending on business needs and other requirements. While employed by the partner, you’ll work as part of Upwork’s team, with access to our resources, culture, and growth opportunities.
To learn more about how Upwork processes and protects your personal information as part of the application process, please review our Global Job Applicant Privacy Notice.

Skills
- Python
- Machine Learning
- LLM
- Kubernetes
- JavaScript
- Distributed Systems
- Java
- API Design
- Go
- Natural Language Processing
- SDK
- Agentic AI
- RLHF
- MCP
Possible interview questions
Tests understanding of agent system architecture and of ways to handle looping or planning errors.
How would you design a recovery strategy for a multi-agent system if one of the agents falls into an infinite reasoning loop or hallucinates during a tool call?
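One possible starting point for an answer is a loop guard around the agent's step function. The sketch below assumes a hypothetical `next_action` callback that returns a `(tool_name, arguments)` tuple or `None` when the agent is done. It combines a hard step budget with repeated-action detection, and it is an illustration rather than a complete recovery strategy.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, serialized arguments)


class AgentLoopError(Exception):
    """Raised when the guard decides the agent is stuck."""


def run_with_loop_guard(
    next_action: Callable[[List[Action]], Optional[Action]],
    max_steps: int = 20,
    max_repeats: int = 2,
) -> List[Action]:
    """Drive the agent until it finishes, repeats itself too often, or runs out of steps."""
    seen: Counter = Counter()
    history: List[Action] = []
    for _ in range(max_steps):
        action = next_action(history)
        if action is None:  # the agent signals completion
            return history
        seen[action] += 1
        if seen[action] > max_repeats:
            # The identical call was proposed too many times: treat it as a loop.
            raise AgentLoopError(f"action repeated more than {max_repeats} times: {action}")
        history.append(action)
    raise AgentLoopError(f"step budget of {max_steps} exhausted")
```

On `AgentLoopError`, a supervisor could fall back to a simpler plan, hand the task to another agent, or escalate to a human; which of these fits depends on the product.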
Assesses skills in LLM Ops and in improving model quality.
Describe your approach to building a dataset for DPO or RLHF aimed specifically at improving a model's tool-use skills.
Tests engineering skills in infrastructure and optimization.
Which latency optimization strategies would you apply to an agentic system that requires sequential calls to LLMs and external APIs?
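One strategy that often comes up here is to find calls that do not depend on each other and run them concurrently, each under its own timeout, so that only truly dependent steps stay sequential. The sketch below uses Python's standard `asyncio`. The function `fetch_all` and the fetcher names are hypothetical, and returning `None` on timeout is just one possible degradation policy.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple


async def fetch_all(
    fetchers: Dict[str, Callable[[], Awaitable[str]]],
    timeout_s: float,
) -> Dict[str, Optional[str]]:
    """Run independent calls concurrently; a call that exceeds the timeout yields None."""

    async def guarded(name: str, fn: Callable[[], Awaitable[str]]) -> Tuple[str, Optional[str]]:
        try:
            return name, await asyncio.wait_for(fn(), timeout=timeout_s)
        except asyncio.TimeoutError:
            # Degrade gracefully instead of stalling the whole agent step.
            return name, None

    pairs = await asyncio.gather(*(guarded(n, f) for n, f in fetchers.items()))
    return dict(pairs)
```

With this shape, the wall-clock time of the step is bounded by the slowest call or the timeout, whichever is smaller, rather than by the sum of all calls.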
Assesses the ability to measure success in complex, non-linear tasks.
How do you define and measure an agent's 'success' in long-running tasks (longitudinal metrics), where the result may be reached only after dozens of steps?
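As a small illustration of the metrics named in the posting (success rate, hallucination rate, cost), the sketch below aggregates per-episode records into summary numbers. The `Episode` fields and the metric definitions are assumptions chosen for the example; a real harness would define them against its own task rubric.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """One end-to-end agent run (hypothetical record shape)."""
    succeeded: bool
    steps: int
    cost_usd: float
    hallucinated_calls: int  # tool calls judged invalid by the evaluator


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    """Aggregate episode records into task-level metrics."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    successes = sum(e.succeeded for e in episodes)
    total_steps = sum(e.steps for e in episodes)
    total_cost = sum(e.cost_usd for e in episodes)
    bad_calls = sum(e.hallucinated_calls for e in episodes)
    return {
        "success_rate": successes / n,
        "mean_steps": total_steps / n,
        # Share of all steps that were hallucinated tool calls.
        "hallucination_rate": bad_calls / total_steps if total_steps else 0.0,
        # Total spend divided by successful episodes; infinite if nothing succeeded.
        "cost_per_success": total_cost / successes if successes else float("inf"),
    }
```

Tracking cost per success rather than cost per episode keeps cheap but failing runs from looking efficient.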
Tests leadership qualities and the ability to work under uncertainty.
Describe a case where you had to make an architectural decision amid high uncertainty in the AI market. Which metrics guided you?