- Country
- USA
- Salary
- $166,000 – $220,000

Senior Site Reliability Engineer, Production Engineering
An exceptionally attractive opening for an SRE specialist: a role at a leading defense technology company, a high salary, a substantial equity package, and the chance to join a new team at its founding. The company's mission has real global significance.
Job difficulty
The difficulty is high because the role requires deep Kubernetes expertise (100+ nodes), programming skills in Go/Rust, and willingness to undergo vetting for a U.S. security clearance. This is a foundational role on a new team, which calls for leadership and the ability to build processes from scratch.
Salary analysis
The offered range of $166k – $220k is fully in line with market standards for a Senior SRE in Seattle. Given the mention of competitive equity grants, total compensation may significantly exceed market averages.
Cover letter
I am writing to express my strong interest in the Senior Site Reliability Engineer position within the Production Engineering team at Anduril Industries. With over seven years of experience in infrastructure engineering and a deep focus on Kubernetes at scale, I have consistently driven reliability improvements in complex distributed systems. My background in building observability stacks and implementing infrastructure as code aligns perfectly with your mission to ensure the Lattice OS operates flawlessly in demanding environments.
Throughout my career, I have championed the adoption of SLOs and error budgets to balance innovation with stability. I am particularly drawn to this foundational role at Anduril because it offers the unique opportunity to shape the technical direction of a newly formed team. I am eager to bring my expertise in Go and Python, along with my experience in incident management and chaos engineering, to help protect national security through cutting-edge technology.

Job description
Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built and sold. Anduril’s family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a realtime, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting-edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.
ABOUT THE TEAM
The Production Engineering team is a newly formed organization within Anduril's Software Platform, dedicated to ensuring the reliability, performance, and scalability of mission-critical systems that directly support our warfighters in the field. We solve complex reliability challenges at massive scale, ensuring that critical components of Lattice, Anduril's autonomous command and control platform, operate flawlessly in the most demanding operational environments.
This is a foundational role and you will be among the first hires building this team from the ground up. You'll have the unique opportunity to shape the technical direction, establish best practices, and define what production engineering excellence means at Anduril. Our team operates at the intersection of software engineering and systems reliability, building the infrastructure, tooling, and processes that keep our systems operational 24/7/365.
ABOUT THE ROLE
We are seeking an experienced Senior Site Reliability Engineer who is passionate about building resilient, highly available systems that scale to meet the demands of the core systems powering Lattice. You will work closely with platform engineering teams, product developers, and field operations to proactively identify reliability risks, implement defensive strategies, and continuously improve the operational excellence of our software platform. If you thrive on solving hard problems at scale and want your work to have direct impact on national security, this is the role for you.
WHAT YOU’LL DO
- Design and implement comprehensive monitoring, observability, and alerting systems to ensure early detection of reliability issues across the Lattice platform
- Drive incident response and conduct blameless postmortems to identify systemic improvements and prevent recurrence of production issues
- Build and maintain infrastructure automation using tools like Terraform, Kubernetes operators, and custom tooling to manage large-scale distributed systems
- Establish and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system reliability
- Partner with software engineering teams to improve system architecture for reliability, implementing patterns like circuit breakers, graceful degradation, and chaos engineering
- Develop capacity planning models and performance testing frameworks to ensure systems can handle growth and peak operational demands
- Create runbooks, documentation, and training materials to enable teams to operate production systems effectively
- Lead cross-functional efforts to improve deployment safety through progressive rollouts, automated testing, and rollback capabilities
- Implement security best practices and compliance controls for production environments handling sensitive defense data
- Build tooling and automation to reduce toil and improve operational efficiency for the engineering organization
- Participate in on-call rotations and serve as an escalation point for critical production incidents
REQUIRED QUALIFICATIONS
- 7+ years of engineering experience, with at least 3 years focused on SRE, production operations, or infrastructure engineering
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
- Deep expertise with Kubernetes in production environments, including operational challenges at scale (100+ nodes)
- Strong programming skills in one or more languages such as Go, Python, Rust, or Java with ability to build production-grade tooling
- Proven experience designing and implementing observability stacks (metrics, logging, tracing) using tools like Prometheus, Grafana, ELK/EFK, or equivalent
- Hands-on experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code practices
- Demonstrated ability to debug complex distributed systems issues across multiple layers of the stack
- Track record of improving system reliability through architectural changes, not just operational band-aids
- Strong incident management and communication skills, with experience leading responses to critical outages
- Must be a U.S. Person due to required access to U.S. export controlled information or facilities
- Eligible to obtain and maintain an active U.S. Secret security clearance
PREFERRED QUALIFICATIONS
- Experience with defense, aerospace, or other mission-critical systems where downtime has severe consequences
- Expertise in performance optimization and capacity planning for high-throughput, low-latency systems
- Knowledge of chaos engineering principles and experience implementing resilience testing frameworks
- Experience with service mesh technologies (Istio, Linkerd) and advanced traffic management patterns
- Background in database operations and optimization (PostgreSQL, Cassandra, or similar at scale)
- Familiarity with CI/CD platforms and deployment automation (ArgoCD, FluxCD, Spinnaker, Jenkins)
- Understanding of networking fundamentals including load balancing, DNS, TLS/SSL, and network security
- Experience with configuration management and secrets management solutions (Vault, Sealed Secrets, SOPS)
- Strong written and verbal communication skills with ability to explain technical concepts to non-technical stakeholders
- Active Secret or higher security clearance
US Salary Range
$166,000 – $220,000 USD
The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. Actual salary offer may vary based on (but not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full-time offers and are considered part of Anduril's total compensation package. Additionally, Anduril offers top-tier benefits for full-time employees, including:
Healthcare Benefits
- US Roles: Comprehensive medical, dental, and vision plans at little to no cost to you.
- UK & AUS Roles: We cover full cost of medical insurance premiums for you and your dependents.
- IE Roles: We offer an annual contribution toward your private health insurance for you and your dependents.
Additional Benefits
- Income Protection: Anduril covers life and disability insurance for all employees.
- Generous time off: Highly competitive PTO plans with a holiday hiatus in December. Caregiver & Wellness Leave is available to care for family members, bond with a new baby, or address your own medical needs.
- Family Planning & Parenting Support: Coverage for fertility treatments (e.g., IVF, preservation), adoption, and gestational carriers, along with resources to support you and your partner from planning to parenting.
- Mental Health Resources: Access free mental health resources 24/7, including therapy and life coaching. Additional work-life services, such as legal and financial support, are also available.
- Professional Development: Annual reimbursement for professional development.
- Commuter Benefits: Company-funded commuter benefits based on your region.
- Relocation Assistance: Available depending on role eligibility.
Retirement Savings Plan
- US Roles: Traditional 401(k), Roth, and after-tax (mega backdoor Roth) options.
- UK & IE Roles: Pension plan with employer match.
- AUS Roles: Superannuation plan.
The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process.
Protecting Yourself from Recruitment Scams
Anduril is committed to maintaining the integrity of our talent acquisition process and the security of our candidates. We've observed a rise in sophisticated phishing and fraudulent schemes where individuals impersonate Anduril representatives, luring job seekers with false interviews or job offers. These scammers often attempt to extract payment or sensitive personal information.
To ensure your safety and help you navigate your job search with confidence, please keep the following critical points in mind:
- No Financial Requests: Anduril will never solicit payment or demand personal financial details (such as banking information, credit card numbers, or social security numbers) at any stage of our hiring process. Our legitimate recruitment is entirely free for candidates.
- Please always verify communications:
+ Direct from Anduril: If you receive an email from one of our recruiters, it will only come from an @anduril.com address.
+ Via Agency Partner: If contacted by a recruiting agency for an Anduril role, their email will clearly identify their agency. If you notice any suspicious activity, please verify the agency's authenticity by reaching out to contact@anduril.com.
- Exercise Caution with Unsolicited Outreach: If you receive any communication that appears suspicious, contains grammatical errors, or makes unusual requests, do not engage. Always confirm the sender's email domain is @anduril.com before providing any personal information or clicking on links.
- What to Do If You Suspect Fraud: Should you encounter any questionable or fraudulent outreach claiming to be from Anduril, please report it immediately to contact@anduril.com. Your proactive caution is invaluable in protecting your personal information and upholding the security and trustworthiness of our recruitment efforts.
Data Privacy
To view Anduril's candidate data privacy policy, please visit https://anduril.com/applicant-privacy-notice/.

Skills
- AWS
- Azure
- Python
- Rust
- Terraform
- GCP
- Kubernetes
- Prometheus
- Grafana
- PostgreSQL
- Chaos Engineering
- Go
- Istio
- ArgoCD
- Vault
Possible interview questions
Tests experience with high-load clusters and understanding of Kubernetes internals.
Tell us about the most difficult Kubernetes incident you have faced at a scale of more than 100 nodes. How did you diagnose it, and what systemic changes did you make?
Assesses the candidate's ability to automate processes and build tools rather than just use off-the-shelf ones.
Describe your experience building custom tools or operators in Go/Python to automate infrastructure tasks. What problem did they solve?
Tests understanding of SRE methodology and the ability to balance stability against development velocity.
How do you approach defining SLOs and error budgets for critical systems? What do you do when the error budget is exhausted but the business needs to ship a feature?
Assesses skill in designing fault-tolerant systems.
Which design patterns (for example, circuit breakers or bulkheads) have you implemented to make distributed systems more resilient?
Tests communication and leadership skills in crisis situations.
Describe your experience running blameless postmortems. How do you make sure the findings lead to real architectural changes?
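For the SLO and error budget question above, it can help to have the basic arithmetic at hand. The following is a minimal sketch, assuming a simple time-based availability SLO over a rolling window; the function names, the 30-day default window, and the "ship only while budget remains" rule are illustrative assumptions, not anything stated in the posting.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (assumed convention).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


def can_ship(slo_target: float, downtime_minutes: float,
             window_days: int = 30) -> bool:
    """Hypothetical release gate: allow risky rollouts only while budget remains."""
    return budget_remaining(slo_target, downtime_minutes, window_days) > 0.0


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 50 minutes of downtime in the window overspends that budget.
    print(can_ship(0.999, downtime_minutes=50))
```

In practice, teams often use a request-based SLO (good requests divided by total requests) instead of a time-based one, and an exhausted budget usually opens a negotiation about risk rather than triggering a hard block.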