Staff Software Engineer, Infrastructure at Docker

Source: remoteok

About Docker

Docker has been one of the most loved brands in developer tooling, with more than 20 million monthly users and over 20 billion container image pulls. From solo founders to the world's largest companies, developers rely on Docker to build, share, and run their applications across our suite of products, including Docker Desktop, Docker Hub, and Docker Scout.

We are a globally distributed, remote-first team building the tools that define how software gets built and delivered. As AI agents redefine software development, Docker is at the center of that shift, providing the sandboxed environments, verified images, and secure infrastructure that make autonomous workflows trustworthy by default.

The Role

Docker is shipping a wave of new products this year, and we're investing heavily in the platform underneath all of it. That platform supports hundreds of engineers across many development teams and carries high-scale production traffic and data transfer every day.

The top priority for this role is moving work from expert-driven support to paved roads: self-service systems with clear ownership, safe defaults, strong guardrails, and measurable adoption. The goal is a platform teams trust enough to stop thinking about it, so they can focus on their own products instead of ours.

We need a real multi-region, cross-account network architecture and a testing and continuous-deployment flow teams can trust. You'd be joining a team of four, growing to seven this year, and we're looking for a Staff engineer to set technical direction and lead it through real production adoption.

Responsibilities

This is a Staff-level role, so success is measured by leverage rather than just your own commits. You will stay hands-on in the codebase while setting direction and aligning teams on pragmatic standards. Concretely, you will:

  • Take ambiguous infrastructure problems and turn them into proposals the org can rally around, then drive them through RFCs and architecture reviews across teams.
  • Design self-service capabilities and platform APIs (primarily in Go) for onboarding, provisioning, deployment, observability defaults, and day-2 operations.
  • Set delivery standards using Terraform, GitOps with Argo CD, progressive rollout, and good testing, including building the continuous-deployment flow we're missing today.
  • Evolve the multi-tenant EKS foundations toward better reliability, security, scale, and cost: Envoy Gateway ingress, traffic routing, and the multi-region, cross-account connectivity we need.
  • Improve SLOs, alerting, and incident follow-up on Grafana Cloud so production gets safer and less dependent on heroics.

AI-Assisted Operations

We are actively investing in AI-assisted and agentic workflows to cut operational toil. You'll help shape where these earn their place. Early targets include:

  • Alert enrichment and incident context-gathering: Assembling relevant signals, history, and runbooks.
  • Runbook-assisted diagnosis and remediation recommendations: With a human in the loop on anything that changes production.
  • Onboarding and readiness assistants: Answering the questions our experts answer today.

On-Call

Operational ownership is part of the job. You'll join the rotation after onboarding and shadowing. As a Staff engineer, you'll also improve the health of on-call itself, with better alerts, stronger runbooks, less toil, and blameless postmortems aimed at prevention.

Qualifications

  • 8+ years of professional, hands-on, full-time software engineering experience in backend, infrastructure, or platform engineering.
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • Strong software engineering in Go or a similar language: design, testing, debugging, review, and long-term maintainability.
  • A track record designing, shipping, and operating cloud services or infrastructure platforms in production.
  • Deep expertise in at least one of: Kubernetes, networking, cloud platforms, reliability engineering, or developer platforms, plus solid Linux, networking, and production-ops fundamentals.
  • Experience setting technical direction and leading work that needs cross-team alignment.
  • Clear written and verbal communication in a remote environment (RFCs, design docs, incident writeups).

Nice to have: EKS and ingress/CNI/service-mesh experience; observability with OpenTelemetry/Prometheus/Grafana; CI/CD and progressive delivery (GitHub Actions, Argo CD, canaries); experience leading migrations or adoption programs across teams.

What to Expect

  • First 30 days: Build context, meet partner teams, ship your first change, shadow on-call.
  • First 90 days: Own a strategic platform problem with a clear plan and metrics; lead an improvement from design to production.
  • One Year Outlook: Lead a major cross-team initiative (e.g., self-service provisioning of new regions/environments) and establish durable patterns that change how Docker engineers build and operate services.

Hiring Process & Compliance

  • Docker considers visa sponsorship on a case-by-case basis based on business needs.
  • We use Covey as part of our hiring process. An independent bias audit report is available.

Perks

  • Freedom & flexibility; fit your work around your life.
  • Designated quarterly Whaleness Days plus an end-of-year Whaleness break.
  • Home office setup.
  • 16 weeks of paid parental leave (after 6 months of employment).
  • Technology stipend equivalent to $100 USD net/month.
  • PTO plan that encourages you to take time off.
  • Training stipend for conferences, courses, and classes.
  • Equity in a growing start-up.
  • Docker swag.
  • Medical benefits, retirement, and holidays vary by country.
  • Remote-first culture, with offices in Seattle and Paris.

Docker embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills.
