Senior Systems Engineer Software

Source: remoteok

About the Role

Join the NVIDIA Cloud Native Engineering (NVCNE) group's backend software team! As a Cloud Platform Software Engineer, you will contribute to a software platform that supports the lifecycle of Artificial Intelligence (AI) supercomputing infrastructure on Kubernetes. You will work alongside architects, designers, frontend engineers, and SREs to enable AI services across the cloud. You will own your code from development to production, supporting SRE teams and collaborating with internal product teams on sophisticated distributed systems problems at scale. You will also be encouraged to promote NVIDIA’s approach to cloud-native development, especially on Kubernetes.

Responsibilities

Develop software systems to support large-scale deployments of cloud infrastructure
Design, develop, and distribute APIs to support Infrastructure as Code (IaC) automation and deployment workflows
Contribute to multiple source code projects that deliver the software services NVIDIA requires
Collaborate with engineering managers, architects, designers, and frontend engineers to deliver high-quality software
Automate the validation of software solutions with unit and integration tests
Innovate with other engineers on proposed designs and product direction
Openly share successes and failures in a no-blame environment

Requirements

BS in Computer Science, Information Systems, or Computer Engineering (or equivalent experience) and at least 12 years of overall experience
5-7 years of proven experience in large-scale software development
Experience building and delivering services on Kubernetes
Proficiency with cloud-native infrastructure (AWS, GCP, Azure, OCI)
Experience collaborating with teams to write software that supports cloud services at scale
Ability to troubleshoot issues across multiple layers: infrastructure, Kubernetes, application runtime
Strong proficiency in Golang for building Kubernetes operators, controllers, and custom tooling
Experience designing and managing Kubernetes Custom Resource Definitions (CRDs)
Knowledge of managed Kubernetes services and scaling strategies across cloud and on-prem environments
Experience developing auto-scaling infrastructure components, as well as experience with incident response and root cause analysis

Ways to Stand Out

Experience with Kubernetes Cluster API, Terraform, CSP API, and other infrastructure tooling
Background in using and contributing to open-source projects
Solid experience with Kustomize or other Kubernetes packaging tools
Ability to refactor software to run on platforms such as Kubernetes
Ability to discuss and work with CSI, CNI, and CRI, as well as familiarity with the CNCF and the tooling across its ecosystem

Compensation

Base salary range: $224,000 - $356,500 USD (determined by location, experience, and pay of employees in similar positions)
Eligibility for equity and benefits

Additional Information

Applications accepted at least until November 4, 2025.
NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.
Learn more about NVIDIA: https://www.nvidia.com/
