Senior DevOps Architect
Source: remoteok
About the Role
NVIDIA is seeking a highly motivated Senior Software Architect to join its Software Infrastructure team. This role is part of a dynamic team developing sophisticated software tools to optimize development workflows and increase overall efficiency. NVIDIA is creating incredible user experiences in the mobile, embedded, and automotive spaces by combining its groundbreaking Tegra and GPU development efforts into innovative products. The Infrastructure, Planning, and Processes (IPP) team is a global organization within NVIDIA that helps make this vision possible by crafting and maintaining a large-scale private cloud system used for providing build and test infrastructure services for NVIDIA GPU, Mobile, and Automotive Divisions.
You should thrive when working in the critical path, supporting thousands of developers on billion-dollar business lines, and you should intimately understand the value of responsiveness, thoroughness, and teamwork.
Responsibilities
- Evaluate, identify, and develop software solutions to optimize critical software development workflows across various organizations within NVIDIA.
- Architect, implement, and support end-to-end CI/CD systems using open-source and NVIDIA proprietary software.
- Create solutions to support end-to-end container management with Kubernetes and Docker.
- Drive automation to monitor and gain more insight into application and system health.
- Design solutions with service discovery, networking, monitoring, logging, and scheduling in Kubernetes.
- Lead software development projects and provide technical leadership to a team of engineers, guiding them toward optimal and impactful solutions.
- Identify and resolve issues within software systems.
- Craft and implement critical metrics using various analytics methods and dashboards.
Requirements
- Experience maintaining cloud infrastructure and highly available production environments.
- Excellent debugging, problem-solving, and analytical skills.
- Strong understanding of architectural requirements and development processes involved in building reliable, robust, and scalable data products and pipelines.
- Background in databases: SQL (MySQL) and/or NoSQL (Elasticsearch / MongoDB / Cassandra).
- Proficient with configuration management tools like Ansible, Puppet, and Chef.
- Strong background with Jenkins and/or other CI/CD systems.
- Proficient with Kubernetes, Docker, and virtualization.
- Knowledge of monitoring systems such as Zabbix, Prometheus, and/or similar systems.
- 12+ years of proven experience.
- Bachelor's or Master's degree in Computer Science, Software Engineering, or equivalent experience.
Ways to Stand Out
- Prior experience with DevOps and large-scale operations teams.
- Experience with Windows server infrastructure.
- Background in computer algorithms and the ability to choose the most suitable algorithms to meet scaling challenges.
- Ability to break down complex problems into simple sub-problems and reuse available solutions.
- Ability to design simple systems that can work efficiently without needing much support.
Compensation
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
- Level 5: $208,000 - $333,500 USD
- Level 6: $256,000 - $414,000 USD
You will also be eligible for equity and benefits.
Additional Information
Applications for this job will be accepted at least until February 1, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes. NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer.