High Performance Computing (HPC)-Cloud Architect (Experienced)
An ongoing job with Sandia National Laboratories
- Sandia National Laboratories
- 2 months, 2 weeks ago
- LGBTQ+ Friendly
- Job Posting is in English
This posting will be open for application submissions for a minimum of seven (7) calendar days, including the ‘posting date’. Sandia reserves the right to extend the posting date at any time.
COVID-19 Vaccination Mandate
Sandia demonstrates its commitment to public safety in the national interest by requiring that all new hires be fully vaccinated or have an approved medical or religious accommodation before commencing employment. The requirement also applies to those who are telecommuting and working virtually.
Any concerns about the ability to meet this requirement should be directed to HR Solutions at (505) 284-4700.
What Your Job Will Be Like
Sandia National Laboratories is home to the NNSA's Advanced Architecture Testbeds (AAT) and Application Readiness Testbeds (ART) projects. These projects deploy next-generation processors, network interconnects, accelerators, and memory/storage systems in HPC systems to evaluate how well hardware and software can advance scientific computing and national security applications. The evaluations are broad, with experiments designed to balance performance, system cost, energy consumption, and system reliability. Every system we deploy is unique and sits at the frontier of hardware/software design, helping redefine innovative computing for our mission.
Sandia's HPC Development department is looking for self-motivated individuals who want to work at the edge of system deployment, administration, and research in close collaboration with partners in exploring first-of-a-kind computing technologies. We are seeking HPC-Cloud architects and system administrators in the role of Solutions Architect to co-develop innovative methods for combining Cloud-based computing-as-a-service technologies with HPC systems to improve utilization, scalability, and flexibility in software deployments. This job is like no other at the national laboratories due to Sandia's unique capabilities and focus on codesign, bringing together system architects, computer scientists and domain experts to overcome the hardest of today's technical challenges.
Every day will be different in our team, with activities that include:
- Team with HPC researchers and vendors to enhance cloud technologies for HPC environments including novel accelerators, unproven HPC software stacks, and HPC communication models
- Team with HPC researchers and vendors to enhance and deploy HPC operational methodologies such as scheduling, resource management, and monitoring for HPC-Cloud environments
- Collaborate with research and development staff, colleagues, and vendors to deliver functional platforms for multiple pre-production systems running research software in exploratory configurations
- Participate in all aspects of the HPC system lifecycle including facility integration, standup, acceptance testing, performance benchmarking, operational support, and reclamation
- Design and develop infrastructure for the support of multiple concurrent, emerging technology and prototype HPC systems
- Maintain all system aspects of security, networks, filesystems, system software installation, and user support
Qualifications We Require
- Bachelor’s degree in Computer Science, Computer Engineering, Information Systems Engineering (CIS/MIS), or relevant STEM field plus five or more years of relevant IT experience
- Experience with installing and managing Kubernetes or other Cloud provisioning technologies
- Experience administering Linux/Unix cluster systems
- Ability to configure and manage server network configurations
Qualifications We Desire
- Familiarity with HPC and Cloud user-facing programming environments and usage models
- Experience deploying and managing Infrastructure-as-a-Service, Platform-as-a-Service, or Software-as-a-Service
- Experience with complex programming environments typical in HPC platforms, including use of MPI
- Experience with container runtimes such as Docker or Podman
- Experience with debugging and tuning Linux server/system performance
- Experience with automation tools for configuration management (e.g., Ansible, Puppet, Chef)
- Experience with authentication schemes such as Kerberos, OAuth2, or OpenID Connect
- Experience administering heterogeneous clusters consisting of GPU-based, ARM-based, x86-64-based, and next-generation architectures
About Our Team
We partner with different scientific and computing disciplines at Sandia, and externally, to advance high performance computing and operations. The Advanced Architecture Testbeds and Application Readiness Testbeds projects are multi-center collaborations to explore, evaluate, and influence next-generation computing. These technologies will drive transformation in how the laboratories and national security and scientific partners use digital-engineering processes in the coming decade and beyond.
Sandia National Laboratories is the nation’s premier science and engineering lab for national security and technology innovation, with teams of specialists focused on cutting-edge work in a broad array of areas. Some of the main reasons we love our jobs:
- Challenging work with amazing impact that contributes to security, peace, and freedom worldwide
- Extraordinary co-workers
- Some of the best tools, equipment, and research facilities in the world
- Career advancement and enrichment opportunities
- Flexible work arrangements for many positions include 9/80 (work 80 hours every two weeks, with every other Friday off) and 4/10 (work 4 ten-hour days each week) compressed workweeks, part-time work, and telecommuting (a mix of onsite work and working from home)
- Generous vacations, strong medical and other benefits, competitive 401k, learning opportunities, relocation assistance and amenities aimed at creating a solid work/life balance*
World-changing technologies. Life-changing careers. Learn more about Sandia at: http://www.sandia.gov
*These benefits vary by job classification.
Sandia is required by DOE to conduct a pre-employment drug test and background review that includes checks of personal references, credit, law enforcement records, and employment/education verifications. Applicants for employment need to be able to obtain and maintain a DOE Q-level security clearance, which requires U.S. citizenship. If you hold more than one citizenship (i.e., of the U.S. and another country), your ability to obtain a security clearance may be impacted.
Applicants offered employment with Sandia are subject to a federal background investigation to meet the requirements for access to classified information or matter if the duties of the position require a DOE security clearance. Substance abuse or illegal drug use, falsification of information, criminal activity, serious misconduct or other indicators of untrustworthiness can cause a clearance to be denied or terminated by DOE, resulting in the inability to perform the duties assigned and subsequent termination of employment.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status and any other protected class under state or federal law.