Senior Manager, Solutions Architecture

Deloitte
Grand Rapids, MI
We are seeking an accomplished HPC/AI Platform Engineering Manager to lead the design, implementation, and optimization of advanced computing environments that power AI, ML, and LLM workloads. This role is ideal for a hands-on technologist with deep expertise in HPC systems, GPU-accelerated infrastructure, and large-scale AI deployments, combined with the leadership ability to drive fast-paced, innovative initiatives. You will collaborate with engineering, research, and business teams to define infrastructure strategy, assess emerging technologies, and deliver scalable, secure, and high-performance solutions. This role is pivotal in advancing generative AI, analytics, and model training capabilities through robust architecture, automation, and software integration.

Recruiting for this role ends on January 31, 2026.

Key Responsibilities

Architecture & Strategy
+ Design and implement HPC and AI infrastructure leveraging HPE Apollo, ProLiant, Cray, and similar enterprise-class systems.
+ Architect ultra-low-latency, high-throughput interconnect fabrics (InfiniBand NDR/800G, RoCEv2, 100-400 GbE) for large-scale GPU and HPC clusters.
+ Deploy and optimize cutting-edge NVIDIA GPU architectures (e.g., H100, H200, RTX PRO / Blackwell series, NVL-based systems).
+ Develop scalable hybrid HPC and cloud architectures across Azure, AWS, GCP, and on-prem environments.
+ Establish infrastructure blueprints supporting secure, high-throughput AI workloads.

AI/ML & LLM Platform Enablement
+ Build and manage AI/ML infrastructure to maximize performance and productivity of ML research teams.
+ Architect and optimize distributed training, storage, and scheduling systems for large GPU clusters.
+ Implement automation, observability, and operational frameworks to minimize manual intervention.
+ Deploy and manage GPU-accelerated Kubernetes clusters for AI and HPC workloads.
+ Integrate open-source GenAI components, including vector databases and AI/ML frameworks, for model serving and experimentation.
+ Identify and resolve performance and scalability bottlenecks across infrastructure layers.

Software Engineering & Integration
+ Develop and maintain automation tools and utilities in Python, Golang, and Bash.
+ Integrate HPC infrastructure with ML frameworks, container runtimes, and orchestration platforms.
+ Contribute to job scheduling, resource management, and telemetry components.
+ Build APIs and interfaces for workload submission, monitoring, and reporting across heterogeneous environments.

Containerization & Orchestration
+ Design Kubernetes and OpenShift architectures optimized for GPU and AI workloads.
+ Implement GPU scheduling, persistent storage, and high-speed networking configurations.
+ Collaborate with DevOps/MLOps teams to build CI/CD pipelines for containerized research and production environments.

Systems & Automation
+ Oversee Linux system architectures (RHEL, Ubuntu, OpenShift) with automation via Ansible and Terraform.
+ Implement monitoring and observability (e.g., Prometheus, Grafana, DCGM, and NVML).
+ Ensure system scalability, reliability, and security through proactive optimization.

Governance & Leadership
+ Ensure architecture and deployments comply with organizational and regulatory standards.
+ Conduct technical workshops, architecture reviews, and presentations for both technical and executive audiences.
+ Define and drive the infrastructure roadmap in partnership with business stakeholders.
+ Mentor and lead engineering teams, translating business requirements into actionable technical deliverables.
+ Foster innovation and cross-functional collaboration to accelerate AI/ML initiatives.

Required Qualifications
+ 10+ years of experience in HPC architecture, systems engineering, or platform design with a focus on architecting and operating on-premises Kubernetes for large-scale AI/ML workloads.
+ 3+ years of hands-on experience and proficiency with Linux, Python, Golang, and/or Bash.
+ 2+ years leading teams and/or processes.
+ 2+ years of recent experience working with GPU platforms (strong preference for NVIDIA), distributed systems, and performance optimization.
+ Ability to travel 0-10%, on average, based on the work you do and the customers you serve.
+ Must be a US citizen.

Preferred Qualifications
+ Master's or Ph.D. in Computer Science, Electrical Engineering, or a related discipline, and work experience.
+ Demonstrated success supporting LLM training and inference workloads in both R&D and production environments.
+ Strong knowledge of high-performance networking, storage, and parallel computing frameworks.
+ Exceptional communication and leadership skills, capable of bridging technical depth with executive strategy.

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The disclosed range estimate has not been adjusted for the applicable geographic differential associated with the location at which the position may be filled. At Deloitte, it is not typical for an individual to be hired at or near the top of the range for their role, and compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range is $130,000 to $241,000.

You may also be eligible to participate in a discretionary annual incentive program, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.
Information for applicants with a need for accommodation:

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or protected veteran status, or any other legally protected basis, in accordance with applicable law.
Posted 2025-11-20
