Manager of Site Reliability Engineering (SRE)
- Location Birmingham, Alabama
- Category Technology
- Job ID R26_0000009885
- Date posted 05/07/2026
- Brand Motion Industries (MOT)
- Status Full time
- Job Type Hybrid
SUMMARY:
The Manager of Site Reliability Engineering leads and develops a team of SRE practitioners focused on delivering highly reliable, scalable, and performant cloud-based infrastructure and services. This role ensures the implementation of SRE principles, drives automation, observability, and incident management practices to enhance system reliability, and collaborates across development and operations teams to support continuous delivery and robust cloud platform operations.
You must be eligible to work in the US without visa sponsorship.
JOB DUTIES
• Lead, mentor, and grow a high-performing team of Site Reliability Engineers, fostering a culture of ownership, continuous improvement, and operational excellence.
• Implement and champion Site Reliability Engineering principles and DevOps best practices within the team to ensure service reliability, availability, and performance.
• Define and track key SRE metrics such as service uptime and incident response and resolution times.
• Drive automation efforts including CI/CD pipeline enhancements, infrastructure-as-code practices, and self-service infrastructure provisioning to increase deployment velocity while reducing manual toil.
• Own and continuously improve observability practices including system monitoring, logging, alerting, and diagnostics to ensure rapid issue detection and resolution.
• Participate in incident response processes including incident management, root cause analysis, post-mortems, and continuous improvement to enhance system resilience.
• Partner closely with software engineering, product management, architecture, and security teams to embed reliability and security early in the software development lifecycle (SDLC).
• Oversee the management and scalability of cloud infrastructure environments, primarily on Google Cloud Platform (GCP), with a focus on Kubernetes, container orchestration, and hybrid cloud integrations.
• Advocate for and apply best practices in performance tuning, capacity planning, and system design for high availability.
• Develop and execute a long-term roadmap for our hybrid cloud platform, aligning with evolving business objectives and technology trends.
• Establish and monitor key performance indicators (KPIs), service level indicators (SLIs), and service level objectives (SLOs) to drive system health and stability.
EDUCATION & EXPERIENCE
Typically requires a bachelor's degree and 7 years of experience in a technology and/or software engineering role, or an equivalent combination of education and experience.
KNOWLEDGE, SKILLS, ABILITIES
Experience & Leadership
• Proven experience working in large, complex enterprise environments (Fortune 500 or equivalent).
Site Reliability Engineering & DevOps Practices
• Strong understanding and demonstrated implementation of Site Reliability Engineering (SRE) principles at scale.
• Hands-on experience with infrastructure-as-code (IaC) tools such as Terraform and ArgoCD.
• In-depth knowledge and practical experience with CI/CD pipelines and automation of software delivery.
• Track record of championing DevOps practices and embedding reliability early in the SDLC.
• Significant hands-on experience in Site Reliability Engineering or related roles focused on cloud infrastructure reliability.
• Strong software engineering background with proficiency in infrastructure-as-code tools (e.g., Terraform, ArgoCD) and CI/CD automation.
• Deep knowledge of cloud platforms, specifically Google Cloud Platform (GCP), Kubernetes, container orchestration, and cloud-native architecture.
• Familiarity with monitoring and observability tools such as Dynatrace, Datadog, or equivalents.
• Experience managing high-availability systems in 24/7 operational environments.
• Ability to collaborate cross-functionally and drive alignment across engineering, product, and security teams.
Tools & Monitoring
• Experience with monitoring, logging, and observability platforms.
• Familiarity with incident management and performance monitoring tools, including Dynatrace and Datadog.
• Proficient in cloud deployment tooling and automation frameworks.
• Experience with Azure DevOps (ADO) or equivalent CI/CD tools.
Core Technical Skills
• Strong software engineering and infrastructure background.
• Solid understanding of Kubernetes, container orchestration, cluster management, and elastic scalability.
• Experience with API-driven, event-driven, and microservices architectures.
• Skilled in performance diagnostics, capacity planning, tuning, and system architecture for high-availability systems.
Not the right fit? Let us know you're interested in a future opportunity by joining our Talent Community on jobs.genpt.com or create an account to set up email alerts as new job postings become available that meet your interest!
GPC conducts its business without regard to sex, race, creed, color, religion, marital status, national origin, citizenship status, age, pregnancy, sexual orientation, gender identity or expression, genetic information, disability, military status, status as a veteran, or any other protected characteristic. GPC's policy is to recruit, hire, train, promote, assign, transfer and terminate employees based on their own ability, achievement, experience and conduct and other legitimate business reasons.