Containers and Provisioning Operations Director

PricewaterhouseCoopers Servicii S.R.L.
Full-time
Remote friendly

Job Description & Summary

Job Summary

PwC IT Services Limited provides shared technology services to the PwC network of member firms in a secure, legally compliant, efficient and transparent manner.

The Containers and Provisioning Operations Director will be a member of the Global Hosting Services (GHS) Platform Operations team. The role calls for experience in container technologies, including operating many Kubernetes clusters in production; service mesh; multi-cluster management tools such as Rancher or Rafay; vulnerability management tools such as Prisma Cloud Compute or Aqua Security; infrastructure as code (IaC) tools such as Terraform or Pulumi; extensive cloud experience with Azure and other clouds; DevOps and CI/CD tools; application development; observability (logging, monitoring, alerting); ITIL and Agile methodologies; and people management across multiple tiers.

The role holder will lead both the Containers team and the Provisioning team, each of which has team leads who handle the day-to-day operations. You will drive strategy, lifecycle management, roadmapping, leadership reporting, continuous improvement, performance management, stakeholder management, metrics collection and reporting, and escalations.

Some examples of activities include:

  • Planning and completing goals and objectives across the Containers and Provisioning teams for each fiscal year

  • Driving strategy and thought leadership among your teams and peers

  • Hosting regular customer feedback sessions and retrospectives

  • Holding regular one-on-ones with each team member

  • Providing team member performance feedback to leadership to make decisions on bonuses and career progression

  • Handling licensing renewals for Rancher (or its successor) and Prisma Cloud Compute

  • Managing the lifecycle for the Clusters-as-a-Service (CLaaS) and Containers-as-a-Service (CaaS) services, including Kubernetes upgrades, disaster recovery (DR), Rancher instance upgrades, Prisma Cloud Compute upgrades, support cases, observability, and customer communications

  • Understanding and implementing key metrics of success for operating services in an enterprise

  • Prioritizing each team’s work to meet business objectives and maximize value to the network of firms

  • Communicating your vision, mission, and values to your people and leadership to drive alignment

  • Serving as the escalation point across all of your teams and rapidly unblocking issues

  • Ensuring software engineering best practices are being followed

  • Setting standards for container security, deployment methodology, vulnerability remediation, and infrastructure as code

  • Creating complex business process diagrams that clearly communicate concepts to stakeholders

  • Managing vendor relationships for both technology and resources

  • Leading optimization of cloud resources to drive down costs for PwC and your customers

Job Requirements and Preferences

Basic Qualifications

Minimum Degree Preferred: Bachelor's Degree

Minimum Years of Experience: years in IT and 4 years working with Kubernetes and Terraform

Preferred Certifications

  • Certified Kubernetes Administrator (CKA)

  • Certified Kubernetes Security Specialist (CKS)

  • Certified Kubernetes Application Developer (CKAD)

  • HashiCorp Certified: Terraform Associate

Preferred Knowledge/Skills

  • At least 5 years of experience in an operations role

  • At least 5 years of experience managing others

  • Excellent communication skills

  • Understanding and implementing key metrics of success for operating services in an enterprise

  • Understanding of Kubernetes, Docker, container images, and container security

  • Understanding of cloud-native concepts and the Cloud Native Computing Foundation (CNCF) Landscape

  • Understanding of at least one cloud provider and its underlying container services:

      • Amazon Web Services

          • Amazon Elastic Kubernetes Service (EKS)

          • Amazon Elastic Container Service (ECS)

          • AWS Fargate

      • Microsoft Azure

          • Azure Kubernetes Service (AKS)

          • App Service

          • Container Instances

      • Google Cloud

          • Google Kubernetes Engine (GKE)

          • Cloud Run

  • Working knowledge of at least one Kubernetes distribution:

      • Amazon Elastic Kubernetes Service (EKS)

      • Azure Kubernetes Service (AKS)

      • Google Kubernetes Engine (GKE)

      • Rancher / Rancher Kubernetes Engine (RKE) / K3s

      • Red Hat OpenShift

      • VMware Tanzu Kubernetes Grid (TKG)

      • Mirantis Kubernetes Engine

      • Nutanix Karbon

      • Rafay

      • Spectro Cloud Palette

  • Working knowledge of at least one service mesh technology:

      • Istio

      • Linkerd

      • Kong Mesh / Kong Kuma

      • HashiCorp Consul Connect

      • AWS App Mesh

      • OpenShift Service Mesh (Red Hat)

      • Open Service Mesh (OSM)

  • Demonstrating familiarity with at least one Kubernetes security tool:

      • Aqua Platform / Container Security / Kubernetes Security / Dynamic Threat Analysis

      • Prisma Cloud Compute (previously Twistlock)

      • Qualys Container Security

      • Snyk Cloud Native Application Security (CNAS)

      • StackRox Kubernetes Security Platform

      • Sysdig Secure

  • Demonstrating familiarity with at least one container image scanning tool:

      • Anchore

      • Aqua Security / Trivy

      • Clair

      • Dagda

      • Falco

      • JFrog Xray

      • Qualys Container Security

      • Snyk Container

  • Working knowledge of at least one container registry:

      • Amazon Elastic Container Registry (ECR)

      • Azure Container Registry (ACR)

      • GitLab Container Registry

      • Google Cloud Container Registry

      • Harbor

      • JFrog Container Registry

  • Possessing an intimate understanding of the typical Kubernetes resources and how to create manifests defining them:

      • ConfigMap

      • DaemonSet

      • Deployment

      • Ingress

      • PersistentVolume (PV)

      • PersistentVolumeClaim (PVC)

      • Pod

      • ReplicaSet

      • Secret

      • Service

      • StatefulSet

  • Working knowledge of at least one Container Network Interface (CNI) driver:

      • Amazon VPC CNI

      • Azure CNI

      • Calico

      • Canal

      • Cilium

      • Flannel

      • GKE CNI

      • Weave

  • Working knowledge of at least one Container Storage Interface (CSI) driver:

      • Amazon Elastic Block Store (EBS)

      • Amazon Elastic File System (EFS)

      • Amazon FSx for Lustre (FSx)

      • Azure Disk

      • Azure File

      • CephFS

      • Ceph RBD

      • GCE Persistent Disk

      • Google Cloud Filestore

      • Google Cloud Storage

      • GlusterFS

      • Longhorn

      • MinIO

      • NetApp

      • Nutanix

      • OpenEBS

      • Portworx

      • Pure Storage CSI

      • Scaleway CSI

      • vSphere

  • Demonstrating an intimate understanding of cloud networking concepts, including best-practice networking models and security

  • Understanding of microservice architecture and best practices

  • Working knowledge of creating a Dockerfile to build an OCI-compliant Docker / container image

  • Demonstrating the ability to automate processes using Continuous Integration / Continuous Delivery (CI/CD) tools:

      • Airflow

      • Argo CD

      • AWS CodePipeline

      • Azure DevOps

      • CircleCI

      • Codefresh

      • Concourse

      • Flux CD / Flagger

      • GitHub Actions

      • GitLab

      • Harness

      • Jenkins / Jenkins X

      • Spinnaker

      • Tekton

      • Travis CI

  • Understanding of observability in Kubernetes using three or more tools from the following list:

      • Prometheus

      • Grafana

      • Alertmanager

      • Elasticsearch

      • Fluentd / Fluent Bit

      • Kibana

      • AppDynamics

      • Datadog

      • Dynatrace

      • New Relic One

      • Splunk

      • Sumo Logic

  • Working knowledge of all the following command line interface (CLI) tools:

      • git

      • helm

      • kubectl

  • Possessing expert working knowledge of YAML syntax

  • Possessing strong diagramming skills using one or more of the following tools:

      • Coggle

      • Creately

      • draw.io / diagrams.net

      • Lucidchart

      • Microsoft Visio

      • Miro

      • OmniGraffle

  • Demonstrating the ability to empathize with customers and respectfully engage stakeholders to align on goals, business objectives, timelines, and execution plans

This job is closed.