Company: Ampcus Incorporated
Location: Palo Alto
Closing Date: 01/12/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
Position Title - AWS DevOps Lead
Domain EXP - (Healthcare Domain)
Location - Palo Alto CA (Onsite)
Experience - 14+ Years Exp
Must Have - AWS DevOps, Kubernetes, Docker, Jenkins, Terraform, Infrastructure as code
(Requires working from the Guardant office in Palo Alto 3 days)
Good to have: Golang, Java
About the Role:
Deep understanding of the software development life cycle and zero-downtime release management. Experience with agile-based iterative development and knowledge of software engineering best practices
Influence the development of solutions that impact strategic projects/program goals and business outcomes
Resolve highly complex problems using a significant application of technical knowledge, conceptualization, reasoning and interpretation
Communicate effectively to help bridge stakeholder and development requirements
Lead the design and implementation of our public cloud infrastructure and large-scale Kubernetes clusters, including CI/CD, provisioning, sizing, and Infrastructure as code
Build a release pipeline to enable fast but safe delivery of critical business software to Production
Drive best practices in cost optimization, security, monitoring, alerting, operational excellence, performance efficiency and reliability in underlying systems
Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity
Practice sustainable incident response and blameless postmortems
Be part of an on-call rotation to support production systems and post-deployment monitoring
Lead and mentor junior engineers on the team
Experience Needed
15+ years of experience required
Solid DevOps and release orchestration background, with industry experience deploying highly available, rapidly scalable cloud-based computing services (AWS, GCP, etc.)
Experience in running cloud services using products such as Kubernetes, Docker or OpenShift
Experience with cloud-native ecosystem tools and technology stack such as container security, static code vulnerability, service proxies, container network, and service mesh
Strong automation skills using tools such as Ansible, Chef, Terraform, Jenkins a must
Strong knowledge of Linux and Linux environments (RHEL 6/7/8, RHCSA/RHCE, CentOS) a must
Programming experience in operating cloud environments using languages such as Python, Go, Ruby, Java, etc.
Experience in designing and managing CI/CD platforms on tools like Jenkins that allow for multiple releases/day and developer visibility
Implementation, management and optimization of observability tools and platforms for system monitoring, logging, tracing and metrics (e.g. Prometheus, Grafana, Kibana, Sentry, Logz.io, New Relic, Jaeger, Splunk, etc.)
Must be fluent with Git
Expert in designing, analyzing and troubleshooting large-scale distributed systems.
Systematic problem-solving approach, coupled with strong verbal and written communication skills and a sense of ownership and drive
Experience working with the various business units to help define SLOs and SLIs
Ability to successfully work with customers, developers, testing, project management, and support staff