Role – Cloud GCP SRE Engineer
Experience – 7–10 years
Location – Bangalore

An experienced engineer is required to work on public cloud projects within GCP in a global financial organization. The role is focused on improving operational stability and reliability while helping establish production best practices for a new CSP partner.

This role requires the successful candidate to have:
Prior SRE experience or the willingness to specialize in the field
Experience in transforming IT service delivery toward a DevOps culture
Strong infrastructure, development and DevOps skills
Proven track record of service ownership, accountability and building trusted relationships
Experience in Change and Incident Management concepts

Primary responsibilities
Work in a globally distributed team to provide innovative and robust public cloud solutions
Collaborate with vendors to develop and deploy cloud services to meet customer expectations
Collaborate with IT Security to ensure the necessary controls for cloud services are deployed and tested
Design, optimize and document the operational aspects of the cloud platform
Develop Infrastructure as Code to automate cloud deployments
Develop automation workflows in CI/CD pipelines that adhere to the change management process
Facilitate tabletop exercises for incident management processes and chaos engineering
Evaluate and implement emerging DevOps tools
Troubleshoot complex on-premises and cloud environment issues
Build and integrate observability into cloud platforms and solutions
Highlight and reduce toil with automation, architecture improvements, and process improvements

Required Skills
Experience with Infrastructure as Code
Experience with CI/CD pipelines
Sound knowledge of server infrastructure and cloud computing
Good knowledge of security (SAML, OAuth, OpenID, Kerberos, policies, entitlements, etc.)
Experience with architecting and maintaining high-availability production systems
Strong development skills in Python
Experience in software installation, configuration and patching
Hands-on experience in playbook and infrastructure automation (Ansible, Terraform)
Experience implementing open-source observability tools (Prometheus, Grafana, or OpenTelemetry)
Experience with Agile and DevOps methodologies
Experience developing monitoring architecture and implementing monitoring agents, dashboards, escalations and alerts
Ability to communicate technical issues and ideas to colleagues and customers with clarity
Experience creating technical architecture documentation

Desired Skills
Knowledge of security controls for the public cloud (encryption of data in motion and at rest, and key management)
Hands-on experience with GCP design and implementation
Knowledge of Linux and Windows containers
Bachelor's degree in a related field

Job Title
Site Reliability Engineer