Job Title: AI Associate Manager
Company: PepsiCo
Location: Hyderabad, Telangana
Created: 2025-12-19
Job Type: Full Time
Job Description:
Overview
We are seeking a highly skilled and proactive AI Solutions SRE Lead to oversee the maintenance, optimization, and ongoing performance of deployed AI/ML systems and solutions. In this role, you'll act as the bridge between innovation and operations, ensuring our AI solutions consistently deliver value and operate seamlessly in real-world environments. You will lead efforts to monitor deployments, troubleshoot issues, and define best practices for sustaining AI systems throughout their lifecycle.

Responsibilities

Monitoring & Sustenance:
- Lead the post-deployment lifecycle of AI solutions, ensuring continued functionality, reliability, and scalability.
- Establish monitoring frameworks to oversee system performance, usage, and metrics for AI/ML models and APIs.
- Detect anomalies in AI systems, troubleshoot operational issues, and initiate timely corrective actions.

Performance Optimization:
- Continuously assess and optimize the performance of AI models to maintain efficiency and accuracy in production environments.
- Collaborate with data scientists and engineers to refine algorithms, retrain models, and update solutions as needed.
- Implement automation where possible to streamline maintenance processes.

Stakeholder Collaboration:
- Work with cross-functional teams (engineering, product, operations, etc.) to ensure alignment of AI sustainment activities with business goals.
- Communicate effectively with stakeholders to provide updates on system health, risks, and improvements.

Governance & Best Practices:
- Define and implement best practices for sustaining AI solutions, including documentation, testing protocols, and version control.
- Ensure compliance with ethical AI standards, regulatory guidelines, and established governance frameworks.
- Manage and mitigate risks associated with model drift, data shifts, and system vulnerabilities.

Incident Management:
- Lead responses to critical incidents involving AI systems by performing root cause analysis and deploying solutions for quick resolution.
- Advocate for proactive risk prevention and early detection strategies.
- Mentor and develop junior team members, fostering their skills in AI observability and domain-specific knowledge in ML, Computer Vision, and Generative AI.

Qualifications

Required:
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related field; advanced degree preferred.
- 9+ years of experience in machine learning, data science, or software engineering roles, with significant exposure to Computer Vision and Generative AI projects.
- 4+ years of experience specifically focused on developing and sustaining AI/ML applications and solutions.
- Strong programming skills in languages such as Python, Java, or Go.
- Extensive experience with AI/ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and cloud platforms (e.g., AWS, Azure, GCP).
- Proficiency in data visualization tools and techniques (e.g., Grafana, Tableau, D3.js).
- Deep understanding of AI/ML concepts, including model training, evaluation, and deployment, with specific knowledge of Computer Vision and Generative AI techniques.
- Experience with monitoring and observability tools such as Prometheus, the ELK stack, or similar systems.
- Excellent problem-solving skills and the ability to troubleshoot complex AI systems across various domains.
- Proven track record of mentoring and developing junior team members in AI-related roles.

Preferred:
- Experience with MLOps practices and tools, particularly for large-scale AI systems.
- Familiarity with AI ethics and responsible AI principles, especially as they relate to Generative AI.
- Knowledge of relevant AI regulations and compliance requirements, including those specific to Computer Vision applications.
- Experience with distributed systems and large-scale data processing for AI applications.
- Contributions to open-source projects or research publications on AI solutions at production scale.
- Previous experience with large-scale AI/ML solutions in production environments.
- Knowledge of DevOps principles and CI/CD pipelines specific to AI/ML systems.

Key Competencies
- Strong analytical and critical thinking skills
- Excellent communication and collaboration abilities
- Proactive and self-motivated work ethic
- Ability to explain complex technical concepts to both technical and non-technical audiences
- Adaptability and willingness to learn in a rapidly evolving field
- Strong mentorship and leadership skills
- Deep curiosity and passion for AI, particularly in ML, Computer Vision, and Generative AI domains

We are looking for a passionate and innovative individual who can help us build robust, transparent, and reliable AI systems while nurturing the growth of our team. If you have a strong background in AI/ML, with specific expertise in Computer Vision and Generative AI, and a keen interest in observability and system reliability, we encourage you to apply.