Job Title:

MLOps Engineer

Company: Recro

Location: Eluru, Andhra Pradesh

Created: 2025-10-21

Job Type: Full Time

Job Description:

Role Overview

We are looking for an experienced MLOps Lead with deep expertise in Azure and AWS cloud ecosystems, who can design, deploy, and manage scalable AI/ML infrastructure. The ideal candidate should bring a strong background in cloud governance, GenAI tooling, automation, and CI/CD pipelines, with hands-on experience across modern MLOps frameworks.

Key Responsibilities

- Design, implement, and manage scalable cloud-based AI/ML infrastructure across Azure and AWS.
- Drive the end-to-end MLOps lifecycle: model deployment, monitoring, retraining, and governance.
- Enable GenAI and Agentic AI platforms leveraging Azure OpenAI, Bedrock, Anthropic Claude, LangChain, etc.
- Implement CI/CD pipelines using Azure DevOps or AWS CodePipeline.
- Ensure security, observability, and compliance across ML and GenAI ecosystems.
- Manage infrastructure automation via Terraform, Bicep, CloudFormation, or similar IaC tools.
- Collaborate with data science and engineering teams to optimize ML workflows, data pipelines, and API integrations.
- Implement monitoring and alerting using Grafana, Prometheus, Azure Monitor, and Application Insights.
- Oversee networking, identity management, and role-based access controls (IAM, RBAC) across clouds.
- Support model lifecycle management: drift monitoring, retraining, technical evaluation, and business validation.

Technical Skills & Expertise

Cloud & MLOps Platforms

- Azure: Azure ML, Azure AI Services, Azure OpenAI, Azure Kubernetes Service (AKS), Databricks, Azure Search, Azure Blob, Cosmos DB, Azure SQL, Azure Functions, Azure Event Hub, Azure Resource Manager (ARM), Bicep.
- AWS: SageMaker, Bedrock, Lambda, DynamoDB, S3, RDS, Redshift, ECR, CloudFormation, CDK, KMS, EventBridge, Step Functions.

AI/ML & Programming

- Hands-on in Python, with exposure to TensorFlow, PyTorch, scikit-learn.
- Understanding of LLM tokenization, prompt injection risks, jailbreak prevention, and AI safety techniques.
- Familiarity with LangChain, LlamaCloud, AI Foundry, and related frameworks.
- Experience in model monitoring, retraining, and evaluation workflows.

DevOps & Infrastructure

- Expertise in CI/CD pipelines, containerization (Docker, Kubernetes), and infrastructure automation.
- Strong in governance, audit logging, security policies (Azure Policy, AWS SCP, IAM).
- Deep understanding of networking, DNS, load balancers, VNets/VPCs, VPNs.
- Skilled in IaC tools: Terraform, Bicep, ARM, CloudFormation.

Monitoring & Observability

- Experience with Grafana, Prometheus, Application Insights, Log Analytics Workspaces, Azure Monitor.

Security & Access Management

- Understanding of Microsoft AD, least privilege principles, IAM, RBAC.

Testing & Automation

- Familiarity with unit testing and integration testing in CI/CD workflows (preferably Azure DevOps).

Good to Have

- Experience with Azure Bot Framework, M365 Copilot, and APIM.
- Exposure to code assistants such as GitHub Copilot, Cursor, Claude Code.
- Knowledge of Boto3 SDK (AWS Python) and TypeScript for IaC.

Preferred Background

- Strong background in cloud infrastructure engineering and machine learning operations.
- Proven ability to lead cross-functional teams and implement AI governance at scale.
- Excellent problem-solving, communication, and documentation skills.
