
Job Title:

MLOps Engineer — GCP/GKE, vLLM Serving & Production Reliability

Company: AIBound

Location: Bengaluru, Karnataka

Created: 2026-01-29

Job Type: Full Time

Job Description:

Company Description

AIBound is revolutionizing AI security with the industry's first unified control plane for secure AI adoption. We discover, test, and protect each AI model, agent, and identity, catching AI risks before impact so enterprises can innovate safely and at speed. As AI adoption outpaces security across global organizations, AIBound eliminates the dangerous gap between innovation and protection.

Led by our CEO and founder, the former CISO at Palo Alto Networks and Workday, AIBound brings together a world-class team of cybersecurity veterans who have secured some of the world's most advanced enterprises. We're a fast-growing company backed by leading investors, positioned at the critical intersection of AI innovation and enterprise security, one of the most strategic technology frontiers of our generation.

Join us in building the future of AI security, where cutting-edge artificial intelligence meets battle-tested cybersecurity expertise.

Role

AIBound ships AI security capabilities that must be fast, reliable, secure, and cost-controlled in real enterprise environments.
We’re hiring an MLOps Engineer to productionize and operate our LLM services on GCP using GKE, with a strong focus on high-performance serving (vLLM), safe rollout strategies, monitoring, and operational excellence. You’ll work closely with AI and data engineers to ensure what we build can be deployed, scaled, and trusted.

Responsibilities

- Deploy and operate LLM inference services on GCP using GKE
- Implement high-performance serving with vLLM (or a comparable LLM serving stack)
- Build inference APIs using FastAPI and containerize services with Docker
- Implement autoscaling (HPA, GPU-aware scaling, traffic-based scaling), capacity planning, and SLOs
- Set up monitoring, logging, and alerting for latency, error rates, throughput, GPU utilization, and token usage
- Own CI/CD for model and service deployments, including rollback/canary strategies
- Implement production controls: secrets management, IAM, network policies, dependency scanning
- Drive cost optimization: caching, batching, quantization awareness, right-sizing, cold-start reduction

Qualifications

- 1–2 years of hands-on MLOps / platform / backend deployment experience
- Strong experience with GCP and GKE
- Solid Kubernetes and Docker fundamentals (deployments, services, configmaps/secrets, ingress)
- Experience serving models via vLLM (preferred) or similar serving frameworks
- Proficiency with FastAPI (or equivalent)
- Practical experience with CI/CD, monitoring, autoscaling, and rollback patterns

Benefits & Culture

- Highly competitive salary and equity package
- Hybrid work environment (2 days on-site per week) and vacation policy
- Comprehensive health benefits
- Professional development budget, conference attendance, and access to AI research resources
- AIBound is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
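To give candidates a concrete sense of the GPU-aware autoscaling work mentioned above, here is a minimal sketch of a Kubernetes HPA manifest. All names are hypothetical: it assumes a Deployment called `vllm-server` and that the NVIDIA DCGM exporter plus a custom-metrics adapter expose the per-pod GPU utilization metric `DCGM_FI_DEV_GPU_UTIL`.

```yaml
# Sketch only: GPU-utilization-driven HPA for a vLLM Deployment on GKE.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vllm-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vllm-server        # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Pods
      pods:
        metric:
          name: DCGM_FI_DEV_GPU_UTIL   # exposed via DCGM exporter + metrics adapter
        target:
          type: AverageValue
          averageValue: "70"           # target ~70% average GPU utilization
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # dampen scale-down on bursty traffic
```

In practice, teams often combine a GPU signal like this with request-queue depth or latency metrics, since GPU utilization alone can lag behind traffic spikes.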

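The batching item under cost optimization can be illustrated with a stdlib-only micro-batcher sketch. The class and parameter names here are hypothetical; production serving stacks such as vLLM implement continuous batching at the token level internally, and this only shows the request-level idea of trading a small wait for larger batches.

```python
import time
from collections import deque


class MicroBatcher:
    """Groups incoming requests into batches bounded by size and wait time.

    Hypothetical sketch: flushes a batch when it is full, or when the
    oldest queued request has waited longer than max_wait_s.
    """

    def __init__(self, max_batch=8, max_wait_s=0.05):
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.queue = deque()

    def submit(self, request):
        # Record arrival time so we can bound per-request queueing latency.
        self.queue.append((time.monotonic(), request))

    def next_batch(self):
        # Return a batch only when full or when the oldest request is stale.
        if not self.queue:
            return []
        oldest_ts, _ = self.queue[0]
        full = len(self.queue) >= self.max_batch
        stale = time.monotonic() - oldest_ts >= self.max_wait_s
        if not (full or stale):
            return []
        batch = []
        while self.queue and len(batch) < self.max_batch:
            batch.append(self.queue.popleft()[1])
        return batch
```

Larger batches improve GPU throughput per request, while `max_wait_s` caps the latency cost of waiting for a batch to fill.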
Apply Now
