Job Title:
MLOps Engineer — GCP/GKE, vLLM Serving & Production Reliability
Company: AIBound
Location: Bengaluru, Karnataka
Created: 2026-01-29
Job Type: Full Time
Job Description:
Company Description

AIBound is revolutionizing AI security with the industry's first unified control plane for secure AI adoption. We discover, test, and protect every AI model, agent, and identity, catching AI risks before impact so enterprises can innovate safely and at speed. As AI adoption outpaces security across global organizations, AIBound eliminates the dangerous gap between innovation and protection.

Led by our CEO and founder, the former CISO at Palo Alto Networks and Workday, AIBound brings together a world-class team of cybersecurity veterans who have secured some of the world's most advanced enterprises. We're a fast-growing company backed by leading investors, positioned at the critical intersection of AI innovation and enterprise security, one of the most strategic technology frontiers of our generation.

Join us in building the future of AI security, where cutting-edge artificial intelligence meets battle-tested cybersecurity expertise.

Role

AIBound ships AI security capabilities that must be fast, reliable, secure, and cost-controlled in real enterprise environments.
We're hiring an MLOps Engineer to productionize and operate our LLM services on GCP using GKE, with a strong focus on high-performance serving (vLLM), safe rollout strategies, monitoring, and operational excellence. You'll work closely with AI and data engineers to ensure what we build can be deployed, scaled, and trusted.

Responsibilities

- Deploy and operate LLM inference services on GCP using GKE
- Implement high-performance serving with vLLM (or a comparable LLM serving stack)
- Build inference APIs using FastAPI and containerize services with Docker
- Implement autoscaling (HPA, GPU-aware scaling, traffic-based scaling), capacity planning, and SLOs
- Set up monitoring, logging, and alerting for latency, error rates, throughput, GPU utilization, and token usage
- Own CI/CD for model and service deployments, including rollback and canary strategies
- Implement production controls: secrets management, IAM, network policies, dependency scanning
- Drive cost optimization: caching, batching, quantization awareness, right-sizing, cold-start reduction

Qualifications

- 1–2 years of hands-on MLOps / platform / backend deployment experience
- Strong experience with GCP and GKE
- Solid Kubernetes and Docker fundamentals (deployments, services, configmaps/secrets, ingress)
- Experience serving models via vLLM (preferred) or similar serving frameworks
- Proficiency with FastAPI (or equivalent)
- Practical experience with CI/CD, monitoring, autoscaling, and rollback patterns

Benefits & Culture

- Highly competitive salary and equity package
- Hybrid work environment (2 days on-site per week) and vacation policy
- Comprehensive health benefits
- Professional development budget, conference attendance, and access to AI research resources

AIBound is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.