Job Title:
SDE III - AI Software Engineer - RAG - Vector Database
Company: CareerXperts Consulting
Location: Bharatpur, Rajasthan
Created: 2026-01-01
Job Type: Full Time
Job Description:
What You’ll Do
- Architect, build, and scale agentic RAG and text-to-SQL copilots supporting 50K+ daily queries, delivering 99.9% uptime, low latency, and high semantic accuracy.
- Design, operate, and continuously optimize a production-grade LLMOps platform, leveraging LangGraph, LangSmith, MLflow, Kubernetes, async inference, and leading cloud LLM providers such as AWS Bedrock, Google Vertex AI, Azure OpenAI, and Anthropic.
- Develop and own MCP server integrations, ensuring reliable, efficient, and secure runtime execution across multi-agent workflows and toolchains.
- Implement evaluation and guardrail frameworks (AI-as-a-Judge, grounding checks, safety filters, regression tests) to minimize hallucinations, control model drift, and reduce token usage and inference costs by 30%+.
- Own end-to-end system observability and performance, including latency, throughput, reliability, cost optimization, caching strategies, and retrieval quality.
- Optimize inference, retrieval, and orchestration pipelines to support high-traffic, enterprise-scale workloads.
- Partner closely with product, infrastructure, and leadership teams to define SLAs, unblock customer requirements, and deliver robust, enterprise-ready AI capabilities.
- Leverage AI-assisted development tools (GitHub Copilot, MCP-enabled IDEs, Claude, GPT, etc.) to improve development velocity, code quality, and system reliability.

What We’re Looking For
- 5+ years of experience in software engineering or ML engineering, with hands-on ownership of production-grade LLM, RAG, or agent-based systems.
- Strong Python engineering expertise, with deep experience building RAG pipelines, agent architectures, tool-calling workflows, and text-to-SQL copilots.
- Proven experience working with MCP servers, vector databases, and retrieval-augmented system architectures.
- Strong understanding of agent development, LLM integration patterns, prompt engineering, and runtime orchestration frameworks.
- Hands-on experience with cloud-native infrastructure, including Kubernetes, async workers, queueing systems, and observability/monitoring stacks.
- Demonstrated ability to build LLM evaluation pipelines, guardrails, monitoring, experiment tracking, and regression testing for AI systems.
- Experience with multiple agent SDKs, such as:
  - Anthropic SDK
  - Claude Agent SDK
  - Google ADK (Agent Developer Kit)
  - Bonus: LangChain, LlamaIndex, AutoGen, or custom agent runtimes
- Strong ownership mindset, with a track record of taking AI prototypes from concept to scalable, reliable, high-traffic production systems.

Write to shruthi.s@ to get connected.