Job Title: SDE III Gen AI

Company: Glance

Location: Bangalore, Karnataka

Created: 2026-05-07

Job Type: Full Time

Job Description:

What You Will Be Doing

- Design and implement production-ready generative AI applications that serve millions of users, from initial architecture through deployment and monitoring
- Build advanced RAG (Retrieval-Augmented Generation) pipelines that combine vector databases, hybrid search, and intelligent caching to deliver sub-second response times
- Develop multimodal AI systems that integrate text, vision, and audio capabilities using state-of-the-art models
- Architect scalable microservices that handle thousands of concurrent AI requests while optimizing for cost, latency, and reliability
- Lead code reviews and technical design sessions, establishing best practices and architectural patterns that elevate the entire team's capabilities
- Optimize large language models through fine-tuning techniques to achieve domain-specific performance improvements
- Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing frameworks, and real-time monitoring dashboards
- Collaborate with product managers and stakeholders to translate complex business requirements into innovative AI solutions
- Deploy AI models on cloud platforms (GCP) using containerization and orchestration technologies
- Create and maintain technical documentation, runbooks, and architectural decision records that enable knowledge sharing across teams
- Mentor junior engineers through pair programming, technical talks, and hands-on guidance to accelerate their growth
- Research and prototype emerging AI technologies to identify opportunities for competitive advantage

Gen AI Responsibilities

- Fine-tune and optimize state-of-the-art language models for specific business use cases, achieving significant improvements in accuracy and relevance
- Design multi-agent AI systems that use orchestration frameworks to coordinate complex workflows and decision-making processes
- Implement advanced prompt engineering strategies including Tree of Thoughts, ReAct patterns, and automatic prompt optimization to maximize model performance
- Build production-grade embedding systems that handle billions of vectors, implementing efficient indexing strategies and hybrid search capabilities
- Develop computer vision pipelines for tasks ranging from object detection to visual question answering
- Create secure AI applications with robust safeguards against prompt injection, jailbreaking, and data leakage while maintaining compliance with AI governance standards
- Optimize token usage and implement intelligent caching strategies to reduce costs by 50-70% while maintaining quality
- Design and implement evaluation frameworks that go beyond traditional metrics, incorporating human feedback loops and domain-specific quality measures
- Build real-time AI inference systems capable of processing streaming data with sub-100ms latency requirements
- Integrate multiple foundation models into unified applications, implementing fallback mechanisms and load balancing for high availability
- Develop custom tools and functions that extend LLM capabilities, enabling models to interact with databases, APIs, and external systems
- Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG for complex reasoning tasks
- Create multimodal search systems that enable users to query across text, images, and documents using natural language
- Build AI-powered data processing pipelines that automatically extract, transform, and enrich unstructured data at scale
- Deploy edge AI solutions using frameworks like ONNX and TensorRT, optimizing models for resource-constrained environments

What We're Looking For

- 5+ years of hands-on experience building and deploying ML/AI systems, with at least 2 years focused on generative AI and LLMs
- Expert-level Python programming skills with deep knowledge of async programming, multiprocessing, and performance optimization
- Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases
- Proven track record of deploying AI applications to production environments serving real users at scale
- Deep understanding of transformer architectures, attention mechanisms, and the latest advances in generative AI
- Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)
- Excellent communication skills with the ability to explain complex AI concepts to both technical and non-technical audiences
- Proven experience improving large-scale product search and discovery, including dense retrieval with bi-encoders, cross-encoder reranking, query understanding, and hybrid BM25 + vector search across catalogs of tens of millions of SKUs
- Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK, designing stateful, tool-augmented agents for complex, real-world workflows
- Bachelor's degree in Computer Science, Mathematics, or a related field (Master's preferred, but not required given relevant experience)

Nice to Have

- Published research papers or significant contributions to open-source AI projects
- Experience with multimodal AI systems combining vision, language, and audio
- Domain expertise in specific verticals (healthcare, finance, legal, e-commerce)
- Knowledge of AI safety, alignment, and constitutional AI principles
- Experience building AI infrastructure and platforms used by other engineers
- Familiarity with emerging technologies like neural architecture search, mixture of experts, or neuromorphic computing
