Job Title: AI Data Engineer
Company: Hero Vired
Location: New Delhi, Delhi
Created: 2026-01-04
Job Type: Full Time
Job Description:
About Hero Vired:

Would you like to be part of an exciting, innovative, and high-growth startup from one of the largest and most well-respected business houses in the country - the Hero Group?

Hero Vired is a premium learning experience offering industry-relevant programs and world-class partnerships, to create the change-makers of tomorrow.

At Hero Vired, we believe everyone is made of big things. With the experience, knowledge, and expertise of the Hero Group, Hero Vired is on a mission to change the way we learn. Hero Vired aims to give learners the knowledge, skills, and expertise through deeply engaged and holistic experiences, closely mapped with industry to empower them to transform their aspirations into reality. The focus will be on disrupting and reimagining university education & skilling for working professionals by offering high-impact online certification and degree programs.

The illustrious and renowned US$5 billion diversified Hero Group is a conglomerate of Indian companies with primary interests and operations in automotive manufacturing, financing, renewable energy, electronics manufacturing, and education. The Hero Group (BML Munjal family) companies include Hero MotoCorp, Hero FinCorp, Hero Future Energies, Rockman Industries, Hero Electronix, Hero Mindmine, and the BML Munjal University.

For detailed information, visit Hero Vired

Role: AI Data Engineer
Job Type: Full Time
Work Type: Work From Office
Location: New Delhi (Sultanpur)
Experience: 3+ years

About the Role:

Hero Vired is seeking a skilled AI Data Engineer – Conversational AI Systems with 3+ years of relevant experience to support the design, development, and optimization of intelligent conversational platforms used in our digital learning ecosystem.
This role will focus on building scalable data pipelines, training-ready datasets, and AI-driven conversational solutions that enhance learner engagement and personalized experiences.

The ideal candidate will work closely with AI/ML engineers, product teams, and content specialists to develop robust data foundations for chatbots, virtual teaching assistants, and AI-powered learner support systems. You will play a key role in enabling NLP models, managing structured and unstructured data, and ensuring high-quality, reliable data flows for conversational AI applications.

This position is ideal for professionals passionate about AI, NLP, and education technology, who enjoy working in fast-paced environments and want to create real-world impact through AI-driven learning solutions.

Key Responsibilities

Conversational AI Infrastructure
- Build and optimize knowledge pipelines for voice bot context injection using Azure AI Search + vector embeddings
- Design ETL pipelines to process CRM data (lead profiles, interaction history, program details) for real-time bot personalization
- Implement lazy-recall architecture for efficient prompt context loading during live calls

Voice Bot Data Layer
- Structure product/program data (courses, pricing, features, eligibility) in normalized formats optimized for sub-100ms retrieval
- Build semantic search pipelines for dynamic FAQ retrieval and objection handling content
- Design customer persona classification models to enable adaptive conversation flows

Real-Time Processing & Integration
- Develop data sync pipelines between CRM, LMS (Moodle), and bot orchestration layer
- Build streaming pipelines for call metadata, conversation state, and lead scoring updates
- Integrate with Azure services (Cognitive Services, OpenAI, Cosmos DB) for knowledge base management

Quality & Feedback Systems
- Design conversation analytics pipelines to track bot performance metrics (booking rates, objection handling success, human-likeness scores)
- Implement LLM-based quality analysis for automated transcript evaluation
- Build feedback loops that feed conversation insights back into prompt optimization and knowledge base updates

ML & Continuous Improvement
- Train/update classification models for intent detection, customer sentiment, and lead scoring
- Implement confidence scoring and hallucination detection for bot responses
- Design RLHF pipelines using conversation quality scores to fine-tune response generation
- Build reward models that learn from QA analyst feedback and successful conversion patterns
- Implement RAG pipelines for dynamic knowledge retrieval during live calls

Deployment & MLOps (End-to-End Ownership)
- Own CI/CD pipelines for prompt versioning, knowledge base updates, and model deployments
- Design multi-environment strategy (dev → staging → prod) with automated testing for bot responses
- Implement blue-green/canary deployments for safe rollout of prompt changes and model updates
- Build automated regression testing: validate bot responses against golden datasets before production push
- Set up infrastructure-as-code (Terraform/Bicep) for Azure resources
- Design rollback mechanisms for quick recovery when bot performance degrades
- Implement observability stack: logging, tracing, alerting for latency spikes, error rates, and conversation quality drops
- Manage containerized deployments (Docker/Kubernetes) for bot services and data pipelines
- Automate knowledge base refresh cycles with validation gates

Required Core Skill Sets
- Python (FastAPI, Pandas, Scrapy) + PostgreSQL/Cosmos DB
- Azure AI stack (AI Search, OpenAI, Cognitive Services, Azure Functions)
- RAG architecture and vector embeddings
- Real-time data pipelines (Airflow/Prefect)

MLOps & DevOps
- CI/CD tools (Azure DevOps, GitHub Actions, or GitLab CI)
- Infrastructure-as-code (Terraform/Bicep)
- Docker, Kubernetes (AKS preferred)
- Monitoring & observability (Prometheus, Grafana, Azure Monitor)
- Git-based prompt/config versioning strategies

Preferred
- Experience with conversational AI or voice systems
- RLHF or preference-based model tuning
- A/B testing frameworks for ML systems