
Job Title:

Senior Data Engineer - Gen AI / LLM [T500-25006]

Company: Talent500

Location: Pune, Maharashtra

Created: 2026-04-15

Job Type: Full Time

Job Description:

Talent500 is hiring for one of its clients.

About Infinite Electronics:
Infinite Electronics is a global manufacturer of high-performance connectivity solutions, serving customers across a wide range of industries. With deep engineering expertise and a focus on precision-built components and assemblies, the company partners closely with customers to address complex, real-world challenges and accelerate product innovation.

About Infinite India (GCC):
Infinite is establishing its India Global Capability Center (GCC) in Pune to expand its engineering, digital, and customer service capabilities. This center will play an important role in supporting mission-critical initiatives while working in close collaboration with global stakeholders. This is an opportunity to be part of the early team shaping how the center operates, influencing technology standards, scalable processes, and collaborative ways of working from the outset. Located at Mont Clare, Baner, the center provides a modern work environment designed to foster collaboration, innovation, and long-term career growth. This is a unique opportunity to grow alongside a center that is being built with long-term capability and excellence in mind.

Why Join:
- Build from the start: Be part of the early team shaping foundational systems, standards, and ways of working.
- Global exposure: Collaborate directly with international stakeholders on impactful engineering and digital initiatives.
- Modern environment, long-term growth: Work from a state-of-the-art office in Baner, Pune, and grow your career within a center designed for sustained capability expansion.
- Built with intent: The India GCC is being developed with a strong focus on capability, ownership, and long-term excellence.

Position Name: Senior Data Engineer
Location: Pune, India
Exempt / Non-Exempt: Exempt

Position Description:
The Senior Data Engineer is a hands-on data engineering leader based in Pune, reporting to U.S. engineering leadership and working within a small AI engineering team. This is a delivery-focused role: the primary accountability is building, operating, and continuously improving production-grade data pipelines, models, and platforms that power analytics, operational systems, and AI workflows.

This role owns the end-to-end lifecycle of production data solutions, from ingestion and transformation through validation, governance, and operational support, while contributing to data architecture standards and modeling patterns in close partnership with U.S. engineering, analytics, AI, platform, and security teams.

The right candidate is equally comfortable designing scalable data models and debugging pipeline failures, enforcing governance and lineage standards and responding to production incidents, and enabling AI-ready data products, all while maintaining reliability, compliance, and traceability across the data platform.

Work Location & Schedule Expectations:
This role is based in Pune and works in close daily collaboration with U.S.-based engineering leadership and cross-functional teams.
- Work Model: Hybrid, minimum 3 days per week in the office, coordinated with the team to ensure consistent shared in-person working days.
- U.S. Collaboration: The daily schedule must include 3 to 4 hours of overlap with U.S. Eastern Time (ET) to support active collaboration with U.S. engineering, product, and platform teams.
- Operational Availability: As a hands-on engineer accountable for data pipelines and platform services, this role requires availability during critical releases, deployments, and production incidents, which may occasionally fall outside standard working hours, including early mornings, evenings, or weekends.

Qualifications & Experience:

Required Experience:
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related technical field, or equivalent practical experience.
- Strong track record designing, building, and operating production-grade batch, streaming, and real-time data pipelines at enterprise scale, with clear ownership of reliability, performance, and operational outcomes.
- Demonstrated experience designing and maintaining data models and schemas for analytics, operational systems, and AI or GenAI use cases, including schema evolution and controlled change management.
- Experience implementing data governance, lineage, versioning, and quality assurance frameworks across production data pipelines and data products.
- Proven operational ownership of production data pipelines, including SLA/SLO definition, monitoring, alerting, incident response, and continuous improvement.
- Experience with cloud-native data platforms, ETL/ELT tooling, and both SQL and NoSQL storage technologies.
- Iterative delivery mindset: ships working pipeline components and data products incrementally, incorporating feedback from downstream consumers to refine and improve solutions over time.
- Communicates data architecture decisions, pipeline tradeoffs, and operational status clearly to technical and non-technical stakeholders in a distributed, cross-cultural environment.
- Takes ownership of data quality and pipeline reliability as production responsibilities, not just implementation tasks.
- Comfortable using AI coding assistants as part of a standard development workflow, with the judgment to validate, test, and take ownership of AI-generated code in production contexts.

Preferred Experience:
- Master's degree in Computer Science, Data Engineering, or a related field.
- Experience defining enterprise data architecture, including reference models, data contracts, and cross-system integration standards.
- Experience preparing AI-ready datasets for GenAI workflows, including feature store design, embedding pipelines, feedback loops, and auditability requirements for training and inference.
- Experience implementing automated data validation, anomaly detection, and quality monitoring frameworks across production pipelines.
- Experience applying the Bronze/Silver/Gold medallion architecture to manage staged data quality, transformations, and downstream consumption at scale.
- Knowledge of distributed storage and compute architecture design for high availability, resilience, and cost optimization.
- Experience with CI/CD pipelines, version control, and automated testing for data engineering workloads.
- Familiarity with regulatory, compliance, and security frameworks relevant to enterprise data, including data privacy, access controls, and audit requirements.
- Hands-on experience with the Microsoft Azure data ecosystem, including services such as Azure Data Factory, Azure Synapse Analytics, Azure Blob Storage, Azure Data Lake, Key Vault, and Application Insights, plus familiarity with Bicep for infrastructure as code. Candidates with equivalent depth on AWS or GCP are encouraged to apply.

Key Duties and Responsibilities:

Data Engineering & Operational Ownership:
- Design, build, and operate production-grade batch, streaming, and real-time data pipelines, applying the Bronze/Silver/Gold medallion methodology to ensure staged data quality, lineage, and reliability across ingestion, transformation, and consumption layers.
- Implement data validation, anomaly detection, monitoring, and alerting frameworks to maintain SLA/SLO compliance and production readiness.
- Own the end-to-end pipeline lifecycle, including ingestion, transformation, storage, incident response, and continuous improvement, treating data reliability and availability as production responsibilities.
- Deliver pipeline components and data products incrementally, incorporating feedback from downstream consumers to refine and improve solutions over time.

Data Modeling & AI/GenAI Enablement:
- Design and maintain enterprise data models and schemas to support analytics, operational systems, and AI/GenAI workflows, including schema evolution and controlled change management.
- Prepare AI-ready datasets by implementing embedding pipelines, feature stores, feedback loops, and traceability controls while maintaining governance and auditability requirements.
- Define reusable modeling patterns, data contracts, and schema evolution strategies to accelerate delivery and maintain downstream reliability.

Data Architecture & Governance:
- Contribute to enterprise data architecture standards, reference models, and integration patterns for multi-source systems, partnering with U.S. engineering and platform teams to ensure consistency and scalability.
- Implement and enforce data governance, lineage, compliance, and security frameworks across pipelines and data products.
- Partner with security, privacy, and cross-functional stakeholders to embed access controls, ensure interoperability, and support long-term data platform scalability.

Collaboration & Communication:
- Work closely with U.S.-based engineering, analytics, AI, and business teams across time zones, translating data requirements into technical solutions and surfacing tradeoffs early to keep delivery on track.
- Communicate pipeline architecture decisions, data model tradeoffs, and operational status clearly to technical and non-technical audiences.
- Provide technical guidance and peer review to engineers within the Pune team, contributing to overall data engineering quality through hands-on feedback.
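For candidates unfamiliar with the Bronze/Silver/Gold medallion pattern referenced in this posting, the staging idea can be sketched in a few lines. This is a minimal, hypothetical illustration only (in-memory pandas standing in for a lakehouse engine such as Spark/Delta; all table, column, and rule names are invented for the example, not taken from the role):

```python
# Bronze/Silver/Gold medallion sketch (hypothetical data and rules).
# Bronze: raw ingested records kept as-is for lineage and replay.
# Silver: validated, deduplicated, typed records.
# Gold:   aggregated, consumption-ready data products.
import pandas as pd

# Bronze layer: raw events exactly as ingested, duplicates and bad rows included.
bronze = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4],
    "amount":   [10.0, 10.0, -5.0, 30.0, 12.5],  # -5.0 fails validation
    "region":   ["NA", "NA", "EU", "EU", "NA"],
})

# Silver layer: apply simple data-quality rules and deduplicate on the key.
silver = (
    bronze
    .drop_duplicates(subset="order_id")  # keep first occurrence per order
    .query("amount >= 0")                # example validation rule
    .reset_index(drop=True)
)

# Gold layer: consumption-ready aggregate for analytics or AI consumers.
gold = silver.groupby("region", as_index=False)["amount"].sum()

print(gold)
```

In a production pipeline each layer would be a persisted table with its own quality checks, lineage metadata, and SLAs; the point of the pattern is that downstream consumers only ever read from the curated Silver/Gold layers while Bronze preserves the raw record of what was ingested.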

