
Job Title:

AI Engineer - Synthetic Data Generation

Company: Skyfall AI

Location: Belgaum, Karnataka

Created: 2025-08-14

Job Type: Full Time

Job Description:

Position Overview

We are seeking an experienced AI Data Engineer to lead the development and scaling of synthetic data generation systems that support multiple enterprise platforms. This role combines deep technical expertise in AI/ML data pipelines with practical experience in system integration, orchestration frameworks, and Kubernetes infrastructure management.

Key Responsibilities

Synthetic Data Generation & Quality Assurance
- Design and implement scalable synthetic data generation systems to support model training
- Develop and maintain data quality validation pipelines that ensure synthetic data meets training requirements
- Build automated testing frameworks for synthetic data generation workflows
- Collaborate with ML teams to optimize synthetic data for model performance

APIs & Integration
- Develop and maintain REST API integrations across multiple enterprise platforms
- Implement robust data exchange, transformation, and synchronisation logic between systems
- Ensure error handling, retries, and monitoring for all integration workflows

Data Quality & Testing
- Implement automated data validation and testing frameworks for ETL and synthetic data workflows
- Translate data quality feedback from stakeholders into pipeline or generation process improvements
- Proactively monitor and maintain data consistency across systems

Multi-System Integration & MCP Development
- Build and maintain tool registries for Model Context Protocol (MCP) integration across multiple enterprise systems
- Develop robust APIs for multi-system communication through MCP frameworks
- Design and implement workflows that coordinate multi-system interactions
- Ensure reliable data flow and error handling across distributed system architectures

Cross-Functional Collaboration & Production Integration
- Partner with domain specialists to translate plan execution feedback into actionable insights
- Work closely with Product Managers to align synthetic data generation with business requirements
- Collaborate with Core Engineering teams to ensure seamless production deployment
- Establish feedback mechanisms between synthetic data systems and production environments

Required Qualifications

Technical Skills
- Programming: Proficiency in Python; TypeScript (optional)
- Data Engineering: Experience with data engineering frameworks and libraries (Pandas, Apache Airflow, Prefect)
- APIs & Integration: Strong background in REST APIs and system integration
- Databases: Experience with relational and NoSQL databases (PostgreSQL, MongoDB)
- Cloud Platforms: Hands-on experience with AWS/GCP/Azure

Experience Requirements
- 2+ years of experience building production-scale data pipelines and orchestration systems
- Demonstrated success in cross-functional collaboration in technical environments

Preferred Qualifications
- Familiarity with managing Kubernetes-based production workloads and workflow orchestration (Argo)
- Familiarity with containerisation and orchestration tools such as Docker and Kubernetes
- Familiarity with synthetic or large-scale data generation
- Background in enterprise software integration
- Experience with Model Context Protocol (MCP) or similar orchestration frameworks
- Knowledge of automated testing frameworks for data pipelines

What We Offer
- Lots of learning: many systems are being built from the ground up, with no existing references or open-source projects to rely on. This will be a first not just for you, but for the industry as well.
- Opportunity to work at the forefront of enterprise-scale synthetic data generation
- Collaborative environment with product teams, engineering, and domain specialists
- Competitive compensation and comprehensive benefits
- Professional development opportunities in cutting-edge data engineering and Kubernetes orchestration

Team Structure

You'll report to the AI Engineering Lead and work closely with:
- ML Engineers developing foundation models
- Product Managers defining business requirements
- Product Specialists providing domain expertise
- Backend Engineers handling production infrastructure

This role offers significant impact on our data capabilities and the opportunity to shape how we generate and utilize synthetic data for training enterprise systems.
