Job Title:
Senior Data / ETL Engineer
Company: EazyML
Location: Thane, Maharashtra
Created: 2026-03-10
Job Type: Full Time
Job Description:
Company Description

EazyML is an innovative machine learning platform designed to predict outcomes from textual data with unparalleled transparency and ease of use. As the first of its kind, EazyML sets a new standard for user-friendly machine learning solutions. The platform empowers organizations to leverage machine learning efficiently without requiring extensive technical expertise.

Role Description

This is a full-time remote role for a Senior Data/ETL Engineer. The role involves designing, developing, and maintaining ETL processes to ensure seamless data integration and transformation; collaborating with cross-functional teams to analyze data requirements, create data models, and optimize performance; and troubleshooting and resolving data-related challenges while ensuring the integrity and accuracy of data pipelines.

We are seeking a highly skilled Senior Data Engineer / ETL Developer to design, develop, and optimize scalable data pipelines, APIs, and database solutions across modern data platforms. The ideal candidate has deep hands-on expertise in PySpark, Python, SQL, and distributed data systems, along with exposure to AI/ML workflows.
You will collaborate with cross-functional teams to build reliable, high-performance data infrastructure that powers analytics and machine learning solutions.

Qualifications

- Strong expertise in Extract, Transform, Load (ETL) processes and proficiency with ETL tools
- Proven experience in data integration and data modeling techniques
- Solid analytical skills and a problem-solving mindset
- Familiarity with machine learning platforms and data processing workflows is a plus
- Bachelor's degree in Computer Science, Data Science, or a related field
- Ability to work effectively in a remote, collaborative environment
- Strong communication and organizational skills

Required Qualifications

- 5+ years of hands-on experience in data engineering
- Strong proficiency in Extract, Transform, Load (ETL) processes and experience with ETL tools
- Expertise in data integration and data modeling practices
- Excellent analytical skills to interpret, structure, and manage complex datasets
- Strong expertise in PySpark and distributed data processing frameworks (Spark, Databricks, Hive, etc.)
- Advanced proficiency in Python for data engineering
- Deep knowledge of SQL and database performance tuning
- Strong understanding of data modeling, warehousing concepts, and ETL/ELT architectures
- Experience with cloud data platforms (AWS, Azure, or GCP)
- Experience with orchestration tools (Airflow or similar)
- Experience designing APIs for data services (FastAPI, Flask, etc.)
- Familiarity with modern data stack tools and real-time streaming architectures

Key Responsibilities

Data Engineering & Pipeline Development
- Design, develop, and optimize large-scale ETL/ELT pipelines using PySpark and distributed data processing frameworks
- Build high-performance data ingestion workflows from structured and unstructured sources
- Implement scalable data models, data marts, and enterprise data warehouse solutions
- Ensure data quality, reliability, lineage, and governance across pipelines

Programming & Database Expertise
- Apply strong proficiency in Python, including libraries for data processing
- Apply advanced knowledge of SQL and performance optimization techniques
- Write and optimize complex SQL queries, stored procedures, triggers, and functions
- Develop clean, modular, and efficient Python code for data processing and automation
- Work with relational and NoSQL databases (MySQL, PostgreSQL, SQL Server, MongoDB, etc.)
- Manage database migrations, schema changes, and lifecycle processes

AI/ML & Data Science Collaboration
- Partner with Data Science teams to productionize machine learning models
- Integrate ML models into scalable data platforms
- Support model deployment and MLOps processes

Architecture & Best Practices
- Design scalable, cloud-based data architectures (AWS, Azure, or GCP)
- Drive best practices in CI/CD, testing, performance optimization, and cloud deployments
- Work within Agile development environments using tools like Azure DevOps or GitHub

Preferred Qualifications

- Exposure to AI/ML workflows and MLOps tools (MLflow or similar)
- Experience with ETL tools such as Talend, Apache NiFi, or Informatica
- Knowledge of API frameworks (Flask, FastAPI, Django REST Framework)
- Familiarity with CI/CD and ALM tools (Azure DevOps, GitHub)

Education

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).