Job Title:
Data Engineer - ETL
Company: Algoworks
Location: Malappuram, Kerala
Created: 2026-04-23
Job Type: Full Time
Job Description:
Data Engineer - ETL
Location: Remote
Experience: 6-8 Years
Algoworks

About the Company
We are a global team of engineers, architects, designers, researchers, operators, and innovators who share a passion for achieving client goals. Our engineering services help businesses thrive at the intersection of technology and people. From the latest AI implementations to legacy platform migrations and everything in between, our services span the enterprise technology spectrum. Our world-class experience transformation playbook elevates digital success and increases ROI with a relentless focus on the human experience. Our customer base includes Fortune 500 companies around the globe. We've got the skills and insights, and we're also fun to work with. Our global team spans a diverse cultural spectrum, with a wide range of interests, enabling us to bring personality and depth to every engagement.

Role Overview
We are seeking a Data Engineer - ETL to design, build, and optimize scalable data pipelines using Azure cloud technologies. This role focuses on developing robust data ingestion and transformation pipelines, implementing Delta Lake-based data architectures, and enabling high-quality curated datasets for downstream analytics and reporting.
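To give candidates a feel for the work, here is a toy sketch of the kind of raw-to-curated cleaning step this role builds. It uses plain Python dicts as a stand-in (the actual pipelines would use PySpark on Azure Databricks with Delta Lake tables; all field names here are hypothetical):

```python
# Illustrative stand-in for a raw -> curated cleaning step.
# Plain Python only; real pipelines would run as PySpark jobs
# on Azure Databricks against Delta Lake tables.

def to_silver(bronze_rows):
    """Clean raw (Bronze) rows into a curated (Silver) shape:
    drop rows missing a key, normalise types, deduplicate by id."""
    silver = {}
    for row in bronze_rows:
        if row.get("id") is None:
            continue                      # reject unkeyed records
        cleaned = {
            "id": int(row["id"]),
            "amount": float(row.get("amount", 0.0)),
            "country": (row.get("country") or "unknown").strip().lower(),
        }
        silver[cleaned["id"]] = cleaned   # last write wins -> dedupe by id
    return list(silver.values())

bronze = [
    {"id": "1", "amount": "10.5", "country": " IN "},
    {"id": None, "amount": "3.0"},                   # dropped: no key
    {"id": "1", "amount": "11.0", "country": "IN"},  # duplicate id
]
print(to_silver(bronze))  # [{'id': 1, 'amount': 11.0, 'country': 'in'}]
```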
The ideal candidate will have strong expertise in PySpark, Azure Databricks, and Azure Data Factory, along with a deep understanding of data performance optimization and engineering best practices.

Key Responsibilities
• Pipeline Development
- Build and maintain scalable data pipelines using Azure Databricks and Azure Data Factory.
- Implement ingestion and transformation logic across Bronze (raw) and Silver (cleaned) layers.
- Support batch and incremental data processing patterns.
• Curated Layer & Data Processing
- Implement hydration, merge, and upsert logic using Delta Lake.
- Ensure curated datasets meet business requirements and data quality standards.
- Handle late-arriving data and incremental updates efficiently.
• Performance & Storage Optimization
- Optimize Delta Lake tables for performance and cost efficiency.
- Select and tune appropriate storage formats (Parquet, Delta).
- Apply partitioning, compaction, and file sizing strategies.
- Tune Spark jobs for large-scale distributed data processing.
• Downstream Collaboration & Data Enablement
- Collaborate with DWH and BI teams to support downstream data consumption.
- Provide optimized datasets for Synapse and reporting workloads.
- Support data validation, reconciliation, and consistency across Gold layer outputs.
• Engineering Best Practices
- Implement CI/CD practices for data pipelines and workflows.
- Follow coding standards, documentation, and version control practices.
- Support production troubleshooting, monitoring, and performance tuning.

Required Skills & Qualifications
· Bachelor's degree in Computer Science, Engineering, or a related field.
· 6-8 years of experience in data engineering.
· Strong expertise in:
- PySpark and distributed data processing
- Azure Databricks (hands-on development and optimization)
- Azure Data Factory for pipeline orchestration
· Deep knowledge of Delta Lake (merge, upsert, optimization techniques).
· Strong SQL skills for data transformation and validation.
· Experience handling large datasets in distributed environments.
· Strong understanding of storage optimization (Parquet, Delta).

Tools & Practices
· Experience with Git and version control systems.
· Familiarity with CI/CD pipelines for data workflows.
· Understanding of data quality checks and validation techniques.
· Experience working in Agile/Scrum delivery models.

Nice-to-Have Skills
· Experience supporting Synapse Dedicated SQL Pool.
· Exposure to streaming or near real-time data pipelines.
· Familiarity with data governance or metadata management tools.

Soft Skills & Collaboration
· Strong analytical and problem-solving skills.
· Ability to work independently on complex data pipelines.
· Good communication and collaboration skills.
· Proactive and ownership-driven mindset.

Desired Attributes
· Strong attention to data quality and performance.
· Continuous learning mindset for evolving cloud/data technologies.
· Ability to work in fast-paced, data-intensive environments.

Interview Process
2 to 3 rounds of discussion.
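As a rough illustration of the merge/upsert and late-arriving-data responsibilities listed above: the sketch below models the semantics of a Delta Lake MERGE in plain Python. The real work would use Delta Lake's `MERGE INTO` on Databricks; the record shape and the timestamp-based staleness rule here are hypothetical.

```python
# Toy model of Delta-style MERGE (upsert) semantics with
# late-arriving data: an incoming row only applies if it is at
# least as new as what the target table already holds for its key.

def merge_upsert(target, updates):
    """target: {key: {"value": ..., "event_ts": ...}} (current table state)
    updates: list of incoming rows, each {"key", "value", "event_ts"}.
    Returns a new dict; late rows already superseded are skipped."""
    merged = dict(target)
    for row in updates:
        current = merged.get(row["key"])
        if current is None or row["event_ts"] >= current["event_ts"]:
            merged[row["key"]] = {"value": row["value"],
                                  "event_ts": row["event_ts"]}
        # else: late-arriving but stale -> ignore
    return merged

target = {"a": {"value": 1, "event_ts": 100}}
updates = [
    {"key": "a", "value": 2, "event_ts": 90},   # late and stale: ignored
    {"key": "b", "value": 5, "event_ts": 120},  # new key: inserted
]
print(merge_upsert(target, updates))
# {'a': {'value': 1, 'event_ts': 100}, 'b': {'value': 5, 'event_ts': 120}}
```

The key design point mirrored here is idempotence: replaying the same update batch leaves the table unchanged, which is what makes incremental and retry-heavy pipelines safe.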