
Job Title : Data Engineer


Company : Curately AI, Inc


Location : Bellary, Karnataka


Created : 2025-12-19


Job Type : Full Time


Job Description

Responsibilities:

Design & Architecture of Scalable Data Platforms
- Design, develop, and maintain large-scale data processing architectures (a Lakehouse platform supporting business needs).
- Architect multi-layer data models, including Bronze (raw), Silver (cleansed), and Gold (curated) layers, for HRTech domains.
- Strong experience with at least one of Snowflake, Databricks, or Redshift.
- Leverage Delta Lake, Unity Catalog, and advanced Databricks features for governed data sharing, versioning, and reproducibility.
- MUST have experience with AWS technologies such as AWS Glue, Athena, Redshift, etc.

Data Pipeline Development & Collaboration
- Collaborate with data engineers and data scientists to develop end-to-end pipelines using Python, PySpark, and SQL.

Performance, Scalability, and Reliability
- Optimize Spark jobs for performance, cost efficiency, and scalability through appropriate cluster sizing, caching, and query optimization techniques.
- Implement monitoring and alerting using observability platforms and cloud-native tools.
- Design secure architectures using Unity Catalog, role-based access control (RBAC), encryption, token-based access, and data lineage tools to meet compliance policies.
- Establish data governance practices, including a Data Fitness Index, quality scores, SLA monitoring, and metadata cataloging.
- Write PySpark, SQL, and Python code for data engineering and ML tasks.
- Perform data profiling and schema inference.
- Stay abreast of emerging trends in Lakehouse architectures, Generative AI, and cloud-native tooling.

Requirements:
- 8-12 years of hands-on experience in data engineering, with at least 5 years on Python and Apache Spark.
- MUST have experience migrating from RDBMS-like systems to data lakes.
- Expertise in building high-throughput, low-latency ETL/ELT pipelines on AWS using Python, PySpark, SQL, Athena, AWS Glue, Redshift, etc.
- Excellent hands-on experience with workflow automation tools such as Airflow, Prefect, etc. (see the orchestration sketch below).
- Familiarity with building dynamic ingestion frameworks for structured and unstructured data sources, including APIs, flat files, RDBMS, and cloud storage.
- Experience designing Lakehouse architectures with Bronze, Silver, and Gold layering (a minimal sketch follows this list).
- Strong understanding of data modelling concepts, star/snowflake schemas, dimensional modelling, and modern cloud-based data warehousing.
- Experience designing data marts on cloud data warehouses and integrating them with BI tools (Power BI, Tableau, etc.).
- Experience with CI/CD pipelines using tools such as AWS CodeCommit, Azure DevOps, or GitHub Actions.
- Knowledge of infrastructure-as-code (Terraform, ARM templates) for provisioning platform resources.
- In-depth experience with AWS Cloud services such as Glue, S3, Redshift, etc.
- Strong understanding of data privacy, access controls, and governance best practices.
- Experience working with RBAC, tokenization, and data classification frameworks.
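By way of illustration, a minimal PySpark sketch of the Bronze/Silver/Gold (medallion) flow described above. All paths, table names, and columns here are hypothetical assumptions, not details from this posting, and the sketch assumes a Spark session with Delta Lake configured:

    # Hypothetical Bronze -> Silver -> Gold flow; names are illustrative only.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

    # Bronze: land raw records as-is from cloud storage.
    bronze = spark.read.json("s3://example-bucket/raw/candidates/")
    bronze.write.format("delta").mode("append").saveAsTable("bronze.candidates")

    # Silver: cleanse, type, and deduplicate into a queryable layer.
    silver = (
        spark.table("bronze.candidates")
        .dropDuplicates(["candidate_id"])
        .withColumn("applied_at", F.to_timestamp("applied_at"))
        .filter(F.col("candidate_id").isNotNull())
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.candidates")

    # Gold: curated aggregate for BI consumption (e.g., applications per role).
    gold = (
        spark.table("silver.candidates")
        .groupBy("job_role")
        .agg(F.count("*").alias("application_count"))
    )
    gold.write.format("delta").mode("overwrite").saveAsTable("gold.applications_by_role")

The point of the layering is that each table is rebuilt from the one beneath it, so cleansing rules and curated aggregates stay reproducible and auditable.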
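Similarly, a minimal sketch of the kind of workflow automation named in the requirements: an Airflow DAG triggering an AWS Glue job. The DAG id, Glue job name, schedule, and region are assumptions for illustration:

    # Hypothetical Airflow DAG orchestrating a nightly Glue ETL job.
    from datetime import datetime
    from airflow import DAG
    from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

    with DAG(
        dag_id="nightly_candidate_etl",   # assumed DAG name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        run_glue_job = GlueJobOperator(
            task_id="run_candidate_etl",
            job_name="candidate-etl",     # assumed Glue job name
            region_name="us-east-1",
        )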