Job Title:
Data Modeller (Banking Domain, BIAN Architecture must)
Company: Client of Prasha Consultancy Services Private Limited
Location: Thane, Maharashtra
Created: 2026-04-03
Job Type: Full Time
Job Description:
A US/Canada-based IT MNC is hiring a Data Modeller for one of its banking clients. Experience with the Banking domain and BIAN Architecture is a must.

Position Name: Data Modeller
Location: Remote
Time Overlap: Till 11 PM IST

Required skills: Experience with the Banking domain is a big plus.

1. Lakehouse Data Modeling on Amazon S3
   o Design Medallion architecture (Bronze/Silver/Gold)
   o Model data for scalability, partitioning, and domain-based access
   o Handle schema evolution and time-travel use cases

2. AWS Glue + PySpark (ETL Modeling)
   o Translate logical/physical models into PySpark transformations
   o Optimize joins, partition pruning, and pushdown predicates
   o Manage schemas via the Glue Data Catalog

3. Schema Design & Metadata Management
   o Define canonical schemas and data contracts
   o Maintain centralized metadata using the Glue Data Catalog
   o Manage versioning and backward compatibility of schemas

4. Modern Table Formats (Apache Iceberg / Delta)
   o Implement ACID-compliant tables on S3
   o Design for incremental loads, CDC, and snapshot-based querying
   o Optimize compaction and partition strategies

5. Streaming & CDC Data Modeling (Kafka / MSK)
   o Design event schemas aligned with domain models
   o Model change data capture flows into the lakehouse
   o Ensure consistency between streaming and batch layers

6. Advanced Data Modeling Techniques
   o Data Vault 2.0 (Hubs, Links, Satellites)
   o Dimensional modeling (Star/Snowflake)
   o SCD (Type 1/2/3), surrogate keys, historization

7. Data Governance & Quality Engineering
   o Data lineage, cataloging, metadata-driven pipelines
   o Data quality frameworks (Great Expectations, Deequ)
   o RBAC, audit, compliance

8. Lakehouse & Medallion Architecture
   o Bronze (raw CDC), Silver (conformed), Gold (business-ready)
   o Schema evolution, late-arriving data, deduplication

9. Orchestration & Pipeline Engineering
   o Apache Airflow (DAG design, dependency management, SLA handling)
   o Hybrid orchestration (event- and schedule-driven)
   o CI/CD for data pipelines

10. Canonical & Contract-First Data Design
   o Canonical schemas, data contracts, schema versioning
   o API/event schema alignment (Avro/JSON/Protobuf)

11. Domain-Centric Data Modeling
   o Experience with BIAN-aligned service domains is nice to have
   o Domain-driven design with clear data ownership and boundaries