Data & lakehouse
Analytics and AI are only as good as the platform feeding them. I build the cloud-native data backbone — raw to bronze to silver to gold — with the contracts and quality controls that keep it trustworthy as it grows.
What I do
- Medallion lakehouses. Raw → bronze → silver → gold on Snowflake, Databricks/Delta and AWS (EMR, Glue, S3), with the modelling to make each tier earn its place.
- Contracts at the source seam. Data contracts where operational systems meet the lakehouse — for example SAP Finance ledgers landing in Snowflake — so upstream changes are caught, not absorbed silently.
- Reliable ingestion. Change-data-capture and delta pipelines tuned for near-real-time sync, with lineage and quality monitoring rather than hope.
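To make the idea of a contract at the source seam concrete, here is a minimal sketch in plain Python. It is an illustration only, not the implementation used on any project: the `ColumnSpec` type, the `check_contract` helper and the column names in `GL_CONTRACT` are all hypothetical, and a real contract would normally live in a schema registry or a data-quality tool rather than in hand-written code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """One column in a (hypothetical) data contract."""
    name: str
    dtype: type
    nullable: bool = False


# Illustrative contract for a general-ledger line-item feed.
# The column names are assumptions, not an actual SAP extract layout.
GL_CONTRACT = (
    ColumnSpec("company_code", str),
    ColumnSpec("document_number", str),
    ColumnSpec("posting_date", str),
    ColumnSpec("amount", float),
    ColumnSpec("cost_center", str, nullable=True),
)


def check_contract(rows, contract):
    """Return a list of violations; an empty list means the batch conforms."""
    violations = []
    expected = {col.name for col in contract}
    for i, row in enumerate(rows):
        present = set(row)
        missing = expected - present
        extra = present - expected
        if missing:
            violations.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            # An upstream system added a field the contract does not know about.
            violations.append(f"row {i}: unexpected columns {sorted(extra)}")
        for col in contract:
            if col.name not in row:
                continue  # already reported as missing
            value = row[col.name]
            if value is None:
                if not col.nullable:
                    violations.append(f"row {i}: {col.name} is null")
            elif not isinstance(value, col.dtype):
                violations.append(
                    f"row {i}: {col.name} expected {col.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations
```

In a pipeline, a non-empty result would typically fail the load or route the batch to quarantine before it reaches bronze, which is what "caught, not absorbed silently" amounts to in practice.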
Evidenced by
- SAP S/4HANA Finance → Snowflake: GL/AR/AP/CO/AA across 30+ company codes, a multi-terabyte backfill plus 10–30 GB of daily delta, with contracts at the SAP↔lakehouse seam.
- Confluent Kafka data-product platform: productised streams with Hive LLAP and Spark 3 query acceleration.
Background: Databricks Certified Data Engineer Professional; SnowPro Core (Snowflake); Spark/PySpark, Delta Lake, PostgreSQL, Oracle.