
Culinary

Made by Team Culinary

Team Get it DONE

This group project was carried out during the final phase of the 13-week Northcoders Data Engineering Bootcamp. The aim was to design and build a reliable data platform that extracts data from a PostgreSQL operational database (Totesys), transforms it into a denormalised star schema, and loads it into a data warehouse to support analysis and reporting. The system is designed to run on a schedule, monitor and log all activity, and maintain a full history of changes to fact data. As a final step, we visualised the data to demonstrate its value for business insights. This project showcases our ability to collaboratively deliver an end-to-end data engineering solution using best practices in automation, testing, and data architecture.
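To illustrate what the transform step means by denormalising into a star schema, here is a minimal sketch. It is not the project's actual code: the function name `build_dim_staff`, the table and column names, and the sample rows are all hypothetical, and it uses plain Python dictionaries so that it runs without dependencies (the same join could presumably be expressed with Pandas).

```python
def build_dim_staff(staff_rows, department_rows):
    """Fold department attributes into each staff row (hypothetical schema)."""
    # Index departments by their key so each lookup is a dictionary access.
    departments = {d["department_id"]: d for d in department_rows}

    dim_rows = []
    for staff in staff_rows:
        # Fall back to an empty dict if a staff row has no matching department.
        dept = departments.get(staff["department_id"], {})
        dim_rows.append(
            {
                "staff_id": staff["staff_id"],
                "first_name": staff["first_name"],
                "last_name": staff["last_name"],
                # The foreign key is replaced by the attributes it pointed to.
                "department_name": dept.get("department_name"),
                "location": dept.get("location"),
            }
        )
    return dim_rows


# Invented sample data standing in for rows extracted from the source database.
staff = [
    {"staff_id": 1, "first_name": "Ada", "last_name": "Lovelace", "department_id": 10},
    {"staff_id": 2, "first_name": "Grace", "last_name": "Hopper", "department_id": 20},
]
department = [
    {"department_id": 10, "department_name": "Sales", "location": "Leeds"},
    {"department_id": 20, "department_name": "Finance", "location": "Manchester"},
]

dim_staff = build_dim_staff(staff, department)
```

Each resulting row carries the department name and location directly, so an analyst querying the warehouse would not need to join back to a separate department table.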

The Team

Tech Stack


We used Python, Terraform, Lambda, S3, IAM, PostgreSQL, Pandas, pg8000, Pytest, CloudWatch, SNS, Boto3, Moto, SQLAlchemy, PyArrow, a Makefile and GitHub Actions. We chose these as the most efficient tools for the given task.

Challenges Faced

We very much enjoyed the overall process of building this repo and the way we supported and learned from each other. We all benefited from pair programming, but the odd number of people in the group made it a challenge to assign tickets evenly. In the first week, we set ourselves the very ambitious target of automating warehouse updates for all three star schemas. We realised that this was not achievable within the timeframe of the project, so we decided to focus on delivering the MVP. Another crucial insight from the project is that planning and communication are key: for example, agreeing on function interfaces and planning the three stages in more detail before starting to work separately on different parts of the repo.