Made by Dimensional Transformers

From Data to Discovery: Shaping the Future with Engineering

Our project automates the ETL workflow for Terrific Totes, leveraging Python, AWS, database modeling, and CI/CD practices. It extracts data from a PostgreSQL database on AWS RDS, transforms it with Python Lambda functions, and loads it into a PostgreSQL data warehouse. With secure credential management, CloudWatch monitoring, and Terraform and GitHub Actions automation, this scalable solution ensures quality through rigorous testing and PEP8 compliance, delivering actionable insights via a Tableau dashboard.

The Team

David Sheffield

Dorota Sawicki

Hamza Nazar

Laura Pugsley

Rohail Zaheer

William Robb

Tech Stack

We used: Python, AWS, Postgres, Terraform, GitHub Actions, Parquet, Tableau, Pytest.

Terraform: Used for managing and deploying infrastructure.

Python: Used for programmatically interacting with and manipulating data.

SQL: Used because both the source database and the data warehouse are relational (PostgreSQL).

Parquet: Chosen for its efficient data storage capabilities.

GitHub Actions: Used to automate workflows and minimize manual intervention throughout the project.

AWS: Provided the hosting environment for all cloud infrastructure.

Pytest: Used for test-driven development.

Tableau: Used for efficient data visualisation.

Challenges Faced

Statelessness of Lambda Functions: Lambda functions are stateless, so we needed a way to track the last updated date for data extraction. We solved this by storing the date in AWS Secrets Manager, which let us persist the value between invocations. On the first run, we reset the date so that all source data is processed.

Fact Table Load Order: The fact table was loading before the dimension tables, which caused reference errors. To fix this, we added a sleep after loading the dimension tables, so that they were populated before the fact table load began and data integrity was maintained.

Lambda Layer Dependency Issues: We faced challenges with Lambda layer compatibility for PyArrow and SQLAlchemy due to package size. For PyArrow, we used an AWS Managed Layer. For SQLAlchemy, we created a custom GitHub library to package and deploy the specific version as a Lambda layer, which resolved the compatibility issues.

GitHub: https://github.com/hamza8599/nc-final-project
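The first challenge above, persisting a "last updated" value between stateless invocations, can be sketched roughly as follows. This is a minimal illustration and not the team's actual code: the secret layout (a JSON object with a `last_updated` key), the row shape, the class names, and the `select_new_rows` helper are all assumptions made for the example. Only the `boto3` Secrets Manager calls `get_secret_value` and `put_secret_value` are real API calls, and they are isolated in one class so the rest can run locally with an in-memory stand-in.

```python
import json
from datetime import datetime, timezone
from typing import Protocol

# Hypothetical reset value: a date old enough that the first run takes everything.
RESET_VALUE = "1970-01-01T00:00:00+00:00"


class WatermarkStore(Protocol):
    """Anything that can persist one ISO-8601 timestamp string."""

    def read(self) -> str: ...
    def write(self, value: str) -> None: ...


class InMemoryStore:
    """Stand-in store for local runs and tests (state is lost on exit)."""

    def __init__(self, value: str = RESET_VALUE) -> None:
        self._value = value

    def read(self) -> str:
        return self._value

    def write(self, value: str) -> None:
        self._value = value


class SecretsManagerStore:
    """Keeps the timestamp in AWS Secrets Manager; the secret layout is assumed."""

    def __init__(self, secret_id: str) -> None:
        import boto3  # imported lazily so the sketch runs without AWS installed

        self._client = boto3.client("secretsmanager")
        self._secret_id = secret_id

    def read(self) -> str:
        response = self._client.get_secret_value(SecretId=self._secret_id)
        return json.loads(response["SecretString"])["last_updated"]

    def write(self, value: str) -> None:
        self._client.put_secret_value(
            SecretId=self._secret_id,
            SecretString=json.dumps({"last_updated": value}),
        )


def select_new_rows(rows: list[dict], store: WatermarkStore) -> list[dict]:
    """Return rows changed since the stored timestamp, then advance it."""
    since = datetime.fromisoformat(store.read())
    fresh = [row for row in rows if row["last_updated"] > since]
    if fresh:
        # Advance the stored value only when something new was seen.
        newest = max(row["last_updated"] for row in fresh)
        store.write(newest.isoformat())
    return fresh


if __name__ == "__main__":
    store = InMemoryStore()
    rows = [{"id": 1, "last_updated": datetime(2024, 1, 1, tzinfo=timezone.utc)}]
    print(len(select_new_rows(rows, store)))  # first run picks up every row
    print(len(select_new_rows(rows, store)))  # second run finds nothing new
```

In a deployed Lambda the filter would more likely be pushed into the extraction query itself (a `WHERE` clause on the source table's timestamp column) rather than applied in Python; the in-memory filter here only keeps the sketch self-contained.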
