Made by Lambda Legends
Work of Legends
This is a data engineering project which implements an end-to-end ETL (extract, transform, load) pipeline. It extracts data from a database, transforms it to a star schema and finally loads it into an AWS warehouse.
Current features:
Data Extraction: Uses a Python application to automatically ingest data from the totesys operational database into an S3 bucket in AWS.
Data Transformation: Uses a Python application to process raw data to conform to a star schema for the data warehouse. The transformed data is stored in parquet format in a second S3 bucket.
Data Loading: Loads transformed data into an AWS-hosted data warehouse, populating dimensions and fact tables.
Automation: End-to-end pipeline triggered by completion of a data job.
Monitoring and Alerts: Logs to CloudWatch and sends SNS email alerts in case of failures.
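To illustrate what conforming raw data to a star schema can look like, here is a minimal sketch using only the Python standard library. It is not the team's actual transform code: the column names (sales_order_id, created_at, units_sold) and the function to_star_schema are hypothetical, chosen only to show one fact table pointing at one date dimension.

```python
from datetime import datetime


def to_star_schema(raw_orders):
    """Split raw order rows into a fact table and a date dimension.

    All column names here are hypothetical, for illustration only.
    """
    dim_date = {}  # keyed by ISO date so each date is stored once
    fact_rows = []
    for order in raw_orders:
        created = datetime.fromisoformat(order["created_at"])
        date_key = created.date().isoformat()
        # Add the date to the dimension only the first time it is seen.
        dim_date.setdefault(date_key, {
            "date_id": date_key,
            "year": created.year,
            "month": created.month,
            "day": created.day,
        })
        # The fact row keeps the measure plus a key into the dimension.
        fact_rows.append({
            "sales_order_id": order["sales_order_id"],
            "created_date": date_key,
            "units_sold": order["units_sold"],
        })
    return fact_rows, list(dim_date.values())


raw = [
    {"sales_order_id": 1, "created_at": "2024-01-05T10:00:00", "units_sold": 3},
    {"sales_order_id": 2, "created_at": "2024-01-05T15:30:00", "units_sold": 7},
]
facts, dates = to_star_schema(raw)
```

Both sample orders share a date, so the sketch produces two fact rows and a single date dimension row. In the real pipeline the same idea would presumably be expressed with pandas dataframes before writing parquet.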
The Team
Pratik Shrestha
Recently took Independent learning gap, currently a data…
engineer based in London, UK
Tech Stack

We used: pg8000, pandas, boto3, AWS Wrangler, pytest, moto, Terraform, Git, GitHub Actions.
pg8000 for connecting to and querying the PostgreSQL database.
Pandas for manipulating and transforming data into tables.
Boto3 for interacting with AWS services.
AWS Wrangler for simplifying the process of writing transformed dataframes back to S3 in parquet format during the Transform phase.
Pytest for testing.
Moto for mocking AWS services during testing.
Terraform for defining and provisioning the AWS infrastructure.
Git for version control, tracking changes in our project code.
GitHub Actions for automated testing and deployment workflows, ensuring code quality and streamlining the CI/CD pipeline.
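As a rough sketch of how query results from pg8000 might be turned into records ready for pandas, the helper below assumes a connection that behaves like pg8000.native.Connection, where run() returns a list of rows and the columns attribute describes the last result. The function fetch_records, the FakeConnection stand-in and the staff table are hypothetical and exist only so the example runs without a database.

```python
def fetch_records(conn, query):
    """Run a query and return each row as a dict keyed by column name.

    `conn` is assumed to behave like pg8000.native.Connection: `run()`
    returns a list of rows and `columns` holds metadata for the last result.
    """
    rows = conn.run(query)
    names = [col["name"] for col in conn.columns]
    return [dict(zip(names, row)) for row in rows]


class FakeConnection:
    """Hypothetical stand-in so the sketch runs without a real database."""

    columns = [{"name": "staff_id"}, {"name": "first_name"}]

    def run(self, query):
        return [[1, "Ada"], [2, "Grace"]]


records = fetch_records(
    FakeConnection(), "SELECT staff_id, first_name FROM staff;"
)
```

A list of dicts like this can be passed straight to pandas.DataFrame, which is one plausible hand-off point between the extraction and transformation steps.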
Challenges Faced
We faced challenges during the extraction of data, as we wanted to avoid saving the data on our local machines. We also faced challenges with Terraform changes not being automatically reflected in our Lambda functions.
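One common way to avoid writing extracted data to local disk is to serialise it into an in-memory buffer and upload the bytes directly. The sketch below shows that idea and is not necessarily how the team solved it. It assumes CSV as the intermediate format and an S3 client exposing put_object(Bucket, Key, Body), as the boto3 S3 client does. The function upload_rows_as_csv, the FakeS3 stand-in, and the bucket and key names are hypothetical.

```python
import csv
import io


def upload_rows_as_csv(s3_client, bucket, key, rows):
    """Serialise rows to CSV in memory and upload without touching disk."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    # The buffer contents go straight to S3; no temporary file is created.
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=buffer.getvalue().encode("utf-8")
    )


class FakeS3:
    """Hypothetical stand-in for a boto3 S3 client, used only for this sketch."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body


fake = FakeS3()
upload_rows_as_csv(
    fake,
    "ingestion-bucket",
    "staff/2024-01-05.csv",
    [{"staff_id": 1, "first_name": "Ada"}],
)
```

Passing the client in as a parameter also makes the function easy to test, whether with a simple fake as here or with a mocked S3 service from moto.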