Student Projects – Data Dynamo Squad’s Data Project

Made by Data Dynamo Squad

The more you know, the more you don’t know.

The project aimed to establish a robust data ingestion and transformation pipeline utilising various AWS services to ensure efficient, timely, and error-resilient processing of data. The pipeline consists of multiple components orchestrated to ingest data from a database into an S3 “landing zone,” transform it, and then load it into a data warehouse. Key components include AWS EventBridge for job scheduling, AWS Lambda for computing tasks, S3 buckets for storing ingested and processed data, and CloudWatch for logging and monitoring.
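The ingestion step described above can be sketched as a small pure function that an EventBridge-scheduled Lambda might call before uploading to the landing zone. This is a minimal illustration, not the project's actual code: the key naming scheme and function name are assumptions, and the real upload would use boto3's `put_object`.

```python
import json
from datetime import datetime, timezone

def build_landing_object(table_name, rows, now=None):
    """Build an S3 key and JSON body for one ingested table.

    Hypothetical sketch: the date-partitioned key layout is an
    assumption, not the project's real naming scheme. The returned
    pair would be uploaded to the landing-zone bucket with boto3.
    """
    now = now or datetime.now(timezone.utc)
    # Partition objects by table and ingestion timestamp so each
    # scheduled run lands in its own prefix.
    key = f"landing/{table_name}/{now:%Y/%m/%d/%H%M%S}.json"
    body = json.dumps(rows, default=str)
    return key, body
```

Partitioning keys by timestamp like this makes each scheduled ingestion run idempotent to retry and easy to trace in CloudWatch logs.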

The Team

Jovan Ellis

Aspiring Data Engineer with a background studying Mathematics. Eager to embark on a professional journey in data, leveraging a diverse skill set acquired during the bootcamp. Revels in solving tricky coding problems.

Andreea Mitel

Aspiring Data Engineer fresh from a bootcamp journey, eager to dive into the world of data. Enthusiastic traveller exploring the world and its wonders. Committed to mastering the intricacies of data engineering and constantly seeking improvement. Dedicated dog lover finding joy in the simple things.

Valerie Parfeliuk

As an automation QA professional with three years of experience, I transitioned to Data Engineering, driven by a passion for leveraging data to make informed decisions. My background in QA has honed my attention to detail, analytical thinking, and problem-solving skills, setting me apart in Data Engineering. I bring transferable skills, including effective communication and adaptability, making me an asset in collaborative and dynamic work environments. Eager to contribute my unique blend of experience and skills, I am dedicated to making a positive impact as a Data Engineer.

Nathan Stoneley

Data Engineer who has recently undergone a course with Northcoders. Before Northcoders, Nathan held a multitude of jobs, all with little progression, so completing the Northcoders course was the perfect route into a new career in the tech industry. The course has given Nathan sound practical experience of constructing a data pipeline in AWS, using technologies such as Python, Terraform and SQL.

Ben Ward

I’m seeking a career change to Data Engineering, building on 10+ years of Consultative Sales in the technology industry. I have been a champion for implementing new technology to drive operational efficiency, and always enjoyed working with Data Analysts who supported my role, contributing my own knowledge of the data sets provided. It was an obvious career path, as I’ve always been happiest when solving problems, whether that’s thousands of rows in a spreadsheet or a puzzle at home.

Sumaya Abdirahman

A data engineer who has recently completed an intensive training programme with Northcoders, now embarking on her career in the data engineering world. Before Northcoders, she studied A-Levels and has since discovered a passion for programming. She particularly enjoys problem-solving and gained hands-on experience building an ETL (Extract, Transform, Load) data pipeline during her training. She is excited to apply her skills and knowledge to real-world data engineering projects and to contribute to the development of innovative data solutions.

Tech Stack

Leveraging Terraform, GitHub Actions, Python, and AWS streamlined our project workflow. Terraform simplifies infrastructure management, enabling us to deploy and maintain resources effortlessly. GitHub Actions automates tasks, ensuring seamless integration and continuous deployment. Python, with its versatility, empowers us to handle data manipulation. AWS S3 allowed us to store our data, which triggered our functions in AWS Lambda.
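The S3-triggers-Lambda wiring can be illustrated with a small helper that a transform Lambda handler might use to locate newly landed files. This is a sketch, not the project's code; the event shape, however, follows AWS's documented S3 event notification format.

```python
def parse_s3_event(event):
    """Extract (bucket, key) pairs from an S3 'ObjectCreated' event.

    Illustrative only: shows how a Lambda triggered by an S3 upload
    would find the object(s) to transform. The Records/s3/bucket/object
    structure is AWS's standard S3 notification payload.
    """
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
    ]
```

A handler would then download each `(bucket, key)` pair with boto3, transform it, and write the result to the processed-data bucket.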

Challenges Faced

We faced runtime issues in the transform Lambda, solved by doing the data manipulation with pandas. We also hit a problem with our Lambda layers being too large, solved by switching to an AWS-provided built-in layer. Another challenge was running SQL tests in the workflow against a locally created database, solved by importing additional pre-built GitHub Actions.
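The pandas fix for the transform Lambda's runtime issue can be sketched as below. The column names and the `transform_orders` function are hypothetical, not the project's actual schema; the point is that a vectorised pandas operation replaces a slow row-by-row Python loop.

```python
import pandas as pd

def transform_orders(raw_rows):
    """Reshape raw order rows into a warehouse-friendly form.

    Hypothetical sketch: column names are illustrative. The vectorised
    multiplication below operates on whole columns at once, which is
    the kind of change that resolves per-row-loop runtime problems
    inside a time-limited Lambda.
    """
    df = pd.DataFrame(raw_rows)
    # Vectorised: one columnar operation instead of a Python for-loop.
    df["total"] = df["unit_price"] * df["quantity"]
    return df[["order_id", "total"]]
```

Loading pandas via a layer is where the size limit bites: a custom layer bundling pandas can exceed Lambda's unzipped limit, which is why an AWS-managed layer with pandas pre-installed is a common workaround.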