Capstones
Template
- Introduction
- Background
- Business Objectives
- Module 1: Data Modeling with PostgreSQL/Snowflake/Redshift
  - Business Requirements
  - Data Analysis
  - Data Model
  - DDL
  - DML
  - Views
  - Testing
- Module 2: Data Modeling with Cassandra/DynamoDB
  - Business Requirements
  - Data Analysis
  - Data Pattern Analysis
  - Data Model
  - DDL
  - DML
  - Testing
- Module 3: Serverless Data Pipeline for Simple Batch Transformations
  - Business Requirements
  - Process Flow Design
  - Data Ingestion into Raw Layer
  - Data Refinement with AWS Lambda
  - Data Storage into Refined Layer
  - Data Joins and Aggregations with AWS Lambda
  - Data Loading into Consumption Layer
  - Glue Crawler Deployment and Data Cataloging
  - Testing with Amazon Athena
  - Automation with Glue Crawler Scheduling and Lambda Triggers
- Module 4: Serverless Data Pipeline for Complex Batch Transformations
  - Business Requirements
  - Process Flow Design
  - Data Ingestion into Raw Layer
  - Data Refinement with Glue ETL/PySpark
  - Data Storage into Refined Layer
  - Data Joins and Aggregations with Glue ETL/PySpark
  - Data Loading into Consumption Layer
  - DDL and Data Cataloging
  - Testing with Amazon Athena
  - Process Automation with Glue ETL Job Scheduling
  - IaC Automation with CloudFormation Template