Capstones

Template

  1. Introduction
    1. Background
    2. Business Objectives
  2. Module 1: Data Modeling with PostgreSQL/Snowflake/Redshift
    1. Business Requirements
    2. Data Analysis
    3. Data Model
    4. DDL
    5. DML
    6. Views
    7. Testing
  3. Module 2: Data Modeling with Cassandra/DynamoDB
    1. Business Requirements
    2. Data Analysis
    3. Data Pattern Analysis
    4. Data Model
    5. DDL
    6. DML
    7. Testing
  4. Module 3: Serverless Data Pipeline for simple batch transformations
    1. Business Requirements
    2. Process Flow Design
    3. Data Ingestion into Raw Layer
    4. Data Refinement with AWS Lambda
    5. Data Storage into Refined Layer
    6. Data Joins and Aggregations with AWS Lambda
    7. Data Loading into Consumption Layer
    8. Glue Crawler Deployment and Data Cataloging
    9. Testing with AWS Athena
    10. Automation with Glue Crawler scheduling and Lambda triggers
  5. Module 4: Serverless Data Pipeline for complex batch transformations
    1. Business Requirements
    2. Process Flow Design
    3. Data Ingestion into Raw Layer
    4. Data Refinement with Glue ETL/PySpark
    5. Data Storage into Refined Layer
    6. Data Joins and Aggregations with Glue ETL/PySpark
    7. Data Loading into Consumption Layer
    8. DDL and Data Cataloging
    9. Testing with AWS Athena
    10. Process Automation with Glue ETL job scheduling
    11. IaC Automation with CloudFormation template