Workspace Setup
Install Anaconda and Jupyter
Watch and follow this video:
Install Visual Studio Code (VSCode)
Watch and follow this video:
- Windows
- macOS
Create GitHub Account
Watch and follow this video:
Setup Git Credentials
Watch and follow this video:
SSH setup commands:
ssh-keygen -t rsa -b 4096 -C "<email>"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub
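Before running the commands above, it can help to check whether a key pair already exists at the default path. The following is a minimal sketch, not part of the original steps; `key_status` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper (an assumption, not from the original guide): report
# whether both halves of an SSH key pair exist at the given path.
key_status() {
  key_path="${1:-$HOME/.ssh/id_rsa}"
  if [ -f "$key_path" ] && [ -f "$key_path.pub" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Check the default key path used by the commands above.
key_status
```

If the result is "present", you may prefer to reuse the existing key instead of generating a new one. Once the public key has been added to your GitHub account, `ssh -T git@github.com` is one way to confirm that the connection works.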
Create AWS Account
Watch and follow this video:
Setup AWS Credentials
Watch and follow this video. You can also check out the documentation here.
Install DBeaver
Create Databricks Account
Watch and follow this video.
Create Snowflake Account
Watch and follow this video:
Google Colaboratory
Watch and follow this video:
Install Postgres
Watch and follow this video:
- Windows
- macOS
Setup RDS Postgres
Watch and follow this video:
Install Cassandra
wget https://dlcdn.apache.org/cassandra/4.0.6/apache-cassandra-4.0.6-bin.tar.gz
tar -xvf apache-cassandra-4.0.6-bin.tar.gz
mv apache-cassandra-4.0.6 ~/cassandra
Add export PATH="$HOME/cassandra/bin:$PATH" to your bash profile. Validate by running cassandra --version; it should show 4.0.6.
Once you start the server with the command below, Cassandra runs as a new single-node cluster called “Test Cluster”, ready to interact with other nodes and clients.
- Run server:
cassandra -f
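The PATH step above can be scripted so that running it twice does not add the same line twice. This is a rough sketch under stated assumptions: `append_once` is a hypothetical helper, and `~/.bash_profile` in the example call is assumed to be the profile your shell reads.

```shell
# Hypothetical helper (an assumption, not from the original guide): append a
# line to a profile file only if that exact line is not already present.
append_once() {
  line="$1"
  profile="$2"
  touch "$profile"
  # -x matches whole lines, -F treats the line as a fixed string.
  if ! grep -qxF -- "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Example call (assumes ~/.bash_profile is your profile file):
# append_once 'export PATH="$HOME/cassandra/bin:$PATH"' "$HOME/.bash_profile"
```

The example call is left commented out so that pasting the block does not modify your profile until you choose to run it.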
Install DataStax Bulk Loader
wget https://downloads.datastax.com/dsbulk/dsbulk-1.10.0.tar.gz
tar -xvf dsbulk-1.10.0.tar.gz
mv dsbulk-1.10.0 ~/dsbulk
rm dsbulk-1.10.0.tar.gz
Add export PATH="$HOME/dsbulk/bin:$PATH" to your bash profile.
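After updating the profile and opening a new terminal, a quick check can confirm that both tools resolve on the PATH. A minimal sketch, where `tool_status` is a hypothetical helper name:

```shell
# Hypothetical helper (an assumption, not from the original guide): print
# "found" or "missing" depending on whether a command resolves on the PATH.
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Report on the two tools installed in the sections above.
for tool in cassandra dsbulk; do
  echo "$tool: $(tool_status "$tool")"
done
```

A "missing" result usually means the export line has not been loaded into the current shell yet.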
Install JVM
- Go to https://www.oracle.com/java/technologies/downloads/#java8-mac or https://download.oracle.com/otn/java/jdk/8u341-b10/424b9da4b48848379167015dcc250d8d/jdk-8u341-macosx-x64.dmg (for mac users)
- Install the downloaded package
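- Run these commands to set the Java path on macOS. The first prints the JDK location and the second exports it as JAVA_HOME for the current shell. To keep the setting across sessions, add the export line to your bash profile and reload the profile with source:
/usr/libexec/java_home
export JAVA_HOME=$(/usr/libexec/java_home)
source <path to your bash profile>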
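To confirm that the Java path was set as intended, one option is to check that JAVA_HOME points at a directory containing an executable `bin/java`. The sketch below assumes that layout; `is_java_home` is a hypothetical helper name.

```shell
# Hypothetical helper (an assumption, not from the original guide): succeed
# only if the given directory contains an executable bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Check the JAVA_HOME value of the current shell, if any.
if is_java_home "${JAVA_HOME:-}"; then
  echo "JAVA_HOME looks valid: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or does not contain bin/java"
fi
```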
Install Airflow
First, install Airflow with:
pip install apache-airflow==2.4.1
Create a new folder and run the following commands inside that folder to start Airflow:
export AIRFLOW_HOME=$(pwd)
export AIRFLOW__CORE__LOAD_EXAMPLES=False
export AIRFLOW__CORE__ENABLE_XCOM_PICKLING=True
airflow db init
airflow users create \
--username admin \
--password admin \
--firstname User \
--lastname Name \
--role Admin \
--email email@example.com
airflow standalone
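The export lines above only last for the current terminal session, so a new terminal will not know where AIRFLOW_HOME is. One possible way to handle this is to save the three exports in a small file and source it later. This is a sketch under assumptions: `write_airflow_env` and the file name `airflow_env.sh` are illustrative choices, not part of Airflow.

```shell
# Hypothetical helper (an assumption, not from the original guide): write the
# three exports used above into airflow_env.sh inside the given folder.
write_airflow_env() {
  dir="$1"
  cat > "$dir/airflow_env.sh" <<EOF
export AIRFLOW_HOME="$dir"
export AIRFLOW__CORE__LOAD_EXAMPLES=False
export AIRFLOW__CORE__ENABLE_XCOM_PICKLING=True
EOF
}

# Example call from inside the Airflow folder:
# write_airflow_env "$(pwd)"
# Later, in a new terminal:
# . /path/to/that/folder/airflow_env.sh
```

The helper records the folder as an absolute value at the time it runs, so the saved file keeps working when sourced from a different directory.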