Skip to main content

SQL Assignment 3

Setup the environment

!mkdir -p ~/.aws
!pip install -qq psycopg2-binary awscli boto3 s3fs
%%writefile ~/.aws/credentials
[default]
aws_access_key_id=<enter access key>
aws_secret_access_key=<enter secret access key>
region=us-east-1
output=json
import boto3
import json
import pandas as pd

%reload_ext sql
def get_secret(secret_name):
region_name = "us-east-1"
session = boto3.session.Session()
client = session.client(
service_name='secretsmanager',
region_name=region_name)
get_secret_value_response = client.get_secret_value(SecretId=secret_name)
get_secret_value_response = json.loads(get_secret_value_response['SecretString'])
return get_secret_value_response

db_credentials = get_secret(secret_name='wysde')

USERNAME = db_credentials["RDS_POSTGRES_USERNAME"]
PASSWORD = db_credentials["RDS_POSTGRES_PASSWORD"]
HOST = db_credentials["RDS_POSTGRES_HOST"]
PORT = 5432
DBNAME = "postgres"
CONN = f"postgresql://{USERNAME}:{PASSWORD}@{HOST}:{PORT}/{DBNAME}"

%sql {CONN}

Data Schema Analysis

image.png

%sql select * from city limit 5;
indexidnamecountrycodedistrictpopulation
06RotterdamNLDZuid-Holland593321
13878ScottsdaleUSAArizona202705
23965CoronaUSACalifornia124966
33973ConcordUSACalifornia121780
43977Cedar-RapidsUSAIowa120758

image.png

%sql select * from country limit 5;
indexcodenamecontinentregionsurfaceareaindepyearpopulationlifeexpectancygnpgnpoldlocalnamegovernmentformheadofstatecapitalcode2
0ABWArubaNorth AmericaCaribbean193None10300078.4828793.0ArubaNonmetropolitan Territory of The NetherlandsBeatrix129.0AW
1AFGAfghanistanAsiaSouthern and Central Asia6520901919.02272000045.95976NoneAfganistan/AfqanestanIslamic EmirateMohammad Omar1.0AF
2AIAAnguillaNorth AmericaCaribbean96None800076.163NoneAnguillaDependent Territory of the UKElisabeth II62.0AI
3ANDAndorraEuropeSouthern Europe4681278.07800083.51630NoneAndorraParliamentary CoprincipalityNone55.0AD
4ANTNetherlands AntillesNorth AmericaCaribbean800None21700074.71941NoneNederlandse AntillenNonmetropolitan Territory of The NetherlandsBeatrix33.0AN

image.png

%sql select * from station limit 5;
indexidcitystatelat_nlong_w
0794Kissee MillsMO14073
1824Loma MarCA49131
2603Sandy HookCT72148
3478TiptonIN3498
4619ArlingtonCO7593

image.png

%sql select * from students limit 5;
indexidnamemarks
019Samantha87
121Julia96
211Britney95
332Kristeen100
412Dyana55

Questions

Query all columns for all American cities in the CITY table with populations larger than 100000. The CountryCode for America is USA

%%sql
# write your code here
indexidnamecountrycodedistrictpopulation
13878ScottsdaleUSAArizona202705
23965CoronaUSACalifornia124966
33973ConcordUSACalifornia121780
43977Cedar-RapidsUSAIowa120758
53982Coral-SpringsUSAFlorida117549

Query the NAME field for all American cities in the CITY table with populations larger than 120000. The CountryCode for America is USA

%%sql
# write your code here
name
Scottsdale
Corona
Concord
Cedar-Rapids

Query the average population for all cities in CITY, rounded down to the nearest integer

%%sql
# write your code here
round
172792

Query a count of the number of cities in CITY having a Population larger than 100,000

%%sql
# write your code here
count
6

Query the total population of all cities in CITY where District is California

%%sql
# write your code here
sum
339002

Query the difference between the maximum and minimum populations in CITY

%%sql
# write your code here
?column?
502766

Query all attributes of every Japanese city in the CITY table. The COUNTRYCODE for Japan is JPN

%%sql
# write your code here
indexidnamecountrycodedistrictpopulation
351613NeyagawaJPNOsaka257315
361630AgeoJPNSaitama209442
371661SayamaJPNSaitama162472
381681OmutaJPNFukuoka142889
391739TokuyamaJPNYamaguchi107078

Given the CITY and COUNTRY tables, query the names of all cities where the CONTINENT is 'Africa'

Note: CITY.CountryCode and COUNTRY.Code are matching key columns.

%%sql
# write your code here
name
Bujumbura
al-Minya
Banjul
Monrovia
Taza
Malabo

Given the CITY and COUNTRY tables, query the sum of the populations of all cities where the CONTINENT is 'Asia'

Note: CITY.CountryCode and COUNTRY.Code are matching key columns.

%%sql
# write your code here
sum
4199782

Given the CITY and COUNTRY tables, query the names of all the continents (COUNTRY.Continent) and their respective average city populations (CITY.Population) rounded down to the nearest integer

Note: CITY.CountryCode and COUNTRY.Code are matching key columns.

%%sql
# write your code here
continentfloor
Africa254397
Asia262486
South America2042407
North America95368
Europe221749
Oceania8600

Query a list of CITY names from STATION for cities that have an even ID number. Print the results in any order, but exclude duplicates from the answer

%%sql
# write your code here
city
Aguanga
Alba
Albany
Amo
Andersonville

Find the difference between the total number of CITY entries in the table and the number of distinct CITY entries in the table

For example, if there are three records in the table with CITY values 'New York', 'New York', 'Bengalaru', there are 2 different city names: 'New York' and 'Bengalaru'. The query returns 1, because

total number of records - number of unique city names = 3 - 2 = 1

%%sql
# write your code here
?column?
13

Query the two cities in STATION with the shortest and longest CITY names, as well as their respective lengths (i.e.: number of characters in the name).

If there is more than one smallest or largest city, choose the one that comes first when ordered alphabetically. For example, CITY has four entries: DEF, ABC, PQRS and WXY, the output would be:

ABC 3
PQRS 4
%%sql
# write your code here
citylength
Marine On Saint Croix21

Query the list of CITY names starting with vowels (i.e., a, e, i, o, or u) from STATION. Your result cannot contain duplicates

%%sql
# write your code here
city
Arlington
Albany
Upperco
Aguanga
Odin

Query the list of CITY names from STATION which have vowels (i.e., a, e, i, o, and u) as both their first and last characters. Your result cannot contain duplicates

%%sql
# write your code here
city
Acme
Aguanga
Alba
Aliso Viejo
Alpine

Query the sum of Northern Latitudes (LAT_N) from STATION having values greater than 38.7880 and less than 137.2345. Truncate your answer to 4 decimal places

%%sql
# write your code here
sum_lat
36472.0000

Query the smallest Northern Latitude (LAT_N) from STATION that is greater than 38.7780. Round your answer to 4 decimal places

%%sql
# write your code here
round
39.0000

Query the Manhattan Distance between points P1 and P2 and round it to a scale of 4 decimal places

Consider P1(a, b) and P2(c, d) to be two points on a 2D plane.

  1. happens to equal the minimum value in Northern Latitude (LAT_N in STATION).
  2. happens to equal the minimum value in Western Longitude (LONG_W in STATION).
  3. happens to equal the maximum value in Northern Latitude (LAT_N in STATION).
  4. happens to equal the maximum value in Western Longitude (LONG_W in STATION).
%%sql
# write your code here
round
260.0000

Query the Name of any student in STUDENTS who scored higher than Marks. Order your output by the last three characters of each name. If two or more students both have names ending in the same last three characters (i.e.: Bobby, Robby, etc.), secondary sort them by ascending ID

%%sql
# write your code here
name
Stuart
Kristeen
Christene
Amina
Aamina
Priya
Heraldo
Scarlet
Julia
Salma
Britney
Priyanka
Samantha
Vivek
Belvet
Devil
Evil