library(tidyverse)
library(skimr)
Exploring the Age Factor:
Substance Use, Abuse, and the Impact of Age on Patterns and Behaviors
Data 1
Introduction and data
Food Access CSV File From the CORGIS Dataset Project
Curated by Ryan Whitcomb, Joung Min Choi, and Bo Guan on 9/14/2021, using data from the United States Department of Agriculture’s Economic Research Service.
The dataset contains information about each US county’s access to supermarkets, supercenters, grocery stores, or other sources of healthy and affordable food.
Research question
- A well formulated research question. (You may include more than one research question if you want to receive feedback on different ideas for your project. However, one per data set is required.)
What US regions have the highest level of food insecurity?
What counties are considered to have food deserts (need to find definition of food desert)?
What state has the most food insecurity?
- A description of the research topic along with a concise statement of your hypotheses on this topic.
Topic: American Food Insecurity
Our hypothesis is that rural counties will likely have higher food insecurity than urban counties.
- Identify the types of variables in your research question. Categorical? Quantitative?
Categorical: County Names
Quantitative: Dist. From Supermarkets By Factor (remaining variables in dataset)
Glimpse of data
foodAccess <- read.csv('data/food_access.csv')
skimr::skim(foodAccess)
Name | foodAccess |
Number of rows | 3142 |
Number of columns | 25 |
_______________________ | |
Column type frequency: | |
character | 2 |
numeric | 23 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
County | 0 | 1 | 10 | 33 | 0 | 1877 | 0 |
State | 0 | 1 | 4 | 20 | 0 | 51 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Population | 0 | 1 | 98264.02 | 312946.53 | 82 | 11114.50 | 25872.0 | 66780.00 | 9818605 | ▇▁▁▁▁ |
Housing.Data.Residing.in.Group.Quarters | 0 | 1 | 2541.21 | 6512.50 | 0 | 177.00 | 602.0 | 2247.00 | 171670 | ▇▁▁▁▁ |
Housing.Data.Total.Housing.Units | 0 | 1 | 37147.13 | 111990.96 | 39 | 4368.75 | 10017.0 | 25829.00 | 3241204 | ▇▁▁▁▁ |
Vehicle.Access.1.Mile | 0 | 1 | 662.16 | 1095.32 | 0 | 118.00 | 332.0 | 739.75 | 13735 | ▇▁▁▁▁ |
Vehicle.Access.1.2.Mile | 0 | 1 | 1503.13 | 3903.09 | 0 | 180.25 | 481.0 | 1197.75 | 83246 | ▇▁▁▁▁ |
Vehicle.Access.10.Miles | 0 | 1 | 31.01 | 80.16 | 0 | 1.00 | 11.0 | 34.75 | 1826 | ▇▁▁▁▁ |
Vehicle.Access.20.Miles | 0 | 1 | 5.16 | 47.42 | 0 | 0.00 | 0.0 | 0.00 | 1473 | ▇▁▁▁▁ |
Low.Access.Numbers.Children.1.Mile | 0 | 1 | 9527.62 | 16747.45 | 0 | 1649.25 | 4108.0 | 9723.25 | 250060 | ▇▁▁▁▁ |
Low.Access.Numbers.Children.1.2.Mile | 0 | 1 | 16668.66 | 41717.86 | 0 | 2176.50 | 5301.5 | 13327.25 | 911988 | ▇▁▁▁▁ |
Low.Access.Numbers.Children.10.Miles | 0 | 1 | 372.74 | 596.69 | 0 | 34.00 | 210.0 | 524.75 | 11490 | ▇▁▁▁▁ |
Low.Access.Numbers.Children.20.Miles | 0 | 1 | 40.76 | 235.28 | 0 | 0.00 | 0.0 | 0.00 | 5918 | ▇▁▁▁▁ |
Low.Access.Numbers.Low.Income.People.1.Mile | 0 | 1 | 11199.22 | 17273.37 | 0 | 2501.00 | 6300.5 | 13138.25 | 260673 | ▇▁▁▁▁ |
Low.Access.Numbers.Low.Income.People.1.2.Mile | 0 | 1 | 20660.44 | 48784.32 | 0 | 3472.25 | 8403.5 | 19185.50 | 1139072 | ▇▁▁▁▁ |
Low.Access.Numbers.Low.Income.People.10.Miles | 0 | 1 | 617.69 | 1142.24 | 0 | 51.25 | 319.0 | 804.00 | 24663 | ▇▁▁▁▁ |
Low.Access.Numbers.Low.Income.People.20.Miles | 0 | 1 | 76.11 | 476.40 | 0 | 0.00 | 0.0 | 0.00 | 12405 | ▇▁▁▁▁ |
Low.Access.Numbers.People.1.Mile | 0 | 1 | 39091.71 | 64757.27 | 0 | 7306.50 | 17921.5 | 42034.75 | 903299 | ▇▁▁▁▁ |
Low.Access.Numbers.People.1.2.Mile | 0 | 1 | 68483.47 | 164153.98 | 82 | 9527.50 | 22535.5 | 57185.00 | 3696268 | ▇▁▁▁▁ |
Low.Access.Numbers.People.10.Miles | 0 | 1 | 1637.40 | 2386.60 | 0 | 174.00 | 955.0 | 2288.00 | 37500 | ▇▁▁▁▁ |
Low.Access.Numbers.People.20.Miles | 0 | 1 | 172.54 | 823.48 | 0 | 0.00 | 0.0 | 0.00 | 17768 | ▇▁▁▁▁ |
Low.Access.Numbers.Seniors.1.Mile | 0 | 1 | 5339.46 | 8298.88 | 0 | 1194.25 | 2693.5 | 5919.75 | 123489 | ▇▁▁▁▁ |
Low.Access.Numbers.Seniors.1.2.Mile | 0 | 1 | 9148.15 | 20213.49 | 12 | 1556.25 | 3423.5 | 8226.75 | 431862 | ▇▁▁▁▁ |
Low.Access.Numbers.Seniors.10.Miles | 0 | 1 | 274.73 | 382.57 | 0 | 28.00 | 165.5 | 388.75 | 5801 | ▇▁▁▁▁ |
Low.Access.Numbers.Seniors.20.Miles | 0 | 1 | 30.33 | 137.68 | 0 | 0.00 | 0.0 | 0.00 | 4165 | ▇▁▁▁▁ |
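One possible way to approach the state-level question is to sum the county counts within each state and express them as a share of the population. The sketch below runs on a small made-up data frame (the values are invented, not taken from food_access.csv) that reuses three column names from the skim output above. Treating `Low.Access.Numbers.People.1.Mile / Population` as a stand-in for food insecurity is our assumption here, not a definition that comes with the dataset.

```r
# Toy stand-in for foodAccess; values are invented for illustration only.
toy_food <- data.frame(
  County = c("A County", "B County", "C County", "D County"),
  State = c("State X", "State X", "State Y", "State Y"),
  Population = c(1000, 3000, 2000, 2000),
  Low.Access.Numbers.People.1.Mile = c(500, 500, 1500, 1300)
)

# Sum the counts within each state.
by_state <- aggregate(
  cbind(Population, Low.Access.Numbers.People.1.Mile) ~ State,
  data = toy_food,
  FUN = sum
)

# Share of each state's population counted as low access at 1 mile
# (assumed proxy, see the note above).
by_state$low_access_share <-
  by_state$Low.Access.Numbers.People.1.Mile / by_state$Population

# Rank states from highest to lowest share.
by_state <- by_state[order(-by_state$low_access_share), ]
print(by_state)
```

Working with shares rather than raw counts matters because the raw counts mostly track population size, which is heavily right-skewed in this dataset.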
Data 2
Introduction and data
Drugs CSV file from the CORGIS Dataset Project
Data is by Austin Cory Bart, Ryan Whitcomb, Joung Min Choi, Bo Guan, created 10/29/2021. Data was collected from individual states as part of the NSDUH study. The data ranges from 2002 to 2018. Both totals (in thousands of people) and rates (as a percentage of the population) are given.
This dataset is about substance abuse, specifically cigarette, marijuana, cocaine, and alcohol use across different age groups and states in the US.
State Marijuana Laws CSV from data.world
Data compiled by Selene Arrazolo from a 2016 map by Michael Maciag from Governing Data (https://www.governing.com/archive/state-marijuana-laws-map-medical-recreational.html) (the article has since been updated), and updated by Liam Muecke to be current to 2019 based on a Wikipedia article (https://en.wikipedia.org/wiki/Timeline_of_cannabis_laws_in_the_United_States).
This dataset reflects the legal status of marijuana by state, placing each state in one of 4 categories (Medical, Recreational, No Laws Legalizing, and Decriminalized).
Research question
- A well formulated research question. (You may include more than one research question if you want to receive feedback on different ideas for your project. However, one per data set is required.)
What US regions have the highest level of drug use per category?
How has drug use in specific regions changed over time?
Which category of drug use is the most common?
What factors influence changes in adolescent substance abuse? (Marijuana legalization, popularity of vaping, etc.)
What type of substance does each age category prefer?
Has adolescent marijuana abuse increased in states that have legalized cannabis consumption?
- A description of the research topic along with a concise statement of your hypotheses on this topic.
Topic: Drug use in the United States
Our hypothesis is that cigarette use has declined in most states, and that states with larger populations will have higher drug use.
- Identify the types of variables in your research question. Categorical? Quantitative?
Categorical: States
Quantitative: Year, Population, (other variables in the dataset)
Glimpse of data
drugUse <- read.csv('data/drugs.csv')
skimr::skim(drugUse)
Name | drugUse |
Number of rows | 867 |
Number of columns | 53 |
_______________________ | |
Column type frequency: | |
character | 1 |
numeric | 52 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
State | 0 | 1 | 4 | 20 | 0 | 51 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Year | 0 | 1 | 2010.00 | 4.90 | 2002.00 | 2006.00 | 2010.00 | 2014.00 | 2018.00 | ▇▆▆▆▇ |
Population.12.17 | 0 | 1 | 489714.13 | 563795.85 | 30551.00 | 131540.50 | 339685.00 | 541095.00 | 3293484.00 | ▇▂▁▁▁ |
Population.18.25 | 0 | 1 | 658880.04 | 755989.75 | 57395.00 | 174293.50 | 456240.00 | 746808.00 | 4469106.00 | ▇▁▁▁▁ |
Population.26. | 0 | 1 | 3874155.48 | 4320775.92 | 310110.00 | 1027871.00 | 2698757.00 | 4509094.00 | 25917724.00 | ▇▂▁▁▁ |
Totals.Alcohol.Use.Disorder.Past.Year.12.17 | 0 | 1 | 19.22 | 25.29 | 0.00 | 5.00 | 11.00 | 24.00 | 204.00 | ▇▁▁▁▁ |
Totals.Alcohol.Use.Disorder.Past.Year.18.25 | 0 | 1 | 94.48 | 108.27 | 6.00 | 26.00 | 64.00 | 119.50 | 717.00 | ▇▁▁▁▁ |
Totals.Alcohol.Use.Disorder.Past.Year.26. | 0 | 1 | 224.15 | 254.02 | 19.00 | 57.50 | 154.00 | 271.50 | 1586.00 | ▇▁▁▁▁ |
Rates.Alcohol.Use.Disorder.Past.Year.12.17 | 0 | 1 | 0.04 | 0.02 | 0.01 | 0.03 | 0.04 | 0.05 | 0.11 | ▇▇▅▁▁ |
Rates.Alcohol.Use.Disorder.Past.Year.18.25 | 0 | 1 | 0.15 | 0.04 | 0.07 | 0.12 | 0.15 | 0.18 | 0.27 | ▃▇▇▂▁ |
Rates.Alcohol.Use.Disorder.Past.Year.26. | 0 | 1 | 0.06 | 0.01 | 0.03 | 0.05 | 0.06 | 0.07 | 0.11 | ▂▇▃▁▁ |
Totals.Alcohol.Use.Past.Month.12.17 | 0 | 1 | 65.80 | 77.68 | 3.00 | 17.00 | 43.00 | 81.00 | 540.00 | ▇▁▁▁▁ |
Totals.Alcohol.Use.Past.Month.18.25 | 0 | 1 | 393.02 | 440.01 | 32.00 | 99.00 | 258.00 | 480.00 | 2639.00 | ▇▁▁▁▁ |
Totals.Alcohol.Use.Past.Month.26. | 0 | 1 | 2124.66 | 2372.87 | 167.00 | 525.00 | 1380.00 | 2623.00 | 14513.00 | ▇▂▁▁▁ |
Rates.Alcohol.Use.Past.Month.12.17 | 0 | 1 | 0.14 | 0.04 | 0.05 | 0.11 | 0.13 | 0.16 | 0.25 | ▂▇▇▃▁ |
Rates.Alcohol.Use.Past.Month.18.25 | 0 | 1 | 0.61 | 0.08 | 0.30 | 0.56 | 0.61 | 0.66 | 0.76 | ▁▁▅▇▃ |
Rates.Alcohol.Use.Past.Month.26. | 0 | 1 | 0.55 | 0.08 | 0.28 | 0.51 | 0.56 | 0.61 | 0.72 | ▁▂▅▇▃ |
Totals.Tobacco.Cigarette.Past.Month.12.17 | 0 | 1 | 36.80 | 41.88 | 1.00 | 10.00 | 23.00 | 47.50 | 295.00 | ▇▁▁▁▁ |
Totals.Tobacco.Cigarette.Past.Month.18.25 | 0 | 1 | 209.94 | 219.53 | 14.00 | 56.00 | 147.00 | 265.00 | 1281.00 | ▇▂▁▁▁ |
Totals.Tobacco.Cigarette.Past.Month.26. | 0 | 1 | 857.22 | 844.42 | 76.00 | 223.00 | 678.00 | 1066.00 | 4452.00 | ▇▂▁▁▁ |
Rates.Tobacco.Cigarette.Past.Month.12.17 | 0 | 1 | 0.08 | 0.04 | 0.01 | 0.05 | 0.08 | 0.11 | 0.20 | ▆▇▇▃▁ |
Rates.Tobacco.Cigarette.Past.Month.18.25 | 0 | 1 | 0.34 | 0.08 | 0.13 | 0.28 | 0.35 | 0.40 | 0.53 | ▂▅▇▇▁ |
Rates.Tobacco.Cigarette.Past.Month.26. | 0 | 1 | 0.23 | 0.04 | 0.12 | 0.21 | 0.23 | 0.26 | 0.34 | ▁▅▇▅▁ |
Totals.Illicit.Drugs.Cocaine.Used.Past.Year.12.17 | 0 | 1 | 5.06 | 7.51 | 0.00 | 1.00 | 3.00 | 6.00 | 56.00 | ▇▁▁▁▁ |
Totals.Illicit.Drugs.Cocaine.Used.Past.Year.18.25 | 0 | 1 | 37.11 | 46.66 | 2.00 | 10.00 | 22.00 | 46.00 | 345.00 | ▇▁▁▁▁ |
Totals.Illicit.Drugs.Cocaine.Used.Past.Year.26. | 0 | 1 | 59.11 | 72.74 | 2.00 | 14.00 | 36.00 | 75.00 | 585.00 | ▇▁▁▁▁ |
Rates.Illicit.Drugs.Cocaine.Used.Past.Year.12.17 | 0 | 1 | 0.01 | 0.01 | 0.00 | 0.01 | 0.01 | 0.01 | 0.03 | ▇▆▃▁▁ |
Rates.Illicit.Drugs.Cocaine.Used.Past.Year.18.25 | 0 | 1 | 0.06 | 0.02 | 0.02 | 0.04 | 0.06 | 0.07 | 0.12 | ▃▇▆▂▁ |
Rates.Illicit.Drugs.Cocaine.Used.Past.Year.26. | 0 | 1 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.05 | ▇▆▁▁▁ |
Totals.Marijuana.New.Users.12.17 | 0 | 1 | 24.72 | 28.67 | 2.00 | 7.00 | 17.00 | 29.00 | 197.00 | ▇▁▁▁▁ |
Totals.Marijuana.New.Users.18.25 | 0 | 1 | 24.70 | 29.33 | 2.00 | 6.00 | 16.00 | 29.50 | 204.00 | ▇▁▁▁▁ |
Totals.Marijuana.New.Users.26. | 0 | 1 | 5.53 | 8.92 | 0.00 | 1.00 | 3.00 | 6.00 | 119.00 | ▇▁▁▁▁ |
Rates.Marijuana.New.Users.12.17 | 0 | 1 | 0.06 | 0.01 | 0.03 | 0.05 | 0.06 | 0.07 | 0.10 | ▁▇▆▂▁ |
Rates.Marijuana.New.Users.18.25 | 0 | 1 | 0.08 | 0.02 | 0.03 | 0.06 | 0.07 | 0.09 | 0.16 | ▁▇▃▁▁ |
Rates.Marijuana.New.Users.26. | 0 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | ▇▂▁▁▁ |
Totals.Marijuana.Used.Past.Month.12.17 | 0 | 1 | 34.86 | 41.58 | 2.00 | 9.50 | 22.00 | 42.00 | 307.00 | ▇▁▁▁▁ |
Totals.Marijuana.Used.Past.Month.18.25 | 0 | 1 | 123.24 | 147.85 | 8.00 | 35.00 | 79.00 | 152.50 | 1106.00 | ▇▁▁▁▁ |
Totals.Marijuana.Used.Past.Month.26. | 0 | 1 | 216.13 | 290.39 | 10.00 | 56.50 | 121.00 | 276.50 | 3086.00 | ▇▁▁▁▁ |
Rates.Marijuana.Used.Past.Month.12.17 | 0 | 1 | 0.07 | 0.02 | 0.04 | 0.06 | 0.07 | 0.08 | 0.14 | ▂▇▅▂▁ |
Rates.Marijuana.Used.Past.Month.18.25 | 0 | 1 | 0.19 | 0.05 | 0.08 | 0.15 | 0.18 | 0.22 | 0.39 | ▂▇▃▂▁ |
Rates.Marijuana.Used.Past.Month.26. | 0 | 1 | 0.06 | 0.03 | 0.02 | 0.04 | 0.05 | 0.07 | 0.18 | ▇▅▁▁▁ |
Totals.Marijuana.Used.Past.Year.12.17 | 0 | 1 | 65.55 | 76.86 | 4.00 | 18.00 | 43.00 | 81.00 | 545.00 | ▇▁▁▁▁ |
Totals.Marijuana.Used.Past.Year.18.25 | 0 | 1 | 202.54 | 237.62 | 16.00 | 56.00 | 131.00 | 252.50 | 1687.00 | ▇▁▁▁▁ |
Totals.Marijuana.Used.Past.Year.26. | 0 | 1 | 348.76 | 449.85 | 17.00 | 91.50 | 212.00 | 439.50 | 4476.00 | ▇▁▁▁▁ |
Rates.Marijuana.Used.Past.Year.12.17 | 0 | 1 | 0.14 | 0.03 | 0.09 | 0.12 | 0.13 | 0.16 | 0.23 | ▃▇▅▂▁ |
Rates.Marijuana.Used.Past.Year.18.25 | 0 | 1 | 0.31 | 0.07 | 0.17 | 0.27 | 0.30 | 0.35 | 0.53 | ▂▇▅▂▁ |
Rates.Marijuana.Used.Past.Year.26. | 0 | 1 | 0.09 | 0.04 | 0.04 | 0.07 | 0.08 | 0.11 | 0.25 | ▇▆▂▁▁ |
Totals.Tobacco.Use.Past.Month.12.17 | 0 | 1 | 47.51 | 51.33 | 1.00 | 13.00 | 31.00 | 62.00 | 358.00 | ▇▁▁▁▁ |
Totals.Tobacco.Use.Past.Month.18.25 | 0 | 1 | 249.24 | 253.20 | 18.00 | 67.50 | 181.00 | 313.00 | 1488.00 | ▇▂▁▁▁ |
Totals.Tobacco.Use.Past.Month.26. | 0 | 1 | 1029.91 | 1001.45 | 95.00 | 258.50 | 828.00 | 1289.00 | 5099.00 | ▇▂▁▁▁ |
Rates.Tobacco.Use.Past.Month.12.17 | 0 | 1 | 0.11 | 0.04 | 0.02 | 0.07 | 0.11 | 0.14 | 0.24 | ▅▇▇▃▁ |
Rates.Tobacco.Use.Past.Month.18.25 | 0 | 1 | 0.40 | 0.08 | 0.17 | 0.35 | 0.42 | 0.46 | 0.59 | ▁▃▆▇▂ |
Rates.Tobacco.Use.Past.Month.26. | 0 | 1 | 0.28 | 0.05 | 0.15 | 0.25 | 0.28 | 0.31 | 0.41 | ▁▅▇▅▁ |
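The cigarette part of the hypothesis could be checked by comparing each state's rate in its earliest and latest year. The following is a minimal sketch on made-up data (the states and values are invented for illustration); only the column names `State`, `Year`, and `Rates.Tobacco.Cigarette.Past.Month.18.25` come from the skim output above, and the choice of the 18 to 25 age group is arbitrary.

```r
# Toy stand-in for drugUse; values are invented for illustration only.
toy_drugs <- data.frame(
  State = rep(c("State X", "State Y"), each = 3),
  Year = rep(c(2002, 2010, 2018), times = 2),
  Rates.Tobacco.Cigarette.Past.Month.18.25 = c(0.40, 0.35, 0.25,
                                               0.30, 0.31, 0.32)
)

# For each state, subtract the earliest year's rate from the latest year's.
rate_change <- sapply(split(toy_drugs, toy_drugs$State), function(d) {
  d <- d[order(d$Year), ]
  rate <- d$Rates.Tobacco.Cigarette.Past.Month.18.25
  rate[length(rate)] - rate[1]
})

# Negative values mean the rate fell over the period.
print(rate_change)
declined <- names(rate_change)[rate_change < 0]
print(declined)
```

Comparing only the two endpoints ignores what happens in between; fitting a trend per state would be a natural next step if the endpoint comparison looks promising.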
legalStatus <- read.csv('data/state_marijuana_laws_2019_2.csv')
skimr::skim(legalStatus)
Name | legalStatus |
Number of rows | 51 |
Number of columns | 5 |
_______________________ | |
Column type frequency: | |
character | 5 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
State | 0 | 1 | 4 | 20 | 0 | 51 | 0 |
Medical | 0 | 1 | 0 | 3 | 33 | 2 | 0 |
Recreational | 0 | 1 | 0 | 3 | 39 | 2 | 0 |
Illegal | 0 | 1 | 0 | 3 | 34 | 2 | 0 |
Decriminalized | 0 | 1 | 0 | 3 | 47 | 2 | 0 |
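To relate adolescent marijuana use to legal status, the two files would need to be joined on `State`. The sketch below shows one way this might look, using made-up data. The skim output only tells us that each status column holds either an empty string or a short string, so the code assumes that any non-empty value means the category applies; the value "Yes" in the toy data is a placeholder, not the confirmed coding.

```r
# Toy stand-ins for drugUse and legalStatus; values are invented.
toy_use <- data.frame(
  State = c("State X", "State Y", "State Z"),
  Year = c(2018, 2018, 2018),
  Rates.Marijuana.Used.Past.Month.12.17 = c(0.10, 0.06, 0.08)
)
toy_laws <- data.frame(
  State = c("State X", "State Y", "State Z"),
  Recreational = c("Yes", "", "Yes"),  # "Yes" is a placeholder value
  stringsAsFactors = FALSE
)

# Assumption: a non-empty string means the category applies to that state.
toy_laws$is_recreational <- nzchar(toy_laws$Recreational)

# Attach the legal-status flag to the use data by state name.
joined <- merge(toy_use, toy_laws[, c("State", "is_recreational")],
                by = "State")

# Mean past-month rate for ages 12 to 17 in each group.
group_means <- tapply(
  joined$Rates.Marijuana.Used.Past.Month.12.17,
  joined$is_recreational,
  mean
)
print(group_means)
```

A comparison like this is cross-sectional only. The laws file has no date columns, so answering whether use increased after legalization would require legalization years from another source.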
Data 3
Introduction and data
Monkeypox CSV file from the CORGIS Dataset Project
It was curated by Sam Donald on 9/27/2022, using data from the World Health Organization.
This dataset contains information about the status of monkeypox in a given country. Each observation is a country on a given day, and the information includes the number of cases and deaths reported on that day.
Research question
- A well formulated research question. (You may include more than one research question if you want to receive feedback on different ideas for your project. However, one per data set is required.)
Which countries had the highest number of deaths related to Monkeypox?
How has the rate of Monkeypox decreased over time?
- A description of the research topic along with a concise statement of your hypotheses on this topic.
Topic: Monkeypox around the world.
Hypothesis: Cases of Monkeypox have decreased over time.
- Identify the types of variables in your research question. Categorical? Quantitative?
Categorical variables: country code, country variable, date
Quantitative variables: year, month, day, cases (other variables in the dataset)
Glimpse of data
monkey_pox <- read.csv('data/monkeypox.csv')
skimr::skim(monkey_pox)
Name | monkey_pox |
Number of rows | 5874 |
Number of columns | 14 |
_______________________ | |
Column type frequency: | |
character | 3 |
numeric | 11 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Country.Iso.code | 0 | 1 | 3 | 8 | 0 | 99 | 0 |
Country.Full | 0 | 1 | 4 | 28 | 0 | 99 | 0 |
Date.Full | 0 | 1 | 10 | 10 | 0 | 126 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Date.Year | 0 | 1 | 2022.00 | 0.00 | 2022 | 2022.00 | 2022.00 | 2022.00 | 2022.00 | ▁▁▇▁▁ |
Date.Month | 0 | 1 | 7.13 | 0.98 | 5 | 6.00 | 7.00 | 8.00 | 9.00 | ▁▅▇▇▁ |
Date.Day | 0 | 1 | 15.91 | 9.11 | 1 | 8.00 | 16.00 | 24.00 | 31.00 | ▇▆▆▆▆ |
Data.Cases.New | 0 | 1 | 19.42 | 113.86 | 0 | 0.00 | 0.00 | 1.00 | 2063.00 | ▇▁▁▁▁ |
Data.Cases.Total | 0 | 1 | 717.81 | 3894.24 | 1 | 3.00 | 15.00 | 116.00 | 57039.00 | ▇▁▁▁▁ |
Data.Cases.New.per.million | 0 | 1 | 0.26 | 1.36 | 0 | 0.00 | 0.00 | 0.01 | 54.52 | ▇▁▁▁▁ |
Data.Cases.Total.per.million | 0 | 1 | 9.35 | 17.88 | 0 | 0.28 | 1.46 | 8.55 | 142.12 | ▇▁▁▁▁ |
Data.Deaths.New | 0 | 1 | 0.01 | 0.10 | 0 | 0.00 | 0.00 | 0.00 | 3.00 | ▇▁▁▁▁ |
Data.Deaths.Total | 0 | 1 | 0.12 | 0.95 | 0 | 0.00 | 0.00 | 0.00 | 19.00 | ▇▁▁▁▁ |
Data.Deaths.New.per.million | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.00 | 0.00 | 0.09 | ▇▁▁▁▁ |
Data.Deaths.Total.per.million | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.00 | 0.00 | 0.09 | ▇▁▁▁▁ |
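Both monkeypox questions come down to summarising the daily rows by country or by month. The sketch below uses made-up data with four column names from the skim output above. It assumes that `Data.Deaths.Total` is a running (cumulative) count and that `Data.Cases.New` is a per-day count; the column names and ranges suggest this, but it should be confirmed against the CORGIS documentation.

```r
# Toy stand-in for monkey_pox; values are invented for illustration only.
toy_mpox <- data.frame(
  Country.Full = rep(c("Country A", "Country B"), each = 3),
  Date.Month = rep(c(7, 8, 9), times = 2),
  Data.Cases.New = c(10, 30, 5, 2, 8, 1),
  Data.Deaths.Total = c(0, 1, 2, 0, 0, 0)
)

# Assuming Data.Deaths.Total is cumulative, take the maximum per country
# rather than the sum (summing would count the same deaths repeatedly).
deaths_by_country <- tapply(toy_mpox$Data.Deaths.Total,
                            toy_mpox$Country.Full, max)
deaths_by_country <-
  deaths_by_country[order(deaths_by_country, decreasing = TRUE)]

# Assuming Data.Cases.New is a per-day count, sum it within each month.
cases_by_month <- tapply(toy_mpox$Data.Cases.New, toy_mpox$Date.Month, sum)

print(deaths_by_country)
print(cases_by_month)
```

Since the data only cover months 5 through 9 of 2022, any trend seen in the monthly totals describes that window and not the outbreak as a whole.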