Project proposal

Author

Dank Vibe

library(tidyverse)
library(tidytuesdayR)
library(readr)

The Goal

The goal is to analyze border enforcement trends by looking at patterns in encounters (deportations, asylum seeking, penalties for unauthorized border crossings, and other actions under different titles of authority) across geographic regions and demographics. We also seek to understand which states experience the most encounters for specific demographic groups and how these encounters vary in type. These findings should help policymakers, researchers, and the public better understand border dynamics and changes in immigration policy.

Dataset


To prepare the CBP dataset for analysis, we recode variables into categorical and numeric types for better statistical modeling and visualization. The categorical variables include fiscal_year, land_border_region, state, demographic, citizenship, and title_of_authority, all converted using factor(). The month_abbv variable is treated as an ordered factor to maintain chronological order from “JAN” to “DEC”. For numeric variables, encounter_count remains as a continuous numerical value. Additionally, we create a binned fiscal year variable (fiscal_decade), grouping years into “2000s”, “2010s”, and “2020s” using case_when(), and categorize states into broader U.S. Census regions for regional comparisons. These transformations ensure that both numerical and categorical data can be effectively used in visualizations and modeling.
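The recoding described above could be sketched roughly as follows. This is a minimal illustration on a made-up toy table (cbp_toy and its values are hypothetical), not the final cleaning script; the decade cut-offs in case_when() are our assumed reading of the "2000s", "2010s", and "2020s" bins.

```r
library(dplyr)

# Toy rows standing in for cbp_state (values are invented for illustration)
cbp_toy <- tibble(
  fiscal_year     = c(2020, 2021, 2024),
  month_abbv      = c("OCT", "JAN", "DEC"),
  state           = c("TX", "AZ", "NY"),
  encounter_count = c(10, 25, 3)
)

# "JAN", "FEB", ..., "DEC" built from base R's month.abb constant
month_levels <- toupper(month.abb)

cbp_recoded <- cbp_toy %>%
  mutate(
    # Bin the numeric year first, before fiscal_year is turned into a factor
    fiscal_decade = case_when(
      fiscal_year < 2010 ~ "2000s",
      fiscal_year < 2020 ~ "2010s",
      TRUE               ~ "2020s"
    ),
    fiscal_year = factor(fiscal_year),
    # Ordered factor so months sort JAN -> DEC rather than alphabetically
    month_abbv  = factor(month_abbv, levels = month_levels, ordered = TRUE),
    state       = factor(state)
  )
```

The same mutate() call would be extended to land_border_region, demographic, citizenship, and title_of_authority on the real data; encounter_count is left numeric.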


Our data is the U.S. Customs and Border Protection (CBP) Encounter data. It covers encounters at the Northern Land Border, the Southwest Land Border, and Nationwide (air, land, and sea modes of transportation). It is extracted live from CBP systems and data sources, so the data can, and most likely will, change over time due to corrections and system changes throughout the fiscal year.

It is split into two CSVs:

- cbp_resp.csv: nationwide data separated by region, contains 12 variables
- cbp_state.csv: data separated by U.S. state, contains 9 variables

We chose this dataset because its variables allow us to answer questions such as which regions experience the most encounters, which encounter types are most common, and during which periods they occur. Other datasets could answer these questions too, but a live one lets us give accurate and nearly immediate reports on encounters at the borders.

# Sources: https://github.com/rfordatascience/tidytuesday/blob/main/data/2024/2024-11-26/readme.md
## install.packages("tidytuesdayR")

cbp_resp <- readr::read_csv(
  'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-11-26/cbp_resp.csv')

Rows: 68815 Columns: 12
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (10): month_grouping, month_abbv, component, land_border_region, area_of...
dbl  (2): fiscal_year, encounter_count

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

cbp_state <- readr::read_csv(
  'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-11-26/cbp_state.csv')

Rows: 54939 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (7): month_grouping, month_abbv, land_border_region, state, demographic,...
dbl (2): fiscal_year, encounter_count

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Add to data folder
write_csv(cbp_resp, paste0("data/", "cbp_resp.csv"))
write_csv(cbp_state, paste0("data/", "cbp_state.csv"))

Questions

How do citizenship and title of authority across different land border regions relate to encounter types?
Do certain states have more encounters with a certain demographic and how do these encounter types differ?

Analysis plan

Question 1: How do citizenship and title of authority across different land border regions relate to encounter types?

The variables involved with this question are:

Citizenship: Country of origin of the individuals encountered.

Title of Authority: Legal authority under which the encounter was processed, such as Title 8 (standard immigration law) or Title 42 (public health-related expulsions).
Land Border Region: Geographical area where the encounter occurred, such as Northern Land Border, Southwest Land Border, or Coastal Border.
Encounter Type: Nature of the encounter, including apprehensions, inadmissibles, and expulsions.
Encounter Count: Number of individuals encountered for each combination of the above variables.

Plan: First, we select the relevant variables: citizenship, title of authority, land border region, encounter type, and encounter count. Next, we make the data consistent by standardizing country names and categorizing encounter types uniformly. Then we aggregate encounter counts by citizenship, title of authority, and land border region. Finally, we examine the relationship between citizenship and title of authority within each land border region.
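As a rough sketch of the aggregation step in this plan, assuming a toy table whose column names follow the variables listed above (cbp_toy and all of its values are hypothetical):

```r
library(dplyr)

# Hypothetical rows using the variables named in the plan
cbp_toy <- tibble(
  citizenship        = c("MEXICO", "MEXICO", "CANADA", "MEXICO"),
  title_of_authority = c("Title 8", "Title 8", "Title 8", "Title 42"),
  land_border_region = c("Southwest Land Border", "Southwest Land Border",
                         "Northern Land Border", "Southwest Land Border"),
  encounter_type     = c("Apprehensions", "Inadmissibles",
                         "Inadmissibles", "Expulsions"),
  encounter_count    = c(100, 20, 5, 40)
)

# Total encounters for each region x citizenship x title combination
encounters_by_group <- cbp_toy %>%
  select(citizenship, title_of_authority, land_border_region,
         encounter_type, encounter_count) %>%
  group_by(land_border_region, citizenship, title_of_authority) %>%
  summarise(total_encounters = sum(encounter_count), .groups = "drop") %>%
  arrange(desc(total_encounters))
```

Sorting by total_encounters puts the largest citizenship and title combinations first, which is a convenient starting point for comparing regions.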

Plot 1: Create bar charts showing encounter counts by encounter type for each citizenship and title of authority.

Plot 2: Create a Mercator map or heat map to show the distribution of encounter types across different citizenships and titles of authority within each land border region.

This would help us analyze patterns to determine whether certain citizenships are more likely to be processed under specific titles of authority in particular regions.
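One possible shape for the heat map, sketched with geom_tile() on hypothetical data (the toy values and the choice of one facet per region are our assumptions, not a final design):

```r
library(dplyr)
library(ggplot2)

# Hypothetical rows; real data would come from the CBP files
cbp_toy <- tibble(
  citizenship        = c("MEXICO", "MEXICO", "CANADA", "MEXICO"),
  title_of_authority = c("Title 8", "Title 42", "Title 8", "Title 8"),
  land_border_region = c("Southwest Land Border", "Southwest Land Border",
                         "Northern Land Border", "Southwest Land Border"),
  encounter_count    = c(100, 40, 5, 20)
)

# Sum encounter_count within each region x citizenship x title cell
heat_data <- cbp_toy %>%
  count(land_border_region, citizenship, title_of_authority,
        wt = encounter_count, name = "total_encounters")

# One tile per citizenship x title pair, one panel per border region
heatmap_plot <- ggplot(
  heat_data,
  aes(x = title_of_authority, y = citizenship, fill = total_encounters)
) +
  geom_tile() +
  facet_wrap(~ land_border_region) +
  labs(x = "Title of authority", y = "Citizenship", fill = "Encounters")
```

With many citizenships on the real data, it may be worth keeping only the most frequent ones before plotting so the y-axis stays readable.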

Question 2: Do certain states have more encounters with a certain demographic, and how do these encounter types differ?

The variables involved with this question are:

State: U.S. state where the encounter occurred.
Demographic: Classification of individuals, such as Single Adults, Family Unit Members (FMUA), Unaccompanied Alien Children (UAC), and Accompanied Minors (AM).
Encounter Type: Nature of the encounter, including Title 8 apprehensions, Title 8 inadmissibles, and Title 42 expulsions.
Encounter Count: Number of individuals encountered for each combination of the above variables.

Plan: We select the variables state, demographic, encounter type, and encounter count. Then we standardize the demographic categories and make sure state naming is consistent. Following that, we aggregate encounter counts by state and demographic group, and identify states with higher encounter counts for specific demographics.

Plot 1: Develop a density map using a ggplot2 map geom such as geom_sf() to display encounter densities for specific demographics by state.

Plot 2: Use stacked bar charts to show the breakdown of encounter types within each demographic group per state. This lets us identify states with disproportionately high encounters for certain demographics and see how encounter types vary within demographic groups across states.
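A minimal sketch of the stacked bar chart, assuming a table with the state, demographic, encounter_type, and encounter_count variables listed in this plan (the toy rows are hypothetical, and a join may be needed if the real files do not hold all four columns together):

```r
library(dplyr)
library(ggplot2)

# Hypothetical rows using the variables named in the Question 2 plan
cbp_toy <- tibble(
  state           = c("TX", "TX", "TX", "AZ", "AZ"),
  demographic     = c("Single Adults", "Single Adults", "FMUA",
                      "Single Adults", "UAC"),
  encounter_type  = c("Apprehensions", "Expulsions", "Apprehensions",
                      "Inadmissibles", "Apprehensions"),
  encounter_count = c(80, 30, 25, 15, 10)
)

# Bars are stacked by encounter type within each demographic, one panel per state
stacked_plot <- ggplot(
  cbp_toy,
  aes(x = demographic, y = encounter_count, fill = encounter_type)
) +
  geom_col(position = "stack") +
  facet_wrap(~ state) +
  labs(x = "Demographic", y = "Encounters", fill = "Encounter type")
```

Switching to position = "fill" would show proportions instead of raw counts, which can make states with very different totals easier to compare.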

Variables to be Created: Encounter Density: The number of encounters per 100,000 residents in each state, to account for population differences. We calculate it by taking the total number of encounters (encounter_count) in a state, dividing by the state's population, and multiplying by 100,000.
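The encounter density calculation might look like the sketch below. The population table (state_pop) is hypothetical: the CBP files do not include population, so it would need to come from an external source such as Census estimates, and the state codes and numbers here are invented.

```r
library(dplyr)

# Hypothetical encounter rows (state codes and counts are invented)
cbp_toy <- tibble(
  state           = c("AA", "AA", "BB"),
  encounter_count = c(300, 200, 50)
)

# Hypothetical state populations; real values would come from an external source
state_pop <- tibble(
  state      = c("AA", "BB"),
  population = c(1000000, 250000)
)

encounter_density <- cbp_toy %>%
  group_by(state) %>%
  summarise(total_encounters = sum(encounter_count), .groups = "drop") %>%
  left_join(state_pop, by = "state") %>%
  # encounters / population * 100,000
  mutate(encounters_per_100k = total_encounters / population * 100000)
```

In this toy example, state AA has 500 encounters for 1,000,000 residents (50 per 100,000), while BB has 50 encounters for 250,000 residents (20 per 100,000).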