Solidarity Forever? The Success and Prevalence of Labor Unions in the US

Project Proposal

Author

Red Cassowary - Ming DeMers, Tasnimul Taher, Isabel Dawson, and Anthony Ma.

library(tidyverse)

Introduction

“Unions, the workers who brought you the 40-hour work week.” So goes the famous saying about the success and virtue of labor unions. Formed out of necessity due to the deadly working conditions of the Industrial Revolution, unions gave workers power through collective bargaining. Ideally, unions reduce exploitation, improve working conditions, and increase wages. They fought for the 40-hour work week and the minimum wage, and they helped end child labor. However, to accomplish these feats, many unions had to resort to drastic measures: picketing, striking, and even violence. The history of unions is long and complicated, so how successful has the fight been? Do workers in unions fare better than those who are not? In what sectors are unions most appropriate and successful? And most importantly, do unions still have a place in the modern American workplace?

Dataset

We have chosen a dataset from the TidyTuesday challenge on 09/05/2023. The data comes from Union Membership, Coverage, and Earnings from the CPS, by Barry T. Hirsch, David A. Macpherson, and William E. Even. The data uses the Current Population Survey (CPS) to provide “private and public sector labor union membership, coverage, and density estimates.”

We chose this dataset because economic data on unions has implications for ongoing debates about labor. In Ithaca, especially, there has been a lot of union activity and controversy in the last few years (e.g., Cornell grad students and Starbucks), so we wanted to look into union activity and effectiveness beyond Ithaca, at a national level.

union_demos <- read_csv("data/demographics.csv")

Rows: 1355 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): facet
dbl (7): year, sample_size, employment, members, covered, p_members, p_covered

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

union_states <- read_csv("data/states.csv")

Rows: 10455 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): state, sector, state_abbreviation
dbl (8): state_census_code, observations, employment, members, covered, p_me...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

union_wages <- read_csv("data/wages.csv")

Rows: 1273 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): facet
dbl (8): year, sample_size, wage, at_cap, union_wage, nonunion_wage, union_w...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

glimpse(union_demos)

Rows: 1,355
Columns: 8
$ year        <dbl> 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1983…
$ sample_size <dbl> 49095, 48245, 46488, 47648, 57191, 57321, 58080, 68594, 15…
$ employment  <dbl> 75519.29, 77101.10, 75703.92, 78776.94, 81334.13, 84966.08…
$ members     <dbl> 18088.57, 18176.48, 16778.28, 17402.98, 19335.10, 19548.35…
$ covered     <dbl> NA, NA, NA, NA, 21534.60, 21897.54, 23540.08, 22493.37, 21…
$ p_members   <dbl> 0.2395225, 0.2357486, 0.2216303, 0.2209147, 0.2377243, 0.2…
$ p_covered   <dbl> NA, NA, NA, NA, 0.2647671, 0.2577209, 0.2702135, 0.2571273…
$ facet       <chr> "all wage and salary workers", "all wage and salary worker…

glimpse(union_states)

Rows: 10,455
Columns: 11
$ state_census_code  <dbl> 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,…
$ state              <chr> "Maine", "Maine", "Maine", "Maine", "Maine", "Maine…
$ sector             <chr> "Priv. Construction", "Priv. Construction", "Priv. …
$ observations       <dbl> 85, 93, 95, 89, 114, 117, 119, 109, 78, 62, 59, 55,…
$ employment         <dbl> 16917.99, 20170.49, 23411.73, 22873.16, 28033.17, 3…
$ members            <dbl> 2207.3308, 2207.7742, 2490.9525, 1917.4617, 3377.06…
$ covered            <dbl> 2420.3100, 2207.7742, 2787.8625, 1917.4617, 3377.06…
$ p_members          <dbl> 0.13047244, 0.10945567, 0.10639764, 0.08383022, 0.1…
$ p_covered          <dbl> 0.14306135, 0.10945567, 0.11907975, 0.08383022, 0.1…
$ state_abbreviation <chr> "ME", "ME", "ME", "ME", "ME", "ME", "ME", "ME", "ME…
$ year               <dbl> 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 199…

glimpse(union_wages)

Rows: 1,273
Columns: 9
$ year                        <dbl> 1973, 1974, 1975, 1976, 1977, 1978, 1979, …
$ sample_size                 <dbl> 39774, 37966, 37812, 37888, 46591, 44577, …
$ wage                        <dbl> 3.963343, 4.258988, 4.622152, 4.914102, 5.…
$ at_cap                      <dbl> 0.001110298, 0.001568906, 0.002267337, 0.0…
$ union_wage                  <dbl> 4.613008, 5.021364, 5.428930, 5.835837, 6.…
$ nonunion_wage               <dbl> 3.754528, 4.019072, 4.386672, 4.646127, 4.…
$ union_wage_premium_raw      <dbl> 0.2286519, 0.2493840, 0.2375963, 0.2560648…
$ union_wage_premium_adjusted <dbl> 0.1717432, 0.1761086, 0.1858174, 0.1967221…
$ facet                       <chr> "all wage and salary workers", "all wage a…

We ultimately have three datasets. The first is on demographics. It has 1,355 observations, each representing a year and sector. It provides demographic information, including the number of union members, the percent of workers who are union members, and the number of workers covered by a union contract (which may exceed membership). The second dataset is on states. It has 10,455 observations, representing every state by year and sector. It provides the number of survey observations in each group, employment, membership, coverage, and more. Finally, the third dataset is on wages. It has 1,273 observations, each representing a year and sector. It provides the mean hourly wage of union and non-union workers, raw and adjusted union wage premiums, and more.

We will also join two additional data sets: one showing inflation over time, so we can adjust the wages and dues, and one listing when each Right to Work state adopted its law.
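As a rough illustration of the inflation adjustment, the sketch below rescales nominal wages into base-year dollars by multiplying by the ratio of the base-year price index to each year's index. The table `cpi_table`, its column `cpi_index`, the choice of base year, and all numbers are hypothetical placeholders; only `year`, `union_wage`, and `nonunion_wage` come from the wages data.

```r
library(dplyr)
library(tibble)

# Hypothetical price index by year (placeholder values, not real CPI data).
cpi_table <- tribble(
  ~year, ~cpi_index,
  2000,  100,
  2010,  125,
  2020,  150
)

# Toy stand-in for union_wages with the same column names.
toy_wages <- tribble(
  ~year, ~union_wage, ~nonunion_wage,
  2000,  20,          16,
  2010,  25,          20,
  2020,  30,          24
)

# Assumed base year: all wages are expressed in 2020 dollars.
base_cpi <- cpi_table$cpi_index[cpi_table$year == 2020]

real_wages <- toy_wages %>%
  left_join(cpi_table, by = "year") %>%
  mutate(
    # real wage = nominal wage * (base-year index / index in that year)
    union_wage_real    = union_wage * base_cpi / cpi_index,
    nonunion_wage_real = nonunion_wage * base_cpi / cpi_index
  )
```

In this toy example the nominal wages grow exactly in line with the index, so the real wages come out flat across years, which makes the effect of the adjustment easy to see.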

Categorical variables:

state - character

sector - character; there are 12 categories:

 [1] "all wage and salary workers"     "construction"                    "wholesale/retail"               
 [4] "public administration"           "private sector: all"             "private sector: nonagricultural"
 [7] "private sector: construction"    "private sector: manufacturing"   "public sector: all"             
[10] "public sector: federal"          "public sector: state government" "public sector: local government"

right_to_work - boolean

Numeric variables:

union_wage - double
nonunion_wage - double
employment - double
year - double
p_members - double
p_covered - double
union_wage_premium_adjusted - double

Questions

How have unions grown throughout the years, and in which states/sectors are they most prevalent? Basically, how widespread are unions?
How do Right to Work laws impact union membership and union coverage?

Analysis plan

To investigate how Right to Work laws impact union membership across all employment sectors, we will look at the percent of workers who are part of a union over time in Right to Work and non-Right to Work states. We will begin by creating a new variable called rtw (Right to Work)^6^ and joining it with the union_states data set. This variable records whether or not each state was a Right to Work state in each year since 1983. Most Right to Work states adopted their policies in the mid-1900s, but some did not get Right to Work laws on the books until very recently. Because rtw is defined per state and year, it captures how states changed their Right to Work policies over time.
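One possible way to build rtw is sketched below, assuming the Right to Work data arrives as a lookup table with one row per Right to Work state and the year its law took effect. The table `rtw_adoption`, its column `rtw_year`, the state names, and the years are hypothetical placeholders, and `toy_states` stands in for union_states.

```r
library(dplyr)
library(tibble)

# Hypothetical lookup table: adoption year per Right to Work state.
# States that never adopted a law are simply absent from the table.
rtw_adoption <- tribble(
  ~state,    ~rtw_year,
  "State A", 1947,
  "State B", 2012
)

# Toy stand-in for union_states, using its state, year, and p_members columns.
toy_states <- tribble(
  ~state,    ~year, ~p_members,
  "State A", 1983,  0.10,
  "State B", 2011,  0.15,
  "State B", 2013,  0.14,
  "State C", 2013,  0.25
)

toy_rtw <- toy_states %>%
  left_join(rtw_adoption, by = "state") %>%
  # TRUE only when the state has an adoption year and the row's year is
  # at or after it; states missing from the lookup get FALSE.
  mutate(rtw = !is.na(rtw_year) & year >= rtw_year) %>%
  select(-rtw_year)
```

Here "State B" is FALSE in 2011 and TRUE in 2013, which is the kind of within-state change over time that a single yes/no flag per state would miss.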

After the rtw variable is created, we will visualize the median percent of unionized workers over time in Right to Work and non-Right to Work states with a bar chart. For each year there will be two bars: one representing the median percent of unionized workers in Right to Work states and the other representing the median percent in non-Right to Work states. Next, we will compare the percent of workers covered by unions with the percent of workers who are actually union members in Right to Work and non-Right to Work states, once again using a bar chart. To make this second bar chart, we will need the state, year, p_members, and p_covered variables from the union_states data set, as well as the new rtw variable.
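The first chart could be sketched roughly as follows. The toy data, the object names `toy_rtw` and `median_by_rtw`, and the labels are illustrative assumptions. Since union_states has one row per state, sector, and year, the real data would likely need to be filtered to a single sector before taking medians.

```r
library(dplyr)
library(ggplot2)
library(tibble)

# Toy stand-in for union_states after the rtw variable has been joined.
toy_rtw <- tribble(
  ~state,    ~year, ~rtw,  ~p_members,
  "State A", 2000,  TRUE,  0.05,
  "State B", 2000,  TRUE,  0.07,
  "State C", 2000,  TRUE,  0.09,
  "State D", 2000,  FALSE, 0.15,
  "State E", 2000,  FALSE, 0.20,
  "State A", 2010,  TRUE,  0.04,
  "State B", 2010,  TRUE,  0.06,
  "State C", 2010,  FALSE, 0.12,
  "State D", 2010,  FALSE, 0.18,
  "State E", 2010,  FALSE, 0.16
)

# Median membership share per year within each Right to Work group.
median_by_rtw <- toy_rtw %>%
  group_by(year, rtw) %>%
  summarize(median_p_members = median(p_members), .groups = "drop")

# Two side-by-side bars per year: Right to Work vs. non-Right to Work.
p <- ggplot(
  median_by_rtw,
  aes(x = factor(year), y = median_p_members, fill = rtw)
) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(
    x = "Year",
    y = "Median percent of workers in a union",
    fill = "Right to Work state"
  )
```

The second chart could reuse the same grouping with p_covered summarized alongside p_members.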