Powered by Poverty: Final Report

Introduction

Powered by Poverty is a scrollytelling data visualization about the uneven geography of the U.S. electricity transition. The goal is to help a public audience see energy data and socioeconomic data in the same frame. Clean-energy discussions tend to emphasize national totals: more solar, more wind, lower emissions. Those totals matter, but they can obscure where old infrastructure, poverty, and pollution concentrate together. We built a Closeread article that moves from national change over time to state energy maps, then to socioeconomic maps, county context, plant-location overlays, and scatterplots.

The intended audience is a reader who is interested in climate, infrastructure, or inequality but does not necessarily work with energy data. The product is designed to be read as an interactive story rather than as a technical dashboard. The reader scrolls through one argument: the clean-energy transition is real, but it is not automatically equitable. The visualizations support that argument while avoiding a stronger causal claim than the data can support.

Data and Integration

The project combines U.S. Energy Information Administration (EIA) data, Census Small Area Income and Poverty Estimates (SAIPE), and the World Resources Institute (WRI) Global Power Plant Database. EIA supplies 2024 installed electricity capacity by fuel source and 2024 power-sector CO2 by state. SAIPE supplies 2024 median household income and poverty rates. WRI supplies the geographic location, capacity, primary fuel type, and commissioning year of power plants across the U.S. We also use EIA State Electricity Profile summary rows for Texas and Louisiana to put total emissions in the context of direct-use electricity demand. The main join is state-level by design: EIA files are state-level, and SAIPE includes official state rows where county_fips_code == "000". Using those rows directly avoids comparing state-level energy data against a median of county values. Because WRI plant locations are given as latitude and longitude, the same file supports both state-level and county-level views.
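The state-level join described above can be sketched as follows. This is a minimal illustration in Python, not the project's R code. Apart from county_fips_code, every column name and value is a hypothetical placeholder rather than a field or figure from the actual files.

```python
import csv
import io

# Hypothetical SAIPE-style extract: state rows carry county_fips_code "000".
# Column names other than county_fips_code, and all values, are illustrative.
SAIPE_CSV = """state,county_fips_code,name,poverty_rate
AA,000,State A,18.0
AA,001,County A1,25.0
AA,003,County A2,12.0
BB,000,State B,10.0
BB,001,County B1,9.0
"""

# Hypothetical EIA-style state table (fossil share of installed capacity, %).
EIA_CSV = """state,fossil_share
AA,80.0
BB,45.0
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, keeping every field as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def state_rows(saipe_rows):
    """Keep only the official state-level SAIPE rows."""
    return [r for r in saipe_rows if r["county_fips_code"] == "000"]


def join_by_state(saipe_states, eia_rows):
    """Inner-join state poverty rows to state energy rows on the state code."""
    eia_by_state = {r["state"]: r for r in eia_rows}
    joined = []
    for r in saipe_states:
        match = eia_by_state.get(r["state"])
        if match is not None:
            joined.append({
                "state": r["state"],
                "poverty_rate": float(r["poverty_rate"]),
                "fossil_share": float(match["fossil_share"]),
            })
    return joined


joined = join_by_state(state_rows(read_rows(SAIPE_CSV)), read_rows(EIA_CSV))
for row in joined:
    print(row)
```

Reading the FIPS code as text matters here: if it were parsed as a number, "000" would lose its leading zeros and the filter would no longer match.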

County SAIPE rows are still used, but only for the county maps. Those maps show why state averages can mislead; they are not meant to identify specific power plants or household burdens. State comparisons use state data, and county maps are explicitly framed as geographic context only.

The WRI data was used in conjunction with the county-level SAIPE data so that the audience could compare where power plants are concentrated with where poverty is higher at the county level. Poverty levels are not homogeneous across a state, so comparing fossil fuel plant locations with county poverty context gives the audience more local detail than a state average can provide.

The monthly generation file was explored during EDA, but the local copy covers 2001-2002 while the rest of the analysis uses 2024. Because of that mismatch, it is excluded from the final product. Accordingly, the emissions analysis reports total power-sector CO2, not emissions intensity; the data needed for a per-MWh rate is not available for 2024.

The added EIA context is intentionally narrow. Texas and Louisiana are highlighted because they show why total emissions must be interpreted alongside electricity-system scale. EIA ranks Texas first and Louisiana second for direct-use electricity in 2024. That does not weaken the equity argument; it keeps us from confusing industrial electricity demand with household vulnerability.

Data centers and AI are part of this modern demand context as well. The U.S. Department of Energy’s Electricity Demand Growth Resource Hub frames recent load growth as driven mainly by data center expansion and AI, domestic manufacturing growth, and electrification. We do not add a state-level data-center variable because the available project data does not identify data-center load by state, but the point matters for interpretation: an equitable transition has to modernize legacy systems while also meeting new electricity demand from the digital and industrial economy.

State-level comparison                         Pearson correlation
Poverty rate vs. renewable capacity share      -0.29
Poverty rate vs. fossil capacity share          0.33
Poverty rate vs. total power-sector CO2         0.34
Median income vs. renewable capacity share      0.08
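For readers who want to see how a coefficient like those above is computed, here is a minimal Python sketch of the Pearson correlation. The project itself is built in R, so this is an illustration rather than the code behind the table, and the input values are invented for five hypothetical states.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five made-up states, not project data.
poverty_rate = [9.0, 11.5, 13.0, 16.0, 19.0]
renewable_share = [40.0, 35.0, 30.0, 33.0, 18.0]

r = pearson(poverty_rate, renewable_share)
print(round(r, 2))
```

A negative result means the two measures tend to move in opposite directions; the magnitude indicates how consistently they do so, not how large the effect is.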

Design Process

The new tool we learned for this project is Closeread, a Quarto extension for scrollytelling. We paired it with ggiraph so readers can hover over states and points as they scroll. The format fits the argument because the project depends on sequencing: first showing that the national grid is changing, then freezing the analysis at 2024, then showing that state infrastructure differs, then introducing income and poverty as competing explanations. A static grid of charts would be accurate, but it would not force the reader to confront each step before moving to the next one.

The visual system is restrained: light map backgrounds, sequential color palettes, state outlines on county maps, and short narrative cards. Color assignments are conventional (green for renewables, red for emissions, blue for income, orange for poverty), so readers spend little effort decoding the legend. Narrative copy was rewritten to match what the data actually supports: the product distinguishes total emissions from emissions rates, explains why large states such as Texas can lead in total carbon burden, and is explicit that income has only a weak relationship with renewable capacity.

For our visualizations involving power plant location data, we wanted to show where each fuel source has plants throughout the US and allow the audience to compare whether those patterns overlap with regional poverty patterns. We accomplished this by creating two layers: a poverty percentage map layer and a power plant location layer represented by points. We also included a size scale based on the electrical generating capacity of each power plant, allowing the audience to distinguish between plants that may have a larger or smaller impact on surrounding communities. The point colors distinguish the primary fossil fuel each plant uses: coal, gas, or oil. We selected colors that stand out against the orange poverty scale and remain readable for viewers with color vision deficiencies. Finally, the points come together in one graph that displays all coal, gas, and oil plants across the US. Interactivity lets the audience focus on one fuel source at a time, creating greater visibility for each fuel type’s geographic pattern.
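The plant layer described above can be sketched in Python as a filter plus a size scale. This is an illustration under stated assumptions, not the project's R code: the field names follow the WRI database's published column names as we understand them, the records are invented, and the square-root scale (so that point area tracks capacity) is our assumption, since the report does not specify which size scale was used.

```python
import math

# Hypothetical plant records using WRI-style field names; values are made up.
PLANTS = [
    {"country": "USA", "primary_fuel": "Coal", "capacity_mw": 1600.0,
     "latitude": 38.9, "longitude": -82.1},
    {"country": "USA", "primary_fuel": "Gas", "capacity_mw": 400.0,
     "latitude": 29.7, "longitude": -95.0},
    {"country": "USA", "primary_fuel": "Solar", "capacity_mw": 50.0,
     "latitude": 35.0, "longitude": -115.5},
    {"country": "CAN", "primary_fuel": "Oil", "capacity_mw": 100.0,
     "latitude": 45.3, "longitude": -66.0},
]

FOSSIL_FUELS = {"Coal", "Gas", "Oil"}


def us_fossil_plants(plants, fuel=None):
    """Return U.S. fossil plants, optionally restricted to one fossil fuel."""
    keep = (FOSSIL_FUELS & {fuel}) if fuel else FOSSIL_FUELS
    return [p for p in plants
            if p["country"] == "USA" and p["primary_fuel"] in keep]


def point_radius(capacity_mw, max_capacity_mw, max_radius=10.0):
    """Scale radius with sqrt(capacity) so point area is proportional to MW."""
    return max_radius * math.sqrt(capacity_mw / max_capacity_mw)


fossil = us_fossil_plants(PLANTS)
largest = max(p["capacity_mw"] for p in fossil)
for p in fossil:
    print(p["primary_fuel"], round(point_radius(p["capacity_mw"], largest), 1))
```

Passing a fuel name, as in us_fossil_plants(PLANTS, "Gas"), mirrors the interaction that lets the reader focus on one fuel at a time.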

The most important design decision was staying interactive without becoming a dashboard. A dashboard would expose all variables and filters at once, which shifts interpretive burden onto the reader. The Closeread product instead commits to a sequence, explaining why each chart appears in order, then uses county maps, plant overlays, and synthesis charts to connect state-level patterns back to local context.

Results

The key finding is not one dramatic statistic; it is the convergence of several patterns. Poverty and renewable capacity share correlate at -0.29 across the 50 states: higher-poverty states tend to have less renewable capacity, though the relationship is not strict. Poverty and fossil capacity share correlate at 0.33, and 10 of the 11 highest-poverty states are majority-fossil. Poverty and total power-sector CO2 correlate at 0.34, a similar value that is harder to interpret because it also reflects state size and electricity demand. Median income is nearly unrelated to renewable capacity, with a correlation of 0.08.

The data does not establish that poverty causes fossil dependence or the reverse. But the joined dataset makes clear that high-poverty states are often fossil-heavy. The highest-poverty examples include LA (84% fossil capacity), MS (80% fossil capacity), WV (91% fossil capacity), NM (40% fossil capacity, the exception to the pattern), and KY (91% fossil capacity). That overlap matters for energy equity: residents in poorer states have fewer resources to absorb high energy costs, aging infrastructure, or investment that arrives late.
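A count such as "how many of the highest-poverty states are majority-fossil" can be reproduced with a short ranking step. The Python sketch below is illustrative only: the project is built in R, the state codes and values are invented, and "majority-fossil" is taken to mean a fossil capacity share above 50%, which is our reading of the term.

```python
# Hypothetical joined state table; codes and values are not project data.
STATES = [
    {"state": "AA", "poverty_rate": 19.0, "fossil_share": 84.0},
    {"state": "BB", "poverty_rate": 18.2, "fossil_share": 40.0},
    {"state": "CC", "poverty_rate": 16.5, "fossil_share": 91.0},
    {"state": "DD", "poverty_rate": 12.0, "fossil_share": 55.0},
    {"state": "EE", "poverty_rate": 9.5, "fossil_share": 30.0},
    {"state": "FF", "poverty_rate": 8.0, "fossil_share": 62.0},
]


def majority_fossil_in_top_poverty(states, n, threshold=50.0):
    """Among the n highest-poverty states, list those above the fossil threshold."""
    ranked = sorted(states, key=lambda s: s["poverty_rate"], reverse=True)
    top = ranked[:n]
    majority = [s["state"] for s in top if s["fossil_share"] > threshold]
    return majority, [s["state"] for s in top]


majority, top = majority_fossil_in_top_poverty(STATES, n=3)
print(f"{len(majority)} of the {len(top)} highest-poverty states are majority-fossil")
```

Returning the state codes rather than a bare count makes the exceptions easy to name in the narrative.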

Plotting power plant locations against county poverty data adds nuance: the geographic concentration of plants by fuel type appears to be shaped by factors beyond the poverty level of the host community. Gas plants, for example, appear to track proximity to water more closely than poverty rates, which suggests that the presence of fossil fuel plants does not by itself determine a region’s poverty level, or vice versa. Bands of fossil fuel plants do span higher-poverty Appalachian and Southeastern regions, but concentrations also appear in less impoverished regions such as Downstate New York and the California coastline. Variables beyond the scope of our project therefore likely play a larger role in both power plant location and poverty rates.

The WRI commissioning-year field strengthens the “legacy infrastructure” part of the story. Among U.S. plants with known commissioning years, coal has a median commissioning year of about 1980, and 84% of known coal capacity was commissioned before 1990. Oil is similar, with 80% of known capacity predating 1990. Gas is newer by comparison, which is why the product treats gas geography as a scale and infrastructure story more than a simple age story.

Primary fuel    Median commissioning year    Known capacity before 1990
Coal            1980                         84%
Gas             2000                         23%
Oil             1995                         80%
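The two columns above measure different things, and a short sketch may help show why they can diverge. The Python below is an illustration, not the project's R code. It assumes the median counts each plant once while the pre-1990 share is weighted by capacity; the report does not state this explicitly, and the records are invented.

```python
import statistics

# Hypothetical plant records for one fuel; values are illustrative only.
PLANTS = [
    {"capacity_mw": 900.0, "commissioning_year": 1972},
    {"capacity_mw": 700.0, "commissioning_year": 1985},
    {"capacity_mw": 50.0, "commissioning_year": 1998},
    {"capacity_mw": 30.0, "commissioning_year": 2004},
    {"capacity_mw": 20.0, "commissioning_year": 2010},
    {"capacity_mw": 300.0, "commissioning_year": None},  # unknown year
]


def known_year(plants):
    """Drop records without a commissioning year."""
    return [p for p in plants if p["commissioning_year"] is not None]


def median_commissioning_year(plants):
    """Median commissioning year, counting each plant once."""
    return statistics.median(p["commissioning_year"] for p in known_year(plants))


def capacity_share_before(plants, cutoff_year):
    """Percent of known-year capacity commissioned before cutoff_year."""
    known = known_year(plants)
    total = sum(p["capacity_mw"] for p in known)
    old = sum(p["capacity_mw"] for p in known
              if p["commissioning_year"] < cutoff_year)
    return 100.0 * old / total


print(median_commissioning_year(PLANTS))
print(round(capacity_share_before(PLANTS, 1990), 1))
```

Under these assumptions, a few large old plants alongside many small newer ones produce a late median year and a high pre-1990 capacity share at the same time, which is one possible reading of a row like oil's.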

Limitations

The primary limitation is geographic scale. State-level energy data cannot identify which communities live near generating facilities, which households carry the highest utility burden, or which counties receive clean-energy investment. The county maps surface local variation in poverty and income, but they do not resolve the mismatch between county socioeconomic data and state energy data.

Emissions are also measured as total CO2, not per capita or per megawatt-hour. Total emissions show where carbon burden concentrates, but a large state can lead in total tons simply because it generates more electricity, not because its grid is dirtier. A future version should add current generation data to support emissions intensity calculations.
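The distinction between total emissions and emissions intensity can be made concrete with a small calculation. The Python sketch below is illustrative only: the two states and their figures are hypothetical, and the project's final product does not compute this rate because matching 2024 generation data was not available.

```python
def emissions_intensity(total_co2_tons, net_generation_mwh):
    """CO2 emitted per unit of electricity generated, in tons per MWh."""
    if net_generation_mwh <= 0:
        raise ValueError("net generation must be positive")
    return total_co2_tons / net_generation_mwh


# Hypothetical states, not EIA figures.
STATES = {
    "Large": {"co2_tons": 200_000_000, "generation_mwh": 500_000_000},
    "Small": {"co2_tons": 30_000_000, "generation_mwh": 40_000_000},
}

intensity = {
    name: emissions_intensity(s["co2_tons"], s["generation_mwh"])
    for name, s in STATES.items()
}

for name, value in intensity.items():
    print(name, STATES[name]["co2_tons"], round(value, 2))
```

In this invented case the larger state emits far more in total but less per megawatt-hour, which is the pattern that total CO2 alone cannot reveal.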

Additionally, the WRI power plant file reflects capacity records mostly from 2019 or from unspecified years. Commissioning year is available for most U.S. fossil plants in the local file, but not every record has complete metadata. Because the plant locations and commissioning years still help readers compare fossil fuel geography with county poverty patterns, we include them as contextual infrastructure evidence rather than as a complete 2024 operating inventory. We acknowledge that outdated or missing plant records may affect how the audience interprets the story.

Finally, the analysis shows associations, not causes. Energy systems are shaped by resource geography, regulation, utility markets, policy, and historical investment; none of these factors is fully captured here. The product is framed as an equity screening tool, not a causal argument.

Conclusion

Powered by Poverty uses data visualization to make a policy problem easier to see. By placing EIA electricity data alongside SAIPE income and poverty data, the project connects two conversations that rarely share a frame: decarbonization and economic vulnerability. The Closeread format matters because it turns that connection into a guided argument rather than a chart gallery. The takeaway is not that every poor state has a dirty grid or that every wealthy state has a clean one. It is a sharper question: as clean-energy investment expands, which communities are first in line?

The product is reproducible, uses local data files, and satisfies the core project requirement: a substantial data visualization built in R with a new workflow. Next steps should add full state-level industrial composition data, data-center load estimates, and household utility-burden measures, such as DOE/NREL LEAD energy burden estimates, to move from state-level associations toward direct measures of energy justice.

Data Sources

U.S. Energy Information Administration. “Electricity Data.” Used for existcapacity_annual.xlsx, emission_annual.xlsx, and generation_monthly.xlsx, including state-level generating capacity, power-sector emissions, and generation tables. https://www.eia.gov/electricity/data.php

U.S. Energy Information Administration. “Survey-Level Detailed Data Files.” Used as the detailed EIA data source reference for state-level electricity files and survey data documentation. https://www.eia.gov/electricity/data/detail-data.php

U.S. Energy Information Administration. “Texas Electricity Profile 2024” and “Louisiana Electricity Profile 2024.” Used for state_energy_context_2024.csv, including net generation, retail sales, direct use, power-sector CO2, and state ranks for Texas and Louisiana. https://www.eia.gov/electricity/state/texas/ and https://www.eia.gov/electricity/state/louisiana/

U.S. Census Bureau. “US and All States and Counties.” Small Area Income and Poverty Estimates (SAIPE) State and County Estimates for 2024. Used for usa_median_income.csv and usa_poverty.csv. https://www.census.gov/data/datasets/2024/demo/saipe/2024-state-and-county.html

U.S. Census Bureau. “Cartographic Boundary Files.” Used for county geometry through tigris::counties(cb = TRUE) and stored locally as county.geojson. https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html

Global Energy Observatory, Google, KTH Royal Institute of Technology in Stockholm, Enipedia, and World Resources Institute. 2019. “Global Power Plant Database.” Used for global_power_plant_database.csv, including plant location, capacity, primary fuel, and commissioning year. https://datasets.wri.org/datasets/global-power-plant-database

U.S. Department of Energy. “Electricity Demand Growth Resource Hub.” Used for background context on electricity load growth drivers, including data centers, AI, manufacturing, and electrification. https://www.energy.gov/policy/electricity-demand-growth-resource-hub