library(tidyverse)
library(taylor)
library(tidytuesdayR)
Project proposal
Dataset
tuesdata <- tidytuesdayR::tt_load('2023-10-17')
We chose the dataset “Taylor Swift,” sourced from the ‘taylor’ R package. This dataset includes 3 dataframes:
taylor_album_songs <- tuesdata$taylor_album_songs
This dataframe includes lyrics and audio features from the Spotify API for songs on official albums, with 29 variables and 194 observations.
taylor_all_songs <- tuesdata$taylor_all_songs
This dataframe covers the entire discography, including EPs and singles, with 29 variables and 274 observations.
taylor_albums <- tuesdata$taylor_albums
Finally, this dataframe summarizes the album release history, with 5 variables and 14 observations.
We chose this dataset because it represents a unique intersection of entertainment and data analytics. Given Taylor Swift’s significant impact on the music industry, we believe that analyzing her music records, styles, and sales can give us insight into evolving musical tastes and societal trends. By examining her musical journey, we can gain valuable perspectives on consumer preferences and the dynamics of the music market.
Questions
Trends in Musical Attributes Over Time: How has Taylor Swift’s musical style evolved in terms of emotional tone and energy over her career?
Lyric Analysis: What are the most common themes and words in Taylor Swift’s lyrics across her albums?
Analysis plan
Trends in Musical Attributes Over Time
For our analysis, we will use taylor_all_songs to explore the musical and lyrical composition of songs across all albums (including non-Taylor-owned albums). It integrates Spotify’s audio features for each song, including metrics such as danceability, energy, loudness, tempo, and valence. Below, we load the dataset and show the first few rows of the table.
data("taylor_all_songs")
head(taylor_all_songs)
# A tibble: 6 × 29
album_name ep album_release track_number track_name artist featuring
<chr> <lgl> <date> <int> <chr> <chr> <chr>
1 Taylor Swift FALSE 2006-10-24 1 Tim McGraw Taylo… <NA>
2 Taylor Swift FALSE 2006-10-24 2 Picture To Burn Taylo… <NA>
3 Taylor Swift FALSE 2006-10-24 3 Teardrops On M… Taylo… <NA>
4 Taylor Swift FALSE 2006-10-24 4 A Place In Thi… Taylo… <NA>
5 Taylor Swift FALSE 2006-10-24 5 Cold As You Taylo… <NA>
6 Taylor Swift FALSE 2006-10-24 6 The Outside Taylo… <NA>
# ℹ 22 more variables: bonus_track <lgl>, promotional_release <date>,
# single_release <date>, track_release <date>, danceability <dbl>,
# energy <dbl>, key <int>, loudness <dbl>, mode <int>, speechiness <dbl>,
# acousticness <dbl>, instrumentalness <dbl>, liveness <dbl>, valence <dbl>,
# tempo <dbl>, time_signature <int>, duration_ms <int>, explicit <lgl>,
# key_name <chr>, mode_name <chr>, key_mode <chr>, lyrics <list>
Variables involved:
- album_release: the release date of the album to track the evolution of her music over time
- album_name: to categorize tracks within their respective albums
- tempo: the overall estimated tempo of a track in BPM
- valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track
- danceability: a measure from 0.0 to 1.0 indicating how suitable a track is for dancing
Variables to be created:
- yearly average of audio features: the mean tempo, valence, and danceability of each album, keyed by its release year, to analyze trends over time
Approach: Using the variables above, we will perform a trend analysis. We will compute the average of each musical attribute (tempo, valence, danceability) and plot it against release date as a line plot. The plots will be faceted by attribute, so that all three share the same time axis for comparison while each attribute gets its own panel. This avoids the clutter of overlaying attributes that are measured on different scales (BPM for tempo, 0.0 to 1.0 for valence and danceability).
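The approach above could be sketched roughly as follows. This is a minimal illustration, not the final analysis code: toy_songs is a hypothetical stand-in for taylor_all_songs with made-up values, and the names album_trends and trend_plot are our own. It assumes averaging per album and using the album release date as the time axis.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for taylor_all_songs. The values are made up for
# illustration and are NOT real Spotify audio features.
toy_songs <- tibble(
  album_name    = c("Album A", "Album A", "Album B", "Album B"),
  album_release = as.Date(c("2006-10-24", "2006-10-24",
                            "2008-11-11", "2008-11-11")),
  tempo         = c(100, 120, 90, 110),
  valence       = c(0.40, 0.60, 0.30, 0.50),
  danceability  = c(0.50, 0.70, 0.60, 0.80)
)

# Average each attribute per album, then reshape to one row per
# album-attribute pair so the plot can be faceted by attribute.
album_trends <- toy_songs |>
  filter(!is.na(album_name)) |>
  group_by(album_name, album_release) |>
  summarise(
    across(c(tempo, valence, danceability),
           function(x) mean(x, na.rm = TRUE)),
    .groups = "drop"
  ) |>
  pivot_longer(
    cols      = c(tempo, valence, danceability),
    names_to  = "attribute",
    values_to = "album_mean"
  )

# One panel per attribute, sharing the x axis (time) but with free y axes,
# because tempo is in BPM while the other two range from 0 to 1.
trend_plot <- ggplot(album_trends, aes(x = album_release, y = album_mean)) +
  geom_line() +
  geom_point() +
  facet_wrap(~attribute, ncol = 1, scales = "free_y") +
  labs(x = "Album release date", y = "Album average")
```

Replacing toy_songs with taylor_all_songs should work unchanged as long as the column names match those listed above. The filter on album_name is there in case some tracks are not attached to an album.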
Lyrics Analysis
Variables involved:
taylor_all_songs$lyrics[1]
[[1]]
# A tibble: 55 × 4
line lyric element element_artist
<int> <chr> <chr> <chr>
1 1 "He said the way my blue eyes shined" Verse 1 Taylor Swift
2 2 "Put those Georgia stars to shame that night" Verse 1 Taylor Swift
3 3 "I said, \"That's a lie\"" Verse 1 Taylor Swift
4 4 "Just a boy in a Chevy truck" Verse 1 Taylor Swift
5 5 "That had a tendency of gettin' stuck" Verse 1 Taylor Swift
6 6 "On backroads at night" Verse 1 Taylor Swift
7 7 "And I was right there beside him all summer lo… Verse 1 Taylor Swift
8 8 "And then the time we woke up to find that summ… Verse 1 Taylor Swift
9 9 "But when you think Tim McGraw" Chorus Taylor Swift
10 10 "I hope you think my favorite song" Chorus Taylor Swift
# ℹ 45 more rows
taylor_albums
# A tibble: 14 × 5
album_name ep album_release metacritic_score user_score
<chr> <lgl> <date> <dbl> <dbl>
1 Taylor Swift FALSE 2006-10-24 67 8.5
2 The Taylor Swift Holiday Col… TRUE 2007-10-14 NA NA
3 Beautiful Eyes TRUE 2008-07-15 NA NA
4 Fearless FALSE 2008-11-11 73 8.4
5 Speak Now FALSE 2010-10-25 77 8.6
6 Red FALSE 2012-10-22 77 8.5
7 1989 FALSE 2014-10-27 76 8.2
8 reputation FALSE 2017-11-10 71 8.3
9 Lover FALSE 2019-08-23 79 8.4
10 folklore FALSE 2020-07-24 88 9
11 evermore FALSE 2020-12-11 85 8.9
12 Fearless (Taylor's Version) FALSE 2021-04-09 82 8.9
13 Red (Taylor's Version) FALSE 2021-11-12 91 9
14 Midnights FALSE 2022-10-21 85 8.3
- lyrics: the actual text of the song’s lyrics for thematic analysis
- album_name: to categorize songs by album
- album_release: to categorize songs by year for a temporal analysis
Variables to be created:
- word frequency count: to measure how often each word appears across all songs
- thematic categorization: categorization of songs into themes (e.g., love, heartbreak, empowerment) based on keywords
- yearly thematic trends: to analyze the prevalence of certain themes in different years
Approach: Using the variables above, we will extract and count word frequencies and identify common themes. Because songs will be categorized into themes, we can plot the frequency or proportion of songs in each theme over time as line charts. We will also create a bar chart of the top words in Taylor Swift’s lyrics within specific themes. Together, these plots will illustrate how the most frequent words and themes have evolved across albums and years, offering insight into shifts in her lyrical focus throughout her career.
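The word counting and keyword-based theme tagging described above could look roughly like the sketch below. Everything here is illustrative: toy_songs mimics the nested lyrics column but contains invented lines rather than real lyrics, and toy_stop_words and theme_keywords are hypothetical placeholders that a real analysis would need to replace with a proper stop-word list and a carefully chosen keyword dictionary.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical stand-in for taylor_all_songs: one row per song, with a
# nested tibble of lyric lines. The lines are invented for illustration.
toy_songs <- tibble(
  album_name    = c("Album A", "Album B"),
  album_release = as.Date(c("2006-10-24", "2008-11-11")),
  track_name    = c("Song One", "Song Two"),
  lyrics = list(
    tibble(line = 1:2, lyric = c("Love me, love me not",
                                 "My heart will break")),
    tibble(line = 1:2, lyric = c("I am strong and I am free",
                                 "No more tears tonight"))
  )
)

# Placeholder stop words and theme keywords (assumptions, not from the data).
toy_stop_words <- c("i", "am", "and", "me", "my", "will", "no", "more", "not")
theme_keywords <- tibble(
  theme = c("love", "heartbreak", "heartbreak", "empowerment", "empowerment"),
  word  = c("love", "break", "tears", "strong", "free")
)

# One row per word: unnest the lyric lines, lowercase them, and split each
# line into runs of letters and straight apostrophes.
words <- toy_songs |>
  unnest(lyrics) |>
  mutate(word = str_extract_all(str_to_lower(lyric), "[a-z']+")) |>
  unnest(word)

# Word frequency across all songs, most frequent first.
word_counts <- words |>
  filter(!word %in% toy_stop_words) |>
  count(word, sort = TRUE)

# Number of songs per theme and release year. A song counts once per theme,
# and it may belong to more than one theme.
theme_by_year <- words |>
  inner_join(theme_keywords, by = "word") |>
  mutate(year = as.integer(format(album_release, "%Y"))) |>
  distinct(year, track_name, theme) |>
  count(year, theme, name = "n_songs")
```

word_counts is the kind of table a top-words bar chart would be drawn from, and theme_by_year is the kind a line chart of theme prevalence would use. Letting a song carry several themes is one possible design choice; assigning each song a single dominant theme would require an extra tie-breaking rule.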