TikTok Tracks

Trends in Duration, Popularity, and Danceability

Brilliant Squirtle
Tiffany Lee, Helen Lim, Elizabeth Moon, Samhita Raman

5/5/23

Introduction

What factors contribute to making a track popular on TikTok?

  • TikTok is known for its short-form content. We’d like to know whether the overall duration of a song has any effect on its popularity.

How did social media platforms like TikTok play a role in the pandemic?

  • Restlessness and boredom during quarantine
  • Creation of dance trends as an outlet

The Data

  • Kaggle Dataset created by “Team Dan” on GitHub (June 2021)
  • Created by scraping TikTok data in the Philippines and Southeast Asia
  • .csv file
  • Each observation is a TikTok track
  • 3,560 unique tracks out of 6,746 total observations
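
The gap between the unique and total counts suggests that some tracks appear in more than one row. A minimal sketch of how the two figures could be checked, written in Python rather than the R used for the slides. The rows below are made-up sample data, and keying on `track_id` is an assumption about how uniqueness was defined:

```python
import csv
import io

# Made-up sample rows standing in for the real .csv file (assumption).
SAMPLE_CSV = """track_id,track_name
a1,Song A
b2,Song B
a1,Song A
c3,Song C
b2,Song B
"""

def count_tracks(csv_file, key="track_id"):
    """Return (unique, total) row counts, treating `key` as the track identifier."""
    seen = set()
    total = 0
    for row in csv.DictReader(csv_file):
        total += 1
        seen.add(row[key])
    return len(seen), total

unique, total = count_tracks(io.StringIO(SAMPLE_CSV))
print(f"{unique} unique | {total} total")  # prints "3 unique | 5 total"
```

With the real file, the same function could be called on an opened file handle instead of the in-memory sample.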

Variables of Interest:

  • track_name

  • track_pop

  • duration_ms

  • release_date

  • 6,746 rows and 23 columns (9 character, 14 numeric)

Duration and Popularity of TikTok Tracks

[Figure: track popularity versus duration, with a fitted smoothing line]

Analysis of Significance

\[ H_0 : \beta_{duration} = 0 \]

\[ H_A: \beta_{duration} \ne 0 \]

  • P-value = 0.536 > 0.05

  • Fail to reject the null hypothesis: the data do not provide evidence of a linear relationship between duration and popularity
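
To make the test of the duration slope concrete, here is a minimal sketch in Python (not the R workflow behind the slides). It fits a simple linear regression and tests whether the slope is zero. The toy numbers are made up, `slope_test` is a hypothetical helper, and the p-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large samples such as the thousands of tracks here:

```python
import math
from statistics import NormalDist, mean

def slope_test(x, y):
    """Two-sided test of H0: slope = 0 for the simple regression y ~ x.

    Returns (slope, t_statistic, p_value). The p-value relies on a
    normal approximation (assumption), not the exact t distribution.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope estimate.
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return slope, t_stat, p_value

# Made-up toy data: popularity does not trend with duration (minutes).
duration = [1, 2, 3, 4, 5, 6]
popularity = [2, 1, 2, 1, 2, 1]
slope, t_stat, p_value = slope_test(duration, popularity)
print(f"slope={slope:.3f}, t={t_stat:.3f}, p={p_value:.3f}")
```

A p-value above 0.05, as in this toy case, means the slope cannot be distinguished from zero at the 5% level.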

Popularity by Category


Danceability Before and After COVID-19 Pandemic

Analysis of Significance

\[ H_0: \mu_{before} = \mu_{after} \]

\[ H_A: \mu_{before} \ne \mu_{after} \]

  • P-value ≈ 0 (too small to distinguish from 0 at the reported precision) < 0.05

  • Reject null hypothesis in favor of the alternative
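
The slides do not state which procedure produced this p-value, so the following is only one way such a comparison of means could be run: a permutation test, sketched in Python rather than R. The danceability values are made up, `permutation_test` is a hypothetical helper, and the number of repetitions and the seed are arbitrary choices:

```python
import random
from statistics import mean

def permutation_test(before, after, reps=2000, seed=0):
    """Two-sided permutation test of H0: mean(before) == mean(after).

    Returns (observed_difference, p_value). Adding 1 to both counts
    keeps the estimated p-value from being exactly 0.
    """
    rng = random.Random(seed)
    observed = mean(after) - mean(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    hits = 0
    for _ in range(reps):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = mean(pooled[n_before:]) - mean(pooled[:n_before])
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (reps + 1)

# Made-up danceability scores (assumption, not taken from the dataset).
before = [0.50, 0.55, 0.52, 0.48, 0.51, 0.53]
after = [0.70, 0.72, 0.68, 0.75, 0.71, 0.69]
observed, p_value = permutation_test(before, after)
print(f"observed difference={observed:.3f}, p={p_value:.4f}")
```

Because the p-value is estimated from a finite number of shuffles, a very small result is better read as "below the resolution of the simulation" than as exactly zero.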

Conclusions & Future Work

Conclusions

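  • The 3-minute mark appears to be a sweet spot for songs to become trendy on TikTok, although the linear effect of duration on popularity was not statistically significant (P-value = 0.536).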

  • Our analysis isn’t entirely comprehensive and requires further research.

  • The top trending tracks became more “danceable” after quarantine.

Future Work

  • Use machine learning to predict the typical duration of future popular TikTok tracks.

  • Obtain data from other countries to compare how the track trends differ across regions.

  • Obtain data on each song’s global popularity and compare it to the song’s popularity on TikTok.

  • Conduct an ANOVA test to determine whether the differences in popularity across categories are statistically significant.