Holiday Movies & Their Changes

Yellow Echidna
Nayeon Kwon, Aishwarya Gupta, Yuxuan Chen

2024-02-29

Introduction

Dataset: holiday_movies.csv, which contains the following for each movie:

  • identifiers
  • titles
  • release years
  • runtime
  • genres
  • ratings
  • vote counts
  • boolean flags for specific holiday keywords in the title (“Christmas,” “Hanukkah,” “Kwanzaa,” and “holiday”)

Question 1

Introduction

What are the average ratings and movie counts of different holiday movie genres across the decades, and which genres are the most popular in each decade?

Approach

  • Current Variables: ‘average_rating’, ‘genres’, and ‘year’
  • New Variables: decades (grouped from ‘year’), average rating, number of movies (from ‘genres’)

  • Stacked bar charts –> Visualize genre distribution over the decades
  • Line chart –> Visualize average ratings over decades
  • Faceted charts –> Visualize individual genres
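
The derived variables above can be sketched as follows. This is a minimal illustration in Python, not the R code behind the slides. The sample rows are made up, and the sketch assumes that `genres` is a comma-separated string and that a movie with several genres counts once per genre; the slides state neither.

```python
from collections import defaultdict

# Hypothetical sample rows; the real data comes from holiday_movies.csv.
movies = [
    {"year": 1946, "genres": "Drama,Family", "average_rating": 8.6},
    {"year": 1947, "genres": "Comedy,Drama", "average_rating": 7.9},
    {"year": 1990, "genres": "Comedy,Family", "average_rating": 7.7},
    {"year": 1994, "genres": "Comedy,Family", "average_rating": 6.5},
    {"year": 1996, "genres": "Comedy", "average_rating": 5.7},
]


def decade_of(year):
    """Group a release year into its decade, e.g. 1946 -> 1940."""
    return year // 10 * 10


def summarize(rows):
    """Return {(decade, genre): (movie_count, mean_rating)}."""
    ratings = defaultdict(list)
    for row in rows:
        # Assumption: a multi-genre movie contributes to every listed genre.
        for genre in row["genres"].split(","):
            key = (decade_of(row["year"]), genre)
            ratings[key].append(row["average_rating"])
    return {key: (len(vals), sum(vals) / len(vals)) for key, vals in ratings.items()}


def top_genre_by_count(summary):
    """Return {decade: genre with the most movies in that decade}."""
    best = {}
    for (decade, genre), (count, _mean) in summary.items():
        if decade not in best or count > best[decade][1]:
            best[decade] = (genre, count)
    return {decade: genre for decade, (genre, _count) in best.items()}


summary = summarize(movies)
top = top_genre_by_count(summary)
```

Swapping the count for the mean rating in `top_genre_by_count` would give the highest-rated genre per decade instead.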

Analysis

Facet Line: Interactivity

Multi-variate Regression


Call:
lm(formula = average_rating ~ genre_list.x + as.factor(decade) + 
    count, data = combined_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.6835 -0.2335  0.0000  0.3614  2.0562 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)    
(Intercept)              4.889e+00  5.867e-01   8.333  < 2e-16 ***
genre_list.xAdventure    3.578e-01  6.987e-02   5.122 3.22e-07 ***
genre_list.xAnimation    8.782e-01  7.154e-02  12.274  < 2e-16 ***
genre_list.xBiography    1.127e+00  8.930e-02  12.623  < 2e-16 ***
genre_list.xComedy       5.114e-01  6.845e-02   7.471 1.04e-13 ***
genre_list.xCrime        2.086e-02  7.209e-02   0.289 0.772279    
genre_list.xDocumentary  1.410e+00  7.154e-02  19.705  < 2e-16 ***
genre_list.xDrama        5.222e-01  6.845e-02   7.629 3.17e-14 ***
genre_list.xFamily       6.322e-01  6.845e-02   9.236  < 2e-16 ***
genre_list.xFantasy      3.144e-01  6.845e-02   4.592 4.57e-06 ***
genre_list.xFilm-Noir    1.279e+00  1.589e-01   8.054 1.15e-15 ***
genre_list.xHistory      1.403e+00  8.313e-02  16.879  < 2e-16 ***
genre_list.xHorror      -5.045e-01  7.534e-02  -6.697 2.54e-11 ***
genre_list.xMusic        1.103e+00  6.951e-02  15.869  < 2e-16 ***
genre_list.xMusical      5.277e-01  6.845e-02   7.708 1.73e-14 ***
genre_list.xMystery     -8.308e-02  7.268e-02  -1.143 0.253122    
genre_list.xNews         1.122e+00  1.275e-01   8.800  < 2e-16 ***
genre_list.xReality-TV   3.822e+00  1.275e-01  29.978  < 2e-16 ***
genre_list.xRomance      4.322e-01  6.845e-02   6.314 3.14e-10 ***
genre_list.xSci-Fi      -7.441e-02  7.534e-02  -0.988 0.323422    
genre_list.xShort        7.698e-01  7.820e-02   9.844  < 2e-16 ***
genre_list.xSport        8.442e-01  1.008e-01   8.376  < 2e-16 ***
genre_list.xTalk-Show   -1.028e+00  1.275e-01  -8.065 1.05e-15 ***
genre_list.xThriller    -5.102e-01  7.089e-02  -7.196 7.79e-13 ***
genre_list.xWar          1.575e+00  7.916e-02  19.900  < 2e-16 ***
genre_list.xWestern      3.224e-01  8.361e-02   3.857 0.000117 ***
as.factor(decade)1930    9.978e-01  5.862e-01   1.702 0.088844 .  
as.factor(decade)1940    7.319e-01  5.855e-01   1.250 0.211392    
as.factor(decade)1950    1.095e+00  5.857e-01   1.869 0.061728 .  
as.factor(decade)1960    8.176e-01  5.861e-01   1.395 0.163125    
as.factor(decade)1970    1.003e+00  5.855e-01   1.713 0.086899 .  
as.factor(decade)1980    1.112e+00  5.851e-01   1.900 0.057532 .  
as.factor(decade)1990    8.100e-01  5.853e-01   1.384 0.166487    
as.factor(decade)2000    7.799e-01  5.850e-01   1.333 0.182578    
as.factor(decade)2010    5.896e-01  5.849e-01   1.008 0.313542    
as.factor(decade)2020    5.693e-01  5.852e-01   0.973 0.330753    
count                    1.001e-19  1.587e-04   0.000 1.000000    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5827 on 2966 degrees of freedom
  (147 observations deleted due to missingness)
Multiple R-squared:  0.5631,    Adjusted R-squared:  0.5578 
F-statistic: 106.2 on 36 and 2966 DF,  p-value: < 2.2e-16
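
To read the coefficient table, a fitted rating is the intercept plus the matching genre and decade coefficients. The sketch below redoes that arithmetic in Python with a few estimates copied from the output above. It assumes the omitted baseline levels are the 1920 decade and the alphabetically first genre (probably Action), which the output implies but does not state. The `count` term is left out because its estimate is effectively zero.

```python
# Estimates copied from the lm() summary above (a subset only).
INTERCEPT = 4.889
GENRE = {"Documentary": 1.410, "Horror": -0.5045, "Crime": 0.02086}
DECADE = {1980: 1.112, 2010: 0.5896, 2020: 0.5693}


def effect(table, level):
    """Coefficient for a level; None means the baseline level (effect 0)."""
    return 0.0 if level is None else table[level]


def fitted_rating(genre=None, decade=None):
    """Intercept + genre effect + decade effect.

    The count term (estimate 1.001e-19) is ignored as negligible.
    """
    return INTERCEPT + effect(GENRE, genre) + effect(DECADE, decade)


doc_2010 = fitted_rating("Documentary", 2010)  # 4.889 + 1.410 + 0.5896
horror_2010 = fitted_rating("Horror", 2010)    # 4.889 - 0.5045 + 0.5896
gap = doc_2010 - horror_2010                   # 1.410 - (-0.5045)
```

With these numbers, a 2010s documentary is fitted at about 6.89 and a 2010s horror movie at about 4.97, a gap of roughly 1.91 rating points that comes entirely from the two genre coefficients.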

Discussion

Top genre per decade, by movie count and by average rating:

Decade   By count   By avg rating
1920s    Comedy     Comedy
1930s    Drama      Family
1940s    Comedy     Music
1950s    Drama      Family
1960s    Drama      Animation
1970s    Family     Music
1980s    Family     History
1990s    Family     War
2000s    Comedy     War
2010s    Comedy     Biography
2020s    Romance    Documentary

Question 2

Introduction

How did Christmas, Hanukkah, and Kwanzaa movie distribution change over the years? How different are their average ratings?

Approach

  • Variables: ‘christmas’, ‘hanukkah’, ‘kwanzaa’, ‘year’, and ‘average_rating’.

  • Line chart –> Visualize how the number of movies for each holiday changes over the years / makes trends easy to compare
  • Bar chart –> Visualize average rating per holiday / straightforward comparison
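
The two summaries behind these charts can be sketched as below. This is a minimal Python illustration with made-up rows, not the code used for the slides. It assumes the holiday flags are booleans, as described in the dataset overview, and that a movie matching several keywords is counted under each of them.

```python
from collections import Counter, defaultdict

HOLIDAYS = ("christmas", "hanukkah", "kwanzaa")

# Hypothetical sample rows; the flag names mirror the dataset's columns.
movies = [
    {"year": 2019, "average_rating": 6.0, "christmas": True, "hanukkah": False, "kwanzaa": False},
    {"year": 2019, "average_rating": 7.0, "christmas": True, "hanukkah": False, "kwanzaa": False},
    {"year": 2019, "average_rating": 6.5, "christmas": False, "hanukkah": True, "kwanzaa": False},
    {"year": 2020, "average_rating": 5.0, "christmas": True, "hanukkah": False, "kwanzaa": False},
    {"year": 2020, "average_rating": 8.0, "christmas": False, "hanukkah": False, "kwanzaa": True},
]


def movies_per_year(rows):
    """Count movies per (holiday, year): the data behind the line chart."""
    counts = Counter()
    for row in rows:
        for holiday in HOLIDAYS:
            if row[holiday]:
                counts[(holiday, row["year"])] += 1
    return counts


def mean_rating_per_holiday(rows):
    """Mean average_rating per holiday: the data behind the bar chart."""
    ratings = defaultdict(list)
    for row in rows:
        for holiday in HOLIDAYS:
            if row[holiday]:
                ratings[holiday].append(row["average_rating"])
    return {holiday: sum(vals) / len(vals) for holiday, vals in ratings.items()}


counts = movies_per_year(movies)
means = mean_rating_per_holiday(movies)
```

Reporting the number of movies behind each mean alongside the bar chart helps when one holiday has far fewer titles than the others.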

Analysis

Discussion - Line plot

Holiday Movie Distribution Over Years

  • Significant increase in Christmas movies over time, with a sharp rise in recent years
  • Relative scarcity of Hanukkah and Kwanzaa movies throughout the years
  • Suggests a stronger cultural or commercial emphasis on Christmas-themed content in the film industry

Discussion - Bar plot

Average Ratings

  • Highest to lowest average rating: Kwanzaa –> Hanukkah –> Christmas movies
  • Potential quality-over-quantity scenario (holidays with fewer movies receive more favorable ratings)
  • May also reflect the rating behavior of a niche audience

Q&A

Thank you for listening! Any questions?