library(tidyverse)
Project proposal
Dataset
Australian Frogs Dataset
This dataset contains records from the sixth annual release of data from the FrogID initiative, a citizen science project in Australia that collects frog call recordings through a mobile app. Volunteers record frog calls, which are then identified by museum experts and used to support research in frog ecology, taxonomy, and conservation. FrogID data has contributed to more than 30 scientific publications.
Australia is home to approximately 257 native frog species, many of which are found nowhere else in the world. However, nearly one in five species is threatened with extinction due to pressures such as climate change, urbanization, disease, and invasive species. This dataset helps researchers monitor frog populations and better understand environmental threats affecting amphibian biodiversity.
Loading in the Datasets:
## Loading in datasets:
frog_names_df <- read_csv("data/frog_names.csv")
Rows: 294 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): subfamily, tribe, scientificName, commonName, secondary_commonNames
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
frogID_df <- read_csv("data/frogID_data.csv")
Rows: 136621 Columns: 11
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): scientificName, timezone, stateProvince
dbl (6): occurrenceID, eventID, decimalLatitude, decimalLongitude, coordina...
date (1): eventDate
time (1): eventTime
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Datasets dimensions:
frog_names_dim <- dim(frog_names_df)
frogID_dim <- dim(frogID_df)
Technical Dataset Description
We are combining two datasets with the following provenance:
- Data obtained from TidyTuesday (2025-09-02)
- Source: https://github.com/rfordatascience/tidytuesday
The dimensions of these datasets are as follows:
- frog_names_df: 294 rows and 5 columns
- frogID_df: 136621 rows and 11 columns
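Both files contain a scientificName column (see the column specifications above), so one plausible way to combine them is a left join on that key, followed by a check for records that find no match. The sketch below uses small made-up tibbles (`frogID_toy`, `frog_names_toy`, and all species and subfamily labels are hypothetical placeholders), not the real data.

```r
library(tidyverse)

# Toy stand-ins for frogID_df and frog_names_df; all values are placeholders
frogID_toy <- tibble(
  occurrenceID   = 1:4,
  scientificName = c("Species a", "Species a", "Species b", "Species z")
)
frog_names_toy <- tibble(
  scientificName = c("Species a", "Species b"),
  subfamily      = c("Subfamily X", "Subfamily Y")
)

# Keep every observation and attach taxonomy where a name matches
combined <- left_join(frogID_toy, frog_names_toy, by = "scientificName")

# Observations whose scientificName is absent from the names table
unmatched <- anti_join(frogID_toy, frog_names_toy, by = "scientificName")
```

Inspecting `unmatched` on the real data would show whether naming differences between the two files need to be reconciled before subfamily-level analysis.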
Why This Dataset:
Australia is famous for its unique and diverse wildlife, and its frogs are a particularly fascinating, yet vulnerable, part of that ecosystem. We chose this dataset because it offers a niche look at how technology bridges the gap between public engagement and scientific research. By exploring the connection between app users and real-world conservation, we aim to create visualizations that highlight both the beauty of these species and the environments they inhabit.
Questions
- How does frog species richness vary across Australian states and across seasons?
- How do calling times (hour of day) vary across frog subfamilies and geographic regions?
Analysis plan
A plan for answering each of the questions including the variables involved, variables to be created (if any), external data to be merged in (if any).
Question 1:
Outcome Variable:
- Species richness (number of distinct scientificName)
Explanatory variables:
- stateProvince
- season (from eventDate)
Grouping Variables:
- stateProvince
- season
New variables to create:
- month (from eventDate)
- season (Winter, Spring, Summer, Fall)
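The Question 1 plan could be sketched roughly as follows. This is a minimal illustration on a made-up tibble (`frogID_toy` and its values are hypothetical), and it assumes Southern Hemisphere meteorological seasons (December to February as Summer, and so on), which the proposal does not specify.

```r
library(tidyverse)

# Toy stand-in for frogID_df; values are placeholders, not real records
frogID_toy <- tibble(
  scientificName = c("Species a", "Species b", "Species a", "Species c", "Species a"),
  stateProvince  = c("State 1", "State 1", "State 1", "State 2", "State 2"),
  eventDate      = as.Date(c("2023-01-15", "2023-01-20", "2023-07-04",
                             "2023-10-09", "2023-10-30"))
)

richness <- frogID_toy %>%
  mutate(
    month = lubridate::month(eventDate),
    # ASSUMPTION: Southern Hemisphere meteorological seasons
    season = case_when(
      month %in% c(12, 1, 2) ~ "Summer",
      month %in% 3:5         ~ "Fall",
      month %in% 6:8         ~ "Winter",
      month %in% 9:11        ~ "Spring"
    )
  ) %>%
  group_by(stateProvince, season) %>%
  # Species richness = number of distinct species per state and season
  summarise(richness = n_distinct(scientificName), .groups = "drop")
```

Because the number of recordings differs between states and seasons, raw richness counts will partly reflect sampling effort; that is a caveat to keep in mind when interpreting the result.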
Question 2:
Outcome variable:
- Hour of day (from eventTime)
Explanatory variables:
- subfamily
- stateProvince
Grouping variables:
- subfamily
- stateProvince
New variables to create:
- hour (from eventTime)
- time_of_day category (dawn, dusk, day, night)
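The two derived variables for Question 2 might be created along these lines. The sketch uses a made-up tibble (`frogID_toy` is hypothetical) and assumes that subfamily has already been merged in from frog_names_df by scientificName and that eventTime is recorded in local time. The hour cut-offs for dawn, day, dusk, and night are placeholder assumptions, not values from the proposal.

```r
library(tidyverse)

# Toy stand-in for the merged data; values are placeholders
frogID_toy <- tibble(
  subfamily     = c("Subfamily X", "Subfamily X", "Subfamily Y", "Subfamily Y"),
  stateProvince = c("State 1", "State 1", "State 2", "State 2"),
  # read_csv parses time columns as hms, which stores seconds since midnight
  eventTime     = hms::as_hms(c("05:30:00", "13:15:00", "18:45:00", "23:10:00"))
)

calling <- frogID_toy %>%
  mutate(
    # Whole hours since midnight
    hour = as.numeric(eventTime) %/% 3600,
    # ASSUMPTION: illustrative cut-offs; real dawn and dusk vary by date and latitude
    time_of_day = case_when(
      hour >= 5  & hour < 7  ~ "dawn",
      hour >= 7  & hour < 17 ~ "day",
      hour >= 17 & hour < 19 ~ "dusk",
      TRUE                   ~ "night"
    )
  )
```

Fixed cut-offs are the simplest option; sunrise and sunset times computed from each record's date and coordinates would be a more precise alternative.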