Chapter 19 Statistical Sample Quotas Using Clustering Model

19.1 What is the Statistical Sample Quotas Using Clustering Model?

The Statistical Sample Quotas Using Clustering Model leverages naturally occurring biological groupings and correlation in disease status to plan statistically valid simple random sampling designs for disease surveillance. Outputs provide statistically robust assurance that chronic wasting disease (CWD) prevalence is at or below 1%, 1.5%, 2%, 3%, 4%, or 5% in each sub-administrative area.

19.2 What Question Does it Answer?

Question 1. How many randomly selected hosts must I test, without finding a positive case, to have a high probability that disease prevalence in the population is at or below 1%, 1.5%, 2%, 3%, 4%, or 5%? The model assumes simple random sampling and accounts for the clustering of hosts to determine how many hosts must test negative to give a 95% probability that CWD prevalence in the greater population is at or below the desired level.

IMPORTANT: This tool does not consider sampling schemes other than simple random sampling. If you intend to use high-harvest sampling (where, for example, some clusters are sampled more heavily than others as a result of hunter access, convenience, or harvest decisions), use the Efficient Sample Size Calculator instead.
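To give a feel for the scale of the numbers involved in Question 1, the sketch below computes the classical sample size for simple random sampling from a large population. It is a simplified baseline and not the model described in this chapter: it ignores clustering and correlation in disease status entirely, and it uses the standard detection-probability formula rather than the Bayesian approach in the cited papers. The function name `baseline_sample_size` and its arguments are assumptions made for illustration.

```python
import math


def baseline_sample_size(design_prevalence, sensitivity=1.0, confidence=0.95):
    """Smallest n such that, if prevalence were exactly `design_prevalence`,
    n randomly sampled hosts would include at least one test-positive host
    with probability >= `confidence`.

    Assumes independent hosts and a very large population. Clustering is
    NOT considered, so this is only a reference point and will differ from
    the quotas produced by the model in this chapter.
    """
    if not 0 < design_prevalence < 1:
        raise ValueError("design_prevalence must be between 0 and 1")
    if not 0 < sensitivity <= 1:
        raise ValueError("sensitivity must be in (0, 1]")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")

    # Chance that one randomly sampled host tests positive.
    p_positive = sensitivity * design_prevalence

    # Solve (1 - p_positive) ** n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_positive))


if __name__ == "__main__":
    # Prevalence thresholds offered by the model, expressed as proportions.
    for threshold in (0.01, 0.015, 0.02, 0.03, 0.04, 0.05):
        n = baseline_sample_size(threshold, sensitivity=0.95)
        print(f"threshold {threshold:.1%}: test {n} hosts with no positives")
```

With a perfect test, this baseline gives 299 hosts for a 1% threshold and 59 hosts for a 5% threshold. The quotas on the model's output map may differ because the model also uses the average cluster size and the correlation in disease status.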

19.3 Output Details

Sample size: A map containing the number of hosts that must be randomly sampled in each sub-administrative area without finding a positive case to have 95% probability that the underlying disease prevalence is at or below the desired level.

19.4 Abbreviated Tutorial

Choose your model parameters (see below).
Run the model with those desired population parameters.
Look at the map to see the number of random animals that need to be tested without finding a positive case to ensure there is a high probability that disease prevalence in the population is at or below your desired level.
Explore the model logs, input file, and output files used in the run.
If the model did not run, check the model logs to see which required data were missing.

Parameters Needed to Execute the Model

Model type: Select ‘Statistical Sample Quotas Using Clustering Model’ from the drop-down list.
Reference name: Label the run.
(Optional) Applicable season year: Label the season-year. This label is not used in the model execution and is intended to assist the provider in documenting the model execution.
(Optional) Notes: Enter any additional remarks about the run.
Sensitivity of the diagnostic test: The performance of the diagnostic test in declaring a true positive. A decimal between 0 (not sensitive: test will not appropriately declare a true positive) and 0.999 (nearly perfect sensitivity: test has high performance in declaring a true positive).
Average cluster size: The average number of hosts per cluster in the population of each sub-administrative area. An integer between 1 (1 host per cluster in the population) and the total population size (1 cluster in the entire population). Note: The software will automatically ensure that your cluster size does not exceed the population size.
Correlation in disease status: Correlation in disease status between hosts sharing a cluster in each sub-administrative area. A decimal between 0 (disease status among hosts in a cluster is independent) and 0.995 (disease status is nearly perfectly correlated among hosts sharing a cluster).
Host density: The number of hosts that reside in one square kilometer of land area. Provide either this value or the population size.
Population size: The number of hosts that reside in the sub-administrative area. Provide either this value or the host density.
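
The ranges listed above can be checked before a run is submitted. The following is a minimal sketch, not the tool's own validation code. The function name `check_parameters`, the `area_km2` argument, the requirement that exactly one of host density and population size be supplied, and the rounding of density times area to a whole population size are all assumptions made for illustration.

```python
from typing import Optional


def check_parameters(
    sensitivity: float,
    average_cluster_size: int,
    correlation: float,
    population_size: Optional[int] = None,
    host_density: Optional[float] = None,
    area_km2: Optional[float] = None,  # assumed input, not a documented parameter
) -> dict:
    """Validate one sub-administrative area's inputs against the documented ranges."""
    if not 0 <= sensitivity <= 0.999:
        raise ValueError("sensitivity must be between 0 and 0.999")
    if not 0 <= correlation <= 0.995:
        raise ValueError("correlation must be between 0 and 0.995")

    # Assumption: exactly one of population size or host density is given.
    if (population_size is None) == (host_density is None):
        raise ValueError("give either population_size or host_density, not both")

    if population_size is None:
        # Assumption: population size is density (hosts per km^2) times land area.
        if area_km2 is None or area_km2 <= 0 or host_density <= 0:
            raise ValueError("host_density and area_km2 must both be positive")
        population_size = round(host_density * area_km2)

    if population_size < 1:
        raise ValueError("population size must be at least 1")
    if not isinstance(average_cluster_size, int) or average_cluster_size < 1:
        raise ValueError("average_cluster_size must be an integer >= 1")

    # Mirror the documented behavior: cluster size never exceeds population size.
    cluster_size = min(average_cluster_size, population_size)

    return {
        "sensitivity": sensitivity,
        "average_cluster_size": cluster_size,
        "correlation": correlation,
        "population_size": population_size,
    }


if __name__ == "__main__":
    print(check_parameters(0.95, 5, 0.2, host_density=4.0, area_km2=50.0))
```

Capping the cluster size rather than rejecting it follows the note under "Average cluster size"; the other checks raise an error so that out-of-range values are caught before the run.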

19.5 Details on the Theory

Booth JG, Hanley BJ, Hodel FH, Jennelle CS, Guinness J, Them CE, Mitchell CI, Ahmed MS, Schuler KL. 2024. Sample Size for Estimating Disease Prevalence in Free-Ranging Wildlife Populations: A Bayesian Modeling Approach. Journal of Agricultural, Biological, and Environmental Statistics, 29, 438–454. https://doi.org/10.1007/s13253-023-00578-7.

Booth JG, Hanley BJ, Thompson NE, Gonzalez-Crespo C, Christensen SA, Jennelle CS, Caudell JN, Delisle Z, Guinness J, Hollingshead NA, Them CT, Schuler KL. Management Agencies Can Leverage Animal Social Structure for Wildlife Disease Surveillance. Journal of Wildlife Diseases. https://doi.org/10.7589/JWD-D-24-00079.

19.6 Code

The code is publicly available at https://github.com/Cornell-Wildlife-Health-Lab/statistical-sample-quotas-using-clustering-model.