Citizen scientist data collection increases risk of error
November 13, 2014
Using volunteer, amateur “citizen scientists” to collect data, such as types of trees in a forest, would seem like an ideal way for budget-strapped agencies and organizations to augment their workforce.
Unfortunately, citizen scientist-collected data has the potential for a significant degree of error, according to a study by scientists from four universities.
“There has been a movement in science lately to use average citizens to collect data,” said Carolyn Copenheaver, associate professor of forest ecology in Virginia Tech’s College of Natural Resources and Environment and a co-author on the study.
“These people receive a little bit of training — a day at most, but often only 30 minutes — and then they collect data that are used by scientists. However, this movement is happening without really checking the accuracy of the data collected by these citizen scientists,” she added.
A study with high school students in Washington, D.C., and Atlanta evaluated the accuracy of data the students collected at two sites during seven sessions over the course of three years.
The study results, authored by representatives from the University of Georgia, Virginia Tech, the University of Florida, and Appalachian State University, appear in the June 2014 issue of the North American Colleges and Teachers of Agriculture’s NACTA Journal.
“It takes a lot of training to do scientific research well,” said Copenheaver, a point the study confirmed.
To compare citizen-collected data with data collected by professional researchers, the team looked at how accurately high school students measured tree diameters and identified trees within a fixed plot, and at how much that accuracy depended on the scientific background of the adult instructor.
Environmental science and earth science students completed five sampling periods at Mason Neck National Wildlife Refuge in northeastern Virginia, and agriculture students completed two sampling periods at Indian Springs State Park in central Georgia.
At both of the study areas, researchers from Virginia Tech and the University of Georgia collected data in advance to provide an answer key for comparing the accuracy of the student-collected data.
Revisions were made to teaching approaches after each field trip to improve the educational experience and quality of data collected. Approaches included having students trained directly by university faculty members, having university faculty members train high school teachers to be the instructors, using graduate students as instructors, and using undergraduate students as instructors.
The results were best when students were instructed by trained university faculty members. Data accuracy was similar for students from elective agriculture and environmental science classes; however, students in mandatory earth science classes had less accurate results.
“Motivation is important,” said Copenheaver.
To identify tree species, students used photographs of buds, leaves, and bark. The researchers coded responses as correct, semi-correct, or incorrect. An example of a semi-correct response was correctly identifying a tree as a maple (genus), but incorrectly identifying the species — red maple rather than sugar maple, for instance.
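The three-way coding described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual procedure: the function name `code_response`, the choice to represent each tree as a (genus, species) pair, and the sample answers are all assumptions made for the example.

```python
from collections import Counter

def code_response(student, key):
    """Code one identification against the answer key.

    Both arguments are (genus, species) pairs, an assumed representation.
    """
    if student == key:
        return "correct"
    if student[0] == key[0]:
        # Right genus, wrong species (e.g. red maple given for sugar maple).
        return "semi-correct"
    return "incorrect"

# Made-up answer key and student answers, for illustration only.
answer_key = [("Acer", "saccharum"), ("Acer", "rubrum"), ("Quercus", "alba")]
student_ids = [("Acer", "rubrum"), ("Acer", "rubrum"), ("Pinus", "taeda")]

tally = Counter(code_response(s, k) for s, k in zip(student_ids, answer_key))
print(tally)
```

With these invented answers, each of the three categories is counted once.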
Students who measured the more-diverse forest in Georgia were able to identify 80 percent of the tree species correctly, while students working in the less-diverse Virginia forest were able to identify 97 percent of the tree species correctly.
“Thus, citizen scientist programs in regions with high biodiversity are likely to have more errors in species identification,” said Copenheaver. “This is of great concern if citizen scientists are used to collect data and monitor ecosystem changes.”
The most significant error rates came when students were challenged to identify the research plot area, the researchers reported.
“The accuracy of data collected by high school citizen scientists increased in plots where researchers placed metal tags on all trees that needed to be sampled (6 percent error rate), rather than having students establish the plot dimensions with measuring tapes and determine for themselves what trees were in or out of their sampling plot (95 percent error rate).”
The researchers recommend the following when enlisting citizen scientists to gather data:
- Use experienced researchers to train citizen scientists.
- Provide on-site demarcations to indicate what areas should be sampled.
- Recruit individuals who have taken agriculture or science classes as part of their high school curriculum.
- Limit citizen scientist programs to regions with lower biodiversity.
This study was funded by a U.S. Department of Agriculture Higher Education Challenge Grant.