Race and ethnicity data on Virginia’s state dashboard is structured unlike that of any other state. Rather than displaying race and ethnicity data for each individual county, the Virginia Department of Health groups it by health district, and each district consists of multiple localities. For example, the Lenowisco health district consists of Wise, Scott, and Lee counties and the independent city of Norton.
In some instances, health districts are used instead of counties to protect the anonymity of affected individuals: when a demographic group within a county is very small, publishing a patient's background could reveal their identity. After contacting Virginia’s State Department of Health, we were directed to Sections 32.1-36, 32.1-38, and 32.1-41 of the Code of Virginia, which require the Department of Health to report at the health district level to protect the anonymity of patients during a disease investigation. However, when a state reports at the health district level instead of the county level, it becomes difficult to determine how many of a district's cases belong to each county.
The goal of the Health Equity data collection team of the COVID-19 Data Project is to standardize our data and adapt the information displayed by each state or county to fit within the U.S. Census Bureau’s parameters. To do this, we organize our data into 11 categories: white, Black/African American, Asian, American Indian/Alaska Native, Native Hawaiian/Pacific Islander, Multiracial, Other, and Unknown for race, and Hispanic, Non-Hispanic, or Unknown for ethnicity. This information is collected at the county level; in fact, it is currently the only dataset to do so. A state that reports at the health district level rather than the county level therefore complicates our work significantly, because it makes analyzing and visualizing our data much more difficult.
To solve this problem, we used the 2014-2018 American Community Survey data to estimate the demographic breakdown of each county in Virginia. We then used these estimates to redistribute the case totals in each health district proportionally. For instance, the Lenowisco health district contains four localities (Wise, Scott, and Lee counties and the city of Norton) and had 203 white cases on August 1, 2020. Within this health district, the white population was distributed as follows: 42.8% in Wise County, 25.9% in Scott County, 27.0% in Lee County, and 4.3% in Norton. We used these proportions to estimate county-level statistics by redistributing the 203 white cases proportionally: roughly 86.8 cases in Wise County, 52.6 in Scott County, 54.9 in Lee County, and 8.7 in Norton. Each estimate was then rounded to the nearest whole number.
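The redistribution step described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline. The function name `redistribute_cases` and the dictionary layout are assumptions made for this example, and the shares are the rounded percentages quoted above rather than the underlying ACS population counts.

```python
def redistribute_cases(district_cases, county_shares):
    """Split a district-level case count across counties.

    district_cases: total cases reported for one group in a health district.
    county_shares: mapping of county name -> that county's share of the
        district's population for the same group (hypothetical input format).
    Returns a mapping of county name -> unrounded case estimate.
    """
    total_share = sum(county_shares.values())
    if total_share <= 0:
        raise ValueError("county shares must sum to a positive number")
    # Normalize so the estimates add up to the district total even when
    # the supplied shares are rounded and do not sum to exactly 1.
    return {
        county: district_cases * share / total_share
        for county, share in county_shares.items()
    }


# Rounded white-population shares for the Lenowisco example in the text.
lenowisco_white_shares = {
    "Wise": 0.428,
    "Scott": 0.259,
    "Lee": 0.270,
    "Norton": 0.043,
}

estimates = redistribute_cases(203, lenowisco_white_shares)

# Round each county's estimate to the nearest whole case.
rounded = {county: round(value) for county, value in estimates.items()}
print(rounded)
```

Because the sketch starts from rounded percentages, its decimal estimates can differ slightly from figures computed with unrounded population shares. Rounding each county independently can also make the county figures add up to a case more or fewer than the district total: here the rounded values sum to 204 rather than 203.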
We applied this methodology to every health district in Virginia across all of our data, which dates back to June 15, 2020. Our dataset for Virginia grows every day, and the health department continues to publish data at the health district level. Despite this obstacle, we will continue to use our methodology to convert the data from the health district level to the county level so that researchers can use our data for broader analysis.