Over three percent of the data in the most-cited datasets was deemed inaccurate or mislabeled.