From the Deputy Director for Intramural Research
Data: Not Just Another Four-Letter Word
As I think about the new mandate for managing and sharing scientific data (https://sharing.nih.gov/data-management-and-sharing-policy), which goes into effect on January 25, 2023, I can’t help but think about the character Data, played by Brent Spiner on Star Trek: The Next Generation. He was critical to the crew of the Starship Enterprise because he could manage and share data. But his usefulness diminished when he managed it in a way that did not answer the questions being asked or when he shared it in language that seemed unintelligible.
Similarly, it is simple to say that anyone whose research is funded by the federal government should manage their data and make them accessible. Indeed, proper data management and sharing is a critical component of team science, and it underscores the importance of accountability, transparency, scientific rigor, and research integrity in serving the overarching goal of improving human health and well-being. But what does “data management” mean, and at what stage, how, and with whom should data best be shared?
First, the new Data Management and Sharing Policy (DMS Policy) will require every intramural and extramural investigator conducting NIH-funded research that will generate scientific data to develop and have approved a DMS plan. Intramural plans for each institute and center (IC) will be reviewed and approved by that IC’s scientific director or designee. Investigators and project leads will need to submit a DMS plan for ongoing and new research. In addition, clinical research protocols must include a plan along with other materials submitted for IC initial scientific review.
Extramural plans will be reviewed and assessed by NIH program staff.
In the data-management portion of the plan, the investigator will need to make clear what types of data are to be archived and shared; what the relevant metadata are and what methods were used to generate and acquire them; what software and other tools are needed to access and work with them; and, if relevant, what “standards” (things like unique identifiers, data dictionaries, and formats) apply to these data. In the data-sharing portion, the investigator must specify the repositories in which data will be archived and how the data will be made findable and identifiable. The investigator should also provide a timeline for depositing data in an accessible repository. More information on creating and writing such a plan can be found at https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-DMS/writing-a-data-management-and-sharing-plan.
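For investigators who like to think in checklists, the plan elements described above can be sketched as a small Python structure. This is only an illustration: the class name, field names, and the `missing_elements` helper are hypothetical and do not correspond to any official NIH form, schema, or tool. “Standards” is treated as optional because the policy asks for them only if relevant.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DMSPlanDraft:
    """Hypothetical checklist for drafting a DMS plan (not an NIH schema)."""
    # Data-management portion
    data_types: List[str] = field(default_factory=list)   # data to be archived and shared
    metadata_and_methods: str = ""                         # metadata and how the data were generated
    tools_needed: List[str] = field(default_factory=list)  # software needed to access the data
    standards: List[str] = field(default_factory=list)     # identifiers, dictionaries, formats (if relevant)
    # Data-sharing portion
    repositories: List[str] = field(default_factory=list)  # where the data will be archived
    findability: str = ""                                   # how the data will be found and identified
    deposit_timeline: str = ""                              # when the data will be deposited


# Assumption for this sketch: only "standards" may be left empty.
OPTIONAL_FIELDS = {"standards"}


def missing_elements(plan: DMSPlanDraft) -> List[str]:
    """Return the names of required elements that are still empty."""
    return [
        f.name
        for f in fields(plan)
        if f.name not in OPTIONAL_FIELDS and not getattr(plan, f.name)
    ]


draft = DMSPlanDraft(
    data_types=["confocal microscopy images"],
    repositories=["(repository to be chosen)"],
)
print(missing_elements(draft))
```

Run on the partially filled draft above, the helper lists the four elements that still need attention before the plan goes to a reviewer.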
Given the rate at which science and technology change, DMS plans are not cast in stone. They can be updated, revised, rereviewed, and reapproved during the lifetime of any given project.
The spirit of this policy is, I believe, already part of the DNA of any credible science or scientific institution. Unlike many industries, the biomedical research enterprise has always thrived on challenging its own findings. It has tested and retested results and has specified methods and shortcomings in published papers. Many scientific communities have long shared information in accessible databases across laboratories. They have also harmonized pre-database data so that they can be credibly and productively added to existing databases. But I am afraid that even the best mom-and-apple-pie approach poses challenges when we consider the details of implementation.
For example, datasets have become larger and larger. (Think of imaging and microscopy datasets.) If a clinical research dataset includes imaging, pathology, physiology, body-fluid chemistry, history, and physical examination data, in what repository can all that information be stored and how can it be shared so that users can easily find everything in one place? Please forgive the pun, but something tells me the digital weather in Bethesda is about to get much “cloudier” as supersized datasets outgrow NIH’s data-storage infrastructure and must be sent to remote storage systems, a.k.a. the cloud.
No one is asking investigators to put whole laboratory notebooks or their patients’ entire electronic medical records in accessible databases. But at what point in the processing, analyzing, and managing effort should scientific data become sharable? The short answer is “by the time of publication.” What about negative data that are not likely to be published in conventional journals? And how will we deal with the deficiencies of science literacy (a problem that is at least partially of our inadvertent making) in those who, along with the scientific community, will have access to the data but not necessarily to the science education and reasoning skills to understand them?
Although NIH has been working for many months to develop, disseminate, and implement strategies and instruments to facilitate data management and sharing, there remains much work to be done in ensuring and enabling compliance with this policy by the January 25, 2023, deadline. For those who work at NIH, the Office of Intramural Research (OIR) will serve as a resource and facilitator, enabling ICs to empower their investigators.
But make no mistake—the need to solve anticipated as well as unanticipated challenges will mean that this process is iterative and continuously improving. Like all such processes, it will proceed most smoothly and efficiently if done as a team sport. The OIR stands ready to be its hub, and we know we can count on the intramural research programs of every IC to share data as well as best practices and tips for overcoming challenges as we render this process optimal for our science, our patients and their loved ones, our country, and our international colleagues. Like Data, we must keep the mission of our “Enterprise” at the forefront and tailor our process and product to its accomplishment.
For more information about the new DMS Policy, go to https://sharing.nih.gov/data-management-and-sharing-policy. Address questions for OIR to Charles Dearolf (dearolfc@mail.nih.gov). For an overview of NIH DMS policies, go to https://sharing.nih.gov; for the latest news and events on the policies, go to https://sharing.nih.gov/news-events.
Nina F. Schor served as the Acting Deputy Director for Intramural Research (DDIR) beginning August 1, 2022, succeeding Michael Gottesman, who was the DDIR for 29 years. She became the official DDIR on November 6, 2022. Read about her career in the first DDIR essay she wrote for The NIH Catalyst (September-October 2022 issue).
This page was last updated on Wednesday, November 9, 2022