From ToxPipe to FAIRkit
NIH-Built AI Chatbots Are Helping Scientists Sift Through the Data
BY PAIGE JARREAU, NIA; and THE NIH CATALYST STAFF

NIH researchers are creating an array of AI-powered tools to address unique intramural challenges.
Artificial intelligence (AI) tools are taking root across NIH, reshaping how researchers access information, analyze data, and advance biomedical discovery. From generative chatbots that streamline scientific queries to machine learning models that help harmonize massive datasets, AI is proving to be a powerful partner in tackling complex questions in research areas spanning toxicology, dementia, and beyond. There are many, so let’s chat about ’em!
Chatbot creation for NIH biomedical research
Speakers at a June 11 NIH Library event that featured members from the NIH Generative AI Community of Practice showcased a range of AI-driven chatbot initiatives under development across the agency. Speakers and topics at a roundtable discussion, archived on the NIH Library YouTube channel, included:
- “Generative AI Chatbots in the NIH Landscape: Foundations, Opportunities, and Considerations” by Alicia Lillich, NIH Library
- “Chatbot for the Intramural Research Program, or ChIRP,” by Steevenson Nelson, OD
- “ToxPipe: Chatbots and Retrieval-Augmented Generation on Toxicological Data Streams” by Trey Saddler, NIEHS
- “CARDbiomedbench: Biomedical Benchmark of Chatbots, CARD.AI Arena, CARD.AI, FAIRkit” by Faraz Faghri, NIA
- “AI Chatbots: Opportunities and Considerations at NLM” by Dianne Babski, NLM
- “Using AI to Create a Travel Chatbot” by Fiona Vaughans, NCI
Lillich, an emerging technology specialist at the NIH Library, presented opportunities and ethical considerations in deploying generative AI chatbots within NIH’s information ecosystem. These tools, which use large language models (LLMs) to assist researchers, have the potential to improve literature discovery, scientific synthesis, and internal knowledge management, she noted.
Nelson shared ChIRP updates and reminded attendees that it was designed to respond to NIH intramural queries within a more secure environment than public LLMs offer. Read more about ChIRP in this recent NIH Catalyst article. ChIRP is being reimagined as a tool for all of NIH, encompassing both research and administrative tasks. The new and improved tool will be rolled out to the entire NIH community later this summer.
Saddler, a data scientist in the NIEHS Division of Translational Toxicology, highlighted ToxPipe, a chatbot-enabled platform that lets users explore toxicology databases through an intuitive interface powered by LibreChat. Saddler also demonstrated ChemBioTox, which uses autonomous AI agents to answer toxicology questions such as “What are the exposure levels of bisphenols?” Responses are generated through multistep reasoning, and the functionality can be evaluated via open-source tools that allow scientists to rate accuracy and refine results.
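Retrieval-augmented generation of the kind Saddler described follows a retrieve-then-prompt pattern: fetch the most relevant records from a trusted database, then hand them to the LLM as context. The sketch below is illustrative only; the toxicology snippets are invented, and a simple bag-of-words similarity stands in for the learned embeddings a real system such as ToxPipe would use.

```python
from collections import Counter
import math

# Hypothetical toxicology snippets standing in for a real database.
DOCS = [
    "Bisphenol A exposure levels in urine samples averaged 1.2 ng/mL.",
    "Acetaminophen hepatotoxicity is dose dependent above 4 g per day.",
    "Bisphenol S is a common substitute with similar estrogenic activity.",
]

def bow(text):
    """Bag-of-words vector; a production system would use learned embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=2):
    """Rank documents by similarity to the query and keep the top k."""
    q = bow(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, bow(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Compose the grounded prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What are the exposure levels of bisphenols?"))
```

The key design point is that the model answers from retrieved records rather than from its training data alone, which makes responses easier to audit and rate for accuracy.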
Faghri, from NIA’s Center for Alzheimer’s and Related Dementias (CARD), presented several AI-driven platforms, including CARDBiomedBench, CARD.AI Arena, and FAIRkit. These tools, which are described in more detail below, “are using AI to better describe diseases, predict disease progression, and identify new drug targets,” said Faghri, a computer scientist in the advanced analytics expert group at CARD. “AI-powered tools are helping us solve problems that weren’t solvable before.”

CREDIT: NIA, CARD
Faraz Faghri
Advancing data harmonization
CARD’s advanced analytics expert group is applying AI to one of biomedical research’s greatest challenges—data harmonization. Different research groups collect different types of patient data, including genetic profiles, imaging, and environmental exposures. These datasets are often incompatible by default, which complicates, if not impedes, large-scale analyses. Manually standardizing so many data points is impractical, yet standardization is increasingly necessary as biomedical research speeds toward a future of open-access data and machine learning.
The DIVER (Data Inventory and Verification Environment for Research) platform, which uses OpenAI’s GPT models to automate the creation of common data elements (CDEs), may help. Faghri’s team applied DIVER to 31 dementia-related datasets and achieved interoperability scores of up to 60% when combining data from the Alzheimer’s Disease Neuroimaging Initiative and the Global Parkinson’s Genetic Program (PMID: 39484274). By automating what is typically a labor-intensive and error-prone process, DIVER enabled them to merge datasets and perform cross-study comparisons, which is a critical step toward identifying early biomarkers and validating therapeutic targets.
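The core harmonization task DIVER automates is mapping each dataset's variable names onto shared CDEs. The sketch below illustrates that idea only: the variable names and synonym table are hypothetical (not the real ADNI or GP2 schemas), and fuzzy string matching stands in for DIVER's GPT-based mapping step.

```python
import difflib

# Hypothetical variable names from two dementia datasets.
ADNI_VARS = ["AGE", "PTGENDER", "MMSE_TOTAL", "APOE4"]
GP2_VARS = ["age_at_baseline", "sex", "mmse_score", "apoe_e4_count"]

# Invented CDE synonym table; an LLM would build this mapping instead.
CDE_SYNONYMS = {
    "age": ["age", "age_at_baseline"],
    "sex": ["ptgender", "sex", "gender"],
    "mmse": ["mmse_total", "mmse_score"],
    "apoe4": ["apoe4", "apoe_e4_count"],
}

def map_to_cde(var):
    """Map one dataset variable to a common data element by fuzzy match."""
    v = var.lower()
    best, score = None, 0.0
    for cde, names in CDE_SYNONYMS.items():
        s = max(difflib.SequenceMatcher(None, v, n).ratio() for n in names)
        if s > score:
            best, score = cde, s
    return best if score > 0.5 else None

harmonized = {v: map_to_cde(v) for v in ADNI_VARS + GP2_VARS}

# Fraction of variables successfully mapped, a rough interoperability score.
coverage = sum(1 for c in harmonized.values() if c) / len(harmonized)
```

Once both datasets' variables resolve to the same CDE keys, records can be merged on those keys for cross-study analysis.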
Predicting missing data
Missing or incomplete data, especially in electronic health records (EHRs), is another persistent obstacle in biomedical research. Traditional data collection and analysis techniques often fall short in capturing the complexity of health care information consistently.
The CARD team developed a machine learning framework called MUSE (Multimodal Unsupervised Embedding), which helps to predict missing values in patient data. MUSE uses graph neural networks to analyze the relationships among patient data across multiple data types such as brain scans, cognitive scores, and biomarkers. Rather than addressing each data gap in isolation, MUSE models the entire patient ecosystem to generate more accurate predictions.
The model improved predictions of Alzheimer’s disease progression by more than 3% compared with standard approaches. “There’s value in retaining data even from patients with large missing segments,” Faghri said. “We’re trying to figure out the broader structure of the data system and see where people with missing data might fit in. Graph neural networks help us connect the dots.”
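The neighbor-aggregation idea behind graph-based imputation can be shown with a toy example. A graph neural network like MUSE learns how to weight neighbors and works across many data types at once; the sketch below simply averages observed neighbors for a single invented biomarker, with made-up patients and edges.

```python
# Toy patient graph: edges connect patients with similar profiles.
# One biomarker per patient; None marks a missing measurement.
values = {"p1": 2.0, "p2": 2.5, "p3": None, "p4": 3.5}
neighbors = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"], "p4": ["p3"]}

def impute(values, neighbors):
    """Fill each missing value with the mean of its observed neighbors."""
    out = dict(values)
    for node, v in values.items():
        if v is None:
            obs = [values[n] for n in neighbors[node] if values[n] is not None]
            out[node] = sum(obs) / len(obs) if obs else None
    return out

# p3 is imputed from its neighbors p2 and p4.
print(impute(values, neighbors))  # → {'p1': 2.0, 'p2': 2.5, 'p3': 3.0, 'p4': 3.5}
```

This is the "connect the dots" intuition Faghri described: a patient with missing data borrows information from structurally similar patients rather than being discarded.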
An AI web crawler for 508 compliance
Dianne Babski, director of the NLM User Services and Collection Division, presented on NLM’s efforts to pilot human-centered AI. Babski demonstrated the NLM Web Accessibility Assistant, created by Dan Wendling, which identifies accessibility issues on webpages to help ensure Section 508 compliance. The assistant recommends fixes and provides the code needed to make those changes. To date, it has flagged more than 67 unique error types across 9,000 website pages.
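The article does not describe the assistant's implementation, but one common Section 508 check, flagging images without alternative text, can be sketched with Python's standard-library HTML parser. This is an illustrative stand-in, not the NLM tool's actual code.

```python
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Flag <img> tags lacking alt text, one common Section 508 issue.

    Note: this simple check also flags alt="", even though an empty alt
    is valid for purely decorative images.
    """
    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img" and not a.get("alt"):
            self.issues.append(f"img missing alt text: {a.get('src', '?')}")

checker = AltTextChecker()
checker.feed('<img src="chart.png"><img src="logo.png" alt="NIH logo">')
print(checker.issues)  # → ['img missing alt text: chart.png']
```

A crawler would run checks like this across every page and aggregate the flagged issues by error type, as the NLM assistant's reported counts suggest.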
Babski and Nick Weber, acting director of CIT’s Office of Scientific Computing Services, co-chair the NIH Generative AI Community of Practice group. Hundreds of NIHers attend the group’s monthly meetings. Will we see you there?
The NIH Office of Science Policy is currently seeking input on responsible development of generative AI tools using controlled-access human genomic data. NIH encourages staff and stakeholders to comment on best practices for mitigating data leakage while promoting innovation. Comments are due by July 16, 2025, and a roundtable discussion will follow on July 17. Submit your feedback and learn more here: NIH Comment Form.
Additional resources shared at the event
- GitHub: Learn more about NIH GitHub by emailing GitHub@nih.gov.
- CARD tools and benchmarks on GitHub
- GitHub Copilot clone: https://continue.dev
- CARD.AI Arena: https://cardai-arena-809832168532.us-central1.run.app/
- PubTator Central (NLM): extracts information from PubMed abstracts and articles to create annotations of biomedical concepts for use with AI
- Prompt engineering tips and tricks: https://www.promptingguide.ai
- Blog post by CARD’s advanced analytics expert group: Can GPT-4.5, Claude 3.7, and Gemini 2.0 Keep Up with Biomedical Research?
Watch the entire June 11 event to learn more about the tools discussed above and much more.
This page was last updated on Thursday, July 10, 2025