GTC—NVIDIA today announced a partnership with the Broad Institute of MIT and Harvard to provide the Terra cloud platform and its more than 25,000 users, including biomedical researchers in academia, startups and large pharma companies, with the AI and acceleration tools needed to quickly analyze massive amounts of healthcare data.
The collaboration is designed to connect NVIDIA’s AI expertise and healthcare computing platforms with the Broad Institute’s world-renowned researchers, scientists and open platforms with a focus on three key areas:
- Making NVIDIA Clara™ Parabricks® available in the Terra platform: Parabricks, a GPU-accelerated software suite for secondary analysis of sequencing data, is now available in six new Terra workflows. Users can now analyze a whole genome in just over one hour with Clara Parabricks compared to 24 hours in a CPU-based environment and can reduce the compute cost by more than half.
- Building large language models (LLMs): Researchers will develop foundational models for DNA and RNA, the building blocks of life, to better understand human biology using NVIDIA BioNeMo, an AI application framework announced today for LLMs in biology.
- Bringing improved deep learning to the Genome Analysis Toolkit (GATK): NVIDIA is contributing a new deep learning model directly to the Broad Institute’s GATK, the industry standard used by more than 100,000 researchers, which helps identify genetic variants that are associated with diseases. This will help drug discovery researchers develop new therapies.
“There’s a need across the healthcare ecosystem for better computational tools to enable breakthroughs in the way we understand disease, develop diagnostics and deliver treatments,” said Kimberly Powell, vice president of healthcare at NVIDIA. “By expanding our collaboration with the Broad Institute, we can bring the power of large language models to ultimately deliver joint solutions and narrow the divide between insights from researchers and real benefits for patients.”
The Broad Institute aims to enable the next generation of collaborative biomedical research by providing an open cloud platform that connects researchers both to each other and to the datasets and tools they need to achieve scientific breakthroughs.
“Life sciences are in the midst of a data revolution, and researchers are in critical need of a new approach to bring machine learning into biomedicine,” said Anthony Philippakis, chief data officer of the Broad Institute. “In this collaboration, we aim to expand our mission of data sharing and collaborative processes to scale genomics research.”
Large Language Models to Study Disease
NVIDIA’s BioNeMo framework includes pretrained LLMs for proteins and chemistry that simplify training, inference and scaling. BioNeMo is an extension of the NVIDIA NeMo Megatron framework and is domain-specific for chemistry, proteins and DNA/RNA sequences.
BioNeMo allows developers to effectively train and deploy biology LLMs with billions of parameters. Together, teams from both organizations will build on this work, creating new models that will be added to the BioNeMo collection and made available in the Terra platform.
NVIDIA Software for Domain-Specific AI
NVIDIA Parabricks GPU-accelerated workflows provide researchers with faster turnaround times and lower costs for a wide range of genomic data analyses. For Broad’s GATK best practices germline workflow, doing the analysis with Parabricks on GPUs can be as much as 24x faster and less than half the cost.
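The speed and cost figures above are linked by simple arithmetic, and a rough Python sketch can make the relationship explicit. It assumes compute cost is just hourly price times runtime, rounds "just over one hour" down to one hour, and treats "as much as 24x" as the best case. The function name `max_gpu_price_ratio` is illustrative only and is not part of Parabricks or Terra.

```python
def max_gpu_price_ratio(speedup: float, cost_ratio: float) -> float:
    """Largest GPU-to-CPU hourly price ratio that keeps the GPU run's
    total cost at or below `cost_ratio` times the CPU run's total cost.

    Assumes cost = hourly_price * hours and gpu_hours = cpu_hours / speedup,
    so gpu_cost / cpu_cost = (gpu_price / cpu_price) / speedup.
    """
    return speedup * cost_ratio


# Figures quoted in the announcement (best case, rounded; an assumption).
cpu_hours = 24.0
gpu_hours = 1.0  # "just over one hour", rounded down for illustration

speedup = cpu_hours / gpu_hours
print(speedup)  # 24.0

# For the run to cost less than half, the GPU instance may cost
# up to this many times the CPU instance's hourly price.
print(max_gpu_price_ratio(speedup, 0.5))  # 12.0
```

Under these assumptions, a GPU instance could be priced at up to roughly 12 times the hourly rate of the CPU instance and still come in at half the total cost. Actual savings depend on the instance types and pricing in use.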
Broad Institute researchers will also gain access to MONAI, an open-source deep learning framework for medical imaging AI, as well as NVIDIA RAPIDS™, a GPU-accelerated data science toolkit for faster data preparation, which can be used for genomic single-cell analysis.
Learn more about Clara Parabricks and Terra integration and sign up for early access to the NVIDIA BioNeMo LLM service.
Tune in to NVIDIA CEO Jensen Huang’s GTC keynote to learn more about the collaboration with the Broad Institute.