NVIDIA's Evo 2 Revolutionizes Genomic Modeling with Advanced AI
Caroline Bishop Feb 19, 2025 09:49
NVIDIA's Evo 2 introduces groundbreaking advancements in genomic modeling, leveraging AI to analyze and generate biomolecular sequences across various species, enhancing biological research and applications.

NVIDIA has unveiled Evo 2, an advanced AI model that significantly enhances genomic modeling by analyzing and generating biomolecular sequences at an unprecedented scale. This development builds on the success of its predecessor, Evo, and marks a new era in biological research, according to NVIDIA's blog.
Advancements in Genomic Modeling
Evo 2 was trained on a dataset of 8.85 trillion nucleotides from more than 128,000 genomes spanning the three domains of life: Bacteria, Archaea, and Eukarya. This breadth of training data lets Evo 2 generalize across species, a significant improvement over the original Evo model, which was limited to prokaryotic genomes.
The model uses the StripedHyena 2 architecture, scales up to 40 billion parameters, and supports context lengths of up to 1 million tokens. The architecture is designed to handle long sequences efficiently, an area where, according to NVIDIA, it outperforms traditional models that rely on attention mechanisms.
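To see why context lengths of this size strain attention-based models, a rough back-of-the-envelope sketch helps. The snippet below is illustrative arithmetic only, not a description of StripedHyena 2 internals. It assumes one token per nucleotide and compares the number of pairwise scores that full self-attention computes with an n log n cost, which is used here purely as a stand-in for a sub-quadratic operator.

```python
import math


def attention_pairs(n_tokens: int) -> int:
    """Pairwise scores computed by full self-attention (per head, per layer)."""
    return n_tokens * n_tokens


def subquadratic_cost(n_tokens: int) -> float:
    """Illustrative n * log2(n) cost, an assumed stand-in for a sub-quadratic operator."""
    return n_tokens * math.log2(n_tokens)


if __name__ == "__main__":
    # Assumption: one token per nucleotide, so 1,000,000 tokens is a 1-megabase window.
    for n in (8_192, 131_072, 1_000_000):
        ratio = attention_pairs(n) / subquadratic_cost(n)
        print(f"{n:>9} tokens: {attention_pairs(n):.2e} pairs, "
              f"about {ratio:,.0f}x the n log n cost")
```

The gap widens as the window grows, which is the general motivation for architectures that avoid computing every pairwise interaction at long context.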
Applications and Capabilities
Evo 2's capabilities extend across various biological applications, including variant impact analysis, gene essentiality identification, and the design of complex biological systems. Its zero-shot performance allows for accurate predictions of mutation effects and genome annotation, providing invaluable insights into human diseases, agriculture, and environmental science.
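Zero-shot variant scoring with a sequence model is commonly framed as a likelihood comparison: score the reference sequence and the mutated sequence under the model, and treat a large drop in log-likelihood as a sign that the mutation is disruptive. The sketch below shows that scoring scheme only. The `ToyNucleotideModel` class is a hypothetical stand-in (a tiny first-order Markov model fit on a few strings), not Evo 2 or its API; a real workflow would substitute the model's own per-token log-probabilities.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class ToyNucleotideModel:
    """Hypothetical stand-in for a genomic language model: a first-order Markov chain."""

    def __init__(self, training_seqs, pseudocount=1.0):
        counts = defaultdict(lambda: defaultdict(float))
        for seq in training_seqs:
            for prev, nxt in zip(seq, seq[1:]):
                counts[prev][nxt] += 1.0
        # Smoothed transition log-probabilities log P(next | prev).
        self.logp = {}
        for prev in ALPHABET:
            total = sum(counts[prev][n] for n in ALPHABET) + pseudocount * len(ALPHABET)
            self.logp[prev] = {
                n: math.log((counts[prev][n] + pseudocount) / total) for n in ALPHABET
            }

    def log_likelihood(self, seq):
        """Sum of log P(x_i | x_{i-1}); the first base is not scored."""
        return sum(self.logp[p][n] for p, n in zip(seq, seq[1:]))


def apply_snv(seq, pos, alt):
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def variant_score(model, ref_seq, pos, alt):
    """Delta log-likelihood; negative means the variant is less likely than the reference."""
    return model.log_likelihood(apply_snv(ref_seq, pos, alt)) - model.log_likelihood(ref_seq)


if __name__ == "__main__":
    model = ToyNucleotideModel(["ACGTACGTACGT", "ACGTACGAACGT"])
    ref = "ACGTACGT"
    for alt in "CGT":  # the reference base at position 4 is A
        print(alt, round(variant_score(model, ref, 4, alt), 3))
```

No labeled variant data is used anywhere in the scoring step, which is what makes the approach zero-shot: the ranking comes entirely from what the model learned about sequence during training.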
The model's generative abilities enable the design of prokaryotic and eukaryotic sequences, including sequences with targeted chromatin accessibility, showcasing its potential for real-world applications in synthetic biology and precision medicine.
Technical Innovations and Training
Evo 2 was trained on 2,048 NVIDIA H100 GPUs using NVIDIA DGX Cloud on AWS, as part of a collaboration between NVIDIA and the Arc Institute. The platform supports the large-scale, distributed training that a model of this size requires.
The NVIDIA Evo 2 NIM microservice provides an API for generating biological sequences, offering settings to adjust parameters such as tokenization and sampling. This service allows researchers to fine-tune Evo 2 for specialized tasks, further expanding its utility in BioPharma and beyond.
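Sampling settings of this kind generally control how the next nucleotide is drawn from a model's predicted distribution. As a rough illustration, and not the NIM API itself, the sketch below applies temperature scaling and top-k filtering to a set of made-up logits over A, C, G, and T. The function names and the logit values are hypothetical; the actual request fields and defaults should be taken from NVIDIA's documentation.

```python
import math
import random


def sampling_distribution(logits, temperature=1.0, top_k=None):
    """Turn per-token logits into probabilities after temperature scaling and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]  # keep only the k highest-scoring tokens
    scaled = [(tok, score / temperature) for tok, score in items]
    peak = max(s for _, s in scaled)  # subtract the max for numerical stability
    weights = [(tok, math.exp(s - peak)) for tok, s in scaled]
    total = sum(w for _, w in weights)
    return {tok: w / total for tok, w in weights}


def sample_token(logits, temperature=1.0, top_k=None, rng=random):
    """Draw one token from the filtered, rescaled distribution."""
    probs = sampling_distribution(logits, temperature, top_k)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    toy_logits = {"A": 2.0, "C": 1.0, "G": 0.5, "T": -1.0}  # made-up values
    rng = random.Random(0)
    print(sampling_distribution(toy_logits, temperature=0.5, top_k=2))
    print("".join(sample_token(toy_logits, top_k=3, rng=rng) for _ in range(20)))
```

Lower temperatures concentrate probability on the most likely base, and a smaller top-k removes unlikely bases from consideration altogether, so both settings trade diversity in the generated sequence for predictability.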
The Future of AI in Biology
Evo 2 represents a significant leap forward in AI-driven biological research, setting the stage for future innovations. Its ability to analyze and generate biomolecular sequences with high accuracy and broad applicability underscores its potential to transform the fields of genomics and synthetic biology.
As AI continues to evolve, models like Evo 2 will play a crucial role in decoding the complexities of life and designing new biological systems, aligning with broader trends in AI-driven scientific research. The advancements made by Evo 2 signal a promising future where AI becomes an indispensable tool in biological discovery and innovation.
For further details, visit the NVIDIA blog.