Introducing OpenCRISPR 🧬
Profluent releases the world’s first open source AI-generated gene editor
CRISPR therapeutics are potentially transformative medicines. Instead of treating symptoms, they correct disease-causing mutations in the genome, restoring normal protein and cellular function downstream with a potential “one and done” treatment.
The world’s first CRISPR therapeutic - for sickle cell anemia - received regulatory approvals in the UK and the US at the end of last year. It demonstrated impressive clinical trial performance, with the vast majority of participants being free of severe pain crises and not requiring red blood cell transfusions for at least 12 months after dosing.
While we desperately need more CRISPR-based therapeutics, it’s an incredibly challenging problem space. Not only do researchers face steep R&D costs from trial-and-error discovery, but CRISPR is also surrounded by a complex web of patents. This can lead to significant license fees and has already triggered several years of legal disputes. Medical R&D of any kind is a high-risk endeavor, but the upfront costs here are so immense that CRISPR is currently closed off to all but the best-funded institutions.
Now, our friends at Profluent have released OpenCRISPR-1 - the world’s first open-source, AI-generated gene editor.
Profluent used their proprietary LLMs, trained on a curated dataset of over 1 million CRISPR operons (gene editing systems) mined from 26 terabases of assembled microbial genomes and metagenomes, to generate thousands of diverse CRISPR proteins. The generated proteins span 4.8x the number of protein clusters found in nature across CRISPR-Cas families, and some are over 400 mutations away from any known natural protein. Several of the generated gene editors performed comparably to SpCas9, the prototypical CRISPR gene editor.
OpenCRISPR-1, Profluent’s top hit, exhibited up to 95% editing efficiency across cell types, with a low off-target rate, and is compatible with base editing. This latter point is critical - it means the editor has the precision required to change a single DNA base pair without fully cutting the DNA double helix. This significantly reduces the possibility of unwanted insertions or deletions.
This is the most concrete demonstration we’ve seen so far of both the efficacy and safety of AI-powered gene editors.
Considering the challenges around access, Profluent is making OpenCRISPR-1 available for free. For both research and commercial use, there are no upfront costs, milestone payments, or royalties - now or ever.
On a personal note, as a biologist by training, I consider this the single most impactful real-world application of LLMs that I have seen so far. Ali Madani, Profluent’s CEO and Founder, was ahead of the whole field on the application of language modeling to proteins. This achievement is a case study in how to navigate the difficult journey from intuition to research to proof of concept. It’s also a valuable reminder of the importance of open science.
We’re proud to have supported Profluent from Day 1, especially as OpenCRISPR-1, while impressive, is nowhere near the company’s final product.
This is just an early interim step in Profluent’s mission to shift the paradigm from accidental discovery to the intentional design of novel, functional proteins. If these results are just stage 1, we’re excited to see where the team’s skill and ambition take them next.