SandboxAQ’s AQAffinity Changes the Workflow for Drug Affinity Prediction
Affinity, or how strongly a drug binds to a receptor, is an important early indicator of whether a molecule is likely to become a viable therapy. For scientists, affinity prediction has been powerful but hard to use, as it depends on detailed protein structures from labor-intensive experiments and computationally demanding physics-based simulations.
Those constraints are what SandboxAQ aims to address with AQAffinity, a new open source AI model designed to predict protein–ligand binding affinity directly from protein sequence, without requiring experimentally determined structures. Launched in January in collaboration with Nvidia, AQAffinity is built on top of OpenFold3, the open biomolecular co-folding model developed by the OpenFold Consortium, offering researchers a more practical way to evaluate binding affinity at scale.
To understand how the AQAffinity model could accelerate affinity prediction, AIwire sat down with Dr. Adam Lewis, a physicist who is head of innovation for the AISim business unit of SandboxAQ. His team creates the company’s AI models aimed at accelerating drug, chemical, and material discovery.
How AQAffinity Approaches Affinity Prediction
“AQAffinity computes the affinity between a protein and a molecule,” Lewis told AIwire. “It gives you a number that tells you how sticky that molecule is to the protein, which is a proxy for drug efficacy that’s typically used at the early stages of drug development.”
AQAffinity is SandboxAQ’s open protein-ligand binding affinity prediction head built on top of OpenFold3 (Credit: SandboxAQ)
That “stickiness” score helps researchers decide which molecules are worth pursuing and which are unlikely to succeed. Binding affinity does not guarantee that a drug will work, but it is one of the earliest ways to narrow down large pools of candidates before committing to expensive laboratory experiments.
What makes AQAffinity different from many existing affinity prediction tools is how it arrives at that stickiness number. Traditional approaches usually depend on experimentally determined protein structures generated through techniques like X-ray crystallography or cryo-electron microscopy. Those structures are often difficult and expensive to obtain, and they exist for only a fraction of biologically relevant proteins. As a result, many drug discovery programs either cannot use affinity prediction at all or can apply it only to a small number of well-studied proteins.
Instead, AQAffinity operates directly from protein sequence, using the amino acid sequence of a target protein as its input. It builds on OpenFold3, which learns internal representations of protein structure and molecular interactions, enabling affinity estimation without separate structural inputs.
That distinction matters because it determines which computational tools researchers can realistically apply to a given target. “You would typically know the protein sequence,” Lewis said, “but you don’t need imaged experimental crystal structure data.”
What AQAffinity Makes Easier
One immediate benefit of removing the structure requirement is speed. SandboxAQ says AQAffinity is designed to run much faster than traditional physics-based affinity methods, allowing researchers to screen large numbers of candidates without the heavy computational burden of those approaches.
Another benefit is access. By removing the requirement for experimental structures, AQAffinity can be applied to proteins that are poorly characterized or difficult to image, and can be used earlier to consider affinity across multiple proteins. Lewis says this makes it easier to look beyond a single target and ask relevant questions earlier in a drug discovery program. “What we’re really hoping to erode is the paradigm of having one single drug target and kind of hoping for the best for six years and then getting into clinical trial,” he said.
AQAffinity is also fully open source, released under the permissive Apache 2.0 license, and available on Hugging Face. Training methods and data provenance are documented, allowing researchers to evaluate model performance against internal benchmarks and fine-tune the model for specific targets or chemical spaces.
Current Limitations
AQAffinity’s speed and accessibility come with trade-offs in accuracy. Based on early testing, Lewis said the model’s performance is “not yet better or really even on par with the best physics-based methods at the moment,” a gap he described as expected given AQAffinity’s focus on faster, structure-free prediction.
He also cautioned that the model’s current performance is strongest within the range of targets represented in its training data. “Within the range of structures that are similar to those in the training set, you’re seeing acceptable performance,” Lewis said, adding that generalizability beyond that range “is not as good as we would hope for at the moment.”
Because of those limitations, Lewis emphasized that AQAffinity should not be treated as a drop-in replacement for existing workflows. Instead, he recommends that researchers evaluate the model against their own data before relying on its predictions. “With any new method,” he said, “you should start by getting together something that’s representative for the problem you’re actually trying to solve and developing some kind of retrospective study and testing the model on that.”
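A retrospective study of the kind Lewis describes can be as simple as comparing a model's predictions against affinities a team has already measured. The sketch below is only an illustration under stated assumptions: the numbers are made up, the metric choices (Spearman rank correlation and root-mean-square error) are common options rather than anything SandboxAQ prescribes, and no AQAffinity API is called.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def rmse(predicted, measured):
    """Root-mean-square error between predictions and measurements."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical affinities (arbitrary units) for five molecules on one target.
measured = [6.1, 7.4, 5.2, 8.0, 6.8]
predicted = [5.9, 7.0, 5.6, 7.7, 7.1]

print(f"Spearman: {spearman(predicted, measured):.2f}")
print(f"RMSE:     {rmse(predicted, measured):.2f}")
```

Rank correlation speaks to whether the model orders candidates correctly, which is what matters for triage, while RMSE speaks to how far off the absolute numbers are. Which of the two matters more depends on how the predictions will be used.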
Lewis also noted that AQAffinity’s open source design makes it possible to fine-tune the model for specific programs, but he warned that doing so requires caution. “You’ll need to do that carefully because of course you don’t want to contaminate. You need enough data to withhold a reasonable test set, otherwise you’re not going to know if the model is working well or not because you’ll just be showing marked examples.”
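One way to act on that warning is to hold out entire protein targets before fine-tuning, so that nothing in the test set has been seen during training. The following is a minimal sketch, not SandboxAQ's procedure: the record layout, the `split_by_target` helper, and the choice to group by target (rather than, say, by ligand scaffold) are all assumptions made for illustration.

```python
import random

def split_by_target(records, test_fraction=0.2, seed=0):
    """Hold out whole targets so no protein appears in both train and test.

    `records` is a list of dicts with a "target" key (hypothetical layout).
    """
    targets = sorted({r["target"] for r in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(targets)
    n_test = max(1, round(len(targets) * test_fraction))
    test_targets = set(targets[:n_test])
    train = [r for r in records if r["target"] not in test_targets]
    test = [r for r in records if r["target"] in test_targets]
    return train, test

# Hypothetical measurements: 5 targets with 4 ligands each.
records = [
    {"target": f"protein_{t}", "ligand": f"mol_{t}_{m}", "affinity": 5.0 + 0.1 * m}
    for t in range(5)
    for m in range(4)
]

train, test = split_by_target(records)
train_targets = {r["target"] for r in train}
test_targets = {r["target"] for r in test}
print(len(train), len(test), train_targets & test_targets)
```

A purely random split over individual measurements would let the same protein land on both sides, which tends to inflate test scores and is the kind of contamination Lewis cautions against.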
SandboxAQ sees these limitations as part of an ongoing development process rather than a barrier to use. Lewis describes improvements in accuracy, speed, and generalizability as active areas of work for future versions of the model.
AQAffinity Within the OpenFold Ecosystem
AQAffinity is closely tied to the OpenFold ecosystem. OpenFold3 itself is fully open source and available for commercial use, in contrast to some competing biomolecular models that are restricted, proprietary, or difficult to evaluate independently.
The OpenFold Consortium, a non-profit initiative hosted by the Open Molecular Software Foundation, brings together academic labs, biopharma companies, and technology partners to jointly develop open tools for biology and drug discovery. Consortium members were able to beta test AQAffinity in late 2025, comparing its performance against existing methods and providing early feedback.
In a panel discussion today, Professor Mohammed AlQuraishi, principal investigator of the OpenFold Consortium, pointed to AQAffinity as an example of how OpenFold3 is designed to support rapid downstream innovation. “OpenFold3 allows people to start from a much higher starting position than they otherwise would,” he said.
AlQuraishi said this kind of progress is only possible because recent advances in protein modeling and AI, such as AlphaFold, have made these tools more practical for real drug discovery work. “These tools are useful enough,” he said. “They’ve gone beyond just kind of pure academia to something that can move the needle. And for that reason, it has created the space for these really innovative, new organizational structures that can bring together industry and academia.”
Nvidia’s Role
Much of the progress described by SandboxAQ and OpenFold depends on advances in accelerated computing. In the panel discussion, Roy Tal, senior alliance manager for BioNeMo at Nvidia, discussed how the company supports AI-driven drug discovery models such as OpenFold3 and AQAffinity through a combination of open research and domain-specific optimization.
Tal framed Nvidia’s commitment to open models as a practical requirement for advancing AI adoption. “We think that it is an imperative, in order to increase the adoption of AI and in order to continue driving the rapid pace of innovation, it’s important to invest heavily in open source,” he said. “That means open weights, open training, code, research papers, and so forth.”
Representatives from SandboxAQ, Nvidia and the OpenFold Consortium discuss AQAffinity in a panel (Source: SandboxAQ)
Tal described how life sciences models place unique demands on computing systems, requiring optimizations that differ from those used for language or vision models. Biology models like OpenFold3, he explained, rely on specialized operations: “OpenFold3 and AlphaFold and similar architectures have operations called triangle operations. Triangle multiplication, triangle attention — these are geometry-aware operations for the 3D space that don’t really exist as much when we’re thinking of images and language.”
These unique operations are computationally expensive and time-consuming, Tal noted, requiring tailored solutions. “We decided to develop custom CUDA kernels that substantially accelerate these operations,” he said, adding that the optimizations allow both faster training and faster inference, while also enabling larger biomolecular systems to be represented. “For very specific parts of this OpenFold3 class of architectures, we accelerated them with a library called cuEquivariance we developed that is relatively low level, can plug into these models, and was plugged into OpenFold3 when it came out, reducing the time for training and inference.”
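To see why these operations are costly, it helps to look at the core of a triangle multiplication in its simplest form. The sketch below follows the "outgoing edges" update as publicly described for AlphaFold-style architectures, reduced to a single channel with the gating, normalization, and learned projections left out. It is a plain reference loop written for illustration, not Nvidia's CUDA kernel and not the cuEquivariance API.

```python
def triangle_multiply_outgoing(a, b):
    """Simplified outgoing triangle multiplication on N x N pair features.

    For every pair (i, j), combine the edges i->k and j->k over all third
    positions k: z[i][j] = sum_k a[i][k] * b[j][k].
    """
    n = len(a)
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Each of the N*N pairs sums over N third positions: O(N^3) work.
            z[i][j] = sum(a[i][k] * b[j][k] for k in range(n))
    return z

# Tiny made-up pair features for a system with 3 positions.
a = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0],
     [2.0, 0.0, 1.0]]
b = [[1.0, 0.0, 1.0],
     [2.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]

z = triangle_multiply_outgoing(a, b)
print(z)  # [[1.0, 4.0, 2.0], [3.0, 1.0, 4.0], [3.0, 4.0, 1.0]]
```

Because every pair of positions is updated using every possible third position, the work grows with the cube of the system size, which is one reason larger biomolecular systems benefit from specialized kernels.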
SandboxAQ also worked directly with Nvidia’s AI accelerator team to optimize its GPU workflows. According to the company, that collaboration helped boost GPU utilization to 95% and shortened one development cycle from an estimated three months to three weeks. Tal described Nvidia’s role as reducing computational bottlenecks so that researchers can focus on model development and scientific questions. “As an accelerated computing platform that we are, we want to solve the technological bottlenecks that we can,” he said, “so that the industry can keep focusing on research, development, and products.”
Looking Ahead
SandboxAQ’s future goals for AQAffinity are straightforward but ambitious. “We want to make it faster,” Lewis told AIwire. “We want to make it work for a wider range of proteins, and we want to make it more accurate.” But rather than focusing narrowly on performance metrics, Lewis pointed to the larger implications for how drug discovery programs evolve over time. “Those metrics are simple, but what’s less simple is what that unlocks,” he said, describing a shift toward workflows that allow researchers to revisit and revise assumptions as a program unfolds.
“Our vision here is to use these methods not just for virtual screening, but to enable new kinds of screens so that you can check multiple potential drug targets in the course of a campaign because this would allow you to start iterating between different targets and different molecules as part of a single program,” Lewis said. “The idea is to integrate the chemistry iteration and the biological iteration into one, and that’s the lighthouse goal of all the different technical streams we’re working on.”