Saturday, October 25, 2025

Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete Diffusion


Anthrogen has launched Odyssey, a family of protein language models for sequence and structure generation, protein editing, and conditional design. The production models range from 1.2B to 102B parameters. Anthrogen's research team positions Odyssey as a frontier, multimodal model for real protein design workloads, and notes that an API is in early access.

https://www.biorxiv.org/content/10.1101/2025.10.15.682677v1.full.pdf

What problem does Odyssey target?

Protein design couples amino acid sequence with 3D structure and with functional context. Many prior models adopt self-attention, which mixes information across the entire sequence at once. Proteins follow geometric constraints, so long-range effects travel through local neighborhoods in 3D. Anthrogen frames this as a locality problem and proposes a new propagation rule, called Consensus, that better matches the domain.


Input representation and tokenization

Odyssey is multimodal. It embeds sequence tokens, structure tokens, and lightweight functional cues, then fuses them into a shared representation. For structure, Odyssey uses a finite scalar quantizer, FSQ, to convert 3D geometry into compact tokens. Think of FSQ as an alphabet for shapes that lets the model read structure as easily as sequence. Functional cues can include domain tags, secondary structure hints, orthologous group labels, or short text descriptors. This joint view gives the model access to local sequence patterns and long-range geometric relations in a single latent space.
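To make the "alphabet for shapes" idea concrete, here is a minimal sketch of finite scalar quantization in Python with NumPy. It is a simplified variant, not Anthrogen's implementation: the structure encoder that produces the latent vectors is not described in the article, so the latents below are stand-in values, and the level counts and the function names `fsq_digits` and `fsq_token` are assumptions made for illustration.

```python
import numpy as np

def fsq_digits(z, levels):
    """Quantize each latent dimension d to an integer in 0 .. levels[d] - 1.

    Simplified variant: squash with tanh, rescale to [0, levels[d] - 1], round.
    """
    levels = np.asarray(levels)
    bounded = (np.tanh(np.asarray(z, dtype=float)) + 1.0) / 2.0  # values in [0, 1]
    return np.rint(bounded * (levels - 1)).astype(np.int64)

def fsq_token(digits, levels):
    """Pack the per-dimension digits into a single token id (mixed-radix index)."""
    levels = np.asarray(levels)
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (np.asarray(digits) * basis).sum(axis=-1)

# Hypothetical setup: 3 latent dimensions with 5 levels each -> 125 possible tokens.
levels = [5, 5, 5]
# Stand-in encoder outputs, one row per residue (not real structure features).
latents = np.array([[0.0, 0.0, 0.0],
                    [10.0, 10.0, 10.0]])
tokens = fsq_token(fsq_digits(latents, levels), levels)
print(tokens)
```

The codebook here is implicit: its size is the product of the level counts, so no embedding table has to be learned for the quantizer itself. That property is what makes FSQ a compact way to turn continuous geometry into discrete tokens.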


Backbone change, Consensus instead of self-attention

Consensus replaces global self-attention with iterative, locality-aware updates on a sparse contact or sequence graph. Each layer encourages nearby neighborhoods to agree first, then spreads that agreement outward across the chain and contact graph. This change alters compute. Self-attention scales as O(L²) with sequence length L. Anthrogen reports that Consensus scales as O(L), which keeps long sequences and multi-domain constructs affordable. The company also reports improved robustness to learning rate choices at larger scales, which reduces brittle runs and restarts.
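The scaling argument can be illustrated with a toy local-agreement update. This is not Anthrogen's Consensus rule, which the article does not specify. It is a plain neighborhood-averaging step on a sequence graph, with a hypothetical window size and mixing weight `alpha`, meant only to show why work grows linearly in L when each residue talks to a bounded number of neighbors.

```python
import numpy as np

def sequence_neighbors(length, window):
    """Directed edges (i, j) for residues within `window` positions along the chain."""
    src, dst = [], []
    for i in range(length):
        for j in range(max(0, i - window), min(length, i + window + 1)):
            if j != i:
                src.append(i)
                dst.append(j)
    return np.array(src), np.array(dst)

def local_agreement_step(h, src, dst, alpha=0.5):
    """Move each residue's features toward the mean of its neighbors' features."""
    agg = np.zeros_like(h)
    np.add.at(agg, src, h[dst])                       # sum neighbor features per residue
    deg = np.bincount(src, minlength=h.shape[0]).reshape(-1, 1)
    neighbor_mean = agg / np.maximum(deg, 1)
    return (1.0 - alpha) * h + alpha * neighbor_mean

length, window = 10, 2
src, dst = sequence_neighbors(length, window)
h = np.arange(length, dtype=float).reshape(-1, 1)     # toy one-dimensional features
out = local_agreement_step(h, src, dst)
print(len(src), "edges for", length, "residues")
```

Each residue has at most 2 × window neighbors, so the edge count, and with it the cost of one step, is bounded by a constant times L. Dense self-attention instead compares all L × L pairs. Stacking several such steps lets agreement spread beyond the immediate window, which is the intuition the article describes.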


Training objective and generation, discrete diffusion

Odyssey trains with discrete diffusion on sequence and structure tokens. The forward process applies masking noise that mimics mutation. The reverse-time denoiser learns to reconstruct consistent sequence and coordinates that work together. At inference, the same reverse process supports conditional generation and editing. You can hold a scaffold, fix a motif, mask a loop, add a functional tag, and then let the model complete the rest while keeping sequence and structure in sync.
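The masking-and-refilling loop described above can be sketched in a few lines. Everything here is illustrative: the `MASK` id, the reveal schedule, and the random stub standing in for the denoiser are assumptions, since the article does not give Odyssey's noise schedule or sampling procedure. The point is only to show how frozen positions (a scaffold or motif) stay fixed while masked positions are filled in over several reverse steps.

```python
import numpy as np

MASK = -1  # hypothetical id for the mask token

def forward_mask(tokens, t, rng, frozen=None):
    """Forward process: mask each non-frozen token independently with probability t."""
    tokens = np.asarray(tokens)
    drop = rng.random(tokens.shape) < t
    if frozen is not None:
        drop &= ~np.asarray(frozen, dtype=bool)   # never corrupt the held positions
    return np.where(drop, MASK, tokens)

def reverse_fill(tokens, denoiser, steps, rng):
    """Reverse process: reveal a share of the masked positions at each step."""
    x = np.array(tokens)
    for step in range(steps, 0, -1):
        masked = np.flatnonzero(x == MASK)
        if masked.size == 0:
            break
        proposal = denoiser(x)                    # one predicted token per position
        n_reveal = int(np.ceil(masked.size / step))
        chosen = rng.choice(masked, size=n_reveal, replace=False)
        x[chosen] = proposal[chosen]
    return x

rng = np.random.default_rng(0)
seq = np.arange(10) % 20                          # toy sequence over a 20-letter alphabet
frozen = np.zeros(10, dtype=bool)
frozen[:4] = True                                 # hold the first four positions as a motif
noisy = forward_mask(seq, t=1.0, rng=rng, frozen=frozen)

# Stub denoiser: random tokens. A trained model would predict from the context in x.
stub_denoiser = lambda x: rng.integers(0, 20, size=x.shape)
edited = reverse_fill(noisy, stub_denoiser, steps=3, rng=rng)
print(noisy, edited)
```

With a trained denoiser in place of the stub, later steps would condition on tokens revealed earlier, which is how a multi-step reverse process keeps the completed region consistent with the fixed motif.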

Anthrogen reports matched comparisons where diffusion outperforms masked language modeling during evaluation. The page notes lower training perplexities for diffusion versus complex masking, and lower or comparable training perplexities versus simple masking. In validation, diffusion models outperform their masked counterparts, while a 1.2B masked model tends to overfit to its own masking schedule. The company argues that diffusion models the joint distribution of the full protein, which aligns with sequence plus structure co-design.
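For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean negative log-likelihood over the scored tokens. The short sketch below uses that standard definition. Which positions Anthrogen scores and how it averages them is not stated in the article, so treat this as background rather than as their evaluation code.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) over scored tokens."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that spreads probability uniformly over 20 amino acids scores about 20.
uniform = [math.log(1 / 20)] * 8
print(perplexity(uniform))
```

Lower is better: a perplexity of 20 on amino acid tokens is no better than guessing uniformly, while a value near 1 means the model assigns nearly all probability to the correct token.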


Key takeaways

  1. Odyssey is a multimodal protein model family that fuses sequence, structure, and functional context, with production models at 1.2B, 8B, and 102B parameters.
  2. Consensus replaces self-attention with locality-aware propagation that scales as O(L) and shows robust learning rate behavior at larger scales.
  3. FSQ converts 3D coordinates into discrete structure tokens for joint sequence and structure modeling.
  4. Discrete diffusion trains a reverse-time denoiser and, in matched comparisons, outperforms masked language modeling during evaluation.
  5. Anthrogen reports better performance with about 10x less data than competing models, which addresses data scarcity in protein modeling.

Odyssey is an impressive model because it operationalizes joint sequence and structure modeling with FSQ, Consensus, and discrete diffusion, enabling conditional design and editing under practical constraints. Odyssey scales to 102B parameters with O(L) complexity for Consensus, which lowers cost for long proteins, and Anthrogen reports improved learning-rate robustness at scale. Anthrogen reports diffusion outperforming masked language modeling in matched evaluations, which aligns with co-design objectives. The system targets multi-objective design, including potency, specificity, stability, and manufacturability. The research team emphasizes data efficiency near 10x versus competing models, which is material in domains with scarce labeled data.




Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.
