open source · rust · mit · ab-initio

fold a protein, see the energy minimum.

an ab-initio folding sandbox. simulated annealing and metropolis monte carlo over the hp lattice model for sequence intuition, plus a toy all-atom force field for small peptides of up to ~30 residues. output is a pdb file you open in pymol. rust, because ~40 million random moves per second matters when your energy landscape is rugged.

install →

seq HPHPPHHPHPPHPHHPPHPH   (20 residues · hp lattice · 2d)

move 1e4    energy -3    temp 4.00
move 1e5    energy -7    temp 2.85
move 1e6    energy -12   temp 1.42
move 1e7    energy -14   temp 0.62

final fold (native-like):
   · · · ○─○ · · · · · ·
   · · ○─●─●─○ · · · · ·
   · · ● · · ● · · · · ·
   · · ● · · ● · · · · ·
   · · ○─○─○─○ · · · · ·

hydrophobic core buried. ΔG = -14 kT.

trace of a single annealing run, ○ polar, ● hydrophobic

a typical run

sequence in, conformation out.

$ foldlab fold -i HPHPPHHPHPPHPHHPPHPH --steps 1e7 --out fold.pdb
→ lattice: 2d square · replicas: 8 · schedule: geometric
→ annealing from T=4.0 to T=0.05 over 1e7 mc steps

  step       energy     temp      accept
  1e4        -3         4.00      0.82
  1e5        -7         2.85      0.64
  1e6        -12        1.42      0.31
  1e7        -14        0.62      0.09

→ native fold located. ΔG = -14 kT.
→ wrote fold.pdb (20 atoms · 1 model).

$ foldlab fold --mode allatom --peptide ALGKIPVR --out pep.pdb
→ all-atom mode · 8 residues · lj + harmonic + torsion
→ ramachandran sampling, phi/psi restrained to allowed regions
→ wrote pep.pdb (72 atoms · 10 models).

what's inside

the methods, briefly.

hp lattice model

2d square, 3d cubic, and fcc lattices. each residue is hydrophobic or polar. the whole protein folding problem, stripped to its reason for existing.
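
as a rough illustration of that one rule, here is a minimal sketch of the 2d square-lattice energy. it assumes the common convention of -1 for every pair of hydrophobic residues that sit on neighbouring sites without being neighbours along the chain. the names `Residue`, `parse_sequence`, and `hp_energy` are made up for this example and are not foldlab's api.

```rust
use std::collections::HashMap;

/// residue type in the hp model
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Residue {
    H,
    P,
}

/// parse a string like "HPHP"; returns None on any other character
fn parse_sequence(s: &str) -> Option<Vec<Residue>> {
    s.chars()
        .map(|c| match c {
            'H' => Some(Residue::H),
            'P' => Some(Residue::P),
            _ => None,
        })
        .collect()
}

/// energy on the 2d square lattice: -1 per pair of H residues that are
/// lattice neighbours but not bonded along the chain.
/// assumes `coords` is self-avoiding and as long as `seq`.
fn hp_energy(seq: &[Residue], coords: &[(i32, i32)]) -> i32 {
    assert_eq!(seq.len(), coords.len());
    // map each occupied site to its chain index
    let site: HashMap<(i32, i32), usize> =
        coords.iter().copied().enumerate().map(|(i, c)| (c, i)).collect();
    let mut contacts = 0;
    for (i, &(x, y)) in coords.iter().enumerate() {
        if seq[i] != Residue::H {
            continue;
        }
        // look only right and up so each pair is counted once
        for n in [(x + 1, y), (x, y + 1)] {
            if let Some(&j) = site.get(&n) {
                let bonded = i.abs_diff(j) == 1;
                if seq[j] == Residue::H && !bonded {
                    contacts += 1;
                }
            }
        }
    }
    -contacts
}
```

with these definitions, `HPPH` wrapped around a unit square scores -1 (the two ends touch), and the same chain laid out straight scores 0.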

metropolis monte carlo

pivot, corner-flip, crankshaft. each proposal is accepted with probability min(1, exp(-ΔE/kT)). an off-by-one in the pivot move and suddenly the chain passes through itself. ask me how i know.
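
a minimal sketch of the acceptance rule and of a pivot move on the 2d square lattice, to make the off-by-one concrete. this is an illustration under assumptions, not foldlab's code: the function names are invented, k is folded into the temperature, the random draw `u` is passed in by the caller, and only a 90 degree counter-clockwise rotation is shown.

```rust
use std::collections::HashSet;

/// metropolis acceptance probability min(1, exp(-dE/kT)),
/// with k folded into `temp`
fn accept_probability(delta_e: f64, temp: f64) -> f64 {
    if delta_e <= 0.0 {
        1.0
    } else {
        (-delta_e / temp).exp()
    }
}

/// accept or reject, given a uniform random draw `u` in [0, 1)
fn metropolis_accept(delta_e: f64, temp: f64, u: f64) -> bool {
    u < accept_probability(delta_e, temp)
}

/// true if no lattice site is occupied twice
fn is_self_avoiding(coords: &[(i32, i32)]) -> bool {
    let mut seen = HashSet::with_capacity(coords.len());
    coords.iter().all(|c| seen.insert(*c))
}

/// pivot move: rotate every residue after index `k` by 90 degrees
/// counter-clockwise about residue `k`. returns None if the result
/// overlaps itself, so the caller keeps the old conformation.
fn pivot_ccw(coords: &[(i32, i32)], k: usize) -> Option<Vec<(i32, i32)>> {
    let (px, py) = coords[k];
    let mut out = coords.to_vec();
    // start at k + 1: the pivot residue itself stays put
    for c in out.iter_mut().skip(k + 1) {
        let (dx, dy) = (c.0 - px, c.1 - py);
        *c = (px - dy, py + dx); // (dx, dy) -> (-dy, dx)
    }
    if is_self_avoiding(&out) {
        Some(out)
    } else {
        None
    }
}
```

an uphill move of ΔE = 1 at T = 1 is accepted with probability exp(-1), about 0.37. the self-avoidance check after the rotation is what stops the chain from passing through itself; a rotated tail that lands on an occupied site is rejected before the energy is even computed.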

adaptive annealing

temperature schedule tuned to chain length. longer chains get slower cooling. geometric by default, linear and logarithmic available if you want to argue about it.
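
the three schedule shapes can be sketched as below. the exact formulas foldlab uses are not stated here, so these are textbook forms chosen as assumptions: each one starts at `t_start` and reaches `t_end` after `n_steps` moves. the `Schedule` enum and `temperature` function are hypothetical names, and the chain-length tuning is left out.

```rust
/// cooling schedule shapes (names are illustrative)
#[derive(Clone, Copy, Debug)]
enum Schedule {
    Geometric,
    Linear,
    Logarithmic,
}

/// temperature at `step`, cooling from `t_start` to `t_end` over `n_steps`.
/// assumes n_steps > 0 and 0 < t_end < t_start.
fn temperature(s: Schedule, step: u64, n_steps: u64, t_start: f64, t_end: f64) -> f64 {
    let k = step.min(n_steps) as f64;
    let n = n_steps as f64;
    let f = k / n; // progress in [0, 1]
    match s {
        // constant ratio between successive temperatures
        Schedule::Geometric => t_start * (t_end / t_start).powf(f),
        // constant difference between successive temperatures
        Schedule::Linear => t_start + (t_end - t_start) * f,
        // T(k) = t_start / (1 + c ln(1 + k)), with c chosen so T(n) = t_end
        Schedule::Logarithmic => {
            let c = (t_start / t_end - 1.0) / (1.0 + n).ln();
            t_start / (1.0 + c * (1.0 + k).ln())
        }
    }
}
```

in this parameterisation the logarithmic schedule drops fastest early on and then flattens, linear spends the most time hot, and geometric sits in between.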

toy all-atom ff

for peptides up to about 30 residues. lennard-jones, harmonic bonds and angles, torsions. not amber, not charmm, just enough to see a ramachandran-respecting fold emerge.
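
for orientation, here are the three kinds of term in their usual textbook forms. the functional forms and parameter names are assumptions for illustration; the actual parameters and any prefactor conventions in foldlab's toy force field are not given here.

```rust
/// lennard-jones 12-6: 4 eps [ (sigma/r)^12 - (sigma/r)^6 ]
/// minimum of -eps at r = 2^(1/6) sigma, zero at r = sigma
fn lennard_jones(r: f64, epsilon: f64, sigma: f64) -> f64 {
    let s6 = (sigma / r).powi(6);
    4.0 * epsilon * (s6 * s6 - s6)
}

/// harmonic term for a bond length or an angle: k (x - x0)^2
/// (some conventions write this as 1/2 k (x - x0)^2)
fn harmonic(x: f64, x0: f64, k: f64) -> f64 {
    k * (x - x0).powi(2)
}

/// periodic torsion: k (1 + cos(n phi - delta)), angles in radians
fn torsion(phi: f64, k: f64, n: f64, delta: f64) -> f64 {
    k * (1.0 + (n * phi - delta).cos())
}
```

the total energy in a scheme like this is the sum of `lennard_jones` over non-bonded atom pairs, `harmonic` over bonds and angles, and `torsion` over dihedrals.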

parallel replica exchange + pdb

eight replicas at staggered temperatures, swapping configurations across threads, roughly 40 million mc steps per second on an apple m2. trajectories are multi-model pdb files you can drop into pymol, chimera, or vmd.
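
the swap step can be sketched with the standard replica exchange criterion: two replicas at temperatures T_i and T_j with energies E_i and E_j trade configurations with probability min(1, exp((1/T_i - 1/T_j)(E_i - E_j))). the code below is a single-threaded illustration with invented names. it assumes a geometric temperature ladder, takes its random draws from the caller, and swaps energies as a stand-in for swapping whole configurations.

```rust
/// probability of swapping configurations between replica i and replica j
fn swap_probability(e_i: f64, t_i: f64, e_j: f64, t_j: f64) -> f64 {
    let x = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j);
    if x >= 0.0 {
        1.0
    } else {
        x.exp()
    }
}

/// `n` temperatures from `t_min` to `t_max` with a constant ratio
/// between neighbours (an assumed spacing, not necessarily foldlab's)
fn temperature_ladder(t_min: f64, t_max: f64, n: usize) -> Vec<f64> {
    (0..n)
        .map(|k| {
            let f = if n > 1 { k as f64 / (n - 1) as f64 } else { 0.0 };
            t_min * (t_max / t_min).powf(f)
        })
        .collect()
}

/// one sweep over neighbouring pairs: (0,1), (2,3), ... when `odd` is false,
/// (1,2), (3,4), ... when it is true. `energies[k]` is the energy of the
/// configuration currently held at `temps[k]`; `draws` supplies one uniform
/// number in [0, 1) per pair. returns the number of accepted swaps.
fn exchange_sweep(energies: &mut [f64], temps: &[f64], draws: &[f64], odd: bool) -> usize {
    let start = if odd { 1 } else { 0 };
    let mut accepted = 0;
    for (pair, i) in (start..energies.len().saturating_sub(1)).step_by(2).enumerate() {
        let p = swap_probability(energies[i], temps[i], energies[i + 1], temps[i + 1]);
        if draws[pair] < p {
            energies.swap(i, i + 1);
            accepted += 1;
        }
    }
    accepted
}
```

when the colder replica holds the higher-energy configuration the swap is always accepted, which is how low-energy folds found at high temperature migrate down the ladder. alternating even and odd sweeps lets a configuration travel the whole ladder over time.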

a small argument

why simulate folding when alphafold exists?

alphafold is a lookup. a very good lookup, trained on the structures in the pdb, but still a model that predicts what a sequence folds into rather than why. it doesn't give you a landscape. it gives you a final coordinate set and a confidence score. nowhere in its forward pass does a residue get pushed around by a thermal fluctuation, and nowhere does a hydrophobic contact lower an energy.

the hp lattice model is laughably simple. two residue types, a grid, one rule (hydrophobic residues like touching each other). that's it. but this cartoon captures hydrophobic collapse, the main driving force behind folding. writing the monte carlo engine yourself teaches you what "the energy landscape is rugged" actually feels like: the acceptance rate crashing below 10% at low temperature, the trajectory getting stuck in a metastable basin with an energy just above the native one. alphafold wins every benchmark and will keep winning; this is a notebook, a place to feel the physics by hand.

install

not public yet.

source drops on github soon; i'm still finishing the repo. rust 1.75+, macos or linux when it ships. email bennett@frkhd.com if you want an early look.