fold a protein, see the energy minimum.
an ab-initio folding sandbox. simulated annealing and metropolis monte carlo over the hp lattice model for sequence intuition, plus a toy all-atom force field for small peptides under ~30 residues. output is a pdb file you open in pymol. rust, because ~40m random moves per second matters when your energy landscape is rugged.
seq HPHPPHHPHPPHPHHPPHPH (20 residues · hp lattice · 2d)

move 1e4   energy  -3   temp 4.00
move 1e5   energy  -7   temp 2.85
move 1e6   energy -12   temp 1.42
move 1e7   energy -14   temp 0.62

final fold (native-like):

· · · ○─○ · · · · · ·
· · ○─●─●─○ · · · · ·
· · ● · · ● · · · · ·
· · ● · · ● · · · · ·
· · ○─○─○─○ · · · · ·

hydrophobic core buried. ΔG = -14 kT.
sequence in, conformation out.
$ foldlab fold -i HPHPPHHPHPPHPHHPPHPH --steps 1e7 --out fold.pdb
→ lattice: 2d square · replicas: 8 · schedule: geometric
→ annealing from T=4.0 to T=0.05 over 1e7 mc steps

  step    energy    temp    accept
  1e4     -3        4.00    0.82
  1e5     -7        2.85    0.64
  1e6     -12       1.42    0.31
  1e7     -14       0.62    0.09

→ native fold located. ΔG = -14 kT.
→ wrote fold.pdb (20 atoms · 1 model).

$ foldlab fold --mode allatom --peptide ALGKIPVR --out pep.pdb
→ all-atom mode · 8 residues · lj + harmonic + torsion
→ ramachandran sampling, phi/psi restrained to allowed regions
→ wrote pep.pdb (72 atoms · 10 models).
the methods, briefly.
hp lattice model
2d square, 3d cubic, and fcc lattices. each residue is hydrophobic (H) or polar (P). the whole protein folding problem, stripped to its reason for existing.
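a minimal sketch of how an hp energy can be scored on the 2d square lattice, assuming the common convention of -1 per contact between two H residues that sit on neighbouring sites without being neighbours along the chain. `hp_energy` is a made-up name for illustration, not foldlab's source. it also rejects walks that overlap or have a broken bond, which is the check that catches a chain passing through itself.

```rust
use std::collections::HashMap;

/// Energy of a 2D square-lattice HP conformation: -1 for every pair of
/// H residues that are lattice neighbours but not chain neighbours.
/// Returns None if the walk is invalid (length mismatch, broken bond, overlap).
/// Hypothetical helper for illustration; the -1 contact energy is an assumption.
fn hp_energy(seq: &str, coords: &[(i32, i32)]) -> Option<i32> {
    let residues: Vec<char> = seq.chars().collect();
    if residues.len() != coords.len() {
        return None;
    }
    // Consecutive residues must be exactly one lattice step apart.
    for w in coords.windows(2) {
        let (dx, dy) = (w[1].0 - w[0].0, w[1].1 - w[0].1);
        if dx.abs() + dy.abs() != 1 {
            return None;
        }
    }
    // Map each occupied site to its chain index; a duplicate site means overlap.
    let mut occupied: HashMap<(i32, i32), usize> = HashMap::new();
    for (i, &p) in coords.iter().enumerate() {
        if occupied.insert(p, i).is_some() {
            return None;
        }
    }
    let mut energy = 0;
    for (i, &(x, y)) in coords.iter().enumerate() {
        if residues[i] != 'H' {
            continue;
        }
        // Look only right and up so each contact is counted once.
        for n in [(x + 1, y), (x, y + 1)] {
            if let Some(&j) = occupied.get(&n) {
                let bonded = (i as i64 - j as i64).abs() == 1;
                if residues[j] == 'H' && !bonded {
                    energy -= 1;
                }
            }
        }
    }
    Some(energy)
}
```

for example, `hp_energy("HPPH", &[(0, 0), (1, 0), (1, 1), (0, 1)])` scores the single H-H contact that closes the square, while a walk that revisits a site comes back as `None`.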
metropolis monte carlo
three move types: pivot, corner-flip, crankshaft. each proposal is accepted with probability min(1, exp(-ΔE/kT)). an off-by-one in the pivot move and suddenly the chain passes through itself. ask me how i know.
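the acceptance rule is small enough to write out. this is a sketch, not the engine itself: `metropolis_accept`, `XorShift64`, and `acceptance_rate` are illustrative names, and the hand-rolled xorshift generator is only there so the example needs no external crates.

```rust
/// Metropolis criterion: accept with probability min(1, exp(-dE / kT)).
/// `u` is a uniform random draw from [0, 1) supplied by the caller.
fn metropolis_accept(delta_e: f64, kt: f64, u: f64) -> bool {
    if delta_e <= 0.0 {
        return true; // downhill or neutral moves are always accepted
    }
    u < (-delta_e / kt).exp()
}

/// Tiny xorshift64 generator (illustrative; a real engine would likely use a crate).
struct XorShift64(u64);

impl XorShift64 {
    fn next_f64(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Map the top 53 bits to a float in [0, 1).
        (x >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Fraction of uphill proposals of size `delta_e` accepted at temperature `kt`.
fn acceptance_rate(delta_e: f64, kt: f64, trials: u32, seed: u64) -> f64 {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let accepted = (0..trials)
        .filter(|_| metropolis_accept(delta_e, kt, rng.next_f64()))
        .count();
    accepted as f64 / trials as f64
}
```

passing the random draw in as `u` keeps the criterion itself deterministic and easy to test. for a move that costs one kT, the measured rate should settle near exp(-1), about 0.37.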
adaptive annealing
temperature schedule tuned to chain length. longer chains get slower cooling. geometric by default, linear and logarithmic available if you want to argue about it.
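a sketch of what geometric and linear cooling look like between two endpoint temperatures. the function names and the interpolation on `step / total_steps` are assumptions for illustration. the chain-length tuning is not described here, so it is left out, and the real schedule may differ.

```rust
/// Geometric cooling: T(k) = t_start * (t_end / t_start)^(k / total_steps).
/// Hypothetical helper; steps past `total_steps` are clamped to the final temperature.
fn geometric_temperature(t_start: f64, t_end: f64, step: u64, total_steps: u64) -> f64 {
    if total_steps == 0 {
        return t_end;
    }
    let frac = step.min(total_steps) as f64 / total_steps as f64;
    t_start * (t_end / t_start).powf(frac)
}

/// Linear cooling between the same endpoints, for comparison.
fn linear_temperature(t_start: f64, t_end: f64, step: u64, total_steps: u64) -> f64 {
    if total_steps == 0 {
        return t_end;
    }
    let frac = step.min(total_steps) as f64 / total_steps as f64;
    t_start + (t_end - t_start) * frac
}
```

the geometric form multiplies the temperature by a constant factor each step, so it spends proportionally more of the run at low temperature than the linear form does.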
toy all-atom ff
for peptides up to about 30 residues. lennard-jones, harmonic bonds and angles, torsions. not amber, not charmm, just enough to see a ramachandran-respecting fold emerge.
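the three term types, written in their textbook forms. this is a sketch under assumptions: the 12-6 lennard-jones form, the 0.5·k convention for the harmonic term, and the periodic cosine torsion are standard choices, but the functional forms and parameters foldlab actually uses are not stated here. function names are illustrative.

```rust
/// 12-6 Lennard-Jones pair energy: 4 * eps * ((sigma / r)^12 - (sigma / r)^6).
fn lennard_jones(r: f64, epsilon: f64, sigma: f64) -> f64 {
    let s6 = (sigma / r).powi(6);
    4.0 * epsilon * (s6 * s6 - s6)
}

/// Harmonic restraint 0.5 * k * (x - x0)^2, usable for bond lengths or angles.
fn harmonic(x: f64, x0: f64, k: f64) -> f64 {
    0.5 * k * (x - x0).powi(2)
}

/// Periodic torsion term k * (1 + cos(n * phi - delta)), with phi in radians.
fn torsion(phi: f64, k: f64, n: u32, delta: f64) -> f64 {
    k * (1.0 + (n as f64 * phi - delta).cos())
}
```

the lennard-jones term crosses zero at r = sigma and bottoms out at -epsilon when r = 2^(1/6)·sigma, which makes a handy sanity check on any implementation.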
parallel replica exchange + pdb
eight replicas at staggered temperatures, swapping configurations across threads, roughly 40m mc steps per second on an m2. trajectories are multi-model pdb files you can drop into pymol, chimera, or vmd.
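the swap test between two replicas is the standard parallel-tempering criterion, sketched below in units where k = 1. the geometric spacing of the temperature ladder is an assumption (the text only says "staggered"), and both function names are illustrative.

```rust
/// Probability of swapping configurations between replica i (energy e_i, temperature t_i)
/// and replica j: min(1, exp((1/t_i - 1/t_j) * (e_i - e_j))). Temperatures are in units of k.
fn swap_probability(e_i: f64, t_i: f64, e_j: f64, t_j: f64) -> f64 {
    let delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j);
    delta.exp().min(1.0)
}

/// `n` temperatures from `t_min` to `t_max` with a constant ratio between neighbours.
/// Geometric spacing is an assumption, not necessarily what foldlab does.
fn temperature_ladder(t_min: f64, t_max: f64, n: usize) -> Vec<f64> {
    if n == 0 {
        return Vec::new();
    }
    if n == 1 {
        return vec![t_min];
    }
    (0..n)
        .map(|i| t_min * (t_max / t_min).powf(i as f64 / (n - 1) as f64))
        .collect()
}
```

when the colder replica holds the higher-energy configuration, the exponent is positive and the swap is always taken; otherwise it is accepted with a probability that shrinks as the temperature gap or energy gap grows.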
why simulate folding when alphafold exists?
alphafold is a lookup. a very good lookup, trained on every structure in the pdb, but still a model that predicts what a sequence folds into rather than why. it doesn't give you a landscape. it gives you a final coordinate set and a confidence score. nowhere in its forward pass does a residue get pushed around by a thermal fluctuation, nowhere does a hydrophobic contact lower an energy.
the hp lattice model is laughably simple. two residue types, a grid, one rule (hydrophobic residues like touching each other). that's it. but this cartoon captures the hydrophobic collapse, which is the reason any protein folds at all. writing the monte carlo engine yourself teaches you what "the energy landscape is rugged" actually feels like: the acceptance rate crashing below 10% at low temperature, the trajectory getting stuck in a metastable basin with an energy just above the native state. alphafold wins every benchmark and will keep winning; this is a notebook, a place to feel the physics by hand.
not public yet.
source drops on github soon, i'm still finishing the repo. rust 1.75+, macos or linux when it ships. email bennett@frkhd.com if you want an early look.