## Method

### Potential

eBDIMS (elastic-network driven Brownian Dynamics Importance Sampling) is a simulation method that uses the ED-ENM (Essential Dynamics-refined Elastic Network Model) force field (Orellana et al., 2010) to generate transition pathways between different protein conformations.

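In the ED-ENM potential, the protein structure is reduced to a simplified beads-and-springs model: each residue is represented by the position of its Cα atom, and residues are interconnected by harmonic springs that mimic covalent bonds and electrostatic interactions (Figure 1). The energy of residue $i$ is a sum over its springs to the other residues, where $K_{ij}$ is the spring constant and $r_{ij}$ and $r_{ij}^0$ are the current and reference distances between residues $i$ and $j$: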

$$U_i = \frac{1}{2} \sum_{j=1}^N K_{ij}(r_{ij}-r_{ij}^0)^2$$
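As an illustration of the expression above, the following minimal sketch evaluates the per-residue harmonic energy for a set of Cα coordinates. The function names and the uniform distance-cutoff rule used to build $K_{ij}$ are assumptions made for the example; the actual ED-ENM spring constants follow the calibration described in Orellana et al. (2010) and are not reproduced here.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (N, N) matrix of distances between all beads."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def cutoff_springs(ref_coords, cutoff=8.0, k=1.0):
    """Hypothetical spring matrix: constant k within a cutoff, 0 otherwise.

    This is a generic elastic-network rule, NOT the ED-ENM parameterization.
    """
    r0 = pairwise_distances(ref_coords)
    k_matrix = np.where(r0 <= cutoff, k, 0.0)
    np.fill_diagonal(k_matrix, 0.0)  # no spring from a bead to itself
    return k_matrix

def per_residue_energy(coords, ref_coords, k_matrix):
    """U_i = 1/2 * sum_j K_ij * (r_ij - r_ij^0)^2 for every residue i."""
    r = pairwise_distances(coords)
    r0 = pairwise_distances(ref_coords)
    return 0.5 * np.sum(k_matrix * (r - r0) ** 2, axis=1)

# Three beads on a line; the last bead is displaced by 1 unit along x.
reference = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
deformed = reference.copy()
deformed[2, 0] += 1.0
springs = cutoff_springs(reference)
print(per_residue_energy(deformed, reference, springs))
```

Because each spring is shared by two residues, summing $U_i$ over all residues counts every pair twice; the total network energy is half of that sum.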

The ED-ENM potential was calibrated against a Molecular Dynamics library as well as experimental data from X-ray and NMR structures, and has been shown to be accurate enough to assess CASP predictions (Perez et al., 2012).

In the eBDIMS algorithm, the elastic network is subject to an overdamped Langevin simulation (Emperador et al., 2008), where random forces play the role of thermal fluctuations and a friction coefficient is used to represent the implicit solvent:

$$m_{i}\ddot{r}_{i} = F_{i} - \gamma \dot{r}_{i} + \xi_{i}(t)$$
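To make the roles of the three force terms concrete, the sketch below advances a set of beads by one time step with a simple explicit (Euler-type) discretization of the equation above. It is not the integrator used by eBDIMS: the function name `langevin_step`, the force callback, the reduced units, and the noise amplitude (chosen to satisfy the fluctuation-dissipation relation for white noise) are all assumptions made for the example.

```python
import numpy as np

def langevin_step(pos, vel, force_fn, mass, gamma, kT, dt, rng):
    """One explicit step of m*a = F - gamma*v + xi(t).

    pos, vel : (N, 3) arrays of positions and velocities
    force_fn : callable returning the (N, 3) deterministic forces F
    kT       : thermal energy; sets the strength of the random force
    """
    # Random force with variance 2*gamma*kT/dt per component (assumed
    # white-noise discretization; kT = 0 switches the noise off).
    noise_std = np.sqrt(2.0 * gamma * kT / dt)
    xi = rng.normal(0.0, noise_std, size=pos.shape)
    acc = (force_fn(pos) - gamma * vel + xi) / mass
    vel = vel + acc * dt   # update velocities first
    pos = pos + vel * dt   # then move with the updated velocities
    return pos, vel

# Usage: a single bead in a harmonic well, in reduced units.
rng = np.random.default_rng(0)
pos = np.array([[1.0, 0.0, 0.0]])
vel = np.zeros_like(pos)
for _ in range(1000):
    pos, vel = langevin_step(pos, vel, lambda x: -x, mass=1.0,
                             gamma=1.0, kT=0.1, dt=0.01, rng=rng)
```

Setting `kT` to zero removes the thermal fluctuations, so the friction term alone relaxes the bead to the energy minimum.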

The equations of motion are integrated with the Verlet algorithm, and the transition is monitored through a progress variable defined from the internal distances between all atom pairs of the molecule. Every k unbiased steps, the progress variable is computed for the current structure and compared with the target: if the difference to the target has decreased, the structure is accepted; otherwise it is rejected (see Orellana et al., 2016 for details).
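The accept/reject cycle described above can be sketched as follows. The progress variable used here (sum of squared differences between the distance matrices of the current and target structures) and the random-displacement `propagate` callback are assumptions chosen to keep the example self-contained; the exact definition and the dynamics used by eBDIMS are given in Orellana et al. (2016).

```python
import numpy as np

def distance_matrix(coords):
    """Return the (N, N) matrix of internal distances."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def progress(coords, target):
    """Assumed progress variable: squared mismatch of internal distances."""
    mismatch = distance_matrix(coords) - distance_matrix(target)
    return float(np.sum(mismatch ** 2))

def biased_run(start, target, propagate, n_cycles, k, rng):
    """Run n_cycles of k unbiased steps, keeping only improving structures."""
    current = start.copy()
    best = progress(current, target)
    history = [best]
    for _ in range(n_cycles):
        trial = current.copy()
        for _ in range(k):               # k unbiased steps
            trial = propagate(trial, rng)
        value = progress(trial, target)
        if value < best:                 # closer to the target: accept
            current, best = trial, value
        history.append(best)             # otherwise keep the old structure
    return current, history

# Usage with a hypothetical random-kick propagator standing in for the dynamics.
def random_kick(coords, rng):
    return coords + rng.normal(0.0, 0.05, size=coords.shape)

start = np.array([[0.0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [11.4, 0, 0]])
target = start * 1.2
final, history = biased_run(start, target, random_kick, n_cycles=200, k=5,
                            rng=np.random.default_rng(1))
```

Because rejected trials are discarded, the recorded progress variable can never increase, which is what drives the structure toward the target.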

### Trajectory analysis by 2D-projection onto a motion space

Molecular transitions are usually evaluated in terms of RMSDs between the structures, or by monitoring several ad-hoc defined reaction coordinates specific for each system, such as distances or angles between domains, structural elements, etc.

The eBDIMS algorithm was validated using a novel and stricter approach, based on the analysis of structural ensembles that sample at least three different conformations of a protein (i.e., that contain well-defined path intermediates).

Such ensembles yield Principal Components (PCs) that act as powerful reaction coordinates to monitor transitions between different states and to automatically classify structures along a pathway. Each PC describes a molecular movement; these movements correlate strongly with the heuristic reaction coordinates typically described for each protein, but avoid the ambiguity of user- and system-tailored definitions. Projecting different structures and trajectories onto the PCs allows immediate visual inspection of the structural clusters and of the way in which a transition proceeds (Figure 2), which cannot be easily achieved with RMSDs or by visual inspection of structures.

Significant PCs should capture more than 70% of the structural variance and clearly separate the start and target conformations, along with any additional functional states.
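The projection and the variance criterion above can be illustrated with a short sketch. It assumes the ensemble has already been superimposed onto a common reference and is stored as an array of shape `(n_structures, n_atoms, 3)`; the function names are hypothetical, and the PCs are obtained from a singular value decomposition of the mean-centred coordinates.

```python
import numpy as np

def ensemble_pca(ensemble):
    """Return (mean, PCs, variance fractions) for aligned structures.

    ensemble : (n_structures, n_atoms, 3) array, assumed pre-superimposed.
    """
    x = ensemble.reshape(len(ensemble), -1)   # flatten each structure
    mean = x.mean(axis=0)
    _, s, vt = np.linalg.svd(x - mean, full_matrices=False)
    variance = s ** 2                         # proportional to PC variances
    return mean, vt, variance / variance.sum()

def project(structure, mean, pcs, n_components=2):
    """Coordinates of one (n_atoms, 3) structure on the first PCs."""
    return (structure.reshape(-1) - mean) @ pcs[:n_components].T

# Usage: a toy ensemble with three states of a four-bead molecule.
base = np.arange(12, dtype=float).reshape(4, 3)
state_b = base.copy()
state_b[0, 0] += 2.0
state_c = base.copy()
state_c[3, 2] += 1.0
ensemble = np.stack([base, state_b, state_c])

mean, pcs, fractions = ensemble_pca(ensemble)
captured = fractions[:2].sum()                # compare with the 70% criterion
trajectory_2d = np.array([project(s, mean, pcs) for s in ensemble])
```

Each frame of a simulated trajectory can be passed to `project` in the same way, giving a 2D trace that can be plotted on top of the experimental structures.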

Since accurate ensemble preparation requires manual curation and depends on the available structures, the eBDIMS webserver offers a default option that uses the first two Normal Modes (NMs) of the starting structure, computed with the ED-ENM force field, as motion reaction coordinates. Low-frequency normal modes, which describe the natural or easiest movements of a structure, have been shown to capture conformational changes efficiently and to correlate strongly with the PCs of experimental ensembles. When NMs show a high overlap with the transition (>80%), they provide useful axes to monitor how a given transition is proceeding in terms of molecular movements.

Significant NMs should have an overlap of more than 80% with the conformational transition and clearly separate the start and target end-states.
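A common way to quantify this overlap is the normalized projection of the start-to-target displacement onto a mode vector. The sketch below uses that definition as an assumption, together with hypothetical function names; it expects the two structures to be superimposed and the modes to be supplied as flattened, mutually orthogonal vectors. Whether the 80% threshold refers to a single mode or to the cumulative value over the first two modes is not specified here, so both quantities are computed.

```python
import numpy as np

def mode_overlap(mode, start, target):
    """|mode . delta| / (|mode| |delta|), a value between 0 and 1."""
    delta = (target - start).reshape(-1)   # transition displacement vector
    mode = mode.reshape(-1)
    return abs(mode @ delta) / (np.linalg.norm(mode) * np.linalg.norm(delta))

def cumulative_overlap(modes, start, target):
    """Combined overlap of several orthogonal modes with the transition."""
    return np.sqrt(sum(mode_overlap(m, start, target) ** 2 for m in modes))

# Usage: two beads, with a transition that moves only the first bead.
start = np.zeros((2, 3))
target = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
mode_1 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # bead 1 along x
mode_2 = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # bead 1 along y

single = mode_overlap(mode_1, start, target)
combined = cumulative_overlap([mode_1, mode_2], start, target)
usable_axes = combined > 0.80
```

In this toy case neither mode alone describes the whole displacement, but together the two modes span it completely.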

### References

• Laura Orellana, Ozge Yoluk, Oliver Carrillo, Modesto Orozco, Erik Lindahl
Prediction and validation of protein intermediate states from structurally rich ensembles and coarse-grained simulations
Nature Communications (2016), 7:12575. doi:10.1038/ncomms12575
• Agusti Emperador, Oliver Carrillo, Manuel Rueda, Modesto Orozco
Exploring the Suitability of Coarse-Grained Techniques for the Representation of Protein Dynamics
Biophysical Journal (2008), 95(5): 2127-2138. doi:10.1529/biophysj.107.119115
• Alberto Perez, Zheng Yang, Ivet Bahar, Ken A. Dill, Justin L. MacCallum
FlexE: Using Elastic Network Models to Compare Models of Protein Structure
Journal of Chemical Theory and Computation (2012), 8(10):3985-3991. doi:10.1021/ct300148f
• Laura Orellana, Manuel Rueda, Carles Ferrer-Costa, Jose Ramon Lopez-Blanco, Pablo Chacon, Modesto Orozco
Approaching Elastic Network Models to Molecular Dynamics Flexibility
Journal of Chemical Theory and Computation (2010), 6(9):2910-2923. doi:10.1021/ct100208e