Reweighting non-equilibrium steady-state dynamics along collective variables
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Reweighting non-equilibrium steady-state dynamics along collective variables Marius Bause1, a) and Tristan Bereau2, 1 1) Max Planck Institute for Polymer Research, 55128 Mainz, Germany 2) Van ’t Hoff Institute for Molecular Sciences and Informatics Institute, University of Amsterdam, Amsterdam 1098 XH, The Netherlands (Dated: 7 January 2021) Computer simulations generate microscopic trajectories of complex systems at a single thermodynamic state point. We recently introduced a Maximum Caliber (MaxCal) approach for dynamical reweighting. Our approach mapped these trajectories to a Markovian description on the configurational coordinates, and reweighted path probabilities as a function of external forces. Trajectory probabilities can be dynamically arXiv:2101.02004v1 [cond-mat.stat-mech] 6 Jan 2021 reweighted both from and to equilibrium or non-equilibrium steady states. As the system’s dimensionality increases, an exhaustive description of the microtrajectories becomes prohibitive—even with a Markovian assumption. Instead we reduce the dimensionality of the configurational space to collective variables (CVs). Going from configurational to CV space, we define local entropy productions derived from configurationally averaged mean forces. The entropy production is shown to be a suitable constraint on MaxCal for non- equilibrium steady states expressed as a function of CVs. We test the reweighting procedure on two systems: a particle subject to a two-dimensional potential and a coarse-grained peptide. Our CV-based MaxCal ap- proach expands dynamical reweighting to larger systems, for both static and dynamical properties, and across a large range of driving forces. I. INTRODUCTION methods have not yet been shown to reweight complex systems across non-equilibrium conditions. We recently Dynamical processes are used to describe complex be- introduced a method based on a MaxCal ansatz, which havior in a number of fields, examples are transition is capable to reweight the dynamics of minimal systems state dynamics of chemical reactions1 or photosynthesis.2 in NESSs.20 This paper extends this method to reweight Many processes are influenced by external driving and dynamical information of complex systems described by operate away from equilibrium. Long-time driving often collective coordinates. leads to systems eventually settling in a non-equilibrium Designed as an extension of the Ferrenberg-Swendsen steady state (NESS). Application of NESS include de- method, our reweighting scheme is based on the Gibbs scription of lasers,3 photosynthesis,4 gene regulatory maximum entropy approach. While maximum entropy circuits,5 or constant pulling experiments.6,7 Despite our claims that a physical system is in a state where it can current lack of a universal theory for statistical me- be realized by the highest number of microstates (i.e., chanics off equilibrium (or NESS),8 computer simula- highest entropy), MaxCal aims at extending this idea to tions can provide microscopic insight into these com- microtrajectories. The extension to microtrajectories is plex processes. Unfortunately, limited computational motivated by systems out of equilibrium being charac- power often prevents molecular simulations from reach- terized by probability currents. The currents can not be ing the experimentally-relevant time scales, or alterna- modeled by microstates alone and need microtrajectories tively, requires them to operate at artificially-large driv- for a complete description. ing forces.9 A formalism to reweight non-equilibrated dy- Jaynes introduced MaxCal as a theoretical framework namics across these driving forces is needed. for all dynamical processes.21 The method was shown Several reweighting schemes for dynamic and static in- to recover physical relations off equilibrium,22 model dy- formation in equilibrium are known. The Ferrenberg- namical complex systems from limited information,23,24 Swendsen reweighting10 is frequently used on stationary correct dynamic information by inferring physical probability distributions drawn from simulation in equi- information,25,26 and more applications on statistical sys- librium. Potential and force-based reweighting schemes tems in physics, chemistry, and biology.27 We use MaxCal for equilibrium dynamics have been of recent interest as the basis for our NESS reweighting method. and are based on Kramer’s rule,11,12 maximum likeli- The MaxCal formalism requires us to choose a set hood methods,13–15 the Girsanov-Radeon derivative,16 or of implied constraints based on the physical manipula- Maximum Caliber (MaxCal) methods.17 A method simi- tions made on a system in NESS. A driving force ex- lar to the Rosenbluth algorithm performs reweighting in erted on the system will affect the heat exchange of each NESS for minimal processes like birth-death processes.18 pathway. The microscopic characterization of heat ex- Another method based on iterative trajectory weighting change is described by the local entropy production.20 is expected to scale to NESS systems,19 however these The NESS system is also constrained by global bal- ance to preserve probability fluxes. The dynamics can be separated into two parts: a dissipative and a non- dissipative contributions.28 The dissipative contribution a) Electronic mail: bause@mpip-mainz.mpg.de is determined by the target NESS, accessed via the local
2 entropy production. The non-dissipative contribution, system does not change in time. on the other hand, is drawn from the reference data it- The resulting time-independent set of microtrajecto- self. We highlighted an invariant, which contains the ries is mapped onto a discrete Markov process. The time-symmetric contributions—they do not change un- configurational space is discretized into so-called mi- der driving. The invariant acts similar to the density of crostates (i.e., collection of microscopic states) and time states in equilibrium. MaxCal, combined with the ap- is discretized in steps of constant duration τ (i.e., the propriate constraints, opens the possibility to reweight lag-time).31 All observed transitions from microtrajecto- dynamical information across external forces as a func- ries are collected to infer a transition probability matrix tion of the system’s configurational space. pij (τ ), where i and j label microstates. This coarsen- Because the reweighting is performed at a microscopic ing of microtrajectories leaves us with the easier task level, it requires the consideration of large numbers of of sampling transition probabilities, and subsequently microstates. The sheer number of microtrajectories be- constructing microtrajectories out of the combination comes computationally intractable for all but the small- of individual micro-transitions. This mapping has been est of systems, and are here instead coarse-grained by proven to reach time scales that are out of range of brute- Markov state models (MSMs). MSMs describe the sys- force computer simulations.32 tem’s dynamics by coarse-grained time and space. They discretize the configurational space in microstates and model the Markovian probability of transitions between A. Maximum Caliber (MaxCal) these states. We performed space discretization based on configurational coordinates.20 Computational aspects The maximum entropy formalism by Gibbs states that typically limit the size of the transition probability ma- an equilibrium system is in a state where it can realize the trix to ∼ 103 microstates.29 The representation of molec- highest number of microscopic configurations, subject to ular systems with a large number of particles rapidly be- external constraints like the mean energy.33 Analogously, comes problematic. Instead, the configurational space MaxCal proposes a framework to study dynamical sys- is often projected down to a set of low-dimensional col- tems by replacing microstates with microtrajectories.34 lective variables (CVs).30 The application of CVs to In doing so, MaxCal moves away from Gibbs’ physical ar- MaxCal-based dynamical reweighting is the topic of this gument to an information theoretic point of view: Based study. on partial information, what is the most likely state the The paper is structured as follows: First we will intro- system is in? Jaynes answers this question by assum- duce the reweighting method and show that it is applica- ing the most uncertain (or highest entropy) probability ble to CVs without loss of generality. Second, the models distribution as noncommittal as possible regarding un- investigated and first-passage-time distributions used to known information. This point of view boils down to a analyze the dynamics are introduced in Methods. In the general inference method only subject to adequate phys- results section, we will apply the reweighting to a toy ical constraints. We take advantage of this formalism model in full coordinates and along collective variables by generalizing equilibrium reweighting, which focuses to show how the choice of variables impacts the accuracy on the static distributions of microstates, to dynamical of the methodology. Reweighting is then applied to a reweighting of NESS. molecular system: a tetra-alanine peptide. We apply the An adequate choice of physical constraints form an es- reweighting along two collective variables, testing both sential element of MaxCal.35 For dynamical reweighting conservative and non-conservative forces. to another NESS, we recently proposed the combined use of the local entropy production and global balance.20 These constraints focus on the interactions of the system II. THEORY with its environment: 1. Heat exchange is described at the microscopic level Steady states are a special case of non-equilibrium, by entropy production, itself constrained by micro- where heat is supplied to and withdrawn from the sys- scopic reversibility.36,37 The system’s spatial het- tem from an unlimited reservoir at the same rate. The erogeneity, as well as the need to describe dissipa- amount of heat flowing from an to the system is con- tive dynamics, requires a local constraint.35 The lo- trolled by the entropy. The system will eventually settle cal entropy production, ∆Sij , between microstates in a state with constant, positive total entropy produc- i and j is constrained by the relation20 tion dStot > 0, without the system undergoing changes— in a steady state dSsys = 0. The system is characterized pij by steady currents from a macroscopic point of view. h∆Sij i = ln , (1) pji These dynamical currents are described by ensembles of microtrajectories, each with a time-independent weight. where pij denotes the probability to jump between Maintaining the currents results in positive entropy pro- microstates i and j.20 By making use of a micro- duction in the reservoir. The system remains off equilib- scopic expression for ∆S, we integrate the conserva- rium but loses time-dependence because the macroscopic tive and non-conservative force contributions along
3 a trajectory (see Eq. B1).38 B. Collective Variables To reduce the number of microstates, describing com- plex systems by collective variables (CVs) is essential to 2. To connect all local changes we add global balance, make the system computationally accessible. Examples P πi = k πk pki , for each microstate i. Global bal- of CVs include the description of a magnet by its mag- ance ensures conservation of probability flux.39 It netization whilst ignoring the influence of local dipole connects a single state on the left-hand side of the fluctuations41 or the crystallization of particles described equation to all other states, and couples both sta- by the closest radial environment of each crystallizing tionary and dynamical properties. particle.42 Many fast and local processes are averaged out when settling on a set of collective variables. The mesoscopic descriptors or collective variables are inher- ently system-specific and limit the view on the system: Including adequate normalization constraints, the Cal- The crystallization described by the local environment iber functional becomes of the particles holds a detailed view on the crystalline phase, but only holds limited information on the liquid X pij X X phase.42 Furthermore, a poor choice of CVs can hide im- C=− πi pij ln + µi πi pij − 1 portant processes and free-energy barriers or cause an qij i,j i j inaccurate estimation of implied timescales.43–45 The ad- equate choice of collective variables is a widely discussed ! X X X +ζ( πi − 1) + νj πi pij − πj (2) research field on its own, and is applied to describe com- i j i plex systems in chemistry, biology, and physics.46 X pij To extend our reweighting procedure from configura- + πi αij ln − ∆Sij . tional coordinates, x, to CVs, z, we need to adapt the pji ij expression for the change in local-entropy production (Eq. B1). CVs and configurational coordinates are re- Here, the first term represents the relative-entropy term lated by a mapping operator, z = M (x). The potential on pathways, specifically between the target (MSM- energy is replaced by the potential of mean force47 based) transition probability pij with its reference coun- terpart, qij . The other terms consist of constraints, ex- Z pressed as Lagrange multipliers. First, normalization G(z) = −kB T ln dx δ (M (x) − z) π(x). (5) constraints on the transition probability, pij , and the steady-state distribution, πi , with associated parameters The change in entropy production due to a trajectory µi and ζ, respectively. The last two terms constrain z(t) with starting- and end-points z0 and zT , respec- the global-balance condition and local-entropy produc- tively, yields (see appendix A2) tion with Lagrangian multipliers νi and αij , respectively. The parameters αij and µi were both rescaled by πi . ∆S(z0 , zT )−∆S q (z0 , zT ) = Maximization is described in Appendix A1 and yields 1 G(zT ) − Gq (zT ) − (G(z0 ) − Gq (z0 )) (6) kB T 1 q ci + cj + ∆Sij − ∆Sij + (zT − z0 ) · (f − f q ) , pij =qij exp ζ + 2 (3) √ 1 where z(t) is the D-dimensional CV vector, f is the non- = qij qji exp ζ + (ci + cj + ∆Sij ) , 2 conservative force, and superscript q indicates the refer- ence system. Conceptually, adapting ∆S from configura- q where Sij is the local entropy production of the refer- tional to CV space amounts to replacing the potential en- ence system and ci are constants to be determined. This ergy by the potential of mean force. The expression holds shows that we have two options for the input parameters: for an arbitrary system with or without boundary condi- The reweighting depends either on the total entropy pro- tions, but only for driving forces along the CVs. While duction ∆Sij of the target system or the difference in lo- Eq. 6 assumes constant forces, it can readily be gener- q cal entropy production ∆Sij − ∆Sij between target and alized, i.e., f (z), analogous to the full-configurational reference systems. The P unknowns c are calculated by case.20 enforcing the relation j pij = 1 X√ 1 III. METHODS 1= qij qji exp ζ + (ci + cj + ∆Sij ) . (4) j 2 The reweighting procedure for CVs is tested on two This is a convex set of equations that can be solved by systems. The first model is a non-interacting parti- numerical iteration,20 for instance by least-squares.40 cle subject to a two-dimensional potential. The poten-
4 forces along either CV in either direction. We define 15 microstates over the range [−π, +π] in the ϕ-direction, and 15 microstates in the range [0.45 , 1.15] nm in the R14 -direction. Two additional sets of microstates were added to collect end-to-end distances outside this range. kJ Energies are given in = mol and the system is simu- lated at temperature T = 2.479 kB . A lag-time for the MSM is chosen using lag-time analysis and the Chapman- FIG. 1. Atomistic and coarse-grained representation of tetra- Kolmogorov test.49,54 Metastable states for the tetra- alanine. Atoms are shown in licorice, where turquoise, white, alanine are defined by PCCA+.55 The metastable state blue, and red represent C, H, N, and O, respectively. The analysis relies on equilibrium dynamics satisfying de- transparent beads show the coarse-grained representation of tailed balance. Thus, the analysis is performed for the the system. The end-to-end distance R14 and the dihedral reference system and the same metastable states are cho- angle ϕ are defined based on the coarse-grained representa- sen for the driven systems. tion. The dynamics are analyzed by using first-passage-time distributions (FPTD) between metastable states. It is defined by the distribution of time a process starting tial consists of three Gaussian potential wells of varying from metastable state A needs to reach metastable state depth. All boundaries are periodic and the external force B. FPTDs are widely used to characterize processes in is applied along the x-direction. Results for this system biology, chemistry and physics and are often associated are presented in reduced units, where the box size is set to with a free-energy barrier a system has to overcome. The 3L × 1L, the mass of the particle is set to M, and energy FPTD contains detailed transition information by col- is measured in . Thep temperature is T = 1 /kB and the lecting numerous realizations of a process. Often, few unit of time is T = L M/ . The integration time step observed realizations limit the analysis to the mean of the is set to δt = 10−5 T , the non-conservative force is varied distribution.56 Given an MSM with identified metastable between 0 and 9 /L, the microstates consists of 30 × 10 states, the FPTD between all metastable states can be squares of equal size and the lag-time is chosen at 0.02 T . calculated directly.57 The collection of initial states is de- The potential minima are Gaussian functions with depths noted by I, the collection of final states by F, the FPTD 3 , 5 , and 7 , and are located at x = {0.5, 1.5, 2.5} and by p FPT (I → F, t). Knowing the FPTD, all moments of y = 0.5. The standard deviation of the Gaussian is 0.02 L the distribution can be calculated by in both directions. By integrating out the y-dimension orthogonal to the driving force, F2D (x, y), we imitate a (n) X MI→F = pFPT (I → F, t)tn . (8) reduction of variables, providing a testing ground for the t reweighting along CVs. The mean force is calculated via the stationary distribution In particular, we will make use of the quantities Z (1) hF (x)i = dy F2D (x, y)π(x, y). (7) µI→F = MI→F q (2) σI→F = MI→F − µ2I→F (9) Both full and reduced descriptions will be analyzed along (3) 2 x. All dynamics are extracted from the same reference MI→F − 3µI→F σI→F − µ3I→F simulations. An MSM is constructed with the same lag- κI→F = 3 , σI→F time τ = 0.02 T and the same 30 equisized microstates in the x-direction. The lag-time for MSM is validated where µI→F is the mean, σI→F is the standard deviation by the Chapman-Kolmogorov test,48 for both the full 2D and κI→F is the standardized skewness, defined by the 3 und reduced 1D systems.49 expectation value of t−µ . These moments are used σ The second system represents a tetra-alanine peptide to compare FPTDs throughout the paper to capture the consisting of 4 amino acids and 52 atoms. Each amino main features and draw physical information from the acid is coarse-grained to one bead centered at the back- distribution. bone of the peptide. The coarse-grained force field for the molecule solvated in water consists of 3 pair poten- tials along the backbone, 2 bending-angle interactions, IV. RESULTS a dihedral angle ϕ, and an effective pairwise interaction between the first and last beads, R14 .50 Simulations were run with ESPResSo++.51 A. Particle in a two-dimensional potential The MSM is constructed using two CVs: the end-to- end distance, R14 , and the dihedral angle, ϕ, (see Fig- We first consider a toy model: a particle in a multi- ure 1).52,53 The unperturbed equilibrium system is called well. The system is originally in two dimensions, but the reference system. Driven systems consist of constant we also consider a reduced one-dimensional description.
5 (a) 1 0.8 f −1 Rew 1D: f = 9 /L →f =0 Rew 1D: f = 0 → f =9 /L 0.6 −3 Simulation 1D: f =0 /L y /L U/ Simulation 1D: f =9 /L 0.4 Rew 2D: f = 9 /L →f =0 Rew 2D: f = 0 → f =9 /L 0.2 −5 Simulation 2D: f =0 /L Simulation 2D: f =9 /L 0 (b) (d) 0.1 0 A U / −2 f B −4 C 0.01 p FPT (c) 0.3 0.25 0.2 0.001 ps 0.15 0.1 0.05 0 0 0.5 1 1.5 2 2.5 3 1 10 100 1000 x/L t/τ FIG. 2. (a) The 2D potential with the three metastable states indicated by squares. Integrating along the y−dimension gives (b) the mean potential of the equilibrium system. The grey area represents the new metastable states A,B,C. The area of the metastable state is effectively increased. (c) The stationary distribution of the reduced system and (d) FPTD of the process C→ B. The lines in (c,d) represent the results for a single particle in reduced space without (blue) and with (red) external force. The dots are the results from reweighting the systems into each other. The orange and light-blue dashed lines show the same process for the underlying 2D process with dots representing the reweighted FPTD. We perform dynamical reweighting for both descriptions deviations, dynamic and static data are reweighted vir- from and to equilibrium and a driven NESS. Figure 2a,b tually perfectly into each other. We conclude that use of shows the potential of the full and reduced single-particle collective variables of the system did not affect the accu- system. Dynamical reweighting leads to an accurate re- racy of the reweighting process. Hence, it can be applied production of the stationary distribution, as seen for two to the same extent as the reweighting in configurational different driving forces (Figure 2c). Reweigthing also space. leads to an accurate reproduction of the FPTD (Fig- We now more closely compare the dynamics for the two ure 2d), as shown for the process C→B at both equilib- system descriptions (Figure 3). While the processes re- rium and under driving, and for both the full and reduced main qualitatively similar irrespective of representation, descriptions. While longer timescales are reproduced ac- the 2D processes are consistently slower than those in the curately, the reweighting for short processes of 1 − 5 τ reduced representation. These accelerated dynamics are show small deviations. These are caused by the spa- common in coarse-grained modeling.58–61 The reduced tial discretization, especially in highly populated areas. roughness in the free-energy surface results in a decrease Overall the dynamical-reweighting scheme performs as of the effective friction. For our simplified model, this ef- well in both full-configurational and CV spaces. We note fect reduces the effective potential barriers, which leads that the present methodology requires external forces to to the acceleration of the coarse-grained process. Simi- be aligned with the CVs. lar effects can be found for the standard deviation (STD) In the following we reduce all FPTDs to the first three and skewness, though to smaller extents. More details on moments and the stationary distribution of metastable the skewness can be found in the Supporting Material.49 states for complete analysis of deviations between simu- The occupation probability of the metastable states lation and reweighting (Figure 3). The largest deviation is significantly larger for the reduced system. The can be seen for the process A→B, where the deviations metastable states are effectively smaller than for the in 1D and 2D are comparable, as well as the occupation 2D system because they do not span the whole y- probability of state C. This minor deviation in the sta- direction. The reduction to the x-axis enlarges the tionary distribution for heavy driving is indicated in the metastable states effectively and the occupation proba- detailed view in Figure 2c. A metastable state in the re- bility increases. The trend of decreasing occupation in C duced system covers two microstates of the MSM and is and increasing occupation A and B is the same for both thus susceptible to discretization errors. Despite minor systems.
6 (a) (a) A →B A →C B →C H E I B →A C →A C →B 1.2 0 µ/τ 500 1.1 −5 1 MFPT 100 −10 0.9 R14 / nm −15 F/ 25 0.8 −20 10 0.7 (b) 0.6 −25 500 0.5 −30 0.4 −35 σ/τ − 23 − 13 0 1 3 2 3 100 ϕ/(π rad ) STD (b) 25 1200 t1 f =0 /nm t2 f =0 /nm t1 f = −9 /nm 10 1000 t2 f = −9 /nm (c) 2.6 7 800 ti /T 4 κ 600 skewness 2 2 400 200 1.7 100 150 200 250 300 (d) A B C τ /T 0.3 FIG. 4. (a) Free energy surface of tetra-alanine of the ref- Π occupation erence system. The metastable states are indicated by he- 0.1 lical (H), extended (E) and intermediate (I). (b) Implied- timescale analysis of the system defined by the reference force 0.03 field (fR = 0), and driven along the end-to-end distance with fR = −9 nm . The shaded area marks the non-physical area 0.01 where ti < τ . 0 1 2 3 4 5 6 7 8 9 f /(/L) represent an optical tweezer controlling atom distances. The external forces are chosen to test the effectiveness of the reweighting procedure for conservative and non- FIG. 3. (a-c) The first three moments of the FPTD for all six conservative forces. processes between metastable states under varying external The free energy surface of the coarse-grained reference force f . (d) The occupation probability of each metastable system is shown in Figure 4a using the CVs of the end- state. The dots represent the value measured from simulation. to-end distance R14 and the dihedral angle ϕ. Three The line is the reduced 1D equilibrium system continuously basins were identified, representing the helical states H, reweighted. The dashed lines are the processes continuously extended state E, and one intermediate state I. The dy- reweighting of the underlying equililibtium processes in 2D namically metastable states do not always coincide with space. The error bars are smaller than the points and lines. the free-energy minima. State H is associated with heli- cal states located to the right of the middle free-energy barrier at ϕ ≈ 0.15 π. State I is an intermediate state at B. Tetra-alanine peptide ϕ ≈ 0.4 π. PCCA+ identifies a state below of the global basin that we call extended state E. To further challenge the reweighting procedure, we ap- The driving along R14 can be casted to an additional ply it to a coarse-grained tetra-alanine peptide. This sys- attractive or repulsive interaction potential—leaving the tem is of higher complexity than the previous model by system in equilibrium. On the other hand, we can also showing rougher free-energy landscapes and many-body drive the peptide in a NESS along the periodic dihedral interactions. External global forces are applied along the angle ϕ. The direction of driving will impact the dynam- CVs to alter the dynamics. Physically, these forces may ics, because the free energy surface lacks the symmetry of
7 (a) are H→I and E→I, and finally the two slowest processes 80 H → I H →E I → E are both going to the helical state, E→H and I→H. Under µ/τ I → H E →H E → I 60 driving, transitions to I slow down under an attractive MFPT end-to-end potential (i.e., negative forces) and speed up 40 for a repulsive end-to-end potential (i.e., positive forces). 20 The opposite happens for the processes going to H: An at- (b) 60 tractive end-to-end potential increases the speed of these σ/τ processes. Transitions to the extended state E are com- 40 paratively unaffected by the driving. We note that the STD STD behaves roughly proportional to the MFPT, while 20 the skewness varies extremely weakly. (c) Looking at Figure 5d, increasingly repulsive R14 inter- 2.6 actions lead to a stabilization of state I. On the other κ 2.4 skewness hand, this separation of the residues destabilizes both H 2.2 and E, where the former decays more strongly. 2 The impact of the driving force on the MSM’s implied 1.8 timescales is shown in Figure 5e. The nature of the driv- (d) ing force retain the system in equilibrium, so that path- H E I Π 0.2 dependent effects are not expected. In agreement with occupation 0.15 Figure 4b, the first timescale depends strongly on driv- 0.1 ing, while the second one is virtually unaffected. 0.05 Results on the three moments of the FPTD indicate (e) that the transitions are recovered accurately. Minor de- t1 t2 viations at large forces are rationalized by a significant 1100 change in the relevant populations: regions at large or ti / fs 800 small end-to-end distance become highly populated, but 500 may be insufficiently sampled in the reference system. 200 These errors are mostly apparent for the higher-order mo- −10 −5 0 5 10 ments. Overall though, we report extremely encouraging f /(/nm) results in terms of dynamical reweighting for a complex molecular system driven by a constant conservative force. FIG. 5. (a-c) The first three moments of the FPTD for all six processes between metastable states under varying exter- nal force f along R14 . (d) The occupation probability of each metastable state. (e) The timescale of the two slowest 2. NESS reweighting processes covered by the MSM. The dots represent the value measured from simulation. The line is the reference system Figure 6 shows NESS driving along the dihedral ϕ in continuously reweighted. either direction. The dynamics of the system are largely dominated by its large free-energy barrier at ϕ ≈ − π6 . Driving in the positive direction speeds up the processes the previous toy model. Thus, we can test the method for H→E and I→E, while I→H slows down as it runs oppo- reweighting between equilibrium states, NESS, or from site to the driving force. On the other hand H→I slows equilibrium to NESS and vice-versa. down, even though it runs along the external force. Most trajectories starting from H bypass I under heavy driving, leading instead directly to state E. This can be seen by 1. Equilibrium reweighting the narrow, diagonal stripe below state I, which becomes more tightly populated. The trajectories find a direct Figure 4b shows the implied timescale analysis for the path to the global basin (E) without hitting the interme- original force field, and an applied driving along R14 . diate state I, as can be seen in the transition density of We choose a lag-time of 200 T to capture the two slowest H→E.49 processes of both systems, where T = 1 fs. The second Overall agreement between direct simulations and process is captured by the MSM and is virtually unaf- reweighting are observed for the FPTD, especially up to fected by the additional forces applied. In the following |f | < 1 rad . Similar to equilibrium reweighting, we find we assume this process to remain unaffected by larger that the STD follows the behavior of the MFPT, and forces. the skewness varies weakly. We observe some discrepan- Figure 5a-c shows the first three moments of the FPTD cies at larger driving, notably for I→H and E→H. While between the metastable states when driving along R14 . they all increase, the simulation curves seem to reach a For the reference system at f = 0 we note the two fast maximum. We hypothesize that this has to do with cross- processes I→E and H→E. The next two slower processes ing over at the free-energy barrier, ϕ ≈ − π6 . Driving in
8 (a) 140 were separated by three or more barriers: Paths transi- H → I H →E I → E 120 µ/τ I → H E →H E → I tioning over a single barrier have much higher probabil- 100 80 MFPT ities, so that the other set of paths can be neglected20 . 60 In tetra-alanine, driving along the R14 -direction did not 40 20 lead to this issue, because there are no periodic boundary (b) 0 conditions and the forces can be mapped to a potential. 120 Transitions become path independent and errors based 100 σ/τ in path-dependence thus do not occur. Driving along the 80 ϕ-direction, on the other hand, results in a NESS with STD 60 40 periodic boundaries. It shows only one major barrier 20 along the dihedral angle that dominates the dynamics. (c) 0 Choosing the appropriate path direction to feed into the 2.2 local entropy production is more challenging here. For κ every jump in the Markov Model one has to determine skewness 2 if the underlying trajectory is aligned with, or directed 1.8 against, the external force. 1.6 To shed light on path directions, we analyze the ma- (d) H E I trix of transition probabilities, as shown in Figure 7a for 0.25 Π the reference system. We fix a starting point, denoted occupation 0.2 0.15 by a green dot, and analyze the expected direction given 0.1 any final microstate. We expect all states to the right of 0.05 the starting point (blue shaded area) to arise from tra- 0 jectories going right, i.e., in the positive ϕ-direction. On −1.5 −1 −0.5 0 0.5 1 1.5 the other hand, all states to the left of the starting point f /(/ rad) (green shaded area) arise from trajectories going left, i.e., negative ϕ-direction—taking periodic boundaries into ac- FIG. 6. (a-c) The first three moments of the FPTD for all count. A dividing mark (red dashed line) separates the six processes between metastable states in Figure 4a under two regions. The lag-time of the MSM is chosen to be varying external force f along ϕ. (d) The occupation proba- small enough to avoid transition close to the divider, i.e., bility of each metastable state. The dots represent the value no transitions lead to ambiguity as to their likely direc- measured from simulation. The line is the reference system tion. This disconnect in the transition matrix between continuously reweighted. left- and right-trajectories is associated with a disconti- nuity in the local entropy production. Figure 7b shows the transition matrix with a driving force f = 1.4 rad , the negative direction inverts the effect on the dynam- initiated from the same starting point as before. The ics: Processes aligned with the force speed up, whereas non-zero driving lead to a change in transition probabili- opposing processes slow down. H→I does not follow this ties, but the spatially long transitions are still forbidden. trend and instead accelerates. The trajectories that by- The discontinuity in local entropy production can be set pass the intermediate state under positive driving are in the same position. This is important, because the tar- now pushed into occupying the state I. This is indicated get transition matrix is not known before reweighting. by the increasing population of I under negative driv- Having a gap at a similar position in the reference and ing and depopulation under positive driving. The helical target driving forces is essential for the reweighting algo- state population shows similar, but even stronger, behav- rithm. ior. The extended state, on the other hand, displays the Next, we illustrate the impact of an incorrect assign- opposite behavior. Here again, the occupation probabil- ment of path direction. We displace the starting point ities are recovered accurately by the reweighting, even directly to the left of the large, central barrier along ϕ with small deviations in the dynamics at large driving. (Figure 7c). Upon driving (Figure 7d), the divider line cuts the transition matrix through a connected region, close to the intermediate state. As such, a discontinuity 3. Path dependence of entropy production in the local entropy production will be present among the paths connecting this intermediate region. Left- and The observed deviations between simulation and right-trajectories are no more well separated, leading to reweighting at strong driving along the dihedral angle ambiguities. These issues directly result in the discrep- stems from the local entropy production. This key quan- ancies observed in Figure 6. They only materialize at tity is determined by both the external force and the strong driving, otherwise the local entropy productions set of paths connecting every pair of microstates. Find- of these conflicting paths are negligible. ing the shortest connection between two microstates was This analysis highlights the dependence of NESS straightforward for the toy-model system, because they reweighting by the choice of MSM. Upon reweighting
9 (a) 1.2 (b) 1.2 0.06 0.06 1 0.05 1 0.05 R14 / nm R14 / nm 0.04 0.04 0.8 0.8 ps ps 0.03 0.03 0.02 0.02 0.6 0.6 0.01 0.01 0.4 0 0.4 0 − 23 − 31 0 1 3 2 3 − 23 − 13 0 1 3 2 3 ϕ/ (π rad) ϕ/ (π rad) (c) (d) 1.2 1.2 0.1 0.1 1 0.08 1 0.08 R14 / nm R14 / nm 0.8 0.06 0.8 0.06 ps ps 0.04 0.04 0.6 0.6 0.02 0.02 0.4 0 0.4 0 − 23 − 31 0 1 3 2 3 − 23 − 13 0 1 3 2 3 ϕ/ (π rad) ϕ/ (π rad) FIG. 7. Transition probabilities starting at the state marked by the green dot. Left (a,c): reference systems, right (b,d): system driven along ϕ at 1.4 rad . The red line represents the discontinuity in local entropy production, starting from the marked initial state. All states shaded green are connected to the starting state by trajectories going left, all states shaded blue are connected by a trajectories going right. from reference to target driving forces, the set of paths microstates should be described by a unique bundle of connecting two microstates should consist of similar sets paths. This means that the CVs and their separation in of trajectories. Unfortunately, one cannot easily predict microstates should be chosen to reflect underlying kinetic whether conditions for reweighting are met. The small distances of the system, as was formulated as a require- gaps between the groups of trajectories in Figure 7a,b are ment for reweighting of dynamics in equilibrium by Voelz a warning signal. Models with three or more barriers, as et al.17 constructed in the toy model, are less susceptible to this issue. A particle crossing a barrier is expected to take the shorter path over a single barrier, and is unlikely to V. CONCLUSION hop over two barriers within one lag-time. The tackling of larger and more complex systems should naturally do This study contributes to the sparse field of dynamical away with these artefacts. reweighting between non-equilibrium steady states. The Clearly these issues are brought about by the MSM presented method is based on Jaynes’ Maximum Caliber construction of microtrajectories. Can we refine the (MaxCal). It relies on an ensemble description of NESS MSM parametrization? Shorter lag-times would result by physical constraints—global balance and local entropy in shorter trajectories and thus shorter jumps. Unfor- productions—and an efficient construction of microtra- tunately, Figure 4 shows that smaller lag-times show jectories by means of Markov state models (MSMs). In- non-Markovian dynamics. Other options point at the stead of being directly sampled, microtrajectories are role of CVs and microstate selection. We may select a constructed from the transition probability matrix, which different second collective variable (CV) when reweight- robustly addresses issues of path sampling. On the other ing along the first. Such choices can have great impact hand, an MSM description requires the use of appropri- and better represent the free-energy landscape. Alter- ately chosen collective variable (CVs) that can describe natively, increasing the number of microstates does not the dynamics of the slow processes. Our initial descrip- allow us to decrease the lag-time. The microstates in tion of NESS dynamical reweighting was based on the the present model are discretised in equal size along the configurational variables of the system themselves. To CVs. Advanced clustering techniques, like k-means62 scale up, this study presented an extension to CVs. The or k-medoids,63 help define more complex sets of mi- expression for the local entropy production was extended crostates that could allow us to reduce the lag-time. Both from individual forces to mean configurationally averaged the selection and clustering of the CVs influence how dy- forces. We tested the CV-based dynamical reweighting to namics are described by the MSM. The connection of two both conservative and non-conservative forces, applied to
10 both a toy model and a molecular system: a tetra-alanine some algebra one finds peptide. γij q Strong agreement in both the static and dynamical = wij ∆Sij − ∆Sij − µi + νi + µj − νj , (A4) properties are found overall. Discrepancies can be found pij at strong driving. We showed that they can be traced π p q q back to ambiguities in the direction of the constructed where wij = 1/ 1 + πji pij ji ) and ∆Sij = ln qij ji have been MSM-based paths. The periodic boundary conditions used. This expression is set into Eq. A3 and using wij + and relatively small landscape can lead to significant path wji = 1 we find contributions from both directions along the CV. Finding better CVs is an ever-present challenge,46 and is expected pij = qij exp −1 + wji µi + wij µj + wji νj + wij νi to systematically improve the MaxCal-based reweighting q (A5) +wij (∆Sij − ∆Sij ) scheme presented here. The present methodology opens the door to NESS reweighting of large molecular systems. The Caliber maximization with respect to the stationary distribution gives ACKNOWLEDGMENTS X pik X 0=− pik ln + µi pik − µi + ζ − νi qik k k We thank Paul Spitzner for critical reading of the (A6) manuscript. This work was supported in part by the X X pik + νk pik + αik ln − ∆Sik . Emmy Noether program of the Deutsche Forschungsge- pki k k meinschaft (DFG) to TB and the Graduate School of Excellence Materials Science in Mainz (MAINZ) to MB. By combining with Eq. A3 and making use of the prob- ability conservation constraints, one finds a relation be- tween the Lagrangian multipliers γij , νi and µi : DATA AVAILABILITY X µi + νi = 1 + ζ + γik . (A7) The data that support the findings of this study are k availablefrom the corresponding author upon reasonable P Enforcing the constraint k pik = 1 on Eq. A5 results request. in a set of N equations, where N is the number of mi- crostates. Combined with the set of N equations from Eq. A7 there is a set of 2N coupled non-linear equa- Appendix A: Caliber maximization tions to be solved. To solve the problem, we assume that π p deviations from detailed balance are small: πji pij ji ≈ 1, We consider the Caliber 1 resulting in wij ≈ 2 . The approximation is applied to each Markovian jump individually, such that the aggre- X pij X X gate contributions to a microtrajectory may yield signif- C=− πi pij ln + µi πi pij − 1 i,j qij i j ican entropy productions. The approximation applied to ! Eq. A5 yields X X X +ζ( πi − 1) + νj πi pij − πj (A1) 1 q i j i pij = qij exp −2 + µi + νj + µj + νi + ∆Sij − ∆Sij . 2 pij P(A8) X + πi αij ln − ∆Sij . ij pji Using the result of Eq. A7 and the definition ci = k γik we obtain The maximization with respect to the transition proba- 1 q bilities pij gives pij =qij exp ζ + ci + cj + ∆Sij − ∆Sij 2 (A9) pij αij αji √ 1 0 = −πi ln −πi +πi µi +πi νj +πi −πj . (A2) = qij qji exp ζ + (ci + cj + ∆Sij ) . qji pij pij 2 Solving for pij with πi 6= 0 Appendix B: Local-entropy production in collective γij coordinates pij = qij exp −1 + µi + νj + , (A3) pij π To solve the reweighting equation we need an expres- where γij = αij − πji αji is used. Enforcing the local q sion for the relative local entropy production ∆Sij −∆Sij p entropy productions explicitly by ∆Sij = ln pij ji and after between target and reference states, the latter being in-
11 dicated by superscript q. The indices i, j denote mi- force is directed along the CVs. We find crostates that occur from discretizing the coordinates of D the system of interest. Having access to the full set of Z 1 X ∂G ∂zd ∂zd coordinates allows us to analyze a trajectory x(t) and ∆S[{zt }] = dt + fd kB T ∂zd ∂t ∂t calculate the entropy production using d Z D Z ! 1 dG X ∂zd F · ẋ Z = dt + dt fd ∆S[x(t)] = dt , (B1) kB T dt ∂t kB T d D Z where ẋ is the velocity, F is the force, and T is the G(zT ) − G(z0 ) 1 X ∂zd = + dt fd . temperature.38 Making use of numerically discretized tra- kB T kB T ∂t d jectories, x(t) ≈ {xk }, ∆S({xk }) is approximated be- (B5) tween initial and target points, x0 and xT , respectively Analogous to reweighting in full configurational coor- (d) (d) dinates, two points in CV-space can be connected along XX T xt − xt−1 F (d) (xt ) + F (d) (xt−1 ) or against a constant external force. This can create am- ∆S[{xk }] ≈ biguity for periodic systems. By choosing the lag-time 2kB T d t=1 sufficiently small, one set of (long) trajectories has negli- T gible weight compared to the other one. The expression 1 X X (d) ≈ xt F (d) (xt−1 ) − F (d) (xt+1 ) , for local entropy production thereby only depends on the 2kB T d t=0 initial and target points of the trajectory (B2) where Stratonovich integration is used64 and d iterates G(zT ) − G(z0 ) + f · (z0 − zT ) over the configurational dimensions. The second approx- ∆S(z0 , zT ) ≈ . (B6) kB T imation neglects end terms assuming long enough trajec- tories. We project this equation to D-dimensional col- To highlight the reweighting aspect, we focus on the lective variables z = M (x), making use of a linear map- change of the entropy production between reference (“q”) ping operator M . Analogous to structure-based coarse- and target systems graining, the local entropy production is transformed to CV space by a path-ensemble average47 ∆S(z0 , zT ) − ∆S q (z0 , zT ) = 1 R D[x(t)]δ(M (x(t)) − z(t))∆S[x(t)] G(zT ) − Gq (zT ) − (G(z0 ) − Gq (z0 )) (B7) ∆S[z(t)] = R kB T D[x(t)]δ(M (x(t)) − z(t)) + (zT − z0 ) · (f − f q ) . QT R (B3) dxt δ(M (xt ) − zt )∆S[{xt }] ∆S[{zk }] = t=0QT R . t=0 dxt δ(M (xt ) − z) 1 J. I. Steinfeld, J. S. Francisco, and W. L. Hase, Chemical kinetics Using the approximation in Eq. B2, all integrals over and dynamics (Prentice Hall Upper Saddle River, NJ, 1999). (d) 2 S. Eberhard, G. Finazzi, and F.-A. Wollman, Annual review of xt can be performed separately and we find the entropy genetics 42, 463 (2008). production in CV space 3 A. U. Khan and M. Kasha, Proceedings of the National Academy of Sciences 80, 1767 (1983). D T 4 R. S. Knox, Biophysical journal 9, 1351 (1969). 1 X X (d) (d) ∆S[z(t)] = zt F (zt−1 ) − F (d) (zt+1 ) , 5 A. Arkin, J. Ross, and H. H. McAdams, Genetics 149, 1633 2kB T (1998). d t=0 6 K. O. Chong, J.-R. Kim, J. Kim, S. Yoon, S. Kang, and K. An, (B4) where F (d) (zt ) are mean forces projected along dimen- Communications Physics 1, 1 (2018). 7 C. Cormick, T. Schaetz, and G. Morigi, New Journal of Physics sion d evaluated at time t. The solution above requires to 13, 043019 (2011). integrate along stochastic trajectories. We approximate 8 J. P. Dougherty, Philosophical Transactions of the Royal Society the equation by ignoring fluctuations, allowing us to ap- of London. Series A: Physical and Engineering Sciences 346, 259 ply Riemann integration. By averaging over all existing (1994). 9 J. R. Perilla, B. C. Goh, C. K. Cassidy, B. Liu, R. C. Bernardi, pathways between two microstates later on, this approxi- T. Rudack, H. Yu, Z. Wu, and K. Schulten, Current opinion in mation becomes exact because the fluctuations are a sym- structural biology 31, 64 (2015). metric contribution to dynamics and do not contribute 10 A. M. Ferrenberg and R. H. Swendsen, Computers in Physics 3, to entropy production. We express the forces through a 101 (1989). 11 C. A. F. De Oliveira, D. Hamelberg, and J. A. McCammon, The conservative contribution derived from the potential of Journal of chemical physics 127, 11B605 (2007). mean force, G(z), and a non-conservative contribution, 12 P. Tiwary and M. Parrinello, Physical review letters 111, 230602 f , resulting in F = − ∂G(z) ∂z + f . The non-conservative (2013). 13 H. Wu, F. Paul, C. Wehmeyer, and F. Noé, Proceedings of the National Academy of Sciences 113, E3221 (2016). 14 J. F. Rudzinski, K. Kremer, and T. Bereau, The Journal of Chemical Physics 144 (2016).
12 15 L. S. Stelzl, A. Kells, E. Rosta, and G. Hummer, Journal of 40 M. A. Branch, T. F. Coleman, and Y. Li, SIAM Journal on chemical theory and computation 13, 6328 (2017). Scientific Computing 21, 1 (1999). 16 L. Donati, C. Hartmann, and B. G. Keller, The Journal of chem- 41 J. Tóbik, R. Martoňák, and V. Cambel, Physical Review B 96, ical physics 146, 244112 (2017). 140413 (2017). 17 H. Wan, G. Zhou, and V. A. Voelz, Journal of chemical theory 42 R. Radhakrishnan and B. L. Trout, Journal of the American and computation 12, 5768 (2016). Chemical Society 125, 7743 (2003). 18 P. B. Warren and R. J. Allen, Molecular Physics 116, 3104 43 P. G. Bolhuis, C. Dellago, and D. Chandler, Proceedings of the (2018). National Academy of Sciences 97, 5877 (2000). 19 J. D. Russo, J. Copperman, and D. M. Zuckerman, arXiv 44 O. Valsson, P. Tiwary, and M. Parrinello, Annual review of phys- preprint arXiv:2006.09451 (2020). ical chemistry 67, 159 (2016). 20 M. Bause, T. Wittenstein, K. Kremer, and T. Bereau, Physical 45 F. Noé and C. Clementi, Current Opinion in Structural Biology Review E 100, 060103 (2019). 43, 141 (2017). 21 E. T. Jaynes, in Complex Systems—Operational Approaches in 46 M. A. Rohrdanz, W. Zheng, and C. Clementi, Annual review of Neurobiology, Physics, and Computers (Springer, 1985) pp. 254– physical chemistry 64, 295 (2013). 269. 47 W. G. Noid, J.-W. Chu, G. S. Ayton, V. Krishna, S. Izvekov, 22 P. D. Dixit, J. Wagoner, C. Weistuch, S. Pressé, K. Ghosh, and G. A. Voth, A. Das, and H. C. Andersen, The Journal of chemical K. A. Dill, The Journal of chemical physics 148, 010901 (2018). physics 128, 244114 (2008). 23 P. D. Dixit, A. Jain, G. Stock, and K. A. Dill, Journal of chemical 48 J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, theory and computation 11, 5464 (2015). J. D. Chodera, C. Schütte, and F. Noé, The Journal of chemical 24 P. D. Dixit, E. Lyashenko, M. Niepel, and D. Vitkup, bioRxiv , physics 134, 174105 (2011). 137513 (2018). 49 See Supplemental Material at [URL] for additional information. 25 Z. F. Brotzakis, M. Vendruscolo, P. Bolhuis, et al., arXiv preprint 50 J. F. Rudzinski and W. G. Noid, Journal of chemical theory and arXiv:2006.00868 (2020). computation 11, 1278 (2015). 26 D. Meral, D. Provasi, and M. Filizola, The Journal of chemical 51 J. D. Halverson, T. Brandes, O. Lenz, A. Arnold, S. Bevc, physics 149, 224101 (2018). V. Starchenko, K. Kremer, T. Stuehn, and D. Reith, Computer 27 K. Ghosh, P. D. Dixit, L. Agozzino, and K. A. Dill, Annual Physics Communications 184, 1129 (2013). Review of Physical Chemistry 71, 213 (2020). 52 J. Rudzinski and T. Bereau, The European Physical Journal Spe- 28 C. Maes, Non-dissipative effects in nonequilibrium systems cial Topics 225, 1373 (2016). (Springer, 2018). 53 T. Bereau and J. F. Rudzinski, Physical review letters 121, 29 F. Noé and E. Rosta, “Markov models of molecular kinetics,” 256002 (2018). (2019). 54 J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, 30 B. E. Husic and V. S. Pande, Journal of the American Chemical J. D. Chodera, C. Schütte, and F. Noé, The Journal of chemical Society 140, 2386 (2018). physics 134, 174105 (2011). 31 G. R. Bowman, V. S. Pande, and F. Noé, An introduction 55 P. Deuflhard and M. Weber, Linear algebra and its applications to Markov state models and their application to long timescale 398, 161 (2005). molecular simulation, Vol. 797 (Springer Science & Business Me- 56 N. F. Polizzi, M. J. Therien, and D. N. Beratan, Israel journal dia, 2013). of chemistry 56, 816 (2016). 32 N. Plattner, S. Doerr, G. De Fabritiis, and F. Noé, Nature chem- 57 E. Suárez, A. J. Pratt, L. T. Chong, and D. M. Zuckerman, istry 9, 1005 (2017). Protein Science 25, 67 (2016). 33 J. W. Gibbs, Elementary principles in statistical mechanics: de- 58 P. K. Depa and J. K. Maranas, The Journal of chemical physics veloped with especial reference to the rational foundations of 123, 094901 (2005). thermodynamics (C. Scribner’s sons, 1902). 59 M. Guenza, The European Physical Journal Special Topics 224, 34 E. T. Jaynes, Annual Review of Physical Chemistry 31, 579 2177 (2015). (1980). 60 J. F. Rudzinski, Computation 7, 42 (2019). 35 L. Agozzino and K. A. Dill, Physical Review E 100, 010105 61 M. K. Meinel and F. Müller-Plathe, Journal of Chemical Theory (2019). and Computation 16, 1411 (2020). 36 G. E. Crooks, Journal of Statistical Physics 90, 1481 (1998). 62 A. Likas, N. Vlassis, and J. J. Verbeek, Pattern recognition 36, 37 G. E. Crooks, Journal of Statistical Mechanics: Theory and Ex- 451 (2003). periment 2011, P07008 (2011). 63 H.-S. Park and C.-H. Jun, Expert systems with applications 36, 38 U. Seifert, Physical review letters 95, 040602 (2005). 3336 (2009). 39 P. D. Dixit and K. A. Dill, Journal of chemical theory and com- 64 N. G. Van Kampen, Stochastic processes in physics and chem- putation 10, 3002 (2014). istry, Vol. 1 (Elsevier, 1992).
You can also read