Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory

The article reports an attempt to study the protein folding problem by generalized-ensemble Monte Carlo simulations with reference interaction site model theory. Generalized-ensemble algorithms greatly enhance the configurational space sampling in computer simulations. The reference interaction si...

Повний опис

Збережено в:
Бібліографічні деталі
Дата:2007
Автори: Mitsutake, A., Kinoshita, M., Hirata, F., Okamoto, Y.
Формат: Стаття
Мова:English
Опубліковано: Інститут фізики конденсованих систем НАН України 2007
Назва видання:Condensed Matter Physics
Онлайн доступ:http://dspace.nbuv.gov.ua/handle/123456789/118897
Теги: Додати тег
Немає тегів, Будьте першим, хто поставить тег для цього запису!
Назва журналу:Digital Library of Periodicals of National Academy of Sciences of Ukraine
Цитувати:Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory / A. Mitsutake, M. Kinoshita, F. Hirata, Y. Okamoto // Condensed Matter Physics. — 2007. — Т. 10, № 4(52). — С. 495-508. — Бібліогр.: 32 назв. — англ.

Репозитарії

Digital Library of Periodicals of National Academy of Sciences of Ukraine
id irk-123456789-118897
record_format dspace
spelling irk-123456789-1188972017-06-02T03:04:01Z Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory Mitsutake, A. Kinoshita, M. Hirata, F. Okamoto, Y. The article reports an attempt to study the protein folding problem by generalized-ensemble Monte Carlo simulations with reference interaction site model theory. Generalized-ensemble algorithms greatly enhance the configurational space sampling in computer simulations. The reference interaction site model theory treats solvent effects with solvent molecular shape and estimate solvation free energy around proteins. We have developed simulation algorithms which combine generalized-ensemble algorithms and one-dimensional reference interaction site model theory. This treatment can also use a simulation with three-dimensional reference interaction site model theory. In this review, we describe the methods and present the results of these simulations for a peptide. Стаття присвячена вивченню проблеми згортання протеїну в рамках комбiнацiї узагальненого ансамблю Монте-Карло та теорiї базисної моделi силових центрiв. Алгоритми узагальненого ансамблю можуть ефективно вибирати конфiгурацiйний простiр в комп’ютерних моделюваннях. Теорiя базисних взаємодiючих силових центрiв може враховувати ефекти розчинника в залежностi вiд форми молекул, визначаючи енергiю сольватацiї протеїнiв. Ми розробили комп’ютернi алгоритми, що комбiнують алгоритми узагальненого ансамблю i одновимiрну базисну модель взаємодiючих силових центрiв. Цей пiдхiд може бути також поєднаний з тривимiрною моделлю взаємодiючих силових центрiв. В даному оглядi описуються методи i представлено результати для пептиду. 2007 Article Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory / A. Mitsutake, M. Kinoshita, F. Hirata, Y. Okamoto // Condensed Matter Physics. — 2007. — Т. 10, № 4(52). — С. 495-508. — Бібліогр.: 32 назв. — англ. 1607-324X PACS: 31.15.Qg, 61.25.Em, 87.15.Aa DOI:10.5488/CMP.10.4.495 http://dspace.nbuv.gov.ua/handle/123456789/118897 en Condensed Matter Physics Інститут фізики конденсованих систем НАН України
institution Digital Library of Periodicals of National Academy of Sciences of Ukraine
collection DSpace DC
language English
description The article reports an attempt to study the protein folding problem by generalized-ensemble Monte Carlo simulations with reference interaction site model theory. Generalized-ensemble algorithms greatly enhance the configurational space sampling in computer simulations. The reference interaction site model theory treats solvent effects with solvent molecular shape and estimate solvation free energy around proteins. We have developed simulation algorithms which combine generalized-ensemble algorithms and one-dimensional reference interaction site model theory. This treatment can also use a simulation with three-dimensional reference interaction site model theory. In this review, we describe the methods and present the results of these simulations for a peptide.
format Article
author Mitsutake, A.
Kinoshita, M.
Hirata, F.
Okamoto, Y.
spellingShingle Mitsutake, A.
Kinoshita, M.
Hirata, F.
Okamoto, Y.
Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
Condensed Matter Physics
author_facet Mitsutake, A.
Kinoshita, M.
Hirata, F.
Okamoto, Y.
author_sort Mitsutake, A.
title Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
title_short Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
title_full Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
title_fullStr Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
title_full_unstemmed Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
title_sort combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory
publisher Інститут фізики конденсованих систем НАН України
publishDate 2007
url http://dspace.nbuv.gov.ua/handle/123456789/118897
citation_txt Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory / A. Mitsutake, M. Kinoshita, F. Hirata, Y. Okamoto // Condensed Matter Physics. — 2007. — Т. 10, № 4(52). — С. 495-508. — Бібліогр.: 32 назв. — англ.
series Condensed Matter Physics
work_keys_str_mv AT mitsutakea combinationofgeneralizedensemblealgorithmsandonedimensionalreferenceinteractionsitemodeltheory
AT kinoshitam combinationofgeneralizedensemblealgorithmsandonedimensionalreferenceinteractionsitemodeltheory
AT hirataf combinationofgeneralizedensemblealgorithmsandonedimensionalreferenceinteractionsitemodeltheory
AT okamotoy combinationofgeneralizedensemblealgorithmsandonedimensionalreferenceinteractionsitemodeltheory
first_indexed 2025-07-08T14:51:30Z
last_indexed 2025-07-08T14:51:30Z
_version_ 1837090777777307648
fulltext Condensed Matter Physics 2007, Vol. 10, No 4(52), pp. 495–508 Combination of generalized-ensemble algorithms and one-dimensional reference interaction site model theory A.Mitsutake1, M.Kinoshita2, F.Hirata3,4, Y.Okamoto5 1 Department of Physics, Keio University, Yokohama, Kanagawa 223–8522, Japan 2 Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611–0011, Japan 3 Department of Theoretical Studies, Institute for Molecular Science, Okazaki, Aichi 444–8585, Japan 4 Department of Functional Molecular Science, The Graduate University for Advanced Studies, Okazaki, Aichi 444–8585, Japan 5 Department of Physics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464–8602, Japan Received October 15, 2007, in final form November 26, 2007 The article reports an attempt to study the protein folding problem by generalized-ensemble Monte Carlo sim- ulations with reference interaction site model theory. Generalized-ensemble algorithms greatly enhance the configurational space sampling in computer simulations. The reference interaction site model theory treats solvent effects with solvent molecular shape and estimate solvation free energy around proteins. We have developed simulation algorithms which combine generalized-ensemble algorithms and one-dimensional refer- ence interaction site model theory. This treatment can also use a simulation with three-dimensional reference interaction site model theory. In this review, we describe the methods and present the results of these simula- tions for a peptide. Key words: generalized-ensemble algorithm, multicanonical algorithm, replica-exchange method, RISM, Monte Carlo, protein folding PACS: 31.15.Qg, 61.25.Em, 87.15.Aa 1. Introduction In usual simulations, a peptide molecule and quite numerous surrounding solvent molecules are treated simultaneously. However, the number of solvent molecules required becomes huge for a large peptide or a protein. Moreover, in the usual simulations with explicit molecules, the treat- ment of a mixture of more than one solvent species, such as a saline solution, is nontrivial. The reference interaction site model (RISM) theory [1–4] was developed from the statistical mechanics of molecular liquid and can treat not only an infinite number of solvent molecules but also the salt effects. The theory estimates the solvation free energy of a solute molecule with fixed structures in infinite dilution. Therefore, the RISM theory is particularly suitable for these treatments. Besides the solvation theory, effective simulation algorithms are also important for studying protein folding and stability. The generalized-ensemble algorithm is one of the most powerful configurational space sampling methods (for a review, see [5]). This method is based on non- Boltzmann probability weight factors so that a random walk in potential energy space may be realized. The random walk allows the simulation to go over any energy barrier and to sample a much wider configurational space than by conventional methods. From the simulation run, one can not only locate the energy global minimum but also calculate the canonical-ensemble average of any physical quantity at any temperature. Two of the most well-known generalized-ensemble algorithms for simulating protein systems are perhaps the multicanonical algorithm (MUCA) [6,7] and the replica-exchange method (REM) [8–11]. In the previous works, we developed an algorithm for solving the full one-dimensional (1D) c© A.Mitsutake, M.Kinoshita, F.Hirata, Y.Okamoto 495 A.Mitsutake et al. RISM equations for the system of a solute molecule with many atoms in water and salt solution that is more robust and orders of magnitude faster than the conventional one [4,12]. We analyzed the solvation structure and conformational stability of a pentapeptide Met-enkephalin with 1D RISM theory [13]. We also reported the results of the attempt to combine Monte Carlo (MC) simulated annealing and 1D RISM theory [14]. Moreover, we tried to combine the generalized-ensemble algorithms and 1D RISM theory to per- form simulations of protein system [15,16]. First, we combined 1D RISM theory and multicanonical MC algorithm and performed a simulation of Met-enkephalin in aqueous solution. The effectiveness of the combinational simulation was shown in [15]. In the simulation, however, we only used the solvation free energy at “a room temperature” as the solvent effects because the multicanonical algorithm cannot treat temperature-dependent energy even though the solvent free energy depends on temperature. In order to include the effects of the temperature dependence of the solvation free energy, we then used the replica-exchange MC simulation. In this case, we had to modify the for- mula for the replica-exchange criterion to treat the temperature-dependent energy and performed the replica-exchange MC simulation with 1 D RISM theory [16]. In the article, we review the above attempt to combine the generalized-ensemble MC algorithms and 1D RISM theory. 2. Method 2.1. Hybrid RISM/MC method In usual MC simulations, the energy function is given by an instant value of the potential energy for the entire system configuration. In the simulations given in this article, the total energy function ET at a temperature T is given by the sum of the conformational energy EC and the solvation free energy ESOL of the biomolecule in a fixed conformation rC: ET(rC, T ) = EC(rC) + ESOL(rC, T ). (1) Here, the conformational energy EC consists, for example, of electrostatic term, Lennard-Jones term, and torsion-energy term. In the article, we use the RISM theory to calculate ESOL. The total energy function in equation (1) is different from the usual potential energy, because we use the “solvation free energy” ESOL. We refer to the total energy function as “the solvent-induced potential”. The solvent-induced potential depends on temperature because the solvation free energy is a function of temperature. The validity of using the solvent-induced potential as total energy in Monte Carlo simulations was discussed in book [4] and [16]. From the argument, the ensemble average of a physical quantity A(rC), which depends solely on coordinates rC, is given by 〈A〉T = ∫ drCA(rC) exp{−β(EC + ESOL)} ∫ drC exp{−β(EC + ESOL)} . (2) The “exact” equation, equation (2), implies that the ensemble average of A(rC) can be calculated using the integration over the conformation of the biomolecule alone when the energy function in the Boltzmann factor is taken to be the solvent-induced potential ET = EC+ESOL in equation (1). In the case of the MC simulation at a temperature T with the weight factor W (rC, T ) = exp[−β{ET(rC, T )}], the canonical transition probability from the state X = {r [i] C } to X ′ = {r [j] C } is given by the Metropolis criterion [17]: w(X → X ′) = { 1 , for ∆ 6 0 , exp (−∆) , for ∆ > 0 , (3) where ∆ = β(ET ( r [j] C , T ) −ET ( r [i] C , T ) ) and β = 1/kbT . After performing the simulation, we can 496 Combination of generalized-ensemble algorithms and RISM exactly obtain the average of physical quantity A(rC) by 〈A〉T = ∫ drCA(rC)W (rC, T ) ∫ drCW (rC, T ) = 1 NSIM NSIM ∑ i=1 Ai(rC), (4) where NSIM is the number of samples taken. 2.2. One-dimensional reference interaction site model ESOL is calculated using 1D RISM theory [4,12–14]. It is assumed that the solute is present in the solvent at an infinite-dilution limit. The calculation process is split into two steps where the bulk solvent (step 1) and the solvent near a solute molecule (step 2) are successively treated. The basic equations for step 2 comprise the site-site Ornstein-Zernike (SSOZ) relation and the hypernetted-chain (HNC) closure equation. Let the subscript S and the subscript V denote the solute molecule and the solvent molecule, respectively. The model of the water molecule that we adopted is the SPC/E model [18]. Suppose that the solute molecule has NS atomic sites and the solvent molecule has NV atomic sites (NV = 3 for water). The SSOZ relation in the Fourier space is given by ηsv = wsvcsvHvv − csv , (5) ηsv = hsv − csv , (6) Hvv = wvv + ρvhvv , (7) where Hvv, ηsv, and wss are NV × NV, NS × NV, and NS × NS matrices, respectively, ρv is the matrix of the number density of the solvent, h is the matrix of the site-site intermolecular total correlation functions, c is the matrix of the site-site intermolecular direct correlation functions, and w is the intramolecular correlation matrix. Hvv is calculated in step 1 and is part of the input data for step 2. The HNC closure equation is given by cAB(r) = exp {−uAB(r)/kBT + ηAB(r)} − ηAB(r)− 1, (8) A = 1, . . . , NS , B = 1, . . . , NV, ηAB(r) = hAB(r)− cAB(r), (9) where uAB(r) is the site-site interaction potential energy. A is an atomic site in the solute molecule, and B is an atomic site in the solvent molecule. After the site-site intermolecular correlation functions hAB(r) and cAB(r) are obtained by solving the RISM-HNC equations, the solvation free energy ESOL for the solute molecule is calculated from [19] ESOL(rC, T ) = 4πkBT ∫ ∞ 0 F (r)r2dr, (10) F (r) = NS ∑ A=1 ∑ B= H,O ρB [ hAB(r)2 2 − cAB(r)− hAB(r)cAB(r) 2 ] , (11) ρ H = 2ρV , ρO = ρV , (12) where ρ are number densities. The dimensionless number density of water ρVd 3 (d=2.8 Å) is 0.7317. 3. Generalized-ensemble algorithms The generalized-ensemble algorithm is based on non-Boltzmann probability weight factors so that a random walk in potential energy space may be realized. We introduce two generalized- ensemble algorithms, multicanonical algorithm and replica-exchange method, which are commonly used in computer simulations of biomolecules. 497 A.Mitsutake et al. 3.1. Multicanonical algorithm We briefly summarize the idea and implementations of the multicanonical algorithm [6,7]. In the multicanonical ensemble the probability distribution of energy Pmu(E) is defined in such a way that a configuration with any energy has equal probability: The multicanonical ensemble is based on the following non-Boltzmann weight factor Wmu(E) so that the probability distribution of potential energy Pmu(E) is flat and a one-dimensional random walk in potential energy space is realized: Pmu(E) ∝ n(E)Wmu(E) = const , (13) where n(E) is the density of states. We then have Wmu(E) ∝ n−1(E) . (14) Unlike in the case for the canonical ensemble where the Boltzmann weight factor is used, the mul- ticanonical weight factor Wmu is not a priori known, and one needs its estimator for a numerical simulation. The estimator of the multicanonical weight factor Wmu is usually calculated from iter- ations of short multicanonical simulations (see, for instance, [21–23]). Once the weight factor Wmu has been determined, one performs with this weight factor a multicanonical simulation with high statistics. Monitoring the energy in this simulation, one would see that a random walk between high-energy states and the ground-state configurations is realized. Finally, from this simulation one can not only locate the energy global minimum but also calculate the ensemble average of any phys- ical quantity X for a wide range of temperatures (T = 1/kbβ) by the reweighing techniques [24]: 〈A〉T = ∫ dE A(E)Pmu(E)w−1 mu(E)e−βE ∫ dE Pmu(E)w−1 mu(E)e−βE , (15) where Pmu(E) is the distribution of energy obtained by the final simulation. We remark that we use ET(= EC +ESOL) for E, where ESOL depends on temperature. However, E in the multicanonical simulation should be independent of temperature. Therefore, we fixed the temperature in ESOL at 298.15 K in the present simulation. This means that the expectation value is accurate only in the vicinity of this temperature in the present simulation. 3.2. Modified replica-exchange MC method In our simulations, we use solvent-induced potential as total potential energy of the system. The solvent-induced potential is a function of temperature because the solvation free energy ESOL depends on temperature. When the potential energy depends on temperature, we need to modify the formula for the regular replica-exchange method. In this subsection, we describe the modified replica-exchange method which can treat the temperature-dependent energy function (see [16,20]). The ensemble of replica-exchange method consists of M non-interacting copies of the original system in the canonical ensemble at M different temperatures Tm (m = 1, . . . ,M). We arrange the replicas so that there is always exactly one replica at each temperature. Then there is a one-to-one correspondence between replicas and temperatures; the label i (i = 1, . . . ,M) for replicas is a permutation of the label m (m = 1, . . . ,M) for temperatures, and vice versa: { i = i(m) ≡ f(m) , m = m(i) ≡ f−1(i) , (16) where f(m) is a permutation function of m and f−1(i) is its inverse. Let X = {x [i(1)] 1 , . . . , x [i(M)] M } = {x [1] m(1), . . . , x [M ] m(M)} stand for a “state” in the replica-exchange ensemble. The state X is specified by the M sets of coordinates x [i] m in replica i at temperature 498 Combination of generalized-ensemble algorithms and RISM Tm. In the replica-exchange simulation with the solvent-induced potential, the weight factor for the state X = {· · · , x [i(m)] m , · · ·} is given by the product of exp[−β{ET(r [i] C , Tm)}] for each replica: WREM(X) = M ∏ i=1 exp ( −βm(i) { ET ( r [i] C ;Tm(i) )}) , (17) = M ∏ m=1 exp ( −βm { ET ( r [i(m)] C ;Tm )}) , (18) where i(m) and m(i) are the permutation functions in equation (16) and r [i] C is a coordinate of a protein in replica i at temperature Tm. Note that the temperature dependence of the total energy ET comes from the solvation free energy ESOL (see equations (8), (10), and (11)). We now consider exchanging a pair of replicas, i and j, which are at temperatures Tm and Tn, respectively: X = {· · · , x [i] m, . . . , x [j] n , · · ·} → X ′ = {· · · , x [j] m , . . . , x [i] n , · · ·}. From the condition of detailed balance of replica-exchange, we obtain the following relation: WREM(X)w(X → X ′) = WREM(X ′)w(X ′ → X), (19) where w(X → X ′) is the transition probability of going from state X to state X ′ and w(X ′ → X) is the transition probability of going from state X ′ to state X. From equations (18) and (19), we have (see also [16,20]) w(X → X ′) w(X ′ → X) = exp(−∆), (20) where ∆ = βm { ET ( r [j] C , Tm ) − ET ( r [i] C , Tm )} − βn { ET ( r [j] C , Tn ) − ET ( r [i] C , Tn )} . (21) Here, r [i] C and r [j] C are the set of the coordinates of atoms in replica i and replica j, respectively. We remark that if ET is independent of temperature, we have ∆ = (βm − βn) ( ET ( r [j] C ) − ET ( r [i] C )) , (22) which we usually use in the regular replica-exchange simulation. As a result, the transition probability of the replica-exchange process is given by the Metropolis criterion [17]: w(X → X ′) = { 1 , for ∆ 6 0 , exp (−∆) , for ∆ > 0 , (23) where ∆ is given by equation (21). Without loss of generality we can assume T1 < T2 < · · · < TM . A REM MC simulation is then realized by alternately performing the following two steps: 1. Each replica in canonical ensemble of the fixed temperature is simulated simultaneously and independently for certain MC steps (see equation (3)). 2. A pair of replicas at neighboring temperatures, say x [i] m and x [j] m+1, are exchanged with the probability w(X → X ′) in equation (23). Note that in step 2 we exchange only the pairs of replicas corresponding to the neighboring tem- peratures. 4. Computational details The peptide we considered in the present article is Met-enkephalin, and it has the amino-acid sequence Tyr-Gly-Gly-Phe-Met. The peptide has 75 atomic sites, i.e., NS = 75. The 1D RISM- HNC equations and the replica-exchange method have been incorporated in the program KONF90 499 A.Mitsutake et al. [25,26]. A hybrid of the Newton-Raphson method and the Picard iteration strategy for solving the RISM equations have been employed as described in detail in [4]. The conformational energy is based on ECEPP/2 [27]–[29]. In ECEPP/2, the conformational energy is given by electrostatic energy Eele, hydrogen-bond energy Eh, Lennard-Jones energy Ev, and torsion energy Etor. The peptide-bond dihedral angles ω were set equal to 180◦ for simplicity. The remaining dihedral angles φ and ψ in the main chain and χ in the side chains constitute the variables to be updated in the simulations (for Met-enkephalin the number of degrees of freedom is 19). One MC sweep consists of updating all these angles once with Metropolis evaluation [17] for each update. For the calculation of multicanonical weight factors, five iterations of 100 MC sweeps (500 MC sweeps in total) were required. Initial conformations were randomly generated. All thermodynamic quantities were then calculated from one production run of 10000 MC sweeps, which followed additional 100 MC sweeps for equilibration. ESOL are calculated at 298.15. For the replica-exchange MC simulation, we used four replicas with four temperature values, 200 K, 298 K, 368 K, and 500 K. ESOL was calculated at the corresponding temperatures. Before taking the data, we made equilibration simulations of 5,000 MC sweeps per replica. The total number of MC sweeps for the subsequent REM production run was 20,000 MC sweeps per replica. Replica exchange was tried every 20 MC sweeps. 5. Results and discussion 5.1. Results of the multicanonical MC simulation In figure 1 we show the time series of the solvent-induced potential (ET) for Met-enkephalin from regular canonical MC simulations at 300 K and 800 K (we refer to those simulations as MC– 300 and MC–800, respectively) and from the multicanonical MC simulation. The time series of MC–300 and MC–800 consist of 1000 MC sweeps, while those of the multicanonical MC simulation consist of 10000 MC sweeps. ESOL were calculated at 298.15 K for MC–300 and MC–800. The solvent-induced potential ranges from about 173 to 185 kcal/mol in MC–300 and from 178 to 220 kcal/mol in MC–800, respectively (see figure 1(a)). On the other hand, the solvent-induced potential range between 172 and 200 kcal/mol is observed (already within 1000 MC sweeps) by the multicanonical MC simulation (see figure 1(b)). It is found that the multicanonical MC simulation covers a much wider configurational space than the canonical simulations. A random walk in energy space covering a range of about 170-200 kcal/mol was successfully realized during 10000 MC sweeps. 170 175 180 185 190 195 200 0 200 400 600 800 1000 E ( kc al /m ol ) MC sweeps MC-800 MC-300 170 175 180 185 190 195 200 0 2000 4000 6000 8000 10000 E ( kc al /m ol ) MC sweeps (a) (b) Figure 1. Time series of the solvent-induced potential ET (kcal/mol) for: (a) the regular canoni- cal Monte Carlo simulations at 300 K (MC–300) and 800 K (MC–800) (1000 MC sweeps); (b) the multicanonical MC simulation (10000 MC sweeps) [15]. In all simulations, ESOL was calculated at 298.15 K. We now consider the low-energy conformations of Met-enkephalin obtained during the multi- canonical MC simulation in aqueous solution (which we refer to as MURISM). Multicanonical MC 500 Combination of generalized-ensemble algorithms and RISM simulations in the gas phase (which we refer to as MUGAS) [31,32] and MC simulated annealing runs in aqueous solution based on 1D RISM theory (which we refer to as SARISM) [14] were also performed in previous works. We compare the low-energy conformations obtained by MURISM with those from MUGAS and SARISM. In figure 2 we show the lowest-energy conformations ob- tained from MUGAS, SARISM, and MURISM. For the latter two cases (figures 2(b) and 2(c)), eight low-energy conformations are superimposed. In figure 2(b) the eight conformations are the lowest-energy conformations from eight different simulated annealing runs [14]. In figure 2(c) the eight representative conformations were chosen from those in which the energy was less than 174.0 kcal/mol and the root-mean-square distances of the backbone atoms from Gly2 to Phe4 were larger than 0.4 Å (this implies that similar structures are excluded as much as possible). (a) (b) (c) Figure 2. The lowest-energy conformations of Met-enkephalin obtained by the previous multi- canonical simulation in gas phase [32] (a), superposition of the eight lowest-energy conformations from the previous simulated annealing runs in aqueous solution [14] (b), and superposition of the eight representative low-energy conformations of Met-enkephalin obtained by the multicanonical MC simulation in aqueous solution [15] (c). As is shown in figure 2(a), conformations in gas phase form intrachain hydrogen bonds between the amide nitrogen and carbonyl oxygen in the backbone, which make the structure compact. Especially, the lowest-energy conformation has two intrachain hydrogen bonds between Gly2 and Met5 and forms a type II’ β turn involving Gly-Gly-Phe-Met (see figure 2(a)) [32]. The end-to-end distance of this conformation is 4.8 Å and the conformation is indeed compact. Here, the end- to-end distance is defined as the distance between the nitrogen atom at the N terminus and the oxygen atom at the C terminus. On the other hand, for the multicanonical MC simulation in aqueous solution, the eight con- formations in figure 2(c) are fully extended. The results are in good agreement with those of the simulated annealing runs (see figure 2(b)). The end-to-end distance of the lowest-energy conforma- tion of MURISM is 11.7 Å, close to that of SARISM (13.6 Å), and the average end-to-end distance at 298.15 K is also about 12 Å. It is clear that the conformations tend to be extended in aqueous solution. Apparently, the amide nitrogen and carbonyl oxygen in the backbone of low-energy con- formation in aqueous solution form hydrogen bonds with oxygen and hydrogen of water molecules, respectively. The solvent effect based on 1D RISM theory tends to make hydrogen bonds with backbone atoms and water molecules. The results are in qualitatively good agreement with the observations in the NMR experiments [30]. The solvent-induced potential of the lowest-energy conformation in the multicanonical MC sim- ulation is 172.1 kcal/mol and lower than that obtained by the simulated annealing runs (SARISM) (175.4 kcal/mol) [14]. Moreover, the areas of dihedral angles of backbones are widely sampled in the multicanonical MC simulation (see [16]). The multicanonical MC simulation can thus sample much wider configurational space and provide lower-energy conformations than the simulated annealing simulation. 501 A.Mitsutake et al. 5.2. Results of the replica-exchange MC simulation The replica-exchange MC simulation was performed with four replicas at the four temperatures Tm: 200 K, 298 K, 368 K, and 500 K. In figure 3 the time series of the temperature label m and of the solvent-induced potential ET for one of the replicas are shown. We see that the replica takes every temperature value and randomly walks in temperature space as expected (figure 3(a)). Correspondingly, this replica undergoes a random walk in energy space (figure 3(b)). Other replicas also exhibit similar behavior (data not shown). In table 1 we list the acceptance ratios of replica exchange. It is clear that they are of the same order (the values vary between 16 % and 29 %). For an optimal performance of REM simulations, the acceptance ratios of replica exchange should be sufficiently uniform and large (say, > 10 %). We can see that the REM simulation has indeed achieved an optimal performance. 1 2 3 4 -5000 0 5000 10000 15000 20000 25000 m MC sweeps (a) (b) Figure 3. Time series of the temperature label m (a) and of the solvent-induced potential ET (b) for one of the replicas [16]. The negative values of MC sweeps correspond to the part of equilibration. Table 1. Acceptance ratios of replica exchange. Pair of temperature [K] Acceptance ratio 200 ←→ 298 0.16 298 ←→ 368 0.29 368 ←→ 500 0.25 In figure 4 the time series of the solvent-induced potential and the histogram of the solvent- induced potential distribution at the four temperatures are shown. The difference of solvent-induced potential ET between the neighboring temperatures increases rapidly as the temperature difference increases, because one of its component energies, solvation free energy ESOL, depends on tempera- ture (see equation (10)) and is almost proportional to temperature (see [16]). From figure 4(a) and 4(b) we find that there are no overlaps between pairs of neighboring distributions. In the regular replica-exchange method, the histograms of the total potential energy distributions at neighboring temperatures need to overlap for the replica exchange to occur (see equation (22)). Therefore, at first glance, it seems that the number of temperatures chosen above (four) is too small. However, it turned out that the four temperatures were sufficient for ideal replica-exchange performances (see table 1). The point is that we exchange replicas by using ∆ in equation (21) instead of ∆ of the regular replica-exchange method in equation (22). This can be understood as follows. ET(r [i] C , T ) increases rapidly as temperature T increases and is almost proportional to temperature (see equa- tion (10)). The term β{ET(r [j] C , T )−ET(r [i] C , T )} is therefore almost temperature-independent. The magnitude of ∆ in equation (21) is then of the order of one. The acceptance probability exp(−∆) in equation (23) is thus sufficiently large for the replica exchange to occur. Hence, replica exchange is 502 Combination of generalized-ensemble algorithms and RISM -7 -6 -5 -4 -3 -2 -1 100 150 200 250 300 lo g (P (E T )) ET (kcal/mol) 200 298 368 500 (a) (b) Figure 4. Time series of the solvent-induced potential at the four temperature values (200, 298, 368, and 500 K) (a) and probability distributions of the solvent-induced potential at the four temperatures (b) [16]. indeed possible between all pairs of neighboring temperatures. This system requires much less num- ber of replicas than the case for a regular replica-exchange simulation where the energy function is temperature independent. (a) (b) Figure 5. The lowest-energy conformations of Met-enkephalin obtained by the replica-exchange simulation in aqueous solution at 200 K (a) and 298 K (b) [16]. We now consider the lowest-energy conformations of Met-enkephalin obtained by the replica- exchange MC simulation in aqueous solution. The lowest-energy conformations obtained at 200 K and 298 K are shown in figure 5(a) and figure 5(b), respectively. These conformations are fully extended as in the results of the multicanonical MC simulation. The results are also in qualitatively good agreement with the observations in the NMR experiments [30]. In table 2, we show the energy components and end-to-end distance for the lowest-energy confor- mations at 200 K (REP1) and 298 K (REP2) obtained by the replica-exchange simulation. We also list these values obtained by the multicanonical MC simulation in gas phase (MUGAS), simulated annealing runs (SARISM), the multicanonical MC simulation in aqueous solution (MURISM). The solvent-induced potential, conformational energy, and solvation free energy of MUGAS in parentheses were obtained from [14] in which these values of conformation almost identical to the lowest-energy conformation in gas phase (see figure 2(a)) were calculated. The values of the solvent-induced potential at 200 K (REP1) and 298 K (REP2) are 99.8 kcal/mol and 172.1 kcal/mol, respectively. The values of the conformational energy components of REP1 and REP2 are similar to each other, while the solvation free energy values are different due to different temperatures (see equation (10)). Each energy component of REP2 is almost identical 503 A.Mitsutake et al. Table 2. The energy components (kcal/mol), end-to-end distance de−e (Å) for the lowest-energy conformations obtained by a previous multicanonical MC simulation in gas phase (MUGAS), previous simulated annealing runs (SARISM), the multicanonical MC simulation in aqueous solution (MURISM), and at 200 K (REP1) and 298 K (REP2) from the replica-exchange MC simulation. The values in parentheses were calculated for the similar conformation of the lowest- energy conformation (see figure 2(a)) in gas phase in [14]. ET Econf Eele Eh Ev Et Esol de−e MUGAS (184.8) –12.2(–12.0) 20.3 –6.2 –27.5 1.2 (196.8) 4.8 SARISM 175.4 9.6 26.5 –1.0 –16.3 0.5 165.8 13.6 MURISM 172.2 11.6 27.2 –0.9 –15.5 0.9 160.6 11.7 REP1 99.8 12.3 29.2 –1.3 –16.0 0.4 87.5 10.9 REP2 172.1 11.2 28.4 –1.6 –16.2 0.6 160.9 9.2 with that of MURISM, reflecting the fact that the two cases correspond to the same temperature (298.15 K). The lower-energy conformations of SARISM, MURISM, and REP are similar to one another (i.e., extended) and have similar energy components. The conformational energy EC is about 10 kcal/mol and the solvent free energy ESOL is about 160 kcal/mol. On the other hand, EC of the lowest-energy conformation in gas phase is about –12 kcal/mol [32]. ESOL is about 196.8 kcal/mol [14]. ET of the lowest-energy conformation for gas phase is about 184.6 kcal/mol. EC of the lowest-energy conformation in aqueous solution is about 10 kcal/mol and it is much hi- gher than that in the gas phase because energy components of the conformational energy, Eele, Eh, and Ev, of the simulations with 1D RISM are higher than those in the gas phase. On the contrary, ESOL of SARISM, MURISM, and REP are much lower than that in MUGAS. As a result of the summation of ESOL and EC, in aqueous solution, the extended structures obtained by SARISM, MURISM, and REP are more stable than the compact ones by MUGAS. 0 2 4 6 8 10 12 14 200 250 300 350 400 450 500 d_ (e -e ) T (K) SOL GAS Figure 6. Average end-to-end distance obtained by the replica-exchange MC simulation in aque- ous solution (SOL) and the previous multicanonical simulation in gas phase (GAS) [16]. Here, the end-to-end distance is defined as the distance between the nitrogen atom at the N terminus and the oxygen atom at the C terminus. We calculated some thermodynamic quantities at the given temperatures by equation (4). In figure 6, the average end-to-end distance in aqueous solution as a function of temperature is shown. These average values vary only between 12 Å and 13 Å; the conformations are extended in the entire temperature range. The end-to-end distance of the peptide in gas phase exhibits a coil-globule-like transition with increasing temperature. In aqueous solution, on the other hand, no such transition is observed, since the conformation is always extended due to hydrogen bonds between the peptide and water molecules. The results are similar to the replica-exchange molecular dynamics simulation 504 Combination of generalized-ensemble algorithms and RISM with explicit water molecules [5]. We also calculated the pair distribution function as a function of temperature to study how the peptide is hydrated. In figure 7 we show the distribution functions 0 0.5 1 1.5 2 2.5 0 0.5 1 1.5 2 2.5 3 g( r) r 200 298 368 500 0 0.5 1 1.5 2 2.5 0 0.5 1 1.5 2 2.5 3 g( r) r 200 298 368 500 (a) (b) Figure 7. The pair distribution functions between water atoms and backbone atoms averaged over the different conformations of the peptide at each temperature [16]. The pair distribution function is with respect to the oxygen atom in the backbone of the 3rd residue and the hydrogen atom (a) or the oxygen atom (b) in water. averaged over the different conformations of the peptide. The peaks in figure 7(a) and 7(b) imply that the oxygen in the backbone of Gly 3 makes hydrogen bonds with water molecules. With the increasing temperature, the peaks become low as the hydrogen bonds become weak. The data clearly show that in aqueous solution the backbone atoms of conformations make hydrogen bonds with water molecules. As a result, the conformations are extended. We now study the relation between the conformational energy (EC and its energy components: electrostatic energy Eele, hydrogen bond energy Eh, Lennard-Jones energy Ev, or torsion energy Etor) and the solvation free energy ESOL. The results at 298.15 K are shown in figure 8. We also show the relation between ESOL and the sum of Eele and Eh. The energy ranges of both abscissa and ordinate are set to the same (25 kcal/mol) to understand the contributions of their energies to the total energy clearly. From figure 8(a), we find a clear inverse correlation between EC and ESOL. The tendency is the same for the results obtained at the given four different temperatures and becomes more pronounced as the temperature is lowered (see [14]). The dotted and solid line in figure 8(a) is the area in which the solvent-induced potential ET is 172.1 kcal/mol (corresponding to the energy of the lowest-energy conformation in aqueous solution) and 184.5 kcal/mol corresponding to that in gas phase, respectively (see table 2). The conformations corresponding to the points between the solid line and the dotted line have ET between 172.1 kcal/mol and 184.5 kcal/mol. Most conformations are in the energy region. In an aqueous solution these conformations are more preferable than the lowest-energy conformation in the gas phase. There are some conformations corresponding to the point near the dotted line. Considering this fact and the inverse correlation between EC and ESOL, it is established that the competition between the conformational energy and the solvation free energy occurs in aqueous solution. From figure 8(a) to figure 8(e), we also see that Eele, Eh, and Ev which are components of the conformational energy EC have clear inverse correlations between ESOL. It implies that the solvent effect based on 1D RISM theory tends to break the interactions between intra-chain atoms and to make more interactions between peptide atoms and water molecules. 6. Conclusions We have developed an algorithm combined with the generalized-ensemble algorithms and 1D RISM theory. This is the first attempt of the kind. For this purpose, we reasonably introduced the solvent-induced potential and developed a modified replica-exchange method. 505 A.Mitsutake et al. (a) (b) (b) (a) (b) (b) Figure 8. The relations between the solvation free energy ESOL and the total conformational energy EC at 298.15 K (a), and its components, electrostatic energy Eele (b), hydrogen bond energy Eh (c), Lennard-Jones energy Ev (d), or torsion energy Etor (e), or the sum of electrostatic energy and hydrogen bond energy (f) and solvation free energy. Firstly, we performed a multicanonical MC simulation of Met-enkephalin in aqueous solution. Secondly, we performed the modified replica-exchange MC simulation of Met-enkephalin in aqueous solution to study temperature dependence. At a room temperature, both simulations have similar results. We obtained a lot of information concerning the solvent effects from these simulations. The conformations in aqueous solution tend to be extended because the backbone atoms make hydrogen bonds to water molecules although the lower-energy conformations in a gas phase tend to make hydrogen bonds between intra-chain backbone atoms. Thermodynamic quantities were also obtained as a function of temperature. The present approach is beneficial for performing protein folding simulations. We shall also apply these modifications to other systems which use a solvent-induced potential such as three-dimensional reference interaction theory. 7. Acknowledgements The simulations were performed on the computers at the Research Center for Computational Science, Institute for Molecular Science. This work was supported, in part, by the Grants-in-Aid for Scientific Research on Priority Areas “Water and Biomolecules”, for Young Scientists (B), 14740170, and for the Next Generation Super Computing Project, Nanoscience Program from the Ministry of Education, Culture, Sports, Science and Technology, Japan. References 1. Chandler D., Andersen H.C., J. Chem. Phys., 1972, 57, 1930–1937. 2. Hirata F., Rossky P.J., Chem. Phys. Lett., 1981, 83, 329–334. 3. Perkyns J., Pettitt B.M., Chem. Phys. Lett., 1992, 190, 626–630. 4. Kinoshita M. – In Molecular Theory of Solvation, edited by Hirata, F. (Kluwer Academic, Dordrecht, 2003), pp.101–168. 5. Mitsutake A., Sugita Y., Okamoto Y., Biopolymers (Peptide Science), 2001, 60, 96–123. 6. Berg B.A., Neuhaus T., Phys. Lett. B, 1991, 267, 249–253. 506 Combination of generalized-ensemble algorithms and RISM 7. Berg B.A., Neuhaus T., Phys. Rev. Lett., 1992, 68, 9–12. 8. Hukushima K., Nemoto K., J. Phys. Soc. Jpn., 1996, 65, 1604–1608. 9. Hukushima K., Takayama H., Nemoto K., Int. J. Mod. Phys., 1996, C7, 337–344. 10. Geyer C.J. – In Computing Science and Statistics: Proc. 23rd Symp. on the Interface, edited by Keramidas, E.M. (Interface Foundation, Fairfax Station, 1991), pp. 156–163. 11. Sugita Y., Okamoto Y., Chem. Phys. Lett., 1999, 314, 141–151. 12. Kinoshita M., Okamoto Y., Hirata F., J. Comput. Chem., 1997, 18, 1320–1326; Kinoshita M., Okamoto Y., Hirata F., J. Comput. Chem., 1998, 19, 1724–1735. 13. Kinoshita M., Okamoto Y., Hirata F., J. Chem. Phys., 1997, 107, 1586–1599. 14. Kinoshita M., Okamoto Y., Hirata F., J. Am. Chem. Soc.,1998, 120, 1855–1863. 15. Mitsutake A., Kinoshita M., Okamoto Y., Hirata F., Chem. Phys. Lett., 2000, 329, 295–303. 16. Mitsutake A., Kinoshita M., Okamoto Y., Hirata F., J. Phys. Chem. B, 2004, 108, 19002–19012. 17. Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H., Teller E., J. Chem. Phys., 1953, 21, 1087–1092. 18. Berendsen H.J.C., Grigera J.R., Straatsma T.P.J., J. Phys. Chem., 1987, 91, 6269–6271. 19. Singer S.J., Chandler D., Mol. Phys., 1985, 55, 621–625. 20. La Penna G., Mitsutake A., Masuya M., Okamoto Y., Chem. Phys. Lett., 2003, 380, 609–619. 21. Okamoto Y., Hansmann U.H.E., J. Phys. Chem., 1995, 99, 11276–11287. 22. Berg B.A., Int. J. Mod. Phys. C, 1992, 3, 1083. 23. Berg B.A., Nucl. Phys. B (Proc. Suppl.), 1998, 63A–C, 982–984. 24. Ferrenberg A.M., Swendsen R.H., Phys. Rev. Lett., 1988, 61, 2635–2638. 25. Kawai H., Okamoto Y., Fukugita M., Nakazawa T., Kikuchi T., Chem. Lett., 1991, 1991, 213–216. 26. Okamoto Y., Fukugita M., Nakazawa T., Kawai H., Protein Eng., 1991, 4, 639–647. 27. Momany F.A., McGuire R.F., Burgess A.W., Scheraga H.A., J. Phys. Chem., 1975, 79, 2361–2381. 28. Némethy G., Pottle M.S., Scheraga H.A., J. Phys. Chem., 1983, 87, 1883–1887. 29. Sippl M.J., Némethy G., Scheraga H.A., J. Phys. Chem., 1984, 88, 6231–6233. 30. Graham W.H., Carter II S.E., Hickes P.R., Biopolymers, 1992, 32, 1755–1764. 31. Hansmann U.H.E., Okamoto Y., J. Comput. Chem., 1993, 14, 1333–1338. 32. Mitsutake A., Hansmann U.H.E., Okamoto Y., J. Mol. Graph. Mod., 1998, 16, 226–238. 507 A.Mitsutake et al. Комбiнацiя алгоритмiв узагальненого ансамблю i одновимiрної базисної моделi взаємодiючих силових центрiв А.Мiцутаке1, М.Кiношiта2, Ф.Хiрата3,4, Ю.Окамото5 1 Фiзичний факультет, унiверситет м. Кейо, Йокагама, Канагава, Японiя 2 Iнститут прикладної енергетики, унiверситет м. Кiото, Японiя 3 Факультет теоретичних дослiджень, Iнститут молекулярних наук м. Оказакi, Аiчi, Японiя 4 Факультет функцiональних молекулярних наук, Унiверситет м. Оказакi, Аiчi, Японiя 5 Фiзичний факультет, унiверситет м. Нагоя, Аiчi, Японiя Отримано 15 жовтня 2007 р., в остаточному виглядi – 26 листопада 2007 р. Стаття присвячена вивченню проблеми згортання протеїну в рамках комбiнацiї узагальненого ан- самблю Монте-Карло та теорiї базисної моделi силових центрiв. Алгоритми узагальненого ансам- блю можуть ефективно вибирати конфiгурацiйний простiр в комп’ютерних моделюваннях. Теорiя ба- зисних взаємодiючих силових центрiв може враховувати ефекти розчинника в залежностi вiд форми молекул, визначаючи енергiю сольватацiї протеїнiв. Ми розробили комп’ютернi алгоритми, що ком- бiнують алгоритми узагальненого ансамблю i одновимiрну базисну модель взаємодiючих силових центрiв. Цей пiдхiд може бути також поєднаний з тривимiрною моделлю взаємодiючих силових цен- трiв. В даному оглядi описуються методи i представлено результати для пептиду. Ключовi слова: алгоритм узагальненого ансамблю, багатоканонiчний алгоритм, метод реплiчного обмiну, RISM, Монте Карло, згортання протеїну PACS: 31.15.Qg, 61.25.Em, 87.15.Aa 508