Module bussilab.wham
Module containing a WHAM implementation.
See wham().
Functions
def wham(bias,
*,
frame_weight=None,
traj_weight=None,
T: float = 1.0,
maxiter: int = 1000,
threshold: float = 1e-20,
verbose: bool = False,
logZ: numpy.ndarray | None = None,
logW: numpy.ndarray | None = None,
normalize: bool | str = 'log',
method: str = 'minimize',
minimize_opt: dict | None = None)
```python
def wham(bias,
         *,
         frame_weight=None,
         traj_weight=None,
         T: float = 1.0,
         maxiter: int = 1000,
         threshold: float = 1e-20,
         verbose: bool = False,
         logZ: Optional[np.ndarray] = None,
         logW: Optional[np.ndarray] = None,
         normalize: Union[bool, str] = "log",
         method: str = "minimize",
         minimize_opt: Optional[dict] = None):
    """Compute weights according to binless WHAM.

    See the rendered documentation below for the full description,
    examples, and parameters.
    """
    # allow tuples or lists
    bias = coretools.ensure_np_array(bias)
    frame_weight = coretools.ensure_np_array(frame_weight)
    traj_weight = coretools.ensure_np_array(traj_weight)

    nframes = bias.shape[0]
    ntraj = bias.shape[1]

    if isinstance(T, float):
        T = T*np.ones(ntraj)
    T = coretools.ensure_np_array(T)

    # default values
    if frame_weight is None:
        frame_weight = np.ones(nframes)
    if traj_weight is None:
        traj_weight = np.ones(ntraj)

    assert len(traj_weight) == ntraj
    assert len(frame_weight) == nframes

    # divide by T once for all
    shifted_bias = bias/T[np.newaxis, :]
    # track shifts
    shifts1 = np.min(shifted_bias, axis=1)
    shifted_bias -= shifts1[:, np.newaxis]
    shifts0 = np.min(shifted_bias, axis=0)
    shifted_bias -= shifts0[np.newaxis, :]
    # do exponentials only once
    expv = np.exp(-shifted_bias)

    if logW is not None:
        Z = np.matmul(np.exp(logW-shifts1), expv)
        Z /= np.sum(Z*traj_weight)
    elif logZ is not None:
        Z = np.exp(logZ+shifts0)
    else:
        Z = np.ones(ntraj)
    Zold = Z

    if verbose:
        sys.stderr.write("WHAM: start\n")
    if method == "substitute":
        for nit in range(maxiter):
            # find unnormalized weights
            weight = 1.0/np.matmul(expv, traj_weight/Z)*frame_weight
            # update partition functions
            Z = np.matmul(weight, expv)
            # normalize the partition functions
            Z /= np.sum(Z*traj_weight)
            # monitor change in partition functions
            eps = np.sum(np.log(Z/Zold)**2)
            Zold = Z
            if verbose:
                sys.stderr.write("WHAM: iteration "+str(nit)+" eps "+str(eps)+"\n")
            if eps < threshold:
                break
        nfev = nit
    elif method == "minimize":
        from scipy.optimize import minimize
        # in the equation below, traj_weight should be exactly scaled to the number of frames
        traj_weight_scaled = traj_weight/np.sum(traj_weight)*np.sum(frame_weight)
        def func(x):
            Zm1 = np.exp(-x)
            tmp = expv*(traj_weight_scaled*Zm1)[np.newaxis, :]
            tmp1 = np.sum(tmp, axis=1)
            C = np.sum(frame_weight*np.log(tmp1))+np.sum(traj_weight_scaled*x)
            tmp /= tmp1[:, np.newaxis]
            grad = -np.matmul(frame_weight, tmp)+traj_weight_scaled
            return C, grad
        if minimize_opt is not None:
            if "method" not in minimize_opt:
                minimize_opt["method"] = "L-BFGS-B"
            if "jac" in minimize_opt:
                if not minimize_opt["jac"]:
                    raise ValueError("minimize_opt['jac'] must be True")
            else:
                minimize_opt["jac"] = True
        else:
            minimize_opt = {}
            minimize_opt["method"] = "L-BFGS-B"
            minimize_opt["jac"] = True
        x = np.log(Z)
        res = minimize(func, x, **minimize_opt)
        Z = np.exp(res.x)
        Z /= np.sum(Z*traj_weight)
        weight = 1.0/np.matmul(expv, traj_weight/Z)*frame_weight
        Zold = Z.copy()
        Z = np.matmul(weight, expv)
        Z /= np.sum(Z*traj_weight)
        nit = res.nit
        nfev = res.nfev
        eps = np.sum(np.log(Z/Zold)**2)
    else:
        raise ValueError("method should be 'minimize' or 'substitute'")

    # this is the traditional implementation, either normalized or not
    if isinstance(normalize, bool):
        if normalize:
            weight *= np.exp((shifts1-np.max(shifts1)))
            # normalized weights
            weight /= np.sum(weight)
            with np.errstate(divide='ignore'):
                logW = np.log(weight)
        else:
            logW = np.log(weight) + shifts1
    # this is the new default, which allows normalizing also non-normalizable weights
    else:
        if normalize == "log":
            logW = np.log(weight) + shifts1
            logW -= np.max(logW)
            logW -= np.log(np.sum(np.exp(logW)))
        else:
            raise ValueError("normalize should be True, False, or 'log'")

    if verbose:
        sys.stderr.write("WHAM: end")
    logW = cast(np.ndarray, logW)  # to avoid mypy error
    return WhamResult(logW=logW, logZ=np.log(Z)-shifts0, nit=nit, nfev=nfev, eps=eps)
```
Compute weights according to binless WHAM.
The main input for this calculation is in the 2D array `bias`. Element `bias[i, j]` should contain the energy of the i-th frame computed according to the j-th Hamiltonian. Trajectories should be concatenated first, so that the total number of frames is equal to the number of frames of each trajectory multiplied by the number of trajectories. It is also possible to concatenate simulations of different lengths. In all cases, it is crucial to compute the potential according to each of the employed Hamiltonians on **all the frames**, not only on those simulated using that Hamiltonian.

Notice that by default weights are normalized. It is possible to override this behavior with normalize=False. However, starting with v0.0.40, this should not be necessary: the new implementation of normalization should be numerically stable in all cases.
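As a minimal illustration of this layout, the `bias` array can be assembled by evaluating every Hamiltonian on the concatenated frames. The harmonic umbrella potentials and coordinates below are hypothetical, chosen only to show the required shape:

```python
import numpy as np

# Hypothetical umbrella potentials acting on a 1D coordinate x
# (illustrative assumptions; any per-frame energy function works).
def bias_a(x):
    return 0.5 * 10.0 * (x - 0.2) ** 2

def bias_b(x):
    return 0.5 * 10.0 * (x - 0.8) ** 2

x_a = np.linspace(0.0, 0.5, 5)      # frames simulated with bias_a
x_b = np.linspace(0.5, 1.0, 5)      # frames simulated with bias_b
x_all = np.concatenate([x_a, x_b])  # concatenate the trajectories first

# evaluate *each* Hamiltonian on *all* frames: shape (nframes, ntraj)
bias = np.stack([bias_a(x_all), bias_b(x_all)], axis=1)
print(bias.shape)  # (10, 2)
```

Note that both columns are filled for all ten frames, including frames that were not generated with that Hamiltonian.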
Bugs
Up to version 0.0.42, method="minimize" does not work correctly when setting traj_weight. As a consequence, results produced with v0.0.41, where this method is the default, might be incorrect. In v0.0.42 this is temporarily fixed by reverting to method="substitute" when traj_weight is used. In v0.0.43 this should be finally fixed: both methods should work equally well in all cases, and the default "minimize" method should require fewer iterations.
Combining Trajectories of Different Lengths
Let's imagine three frames obtained from three Hamiltonians, and assume that the energy of frame i in Hamiltonian j is given by `bias[i, j]`, defined as

```python
import numpy as np
bias = np.array([[1, 10, 7],
                 [2, 9, 6],
                 [3, 8, 5]])
```
We can compute the weights with the following command:
```python
np.exp(wham.wham(bias).logW)
array([0.41728571, 0.39684866, 0.18586563])
```
We now notice that the second and third columns of this matrix are equal except for a rigid shift. They thus correspond to equivalent Hamiltonians. We should be able to obtain the same result by saying that these frames came from two simulations. If we only pass the first two columns, however, we obtain different weights:
```python
np.exp(wham.wham(bias[:, 0:2]).logW)
array([0.28224026, 0.43551948, 0.28224026])
```
In order to correctly analyze these frames, we should pass the information that the second Hamiltonian was used for twice as long:
```python
np.exp(wham.wham(bias[:, 0:2], traj_weight=(1, 2)).logW)
array([0.41728571, 0.39684866, 0.18586563])
```
Notice that now the weights are identical to those computed in the first example.
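The self-consistent equations behind these numbers can be checked with a small pure-NumPy sketch of the substitution scheme (an independent toy reimplementation, not the module's code): iterating w_i = 1/Σ_j N_j exp(-b_ij)/Z_j together with Z_j = Σ_i w_i exp(-b_ij) reproduces both results above.

```python
import numpy as np

# Toy binless-WHAM substitution (an independent sketch, not bussilab's code):
# iterate w_i = 1 / sum_j N_j exp(-b_ij)/Z_j and Z_j = sum_i w_i exp(-b_ij).
def wham_weights(bias, traj_weight=None, niter=5000):
    bias = np.asarray(bias, dtype=float)
    nframes, ntraj = bias.shape
    tw = np.ones(ntraj) if traj_weight is None else np.asarray(traj_weight, float)
    expv = np.exp(-(bias - bias.min()))  # global shift for numerical safety
    Z = np.ones(ntraj)
    for _ in range(niter):
        w = 1.0 / (expv @ (tw / Z))      # unnormalized frame weights
        Z = w @ expv                     # updated partition functions
        Z /= np.sum(Z * tw)              # fix the overall normalization
    return w / np.sum(w)                 # normalized weights

bias = np.array([[1, 10, 7], [2, 9, 6], [3, 8, 5]])
w3 = wham_weights(bias)                              # three explicit Hamiltonians
w2 = wham_weights(bias[:, 0:2], traj_weight=(1, 2))  # second one counted twice
print(w3, w2)  # both close to [0.41728571, 0.39684866, 0.18586563]
```

Both calls converge to the same weights because duplicating an equivalent column and doubling its traj_weight define the same self-consistent problem.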
Trusting One Trajectory More Than Another
When you concatenate trajectories, you might explicitly want to trust one trajectory more than another. For instance, two trajectories might have been accumulated with different strides, and each frame of the one with the smaller stride should be considered less reliable. We would thus like to assign each frame a weight that accounts for its reliability.
Consider the following command:
```python
import numpy as np
np.exp(wham.wham([[3, 5],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4],
                  [4, 4]]).logW)
```

that results in the following weights:

```python
array([0.06421006, 0.09357899, 0.09357899, 0.09357899, 0.09357899,
       0.09357899, 0.09357899, 0.09357899, 0.09357899, 0.09357899,
       0.09357899])
```
Notice that all the frames except the first one are identical. An equivalent result would have been obtained using

```python
import numpy as np
np.exp(wham.wham([[3, 5], [4, 4]], frame_weight=(1, 10)).logW)
array([0.06421006, 0.93578994])
```
Clearly, the weight of the second frame in this example is ten times the weight of each of the ten corresponding frames in the previous example.
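This equivalence can be checked with a small pure-NumPy sketch of the self-consistent substitution scheme (an independent toy reimplementation, not the module's code), comparing a frame repeated ten times against the same frame with frame_weight ten:

```python
import numpy as np

# Toy binless-WHAM substitution with per-frame weights (independent sketch):
# w_i = f_i / sum_j exp(-b_ij)/Z_j, with Z_j = sum_i w_i exp(-b_ij).
def wham_weights(bias, frame_weight=None, niter=5000):
    bias = np.asarray(bias, dtype=float)
    fw = (np.ones(bias.shape[0]) if frame_weight is None
          else np.asarray(frame_weight, float))
    expv = np.exp(-(bias - bias.min()))  # global shift for numerical safety
    Z = np.ones(bias.shape[1])
    for _ in range(niter):
        w = fw / (expv @ (1.0 / Z))      # per-frame reliability enters here
        Z = w @ expv
        Z /= np.sum(Z)
    return w / np.sum(w)

w_rep = wham_weights([[3, 5]] + [[4, 4]] * 10)               # frame repeated 10x
w_fw = wham_weights([[3, 5], [4, 4]], frame_weight=(1, 10))  # weighted instead
print(w_fw)  # close to [0.06421006, 0.93578994]
```

The weighted two-frame result matches the eleven-frame result: the first weights agree, and the second weight equals ten times each repeated frame's weight.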
Parameters

bias : np.ndarray
    An array with shape (nframes, ntraj) containing the bias potential applied to each frame according to each of the Hamiltonians.
frame_weight : np.ndarray, optional
    An array with nframes elements containing the reliability weight of each frame. By default, all weights are set to one.
traj_weight : np.ndarray, optional
    An array with ntraj elements containing the total weight of each of the Hamiltonians. Should be used when combining trajectories of different lengths.
T : float, optional
    The temperature of the system. This number is only used to divide the `bias` array in order to make it dimensionless. If replicas are simulated at different temperatures, it is possible to pass an array with ntraj elements here.
maxiter : int, optional
    Maximum number of iterations in the minimization procedure.
threshold : float, optional
    Threshold for the minimization procedure.
verbose : bool, optional
    If True, print information as the minimization proceeds.
logZ : np.ndarray, optional
    Array with ntraj elements. Initial value for the logarithm of the partition functions. If not provided, it is computed from the bias. Providing an initial guess that is close to the converged value (e.g. as obtained from a calculation with a limited number of frames) can significantly speed up convergence.
logW : np.ndarray, optional
    Array with nframes elements. Initial value for the logarithm of the weights. If not provided, they are computed from the bias. Providing an initial guess that is close to the converged value can significantly speed up convergence. *If logW is provided, logZ is ignored.*
normalize : bool or str, optional
    By default, "log", which properly normalizes weights in all cases. normalize=True or False is enabled for backward compatibility.
method : str, optional
    If "substitute", solve the self-consistent equations by substitution. If "minimize", use a minimization as in J. Chem. Phys. 136, 144102 (2012). Prior to version 0.0.40, the default was "substitute". Starting with version 0.0.41, the default is "minimize".
minimize_opt : dict, optional
    If method=="minimize", this dict can be used to pass options to scipy.optimize.minimize. Notice that by default the minimization is performed using 'L-BFGS-B'.
Classes
class WhamResult (*, logW: numpy.ndarray, logZ: numpy.ndarray, nit: int, nfev: int, eps: float)
```python
class WhamResult(coretools.Result):
    """Result of a `bussilab.wham.wham` calculation."""
    def __init__(self, *, logW: np.ndarray, logZ: np.ndarray, nit: int, nfev: int, eps: float):
        super().__init__()
        self.logW = logW
        """`numpy.ndarray` containing the logarithm of the weight of the frames."""
        self.logZ = logZ
        """`numpy.ndarray` containing the logarithm of the partition function of each state."""
        self.nit = nit
        """The number of performed iterations."""
        self.nfev = nfev
        """The number of function evaluations (might differ from nit when using method='minimize')."""
        self.eps = eps
        """The final error in the iterative solution."""
```
Result of a wham() calculation.

Ancestors
- Result
- builtins.dict
Instance variables
var eps
    The final error in the iterative solution.
var logW
    `numpy.ndarray` containing the logarithm of the weight of the frames.
var logZ
    `numpy.ndarray` containing the logarithm of the partition function of each state.
var nfev
    The number of function evaluations (might differ from nit when using method='minimize').
var nit
    The number of performed iterations.
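Since the listed ancestors include builtins.dict, these fields can be read either as attributes or as dictionary keys. A minimal sketch of how such a dict-backed result class can work (an assumption based only on the listed ancestors; bussilab's actual coretools.Result may differ in details):

```python
# Hypothetical dict-backed result class (a sketch inferred from the ancestors
# listed above; not bussilab's actual implementation).
class Result(dict):
    def __getattr__(self, name):
        try:
            return self[name]          # attribute access falls back to the dict
        except KeyError as exc:
            raise AttributeError(name) from exc

    def __setattr__(self, name, value):
        self[name] = value             # attributes are stored as dict entries

r = Result()
r.nit = 42
print(r.nit, r["nit"])  # 42 42
```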