Once the meshes are aligned the problem is trivial. Simply calculate the squared distance between each point, sum, divide by the number of vertices, and take the square root. However there are two problems. The first, which I do not deal with here, is producing an alignment if two proteins do not have identical sequences. Each point must be paired with exactly one in the reference structure if we are to calculate RMSD. The second problem is to find the matrix that will superimpose two meshes with the most accuracy.
The method was developed by Kabsch in 1976. His paper is rather difficult to follow and I have based the code on a simpler explanation by Lin (2004).
The algorithm works like this:
A rigid body transformation consists of a rotation matrix followed by a translation. The translation is easy. We simply take the centroids of the two meshes, and translate both so that the centroid is at the origin.
The centroid is defined as mean x, mean y, mean z.
The tricky bit if defining the rotation matrix. After translating both our meshes to the centroid, we construct a temporary matrix A.
Aij = Σ V1ki * V2kj
i, j 1 to 3 (the axes)
k 1 to N (the points)
We are stepping through both arrays of points, and multiplying the components together, summing to form the matrix A.
Now the formula is
Or in words, multiply the transpose of A by A, take the square root, and multiply by the inverse of A.
A word of warning. If the points lie in a plane, which is likely for a trivial test set, the matrix will not have an inverse, and so the procedure breaks down. This can be fixed by adding a dummy point by taking the cross-product of three random points. You may also need special code for the cases of one point, two point, and three point (must be planar) meshes.
To calculate the square root of a matrix we use Denman-Beavers iteration.
Y0 = matrix
Z0 = identity
Yi+1 = 0.5 * Yi + Zi
Zi+1 = 0.5 * Zi + Yi
while Yi2 != matrix
The matrix inverse is calculated in the complicated way, using decomposition. Simple inversion by taking sub-matrices and determinants failed, because for some reason the algorithm seems to produce matrices with determiants very clse to zero.
To use the function, call
calculate rmsd between two structures
Params: v1 - first set of points
v2 - second set of points
N - number of points
mtx - return for transform matrix used to align structures
Returns: rmsd score
Notes: mtx can be null. Transform will be rigid. Inputs must
be previously aligned for sequence alignment
double rmsd(float *v1, float *v2, int N, float *mtx)