Cppxfel Statistics
A number of merging statistics and plots can be generated using cppxfel and your favourite graph-drawing software. Several cppxfel commands write CSV files which can then be plotted elsewhere.
Correlation between two images
cppxfel can be used to compare the intensities in two images or merged data sets. To calculate the correlation between the two halves of the data set in the first merge:
cppxfel.run -cc half1Merge0.mtz half2Merge0.mtz
This creates a new CSV file named correlation.csv. This can also be carried out for individual images:
cppxfel.run -cc allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz
The beginning of correlation.csv looks like this. The "First intensity" and "Second intensity" columns can be plotted against each other in a suitable program (e.g. R or Veusz).
h k l,First intensity,Second intensity,Resolution
0 1 29,6788.76,6681.7,3.64553
0 1 49,113.671,234.905,2.15839
0 1 53,24.636,44.4067,1.99555
0 1 57,110.308,175.868,1.85556
0 2 4,346.637,351.928,23.6538
0 2 18,11660.6,11678.3,5.8409
0 2 36,143.598,30.5506,2.9339
0 2 42,1503.52,1370.33,2.5158
0 2 50,29.0206,20.7457,2.11397
0 2 52,59.3656,110.755,2.03279
0 3 5,5267.67,5281.37,18.1417
0 3 15,786.865,2939.76,6.91526
0 3 47,82.9657,111.247,2.24614
0 3 49,218.114,361.618,2.15481
0 4 16,130.598,202.681,6.41405
0 4 34,9586.69,8670.65,3.08996
0 4 42,619.911,235.344,2.5073
0 4 44,3396.41,3825.67,2.39429
0 4 50,661.039,709.513,2.10893
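The correlation between the two intensity columns can also be computed directly from correlation.csv rather than read off a plot. The following is a minimal sketch in Python using only the standard library. It assumes the column names shown in the CSV header and computes a plain, unweighted Pearson coefficient; whether cppxfel itself applies any weighting or cut-offs when reporting a correlation is not stated here. The helper names `read_pairs` and `pearson` are illustrative, and the sample holds a few of the rows listed for correlation.csv.

```python
import csv
import io
import math

# A few rows of correlation.csv; in practice, pass open("correlation.csv") instead.
SAMPLE = """h k l,First intensity,Second intensity,Resolution
0 1 29,6788.76,6681.7,3.64553
0 1 49,113.671,234.905,2.15839
0 2 4,346.637,351.928,23.6538
0 2 18,11660.6,11678.3,5.8409
"""


def read_pairs(handle):
    """Return (first, second) intensity lists from a correlation.csv-style file."""
    first, second = [], []
    for row in csv.DictReader(handle):
        first.append(float(row["First intensity"]))
        second.append(float(row["Second intensity"]))
    return first, second


def pearson(x, y):
    """Unweighted Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


first, second = read_pairs(io.StringIO(SAMPLE))
cc = pearson(first, second)
print(f"{len(first)} pairs, correlation = {cc:.4f}")
```

The Resolution column could be used in the same way to restrict the calculation to a resolution range before calling `pearson`.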
Rsplit between two halves of the data set
As well as CC1/2, Rsplit can be calculated between two halves of the data set. For example, to compare the two halves of the final merge:
cppxfel.run -rsplit half1Merge.mtz half2Merge.mtz
For the 1000 image data set provided in this tutorial, this will produce an Rsplit of approximately 13%.
Running cppxfel...
Welcome to cppxfel!
SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib
Loaded 14594 reflections (14594 accepted).
SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib
Loaded 14200 reflections (14200 accepted).
N: lowRes highRes Value Hits Multiplicity
N: inf 4.33175 0.0571572 383 2
N: 4.33175 3.43811 0.0529099 358 2
N: 3.43811 3.00347 0.0949427 474 2
N: 3.00347 2.72883 0.151788 511 2
N: 2.72883 2.53322 0.161476 652 2
N: 2.53322 2.38385 0.18049 640 2
N: 2.38385 2.26446 0.21476 686 2
N: 2.26446 2.16587 0.224589 735 2
N: 2.16587 2.08249 0.235197 791 2
N: 2.08249 2.01062 0.252543 899 2
N: 2.01062 1.94775 0.280977 816 2
N: 1.94775 1.89207 0.33236 713 2
N: 1.89207 1.84225 0.389915 577 2
N: 1.84225 1.7973 0.43699 354 2
N: 1.7973 1.75644 0.527985 255 2
N: 1.75644 1.71906 0.589053 169 2
N: 1.71906 1.68467 0.693728 68 2
N: 1.68467 1.65287 0.662898 37 2
N: 1.65287 1.62335 0.893698 15 2
N: 1.62335 1.59583 1.25962 5 2
N: *** Overall ***
N: 0 1.59583 0.129852 9138 2
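For reference, Rsplit is commonly defined as the summed absolute difference between the two half-set intensities, divided by half the sum of both intensities and by the square root of 2. The sketch below implements that common definition for paired intensities in Python. It is an illustration under that assumption, not a statement of how cppxfel computes the value internally (for example, any scaling between the halves is not handled), and the function name `r_split` is hypothetical.

```python
import math


def r_split(half1, half2):
    """Rsplit for two equal-length lists of paired merged intensities.

    Uses the common definition:
        Rsplit = (1 / sqrt(2)) * sum|I1 - I2| / (0.5 * sum(I1 + I2))
    """
    if len(half1) != len(half2):
        raise ValueError("intensity lists must be paired")
    numerator = sum(abs(a - b) for a, b in zip(half1, half2))
    denominator = 0.5 * sum(a + b for a, b in zip(half1, half2))
    if denominator == 0:
        raise ValueError("sum of intensities is zero")
    return numerator / (math.sqrt(2) * denominator)


# Identical halves give perfect agreement; differing halves give a positive value.
print(r_split([100.0, 50.0], [100.0, 50.0]))
print(r_split([10.0], [20.0]))
```

Applying the function separately to the reflections in each resolution range would give a per-bin breakdown in the spirit of the table printed by cppxfel.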
Partiality plot for an individual image
cppxfel can produce a CSV file containing information on the success of the partiality model for a particular image. This requires a reference MTZ (generated with a multiplicity above roughly 2.0-3.0) and an image MTZ of the form ref-img*.mtz, as created by the standard input file refine.txt.
This can be generated as follows:
cppxfel.run -partiality allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz
Alternatively, a maximum resolution can be specified:
cppxfel.run -partiality allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz 2.0
This creates the following output on the screen:
Running cppxfel...
Welcome to cppxfel!
SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib
Loaded 23822 reflections (23822 accepted).
Setting reference to allMerge5.mtz
Partiality plot for ref-img-shot-s00-20130316165414164_0.mtz
SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib
Loaded 3002 reflections (3002 accepted).
Ambiguity 0: 0.622248, ambiguity 1: 0.931926
2754 reflections in common with reference MTZ.
Total number of reflections in MTZ: 2890
Low res High res Num refl.
inf 2.53984 718
2.53984 2.01587 1100
2.01587 1.76103 879
1.76103 1.6 167
N: Total time: 0 minutes, 1 seconds (1 seconds).
Done
This shows the four resolution bins used to generate the data, and outputs the appropriate data to partiality_[n].csv, where [n] is the number of generated bins. The format of the partiality CSV files is as follows:
h,k,l,wavelength,partiality,percentage,intensity,resolution
4,2,-6,1.37061,0,3.76223,263.63,0.070742
2,2,-4,1.39565,0.419087,20.8099,142.831,0.0463115
-9,4,9,1.4039,0,3.41021,24.9747,0.126123
-1,9,-6,1.41084,0,3.08305,35.6135,0.102689
3,20,-15,1.41234,0,2.05453,94.6172,0.238028
-10,28,-4,1.41258,0,96.443,158.223,0.283599
17,24,-25,1.41401,0,94.9401,74.3205,0.364902
18,-6,-10,1.41447,0,0.439069,56.5138,0.202751
11,25,-22,1.41478,0,1.55951,39.9897,0.33154
-15,35,2,1.41483,0,1.32867,9.05428,0.360467
-19,8,35,1.41484,0,0,-1.88631,0.383995
-17,4,35,1.41519,0,35.4585,59.3055,0.369768
The wavelength column gives the wavelength (in Å) of the Ewald sphere on which the centre of the reciprocal lattice point lies. The partiality column is the theoretically calculated partiality value. The percentage column is the intensity for the given image divided by the intensity in the reference data set, expressed as a percentage. The intensity column is the raw integrated intensity for that reflection, and the resolution column is d* (1 / d) in Å⁻¹.
Plot the wavelength on the X axis and the partiality and percentage columns on separate Y axes. Ideally, the partiality and percentage curves should match each other as closely as possible.
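As a starting point for such a plot, the sketch below reads a partiality CSV file in Python, sorts the reflections by wavelength and converts the resolution column from d* to a d-spacing in Å. It assumes the column names from the CSV header given earlier; the function name `load_partiality` and the dictionary keys are illustrative choices, and the sample holds a few of the rows listed in the format example.

```python
import csv
import io

# A few rows of a partiality CSV; in practice, open the file written by cppxfel.
SAMPLE = """h,k,l,wavelength,partiality,percentage,intensity,resolution
2,2,-4,1.39565,0.419087,20.8099,142.831,0.0463115
4,2,-6,1.37061,0,3.76223,263.63,0.070742
-9,4,9,1.4039,0,3.41021,24.9747,0.126123
"""


def load_partiality(handle):
    """Return the reflections as a list of dicts, sorted by wavelength."""
    rows = []
    for row in csv.DictReader(handle):
        d_star = float(row["resolution"])  # d* = 1 / d, in inverse angstroms
        rows.append({
            "hkl": (int(row["h"]), int(row["k"]), int(row["l"])),
            "wavelength": float(row["wavelength"]),
            "partiality": float(row["partiality"]),  # fraction, 0 to 1
            "percentage": float(row["percentage"]),  # percent of reference
            "d_spacing": 1.0 / d_star if d_star > 0 else float("inf"),
        })
    rows.sort(key=lambda r: r["wavelength"])
    return rows


rows = load_partiality(io.StringIO(SAMPLE))
wavelengths = [r["wavelength"] for r in rows]    # X axis
partialities = [r["partiality"] for r in rows]   # first Y axis
percentages = [r["percentage"] for r in rows]    # second Y axis
for r in rows:
    print(r["hkl"], r["wavelength"], r["partiality"], r["percentage"])
```

The two Y series are on different scales (a fraction for partiality, a percentage for the observed ratio), which is why separate Y axes are needed when the three lists are passed to a plotting program.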
Rmerge, Rmeas, Rpim
The R values Rmerge, Rmeas and Rpim can be generated from the unmerged*.mtz files generated during refinement (refine.txt) or the unmerged.mtz file generated during the final merge (merge.txt).
This can be carried out as follows:
cppxfel.run -rmerge unmerged.mtz
cppxfel.run -rmeas unmerged.mtz
cppxfel.run -rpim unmerged.mtz
These values should be used to judge the quality of post-refinement, not the quality of the high-resolution data, and should decrease as the post-refinement strategy improves. Each command prints a table by resolution bin, for example:
$ cppxfel.run -rpim unmerged.mtz
Running cppxfel...
Welcome to cppxfel!
SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib
Loaded 1520200 reflections (102129 accepted).
N: lowRes highRes Value Hits Multiplicity
N: inf 4.28806 0.0138179 1142 3.75862
N: 4.28806 3.40343 0.0111124 1175 3.62434
N: 3.40343 2.97317 0.0189678 1223 4.25449
N: 2.97317 2.70131 0.0270932 1231 4.18195
N: 2.70131 2.50767 0.0314556 1306 5.0643
N: 2.50767 2.35981 0.0350953 1306 4.94366
N: 2.35981 2.24162 0.0398659 1322 5.34516
N: 2.24162 2.14403 0.0437967 1313 5.49478
N: 2.14403 2.06148 0.0463546 1319 6.04176
N: 2.06148 1.99034 0.0486673 1312 6.70868
N: 1.99034 1.9281 0.0563663 1350 5.81406
N: 1.9281 1.87298 0.0641532 1291 5.29523
N: 1.87298 1.82367 0.0746328 1207 4.17778
N: 1.82367 1.77917 0.0858512 1189 3.44205
N: 1.77917 1.73872 0.104914 997 2.93709
N: 1.73872 1.70172 0.127369 826 2.38837
N: 1.70172 1.66767 0.160804 613 1.98459
N: 1.66767 1.6362 0.187585 418 1.68597
N: 1.6362 1.60698 0.266887 163 1.36477
N: 1.60698 1.57973 0.391058 23 1.17532
N: *** Overall ***
N: 0 1.57973 0.0239934 20726 4.29492
N: Total time: 0 minutes, 23 seconds (23 seconds).
This shows an overall Rpim of 2.4%, with a significant increase in Rpim beyond 1.8 Å.
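To make the relationship between the three R values concrete, the sketch below computes them in Python from repeated observations of each reflection, using the usual textbook definitions: Rmerge sums the absolute deviations from the mean intensity, Rmeas weights each reflection's deviations by sqrt(n / (n - 1)), and Rpim weights them by sqrt(1 / (n - 1)), where n is the number of observations. This is an illustration under those assumptions only. How cppxfel treats partiality corrections, weighting or singly observed reflections is not covered here, and the function name `r_values` and its input layout are hypothetical.

```python
import math


def r_values(observations):
    """Compute Rmerge, Rmeas and Rpim.

    observations: dict mapping (h, k, l) to a list of observed intensities.
    Reflections with fewer than two observations are skipped in this sketch.
    """
    sum_merge = sum_meas = sum_pim = total = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single observation says nothing about agreement
        mean = sum(intensities) / n
        deviation = sum(abs(i - mean) for i in intensities)
        sum_merge += deviation
        sum_meas += math.sqrt(n / (n - 1)) * deviation
        sum_pim += math.sqrt(1 / (n - 1)) * deviation
        total += sum(intensities)
    if total == 0:
        raise ValueError("no repeated observations with non-zero intensity sum")
    return {
        "rmerge": sum_merge / total,
        "rmeas": sum_meas / total,
        "rpim": sum_pim / total,
    }


# Made-up intensities for illustration; the single observation is ignored.
example = {(0, 1, 29): [10.0, 20.0], (0, 2, 4): [5.0]}
print(r_values(example))
```

Because the Rpim weight shrinks as n grows, Rpim falls with increasing multiplicity while Rmeas is designed to be independent of it.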