Cppxfel Statistics

From cctbx_xfel

Latest revision as of 17:56, 25 November 2015

A number of merging statistics and plots can be generated using cppxfel together with your favourite graph-drawing software. Several cppxfel commands write CSV files which can then be plotted elsewhere.

Correlation between two images

cppxfel can be used to compare the intensities of two images or merged data sets. To calculate the correlation between the two halves of the data set in the first merge:

cppxfel.run -cc half1Merge0.mtz half2Merge0.mtz

This creates a new CSV file named correlation.csv. The same comparison can be made between a merged data set and an individual image:

cppxfel.run -cc allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz

The beginning of correlation.csv looks like this. The "First intensity" and "Second intensity" columns can be plotted against each other in a suitable program (e.g. R, Veusz, etc.).

<pre>
h k l,First intensity,Second intensity,Resolution
0 1 29,6788.76,6681.7,3.64553
0 1 49,113.671,234.905,2.15839
0 1 53,24.636,44.4067,1.99555
0 1 57,110.308,175.868,1.85556
0 2 4,346.637,351.928,23.6538
0 2 18,11660.6,11678.3,5.8409
0 2 36,143.598,30.5506,2.9339
0 2 42,1503.52,1370.33,2.5158
0 2 50,29.0206,20.7457,2.11397
0 2 52,59.3656,110.755,2.03279
0 3 5,5267.67,5281.37,18.1417
0 3 15,786.865,2939.76,6.91526
0 3 47,82.9657,111.247,2.24614
0 3 49,218.114,361.618,2.15481
0 4 16,130.598,202.681,6.41405
0 4 34,9586.69,8670.65,3.08996
0 4 42,619.911,235.344,2.5073
0 4 44,3396.41,3825.67,2.39429
0 4 50,661.039,709.513,2.10893
</pre>
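As an illustration of what can be done with correlation.csv outside cppxfel, the following minimal Python sketch reads the two intensity columns and computes a Pearson correlation coefficient. The helper names read_intensity_pairs and pearson are hypothetical, the sample rows are copied from the listing above, and the coefficient computed here is not necessarily identical to the value cppxfel itself reports.

```python
import csv
import io
import math


def read_intensity_pairs(handle):
    """Return the 'First intensity' and 'Second intensity' columns as two lists."""
    first, second = [], []
    for row in csv.DictReader(handle):
        first.append(float(row["First intensity"]))
        second.append(float(row["Second intensity"]))
    return first, second


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Three rows taken from the example above; in practice use open("correlation.csv").
sample = io.StringIO(
    "h k l,First intensity,Second intensity,Resolution\n"
    "0 1 29,6788.76,6681.7,3.64553\n"
    "0 1 49,113.671,234.905,2.15839\n"
    "0 2 18,11660.6,11678.3,5.8409\n"
)
first, second = read_intensity_pairs(sample)
print("Pearson correlation:", pearson(first, second))
```

The two returned lists can also be passed directly to a scatter plot in the plotting program of your choice.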

Rsplit between two halves of the data set

As well as CC1/2, Rsplit can be calculated between two halves of the data set. For example, to compare the two halves of the final merge:

cppxfel.run -rsplit half1Merge.mtz half2Merge.mtz

For the 1000-image data set provided in this tutorial, this produces an Rsplit of approximately 13%:

Running cppxfel...
Welcome to cppxfel!
 SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib 
Loaded 14594 reflections (14594 accepted).
 SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib 
Loaded 14200 reflections (14200 accepted).
N: lowRes	highRes	Value	Hits	Multiplicity
N: inf	4.33175	0.0571572	383	2
N: 4.33175	3.43811	0.0529099	358	2
N: 3.43811	3.00347	0.0949427	474	2
N: 3.00347	2.72883	0.151788	511	2
N: 2.72883	2.53322	0.161476	652	2
N: 2.53322	2.38385	0.18049	640	2
N: 2.38385	2.26446	0.21476	686	2
N: 2.26446	2.16587	0.224589	735	2
N: 2.16587	2.08249	0.235197	791	2
N: 2.08249	2.01062	0.252543	899	2
N: 2.01062	1.94775	0.280977	816	2
N: 1.94775	1.89207	0.33236	713	2
N: 1.89207	1.84225	0.389915	577	2
N: 1.84225	1.7973	0.43699	354	2
N: 1.7973	1.75644	0.527985	255	2
N: 1.75644	1.71906	0.589053	169	2
N: 1.71906	1.68467	0.693728	68	2
N: 1.68467	1.65287	0.662898	37	2
N: 1.65287	1.62335	0.893698	15	2
N: 1.62335	1.59583	1.25962	5	2
N: *** Overall ***
N: 0	1.59583	0.129852	9138	2
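For reference, the sketch below shows how Rsplit is commonly defined in the serial crystallography literature: the summed absolute difference between the two half-data-set intensities, divided by the summed mean intensity and by √2. This is an assumption about the definition, not a transcription of the cppxfel source, and the function name r_split and the input values are made up for illustration.

```python
import math


def r_split(half1, half2):
    """Rsplit for paired intensities of the same reflections in two half data sets.

    Uses the commonly quoted definition
        Rsplit = (1 / sqrt(2)) * sum|I1 - I2| / (0.5 * sum(I1 + I2)).
    """
    numerator = sum(abs(a - b) for a, b in zip(half1, half2))
    denominator = 0.5 * sum(a + b for a, b in zip(half1, half2))
    return numerator / (math.sqrt(2) * denominator)


# Hypothetical intensities for four reflections present in both halves.
half1 = [120.0, 45.0, 980.0, 12.0]
half2 = [110.0, 52.0, 1005.0, 9.0]
print("Rsplit:", r_split(half1, half2))
```

Applying the same function to the reflections in each resolution shell would give a per-bin table similar in layout to the one printed above.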

Partiality plot for an individual image

cppxfel can produce a CSV file containing information on the success of the partiality model for a particular image. This requires a reference MTZ file (generated with a multiplicity of more than about 2.0-3.0) and an image file named ref-img*.mtz, as created by the standard input file refine.txt.

This can be generated as follows:

cppxfel.run -partiality allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz

Alternatively, a maximum resolution can be specified:

cppxfel.run -partiality allMerge5.mtz ref-img-shot-s00-20130316165414164_0.mtz 2.0

This creates the following output on the screen:

Running cppxfel...
Welcome to cppxfel!
 SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib 
Loaded 23822 reflections (23822 accepted).
Setting reference to allMerge5.mtz
Partiality plot for ref-img-shot-s00-20130316165414164_0.mtz
 SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib 
Loaded 3002 reflections (3002 accepted).
Ambiguity 0: 0.622248, ambiguity 1: 0.931926
2754 reflections in common with reference MTZ.
Total number of reflections in MTZ: 2890
     Low res    High res   Num refl.
         inf     2.53984         718
     2.53984     2.01587        1100
     2.01587     1.76103         879
     1.76103         1.6         167
N: Total time: 0 minutes, 1 seconds (1 seconds).
Done

This shows the four resolution bins used to generate the data, and outputs the appropriate data to partiality_[n].csv where [n] is the number of generated bins. The format of the partiality CSV files is as follows:

h,k,l,wavelength,partiality,percentage,intensity,resolution
4,2,-6,1.37061,0,3.76223,263.63,0.070742
2,2,-4,1.39565,0.419087,20.8099,142.831,0.0463115
-9,4,9,1.4039,0,3.41021,24.9747,0.126123
-1,9,-6,1.41084,0,3.08305,35.6135,0.102689
3,20,-15,1.41234,0,2.05453,94.6172,0.238028
-10,28,-4,1.41258,0,96.443,158.223,0.283599
17,24,-25,1.41401,0,94.9401,74.3205,0.364902
18,-6,-10,1.41447,0,0.439069,56.5138,0.202751
11,25,-22,1.41478,0,1.55951,39.9897,0.33154
-15,35,2,1.41483,0,1.32867,9.05428,0.360467
-19,8,35,1.41484,0,0,-1.88631,0.383995
-17,4,35,1.41519,0,35.4585,59.3055,0.369768

The wavelength column gives the wavelength, in Å, of the Ewald sphere on which the centre of the reciprocal lattice point lies. The partiality column is the theoretically calculated partiality. The percentage column is the intensity observed on this image as a percentage of the intensity in the reference data set. The intensity column is the raw integrated intensity for that reflection, and the resolution column is d*, or 1/d, in Å⁻¹.

Plot the wavelength on the X axis and the partiality and percentage columns on separate Y axes. Ideally, the partiality and percentage curves match each other as closely as possible.
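One way to draw such a plot is sketched below in Python. It assumes the CSV header shown above and that matplotlib is available; the function names load_partiality and plot_partiality and the output file name are hypothetical. The loader sorts the reflections by wavelength so that both curves are drawn left to right.

```python
import csv
import io


def load_partiality(handle):
    """Return wavelength, partiality and percentage lists, sorted by wavelength."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append((float(row["wavelength"]),
                     float(row["partiality"]),
                     float(row["percentage"])))
    rows.sort(key=lambda r: r[0])
    wavelengths = [r[0] for r in rows]
    partialities = [r[1] for r in rows]
    percentages = [r[2] for r in rows]
    return wavelengths, partialities, percentages


def plot_partiality(wavelengths, partialities, percentages, out_path):
    """Plot partiality and percentage against wavelength on separate Y axes."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, left = plt.subplots()
    left.plot(wavelengths, partialities, color="tab:blue")
    left.set_xlabel("Wavelength (Å)")
    left.set_ylabel("Partiality")
    right = left.twinx()  # second Y axis sharing the same X axis
    right.plot(wavelengths, percentages, color="tab:red")
    right.set_ylabel("Percentage of reference intensity")
    fig.savefig(out_path)
    plt.close(fig)


# Two rows copied from the listing above; in practice open the partiality CSV file.
sample = io.StringIO(
    "h,k,l,wavelength,partiality,percentage,intensity,resolution\n"
    "2,2,-4,1.39565,0.419087,20.8099,142.831,0.0463115\n"
    "4,2,-6,1.37061,0,3.76223,263.63,0.070742\n"
)
wavelengths, partialities, percentages = load_partiality(sample)
# plot_partiality(wavelengths, partialities, percentages, "partiality_plot.png")
```

The partiality and percentage columns are on different scales, which is why the sketch uses two Y axes rather than a shared one.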

Rmerge, Rmeas, Rpim

The R values Rmerge, Rmeas and Rpim can be generated from the unmerged*.mtz files generated during refinement (refine.txt) or the unmerged.mtz file generated during the final merge (merge.txt).

This can be carried out as follows:

cppxfel.run -rmerge unmerged.mtz
cppxfel.run -rmeas unmerged.mtz
cppxfel.run -rpim unmerged.mtz

These statistics should be used to judge the quality of post-refinement rather than the quality of the high-resolution data, and they should decrease as the post-refinement strategy improves. Each command prints a table of values per resolution bin, for example:

$ cppxfel.run -rpim unmerged.mtz
Running cppxfel...
Welcome to cppxfel!
 SYMINFO file set to /apps/strubi/ccp4/ccp4-6.5/lib/data/syminfo.lib 
Loaded 1520200 reflections (102129 accepted).
N: lowRes	highRes	Value	Hits	Multiplicity
N: inf	4.28806	0.0138179	1142	3.75862
N: 4.28806	3.40343	0.0111124	1175	3.62434
N: 3.40343	2.97317	0.0189678	1223	4.25449
N: 2.97317	2.70131	0.0270932	1231	4.18195
N: 2.70131	2.50767	0.0314556	1306	5.0643
N: 2.50767	2.35981	0.0350953	1306	4.94366
N: 2.35981	2.24162	0.0398659	1322	5.34516
N: 2.24162	2.14403	0.0437967	1313	5.49478
N: 2.14403	2.06148	0.0463546	1319	6.04176
N: 2.06148	1.99034	0.0486673	1312	6.70868
N: 1.99034	1.9281	0.0563663	1350	5.81406
N: 1.9281	1.87298	0.0641532	1291	5.29523
N: 1.87298	1.82367	0.0746328	1207	4.17778
N: 1.82367	1.77917	0.0858512	1189	3.44205
N: 1.77917	1.73872	0.104914	997	2.93709
N: 1.73872	1.70172	0.127369	826	2.38837
N: 1.70172	1.66767	0.160804	613	1.98459
N: 1.66767	1.6362	0.187585	418	1.68597
N: 1.6362	1.60698	0.266887	163	1.36477
N: 1.60698	1.57973	0.391058	23	1.17532
N: *** Overall ***
N: 0	1.57973	0.0239934	20726	4.29492
N: Total time: 0 minutes, 23 seconds (23 seconds).

This shows an overall Rpim of 2.4%, with a significant increase in Rpim at resolutions beyond 1.8 Å.
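To make the relationship between the three statistics concrete, the following Python sketch computes them from grouped observations using the standard textbook definitions. In these definitions Rmeas and Rpim weight each reflection's summed deviation by sqrt(n/(n-1)) and sqrt(1/(n-1)) respectively, where n is the number of observations of that reflection. This is an assumption about the formulas rather than a transcription of the cppxfel code, and the function name r_factors and the input intensities are invented for illustration.

```python
import math


def r_factors(observations):
    """Return (Rmerge, Rmeas, Rpim) for a mapping of reflection -> intensities.

    Reflections observed fewer than twice are skipped, because no deviation
    from the mean can be defined for them.
    """
    sum_merge = sum_meas = sum_pim = 0.0
    sum_intensity = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue
        mean = sum(intensities) / n
        deviation = sum(abs(i - mean) for i in intensities)
        sum_merge += deviation
        sum_meas += math.sqrt(n / (n - 1)) * deviation
        sum_pim += math.sqrt(1 / (n - 1)) * deviation
        sum_intensity += sum(intensities)
    return (sum_merge / sum_intensity,
            sum_meas / sum_intensity,
            sum_pim / sum_intensity)


# Hypothetical unmerged intensities keyed by Miller index.
observations = {
    (0, 0, 1): [10.0, 12.0],
    (0, 0, 2): [5.0, 5.0, 5.0],
    (0, 0, 3): [7.0],  # single observation, ignored
}
rmerge, rmeas, rpim = r_factors(observations)
print("Rmerge:", rmerge, "Rmeas:", rmeas, "Rpim:", rpim)
```

Because of the sqrt(1/(n-1)) weighting, Rpim falls as multiplicity rises, which is consistent with the higher values seen in the low-multiplicity outer shells of the table above.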