Cppxfel Post-refinement


cppxfel post-refines the integrated intensities after the initial orientation matrix refinement. This is the penultimate stage of producing a viable MTZ file for structure solution. It is also the point at which indexing ambiguities are broken (note that, at the moment, only two-fold indexing ambiguities can be broken without an external reference MTZ).

Contents of refine.txt

This stage is controlled by the input file refine.txt, which is generated by cppxfel.input_gen, and it uses a number of files generated in the previous stage. The contents of this file are shown below:

cat refine.txt
ORIENTATION_MATRIX_LIST refine-orientations.dat
MATRIX_LIST_VERSION 2.0
NEW_MATRIX_LIST refined.dat

PARTIALITY_CUTOFF 0.2

COMMANDS

REFINE_PARTIALITY
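
As an illustration of the layout above, the following minimal Python sketch reads such a file into keyword/value settings and a list of commands. It reflects only the structure visible in this example (one keyword and one value per line, then a COMMANDS line followed by command names); it is an assumption rather than a description of how cppxfel itself parses its input, and parse_input_file is a hypothetical helper name.

```python
# Minimal sketch: split a cppxfel-style input file into settings and commands.
# Assumption: the format is exactly what the example shows (KEYWORD value
# lines, then a COMMANDS line, then one command per line).

EXAMPLE = """\
ORIENTATION_MATRIX_LIST refine-orientations.dat
MATRIX_LIST_VERSION 2.0
NEW_MATRIX_LIST refined.dat

PARTIALITY_CUTOFF 0.2

COMMANDS

REFINE_PARTIALITY
"""


def parse_input_file(text):
    """Return (settings, commands) from the text of an input file."""
    settings = {}
    commands = []
    in_commands = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate groups
        if line == "COMMANDS":
            in_commands = True
        elif in_commands:
            commands.append(line)
        else:
            # Everything after the first space is kept as the value string.
            key, _, value = line.partition(" ")
            settings[key] = value.strip()
    return settings, commands


settings, commands = parse_input_file(EXAMPLE)
print(settings["PARTIALITY_CUTOFF"], commands)
```

Values are kept as strings so that nothing is assumed about their types.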

Executing refine.txt

The file can be executed with the command:

cppxfel.run -i refine.txt

Execution begins by generating an initial merged MTZ file from an initial guess at the parameters of each image. For space group I23, this requires the indexing ambiguity to be broken; progress can be followed through the values on the fx: lines printed during L-BFGS refinement. The processing statistics for the first merge are then displayed.

Approximately six rounds of post-refinement follow. In each round, the major processing statistics for each image are printed as they are produced, and a large table of all the parameters used in refinement is output after the cycle. The processing statistics for each subsequent merge are then displayed until refinement ends.

Merging statistics output

After the attempt to break the indexing ambiguity, the initial images are merged, and the Rsplit and CC1/2 between the two halves of the data set are output. An example of a good Rsplit and CC1/2 is shown below. The merging statistics are reported in resolution shells: the first two columns give the resolution limits of each shell, the third column gives the value of the statistic, and the fourth gives the number of reflections that contributed to that calculation.

N: === R split ===
N: inf	4.31119	0.0880855	971	2
N: 4.31119	3.4218	0.0860462	949	2
N: 3.4218	2.98922	0.142905	1022	2
N: 2.98922	2.71588	0.197199	1017	2
N: 2.71588	2.5212	0.20902	1119	2
N: 2.5212	2.37254	0.255386	1116	2
N: 2.37254	2.25371	0.280964	1174	2
N: 2.25371	2.1556	0.285217	1157	2
N: 2.1556	2.07261	0.298872	1163	2
N: 2.07261	2.00108	0.334966	1228	2
N: 2.00108	1.9385	0.372736	1160	2
N: 1.9385	1.88309	0.45132	1151	2
N: 1.88309	1.83351	0.502127	1035	2
N: 1.83351	1.78877	0.602383	907	2
N: 1.78877	1.7481	0.628576	746	2
N: 1.7481	1.7109	0.768211	574	2
N: 1.7109	1.67667	0.848593	351	2
N: 1.67667	1.64503	0.898865	257	2
N: 1.64503	1.61565	0.882403	115	2
N: 1.61565	1.58826	0.989214	22	2
N: *** Overall ***
N: 0	1.58826	0.159739	17234	2
N: === CC half ===
N: inf	4.31119	0.985825	971	0
N: 4.31119	3.4218	0.982884	949	0
N: 3.4218	2.98922	0.963	1022	0
N: 2.98922	2.71588	0.917537	1018	0
N: 2.71588	2.5212	0.90104	1121	0
N: 2.5212	2.37254	0.825298	1117	0
N: 2.37254	2.25371	0.849232	1175	0
N: 2.25371	2.1556	0.846396	1158	0
N: 2.1556	2.07261	0.856127	1164	0
N: 2.07261	2.00108	0.811718	1229	0
N: 2.00108	1.9385	0.775128	1162	0
N: 1.9385	1.88309	0.659263	1155	0
N: 1.88309	1.83351	0.586403	1041	0
N: 1.83351	1.78877	0.478341	914	0
N: 1.78877	1.7481	0.35525	761	0
N: 1.7481	1.7109	0.223638	596	0
N: 1.7109	1.67667	0.16558	374	0
N: 1.67667	1.64503	0.0562969	280	0
N: 1.64503	1.61565	0	127	0
N: 1.61565	1.58826	0	23	0
N: *** Overall ***
N: 0	1.58826	0.982438	17357	0
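
Output of this kind can be inspected programmatically. The sketch below, in Python, parses the shell tables and reports the high-resolution limit down to which CC1/2 stays above a chosen threshold. The function names and the threshold of 0.5 are assumptions made for illustration, not part of cppxfel, and the sample text is a short excerpt of the output above.

```python
# Minimal sketch: parse cppxfel merging statistics and find where CC1/2 drops.
# Assumptions: table headers look like "=== name ===", the overall row follows
# a "*** Overall ***" line, and the first four columns of each row are
# low limit, high limit, value, reflection count.

SAMPLE = """\
N: === CC half ===
N: inf 4.31119 0.985825 971 0
N: 1.88309 1.83351 0.586403 1041 0
N: 1.83351 1.78877 0.478341 914 0
N: *** Overall ***
N: 0 1.58826 0.982438 17357 0
"""


def parse_merging_stats(log_text):
    """Return {table name: {"shells": [...], "overall": row or None}}."""
    tables = {}
    current = None
    in_overall = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("N:"):
            line = line[2:].strip()
        if line.startswith("==="):
            current = line.strip("= ")
            tables[current] = {"shells": [], "overall": None}
            in_overall = False
            continue
        if line.startswith("***"):
            in_overall = True
            continue
        fields = line.split()
        if current is None or len(fields) < 4:
            continue
        try:
            row = (float(fields[0]), float(fields[1]),
                   float(fields[2]), int(fields[3]))
        except ValueError:
            continue  # not a statistics row
        if in_overall:
            tables[current]["overall"] = row
        else:
            tables[current]["shells"].append(row)
    return tables


def resolution_cutoff(shells, threshold=0.5):
    """High-resolution limit of the last consecutive shell at or above threshold."""
    limit = None
    for low, high, value, count in shells:
        if value < threshold:
            break
        limit = high
    return limit


stats = parse_merging_stats(SAMPLE)
print(resolution_cutoff(stats["CC half"]["shells"]))
```

The same parser handles the R split table, although for Rsplit a lower value is better, so the comparison in resolution_cutoff would need to be reversed.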

Real-time processing statistics

During refinement, the important statistics for individual images are reported as each image is refined. The first column gives the name of the MTZ file and the second column gives the target function for minimisation. The third and fourth columns show the correlation coefficient and R factor with the reference MTZ, respectively. The fifth column is the correlation between the partialities of individual reflections and their 'true' partialities, as estimated by comparison against the reference data set; it should be as high as possible. The sixth column is the current Rmerge across all images, calculated only for the reflections shared with the refined image. The seventh column is the B factor (unrefined in this case), and the final column is the number of reflections accepted after post-refinement under the PARTIALITY_CUTOFF setting.

Refining image 124 img-shot-s00-20130316165200803_0.mtz
img-shot-s00-20130316165051970_0.mtz	rfactor		0.550657	0.600949	0.401253	0.320187	0	171
Refining image 71 img-shot-s01-20130316165048878_0.mtz
img-shot-s00-20130316165008012_0.mtz	rfactor		0.991731	0.111634	0.819409	0.324124	0	208
Refining image 69 img-shot-s01-20130316165031170_0.mtz
img-shot-s00-20130316165323342_0.mtz	rfactor		0.973724	0.223516	0	0.312648	0	185
Refining image 68 img-shot-s00-20130316165111763_0.mtz
img-shot-s00-20130316165002470_0.mtz	rfactor		0.825142	0.575939	0.277102	0.318178	0	191
Refining image 64 img-shot-s01-20130316165033837_1.mtz
img-shot-s01-20130316165129338_0.mtz	rfactor		0.987785	0.113023	0.7968	0.323475	0	198
Refining image 91 img-shot-s00-20130316165111763_1.mtz
img-shot-s00-20130316165049012_0.mtz	rfactor		0.988474	0.120959	0.548244	0.317826	0	208
Refining image 102 img-shot-s00-20130316165037095_0.mtz
img-shot-s00-20130316164958596_1.mtz	rfactor		0.989014	0.143094	0.7369	0.306314	0	189
Refining image 75 img-shot-s01-20130316165017753_1.mtz
img-shot-s01-20130316165126505_0.mtz	rfactor		0.977392	0.186307	0.633476	0.32707	0	189
Refining image 65 img-shot-s00-20130316165138637_0.mtz
img-shot-s01-20130316165130255_0.mtz	rfactor		0.989836	0.0996581	0.827184	0.312349	0	217
Refining image 81 img-shot-s00-20130316165139178_0.mtz
img-shot-s00-20130316165052554_0.mtz	rfactor		0.984416	0.133042	0.526552	0.311772	0	222
Refining image 85 img-shot-s01-20130316165043795_1.mtz
img-shot-s01-20130316165043795_0.mtz	rfactor		0.980657	0.116256	0.734999	0.323394	0	201
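
When many images are processed, these lines can be filtered automatically. The following Python sketch splits each statistics line into the eight columns described above and lists images whose correlation with the reference is low. The field names, the helper parse_image_line and the cutoff of 0.9 are assumptions chosen for illustration; the two sample lines are taken from the output above.

```python
# Minimal sketch: read per-image statistics lines and flag poorly
# correlated images. Field names and the 0.9 cutoff are illustrative only.
from collections import namedtuple

ImageStats = namedtuple(
    "ImageStats",
    "filename target correl rfactor part_correl rmerge bfactor accepted",
)

SAMPLE = """\
Refining image 124 img-shot-s00-20130316165200803_0.mtz
img-shot-s00-20130316165051970_0.mtz rfactor 0.550657 0.600949 0.401253 0.320187 0 171
Refining image 71 img-shot-s01-20130316165048878_0.mtz
img-shot-s00-20130316165008012_0.mtz rfactor 0.991731 0.111634 0.819409 0.324124 0 208
"""


def parse_image_line(line):
    """Return ImageStats for a statistics line, or None for any other line."""
    fields = line.split()  # handles tabs, including the doubled tab
    if len(fields) != 8 or not fields[0].endswith(".mtz"):
        return None
    try:
        numbers = [float(f) for f in fields[2:7]]
        accepted = int(fields[7])
    except ValueError:
        return None
    return ImageStats(fields[0], fields[1], *numbers, accepted)


images = []
for line in SAMPLE.splitlines():
    parsed = parse_image_line(line)
    if parsed is not None:
        images.append(parsed)

poor = [img.filename for img in images if img.correl < 0.9]
print(poor)
```

The "Refining image" progress lines are skipped because they do not have eight whitespace-separated fields.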

Post-refinement parameter table

The parameters used to refine each image are output before each merge. An example is given below. The same table is also written to a CSV file for each cycle, named params_cycle_*.csv, so that the data may be analysed by other programs.

Filename	Correl	Rsplit	Partcorrel	Refcount	Mosaicity	Wavelength	Bandwidth	hRot	kRot	rlpSize	exp	cellA	cellB	cellC
img-shot-s00-20130316165428143_0.mtz	0.996824	0.0729367	0.771081	207	0	1.46288	0.0013	-0.00416609	-0.00638984	3.73011e-05	1.5	105.783	105.783	105.783
img-shot-s01-20130316165101088_0.mtz	0.410825	0.734041	0	179	0	1.46487	0.0013	0.000723228	0.0250253	0.000121815	1.5	105.783	105.783	105.783
img-shot-s00-20130316165238305_0.mtz	0.40738	0.753037	0	183	0	1.45447	0.0013	0.00706399	-0.0082277	0.000150607	1.5	105.783	105.783	105.783
img-shot-s00-20130316165159219_0.mtz	0.990674	0.088927	0.865087	216	0	1.46289	0.0013	-0.00286677	-7.7758e-05	6.17136e-05	1.5	105.783	105.783	105.783
img-shot-s01-20130316165021253_0.mtz	0.991536	0.114612	0.787729	217	0	1.46545	0.0013	-0.0109289	-0.0157625	8.25238e-05	1.5	105.783	105.783	105.783
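
As one example of such analysis, the Python sketch below summarises a parameter table using only the standard library. It assumes that the CSV files are comma-separated and carry the same column headers as the table above, which should be checked against an actual file; load_params, summarise and the correlation cutoff of 0.9 are hypothetical choices for illustration. The sample rows are the first two rows of the table above.

```python
# Minimal sketch: summarise a post-refinement parameter table.
# Assumptions: comma-separated values, header row matching the table above.
import csv
import io
import statistics

SAMPLE = """\
Filename,Correl,Rsplit,Partcorrel,Refcount,Mosaicity,Wavelength,Bandwidth,hRot,kRot,rlpSize,exp,cellA,cellB,cellC
img-shot-s00-20130316165428143_0.mtz,0.996824,0.0729367,0.771081,207,0,1.46288,0.0013,-0.00416609,-0.00638984,3.73011e-05,1.5,105.783,105.783,105.783
img-shot-s01-20130316165101088_0.mtz,0.410825,0.734041,0,179,0,1.46487,0.0013,0.000723228,0.0250253,0.000121815,1.5,105.783,105.783,105.783
"""


def load_params(handle, delimiter=","):
    """Read a parameter table from an open text file into a list of dicts."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def summarise(rows, min_correl=0.9):
    """Count well-correlated images and average their refined wavelength."""
    good = [r for r in rows if float(r["Correl"]) >= min_correl]
    wavelengths = [float(r["Wavelength"]) for r in good]
    return {
        "images": len(rows),
        "well_correlated": len(good),
        "mean_wavelength": statistics.mean(wavelengths) if wavelengths else None,
    }


summary = summarise(load_params(io.StringIO(SAMPLE)))
print(summary)
```

For real output, each file matching params_cycle_*.csv could be opened with open(path, newline="") and passed to load_params in place of the in-memory sample; if the files turn out to be tab-separated, delimiter="\t" can be passed instead.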

Next steps

Once the merging statistics from post-refinement are satisfactory, the next step is to make a final merge, with recalculation of sigma values.