Cctbx.prime
Prime: post-refinement and merging
With the latest update, prime can process data on multiple nodes through a queuing system. At the moment, only LSF (bsub) is supported. See the documentation below for more information on how to use the queuing system.
This major update replaces prime.postrefine with
prime.run
For auto mode, you can still use prime.run with your parameter phil file as before. For manual mode, the available subcommands in prime are:
prime.genref      # generates a reference set from the given integration results
prime.postrefine  # refines all images
prime.merge       # merges all refined results into an mtz file
You can choose to run these commands independently (ideally in the above order).
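The manual-mode order can be sketched as a small Python driver. This is only an illustration: the subcommand names come from the list above, but passing the .phil file as the single argument to each subcommand is an assumption, and `build_commands`, `run_manual_mode`, and `params.phil` are hypothetical names introduced here.

```python
import shutil
import subprocess

# Order taken from the list above: reference set, post-refinement, merging.
PRIME_STEPS = ["prime.genref", "prime.postrefine", "prime.merge"]


def build_commands(phil_path):
    """Return one argument list per step, in the order they should run."""
    # Assumption: each subcommand accepts the .phil file as its only argument.
    return [[step, phil_path] for step in PRIME_STEPS]


def run_manual_mode(phil_path):
    """Run the three steps in order, stopping at the first failure."""
    for command in build_commands(phil_path):
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} not found on PATH")
        # check=True raises CalledProcessError if a step exits non-zero,
        # so a failed reference set never feeds the later steps.
        subprocess.run(command, check=True)
```

Calling `run_manual_mode("params.phil")` would then run the three steps in sequence. Keeping the command construction separate from the execution makes it easy to print the commands first and check them by eye.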
Step-by-step guidelines to post-refine and merge XFEL diffraction images. For more detail and for the citation, see "Enabling X-ray Free Electron Laser Crystallography for Challenging Biological Systems from a Limited Number of Crystals" (DOI: http://dx.doi.org/10.7554/eLife.05421).
Step I: Generating input file
Like most programs developed under the cctbx framework, prime reads an input .phil file, which stores all the parameters needed to run the post-refinement and merging steps. To generate a template .phil file, do a dry run by calling
$ prime.postrefine
An example of the template .phil file:
data = None
run_no = None
title = None
scale {
  d_min = 0.1
  d_max = 99
  sigma_min = 1.5
}
...
You can save the content of the output to any file name - in this tutorial, let's save it to thermolysin.phil.
Step II: Update input parameters
For the first trial, set the required parameters to match your experiment (you can leave the other parameters at their default values, or just delete them from your .phil file). For this tutorial, we use parameters from our thermolysin data set.
data = /path/to/your/integration/result/pickle_files
run_no = 001
title = First trial for thermolysin
scale {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
}
postref {
  scale {
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  crystal_orientation {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  reflecting_range {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  unit_cell {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
  allparams {
    flag_on = False
    d_min = 0.1
    d_max = 99
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
}
merge {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
  partiality_min = 0.1
  uc_tolerance = 3
}
target_unit_cell = 93.99,93.99,130.87,90,90,120
target_space_group = P 61 2 2
pixel_size_mm = 0.102
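Because the same resolution and sigma limits recur in several scopes, it can be convenient to generate the file from one set of values. The following is a minimal sketch in plain Python, not part of prime: `render_phil` is a hypothetical helper that writes a nested dict as phil-style text, and the `allparams` scope is omitted here on the assumption that it can stay at its defaults.

```python
def render_phil(params, indent=0):
    """Render a nested dict as phil-style lines (scopes in braces)."""
    pad = "  " * indent
    lines = []
    for name, value in params.items():
        if isinstance(value, dict):
            # A dict becomes a named scope with its content indented.
            lines.append(f"{pad}{name} {{")
            lines.extend(render_phil(value, indent + 1))
            lines.append(f"{pad}}}")
        else:
            lines.append(f"{pad}{name} = {value}")
    return lines


# Limits shared by several scopes in the tutorial file.
limits = {"d_min": 2.1, "d_max": 45, "sigma_min": 1.5}
refine = {"flag_on": True, **limits, "partiality_min": 0.1}

params = {
    "data": "/path/to/your/integration/result/pickle_files",
    "run_no": "001",
    "title": "First trial for thermolysin",
    "scale": dict(limits),
    "postref": {
        "scale": {**limits, "partiality_min": 0.1},
        "crystal_orientation": dict(refine),
        "reflecting_range": dict(refine),
        "unit_cell": {**refine, "uc_tolerance": 3},
    },
    "merge": {**limits, "partiality_min": 0.1, "uc_tolerance": 3},
    "target_unit_cell": "93.99,93.99,130.87,90,90,120",
    "target_space_group": "P 61 2 2",
    "pixel_size_mm": 0.102,
}

phil_text = "\n".join(render_phil(params))
print(phil_text)
```

Changing `limits` in one place then updates every scope that uses it, which avoids accidentally merging at a different resolution than the one used for post-refinement.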
Step III: Post-refine and merge
Once you have the input .phil file, you can run prime by calling
prime.postrefine thermolysin.phil
Prime will post-refine and merge the reflection sets using three macrocycles (the default value). At the end of the run, you can obtain the merging statistics for the last cycle; statistics for all other cycles are also available in log.txt.
An example of merging statistics:
Summary for 001/postref_cycle_1_merge.mtz
Bin  Resolution Range  Completeness          <N_obs>  |Rsplit   CC1/2  N_ind  |CCanom  N_ind|  <I/sigI>       <I>
-----------------------------------------------------------------------------------------------------------------
02   5.70 - 4.52       100.0   1055 /  1055    65.89    16.02   89.15   1055     0.00       0     20.17   2101.97
03   4.52 - 3.95       100.0   1032 /  1032    61.53    14.48   92.03   1032     0.00       0     20.39   2529.90
04   3.95 - 3.59       100.0   1016 /  1016    54.15    15.61   90.13   1016     0.00       0     16.69   1971.43
05   3.59 - 3.33       100.0   1004 /  1004    42.67    17.66   89.23   1004     0.00       0     14.21   1502.14
06   3.33 - 3.14       100.0   1013 /  1013    32.77    20.40   84.26   1013     0.00       0     11.76   1077.60
07   3.14 - 2.98       100.0    995 /   995    27.36    23.00   78.72    995     0.00       0     11.58    935.37
08   2.98 - 2.85       100.0   1006 /  1006    23.57    22.63   82.26   1006     0.00       0     10.56    722.62
09   2.85 - 2.74       100.0    986 /   986    16.64    28.51   72.90    985     0.00       0     10.01    591.56
10   2.74 - 2.65        99.9    989 /   990    12.41    31.35   72.95    987     0.00       0      9.91    515.07
11   2.65 - 2.56        99.7    979 /   982     9.35    37.14   65.31    970     0.00       0      9.31    438.96
12   2.56 - 2.49        98.0    979 /   999     6.06    45.98   45.37    930     0.00       0      9.45    390.05
13   2.49 - 2.42        95.1    931 /   979     4.46    50.68   34.20    834     0.00       0      8.93    334.80
14   2.42 - 2.37        91.7    896 /   977     3.35    55.66   37.15    729     0.00       0      9.27    320.17
15   2.37 - 2.31        83.9    829 /   988     2.61    56.92   43.21    600     0.00       0      9.60    296.67
16   2.31 - 2.26        72.4    702 /   969     1.97    65.81   26.89    386     0.00       0     10.29    284.39
17   2.26 - 2.22        59.1    582 /   985     1.75    64.72   31.28    275     0.00       0      9.87    284.06
18   2.22 - 2.18        52.9    513 /   970     1.51    71.27   16.86    188     0.00       0      8.93    215.31
19   2.18 - 2.14        35.7    349 /   978     1.32    62.26   68.25     90     0.00       0      8.22    199.09
20   2.14 - 2.10        23.1    227 /   981     1.20    92.14   -9.20     42     0.00       0      8.59    224.44
-----------------------------------------------------------------------------------------------------------------
TOTAL                   85.9  17224 / 20046    27.11    21.11   92.07  15305     0.00       0     12.87    999.53
-----------------------------------------------------------------------------------------------------------------
Summary of refinement and merging
No. good frames: 1809
No. bad cc frames: 153
No. bad G frames) : 0
No. bad unit cell frames: 5
No. bad gamma_e frames: 0
No. bad SE: 0
No. observations: 466997
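To compare trials, it can help to read the per-bin rows into named values. The sketch below is a hypothetical helper, not part of prime; it assumes each bin row has the whitespace-separated layout shown in the example above, and it reads the two counts in the completeness column as observed and possible reflections. The field names are chosen here for illustration.

```python
FIELD_NAMES = [
    "mean_n_obs", "rsplit", "cc_half", "n_ind",
    "cc_anom", "n_ind_anom", "mean_i_over_sigma", "mean_i",
]


def parse_stats_row(line):
    """Split one per-bin row of the merging table into named values."""
    tokens = line.split()
    # A bin row has 16 whitespace-separated tokens, including the "-" of the
    # resolution range and the "/" between the two completeness counts.
    if len(tokens) != 16 or tokens[2] != "-" or tokens[6] != "/":
        raise ValueError(f"unexpected row layout: {line!r}")
    row = {
        "bin": int(tokens[0]),
        "d_max": float(tokens[1]),
        "d_min": float(tokens[3]),
        "completeness": float(tokens[4]),
        "n_observed": int(tokens[5]),
        "n_possible": int(tokens[7]),
    }
    for name, token in zip(FIELD_NAMES, tokens[8:]):
        # Counts are stored as int, everything else as float.
        row[name] = int(token) if name.startswith("n_ind") else float(token)
    return row


example = "20 2.14 - 2.10 23.1 227 / 981 1.20 92.14 -9.20 42 0.00 0 8.59 224.44"
row = parse_stats_row(example)
# Completeness as reported, and recomputed from the two counts.
print(row["completeness"], round(100.0 * row["n_observed"] / row["n_possible"], 1))
```

For the last bin of the example, the reported completeness and the value recomputed from the two counts agree. The TOTAL row and the separator lines do not match the 16-token layout, so the function raises ValueError for them and they need to be handled separately.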
Advanced settings
Now that you have your first trial merged data set, you can explore different parameter settings to merge or to obtain the Bijvoet pairs (I+/I-) for your anomalous data set.
Anomalous data
target_anomalous_flag = True
In the last cycle, prime will output a reflection set with I+ and I-.
Indexing ambiguity
For space groups with indexing ambiguity, use the solutions from cctbx.xfel (see Tutorial for resolving indexing ambiguity) to merge the data set.
indexing_ambiguity {
  flag_on = True
  index_basis_in = /path/to/solution/pickle_file.pickle
}
Number of micro- and macrocycles
n_postref_cycle = 3
n_postref_sub_cycle = 3
Number of bins for merging statistics
n_bins = 20
Help for input parameters
Most input parameters are self-explanatory. However, you can use the -h switch to view help information for each parameter.
prime.postrefine -h