Cctbx.prime

From cctbx_xfel

Prime: post-refinement and merging

With the latest update, prime can process data on multiple nodes through a queuing system. At the moment, only LSF (bsub) is supported. See the documentation below for more information on how to use the queuing system.

This major update replaces prime.postrefine with

<pr> prime.run </pr>

For auto mode, you can still use prime.run with your parameter .phil file as before. For manual mode, the available subcommands in prime are:

<pr> prime.genref generates a reference set from given integration results
prime.postrefine refines all images
prime.merge merges all refined results into an mtz file
</pr>

You can choose to run these commands independently (ideally in the above order).
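The ordering above can be sketched as a small Python driver. This is only an illustration: the file name params.phil is hypothetical, and passing the .phil file as the sole argument to every subcommand is an assumption (this page only shows that form for prime.postrefine).

```python
import shlex
import subprocess

# Subcommands in the order recommended above.
PRIME_STEPS = ["prime.genref", "prime.postrefine", "prime.merge"]


def build_pipeline(phil_path):
    """Return one argv list per step.

    Assumption: every subcommand takes the .phil file as its only argument.
    """
    return [[step, phil_path] for step in PRIME_STEPS]


def run_pipeline(phil_path, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for argv in build_pipeline(phil_path):
        print("$ " + shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # params.phil is a hypothetical file name; the dry run only prints.
    run_pipeline("params.phil", dry_run=True)
```

With dry_run=True the script only prints the three commands, so the order can be checked before anything is executed.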

Step-by-step guidelines to post-refine and merge XFEL diffraction images. For more detail and the citation, see "Enabling X-ray Free Electron Laser Crystallography for Challenging Biological Systems from a Limited Number of Crystals", DOI: http://dx.doi.org/10.7554/eLife.05421

Step I: Generating input file

Like most programs developed under the cctbx framework, prime reads an input .phil file, which stores all the parameters needed to run the post-refinement and merging steps. To generate a template .phil file, do a dry run by calling

$ prime.postrefine

An example of the template .phil file:

data = None
run_no = None
title = None
scale {
  d_min = 0.1
  d_max = 99
  sigma_min = 1.5
}
...

You can save the output under any file name; in this tutorial, we save it as thermolysin.phil.

Step II: Update input parameters

For the first trial, set the required parameters to match your experiment. You can leave the other parameters at their default values, or simply delete them from your .phil file. For this tutorial, we use parameters from our thermolysin data set.

data = /path/to/your/integration/result/pickle_files
run_no = 001
title = First trial for thermolysin
scale {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
}
postref {
  scale {
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  crystal_orientation {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  reflecting_range {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  unit_cell {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
  allparams {
    flag_on = False
    d_min = 0.1
    d_max = 99
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
}
merge {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
  partiality_min = 0.1
  uc_tolerance = 3
}
target_unit_cell = 93.99,93.99,130.87,90,90,120
target_space_group = P 61 2 2
pixel_size_mm = 0.102
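Because the same resolution limits are repeated in the scale, postref and merge scopes, it can help to generate those scopes from one set of values. The following Python sketch is not part of prime: render_block and build_phil are hypothetical helpers that only produce text in the layout shown above, using the parameter names from this tutorial.

```python
def render_block(name, params, indent=0):
    """Render a nested dict as a .phil scope, two spaces per level."""
    pad = "  " * indent
    lines = [f"{pad}{name} {{"]
    for key, value in params.items():
        if isinstance(value, dict):
            lines.extend(render_block(key, value, indent + 1))
        else:
            lines.append(f"{pad}  {key} = {value}")
    lines.append(f"{pad}}}")
    return lines


def build_phil(d_min, d_max, sigma_min=1.5, partiality_min=0.1, uc_tolerance=3):
    """Build the scale, postref and merge scopes from shared limits."""
    limits = {"d_min": d_min, "d_max": d_max, "sigma_min": sigma_min}
    refine = {"flag_on": True, **limits, "partiality_min": partiality_min}
    postref = {
        "scale": {**limits, "partiality_min": partiality_min},
        "crystal_orientation": dict(refine),
        "reflecting_range": dict(refine),
        "unit_cell": {**refine, "uc_tolerance": uc_tolerance},
        # allparams stays switched off with wide limits, as in the file above.
        "allparams": {
            "flag_on": False, "d_min": 0.1, "d_max": 99,
            "sigma_min": sigma_min, "partiality_min": partiality_min,
            "uc_tolerance": uc_tolerance,
        },
    }
    merge = {**limits, "partiality_min": partiality_min,
             "uc_tolerance": uc_tolerance}
    lines = render_block("scale", limits)
    lines += render_block("postref", postref)
    lines += render_block("merge", merge)
    return "\n".join(lines)


# Values taken from the thermolysin example above.
phil_text = build_phil(d_min=2.1, d_max=45)
print(phil_text)
```

The sketch covers only the three scopes; top-level parameters such as data, run_no, title, target_unit_cell, target_space_group and pixel_size_mm still have to be added by hand.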

Step III: Post-refine and merge

Once you have the input .phil file, you can run prime by calling

prime.postrefine thermolysin.phil

Prime will post-refine and merge the reflection sets using three macrocycles (the default). At the end of the run, you obtain the merging statistics for the last cycle; the statistics for all other cycles are also available in log.txt.

An example of merging statistics:

Summary for 001/postref_cycle_1_merge.mtz
Bin Resolution Range     Completeness      <N_obs>  |Rsplit  CC1/2  N_ind |CCanom   N_ind| <I/sigI>   <I>
-------------------------------------------------------------------------------------------------------------
02    5.70 -    4.52 100.0   1055 /   1055   65.89   16.02   89.15   1055    0.00      0    20.17    2101.97
03    4.52 -    3.95 100.0   1032 /   1032   61.53   14.48   92.03   1032    0.00      0    20.39    2529.90
04    3.95 -    3.59 100.0   1016 /   1016   54.15   15.61   90.13   1016    0.00      0    16.69    1971.43
05    3.59 -    3.33 100.0   1004 /   1004   42.67   17.66   89.23   1004    0.00      0    14.21    1502.14
06    3.33 -    3.14 100.0   1013 /   1013   32.77   20.40   84.26   1013    0.00      0    11.76    1077.60
07    3.14 -    2.98 100.0    995 /    995   27.36   23.00   78.72    995    0.00      0    11.58     935.37
08    2.98 -    2.85 100.0   1006 /   1006   23.57   22.63   82.26   1006    0.00      0    10.56     722.62
09    2.85 -    2.74 100.0    986 /    986   16.64   28.51   72.90    985    0.00      0    10.01     591.56
10    2.74 -    2.65  99.9    989 /    990   12.41   31.35   72.95    987    0.00      0     9.91     515.07
11    2.65 -    2.56  99.7    979 /    982    9.35   37.14   65.31    970    0.00      0     9.31     438.96
12    2.56 -    2.49  98.0    979 /    999    6.06   45.98   45.37    930    0.00      0     9.45     390.05
13    2.49 -    2.42  95.1    931 /    979    4.46   50.68   34.20    834    0.00      0     8.93     334.80
14    2.42 -    2.37  91.7    896 /    977    3.35   55.66   37.15    729    0.00      0     9.27     320.17
15    2.37 -    2.31  83.9    829 /    988    2.61   56.92   43.21    600    0.00      0     9.60     296.67
16    2.31 -    2.26  72.4    702 /    969    1.97   65.81   26.89    386    0.00      0    10.29     284.39
17    2.26 -    2.22  59.1    582 /    985    1.75   64.72   31.28    275    0.00      0     9.87     284.06
18    2.22 -    2.18  52.9    513 /    970    1.51   71.27   16.86    188    0.00      0     8.93     215.31
19    2.18 -    2.14  35.7    349 /    978    1.32   62.26   68.25     90    0.00      0     8.22     199.09
20    2.14 -    2.10  23.1    227 /    981    1.20   92.14   -9.20     42    0.00      0     8.59     224.44
-------------------------------------------------------------------------------------------------------------
        TOTAL         85.9  17224 /  20046   27.11   21.11   92.07  15305    0.00      0    12.87     999.53
-------------------------------------------------------------------------------------------------------------

Summary of refinement and merging
 No. good frames:                  1809
 No. bad cc frames:                 153
 No. bad G frames) :                  0
 No. bad unit cell frames:            5
 No. bad gamma_e frames:              0
 No. bad SE:                          0
 No. observations:               466997
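To compare trials, the per-bin table above can be read into Python. The sketch below is an illustration, not a prime feature: the column mapping is inferred from the table header, the sample rows are copied from the table above, and the 90% completeness threshold is an arbitrary example value rather than a recommendation.

```python
SAMPLE = """\
12    2.56 -    2.49  98.0    979 /    999    6.06   45.98   45.37    930    0.00      0     9.45     390.05
13    2.49 -    2.42  95.1    931 /    979    4.46   50.68   34.20    834    0.00      0     8.93     334.80
14    2.42 -    2.37  91.7    896 /    977    3.35   55.66   37.15    729    0.00      0     9.27     320.17
15    2.37 -    2.31  83.9    829 /    988    2.61   56.92   43.21    600    0.00      0     9.60     296.67
20    2.14 -    2.10  23.1    227 /    981    1.20   92.14   -9.20     42    0.00      0     8.59     224.44
"""


def parse_bins(text):
    """Parse per-bin rows; header, rule lines and the TOTAL row are skipped."""
    bins = []
    for line in text.splitlines():
        tok = line.split()
        # Data rows split into 16 tokens with '-' and '/' as separators.
        if len(tok) != 16 or tok[2] != "-" or tok[6] != "/":
            continue
        bins.append({
            "bin": int(tok[0]),
            "d_max": float(tok[1]),
            "d_min": float(tok[3]),
            "completeness": float(tok[4]),
            "mean_n_obs": float(tok[8]),
            "rsplit": float(tok[9]),
            "cc_half": float(tok[10]),
            "i_over_sigi": float(tok[14]),
        })
    return bins


def resolution_cutoff(bins, min_completeness=90.0):
    """Return d_min of the last bin before completeness first drops
    below the threshold (bins are assumed to run from low to high
    resolution), or None if the first bin already fails."""
    cutoff = None
    for b in bins:
        if b["completeness"] < min_completeness:
            break
        cutoff = b["d_min"]
    return cutoff


bins = parse_bins(SAMPLE)
cutoff = resolution_cutoff(bins)
print(cutoff)
```

The same parse_bins function can be applied to the full table text from log.txt, provided the row layout matches the one shown here.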

Advanced settings

Now that you have your first merged data set, you can explore different parameter settings for merging, or obtain the Bijvoet pairs (I+/I-) for an anomalous data set.

Anomalous data

target_anomalous_flag = True

In the last cycle, prime will output a reflection set with I+ and I-.

Indexing ambiguity

For space groups with indexing ambiguity, use the solutions from cctbx.xfel (see Tutorial for resolving indexing ambiguity) to merge the data set.

indexing_ambiguity {
  flag_on = True
  index_basis_in = /path/to/solution/pickle_file.pickle
}

Number of micro- and macrocycles

n_postref_cycle = 3
n_postref_sub_cycle = 3

Number of bins for merging statistics

n_bins = 20


Help for input parameters

Most input parameters are self-explanatory. However, you can run prime.postrefine with the -h switch to view help information for each parameter.

prime.postrefine -h