Cctbx.prime

Prime: post-refinement and merging

Step-by-step guidelines for post-refining and merging XFEL diffraction images. For more detail and the citation, see "Enabling X-ray Free Electron Laser Crystallography for Challenging Biological Systems from a Limited Number of Crystals" (DOI: http://dx.doi.org/10.7554/eLife.05421).

Step I: Generating input file

Like most programs developed under the cctbx framework, prime reads an input .phil file, which stores all the parameters needed to run the post-refinement and merging steps. To generate a template .phil file, do a dry run by calling

prime.postrefine

An example of the template .phil file:

data = None
run_no = None
title = None
scale {
  d_min = 0.1
  d_max = 99
  sigma_min = 1.5
}
...

You can save the output under any file name; in this tutorial, we save it as thermolysin.phil.
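If you prefer to script this step, the following is a minimal Python sketch. It assumes that the dry run prints the template to standard output and nothing else; check the saved file, since the command may also print banner or log lines. The helper name save_command_output is hypothetical and is not part of prime.

```python
import subprocess


def save_command_output(command, out_path):
    """Run a command and write whatever it prints on stdout to out_path."""
    # check=False: the exit status of a dry run is not assumed to be zero.
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    with open(out_path, "w") as handle:
        handle.write(result.stdout)
    return out_path


# Intended use on a machine where prime is installed (not executed here):
#     save_command_output(["prime.postrefine"], "thermolysin.phil")
```

Open the saved file in a text editor before moving on to Step II.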

Step II: Update input parameters

For the first trial, set the required parameters to match your experiment (you can leave the other parameters at their default values, or simply delete them from your .phil file). For this tutorial, we use parameters from our thermolysin data set.

data = /path/to/your/integration/result/pickle_files
run_no = 001
title = First trial for thermolysin
scale {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
}
postref {
  scale {
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  crystal_orientation {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  reflecting_range {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
  }
  unit_cell {
    flag_on = True
    d_min = 2.1
    d_max = 45
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
  allparams {
    flag_on = False
    d_min = 0.1
    d_max = 99
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 3
  }
}
merge {
  d_min = 2.1
  d_max = 45
  sigma_min = 1.5
  partiality_min = 0.1
  uc_tolerance = 3
}
target_unit_cell = 93.99,93.99,130.87,90,90,120
target_space_group = P 61 2 2
pixel_size_mm = 0.102
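The same resolution limits are repeated in most blocks of the file above, so it is easy to update one block and forget another. The sketch below is one way to keep them in sync: a plain Python helper that renders the tutorial's .phil text from a single d_min/d_max pair. The function names render_phil and cutoff_lines are hypothetical and not part of prime; the helper only builds text and does not call any cctbx API. The allparams block is left switched off with the template defaults, as in the example above.

```python
def cutoff_lines(d_min, d_max, indent, extra=()):
    """Return the cutoff lines shared by the scale, postref and merge blocks."""
    pad = " " * indent
    lines = [f"{pad}d_min = {d_min}", f"{pad}d_max = {d_max}", f"{pad}sigma_min = 1.5"]
    lines += [f"{pad}{item}" for item in extra]
    return lines


def render_phil(data, run_no, title, d_min, d_max, unit_cell, space_group, pixel_size_mm):
    """Render the tutorial's .phil text with one shared resolution range."""
    out = [f"data = {data}", f"run_no = {run_no}", f"title = {title}", "scale {"]
    out += cutoff_lines(d_min, d_max, 2)
    out += ["}", "postref {", "  scale {"]
    out += cutoff_lines(d_min, d_max, 4, ["partiality_min = 0.1"])
    out += ["  }"]
    refined_blocks = [
        ("crystal_orientation", []),
        ("reflecting_range", []),
        ("unit_cell", ["uc_tolerance = 3"]),
    ]
    for name, extra in refined_blocks:
        out += [f"  {name} {{", "    flag_on = True"]
        out += cutoff_lines(d_min, d_max, 4, ["partiality_min = 0.1"] + extra)
        out += ["  }"]
    # allparams keeps the template defaults and stays switched off.
    out += ["  allparams {", "    flag_on = False"]
    out += cutoff_lines(0.1, 99, 4, ["partiality_min = 0.1", "uc_tolerance = 3"])
    out += ["  }", "}", "merge {"]
    out += cutoff_lines(d_min, d_max, 2, ["partiality_min = 0.1", "uc_tolerance = 3"])
    out += [
        "}",
        f"target_unit_cell = {unit_cell}",
        f"target_space_group = {space_group}",
        f"pixel_size_mm = {pixel_size_mm}",
    ]
    return "\n".join(out) + "\n"


# Values from the thermolysin example; write phil_text to thermolysin.phil to use it.
phil_text = render_phil(
    data="/path/to/your/integration/result/pickle_files",
    run_no="001",
    title="First trial for thermolysin",
    d_min=2.1,
    d_max=45,
    unit_cell="93.99,93.99,130.87,90,90,120",
    space_group="P 61 2 2",
    pixel_size_mm=0.102,
)
```

Changing d_min in the single call then updates the scale, postref and merge blocks together.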

Step III: Post-refine and merge

Once you have the input .phil file, you can run prime by calling

prime.postrefine thermolysin.phil

Prime will post-refine and merge the reflection sets using three macrocycles (the default). At the end of the run, you can obtain the merging statistics for the last cycle; statistics for all other cycles are also available in log.txt.

An example of merging statistics:

Summary for 001/postref_cycle_1_merge.mtz
Bin Resolution Range     Completeness      <N_obs>  |Rsplit  CC1/2  N_ind |CCanom   N_ind| <I/sigI>   <I>
-------------------------------------------------------------------------------------------------------------
02    5.70 -    4.52 100.0   1055 /   1055   65.89   16.02   89.15   1055    0.00      0    20.17    2101.97
03    4.52 -    3.95 100.0   1032 /   1032   61.53   14.48   92.03   1032    0.00      0    20.39    2529.90
04    3.95 -    3.59 100.0   1016 /   1016   54.15   15.61   90.13   1016    0.00      0    16.69    1971.43
05    3.59 -    3.33 100.0   1004 /   1004   42.67   17.66   89.23   1004    0.00      0    14.21    1502.14
06    3.33 -    3.14 100.0   1013 /   1013   32.77   20.40   84.26   1013    0.00      0    11.76    1077.60
07    3.14 -    2.98 100.0    995 /    995   27.36   23.00   78.72    995    0.00      0    11.58     935.37
08    2.98 -    2.85 100.0   1006 /   1006   23.57   22.63   82.26   1006    0.00      0    10.56     722.62
09    2.85 -    2.74 100.0    986 /    986   16.64   28.51   72.90    985    0.00      0    10.01     591.56
10    2.74 -    2.65  99.9    989 /    990   12.41   31.35   72.95    987    0.00      0     9.91     515.07
11    2.65 -    2.56  99.7    979 /    982    9.35   37.14   65.31    970    0.00      0     9.31     438.96
12    2.56 -    2.49  98.0    979 /    999    6.06   45.98   45.37    930    0.00      0     9.45     390.05
13    2.49 -    2.42  95.1    931 /    979    4.46   50.68   34.20    834    0.00      0     8.93     334.80
14    2.42 -    2.37  91.7    896 /    977    3.35   55.66   37.15    729    0.00      0     9.27     320.17
15    2.37 -    2.31  83.9    829 /    988    2.61   56.92   43.21    600    0.00      0     9.60     296.67
16    2.31 -    2.26  72.4    702 /    969    1.97   65.81   26.89    386    0.00      0    10.29     284.39
17    2.26 -    2.22  59.1    582 /    985    1.75   64.72   31.28    275    0.00      0     9.87     284.06
18    2.22 -    2.18  52.9    513 /    970    1.51   71.27   16.86    188    0.00      0     8.93     215.31
19    2.18 -    2.14  35.7    349 /    978    1.32   62.26   68.25     90    0.00      0     8.22     199.09
20    2.14 -    2.10  23.1    227 /    981    1.20   92.14   -9.20     42    0.00      0     8.59     224.44
-------------------------------------------------------------------------------------------------------------
        TOTAL         85.9  17224 /  20046   27.11   21.11   92.07  15305    0.00      0    12.87     999.53
-------------------------------------------------------------------------------------------------------------

Summary of refinement and merging
 No. good frames:                  1809
 No. bad cc frames:                 153
 No. bad G frames) :                  0
 No. bad unit cell frames:            5
 No. bad gamma_e frames:              0
 No. bad SE:                          0
 No. observations:               466997
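To inspect the per-bin statistics programmatically, the table rows can be parsed with a regular expression. The sketch below is illustrative only: the column mapping (completeness, observed / possible reflections, <N_obs>, Rsplit, CC1/2, N_ind) is read off the header and row layout of the example above, the function names are hypothetical, and the thresholds of 90% completeness and 50% CC1/2 are arbitrary values chosen for the demonstration, not recommendations from prime.

```python
import re

# A few rows copied from the example table above.
SAMPLE = """\
11    2.65 -    2.56  99.7    979 /    982    9.35   37.14   65.31    970    0.00      0     9.31     438.96
12    2.56 -    2.49  98.0    979 /    999    6.06   45.98   45.37    930    0.00      0     9.45     390.05
13    2.49 -    2.42  95.1    931 /    979    4.46   50.68   34.20    834    0.00      0     8.93     334.80
14    2.42 -    2.37  91.7    896 /    977    3.35   55.66   37.15    729    0.00      0     9.27     320.17
15    2.37 -    2.31  83.9    829 /    988    2.61   56.92   43.21    600    0.00      0     9.60     296.67
20    2.14 -    2.10  23.1    227 /    981    1.20   92.14   -9.20     42    0.00      0     8.59     224.44
"""

# bin, d_max, d_min, completeness, observed / possible, <N_obs>, Rsplit, CC1/2, N_ind
ROW = re.compile(
    r"^\s*(\d+)\s+([\d.]+)\s+-\s+([\d.]+)\s+([\d.]+)\s+(\d+)\s*/\s*(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+(-?[\d.]+)\s+(\d+)"
)


def parse_bins(text):
    """Return one dict per resolution bin; header, rule and TOTAL lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        rows.append({
            "bin": int(match.group(1)),
            "d_max": float(match.group(2)),
            "d_min": float(match.group(3)),
            "completeness": float(match.group(4)),
            "n_obs_mean": float(match.group(7)),
            "rsplit": float(match.group(8)),
            "cc_half": float(match.group(9)),
        })
    return rows


def first_bin_below(rows, key, threshold):
    """Return the first bin (low to high resolution) whose value drops below threshold."""
    for row in rows:
        if row[key] < threshold:
            return row
    return None


rows = parse_bins(SAMPLE)
low_completeness = first_bin_below(rows, "completeness", 90.0)
low_cc_half = first_bin_below(rows, "cc_half", 50.0)
```

For the rows shown, completeness first falls below 90% in bin 15 and CC1/2 first falls below 50% in bin 12, which gives a quick indication of where the data start to weaken.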

Advanced settings

Now that you have your first trial merged data set, you can explore different parameter settings for merging, or obtain the Bijvoet pairs (I+/I-) for an anomalous data set.

Anomalous data

target_anomalous_flag = True

In the last cycle, prime will output a reflection set with I+ and I-.

Indexing ambiguity

For space groups with an indexing ambiguity, use the solutions from cctbx.xfel (see Tutorial for resolving indexing ambiguity) to merge the data set.

indexing_ambiguity {
  flag_on = True
  index_basis_in = /path/to/solution/pickle_file.pickle
}

Number of macro- and microcycles

n_postref_cycle = 3
n_postref_sub_cycle = 3

Number of bins for merging statistics

n_bins = 20


Help for input parameters

Most input parameters are self-explanatory. However, you can run prime.postrefine with the -h switch to view help information for each parameter.

prime.postrefine -h