2017 prime tutorial

From cctbx_xfel

Post-refine and Merge Sample Data Set with PRIME (2017 Tutorial)

In this tutorial, we will work on the integration results from the first part of Tutorial 2 (Myoglobin Data). Before running the program, we will build a PRIME input file suited to this data set.

Generating input file

PRIME input files contain the information needed for the post-refinement and merging steps. You can review the list of input parameters and their descriptions by running prime.run or prime.run -h. For this tutorial, we will build the input file from scratch.

  • Location of integration results

In this case, we know where the integration results (pickle files) are located, so we can set:

data = /net/viper/raid1/mu238/XfelProject/dials17/extracted

Note that you can supply data as multiple arguments. Each value can be a file containing a list of integration results, a folder, or a wildcard pattern.
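
Before a long run, it can help to check how many integration results the data values actually resolve to. The following is a minimal sketch, not PRIME's own code: the helper name expand_data_args is hypothetical, and it assumes that a list file holds one path per line.

```python
import glob
import os


def expand_data_args(args):
    """Expand `data`-style values into a sorted list of unique paths.

    Each value is treated as a folder, a wildcard pattern, or a text file
    listing one integration result per line (an assumed convention).
    """
    paths = []
    for arg in args:
        if os.path.isdir(arg):
            # Folder: take every entry inside it.
            paths += [os.path.join(arg, name) for name in os.listdir(arg)]
        elif any(ch in arg for ch in "*?["):
            # Wildcard pattern: let glob expand it.
            paths += glob.glob(arg)
        elif os.path.isfile(arg):
            # List file: one path per non-empty line.
            with open(arg) as handle:
                paths += [line.strip() for line in handle if line.strip()]
    return sorted(set(paths))


files = expand_data_args(["/net/viper/raid1/mu238/XfelProject/dials17/extracted"])
print(len(files), "integration results found")
```

If the count is far from what you expect (about 5,000 for this data set), check the path or pattern before starting PRIME.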

  • Unit cell information

You can obtain the mean (or median) unit-cell dimensions from either IOTA or DIALS. If you use IOTA, a PRIME .phil file is generated automatically and already contains this information. For n_residues, enter the number of residues in the asymmetric unit.

target_unit_cell = 91.7 91.7 46 90 90 120
target_space_group = P6
n_residues = 128
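
If you only have per-image unit cells and not the IOTA-generated file, the median can be computed by hand. The sketch below uses three made-up cells purely for illustration; they are not taken from the Myoglobin data, and median_unit_cell is a hypothetical helper.

```python
from statistics import median


def median_unit_cell(cells):
    """Return the per-parameter median of (a, b, c, alpha, beta, gamma) tuples."""
    return tuple(median(column) for column in zip(*cells))


# Hypothetical per-image unit cells, for illustration only.
cells = [
    (91.5, 91.5, 45.9, 90, 90, 120),
    (91.7, 91.7, 46.0, 90, 90, 120),
    (91.9, 91.9, 46.1, 90, 90, 120),
]

target = median_unit_cell(cells)
print("target_unit_cell = " + " ".join(f"{value:g}" for value in target))
```

The printed line has the same layout as the target_unit_cell entry above, so it can be pasted into the .phil file.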

  • Detector information

pixel_size_mm = 0.172

  • Post-refinement and Scaling information

This is where you specify the resolution cutoffs for post-refinement and merging. When running for the first time on newly collected data, you can start from the "expected" values (the resolution at which you still see spots at the corners or edges of the image). You can then adjust these parameters based on the I/sigI values in the high-resolution shells of the merging statistics and rerun the program. Note that sigma_min is set to 1.5 in the scaling and post-refinement steps, while in the merging step it is set to -3.0 so that negative intensities can be included in the merged reflection set.

scale {
  d_min = 2.5
  d_max = 20
  sigma_min = 1.5
}
postref {
  scale {
    d_min = 2.5
    d_max = 20
    sigma_min = 1.5
    partiality_min = 0.1
  }
  allparams {
    flag_on = True
    d_min = 2.5
    d_max = 20
    sigma_min = 1.5
    partiality_min = 0.1
    uc_tolerance = 5
  }
}
merge {
  d_min = 2.5
  d_max = 20
  sigma_min = -3.0
  partiality_min = 0.1
  uc_tolerance = 5
}
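
To see why the merging step uses a negative sigma_min, the cutoffs can be pictured as a simple filter on each reflection. This is an illustrative sketch only: the function keep_reflection, the inclusive bounds, and the sample reflections are assumptions, not PRIME's implementation.

```python
def keep_reflection(d_spacing, intensity, sigma, d_min, d_max, sigma_min):
    """Return True if a reflection passes the resolution and I/sigI cutoffs."""
    in_resolution = d_min <= d_spacing <= d_max
    return in_resolution and (intensity / sigma) >= sigma_min


# Hypothetical reflections: (d-spacing in Angstrom, intensity, sigma).
reflections = [
    (3.0, 50.0, 10.0),   # I/sigI = 5, inside the resolution range
    (3.0, -4.0, 2.0),    # I/sigI = -2, weak negative measurement
    (2.2, 80.0, 8.0),    # beyond d_min = 2.5, always rejected
    (4.0, -20.0, 4.0),   # I/sigI = -5, strongly negative
]

for_scaling = [r for r in reflections
               if keep_reflection(*r, d_min=2.5, d_max=20, sigma_min=1.5)]
for_merging = [r for r in reflections
               if keep_reflection(*r, d_min=2.5, d_max=20, sigma_min=-3.0)]

print(len(for_scaling), "reflection(s) kept for scaling")
print(len(for_merging), "reflection(s) kept for merging")
```

With sigma_min = 1.5, only the strong reflection is used; with sigma_min = -3.0, the weak negative measurement is also kept, while the strongly negative one is still rejected.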
  • Indexing ambiguity

For data sets that are not in a polar space group and have no indexing ambiguity (which can also arise when two or more unit-cell dimensions are very similar but not identical), the .phil parameters set so far are sufficient to run post-refinement. However, this data set is in P6 (a polar space group), so the indexing ambiguity must be resolved before the other refinement and merging steps.

Another point worth noting is that, for any polar space group, PRIME will automatically resolve the ambiguity using the default parameters. However, this data set has about 5,000 integration results, so we adjust the number of images used for the random and best selections.

indexing_ambiguity {
 mode = Auto
 index_basis_in = None
 assigned_basis = None
 d_min = 3.0
 d_max = 10.0
 sigma_min = 1.5
 n_sample_frames = 1000
 n_selected_frames = 100
}

We left the other parameters at their default values and set n_sample_frames to 1000 and n_selected_frames to 100.
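
As background on what the ambiguity means for P6: each image can be indexed in one of two equally valid ways, related by the reindexing operator commonly tabulated for point group 6, (h, k, l) -> (k, h, -l). The sketch below only demonstrates that operator on one Miller index; the function name reindex_khl is hypothetical, and this is not how PRIME decides which setting each image belongs to.

```python
def reindex_khl(hkl):
    """Apply the alternative indexing (h, k, l) -> (k, h, -l)."""
    h, k, l = hkl
    return (k, h, -l)


original = (1, 2, 3)
alternative = reindex_khl(original)

print(original, "->", alternative)
# Applying the operator twice restores the original index.
print(alternative, "->", reindex_khl(alternative))
```

Resolving the ambiguity amounts to choosing, for every image, whether its indices should stay as they are or be passed through this operator so that all images agree.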

  • Number of bins

n_bins = 10

Now we have a complete .phil file ready to run.
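
Because the file nests several scopes, an unclosed brace is an easy mistake to make when assembling it by hand. The following is a rough sanity check, assuming only that scopes open with { and close with }; it is not a PHIL parser, and the excerpt shown is a shortened version of the parameters above.

```python
def braces_balanced(text):
    """Return True if every '{' has a matching '}' in the right order."""
    depth = 0
    for char in text:
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before its opening brace.
                return False
    return depth == 0


# Shortened excerpt of the tutorial parameters, for illustration.
phil_text = """
target_unit_cell = 91.7 91.7 46 90 90 120
target_space_group = P6
postref {
  scale {
    d_min = 2.5
    d_max = 20
  }
  allparams {
    flag_on = True
    uc_tolerance = 5
  }
}
n_bins = 10
"""

print("braces balanced:", braces_balanced(phil_text))
```

In practice you would read your own .phil file into a string and pass it to the same function before starting the run.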