Advanced Merging
Here we describe two use cases for merging: one where an isomorphous structure is already known, and one with a new structure. In the examples below, the same command-line parameters are passed to both the MERGE and XMERGE steps, so the command script can be written in a condensed, non-redundant form. Not every parameter is used by both steps: nproc is used only by merge, and the scaling.* parameters are used only by xmerge.
Use case 1: Isomorphous replacement
In this case, the new XFEL data are scaled to a known isomorphous reference, either a synchrotron-solved structure or a previous XFEL structure:
$ vi psII_merge.csh
#!/bin/csh -f
set trial=${1}
set runs = 127,130,132,134,135,140,141,142,144,145,146,148,151,152,157,162,163
# build one data= argument per run
set datastring = \
`python -c "print(' '.join(['data=/my_result_directory/L785/r%04d/${trial}/integration'%i for i in [${runs}]]))"`
set tag = L785_2flash_${trial}
set effective_params = "d_min=4.7 \
output.n_bins=20 \
${datastring} \
model=/my_work/3bz1_3bz2_core.pdb \
nproc=16 \
merge_anomalous=True \
plot_single_index_histograms=False \
raw_data.sdfac_auto=True \
mysql.runtag=${tag} \
mysql.passwd=terp888 \
mysql.user=nick \
mysql.database=xfelnks \
scaling.mtz_file=./3bz1-sf.mtz \
scaling.show_plots=False \
scaling.algorithm=mark0 \
scaling.log_cutoff=9. \
set_average_unit_cell=True \
rescale_with_average_cell=True \
pixel_size=0.11 \
output.prefix=${tag}"
# the same parameter string is passed to both steps
cxi.merge ${effective_params}
cxi.xmerge ${effective_params}

$ ./psII_merge.csh 009 # merge the data from trial 009
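The `datastring` line in the script expands the comma-separated run list into one `data=` argument per run. As a minimal sketch of that expansion, the same logic can be written as a standalone Python function; the name `build_data_args` is hypothetical and is not part of cctbx.

```python
def build_data_args(base_dir, runs, trial):
    """Return one 'data=...' argument per run, mirroring the script's one-liner.

    base_dir, runs and trial correspond to the result directory, the run
    numbers and the trial number used in the csh script.
    """
    return ["data=%s/r%04d/%s/integration" % (base_dir, run, trial) for run in runs]


if __name__ == "__main__":
    # Two runs from use case 1, trial 009
    args = build_data_args("/my_result_directory/L785", [127, 130], "009")
    print(" ".join(args))
```

Run numbers are zero-padded to four digits (`r0127`), matching the `r%04d` pattern in the script.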
Use case 2: New structure
This is a completely unknown structure with no isomorphous reference:
$ vi bt_allmerge.csh
#!/bin/csh -f
set trial=${1}
set datadir = /my_work_area/LCLS/cxis9913
set runs = 62,63,64,65,66,67,68,69,70
# build one data= argument per run
set datastring = `python -c "print(' '.join(['data=${datadir}/r%04d/${trial}/integration'%i for i in [${runs}]]))"`
set tag = last_BT_${trial}
set effective_params = "d_min=3.0 \
output.n_bins=13 \
${datastring} \
target_unit_cell=81.8,94.0,123.0,90,90,90 \
target_space_group=P212121 \
nproc=16 \
merge_anomalous=True \
plot_single_index_histograms=False \
raw_data.sdfac_auto=True \
mysql.runtag=${tag} \
mysql.passwd=terp888 \
mysql.user=nick \
mysql.database=xfelnks \
scaling.mtz_file=fake_fake.mtz \
scaling.show_plots=True \
scaling.algorithm=mark1 \
scaling.log_cutoff=3. \
scaling.mtz_column_F=f-obs \
set_average_unit_cell=True \
rescale_with_average_cell=True \
output.prefix=${tag}"
# the same parameter string is passed to both steps
cxi.merge ${effective_params}
cxi.xmerge ${effective_params}

$ ./bt_allmerge.csh 001
The two cases contrasted
Note the following differences in the command lines for the two use cases:
model
[pdb file | mtz file] is provided only in case #1 as the scaling reference, not in case #2.
min_corr
is ignored in case #2; images are not discarded for lack of correlation to a reference.
target_unit_cell
is mandatory in case #2; it is used to form the list of reflections in the asymmetric unit. Not used in case #1.
target_space_group
[symbol] is mandatory in case #2 and is carried through to the mtz output.
scaling.mtz_file
is supplied in case #2, but here it is a dummy file name, used to output fake structure factor amplitudes for dummy scaling (do not supply the name of an existing file). In case #1 it is the actual file containing reference Iobs to calculate the CCiso.
scaling.algorithm
is set to mark1 (no scaling) for case #2; mark0 (isomorphous reference) for case #1.
scaling.mtz_column_F
is set to "f-obs", the label used by the fake structure factor generator for case #2. For case #1, set it to whatever label is to be used to look up the Iobs for calculating CCiso.
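To make the contrast concrete, the sketch below keeps the shared parameters in one dictionary and the case-specific ones in two others, then renders them as `key=value` arguments. The values are copied from the two scripts above; the names `COMMON`, `CASE_1`, `CASE_2`, `render_args` and `differing_keys` are hypothetical helpers for illustration only, not part of cxi.merge or cxi.xmerge.

```python
# Parameters identical in both example scripts (a subset, for brevity)
COMMON = {
    "nproc": "16",
    "merge_anomalous": "True",
    "set_average_unit_cell": "True",
    "rescale_with_average_cell": "True",
}

# Use case 1: isomorphous reference available
CASE_1 = {
    "model": "/my_work/3bz1_3bz2_core.pdb",
    "scaling.mtz_file": "./3bz1-sf.mtz",
    "scaling.algorithm": "mark0",
}

# Use case 2: new structure, no reference
CASE_2 = {
    "target_unit_cell": "81.8,94.0,123.0,90,90,90",
    "target_space_group": "P212121",
    "scaling.mtz_file": "fake_fake.mtz",
    "scaling.algorithm": "mark1",
    "scaling.mtz_column_F": "f-obs",
}


def render_args(common, specific):
    """Merge shared and case-specific parameters into sorted key=value strings."""
    merged = dict(common)
    merged.update(specific)
    return ["%s=%s" % (key, value) for key, value in sorted(merged.items())]


def differing_keys(a, b):
    """Return the sorted parameter names that are absent from one case or differ in value."""
    return sorted(key for key in set(a) | set(b) if a.get(key) != b.get(key))


if __name__ == "__main__":
    print(" ".join(render_args(COMMON, CASE_2)))
    print(differing_keys(CASE_1, CASE_2))
```

The resulting argument list could be passed unchanged to both steps, in the same way the csh scripts reuse `${effective_params}`.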
Additional explanation of parameters
- model
- a PDB file (heavy atoms omitted) used to compute Fmodel intensities for frame-by-frame scaling.
- nproc
- the xmerge script is fast and uses only one core, so this parameter affects merge only and has no effect on xmerge.
- scaling.mtz_file
- isomorphous experimental structure factors for computing the isomorphous correlation-coefficient CCiso; measures data quality only; does not affect scaling.
- scaling.algorithm
- mark0: the usual method, rejects some frames based on low correlation with reference PDB model, then scales frame-by-frame with data scaled to the isomorphous reference.
mark1: no scaling.
- scaling.log_cutoff
- For the calculation of correlation coefficients, ignore the weak data (affects only the reported quality measures, not the scaling algorithm). Cutoff is expressed on a log scale. The only way to determine a good cutoff is to use scaling.show_plots=True the first time through, and experiment with different values.
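The effect of scaling.log_cutoff on a reported correlation coefficient can be illustrated with a small sketch. It assumes the cutoff applies to the natural logarithm of the intensity and that a reflection pair is dropped when either value falls below the cutoff; both are assumptions made for illustration, and the actual behavior of xmerge may differ. The function names are hypothetical.

```python
import math


def filter_by_log_cutoff(reference, observed, log_cutoff):
    """Keep (reference, observed) pairs whose log-intensities both reach the cutoff.

    Assumption: natural log, and non-positive intensities are dropped
    because their logarithm is undefined.
    """
    kept = []
    for i_ref, i_obs in zip(reference, observed):
        if i_ref <= 0 or i_obs <= 0:
            continue
        if math.log(i_ref) < log_cutoff or math.log(i_obs) < log_cutoff:
            continue
        kept.append((i_ref, i_obs))
    return kept


def pearson_cc(pairs):
    """Pearson correlation coefficient of the pairs, or None if undefined."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    reference = [1.0, 100.0, 1000.0, 10000.0]
    observed = [50.0, 110.0, 900.0, 11000.0]
    # Try a few cutoffs, as one would when experimenting with show_plots=True
    for cutoff in (0.0, 3.0, 5.0):
        pairs = filter_by_log_cutoff(reference, observed, cutoff)
        print(cutoff, len(pairs), pearson_cc(pairs))
```

Because the cutoff changes only which pairs enter the statistic, it alters the reported quality measure without touching the merged data, consistent with the description above.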