Indexing individual stills
Background
Recent advances at synchrotron sources, such as nanofocus beamlines, brighter light and new injection technologies, are enabling the collection of crystal diffraction data by non-traditional means, specifically without the use of a goniometer. Without rotational information, still photographs of single crystals present new challenges for indexing software. cctbx.xfel was designed to address these issues for still images collected from X-ray free electron lasers, specifically when processing with the significant computing and storage facilities available at the Stanford Linear Accelerator. However, as still images become more common at other sources, we have adapted our first public release of cctbx.xfel to process still images from non-XFEL sources. A complete guide is presented here. Note that a significant revision of cctbx.xfel is underway, which will make much of this tutorial obsolete. Until then, we hope the information presented here is useful.
Converting image files to the cctbx.xfel pickle image format
cctbx.xfel has certain requirements on the images it indexes. While some crystal images are directly compatible with the software, many are not, due to two limitations: 1) cctbx.xfel requires a complete description of the detector geometry and experimental parameters to be present in the image’s header. Generally this consists of detector distance, wavelength, pixel size (square pixels are assumed), saturation limits and beam center. Initial estimates of these values are required to a certain accuracy, and are often missing from the image header. 2) For legacy reasons, cctbx.xfel currently requires the beam center to match the image center (though this will not be true in the near future). In order to use images that don't satisfy both of these criteria, use cxi.image2pickle to convert them to cctbx.xfel image pickles:
$ cxi.image2pickle --help
Usage: cxi.image2pickle [-v] [-c] [-s] [-w wavelength] [-d distance] [-p pixel_size] [-x beam_x] [-y beam_y] [-o overload] files

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Print more information about progress
  -c, --crop            Crop the image such that the beam center is in the middle
  -s, --skip-converted  Skip converting if an image already exists that matches the destination file name
  -w WAVELENGTH, --wavelength=WAVELENGTH
                        Override the image's wavelength (angstroms)
  -d DISTANCE, --distance=DISTANCE
                        Override the detector distance (mm)
  -p PIXEL_SIZE, --pixel-size=PIXEL_SIZE
                        Override the detector pixel size (mm)
  -x BEAM_CENTER_X, --beam-x=BEAM_CENTER_X
                        Override the beam x position (pixels)
  -y BEAM_CENTER_Y, --beam-y=BEAM_CENTER_Y
                        Override the beam y position (pixels)
  -o OVERLOAD, --overload=OVERLOAD
                        Override the detector overload value (ADU)
Note that if the image header already contains valid values for some parameters, such as wavelength or detector distance, the corresponding options can be omitted and the program will try to determine those values automatically. The program will not run unless all of the required information is available. The -c option (crop) will crop the image such that the image center matches the beam center, and adjust the beam center location accordingly. This option is recommended for most data. However, if this leads to significant loss of pixels at the edges, please contact the authors. A padding option can be made available instead.
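To judge how many pixels cropping would cost before converting a dataset, the geometry can be worked out by hand. The following is a minimal sketch of that arithmetic, not the implementation used by cxi.image2pickle: the function name, the choice of the largest beam-centered rectangle, and the example image size and beam position are all assumptions made for illustration.

```python
def crop_box_centered_on_beam(width, height, beam_x, beam_y):
    """Return (x0, y0, x1, y1), the largest box centered on the beam.

    All values are in pixels; x1 and y1 are exclusive bounds.
    Hypothetical helper, not part of cctbx.xfel.
    """
    if not (0 <= beam_x < width and 0 <= beam_y < height):
        raise ValueError("beam center lies outside the image")
    # In each direction the half-size is limited by the nearer edge.
    half_x = min(beam_x, width - beam_x)
    half_y = min(beam_y, height - beam_y)
    return (beam_x - half_x, beam_y - half_y, beam_x + half_x, beam_y + half_y)


# Hypothetical 2048 x 2048 image with an off-center beam.
WIDTH, HEIGHT, BEAM_X, BEAM_Y = 2048, 2048, 1000, 1100

box = crop_box_centered_on_beam(WIDTH, HEIGHT, BEAM_X, BEAM_Y)
x0, y0, x1, y1 = box
# After cropping, the beam position is measured from the new origin.
new_beam = (BEAM_X - x0, BEAM_Y - y0)
# Fraction of the original pixels discarded by the crop.
lost_fraction = 1.0 - ((x1 - x0) * (y1 - y0)) / float(WIDTH * HEIGHT)
print("crop box:", box)
print("new beam center:", new_beam)
print("fraction of pixels lost:", round(lost_fraction, 3))
```

If the lost fraction is large, that is the situation in which padding rather than cropping would be preferable.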
Verifying the experimental geometry with virtual powder rings
Optionally, the detector distance, wavelength, and beam center can be verified using a maximum projection of the available images. A maximum projection is a composite image in which each pixel holds the brightest value recorded at that position across a given set of images. After converting the images in question to pickle format, use cxi.image_average:
$ cxi.image_average --help
Usage: cxi.image_average [-v] [-a PATH] [-m PATH] [-s PATH] image1 image2 [image3 ...]

Options:
  -h, --help            show this help message and exit
  -a PATH, --average-path=PATH
                        Write average image to PATH
  -m PATH, --maximum-path=PATH
                        Write maximum projection image to PATH
  -s PATH, --stddev-path=PATH
                        Write standard deviation image to PATH
  -v, --verbose         Print more information about progress
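The idea of a maximum projection can be illustrated with a few lines of plain Python. This is only a sketch of the definition given above, operating on small nested lists instead of real detector images; the function name is hypothetical and it says nothing about how cxi.image_average is implemented.

```python
def maximum_projection(images):
    """Per-pixel maximum over equally sized 2D images (lists of rows)."""
    if not images:
        raise ValueError("need at least one image")
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(row) != cols for row in img):
            raise ValueError("all images must have the same shape")
    # For each pixel position, keep the brightest value seen in any image.
    return [
        [max(img[r][c] for img in images) for c in range(cols)]
        for r in range(rows)
    ]


# Three tiny made-up "stills", each with a single bright spot elsewhere.
stills = [
    [[9, 1, 1], [1, 1, 1]],
    [[1, 1, 1], [1, 8, 1]],
    [[1, 1, 7], [1, 1, 1]],
]
projection = maximum_projection(stills)
for row in projection:
    print(row)
```

Because every still contributes its brightest pixels, spots from many randomly oriented crystals accumulate into rings, which is why the result resembles a powder pattern.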
The resulting maximum projection can be inspected with cctbx.image_viewer. Once it is loaded, the ring tool in the Actions menu can be used to verify that the beam center is aligned with the center of the virtual powder rings in the maximum projection. If it is not, rerun cxi.image2pickle with an adjusted beam center.
To verify the distance and wavelength, assuming the unit cell of the crystal is already known, use cxi.view as follows (given a maximum projection named max.pickle):
$ cxi.view max.pickle viewer.calibrate_unitcell.d_min=4 viewer.calibrate_unitcell.unitcell=78,78,38,90,90,90 viewer.calibrate_unitcell.spacegroup=P43212
Here, powder rings for lysozyme will be displayed. The distance can be changed until the rings align. Then, regenerate your image pickles using cxi.image2pickle and the new parameters.
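To see why changing the distance moves the rings, it can help to compute where a ring is expected to fall. The sketch below applies Bragg's law to the tetragonal cell from the command above, assuming a flat detector perpendicular to the beam. The wavelength, distances and pixel size are hypothetical values chosen for illustration, the function names are not part of cctbx.xfel, and systematic absences of the space group are ignored.

```python
import math


def tetragonal_d_spacing(h, k, l, a, c):
    """d-spacing (angstroms) of reflection (h, k, l) for a tetragonal cell."""
    inv_d_sq = (h * h + k * k) / (a * a) + (l * l) / (c * c)
    if inv_d_sq == 0:
        raise ValueError("(0, 0, 0) has no d-spacing")
    return 1.0 / math.sqrt(inv_d_sq)


def ring_radius_px(d, wavelength, distance_mm, pixel_size_mm):
    """Radius in pixels of the powder ring for d-spacing d (angstroms)."""
    sin_theta = wavelength / (2.0 * d)  # Bragg's law: lambda = 2 d sin(theta)
    if sin_theta >= 1.0:
        raise ValueError("d-spacing not reachable at this wavelength")
    two_theta = 2.0 * math.asin(sin_theta)
    if two_theta >= math.pi / 2.0:
        raise ValueError("ring does not reach a flat forward detector")
    return distance_mm * math.tan(two_theta) / pixel_size_mm


# Hypothetical experimental values, for illustration only.
WAVELENGTH = 1.0       # angstroms
PIXEL_SIZE_MM = 0.1    # mm
A, C = 78.0, 38.0      # lysozyme cell edges from the cxi.view example

for distance_mm in (100.0, 110.0):
    radii = [
        ring_radius_px(tetragonal_d_spacing(h, k, l, A, C),
                       WAVELENGTH, distance_mm, PIXEL_SIZE_MM)
        for (h, k, l) in [(1, 1, 0), (2, 1, 0), (2, 1, 1)]
    ]
    print(distance_mm, [round(r, 1) for r in radii])
```

Under these assumptions every ring radius scales linearly with the detector distance, so a distance that is slightly too long makes all predicted rings fall outside the observed ones by the same proportion.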
Determine the best spotfinder parameters
Every crystal preparation is different, from the size of the crystals to the composition of the mother liquor that contributes to the background. Because of this, it is important to optimize your spotfinder parameters for your specific data of interest. We use Labelit’s spotfinder, whose parameters are described in detail here [1]. The defaults are fairly well tuned for synchrotron data in general, but often need further adjustment. Two programs are of interest, distl.image_viewer and distl.signal_strength. The former displays the image with spotfinder spots highlighted and the latter simply prints a summary of spotfinder spot counts in various categories. We recommend viewing the image with the default settings, then tweaking distl.minimum_spot_area or distl.minimum_signal_height and distl.minimum_spot_height (the latter two should usually be the same). Pass the new parameters using this syntax:
$ distl.image_viewer <image file> distl.minimum_spot_area=2.
The goal here is to adjust the parameters until only real spots are found by the program and noise is not. It is also feasible to optimize these parameters using a grid search and cxi.index (see below).
Using cxi.index to process the images
cxi.index is the command line interface to our indexing software. It uses the Rossmann DPS indexing algorithm as implemented by Labelit to get initial basis vectors, then uses still-specific algorithms for reflection prediction, as documented here [2] and here [3]. Notably, we have recently added multiprocessing to cxi.index (see the -n option). Here is the full description:
$ cxi.index --help
Usage: cxi.index [-d] [-n num_procs] [-o output_dir] [-b output_basename] target=targetpath files

Options:
  -h, --help            show this help message and exit
  -d, --no-display      Do not show indexing graphics
  -n NUM_PROCS, --num-procs=NUM_PROCS
                        Number of processors to use
  -o OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        Directory for integration pickles
  -b OUTPUT_BASENAME, --output-basename=OUTPUT_BASENAME
                        String to append to the front of output integration pickles

Target: the phil file containing further indexing/integration parameters

The simplest way to index a single image is as follows:
$ cxi.index <imagefile> target=<phil file>
This saves no output and displays three views of the image (assuming indexing was successful). The first is the image indexed in the P1 setting. Red pixels are spotfinder spots, blue are integration mask pixels and yellow are background pixels. The second display is like the first, but in the highest symmetry setting found by indexing, and with an automatically determined resolution cutoff applied. A third is also displayed <need info from Nick>. To hide these displays, use -d. The target is a phil file with sample-specific indexing settings as described in our Phil section (though note that the discussion of metrology is only relevant for the CSPAD detector). A typical indexing run on a large set of images in a single directory is shown below. It assumes that the images have been converted with cxi.image2pickle, that 12 cores are available, that the directory results/000 exists, and that the phil file is named target.phil:
$ cxi.index -d -n 12 -o results/000 target=target.phil *.pickle | tee results/trial_000.log
After indexing, results/000 will contain files of the form int_<original file name>.pickle. These are not image pickles but integration pickles, containing the integrated intensities and Miller indices for each image. They are the direct input for cxi.merge, as described in [Merging] and [Advanced Merging].
Using a grid search with cxi.index
It is usually pretty straightforward to find spotfinder parameters that work well for your data. However, if you'd like to be more thorough, below is an example command for testing a variety of spotfinding parameters in a grid-based fashion. First, comment out in your phil file any spotfinder parameters that this command overrides. The command assumes a directory named results exists in the current folder.
$ for i in `seq 2 6`; do for j in `seq 2 6`; do
    mkdir -p results/trial_a${i}_s${j}
    cxi.index <pickle files> target=<path to phil file> \
      -d -o results/trial_a${i}_s${j} \
      distl.minimum_spot_area=$i distl.minimum_signal_height=$j distl.minimum_spot_height=$j \
      | tee results/trial_a${i}_s${j}.log
  done; done
Using -n here for cxi.index will be useful! If, say, 100 images are tested, whichever combination gives the largest number of indexed images is likely best for the entire dataset.
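One way to compare the trials is to count the integration pickles written to each output directory. The sketch below does this with the standard library only. It assumes that one int_*.pickle file is written per successfully indexed image and that each trial has its own directory under results whose name starts with trial_; the function names are hypothetical and not part of cctbx.xfel.

```python
from pathlib import Path


def count_integration_pickles(results_root):
    """Map each trial directory name to its number of int_*.pickle files."""
    counts = {}
    for trial_dir in sorted(Path(results_root).glob("trial_*")):
        # Skip the trial_*.log files that sit next to the directories.
        if trial_dir.is_dir():
            counts[trial_dir.name] = sum(1 for _ in trial_dir.glob("int_*.pickle"))
    return counts


def best_trial(counts):
    """Return the trial name with the most integration pickles."""
    if not counts:
        raise ValueError("no trial directories found")
    return max(counts, key=counts.get)


counts = count_integration_pickles("results")
for name, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(name, n)
if counts:
    print("best:", best_trial(counts))
```

Ties are resolved in favor of the first trial in sorted order, so it is worth looking at the full printed table and not only the single best entry.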