File formats

From cctbx_xfel
Revision as of 14:03, 30 March 2016 by Aaron (Talk | contribs)

This page is under construction


cctbx.xfel's current native file formats, image pickles and integration pickles, are mainly intermediate formats useful for debugging and software development. These binary pickle files are serialized Python dictionaries optimized for machine readability. The rationale is that the raw data from an experiment at LCLS are stored in the XTC streams. We process these in memory without needing to write images to disk at all, but can write image pickles on request for use with the cctbx image viewer for diagnostic purposes. Integration pickles contain integrated intensities together with the crystal unit cell and orientation parameters, and are read by the merging programs cxi.merge and prime.postrefine.

In an effort to provide more human-readable data files and to conform to international standards, we have worked with representatives of ImageCIF to store the 64-tile segmented data in CBF format, a format used by a wide variety of detectors. These CSPAD CBFs are designed to be used with other crystallographic software packages.

Format specifications for these data files are provided below.

Image pickles

cctbx.xfel image pickles are a binary file format containing pixel data and image metadata. Under the hood these are serialized Python dictionaries, but for general users it is sufficient to know that the data consist of name/value pairs. The contents of an image pickle can be inspected with this command:

 cxi.print_pickle image.pickle

For example, here's the output from an image with a thermolysin diffraction pattern collected recently:

Printing contents of idx-20141201073601249.pickle
Detector format version: CXI 10.1
DISTANCE 119.002
SIZE1 1765
SIZE2 1765
TIMESTAMP 2014-12-01T07:36Z01.249
64 active areas, first one:  [715, 439, 909, 624]
WAVELENGTH 1.75124427107 , converted to eV: 7079.77687909
xtal_target None
DATA len=3115225 max=106065.000000 min=-104526.000000 dimensions=(1765, 1765)
cxi_versioned_extract()::cxi_version: CXI 10.1
64 translated active areas, first one:  [725, 449, 917, 632]

Each of the lines here is a name/value pair of data or metadata for the image. LABELIT, DIALS and cctbx.image_viewer can all process these files directly. Here's a description of each of the name/value pairs:

  • Detector format version: CXI 10.1. Technically this value is not stored in the image pickle. The detector format version is used internally by cctbx.xfel for data collected up to LCLS run 11, and is a way of looking up tile corrections stored in the software. After run 11, this information should be stored in the phil file used to process the data. The detector format version is looked up based on the detector address and event timestamp. All known detector format versions can be displayed with the command cxi.detector_format_versions.
  • DISTANCE: detector distance
  • DETECTOR_ADDRESS: string to identify the detector in the XTC stream. Recorded here as a unique identifier for the data when coupled with the event timestamp.
  • SIZE1 and SIZE2: image dimensions
  • TIMESTAMP: unique time stamp for this event in the form year-month-dayThour:minuteZsecond.millisecond (the example above, 01.249, carries three fractional digits)
  • PIXEL_SIZE: pixel size in mm
  • BEAM_CENTER_X, BEAM_CENTER_Y: beam center
  • MIN_TRUSTED_VALUE: underload
  • WAVELENGTH: wavelength as recorded in the XTC stream
  • DATA: pixel data. Some vital statistics are shown.
  • xtal_target and SEQUENCE_NUMBER: unused
  • 64 active areas: the active areas are coordinates that specify where the 64 tiles are in the image. They take the form of quartets of numbers [x1,y1,x2,y2], one quartet for each of the 64 tiles. The first active area is shown. Note that for non-CSPAD image pickles there will be only one active area, and it will be the size of the entire image.
  • 64 translated active areas: if corrections were available for this image, the first corrected active area is shown here.
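
As a rough illustration of how these name/value pairs relate to one another, the following Python sketch works on a hand-built dictionary that mimics the example output above. It does not read a real image pickle, which would normally require a cctbx installation to load. The helper names wavelength_to_ev, parse_timestamp and group_quartets are hypothetical, the key name ACTIVE_AREAS and the flat layout of its coordinates are assumptions, and the value used for h*c is an approximation.

```python
from datetime import datetime

# Approximate value of h*c in eV*Angstrom (an assumption of this sketch).
HC_EV_ANGSTROM = 12398.4187


def wavelength_to_ev(wavelength_angstrom):
    """Convert a wavelength in Angstroms to a photon energy in eV."""
    return HC_EV_ANGSTROM / wavelength_angstrom


def parse_timestamp(stamp):
    """Parse a stamp of the form year-month-dayThour:minuteZsecond.fraction."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%MZ%S.%f")


def group_quartets(flat):
    """Split a flat sequence of coordinates into [x1, y1, x2, y2] quartets."""
    if len(flat) % 4 != 0:
        raise ValueError("expected a multiple of four coordinates")
    return [list(flat[i:i + 4]) for i in range(0, len(flat), 4)]


# Hand-built stand-in for an image pickle; values copied from the example
# output. The key name ACTIVE_AREAS is assumed, and only the first of the
# 64 tiles is included.
image = {
    "DISTANCE": 119.002,
    "SIZE1": 1765,
    "SIZE2": 1765,
    "TIMESTAMP": "2014-12-01T07:36Z01.249",
    "WAVELENGTH": 1.75124427107,
    "ACTIVE_AREAS": [715, 439, 909, 624],
}

energy_ev = wavelength_to_ev(image["WAVELENGTH"])
when = parse_timestamp(image["TIMESTAMP"])
tiles = group_quartets(image["ACTIVE_AREAS"])
x1, y1, x2, y2 = tiles[0]
n_pixels = image["SIZE1"] * image["SIZE2"]

print("energy (eV): %.2f" % energy_ev)
print("event time:", when.isoformat())
print("first tile extent:", (x2 - x1, y2 - y1))
print("expected DATA length:", n_pixels)
```

The product SIZE1 * SIZE2 matches the len= value reported on the DATA line of the example, and the converted energy agrees with the eV value printed next to WAVELENGTH to within the precision of the constant used here.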

Integration pickles

Coming soon


CSPAD CBFs

CSPAD CBFs are not used directly by cctbx.xfel at this time. This aspect of the project is under development. The complete specification for CSPAD CBFs is laid out in an article in the Computational Crystallography Newsletter: "XFEL Detectors and ImageCIF", Computational Crystallography Newsletter 5, 19-24. (Reprint).

Creating CSPAD CBFs from XTC streams

cctbx.xfel.xtc_dump is useful for converting XTC streams to CBF files as needed. Use -c to get a listing of all options, and -c -a 2 to get a full listing with help strings. Example command:

 cd ~/myrelease # or wherever your release is
 cctbx.xfel.xtc_dump dispatch.max_events=1 input.experiment=xpptut15 input.address=cspad \
   input.run_num=54 format.file_format=cbf output.output_dir=xpptut15/out \
   format.cbf.detz_offset=100 input.override_energy=7000

Here, one image is created from experiment xpptut15, run 54. You can display the image with cctbx.image_viewer xpptut15/out/*.cbf. Note the pink gaps between the tiles. These result from the segmented nature of the CSPAD, which is preserved in the CBF file.

xpptut15 contains data from XPP's CSPAD that were collected without X-rays on. Hence format.cbf.detz_offset=100 and input.override_energy=7000 are set to fake values in this command.
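
When the same conversion has to be repeated for several runs, it can be convenient to assemble the command line programmatically. The sketch below only builds and prints the argument list from the parameters used in the example command; nothing is executed. The helper build_xtc_dump_command is hypothetical and is not part of cctbx.xfel.

```python
import shlex


def build_xtc_dump_command(params):
    """Return an argument list for cctbx.xfel.xtc_dump from name=value pairs."""
    return ["cctbx.xfel.xtc_dump"] + [
        "%s=%s" % (name, value) for name, value in params.items()
    ]


# Parameters copied from the example command above.
params = {
    "dispatch.max_events": 1,
    "input.experiment": "xpptut15",
    "input.address": "cspad",
    "input.run_num": 54,
    "format.file_format": "cbf",
    "output.output_dir": "xpptut15/out",
    "format.cbf.detz_offset": 100,
    "input.override_energy": 7000,
}

command = build_xtc_dump_command(params)

# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list could be passed to subprocess.run in an environment where the cctbx.xfel commands are on the path; changing input.run_num in the dictionary is enough to target another run.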

Converting from SLAC's metrology to CBF

The tile positions of the hierarchical CSPAD detector are specified by SLAC in a geometry file in the calib folder of each experiment. Typically this file is named something like If desired, this file can be converted to just the human-readable header portion of a CSPAD CBF using the cctbx.xfel command cxi.slaccalib2cbfheader. Example:

 cxi.slaccalib2cbfheader out=tmp.cbf

This CBF header can be converted back to SLAC format using cxi.cbfheader2slaccalib:

 cxi.cbfheader2slaccalib cbf_header=tmp.cbf

Displaying metrology files

cxi.display_metrology <filename> can be used to show a plot of tile positions. It accepts SLAC metrology files, CSPAD CBFs and image pickles.