Difference between revisions of "Overview"

From cctbx_xfel
m (Nitpicks)
Revision as of 22:17, 25 September 2013

cctbx.xfel at LCLS is built on five systems: the CS-PAD detector, pyana, LCLS’s queuing system, phil, and, fundamentally, cctbx.

The CS-PAD detector

Example CS-PAD image

The LCLS at full capacity operates at 120 Hz. The incident photon packets are delivered in ≈40 femtosecond wide pulses, each containing ≈10^12 photons. This high repetition rate and compact beam delivery time necessitated the construction of a new detector, in which reading out and streaming recorded data at these speeds is accomplished with 64 sensors arranged in a quadrangular pattern around a central hole (in the place of a beam stop). Each of the 4 quadrants contains 16 of the sensors and can be moved on rails radially away from the central hole to adjust the size of this hole. Indexing, predicting spot locations using a crystal orientation matrix, and integrating reflection intensities all require precise knowledge of the location of these sensors in three-dimensional space. For this reason, a portion of this tutorial describes the calibration and refinement of the tile metrology.
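The quadrant arrangement can be illustrated with a minimal Python sketch. It is not the cctbx.xfel metrology code. The sensor numbering (0 to 63, sixteen consecutive sensors per quadrant), the 45-degree diagonal travel direction, and all function names are assumptions made only for illustration.

```python
import math

N_QUADRANTS = 4
SENSORS_PER_QUADRANT = 16  # 4 x 16 = 64 sensors, as stated in the text


def quadrant_of(sensor_index):
    """Return the quadrant (0-3) holding a sensor numbered 0-63 (assumed numbering)."""
    if not 0 <= sensor_index < N_QUADRANTS * SENSORS_PER_QUADRANT:
        raise ValueError("sensor index out of range")
    return sensor_index // SENSORS_PER_QUADRANT


def quadrant_shift(quadrant, radial_offset_mm):
    """Return the (dx, dy) shift of a quadrant moved radially outward.

    Assumption: quadrant q travels along the diagonal at 45 + 90*q degrees.
    """
    angle = math.radians(45.0 + 90.0 * quadrant)
    return (radial_offset_mm * math.cos(angle),
            radial_offset_mm * math.sin(angle))


def sensor_position(nominal_xy, sensor_index, radial_offset_mm):
    """Shift a sensor's nominal (x, y) position by its quadrant's radial offset."""
    dx, dy = quadrant_shift(quadrant_of(sensor_index), radial_offset_mm)
    return (nominal_xy[0] + dx, nominal_xy[1] + dy)
```

Opening the central hole moves every sensor in a quadrant by the same vector, which is why the sensor positions have to be known again after each adjustment.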

More detailed information about the CS-PAD detector is available here: http://www-public.slac.stanford.edu/sciDoc/docMeta.aspx?slacPubNumber=SLAC-PUB-15284


pyana

The LCLS data acquisition systems stream the terabytes of diffraction data collected from the CS-PAD detector to container files in XTC format. XTC is a linear, sequential-access file format, where individual images can be recorded rapidly by the file system as they are collected. The programmatic interfaces to interact with these files at LCLS are psana and pyana. psana is a C++ interface. cctbx.xfel uses the Python-based pyana.
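To show what sequential access implies in practice, here is a small Python sketch of a length-prefixed container. The record layout is hypothetical and is not the real XTC layout. It only illustrates that records can be appended quickly as they arrive, and that reaching record N means reading past the records before it.

```python
import io
import struct

# Hypothetical header: a 4-byte little-endian payload length (not the XTC layout).
HEADER = struct.Struct("<I")


def write_records(stream, payloads):
    """Append each payload to the stream, prefixed by its length."""
    for payload in payloads:
        stream.write(HEADER.pack(len(payload)))
        stream.write(payload)


def iter_records(stream):
    """Yield payloads in file order; the only way forward is to read each header."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise ValueError("truncated record header")
        (length,) = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        yield payload


def read_record(stream, index):
    """Return record `index` by scanning from the current stream position."""
    for i, payload in enumerate(iter_records(stream)):
        if i == index:
            return payload
    raise IndexError("record index past end of stream")
```

For example, writing three payloads to an `io.BytesIO()` buffer, rewinding with `seek(0)`, and calling `read_record(buffer, 2)` reads the first two records before returning the third.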

pyana is driven by configuration files to process frames individually, and is designed with computational parallelization in mind. As each image is independent, processing of each image can be done by separate computer cores. cctbx.xfel uses pyana and pyana's configuration files to read and process image files stored in XTC format. The user specifies how each image is to be processed in the configuration file, and then passes the configuration file and the path to the XTC streams of interest to cctbx.xfel, which calls pyana and submits the job to the queuing system.
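Because the images are independent, the work can be split among workers with no communication between them. The Python sketch below shows one possible split, a round-robin assignment of frame indices. It is an illustration under that assumption and does not describe how pyana itself distributes frames. The function names and the placeholder `process_frame` are hypothetical.

```python
def frames_for_worker(n_frames, n_workers, worker_rank):
    """Return the frame indices handled by one worker (round-robin split)."""
    if n_workers < 1 or not 0 <= worker_rank < n_workers:
        raise ValueError("invalid worker configuration")
    return list(range(worker_rank, n_frames, n_workers))


def process_frame(frame_index):
    """Placeholder for per-image work; it needs no data from any other frame."""
    return frame_index * frame_index


def run_worker(n_frames, n_workers, worker_rank):
    """Process this worker's share of the frames and return results by index."""
    return {i: process_frame(i)
            for i in frames_for_worker(n_frames, n_workers, worker_rank)}
```

Each frame is assigned to exactly one worker, so the per-worker results can be merged afterwards without conflicts.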

For example, if the user wanted to filter an XTC stream for hits, index the hits, and then integrate the images that indexed successfully, the user would supply a configuration file that specifies the cctbx.xfel modules performing these tasks, provide options to these modules, and submit the job. Specific details are in the tutorials.
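A configuration of this kind might look roughly like the text embedded in the Python sketch below, which reads it with the standard-library `configparser`. The INI-style layout with a `[pyana]` section and a `modules` entry is an assumption here, and every module name and option (`example.hitfind`, `min_spots`, and so on) is hypothetical. The tutorials give the actual names.

```python
import configparser

# Hypothetical configuration text; module and option names are illustrative only.
EXAMPLE_CFG = """
[pyana]
modules = example.hitfind example.index example.integrate

[example.hitfind]
min_spots = 16

[example.index]
target = my_settings.phil

[example.integrate]
resolution_cutoff = 2.0
"""


def module_chain(cfg_text):
    """Return [(module_name, options), ...] in the order the modules are listed."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    names = parser.get("pyana", "modules").split()
    chain = []
    for name in names:
        # A module without its own section simply has no options.
        options = dict(parser.items(name)) if parser.has_section(name) else {}
        chain.append((name, options))
    return chain
```

Calling `module_chain(EXAMPLE_CFG)` yields the three stages in order, each paired with its options as strings.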

During processing, hits are extracted from the XTC stream and written to separate files, one for each individual image. At the moment these files are in pickle format, a serialization format native to the Python programming language. However, by the end of 2013, cctbx.xfel will be exclusively using CBF and HDF5 formats to output results.
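Writing and reading one pickle file per image can be sketched with Python's standard `pickle` module. The dictionary keys used in the usage note are hypothetical and do not reflect the actual contents of the files cctbx.xfel writes.

```python
import pickle


def save_hit(path, image):
    """Write one image record to its own pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(image, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_hit(path):
    """Read an image record back; only unpickle files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For instance, `save_hit("hit_0001.pickle", {"timestamp": "example", "data": [[0, 1], [2, 3]]})` followed by `load_hit("hit_0001.pickle")` returns an equal dictionary. Pickle files are convenient from Python but are not readable by most other software, which is one reason to prefer formats such as CBF and HDF5.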

More information about pyana: http://www-public.slac.stanford.edu/sciDoc/docMeta.aspx?slacPubNumber=SLAC-PUB-15284

LCLS queuing system

SLAC maintains several computing clusters available to its users for processing data. While detailed knowledge of their workings isn't required for cctbx.xfel operation, an overview of these systems is provided here: https://confluence.slac.stanford.edu/display/PCDS/Computing. Specific commands for submitting cctbx.xfel jobs to the cluster are given in the tutorials.

General instructions for submitting batch jobs can be found here: https://confluence.slac.stanford.edu/display/PCDS/Submitting+Batch+Jobs

Phil

While pyana is configured using its own configuration files, cctbx.xfel itself is driven using Python-based hierarchical interchange language (phil) files, the same format that drives Labelit and PHENIX (though PHENIX calls them .eff files). The format is intuitive and allows easy specification of per-processing run parameters.

A user's pyana configuration file will have an entry called xfel_target. This entry will provide a phil filename that contains cctbx.xfel configuration settings. These settings will include things such as thresholds for determining hits (number of spots on an image, spot brightness cutoff, etc.), unit cell targets for indexing, resolution cutoffs, and so forth.
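The way two such thresholds combine into a hit decision can be sketched in Python as follows. This is a simplified illustration, not the cctbx.xfel hit finder. The parameter names `min_spots` and `brightness_cutoff` are hypothetical stand-ins for values that would come from the phil file.

```python
def count_bright_spots(spot_intensities, brightness_cutoff):
    """Count the spots whose intensity reaches the brightness cutoff."""
    return sum(1 for value in spot_intensities if value >= brightness_cutoff)


def is_hit(spot_intensities, min_spots, brightness_cutoff):
    """Treat an image as a hit when it has enough sufficiently bright spots."""
    return count_bright_spots(spot_intensities, brightness_cutoff) >= min_spots
```

With spot intensities `[10, 200, 300, 5]` and a cutoff of 100, two spots qualify, so the image counts as a hit when `min_spots` is 2 but not when it is 3.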

Technical information regarding phil: http://cctbx.sourceforge.net/libtbx_phil.html

cctbx

The Computational Crystallography Toolbox (cctbx) is a foundational set of Python and C++ modules that allow abstraction of the crystallographic experiment. Under continual development, the toolbox provides interfaces for working with crystal models, reflection data, and much more.

Introduction: [6] Homepage: [7]