Practical Approaches to Data Processing Using XDS Kay Diederichs


Practical Approaches to Data Processing Using XDS Kay Diederichs Protein Crystallography / Molecular Bioinformatics

Overview
- XDS is a data reduction program for X-ray data collected by the oscillation method on area detectors
- Author: Wolfgang Kabsch (MPI Heidelberg)
- Information flow within XDS
- Usage; Optimization; Interfaces
- XDSwiki; references
- Demonstration: processing of a dataset (e.g. Wladek Minor's, corresponding to PDB 1WQ6)
- Summary and questions
Throughout this talk: program, file

The XDS program suite
Binary distribution (by W. Kabsch) for Linux & Mac from http://www.mpimf-heidelberg.mpg.de/~kabsch/xds/:
- XDS: the main program (indexing, integrating, scaling)
- XSCALE: scales several XDS intensity data sets together; statistics
- XDSCONV: converts to CCP4, CNS, SHELX, ... formats
Source code available from sourceforge.net:
- XDS-Viewer: inspect control images written by XDS, or (single) data frames (alternatively, the latest adxv may be used)
My own programs (both in XDSwiki):
- XDSSTAT
- generate_adx

Algorithms
Unique features:
- 3D profiles of reflections, transformed into their own coordinate systems, which makes them highly similar
- Pixel-labelling method
- Smooth scaling
- Robust estimation of parameters throughout
- Radiation-damage correction (XSCALE)

How to use XDS?
- Prepare a single input file, XDS.INP, with parameters describing the data reduction
- XDS.INP is often written by beamline software
- Parameters and their keywords have the form KEYWORD= value, e.g. DETECTOR_DISTANCE= 120.
- There are about 30 relevant parameters, but only about 15 are required (and change between projects)
- All parameters have reasonable defaults where possible
- Quick start: generate XDS.INP from XDSwiki

Example for MarCCD 225 @ SLS PX-III
    JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
    ORGX= 1546 ORGY= 1552 !Detector origin (pixels); e.g. NX/2 NY/2
    DETECTOR_DISTANCE= 180 !(mm)
    OSCILLATION_RANGE= 0.50 !degrees (>0)
    X-RAY_WAVELENGTH= 0.980243 !Angstroem
    NAME_TEMPLATE_OF_DATA_FRAMES= frms/wga2-27_1_?.img
    DATA_RANGE= 1 360 !Numbers of first and last data image collected
    BACKGROUND_RANGE= 1 10 !Numbers of first and last data image for background
    SPACE_GROUP_NUMBER= 19 !0 for unknown crystals; cell constants are ignored.
    UNIT_CELL_CONSTANTS= 44.4 86.4 104.5 90 90 90
    REFINE(IDXREF)= BEAM AXIS ORIENTATION CELL DISTANCE
    REFINE(INTEGRATE)= DISTANCE BEAM ORIENTATION CELL ! AXIS
    ROTATION_AXIS= 1.0 0.0 0.0
    INCIDENT_BEAM_DIRECTION= 0.0 0.0 1.0
    FRACTION_OF_POLARIZATION= 0.99 ! SLS X06SA
    POLARIZATION_PLANE_NORMAL= 0.0 1.0 0.0
    DETECTOR= CCDCHESS MINIMUM_VALID_PIXEL_VALUE= 1 OVERLOAD= 65000
    DIRECTION_OF_DETECTOR_X-AXIS= 1.0 0.0 0.0
    DIRECTION_OF_DETECTOR_Y-AXIS= 0.0 1.0 0.0
    VALUE_RANGE_FOR_TRUSTED_DETECTOR_PIXELS= 7000 30000 !Used by DEFPIX
    !for excluding shaded parts of the detector.
    INCLUDE_RESOLUTION_RANGE= 50.0 1.3 !Angstroem; used by DEFPIX,INTEGRATE,CORRECT
Keyword/parameter pairs set in bold on the slide are required.
Complete documentation at http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html
Templates for many detectors at http://xds.mpimf-heidelberg.mpg.de/html_doc/detectors.html
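The keyword/parameter layout of the example above can also be read by a script, for instance to check which values a beamline-written XDS.INP contains. What follows is only a minimal Python sketch and not part of XDS: it assumes the KEYWORD= value syntax, that "!" starts a comment, and that several keyword/value pairs may share one line. The function name parse_xds_inp is hypothetical.

```python
import re

# A keyword is taken to be a run of non-blank characters directly followed by "=".
KEYWORD = re.compile(r"(\S+)=")

def parse_xds_inp(text):
    """Return a list of (keyword, value) pairs in file order.

    Assumptions of this sketch (not taken from the XDS documentation):
    everything after "!" is a comment, and a value extends up to the
    next KEYWORD= on the same line or to the end of the line.
    """
    pairs = []
    for raw in text.splitlines():
        line = raw.split("!", 1)[0]          # drop the comment part
        matches = list(KEYWORD.finditer(line))
        for i, m in enumerate(matches):
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            pairs.append((m.group(1), line[m.end():end].strip()))
    return pairs

example = """\
JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
ORGX= 1546 ORGY= 1552 !Detector origin (pixels)
DETECTOR_DISTANCE= 180 !(mm)
REFINE(INTEGRATE)= DISTANCE BEAM ORIENTATION CELL ! AXIS
"""
params = parse_xds_inp(example)
```

A list of pairs is returned rather than a dictionary so that a keyword appearing more than once in the file is not silently lost.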

Using XDS - principles I
Simple, if the basic idea is understood:
- There is one JOB line in XDS.INP which does not specify a parameter, but instead a list of tasks:
    JOB= XYCORR INIT COLSPOT IDXREF DEFPIX INTEGRATE CORRECT
- data reduction is divided into tasks/jobs in a modular way
- information storage/exchange/flow between tasks is by data files, which may be inspected/analyzed
- each task needs the results from the previous tasks
- fine-tuning of a task does not require previous tasks to be repeated
- each task writes its output file TASK.LP
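Because a fine-tuned task does not require the earlier tasks to be repeated, the JOB line can be shortened to start at the task being tuned. The Python sketch below illustrates this under stated assumptions: the output files of the earlier tasks are still present in the working directory, and all tasks after the tuned one are run again. The helper names job_line_from and replace_job_line are hypothetical; XPLAN is left out because it is not required.

```python
# Task order as given on the JOB line of the slide.
TASKS = ["XYCORR", "INIT", "COLSPOT", "IDXREF", "DEFPIX", "INTEGRATE", "CORRECT"]

def job_line_from(first_task):
    """Build a JOB line that starts at first_task and includes all later tasks."""
    start = TASKS.index(first_task)  # raises ValueError for an unknown task name
    return "JOB= " + " ".join(TASKS[start:])

def replace_job_line(xds_inp_text, first_task):
    """Return XDS.INP text whose active JOB line starts at first_task.

    Lines beginning with "!" are comments and are left untouched; if no
    active JOB line exists, one is appended.
    """
    new_line = job_line_from(first_task)
    out, replaced = [], False
    for line in xds_inp_text.splitlines():
        if line.lstrip().startswith("JOB="):
            if not replaced:          # keep only one active JOB line
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

For example, job_line_from("DEFPIX") gives the line needed to repeat only DEFPIX, INTEGRATE and CORRECT.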

Using XDS - principles II
- XYCORR: writes positional correction files (X-CORRECTIONS.cbf, Y-CORRECTIONS.cbf)
- INIT: finds background pixels (defaults usually OK)
- COLSPOT: finds reflection positions
- IDXREF: "indexes" reflections; user may supply/choose spacegroup
- XPLAN [not required]: strategy for data collection
- DEFPIX: finds beamstop shadow (defaults mostly OK)
- INTEGRATE: evaluates intensities on all frames, writes INTEGRATE.HKL and FRAME.cbf
- CORRECT: scales, rejects outliers, statistics, writes XDS_ASCII.HKL (and other files)

Information flow
[Diagram: flow of information between the XDS tasks. Input parameters from XDS.INP: NAME_TEMPLATE_OF_DATA_FRAMES, OSCILLATION_RANGE, ORGX, ORGY, DETECTOR_DISTANCE, X-RAY_WAVELENGTH, SPACE_GROUP_NUMBER, DATA_RANGE. Tasks: XYCORR, INIT, COLSPOT, IDXREF, DEFPIX, INTEGRATE, CORRECT. Files shown: X-CORRECTIONS.cbf, Y-CORRECTIONS.cbf, BKGINIT.cbf, SPOT.XDS, XPARM.XDS, BKGPIX.cbf, ABS.cbf, INTEGRATE.HKL, FRAME.cbf, GXPARM.XDS, GX-CORRECTIONS.cbf, GY-CORRECTIONS.cbf, XDS_ASCII.HKL. Downstream programs: pointless, XDSSTAT (with loggraph), XSCALE, XDSCONV.]

XDS output file: INTEGRATE.HKL
    !OUTPUT_FILE=INTEGRATE.HKL DATE= 3-Oct-2006
    !Generated by INTEGRATE (XDS VERSION August 18, 2006)
    !PROFILE_FITTING= TRUE
    !SPACE_GROUP_NUMBER= 92
    !UNIT_CELL_CONSTANTS= 57.69 57.69 150.03 90.000 90.000 90.000
    !NAME_TEMPLATE_OF_DATA_FRAMES=./series_2_?.img
    !DETECTOR=ADSC
    !NX= 3072 NY= 3072 QX= 0.102600 QY= 0.102600
    !STARTING_FRAME= 1
    !STARTING_ANGLE= 30.000
    !OSCILLATION_RANGE= 0.500000
    !ROTATION_AXIS= 0.999995 0.002515 -0.001722
    !X-RAY_WAVELENGTH= 0.939010
    !INCIDENT_BEAM_DIRECTION= 0.001723 -0.002233 1.064948
    !DIRECTION_OF_DETECTOR_X-AXIS= 1.000000 0.000000 0.000000
    !DIRECTION_OF_DETECTOR_Y-AXIS= 0.000000 1.000000 0.000000
    !ORGX= 1541.53 ORGY= 1535.28
    !DETECTOR_DISTANCE= 189.221
    !UNIT_CELL_A-AXIS= -11.482 53.781 -17.431
    !UNIT_CELL_B-AXIS= -17.974 -20.337 -50.906
    !UNIT_CELL_C-AXIS= -139.398 -12.226 54.103
    !BEAM_DIVERGENCE_E.S.D.= 0.037
    !REFLECTING_RANGE_E.S.D.= 0.113
    !NUMBER_OF_ITEMS_IN_EACH_DATA_RECORD=20
    !H,K,L,IOBS,SIGMA,XCAL,YCAL,ZCAL,RLP,PEAK,CORR,MAXC,
    ! XOBS,YOBS,ZOBS,ALF0,BET0,ALF1,BET1,PSI
    !Items are separated by a blank and can be read in free-format
    !END_OF_HEADER
    -45 -9 -60 -3.755E+01 4.144E+01 3066.2 3053.3 273.5 0.75268 100 -10 46 0.0 0.0 0.0 -49.52 0.16 44.87 49.40 -29.89
    -45 -9 -59 8.133E+00 4.372E+01 3044.3 3056.1 274.5 0.75525 100 10 46 0.0 0.0 0.0 -49.52 0.16 45.34 49.22 -29.95
    -45 -8 -60 6.502E+01 4.327E+01 3046.6 3054.5 271.3 0.75438 100 14 47 3051.0 3057.7 272.0 -49.52 0.16 45.26 49.23 -30.66
    ...

XDS output file: XDS_ASCII.HKL
    !FORMAT=XDS_ASCII MERGE=FALSE FRIEDEL'S_LAW=TRUE
    !OUTPUT_FILE=XDS_ASCII.HKL DATE= 3-Oct-2006
    !Generated by CORRECT (XDS VERSION August 18, 2006)
    !PROFILE_FITTING= TRUE
    !SPACE_GROUP_NUMBER= 92
    !UNIT_CELL_CONSTANTS= 57.71 57.71 150.08 90.000 90.000 90.000
    !NAME_TEMPLATE_OF_DATA_FRAMES=./series_2_?.img
    !DATA_RANGE= 1 399
    !X-RAY_WAVELENGTH= 0.939010
    !INCIDENT_BEAM_DIRECTION= 0.001872 -0.002230 1.064947
    !FRACTION_OF_POLARIZATION= 0.980
    !POLARIZATION_PLANE_NORMAL= 0.000000 1.000000 0.000000
    !ROTATION_AXIS= 0.999995 0.002477 -0.001917
    !OSCILLATION_RANGE= 0.500000
    !STARTING_ANGLE= 30.000
    !STARTING_FRAME= 1
    !DETECTOR=ADSC
    !DIRECTION_OF_DETECTOR_X-AXIS= 1.00000 0.00000 0.00000
    !DIRECTION_OF_DETECTOR_Y-AXIS= 0.00000 1.00000 0.00000
    !DETECTOR_DISTANCE= 189.286
    !ORGX= 1541.25 ORGY= 1535.30
    !NX= 3072 NY= 3072 QX= 0.102600 QY= 0.102600
    !NUMBER_OF_ITEMS_IN_EACH_DATA_RECORD=12
    !ITEM_H=1
    !ITEM_K=2
    !ITEM_L=3
    !ITEM_IOBS=4
    !ITEM_SIGMA(IOBS)=5
    !ITEM_XD=6
    !ITEM_YD=7
    !ITEM_ZD=8
    !ITEM_RLP=9
    !ITEM_PEAK=10
    !ITEM_CORR=11
    !ITEM_PSI=12
    !END_OF_HEADER
    0 0 4 4.287E-01 2.814E-01 1501.6 1514.4 99.4 0.00920 100 27 75.39
    0 0 -4 2.243E-01 2.386E-01 1587.4 1548.6 91.6 0.00920 100 30 -79.02
    0 0 5 5.976E-03 3.443E-01 1490.9 1510.2 100.4 0.01150 100 22 74.94
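Since the header names every column of a data record, a reflection file of this kind can be read without hard-coding column positions. The following Python sketch is an illustration only: it assumes header lines of the form !ITEM_H=1, that every line starting with "!" is header or trailer, and that records are blank-separated numbers. The function name read_xds_ascii is hypothetical.

```python
import re

# Header lines such as "!ITEM_SIGMA(IOBS)=5" give a column name and its 1-based index.
ITEM = re.compile(r"!ITEM_(\S+)=\s*(\d+)")

def read_xds_ascii(text):
    """Return (columns, records) for the text of an XDS_ASCII.HKL-style file.

    columns: column names ordered by their ITEM index.
    records: one dict per reflection, mapping column name to a float.
    """
    items = {}
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("!"):
            m = ITEM.match(line)
            if m:
                items[int(m.group(2))] = m.group(1)
            continue
        rows.append([float(v) for v in line.split()])
    columns = [items[i] for i in sorted(items)]
    records = [dict(zip(columns, row)) for row in rows]
    return columns, records

sample = """\
!ITEM_H=1
!ITEM_K=2
!ITEM_L=3
!ITEM_IOBS=4
!ITEM_SIGMA(IOBS)=5
!END_OF_HEADER
0 0 4 4.287E-01 2.814E-01
0 0 -4 2.243E-01 2.386E-01
0 0 5 5.976E-03 3.443E-01
"""
columns, records = read_xds_ascii(sample)
# Signal-to-noise of each observation, computed from the named columns.
i_over_sigma = [r["IOBS"] / r["SIGMA(IOBS)"] for r in records]
```

The sample uses only the first five items of the records shown above to keep it short; the same function handles all twelve.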

XDS: feedback of information from later steps to previous steps (postrefinement)
To optimize data quality, you may try to
- rename GXPARM.XDS (written by CORRECT) to XPARM.XDS
- copy 2 lines of INTEGRATE output from INTEGRATE.LP to XDS.INP:
    BEAM_DIVERGENCE= 0.560 BEAM_DIVERGENCE_E.S.D.= 0.056
    REFLECTING_RANGE= 1.741 REFLECTING_RANGE_E.S.D.= 0.249
- run the DEFPIX/INTEGRATE/CORRECT steps again
This improves statistics quite a bit if the geometry was not accurately known on the first pass.
More in XDSwiki (article "Optimization")
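The recycling steps above could be scripted roughly as follows. This is a hedged Python sketch, not a tool from the XDS distribution: it assumes that the two suggested lines in INTEGRATE.LP start with BEAM_DIVERGENCE= and REFLECTING_RANGE=, that the last occurrence of each is the one to use, and that XDS.INP does not already set these keywords. It copies GXPARM.XDS instead of renaming it so that the original stays available. The names suggested_lines and recycle are hypothetical, and the JOB line still has to be changed to DEFPIX INTEGRATE CORRECT before XDS is run again.

```python
import shutil
from pathlib import Path

WANTED = ("BEAM_DIVERGENCE=", "REFLECTING_RANGE=")

def suggested_lines(integrate_lp_text):
    """Return the last BEAM_DIVERGENCE= and REFLECTING_RANGE= lines in the text."""
    found = {}
    for line in integrate_lp_text.splitlines():
        stripped = line.strip()
        for key in WANTED:
            if stripped.startswith(key):
                found[key] = stripped  # a later occurrence replaces an earlier one
    return [found[key] for key in WANTED if key in found]

def recycle(workdir):
    """Prepare a second pass in workdir (sketch; see the assumptions above)."""
    workdir = Path(workdir)
    # Geometry refined by CORRECT becomes the starting geometry.
    shutil.copyfile(workdir / "GXPARM.XDS", workdir / "XPARM.XDS")
    lines = suggested_lines((workdir / "INTEGRATE.LP").read_text())
    # Append the two lines to XDS.INP.
    with open(workdir / "XDS.INP", "a") as inp:
        inp.write("\n".join(lines) + "\n")
```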

Visualizing distortions and scaling problems
- XDS writes .cbf files for control purposes
- XDS-Viewer (or adxv) can display these files
- If not corrected: systematic errors, many rejections, reduced data quality, bad anomalous signal

X/Y distortions
- GX-CORRECTIONS.cbf (from the CORRECT task) holds 10*(xobs-xcal) as a function of position
- Similarly for y: GY-CORRECTIONS.cbf

Further information from XDSSTAT
- writes XDSSTAT.LP (visualize with CCP4 loggraph)
- scales.pck shows the scale factor in percent as a function of position (after correction in XDS)
- misfits.pck shows outliers mapped on the detector
- rf.pck shows the R-factor mapped on the detector
- anom.pck shows the anomalous difference mapped on the detector
These files may be displayed with adxv, XDS-Viewer, or VIEW (distributed with old versions of XDS)

XDSSTAT.LP
    Frame  #refs  #misfits  Iobs  sigma  Iobs/sigma  Peak   Corr   Rmeas   #rmeas  #unique
    1      11434  96        137.  21.0   6.53        97.97  42.97  0.1419  11429   5
    2      8727   107       125.  19.9   6.27        99.86  41.05  0.1434  8725    2
    3      8826   58        131.  20.6   6.36        99.86  41.05  0.1353  8824    2
    4      8636   116       127.  20.1   6.31        99.89  40.57  0.1361  8633    3
    5      8776   59        131.  20.8   6.30        99.06  40.06  0.1287  8773    3
    6      8713   78        132.  21.1   6.24        99.61  38.41  0.1426  8710    3
    ...
R_d factor as a function of frame number difference
    framediff  n-all  Rd-all  n-notfriedel  Rd-notfriedel  n-friedel  Rd-friedel
    0          26160  0.1720  10856         0.1698         15304      0.1736      DIFFERENCE
    1          51943  0.1738  21047         0.1695         30896      0.1768      DIFFERENCE
    2          50238  0.1626  20888         0.1648         29350      0.1612      DIFFERENCE
    3          47429  0.1645  20297         0.1639         27132      0.1649      DIFFERENCE
    4          46395  0.1679  20095         0.1695         26300      0.1666      DIFFERENCE
    5          44861  0.1649  19505         0.1665         25356      0.1637      DIFFERENCE
    6          43656  0.1633  19279         0.1658         24377      0.1615      DIFFERENCE
    ...

Interfaces
- GUIs: XDSi (P. Kursula; M. Krug)
- CCP4: pointless, (combat), xdsconv (type CCP4 or CCP4_I)
- CNS/phenix.refine/SHELX: xdsconv
- pipelines: xia2 (CCP4), autoPROC (Global Phasing), autoxds (SSRL), ...

XDS References
- Kabsch, W. (1988). Evaluation of single-crystal X-ray diffraction data from a position-sensitive detector. J. Appl. Cryst. 21, 916-924.
- Kabsch, W. (1993). Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. J. Appl. Cryst. 26, 795-800.
- Kabsch, W. (2001). Chapter 11.3. Integration, scaling, space-group assignment and post refinement.
- Kabsch, W. (2001). Chapter 25.2.9. XDS.
  Both in International Tables for Crystallography, Volume F. Crystallography of Biological Macromolecules, Rossmann, M.G. and Arnold, E. (2001), editors. Dordrecht: Kluwer Academic Publishers.
- Kabsch, W. (2010). XDS. Acta Cryst. D66, 125-132. (open access)
- Kabsch, W. (2010). Integration, scaling, space-group assignment and post-refinement. Acta Cryst. D66, 133-144. (open access)

XDSwiki
- started Feb 2008; 100 pages at http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Main_Page
- e.g. "Optimization"; explanations of task output; "Tips and Tricks"; "Quality Control" with datasets and results
- anybody can contribute (the same holds for CCP4wiki: 220 pages at http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Main_Page)

Know what tools are available!
- Robust processing even if mosaicity is high
- Fast: parallel processing possible (synchrotron!)
- Can run on an ASCII terminal, over a slow line (but needs an X11 terminal if difficulties arise)
- Transparent decompression of frames
- Not all parts of the frame header are read: distance, wavelength, beam position and Δ-phi must be supplied by the user (or the beamline software)
- No/little visualization (compared to MOSFLM, d*Trek and HKL)

Some typical questions ...
- "How to scale & merge different datasets from similar or same xtal(s), using XDS?"
- "What about twinning? Is it possible to integrate small molecule data as well?"
- "Does XDS correct for radiation damage (increased B factors) without scaling all to the first data set?"
- "Will an easier to use masking system be developed?"
More Qs and As in the FAQ article of XDSwiki

Own current work:
- Radiation damage and its computational correction: Diederichs, K., Junk, M. (2009). "Post-processing intensity measurements at favourable dose values". J. Appl. Cryst. 42, 48-57
- "Simulation of X-ray frames from macromolecular crystals using a ray-tracing approach". Diederichs, K. (2009). Acta Cryst. D65, 535-42
- "Quantifying instrument errors in macromolecular X-ray datasets" (2010), submitted

Examples of simulated frames
[Figure: simulated frames; panel labels: ori, cell, lambda, beam, comb]
"Crystal mosaicity" has two components: cell parameter disorder, and orientational disorder of mosaic blocks

Potential benefits from simulation of raw data
- Test (debug) the whole data reduction / structure solution pipeline with known data
- Limits of data quality, and influence of data quality on refinement results
- Evaluate alternative data collection strategies (e.g. fine-slicing) before the actual data collection
- Understand physical principles behind mosaicity
- Simulate certain kinds of systematic errors
- Teaching ...

Thank you!
