Better models by discarding data

Published on Jul 1, 2013 in Acta Crystallographica Section D: Biological Crystallography (3.227)
DOI: 10.1107/S0907444913001121
Kay Diederichs (University of Konstanz), estimated H-index: 53
P.A. Karplus (OSU: Oregon State University), estimated H-index: 20
In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC1/2, that can be used for this purpose were characterized and it was shown that CC1/2 has superior properties compared with ‘merging’ R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC1/2 and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled ‘paired-refinement’ tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC1/2 is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed.
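The abstract describes CC1/2 as a correlation coefficient computed from repeated measurements, and CC* as a quantity derived from it. As a rough illustration only, the sketch below estimates CC1/2 by randomly splitting the measurements of each unique reflection into two halves and correlating the half-set means, then applies the commonly quoted relation CC* = sqrt(2 CC1/2 / (1 + CC1/2)). That formula is not given in the abstract, and the simulated intensities, noise levels, and all function names here are assumptions made for the example; this is not the authors' code or the implementation used in any data-reduction package.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cc_half(observations, rng):
    """Estimate CC1/2: split the repeated measurements of every unique
    reflection into two random halves and correlate the half-set means."""
    half_a, half_b = [], []
    for measurements in observations:
        if len(measurements) < 2:
            continue  # a reflection needs at least one measurement per half
        shuffled = list(measurements)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)


def cc_star(cc12):
    """CC* = sqrt(2*CC1/2 / (1 + CC1/2)).
    Returning 0.0 for non-positive CC1/2 is a choice made for this sketch."""
    if cc12 <= 0.0:
        return 0.0
    return math.sqrt(2.0 * cc12 / (1.0 + cc12))


def simulate(n_reflections, multiplicity, noise_sigma, rng):
    """Hypothetical data: exponentially distributed true intensities,
    each measured `multiplicity` times with Gaussian noise."""
    observations = []
    for _ in range(n_reflections):
        true_intensity = rng.expovariate(1.0)
        observations.append(
            [true_intensity + rng.gauss(0.0, noise_sigma) for _ in range(multiplicity)]
        )
    return observations


rng = random.Random(0)
strong = simulate(2000, 4, 0.2, rng)  # low noise, e.g. a low-resolution shell
weak = simulate(2000, 4, 3.0, rng)    # high noise, e.g. a high-resolution shell

cc_strong = cc_half(strong, rng)
cc_weak = cc_half(weak, rng)
print(f"strong data: CC1/2 = {cc_strong:.3f}, CC* = {cc_star(cc_strong):.3f}")
print(f"weak data:   CC1/2 = {cc_weak:.3f}, CC* = {cc_star(cc_weak):.3f}")
```

In this toy setting the noisy data still give a CC1/2 clearly above zero, which is the kind of residual signal that a cutoff based on merging R values would discard.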
  • References (14)
  • Citations (150)
It is generally assumed that the quality of X-ray diffraction data can be improved by merging data sets from several crystals. However, this effect is only valid if the data sets used are from crystals that are structurally identical. It is found that frozen macromolecular crystals very often have relatively low structure identity (and are therefore not isomorphous); thus, to obtain a real gain from multi-crystal data sets one needs to make an appropriate selection of structurally similar crysta...
53 Citations
#1P. Andrew Karplus (OSU: Oregon State University)H-Index: 45
#2Kay Diederichs (University of Konstanz)H-Index: 53
1,071 Citations
#1Winn (Daresbury Laboratory)H-Index: 7
#2Charles Ballard (RAL: Rutherford Appleton Laboratory)H-Index: 5
Last. Keith S. Wilson (Ebor: University of York)H-Index: 70
6,512 Citations
This paper presents an overview of how to run the CCP4 programs for data reduction (SCALA, POINTLESS and CTRUNCATE) through the CCP4 graphical interface ccp4i and points out some issues that need to be considered, together with a few examples. It covers determination of the point-group symmetry of the diffraction data (the Laue group), which is required for the subsequent scaling step, examination of systematic absences, which in many cases will allow inference of the space group, putting multip...
676 Citations
#1Garib N. Murshudov (Ebor: University of York)H-Index: 45
#2P. Skubák (LEI: Leiden University)H-Index: 2
Last. A.A. Vagin (Ebor: University of York)H-Index: 9
This paper describes various components of the macromolecular crystallographic refinement program REFMAC5, which is distributed as part of the CCP4 suite. REFMAC5 utilizes different likelihood functions depending on the diffraction data employed (amplitudes or intensities), the presence of twinning and the availability of SAD/SIRAS experimental diffraction data. To ensure chemical and structural integrity of the refined model, REFMAC5 offers several classes of restraints and choices of model par...
4,210 Citations
#1Paul D. Adams (University of California, Berkeley)H-Index: 84
#2Pavel V. Afonine (LBNL: Lawrence Berkeley National Laboratory)H-Index: 33
Last. Peter H. Zwart (LBNL: Lawrence Berkeley National Laboratory)H-Index: 29
12k Citations
Important steps in the processing of rotation data are described that are common to most software packages. These programs differ in the details and in the methods implemented to carry out the tasks. Here, the working principles underlying the data-reduction package XDS are explained, including the new features of automatic determination of spot size and reflecting range, recognition and assignment of crystal symmetry and a highly efficient algorithm for the determination of correction/scaling ...
1,079 Citations
#1Chad R. SimmonsH-Index: 14
Last. P.A. KarplusH-Index: 7
The common reactions of dioxygen, superoxide, and hydroperoxides with thiolates are thought to proceed via persulfenate intermediates, yet these have never been visualized. Here we report a 1.4 Å resolution crystal structure of the Fe2+-dependent enzyme cysteine dioxygenase (CDO) containing this putative intermediate trapped in its active site pocket. The complex raises the possibility that, distinct from known dioxygenases and proposed CDO mechanisms, the Fe-proximal oxygen atom may be involved...
59 Citations
An account is given of the development of the SHELX system of computer programs from SHELX-76 to the present day. In addition to identifying useful innovations that have come into general use through their implementation in SHELX, a critical analysis is presented of the less-successful features, missed opportunities and desirable improvements for future releases of the software. An attempt is made to understand how a program originally designed for photographic intensity data, punched cards and ...
70.8k Citations
#1Chad R. SimmonsH-Index: 14
#2Qun LiuH-Index: 29
Last. Martha H. StipanukH-Index: 42
Cysteine dioxygenase is a mononuclear iron-dependent enzyme responsible for the oxidation of cysteine with molecular oxygen to form cysteine sulfinate. This reaction commits cysteine to either catabolism to sulfate and pyruvate or the taurine biosynthetic pathway. Cysteine dioxygenase is a member of the cupin superfamily of proteins. The crystal structure of recombinant rat cysteine dioxygenase has been determined to 1.5 Å resolution, and these results confirm the canonical cupin β-sand...
101 Citations
Cited By (150)
#1Dipankar MannaH-Index: 1
#2Gabriele Cordara (University of Oslo)H-Index: 3
Last. Ute KrengelH-Index: 23
The Marasmius oreades agglutinin (MOA) is the holotype of an emerging family of fungal chimerolectins and an active Ca2+/Mn2+-dependent protease, which exhibits a unique papain-like fold with special active site features. Here we investigated the functional significance of the structural elements differentiating MOA from other papain-like cysteine proteases. X-ray crystal structures of MOA co-crystallized with two synthetic substrates reveal cleaved peptides bound to the catalytic site,...
#1Haehee Lee (SNU: Seoul National University)
#2Sangkee Rhee (SNU: Seoul National University)H-Index: 13
In cyanobacteria, metabolic pathways that use the nitrogen-rich amino acid arginine play a pivotal role in nitrogen storage and mobilization. The N-terminal domains of two recently identified bacterial enzymes, ArgZ from Synechocystis and AgrE from Anabaena, have been found to contain an arginine dihydrolase. This enzyme provides catabolic activity that converts arginine to ornithine, resulting in concomitant release of CO2 and ammonia. In Synechocystis, the ArgZ-mediated ornithine-ammonia cycle...
#1Hsin-Hui Wu (RFUMS: Rosalind Franklin University of Medicine and Science)
#2Jindrich Symersky (RFUMS: Rosalind Franklin University of Medicine and Science)H-Index: 6
Last. Min Lu (RFUMS: Rosalind Franklin University of Medicine and Science)H-Index: 8
The rapid increase of multidrug resistance poses urgent threats to human health. Multidrug transporters prompt multidrug resistance by exporting different therapeutics across cell membranes, often by utilizing the H+ electrochemical gradient. MdfA from Escherichia coli is a prototypical H+ -dependent multidrug transporter belonging to the Major Facilitator Superfamily. Prior studies revealed unusual flexibility in the coupling between multidrug binding and deprotonation in MdfA, but the mechanis...
#1Sanghoon Kim (SNU: Seoul National University)H-Index: 24
#2Raees Khan (Dong-a University)H-Index: 4
Last. Sangkee Rhee (SNU: Seoul National University)H-Index: 13
The synthetic biocide triclosan targets enoyl-acyl carrier protein reductase(s) (ENR) in bacterial type II fatty acid biosynthesis. Screening and sequence analyses of the triclosan resistome from the soil metagenome identified a variety of triclosan-resistance ENRs. Interestingly, the mode of triclosan resistance by one hypothetical protein was elusive, mainly due to a lack of sequence similarity with other proteins that mediate triclosan resistance. Here, we carried out a structure-based functi...
#1M. VollmarH-Index: 8
#2James M. ParkhurstH-Index: 7
Last. Gwyndaf EvansH-Index: 26
This study describes a method to estimate the likelihood of success in determining a macromolecular structure by X-ray crystallography and experimental single-wavelength anomalous dispersion (SAD) or multiple-wavelength anomalous dispersion (MAD) phasing based on initial data-processing statistics and sample crystal properties. Such a predictive tool can rapidly assess the usefulness of data and guide the collection of an optimal data set. The increase in data rates from modern macromolecular cr...
#1Lianne Pope (UCSF: University of California, San Francisco)H-Index: 2
Last. Daniel L. MinorH-Index: 37
The trinuclear ruthenium amine ruthenium red (RuR) inhibits diverse ion channels, including K2P potassium channels, TRPs, the calcium uniporter, CALHMs, ryanodine receptors, and Piezos. Despite this extraordinary array, there is limited information for how RuR engages targets. Here, using X-ray crystallographic and electrophysiological studies of an RuR-sensitive K2P, K2P2.1 (TREK-1) I110D, we show that RuR acts by binding an acidic residue pair comprising the “Keystone inhibitor site” u...
#1Lianne Pope (University of California, Berkeley)H-Index: 2
#2Marco Lolicato (University of California, Berkeley)
Last. Daniel L. MinorH-Index: 37
The trinuclear ruthenium amine Ruthenium Red (RuR) inhibits diverse ion channels including K2P potassium channels, TRPs, the mitochondrial calcium uniporter, CALHMs, ryanodine receptors, and Piezos. Despite this extraordinary array, there is very limited information for how RuR engages its targets. Here, using X-ray crystallographic and electrophysiological studies of an RuR-sensitive K2P, K2P2.1 (TREK-1) I110D, we show that RuR acts by binding an acidic residue pair comprising the “Keystone inh...
#1Natasha Stander (ASU: Arizona State University)H-Index: 1
#2Petra Fromme (ASU: Arizona State University)H-Index: 54
Last. Nadia A. Zatsepin (ASU: Arizona State University)H-Index: 24
DatView is a new graphical user interface (GUI) for plotting parameters to explore correlations, identify outliers and export subsets of data. It was designed to simplify and expedite analysis of very large unmerged serial femtosecond crystallography (SFX) data sets composed of indexing results from hundreds of thousands of microcrystal diffraction patterns. However, DatView works with any tabulated data, offering its functionality to many applications outside serial crystallography. In DatView'...
#1Madhumati Sevvana (Purdue University)H-Index: 4
#2Michael RufH-Index: 19
Last. Regine Herbst-Irmer (GAU: University of Göttingen)H-Index: 39
In contrast to twinning by merohedry, the reciprocal lattices of the different domains of non-merohedral twins do not overlap exactly. This leads to three kinds of reflections: reflections with no overlap, reflections with an exact overlap and reflections with a partial overlap of a reflection from a second domain. This complicates the unit-cell determination, indexing, data integration and scaling of X-ray diffraction data. However, with hindsight it is possible to detwin the data because there...
1 Citation
#1Ye Zhou (CAS: Chinese Academy of Sciences)H-Index: 2
#2Can Cao (CAS: Chinese Academy of Sciences)H-Index: 3
Last. Xuejun Cai Zhang (CAS: Chinese Academy of Sciences)H-Index: 9
Multiple subtypes of dopamine receptors within the GPCR superfamily regulate neurological processes through various downstream signaling pathways. A crucial question about the dopamine receptor family is what structural features determine the subtype-selectivity of potential drugs. Here, we report the 3.5-angstrom crystal structure of mouse dopamine receptor D4 (DRD4) complexed with a subtype-selective antagonist, L745870. Our structure reveals a secondary binding pocket extended from the orthos...
1 Citation