Better models by discarding data

Published on Jul 1, 2013 in Acta Crystallographica Section D: Biological Crystallography
DOI: 10.1107/S0907444913001121
Kay Diederichs (University of Konstanz),
P.A. Karplus (OSU: Oregon State University)
Abstract
In macromolecular X-ray crystallography, typical data sets have substantial multiplicity. This can be used to calculate the consistency of repeated measurements and thereby assess data quality. Recently, the properties of a correlation coefficient, CC1/2, that can be used for this purpose were characterized and it was shown that CC1/2 has superior properties compared with ‘merging’ R values. A derived quantity, CC*, links data and model quality. Using experimental data sets, the behaviour of CC1/2 and the more conventional indicators were compared in two situations of practical importance: merging data sets from different crystals and selectively rejecting weak observations or (merged) unique reflections from a data set. In these situations controlled ‘paired-refinement’ tests show that even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality. These results show the folly of such data-filtering practices aimed at improving the merging R values. Interestingly, in all of these tests CC1/2 is the one data-quality indicator for which the behaviour accurately reflects which of the alternative data-handling strategies results in the best-quality refined model. Its properties in the presence of systematic error are documented and discussed.
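
The abstract names CC1/2 and CC* without giving formulas. As a rough illustration only, the sketch below assumes the usual construction: the repeated measurements of each unique reflection are split at random into two halves, each half is averaged, and CC1/2 is the Pearson correlation between the two sets of half-data-set means. CC* is then taken as sqrt(2 CC1/2 / (1 + CC1/2)), the relation given in the authors' earlier characterization of these quantities. The data layout (a dictionary from Miller index to a list of intensities), the function names, and the synthetic noise model are assumptions made for this example and are not taken from the paper or from any processing program.

```python
import math
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, rng):
    """CC1/2 from repeated measurements (hypothetical data layout).

    observations maps a unique reflection (h, k, l) to its list of
    measured intensities. Reflections measured only once are skipped
    because they cannot be split into two halves.
    """
    half_a, half_b = [], []
    for measurements in observations.values():
        if len(measurements) < 2:
            continue
        shuffled = list(measurements)
        rng.shuffle(shuffled)          # random assignment to the two halves
        mid = len(shuffled) // 2
        half_a.append(mean(shuffled[:mid]))
        half_b.append(mean(shuffled[mid:]))
    return pearson(half_a, half_b)


def cc_star(cc12):
    """CC* = sqrt(2*CC1/2 / (1 + CC1/2)); only defined here for CC1/2 >= 0."""
    if cc12 < 0:
        raise ValueError("CC* is not defined for negative CC1/2 in this sketch")
    return math.sqrt(2.0 * cc12 / (1.0 + cc12))


if __name__ == "__main__":
    # Synthetic example: made-up true intensities plus Gaussian noise.
    rng = random.Random(0)
    data = {}
    for h in range(200):
        true_intensity = rng.expovariate(1 / 100.0)
        data[(h, 0, 0)] = [true_intensity + rng.gauss(0, 50.0) for _ in range(6)]
    cc12 = cc_half(data, rng)
    print(f"CC1/2 = {cc12:.3f}")
    if cc12 >= 0:
        print(f"CC*   = {cc_star(cc12):.3f}")
```

Because the split into halves is random, repeated runs with different seeds give slightly different CC1/2 values. In practice the statistic is normally computed per resolution shell rather than over the whole data set, which this sketch does not attempt.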