Can we use Google Scholar to identify highly-cited documents?

Published on Feb 1, 2017 in Journal of Informetrics (3.88)
DOI: 10.1016/j.joi.2016.11.008
Alberto Martín-Martín (UGR: University of Granada), estimated H-index: 8
Enrique Orduña-Malea (Polytechnic University of Valencia), estimated H-index: 11
+ 1 Authors
Emilio Delgado López-Cózar (UGR: University of Granada), estimated H-index: 19
The main objective of this paper is to empirically test whether the identification of highly-cited documents through Google Scholar is feasible and reliable. To this end, we carried out a longitudinal analysis (1950–2013), running a generic query (filtered only by year of publication) to minimise the effects of academic search engine optimisation. This gave us a final sample of 64,000 documents (1000 per year). The strong correlation between a document’s citations and its position in the search results (r = −0.67) led us to conclude that Google Scholar is able to identify highly-cited papers effectively. This, combined with Google Scholar’s unique coverage (no restrictions on document type and source), makes the academic search engine an invaluable tool for bibliometric research on identifying the most influential scientific documents. We find evidence, however, that Google Scholar ranks documents whose language (or geographical web domain) matches the user’s interface language higher than would be expected from their citations alone. Nonetheless, this language effect and other factors related to Google Scholar’s operation, namely the correct identification of versions and of publication dates, have only an incidental impact. They do not compromise the ability of Google Scholar to identify highly-cited papers.
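The relationship between citation counts and result position reported above can be illustrated with a small rank-correlation calculation. The following is only a minimal sketch: the abstract does not state which coefficient was used, so a Spearman rank correlation is assumed here, and the `positions` and `citations` values are invented for illustration rather than taken from the study. The helper names (`average_ranks`, `pearson`, `spearman`) are likewise hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n (ascending); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example data (NOT from the paper): position in the result list
# for one year, and the citation count shown for each result.
positions = [1, 2, 3, 4, 5, 6, 7, 8]
citations = [9500, 7200, 7900, 4100, 4300, 2500, 2600, 900]

rho = spearman(positions, citations)
print(f"Spearman correlation between position and citations: {rho:.3f}")
```

A negative coefficient means that results nearer the top of the list (small position numbers) tend to have more citations, which is the direction of the r = −0.67 reported in the abstract.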