Generating automatically labeled data for author name disambiguation: an iterative clustering method

Jinseok Kim; Jinmo Kim; Jason Owen-Smith

doi:https://doi.org/10.1007/s11192-018-2968-3

Scientometrics (3.90)

Volume: 118, Issue: 1, Pages: 253 - 280

Published: Jan 1, 2019

Abstract

To train algorithms for supervised author name disambiguation, many studies have relied on hand-labeled truth data that are very laborious to generate. This paper shows that labeled data can be automatically generated using information features such as email address, coauthor names, and cited references that are available from publication records. For this purpose, high-precision rules for matching name instances on each feature are decided...
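The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt is truncated before the matching rules are stated, so the thresholds (`min_coauthors`, `min_references`), the record layout, and the merge-until-stable loop below are all assumptions made for illustration. The sketch treats each name instance as its own cluster, merges two clusters whenever a rule fires (shared email, enough shared coauthors, or enough shared cited references), pools their features, and repeats until nothing changes.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Cluster:
    """A group of name instances with their pooled features (hypothetical layout)."""
    ids: set
    emails: set = field(default_factory=set)
    coauthors: set = field(default_factory=set)
    references: set = field(default_factory=set)


def rules_match(a, b, min_coauthors=2, min_references=2):
    """Return True if any rule links the two clusters.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if a.emails & b.emails:
        return True
    if len(a.coauthors & b.coauthors) >= min_coauthors:
        return True
    return len(a.references & b.references) >= min_references


def merge(a, b):
    """Pool the members and features of two clusters."""
    return Cluster(a.ids | b.ids, a.emails | b.emails,
                   a.coauthors | b.coauthors, a.references | b.references)


def iterative_cluster(instances):
    """Merge clusters until no rule fires; return sorted lists of instance ids."""
    clusters = [
        Cluster({i["id"]}, set(i.get("emails", ())),
                set(i.get("coauthors", ())), set(i.get("references", ())))
        for i in instances
    ]
    changed = True
    while changed:
        changed = False
        for a, b in combinations(clusters, 2):
            if rules_match(a, b):
                # Replace the matched pair with their union, then rescan.
                clusters = [c for c in clusters if c is not a and c is not b]
                clusters.append(merge(a, b))
                changed = True
                break
    return sorted(sorted(c.ids) for c in clusters)


# Made-up name instances for one ambiguous author name.
instances = [
    {"id": "p1", "emails": ["jkim@example.edu"],
     "coauthors": ["Lee, S", "Park, H"], "references": ["r1", "r2"]},
    {"id": "p2", "emails": ["jkim@example.edu"],
     "coauthors": ["Choi, M"], "references": ["r7"]},
    {"id": "p3", "coauthors": ["Choi, M", "Park, H"], "references": ["r9"]},
    {"id": "p4", "emails": ["other@example.org"],
     "coauthors": ["Nguyen, T"], "references": ["r1"]},
]

print(iterative_cluster(instances))
```

In this toy input, `p3` matches neither `p1` nor `p2` on its own, but it does match their merged cluster because the pooled coauthor set now contains two of its coauthors. That is the reason for iterating rather than comparing instances pairwise once. The resulting cluster memberships could then serve as automatically assigned labels.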
