Semantic tagging of and semantic enhancements to systematics papers: ZooKeys working examples

Published on Jun 30, 2010 in ZooKeys 1.143
DOI: 10.3897/zookeys.50.538
Lyubomir Penev (estimated H-index: 24)
Donat Agosti (estimated H-index: 25)
+ 20 authors
Terry L. Erwin (estimated H-index: 39)
The concept of semantic tagging and its potential for semantic enhancements to taxonomic papers is outlined and illustrated by four exemplar papers published in the present issue of ZooKeys. The four papers were created in different ways: (i) written in Microsoft Word and submitted as a non-tagged manuscript (doi: 10.3897/zookeys.50.504); (ii) generated from Scratchpads and submitted as XML-tagged manuscripts (doi: 10.3897/zookeys.50.505 and doi: 10.3897/zookeys.50.506); (iii) generated from an author's database (doi: 10.3897/zookeys.50.485) and submitted as an XML-tagged manuscript. XML tagging and semantic enhancements were implemented during the editorial process of ZooKeys using the Pensoft Mark Up Tool (PMT), specially designed for this purpose. The XML schema used was TaxPub, an extension to the Document Type Definitions (DTD) of the US National Library of Medicine Journal Archiving and Interchange Tag Suite (NLM). The following innovative methods of tagging, layout, publishing and disseminating the content were tested and implemented within the ZooKeys editorial workflow: (1) highly automated, fine-grained XML tagging based on TaxPub; (2) final XML output of the paper validated against the NLM DTD for archiving in PubMed Central; (3) bibliographic metadata embedded in the PDF through XMP (Extensible Metadata Platform); (4) PDF uploaded after publication to the Biodiversity Heritage Library (BHL); (5) taxon treatments supplied through XML to Plazi; (6) semantically enhanced HTML version of the paper encompassing numerous internal and external links and linkouts, such as: (i) visualisation of main tag elements within the text (e.g., taxon names, taxon treatments, localities, etc.); (ii) internal cross-linking between paper sections, citations, references, tables, and figures; (iii) mapping of localities listed in the whole paper or within separate taxon treatments; (iv) taxon names autotagged, dynamically mapped and linked through the Pensoft Taxon Profile (PTP) to large international database services and indexers such as Global Biodiversity Information Facility (GBIF), National Center for Biotechnology Information (NCBI), Barcode of Life (BOLD), Encyclopedia of Life (EOL), ZooBank, Wikipedia, Wikispecies, Wikimedia, and others; (v) GenBank accession numbers autotagged and linked to NCBI; (vi) external links of taxon names to references in PubMed, Google Scholar, Biodiversity Heritage Library and other sources. With the launch of the working examples, ZooKeys becomes the first taxonomic journal to provide a complete XML-based editorial, publication and dissemination workflow implemented as a routine and cost-efficient practice. It is anticipated that the XML-based workflow will soon also be implemented in botany through PhytoKeys, a forthcoming partner journal of ZooKeys. The semantic markup and enhancements are expected to greatly extend and accelerate the way taxonomic information is published, disseminated and used.
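As an illustration of the autotagging step for GenBank accession numbers mentioned in the abstract, here is a minimal Python sketch. It is not the Pensoft Mark Up Tool's implementation, which the abstract does not describe. The function name `autotag_accessions`, the simplified accession pattern (one letter plus five digits, or two letters plus six digits), the NCBI URL prefix, and the choice of an NLM-style `ext-link` element with `ext-link-type="gen"` are all assumptions made for the example.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Simplified, assumed pattern for nucleotide accessions:
# one letter + five digits, or two letters + six digits.
ACCESSION_RE = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b")

# Assumed link target for illustration.
NCBI_URL = "https://www.ncbi.nlm.nih.gov/nuccore/"


def autotag_accessions(text: str) -> str:
    """Wrap each accession-like token in an ext-link element (hypothetical markup choice)."""
    parts = []
    last = 0
    for match in ACCESSION_RE.finditer(text):
        # Escape the plain text between matches so the result stays valid XML content.
        parts.append(escape(text[last:match.start()]))
        accession = match.group(0)
        href = quoteattr(NCBI_URL + accession)
        parts.append(
            f'<ext-link ext-link-type="gen" xlink:href={href}>{accession}</ext-link>'
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(autotag_accessions("COI sequence (GenBank GU123456) from the holotype."))
```

A production tagger would need a fuller accession grammar and a check against the database itself, since a pattern this loose can match unrelated tokens such as specimen codes.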
References (25)
#1 Roderic D. M. Page (Glas.: University of Glasgow), H-Index: 44
Although the Web has transformed science publishing, scientific papers themselves are still essentially "black boxes", with much of their content intended for human readers only. Typically, computer-readable metadata associated with an article is limited to bibliographic details. By expanding article metadata to include taxonomic names, identifiers for cited material (e.g., publications, sequences, specimens, and other data), and geographical coordinates, publishers could greatly increase the ...
10 Citations
#1 Vladimir Blagoderov (BAS: Bulgarian Academy of Sciences), H-Index: 16
#2 Irina Brake (BAS: Bulgarian Academy of Sciences), H-Index: 10
Last: Vincent S. Smith (BAS: Bulgarian Academy of Sciences), H-Index: 25
(10 authors in total)
We describe a method to publish nomenclatural acts described in taxonomic websites (Scratchpads) that are formally registered through publication in a printed journal (ZooKeys). This method is fully compliant with the zoological nomenclatural code. Our approach supports manuscript creation (via a Scratchpad), electronic act registration (via ZooBank), online and print publication (in the journal ZooKeys) and simultaneous dissemination (ZooKeys and Scratchpads) for nomenclatorial acts including n...
25 Citations
#1 Pavel Stoev, H-Index: 14
#2 Nesrine Akkari, H-Index: 9
Last: Lyubomir Penev, H-Index: 24
(8 authors in total)
The centipede genus Eupolybothrus Verhoeff, 1907 in North Africa is revised. A new cavernicolous species, Eupolybothrus kahfi Stoev & Akkari, sp. n., is described from a cave in Jebel Zaghouan, northeast Tunisia. Morphologically, it is most closely related to Eupolybothrus nudicornis (Gervais, 1837) from North Africa and Southwest Europe but can be readily distinguished by the long antennae and leg-pair 15, a conical dorso-median protuberance emerging from the posterior part of prefemur 15, and ...
27 Citations
Stomosis arachnophila Brake sp. n. (Diptera, Milichiidae) is described from Western Australia. The species is kleptoparasitic on araneid spiders. The paper is an example of a new approach in cybertaxonomy which includes generation of manuscripts within a Virtual Research Environment (Scratchpads), semantic enhancement, parallel release of the publication on paper and online accompanied with registration of new taxa with ZooBank.
6 Citations
#1 Charuwat Taekul, H-Index: 3
#2 Norman F. Johnson, H-Index: 20
Last: K Rajmohana, H-Index: 2
(5 authors in total)
The genus Platyscelio Kieffer (Hymenoptera: Platygastridae, Scelioninae) is a widespread group in the Old World, found from West Africa to northern Queensland, Australia. The species concepts are revised and a key to world species is presented. The genus is comprised of 6 species, including 2 known species which are redescribed: Platyscelio africanus Risbec (Benin, Cameroon, Central African Republic, Ghana, Guinea, Guinea-Bissau, Ivory Coast, Kenya, Mozambique, Nigeria, Sierra Leone, South Afric...
6 Citations
#1 Vladimir Blagoderov, H-Index: 16
#2 Heikki Hippa, H-Index: 11
Last: André Nel, H-Index: 90
(3 authors in total)
A new genus and a new species of Lygistorrhinidae, Parisognoriste eocenica is described from the Eocene Oise amber of the Paris Basin. Palaeognoriste sciariforme Meunier, 1904 and Palaeognoriste affine Meunier, 1912 are re-described. Lectotypes are designated for both species of Palaeognoriste. The phylogenetic positions of the new genus and Palaeognoriste Meunier are discussed. The paper is an example demonstrating a new approach in cybertaxonomy including automatic generation of manuscript ...
10 Citations
The flower fly genus Eosphaerophoria is revised. Eight new species are described (adornata sp. n. Mengual, bifida sp. n. Mengual, brunettii sp. n. Ghorpade, hermosa sp. n. Mengual, luteofasciata sp. n. Mengual, nigrovittata sp. n. Mengual, symmetrica sp. n. Mengual, and vietnamensis sp. n. Mengual), and an identification key is provided. Redescriptions, illustrations, synonymies, diagnoses and distributional data are given for all 11 known species of Eosphaerophoria. The new described spec...
5 Citations
#1 Vishwas Chavan (Global Biodiversity Information Facility), H-Index: 15
#2 Peter Ingwersen, H-Index: 7
Background: Currently primary scientific data, especially that dealing with biodiversity, is neither easily discoverable nor accessible. Amongst several impediments, one is a lack of professional recognition of scientific data publishing efforts. A possible solution is establishment of a 'Data Publishing Framework' which would encourage and recognise investments and efforts by institutions and individuals towards management, and publishing of primary scientific data potentially on a par with reco...
43 Citations
#1 Michael J. Sharkey, H-Index: 23
#2 Dicky Yu, H-Index: 1
Last: Lyubomir Penev, H-Index: 24
(5 authors in total)
32 Citations
#1 Lyubomir Penev, H-Index: 24
#2 Michael J. Sharkey, H-Index: 23
Last: Michael J Dallwitz, H-Index: 1
(10 authors in total)
The concepts of publication, citation and dissemination of interactive keys and other online keys are discussed and illustrated by a sample paper published in the present issue (doi: 10.3897/zookeys.21.271). The present model is based on previous experience with several existing examples of publishing online keys. However, this model also suggests ways to publish, cite, preserve, disseminate and reuse the original data files to the benefit of future workers, the authors, and society in gener...
28 Citations
Cited By (65)
#1 Muriel Rabone (AMNH: American Museum of Natural History), H-Index: 7
#2 Harriet Harden-Davies (UOW: University of Wollongong), H-Index: 5
Last: Tammy Horton (NOC: National Oceanography Centre), H-Index: 13
(11 authors in total)
Better knowledge of the little known deep sea and areas beyond national jurisdiction (ABNJ) is key to conservation, an urgent need in light of increasing environmental change. Access to marine genetic resources (MGR) for the biodiversity research community to allow these environments to be better characterised is therefore essential. Negotiations have commenced under the auspices of the United Nations Convention on the Law of the Sea (UNCLOS) to develop a new treaty to further the conservation a...
3 Citations
#1 Lyubomir Penev, H-Index: 24
#2 Mariya Dimitrova, H-Index: 1
Last: Kiril Simov, H-Index: 15
(7 authors in total)
Hundreds of years of biodiversity research have resulted in the accumulation of a substantial pool of communal knowledge; however, most of it is stored in silos isolated from each other, such as published articles or monographs. The need for a system to store and manage collective biodiversity knowledge in a community-agreed and interoperable open format has evolved into the concept of the Open Biodiversity Knowledge Management System (OBKMS). This paper presents OpenBiodiv: An OBKMS that utiliz...
#1 Sumira Jan, H-Index: 1
#2 Nazia Abbas, H-Index: 1
The Himalayan mountain ecosystem embraces huge biodiversity thriving in a multitude of diverse microclimatic belts. Himalayan flora reveals immense heterogeneity and high complexity in terms of ecology, physiology, and utilization of plants. There is an urgent need to summarize these factors that influence variation in information pertaining to diversity of Himalayan flora, their biogeographic distribution, medicinal uses, and taxonomic evaluation. Prior to the release of Himalayan herbs to an extensi...
5 Citations
2 Citations
#1 Wim Hugo, H-Index: 3
#2 Donald Hobern (Global Biodiversity Information Facility), H-Index: 12
Last: Hannu Saarenmaa (University of Eastern Finland), H-Index: 13
(5 authors in total)
GEO BON regards development of a global infrastructure in support of Essential Biodiversity Variables (EBVs) as one of its main objectives. To realise the goal, an understanding of the context within which such an infrastructure needs to operate is important (for instance, is it part of a larger drive towards research data infrastructures in support of open science?) and the information technology applicable to such infrastructures needs to be considered. The EBVs are likely to require very spec...
4 Citations
Both classical taxonomy and DNA barcoding are engaged in the task of digitizing the living world. Much of the taxonomic literature remains undigitized. The rise of open access publishing this century and the freeing of older literature from the shackles of copyright have greatly increased the online availability of taxonomic descriptions, but much of the literature of the mid- to late-twentieth century remains offline (‘dark texts’). DNA barcoding is generating a wealth of computable data that i...
22 Citations
May 7, 2016 in CHI (Human Factors in Computing Systems)
#1 Andrea K. Thomer (UIUC: University of Illinois at Urbana–Champaign), H-Index: 6
#2 Michael B. Twidale (UIUC: University of Illinois at Urbana–Champaign), H-Index: 29
Last: Matthew J. Yoder (UIUC: University of Illinois at Urbana–Champaign), H-Index: 14
(4 authors in total)
Taxonomy is the branch of biology concerned with classifying organisms. Taxonomic work entails a range of complex human-computer and human-information interactions, which are under-supported by current software environments, partially because taxonomic software is largely built through ad hoc collaborations by taxonomists themselves. This results in poor user experience and difficult-to-use tools. Here we describe an interface design Hackathon held as part of the NSF-funded Transforming Taxonomi...
6 Citations
This project aims to develop and implement novel ways of publication, visualization, and dissemination of biodiversity and biodiversity-related data and thus bring the Open Biodiversity Knowledge Management System closer to fruition. In order to do so, we will develop new types of Enhanced Publications (EP's), which will allow automated data import into the manuscript and export from the manuscript and provide dynamic visualizations. These EP's will enable biodiversity researchers and taxonomist...
11 Citations
By digitising legacy taxonomic literature using XML mark-up the contents become accessible to other taxonomic and nomenclatural information systems. Appropriate schemas need to be interoperable with other sectorial schemas, atomise to appropriate content elements and carry appropriate metadata to, for example, enable algorithmic assessment of availability of a name under the Code. Legacy (and new) literature delivered in this fashion will become part of a global taxonomic resource from which use...
7 Citations