Establishing a distributed national research infrastructure providing bioinformatics support to life science researchers in Australia

Published on Mar 25, 2019 in Briefings in Bioinformatics (impact factor: 9.101)
DOI: 10.1093/bib/bbx071
Maria Victoria Schneider (University of Melbourne), Estimated H-index: 24
Philippa C. Griffin (Australia Bioinformatics Resource), Estimated H-index: 11
+ 15 authors
Andrew Lonie (University of Melbourne), Estimated H-index: 18
EMBL Australia Bioinformatics Resource (EMBL-ABR) is a developing national research infrastructure, providing bioinformatics resources and support to life science and biomedical researchers in Australia. EMBL-ABR comprises 10 geographically distributed national nodes with one coordinating hub, with current funding provided through Bioplatforms Australia and the University of Melbourne for its initial 2-year development phase. The EMBL-ABR mission is to: (1) increase Australia’s capacity in bioinformatics and data sciences; (2) contribute to the development of training in bioinformatics skills; (3) showcase Australian data sets at an international level and (4) enable engagement in international programs. The activities of EMBL-ABR are focussed in six key areas, aligning with comparable international initiatives such as ELIXIR, CyVerse and NIH Commons. These key areas—Tools, Data, Standards, Platforms, Compute and Training—are described in this article.
References (24)
#1 Lindsay Barone (CSHL: Cold Spring Harbor Laboratory), H-Index: 3
#2 Jason Williams (CSHL: Cold Spring Harbor Laboratory), H-Index: 6
Last: David Micklos (CSHL: Cold Spring Harbor Laboratory), H-Index: 9
(3 authors)
In a 2016 survey of 704 National Science Foundation (NSF) Biological Sciences Directorate principal investigators (BIO PIs), nearly 90% indicated they are currently or will soon be analyzing large data sets. BIO PIs considered a range of computational needs important to their work, including high performance computing (HPC), bioinformatics support, multistep workflows, updated analysis software, and the ability to store, share, and publish data. Previous studies in the United States and Canada e...
32 Citations
#1 Fabien Arnaud, H-Index: 35
#2 Cécile Pignol, H-Index: 14
Last: Arnaud Caillo, H-Index: 1
(16 authors)
Managing paleoscience data is highly challenging due to the multiplicity of actors in play, types of sampling, analysis, post-analysis treatments, statistics etc. However, a well-structured curation of data would permit innovative developments based on data and/or sample re-use, such as meta-analysis or the development of new proxies on previously studied cores. In this paper, we will present two recent initiatives that allowed us to tackle this objective at a French national level: the “National Cyb...
#1 Erin C McKiernan (UNAM: National Autonomous University of Mexico), H-Index: 7
#2 Philip E. Bourne (NIH: National Institutes of Health), H-Index: 57
Last: Tal Yarkoni (University of Texas at Austin), H-Index: 37
(15 authors)
Open access, open data, open source and other open scholarship practices are growing in popularity and necessity. However, widespread adoption of these practices has not yet been achieved. One reason is that researchers are uncertain about how sharing their work will affect their careers. We review literature demonstrating that open research is associated with increases in citations, media attention, potential collaborators, job opportunities and funding opportunities. These findings are evidenc...
118 Citations
#1 Alejandro Rodríguez-Iglesias (UPM: Technical University of Madrid), H-Index: 1
#2 Alejandro Rodríguez-González (UPM: Technical University of Madrid), H-Index: 12
Last: M. Wilkinson (UPM: Technical University of Madrid), H-Index: 16
(7 authors)
Pathogen-Host interaction data is core to our understanding of disease processes and their molecular/genetic bases. Facile access to such core data is particularly important for the plant sciences, where individual genetic and phenotypic observations have the added complexity of being dispersed over a wide diversity of plant species versus the relatively fewer host species of interest to biomedical researchers. Recently, an international initiative interested in scholarly data publishing propose...
10 Citations
#1 M. Wilkinson (UPM: Technical University of Madrid), H-Index: 16
#2 Michel Dumontier (Stanford University), H-Index: 37
Last: Barend Mons (LEI: Leiden University), H-Index: 30
(53 authors)
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on...
1,209 Citations
#1 Nirav Merchant (UA: University of Arizona), H-Index: 13
#2 Eric Lyons (UA: University of Arizona), H-Index: 35
Last: Parker B. Antin (UA: University of Arizona), H-Index: 34
(7 authors)
The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as di...
108 Citations
#1 Charles E. Cook (EMBL-EBI: European Bioinformatics Institute), H-Index: 18
#2 Mary Todd Bergman (EMBL-EBI: European Bioinformatics Institute), H-Index: 2
Last: Rolf Apweiler (EMBL-EBI: European Bioinformatics Institute), H-Index: 84
(6 authors)
New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we ...
67 Citations
#1 Jon Ison (DTU: Technical University of Denmark), H-Index: 8
#2 Kristoffer Rapacki (DTU: Technical University of Denmark), H-Index: 8
Last: Søren Brunak (DTU: Technical University of Denmark), H-Index: 93
(69 authors)
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is tha...
54 Citations
#1 Vasileios Lapatas (Ionian University), H-Index: 1
#2 Michalis Stefanidakis (Ionian University), H-Index: 6
Last: Maria Victoria Schneider (Norwich Research Park), H-Index: 24
(5 authors)
Data sharing, integration and annotation are essential to ensure the reproducibility of the analysis and interpretation of the experimental findings. Often these activities are perceived as a role that bioinformaticians and computer scientists have to take with no or little input from the experimental biologist. On the contrary, biological researchers, being the producers and often the end users of such data, have a big role in enabling biological data integration. The quality and usefulness of ...
29 Citations
#1 Zachary D. Stephens (UIUC: University of Illinois at Urbana–Champaign), H-Index: 4
#2 Skylar Y. Lee (UIUC: University of Illinois at Urbana–Champaign), H-Index: 1
Last: Gene E. Robinson (UIUC: University of Illinois at Urbana–Champaign), H-Index: 91
(10 authors)
Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a “four-headed beast”—it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discu...
456 Citations
Cited By (4)
#1 Abhishek Agarwal (IIIT-D: Indraprastha Institute of Information Technology)
#2 Piyush Agrawal (CSIR: Council of Scientific and Industrial Research), H-Index: 9
Last: Gajendra P. S. Raghava (IIIT-D: Indraprastha Institute of Information Technology), H-Index: 54
(6 authors)
IndiaBioDb is a manually curated comprehensive repository of bioinformatics resources developed and maintained by Indian researchers. This repository maintains information about more than 550 freely accessible functional resources that include around 263 biological databases. Each entry provides a complete detail about a resource that includes the name of resources, web link, detail of publication, information about the corresponding author, name o...
#1 Anup Som (Allahabad University), H-Index: 7
#2 Priyanka Kumari (Allahabad University), H-Index: 1
Last: Arindam Ghosh (Allahabad University), H-Index: 1
(3 authors)
Bioinformatics is an interdisciplinary field of study that uses computation to extract knowledge from biological data. It has become an integral component of today’s biological and biomedical sciences. Thus, considering its necessity, the Government of India through the Department of Biotechnology (DBT) laid the framework for developing bioinformatics infrastructure and human resource in 1986–1987. Over three decades, bioinformatics education and research in India has made consistent progress; c...
1 Citation
#1 John D. Van Horn (SC: University of Southern California), H-Index: 47
#2 Sumiko Abe (SC: University of Southern California), H-Index: 2
Last: Sonika Tyagi (Monash University, Clayton campus), H-Index: 11
(20 authors)
The increasing richness and diversity of biomedical data types creates major organizational and analytical impediments to rapid translational impact in the context of training and education. As biomedical data-sets increase in size, variety and complexity, they challenge conventional methods for sharing, managing and analyzing those data. In May 2017, we convened a two-day meeting between the BD2K Training Coordinating Center (TCC), ELIXIR Training/TeSS, GOBLET, H3ABioNet, EMBL-ABR, bioCADDIE an...
#1 Philippa C. Griffin (University of Melbourne), H-Index: 11
#2 Jyoti Khadake (Cambridge University Hospitals NHS Foundation Trust), H-Index: 12
Last: Maria Victoria Schneider (University of Melbourne), H-Index: 24
(27 authors)
Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a ‘life cycle’ view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations...
3 Citations