Christian Colmsee
Leibniz Association
Data mining, Genome, Information retrieval, Computer science, Biology
Publications 28
#1 Martin Mascher (Leibniz Association), H-Index: 26
#2 Heidrun Gundlach, H-Index: 34
Last. Nils Stein (Leibniz Association), H-Index: 60
view all 76 authors...
Cereal grasses of the Triticeae tribe have been the major food source in temperate regions since the dawn of agriculture. Their large genomes are characterized by a high content of repetitive elements and large pericentromeric regions that are virtually devoid of meiotic recombination. Here we present a high-quality reference genome assembly for barley (Hordeum vulgare L.). We use chromosome conformation capture mapping to derive the linear order of sequences across the pericentromeric space and...
368 Citations
#1 Sebastian Beier, H-Index: 11
#2 Axel Himmelbach, H-Index: 34
Last. Martin Mascher (Leibniz Association), H-Index: 26
view all 60 authors...
Barley (Hordeum vulgare L.) is a cereal grass mainly used as animal fodder and raw material for the malting industry. The map-based reference genome sequence of barley cv. ‘Morex’ was constructed by the International Barley Genome Sequencing Consortium (IBSC) using hierarchical shotgun sequencing. Here, we report the experimental and computational procedures to (i) sequence and assemble more than 80,000 bacterial artificial chromosome (BAC) clones along the minimum tiling path of a genome-wide p...
48 Citations
#1 Manuel Spannagl, H-Index: 27
#2 Michael Alaux (INRA: Institut national de la recherche agronomique), H-Index: 11
Last. Paul Kersey (EMBL-EBI: European Bioinformatics Institute), H-Index: 39
view all 29 authors...
The genome sequences of many important Triticeae species, including bread wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), remained uncharacterized for a long time because of their high repeat content, large sizes, and polyploidy. As a result of improvements in sequencing technologies and novel analysis strategies, several of these have recently been deciphered. These efforts have generated new insights into Triticeae biology and genome organization and have important implications for downstream usage by breeders, ...
6 Citations
#1 Thomas Schmutzer (Leibniz Association), H-Index: 18
#2 Birgit Samans (University of Giessen), H-Index: 23
Last. Uwe Scholz (Leibniz Association), H-Index: 38
view all 16 authors...
Brassica napus (oilseed rape, canola) is one of the world’s most important sources of vegetable oil for human nutrition and biofuel, and also a model species for studies investigating the evolutionary consequences of polyploidisation. Strong bottlenecks during its recent origin from interspecific hybridisation, and subsequently through intensive artificial selection, have severely depleted the genetic diversity available for breeding. On the other hand, high-throughput genome profiling technolog...
38 Citations
#1 Jinbo Chen, H-Index: 4
#2 Daniel Arend, H-Index: 6
Last. Matthias Lange, H-Index: 11
view all 5 authors...
#1 Daniel Arend, H-Index: 6
#2 Jinbo Chen, H-Index: 4
Last. Matthias Lange, H-Index: 11
view all 5 authors...
#1 Christian Colmsee (Leibniz Association), H-Index: 10
#2 Sebastian Beier (Leibniz Association), H-Index: 11
Last. Martin Mascher (Leibniz Association), H-Index: 26
view all 7 authors...
Genome browsers visualize the end product of genome assembly, which is a highly contiguous sequence. But how can the intermediate products of genome sequencing be visualized? Next-generation sequencing has enabled genome sequencing in species with huge genomes, but most often the shotgun assemblies obtained from short-read sequence data do not meet the quality standards required for finished reference sequences. In particular, many plant genomes do not yet have finished reference sequences, but ...
34 Citations
#1 Christian Colmsee (Leibniz Association), H-Index: 10
#2 Jinbo Chen (Leibniz Association), H-Index: 4
Last. Matthias Lange (Hochschule Harz), H-Index: 2
view all 5 authors...
The management and handling of big data is a major challenge in the life sciences. Besides data storage, information retrieval methods have to be adapted to large amounts of data as well. We therefore present an approach to improve search results in life science by recommendations based on semantic information. Specifically, we determine relationships between documents by searching for shared database IDs as well as ontology identifiers. We have established a pipeline based on Hadoop allowing ...
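The idea of relating documents through shared database IDs and ontology identifiers can be illustrated with a small sketch. This is not the authors' pipeline: it runs in memory in plain Python rather than on Hadoop, and the identifier patterns (GO and PO terms, a simplified UniProt-style accession), the function names, and the scoring by count of shared identifiers are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumed identifier schemes; the abstract does not say which ones are matched.
# GO/PO ontology terms plus a simplified UniProt-style accession pattern.
ID_PATTERN = re.compile(r"\b(?:GO:\d{7}|PO:\d{7}|[OPQ]\d[A-Z0-9]{3}\d)\b")


def map_identifiers(doc_id, text):
    """Map step: emit (identifier, doc_id) for each distinct identifier found."""
    for identifier in set(ID_PATTERN.findall(text)):
        yield identifier, doc_id


def shared_identifier_counts(documents):
    """Group documents by identifier, then count shared identifiers per pair."""
    docs_by_identifier = defaultdict(set)
    for doc_id, text in documents.items():
        for identifier, owner in map_identifiers(doc_id, text):
            docs_by_identifier[identifier].add(owner)

    pair_counts = defaultdict(int)
    for owners in docs_by_identifier.values():
        # Every pair of documents mentioning the same identifier is related.
        for a, b in combinations(sorted(owners), 2):
            pair_counts[(a, b)] += 1
    return dict(pair_counts)


def recommend(doc_id, pair_counts):
    """Rank other documents by how many identifiers they share with doc_id."""
    scores = {}
    for (a, b), count in pair_counts.items():
        if a == doc_id:
            scores[b] = count
        elif b == doc_id:
            scores[a] = count
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Calling `shared_identifier_counts` on a dictionary of document texts and passing the result to `recommend` yields a ranked list of related documents. In a distributed setting the grouping by identifier would correspond to the shuffle between map and reduce phases.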
#1 Maria Esch (Leibniz Association), H-Index: 3
#2 Jinbo Chen (Leibniz Association), H-Index: 4
Last. Matthias Lange (Leibniz Association), H-Index: 11
view all 7 authors...
With the number of sequenced plant genomes growing, the number of predicted genes and functional annotations is also increasing. The association between genes and phenotypic traits is currently of great interest. Unfortunately, the information available today is widely scattered over a number of different databases. Information retrieval (IR) has become an all-encompassing bioinformatics methodology for extracting knowledge from complex, heterogeneous and distributed databases, and therefore can...
6 Citations
Jul 17, 2014 in DILS (Data Integration in the Life Sciences)
#1 Daniel Arend (Leibniz Association), H-Index: 6
#2 Christian Colmsee (Leibniz Association), H-Index: 10
Last. Matthias Lange (Leibniz Association), H-Index: 11
view all 8 authors...
Research in the life sciences faces increasing amounts of cross-domain data, also known as “big data”. This has notable effects on IT departments and the dry lab desk alike. In this paper, we report on experiences from a decade of data management in a plant research institute. We explain the switch from personally managed files and heterogeneous information systems towards a centrally organised storage management. In particular, we discuss lessons that were learned within the last decade of productiv...
1 Citation