Steven L. Salzberg
Johns Hopkins University
390 Publications
118 H-index
136k Citations
Published on Mar 1, 2019 in New Phytologist 7.43
Amanda R. De La Torre
Estimated H-index: 8
(University of California, Davis),
Daniela Puiu
Estimated H-index: 14
(Johns Hopkins University)
+ 4 authors, David B. Neale
Estimated H-index: 35
(University of California, Davis)
1 Citation
Published on Feb 1, 2019 in Nature Genetics 27.13
Rachel M. Sherman
Estimated H-index: 2
(Johns Hopkins University),
Juliet Forman
Estimated H-index: 1
(Harvey Mudd College)
+ 43 authors, Victor E. Ortega
Estimated H-index: 8
(Wake Forest University)
In the version of this article initially published, the statement “there are no pan-genomes for any other animal or plant species” was incorrect. The statement has been corrected to “there are no reported pan-genomes for any other animal species, to our knowledge.” We thank David Edwards for bringing this error to our attention. The error has been corrected in the HTML and PDF versions of the article.
Published on Feb 15, 2019 in Bioinformatics 5.48
Richard Wilton
Estimated H-index: 3
(Johns Hopkins University),
Sarah J. Wheelan
Estimated H-index: 27
(Johns Hopkins University)
+ 1 author, Steven L. Salzberg
Estimated H-index: 118
Published on Jan 1, 2019 in Nature Genetics 27.13
Rachel M. Sherman
Estimated H-index: 2
(Johns Hopkins University),
Juliet Forman
Estimated H-index: 1
(Harvey Mudd College)
+ 43 authors, Victor E. Ortega
Estimated H-index: 8
(Wake Forest University)
We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of th...
6 Citations
Published on Feb 20, 2019 in Nature Communications 12.35
Michelle Daya
Estimated H-index: 8
(University of Colorado Denver),
Nicholas Rafaels
Estimated H-index: 29
(University of Colorado Denver)
+ 62 authors, Monica Campbell
Estimated H-index: 7
(University of Colorado Denver)
Asthma is a complex disease with striking disparities across racial and ethnic groups. Despite its relatively high burden, representation of individuals of African ancestry in asthma genome-wide association studies (GWAS) has been inadequate, and true associations in these underrepresented minority groups have been inconclusive. We report the results of a genome-wide meta-analysis from the Consortium on Asthma among African Ancestry Populations (CAAPA; 7009 asthma cases, 7645 controls). We find ...
1 Citation
Published on Dec 1, 2018 in Genome Biology 13.21
Mihaela Pertea
Estimated H-index: 30
(Johns Hopkins University),
Alaina Shumate
Estimated H-index: 2
(Johns Hopkins University)
+ 6 authors, Steven L. Salzberg
Estimated H-index: 118
We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We detected over 30 million additional transcripts at more than 650,000 genomic loci, nearl...
5 Citations
Published on Dec 1, 2018 in Nature Communications 12.35
Seyedeh M. Zekavat
Estimated H-index: 10
(Yale University),
Sanni Ruotsalainen
Estimated H-index: 2
(University of Helsinki)
+ 339 authors, Jesse M. Engreitz
Estimated H-index: 19
(Broad Institute)
Lipoprotein(a), Lp(a), is a modified low-density lipoprotein particle that contains apolipoprotein(a), encoded by LPA, and is a highly heritable, causal risk factor for cardiovascular diseases that varies in concentrations across ancestries. Here, we use deep-coverage whole genome sequencing in 8392 individuals of European and African ancestry to discover and interpret both single-nucleotide variants and copy number (CN) variation associated with Lp(a). We observe that genetic determinants betwe...
5 Citations
Published on Jan 26, 2018 in PLOS Computational Biology 3.96
Guillaume Marçais
Estimated H-index: 13
(University of Maryland, College Park),
Arthur L. Delcher
Estimated H-index: 45
(Johns Hopkins University)
+ 3 authors, Aleksey V. Zimin
Estimated H-index: 27
(University of Maryland, College Park)
The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very l...
37 Citations
Published on Dec 1, 2018 in Genome Biology 13.21
Florian P. Breitwieser
Estimated H-index: 18
(Johns Hopkins University),
Dannon Baker
Estimated H-index: 7
(Johns Hopkins University),
Steven L. Salzberg
Estimated H-index: 118
(Johns Hopkins University)
False-positive identifications are a significant problem in metagenomics classification. We present KrakenUniq, a novel metagenomics classifier that combines the fast k-mer-based classification of Kraken with an efficient algorithm for assessing the coverage of unique k-mers found in each species in a dataset. On various test datasets, KrakenUniq gives better recall and precision than other methods and effectively classifies and distinguishes pathogens with low abundance from false positives in ...
3 Citations