Release 89 statistics
GTDB release date: June 17, 2019
Taxon overview
GTDB r89 spans 145,904 genomes organized into 24,706 species clusters.
Genome categories
GTDB taxa comprise isolate genomes, metagenome-assembled genomes (MAGs), and single-amplified genomes (SAGs). The following plot indicates the proportion of taxa at each taxonomic rank composed exclusively of isolate genomes, exclusively of environmental genomes (i.e., MAGs/SAGs), or of both isolate and environmental genomes.
GTDB species representatives
Each GTDB species cluster is represented by a single genome. Genomes assembled from the type strain of the species were selected where possible, though the majority of species clusters are currently assigned only placeholder names. The proportion of species clusters composed of isolate or environmental genomes is given for each species category.
Quality of GTDB representative genomes
The quality of the genomes selected as GTDB species representatives is given below. Genome completeness and contamination were estimated using CheckM and are colored based on the MIMAG genome standards. In general, representative genomes were required to satisfy the quality criterion completeness - 5*contamination > 50. A few exceptions exist in order to retain well-known species with abnormal CheckM quality estimates.
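The quality criterion above can be sketched in a few lines of Python. This is only an illustration of the stated formula: the function names and the example values are hypothetical, completeness and contamination are assumed to be CheckM percentages, and the exceptions made for well-known species are not modeled.

```python
def quality_score(completeness: float, contamination: float) -> float:
    """Return completeness - 5*contamination (both given as percentages)."""
    return completeness - 5.0 * contamination


def passes_quality_criterion(completeness: float, contamination: float,
                             threshold: float = 50.0) -> bool:
    """True if the score is strictly greater than the threshold (50 in the text)."""
    return quality_score(completeness, contamination) > threshold


# Hypothetical genomes, not taken from the GTDB:
print(passes_quality_criterion(95.0, 2.0))  # score 85.0
print(passes_quality_criterion(60.0, 3.0))  # score 45.0
```

Because contamination is weighted five times as heavily as completeness, a genome with modest contamination can fall below the cutoff even when it is mostly complete.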
Taxa with the largest number of species
The taxa encompassing the largest number of GTDB species clusters are given for each taxonomic rank.
| Domain | Phylum | Class | Order | Family | Genus |
| --- | --- | --- | --- | --- | --- |
| Bacteria 23,458 | Proteobacteria 7,630 | Gammaproteobacteria 4,645 | Burkholderiales 1,273 | Burkholderiaceae 856 | Streptomyces 470 |
| Archaea 1,248 | Actinobacteriota 3,118 | Alphaproteobacteria 2,955 | Pseudomonadales 1,270 | Flavobacteriaceae 675 | Pseudomonas_E 404 |
| | Bacteroidota 2,843 | Actinobacteria 2,735 | Enterobacterales 1,186 | Lachnospiraceae 615 | Prevotella 229 |
| | Firmicutes_A 1,886 | Bacteroidia 2,677 | Flavobacteriales 1,045 | Mycobacteriaceae 611 | Streptococcus 224 |
| | Firmicutes 1,878 | Bacilli 1,878 | Bacteroidales 1,029 | Rhodobacteraceae 578 | Prochlorococcus_A 178 |
| | Patescibacteria 1,031 | Clostridia 1,836 | Rhizobiales 961 | Streptomycetaceae 528 | Flavobacterium 176 |
| | Cyanobacteria 654 | Cyanobacteriia 597 | Mycobacteriales 948 | Pseudomonadaceae 514 | Vibrio 140 |
| | Halobacterota 411 | Paceibacteria 392 | Actinomycetales 944 | Enterobacteriaceae 509 | Corynebacterium 133 |
| | Verrucomicrobiota 358 | Verrucomicrobiae 292 | Lactobacillales 674 | Rhizobiaceae 476 | Mycolicibacterium 122 |
| | Chloroflexota 350 | Bacilli_A 283 | Lachnospirales 662 | Sphingomonadaceae 398 | Microbacterium 119 |
Taxa with the largest number of sequenced genomes
The taxa encompassing the largest number of genomes in the GTDB are given for each taxonomic rank.
| Domain | Phylum | Class | Order | Family | Genus | Species |
| --- | --- | --- | --- | --- | --- | --- |
| Bacteria 143,512 | Proteobacteria 65,677 | Gammaproteobacteria 58,746 | Enterobacterales 37,723 | Enterobacteriaceae 32,509 | Escherichia 13,879 | Staphylococcus aureus 9,444 |
| Archaea 2,392 | Firmicutes 36,312 | Bacilli 36,312 | Lactobacillales 19,567 | Streptococcaceae 12,474 | Streptococcus 12,273 | Escherichia flexneri 9,084 |
| | Actinobacteriota 14,912 | Actinobacteria 14,045 | Staphylococcales 11,372 | Staphylococcaceae 11,330 | Staphylococcus 11,225 | Salmonella enterica 8,698 |
| | Bacteroidota 5,620 | Alphaproteobacteria 6,870 | Pseudomonadales 10,071 | Mycobacteriaceae 9,384 | Salmonella 8,878 | Streptococcus pneumoniae 8,201 |
| | Firmicutes_A 5,049 | Bacteroidia 5,288 | Mycobacteriales 9,911 | Pseudomonadaceae 5,171 | Mycobacterium 6,112 | Mycobacterium tuberculosis 5,596 |
| | Campylobacterota 3,912 | Clostridia 4,929 | Burkholderiales 7,087 | Burkholderiaceae 4,378 | Klebsiella 4,527 | Klebsiella pneumoniae 4,086 |
| | Patescibacteria 2,469 | Campylobacteria 3,903 | Bacillales 4,037 | Moraxellaceae 3,831 | Acinetobacter 3,623 | Acinetobacter baumannii 2,796 |
| | Cyanobacteria 1,158 | Cyanobacteriia 1,048 | Campylobacterales 3,894 | Vibrionaceae 2,937 | Pseudomonas 2,808 | Pseudomonas aeruginosa 2,744 |
| | Spirochaetota 1,096 | Paceibacteria 942 | Rhizobiales 3,317 | Listeriaceae 2,579 | Vibrio 2,643 | Escherichia coli 2,432 |
| | Halobacterota 758 | Verrucomicrobiae 607 | Bacteroidales 2,614 | Campylobacteraceae 2,490 | Listeria 2,568 | Mycobacteroides abscessus 1,589 |
Relative evolutionary divergence
The following graphs show the relative evolutionary divergence (RED) of taxa at each taxonomic rank from phylum to genus. RED values provide an operational approximation of relative time, with extant taxa existing in the present (RED=1), the last common ancestor occurring at a fixed time in the past (RED=0), and internal nodes being linearly interpolated between these values according to lineage-specific rates of evolution. The RED interval used for normalizing taxa at each taxonomic rank was operationally defined as the median RED value at that rank (indicated by a blue bar) ±0.1 (indicated by grey bars).
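The interval definition (median RED at a rank ±0.1) can be illustrated with a short Python sketch. The function names, taxon labels, and RED values below are hypothetical and chosen only for illustration. The sketch takes RED values as given and does not compute them from a tree.

```python
from statistics import median


def red_interval(red_values, half_width=0.1):
    """Return (low, high) = median RED -/+ half_width for one taxonomic rank."""
    mid = median(red_values)
    return (mid - half_width, mid + half_width)


def taxa_outside_interval(red_by_taxon, half_width=0.1):
    """Return the sorted names of taxa whose RED lies outside the rank's interval."""
    low, high = red_interval(list(red_by_taxon.values()), half_width)
    return sorted(name for name, red in red_by_taxon.items()
                  if red < low or red > high)


# Hypothetical RED values for three taxa at the same rank:
example = {"taxon_a": 0.30, "taxon_b": 0.35, "taxon_c": 0.50}
print(red_interval(list(example.values())))
print(taxa_outside_interval(example))
```

In this example the median is 0.35, so only the taxon at 0.50 falls outside the interval.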



Comparison of GTDB and NCBI taxa
GTDB and NCBI taxonomic assignments are compared across GTDB species representative genomes and across all GTDB genomes that have an assigned NCBI taxonomy. For each taxonomic rank, a taxon was classified as unchanged if its name was identical in both taxonomies, passively changed if the GTDB taxonomy provided name information absent from the NCBI taxonomy, or actively changed if the name differed between the two taxonomies.
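The three categories can be expressed as a small decision function. This is a minimal sketch under stated assumptions, not the procedure actually used to produce the comparison: names are compared as exact strings, an empty string stands for a rank with no NCBI name, and the function name and the placeholder label "UBA_example" are hypothetical. How suffixed GTDB names such as Pseudomonas_E are treated in the real comparison is not stated in the text, so here they simply count as different strings.

```python
def classify_change(gtdb_name: str, ncbi_name: str) -> str:
    """Classify one rank of one genome as 'unchanged', 'passive', or 'active'."""
    if gtdb_name == ncbi_name:
        return "unchanged"  # identical name in both taxonomies
    if not ncbi_name:
        return "passive"    # GTDB supplies a name that NCBI lacks
    return "active"         # both taxonomies name the taxon, but differently


print(classify_change("Escherichia", "Escherichia"))
print(classify_change("UBA_example", ""))
print(classify_change("Pseudomonas_E", "Pseudomonas"))
```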
Genomic statistics
Key genomic statistics for the GTDB species representative genomes and all genomes in the GTDB.