Release 232 statistics
GTDB release date: 15th April, 2025
Taxon overview
GTDB R232 spans 901,341 genomes organized into 199,923 species clusters.
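| Rank | Bacteria | Archaea | Total |
|---|---|---|---|
| Phylum | 162 | 24 | 186 |
| Class | 572 | 69 | 641 |
| Order | 2,164 | 179 | 2,343 |
| Family | 6,026 | 699 | 6,725 |
| Genus | 34,834 | 2,669 | 37,503 |
| Species | 189,801 | 10,122 | 199,923 |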
Species overview
GTDB R232 comprises 878,998 bacterial and 22,343 archaeal genomes organized into 189,801 bacterial and 10,122 archaeal species clusters, respectively.
| Release | Bacterial Genomes | Archaeal Genomes | Bacterial Species Clusters | Archaeal Species Clusters |
|---|---|---|---|---|
| R04-RS89 | 143,512 | 2,392 | 23,458 | 1,248 |
| R05-RS95 | 191,527 | 3,073 | 30,238 | 1,672 |
| R06-RS202 | 254,090 | 4,316 | 45,555 | 2,339 |
| R07-RS207 | 311,480 | 6,062 | 62,291 | 3,412 |
| R08-RS214 | 394,932 | 7,777 | 80,789 | 4,416 |
| R09-RS220 | 584,382 | 12,477 | 107,235 | 5,869 |
| R10-RS226 | 715,230 | 17,245 | 136,646 | 6,968 |
| R11-RS232 | 878,998 | 22,343 | 189,801 | 10,122 |
| Growth from R10-RS226 | 22.90% | 29.56% | 38.90% | 45.26% |
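The growth figures in the last row can be reproduced from the two preceding rows. The following is a minimal sketch, assuming growth is the simple percentage increase over the R10-RS226 counts; the function and dictionary names are illustrative and not part of any GTDB tooling.

```python
def percent_growth(previous: int, current: int) -> float:
    """Percentage increase from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Counts copied from the R10-RS226 and R11-RS232 rows of the release table.
R10 = {
    "bacterial_genomes": 715_230,
    "archaeal_genomes": 17_245,
    "bacterial_species": 136_646,
    "archaeal_species": 6_968,
}
R11 = {
    "bacterial_genomes": 878_998,
    "archaeal_genomes": 22_343,
    "bacterial_species": 189_801,
    "archaeal_species": 10_122,
}

# Round to two decimals, matching the precision shown in the table.
growth = {name: round(percent_growth(R10[name], R11[name]), 2) for name in R10}

for name, value in growth.items():
    print(f"{name}: {value:.2f}%")
```

Under that assumption the computed values agree with the published 22.90%, 29.56%, 38.90%, and 45.26%.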
Genome categories
GTDB taxa comprise isolate genomes, metagenome-assembled genomes (MAGs), and single-amplified genomes (SAGs). The following plot indicates the proportion of taxa at each taxonomic rank composed exclusively of isolate genomes, exclusively of environmental genomes (i.e., MAGs/SAGs), or of both isolate and environmental genomes.
GTDB species representatives
Each GTDB species cluster is represented by a single genome. Genomes assembled from the type strain of the species were selected where possible, though the majority of species clusters are currently assigned only placeholder names. The proportion of representatives which are isolates, MAGs, or SAGs is given for each category.
Quality of GTDB representative genomes
The quality of the genomes selected as GTDB species representatives is given below. Genome completeness and contamination were estimated using CheckM and are colored based on the MIMAG genome standards. In general, representative genomes were restricted to those with a quality score satisfying completeness - 5 × contamination > 50, unless a large portion of the contamination could be attributed to strain heterogeneity. A few exceptions exist in order to retain well-known species with abnormal CheckM quality estimates, where contamination exceeds 10%.
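The quality criterion can be written as a small check. This is a minimal sketch of the stated inequality only: the function names and example values are hypothetical, and it does not model the strain-heterogeneity allowance or the manually retained exceptions described above.

```python
def quality_score(completeness: float, contamination: float) -> float:
    """Quality score as stated in the text: completeness - 5 * contamination.

    Both inputs are percentages (0-100), e.g. as estimated by CheckM.
    """
    return completeness - 5.0 * contamination


def passes_quality_filter(completeness: float, contamination: float,
                          threshold: float = 50.0) -> bool:
    """True when the quality score is strictly greater than the threshold."""
    return quality_score(completeness, contamination) > threshold


# Hypothetical genomes, for illustration only (not GTDB data).
print(passes_quality_filter(95.0, 2.0))  # score 85.0
print(passes_quality_filter(70.0, 5.0))  # score 45.0
```

The factor of 5 means each percentage point of contamination costs as much as five points of completeness, so a genome at 70% completeness and 5% contamination falls below the threshold.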
Taxa with the largest number of species
The taxa encompassing the largest number of GTDB species clusters are given for each taxonomic rank.
| Phylum | Class | Order | Family | Genus |
|---|---|---|---|---|
| Pseudomonadota 46,828 | Gammaproteobacteria 25,165 | Burkholderiales 9,155 | Lachnospiraceae 4,549 | Streptomyces 2,311 |
| Bacillota 27,167 | Bacteroidia 21,942 | Bacteroidales 8,804 | Flavobacteriaceae 3,764 | Collinsella 1,181 |
| Actinomycetota 24,968 | Alphaproteobacteria 21,473 | Oscillospirales 7,815 | Rhodobacteraceae 3,226 | Flavobacterium 1,026 |
| Bacteroidota 23,901 | Clostridia 20,027 | Rhizobiales 5,752 | Burkholderiaceae_C 2,966 | Aquipseudomonas 993 |
| Patescibacteriota 10,945 | Actinomycetes 14,793 | Lachnospirales 4,973 | Sphingomonadaceae 2,805 | Prevotella 959 |
| Acidobacteriota 7,453 | Bacilli 4,864 | Pseudomonadales 4,700 | Streptomycetaceae 2,611 | Pelagibacter 938 |
| Chloroflexota 6,831 | Minisyncoccia 4,322 | Flavobacteriales 4,674 | Acutalibacteraceae 2,322 | Mycobacterium 850 |
| Planctomycetota 4,995 | Terriglobia 4,081 | Actinomycetales 4,579 | Acidobacteriaceae 2,202 | Streptococcus 761 |
| Verrucomicrobiota 4,825 | Bacilli_A 3,799 | Mycobacteriales 3,781 | Chitinophagaceae 2,191 | Cryptobacteroides 643 |
| Desulfobacterota 4,552 | Verrucomicrobiia 3,767 | Chitinophagales 3,446 | Bacteroidaceae 2,100 | Microbacterium 593 |
Taxa with the largest number of sequenced genomes
The taxa encompassing the largest number of genomes in the GTDB are given for each taxonomic rank.
| Phylum | Class | Order | Family | Genus | Species |
|---|---|---|---|---|---|
| Pseudomonadota 324,613 | Gammaproteobacteria 273,004 | Enterobacterales 162,324 | Enterobacteriaceae 139,599 | Escherichia 37,464 | Escherichia coli 36,408 |
| Bacillota 228,026 | Bacilli 112,556 | Bacteroidales 75,895 | Lachnospiraceae 42,630 | Klebsiella 35,140 | Klebsiella pneumoniae 28,487 |
| Bacteroidota 107,378 | Clostridia 107,138 | Lactobacillales 62,342 | Staphylococcaceae 28,460 | Staphylococcus 27,543 | Salmonella enterica 21,527 |
| Actinomycetota 80,926 | Bacteroidia 103,615 | Pseudomonadales 50,143 | Streptococcaceae 25,663 | Streptococcus 23,747 | Staphylococcus aureus 19,502 |
| Patescibacteriota 16,000 | Actinomycetes 60,544 | Lachnospirales 43,929 | Muribaculaceae 25,055 | Salmonella 21,848 | ECMA0423 sp047199055 13,724 |
| Bacillota_I 14,528 | Alphaproteobacteria 51,346 | Burkholderiales 35,997 | Pseudomonadaceae 24,656 | Acinetobacter 16,492 | Pseudomonas aeruginosa 12,456 |
| Campylobacterota 14,392 | Bacilli_A 14,528 | Oscillospirales 33,622 | Bacteroidaceae 23,694 | ECMA0423 13,724 | Acinetobacter baumannii 11,844 |
| Verrucomicrobiota 10,795 | Campylobacteria 14,392 | Staphylococcales 28,714 | Mycobacteriaceae 18,607 | Pseudomonas 12,771 | Streptococcus pneumoniae 9,634 |
| Acidobacteriota 10,449 | Verrucomicrobiia 8,872 | Mycobacteriales 22,121 | Moraxellaceae 17,964 | Mycobacterium 12,726 | Mycobacterium tuberculosis 7,759 |
| Chloroflexota 10,423 | Acidimicrobiia 7,655 | Actinomycetales 17,794 | Lactobacillaceae 14,996 | Streptomyces 11,995 | Enterococcus faecalis 5,041 |
Relative evolutionary divergence
The following graphs show the relative evolutionary divergence (RED) of taxa at each taxonomic rank from phylum to genus. RED values provide an operational approximation of relative time, with extant taxa existing in the present (RED=1), the last common ancestor occurring at a fixed time in the past (RED=0), and internal nodes being linearly interpolated between these values according to lineage-specific rates of evolution. The RED interval for normalizing taxa at each taxonomic rank was operationally defined as the median RED value at that rank (indicated by a blue bar) ±0.1 (indicated by grey bars).
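The interval rule described above (median RED at a rank, plus or minus 0.1) can be sketched as follows. This covers only the interval check, not the computation of RED from a tree; the function names and the example RED values are hypothetical.

```python
from statistics import median


def red_interval(red_values, half_width=0.1):
    """Return (lower, median, upper) for the RED values observed at one rank."""
    mid = median(red_values)
    return mid - half_width, mid, mid + half_width


def taxa_outside_interval(red_by_taxon, half_width=0.1):
    """Sorted names of taxa whose RED lies outside median +/- half_width."""
    lower, _, upper = red_interval(list(red_by_taxon.values()), half_width)
    return sorted(
        name for name, red in red_by_taxon.items()
        if red < lower or red > upper
    )


# Hypothetical RED values for five taxa at a single rank (not GTDB data).
example = {
    "taxon_A": 0.70,
    "taxon_B": 0.74,
    "taxon_C": 0.78,
    "taxon_D": 0.90,
    "taxon_E": 0.62,
}

flagged = taxa_outside_interval(example)
print(flagged)
```

With these made-up values the median is 0.74, so taxon_D (0.90) and taxon_E (0.62) fall outside the 0.64 to 0.84 interval.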

Bacteria


Archaea

Comparison of GTDB and NCBI taxa

GTDB and NCBI taxonomic assignments are compared across GTDB species representative genomes and across all GTDB genomes that have an assigned NCBI taxonomy. For each taxonomic rank, a taxon was classified as unchanged if its name was identical in both taxonomies, passively changed if the GTDB taxonomy provided name information absent in the NCBI taxonomy, or actively changed if the name differed between the two taxonomies.
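The three categories can be expressed as a small decision function. This is a sketch under stated assumptions: a missing NCBI name at a rank is represented as `None` or an empty string, the GTDB name is always present, and names are compared as plain strings. The function name and example inputs are illustrative and say nothing about any particular genome's current NCBI classification.

```python
from typing import Optional


def classify_change(gtdb_name: str, ncbi_name: Optional[str]) -> str:
    """Classify one taxon name at one rank as unchanged, passive, or active.

    Assumption: `ncbi_name` is None or "" when NCBI gives no name at this rank.
    """
    if not ncbi_name:
        # GTDB supplies name information that NCBI lacks.
        return "passively changed"
    if gtdb_name == ncbi_name:
        return "unchanged"
    # Both taxonomies name the taxon, but the names differ.
    return "actively changed"


# Illustrative inputs only.
print(classify_change("Escherichia", "Escherichia"))
print(classify_change("PlaceholderGenus01", None))
print(classify_change("Bacillota", "Firmicutes"))
```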

Phylum names have been updated to follow the valid publication of 42 names in IJSEM. This has resulted in a large number of active phylum name changes relative to NCBI classifications at the time of this release. NCBI is also adopting these new phylum names.


Genomic statistics
Key genomic statistics are given for the GTDB species representative genomes and for all genomes in the GTDB.

Genomes


Species

Nomenclatural types per rank

This plot shows the breakdown of placeholder versus latinized names for each taxonomic rank.

| Rank | Bacteria: Latin | Bacteria: Placeholder | Archaea: Latin | Archaea: Placeholder | Total: Latin | Total: Placeholder |
|---|---|---|---|---|---|---|
| Phylum | 75 (46.30%) | 87 (53.70%) | 16 (66.67%) | 8 (33.33%) | 91 (48.92%) | 95 (51.08%) |
| Class | 174 (30.42%) | 398 (69.58%) | 42 (60.87%) | 27 (39.13%) | 216 (33.70%) | 425 (66.30%) |
| Order | 417 (19.27%) | 1,747 (80.73%) | 72 (40.22%) | 107 (59.78%) | 489 (20.87%) | 1,854 (79.13%) |
| Family | 892 (14.80%) | 5,134 (85.20%) | 117 (16.74%) | 582 (83.26%) | 1,009 (15.00%) | 5,716 (85.00%) |
| Genus | 4,422 (12.69%) | 30,412 (87.31%) | 304 (11.39%) | 2,365 (88.61%) | 4,726 (12.60%) | 32,777 (87.40%) |
| Species | 19,798 (10.43%) | 170,003 (89.57%) | 888 (8.77%) | 9,234 (91.23%) | 20,686 (10.35%) | 179,237 (89.65%) |
Phylum phylogenetic diversity
The following tables show the percentage contribution of each phylum to the Phylogenetic Diversity (PD) of the original and RED-normalised GTDB reference trees.

Phyla that are polyphyletic in the reference tree are suffixed as _A, _B, etc.

Note: RED-normalised values represent the median phylogenetic diversity (PD) across RED-scaled trees, each rooted with a phylum containing at least two classes.

Archaea:

| Phylum | Original (%) | RED-normalised (%) | Difference |
|---|---|---|---|
| AGJL01 | 0.2 | 0.34 | 0.14 |
| Aenigmatarchaeota | 7.18 | 5.94 | -1.24 |
| Altiarchaeota | 1.13 | 1.22 | 0.09 |
| Asgardarchaeota | 3.94 | 3.92 | -0.02 |
| B1Sed10-29 | 0.84 | 0.94 | 0.1 |
| CAYFZR01 | 0.01 | 0.03 | 0.02 |
| CAYQAE01 | 0.14 | 0.18 | 0.04 |
| EX4484-52 | 0.69 | 0.6 | -0.09 |
| Hadarchaeota | 0.81 | 1.09 | 0.28 |
| Halobacteriota | 7.72 | 10.81 | 3.09 |

Bacteria:

| Phylum | Original (%) | RED-normalised (%) | Difference |
|---|---|---|---|
| 2-12-FULL-45-22 | 0.01 | 0.01 | 0 |
| 4484-113 | 0.04 | 0.04 | 0 |
| 4572-55 | 0.01 | 0.02 | 0.01 |
| ARS69 | 0.02 | 0.02 | 0 |
| AUK180 | 0.02 | 0.03 | 0.01 |
| Abyssobacteria | 0.02 | 0.03 | 0.01 |
| Acidobacteriota | 3.13 | 4.28 | 1.15 |
| Actinomycetota | 8.01 | 7.88 | -0.13 |
| Aerophobota | 0.04 | 0.05 | 0.01 |
| Aquificota | 0.12 | 0.12 | 0 |
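
The Difference values listed for each phylum appear to be the RED-normalised percentage minus the original percentage. This is inferred from the listed numbers rather than stated in the text, so the sketch below should be read as a consistency check under that assumption; the function name is illustrative.

```python
def pd_difference(original_pct: float, red_normalised_pct: float) -> float:
    """RED-normalised PD share minus original PD share, to two decimals."""
    return round(red_normalised_pct - original_pct, 2)


# (phylum, original %, RED-normalised %, listed difference), copied from the tables.
rows = [
    ("Aenigmatarchaeota", 7.18, 5.94, -1.24),
    ("Halobacteriota", 7.72, 10.81, 3.09),
    ("Acidobacteriota", 3.13, 4.28, 1.15),
    ("Actinomycetota", 8.01, 7.88, -0.13),
]

for phylum, original, normalised, listed in rows:
    computed = pd_difference(original, normalised)
    # Compare with a small tolerance to avoid floating-point surprises.
    matches = abs(computed - listed) < 1e-9
    print(f"{phylum}: computed {computed:+.2f}, listed {listed:+.2f}, match={matches}")
```

A positive difference means the phylum accounts for a larger share of phylogenetic diversity after RED normalisation than in the original tree.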