The tree is built with UPGMA, and the distances shown are pairwise allelic differences (Hamming distance). Missing genes are excluded from the count.
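As an illustration of the method described above, the following is a minimal sketch of an allelic Hamming distance that skips missing genes, followed by a small UPGMA (average-linkage) clustering. It is not the 1928 platform's implementation. The function names, the use of `None` to mark a missing gene, and the rule that a locus is skipped when it is missing in either sample are all assumptions made for the example.

```python
from itertools import combinations

MISSING = None  # assumption: a gene with no allele call is encoded as None


def allelic_distance(profile_a, profile_b):
    """Count loci where both samples have a call and the alleles differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not MISSING and b is not MISSING and a != b
    )


def upgma(names, profiles):
    """Build a UPGMA tree as nested tuples: (left, right, height)."""
    # Each cluster id maps to (subtree, number of samples in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {
        frozenset((i, j)): float(allelic_distance(profiles[i], profiles[j]))
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair; ties are broken by cluster id.
        pair = min(dist, key=lambda p: (dist[p], sorted(p)))
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        height = dist.pop(pair) / 2
        # Distance to the merged cluster is the size-weighted average.
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j)
        next_id += 1
    ((tree, _),) = clusters.values()
    return tree


# Hypothetical allele profiles: one allele number per gene, None if missing.
names = ["A", "B", "C"]
profiles = [[1, 2, 3, 4], [1, 2, 3, 5], [2, None, 7, 9]]
tree = upgma(names, profiles)
```

In this toy example, samples A and B differ at one gene and are joined first. Sample C differs from each of them at three genes, because the gene missing in C is not counted.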
The aim of 1928’s outbreak analyses is to reproduce results from published scientific studies and thereby demonstrate the quality of the results generated by the 1928 platform.
To demonstrate the accuracy of outbreak analysis with the 1928 platform, a phylogenetic tree was created from an outbreak of Escherichia coli. The tree shown in the figure above was generated using the 1928 analysis platform.
In this analysis, 1928 reproduces results from the scientific paper "Validation of Whole-Genome Sequencing for Identification and Characterization of Shiga Toxin-Producing Escherichia coli To Produce Standardized Data To Enable Data Sharing" (PMID: 29263202) by Holmes A, Dallman TJ, Shabaan S, Hanson M, and Allison L.
The 1928 Escherichia coli cgMLST scheme was created from 622 reference genomes and comprises 2510 core genes. Every gene included in the scheme is present in at least 95% of the reference genomes.
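The 95% presence criterion can be sketched as a simple filter over gene presence in the reference genomes. This is only an illustration of the threshold rule, not how the 1928 scheme was actually built, and scheme construction normally involves further steps that are not shown here. The function name, the input layout (genome id mapped to a set of gene names), and the toy data are assumptions.

```python
from collections import Counter


def select_core_genes(genomes, min_fraction=0.95):
    """Return genes found in at least `min_fraction` of the genomes.

    `genomes` maps a genome id to the set of gene names detected in it
    (a hypothetical input layout chosen for this example).
    """
    n_genomes = len(genomes)
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    return sorted(
        gene for gene, count in counts.items() if count / n_genomes >= min_fraction
    )


# Toy data: 20 genomes; geneA in all 20, geneB in 19, geneC in 18.
genomes = {}
for idx in range(20):
    genes = {"geneA"}
    if idx < 19:
        genes.add("geneB")
    if idx < 18:
        genes.add("geneC")
    genomes[f"genome_{idx}"] = genes

core = select_core_genes(genomes)
```

With 20 genomes, a gene must be present in at least 19 to reach 95%, so `geneA` and `geneB` are kept and `geneC` (18 of 20, 90%) is dropped.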