Publication detail

Identification of highly variable sequence fragments in unmapped reads for rapid bacterial genotyping

NYKRÝNOVÁ, M. BARTOŇ, V. BEZDÍČEK, M. LENGEROVÁ, M. ŠKUTKOVÁ, H.

Original title

Identification of highly variable sequence fragments in unmapped reads for rapid bacterial genotyping

Type

journal article in Web of Science, Jimp

Language

English

Original abstract

Background: Bacterial genotyping is a crucial process in outbreak investigation and epidemiological studies. Several typing methods such as pulsed-field gel electrophoresis, multilocus sequence typing (MLST) and whole genome sequencing are currently used in routine clinical practice. However, these methods are costly, time-consuming and have high computational demands. An alternative to these methods is mini-MLST, a quick, cost-effective and robust method based on high-resolution melting analysis. Nevertheless, no standardized approach to identify markers suitable for mini-MLST exists. Here, we present a pipeline for variable fragment detection in unmapped reads based on a modified hybrid assembly approach using data from one sequencing platform.

Results: In routine assembly against a reference sequence, highly variable reads are not aligned and remain unmapped. If these reads are assembled de novo, variable genomic regions can be located in the resulting scaffolds. Based on the calculated variability rates, it is possible to find a highly variable region with the same discriminatory power as the seven housekeeping gene fragments used in MLST. In the work presented here, we show the capability of identifying one variable fragment in de novo assembled scaffolds of 21 Escherichia coli genomes and three variable regions in scaffolds of 31 Klebsiella pneumoniae genomes. For each identified fragment, the melting temperatures are calculated based on the nearest neighbor method to verify the mini-MLST’s discriminatory power.

Conclusions: A pipeline for a modified hybrid assembly approach consisting of reference-based mapping and de novo assembly of unmapped reads is presented. This approach can be employed for the identification of highly variable genomic fragments in unmapped reads. The identified variable regions can then be used in efficient laboratory methods for bacterial typing such as mini-MLST with high discriminatory power, fully replacing expensive methods such as MLST. The results can be delivered in a shorter time, which allows immediate and fast infection monitoring in clinical practice.
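The abstract does not say how the variability rates are computed, so the following Python sketch only illustrates the general idea of searching aligned fragments for the region that separates the most isolates. It assumes the scaffold regions have already been aligned to a common length; the function name best_window, the fixed window width and the toy sequences are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def best_window(aligned: Dict[str, str], width: int) -> Tuple[int, int]:
    """Return (start, n_alleles) for the window with the most distinct alleles.

    `aligned` maps an isolate name to its sequence in an alignment; all
    sequences must have the same length. Ties go to the leftmost window.
    This is an illustrative stand-in, not the paper's variability metric.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    (length,) = lengths
    if not 0 < width <= length:
        raise ValueError("width must be between 1 and the alignment length")

    best_start, best_count = 0, 0
    for start in range(length - width + 1):
        # Each distinct substring seen in this window counts as one allele.
        alleles = {seq[start:start + width] for seq in aligned.values()}
        if len(alleles) > best_count:
            best_start, best_count = start, len(alleles)
    return best_start, best_count


# Toy data (hypothetical): isolates A and D are identical, B and C each differ
# from them at one position.
toy = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTTCGTAC",
    "isolate_C": "ACGTAGGTAC",
    "isolate_D": "ACGTACGTAC",
}
start, n_alleles = best_window(toy, width=3)
print(start, n_alleles)
```

A window that yields as many distinct alleles as there are MLST sequence types among the isolates would, under this simplified measure, match the discriminatory power of MLST for that set; the melting-temperature check described in the abstract is a separate step and is not covered here.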

Keywords

Bacterial genotyping; Genome assembly; Unmapped reads; De novo assembly; Multilocus sequence typing; Mini-MLST

Authors

NYKRÝNOVÁ, M.; BARTOŇ, V.; BEZDÍČEK, M.; LENGEROVÁ, M.; ŠKUTKOVÁ, H.

Published

29 December 2022

Publisher

BioMed Central Ltd

ISSN

1471-2164

Journal

BMC GENOMICS

Volume

23

Issue

SUPPL 3

Country

United Kingdom of Great Britain and Northern Ireland

First page

1

Last page

12

Number of pages

12

URL

https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-022-08550-4

BibTeX

@article{BUT180515,
  author="Markéta {Jakubíčková} and Vojtěch {Bartoň} and Matěj {Bezdíček} and Martina {Lengerová} and Helena {Vítková}",
  title="Identification of highly variable sequence fragments in unmapped reads for rapid bacterial genotyping",
  journal="BMC GENOMICS",
  year="2022",
  volume="23",
  number="SUPPL 3",
  pages="1--12",
  doi="10.1186/s12864-022-08550-4",
  issn="1471-2164",
  url="https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-022-08550-4"
}