open access publication

Article, 2024

ARGprofiler—a pipeline for large-scale analysis of antimicrobial resistance genes and their flanking regions in metagenomic datasets

Bioinformatics, ISSN 1367-4803, 1367-4811, Volume 40, Issue 3, DOI 10.1093/bioinformatics/btae086

Contributors

  • Martiny H.-M. 0000-0001-6733-7888 (Corresponding author) [1]
  • Pyrounakis N. [1]
  • Petersen T.N. 0000-0002-2484-5716 [1]
  • Lukjancenko O. [1]
  • Aarestrup F.M. 0000-0002-7116-2723 [1]
  • Clausen P.T.L.C. 0000-0002-8197-7520 [1]
  • Munk P. 0000-0001-8813-4019 [1]

Affiliations

  1. [1] Technical University of Denmark
     [NORA names: DTU Technical University of Denmark; University; Denmark; Europe, EU; Nordic; OECD]

Abstract

Motivation: Analyzing metagenomic data can be highly valuable for understanding the function and distribution of antimicrobial resistance genes (ARGs). However, standardized and reproducible workflows are needed to ensure that studies are comparable, because the current options involve various tools and reference databases, each designed with a specific purpose in mind.

Results: In this work, we have created the workflow ARGprofiler to process large amounts of raw sequencing reads for studying the composition, distribution, and function of ARGs. ARGprofiler tackles the challenge of deciding which reference database to use by providing the PanRes database of 14 078 unique ARGs, which combines several existing collections into one. Our pipeline is designed not only to produce abundance tables of genes and microbes but also to reconstruct the flanking regions of ARGs with ARGextender. ARGextender is a bioinformatic approach that combines KMA and SPAdes to recruit reads for a targeted de novo assembly. While our focus is on ARGs, the pipeline also creates Mash sketches for fast searching and comparison of sequencing runs.

Funders

  • Horizon 2020 Framework Programme
  • BY-COVID
  • Novo Nordisk Fonden

Data Provider: Elsevier