AquaaG: Automated Quality Assessment and Annotation of Genomes – A scalable pipeline for prokaryotic and eukaryotic genome retrieval, quality assessment, annotation, and completeness evaluation.

Welcome to the AquaaG Toolkit


AquaaG is an automated, scalable genome annotation pipeline that handles the end-to-end processing of prokaryotic and eukaryotic genomes. The workflow combines NCBI assembly retrieval, assembly quality assessment with QUAST, and organism-specific annotation: Prokka for prokaryotes and BRAKER3 for eukaryotes. It also runs BUSCO to evaluate annotation completeness across major lineage datasets. Runs are configured through YAML files and can target a species, a kingdom, or an explicit list of assemblies, with optional submitter-based filtering for targeted data retrieval. Every step is automated and reproducible, and the pipeline is designed for high-throughput genomic studies. AquaaG produces annotated genome files, quality assessment reports, and metadata summaries, giving a single, unified workflow for large-scale comparative and functional genomics.
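To make the configuration modes and the annotator choice concrete, here is a minimal Python sketch of how a parsed YAML configuration could be turned into an ordered list of steps. It is an illustration only, not AquaaG's actual code: the keys `mode`, `target`, `domain`, and `submitter_filter`, the `RunPlan` class, and the `build_plan` function are all hypothetical names, and the real configuration schema may differ.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical names throughout: AquaaG's real config keys may differ.
VALID_MODES = {"species", "kingdom", "assembly_list"}

# Annotator per organism type, as described in the text above.
ANNOTATORS = {"prokaryote": "Prokka", "eukaryote": "BRAKER3"}


@dataclass
class RunPlan:
    mode: str
    target: str
    domain: str
    submitter_filter: Optional[str]
    steps: List[str]


def build_plan(config: dict) -> RunPlan:
    """Validate a parsed config dict and list the pipeline steps in order."""
    mode = config.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")

    domain = config.get("domain")
    if domain not in ANNOTATORS:
        raise ValueError(f"domain must be one of {sorted(ANNOTATORS)}, got {domain!r}")

    steps = [
        "retrieve assemblies from NCBI",
        "assess assembly quality with QUAST",
        f"annotate with {ANNOTATORS[domain]}",
        "evaluate completeness with BUSCO",
    ]
    return RunPlan(
        mode=mode,
        target=str(config.get("target", "")),
        domain=domain,
        submitter_filter=config.get("submitter_filter"),
        steps=steps,
    )


# A dict standing in for the contents of a parsed YAML file.
config = {
    "mode": "species",
    "target": "Escherichia coli",
    "domain": "prokaryote",
    "submitter_filter": None,
}
plan = build_plan(config)
for step in plan.steps:
    print(step)
```

Validating the mode and organism type before any step runs means a misconfigured batch fails immediately, rather than after assemblies have already been downloaded.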

High-throughput sequencing technologies have sharply increased the number of genome assemblies deposited in public databases, opening new opportunities for comparative and evolutionary research. Despite these advances, many eukaryotic genomes remain unannotated, and researchers often face labour-intensive workflows involving manual retrieval, quality checks, and separate annotation runs. AquaaG addresses these challenges with an integrated, automated pipeline that performs assembly retrieval, QUAST-based quality assessment, Prokka or BRAKER3 annotation, and BUSCO completeness evaluation in a single reproducible workflow. A distinctive feature is the optional India-specific filtering of assemblies, which is based on submitter metadata.
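The submitter-based filter can be pictured as a simple match on each assembly's metadata. The Python sketch below assumes each assembly record is a dict with a `submitter` field. That field name, the `filter_by_submitter` helper, and the sample records are all made up for illustration and do not reflect AquaaG's internal data model or real NCBI entries.

```python
from typing import Iterable, List, Optional


def filter_by_submitter(assemblies: Iterable[dict], keyword: Optional[str]) -> List[dict]:
    """Keep records whose submitter text contains keyword (case-insensitive).

    An empty or missing keyword disables the filter and returns every record.
    """
    if not keyword:
        return list(assemblies)
    needle = keyword.casefold()
    return [
        record
        for record in assemblies
        if needle in (record.get("submitter") or "").casefold()
    ]


# Made-up records for illustration; these are not real NCBI accessions.
records = [
    {"accession": "GCA_EXAMPLE_1", "submitter": "Example Institute, India"},
    {"accession": "GCA_EXAMPLE_2", "submitter": "Example University, Germany"},
    {"accession": "GCA_EXAMPLE_3", "submitter": "Another Centre, INDIA"},
]

selected = filter_by_submitter(records, "India")
print([record["accession"] for record in selected])
# prints ['GCA_EXAMPLE_1', 'GCA_EXAMPLE_3']
```

Plain substring matching is the simplest possible rule and has obvious limits: the keyword "India" would also match a submitter string containing "Indiana". A production filter would likely need a curated list of institutions or a structured country field.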

Get the AquaaG Toolkit here: DOWNLOAD & MANUAL

You can also clone the repository from GitHub.

