AlignNFSeq

End-to-end RNA-seq pipeline orchestration from SRA to differential expression

nf-core/fetchngs, nf-core/rnaseq, STAR, Salmon, limma-voom

The Challenge

RNA-seq experiments generate enormous volumes of raw sequencing data stored in public repositories like SRA and GEO, but getting from accession numbers to analyzable count matrices requires navigating a complex chain of bioinformatics tools: downloading FASTQs, quality control, adapter trimming, genome alignment, and transcript quantification. Each step demands specialized software, correct parameterization, and substantial compute resources.

Cloud computing solves the resource problem but introduces its own complexity: configuring GCP Batch executors, managing storage buckets, handling spot instance preemptions, and monitoring distributed jobs across dozens of samples. Researchers shouldn’t need to become cloud infrastructure engineers to analyze their RNA-seq data.

How AlignNFSeq Helps

AlignNFSeq orchestrates the entire upstream RNA-seq workflow through a single interactive Shiny interface. Paste your SRA accession IDs, pick your organism, and launch — the platform handles everything from FASTQ download through alignment and quantification using battle-tested nf-core Nextflow pipelines.

The platform runs nf-core/fetchngs (v1.12.0) to download FASTQs from SRA/ENA/DDBJ, then automatically chains into nf-core/rnaseq (v3.14.0) for STAR alignment and Salmon quantification. All Nextflow parameters are pre-configured with production-proven defaults refined through hundreds of pipeline runs.
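The chaining described above can be sketched as two Nextflow invocations, where the samplesheet written by the download stage becomes the input of the alignment stage. This is a minimal illustration, not AlignNFSeq's actual implementation: the function names, file paths, and the choice of the `docker` profile are assumptions, and the samplesheet location assumes the default fetchngs output layout.

```python
from pathlib import PurePosixPath

# Pipeline versions as stated in the text.
FETCHNGS_VERSION = "1.12.0"
RNASEQ_VERSION = "3.14.0"


def fetchngs_command(ids_file: str, outdir: str, profile: str = "docker") -> list[str]:
    """Build the argument list for the download stage (hypothetical helper)."""
    return [
        "nextflow", "run", "nf-core/fetchngs",
        "-r", FETCHNGS_VERSION,
        "-profile", profile,
        "--input", ids_file,
        "--outdir", outdir,
        # Ask fetchngs to write a samplesheet in the nf-core/rnaseq format.
        "--nf_core_pipeline", "rnaseq",
    ]


def rnaseq_command(fetch_outdir: str, outdir: str, genome: str,
                   profile: str = "docker") -> list[str]:
    """Build the argument list for alignment and quantification (hypothetical helper)."""
    # Assumed default location of the samplesheet produced by fetchngs.
    samplesheet = PurePosixPath(fetch_outdir) / "samplesheet" / "samplesheet.csv"
    return [
        "nextflow", "run", "nf-core/rnaseq",
        "-r", RNASEQ_VERSION,
        "-profile", profile,
        "--input", str(samplesheet),
        "--outdir", outdir,
        "--genome", genome,
        "--aligner", "star_salmon",
    ]


if __name__ == "__main__":
    print(" ".join(fetchngs_command("ids.csv", "results/fetchngs")))
    print(" ".join(rnaseq_command("results/fetchngs", "results/rnaseq", "GRCh38")))
```

Returning argument lists rather than shell strings keeps the commands safe to pass to a process launcher without quoting issues.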

A dual-mode interface serves both audiences: wet-lab scientists get a 4-step wizard that abstracts away all complexity, while bioinformaticians get full parameter control, log streaming, and resume/cancel capabilities. Both modes support GCP cloud execution (via the Batch API) and local Docker execution for development or smaller datasets.

Built-in differential expression analysis via limma-voom (powered by AlignRNAseqFlow) completes the pipeline — from accession numbers to DE results and pathway enrichment without leaving the application.

What You Receive

FASTQ Downloads

Raw sequencing data automatically downloaded from SRA/ENA/DDBJ with MD5 validation. Organized samplesheet generated for downstream processing.
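MD5 validation of a downloaded file can be sketched as follows. This is an illustration of the general technique only; the function names are hypothetical, and the expected checksum would come from the repository metadata for each run.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid_download(path: str, expected_md5: str) -> bool:
    """Compare the computed digest with the published one (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Reading in fixed-size chunks matters here because FASTQ files are often many gigabytes.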

Aligned BAMs & Counts

STAR-aligned BAM files and Salmon-quantified gene count matrices (TPM and raw counts). Ready for any downstream analysis tool.

MultiQC Reports

Comprehensive quality control reports aggregating FastQC, STAR alignment statistics, Salmon mapping rates, and sample-level metrics in interactive HTML.

DE & Enrichment Results

Optional differential expression analysis with volcano plots, top gene tables, and over-representation analysis (ORA) pathway enrichment, all from within the same interface.
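For readers unfamiliar with ORA, the statistic behind it is commonly a one-sided hypergeometric test: given a set of differentially expressed genes, how surprising is its overlap with a pathway? The sketch below shows that calculation under stated assumptions. The text does not say which ORA implementation AlignNFSeq uses, the function and parameter names are hypothetical, and multiple-testing correction is omitted.

```python
from math import comb


def ora_p_value(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value: P(X >= overlap).

    universe: number of genes tested
    pathway:  number of those genes annotated to the pathway
    selected: number of differentially expressed genes
    overlap:  number of differentially expressed genes in the pathway
    """
    upper = min(pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

Exact integer binomials avoid the rounding issues of floating-point factorials for realistic gene counts.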

Methodology & Infrastructure

nf-core/fetchngs
nf-core/rnaseq
STAR
Salmon
Nextflow
GCP Batch
limma-voom
Docker

AlignNFSeq builds on the nf-core framework: community-curated, peer-reviewed Nextflow pipelines used by thousands of genomics labs worldwide. Every tool runs in a versioned Docker container, so each run can be reproduced with the same software versions.

Cloud execution uses Google Cloud Batch with production-hardened configurations: e2-highmem-16 machine types, automatic retry when a task ends with a Batch exit code in the range 50001-50006 (preemption and related infrastructure failures), and intelligent resource allocation. QC steps that proved problematic (QualiMap, dupRadar) are skipped automatically, based on operational experience from hundreds of pipeline runs.
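The retry rule and the skipped QC steps can be sketched as a small config generator. This is a hedged illustration rather than the platform's actual code: the function names and the retry limit of 3 are assumptions, while the machine type, exit-code range, and skipped tools come from the text. The emitted settings use standard Nextflow process directives and the nf-core/rnaseq `skip_qualimap` and `skip_dupradar` parameters.

```python
# Exit codes the text lists as retryable (50001 to 50006 inclusive).
RETRYABLE_EXIT_CODES = range(50001, 50007)
MAX_RETRIES = 3  # hypothetical limit, not stated in the text


def error_strategy(exit_status: int, attempt: int) -> str:
    """Mirror the retry decision in Python: retry only on the listed exit codes."""
    if exit_status in RETRYABLE_EXIT_CODES and attempt <= MAX_RETRIES:
        return "retry"
    return "terminate"


def render_batch_config(machine_type: str = "e2-highmem-16") -> str:
    """Render a Nextflow config fragment for Google Cloud Batch (illustrative)."""
    lines = [
        "process {",
        "    executor = 'google-batch'",
        f"    machineType = '{machine_type}'",
        f"    maxRetries = {MAX_RETRIES}",
        "    errorStrategy = { task.exitStatus in 50001..50006 ? 'retry' : 'terminate' }",
        "}",
        "params {",
        "    skip_qualimap = true",
        "    skip_dupradar = true",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_batch_config())
```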

The R package architecture separates concerns cleanly: processx for non-blocking pipeline execution, glue for config generation, and bslib for the responsive Shiny interface. Pipeline state persists across browser disconnects — Nextflow manages the actual compute, and AlignNFSeq reconnects to running jobs on session restore.

Cost transparency: Per-sample cost estimates (~$0.50 for fetchngs, ~$8.50 for rnaseq on GCP) are displayed before every launch, with confirmation dialogs to prevent accidental cloud spend.
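As a worked example of the estimate shown before launch, the per-sample figures above can be combined into a simple projection. The function name and the flags for skipping a stage are hypothetical, and the rates are the approximate values quoted in the text, not a billing guarantee.

```python
# Approximate per-sample costs in USD, as quoted in the text.
FETCHNGS_COST_PER_SAMPLE = 0.50
RNASEQ_COST_PER_SAMPLE = 8.50


def estimate_cost(n_samples: int, run_fetchngs: bool = True,
                  run_rnaseq: bool = True) -> float:
    """Return the estimated GCP cost in USD for the selected stages."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    per_sample = 0.0
    if run_fetchngs:
        per_sample += FETCHNGS_COST_PER_SAMPLE
    if run_rnaseq:
        per_sample += RNASEQ_COST_PER_SAMPLE
    return round(n_samples * per_sample, 2)


if __name__ == "__main__":
    # 12 samples through both stages: 12 * (0.50 + 8.50) = 108.00
    print(f"Estimated cost: ${estimate_cost(12):.2f}")
```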

Ideal For

Researchers reanalyzing public RNA-seq datasets from GEO/SRA without command-line bioinformatics
Labs processing new sequencing runs through a standardized, reproducible pipeline
Core facilities needing a consistent interface for RNA-seq processing requests
Studies requiring STAR+Salmon alignment with nf-core best practices
Teams wanting cloud-scale processing (GCP Batch) without infrastructure expertise
Projects that need end-to-end traceability from raw accessions to differential expression
Any bulk RNA-seq experiment where you have SRA IDs and need count matrices

Start Your Analysis

Ready to analyze your data with AlignNFSeq? Submit your project and we'll scope a plan tailored to your experimental design.
