Introduction
Gene Ontology testing with clusterProfiler
- GO terms are divided into Biological Process (BP), Molecular Function (MF) and Cellular Component (CC), which can be analysed separately or together depending on the biological question.
- The enrichGO() and gseGO() functions in clusterProfiler allow users to perform ORA and GSEA using the GO database directly.
- GO testing results highlight gene sets or pathways that are over-represented in your dataset, allowing interpretation of down-regulated or up-regulated genes.
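To make "over-represented" concrete, here is a minimal, language-neutral sketch of the test that ORA is generally based on: an upper-tail hypergeometric test. It is written in plain Python and is not the clusterProfiler implementation. The function name `ora_pvalue` and the toy counts are assumptions made for illustration, and multiple-testing correction is left out.

```python
from math import comb


def ora_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_universe: number of background genes
    n_in_set:   background genes annotated to the term
    n_selected: genes in the input list (e.g. DEGs)
    n_overlap:  input genes annotated to the term
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy example (hypothetical counts): 20 background genes, 5 annotated to a
# term, 5 DEGs, all 5 DEGs annotated. Only one such draw exists out of
# comb(20, 5) = 15504, so the p-value is 1 / 15504.
p_small = ora_pvalue(20, 5, 5, 5)

# With an observed overlap of 0 the tail covers every outcome, so p = 1.
p_one = ora_pvalue(20, 5, 5, 0)
```

The result depends strongly on the background (`n_universe`), which is why the choice of universe matters when running ORA.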
KEGG enrichment analysis with clusterProfiler
- KEGG pathway analysis helps link DEGs to functional biological pathways.
- Both ORA (enrichKEGG) and GSEA-style (gseKEGG) methods provide complementary insights.
- pathview enables visual interpretation of pathway-level expression changes.
Gene set enrichment analysis with fgsea
- GSEA evaluates enrichment across a ranked list of all genes, not just a subset of significant ones.
- The fgsea package provides a fast implementation of GSEA suitable for large RNA-seq datasets.
- A positive normalised enrichment score (NES) indicates enrichment among up-regulated genes, while a negative NES indicates enrichment among down-regulated genes (assuming genes are ranked from most up-regulated to most down-regulated).
- plotGseaTable() and plotEnrichment() help visualise how pathways behave across the ranked gene list.
- Compared with clusterProfiler's GSEA functions, fgsea focuses on speed and flexibility, while clusterProfiler provides tighter integration with specific databases (e.g., GO, KEGG) and additional plotting helpers.
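The sign convention for enrichment scores can be illustrated with a small sketch of the classic weighted running-sum statistic that GSEA is built on. This is plain Python and not the fgsea algorithm. The function name `enrichment_score`, the weighting exponent `p`, and the toy gene names are assumptions. It returns the raw enrichment score (ES) only. The NES is obtained by normalising the ES against a null distribution, which this sketch does not attempt. The sign carries over to the NES.

```python
def enrichment_score(ranked_genes, stats, gene_set, p=1.0):
    """Signed maximum deviation of the GSEA running sum.

    ranked_genes: gene IDs sorted by decreasing ranking statistic
    stats:        ranking statistic for each gene, in the same order
    gene_set:     set of gene IDs making up the pathway
    p:            weighting exponent (p=1 weights hits by |statistic|)
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_total = sum(abs(s) ** p for s, hit in zip(stats, in_set) if hit)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, best = 0.0, 0.0
    for s, hit in zip(stats, in_set):
        if hit:
            running += abs(s) ** p / hit_total  # step up on pathway genes
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running                      # keep the largest deviation
    return best


# Toy ranked list (hypothetical genes), sorted from most up-regulated
# to most down-regulated.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

es_up = enrichment_score(genes, scores, {"g1", "g2"})    # set at the top
es_down = enrichment_score(genes, scores, {"g5", "g6"})  # set at the bottom
```

Here `es_up` is positive because the pathway genes sit at the top of the ranking, and `es_down` is negative because they sit at the bottom.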
Analysis with RegEnrich
- RegEnrich helps identify potential regulatory drivers (e.g. TFs) behind observed gene expression changes.
- The package’s built-in TF dataset (data(TFs)) is human-specific and not suitable for mouse RNA-seq analysis.
- For mouse data, a mouse-specific TF list (e.g. from TcoF-DB) must be supplied via the reg argument.
- A RegenrichSet object requires: an expression matrix, sample metadata, a regulator list, and a design/contrast specification.
Interaction networks with STRINGdb
- STRINGdb links your genes to protein–protein interaction networks from the STRING database.
- Mapping from gene IDs (e.g. ENTREZ) to STRING IDs is a crucial first step.
- Network visualisation can reveal modules of interconnected DE genes that may not be obvious from lists or tables.
- STRING provides its own functional enrichment, which can complement results from clusterProfiler and fgsea.
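The two steps named above, ID mapping followed by looking for interconnected groups, can be sketched without the STRINGdb package. The example below is plain Python with made-up data: the gene names, the "sp.P…" identifiers, the edge list, and the helpers `map_ids` and `find_modules` are all hypothetical. It treats connected components as a simple stand-in for "modules", whereas real analyses often use dedicated clustering methods.

```python
from collections import defaultdict, deque


def map_ids(gene_ids, id_map):
    """Split genes into those with a network ID and those without one."""
    mapped = {g: id_map[g] for g in gene_ids if g in id_map}
    unmapped = [g for g in gene_ids if g not in id_map]
    return mapped, unmapped


def find_modules(nodes, edges):
    """Connected components among `nodes`, using only edges between them."""
    node_set = set(nodes)
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in node_set and b in node_set:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        queue, component = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical DE genes, ID table and interaction edges.
de_genes = ["geneA", "geneB", "geneC", "geneD", "geneX"]
id_table = {"geneA": "sp.P1", "geneB": "sp.P2", "geneC": "sp.P3", "geneD": "sp.P4"}
interactions = [("sp.P1", "sp.P2"), ("sp.P2", "sp.P3"), ("sp.P4", "sp.P9")]

mapped, unmapped = map_ids(de_genes, id_table)
modules = find_modules(list(mapped.values()), interactions)
```

Genes that fail to map (here `geneX`) silently drop out of the network, so it is worth checking `unmapped` before interpreting any modules.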
Conclusion
- Enrichment methods help translate gene-level changes into biological meaning.
- Different tools (ORA, GSEA, network-based methods) answer different but complementary questions.
- Combining methods provides stronger and more interpretable biological insights.
- Functional enrichment is an essential component of any RNA-seq analysis workflow.