Software & Tools

Collaborative Data Spaces


ImmuneSpace is a data management and analysis engine where standardized datasets can be explored and analyzed using state-of-the-art computational tools. All data housed here are generated by the Human Immunology Project Consortium (HIPC) program.

Learn More About ImmuneSpace

ImmuneSpace on Twitter
The Journal of Immunology — ImmuneSpace: Enabling integrative modeling of human immunological data

CAVD DataSpace

DataSpace is a data sharing and discovery tool developed to empower HIV vaccine researchers. This LabKey-based software application is designed to facilitate self-guided data exploration across studies and to increase awareness of the scientific questions being evaluated in the field of HIV vaccines. Currently, binding antibody, neutralizing antibody, and cellular immunoassay results from over 192 vaccine products tested in 64 studies conducted in the CAVD have been harmonized and are available for exploration and download. Data are included from both clinical trials and studies of non-human primates and other animals.
CAVD DataSpace on Twitter

R Packages


ImmuneSpaceR is a thin wrapper around Rlabkey for accessing the ImmuneSpace database from R.

This package simplifies access to the HIPC ImmuneSpace database for R programmers. It takes advantage of the standardization of the database to hide the Rlabkey-specific code from the user. Study-specific datasets can be accessed via an object-oriented paradigm.


DataSpaceR is a thin wrapper around Rlabkey for accessing the CAVD DataSpace database from R. This package simplifies access to the database for R programmers.

It takes advantage of the standardization of the database to hide the Rlabkey-specific code from the user. Study-specific datasets can be accessed via an object-oriented paradigm.


OpenCyto is an R package that provides an automated data analysis pipeline for flow cytometry.

Learn More About OpenCyto

PLoS Computational Biology — OpenCyto: An Open Source Infrastructure for Scalable, Robust, Reproducible, and Automated, End-to-End Flow Cytometry Data Analysis


MAST is an R/Bioconductor package for managing and analyzing qPCR and sequencing-based single-cell gene expression data, as well as data from other types of single-cell assays. Our goal is to support, in a flexible manner, assays that have multiple features (genes, markers, etc.) per well (cell, etc.). Assays are assumed to be mostly complete in the sense that most wells contain measurements for all features.

Learn More About MAST

Genome Biology — MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data


COMPASS is a statistical framework that enables unbiased analysis of antigen-specific T-cell subsets. COMPASS uses a Bayesian hierarchical framework to model all observed cell-subsets and select the most likely to be antigen-specific while regularizing the small cell counts that often arise in multi-parameter space. The model provides a posterior probability of specificity for each cell subset and each sample, which can be used to profile a subject’s immune response to external stimuli such as infection or vaccination.

Learn More About COMPASS

Nature Biotechnology — COMPASS identifies T-cell subsets correlated with clinical outcomes


MIMOSA is a package for fitting mixtures of beta-binomial or Dirichlet-multinomial models to paired count data from single-cell assays, as typically appear in immunological studies (e.g., intracellular cytokine staining (ICS) assays or Fluidigm Biomark single-cell gene expression assays).

The method is generally more sensitive and specific in detecting differences between conditions (e.g., stimulated vs. unstimulated samples) than alternative approaches such as Fisher's exact test or empirical ad hoc methods such as ranking by log-fold change.
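To make the comparison concrete, here is a minimal Python sketch of the two baseline approaches named above, Fisher's exact test and ranking by log-fold change, applied to paired stimulated/unstimulated counts. It does not implement MIMOSA's mixture model and does not use the MIMOSA package. The function names, the pseudocount of 0.5, and the cell counts are all hypothetical and chosen only for illustration.

```python
from math import comb, log2


def fisher_exact_greater(stim_pos, stim_total, unstim_pos, unstim_total):
    """One-sided Fisher's exact test p-value.

    Probability, under the null of equal positive-cell proportions, of seeing
    at least `stim_pos` positive cells in the stimulated sample, given the
    table margins (hypergeometric upper tail).
    """
    total_pos = stim_pos + unstim_pos
    total = stim_total + unstim_total
    upper = min(total_pos, stim_total)
    tail = sum(
        comb(total_pos, k) * comb(total - total_pos, stim_total - k)
        for k in range(stim_pos, upper + 1)
    )
    return tail / comb(total, stim_total)


def log2_fold_change(stim_pos, stim_total, unstim_pos, unstim_total, pseudo=0.5):
    """Log2 ratio of positive-cell proportions.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a sample has no positive cells.
    """
    stim_rate = (stim_pos + pseudo) / (stim_total + 2 * pseudo)
    unstim_rate = (unstim_pos + pseudo) / (unstim_total + 2 * pseudo)
    return log2(stim_rate / unstim_rate)


# Hypothetical paired counts per subject:
# (stimulated positive, stimulated total, unstimulated positive, unstimulated total)
subjects = {
    "subject_A": (12, 1000, 3, 1000),
    "subject_B": (4, 1000, 3, 1000),
}

for name, counts in subjects.items():
    p = fisher_exact_greater(*counts)
    lfc = log2_fold_change(*counts)
    print(f"{name}: p = {p:.4f}, log2 fold change = {lfc:.2f}")
```

Both baselines treat each subject's pair of counts in isolation, whereas a mixture model of the kind described above fits all subjects jointly.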

Learn More About MIMOSA

Biostatistics — Mixture models for single-cell assays with applications to vaccine studies


DataPackageR aims to simplify data package construction.

It provides mechanisms for reproducibility, data processing, and tidying raw data into documented, versioned, and packaged analysis-ready data sets.

Learn More About DataPackageR

Gates Open Research — DataPackageR: Reproducible data preprocessing, standardization and sharing using R/Bioconductor for collaborative data analysis