Cell Type Identification

Processing and clustering

We treat protein intensity measurements and scalar spatial parameters for each segmented cell as separate data modalities, each of which is separately arcsine transformed, scaled, and, if needed, batch-corrected. Cells are then clustered with Phenograph. Alternatively, Seurat Weighted Nearest Neighbor (WNN) analysis can be used to cluster cells and project them into two-dimensional space using both modalities, which also allows flexibility in integrating multiple datasets.
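The per-modality transform and scaling step can be illustrated with a minimal sketch. It assumes the transform is the inverse hyperbolic sine (arcsinh) commonly applied to cytometry-style intensities, and that scaling means z-scoring each feature. The function name `arcsinh_scale`, the cofactor of 5, and the feature names are hypothetical and not taken from the pipeline. Batch correction and clustering are omitted.

```python
import math
import statistics


def arcsinh_scale(columns, cofactor=5.0):
    """Arcsinh-transform, then z-score, each feature of one modality.

    `columns` maps a feature name to its per-cell values.
    `cofactor` is an assumed default, not a pipeline setting.
    """
    scaled = {}
    for name, values in columns.items():
        transformed = [math.asinh(v / cofactor) for v in values]
        mean = statistics.fmean(transformed)
        sd = statistics.pstdev(transformed)
        # A constant feature has zero spread; map it to zeros
        # instead of dividing by zero.
        scaled[name] = [(t - mean) / sd if sd > 0 else 0.0 for t in transformed]
    return scaled


# Toy values for three cells; the feature names are made up.
protein = {"CD3": [0.0, 10.0, 250.0], "CD20": [5.0, 5.0, 5.0]}
spatial = {"area": [120.0, 95.0, 310.0]}

# Each modality is processed on its own and kept side by side.
processed = {
    "protein": arcsinh_scale(protein),
    "spatial": arcsinh_scale(spatial),
}
```

Keeping the modalities in separate containers mirrors the text above: each one can be transformed and scaled independently before a clustering step consumes one or both of them.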

Cell type classification from reference

Automated cell type classification is achieved by comparing all protein intensities with a reference dataset. The reference can be previously labeled scRNA-seq or PySeq2500 4i data, stored as a Seurat object in RDS format. Both the reference and the metadata column that contains its cell type information are specified in the experiment config/config.yaml file. The underlying algorithm, Seurat Canonical Correlation Analysis (CCA), has been shown to perform well in cross-modal reference mapping. Cell type results are saved as a CSV file in the output tables directory, to be imported into anndata for downstream steps. If no appropriate reference is provided, the clustering results from above are saved instead.
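The fallback behavior described above can be sketched as follows. This is an illustration only: the function name `write_cell_types` and the column names `cell_id`, `cell_type`, and `source` are hypothetical, and the pipeline's actual CSV layout may differ. The reference mapping itself is not shown.

```python
import csv
import io


def write_cell_types(handle, cell_ids, cluster_labels, predicted=None):
    """Write one label per cell, preferring reference-mapped cell types.

    Falls back to the clustering labels when no prediction is supplied.
    Column names are assumptions made for this sketch.
    """
    use_reference = predicted is not None
    labels = predicted if use_reference else cluster_labels
    source = "reference" if use_reference else "cluster"
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["cell_id", "cell_type", "source"])
    for cell_id, label in zip(cell_ids, labels):
        writer.writerow([cell_id, label, source])


# No reference available: the cluster labels are written instead.
buffer = io.StringIO()
write_cell_types(buffer, ["c1", "c2"], cluster_labels=[0, 1])
csv_text = buffer.getvalue()
```

Recording where each label came from in the same table is one possible design. It lets a downstream step distinguish mapped cell types from plain cluster IDs after the table is loaded.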