The **scRNAseq basic** analysis was conducted according to the workflow outlined below:
1. QC: min.cells = 20; min_feature: between 200 and 6000; max pct.mito = 20.
2. Normalization (default Seurat settings) and scaling, regressing out the following variables: percent.mt, nCount_RNA and CC.Difference, calculated as shown in the [vignette](https://satijalab.org/seurat/articles/cell_cycle_vignette.html#alternate-workflow-1)
3. Dimensionality reduction: PCA (top 25 PCs, using the 15% most variable genes of the whole gene list)
4. Harmony batch-effect removal (sample/orig.ident)
5. Clustering (Louvain with multilevel refinement: algorithm = 2 in the Seurat FindClusters function)
6. Marker identification (resolution 0.6, plus additional subclustering of clusters 5, 7 and 9 with the Seurat FindSubCluster function)
7. Automatic cell type annotation using SingleR (v1.8.1), with Sakurai et al. (23) as the reference dataset.
Custom annotation of cells was then performed according to the Sakurai classification and marker inspection, providing two annotation levels: the Classification variable (more refined) and the Population variable (less granular), as shown in Fig. 2C.
The complete list of markers produced by the Seurat FindAllMarkers function for all resolutions and annotation variables (logfc.threshold = 0, min.pct = 0.2) is included in Data file S2.
See [FindAllMarkers_Population_dataset_Fig2C.R](http://www.bioinfotiget.it/gitlab/custom/zonari_mpbhscexp_2025/zonari_mpbhscexp_2025_scrnaseq/-/blob/main/FindAllMarkers_Population_dataset_Fig2C.R) and the result in .rds format: **Full_GSEA_markers_Population.rds**
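
The QC thresholds in step 1 can be illustrated with a minimal, language-agnostic sketch in plain Python. It is not the code used for the analysis, which was run in R with Seurat. The function name `qc_filter`, the `MT-` prefix for mitochondrial genes, the inclusive boundaries and the order of the two filters are all assumptions made for illustration.

```python
from collections import Counter


def qc_filter(counts, min_cells=20, min_features=200, max_features=6000,
              max_pct_mito=20.0, mito_prefix="MT-"):
    """Filter a toy count table given as {cell: {gene: count}}.

    Defaults mirror the thresholds listed in step 1; boundaries are
    treated as inclusive, which is an assumption.
    """
    # Keep genes detected (count > 0) in at least `min_cells` cells.
    cells_per_gene = Counter(
        gene for genes in counts.values() for gene, n in genes.items() if n > 0
    )
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    filtered = {}
    for cell, genes in counts.items():
        expr = {g: n for g, n in genes.items() if n > 0 and g in kept_genes}
        total = sum(expr.values())
        mito = sum(n for g, n in expr.items() if g.startswith(mito_prefix))
        pct_mito = 100.0 * mito / total if total else 0.0
        # Keep cells within the feature range and below the mito cutoff.
        if min_features <= len(expr) <= max_features and pct_mito <= max_pct_mito:
            filtered[cell] = expr
    return filtered


# Toy data with small thresholds so the effect of each filter is visible.
toy = {
    "c1": {"G1": 3, "G2": 1, "MT-CO1": 1},  # 20% mito, kept
    "c2": {"G1": 1, "MT-CO1": 9},           # 90% mito, dropped
    "c3": {"G1": 2, "G2": 2, "G3": 5},      # G3 seen in one cell only
}
toy_filtered = qc_filter(toy, min_cells=2, min_features=2, max_features=10)
print(sorted(toy_filtered))
```

In this toy run, cell `c2` is removed for its mitochondrial fraction and gene `G3` is removed because it is detected in a single cell.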
#### Evaluation of expansion culture on mPB CD34+ cells
To assess the effect of expansion culture on mPB CD34+ cells we identified genes
significantly deregulated across culture timepoints starting from uncultured (day0) cells to day4
and day8 expansion timepoints. To correct for differences in cell population abundances, we
performed all comparisons across timepoints within each cell population (Classification label).
First, we identified differentially expressed genes between timepoints (day4 vs day0, day8 vs day4
and day8 vs day0) with the FindMarkers function from the Seurat R package. According to the fold change
direction, we defined patterns of modulation that can be summarized as up- or down-regulated
across culture (consistently up/down, or early or late up/down, according to day4 and day8
significance and direction). We next looked for a more comprehensive set of genes that could
summarize a global culture effect across most populations, and selected sets of common up- and
down-regulated genes shared by at least 4 of the 7 populations identified by our Population label.
The lists of common up/down-regulated genes across culture were used as input for over-representation
analysis (ORA) using the clusterProfiler R package (v4.7.1).
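
The pattern assignment and the "at least 4 out of 7 populations" selection described above can be sketched as follows. This is a plain-Python illustration, not the R code used in the analysis. The exact rule behind each pattern is not spelled out in the text, so the rule in `classify` (based on the day4 vs day0 and day8 vs day0 contrasts), the 0.05 adjusted p-value cutoff, the function names, and the population and gene names are all hypothetical.

```python
from collections import Counter

PADJ_CUTOFF = 0.05  # assumed significance cutoff, not stated in the text


def direction(log2fc, padj, cutoff=PADJ_CUTOFF):
    """Return +1 if significantly up, -1 if significantly down, 0 otherwise."""
    if padj >= cutoff or log2fc == 0:
        return 0
    return 1 if log2fc > 0 else -1


def classify(day4_vs_day0, day8_vs_day0):
    """Assign a pattern from two (log2fc, padj) pairs; the rule is an assumption."""
    d4, d8 = direction(*day4_vs_day0), direction(*day8_vs_day0)
    if d4 == 0 and d8 == 0:
        return None  # never significant
    if d4 != 0 and d8 != 0 and d4 != d8:
        return None  # opposite directions: left unclassified in this sketch
    sign = "up" if (d4 or d8) > 0 else "down"
    if d4 != 0 and d8 != 0:
        return "consistent_" + sign
    return ("early_" if d4 != 0 else "late_") + sign


def simplify(pattern):
    """Collapse a detailed pattern into 'up' or 'down'."""
    return pattern.rsplit("_", 1)[-1]


def shared_genes(genes_by_population, min_populations=4):
    """Genes present in at least `min_populations` of the per-population sets."""
    counts = Counter(g for genes in genes_by_population.values() for g in set(genes))
    return sorted(g for g, n in counts.items() if n >= min_populations)


# Toy input: 7 hypothetical populations with placeholder gene names.
up_by_population = {
    "pop1": {"GENE_A", "GENE_B"},
    "pop2": {"GENE_A", "GENE_B"},
    "pop3": {"GENE_A", "GENE_C"},
    "pop4": {"GENE_A"},
    "pop5": {"GENE_B"},
    "pop6": {"GENE_C"},
    "pop7": set(),
}
common_up = shared_genes(up_by_population)
print(common_up)
```

Only `GENE_A` is up-regulated in four of the seven toy populations, so it is the only gene retained. A list built this way, together with its down-regulated counterpart, is the kind of input passed to ORA.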
Code details and input for Supplementary Figure 2 can be found in: