# Amodio_Infertility_2023
Amodio G, Giacomini G, Boeri L, et al. **T cell exhaustion and senescence signatures characterize and differentiate infertile men**

### Single-Cell RNA Sequencing Analysis ###
**scRNAseq** (from 10X Genomics) analysis of CD3+ T cells purified from the peripheral blood of men diagnosed with oligo-astheno-teratozoospermia (OAT, n=4), idiopathic non-obstructive azoospermia (iNOA, n=6), and a control group (FER, n=5).

scRNAseq analysis was performed using a standard [Seurat](https://satijalab.org/seurat/) pipeline. It covers the following steps, from a minimal object created after loading the 10X data through to marker identification:

   - Preprocessing and cell filtering
      - Each sample was pre-processed separately. Cells with a mitochondrial RNA percentage higher than 10%, or with fewer than 1200 or more than 6000 detected features, were filtered out. The samples were then merged into a single Seurat dataset.  
   - Normalization 
      - Default Seurat settings [(NormalizeData function)](https://satijalab.org/seurat/reference/normalizedata) 
   - Scaling: 
      - During scaling, the following variables were regressed out: UMI count, the percentage of mitochondrial genes, and the difference between the S and G2/M cell cycle phase scores, as described in the Seurat [vignette](https://satijalab.org/seurat/articles/cell_cycle_vignette.html#alternate-workflow-1).
   - Dimensionality reduction and Harmony batch removal:
      - A principal component analysis (PCA) with 100 principal components (PCs) was performed. Batch effects were removed with Harmony, using orig.ident as the batch variable, and the UMAP representation and the clusters were computed on the top 55 components.
   - Clustering: 
      - A K-nearest neighbor (KNN) graph was first constructed based on the Euclidean distance using the [FindNeighbors](https://satijalab.org/seurat/reference/findneighbors) function, with the number of nearest neighbors (k.param) set to 20.
      - Modularity optimization was then applied using the Louvain algorithm through the [FindClusters](https://satijalab.org/seurat/reference/findclusters) function, with the resolution parameter set to 1.2.
   - Marker identification:
      - Marker genes for each cluster were identified using the [FindAllMarkers](https://satijalab.org/seurat/reference/findallmarkers) function with the logfc.threshold argument set to 0.25. Only genes expressed in at least 25% of cells in at least one of the compared groups were considered (min.pct = 0.25). Genes with p-values < 1 × 10<sup>-5</sup> from the Wilcoxon Rank Sum test were considered markers for a specific cluster.
      - Cluster annotation 
   - Gene set enrichment analysis (GSEA):
      - Intra-cluster comparisons: within each cluster, the experimental conditions were compared using the [FindMarkers](https://satijalab.org/seurat/reference/findmarkers) function, setting test.use = wilcox, logfc.threshold = 0, min.cells.group = 5 and the return.thresh parameter equal to 1.
      - The GSEA function of the [clusterProfiler R package](https://bioconductor.org/packages/release/bioc/manuals/clusterProfiler/man/clusterProfiler.pdf) was applied to the full marker gene list ranked by decreasing logFC, using the hallmark gene sets. Gene sets were considered enriched if their adjusted p-value was < 0.1.
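
The steps above can be sketched in R roughly as follows. This is a minimal illustration under stated assumptions, not the code in the repository scripts. The thresholds and parameter values come from the list above. Everything else is an assumption made for the example: the helper and function names, the `^MT-` mitochondrial gene pattern, the `condition` metadata column, the call to `FindVariableFeatures`, running UMAP and clustering on the Harmony embedding, and the `avg_log2FC` column name (which depends on the Seurat version). The two small helpers run in base R; the other functions need Seurat, harmony and clusterProfiler only when they are called.

```r
# Sketch only: names below are hypothetical, parameter values are from the README.

# QC rule: keep cells with <= 10% mitochondrial RNA and 1200-6000 features.
keep_cell <- function(percent_mt, n_feature) {
  percent_mt <= 10 & n_feature >= 1200 & n_feature <= 6000
}

# Named numeric vector sorted by decreasing logFC, the input format
# expected by clusterProfiler::GSEA().
rank_by_logfc <- function(genes, logfc) {
  sort(setNames(logfc, genes), decreasing = TRUE)
}

# `obj` is assumed to be a merged Seurat object with sample names in orig.ident.
run_pipeline_sketch <- function(obj) {
  # Assumption: human mitochondrial genes are prefixed with "MT-".
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  keep <- keep_cell(obj$percent.mt, obj$nFeature_RNA)
  obj <- subset(obj, cells = colnames(obj)[keep])

  obj <- Seurat::NormalizeData(obj)          # default settings
  obj <- Seurat::FindVariableFeatures(obj)   # assumption: not stated in the README

  # Cell cycle scores, then regress out their difference (alternate workflow).
  obj <- Seurat::CellCycleScoring(obj,
                                  s.features = Seurat::cc.genes$s.genes,
                                  g2m.features = Seurat::cc.genes$g2m.genes)
  obj$CC.Difference <- obj$S.Score - obj$G2M.Score
  obj <- Seurat::ScaleData(
    obj, vars.to.regress = c("nCount_RNA", "percent.mt", "CC.Difference"))

  obj <- Seurat::RunPCA(obj, npcs = 100)
  obj <- harmony::RunHarmony(obj, group.by.vars = "orig.ident")

  # Assumption: the "top 55 components" are taken from the Harmony embedding.
  obj <- Seurat::RunUMAP(obj, reduction = "harmony", dims = 1:55)
  obj <- Seurat::FindNeighbors(obj, reduction = "harmony", dims = 1:55,
                               k.param = 20)
  obj <- Seurat::FindClusters(obj, resolution = 1.2)  # Louvain is the default

  markers <- Seurat::FindAllMarkers(obj, logfc.threshold = 0.25,
                                    min.pct = 0.25, test.use = "wilcox")
  list(object = obj, markers = markers)
}

# Intra-cluster comparison between two conditions, followed by GSEA.
# `term2gene` is a two-column data frame (gene set, gene), e.g. hallmark sets.
run_gsea_sketch <- function(obj, cluster, cond_a, cond_b, term2gene) {
  de <- Seurat::FindMarkers(obj, ident.1 = cond_a, ident.2 = cond_b,
                            group.by = "condition", subset.ident = cluster,
                            test.use = "wilcox", logfc.threshold = 0,
                            min.cells.group = 5)
  ranked <- rank_by_logfc(rownames(de), de$avg_log2FC)
  clusterProfiler::GSEA(geneList = ranked, TERM2GENE = term2gene,
                        pvalueCutoff = 0.1)
}
```

Keeping the QC rule and the ranking step as plain functions makes them easy to check on toy values before running the full pipeline, for example `keep_cell(c(5, 12), c(3000, 3000))`.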

  
### Directories and Files ###
- sampleSheet.csv: names of samples and corresponding conditions

- **Script**: R scripts used for the analyses
  - `1_PreProcessing_Data.R`: Preprocessing, cell filtering and Full object creation
  - `2_Infertility_scRNAseq_analysis.R`: 
  - `3_SubsetAnalysis_Annotations.R`: subset analyses of T cells and cluster manual annotation
  - `4_ScoreAnalysis_TcellSubset.R`: 
  - `5_Visualization_Export_Data.R`: 
- **Data**: results of scRNAseq analysis: 





