**scVAR** is a computational tool for extracting and integrating genetic variants from single-cell RNA-seq (scRNA-seq) data. It uses variational autoencoders to construct a latent space that combines transcriptional and genetic signals, helping to resolve cellular heterogeneity — particularly in complex diseases such as leukemia.
## 🔍 Motivation
Leukemias like AML and B-ALL exhibit high genetic and transcriptomic heterogeneity, making clonal analysis particularly challenging. Although scRNA-seq is widely used to study gene expression, it also contains valuable information on genetic variants. **scVAR** leverages this dual information to jointly analyze transcriptional and genetic signals from the same dataset, without requiring matched DNA sequencing.
## 🧠 What It Does
- Detects expressed genetic variants directly from scRNA-seq data
- Integrates transcriptomic and variant information using multi-input variational autoencoders
- Builds a shared latent space capturing both omics layers
- Enhances detection of rare subclones and subtle transcriptional states
- Recovers structure often missed when analyzing transcriptomic or genomic data in isolation
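
To make the idea of a multi-input encoder with a shared latent space concrete, here is a minimal, dependency-free sketch of the encoding step for a single cell. It is an illustration under stated assumptions, not scVAR's actual architecture: the layer sizes, the random (untrained) weights, the concatenation-based fusion, and all function names are hypothetical, and the decoder and training loop are omitted. A real implementation would normally use a deep learning framework.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Small random weights; a trained model would learn these instead."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode_cell(expr, variants, params, rng):
    """Map one cell's expression and variant vectors to a shared latent vector."""
    h_expr = relu(linear(expr, *params["expr_encoder"]))    # expression branch
    h_var = relu(linear(variants, *params["var_encoder"]))  # variant branch
    h = h_expr + h_var                                       # concatenate both branches
    mu = linear(h, *params["mu"])
    logvar = linear(h, *params["logvar"])
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]
    # KL divergence between N(mu, sigma^2) and the standard normal prior
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))
    return z, mu, logvar, kl


rng = random.Random(0)
n_genes, n_variants, n_hidden, n_latent = 6, 4, 5, 2  # toy sizes (assumed)
params = {
    "expr_encoder": init_layer(rng, n_genes, n_hidden),
    "var_encoder": init_layer(rng, n_variants, n_hidden),
    "mu": init_layer(rng, 2 * n_hidden, n_latent),
    "logvar": init_layer(rng, 2 * n_hidden, n_latent),
}

expr = [0.0, 1.2, 0.0, 3.1, 0.4, 0.0]  # toy normalized expression for one cell
variants = [1.0, 0.0, 0.0, 1.0]        # toy variant indicators for the same cell
z, mu, logvar, kl = encode_cell(expr, variants, params, rng)
print("latent vector:", z)
```

Each modality gets its own encoder branch, and the two hidden representations are fused before the latent mean and log-variance are computed, so a single vector `z` reflects both signals. Concatenation is only one possible fusion strategy and is chosen here for simplicity.
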
## 📊 Use Cases
- Clonal architecture analysis in AML and B-ALL
- Interpretation of relapse samples
- Joint modeling of gene expression and mutational signals
- Effective use of sparse variant data from 10x Genomics 5′ scRNA-seq
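
Because variant coverage in scRNA-seq is sparse, it helps to keep "not observed" distinct from "reference allele observed". The sketch below shows one way to do that, assuming variants have already been called upstream and are available as per-cell read counts. The record layout `(cell, variant, alt_reads, ref_reads)`, the `min_depth` threshold, the function name, and the toy identifiers are all hypothetical and are not part of scVAR's interface.

```python
from collections import defaultdict


def allele_fractions(records, min_depth=2):
    """Build a sparse cell -> variant -> alt-allele-fraction mapping.

    Sites covered by fewer than `min_depth` reads are omitted, so a missing
    entry means "not observed", not "reference".
    """
    counts = defaultdict(lambda: [0, 0])  # (cell, variant) -> [alt, ref]
    for cell, variant, alt, ref in records:
        counts[(cell, variant)][0] += alt
        counts[(cell, variant)][1] += ref

    matrix = defaultdict(dict)
    for (cell, variant), (alt, ref) in counts.items():
        depth = alt + ref
        if depth >= min_depth:
            matrix[cell][variant] = alt / depth
    return dict(matrix)


# Made-up records for illustration only
records = [
    ("CELL_A", "chr1:1000:A>G", 2, 1),
    ("CELL_A", "chr1:1000:A>G", 1, 0),  # same site again: counts are summed
    ("CELL_A", "chr2:2000:C>T", 0, 1),  # depth 1: dropped as too shallow
    ("CELL_B", "chr1:1000:A>G", 0, 5),  # covered, reference only
]
matrix = allele_fractions(records)
print(matrix)
```

A dict-of-dicts stores only covered sites, which suits data where most cell and variant pairs have no reads at all. For larger datasets a sparse matrix type would be the more usual choice.
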
## 📁 Data & Results
In AML samples, **scVAR** identified subclones with distinct transcriptional programs that were not detectable using gene expression or variant data alone. In B-ALL, it revealed fine-grained cellular structures and helped disentangle overlapping transcriptional and genetic signals.
## 🚀 Getting Started
Example workflows are provided in the `notebooks/` folder.

To install dependencies:
```
pip install -r requirements.txt
```
## 🛠️ Installation
**Note:** scVAR requires **Python == 3.10**.

To install **scVAR**, create a new environment using `mamba` and install the package from source. The commands below assume the repository has already been cloned into a local `scvar` directory:
```
mamba create -n scvar_env python=3.10
mamba activate scvar_env
cd scvar
pip install .
```
## 📜 License
Distributed under the MIT License. See the `LICENSE` file for more information.