BAR-Seq: Analysis of HDR Barcoded Cells

Clonal Tracking of Cells Gene Edited by Homology-Directed Repair

This help page describes how to run the BAR-Seq pipeline on this website.

Input options:

Input Fastq Files

Click on Browse to select the FASTQ files to upload, one for each sample.

The list of selected files will appear on the right side, along with their total size.

Note that the files must already be pre-processed (quality filtering, adapter trimming, ...).

Read Structure

Specify the structure of the amplicon in which barcodes are inserted:

Upstream Seq.
Nucleotide sequence preceding the barcode in the amplicon.
Downstream Seq.
Nucleotide sequence following the barcode in the amplicon.

Note that the upstream and downstream sequences together should cover the entire amplicon for better barcode extraction.
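As an illustration of how the two flanking sequences delimit the barcode, the following minimal sketch extracts whatever lies between them in a read. It assumes exact matching of both flanks; the function name `extract_barcode` is hypothetical, the downstream flank is shortened for readability, and the pipeline's actual extraction method may differ (for example, it may tolerate mismatches).

```python
def extract_barcode(read, upstream, downstream):
    """Return the sequence between the two flanks, or None if a flank is missing.

    Assumption: both flanks must match exactly; this is only a sketch.
    """
    start = read.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    # Search for the downstream flank only after the end of the upstream one.
    end = read.find(downstream, start)
    if end == -1:
        return None
    return read[start:end]


# Upstream flank taken from the example on this page; the downstream flank
# is a shortened, illustrative prefix of the example downstream sequence.
up = "CAGGGGATGCGGTGGGCTCTATGG"
down = "TAGGGACAGG"
read = "AC" + up + "ACGTACGTAC" + down + "TT"
barcode = extract_barcode(read, up, down)
print(barcode)
```

Reads in which either flank cannot be found return `None` and would simply be skipped in this sketch.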

Min. Barcode Count

Set the minimum count for a barcode to be considered as valid.

It corresponds to the minimum number of reads supporting a barcode.
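The effect of this threshold can be sketched as a simple count filter. The helper name `filter_by_min_count` is hypothetical, and treating the threshold as inclusive (a barcode with exactly the minimum count is kept) is an assumption.

```python
from collections import Counter


def filter_by_min_count(barcodes, min_count):
    """Count reads per barcode and keep barcodes supported by >= min_count reads."""
    counts = Counter(barcodes)
    return {bc: n for bc, n in counts.items() if n >= min_count}


# One list entry per read; the sequences are made up for illustration.
reads = ["AAAC"] * 5 + ["GGTT"] * 3 + ["TTCA"] * 2
valid = filter_by_min_count(reads, min_count=3)
print(valid)
```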

Saturation Threshold (%)

Saturation percentage used to separate the dominant barcode (sub)population from the rare one.
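One plausible reading of this threshold, shown here only as a sketch, is that barcodes are ranked by abundance and the most abundant ones are labelled dominant until their cumulative share of reads reaches the given percentage. This interpretation and the function name `split_by_saturation` are assumptions, not a description of the pipeline's internals.

```python
def split_by_saturation(counts, saturation_pct):
    """Split barcodes into (dominant, rare) by cumulative read share.

    Assumption: barcodes are added to the dominant set, most abundant first,
    until the reads already covered reach saturation_pct of the total.
    """
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    dominant, rare = [], []
    cumulative = 0
    for bc, n in ranked:
        if cumulative * 100 < saturation_pct * total:
            dominant.append(bc)
        else:
            rare.append(bc)
        cumulative += n
    return dominant, rare


# Made-up counts for illustration.
counts = {"AAAC": 80, "GGTT": 15, "TTCA": 4, "CCGA": 1}
dominant, rare = split_by_saturation(counts, 90)
print(dominant, rare)
```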

Advanced options:

Structural Filter

Filter applied to the set of barcodes based on their composition:

No Filter
No filter is applied.
Filter Nucleotide with Freq. < 1%
For each position of the barcode, compute the frequency of each nucleotide, then discard barcodes that have, in at least one position, a nucleotide with a frequency lower than 1%.
Fixed Structure (IUPAC code)
Specify the allowed nucleotide composition of each position of the barcode in IUPAC code. Barcodes that do not satisfy such structure are discarded.
E.g. for barcodes having length 5: NTYCH
This allows any nucleotide in the first position, only T in the second, C or T in the third, only C in the fourth, and A, C, or T in the last.
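The Fixed Structure check can be sketched with the standard IUPAC nucleotide codes. The function name `matches_structure` and the test barcodes are hypothetical, and discarding barcodes whose length differs from the structure is an assumption.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_structure(barcode, structure):
    """Return True if every base is allowed by the IUPAC code at its position."""
    if len(barcode) != len(structure):
        return False  # assumption: length mismatches are rejected
    return all(base in IUPAC[code] for base, code in zip(barcode, structure))


barcodes = ["ATCCA", "GTTCT", "ATCCG", "AACCA", "ATACA"]
kept = [bc for bc in barcodes if matches_structure(bc, "NTYCH")]
print(kept)
```

With the structure NTYCH, ATCCG is rejected because G is not allowed by H, AACCA because the second position must be T, and ATACA because A is not allowed by Y.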
Graph Edit Distance
Maximum edit distance used to connect two barcodes in the graph used to merge the similar ones.
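The idea of connecting similar barcodes in a graph and merging them can be sketched as follows. This example uses the Levenshtein edit distance, groups barcodes by connected components, and assigns each group's total count to its most abundant member. The choice of representative, the all-pairs comparison, and the function names are assumptions made for illustration; the pipeline may build and resolve the graph differently.

```python
def edit_distance(a, b):
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def merge_similar(counts, max_dist):
    """Merge barcodes linked (directly or transitively) by edit distance <= max_dist."""
    barcodes = list(counts)
    parent = {bc: bc for bc in barcodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Add an edge between every pair of barcodes within max_dist.
    for i, a in enumerate(barcodes):
        for b in barcodes[i + 1:]:
            if edit_distance(a, b) <= max_dist:
                parent[find(a)] = find(b)

    groups = {}
    for bc in barcodes:
        groups.setdefault(find(bc), []).append(bc)

    merged = {}
    for members in groups.values():
        # Assumption: the most abundant barcode represents its group.
        representative = max(members, key=lambda bc: counts[bc])
        merged[representative] = sum(counts[bc] for bc in members)
    return merged


# Made-up counts for illustration.
counts = {"ACGTA": 100, "ACGTT": 3, "ACGT": 2, "TTTTT": 50}
merged = merge_similar(counts, max_dist=1)
print(merged)
```

The pairwise comparison is quadratic in the number of barcodes, so this sketch is only practical for small sets.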

Example(s)

Here are some datasets that can be used as examples to try the BAR-Seq pipeline: Sample1, Sample2, Sample3.

To run the computation on these 3 samples, start by uploading the 3 FASTQ files, then enter the following sequences:

  • Upstream Seq.: CAGGGGATGCGGTGGGCTCTATGG
  • Downstream Seq.: TAGGGACAGGATTGGTGACAGAAAAGCCCCATCCTTAGGCCTCCTCCTTCCTAGT

Then, you can select the Min. Barcode Count and Saturation Threshold (%). We set these thresholds to 3 and 90, respectively.

Finally, press the Run button to start the computation. The overall run takes approximately 10 to 15 minutes, depending on the server's computational load.


To directly load the example and set the parameters, you can click the following button:


This help page describes how to check the Library Complexity on this website.

Input options:

Input TSV Files

Click on Browse to select the TSV files to upload.

The file must have two columns: the first with the barcode sequence and the second with the count values (additional columns are ignored).

Note that the file must also have a header (i.e. the first line is skipped).
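To check that a file follows this layout before uploading it, a small reader along these lines can help. It is a sketch: the function name `read_library` and the column names in the sample data are hypothetical, and parsing the counts as integers is an assumption.

```python
import csv
import io


def read_library(handle):
    """Read barcode/count pairs from a tab-separated file with a header line."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header (first line)
    counts = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or incomplete lines
        # Only the first two columns are used; extra columns are ignored.
        counts[row[0]] = int(row[1])
    return counts


# In-memory stand-in for an uploaded TSV file, with a third column to ignore.
example = io.StringIO("barcode\tcount\tnote\nACGTA\t100\tx\nTTTTT\t50\ty\n")
library = read_library(example)
print(library)
```

For a file on disk, the same function can be called on `open(path, newline="")`.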

Confidence Level (%)

Confidence level used to estimate the probability of uniquely targeting a certain number of cells with the uploaded library.
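To give an intuition for what such an estimate means, the sketch below uses the classical birthday-problem calculation: if every barcode in a library of N distinct barcodes were equally likely, the probability that n cells all receive different barcodes is the product of (1 - k/N) for k from 0 to n-1. The uniform-frequency assumption and the function name `max_unique_cells` are simplifications introduced here; the statistical model actually used by this tool is not described on this page and presumably takes the count values into account.

```python
def max_unique_cells(library_size, confidence_pct):
    """Largest n such that n cells all carry distinct barcodes with
    probability >= confidence_pct.

    Assumption: all barcodes are equally frequent (uniform library).
    """
    threshold = confidence_pct / 100.0
    prob = 1.0  # probability that the first n cells are all distinct
    n = 0
    while n < library_size:
        next_prob = prob * (1 - n / library_size)
        if next_prob < threshold:
            break
        prob = next_prob
        n += 1
    return n


# A library of 365 equally frequent barcodes at a 95% confidence level.
print(max_unique_cells(365, 95))
```

Skewed barcode frequencies make collisions more likely than in the uniform case, so this figure should be read as an optimistic upper bound.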


Example

Here is a library that can be used as an example to check its complexity: Lib. Example.

To run the complexity check on this example, start by uploading the TSV file, then select the confidence level (we suggest 95%).

Then, press the Run button to start the computation. The run takes approximately 1 minute.


To directly load the example and set the confidence level, you can click the following button: