This help page describes how to run the BAR-Seq pipeline on this website.
- Input Fastq Files
Click on Browse to select the FASTQ files to upload, one for each sample.
The selected files are listed on the right, along with their total size.
Note that the files must already be pre-processed (quality filtering, adapter trimming, ...).
- Read Structure
Specify the structure of the amplicon in which barcodes are inserted:
- Upstream Seq.
- Nucleotide sequence preceding the barcode in the amplicon.
- Downstream Seq.
- Nucleotide sequence following the barcode in the amplicon.
Note that the upstream and downstream sequences should together cover the entire amplicon, since this improves barcode extraction.
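As a rough illustration of how the two flanking sequences delimit the barcode, the following sketch pulls out whatever lies between them in a read. It assumes exact matching of both flanks, which may differ from what the pipeline actually does; the function name `extract_barcode` and the short flanks in the example are hypothetical.

```python
def extract_barcode(read, upstream, downstream):
    """Return the sequence between the two flanks, or None if a flank is missing.

    Assumption: flanks are matched exactly; the real pipeline may tolerate mismatches.
    """
    start = read.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    # Look for the downstream flank only after the end of the upstream flank.
    end = read.find(downstream, start)
    if end == -1:
        return None
    return read[start:end]


# Hypothetical short flanks, for illustration only.
barcode = extract_barcode("GGACGTCCCCCTTAAGG", upstream="ACGT", downstream="TTAA")
print(barcode)
```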
- Min. Barcode Count
Set the minimum count for a barcode to be considered valid.
This is the minimum number of reads that must support a barcode.
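The effect of this threshold can be sketched as a simple count filter. The function name `filter_by_min_count` is hypothetical, and the sketch assumes that a barcode with exactly the minimum count is kept.

```python
from collections import Counter


def filter_by_min_count(barcodes, min_count):
    """Count the reads supporting each barcode and keep those with enough support.

    Assumption: the threshold is inclusive (count >= min_count).
    """
    counts = Counter(barcodes)
    return {bc: n for bc, n in counts.items() if n >= min_count}


# One extracted barcode per read.
reads = ["AAA", "AAA", "AAA", "CCC", "CCC"]
valid = filter_by_min_count(reads, min_count=3)
print(valid)
```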
- Saturation Threshold (%)
Saturation percentage used to separate the dominant barcode (sub)population from the rare one.
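This page does not define how saturation is computed. One possible reading, shown here purely as an assumption, ranks barcodes by decreasing count and labels as dominant the top-ranked barcodes needed to reach the given percentage of all reads; the remaining ones are labeled rare. The function name `split_by_saturation` is hypothetical, and the pipeline's actual rule may differ.

```python
def split_by_saturation(counts, threshold_pct):
    """Split barcodes into (dominant, rare) by cumulative read fraction.

    Assumption: 'saturation' means the cumulative share of reads covered by
    the most abundant barcodes. This is an illustrative interpretation only.
    """
    total = sum(counts.values())
    target = total * threshold_pct / 100.0
    dominant, rare = [], []
    cumulative = 0
    # Sort by decreasing count; ties are broken alphabetically for determinism.
    for barcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if cumulative < target:
            dominant.append(barcode)
        else:
            rare.append(barcode)
        cumulative += n
    return dominant, rare


dominant, rare = split_by_saturation({"A": 80, "B": 15, "C": 5}, threshold_pct=90)
print(dominant, rare)
```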
- Structural Filter
Filter applied to the set of barcodes based on their composition:
- No Filter
- No filter is applied.
- Filter Nucleotide with Freq. < 1%
- For each position of the barcode, compute the frequency of each nucleotide, then discard barcodes that have, in at least one position, a nucleotide with frequency lower than 1%.
- Fixed Structure (IUPAC code)
- Specify the allowed nucleotide composition of each position of the barcode in IUPAC code.
Barcodes that do not satisfy such structure are discarded.
E.g. for barcodes of length 5: NTYCH
This allows any nucleotide in the first position, only T in the second, C or T in the third, only C in the fourth, and A, C, or T in the last.
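The two structural filters above can be sketched as follows. The IUPAC table is the standard nucleotide code; the function names are hypothetical. The frequency filter assumes that frequencies are computed over the list of barcodes passed in, each counted once, and that all barcodes have the same length; the pipeline may instead weight them by read count.

```python
from collections import Counter

# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_structure(barcode, pattern):
    """True if every base of the barcode is allowed by the IUPAC pattern."""
    if len(barcode) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(barcode, pattern))


def filter_rare_nucleotides(barcodes, min_freq=0.01):
    """Drop barcodes carrying, at any position, a nucleotide rarer than min_freq.

    Assumption: equal-length barcodes, each counted once.
    """
    if not barcodes:
        return []
    length = len(barcodes[0])
    total = len(barcodes)
    # One nucleotide counter per barcode position.
    per_position = [Counter(bc[i] for bc in barcodes) for i in range(length)]
    return [
        bc for bc in barcodes
        if all(per_position[i][bc[i]] / total >= min_freq for i in range(length))
    ]


print(matches_structure("ATCCA", "NTYCH"))  # satisfies the NTYCH example
print(matches_structure("ATGCA", "NTYCH"))  # G is not allowed by Y
```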
- Graph Edit Distance
- Maximum edit distance at which two barcodes are connected in the graph used to merge similar barcodes.
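To make the role of this parameter concrete, the sketch below computes the Levenshtein edit distance between barcodes, connects every pair within the maximum distance, and groups connected barcodes together. Grouping by connected components is an assumption about how similar barcodes are merged; the page does not describe the merging rule, and all function names are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_graph(barcodes, max_distance):
    """Connect every pair of barcodes whose edit distance is <= max_distance."""
    graph = {bc: set() for bc in barcodes}
    for a, b in combinations(barcodes, 2):
        if edit_distance(a, b) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group barcodes that are linked directly or through other barcodes."""
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(graph[current] - seen)
        components.append(component)
    return components


graph = build_graph(["AAAA", "AAAT", "GGGG"], max_distance=1)
print(connected_components(graph))
```

The pairwise comparison is quadratic in the number of barcodes, so this is only suitable for small illustrative inputs.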
To run the computation on these 3 samples, upload the 3 FASTQ files and enter the following sequences:
- Upstream Seq.: CAGGGGATGCGGTGGGCTCTATGG
- Downstream Seq.: TAGGGACAGGATTGGTGACAGAAAAGCCCCATCCTTAGGCCTCCTCCTTCCTAGT
Then select the Min. Barcode Count and Saturation Threshold (%). We set these thresholds to 3 and 90, respectively.
Finally, press the Run button to start the computation. The overall run takes roughly 10 to 15 minutes, depending on the server's computational load.
To directly load the example and set the parameters, you can click the following button:
This help page describes how to check the Library Complexity on this website.
- Input TSV Files
Click on Browse to select the TSV files to upload.
Each file must have two columns: the first with the barcode sequence and the second with the count value (additional columns are ignored).
Note that the file must also have a header (i.e. the first line is skipped).
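A file in the expected layout can be read as in the following sketch, which skips the header line, takes the first two tab-separated columns, and ignores any others. The function name `read_barcode_counts` and the sample content are hypothetical, and counts are assumed to be integers.

```python
import csv
import io


def read_barcode_counts(handle):
    """Parse a TSV of barcode and count, skipping the header line."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # the first line is a header and is skipped
    counts = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or incomplete lines
        counts[row[0]] = int(row[1])  # extra columns are ignored
    return counts


# In-memory sample standing in for an uploaded file.
sample = io.StringIO("barcode\tcount\tnote\nAAAA\t10\tx\nCCCC\t3\ty\n")
print(read_barcode_counts(sample))
```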
- Confidence Level (%)
Confidence level used to estimate the probability of uniquely targeting a certain number of cells with the uploaded library.
Here is a library that can be used as an example for the complexity check: Lib. Example.
To run the complexity check on this example, upload the TSV file and select the confidence level (we suggest 95%).
Then press the Run button to start the computation. The run takes about 1 minute.
To directly load the example and set the confidence level, you can click the following button: