COmbinatorial PEptide POoling Design for TCR specificity

CopepodTCR is a tool for the design of combinatorial peptide pooling schemes for TCR specificity assays.

CopepodTCR guides the user through all stages of experiment design and interpretation:

  • selection of parameters for the experiment (Balance check)
  • generation and assessment of peptides (Peptide check)
  • generation of pooling scheme (Pooling scheme)
  • generation of punched cards for efficient peptide mixing (3D masks)
  • results interpretation using a hierarchical Bayesian model (Interpretation)

Cite as

Kovaleva V. A., et al. "copepodTCR: Identification of Antigen-Specific T Cell Receptors with combinatorial peptide pooling." bioRxiv (2023): 2023-11.

Or use the following BibTeX entry:

@article{
	kovaleva2023copepodtcr,
	title        = {copepodTCR: Identification of Antigen-Specific T Cell Receptors with combinatorial peptide pooling},
	author       = {Kovaleva, Vasilisa A and Pattinson, David J and He, Guanchen and Barton, Carl and Chapin, Sarah R and Minervina, Anastasia A and Huang, Qin and Thomas, Paul G and Pogorelyy, Mikhail V and Meyer, Hannah V},
	year         = 2023,
	journal      = {bioRxiv},
	publisher    = {Cold Spring Harbor Laboratory},
	pages        = {2023--11}
}

Description

Identifying the cognate peptide of a TCR of interest is crucial for biomedical research. Current computational approaches to predicting TCR specificity have not yet produced a reliable tool, so testing large peptide libraries against a T cell bearing the TCR of interest remains the main approach in the field.

Testing each peptide against a TCR individually is reagent- and time-consuming. A more efficient approach is to mix peptides in pools according to a combinatorial scheme. Each peptide is added to a unique subset of pools (its "address"), so the pattern of pools that activate T cells matches the address of the response-eliciting peptide.
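The addressing idea can be illustrated with a deliberately simplified sketch. It is not the algorithm copepodTCR uses: it ignores peptide overlap and error detection, and the function names and peptide labels are hypothetical. It only shows how a unique subset of pools per peptide lets the activated pools point back to a peptide.

```python
from itertools import combinations


def assign_addresses(peptides, n_pools, occurrence):
    """Give each peptide a unique set of `occurrence` pools out of `n_pools`."""
    addresses = combinations(range(n_pools), occurrence)
    scheme = {}
    for peptide in peptides:
        address = next(addresses, None)
        if address is None:
            raise ValueError("not enough unique addresses for this many peptides")
        scheme[peptide] = frozenset(address)
    return scheme


def decode(scheme, activated_pools):
    """Return the peptides whose address equals the set of activated pools."""
    activated = frozenset(activated_pools)
    return [peptide for peptide, address in scheme.items() if address == activated]


# Hypothetical peptide labels, 4 pools, each peptide added to 2 pools.
peptides = ["PEP_A", "PEP_B", "PEP_C", "PEP_D"]
scheme = assign_addresses(peptides, n_pools=4, occurrence=2)
hit = decode(scheme, [3, 0])
```

With real overlapping peptides, an epitope is usually shared by neighbouring peptides, so the activated pools form the union of their addresses; that case is what the full tool is designed to handle.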

An efficient combinatorial peptide pooling (CPP) scheme must implement:

  • use of overlapping peptides in the assay to cover the whole protein space;
  • error detection.

Here, we present CopepodTCR -- a tool for the design of CPP schemes. CopepodTCR detects experimental errors and, coupled with a hierarchical Bayesian model for unbiased results interpretation, identifies the response-eliciting peptide for a TCR of interest out of hundreds of peptides tested using a simple experimental setup.

The experimental setup starts with defining the protein/proteome of interest and obtaining synthetic peptides tiling its space. Peptide sequences can be generated in silico from a protein of interest and then checked using the Peptide check tab. This set of peptides, containing an overlap of a constant length, is entered into copepodTCR. Parameters for the CPP scheme can be selected using the Balance check tab. To create a pooling scheme with the selected parameters, use the Pooling scheme tab. To make pipetting easier, you can use the 3D masks tab to create mask models that can be 3D printed and overlaid on the physical plate or pipette tip box.

Following this scheme, the peptides should be mixed, and the resulting peptide pools should be tested in a T cell activation assay. The activation of T cells can be measured for each peptide pool with the assay of choice, such as flow cytometry- or microscopy-based activation assays detecting transcription and translation of a reporter gene. The experimental measurements for each pool can be entered back into copepodTCR, which employs a Bayesian mixture model to identify activated pools. Based on the activation patterns, it returns the set of overlapping peptides leading to T cell activation (Interpretation tab).

For more details, refer to “copepodTCR: Identification of Antigen-Specific T Cell Receptors with combinatorial peptide pooling” (bioRxiv version).

CopepodTCR Python package

Alternatively, you might want to use the copepodTCR Python package, as it provides more flexibility.

It can be installed with pip:

pip install copepodTCR

or conda:

conda install -c vasilisa.kovaleva copepodTCR

Documentation: copepodTCR.readthedocs.

Usage

The tool consists of five distinct parts, each of which can be used separately. Each part corresponds to a step in the CPP experimental workflow.

  • selection of parameters for the experiment (Balance check)
  • generation and assessment of peptides (Peptide check)
  • generation of pooling scheme (Pooling scheme)
  • generation of punched cards for efficient peptide mixing (3D masks)
  • results interpretation using a hierarchical Bayesian model (Interpretation)

Balance check

First, the appropriate number of pools and the peptide occurrence (the number of pools each peptide is added to) should be selected.

Peptide occurrence affects the number of peptides in one pool, so an excessively high peptide occurrence may lead to a higher dilution of each single peptide. In Kovaleva et al. (2023), we were able to detect signal with the cognate peptide diluted to 1.58 μM.

Peptide dilution can be mitigated by increasing the number of pools; however, this might increase the complexity of the experimental setup. Consequently, these parameters should be chosen carefully.

To assist with this process, copepodTCR provides the user with possible peptide occurrence values based on the given number of pools and number of tested peptides.

We also found that the sensitivity and specificity of the Bayesian mixture model for identification of activated pools depend on the negative share, i.e. the expected share of non-activated pools. After setting the desired peptide occurrence, the approximate negative share is calculated as (N - I - 1)/N, where N is the total number of pools and I is the peptide occurrence (it is assumed that one epitope is shared between two peptides, so that I + 1 pools are activated). For exploratory screens, we recommend choosing a peptide occurrence resulting in a negative share of 0.5–0.6. For focused screens with the prior assumption that a cognate epitope is present in the tested peptide library, a peptide occurrence leading to a higher negative share can be chosen. The app then simulates the pooling and shows how close the peptide distribution is to a perfectly balanced scheme.
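As a quick worked example of the formula, the sketch below computes the negative share and lists the peptide occurrence values that fall into the recommended 0.5–0.6 range. The function names and the choice of 12 pools are illustrative assumptions; the sketch does not check whether a given occurrence provides enough addresses for the number of peptides, which the Balance check tab takes into account.

```python
def negative_share(n_pools, occurrence):
    """Approximate negative share (N - I - 1) / N from the text above."""
    if not 0 < occurrence < n_pools:
        raise ValueError("occurrence must be between 1 and n_pools - 1")
    return (n_pools - occurrence - 1) / n_pools


def occurrences_in_range(n_pools, low=0.5, high=0.6):
    """Peptide occurrence values whose negative share lies in [low, high]."""
    return [
        occurrence
        for occurrence in range(1, n_pools)
        if low <= negative_share(n_pools, occurrence) <= high
    ]


# Arbitrary example: 12 pools.
share = negative_share(12, 4)          # (12 - 4 - 1) / 12
candidates = occurrences_in_range(12)  # occurrence values worth considering
```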

Peptide check

Here you can generate peptides from an entered amino acid sequence or check the peptides you already have for their overlap length consistency. Inconsistent overlap lengths in the list of tested peptides can lead to imprecise interpretation of the results.

  • if you have protein sequence(s): Paste the sequence or upload a FASTA file. Enter the desired peptide length and the overlap length. Select what to do when the last peptide to be generated is too short: either discard this peptide or allow a bigger overlap with the second-to-last peptide. Click "Generate" to see the peptide list, then use "Download" or "Send" to export the peptides.

  • if you have peptides: Paste or upload a TSV/CSV file with one peptide per line (without a header), then click "Check". CopepodTCR then returns the distribution of sequence and overlap lengths in your peptide set, as well as the peptide pairs whose overlap length differs from the most common one.
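Both options can be approximated in a few lines of Python. The sketch below is an independent illustration of the two steps, not the code behind the Peptide check tab; the function names, the `pad_end` flag, and the toy sequence are assumptions made for this example.

```python
from collections import Counter


def generate_peptides(sequence, length, overlap, pad_end=True):
    """Tile `sequence` with peptides of `length` that overlap by `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    starts = range(0, len(sequence) - length + 1, step)
    peptides = [sequence[start:start + length] for start in starts]
    covered = (len(peptides) - 1) * step + length if peptides else 0
    # The leftover tail is too short for a full peptide: either drop it
    # (pad_end=False) or take the last `length` residues, which gives a
    # bigger overlap with the second-to-last peptide.
    if pad_end and peptides and covered < len(sequence):
        peptides.append(sequence[-length:])
    return peptides


def overlap_length(left, right):
    """Longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def check_overlaps(peptides):
    """Return the overlap-length distribution and the pairs deviating from the mode."""
    pairs = list(zip(peptides, peptides[1:]))
    overlaps = [overlap_length(left, right) for left, right in pairs]
    if not overlaps:
        return Counter(), []
    most_common = Counter(overlaps).most_common(1)[0][0]
    outliers = [
        (left, right, size)
        for (left, right), size in zip(pairs, overlaps)
        if size != most_common
    ]
    return Counter(overlaps), outliers


# Toy 12-residue sequence, peptide length 5, overlap 3.
peptides = generate_peptides("MKTAYIAKQRQI", length=5, overlap=3)
distribution, outliers = check_overlaps(peptides)
```

Here the padded last peptide overlaps its neighbour by 4 residues instead of 3, so it is reported as an outlier. Note that the suffix/prefix comparison can overestimate the overlap for repetitive sequences.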

Pooling scheme

After selecting the parameters and checking the peptides, the user can enter them into copepodTCR and get a peptide pooling scheme.

CopepodTCR returns three tables:

  • peptide pooling scheme pool-wise (i.e. the table with peptides in each pool)
  • peptide pooling scheme peptide-wise (i.e. the table with pools for each peptide)
  • simulation table

During the simulation step, copepodTCR simulates the results of the experiment for every possible epitope of the provided length and returns a table with every possible epitope and all pools where this epitope is present.

The simulation table has the following columns:

  • Peptide — peptide sequence
  • Address — pool indices where this peptide should be added
  • Epitope — checked epitope from this peptide
  • Act pools — list with pool indices where this epitope is present
  • # of act pools — number of pools where this epitope is present
  • # of epitopes with these act pools — number of epitopes that are present in the same pools (= number of possible epitopes upon activation of such pools)
  • # of peptides with these act pools — number of peptides in which there are epitopes that are present in the same pools (= number of possible peptides upon activation of such pools)

# of peptides with these act pools should be equal to the number of peptides sharing an epitope. For end-position peptides, it will be lower. However, if # of peptides with these act pools is higher for some epitopes, then these peptides have a bigger overlap than the others.

To interpret the results of the experiment, the user can find all rows where the Act pools column contains the respective combination of activated pools. This yields all possible peptides and epitopes leading to the activation of such a combination of pools. However, we recommend using the Interpretation tab to interpret the results of the experiment.
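The manual lookup described above amounts to filtering the simulation table by the set of activated pools. The sketch below uses a few made-up rows held in memory; the column names follow the list above, but how pool lists are stored in the real simulation.tsv is not specified here, so parsing of that file is left out and the `candidates` helper is hypothetical.

```python
def candidates(rows, activated_pools):
    """Peptides and epitopes whose simulated act pools match the observed ones."""
    target = frozenset(activated_pools)
    hits = [row for row in rows if frozenset(row["Act pools"]) == target]
    peptides = sorted({row["Peptide"] for row in hits})
    epitopes = sorted({row["Epitope"] for row in hits})
    return peptides, epitopes


# Made-up rows for two overlapping peptides (illustration only).
rows = [
    {"Peptide": "MKTAY", "Address": [0, 1, 2], "Epitope": "MKT", "Act pools": [0, 1, 2]},
    {"Peptide": "MKTAY", "Address": [0, 1, 2], "Epitope": "KTA", "Act pools": [0, 1, 2]},
    {"Peptide": "MKTAY", "Address": [0, 1, 2], "Epitope": "TAY", "Act pools": [0, 1, 2, 3]},
    {"Peptide": "TAYIA", "Address": [1, 2, 3], "Epitope": "TAY", "Act pools": [0, 1, 2, 3]},
    {"Peptide": "TAYIA", "Address": [1, 2, 3], "Epitope": "AYI", "Act pools": [1, 2, 3]},
]

# The order of the observed pools does not matter.
shared = candidates(rows, [3, 1, 0, 2])
single = candidates(rows, [0, 1, 2])
```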

3D masks

To avoid mixing pools manually, the user can print special punched cards from files with their 3D models.

Each card represents one pool, with holes positioned at the coordinates corresponding to the peptides designated for addition to that pool.

The produced punched card is placed on an empty tip box, and the open holes are filled with tips. This patterned pipette tip array is used to transfer peptides from the plate to the corresponding pool.

The user can adjust parameters to fit their plate:

  • number of rows — number of rows in the plate
  • number of columns — number of columns in the plate
  • length — length of the plate (in mm)
  • width — width of the plate (in mm)
  • thickness — thickness of the plate (in mm)
  • hole radius — diameter of the well divided by 2
  • X offset — margin along the X axis for the A1 well, in mm
  • Y offset — margin along the Y axis for the A1 well, in mm
  • well spacing — distance between wells, in mm
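To make the geometry concrete, the sketch below derives hole centers from the offsets and the well spacing. It rests on assumptions not stated above: the offsets are taken to point at the center of well A1, the spacing is treated as a center-to-center distance, columns run along X and rows along Y. The function names, the plate layout, and the numeric values are illustrative; measure your own plate.

```python
import string


def well_centers(n_rows, n_cols, x_offset, y_offset, spacing):
    """Map well names (A1, A2, ...) to (x, y) hole centers in mm."""
    if n_rows > len(string.ascii_uppercase):
        raise ValueError("this sketch only names rows A-Z")
    centers = {}
    for row in range(n_rows):
        for col in range(n_cols):
            name = f"{string.ascii_uppercase[row]}{col + 1}"
            centers[name] = (x_offset + col * spacing, y_offset + row * spacing)
    return centers


def pool_holes(centers, layout, pool_peptides):
    """Hole centers for one pool; `layout` maps each peptide to its well name."""
    return [centers[layout[peptide]] for peptide in pool_peptides]


# Illustrative values for an 8 x 12 plate.
centers = well_centers(8, 12, x_offset=14.38, y_offset=11.24, spacing=9.0)
layout = {"MKTAY": "A1", "TAYIA": "A2", "YIAKQ": "B1"}  # hypothetical arrangement
holes = pool_holes(centers, layout, ["MKTAY", "YIAKQ"])
```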

To better orient the tip pattern, the user can add the last hole (with coordinates m-k). It should be used only in the absence of a peptide in the corresponding well. The user can also choose to add marks indicating the pool index to the masks.

Activated pools

The experiment can be conducted using flow cytometry or microscopy.

After the experiment, copepodTCR can help with data analysis. A basic way to interpret the experiment is described in the Pooling scheme section, in the explanation of the simulation step.

The user can also analyze the results using a Bayesian mixture model. This model returns the probability of each pool being activated (green) or not (gray).

To enter the results of the experiment into the model, the user needs to make a CSV table with two columns: Pool and Percentage. The experiment can be conducted with replicates, in which case all replicates of one pool should have the same name in the Pool column. Percentage is the percentage of activated T cells in a given pool (in case of microscopy, the user can divide the number of activated T cells per well by the total number of activated T cells in the experiment).

Example of the readout input file:
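A minimal sketch of how such a table could be written with Python's csv module is given below. Only the two column names, Pool and Percentage, come from the description above; the pool labels and percentages are made up for illustration, and the labels you use should match the pool indices of your own pooling scheme.

```python
import csv
import io


def write_readout(measurements, handle):
    """Write (pool, percentage) pairs as a CSV table with a Pool,Percentage header."""
    writer = csv.writer(handle)
    writer.writerow(["Pool", "Percentage"])
    writer.writerows(measurements)


# Made-up values: two replicates per pool share the same Pool name.
measurements = [
    (0, 1.2), (0, 1.5),
    (1, 24.8), (1, 22.1),
    (2, 0.9), (2, 1.1),
]

buffer = io.StringIO()
write_readout(measurements, buffer)
lines = buffer.getvalue().splitlines()

# To save to disk instead:
# with open("readout.csv", "w", newline="") as handle:
#     write_readout(measurements, handle)
```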

Then the user needs to enter the simulation table (produced during the Pooling scheme step, simulation.tsv), the experiment readout (CSV table), and the CPP parameters (number of pools, peptide occurrence).

After fitting the model to the data (this might take some time), copepodTCR returns the list of activated pools and the peptides responsible for their activation.
