FURI | Summer 2023
Identifying the Optimal Orientation for Selecting Embedding Models for TCR-Epitope Binding Affinity Prediction
Accurate prediction of T cell receptor (TCR)-epitope binding is crucial for personalized healthcare and immunotherapy. However, encoding TCR amino acid sequences into numerical representations remains challenging. A recent study demonstrated that catELMo, a bi-LSTM-based embedding model, outperformed the Transformer-based models prevalent in Natural Language Processing. In this project, we investigate the reasons behind catELMo’s superior performance, exploring two possibilities: 1) bi-LSTMs are better suited to TCR analysis than Transformers, and 2) the choice of learning objective accounts for the difference. To distinguish between these possibilities, we will train GPT, a Transformer-based model with the same learning objective as catELMo, as the embedding model and compare it to previous models.