Parameter-Efficient Fine-Tuning of Protein Language Models Improves Prediction of Protein-Protein Interactions

Machine Learning for Structural Biology Workshop, NeurIPS

Mirroring the massive increase in the size of transformer-based models in natural language processing, proteomics too has seen the advent of increasingly large foundational protein language models. As model size grows, the computational and memory footprint of fine-tuning moves beyond the reach of many academic labs and small biotechs. In this work, we apply parameter-efficient fine-tuning (PEFT) to protein language models to predict protein-protein interactions. We show that a model trained with the PEFT method LoRA outperforms full fine-tuning while requiring a reduced memory footprint. We also analyze which weight matrices in the attention layers to adapt, finding that, contrary to results in natural language processing, modifying the key and value matrices yields the best performance. This work demonstrates that, despite the recent increase in scale, the effective use of protein language models for representation learning remains within reach of research groups with fewer computational resources.
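
The sketch below illustrates the kind of setup described above, assuming the Hugging Face transformers and peft libraries and an ESM-2 checkpoint; the checkpoint name, LoRA rank, and other hyperparameters are illustrative assumptions, not the configuration reported in this paper.

```python
# Hypothetical sketch: applying LoRA to only the key and value projection
# matrices of a protein language model. Checkpoint, rank, and scaling values
# are assumptions for illustration.
import torch
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "facebook/esm2_t12_35M_UR50D"  # assumed ESM-2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Adapt only the key and value projections, reflecting the finding that
# modifying K and V yields the best performance in this setting.
lora_config = LoraConfig(
    r=8,                              # assumed low-rank dimension
    lora_alpha=16,                    # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["key", "value"],  # attention module names in the HF ESM implementation
    bias="none",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable

# Embed a pair of sequences; a downstream PPI head would consume these representations.
seqs = ["MKTAYIAKQRQISFVKSHFSRQ", "MNIFEMLRIDEGLRLKIYKDTE"]
inputs = tokenizer(seqs, return_tensors="pt", padding=True)
with torch.no_grad():
    embeddings = model(**inputs).last_hidden_state.mean(dim=1)
```

Restricting target_modules to the key and value projections keeps the number of trainable parameters small while leaving the pretrained weights frozen, which is what reduces the memory footprint relative to full fine-tuning.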