Figure 4.
Two main axes of variation for ML models in protein engineering. Any ML-guided protein engineering project must answer two basic questions: first, what are the model's inputs (protein sequences, structures, or both), and second, what does the model do (predict biophysical properties, generate novel sequences/structures, or both)? The further we move toward structure-based and generative modeling, the more complex these models become to build and operate.