Project: NLP on Protein Sequences

Merck - NLP on Protein Sequences

Open for Registration

Lecture time: Tuesday, 15:30 - 16:20 ET
Lab time: Thursday, 15:30 - 17:20 ET
Domain: Pharmaceutical
Keywords: Database, NLP
Tools: AWS, Confluence, JIRA, Python
Citizenship: Open to all students

Summary

Students will compare different NLP models and the protein sequence vectors they generate, looking for possible empirical or statistical correlations with performance and stability metrics relevant to biopharmaceutical development. The vectors will then be organized into a structured database.
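As a rough illustration of the workflow in the summary, the sketch below turns sequences into vectors, stores them in a structured table keyed by sequence and model, and provides a correlation helper. Everything in it is an assumption made for illustration: the amino-acid composition vector is only a toy stand-in for a real NLP model's embedding, and the table layout, the model label `composition-v0`, and all function names are hypothetical rather than taken from the project description.

```python
import json
import math
import sqlite3

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Toy stand-in for an NLP embedding: fraction of each standard amino acid."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def build_database(records, model_name="composition-v0"):
    """Store one vector per (sequence id, model) in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE embeddings ("
        "seq_id TEXT, model TEXT, vector TEXT, "
        "PRIMARY KEY (seq_id, model))"
    )
    for seq_id, sequence in records:
        vector_json = json.dumps(composition_vector(sequence))
        conn.execute(
            "INSERT INTO embeddings VALUES (?, ?, ?)",
            (seq_id, model_name, vector_json),
        )
    conn.commit()
    return conn


def load_vector(conn, seq_id, model_name="composition-v0"):
    """Return the stored vector as a list of floats, or None if absent."""
    row = conn.execute(
        "SELECT vector FROM embeddings WHERE seq_id = ? AND model = ?",
        (seq_id, model_name),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Keying the table on both the sequence id and the model name lets vectors from several models sit side by side for comparison. In practice, `pearson` would be applied to one vector component (or a derived score) across many sequences against a measured metric for those same sequences.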

Description

Please see the PDF for a detailed project description. When registering for this project in UniTime, look for 'Merck (NLP on Protein Sequences)' in the Note section, and select the appropriate CRN.

AY 2023 - 2024
Last updated: April 20, 2023, 2:43 p.m.