The project aims to evaluate the feasibility of using machine learning to predict the efficacy of molecules against certain diseases from their chemical structures.

  • Extract 1D, 2D and 3D molecular descriptors and fingerprints of molecules using Python and RDKit
  • Implement several machine learning models and stack them to predict molecules’ efficacy from their structures with 80% accuracy, using Python and scikit-learn
  • Develop a data retrieval pipeline to collect chemical information from PubChem and apply the model to screen for molecules with potential efficacy against certain diseases
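
As an illustration of the PubChem retrieval step, here is a minimal sketch that uses PubChem's PUG REST interface and only the Python standard library. The helper names (`property_url`, `parse_properties`, `fetch_properties`) and the choice of properties are assumptions made for this example, not the project's actual pipeline.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(cids, properties):
    """Build a PUG REST URL requesting `properties` for the given compound IDs (CIDs)."""
    if not cids or not properties:
        raise ValueError("need at least one CID and one property")
    cid_part = ",".join(str(int(c)) for c in cids)
    prop_part = ",".join(quote(p, safe="") for p in properties)
    return f"{PUG_REST}/compound/cid/{cid_part}/property/{prop_part}/JSON"


def parse_properties(payload):
    """Turn a decoded PropertyTable response into {CID: {property: value}}."""
    rows = payload["PropertyTable"]["Properties"]
    return {
        row["CID"]: {k: v for k, v in row.items() if k != "CID"}
        for row in rows
    }


def fetch_properties(cids, properties, timeout=30):
    """Download and parse properties for `cids` (hypothetical helper; needs network access)."""
    with urlopen(property_url(cids, properties), timeout=timeout) as resp:
        return parse_properties(json.load(resp))
```

For example, `fetch_properties([2244], ["MolecularFormula", "InChIKey"])` would request those two properties for CID 2244. In a screening workflow, the retrieved structures would then be converted to descriptors and passed to the trained model; batching and rate limiting are left out of this sketch.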