Abstract
We previously applied genome-scale CRISPR knockout studies in a range of cell lines to identify genes that are more potent and recurrent genetic dependencies for the growth and survival of individual cancer types compared with others (de Matos Simoes et al., Nature Cancer 2023). As part of this effort to determine dependencies with tumor type/lineage-selective roles, we identified 116 genes that are preferentially essential for multiple myeloma (MM) cells compared to other hematologic malignancies or solid tumors. Most of these MM-preferential dependencies were not previously considered therapeutic targets for MM or other neoplasias and, to our knowledge, have not been formally examined in drug screens that could identify ligands for these proteins and serve as lead compounds for potential therapeutic applications. Because of the large number of promising targets identified by our CRISPR studies and to accelerate potential therapeutic programs against those targets, the current study sought to identify and characterize “druggable” pockets in these MM-preferential dependencies and establish an in silico framework to nominate candidate small molecules for development of inhibitors or degraders.
Protein identifiers and corresponding primary protein sequences were extracted from the UniProt database. Predicted structures were obtained both from the AlphaFold database and from our own local implementation of AlphaFold2 to cross-validate structural predictions. Each predicted structure was annotated with AlphaFold-derived confidence scores, visualized through pseudo-GenBank files created with custom Biopython scripts, and further annotated (Geneious software) to differentiate structured from unstructured protein regions. Experimental structures from the Protein Data Bank (PDB) were annotated for comparison. We observed that 82 of the 116 target proteins have at least one domain structure in the PDB.
For the 34 proteins without any available experimental structure, AlphaFold/AlphaFold2 predictions for structured regions were further evaluated. We developed a “b-factor filtering” script to isolate candidate structural regions predicted with over 50% confidence, adding a three-amino-acid overhang to ensure robust selection of candidate regions. These filtered structures were analyzed using three pocket identification methodologies: AutoSite (AutoDock Vina suite), FPocket, and the Differential of Gaussians method. Combining these complementary approaches is intended to facilitate reliable identification of consensus pockets. Residues lining pockets with higher scores and consensus among all pocket identification programs were used to derive target binding regions for small-molecule docking with AutoDock Vina. Among the 116 MM-preferential dependencies, 57 unique proteins harbored a total of 80 pockets for which the different pocket identification programs showed significant agreement on the primary sequences of the identified pockets and assigned high druggability scores (≥0.5, based on normalized cumulative distribution function).
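The “b-factor filtering” step described above can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes that the per-residue AlphaFold confidence (pLDDT, on a 0 to 100 scale) is read from the B-factor column of the C-alpha ATOM records in a PDB-format file, that "over 50% confidence" means pLDDT > 50, and that the three-residue overhang is added on both sides of each confident stretch. The function names `residue_plddt` and `confident_regions` are hypothetical.

```python
def residue_plddt(pdb_lines):
    """Map residue number -> pLDDT, read from the B-factor column of CA atoms.

    Assumes fixed-column PDB ATOM records (residue number in columns 23-26,
    B-factor in columns 61-66) and a single chain.
    """
    scores = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confident_regions(scores, threshold=50.0, overhang=3):
    """Return merged (start, end) residue ranges with pLDDT above threshold.

    Each confident run is extended by `overhang` residues on both sides
    (an assumption), clamped to the protein ends, and overlapping or
    adjacent ranges are merged.
    """
    keep = sorted(r for r, s in scores.items() if s > threshold)
    if not keep:
        return []
    first, last = min(scores), max(scores)

    # Group confident residues into runs of consecutive residue numbers.
    runs = []
    start = prev = keep[0]
    for res in keep[1:]:
        if res != prev + 1:
            runs.append((start, prev))
            start = res
        prev = res
    runs.append((start, prev))

    # Pad each run with the overhang, then merge ranges that touch or overlap.
    merged = []
    for s, e in runs:
        s, e = max(first, s - overhang), min(last, e + overhang)
        if merged and s <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged
```

With this sketch, two confident stretches separated by a short low-confidence linker are merged into one region once the overhangs meet, whereas a longer disordered segment splits the protein into separate candidate regions.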
We then docked ~300,000 molecules from the NCI/DTP Open Chemicals Repository and ~3.6 million drug-like, commercially available compounds to each of the best identified target regions. Known inhibitors of individual MM-preferential dependencies (e.g., the p300 inhibitor inobrodib) were also identified as putative binders to “druggable” pockets that were nominated in an unbiased manner by our in silico pipeline. Such observations further support the notion that our pipeline can reliably identify both ligand-binding pockets for “druggable” targets and compounds binding to those specific pockets. Further docking with expanded libraries is ongoing.
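Docking against a target region requires a search box that encloses the pocket. The abstract does not state how the boxes were derived from the pocket-lining residues, so the following is only a sketch under one common assumption: the box is the axis-aligned bounding box of the pocket residue coordinates, enlarged by a fixed padding (4 Å here, an arbitrary illustrative value). The helper names are hypothetical; the `center_*` and `size_*` keys follow the AutoDock Vina configuration file convention.

```python
def docking_box(coords, padding=4.0):
    """Return (center, size) of a padded bounding box around pocket atoms.

    coords  : iterable of (x, y, z) tuples in angstroms, e.g. atoms of the
              residues lining a consensus pocket.
    padding : extra margin in angstroms added on every side (assumed value).
    """
    coords = list(coords)
    if not coords:
        raise ValueError("at least one coordinate is required")
    axes = list(zip(*coords))  # [(x1, x2, ...), (y1, ...), (z1, ...)]
    center = tuple((min(a) + max(a)) / 2.0 for a in axes)
    size = tuple((max(a) - min(a)) + 2.0 * padding for a in axes)
    return center, size


def vina_box_lines(center, size):
    """Format the box as 'key = value' lines for a Vina-style config file."""
    names = ("x", "y", "z")
    lines = [f"center_{n} = {c:.3f}" for n, c in zip(names, center)]
    lines += [f"size_{n} = {s:.3f}" for n, s in zip(names, size)]
    return lines
```

A tight box keeps each docking run fast, which matters when millions of compounds are screened per pocket, but too little padding can exclude valid poses for larger ligands.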
Our study highlights the importance of machine learning-based pipelines for assessment of primary sequence regions, 3D structurally defined binding pockets, and their interaction with putative small-molecule ligands to identify candidate binders to MM-preferential dependencies. Identification of structurally and functionally important binding regions provides essential groundwork for ongoing docking screens with large chemical libraries and innovative machine learning-based docking approaches that involve both existing compounds and synthetic datasets of theoretically plausible chemical space. We also envision that this pipeline can be applied more broadly to nominate potential pharmacological modulators for critical dependencies in other neoplasias beyond MM.