Discussion

The epidermal growth factor receptor family is of tremendous biological and clinical importance for many solid epithelial tumors.

Careful inspection of these genes reveals some that may have relevance to acute lymphoblastic leukemia, and so the drugs that target them are potential candidates for repurposing. FGFR1 is the drug target of Palifermin, a recombinant human keratinocyte growth factor for the treatment of oral mucositis associated with chemotherapy and radiation therapy. It is also the target of several experimental drugs.

Verified drug targets: drug targets with the entry "Pharmacological action" labeled as "Yes" in Drugbank. All 4013 GWAS reported genes and 1463 drug targets were mapped to NCBI gene IDs to provide unique identifiers for comparison.

For the 88 GWAS diseases with drugs in Drugbank, there are 1914 GWAS reported genes and 821 drug targets. The verified drug target set has 353 genes for 81 diseases. For each disease, we compare the list of GWAS reported genes and drug targets and find the overlap between these two lists. The Floyd-Warshall algorithm was used to calculate the shortest path between all gene pairs in the network.
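The all-pairs shortest path step can be illustrated with a minimal sketch of the Floyd-Warshall algorithm. It assumes an unweighted, undirected gene network (the text does not state whether edges carry weights), and the node names and the `floyd_warshall` helper are hypothetical, chosen only for illustration.

```python
from math import inf


def floyd_warshall(nodes, edges):
    """Return all-pairs shortest path lengths as a nested dict.

    Assumes an unweighted, undirected graph: every edge has length 1.
    Unreachable pairs keep the distance `inf`.
    """
    # Distance 0 to self, infinity to everything else.
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v in edges:
        if u != v:  # ignore self-loops so dist[u][u] stays 0
            dist[u][v] = 1
            dist[v][u] = 1
    # Allow each node k in turn as an intermediate stop.
    for k in nodes:
        for i in nodes:
            d_ik = dist[i][k]
            if d_ik == inf:
                continue  # k is not reachable from i; nothing to relax
            for j in nodes:
                candidate = d_ik + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist


# Toy network with placeholder gene names (not taken from the study).
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
dist = floyd_warshall(nodes, edges)
```

The algorithm runs in cubic time in the number of nodes, so for a large network a breadth-first search from each node may be a more practical choice when all edges have equal length.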

The resulting set of inter-node distances serves as a background distribution. For each disease, we extracted the set of all pairwise distances between GWAS genes for that disease, between drug target genes, and between GWAS genes and drug target genes. For each disease, we also calculated the shortest path from every gene in the network to the nearest GWAS gene for that disease and to the nearest drug target for the disease.

Machine learning for drug targets

We used a random forest implemented in WEKA, trained on the N3 features, to predict known drug targets for a disease from the set of all drug targets. The training sets are unbalanced, since the number of drug targets for each disease is very small compared to the number of all possible drug targets, 932. We used the MetaCost procedure to deal with the unbalanced training set; it penalizes false negative errors more heavily than false positive errors. We set the cost factor to the ratio between the number of correct and incorrect drug targets. We set the parameter K, the number of separating features, to the square root of the number of all features, and the parameter I, the number of decision trees in the random forest, to 50. Ten-fold cross-validation was used to measure the performance of the random forest method for each disease.

Discussion

This work began with an evaluation of the capability of GWA studies to identify existing drug targets for complex trait disease, based on a comparison of proposed disease mechanism genes in the GWAS catalog and drug targets in Drugbank. To our surprise, only 20 of these 856 drug targets correspond to GWAS-identified mechanism genes. Although the point is not emphasized there, a recent study also found a small level of overlap between GWAS disease genes and corresponding drug targets for approved drugs. Interestingly, that study found that inclusion of targets for drugs at all stages of development boosts the overlap considerably, to 63.
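The overlap comparison discussed above reduces to a per-disease set intersection once both gene lists share a common identifier space. The following is a minimal sketch under that assumption; the `target_overlap` function, the disease labels, and the numeric gene IDs are all hypothetical placeholders, not data from the study.

```python
def target_overlap(gwas_genes, drug_targets):
    """Intersect GWAS genes and drug targets disease by disease.

    Both arguments map a disease name to a collection of gene IDs.
    Only diseases present in both mappings are compared.
    """
    overlap = {}
    for disease in gwas_genes.keys() & drug_targets.keys():
        overlap[disease] = set(gwas_genes[disease]) & set(drug_targets[disease])
    return overlap


# Placeholder inputs: gene IDs here are made up for illustration.
gwas = {"disease_x": {101, 102, 103}, "disease_y": {201, 202}}
targets = {"disease_x": {103, 104}, "disease_y": {301}}

overlap = target_overlap(gwas, targets)

# Number of distinct drug targets that are also GWAS genes for the same disease.
n_overlapping_targets = len(set().union(*overlap.values()))
```

Counting distinct genes across diseases, as in the last line, is one possible convention; counting disease-target pairs instead would give a different total when a gene overlaps for more than one disease.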