The association of different genes with the three EGFR-associated signatures is likely reflective of the complexity of signaling in this pathway acro

Multivariate Cox proportional hazards analysis was performed in SAS v9.0 to estimate the hazard ratio associated with cluster expression in the three groups after controlling for standard clinical predictors. Chi-square tests were used to examine correlations between cluster groups, individual genes, and tumor subtype. The pathway was built de novo based on information from KEGG, BioCarta, and a review by Yarden and Sliwkowski, with a focus on the RAS/MEK and PI3K/AKT components.

In spite of the success in discovering disease associations, it is becoming clear that many disease mechanism genes with the highest effect on disease phenotypes are not discovered by GWAS. Studies of blood pressure provide a striking example. There is a long history of identifying genes affecting blood pressure using non-genomic methods, and 30 genes discovered in this way have provided successful targets for treating hypertension. Yet large-scale GWAS have discovered only a few of these candidate genes and none of the drug targets. Further, mouse knockout data suggest that some of the missing genes have very large effect sizes, with blood pressure changes of tens of mm Hg, whereas the largest changes associated with marker SNPs in GWAS are between about 0.5 and 1 mm Hg. Known drug target genes, which usually have a large effect size on the corresponding disease phenotype and so should be found by GWAS, provide a means of investigating whether non-discovery of mechanism genes is a general phenomenon.
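One way to make this check concrete is to ask, for each trait, what fraction of its known drug targets also appear among the genes reported by GWAS. The following Python sketch illustrates that idea only; it is not the authors' pipeline. The function name `target_recovery`, the trait labels, and the gene names are all hypothetical placeholders, and the toy data carry no biological meaning.

```python
from typing import Dict, Set


def target_recovery(
    gwas: Dict[str, Set[str]],
    targets: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Fraction of each trait's drug targets that GWAS also reports.

    Only traits present in both mappings are scored; traits with an
    empty target set are skipped to avoid dividing by zero.
    """
    recovery = {}
    for trait in gwas.keys() & targets.keys():
        trait_targets = targets[trait]
        if not trait_targets:
            continue
        shared = gwas[trait] & trait_targets  # set intersection
        recovery[trait] = len(shared) / len(trait_targets)
    return recovery


# Toy placeholder data (assumption: invented names, not real associations).
gwas_genes = {
    "trait_1": {"GENE_A", "GENE_B", "GENE_C"},
    "trait_2": {"GENE_D"},
}
drug_targets = {
    "trait_1": {"GENE_B", "GENE_X"},
    "trait_2": {"GENE_Y", "GENE_Z"},
}

recovery = target_recovery(gwas_genes, drug_targets)
for trait in sorted(recovery):
    print(trait, recovery[trait])
```

A recovery fraction near zero across most traits would indicate that non-discovery of large-effect genes is widespread rather than specific to blood pressure.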

Here, we compare a set of reported mechanism genes in the GWAS catalog with a corresponding set of known drug target genes for the same diseases. We find that the overlap of these two sets is very low. We also investigate two possible explanations for low overlap. Finally, we consider the relationship between GWAS genes and drug targets in the context of a protein functional interaction network, and develop a machine learning method to predict new drug targets using the relationship between GWAS genes and known drug targets.

Results

Comparison of the GWAS catalog and DrugBank shows GWAS only detects a very small fraction of existing drug targets

We examined the relationship between genes in the GWAS catalog and drug target genes in DrugBank. The GWAS catalog is a comprehensive collection of results from published GWAS on a wide variety of diseases and other traits such as height. DrugBank is a database that combines detailed drug data with comprehensive drug target information. We compiled a list of disease-related traits in the GWAS catalog and extracted the reported genes for each of them. The disease list includes a number of cancers, a variety of complex trait diseases, and disease predisposition traits such as obesity and hypertension. We then found the drugs used in treatment of each of these traits in DrugBank, and extracted the drug target genes for each drug. Thus, for each trait, we have a list of GWAS-reported genes and a list of drug targets. For the 88 GWAS diseases that have drugs in DrugBank, there are on average 29.2 GWAS-reported genes and 24.