Multivariate Cox proportional hazards analysis was performed in SAS v9.0 to estimate the hazard ratio associated with cluster expression in the three groups after controlling for standard clinical predictors. Chi-square tests were used to examine correlations between cluster groups, individual genes, and tumor subtype. The pathway was built de novo based on information from KEGG, BioCarta, and a review by Yarden and Sliwkowski, with a focus on the RAS/MEK and PI3K/AKT components.

We compiled a list of disease-related traits in the GWAS catalog and extracted the reported genes for each of them. The disease list includes a number of cancers, a variety of complex-trait diseases, and disease predisposition traits such as obesity and hypertension. We then identified the drugs used in the treatment of each of these traits in DrugBank and extracted the target genes for each drug. Thus, for each trait, we have a list of GWAS-reported genes and a list of drug targets. For the 88 GWAS diseases that have drugs in DrugBank, there are on average 29.2 GWAS-reported genes and 24.0 drug targets for 19. On average, in this set there are 30 GWAS-reported genes and 11.2 verified drug targets for each of these 81 diseases.

A second possible cause of low overlap is misassignment of mechanism genes in the GWAS catalog. Marker SNPs found in a GWAS locus are usually in linkage disequilibrium with many other SNPs covering a number of genes, any of which could in principle be involved in the disease mechanism. In some cases, the catalog assignments may be incorrect, and the true mechanism gene in a locus may in fact be a drug target. We investigated the effect of this factor by comparing the drug target-GWAS overlap described above with that obtained when all genes in each locus are included as candidates, rather than just those reported as candidates in the GWAS catalog.
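The comparison just described can be illustrated with a minimal sketch. It assumes each trait is represented by three gene sets: the catalog-reported genes, all genes in the linkage disequilibrium loci, and the drug targets. The function name `overlap_counts`, the dictionary layout, and the gene and trait labels are hypothetical and are not taken from the study.

```python
from typing import Dict, Set, Tuple


def overlap_counts(
    reported: Dict[str, Set[str]],
    locus_genes: Dict[str, Set[str]],
    drug_targets: Dict[str, Set[str]],
) -> Dict[str, Tuple[int, int]]:
    """For each trait, count GWAS genes that are also drug targets.

    Returns (overlap using reported genes only,
             overlap after adding every gene in the associated loci).
    """
    results = {}
    for trait, targets in drug_targets.items():
        rep = reported.get(trait, set())
        # The expanded candidate set is the reported genes plus all locus genes.
        expanded = rep | locus_genes.get(trait, set())
        results[trait] = (len(rep & targets), len(expanded & targets))
    return results


# Toy data with made-up gene labels (illustration only).
reported = {"trait_A": {"G1", "G2"}, "trait_B": {"G5"}}
locus_genes = {"trait_A": {"G1", "G2", "G3", "G4"}, "trait_B": {"G5", "G6"}}
drug_targets = {"trait_A": {"G2", "G3"}, "trait_B": {"G7"}}

counts = overlap_counts(reported, locus_genes, drug_targets)
total_before = sum(before for before, _ in counts.values())
total_after = sum(after for _, after in counts.values())
```

Summing the two counts over all traits gives the before and after totals of the kind compared in the text.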
For the 58 diseases with sufficient information in the catalog, linkage disequilibrium expansion from marker SNPs increased the set of candidate genes from the 1997 reported to 4035, about a factor of two. The number of GWAS genes that are also drug targets increased from 18 to 24. This small increase is comparable with the increase of 3 that is expected from the random model. Thus, the number of GWAS-drug target matches missed as a consequence of misidentification of candidate genes appears very small.

A third data-related factor is coverage by the tag SNPs on the microarrays used in GWAS studies. If there is no tag SNP in linkage disequilibrium with the underlying variant involved in a disease mechanism, that contribution to the trait will not be detected. A study of 160 non-GWAS-derived candidate genes for blood pressure concluded that only half were adequately covered with tag SNPs on a 500K array, suggesting this is a significant factor. But overall, data considerations do not qualitatively change the picture of very low GWAS gene-drug target overlap.
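The text does not specify how its random model was defined. One common choice, assumed here purely for illustration, treats the candidate genes for a trait as a random draw without replacement from the genome, so the overlap with that trait's drug targets follows a hypergeometric distribution. The genome size of 20000 genes and the per-trait counts below are placeholder values, not figures from the study.

```python
from math import comb


def expected_overlap(n_candidates: int, n_targets: int, n_genome: int) -> float:
    """Mean of the hypergeometric overlap: n_candidates * n_targets / n_genome."""
    return n_candidates * n_targets / n_genome


def hypergeom_sf(k: int, n_candidates: int, n_targets: int, n_genome: int) -> float:
    """P(overlap >= k) when n_candidates genes are drawn at random."""
    total = comb(n_genome, n_candidates)
    upper = min(n_candidates, n_targets)
    tail = sum(
        comb(n_targets, i) * comb(n_genome - n_targets, n_candidates - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Placeholder values for a single hypothetical trait.
N_GENOME = 20000      # assumed number of genes in the genome
N_TARGETS = 25        # drug targets for the trait
N_REPORTED = 35       # catalog-reported candidate genes
N_EXPANDED = 70       # candidates after including all genes in each locus

# Expected gain in overlap from the extra candidate genes alone.
expected_gain = (
    expected_overlap(N_EXPANDED, N_TARGETS, N_GENOME)
    - expected_overlap(N_REPORTED, N_TARGETS, N_GENOME)
)
# Chance of seeing at least one match after expansion under the random model.
p_at_least_one = hypergeom_sf(1, N_EXPANDED, N_TARGETS, N_GENOME)
```

Summing `expected_gain` over traits gives a chance-level increase that can be set against the observed increase. An observed gain close to this value suggests that locus expansion adds few genuine matches.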