Overlap summary among two prior reports that extracted biomarker sets from microarray mRNA profiling of breast cancer tissue from patients, and the top-ranked ChIP-X study identified by ChEA.
Complete tables are provided in Supplementary Tables S1-S4. MMP9 is a metalloprotease that digests the extracellular matrix during invasion, whereas changes in CD44 expression likely play a role in evading the host immune response. However, the ChEA analysis combined with the microarray profiling provides additional unbiased, global support for such hypotheses.
Our results also complement a network analysis approach that was applied to the same data using protein interactions (Chuang et al.). Here we linked such results to transcriptional regulation evidence from ChIP-X studies. Gene-expression profiles from different cancers, collected from patients or cell types, can now be linked to a transcription factor regulatory signature using ChEA. Such a signature may hint directly at the molecular regulatory mechanisms altered in a specific cancer subtype.
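As a rough illustration of how a gene list from expression profiling can be scored against ChIP-X target sets, the sketch below ranks transcription factors by a one-sided hypergeometric P-value for the overlap. The choice of test, the function names and the background size are assumptions made for illustration; they are not taken from the ChEA implementation.

```python
from math import comb


def overlap_p_value(input_genes, target_genes, background_size):
    """Probability of seeing at least the observed overlap by chance.

    Assumes a one-sided hypergeometric test: the input list is treated as a
    random draw from a background of `background_size` genes (assumed value).
    """
    input_set = set(input_genes)
    target_set = set(target_genes)
    k = len(input_set & target_set)   # observed overlap
    n = len(input_set)                # size of the input gene list
    K = len(target_set)               # size of the ChIP-X target list
    N = background_size
    total = comb(N, n)
    # Sum the upper tail P(X >= k); comb() returns 0 for impossible draws.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def rank_factors(input_genes, chipx_targets, background_size):
    """Return (factor, P-value) pairs sorted from most to least enriched.

    `chipx_targets` maps a hypothetical experiment label to its target genes.
    """
    scored = [
        (factor, overlap_p_value(input_genes, targets, background_size))
        for factor, targets in chipx_targets.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])
```

In practice the background size would need to reflect the genes that could have been detected on the expression platform, and the resulting P-values would need correction for the number of ChIP-X experiments tested.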
For the second case study, we re-analyzed results from a report in which the authors over-expressed 50 transcription factors, one by one, in mouse embryonic stem cells (mESCs) and then measured the effect of each perturbation on gene expression using mRNA microarrays (Nishiyama et al.). The 50 transcription factors include all the well-known mESC regulators, i.e. Oct4, Nanog and Sox2. The study identified Cdx2 as the transcription factor with the most dramatic effect when over-expressed, and as such it was selected for a ChIP-seq experiment. We re-analyzed the results from the Nishiyama et al.
Surprisingly, the two studies that reported ChIP-X results for Suz12 binding appeared as the most statistically enriched for binding sites for almost all of the perturbations (Fig.). The P-values for overlap with Suz12 targets were highly significant, reaching, for example, 1.
Additional confidence comes from the fact that the ChEA database contains two independent Suz12 ChIP-X experiments that do not fully overlap, and both studies appeared at the top for almost all the gain-of-function experiments. Suz12 is a member of the polycomb group (PcG) complex responsible for methylation of lysines 9 and 27 of histone 3.
Such methylation is known to cause transcriptional suppression of differentiation genes (Ru and Yi). Hence, the fact that all the changes in gene expression observed in this study are strongly associated with Suz12 targets, regardless of the perturbation applied to mESCs, may imply that almost all perturbations cause differentiation. This suggests that the quantitative levels of many components of the self-renewal machinery must be critically balanced to maintain the pluripotent state.
It seems that the type of perturbation in itself was less critical, as any perturbation induces similar global changes in chromatin rearrangement.
ChEA analysis of the top genes that changed in mRNA expression after 50 over-expression experiments of single transcription factors in mESCs. The P-value rankings from ChEA for each transcription factor over-expression perturbation are inverse log transformed. The top-ranked transcription factors reported by ChEA are labeled as peaks in the bar graph. Out of the 50 perturbations, only those factors that reached a low P-value of 1. Full results are available in Supplementary Table S5.
Such a combination of databases can be used for identifying and ranking small molecules that can potentially be used for controlling the activity of specific transcription factors.
By examining the genes that increased or decreased significantly after a drug perturbation, we can use ChEA to rank the transcription factors that most likely regulate those genes, i.e. the transcription factors that are statistically over-represented among them. This ranking can be used to design combinations of drugs that can potentially counteract the activity of specific transcription factors in a specific cellular context (Fig.).
Illustration of the concept of using pair-wise drug perturbations to target the Myc target gene space. The scoring scheme can be used to suggest, for example, how we can use small molecules to induce the activity of specific transcription factors such as Oct4 for iPS reprogramming, or for blocking uncontrolled cell proliferation by targeting Myc.
In this case study, we devise a strategy to down-regulate the activity of the transcription factor Myc and hence potentially block the proliferation of cancer cells. When we entered, in batch mode, all the lists of the most down-regulated genes from the CMAP ranked lists into ChEA, we noticed that Myc often appears as the top-ranked transcription factor for binding in proximity to genes that decreased in expression after many drug perturbations.
This is expected since Myc is a known oncogene, all cell lines in CMAP are human cancer cells, and many of the drugs that were used to create CMAP are anti-cancer drugs. For illustration, we ranked pairs of drugs based on their combined coverage of Myc targets (Fig.). Our strategy favors drug pairs that do not have similar effects on Myc targets. Table 2 provides the resulting top 10 pair-wise entries (a Perl script with an input table is provided as supporting material; Supplementary Data, Supplementary Table S6).
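The pair-wise ranking idea can be sketched as follows. This is not the Perl script referred to above; it is a minimal illustration that assumes each drug is represented by the set of Myc target genes it down-regulates, scores each pair by the size of the union of those sets, and breaks ties in favor of pairs that share fewer targets. The function name, the input layout and the tie-breaking rule are all assumptions.

```python
from itertools import combinations


def rank_drug_pairs(drug_to_targets, top_n=10):
    """Rank drug pairs by combined coverage of a factor's target genes.

    `drug_to_targets` maps a drug name to the set of target genes (here,
    Myc targets) that decreased in expression after that drug (assumed input).
    Returns tuples of (drug_a, drug_b, covered_count, shared_count).
    """
    scored = []
    for drug_a, drug_b in combinations(sorted(drug_to_targets), 2):
        targets_a = drug_to_targets[drug_a]
        targets_b = drug_to_targets[drug_b]
        covered = targets_a | targets_b   # targets hit by either drug
        shared = targets_a & targets_b    # targets hit by both drugs
        scored.append((drug_a, drug_b, len(covered), len(shared)))
    # Larger coverage first; among equal coverage, prefer less redundancy.
    scored.sort(key=lambda row: (-row[2], row[3], row[0], row[1]))
    return scored[:top_n]
```

Scoring by the union naturally rewards pairs with dissimilar effects, because two drugs that hit the same targets add little coverage beyond either drug alone.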
The top-ranked list of drug pairs suggests combinations of drug treatments for further, maximal reduction of Myc transcriptional regulatory activity. The combinations we identified include known cancer drugs as well as other drugs. For example, monastrol is a known cancer drug that targets kinesin-5, a motor protein important in mitosis (Mayer et al.). Hence, it is likely acting through a different pathway to regulate a subset of Myc-regulated target genes. Our initial approach of combining and ranking pairs of drugs to regulate the activity of specific transcription factors can be improved in many ways.
One possibility is to compute the likelihood that a combination of more than two drugs will cover a specific transcription factor target space. This can be achieved, for example, with algorithms such as the probabilistic generative model for GO enrichment analysis (Lu et al.). Our initial formulation can also be extended by using quantitative values instead of sets, and by including statistical randomization as a control.
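To make the idea of covering a target space with more than two drugs concrete, the sketch below uses a simple greedy set-cover heuristic. This is a swapped-in technique chosen for brevity, not the probabilistic generative model cited above, and the function name and inputs are hypothetical.

```python
def greedy_drug_combination(drug_to_targets, tf_targets, max_drugs=3):
    """Pick up to `max_drugs` drugs that together cover many factor targets.

    At each step the drug that covers the most still-uncovered targets is
    added. Greedy selection is a heuristic and is not guaranteed to be optimal.
    """
    remaining = set(tf_targets)
    candidates = dict(drug_to_targets)
    chosen = []
    while remaining and candidates and len(chosen) < max_drugs:
        # Iterate in sorted order so ties resolve to the first name alphabetically.
        best = max(sorted(candidates), key=lambda d: len(candidates[d] & remaining))
        gain = candidates[best] & remaining
        if not gain:
            break  # no remaining drug adds coverage
        chosen.append(best)
        remaining -= gain
        del candidates[best]
    covered = set(tf_targets) - remaining
    return chosen, covered
```

A randomization control could be layered on top of this by repeating the selection with shuffled gene labels and comparing the observed coverage with the resulting null distribution.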
In summary, the approach presented in this third case study provides a step toward the rational combinatorial application of drugs, anchored on transcription factors, to treat specific cancers. Such an approach is also amenable to improving iPS reprogramming strategies by, for example, designing combinations of drugs that would activate Oct4 or other key stem-cell self-renewal factors for reprogramming somatic cells into iPS cells.
Many other similar applications that attempt to control cell fate are possible. The strategy should also work for other types of gene expression microarray datasets in other contexts. One of the reasons high-throughput genome-wide ChIP-X studies are expected to be more useful and accurate than computational sequence-based methods is that the sequence-based approaches do not take into consideration the chromatin state of the cell under a specific experimental condition, cell type or organism.
Hence, ChIP-X databases and tools such as ChEA are expected to perform more accurately than computational sequence-based methods when combined with data from mRNA gene expression studies.
A match between a transcription factor and changes in gene expression will be found not only by linking the changes in expression to transcription factor binding sites, but also by linking such binding to a specific prior experiment that was conducted under similar conditions. Combining different types of ChIP-X experiments from different papers, cell types and experimental conditions, which use different statistical cut-offs and experimental techniques, is challenging.
We chose either to use the criteria applied by the authors of each study, or to apply our own standard method for finding peaks and calling target genes. Both approaches are simple and relatively unbiased. The two approaches complement each other with regard to coverage. The raw-data route excludes many of the studies currently in the database, since many ChIP-X publications provide only the target list without the raw data.
There are also many ChIP-X raw data files available in the public domain without a publication that contains an author-extracted gene list. Regardless, we expect that the database will continue to grow rapidly. Moreover, multiple entries for the same transcription factor can increase the confidence for functional binding sites (Wu and Ji). Our initial analysis shows that overlap among different ChIP experiments using the same factor increases functional gene predictability.
For example, we examined the overlap among independent Oct4 ChIP-X studies and compared the consensus overlapping genes with the results of an Oct4 knock-down followed by a microarray study. Initial results demonstrate that functional gene prediction improves when multiple independent studies are combined, but this should be investigated further in future studies.
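The comparison described here can be sketched under simple assumptions: each independent study contributes a set of target genes, a consensus set keeps genes reported by at least a chosen number of studies, and the fraction of consensus genes that respond to the knock-down serves as a crude measure of functional predictability. The names and the scoring measure are illustrative assumptions, not the analysis actually performed.

```python
from collections import Counter


def consensus_targets(study_target_lists, min_studies=2):
    """Genes reported as targets in at least `min_studies` independent studies."""
    counts = Counter(
        gene for targets in study_target_lists for gene in set(targets)
    )
    return {gene for gene, n in counts.items() if n >= min_studies}


def fraction_responsive(predicted_targets, responsive_genes):
    """Share of predicted targets that changed expression after knock-down."""
    predicted = set(predicted_targets)
    if not predicted:
        return 0.0
    return len(predicted & set(responsive_genes)) / len(predicted)
```

Comparing `fraction_responsive` for each single study with the value for the consensus set would show whether requiring agreement between studies enriches for genes that respond to the perturbation.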
Since our database keeps track of information such as the cell type, organism, experimental method, distance to the start site and peak height, we implemented filters that let users exclude a specific organism, cell type or experimental method from the analysis, as well as calibrate the gene-calling thresholds for peak height and distance to the start site.
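A minimal sketch of such filtering is given below. The record layout, field names and parameter names are hypothetical and only mirror the attributes listed in the text; they do not describe the actual ChEA database schema or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingRecord:
    """One factor-to-gene binding call (hypothetical layout)."""
    factor: str
    gene: str
    organism: str
    cell_type: str
    method: str
    distance_to_tss: int   # signed distance to the start site, in base pairs
    peak_height: float


def filter_records(records, exclude_organisms=(), exclude_cell_types=(),
                   exclude_methods=(), max_distance=None, min_peak_height=None):
    """Drop excluded categories and apply optional gene-calling thresholds."""
    kept = []
    for record in records:
        if record.organism in exclude_organisms:
            continue
        if record.cell_type in exclude_cell_types:
            continue
        if record.method in exclude_methods:
            continue
        if max_distance is not None and abs(record.distance_to_tss) > max_distance:
            continue
        if min_peak_height is not None and record.peak_height < min_peak_height:
            continue
        kept.append(record)
    return kept
```

Leaving a threshold at `None` disables it, so the same function covers both the category filters and the optional peak-height and distance calibration.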
Many of the studies that report global transcription factor binding to DNA using genome-wide ChIP-X experiments also conduct global mRNA experiments after knock-down or over-expression of the transcription factors that were used in the ChIP-X studies, and profile specific histone modifications or polymerase binding using ChIP-X technologies. By combining mRNA microarrays or RNA-seq with ChIP-X transcription factor, polymerase binding and histone modification studies, we can determine which binding sites are functional, as well as which functional sites are activation or inhibition sites.
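One simple way to assign a sign to a binding site, sketched under stated assumptions, is to look at how a bound gene responds when the factor is knocked down: if expression falls, the factor is taken to activate the gene, and if expression rises, to inhibit it. The function name, the log fold-change input and the threshold are assumptions, and the rule ignores indirect effects.

```python
def signed_edges(factor, bound_genes, knockdown_log_fold_change, threshold=1.0):
    """Build signed, directed factor-to-gene edges from binding plus expression.

    `knockdown_log_fold_change` maps gene -> log fold change after knock-down
    of `factor` (assumed input). Bound genes without a clear expression change
    are skipped, i.e. the binding is not called functional.
    """
    edges = []
    for gene in sorted(bound_genes):
        change = knockdown_log_fold_change.get(gene)
        if change is None or abs(change) < threshold:
            continue
        # Expression drops when the factor is removed: the factor activates it.
        sign = "activation" if change < 0 else "inhibition"
        edges.append((factor, gene, sign))
    return edges
```

For over-expression data the sign rule would be reversed, since a gene that rises when the factor is over-expressed is a candidate for activation.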
By combining expression data with ChIP-X, we should be able to obtain a signed and directed network, which is desirable for understanding pathways, improving enrichment analyses and performing dynamical simulations. The ChIP-X database and the ChEA web-based software tool were generated using code from our previous work on developing a kinase-substrate database and software system for kinase enrichment analysis (KEA) (Lachmann and Ma'ayan). These two software systems can potentially be combined.
Since we know the group of transcription factors that regulate genes based on changes at the mRNA level under a certain experimental condition, or in a specific disease based on tissue expression profiling, we can use known protein-protein interactions to build a sub-network that connects these transcription factors. Then, we can use this sub-network as input for KEA to obtain the protein kinases and pathways that most likely regulate the transcription factor centered sub-network (Bromberg et al.).
Such an approach can be used to understand cell regulation at the cell signaling network level given mRNA expression profiling data, and to suggest kinases as drug targets for kinase inhibitors (Ma'ayan and He). In summary, the ChIP-X database and the ChEA software provide an alternative way for researchers to analyze mRNA expression data in the context of genome-wide transcription factor ChIP-X experiments, collected and organized into a prior-knowledge database and an interactive web-based software system.
Transcription factors include a wide number of proteins, excluding RNA polymerase, that initiate and regulate the transcription of genes.
One distinct feature of transcription factors is that they have DNA-binding domains that give them the ability to bind to specific sequences of DNA called enhancer or promoter sequences. Some transcription factors bind to a DNA promoter sequence near the transcription start site and help form the transcription initiation complex. Other transcription factors bind to regulatory sequences, such as enhancer sequences, and can either stimulate or repress transcription of the related gene.