News

[2016/12/20 - V2.2]
ChemDIS V2.2
[2016/11/11 - V2.1]
ChemDIS V2.1
[2016/02/17 - V2.0]
New design & customized analysis

Contact

Chun-Wei Tung Ph.D.
Assistant Professor
School of Pharmacy & PhD Program in Toxicology,
Kaohsiung Medical University
E-mail: cwtung@kmu.edu.tw
Webpage: http://cwtung.kmu.edu.tw

Welcome to ChemDIS

Version 2.0 | Last update: Feb 18 2016

ChemDIS is a chemical-disease inference system based on chemical-protein interactions. By integrating chemical-protein interactions with protein-disease interactions, the diseases associated with a given chemical can be inferred from the chemical-protein-disease relationship. ChemDIS provides enrichment analysis tools for identifying chemical-associated Gene Ontology (GO) terms, pathways (KEGG and Reactome) and Disease Ontology terms (DO and DOLite) based on a hypergeometric distribution. The chemical-protein interactions in humans were retrieved from the STITCH database v4.0 and v3.1. ChemDIS is expected to be useful for assessing potential risks associated with environmental chemicals. Our previous study on maleic acid provides a case study of chemical-protein-disease inference (Lin et al., 2014).
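
The chemical-protein-disease relationship can be pictured as joining two lookup tables. The following is a minimal sketch of that idea only, not ChemDIS code: the dictionaries, the function name `candidate_diseases` and all chemical, protein and disease identifiers are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy data: placeholder names, not real STITCH or DO records.
chemical_to_proteins = {"CHEMICAL_X": ["PROTEIN_A", "PROTEIN_B", "PROTEIN_C"]}
protein_to_diseases = {
    "PROTEIN_A": ["DISEASE_1"],
    "PROTEIN_B": ["DISEASE_1", "DISEASE_2"],
}

def candidate_diseases(chemical):
    """Map each candidate disease to the interacting proteins that link it to the chemical."""
    support = defaultdict(set)
    for protein in chemical_to_proteins.get(chemical, []):
        # Proteins without disease annotations contribute nothing.
        for disease in protein_to_diseases.get(protein, []):
            support[disease].add(protein)
    return dict(support)

links = candidate_diseases("CHEMICAL_X")
for disease in sorted(links):
    print(disease, sorted(links[disease]))
```

The sketch only collects candidate diseases and their supporting proteins. ChemDIS itself judges the associations with a hypergeometric enrichment test rather than with raw counts.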

Citation

C.W. Tung* (2015) ChemDIS: a chemical-disease inference system based on chemical-protein interactions, Journal of Cheminformatics, 7(1), 25.
Y.C. Lin, C.C. Wang and C.W. Tung* (2014) An in silico toxicogenomics approach for inferring potential diseases associated with maleic acid, Chemico-Biological Interactions, 223(5), 38-44.



History

[2016/02/17 - V2.0] New design & customized analysis
[2015/04/13 - V1.1] Improve UI & bug fix
[2015/01/06 - V1.0] ChemDIS-online
[2014/11/13 - V0.9] ChemDIS-begin

What is the score

The Score in the "Search" form and the Score in the "Protein" tab show the confidence of a chemical-protein interaction as defined by the STITCH database. STITCH uses a Bayesian scoring scheme to calculate a combined score from different sources of evidence: experiments, databases and text mining. Scores range from 0 (lowest) to 1 (highest).

  • Table 1. The levels and corresponding thresholds for selecting chemical-protein interactions for inference.
    Level           Score
    High            >=0.7
    Medium          >=0.4
    Low (Default)   >=0.15
  • The default level used to extract chemical-protein interactions is "Low" (Score >= 0.15), so that all interactions are used for inference.
  • Users can change the level to "Medium" or "High" so that fewer interactions are included in the analysis; high-confidence interactions may provide a more reliable disease inference.
  • For poorly characterized chemicals with only a few available interactions, the default "Low" level is suggested because of the limited information available for inference.
  • For relatively well-studied chemicals, the "High" level is suggested for examining more reliable inferences and generating hypotheses that are more likely to be validated experimentally, while the "Low" level may be used to explore novel hypotheses for further study.
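
As an illustration of how the levels in Table 1 act as cut-offs, here is a minimal sketch. The thresholds are taken from Table 1; the function name `select_interactions`, the data layout and the example proteins and scores are hypothetical and do not come from ChemDIS or STITCH.

```python
# Thresholds copied from Table 1; everything else is illustrative.
LEVEL_THRESHOLDS = {"High": 0.7, "Medium": 0.4, "Low": 0.15}

def select_interactions(interactions, level="Low"):
    """Keep (protein, score) pairs whose score is >= the threshold of the level."""
    threshold = LEVEL_THRESHOLDS[level]
    return [(protein, score) for protein, score in interactions if score >= threshold]

# Hypothetical interactions for one chemical (placeholder protein names).
example = [("PROTEIN_A", 0.92), ("PROTEIN_B", 0.55), ("PROTEIN_C", 0.18)]

for level in ("Low", "Medium", "High"):
    kept = select_interactions(example, level)
    print(level, [protein for protein, _ in kept])
```

With these made-up scores, "Low" keeps all three proteins, "Medium" keeps two and "High" keeps one, which mirrors the trade-off between coverage and confidence described above.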

How to interpret the results

After you submit a query, the results are shown below the search tools. Six tabs can be browsed: [Basic], [Protein], [GO], [Pathway], [Disease (DO)] and [Disease (DOLite)].

  • [Basic information] Basic information includes the chemical 2D structure, hydrogen-bond acceptor count, hydrogen-bond donor count, IUPAC name, InChI, InChIKey, molecular formula, molecular weight, canonical SMILES, isomeric SMILES and topological polar surface area (TPSA), as shown in the following figure.
  • [Protein] The Protein tab lists the proteins that interact with a given chemical. The Ensembl protein ID, Gene Symbol, Description and Score are provided with links to the Ensembl and NCBI Gene databases, as shown in the following figure. For the score, please refer to the section 'What is the score' for details.

    1. In the top left, users can choose how many entries to show per page.
    2. In the top right, users can type keywords to search any field in the table.
    3. Protein, Gene Symbol and Description are searchable by typing keywords in the corresponding fields of the table's header row.
    4. To filter proteins by score, a slider can be adjusted to show proteins within a range of scores. Please note that this filtering DOES NOT affect the analysis results. To run the analysis with a different score threshold, please use the search options described in the section "How to perform a search".
    5. All table headers are clickable for sorting.
    6. In the bottom right, the buttons are clickable for pagination.
    7. In the bottom left, the results can be downloaded by clicking the 'Download' link. The results are saved as a tab-delimited file that can be viewed with software such as Notepad, Notepad++, Excel or LibreOffice.
  • [GO] The GO tab provides enrichment analysis results using hypergeometric tests with Benjamini-Hochberg correction for multiple testing. Enriched terms with a corrected p-value < 0.05 are reported.
    1. In the top left, users can choose how many entries to show per page.
    2. In the top right, users can type keywords to search any field in the table.
    3. Protein, Gene Symbol and Description are searchable by typing keywords in the corresponding fields of the table's header row.
    4. To filter proteins by score, a slider can be adjusted to show proteins within a range of scores. Please note that this filtering DOES NOT affect the analysis results. To run the analysis with a different score threshold, please use the search options described in the section "How to perform a search".
    5. All table headers are clickable for sorting.
    6. In the 'Type' column, Molecular Function, Cellular Component and Biological Process are abbreviated as MF, CC and BP, respectively.
    7. The 'ID' column is clickable and links to the external QuickGO database.
    8. In the 'Genes' column, if too many genes are associated with a term, only part of the gene list is shown, with a '[+]' link that can be clicked to show the full list. Clicking the '[-]' link returns the cell to its original form with only part of the gene list.
    9. In the bottom right, the buttons are clickable for pagination.
    10. In the bottom left, the results can be downloaded by clicking the 'Download' link. The results are saved as a tab-delimited file that can be viewed with software such as Notepad, Notepad++, Excel or LibreOffice.
    11. Please refer to [What are the columns in the result tables] for detailed information on the columns.
  • [Pathway] The Pathway tab provides enrichment analysis results using hypergeometric tests with Benjamini-Hochberg correction for multiple testing. Enriched terms with a corrected p-value < 0.05 are reported.
    1. In the top left, users can choose how many entries to show per page.
    2. In the top right, users can type keywords to search any field in the table.
    3. Protein, Gene Symbol and Description are searchable by typing keywords in the corresponding fields of the table's header row.
    4. To filter proteins by score, a slider can be adjusted to show proteins within a range of scores. Please note that this filtering DOES NOT affect the analysis results. To run the analysis with a different score threshold, please use the search options described in the section "How to perform a search".
    5. All table headers are clickable for sorting.
    6. In the 'Type' column, KEGG and Reactome pathways are abbreviated as KEGG and RACT, respectively.
    7. The 'ID' column is clickable and links to the external KEGG and Reactome databases.
    8. In the 'Genes' column, if too many genes are associated with a term, only part of the gene list is shown, with a '[+]' link that can be clicked to show the full list. Clicking the '[-]' link returns the cell to its original form with only part of the gene list.
    9. In the bottom right, the buttons are clickable for pagination.
    10. In the bottom left, the results can be downloaded by clicking the 'Download' link. The results are saved as a tab-delimited file that can be viewed with software such as Notepad, Notepad++, Excel or LibreOffice.
    11. Please refer to [What are the columns in the result tables] for detailed information on the columns.
  • [Disease (DO)] The Disease (DO) tab provides enrichment analysis results using hypergeometric tests with Benjamini-Hochberg correction for multiple testing. Enriched terms with a corrected p-value < 0.05 are reported.
    1. In the top left, users can choose how many entries to show per page.
    2. In the top right, users can type keywords to search any field in the table.
    3. Protein, Gene Symbol and Description are searchable by typing keywords in the corresponding fields of the table's header row.
    4. To filter proteins by score, a slider can be adjusted to show proteins within a range of scores. Please note that this filtering DOES NOT affect the analysis results. To run the analysis with a different score threshold, please use the search options described in the section "How to perform a search".
    5. All table headers are clickable for sorting.
    6. The 'ID' column is clickable and links to the external Disease Ontology database.
    7. In the 'Genes' column, if too many genes are associated with a term, only part of the gene list is shown, with a '[+]' link that can be clicked to show the full list. Clicking the '[-]' link returns the cell to its original form with only part of the gene list.
    8. In the bottom right, the buttons are clickable for pagination.
    9. In the bottom left, the results can be downloaded by clicking the 'Download' link. The results are saved as a tab-delimited file that can be viewed with software such as Notepad, Notepad++, Excel or LibreOffice.
    10. Please refer to [What are the columns in the result tables] for detailed information on the columns.
  • [Disease (DOLite)] The Disease (DOLite) tab provides enrichment analysis results using hypergeometric tests with Benjamini-Hochberg correction for multiple testing. Enriched terms with a corrected p-value < 0.05 are reported.
    1. In the top left, users can choose how many entries to show per page.
    2. In the top right, users can type keywords to search any field in the table.
    3. Protein, Gene Symbol and Description are searchable by typing keywords in the corresponding fields of the table's header row.
    4. To filter proteins by score, a slider can be adjusted to show proteins within a range of scores. Please note that this filtering DOES NOT affect the analysis results. To run the analysis with a different score threshold, please use the search options described in the section "How to perform a search".
    5. All table headers are clickable for sorting.
    6. The 'ID' column is clickable and links to the external Disease Ontology Lite database.
    7. In the 'Genes' column, if too many genes are associated with a term, only part of the gene list is shown, with a '[+]' link that can be clicked to show the full list. Clicking the '[-]' link returns the cell to its original form with only part of the gene list.
    8. In the bottom right, the buttons are clickable for pagination.
    9. In the bottom left, the results can be downloaded by clicking the 'Download' link. The results are saved as a tab-delimited file that can be viewed with software such as Notepad, Notepad++, Excel or LibreOffice.
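
The Benjamini-Hochberg correction and the 0.05 cut-off used by the enrichment tabs can be sketched as follows. This is a plain implementation of the standard procedure for illustration, not the code that ChemDIS runs; the function name `benjamini_hochberg`, the term names and the raw p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four placeholder terms.
raw = {"TERM_1": 0.001, "TERM_2": 0.02, "TERM_3": 0.03, "TERM_4": 0.5}
adjusted = benjamini_hochberg(list(raw.values()))

# Keep terms whose corrected p-value is below 0.05, as the tabs do.
significant = [term for term, p_adj in zip(raw, adjusted) if p_adj < 0.05]
print(significant)
```

Because the correction depends on how many terms were tested, a term with the same raw p-value can pass the cut-off in one analysis and fail it in another.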

What are the columns in the result tables

For the analysis of GO, pathway and disease ontology, the meaning of each column in the result tables is given as follows.

  1. [Gene Ratio]: The ratio between the number of interacting genes associated with a term (a DO term, for example) and the number of interacting genes mapped to any term of that type.
  2. [Bg Ratio]: The background ratio is the ratio between the number of genes associated with a term (a DO term, for example) and the number of all genes annotated with terms of that type.
  3. [P]: The p-value calculated by the hypergeometric test without multiple test correction.
  4. [Adj. P]: The adjusted p-value calculated by the hypergeometric test with multiple test correction (Benjamini-Hochberg, as stated for each tab in 'How to interpret the results').
  5. [Q]: The q-value is a measure of the false discovery rate.
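
To make the columns concrete, the sketch below computes a Gene Ratio, a Bg Ratio and an uncorrected p-value for a single term. It assumes the usual one-sided over-representation form of the hypergeometric test, P(X >= k); the function name `hypergeom_pvalue` and all counts are hypothetical and are not taken from ChemDIS results.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible combinations add nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Hypothetical counts for one term.
k = 3      # interacting genes associated with the term
n = 10     # interacting genes mapped to any term
K = 50     # background genes associated with the term
N = 1000   # background genes mapped to any term

print("Gene Ratio:", f"{k}/{n}")
print("Bg Ratio:", f"{K}/{N}")
print("P:", hypergeom_pvalue(k, n, K, N))
```

A Gene Ratio well above the Bg Ratio, as in this made-up case (3/10 against 50/1000), is what produces a small p-value.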

The analysis pipeline of ChemDIS

ChemDIS integrates several databases and packages for analyzing enriched terms. The analysis pipeline is as follows.

  1. The basic information of the chemical, extracted from PubChem, is shown.
  2. Given a chemical, the interacting proteins with scores greater than or equal to the user-selected threshold are first extracted from the STITCH database.
  3. Enriched Gene Ontology (GO) terms are identified by a hypergeometric test using clusterProfiler.
  4. For pathway analysis, clusterProfiler and ReactomePA are used to analyze enriched KEGG and Reactome pathways.
  5. The DOSE package is used to analyze enriched Disease Ontology (DO) and DOLite terms.

How to cite ChemDIS

If you find ChemDIS useful, please cite: C.-W. Tung (2015) ChemDIS: a chemical-disease inference system based on chemical-protein interactions, Journal of Cheminformatics, 7(1), 25.

