Pipeline to construct RLEdb
Constructing the rate-limiting enzyme regulation database for human, rat, mouse, yeast and E. coli involves four main steps: curating rate-limiting enzymes from the published literature; mapping enzymes to genes and proteins to confirm that each enzyme exists; curating regulatory information from the published literature; and automatic annotation and updating from IntEnz, KEGG, UniProt, Entrez Gene and other related resources.
Curation of rate-limiting enzymes from literature:
Curation of rate-limiting enzymes involves five steps before final submission to RLEdb: an exhaustive search of Entrez for relevant abstracts based on all enzyme codes and names; grouping the downloaded abstracts by topic; curating the descriptions of rate-limiting enzymes; mapping each enzyme name to the correct enzyme code; and recording the organism information. Because the enzyme name and organism information are curated separately, there are multiple opportunities to verify the accuracy of the final information.
Gene and protein mapping:
Gene and protein mapping is performed with the KEGG LIGAND and IntEnz databases after rate-limiting enzymes have been collected for the five model organisms. This step has two aims: to validate the existence of the reported rate-limiting enzymes, and to obtain mapped references for the rate-limiting enzyme genes and so prepare the literature for collecting regulatory information.
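The validation aim above can be illustrated with a minimal sketch. It assumes the KEGG/IntEnz mappings have already been exported to a simple lookup table; the table `EC_TO_GENES`, its contents and the function `map_enzymes` are hypothetical names used only for illustration, not part of the RLEdb pipeline itself.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (organism, EC number)

# Hypothetical stand-in for an EC-to-gene table exported from KEGG/IntEnz.
# The entries are illustrative only.
EC_TO_GENES: Dict[Key, List[str]] = {
    ("human", "2.7.1.11"): ["PFKM", "PFKL", "PFKP"],
    ("yeast", "2.7.1.11"): ["PFK1", "PFK2"],
}


def map_enzymes(curated: List[Key], ec_to_genes: Dict[Key, List[str]]):
    """Split curated (organism, EC) pairs into mapped and unmapped entries.

    An enzyme with at least one gene is treated as confirmed to exist in
    that organism; the rest are returned for manual review.
    """
    mapped: Dict[Key, List[str]] = {}
    unmapped: List[Key] = []
    for key in curated:
        genes = ec_to_genes.get(key, [])
        if genes:
            mapped[key] = list(genes)
        else:
            unmapped.append(key)
    return mapped, unmapped
```

Entries that come back unmapped would be candidates for re-checking the curated enzyme code or organism rather than for automatic rejection.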
Regulatory information collection:
Three main types of regulatory information are collected: upstream transcription factors, phosphorylation regulation and inhibitors. Each enzyme may have more than one regulation record, and each regulation record comes from a separate piece of evidence in the literature or a database.
Transcription-relevant regulatory information:
Although there are many transcription factor databases and enzyme databases, the relations between transcription factors and their target enzymes have not been systematically collected. As it has been assumed that the enzymes with the lowest velocities are regulatory, we focus on reported rate-limiting enzymes and collect their upstream transcription factors. Only experimentally validated transcription factors are recorded with the regulation type "transcription factor". Regulation records from computational promoter analysis or transcription factor binding site analysis are assigned only "transcription level". "interact with TF" is assigned to interaction pairs of rate-limiting enzymes and transcription factors from high-throughput protein-protein interaction data (Data source: 16189514, 16169070, 17353931, 16429126, 11799066, 15782160, 11805837, 10688190). In practice, for each enzyme, we search Entrez using an expression such as (HNF4alpha or hepatocyte nuclear factor or peroxisome proliferator-activated receptor gamma or PPARgamma peroxisome proliferator-activated receptor alpha or PPARalpha or hepatocyte nuclear factor 1 alpha or HNF1alpha or TNF or myc or myb or CREBP or CEBP or CEBP or ChEBP or REST or Sp1 or Sp3 or TGFbeta or NF-kappaB or C/EBPbeta or C/EBPalpha or enhancer binding proteins beta or PGC-1alpha or Nuclear receptor or STAT1 or USF or jun). The transcription factors listed are common in the regulation of metabolic networks.
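As a rough illustration of the search and labelling rules above, the sketch below builds a search term for one enzyme and maps an evidence type to one of the three regulation labels. Several points are assumptions: combining the enzyme name with the factor list by AND (the text does not say how the two are joined), the use of the NCBI E-utilities `esearch` endpoint with `db=pubmed`, and the names `TF_TERMS`, `build_term`, `esearch_url` and `label_transcription_record`, which are hypothetical. No request is sent; only the URL is constructed.

```python
from urllib.parse import urlencode

# A few factor names taken from the expression quoted above; the full
# list would be used in practice.
TF_TERMS = ["HNF4alpha", "hepatocyte nuclear factor", "PPARgamma", "Sp1", "NF-kappaB"]

# NCBI E-utilities search endpoint (assumed here as the way to query Entrez).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(enzyme_name, tf_terms):
    """Join the factor names with OR and attach the enzyme name.

    Multi-word names are quoted. Joining enzyme and factors with AND is
    an assumption of this sketch.
    """
    factors = " OR ".join(f'"{t}"' if " " in t else t for t in tf_terms)
    return f'"{enzyme_name}" AND ({factors})'


def esearch_url(term, retmax=100):
    """Return the esearch URL for a term without sending any request."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})


def label_transcription_record(evidence):
    """Map an evidence type (hypothetical keys) to the regulation label."""
    labels = {
        "experimental": "transcription factor",
        "computational": "transcription level",
        "high_throughput_ppi": "interact with TF",
    }
    return labels[evidence]
```

Keeping the label rule in a single lookup makes it easy to check that a computational prediction is never recorded as "transcription factor".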
Phosphorylation or post-translational modification relevant regulatory information:
For phosphorylation regulation, we first collect descriptions of phosphorylation, post-translational regulation, post-translational modification or reversible covalent regulation from the published literature using the keywords "phosphorylation", "post-translational regulation", "post-translational modification" and "reversible covalent regulation". Phosphorylation information from UniProt and PhosSitedb is also integrated into our database during the annotation stage to complement the information from the literature. If an enzyme is reported to be regulated by phosphorylation, or a protein has an experimentally validated phosphorylation site in a database, we assign "phosphorylation" to the regulation record. If only "post-translational modification" or "reversible covalent modification" is reported, we assign the regulation information as "post-translational modification".
Inhibitor-relevant regulatory information:
Feedback and allosteric regulation are the two main short-term regulatory mechanisms. We do not search exhaustively for these two types of information; if they are mentioned in the literature we read, we extract them. All enzyme inhibitor information was extracted from the BRENDA database (version 7.1). BRENDA records organism-specific inhibitors under a given EC code. A semi-automatic method similar to that described in the previous study was used to convert free-text inhibitor information to KEGG compound identifiers. For each enzyme, if an inhibitor description from BRENDA exactly matched a KEGG compound name, we assigned that KEGG compound to the description. We then grouped all assigned KEGG compounds by their KEGG compound ID and checked all mapping results manually. However, many man-made inhibitors, such as EDTA, cannot be produced in vivo.
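The exact-name matching and grouping step described above might look like the following sketch. The table `KEGG_NAMES` is a hypothetical excerpt of a KEGG compound name list, `map_inhibitors` is an illustrative name, and ignoring case and surrounding whitespace when matching is an assumption of this sketch. Descriptions that do not match are returned separately, since they would need the manual check.

```python
from collections import defaultdict

# Hypothetical excerpt of a KEGG compound name table (lower-cased name -> ID).
KEGG_NAMES = {
    "atp": "C00002",
    "adenosine 5'-triphosphate": "C00002",
    "citrate": "C00158",
}


def map_inhibitors(descriptions, kegg_names):
    """Group inhibitor descriptions by KEGG compound ID using exact matches.

    Matching ignores case and surrounding whitespace (an assumption);
    anything else is left unmatched for manual curation.
    """
    by_compound = defaultdict(list)
    unmatched = []
    for text in descriptions:
        compound_id = kegg_names.get(text.strip().lower())
        if compound_id is None:
            unmatched.append(text)
        else:
            by_compound[compound_id].append(text)
    return dict(by_compound), unmatched
```

Grouping by compound ID means synonyms such as "ATP" and "Adenosine 5'-triphosphate" end up under one entry, which keeps the manual review to one check per compound.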
We therefore selected, for each organism, only those inhibitors that are in vivo enzyme products of that organism. Although some inhibitors were enzyme products, they inhibited only other proteins, not metabolic enzymes. We also excluded such inhibitors from the final dataset because they have no inhibiting effect in the metabolic network.
Other regulatory information:
"Key enzyme" and "regulatory enzyme" are concepts very close to "rate-limiting enzyme" and are used to describe essential enzymes in a metabolic pathway. Weber (1974) introduced the term "key enzyme" for pathways and listed several features of such enzymes, including low activity, being rate-limiting, catalysis of irreversible reactions, and allosteric regulation by inhibitors or feedback. The concept of "regulatory enzyme" emphasizes the influence of effectors. Although these two concepts lack an exact definition, descriptions relevant to them provide regulatory information in certain cases. Although direct regulators are difficult to discover, upstream signal transduction pathways are often described in studies of enzyme regulation; we record such information as "signal pathway".
For each enzyme, regulatory information is first extracted from GeneRIF annotations. More extensive searches for upstream transcription factors and reversible phosphorylation regulation are then performed using Entrez as described above. For any abstract from GeneRIF or an Entrez search, we first confirm the enzyme and organism information. Rigorous control of enzyme name and organism information during curation ensures precise regulatory information. After confirming the enzyme information, the article is read to extract classes of information including:
Automatic annotation and update from IntEnz, KEGG, UniProt, Entrez Gene and other resources:
We will maintain and update RLEdb regularly as more data and information become available. In addition, an automatic pipeline for database annotation and updating was constructed to integrate a pathway-centric set of databases, including IntEnz and KEGG/LIGAND, together with UniProt, Entrez Gene, NCBI Taxonomy and the Gene Ontology.
Copyright 2009, Center for Bioinformatics
Last Modified: 2009-03-24
Design by Zhao Min