
Current proteins in Rhea

2009-06-10
2013-01-02
  • Rafael Alcántara

    Dear all,

    Currently we have 145 proteins annotated as such (<protein/> tag) in Rhea as reaction participants. 21 of them already have a ChEBI ID.

     
    • pmatos

      pmatos - 2009-08-05

      Hello,

      At ICBO, Darren Natale of PRO had an additional point to make with the proposed resolution of the ChEBI-PRO overlap. Please see his message below.

      I agree with Darren that residues (whether or not they are parts of proteins) belong in ChEBI as molecule parts. These should be defined classes within ChEBI. We can even have an upper level term 'protein part', and still remain within our scope.

      Any comments?

      Thanks,
      Janna

      > -----Original Message-----

      > From: Darren Natale [mailto:dan5@georgetown.edu]

      > Sent: 26 July 2009 20:01

      > To: hastings@ebi.ac.uk; BatchelorC@rsc.org

      > Subject: our discussion re: PRO, ChEBI, Rhea

      >

      > I was thinking of making a kind of announcement to our users and others

      > regarding our discussion.  Here is what I've crafted.  I'd like to get

      > your comments and/or indicate your agreement/disagreement.  However, I

      > have a concern about the consequence of what we've decided.  The

      > consequence is that a given user might have to sometimes refer to ChEBI

      > and other times refer to PRO, and that it might not always be clear to

      > them.  Please also address this in your comments.

      >

      > ====

      >

      > Representatives from PRO (Darren Natale), ChEBI (Janna Hastings), and SO

      > (Colin Batchelor) conferred about the issue of protein terms in ChEBI,

      > in particular those terms that refer to specific amino acids.  One

      > outcome from earlier discussions between PRO (DN) and ChEBI (JH) was

      > that ChEBI would not have protein terms any longer, and that users

      > requiring such terms (mainly RhEA) would make requests to PRO (to which

      > RhEA agreed).  However, further discussion between PRO (DN) and RhEA

      > (Anne Morgat) revealed that in many cases the important aspect of many

      > of the needed terms was in fact not the protein at all, but rather some

      > particular amino acid attached to a protein (for example

      > “[protein]-L-tyrosine” getting converted to [protein]-L-tyrosine

      > phosphate).  While it is perfectly reasonable for PRO to contain such a

      > term (which would be defined as a protein that has_part tyrosine), such

      > a class would be somewhat trivial, and in fact might not be in the best

      > interest of the user (who in this case really just cares about the

      > tyrosine which happens to be attached to a protein).  Considering that

      > the chemical composition and reactivity of tyrosine (free) differs from

      > tyrosine (protein), it is perfectly reasonable for ChEBI to contain both

      > kinds of tyrosine terms.  The latter would be defined as a tyrosine

      > part_of PRO:protein.  This suggestion would only impact those

      > protein-related terms where the focus is a particular amino acid, but

      > not those where the focus is the protein itself.  One example of the

      > latter is “[G-protein-coupled receptor] phosphate”.  To conclude: It is

      > the tentative opinion of the discussants that [protein]-L-tyrosine and

      > related terms (examples given below) should be ChEBI terms that are

      > built from cross-products with PRO.

      >

      > For PRO:

      >

      > [low-density-lipoprotein receptor]

      > [low-density-lipoprotein receptor] phosphate

      >

      >     focus=modification of the protein

      >

      > For ChEBI:

      > [low-density-lipoprotein receptor]-L-serine

      > [low-density-lipoprotein receptor]-O-phospho-L-serine

      >

      >     focus=modification of amino acid in protein

       
    • pmatos

      pmatos - 2009-08-05

      Hi everybody,

      Thanks for your mail Darren. We CC "caffeine" because I guess this discussion concerns all
      people working for RhEA (sorry to the others!). To Paula, Rafael: Maybe we have to define a reduced mailing list ;-) ?

      > However, after you and I talked and your needs were a bit clearer to me, I thought a bit further about it and it seems that there are some "protein" terms used by RhEA that really aren't about the protein at all, but rather about some part of the protein ([protein]-L-tyrosine => [protein]-L-tyrosine phosphate is one example; the change occurs to a tyrosine that happens to be within a protein instead of to a free tyrosine).
      > So, when the RhEAction is about the side chain of a residue, use ChEBI, but when the RhEAction is about a protein (for example, [beta-adrenergic receptor] => [beta-adrenergic receptor]phosphate) you would use PRO directly. 
      > ChEBI and PRO will avoid overlap anyway because ChEBI will define their protein-related terms using PRO.
      >

      We are not sure if we understand correctly... You propose that ChEBI defines protein-related terms (using PRO) [for all reactions involving proteins]?
      Kristian and I agree!
      We think it's really important to have ALL these entities in ChEBI. RhEA is built on ChEBI and all the reaction participants have to be in ChEBI.

      Let us have a look at the following example EC 2.5.1.46, deoxyhypusine synthase
      Comment in the IUBMB entry:
      > The eukaryotic initiation factor eIF5A contains a hypusine residue that is essential for activity. This enzyme catalyses the first reaction of hypusine formation from one specific lysine residue of the eIF5A precursor. The reaction occurs in four steps: NAD+-dependent dehydrogenation of spermidine (1a), formation of an enzyme-imine intermediate by transfer of the 4-aminobutylidene group from dehydrospermidine to the active site lysine residue (Lys329 for the human enzyme; 1b), transfer of the same 4-aminobutylidene group from the enzyme intermediate to the eIF5A precursor (1c), reduction of the eIF5A-imine intermediate to form a deoxyhypusine residue (1d). Hence the overall reaction is transfer of a 4-aminobutyl group. For the plant enzyme, homospermidine can substitute for spermidine and putrescine can substitute for the lysine residue of the eIF5A precursor.
      >
      > Global reaction:  [eIF5A-precursor]-lysine + spermidine = [eIF5A-precursor]-deoxyhypusine + propane-1,3-diamine
      >     *   (1a) spermidine + NAD+ = dehydrospermidine + NADH
      >     *   (1b) dehydrospermidine + [enzyme]-lysine = N-(4-aminobutylidene)-[enzyme]-lysine + propane-1,3-diamine
      >     *   (1c) N-(4-aminobutylidene)-[enzyme]-lysine + [eIF5A-precursor]-lysine = N-(4-aminobutylidene)-[eIF5A-precursor]-lysine + [enzyme]-lysine
      >     *   (1d) N-(4-aminobutylidene)-[eIF5A-precursor]-lysine + NADH + H+ = [eIF5A-precursor]-deoxyhypusine + NAD+
      >    

      In this case, [enzyme]-lysine is a lysine residue of deoxyhypusine synthase, but there is no need to store that information.
      We only need to refer to it as lysine residue [CHEBI:32568], and in addition we need a ChEBI entity to represent the "N-(4-aminobutylidene)-[enzyme]-lysine" residue, both with a formula to enable balancing of the reaction.

      Concerning eIF5A eukaryotic translation initiation factor 5A:
      We need [eIF5A-precursor]-lysine, N-(4-aminobutylidene)-[eIF5A-precursor]-lysine and [eIF5A-precursor]-deoxyhypusine
      To balance the reaction we need ChEBI residues
      [eIF5A-precursor]-lysine == [enzyme]-lysine == [CHEBI:32568]   
      N-(4-aminobutylidene)-[eIF5A-precursor]-lysine == N-(4-aminobutylidene)-[residue]-lysine
              [eIF5A-precursor]-deoxyhypusine == [residue]-deoxyhypusine

      ChEBI has the expertise to define chemically:  [R]-lysine, N-(4-aminobutylidene)-[R]-lysine and [R]-deoxyhypusine
      PRO/UniProt has the expertise to define protein families and PTM

      It is important to link these ChEBI residues to the target protein in PRO.

      [Darren, can you help me to identify eIF5A in PRO in order to see what kind of information is provided. Thanks! ]

      We propose that ChEBI uses the IsA relationship to create as many residues as we need, all of them linked to PRO:
      > * lysine residue [CHEBI:32568]
      > |--- [eIF5A-precursor]-lysine residue  
      > |--- [ribulose-1,5-bisphosphate carboxylase]-lysine
      > |--- etc

      In this way, only a minimal set of information has to be stored in ChEBI: a name (defined in collaboration with PRO), a cross-reference to PRO, and the structural data (formula, 2D structure, etc.) necessary for mass-conservation calculations and graphical display in RhEA.
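
The minimal record described here can be sketched as follows. This is only an illustration under stated assumptions: the class name `ResidueTerm`, the child identifier `CHEBI:hypothetical-1`, and the idea that a child term inherits its formula from its IsA parent are not from the thread, and the formula C6H12N2O is simply free lysine minus H2O rather than a value checked against ChEBI. `PRO:yyy` is the placeholder used elsewhere in this discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResidueTerm:
    """A residue term: a name, an optional PRO cross-reference, structural data."""
    chebi_id: str
    name: str
    formula: Optional[str] = None          # structural data, if stored on this term
    is_a: Optional["ResidueTerm"] = None   # parent term in the IsA hierarchy
    pro_xref: Optional[str] = None         # cross-reference to the PRO protein

    def resolved_formula(self) -> str:
        """Walk up the IsA chain until a term that carries a formula is found."""
        term = self
        while term is not None:
            if term.formula is not None:
                return term.formula
            term = term.is_a
        raise ValueError(f"no formula found for {self.chebi_id}")


# Parent term from the thread; the formula is an assumption (lysine minus H2O).
lysine_residue = ResidueTerm("CHEBI:32568", "lysine residue", formula="C6H12N2O")

# Hypothetical child term, linked to a placeholder PRO identifier.
eif5a_lysine = ResidueTerm(
    "CHEBI:hypothetical-1",
    "[eIF5A-precursor]-lysine residue",
    is_a=lysine_residue,
    pro_xref="PRO:yyy",
)

print(eif5a_lysine.resolved_formula())  # C6H12N2O
```

With this layout the protein-specific child only adds a name and a PRO cross-reference; everything needed for mass balance comes from the generic parent.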

      Darren, we guess that PRO will only have [eIF5A-precursor] and [eIF5A-precursor]-deoxyhypusine, the final states of the protein?
      Or do you also plan to represent N-(4-aminobutylidene)-[eIF5A-precursor]-lysine?

      > So, I have a question.  ChEBI is structured in such a way that reactions can be balanced, because they provide chemical formulas. 
      > However, protein-related terms lack a full chemical formula.
      > I think in most cases this is okay, because the protein part is basically inert (that is, "R" on one side of the equation stays as "R" on the other side and is therefore balanced).  But, how do you balance the reactions that involve changes without specifying the residue (and thus appear to be changes made directly to proteins)?

      We will have to check case by case to identify what is happening :-(
      For example, look at the reaction catalyzed by EC 2.7.11.19:
      > ID   2.7.11.19
      > DE   Phosphorylase kinase.
      > CA   2 ATP + phosphorylase b = 2 ADP + phosphorylase a.
      >
      > CC   -!- The enzyme phosphorylates a specific serine residue in each of the
      > CC       subunits of the dimeric phosphorylase b.
      > CC   -!- For muscle phosphorylase but not liver phosphorylase, this is
      > CC       accompanied by a further dimerization to form a tetrameric
      > CC       phosphorylase.
      > CC   -!- The enzyme couples muscle contraction with energy production via
      > CC       glycogenolysis--glycolysis by catalyzing the Ca(2+)-dependent
      > CC       phosphorylation and activation of glycogen phosphorylase b.
      > CC   -!- The gamma subunit of the tetrameric alpha-beta-gamma-delta enzyme is
      > CC       the catalytic subunit.
      >
      In this case, we will have to check what [phosphorylase b] and [phosphorylase a] exactly are and possibly translate [phosphorylase b] into [phosphorylase b]-serine and [phosphorylase a] into [phosphorylase b]-phosphoserine.
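
A minimal sketch of the balancing check under discussion, assuming the protein chain is treated as an inert placeholder "R" and that a residue formula is the free amino acid minus H2O. The formulas below (neutral ATP and ADP, serine residue, phosphoserine residue) and all function names are illustrative assumptions, not values or code taken from ChEBI or RhEA.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C3H5NO2R' (no brackets, no charges).

    'R' is not an element; here it stands for the unspecified protein chain.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def side_total(side):
    """Sum atom counts over one side, given as (coefficient, formula) pairs."""
    total = Counter()
    for coefficient, formula in side:
        for symbol, n in parse_formula(formula).items():
            total[symbol] += coefficient * n
    return total


def is_balanced(left, right):
    """True when both sides carry the same atoms, including the R placeholder."""
    return side_total(left) == side_total(right)


# Assumed neutral formulas, for illustration only.
ATP = "C10H16N5O13P3"
ADP = "C10H15N5O10P2"
SERINE_RESIDUE = "C3H5NO2R"          # [phosphorylase b]-serine
PHOSPHOSERINE_RESIDUE = "C3H6NO5PR"  # [phosphorylase b]-phosphoserine

left = [(1, SERINE_RESIDUE), (1, ATP)]
right = [(1, PHOSPHOSERINE_RESIDUE), (1, ADP)]
print(is_balanced(left, right))  # True
```

Because "R" is counted like any other symbol, a reaction that loses or duplicates the protein on one side is reported as unbalanced, which matches the idea that "R" must stay "R" on both sides.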

      So there is a lot of work ahead for all of us (ChEBI, PRO, RhEA and X for nucleic acids...).

      Cheers,

      Anne and Kristian

      >
      >
      >
      > On 30 Jul 09 at 17:07, Darren Natale wrote:
      >
      >> Hi Anne,
      >>
      >> The outcome of the discussion with my PRO colleagues is that PRO will include the types of terms that will be useful for RhEA.
      >
      >>  However, after you and I talked and your needs were a bit clearer to me, I thought a bit further about it and it seems that there are some "protein" terms
      >> used by RhEA that really aren't about the protein at all, but rather about some part of the protein ([protein]-L-tyrosine => [protein]-L-tyrosine phosphate is one example; the change occurs to a tyrosine that happens to be within a protein instead of to a free tyrosine).  In such cases, for the sake of balancing, it seems better to use ChEBI for such terms.  So, I talked with ChEBI people (Janna Hastings and Colin Batchelor; both cc'd here), and they agree that ChEBI could have terms like tyrosine (free) and additional terms like tyrosine residue (tyrosine within a protein chain).  Actually, they do have such terms already.  So, when the RhEAction is about the side chain of a residue, use ChEBI, but when the RhEAction is about a protein (for example, [beta-adrenergic receptor] => [beta-adrenergic receptor]phosphate) you would use PRO directly.  ChEBI and PRO will avoid overlap anyway because ChEBI will define their protein-related terms using PRO.
      >>
      >> The main idea is for PRO and ChEBI to cooperate in such a way that we provide the maximum benefit for RhEA.
      >>
      >> So, I have a question.  ChEBI is structured in such a way that reactions can be balanced, because they provide chemical formulas.  However, protein-related terms lack a full chemical formula.  I think in most cases this is okay, because the protein part is basically inert (that is, "R" on one side of the equation stays as "R" on the other side and is therefore balanced).  But, how do you balance the reactions that involve changes without specifying the residue (and thus appear to be changes made directly to proteins)?
      >>
      >> I look forward to your answer.  And please do let us know if our proposed use of ChEBI versus PRO for the different types of cases seems reasonable to you.
      >>
      >> Best regards,
      >>
      >> Darren

      * Anne Morgat
      * * * * * * * * * * * * * * * * * * * * * * * *

       
    • pmatos

      pmatos - 2009-08-05

      Sorry for the extensive deletion of examples...

      Anne Morgat wrote:
      > Hi everybody,
      >
      >> So, when the RhEAction is about the side chain of a residue, use ChEBI, but when the RhEAction is about a protein (for example, [beta-adrenergic receptor] => [beta-adrenergic receptor]phosphate) you would use PRO directly.  ChEBI and PRO will avoid overlap anyway because ChEBI will define their protein-related terms using PRO.
      >>
      >
      > We are not sure if we understand correctly... You propose that ChEBI defines protein-related terms (using PRO) [for all reactions involving proteins]?
      > * Kristian and I agree!*
      > We think it's really important to have ALL these entities in ChEBI. RhEA is built on ChEBI and all the reaction participants have to be in ChEBI.

      Yes, you understand the proposal somewhat correctly (see below), though doing so would be problematic for those protein-related terms where a residue is not indicated (like [R] => [R]phosphate) (again, see below).

      > ChEBI has the expertise to define chemically:  [R]-lysine, N-(4-aminobutylidene)-[R]-lysine and [R]-deoxyhypusine PRO/UniProt has the expertise to define protein families and PTM
      >
      > It is important to link these ChEBI residues to the target protein in PRO.
      > We propose that ChEBI plays with the IsA relationships to create as many residues as we need, all of them being linked to PRO
      >> * lysine residue [CHEBI:32568]
      >> |--- [eIF5A-precursor]-lysine residue
      >> |--- [ribulose-1,5-bisphosphate carboxylase]-lysine
      >> |--- etc
      >
      > In this way, only a minimal set of information has to be stored in ChEBI: a name (defined in collaboration with PRO), a cross-reference to PRO, and the structural data (formula, 2D structure, etc.) necessary for mass-conservation calculations and graphical display in RhEA.

      But this seems to put a heavy burden on ChEBI to create and define terms for every protein-related participant used by RhEA (even though the definitions will use PRO for the protein parts, like [ribulose-1,5-bisphosphate carboxylase]).  It seems that it would be much easier for RhEA to have your example 1c:

      N-(4-aminobutylidene)-[enzyme]-lysine + [eIF5A-precursor]-lysine => N-(4-aminobutylidene)-[eIF5A-precursor]-lysine + [enzyme]-lysine

      specified like this:

      ChEBI:xx N-(4-aminobutylidene)-[R1]-lysine + ChEBI:yy [R2]-lysine => ChEBI:xx N-(4-aminobutylidene)-[R2]-lysine + ChEBI:yy [R1]-lysine
          where R1 = PRO:xxx (deoxyhypusine synthase)
          where R2 = PRO:yyy (eIF5A-precursor)

      ChEBI then needs only to define two terms (such as the lysine-within-a-protein term) using [R] (which they already do), and PRO needs to define the two [R] terms (which we do, or at least will).
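
The [R] binding proposed here could look roughly like the sketch below. It is only an illustration: the `Participant` class and `r_groups_conserved` function are hypothetical names, and `ChEBI:xx`, `ChEBI:yy`, `PRO:xxx` and `PRO:yyy` are the placeholders from the message above, not real identifiers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """A generic ChEBI residue term whose [R] slot is bound to one PRO protein."""
    chebi_id: str   # generic term defined once in ChEBI
    template: str   # term name containing the "[R]" placeholder
    pro_id: str     # protein the residue belongs to
    pro_name: str

    def label(self):
        """Render the protein-specific name by filling in the [R] slot."""
        return self.template.replace("[R]", f"[{self.pro_name}]")


def r_groups_conserved(left, right):
    """Check that every protein on the left also appears on the right."""
    return Counter(p.pro_id for p in left) == Counter(p.pro_id for p in right)


IMINE = ("ChEBI:xx", "N-(4-aminobutylidene)-[R]-lysine")
LYSINE = ("ChEBI:yy", "[R]-lysine")
R1 = ("PRO:xxx", "deoxyhypusine synthase")
R2 = ("PRO:yyy", "eIF5A-precursor")

# Step 1c: the 4-aminobutylidene group moves from the enzyme to the eIF5A precursor.
left = [Participant(*IMINE, *R1), Participant(*LYSINE, *R2)]
right = [Participant(*IMINE, *R2), Participant(*LYSINE, *R1)]

for p in left + right:
    print(p.chebi_id, p.label())
print(r_groups_conserved(left, right))  # True
```

All four participants are built from just two ChEBI terms and two PRO terms, which is the saving this proposal is after.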

      >> ... how do you balance the reactions that involve changes without specifying the residue (and thus appear to be changes made directly to proteins)?
      >
      > We will have to check case by case to identify what is happening :-(
      > For example, look at the reaction catalyzed by EC 2.7.11.19:

      > In this case, we will have to check what [phosphorylase b] and [phosphorylase a] exactly are and possibly translate [phosphorylase b] into [phosphorylase b]-serine and [phosphorylase a] into [phosphorylase b]-phosphoserine.

      This does seem like the best solution.

      > So there is a lot of work ahead for all of us (ChEBI, PRO, RhEA and X for nucleic acids...).

      Indeed!

      >
      > Cheers,
      >
      > Anne and Kristian
      >
      >>
      >>
      >>
      >> On 30 Jul 09 at 17:07, Darren Natale wrote:
      >>
      >>> Hi Anne,
      >>>
      >>> The outcome of the discussion with my PRO colleagues is that PRO will include the types of terms that will be useful for RhEA.
      >>
      >>>  However, after you and I talked and your needs were a bit clearer to me, I thought a bit further about it and it seems that there are some "protein" terms
      >>> used by RhEA that really aren't about the protein at all, but rather about some part of the protein ([protein]-L-tyrosine => [protein]-L-tyrosine phosphate is one example; the change occurs to a tyrosine that happens to be within a protein instead of to a free tyrosine).  In such cases, for the sake of balancing, it seems better to use ChEBI for such terms.  So, I talked with ChEBI people (Janna Hastings and Colin Batchelor; both cc'd here), and they agree that ChEBI could have terms like tyrosine (free) and additional terms like tyrosine residue (tyrosine within a protein chain).  Actually, they do have such terms already.  So, when the RhEAction is about the side chain of a residue, use ChEBI, but when the RhEAction is about a protein (for example, [beta-adrenergic receptor] => [beta-adrenergic receptor]phosphate) you would use PRO directly.  ChEBI and PRO will avoid overlap anyway because ChEBI will define their protein-related terms using PRO.
      >>>
      >>> The main idea is for PRO and ChEBI to cooperate in such a way that we provide the maximum benefit for RhEA.
      >>>
      >>> So, I have a question.  ChEBI is structured in such a way that reactions can be balanced, because they provide chemical formulas.  However, protein-related terms lack a full chemical formula.  I think in most cases this is okay, because the protein part is basically inert (that is, "R" on one side of the equation stays as "R" on the other side and is therefore balanced).  But, how do you balance the reactions that involve changes without specifying the residue (and thus appear to be changes made directly to proteins)?
      >>>
      >>> I look forward to your answer.  And please do let us know if our proposed use of ChEBI versus PRO for the different types of cases seems reasonable to you.
      >>>
      >>> Best regards,
      >>>
      >>> Darren
      >

       
    • pmatos

      pmatos - 2009-08-05

      Hi all,

      I agree with Darren. It makes no sense for ChEBI to define all terms with a protein-related participant, even though it's convenient for ChEBI.

      Have we considered that we may need to allow Rhea to process complex terms, i.e. to store the cross-products within Rhea rather than in the individual ontologies?

      So if you have " [eIF5A-precursor]-lysine "

      Then you have a complex term in Rhea with a cross-product of
      CHEBI:yy (-lysine) and PRO:yyy (eIF5A-precursor).

      The chemical information can still be retrieved via ChEBI but Rhea will need to do more processing with term representation and storing these complex terms.

      I think in the long term an approach like this might be valuable for Rhea to allow capturing of more than one term.
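
As a rough sketch of what storing such cross-products on the Rhea side might involve: a complex term is just a pair of identifiers, and names are resolved from the two ontologies when needed. The `CrossProduct` type, the two lookup tables and `display_name` are hypothetical stand-ins for real ChEBI and PRO queries; the identifiers are the placeholders used in this message.

```python
from typing import NamedTuple


class CrossProduct(NamedTuple):
    """A complex term kept in Rhea: one ChEBI part crossed with one PRO protein."""
    chebi_id: str
    pro_id: str


# Stand-ins for lookups against the two ontologies (placeholder identifiers).
CHEBI_NAMES = {"CHEBI:yy": "-lysine"}
PRO_NAMES = {"PRO:yyy": "eIF5A-precursor"}


def display_name(term):
    """Compose the participant name from the two referenced terms."""
    return f"[{PRO_NAMES[term.pro_id]}]{CHEBI_NAMES[term.chebi_id]}"


term = CrossProduct("CHEBI:yy", "PRO:yyy")
print(display_name(term))  # [eIF5A-precursor]-lysine

# Tuples are hashable, so repeated cross-products collapse to one stored entry.
stored = {term, CrossProduct("CHEBI:yy", "PRO:yyy")}
print(len(stored))  # 1
```

Neither ontology has to mint a new term in this arrangement; the extra processing sits in Rhea, as the message anticipates.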

      cheers,
      P

       
