John,

Thanks. Your reply is actually very helpful. Yes, I only care whether there is an overall charge. If the whole compound is neutral, I do not need to do anything with it. The case I mentioned is from a compound database. I need to make sure that each compound is neutral before my program goes to the next step. In your last example, I believe either output would be enough for me.

My knowledge of InChI is very limited. Could you please help me verify that if an InChI has no layer starting with /p or /q, it must be neutral?

I am also confused by the following part:

My guess would be the correct way would be to order the neutralisation of charges using pKa? In which case you need a pKa predictor, again, not simple [1].

Firstly, what is pKa?

Secondly, what does [1] refer to?

Yes, I did try RDKit and got help from the developers. It looks like their solution may also have problems with multiply charged cases. I hope the number of problem cases will be reduced after I add a function that calculates the overall charge.

Again, thank you very much.

Merry Christmas and Happy New Year!

Yingfeng



On Mon, Dec 23, 2013 at 12:49 PM, John May <johnmay@ebi.ac.uk> wrote:
Hi Yingfeng,

In short, no. I don't think it's easy to provide a comprehensive solution for neutralisation. However, approximations such as the RDKit SMARTS you've tried offer a good approach for most cases.

It might be easier to help if I understood why you need to neutralise the compounds.

Anyway, I'm not a chemist but I'll try my best to explain why it's not simple. Firstly, in an InChI string you can tell that there is a charge when a layer starts with /p or /q.

InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1
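A minimal sketch of that layer check in plain Java, without CDK. The class and method names are hypothetical, and it only reports whether a /p or /q layer is present. It does not compute the net charge: an InChI with both layers (for example a salt with /q;+1/p-1) can still be neutral overall.

```java
import java.util.Arrays;

class InchiChargeLayers {
    /** Returns true if the InChI has a layer starting with "q" (charge) or "p" (protons). */
    static boolean hasChargeLayer(String inchi) {
        // Layers are separated by '/'. The first segment is the "InChI=1S" prefix.
        // The formula layer starts with an uppercase letter or digit, so it cannot
        // be mistaken for a lowercase "q" or "p" layer.
        String[] layers = inchi.trim().split("/");
        return Arrays.stream(layers)
                     .skip(1)
                     .anyMatch(l -> l.startsWith("q") || l.startsWith("p"));
    }

    public static void main(String[] args) {
        String glutamate =
            "InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1";
        System.out.println(hasChargeLayer(glutamate)); // true: contains /p-1
    }
}
```

If no such layer is found the structure has no recorded charges, so this works as a quick pre-filter before the per-atom check.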

You could also check for atoms whose formal charge is not 0. I guess what you really mean by charged is whether there is an overall charge.

C[C@@H](O)[C@H]([NH3+])C([O-])=O uncharged
C[C@@H]([O-])[C@H]([NH3+])C([O-])=O charged

Again, a simple procedure of summing all the charges in a connected structure will tell you this:

int sum = 0;
for (IAtom a : m.atoms())
    sum += a.getFormalCharge();
boolean charged = sum != 0; // a non-zero total means an overall charge
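For illustration only, the same summation can be sketched without a toolkit by reading the charges written inside SMILES bracket atoms. This is an assumption-laden shortcut, not CDK's method: the class name is hypothetical, it trusts the input to be valid SMILES, and it does not split disconnected components. A real parser such as CDK's is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SmilesNetCharge {
    // Contents of a bracket atom, e.g. "NH3+" from "[NH3+]".
    private static final Pattern BRACKET = Pattern.compile("\\[([^\\]]*)\\]");
    // A charge inside a bracket atom: repeated signs ("++") or a sign plus a number ("+2").
    private static final Pattern CHARGE = Pattern.compile("([+-]+)(\\d*)");

    /** Sums the formal charges written in the bracket atoms of a SMILES string. */
    static int netCharge(String smiles) {
        int sum = 0;
        Matcher bracket = BRACKET.matcher(smiles);
        while (bracket.find()) {
            Matcher charge = CHARGE.matcher(bracket.group(1));
            if (charge.find()) {
                String signs = charge.group(1);
                // "+2" means magnitude 2; "++" also means magnitude 2; "+" means 1.
                int magnitude = charge.group(2).isEmpty()
                        ? signs.length()
                        : Integer.parseInt(charge.group(2));
                sum += signs.charAt(0) == '+' ? magnitude : -magnitude;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(netCharge("C[C@@H](O)[C@H]([NH3+])C([O-])=O"));    // 0
        System.out.println(netCharge("C[C@@H]([O-])[C@H]([NH3+])C([O-])=O")); // -1
    }
}
```

Only characters inside square brackets are inspected, so a '-' used as a single bond outside brackets is not counted as a charge.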

As for neutralising, that's more tricky. There may be something in a dusty corner of the CDK code but I'm not aware of it. The neutralisation of one atom is easily made by adding/removing protons or breaking/making bonds. However, when there are multiple charges it is non-trivial as it involves a decision. Consider the example from earlier:

C[C@@H]([O-])[C@H]([NH3+])C([O-])=O charged

How do we decide which neutralised form is correct? Both of these have no overall charge:

C[C@@H](O)[C@H]([NH3+])C([O-])=O uncharged
C[C@@H]([O-])[C@H]([NH3+])C(O)=O uncharged

My guess would be the correct way would be to order the neutralisation of charges using pKa? In which case you need a pKa predictor, again, not simple [1].
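To make that guess concrete: pKa measures how readily a site gives up a proton, and a higher pKa means the deprotonated site is more basic and takes a proton back first. The sketch below orders the protonation of negative sites by pKa until the net charge reaches zero. Everything in it is an assumption for illustration: the class and site names are hypothetical, the pKa values are rough placeholders rather than measured or predicted numbers, and only the net-negative case is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PkaOrderedNeutraliser {
    /** A charged site: a label, its formal charge, and the pKa of its protonated form. */
    static class Site {
        final String label;
        final int charge;
        final double pKa;
        Site(String label, int charge, double pKa) {
            this.label = label;
            this.charge = charge;
            this.pKa = pKa;
        }
    }

    /** Protonates anionic sites, highest pKa first, until the net charge is no longer negative. */
    static List<String> neutralise(List<Site> sites) {
        int net = 0;
        List<Site> anions = new ArrayList<>();
        for (Site s : sites) {
            net += s.charge;
            if (s.charge < 0) anions.add(s);
        }
        // Highest pKa first: the most basic site is protonated first.
        anions.sort(Comparator.comparingDouble((Site s) -> s.pKa).reversed());
        List<String> protonated = new ArrayList<>();
        for (Site s : anions) {
            if (net >= 0) break;
            protonated.add(s.label);
            net += 1; // adding one proton raises the net charge by one
        }
        return protonated;
    }

    public static void main(String[] args) {
        // Placeholder pKa values, chosen only to show the ordering.
        List<Site> sites = Arrays.asList(
            new Site("alkoxide O-", -1, 13.0),
            new Site("ammonium NH3+", +1, 9.0),
            new Site("carboxylate O-", -1, 2.0));
        System.out.println(neutralise(sites)); // [alkoxide O-]
    }
}
```

With these placeholder values the alkoxide is protonated and the carboxylate stays charged, which corresponds to the first of the two uncharged forms above.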

Neutralisation reduces to finding the ionisation at a given pH (i.e. find the pH where the compound is neutral). ChemAxon offer this functionality, but I have been told of examples where, given two ionisation states of the same compound (one above the desired pH, one below), the tool produces different output.

Sorry I can't be of more help.

Thanks,
John

[1] Lee and Crippen, Predicting pKa, http://pubs.acs.org/doi/abs/10.1021/ci900209w

On 21 Dec 2013, at 14:05, Yingfeng Wang <ywang802@gmail.com> wrote:

I have a compound with Inchi

InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1

First off, is there a way to know whether it is charged?

Secondly, is CDK able to neutralize it if it is charged?

Thanks.

Yingfeng


_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user