From: SourceForge.net <no...@so...> - 2010-07-22 01:52:39
Bugs item #3032847, was opened at 2010-07-21 19:15
Message generated for change (Comment added) made by youjun
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1126676&aid=3032847&group_id=248804

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: data
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Kevin S. Clarke (ksclarke)
Assigned to: Mark Dominus (mjdominus)
Summary: OAI provider returns invalid XML for 12 records

Initial Comment:
Twelve of the oai_dc records returned from the OAI provider are not valid XML because they contain a Unicode character (0x1a) that is invalid for XML. The twelve records are:

TreeBASE.org/study/TB2:s955 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1119 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1226 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1641 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1731 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1779 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1816 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1862 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s1945 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s2028 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s2146 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.
TreeBASE.org/study/TB2:s2248 An invalid XML character (Unicode: 0x1a) was found in the element content of the document.

Perhaps strip the character before it is returned by the OAI provider?

----------------------------------------------------------------------

>Comment By: youjun guo (youjun)
Date: 2010-07-21 21:52

Message:
This problem is caused by foreign-language letters in the TreeBASE tables, especially in the lastname and firstName fields of the person table. We discussed this before. Maybe we need to use some other charset instead of utf-8.

----------------------------------------------------------------------
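
A minimal sketch of the "strip the character" suggestion from the initial comment, in Python for illustration only. The function name strip_invalid_xml_chars is hypothetical and is not part of the TreeBASE code base. It assumes the provider can filter each text value before serializing it. The character class follows the Char production of XML 1.0, which excludes control characters such as 0x1a no matter which encoding the document declares.

```python
import re

# Matches every code point that XML 1.0 does NOT allow in a document.
# Allowed: tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with all characters that are illegal in XML 1.0 removed.

    Hypothetical helper: intended to be applied to each field value
    (for example a person's last name) before it is written to oai_dc.
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # 0x1a (SUBSTITUTE) embedded in an otherwise ordinary name.
    dirty = "Mu\u001aller"
    print(repr(strip_invalid_xml_chars(dirty)))
```

Stripping makes the output well-formed but does not restore the letter that 0x1a replaced, so the affected rows would still need to be corrected at the source if the original names matter.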