Greg Holmberg

What's happening?

  • Encoding conversion and charset attributes

    I'm wondering how encodings are handled. In my case, I have HTML in memory as a byte[] (could be from an HTTP connection, from a file, or several other sources). I know the encoding or have a pretty good guess (I'm using http://icu-project.org to detect encodings if not known). I see an HTMLCleaner constructor that takes an InputStream and a charset name. I can see no other constructor I...

    2007-05-24 06:27:42 UTC in HtmlCleaner

  • NPE in HtmlCleaner.makeTree()

    Caused by: java.lang.NullPointerException
        at org.htmlcleaner.HtmlCleaner.makeTree(HtmlCleaner.java:502)
        at org.htmlcleaner.HtmlTokenizer.addToken(HtmlTokenizer.java:89)
        at org.htmlcleaner.HtmlTokenizer.tagStart(HtmlTokenizer.java:365)
        at org.htmlcleaner.HtmlTokenizer.start(HtmlTokenizer.java:333)
        at org.htmlcleaner.HtmlCleaner.clean(HtmlCleaner.java:361)
        at...

    2007-05-24 05:41:37 UTC in HtmlCleaner

  • Comment: Source code for C++

    Great! What are the plans for releasing source code for the 2.x C++ enablement layer?

    2006-11-16 21:41:40 UTC in UIMA Framework

  • C++: upgrade to ICU 3.x

    Currently the C++ enablement layer uses ICU 2.8. My annotators use ICU 3.0. The current version of ICU is 3.6 (Unicode 5). I could convince the annotator team to upgrade to ICU 3.x (x > 0), but I can't convince them to go back to 2.8. So any version of ICU from 3.0 to 3.6 would be fine.

    2006-10-14 00:22:24 UTC in UIMA Framework

  • Document running C++ app-level APIs

    I would like to be able to create a CAS and call a C++ annotator outside a Java virtual machine, i.e. in a stand-alone C++ executable. I'm told that xcasDriver can do this. I don't even need an Analysis Engine; I would be happy if I could run just a single C++ annotator all by itself.

    2006-10-14 00:17:57 UTC in UIMA Framework

  • Source code for C++

    I would like to get the source code for the C++ enablement layer. This would let me: 1. Support platforms other than Linux and Windows. For example, I must support Solaris. 2. Link a different version of the ICU libraries that is compatible with my C++ annotators.

    2006-10-14 00:12:53 UTC in UIMA Framework

  • CAS serialization is not documented

    Class com.ibm.uima.cas.impl.Serialization is not documented in the JavaDoc. Since class CAS doesn't implement Serializable, Serialization is the only way to (efficiently) transport a CAS from one Java process to another (via RMI or Jini). Also, Serialization seems to be missing a method. There is a "CASSerializer serializeCAS(CAS)" method but no way to deserialize it. There is a...

    2006-10-11 22:26:42 UTC in UIMA Framework

  • 2.0 examples use deprecated methods

    For example, in docs\examples\src\com\ibm\uima\examples\RunAE.java, CollectionReader.setCasInitializer() is called, which is deprecated. The examples should show the right way to do things. In this case, it should show how to use a multi-SOFA annotator instead of a CAS Initializer. For example, it might use the XmlDetagger annotator. I'm sure there are other places the examples...

    2006-08-30 04:05:23 UTC in UIMA Framework

  • 2.0 examples use deprecated methods

    For example, in docs\examples\src\com\ibm\uima\examples\RunAE.java, CollectionReader.setCasInitializer() is called, which is deprecated. The examples should show the right way to do things. In this case, it should show how to use a multi-SOFA annotator instead of a CAS Initializer. For example, it might use the XmlDetagger annotator. I'm sure there are other places the examples...

    2006-08-30 00:51:19 UTC in UIMA Framework

  • Infinite loop in descriptor editor

    In the Component Descriptor Editor, if you are in the Source tab and make a mistake (for example, an import path is wrong and the file can't be read), then when you go to another tab, the editor will pop up an Error dialog. You click OK, and it pops up another (identical) Error dialog. Click OK, and the process repeats without end. Your only choice is to kill the java.exe process.

    2006-08-10 06:12:09 UTC in UIMA Framework
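
For the encoding question in the first item above (HTML held in memory as a byte[] with a known or guessed charset), a minimal sketch using only the JDK might look like the following. The class and method names (BytesToText, asStream, decode) are hypothetical and not part of HtmlCleaner. The sketch only shows how to obtain an InputStream or a decoded String from the bytes, and makes no assumption about which HtmlCleaner constructors exist beyond the InputStream-plus-charset-name one mentioned in the post.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;

public class BytesToText {

    /** Wraps in-memory bytes as an InputStream; the array is not copied. */
    static InputStream asStream(byte[] html) {
        return new ByteArrayInputStream(html);
    }

    /**
     * Decodes the bytes using the named charset.
     * Charset.forName throws an unchecked exception if the name is
     * illegal or unsupported, so a bad guess about the name fails fast.
     */
    static String decode(byte[] html, String charsetName) {
        return new String(html, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Sample input: an "e acute" encoded as the single byte 0xE9 in ISO-8859-1.
        byte[] html = "<p>caf\u00e9</p>".getBytes(Charset.forName("ISO-8859-1"));
        System.out.println(decode(html, "ISO-8859-1"));
    }
}
```

The stream from asStream could be passed, together with the charset name, to any API that accepts an InputStream and a charset name. The String from decode suits APIs that accept already-decoded text.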

About Me

  • 2006-01-27 (4 years ago)
  • 1437419
  • gcholmberg (My Site)
  • Greg Holmberg
