From: Martin H. <mh...@uv...> - 2011-05-11 20:13:07
> Another line of inquiry you might track is whether the extra steps
> performed by Adam in the case of the Snowball analyzer might be
> necessary (see http://markmail.org/message/gcepf56nkc5huck6); I think
> Adam's steps render Mike's steps unnecessary but I'm not sure
> (http://markmail.org/message/kmpetl2leq457t5i).

I tried Adam's patch, but I couldn't get eXist to compile with the patch
included. Did anyone else get it working?

This is the patch file Adam sent me:

+import java.lang.reflect.Constructor;
+import java.lang.reflect.Field;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.TypeVariable;
 import org.w3c.dom.Element;
 import org.exist.util.DatabaseConfigurationException;
 import org.apache.lucene.analysis.Analyzer;
 import java.util.Map;
 import java.util.TreeMap;
+import org.w3c.dom.NamedNodeMap;
+import org.w3c.dom.Node;
+import org.w3c.dom.NodeList;
 public class AnalyzerConfig {
@@ -24,23 +31,78 @@
     }
     public void addAnalyzer(Element config) throws DatabaseConfigurationException {
-        String id = config.getAttribute(ID_ATTRIBUTE);
         Analyzer analyzer = configureAnalyzer(config);
-        if (id == null || id.length() == 0)
+        String id = config.getAttribute(ID_ATTRIBUTE);
+        if (id == null || id.length() == 0) {
             defaultAnalyzer = analyzer;
-        else
+        } else {
             analyzers.put(id, analyzer);
     }
+    }
     protected static Analyzer configureAnalyzer(Element config) throws DatabaseConfigurationException {
         String className = config.getAttribute(CLASS_ATTRIBUTE);
         if (className != null && className.length() != 0) {
             try {
                 Class<?> clazz = Class.forName(className);
-                if (!Analyzer.class.isAssignableFrom(clazz))
-                    throw new DatabaseConfigurationException("Lucene index: analyzer class has to be" +
-                            " a subclass of " + Analyzer.class.getName());
-                return (Analyzer) clazz.newInstance();
+                if (!Analyzer.class.isAssignableFrom(clazz)) {
+                    throw new DatabaseConfigurationException("Lucene index: analyzer class has to be a subclass of " + Analyzer.class.getName());
+                }
+
+                NodeList params = config.getElementsByTagName("param");
+                if(params.getLength() == 0){
+                    return (Analyzer)clazz.newInstance();
+                } else {
+
+                    Object args[] = new Object[params.getLength()];
+
+                    for(Constructor constructor : clazz.getConstructors()) {
+                        TypeVariable typeVars[] = constructor.getTypeParameters();
+                        if(typeVars.length == params.getLength()) {
+
+                            boolean matched = false;
+                            //found a constructor of the same length
+                            for(int i = 0; i < typeVars.length; i++) {
+                                Node param = params.item(i);
+
+                                NamedNodeMap attrs = param.getAttributes();
+                                String name = attrs.getNamedItem("name").getNodeValue();
+                                String type = attrs.getNamedItem("type").getNodeValue();
+                                String value = attrs.getNamedItem("value").getNodeValue();
+
+                                //either field or string - could be extended
+                                if(type != null && type.equals("java.lang.reflect.Field")){
+                                    String clazzName = value.substring(0, name.lastIndexOf("."));
+                                    String fieldName = value.substring(name.indexOf(".") + 1);
+
+                                    Class fieldClazz = Class.forName(clazzName);
+                                    Field field = fieldClazz.getField(fieldName);
+
+                                    //does the field type match the constructor var type?
+                                    if(field.getType().getName().equals(typeVars[i].getName())) {
+                                        args[i] = field.get(fieldClazz.newInstance());
+                                        matched = true;
+                                    } else {
+                                        matched = false;
+                                        break;
+                                    }
+                                } else if(typeVars[i].getName().equals("java.lang.String")) {
+                                    args[i] = value;
+                                    matched = true;
+                                } else {
+                                    matched = false;
+                                    break;
+                                }
+                            }
+
+                            if(matched) {
+                                return (Analyzer) constructor.newInstance(args);
+                            }
+                        }
+                    }
+                }
+
+
             } catch (ClassNotFoundException e) {
                 throw new DatabaseConfigurationException("Lucene index: analyzer class " +
                         className + " not found.");
@@ -50,6 +112,12 @@
             } catch (InstantiationException e) {
                 throw new DatabaseConfigurationException("Exception while instantiating analyzer class " +
                         className + ": " + e.getMessage(), e);
+            } catch(InvocationTargetException e) {
+                throw new DatabaseConfigurationException("Exception while instantiating analyzer class " +
+                        className + ": " + e.getMessage(), e);
+            } catch(NoSuchFieldException e) {
+                throw new DatabaseConfigurationException("Exception while instantiating analyzer class " +
+                        className + ": " + e.getMessage(), e);
             }
         }
         return null;

Cheers,
Martin

On 11-05-08 05:45 PM, Joe Wicentowski wrote:
> Hi Efraim,
>
> I notice that whereas eXist uses Lucene 2.9.2, the Hebrew analyzer's
> default version is Lucene 3.0.2 - see the lib folder inside of:
>
> https://github.com/synhershko/HebMorph/tree/master/java/lucene.hebrew
>
> It also appears from the commit logs that there was some effort to
> backport to "2.9", but again the lib folder contains 2.9.3 - close but
> still newer than eXist's 2.9.2.
>
> It might be worth finding out if this version is going to be
> compatible with the 2.9.2 release of Lucene.
>
> Another line of inquiry you might track is whether the extra steps
> performed by Adam in the case of the Snowball analyzer might be
> necessary (see http://markmail.org/message/gcepf56nkc5huck6); I think
> Adam's steps render Mike's steps unnecessary but I'm not sure
> (http://markmail.org/message/kmpetl2leq457t5i).
>
> I anticipate using Hebrew for an upcoming side project, so I'm
> following this with interest - and I would be happy to confirm tests
> on sample texts.
>
> Cheers,
> Joe
>
>
> On Sun, May 8, 2011 at 7:47 PM, Efraim Feinstein
> <efr...@gm...> wrote:
>> On 05/08/2011 02:48 PM, Wolfgang Meier wrote:
>>>> I'm having some issues setting up a custom tokenizer/analyzer in eXist,
>>>> and I wanted to know if I'm doing anything obviously wrong.
>>> I can't see anything obviously wrong in the lucene index
>>> configuration, but the tokenizer setting in conf.xml is for eXist's
>>> old (now deprecated) full-text index and won't work with the lucene
>>> tokenizer. Does it help if you reset that to the old setting? I don't
>>> expect it does.
>>
>> Unfortunately, I spoke too soon on this one. When I reinstalled the db,
>> I had forgotten to copy the analyzer into the classpath. When it's
>> there, I still get the same NPE. I also get an additional error when
>> trying to store a file in the database through the admin client:
>> "Impossible to store a resource [path]: null"
>>
>> The resource appears anyway, and there's no exception in the logs.
>>
>> Thanks,
>>
>>
>> --
>> ---
>> Efraim Feinstein
>> Lead Developer
>> Open Siddur Project
>> http://opensiddur.net
>> http://wiki.jewishliturgy.org
>>
>>
>> ------------------------------------------------------------------------------
>> WhatsUp Gold - Download Free Network Management Software
>> The most intuitive, comprehensive, and cost-effective network
>> management toolset available today. Delivers lowest initial
>> acquisition cost and overall TCO of any competing solution.
>> http://p.sf.net/sfu/whatsupgold-sd
>> _______________________________________________
>> Exist-open mailing list
>> Exi...@li...
>> https://lists.sourceforge.net/lists/listinfo/exist-open
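
A side note on the constructor lookup in the patch quoted in this message: it compares the number of <param> elements against constructor.getTypeParameters(), which in java.lang.reflect returns a constructor's generic type variables rather than its argument types, so that loop may never find a match for an ordinary analyzer constructor. Whether this has anything to do with the compile failure reported above is not known. The sketch below shows matching on getParameterTypes() instead. It is only an illustration under stated assumptions: NamedAnalyzer and findConstructor are hypothetical names that do not come from eXist or Lucene, and primitive parameters are not handled.

```java
import java.lang.reflect.Constructor;

public class ConstructorMatchSketch {

    /** Hypothetical stand-in for an analyzer class with a one-String constructor. */
    public static class NamedAnalyzer {
        public final String name;

        public NamedAnalyzer(String name) {
            this.name = name;
        }
    }

    /**
     * Returns the first public constructor of clazz whose declared parameter
     * types accept the given arguments, or null if there is none.
     * Primitive parameter types (int, boolean, ...) never match in this sketch.
     */
    public static Constructor<?> findConstructor(Class<?> clazz, Object[] args) {
        for (Constructor<?> candidate : clazz.getConstructors()) {
            // getParameterTypes() lists the declared argument types;
            // getTypeParameters() would list generic type variables instead.
            Class<?>[] paramTypes = candidate.getParameterTypes();
            if (paramTypes.length != args.length) {
                continue;
            }
            boolean matched = true;
            for (int i = 0; i < paramTypes.length; i++) {
                if (!paramTypes[i].isInstance(args[i])) {
                    matched = false;
                    break;
                }
            }
            if (matched) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] argv) throws Exception {
        Object[] args = { "English" };
        Constructor<?> ctor = findConstructor(NamedAnalyzer.class, args);
        NamedAnalyzer analyzer = (NamedAnalyzer) ctor.newInstance(args);
        System.out.println(analyzer.name); // prints "English"
    }
}
```

In an AnalyzerConfig-style setting, the args array would presumably be built from the <param> values before the lookup, and a null result would be reported as a configuration error rather than dereferenced.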