From: Max R. A. <max...@jb...> - 2006-06-01 11:54:10
> I meant having a Criteria type of QL, like what Compass
> does :
> CompassQueryBuilder queryBuilder = session.createQueryBuilder();
>
> CompassHits hits = queryBuilder.bool()
>     .addMust( queryBuilder.term("name", "jack") )
>     .addMustNot( queryBuilder.term("familyName", "london") )
>     .toQuery()
>     .addSort("familyName", CompassQuery.SortPropertyType.STRING)
>     .addSort("birthdate", CompassQuery.SortPropertyType.INT)
>     .hits();

Well... isn't there already some object version of the Lucene query
API or something ?

> About the cache :
> You're probably right, but I don't know enough about this.
> I only know Compass also provides some cache.
> About the bytecode enhancement :
> This one is quite important.

Ok, here you/we should probably utilize the lazy properties support we
already have, but it might require more customization than we have now.

Actually all this stuff might be a good use case candidate for a lot of
things we have talked about doing at some point:

1) Expose CustomQuery to provide hooks for alternative querying
2) Fetch-profiles to allow you to define a "lucene" fetch plan for your
   "partial" objects.
3) Lucene as a 2nd lvl cache.

/max

> Suppose you have several types of Objects that have a "report"
> property, and you want to show all those documents containing the word
> "toto" in their report property.
> The best way is for the query facility to return a collection of those
> documents with their id & report property set (which can be done only by
> getting the result from Lucene), without ever touching the SQL database.
> Forcing all those objects, that might be persisted in different tables,
> to be loaded by Hibernate would be both a performance killer and
> useless.
> But then, if you ever decide to do more than access one of the Lucene
> initialized properties, you will need those documents to be loaded from
> Hibernate. This can only be done through some kind of wrapper / mock /
> byte enhancement, whatever you call it.
> This is what "mixed mode" means.
> You initialize the objects from the Lucene index, but later fetch the
> real persisted object from the database as needed, and in a transparent
> way for the user.
> As I said, in a first implementation, we can always "fetch eager" from
> Hibernate, but some provision should be made to avoid loading from the
> database when it's not necessary.
> If you use mostly the full text search to display search result pages,
> then most of the time, you'll never need to hit the database.
>
> Sylvain.
>
> On Thu, 2006-06-01 at 11:23 +0200, Max Rydahl Andersen wrote:
>
>> All sounds cool ;)
>>
>> I can see the advantage of "converters" which can put elements into
>> Lucene in a better/human manner.
>>
>> The loading of objects from Lucene + "yet another QL" I'm a bit more
>> critical about.
>>
>> Would it not be better to do the following:
>>
>> 1. Use whatever QL Lucene supports to express the query. (What does
>> another QL help here ?)
>>
>> 2. Do the query against the Lucene index and return id's which are then
>> resolved via Hibernate
>> and possibly the 2nd lvl cache. (We could maybe optimize the id lookups
>> via some targeted queries)
>>
>> 3. IFF you really want, look into having Lucene be a 2nd lvl cache
>> provider ?
>> (would probably require a "chainable" cacheprovider to have both Lucene
>> and ehcache queries in the same app...but that is "sugar")
>>
>> ...maybe there is something I miss because I don't understand what the
>> "mixed mode" means and why you
>> want bytecode enhancement mixed in here ?
>>
>> /max
>>
>> > After chatting with Emmanuel, here is a draft plan for a closer
>> > integration between Hibernate and Lucene for performing full text
>> > queries.
>> > Hibernate annotations for Lucene help keep the Lucene indexes up to
>> > date, but don't provide a query facility.
>> > It also lacks converters that would for example help store a Date with
>> > the proper format in Lucene, so that the alphabetic order matches the
>> > Object's natural order.
>> >
>> > A framework like Compass ( http://www.opensymphony.com/compass ) is
>> > meant to fix this problem, by implementing its own OSEM (Object, Search
>> > Engine Mapping), and having a query facility that mimics what Hibernate
>> > is doing on the database side.
>> > Compass can even reuse Hibernate's mapping thus minimizing the
>> > configuration effort.
>> >
>> > One shortcoming I've found with Compass though is that the objects that
>> > you get when you query the full text engine aren't connected to the ones
>> > in the database.
>> > So if you manipulate them, the changes aren't persisted or can actually
>> > erase some of the information in the database.
>> >
>> > The best way to have a simple and risk free integration is to build a
>> > Full Text query facility that would be closely integrated with Hibernate
>> > & Hibernate Lucene annotations.
>> >
>> > So, querying the Full Text indexes would return objects, like Compass
>> > does, but those objects would be fetched from the database.
>> > Actually, for performance reasons, they could be initialized with the
>> > information from the FT index, and, through byte code enhancement, if an
>> > uninitialized property is read, or a property is set, the real object
>> > could be fetched from the database and read/set accordingly.
>> > Here are a few examples :
>> >
>> > 1) Just make a full text search :
>> >    query "toto" would fetch all the objects with an indexed field
>> >    containing toto from the Lucene index.
>> >    If the objects are initialized from the Lucene index, just one
>> >    query to the Lucene index is done, and the search results can be
>> >    displayed.
>> >    => Best performance.
>> >    Loading the objects from the database is useless here, and would
>> >    only lead to poorer performance.
>> > 2) Make a full text search AND manipulate the objects :
>> >    You want to query all the objects with "toto", and increment
>> >    their "searchHits" property.
>> >    You do the query, with a Load.EAGER parameter.
>> >    Only the objects' ids are retrieved from Lucene, and the real
>> >    objects are retrieved from Hibernate
>> > 3) Mix both approaches
>> >    Requires byte code enhancements.
>> >    Can be useful for cases where for some types of objects you
>> >    don't want to store all the properties required to display the
>> >    search view results in the index.
>> >    Only those objects will be loaded from Hibernate.
>> >
>> > All 3 modes should work, but we can always begin by implementing
>> > mode 2 only (retrieving only the id's from Lucene, and
>> > initializing the objects from Hibernate).
>> > Everything will still work, but performance will not be optimal.
>> > Later on we can implement mode 3 (which would also solve
>> > situation 1), and the changes will be transparent to the user.
>> > Only the performance will be better.
>> >
>> > Another advantage of integrating the Full Text query closely with
>> > Hibernate is that in some cases where a field isn't indexed but the
>> > query is still simple (field x like toto%), the Lucene index would not
>> > be needed, and some queries can be performed directly via Hibernate in a
>> > transparent way for the user.
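
Mode 2 above (ids from the full text index, real objects from the database) could be sketched roughly as follows. This is only an illustration under stated assumptions: `IdIndex`, `EntityLoader` and `Mode2Sketch` are hypothetical names, not Hibernate, Lucene or Compass API, and the in-memory maps merely stand in for a Lucene index and a Hibernate session.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class Mode2Sketch {

    /** Hypothetical stand-in for the full text index: term -> matching ids. */
    interface IdIndex {
        List<Long> search(String term);
    }

    /** Hypothetical stand-in for "load by id" on the persistent store. */
    interface EntityLoader<T> {
        T load(Long id); // returns null when no row exists for the id
    }

    /** Ask the index for ids only, then resolve every id through the loader. */
    static <T> List<T> query(IdIndex index, EntityLoader<T> loader, String term) {
        List<T> result = new ArrayList<>();
        for (Long id : index.search(term)) {
            T entity = loader.load(id);
            if (entity != null) { // skip stale index entries
                result.add(entity);
            }
        }
        return result;
    }

    /** Tiny in-memory demo; id 4 is indexed but no longer stored. */
    static List<String> demo() {
        Map<String, List<Long>> postings = new HashMap<>();
        postings.put("toto", Arrays.asList(1L, 3L, 4L));

        Map<Long, String> table = new HashMap<>();
        table.put(1L, "report about toto");
        table.put(2L, "unrelated report");
        table.put(3L, "toto again");

        IdIndex index = term -> postings.getOrDefault(term, Collections.emptyList());
        EntityLoader<String> loader = table::get;
        return query(index, loader, "toto");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the caller only ever sees `query(...)`, a later mode 3 could swap the eager loader for lazily initialized objects without changing calling code, which is the transparency argued for above.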
>> >
>> > To summarize this, the biggest changes would be :
>> >
>> > - Add converters to Hibernate Lucene annotations, like what
>> >   Compass is doing :
>> >   http://www.opensymphony.com/compass/versions/0.9.1/html/core-settings.html#config-converter
>> > - Build a Full Text query facility similar to Hql / Criteria,
>> >   but focussed on full text search (also like Compass's one :
>> >   http://www.opensymphony.com/compass/versions/0.9.1/html/core-workingwithobjects.html#Searching
>> >   ) but that would make sure the objects retrieved from the Lucene index
>> >   behave as if they were retrieved from the database.
>> >
>> > I would be glad to hear your feedback on this.
>> >
>> > Thanks,
>> >
>> > Sylvain.
>>
>>
>> --

--
Max Rydahl Andersen
callto://max.rydahl.andersen

Hibernate
ma...@hi...
http://hibernate.org

JBoss Inc
max...@jb...
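
The converter idea raised in the thread (storing a Date so that alphabetic order in the index matches chronological order) can be illustrated with a small Java sketch. It assumes a fixed-width, most-significant-field-first UTC format and years between 1 and 9999; `SortableDateSketch` and `toIndexString` are hypothetical names, not part of Hibernate, Lucene or Compass.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class SortableDateSketch {

    // Fixed width, zero padded, year first: plain string comparison then
    // agrees with chronological order (assumption: years 1..9999, UTC).
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS").withZone(ZoneOffset.UTC);

    /** Converts an instant to the string that would be stored in the index. */
    static String toIndexString(Instant instant) {
        return FORMAT.format(instant);
    }

    public static void main(String[] args) {
        String earlier = toIndexString(Instant.parse("2006-06-01T09:23:00Z"));
        String later = toIndexString(Instant.parse("2006-06-01T11:54:10Z"));
        System.out.println(earlier + " sorts before " + later + ": "
                + (earlier.compareTo(later) < 0));
    }
}
```

A day-first or locale-dependent rendering such as "1/6/2006" would not have this property, which is why a dedicated converter is needed rather than a plain `toString()`.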