From: Max R. A. <max...@jb...> - 2006-06-01 11:54:10
> I meant having a Criteria type of QL, like what Compass
> does:
>
> CompassQueryBuilder queryBuilder = session.createQueryBuilder();
>
> CompassHits hits = queryBuilder.bool()
> .addMust( queryBuilder.term("name", "jack") )
> .addMustNot( queryBuilder.term("familyName", "london") )
> .toQuery()
> .addSort("familyName", CompassQuery.SortPropertyType.STRING)
> .addSort("birthdate", CompassQuery.SortPropertyType.INT)
> .hits();
Well... isn't there already an object version of the Lucene query API or something?
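
For illustration, here is a minimal sketch of what an object-style query API amounts to: the query is assembled from objects and rendered to Lucene's query-parser syntax ("+" for must, "-" for must-not). The types FtQuery, TermQ, BoolQ and ObjectQueryDemo are hypothetical stand-ins written for this note, not Lucene or Compass classes. Lucene's own object API lives in org.apache.lucene.search (BooleanQuery, TermQuery).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for an object-style query API.
// These are NOT Lucene or Compass classes; they only illustrate the idea.
interface FtQuery {
    String render();
}

// A single "field:value" term.
final class TermQ implements FtQuery {
    private final String field;
    private final String value;

    TermQ(String field, String value) {
        this.field = field;
        this.value = value;
    }

    public String render() {
        return field + ":" + value;
    }
}

// A flat boolean combination of required and prohibited clauses.
// Nested boolean queries would need parentheses; this sketch skips that.
final class BoolQ implements FtQuery {
    private final List<String> clauses = new ArrayList<>();

    BoolQ must(FtQuery q) {
        clauses.add("+" + q.render());
        return this;
    }

    BoolQ mustNot(FtQuery q) {
        clauses.add("-" + q.render());
        return this;
    }

    public String render() {
        return String.join(" ", clauses);
    }
}

class ObjectQueryDemo {
    // Same query as the quoted Compass example, minus the sorting.
    static String example() {
        return new BoolQ()
                .must(new TermQ("name", "jack"))
                .mustNot(new TermQ("familyName", "london"))
                .render();
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints "+name:jack -familyName:london"
    }
}
```

If an object API of this shape is already available, a separate Criteria-style QL would mostly be a thin wrapper around it.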
> About the cache :
> You're probably right, but I don't know enough about this.
> I only know Compass also provides some cache.
> About the bytecode enhancement :
> This one is quite important.
OK, here you/we should probably use the lazy properties support we already have,
but it might require more customization than we have now.
Actually, all this stuff might be a good use-case candidate for a lot of things we have
talked about doing at some point:
1) Expose CustomQuery to provide hooks for alternative querying
2) Fetch profiles to allow you to define a "lucene" fetch plan for your "partial" objects.
3) Lucene as a 2nd-level cache.
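
To make the "partial" object idea concrete, here is a rough sketch of a search hit that starts out with only the values stored in the index and loads the database row on first need. It uses a plain wrapper, not bytecode enhancement and not Hibernate's lazy property support. PartialHit, MixedModeDemo and the dbLoader function are hypothetical names invented for this sketch, and the loader merely stands in for a real Hibernate load.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a hit initialized from the full-text index that
// falls back to the database only when the index cannot answer.
final class PartialHit {
    private final long id;
    private final Map<String, Object> indexed;                   // values copied from the index
    private final Function<Long, Map<String, Object>> dbLoader;  // stands in for a Hibernate load
    private Map<String, Object> loaded;                          // null until the database is hit
    int dbLoads = 0;                                             // counts database round trips

    PartialHit(long id, Map<String, Object> indexed,
               Function<Long, Map<String, Object>> dbLoader) {
        this.id = id;
        this.indexed = indexed;
        this.dbLoader = dbLoader;
    }

    Object get(String property) {
        // Serve from the index while the real row has not been loaded.
        if (loaded == null && indexed.containsKey(property)) {
            return indexed.get(property);
        }
        return row().get(property);
    }

    void set(String property, Object value) {
        // Writes always go through the real row, never the index copy.
        row().put(property, value);
    }

    private Map<String, Object> row() {
        if (loaded == null) {
            loaded = new HashMap<>(dbLoader.apply(id));
            dbLoads++;
        }
        return loaded;
    }
}

class MixedModeDemo {
    static PartialHit sampleHit() {
        Map<String, Object> fromIndex = new HashMap<>();
        fromIndex.put("report", "toto was here");
        return new PartialHit(42L, fromIndex, id -> {
            Map<String, Object> row = new HashMap<>();
            row.put("report", "toto was here");
            row.put("searchHits", 7);
            return row;
        });
    }

    public static void main(String[] args) {
        PartialHit hit = sampleHit();
        System.out.println(hit.get("report") + " / loads=" + hit.dbLoads);
        System.out.println(hit.get("searchHits") + " / loads=" + hit.dbLoads);
    }
}
```

Reading "report" is answered from the index copy and triggers no load. Reading or setting "searchHits" triggers exactly one load, after which all access goes to the loaded row.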
/max
> Suppose you have several types of objects that have a "report"
> property, and you want to show all those documents containing the word
> "toto" in their report property.
> The best way is for the query facility to return a collection of those
> documents with their id & report property set (which can be done only by
> getting the result from Lucene), without ever touching the SQL database.
> Forcing all those objects, which might be persisted in different tables,
> to be loaded by Hibernate would be both a performance killer and
> useless.
> But then, if you ever decide to do more than access one of the
> Lucene-initialized properties, you will need those documents to be loaded from
> Hibernate. This can only be done through some kind of wrapper / mock /
> bytecode enhancement, whatever you call it. This is what "mixed mode" means.
> You initialize the objects from the Lucene index, but later fetch the
> real persisted object from the database as needed, and in a transparent
> way for the user.
> As I said, in a first implementation, we can always "fetch eager" from
> Hibernate, but some provision should be made to avoid loading from the
> database when it's not necessary.
> If you use the full text search mostly to display search result pages,
> then most of the time, you'll never need to hit the database.
>
> Sylvain.
>
> On Thu, 2006-06-01 at 11:23 +0200, Max Rydahl Andersen wrote:
>
>> All sounds cool ;)
>>
>> I can see the advantage of "converters" which can put elements into
>> Lucene in a better/human manner.
>>
>> The loading of objects from Lucene + "yet another QL" I'm a bit more
>> critical about.
>>
>> Would it not be better to do the following:
>>
>> 1. Use whatever QL Lucene supports to express the query. (How does
>> another QL help here?)
>>
>> 2. Run the query against the Lucene index and return ids, which are then
>> resolved via Hibernate
>> and possibly the 2nd-level cache. (We could maybe optimize the id lookups
>> via some targeted queries)
>>
>> 3. IFF you really want to, look into having Lucene be a 2nd-level cache
>> provider?
>> (would probably require a "chainable" cache provider to have both Lucene
>> and ehcache queries in the same app... but that is "sugar")
>>
>> ...maybe there is something I'm missing because I don't understand what
>> "mixed mode" means and why you
>> want bytecode enhancement mixed in here?
>>
>> /max
>>
>> > After chatting with Emmanuel, here is a draft plan for a closer
>> > integration between Hibernate and Lucene for performing full text
>> > queries.
>> > Hibernate annotations for Lucene help keep the Lucene indexes up to
>> > date, but don't provide a query facility.
>> > They also lack converters that would, for example, help store a Date with
>> > the proper format in Lucene, so that the alphabetic order matches the
>> > object's natural order.
>> >
>> > A framework like Compass ( http://www.opensymphony.com/compass ) is
>> > meant to fix this problem, by implementing its own OSEM (Object, Search
>> > Engine Mapping), and having a query facility that mimics what Hibernate
>> > is doing on the database side.
>> > Compass can even reuse Hibernate's mapping, thus minimizing the
>> > configuration effort.
>> >
>> > One shortcoming I've found with Compass, though, is that the objects that
>> > you get when you query the full text engine aren't connected to the ones
>> > in the database.
>> > So if you manipulate them, the changes aren't persisted, or can actually
>> > erase some of the information in the database.
>> >
>> > The best way to have a simple and risk-free integration is to build a
>> > Full Text query facility that would be closely integrated with Hibernate
>> > & Hibernate Lucene annotations.
>> >
>> > So, querying the Full Text indexes would return objects, like Compass
>> > does, but those objects would be fetched from the database.
>> > Actually, for performance reasons, they could be initialized with the
>> > information from the FT index, and, through bytecode enhancement, if an
>> > uninitialized property is read, or a property is set, the real object
>> > could be fetched from the database and read/set accordingly.
>> > Here are a few examples :
>> >
>> > 1) Just make a full text search :
>> >    query "toto" would fetch all the objects with an indexed field
>> >    containing toto from the Lucene index.
>> >    If the objects are initialized from the Lucene index, just one
>> >    query to the Lucene index is done, and the search results can be
>> >    displayed.
>> >    => Best performance.
>> >    Loading the objects from the database is useless here, and would
>> >    only lead to poorer performance.
>> > 2) Make a full text search AND manipulate the objects :
>> >    You want to query all the objects with "toto", and increment
>> >    their "searchHits" property.
>> >    You do the query, with a Load.EAGER parameter.
>> >    Only the objects' ids are retrieved from Lucene, and the real
>> >    objects are retrieved from Hibernate.
>> > 3) Mix both approaches
>> >    Requires bytecode enhancement.
>> >    Can be useful for cases where, for some types of objects, you
>> >    don't want to store all the properties required to display the
>> >    search results view in the index.
>> >    Only those objects will be loaded from Hibernate.
>> >
>> > All 3 modes should work, but we can always begin by implementing
>> > mode 2 only (retrieving only the ids from Lucene, and
>> > initializing the objects from Hibernate).
>> > Everything will still work, but performance will not be optimal.
>> > Later on we can implement mode 3 (which would also solve
>> > situation 1), and the changes will be transparent to the user.
>> > Only the performance will be better.
>> >
>> > Another advantage of integrating the Full Text query closely with
>> > Hibernate is that in some cases where a field isn't indexed but the
>> > query is still simple (field x like toto%), the Lucene index would not
>> > be needed, and some queries can be performed directly via Hibernate in a
>> > transparent way for the user.
>> >
>> > To summarize this, the biggest changes would be :
>> >
>> > - Add converters to Hibernate Lucene annotations, like what
>> > Compass is doing :
>> > http://www.opensymphony.com/compass/versions/0.9.1/html/core-settings.html#config-converter
>> > - Build a Full Text query facility similar to HQL / Criteria,
>> > but focused on full text search (also like Compass's one :
>> > http://www.opensymphony.com/compass/versions/0.9.1/html/core-workingwithobjects.html#Searching
>> > ) but that would make sure the objects retrieved from the Lucene index
>> > behave as if they were retrieved from the database.
>> >
>> > I would be glad to hear your feedback on this.
>> >
>> > Thanks,
>> >
>> > Sylvain.
>>
>>
>>
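
The converter idea in the quoted proposal (storing a Date so that alphabetic order matches the object's natural order) can be sketched with a fixed-width, most-significant-field-first format. This is only an illustration under assumptions: SortableDateConverter is a hypothetical name, it uses java.time and does not reproduce the Compass or Hibernate converter API, and it assumes years in the 0000-9999 range with no time zone handling.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical converter: turns a date-time into a string whose
// lexicographic order equals chronological order.
class SortableDateConverter {
    // Fixed width, year first, then month, day, hour, minute, second.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuuMMddHHmmss");

    // Value to store in the index field.
    static String toIndexString(LocalDateTime value) {
        return value.format(FORMAT);
    }

    // Value read back from the index field.
    static LocalDateTime fromIndexString(String stored) {
        return LocalDateTime.parse(stored, FORMAT);
    }

    public static void main(String[] args) {
        String earlier = toIndexString(LocalDateTime.of(2006, 6, 1, 9, 23, 0));
        String later = toIndexString(LocalDateTime.of(2006, 6, 1, 11, 54, 10));
        System.out.println(earlier);                       // prints "20060601092300"
        System.out.println(earlier.compareTo(later) < 0);  // prints "true"
    }
}
```

Because every field is zero-padded to a fixed width, a plain string sort or range query on the indexed value behaves like a sort or range query on the date itself.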
-- 
Max Rydahl Andersen
callto://max.rydahl.andersen
Hibernate
ma...@hi...
http://hibernate.org
JBoss Inc
max...@jb...