From: <Pierrick.BRIHAYE@re...> - 2007-08-31 09:04:03
Hi, Answering to myself...

>It's been introduced yesterday night, for sure :-)

Yes and no. See below.

>The message says it all : browse:get-parent-collection($path as
>xs:string) as xs:string seems to receive an xs:anyURI, probably from
>xmldb:encode-uri($a as xs:string) xs:anyURI.

This was indeed (a part of) the problem. Although xs:anyURI is not a sub-type of xs:string, it can be treated as such. There was a second problem: functions now try to make a return type check which, in some cases, could be wrong for empty sequences. This check has been skipped (but I still have to introduce a cardinality check to see if empty sequences are allowed or not by the function signature). A patch is now committed. Please re-check and report any problem.

>I will investigate and try to correct the application. I don't think
>there is a bug here.

There was :-) However, I'm pretty happy to have improved the design. Until now, the internal code was missing the fact that xs:string and xs:anyURI are "compatible".

Cheers,

p.b.

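For context, a minimal sketch of the mismatch under discussion, using a hypothetical local: stand-in for the browse module function; the explicit xs:string() cast is one caller-side workaround where an engine does not treat xs:anyURI as promotable to xs:string:

    xquery version "1.0";
    (: Hypothetical stand-in for browse:get-parent-collection; it simply
       strips the last path step. :)
    declare namespace xmldb="http://exist-db.org/xquery/xmldb";
    declare function local:get-parent-collection($path as xs:string) as xs:string {
        replace($path, '/[^/]*$', '')
    };

    (: xmldb:encode-uri returns xs:anyURI, as noted above; the cast makes
       the hand-off to an xs:string parameter explicit. :)
    let $uri := xmldb:encode-uri('/db/shakespeare/plays')
    return local:get-parent-collection(xs:string($uri))
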
From: Bert Schultheiss <bertschultheiss@ho...> - 2007-08-31 08:42:15
I checked the tomcat log files (catalina.*.log, localhost.*.log), and the log files in eXist. If I try to upload the large file, I do not see any new entry in validation.log; only exist.log is updated, but no errors. No, I am not using an XML schema or DTD.

I also tried to use the client in embedded mode, using the webapps/exist/WEB-INF/conf.xml file (launch webstart client, stop tomcat, connect in embedded mode). However, this seems not to work: it quits directly after having started. How can I access the war file in embedded mode? Possibly that gives more info such as a stack trace... (e.g. if it is an out-of-memory error)

Kind regards, Bert

> did you check exist log files (validation.log) as well as the catalina
> log files?
>
> does the large file have a reference to an XML schema or DTD? in that
> case, add the grammar to catalog.xml or set validation mode to 'no'
>
> regards
>
> Dannes

From: Winston <comzak1@ho...> - 2007-08-31 08:09:00
Quick update: I just checked out the .jar installer of eXist (http://exist.sourceforge.net/index.html#download) and it looks like the extensions directory and build.sh file do come with that distribution, just as described in the extension modules documentation. I am confused as to why those are included in the .jar and not the .war?

From: Winston <comzak1@ho...> - 2007-08-31 07:32:35
Hello. I am having trouble understanding the instructions in the documentation for using the extension modules (http://exist.sourceforge.net/extensions.html). I am using eXist version 1.1.1, deployed in Tomcat via the .war file. I would like to use the HTTP Module, the Date and Time Module, and the Image Module (all created by Adam Retter).

In the documentation, it says "The source code for extension modules should be placed in their own folder inside $EXIST_HOME/extensions/modules/src/org/exist/xquery/modules", but I have no 'extensions' directory in my eXist home directory. The closest thing I can find is $EXIST_HOME/api/org/exist. Also the documentation says "They may then be compiled in place using either $EXIST_HOME/build.sh extension-modules or %EXIST_HOME%\build.bat extension-modules depending on the platform.", but I do not have a build.sh file or a build.bat anywhere. Even if I did have those files, I am still not clear on how to "compile in place".

Could someone help me with step-by-step instructions on how to get these modules working? thank you - Winston

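Once the modules are compiled and enabled in conf.xml, one quick sanity check from XQuery is to list the modules the server has registered; a minimal sketch using the built-in util module:

    xquery version "1.0";
    (: Lists the namespace URIs of all XQuery modules the running eXist
       instance has registered; compiled extension modules should appear
       here once they are enabled in conf.xml. :)
    declare namespace util="http://exist-db.org/xquery/util";

    util:registered-modules()
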
From: Pierrick Brihaye <pierrick.brihaye@fr...> - 2007-08-31 05:53:01
Hi, Andrzej Jan Taramina wrote:

> I updated exist from SVN this evening, and then rebuilt clean, and redeployed
> the war from scratch (deleting the database as well).
>
> Now, when I log in as admin, if I try to browse collections using the web
> interface I get the following error message:
>
> err:XPTY004 in function 'browse:get-parent-collection'. Type error: expected
> type: xs:string; got: xs:anyURI [at line 204, column 60]
>
> Is this a bug that was introduced in the last day?

It's been introduced yesterday night, for sure :-)

The message says it all : browse:get-parent-collection($path as xs:string) as xs:string seems to receive an xs:anyURI, probably from xmldb:encode-uri($a as xs:string) xs:anyURI. It is strange however, since the problems seem to be with an input parameter, not a return value as I would have expected after my commit.

I will investigate and try to correct the application. I don't think there is a bug here.

Cheers, p.b.

From: Andrzej Jan Taramina <andrzej@ch...> - 2007-08-31 02:14:12
I updated exist from SVN this evening, and then rebuilt clean, and redeployed the war from scratch (deleting the database as well). Now, when I log in as admin, if I try to browse collections using the web interface I get the following error message:

err:XPTY004 in function 'browse:get-parent-collection'. Type error: expected type: xs:string; got: xs:anyURI [at line 204, column 60]

Is this a bug that was introduced in the last day? Note that browsing through the database using Oxygen works just fine. It's just the browser interface that seems screwed up.

Thanks!

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

From: Roland Chan <roland.chan@de...> - 2007-08-30 21:22:52
Hi All, Currently reviewing our maintenance routines for IT... Aside from regular backups, does anyone perform anything additional on their eXist instances? That is, do you reindex collections regularly? If so, do you automate this? Do you take them offline? Anything else to keep your db healthy? Our dbx is over a GB now. In addition to the regular backup, do you make copies of the actual .dbx's as well?

Cheers, Roland

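On the reindexing question: eXist's xmldb module exposes a per-collection reindex that can be called from a scheduled XQuery; a minimal sketch, assuming a hypothetical /db/docs collection and dba rights:

    xquery version "1.0";
    (: Rebuilds all index entries for the named collection; the path is a
       placeholder and the call requires dba privileges. :)
    declare namespace xmldb="http://exist-db.org/xquery/xmldb";

    xmldb:reindex('/db/docs')
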
From: Andrzej Jan Taramina <andrzej@ch...> - 2007-08-30 19:34:56
I noticed that there are functions to set cookies and headers for HTTP responses, but there doesn't seem to be a way to set the response code. When using XQuery to process HTTP requests, it would be nice to be able to specify alternate return codes, like a 400 Bad Request, 405 Method Not Allowed, 201 Created, etc. That way, you could use an XQuery to implement a REST-based application interface.

Any way of doing this, or will I have to extend the response module to include a set-code() method? Thx!

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

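For reference, later eXist releases added exactly this as response:set-status-code(); whether a given 1.1.x build has it is an assumption. A sketch of the intended usage, with a placeholder document path:

    xquery version "1.0";
    (: Hedged sketch: response:set-status-code() exists in later eXist
       releases; its presence in a 1.1.x build is an assumption. :)
    declare namespace response="http://exist-db.org/xquery/response";

    let $path := '/db/docs/item.xml'
    return
        if (doc-available($path)) then
            doc($path)
        else
            (response:set-status-code(404), <error>not found</error>)
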
From: Wolfgang Meier <wolfgangmm@gm...> - 2007-08-30 19:09:03
> Say I chose (2). But what exactly should I store? Should I store a "fat"
> result list containing each document in full, complete with the match
> tags? That would make it easy later to retrieve a chunk. But the stored
> result list and the in-memory DOM tree might become very large.

How much memory will be consumed depends on what you are storing to the session. If the query result contains nothing but nodes stored in the database, eXist will not create an in-memory DOM, but just keep a list of references to the real nodes stored in the db. Memory consumption should thus be low. The full-text match information is attached to these references and will be available until the session is deleted.

So if you need to store query results in the session, it is best to store the "raw", unprocessed nodes. It doesn't matter if you keep a reference to the entire source document root or any other stored node - as long as it remains a reference. Any post-processing should happen when you actually display items to the user.

Even if you need to do some post-processing, eXist will not immediately transform the stored nodes into an in-memory DOM. It tries to work with references as long as it can. For example, assume $item is one node returned from an XPath and points into the database. Now assume you post-process $item to create a new element:

<item><title>{$item/title}</title></item>

eXist will not copy the $item/title nodes into the constructed <item> element. Instead, it just inserts a special link node which points to the <title> nodes stored in the db. So unless you construct larger XML fragments, memory consumption should remain reasonable.

Wolfgang

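The advice above (store the raw nodes, post-process per chunk) can be sketched as a short query; collection path, element names and the search term are placeholders:

    xquery version "1.0";
    (: Store the unprocessed hits; eXist keeps node references rather than
       building an in-memory DOM. :)
    declare namespace session="http://exist-db.org/xquery/session";

    let $hits := collection('/db/texts')//doc[. &= 'term']
    let $stored := session:set-attribute('hits', $hits)
    (: On a later request: slice one chunk and post-process only that. :)
    let $page := subsequence(session:get-attribute('hits'), 1, 20)
    return
        <chunk total="{count($hits)}">{
            for $item in $page
            return <item><title>{$item/title}</title></item>
        }</chunk>
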
From: Dannes Wessels <dizzzz@gm...> - 2007-08-30 17:33:36
Hi,

On 8/30/07, Bert Schultheiss <bertschultheiss@...> wrote:
> I tried to upload a large XML document.
> Impossible to store a rsource <....xml>: networking error.
> I do not see an error in the log files.

did you check exist log files (validation.log) as well as the catalina log files?

does the large file have a reference to an XML schema or DTD? in that case, add the grammar to catalog.xml or set validation mode to 'no'

regards

Dannes

--
# Dannes Wessels # The Netherlands #
# Jabber / ICQ / MSN / AIM / Yahoo / gTalk / Skype / LinkedIn #

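The validation-mode suggestion refers to the validation element in conf.xml; a hedged sketch of that setting (the exact attribute values should be checked against the conf.xml shipped with the build):

    <!-- conf.xml: disable implicit grammar validation when storing documents -->
    <validation mode="no"/>
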
From: Adam Retter <adam.retter@de...> - 2007-08-30 16:49:18
We use (1) here. Our query returns the results fairly quickly, so re-executing the query is not an issue, and for us it is more desirable than keeping huge chunks of data in RAM. But I know people who have done (2), although not with the word match highlighting, so really it's up to you and your needs...

-----Original Message-----
From: exist-open-bounces@... on behalf of Oystein Reigem
Sent: Thu 30/08/2007 17:37
To: exist-open@...
Subject: [Exist-open] Long result lists shown in chunks, combined with highlighting of search words

Here's a design problem of mine I described in a posting some weeks ago. Unfortunately I got no replies. I'll try to rephrase and elaborate. My original presentation might not have been clear enough. Hope I do better this time. - Øystein -

...............

All,

I have a database in eXist that can be queried from a web interface, with the help of a java servlet running under Tomcat. The exact configuration of my system might not be relevant to my questions, but I mention it anyway.

What _is_ relevant, however, is that my database stores textual documents, and that I do full text search on these documents. Furthermore, and most important, I want to present the found documents with the search words highlighted. eXist supports such highlighting, by putting <exist:match> tags around the matching words in the retrieved documents. It's just up to the application to convert these tags into something visible.

After a search is done, I want to present a result list showing the found documents. To simplify my case a bit, let's assume my documents are fairly short, say 200 words each, and that the result list shows each found document in full. (In a real full text document search system the result list would normally show each found document represented by some brief "surrogate" - a line of metadata and/or text _excerpts_. Think of a Google result list.)

Since a query might find many documents, I want to show long result lists one chunk at a time, e.g., 20 documents at a time. I assume there are two main approaches for dealing with such chunks:

(1) Each time the user/browser requests a new chunk, the original query is simply _rerun_, and the desired chunk is extracted from the whole result. The extraction can be done by the query itself.

(2) At the initial search the whole result is stored to some temporary location, e.g., stored in the session by use of the session:set-attribute() function. When a chunk is requested, the chunk is somehow retrieved from, or with the help of, that stored result.

It seems to me that (2) is the "proper" way of doing it.

Say I chose (2). But what exactly should I store? Should I store a "fat" result list containing each document in full, complete with the match tags? That would make it easy later to retrieve a chunk. But the stored result list and the in-memory DOM tree might become very large.

Or should I store a "lean" version - a result list with just a unique reference to each found document? If I choose this alternative I lose the match information and must somehow recreate it when a chunk is retrieved.

Let me here inject that I do full text search with the help of an XPath predicate, not a FLWOR "where" clause.

One way to recreate the match information of a "lean" chunk would be to retrieve the chunk with a new query that combines (a) a predicate tailored to retrieve exactly the desired chunk with (b) the predicate of the original query. The purpose of (a) would be to retrieve exactly the right documents of the chunk, in the correct order. (b) would serve to get the match information reapplied. Predicate (a) might be a rather clunky thing, with a long, explicit "or" expression mentioning each single reference - something like "[ref=id_m or ref=id_m+1 or ref=id_m+2 or ... or ref=id_m+19]".

Btw - if I store the (lean) result list in the session I could perhaps store each chunk - a list of 20 references - in its own session attribute? So to retrieve a chunk my servlet should first get the session attribute value for that chunk, i.e., a list of 20 references, then construct a long and explicit predicate from that list, etc.

There might also be a third result list alternative, somewhere between "lean" and "fat". I might not throw away the match information altogether, as in the "lean" version, but store it in a compact way, as references to character positions, or something. In this third, "compact" alternative each document in the result list is stored as a reference and a list of match positions.

Comments? Suggestions?

- Øystein -

--
Øystein Reigem, The department of culture, language and information technology (Aksis), Allegt 27, N-5007 Bergen, Norway. Tel: +47 55 58 32 42. Fax: +47 55 58 94 70. E-mail: <oystein.reigem@...>. Home tel: +47 56 14 06 11. Mobile: +47 97 16 96 64. Home e-mail: <oreigem@...>. Aksis home page: <www.aksis.uib.no>.

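Approach (1), which the reply above recommends, can be written as a short query: re-run the search on every request and slice out one chunk, so the <exist:match> markup is regenerated each time. A minimal sketch, with placeholder collection path, element name, search term and parameter name:

    xquery version "1.0";
    (: Rerun the full-text query per request; subsequence() extracts the
       requested chunk of 20 hits. :)
    declare namespace request="http://exist-db.org/xquery/request";

    let $start := xs:integer(request:get-parameter('start', '1'))
    let $hits := collection('/db/texts')//doc[. &= 'term']
    return
        <chunk start="{$start}" total="{count($hits)}">{
            subsequence($hits, $start, 20)
        }</chunk>
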
From: Bert Schultheiss <bertschultheiss@ho...> - 2007-08-30 16:39:09
Hello, I tried to upload a large XML document. With the exist-1.1.1 version this worked. With the 1.1.2 preview version (http://www.exist-db.nl/files/) or with a version checked out 28/8 from SVN, and using the WAR file, I get:

Impossible to store a rsource <....xml>: networking error.

I do not see an error in the log files. Any suggestion?

Kind regards, Bert Schultheiss

From: Oystein Reigem <oystein.reigem@ak...> - 2007-08-30 16:38:36
Here's a design problem of mine I described in a posting some weeks ago. Unfortunately I got no replies. I'll try to rephrase and elaborate. My original presentation might not have been clear enough. Hope I do better this time. - Øystein -

...............

All,

I have a database in eXist that can be queried from a web interface, with the help of a java servlet running under Tomcat. The exact configuration of my system might not be relevant to my questions, but I mention it anyway.

What _is_ relevant, however, is that my database stores textual documents, and that I do full text search on these documents. Furthermore, and most important, I want to present the found documents with the search words highlighted. eXist supports such highlighting, by putting <exist:match> tags around the matching words in the retrieved documents. It's just up to the application to convert these tags into something visible.

After a search is done, I want to present a result list showing the found documents. To simplify my case a bit, let's assume my documents are fairly short, say 200 words each, and that the result list shows each found document in full. (In a real full text document search system the result list would normally show each found document represented by some brief "surrogate" - a line of metadata and/or text _excerpts_. Think of a Google result list.)

Since a query might find many documents, I want to show long result lists one chunk at a time, e.g., 20 documents at a time. I assume there are two main approaches for dealing with such chunks:

(1) Each time the user/browser requests a new chunk, the original query is simply _rerun_, and the desired chunk is extracted from the whole result. The extraction can be done by the query itself.

(2) At the initial search the whole result is stored to some temporary location, e.g., stored in the session by use of the session:set-attribute() function. When a chunk is requested, the chunk is somehow retrieved from, or with the help of, that stored result.

It seems to me that (2) is the "proper" way of doing it.

Say I chose (2). But what exactly should I store? Should I store a "fat" result list containing each document in full, complete with the match tags? That would make it easy later to retrieve a chunk. But the stored result list and the in-memory DOM tree might become very large.

Or should I store a "lean" version - a result list with just a unique reference to each found document? If I choose this alternative I lose the match information and must somehow recreate it when a chunk is retrieved.

Let me here inject that I do full text search with the help of an XPath predicate, not a FLWOR "where" clause.

One way to recreate the match information of a "lean" chunk would be to retrieve the chunk with a new query that combines (a) a predicate tailored to retrieve exactly the desired chunk with (b) the predicate of the original query. The purpose of (a) would be to retrieve exactly the right documents of the chunk, in the correct order. (b) would serve to get the match information reapplied. Predicate (a) might be a rather clunky thing, with a long, explicit "or" expression mentioning each single reference - something like "[ref=id_m or ref=id_m+1 or ref=id_m+2 or ... or ref=id_m+19]".

Btw - if I store the (lean) result list in the session I could perhaps store each chunk - a list of 20 references - in its own session attribute? So to retrieve a chunk my servlet should first get the session attribute value for that chunk, i.e., a list of 20 references, then construct a long and explicit predicate from that list, etc.

There might also be a third result list alternative, somewhere between "lean" and "fat". I might not throw away the match information altogether, as in the "lean" version, but store it in a compact way, as references to character positions, or something. In this third, "compact" alternative each document in the result list is stored as a reference and a list of match positions.

Comments? Suggestions?

- Øystein -

--
Øystein Reigem, The department of culture, language and information technology (Aksis), Allegt 27, N-5007 Bergen, Norway. Tel: +47 55 58 32 42. Fax: +47 55 58 94 70. E-mail: <oystein.reigem@...>. Home tel: +47 56 14 06 11. Mobile: +47 97 16 96 64. Home e-mail: <oreigem@...>. Aksis home page: <www.aksis.uib.no>.

From: Wolfgang Meier <wolfgangmm@gm...> - 2007-08-30 13:04:39
> No difference, some query results are still missing. The size of the
> words.dbx file has not increased. The curious thing is that there are
> very similar documents with the search string in exactly the same node.
> Some nodes are found, others not.

I compared the query results for some of my documents and they seemed ok. But it is certainly possible that we have a bug in this area somewhere. If you have an example where the indexing goes wrong, please mail it to me.

Wolfgang

From: <kmuetz@we...> - 2007-08-30 12:37:09
exist-open-bounces@... wrote:
> wolfgang@... wrote:
>>> Now I have modified the configuration as following:
>>>
>>> <fulltext default="none" attributes="false" alphanum="true">
>>>     <create qname="content"/>
>>>     <create qname="dc:title"/>
>>>     <create qname="base"/>
>>> </fulltext>
>>> <create qname="dc:identifier" type="xs:string"/>
>>>
>>> Questions:
>>>
>>> 1) The size of words.dbx is only 30% of the "old" words.dbx. Is
>>> there an explanation for this?
>>
>> I would rather expect the index to grow if you define it on qnames.
>> Please check if you are missing any query results compared to the old
>> index definition. But I guess it should be ok.
>
> I am actually missing some query results. There are some documents
> which definitely have the query string within one of the descendant
> nodes of "content" but were not found. I will invoke a reindex and
> test it again.

No difference, some query results are still missing. The size of the words.dbx file has not increased. The curious thing is that there are very similar documents with the search string in exactly the same node. Some nodes are found, others not.

Kai

From: Adam Retter <adam.retter@de...> - 2007-08-30 12:03:50
Yes, you seem to have solved your own problem and realized the security issues. You could create an xquery extension module that specifically does String -> XML for your needs; this would avoid the security issue in its entirety. You could even copy the code from the request:get-data() function...

Passing XML documents in request parameters seems a bit strange to me, and isn't there a limit in HTTP to the amount of data that can be sent in the querystring?

-----Original Message-----
From: Andrzej Jan Taramina [mailto:andrzej@...]
Sent: Thu 30/08/2007 02:35
To: Adam Retter
Cc: exist-open@...
Subject: Re: [Exist-open] Best way to transform and store a xml document?

From my prior inquiry:

> How to convert a string containing XML to a node set with eXist's XQuery implementation?

If you do a util:eval( $data ) this will result in a node set! Yay!

The only trick is that you have to strip off the XML declaration (<?xml version="1.0"?>) at the front of the $data string before doing the eval, cause otherwise you'll get a syntax error. That is not too hard to do with the core XQuery string functions.

Now the one problem with using eval on a request parameter is the typical script injection security hole. However, in my case, that won't be a big issue since access to that URL/XQuery will only be allowed from a process that I control running on the same host (localhost), so this kludge will actually work for me.

....A

> Do a POST of the XML document to an xquery.xql based url and then place
> that inside the wrapper using XQuery element constructors - i.e.
>
> xquery version "1.0";
>
> declare function local:wrapData($data) as element()
> {
>     <wrapper>
>         <child-wrapper>{ $data }</child-wrapper>
>     </wrapper>
> };
>
> let $data := request:get-data() return
> if($data)then
> (
>     let $newDoc := xmldb:store("/db/mycollection", (), local:wrapData($data)) return
>     if($newDoc)then
>     (
>         <stored>{$newDoc}</stored>
>     )
>     else
>     (
>         <error>could not store the data</error>
>     )
> )
> else
> (
>     <error>no data received</error>
> )
>
> And yes, I would store the .xql file in the database as well, it makes
> things easier as when you take a backup of the db, you have both your
> app and your data.
>
> Thanks Adam.
>
> On Tue, 2007-08-28 at 20:19 -0400, Andrzej Jan Taramina wrote:
> > Need some advice. I want to use HTTP to store an XML document in
> > eXist... but the source document needs to be transformed (actually,
> > embedded in some wrapper XML to be exact). I want the transformation
> > to happen in eXist, so the sourcing application only has to deal with
> > the original xml format.
> >
> > Seems like there would be two ways to do this:
> >
> > 1) Do a post of the XML document to a xquery.xql based url, and then
> > initiate the transformation and update/insert in xquery code.
> >
> > 2) Do a put of the XML document, and use an xquery trigger to modify
> > the document before it gets actually persisted.
> >
> > Are both of these viable solutions (I'm not sure if you can use an
> > xquery trigger in option 2 in this way)?
> >
> > Any pros/cons to either approach? That is, which should I lean towards?
> >
> > Also, is it recommended to put the xquery source (.xql) in the database
> > itself, or in a subdir of the war file? All the examples seem to be in
> > the war subdirectory.
> >
> > Thanks for any/all advice and insight on this.
> >
> > Andrzej Jan Taramina
> > Chaeron Corporation: Enterprise System Solutions
> > http://www.chaeron.com
>
> --
> Adam Retter
>
> Principal Developer
> Devon Portal Project
> Room 310
> County Hall
> Topsham Road
> Exeter
> EX2 4QD
>
> t: 01392 38 3683
> f: 01392 38 2966
> e: adam.retter@...
> w: http://www.devonline.gov.uk

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

From: Adam Retter <adam.retter@de...> - 2007-08-30 11:47:48
I'm afraid that I hardly know Cocoon at all; if you can reproduce the problem in eXist without Cocoon then I can help you...

-----Original Message-----
From: Andrzej Jan Taramina [mailto:andrzej@...]
Sent: Thu 30/08/2007 00:19
To: Adam Retter
Cc: exist-open@...
Subject: RE: [Exist-open] Best way to transform and store a xml document?

> Well see how you get on, I would recommend plenty of testing... to make
> sure you dont run afoul of temporary fragment problems.
>
> I store my entire web app in the db, xslts, js, css, xql, xqm, jpg...
> you get the idea... everything ;-)

Cool.

I have run into a problem with this approach. I do a POST to my query, using multipart/form-data, that is, with an attached file (using curl for that). When I try to do the following in my xquery:

let $data := util:file-read( request:get-uploaded-file( "file" ) )

The get-uploaded-file does return a file object, but the file-read dies on a weird error:

no protocol: /home/andrzej/tomcat6/work/Catalina/localhost/exist/cocoon-files/cache-dir/upload__7c6dfbab_114b3a540b4__8000_00000018.tmp [at line 17, column 18]

The line number points at the let statement shown above. Any ideas what that error message means? Or am I doing something wrong in how I'm trying to get the posted file?

Thanks!

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

From: <kmuetz@we...> - 2007-08-30 10:59:04
wolfgang@... wrote:
>> Now I have modified the configuration as following:
>>
>> <fulltext default="none" attributes="false" alphanum="true">
>>     <create qname="content"/>
>>     <create qname="dc:title"/>
>>     <create qname="base"/>
>> </fulltext>
>> <create qname="dc:identifier" type="xs:string"/>
>>
>> Questions:
>>
>> 1) The size of words.dbx is only 30% of the "old" words.dbx. Is
>> there an explanation for this?
>
> I would rather expect the index to grow if you define it on qnames.
> Please check if you are missing any query results compared to the old
> index definition. But I guess it should be ok.

I am actually missing some query results. There are some documents which definitely have the query string within one of the descendant nodes of "content" but were not found. I will invoke a reindex and test it again.

Kai

From: Wolfgang Meier <wolfgangmm@gm...> - 2007-08-30 10:13:01
> If you define an index on a specific QName, it will be bound to that
> QName. The query engine will only use it for queries on elements or
> attributes with the specified QName. If you want to query "document",
> you need to create an index on it.

Mmmh, I just had an idea: if we had some statistics on the distribution of elements in the collection, the query engine could first check for an index on "document" and, if it can't find one, it could search for indexes defined on potential descendant nodes of "document", e.g. "content" and "dc:title", and consult those. Well, we will need better statistics for the optimizer anyway. I'm looking forward to working on this.

Wolfgang

From: Wolfgang Meier <wolfgang@ex...> - 2007-08-30 10:03:08
> Now I have modified the configuration as following:
>
> <fulltext default="none" attributes="false" alphanum="true">
>     <create qname="content"/>
>     <create qname="dc:title"/>
>     <create qname="base"/>
> </fulltext>
> <create qname="dc:identifier" type="xs:string"/>
>
> Questions:
>
> 1) The size of words.dbx is only 30% of the "old" words.dbx. Is there
> an explanation for this?

I would rather expect the index to grow if you define it on qnames. Please check if you are missing any query results compared to the old index definition. But I guess it should be ok.

> 2) We are using a fulltext query like this:
> let $res := collection('/db/docs')/document[. &= 'term']
> return $res/rdf:Description
>
> This query works with the "old" config, but not with the qname indexes.

If you define an index on a specific QName, it will be bound to that QName. The query engine will only use it for queries on elements or attributes with the specified QName. If you want to query "document", you need to create an index on it.

The reason for this is that the QName index uses different index keys, i.e. the keys are a triple <collection, QName(document), term>. This makes index lookups faster, but binds the index to the given QName. The "old" index definition just used <collection, term> as key.

> I have to rewrite the query to something like this
> let $res := collection('/db/docs')/document[content &= 'term' or
>     rdf:Description/dc:title &= 'term']
> return $res/rdf:Description

Another approach would be to use a union:

collection('/db/docs')/document[(content|rdf:Description/dc:title) &= 'term']

Unfortunately, the optimizer does not yet know how to optimize this type of union expression. It should rewrite it into

let $term := 'term'
let $col := collection('/db/docs')
let $res :=
    $col/((#exist:optimize#) { document[content &= $term] }) |
    $col/((#exist:optimize#) { document[rdf:Description/dc:title &= $term] })
return $res/rdf:Description

which looks ugly but should be very fast.

> BTW, the query optimizer speeds up some of our queries dramatically.

That's good news. There's more work to be done, but it is good to see the optimizer is already usable.

Wolfgang

From: <kmuetz@we...> - 2007-08-30 09:30:15
Hi, we are using eXist 1.1.1 in production. Currently I am playing around with the NewQueryOptimizer and the qname indexes of the current trunk. Now I have some questions.

We have stored about 11,500 documents in multiple collections with the following structure:

<document>
    <rdf:Description xmlns:dc="http://www.purl.org/dc/elements/1.1/"
                     xmlns:xlink="http://www.w3.org/1999/xlink"
                     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <dc:identifier>ID</dc:identifier>
        <dc:title>Title</dc:title>
        ...
    </rdf:Description>
    <status>
        ...
        <base>Text</base>
    </status>
    <content>
        [a set of deeply nested nodes]
    </content>
</document>

We want the <dc:title/>, <base/> and <content/> nodes to be full-text indexed. The "old" index configuration is as following:

<fulltext default="none" attributes="false" alphanum="true">
    <include path="/document/content"/>
    <include path="/document/rdf:Description/dc:title"/>
    <include path="/document/status/base"/>
</fulltext>
<create path="/document/rdf:Description/dc:identifier" type="xs:string"/>
...

Now I have modified the configuration as following:

<fulltext default="none" attributes="false" alphanum="true">
    <create qname="content"/>
    <create qname="dc:title"/>
    <create qname="base"/>
</fulltext>
<create qname="dc:identifier" type="xs:string"/>
...

Questions:

1) The size of words.dbx is only 30% of the "old" words.dbx. Is there an explanation for this?

2) We are using a fulltext query like this:

let $res := collection('/db/docs')/document[. &= 'term']
return $res/rdf:Description

This query works with the "old" config, but not with the qname indexes. I have to rewrite the query to something like this:

let $res := collection('/db/docs')/document[content &= 'term' or rdf:Description/dc:title &= 'term']
return $res/rdf:Description

Can anyone explain the difference? Do I have to rewrite those queries?

BTW, the query optimizer speeds up some of our queries dramatically.

Regards, Kai

From: SourceForge.net <noreply@so...> - 2007-08-30 06:28:58
Bugs item #1784594, was opened at 2007-08-30 06:28. You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=117691&aid=1784594&group_id=17691

Category: None / Group: None / Status: Open / Resolution: None / Priority: 5 / Submitted By: jamesfuller2007 / Assigned to: Nobody/Anonymous

Summary: response:set-header() vs xquery serialize pragma

Initial Comment:

I noticed an interesting situation when trying to set the mime-type on an HTTP response from the RESTful interface (latest trunk), from within my xquery. Initially, I tried to use

response:set-header('Content-type: application/xml')

to set the header on the output, with no joy. Is this supposed to work, or is there some sort of hidden conflict or assumed processing precedence (for example from cocoon, or config settings)?

I then decided to use a serialize pragma

declare option exist:serialize "media-type=application/xml";

which gave me the right mime type.

cheers, Jim Fuller

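For comparison, the two routes from the report side by side; note that response:set-header() takes the header name and value as two separate arguments, and whether the REST serializer overrides the header is exactly the open question in this report:

    xquery version "1.0";
    declare namespace response="http://exist-db.org/xquery/response";
    (: Serialization option route - this is what worked for the reporter. :)
    declare option exist:serialize "media-type=application/xml";

    (: Header route, with name and value as separate arguments. :)
    let $set := response:set-header('Content-Type', 'application/xml')
    return <result/>
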
From: Andrzej Jan Taramina <andrzej@ch...> - 2007-08-30 01:35:59
From my prior inquiry:

> How to convert a string containing XML to a node set with eXist's XQuery implementation?

If you do a util:eval( $data ) this will result in a node set! Yay!

The only trick is that you have to strip off the XML declaration (<?xml version="1.0"?>) at the front of the $data string before doing the eval, cause otherwise you'll get a syntax error. That is not too hard to do with the core XQuery string functions.

Now the one problem with using eval on a request parameter is the typical script injection security hole. However, in my case, that won't be a big issue since access to that URL/XQuery will only be allowed from a process that I control running on the same host (localhost), so this kludge will actually work for me.

....A

> Do a POST of the XML document to an xquery.xql based url and then place
> that inside the wrapper using XQuery element constructors - i.e.
>
> xquery version "1.0";
>
> declare function local:wrapData($data) as element()
> {
>     <wrapper>
>         <child-wrapper>{ $data }</child-wrapper>
>     </wrapper>
> };
>
> let $data := request:get-data() return
> if($data)then
> (
>     let $newDoc := xmldb:store("/db/mycollection", (), local:wrapData($data)) return
>     if($newDoc)then
>     (
>         <stored>{$newDoc}</stored>
>     )
>     else
>     (
>         <error>could not store the data</error>
>     )
> )
> else
> (
>     <error>no data received</error>
> )
>
> And yes, I would store the .xql file in the database as well, it makes
> things easier as when you take a backup of the db, you have both your
> app and your data.
>
> Thanks Adam.
>
> On Tue, 2007-08-28 at 20:19 -0400, Andrzej Jan Taramina wrote:
> > Need some advice. I want to use HTTP to store an XML document in
> > eXist... but the source document needs to be transformed (actually,
> > embedded in some wrapper XML to be exact). I want the transformation
> > to happen in eXist, so the sourcing application only has to deal with
> > the original xml format.
> >
> > Seems like there would be two ways to do this:
> >
> > 1) Do a post of the XML document to a xquery.xql based url, and then
> > initiate the transformation and update/insert in xquery code.
> >
> > 2) Do a put of the XML document, and use an xquery trigger to modify
> > the document before it gets actually persisted.
> >
> > Are both of these viable solutions (I'm not sure if you can use an
> > xquery trigger in option 2 in this way)?
> >
> > Any pros/cons to either approach? That is, which should I lean towards?
> >
> > Also, is it recommended to put the xquery source (.xql) in the database
> > itself, or in a subdir of the war file? All the examples seem to be in
> > the war subdirectory.
> >
> > Thanks for any/all advice and insight on this.
> >
> > Andrzej Jan Taramina
> > Chaeron Corporation: Enterprise System Solutions
> > http://www.chaeron.com
>
> --
> Adam Retter
>
> Principal Developer
> Devon Portal Project
> Room 310
> County Hall
> Topsham Road
> Exeter
> EX2 4QD
>
> t: 01392 38 3683
> f: 01392 38 2966
> e: adam.retter@...
> w: http://www.devonline.gov.uk

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

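A sketch of the trick described above, with a hypothetical 'entry' parameter; the script-injection caveat from the message still applies:

    xquery version "1.0";
    (: Cut off the XML declaration, then hand the remainder to util:eval(),
       which evaluates the string and yields the constructed nodes. :)
    declare namespace util="http://exist-db.org/xquery/util";
    declare namespace request="http://exist-db.org/xquery/request";

    let $raw := request:get-parameter('entry', '')
    let $stripped :=
        if (starts-with($raw, '<?xml')) then
            substring-after($raw, '?>')
        else
            $raw
    return util:eval($stripped)
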
From: Andrzej Jan Taramina <andrzej@ch...> - 2007-08-30 01:10:41
Adam: I've run into a problem with your approach...

The application that is doing the HTTP Post uses request parameters only. So the data comes in as name/value pairs, specifically looking like:

entry=<myxml>....</myxml>

(though on the wire the pointy brackets are escaped of course). So instead of the request:get-data() method, I have to use something like:

request:get-parameter( "entry", () )

The problem is that this returns a string, and when you try to wrap it, it creates a single text node (with all pointy brackets escaped!) inside the child wrapper, rather than treating $data as an XML document. There doesn't seem to be a good way to convert the string (which holds XML) to a node set to avoid this.

Any thoughts on how to address this, or how to convert a string containing XML to a node set with eXist's XQuery implementation? I suppose I could use the Java binding to call the Xerces parser directly, but that seems rather ugly.

Thanks!

....A

> Do a POST of the XML document to an xquery.xql based url and then place
> that inside the wrapper using XQuery element constructors - i.e.
>
> xquery version "1.0";
>
> declare function local:wrapData($data) as element()
> {
>     <wrapper>
>         <child-wrapper>{ $data }</child-wrapper>
>     </wrapper>
> };
>
> let $data := request:get-data() return
> if($data)then
> (
>     let $newDoc := xmldb:store("/db/mycollection", (), local:wrapData($data)) return
>     if($newDoc)then
>     (
>         <stored>{$newDoc}</stored>
>     )
>     else
>     (
>         <error>could not store the data</error>
>     )
> )
> else
> (
>     <error>no data received</error>
> )
>
> And yes, I would store the .xql file in the database as well, it makes
> things easier as when you take a backup of the db, you have both your
> app and your data.
>
> Thanks Adam.
>
> On Tue, 2007-08-28 at 20:19 -0400, Andrzej Jan Taramina wrote:
> > Need some advice. I want to use HTTP to store an XML document in
> > eXist... but the source document needs to be transformed (actually,
> > embedded in some wrapper XML to be exact). I want the transformation
> > to happen in eXist, so the sourcing application only has to deal with
> > the original xml format.
> >
> > Seems like there would be two ways to do this:
> >
> > 1) Do a post of the XML document to a xquery.xql based url, and then
> > initiate the transformation and update/insert in xquery code.
> >
> > 2) Do a put of the XML document, and use an xquery trigger to modify
> > the document before it gets actually persisted.
> >
> > Are both of these viable solutions (I'm not sure if you can use an
> > xquery trigger in option 2 in this way)?
> >
> > Any pros/cons to either approach? That is, which should I lean towards?
> >
> > Also, is it recommended to put the xquery source (.xql) in the database
> > itself, or in a subdir of the war file? All the examples seem to be in
> > the war subdirectory.
> >
> > Thanks for any/all advice and insight on this.
> >
> > Andrzej Jan Taramina
> > Chaeron Corporation: Enterprise System Solutions
> > http://www.chaeron.com
>
> --
> Adam Retter
>
> Principal Developer
> Devon Portal Project
> Room 310
> County Hall
> Topsham Road
> Exeter
> EX2 4QD
>
> t: 01392 38 3683
> f: 01392 38 2966
> e: adam.retter@...
> w: http://www.devonline.gov.uk

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

From: Andrzej Jan Taramina <andrzej@ch...> - 2007-08-29 23:21:22
> Well see how you get on, I would recommend plenty of testing... to make
> sure you dont run afoul of temporary fragment problems.
>
> I store my entire web app in the db, xslts, js, css, xql, xqm, jpg...
> you get the idea... everything ;-)

Cool.

I have run into a problem with this approach. I do a POST to my query, using multipart/form-data, that is, with an attached file (using curl for that). When I try to do the following in my xquery:

let $data := util:file-read( request:get-uploaded-file( "file" ) )

The get-uploaded-file does return a file object, but the file-read dies on a weird error:

no protocol: /home/andrzej/tomcat6/work/Catalina/localhost/exist/cocoon-files/cache-dir/upload__7c6dfbab_114b3a540b4__8000_00000018.tmp [at line 17, column 18]

The line number points at the let statement shown above. Any ideas what that error message means? Or am I doing something wrong in how I'm trying to get the posted file?

Thanks!

Andrzej Jan Taramina
Chaeron Corporation: Enterprise System Solutions
http://www.chaeron.com

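The "no protocol" wording is typical of java.net.URL rejecting a bare filesystem path. A hedged workaround sketch, assuming (unverified for this build) that util:file-read() accepts a file: URL and that the uploaded-file handle atomizes to its path:

    xquery version "1.0";
    (: Hedged sketch: prefix the temporary upload path with a file: scheme
       so the URL parser accepts it; both assumptions are noted above. :)
    declare namespace util="http://exist-db.org/xquery/util";
    declare namespace request="http://exist-db.org/xquery/request";

    let $tmp := request:get-uploaded-file("file")
    return util:file-read(concat("file://", $tmp))
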