From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-11 01:29:57
Short: Like I wrote a month ago, I have fixed and documented aperture dependencies. JAI licensing terms are the same as those of activation and javamail, so if we consider them unacceptable, we'd need to get rid of all sun-made libs from our distro. The osgi setup doesn't work. I couldn't make a proper osgi bundle from sesame and its dependencies. If you have any ideas, please post them asap. Antoni Mylka antoni.mylka@...

Long: I've gone through all the libraries in the lib folder. Before, there were 13 unreleased jars:

applewrapper-0.2.jar - unknown sources, done by gunnar
demork-2.0.jar - unknown sources, done by gunnar
DFKIUtils2.jar - unknown sources, done by Andreas Lauer (DFKI)
fontbox-0.2.0-dev - nightly build from 30.10.2007, not available anymore
ical4j-cvs20061019.jar - CVS trunk from 19.10.2006, compiled by me
infsail-0.1.jar - compiled by me
jpim-0.1-patched-by-antoni.jar - hacked by me
mstor-0.9.11-hacked.jar - hacked by me
nrlvalidator-0.1.jar - compiled by me
org.apache.commons.codec_1.2.0.jar - OSGI headers hacked by me
org.apache.commons.httpclient_3.0.0.rc2.jar - OSGI headers hacked by me
pdfbox-0.7.4-dev-20071030.jar - nightly build, not available anymore
unionsail-0.1.jar - compiled by me

I've documented the sources of all jars and replaced the hacked versions with released ones (commons-codec, commons-httpclient, mstor, ical4j). Right now we have only the pdfbox/fontbox which aren't available anymore, and the jpim, with known issues that have been reported to the author. I've included the patch. More details in my commit messages: http://aperture.svn.sourceforge.net/viewvc/aperture/trunk/?view=log Everything is documented on the wiki page, which I've updated thoroughly: http://aperture.wiki.sourceforge.net/Dependencies

The only problem is that it's difficult to make a proper OSGI bundle out of Sesame. I tried, but it seems impossible to do it right, i.e. have an obligatory RequiredPackage for SLF4J and optional packages for everything else:

- activation
- commons-cli
- commons-httpclient
- javax.servlet
- javax.servlet.jsp
- spring
- aduna.app and aduna.webapp.views

I've compiled an aggregated source tree of the onejar.jar from all the branches in the Aduna SVN tree (which wasn't trivial) and tried to turn it into a plugin project, adding all the optional dependencies until all the classes compile, but my Eclipse seems to play tricks on me and doesn't take my changes in the manifest into account at all. So the libraries are as well documented as it gets, but the OSGI setup will probably not work. SLF4J, RDF2Go and Aperture have proper OSGi manifest headers and activators. We only need Sesame and sesame dependencies packaged as bundles. I could take the manifest from NEPOMUK and hack it into the sesame onejar. This may be the only feasible solution at the moment, but it's not the 'right' one: we need a manifest that will list ALL ImportedPackages needed to make use of the WHOLE sesame functionality. If there are no better ideas, I will take the nepomuk manifest, try to improve it manually and hammer it into the sesame jar, but without guarantees that EVERY SINGLE CLASS from the onejar will work in osgi if the optional dependencies are there. We need a way to run a suite of junit tests inside an osgi container; this is needed both for aperture and for sesame. It's the only way to really test the validity of osgi manifests and activators. It seems, though, that we don't have time for this kind of research before the release. Any ideas? All kinds of comments welcome. Antoni Mylka antoni.mylka@...
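The kind of manifest Antoni describes — a mandatory import for SLF4J plus optional imports for everything else — might look roughly like the sketch below. This is only an illustration of the OSGi `Import-Package` syntax with `resolution:=optional`; the bundle name, version ranges, and exact package names are assumptions, not taken from the actual Sesame onejar:

```
Bundle-ManifestVersion: 2
Bundle-SymbolicName: org.openrdf.sesame.onejar
Bundle-Version: 2.0.0
Import-Package: org.slf4j;version="[1.4,2.0)",
 javax.activation;resolution:=optional,
 org.apache.commons.cli;resolution:=optional,
 org.apache.commons.httpclient;resolution:=optional,
 javax.servlet;resolution:=optional,
 javax.servlet.jsp;resolution:=optional,
 org.springframework.beans;resolution:=optional
```

With optional resolution, the bundle resolves even when those packages are absent, and the corresponding Sesame classes simply fail at class-load time if used — which is exactly why a test suite running inside an OSGi container would be needed to validate such a manifest.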
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-08 00:21:22
Hello Aperturians, All the features planned for this release are there (+ support for geotagged JPGs). It's time to test and every pair of eyeballs is welcome. If you have half an hour to spare this and next week - give the Aperture example apps a spin: http://aperture.wiki.sourceforge.net/ApertureExampleApplications We'll try to fix as many bugs as we can. I worked today on the stability of the WebCrawler; it shouldn't crash on faulty URLs in links anymore (examples of sites with faulty links: http://www.quelle.de, http://www.condenast.com). It will log those URLs but will proceed. All kinds of comments welcome. Antoni Mylka antoni.mylka@...
From: Leo Sauermann <leo.sauermann@df...> - 2008-05-07 19:06:28
It was Antoni Myłka who said at the right time 07.05.2008 18:06 the following words: > Antoni Myłka pisze: > >> In Aperture a composer is not described as a string, but as a URI. You >> see on the graph that you actually have two triples, the first one >> pointing at a URI <urn:uuid:17fb...> and a second triple from that URI >> to the fullname. >> >> A call to getURI(NID3.composer) would return that intermediary node. You >> could then use that Node to obtain the actual name from the model. E.g. >> with >> RDFTool.getSingleValue(container.getModel(),uri,NID3.fullname); >> > > To Aperture developers. > > This seems to be a common use case. I would vote for an inclusion of a > new method pair in the RDFContainer: > > getFullname(URI property) > add/putFullname(URI property, String name); > > That would take care about the intermediate node issue. Properties with > the range of nco:Contact occur all over the NIE. > > ???

I don't really fancy it (it's not "generic"), but if it's useful - do it! Maybe think of a longer name, to express the really complicated nature of it...

best
Leo

--
DI Leo Sauermann, http://www.dfki.de/~sauermann
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122, P.O. Box 2080, D-67663 Kaiserslautern, Germany
Fon: +49 631 20575-116, Fax: +49 631 20575-102, Mail: leo.sauermann@...
Geschaeftsfuehrung: Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender), Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-07 18:43:41
Hello Aperturians! Gunnar committed a patch that added GPS coordinates to the output of the JpgExtractor. I polished it a little and added two tests. Now photos with GPS annotations will yield RDF similar to:

<file:photo.jpg> nexif:gps <uri:point> .
<uri:point> a geo:Point .
<uri:point> geo:long "-43.20515156" .
<uri:point> geo:lat "-22.98725664" .

Have fun. Antoni Mylka antoni.mylka@...
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-07 16:06:24
Antoni Myłka pisze: > In Aperture a composer is not described as a string, but as a URI. You > see on the graph that you actually have two triples, the first one > pointing at a URI <urn:uuid:17fb...> and a second triple from that URI > to the fullname. > > A call to getURI(NID3.composer) would return that intermediary node. You > could then use that Node to obtain the actual name from the model. E.g. > with > RDFTool.getSingleValue(container.getModel(),uri,NID3.fullname);

To Aperture developers. This seems to be a common use case. I would vote for the inclusion of a new method pair in the RDFContainer:

getFullname(URI property)
add/putFullname(URI property, String name);

That would take care of the intermediate node issue. Properties with the range of nco:Contact occur all over the NIE. ??? Antoni Mylka antoni.mylka@...
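The two-hop lookup behind the proposed getFullname can be sketched without any Aperture types, using a toy triple list in plain Java. Only the method name getFullname comes from the proposal; the triple-store stand-in and all other names are purely illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class FullnameSketch {
    // toy triple store: each entry is {subject, predicate, object}
    static final List<String[]> TRIPLES = new ArrayList<>();

    static {
        // the two triples from the mail thread, abbreviated
        TRIPLES.add(new String[]{"myTest", "nid3:composer", "urn:uuid:17fb"});
        TRIPLES.add(new String[]{"urn:uuid:17fb", "nco:fullname", "Ludwig van Beethoven"});
    }

    // first matching object for (subject, predicate), or null
    public static String getSingleValue(String subject, String predicate) {
        for (String[] t : TRIPLES) {
            if (t[0].equals(subject) && t[1].equals(predicate)) {
                return t[2];
            }
        }
        return null;
    }

    // the proposed convenience method: follow the property to the
    // intermediate contact node, then read nco:fullname from that node
    public static String getFullname(String subject, String property) {
        String contact = getSingleValue(subject, property);
        return contact == null ? null : getSingleValue(contact, "nco:fullname");
    }

    public static void main(String[] args) {
        System.out.println(getFullname("myTest", "nid3:composer"));
    }
}
```

The point of the sketch is that a direct getString(NID3.composer) stops at the intermediate URI and returns null for the name; the convenience method hides the second hop.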
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-07 16:02:09
Nico Heid pisze: > Hi, > Thanks for the reply, I checked out the source from svn, then it worked. > Here is some code (maybe it helps somebody who has a similar problem) > > Just an example: > Set factories = extractorRegistry.getExtractorFactories(mimeType); > > if (factories != null && !factories.isEmpty()) { > // just fetch the first available Extractor > ExtractorFactory factory = (ExtractorFactory) factories.iterator().next(); > Extractor extractor = factory.get(); > > > // apply the extractor on the specified file > // (just open a new stream rather than buffer the previous stream) > stream = new FileInputStream(file); > buffer = new BufferedInputStream(stream, 8192); > extractor.extract(uri, buffer, null, mimeType, container); > stream.close(); > } > else{ > // use the FileExtractorFactory > factories = extractorRegistry.getFileExtractorFactories(mimeType); > FileExtractorFactory factory = (FileExtractorFactory) factories.iterator().next(); > FileExtractor extractor = factory.get(); > > extractor.extract(uri, file, null, mimeType, container); > } > > But I have one follow up question. > If I look at the output -> container.getModel().writeTo(new PrintWriter(System.out),Syntax.Ntriples); > > I get something like this: (shortened) > > <myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#title> "Sonata No. 1 in F Minor, Op. 2 No. 1 - IV. Prestissimo" . > <myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#composer> <urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> . > <urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname> "Ludwig van Beethoven" . > > > I don't quite understand this. I need to access composer. > But container.getString(NID3.composer) returns null; > > Nico > In Aperture a composer is not described as a string, but as a URI. 
You see on the graph that you actually have two triples, the first one pointing at a URI <urn:uuid:17fb...> and a second triple from that URI to the fullname. A call to getURI(NID3.composer) would return that intermediary node. You could then use that Node to obtain the actual name from the model. E.g. with RDFTool.getSingleValue(container.getModel(),uri,NID3.fullname); Antoni Mylka antoni.mylka@...
From: Leo Sauermann <leo.sauermann@df...> - 2008-05-07 15:49:25
It was Nico Heid who said at the right time 07.05.2008 16:35 the following words: > Hi, > But I have one follow up question. > If I look at the output -> container.getModel().writeTo(new PrintWriter(System.out),Syntax.Ntriples); > > I get something like this: (shortened) > > <myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#title> "Sonata No. 1 in F Minor, Op. 2 No. 1 - IV. Prestissimo" . > <myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#composer> <urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> . > <urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname> "Ludwig van Beethoven" . > > > I don't quite understand this. I need to access composer. > But container.getString(NID3.composer) returns null;

The composer is represented as a resource, a kind of "resource-in-between" the string and the song. To get Ludwig van, you get the resource of the artist, then his fullname, something like this (not tested, I used RDFContainer here for convenience, any other RDF way to walk the graph is ok):

URI composer = container.getURI(NID3.composer);
RDFContainer composerContainer = new RDFContainerImpl(container.getModel(), composer);
String fullname = composerContainer.getString(NCO.fullname);

Hope this does the trick? The reason to model it this way is the flexibility of RDF: because other formats may split up firstname/secondname, and also the composer is represented as a "contact", so it's possible to add (in RDF) an address or something else to it.

kind regards
Leo
From: Nico Heid <Nico.Heid@1u...> - 2008-05-07 14:36:17
Hi, Thanks for the reply, I checked out the source from svn, then it worked. Here is some code (maybe it helps somebody who has a similar problem). Just an example:

Set factories = extractorRegistry.getExtractorFactories(mimeType);

if (factories != null && !factories.isEmpty()) {
    // just fetch the first available Extractor
    ExtractorFactory factory = (ExtractorFactory) factories.iterator().next();
    Extractor extractor = factory.get();

    // apply the extractor on the specified file
    // (just open a new stream rather than buffer the previous stream)
    stream = new FileInputStream(file);
    buffer = new BufferedInputStream(stream, 8192);
    extractor.extract(uri, buffer, null, mimeType, container);
    stream.close();
} else {
    // use the FileExtractorFactory
    factories = extractorRegistry.getFileExtractorFactories(mimeType);
    FileExtractorFactory factory = (FileExtractorFactory) factories.iterator().next();
    FileExtractor extractor = factory.get();

    extractor.extract(uri, file, null, mimeType, container);
}

But I have one follow-up question. If I look at the output ->

container.getModel().writeTo(new PrintWriter(System.out), Syntax.Ntriples);

I get something like this: (shortened)

<myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#title> "Sonata No. 1 in F Minor, Op. 2 No. 1 - IV. Prestissimo" .
<myTest> <http://www.semanticdesktop.org/ontologies/2007/05/10/nid3#composer> <urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> .
<urn:uuid:17fb7051-55f8-4f50-ba73-9ad5ea9f5625> <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#fullname> "Ludwig van Beethoven" .

I don't quite understand this. I need to access composer. But container.getString(NID3.composer) returns null;

Nico
From: Leo Sauermann <leo.sauermann@df...> - 2008-05-06 14:46:53
It was Christiaan Fluit who said at the right time 06.05.2008 14:49 the following words: > - I don't like the fact that UntouchedIterator auto-closes the wrapped > iterator after the last result, only its close method should do that. > This will only make people sloppy.

no, this is not sloppiness, this is the documented and agreed default behaviour of closeable iterators. Both in openrdf and in rdf2go the semantics of closeable is that it automatically closes after the last result: http://repository.aduna-software.org/docs/info.aduna/api/info/aduna/iteration/CloseableIteration.html "CloseableIterations automatically free their resources when exhausted."

btw, I think our mindset should always be: programmers' time is precious and anything we can do simple and easy for them is better than forcing people to comply to complicated stuff...

best
Leo
From: Christiaan Fluit <christiaan.fluit@ad...> - 2008-05-06 13:10:04
Nico Heid wrote: > Hi, > I basically followed the http://aperture.sourceforge.net/tutorial/extractors.html tutorial to hand over files and extract metadata. Works flawless (adding the dependencies for the extractors) > > The only thing I'm wondering is how to get it to work with mp3 files. (using jaudiotagger) > > MimeType is detected correct. But on: > Set factories = extractorRegistry.get(mimeType); > There is no factory in the result. > > Did I misunderstand the concept, or do I miss a dependency?

Aperture now has two kinds of extractors: Extractors and FileExtractors. The name may suggest that the latter is a subtype of the other, but this is not the case. Extractor works on InputStreams, FileExtractor on Files. These two sets of extractors are kept completely separate. ExtractorRegistry.get is deprecated; there are two methods for these two types of extractors. It is recommended to first use getExtractorFactories and, when that returns an empty set, then use getFileExtractorFactories. Once you have a FileExtractorFactory, FileDataObject has methods for getting the File rather than the stream. First try getFile, and when that returns null (i.e. there is no local file corresponding with that URI yet), try downloadContent.

Regards,

Chris
--
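The lookup order Chris recommends — stream-based extractor factories first, file-based ones only when that set is empty — can be sketched generically. The map-based registries below are hypothetical stand-ins for the real ExtractorRegistry (whose getExtractorFactories/getFileExtractorFactories methods are named in the mail); everything else here is illustrative:

```java
import java.util.List;
import java.util.Map;

public class ExtractorLookupSketch {
    // stand-ins for the two separate registries Chris describes:
    // one for stream-based Extractors, one for file-based FileExtractors
    static final Map<String, List<String>> STREAM_FACTORIES =
        Map.of("application/pdf", List.of("PdfExtractorFactory"));
    static final Map<String, List<String>> FILE_FACTORIES =
        Map.of("audio/mpeg", List.of("Mp3FileExtractorFactory"));

    // recommended order: try stream-based factories first,
    // fall back to file-based factories only when the first set is empty
    public static String pickFactory(String mimeType) {
        List<String> stream = STREAM_FACTORIES.getOrDefault(mimeType, List.of());
        if (!stream.isEmpty()) {
            return stream.get(0);
        }
        List<String> file = FILE_FACTORIES.getOrDefault(mimeType, List.of());
        return file.isEmpty() ? null : file.get(0);
    }

    public static void main(String[] args) {
        System.out.println(pickFactory("application/pdf")); // stream-based wins
        System.out.println(pickFactory("audio/mpeg"));      // falls back to file-based
    }
}
```

In the real API the same two-step order applies on the consumer side too: for a FileExtractor, getFile first, then downloadContent when no local file exists yet.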
From: Christiaan Fluit <christiaan.fluit@ad...> - 2008-05-06 12:49:57
I gave the code some testing. Not as thorough as I would like, but as thorough as time currently permits :) Here's what I did:

- upgrade the Aperture version in AutoFocus with the trunk code and the RDF2Go dependency to 4.6.1
- modify our own Model implementation (GraphModel) to conform to the latest Model API
- modify our own RepositoryAccessData, analogous to ModelAccessData, to conform to the latest AccessData API.

Good news: the FileSystemCrawler and ImapCrawler work correctly, i.e. the replacement of the deprecatedUrls with the "touch" approach works as advertised. Both a crawl from scratch and incremental crawling give correct results. No SubCrawlers were tested though.

Bad news (for us): we won't be able to use this in our software right away. The reason is in how we implement and apply our own AccessData implementation. In our case, the CrawlerHandler and AccessData each have their own RepositoryConnection that is obtained from the same Repository, i.e. the crawl metadata and access data of a single DataSource is stored in the same Repository. This works correctly when the AccessData connection is in auto-commit mode, but performance takes a huge hit: an incremental recrawl on a set of unchanged files is between one and two orders of magnitude slower. Note that in this case, two commits are done per file: one to remove the previous timestamp and one to add the new one, whereas previously we only had to remove the file's ID from deprecatedUrls.

An easy solution would be to turn off auto-commit in the AccessData. This way, the entire crawl becomes a single transaction. This does not work however because the NativeStore does not support concurrent transactions. You will get a deadlock as soon as the Crawler reaches a new or changed file, because then the CrawlerHandler will change data on its connection whereas the AccessData has uncommitted changes on the other connection. We will be looking at ways to work around this (I can think of several), but I don't have more time available due to preparations for the Semantic Conference.

Some notes about the ModelAccessData code:

- isTouched: you can also do a model.findStatements with the current timestamp as object, rather than query for any timestamp and see if one matches.
- not all public methods do a checkInitialization
- minor optimization: once you determine the value of timestampLong, also create a Literal version of it so that you don't have to recreate it over and over again
- removeUntouchedIDs: code is correct but "subject" is a misleading var name, should be "object". "previousResource" should be reset to null after you remove its statements, or else you will keep trying to remove these statements until you find another previous resource. Furthermore, I don't understand why this code has been made this complex, isn't it simply a matter of:

while (hasNext) {
    ID = getNext();
    if (!checkTouched(ID)) {
        removeStuffAbout(ID);
    }
}

- I don't like the fact that UntouchedIterator auto-closes the wrapped iterator after the last result, only its close method should do that. This will only make people sloppy.

Regards,

Chris
--
From: Leo Sauermann <leo.sauermann@df...> - 2008-05-06 07:48:46
It was yogesh pendharkar who said at the right time 05.05.2008 22:41 the following words: > Hi, > > I want to extract the plain text file for some specific information > such as the Operation number etc. So for that I need to add some > additional vocabulary class for it (I suppose). I am not sure about > how to go with it. Does anybody has any idea about it?

In what industry/application area are you working? What is an "Operation number"? Please try again to state what the problem is, what you have already found in the aperture project to work, which vocabulary you found, and which additions to this vocabulary you propose - or better, which other vocabulary suits. Best: give us some example file and example RDF you expect as a result, then we can discuss much more quickly!

best
Leo
From: yogesh pendharkar <yogesh.pendharkar@gm...> - 2008-05-05 20:41:37
Hi, I want to extract the plain text file for some specific information such as the Operation number etc. So for that I need to add some additional vocabulary class for it (I suppose). I am not sure about how to go with it. Does anybody have any idea about it? -- Thanks, Yogesh
From: Nico Heid <Nico.Heid@1u...> - 2008-05-05 15:59:05
Hi, I basically followed the http://aperture.sourceforge.net/tutorial/extractors.html tutorial to hand over files and extract metadata. Works flawlessly (adding the dependencies for the extractors). The only thing I'm wondering is how to get it to work with mp3 files (using jaudiotagger). The MimeType is detected correctly. But on:

Set factories = extractorRegistry.get(mimeType);

there is no factory in the result. Did I misunderstand the concept, or do I miss a dependency? Is there already some sample you could point me to? I was using both the latest stable and testing versions, with the same result. Nico
From: Christiaan Fluit <christiaan.fluit@ad...> - 2008-05-05 12:33:20
Antoni Myłka wrote: > Hello Aperturians! > > I started gathering ideas for a release checklist. Please have a look > and add comments. > > http://aperture.wiki.sourceforge.net/ReleaseChecklist

Still missing:

- creation of the release in SVN (i.e. copying to the tags dir)
- (perhaps) setting version numbers in build.xml, etc. in the tag dir
- creating the release from that dir.

This way, people creating the release from the release's tag dir are guaranteed to get the same distro as those downloading it from SourceForge.

Regards,

Chris
--
From: Leo Sauermann <leo.sauermann@df...> - 2008-05-05 12:27:09
Hi It was Antoni Myłka who said at the right time 02.05.2008 18:31 the following words: > Drawback: > 1. The AccessData has become quite a complicated component with many > methods and a sophisticated contract. It will require considerably more > work for a "normal user" to write his/her own AccessData implementation, > though the abstract AccessDataTest class should make it easier to write > appropriate unit tests. > > It's possible to move touchRecursively, getAggregatedIDsClosure and > recursive removal to the CrawlerBase, but it will require an addition of > getAggregatedIDsIterator.

also, I think it would break the encapsulation. These are methods that work well with accessdata.

best
Leo

> Returning sets is bad if we want to conserve > memory (which we do, don't we?). It would take a day. Please voice your > opinions as soon as possible. We are already behind the schedule agreed > on at the beginning of May and the website crawlers are still due for a > refactoring. If anyone wishes to contribute to the noble cause of > Aperture development - now it's the best moment. > > The unit tests pass, but I haven't done any functional testing yet. Will > get back to it no sooner than tuesday evening. > > All kinds of comments welcome. > Antoni Mylka > antoni.mylka@...
From: Christiaan Fluit <christiaan.fluit@ad...> - 2008-05-05 11:50:40
Antoni Myłka wrote: > Hello Aperturians, the new AccessData is ready.

Just reviewed the code: looks good!

I will try to test-drive the code in AutoFocus on some realistic data sets. This means that the FileSystemCrawler, WebCrawler and ImapCrawler will get some pretty intensive incremental crawling testing.

As I am using my own AccessData implementation (which will have to be modified for this to work), I won't be testing the current Aperture implementations such as FileAccessData and ModelAccessData.

Also, this testing will not include checking whether incremental crawling works together with the SubCrawlers, only Crawlers and Extractors will be used.

> 4. isTouched allowed me to get rid of the crawledUrls set in the > WebCrawler which should lower the memory consumption

Neat!

This means that there is only one possible memory hog left in the WebCrawler, namely the queue of links to crawl.

Regards,

Chris
--
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-02 16:42:20
Hello Aperturians, the new AccessData is ready.

1. After some thought I switched to explicit touching in AccessData (touch() and isTouched() methods).
2. Completed three implementations: AccessDataImpl, FileAccessData and ModelAccessData.
3. All three pass a refactored unit test suite.
4. Updated the CrawlerBase - no deprecatedUrls - the handler is private, added delegate methods that take care of everything.
5. Updated all crawlers to conform to those changes; previous references to deprecatedUrls and to the handler are commented out, but visible for the interested.

Benefits:

1. crawling with subcrawlers should work correctly
2. deprecated urls are no more
3. the crawl report is reliable again, some crawlers didn't maintain it correctly
4. isTouched allowed me to get rid of the crawledUrls set in the WebCrawler which should lower the memory consumption
5. due to explicit touching everything should go relatively quickly

Drawback:

1. The AccessData has become quite a complicated component with many methods and a sophisticated contract. It will require considerably more work for a "normal user" to write his/her own AccessData implementation, though the abstract AccessDataTest class should make it easier to write appropriate unit tests.

It's possible to move touchRecursively, getAggregatedIDsClosure and recursive removal to the CrawlerBase, but it will require an addition of getAggregatedIDsIterator. Returning sets is bad if we want to conserve memory (which we do, don't we?). It would take a day. Please voice your opinions as soon as possible. We are already behind the schedule agreed on at the beginning of May and the website crawlers are still due for a refactoring. If anyone wishes to contribute to the noble cause of Aperture development - now is the best moment.

The unit tests pass, but I haven't done any functional testing yet. Will get back to it no sooner than Tuesday evening.

All kinds of comments welcome.
Antoni Mylka
antoni.mylka@...
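The explicit-touching contract described above can be sketched as a minimal in-memory class: touch an ID whenever the crawler sees the object, then sweep everything known-but-untouched after the crawl. This is purely illustrative plain Java with invented names — the real implementations are AccessDataImpl, FileAccessData and ModelAccessData:

```java
import java.util.HashSet;
import java.util.Set;

public class TouchSketch {
    private final Set<String> known = new HashSet<>();   // IDs seen in any crawl so far
    private final Set<String> touched = new HashSet<>(); // IDs seen in the current crawl

    // the crawler calls this for every object it encounters
    public void touch(String id) {
        known.add(id);
        touched.add(id);
    }

    // lets a crawler ask "did I already visit this ID in the current crawl?"
    public boolean isTouched(String id) {
        return touched.contains(id);
    }

    // after a crawl: anything known but untouched has disappeared
    // from the data source and should be reported as removed
    public Set<String> removeUntouched() {
        Set<String> removed = new HashSet<>(known);
        removed.removeAll(touched);
        known.removeAll(removed);
        touched.clear(); // start fresh for the next crawl
        return removed;
    }
}
```

This also shows why the WebCrawler no longer needs its own crawledUrls set: isTouched already answers whether a URL has been visited in the current crawl, so no separate deprecatedUrls bookkeeping is required.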
From: Antoni Myłka <antoni.mylka@gm...> - 2008-05-01 21:23:39
Hello Aperturians! I started gathering ideas for a release checklist. Please have a look and add comments. http://aperture.wiki.sourceforge.net/ReleaseChecklist Antoni Mylka antoni.mylka@...