|
From: Jens L. <le...@in...> - 2015-02-20 06:39:48
|
Hello,

On 19.02.2015 at 13:26, Diogo FC Patrao wrote:
> Below is the configuration file. I'm sorry, but I can't share the SPARQL
> endpoint, because it's hosted on our intranet. Besides this, I changed
> the cli script to increase -Xmx to 25000M.

Is the data itself confidential? Otherwise, you could also share the dump behind it via Dropbox etc. (not necessarily public; sharing it with Lorenz or me would be sufficient, as it could save us some time looking into the problem - we can sign NDAs as well if needed). We could then load it into an endpoint here for testing.

Also, in the conf file it may be good to specify a termination criterion (e.g. 5 minutes via alg.maxExecutionTimeInSeconds = 300) to avoid the algorithm running forever. (If it doesn't find a perfect solution, it will otherwise always run out of memory at some point.)

Recursion depth 4 could be quite high depending on the data. Trying lower depths first would be worth testing. (It depends on how deeply nested you expect the learned constructs to be.)

Generally, we are currently looking into various approaches and algorithms related to scalability (also across several machines), so if you would like to involve us in the cancer patient use case, we'd be more than happy to do so and could run classifications on larger machines here. For us, it would be a good additional test case to verify whether the improvements we are planning at the moment lead to good results.

Kind regards,

Jens

--
Dr. Jens Lehmann
AKSW Group, Department of Computer Science, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc
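P.S. (editor's sketch): the two suggestions above - a termination criterion and a lower recursion depth - could look roughly like this in a DL-Learner conf file. Only `alg.maxExecutionTimeInSeconds = 300` is taken from the message itself; the `alg.type` value and the `sparql.recursionDepth` option name are assumptions modelled on typical DL-Learner example configurations and may need adjusting to the actual component names in your conf.

```
// Hedged sketch, not a complete conf file: only maxExecutionTimeInSeconds
// is confirmed above; other names are assumptions.
alg.type = "celoe"
alg.maxExecutionTimeInSeconds = 300   // stop after 5 minutes rather than running forever

// assumed option name for the SPARQL fragment extraction component:
// start with a shallow depth and only raise it if the learned
// class expressions need deeper nesting
sparql.recursionDepth = 2
```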