Re: [Plastic-devs] Identifying different subsets [was Re: Also....
From: R H. <ric...@co...> - 2006-08-29 16:07:42
On Tue, 29 Aug 2006, John Taylor wrote:
> R Holbrey wrote:
>>
>> I guess this would be a good thing to hammer out a bit more at the
>> hackathon, but I'm still not quite getting there. Let's take the
>> clustering example a bit further. R easily has a dozen different routines
>> and many possible parameter settings.
>>
>> In fact, one of the harder things to decide up front is whether you
>> want to set the number of clusters or whether you want some kind of
>> estimate. My idea for eirik is that you might want to do two, three or
>> more clusterings to get a feel for how things are going and that,
>> crucially, eirik should start you off with the fastest/dirtiest.
>>
>> The way I read showObjects here is that for each cluster identified,
>> you send back a message saying show these objects, say for those in
>> cluster 1. Then another message for objects in cluster 2 etc. But
>>
>> - how many clusters in all?
>> - what kind of errors were given (possibly on each)?
>> - what other parameters were assumed? (variables used to cluster,
>> types of cluster etc)
> You could put any such extra information into the VOTable that you send
> to other apps.
mmm... but this would imply a separate votable for each clustering/bit of
analysis, which seems rather cumbersome to me. I'd prefer to pass back a
clustering to topcat (say) and then, when the user is happy, s/he can save
the preferred clustering results as an extra column in a votable.
>>
>> For errors, I guess the user could expect this from eirik, but might
>> not these params be useful in another workflow? For the number of
>> clusters, it seems more sensible to have a showObjects message where
>> the elements of the rows array might look like
>>
>> {1,1,1,2,2,3,4,4,3,3}
>>
>> for a 10-row table, with 4 clusters identified ie rows 1-3 belong to
>> cluster 1. As I read showObjects at the moment, this would result in 4
>> messages and then more for "other details". This kind of
>> multiplication of messages seems dangerously confusing to me, harder
>> to implement and more likely to lead to less understanding.
> Sure - we could do it this way instead. The purpose of the showObjects
> message is to tell a visualiser to select a set of rows. We want to
> make sure that we keep it generic, and it doesn't have any
> Eirik-specific information in there. However, the ability to be able to
> identify subsets in the set is quite generally useful I think. The
> format of the message at the moment is
> showObjects(tableId, rows[])
>
> We can change this to
> showObjects(tableId, rows[], groupId)
> (which is what I proposed above, and would require a separate message
> for each group)
> or
> showObjects(tableId, rows[], groupId[])
>
I see how the first one works, though I don't know how other apps would
then know how many clusters to expect. I don't quite get how the second
showObjects would work, unless (as I suggested above) groups are matched
to table rows in rows[], in which case groupId[] seems superfluous...?
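
To make the two readings concrete, here is a minimal Python sketch (not
PLASTIC code; the function name is made up for illustration). It takes one
group label per table row, as in the {1,1,1,2,2,3,4,4,3,3} example, and
derives the per-group row lists that the one-message-per-group form would
need. Rows are numbered from 1 to match that example.

```python
from collections import defaultdict

def rows_by_group(labels):
    """Map each group label to the 1-based row numbers carrying it."""
    groups = defaultdict(list)
    for row, label in enumerate(labels, start=1):
        groups[label].append(row)
    return dict(groups)

labels = [1, 1, 1, 2, 2, 3, 4, 4, 3, 3]   # one label per row of a 10-row table
groups = rows_by_group(labels)

print(len(groups))                         # number of clusters found
for group_id, rows in sorted(groups.items()):
    print(group_id, rows)
```

With one label per row, the number of clusters can be read off a single
message; with one message per group, a receiver would presumably only know
the count once every message had arrived.
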
If I might make another suggestion, it would be for an optional argument
as a string to identify the form of analysis, to jog the user's memory and
which would optionally be stored in a votable header. It might look like:
"clustering by mclust v1.2, Bic=5, groups=4 (based on Bmag, Rmag, Imag)"
Also, I'd be tempted to call this thing showClusters( .. ).
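
As a rough illustration of that description string, a short Python sketch;
the function and its parameter names are hypothetical, and the layout simply
copies the example above.

```python
def describe_analysis(method, version, groups, variables, **params):
    """Build a one-line, human-readable summary of an analysis run."""
    # Extra settings (e.g. Bic=5) come first, then the number of groups.
    settings = [f"{name}={value}" for name, value in params.items()]
    settings.append(f"groups={groups}")
    return (f"clustering by {method} v{version}, "
            f"{', '.join(settings)} "
            f"(based on {', '.join(variables)})")

note = describe_analysis("mclust", "1.2", 4, ["Bmag", "Rmag", "Imag"], Bic=5)
print(note)
```
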
regards,
Richard
ps wasn't there a suggestion a while back for a "showColumns(int cols[])"
?
> I don't have any strong opinions either way - the latter has the benefit
> that it does keep all the groups together.
>
> Whichever is adopted the groupId should be an optional parameter. In
> fact, I'd prefer to see it tacked on in an optional struct at the end
> showObjects(tableId, rows[], [options])
> with
> options={groupId=int[]}
>
> to allow for future expansion.
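
A minimal Python sketch of how a receiver might treat such an optional
struct. This is an illustration under assumptions, not the PLASTIC API: the
handler name, the dict standing in for the struct, the table id and the
return value are all invented here. The point is only that a caller who
omits the options still works, and that keys the receiver does not know can
be ignored.

```python
def show_objects(table_id, rows, options=None):
    """Hypothetical handler: pair each row with an optional group id."""
    options = options or {}
    group_ids = options.get("groupId")      # absent means "no grouping"
    if group_ids is None:
        return [(row, None) for row in rows]
    if len(group_ids) != len(rows):
        raise ValueError("groupId must have one entry per row")
    return list(zip(rows, group_ids))

# Old-style call with no options ("table-1" is a made-up id)
plain = show_objects("table-1", [1, 2, 3])

# New-style call; the unknown key "note" is simply ignored
grouped = show_objects("table-1", [1, 2, 3],
                       {"groupId": [1, 1, 2], "note": "ignored"})
print(plain)
print(grouped)
```
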
>
>>
>> More bluntly, I'd say it was shoe-horning... what happens if someone
>> requests a regression or a pca/fa with clustering on top (please stop,
>> I hear you cry ;)
> No no...go on...
>>
>> An eirik api, anyone?
>>
>> R
>>
>>
>>
>> On Tue, 22 Aug 2006, John Taylor wrote:
>>
>>> I'll need to be reminded about exactly what you want to do....
>>> IIRC the use cases include:
>>>
>>> 1) User loads VOTable into Eirik and, through Eirik, identifies (say) 3
>>> interesting columns, sends a message to Topcat
>>> ivo://.../selectColumns#plot_3d
>>> 2) User loads VOTable into Eirik and, through Eirik, identifies 4
>>> interesting columns, sends a message to Weka
>>> ivo://.../selectColumns#set_active_cols. User then runs a clustering
>>> algorithm on those columns.
>>> 3) User wants to display the clusters in VisIVO. Weka sends messages
>>> to VisIVO ivo://..../showObjects#highlight. Each message has an extra
>>> optional argument of groupId that VisIVO maps to a different colour.
>>> Similarly Eirik could do the clustering (through R), and display the
>>> results through A.N.Other display program.
>>>
>>> 4) User loads a VOTable into TabView, decides on a set of columns for
>>> further investigation in Eirik. TabView instructs Eirik to load the
>>> table, then sends a message ivo://.../selectColumns#set_active_cols
>>>
>>> In all these cases, the bit after the # is just a message fragment
>>> that I made up - each application can define its own.
>>>
>>> John
>>>
>>>
>>>
>>> R Holbrey wrote:
>>>> On Tue, 22 Aug 2006, John Taylor wrote:
>>>>
>>>>> I think we got it! At least, the window comes up, and by hitting
>>>>> random buttons like a monkey I can get plots to appear.
>>>>
>>>> cheers John,
>>>>
>>>> maybe now I can teach the monkey how to do showobjects. What was the
>>>> latest on that btw? I've been struggling to think how eirik would
>>>> understand it, and it needs work, the kind I've been wanting to get on
>>>> with.
>>>>
>>>> If it's not desirable to expand plastic the way I was wondering a
>>>> while back, I'm starting to think eirik should define his own api
>>>> as, basically, requests to various r functions eg to do clustering.
>>>> (At the moment it seems like to do this through plastic would
>>>> require a new votable to be written and passed back, which I think
>>>> inordinately clumsy).
>>>>
>>>> R
>>>>
>>>>>
>>>>>
>>>>> R Holbrey wrote:
>>>>>>
>>>>>> After going to the bowels of NT and back, I realised I missed a
>>>>>> rather crucial file (doh), but the weird thing is, this still
>>>>>> doesn't cause trouble on my main xp machine. Presumably it finds
>>>>>> the file elsewhere, but I'm darned if I can replicate the problem,
>>>>>> unless I move to my ageing old compaq.
>>>>>>
>>>>>> I've uploaded a zip (built with 7-zip, which is GPL) and rar
>>>>>> (better compression than either, I suspect) of the bundle with the
>>>>>> missing r source file, so I'd be grateful if you would give it
>>>>>> another crack. Maybe then I can get on with my life...
>>>>>>
>>>>>> R
>>>>>>
>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>
>>>>>>> Still no luck I'm afraid...and nothing helpful (in the way of
>>>>>>> diagnostics) appears on the commandline.
>>>>>>> J
>>>>>>>
>>>>>>> R Holbrey wrote:
>>>>>>>>
>>>>>>>> one thing I forgot was to set yet another environment variable,
>>>>>>>> R_HOME, so that RDLL in the .bat should really look like eg
>>>>>>>>
>>>>>>>> set R_HOME=c:\Program Files\R\R-2.3.1
>>>>>>>> set RDLL=%R_HOME%\bin
>>>>>>>>
>>>>>>>> with %RDLL% in the path (these should appear in your path after
>>>>>>>> running the .bat from the command prompt??). Oddly though the
>>>>>>>> "depends" utility (from ms visualc - do you happen to have it?)
>>>>>>>> tells me, on my old laptop that there are no bindings to the
>>>>>>>> Rdll at all and several other key libraries. Looks like I'd
>>>>>>>> better reboot this one and go check...
>>>>>>>>
>>>>>>>> R
>>>>>>>>
>>>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>>>
>>>>>>>>> Hi Richard,
>>>>>>>>> I'm using the latest R: 2.3.1 , java 1.5.0_06 and my path is:
>>>>>>>>> C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\Program
>>>>>>>>> Files\ATI Technologies\ATI Control Panel;C:\Program
>>>>>>>>> Files\QuickTime\QTSystem\;C:\Program Files\Python;C:\Program
>>>>>>>>> Files\Java\jdk1.5.0_06\bin;E:\jdt\applications\maven-2.0\bin;E:\jdt\applications\maven-1.0.2\bin;C:\Program
>>>>>>>>> Files\Putty;E:\jdt\applications\apache-ant-1.6.5\bin;C:\Program
>>>>>>>>> Files\Java\jdk1.5.0_06\jre\bin\client;C:\Program Files\SSH
>>>>>>>>> Communications Security\SSH Secure Shell
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>
>>>>>>>>>> thanks John,
>>>>>>>>>>
>>>>>>>>>> can you post me the versions of R, java and anything else you
>>>>>>>>>> can think of you're using and your path. I'll try some
>>>>>>>>>> different machines and I'll look up my old version of Visual
>>>>>>>>>> C. Apparently there was a utility there called "depends" which
>>>>>>>>>> allowed you to see what dynamic libraries were in use (this
>>>>>>>>>> kind of thing is free in linux of course ;)
>>>>>>>>>>
>>>>>>>>>> R
>>>>>>>>>>
>>>>>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Richard,
>>>>>>>>>>> We're getting there. The install worked pretty much out of
>>>>>>>>>>> the box (all my dlls were in the right place I think). Eirik
>>>>>>>>>>> fires up OK and plastic and ACR integration works. However,
>>>>>>>>>>> the test data set isn't loaded, and any attempt to load a
>>>>>>>>>>> VOTable causes Eirik to bomb out. Is there any way I can get
>>>>>>>>>>> some diagnostic info for you?
>>>>>>>>>>>
>>>>>>>>>>> John
>>>>>>>>>>>
>>>>>>>>>>> PS
>>>>>>>>>>> Any particular reason to use RAR? I only have a "free trial"
>>>>>>>>>>> RAR decoder on my machine....zip is more common!
>>>>>>>>>>>
>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I've tried to post an xp version just now on the wiki:
>>>>>>>>>>>> http://eurovotech.org/twiki/bin/view/VOTech/EirikDemo
>>>>>>>>>>>>
>>>>>>>>>>>> which has had plenty of hacks and papering although I did
>>>>>>>>>>>> manage to sort out one of the things that was bogging me
>>>>>>>>>>>> down all week.. (actually, a compiler error, if you must
>>>>>>>>>>>> know..). You need to tell it where R and Java are in the
>>>>>>>>>>>> path (see batch file), but I trust you'll let me know how
>>>>>>>>>>>> this goes ;)
>>>>>>>>>>>>
>>>>>>>>>>>> What platform were you trying to run on? I think it would
>>>>>>>>>>>> probably be an idea to track these things, so that at least
>>>>>>>>>>>> we can advise when not to bother...
>>>>>>>>>>>>
>>>>>>>>>>>> Richard
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>>> It's now complaining about missing libstdc++.so.6
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think our OS installations here are a) not the most
>>>>>>>>>>>>> up-to-date b) a bit sparse.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I could ask Horst to install the missing libraries, but I
>>>>>>>>>>>>> think we have to be able to cope with the situation where
>>>>>>>>>>>>> the user can't get this stuff installed and make eirik as
>>>>>>>>>>>>> independent of any shared libs as possible, regardless of
>>>>>>>>>>>>> the size.
>>>>>>>>>>>>>
>>>>>>>>>>>>> John
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> fairly small, so attached mine! (might work if you create
>>>>>>>>>>>>>> links libXinerama.so and libXinerama.so.1 also in your path).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Logo: probably shouldn't really rely on it, but I might
>>>>>>>>>>>>>> change it when I have an idle minute, so the web link
>>>>>>>>>>>>>> should do for now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> R
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ...could you send me a copy of your logo for putting on
>>>>>>>>>>>>>>>>> the web site?
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> J
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Xinerama is kind of optional with qt and I just left it
>>>>>>>>>>>>>>>> in by default. When I go through another 2-hour compile
>>>>>>>>>>>>>>>> I might take it out to see what happens... otherwise you
>>>>>>>>>>>>>>>> might visit http://rpm.pbone.net and do an advanced
>>>>>>>>>>>>>>>> search for your linux distro and libXinerama, choosing
>>>>>>>>>>>>>>>> the libXinerama-<ver>-i386.rpm if anything comes up.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> eg. for fedora 5 this might turn out as
>>>>>>>>>>>>>>>> ftp://ftp.univie.ac.at/systems/linux/fedora/5/i386/os/Fedora/RPMS/libXinerama-1.0.1-1.2.i386.rpm
>>>>>>>>>>>>>>>> click, save and then run
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> rpm -Uvh libXinerama-1.0.1-1.2.i386.rpm
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> as root.
>>>>>>>>>>>>>>> Not root on this machine, alas, so might have to just try
>>>>>>>>>>>>>>> stuffing it on the library path, assuming I can get hold
>>>>>>>>>>>>>>> of it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Still having teething troubles with windows (you
>>>>>>>>>>>>>>>> probably guessed) but it is working and I'm fairly
>>>>>>>>>>>>>>>> hopeful of getting a demo out this week (I forgot things
>>>>>>>>>>>>>>>> like windows doesn't let you overwrite files whilst
>>>>>>>>>>>>>>>> they're still open...). At the moment it seems to die if
>>>>>>>>>>>>>>>> you try to load several files one after the other.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A version of my (slightly crude) logo lies at
>>>>>>>>>>>>>>>> http://www.comp.leeds.ac.uk/richardh/eirik/xpm/Eirikship.png
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers! Can I rely on that URL or should I take a copy?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>