Re: [Plastic-devs] Identifying different subsets [was Re: Also....
From: R H. <ric...@co...> - 2006-08-30 12:32:12
On Wed, 30 Aug 2006, John Taylor wrote:
> R Holbrey wrote:
>>
>>>>
>>>> For errors, I guess the user could expect this from eirik, but might
>>>> not these params be useful in another workflow? For the number of
>>>> clusters, it seems more sensible to have a showObjects message where
>>>> the elements of the rows array might look like
>>>>
>>>> {1,1,1,2,2,3,4,4,3,3}
>>>>
>>>> for a 10-row table, with 4 clusters identified ie rows 1-3 belong to
>>>> cluster 1. As I read showObjects at the moment, this would result in 4
>>>> messages and then more for "other details". This kind of
>>>> multiplication of messages seems dangerously confusing to me, harder
>>>> to implement and more likely to lead to less understanding.
>>> Sure - we could do it this way instead. The purpose of the showObjects
>>> message is to tell a visualiser to select a set of rows. We want to
>>> make sure that we keep it generic, and it doesn't have any
>>> Eirik-specific information in there. However, the ability to be able to
>>> identify subsets in the set is quite generally useful I think. The
>>> format of the message at the moment is
>>> showObjects(tableId, rows[])
>>>
>>> We can change this to
>>> showObjects(tableId, rows[], groupId)
>>> (which is what I proposed above, and would require a separate message
>>> for each group)
>>> or
>>> showObjects(tableId, rows[], groupId[])
>>>
>>
>> I see how the first one works, though I don't know how other apps
>> would then know how many clusters to expect.
> They wouldn't know how many clusters...do they need to?
>> I don't quite get how the second showObjects would work, unless (as I
>> suggested above) groups are matched to table rows in rows[], in which
>> case groupId[] seems superfluous...?
> I might have misunderstood what you were proposing - I thought this was
> it...the addition of a groupId[] mask that identifies the group each row
> belongs to.
I expect you have it better than I do ... you're keeping rows for
compatibility? (i.e. one of rows or groupId could be the mask)
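
To make the mask point concrete, here is a rough Python sketch (not PLASTIC code). The function names are made up, rows are counted from 0 rather than 1, and I am assuming a group id of 0 means "row not selected". It shows that a full-table mask and the rows[] plus groupId[] pair carry the same information:

```python
def mask_to_rows(mask, unselected=0):
    """Full-table mask (one group id per table row) -> parallel rows[] and groupId[]."""
    rows = [i for i, gid in enumerate(mask) if gid != unselected]
    group_ids = [mask[i] for i in rows]
    return rows, group_ids


def rows_to_mask(n_rows, rows, group_ids, unselected=0):
    """Parallel rows[] and groupId[] -> full-table mask (assumed 0 = not selected)."""
    mask = [unselected] * n_rows
    for row, gid in zip(rows, group_ids):
        mask[row] = gid
    return mask


# The 10-row, 4-cluster example mask from the quoted message above.
mask = [1, 1, 1, 2, 2, 3, 4, 4, 3, 3]
rows, group_ids = mask_to_rows(mask)
```

When every row of the table is in some cluster, rows[] is just 0..n-1 and the mask alone would do; rows[] only earns its keep when a subset of the table is sent.
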
>>
>> If I might make another suggestion, it would be for an optional
>> argument as a string to identify the form of analysis, to jog the
>> user's memory and which would optionally be stored in a votable
>> header. It might look like:
>>
>> "clustering by mclust v1.2, Bic=5, groups=4 (based on Bmag, Rmag, Imag)"
>>
>> Also, I'd be tempted to call this thing showClusters( .. ).
> This could be done quite easily with an optional "description" parameter
> in the first of the showObjects messages I suggested above. Can I just
> try to nail down what you're wanting to achieve here, and why the
> existing message doesn't suffice? So, Eirik has run its algorithms and
> identified 3 clusters. It sends 3 showObjects messages in quick
> succession to Topcat, which creates 3 subsets from them. You bring up a
> scatterplot, and there are your 3 clusters, ready to be overlaid on the
> full dataset in different colours. You don't even need a "groupId" to
> do this.
>
> Am I misunderstanding what you want to do?
You could probably get by working as you suggest, but my feeling is that
it is seriously flawed for anything beyond demo usage. The problem comes
in the user knowing that the 3 clusters sent to topcat the first time are
different from the 4 clusters sent in the next analysis and the 5 clusters
sent ... I hope you get the idea.
Also, at some point, you probably want to save the preferred clustering
with the data (as an extra column, presumably), so you'll need to combine
the 3/4/5 etc clusters into one item somehow. I imagine this is do-able in
topcat, but, as before, which of the dozens of clusters to choose from ...
unless you can work it out from the optional message (I've missed
this option somewhere, but it still sounds messy).
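
As a sketch of what I mean by combining them (Python, purely illustrative: the function name, the dictionary keyed by description string and the two example analyses are all invented here, and rows are 0-based), each analysis would end up as one label column tagged with its description:

```python
def labels_from_subsets(n_rows, subsets, unassigned=0):
    """Merge per-cluster row lists into one label column (one entry per table row)."""
    labels = [unassigned] * n_rows
    for cluster_id, rows in enumerate(subsets, start=1):
        for row in rows:
            if labels[row] != unassigned:
                raise ValueError(f"row {row} is in more than one cluster")
            labels[row] = cluster_id
    return labels


# Two hypothetical analyses of the same 10-row table, each a list of row subsets
# as they might arrive from separate showObjects messages.
analyses = {
    "clustering A, groups=3": [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]],
    "clustering B, groups=4": [[0, 1, 2], [3, 4], [5, 8, 9], [6, 7]],
}
columns = {desc: labels_from_subsets(10, subsets)
           for desc, subsets in analyses.items()}
```

The receiver can only build this if it knows which messages belong to the same analysis, which is the piece missing when each cluster arrives as an unrelated message.
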
regards,
Richard
>
>
>>
>> regards,
>> Richard
>>
>> ps wasn't there a suggestion a while back for a "showColumns(int
>> cols[])" ?
> Yes - I think it's still needed. I think we settled on something like
> showColumns(tableId, cols[], [optionsHashTable])#fragment
>
> John
>
>>
>>
>>> I don't have any strong opinions either way - the latter has the benefit
>>> that it does keep all the groups together.
>>>
>>> Whichever is adopted the groupId should be an optional parameter. In
>>> fact, I'd prefer to see it tacked on in an optional struct at the end
>>> showObjects(tableId, rows[], [options])
>>> with
>>> options={groupId=int[]}
>>>
>>> to allow for future expansion.
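
The optional-struct form quoted above, showObjects(tableId, rows[], [options]), might be handled on the receiving side roughly as follows. This is only a Python sketch: show_objects, the dict used for options and the returned structure are assumptions for illustration, not part of PLASTIC.

```python
def show_objects(table_id, rows, options=None):
    """Hypothetical receiver: returns {group id: rows}; no options means one group."""
    options = options or {}
    group_ids = options.get("groupId")
    if group_ids is None:
        # Old-style message without options: the whole selection is a single subset.
        return {None: list(rows)}
    if len(group_ids) != len(rows):
        raise ValueError("groupId must have one entry per row")
    subsets = {}
    for row, gid in zip(rows, group_ids):
        subsets.setdefault(gid, []).append(row)
    return subsets


plain = show_objects("table-1", [4, 7, 9])
grouped = show_objects("table-1", [4, 7, 9], {"groupId": [1, 2, 1]})
```

A sender that omits the options keeps the existing behaviour, while unknown keys in the struct are simply ignored, which leaves room for later additions.
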
>>>
>>>
>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>>
>>>> More bluntly, I'd say it was shoe-horning... what happens if someone
>>>> requests a regression or a pca/fa with clustering on top (please stop,
>>>> I hear you cry ;)
>>> No no...go on...
>>>>
>>>> An eirik api, anyone?
>>>>
>>>> R
>>>>
>>>>
>>>>
>>>> On Tue, 22 Aug 2006, John Taylor wrote:
>>>>
>>>>> I'll need to be reminded about exactly what you want to do....
>>>>> IIRC the use cases include:
>>>>>
>>>>> 1) User loads VOTable into Eirik, through Eirik identifies (say) 3
>>>>> interesting columns, sends a message to Topcat
>>>>> ivo://.../selectColumns#plot_3d
>>>>> 2) User loads VOTable into Eirik, through Eirik identifies 4
>>>>> interesting columns, sends a message to Weka
>>>>> ivo://.../selectColumns#set_active_cols. User then runs a clustering
>>>>> algorithm on those columns.
>>>>> 3) User wants to display the clusters in VisIVO. Weka sends messages
>>>>> to VisIVO ivo://..../showObjects#highlight. Each message has an extra
>>>>> optional argument of groupId that VisIVO maps to a different colour.
>>>>> Similarly Eirik could do the clustering (through R), and display the
>>>>> results through A.N.Other display program.
>>>>>
>>>>> 4) User loads a VOTable into TabView, decides on a set of columns for
>>>>> further investigation in Eirik. TabView instructs Eirik to load the
>>>>> table, then sends a message ivo://.../selectColumns#set_active_cols
>>>>>
>>>>> In all these cases, the bit after the # is just a message fragment
>>>>> that I made up - each application can define its own.
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>
>>>>> R Holbrey wrote:
>>>>>> On Tue, 22 Aug 2006, John Taylor wrote:
>>>>>>
>>>>>>> I think we got it! At least, the window comes up, and by hitting
>>>>>>> random buttons like a monkey I can get plots to appear.
>>>>>>
>>>>>> cheers John,
>>>>>>
>>>>>> maybe now I can teach the monkey how to do showobjects. What was the
>>>>>> latest on that btw? I've been struggling to think how eirik would
>>>>>> understand it, and it needs work, the kind I've been wanting to get on
>>>>>> with.
>>>>>>
>>>>>> If it's not desirable to expand plastic the way I was wondering a
>>>>>> while back, I'm starting to think eirik should define his own api
>>>>>> as, basically, requests to various r functions eg to do clustering.
>>>>>> (At the moment it seems like to do this through plastic would
>>>>>> require a new votable to be written and passed back, which I think
>>>>>> inordinately clumsy).
>>>>>>
>>>>>> R
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> R Holbrey wrote:
>>>>>>>>
>>>>>>>> After going to the bowels of NT and back, I realised I missed a
>>>>>>>> rather crucial file (doh), but the weird thing is, this still
>>>>>>>> doesn't cause trouble on my main xp machine. Presumably it finds
>>>>>>>> the file elsewhere, but I'm darned if I can replicate the problem,
>>>>>>>> unless I move to my ageing old compaq.
>>>>>>>>
>>>>>>>> I've uploaded a zip (built with 7-zip, which is GPL) and rar
>>>>>>>> (better compression than either, I suspect) of the bundle with the
>>>>>>>> missing r source file, so I'd be grateful if you would give it
>>>>>>>> another crack. Maybe then I can get on with my life...
>>>>>>>>
>>>>>>>> R
>>>>>>>>
>>>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>>>
>>>>>>>>> Still no luck I'm afraid...and nothing helpful (in the way of
>>>>>>>>> diagnostics) appears on the commandline.
>>>>>>>>> J
>>>>>>>>>
>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>
>>>>>>>>>> one thing I forgot was to set yet another environment variable,
>>>>>>>>>> R_HOME, so that RDLL in the .bat should really look like eg
>>>>>>>>>>
>>>>>>>>>> set R_HOME=c:\Program Files\R\R-2.3.1
>>>>>>>>>> set RDLL=%R_HOME%\bin
>>>>>>>>>>
>>>>>>>>>> with %RDLL% in the path (these should appear in your path after
>>>>>>>>>> running the .bat from the command prompt??). Oddly though the
>>>>>>>>>> "depends" utility (from ms visualc - do you happen to have it?)
>>>>>>>>>> tells me, on my old laptop that there are no bindings to the
>>>>>>>>>> Rdll at all and several other key libraries. Looks like I'd
>>>>>>>>>> better reboot this one and go check...
>>>>>>>>>>
>>>>>>>>>> R
>>>>>>>>>>
>>>>>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Richard,
>>>>>>>>>>> I'm using the latest R: 2.3.1 , java 1.5.0_06 and my path is:
>>>>>>>>>>> C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\Program
>>>>>>>>>>>
>>>>>>>>>>> Files\ATI Technologies\ATI Control Panel;C:\Program
>>>>>>>>>>> Files\QuickTime\QTSystem\;C:\Program Files\Python;C:\Program
>>>>>>>>>>> Files\Java\jdk1.5.0_06\bin;E:\jdt\applications\maven-2.0\bin;E:\jdt\applications\maven-1.0.2\bin;C:\Program
>>>>>>>>>>>
>>>>>>>>>>> Files\Putty;E:\jdt\applications\apache-ant-1.6.5\bin;C:\Program
>>>>>>>>>>> Files\Java\jdk1.5.0_06\jre\bin\client;C:\Program Files\SSH
>>>>>>>>>>> Communications Security\SSH Secure Shell
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> thanks John,
>>>>>>>>>>>>
>>>>>>>>>>>> can you post me the versions of R, java and anything else you
>>>>>>>>>>>> can think of you're using and your path. I'll try some
>>>>>>>>>>>> different machines and I'll look up my old version of Visual
>>>>>>>>>>>> C. Apparently there was a utility there called "depends" which
>>>>>>>>>>>> allowed you to see what dynamic libraries were in use (this
>>>>>>>>>>>> kind of thing is free in linux of course ;)
>>>>>>>>>>>>
>>>>>>>>>>>> R
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 21 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>>> We're getting there. The install worked pretty much out of
>>>>>>>>>>>>> the box (all my dlls were in the right place I think). Eirik
>>>>>>>>>>>>> fires up OK and plastic and ACR integration works. However,
>>>>>>>>>>>>> the test data set isn't loaded, and any attempt to load a
>>>>>>>>>>>>> VOTable causes Eirik to bomb out. Is there any way I can get
>>>>>>>>>>>>> some diagnostic info for you?
>>>>>>>>>>>>>
>>>>>>>>>>>>> John
>>>>>>>>>>>>>
>>>>>>>>>>>>> PS
>>>>>>>>>>>>> Any particular reason to use RAR? I only have a "free trial"
>>>>>>>>>>>>> RAR decoder on my machine....zip is more common!
>>>>>>>>>>>>>
>>>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've tried to post an xp version just now on the wiki:
>>>>>>>>>>>>>> http://eurovotech.org/twiki/bin/view/VOTech/EirikDemo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> which has had plenty of hacks and papering although I did
>>>>>>>>>>>>>> manage to sort out one of the things that was bogging me
>>>>>>>>>>>>>> down all week.. (actually, a compiler error, if you must
>>>>>>>>>>>>>> know..). You need to tell it where R and Java are in the
>>>>>>>>>>>>>> path (see batch file), but I trust you'll let me know how
>>>>>>>>>>>>>> this goes ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What platform were you trying to run on? I think it would
>>>>>>>>>>>>>> probably be an idea to track these things, so that at least
>>>>>>>>>>>>>> we can advise when not to bother...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Richard
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Richard,
>>>>>>>>>>>>>>> It's now complaining about missing libstdc++.so.6
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think our OS installations here are a) not the most
>>>>>>>>>>>>>>> up-to-date b) a bit sparse.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I could ask Horst to install the missing libraries, but I
>>>>>>>>>>>>>>> think we have to be able to cope with the situation where
>>>>>>>>>>>>>>> the user can't get this stuff installed and make eirik as
>>>>>>>>>>>>>>> independent of any shared libs as possible, regardless of
>>>>>>>>>>>>>>> the size.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> fairly small, so attached mine! (might work if you create
>>>>>>>>>>>>>>>> links libXinerama.so and libXinerama.so.1 also in your
>>>>>>>>>>>>>>>> path).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Logo: probably shouldn't really rely on it, but I might
>>>>>>>>>>>>>>>> change it when I have an idle minute, so the web link
>>>>>>>>>>>>>>>> should do for now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> R
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> R Holbrey wrote:
>>>>>>>>>>>>>>>>>> On Thu, 17 Aug 2006, John Taylor wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ...could you send me a copy of your logo for putting on
>>>>>>>>>>>>>>>>>>> the web site?
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> J
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Xinerama is kind of optional with qt and I just left it
>>>>>>>>>>>>>>>>>> in by default. When I go through another 2-hour compile
>>>>>>>>>>>>>>>>>> I might take it out to see what happens... otherwise you
>>>>>>>>>>>>>>>>>> might visit http://rpm.pbone.net and do an advanced
>>>>>>>>>>>>>>>>>> search for your linux distro and libXinerama, choosing
>>>>>>>>>>>>>>>>>> the libXinerama-<ver>-i386.rpm if anything comes up.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> eg. for fedora 5 this might turn out as
>>>>>>>>>>>>>>>>>> ftp://ftp.univie.ac.at/systems/linux/fedora/5/i386/os/Fedora/RPMS/libXinerama-1.0.1-1.2.i386.rpm
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> click,save and then run
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> rpm -Uvh libXinerama-1.0.1-1.2.i386.rpm
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> as root.
>>>>>>>>>>>>>>>>> Not root on this machine, alas, so might have to just try
>>>>>>>>>>>>>>>>> stuffing it on the library path, assuming I can get hold
>>>>>>>>>>>>>>>>> of it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Still having teething troubles with windows (you
>>>>>>>>>>>>>>>>>> probably guessed) but it is working and I'm fairly
>>>>>>>>>>>>>>>>>> hopeful of getting a demo out this week (I forgot things
>>>>>>>>>>>>>>>>>> like windows doesn't let you overwrite files whilst
>>>>>>>>>>>>>>>>>> they're still open...). At the moment it seems to die if
>>>>>>>>>>>>>>>>>> you try to load several files one after the other.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A version of my (slightly crude) logo lies at
>>>>>>>>>>>>>>>>>> http://www.comp.leeds.ac.uk/richardh/eirik/xpm/Eirikship.png
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers! Can I rely on that URL or should I take a copy?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>