ssic-linux-users Mailing List for OpenSSI Clusters for Linux
Brought to you by: brucewalker, rogertsang
Messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2003 | 17 | 23 | 32 | 48 | 51 | 23 | 39 | 47 | 107 | 112 | 112 | 70 |
| 2004 | 155 | 283 | 200 | 107 | 73 | 171 | 127 | 119 | 91 | 116 | 175 | 143 |
| 2005 | 168 | 237 | 222 | 183 | 111 | 153 | 123 | 43 | 95 | 179 | 95 | 119 |
| 2006 | 39 | 33 | 133 | 69 | 22 | 40 | 33 | 32 | 34 | 10 | 8 | 18 |
| 2007 | 14 | 3 | 13 | 16 | 15 | 8 | 20 | 25 | 17 | 10 | 8 | 13 |
| 2008 | 7 | | 1 | 6 | 15 | 22 | 22 | 5 | 5 | 17 | 3 | 1 |
| 2009 | 2 | | 29 | 78 | 17 | 3 | | | 1 | 21 | 1 | 4 |
| 2010 | 1 | 5 | | 5 | 7 | 14 | 5 | 72 | 25 | 5 | 14 | 12 |
| 2011 | 9 | | | 3 | 3 | 2 | | | | | | |
| 2012 | | | | | | | 10 | 18 | 2 | 1 | | |
| 2013 | 1 | 3 | | 2 | | | | | | | 1 | |
| 2014 | | | | 2 | 1 | | | | | | | |
| 2019 | | | | | | 1 | | | | | | |
From: Darren W. <da...@wi...> - 2019-06-09 00:17:13

Hey folks, I'm kinda new to this group, but I was looking to create an SSI install using seven nodes: 56 cores, 240 GB RAM, and 80 Gbps InfiniBand (once I install the cards and switches), along with dual gigabit Ethernet over Cat5e and shared storage. I know SSI is not used very commonly, but it still has its uses, and all that being said, I was looking to at least try the user experience out, regardless of its age or the newer software technologies that may have superseded OpenSSI over the years. What are you folks doing then? :)

Kind regards,
Darren Wise
From: John H. <jo...@ca...> - 2014-05-12 14:59:54

On 02/04/14 11:20, Claudio Scordino wrote:
> Does OpenSSI provide some kind of software distributed shared memory (DSM) mechanism?

Shared memory (mmap or shm) is accessible from all nodes in an OpenSSI cluster.
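For reference, a minimal sketch of the System V shared-memory calls John is referring to, using the standard Linux API (the key 0x1234 and the segment size are arbitrary). On a stock kernel the segment is node-local; the claim above is that under OpenSSI's single IPC namespace the same shmid is reachable from any node:

/* Minimal System V shared-memory sketch (standard Linux API).
 * On a stock kernel the segment is node-local; per the message
 * above, under OpenSSI the same shmid is visible cluster-wide.
 * The key 0x1234 is arbitrary. */
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    int shmid = shmget(0x1234, 4096, IPC_CREAT | 0666);
    if (shmid < 0) { perror("shmget"); return 1; }

    char *mem = shmat(shmid, NULL, 0);   /* attach into our address space */
    if (mem == (void *)-1) { perror("shmat"); return 1; }

    strcpy(mem, "hello from one node");  /* any attached process sees this */
    printf("shmid=%d contents=\"%s\"\n", shmid, mem);

    shmdt(mem);                          /* detach; the segment persists */
    return 0;
}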
From: segurah h <se...@gm...> - 2014-04-02 22:04:49

OpenSSI is based on process migration. A multi-threaded process has a "father" process, and it can't be migrated without migrating the father. So I think OpenSSI can't be useful here.

Regards,
Hugo Segura

On Wed, Apr 2, 2014 at 4:50 AM, Claudio Scordino <cl...@ev...> wrote:
> I'd like to run OpenMP multi-threaded applications across a set of ARM/Linux nodes.
> Is OpenSSI suitable to be used as a basis for such kind of activity (by modifying either OpenMP or OpenSSI, if need be)?
> Does OpenSSI provide some kind of software distributed shared memory (DSM) mechanism?
From: Claudio S. <cl...@ev...> - 2014-04-02 09:46:24

Hi.

I'd like to run OpenMP multi-threaded applications across a set of ARM/Linux nodes.

Is OpenSSI suitable to be used as a basis for this kind of activity (by modifying either OpenMP or OpenSSI, if need be)?

Does OpenSSI provide some kind of software distributed shared memory (DSM) mechanism?

Many thanks,

Claudio
From: Roger T. <rog...@gm...> - 2013-02-18 00:38:03

The errors don't say whether it is something to do with POSIX locks. POSIX locks are cluster-wide, and so are semaphores. In the case of POSIX locks, OpenSSI has a bug that can race across the cluster: when two or more processes on different nodes fight over the same file lock, there is no guarantee they will be served in the order their requests were submitted. They are served in the order their requests arrive at the CFS server the file belongs to. A possible fix is to serialize the requests by timestamp.

On Feb 17, 2013 12:48 AM, "Mulyadi Santosa" <mul...@gm...> wrote:
> [...]
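For reference, a minimal sketch of the POSIX record locking under discussion, using the standard fcntl interface (the file path is hypothetical). Per Roger's note, under OpenSSI contending requests for such a lock are granted in arrival order at the file's CFS server, not in submission order:

/* Minimal POSIX record-lock sketch (standard fcntl API; the path
 * is hypothetical). Per the message above, under OpenSSI contending
 * nodes are served in arrival order at the file's CFS server. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/cluster/shared.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct flock fl = {
        .l_type   = F_WRLCK,   /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,         /* 0 = lock the whole file */
    };

    if (fcntl(fd, F_SETLKW, &fl) < 0) {  /* block until granted */
        perror("fcntl(F_SETLKW)");
        return 1;
    }

    /* ... critical section: mutate the shared file ... */

    fl.l_type = F_UNLCK;                 /* release the lock */
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return 0;
}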
From: Mulyadi S. <mul...@gm...> - 2013-02-17 05:48:10

On Fri, Feb 15, 2013 at 5:49 PM, Oliver Urbann <oli...@tu...> wrote:
> the process is migrated but the benchmark crashes with different messages, e.g.:
>
> Client 3 aborted in state 11: FATAL: semop(id=491526) failed: Bezeichner wurde entfernt
>
> or
>
> Client 2 aborted in state 8: FEHLER: lock RowExclusiveLock on object 16384/16391/0 is already held
>
> Did anybody have success load balancing postgresql?

IIRC, locks in OpenSSI are not system-wide. Furthermore, when a process (the whole thread group in this case) is migrated, it is migrated entirely, in the sense that it physically moves.

This is different from, say, MOSIX, which uses a "stub" approach, so locks might still work because there is still communication between the origin node and the new node (from the migration point of view).

So that might explain the lock error message. To prove it, try to install MOSIX and see if the crash/error disappears during your test.

--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
From: Oliver U. <oli...@tu...> - 2013-02-15 11:12:39

Hi everybody,

I've installed OpenSSI on a Debian 5 system together with postgresql. I'm able to migrate the postgresql server to a second node, but only with all processes together. Running the postgresql benchmark pgbench, all created postgresql processes also run on that node. If I try to migrate one of these worker processes to another node using

echo 2 > /proc/68240/goto

the process is migrated, but the benchmark crashes with different messages, e.g.:

Client 3 aborted in state 11: FATAL: semop(id=491526) failed: Bezeichner wurde entfernt

(German for "Identifier removed", i.e. EIDRM) or

Client 2 aborted in state 8: FEHLER: lock RowExclusiveLock on object 16384/16391/0 is already held

(FEHLER = ERROR). Did anybody have success load balancing postgresql?

Best regards,
Oliver

--
Dipl.-Inf. Oliver Urbann
Robotics Research Institute
Section Information Technology
TU Dortmund University
44221 Dortmund, Germany
mailto:Oli...@tu...
http://www.it.irf.tu-dortmund.de
Phone: +49 231 755 6165
Fax: +49 231 755 3251
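For reference, a sketch of the System V semaphore call that is failing in the pgbench run above, with the EIDRM case handled explicitly (standard Linux API; the key 0x5eed is arbitrary). EIDRM means the semaphore set has been removed, which is consistent with the migration issue discussed in the replies:

/* Sketch of the semop() call failing above (standard System V API;
 * the key 0x5eed is arbitrary). EIDRM means the semaphore set was
 * removed, i.e. is no longer usable by the calling process. */
#include <errno.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union semun { int val; };            /* glibc: the caller must define this */

int main(void)
{
    int semid = semget(0x5eed, 1, IPC_CREAT | 0666);
    if (semid < 0) { perror("semget"); return 1; }

    union semun arg = { .val = 1 };
    semctl(semid, 0, SETVAL, arg);   /* initialize the semaphore to 1 */

    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    if (semop(semid, &op, 1) < 0) {  /* P(): decrement, may block */
        if (errno == EIDRM)
            fprintf(stderr, "semop(id=%d) failed: identifier removed\n", semid);
        else
            perror("semop");
        return 1;
    }

    /* ... critical section guarded by the semaphore ... */

    op.sem_op = 1;                   /* V(): release */
    semop(semid, &op, 1);
    return 0;
}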
From: marius a. p. <ma...@gm...> - 2012-10-16 12:41:59

On Fri, Aug 10, 2012 at 1:13 PM, Amit Khatri <ami...@gm...> wrote:
> Hi All,
>
> I have resolved the NIC problem but am stuck in the /sbin/init script.
> It mounts the devfs file system, which is unsuccessful,
> i.e. it throws the error "unknown file system, unable to mount devfs."
> Can anyone help in this regard?
>
> Thanks
>
> On Thu, Aug 9, 2012 at 2:55 PM, Mulyadi Santosa <mul...@gm...> wrote:
>>
>> Hi..
>>
>> On Thu, Aug 9, 2012 at 1:17 PM, Amit Khatri <ami...@gm...> wrote:
>>
>>> when i am booting my machine i am getting following error.
>>>
>>> "mount error: unable to mount devfs filesystem."
>>
>> hmmm, IIRC Debian now uses devtmpfs....

maybe it needs to be populated: http://lists.debian.org/debian-amd64/2005/10/msg00854.html
From: Vincent D. <di...@xs...> - 2012-09-03 17:01:49

Hi Roger,

InfiniBand can indeed also function as a TCP connection; it's 10 Gbit then. The cards I have, however, are 2 x 40 Gbit. TCP is an impossibly slow protocol for an SSI. One-way ping-pong latencies of the fastest, most expensive form of TCP (Solarflare at this moment) are around 3 microseconds; InfiniBand is by default 3x faster. As for bandwidth, see the numbers above; that's a huge difference.

To get InfiniBand working, the distros used by OpenSSI cannot load OFED. In the first place, with such networks you need to bring the network up and get it to work, and to get it to work one needs OFED. That simply requires a specific kernel, and AFAIK OpenSSI is not compatible with it.

What would be interesting is to use the available technology at its fastest incarnation to do memory migration and page fetches. Migrating pages of shared memory is an absolute necessity in the case of my software, and it is also what IRIX was doing. One doesn't want to migrate everything at startup; one wants to migrate pages of, say, 2-4 kilobytes at a time. If that goes over TCP it slows down significantly, and besides that, one first needs to be able to load the InfiniBand drivers and OFED. The set of allowed kernels always follows the kernel version that RHEL and SLES use, so one would have to upgrade OpenSSI to that.

On Sep 3, 2012, at 5:08 AM, Roger Tsang wrote:
> [...]
From: Roger T. <rog...@gm...> - 2012-09-03 03:08:58

FYI: http://wiki.openssi.org/go/Features

OpenSSI has been known to support Infiniband (IP over IB); Stan Smith has used OpenSSI with Infiniband. However, our CVS code repository at SourceForge contains a new optimization for CFS over TCP/IP, an internal CFS read/write cache, but this optimization can be disabled for Infiniband. SHM is not affected.

OpenSSI provides a single IPC namespace. If I remember correctly, the single namespace means shared memory segments are accessible cluster-wide (see below). These SHM segments do migrate, but only during process migration of the owning process. Load balancing of just SHM segments (without process migration) is not implemented.

In OpenSSI, procfs provides an extra column: node_num. All IPC entities in the cluster are visible.

$ cat /proc/sysvipc/shm
key shmid perms size cpid lpid nattch uid gid cuid cgid atime dtime ctime view node_num
0 1671168 1666 294912 200617 200620 10 0 0 0 0 1346607713 1346607607 1346607603 default 3
0 1572876 1666 294912 69603 69606 9 0 0 0 0 1346607713 1346587446 1346473273 default 1

$ cat /proc/sysvipc/sem
key semid perms nsems uid gid cuid cgid otime ctime view node_num
0 4947968 666 1 0 0 0 0 1346639170 1346607607 default 3
0 3997698 666 1 0 0 0 0 1346639050 1346587447 default 1

On Sun, Aug 26, 2012 at 5:46 PM, Vincent Diepeveen <di...@xs...> wrote:
> [...]
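For reference, a tiny sketch that dumps the same SysV SHM table Roger inspects above (standard procfs path). On a stock kernel the file exists but lacks the OpenSSI-specific view and node_num columns described in the message:

/* Dump the SysV SHM table inspected above. On a stock kernel
 * /proc/sysvipc/shm exists but has no "view"/"node_num" columns;
 * those are OpenSSI-specific per the message above. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sysvipc/shm", "r");
    if (!f) { perror("/proc/sysvipc/shm"); return 1; }

    char line[512];
    while (fgets(line, sizeof line, f))
        fputs(line, stdout);   /* header row, then one row per segment */

    fclose(f);
    return 0;
}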
From: Brock P. <br...@um...> - 2012-08-28 15:11:46

Thanks Mulyadi.

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
br...@um...
(734)936-1985

On Aug 28, 2012, at 4:16 AM, Mulyadi Santosa wrote:
> [...]
From: Mulyadi S. <mul...@gm...> - 2012-08-28 08:16:49

Hi Brock...

I think Roger is the man to go to..... I cc:'d this email to him too....

good luck....

On Mon, Aug 27, 2012 at 1:49 AM, Brock Palen <br...@um...> wrote:
> [...]

--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
From: Vincent D. <di...@xs...> - 2012-08-26 21:45:08

Brock,

If we may be honest about SSI in general:

For my 8-node, $200-a-box cluster, having an SSI would be really GREAT. And I can't use big enough capital letters for it.

For HPC with big supercomputers it's total nonsense to do things in a semi-centralized manner. Yet it is interesting to make one big supercomputer out of my $3200 set of 8 machines, which together, by the way, eat around 1.4 kilowatts under full load; that's comparable to an 8-socket machine of $200k.

Now, the latest 8-socket machines sure are faster than the 64 cores I have here, yet realize the price I paid for them. So it would be nearly as powerful as an 8-socket machine, given the right software and decent interconnects.

So you want to make the choice of which interconnect to use yourself.

Let me assure you that built-in ethernet cards are not useful to combine into a chess machine. The latency of the cheap ethernet cards is too bad (sure, Solarflare at $1000 a card will work, but that's exactly my point).

So where the price of the network is already a problem, you sure don't want to pay big bucks for the software. Software that makes one SSI out of it, that IS interesting, provided it is cheap. Anything that's $xxx a port is nonsense to use. In fact, even $xx a node is already nonsense, considering that InfiniBand software is free, and the networks of the past also usually (not all of them) spread their software for free.

A great low-latency interconnect was, for example, Quadrics. I still have an old network here. Its bandwidth is a joke compared to today's PCIe cards, but its latency still beats the fastest and most expensive TCP solution on the planet. The software was freely downloadable; I still have it here somewhere.

OpenSSI was free; that is what is good about it. Yet look at which distros it used to work on. The same goes for openMosix. They never really were compatible with HPC networks, which usually work on different distros with different kernel requirements.

Fixing that costs money. Open source software is great, yet it's usually paid persons who maintain and carry it out. Someone needs to pay that maintenance bill. I'm not, and as it appears, no one is willing to do that.

So a small SSI built on free software, the promise that OpenSSI made, appears to be the pipe dream that never matured. It appears no one is prepared to fund some people to build this.

Now that's sad, but it's simply the reality of the market.

Let's just move on instead of giving attention to the few who still try to make big bucks based on something that's easy to program for and with, yet which simply always was too expensive.

Price matters.
From: Vincent D. <di...@xs...> - 2012-08-26 21:20:48

> We did talk to the creator of ScaleMP, not open but this is what our listeners asked to compare OpenSSI to: http://www.rce-cast.com/Podcast/rce-65-vsmp-scalemp.html

If I google for 'latency' ScaleMP, I basically find 0 links. Zero public latency benchmarks, whereas the most important thing about an SSI is the latency to remote nodes, not how many 'tflops' it gets; tflops are not a useful measure for CPU software that's latency-sensitive. The tflop software that does the crunching in the floating-point world runs on GPUs anyway, which dominate there, and they have a very bad latency to and from RAM. Huge bandwidth, sure, but there is a huge difference between latency and bandwidth.

On the homepage of ScaleMP there is nothing. With the spoken word you cannot easily confront someone with hard statements on what actually gets delivered. So this is not an interesting hardware vendor, as there is no easy-to-find information on the homepage showing you the latencies at a given configuration, and latency is everything. It's not even clear from reading their description whether they deliver you a bunch of machines with software, or just the software. So I don't consider this serious. Yet I'm sure some ex-SGI customers will love them.

If I may remind some people here how expensive all this was: around the year 2000 you could get a 16-processor Sun or DEC Alpha shared-memory machine at a cost of $10 million. For the same $10 million you could also buy a huge NUMA supercomputer, for example from SGI.

The real clustering of common PCs became more popular when they won out over the specialized HPC CPUs. That's something that happened over the past 10 years. Right now there seems to be a split between software that works well on CPUs and everything else, which can work on a GPU using GPGPU. GPUs have won the HPC crown big time, so that leaves for the CPUs everything that's not easy to reprogram for GPUs and/or not sellable as a GPGPU program. It's not clear to me what ScaleMP delivers there from reading its homepage. Not quoting a price is one thing; not stating what exactly you deliver is weird.
From: Vincent D. <di...@xs...> - 2012-08-26 20:59:28

On Aug 26, 2012, at 9:35 PM, Brock Palen wrote:
> We did talk to the creator of ScaleMP, not open but this is what our listeners asked to compare OpenSSI to: http://www.rce-cast.com/Podcast/rce-65-vsmp-scalemp.html

Yeah, I realized some commercial spinoff of Mosix was there. ScaleMP, if I google it, is $400 per socket or so. Compare with InfiniBand: OFED you can download for free.

Suppose I would one day sell a chess program (my program's name is Diep) which runs SMP using ScaleMP. Then I'd ask $50 for my chess program, and might give customers support to connect 2 machines into 1 'supercomputer' using software from ScaleMP that will probably cost $800, plus some fixed costs, say $2k or so? Oh come on. This is not a realistic price.

I'm sure there will be a few users; in fact, if it can work with InfiniBand cards it would be interesting to put on top of InfiniBand here, but of course not at a price of $3200. My cluster is made out of dirt-cheap nodes. They're literally $200 a machine. THAT is HPC computing: using CHEAP hardware. In total I have 64 cores of 2.5 GHz Core 2 Xeon (L5420). Then I would need to pay $400, so double the price of each node, to ScaleMP, for software?

If you want to make propaganda for those guys, be my guest, but this is not a realistic price.

Now, InfiniBand can handle well over a million messages a second. What if I put 2 network cards (2 rails) in each node to increase the number of short messages per second? Nearly all messages in the case of a chess program are either 16 bytes (a write message) or 128 bytes (a read message), and they always go to a random node. Yet those cheap 8-core Xeon nodes of $200 a machine are not so slow. They get 1.5 million chess positions a second with my software, so preferably I do 3 million messages a second per node (1.5 million reads and 1.5 million writes a second).
From: Vincent D. <di...@xs...> - 2012-08-26 20:34:13

There are no active developers in OpenSSI. The last modification was made many years ago, and it was a TINY modification.

On Aug 26, 2012, at 10:03 PM, Scott Walters wrote:
> [...]
From: Vincent D. <di...@xs...> - 2012-08-26 20:33:07

On Aug 26, 2012, at 9:35 PM, Brock Palen wrote:
> On Aug 26, 2012, at 3:16 PM, Vincent Diepeveen wrote:
>> OpenSSI doesn't work at HPC machines as it doesn't support Infiniband nor other HPC type connections.
>
> You are making the assumption that HPC only means massive parallel MPI jobs or large OpenMP jobs. On the other hand I know of a group on campus that just needs to run many serial jobs and uses Mosix as their platform on a few hundred cores. What is the difference between 1000 core parallel job, and 1000 serial jobs?

HPC = High-Performance Computing.

OpenSSI combines a number of machines into one machine with shared memory. If you run things serially, or embarrassingly parallel, there is no need for an SSI. All you need then is, for example, a free parallel shell to start the jobs on all machines at the same time; a good example of such a shell is pdsh. So there is no need for an SSI there.

As for Mosix, that ceased business in early 2008, so you must be speaking about something totally obsolete.

What's the biggest advantage of an SSI over using MPI? The fact that code using memory from a remote node doesn't look different from how it runs on a shared-memory machine.

The reason to use HPC networks is latency and bandwidth. Typically your built-in ethernet card is not DMA, so every message received gives a big dang to the CPU. Its latency at best is 100-200 microseconds, whereas HPC cards are on the order of a few microseconds; the latest InfiniBand (FDR) is 0.85 microseconds. Sure, there are ethernet cards that are better than 100-200 microseconds; note they're usually more expensive than an HPC card and still a factor of 3 worse in latency. The fastest 10-gigabit ethernet card is Solarflare right now. Its latency is around 3 microseconds, its price double that of InfiniBand cards. In fact, its latency matches HPC cards from 10 years ago.

TCP simply is not a safe protocol from several viewpoints, and it's a slow protocol. InfiniBand gets used a lot for TCP as well, yet then it's 10 gigabit. A cheap QDR card is 2 x 40 gigabit using HPC protocols; FDR, with which supercomputers get equipped now, is even faster.

In High-Performance Computing you typically see that on large clusters quite some software is not really professionally optimized; usually the experts who know what they want to achieve are not the best people to also optimize the software. Yet throwing away a factor-8+ advantage in bandwidth in advance is not a good idea, as it removes the possibility of optimizing the software very well, and there is also a group of people in HPC who do write the ultimate speed monsters. HPC is typically the area with the most optimized codes. An SSI should not stand in the way of that by simply enforcing TCP. TCP is not a suitable protocol for HPC.

You typically see that latency plays a crucial role in some codes. A good example is my chess program. For it, an SSI is really interesting to have, yet it requires memory migration of shared memory; then I no longer need to program the MPI commands myself and can just leave it to the SSI. Yet latency is totally crucial then, and doing that over TCP throws away too much performance.

> Our local HPC resource that I admin is 12000 cores and only about half of it has IB, the rest is Ethernet.
>
>> AFAIK it just supports TCP. Furthermore it's total outdated and not maintained.
>
> This would be a problem and thus makes it non-useful for new adopters of the software? We can pass on OpenSSI if it is dead.

First there were Mosix and OpenSSI; SGI also had its own SSI-type software on IRIX. IRIX is no longer, Mosix ceased business, and OpenSSI no longer gets maintained.

One of the programs that really can use a good SSI is a chess program, yet latency is totally crucial to it. It simply needs all the features that the hardware also has, which includes migrating pages of memory, regardless of whether that's shared memory or normal memory. Also, you don't want to migrate the entire shared-memory segment; you just want to migrate a page, say 2 kilobytes at most or so. So in itself an SSI is useful if it delivers real performance for a limited number of cores.

> We did talk to the creator of ScaleMP, not open but this is what our listeners asked to compare OpenSSI to: http://www.rce-cast.com/Podcast/rce-65-vsmp-scalemp.html
>
>> So what do you mean with 'aimed' at HPC?
>
> Our listener base is probably mostly admins and some users and developers mixed in there. We like informing them of anything that could help them in their job.

They have grey hair and had a cluster 10 years ago that ran an SSI? In the past 6+ years not much happened to OpenSSI or Mosix; in fact Mosix stopped business in 2008. When the last supercomputer running IRIX got dismantled I'm not sure; maybe around 2006 somewhere.

Of course, a problem of SSIs is that they don't scale really well, or only at a far too high price. Let me give you an example of SGI under IRIX. Here in the Netherlands, at SARA, one of the largest supercomputers of the time was installed, called Teras. It had 1024 processors in the year 2000, split up into different SSI-type partitions; 512 of them were in a single partition.

Now, the first problem was that this setup cost some processors: effectively the maximum you could use was typically 500, as a few were used for I/O and others for other centralized tasks. This is a serious problem an SSI brings.

If I wanted to time the 500 processes, each had to call the clock, of course. I use gettimeofday, a normal Unix function, for that. However, the disadvantage of migration is that one never knows where a process or its RAM is located, so time gets maintained in a centralized manner. This was a major problem: the program slowed down exponentially already at around 100+ processors when getting the time for each search. So an SSI is very useful, but there is a limit to its possibilities. Up to 64 cores or so it's nice. Yet most will not want to pay big bucks for this.

Programs that are typically interesting to scale to 64 cores or so usually need fast latencies, and the software to do that may not be expensive, let alone require really expensive sorts of hardware. Note that the majority of such software already cannot work across different machines, as such SMP coding is not simple, even at the fastest latencies the interconnects can deliver. For example, most chess programs already struggle here on a somewhat older box, a 16-core AMD Opteron. It has 300-nanosecond latencies. That is still 3x faster than the fastest latency of the fastest network card.

We speak of latency here that is similar to 2x the one-way ping-pong latency, or the RDMA latency the cards manage to deliver. OpenSSI simply never delivered there.
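For reference, a sketch of the clock-polling pattern behind the Teras anecdote above: a worker that calls gettimeofday in its hot loop (standard POSIX API; the one-second bound is arbitrary). On a single node each call is cheap; on an SSI that maintains time centrally, each call can become a remote operation, which is the slowdown Vincent describes:

/* The clock-polling pattern behind the Teras anecdote (standard
 * POSIX API). On one node gettimeofday() is cheap; on an SSI that
 * keeps time centrally, every call can become a remote operation. */
#include <stdio.h>
#include <sys/time.h>

static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    double start = now_seconds();
    long polls = 0;

    /* Each worker polls the clock to decide when to stop searching. */
    while (now_seconds() - start < 1.0)
        polls++;

    printf("%ld clock polls in ~1 second\n", polls);
    return 0;
}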
From: Scott W. <sc...@sl...> - 2012-08-26 20:31:40

The developers don't seem to monitor this list at all. Try the OpenSSI developers' list if you want to talk to them. Mostly this list is active when someone needs help setting up OpenSSI, and then dies down once they're up and running.

Or you could have a podcast interview with trolls ;)

Cheers,
-scott

On 8/26/12, Brock Palen <br...@um...> wrote:
> [...]
From: Brock P. <br...@um...> - 2012-08-26 19:36:01

On Aug 26, 2012, at 3:16 PM, Vincent Diepeveen wrote:
> OpenSSI doesn't work at HPC machines as it doesn't support Infiniband nor other HPC type connections.

You are making the assumption that HPC only means massively parallel MPI jobs or large OpenMP jobs. On the other hand, I know of a group on campus that just needs to run many serial jobs and uses Mosix as their platform on a few hundred cores. What is the difference between a 1000-core parallel job and 1000 serial jobs?

Our local HPC resource that I admin is 12000 cores, and only about half of it has IB; the rest is Ethernet.

> AFAIK it just supports TCP.
> Furthermore it's total outdated and not maintained.

This would be a problem and thus make it non-useful for new adopters of the software? We can pass on OpenSSI if it is dead.

> As for true SSI it also lacks crucial features like memory migration of shared memory.
> Know another way to run parallel using multiple processes?
>
> Everyone uses shared memory for that of course.

We did talk to the creator of ScaleMP; not open, but this is what our listeners asked to compare OpenSSI to: http://www.rce-cast.com/Podcast/rce-65-vsmp-scalemp.html

> So what do you mean with 'aimed' at HPC?

Our listener base is probably mostly admins, with some users and developers mixed in. We like informing them of anything that could help them in their job.

> That's a different league you know...
>
> On Aug 26, 2012, at 8:49 PM, Brock Palen wrote:
>> [...]
From: Vincent D. <di...@xs...> - 2012-08-26 19:15:41

OpenSSI doesn't work on HPC machines, as it supports neither InfiniBand nor other HPC-type interconnects. AFAIK it just supports TCP. Furthermore, it's totally outdated and not maintained.

As for true SSI, it also lacks crucial features like memory migration of shared memory. Know another way to run parallel using multiple processes? Everyone uses shared memory for that, of course.

So what do you mean with 'aimed' at HPC? That's a different league, you know...

On Aug 26, 2012, at 8:49 PM, Brock Palen wrote:
> [...]
From: Brock P. <br...@um...> - 2012-08-26 18:49:51

We host a podcast aimed at the HPC community at rce-cast.com.

We had some listeners request that we cover OpenSSI on the show. Is this something a developer or two of OpenSSI would be interested in doing? If so, contact me off list and we can set this up. It takes about an hour over phone or VoIP.

If you have any suggestions for show topics, please let me know!

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
br...@um...
(734)936-1985