geneticd-devel Mailing List for Genetic Daemon
Status: Alpha
Brought to you by:
jonnymind
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 | | | | | | | | | | | (2) | (5) |
| 2002 | (13) | (2) | (3) | | | (1) | | | | | | |
From: Giancarlo N. <gi...@ni...> - 2002-06-10 01:31:07
|
Wow, I did it. Geneticd version 0.2 is out. This is the news on the site: ------------ I release this in pre1 form just to make sure no (hard) bug comes out along the way. I will test it at work and at university for about a month and then will release version 0.2. This version has a good base for parallel processing development (and already has basic support for existing engines of any kind!) and has a fairly better compilation/installation routine. Home Page will be updated soon (hopefully tomorrow). ---------------- Have a look at it, since it's worth it! Jonnymind. |
|
From: Jonny M. <jon...@ni...> - 2002-03-16 16:22:20
|
0.2 version is almost ready. If you could be so kind as to download the CVS version and try it out a little, I would be very happy. I'll put the GD on version 0.2 in a week... We could even set up an intercontinental cluster! I have some good servers that I could use at 50% of their power; we could put some slaves to work to see if the engine is robust enough. Please, let me know if you're still in! Bye, Giancarlo. |
|
From: Jonny M. <jon...@ni...> - 2002-03-15 23:27:27
|
I finally added the TRAM command, which adds a random burst of newborn agents every N turns. I had grown sick and tired of having to put a \r\n after each print in Core Commands (and the like), so I decided to add it automatically. A method in GStream called addEol( bool ) controls whether print and vprint have to append an extra end-of-line character(s). I also secured the locking of master engines: when there is interaction with a single slave, only that slave is locked, but when the engine needs to talk to all its slaves, or has to change its internal data, the master now locks. I added a GeneticEngine::cleanup virtual method that is called when a free command is issued, just before deleting the engine. This lets the master try to safely free all its slaves before entering the destructor (and having the engine space locked). |
|
From: Jonny M. <jon...@ni...> - 2002-03-12 22:42:17
|
And it is working for good on my local network on 3 PCs! The last CVS commit is stable enough to run for hours without human intervention. The changes I've made are so deep and interesting that this letter will take a little long to write. I hope you will be patient enough to read it. First of all, let me say that the GEMaster is completely operative, and having a lot of automated checking, it somewhat substitutes for human intervention in managing a GD cluster... AMAZING. The only thing I still have to implement is the "serial_size" method of GEMaster, which is anyway useless until we have a master running a master, or a software client. I had to solve a lot of nasty problems. I would like to have your opinion about these things I've done. 1) The GEMaster is a kind of "server-client" engine that remotely manages other engines and displays them and their results as if they were only one engine. This means that the GEMaster logs onto other GDs, creates a suitable GEInABox engine (of the same type as itself, and using the parameters it was given when born, that is, before the PREP stage), and runs it, taking care that everything is going fine. 2) The first problem was: how to create those remote "components" of the master? The same mechanism I described in the old mails is still valid. I just want to recall that compiled learning sets can now be moved around with the lset command. 3) But now, when restoring the master engine from a saved file and creating new instances of the slaves, there is the need for a certain way to address a client engine. I created the UNID = universal ID: a random string that identifies a certain engine in a server (or in a cluster). Any command using <eng-id> (i.e. dump 1 -- dumps the 2nd engine you've created) can be written as "cmd *<eng-UNID> ...", using a star (*) in front of the UNID. This can also be done to route agents in the cluster (i.e. "dage *aasdfae234 *UNIDasdf3.my-agent-generation-0"). 
In this last case, since the numeric slave id cannot be changed easily, it is still safe to use the numeric address (i.e. "dage 4 2.0"). 4) No two engines can exist with the same UNID on the same GD. The UNID is saved in the engine file, and restored when the engine is loaded. If you want to load 2 copies of the same engine, you'll have to change the UNID of the first of them, with the command "reid <eng-id>". 5) There could have been TONS of reasons why slave GDs could not have worked well: errors in the genetic algorithm causing a segmentation fault or division by zero (as ... emm... in the gfunc :-( ), temporary network failure, server dropdown, etc. The master engine HAS to keep working. So I created some self-preserving mechanisms that can be useful: a - GDClient ping-pong automation: every time you "print" or "write" something to a GDClient class (a descendant of GStream), a "ping" command is issued to the target GD. The server replies "+200 pong" and the conversation goes on, or there is an error. In this case, the print method tries to reconnect several times. This ping-pong handshaking can be turned off if you are somehow certain that the connection is still alive, or that if it's not alive you can't bring it back so easily. This can happen when you issue a burst of commands in a short time range. This workaround is specifically designed to "reanimate" the connection, in a transparent way, when it has fallen due to network inactivity. b - resynchronization mechanism: often, the master engine is assuming that the slaves are in a certain state. I.e. while the master is running, the master *hopes* that all the slaves are running with him. If that does not happen, it doesn't matter as long as the engines are responding as the master expects. When even this is not granted, the slaves are set "out of sync". They are not deleted from the list of slaves, but the master will try to bring them back in line. 
This is done with a combination of activities, the core of which is the "resync" method of the SlaveDef class (the class that stores the representation of the slaves for the master engine). 6) Serialization of the master is now possible and easy. De-serialization is a little more complicated: we have to re-login into all the slave GDs and send them the slave engines we stored locally. If they still reside in the contacted server (remember? just one UNID for each engine), the version already in the client is used. This is because, if the master falls but the slaves are still alive and continue running, the newly brought-up master only has to take over the slaves: the work they have done in the meanwhile is not lost. If the slave GD is not available when the master engine is loaded, the GD is removed from the list of slaves and must be manually re-enslaved. 7) Agent loading is done by sending the agent to the first random engine that is capable of holding it. All this is working now in my local network, and BOY, it is amazing! I still have to do some jobs manually (such as creating every now and then a burst of new random agents (fresh blood!), or moving agents around from engine to engine), but the bigger work is automatically done by the GEMaster. TODOs about these topics are: agent loading directed to a particular slave, preservation and reallocation to different servers of slaved engines stored in the save file, and serialization size. I also have to make concurrent calls to the slaves hash in the GEMaster class safe; but that can be considered a far less important change than the other TODOs in the 0.2 version wish list. WE ARE NEAR. Other things I added, in sparse order: - Many replies in corecommands have been moved from -5xx to -4xx, which I will call "warnings". - Empty() method for the Vector class. Useful. - Lockable::Notify is now public. - SlaveDef is now lockable. all the calls to its - Removed the -O2 switch: somehow, optimization caused virtual table corruptions in objects stored on the stack. It could be a bug in my version of gcc. - Added "ping", "unid" and "reid" commands. I am thinking of somehow reducing the size of "corecommands", moving its contents into separate files. I would also like to move GeneticEngine and its children out of the /genetic directory and library. ANY SUGGESTION? The source has changed very much. I suggest a clean start, and/or a make distclean; autoconf; ./configure. Bye. Jonnymind. |
|
From: Jonny M. <jon...@ni...> - 2002-02-16 15:43:44
|
Now Start, Stop, rele (release) and slim (set time share) are working in parallel mode. Just create a master engine (be it an intseq, a gfunc or an engine of your own); set maximum population (maxp) and food/competitiveness (samb). If your engine needs a learning set (like gfunc does), load it with the lset command, or compile it from an ASCII file with ldat. Now prep the master engine. With the slav command, you can enslave another GeneticDaemon (or the same daemon), and create a new GEInABox of the same type as the Master Engine. The learning set will be transferred to the newly born engine. Now, if you start the master engine, all the slaves will be started... I still need some useful commands, such as engine mangling (kill or load an agent), dumping the whole slave engines' population, serializing the master and things like that... But now it's only a matter of time... We also need a command to talk directly with slaves (i.e. "quote <master-id> <slave-id> ........."), but the semi-fixed structure of the commands we have now could be a problem. Moreover, I need to specialize the "print" method of the GDClient class to make it try to recover dead connections transparently (i.e. due to timeouts). I put a "sync" flag in the SlaveDef class: it is cleared if a slave engine gets an error of some kind (either hard, if the daemon falls, or soft, if the engine gets an error while doing something). If an engine goes "out of sync", I would leave this situation to the user/client to handle. I also corrected a lot of small errors that I left behind in the hurry to have a working parallel algorithm. One last concern. When you compile this version, it is better to use the -DPARANOID switch, although this is not needed: host% CFLAGS="-DPARANOID" ./configure host% make This should work fine. Bye, Giancarlo. |
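The workflow described above could look something like this in a GD session. Only the command names mentioned in the mail (crea, maxp, samb, lset, ldat, prep, slav, start) are real; the argument syntax, engine ids and comments are guesses for illustration:

```
crea gfunc master        # create a master engine of type gfunc
maxp 0 1000              # set maximum population on engine 0
samb 0 0.5               # set food/competitiveness
lset 0 myset.lset        # load a compiled learning set
prep 0                   # prepare the master engine
slav 0 otherhost 2001    # enslave another GeneticDaemon: a GEInABox of the
                         # same type is created remotely and the learning
                         # set is transferred to it
start 0                  # starting the master starts all the slaves
```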
|
From: Giancarlo N. <gi...@ni...> - 2002-02-13 19:10:01
|
Gentlemen, this is to announce that the basic framework for master engines is ready, along with a bunch of changes that I am going to explain in this mail. The master engine still has to be programmed, but that is a simple matter now: its basic functionality is working and CVS'd by the time you receive this. I would like to have feedback from you about the solutions I found. I'll explain the changes in a bottom-up manner (as I usually program...). The first addition, of lesser value but still important, is the Hash class. I used it to store a dictionary of slave engines in the master, but its applications are wider. The Hash class is in the new module utils/hash.cpp (hash.h). It is basically a keyed dictionary, in which we have two vectors (implemented through the Vector class): keys and values. Values are of void * kind (they can hold anything, but a back-cast is needed to retrieve the data you store). The interesting thing is the key: it is implemented through a HashKey class, which can hold (at the moment) both integers and strings (char *). Further development can add more key types. Key-based methods are transparent: they will automatically cast an integer or a string into the corresponding HashKey object, and will retrieve a void * from the corresponding value vector. In other words, we can have: Hash h; GeneticEngine *myeng = ...; h.add( 12, "a value in a string" ); h.add( "code12", myeng ); printf( "The value of the entry '12' was %s", (char *) h.get( 12 ) ); GeneticEngine *anEng = (GeneticEngine *) h.get( "code12" ); Got it? Why is this important? Because it can be applied to a whole lot of vectors that have been limited to integer indexing. First of all, the vector of engines in the daemon; but also the plugin vector, the command vector, the engine type vector and anything that comes to mind. The "get" method of those classes was a little "heavy" and, in my opinion, too redundant. I will apply Hash to anything that could benefit from it. 
Also, I will write a "synchash" class that resembles the SyncVector class, to allow different threads to access the dictionary safely. Now, let's get to the serious things. GStream has been moved out from the "Session" module to a different module in the new shared library "gd_client" (src/client). Most important, I created the GDClient class as a child of GStream. GDClient is responsible for client-oriented operations towards GD servers; it can be used by client programs, and is also heavily used by master engines in controlling their slaves. It already has methods to connect, log in and create an engine; maybe it will have more, but the basic "print" method from GStream, coupled with advanced reply retrieving (both reply code and text, multi-line replies and binary data retrieval), should be enough for most tasks. And then the master engine. The new engine model is built upon a matrix of engine type combined with engine class. The base class for all engines is GeneticEngine; then the logic splits. We have the GEInABox class, which is the base class for all local engines, and the GEMaster class, which is the base for all master engines. The basic difference is that GEInABox has all the functionality to manage an owned genetic environment. GEMaster does not have this capability: it relies on slave GEInABoxes. Both these new classes have basic functionalities; more specialized ones are bound to rewrite some virtual methods to change their behavior. GEInABox is what you already know: it is the old GeneticEngine class mangled a little, and it behaves with no knowledge of its surrounding environment. It does not even know (at this moment) whether its creator is a user or a master engine, or in other words, whether it is a free or a slave engine. GEMaster has the responsibility to create slave GEInABox engines and coordinate them. I just had the time to write a basic "enslave" method to create slaves, and it works. 
All the rest still has to be programmed, but now it's a rather simple matter. Engines are still created with the "crea" command, which has been updated to "crea <type> [master]". Adding the "master" keyword creates a GEMaster engine; in the future this will be changed, and the second parameter will be the name of the class to be created. A static function (in the genetic/metaengine module) is responsible for creating the right engine class based upon its name: GeneticEngine *createEngineOfClass( char * ) The "load" command has also been updated, so it starts reading the engine file, gets the class name and calls that function to create the engine. After that, the load() method of the newly created engine is called. Some problems still have to be dealt with, but they are minor headaches: the connection with a GD holding slaves can be lost (due e.g. to a timeout); slave engines can be faulty or removed by a local user. It is a simple task to check for these conditions and update the master status transparently. A more complex problem is this: suppose that the master creates an engine; it gets an engine ID from the slave GD; now a user with appropriate rights deletes that engine, and creates a new one. Based on the algorithm that we are using now, GD creates an engine with THE SAME ID as the one just deleted. The master would be fooled. The solution comes in handy (and this is why I badly needed a Hash class): engine IDs should no longer be numeric: they have to be created in a unique way, so that two engines can't have the same ID. This would also simplify human interaction when dealing with more than one GD at the same time. But this can be done almost painlessly, so we'll deal with this problem after the master class is fully running, but before we issue the 0.2 version of GD. Also, engine IDs should be serialized, and not dynamically assigned by GD at their creation. 
We could also have two kinds of ID: one locally valid (the one we have now can be good) and one absolute, like the way the domain name and the fully qualified domain name work. I would like to have your opinion on this point. One last word to comment on the type-class engine matrix. Now that this step has been made, I can be satisfied in looking back and having chosen this model in development. Think about the mess we would have had if we had to build a class for each different genetic algorithm AND for each slave-kind engine AND master-kind engine. We have just two now, but there will be more in the future (I am thinking of a slave class aware of network topology, communicating with neighbours without master intervention....). It would have been a programming nightmare to recreate a whole set of classes when having to add a genetic algorithm, or a different kind of master-slave interaction... Multiple inheritance would have been even worse! Implementing this model also means that we won't be able to put different master or slave classes in plugins, but this is a minor drawback. It's far more common for users to think about the precise algorithm that the daemon should run: the overall architecture of the parallel network is more the concern of core developers. Moreover, more advanced master-slave structures can be safely shipped with newer official releases. The old structures will still work, but newer GDs will be capable of different behaviour. Another topic about the subclassing model is the "command" limitation. At this moment, commands are overloaded on an engine-type basis. Commands bound to master engines must take care of themselves... the command routines must check whether the command can be issued to the engine they are working on. This could be painful for future development, and should be changed, but at this moment I do not see any drawback in taking care of this after 0.2 is released. It could be a "todo" for 0.3 or 0.4. 
We must also consider that master-specific commands are far less numerous than ordinary commands (at this moment we have only two: slav - enslave a client, and rele - release a client). We can deal with this later. One last concern about release timing. With the master-slave model already on CVS, I can forecast that the 0.2 release will be ready by the last third of March. 0.3 and 0.4 should be ready by July, and we could have 1.0 by October. Waiting for some comments, Giancarlo Niccolai. |
|
From: Jonny M. <jon...@ni...> - 2002-01-19 04:52:08
|
It is ready on the CVS. There is a client class that connects to another geneticd, and can send it orders and interpret replies: GDClient. The class has methods to log in, parse the replies, automatically put all the multi-line responses in a vector, and the binary responses (i.e. agent transfers) in a buffer. It also has an advanced name resolver to find other hosts by name. There is still something to do (i.e. an async timed-out client), but the basic bricks for a master-slave engine are now in place. To try it, use the new "test" command, which we'll use to test experimental features from now on: launch test "hostname"; hostname can also be "localhost". Although the client class can connect to any port, the test command is limited to port 2001. The test command will log in as root (password root), issue a "stat" command, display the multi-line reply, and then transfer agent 1 from engine 0. If engine 0 is not present, the routine won't crash... and this is GOOD. It will automatically fill the binary data variable with void values, and set the reply status to the replied error code (-501 invalid eng id?). Here we are. Next step is a working parallel class. bye, Giancarlo. |
|
From: Jonny M. <jon...@ni...> - 2002-01-17 21:52:55
|
Ok. A complete command queuing is not the answer, but I found a fine
solution, and it is working (and on CVS). The spare population is now the
"working population".
1) If any command that could modify the population (from now on, P-Command)
reaches the engine, it is passed to the environment through a new method
called "alter". Alter locks the environment.
2) if no turn is being evaluated when the alter command is called, the
environment immediately calls a private method called "exec_command", which
does the job.
3) At the beginning of the turn, the environment is locked and the population
is copied to the spare pop. This ensures that the population is coherent,
since the only method authorized to change it ("alter") is synchronized
(locking the environment on the critical section).
4) spare pop is evolved; meanwhile, any read-only command that reads the
population is directed to the "population" vector (which is unchanged), and
any "alter" invocation is cached.
5) at the end of the turn, the environment is locked again. Now, the spare pop
is copied into the population, and the turn counters are incremented. If there
are commands queued from a preceding alter() call, they are applied to the new
population.
6) environment is unlocked and the turn ends.
This implies that any reading command should lock the environment: it won't
stop a turn from being evaluated, but it will prevent inconsistent readings
(while the population is being copied in or out, or while it is modified).
Allowed commands for "alter" are now:
- "add" a gene (from engine::loadgene())
- kill a gene (with a valid Genetic * that points to a gene in the population)
- kill an agent knowing its id (symbolic or numeric)
- changing population parameters (add rate, del rate and same rate).
In the future they could be more.
All this "async" mechanism is now up, and seems to be working. Also, I have
removed some errors in the shutdown sequence.
Regards,
Giancarlo.
|
|
From: Jonny M. <jon...@ni...> - 2002-01-15 00:47:04
|
Though queuing commands is an elegant solution, and would simplify the engine API, I've come out determined to abandon this way because: 1) the command parser would be inefficient compared to simple get/set private attribute access, especially with inline methods. 2) engine-modifying commands are now limited to agent killing and/or agent loading. 3) I am forecasting that in the future the only other engine-changing command will be agent mangling (changing the DNA directly). 4) Queuing those commands would not solve the problem of agent transfers. The client would not have to wait for the engine to have its turn ended -- it would have to wait for the engine to execute its "lage" command... which is the same. 5) Implementing agent exchange in a slave genetic engine class requires a "phantom" population, similar to the spare population, that will hold the loaded agents until the engine is able to move them to the real population. This can easily be done by subclassing the GEInABox engine. Lage is implemented with agent de-serialization: you simply put the GStream * into the deserialize method of a newly created agent, and the obtained agent can be put wherever we want, even in a Vector class, up to the moment in which it is copied to the real population. Now, the only change that seems important to me is that the spare population must be moved out of the environment and into the engine. After each turn, the env_run routine will get the population from the environment, apply the cached changes and copy the spare population into the engine. The engine will only use its private copy of the population to answer queries. How do you feel about it? Giancarlo |
|
From: Jonny M. <jon...@ni...> - 2002-01-14 22:41:08
|
I have a solution that could both provide a coherent and expandable API for the engines and simplify the task of queuing the requests sent to an engine: message exchange. I looked at the gengine.h file and, boy, it's HUGE. Well, not so huge, but it seems to me that it is inefficient and full of lots of small functions, which make the task of programming genetic engines a hard one. Suppose that we have a method like: engine->command( int command_id, Vector *params ); Vector is a relatively small class, and can store convenient amounts of data; but maybe three or four void * parameters are enough. If they are enough for all the MS-Window$ API messages, they should be fine for us. The command parser will be simple: just a switch (all the work is done by the daemon's command parser routines). The return value might be a void *, or a significant structure. This would also reduce a lot of virtual methods (we already have at least 20, and we have just begun), which are a heavy burden for our program. I am having a deep immersion in the gengine code right now to see if this thing is feasible, and with what effort. Giancarlo. |
|
From: Jonny M. <jon...@ni...> - 2002-01-14 22:27:43
|
Pre Scriptum. I am writing this while reading incoming mail from Steve.
Steve's last letter and the new black-box approach to engine
programming gave me an idea worth considering.
At this moment, the engine model is synchronous with the connection: an order is
given, and the connection is put in a wait state until the order is carried out.
When I thought up this model, I had in mind a user sitting at their keyboard,
waiting for the daemon to complete its operations. Also, in the beginning I
did not have a "block" command, which blocks the engine and restores the
previous turn's status, so I did not account for it.
Now, it is possible to queue all the commands that require exclusive
access to a coherent population:
- All the querying commands will operate on the saved state of the
population, while the turn is running.
- All the updating commands will be queued; as soon as the turn finishes, the
environment will be locked by the engine; then each updating command will be
executed on the newly created population. Finally, the resulting population
will be stored into the saved state, and a new turn will begin.
This brings us to the conclusion that there will be no need to lock the
environment, since it will be managed by only one thread. Let's summarize it
with this scheme:
<command 1>                      Engine Queue
     |                           ------------
     V                           Command 1
<command 2> --> [Engine]         Command 2
                   |
                   V
      [Env] (running) ---> [Env] (turn finished)
                                    |
                                    V
<query command> ---> [Env saved state] ---> Query result.
Now, this could lead to a potential problem: the effect of changing the
engine won't be visible until the end of the turn, but a client would not
be able to know when this happens, unless it has a "ticker" on the engine that
says when the turn is over. This can be done with a sort of "short" feed,
which leads us to 2 other considerations:
- we need more than one format of feed
- we need more than one feed per engine (one per connection should be fine)
But these are all feasible issues in my opinion, and with a small effort.
There is another problem. At this moment, agents can be referenced both by
their symbolic name and by their position in the population. After the turn has
completed, the position of an agent could change. Commands referencing a
certain agent and modifying the engine (i.e. dage, delete agent) can "miss
their target" and hit another agent. We could handle those situations in
three ways:
1) this kind of command can be issued only when the engine is not
running; this makes some sort of sense to me...
2) agents can be identified only by their symbolic name. This would
also simplify remote parallel processing operations (imagine that we have
to coordinate an array that is changing, and that is located on several
different async machines... this is a programming nightmare). With the name,
it's all simpler: I tell the engine to kill THAT agent, and this request can
be "routed" through the cluster until we reach the server holding that agent.
Then, if the agent is still alive, it will be killed; if not, nothing bad can
happen. This has only one significant drawback: clients willing to iterate
through the population must always have a complete list of agent names.
Imagine having something like 10,000 agents in our cluster... this is also a
programming nightmare.
3) we can have both. Commands referring to agent names can be issued at
any moment, while commands referring to an agent position must be issued only
when the whole distributed population is synchronously stopped.
The third solution seems the most reasonable to me. It makes some sense that
when you want to iterate through the population (i.e. when a client program
wants to store the whole population locally) the population must stay "fixed"
as it is. The problem is not so pressing when you want to deal with exactly
THAT agent; the agent could be removed in the meanwhile, but this is not
important: the client will be told that the agent does not exist anymore.
The only problem with this solution is that, while you iterate through a
stopped population, you don't want someone else to start this population
again.
This can be obtained in this way: when referring to agents by their ordinal
number, if the population is not local (it is running in a parallel environment)
you need to obtain a stop-lock, that is: the population must be stopped at
all its points, and all its slave servers must be told not to accept orders
to start it (or modify it) again; also, the slaves must have replied that they
stopped and locked the population (a kind of "roger" reply). The client will
remove the stop-lock when it has finished its operations. The only connection
allowed to remove the lock will be the one that originated it, or another one
opened by the same user (useful in case the first connection fails, or the
client messes something up).
That's all for now. I am having an idea on how to queue the commands in the
engines, but I'll use another mail to describe it.
Wow, if you have reached this line, you're really great genetic programmers!
Giancarlo
|
|
From: Stephen W. <sw...@wa...> - 2002-01-14 22:11:42
|
On Mon, 14 Jan 2002, Jonny Mind wrote: > 1) Are the agents widespread among the population? If so, what will > prevent the best agents from replicating endlessly, sterilizing the whole > population? I believe this must be handled in the same fashion you might normally handle it under any EC algorithm. Of course we never want the diversity of our population to get too low - we have to deal with this problem whether it's a parallel or serial application. Perhaps a diversity measure can be used to control mutation rates? > 2) Will the server be able to store a potentially endless list of > agent locations? Will some slave act as a secondary master? Good questions. > These points are still open and, in my opinion, we'll have to leave > them open until we have a basic parallel process working. Then, we'll > be able to manage all these topics with a better understanding. Absolutely! Having some basic parallel structure in place will provide a nice testbed for many ideas. Meanwhile, it seems that the next steps are clear, and the details can be worked out as you move forward. --Steve |
|
From: Jonny M. <jon...@ni...> - 2002-01-14 22:00:38
|
At 22:20 on Monday, 14 January 2002, Stephen Waits wrote: > There are two disconnection scenarios. > > 1) The client is shut down "properly" by a user or system administrator. > In this case the client can attempt to send on its best bred individuals > before returning control to the operating system. > > 2) The client is shut down "improperly", as in the case of a modem > disconnecting or a power outage. EVEN in this case, the best individuals > from the prior generation will have been sent to other clients or the > server (depending on your architecture) already. This seems to work > pretty well in my opinion. Think of it as a massive earthquake, volcano, > or hurricane striking one of many islands. This is right. But the problem is: 1) Are the agents widespread among the population? If so, what will prevent the best agents from replicating endlessly, sterilizing the whole population? 2) Will the server be able to store a potentially endless list of agent locations? Will some slave act as a secondary master? These points are still open and, in my opinion, we'll have to leave them open until we have a basic parallel process working. Then, we'll be able to manage all these topics with a better understanding. Thanks, Giancarlo |
|
From: Stephen W. <sw...@wa...> - 2002-01-14 21:20:06
|
On Mon, 14 Jan 2002, Jonny Mind wrote: > Yes, clients can check in or out, but they can't drop. Why can't they drop out? More below.. > Why did I not stick to Beowulf or some other parallel computation > environment (i.e. Mosix)? Because I want GD to run on many different > and remote servers, without having to share a common OS sublayer. I agree, designing your own protocol rather than using MPI or another environment is much better. > So, we have to examine what happens when a slave drops its dial-up > connection, or breaks down. If it is running a poor genetic > population, this is not a problem, but if the slave with the winning > genetics falls somehow off-line forever (i.e. a blackout in the local > city power supply), this could be a disaster. We'll have to think about > some redundancy also... There are two disconnection scenarios. 1) The client is shut down "properly" by a user or system administrator. In this case the client can attempt to send on its best-bred individuals before returning control to the operating system. 2) The client is shut down "improperly", as in the case of a modem disconnecting or a power outage. EVEN in this case, the best individuals from the prior generation will have been sent to other clients or the server (depending on your architecture) already. This seems to work pretty well in my opinion. Think of it as a massive earthquake, volcano, or hurricane striking one of many islands. Again, I will check out the new code some day soon! Thanks, Steve |
|
From: Jonny M. <jon...@ni...> - 2002-01-14 18:32:43
|
At 04:25 on Monday, 14 January 2002, you wrote: > I also read your notes on how to approach the parallelization of gd. > Please consider checking out what John Koza and company have done on their > Beowulf system with Genetic Programming. Here's the link describing it: > > http://www.genetic-programming.com/parallel.html Good article. It's like my point 1.3 (different populations with the same learning set), but the migrating population is good news to me. Since agent loading (at this time) is synchronous with the turn ending, we could not do this now: the neighbours sending agents around (or the master ordering the slaves to send agents around) would have to wait for the others to finish their turn, and this is an awful waste of time. But buffered agent loading isn't so hard to do. It's now my #1 priority, and I'll program it in a burst. Also, I was thinking that the MASTER server should regulate all the traffic between clients, but this "topology" topic is great: the slaves can communicate on the basis of a previous order sent by the master (a new "topo" command? topo means "mouse" in Italian... :-) ), freeing the master from a heavy burden. > > This approach makes a lot of sense to me. It provides an inherent > speciation model and allows for asynchronous execution. Clients can be of > different speeds and may check in and out of working on a problem at any > moment during a run. Yes, clients can check in or out, but they can't drop. Why did I not stick to Beowulf or some other parallel computation environment (i.e. Mosix)? Because I want GD to run on many different and remote servers, without having to share a common OS sublayer. So, we have to examine what happens when a slave drops its dial-up connection, or breaks down. If it is running a poor genetic population, this is not a problem, but if the slave with the winning genetics falls somehow off-line forever (i.e. a blackout in the local city power supply), this could be a disaster.
We'll have to think about some redundancy also... > > Just my $0.02.. Hope it might be helpful. Very. I repeat, it is a great idea. Since I am very busy with coding, I am often not able to study the fresh literature... any suggestion is WELCOME! Thanks... Giancarlo |
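The "buffered agent loading" mentioned above could be sketched as a small thread-safe inbox: migrants received from the network are parked under a lock at any time, and the engine drains the inbox only at its own turn boundary, so neighbours never block on each other's turns. This is an illustration of the idea, not actual geneticd code; the class and method names are mine.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of an asynchronous migrant buffer. A network thread
// deposits serialized agents whenever they arrive; the engine thread drains
// the buffer once per turn and merges the migrants into its population.
class AgentInbox {
public:
    // Called from the network thread whenever a migrant arrives.
    void deposit(std::string serializedAgent) {
        std::lock_guard<std::mutex> lock(mtx_);
        pending_.push_back(std::move(serializedAgent));
    }

    // Called by the engine thread at the end of each turn; returns all
    // buffered migrants and leaves the inbox empty.
    std::vector<std::string> drain() {
        std::lock_guard<std::mutex> lock(mtx_);
        std::vector<std::string> out;
        out.swap(pending_);
        return out;
    }

private:
    std::mutex mtx_;
    std::vector<std::string> pending_;
};
```

The point of the design is that `deposit` never waits for a turn to finish and `drain` never waits for the network, which is exactly the decoupling the email argues for.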
|
From: Stephen W. <sw...@wa...> - 2002-01-14 03:25:14
|
Great news on getting the project converted over to the (more portable) autoconf setup. I will hopefully be able to check it out soon. My real work is pretty hectic these days, so I'm not sure how soon I might be able to contribute; however, I remain as interested as ever. I also read your notes on how to approach the parallelization of gd. Please consider checking out what John Koza and company have done on their Beowulf system with Genetic Programming. Here's the link describing it: http://www.genetic-programming.com/parallel.html This approach makes a lot of sense to me. It provides an inherent speciation model and allows for asynchronous execution. Clients can be of different speeds and may check in and out of working on a problem at any moment during a run. Just my $0.02.. Hope it might be helpful. --Steve |
|
From: Jonny M. <jon...@ni...> - 2002-01-14 01:47:22
|
Tonight we prepared the first step toward parallel processing in Geneticd.
After deep thought, I understood that:
1) there is more than one strategy for implementing a parallel genetic
population, i.e.:
1.1) Coordinated parallel populations (genetic populations running
independently on different servers)
1.2) Global population, with different calculation points on different
genetic engines
1.3) Independent populations coordinating on the same learning set.
2) Some or all of these strategies must be implemented in a flexible manner. The
ongoing parallel process should be transparent to both the end user AND the
top-level program routines. All the burden of managing the parallel process
should be allocated to a convenient part of the program, and this part must
have a common programming interface.
3) We can make all this happen if we abstract the genetic engine class. It had been
thought of as a "command simplifier", holding a lot of functionality that
commands should have: saving, restoring, and dumping populations or agents were
delegated to the genetic engine because it was a convenient place to store
those routines.
4) The new genetic engine model must be a BLACK BOX that receives a bunch of
raw genetic code and, through the interaction of some internal pieces (such as
the evaluator, the environment and the learning set), is capable of
synthesizing a set of better agents as output.
5) This new black box will be self-complete; it will be "programmed" by
commands or other interfaces with a programmable API, and it will be the only
means by which the servers deal with the agents.
In this way, we can switch black boxes so that they run their populations on
just one machine, on more than one machine, or in any fashion they like.
Genetic engines will not be pluggable, since their architecture is deeply tied
into the daemon itself, but they will be easy to reprogram if we want to add
some new way to manage our populations. Engine types will still be a set of
"internal pieces" of the genetic engines, the tools by which each engine
synthesizes its agents.
I WOULD GREATLY APPRECIATE A COMMENT FROM THE READERS OF THIS MAILING LIST.
------------------------
Now, I have been making some changes towards this model, and I think that we
can really start to develop a parallel - multiserver engine. Now I'll
describe the changes I've made, so that you can get in touch with what is
going on...
1) I split up the BIG genetic.cpp file; engines now reside in a file
called gengine.cpp (with a gengine.h); each source file that wants to include genetic
code will have to include gengine.h, which includes genetic.h and learningset.h
(I have already changed the #include directives in the whole source tree).
2) I removed the methods getEnv and getEvaluator from the GeneticEngine
class, replacing them with some new methods. The daemon was directly accessing
the population or the environment for trivial reasons (i.e. for
counting how many agents were alive, or locking the environment). I
removed these references and substituted them with new GeneticEngine methods.
In this way we can have a completely engine-defined "population" created by our
engines, without having to worry about the daemon being able to understand it.
3) I abstracted the GeneticEngine class, which is now a pure virtual class. Its
pure virtual (= 0) methods describe the whole API that the rest of the daemon will be
able to use to manage genetic populations. The old GeneticEngine class became
the new GEInABox (Genetic Engine In A Box), a genetic engine that is
completely self-functional, but capable of managing its population locally
only.
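As a rough sketch of the abstract-engine idea in point 3, here is what a pure virtual engine interface plus one "in a box" concrete engine might look like. The method names below are illustrative guesses, not the real geneticd API:

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the abstract engine: the daemon talks only to this
// interface and never touches the population or environment directly.
class Engine {
public:
    virtual ~Engine() {}
    virtual void runTurn() = 0;                  // advance one generation
    virtual std::size_t aliveAgents() const = 0; // replaces direct pop access
    virtual std::string describe() const = 0;    // human-readable status
};

// A self-contained engine that manages its population locally only,
// in the spirit of GEInABox. A parallel engine would implement the same
// interface while farming work out to slaves.
class LocalEngine : public Engine {
public:
    explicit LocalEngine(std::size_t popSize) : alive_(popSize) {}
    void runTurn() override { /* evaluate, select, breed... */ ++turns_; }
    std::size_t aliveAgents() const override { return alive_; }
    std::string describe() const override { return "local engine"; }
    std::size_t turns() const { return turns_; }
private:
    std::size_t alive_;
    std::size_t turns_ = 0;
};
```

The payoff is exactly what the email describes: a local black box and a multi-server black box become interchangeable behind the same pointer type.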
Now, the API of the engines will be modified, and maybe we'll have to change
something in the design of the daemon itself, but I think this powerful
model is capable of solving the problem of massive internet-based broadcast
parallel computation in genetic algorithms (WOW!).
Waiting for your news,
Giancarlo Niccolai.
|
|
From: Jonny M. <jon...@ni...> - 2002-01-13 09:53:40
|
The new CVS is now up, with the new configure and makefile scripts and the
new (more "adequate") directory tree.
Geneticd should now compile well in most environments. I definitely abandoned
KDevelop and automake: when it comes to serious things, nothing beats a
fine-tuned, handwritten configuration.
Configuration TODO:
- write more adequate checking macros (I need your help!)
- have a complete config.h
- create a "make depend" target. At the moment, dependency checking is a
little loose; I suggest always doing a "make clean" before a serious
compilation. Anyway, compiling is much faster than
before.
I'll handle it in a few days.
I'd be happy if you could try to compile this version of GD and tell me
something about it.
Regards,
Giancarlo Niccolai
|
|
From: Giancarlo N. <gi...@ni...> - 2002-01-12 19:59:26
|
I re-engineered the whole geneticdaemon tree, and created a configure.in / makefile.in script system that I developed on my own. The scripts are self-configuring and should be able to compile geneticd on a whole lot of Linux systems. I left out some exotic checks, which I'll add at a later stage (or which can be added by other developers, if you will); the important thing is that make, make clean, make dist and make install work well. I used only autoconf -- the distribution does not need it; it is used only to create a workable configure script shipped with the CVS and .tgz distro -- and I wrote the makefiles myself (no automake needed). The makefiles use a root Rules.make and are highly configurable with minimal effort. It's not like having an automake script, but it is far more flexible, and it's often just a matter of copying a Makefile.in script from a parent or sibling directory. The options used in my makefiles are various and powerful; I had to turn a bunch of .o files coming from the geneUtils subdirectory into a .so library file, and with a couple of variables in configure.in and geneUtils/makefile.in, the whole thing was done. You'll find some directions on how to use this model in the TECHNICAL document, in the new developer section. There is good news and bad news about the system. The good news is that gd_plugin can now live inside the genetic daemon directory. I re-created the source tree, so as to arrange better directory names and dynamic library bundles, and to be able to ship plugins with the source code. The bad news is that you can throw away your whole CVS archive. I requested a CVS cleaning from SourceForge. As soon as our CVS is cleaned, I'll put the new tree under the geneticd/ cvs directory, and our distributions will be called geneticd-VERSION, starting from version 0.2. As soon as I'm able to create the new CVS archive, I'll let you know.
If you have developed some new source in the meanwhile, don't use CVS; send it directly to me, and I'll integrate it manually. Best regards, Giancarlo Niccolai |
|
From: Jonny M. <jon...@ni...> - 2001-12-27 00:19:00
|
And something else. I had the time to deal with some topics we needed to touch if we are to proceed with massive parallel processing. Here I include the CVS log: --------------------- BEGIN OF CVS COMMIT Altered the "rset" command so that it now stops running engines immediately. This was necessary to ensure that the client is able to stop an engine asynchronously, if the engine takes too long to complete a turn, or if it is faulty. This change required the restructuring of a whole set of classes, from the genetic engine (to which I added the block() method) to the environment, the population and the agent. In the environment, a spare_pop member variable has been added; at the beginning of a turn, a (deep) copy of the population is saved, and it can be restored with the reset() method (formerly it was useless; now it is used to restore the environment status of the previous turn). Population and agent have been provided with a copy() method; also, corresponding changes have been made to the serialization methods. The changes have been tested and seem to work fine, but deeper testing is needed. ------------------------ Time sharing is implemented through the "slim" (set limit) command, which takes a real number between 1 and 100. If the number is 100, the time-sharing limit is disabled; otherwise, a control thread is started whose duty is to monitor the activity of the main thread. This method is not very precise, but it works. Fixed a never-ending loop when reading from dropped network connections. Now a connection can be safely closed either by the quit command or by dropping the TCP/IP link. ------------------------- END OF CVS COMMIT Now that we have a strong blockade command (rset) and time-sharing capabilities, we can begin thinking about how we want to implement the broadcast of populations. I have some ideas about it, and I will post them tomorrow. To get the latest updates, enter your working geneticd directory, log in as usual and use the "update" CVS command.
Then autoconf, automake, configure and compile. And have fun! 2002 will be the year of widespread genetic-based AI all over the internet. Wanna play with us? Merry Christmas and a happy new AI year. Giancarlo |
|
From: Jonny M. <jon...@ni...> - 2001-12-16 22:32:03
|
At 22:03 on Saturday, 15 December 2001, you wrote: > Libtool appears to be making a poor decision on my system.. > > make[3]: Entering directory > `/bigdisk/users/swaits/Dev/geneticd/geneticdaemon/geneticdaemon/genetic' > /bin/sh ../../libtool --mode=compile g++ -DHAVE_CONFIG_H -I. -I. -I../.. > -O2 -fno-exceptions -fno-rtti -fno-check-new -c learningunit.cpp > g++ -DHAVE_CONFIG_H -I. -I. -I../.. -O2 -fno-exceptions -fno-rtti > -fno-check-new -Wp,-MD,.deps/learningunit.pp -c learningunit.cpp $ -fPIC > -DPIC -o learningunit.lo > g++: cannot specify -o with -c or -S and multiple compilations > > > [swaits@gateway] [1:00pm] > [~/Dev/geneticd/geneticdaemon] 121> g++ --version > 2.95.2 > > I know my g++ is pretty old, but seems like it should work ok. Any ideas? > > --Steve I developed the whole thing with KDevelop, which makes a lot of decisions by itself. Now, if you have KDevelop, open it and select "project options", then disable "optimization" (that -O2 flag). Otherwise, edit aclocal.m4 and acinclude.m4, remove all the -O2 flags (find -O2 and substitute them with ""), then make distclean, autoconf, automake, ./configure and make. Good luck! Giancarlo PS: let's use the developer list to communicate; others may have the same problem. Just send mail to gen...@li... |
|
From: Jonny M. <jon...@ni...> - 2001-12-15 14:54:41
|
Agents can now be named. The name of an agent can be a symbolic reference, e.g. "eddie". An agent's "complete" name is formed by the "family" name (again, "eddie"), a minus sign and a generation id, starting from 0. The father has id "eddie-0", its children have ids "eddie-1", etc. An agent can be referenced by full name or by numeric id (its position in the population) in all commands referencing agents. E.g.: dage 0 eddie-0 or dage 0 14 (if eddie-0 is the 14th agent). The names of agents can be dynamically changed with the nage command: nage <eng-id> <agent-id> - displays the agent's full name nage <eng-id> <agent-id> <name> - changes the family name; the generation will be 0 This change is meant for human readers who wish to follow the destiny of a particular agent. Software clients can still safely reference agents by their position in the population. |
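The naming scheme above ("<family>-<generation>", e.g. "eddie-0" for the father, "eddie-1" for its children) is simple enough to sketch as a small parser. The struct and helper below are mine, written only to illustrate the convention; they are not taken from geneticd:

```cpp
#include <cstdlib>
#include <string>

// Illustrative parse of a full agent name into its "family" part and
// generation id, per the convention described on the list.
struct AgentName {
    std::string family;
    int generation;
};

// Split the full name at the last '-'; a bare family name is treated as
// generation 0 (the father).
AgentName parseAgentName(const std::string& full) {
    std::string::size_type dash = full.rfind('-');
    AgentName n;
    if (dash == std::string::npos) {   // no dash: bare family name
        n.family = full;
        n.generation = 0;
    } else {
        n.family = full.substr(0, dash);
        n.generation = std::atoi(full.substr(dash + 1).c_str());
    }
    return n;
}
```

Splitting at the last dash rather than the first lets family names themselves contain dashes, which the convention as described does not forbid.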
|
From: Jonny M. <jon...@ni...> - 2001-12-15 14:48:03
|
Every serializable object (the tree below GeneticEngine) now has a new method
called serial_size() that forecasts exactly the amount of space needed to
store the engine, or the component. This figure is displayed between double
quotes (") in the response line of the "save" and "sage" commands; in this
way, clients can know how many bytes will be received. Clients are also
advised that the serialized stream over a tcp/ip connection still terminates
with <CR><LF>.<CR><LF>
This terminator is part of the protocol; clients will have to read and
discard 5 characters more than the data length indicated in the response
line.
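On the client side, the protocol above boils down to two small steps, sketched here under the assumption that the response line looks like +201-Ok, sending "2053" bytes. (the parsing helper is illustrative, not part of geneticd):

```cpp
#include <string>

// Pull the announced byte count out of the double quotes in a "save"/"sage"
// response line, e.g. '+201-Ok, sending "2053" bytes.' -> 2053.
// Returns -1 if the line carries no quoted count.
long parseAnnouncedLength(const std::string& statusLine) {
    std::string::size_type open = statusLine.find('"');
    if (open == std::string::npos) return -1;
    std::string::size_type close = statusLine.find('"', open + 1);
    if (close == std::string::npos) return -1;
    return std::stol(statusLine.substr(open + 1, close - open - 1));
}

// Total bytes the client must consume from the socket for one reply:
// the announced payload plus the 5-byte <CR><LF>.<CR><LF> terminator.
long totalBytesToRead(long announced) {
    return announced + 5;
}
```

So for the 2053-byte example a client would read 2058 bytes in total and discard the last 5.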
|
|
From: Jonny M. <jon...@ni...> - 2001-12-09 22:59:14
|
I added a "config file" class that should read, parse and manage configuration options for geneticd. I also added an admin command, "parm", that lets the superuser see parameters, and change some of them (those marked dynamic, which can be changed after the engine has started). I always thought this would be a COOL feature to add. Now I have to fine-tune it, add more parameters and write a decent config file. But these are minor tasks. |
|
From: Jonny M. <jon...@ni...> - 2001-11-26 23:53:07
|
---------- Messaggio inoltrato ----------
Subject: Re: I became a member of SF.
Date: Tue, 27 Nov 2001 00:11:07 +0100
From: Giancarlo Niccolai <gi...@ni...>
To: "jason" <jay...@ho...>
At 09:06 on Monday, 26 November 2001, you wrote:
> Dear Sir.
>
> Sorry that I am so late.
> My username on SF is "jayjungkr".
>
> Nowadays I am studying how to use CVS.
> I am just wondering if I can ask about that.
>
> Take care.
> Sincerely.
Ok, you're in. If you log in, you will see "geneticd" among your active
projects.
The first thing you should do is check out the genetic daemon tree, so you have a
working directory of your own; you can edit files in that working dir to test
new features or bug patches you add.
The CVS system will take care of syncing your changes with mine (or other
developers').
To check out the CVS tree, you have to install the CVS system on your box
(already there if you have a Linux box), and then enter the following commands:
cd /MY_CPP_DIRECTORY
cvs -d:pserver:ano...@cv...:/cvsroot/geneticd login
cvs -z3 -d:pserver:ano...@cv...:/cvsroot/geneticd \
co geneticdaemon
cvs -z3 -d:pserver:ano...@cv...:/cvsroot/geneticd \
co gd_plugin
If you have already downloaded the .tgz distros, delete them, or CVS won't be
able to detect the changes you make.
You'll have two CVS directories. They contain almost the same files as the
.tgz. Feel free to edit them.
Now, I think the first issue I will focus on is parallel processing
among different servers, and thus process limitation (in both memory and CPU
consumption). I'll add one variable to the genetic_engine class (int max_cpu) that
will be 1/1000 of the CPU timeshare for engine running. I already have a function
that handles this: implementing it in the genetic engine will be easy. And
I'll have to implement a command to assign CPU shares. I will do that.
Now, I would like to discuss with you how to implement population spreading among
different servers on the internet. The mechanism for sending agents around is
already in: the "sage" and "lage" commands (even if they must be tested and
upgraded in some way; I only tested them in GD-to-disk-and-back sessions, not GD
to client or GD to GD). And I have to implement a binary-oriented data
transmission; the one that's in now should work well in a lot of cases, but a
run-length indication in the status response before starting to send binary
data, as in HTTP, is necessary. (It's a minor change; we'll discuss it below.)
I would opt for a master-slave structure, in which a GD (master) splits its
population around to other GDs (slaves). Slaves can get population descriptions
from the master, and they will transfer agents to other GDs on the master's request.
The master-slave structure is (or should be? the point is open) hierarchical:
if AGD sends a pop-slice to BGD, BGD can re-slice its population to a
CGD. BGD will be a slave for AGD, but will be a master for CGD.
The master-slave architecture should be focused on engines, not on the whole
GD. A GD could serve as a slave for some engines, but could be a master for
other engines; there can even be the case where AGD sends a pop-slice to BGD
and BGD has pop-slices running on AGD.
Pop allocation must be decided on resource availability (CPU power and CPU
consumed) and can be fine-tuned with server commands on both the master and
slave sides.
Potential slaves (GDs willing to accept master connections) will signal their
presence and availability to masters using commands such as
"slave <address> <cpu-power> <free-cpu> <free-engine>"
This discussion is wide open, and can be changed.
Another, simpler topic is the length of the binary data to be transmitted:
we have to implement a pre-serialize routine whose task is to calculate the
length of the stream resulting from a serialize() request. We'll put the output
of this function into the status line of geneticd. Example:
sage 0 1 //saving agent 1 of engine 0
+201-Ok, sending "2053" bytes.
asdfakjsdfhajklsdhfajksdfhklajdhflkjasdf
.... 2053 binary data ...
asdfadfasdf
. //<--- we still put the <CR><LF>.<CR><LF> sequence at the end,
// not included in the count above.
This is a simple task. If you want, you can familiarize yourself with CVS and
geneticdaemon by doing this.
A third "urgent" topic is a strong protocol for reply codes. I just used -5xx
for error codes, -501 for command parameter mistakes and +2xx for OK results.
We can change it completely, but BEFORE we start programming clients. You
could rationalize these error codes by putting them in a .h file (#define
GD_ERR_PARAMS -501 etc.) and reprogram the status replies in the corecmd.cpp
file to use them. In the beginning, I did not have an advanced class like
GStream for output, and I did not have the variable-length method
GStream::print. So a reply was just a write(file_descr, buffer,
strlen(buffer))... using something like "%d error: can't do", ERR_CODE, was a
waste of time... now it's necessary. If you want, you can start from here, to
familiarize yourself with the GD command parsing system.
A fourth topic is new engines, to be put in separate plugins.
A fifth topic is about supporting cooperative population types: I just
developed a population that can evolve a "BEST" agent among the others. But
we can have a higher degree of powerful engines if we develop a "colony"-based
genetic system. We'll discuss them later.
Enough for now.
From now on, we'll use the mailing list provided by SourceForge (I'll tell
you when it is ready, which will be SOON), so our discussions will be
recorded, and readable by other developers joining later.
Let me know if you want to take care of 1) the pre_serialize() method and the
'+xxx "xxx" bytes to be sent' replies and/or the #define REPLY codes, so I won't
do that.
Let me know about your ideas for 1) sharing populations and 2) cooperative
engine types.
And finally, let me know if you want to develop, or just have an idea for, new
plugins.
Great to have you in my project,
Giancarlo Niccolai.
-------------------------------------------------------
|