Motivation
We are pulling TM5-MP out of the current "multiple repositories" layout of TM5 and into a "one repository" layout. This has several advantages. It should:
- simplify getting, setting up, modifying, updating and running the model, all of which should lower the entry barrier to TM5
- ease documentation
- clean up the repository and make it much smaller, and above all help keep it clean (no more dead branches littering the repos)
- give more incentive to keep the proj in sync with the base (partial commits become less likely)
- facilitate the switch to mercurial when ready (possibly including the repos history when doing so)
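Old TM5 layout
In the current tree, the base and each of the projects (below proj) have the same structure, with at least a trunk and usually release and/or branches directories. The repetition of this structure throughout the code creates the multiple-repositories approach currently used. There is also an extra layer below chem, which acts like a super project with several subprojects (base, m7, ...).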
New layout
I propose the following layout for the standalone TM5-MP:
TM5-MP
|---bin            # Scripts
|---rc             # Config files
|---base           # Base code (transport only) with 5 dummy tracers
|---levels         # Levels definition
|---proj           # Code that extends the functionality of the base:
|   |---one        # one inert tracer
|   |---chem       # lots of tracers, with full chemistry
|   |---ecearth3   # coupling and meteo for EC-Earth
|   |---output     # extra output
|   |---budget10   # budgets into 10-degree-wide zonal bands
|   `---...
`---tools          # extra scripts not strictly required by TM5
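To try the proposed tree out locally before anything is committed, the skeleton can be created with a few lines of Python. This is only a sketch: the directory names are copied from the tree above, the "..." entry is left out, and the function name `make_skeleton` is made up for this illustration.

```python
from pathlib import Path

# Directory names copied from the proposed tree; the "..." entry is omitted.
TOP_LEVEL = ["bin", "rc", "base", "levels", "proj", "tools"]
PROJECTS = ["one", "chem", "ecearth3", "output", "budget10"]


def make_skeleton(root):
    """Create the empty directory skeleton of the proposed layout under root.

    Returns the sorted list of created directories, relative to root.
    (Hypothetical helper, not part of TM5 or pycasso.)
    """
    root = Path(root)
    for name in TOP_LEVEL:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in PROJECTS:
        (root / "proj" / name).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )
```

Calling `make_skeleton("TM5-MP")` in a scratch directory gives an empty tree to experiment with.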
First of all, this stays familiar enough with the previous layout not to put people off. What is important is that there is:
- no more trunk, branch, release below the base and proj dirs, only Fortran source code, and
- no more src, bin, py, rc below each code subdir: one bin and one rc at the root of the tree are enough; src becomes moot, and py is obsolete (pre-pycasso code)
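These two rules are easy to check mechanically. The following is a minimal sketch of such a check, assuming the tree is available on disk; `find_violations` is a made-up name, and adding "branches" to the banned names is an assumption based on the directory names used in the old layout.

```python
from pathlib import Path

# Directory names that should no longer appear below base/ and proj/.
# "branches" is an assumption: the rules name "branch", but the old layout
# uses "branches" directories.
FORBIDDEN = {"trunk", "branch", "branches", "release", "src", "bin", "py", "rc"}


def find_violations(root):
    """Return forbidden directories found anywhere below root/base and root/proj.

    Paths are returned relative to root, sorted. The top-level bin/ and rc/
    are never reported, since only base/ and proj/ are scanned.
    (Hypothetical helper, not part of TM5 or pycasso.)
    """
    root = Path(root)
    violations = []
    for top in ("base", "proj"):
        start = root / top
        if not start.is_dir():
            continue
        for path in start.rglob("*"):
            if path.is_dir() and path.name in FORBIDDEN:
                violations.append(path.relative_to(root).as_posix())
    return sorted(violations)
```

An empty result means the tree follows both rules; anything else lists the leftover directories to clean up.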
Note also that:
- TM and proj code that is not TM5-MP compatible will not be carried over.
- the levels are treated like the base, since they are required.
In the subversion repository, this whole tree will be the trunk. If ever needed, branches and tags (the latter in place of release, to follow the standard layout that hg could import) would mirror it.
Other considerations
- Output: we probably have to put more thought into output later on, when porting code to MP. There are incompatibilities between the extra output and some proj. For example, some output (pdump) is now so specific to the chemistry that it has to live in proj/chem, but it needs a dummy in the output project. We will probably have separate output proj, like the current sounding. That is for another discussion.
- rc files: these could probably be tidied up. We have to consider how the files are used: some are constantly edited and should have templates (chem.input.rc, main rc), while others are edited only once in a while (machine/compiler/meteo/expert). Put the latter in an "include" subdir, as suggested by Sourish?
- Scripts: we should not change the way the model interfaces with pycasso. That is, we keep linking (and not copying) the setup script to the top of the tree. But the symlink (there is only one possibility now) should already be in the repository. The second script needed (rc.py) should also be a softlink that is already present in the repository. We need to merge the two versions that are around: one in tools and one in bin.
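As an illustration of the last point, the links could be created once and then committed. The sketch below makes a relative symlink, so that it stays valid wherever the tree is checked out. The helper `link_script` and the file name `setup_tm5` are hypothetical; only `rc.py` is named in the text, and placing the merged version in bin is an assumption.

```python
import os
from pathlib import Path


def link_script(root, target, link_name):
    """Create a relative symlink root/link_name pointing at root/target.

    Does nothing if something already exists at the link location.
    (Hypothetical helper, not part of TM5 or pycasso.)
    """
    root = Path(root)
    link = root / link_name
    if link.is_symlink() or link.exists():
        return link
    # Store the target relative to the link's directory, not as an absolute path.
    rel_target = os.path.relpath(root / target, start=link.parent)
    link.symlink_to(rel_target)
    return link


if __name__ == "__main__":
    # Both names below are placeholders for the real script locations.
    link_script(".", "bin/setup_tm5", "setup_tm5")
    link_script(".", "bin/rc.py", "rc.py")
```

Because existing paths are left alone, running the script again on a tree that already contains the links changes nothing.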
In the new scheme, where would you like users to put their tracer-specific projects? For example, Sander could have a 'CO2' project, and I could have a 'CO2' project, and the two would not be identical. According to the present structure, /proj/chem would be a good place, but then we would have conflicting /proj/chem/CO2 projects. Should we allow /proj/user/Sander/ and /proj/user/Sourish/? The danger is that we would then have a jumble of folders in /proj/user instead of base. A cleaner alternative is to have one single /proj/chem/CO2, with Sander and me on different mercurial branches, so there will be no conflict. I like this idea since it utilizes the full power of hg.
Both the 4DVAR and EnKF setups use a whole bunch of python scripts apart from pycasso. So I would vote for keeping the 'py' folders.
The 4DVAR base can never be the same as the forward model base, even if we were to use the same forward model version. Should we then have two top level directories, say TM5-MP and TM5-MP-4DVAR, under which will then be base, rc, etc?
If the 'base' folder only contains 'src', then can we do away with 'base' altogether, and simply have
TM5-MP
|
|--src
|--rc
...
etc?
Perhaps it's a good idea to have a /proj/user folder anyway. However, each user can commit code from /proj/user/$USER only to their own hg branch, so that a new user checking out the code does not get those folders in their default branch.
Should we have a 'stable' and a 'devel' branch in hg?