From: Sam H. <sh...@ma...> - 2005-01-10 17:16:31
|
On Jan 10, 2005, at 12:08 PM, Michael Gage wrote:

> We've been considering this, but haven't made any moves yet -- mostly
> because we've had other things to do. Sam, John, do you have any
> comments?

Arch is more ambitious and has a greater "cool" factor, but I think
Subversion is the way to go right now, especially since it's possible
to convert a repository from CVS to Subversion:
<http://svnbook.red-bean.com/en/1.0/apas11.html>.
-sam

> On Jan 10, 2005, at 11:52 AM, William Ziemer wrote:
>
>> Most of the projects I link with are going to subversion:
>> http://subversion.tigris.org/
>> Maybe use this for the problem database?
>> Good to see you all again,
>> Bill |
From: John J. <jj...@as...> - 2005-01-10 20:47:26
|
First, I am copying Bill and Jeff since I don't know if they get openwebwork-devel e-mail. Also, since Jeff (and some openwebwork-devel readers) may not know what we are talking about, here is the plan.

For the .pg files of the National Problem Library (perhaps to be renamed the National Problem Database), we will start using a cvs-like system. There will be two repositories, or maybe two directories of one repository. The basic distinction is tagged vs. non-tagged problems. Problems start on the non-tagged side, basically however we find them. Once this thing is initialized, I guess we can start filling that up with lots of pg files. When a problem gets tagged, it is moved to the tagged side, which will be organized to mirror the hierarchical topic structure of the database.

We may not be able to "polish" every problem, but as that is done, it simply gives an updated version of the problem on the tagged side. The setup as described above basically gives up on the notion of systematically polishing problems. If we want to keep that alive, we should have 3 basic sub-divisions (raw, tagged, and tagged-and-polished). Actually, this 3-part version might be a good way to go.

Thinking of the operations we will need in a cvs-like system:

* add directories and files - I would hope all systems are good at this
* move a file - cvs may be weaker here since it loses version history when you move a file, but maybe we don't really care. We don't expect much revision to take place before tagging.
* look at recent updates, and maybe the diffs. This would be important as new versions are committed to the tagged files (e.g., to be sure no one deleted the tags, or to see if a new version should be forked instead of becoming a new version of the same problem). I don't know which system is better here. It probably hinges on how useful the status commands are for pulling information (since I wouldn't want to browse through thousands of files looking for recent changes). 
I don't know enough about the different systems to know which will be better for us on these things.

John

Sam Hathaway wrote:

> On Jan 10, 2005, at 12:08 PM, Michael Gage wrote:
>
>> We've been considering this, but haven't made any moves yet -- mostly
>> because we've had other things to do. Sam, John, do you have any
>> comments?
>
> Arch is more ambitious and has a greater "cool" factor, but I think
> Subversion is the way to go right now, especially since it's possible
> to convert a repository from CVS to Subversion:
> <http://svnbook.red-bean.com/en/1.0/apas11.html>.
> -sam
>
>> On Jan 10, 2005, at 11:52 AM, William Ziemer wrote:
>>
>>> Most of the projects I link with are going to subversion:
>>> http://subversion.tigris.org/
>>> Maybe use this for the problem database?
>>> Good to see you all again,
>>> Bill |
From: Michael G. <ga...@ma...> - 2005-01-10 21:13:52
|
> Problems start in the non-tagged side, basically however we find
> them. Once this thing is initialized, I guess we can start filling
> that up with lots of pg files. When it gets tagged, then it is moved
> to the tagged-side, which will be organized to mirror the hierarchical
> topic structure of the database.
>
> We may not be able to "polish" every problem, but as that is done, it
> simply gives an updated version of the problem on the tagged side.
> The setup as described above basically gives up on the notion of
> systematically polishing problems. If we want to keep that alive, we
> should have 3 basic sub-divisions (raw, tagged, and
> tagged-and-polished). Actually, this 3-part version might be a good
> way to go.

I like the 3 part version. Possibly even a 4th part for problems which
can be used as models for future problems (exhibiting best practices,
etc.). This fourth part could be fairly small however, and may not need
to be a CVS.

Take care,

Mike |
From: Sam H. <sh...@ma...> - 2005-01-10 21:29:32
|
On Jan 10, 2005, at 4:13 PM, Michael Gage wrote:

>> Problems start in the non-tagged side, basically however we find
>> them. Once this thing is initialized, I guess we can start filling
>> that up with lots of pg files. When it gets tagged, then it is moved
>> to the tagged-side, which will be organized to mirror the
>> hierarchical topic structure of the database.
>>
>> We may not be able to "polish" every problem, but as that is done,
>> it simply gives an updated version of the problem on the tagged
>> side. The setup as described above basically gives up on the notion
>> of systematically polishing problems. If we want to keep that alive,
>> we should have 3 basic sub-divisions (raw, tagged, and
>> tagged-and-polished). Actually, this 3-part version might be a good
>> way to go.
>
> I like the 3 part version. Possibly even a 4th part for problems
> which can be used as models for future problems (exhibiting best
> practices, etc.). This fourth part could be fairly small however,
> and may not need to be a CVS.

Can anyone give me more details on how the repository of problem
sources and the "database" will interact? Based on what little I know,
it seems to me that the problem source should be part of the problem's
database record.

By the way, has anyone thought about how problems will be packaged?
Many problems consist of more than one file, and it might be worth
laying out a packaging format so that a problem and all of its
auxiliary files and metadata can be distributed as a single file.
-sam |
From: John J. <jj...@as...> - 2005-01-10 22:47:30
|
Sam Hathaway wrote:

> On Jan 10, 2005, at 4:13 PM, Michael Gage wrote:
>
>>> Problems start in the non-tagged side, basically however we find
>>> them. Once this thing is initialized, I guess we can start filling
>>> that up with lots of pg files. When it gets tagged, then it is
>>> moved to the tagged-side, which will be organized to mirror the
>>> hierarchical topic structure of the database.
>>>
>>> We may not be able to "polish" every problem, but as that is done,
>>> it simply gives an updated version of the problem on the tagged
>>> side. The setup as described above basically gives up on the notion
>>> of systematically polishing problems. If we want to keep that
>>> alive, we should have 3 basic sub-divisions (raw, tagged, and
>>> tagged-and-polished). Actually, this 3-part version might be a good
>>> way to go.
>>
>> I like the 3 part version. Possibly even a 4th part for problems
>> which can be used as models for future problems (exhibiting best
>> practices, etc.). This fourth part could be fairly small however,
>> and may not need to be a CVS.
>
> Can anyone give me more details on how the repository of problem
> sources and the "database" will interact? Based on what little I
> know, it seems to me that the problem source should be part of the
> problem's database record.

In a sense, things are reversed. The problem files initially contain
all of the information; the extra information comes in the form of
special comments in those files. We then have a script to set up a
MySQL database and to extract the information from the files and load
it into the database. I like this approach since it is easy to reload
the database if something goes wrong, and we are shipping mainly flat
text files (except for the images).

> By the way, has anyone thought about how problems will be packaged? 
> Many problems consist of more than one file and it might be worth
> laying out a packaging format, so that a problem and all of its
> auxiliary files and metadata can be distributed as a single file.

I hadn't thought of the extra files. Thus far, the problems were
basically not packaged in any special way.

The distribution method I had in mind was that webwork would handle it
behind the scenes. It would fetch files over http from Perl (I think
the Perl module is LWP, or something like that). The entry point would
be an extra tab in the admin course (along with add course, ..., and
then Problem Database). If you ask it to update your Problem Library
Database, then it fetches the current list of files/versions via http,
checks it against your current list, and gets whatever is new and
reloads the database. My guess is that this is how the Perl cpan
module works, and it is how the xemacs package system works.

Since knowing which files need updating keys off of version numbers,
we may have to keep those as part of the files' metadata. If we use
cvs for the files, we can just use the cvs revision number. If we use
subversion, then there is one revision number for the whole repository,
so every change would make it look like all of the files need
updating; that would be a case where a problem's version number would
have to be kept separately.

This approach should still be ok with extra associated files. They
are listed in the manifest along with the problem files. So, if you
don't have one at the time of an update, it will be fetched for you.

John |
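[A sketch of the loading step John describes above: set up a database, then pull the special comments out of each .pg file. This is only an illustration in Python (WeBWorK itself is Perl), and the `## Name(value)` comment format and the sample tag names are assumptions, not the library's settled conventions.]

```python
import re

# Assumed special-comment format: lines like "## DBsubject('Calculus')".
# The tag names here are hypothetical examples, not a fixed vocabulary.
TAG_RE = re.compile(r"^##\s*(\w+)\((.*)\)\s*$")

def extract_tags(pg_source):
    """Collect '## Name(value)' special-comment lines from a .pg file's text."""
    tags = {}
    for line in pg_source.splitlines():
        m = TAG_RE.match(line.strip())
        if m:
            # Strip surrounding quotes/spaces from the value.
            tags[m.group(1)] = m.group(2).strip("'\" ")
    return tags

sample = "## DBsubject('Calculus')\n## DBchapter('Limits')\nDOCUMENT();\n"
print(extract_tags(sample))  # {'DBsubject': 'Calculus', 'DBchapter': 'Limits'}
```

[A companion script would insert the returned dictionary into the MySQL index; reloading the database after a problem is just rerunning the extraction over the file tree, which is what makes the flat-file approach easy to recover.]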
From: Sam H. <sh...@ma...> - 2005-01-14 04:55:43
|
On Jan 10, 2005, at 5:47 PM, John Jones wrote:

> Sam Hathaway wrote:
>
>> On Jan 10, 2005, at 4:13 PM, Michael Gage wrote:
>>
>>>> Problems start in the non-tagged side, basically however we find
>>>> them. Once this thing is initialized, I guess we can start filling
>>>> that up with lots of pg files. When it gets tagged, then it is
>>>> moved to the tagged-side, which will be organized to mirror the
>>>> hierarchical topic structure of the database.
>>>>
>>>> We may not be able to "polish" every problem, but as that is done,
>>>> it simply gives an updated version of the problem on the tagged
>>>> side. The setup as described above basically gives up on the
>>>> notion of systematically polishing problems. If we want to keep
>>>> that alive, we should have 3 basic sub-divisions (raw, tagged, and
>>>> tagged-and-polished). Actually, this 3-part version might be a
>>>> good way to go.
>>>
>>> I like the 3 part version. Possibly even a 4th part for problems
>>> which can be used as models for future problems (exhibiting best
>>> practices, etc.). This fourth part could be fairly small however,
>>> and may not need to be a CVS.
>>
>> Can anyone give me more details on how the repository of problem
>> sources and the "database" will interact? Based on what little I
>> know, it seems to me that the problem source should be part of the
>> problem's database record.
>
> In a sense, things are reversed. The problem files initially contain
> all of the information; the extra information comes in the form of
> special comments in those files. We then have a script to set up a
> mysql database, and to extract the information from the files and
> load it into the database.

Would it be fair to say that the MySQL database does nothing more than
act as an index on the metadata associated with each problem? Or am I
missing something? 
> I like this approach since it is easy to reload the database if
> something goes wrong, and we are shipping mainly flat text files
> (except for the images).

I like the simplicity of this, and in a distributed system like this
the more we can do with a version control system, the better.

>> By the way, has anyone thought about how problems will be packaged?
>> Many problems consist of more than one file and it might be worth
>> laying out a packaging format, so that a problem and all of its
>> auxiliary files and metadata can be distributed as a single file.
>
> I hadn't thought of the extra files. Thus far, the problems were
> basically not packaged in any special way.
>
> The distribution method I had in mind was that webwork would handle
> it behind the scenes. It would fetch files over http from perl (I
> think the perl module is LWP, or something like that). The entry
> point would be an extra tab in the admin course (along with add
> course, ..., and then Problem Database). If you ask it to update
> your Problem Library Database, then it fetches the current list of
> files/versions via http, checks it against your current list, and
> gets whatever is new and reloads the database.

Shouldn't we leverage the version control system's checkout features to
fetch and update problem libraries? It seems like a waste to keep the
problems in CVS (or Subversion) and then ignore the versioning features
of that system, track versions separately, and fetch via HTTP.

> My guess is that this is how the perl cpan module works, and it is
> how the xemacs package system works.

By the way, CPAN modules are packaged in "distributions": tarballs
which have a predictable naming scheme and layout and a standard way to
build and install them.

> Since knowing which files need updating keys off of version numbers,
> we may have to keep those as part of the files' metadata. 
Would that still be a problem if you were to keep the local copy of the
problem database as a checked-out CVS (or Subversion) working copy?

> If we use cvs for the files, we can just use the cvs revision number.
> If we use subversion, then there is one number for all files, so
> every change would make it look like all of the files need updating,
> so that would be a case where a problem's version number would have
> to be kept.

I don't really know, but I would expect Subversion to provide some way
of identifying a version of a particular file. I know that it has the
concept of a changeset, and that might be more like what you want
anyway. Each changeset would encompass a small set of files, usually a
single problem file but sometimes a problem and its auxiliary files.

> This approach should still be ok with extra associated files. They
> are listed in the manifest along with the problem files. So, if you
> don't have one at the time of an update, it will be fetched for you.

What is the manifest? I don't think you'd need any such thing if you
were to use a version control system to track files.

Thanks for explaining this all to me. If you get sick of it, just let
me know. I always have opinions about things that aren't really my
business, but if you'd like to be left alone, say the word. :)
-sam |
From: John J. <jj...@as...> - 2005-01-14 16:00:56
|
Sam Hathaway wrote:

> On Jan 10, 2005, at 5:47 PM, John Jones wrote:
>
>> Sam Hathaway wrote:
>>
>>> On Jan 10, 2005, at 4:13 PM, Michael Gage wrote:
>>>
>>> Can anyone give me more details on how the repository of problem
>>> sources and the "database" will interact? Based on what little I
>>> know, it seems to me that the problem source should be part of the
>>> problem's database record.
>>
>> In a sense, things are reversed. The problem files initially contain
>> all of the information; the extra information comes in the form of
>> special comments in those files. We then have a script to set up a
>> mysql database, and to extract the information from the files and
>> load it into the database.
>
> Would it be fair to say that the MySQL database does nothing more
> than act as an index on the metadata associated with each problem? Or
> am I missing something?

I think that is a reasonable description.

>> I like this approach since it is easy to reload the database if
>> something goes wrong, and we are shipping mainly flat text files
>> (except for the images).
>
> I like the simplicity of this, and in a distributed system like this
> the more we can do with a version control system the better.
>
>>> By the way, has anyone thought about how problems will be packaged?
>>> Many problems consist of more than one file and it might be worth
>>> laying out a packaging format, so that a problem and all of its
>>> auxiliary files and metadata can be distributed as a single file.
>>
>> I hadn't thought of the extra files. Thus far, the problems were
>> basically not packaged in any special way.
>>
>> The distribution method I had in mind was that webwork would handle
>> it behind the scenes. It would fetch files over http from perl (I
>> think the perl module is LWP, or something like that). The entry
>> point would be an extra tab in the admin course (along with add
>> course, ..., and then Problem Database). 
>> If you ask it to update your Problem Library Database, then it
>> fetches the current list of files/versions via http, checks it
>> against your current list, and gets whatever is new and reloads the
>> database.
>
> Shouldn't we leverage the version control system checkout features to
> fetch and update problem libraries? It seems like a waste to keep the
> problems in CVS (or Subversion) and then ignore the versioning
> features of that system and track versions separately and fetch via
> HTTP.

At one time, there were two reasons for this. Maybe neither is
compelling.

The first was that I thought people might not want the most current
version of some problems. This would introduce various complications.
After talking with Bill about this in Atlanta, we decided against this.
Along with that, if someone modifies a problem and submits the change
and it isn't a strict improvement, then we would fork the problem
rather than replace the original. Anyway, if people might prefer older
versions of problems, we would not want the equivalent of "cvs up".

The second reason was that I knew I could get Perl to do http requests.
If someone was missing the needed Perl module, it would just become one
more cpan module to fetch. If webwork is going to access the cvs
command on their system, it may be more inconvenient at installation
time.

Now that I think about it, the first reason is now out. The second is
connected to what system we choose. If webwork is keeping track of the
versions and just getting files as needed (i.e., doing some functions
cvs can do), then it doesn't matter which system is storing files at
Problem Library Central. If sites get new versions of problems by the
equivalent of "cvs up" and we are using subversion, presumably they
need to have subversion installed.

What I think I have just talked myself into is:

* if webwork does some of the cvs work and uses http to get files, we
have more flexibility in how problems are maintained at the repository. 
* if we have files transmitted by cvs, cvs will do some of the work for
us. But then we either use cvs itself, or increase the installation
hassle of webwork (the latter is something I would not want to do).

I just thought of another complication with using a cvs-like system for
fetching files. Part of the repository structure currently envisioned
is that there will be several directories with the same internal
structure, and the process of downloading the problem library should
amount to taking their union. If we polish a problem, then it will
move in the original repository, but its location on the individual
sites' machines should not change; just its version number increases.
It is not insurmountable, but it is a complication.

>> My guess is that this is how the perl cpan module works, and it is
>> how the xemacs package system works.
>
> By the way, CPAN modules are packaged in "distributions", tarballs
> which have a predictable naming scheme and layout and a standard way
> to build and install them.

There are lots of aspects to the cpan process. I was only thinking of
the part where it first seems to fetch a file which gives the modules
available from a site and their current versions.

>> Since knowing which files need updating keys off of version numbers,
>> we may have to keep those as part of the files' metadata.
>
> Would that still be a problem if you were to keep the local copy of
> the problem database as a checked-out CVS (or Subversion) working
> copy?

No, then it shouldn't be a problem. If the individual sites access the
library via a cvs-like system, then all version control (e.g. the
manifest mentioned below) would be handled by cvs.

>> This approach should still be ok with extra associated files. They
>> are listed in the manifest along with the problem files. So, if you
>> don't have one at the time of an update, it will be fetched for you.
>
> What is the manifest? 
> I don't think you'd need any such thing if you were to use a version
> control system to track files.

The manifest would be a list of files and version information. It can
also contain dependency information if we choose. The main role would
be to do simple version tracking, so when updating you only get the new
and updated files.

> Thanks for explaining this all to me. If you get sick of it, just
> let me know. I always have opinions about things that aren't really
> my business, but if you'd like to be left alone, say the word. :)

It is good to have some discussion before moving farther forward.

John |
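[A sketch of the manifest comparison John describes above: a list of files with version information, so an update fetches only new and changed files. This is a hedged illustration in Python (the real tooling would presumably be Perl), and the one-entry-per-line "version, then path" layout is an invented format, not a settled one.]

```python
def parse_manifest(text):
    """Parse a hypothetical 'version <path>' manifest into {path: version}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and comment lines
        version, path = line.split(None, 1)
        entries[path.strip()] = version
    return entries

def files_to_fetch(remote_text, local_text):
    """Paths that are new upstream or carry a different version number."""
    remote = parse_manifest(remote_text)
    local = parse_manifest(local_text)
    return sorted(p for p, v in remote.items() if local.get(p) != v)

remote = "1.2\tAlgebra/prob1.pg\n1.1\tCalculus/prob2.pg\n"
local = "1.1\tAlgebra/prob1.pg\n1.1\tCalculus/prob2.pg\n"
print(files_to_fetch(remote, local))  # ['Algebra/prob1.pg']
```

[An updater would fetch the remote manifest over http (LWP on the Perl side), run this comparison, download the listed files, and reload the database. Extra associated files listed in the manifest fall out of the same comparison for free.]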
From: Michael G. <ga...@ma...> - 2005-01-10 23:49:33
|
On Jan 10, 2005, at 4:29 PM, Sam Hathaway wrote:

> By the way, has anyone thought about how problems will be packaged?
> Many problems consist of more than one file and it might be worth
> laying out a packaging format, so that a problem and all of its
> auxiliary files and metadata can be distributed as a single file.

This is a really good thing to look at. I still regret that we
couldn't use the "resource fork" idea of Mac file systems, so that a
problem and all of its pictures stay together. Are there equivalent
schemes used on unix and windows? What is being done on the Mac these
days?

Our current system, which works but is a bit fragile, is to use the
same name for the directory and the .pg file when packaging a file with
its accompanying pictures, applets, or whatever: e.g., prob1/prob1.pg
accompanied by prob1/picture1.png, prob1/picture2.png, etc. We usually
place applets in a common area so that they can be used in multiple
places, but that also means that problems using applets frequently
don't work when you are first setting up a course. You have to tweak
the addresses in the problems and/or find the applets and install them
in the right place.

Any ideas for making this situation more robust and straightforward?

Take care,

Mike |
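[The prob1/prob1.pg convention described above already groups a problem with its pictures, so a single-file distribution format could be as simple as archiving the whole problem directory. A Python sketch, under the assumption that a gzipped tarball is an acceptable container; the file names are made up for the example.]

```python
import os
import tarfile
import tempfile

def package_problem(problem_dir, out_path):
    """Bundle a problem directory and its auxiliary files into one archive."""
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname keeps only the top-level directory name, so the archive
        # unpacks as prob1/prob1.pg, prob1/picture1.png, ...
        tar.add(problem_dir, arcname=os.path.basename(problem_dir))
    return out_path

# Build a throwaway prob1/ directory to show the round trip.
root = tempfile.mkdtemp()
pdir = os.path.join(root, "prob1")
os.mkdir(pdir)
for name in ("prob1.pg", "picture1.png"):
    open(os.path.join(pdir, name), "w").close()

archive = package_problem(pdir, os.path.join(root, "prob1.tgz"))
with tarfile.open(archive) as tar:
    print(sorted(tar.getnames()))
    # ['prob1', 'prob1/picture1.png', 'prob1/prob1.pg']
```

[Because the .pg file travels inside its own directory, unpacking the archive can never separate a problem from its pictures, which addresses the fragility Mike mentions.]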
From: Sam H. <sh...@ma...> - 2005-01-14 06:19:24
|
On Jan 10, 2005, at 6:48 PM, Michael Gage wrote:

> On Jan 10, 2005, at 4:29 PM, Sam Hathaway wrote:
>
>> By the way, has anyone thought about how problems will be packaged?
>> Many problems consist of more than one file and it might be worth
>> laying out a packaging format, so that a problem and all of its
>> auxiliary files and metadata can be distributed as a single file.
>
> This is a really good thing to look at. I still regret that we
> couldn't use the "resource fork" idea of mac file systems, so that a
> problem and all of its pictures stay together. Are there equivalent
> schemes used on unix and windows? What is being done on the mac
> these days?

For distribution packaging there are many archive formats -- tar, zip,
pkg, etc. -- with various features. For installed files, though, there
isn't much. Win32 can embed certain resources (icons, for example) in
executables and libraries. In UNIX, there's never been much of an
attempt to do this. On OS X, of course, there are bundles.

> Our current system, which works, but is a bit fragile, is to use the
> same name for the directory and the .pg file when packaging a file
> with its accompanying pictures, applets or whatever. e.g.
> prob1/prob1.pg accompanied by prob1/picture1.png, prob1/picture2.png,
> etc. We usually place applets in a common area so that they can be
> used in multiple places, but that also means that frequently problems
> using applets don't work when you are first setting up a course. You
> have to tweak the addresses in the problems and/or find the applets
> and install them in the right place.
>
> Any ideas for making this situation more robust and straightforward?

A simple tweak would be to put the problem file in the directory with
the auxiliary files. This sits better with me than having the problem
file and the directory of auxiliary files side by side, since it would
be harder for a problem to be separated from its auxiliary files. 
These directories could be tarred up for distribution if need be.

You mentioned the applets, and I'm thinking that it might be worth
talking about dependency information. Perhaps a piece of metadata
about a problem should be a list of other resources on which it depends
(maybe along with version ranges, if needed).

The more I think about it, the more the needs of the problem database
converge with the needs of operating system distribution packagers.
For example, Debian packages consist of an archive containing the
package files and several well-defined metadata files that give
information about the package and its dependencies, how to build it,
etc. To build an archive, the metadata in each package is assembled
into a Packages.gz file that can be read by the APT system to build a
dependency graph, etc. Of course, this is overkill, but what's
currently proposed seems essentially like a simpler version of this,
if you view each problem source file as a package and the MySQL
database as the Packages.gz file. It might be worth formalizing ways
of tracking dependencies for situations where a problem depends on
some external resource like an applet. On the other hand, it might not
be worth it.

By the way, all involved with the problem database project are free to
use the WeBWorK Wiki to share ideas and write up proposals:
<http://devel.webwork.rochester.edu/>.
-sam |
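[If per-problem dependency metadata were added along the lines suggested above, working out everything a problem needs is a small graph walk over the index, much as APT walks Packages.gz. An illustrative Python sketch; the record shape and the "graphApplet" resource name are invented for the example.]

```python
# Hypothetical index records; "depends" names shared resources such as
# applets that live outside the problem's own directory.
problems = {
    "prob1.pg": {"depends": ["graphApplet"]},
    "prob2.pg": {"depends": []},
    "graphApplet": {"depends": []},
}

def closure(name, index, seen=None):
    """Return `name` plus every resource it transitively depends on."""
    if seen is None:
        seen = set()
    if name in seen:
        return seen  # already visited; also guards against cycles
    seen.add(name)
    for dep in index[name].get("depends", []):
        closure(dep, index, seen)
    return seen

print(sorted(closure("prob1.pg", problems)))  # ['graphApplet', 'prob1.pg']
```

[An installer that fetched the whole closure would never deliver a problem without its applet, which is exactly the setup-time breakage Mike described.]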
From: William Z. <wz...@cs...> - 2005-01-25 21:52:21
|
Just found time to look over arch. It is very cool! Perhaps more
importantly for me, I have never understood cvs completely, but arch
makes perfect sense.

On Jan 10, 2005, at 9:16 AM, Sam Hathaway wrote:

> On Jan 10, 2005, at 12:08 PM, Michael Gage wrote:
>
>> We've been considering this, but haven't made any moves yet --
>> mostly because we've had other things to do. Sam, John, do you have
>> any comments?
>
> Arch is more ambitious and has a greater "cool" factor, but I think
> Subversion is the way to go right now, especially since it's possible
> to convert a repository from CVS to Subversion:
> <http://svnbook.red-bean.com/en/1.0/apas11.html>.
> -sam
>
>> On Jan 10, 2005, at 11:52 AM, William Ziemer wrote:
>>
>>> Most of the projects I link with are going to subversion:
>>> http://subversion.tigris.org/
>>> Maybe use this for the problem database?
>>> Good to see you all again,
>>> Bill

--
William Ziemer
Mathematics Department
CSULB |
From: John J. <jj...@as...> - 2005-01-26 03:07:52
|
William Ziemer wrote:

> Just found time to look over arch. It is very cool! Perhaps more
> importantly for me, I have never understood cvs completely, but arch
> makes perfect sense.

I set it up once and got the basic thing to compile, but failed to get
it to compile with the security settings turned on. It was
increasingly frustrating that it loves weird characters in filenames
(weird meaning that they needed quoting when issuing shell commands).
Have you tried it?

John

> On Jan 10, 2005, at 9:16 AM, Sam Hathaway wrote:
>
>> On Jan 10, 2005, at 12:08 PM, Michael Gage wrote:
>>
>>> We've been considering this, but haven't made any moves yet --
>>> mostly because we've had other things to do. Sam, John, do you
>>> have any comments?
>>
>> Arch is more ambitious and has a greater "cool" factor, but I think
>> Subversion is the way to go right now, especially since it's
>> possible to convert a repository from CVS to Subversion:
>> <http://svnbook.red-bean.com/en/1.0/apas11.html>.
>> -sam
>>
>>> On Jan 10, 2005, at 11:52 AM, William Ziemer wrote:
>>>
>>>> Most of the projects I link with are going to subversion:
>>>> http://subversion.tigris.org/
>>>> Maybe use this for the problem database?
>>>> Good to see you all again,
>>>> Bill
>
> --
> William Ziemer
> Mathematics Department
> CSULB |