[exprla-devel] [xpl] Variable = Container = Element ... = Directory ?
From: reid_spencer <ras...@re...> - 2002-01-31 08:16:03
--- In xpl-dev@y..., Jonathan Burns <saski@w...> wrote:

Robert A Hopkins wrote:

> Jonathan (11 May):
> "Would you like to see
> the addition example re-expressed using simplifications like that?"
>
> I think the more we express & re-express this language that doesn't
> exist yet, the more fluent we will become. (What does "<stdout>" mean?
> And "<stdin>"?) (Jonathan, is there a reason you're using the form
> "<comment></comment>" rather than "<!--comment here...-->"?)

Righty.

I used <comment>...</comment> because it was in Mike's tags.
(Mike, I want to see them annotated with brief descriptions of their
intended functionality. Most I can guess, but "slice" and "splice",
to mention a couple, leave me guessing.)

"stdin" and "stdout" are standard terms meaning "where the characters
for this program are coming from", and "where the characters this
program produces are going to". They're short for "standard input" and
"standard output".

Formally, they are "filehandles". In Unix-speak, that means references
to objects you can read from and write to. Open files have filehandles,
and so do standard character input and output.

I thought everybody in the whole world knew that. I do suggest you look
at a couple of tutorials in any of: (a) Perl, (b) Unix scripts, (c) C.
You'll see "stdin" everywhere.

<soapbox>
If and when this list expands, newcomers will bring a combination of
programming lore and Web lore to the table, and they'll expect to find
the same. We need to be somewhat hip.

I'm assimilating HTML and CGI conventions as fast as I can. But ideally,
I'd want everyone here to have a rough overview of a variety of
programming languages. And up to a point, I can provide just that.

Just keep sending 'em in, Aly, and I'll keep knocking 'em back.
</soapbox>

> Jonathan:
> <if>
>    <false>
>       <verify>
>          mylist
>       </verify>
>    </false>
>
>    abort <comment> Handy things, macros. </comment>
> </if>
>
> I like this.
> I think if I would argue for anything it would be for
> rigorously applying one of the fundamental concepts of xml, and that
> is of containment. Elements are containers into which you stick things.

Damn straight. Thank you.

> <comment>
>    Here we define a program variable to contain
>    a list of integers.
> </comment>
>
> <variable type=intList>
>    <name>
>       mylist
>    </name>
> </variable>
>
> <comment>
>
> Somehow I'm not comfortable with the idea of a container called
> "variable". How about "<xpl:variable-type name=intList>"
>

As usual, you highlight the issue brilliantly, but this time you're
wrong. Pay close attention...

In the context of XPL, I want to say:

A variable is a minimal document.

It possesses the following attributes:

    A type.
    A name.
    Permissions: to read, to write, maybe to refresh.
    [A link to] something to call to recalculate its value.
    Its current value, in text form.
    [A link to] the DTD element definition for its type.
    The document which owns it - probably its parent.
    Timestamps, for date created, and date last modified.

I'll take in a couple of general objections first:

(1) It's a lot of content to impose on every little variable. Does this
mean that we'll be transmitting massive quantities of information
across the Net, at 56Kbd? If so, forget it.

Reply: What we will transmit are XML documents - marked up text.
Typically, we can expect to transmit, ONCE, an XPL program which
contains a lot of variable-definitions, together with all the
attributes; and MANY TIMES, the content (what I call "current value")
of certain variables, in text form, bracketed by start and end tags
giving their types.

This may be just what the end-user requires. However the middle-user,
i.e. the XPL program, will take the transmitted data, and install it in
a fully-attributed document. Probably by changing one string in it, and
updating a timestamp.

In brief, the transmission load will be typical of XML and of CGI-style
dialogs.
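
A minimal sketch of the "variable is a minimal document" idea, written in Python because XPL itself does not exist yet. The class name `XplVariable`, its field names, and the methods `full_document`, `wire_value` and `install` are assumptions made for illustration; the two "[A link to]" attributes are left out. It shows the difference between the fully-attributed form (sent once) and the bare typed value (sent many times).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from xml.sax.saxutils import escape, quoteattr


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class XplVariable:
    """Hypothetical record for one XPL variable (names are illustrative)."""
    type: str
    name: str
    value: str = ""
    permissions: str = "read+write"
    owner: str = ""                      # the document which owns it
    created: datetime = field(default_factory=_now)
    modified: datetime = field(default_factory=_now)

    def full_document(self) -> str:
        """All attributes plus the value: transmitted once, with the program."""
        attrs = {
            "name": self.name,
            "permissions": self.permissions,
            "owner": self.owner,
            "created": self.created.isoformat(),
            "modified": self.modified.isoformat(),
        }
        attr_text = " ".join(f"{k}={quoteattr(v)}" for k, v in attrs.items())
        return f"<{self.type} {attr_text}>{escape(self.value)}</{self.type}>"

    def wire_value(self) -> str:
        """Only the current value, bracketed by its type tags: sent many times."""
        return f"<{self.type}>{escape(self.value)}</{self.type}>"

    def install(self, new_value: str) -> None:
        """Receiver side: change one string and update a timestamp."""
        self.value = new_value
        self.modified = _now()


x = XplVariable(type="int", name="x", value="377")
print(x.wire_value())        # <int>377</int>
print(x.full_document())     # the same value with every attribute attached
```

The per-update payload is the short `wire_value` string; the attribute overhead is paid only when the whole document travels.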
(2) Does it mean that an XPL program must go through some verbose
interface whenever it accesses a variable?

Reply: The interface through which an XPL program accesses its own
variables is simple. The methods of the interface begin with:

    Get-Current-Value: To use the current value of the variable.

        its-name <!-- Just quote its name -->

    Set-Current-Value: To replace the current value of the variable.

        <set> its-name new-value </set>

    Refresh-Current-Value: To have the variable update its own current
    value.

        <refresh> its-name </refresh>

There will be additional methods, to get and set permissions,
timestamps and so on, for use as required. But such information is
usually set once only, or else automatically maintained, by the XPL
executing parser. In normal programming, they will not be required.

And this will SIMPLIFY the overall interface, because conditions such
as "New value is not well-formed/wrong type/not verified" will
automatically raise error conditions with which the programmer can deal
at leisure. Rather than having to fill in all the error-handling for
every second operation, as I did in the Add-list code.

(3) Will not the processing of all this information, per variable
access, require a serious processing overhead?

Reply: Processing cost will be negligible, because variable access will
be infrequent.

Insofar as XPL is employed as a gateway language for Web dialog, the
rate of variable access is gauged by the end-user's attention. It is
well-known that people can keep about 8 items of information at the
forefront of attention at any moment. Arguably, they can select, from
20-30 items on a screen, the 8 to which they want to attend. The
end-user has a dialog rate of a transaction every few seconds. The CPU
will get the attributes up-to-date in microseconds.

Insofar as XPL is employed as a high-throughput application server
mechanism, its typical variable types will be large aggregates of
structured information. E.g.
documents, tables, databases, graphic models, and so on. (Remember:
variables are elements, and elements are containers.) In processing
aggregates, most XPL access to their parts will be reads, with
negligible overhead. Write-access overhead will occur only on the
subtrees which are rewritten, and their ancestry back to the root of
the aggregate.

Believe me. I went through much of this, while working with the
Obsidian gang, of whom more another day. The data was 3D polygons; the
amount of recalculated graphics was large, the response time was
critical. The solution was to enrich the leaf items, i.e. polygons,
with cache data, ownership by larger volume elements, and so on. This
enabled efficient searching and sorting, as well as caching.

Now to the gist.

In any programming language, a variable is a container of data.

In XML, an element is a container of data. Specifically, it is a
container which can contain other containers, as well as primitive data
like ints and strings.

I suggest that XPL variables should be variable elements. You can read
them. You can write to them, subject to a dreaded VERIFICATION ERROR
condition if you write incompatible content.

That way, the source of an XPL program is a document, and so is its
internal data. I want to explore the consequences of using documents
for everything. The key questions are:

    What can you do WITH a document? and
    What can you do TO a document?

It seems to me that the variable = container = element principle,
combined with type-checked grafting of documents onto each other, gives
us real power within a controlled family of data structures. (I haven't
thought the question through, whether grafting is to be done with
namespaces, or DTDs.)

But wait, there's more.

Traditionally, a variable is a container of data which resides in
memory while a program is running. By the time the program has stopped
running, the data we actually want should be stored in a file. Or
files.
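
Before following that thread, the read/write/refresh access with a VERIFICATION ERROR on incompatible content can be sketched as follows. This is an illustration in Python under stated assumptions, not a definition of XPL: the `Variable` class, the `VerificationError` name, and the `VERIFIERS` table (a crude stand-in for DTD-driven verification) are all hypothetical.

```python
from typing import Callable, Dict, Optional


class VerificationError(Exception):
    """Raised when new content does not fit the variable's declared type."""


def _is_int(token: str) -> bool:
    # Simplified check: an optional leading minus sign, then digits only.
    return token[1:].isdigit() if token.startswith("-") else token.isdigit()


# Hypothetical stand-ins for verification against a DTD, keyed by type name.
VERIFIERS: Dict[str, Callable[[str], bool]] = {
    "int": lambda text: _is_int(text.strip()),
    "intList": lambda text: all(_is_int(t) for t in text.split()),
}


class Variable:
    def __init__(self, type_name: str, name: str, value: str,
                 recalc: Optional[Callable[[], str]] = None) -> None:
        self.type_name = type_name
        self.name = name
        self.recalc = recalc          # "something to call to recalculate its value"
        self._value = ""
        self.set(value)               # even the initial value is verified

    def get(self) -> str:
        """Get-Current-Value."""
        return self._value

    def set(self, new_value: str) -> None:
        """Set-Current-Value: the check happens here, not in the caller's code."""
        if not VERIFIERS[self.type_name](new_value):
            raise VerificationError(
                f"{new_value!r} is not valid content for <{self.type_name}>")
        self._value = new_value

    def refresh(self) -> None:
        """Refresh-Current-Value: let the variable recalculate itself."""
        if self.recalc is not None:
            self.set(self.recalc())


mylist = Variable("intList", "mylist", "1 2 3")
mylist.set("4 5 6")
try:
    mylist.set("4 five 6")
except VerificationError as err:
    print("rejected:", err)
print(mylist.get())                   # still "4 5 6"
```

Because `set` raises on bad content, the calling code does not need a verify step around every second operation; it handles the error once, wherever that is convenient.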
A file is a container of data which persists whether its
creator/maintainer programs are running or not.

Q. When is a file a container which can contain other containers?
A. When it is a directory.

Q. Apart from persisting outside memory, is a directory an XML element?
A. No. But wouldn't things get interesting if it were?

Q. What does a directory have, that an XML element doesn't?
A. Permissions, timestamps, and a name.

Q. What does an element have, that a directory doesn't?
A. Sibling ordering, type, and grammatical (i.e. verification)
   constraint on content.

Q. What can we do to a directory, that we can't to an element?
A. Search it by name. Sort it by name, or attribute.

Q. What can we do to an element, that we can't to a directory?
A. Parse it, constrained by tag (i.e. type) information.

Q. What would it take to upgrade an element to directory abilities?
A. Give it, and all its descendants, names.

Q. How?
A. Install a name attribute in each start tag, from the root down.

Q. What would it take to upgrade a directory to element abilities?
A. Give it, and all its descendants, types. Also, give an ordering for
   every set of siblings.

Q. How?
A. Give it a file type. That is, have its name end with its type. Also,
   create its children in access order - the creation timestamp will
   then preserve that order.

Simply, I'm saying that

    <int name=x created=14May2000 permission=read+write>
       377
    </int>

contains the same information as a file called x.int, created on
14 May 2000, with read and write permissions, and containing the text
"377"; and also that

    <compound-type name=y created=15May2000 permission=read+enter>
       (children)
    </compound-type>

contains the same information as a directory called y.compound-type,
containing files or directories in which the children's information is
stored.

I want to explore, whether we can use elements to do useful things with
documents, and vice-versa.

Q. Are you serious?
XML is supposed to be platform-independent, while file system
architectures vary from system to system. Bind them together, and you
break platform-independence.

A. Quite so - if we bind XML to an actual file system. Which we would,
if we used, say, a Tcl mechanism to make system calls to do cd, mv, rm,
etc on the local file system.

However, there are two parts to a file system architecture, the
interface (API) and the implementation. Now HTML already assumes a file
system interface of sorts, insofar as it uses paths in URLs:

    http://[domain]/[user-site]/topic/.../topic.html

That much is platform-independent. It's simple enough that we take it
for granted - mainly because it's a read-only lookup system.

Now I'm canvassing the idea that XPL programs could access a file
system interface like this, with write access. The actual file system
implementation still calls the tune, because it takes the initiative in
offering the API to XPL, and can impose its own error-detection and
security upon the API, without XPL being aware.

In brief, XPL won't know whether it is accessing files on a Windows
server or an IBM 370, and it will be a guest, subject to local
security.

Engineering details no doubt remain to be worked out ;-)

That being said, the main questions are:

Q1. How can we expect to use directory mechanisms to perform useful
    operations on elements?

Q2. How can we use element mechanisms to perform useful operations on
    directories?

A1. Directory mechanisms will come into their own, once XML and XPL
libraries become relatively complex. Several kinds of directory-based
systems are used routinely for housekeeping and end-user
simplification. For instance: configurations, installations, and
version control.

Configurations: Applications, e.g. Netscape, come with config and
preference data, which is kept in files in special directories. On my
Linux system, for example, PPP has its defaults stored in /etc/ppp, and
Netscape in [...]/.netscape.
When I have finished with setting defaults, I can set permission on the
directory to read-only, and only with root access can I reopen them for
writing.

This is a good mechanism; and when people begin to offer complex XPL
applications, they will want their clients to be able to store the
defaults in controllable areas.

Installations: When an application is provided as source for
compilation, there are issues about where to find the bits and pieces
to build the working version. The build process is automated to an
extent, and re-used over many applications, by utilities such as
project files (e.g. Borland C++ Build), makefiles (Unix standard) and
package management files (Redhat Package Manager).

All these utilities expect to look for and find the construction kit
for the application in directories.

Version Control: In developing any complex software, one wants to step
from one stable version to another, as one gets more of the
functionality up and running. As well, one wants to branch out into
experimental versions, which will mature into desired functionality and
be merged into a stable version, or else thrown away. And one wants to
keep the entire history on file, easily and consistently referenced.

But, one does not want to have to rename the items of the construction
kit, each time a new version extends the tree. This is solved by means
of a two-level directory naming system, in which the actual directories
are automatically renamed, while on the upper level the developer
continues to refer to items by their common names.

XPL users will want this.

All of these can be provided within XPL, if XPL elements have standard
file and directory attributes built in. To implement them, we would
need to investigate the workings of free-source examples, and to
rebuild them as XPL programs. A major effort in each case; but each
case would need to be done just ONCE.
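
The element-to-file correspondence from the Q&A above (an element named x of type int versus a file called x.int) can be sketched in a few lines of Python. This is a rough illustration, not a design: the functions `store` and `load` and the `.order` bookkeeping file are assumptions introduced here. Sibling order is written down explicitly instead of relying on creation timestamps, and permissions and timestamps are left to whatever the host file system records.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def store(element: ET.Element, parent: Path) -> Path:
    """Write an element as <name>.<type>: a file for a leaf, else a directory."""
    path = parent / f"{element.get('name')}.{element.tag}"
    children = list(element)
    if not children:
        path.write_text((element.text or "").strip())
        return path
    path.mkdir()
    for child in children:
        store(child, path)
    # Directories have no sibling ordering, so record it in a side file.
    order = "\n".join(f"{c.get('name')}.{c.tag}" for c in children)
    (path / ".order").write_text(order)
    return path


def load(path: Path) -> ET.Element:
    """Rebuild the element: the name ends with its type, as suggested above."""
    name, _, tag = path.name.rpartition(".")
    element = ET.Element(tag, {"name": name})
    if path.is_file():
        element.text = path.read_text()
        return element
    for entry in (path / ".order").read_text().splitlines():
        element.append(load(path / entry))
    return element


doc = ET.fromstring(
    '<compound-type name="y">'
    '<int name="x">377</int><int name="z">12</int>'
    '</compound-type>'
)
with tempfile.TemporaryDirectory() as tmp:
    root = store(doc, Path(tmp))
    print(root.name)                              # y.compound-type
    print(sorted(p.name for p in root.iterdir()))
    back = load(root)
print(ET.tostring(back, encoding="unicode"))
```

One limitation of this sketch: an element with no children is always stored as a file, so an empty compound element would not round-trip as a directory.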
This is the kind of thing we could do, if XPL variable data elements
possessed the usual attributes of files and directories.

Shortly, we'll turn this on its head, and see what we could do, if
files and directories accessible by XPL had the usual attributes of XPL
elements: i.e. type, and grammatical constraints on content.

But first, a digression:

What a fine thing it is, to be a stamped addressed envelope!

You get some writing on your front to say where you're going, and some
more writing on your back saying where you've come from, in case you
get lost. You have a handsome prepaid ticket stuck in one corner, so
that everyone whose hands you pass through has to help you on your way;
and an authoritative rubber stamp over your ticket, so that no-one can
steal it.

You might even have special instructions printed on you, for your
proper handling by the postal services; PAR AVION, or DO NOT FOLD. And
special instructions too for your receiver, about how to open you and
dispose of you with environmental care.

    When you find me in your mailbox
    Cut the string and let me out
    Take me out of my wrapping paper
    Stick some bubblegum in my mouth

    Wash the glue offa my fingers
    Peel the stamps offa my head
    Pour me full of icecream sodies
    Tuck me in a nice warm bed

        - Pete Seeger

The point I'm belabouring is, the Net is a system of message-handling
services which works much as the postal services do, but automatically.
As soon as I post this email, it will embark on a saga where it gets
copied from server to server and protocol to protocol, being wrapped up
with instructions, unwrapped, rewrapped and passed on, until it reaches
its destination server and is delivered to its proper protocol.

But while data-on-the-fly enjoys this deluxe handling, data-in-storage
has it added on, sometimes, and as an afterthought.

Reason is, files and directories on personal computers were conceived
on the assumption that Programs are Active, Data is Passive.
You told a program where to find its data, with a few command-line
parameters; the data certainly did not jump up and demand its rightful
program.

Old-style applications modified this paradigm here and there, but the
paradigm continued as default, pretty much until Apple put a GUI on
everybody's desks. Macintosh files came equipped with resource
envelopes, which dictated what the file required for processing -
including the right application program.

MS Windows took up the idea. Unix lagged behind, with generic files and
directories which waited for a program to open them.

The next big twist was Multipurpose Internet Mail Extensions. MIME
types indicate the proper application for a file, e.g. text/html,
application/msword, video/mpeg. Some MIMEs apply to stored data, and
have standard filename extensions, e.g. .html, .doc, .mpg,
respectively.

From the other end, the desktop environment includes a list which
nominates the preferred application to handle each MIME type listed.

In this way, the idea of typed files has gained currency. But with XML
elements which refer to directories, or better still have directory
attributes, we can take it way further.

A2. Let's work through an example.

Cruising the Web, you happen on a page, and find there a link to
download the beta of a game called Turps. When you click the link, you
get the usual FTP download dialog.

You nominate an empty user directory on your machine, with no root
privilege. You must make the directory read+write, so that your local
FTP can create new files in it. Say this directory is called game.

You click in the dialog to accept download. You wait, and shortly there
appears a new file, game/turps.tgz.

So far, nothing novel has taken place.

Now, you turn on your browser's XPL shell. You select a directory tree
window, and open game. You see an ordinary directory tree display,
entitled "game", with a mini-icon in it and a title "turps.tgz".

You are naturally suspicious of downloaded games.
So, you select another window, identical to the first one, and choose
View/XPL. It shows

    <local-dir name=game permissions=read+write>
       <tgz name=turps permissions=read>
          <mime>
             Decompress archive/gunzip-untar
          </mime>
       </tgz>
    </local-dir>

Now, if I have things right to this point, the <tgz>...</tgz> element
contains exactly the same information as the icon. That is, courtesy of
your system's MIME file, which contains:

    the MIME entry: archive/gunzip-untar tgz
    the user-verb: Decompress
    and the icon for .tgz files.

If any of that doesn't fit in a MIME entry, then the hell with it,
we'll make our own MIME file format up as an XPL document, with its own
mime.dtd and everything. So assume we have that in operation.

Both the element and the icon show the filename and type extension, and
both are clickable. In fact, the two presentations are merely the
parsing of the same XML file with two different stylesheets.

So you click on either the icon or the user-verb. Next thing you know,
the XPL window is showing:

    <local-dir name=game permissions=read+write>
       <tgz name=turps permissions=read>
          <mime>
             Decompress archive/gunzip-untar
          </mime>
       </tgz>
       <local-dir name=game/turps>
          <html name=README>
             Browse text/html
          </html>
          <xpl-package name=turps>
             Browse text/xpl-package
             Install package/xpl-package
          </xpl-package>
       </local-dir>
    </local-dir>

Meanwhile, the directory tree shows that the turps.tgz file is now
accompanied by a new directory, with README.html and turps.xpl-package
files.

Note that xpl-package comes with two MIME types: Browse text, and
Install package. I'm not sure this is legit for MIME, but like I say,
the hell with it. The point is, that we can equip a file type with
whatever methods we like.

Click on Browse, and you'll see a directory breakdown of the package,
much as you would with Redhat. Click on Install, and the XPL code for
the xpl-package type goes to work.
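
The per-type verbs in this walkthrough amount to a lookup from file type to a set of named methods. A small Python sketch of that lookup follows, under loud assumptions: the `METHODS` table, the handler functions, and `click` are hypothetical, and the handlers only describe what they would do rather than touching any files.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


# Placeholder handlers: each returns a description instead of acting.
def decompress(name: str) -> str:
    return f"unpack {name}.tgz into directory {name}"


def browse_html(name: str) -> str:
    return f"show {name}.html as text/html"


def browse_package(name: str) -> str:
    return f"list the contents of {name}.xpl-package"


def install_package(name: str) -> str:
    return f"run the install code for {name}.xpl-package"


# Hypothetical registry standing in for the MIME file: type -> {verb: method}.
METHODS: Dict[str, Dict[str, Handler]] = {
    "tgz": {"Decompress": decompress},
    "html": {"Browse": browse_html},
    "xpl-package": {"Browse": browse_package, "Install": install_package},
}


def verbs(file_type: str) -> List[str]:
    """The user-verbs an icon or element of this type would offer."""
    return list(METHODS.get(file_type, {}))


def click(file_type: str, name: str, verb: str) -> str:
    """Dispatch a click on a verb to the method registered for the type."""
    try:
        handler = METHODS[file_type][verb]
    except KeyError:
        raise ValueError(f"type {file_type!r} has no verb {verb!r}") from None
    return handler(name)


print(verbs("xpl-package"))                      # ['Browse', 'Install']
print(click("tgz", "turps", "Decompress"))
print(click("xpl-package", "turps", "Install"))
```

A type may carry any number of verbs here, which sidesteps the question of whether two entries per type are legitimate in a real MIME file.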
What we have, in short, is object-orientation for files by type,
courtesy of an XML hierarchy which comprehends file structure, and XPL
per type (or per class) to implement the methods.

(Whew.)

Consider this a first draft for what I'll be writing down more
rigorously sometime soon, I hope.

> As for the substance of what you're saying, I'm still working on it.
> As I understand it, the gist of it is that you want to define a
> complex data type in a DTD rather than in some sort of TYPE statement.
> (With namespace we can link your variable to a different DTD. Name it
> something on the order of "int:intlist", then it or any int:variable
> would belong to a different namespace and different DTD than the
> default xmlns:xpl (it would be linked to xmlns:int). That obviates my
> previous suggestion of "xpl:variable-type".)
>

I'll get back to you on this. Just now, I really need to get CGI and
HTML generators under my skull. Don't lose the concepts.

> ps: Oh my! I just saw what eGroups does to angle brackets in the
> archives! ("View source" works.)
>
> pps: Don't be alarmed, Jonathan, I'm still working on the (larger)
> gist of it.
>

Same here. Closing down at 2 a.m.

Jonathan

    We found our camels at the Paris end of town
    With a rabid alligator and he followed us around
    We fed him salted camembert and paddled his behind
    He said, I'm touched you'd bother, but the greater loss is mine

        - Nadine & Ferdie Okeefenokee
          GP&FS lives, '00

--- End forwarded message ---