[Phplib-users] Template reverse transformation

Hello,

I've been charged with either finding or making a template system for a
large-ish website that uses PHP and I have a question about a feature
that doesn't appear to be in PHPLIB or any other template system that I
know of.  I'd like to know if anybody could give me some feedback on 1)
what it would take to retro-fit PHPLIB with this feature;  2) whether
somebody has suggestions that would achieve the same goals and would be
simpler to retro-fit PHPLIB for;  and/or 3) whether anybody has ever
seen such a feature before in a template system?

Typically, a templating system works like this:

  Template1 ---
               \
  Template2 ------> Transform --> HTML
               /
  Template3 ---

The feature that I want is this:

                                --> Template1
                               /
  HTML --> Reverse-Transform -----> Template2
                               \
                                --> Template3

Before anybody tells me that this is insanely difficult, I will say that
the "how" is coming below, and seeing this idea (which wasn't my idea
originally, incidentally) might make it worth your while to read
through this very long message.  First, the why...

I consider ease of use by designers to be extremely important in a
templating system.  Ideally, this would mean that there is seamless
WYSIWYG support so that designers who use things like Dreamweaver can
edit a *single* HTML file with *real* data on it and have the changes
reflected in the templates that were used to build that page.  I know
there is a temptation to ask what's so hard about editing Template1,
Template2, etc by hand, but for designers that only know Dreamweaver and
don't know HTML, it's a moot point - regardless of how hard it is, it's
not an option in a lot of cases.

I am a huge fan of DOM manipulation templating solutions like XMLC and
HTML_tree for this reason.  They allow designers to work with valid HTML
and realistic sample data which generates a fully realistic mockup. 
They don't, however, support a reverse transformation so there is still
a degree of having to figure out which template to edit to effect which
change.

Below is how this could be achieved in a system that would be
implemented from scratch.  The basic gist is that you can put the
templating system in "debug" mode, which would then result in pages
being generated with <span> tags wrapping template fragments and
variables so as to identify them later.  The designer would edit the
single, exported page (which contains real data instead of variable
names) in a WYSIWYG editor and then upload the modified page to the
template system.  The template system would then use the <span> tags to
figure out which templates created which pieces of the page and then
modify those templates accordingly.

That's the Reader's Digest summary - see below for *much* more detail. 
My question again is whether something similar could be built into
PHPLIB and what this would take?  I know this would be extremely
difficult to make work correctly with probably the majority of PHPLIB's
features, but I'd still be happy even if getting a reverse
transformation feature necessitated that I use a certain sub-set of
PHPLIB features in my templates.

Enabling the reverse transformation and importation by utilizing <span>
tags was originally Benjamin 'Quincy' Cabell V's idea, by the way, to
give credit where credit is due.  His summary of how he would design
such a system from scratch is below.

Thanks,
- Tim Macinta

From: Quincy <Tem...@be...>
-----------------------------------

The features which I believe are critical to a good templating system
are the following:

- Multi-language
- Multi-style
- Caching
- Variable AND constant substitution
- Easy variable insertion (in template, no coding, per se)
- Significant separation of html and code
- Edit Once, Edit Everywhere

In my opinion all of the templating systems I have seen fail on these
last two points.  On the issue of separating HTML and code, they succeed
only to the degree that they eradicate HTML from inside PHP files.  They
separate HTML from code by spreading one page's HTML (final rendered
page the user will see) into dozens of template pieces which are then
merged together in a way that makes editing the GUI at a later time
difficult because of the need to a) read the code (or some document
about the code/templates) to understand which template contains which
"piece" of the layout and b) modify each individual template (of dozens
or more) in a coordinated fashion to modify the design.  These two
failures/weaknesses make templating systems much less effective than
they could be.

But, before talking about a solution to that, let me first cover the
other issues.

Multi-language and multi-style.  This is really one feature, but looks
like two.  The idea is that you have parallel template sets for style
and within those parallel template sets for language.  So, each template
has a name, a style, and a language setting.  To find the template you
would look in the database for the appropriate template of the
appropriate style, then the appropriate language.  If you fail to find
the template of the appropriate language you fall back to the default
style with the appropriate language.  If you fail to find that you fall
back to the appropriate style with the default language.  If you fail to
find that you fall back to the default style and default language (which
will always exist).  The setting of the style/language is optional
(otherwise default), and the setting of it could occur in the
initializer to the template class, or at any point, or specifically in a
template inclusion call.
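
The fallback order described above can be sketched roughly as follows.
This is only an illustration: the in-memory array stands in for the
database table, and the function name find_template() and the
"language|style|name" key format are assumptions made for the example.

```php
<?php
// Hypothetical stand-in for the template table, keyed "language|style|name".
$templates = [
    'default|default|header' => '<h1>Welcome</h1>',
    'fr|default|header'      => '<h1>Bienvenue</h1>',
    'default|blue|header'    => '<h1 class="blue">Welcome</h1>',
];

/**
 * Return the first template found, trying the fallback order from the text.
 * Returns null only if even the default/default entry is missing.
 */
function find_template(array $templates, string $name, string $style = 'default', string $language = 'default'): ?string
{
    $candidates = [
        [$language, $style],       // requested style, requested language
        [$language, 'default'],    // default style, requested language
        ['default', $style],       // requested style, default language
        ['default', 'default'],    // expected to always exist
    ];
    foreach ($candidates as [$lang, $sty]) {
        $key = "$lang|$sty|$name";
        if (isset($templates[$key])) {
            return $templates[$key];
        }
    }
    return null;
}

// No French template exists in the "blue" style, so the default style is used.
echo find_template($templates, 'header', 'blue', 'fr'), "\n";
```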

Caching is somewhat obvious, somewhat not.  The ability to store the
processed template (post variable/constant substitution) for X seconds
and have this get used instead of "creating" a new one.  The bit that's
a bit awkward is that you are not caching to prevent the variable
substitution, which is pretty simple and efficient, you are caching to
prevent the PHP that would get the data ready to be laid out in a
template.  So, the actual PHP code (business logic) would essentially
check with the template cache manager code to see if the content is in
cache (according to hash info and the rules of that template) and if it
is it returns it and bypasses the PHP code which would have done the SQL
calls/computations/etc. to create whatever data/templates the template
might have required.  It's the generic object cache I had Tim Macinta
build, in the setting of a template manager.
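
A minimal sketch of that idea, assuming a hypothetical FragmentCache
class (not the actual object cache mentioned above): the business logic
is passed in as a callback and only runs when no fresh cached copy
exists.  The optional $now parameter exists only to make the example
deterministic.

```php
<?php
// Hypothetical cache: rendered output is stored per key with an expiry time.
class FragmentCache
{
    private array $entries = [];

    /** Return cached output, or run $producer (SQL, computation, rendering) and store its result. */
    public function fetch(string $key, int $ttlSeconds, callable $producer, ?int $now = null): string
    {
        $now = $now ?? time();
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['output'];   // business logic is skipped entirely
        }
        $output = $producer();
        $this->entries[$key] = ['output' => $output, 'expires' => $now + $ttlSeconds];
        return $output;
    }
}

$cache = new FragmentCache();
$queries = 0;
$render = function () use (&$queries): string {
    $queries++;                              // stands in for the expensive SQL calls
    return '<ul><li>result</li></ul>';
};

$key = 'search_results|' . md5('q=php');     // hash of the inputs the template depends on
$cache->fetch($key, 60, $render, 1000);      // generated
$cache->fetch($key, 60, $render, 1030);      // served from cache
$cache->fetch($key, 60, $render, 1061);      // expired, generated again
echo $queries, "\n";
```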

Variable and constant substitution.  The variable substitution part is
just that.  It's the thing every template system does most
fundamentally.  Replace a marker in the template with the value of some
variable.  The "constant" substitution is a bit of a misnomer.  It's
actually a style/language-specific constant substitution.  Meaning that
you could define page_bg_color to be #0000cc in the "Cool Blue" style,
and #cc0000 in the "Hot Red" style.  And, you could substitute other
things such as key strings on a language-by-language basis.  So, the
constants are NOT constants in the true PHP sense (and they are not
defined as constants in the PHP code), but items which are small-ish
fragments of text you want to put in your pages, which will not change
based on page execution, and which get placed into templates. (This
variable/constant is functionally identical to vBulletin's template
system.)
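
To make the distinction concrete, here is a small sketch.  The marker
syntax is an assumption for illustration only: $name marks a variable
and %name% marks a style/language constant; substitute() is a
hypothetical helper.

```php
<?php
// Hypothetical style/language-specific constants, keyed "language|style".
$constants = [
    'en|cool_blue' => ['page_bg_color' => '#0000cc', 'search_label' => 'Search'],
    'en|hot_red'   => ['page_bg_color' => '#cc0000', 'search_label' => 'Search'],
];

/** Replace %constant% and $variable markers in one pass; unknown names are left untouched. */
function substitute(string $template, array $variables, array $constants): string
{
    return preg_replace_callback('/%(\w+)%|\$(\w+)/', function (array $m) use ($variables, $constants) {
        if ($m[1] !== '') {                              // matched a %constant% marker
            return $constants[$m[1]] ?? $m[0];
        }
        return isset($variables[$m[2]]) ? (string) $variables[$m[2]] : $m[0];
    }, $template);
}

$template = '<body bgcolor="%page_bg_color%"><p>%search_label%: $query</p></body>';
echo substitute($template, ['query' => 'phplib'], $constants['en|hot_red']), "\n";
```

The same template rendered with the 'en|cool_blue' set picks up the
other background color without any change to the PHP code.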

Easy variable insertion is just an awkward way of saying that editing a
template shouldn't be painful: it shouldn't require code-like things or
many more keystrokes than simply referring to the variable/constant
you're substituting by name.  Most systems do this reasonably well.

Back to the thorny and significant problem of separation of html and
code.  The other features are relatively common and relatively easy to
achieve (caching is a bit complicated, but it's a pretty "well
understood" problem).  This item, though, is a bit of a bitch.  And I
don't think it has to be, and I think I know a way out of this maze.
Personally I've found templating systems hideous to use.  Again, you
just can't edit things easily when you use them.  You have to trace back
the building of the page, see all the templates used, discern from their
names or their order of execution what they do, edit the one or many
templates you need to (and the numbers can be truly staggering, see
vBulletin with its dozens and dozens of templates for one viewed page),
then hope to god you didn't miss a closing tag and have to journey into
a nightmarish world to find which template has one too many or too few
tags.  On some level, this "cure" becomes worse than the original
"disease".  So, what to do?

I'll present part of a possible solution.  It's one I worked on a bit a
year or two ago, and started thinking about again when I asked Tim
Macinta to look into templating systems for our site.  From his initial
research it seems the state of templating systems still hadn't improved
(enough) to meet what I perceived as my/our needs.  If anyone reads this
and responds, "Oh, I've seen that before, it's in Foo Bar Inc.'s
Fubaramatic Templating System," please forgive me and let me know where
it's to be found!

The goal is simply this:  

Let's say I decide to clean up the look of our new "search system".  I
open up a browser to my site, load up the search form, execute a search,
get a bunch of results.  Now that I'm looking at a page on the site that
contains all the content areas/items I want to edit, I add
"&templatex=1" to the URL (or set a cookie named templatex).  The page
reloads and looks exactly like it did a moment ago (but has invisible
differences in hidden meta markup), and I save the page with IE.  This is a
"real" page, it has real results on it, not bogus sample data, but real
results I got from the real page.  I open up Macromedia's Dreamweaver,
change the order of the cells in the result table, change the background
colors in the cells from alternating light and dark grey to alternating
light blue and light red, edit some copyright text, replace a few
graphics, and save.  I go to the "TemplateX" management page, go to the
"import" form, browse to and upload the HTML page I just edited (just
the HTML file, no graphics).  I hit "submit".  And the system
automagically makes the required changes to ALL the required templates
and constants (the style/language ones).  No more trying to understand
how the templates fit together, or which has what, no more editing
multiple template files.

Now...  I think it's possible to achieve all of that goal, relatively
easily, within certain limits.  Here is some info on how such a system
could be achieved, and some of the limitations.

Achieving this goal goes a little like this.  Every substitution (from a
variable or constant) is wrapped in a <span></span> wrapper.  The
wrapper includes additional info in attributes about what did the
substitution (variable or constant, and the variable name).  Each
template is wrapped in a <span> whose attributes tell which template it
is, and what language and style it is.  And, for tag attributes, for
every "real" attribute where a substitution occurred, a paired fake
attribute exists which tells what variable/constant did the substitution
(could also do all this in one fake attribute per tag).  There's a bit
more to how this would work, but these are the key points.
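
As a rough sketch of the wrapping step (text-level variables only,
attribute substitution left out), assuming $name markers in the
template and a hypothetical render() function with a debug switch:

```php
<?php
/**
 * Render a template.  In debug mode, wrap each substituted value and the
 * whole template in identifying <span> tags.  Values are inserted as-is,
 * since a variable may hold an already-built fragment.
 */
function render(string $id, string $source, array $variables, bool $debug): string
{
    $body = preg_replace_callback('/\$(\w+)/', function (array $m) use ($variables, $debug) {
        if (!isset($variables[$m[1]])) {
            return $m[0];                                // unknown marker: leave as written
        }
        $value = (string) $variables[$m[1]];
        return $debug ? '<span id="variable:' . $m[1] . '">' . $value . '</span>' : $value;
    }, $source);

    return $debug ? '<span id="template:' . $id . '">' . $body . '</span>' : $body;
}

$source = '<b>$title</b> by $author';
$data   = ['title' => 'PHPLIB', 'author' => 'Tim'];

echo render('en|default|result_row', $source, $data, false), "\n";   // normal output
echo render('en|default|result_row', $source, $data, true), "\n";    // with meta markup
```

Both calls render the same visible text in a browser; only the second
carries the markers an importer could use later.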

Now for some limitations.  The templates cannot be "anything"; they must
be HTML fragments (text which can contain HTML tags, but no partial tag
such as "<table" with no closing ">"), and attributes cannot be
"partially" replaced by variables (such as bgcolor=#cc$foo).  An
attribute may have the value of a variable or a constant (and perhaps
even of another template, which in turn could be created in part by a
variable, but maybe not; I haven't thought this out).  So a tag could not
be like <table $border_attribute>, though it could be like
<table border=$border_width>.  If a template is included multiple times,
only modifications to the first instance (presumably) would be processed
(the others would cause no harm, but the changes there would be ignored,
and they would in fact be wrapped with span tags indicating the data
there was just sample data, not related to a template).  The page you
edit contains the substituted variables, so it renders as a real page,
with real data.  As these items are wrapped in <span> tags with vital
identifiers in the tags, the designer must not corrupt the wrapping when
they move/alter the "text" (<font><span>foo</span></font> would be very
different from <span><font>foo</font></span>).  Similarly, as the HTML
tags may contain "false" attributes used to identify tag attribute
substitutions, those "false" attributes must also be left unaltered by
the designer.

The parser would read the document, identify the templates involved for
each region, reconstructing the variables/constants that were
substituted, and re-create the various template files (and alter
constants that had changed).
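
A deliberately simplified sketch of such a parser follows.  It assumes
the marker format <span id="template:..."> and <span id="variable:...">,
that variable spans hold plain text only, and that templates are not
nested and contain no other <span> tags; a real importer would need to
handle nesting.  reverse_transform() is a hypothetical name.

```php
<?php
/**
 * Recover template sources from an edited page.
 * Returns an array of template id => reconstructed template source.
 */
function reverse_transform(string $html): array
{
    // Put the $name marker back in place of each piece of sample data.
    $html = preg_replace_callback(
        '/<span id="variable:(\w+)">[^<]*<\/span>/',
        function (array $m) {
            return '$' . $m[1];
        },
        $html
    );

    // Whatever is left inside each template span is the new template source.
    $templates = [];
    preg_match_all('/<span id="template:([^"]+)">(.*?)<\/span>/s', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $templates[$m[1]] = $m[2];
    }
    return $templates;
}

// The designer changed <b> to <i> and " by " to " - " in the exported page.
$edited = '<span id="template:en|default|result_row">'
        . '<i><span id="variable:title">PHPLIB</span></i> - <span id="variable:author">Tim</span>'
        . '</span>';
print_r(reverse_transform($edited));
```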

So, that's it.  It should work, and I believe it would be a great
advancement over anything I've seen/used.

We could start with some core features, initially supporting no
"constants" and no tag attribute substitution.  The parsing of the
document I do not believe should be too difficult, as the logic would be
pretty simple and the <span> tags we're using would be assumed to be
very exactly defined and should thus be easy to parse/locate via regex.
The "Edit Once, Edit Everywhere" design may not work out perfectly in
practice.  Dreamweaver or other applications may insist on putting <font>
tags on the inside of a <span> when you format text, and so true WYSIWYG
may not come.  We may find we need to construct our templates differently
than we might with other template systems, such as defining a <table>
header twice (with different attributes) rather than do variable
substitutions dynamically to add attributes in one template on the fly.
And so on.  Nonetheless, this system would still be better and easier to
use than any current templating system out there.  Assuming that it is
this major step forward in templating systems we could/should release it
GPL or LGPL to the open source community and allow others to develop it
further, which is likely to occur.

Import/Export Tool
------------------
In addition to the system mentioned above, where a designer can make
changes to one HTML file and then automagically update all the
appropriate templates based on the changes made in that one file, an
import/export tool would be a useful way to make large volumes of
template changes.  While the templates are stored in the database,
there would be great benefit in having a tool to import/export template
files to/from a file system for ease in making large scale template
changes/edits.  In this situation, I would type in "templatex --export
/home/quincy" and it creates a directory structure roughly as I outlined
(with language directories, and style sub-directories), you can then
edit the files with Dreamweaver/TextPad32/Homesite/Emacs/etc. and then
run "templatex --import /home/quincy" and it imports them, creating
template entries for 'new' files and over-writing entries for edited
ones.  My primary reason for believing this feature is needed is because
of my experience in doing lots of vBulletin changes, and feeling pretty
limited/frustrated by needing to do all edits within their limited
template editing system, instead of being able to use whatever tools I
like, leveraging their power, for syntax highlighting, HTML
preview/editing, doing global search and replaces, global finds, etc.  
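
The export/import round trip could look something like the sketch
below.  The <language>/<style>/<name>.html layout, the ".html"
extension, and the function names are assumptions; an array stands in
for the database.

```php
<?php
/** Write each template to $root/<language>/<style>/<name>.html. */
function export_templates(array $templates, string $root): void
{
    foreach ($templates as $key => $source) {
        [$language, $style, $name] = explode('|', $key);
        $dir = "$root/$language/$style";
        if (!is_dir($dir)) {
            mkdir($dir, 0777, true);
        }
        file_put_contents("$dir/$name.html", $source);
    }
}

/** Read the tree back: new files become new entries, existing keys are overwritten. */
function import_templates(array $templates, string $root): array
{
    foreach (glob("$root/*/*/*.html") as $path) {
        $name     = basename($path, '.html');
        $style    = basename(dirname($path));
        $language = basename(dirname($path, 2));
        $templates["$language|$style|$name"] = file_get_contents($path);
    }
    return $templates;
}

$root = sys_get_temp_dir() . '/templatex_' . uniqid();
export_templates(['en|default|header' => '<h1>Hi</h1>'], $root);

// Simulate edits made with an external editor: one changed file, one new file.
file_put_contents("$root/en/default/header.html", '<h1>Hello</h1>');
file_put_contents("$root/en/default/footer.html", '<p>Bye</p>');

$all = import_templates(['en|default|header' => '<h1>Hi</h1>'], $root);
ksort($all);
print_r($all);
```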

Automating Template Determination/Fetching
------------------------------------------
There are many lessons we can learn from experience with vBulletin and
seeing how they have implemented their system.  Within each vBulletin
page they call a function which pre-fetches the templates that the page
expects it will need, saving on DB calls (and, with my hack, fetched
templates are cached to the filesystem, bypassing the DB altogether).  We
should (probably) try to create a system which automagically knows which
templates will be required for which scripts, based on previous runs of
the script (it would "learn"), as well as being able to manually control
which templates are pre-fetched.  Of course we can start with manually
setting it, and later do it automagically.  I just mention this feature
because with vB it gets very taxing to manually maintain the list, when
sometimes you've got 20 or 30 templates in the list, and you need to
remember which you've added/removed/etc.  And if you forget a couple
then of course it needs to separately fetch them as needed, which is
(probably) less efficient than the overhead of automagically fetching
too many.  
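
One way the "learning" could work, as a sketch only: record which
templates a script actually used, merge that into a per-script history,
and combine the history with any manually listed templates on the next
run.  PrefetchLearner and its methods are hypothetical names, and
persisting the history is left to the caller.

```php
<?php
// Hypothetical helper: remembers which templates each script used on earlier runs.
class PrefetchLearner
{
    private array $history;   // script name => template names seen so far
    private array $used = [];

    public function __construct(array $history = [])
    {
        $this->history = $history;
    }

    /** Templates to fetch up front: the manual list plus anything learned earlier. */
    public function prefetchList(string $script, array $manual = []): array
    {
        return array_values(array_unique(array_merge($manual, $this->history[$script] ?? [])));
    }

    /** Called by the template manager each time a template is actually used. */
    public function recordUse(string $template): void
    {
        $this->used[$template] = true;
    }

    /** Merge this run's usage into the history; the caller would persist the result. */
    public function finish(string $script): array
    {
        $known = $this->history[$script] ?? [];
        $this->history[$script] = array_values(array_unique(array_merge($known, array_keys($this->used))));
        return $this->history;
    }
}

// First run: nothing is known yet, templates are recorded as they are used.
$learner = new PrefetchLearner();
$learner->recordUse('header');
$learner->recordUse('search_row');
$history = $learner->finish('search.php');

// Next run: the learned templates are pre-fetched along with the manual ones.
$nextRun = new PrefetchLearner($history);
print_r($nextRun->prefetchList('search.php', ['footer']));
```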

Parsing of Updated HTML File
----------------------------
The reverse transformation which occurs when the parser is asked to
process a final document that you modified (with a text editor or
Dreamweaver) back into the mods to the individual templates 
is easy because it has been purposely "dumbed down" with only one
"branch" to follow, ever.  That branch is the one that was used in the
actual page generation that you edit.  Originally my goal when I started
thinking about this last year was to try to handle all the
cases/branches when you edit one page, but it's clearly a nightmare and
forces the templates or the editing of them to contain "code"
(code-suggestive markups, at least).  So, I think the solution is to
bypass the issue entirely, and go with the simple one branch approach.
So, if I want to change the entire look of the search page, that does
mean I need to run through all of the possible variations of the search
page (showing 0 results, showing X results, in this mode, that mode,
etc.).  And I don't think that's a bad thing, actually.  Every time you
make a change that will show up in the other variations, it will show up
there when you check them, so it's not that you'll be doing extra work.
Now, the one danger is of course that you could forget about one of the
application's variations.  But, I would say that risk is probably
acceptable, and with most applications, the sorts of changes you're
doing are few enough that it's unlikely you'd forget, and if the change
is big enough, you'd better be sure you do know all the variations.

Flexibility in Design through Proper Programming Decisions
----------------------------------------------------------
The page designer's/editor's flexibility is dictated by the coding
decisions the programmer makes.  It is the programmer's responsibility
to build in this flexibility by breaking the pages up into appropriate
template fragments.  The template manager has two primary template
processing functions, build_fragment() and show_fragment().
Build_fragment() processes the fragment but does not echo/print it,
instead it places it in a 'variable' in the template manager which will
then use that as a variable for substitution in other templates.
Show_fragment() is identical to build_fragment() except that it outputs
(echos/prints) the results of the template processing.  So, here is one
example of how you might build a form processing page:

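  // Build the pieces; each result is stored in the template manager.
  build_fragment("header");
  if (count($errors) == 0) {
     build_fragment("abc_success");
  } else {
     build_fragment("abc_errors");
  }
  build_fragment("abc_form");

  // Position abc_success/abc_errors and abc_form within the page body.
  build_fragment("abc_form_body");

  build_fragment("footer");

  // Assemble header, body, and footer, and output the page.
  show_fragment("standard_document");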

In this design, abc_errors, abc_success, and abc_form are "positioned"
by abc_form_body.  Header, footer, and abc_form_body are then positioned
by standard_document.

So, the designer could alter all these things, assuming of course that
the code is properly designed with this in mind, and (a very important
limitation) that no template changes its relationship with its parent;
technically they may not absolutely need to observe that but it would be
somewhat at their peril not to.  In the above example, by editing the
output of a successful form submission, they could move the placement of
the abc_success fragment directly into the standard_document and it
would "work" (that is, the rendering would all lay out okay, since the
template abc_success is still being evaluated in the same place in the
code; it's just being told to appear in standard_document, NOT in
abc_form_body).  But, every other page that uses standard_document will
attempt to do the same (this may not cause problems since the variable
that abc_success fills would NOT be set for other pages, and thus
nothing would be output).  But, the point is, it's a corruption of the
intended structure, and such things should be strongly guarded against
because they will cause problems.

Just to repeat for clarity, the idea of build_fragment is to build
variables like this from templates, as opposed to just setting template
variables from code (execution).  One point to note is that in the above
example I don't make clear what the variable name should be for the
storage of the built template.  The logic should probably be that it can
be "assumed" to be the template name, OR, it can be set, specifically.
In the above example, the abc_form_body should *NOT* use the default
name of the template as the variable name for the template manager,
because the standard_document would be using a generic name like
'body_variable' or something as a placeholder for the body, and that is
what should be holding the abc_form_body.  So, the call to build
fragment for abc_form_body would need to specify this.  In this way each
template has a clearly defined set of inputs.
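
A minimal sketch of a template manager with these two functions,
including the optional storage name discussed above.  The class name,
the $storeAs parameter, the $name marker syntax, and the sample
templates are all assumptions for illustration.

```php
<?php
// Hypothetical template manager; templates use $name markers and live in an array.
class TemplateManager
{
    private array $templates;
    private array $variables = [];

    public function __construct(array $templates)
    {
        $this->templates = $templates;
    }

    /** Process a template and store the result under $storeAs (default: the template name). */
    public function build_fragment(string $name, ?string $storeAs = null): string
    {
        $output = preg_replace_callback('/\$(\w+)/', function (array $m) {
            return $this->variables[$m[1]] ?? '';        // unset variables render as nothing
        }, $this->templates[$name]);
        $this->variables[$storeAs ?? $name] = $output;
        return $output;
    }

    /** Same as build_fragment(), but also outputs the result. */
    public function show_fragment(string $name): string
    {
        $output = $this->build_fragment($name);
        echo $output, "\n";
        return $output;
    }
}

$tm = new TemplateManager([
    'header'            => '<h1>Site</h1>',
    'abc_success'       => '<p>Saved.</p>',
    'abc_errors'        => '<p>Please fix the errors.</p>',
    'abc_form'          => '<form></form>',
    'abc_form_body'     => '$abc_success$abc_errors$abc_form',
    'footer'            => '<hr>',
    'standard_document' => '$header$body_variable$footer',
]);

$errors = [];
$tm->build_fragment('header');
$tm->build_fragment(count($errors) == 0 ? 'abc_success' : 'abc_errors');
$tm->build_fragment('abc_form');
$tm->build_fragment('abc_form_body', 'body_variable');   // stored under the generic name
$tm->build_fragment('footer');
$tm->show_fragment('standard_document');
```

Because abc_errors was never built on this run, its marker in
abc_form_body simply renders as nothing.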

How the Meta Markup Exists in the Document to be Edited
-------------------------------------------------------
The design of span tags used in this system should be something like:

<span id="template:en|default|standard_document">
 <span id="variable_from_template:en|default|header">Top of the page</span>
 <span id="variable:body_variable">
   <span id="variable_from_template:default|default|abc_success">Success!</span>
   <span id="variable_from_template:default|default|abc_errors"></span>
   <span id="variable_from_template:default|default|abc_form_body">Form goes here</span>
 </span>
 <span id="variable_from_template:en|default|footer">Bottom of the page</span>
</span id="template:en|default|standard_document">

The 'id' I'm defining here is equal to something like a "type"
identifier (template: or variable: or constant: or
variable_from_template: (maybe) and maybe some other types) followed by
the language indicator|style indicator|template name.  In the case of a
variable, all we (may) need is the variable name (but we may want/need
more to help us know where that variable is expected to be defined,
though contextually it would be clear, since it is inside another span).
For a constant: you would also need the language, style, and constant
name.  Now, this variable_from_template: is something I just realized we
*might* want.  It may be enough to just say template: and the parser
will "figure out" that it's imported via a variable because it's within
a variable:'s span.  Again, it was just a thought I had now that we
*might* want to explicitly differentiate.  So, we may want to
differentiate, we may not.  I think maybe NOT, but I include it in case
I'm wrong.  If we don't need it, then we'd just call these template: as
I'd originally thought.  Also, there may be an issue about how to expose
it since, as I have said, build_fragment() needs to be able to store its
results under an alternate variable name to work within a generic
container template that will include it.  In this case the template:
would need to have an additional element saying what its alias is.
Again, not entirely thought through, but a strong design suspicion.

I'm not at all set on the "template: something|something" formatting,
that's just for explanatory purposes, and it could be something
better/more compact.  "" (empty) could be default, and template: could
be t: or t|, etc.

Also, we may want to add identical id info to the closing span tag (see
last /span), for easier parsing (I'm not sure if we want/need this
exactly, as part of the parsing, but if we did this would clearly be a
quick way to find the end of a span, versus tracking span counts, and
all that).
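
To illustrate both points, here is a sketch of splitting an id into its
parts and of locating a span's body when the id is repeated on the
closing tag.  The function names are hypothetical, and the id format is
just the explanatory one used above (with empty language/style fields
treated as "default").

```php
<?php
/** Split "template:en|default|header" or "variable:body_variable" into its parts. */
function parse_marker(string $id): ?array
{
    if (!preg_match('/^(template|variable_from_template|variable|constant):(.+)$/', $id, $m)) {
        return null;
    }
    $parts = explode('|', $m[2]);
    if ($m[1] === 'variable') {
        return ['type' => 'variable', 'name' => $parts[0]];
    }
    if (count($parts) !== 3) {
        return null;
    }
    [$language, $style, $name] = $parts;
    return [
        'type'     => $m[1],
        'language' => $language !== '' ? $language : 'default',
        'style'    => $style !== '' ? $style : 'default',
        'name'     => $name,
    ];
}

/** With the id repeated on the closing tag, no counting of nested spans is needed. */
function span_body(string $html, string $id): ?string
{
    $open  = '<span id="' . $id . '">';
    $close = '</span id="' . $id . '">';
    $start = strpos($html, $open);
    $end   = strpos($html, $close);
    if ($start === false || $end === false || $end < $start) {
        return null;
    }
    $start += strlen($open);
    return substr($html, $start, $end - $start);
}

$page = '<span id="template:en|default|doc"><span id="variable:x">1</span></span id="template:en|default|doc">';
print_r(parse_marker('template:en|default|doc'));
echo span_body($page, 'template:en|default|doc'), "\n";
```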

In addition to the <span> tags listed above, there would be another
type/modifier of an id tag, the "sample data" modifier.  For example:

<span id="template:{sample data}en|default|standard_document">
 <span id="variable_from_template:en|default|header">Top of the page</span>
</span>

Or some such representation (I don't like the one I made above), which
would modify the second (and nth) instances of a template, meaning that
the parser can disregard any changes within THAT exact level (which is
to say, if there was a variable/template WITHIN that sample data span
that was NOT a 2nd-plus instance, then its changes WOULD need to be
parsed).  So, the sample data indicator does NOT mean everything within
the wrapped sample tags can be disregarded; it only means that that
specific template can.  Of course you could just let the parser figure
it out as it's running, and have no "sample data" modifier, and just let
the parser only modify where it sees changes and the "last change" (if
the user modified multiple instances of the same template) wins.  Or do
it this way.
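
The "last change wins" alternative is simple to express.  A sketch,
assuming the importer has already reconstructed the template source
from each instance on the page (resolve_instances() is a hypothetical
name):

```php
<?php
/**
 * Given the stored template source and the reconstructed source of each
 * instance found in the edited page (in document order), return the source
 * to store: the last instance that differs from the original, if any.
 */
function resolve_instances(string $original, array $instances): string
{
    $result = $original;
    foreach ($instances as $source) {
        if ($source !== $original) {
            $result = $source;          // later edits overwrite earlier ones
        }
    }
    return $result;
}

$original  = '<tr><td>$title</td></tr>';
$instances = [
    '<tr><td>$title</td></tr>',          // untouched first row
    '<tr><td><b>$title</b></td></tr>',   // the designer edited the second row
    '<tr><td>$title</td></tr>',          // untouched third row
];
echo resolve_instances($original, $instances), "\n";
```

With this approach the designer can edit whichever instance is
convenient, at the cost of silently discarding earlier conflicting
edits.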

<span> tags are not sufficient to support all types of template
modifications.  <span> tags can only be used to identify changes which
are OUTSIDE of tags.  If you wanted a template to do something like
<table border=$border_width> this <span> system cannot work.  The reason
is because whatever markup we use must allow valid HTML rendering while
preserving meta-information about what was dynamic.  Thus, a different
system must be used to handle "inter-tag dynamicity" (sounds cool!)!

One proposal would be:

<table bgcolor="#cccccc" templatex__bgcolor="variable:foo">

The same span id structure would be used here, supporting templates,
variables, and constants.
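
On import, the paired fake attribute tells the parser to put the
variable marker back in place of whatever value the designer left in
the real attribute.  A sketch, handling only variable: markers and
assuming double-quoted attribute values that contain no ">" character;
restore_attributes() is a hypothetical name.

```php
<?php
/**
 * Turn  bgcolor="#cccccc" templatex__bgcolor="variable:foo"  back into
 * bgcolor=$foo  and drop the fake attribute.
 */
function restore_attributes(string $html): string
{
    return preg_replace_callback('/<[a-zA-Z][^>]*>/', function (array $tag) {
        $text = $tag[0];
        preg_match_all('/\s+templatex__(\w+)="variable:(\w+)"/', $text, $fakes, PREG_SET_ORDER);
        foreach ($fakes as $f) {
            // Remove the fake attribute, then swap the real value for the marker.
            $text = str_replace($f[0], '', $text);
            $text = preg_replace_callback(
                '/\b' . preg_quote($f[1], '/') . '="[^"]*"/',
                fn () => $f[1] . '=$' . $f[2],
                $text,
                1
            );
        }
        return $text;
    }, $html);
}

// The designer changed the sample color; the template should still say $foo.
$edited = '<table bgcolor="#ff0000" templatex__bgcolor="variable:foo" width="50%">';
echo restore_attributes($edited), "\n";
```

Note that the designer's new color is discarded here, because the value
belongs to the variable rather than to the template.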

Obviously we could achieve *some* of the same end result as the
tag-level replacements by having more duplication at the template
level, that is, by having one template variation for each variation we
want to achieve.  But, this is pretty limited, since we couldn't have
the color set from a constant or anything, and if we were trying to vary
more than one attribute, we'd have an exponential growth in the number
of tag variations we'd need to store in templates.

One concern is that by this method you do replacements on the attribute
values.  I do not know whether those attributes in HTML which do NOT
have values (such as 'selected' and 'checked') would behave properly in
all browsers if assigned values, like checked=0 or selected=0.

Storing all the attribute info in one fake attribute may be a good
alternative to storing one fake attribute for each real one.  I am not
satisfied with this attribute replacement scheme, it feels weak, since
you can't dynamically add new attributes/etc.  You can't do stuff like
<table $border $width>, and that is annoying to feel limited like this.

Another alternative Tim proposed is something like:

<table
     templatex_span="variable:foo"
     bgcolor="#cccccc"
     templatex_span_end="variable:foo">

Then you could add a width attribute just as you would add
text between <span> tags:

   <table
     templatex_span="variable:foo"
     bgcolor="#cccccc"
     width="50%"
     templatex_span_end="variable:foo">

In practice you would need to have the fake attributes which define the
span randomized such as templatex_span_89823 and
templatex_span_end_89823 to prevent attribute folding by editors which
wouldn't like duplicate attributes.  Also, the arrangement of these fake
attributes is critical, and poses some risks as editors could (in
theory) re-arrange them.

One limitation of this design is that an attribute value cannot be as
easily modified, as you would need to specify both attribute and value
in the template/variable being used.  One possible solution is by doing
something like:

   <table
     templatex_span="variable:foo|bgcolor"
     bgcolor="#cccccc"
     templatex_span_end="variable:foo|bgcolor">

Mentioning the attribute in the variable: to aid the replacement is bad,
and breaks the concept of spanning the pure variable fill, since here
part is permanent (the attribute) and part is to be replaced by a
variable.  So, this isn't ideal and should probably be replaced by
something better.  I just feel like it would be good to have a mechanism
to optionally replace just the value of an attribute with a variable,
since that's very often how it would be used.  And to make the user pass
in the attribute name and value would be a bit of duplication and once
again sort of shift some un-necessary HTML into the code or make a lot
of very tiny template segments to contain the attribute and the variable
value.

For safety with editors I prefer the first method, the
templatex__bgcolor="variable:foo".

Template Input
--------------
This indirectly brings up the issues of controlling the input to the
template.  And for ease of use suggests a greater need for an ability to
pass in values to variables via the show_template/build_template calls.
So, it might look like:

$a = "#333333";

// where $template_a would be available in the template with the value of $a
build_template("template_name",array('template_a'=>$a));

And perhaps there would be a default, assumed variable for templates
with only ONE variable, again for ease of use, so in this case you might
call it like:

// where '#cccccc' gets put in a variable called '$template_default_variable' or something...
build_template("set_bgcolor_attribute",'#cccccc');

Again, the parameter lists here are just examples; we also need the
other option to specify an alternative name for the storage of this
built template in the template manager, and so this parameter list order
isn't intended to be the real one.
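
A runnable sketch of both call styles follows.  It differs from the
calls above in taking the template array as an explicit first argument
(so the example is self-contained), and the default variable name
template_default_variable is just the placeholder name suggested above.

```php
<?php
/**
 * Build a template from inputs passed with the call.  An array supplies named
 * variables; a scalar is stored under "template_default_variable".
 */
function build_template(array $templates, string $name, $input = []): string
{
    $variables = is_array($input) ? $input : ['template_default_variable' => $input];
    return preg_replace_callback('/\$(\w+)/', function (array $m) use ($variables) {
        return isset($variables[$m[1]]) ? (string) $variables[$m[1]] : '';
    }, $templates[$name]);
}

// Hypothetical templates using $name markers.
$templates = [
    'template_name'         => '<td bgcolor="$template_a">cell</td>',
    'set_bgcolor_attribute' => 'bgcolor="$template_default_variable"',
];

$a = "#333333";
echo build_template($templates, 'template_name', ['template_a' => $a]), "\n";
echo build_template($templates, 'set_bgcolor_attribute', '#cccccc'), "\n";
```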

Quincy
Tem...@be...