On 01/07/2013, at 1:52 AM, Steve Vinoski <vinoski@ieee.org> wrote:

Specifically, I want to know if it is possible to 'tune' the HTML generation so that it produces source that is neat and tidy.

What kind of "tuning" do you have in mind? If you want tidy HTML, why not use the tidy tool itself? See https://github.com/w3c/tidy-html5 or the original at http://tidy.sourceforge.net .

Note also that Yaws has exhtml, for similar support for XHTML.

Also, you might consider trying these questions on the Yaws mailing list: https://lists.sourceforge.net/lists/listinfo/erlyaws-list

Sorry for posting this question to the wrong list - I didn't know that ehtml was a YAWS-specific thing. I thought it was a generic Erlang thing.

The reason I don't want to use tidy (or any other tool) is that I actually use formatting and structure to help me craft and troubleshoot my pages. Seeing the raw HTML helps me write better code - it's a nice feedback loop. So although using tidy might end up producing better code, it wouldn't help me become a better coder.

Besides, the shortfalls in what ehtml produces are only relatively minor (as far as I can see so far). Below is some output from a script that (I believe) came from somewhere on the hyber.org site:


Notice how </head> is on the end of the line as opposed to on a line by itself.

<h4>The headers passed to us were:</h4><hr />

<hr /> not on its own line

<p>Connection: keep-alive</p></li>

Even though <p> shouldn't be inside of a <li>, it seems to have a \n in front of it.

<p>Dnt: 1</p></li></ol>

Three close tags at the end.

<li>method: GET</li>
<li>path: {abs_path,"/testYaws.yaws"}</li>
<li>version: {1,1}</li></ul><hr />

It would be nice if all <li> were indented and, again, a batch of close tags at the end.

So, in summary, only a few changes would make for, imho, much more readable code:

1) </ul> (and presumably </ol>) tags should, I think, be on their own line

2) <hr /> should be on its own line

3) <li></li> should stay on the same line (as it currently is) but be indented by either a tab or 2/4 spaces

4) <p> tags should not be preceded by a \n (which may or may not be an artefact of the code that produced it: {ol, [],lists:map(fun(S) -> {li,[], {p,[],S}} end,H)}; I'm not sure).
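
To make point 4 concrete, here is the same expression wrapped in a function, next to a variant that drops the <p> wrapper. This is only a sketch: the module and function names (headers_list, with_p, plain) are made up for illustration, and whether dropping the wrapper also removes the stray newline depends on how Yaws renders <p>, which is an assumption here rather than something confirmed.

```erlang
-module(headers_list).
-export([with_p/1, plain/1]).

%% The shape used in the page: each header string S ends up as
%% <li><p>S</p></li>.
with_p(H) ->
    {ol, [], lists:map(fun(S) -> {li, [], {p, [], S}} end, H)}.

%% Hypothetical alternative: put the text directly inside the <li>,
%% so no <p> element (and nothing emitted around it) is involved.
plain(H) ->
    {ol, [], lists:map(fun(S) -> {li, [], S} end, H)}.
```

Both functions only build terms; nothing is rendered until the result is handed back inside an {ehtml, ...} tuple.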

I recognise that my personal preferences may be quite different to those of other folk, so one style (no matter whose it is) will not please everyone.

In an ideal world, each tag/set could be customised independently.  Maybe defaults would be in place but could be overridden by a set of style overrides?

{ehtml, [
{override, [hr], "\n<hr />\n"},
{override, [li], "    <li></li>\n"},
{override, [ul], "<ul>\n</ul>\n\n"},
{h4, [], "The headers passed to us were:"},
{ol, [], lists:map(fun(S) -> {li,[], {p,[],S}} end,H)},
{h4, [], "The request"},
{li, [], f("method: ~s", [Req#http_request.method])},
{li, [], f("path: ~p", [Req#http_request.path])},
{li, [], f("version: ~p", [Req#http_request.version])},
{h4, [], "Other items"},
{li, [], f("clisock from: ~p", [inet:peername(A#arg.clisock)])},
{li, [], f("docroot: ~s", [A#arg.docroot])},
{li, [], f("fullpath: ~s", [A#arg.fullpath])},
{h4, [], "Parsed query data"},
{pre, [], f("~p", [yaws_api:parse_query(A)])},
{h4, [], "Parsed POST data "},
{pre, [], f("~p", [yaws_api:parse_post(A)])}
]}

In essence, for each start and/or end tag you would want to be able to specify:

0 or more [spaces | tabs | newlines] either [before | after] the tag.
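
As a rough illustration of that rule, here is a minimal sketch of a renderer that takes a per-tag style. It is not how Yaws works and not a proposed patch: the module name ehtml_fmt, the render/2 function and the four-element style tuple are all invented for this example. It handles only {Tag, Attrs, Body} and {Tag, Attrs} tuples with string content, and does no escaping.

```erlang
-module(ehtml_fmt).
-export([render/2, demo/0]).

%% Hypothetical sketch, not part of Yaws.
%% Styles maps a tag to {BeforeOpen, AfterOpen, BeforeClose, AfterClose},
%% each a string of spaces, tabs and newlines. Unlisted tags get no padding.
render(Term, Styles) ->
    lists:flatten(r(Term, Styles)).

%% {Tag, Attrs, Body}: an element with content.
r({Tag, Attrs, Body}, Styles) when is_atom(Tag) ->
    {BO, AO, BC, AC} = style(Tag, Styles),
    Name = atom_to_list(Tag),
    [BO, "<", Name, attrs(Attrs), ">", AO,
     r(Body, Styles),
     BC, "</", Name, ">", AC];
%% {Tag, Attrs}: an empty element such as <hr />.
r({Tag, Attrs}, Styles) when is_atom(Tag) ->
    {BO, _AO, _BC, AC} = style(Tag, Styles),
    [BO, "<", atom_to_list(Tag), attrs(Attrs), " />", AC];
%% A list is either text (emitted as-is, no escaping) or a list of children.
r(List, Styles) when is_list(List) ->
    case io_lib:printable_list(List) of
        true  -> List;
        false -> [r(Child, Styles) || Child <- List]
    end.

style(Tag, Styles) ->
    maps:get(Tag, Styles, {"", "", "", ""}).

%% Attribute values are assumed to be strings.
attrs(Attrs) ->
    [[" ", atom_to_list(K), "=\"", V, "\""] || {K, V} <- Attrs].

%% Example styles: <ul> and </ul> on their own lines, <li> indented by
%% four spaces, <hr /> followed by a newline.
demo() ->
    Styles = #{ul => {"", "\n", "", "\n"},
               li => {"    ", "", "", "\n"},
               hr => {"", "", "", "\n"}},
    Page = [{ul, [], [{li, [], "method: GET"},
                      {li, [], "version: {1,1}"}]},
            {hr, []}],
    render(Page, Styles).
```

With those styles, ehtml_fmt:demo() returns a string with <ul> and </ul> on lines of their own, each <li> on an indented line, and <hr /> on a line of its own after the list. Keeping the style in a separate map, rather than inside the page term as in the override tuples above, is just one possible design; it keeps the page description itself unchanged.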

Were such a system in place, I would be much more inclined to use it, and I'm sure others would too. I could then get rid of my much less pleasing:

"<h1>" ++ integer_to_list(MyInfo) ++ "</h1>\n" ++
"<h2>Directories</h2>\n" ++
"<ul>\n<li>" ++ DirBlock ++ "</li>\n</ul>\n" ++
"<h2>Regular Files</h2>\n" ++
"<ul>\n<li>" ++ RegBlock ++ "</li>\n</ul>\n"

which reads poorly in YAWS but looks great in a browser.
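
For comparison, the concatenation above could be written as an ehtml term along these lines. This is a sketch under assumptions: the module and function names (listing_page, page/3) are invented, MyInfo is assumed to be an integer, and DirBlock and RegBlock are assumed to be plain text for a single list item. If they already contain markup such as "</li>\n<li>", this is not a drop-in replacement.

```erlang
-module(listing_page).
-export([page/3]).

%% Builds the same page structure as the string concatenation, but as
%% nested tuples. The layout of the generated HTML is then up to the
%% ehtml renderer rather than hand-placed "\n" characters.
page(MyInfo, DirBlock, RegBlock) when is_integer(MyInfo) ->
    {ehtml,
     [{h1, [], integer_to_list(MyInfo)},
      {h2, [], "Directories"},
      {ul, [], [{li, [], DirBlock}]},
      {h2, [], "Regular Files"},
      {ul, [], [{li, [], RegBlock}]}]}.
```

A Yaws out/1 function could return the result of listing_page:page(MyInfo, DirBlock, RegBlock) directly.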