From: Vance S. <va...@mo...> - 2005-04-05 00:00:12

Folks,

For my latest make-work project I've decided to use yaws to serve up
content tailored to the client's browser capabilities.

Apache does the simplest part of this: choosing from a number of file
formats (e.g. foo.png, foo.gif, etc.) based on the "Accept:" header in
the request. I want to do better than that, though, and have my content
formatted into XHTML, HTML or WML on the fly. If I really feel like
eating up time I'll do some UAProf parsing and match the content to the
screen size etc.

Has anyone done any work in this area?

    -Vance

From: Claes W. <kl...@gm...> - 2005-04-05 12:43:25

> Has anyone done any work in this area?
>
>     -Vance

Not that I know of.

It's interesting though. Possibly a way forward is to pass an argument
or a Fun or something to the ehtml renderer, as:

    {ehtml, Format, Term}

or maybe even:

    {ewml, Term}

and add code to render correct WML in yaws_api.erl or something like
that. Should be straightforward.

There is also Language Content Negotiation to consider. All in all,
there is zero Content Negotiation support in Yaws today. I'll be more
than happy to help you add it though.

/klacke

From: Vance S. <va...@mo...> - 2005-04-05 16:53:26

Klacke,

For now I'm going to use arg_rewrite_mod to map .../foo to .../foo.yaws.
In foo.yaws I'll test Accept and output xhtml/html/wml as required.

I'm writing a module now to parse the Accept header. The hard part is
going to be the qs= parameters. The browsers provide their preferences;
however, the page author needs to set his own preferences as well. The
main example of this is that while technically application/xhtml+xml
might be the superior format, one of the WML formats may be preferred
for some mobile applications due to size/speed and the functionality
of soft keys.

Apache chooses from the available image formats based on size. That
would always prefer WBMP over PNG, which would be wrong in most cases,
so you need to create a foo.var file and set something like

    image/vnd.wap.wbmp;qs=0.1

so that it is given low rank.

    -Vance

On Tue, Apr 05, 2005 at 02:43:23PM +0200, Claes Wikstrom wrote:
}
}  It's interesting though. Possibly a way forward is to pass an argument
}  or a Fun or something to the ehtml renderer.
}
}  as:
}
}     {ehtml, Format, Term}
}
}  or maybe even:
}
}     {ewml, Term}
}
}  and add code to render correct wml in yaws_api.erl
}  or something like that. Should be straightforward.
}
}  There is also Language Content Negotiation to consider. All in all,
}  there is zero Content Negotiation support in Yaws today, I'll be more
}  than happy to help you to add it though.
}
}  /klacke

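A minimal sketch of the kind of Accept header parsing described in this
message, assuming only the client-side q= parameter matters at this
stage (the server-side qs= weighting is left out). The module name
accept_sketch and its functions are hypothetical and are not part of
yaws.

```erlang
-module(accept_sketch).
-export([parse/1]).

%% Parse an Accept header value into [{MediaType, Q}], highest q first.
%% A missing or unparsable q parameter is treated as 1.0.
%% Assumption: every comma-separated element names a media type.
parse(Header) ->
    Ranges = [parse_range(R) || R <- string:lexemes(Header, ",")],
    lists:sort(fun({_, Q1}, {_, Q2}) -> Q1 >= Q2 end, Ranges).

parse_range(Range) ->
    [Type | Params] = [string:trim(T) || T <- string:lexemes(Range, ";")],
    {string:lowercase(Type), qvalue(Params)}.

%% Find the first "q=..." parameter; other parameters are ignored.
qvalue([]) -> 1.0;
qvalue(["q=" ++ V | _]) -> to_float(V);
qvalue([_ | Rest]) -> qvalue(Rest).

%% Accept both "0.8" and "1" style values.
to_float(V) ->
    case string:to_float(V) of
        {F, _} when is_float(F) -> F;
        {error, _} ->
            case string:to_integer(V) of
                {I, _} when is_integer(I) -> float(I);
                {error, _} -> 1.0
            end
    end.
```

For instance, accept_sketch:parse("text/html;q=0.8, application/xhtml+xml")
puts application/xhtml+xml (implicit q of 1.0) ahead of text/html. The
page author's own qs= preferences would still have to be combined with
these values before choosing a format.
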
From: Claes W. <kl...@gm...> - 2005-04-06 20:39:12

On Apr 5, 2005 11:56 PM, Vance Shipley <va...@mo...> wrote:
> Klacke,
>
> For now I'm going to use arg_rewrite_mod to map .../foo to .../foo.yaws
> In foo.yaws I'll test Accept and output xhtml/html/wml as required.
>
> I'm writing a module now to parse the Accept header.

OK, if you need me to do anything, let me know.

/klacke

From: Vance S. <va...@mo...> - 2005-04-07 06:02:45

Klacke,

Having done some research I can see that this issue is more
complicated than I had realized. The Apache mod_negotiation module does
more than I knew. It supports transparent content negotiation
(RFC 2295/2296), which utilizes features of HTTP/1.1.

The simple way (HTTP/1.0) this works is that the client sends an
"Accept:" header field with a list of media types it can support in a
GET. The server then chooses the most appropriate version of the
resource and returns it.

With transparent content negotiation the server returns a list of the
available versions of the resource, and the client then chooses which
it prefers and requests that in a normal GET. This requires a new
HTTP/1.1 response (example from RFC 2295):

    HTTP/1.1 300 Multiple Choices
    Date: Tue, 11 Jun 1996 20:02:21 GMT
    TCN: list
    Alternates: {"paper.1" 0.9 {type text/html} {language en}},
                {"paper.2" 0.7 {type text/html} {language fr}},
                {"paper.3" 1.0 {type application/postscript}
                    {language en}}
    Vary: negotiate, accept, accept-language
    ETag: "blah;1234"
    Cache-control: max-age=86400
    Content-Type: text/html
    Content-Length: 227

    <h2>Multiple Choices:</h2>
    <ul>
    <li><a href=paper.1>HTML, English version</a>
    <li><a href=paper.2>HTML, French version</a>
    <li><a href=paper.3>Postscript, English version</a>
    </ul>

The body is for manual selection in lieu of automatic agent selection
support.

The two methods may be combined, so the server may choose to return a
particular version of a resource directly if the client's Accept
headers make it clear which is appropriate.

So it seems to me that what is needed to fully support this
functionality is to have a .yaws file return the alternatives and the
server implement the logic, in conjunction with the client, to choose
the full URI of the resource variant.

For example, the URI http://yaws.hyber.org/paper.yaws is requested.
The paper.yaws file must provide the list of alternatives to the
server (here shown using the syntax of RFC 2295):

    {"paper.html.en.yaws" 0.9 {type text/html} {language en}},
    {"paper.html.fr.yaws" 0.7 {type text/html} {language fr}},
    {"paper.ps" 1.0 {type application/postscript} {language en}}

The server would compare this to the client's request and decide
whether to proceed with transparent content negotiation (a list
response) or to take the shortcut of choosing a resource version
directly (a choice response).

So we need a new return value for out/1. For example:

    {alternates, VariantList}

    VariantList = [VariantDescription | FallbackVariant | ListDirective]
    VariantDescription = {URI, SourceQuality, [VariantAttribute]}
    URI = string()
    SourceQuality = string()
    VariantAttribute = {type, MediaType}
                     | {charset, Charset}
                     | {language, Language}
                     | {length, Length}
                     | {description, Description}
                     | {description, Description, Language}
    MediaType = string()
    Charset = string()
    Language = string()
    Length = string()
    Description = string()
    FallbackVariant = URI
    ListDirective = {proxy_rvsa, [RVSAVersion]} | ExtensionListDirective
    RVSAVersion = string()
    ExtensionListDirective = string()

The {features, FeatureList} VariantAttribute is left for further study.

Is the syntax proposed above at all consistent with the yaws style?

    -Vance

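To make the mapping between the proposed {alternates, VariantList}
term and the RFC 2295 header concrete, here is a small sketch that
renders a VariantList as the value of an "Alternates:" header. It is an
illustration only: the module name alternates_sketch is hypothetical,
and it assumes the list holds only VariantDescription tuples with the
type, charset, language and length attributes (no fallback variant, no
list directives, no descriptions).

```erlang
-module(alternates_sketch).
-export([header_value/1]).

%% Render [{URI, SourceQuality, [VariantAttribute]}] as the value of
%% an RFC 2295 "Alternates:" header, variants separated by ", ".
header_value(VariantList) ->
    lists:flatten(lists:join(", ", [variant(V) || V <- VariantList])).

%% One variant: {"URI" SourceQuality {attr value} ...}
variant({URI, SourceQuality, Attributes}) ->
    ["{\"", URI, "\" ", SourceQuality,
     [attribute(A) || A <- Attributes], "}"].

%% Only the simple two-element attributes are handled in this sketch.
attribute({Name, Value}) when Name =:= type; Name =:= charset;
                              Name =:= language; Name =:= length ->
    [" {", atom_to_list(Name), " ", Value, "}"].
```

Called with [{"paper.ps", "1.0", [{type, "application/postscript"},
{language, "en"}]}] it yields the same text as the third variant in the
RFC 2295 example above.
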
From: Claes W. <kl...@gm...> - 2005-04-07 08:55:21

On Apr 7, 2005 1:05 PM, Vance Shipley <va...@mo...> wrote:
> Klacke,
>
> Having done some research I can see that this issue is more
> complicated than I had realized.

As usual.

> Is the syntax proposed above at all consistent with the yaws style?

It certainly is.

/klacke

From: Vance S. <va...@mo...> - 2005-04-07 16:40:10

Klacke,

It occurs to me that with the method I propose the server will need to
remember the resource variant descriptions. For example, out/1 returns:

    {alternates,
     [{"diagram.en.pdf", "0.9", [{type, "application/pdf"},
                                 {language, "en"}]},
      {"diagram.fr.pdf", "0.9", [{type, "application/pdf"},
                                 {language, "fr"}]},
      {"diagram.en.gif", "0.7", [{type, "image/gif"},
                                 {language, "en"}]},
      {"diagram.fr.gif", "0.7", [{type, "image/gif"},
                                 {language, "fr"}]}]}

After negotiation a request will be made for one of these binary
files. Unless the server remembers the variant descriptions for these
resources it will have to fall back on some other method to determine
the appropriate Content-*: header fields.

What do you think?

In the Apache implementation the web author doesn't create the
Alternates: header directly as I propose, but instead creates a static
file to define the Content-*: header fields for individual URIs.
Alternatively, the server is left to decide based on matching the file
extensions.

    -Vance

From: Claes W. <kl...@gm...> - 2005-04-08 08:15:54

On Apr 7, 2005 11:42 PM, Vance Shipley <va...@mo...> wrote:
> Klacke,
>
> It occurs to me that with the method I propose the server will
> need to remember the resource variant descriptions. For example
> out/1 returns:
>
>    {alternates,
>     [{"diagram.en.pdf", "0.9", [{type, "application/pdf"},
>                                 {language, "en"}]},
>      {"diagram.fr.pdf", "0.9", [{type, "application/pdf"},
>                                 {language, "fr"}]},
>      {"diagram.en.gif", "0.7", [{type, "image/gif"},
>                                 {language, "en"}]},
>      {"diagram.fr.gif", "0.7", [{type, "image/gif"},
>                                 {language, "fr"}]}]}
>
> After negotiation a request will be made for one of these binary
> files. Unless the server remembers the variant descriptions for
> these resources it will have to fall back on some other method to
> determine the appropriate Content-*: header fields.

This is possibly implementable in the server process. It needs
experimentation, as well as a deep understanding of how the content
negotiation is really supposed to work.

That the server needs to remember anything sounds as if it breaks the
stateless-ness of HTTP, but I know too little about HTTP content
negotiation (I haven't yet even read the relevant parts of the RFC)
to say.

The server process which ships the {alternates, Alt} header could
remember this. Then it would be valid for that particular socket only,
or does it have to be remembered for the "session"?

> What do you think?

I don't think anything at this stage :-)

The question of whether it has to be remembered for the session or
just for a single socket determines the implementation entirely. The
concept of "session" is a bit muddy and is not at all part of HTTP
(it is usually solved with server-side state and cookies).

> In the Apache implementation the web author doesn't create the
> Alternates: header directly as I propose but instead creates a
> static file to define the Content-*: header fields for individual
> URIs. Alternatively the server is left to decide based on matching
> the file extensions.

/klacke

From: Vance S. <va...@mo...> - 2005-04-08 14:51:26

On Fri, Apr 08, 2005 at 10:15:49AM +0200, Claes Wikstrom wrote:
}
}  That the server needs to remember anything sounds as if it breaks the
}  stateless-ness of HTTP [....]

The problem isn't in the HTTP protocol but in the proposed method of
generating the resource variant descriptions.

My proposal was to use a .yaws file to dynamically generate the
resource variant descriptions for a given URI. This seemed like the
yaws way of doing things. The problem is that it generates descriptions
of other URIs. Those URIs may be fetched in the normal way by the
client.

The server should implement the remote variant selection algorithm
(RVSA), that much is clear. The proposal was for the .yaws file to
return {alternates, VariantList} describing the individual variants for
the URI requested. In our example the URI requested was
"/diagram.yaws". There were four variants available:

    {alternates,
     [{"diagram.en.pdf", "0.9", [{type, "application/pdf"},
                                 {language, "en"}]},
      {"diagram.fr.pdf", "0.9", [{type, "application/pdf"},
                                 {language, "fr"}]},
      {"diagram.en.gif", "0.7", [{type, "image/gif"},
                                 {language, "en"}]},
      {"diagram.fr.gif", "0.7", [{type, "image/gif"},
                                 {language, "fr"}]}]}

The server will use its RVSA implementation (which I've just about
completed) to decide whether it can determine definitively which is the
best variant for this client. If it can, it returns the choice
directly:

    ---------->  GET /diagram.yaws HTTP/1.1
    <----------  HTTP/1.1 200 OK

In this case we could construct accurate Content-*: headers from the
variant description above:

    Content-Type: application/pdf
    Content-Language: en

The problem comes in when RVSA isn't able to determine definitively
which variant to choose. In this case it returns a list response which
includes an Alternates: header with the complete variant description
list. The user agent then makes its own choice from the described
variants and follows up with a normal GET to retrieve the URI for the
individual variant:

    ---------->  GET /diagram.yaws HTTP/1.1
    <----------  HTTP/1.1 300 Multiple Choices
    ---------->  GET /diagram.en.pdf HTTP/1.1
    <----------  HTTP/1.1 200 OK

At this point the server will not have the {alternates, VariantList}
any more and will need to rely upon file extensions to construct the
Content-*: headers. In this example we could potentially do so using
file names; however, bringing in the dimensions of charset and features
makes this too complex to accomplish.

We could also do it with another .yaws file for the variant choice:

    {alternates,
     [{"diagram.en.pdf.yaws", "0.9", [{type, "application/pdf"},
                                      {language, "en"}]},
      {"diagram.fr.pdf.yaws", "0.9", [{type, "application/pdf"},
                                      {language, "fr"}]},
      {"diagram.en.gif.yaws", "0.7", [{type, "image/gif"},
                                      {language, "en"}]},
      {"diagram.fr.gif.yaws", "0.7", [{type, "image/gif"},
                                      {language, "fr"}]}]}

The diagram.en.pdf.yaws file could dynamically build the Content-*:
headers and output the pdf file. The problem with this is that it
requires coordination between the two .yaws files.

My first thought was to push the file into the cache along with the
appropriate headers (I haven't looked at how the cache works, however).
I'm not sure this is an elegant solution either.

An interesting dimension to the problem is that proxies are allowed to
implement RVSA and use it in a couple of ways:

    User Agent                  Proxy                    Server
    ----------                  -----                    ------
    -----------GET---------->
                                 ---------------------->
                                 <----List Response-----
    <-------List Response----
    -----------GET---------->
                                 ---------GET---------->
                                 <----Choice Response---
    <-----Choice Response----

Some time later another user agent requests the same URI:

    -----------GET---------->
    <-------List Response----

The proxy may also cache the received variants and return that:

    -----------GET---------->
    <-----Choice Response----

Or it may have cached the list response but not yet the choice:

    -----------GET---------->
    <-------List Response----
    -----------GET---------->
                                 ---------GET---------->
                                 <----Choice Response---
    <-----Choice Response----

So the root of the issue is that although the intention is to describe
multiple variants of a URI, it does so by in fact describing other
URIs. So the Apache approach starts to make more sense. We need to
build descriptions of the full URIs.

One way to do this may be to have the .yaws file of the requested URI
simply list the other URIs which are variants of the resource, along
with their preferences:

    {alternates, [{"diagram.en.pdf", "0.9"},
                  {"diagram.en.gif", "0.8"},
                  {"diagram.yaws", "0.5"}]}

The server could then use the default file extension method to
determine the variant descriptions. In the case of the file with a
.yaws extension it could read a more complete description from the
file itself somehow.

Maybe I'm trying to overload the .yaws filetype. Hmmm ...

    -Vance

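The choice-versus-list decision discussed in this message can be
illustrated with a deliberately simplified sketch. It is not the
RVSA/1.0 algorithm of RFC 2296 and not the implementation mentioned
above: it scores each variant only as source quality times the
client's quality for the media type, ignores language, charset and
features, and treats any tie for first place as "cannot decide". The
module name select_sketch is hypothetical.

```erlang
-module(select_sketch).
-export([negotiate/2]).

%% Accepted: [{MediaType, Q}] with float Q, e.g. from the Accept header.
%% Variants: [{URI, SourceQuality, Attributes}] as in {alternates, ...}.
%% Returns {choice, URI} when exactly one variant scores highest (and
%% above zero), otherwise the atom list, meaning "send a 300 response".
negotiate(Accepted, Variants) ->
    Scored = [{score(Accepted, V), URI} || {URI, _, _} = V <- Variants],
    case lists:reverse(lists:sort(Scored)) of
        [{Best, URI}] when Best > 0 -> {choice, URI};
        [{Best, URI}, {Next, _} | _] when Best > Next -> {choice, URI};
        _ -> list
    end.

%% Overall quality = source quality * client quality for the type.
%% Assumption: SourceQuality is written with a decimal point ("1.0",
%% not "1"), since list_to_float/1 rejects integers.
score(Accepted, {_URI, SourceQuality, Attributes}) ->
    Type = proplists:get_value(type, Attributes),
    list_to_float(SourceQuality) * proplists:get_value(Type, Accepted, 0.0).
```

With the four diagram variants above and a client that accepts
application/pdf and image/gif, the English and French PDFs tie, so the
sketch returns list. That is the case in which the server would have to
send the Alternates: header and later serve a variant without the
description at hand.
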
From: Claes W. <kl...@gm...> - 2005-04-11 08:16:00

> The diagram.en.pdf.yaws file could dynamically build the Content-*:
> headers and output the pdf file. The problem with this is that it
> requires coordination between the two .yaws files.
>
> My first thought was to push the file into the cache along with the
> appropriate headers (I haven't looked at how the cache works however).

Good investigation and good explanation of the problem!

As for the cache: I've started to write a yaws internals document but
it isn't finished yet. Each virtual server, which is represented by a
#sconf{} record, has an ets table which is used as a cache.

The purpose of the cache is to cache content in RAM to speed up
delivery.

The key in the cache is the Path for the URL. Take a look at
yaws_server:cache_file/4 and you'll see:

    E = SC#sconf.ets,

    .....

    ets:insert(E, {{url, Path}, now_secs(), UT2}),
    ets:insert(E, {{urlc, Path}, 1}),

It would be possible to use this mechanism to propagate information
from one yaws file to another. But it's certainly not crystal clear,
since if the actual paths to the different files are different, some
hackish naming scheme would have to be adopted. Unclear.

> I'm not sure this is an elegant solution either.

No,

> Maybe I'm trying to overload the .yaws filetype. Hmmm ...

To be continued .....

/klacke

From: Vance S. <va...@mo...> - 2005-04-12 17:44:17

What is needed is a general way to tag URIs which may be served by the
yaws server with optional metainformation. This information would be
used to create entity headers (e.g. Content-Language) for simple GETs
as well as drive the new content negotiation functionality.

Currently a Content-Type entity header is created when a URI resolves
to a file name and the suffix matches an entry in the mime.types file.
One could come up with a naming scheme which would also allow us to
derive a Content-Language (e.g. paper.en.pdf); however, this would
provide only the most minimal support. You might have a bilingual
document, in which case the following header would be valid:

    Content-Language: en, fr

One approach, the one used in Apache, is to have a file containing
metainformation with the same name as a URI but with a specific
suffix. This only works for URIs which translate directly to file
names. A more appropriate scheme for yaws would be one which fully
supported any URI, including those with no corresponding file.

The ets table seems the ideal place to store this metainformation.

On Mon, Apr 11, 2005 at 10:15:35AM +0200, Claes Wikstrom wrote:
}
}  The purpose of the cache is to cache content in RAM to speed up
}  delivery.
}
}  The key in the cache is the Path for the URL, take a look at
}  yaws_server:cache_file/4 and you'll see:
}
}     E = SC#sconf.ets,
}
}     .....
}
}     ets:insert(E, {{url, Path}, now_secs(), UT2}),
}     ets:insert(E, {{urlc, Path}, 1}),
}
}  It would be possible to use this mechanism to propagate information from
}  one yaws file to another. But it's certainly not crystal clear since
}  if the actual paths to the different files are different, some hackish
}  naming scheme would have to be adopted. Unclear.

A .yaws file could return {alternates, VariantList} to indicate that
the server should use content negotiation. VariantList is a list of
URIs and optional metainformation. The metainformation is stored in
the server's ets table with the URI as key. When a request for that URI
is made, the metainformation is added to, or overrides, that otherwise
created.

For example:

    ---------->  GET /diagram.yaws HTTP/1.1

        diagram.yaws has an out/1 which returns:

            {alternates,
             [{"diagram.pdf", "0.9", [{type, "application/pdf"},
                                      {language, "en"}]},
              {"diagram.gif", "0.8", [{type, "image/gif"},
                                      {language, "en"}]}]}

        the server creates entries for each of the two URIs
        ("diagram.pdf" and "diagram.gif") in the ets table with the
        Content-Type and Content-Language metainformation

        the server returns a list response:

    <----------  HTTP/1.1 300 Multiple Choices

    ---------->  GET /diagram.pdf HTTP/1.1

        the server handles the request in the normal way; however,
        when building the #urltype{} record it uses the
        metainformation from the ets table

    <----------  HTTP/1.1 200 OK

The example above most closely resembles the current use of the ets
table for short-lived caching. However, if we were to allow the
metainformation records to be added statically, we could also support
the addition of entity headers (e.g. Content-Language) for simple GETs.

More to consider ...

    -Vance

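A rough sketch of the metainformation idea from this message, using a
plain ets table. It does not touch the real yaws cache or the
#urltype{} record; the module name meta_sketch, the {meta, URI} key and
the header merging rule are all assumptions made for illustration, and
only the type and language attributes are mapped to headers.

```erlang
-module(meta_sketch).
-export([new/0, store/2, headers/3]).

%% Create a table to hold per-URI metainformation.
new() ->
    ets:new(meta_sketch, [set, public]).

%% Record the attributes of every variant under the key {meta, URI}.
store(Tab, {alternates, VariantList}) ->
    [ets:insert(Tab, {{meta, URI}, Attributes})
     || {URI, _SourceQuality, Attributes} <- VariantList],
    ok.

%% Merge stored metainformation over the headers the server would have
%% produced anyway (Defaults). Stored values override defaults with
%% the same header name; everything else is passed through unchanged.
headers(Tab, URI, Defaults) ->
    case ets:lookup(Tab, {meta, URI}) of
        [{_, Attributes}] ->
            Stored = [{header_name(K), V}
                      || {K, V} <- Attributes,
                         lists:member(K, [type, language])],
            Stored ++ [H || {Name, _} = H <- Defaults,
                            not lists:keymember(Name, 1, Stored)];
        [] ->
            Defaults
    end.

header_name(type) -> "Content-Type";
header_name(language) -> "Content-Language".
```

After store/2 has been called with the diagram.yaws result above, a
later headers(Tab, "diagram.gif", Defaults) call returns image/gif and
en even though the request for diagram.gif no longer carries the
variant list. A URI with no stored entry simply keeps its default
headers.
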
From: Vlad D. <vla...@ho...> - 2005-04-12 18:10:41

> What is needed is a general way to tag URIs which may be served by
> the yaws server with optional metainformation. This information
> would be used to create entity headers (e.g. Content-Language) for
> simple GETs as well as drive the new content negotiation functionality.

Hi,

I apologize if I am going to express a silly idea, this isn't my best
area of expertise...

Wouldn't it be possible to have the URI point to a file that contains
both headers and content?

I was looking the other week at StringTemplate
(www.stringtemplate.org), which is a cool template engine IMHO, and
wondered if it couldn't be used with yaws. The reason is that I think
the "<erl> out()-> stuff end. </erl>" notation is awkward. Better to
have a simple template that encompasses even headers (when needed) and
fully separates presentation from the data source.

If it looks interesting, I have started to implement something similar
in Erlang. I don't remember why I didn't finish it ;-)

regards,
Vlad
