Here’s a little idea that might improve the web for everyone. I don’t know how to draft or submit a Request For Comments—I could have read RFC 2026 (BCP 9) but I wasn’t interested—but if anyone would like to see this through, I hope you’ll contact me.
We could improve the overall performance and reduce the request load on most of the world wide web if servers supported a way of sending in one response some or all of the resources that will certainly or likely be requested (GET) as a result of parsing the requested resource.
A sufficient implementation might be possible using existing RFCs. Perhaps a media range of “Multipart” in the Accept request-header field could be used to announce that the client can accept such responses.
The ideal implementation might include optimizations for client and proxy caches, such as a Cached request-header whose value would list any already-cached items, along with their Last-Modified, ETag, or other conditional request-header values as appropriate, so that the server can exclude them from the response.
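To make the Cached idea concrete, here is a minimal sketch in Python. The header syntax (comma-separated `uri="etag"` pairs) and the function names `parse_cached` and `parts_to_send` are assumptions for illustration only; the proposal does not define a wire format, and a real one would need to cope with URIs that contain commas or equals signs.

```python
def parse_cached(header_value):
    """Parse a hypothetical Cached header such as '/a.css="v1", /b.png="v7"'."""
    cached = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        uri, _, etag = item.partition("=")
        cached[uri.strip()] = etag.strip().strip('"')
    return cached


def parts_to_send(current_etags, cached_header):
    """Return the URIs whose current ETag differs from what the client reports."""
    cached = parse_cached(cached_header)
    return [uri for uri, etag in current_etags.items() if cached.get(uri) != etag]


# Hypothetical server-side view of the secondary resources for one page.
current = {"/style.css": "v1", "/bg.png": "v8"}
print(parts_to_send(current, '/style.css="v1", /bg.png="v7"'))
```

In this sketch the stylesheet is skipped because the client's copy is current, while the image is bundled because its ETag has changed.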
Static files served in this manner might be parsed by the web server in order to discover which, if any, other resources (images, audio, stylesheets, DTDs, etc.) should be sent. The server might take cues from a saved list, such as a cache of previous parsing results or a manually generated list.
Servers generating resources dynamically might take cues from the program state, or the page generation scripts might cue the server by passing data directly or as a value of an Include response-header. When a proxy detects the Include response-header along with a single-part response, it may assume that the server was incapable of providing a multi-part response and convert the response into a multi-part response if the proxy has a valid copy or wishes to pre-fetch a copy of any or all of the resources indicated by Include.
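A proxy's handling of Include could look roughly like the following Python sketch. It assumes the Include value is a comma-separated list of URIs, which the proposal does not specify, and `plan_bundle` is a hypothetical helper name.

```python
def plan_bundle(response_headers, proxy_cache):
    """Split the URIs named by a hypothetical Include header into those the
    proxy can attach from its cache and those it would have to pre-fetch."""
    include = response_headers.get("Include", "")
    uris = [u.strip() for u in include.split(",") if u.strip()]
    attach = [u for u in uris if u in proxy_cache]
    prefetch = [u for u in uris if u not in proxy_cache]
    return attach, prefetch


# A single-part response from an origin server that could not bundle.
headers = {"Content-Type": "text/html", "Include": "/style.css, /bg.png"}
cache = {"/style.css": b"body { margin: 0 }"}
print(plan_bundle(headers, cache))
```

Whether the proxy actually pre-fetches the missing items is a policy choice; the sketch only reports what would be needed.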
A typical request might proceed in this way:
- Client requests /index.html from example.com with Accept: Multipart
- Server finds index.html and discovers that its display will require a PNG file as a background.
- Server responds:
{response-headers}
{body-part (text/html)}
--boundary
{body-part (image/png)}
--boundary--
In the preceding example, there is no explicit request for the background image, but the client receives it in the payload of the initial response.
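The response in the example above can be approximated with Python's standard `email` package, which already knows how to assemble MIME multipart bodies. This is only a sketch: the choice of `multipart/mixed`, the literal boundary string, the file names, and the use of Content-Location to label each part are all assumptions rather than part of the proposal.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_bundle(html, png_bytes):
    """Assemble a two-part body: the page plus its background image."""
    bundle = MIMEMultipart("mixed", boundary="boundary")
    page = MIMEText(html, "html", "utf-8")
    page["Content-Location"] = "/index.html"      # assumed labelling scheme
    image = MIMEImage(png_bytes, _subtype="png")  # subtype given, no sniffing
    image["Content-Location"] = "/background.png"
    bundle.attach(page)
    bundle.attach(image)
    return bundle.as_bytes()


def list_parts(raw):
    """Return (URI, media type) for each part, as a client would see them."""
    msg = message_from_bytes(raw)
    return [(p["Content-Location"], p.get_content_type()) for p in msg.get_payload()]


raw = build_bundle("<html><body>Hello</body></html>", b"\x89PNG placeholder bytes")
print(list_parts(raw))
```

The `email` package base64-encodes the parts by default, which a real HTTP implementation would probably avoid; the sketch is meant to show the structure, not the encoding.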
Specialized user agents may use the Accept request-header to specify which types of media they prefer to receive but there ought to be a way for agents to specify which types of media they prefer not to receive in multi-part responses. For example, a screen reader probably would want the server to bundle audio but not images. A Reject request-header could indicate unwanted media types or a zero q-value in the Accept header might serve to prevent the server or any proxies from attaching these types.
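The zero q-value option can be sketched in Python as follows. The parsing is deliberately simplified (it ignores quoted parameters and media-range precedence rules), and the function names `rejected_types` and `may_attach` are hypothetical.

```python
def rejected_types(accept_header):
    """Collect media ranges that the client marked with q=0."""
    rejected = set()
    for item in accept_header.split(","):
        media_range, *params = [p.strip() for p in item.split(";")]
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() != "q":
                continue
            try:
                if float(value) == 0:
                    rejected.add(media_range.lower())
            except ValueError:
                pass  # malformed q-value: ignore it in this sketch
    return rejected


def may_attach(content_type, accept_header):
    """Decide whether a server or proxy may bundle a part of this type."""
    rejected = rejected_types(accept_header)
    content_type = content_type.lower()
    main_type = content_type.split("/")[0]
    return not ({content_type, main_type + "/*", "*/*"} & rejected)


# A screen reader that wants audio bundled but not images.
accept = "text/html, audio/*, image/*;q=0"
print(may_attach("audio/mpeg", accept), may_attach("image/png", accept))
```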
A proxy handling a non-multi-part request from a client may request resources in multi-part mode and then cache and serve the individual parts as if each had been requested singly. Proxies may construct multi-part responses from parts retrieved individually and they may append additional parts according to any cue, such as by rendering web pages with the engine of their choice (perhaps taking a hint from the User-Agent request-header).
The existence of an entity in a multi-part response should not cause the user agent to display, execute, or otherwise handle the entity. User agents accepting multi-part responses should not store or execute any part which was not used during the course of rendering or interacting with the requested resource.
Does anybody else think that something like this could be beneficial?
October 1, 2007 at 8:47 pm |
I’m a huge fan of the concept of optimized downloading. I might take a different angle: What about returning a zip file with a manifest with a root document (index.html) and then other reference files as well with some sort of map for the browser to identify requests with files. That way you can have optimized compression and unified downloading. The biggest problem will be writing the server side code to parse an output file and then creating the map/zip/multi-part data.
October 1, 2007 at 9:50 pm |
Most of what you want exists.
1. You can do inline javascript
2. You can do inline css
3. You can do inline images (data:url), just not supported in IE.
At that point, you could serve most webpages with 1 request. Downside is caching.
I think #2 (if I understand it correctly) is what makes it too complicated…
You’re now talking about an HTML parser… and something that allows for the mess that most sites call HTML. Not to mention two parsers that can fail: the server side or the client side. There’s also the issue of overhead on this.
Another issue is caching… how do you cache when everything is inline? It would reduce effectiveness if something like If-Modified-Since is supported, since UA’s could have different things in cache… meaning your server side cache or CDN could have hundreds or even thousands of variations to cache. You can break it up and cache elements, but you still have to reassemble… making even caching not very efficient.
It’s a good idea, but puts a huge burden on the web.
October 1, 2007 at 10:25 pm |
[...] Skelton posted a proposal for a mechanism by which a web server could send related objects to a resource in response to a single request. It’s quite an interesting idea, although I’m not sure [...]
October 1, 2007 at 10:31 pm |
I think it could be beneficial. It’s already been floated out there, too.
http://en.wikipedia.org/wiki/MHTML
Doesn’t seem to be a whole lot of interest, unfortunately, from those who build browsers.
October 1, 2007 at 10:33 pm |
You would also need a way to specify the URI of the related objects, perhaps the Content-Location header?
October 1, 2007 at 11:06 pm |
Randy: Just a guess here, let’s both Google “zhtml”
robertaccettura: Caches will treat each body-part exactly as if it were the response to a separate request. Every body-part begins with appropriate response-headers as if it had been requested separately.
Clay: Thanks for that link. I should have known better.
William: That’s perfect.
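The reply to robertaccettura above, where each body-part carries its own response-headers and is cached as if it had been requested separately, can be sketched in Python like this. The sample bytes, the use of Content-Location as the cache key (William's suggestion), and the `cache_parts` helper are illustrative assumptions.

```python
from email import message_from_bytes

# A hand-written multipart body; URIs and ETag values are made up.
SAMPLE = (
    b'Content-Type: multipart/mixed; boundary="boundary"\r\n'
    b"\r\n"
    b"--boundary\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Location: /index.html\r\n"
    b'ETag: "page-1"\r\n'
    b"\r\n"
    b"<html></html>\r\n"
    b"--boundary\r\n"
    b"Content-Type: text/css\r\n"
    b"Content-Location: /style.css\r\n"
    b'ETag: "css-4"\r\n'
    b"\r\n"
    b"body { margin: 0 }\r\n"
    b"--boundary--\r\n"
)


def cache_parts(raw_bundle, cache):
    """Store every labelled part under its own URI, as if fetched singly."""
    msg = message_from_bytes(raw_bundle)
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the container itself
        uri = part["Content-Location"]
        if uri is None:
            continue  # an unlabelled part cannot be cached by URI
        cache[uri] = {
            "content_type": part.get_content_type(),
            "etag": part["ETag"],
            "body": part.get_payload(decode=True),
        }
    return cache


cache = cache_parts(SAMPLE, {})
print(sorted(cache))
```

A later request for /style.css alone could then be answered from this cache, or revalidated with its stored ETag, without the cache knowing it originally arrived in a bundle.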
October 2, 2007 at 2:20 am |
Why isn’t http keep-alive enough?
October 2, 2007 at 2:45 am |
As has been pointed out, ZHTML and MHTML already exist for this purpose. The browser support isn’t very good, though. The problem with caching isn’t with the servers, but with conditional requests made by the client.
Client-side caching means that the User Agent, or browser, stores a resource (or entity) together with the ETag or Last-Modified date that the server returned in an HTTP header. When a request for the same entity is made, the ETag or Last-Modified date is sent in a so-called “conditional request” using the headers “If-None-Match” or “If-Modified-Since”. If the server concludes that the ETag or Last-Modified date received represents a stale entity, the whole resource is served to the browser. If not, a “304 Not Modified” response is sent, with an empty HTTP body. This saves enormous amounts of bandwidth.
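The conditional-request exchange just described can be reduced to a few lines of Python. This is a simplified sketch that handles only a single ETag in If-None-Match; the `respond` function and the ETag values are hypothetical.

```python
def respond(request_headers, current_etag, body):
    """Return (status, headers, body) for a GET that may be conditional."""
    if request_headers.get("If-None-Match") == current_etag:
        # The client's copy is still current: send no body.
        return 304, {"ETag": current_etag}, b""
    # No cached copy, or the cached copy is out of date: send everything.
    return 200, {"ETag": current_etag}, body


page = b"<html></html>"
print(respond({"If-None-Match": '"v2"'}, '"v2"', page)[0])
print(respond({"If-None-Match": '"v1"'}, '"v2"', page)[0])
```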
The problem with bundling up several responses with multipart is that the client doesn’t know beforehand which URIs that response is going to consist of. Thus, it doesn’t know which ETags or Last-Modified dates it’s going to have to send to the server to qualify for a “304 Not Modified”. Perhaps there’s a way to fix this, but I don’t think it’s going to be easy.
October 2, 2007 at 3:08 am |
What major benefits does this have over HTTP pipelining, where one single connection is used to fulfill the GET requests for the individual files?
After all the server now has to handle parsing the HTML possibly.
Do you not also lose out on the caching benefits of external resources (CSS, images, etc.) being used by multiple files?
October 2, 2007 at 4:15 am |
Sorry Matt, in which ways would your proposal be more effective than the already-existing HTTP 1.1 Keep-Alive?
October 2, 2007 at 6:14 am |
Not a bad idea, but I’m not sure that this would solve any problems that HTTP pipelining[1] wouldn’t if it was more widely supported.
[1] http://en.wikipedia.org/wiki/HTTP_pipelining
October 2, 2007 at 8:45 am |
I don’t think that such a thing would have any benefit at all. Modern webservers and browsers use Persistent Connections over HTTP anyway, so the way your request looks right now is:
Client connects via http.
Client requests /index.html
Server finds index.html and sends it.
Client parses and discovers it needs background.png and style.css.
Client requests background.png
Server sends background.png
Client requests style.css
Server sends style.css
Client disconnects from server.
One connection. No overhead. Only delay is delay in having to parse the file. So what are you saving by the multipart approach? The extra requests? These are tiny, less than one packet. The delay? Well, yeah, if your client has a slow parser, then it’s a problem, but your case requires that the webserver have a parser too, and some kind of caching mechanism if it wants to be at all speedy. It adds a lot of processing on the server side, which will add delay as well. I just don’t see that you gain anything here.
And anyway, there are potential security issues involved if a server can send unrequested things to a client whenever it pleases.
October 2, 2007 at 8:52 am |
One alternative approach might be to use the data: URI scheme. That works in everything but Internet Explorer.
October 2, 2007 at 9:37 am |
Hi Andy,
Excellent suggestion, I’ve always liked the multipart types for emailing. Might also be better for client-side Virus Scanners since they wouldn’t have to scan each individual server response.
October 2, 2007 at 10:00 am |
What need is there for this? HTTP’s solution to this issue is persistent connections: the performance difference between the two is negligible.
For something like email there is a clear use-case: attachments, as everything must be sent in one message, whereas with HTTP things can be sent in a (theoretically) infinite number of messages, which can all be sent over a single connection with a persistent connection.
October 2, 2007 at 11:19 am |
Is this what Keep-alive is for?
October 2, 2007 at 12:00 pm |
Still leaves the question: is it needed?
If IE supported data: url’s (RFC 2397), is there really a need for this? Technically it could do any data in the html.
October 2, 2007 at 12:31 pm |
Pipelining is great and it should not be tossed out. A browser can still benefit from requesting Multipart at times, such as on its first request to a domain.
The burden of parsing HTML at the server is not necessary to a solution and it was obviously not my intent to suggest that such parsing should occur on each page load. In the case of server-side parsing, MHTML caching is a foregone conclusion. Alternatively, authors may generate resource lists to cue the server. Dynamic page generators (e.g. WordPress) typically know of at least one resource that will be needed by the browser and often more: stylesheets, multiple javascript files, site-wide design elements, etc.
It is obvious that it would be impractical for a browser to request Multipart on subsequent loads from the same domain unless the requests contained a list of conditions, such as a unique ETag per URI. This does not negate the obvious benefit of sending in the initial response those resources that are practically guaranteed to be needed by the client.
Inline CSS, JS, and the data scheme are not suitable replacements for a Multipart encoding. Whereas Multipart permits the client and each proxy to treat each part as an individual resource, and whereas Multipart-capable clients may generate non-Multipart requests when they possess resources from that domain in a cache, inline resources permit none of these benefits because the secondary “parts” are inseparable from the primary resource.
I still believe this could benefit servers, proxies, clients, and people browsing the web. Consider how many “uniques” you get a month, times the number of secondary resources that could be sent along with each first request. For most sites, having your most popular entry pages pre-compiled into MHTML would almost certainly reduce your server load (fewer open connections) and improve the perceived performance of your site (time to fully render).