Proposal: Multipart Web Requests

Here’s a little idea that might improve the web for everyone. I don’t know how to draft or submit a Request For Comments—I could have read RFC 2026 (BCP 9) but I wasn’t interested—but if anyone would like to see this through, I hope you’ll contact me.

We could improve the overall performance and reduce the request load on most of the world wide web if servers supported a way of sending in one response some or all of the resources that will certainly or likely be requested (GET) as a result of parsing the requested resource.

A sufficient implementation might be possible using existing RFCs. Perhaps a media range of “Multipart” in the Accept request-header field could be used to announce that the client can accept such responses.

The ideal implementation might include optimizations for client and proxy caches, such as a Cached request-header whose value would list already-cached items to exclude from the response, along with their Last-Modified, ETag, or other conditional request-header values as appropriate.
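
To make this concrete, a request combining the two ideas might look something like the following; the Cached header name comes from the paragraph above, while its value syntax and the paths and validators shown are only illustrative:

    GET /index.html HTTP/1.1
    Host: example.com
    Accept: Multipart
    Cached: /style.css;etag="abc123", /img/bg.png;last-modified="Sat, 01 Jul 2006 12:00:00 GMT"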

Static files served in this manner might be parsed by the web server in order to discover which, if any, other resources (images, audio, stylesheets, DTDs, etc.) should be sent. The server might take cues from a saved list, such as a cache of previous parsing results or a manually generated list.

Servers generating resources dynamically might take cues from the program state, or the page generation scripts might cue the server by passing data directly or as a value of an Include response-header. When a proxy detects the Include response-header along with a single-part response, it may assume that the server was incapable of providing a multi-part response and convert the response into a multi-part response if the proxy has a valid copy or wishes to pre-fetch a copy of any or all of the resources indicated by Include.
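
For example, a page generator that cannot itself produce multipart output might emit an ordinary single-part response carrying the proposed Include header, leaving a downstream proxy free to attach the listed parts; the value syntax here is only a guess:

    HTTP/1.1 200 OK
    Content-Type: text/html
    Include: /style.css, /img/bg.png

    <html>...</html>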

A typical request might proceed in this way:

  1. Client requests /index.html from example.com with Accept: Multipart
  2. Server finds index.html and discovers that its display will require a PNG file as a background.
  3. Server responds:
    {response-headers}
    {body-part (text/html)}
    --boundary
    {body-part (image/png)}
    --boundary--

In the preceding example, there is no explicit request for the background image, but the client receives it in the payload of the initial response.
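
On the wire, such a response might borrow the existing multipart syntax from MIME (RFC 2046), with each body-part carrying its own headers so that clients and caches can treat it as an independent entity. The boundary string, paths, and validators below are illustrative:

    HTTP/1.1 200 OK
    Content-Type: multipart/mixed; boundary="xyz"

    --xyz
    Content-Type: text/html
    Content-Location: http://example.com/index.html
    ETag: "a1"

    <html>...</html>
    --xyz
    Content-Type: image/png
    Content-Location: http://example.com/img/bg.png
    ETag: "b2"

    {PNG data}
    --xyz--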

Specialized user agents may use the Accept request-header to specify which types of media they prefer to receive, but there ought to be a way for agents to specify which types of media they prefer not to receive in multi-part responses. For example, a screen reader would probably want the server to bundle audio but not images. A Reject request-header could indicate unwanted media types, or a zero q-value in the Accept header might serve to prevent the server or any proxies from attaching these types.
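
With existing Accept semantics, a q-value of zero already means “not acceptable”, so a screen reader might send something like the following (the media types chosen are just an example):

    Accept: Multipart, text/html, audio/*;q=0.9, image/*;q=0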

A proxy handling a non-multi-part request from a client may request resources in multi-part mode and then cache and serve the individual parts as if each had been requested singly. Proxies may construct multi-part responses from parts retrieved individually and they may append additional parts according to any cue, such as by rendering web pages with the engine of their choice (perhaps taking a hint from the User-Agent request-header).

The existence of an entity in a multi-part response should not cause the user agent to display, execute, or otherwise handle the entity. User agents accepting multi-part responses should not store or execute any part which was not used during the course of rendering or interacting with the requested resource.

Does anybody else think that something like this could be beneficial?

Published by Andy Skelton, Code Wrangler @ Automattic (youtube.com/AndySkelton)

21 thoughts on “Proposal: Multipart Web Requests”

  1. I’m a huge fan of the concept of optimized downloading. I might take a different angle: what about returning a zip file containing a root document (index.html) and the other referenced files, along with some sort of manifest that maps requests to the files inside the archive so the browser can resolve them? That way you get optimized compression and unified downloading. The biggest problem will be writing the server-side code to parse an output file and then create the map/zip/multipart data.
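
    A rough sketch of that idea in Python, assuming a hypothetical bundle format in which a zip contains a manifest.json mapping request paths to archive members (the names and structure here are made up, not part of any standard):

      import io
      import json
      import zipfile

      # Hypothetical bundle: a zip whose manifest.json maps request paths
      # to member names and content types; "/index.html" is the root document.
      def build_bundle(resources):
          """resources: dict mapping request path -> (content_type, body bytes)."""
          buf = io.BytesIO()
          manifest = {}
          with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
              for i, (path, (ctype, body)) in enumerate(resources.items()):
                  member = "part-%d" % i
                  manifest[path] = {"member": member, "content-type": ctype}
                  z.writestr(member, body)
              z.writestr("manifest.json", json.dumps(manifest))
          return buf.getvalue()

      bundle = build_bundle({
          "/index.html": ("text/html", b"<html>...</html>"),
          "/img/bg.png": ("image/png", b"...png bytes..."),
      })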

  2. Most of what you want exists.
    1. You can do inline javascript
    2. You can do inline css
    3. You can do inline images (data:url), just not supported in IE.

    At that point, you could serve most webpages with 1 request. Downside is caching.

    I think #2 (if I understand it correctly) is what makes it too complicated…

    “Server finds index.html and discovers that its display will require a PNG file as a background.”

    You’re now talking about an HTML parser… and something that allows for the mess that most sites call HTML. Not to mention two parsers that can fail, on the server side or the client side. There’s also the issue of overhead on this.

    Another issue is caching… how do you cache when everything is inline? Even if something like If-Modified-Since were supported, effectiveness would drop, since UAs could have different things in cache… meaning your server-side cache or CDN could have hundreds or even thousands of variations to cache. You can break it up and cache elements, but you still have to reassemble… making even caching not very efficient.

    It’s a good idea, but puts a huge burden on the web.

  3. Randy: Just a guess here, let’s both Google “zhtml”

    robertaccettura: Caches will treat each body-part exactly as if it were the response to a separate request. Every body-part begins with appropriate response-headers as if it had been requested separately.

    Clay: Thanks for that link. I should have known better.

    William: That’s perfect.

  4. As has been pointed out, ZHTML and MHTML already exist for this purpose. The browser support isn’t very good, though. The problem with caching isn’t with the servers, but with conditional requests made by the client.

    Client-side caching means that the User Agent, or browser, stores a resource (or entity) together with the ETag or Last-Modified date the server sent in its response headers. When a request for the same entity is made, that ETag or Last-Modified date is used in a so-called “conditional request” with the headers “If-None-Match” or “If-Modified-Since”. If the server concludes that the ETag or Last-Modified date it received represents a stale entity, the whole resource is served to the browser. If not, a “304 Not Modified” response is sent, with an empty HTTP body. This saves enormous amounts of bandwidth (a small sketch of such a conditional request appears at the end of this comment).

    The problem with bundling up several responses with multipart is that the client doesn’t know beforehand which URIs that response is going to consist of. Thus, it doesn’t know which ETags or Last-Modified dates it’s going to have to send to the server to qualify for a “304 Not Modified”. Perhaps there’s a way to fix this, but I don’t think it’s going to be easy.
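
    A minimal sketch of such a conditional request, using Python’s standard http.client (the host, path, and ETag value are made up):

      import http.client

      cached_copy = b"body { color: black }"   # saved from an earlier response...
      cached_etag = '"abc123"'                 # ...along with its ETag

      conn = http.client.HTTPConnection("example.com")
      conn.request("GET", "/style.css", headers={"If-None-Match": cached_etag})
      resp = conn.getresponse()

      if resp.status == 304:   # 304 Not Modified: empty body, the cached copy is still good
          body = cached_copy
      else:                    # otherwise take the new entity and remember its validator
          body = resp.read()
          cached_etag = resp.getheader("ETag")
      conn.close()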

  5. What major benefits does this have over HTTP pipelining, where a single connection is used to fulfill the GET requests for the individual files?

    After all, the server now potentially has to handle parsing the HTML.

    Do you not also lose out on the caching benefits of external resources (CSS, images, etc.) being used by multiple files?

  6. I don’t think that such a thing would have any benefit at all. Modern webservers and browsers use Persistent Connections over HTTP anyway, so the way your request looks right now is:
    Client connects via http.
    Client requests /index.html
    Server finds index.html and sends it.
    Client parses and discovers it needs background.png and style.css.
    Client requests background.png
    Server sends background.png
    Client requests style.css
    Server sends style.css
    Client disconnects from server.

    One connection. No overhead. The only delay is having to parse the file. So what are you saving with the multipart approach? The extra requests? These are tiny, less than one packet. The delay? Well, yeah, if your client has a slow parser, then it’s a problem, but your case requires that the webserver have a parser too, and some kind of caching mechanism if it wants to be at all speedy. It adds a lot of processing on the server side, which will add delay as well. I just don’t see that you gain anything here (the sketch at the end of this comment shows the same single-connection flow in code).

    And anyway, there are potential security issues involved if a server can send unrequested things to a client whenever it pleases.
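
    For what it’s worth, that single-connection flow is easy to reproduce with Python’s standard http.client, which keeps an HTTP/1.1 connection open between requests (the host and paths are made up):

      import http.client

      # One TCP connection, several sequential requests (HTTP/1.1 keep-alive).
      conn = http.client.HTTPConnection("example.com")
      for path in ("/index.html", "/background.png", "/style.css"):
          conn.request("GET", path)
          resp = conn.getresponse()
          data = resp.read()   # each response must be fully read before the next request
          print(path, resp.status, len(data), "bytes")
      conn.close()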

  7. Hi Andy,

    Excellent suggestion, I’ve always liked the multipart types for emailing. Might also be better for client-side Virus Scanners since they wouldn’t have to scan each individual server response.

  8. What need is there for this? HTTP’s solution to this issue is persistent connections: the performance difference between the two is negligible.

    For something like email there is a clear use case: attachments, as everything must be sent in one message. With HTTP, things can be sent in a (theoretically) unlimited number of messages, all of which can travel over a single persistent connection.

  9. Pipelining is great and it should not be tossed out. A browser can still benefit from requesting Multipart at times, such as on its first request to a domain.

    The burden of parsing HTML at the server is not necessary to a solution, and it was obviously not my intent to suggest that such parsing should occur on each page load. In the case of server-side parsing, MHTML caching is a foregone conclusion. Alternatively, authors may generate resource lists to cue the server. Dynamic page generators (e.g. WordPress) typically know of at least one resource that will be needed by the browser and often more: stylesheets, multiple JavaScript files, site-wide design elements, etc.

    It is obvious that it would be impractical for a browser to request Multipart on subsequent loads from the same domain unless the requests contained a list of conditions, such as a unique ETag per URI. This does not negate the obvious benefit of sending in the initial response those resources that are practically guaranteed to be needed by the client.

    Inline CSS, JS, and the data scheme are not suitable replacements for a Multipart encoding. Whereas Multipart permits the client and each proxy to treat each part as an individual resource, and whereas Multipart-capable clients may generate non-Multipart requests when they possess resources from that domain in a cache, inline resources permit none of these benefits because the secondary “parts” are inseparable from the primary resource.

    I still believe this could benefit servers, proxies, clients, and people browsing the web. Consider how many “uniques” you get a month, times the number of secondary resources that could be sent along with each first request. For most sites, having your most popular entry pages pre-compiled into MHTML would almost certainly reduce your server load (fewer open connections) and improve the perceived performance of your site (time to fully render).

  10. To all of those who say that keep-alive makes this the same because the connection is already open: having an open connection and sending data over it are not the same thing. I can have an open connection to the server, but every time I ask for something, my request has to travel over that connection to the server (which takes between 50 and 200 ms, but can be much more) and the response has to travel back (I will receive the first byte after roughly the same time, 50 to 200 ms on average), so even on an open connection, up to half a second can be wasted waiting for each request. Obviously, having the connection already established avoids even more overhead, but it does not mean that communication over it is instantaneous just because the connection is there.

    Regarding pipelining, it is indeed the best solution ever, but still (nearly 3 years after this post) it is not used by browsers, not supported by servers, causes problems with proxies, and will break everything if you use comet-style stuff.

    Right now most CDNs support something similar to this idea, letting you code a single URL referring to multiple JS/CSS files that will be concatenated and sent back to the browser in a single GET, to save the time of multiple requests. This solution works when it comes to saving time, but not bandwidth, because individual JS or CSS files will not be cached one by one but as a single monolithic “thing”, so you either have to load all the JavaScript and all the CSS for every page, or have the same single JS file (which could have been cached and shared) streamed over and over because it is part of each “thing”.

    Another solution could be multipart HTTP responses: they have been there since the beginning of HTTP and suffer from the same problem of not giving the client a way to say “wait, I already have this”. Not all clients support them (and there is no way of knowing from the headers whether a client does), but where they are correctly supported they let you stream multiple files to the browser in a single response AND have the browser cache them separately, which, as said, is unfortunately useless in practice.

    People are now convinced that “pipelining” means working around the “two connections per server” limit browsers have by faking multiple domains, which makes things faster when you have a very good line connected to a mostly unused server (the typical web-developer setup) but wastes more bandwidth (more DNS requests, more connections opened, etc.).

    It seems that a way to actually speed up requests, reduce bandwidth, and use caches properly is still far from being widely employed, even though many different strategies to address some of these aspects have been around for years… which is a pity.
