RemoteCommunications.com has developed CHTML as described here. You can find further information about it at http://www.rctp.com

Ever Thought of Compressed HTML Files?

Do you sometimes wait longer for web pages than you want to, even when you have turned off graphics? No wonder: HTML files are getting more and more complex with tables, JavaScript, style sheets and DHTML, and the file size grows accordingly.

The sizes of dynamic HTML files are often already well above 10 KB and are quickly approaching 20 KB. Did you know that there are companies that keep an eye on the amount of image data on each web page to keep it fast to load? The target is to have no more than 25 KB of images per HTML file. They use the latest tricks to compress their images so that the page loads faster without losing image quality. So how come they are not compressing the HTML file itself, although HTML files can be shrunk to a third or a quarter of their original size? It is simply not possible yet, because no standard exists and no browser supports it.

Let us change the rules of the game!

I suggest implementing the following specification for compressed HTML files (CHTML for short):

CHTML files are just normal HTML files that are compressed into a zip file: one HTML file in one zip file. When the zip file goes on the server, it should have the extension chtml or cht so that the browser can identify it by its extension as an HTML file within a zip file.
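
As a rough illustration of the packaging rule above, here is a minimal Python sketch that uses the standard zipfile module. The function name write_chtml and the file names are my own assumptions for illustration and are not part of the proposal.

```python
import zipfile
from pathlib import Path


def write_chtml(html_path, chtml_path):
    """Pack exactly one HTML file into a zip archive (hypothetical helper)."""
    html_path = Path(html_path)
    with zipfile.ZipFile(chtml_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its bare name, without directory components.
        archive.write(html_path, arcname=html_path.name)
    # Return the uncompressed and compressed sizes so the saving can be checked.
    return html_path.stat().st_size, Path(chtml_path).stat().st_size
```

Calling write_chtml("page.html", "page.chtml") would produce an ordinary zip archive, so any unzip tool can still open it by hand.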

I believe that a commonly used form of compression should be used, so that private users without the latest software can still communicate their ideas without a severe disadvantage in distributing their thoughts. Obviously you could try to push your own standard if you do not care about the user. (Dogbert would do it, because it is an easy way to tell the world that you want to rule it.) Another advantage of using a common compression format like Zip is that users without the latest browser will still be able to read the files after downloading and manually decompressing them.

Advantages and Disadvantages

The advantage is that the files will load three to four times faster, depending on the compression ratio. The disadvantage is that it will take some additional milliseconds to decompress the received file. (In my examples I will assume 250 milliseconds to make it easier to understand, although I am aware that the real time for on-the-fly decompression can be much shorter, as software products like Stacker and DoubleSpace show.) I will now show that the disadvantage is so small that it is really hard to notice.

If an HTML file would normally take 6 sec to load, then compressed it would take only 1.5 or 2 sec plus the 0.25 sec for decompression. That is still very much faster. The disadvantage of decompressing appears only at very small file sizes. Assuming compression only down to a third (HTML files usually tend more towards a quarter) and a download time of only 0.375 sec for the uncompressed HTML file, the advantage of compressing disappears: the compressed file would need 0.125 sec to load, and the 0.25 sec for decompression mentioned above brings it to the same total of 0.375 sec. Any smaller, faster-loading file would actually be slightly slower to load as a compressed file than as a plain HTML file, but the total would always be less than 0.375 sec, which I think is already too small for any difference to be noticed. If you think the values used are incorrect, you might want to work it through with your own figures. I calculated the break-even point of 0.375 sec by rearranging the following formula.

t   is time for loading uncompressed html file
rc  is rate of compression 
    (is 4, when html file is 4 times larger than after compression)
td  is time for decompressing the file

       t = t/rc+td            *rc
    rc*t = t+td*rc            -t
(rc-1)*t = td*rc          /(rc-1)

       t = td*rc/(rc-1)

Whatever reasonable values you use, you will find that compressing HTML files does not only provide an advantage when loading files that would then take only 1 hour instead of 3 or 4 hours. The disadvantage for users with fast access would also be minimal. Although it would initially be used only for HTML files of 15-20 KB and above, it might spread even further. When CHTML is supported by the majority of browsers, it could also be used for smaller files. When the whole of America is on the net, access speeds to many sites crawl below 1 KB/sec. So it would make sense to crunch even the small HTML files once everybody can read CHTML.
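
For readers who want to try their own figures, the break-even formula can be checked with a few lines of Python. This is only a sketch: the function names are assumptions of mine, and rc and td have the meaning defined with the formula above.

```python
def break_even_time(rc, td):
    """Uncompressed load time at which compression stops paying off.

    Implements t = td*rc/(rc-1), where rc is the rate of compression
    and td is the time needed for decompression (in seconds).
    """
    if rc <= 1:
        raise ValueError("rc must be greater than 1 for compression to help")
    return td * rc / (rc - 1)


def compressed_load_time(t, rc, td):
    """Total time for the compressed file: download plus decompression."""
    return t / rc + td


# Figures from the text: compression to a third, 0.25 sec decompression.
print(break_even_time(3, 0.25))          # 0.375
# The 6 sec example, compressed to a quarter and to a third.
print(compressed_load_time(6, 4, 0.25))  # 1.75
print(compressed_load_time(6, 3, 0.25))  # 2.25
```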

What needs to be done?

Obviously the browsers need to understand the new file format. When loading a CHTML file, the browser has to decompress it before it can be interpreted. To have this as a built-in function, it would need to be integrated into new browser versions. For old browser versions to be able to read it, plug-ins, add-ons or ActiveX components will be needed.

HTML editors will need an additional format in the save function that compresses the HTML file. They will also need decompression, since they should also be able to edit existing files. These changes will exclude Notepad as an editing tool, but all other HTML editors can probably be updated with patches or new releases.
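
The reading side could look roughly like the following Python sketch, which decides by extension whether a file must be decompressed first. It is not how any existing browser works; load_page and CHTML_EXTENSIONS are hypothetical names, and the one-file check simply follows the rule of one HTML file per zip file.

```python
import zipfile
from pathlib import Path

# Extensions proposed in the text for compressed HTML files.
CHTML_EXTENSIONS = {".chtml", ".cht"}


def load_page(path):
    """Return the raw HTML bytes, decompressing CHTML files first."""
    path = Path(path)
    if path.suffix.lower() in CHTML_EXTENSIONS:
        with zipfile.ZipFile(path) as archive:
            names = archive.namelist()
            if len(names) != 1:
                raise ValueError("a CHTML archive must contain exactly one file")
            return archive.read(names[0])
    # Anything else is treated as a plain, uncompressed HTML file.
    return path.read_bytes()
```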

The biggest change could be made on servers that create large amounts of data in HTML format for download, either on a regular basis or upon request. If the server compressed large files, it would decrease the average download time and the workload on the server. On a Unix server, the scripts could be amended slightly so that the output is zipped before it is sent. Even for Active Server Pages it could be used in the next release (compressed Active Server Pages, CASP?) to make the output of large amounts of data faster.

Naturally, the hardware should be improved as well to decrease the time taken by compression and decompression on clients and servers. Although it is already quite good, as I have shown above, I am sure that the large microprocessor producers will improve the performance of their processors even further.
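
A server script that zips its output before sending it might be sketched like this in Python. The function name compress_response, the inner file name index.html and the UTF-8 encoding are assumptions for illustration only; how the bytes are then sent depends on the server.

```python
import io
import zipfile


def compress_response(html_text, inner_name="index.html"):
    """Zip generated HTML in memory and return the archive as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # One HTML file per archive, as the CHTML proposal requires.
        archive.writestr(inner_name, html_text.encode("utf-8"))
    return buffer.getvalue()


# Example: a large generated table compresses well because it is repetitive.
rows = "".join(f"<tr><td>{i}</td><td>value {i}</td></tr>" for i in range(1000))
page = f"<html><body><table>{rows}</table></body></html>"
payload = compress_response(page)
print(len(page.encode("utf-8")), len(payload))
```

Working in memory avoids writing a temporary file for every request, which seems the natural choice for pages that are created on demand.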

I hope all this will be implemented by next May.
I love birthday presents ....

If you have any other ideas, or if you disagree or want to comment, please write to my email address.

Thanks

Armin Kielack

 


 
 

Copyright by Armin Kielack 1998 last modified: