Results

This page presents test results for the current implementation of CTW (version 0.1). All files of the Calgary corpus and the Canterbury corpus were encoded under Microsoft Windows ME on a 1400 MHz AMD Athlon processor with 256 megabytes of PC-266/2100 DDR-RAM. The compiler was Microsoft Visual C++ 6.0 with the "Win32 Release" project configuration (the same configuration used for the downloadable Windows version). The default settings were used:

| Setting | Value |
|---|---|
| Tree depth | 6 |
| Size of tree array | 4194304 nodes (33554432 bytes) |
| Max. number of tries | 32 |
| Max. file buffer size | 4194304 bytes |
| Strict pruning | enabled |
| Root weighting | disabled |
| Max. log beta | 1024 |
| Estimator | zero-redundancy |

As the tables below show, compression performance is very good: 1.99 bits/byte overall on the Calgary corpus and 1.45 bits/byte overall on the Canterbury corpus. However, processing is slow (17.3 seconds for book1, which is 768771 bytes) and the algorithm uses a large amount of memory (32 MB for the tree array and up to 4 MB for the file buffer, with the default settings).
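The memory and speed figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are made up here, and the per-node size is inferred from the two tree-array figures in the settings rather than stated on this page.

```python
# Figures copied from the settings list and the paragraph above.
TREE_NODES = 4_194_304         # size of tree array, in nodes
TREE_BYTES = 33_554_432        # size of tree array, in bytes
FILE_BUFFER_BYTES = 4_194_304  # max. file buffer size, in bytes
BOOK1_BYTES = 768_771          # original size of book1
BOOK1_SECONDS = 17.3           # reported processing time for book1

MIB = 1024 * 1024  # "MB" on this page is taken to mean 2**20 bytes (assumption)


def memory_summary():
    """Return the node size and memory footprint implied by the default settings."""
    return {
        "bytes_per_node": TREE_BYTES // TREE_NODES,
        "tree_mib": TREE_BYTES / MIB,
        "buffer_mib": FILE_BUFFER_BYTES / MIB,
        "total_mib": (TREE_BYTES + FILE_BUFFER_BYTES) / MIB,
    }


def throughput_bytes_per_second(size_bytes, seconds):
    """Average encoding speed in bytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes / seconds


if __name__ == "__main__":
    for key, value in memory_summary().items():
        print(f"{key}: {value}")
    speed = throughput_bytes_per_second(BOOK1_BYTES, BOOK1_SECONDS)
    print(f"book1 throughput: {speed / 1000:.1f} kB/s")
```

With these inputs the tree array works out to 8 bytes per node, and book1 is processed at roughly 44 kB per second on the test machine.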


Calgary corpus

The results for the Calgary corpus are as follows:

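| file | orig. size (bytes) | comp. size (bytes) | #codebits | #treenodes | #failed | processing time (seconds) | compression rate (bits/byte) |
|---|---|---|---|---|---|---|---|
| bib | 111261 | 25491 | 203828 | 472364 | 0 | 2.2 | 1.83198 |
| book1 | 768771 | 209514 | 1676015 | 2957949 | 7 | 17.3 | 2.18012 |
| book2 | 610856 | 144399 | 1155092 | 2084480 | 0 | 12.4 | 1.89094 |
| geo | 102400 | 58012 | 463996 | 1137041 | 0 | 1.6 | 4.53121 |
| news | 377109 | 110794 | 886252 | 1948180 | 0 | 7.3 | 2.35012 |
| obj1 | 21504 | 10002 | 79916 | 162256 | 0 | 0.3 | 3.71633 |
| obj2 | 246814 | 73990 | 591822 | 1250526 | 0 | 3.9 | 2.39785 |
| paper1 | 53161 | 15221 | 121670 | 336587 | 0 | 1.0 | 2.28871 |
| paper2 | 82199 | 22900 | 183098 | 480305 | 0 | 1.6 | 2.22750 |
| paper3 | 46526 | 14551 | 116305 | 335502 | 0 | 0.9 | 2.49979 |
| paper4 | 13286 | 4689 | 37413 | 113907 | 0 | 0.3 | 2.81597 |
| paper5 | 11954 | 4392 | 35034 | 104126 | 0 | 0.3 | 2.93073 |
| paper6 | 38105 | 11319 | 90454 | 254117 | 0 | 0.7 | 2.37381 |
| pic | 513216 | 51081 | 408547 | 670512 | 0 | 4.8 | 0.79605 |
| progc | 39611 | 11572 | 92479 | 253793 | 0 | 0.7 | 2.33468 |
| progl | 71646 | 14754 | 117936 | 266551 | 0 | 1.2 | 1.64609 |
| progp | 49379 | 10365 | 82823 | 196023 | 0 | 0.8 | 1.67729 |
| trans | 93695 | 16902 | 135117 | 280724 | 0 | 1.4 | 1.44209 |
| total | 3251493 | 809948 | 6477797 | - | - | 62.2 | 1.99225 |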
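The compression-rate column appears to be the number of code bits divided by the original size in bytes. That relation is inferred from the numbers themselves and is not stated on this page. The sketch below, with hypothetical names, recomputes the rate for three rows of the Calgary results so the figures can be checked.

```python
def bits_per_byte(code_bits, original_bytes):
    """Compression rate as code bits per original byte."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return code_bits / original_bytes


# (orig. size in bytes, #codebits) taken from the Calgary corpus results.
CALGARY_SAMPLE = {
    "bib": (111_261, 203_828),
    "book1": (768_771, 1_676_015),
    "total": (3_251_493, 6_477_797),
}

if __name__ == "__main__":
    for name, (orig_bytes, code_bits) in CALGARY_SAMPLE.items():
        rate = bits_per_byte(code_bits, orig_bytes)
        print(f"{name:6s} {rate:.5f} bits/byte")
```

Note that the "comp. size" column is slightly larger than #codebits divided by 8 (for bib, 25491 bytes against about 25479), so the compressed file presumably carries a few bytes beyond the code bits. This page does not say what those bytes are.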


Canterbury corpus

The results for the Canterbury corpus are as follows:

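| file | orig. size (bytes) | comp. size (bytes) | #codebits | #treenodes | #failed | processing time (seconds) | compression rate (bits/byte) |
|---|---|---|---|---|---|---|---|
| alice29.txt | 152089 | 39454 | 315534 | 750634 | 0 | 3.2 | 2.07467 |
| asyoulik.txt | 125179 | 36341 | 290626 | 776578 | 0 | 2.4 | 2.32168 |
| cp.html | 24603 | 7096 | 56671 | 141324 | 0 | 0.4 | 2.30342 |
| fields.c | 11150 | 2774 | 22095 | 60687 | 0 | 0.2 | 1.98161 |
| grammar.lsp | 3721 | 1109 | 8769 | 25950 | 0 | 0.1 | 2.35662 |
| kennedy.xls | 1029744 | 129851 | 1038709 | 2885792 | 0 | 14.9 | 1.00871 |
| lcet10.txt | 426754 | 97740 | 781824 | 1457853 | 0 | 8.5 | 1.83203 |
| plrabn12.txt | 481861 | 131625 | 1052903 | 1951633 | 0 | 10.3 | 2.18508 |
| ptt5 | 513216 | 51081 | 408547 | 670512 | 0 | 4.8 | 0.79605 |
| sum | 38240 | 12288 | 98203 | 240539 | 0 | 0.6 | 2.56807 |
| xargs.1 | 4227 | 1565 | 12422 | 36850 | 0 | 0.1 | 2.93873 |
| total | 2810784 | 510924 | 4086303 | - | - | 45.5 | 1.45379 |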