On this page you'll find some test results for the current implementation of CTW (version 0.1). All files of the Calgary corpus and the Canterbury corpus were encoded under Microsoft Windows ME on a 1400 MHz AMD Athlon processor with 256 megabytes of PC-266/2100 DDR-RAM. The compiler used was Microsoft Visual C++ 6.0 with the "Win32 Release" project configuration (the same configuration as for the downloadable Windows version). The default settings were used:
| Tree depth | 6 |
| Size of tree array | 4194304 nodes (33554432 bytes) |
| Max. number of tries | 32 |
| Max. file buffer size | 4194304 bytes |
| Strict pruning | enabled |
| Root weighting | disabled |
| Max. log beta | 1024 |
| Estimator | zero-redundancy |
As the tables below show, the compression performance is very good: 1.99 bits/byte overall on the Calgary corpus and 1.45 bits/byte overall on the Canterbury corpus. However, the processing time is quite long (17.3 seconds for book1, which is 768771 bytes) and the algorithm uses a large amount of memory (32 MB for the tree array and up to 4 MB for the file buffer, with the default settings).
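The memory and speed figures quoted above can be checked with a little arithmetic. The following is a minimal sketch that uses only numbers from the settings table and the book1 row; the variable names are illustrative and are not taken from the CTW source code.

```python
# Values copied from the settings table and the book1 row.
# All names here are illustrative, not identifiers from the CTW implementation.
TREE_NODES = 4_194_304          # size of tree array, in nodes
TREE_BYTES = 33_554_432         # size of tree array, in bytes
FILE_BUFFER_BYTES = 4_194_304   # max. file buffer size
BOOK1_BYTES = 768_771           # original size of book1
BOOK1_SECONDS = 17.3            # processing time for book1

MIB = 1024 * 1024

bytes_per_node = TREE_BYTES // TREE_NODES
tree_mib = TREE_BYTES / MIB
buffer_mib = FILE_BUFFER_BYTES / MIB
throughput = BOOK1_BYTES / BOOK1_SECONDS  # bytes per second

print(f"bytes per tree node: {bytes_per_node}")
print(f"tree array: {tree_mib:.0f} MiB, file buffer: {buffer_mib:.0f} MiB")
print(f"book1 throughput: {throughput / 1024:.1f} KiB/s")
```

With the default settings this works out to 8 bytes per tree node and a throughput of roughly 43 KiB per second on book1.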
The results for the Calgary corpus are as follows:
| file | orig. size (bytes) | comp. size (bytes) | #codebits | #treenodes | #failed | processing time (seconds) | compression rate (bits/byte) |
| bib | 111261 | 25491 | 203828 | 472364 | 0 | 2.2 | 1.83198 |
| book1 | 768771 | 209514 | 1676015 | 2957949 | 7 | 17.3 | 2.18012 |
| book2 | 610856 | 144399 | 1155092 | 2084480 | 0 | 12.4 | 1.89094 |
| geo | 102400 | 58012 | 463996 | 1137041 | 0 | 1.6 | 4.53121 |
| news | 377109 | 110794 | 886252 | 1948180 | 0 | 7.3 | 2.35012 |
| obj1 | 21504 | 10002 | 79916 | 162256 | 0 | 0.3 | 3.71633 |
| obj2 | 246814 | 73990 | 591822 | 1250526 | 0 | 3.9 | 2.39785 |
| paper1 | 53161 | 15221 | 121670 | 336587 | 0 | 1.0 | 2.28871 |
| paper2 | 82199 | 22900 | 183098 | 480305 | 0 | 1.6 | 2.22750 |
| paper3 | 46526 | 14551 | 116305 | 335502 | 0 | 0.9 | 2.49979 |
| paper4 | 13286 | 4689 | 37413 | 113907 | 0 | 0.3 | 2.81597 |
| paper5 | 11954 | 4392 | 35034 | 104126 | 0 | 0.3 | 2.93073 |
| paper6 | 38105 | 11319 | 90454 | 254117 | 0 | 0.7 | 2.37381 |
| pic | 513216 | 51081 | 408547 | 670512 | 0 | 4.8 | 0.79605 |
| progc | 39611 | 11572 | 92479 | 253793 | 0 | 0.7 | 2.33468 |
| progl | 71646 | 14754 | 117936 | 266551 | 0 | 1.2 | 1.64609 |
| progp | 49379 | 10365 | 82823 | 196023 | 0 | 0.8 | 1.67729 |
| trans | 93695 | 16902 | 135117 | 280724 | 0 | 1.4 | 1.44209 |
| total | 3251493 | 809948 | 6477797 | - | - | 62.2 | 1.99225 |
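The values in the last column are consistent with dividing #codebits by the original size in bytes. The sketch below checks this for a few rows of the Calgary table. The function name `bits_per_byte` is a hypothetical helper, and the formula is inferred from the table values rather than taken from the CTW source.

```python
def bits_per_byte(codebits: int, orig_size: int) -> float:
    """Compression rate in bits per input byte (formula inferred from the table)."""
    return codebits / orig_size

# (orig. size in bytes, #codebits, rate listed in the table)
calgary_rows = {
    "bib":   (111_261, 203_828, 1.83198),
    "book1": (768_771, 1_676_015, 2.18012),
    "pic":   (513_216, 408_547, 0.79605),
    "total": (3_251_493, 6_477_797, 1.99225),
}

for name, (size, codebits, listed) in calgary_rows.items():
    rate = bits_per_byte(codebits, size)
    # Fraction of the original 8 bits/byte that remains after coding.
    fraction = rate / 8
    print(f"{name:6s} {rate:.5f} bits/byte (table: {listed}), "
          f"{fraction:.1%} of original size")
```

A rate of 8 bits/byte would mean no compression at all, so the overall Calgary figure of about 1.99 bits/byte corresponds to roughly a quarter of the original size.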
The results for the Canterbury corpus are as follows:
| file | orig. size (bytes) | comp. size (bytes) | #codebits | #treenodes | #failed | processing time (seconds) | compression rate (bits/byte) |
| alice29.txt | 152089 | 39454 | 315534 | 750634 | 0 | 3.2 | 2.07467 |
| asyoulik.txt | 125179 | 36341 | 290626 | 776578 | 0 | 2.4 | 2.32168 |
| cp.html | 24603 | 7096 | 56671 | 141324 | 0 | 0.4 | 2.30342 |
| fields.c | 11150 | 2774 | 22095 | 60687 | 0 | 0.2 | 1.98161 |
| grammar.lsp | 3721 | 1109 | 8769 | 25950 | 0 | 0.1 | 2.35662 |
| kennedy.xls | 1029744 | 129851 | 1038709 | 2885792 | 0 | 14.9 | 1.00871 |
| lcet10.txt | 426754 | 97740 | 781824 | 1457853 | 0 | 8.5 | 1.83203 |
| plrabn12.txt | 481861 | 131625 | 1052903 | 1951633 | 0 | 10.3 | 2.18508 |
| ptt5 | 513216 | 51081 | 408547 | 670512 | 0 | 4.8 | 0.79605 |
| sum | 38240 | 12288 | 98203 | 240539 | 0 | 0.6 | 2.56807 |
| xargs.1 | 4227 | 1565 | 12422 | 36850 | 0 | 0.1 | 2.93873 |
| total | 2810784 | 510924 | 4086303 | - | - | 45.5 | 1.45379 |
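Note that the rate in the "total" row appears to be a size-weighted figure (total #codebits divided by total bytes), so large files such as kennedy.xls dominate it. The following sketch, which uses only the numbers from the Canterbury table above, compares that weighted total with a plain per-file average. The variable names are illustrative.

```python
# (orig. size in bytes, #codebits) for each Canterbury file, copied from the table.
canterbury = {
    "alice29.txt":  (152_089, 315_534),
    "asyoulik.txt": (125_179, 290_626),
    "cp.html":      (24_603, 56_671),
    "fields.c":     (11_150, 22_095),
    "grammar.lsp":  (3_721, 8_769),
    "kennedy.xls":  (1_029_744, 1_038_709),
    "lcet10.txt":   (426_754, 781_824),
    "plrabn12.txt": (481_861, 1_052_903),
    "ptt5":         (513_216, 408_547),
    "sum":          (38_240, 98_203),
    "xargs.1":      (4_227, 12_422),
}

total_bytes = sum(size for size, _ in canterbury.values())
total_bits = sum(bits for _, bits in canterbury.values())

# Size-weighted rate: this is what the "total" row reports.
weighted = total_bits / total_bytes

# Unweighted rate: every file counts equally, regardless of size.
per_file = [bits / size for size, bits in canterbury.values()]
unweighted = sum(per_file) / len(per_file)

print(f"weighted total:   {weighted:.5f} bits/byte")
print(f"unweighted mean:  {unweighted:.5f} bits/byte")
```

The weighted total reproduces the 1.45379 bits/byte in the table, while the unweighted per-file mean comes out noticeably higher, at a little over 2 bits/byte. Which of the two is more meaningful depends on whether the comparison is about total archive size or about typical per-file behaviour.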