A close relationship exists between the tasks of data compression and feature detection. A piece of signal data which has features of interest detected and recognized can be compressed at rates beyond those achievable by statistical-based methods. Conversely, since the essence of many compression methods is to quantize decorrelated features, obtained typically through a suitable transform, these methods share many computational goals of early processing in feature detection. The zerotree quantization method of encoding significant wavelet coefficients is one example of a data compression method that has potential in playing a significant role in feature detection. The zerotree algorithm is a tree-pruning method for encoding the significant map, which contains the locations of significant wavelet transform coefficients of a signal. The key idea is to systematically eliminate those coefficients that do not carry sufficient information by assigning a zero symbol to all the wavelet coefficients at the subtree branching from a coefficient whose magnitude is negligible. When the coefficients are obtained from a wavelet transform, they are organized naturally as a tree, so that the pruning process can start from the lowest resolution, and work its way towards the higher resolutions. The zerotree algorithm operating on the wavelet coefficients of a one-dimensional signal can be summarized as follows. The algorithm starts from the lowest resolution coefficients; at resolution level $i$, if the magnitude of the $j$th coefficient, $c^{(i)}_j$ is less than a preset threshold, and that all coefficients of the subtree having $c^{(i)}_j$ as its root have magnitudes less than the threshold, a tree-of-zeros is identified and located at $c^{(i)}_j$. The identification of the zero symbol, or the {\em zerotree}, is facilitated by the fact that all coefficients at higher resolutions are descendents from one of the coefficients at a specific resolution. The immediate descendents of $c^{(i)}_j$, for instance, are $c^{(i+1)}_{2j}$ and $c^{(i+1)}_{2j+1}$. \section{Tree Pruning in Wavelet Analysis} The zerotree algorithm can be generalized to identify the significant wavelet-packet coefficients. These coefficients are obtained by a multiresolution processing tree that is pruned to obtain a basis set that is best adapted to the time-frequency characteristics of the signal data. The multiresolution processing tree is a binary tree that transforms a signal into packets of wavelet coefficients. At the root level, the data $x$ is decomposed into a lower resolution level of smoothed and detail coefficients, ($Hx$ and $Gx$, respectively). Each of these can be further decomposed into lower levels; if there is no further decomposition, as denoted by a terminating node in the processing tree, the coefficients are said to form a wavelet packet at the node. The representation is then based on a suitably chosen set of wavelet packets. The wavelet-packet approach can therefore be viewed as pruning the processing tree to customize the basis set. In \cite{coi92}, the ``best basis'' (EBB) method chooses the packet combinations based on the entropy of the packets. In \cite{oe}, the ``optimized multiresolution tree'' (OMT) method chooses the packets by optimizing the compression performance. These methods can have better performance compared to the wavelet transform (DWT). As an example, in Figure 1, the processed signal using the wavelet transform and the best basis packets obtained using OMT are shown for the case when the compression rate is 1/8. The binary trees of the two representations are shown in Figure 2; in this processing tree, an up-branch and a down-branch denote a low-pass and a high-pass filtering operation, respectively. The representation corresponding to the DWT is more structured compared to the less structured OMT, though the latter is better adapted to the task assigned. \begin{figure} \centerline{\epsfxsize=2in \epsfbox{fig8.eps} \epsfxsize=2in \epsfbox{fig9.eps}} \begin{verse}{\small Figure 1. The reconstruction using 32 (out of 256) coefficients using DWT ({\em left}) and OMT ({\em right}).} \end{verse} \end{figure} \begin{figure} \centerline{\epsfysize=1.5in \epsfbox{fig10.eps} \epsfysize=1.5in \epsfbox{fig11.eps}} \begin{verse}{\small Figure 2. The binary tree representation of the processing using DWT ({\em left}) and OMT ({\em right}) that reconstructed the results in Fig. 1.} \end{verse} \end{figure} A zerotree algorithm generalized to handle wavelet-packet coefficients would be important: (i) for improving the performance of wavelet-packet based data compression algorithms, and (ii) as an optimized encoding of information, drawing on the strengths of the wavelet-packet and the zerotree approaches. The generalized zerotree algorithm must be capable of handling coefficients that result from a general multiresolution processing tree. The natural sequency of coefficients is not always guaranteed, and relationships among different packets of coefficients have to be established so that the ``zero-subtree'' can be established. The algorithm for locating the significant coefficients works as follows. The coefficients of the packet at the lowest resolution are examined first. The descendents of each coefficient that has a magnitude less than a threshold are searched, and if no significant coefficient can be found, a zerotree symbol is marked at the original node. Rules have to be developed for labeling descendents among wavelet-packet coefficients. The descendent of a coefficient in a packet can be defined based on their time (or spatial, in 2-dimensional cases) locations. Consider a coefficient of a packet at level $i$. The descendents of this coefficient are either from the same resolution level, or from a higher resolution. The choice depends on the multiresolution processing tree that generated the coefficients. In general, the mapping from a packet of coefficients to their descendents is such that, at level $i$, a packet of $2^i$ coefficients are mapped to $2 \times 2^i$ coefficients. In the case of the descendents being at the same resolution, the mapping is from one packet of $2^i$ coefficients to two packets of $2^i$ coefficients. In the case of the descendents being at a higher resolution, the mapping is from a packet of $2^i$ coefficients to a packet of $2^{(i+1)}$ coefficients. Note that the descendents of a coefficients cannot be two levels away, since the multiresolution processing tree is a binary tree. \section{Research Work and Direction} The issues to be investigated are along two directions. First, what are the implications of the zerotree encoding on the multiresolution tree? Should the multiresolution tree be modified by the zerotree encoding, and, if so, how? The approach would be to develop the zerotree quantization method for wavelet packet coefficients, assess their compression performance as compared to the wavelet transform coefficients. A second direction of this research is to investigate the use of significant coefficient encoding map in the detection of features in signals. For the purpose of data compression, it is desirable to construct a suitable basis set so that most of the energy is contained in a few representation coefficients. This is referred to as the energy packing capacity of the basis set, and can be measured quantitatively: (a) by the entropy of the transform coefficients; and (b) by measuring percentage of the signal energy represented by a fixed number of coefficients. There are different methods to construct a wavelet basis function set, which depends on the processing tree as well as the mother wavelet. Besides the discrete wavelet function (DWT), the wavelet-packet approach builds a better basis by modifying the processing tree to customize the basis set. In \cite{coi92}, the ``best basis'' (EBB) method chooses the packet combinations based on the entropy of the packets. In \cite{oe}, the ``optimized multiresolution tree'' (OMT) method chooses the packets by optimizing the compression performance. These methods can have better performance compared to the baseline DWT method, as illustrated by the experimental results in \cite{oe}. Both of these methods (OMT and EBB) will be considered to modify the processing tree to obtain the best basis, and compared to the DWT method. There are three quantization strategies: scalar, vector, and zerotree. The scalar quantization is the baseline method; it quantizes each coefficient by assigning a number of bits that is proportional to the variance of the coefficient. In vector quantization, codebooks are optimized for compressing blocks of data \cite{gersho83}. For example, by setting the block size at 128 or 256, a full-search based on the LBG-algorithm would be possible for codebook design. In the zerotree quantization, different algorithms will be needed for the wavelet (DWT) coefficients and for the wavelet-packet (EBB and OMT) coefficients. For the wavelet coefficients, a zero symbol is assigned to an entire subtree branching off a coefficient with negligible magnitude. For the wavelet-packet coefficients, since the tree has already been pruned, special tree-traversal algorithms as described above will be used. It is expected that the zerotree-quantized wavelet-packet coefficients from either the EBB and OMT will have better performance compared to the quantized DWT coefficients because of this second pass of tree pruning. \subsection{Relationship to Recent Research} Since the early results of multirate filter banks were reported (see, e.g., \cite{vaidy,vetterli}), the decomposition of signal data into different resolution components has been popular for various processing tasks, among them, signal compression \cite{mallat}. Independent of the work on filter banks, the representation of signals using a basis set consisting of dilations and translations of a single ``wavelet'' was developed \cite{daubechies}. In a wavelet decomposition, a signal $x$ is represented by $ x(n) = \sum_k c_k \psi_k (n), $ where $\psi_k(n)$ has the form $ \psi_k(n) = h(\frac{n - b_k}{a_k}) . $ The wavelet $h(n)$ is commonly referred to as the mother wavelet, $a_k$ and $b_k$ are the dilation and translation parameters, respectively, which can take on any real values. The signal is therefore represented by $ x(n) = \sum_k w_k h(\frac{n - b_k}{a_k}) . $ In a wavelet series representation, the dilation and shift of the mother wavelet are restricted to binary and dyadic, respectively. i.e., the signal is represented by $ x(n) = \sum_{l,k} w_k 2^{k/2} h(\frac{n - l/2^k }{2^{-k}}) , $ where $l$ and $k$ are integer values. These restrictions allow the wavelet transform to be considered as an orthogonal transform for efficient computation \cite{chu92}. If the basis functions are discrete sequences, the resulting transform is a discrete wavelet transform \cite{rioul}. Wavelet transform has been applied to tasks such as MRI signal processing \cite{hea92}, data classification \cite{szu93}, texture classification \cite{lai93}, and data compression \cite{dev92}. Despite the increasing availability of wideband technologies for data transmission and storage, the need and interest in data compression methods remain intense \cite{wallace,legall,fbi,woo86}. Signal data compression can be categorized into the lossless and the lossy classes. Lossless coding methods are concerned with compacting data, and are usually designed using entropy concepts \cite{storer}. Lossless methods typically cannot achieve compression ratios of more than 4:1. In the lossy methods, the decompressed signal will be degraded compared to the original signal. A commonly used method is transform coding, which represents the signal in a different basis so that its transform representation has less correlation. Only those transform coefficients with the highest information will be retained. The decompressed data can then be reconstructed by an inverse transform. The cosine transform is used in compressing image data \cite{rabbani}, while the wavelet transform has found more diverse applications \cite{chu92}. The transform decomposition is always followed by a bit allocation and quantization step. Specific algorithms and quantization strategies are typically developed to minimize certain domain specific fidelity criteria \cite{wallace,legall}. The quantized coefficients, or symbols, are next encoded using lossless channel coding methods. Next to the fidelity criteria, the quantization strategy is the key issue in data compression algorithm development. In scalar quantization, the transform coefficients are quantized independently, whereas in vector quantization, blocks of transform coefficients are coded using a developed codebook \cite{gersho83}. These approaches, however, do not exploit correlations between coefficients at different levels of the wavelet decomposition. Recently, a number of promising quantizers, viz. zerotree quantizers, exploiting correlations between different levels of the wavelet decomposition, have been proposed for image data compression \cite{shap91}. \section{Concluding Remarks} Algorithms in signal compression and feature detection have widespread potential in many applications. The availability of wideband technologies as well as the organization of computer networks such as the Internet has promoted the management and integration of information in many multidisciplinary fields. Data compression of signal data will enhance the facilities for sharing of data resources and repositories for the voluminous data such as those found in the medical field. The basic contribution to signal processing methods from this research will be the investigation of using zerotree quantization to wavelet packet coefficients. This amounts to a two-pass pruning of the multiresolution processing tree: one pass for customizing the basis set for data representation, and a second pass for selecting the best truncated representation in the ``best'' basis domain. The optimality of this approach, if proven, will be a significant result, both in theory and in practice. The authors have been involved in a related project that is being formulated by investigators from USL, the Mayo Clinic, and the Cleveland Clinic Foundation. This larger project is to develop an archive to support collaborative research in the neurosciences, involving clinicians, theoreticians, and data analysts. Such effort is frequently complicated by the physical separation of the units involved in such projects. Work described in this paper addresses some of these barriers by developing data compression methods that will lead to the development and maintenance of a public archive, available via the Internet, of digitized EEG signals and clinical data. \begin{thebibliography}{99} \bibitem{oe} C. H. Chu, ``Data compression by multiresolution tree search,'' {\em Optical Engineering}, vol. 33, no. 7, pp. 2136--2142, July 1994. \bibitem{chu92} C.~K. Chui, {\em An Introduction to Wavelets}. \newblock San Diego, Calif.: Academic Press, 1992. \bibitem{coi92} R.~R. Coifman and M.~V. Wickerhauser, ``Entropy-based algorithms for best basis selection,'' {\em IEEE Transactions on Information Theory}, vol.~38, pp.~713--718, March 1992. \bibitem{daubechies} I.~Daubechies, ``Orthonormal bases of compactly supported wavelets,'' {\em Communications on Pure and Applied Mathematics}, vol.~41, pp.~909--996, 1988. \bibitem{dev92} R.~A. Devore, B.~Jawerth, and B.~J. Lucier, ``Image compression through wavelet transform coding,'' {\em IEEE Transactions on Information Theory}, vol.~38, pp.~719--746, March 1992. \bibitem{gersho83} A. Gersho, ``On the structure of vector quantizers,'' {\em IEEE Transactions on Information Theory,} vol. 28, pp. 157--166, 1982. \bibitem{hea92} D.~M. Healy and J.~B. Weaver, ``Two applications of wavelet transforms in magnetic resonance imaging,'' {\em IEEE Transactions on Information Theory}, vol.~38, pp.~840--860, March 1992. \bibitem{lai93} A.~Laine and J.~Fan, ``Texture classification by wavelet packet signatures,'' {\em IEEE Transactions on Pattern Analysis and Machine Intelligence}, vol.~15, pp.~1186--1191, November 1993. \bibitem{legall} D. Le Gall, ``MPEG: A video compression standard for multimedia applications,'' {\em Communications of the ACM}, vol. 34, no. 4, pp. 46--58, 1991. \bibitem{mallat} S. Mallat, ``A theory of multiresolution signal decomposition: The wavelet representation,'' {\em IEEE Transactions on Pattern Recognition and Machine Intelligence}, vol. 11, no. 7, pp. 674--693, 1989. \bibitem{rabbani} M.~Rabbani and P.~W. Jones, {\em Digital Image Compression Techniques}. \newblock Bellingham, Wash.: SPIE Press, 1991. \bibitem{rioul} O. Rioul, ``A discrete time multiresolution theory unifying octave-band filter banks, pyramid and wavelet transforms,'' {\em IEEE Transactions on Signal Processing,} 1990. \bibitem{shap91} J. M. Shapiro, ``Embedded image coding using zerotrees of wavelet coefficients,'' {\em IEEE Transactions on Signal Processing}, vol. 41, pp. 3445--3462, Dec. 1993. \bibitem{storer} J. Storer, {\em Data Compression}, Computer Science Press, Rockville, Md., 1988. \bibitem{szu93} H.~H. Szu, X.~Y. Yang, B.~Telfer, and Y.~Sheng, ``Neural networks and wavelet transforms for scale-invariant data classification,'' {\em Physics Review E}, vol.~48, pp.~1497--1501, August 1993. \bibitem{vaidy} P. P. Vaidyanathan, ``Multirate digital filters, filter banks, polyphase networks, and applications: A tutorial,'' {\em Proceedings of the IEEE}, vol. 78, no. 1. pp 56--93, 1990. \bibitem{vetterli} M. Vetterli, ``A theory of multirate filter banks,'' {\em IEEE Transactions on Acoustics, Speech, and Signal Processing}, vol. 35, no. 3, pp. 356--372, 1987. \bibitem{fbi} {\em WSQ Gray-Scale Fingerprint Image Compression Specification}, IAFIS-IC-0110 (V2), Criminal Justice Information Services, Federal Bureau of Investigation, Feb. 1993. \bibitem{wallace} G. K. Wallace, ``The JPEG still picture compression standard,'' {\em Communications of the ACM}, vol. 34, no. 4, pp. 30--44, 1991. \bibitem{woo86} J.~W. Woods and S.~D. O'Neil, ``Subband coding of images,'' {\em IEEE Transactions on Acoustics, Speech, and Signal Processing}, vol.~35, no.~5, pp.~1278--1288, 1986. \end{thebibliography} \end{document}