Direct Processing of Compressed Data
Shahram Latifi(*) and Junichi Kanai(**)
Electrical & Computer Engineering Department (*)
University of Nevada, Las Vegas
Information Science Research Institute (**)
University of Nevada, Las Vegas
Contact Information
Shahram Latifi
ECE Department
University of Nevada, Las Vegas
4505 Maryland Parkway
Las Vegas, NV 89154-4026
Phone: (702) 895-4016
Fax: (702) 895-4075
Email: latifi@ee.unlv.edu
Keywords
Data compression, Binary image processing, Document image analysis,
CCITT Group III/IV, JBIG.
Project Award Information
Duration: 3 years
Current award year: September 1, 1997 - August 31, 2000
Name of the project: Direct Processing of Compressed Data
Project Summary
Progress in computer and communication technologies allows a large
volume of digital information to be exchanged and archived. While the
need for data compression is evident, many operations may have to be
performed on the uncompressed data. This research investigates the
following two problems: (i) developing new algorithms for processing
compressed data without fully decompressing them, and (ii) developing
new compression algorithms that allow given operations to be rapidly
performed on compressed data.
Goals, Objectives, and Targeted Activities
We are currently working on image processing algorithms and data
compression methods for binary document images. Two of the
fundamental operations used in document image processing are rotation
(deskewing) of a given image and detection of connected components.
The CCITT Group IV standard is a popular compression
technique for binary document images. We are working on an algorithm
which will detect connected components in a CCITT Group IV compressed
image.
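To illustrate the general idea of labeling components from runs rather
than pixels, the following is a minimal sketch. It is not the algorithm
under development in this project: it assumes the black runs of each
scan line have already been recovered (it does not parse the Group IV
bit stream), and the function name label_runs and the (start, end)
run representation are hypothetical choices made for this example.

```python
def label_runs(rows):
    """Label 8-connected components from per-row black runs.

    rows: list of scan lines; each scan line is a list of (start, end)
    black runs with `end` exclusive (an assumed representation).
    Returns (component_count, labels), where labels mirrors `rows` and
    holds one component id per run.
    """
    parent = []  # union-find forest over run ids

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    ids = []
    prev = []  # (start, end, run_id) for the previous scan line
    for runs in rows:
        cur = []
        for start, end in runs:
            rid = len(parent)
            parent.append(rid)
            for ps, pe, pid in prev:
                # Runs touch (8-connectivity) if they overlap or are
                # diagonal neighbours.
                if ps <= end and start <= pe:
                    union(rid, pid)
            cur.append((start, end, rid))
        ids.append(cur)
        prev = cur

    roots = {find(i) for i in range(len(parent))}
    labels = [[find(rid) for _, _, rid in row] for row in ids]
    return len(roots), labels


count, labels = label_runs([[(0, 2), (4, 6)], [(1, 5)], [], [(0, 1)]])
```

In this example the run on the second line joins the two runs above it,
and the run on the last line starts a separate component. Only run end
points are compared, so the work grows with the number of runs rather
than the number of pixels.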
We are also developing a new compression algorithm that allows rapid
rotation of a document image. Our rotation algorithm is an extension
of the method proposed by Shima et al. and moves black runs rather
than black pixels. As a starting point, we are working on a variation
of the CCITT Group III 1-D scheme.
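As a rough illustration of what moving black runs rather than black
pixels can look like, the sketch below applies a horizontal shear to
run data. This is an assumption made for illustration only: it is not
the method of Shima et al. nor our extension of it, and a single shear
is only one ingredient of a rotation. The name shear_runs, the
(start, end) run representation, and the clipping rule are
hypothetical.

```python
import math


def shear_runs(rows, angle_deg, width):
    """Shift every black run on scan line y by round(y * tan(angle)).

    rows: list of scan lines, each a list of (start, end) black runs
    with `end` exclusive (an assumed representation).
    Runs are clipped to the range [0, width); empty runs are dropped.
    """
    t = math.tan(math.radians(angle_deg))
    out = []
    for y, runs in enumerate(rows):
        dx = round(y * t)  # one offset per scan line, not per pixel
        shifted = []
        for start, end in runs:
            s = max(0, start + dx)
            e = min(width, end + dx)
            if s < e:
                shifted.append((s, e))
        out.append(shifted)
    return out


sheared = shear_runs([[(0, 2)], [(0, 2)], [(3, 5)]], 45, 6)
```

Each run is moved by updating its two end points, so the cost depends
on the number of runs in the image rather than on the number of black
pixels.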
Indication of Success
Data compression research has mainly focused on compression ratio and
on the quality of images recovered from lossy compression methods.
Information/image processing research traditionally deals with raw
(uncompressed) data. Our project, however, attempts to bridge the
fields of data compression and information processing, promoting a new
way of thinking.
We originally planned to begin developing, in the third year of the
project, new compression methods that allow given operations to be
performed rapidly in the compressed domain. Since we were able to
quickly understand the nature of the problem, we are already designing
new compression methods for binary image rotation.
We are also making progress in developing an algorithm for detecting
connected components in images compressed by the CCITT Group IV scheme.
To complete this task, we will implement this algorithm and test it on
a fairly large set of test data.
We will evaluate the performance of the new algorithm by comparing its
speed with that of the corresponding traditional method.
Our approach should make operations faster and/or reduce memory
requirements.
Project Impact and Output
Include a brief discussion on the impact of the project on
- The following two students are supported as graduate research
assistants by this project. They will also choose thesis
topics in the area:
Ms. Shulan Deng, Ph.D. candidate, Mr. Bin Zhu, Ph.D. candidate.
- A graduate course on data compression and its manipulation was
designed and is currently being taught by one of the PIs. Classroom
assignments include typical problems arising in the course of development
of the project.
- We are collecting data to develop and test new techniques. These
data will be used in homework and other research projects.
- According to market research firm Gartner Group Inc., document
imaging is one of the emerging technologies for 1998. The algorithms
developed in this project can be utilized by a new generation of
document image analysis systems.
What activities have been enabled/spawned because of the
accomplishments made possible by your award?
Two Ph.D. students were recruited as research assistants to work on this
project. In the department, weekly meetings are held to discuss
progress and possible hurdles. The PIs have presented their activities
at several international conferences. Collaboration with other teams
within the IDM is definitely under consideration.
Project References
J. Kanai, S. Latifi, G. Nagy, and H. Bunke,
"Operations on Compressed Image Data", Proceedings of DCC'95,
p. 432, March 1995.
S. Latifi and J. Kanai, "Rapid Manipulation of Images Compressed by
the CCITT Group III 1-D Coding Scheme," Proceedings of the 1997
International Conference on Imaging Science, Systems, and Technology
(CISST'97), Las Vegas, Nevada, June 30 - July 3, 1997, pp. 351-354.
J. Kanai and A. Bagdanov, "Projection Profile Based Skew Estimation
Algorithm for JBIG Compressed Images," To appear: International
Journal of Document Analysis and Recognition, Springer-Verlag, Volume
1, No. 1, 1998.
Area Background
In general, a lossless data compression algorithm consists of a
transformation/decomposition process and an encoding process. A
lossy compression algorithm utilizes a quantization process before an
encoding process. Our approach attempts to extract useful information
while decoding the bit stream corresponding to compressed data. We
also investigate algorithms that manipulate intermediate symbols
generated by a decoding process rather than the original raw data.
Data are expected to be processed more rapidly using less memory.
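As a small example of working on intermediate symbols, consider run
lengths of the kind produced when a run-length based code such as CCITT
Group III 1-D is decoded, where each scan line is a sequence of
alternating white and black run lengths beginning with a white run
(possibly of length zero). The sketch below computes the number of
black pixels on each scan line directly from those run lengths,
without rebuilding the bitmap. It assumes the code words have already
been decoded into integer run lengths; the function name
black_pixel_counts is hypothetical.

```python
def black_pixel_counts(lines):
    """Count black pixels per scan line from alternating run lengths.

    lines: list of scan lines; each scan line is a list of run lengths
    that alternate white, black, white, ... and start with white
    (an assumed input format).
    """
    # Black runs sit at the odd positions of each list.
    return [sum(runs[1::2]) for runs in lines]


profile = black_pixel_counts([[3, 2, 3], [0, 8], [8]])
```

The resulting list is a horizontal projection profile of the image,
obtained with one addition per black run instead of one test per pixel.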
Area References
Since our research relates to both areas of data compression and
information (image) processing, familiarity with data
compression techniques and information processing algorithms is essential.
K. R. Castleman, Digital Image Processing, Prentice Hall, 1996.
K. Sayood, Introduction to Data Compression, Morgan Kaufmann
Publishers, Inc., 1996.
Y. Shima, S. Kashioka, and J. Higashino,
"A High-Speed Rotation Method for Binary Images Based on
Coordinate Operation of
Run Data", Systems and Computers in Japan, vol. 20,
no. 6, pp. 91-102, 1989.
A. L. Spitz, "Analysis of Compressed Document Images for
Dominant Skew, Multiple Skew and Logotype Detection," To appear:
Computer Vision and Image Understanding, May 1998.
Potential Related Projects
Database operations could be done more efficiently if data were stored
(perhaps in some sort of compressed form) and queried differently from
the classic methods.
Specific compression techniques may prove useful in
improving performance of data intensive applications such as data mining.