Computer Science
Lossless Compression
Lossless compression is a method of reducing the size of data without losing any information. It achieves this by identifying and eliminating redundant data. When the compressed data is decompressed, it is restored to its original form without any loss of information. This technique is commonly used in computer science to reduce file sizes without compromising data integrity.
Written by Perlego with AI-assistance
12 Key excerpts on "Lossless Compression"
- Alan C. Bovik (Author)
- 2009 (Publication Date)
- Academic Press (Publisher)
Chapter 16 Lossless Image Compression
Lina J. Karam, Arizona State University
Publisher Summary
This chapter introduces the basics of lossless compression. Lossless compression is possible because, in general, there is significant redundancy present in image signals. This redundancy is proportional to the amount of correlation among the image data samples. In lossless coding, the decoded image data should be identical both quantitatively (numerically) and qualitatively (visually) to the original encoded image. Although this requirement preserves exactly the accuracy of representation, it often severely limits the amount of compression that can be achieved to a compression factor of two or three. In order to achieve higher compression factors, perceptually lossless coding methods attempt to remove redundant as well as perceptually irrelevant information. These methods require that the encoded and decoded images be only visually, and not necessarily numerically, identical. In this case, some loss of information is allowed as long as the recovered image is perceived to be identical to the original one. Although a higher reduction in bit rate can be achieved with lossy compression, there exist several applications that require lossless coding, such as the compression of digital medical imagery and facsimile transmissions of bitonal images.
16.1 INTRODUCTION
The goal of lossless image compression is to represent an image signal with the smallest possible number of bits without loss of any information, thereby speeding up transmission and minimizing storage requirements. The number of bits representing the signal is typically expressed as an average bit rate (average number of bits per sample for still images, and average number of bits per second for video). The goal of lossy compression is to achieve the best possible fidelity given an available communication or storage bit rate capacity or to minimize the number of bits representing the image signal subject to some allowable loss of information. In this way, a much greater reduction in bit rate can be attained as compared to lossless compression, which is necessary for enabling many real-time applications involving the handling and transmission of audiovisual information. The function of compression is often referred to as coding.
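To make these definitions concrete, here is a small sketch (the numbers are invented, not figures from the chapter) that computes the average bit rate in bits per pixel and the resulting compression factor for a hypothetical 8-bit grayscale image:

```python
# Hypothetical example: a 512 x 512, 8-bit grayscale image, losslessly coded.
width, height, bits_per_pixel = 512, 512, 8
original_bits = width * height * bits_per_pixel        # 2,097,152 bits uncompressed
compressed_bits = 1_048_576                             # assumed size in bits after coding

average_bit_rate = compressed_bits / (width * height)   # bits per sample (pixel)
compression_factor = original_bits / compressed_bits    # "a compression factor of two"

print(f"{average_bit_rate:.2f} bits/pixel, {compression_factor:.1f}:1")
```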
Nine Algorithms That Changed the Future
The Ingenious Ideas That Drive Today's Computers
- John MacCormick (Author)
- 2011 (Publication Date)
- Princeton University Press (Publisher)
Most people have plenty of disk space on their own computers and don't need to bother about compressing their own files. So it's tempting to think that compression doesn't affect most of us. But this impression is wrong: in fact, compression is used behind the scenes in computer systems quite often. For example, many of the messages sent over the internet are compressed without the user even knowing it, and almost all software is downloaded in compressed form—this means your downloads and file transfers are often several times quicker than they otherwise would be. Even your voice gets compressed when you speak on the phone: telephone companies can achieve a vastly superior utilization of their resources if they compress voice data before transporting it.
Compression is used in more obvious ways, too. The popular ZIP file format employs an ingenious compression algorithm that will be described in this chapter. And you're probably very familiar with the trade-offs involved in compressing digital videos: a high-quality video has a much larger file size than a low-quality version of the same video.
LOSSLESS COMPRESSION: THE ULTIMATE FREE LUNCH
It's important to realize that computers use two very different types of compression: lossless and lossy. Lossless compression is the ultimate free lunch that really does give you something for nothing. A lossless compression algorithm can take a data file, compress it to a fraction of its original size, then later decompress it to exactly the same thing. In contrast, lossy compression leads to slight changes in the original file after decompression takes place. We'll discuss lossy compression later, but let's focus on lossless compression for now. For an example of lossless compression, suppose the original file contained the text of this book. Then the version you get after compressing and decompressing contains exactly the same text—not a single word, space, or punctuation character is different. Before we get too excited about this free lunch, I need to add an important caveat: lossless compression algorithms can't produce dramatic space savings on every file. But a good compression algorithm will produce substantial savings on certain common types of files.
So how can we get our hands on this free lunch? How on earth can you make a piece of data, or information, smaller than its actual “true” size without destroying it, so that everything can be reconstructed perfectly later on? In fact, humans do this all the time without even thinking about it. Consider the example of your weekly calendar. To keep things simple, let's assume you work eight-hour days, five days a week, and that you divide your calendar into one-hour slots. So each of the five days has eight possible slots, for a total of 40 slots per week. Roughly speaking, then, to communicate a week of your calendar to someone else, you have to communicate 40 pieces of information. But if someone calls you up to schedule a meeting for next week, do you describe your availability by listing 40 separate pieces of information? Of course not! Most likely you will say something like “Monday and Tuesday are full, and I'm booked from 1 p.m. to 3 p.m. on Thursday and Friday, but otherwise available.” This is an example of lossless data compression! The person you are talking to can exactly reconstruct your availability in all 40 slots for next week, but you didn't have to list them explicitly.
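The round-trip property is easy to check for yourself. Here is a minimal sketch using Python's standard zlib module (the sample text reuses the calendar sentence above and is purely illustrative): compressing and then decompressing returns exactly the original bytes.

```python
import zlib

sentence = ("Monday and Tuesday are full, and I'm booked from 1 p.m. to 3 p.m. "
            "on Thursday and Friday, but otherwise available. ")
original = (sentence * 20).encode("utf-8")   # repetitive text compresses well

compressed = zlib.compress(original, level=9)
restored = zlib.decompress(compressed)

assert restored == original                  # lossless: not a single byte differs
print(len(original), "->", len(compressed), "bytes")
```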
Computer Networks ISE
A Systems Approach
- Larry L. Peterson, Bruce S. Davie (Authors)
- 2007 (Publication Date)
- Morgan Kaufmann (Publisher)
Of course, when talking about lossy compression algorithms, processing resources are not the only factor. Depending on the exact application, users are willing to make very different trade-offs between bandwidth (or delay) and extent of information loss due to compression. For example, a radiologist reading a mammogram is unlikely to tolerate any significant loss of image quality and might well tolerate a delay of several hours in retrieving an image over a network. By contrast, it has become quite clear that many people will tolerate questionable audio quality in exchange for free global telephone calls (not to mention the ability to talk on the phone while driving).
7.2.1 Lossless Compression Algorithms
We begin by introducing three lossless compression algorithms. We do not describe these algorithms in much detail—we just give the essential idea—since it is the lossy algorithms used to compress image and video data that are of the greatest utility in today’s network environment. We do comment, though, on how well these lossless algorithms work on digital imagery. Some of the ideas exploited by these lossless techniques show up again in later sections when we consider the lossy algorithms that are used to compress images.
Run Length Encoding
Run length encoding (RLE) is a compression technique with a brute-force simplicity. The idea is to replace consecutive occurrences of a given symbol with only one copy of the symbol, plus a count of how many times that symbol occurs—hence the name “run length.” For example, the string AAABBCDDDD would be encoded as 3A2B1C4D. RLE can be used to compress digital imagery by comparing adjacent pixel values and then encoding only the changes. For images that have large homogeneous regions, this technique is quite effective. For example, it is not uncommon that RLE can achieve compression ratios on the order of 8-to-1 for scanned text images.
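The AAABBCDDDD example can be reproduced in a few lines. This is a simplified sketch (it assumes no run is longer than nine symbols, so each count is a single digit), not the exact scheme used by any particular image format:

```python
from itertools import groupby

def rle_encode(s: str) -> str:
    """Replace each run of a symbol with its length followed by the symbol."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(s))

def rle_decode(encoded: str) -> str:
    """Expand each (count, symbol) pair back into the original run."""
    return "".join(symbol * int(count) for count, symbol in zip(encoded[::2], encoded[1::2]))

assert rle_encode("AAABBCDDDD") == "3A2B1C4D"
assert rle_decode("3A2B1C4D") == "AAABBCDDDD"
```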
- Stan Birchfield (Author)
- 2017 (Publication Date)
- Cengage Learning EMEA (Publisher)
On a typical image-sharing website, hundreds of millions of photographs are uploaded every day, amounting to several exabytes per year of images. These numbers are staggering, and although we are starting to reach the point where memory is cheap enough that we can begin to think about storing large collections of raw images and videos at home or on a server, limited transmission speeds and the desire to store these data on mobile devices, not to mention rapidly increasing rates of content creation, continue to motivate the need for compressing and decompressing the data. An overview of a compression/decompression system is provided in Figure 8.1. A stream of bits (in our case an image) is fed to a compressor, which converts the stream to a smaller stream of bits. This new stream is then either stored as a file on disk or transmitted across a network, where on the other end a decompressor restores the original image. Sometimes the compressor and decompressor are known as a coder and decoder, respectively, so that the software part of the system is collectively known as a codec. When we say that the decompressor restores the original image, we must make an important distinction because there are two types of compression. In lossless compression, the restored image is exactly the same as the original image, so that no information has been lost. Lossless compression techniques are applicable to any type of data, such as text, an image, a database of addresses, or a file containing an executable. On the other hand, the image restored by lossy compression is only similar to the original image. Lossy compression techniques are applicable to data arising from real-world measurements, such as an audio signal, a photographic image, or a signal captured by some other type of sensor.
- Zoran Gacovski (Author)
- 2023 (Publication Date)
- Arcler Press (Publisher)
Lossless image compression is used to compress images in critical applications as it allows the exact original image to be reconstructed from the compressed one without any loss of the image data. Lossy image compression, on the other hand, suffers from the loss of some data. Thus, repeatedly compressing and decompressing an image results in poor image quality. An advantage of this technique is that it allows for a higher compression ratio than the lossless one [3,4]. Compression is achieved by removing one or more of the three basic data redundancies:
• Coding redundancy, which is present when less than optimal code words are used;
• Interpixel redundancy, which results from correlations between the pixels of an image;
• Psychovisual redundancy, which is due to data that are ignored by the human visual system [5].
So, image compression becomes a solution to many imaging applications that require a vast amount of data to represent the images, such as document imaging management systems, facsimile transmission, image archiving, remote sensing, medical imaging, entertainment, HDTV, broadcasting, education and video teleconferencing [6]. One major difficulty that faces lossless image compression is how to protect the quality of the image in a way that the decompressed image appears identical to the original one. In this paper we are concerned with lossless image compression based on LZW and BCH algorithms, which compress different types of image formats. The proposed method repeats the compression three times in order to increase the compression ratio. The proposed method is an implementation of the lossless image compression. The steps of our approach are as follows: first, we perform a preprocessing step to convert the image at hand into binary. Next, we apply the LZW algorithm on the image to compress it.
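For illustration, here is a generic textbook LZW coder operating on bytes (a sketch only; it is not the authors' implementation and omits the binarization, BCH, and repeated-compression stages described above). The round trip shows the lossless property:

```python
def lzw_encode(data: bytes) -> list[int]:
    """Dictionary-based coding: emit one code per longest known prefix,
    growing the dictionary as new byte sequences are seen."""
    table = {bytes([i]): i for i in range(256)}
    next_code, w, codes = 256, b"", []
    for b in data:
        wb = w + bytes([b])
        if wb in table:
            w = wb
        else:
            codes.append(table[w])
            table[wb] = next_code
            next_code += 1
            w = bytes([b])
    if w:
        codes.append(table[w])
    return codes

def lzw_decode(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly and invert the coding."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    w = table[codes[0]]
    out = bytearray(w)
    for code in codes[1:]:
        entry = table[code] if code in table else w + w[:1]  # the classic corner case
        out += entry
        table[next_code] = w + entry[:1]
        next_code += 1
        w = entry
    return bytes(out)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
assert lzw_decode(lzw_encode(sample)) == sample   # lossless round trip
```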
- Roberto Togneri, Christopher J.S. deSilva (Authors)
- 2003 (Publication Date)
- Chapman and Hall/CRC (Publisher)
This is an example of lossless compression, where no information is lost in the coding and decoding. When images are compressed, it may be permissible for the decompressed image not to have exactly the same pixel values as the original image, provided the difference is not perceptible to the eye. In this case, some form of lossy compression may be acceptable. This involves a loss of information between the coding and decoding processes.
4.3 Run-length Coding
Run-length coding is a simple and effective means of compressing data in which it is frequently the case that the same character occurs many times in succession. This may be true of some types of image data, but it is not generally true for text, where it is rare for a letter of the alphabet to occur more than twice in succession. To compress a sequence, one simply replaces a repeated character with one instance of the character followed by a count of the number of times it occurs. For example, a sequence containing long runs of repeated characters can be reduced in this way from 24 characters to 16. To decompress the sequence, each combination of a character and a count is replaced by the appropriate number of characters. Protocols need to be established to distinguish between the characters and the counts in the compressed data. While the basic idea of run-length coding is very simple, complex protocols can be developed for particular purposes. The standard for facsimile transmission developed by the International Telephone and Telegraph Consultative Committee (CCITT) (now the International Telecommunications Union) [4] involves such protocols.
4.4 The CCITT Standard for Facsimile Transmission
Facsimile machines have revolutionised the way in which people do business. Sending faxes now accounts for a major part of the traffic on telephone lines.
- Mauro Barni (Author)
- 2018 (Publication Date)
- CRC Press (Publisher)
Lossy image compression is widely deployed, e.g., using the classic JPEG standard [21]. This standard also has a less-known lossless version. Lossless compression has the advantage of avoiding the question of whether the coding quality is sufficient. In critical applications, lossless coding may be mandatory. This includes applications where further processing is applied to the images. Examples are medical imaging, remote sensing, and space applications, where scientific fidelity is of paramount importance. In other areas such as prepress and film production, it is the visual fidelity after further processing which is of concern. In this chapter, we present an overview of techniques for lossless compression of images. The basis is techniques for coding gray-scale images. These techniques may be extended or modified in order to increase performance on color and multiband images, as well as image sequences. Lossless coding is performed in a modeling and a coding step. The focus of this text is on the paradigm of modeling by prediction followed by entropy coding. In JPEG, a simple linear prediction filter is applied. In recent efficient schemes, nonlinear prediction is applied based on choosing among a set of linear predictors. Both the predictors and the selection are based on a local neighborhood. The prediction residuals are coded using context-based entropy coding. Arithmetic coding provides the best performance. Variable-length codes related to Huffman coding allow faster implementations. An interesting alternative to predictive coding is established by the use of reversible wavelets. This is the basis of the lossless coding in JPEG2000 [50], providing progression to lossless. For color-mapped images having a limited number of colors per pixel, coding directly in the pixel domain may be an efficient alternative. This chapter is organized as follows. First, the general principles are introduced.
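The modeling-plus-coding paradigm can be sketched with a trivial previous-sample predictor (an illustrative stand-in, not the JPEG or JPEG2000 predictors; numpy is assumed and the scan line is synthetic). The residuals have much lower empirical entropy than the raw samples, which is what the subsequent entropy coder exploits, and the original row is recovered exactly from the residuals:

```python
import numpy as np

def previous_sample_residuals(row: np.ndarray) -> np.ndarray:
    """Predict each sample as the previous one; keep the first sample as-is."""
    pred = np.concatenate(([0], row[:-1])).astype(np.int16)
    return row.astype(np.int16) - pred

def empirical_entropy(values: np.ndarray) -> float:
    """First-order empirical entropy in bits per symbol."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic, smoothly varying scan line: neighbouring samples are highly correlated.
row = (128 + 20 * np.sin(np.linspace(0, 3, 512))).astype(np.uint8)
res = previous_sample_residuals(row)

print(empirical_entropy(row), empirical_entropy(res))        # residuals need far fewer bits
assert np.array_equal(np.cumsum(res).astype(np.uint8), row)  # decoding by prefix sum is exact
```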
- Gerald Friedland, Ramesh Jain (Authors)
- 2014 (Publication Date)
- Cambridge University Press (Publisher)
Lossy Compression
Entropy-based compression as presented in the previous chapter is an important foundation of many data formats for multimedia. However, as already pointed out, it often does not achieve the compression rates required for the transmission or storage of multimedia data in many applications. Because compression beyond entropy is not possible without losing information, that is exactly what we have to do: lose information. Fortunately, unlike texts or computer programs, where a single lost bit can render the rest of the data useless, a flipped pixel or a missing sample in an audio file is hardly noticeable. Lossy compression leverages the fact that multimedia data can be gracefully degraded in quality by increasingly losing more information. This results in a very useful quality/cost trade-off: one might not lose any perceivable information and the cost (transmission time, memory space, etc.) is high; with a little bit of information loss, the cost decreases, and this can be continued to a point where almost no information is left and the perceptual quality is very bad. Lossless compression usually can compress multimedia by a factor of about 1.3:1 to 2:1. Lossy compression can go up to ratios of several hundreds to one (in the case of video compression). This is leveraged on any DVD or Blu-ray, in digital TV, or in Web sites that present consumer-produced videos. Without lossy compression, media consumption as observed today would not exist.
MATHEMATICAL FOUNDATION: VECTOR QUANTIZATION
Consider the following problem: We have an image that we want to store on a certain disc, but no matter how hard we try to compress it, it won’t fit. In fact, we know that it won’t fit because information theory tells us that it cannot be compressed to the size of the space that we have without losing any information.
- Greg A. Harris, D.C. Hankerson, Peter D. Johnson Jr. (Authors)
- 2003 (Publication Date)
- Chapman and Hall/CRC (Publisher)
Chapter 5 Lossless Data Compression by Replacement Schemes
Most (but not all) modern data compression problems are of the following form: you have a long binary word (or “file”) W which you wish to transform into a shorter binary word U in such a way that W is recoverable from U, or, in ways to be defined case by case, almost or substantially recoverable from U. In case W is completely recoverable from U, we say we have lossless compression. Otherwise, we have lossy compression. The compression ratio is lgth(W)/lgth(U). The “compression ratio achieved by a method” is the average compression ratio obtained, using that method, with the average taken over all instances of W in the cases where the method is used. (This taking of the average is usually hypothetical, not actual.) Sometimes the file W is sitting there, available for leisurely perusal and sampling. Sometimes the file W is coming at you at thousands of bits per second, with immediate compression required and with no way of foretelling with certainty what the bit stream will be like 5 seconds from now. Therefore, our compression methods will be distinguished not only by how great a compression ratio they achieve, together with how much information they preserve, but also by how fast they work, and how they deal with fundamental changes in the stream W (such as changing from a stream in which the digits 0, 1 occur approximately randomly to one which is mostly 0's). There is another item to keep account of in assessing and distinguishing between compression methods: hidden costs. These often occur as instructions for recovering W from U. Clearly it is not helpful to achieve great compression, if the instructions for recovering W from U take almost as much storage as W would. We will see another sort of hidden cost when we come to arithmetic coding: the cost of doing floating-point arithmetic with great precision.
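To make the definition concrete, the sketch below (using zlib purely as a stand-in compressor; the input streams are invented) computes lgth(W)/lgth(U) for two very different W: a stream that is mostly 0's compresses enormously, while an approximately random stream barely compresses at all:

```python
import os
import zlib

def compression_ratio(w: bytes) -> float:
    """lgth(W) / lgth(U), where U is the zlib-compressed form of W."""
    u = zlib.compress(w, level=9)
    return len(w) / len(u)

mostly_zeros = bytes(100_000)        # a stream which is (here: entirely) 0's
random_bytes = os.urandom(100_000)   # essentially incompressible

print(compression_ratio(mostly_zeros))   # very large ratio
print(compression_ratio(random_bytes))   # close to (or slightly below) 1
```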
Digital Signal Compression
Principles and Practice
- William A. Pearlman, Amir Said (Authors)
- 2011 (Publication Date)
- Cambridge University Press (Publisher)
3 Principles of Lossless Compression
3.1 Introduction
Source coding began with the initial development of information theory by Shannon in 1948 [1] and continues to this day to be influenced and stimulated by advances in this theory. Information theory sets the framework and the language, motivates the methods of coding, provides the means to analyze the methods, and establishes the ultimate bounds in performance for all methods. No study of image coding is complete without a basic knowledge and understanding of the underlying concepts in information theory. In this chapter, we shall present several methods of lossless coding of data sources, beginning with the motivating principles and bounds on performance based on information theory. This chapter is not meant to be a primer on information theory, so theorems and propositions will be presented without proof. The reader is referred to one of the many excellent textbooks on information theory, such as Gallager [2] and Cover and Thomas [3], for a deeper treatment with proofs. The purpose here is to set the foundation and present lossless coding methods and assess their performance with respect to the theoretical optimum when possible. Hopefully, the reader will derive from this chapter both a knowledge of coding methods and an appreciation and understanding of the underlying information theory. The notation in this chapter will indicate a scalar source on a one-dimensional field, i.e., the source values are scalars and their locations are on a one-dimensional grid, such as a regular time or space sequence. Extensions to multi-dimensional fields, such as images or video, and even to vector values, such as measurements of weather data (temperature, pressure, wind speed) at points in the atmosphere, are often obvious once the scalar, one-dimensional field case is mastered.
3.2 Lossless source coding and entropy
Values of data are not perfectly predictable and are only known once they are emitted by a source.
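As a small illustration of the entropy bound that anchors this chapter (a sketch, with a made-up message), the first-order empirical entropy of a source gives the minimum average number of bits per symbol that any lossless symbol-by-symbol code can achieve:

```python
from collections import Counter
from math import log2

def entropy_bits_per_symbol(message: str) -> float:
    """First-order empirical entropy: a lower bound on the average code
    length (bits per symbol) of any lossless symbol-by-symbol code."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

message = "aaaaaaabbbccd"                   # heavily skewed symbol probabilities
print(entropy_bits_per_symbol(message))     # about 1.7 bits/symbol, vs. 8 bits/char in ASCII
```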
- Qian, Shen-En (Author)
- 2013 (Publication Date)
High compression ratios can be achieved. The higher the compression ratio is, the larger the compression error. A near-lossless compression technique lies between the lossless and lossy compression techniques. The error introduced by a near-lossless compression technique is bound by a predefined threshold, such as the RMSE or the accuracy of an application product. A near-lossless compression means that it is theoretically still a lossy compression due to its irreversibility; however, the loss of information caused by the compression is designed to have negligible or minor impact on the derivation of the ultimate data products or applications. Satellite data users often do not like lossy data compression and may be willing to accept near-lossless compression by trading off the gain and cost of the compression. For satellite data, lossy compression is normally not recommended because it will reduce the value of acquired data for their purpose. For this reason, lossy data compression is not a subject of this book. Instead, this book describes both lossless and near-lossless data compression techniques in this and following chapters. Lossless compression techniques can be generally classified into two categories: prediction-based and transform-based. The former is based on the predictive coding paradigm, whereby a current pixel is predicted from the previous pixels, and the prediction error is then entropy coded [1,2]. Lossless compression techniques that use a lookup-table or vector-quantization method are also categorized as prediction-based methods because both the lookup-table and vector-quantization methods are used to generate a prediction of the data. A vector-quantization-based lossless technique is an asymmetric compression process that is much more computationally intensive than the decompression. For prediction-based lossless compression, band-reordering techniques may also be applied before the prediction to improve the compression ratio.
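A minimal sketch of the near-lossless, prediction-based idea (illustrative only, under an assumed previous-sample predictor, not the book's actual algorithms): the prediction residual is quantized with a step of 2*delta + 1 so that the reconstruction error of every sample is bounded by delta, and setting delta to 0 degenerates to lossless coding.

```python
import numpy as np

def near_lossless_encode(samples, delta):
    """Previous-sample prediction with a uniform residual quantizer that
    guarantees |reconstructed - original| <= delta for every sample."""
    step = 2 * delta + 1
    recon_prev = 0
    codes, recon = [], []
    for x in samples:
        residual = int(x) - recon_prev
        if residual >= 0:
            q = (residual + delta) // step
        else:
            q = -((-residual + delta) // step)
        codes.append(q)             # small integers, handed to an entropy coder
        recon_prev += q * step      # the decoder tracks the same reconstruction
        recon.append(recon_prev)
    return codes, np.array(recon)

samples = np.array([100, 103, 101, 180, 179, 50])
codes, recon = near_lossless_encode(samples, delta=2)
assert int(np.max(np.abs(recon - samples))) <= 2   # bounded, near-lossless error
```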
- Khalid Sayood (Author)
- 2002 (Publication Date)
- Academic Press (Publisher)
CHAPTER 13 Algorithms for Delta Compression and Remote File Synchronization
Torsten Suel, Nasir Memon
OVERVIEW
Delta compression and remote file synchronization techniques are concerned with efficient file transfer over a slow communication link in the case where the receiving party already has a similar file (or files). This problem arises naturally, e.g., when distributing updated versions of software over a network or synchronizing personal files between different accounts and devices. More generally, the problem is becoming increasingly common in many network-based applications where files and content are widely replicated, frequently modified, and cut and reassembled in different contexts and packagings. In this chapter, we survey techniques, software tools, and applications for delta compression, remote file synchronization, and closely related problems. We first focus on delta compression, where the sender knows all the similar files that are held by the receiver. In the second part, we survey work on the related, but in many ways quite different, problem of remote file synchronization, where the sender does not have a copy of the files held by the receiver.
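As a toy illustration of the delta-compression setting (a sketch using Python's difflib as a stand-in; the real tools surveyed in the chapter are far more sophisticated), the sender encodes the new file as copy operations against the old file the receiver already has, plus the literal bytes that changed:

```python
import difflib

def make_delta(old: bytes, new: bytes):
    """Encode `new` as copy/insert operations against `old`."""
    sm = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2 - i1))    # bytes the receiver already holds
        elif j2 > j1:
            delta.append(("insert", new[j1:j2]))   # literal bytes that must be sent
    return delta

def apply_delta(old: bytes, delta) -> bytes:
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, start, length = op
            out += old[start:start + length]
        else:
            out += op[1]
    return bytes(out)

old = b"The quick brown fox jumps over the lazy dog."
new = b"The quick brown fox leaps over the lazy dog!"
assert apply_delta(old, make_delta(old, new)) == new   # receiver reconstructs `new` exactly
```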
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.