Computer Science

Lossy Compression

Lossy compression is a data compression technique that reduces file size by permanently discarding information judged less essential. The smaller files are more efficient to store and transmit, but the discarded data cannot be recovered, which may reduce the quality of the reconstructed file.

Written by Perlego with AI-assistance

12 Key excerpts on "Lossy Compression"

  • Multimedia Computing
    Entropy-based compression as presented in the previous chapter is an important foundation of many data formats for multimedia. However, as already pointed out, it often does not achieve the compression rates required for the transmission or storage of multimedia data in many applications. Because compression beyond entropy is not possible without losing information, that is exactly what we have to do: lose information. Fortunately, unlike texts or computer programs, where a single lost bit can render the rest of the data useless, a flipped pixel or a missing sample in an audio file is hardly noticeable. Lossy Compression leverages the fact that multimedia data can be gracefully degraded in quality by increasingly losing more information. This results in a very useful quality/cost trade-off: one might not lose any perceivable information and the cost (transmission time, memory space, etc.) is high; with a little bit of information loss, the cost decreases, and this can be continued to a point where almost no information is left and the perceptual quality is very bad. Lossless compression usually can compress multimedia by a factor of about 1.3:1 to 2:1. Lossy Compression can go up to ratios of several hundreds to one (in the case of video compression). This is leveraged on any DVD or Blu-ray, in digital TV, or in Web sites that present consumer-produced videos. Without Lossy Compression, media consumption as observed today would not exist. MATHEMATICAL FOUNDATION: VECTOR QUANTIZATION Consider the following problem: We have an image that we want to store on a certain disc, but no matter how hard we try to compress it, it won’t fit. In fact, we know that it won’t fit because information theory tells us that it cannot be compressed to the size of the space that we have without losing any information.
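    To make the vector-quantization idea concrete, here is a minimal sketch (not taken from the book): 4x4 image blocks are mapped to the nearest entry of a small codebook trained with a few k-means iterations. NumPy is assumed, and the block size, codebook size, and the random stand-in image are arbitrary illustrative choices.

    # Minimal vector-quantization sketch (illustrative, not any book's reference code).
    # Blocks of a grayscale image are mapped to the nearest entry of a small codebook,
    # so each 4x4 block (16 bytes) is stored as one index plus a shared codebook.
    import numpy as np

    def extract_blocks(img, bs=4):
        """Cut an HxW image (H, W divisible by bs) into flattened bs*bs blocks."""
        h, w = img.shape
        blocks = img.reshape(h // bs, bs, w // bs, bs).swapaxes(1, 2)
        return blocks.reshape(-1, bs * bs).astype(np.float64)

    def train_codebook(blocks, k=32, iters=10, seed=0):
        """Very small k-means: returns k codebook vectors of length bs*bs."""
        rng = np.random.default_rng(seed)
        codebook = blocks[rng.choice(len(blocks), size=k, replace=False)]
        for _ in range(iters):
            # Assign each block to its nearest codebook vector (squared error).
            d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            # Move each codebook vector to the mean of its assigned blocks.
            for j in range(k):
                members = blocks[labels == j]
                if len(members):
                    codebook[j] = members.mean(axis=0)
        return codebook

    def quantize(blocks, codebook):
        d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)          # one small index per block

    if __name__ == "__main__":
        img = np.random.default_rng(1).integers(0, 256, size=(64, 64))  # stand-in image
        blocks = extract_blocks(img)
        cb = train_codebook(blocks, k=32)
        idx = quantize(blocks, cb)
        # 16 bytes per block shrink to one 5-bit index (plus the shared codebook).
        print("blocks:", len(blocks), "codebook entries:", len(cb), "indices:", idx.shape)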
  • Lossless Compression Handbook
    The performance of a lossless data compressor can be interpreted in several ways. One is the compression ratio, which is the size of the input data file divided by the size of the output data file. Sometimes the performance is stated as a percentage, either the percentage reduction in size or the percentage of the original file size that remains after compression. Another interpretation is to describe data compression as a way to minimize the bit rate of the signal, where the bit rate is the number of bits required to represent the data divided by the total playing time. And in some contexts it is useful to describe data compression in terms of the average number of bits per sample or the average reduction in bits per sample. Clearly, the compression ratio is most helpful when trying to determine how much file system space would be saved by compressing the audio data, while the interpretation in terms of bit rate is more meaningful when considering transmission of the data through a communications channel. Finally, the act of compressing and decompressing the audio data is sometimes referred to as encoding and decoding. In this context the encoded data is the losslessly compressed data stream, while the decoded data is the recovered original audio waveform. In this chapter we use compression and encoding interchangeably. 12.2 PRINCIPLES OF LOSSLESS DATA COMPRESSION The essence of lossless data compression is to obtain efficient redundancy removal from a bitstream [9]. Common lossless compressors such as WinZIP and StuffIt are used on arbitrary computer data files and usually provide compressed files roughly half the size of the original. However, the compression model (e.g., LZ77) commonly used is poorly matched to the statistical characteristics of binary audio data files, and the compressed audio files are typically still about 90% of the original file size.
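    The different performance measures mentioned in this excerpt are easy to relate with a few lines of arithmetic. The file sizes, duration, and sample rate below are made-up illustrative numbers, not figures from the handbook.

    # Worked example of the performance measures described above; all of the sizes,
    # duration, and sample rate are made-up illustrative numbers.
    original_bytes   = 52_000_000      # size of the input audio file
    compressed_bytes = 27_000_000      # size of the losslessly encoded file
    duration_s       = 300.0           # total playing time in seconds
    sample_rate      = 44_100          # samples per second per channel
    channels         = 2

    compression_ratio = original_bytes / compressed_bytes           # ~1.93:1
    percent_remaining = 100.0 * compressed_bytes / original_bytes   # ~51.9 %
    percent_reduction = 100.0 - percent_remaining                   # ~48.1 %
    bit_rate_kbps     = compressed_bytes * 8 / duration_s / 1000    # ~720 kbit/s
    total_samples     = duration_s * sample_rate * channels
    bits_per_sample   = compressed_bytes * 8 / total_samples        # ~8.2 bits/sample

    print(f"ratio {compression_ratio:.2f}:1, {percent_reduction:.1f}% smaller, "
          f"{bit_rate_kbps:.0f} kbit/s, {bits_per_sample:.1f} bits/sample")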
  • Image Processing and Analysis
    On the other hand, the image restored by Lossy Compression is only similar to the original image. Lossy compression techniques are applicable to data arising from real-world measurements, such as an audio signal, a photographic image, or a signal captured by some other type of sensor. In such data, certain aspects of the data are more important than others, both because of noise in the signal and the inability of the human perceptual system to distinguish certain subtle characteristics of the data. By taking into account these perceptual inequities, the information that is lost by a well-designed Lossy Compression algorithm will not be noticeable to the person viewing the restored image. The key idea behind compression is the distinction between data and information. Data are the bits, stored in the computer and, taken in and of themselves, carry no inherent meaning. Information, on the other hand, can be thought of as the message being conveyed by the data, or rather the meaning that can be inferred from the data. This distinction is made clear in the following example. † A megabyte is a million (10^6) bytes; a gigabyte is a billion (10^9) bytes; a terabyte is a trillion (10^12) bytes; a petabyte is a million billion (10^15) bytes; and an exabyte is a billion billion (10^18) bytes. EXAMPLE 8.1 Suppose you want to share with someone the number 3.1415926535897932384626433832… What is the data, and what is the information? Solution If you were to send this number as a series of digits, it would literally take forever. On the other hand, if you were to agree with the receiver beforehand that the Greek letter π represents the number, then you could simply send only a small number of bits. In this context, the information is the ratio of a circle’s circumference to its diameter, while the data is either the infinite string of digits or the single Greek letter. Obviously, the latter is a much more efficient encoding of the information than the former.
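    A tiny sketch of the point of Example 8.1, assuming only that sender and receiver share an agreed-upon symbol table beforehand: even the short prefix of π quoted above costs 30 bytes as ASCII digits, while the agreed symbol costs 2.

    # Data vs. information: the same information, two very different data sizes.
    # The shared "codebook" agreed on beforehand is an assumption of the example.
    digits = "3.1415926535897932384626433832"   # the prefix quoted in the example
    raw_bytes = len(digits.encode("ascii"))     # 30 bytes if we ship the digits
    codebook = {"pi": 3.141592653589793}        # agreed with the receiver beforehand
    symbol_bytes = len("pi".encode("ascii"))    # 2 bytes if we ship only the symbol
    print(raw_bytes, "bytes of digits vs.", symbol_bytes, "bytes for the agreed symbol")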
  • Video Coding for Wireless Communication Systems
    • King N. Ngan, Chi W. Yap, Keng T. Tan (Authors)
    • 2018 (Publication Date)
    • CRC Press (Publisher)
    …can bring this requirement down to a level such that one CD can be made to contain a movie 100 minutes in length. The compression method is known as lossless if the recovery is perfect. Lossy Compression results if there is a measurable loss of fidelity between the original and the recovered image. This loss of data is usually chosen to be an allowable loss and can never be recovered. However, the amount of storage memory required by Lossy Compression is much less than that required for lossless compression. The tradeoffs resulting from Lossy Compression include a lower picture quality, and perhaps a reduction in frame rate as well. Section 1.1 will introduce the reader to the fundamental concepts of video compression, as applied in this book. These concepts will cover still frame image coding and multiple frame video coding, in the spatial and temporal domains. Since this book also deals with the transmission of wavelet coded still images, some background about wavelet coding will be covered here. Sections 1.5 and 1.8 will cover the image and video codecs, respectively. Both the image and video codecs will be described in detail here with the additional work on error resilience to be covered in Chapters 5 and 6. 1.1 Image and Video Source Coding. Image coding can be treated as a subset of video coding as video coding can be considered to be a stream of slightly different images. Uncompressed video data at 25 frames per second (fps) is similar to many data sources in the respect that it contains a lot of redundant information. Within each frame, this redundancy is due to the significant correlation among neighbouring pixels, otherwise known as spatial correlation. There is also redundancy present between any two consecutive frames that can be taken advantage of, as there is usually a very large amount of similarity between them. This type of similarity is known as temporal correlation.
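    The spatial and temporal correlation described here can be measured directly. The sketch below (not from the book) builds a smooth synthetic frame and a nearly identical "next" frame, then computes the correlation between horizontally adjacent pixels and between co-located pixels of consecutive frames; NumPy is assumed and the frame generator is invented for illustration.

    # Rough sketch of the two kinds of redundancy discussed above, on synthetic frames.
    # Spatial: correlation between horizontally adjacent pixels within one frame.
    # Temporal: correlation between the same pixel in two consecutive frames.
    import numpy as np

    rng = np.random.default_rng(0)

    def smooth_frame(h=128, w=128):
        """A smooth 'natural-looking' frame: low-frequency pattern plus a little noise."""
        y, x = np.mgrid[0:h, 0:w]
        return np.sin(x / 17.0) + np.cos(y / 23.0) + 0.05 * rng.standard_normal((h, w))

    frame0 = smooth_frame()
    frame1 = frame0 + 0.02 * rng.standard_normal(frame0.shape)   # next frame: tiny change

    def corr(a, b):
        return np.corrcoef(a.ravel(), b.ravel())[0, 1]

    spatial  = corr(frame0[:, :-1], frame0[:, 1:])   # pixel vs. its right neighbour
    temporal = corr(frame0, frame1)                  # pixel vs. same pixel, next frame
    print(f"spatial correlation ~{spatial:.3f}, temporal correlation ~{temporal:.3f}")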
  • Computer Networks ISE: A Systems Approach
    • Larry L. Peterson, Bruce S. Davie (Authors)
    • 2007 (Publication Date)
    • Morgan Kaufmann (Publisher)
    Of course, when talking about Lossy Compression algorithms, processing resources are not the only factor. Depending on the exact application, users are willing to make very different trade-offs between bandwidth (or delay) and extent of information loss due to compression. For example, a radiologist reading a mammogram is unlikely to tolerate any significant loss of image quality and might well tolerate a delay of several hours in retrieving an image over a network. By contrast, it has become quite clear that many people will tolerate questionable audio quality in exchange for free global telephone calls (not to mention the ability to talk on the phone while driving). 7.2.1 Lossless Compression Algorithms We begin by introducing three lossless compression algorithms. We do not describe these algorithms in much detail—we just give the essential idea—since it is the lossy algorithms used to compress image and video data that are of the greatest utility in today’s network environment. We do comment, though, on how well these lossless algorithms work on digital imagery. Some of the ideas exploited by these lossless techniques show up again in later sections when we consider the lossy algorithms that are used to compress images. Run Length Encoding Run length encoding (RLE) is a compression technique with a brute-force simplicity. The idea is to replace consecutive occurrences of a given symbol with only one copy of the symbol, plus a count of how many times that symbol occurs—hence the name “run length.” For example, the string AAABBCDDDD would be encoded as 3A2B1C4D. RLE can be used to compress digital imagery by comparing adjacent pixel values and then encoding only the changes. For images that have large homogeneous regions, this technique is quite effective. For example, it is not uncommon that RLE can achieve compression ratios on the order of 8-to-1 for scanned text images.
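    A minimal sketch of the run-length encoding just described, reproducing the AAABBCDDDD -> 3A2B1C4D example. It assumes the symbols being encoded are not digits, since digits are used for the counts; it is an illustration of the idea, not the coder used in the book.

    # Minimal run-length encoding sketch matching the AAABBCDDDD -> 3A2B1C4D example.
    from itertools import groupby

    def rle_encode(s: str) -> str:
        # Replace each run of a character with "<count><character>".
        return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(s))

    def rle_decode(s: str) -> str:
        out, count = [], ""
        for ch in s:
            if ch.isdigit():
                count += ch                 # accumulate a (possibly multi-digit) count
            else:
                out.append(ch * int(count))
                count = ""
        return "".join(out)

    assert rle_encode("AAABBCDDDD") == "3A2B1C4D"
    assert rle_decode("3A2B1C4D") == "AAABBCDDDD"
    print(rle_encode("AAABBCDDDD"))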
  • Document and Image Compression
    • Mauro Barni (Author)
    • 2018 (Publication Date)
    • CRC Press (Publisher)
    Lossy image compression is widely deployed, e.g., using the classic JPEG standard [21]. This standard also has a less-known lossless version. Lossless compression has the advantage of avoiding the issue of whether the coding quality is sufficient. In critical applications, lossless coding may be mandatory. This includes applications where further processing is applied to the images. Examples are medical imaging, remote sensing, and space applications, where scientific fidelity is of paramount importance. In other areas such as prepress and film production, it is the visual fidelity after further processing which is of concern. In this chapter, we present an overview of techniques for lossless compression of images. The basis is techniques for coding gray-scale images. These techniques may be extended or modified in order to increase performance on color and multiband images, as well as image sequences. Lossless coding is performed in a modeling and a coding step. The focus of this text is on the paradigm of modeling by prediction followed by entropy coding. In JPEG, a simple linear prediction filter is applied. In recent efficient schemes, nonlinear prediction is applied based on choosing among a set of linear predictors. Both the predictors and the selection are based on a local neighborhood. The prediction residuals are coded using context-based entropy coding. Arithmetic coding provides the best performance. Variable-length codes related to Huffman coding allow faster implementations. An interesting alternative to predictive coding is established by the use of reversible wavelets. This is the basis of the lossless coding in JPEG2000 [50], providing progression to lossless. For color-mapped images having a limited number of colors per pixel, coding directly in the pixel domain may be an efficient alternative. This chapter is organized as follows. First, the general principles are introduced.
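    The "modeling by prediction followed by entropy coding" paradigm can be sketched in a few lines: predict each pixel from its left neighbour (one of the simple JPEG-style linear predictors) and compare the empirical entropy of the raw pixels with that of the residuals. The synthetic image and the mid-gray default for the first column are illustrative assumptions; a real coder would follow this with arithmetic or Huffman-style entropy coding.

    # Sketch of prediction followed by entropy coding: left-neighbour prediction
    # residuals have much lower empirical entropy than the raw pixels.
    import numpy as np

    def entropy_bits(values):
        """Empirical zeroth-order entropy in bits/symbol."""
        _, counts = np.unique(values, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    rng = np.random.default_rng(0)
    row = np.cumsum(rng.integers(-2, 3, size=256))          # smooth-ish scanline
    img = (np.tile(row, (256, 1)) + rng.integers(-1, 2, size=(256, 256))) % 256

    pred = np.empty_like(img)
    pred[:, 0]  = 128                      # no left neighbour: predict mid-gray
    pred[:, 1:] = img[:, :-1]              # left-neighbour predictor
    residual = (img - pred) % 256          # wrap residuals back into 0..255

    print(f"raw pixels : {entropy_bits(img):.2f} bits/pixel")
    print(f"residuals  : {entropy_bits(residual):.2f} bits/pixel")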
  • Information and coding theory in computer science
    • Zoran Gacovski (Author)
    • 2023 (Publication Date)
    • Arcler Press (Publisher)
    Lossless image compression is used to compress images in critical applications as it allows the exact original image to be reconstructed from the compressed one without any loss of the image data. Lossy image compression, on the other hand, suffers from the loss of some data. Thus, repeatedly compressing and decompressing an image results in poor image quality. An advantage of this technique is that it allows for a higher compression ratio than lossless compression [3,4]. Compression is achieved by removing one or more of the three basic data redundancies: • Coding redundancy, which is present when less than optimal code words are used; • Interpixel redundancy, which results from correlations between the pixels of an image; • Psychovisual redundancy, which is due to data that are ignored by the human visual system [5]. So, image compression becomes a solution to many imaging applications that require a vast amount of data to represent the images, such as document imaging management systems, facsimile transmission, image archiving, remote sensing, medical imaging, entertainment, HDTV, broadcasting, education and video teleconferencing [6]. One major difficulty that faces lossless image compression is how to protect the quality of the image in a way that the decompressed image appears identical to the original one. In this paper we are concerned with lossless image compression based on LZW and BCH algorithms, which compresses different types of image formats. The proposed method repeats the compression three times in order to increase the compression ratio. The proposed method is an implementation of lossless image compression. The steps of our approach are as follows: first, we perform a preprocessing step to convert the image in hand into binary. Next, we apply the LZW algorithm to the image to compress it.
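    A minimal LZW encoder/decoder sketch for the dictionary-coding step the excerpt mentions; the BCH stage and the repeated three-pass scheme of the proposed method are not reproduced, and the sample input is an arbitrary byte string.

    # Minimal LZW sketch: build the dictionary on the fly while encoding bytes
    # (e.g., a binarized image row), and rebuild the same dictionary while decoding.
    def lzw_encode(data: bytes) -> list[int]:
        table = {bytes([i]): i for i in range(256)}
        w, out = b"", []
        for b in data:
            wb = w + bytes([b])
            if wb in table:
                w = wb
            else:
                out.append(table[w])
                table[wb] = len(table)      # grow the dictionary
                w = bytes([b])
        if w:
            out.append(table[w])
        return out

    def lzw_decode(codes: list[int]) -> bytes:
        table = {i: bytes([i]) for i in range(256)}
        w = table[codes[0]]
        out = [w]
        for c in codes[1:]:
            entry = table[c] if c in table else w + w[:1]   # the classic KwKwK case
            out.append(entry)
            table[len(table)] = w + entry[:1]
            w = entry
        return b"".join(out)

    sample = b"TOBEORNOTTOBEORTOBEORNOT"
    codes = lzw_encode(sample)
    assert lzw_decode(codes) == sample
    print(len(sample), "bytes ->", len(codes), "codes")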
  • Introduction to Digital Audio
    • John Watkinson (Author)
    • 2013 (Publication Date)
    • Routledge (Publisher)
    Clearly with computer programs the corruption of a single bit can be catastrophic. Lossless coding is generally restricted to compression factors of around 2:1. It is important to appreciate that a lossless coder cannot guarantee a particular compression factor and the communications link or recorder used with it must be able to handle the variable output data rate. Audio material which results in poor compression factors on a given codec is described as difficult. It should be pointed out that the difficulty is often a function of the codec. In other words audio which one codec finds difficult may not be found difficult by another. Lossless codecs can be included in bit-error-rate testing schemes. It is also possible to cascade or concatenate lossless codecs without any special precautions. In lossy coding, data from the decoder are not identical bit-for-bit with the source data and as a result comparing the input with the output is bound to reveal differences. Clearly lossy codecs are not suitable for computer data, but are used in many audio coders, MPEG included, as they allow greater compression factors than lossless codecs. The most successful lossy codecs are those in which the errors are arranged so that the listener finds them subjectively difficult to detect. Thus lossy codecs must be based on an understanding of psychoacoustic perception and are often called perceptive codes. Perceptive coding relies on the principle of auditory masking, which was considered in Chapter 2. Masking causes the ear/brain combination to be less sensitive to sound at one frequency in the presence of another at a nearby frequency. If a first tone is present in the input, then it will mask signals of lower level at nearby frequencies. The quantizing of the first tone and of further tones at those frequencies can be made coarser. Fewer bits are needed and a coding gain results
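    A toy sketch of the masking principle described here: spectral components near a loud tone are quantized with a coarser step than components elsewhere. Real perceptive coders use measured masking thresholds and filter banks; the 300 Hz "masked band" and the two step sizes below are invented purely for illustration.

    # Toy illustration of masking-driven quantization: bins near the dominant tone get
    # a coarse quantizer step, everything else a fine one; the step sizes are invented.
    import numpy as np

    fs = 48_000
    t = np.arange(2048) / fs
    signal = np.sin(2 * np.pi * 1000 * t) + 0.01 * np.sin(2 * np.pi * 1150 * t)

    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

    masker_bin = np.abs(spectrum).argmax()                      # the dominant (masking) tone
    near_masker = np.abs(freqs - freqs[masker_bin]) < 300.0     # hypothetical masked band

    fine_step, coarse_step = 0.01, 0.5                          # invented quantizer steps
    steps = np.where(near_masker, coarse_step, fine_step)
    quantized = np.round(spectrum / steps) * steps              # fewer levels near the masker

    reconstructed = np.fft.irfft(quantized, n=len(signal))
    err = np.max(np.abs(reconstructed - signal))
    print(f"masked bins: {near_masker.sum()}, max time-domain error: {err:.4f}")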
  • Introduction to Information Theory and Data Compression
    • Greg A. Harris, D.C. Hankerson, Peter D. Johnson Jr. (Authors)
    • 2003 (Publication Date)
    Chapter 5 Lossless Data Compression by Replacement Schemes Most (but not all) modern data compression problems are of the following form: you have a long binary word (or “file”) W which you wish to transform into a shorter binary word U in such a way that W is recoverable from U, or, in ways to be defined case by case, almost or substantially recoverable from U. In case W is completely recoverable from U, we say we have lossless compression. Otherwise, we have Lossy Compression. The compression ratio is lgth(W)/lgth(U). The “compression ratio achieved by a method” is the average compression ratio obtained, using that method, with the average taken over all instances of W in the cases where the method is used. (This taking of the average is usually hypothetical, not actual.) Sometimes the file W is sitting there, available for leisurely perusal and sampling. Sometimes the file W is coming at you at thousands of bits per second, with immediate compression required and with no way of foretelling with certainty what the bit stream will be like 5 seconds from now. Therefore, our compression methods will be distinguished not only by how great a compression ratio they achieve, together with how much information they preserve, but also by how fast they work, and how they deal with fundamental changes in the stream W (such as changing from a stream in which the digits 0, 1 occur approximately randomly to one which is mostly 0’s). There is another item to keep account of in assessing and distinguishing between compression methods: hidden costs. These often occur as instructions for recovering W from U. Clearly it is not helpful to achieve great compression, if the instructions for recovering W from U take almost as much storage as W would. We will see another sort of hidden cost when we come to arithmetic coding: the cost of doing floating-point arithmetic with great precision.
  • Optical Satellite Data Compression and Implementation
    High compression ratios can be achieved. The higher the compression ratio is, the larger the compression error. A near-lossless compression technique lies between the lossless and Lossy Compression techniques. The error introduced by a near-lossless compression technique is bound by a predefined threshold, such as the RMSE or the accuracy of an application product. A near-lossless compression means that it is theoretically still a Lossy Compression due to its irreversibility; however, the loss of information caused by the compression is designed to have negligible or minor impact on the derivation of the ultimate data products or applications. Satellite data users often do not like lossy data compression and may be willing to accept near-lossless compression by trading off the gain and cost of the compression. For satellite data, Lossy Compression is normally not recommended because it will reduce the value of acquired data for their purpose. For this reason, lossy data compression is not a subject of this book. Instead, this book describes both lossless and near-lossless data compression techniques in this and following chapters. Lossless compression techniques can be generally classified into two categories: prediction-based and transform-based. The former is based on the predictive coding paradigm, whereby a current pixel is predicted from the previous pixels, and the prediction error is then entropy coded. 1,2 Lossless compression techniques that use a lookup-table or vector-quantization method are also categorized as prediction-based methods because both the lookup-table and vector-quantization methods are used to generate a prediction of the data. A vector-quantization-based lossless technique is an asymmetric compression process in which compression is much more computationally intensive than decompression. For prediction-based lossless compression, band-reordering techniques may also be applied before the prediction to improve the compression ratio.
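    A sketch of the near-lossless idea with a maximum-absolute-error bound (rather than an RMSE bound): quantize previous-pixel prediction residuals with step 2*delta+1 so that no reconstructed pixel deviates from the original by more than delta. The predictor and the delta value are illustrative, not those of any particular satellite codec.

    # Near-lossless sketch: previous-pixel prediction, residuals quantized so that the
    # per-pixel reconstruction error is bounded by delta. Prediction uses the decoder's
    # reconstruction, so the error never accumulates along the scanline.
    import numpy as np

    def near_lossless_line(line: np.ndarray, delta: int):
        """Encode/decode one scanline with a previous-pixel predictor; |error| <= delta."""
        step = 2 * delta + 1
        recon = np.empty_like(line)
        codes = []
        prev = 0                                      # decoder starts from the same state
        for x in line:
            residual = int(x) - prev
            q = int(np.round(residual / step))        # quantized residual -> entropy coder
            codes.append(q)
            prev = prev + q * step                    # decoder-side reconstruction
            recon[len(codes) - 1] = prev
        return codes, recon

    line = np.array([100, 101, 103, 103, 107, 120, 119, 118], dtype=np.int32)
    codes, recon = near_lossless_line(line, delta=2)
    print("codes:", codes)
    print("max |error|:", int(np.max(np.abs(recon - line))))   # never exceeds delta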
  • Fundamentals of Information Theory and Coding Design
    • Roberto Togneri, Christopher J.S. deSilva (Authors)
    • 2003 (Publication Date)
    This is an example of lossless compression, where no information is lost in the coding and decoding. When images are compressed, it may be permissible for the decompressed image not to have exactly the same pixel values as the original image, provided the difference is not perceptible to the eye. In this case, some form of Lossy Compression may be acceptable. This involves a loss of information between the coding and decoding processes. 4.3 Run-length Coding Run-length coding is a simple and effective means of compressing data in which it is frequently the case that the same character occurs many times in succession. This may be true of some types of image data, but it is not generally true for text, where it is rare for a letter of the alphabet to occur more than twice in succession. To compress a sequence, one simply replaces a repeated character with one instance of the character followed by a count of the number of times it occurs. For example, a 24-character sequence of repeated characters could be replaced in this way by 16 characters, reducing the number of characters from 24 to 16. To decompress the sequence, each combination of a character and a count is replaced by the appropriate number of characters. Protocols need to be established to distinguish between the characters and the counts in the compressed data. While the basic idea of run-length coding is very simple, complex protocols can be developed for particular purposes. The standard for facsimile transmission developed by the International Telephone and Telegraph Consultative Committee (CCITT) (now the International Telecommunications Union) [4] involves such protocols. 4.4 The CCITT Standard for Facsimile Transmission Facsimile machines have revolutionised the way in which people do business. Sending faxes now accounts for a major part of the traffic on telephone lines.
  • Digital Signal Compression: Principles and Practice
    3 Principles of lossless compression 3.1 Introduction Source coding began with the initial development of information theory by Shannon in 1948 [1] and continues to this day to be influenced and stimulated by advances in this theory. Information theory sets the framework and the language, motivates the methods of coding, provides the means to analyze the methods, and establishes the ultimate bounds in performance for all methods. No study of image coding is complete without a basic knowledge and understanding of the underlying concepts in information theory. In this chapter, we shall present several methods of lossless coding of data sources, beginning with the motivating principles and bounds on performance based on information theory. This chapter is not meant to be a primer on information theory, so theorems and propositions will be presented without proof. The reader is referred to one of the many excellent textbooks on information theory, such as Gallager [2] and Cover and Thomas [3], for a deeper treatment with proofs. The purpose here is to set the foundation and present lossless coding methods and assess their performance with respect to the theoretical optimum when possible. Hopefully, the reader will derive from this chapter both a knowledge of coding methods and an appreciation and understanding of the underlying information theory. The notation in this chapter will indicate a scalar source on a one-dimensional field, i.e., the source values are scalars and their locations are on a one-dimensional grid, such as a regular time or space sequence. Extensions to multi-dimensional fields, such as images or video, and even to vector values, such as measurements of weather data (temperature, pressure, wind speed) at points in the atmosphere, are often obvious once the scalar, one-dimensional field case is mastered. 3.2 Lossless source coding and entropy Values of data are not perfectly predictable and are only known once they are emitted by a source.
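    The entropy bound that this chapter builds toward can be illustrated with a toy source: the sketch below computes H(X) for a made-up distribution and builds a Huffman code for it, whose average length meets the bound exactly because the probabilities are dyadic. The distribution is invented for illustration.

    # Entropy bound vs. an actual code: Huffman average length for a toy source.
    import heapq
    import math

    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}   # made-up source distribution

    entropy = -sum(p * math.log2(p) for p in probs.values())

    # Build a Huffman code: repeatedly merge the two least probable subtrees.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    code = heap[0][2]

    avg_len = sum(probs[s] * len(code[s]) for s in probs)
    print(f"H(X) = {entropy:.3f} bits/symbol, Huffman average length = {avg_len:.3f}")
    # For dyadic probabilities like these, the Huffman code meets the entropy bound exactly.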
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.