Computer Science

Data Compression

Data compression is the process of reducing the size of data to save storage space or transmission time. It is achieved by encoding information using fewer bits than the original representation. This can be done with a range of algorithms and techniques: lossless compression, which preserves all of the original data, or lossy compression, which sacrifices some data to achieve higher compression ratios.

Written by Perlego with AI-assistance

8 Key excerpts on "Data Compression"

Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.
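
To make the lossless/lossy distinction described above concrete, here is a minimal sketch in Python (standard-library zlib only; the sample text is invented) showing lossless compression: the data shrinks, and decompression restores it bit for bit.

```python
import zlib

# Hypothetical sample: repetitive text, which compresses well losslessly.
original = b"the quick brown fox jumps over the lazy dog. " * 100

compressed = zlib.compress(original, 9)   # DEFLATE, a lossless method
restored = zlib.decompress(compressed)

assert restored == original               # every bit of the original is preserved
print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {len(original) / len(compressed):.1f}:1")
```

A lossy scheme, by contrast, would discard detail before encoding, so the decoded output would only approximate the original.
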
  • Software-Defined Data Infrastructure Essentials

    Cloud, Converged, and Virtual Fundamental Server Storage I/O Tradecraft

    ...Data Compression is widely used in IT and consumer electronics environments. It is implemented in hardware and software to reduce the size of data to create a corresponding reduction in network bandwidth or storage capacity. If you have used a traditional or TCP/IP-based telephone or cell phone, watched a DVD or HDTV, listened to an MP3, transferred data over the Internet or used email, you have likely relied on some form of compression technology that is transparent to you. Some forms of compression are time-delayed, such as using PKZIP to zip files, while others are real-time or on the fly, such as when using a network, cell phone, or listening to an MP3. Compression technology is very complementary to archive, backup, and other functions, including supporting on-line primary storage and data applications. Compression is commonly implemented in several locations, including databases, email, operating systems, tape drives, network routers, and compression appliances, to help reduce your data footprint. 11.2.2.5.1 Compression Implementation: Approaches to Data Compression vary in time delay or impact on application performance as well as in the amount of compression and loss of data. Two approaches that focus on data loss are lossless (no data loss) and lossy (some data loss for higher compression ratio)...
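
The excerpt's distinction between time-delayed and real-time (on-the-fly) compression can be sketched with Python's standard zlib module: a streaming compressor emits output as each chunk arrives instead of waiting for the whole file. The chunked data source below is hypothetical.

```python
import zlib

def compress_stream(chunks):
    """Compress data on the fly, chunk by chunk, as it arrives."""
    co = zlib.compressobj(level=6)
    out = []
    for chunk in chunks:
        out.append(co.compress(chunk))  # compressed bytes are emitted incrementally
    out.append(co.flush())              # flush whatever is still buffered
    return b"".join(out)

# Hypothetical stream of arriving blocks (e.g., network packets or file reads).
stream = (b"sensor reading 42\n" * 50 for _ in range(10))
print(len(compress_stream(stream)), "bytes after on-the-fly compression")
```

Batch tools such as PKZIP do the same work after the fact, over the complete file, which is the time-delayed case the excerpt describes.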

  • How Video Works

    From Broadcast to the Cloud

    • Diana Weynand, Vance Piccin (Authors)
    • 2015 (Publication Date)
    • Routledge (Publisher)

    ...14 Compression: Compression is the process of reducing data in a digital signal by eliminating redundant information. This process reduces the amount of bandwidth required to transmit the data and the amount of storage space required to store it. Any type of digital data can be compressed. Reducing the required bandwidth permits more data to be transmitted at one time. Compression can be divided into two categories: lossless and lossy. In lossless compression, the restored image is an exact duplicate of the original with no loss of data. In lossy compression, the restored image is an approximation, not an exact duplicate, of the original (Figure 14.1). Lossless Compression: In lossless compression, the original data can be perfectly reconstructed from the compressed data that was contained in the original image. Compressing a document is a form of lossless compression in that the restored document must be exactly the same as the original. It cannot be an approximation. In the visual world, lossless compression lends itself to images that contain large quantities of repeated information, such as an image that contains a large area of one color, perhaps a blue sky. Computer-generated images or flat colored areas that do not contain much detail—e.g., cartoons, graphics, and 3D animation—also lend themselves to lossless compression. Figure 14.1: Lossless vs Lossy Compression. One type of lossless compression commonly used in graphics and computer-generated images (CGI) is run-length encoding. These images tend to have large portions using the same colors or repeated patterns. Every pixel in a digital image is composed of the three component colors—red, green, and blue—and every pixel has a specific value for each color...
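
The run-length encoding idea in this excerpt can be shown in a few lines. The sketch below is a toy version in Python (not the specific RLE variant used by any particular image format): runs of identical pixel values collapse into (value, count) pairs, and decoding restores the scan line exactly, so it is lossless.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (value, run_length) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]

# A scan line dominated by one "sky" value compresses very well.
scan_line = [17] * 200 + [34, 34, 56] + [17] * 50
encoded = rle_encode(scan_line)
assert rle_decode(encoded) == scan_line   # lossless round trip
print(encoded)                            # [(17, 200), (34, 2), (56, 1), (17, 50)]
```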

  • Art of Digital Audio
    • John Watkinson (Author)
    • 2013 (Publication Date)
    • Routledge (Publisher)

    ...5 Compression. 5.1 Introduction: Compression, bit rate reduction and data reduction are all terms which mean basically the same thing in this context. In essence the same (or nearly the same) audio information is carried using a smaller quantity or rate of data. It should be pointed out that in audio, compression traditionally means a process in which the dynamic range of the sound is reduced, typically by broadcasters wishing their station to sound louder. However, when bit rate reduction is employed, the dynamics of the decoded signal are unchanged. Provided the context is clear, the two meanings can co-exist without a great deal of confusion. There are several reasons why compression techniques are popular: (a) Compression extends the playing time of a given storage device. (b) Compression allows miniaturization. With fewer data to store, the same playing time is obtained with smaller hardware. This is useful in portable and consumer devices. (c) Tolerances can be relaxed. With fewer data to record, storage density can be reduced, making equipment which is more resistant to adverse environments and which requires less maintenance. (d) In transmission systems, compression allows a reduction in bandwidth which will generally result in a reduction in cost. This may make possible some process which would be uneconomic without it. (e) If a given bandwidth is available to an uncompressed signal, compression allows faster than real-time transmission within that bandwidth. (f) If a given bandwidth is available, compression allows a better-quality signal within that bandwidth. Figure 5.1: In (a) a compression system consists of compressor or coder, a transmission channel and a matching expander or decoder. The combination of coder and decoder is known as a codec. (b) MPEG is asymmetrical since the encoder is much more complex than the decoder. Compression is summarized in Figure 5.1. It will be seen in (a) that the PCM audio data rate is reduced at source by the compressor...

  • Introduction to Digital Audio
    • John Watkinson (Author)
    • 2013 (Publication Date)
    • Routledge (Publisher)

    ...5 Compression. 5.1 Introduction: Compression, bit rate reduction and data reduction are all terms which mean basically the same thing in this context. In essence the same (or nearly the same) audio information is carried using a smaller quantity and/or rate of data. It should be pointed out that in audio, compression traditionally means a process in which the dynamic range of the sound is reduced, typically by broadcasters wishing their station to sound louder. However, when bit rate reduction is employed, the dynamics of the decoded signal are unchanged. Provided the context is clear, the two meanings can co-exist without a great deal of confusion. There are several reasons why compression techniques are popular: (a) Compression extends the playing time of a given storage device. (b) Compression allows miniaturization. With fewer data to store, the same playing time is obtained with smaller hardware. This is useful in portable and consumer devices. (c) Tolerances can be relaxed. With fewer data to record, storage density can be reduced, making equipment which is more resistant to adverse environments and which requires less maintenance. (d) In transmission systems, compression allows a reduction in bandwidth which will generally result in a reduction in cost. This may make possible some process which would be uneconomic without it. (e) If a given bandwidth is available to an uncompressed signal, compression allows faster than real-time transmission within that bandwidth. (f) If a given bandwidth is available, compression allows a better-quality signal within that bandwidth. Compression is summarized in Figure 5.1. It will be seen in (a) that the PCM audio data rate is reduced at source by the compressor. The compressed data are then passed through a communication channel and returned to the original audio rate by the expander. The ratio between the source data rate and the channel data rate is called the compression factor. The term coding gain is also used...
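
The compression factor defined at the end of this excerpt is simply the source data rate divided by the channel data rate. The arithmetic below uses illustrative figures (a 48 kHz, 16-bit stereo PCM source and a 192 kbit/s channel), not numbers from the book.

```python
# Uncompressed source: 48 kHz sampling, 16 bits per sample, 2 channels.
source_rate_bps = 48_000 * 16 * 2      # 1,536,000 bits per second

# Hypothetical channel rate after compression.
channel_rate_bps = 192_000             # 192 kbit/s

compression_factor = source_rate_bps / channel_rate_bps
print(f"compression factor: {compression_factor:.1f}:1")   # 8.0:1
```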

  • Compression for Great Video and Audio

    Master Tips and Common Sense

    • Ben Waggoner (Author)
    • 2013 (Publication Date)
    • Routledge (Publisher)

    ...Compression is sometimes called “entropy coding,” since what you’re really saving is the entropy (randomness) in the data, while the stuff that could be predicted from that entropy is what gets compressed away to be reconstructed on decode. The More Efficient the Coding, the More Random the Output: Using a codebook makes the file smaller by reducing redundancy. Because there is less redundancy, there is by definition less of a pattern to the data itself, and hence the data itself looks random. You can look at the first few dozen characters of a text file, and immediately see what language it’s in. Look at the first few dozen characters of a compressed file, and you’ll have no idea what it is. Data Compression: Data Compression is compression that works on arbitrary content, like computer files, without having to know much in advance about their contents. There have been many different compression algorithms used over the past few decades. Ones that are currently available use different techniques, but they share similar properties. The most-used Data Compression technique is Deflate, which originated in PKWare’s .zip format and is also used in .gz files, .msi installers, HTTP header compression, and many, many other places. Deflate was even used in writing this book—Microsoft Word’s .docx format (along with all Microsoft Office “.???x” formats) is really a directory of files that are then Deflated into a single file. For example, the longest chapter in my current draft (“Production, Post, and Acquisition”) is 78,811 bytes. Using Deflate, it goes down to 28,869 bytes. And if I use an advanced text-tuned compressor like PPMd (included in the popular 7-Zip tool), it can get down to 22,883 bytes. But that’s getting pretty close to the theoretical lower limit for how much this kind of content can be compressed...
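
The byte counts quoted above were measured with Deflate and PPMd on the author's own chapter; the sketch below repeats the same kind of measurement on an invented sample using only Python's standard library (PPMd is not available there, so bzip2 and LZMA stand in as the more advanced compressors). The exact numbers will of course differ from the book's.

```python
import zlib, bz2, lzma

# Invented stand-in for a chapter of prose; real text behaves similarly.
text = ("Production, post, and acquisition each impose their own "
        "constraints on how far a chapter of prose will shrink. ") * 500
raw = text.encode("utf-8")

for name, compress in [("deflate", lambda d: zlib.compress(d, 9)),
                       ("bzip2",   lambda d: bz2.compress(d, 9)),
                       ("lzma",    lambda d: lzma.compress(d))]:
    print(f"{name:8s} {len(raw):7,d} -> {len(compress(raw)):7,d} bytes")
```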

  • The Manual of Photography
    • Elizabeth Allen, Sophie Triantaphillidou (Authors)
    • 2012 (Publication Date)
    • Routledge (Publisher)

    ...At its most fundamental, information is knowledge about something and is an inherent quality of an image. In communication terms, information is contained in any message transmitted between a source and a receiver. Data are the means by which the message is transmitted and are a collection of organized information. In a digital image, the information is contained in the arrangement of pixel values, but the data are the set of binary digits that represent it when it is transmitted or stored. Information theory is a branch of applied mathematics providing a framework allowing the quantification of the information generated or transmitted through a communication channel (see Chapter 24). This framework can be applied to many types of signal. In image compression the digital image is the signal, and is being transmitted through a number of communication channels as it moves through the digital imaging chain. The process of compression involves reduction in the data representing the information or a reduction in the information content itself. Data reduction is generally achieved by finding more efficient methods to represent (encode) the information. In an image containing a certain number of pixels, each pixel may be considered as an information source. The information content of the image relates to the probabilities of each pixel taking on one of n possible values. The range of possible values, as we have already seen in Chapter 24, is related to the bit depth of the image. The process of gaining information is equivalent to the removal of uncertainty. Therefore, information content may be regarded as a measure of predictability: an image containing pixels which all have the same pixel value has a high level of predictability and therefore an individual pixel does not give us much information. It is this idea upon which much of the theory of compression is based...
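
The predictability argument here is what Shannon entropy measures: an image whose pixels all share one value carries almost no information per pixel, while a noisy image approaches the full bit depth. A minimal sketch (Python standard library; the two toy "images" are invented) estimates entropy from the pixel-value histogram.

```python
import math
import random
from collections import Counter

def entropy_bits_per_pixel(pixels):
    """Shannon entropy: sum of p(v) * log2(1 / p(v)) over the pixel-value histogram."""
    counts = Counter(pixels)
    total = len(pixels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

flat_sky = [200] * 10_000                               # one value: fully predictable
noise = [random.randrange(256) for _ in range(10_000)]  # 8-bit random noise

print(f"flat image:  {entropy_bits_per_pixel(flat_sky):.2f} bits/pixel")   # 0.00
print(f"noisy image: {entropy_bits_per_pixel(noise):.2f} bits/pixel")      # close to 8
```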

  • Understanding Digital Cinema

    A Professional Handbook

    • Charles S. Swartz (Author)
    • 2004 (Publication Date)
    • Routledge (Publisher)

    ...In one respect this is obviously the ideal form of compression in that (assuming error-free transmission) there can be no possibility of degradation. This is lossless compression, and it does have practical applications. Well-known computer programs such as PK-Zip and Stuffit are lossless compression systems. They can take a computer file, make it more compact for storage or transmission, and then restore a perfect copy of the original. Unfortunately, lossless systems generally do not provide sufficient compression for large-scale imagery applications such as Digital Cinema distribution. Typically, lossless systems can compress image data by factors in the range of two or three to one; a useful degree of compression, certainly, but not enough to make Digital Cinema practical. Recently there have been claims that new techniques can provide much higher compression ratios but—at the time of writing—no independent tests have verified these claims. So the majority of this chapter will be devoted to the characteristics and design of lossy compression systems; systems that are likely to meet the practical needs of Digital Cinema distribution. However, lossless compression does still play an important role. These techniques may be used with almost any source of data, including the output data of a lossy compression system. So practical compression systems usually consist of a lossy front end followed by a lossless section (known as the entropy coder) to reduce the bit rate even further. Lossy Compression: For the foreseeable future, Digital Cinema will require the use of compression systems that are not lossless: systems that discard or distort some of the information in the original image data, or lossy compression...
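
The "lossy front end followed by a lossless entropy coder" structure can be sketched in a few lines of Python: coarse quantization stands in for the lossy stage and Deflate stands in for the entropy coder. This is a toy illustration of the pipeline shape, not how any real Digital Cinema codec works, and the data is invented.

```python
import random
import zlib

def lossy_quantize(samples, step):
    """Lossy front end: throw away precision by coarse quantization."""
    return bytes((s // step) * step for s in samples)

# Hypothetical 8-bit image samples: a mid-grey field with mild noise.
random.seed(0)
samples = [min(255, max(0, 128 + random.randint(-6, 6))) for _ in range(50_000)]
raw = bytes(samples)

lossless_only = zlib.compress(raw, 9)                              # entropy coder alone
lossy_then_lossless = zlib.compress(lossy_quantize(samples, 16), 9)

print(f"raw:              {len(raw):6d} bytes")
print(f"lossless only:    {len(lossless_only):6d} bytes")
print(f"lossy + lossless: {len(lossy_then_lossless):6d} bytes")
```

Discarding detail before entropy coding is what buys the extra ratio; the quantized values can never be recovered exactly, which is why the combined system is lossy.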

  • The Technology of Video and Audio Streaming
    • David Austerberry (Author)
    • 2013 (Publication Date)
    • Routledge (Publisher)

    ...These parameters are then coded into data packets for streaming. There are four main redundancies present in the video signal: spatial, temporal, perceptual, and statistical. Compression can be lossless or lossy. If all the original information is preserved, the codec is called lossless. A typical example for basic file compression would be ZIP. To achieve the high levels of compression demanded by streaming codecs, the luxury of lossless codecs is not possible – the data reduction is insufficient. Spatial redundancy occurs where neighboring pixels in a frame of a video signal are related; it could be an object of a single color. If consecutive pictures also are related there is temporal redundancy. The human visual system has psychovisual redundancy; not all the visual information is treated with the same relevance. An example is lower acuity to color detail than luminance. Finally, not all parameters occur with the same probability in an image. This statistical redundancy can be used in the coding of the image parameters. For example, frequently occurring parameters can be coded with fewer bits (Huffman coding). The goal with compression is to avoid artifacts that are perceived as unnatural. The fine detail in an image can be degraded gently without losing understanding of the objects in a scene. As an example we can watch a 70-mm print of a movie or a VHS transfer and in both cases still enjoy the experience. If too much compression is applied, and the artifacts interfere with the image perception, the compression has become unnaturally lossy. Table 5.1 lists some of the more popular technologies that have been used for encoding streaming files. The techniques may be combined within codecs. For example, MPEG-2 divides the picture into blocks. Each block is encoded using a spatial transform, and the data is then run-length encoded. Blocks that repeat from frame to frame have temporal redundancy...
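
The statistical-redundancy point, that frequently occurring values should get shorter codes, is exactly what Huffman coding does. Below is a textbook-style construction in plain Python (using heapq), not the particular Huffman variant of any specific codec; the symbol frequencies are invented.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman table: frequent symbols receive shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:                            # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)         # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# Invented distribution: 'a' dominates, so it should get the shortest code.
data = "a" * 60 + "b" * 25 + "c" * 10 + "d" * 5
for symbol, code in sorted(huffman_code(data).items()):
    print(symbol, code)   # 'a' gets 1 bit; the rare 'c' and 'd' get 3 bits each
```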