Data Representation in Computer Science

Data representation in computer science refers to the methods used to store and manipulate data in a computer system. It involves encoding data into a format that can be processed by the computer, such as binary or hexadecimal. Different data types, such as integers, floating-point numbers, characters, and strings, are represented using specific formats to facilitate efficient computation and storage.
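As a minimal illustration of the definition above, the Python sketch below prints one integer in binary and hexadecimal, the bytes that store a short string under UTF-8, and the 64-bit pattern of a floating-point number. The sample values (`202`, `"Hi"`, `0.1`) and the helper name `float_bits` are assumptions chosen for illustration, and the float layout assumes the common IEEE 754 double format.

```python
import struct


def float_bits(x: float) -> str:
    """Return the 64-bit IEEE 754 pattern of x as a string of 0s and 1s."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    return format(as_int, "064b")


n = 202
print(format(n, "b"))               # binary digits of the integer
print(format(n, "x"))               # the same value in hexadecimal
print("Hi".encode("utf-8").hex())   # the bytes that store the string
print(float_bits(0.1))              # 1 sign bit, 11 exponent bits, 52 fraction bits
```

Hexadecimal is only a compact way of writing the same bits: each hex digit stands for four binary digits.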

Written by Perlego with AI-assistance

5 Key excerpts on "Data Representation in Computer Science"

  • Book cover image for: Systems Architecture
    Computers manipulate and store a variety of data, such as numbers, text, sound, and pictures. This chapter describes how data is represented and stored in computer hardware. It also explains how simple data types are used as building blocks to create more complex data structures, such as arrays and records. Understanding data representation is key to understanding hardware and software technologies.
    DATA REPRESENTATION AND PROCESSING
    People can understand and manipulate data represented in a variety of forms. For example, they can understand numbers represented symbolically as Arabic numerals (such as 8714), Roman numerals (such as XVII), and simple lines or tick marks on paper (for example, ||| to represent the value 3). They can understand words and concepts represented with pictorial characters ( ) or alphabetic characters (“computer” and компьютер, Cyrillic text of the Russian word for computer) and in the form of sound waves (spoken words). People also extract data from visual images (photos and movies) and from the senses of taste, smell, and touch. The human brain’s processing power and flexibility are evident in the rich variety of data representations it can recognize and understand.
    To be manipulated or processed by the brain, external data representations, such as printed text, must be converted to an internal format and transported to the brain’s processing “circuitry.” Sensory organs convert inputs, such as sight, smell, taste, sound, and skin sensations, into electrical impulses that are transported through the nervous system to the brain. Processing in the brain occurs as networks of neurons exchange data electrically.
    CHAPTER 3 DATA REPRESENTATION
    CHAPTER GOALS
    • Describe numbering systems and their use in data representation
    • Compare different data representation methods
    • Summarize the CPU data types and explain how nonnumeric data is represented
    • Describe common data structures and their uses
    Copyright 2016 Cengage Learning.
  • Book cover image for: Computer Systems Architecture
    Chapter 2 Data Representation
    DATA REPRESENTATION
    The widespread availability of computers and computer-based systems requires a precise definition of data representation. Although human communication with computers is at a high level and most users do not care about the internal representation, it is needed to assure proper functioning of the system. This definition is not different from the “protocols” that were defined in order to provide communications between humans themselves, such as the natural languages. Writing was developed in order to provide a mechanism for representing language in a more visual form. This is done by a set of symbols (letters and numbers) that represent sounds defined in the language. Only after the language was defined could the writing symbols (letters, numbers) be developed, and this paved the way for written communication between humans, that is, books and newspapers as well as information displayed and printed by computers. The agreed-upon convention for representing a natural language was developed in the early stages of human development, and it provided the mechanism for written communication that is not confined to face-to-face discussions. Very well-known examples are ancient Egyptian hieroglyphs and the Cuneiform scripts, which were used over 5000 years ago. Rapid technological advancements and the development of analog and later digital communication links provide the means to communicate with people even if they are far away. For establishing such communications, the various systems (i.e., telephone, telegraph, facsimile, etc.) had to use a predefined encoding system. Such a system that has already been mentioned was the Hollerith punched card, which used the holes in the card to represent data.
    The fast development of the Internet and the fact it is a global system required special attention to data representation standards. These standards provide the basic platform for data transfers between all connected devices. Furthermore, since all modern computers use the binary system, the standards have to define the binary representation of data as well. This data may include numbers (integers, real and complex numbers), text, and special symbols. An important aspect of the representation system applicable to numbers is its ability to support computations (as will be explained in the section “Computer’s Arithmetic” in this chapter).
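To illustrate why connected devices need an agreed binary representation, here is a hedged Python sketch. It encodes the same 32-bit integer in big-endian and little-endian byte order and shows that a receiver assuming the wrong order reads a different number. The value `1` and the variable names are assumptions chosen for illustration; the excerpt does not name a specific standard.

```python
import struct

value = 1

big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first
print(big.hex(), little.hex())

# A receiver that assumes the wrong byte order decodes a different number.
(misread,) = struct.unpack("<I", big)
print(misread)

# Text needs an agreed encoding too: UTF-8 fixes the bytes for each character.
print("é".encode("utf-8").hex())
```

Both sides have to agree on the byte order and the text encoding in advance, which is the role the standards described above play.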
  • Book cover image for: Statistical Data Cleaning with Applications in R
    • Mark van der Loo, Edwin de Jonge(Authors)
    • 2018(Publication Date)
    • Wiley (Publisher)
    Chapter 3 Technical Representation of Data
    Ideally, data analysts should not have to worry too much about how exactly data is technically stored in computer memory. Indeed, much effort in computer engineering has gone into trying to abstract away technical details, such as how a real number is represented as a sequence of bits.
    In practice, however, consequences of technical choices eventually pop up. For example, there is probably hardly any computer user who has not at some point seen symbols similar to or appearing in their text editor. Such symbols indicate that the program displaying the text was not able to translate the string of bytes it read into readable characters. As a second example, consider the output of the following calculation in R.
    if (1 - 0.9 == 0.1) print("ok") else print("oh no!")
    ## [1] "oh no!"
    Although it seems reasonable to expect "ok", apparently 1 - 0.9 is not precisely equal to 0.1 for a computer, although the difference is admittedly small.
    (1 - 0.9) - 0.1
    ## [1] -2.775558e-17
    These examples are forms of what is commonly termed abstraction leakage: issues that have to do with the underlying technical representation of data exposed to the user. In the case of the misrepresentation of text, the problem is that the text editor assumes text is stored in one encoding, while it is stored in another. In the case of the ‘failed’ calculation, the small difference between the expected and obtained result is the result of a trade-off: the lack of precision is small enough to be well compensated by the vast gain in speed for nearly all applications that involve statistical data processing. We will encounter numerical precision issues further on in this book, when we discuss algorithms for error localization and value adaption.
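Both forms of abstraction leakage can be reproduced outside R. The Python sketch below is an illustration under assumed inputs: the sample word `"naïve"` and the choice of Latin-1 as the wrong encoding are not from the text. It decodes UTF-8 bytes with the wrong encoding, then compares floating-point values with a tolerance instead of `==`.

```python
import math

# Encoding mismatch: bytes written as UTF-8 but read as Latin-1.
stored = "naïve".encode("utf-8")
garbled = stored.decode("latin-1")
print(garbled)

# Reading the bytes with the encoding they were written in restores the text.
print(stored.decode("utf-8"))

# Floating point: exact comparison fails, a tolerance-based one succeeds.
difference = 1 - 0.9
print(difference == 0.1)
print(math.isclose(difference, 0.1))
```

In R, a comparable tolerance-based check is `isTRUE(all.equal(1 - 0.9, 0.1))`.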
    This chapter discusses the technical representation of the most important data types: integers, real numbers, and text while exposing caveats one may encounter while programming with data. Special attention is paid to representation of these data types in R.
  • Book cover image for: The Tao of Computing
    CHAPTER 2 How Are Numbers and Characters Represented in a Computer (and Who Cares)?
    When we use computer applications, we interact with the computer at the user level, normally focusing surprisingly little attention on the technology that allows us to accomplish our work at hand. For example, when browsing the Internet, we likely think about the topic we are researching and the information on the screen rather than the servers and networks that make our communication possible. When writing using word processors, we concentrate on the content we want to convey and consider the application’s role only in such matters as type font, type size, and page layout. As users, we work with what we “see”; rarely do we stop and think about how the computer processes and organizes our data so that we can see them. Because the machine handles so many behind-the-scenes tasks, we can easily ignore the technical details of how data are represented and stored. We let the machine make the technical decisions, and our work can progress smoothly.
    The computer’s technical decisions about data representation, however, can affect our work in several ways. It is to our advantage to understand how the storage of information impacts:
    • The accuracy of our results
    • The speed of processing
    • The range of alphabets available to us
    • The size of the files we must store
    • The appearance of the graphics we see on the screen or printed on a page
    • The time it takes for materials to download on the Internet
    An awareness of data representation and its consequences can guide us as we develop our own materials and as we use the materials of others. This chapter focuses on the storage of numbers, characters, and other non-pictorial data, as a way to understand underlying fundamentals of data storage. Chapter 3 can then draw upon these ideas as it considers the storage of images.
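Two of the effects listed in this excerpt, file size and the range of alphabets, can be sketched in a few lines of Python. The sample strings and the helper `encoded_size` are assumptions for illustration and are not taken from the book.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


english = "computer"
russian = "компьютер"

# The same text occupies a different number of bytes in each encoding.
for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    print(enc, encoded_size(english, enc), encoded_size(russian, enc))

# ASCII covers only 128 characters, so it cannot store the Cyrillic word.
try:
    russian.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

The choice of encoding therefore trades storage size against the set of characters that can be represented at all.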
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.