Computer Science

What is ASCII

ASCII, or American Standard Code for Information Interchange, is a character encoding standard used in computers and communication equipment to represent text and control characters. It assigns a unique numerical value to each character, allowing for universal compatibility and communication between different systems. ASCII includes 128 standard characters, such as letters, numbers, and symbols, each represented by a 7-bit binary code.

Written by Perlego with AI-assistance

6 Key excerpts on "What is ASCII"

  • The Routledge Handbook of Chinese Applied Linguistics
    • Chu-Ren Huang, Zhuo Jing-Schmidt, Barbara Meisterernst(Authors)
    • 2019(Publication Date)
    • Routledge
      (Publisher)
    process generally refers to the recognition (taking a computer code as a symbol by definition) for storage (input), display (output) and handling of the text in computer systems. The definition process, also referred to as the encoding process, involves 1) the proper selection for a character set, followed by 2) the assignment of a unique binary code value to each symbol, referred to as a code point, with consideration of the script size, nature and efficiency for computer processing, among other things. The assignment results in a coded character set, or codeset for short. A codeset is a code table mapping all the characters to their respective unique code points.
    The American Standard Code for Information Interchange (ASCII) (American National Standards Institute 1986) can be used as an example to see how symbols from a writing system are defined as a codeset. The ASCII Table is the first commonly used codeset defined for computer use. It includes both the symbols used for writing English text and other symbols necessary for preparing text in computer systems. ASCII encodes characters using the so-called fixed-length encoding method, in which the code point for each character has the same binary length. This means that when you are processing a binary code sequence, you can read one character at a time by taking a fixed number of bits. For convenience, we use decimal numbers to refer to the assigned code points. The corresponding hexadecimal (HEX for short) numbers are short forms of the binary code points used in computer systems.
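    As a rough illustration of the fixed-length idea described above, the Python sketch below splits a bit string into 7-bit groups and maps each group back to a character. The helper names `decode_fixed_width` and `describe` and the sample bit string are illustrative assumptions, not part of the excerpt.

```python
def decode_fixed_width(bits: str, width: int = 7) -> str:
    """Decode a bit string by reading one fixed-width code point at a time."""
    if len(bits) % width != 0:
        raise ValueError("bit string length must be a multiple of the width")
    chars = []
    for i in range(0, len(bits), width):
        code_point = int(bits[i:i + width], 2)  # binary group -> integer
        chars.append(chr(code_point))
    return "".join(chars)


def describe(ch: str) -> str:
    """Show a character's code point in decimal, hexadecimal, and 7-bit binary."""
    cp = ord(ch)
    return f"{ch!r}: dec {cp}, hex {cp:02X}, bin {cp:07b}"


# "Hi" as two 7-bit ASCII code points: H = 1001000, i = 1101001
sample = "1001000" + "1101001"
print(decode_fixed_width(sample))
print(describe("A"))
```

    Because every code point has the same width, the decoder never needs a delimiter between characters; it simply advances by `width` bits each time.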
    The letters of the alphabet are placed in two separate blocks, one for upper case and one for lower case, for easier processing in computers. This design allows binary operations to be handled as quickly as possible. Take the letter ‘A’ as an example: its corresponding computer code in hexadecimal form is 41, which translates to the binary sequence 0100 0001. The hexadecimal code of ‘a’ is HEX 61, which translates to the binary sequence 0110 0001. Since the only difference between the two sequences is the third bit from the left, to change an upper case ‘A’ to its corresponding lower case ‘a’, the computer only needs to toggle that bit from 0 to 1. The reverse is to switch it from 1 to 0, a simple logical operation in computer circuits. This assignment is a part of the encoding design that allows such frequently used operations to be done efficiently at the binary level.
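    The single-bit case change described above can be sketched in Python as follows. The constant `CASE_BIT` and the function `toggle_case` are names chosen for this illustration; the bit in question is the third from the left in an 8-bit byte, which has the value 0x20.

```python
CASE_BIT = 0x20  # 0010 0000: the third bit from the left in an 8-bit byte


def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter by toggling one bit."""
    if not (ch.isascii() and ch.isalpha()):
        return ch  # leave digits, punctuation, and non-ASCII characters untouched
    return chr(ord(ch) ^ CASE_BIT)


print(f"{ord('A'):08b}")  # 01000001
print(f"{ord('a'):08b}")  # 01100001
print(toggle_case("A"), toggle_case("a"))  # a A
```

    The guard on `isascii()` and `isalpha()` matters: toggling the same bit on a digit or punctuation mark would produce an unrelated character.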
  • Spoken Language Reference Materials
    • Dafydd Gibbon, Roger Moore, Richard Winski(Authors)
    • 2020(Publication Date)
    For example, in 7-bit ASCII, the character a is encoded as the 7-bit integer number 97. A script consists of an alphabet and a set of rules that determines the direction of writing (left to right, right to left, up to down, etc.), and the composition of characters (placement of accents, combination of glyphs, etc.).
    A.2 ASCII
    ASCII codes come in various flavours: the original 7-bit ASCII code, platform dependent variations and extensions such as the Mac ASCII or the country pages of IBM PCs, multinational extensions such as the ISO 8859 family, and application dependent extensions such as ISO 8879 for SGML.
    7-bit ASCII
    7-bit ASCII (also known as US-ASCII, ANSI X3.4) as defined by the American National Standards Institute is the most widespread code for the computer representation of characters. The 128 numbers of US-ASCII are sufficient for the standard English alphabet, punctuation marks, digits, some mathematical operators, and control codes. However, for many uses, this code system is far too restricted. The ISO 646 family is a set of standards for 7-bit code tables which differ from US-ASCII in language dependent codes, e.g. in the German code table the square brackets and curly braces of the 7-bit ASCII are mapped to German umlauts, in the English code table the # is replaced by a £ symbol, etc.
    Platform dependent ASCII
    Many hardware vendors, especially in the PC market, implemented proprietary extensions to the 7-bit ASCII standard. The Macintosh uses an 8-bit ASCII extension which was meant to cover all languages using the Latin alphabet; complex characters could be composed from more than one single character, e.g. by adding an accent or a dieresis. On the IBM PC there exist various 8-bit ASCII extensions for individual languages. This reduces the need for character composition from single characters, but introduces incompatibilities between the different ASCII extensions.
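    The incompatibility between 8-bit extensions can be seen by decoding the same two bytes under different code tables. This Python sketch assumes the codecs `mac_roman`, `cp437` (the original IBM PC code page), and `latin_1` (ISO 8859-1) from Python's standard library as representatives of the extensions mentioned above; the byte values are arbitrary samples.

```python
raw = bytes([0x41, 0x8A])  # 'A' followed by one byte above 127

# Decode the same bytes with three 8-bit extensions of ASCII.
decoded = {codec: raw.decode(codec) for codec in ("mac_roman", "cp437", "latin_1")}

for codec, text in decoded.items():
    # ascii() escapes non-ASCII characters so the differences are visible
    print(f"{codec:10} {ascii(text)}")
```

    The first byte decodes to ‘A’ in every case because all three tables agree with ASCII below 128; only the second byte is interpreted differently.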
  • A Companion to Digital Literary Studies
    • Ray Siemens, Susan Schreibman(Authors)
    • 2013(Publication Date)
    • Wiley-Blackwell
      (Publisher)
    ASCII to 127.
    As can be learned immediately even from a cursory look at a table of the ASCII code, the repertoire of characters is suitable for almost no other language except English (one could theoretically also write Latin and Swahili, but in fact one would be hard pressed to write even a moderate essay with this repertoire, since it does not allow for foreign loan words, smart quotes, and other things that are frequently seen in modern English texts), since it defines no accented characters used in other European languages, not to mention languages like Arabic, Tibetan, or Chinese.
    ASCII is the ancestor and common subset of most character codes in use today. It was adopted by ISO (International Organization for Standardization) as ISO 646 in 1967; in 1972, country-specific versions that replaced some of the less frequently used punctuation characters with accented letters needed for specific languages were introduced. This resulted in a babylonic situation where French, German, Italian, and the Scandinavian languages all had mutually exclusive, incompatible adaptations which made it impossible to transfer data to other areas without recoding.
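    The “common subset” property can be checked with a small Python sketch. The list of encodings is an assumption made for this example (all are codecs shipped with Python), and the sample strings are arbitrary.

```python
text = "ASCII"
encodings = ("ascii", "latin_1", "utf_8", "cp1252", "mac_roman")

# Pure-ASCII text yields identical bytes in every ASCII-based encoding.
encoded = {name: text.encode(name) for name in encodings}
print(encoded)

# A character outside the 128-character repertoire cannot be encoded as ASCII.
try:
    "café".encode("ascii")
except UnicodeEncodeError as err:
    print("not representable in ASCII:", err.object[err.start:err.end])
```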
    Several attempts were made to improve this situation. In 1984, the Apple Macintosh appeared with the so-called MacRoman character set that allowed all languages of Western Europe to be used in the same document. The IBM codepage 850 (one of a variety of so-called codepages that could be used in DOS (disk operating system) environments) later achieved something similar. In the 1980s, an effort within the ISO finally succeeded in the publication of an international standard that would allow the combination of these languages, the group of ISO 8859 standards. This is a series of standards all based on ASCII, but they differ in the allocation of code points with values in the range 128–255. Of these, the first one, ISO 8859-1 (also known as Latin-1), is (albeit with some non-standard extensions) the “ANSI” used in the versions of the Microsoft Windows operating systems sold in Western Europe and the Americas. With the introduction of the European common currency, the euro, it became necessary to add the euro symbol to this character code; this version, with some additional modifications, has been adopted as ISO
  • PPI FE Mechanical Review Manual eText - 1 Year
    Chapter 62. Computer Software
    1. Character Coding
    2. Program Design
    3. Flowcharts
    4. Low-Level Languages
    5. High-Level Languages
    6. Relative Computational Speed
    7. Structure, Data Typing, and Portability
    8. Structured Programming
    9. Hierarchy of Operations
    10. Simulators
    11. Spreadsheets
    12. Spreadsheets in Engineering
    13. Fields, Records, and File Types
    14. File Indexing
    15. Sorting
    16. Searching
    17. Hashing
    18. Database Structures
    19. Hierarchical and Relational Data Structures
    20. Artificial Intelligence
    1. Character Coding
    Alphanumeric data refers to characters that can be displayed or printed, including numerals and symbols ($, %, &, etc.) but excluding control characters (tab, carriage return, form feed, etc.). Since computers can handle binary numbers only, all symbolic data must be represented by binary codes. Coding refers to the manner in which alphanumeric data and control characters are represented by sequences of bits.
    The American Standard Code for Information Interchange, ASCII, is a seven-bit code permitting 128 (2⁷) different combinations. It is commonly used in desktop computers, although use of the high order (eighth) bit is not standardized. ASCII-coded magnetic tape and disk files are used to transfer data and documents between computers of all sizes that would otherwise be unable to share data structures.
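    The arithmetic behind the seven-bit limit, and the role of the high-order bit, can be sketched in Python. The function name `is_seven_bit` is an assumption made for this example; it simply tests whether any byte has its eighth bit (0x80) set.

```python
ASCII_BITS = 7
print(2 ** ASCII_BITS)  # 128 distinct codes, numbered 0 to 127


def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte uses the high-order (eighth) bit."""
    return all((b & 0x80) == 0 for b in data)


print(is_seven_bit(b"plain text"))         # every byte is below 128
print(is_seven_bit(bytes([0x41, 0xE9])))   # 0xE9 has the eighth bit set
```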
    The Extended Binary Coded Decimal Interchange Code, EBCDIC (pronounced eb’-sih-dik), is in widespread use in IBM mainframe computers. It uses eight bits (one byte) for each character, allowing a maximum of 256 (2⁸) different characters.
    Since strings of binary digits (bits) are difficult to read, the hexadecimal (or “packed”) format is used to simplify working with EBCDIC data. Each byte is converted into two strings of four bits each. The two strings are then converted to hexadecimal. Since
  • The Architecture of Computer Hardware, Systems Software, and Networking
    • Irv Englander, Wilson Wong(Authors)
    • 2021(Publication Date)
    • Wiley
      (Publisher)
    By choosing a code in which the value of the binary number representing a character corresponds to the placement of the character within the alphabet, we can provide programs that sort data without even knowing what the data is, just by numerically sorting the codes that correspond to each character.
    Three alphanumeric codes are in common use. The three codes are known as Unicode, ASCII (which stands for American Standard Code for Information Interchange, pronounced “as-key” with a soft “s”), and EBCDIC (Extended Binary Coded Decimal Interchange Code, pronounced “ep-sǝ-dik”). EBCDIC was developed by IBM. Its use is restricted mostly to IBM and IBM-compatible mainframe computers and terminals. The Web makes EBCDIC particularly unsuitable for current work. Nearly everyone today uses Unicode or ASCII. Still, it will be many years before EBCDIC totally disappears from the landscape.
    The translation table for ASCII code is shown in Figure 4.3. The EBCDIC code is somewhat less standardized; the punctuation symbols have changed over the years. A recent EBCDIC code table is shown in Figure 4.4. The codes for each symbol are given in hexadecimal, with the most significant digit across the top and the least significant digit down the side. Both ASCII and EBCDIC codes can be stored in a byte. For example, the ASCII value for “G” is 47₁₆. The EBCDIC value for “G” is C7₁₆. When comparing the two tables, note that the standard ASCII code was originally defined as a 7-bit code, so there are only 128 entries in the ASCII table. EBCDIC is defined as an 8-bit code.

    LSD \ MSD   0     1     2      3   4   5   6   7
    0           NUL   DLE   space  0   @   P   `   p
    1           SOH   DC1   !      1   A   Q   a   q
    2           STX   DC2   "      2   B   R   b   r
    3           ETX   DC3   #      3   C   S   c   s
    4           EOT   DC4   $      4   D   T   d   t
    5           ENQ   NAK   %      5   E   U   e   u
    6           ACK   SYN   &      6   F   V   f   v
    7           BEL   ETB   '      7   G   W   g   w
    8           BS    CAN   (      8   H   X   h   x
    9           HT    EM    )      9   I   Y   i   y
    A           LF    SUB   *      :   J   Z   j   z
    B           VT    ESC   +      ;   K   [   k   {
    C           FF    FS    ,      <   L   \   l   |
    D           CR    GS    -      =   M   ]   m   }
    E           SO    RS    .      >   N   ^   n   ~
    F           SI    US    /      ?   O   _   o   DEL

    FIGURE 4.3 ASCII Code Table
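    The “G” example and the sorting remark in this excerpt can be checked with a short Python sketch. It assumes Python's bundled `cp037` codec as a stand-in for EBCDIC; that is only one of several EBCDIC variants, and the word `GRACE` is an arbitrary sample.

```python
# Hexadecimal code for "G" in ASCII and in one EBCDIC variant (cp037).
ascii_g = "G".encode("ascii")[0]
ebcdic_g = "G".encode("cp037")[0]
print(f"ASCII  G = {ascii_g:02X}")
print(f"EBCDIC G = {ebcdic_g:02X}")

# Sorting by numeric code alone puts the letters in alphabetical order,
# without the program knowing that the bytes represent letters.
codes = sorted("GRACE".encode("ascii"))
print(bytes(codes).decode("ascii"))
```

    Note that a purely numeric sort places all upper case letters before all lower case ones, since the two blocks occupy different code ranges.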
  • Cybercrime and Information Technology
    The Computer Network Infrastructure and Computer Security, Cybersecurity Laws, Internet of Things (IoT), and Mobile Devices


    • Alex Alexandrou(Author)
    • 2021(Publication Date)
    • CRC Press
      (Publisher)
    8
    7 Maini, Anil K. Digital electronics: principles, devices and applications. John Wiley & Sons, 2007.
    8 Danet, Brenda, and Susan C. Herring. “Introduction: The multilingual internet.” Journal of Computer-Mediated Communication 9, no. 1 (2003): JCMC9110.
    Another encoding system is the Extended Binary Coded Decimal Interchange Code (EBCDIC), developed in the late 1950s and early 1960s and used by International Business Machines (IBM).9 Since EBCDIC and ASCII are limited in the number of characters they can represent, Unicode was created jointly by the Unicode Consortium and the International Organization for Standardization (ISO). Unicode provides a unique number for every character, including emojis, regardless of the platform, program, or language.10
    9 Id. at 7.
    10 Unicode standard. https://home.unicode.org/basic-info/overview/
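    The idea of one number per character, as described in the paragraph on Unicode, can be illustrated with Python's built-in `ord` and `chr`. The sample characters are arbitrary choices for this sketch.

```python
# Print the Unicode code point of characters from several scripts.
for ch in ("A", "é", "中", "😀"):
    print(f"U+{ord(ch):04X}", ch)

# The first 128 Unicode code points coincide with the ASCII codes.
assert all(chr(i).encode("ascii") == bytes([i]) for i in range(128))
```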
    Table 1.10 displays the decimal numbers 0–25 and the letters of the alphabet with their binary values and ASCII codes.
    TABLE 1.10 Decimal numbers 0–25 and the alphabet in binary values and ASCII code
    Decimal  Binary     Uppercase  ASCII code  Binary     Lowercase  ASCII code  Binary
    0        0000 0000  A          065         0100 0001  a          097         0110 0001
    1        0000 0001  B          066         0100 0010  b          098         0110 0010
    2        0000 0010  C          067         0100 0011  c          099         0110 0011
    3        0000 0011  D          068         0100 0100  d          100         0110 0100
    4        0000 0100  E          069         0100 0101  e          101         0110 0101
    5        0000 0101  F          070         0100 0110  f          102         0110 0110
    6        0000 0110  G          071         0100 0111  g          103         0110 0111
    7        0000 0111  H          072         0100 1000  h          104         0110 1000
    8        0000 1000  I          073         0100 1001  i          105         0110 1001
    9        0000 1001  J          074         0100 1010  j          106         0110 1010
    10       0000 1010  K          075         0100 1011  k          107         0110 1011
    11       0000 1011  L          076         0100 1100  l          108         0110 1100
    12       0000 1100  M          077         0100 1101  m          109         0110 1101
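    Rows of this kind can be regenerated programmatically, which is a convenient way to check the values. The Python sketch below is one possible approach; the helper names `bits` and `table_row` and the column spacing are assumptions made for this illustration.

```python
def bits(value: int) -> str:
    """Format a byte as two groups of four bits, e.g. 65 -> '0100 0001'."""
    return f"{value >> 4:04b} {value & 0xF:04b}"


def table_row(n: int) -> str:
    """One row: decimal n, its binary value, and the n-th letter in both cases."""
    upper = chr(ord("A") + n)
    lower = chr(ord("a") + n)
    return (f"{n:>2}  {bits(n)}  "
            f"{upper}  {ord(upper):03d}  {bits(ord(upper))}  "
            f"{lower}  {ord(lower):03d}  {bits(ord(lower))}")


for n in range(13):  # rows 0 to 12, as shown in the excerpt
    print(table_row(n))
```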
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.