ASCII
ASCII (an acronym for American Standard Code for Information Interchange), generally pronounced [áski] or (rarely) [ásθi] or [ási], is a character code based on the Latin alphabet as used in modern English. It was created in 1963 by the American Standards Association (ASA, known since 1969 as the American National Standards Institute, or ANSI) as an evolution of the code sets then used in telegraphy. Later, in 1967, lowercase letters were added and some control codes were redefined to form the code known as US-ASCII.
The ASCII code uses 7 bits to represent characters, although initially an additional bit (a parity bit) was used alongside them to detect transmission errors. Various 8-bit character codes that extend ASCII with characters specific to languages other than English, such as the ISO/IEC 8859-1 standard, are often incorrectly referred to as ASCII.
ASCII was first published as a standard in 1963, was substantially revised in 1967, and was last updated in 1986. It currently defines codes for 33 non-printing characters (codes 0 through 31 and code 127), most of which are control characters that affect how text is processed, plus 95 printable characters (codes 32 through 126, starting with the space character).
Almost all current computer systems use the ASCII code or a compatible extension to represent text and to control devices that handle letters or any other keyboard symbol.
Overview
Computers work only with numbers. The ASCII code is a method of translating letters and symbols into numbers, for example 'a' = 97 or '/' = 47.
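The correspondence between characters and numbers can be tried out directly. The following is a minimal sketch in Python (a language chosen here for illustration, not one the article prescribes); the helper names `to_code` and `to_char` are hypothetical:

```python
def to_code(ch: str) -> int:
    """Return the ASCII code number of a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code


def to_char(code: int) -> str:
    """Return the character assigned to a 7-bit ASCII code number."""
    if not 0 <= code <= 127:
        raise ValueError(f"{code} is outside the 7-bit ASCII range")
    return chr(code)


for ch in ("a", "/", "A", "0"):
    print(f"{ch!r} -> {to_code(ch)}")
```

Calling `to_code("a")` gives 97 and `to_code("/")` gives 47, matching the examples in the text.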
Like other character encodings, ASCII establishes a correspondence between strings of bits and a series of symbols (alphanumeric and others), thus allowing digital devices to communicate with one another and to process and store text. The ASCII character code, or a compatible extension (see below), is used by almost all computers, especially personal computers and workstations. The more precise name for this character code is "US-ASCII".
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
ASCII is, strictly speaking, a seven-bit code, which means that it represents character information with bit strings of seven binary digits (ranging from 0 to 127 in decimal). At the time the ASCII code was introduced, many computers worked with groups of eight bits (bytes or octets) as the minimum unit of information; the eighth bit was commonly used as a parity bit for error checking on communication lines, or for other device-specific functions. Machines that did not use parity checking set the eighth bit to zero in most cases, although other systems, such as Prime computers running PRIMOS, set the eighth bit of the ASCII code to one. The ASCII code defines a relationship between specific characters and sequences of bits; apart from reserving a few control codes for text processing, it does not define any mechanism for describing the structure or appearance of text in a document. These matters are specified by other languages, such as markup languages.
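As an illustration of the eighth bit used for parity, the sketch below computes an even-parity bit for a 7-bit code and checks a received byte. It assumes even parity stored in the most significant bit; real equipment also used odd parity or other conventions, and the function names are hypothetical.

```python
def add_even_parity(code: int) -> int:
    """Place an even-parity bit in bit 7 above a 7-bit ASCII code."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    ones = bin(code).count("1")
    parity = ones % 2  # 1 when the seven data bits contain an odd number of ones
    return code | (parity << 7)


def has_even_parity(byte: int) -> bool:
    """Return True if the 8-bit value contains an even number of one bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


sent = add_even_parity(ord("C"))   # 'C' is 100 0011: three ones, so the parity bit is set
corrupted = sent ^ 0b0000_0001     # flip one bit to simulate a transmission error
print(hex(sent), has_even_parity(sent), has_even_parity(corrupted))
```

A single flipped bit changes the number of ones from even to odd, which is how the receiver detects the error; two flipped bits would go unnoticed.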
History
The ASCII code was developed in the field of telegraphy and was first used commercially as a teleprinter code promoted by Bell data services. Bell had planned to use a six-bit code, derived from Fieldata, that added punctuation and lowercase letters to the older Baudot teleprinter code, but was persuaded to join the American Standards Association (ASA) subcommittee, which had started to develop the ASCII code. Baudot helped automate the sending and receiving of telegraphic messages and borrowed many features from Morse code; unlike Morse code, however, Baudot used codes of constant length. Compared to the early telegraph codes, the code proposed by Bell and ASA was reorganized to be more convenient for sorting lists (especially since it was arranged alphabetically) and added features such as the 'escape sequence'.
The American Standards Association (ASA), which would later become the American National Standards Institute (ANSI), first published the ASCII code in 1963. The ASCII published in 1963 had an upward-pointing arrow (↑) instead of the circumflex (^) and a left-pointing arrow (←) instead of the underscore (_). The 1967 version added the lowercase letters, changed the names of some control codes, and moved the two control codes ACK and ESC from the lowercase letter area to the control code area.
ASCII was subsequently updated and published as ANSI X3.4-1968, ANSI X3.4-1977, and finally ANSI X3.4-1986. Other standardization bodies have published character codes that are identical to ASCII. These character codes are often called ASCII, even though ASCII is strictly defined only by the ASA/ANSI standards:
- The European Computer Manufacturers Association (ECMA) published editions of its ASCII clone, ECMA-6, in 1965, 1967, 1970, 1973, 1983, and 1991. The 1991 edition is identical to ANSI X3.4-1986.
- The International Organization for Standardization (ISO) published its version, ISO 646 (later ISO/IEC 646), in 1967, 1972, 1983 and 1991. In particular, ISO 646:1972 established a set of country-specific versions in which the punctuation characters were replaced with non-English characters. The ISO/IEC 646:1991 International Reference Version is identical to ANSI X3.4-1986.
- The International Telecommunication Union (ITU) published its version of ANSI X3.4-1986, ITU Recommendation T.50, in 1992. In the early 1970s it published a version as CCITT Recommendation V.3.
- DIN published a version of ASCII as the DIN 66003 standard in 1974.
- The Internet Engineering Task Force (IETF) published a version in 1969 as RFC 20, and established the standard version for the Internet, based on ANSI X3.4-1986, with the publication of RFC 1345 in 1992.
- The IBM version of ANSI X3.4-1986 was published in IBM's technical literature as code page 367.
The ASCII code is also included in Unicode, where it makes up the first 128 (or 'lowest') characters.
- The at sign, represented by the character @, is a fundamental component of email addresses, where it serves as the separator between the username and the domain name, using the format user@provider.
- It is also used in various computer applications, with different functions, such as denoting a user account on Twitter (@user), Telegram, Instagram, etc. It is also used as the symbol of the Internet par excellence, even as a pictogram on signage to indicate the location of an Internet café or a place with network access. Within the ASCII code, it is represented by the number 64.
- The Spanish term "arroba" comes from the Arabic الربع (ar-rubʿ), which means 'the fourth part', and was used in Spain to represent the unit of mass also called the arroba. In English the symbol is read "at" [æt], hence its use in computing.
ASCII control characters
The ASCII code reserves the first 32 codes (numbered 0 through 31 in decimal) for control characters: codes not originally intended to represent printable information, but to control devices (such as printers) that use ASCII. For example, character 10 represents the "newline" function (line feed), which causes a printer to advance the paper, and character 27 represents the "escape" key, which is often found in the upper left corner of common keyboards.
Code 127 (all seven bits set to one), another special character, is the "delete" character (DEL). Although its function is similar to that of other control characters, the designers of ASCII chose this code so that a section of punched paper tape (a popular storage medium until the 1980s) could be "erased" by punching all possible holes at a particular character position, replacing any previous information. Since code 0 was ignored, it was possible to leave gaps (unpunched regions) and make corrections later.
Many of the ASCII control characters were used to mark data packets or to control data transmission protocols (for example ENQuiry, meaning "is there a station out there?"; ACKnowledge, meaning "received" or "acknowledged"; Start Of Heading; Start of TeXt; End of TeXt; etc.). ESCape and SUBstitute allowed a communications protocol, for example, to mark binary data that contained codes with the same value as a protocol character, so that the receiver would interpret them as data rather than as protocol characters. The designers of the ASCII code devised the separator characters for use in magnetic tape systems.
Two of the device control characters, commonly called XON and XOFF, generally served as flow control characters to throttle the flow of data to a slow device (such as a printer) from a fast device (such as a computer), so that the data does not overrun the reception capacity of the slow device and get lost.
Early users of ASCII adopted some of the control codes to represent "meta information" such as end-of-line, beginning/end of a data element, etc. These mappings often conflicted, so part of the effort of converting data from one format to another involves making the correct metadata conversions. For example, the character that represents the end-of-line in text files varies with the operating system. When files are copied from one system to another, the conversion system must recognize these characters as end-of-line marks and act accordingly.
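The end-of-line conversion described above can be sketched as follows. The example assumes the three conventions commonly encountered (CR LF, LF alone, and CR alone), which the text does not enumerate; `normalize_newlines` is a hypothetical helper, not a standard function.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite every end-of-line mark in `data` using the `newline` sequence."""
    # Collapse CR LF first so the pair is not counted as two line breaks,
    # then treat any remaining lone CR as a line break too.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


mixed = b"first\r\nsecond\rthird\n"
print(normalize_newlines(mixed))              # every line now ends with LF
print(normalize_newlines(mixed, b"\r\n"))     # every line now ends with CR LF
```

The order of the replacements matters: handling the two-character CR LF sequence before the single characters avoids turning one line break into two.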
Today ASCII users use fewer control characters (with a few exceptions such as "carriage return" or "new line"). Modern markup languages, modern communication protocols, the shift from text-based to graphics-based devices, and the decline of teleprinters, punch cards, and continuous-feed paper have made most control characters obsolete.
Binary | Decimal | Hex | Abbreviation | Repr | Caret notation | Name/Meaning |
---|---|---|---|---|---|---|
0000 0000 | 0 | 00 | NUL | ␀ | ^@ | Null character |
0000 0001 | 1 | 01 | SOH | ␁ | ^A | Start of Heading |
0000 0010 | 2 | 02 | STX | ␂ | ^B | Start of Text |
0000 0011 | 3 | 03 | ETX | ␃ | ^C | End of Text |
0000 0100 | 4 | 04 | EOT | ␄ | ^D | End of Transmission |
0000 0101 | 5 | 05 | ENQ | ␅ | ^E | Enquiry |
0000 0110 | 6 | 06 | ACK | ␆ | ^F | Acknowledgement |
0000 0111 | 7 | 07 | BEL | ␇ | ^G | Bell |
0000 1000 | 8 | 08 | BS | ␈ | ^H | Backspace |
0000 1001 | 9 | 09 | HT | ␉ | ^I | Horizontal tab |
0000 1010 | 10 | 0A | LF | ␊ | ^J | Line feed |
0000 1011 | 11 | 0B | VT | ␋ | ^K | Vertical tab |
0000 1100 | 12 | 0C | FF | ␌ | ^L | Form feed |
0000 1101 | 13 | 0D | CR | ␍ | ^M | Carriage return |
0000 1110 | 14 | 0E | SO | ␎ | ^N | Shift Out |
0000 1111 | 15 | 0F | SI | ␏ | ^O | Shift In |
0001 0000 | 16 | 10 | DLE | ␐ | ^P | Data Link Escape |
0001 0001 | 17 | 11 | DC1 | ␑ | ^Q | Device control 1 (XON) |
0001 0010 | 18 | 12 | DC2 | ␒ | ^R | Device control 2 |
0001 0011 | 19 | 13 | DC3 | ␓ | ^S | Device control 3 (XOFF) |
0001 0100 | 20 | 14 | DC4 | ␔ | ^T | Device control 4 |
0001 0101 | 21 | 15 | NAK | ␕ | ^U | Negative acknowledgement |
0001 0110 | 22 | 16 | SYN | ␖ | ^V | Synchronous idle |
0001 0111 | 23 | 17 | ETB | ␗ | ^W | End of transmission block |
0001 1000 | 24 | 18 | CAN | ␘ | ^X | Cancel |
0001 1001 | 25 | 19 | EM | ␙ | ^Y | End of medium |
0001 1010 | 26 | 1A | SUB | ␚ | ^Z | Substitute |
0001 1011 | 27 | 1B | ESC | ␛ | ^[ | Escape |
0001 1100 | 28 | 1C | FS | ␜ | ^\ | File separator |
0001 1101 | 29 | 1D | GS | ␝ | ^] | Group separator |
0001 1110 | 30 | 1E | RS | ␞ | ^^ | Record separator |
0001 1111 | 31 | 1F | US | ␟ | ^_ | Unit separator |
0111 1111 | 127 | 7F | DEL | ␡ | ^? | Delete |
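The caret-notation entries in the table (such as ^@, ^M and ^?) follow a regular pattern: the letter after the caret is the character whose code differs from the control code only in the bit worth 64. A minimal sketch of that relationship, with `caret_notation` as a hypothetical helper name:

```python
def caret_notation(code: int) -> str:
    """Return the caret notation for an ASCII control code (0-31 or 127)."""
    if not (0 <= code <= 31 or code == 127):
        raise ValueError(f"{code} is not an ASCII control code")
    # Flipping bit 6 (value 0x40) maps 0 to '@', 1-26 to 'A'-'Z', and 127 to '?'.
    return "^" + chr(code ^ 0x40)


for code in (0, 10, 13, 27, 127):
    print(code, caret_notation(code))
```

This is also why holding the Control key and pressing a letter on many terminals produces the control code listed beside that letter.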
ASCII printable characters
The 'space' character (code 32) designates the space between words and is normally produced by the space bar on a keyboard. Codes 32 through 126 are known as printable characters, and they represent letters, digits, punctuation marks, and various symbols.
Seven-bit ASCII provides seven "national" characters and, if the particular combination of hardware and software allows it, key combinations can be used to simulate other international characters: in these cases a backspace can precede a grave accent (which the American and British standards, but only those standards, also call an "opening single quotation mark"), a tilde, or a "breathing mark".
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
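The printable range can be generated from the code numbers alone. The snippet below is a small sketch that builds the sequence from codes 32 through 126; the variable name `printable` is arbitrary.

```python
# Build the printable ASCII characters from their code numbers (32 to 126 inclusive).
printable = "".join(chr(code) for code in range(32, 127))

print(len(printable))   # number of printable characters
print(printable)        # starts with the space character and ends with '~'
```

The first character of the result is the space (code 32) and the last is the tilde (code 126), giving 95 characters in total.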
Structural features
- The digits 0 to 9 are represented by their binary values prefixed with 0011 (this means that BCD-to-ASCII conversion is simply a matter of taking each BCD nibble and prefixing it with 0011).
- The bit strings of the lowercase and uppercase letters differ in only one bit, which simplifies converting one group into the other.
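Both structural features can be shown with a few bit operations. The following is a minimal sketch; the function names are hypothetical, and the case toggle assumes its argument is an ASCII letter.

```python
def bcd_to_ascii(digit: int) -> int:
    """Prefix a BCD digit (0-9) with 0011 to obtain its ASCII code."""
    if not 0 <= digit <= 9:
        raise ValueError("a BCD digit must be between 0 and 9")
    return 0b0011_0000 | digit


def ascii_to_bcd(code: int) -> int:
    """Drop the 0011 prefix of an ASCII digit code, leaving the BCD value."""
    if not ord("0") <= code <= ord("9"):
        raise ValueError("not the ASCII code of a decimal digit")
    return code & 0b0000_1111


def toggle_case(letter: str) -> str:
    """Switch an ASCII letter between upper and lower case by flipping one bit."""
    if not (letter.isascii() and letter.isalpha() and len(letter) == 1):
        raise ValueError("expected a single ASCII letter")
    return chr(ord(letter) ^ 0b0010_0000)


print(chr(bcd_to_ascii(7)), ascii_to_bcd(ord("9")), toggle_case("a"), toggle_case("Q"))
```

The bit that distinguishes the two cases has the value 32, which is why 'A' is 65 and 'a' is 97.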
Other names for ASCII
RFC 1345 (published June 1992) and the IANA registry of character sets recognize the following alternative names for ASCII for use on the Internet:
- ANSI_X3.4-1968 (canonical name)
- ANSI_X3.4-1986
- ASCII
- US-ASCII (Recommended MIME Name)
- us
- ISO646-US
- ISO_646.irv:1991
- iso-ir-6
- IBM367
- cp367
- csASCII
Of these, only the names "US-ASCII" and "ASCII" are widely used. They are often found in the optional "charset" parameter of the Content-Type header of some MIME messages, in the equivalent "meta" element of some HTML documents, and in the character encoding declaration of some XML documents.
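Software that reads a charset name usually has to map such aliases onto a single codec. As an illustration, the sketch below asks Python's standard `codecs` module which codec each listed name resolves to; whether every alias is recognized depends on the Python version in use, so unknown names are reported as `None` rather than assumed to work. The `resolve` helper is hypothetical.

```python
import codecs
from typing import Optional

# Names taken from the list above.
NAMES = [
    "ANSI_X3.4-1968", "ANSI_X3.4-1986", "ASCII", "US-ASCII", "us",
    "ISO646-US", "ISO_646.irv:1991", "iso-ir-6", "IBM367", "cp367", "csASCII",
]


def resolve(name: str) -> Optional[str]:
    """Return the canonical Python codec name for `name`, or None if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


for name in NAMES:
    print(f"{name:18} -> {resolve(name)}")
```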
ASCII Variants
As computer technology spread throughout the world, standards bodies and companies developed many variations of the ASCII code to make it easier to write languages other than English that used Latin alphabets. Some of these variations are classified as "Extended ASCII", although the term is sometimes misapplied to cover all variants, even those that do not preserve the original seven-bit ASCII character set.
ISO 646 (1972), the first attempt to remedy the pro-English bias of the character encoding, created compatibility problems because it was also a 7-bit character code. It did not specify any additional codes, so it reassigned some existing ones specifically for the new languages. As a result it became impossible to know in which variant a text was encoded, and consequently text processors could handle only one variant.
As technology improved, other means became available to carry the information formerly encoded in the eighth bit of each byte, freeing up this bit and adding a further 128 character codes that were made available for new assignments. For example, IBM developed 8-bit code pages, such as code page 437, which replaced control characters with graphic symbols such as smileys, and assigned additional graphic characters to the upper 128 bytes of the code page.
Some operating systems, such as DOS, could work with these code pages, and personal computer manufacturers included support for such pages in their hardware.
Eight-bit standards like ISO 8859 and Mac OS Roman were developed as true extensions of ASCII, leaving the first 128 characters intact and adding values only above the 7-bit range. This allowed a wider range of languages to be represented, but these standards continued to suffer from incompatibilities and limitations. Even today, ISO-8859-1, its variant Windows-1252 (sometimes mistakenly called ISO-8859-1) and the original 7-bit ASCII code remain among the most commonly used character codes.
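The incompatibility between 8-bit extensions shows up as soon as a byte above 127 is decoded. The sketch below decodes the same two bytes with four codecs shipped with Python's standard library, used here as stand-ins for the code pages mentioned in the text; the byte values are arbitrary examples.

```python
raw = bytes([0x41, 0xE9])  # 0x41 is in the ASCII range, 0xE9 is above it

decoded = {}
for codec in ("latin-1", "cp1252", "cp437", "mac_roman"):
    decoded[codec] = raw.decode(codec)
    print(f"{codec:10} -> {decoded[codec]}")
```

The first byte decodes to "A" under every codec because all of them keep the ASCII range intact, while the meaning of the second byte depends entirely on which code page the reader assumes.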
Unicode and the ISO/IEC 10646 Universal Character Set (UCS) define a much larger character set, and their different encoding forms have quickly begun to replace ISO 8859 and ASCII in many environments. While ASCII basically uses 7-bit codes, Unicode and UCS use "code points", relatively abstract non-negative integers that are mapped to sequences of 8 or more bits. To allow compatibility, Unicode and UCS assign the first 128 code points to the same characters as the ASCII code. Thus, ASCII can be thought of as a very small subset of Unicode and UCS. The popular UTF-8 encoding uses one to four 8-bit values for each code point, and the first 128 code points are encoded as single bytes with the same values as in ASCII. Other encodings such as UTF-16 assign the same numbers as ASCII to the first 128 characters of Unicode, but use 16 or 32 bits per character, so text must be converted properly to move between the two character codes.
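The relationship between ASCII, UTF-8 and UTF-16 can be checked on a few sample characters. This is a small sketch using Python's built-in codecs; the sample characters are arbitrary, and `utf8_length` is a hypothetical helper.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


text = "ASCII"
same_bytes = text.encode("ascii") == text.encode("utf-8")
print("ASCII text is unchanged in UTF-8:", same_bytes)

for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) in UTF-8, "
          f"{len(ch.encode('utf-16-be'))} byte(s) in UTF-16")
```

Pure ASCII text is therefore already valid UTF-8, whereas in UTF-16 even the letter "A" occupies two bytes and has to be converted.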
The word ASCIIbetical describes sorting according to the order of ASCII codes rather than in alphabetical order.
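The difference is easy to see with mixed-case words. In the sketch below, Python's default string comparison orders by code point, which coincides with ASCII order for ASCII text; the case-insensitive key is only one simple approximation of alphabetical order, and the word list is made up.

```python
words = ["banana", "Zebra", "apple", "Cherry"]

asciibetical = sorted(words)                 # compares code numbers: 'Z' (90) sorts before 'a' (97)
alphabetical = sorted(words, key=str.lower)  # ignores case when comparing

print(asciibetical)
print(alphabetical)
```

Because every uppercase letter has a lower code than every lowercase letter, all capitalized words come first in ASCIIbetical order.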
The abbreviation ASCIIZ or ASCIZ refers to a string of characters terminated by a zero (null) character. ASCII is very often embedded in other, more sophisticated encoding systems, so it must be clear what role the ASCII code plays in a computer's character table or character map.
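A zero-terminated string can be modelled as a byte sequence that ends at the first NUL byte (code 0). The helpers below are a hypothetical sketch of that convention, not an API from any particular system.

```python
def to_asciiz(text: str) -> bytes:
    """Encode text as ASCII and append the terminating NUL byte."""
    data = text.encode("ascii")
    if b"\x00" in data:
        raise ValueError("an ASCIIZ string cannot contain NUL before its end")
    return data + b"\x00"


def from_asciiz(buffer: bytes) -> str:
    """Decode the characters that precede the first NUL byte in `buffer`."""
    end = buffer.find(b"\x00")
    if end == -1:
        raise ValueError("no terminating NUL byte found")
    return buffer[:end].decode("ascii")


print(to_asciiz("hi"))
print(from_asciiz(b"hi\x00leftover bytes"))
```

Anything after the terminator is ignored, which is why the string itself can never contain code 0.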
ASCII Art
The ASCII character code is the medium of a niche artistic discipline, ASCII art, which consists of composing images out of printable ASCII characters. The resulting effect has been compared to pointillism, since the images produced with this technique are generally appreciated in more detail when viewed from a distance. ASCII art began as an experimental art, but it soon became popular as a way of representing images on devices incapable of processing graphics, such as teletypes, terminals, email or some printers.
Although you can compose ASCII art manually using a text editor, you can also automatically convert images and videos to ASCII using software, such as the Aalib library (freely licensed), which has gained some popularity. Aalib is supported by some graphic design programs, games, and video players.
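One simple way to automate the conversion is to map the brightness of each pixel to a character of matching visual density. The sketch below illustrates that idea only; it is not a description of how Aalib works, and the character ramp, the 0 to 255 brightness scale and the function name are all assumptions made for the example.

```python
# Characters ordered from visually sparse to visually dense (an arbitrary choice).
RAMP = " .:-=+*#%@"


def to_ascii_art(pixels: list[list[int]]) -> str:
    """Convert rows of brightness values (0-255) into lines of ASCII characters."""
    lines = []
    for row in pixels:
        # Scale each value to an index into RAMP: 0 maps to ' ' and 255 to '@'.
        lines.append("".join(RAMP[value * (len(RAMP) - 1) // 255] for value in row))
    return "\n".join(lines)


gradient = [[0, 64, 128, 192, 255], [255, 192, 128, 64, 0]]
print(to_ascii_art(gradient))
```

A real converter would first read an image file and average blocks of pixels so that the character grid keeps the picture's proportions.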
See also
- Text files and binary files
- BCD
- EBCDIC
- Extended ASCII
- ISCII
- ISO 8859
- ASCII
- Unicode
- UTF-8
- Keyboard Codes
- ACiD Productions
- ASCII Art
Computer Specific ASCII Variants
- ATASCII
- PETSCII
General references
- Unicode.org Unicode area ASCII
- Tom Jennings (29 October 2004). Annotated History of Character Codes Accessed December 17, 2005.
External links
- The characters and code ASCII Online tool that displays the ASCII characters and their conversions to other numerical systems.
- Ascii Code To Character Converter at binaryconverterpro.com
- Conversion from and to decimal, octal, hexadecimal, binary (ASCII notation)
- The ASCII code. Ascii code, table available for pc and mobile devices.
- www.elcodigoascii.com.ar The ASCII code. Table with simple and complete ASCII codes.