Steganography
Steganography (from the Greek στεγανος steganos, "covered" or "hidden", and γραφος graphos, "writing") is the study and application of techniques that allow messages or objects to be hidden inside others, called carriers, so that their existence is not perceived. In other words, it seeks to hide messages within other objects and thus establish a covert communication channel, so that the very act of communication goes unnoticed by observers who have access to that channel.
One way to distinguish steganography from ordinary cryptography is that cryptography only encrypts a file: the encrypted file remains visible, but when opened it shows an unreadable sequence of characters, and the key is needed to see the original content. In steganography, what is seen is an ordinary-looking file, possibly of a different format, and recovering the hidden content requires knowing the key and the software with which it was hidden.
This science has aroused a lot of interest in recent years, especially in the area of computer security, because it has been used by criminal and terrorist organizations. However, this is nothing new, as it has been used since ancient times, and has traditionally been used by police, military and intelligence institutions, as well as by criminals or civilians who wish to evade government control, especially in tyrannical regimes.
Classical steganography was based solely on the ignorance of the covert channel used, while in the modern era digital channels (image, video, audio and communication protocols, among others) are also used to achieve the objective. In many cases, the container object is known, and what is unknown is the algorithm for inserting the information into said object.
For it to be possible to speak of steganography, there must be a desire for covert communication between the sender and the receiver.
Differences with cryptography
Although steganography can be confused with cryptography, since both are part of information protection processes, they are different disciplines, both in the way they are implemented and in their objective itself.
Whereas cryptography is used to encrypt or scramble information so that it is unintelligible to a potential intruder, even one who knows of its existence, steganography hides the information in a carrier so that the attacker does not notice the very fact of its existence and delivery. In the latter case, a potential intruder will not even know that sensitive information is being transmitted.
However, cryptography and steganography can complement each other and give the information an extra level of security. It is very common (although not essential) for the message to be encrypted before it is hidden, so that an intruder will not only have a hard time noticing the presence of the hidden message but, on discovering it, will find it encrypted.
Motivation
The usefulness of steganography is illustrated by the so-called prisoners' problem (Gustavus J. Simmons, 1983). Briefly, two prisoners, A and B, want to communicate confidentially in order to escape. The problem is that they can only exchange messages through a warden. The warden can read, modify or even generate the messages. If the warden detects any communication that could be used to escape (for example, if encryption is detected), the messages will no longer be passed on. In this scenario, the prisoners need to establish a covert channel.
Steganography provides such a covert channel, making it possible to communicate without being detected. The strategy steganography follows to solve the prisoners' problem is to hide the data that must not be detected among the messages the warden allows.
Operation and terminology
The idea behind steganography is to send the secret message (E) hidden inside an innocuous-looking message (C) that serves as camouflage. That is, a steganographic function $f(E)$ is applied. The result of applying the function (O) is sent over an insecure channel and can be freely inspected by the warden. Finally, the other prisoner receives the object O and, by applying the inverse function $f^{-1}(O)$, recovers the hidden message.
Typical terminology used in steganography is:
- A steganographic scheme is the set of components that makes steganographic communication possible.
- The carrier is any data set that can be altered to incorporate the message we want to keep secret. It can be of many types or formats. Examples: image (in its different formats), audio (in its different formats), plain text, binary files, a communication protocol message.
- The legitimate message is the message conveyed by the carrier.
- The steganographic message is the message we want to keep secret and hide inside the carrier. It can be of different types or formats. Examples: image (in its different formats), audio (in its different formats), plain text, binary files.
- The stego-algorithm is the steganographic algorithm that specifies how to incorporate the secret message into the carrier.
- The action of hiding the message within the carrier is called embedding.
- The stego-message is the result of embedding the steganographic message in the carrier.
- The action of recovering the hidden steganographic message from the stego-message is called extraction.
- Because of the roles they play in the steganographic process, the sender is also called the embedder and the receiver the extractor. As in any act of conventional communication, it is common for the roles of sender and receiver to be exchanged successively between the communicating parties.
- A stegoanalyst (or steganalyst) is a person who attempts to determine whether a steganographic message is present. Note that it is enough to determine the existence of the message; the content itself need not be recovered. In other words, a stegoanalyst is someone who performs stegoanalysis.
- Selection channels are additional channels that indicate which positions of the carrier are used for the steganographic communication. For example, if the carrier is a textbook, a selection channel could be defined by a sequence of natural numbers giving the positions of the words in the textbook that must be taken to build the steganographic message.
- Equivalence classes are pairs of carrier elements that have equivalent semantic interpretations in the legitimate communication, but whose choice carries an agreed meaning in the steganographic communication. For example, the words 'lindo' and 'bonito' are synonyms in Spanish and can be used interchangeably in many contexts. A reader would not notice a difference in the meaning of the text, but the choice of word can be used to build the steganographic message.
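The equivalence-class idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping of 'lindo' to bit 0 and 'bonito' to bit 1, the template sentence and the function names are all hypothetical and would have to be agreed on by both parties in advance.

```python
# Hypothetical equivalence class agreed on by both parties:
# 'lindo' encodes bit 0, 'bonito' encodes bit 1.
PAIR = ("lindo", "bonito")


def embed(template: str, bits: list[int]) -> str:
    """Fill each '{}' slot of the legitimate text with the synonym for one bit."""
    return template.format(*(PAIR[bit] for bit in bits))


def extract(text: str) -> list[int]:
    """Read the bits back from the order in which the synonyms occur."""
    bits = []
    for word in text.replace(",", " ").replace(".", " ").split():
        if word in PAIR:
            bits.append(PAIR.index(word))
    return bits


# The template is the legitimate message; the slots are the carrier
# positions where either synonym reads naturally.
template = "Un día {}, un jardín {} y un gato {}."
stego_message = embed(template, [1, 0, 1])
print(stego_message)
print(extract(stego_message))
```

Each synonym slot carries a single bit, so the capacity of this kind of carrier is very low compared with its size.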
The carrier can be altered imperceptibly thanks to the redundancy it contains. Alterations can be made both to the content and to other parameters, for example the response time in the transmission of the carrier.
Classification according to the stego-algorithm
The stego-algorithm is the steganographic algorithm that specifies how to incorporate the steganographic message into the carrier to obtain the stego-message. Depending on the type of stego-algorithm, two types of steganography can be distinguished: pure steganography and secret key steganography.
Pure steganography
In this type of steganography, the stego-algorithm establishes a fixed method for embedding the steganographic message in the carrier to obtain the stego-message. This strategy assumes that the warden in the prisoners' problem knows nothing about the stego-algorithm, so security is based on obscurity. This approach to achieving security rarely works and is especially disastrous in the case of cryptography.
Secret Key Steganography
In this type of steganography, the stego-algorithm is parameterized by a steganographic key called the stego-key, which defines how to apply the algorithm. For example, the stego-key could indicate the place inside the carrier from which the incorporation of the secret message begins. The sender and receiver must previously agree on both the stego-algorithm and the stego-key.
The extraction process consists of applying the necessary stego-algorithm and stego-key to the stego-message received to obtain the steganographic message.
In this scenario, the warden in the prisoners' problem may know the stego-algorithm but does not know the stego-key used with it. In this strategy, security is based on Kerckhoffs's principle. Applied to steganography, Kerckhoffs's principle could include, as information that may be revealed, access to the carrier as it was before the stego-algorithm was applied to it.
History
These are some examples or stories that show that steganography has been around since ancient times and is constantly evolving.
Herodotus
Probably one of the oldest examples of the use of steganography is the one reported by Herodotus in The Histories. In this book, he describes how one character took a two-leaf writing tablet, scraped off the wax that covered it, engraved the message on the wood itself and then covered it again with fresh wax. Another story in the same book relates how another character shaved the head of his most trusted slave, tattooed the message on his scalp, waited for his hair to grow back and then sent him to the recipient of the message with instructions to have his head shaved.
16th century
Italian scientist Giovanni Battista della Porta discovered how to hide a message inside a boiled egg. The method consisted of preparing an ink by mixing an ounce of alum and a pint of vinegar, and then writing on the shell. The solution penetrates the porous shell and leaves a message on the albumen surface of the hard-boiled egg, which can only be read by peeling the egg.
First book
The origin of the word steganography dates back to the early 16th century. The German abbot Johannes Trithemius wrote a book titled Steganographia. It dealt with topics related to the encryption of messages, as well as methods for conjuring spirits. The book is today considered a cursed book and is highly prized by esotericists. He also published Polygraphiae Libri Sex, a six-volume compendium on cryptography that did not share the esoteric elements of his earlier book.
Other books
- In The Dream of Poliphilus (Hypnerotomachia Poliphili, ed. Aldus Manutius), written by Francesco Colonna in 1499, the phrase Poliam frater Franciscus Columna peramavit ('Brother Francesco Colonna passionately loved Polia') can be obtained by taking the first letter of each of the thirty-eight chapters.
- Gaspar Schott (1665): Schola steganographica.
- Ian Caldwell, Dustin Thomason (2004): The Rule of Four.
- Lev Grossman (2004): Codex.
World War II
During World War II, microfilm dots were used in place of the dots on the letter i or of punctuation marks to send messages. Prisoners used the letters i, j, t and f to hide Morse code messages. One of the most ingenious systems, however, is known as the null cipher. It consists of sending a message that looks as ordinary as possible and using an agreed part of it to carry the real message. An example is the following text:
Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by products, ejecting suets and vegetable oils.
Taking the second letter of each word yields the message Pershing sails from NYr June i ("Pershing sails from New York on June 1").
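The extraction rule for this null cipher can be checked mechanically. The following minimal sketch (the function name is an assumption made for illustration) ignores punctuation and collects the second letter of each word of the cover text quoted above.

```python
def second_letters(text: str) -> str:
    """Collect the second alphabetic character of every word, in lowercase."""
    letters = []
    for word in text.split():
        alpha = [ch for ch in word if ch.isalpha()]  # drop apostrophes, periods, commas
        if len(alpha) >= 2:
            letters.append(alpha[1].lower())
    return "".join(letters)


cover = (
    "Apparently neutral's protest is thoroughly discounted and ignored. "
    "Isman hard hit. Blockade issue affects pretext for embargo on by "
    "products, ejecting suets and vegetable oils."
)
hidden = second_letters(cover)
print(hidden)  # pershingsailsfromnyrjunei
```

The word boundaries of the hidden message are not transmitted; the receiver has to restore them when reading the result.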
Invisible inks
It is not clear when they were first used, but they have certainly been used throughout history and up to the present day. The best known can be classified into two categories:
- Basic: substances with a high carbon content, such as milk, urine, lemon juice, orange juice, apple juice, onion juice, sugar solution, diluted honey, diluted cola, wine, vinegar, etc. Whichever of these "inks" is used, the message is invisible; when the surface on which it was written is heated, the carbon reacts and the message appears in a brown tone.
- More sophisticated: they appear after a chemical reaction, or after being exposed to light of a certain wavelength (IR, UV and others).
Classic and modern steganography
“Classic” steganography: completely obscure methods.
- Protection relies on the specific covert channel in use being unknown.
Modern steganography: use of digital channels:
- Text file (web pages, source code, etc.)
- Digital audio
- Digital Images and Video
- Executable files
- Communications protocols
Digital techniques
There are numerous methods and algorithms used to hide information within media files: images, audio, and video. Some of the most used are listed below.
Masking and filtering
In this case the information is hidden within a digital image using watermarks that carry information such as copyright, ownership or licenses. The goal differs from that of traditional steganography (essentially covert communication): it is to attach an attribute to the cover image. This extends the amount of information the image presents.
Algorithms and transformations
This technique hides data based on mathematical functions that are often used in data compression algorithms. The idea of this method is to hide the message in the less important data bits.
Least significant bit insertion
This is the most common and popular modern steganographic method, and one of the so-called substitution methods. It consists of altering the least significant bit of the pixels of an image. The same technique can be applied to video and audio, although this is less common. Done this way, the distortion of the overall image is kept to a minimum (it is practically imperceptible), while the message is scattered throughout its pixels. The technique works best when the image file is large and has strong color variations (a "noisy" image), and it benefits from a higher color depth. It can also be used effectively on grayscale images, but it is not appropriate for 8-bit paletted color images (which have the same structure as grayscale images, but with a color palette). In general, the best results are obtained on images in RGB format (three bytes, one per color component, per pixel).
Example:
The value (1 1 1 1 1 1 1 1) is an 8-bit binary number. The rightmost bit is called the "least significant bit" (LSB) because it carries the least weight: altering it changes the value of the represented number by the smallest possible amount.
An example of steganography: Hiding the letter "A". If you have part of an image with pixels in RGB format (3 bytes), its original representation could be as follows (3 pixels, 9 bytes):
(1 1 0 1 1 0 1 0) (0 1 0 0 1 0 0 1) (0 1 0 0 0 0 1 1)
(0 0 0 1 1 1 1 0) (0 1 0 1 1 0 1 1) (1 1 0 1 1 1 1 1)
(0 0 0 0 1 1 1 0) (0 1 0 0 0 1 1 1) (0 0 0 0 0 1 1 1)
The message to be hidden is 'A', whose ASCII representation is (0 1 0 0 0 0 0 1), so the new altered pixels would be:
(1 1 0 1 1 0 1 0) (0 1 0 0 1 0 0 1) (0 1 0 0 0 0 1 0)
(0 0 0 1 1 1 1 0) (0 1 0 1 1 0 1 0) (1 1 0 1 1 1 1 0)
(0 0 0 0 1 1 1 0) (0 1 0 0 0 1 1 1) (0 0 0 0 0 1 1 1)
Note that the bits of the message (the letter A) have replaced, in order, the least significant bit of each color byte. Eight bytes were needed, one for each bit of the letter A; the ninth color byte was not used, although it is part of the third pixel (its third color component). Only three of the eight bytes actually changed value, because the other five already ended in the required bit.
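The worked example can be reproduced with a short Python sketch. The function names are assumptions made for illustration, and the carrier is treated as a flat sequence of color bytes rather than as a real image file.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Write each message bit (most significant first) into the LSB of one carrier byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for this message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0b11111110) | bit  # clear the LSB, then set it to the message bit
    return bytes(out)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes carrier bytes."""
    result = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in stego[8 * i : 8 * i + 8]:
            value = (value << 1) | (byte & 1)
        result.append(value)
    return bytes(result)


# The nine color bytes (three RGB pixels) from the example above.
pixels = bytes([
    0b11011010, 0b01001001, 0b01000011,
    0b00011110, 0b01011011, 0b11011111,
    0b00001110, 0b01000111, 0b00000111,
])
stego = embed_lsb(pixels, b"A")
print([format(b, "08b") for b in stego])
print(extract_lsb(stego, 1))  # b'A'
```

The receiver must know how many bytes to extract; in practice that length would have to be agreed on or embedded as well, which this sketch leaves out.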
The LSB method works best on image files that have a high resolution and use many colors. Among audio files, it favors those that contain many different sounds and have a high bit rate.
In addition, this method does not alter the size of the carrier or cover file at all (hence it is a "substitution technique"). Its disadvantage is that the carrier file must be larger than the message to be embedded: 8 image bytes are needed for each message byte, so the maximum capacity of an image for a hidden message is 12.5% of its size. If more bits of each byte are used (for example, the last two rather than only the last), the alteration can begin to be perceptible to the human eye.
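The capacity figure follows from simple arithmetic, sketched below. The function name and the 640 x 480 image size are assumptions chosen only for illustration; file headers are ignored.

```python
def lsb_capacity_bytes(width: int, height: int, channels: int = 3, bits_per_byte: int = 1) -> int:
    """Message bytes that fit when bits_per_byte low bits of every color byte are used."""
    return width * height * channels * bits_per_byte // 8


carrier_bytes = 640 * 480 * 3                  # uncompressed RGB pixel data
capacity = lsb_capacity_bytes(640, 480)        # one bit per color byte
print(capacity, capacity / carrier_bytes)      # 115200 0.125
print(lsb_capacity_bytes(640, 480, bits_per_byte=2))  # 230400
```

Using two bits per byte doubles the capacity to 25% of the pixel data, at the cost of a larger and potentially visible change to each color value.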
Most used techniques according to the type of media
In documents
Use of steganography in documents can work by simply adding whitespace and tabs to the ends of lines in a document. This type of steganography is extremely efficient, since the use of white space and tabs is not visible to the human eye, at least in most text editors, and occurs naturally in documents, so that in general it is very difficult to raise suspicions.
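A minimal sketch of this idea follows. The convention used here (a trailing space for bit 0, a trailing tab for bit 1, one bit per line) and the function names are assumptions for illustration, not a description of any particular tool.

```python
def embed_whitespace(lines: list[str], bits: list[int]) -> list[str]:
    """Append one invisible character per line: space for 0, tab for 1."""
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the message")
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip(" \t")  # remove any whitespace already at the end
        if i < len(bits):
            line += "\t" if bits[i] else " "
        out.append(line)
    return out


def extract_whitespace(lines: list[str]) -> list[int]:
    """Read one bit from every line that ends in a space or a tab."""
    bits = []
    for line in lines:
        if line.endswith("\t"):
            bits.append(1)
        elif line.endswith(" "):
            bits.append(0)
    return bits


document = ["Dear Bob,", "see you on Monday.", "Regards,", "Alice"]
stego_lines = embed_whitespace(document, [1, 0, 1])
print(extract_whitespace(stego_lines))  # [1, 0, 1]
```

Any editor or mail system that strips trailing whitespace destroys the message, so the channel depends on the document being delivered byte for byte.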
In pictures
The most widely used method is LSB, since for a computer an image file is simply a file that records different colors and light intensities in different areas (pixels). The most suitable image format for hiding information is BMP (24-bit color), because it is uncompressed, and therefore the largest, and is usually of the highest quality. In practice, however, 8-bit BMP or other formats such as GIF are often preferred because they are smaller. It should be noted that transporting large images over the Internet may arouse suspicion.
When an image is of high quality and resolution, it is easier and more efficient to hide and mask the information within it.
The disadvantage of the LSB method is that it is the best known and most popular, therefore the most studied. It leaves marks similar to white noise on the carrier (container image), which makes it highly detectable or vulnerable to stegoanalysis attacks. To avoid this, the message is dispersed, generally using random sequences.
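Dispersing the message with a random sequence can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the shared stego-key is used as the seed of Python's `random.Random`, and that generator is not cryptographically secure, so a real scheme would need a stronger keyed generator.

```python
import random


def key_positions(stego_key: int, carrier_len: int, n_bits: int) -> list[int]:
    """Derive n_bits distinct carrier positions from the shared stego-key."""
    rng = random.Random(stego_key)  # assumption: both sides seed the same PRNG
    return rng.sample(range(carrier_len), n_bits)


def embed_scattered(carrier: bytes, bits: list[int], stego_key: int) -> bytes:
    """Write each bit into the LSB of a key-selected carrier byte."""
    out = bytearray(carrier)
    for pos, bit in zip(key_positions(stego_key, len(carrier), len(bits)), bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return bytes(out)


def extract_scattered(stego: bytes, n_bits: int, stego_key: int) -> list[int]:
    """Visit the same key-selected positions, in the same order, and read the LSBs."""
    return [stego[pos] & 1 for pos in key_positions(stego_key, len(stego), n_bits)]


carrier = bytes(range(64))
message_bits = [1, 0, 1, 1, 0, 0, 1, 0]
scattered = embed_scattered(carrier, message_bits, stego_key=1234)
print(extract_scattered(scattered, len(message_bits), stego_key=1234))
```

Without the key, an observer does not know which bytes carry message bits or in what order to read them, although statistical traces of the changes may still remain.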
It is important to note that if information is hidden within an image file and it is converted to another format, the information hidden within will most likely be corrupted and therefore unrecoverable.
Audio
When hiding information within audio files, the technique generally used is low bit encoding, which is similar to the LSB method often used for images. The problem with low bit encoding is that it is generally audible to the human ear, so it is a rather risky method for hiding information within an audio file.
Spread spectrum is also used to hide information within an audio file. It works by adding random noise to the signal: the information is hidden within a carrier and spread across the entire frequency spectrum.
Another method is echo data hiding, which uses echoes in sound files to hide data. The information is hidden by adding extra sound as an echo within the audio file. What sets this method apart from the others is that it can actually improve the sound of the audio.
On video
In video, the DCT (Discrete Cosine Transform) method is often used. It works by slightly changing each of the images in the video, just enough that the change is not perceptible to the human eye. More precisely, DCT alters the values of certain parts of the images, usually by rounding them. For example, if part of an image has a value of 6.667, it is rounded up to 7.
Video steganography is similar to that applied to images, except that the information is hidden in each video frame. When only a small amount of information is hidden, it is usually not perceptible. However, the more information is hidden, the more noticeable it becomes.
In files of any type
One of the easiest methods to implement is to inject or append bytes at the end of a file. The technique consists of attaching to the end of a carrier file, of any type, another file that contains the message to hide, also of any type. It is the most versatile method, since any type of file can serve as the carrier (documents, images, audio, video, executables, etc.).
This technique does not exploit the limits of human perception (sight and hearing); instead it exploits the way the applications that read the carrier work. It does not degrade the content of the carrier in any way: if the carrier is an image, for example, it remains intact, since the message is attached after its end and the viewing application displays the image normally. This works because many file formats record their own size in fixed header bytes or mark their end explicitly. An application that opens such a file reads the header first, takes that value as the size (in bytes) and reads only up to the end it indicates. Anything placed beyond that point is therefore ignored by the normal application, and the carrier file works as usual.
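The appending technique can be sketched on in-memory bytes. The marker value and the function names are hypothetical; the sketch assumes that both parties agree on the marker and that it does not occur inside the payload.

```python
# Hypothetical separator agreed on by sender and receiver.
MARKER = b"--hypothetical-marker--"


def join_files(carrier: bytes, payload: bytes) -> bytes:
    """Attach the payload after the carrier, which stays byte-for-byte intact."""
    return carrier + MARKER + payload


def split_payload(stego: bytes) -> bytes:
    """Recover everything after the last occurrence of the marker."""
    _, found, payload = stego.rpartition(MARKER)
    if not found:
        raise ValueError("no appended payload found")
    return payload


carrier = b"BM" + bytes(20)  # stand-in for the bytes of a real carrier file
stego_file = join_files(carrier, b"secret")
print(stego_file.startswith(carrier), split_payload(stego_file))
```

With real files, the same functions would be applied to the contents read from and written to disk in binary mode.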
Although it is the simplest technique to implement, and is widely used, it has the great disadvantage that the carrier grows by the size of the message, which makes it an easily detectable strategy. A simple stegoanalysis program detects it just by reading the header and checking it against the real size of the carrier file; even a distrustful user can often suspect the carrier because of the space it occupies on disk in relation to its content. Another disadvantage, although a relative and occasional one, is that the growth of the carrier could be a limitation when transferring it over networks, particularly over the Internet.
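The header check can be sketched for one concrete format. In a BMP file the two signature bytes "BM" are followed by the declared file size as a 4-byte little-endian integer; other formats need their own parsing. The function name is an assumption, and the test data is a fabricated stand-in rather than a valid image.

```python
import struct


def bmp_trailing_bytes(data: bytes) -> int:
    """Return how many bytes exist beyond the size declared in the BMP header."""
    if len(data) < 6 or data[:2] != b"BM":
        raise ValueError("not a BMP file")
    (declared_size,) = struct.unpack_from("<I", data, 2)  # bytes 2..5: file size
    return len(data) - declared_size


# Stand-in for a 30-byte BMP whose header declares its own size correctly.
clean = b"BM" + struct.pack("<I", 30) + bytes(24)
print(bmp_trailing_bytes(clean))              # 0
print(bmp_trailing_bytes(clean + b"hidden"))  # 6
```

A nonzero result does not prove that a message is present, but it flags the file for closer inspection.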
Programs that use this technique are called joiners: they join two files, the carrier and the message, keeping the original size value indicated in the header of the first. The technique is not used when undetectability is required.
When stealth is not a requirement, it is one of the preferred methods because of its simplicity, flexibility and few limitations. Virtually any type of carrier is supported, with or without compression, even executable modules. In some cases it corrupts the carrier, which is not a big problem: once the message has been injected, the carrier is tested with its corresponding application; if it has been degraded or does not work properly, another carrier of the same or a different type is chosen and the operation is repeated.
Others
A newer steganographic technique involves injecting imperceptible delays into packets sent over the network from the keyboard. Delays in typing commands in some applications (telnet or remote desktop software) translate into packet delays, and these packet delays can be used to encode data.
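The principle can be sketched without any networking. The delay values, the threshold and the function names are hypothetical; the sketch only shows how bits map to inter-packet delays and back, and assumes the network jitter is small compared with the gap between the two delays.

```python
# Hypothetical delays in seconds: a short gap encodes 0, a long gap encodes 1.
SHORT_DELAY = 0.05
LONG_DELAY = 0.15


def bits_to_delays(bits: list[int]) -> list[float]:
    """Choose the delay to insert before each packet."""
    return [LONG_DELAY if bit else SHORT_DELAY for bit in bits]


def delays_to_bits(observed: list[float], threshold: float = 0.10) -> list[int]:
    """Classify each observed inter-packet delay against the threshold."""
    return [1 if delay > threshold else 0 for delay in observed]


sent_bits = [1, 0, 0, 1, 1]
observed_delays = [d + 0.01 for d in bits_to_delays(sent_bits)]  # simulated jitter
print(delays_to_bits(observed_delays))  # [1, 0, 0, 1, 1]
```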
Stegoanalysis
What steganography essentially does is exploit the limitations of human perception (except in the injection method), since the human senses (sight and hearing) are limited in their ability to perceive extraneous information embedded in content. Software applications, however, can do this detection work using various analytical techniques, whose study and application constitute what is called stegoanalysis.
While steganography tries to study and implement methods to send covert messages in innocuous or normal-appearing carriers, steganalysis studies ways to detect the presence of hidden messages in potential carriers (not necessarily to extract them).
Because steganography is invasive, that is, it leaves traces in the transport medium used, steganalysis techniques try to detect these changes, even using complex statistical mechanisms. Until now, stegoanalysis techniques have only managed to provide a level of probability of the existence of a covert message on a carrier.
Stegoanalytical algorithms
Stegoanalytical algorithms can be classified in different ways, highlighting: according to the information available and according to the purpose sought.
Based on available information
These algorithms can be catalogued according to the information the stegoanalyst holds about the original and the hidden messages. The classification is similar to the one used in cryptanalysis, although there are several differences:
- Chosen stego attack: the stegoanalyst has the stego object and the steganographic algorithm used.
- Known cover attack: the stegoanalyst has the original carrier object and the stego object.
- Known stego attack: the stegoanalyst knows the original carrier object and the stego object, as well as the algorithm used.
- Stego only attack: the stegoanalyst has only the stego object.
- Chosen message attack: the stegoanalyst generates a stego object from a message of his or her own choosing.
- Known message attack: the stegoanalyst has the stego object and the hidden message, which is known to him or her.
According to the purpose sought
The primary purpose of steganography is to transfer information unnoticed; an attacker, however, may pursue two different goals:
- Passive stegoanalysis: the stego object is not altered; it is examined in order to establish whether it carries covert information and to recover the hidden message, the key used, or both.
- Active stegoanalysis: the stego object is altered in order to suppress the transfer of information, if there is any.
Steganography software
- StegHide: steganographic software that supports encryption and compression. It works with JPEG, BMP, WAV and AU files and is released under a GNU license.
- Steghide UI: graphical user interface (GUI) for StegHide, released under a GNU license.