ISSN: 2229-371X
Souvik Bhattacharyya*1, Pabak Indu2 , Sanjana Dutta3 , Ayan Biswas4 and Gautam Sanyal5
Corresponding Author: Souvik Bhattacharyya, E-mail: souvik.bha@gmail.com
Visit for more related articles at Journal of Global Research in Computer Sciences
Recent years have witnessed the rapid development of the Internet and telecommunication techniques. Because the Internet is a hostile environment, concerns about the confidentiality of information have grown at a phenomenal rate. To safeguard information from attacks, a number of data and information hiding methods have evolved. Steganography is an emerging area that is used for secure data transmission over public media. The word steganography is of Greek origin and means "covered or hidden writing". A considerable amount of work has been carried out by different researchers on steganography. In this paper the authors propose a novel text steganography method based on changing the pattern of letters of the English alphabet. Considering the structure of English letters, the secret message is mapped through small structural modifications of some letters of the cover text. The approach changes structural features of the cover carrier in a way that is not visibly distinguishable from the original to human readers, and it may be adapted to Indian languages as well. The solution is independent of the nature of the data to be hidden and produces a stego text with minimum degradation. The quality of the stego text is analyzed as a trade-off against the number of bits used for mapping. The efficiency of the proposed method is illustrated by exhaustive experimental results and comparisons.
Keywords
Steganography, Cover Text, Stego Text, CALP (Changing in Alphabet Letter Patterns), Pattern Change, Jaro-Winkler Distance.
INTRODUCTION
The technique of information hiding has been widely applied in various fields in recent years [7], and two major branches, digital watermarking and steganography, have been derived from it [9], [11]. Digital watermarking protects intellectual property, whereas steganography concerns the privacy of information under surveillance. Steganalysis is the art of detecting a hidden message on the communication channel. If the existence of the hidden message is revealed, the goal of steganography is defeated. Steganography is an ancient art of conveying messages in a secret way such that only the receiver knows of the existence of the message [5]. Well-known steganographic methods include invisible ink, microdots, covert channels, and spread spectrum communication. A famous illustration of modern day steganography is Simmons' Prisoners' Problem [1]. The term steganography comes from Greek and means “covered writing”. As the goal of steganography is to hide the presence of a message and to create a covert channel, it can be seen as the complement of cryptography, whose goal is to hide the content of a message. The message is hidden in another medium such that the transmitted data appears meaningful and innocuous to everyone. Whereas cryptography attempts to conceal the content of the secret message, steganography conceals its very existence [8]. Fig. 1 shows the framework of modern day steganography.
In steganography two aspects are usually addressed. First, the cover media and the stego media should appear identical under all possible statistical attacks. Second, the embedding process should not degrade the media fidelity; that is, the difference between the stego media and the cover media should be imperceptible to the human perceptual system.
Steganographic work has been carried out on different transmission media such as images, video, text, and audio [13]. [...] and receiver. If the public key of the receiver is known to the sender, the steganographic protocol is called public key steganography [4, 7]. Although all digital file formats can be used for steganography, image and audio files are more suitable because of their high degree of redundancy [21]. Fig. 2 below shows the different categories of file formats that can be used for steganography techniques.
Among them, image steganography is the most popular. In this method the secret message is embedded into an image as noise, which is nearly impossible for the human eye to detect [10, 12, 14]. In video steganography, the same method may be used to embed a message [15, 20]. Audio steganography embeds the message into a cover audio file as noise at a frequency outside the human hearing range [16]. One major category, and perhaps the most difficult kind of steganography, is text steganography or linguistic steganography [3]. Text steganography is a method of using written natural language to conceal a secret message, as defined by Chapman et al. [13]. The advantage of text steganography over other media is its smaller memory footprint and simpler communication. For a more thorough treatment of steganography methodology the reader may see [10], [21]. Some steganographic models with strong security features have been presented in [25-31].
A block diagram of a generic text steganographic system is given in Fig. 3. A message is embedded in a carrier (cover text) through an embedding algorithm, with the help of a secret key. The resulting stego text is transmitted over a channel to the receiver, where it is processed by the extraction algorithm using the same key. During transmission, the stego text can be monitored by unauthorized viewers, who will only notice the transmission of an innocuous text without discovering the existence of the hidden message.
This paper is organized as follows. Section II discusses related work on text steganography. Section III describes the proposed text steganography method. Section IV describes the solution methodology. Section V describes the algorithms. Section VI contains the analysis of the results, and Section VII draws the conclusion.
RELATED WORKS ON TEXT STEGANOGRAPHY
Text steganography can be broadly divided into three types: format-based methods, random and statistical generation, and linguistic methods, as shown in Figure 4. Various methods for hiding information in text have been suggested in these three categories, and some of them are discussed in this paper. Format-based methods change the formatting of the cover text to hide the data. They do not change any words or sentences, so they do not harm the 'value' of the cover text. One format-based text steganography method is the open space method. In this method extra white spaces are added to the text to hide information. These white spaces can be added after the end of each word, sentence, or paragraph. A single space is interpreted as “0” and two consecutive spaces are interpreted as “1” [6]. Although only a small amount of data can be hidden in a document, this method can be applied to almost all kinds of text without revealing the existence of the hidden data.
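The open space rule described above (one space for “0”, two spaces for “1”) can be sketched in a few lines. This is only an illustrative sketch, not the implementation from [6]: the function names `embed_open_space` and `extract_open_space` are hypothetical, the spaces are placed between words only, and the receiver is assumed to know the number of hidden bits in advance.

```python
import re


def embed_open_space(cover: str, bits: str) -> str:
    """Hide a bit string in the gaps between words (assumed scheme)."""
    words = cover.split()
    if not words or len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps for the message")
    out = [words[0]]
    for k, word in enumerate(words[1:]):
        # Gaps beyond the end of the message are left as single spaces.
        bit = bits[k] if k < len(bits) else "0"
        out.append(("  " if bit == "1" else " ") + word)
    return "".join(out)


def extract_open_space(stego: str, n_bits: int) -> str:
    """Read the first n_bits gaps: one space is 0, two spaces is 1."""
    gaps = re.findall(r" +", stego)
    return "".join("1" if len(gap) == 2 else "0" for gap in gaps[:n_bits])


cover = "the quick brown fox jumps over the lazy dog"
stego = embed_open_space(cover, "10110")
recovered = extract_open_space(stego, 5)
```

The capacity is one bit per word gap, which illustrates why only a small amount of data can be hidden this way.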
Two other format-based methods are word shifting and line shifting. In the word shifting method, the horizontal alignment of some words is shifted by changing the distances between words to embed information [18]. These changes are hard to detect because varying distances between words are very common in documents. Another method hides information by manipulating the white space between words and paragraphs [23]. In the line shifting method, the vertical alignment of some lines of the text is shifted to create a unique hidden shape that embeds a message [19].
Random and statistical generation methods are used to generate cover text automatically according to the statistical properties of language. These methods use example grammars to produce cover text in a certain natural language. A probabilistic context-free grammar (PCFG) is a commonly used language model in which each transformation rule of a context-free grammar has a probability associated with it [2]. A PCFG can be used to generate word sequences by starting with the root node and recursively applying randomly chosen rules. The sentences are constructed according to the secret message to be hidden in them. The quality of the generated stego message depends directly on the quality of the grammars used. Another approach of this type is to generate words that have the same statistical properties, such as word length and letter frequency, as the words of the original message. The generated words often have no lexical value.
The last category, the linguistic method, considers the linguistic properties of the text in order to modify it. The method uses the linguistic structure of the message as a place to hide information. The syntactic method is a linguistic steganography method in which punctuation signs such as the comma (,) and the full stop (.) are placed in suitable places in the document to embed data. This method needs proper identification of the places where the signs can be inserted.
Another linguistic steganography method is the semantic method. In this method, synonyms of certain pre-selected words are used. The words are replaced by their synonyms to hide information [17]. Apart from the methods mentioned above, other methods have been proposed for text steganography, such as feature coding, text steganography by specific characters in words, abbreviations etc. [22], or changing the spelling of words [24].
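The semantic (synonym) method described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the scheme of [17]: the synonym table `PAIRS` is invented for illustration, the first word of each pair is taken to encode 0 and the second to encode 1, and the receiver is assumed to know the message length.

```python
# Hypothetical synonym table: first word encodes 0, second word encodes 1.
PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start"), ("small", "little")]
BIT_OF = {word: str(bit) for pair in PAIRS for bit, word in enumerate(pair)}
PAIR_OF = {word: pair for pair in PAIRS for word in pair}


def embed_synonyms(cover: str, bits: str) -> str:
    """Replace each pre-selected word with the synonym that encodes the next bit."""
    out, k = [], 0
    for word in cover.split():
        if word in PAIR_OF and k < len(bits):
            word = PAIR_OF[word][int(bits[k])]
            k += 1
        out.append(word)
    if k < len(bits):
        raise ValueError("cover text has too few pre-selected words")
    return " ".join(out)


def extract_synonyms(stego: str, n_bits: int) -> str:
    """Read one bit from each pre-selected word, up to n_bits."""
    bits = [BIT_OF[word] for word in stego.split() if word in BIT_OF]
    return "".join(bits[:n_bits])


cover = "a big dog made a quick start near a small house"
stego = embed_synonyms(cover, "101")
recovered = extract_synonyms(stego, 3)
```

A real system would also have to handle punctuation, capitalization, and synonyms that do not fit the context, which this sketch ignores.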
PROPOSED METHOD FOR TEXT STEGANOGRAPHY (CALP)
In this paper, a new text steganography method for the English language is proposed. In this method the cover text and the secret message are supplied by the user. The stego text is formed by mapping the binary sequence of the secret message through texture/pattern changes of some letters of the cover text. Figures 5 and 6 below show, respectively, the mappings used for embedding 0s and 1s through pattern changes of selected letters of the cover text. These pattern changes have been incorporated using some unused symbols of the ASCII chart.
SOLUTION METHODOLOGY
The proposed system consists of two windows, one for the cover text generation and the other for the secret message generation. The user is expected to be familiar with the process of information hiding and to have knowledge of steganography systems. The user should be able to compose a plain text as the secret message and another text for use as the carrier (cover text). Finally, the proposed embedding method is used to hide the secret message in the cover text to form the stego text. The user at the receiver side should be able to extract the secret message from the stego text with the help of the reverse process. Figure 7 shows the corresponding GUI for the proposed text steganography system.
ALGORITHMS
In this section the algorithmic process for the embedding and extraction methodology is discussed. Figure 8 shows the block diagram of the proposed steganographic system. The input message is first converted into bits according to the ASCII values of its characters. The bits are then embedded into the cover text according to the methods mentioned earlier, and thus the stego text is generated.
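The first step above, converting the message into bits from the ASCII values, might look like the sketch below. The text does not say how many bits are used per character, so 8 bits per character is an assumption here, and the helper names `text_to_bits` and `bits_to_text` are hypothetical.

```python
def text_to_bits(message: str) -> str:
    """Convert an ASCII message to a bit string, 8 bits per character (assumed width)."""
    return "".join(format(byte, "08b") for byte in message.encode("ascii"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits: group the bits into bytes and decode as ASCII."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[k:k + 8], 2) for k in range(0, len(bits), 8)).decode("ascii")


msg_bits = text_to_bits("Hi")  # 'H' is 72, 'i' is 105
```

Under this assumption a message of m characters needs 8m carrier letters in the cover text.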
A. Algorithm: Stego Text Formation
Let COVER be the cover text, STEGO the stego text, MSG the binary string of the secret message, and N the number of elements in MSG. Initially COVER and STEGO are the same. Set two counters i and j, both initialized to 1. Take an array arr to keep the embedding positions.
Step 1: Generate an appropriate COVER containing 'A' or 'a' or 'c' and 'i' or 'j'. Let k be the size of COVER.
Copy the contents of COVER into STEGO.
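The remaining steps and the actual pattern changes (Figures 5 and 6) are not reproduced in this text, so the following is only a hedged sketch of how such an embedding could work, not the authors' algorithm. It assumes that the letters 'A', 'a', 'c' carry a 0 and the letters 'i', 'j' carry a 1, and it uses accented look-alike characters as hypothetical stand-ins for the modified glyphs. The names `ZERO_MAP`, `ONE_MAP`, `embed`, and `extract` are all illustrative, and positions are 0-based rather than starting at 1.

```python
# Hypothetical stand-ins for the modified glyphs; the real mapping is in Figures 5 and 6.
ZERO_MAP = {"A": "\u00c0", "a": "\u00e0", "c": "\u00e7"}  # assumed carriers for bit 0
ONE_MAP = {"i": "\u00ec", "j": "\u0135"}                  # assumed carriers for bit 1
ZERO_MARKS = set(ZERO_MAP.values())
ONE_MARKS = set(ONE_MAP.values())


def embed(cover: str, bits: str):
    """Return (stego, positions): each bit modifies the next suitable carrier letter."""
    if any(ch in ZERO_MARKS or ch in ONE_MARKS for ch in cover):
        raise ValueError("cover text already contains a modified glyph")
    stego = list(cover)   # STEGO starts as a copy of COVER
    positions = []        # plays the role of arr: where each bit was embedded
    pos = 0
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError("message must be a string of 0s and 1s")
        table = ONE_MAP if bit == "1" else ZERO_MAP
        while pos < len(stego) and stego[pos] not in table:
            pos += 1
        if pos == len(stego):
            raise ValueError("cover text has too few carrier letters")
        stego[pos] = table[stego[pos]]
        positions.append(pos)
        pos += 1
    return "".join(stego), positions


def extract(stego: str) -> str:
    """Read the bits back from the modified glyphs, left to right."""
    bits = []
    for ch in stego:
        if ch in ZERO_MARKS:
            bits.append("0")
        elif ch in ONE_MARKS:
            bits.append("1")
    return "".join(bits)


stego, positions = embed("A cat is in a jar", "0110")
```

The accented characters are chosen only because they are easy to see in a demonstration; the method described in this paper relies on glyphs that are not visibly distinguishable from the originals.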
ANALYSIS OF THE RESULTS
Three main aspects should be taken into account when discussing the results of the proposed text steganography method: security, capacity, and robustness. The authors simulated the proposed system, and the results are shown in Figures 9, 10, and 11. The method satisfies both the security and the hiding capacity requirements. It generates a stego text with minimum degradation that reveals little about the existence of any hidden data, which keeps it secure against eavesdroppers. Although the embedding capacity of the proposed method depends on the structure of the cover text, it can be maximized by using a larger number of letters with minor pattern changes for mapping 0s and 1s.
Similarity Measure of the Cover Text and Stego Text through Correlation
The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient [32], or “Pearson's correlation.” It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton. The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship (correlation), -1 in the case of a perfect negative (decreasing) linear relationship (anticorrelation), and some value between -1 and 1 in all other cases, indicating the degree of linear dependence between the variables. As it approaches zero there is less of a relationship (closer to uncorrelated). The closer the coefficient is to either -1 or 1, the stronger the correlation between the variables. If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables.
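The definition above (covariance divided by the product of the standard deviations) can be computed directly. The sketch below is illustrative only: the text does not state how the cover and stego texts are turned into numeric sequences, so comparing character code points position by position is an assumption, and the names `pearson` and `text_correlation` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The common 1/n factors of covariance and deviations cancel, so they are omitted.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


def text_correlation(cover: str, stego: str) -> float:
    """Correlate two equal-length texts by their character code points (assumed encoding)."""
    return pearson([ord(ch) for ch in cover], [ord(ch) for ch in stego])
```

Two identical texts give a coefficient of 1, and a stego text that differs from the cover in only a few characters stays close to 1.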
If we have a series of n measurements of X and Y written as xi and yi, where i = 1, 2, …, n, then the sample correlation coefficient can be used to estimate the Pearson correlation r between X and Y. The sample correlation coefficient is written as
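The standard textbook form of the sample coefficient, in the notation xi and yi used above, is given below. The symbols \bar{x} and \bar{y} for the sample means are introduced here for the statement and are not defined in the text itself.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```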
CONCLUDING REMARKS
In this paper the authors presented a novel approach to English text steganography. The stego text is generated by mapping the binary sequence of the secret message through texture/pattern changes of some letters of the cover text in order to achieve a high level of security. From Figure 12 it has been observed that the CALP method generates the stego text with minimum or zero degradation, as both the Jaro score and the correlation coefficient are very high. This property also helps the method resist steganalysis. The proposed steganography technique through texture/pattern changing is a new approach for English steganography, and the methodology can be extended to Indian languages as well.
References
|