Detection of Steganographic Messages in Email Attachment

,


Introduction
Steganography has been an important subject since the rise of Information Technology especially with the introduction of Internet.Steganography is the art and science of hiding communication (Provos & Honeyman, 2003).It is a technique of hiding secret messages inside image and audio files.Steganography derives from Greek's word and it means "concealed writing".It hides the message being exchanged in such a way that is undetectable under traditional packet analysis.The ease of use in steganography and the ability to make detecting steganography files difficult has lead to the rise in their use in the Internet as one of a secret communication between two parties.There is also a fear that terrorist organisations are using similar technology to secretly communicate with one another.Thus, this technology has become a threat to the information security over the Internet.
With steganography, one can send messages without anyone having to know of the existence of the communication.Email that contains the files embedded with hidden messages using steganography can be very difficult to detect.Those with a normal packet sniffer would only see an image or music file go by.In order for someone to detect it, the use of the publicly available tools that can detect the movement and types of traffic on the network may be useful.These detection tools can help network administrators to obtain an understanding of traffic within their network.In doing so, this can assist them in detecting any type of irregularity such as the presence of large images through emails around the network.The network administrator then can further investigate and take action on it.
To counter this new security threat and to assist the administrators, we had developed a system called Steganography Sniffer (Stegasniff) that monitor the email messages for image files that have been steganographically modified by publicly available software (Mohd Hazali Mohamed Halip & Mohd Lazim Abdul Raoh, 2009).Stegasniff (as shown in Figure 1) is a system that would be able to discover hidden messages sent through Internet that may be placed on the email attachment.Once detected over a Local Area Network (LAN) and stored to a database, the system will then extract the hidden message to reveal the message.Our work focuses on image file as it is the most widely used media to transmit hidden messages and also of the popular files to be attached in an email.The focus of our work is also on JPEG images since this image type is commonly used on Internet.The remainder of this paper is organised as follows: the introductions of steganography and steganalysis tools, a brief description of emails' protocol and encoding, the architecture and modules of Stegasniff and finally the conclusion of the paper.

Steganography Tools
Presently, there are many steganography tools for hiding data in images files (Johnson & Jajodia, 1998).Some of the tools available on the Internet that can hide messages in JPEG images are: The methods used by the tools to hide information in images includes least significant bit insertion (Lee & Chen, 2000), masking and filtering, algorithms and transformations (Johnson & Jajodia, 1998).

Steganalysis
Steganalysis is a process of detecting and extracting steganographic messages.The main objective of steganalysis is to identify the suspicious steganographic files.Currently, there are two extreme techniques used in steganographic detection (Chandramouli & Subbalakshmi, 2004) which are (a) little or no statistical assumptions about the image under investigation.Statistics are learnt using a large database of training images and (b) a parametric model is assumed for the image and its statistics are computed for steganalysis detection.Each of these techniques has pros and cons.Therefore, it is up to the user to choose an appropriate methodology based on the amount of side information that is available earlier.
There are several steganalysis tools that can be used on the Internet to find hidden messages.These tools could determine which steganographic tools were used to embed the messages onto the image.One of the tools is Stegdetect.It is a tool that analyse JPEG images for steganographic message.It is able to detect several different steganographic techniques and methods to embed hidden messages in JPEG images.Stegdetect can detects images that have hidden messages with Jsteg, JPhide and OutGuess.The output from Stegdetect may contains the lists of steganographic images.Stegdetect shows the level of confidence of the detection with one to three asterisks.Figure 2 shows some output from Stegdetect.Stegdetect is used as one of the modules in our Stegasniff.
Figure 2. The output of Stegdetect However, Stegdetect cannot assure the presence of the hidden message.A steganalysis tool called Stegbreak may be used to verify that the detected images have hidden messages.Stegbreak launches a dictionary attack against the JPEG images.JSteg-Shell, JPHide, or Outguess all hide content based on a user-supplied password, so an attacker can try to guess the password by taking a large dictionary and trying to use every single word in it to retrieve the hidden message (Provos & Honeyman, 2003).

Simple Mail Transfer Protocol (SMTP)
Simple Mail Transfer Protocol (SMTP) is without doubt the most popular protocol that is widely used since the existence of Internet.SMTP is one of the three primary application layer Internet protocols that support email (Tanenbaum, 2003).It is used for sending emails and it is responsible for establishing the connection between an email client and the email server.By using an email client, user can send messages together with other files types such as image, sound, documents and etc.An important issue in any messaging system is that sender and receiver agree on the format of the message content.Such an agreement is possible by including the description of that format as part of the message header.This is the principle underlying Multipurpose Internet Mail Extensions (MIME) (Tanenbaum, 2003).
In order to send content different from ASCII text, the sending user agent must include additional headers in the message.These extra headers are defined as Multipurpose Internet Mail Extensions (MIME).For example, sender's user agent must encode the message body that contains a JPEG image using base64 encoding.Base64 is one of several encoding techniques standardized in the MIME for conversion of the content of email attachment sent through email.Figure 3 shows an example of the MIME message of a JPEG image encoded in base64.

Figure 3. MIME encoding
Our work focus on images sent through email.The analysis was performed on the captured Transmission Control Protocol (TCP) packets with destination SMTP port 25.

The Architecture of Stegasniff
The Stegasniff consists of six modules: Figure 1 shows how the processes are passed between the modules.These modules have specific functions that enable the detection of steganography during the delivery and the reception of the emails in the local network.
The Stegasniff is the main process that controls all other modules.The system administrator (or network administrator) of Stegasniff will be the key player of the system.Stegasniff then invokes Packet Sniffer module which sets the packet filter in the local network to listen only to SMTP port 25.The system then captures the SMTP packets using pthread function which later are stores temporarily in the system's buffer.The Packet Analyzer will then collects the Internet Protocol (IP) and TCP header plus the payload data of the packet.This module observes and records any traffic that contains email messages.This traffic is then given to Mail Analyzer.The analyzer reads the mail header and attachment segment in order to find and decode any JPEG image attachment to the email using the base64 decoder.The saved JPEG image in the temporary memory is then sent to the Steganography Analyzer module.This module which has Stegdetect as part of it, will detect for any steganographic content in the JPEG image.
Stegdetect uses the filename of the JPEG image as its input before it can performs different steganalysis methods on the JPEG image.The result of the analysis is then recorded into the Stegasniff database.This database contains information such as the filename, output or results, and confidence level of each analysis.It also includes the email addresses of the sender and the recipient of the email, the timestamp and the IP address of the sender.Only the positive results are stored in the database as this will protect the privacy of the negatives images.The Web-based Reporter module assists the system administrator to check on the lists of the captured images (as shown in Figure 4) and the details of each steganographic image for further actions.Figure 5 shows the detail of steganographic image captured by Stegasniff.Finally, once stored to the database, the system will then extract the hidden message to reveal the message.This is done by Steganography Extractor module which has Stegbreak tool as part of it.By using Stegbreak, it can extract the messages from the JPEG images that is relevant to a dictionary attack and save it in a separate file.This is shown in Figure 6.

Conclusions
In this paper, we have presented a system called Stegasniff that can retrieve images from the emails and automatically detects whether they might contain steganographic messages.The hidden message is then revealed to its existence in which thwarts the main purpose of steganography.Future enhancements of Stegasniff will include additional support for other types of images and the integration of these tools for capturing steganography contents over the World Wide Web.

Figure 4 .
Figure 4. List of the captured JPEG images