A Protection Method of Target Codes



Introduction
With the development of the modern information society, people depend on software more than on the computer itself. As a result of the rapid popularization of computers in social life, demand for software applications is increasing day by day. At the same time, the complexity of software is rising and the demand for reliability is growing. However, larger software systems have become more fragile; sometimes they behave in unexpected ways and cause tremendous damage, directly or indirectly, to users, for example the Ariane 5 rocket incident in Europe in 1996 and the ambulance incidents in France in 1999. Software faults have gradually become the main source of failure in safety-critical systems. In order to develop more secure and reliable computer software, many countries invest a great deal of manpower and material resources in software reliability research. Software is composed of code, so its reliability is fundamentally manifested as code reliability; protecting the security of code is therefore an important way to guarantee the reliability of software.
At present, code protection methods can be classified into two types: hardware protection methods and software protection methods. With only one of them it is hard to keep a balance among security strength, system cost and implementation cost. Hardware protection methods adopt cryptographic techniques, so security strength can be increased, and a secure coprocessor can increase code execution speed, but the hardware requirement increases the system's implementation cost. Software protection methods such as watermarking or code obfuscation can reduce the cost, but system security and code execution efficiency may be reduced as well.
This paper puts forward a scheme combining the compiler with a hardware platform on the basis of existing protection methods for executable code. The scheme is implemented and simulated on an FPGA (Field Programmable Gate Array). The method not only implements high-security protection for code but also reduces the cost of system implementation.
The remainder of this paper is organized as follows: Section 2 introduces the related technical background; Section 3 introduces the scheme of target code protection; Section 4 describes how the encryption/decryption module and the decompression module are implemented on FPGA; the last section presents related work and the conclusion.

Integrated Development Environment of Quartus II
Quartus II is an integrated development tool for logic circuit design provided by Altera. It not only includes a series of tools for the FPGA development process, but also provides third-party software interfaces to facilitate the extension of its functions.
The design process in Quartus II usually starts with compiling the RTL code, then verifies whether the RTL code achieves the correct function and whether the results meet the timing requirements of the design after layout, and finally downloads the programming file to the target FPGA device to complete the entire development process.

FPGA
An FPGA (Field-Programmable Gate Array) is a further development of programmable devices such as PAL, GAL and CPLD. It appears as a semi-custom circuit in the field of application-specific integrated circuits (ASICs), so it both remedies the shortcomings of custom circuits and overcomes the limited gate count of earlier programmable devices. An FPGA can basically be divided into six parts: programmable input/output units, basic programmable logic units, embedded block RAM, rich routing resources, embedded functional units and embedded dedicated hard cores.

AES Algorithm
AES (Advanced Encryption Standard), also known as the Rijndael algorithm, is a block encryption standard adopted by the United States federal government. The standard has replaced DES and has been widely analyzed and used around the world. This paper adopts the AES symmetric encryption algorithm to encrypt the data. It not only protects code security well but is also convenient to implement in hardware.

Compression Algorithm
Compression algorithms are mainly divided into two categories: lossless compression and lossy compression. Lossless compression refers to a process in which the raw data can be restored exactly by decompressing the compressed information. Lossy compression refers to a process in which the reconstructed data differs slightly from the raw data, but what it expresses remains intact. This paper compresses the target code to save storage space, to adapt to equipment with low storage capacity and to further reduce the amount of encryption, decryption and verification. The paper designs a simple compression algorithm (the BCC algorithm) that encodes the binary data directly. On the one hand, this saves the cost of a mapping table and dictionary and suits the compression of smaller programs; on the other hand, it reduces the complexity of the system security enhancement module across the software and hardware platform.

Models of Code Security Level
According to CNVD statistics from January 2001 to July 2010, input validation errors account for up to 29.82% of vulnerabilities, followed by design errors at 21.56% and boundary condition errors at 19.91%. Input validation errors arise from failing to check the validity of input data provided by users, resulting in buffer overflows, illegal pointers and so on. Buffer overflows account for 2.4% of all vulnerabilities. Thus, buffer overflow and pointer problems are the main problems, and vulnerabilities caused by boundary condition errors rank second. Therefore, by analyzing the statistical information and the faults found in C language programs, this paper divides C language statements into four security levels as shown in Table 1.

Strategies for Codes Protection
Strategies for code protection are divided into three stages:
Stage 1: After the compiler divides the source program into basic blocks, the security level of each statement is marked on the basis of the models of code security level. The security level of each basic block is then calculated as $S_{b_i} = \max_j S_{c_{b_i,j}}$, where $b_i$ denotes basic block $i$, $c_{b_i,j}$ denotes statement $j$ in basic block $b_i$, and $S_k$ denotes the security level of $k$, which may be a basic block or a statement in a basic block. Finally, the mark is written to the corresponding configuration file.
Stage 2: When the compiler generates the target code, the target code is compressed with the compression algorithm. The security control file is then modified according to the actual compression results. Finally, the target code is encrypted and the verification mark is inserted; at the same time, the security control file is encrypted and verified according to Table 2.
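The block-level rule in Stage 1 can be illustrated with a minimal Python sketch. It assumes that security levels are integers in which a larger value means a stricter level, and that each basic block is given as a list of its statement levels; the function name block_security_levels and the block names are hypothetical and not part of the scheme.

```python
from typing import Dict, List


def block_security_levels(blocks: Dict[str, List[int]]) -> Dict[str, int]:
    """Return the security level of each basic block.

    Each block takes the maximum level found among its statements.
    Assumption: levels are integers and a larger value is stricter.
    Empty blocks are skipped because they contain no statement to mark.
    """
    return {name: max(levels) for name, levels in blocks.items() if levels}


if __name__ == "__main__":
    # Hypothetical statement levels for three basic blocks.
    marked = {"b0": [1, 3, 2], "b1": [4], "b2": [1, 1]}
    for name, level in block_security_levels(marked).items():
        print(name, level)
```

Under this rule, a single high-level statement is enough to raise the level of the whole basic block, so a block is never protected less strictly than its most sensitive statement.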

Key Expansion Module
The Key Expansion Module generates the round keys. Taking a 128-bit key as the initial input, it generates the keys needed in the following 10 rounds through the RotWord, SubWord and Rcon XOR operations. pio[17..0] carries the 18 key signals of the keyboard used to enter the initial key. pio[15..0] represents the hexadecimal digits f, e, d…1, 0 of keyboard input. pio[16], the flag bit for the end of input, represents the Enter key. pio[17], the flag bit for deleting an input error, represents the Backspace key. Wren is the enable signal for keyboard input, active low. rst is the module reset signal, active low. rdaddress[3..0] is the read address signal of the key RAM. rden is the read enable signal. rdclock is the key read clock signal. ready_finish is the signal that key expansion has finished. outkey[127..0] is the round key output signal, a 128-bit key after expansion. The module is shown in Figure 4.
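The function of the key expansion can be sketched in software as follows. This is a minimal Python model of standard AES-128 key expansion, not the paper's hardware design: the names gf_mul, SBOX and expand_key are hypothetical, and the S-box is computed from its mathematical definition rather than stored as a table. Index r of the returned list plays the role of rdaddress, and each element corresponds to one 128-bit outkey value.

```python
from typing import List


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def _sbox_entry(x: int) -> int:
    """S-box value: multiplicative inverse followed by the affine transform."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    s = inv
    for shift in range(1, 5):
        s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return s ^ 0x63


SBOX = [_sbox_entry(x) for x in range(256)]


def expand_key(key: bytes) -> List[bytes]:
    """Expand a 128-bit key into 11 round keys (initial key plus 10 rounds)."""
    if len(key) != 16:
        raise ValueError("AES-128 needs a 16-byte key")
    words = [key[4 * i:4 * i + 4] for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]                    # RotWord
            temp = bytes(SBOX[b] for b in temp)           # SubWord
            temp = bytes([temp[0] ^ rcon]) + temp[1:]     # Rcon XOR
            rcon = gf_mul(rcon, 2)
        words.append(bytes(a ^ b for a, b in zip(words[i - 4], temp)))
    return [b"".join(words[4 * r:4 * r + 4]) for r in range(11)]


if __name__ == "__main__":
    key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
    for r, round_key in enumerate(expand_key(key)):
        print(r, round_key.hex())
```

The key used in the usage example is the AES-128 example key from FIPS 197, so the printed round keys can be compared with the key expansion listing in that standard or with the outkey values observed in simulation.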

FPGA Implementation of Decompression Module
The decompression module decompresses the compressed data of a basic block and restores it to the raw data under the control of the clock signal. Basic blocks are divided in units of 64 B (512 bit), so the compressed data of a basic block must be less than 64 B. Following the principle that data of insufficient width is padded with zeros at the low-order positions, the compressed data of a basic block is expanded to 512 bit for the module to operate on. In the module, CLK is the decompression clock signal, ena is the decompression enable signal, datain[511..0] is the compressed data of the basic block after expansion, available is the valid signal for the decompressed data, and dataout[511..0] is the decompressed data of the basic block. The decompression module diagram is shown in Figure 7.
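The zero-padding step that prepares datain[511..0] can be sketched in Python as follows. The sketch assumes that the compressed bytes occupy the high-order end of the 512-bit word and that the first byte is the most significant one; the paper does not state the byte order, and the function name pad_compressed_block is hypothetical.

```python
BLOCK_BYTES = 64  # one basic block is 64 B, i.e. 512 bit


def pad_compressed_block(data: bytes) -> int:
    """Expand compressed block data to a 512-bit word.

    Zeros are supplied at the low-order positions, so the compressed
    data ends up in the high-order bits (assumed byte order: big-endian).
    """
    if len(data) >= BLOCK_BYTES:
        # The text requires compressed data to be smaller than 64 B.
        raise ValueError("compressed block must be smaller than 64 B")
    value = int.from_bytes(data, "big")
    return value << (8 * (BLOCK_BYTES - len(data)))


if __name__ == "__main__":
    word = pad_compressed_block(b"\xab\xcd")
    print(word.to_bytes(BLOCK_BYTES, "big").hex())
```

Because the padding is always at the low-order end, the hardware module can read the compressed data from a fixed position regardless of how well a particular block compressed.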

Related Work
At present, the security of target code generated by compilers is studied from three aspects: vulnerability prevention for code security, property proofs for code security and encryption techniques for executable code. Encryption techniques for executable code are used to prevent attackers from reading, analyzing and tampering with the compiled executable code. There are four methods:
(1) Copyright Notice (Watermark). A copyright notice (watermark) embeds copyright information into the code, which is then run to verify its correctness (Horne, Matheson, Sheehan, & Tarjan, 2002). A watermark cannot stop an attack, but it is useful in tracing and identifying code and in preventing code from being illegally copied.
(2) Code Obfuscation. Code obfuscation is a method through which the compiler deliberately generates disordered code to hide important algorithms and data (Kiyomoto & Shinsaku, 2008). Collberg et al. (2002) stopped attackers from analyzing the control flow graph of a program by realigning or changing code. William Zhu et al. (2006) proved that analyzing an obfuscated program is an NP-hard problem. Obfuscation is simple and practical, needs no encryption or decryption algorithm and can prevent code from being analyzed. But the security level of this technique is low, and it offers no classified protection for the compiler's specific application requirements.
(3) Signatures. Signatures are used to verify whether code has been changed in order to avoid malicious attacks (Han & Wang, 2004). Horne et al. (2002) proposed a self-verification technique that calculates the hash value of a code segment and compares it with the correct value to determine whether the code has been tampered with. But the reliability of this technique depends on the secrecy of the verification and the algorithm, so if an attacker cracks the verification and the algorithm, the verification system will be under threat.
(4) Cryptographic Architectures. In order to prevent code from being exposed in any untrusted main memory and from being analyzed or tampered with, program code is encrypted by the compiler and then run in encrypted form in a secure coprocessor. With the support of the hardware platform, the code generated by the compiler can be encrypted. But the development of this technique, which is based on the code system architecture, is limited by the high cost of the security hardware platform.
In view of these, this paper proposes a scheme combining the compiler with a hardware platform. A protection method that combines software with hardware not only makes up for the deficiency in algorithm secrecy but also decreases the cost of security protection. When the target code is compressed, the length of the flag codes is reduced to a minimum. In this way the cost of security protection can be reduced. Finally, the scheme is implemented and simulated on FPGA to form an effective protection framework for target code.

Conclusion
This paper puts forward a target code protection scheme combining hardware protection with software protection. In the scheme, the compiler compresses the generated target code and parts of the code are encrypted through FPGA. Then the compiler inserts a verification mark into the code. When executed, the target code is decompressed through FPGA. Before being executed by the CPU, the target code is decrypted and verified in light of the block mark information. Since the basic block code is short, a binary compression algorithm is designed. Finally, the encryption/decryption module of the AES algorithm and the decompression module of the BCC algorithm are designed and simulated on FPGA. The simulation waveforms show that the algorithm functions implemented on FPGA are correct and effective.

Figure 8. Simulation diagram of decompression module

Table 1. Models of code security level

Table 2. Security code level and process mode