WO2001076129A2 - Scalable cryptographic engine - Google Patents

Info

Publication number
WO2001076129A2
Authority
WO
WIPO (PCT)
Prior art keywords
coprocessor
cryptographic
data packet
processing
bit
Prior art date
Application number
PCT/US2001/009714
Other languages
French (fr)
Other versions
WO2001076129A3 (en)
Inventor
Phillip Anthony Carswell
Original Assignee
General Dynamics Decision Systems, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Dynamics Decision Systems, Inc.
Priority to AU52972/01A (AU5297201A)
Priority to GB0129287A (GB2367404A)
Priority to CA002375749A (CA2375749A1)
Priority to PL01354956A (PL354956A1)
Publication of WO2001076129A2
Publication of WO2001076129A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/72 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125 Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Definitions

  • the present invention relates generally to data encryption systems, and more particularly to a cryptographic engine that has scalable processing capabilities and that is for use in communications devices requiring secure transmission and reception of data.
  • broadband communications devices include cryptographic coprocessors for executing encryption and decryption processing algorithms, while a general-purpose controller typically executes more traditional device processing algorithms.
  • conventional encryption/decryption engine architectures enable data to be securely transmitted across the bandwidth, such architectures do have associated limitations.
  • the underlying cryptographic engine architecture must include several cryptographic co-processors to handle the symmetric encryption and decryption algorithm processing required for secure data transmission and reception. Because the engine footprint must be large enough to accommodate the coprocessors, an engine design including several cryptographic coprocessors tends to increase overall manufacturing costs and limit device design possibilities, as well as increase overall chip power requirements.
  • FIG. 1 is a schematic block diagram of a programmable cryptographic engine according to a first preferred embodiment of the present invention;
  • FIG. 2 is a schematic block diagram of the main processor shown in FIG. 1;
  • FIG. 3 is a schematic block diagram of one of the coprocessor slices shown in FIG. 1;
  • FIG. 4 is a diagram of a coprocessor permuter shown in FIG. 3;
  • FIG. 5 is a diagram of a non-linear function section of the permuter shown in FIG. 3;
  • FIG. 6 is a diagram of a linear logic unit of the coprocessor shown in FIG. 3;
  • FIG. 7 is a schematic block diagram of the coprocessor controller shown in FIG. 1;
  • FIG. 8 is an instruction format for the coprocessor microsequencer shown in FIG. 1;
  • FIG. 9 is a schematic block diagram of a programmable cryptographic engine according to a second preferred embodiment of the present invention.
  • FIG. 10 is a schematic block diagram of a programmable cryptographic engine according to a third preferred embodiment of the present invention.
  • FIG. 11 is a schematic block diagram of a programmable cryptographic engine according to a fourth preferred embodiment of the present invention.
  • FIG. 12 is a schematic block diagram of a programmable cryptographic engine according to a fifth preferred embodiment of the present invention.
  • Referring now to the drawings in which like numerals reference like parts, FIG. 1 shows a cryptographic processor engine 10 according to a first embodiment of the present invention.
  • the engine 10 which is typically fabricated on a single semiconductor substrate, includes a main processor 12 for performing general purpose engine processing, and a coprocessor 14 having a scalable cryptographic processing capacity that will be described below in more detail.
  • the engine 10 is capable of processing externally-received data packets via any of a variety of techniques including plain-text data encryption, cipher-text data decryption, in-band signaling processing required for bit, word or frame formatting, in-band signal detection, and error detection and correction processing techniques.
  • the engine 10 is capable of executing a single or multiple independent cryptographic algorithms on a large state space, and is capable of supporting both active and shadow techniques, thereby enabling algorithm and key agility through rapid context switching between algorithms and/or state and variable associations.
  • the main processor 12 handles all communications with external devices, such as other communications device processors, through several interfaces. Specifically, the main processor 12 receives configuration and initial state loading instructions through a control interface 16 and performs processing operations, such as asymmetric processing operations used to initialize a data/voice link, based on instructions stored in and fetched from a processor memory 18, such as a microcode RAM, in response to the received configuration and initial state loading instructions.
  • the main processor 12 may also receive data to be processed from an external memory such as a RAM memory (not shown) via a RAM interface 20 if a particular application requires temporary storage of data to be processed, and outputs the data after it has been processed on one or both of the red and black interfaces 22, 24.
  • the main processor 12 includes a 32-bit microsequencer 26 that includes the interfaces 16, 20, 22, 24. Also, the microsequencer 26 is coupled to an arithmetic logic unit (ALU) 28, multiplexers 29a, 29b, a register file 30 and a comparison unit 34. The microsequencer 26 manages all main processor resources based on instructions stored and fetched from the microcode RAM 18.
  • ALU arithmetic logic unit
  • the ALU 28 is programmed to perform operations such as AND/OR, XOR, ADD and SUBTRACT operations to enable the main processor 12 to perform standard Boolean and arithmetic processing of data packet portions, referred to as data packet elements, selectively output from the multiplexers 29a, 29b.
  • the data packet elements processed by the ALU 28 are output to either the register file 30, to the microsequencer 26 for transmission back to external processors through either the red or black interfaces 22, 24, or to the comparison unit 34.
  • the register file 30 functions as a scratchpad memory for temporary data storage when a message received from the microsequencer 26, such as a decode message, requires that the ALU 28 run an algorithm several times, while the comparison unit 34 enables the main processor 12 to compare two 32-bit data packet elements output from the microsequencer 26 and to generate the number of bit errors occurring in one clock cycle for engine processing integrity purposes.
  • the scalable coprocessor 14 is controlled by the main processor 12 and performs cryptographic processing functions in parallel with the functions performed by the main processor.
  • the coprocessor is operative.
  • the coprocessor is a multiple slice coprocessor with cryptographic processing slices 34a-34n each having corresponding volatile memories, such as microcode RAMs 36a-36n, for storing slice-specific independent control instructions, where the number n is application-specific and is based on design parameters such as chip cost and performance requirements.
  • a coprocessor controller 38 sequentially handles all operations to be performed by the coprocessor based on instructions retrieved from the memory 39 in response to commands received from the main processor 12, and informs the main processor 12 each time a slice has completed a specific processing operation.
  • the coprocessor 14 is capable of simultaneously performing several independent cryptographic processing operations, with the exact number of possible operations being dependent on the number of slices, at higher, albeit synchronous, clock rates than that of the main processor 12.
  • the coprocessor may receive instructions to loop through a particular encryption algorithm 64 times to generate encrypted data packets, while the main processor need only input the data packets to be encrypted to the coprocessor 14 and subsequently output the resulting encrypted data packets through the red/black interfaces 22, 24. Therefore, the cryptographic processor engine of the present invention is capable of supporting data rates equal to the throughput of the coprocessor 14 in a manner that minimizes overall engine power consumption.
  • the coprocessor includes a 64 x 32-bit register bank 40 that includes a command register 40a, variable registers 40b and state registers 40c.
  • the command register 40a hereinafter referred to as the register R0, is for receiving and loading commands from the main processor 12 and for subsequently sending those commands to the coprocessor controller 38.
  • the variable registers 40b are for storing data packet variables for the slices 34a - 34n such as, for example, a session key for use by the coprocessor 14 in performing cryptographic functions.
  • the state registers 40c are for storing state data such as, for example, channel program data representing a current execution state for an encryption algorithm being executed by one of the slices.
  • the state registers 40c which are typically accessible by main processors in conventional cryptographic engine architectures, are included as part of the coprocessor 14 in the present invention to enable the number of coprocessor slices to be set according to cryptographic application processing parameters.
  • FIG. 3 is a block diagram of the cryptographic slice 34a of FIG. 1 , with it being understood that the other cryptographic slices 34b - 34n are of like structure and similar function, with the exact function of each slice being dependent on the instructions stored in the microcode RAMs 36a - 36n.
  • the cryptographic slice 34a performs a designated encryption/decryption function for the cryptographic engine 10 that can be independent from processing functions performed by other cryptographic slices that may be included in the engine architecture, even though the slice 34a may share state space in the register bank 40 with other slices.
  • the slice 34a includes an 8-bit microsequencer 50 that provides the slice with processing power and that manages program flow for the slice based on instructions stored in and fetched from the microcode RAM 36a.
  • each slice includes a microsequencer such as the microsequencer 50, each slice is capable of performing a processing function independently from other slices.
  • the independent processing capability of each slice produces economies of scale, as the coprocessor can simultaneously execute a number of independent cryptographic algorithms corresponding generally to the number of coprocessor slices, thereby increasing overall engine data packet throughput.
  • the microsequencer 50 controls operation of an input register 54, which in turn provides control bits necessary to move data into and out of the register bank 40 and control bits necessary to select various coprocessor cryptographic functions. More specifically, the microsequencer 50 initiates operation of the input register 54 by inputting an address into an input register non-volatile memory 56, such as a microcode RAM, that stores register-specific operating instructions.
  • an input register non-volatile memory 56 such as a microcode RAM
  • the microsequencer 50 then latches data output from the RAM 56 to control a permuter 58, which has a 160 x 160-bit linear permutation section 60 and a non-linear function unit including a 160 x 144-bit permutation section 62 and a non-linear lookup table section 64 with, for example, sixteen 9:1 lookup tables.
  • the data slice 34a is capable of processing ten 32-bit data packet elements input from the register bank 40 to produce four 32-bit encrypted/decrypted data packet elements for return to the state registers 40c every clock cycle. More specifically, five of the input data packet elements, indicated at 70, are input into and permuted by the linear permutation section 60 to create four new 32-bit data packet elements at 72 that are then input into the linear function unit 66. The other five input data packet elements, indicated at 74, are set up by the nonlinear permutation section 62 before being mapped to values in the lookup tables 64 and then output to the linear logic unit 66.
  • the linear permutation section 60 is shown in more detail.
  • the linear permutation section 60 is capable of routing any input bit to any output bit based on control instructions stored in a local section memory (not shown).
  • Each of the five 32-bit data packet elements at 70 is input into a 4:1 multiplexer 76.
  • the multiplexer 76 outputs each 32-bit data packet element as four separate 8-bit data packet elements.
  • the 8-bit data packet elements in turn are input into an 8:1 multiplexer 78.
  • the multiplexer 78 outputs each of the 8-bit data packet elements to a multiplexer 80 as eight separate bits of data.
  • the multiplexer 80 sends each of the data bits along with the thirty-one other 1-bit outputs as a permuted 32-bit output to the linear logic unit 66.
  • FIG. 5 shows that the non-linear lookup table section 64 is composed of sixteen separate 512 x 1-bit lookup tables, as indicated generally at 81. Sixteen 9-bit data packet elements, resulting from the five 32-bit data packet elements at 74 being permuted by the non-linear permutation section 62, are input into corresponding ones of the lookup tables and mapped to a single bit non-linear table value. The resulting 16-bit data packet element is concatenated at 80 with a previously-generated 16-bit result stored in a delay register 82 to form a fifth new 32-bit data packet element input to the linear function unit at 84.
  • the non-linear function unit configuration of the present invention enables different slice encryption/decryption algorithms to efficiently utilize the non-linear function unit lookup tables 81.
  • one algorithm may utilize four of the memories as a 4-bit lookup table, while a separate algorithm may simultaneously and independently address eight of the other memories for use as an 8-bit lookup table.
  • Any combination of the lookup tables 81 may be utilized to create a single multi-bit lookup table based on the cryptographic algorithm parameters necessary for a particular slice to perform its stored cryptographic algorithm or function.
  • the linear function unit 66 processes the four 32-bit input data packet elements 72 generated by the permutation section 60 and the one 32-bit data packet element 84 generated by the non-linear function unit to produce the four 32-bit linear results at 86. More specifically, the linear function unit 66 either processes the data packet elements In1 - In5 through the EXOR tree indicated generally at 90 or bypasses data packet elements In1 - In4 directly to a multiplexer bank 92 based on instructions from the microsequencer 50. The multiplexer bank 92 then outputs the resulting data packet elements as data packet elements Out1 - Out4 back to the register bank 40 for transmission back to the main processor 12.
  • the coprocessor controller 38 includes an 8-bit microsequencer 94 that provides slice processing power and that manages program flow for a given slice-executed encryption/decryption algorithm in response to commands received from the main processor 12.
  • the microsequencer 94 may include a three stage pipeline for performing, for example, fetch, execute and write operations on the program instructions in the memory 39. According to one embodiment of the present invention, the microsequencer 94 does not perform conditional operations, and therefore is capable of operating at high speeds with a simple design.
  • the coprocessor controller 38 also includes a 4 x 16-bit stack 96 that enables the microsequencer 94 to execute up to four nested loops.
  • the stack 96 in combination with program and loop counting circuitry 98, enables the microsequencer to perform in-line code and loop execution processing for the slices 36a - 36n when the main processor 12 writes initial program count and loop count values to the input register 98 through the command register R0. Consequently, the stack 96 minimizes the number of necessary program execution instructions and thus the size of the slice memories.
  • the stack 96 also enables the engine 10 to run encryption algorithms such as data encryption standard (DES) algorithms that generate 16 different versions of a single session key, and codebook algorithms that necessitate sixteen rounds of actual encryption computation.
  • DES data encryption standard
  • the format of the instructions stored in the memory 39 of the microsequencer 94 is shown at 100 in FIG. 8.
  • the 00 Continue Opcode is used to execute in-line code, while the 01 Loop Start Opcode is used to signify the start of a loop.
  • the microsequencer 94 executes the 01 Loop Start Opcode, a value contained in the Loop cnt field and the next instruction address are pushed onto the stack 96.
  • the 10 Loop End Opcode signifies the end of a loop and causes a current count value stored in the program and loop counting circuitry 98 to be decremented when executed by the microsequencer 94.
  • the program counter component 102 of the program and loop counting circuitry 98 is loaded with a current Loop cnt from the stack 96. If the count is equal to zero, the current entry is removed from the stack 96, and the microsequencer 94 continues to execute in-line instructions.
  • the cryptographic processing engine 10 of the present invention is designed so that the configuration of the coprocessor controller 38 remains the same regardless of the number of coprocessor slices.
  • the number of instructions that must be generated by the coprocessor controller 38 to perform equivalent functional operations is actually reduced as the number of cryptographic slices is increased.
  • the decrease in the number of required instructions subsequently enables the size of the corresponding coprocessor controller memory 39 to be reduced.
  • FIGs. 9 - 12 show additional embodiments of the scalable cryptographic coprocessor of the present invention at 110, 110', 110" and 110'", respectively.
  • the cryptographic coprocessor 110 is a single-slice cryptographic coprocessor with a slice 134a.
  • the cryptographic coprocessor 110' is a double-slice cryptographic coprocessor with slices 134a', 134b'.
  • the cryptographic coprocessor 110" is a triple-slice cryptographic coprocessor with slices 134a"-134c".
  • the cryptographic coprocessor 110'" is a quadruple-slice cryptographic coprocessor with slices 134a'"-134d'".
  • the cryptographic engine 110 has the least amount of processing power, and therefore the lowest associated cost and the smallest footprint, while the cryptographic engine 110'" has the most processing power, and therefore the highest associated cost and largest footprint. Additional slices could be added to the embodiments shown to increase processing power if necessary.
  • FIGs. 9-12 therefore show that the scalable architecture of the present invention provides design flexibility that enables a cryptographic engine to be configured to fit within specific application processing and cost parameters. This design flexibility eliminates the need for multiple cryptographic engines to support high performance cryptographic algorithm processing.
  • the single engine architecture of the present invention also enhances overall cryptographic processing performance when compared to conventional cryptographic engine architectures, as it has a target throughput encryption/decryption processing rate of 50 - 200 Mbps for a combination of algorithms and associations.
  • the above-described scalable programmable cryptographic engine of the present invention is designed to support high performance communications applications such as personal computer cards, network encryption systems, and satellite communications, and can be embedded in applications such as programmable and handheld radios, avionics equipment, network security systems, telephony and numerous other applications requiring secure data transmission and reception capabilities.
  • the engine can ensure data integrity on both personal computers and networks and at the same time maintain interoperability with numerous cryptographic algorithm implementations.
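
The non-linear lookup table section described in the list above (sixteen 512 x 1-bit tables, each addressed by a 9-bit element, with the 16-bit result concatenated with the previous result held in a delay register) can be illustrated with a minimal software sketch. This is not the patented circuit. The class and method names are hypothetical, and the choice to place the delayed result in the upper half of the 32-bit word is an assumption, since the text does not state the concatenation order.

```python
NUM_TABLES = 16      # sixteen lookup tables, per the description
TABLE_DEPTH = 512    # each table is 512 x 1 bit, i.e. a 9-bit address


class NonLinearSection:
    """Hypothetical software model of the lookup table section 64."""

    def __init__(self, tables):
        if len(tables) != NUM_TABLES:
            raise ValueError("expected sixteen tables")
        if any(len(t) != TABLE_DEPTH for t in tables):
            raise ValueError("each table must hold 512 one-bit entries")
        self.tables = tables
        self.delay = 0  # models the delay register: previous 16-bit result

    def step(self, addresses):
        """Map sixteen 9-bit addresses to one 32-bit data packet element."""
        if len(addresses) != NUM_TABLES:
            raise ValueError("expected sixteen 9-bit addresses")
        result = 0
        for i, addr in enumerate(addresses):
            bit = self.tables[i][addr & 0x1FF] & 1  # one bit per table
            result |= bit << i
        # Assumption: previous result forms the upper 16 bits, new result
        # the lower 16 bits. The source does not specify the order.
        word = (self.delay << 16) | result
        self.delay = result
        return word
```

In this sketch each call to `step` models one pass through the tables, so the first output word carries zeros in the half that would otherwise hold the delayed result.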

Abstract

A cryptographic engine (10) that includes a scalable cryptographic coprocessor (14) that is controlled by, and separate from, a main engine processor (12). The coprocessor includes a register bank (40) for receiving and storing data packets to be encrypted, and cryptographic processing slices (34a-34n) coupled to the register bank (40) with a processing capacity that is scalable based on application-specific parameters. The coprocessor (14) also includes a control device (38) coupled to the register bank (40) and the cryptographic processing device (34a-34n) for instructing the cryptographic processing slices to perform a cryptographic processing operation unique to each cryptographic processing slice (34a-34n) based on externally-received processing instructions.

Description

SCALABLE CRYPTOGRAPHIC ENGINE
Background of the Invention
Field of the Invention
The present invention relates generally to data encryption systems, and more particularly to a cryptographic engine that has scalable processing capabilities and that is for use in communications devices requiring secure transmission and reception of data.
Description of Related Art
As the sophistication and importance of broadband communications devices continues to increase in both commercial and military applications, so does the need to maintain the security of data transmitted by these devices. Currently, such broadband communications devices include cryptographic coprocessors for executing encryption and decryption processing algorithms, while a general-purpose controller typically executes more traditional device processing algorithms. While conventional encryption/decryption engine architectures enable data to be securely transmitted across the bandwidth, such architectures do have associated limitations. For example, in a multi-band multi-mode radio or other similar device requiring high performance cryptographic processing, the underlying cryptographic engine architecture must include several cryptographic coprocessors to handle the symmetric encryption and decryption algorithm processing required for secure data transmission and reception. Because the engine footprint must be large enough to accommodate the coprocessors, an engine design including several cryptographic coprocessors tends to increase overall manufacturing costs and limit device design possibilities, as well as increase overall chip power requirements.
Brief Description of the Drawings
Additional objects and advantages of the present invention will be more readily apparent from the following detailed description of preferred embodiments thereof when taken together with the accompanying drawings in which:
FIG. 1 is a schematic block diagram of a programmable cryptographic engine according to a first preferred embodiment of the present invention;
FIG. 2 is a schematic block diagram of the main processor shown in FIG. 1;
FIG. 3 is a schematic block diagram of one of the coprocessor slices shown in FIG. 1;
FIG. 4 is a diagram of a coprocessor permuter shown in FIG. 3;
FIG. 5 is a diagram of a non-linear function section of the permuter shown in FIG. 3;
FIG. 6 is a diagram of a linear logic unit of the coprocessor shown in FIG. 3;
FIG. 7 is a schematic block diagram of the coprocessor controller shown in FIG. 1;
FIG. 8 is an instruction format for the coprocessor microsequencer shown in FIG. 1;
FIG. 9 is a schematic block diagram of a programmable cryptographic engine according to a second preferred embodiment of the present invention;
FIG. 10 is a schematic block diagram of a programmable cryptographic engine according to a third preferred embodiment of the present invention;
FIG. 11 is a schematic block diagram of a programmable cryptographic engine according to a fourth preferred embodiment of the present invention; and
FIG. 12 is a schematic block diagram of a programmable cryptographic engine according to a fifth preferred embodiment of the present invention.
Detailed Description of a Preferred Embodiment
Referring now to the drawings in which like numerals reference like parts, FIG. 1 shows a cryptographic processor engine 10 according to a first embodiment of the present invention. The engine 10, which is typically fabricated on a single semiconductor substrate, includes a main processor 12 for performing general purpose engine processing, and a coprocessor 14 having a scalable cryptographic processing capacity that will be described below in more detail. The engine 10 is capable of processing externally-received data packets via any of a variety of techniques including plain-text data encryption, cipher-text data decryption, in-band signaling processing required for bit, word or frame formatting, in-band signal detection, and error detection and correction processing techniques. The engine 10 is capable of executing a single or multiple independent cryptographic algorithms on a large state space, and is capable of supporting both active and shadow techniques, thereby enabling algorithm and key agility through rapid context switching between algorithms and/or state and variable associations.
As shown in FIG. 1 , the main processor 12 handles all communications with external devices, such as other communications device processors, through several interfaces. Specifically, the main processor 12 receives configuration and initial state loading instructions through a control interface 16 and performs processing operations, such as asymmetric processing operations used to initialize a data/voice link, based on instructions stored in and fetched from a processor memory 18, such as a microcode RAM, in response to the received configuration and initial state loading instructions. The main processor 12 may also receive data to be processed from an external memory such as a RAM memory (not shown) via a RAM interface 20 if a particular application requires temporary storage of data to be processed, and outputs the data after it has been processed on one or both of the red and black interfaces 22, 24.
Referring to FIG. 2, the main processor 12 includes a 32-bit microsequencer 26 that includes the interfaces 16, 20, 22, 24. Also, the microsequencer 26 is coupled to an arithmetic logic unit (ALU) 28, multiplexers 29a, 29b, a register file 30 and a comparison unit 34. The microsequencer 26 manages all main processor resources based on instructions stored in and fetched from the microcode RAM 18.
The ALU 28 is programmed to perform operations such as AND/OR, XOR, ADD and SUBTRACT operations to enable the main processor 12 to perform standard Boolean and arithmetic processing of data packet portions, referred to as data packet elements, selectively output from the multiplexers 29a, 29b. The data packet elements processed by the ALU 28 are output to either the register file 30, to the microsequencer 26 for transmission back to external processors through either the red or black interfaces 22, 24, or to the comparison unit 34. The register file 30 functions as a scratchpad memory for temporary data storage when a message received from the microsequencer 26, such as a decode message, requires that the ALU 28 run an algorithm several times, while the comparison unit 34 enables the main processor 12 to compare two 32-bit data packet elements output from the microsequencer 26 and to generate the number of bit errors occurring in one clock cycle for engine processing integrity purposes.
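As an illustration only, the ALU operations and the bit-error count produced by the comparison unit can be modeled in a few lines of software. The function names below are hypothetical, and the wraparound of ADD and SUBTRACT to 32 bits is an assumption, since the text does not describe overflow handling. The bit-error count is modeled as the number of differing bit positions between two 32-bit data packet elements.

```python
MASK32 = 0xFFFFFFFF  # data packet elements are 32 bits wide


def alu(op: str, a: int, b: int) -> int:
    """Hypothetical model of the ALU 28 operations named in the text."""
    if op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "XOR":
        result = a ^ b
    elif op == "ADD":
        result = a + b
    elif op == "SUBTRACT":
        result = a - b
    else:
        raise ValueError(f"unsupported operation: {op}")
    # Assumption: results wrap to 32 bits (overflow is not described).
    return result & MASK32


def bit_errors(a: int, b: int) -> int:
    """Count the bit positions in which two 32-bit elements differ."""
    diff = (a ^ b) & MASK32
    return bin(diff).count("1")
```

For example, `bit_errors(0b1010, 0b0110)` counts two differing positions, which is the kind of figure the comparison unit would report for integrity checking.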
Referring back to FIG. 1 , the scalable coprocessor 14 is controlled by the main processor 12 and performs cryptographic processing functions in parallel with the functions performed by the main processor. The coprocessor is operative. The coprocessor is a multiple slice coprocessor with cryptographic processing slices 34a-34n each having corresponding volatile memories, such as microcode RAMs 36a-36n, for storing slice-specific independent control instructions, where the number n is application-specific and is based on design parameters such as chip cost and performance requirements. A coprocessor controller 38 sequentially handles all operations to be performed by the coprocessor based on instructions retrieved from the memory 39 in response to commands received from the main processor 12, and informs the main processor 12 each time a slice has completed a specific processing operation.
Because the slices 34a-34n perform cryptographic processing operations independently from one another and in parallel with the processing of the main processor, the coprocessor 14 is capable of simultaneously performing several independent cryptographic processing operations, with the exact number of possible operations being dependent on the number of slices, at higher, albeit synchronous, clock rates than that of the main processor 12. For example, the coprocessor may receive instructions to loop through a particular encryption algorithm 64 times to generate encrypted data packets, while the main processor need only input the data packets to be encrypted to the coprocessor 14 and subsequently output the resulting encrypted data packets through the red/black interfaces 22, 24. Therefore, the cryptographic processor engine of the present invention is capable of supporting data rates equal to the throughput of the coprocessor 14 in a manner that minimizes overall engine power consumption.
As shown in FIG. 1, the coprocessor includes a 64 x 32-bit register bank 40 that includes a command register 40a, variable registers 40b and state registers 40c. The command register 40a, hereinafter referred to as the register R0, is for receiving and loading commands from the main processor 12 and for subsequently sending those commands to the coprocessor controller 38. The variable registers 40b are for storing data packet variables for the slices 34a-34n such as, for example, a session key for use by the coprocessor 14 in performing cryptographic functions. The state registers 40c are for storing state data such as, for example, channel program data representing a current execution state for an encryption algorithm being executed by one of the slices. The state registers 40c, which are typically accessible by main processors in conventional cryptographic engine architectures, are included as part of the coprocessor 14 in the present invention to enable the number of coprocessor slices to be set according to cryptographic application processing parameters.
FIG. 3 is a block diagram of the cryptographic slice 34a of FIG. 1, with it being understood that the other cryptographic slices 34b-34n are of like structure and similar function, with the exact function of each slice being dependent on the instructions stored in the microcode RAMs 36a-36n. The cryptographic slice 34a performs a designated encryption/decryption function for the cryptographic engine 10 that can be independent from processing functions performed by other cryptographic slices that may be included in the engine architecture, even though the slice 34a may share state space in the register bank 40 with other slices. The slice 34a includes an 8-bit microsequencer 50 that provides the slice with processing power and that manages program flow for the slice based on instructions stored in and fetched from the microcode RAM 36a. Because each slice includes a microsequencer such as the microsequencer 50, each slice is capable of performing a processing function independently from other slices. The independent processing capability of each slice produces economies of scale, as the coprocessor can simultaneously execute a number of independent cryptographic algorithms corresponding generally to the number of coprocessor slices, thereby increasing overall engine data packet throughput.
The microsequencer 50 controls operation of an input register 54, which in turn provides control bits necessary to move data into and out of the register bank 40 and control bits necessary to select various coprocessor cryptographic functions. More specifically, the microsequencer 50 initiates operation of the input register 54 by inputting an address into an input register volatile memory 56, such as a microcode RAM, that stores register-specific operating instructions. The microsequencer 50 then latches data output from the RAM 56 to control a permuter 58, which has a 160 x 160-bit linear permutation section 60 and a non-linear function unit including a 160 x 144-bit permutation section 62 and a non-linear lookup table section 64 with, for example, sixteen 9:1 lookup tables.
In operation, the data slice 34a is capable of processing ten 32-bit data packet elements input from the register bank 40 to produce four 32-bit encrypted/decrypted data packet elements for return to the state registers 40c every clock cycle. More specifically, five of the input data packet elements, indicated at 70, are input into and permuted by the linear permutation section 60 to create four new 32-bit data packet elements at 72 that are then input into the linear function unit 66. The other five input data packet elements, indicated at 74, are set up by the nonlinear permutation section 62 before being mapped to values in the lookup tables 64 and then output to the linear logic unit 66.
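By way of illustration only, the bit routing performed by the linear permutation section 60 can be sketched in software. The sketch assumes the five 32-bit input elements are treated as one 160-bit word and that a routing table names, for each of the 128 output bits, the input bit that drives it. The names `pack`, `unpack`, `permute` and `routing`, and the bit ordering, are hypothetical and are not taken from the disclosure.

```python
def pack(elements, width=32):
    """Concatenate elements into one integer; elements[0] holds the lowest bits."""
    mask = (1 << width) - 1
    value = 0
    for i, element in enumerate(elements):
        value |= (element & mask) << (i * width)
    return value


def unpack(value, count, width=32):
    """Split an integer back into `count` elements of `width` bits each."""
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(count)]


def permute(elements, routing, out_count=4):
    """Route input bits to output bits.

    routing[k] is the index of the input bit that drives output bit k
    (an assumed encoding of the section's control instructions).
    """
    source = pack(elements)
    out = 0
    for k, j in enumerate(routing):
        out |= ((source >> j) & 1) << k
    return unpack(out, out_count)
```

With an identity routing table the first four input elements pass through unchanged; any other table reorders or replicates input bits.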
In FIG. 4, the linear permutation section 60 is shown in more detail. The linear permutation section 60 is capable of routing any input bit to any output bit based on control instructions stored in a local section memory (not shown). Each of the five 32-bit data packet elements at 70 is input into a 4:1 multiplexer 76. The multiplexer 76 outputs each 32-bit data packet element as four separate 8-bit data packet elements. The 8-bit data packet elements in turn are input into an 8:1 multiplexer 78. The multiplexer 78 outputs each of the 8-bit data packet elements to a multiplexer 80 as eight separate bits of data. The multiplexer 80 sends each of the data bits along with the thirty-one other 1-bit outputs as a permuted 32-bit output to the linear logic unit 66.
FIG. 5 shows that the non-linear lookup table section 64 is composed of sixteen separate 512 x 1-bit lookup tables, as indicated generally at 81. Sixteen 9-bit data packet elements, resulting from the five 32-bit data packet elements at 74 being permuted by the non-linear permutation section 62, are input into corresponding ones of the lookup tables and mapped to a single bit non-linear table value. The resulting 16-bit data packet element is concatenated at 80 with a previously-generated 16-bit result stored in a delay register 82 to form a fifth new 32-bit data packet element input to the linear function unit at 84.
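By way of illustration only, the lookup table stage of FIG. 5 can be sketched as follows. Each of the sixteen tables is modeled as a list of 512 one-bit entries addressed by a 9-bit value. The function name `nonlinear_stage` is hypothetical, and placing the delayed result in the upper 16 bits of the output is an assumption, since the text does not state the concatenation order.

```python
def nonlinear_stage(nine_bit_inputs, tables, delayed16):
    """Map sixteen 9-bit inputs through sixteen 512 x 1-bit lookup tables.

    Returns (element32, result16): the 32-bit element formed with the
    previous 16-bit result, and the new 16-bit result that would be
    held in the delay register for the next cycle.
    """
    if len(nine_bit_inputs) != 16 or len(tables) != 16:
        raise ValueError("expected sixteen inputs and sixteen tables")
    result16 = 0
    for i, (addr, table) in enumerate(zip(nine_bit_inputs, tables)):
        result16 |= (table[addr & 0x1FF] & 1) << i
    # Assumed ordering: delayed result in the high half, new result in the low half.
    element32 = ((delayed16 & 0xFFFF) << 16) | result16
    return element32, result16
```

Calling the function once per clock cycle and feeding `result16` back in as `delayed16` reproduces the role of the delay register 82 in this model.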
Still referring to FIG. 5, the non-linear function unit configuration of the present invention enables different slice encryption/decryption algorithms to efficiently utilize the non-linear function unit lookup tables 81. For example, one algorithm may utilize four of the memories as a 4-bit lookup table, while a separate algorithm may simultaneously and independently address eight of the other memories for use as an 8-bit lookup table. Any combination of the lookup tables 81 may be utilized to create a single multi-bit lookup table based on the cryptographic algorithm parameters necessary for a particular slice to perform its stored cryptographic algorithm or function.
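By way of illustration only, grouping several of the 1-bit lookup tables 81 into a single multi-bit table can be sketched as follows. The sketch assumes that every table in a group is addressed by the same 9-bit value, which the text does not state; `multi_bit_lookup` and `table_ids` are hypothetical names.

```python
def multi_bit_lookup(tables, table_ids, addr):
    """Use the selected 512 x 1-bit tables as one table len(table_ids) bits wide.

    tables    -- sixteen lists of 512 one-bit entries
    table_ids -- indices of the tables that form the group
    addr      -- 9-bit address shared by the group (an assumption)
    """
    value = 0
    for bit, table_id in enumerate(table_ids):
        value |= (tables[table_id][addr & 0x1FF] & 1) << bit
    return value
```

In this model one algorithm could call `multi_bit_lookup(tables, [0, 1, 2, 3], addr)` for a 4-bit result while another uses eight of the remaining tables for an 8-bit result, without either call touching the other's tables.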
Referring now to FIG. 6, the linear function unit 66 is shown in more detail. The linear function unit processes the four 32-bit input data packet elements 72 generated by the permutation section 60 and the one 32-bit data packet element 84 generated by the non-linear function unit to produce the four 32-bit linear results at 86. More specifically, the linear function unit 66 either processes the data packet elements In1 - In5 through the EXOR tree indicated generally at 90 or bypasses data packet elements In1 - In4 directly to a multiplexer bank 92 based on instructions from the microsequencer 50. The multiplexer bank 92 then outputs the resulting data packet elements as data packet elements Out1 - Out4 back to the register bank 40 for transmission back to the main processor 12.
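By way of illustration only, the choice between the EXOR path and the bypass path can be sketched as follows. The actual wiring of the EXOR tree 90 is not given in the text, so the sketch assumes the simplest possible combination, in which each of In1 - In4 is XORed with In5. The function name and the `use_xor` flag are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def linear_function_unit(in1_to_in4, in5, use_xor):
    """Return Out1..Out4 from In1..In4 and In5.

    use_xor=True  -- assumed EXOR path: Out_i = In_i XOR In5
    use_xor=False -- bypass path: Out_i = In_i
    """
    if len(in1_to_in4) != 4:
        raise ValueError("expected four input elements")
    if use_xor:
        return [(x ^ in5) & MASK32 for x in in1_to_in4]
    return [x & MASK32 for x in in1_to_in4]
```

The flag stands in for the microsequencer instruction that drives the multiplexer bank 92.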
Referring to FIG. 7, a block diagram of the coprocessor controller 38 is shown. The coprocessor controller 38 includes an 8-bit microsequencer 94 that provides slice processing power and that manages program flow for a given slice-executed encryption/decryption algorithm in response to commands received from the main processor 12. The microsequencer 94 according to one embodiment of the present invention may include a three stage pipeline for performing, for example, fetch, execute and write operations on the program instructions in the memory 39. According to one embodiment of the present invention, the microsequencer 94 does not perform conditional operations, and therefore is capable of operating at high speeds with a simple design.
The coprocessor controller 38 also includes a 4 x 16-bit stack 96 that enables the microsequencer 94 to execute up to four nested loops. The stack 96, in combination with program and loop counting circuitry 98, enables the microsequencer to perform in-line code and loop execution processing for the slices 34a-34n when the main processor 12 writes initial program count and loop count values to the program and loop counting circuitry 98 through the command register R0. Consequently, the stack 96 minimizes the number of necessary program execution instructions and thus the size of the slice memories. The stack 96 also enables the engine 10 to run encryption algorithms such as data encryption standard (DES) algorithms that generate 16 different versions of a single session key, and codebook algorithms that necessitate sixteen rounds of actual encryption computation.
The format of the instructions stored in the memory 39 of the microsequencer 94 is shown at 100 in FIG. 8. The 00 Continue Opcode is used to execute in-line code, while the 01 Loop Start Opcode is used to signify the start of a loop. When the microsequencer 94 executes the 01 Loop Start Opcode, a value contained in the Loop cnt field and the next instruction address are pushed onto the stack 96. The 10 Loop End Opcode signifies the end of a loop and causes a current count value stored in the program and loop counting circuitry 98 to be decremented when executed by the microsequencer 94. If the current count is not equal to zero, the program counter component 102 of the program and loop counting circuitry 98 is loaded with the loop start address, that is, the next instruction address previously pushed onto the stack 96. If the count is equal to zero, the current entry is removed from the stack 96, and the microsequencer 94 continues to execute in-line instructions.
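By way of illustration only, the Continue, Loop Start and Loop End opcodes and the four-entry stack 96 can be modeled by a small interpreter. This is a sketch under stated assumptions: the loop count is kept in the stack entry rather than in separate counting circuitry, a Loop End branches back to the next-instruction address pushed by the matching Loop Start, a loop count below one is rejected, and each instruction carries a hypothetical `payload` label standing in for its control bits.

```python
CONTINUE, LOOP_START, LOOP_END = 0b00, 0b01, 0b10
STACK_DEPTH = 4  # the stack 96 holds four entries


def run(program):
    """Execute (opcode, loop_cnt, payload) tuples; return payloads in issue order."""
    trace, stack, pc = [], [], 0
    while pc < len(program):
        opcode, loop_cnt, payload = program[pc]
        trace.append(payload)
        pc += 1  # pc now holds the next instruction address
        if opcode == LOOP_START:
            if loop_cnt < 1:
                raise ValueError("loop count must be at least 1 (sketch assumption)")
            if len(stack) == STACK_DEPTH:
                raise OverflowError("more than four nested loops")
            stack.append([loop_cnt, pc])  # push count and next instruction address
        elif opcode == LOOP_END:
            stack[-1][0] -= 1             # decrement the current count
            if stack[-1][0] != 0:
                pc = stack[-1][1]         # branch back to the loop start address
            else:
                stack.pop()               # loop finished; continue in-line
    return trace


# A sixteen-round loop, such as the rounds of a DES-style algorithm.
rounds = run([
    (LOOP_START, 16, "setup"),
    (CONTINUE, 0, "round"),
    (LOOP_END, 0, "end"),
    (CONTINUE, 0, "output"),
])
```

In this model the program needs only four stored instructions to issue the round instruction sixteen times, which is the code-size saving attributed to the stack.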
It should be appreciated at this point that the cryptographic processing engine 10 of the present invention is designed so that the configuration of the coprocessor controller 38 remains the same regardless of the number of coprocessor slices. In addition, the number of instructions that must be generated by the coprocessor controller 38 to perform equivalent functional operations is actually reduced as the number of cryptographic slices is increased. The decrease in the number of required instructions subsequently enables the size of the corresponding coprocessor controller memory 39 to be reduced.
FIGs. 9-12 show additional embodiments of the scalable cryptographic coprocessor of the present invention at 110, 110', 110" and 110'", respectively. The cryptographic coprocessor 110 is a single-slice cryptographic coprocessor with a slice 134a. The cryptographic coprocessor 110' is a double-slice cryptographic coprocessor with slices 134a', 134b'. The cryptographic coprocessor 110" is a triple-slice cryptographic coprocessor with slices 134a"-134c". The cryptographic coprocessor 110'" is a quadruple-slice cryptographic coprocessor with slices 134a'"-134d'". Of the coprocessors shown, the cryptographic coprocessor 110 has the least amount of processing power, and therefore the lowest associated cost and the smallest footprint, while the cryptographic coprocessor 110'" has the most processing power, and therefore the highest associated cost and largest footprint. Additional slices could be added to the embodiments shown to increase processing power if necessary.
The embodiments shown in FIGs. 9-12 therefore show that the scalable architecture of the present invention provides design flexibility that enables a cryptographic engine to be configured to fit within specific application processing and cost parameters. This design flexibility eliminates the need for multiple cryptographic engines to support high performance cryptographic algorithm processing. The single engine architecture of the present invention also enhances overall cryptographic processing performance when compared to conventional cryptographic engine architectures, as it has a target throughput encryption/decryption processing rate of 50 - 200 Mbps for a combination of algorithms and associations.
The above-described scalable programmable cryptographic engine of the present invention is designed to support high performance communications applications such as personal computer cards, network encryption systems, and satellite communications, and can be embedded in applications such as programmable and handheld radios, avionics equipment, network security systems, telephony and numerous other applications requiring secure data transmission and reception capabilities. The engine can ensure data integrity on both personal computers and networks and at the same time maintain interoperability with numerous cryptographic algorithm implementations.
While the above description is of the preferred embodiment of the present invention, it should be appreciated that the invention may be modified, altered, or varied without deviating from the scope and fair meaning of the following claims.

Claims

What is claimed is:
1. A cryptographic engine, comprising: a main processor for performing only asymmetric processing; a control interface for enabling main processor external communication; and a coprocessor subordinate to and separate from the main processor for executing symmetric cryptographic processing algorithms stored therein, the coprocessor having a processing capacity that is scalable based on application-specific parameters.
2. The cryptographic engine of claim 1, wherein the coprocessor includes at least one processing slice.
3. The cryptographic engine of claim 1, wherein the coprocessor includes multiple processing slices each for independently executing a corresponding cryptographic processing algorithm.
4. The cryptographic engine of claim 1, wherein the coprocessor has an aggregate processing throughput in a range of approximately 50 Mbps - 200 Mbps.
5. The cryptographic engine of claim 1, wherein the main processor comprises: a multi-bit microsequencer for managing resources of the main processor; a register for receiving data packet elements to be processed from the microsequencer; an arithmetic logic unit coupled to the register and the microsequencer for receiving the data packet elements from the register, for performing standard Boolean and arithmetic operations on the received data packet elements, and for sending processing status information on the received data packet elements to the microsequencer; and a comparison unit coupled to the microsequencer, the register and to the arithmetic logic unit for enabling the main processor to cyclically compare multi-bit data packet elements from the microsequencer and to generate and transmit to the microsequencer a subsequent number of bit errors to maintain engine processing integrity.
6. The cryptographic engine of claim 1, wherein the coprocessor includes a register bank for holding data packet elements to be processed by the coprocessor, and for holding data packet elements processed by the coprocessor to be returned to the main processor.
7. The cryptographic engine of claim 6, wherein the coprocessor further includes: a microsequencer for managing coprocessor program flow based on the data packet elements to be processed in the register bank, and on the processed data packet elements in the register bank to be returned to the main processor; and a coprocessor slice including an input register for moving data in and out of the register bank and for selecting stored cryptographic function codes in response to instructions received from the main processor.
8. The cryptographic engine of claim 7, wherein the main processor is for controlling operation of the coprocessor by writing to a register R0 in the register bank.
9. The cryptographic engine of claim 7, wherein the coprocessor slice is for processing ten 32-bit input data packet elements to produce four 32-bit output data packet elements during a single clock cycle.
10. The cryptographic engine of claim 9, wherein the coprocessor slice further comprises: a linear permutation section for permuting a first five of the ten 32-bit input data packet elements to create four new 32-bit permuted data packet elements; a non-linear function unit for processing a second five of the ten 32-bit input data packet elements to create a fifth 32-bit permuted data packet element; and a linear logic unit for processing the five 32-bit permuted data packet elements to generate four 32-bit encrypted data packet elements to be output to the main processor.
11. The cryptographic engine of claim 10, wherein the non-linear function unit comprises: a non-linear permutation section for permuting the second five 32-bit data packet elements into sixteen 9-bit data packet elements; sixteen 512 x 1 lookup tables each for mapping a non-linear variable to one of the 9-bit data packet elements input thereto to generate a 16-bit non-linear data packet element; and a delay register for causing the 16-bit non-linear data packet to be concatenated with a previously generated 16-bit non-linear data packet to produce the fifth 32-bit permuted data packet input into the linear logic unit.
12. The cryptographic engine of claim 1, wherein the coprocessor includes a nested loop stack for controlling nested loop operation of the coprocessor in response to receiving nested loop instructions from the main processor.
13. A coprocessor for a cryptographic engine that is controlled by, and separate from, a main processor, the coprocessor comprising: a register bank for receiving and storing data packet elements to be encrypted; a cryptographic processing device coupled to the register bank and having a processing capacity that is scalable based on application-specific parameters; and a coprocessor control device coupled to the register bank and the cryptographic processing device for instructing the cryptographic processing device to perform a cryptographic processing operation unique to the cryptographic processing device.
14. The coprocessor of claim 13, wherein the cryptographic processing device includes a microsequencer for controlling cryptographic processing device operation in response to operating instructions received from the coprocessor control device.
15. The coprocessor of claim 14, wherein the cryptographic processing device includes a permuter for performing both linear and non-linear permutation processing on data packet elements input from the register bank in response to control instructions received from the microsequencer.
16. The coprocessor of claim 15, wherein the permuter includes a plurality of non-linear lookup tables each containing non-linear variables to be mapped to input data and that can be utilized separately or in combination based on cryptographic algorithm processing requirements.
17. The coprocessor of claim 13, wherein the cryptographic processing device comprises a plurality of cryptographic processing devices each coupled to the register bank and the coprocessor control device and each for performing a unique cryptographic processing operation independently from others of the cryptographic processing devices.
18. The coprocessor of claim 17, wherein each of the plurality of cryptographic processing devices includes a microsequencer for controlling cryptographic processing device operation independently from operation of other cryptographic processing device microsequencers and in response to operating instructions received from the coprocessor control device.
19. The coprocessor of claim 17, wherein the cryptographic processing device includes a permuter for performing both linear and non-linear permutation processing on data packet elements input from the register bank.
20. The coprocessor of claim 19, wherein the permuter includes a plurality of non-linear lookup tables each containing non-linear variables to be mapped to input data and that can be utilized separately or in combination based on cryptographic algorithm processing requirements.
21. The coprocessor of claim 13, wherein the coprocessor control device includes a stack for enabling the cryptographic processing device to loop through a corresponding cryptographic processing algorithm a predetermined number of times.
PCT/US2001/009714 2000-03-31 2001-03-27 Scalable cryptographic engine WO2001076129A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU52972/01A AU5297201A (en) 2000-03-31 2001-03-27 Scalable cryptographic engine
GB0129287A GB2367404A (en) 2000-03-31 2001-03-27 Scalable cryptographic engine
CA002375749A CA2375749A1 (en) 2000-03-31 2001-03-27 Scalable cryptographic engine
PL01354956A PL354956A1 (en) 2000-03-31 2001-03-27 Scalable cryptographic engine

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US54002200A 2000-03-31 2000-03-31
US09/540,022 2000-03-31

Publications (2)

Publication Number Publication Date
WO2001076129A2 true WO2001076129A2 (en) 2001-10-11
WO2001076129A3 WO2001076129A3 (en) 2002-05-23

Family

ID=24153653

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/009714 WO2001076129A2 (en) 2000-03-31 2001-03-27 Scalable cryptographic engine

Country Status (5)

Country Link
AU (1) AU5297201A (en)
CA (1) CA2375749A1 (en)
GB (1) GB2367404A (en)
PL (1) PL354956A1 (en)
WO (1) WO2001076129A2 (en)

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7752419B1 (en) 2001-03-22 2010-07-06 Qst Holdings, Llc Method and system for managing hardware resources to implement system functions using an adaptive computing architecture
US9665397B2 (en) 2001-03-22 2017-05-30 Cornami, Inc. Hardware task manager
US9396161B2 (en) 2001-03-22 2016-07-19 Altera Corporation Method and system for managing hardware resources to implement system functions using an adaptive computing architecture
US9037834B2 (en) 2001-03-22 2015-05-19 Altera Corporation Method and system for managing hardware resources to implement system functions using an adaptive computing architecture
US9015352B2 (en) 2001-03-22 2015-04-21 Altera Corporation Adaptable datapath for a digital processing system
US8589660B2 (en) 2001-03-22 2013-11-19 Altera Corporation Method and system for managing hardware resources to implement system functions using an adaptive computing architecture
US8543794B2 (en) 2001-03-22 2013-09-24 Altera Corporation Adaptive integrated circuitry with heterogenous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements
US7809050B2 (en) 2001-05-08 2010-10-05 Qst Holdings, Llc Method and system for reconfigurable channel coding
US8249135B2 (en) 2001-05-08 2012-08-21 Qst Holdings Llc Method and system for reconfigurable channel coding
US7822109B2 (en) 2001-05-08 2010-10-26 Qst Holdings, Llc. Method and system for reconfigurable channel coding
WO2003036508A2 (en) * 2001-10-22 2003-05-01 Sun Microsystems, Inc. Stream processor with cryptographic co-processor
WO2003036508A3 (en) * 2001-10-22 2003-12-18 Sun Microsystems Inc Stream processor with cryptographic co-processor
USRE42743E1 (en) 2001-11-28 2011-09-27 Qst Holdings, Llc System for authorizing functionality in adaptable hardware devices
US9594723B2 (en) 2001-11-30 2017-03-14 Altera Corporation Apparatus, system and method for configuration of adaptive integrated circuitry having fixed, application specific computational elements
US9330058B2 (en) 2001-11-30 2016-05-03 Altera Corporation Apparatus, method, system and executable module for configuration and operation of adaptive integrated circuitry having fixed, application specific computational elements
US8225073B2 (en) 2001-11-30 2012-07-17 Qst Holdings Llc Apparatus, system and method for configuration of adaptive integrated circuitry having heterogeneous computational elements
US8250339B2 (en) 2001-11-30 2012-08-21 Qst Holdings Llc Apparatus, method, system and executable module for configuration and operation of adaptive integrated circuitry having fixed, application specific computational elements
US7668229B2 (en) 2001-12-12 2010-02-23 Qst Holdings, Llc Low I/O bandwidth method and system for implementing detection and identification of scrambling codes
US8442096B2 (en) 2001-12-12 2013-05-14 Qst Holdings Llc Low I/O bandwidth method and system for implementing detection and identification of scrambling codes
US7437569B2 (en) 2001-12-28 2008-10-14 Bull, S.A. Module for secure management of digital date by encryption/decryption and/or signature/verification of signature which can be used for dedicated servers
FR2834361A1 (en) * 2001-12-28 2003-07-04 Bull Sa DATA SECURITY MODULE BY ENCRYPTION / DECRYPTION AND / OR SIGNATURE / VERIFICATION OF SIGNATURE
EP1324175A1 (en) * 2001-12-28 2003-07-02 Bull S.A. Module for securing data by encryption/decryption and/or signature/verification of signature
US9002998B2 (en) 2002-01-04 2015-04-07 Altera Corporation Apparatus and method for adaptive multimedia reception and transmission in communication environments
WO2003077119A1 (en) * 2002-03-05 2003-09-18 Quicksilver Technology, Inc. Hardware implementation of the secure hash standard
US7865847B2 (en) 2002-05-13 2011-01-04 Qst Holdings, Inc. Method and system for creating and programming an adaptive computing engine
US8782196B2 (en) 2002-06-25 2014-07-15 Sviral, Inc. Hardware task manager
US10185502B2 (en) 2002-06-25 2019-01-22 Cornami, Inc. Control node for multi-core system
US7653710B2 (en) 2002-06-25 2010-01-26 Qst Holdings, Llc. Hardware task manager
US8200799B2 (en) 2002-06-25 2012-06-12 Qst Holdings Llc Hardware task manager
US10817184B2 (en) 2002-06-25 2020-10-27 Cornami, Inc. Control node for multi-core system
US8108656B2 (en) 2002-08-29 2012-01-31 Qst Holdings, Llc Task definition for specifying resource requirements
US7937591B1 (en) 2002-10-25 2011-05-03 Qst Holdings, Llc Method and system for providing a device which can be adapted on an ongoing basis
US8706916B2 (en) 2002-10-28 2014-04-22 Altera Corporation Adaptable datapath for a digital processing system
US8380884B2 (en) 2002-10-28 2013-02-19 Altera Corporation Adaptable datapath for a digital processing system
US7904603B2 (en) 2002-10-28 2011-03-08 Qst Holdings, Llc Adaptable datapath for a digital processing system
US8276135B2 (en) 2002-11-07 2012-09-25 Qst Holdings Llc Profiling of software and circuit designs utilizing data operation analyses
US7941614B2 (en) 2002-11-22 2011-05-10 QST, Holdings, Inc External memory controller node
US7937538B2 (en) 2002-11-22 2011-05-03 Qst Holdings, Llc External memory controller node
US7984247B2 (en) 2002-11-22 2011-07-19 Qst Holdings Llc External memory controller node
US7979646B2 (en) 2002-11-22 2011-07-12 Qst Holdings, Inc. External memory controller node
US7937539B2 (en) 2002-11-22 2011-05-03 Qst Holdings, Llc External memory controller node
US7660984B1 (en) 2003-05-13 2010-02-09 Quicksilver Technology Method and system for achieving individualized protected space in an operating system
US8321687B2 (en) 2003-11-28 2012-11-27 Bull S.A.S. High speed cryptographic system with modular architecture
EP2126688A1 (en) * 2006-12-28 2009-12-02 Intel Corporation Method for processing multiple operations
WO2008082843A1 (en) 2006-12-28 2008-07-10 Intel Corporation Method for processing multiple operations
EP2126688A4 (en) * 2006-12-28 2011-06-01 Intel Corp Method for processing multiple operations
WO2009090541A2 (en) * 2008-01-16 2009-07-23 Nokia Corporation Co-processor for stream data processing
WO2009090541A3 (en) * 2008-01-16 2009-10-15 Nokia Corporation Co-processor for stream data processing
US20140006798A1 (en) * 2012-06-29 2014-01-02 Gyan Prakash Device, system, and method for processor-based data protection
US9569633B2 (en) * 2012-06-29 2017-02-14 Intel Corporation Device, system, and method for processor-based data protection
EP3057022A4 (en) * 2014-10-23 2017-05-31 Soongsil University Research Consortium Techno-Park Mobile device and method for operating same
JP2017501478A (en) * 2014-10-23 2017-01-12 Soongsil University Research Consortium Techno-Park Mobile device and method of operating the mobile device
WO2021014125A1 (en) * 2019-03-18 2021-01-28 Pqshield Ltd Cryptographic architecture for cryptographic permutation
GB2601928A (en) * 2019-03-18 2022-06-15 Pqshield Ltd Cryptographic architecture for cryptographic permutation
GB2601928B (en) * 2019-03-18 2023-10-04 Pqshield Ltd Cryptographic architecture for cryptographic permutation
US11822901B2 (en) 2019-03-18 2023-11-21 Pqshield Ltd. Cryptography using a cryptographic state
CN114629665A (en) * 2022-05-16 2022-06-14 百信信息技术有限公司 Hardware platform for trusted computing
CN114629665B (en) * 2022-05-16 2022-07-29 百信信息技术有限公司 Hardware platform for trusted computing

Also Published As

Publication number Publication date
GB0129287D0 (en) 2002-01-23
AU5297201A (en) 2001-10-15
GB2367404A (en) 2002-04-03
CA2375749A1 (en) 2001-10-11
WO2001076129A3 (en) 2002-05-23
PL354956A1 (en) 2004-03-22

Similar Documents

Publication Publication Date Title
WO2001076129A2 (en) Scalable cryptographic engine
CN107465501B (en) Processor and system for Advanced Encryption Standard (AES)
US10103873B2 (en) Power side-channel attack resistant advanced encryption standard accelerator processor
US6952478B2 (en) Method and system for performing permutations using permutation instructions based on modified omega and flip stages
US6922472B2 (en) Method and system for performing permutations using permutation instructions based on butterfly networks
US20040086114A1 (en) System and method for implementing DES permutation functions
EP3839788B1 (en) Bit-length parameterizable cipher
US20070098153A1 (en) Cryptographic processing apparatus
Chaves et al. Reconfigurable memory based AES co-processor
EP1623294B1 (en) Instructions to assist the processing of a cipher message
US20080062803A1 (en) System and method for encrypting data
Aagaard et al. ACE: An authenticated encryption and hash algorithm
US8707051B2 (en) Method and system for embedded high performance reconfigurable firmware cipher
Driessen et al. IPSecco: A lightweight and reconfigurable IPSec core
Chakraborty et al. STES: A stream cipher based low cost scheme for securing stored data
US9065631B2 (en) Integrated cryptographic module providing confidentiality and integrity
US20100228939A1 (en) Parallel Read Functional Unit for Microprocessors
Gilbert et al. Decorrelated Fast Cipher: an AES Candidate
Plos et al. Implementation of symmetric algorithms on a synthesizable 8-bit microcontroller targeting passive RFID tags
US20040086116A1 (en) System and method for implementing DES round functions
Šijačić et al. Hold your breath, PRIMATEs are lightweight
Mucci et al. Implementation of AES/Rijndael on a dynamically reconfigurable architecture
Kivilinna Block ciphers: fast implementations on x86-64 architecture
Mucci et al. Interactive presentation: Implementation of aes/rijndael on a dynamically reconfigurable architecture
Kancharla et al. The Advanced Encryption Standard on the HC 36m Reconfigurable Computer

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

ENP Entry into the national phase in:

Ref country code: CA

Ref document number: 2375749

Kind code of ref document: A

Format of ref document f/p: F

Ref document number: 2375749

Country of ref document: CA

121 EP: The EPO has been informed by WIPO that EP was designated in this application
ENP Entry into the national phase in:

Ref country code: GB

Ref document number: 200129287

Kind code of ref document: A

Format of ref document f/p: F

WWE WIPO information: entry into national phase

Ref document number: 52972/01

Country of ref document: AU

AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase in:

Ref country code: JP