US20060059286A1 - Multi-core debugger - Google Patents
- Publication number
- US20060059286A1 (application number US11/042,476)
- Authority
- US
- United States
- Prior art keywords
- signal
- interrupt
- processor
- core
- processor cores
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/084—Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3632—Software debugging of specific synchronisation aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0813—Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
- G06F12/0835—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means for main memory peripheral accesses (e.g. I/O or DMA)
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0891—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/24—Handling requests for interconnection or transfer for access to input/output bus using interrupt
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G06F9/30043—LOAD or STORE instructions; Clear instruction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/30138—Extension of register space, e.g. register cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0804—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/60—Details of cache memory
- G06F2212/601—Reconfiguration of cache memory
- G06F2212/6012—Reconfiguration of cache memory of operating mode, e.g. cache mode or local memory mode
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/60—Details of cache memory
- G06F2212/6022—Using a prefetch buffer or dedicated prefetch cache
Definitions
- a debugger is a software program used to break (i.e., interrupt) program execution at one or more locations in an application program. Once interrupted, a user is presented with a debugger command prompt for entering debugger commands that will allow for setting breakpoints, displaying or changing memory, single stepping, and so forth.
- processors include onboard features accessible by a debugger to facilitate access to and operation of the processor during debugging.
- Multi-core processors include two or more processor cores that are each capable of simultaneously executing independent programs.
- a multi-core processor includes a global interrupt capability that selectively breaks operation of more than one of the multiple processor cores at substantially the same time, usually within a few clock cycles.
- a global interrupt-signal interconnect is coupled to each of the plurality of independent processor cores.
- Each of the processor cores includes an interrupt-signal sensor for sampling an interrupt signal on the global-signal interconnect and an interrupt-signal generator for selectively providing an interrupt signal.
- Each processor core respectively interrupts its execution of instructions in response to sampling an interrupt signal on the global interrupt-signal interconnect.
- the respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect. Outputs from the respective interrupt-signal generators can be coupled together and further to the global interrupt-signal interconnect in a wired-OR configuration. Thus, each of the processor cores can individually assert an interrupt signal on the same global interrupt-signal interconnect.
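The wired-OR arrangement described above can be modeled in a few lines. This is a minimal behavioral sketch, not the claimed circuit: the `Core` class, its field names, and the `clock_cycle` helper are hypothetical and exist only to show that one shared line lets any core break all of them within the same sampling step.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """One processor core with an interrupt-signal generator and sensor (hypothetical model)."""
    core_id: int
    asserting: bool = False   # generator output for the current cycle
    halted: bool = False      # set once the sensor samples an asserted line

    def sample(self, line_asserted: bool) -> None:
        # The sensor samples the shared line; an asserted line breaks execution.
        if line_asserted:
            self.halted = True


def global_line(cores: list[Core]) -> bool:
    """Wired-OR: the line is asserted if any core's generator drives it."""
    return any(core.asserting for core in cores)


def clock_cycle(cores: list[Core]) -> bool:
    """Evaluate the shared line once, then let every core sample it."""
    asserted = global_line(cores)
    for core in cores:
        core.sample(asserted)
    return asserted


cores = [Core(core_id=i) for i in range(4)]
clock_cycle(cores)            # nobody asserts: all cores keep running
cores[2].asserting = True     # core 2 pulses the shared line
clock_cycle(cores)            # every core samples the pulse in the same cycle
```

In this model the asserting core also samples its own pulse, so it stops together with the others.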
- the multi-core processor can further include an interface adapted to connect to an external device.
- the interface can be defined by a Joint Test Action Group (JTAG) interface.
- more than one global interrupt-signal interconnect is provided.
- each of the global interrupt-signal interconnects can represent a different interrupt signal.
- information that may be relevant to debugging the multi-core processor can be provided by a combination of signals asserted on the multiple interrupt-signal interconnects.
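One way to picture how a combination of signals on several interconnects could carry debugging information is to treat each interconnect as one bit of a mask. The sketch below assumes three lines and an invented event table (`EVENT_NAMES`); neither the line count nor the event meanings come from the text.

```python
def sample_lines(pulses_per_core: list[int], num_lines: int = 3) -> int:
    """OR together the per-core pulse masks; bit i represents interconnect i."""
    mask = 0
    for pulses in pulses_per_core:
        mask |= pulses
    return mask & ((1 << num_lines) - 1)


# Hypothetical assignment of meanings to the individual interconnects.
EVENT_NAMES = {
    0b001: "breakpoint",
    0b010: "watchpoint",
    0b100: "external halt request",
}


def decode(mask: int) -> list[str]:
    """List the events whose interconnect is asserted in the sampled mask."""
    return [name for bit, name in EVENT_NAMES.items() if mask & bit]


# Core 0 pulses line 0 while core 2 pulses line 2 in the same cycle.
sampled = sample_lines([0b001, 0b000, 0b100, 0b000])
events = decode(sampled)
```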
- the plurality of independent processor cores resides on a single semiconductor die.
- the independent processor cores can be Reduced Instruction Set Computer (RISC) processors.
- each of the multiple independent processor cores includes a respective register storing information configurable according to the sampled interrupt signal.
- FIG. 1 is a block diagram of a security appliance including a network services processor according to the principles of the present invention
- FIG. 2 is a block diagram of the network services processor shown in FIG. 1 ;
- FIGS. 3A and 3B are block diagrams illustrating exemplary embodiments of a multi-core debug architecture
- FIG. 4 is a more detailed block diagram of one of the processor cores shown in FIGS. 3A and 3B ;
- FIG. 5 is a schematic block diagram of a debug register
- FIG. 6 is a schematic block diagram of the Multi Core Debug (MCD) register
- FIG. 7A is a schematic diagram of an exemplary Test Access Port (TAP) controller
- FIG. 7B is a more-detailed block diagram illustrating the interconnection of the TAPs among the multiple processor cores shown in FIGS. 3A and 3B ;
- FIG. 8 is a more-detailed block diagram of the debug architecture within one of the processor cores shown in FIGS. 3A and 3B .
- The applications of multi-core processors are as limitless as those of a single microprocessor. Applications that are particularly well suited to multi-core processors include telecommunications and networking. Having multiple processor cores enables a single sizeable task to be broken down into several smaller, more manageable subtasks, each executed on a different processor core. Breaking down large tasks in this way typically simplifies the overall processing of complex, high-speed data manipulations, such as those used in data security.
- a debugging system for multi-core processors is provided to facilitate debugging these parallel applications executing on several independent processor cores. This is accomplished, at least in part, by generating internal trigger events from one or more of the multiple processor cores. These multiple trigger events can be transmitted to an external debug console using a debug interface having relatively few I/O signal lines.
- the debug interface is separate from the processor core's memory interface (e.g., the Dynamic Random Access Memory (DRAM) interface) to avoid interference with the parallel application.
- a separate debug interface also allows a majority of the hardware for the debug interface to remain useable during normal processing of the multi-core processors.
- processor cores are provided within the same central processing unit and are interconnected using cables.
- some of the processor cores can be interconnected in the same socket, e.g., plugged into a common processor socket on a motherboard.
- the multiple processors are provided together on the same semiconductor die.
- a separate, high-speed interrupt-signal interconnect is provided.
- This separate signal interconnect allows for substantially simultaneous interruption of more than one of the multiple processor cores.
- a global signal interconnect is coupled to each of the processor cores.
- Each of the processor cores is configured to selectively provide an interrupt signal, or pulse, on the global signal interconnect.
- each of the processor cores is capable of pulsing the global signal interconnect during any cycle of the processor clock.
- each of the processor cores samples the global signal interconnect to determine whether any processor core has provided an interrupt signal.
- Each of the multiple processor cores is connected to the global signal interconnect, with each core being capable of independently pulsing the signal interconnect. Once pulsed, the processor cores sampling the signal interconnect receive the interrupt substantially simultaneously. Using a logical OR configuration of the contributed pulses from all of the multiple processor cores provides the desired functionality (i.e., the global signal interconnect is asserted if any one of the interconnected processor cores asserts the interconnect).
- Each of the processor cores includes respective debug circuitry with supporting extensions that enable concurrent multi-core debugging.
- the debug circuitry is responsive to the global signal interconnect being asserted.
- a multi-core processor architecture includes multiple global signal interconnects.
- Each of the multiple interconnects is independently configured as the single interconnect described above.
- each of the global signal interconnects can be asserted (i.e., pulsed) by any of the multiple processor cores as described above.
- FIG. 1 is a block diagram of an exemplary security appliance 102 that includes a network services processor 100 according to the principles of the present invention.
- the network services processor 100 is a multi-core processor.
- the security appliance 102 can be a standalone system that switches packets received at one Ethernet port (Gig E) to another Ethernet port (Gig E).
- the security appliance 102 also performs one or more security functions related to the received packets prior to forwarding the packets.
- the security appliance 102 can be used to perform security processing on packets received from a Wide Area Network (WAN) 102 prior to forwarding the processed packets to a Local Area Network (LAN) 103 .
- Exemplary network services processors 100 adapted to perform such security processing can include hardware packet processing, buffering, work scheduling, ordering, synchronization, and coherence support to accelerate packet processing tasks according to the principles of the present invention.
- the network services processor 100 generally processes higher layer protocols.
- the network services processor 100 processes one or more of the Open System Interconnection (OSI) network L2-L7 layer protocols encapsulated in received packets.
- the OSI reference model defines seven network protocol layers: Layers 1-7 (referred to herein as L1-L7).
- the physical layer (L1) represents the actual physical interface, namely the electrical and physical attributes that enable a device to be connected to a transmission medium.
- the data link layer (L2) performs data framing.
- the network layer (L3) formats the data into packets.
- the transport layer (L4) handles end to end transport.
- the session layer (L5) manages communications between devices, for example, whether communication is half-duplex or full-duplex.
- the presentation layer (L6) manages data formatting and presentation, for example, syntax, control codes, special graphics and character sets.
- the application layer (L7) permits communication between users, for example, file transfer and electronic mail.
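The seven layers listed above map naturally onto an enumeration. The sketch below is illustrative only; the `OsiLayer` names and the helper function are assumptions, and the L2-L7 range simply restates which layers the text says the network services processor handles.

```python
from enum import IntEnum


class OsiLayer(IntEnum):
    """The seven OSI layers, numbered as L1-L7 in the text."""
    PHYSICAL = 1
    DATA_LINK = 2
    NETWORK = 3
    TRANSPORT = 4
    SESSION = 5
    PRESENTATION = 6
    APPLICATION = 7


def handled_by_network_services_processor(layer: OsiLayer) -> bool:
    """True for L2-L7, the layers the processor is described as processing."""
    return OsiLayer.DATA_LINK <= layer <= OsiLayer.APPLICATION


handled = [layer.name for layer in OsiLayer
           if handled_by_network_services_processor(layer)]
```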
- the network services processor 100 includes a number of interfaces.
- the network services processor 100 includes a number of Ethernet Media Access Control interfaces with standard Reduced Gigabit Media Independent Interface (RGMII) connections to off-chip destinations using physical interfaces (PHYs) 104 a , 104 b.
- the network services processor 100 receives packets from one or more external destinations at one or more respective Ethernet ports (Gig E) through the physical interfaces (PHYs) 104 a , 104 b .
- the network services processor 100 then selectively performs L2-L7 network protocol processing on the received packets and forwards the processed packets through the physical interfaces 104 a , 104 b .
- the processed packets may be forwarded to another “hop” in the network, to their final destination, or through a local communications bus for further processing by a host processor.
- the local communications bus can be any one of a number of industry standard busses, such as a Peripheral Component Interconnect (PCI) bus 106 or a PCI Extended (PCI-X) bus.
- other PC busses include Industry Standard Architecture (ISA), Extended ISA (EISA), Micro Channel, VL-bus, NuBus, TURBOchannel, VMEbus, MULTIBUS, STD bus, and proprietary busses.
- the network protocol processing can include processing of network security protocols such as Firewall, Application Firewall, Virtual Private Network (VPN) including IP Security (IPSec) and/or Secure Sockets Layer (SSL), Intrusion Detection System (IDS), and Anti-Virus (AV).
- a DRAM controller in the network services processor 100 controls access to an external Dynamic Random Access Memory (DRAM) 108 that is coupled to the network services processor 100 .
- the DRAM 108 stores data packets received from the PHY interfaces 104 a , 104 b or from a local communications bus, such as the PCI-X interface 106 , for processing by the network services processor 100 .
- the DRAM interface supports 64-bit or 128-bit Double Data Rate II Synchronous Dynamic Random Access Memory (DDR II SDRAM) operating at speeds up to and including 800 MHz.
- a boot bus 110 can be provided, such that the necessary boot code is accessible allowing the network services processor 100 to execute the boot code upon power-on and/or reset.
- the boot code is stored in a memory, such as a flash memory 112 .
- Application code can also be loaded into the network services processor 100 over the boot bus 110 .
- application code can be loaded from a device 114 implementing the Compact Flash standard, or from another high-volume device, such as a disk, attached via the PCI bus.
- a miscellaneous I/O interface 116 offers auxiliary interfaces such as General Purpose Input/Output (GPIO), Flash, IEEE 802 two-wire Management Interface (MDIO), Universal Asynchronous Receiver-Transmitters (UARTs), and serial interfaces.
- the network services processor 100 can include another memory controller for controlling Low latency DRAM 118 .
- the low latency DRAM 118 can be used for Internet Services and Security applications, thereby allowing fast lookups, including the string-matching that may be required for Intrusion Detection System (IDS) or Anti Virus (AV) applications.
- FIG. 2 is a more-detailed block diagram of an exemplary network services processor 100 , such as the one shown in FIG. 1 .
- the network services processor 100 can be adapted to deliver high application performance by including multiple processor cores 202 .
- Network operations can be categorized into data plane operations and control plane operations.
- a data plane operation includes packet operations for forwarding packets.
- a control plane operation includes processing of portions of complex higher level protocols such as Internet Protocol Security (IPSec), Transmission Control Protocol (TCP) and Secure Sockets Layer (SSL).
- selected processor cores 202 can be dedicated to performing respective data plane or control plane operations.
- a data plane operation can include processing of other portions of these complex higher level protocols.
- a packet input unit 214 can be used to allocate and create a work queue entry for each packet.
- the work queue entry in turn contains a pointer to a buffered packet temporarily stored in memory, such as Level-2 cache 212 or DRAM 108 ( FIG. 1 ).
- Packet Input/Output processing is performed by a respective interface unit 210 a , 210 b , a packet input unit (Packet Input) 214 , and a packet output unit (PKO) 218 .
- the input controller 214 and interface units 210 a , 210 b can perform parsing of received packets and checking of results to offload the processor cores 202 .
- a packet is received by any one of the interface units 210 a , 210 b (generally 210 ) through a predefined interface, such as a System Packet Interface SPI-4.2 (e.g., SPI-4 phase 2 standard of the Optical Internetworking Forum) or an RGMII interface.
- a packet can also be received by a PCI interface 224 .
- the interface unit 210 a , 210 b handles L2 network protocol pre-processing of the received packet by checking various fields in the L2 network protocol header included in the received packet. After the interface unit 210 has performed L2 network protocol processing, the packet is forwarded to the packet input unit 214 .
- the pre-processed packet can be forwarded over an input/output (I/O) bus, such as I/O bus 225 .
- the packet input unit 214 can be used to perform additional pre-processing, such as pre-processing of L3 and L4 network protocol headers included in the received packet.
- the pre-processing can include checksum checks for Transmission Control Protocol (TCP)/User Datagram Protocol (UDP) (L4 network protocols).
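TCP and UDP checksums are built on the 16-bit ones'-complement sum described in RFC 1071. The sketch below shows only that core arithmetic as a software illustration; it omits the pseudo-header that real TCP/UDP checksums also cover, and the function names are hypothetical rather than anything the packet input unit is stated to expose.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def checksum_is_valid(segment: bytes) -> bool:
    """A segment that carries a correct checksum field sums to zero after complementing."""
    return internet_checksum(segment) == 0


payload = bytes(range(10))
checksum = internet_checksum(payload)
segment = payload + checksum.to_bytes(2, "big")
```

A receiver-side check then reduces to `checksum_is_valid(segment)`; flipping any single bit of `segment` makes it fail.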
- the packet input unit 214 writes packet data into buffers in Level-2 cache 212 or DRAM 108 ( FIG. 1 ) in a format that is convenient to higher-layer software executed in at least one processor core 202 for further processing of higher level network protocols.
- the packet input unit 214 can support a programmable buffer size and can distribute packet data across multiple buffers to support large packet input sizes.
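Distributing a large packet across fixed-size buffers amounts to chunking. The following is a minimal sketch under the assumption that buffers are filled in order and only the last one may be partially used; the function name and the 128-byte size in the example are illustrative, not values from the text.

```python
def split_into_buffers(packet: bytes, buffer_size: int) -> list[bytes]:
    """Spread packet data over as many buffers of buffer_size bytes as needed."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    return [packet[offset:offset + buffer_size]
            for offset in range(0, len(packet), buffer_size)]


# A 300-byte packet with a programmable buffer size of 128 bytes.
buffers = split_into_buffers(bytes(300), buffer_size=128)
```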
- the Packet order/work (POW) module (unit) 228 queues and schedules work (i.e., packet processing operations) for the processor cores 202 .
- Work can be defined to be any task to be performed by a processor core 202 that is identified by an entry on a work queue.
- the task can include packet processing operations, for example, packet processing operations for L4-L7 layers to be performed on a received packet identified by a work queue entry on a work queue.
- Each separate packet processing operation is a piece of the work to be performed by a processor core 202 on the received packet stored in memory.
- the work can be the processing of a received Firewall/Virtual Private Network (VPN) packet.
- processing of a Firewall/VPN packet includes the following separate packet processing operations (i.e., pieces of work): (1) defragmentation to reorder fragments in the received packet; (2) IPSec decryption; (3) IPSec encryption; and (4) Network Address Translation (NAT) or TCP sequence number adjustment prior to forwarding the packet.
- the POW module 228 selects (i.e., schedules) work for a processor core 202 and returns a pointer to the work queue entry that describes the work to the processor core 202 .
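The relationship between work queue entries, buffered packets, and the scheduler can be sketched as a queue of small records. This is a simplified illustration: the class and field names are hypothetical, and first-in-first-out selection is an assumption, since the text does not specify how the POW module chooses the next entry.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkQueueEntry:
    """Describes one piece of work and points at the buffered packet."""
    packet_address: int   # pointer to the packet in Level-2 cache or DRAM
    operation: str        # e.g. "defragmentation" or "IPSec decryption"


class PowModule:
    """Queues work and hands entries to cores (FIFO order is an assumption)."""

    def __init__(self) -> None:
        self._queue: deque[WorkQueueEntry] = deque()

    def add_work(self, entry: WorkQueueEntry) -> None:
        self._queue.append(entry)

    def get_work(self) -> Optional[WorkQueueEntry]:
        """Return the next entry for a requesting core, or None if idle."""
        return self._queue.popleft() if self._queue else None


pow_module = PowModule()
pow_module.add_work(WorkQueueEntry(0x8000, "defragmentation"))
pow_module.add_work(WorkQueueEntry(0x8400, "IPSec decryption"))
first = pow_module.get_work()   # a core asks for work and receives an entry
```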
- Each piece of work i.e., a packet processing operation
- a packet output unit (PKO) 218 reads the packet data from Level-2 cache 212 or DRAM 108 ( FIG. 1 ), performs L4 network protocol post-processing (e.g., generates a TCP/UDP checksum), forwards the packet through the interface unit 210 and frees the Level-2 cache 212 or DRAM 108 locations used to store the packet.
- the network services processor 100 can also include application-specific co-processors that offload the processor cores 202 so that the network services processor 100 achieves high throughput.
- the application specific co-processors can include a DFA co-processor 244 that performs Deterministic Finite Automata (DFA) and a compression/decompression co-processor 208 that performs compression and decompression.
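As background on what deterministic finite automaton matching involves, the sketch below builds a DFA for a single pattern and runs it over a text, visiting each input character exactly once. It is a software illustration using the well-known Knuth-Morris-Pratt automaton construction, chosen here for brevity; the text does not describe how the DFA co-processor 244 represents or traverses its automata.

```python
def build_dfa(pattern: str, alphabet: str) -> list[dict[str, int]]:
    """Build a DFA whose state m = len(pattern) means 'pattern just matched'."""
    m = len(pattern)
    dfa = [{ch: 0 for ch in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    restart = 0  # state reached by feeding pattern[1:j] into the DFA
    for j in range(1, m + 1):
        for ch in alphabet:
            dfa[j][ch] = dfa[restart][ch]      # mismatch: behave like restart state
        if j < m:
            dfa[j][pattern[j]] = j + 1         # match: advance to the next state
            restart = dfa[restart][pattern[j]]
    return dfa


def find_matches(text: str, pattern: str, alphabet: str) -> list[int]:
    """Return the index of the last character of every (overlapping) match."""
    dfa = build_dfa(pattern, alphabet)
    state, ends = 0, []
    for index, ch in enumerate(text):
        state = dfa[state].get(ch, 0)  # characters outside the alphabet reset to 0
        if state == len(pattern):
            ends.append(index)
    return ends


match_ends = find_matches("abababab", "abab", alphabet="ab")
```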
- Other co-processors include a Random Number Generator (RNG) 246 and a timer unit 242 .
- the timer unit 242 is particularly useful for TCP applications.
- Each processor core 202 can include a dual-issue, superscalar processor with a respective instruction cache 206 , a respective Level-1 data cache 204 , and respective built-in hardware acceleration (e.g., a crypto acceleration module) 200 for cryptography algorithms with direct access to low latency memory over the low latency memory bus 230 .
- the low-latency, direct-access path to low-latency memory 118 ( FIG. 1 ) bypasses the Level-2 cache memory 212 and can be directly accessed from both the processor cores 202 and a DFA co-processor 244 .
- the network services processor 100 also includes a memory subsystem.
- the memory subsystem includes the respective Level-1 data cache memory 204 of each of the processor cores 202 , respective instruction cache 206 in each of the processor cores 202 , a Level-2 cache memory 212 , a DRAM controller 216 for external DRAM memory 108 ( FIG. 1 ), and an interface, such as a low-latency bus 230 to external low latency memory (not shown).
- the memory subsystem is configured to support the multiple processor cores 202 and can be tuned to deliver both the high-throughput and the low-latency required by memory-intensive, content-networking applications.
- Level-2 cache memory 212 and external DRAM memory 108 are shared by all of the processor cores 202 and I/O co-processor devices.
- Each of the processor cores 202 can be coupled to the Level-2 cache by a local bus, such as a coherent memory bus 234 .
- the coherent memory bus 234 can represent the communication channel for memory and I/O transactions between the processor cores 202 , an I/O Bridge (IOB) 232 , and the Level-2 cache and controller 212 .
- a Free-Pool Allocator (FPA) 236 maintains pools of pointers to free memory in Level-2 cache memory 212 and DRAM 108 .
- a bandwidth efficient (Last-In-First-Out (LIFO)) stack is implemented for each free pointer pool.
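A LIFO stack of free pointers can be sketched as follows. The `FreePool` class, the base address, and the buffer size are hypothetical values chosen for the example; the point is only that the most recently freed pointer is the next one handed out.

```python
from typing import Optional


class FreePool:
    """A pool of pointers to free, equally sized buffers, kept as a LIFO stack."""

    def __init__(self, base: int, buffer_size: int, count: int) -> None:
        self._free = [base + i * buffer_size for i in range(count)]

    def alloc(self) -> Optional[int]:
        """Pop the most recently freed pointer, or None if the pool is empty."""
        return self._free.pop() if self._free else None

    def free(self, pointer: int) -> None:
        """Push a pointer back; it becomes the next one to be allocated."""
        self._free.append(pointer)


pool = FreePool(base=0x1000, buffer_size=128, count=3)
first = pool.alloc()
pool.free(first)
again = pool.alloc()   # LIFO: the pointer just freed comes straight back
```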
- the I/O Bridge 232 manages the overall protocol and arbitration and provides coherent I/O partitioning.
- the I/O Bridge 232 includes a bridge 238 and a Fetch-and-Add Unit (FAU) 240 .
- the bridge 238 includes buffer queues for storing information to be transferred between the I/O bus 225 , coherent memory bus 234 , the packet input unit 214 and the packet output unit 218 .
- the Fetch-and-Add Unit 240 includes a 2 kilobyte (KB) register file supporting read, write, atomic fetch-and-add, and atomic update operations.
- the Fetch-and-Add Unit 240 can be accessed from both the cores 202 and the packet output unit 218 .
- the registers store highly-used values and thus reduce traffic to access these values. Registers in the Fetch-and-Add Unit 240 are used to maintain lengths of the output queues that are used for forwarding processed packets through the packet output unit 218 .
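The fetch-and-add operation returns a register's previous value and updates it in one indivisible step, which is what makes it suitable for tracking queue lengths shared by several cores. The sketch below models this with a lock; the 8-byte register width (giving 256 registers in 2 KB) and the method names are assumptions, not details from the text.

```python
import threading


class FetchAndAddUnit:
    """Software model of a small register file with atomic fetch-and-add."""

    def __init__(self, size_bytes: int = 2048, register_bytes: int = 8) -> None:
        # Register width is an assumption made for this sketch.
        self._registers = [0] * (size_bytes // register_bytes)
        self._lock = threading.Lock()

    def read(self, index: int) -> int:
        with self._lock:
            return self._registers[index]

    def write(self, index: int, value: int) -> None:
        with self._lock:
            self._registers[index] = value

    def fetch_and_add(self, index: int, delta: int) -> int:
        """Atomically add delta and return the value held before the add."""
        with self._lock:
            previous = self._registers[index]
            self._registers[index] = previous + delta
            return previous


fau = FetchAndAddUnit()
OUTPUT_QUEUE_0 = 0                                  # hypothetical register index
before_enqueue = fau.fetch_and_add(OUTPUT_QUEUE_0, 3)    # three packets queued
before_dequeue = fau.fetch_and_add(OUTPUT_QUEUE_0, -1)   # one packet sent
```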
- the PCI interface controller 224 has a Direct Memory Access (DMA) engine that allows the processor cores 202 to move data asynchronously between local memory in the network services processor 100 and remote (PCI) memory (not shown) in both directions.
- a key memory (KEY) 248 is provided.
- the key memory 248 is a protected memory coupled to the I/O Bus 225 that can be written/read by the processor cores 202 .
- the key memory can include error checking and correction (ECC). The ECC reports single-bit and double-bit errors and repairs single-bit errors.
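The source does not say which code the key memory uses. As one textbook scheme with the stated property (correct any single-bit error, detect any double-bit error), the sketch below uses an extended Hamming (8,4) code; the function names and the 4-bit data width are assumptions chosen to keep the example small.

```python
def encode(nibble):
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 are Hamming positions."""
    code = [0] * 8
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error cannot be repaired."""
    code = list(code)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    parity_error = sum(code) % 2
    if parity_error:
        # Odd number of flipped bits: treat as a single-bit error and repair it.
        # A syndrome of 0 means the overall parity bit itself was flipped.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even parity but a nonzero syndrome: two bits flipped, report only.
        return "double-bit error", None
    else:
        status = "ok"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble


word = encode(0b1011)
single = list(word)
single[5] ^= 1                      # inject one bit error
double = list(word)
double[2] ^= 1                      # inject two bit errors
double[6] ^= 1
results = [decode(word), decode(single), decode(double)]
```

Here `results` holds one clean read, one repaired read, and one reported but unrepaired double-bit error, matching the three cases described for the key memory.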
- the memory is a single-port memory that can be provided with write precedence.
- the key memory 248 can be used to temporarily store Loads, Stores, and I/O pre-fetches.
- A Miscellaneous Input/Output (MIO) unit 226 can also be coupled to the I/O bus 225 to provide interface support for one or more external devices.
- the MIO unit 226 can support one or more interfaces to a Universal Asynchronous Receiver/Transmitter (UART), to a boot bus, to a General Purpose Input/Output (GPIO) interface for communicating with peripheral devices (not shown), and more generally to a Field-Programmable Gate Array (FPGA) for interfacing with external devices.
- an FPGA can be used to interface to external Ternary Content-Addressable Memory (TCAM) hardware providing fast-lookup performance.
- the MIO 226 can provide an interface to an external debugger console described below.
- the processor core 202 supports multiple operational modes including: user mode, kernel mode, and debug mode.
- User mode is most often employed when executing application programs (e.g., the internal flow of program control).
- Kernel mode is typically used for handling exceptions and operating system kernel functions, including management of any related coprocessor and Input/Output (I/O) device access.
- Debug mode is a special operational mode typically used by software developers to examine variables and memory locations, to stop code execution at predefined break points, and to step through the code one line or unit at a time, usually while monitoring variables and memory locations.
- Debug mode is also different from other operational modes in that there are substantially no restrictions on access to coprocessors or memory areas. Additionally, while in Debug mode, the usual exceptions like address error and interrupt are masked.
- a multi-core processor 100 configured for debugging parallel applications executing on more than one independent processor core is shown in FIGS. 3A and 3B .
- the multi-core processor 100 includes three separate global signal interconnects: MCD_ 0 , MCD_ 1 , and MCD_ 2 .
- Each of the three global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 is coupled to each of the multiple processor cores 202 a , 202 b , . . . 202 n (generally 202 ).
- Each processor core 202 includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects.
- each of the processor cores 202 is configured to independently and selectively assert (i.e., pulse) an interrupt on one or more of the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 .
- Each processor core 202 also includes sensing circuitry configured to sample each of the global signal interconnects to determine the presence of an asserted interrupt.
- each of the processor cores 202 independently samples the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 to determine whether an interrupt has been asserted, and on which of the several global signal interconnects the interrupt has been asserted—interrupts can be asserted on more than one of the global signal interconnects at a time.
- the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 are preferably sampled continuously, or at least once during each clock cycle to determine the presence of an interrupt.
- the sensing circuitry can include a register into which the state of the global signal interconnect is latched.
- a register is configured to store one bit for each of the multiple global signal interconnects, the value of the stored bit indicative of the state of the respective global signal interconnect.
- Having more than one global signal interconnect MCD_ 0 , MCD_ 1 , and MCD_ 2 coupled to each of the processor cores 202 can provide additional information.
- each of the three wires is capable of being independently pulsed between two states (e.g., a logical Low or “0” and a logical High or “1”).
- the global signal interconnects can be used to communicate with the processor cores 202 once interrupted.
- An external debug console 325 hosting a debugger application and providing a user interface can be interconnected to one or more of the processor cores 202 to facilitate debugging of the system 100 .
- the global signal interconnects are accessible by the debugger.
- a debugger can assert a pulse on MCD_ 1 to instruct the processor cores 202 to check their mailbox location (e.g., in main memory) for an instruction from the debugger.
- the debugger can assert a pulse on MCD_ 2 to restart all processor cores 202 after a multi-core interrupt.
- usage of the global signal interconnects can minimize disruption of the state contained in the processor cores 202 and in the system 100 , while the debugger examines it. This capability can be very useful to isolate the cause of bugs in parallel applications.
- the processor cores 202 are each coupled to the one or more global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 in a respective “wired-OR” fashion.
- respective interrupt-signal generators of each of the processor cores 202 are all interconnected at a first wired-OR 310 a , further connected to the first global signal interconnect MCD_ 0 .
- Second and third wired-ORs 310 b , 310 c are provided to similarly interconnect the processor cores 202 to the second and third global signal interconnects MCD_ 1 and MCD_ 2 , respectively.
- When any one of the processor cores 202 asserts a pulse on any of its respective interrupt-signal generator outputs (e.g., to wired-OR 310 a ), the pulse will be asserted on the respective global signal interconnect (e.g., MCD_ 0 ). Preferably, a pulse can be asserted during any cycle.
- each of the processor cores 202 can be interconnected to the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 using combinational logic, such as a logical OR gate.
- Such logic represents additional complexity generally resulting in a corresponding delay (e.g., a gate delay due to synchronous logic, and/or a rise time delay due to the capacitance of the logic circuitry).
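Whether the cores are joined by a wired-OR or by an OR gate, the logical result on each interconnect is the OR of every core's output, and each core then latches one bit per interconnect. A minimal sketch of that behavior follows; the function names and the encoding of the three wires as a 3-bit mask (bit 0 for MCD_0, bit 1 for MCD_1, bit 2 for MCD_2) are assumptions for illustration.

```python
NUM_WIRES = 3  # MCD_0, MCD_1, MCD_2


def wired_or(core_outputs):
    """Combine per-core 3-bit output masks into the state of the three wires."""
    state = 0
    for mask in core_outputs:
        state |= mask
    return state


def latch(wire_state):
    """Model a core's sensing register: one stored bit per interconnect."""
    return [(wire_state >> wire) & 1 for wire in range(NUM_WIRES)]


# Core 0 pulses MCD_0, core 1 is idle, core 2 pulses MCD_2 in the same cycle.
wires = wired_or([0b001, 0b000, 0b100])
sampled = latch(wires)  # every core latches the same three bits
```

Because the combination is a plain OR, pulses from different cores on different wires coexist, which is why a core can see interrupts on more than one interconnect at a time.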
- Each processor core 202 provides an exception handler.
- an exception refers to an error or other special condition detected during normal program execution.
- the exception handler can interrupt the normal flow of program control in response to receiving an exception. For example, a debug exception handler halts normal operation in response to receiving a debug interrupt. The exception handler then passes control to a debug handler, or software program, that controls operation in debug mode.
- Some exemplary exception types include a Debug Single Step (DSS) exception resulting in single step execution in debug mode.
- a general Debug Interrupt (DINT) results in entry of debug mode and can be caused by the assertion of an external interrupt (e.g., EJ_DINT), or by setting a related bit in a debug register. An interrupt can result from assertion of an unmasked hardware or software interrupt signal.
- DIB debug hardware instruction break matched
- a debug breakpoint instruction results in entry of debug mode upon execution of a special instruction (e.g., a software debug breakpoint instruction, such as the EJTAG “SDBBP” instruction that places a processor into debug mode and fetches associated handler code from memory).
- a Data Address Break address only
- Data Value e.g., DDBL/DDBS
- Each of the processor cores 202 includes respective onboard debug circuitry 318 .
- each of the multiple processor cores 202 can include a respective core Test Access Port (TAP) 320 ′, 320 ′′, 320 ′′′ (generally 320 ) for accessing the respective debug circuitry 318 .
- the core TAPs 320 are connected to one system TAP 330 .
- each of the respective core TAPs 320 and the system TAP 330 can be interconnected in a daisy chain configuration.
- the debug circuitry 318 of all of the interconnected processor cores 202 can be coupled to the external debug console 325 .
- the debug control console can be used to inspect the values stored in registers and memory locations.
- the debug control console provides a software program that communicates with the onboard debug circuitry 318 to accomplish inspection of stored values, setting of breakpoints, stopping, restarting and sequentially stepping each of the processor cores 202 in unison.
- each of the processor cores 202 can be coupled to the external debug console 325 through one or more Universal Asynchronous Receiver-Transmitter (UART) devices that include receiving and transmitting circuits for asynchronous serial communications, as shown in FIG. 3B .
- the multi-core processor 100 includes two UART devices 335 a , 335 b (generally 335 ) used to control serial data transmission and reception between the processor cores 202 and external devices, such as the external debug console 325 .
- the UART devices 335 can be included within the Miscellaneous I/O unit ( FIG. 2 ).
- each processor core 202 can communicate with another device, such as the external debug console 325 , through a respective memory bus interface 340 using one or more of the UART devices 335 accessible through the I/O bridge 238 .
- communicating with the external debug console 325 using the UART device 335 removes constraints that would have otherwise been imposed by using a standard interface, such as the JTAG TAP interface ( FIG. 3A ).
- the multi-core processor 100 optionally includes a trace buffer 610 (shown in phantom) for selectively monitoring memory transactions of the processor cores 202 .
- the trace buffer 610 is coupled to the coherent memory bus 234 to monitor transactions thereon.
- the trace buffer 610 stores information that can be used to assist in any debugging activity.
- the trace buffer 610 can be configured to store the last “N” transactions on the bus, the N+1 st transaction being dumped as a new transaction occurs.
- identification tags can be used to identify the particular core processor 202 associated with each stored transaction.
- the trace buffer 610 is also coupled to each of the one or more global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 , and configured with sensing circuitry sampling any pulses asserted on the global signal interconnects.
- the trace buffer 610 also includes a trigger that initiates the starting and/or stopping of monitoring in response to sampling an interrupt signal on the global signal interconnect.
- Although a single trace buffer 610 supporting multiple core processors 202 is illustrated, other configurations are possible.
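A buffer that keeps only the last N bus transactions, tags each with its originating core, and stops recording when a pulse is sampled can be sketched as below. The class `TraceBuffer`, its method names, and the choice to stop (rather than start) on a pulse are assumptions for illustration only.

```python
from collections import deque


class TraceBuffer:
    """Hypothetical model: keep the last `depth` transactions, tagged by core."""

    def __init__(self, depth):
        self._entries = deque(maxlen=depth)  # the oldest entry is dropped when full
        self.recording = True

    def record(self, core_id, transaction):
        if self.recording:
            self._entries.append((core_id, transaction))

    def on_global_pulse(self):
        """Trigger: freeze the buffer when a pulse is sampled on an MCD wire."""
        self.recording = False

    def dump(self):
        return list(self._entries)


trace = TraceBuffer(depth=3)
for n in range(5):
    trace.record(core_id=n % 2, transaction=f"store #{n}")
trace.on_global_pulse()
trace.record(core_id=0, transaction="store after trigger")  # ignored
history = trace.dump()
```

Freezing on the pulse preserves the transactions that led up to the interrupt, which is typically what a debugger wants to inspect.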
- multiple trace buffers 610 can be provided with each trace buffer 610 respectively corresponding to one of the multiple core processors 202 .
- the trace buffer 610 can be on-chip, as shown, or off-chip and accessible by a probe.
- the trace buffer 610 includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 .
- the trace buffer 610 can be coupled to the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 through the wired-OR circuits 310 .
- the trace buffer 610 can selectively assert a global interrupt signal on one or more of the global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 , thereby interrupting more than one of the multiple processor cores 202 in response to activity on the coherent memory bus 234 .
- FIG. 4 is a more detailed block diagram of an exemplary processor core 202 shown in FIGS. 3A and 3B .
- a processor core 202 interprets and executes instructions.
- the processor core 202 is a Reduced Instruction Set Computing (RISC) processor core 202 .
- the processor core 202 includes an execution unit 400 , an instruction dispatch unit 402 , an instruction fetch unit 404 , a load/store unit 416 , a Memory Management Unit 406 , a system interface 408 , a write buffer 420 and security accelerators 200 .
- the processor core 202 also includes debug circuitry 318 allowing debug operations to be performed.
- the system interface 408 controls access to external memory, that is, memory external to the processor core 202 , such as the L2 cache memory described in relation to FIG. 2 .
- the execution unit 400 includes a multiply/divide unit 412 and at least one register file 414 .
- the multiply/divide unit 412 has a 64-bit register-direct multiply.
- the instruction fetch unit 404 includes Instruction Cache (ICache) 206 .
- the load/store unit 416 includes Data Cache (DCache) 204 .
- a portion of the data cache 204 can be reserved as local scratch pad/local memory 422 .
- the instruction cache 206 is 32 Kilobytes
- the data cache 204 is 8 Kilobytes
- the write buffer 420 is 2 Kilobytes.
- the memory management unit 406 includes a Translation Lookaside Buffer (TLB) 410 .
- the processor core 202 includes a crypto acceleration module (security accelerators) 200 that includes cryptography acceleration.
- the cryptography acceleration can include one or more of Triple Data Encryption Standard (3DES), Advanced Encryption Standard (AES), Secure Hash Algorithm (SHA-1), and Message Digest Algorithm #5 (MD5).
- the crypto acceleration module 200 communicates by moves to and from the main register file 414 in the execution unit 400 .
- Particular algorithms such as Rivest, Shamir, Adleman (RSA) and Diffie-Hellman (DH) can be implemented and are performed in the multiply/divide unit 412 .
- the multi-core processor 100 includes a superscalar processor.
- a superscalar processor includes a superscalar instruction pipeline that allows more than one instruction to be completed each cycle of the processor's clock period by allowing multiple instructions to be issued simultaneously and dispatched in parallel to multiple execution units 400 .
- the RISC-type processor core 202 has an instruction set architecture that defines instructions by which the programmer interfaces with the RISC-type processor 202 .
- the superscalar RISC-type core is an extension of the MIPS64 version 2 core. Only load-and-store instructions access external memory; that is, memory external to the processor core 202 . In one embodiment, the external memory is accessed over a coherent memory bus 234 ( FIG. 2 ). All other instructions operate on data stored in the register file 414 within the execution unit 400 of the processor core 202 .
- the superscalar processor can be a dual-issue processor.
- the instruction pipeline is divided into stages, each stage taking one clock cycle to complete. Thus, in a five stage pipeline, it takes five clock cycles to process each instruction and five instructions can be processed concurrently with each instruction being processed by a different stage of the pipeline in any given clock cycle.
- a five stage pipeline includes the following stages: fetch, decode, execute, memory and write back.
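The timing of the five-stage pipeline can be checked with a short calculation. The sketch assumes an idealized single-issue pipeline with no stalls, in which instruction i enters the fetch stage in cycle i; the function names are hypothetical.

```python
STAGES = ["fetch", "decode", "execute", "memory", "write back"]


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed to complete a run of instructions with no stalls."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def stage_of(instruction_index, cycle):
    """Stage occupied by an instruction in a given cycle, or None if not in flight."""
    offset = cycle - instruction_index
    return STAGES[offset] if 0 <= offset < len(STAGES) else None


one_instruction = total_cycles(1)    # a single instruction takes five cycles
five_instructions = total_cycles(5)  # five overlapped instructions take nine
in_flight = [stage_of(i, 4) for i in range(5)]  # all five stages busy in cycle 4
```

In cycle 4 each of the five instructions occupies a different stage, which is the concurrency the text describes.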
- the instruction fetch unit 404 fetches an instruction from instruction cache 206 at a location in instruction cache 206 identified by a memory address stored in a program counter.
- the instruction fetched in the fetch-stage is decoded by the instruction dispatch unit 402 and the address of the next instruction to be fetched for the issuing context is computed.
- the execution unit 400 performs an operation dependent on the type of instruction. For example, the execution unit 400 begins the arithmetic or logical operation for a register-to-register instruction, calculates the virtual address for a load or store operation, or determines whether the branch condition is true for a branch instruction.
- In the memory stage, data is aligned by the load/store unit 416 and transferred to its destination in external memory.
- In the write back-stage, the result of a register-to-register or load instruction is written back to the register file 414 .
- the system interface 408 is coupled via the coherent memory bus 234 ( FIG. 2 ) to external memory.
- the coherent memory bus 234 has 384 bits and includes four separate buses: (i) an Address/Command Bus; (ii) a Store Data Bus; (iii) a Commit/Response control bus; and (iv) a Fill Data bus. All store data is sent to external memory over the coherent memory bus 234 via a write buffer entry in the write buffer 420 .
- the write buffer 420 has 16 write buffer entries.
- Store data flows from the load/store unit 416 to the write buffer 420 , and from the write buffer 420 through the system interface 408 to external memory.
- the processor core 202 can generate data to be stored in external memory faster than the system interface 408 can write the store data to the external memory.
- the write buffer 420 minimizes pipeline stalls by providing a resource for storing data prior to forwarding the data to external memory.
- the write buffer 420 is also used to aggregate data to be stored in external memory over the coherent memory bus 234 into aligned cache blocks to maximize the rate at which the data can be written to the external memory. Furthermore, the write buffer 420 can also merge multiple stores to the same location in external memory, resulting in a single write operation to external memory. The write-merging operation of the write buffer 420 can result in the order of writes to the external memory being different from the order of execution of the store instructions.
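Write merging, and the reordering it can cause, can be illustrated with a small model. The class `WriteBuffer`, the 128-byte block size, and the unbounded number of entries are assumptions; the write buffer described here has 16 entries, and the source does not state a block size.

```python
CACHE_BLOCK_BYTES = 128  # assumed block size, not taken from the source


class WriteBuffer:
    """Hypothetical model: gather stores per aligned block and merge overwrites."""

    def __init__(self):
        self._blocks = {}  # block base address -> {byte offset: value}

    def store(self, address, data):
        for i, value in enumerate(data):
            base = (address + i) // CACHE_BLOCK_BYTES * CACHE_BLOCK_BYTES
            # A later store to the same byte replaces the earlier one in place.
            self._blocks.setdefault(base, {})[address + i - base] = value

    def drain(self):
        """Emit one write per buffered block, in the order blocks were first touched."""
        writes = [(base, dict(block)) for base, block in self._blocks.items()]
        self._blocks.clear()
        return writes


buf = WriteBuffer()
buf.store(0x100, b"\x01")   # first store
buf.store(0x200, b"\x02")   # second store
buf.store(0x100, b"\x03")   # third store, merged into the first entry
writes = buf.drain()
```

Three stores become two writes, and the value from the third store reaches memory ahead of the second store, so the write order differs from program order.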
- the processor core 202 also includes an exception control system providing circuitry for identifying and managing exceptions.
- An exception refers to an interruption or change of the normal flow of program control that occurs when an event or other special condition is detected during execution. Exceptions can be caused by a variety of sources, including boundary cases in data, external events, or even program errors, being generated (i.e., “raised”) by hardware or software. Exemplary hardware exceptions include resets, interrupts and signals from a memory management unit. Hardware exceptions may be generated by an arithmetic logic unit or floating-point unit for numerical errors such as divide by zero, overflow or underflow or instruction decoding errors such as privileged, reserved, trap or undefined instructions. Software exceptions are even more varied. For example, a software exception can refer to any kind of error checking that alters the normal behavior of the program. An exception transfers control from code being executed at the instant of the exception to different code, a routine commonly referred to as an exception handler.
- a system co-processor can also be provided within the processor core 202 for providing a diagnostic capability, for controlling the operating mode (i.e., kernel, user, and debug), for configuring interrupts as enabled or disabled, and for storing other configuration information.
- the processor core 202 also includes a Memory Management Unit (MMU) 406 coupled to the instruction fetch unit 404 and the load/store unit 416 .
- the MMU 406 is a hardware device or circuit that supports virtual memory and paging by translating virtual addresses into physical addresses. Thus, the MMU 406 may receive a virtual memory address from program instructions being executed on the processor core 202 .
- the virtual memory address is associated with a read from or a write to physical memory.
- the MMU 406 translates the virtual address to a physical address to allow a related physical memory location to be accessed by the program.
- the MMU 406 can include a Translation Lookaside Buffer (TLB) 410 .
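Address translation with a TLB in front of a page table can be sketched as follows. The 4 KB page size, the dictionary-based page table, and the class and counter names are assumptions for illustration; the source does not specify page sizes or how TLB misses are serviced.

```python
PAGE_SIZE = 4096  # assumed page size


class TLB:
    """Hypothetical model: cache of virtual-page to physical-frame mappings."""

    def __init__(self, page_table):
        self._page_table = page_table  # virtual page number -> physical frame number
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        if vpn in self._cache:
            self.hits += 1
        else:
            self.misses += 1
            if vpn not in self._page_table:
                raise LookupError(f"no mapping for virtual page {vpn:#x}")
            self._cache[vpn] = self._page_table[vpn]
        # The page offset passes through translation unchanged.
        return self._cache[vpn] * PAGE_SIZE + offset


tlb = TLB(page_table={0x12: 0x80})
physical_first = tlb.translate(0x12345)   # miss: mapping is loaded into the TLB
physical_second = tlb.translate(0x12FFF)  # hit: same page, different offset
```

Only the page number is translated; the second access reuses the cached mapping, which is the saving a TLB provides.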
- the debug circuitry 318 on each processor core 202 can include an onboard debug controller. Having an onboard debug controller facilitates operation of the processor core 202 in the debug mode.
- the debug controller can allow for single-step execution of the processor core 202 .
- the debug controller can support breakpoints, enabling them to transition the processor core 202 into debug mode.
- the breakpoints can be one or more of instruction breakpoints, data breakpoints, and virtual address breakpoints.
- the onboard debug circuitry 318 includes standardized features.
- the onboard debug circuitry 318 can be compliant with the design philosophy of the Joint Test Action Group (JTAG) interface—a popular standardized interface defined by IEEE Standard 1149.1.
- the onboard controller referred to is the standard MIPS Enhanced JTAG (EJTAG) debug circuitry 318 .
- Each processor core 202 includes one or more debug registers, each register including one or more pre-defined fields for storing information (e.g., state bits) related to different aspects of debug mode operation.
- the debug registers 425 can be located in the instruction fetch unit 404 .
- one of the debug registers 425 is a Debug register 500 .
- the Debug register 500 is illustrated in more detail in FIG. 5 .
- the Debug register 500 includes a DM state bit indicative of whether the processor core 202 is operating in debug mode. Other bits include a DBD state bit indicative of whether the last debug exception or exception in Debug Mode occurred in a branch or jump delay slot.
- a DDBSImpr bit is indicative of an imprecise debug data break store.
- a DDBLImpr bit is indicative of an imprecise debug data break load. This bit can be implemented for load value breakpoints.
- a DExcCode bit is set to one when Debug[DExcCode] is valid and should be
- Another one of the debug registers 425 is a Multi-Core Debug (MCD) register 600 , shown in FIG. 6 .
- the MCD register 600 includes dedicated multi-core debug state positions 615 , one position being provided for each of the respective global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 .
- the MCD register 600 includes dedicated mask-disable state positions 605 , one position being provided for each of the respective global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 .
- the mask-disable bits disable the effect of sampling a pulse on the corresponding global signal interconnect.
- the MCD register 600 also includes respective software-control bit locations 610 for each of the several global MCD wires.
- the three software-control bit locations 610 , referred to as Pls 0 , Pls 1 , and Pls 2 , are reserved. These software-control bit locations 610 correspond to the three global signal interconnects MCD_ 0 , MCD_ 1 , and MCD_ 2 , respectively.
- bits written by software into the software control bit locations 610 can be used to pulse any combination of the three global MCD wires.
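The three groups of fields in the MCD register can be pictured as three 3-bit fields, one bit per wire. The source names the fields but does not give bit positions, so the offsets below are purely hypothetical, as are the helper names.

```python
# Hypothetical field offsets: the source does not specify bit positions.
STATE_SHIFT = 0         # MCD0..MCD2 state bits
MASK_DISABLE_SHIFT = 3  # mask-disable bits
PULSE_SHIFT = 6         # Pls0..Pls2 software-control bits
FIELD_MASK = 0b111      # one bit per global signal interconnect


def get_field(register, shift):
    """Extract one 3-bit field from the register value."""
    return (register >> shift) & FIELD_MASK


def request_pulse(register, wires):
    """Set the software-control bit for each requested wire index (0, 1, or 2)."""
    for wire in wires:
        if not 0 <= wire <= 2:
            raise ValueError(f"no such interconnect: MCD_{wire}")
        register |= 1 << (PULSE_SHIFT + wire)
    return register


mcd = request_pulse(0, [0, 2])            # ask for pulses on MCD_0 and MCD_2
pulse_bits = get_field(mcd, PULSE_SHIFT)  # 0b101
```

Any combination of the three wires can be requested with a single register write, since each wire has its own bit.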
- the debug registers 425 include a DEPC register for imprecise debug exceptions and imprecise exceptions in Debug Mode. Imprecise debug data breakpoints are provided for load value compare; otherwise debug data breakpoints are precise.
- the DEPC register contains an address at which execution should be resumed when returning to Non-Debug Mode.
- Exception handlers can be entered for debug processing in a number of ways.
- software such as the processor core instruction set and/or the debugger can include a breakpoint instruction.
- When the breakpoint instruction is executed by the execution unit 400 , it causes a specific exception.
- a set of trap instructions can be provided.
- a pair of optional Watch registers can be programmed to cause a specific exception on a load, store, or instruction fetch access to a specific word (e.g., a 64-bit double word) in virtual memory.
- an optional TLB-based MMU 406 can be programmed to “trap,” or otherwise interrupt program execution on any access, or more specifically, on any store to a page of memory. These exceptions generally refer to interrupting operation on any one of the processor cores 202 . To interrupt the other processor cores 202 , a pulse must be asserted on one or more of the global signal interconnects MCD_ 0 , MCD_ 1 , MCD_ 2 .
- the corresponding signal value can be a high state, or logical one.
- the respective instruction fetch unit 404 of each of the interconnected processor cores 202 samples the logical one on the global signal interconnect.
- the instruction fetch unit 404 sets an internal state bit corresponding to the sampled pulse.
- the internal state bit, or MCD state bit can be dedicated multi-core debug state positions 615 in the multi-core debug register 600 (i.e., Multi-Core Debug[MCD 0 , MCD 1 , MCD 2 ]).
- the onboard debug circuitry 318 requests a debug exception on its respective processor core 202 .
- With all of the multiple processor cores 202 sampling the same pulse and setting their respective bits 615 at substantially the same time, all of the unmasked processor cores 202 are interrupted at substantially the same time. Preferably, this occurs during the same cycle, but it can also occur within a few clock cycles.
- Software can later clear Multi-Core Debug[MCD 0 , MCD 1 , MCD 2 ] bits by overwriting them (e.g., writing a one to them). Such a provision ensures that no further debug interrupts occur after exiting the debug handler.
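The sample, set, interrupt, and clear sequence can be modeled per core as below. The class `CoreMcdLogic` and its method names are hypothetical, and the sketch assumes that a masked wire neither sets the state bit nor requests an exception; the source says only that the mask-disable bits disable the effect of sampling a pulse.

```python
class CoreMcdLogic:
    """Hypothetical per-core model of MCD pulse sampling and state bits."""

    def __init__(self, mask_disable=0):
        self.state = 0                    # MCD0..MCD2 state bits
        self.mask_disable = mask_disable  # a set bit ignores that wire (assumption)

    def sample(self, wire_state):
        """Latch unmasked pulses; return True if a debug exception is requested."""
        effective = wire_state & ~self.mask_disable & 0b111
        self.state |= effective
        return effective != 0

    def write_one_to_clear(self, value):
        """Clear each state bit for which the written value holds a one."""
        self.state &= ~value & 0b111


cores = [CoreMcdLogic(), CoreMcdLogic(mask_disable=0b001), CoreMcdLogic()]
interrupted = [core.sample(0b001) for core in cores]  # one pulse on MCD_0
cores[0].write_one_to_clear(0b001)                    # debug handler clears its bit
```

All unmasked cores see the same pulse in the same sampling step, and clearing the bit in the handler keeps the stale state from raising another debug interrupt on exit.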
- interrupts can be assigned different priority values to ensure the desired results in situations in which more than one type of interrupt occurs.
- the MCD interrupts can occur at the same priority level as standard debug interrupts provided within the debug circuitry 318 of each of the processor cores 202 .
- the exception location can also be the same as a debug interrupt, with the multi-core debug bits 615 being similar to the DINT bit of the debug register shown in FIG. 5 .
- the DINT bit is read-only, whereas Multi-Core Debug[MCD 0 , MCD 1 , MCD 2 ] bits can be written to, allowing the bits to be cleared by the debug handler. Further, the DINT is cleared when Multi-Core Debug[DExcC] is set, whereas the multi-core debug state bits 615 need not be.
- the global signal interconnects MCD_ 0 , MCD_ 1 , MCD_ 2 can be pulsed.
- software can cause initiation of a pulse on the global MCD wires.
- debugger software running on a processor core 202 can write one or more values (e.g., logical “1”s) to any combination of the software-control state bits 610 of the MCD register 600 .
- the processor core 202 interprets it as an instruction to assert an interrupt signal, or pulse, on the corresponding global signal interconnects.
- the global signal interconnects can also be pulsed by execution of a special instruction.
- execution of a software breakpoint instruction, such as the SDBBP instruction, on any one of the processor cores 202 results in that core 202 asserting a pulse on the MCD_ 0 global signal interconnect.
- Whether a pulse is actually asserted by a processor core 202 in response to the breakpoint instruction can be further controlled by a global-signal debug bit 618 in the MCD register 600 .
- a pulse is only asserted in response to the breakpoint instruction when the MCD[GSDB] bit 618 is set.
- the initiation of a pulse on the global signal interconnects can result if one or more bits within a particular register are set and a breakpoint match occurs.
- the hardware (e.g., the debug circuitry 318 ) pulses one of the global MCD wires (e.g., the MCD_ 0 wire).
- An Instruction Breakpoint Control-n register (IBCn, “n” being a numbered reference to a particular instruction breakpoint) stores a value responsive to a match of an executed breakpoint instruction.
- a Data Breakpoint Control-n (DBCn) stores a value responsive to a match of a data transaction.
- the registers IBCn and DBCn generally include special bits (e.g., BE, TE) that can be used to enable the respective breakpoints.
- the TAP controller 700 includes one or more registers 705 for storing instruction, data, and control information relating to the TAP interface 320 .
- the registers 705 allow a user to perform a set up for the onboard debug circuitry 318 , and provide important status information during a debug session.
- the size of the registers 705 depends on the specific implementations, but usually they are at least 32 bits.
- the registers 705 receive information from an external source using the Test Data Input (TDI) input (i.e., pin). The registers also provide information to an external source using the Test Data Output (TDO) output (i.e., pin). Operation of the interface is provided by a TAP controller state machine 710 .
- the TAP controller 700 uses a communications channel, such as a serial communications channel that operates according to a clock signal received on the Test Clock (TCK) input (i.e., pin).
- operation of the state machine also relies on the received clock.
- a JTAG interface, referred to as a Test Access Port (TAP) 320 ′, 320 ′′, 320 ′′′ (generally 320 ), includes at least four signal lines: Test Clock (TCK); Test Mode Select (TMS); Test Data In (TDI); and Test Data Out (TDO).
- the interface can also include one or more power and ground signal lines (not shown).
- the JTAG interface is a serial interface that is capable of transferring data according to a clock signal received on the TCK signal line. Operating frequency varies per chip, but is typically defined by a clock signal having a frequency between about 10 MHz and about 100 MHz (i.e., from about 100 nanoseconds to about 10 nanoseconds per bit time).
- Configuration of each of the respective debug circuitry 318 can be performed by manipulating an internal state machine.
- a debug controller state machine within the debug circuitry 318 can be externally manipulated one bit at a time via the TMS signal line of the TAP 330 . Data can then be transferred in and out, one bit at a time, during each TCK clock cycle. The data can be received via the TDI signal line, and transmitted out via the TDO signal line, respectively.
- Different instruction modes can be loaded into the debug controller 318 to read the core identification (ID), to sample input, to drive (and/or float) output, to manipulate functions, and/or to bypass (pipe TDI to TDO to logically shorten chains of multiple chips).
- the respective TAP 320 of each of the multiple processor cores 202 a , 202 b . . . 202 n (generally 202 ) is coupled to the respective TAP 320 of the other multiple processor cores 202 in a serial, or “daisy chain” configuration.
- the TCK signal of the first TAP 320 ′ is serially interconnected to the corresponding TCK signal lines of all of the other TAPs 320 .
- the interconnected TCK signal lines are further connected to a corresponding TCK signal line of a system TAP 330 .
- the system TAP 330 is interconnected to one end of the chain of interconnected processor cores 202 (i.e., processor core 202 n or processor core 202 a , as shown), that end processor core 202 being referred to as a “master” processor core 202 a .
- the remaining TAP signal lines are generally interconnected in a similar manner, being further connected from the master processor core 202 a to corresponding TAP signal lines on the system TAP 330 . Interconnection of the TDI and TDO signal lines, however, is different, as described in more detail below.
- the TDI signal line of the master processor core 202 a connects to the corresponding TDI signal line of the system TAP 330 , the master processor core 202 a receiving data from an external source.
- the TDO signal line of the master processor core 202 a is connected to the TDI signal line of an adjacent processor core 202 b .
- Additional processor cores 202 are connected in a similar manner, the TDO signal line of one processor core 202 being interconnected to the TDI signal line of the next processor core 202 , until the TDO signal line of the last processor core 202 n in the chain is interconnected to the TDO signal line of the system TAP 330 .
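The daisy chain can be modeled as a series of one-bit shift stages, with each TAP's TDO feeding the TDI of the next TAP. The sketch assumes every core TAP is in a bypass mode that places a single-bit register between TDI and TDO, and it ignores TMS and clock-edge details; the class and function names are hypothetical.

```python
class BypassTap:
    """Hypothetical core TAP in bypass: a one-bit register between TDI and TDO."""

    def __init__(self):
        self.bit = 0

    def clock(self, tdi):
        """One TCK cycle: present the stored bit on TDO and capture TDI."""
        tdo, self.bit = self.bit, tdi
        return tdo


def clock_chain(taps, tdi):
    """Clock the whole chain once; each TAP's TDO drives the next TAP's TDI."""
    for tap in taps:
        tdi = tap.clock(tdi)
    return tdi  # TDO of the last TAP, returned to the system TAP


chain = [BypassTap() for _ in range(3)]            # master core first, last core at the end
outputs = [clock_chain(chain, bit) for bit in [1, 0, 0, 0]]
```

A bit shifted into the master core's TDI appears at the system TDO only after it has passed through every core in the chain, here after three additional clock cycles.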
- A more-detailed diagram illustrating an alternative embodiment of a processor core 202 including exemplary onboard debug circuitry is shown in FIG. 8 .
- An execution unit 400 e.g., a combined processor and co-processor
- the MMU 410 may include a TLB.
- the memory controller 805 is further coupled to a memory system interface through a bus interface unit 408 .
- Access and control of the onboard debug features is provided through an EJTAG TAP 320 .
- the processing unit 300 includes a number of registers 830 that support debug operation.
- the processor core 202 includes an MCD register 835 as discussed above; a debug register 836 as also discussed above, a DEPC register 837 and a DESAVE register 838 .
- a debug control register 832 is coupled between the registers 830 of the processing unit 400 , the memory controller 805 , and externally via the EJTAG TAP 320 .
- a hardware breakpoint unit 825 is also coupled between the registers 830 of the execution unit 400 , the memory controller and the MMU 410 .
- the Hardware Breakpoint Unit 825 implements memory-mapped registers that control the instruction and data hardware breakpoints. The memory-mapped region containing the hardware breakpoint registers is accessible to software only in debug mode.
- the debug features provide compatibility with existing debuggers.
- the debug circuitry 318 support includes specific extensions that enable concurrent multi-Core debugging.
- controlling logic can be used to interpret the values of the software-control bit locations 610 .
- the controlling logic can write the interpreted values into the corresponding MCD_0, MCD_1, MCD_2 bit locations of the MCD register.
- the controlling logic can then pulse the one or more corresponding global MCD wires, according to the corresponding values 615 .
- the processor cores 202 sample the pulse. The pulse sampling can occur during the next clock cycle after the pulse was written. Once sampled, each of the processor cores 202 that is not masked, will initiate a debug exception handler routine.
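The pulse-and-sample sequence above can be sketched in software. This is a minimal model under stated assumptions, not the circuit described in the specification: the per-wire `mask` field, the `mcd_bits` latch, and the names `Core` and `pulse_and_sample` are all hypothetical, and the one-cycle sampling delay is collapsed into a single function call.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Core:
    """Hypothetical model of one processor core's MCD sampling logic."""
    name: str
    mask: int = 0          # assumption: bit i set means pulses on MCD_i are ignored
    mcd_bits: int = 0      # assumption: unmasked pulses are latched here
    in_debug_handler: bool = False

    def sample(self, wire_levels: int) -> None:
        """Sample the three global wires; enter the debug handler on an unmasked pulse."""
        seen = wire_levels & ~self.mask & 0b111
        if seen:
            self.mcd_bits |= seen
            self.in_debug_handler = True


def pulse_and_sample(cores: List[Core], software_control_bits: int) -> int:
    """Pulse the requested wires for one cycle, then let every core sample them."""
    wire_levels = software_control_bits & 0b111   # one bit per wire: MCD_0..MCD_2
    for core in cores:                            # stands in for the next clock cycle
        core.sample(wire_levels)
    return wire_levels


cores = [Core("core_a"), Core("core_b", mask=0b001)]
pulse_and_sample(cores, 0b001)                    # pulse MCD_0 only
print([(c.name, c.in_debug_handler) for c in cores])
```

Here `core_b` masks MCD_0 and therefore keeps running, while `core_a` enters its debug exception handler.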
- the debug exception handler can then follow a set of predetermined rules to determine the one or more causes of a given debug exception after reading the Debug and/or Multi-Core Debug registers. For example, the debug exception handler can follow the rules listed in Table 2 below.
TABLE 2 Debug Exception Handler Rules
1. Any of the MCD state bit locations (Multi-Core Debug[MCD0, MCD1, MCD2]) could be set at any time, indicating that the corresponding MCD state bit is set.
2. If Multi-Core Debug[DExcC] is set, all of Debug[DDBSImpr, DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] will be clear, and Debug[DExcCode] will contain a valid code. (This is the case for a debug mode exception.)
3. If none of Debug[DDBSImpr, DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] are set, then the exception was due either to MCD*, or to Multi-Core Debug[DExcC] being set with a valid Debug[DExcCode].
4. No more than one of Debug[DIB, DDBS, DDBL, DBp, DSS] can be set.
5. If Multi-Core Debug[DExcC] is clear, any combination of Debug[DDBLImpr, DINT] may be set.
6. At least one of Debug[DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] and Multi-Core Debug[MCD0, MCD1, MCD2, DExcC] will be set.
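The rules of Table 2 can be read as consistency checks that a handler applies before reporting causes. The following sketch shows one way to do so; it is an illustration only. The register fields are represented as sets of field names rather than real bit positions (which are not given here), and `decode_causes` is a hypothetical helper, not part of any debugger API.

```python
from typing import List, Set

DEBUG_FIELDS = {"DDBSImpr", "DDBLImpr", "DINT", "DIB", "DDBS", "DDBL", "DBp", "DSS"}
EXCLUSIVE_FIELDS = {"DIB", "DDBS", "DDBL", "DBp", "DSS"}       # rule 4
RULE6_DEBUG_FIELDS = DEBUG_FIELDS - {"DDBSImpr"}               # list as written in rule 6
MCD_FIELDS = {"MCD0", "MCD1", "MCD2"}


def decode_causes(debug: Set[str], multi_core_debug: Set[str]) -> List[str]:
    """Return the causes of a debug exception, checking the Table 2 rules.

    `debug` and `multi_core_debug` hold the names of the fields that are set.
    """
    debug_set = debug & DEBUG_FIELDS
    mcd_set = multi_core_debug & MCD_FIELDS
    dexcc = "DExcC" in multi_core_debug

    # Rule 6: at least one cause bit must be set.
    if not (debug & RULE6_DEBUG_FIELDS) and not mcd_set and not dexcc:
        raise ValueError("no cause bit set")
    # Rule 2: DExcC set implies all Debug cause bits are clear.
    if dexcc and debug_set:
        raise ValueError("DExcC set together with Debug cause bits")
    # Rule 4: at most one of the mutually exclusive Debug bits.
    if len(debug_set & EXCLUSIVE_FIELDS) > 1:
        raise ValueError("more than one of DIB, DDBS, DDBL, DBp, DSS set")

    causes = sorted(mcd_set)            # rule 1: MCD bits may be set at any time
    if dexcc:
        causes.append("DExcC")          # rules 2 and 3: consult Debug[DExcCode]
    causes.extend(sorted(debug_set))    # rule 5 combinations pass through here
    return causes


print(decode_causes({"DSS"}, {"MCD0"}))
```

Using sets of names keeps the sketch independent of any particular register layout; a real handler would test bit fields read from the Debug and Multi-Core Debug registers.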
Abstract
In a multi-core processor, a high-speed interrupt-signal interconnect allows more than one of the processors to be interrupted at substantially the same time. For example, a global signal interconnect is coupled to each of the multiple processors, each processor being configured to selectively provide an interrupt signal, or pulse thereon. Preferably, each of the processor cores is capable of pulsing the global signal interconnect during every clock cycle to minimize delay between a triggering event and its respective interrupt signal. Each of the multiple processors also senses, or samples the global signal interconnect, preferably during the same cycle within which the pulse was provided, to determine the existence of an interrupt signal. Upon sensing an interrupt signal, each of the multiple processors responds to it substantially simultaneously. For example, an interrupt signal sampled by each of the multiple processors causes each processor to invoke a debug handler routine.
Description
- This application claims the benefit of U.S. Provisional Application No. 60/609,211, filed on Sep. 10, 2004. The entire teachings of the above application are incorporated herein by reference.
- Complex computer systems and programs rarely work exactly as designed. During the development of a new computer system, unexpected errors or bugs may be discovered by thorough testing and exhaustive execution of a variety of programs and applications. The source or cause of an error is often not apparent from the error itself; many times an error manifests itself by locking the target system for no apparent reason. Thus, tracking down the source of the error can be problematic.
- Software and system developers commonly use tools referred to as debuggers to identify the source of unexpected errors and to assist in their resolution. A debugger is a software program used to break (i.e., interrupt) program execution at one or more locations in an application program. Once interrupted, a user is presented with a debugger command prompt for entering debugger commands that will allow for setting breakpoints, displaying or changing memory, single stepping, and so forth. Often, processors include onboard features accessible by a debugger to facilitate access to and operation of the processor during debugging.
- One of the most difficult tasks facing designers of embedded systems today is emulating and debugging embedded hardware and software in a real-world environment. Embedded systems are growing more complex, offering increasingly higher levels of performance, and using larger software programs than ever before. To meet the challenges of dealing with embedded systems, engineers and programmers seek advanced tools to enable their performance of appropriate levels of debugging.
- Tracking down problems is particularly challenging when the target system includes a multi-core processor. Multi-core processors include two or more processor cores that are each capable of simultaneously executing independent programs.
- Using standard debug features that may be provided with the individual processor cores of the multi-core processor can provide insight into operation of the individual processor cores. However, assessing operation of parallel applications being developed and executed on the multi-core processor system by debugging an individual processor core will generally be inadequate. Namely, if an operation of a first processor is interrupted as described above, the other processors will continue to operate, thereby changing the state of the system with each subsequent clock cycle as measured from the moment of interrupt.
- A multi-core processor includes a global interrupt capability that selectively breaks operation of more than one of the multiple processor cores at substantially the same time, usually within a few clock cycles. A global interrupt-signal interconnect is coupled to each of the plurality of independent processor cores. Each of the processor cores includes an interrupt-signal sensor for sampling an interrupt signal on the global-signal interconnect and an interrupt-signal generator for selectively providing an interrupt signal. Each processor core respectively interrupts its execution of instructions in response to sampling an interrupt signal on the global interrupt-signal interconnect.
- The respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect. Outputs from the respective interrupt-signal generators can be coupled together and further to the global interrupt-signal interconnect in a wired-OR configuration. Thus, each of the processor cores can individually assert an interrupt signal on the same global interrupt-signal interconnect.
- The multi-core processor can further include an interface adapted to connect to an external device. For example, the interface can be defined by a Joint Test Action Group (JTAG) interface. In some embodiments, more than one global interrupt-signal interconnect is provided. In such a configuration, each of the global interrupt-signal interconnects can represent a different interrupt signal. Additionally, information that may be relevant to debugging the multi-core processor can be provided by a combination of signals asserted on the multiple interrupt-signal interconnects.
- In some embodiments, the plurality of independent processor cores resides on a single semiconductor die. The independent processor cores can be Reduced Instruction Set Computer (RISC) processors. Alternatively or in addition, each of the multiple independent processor cores includes a respective register storing information configurable according to the sampled interrupt signal.
- The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
- FIG. 1 is a block diagram of a security appliance including a network services processor according to the principles of the present invention;
- FIG. 2 is a block diagram of the network services processor shown in FIG. 1;
- FIGS. 3A and 3B are block diagrams illustrating exemplary embodiments of a multi-core debug architecture;
- FIG. 4 is a more detailed block diagram of one of the processor cores shown in FIGS. 3A and 3B;
- FIG. 5 is a schematic block diagram of a debug register;
- FIG. 6 is a schematic block diagram of the Multi Core Debug (MCD) register;
- FIG. 7A is a schematic diagram of an exemplary Test Access Port (TAP) controller;
- FIG. 7B is a more-detailed block diagram illustrating the interconnection of the TAPs among the multiple processor cores shown in FIGS. 3A and 3B; and
- FIG. 8 is a more-detailed block diagram of the debug architecture within one of the processor cores shown in FIGS. 3A and 3B.
- A description of preferred embodiments of the invention follows.
- Applications for multi-core processors are as limitless as applications that use a single microprocessor. Some applications that are particularly well suited for multi-core processors include telecommunications and networking. Having multiple processor cores enables a single sizeable task to be broken down into several smaller, more manageable subtasks, each subtask being executed on a different core processor. Breaking down large tasks in this way typically simplifies the overall processing of complex, high-speed data manipulations, such as those used in data security.
- A debugging system for multi-core processors is provided to facilitate debugging these parallel applications executing on several independent processor cores. This is accomplished, at least in part, by generating internal trigger events from one or more of the multiple processor cores. These multiple trigger events can be transmitted to an external debug console using a debug interface having relatively few I/O signal lines. Preferably the debug interface is separate from the processor core's memory interface (e.g., the Dynamic Random Access Memory (DRAM) interface) to avoid interference with the parallel application. A separate debug interface also allows a majority of the hardware for the debug interface to remain useable during normal processing of the multi-core processors.
- Combining multiple processor cores in a single system leads to a closer placement of cores with respect to each other. Reducing separation between the processor cores generally reduces propagation delay, thereby increasing communication speed between them. In some embodiments, the processor cores are provided within the same central processing unit and are interconnected using cables. Alternatively or in addition, some of the processor cores can be interconnected in the same socket, e.g., plugged into a common processor socket on a motherboard. In some applications, the multiple processors are provided together on the same semiconductor die.
- With different processor cores operating cooperatively to implement a common function, such as packet processing in a high-speed packet processor, it may be necessary to examine the state of more than one of the multiple processor cores during any debugging activity. Thus, it would be beneficial to interrupt the multiple processor cores at substantially the same time, thereby allowing examination of the register contents and memory values attributable to any of the multiple processor cores. Once interrupted, operation of the multiple processor cores can be stepped sequentially, in unison according to operation in a debug mode. This special class of fast interrupts is referred to herein as Multi-Core Debug (MCD) interrupts.
- To facilitate the very fast debug interrupt, a separate, high-speed interrupt-signal interconnect is provided. This separate signal interconnect allows for substantially simultaneous interruption of more than one of the multiple processor cores. For example, a global signal interconnect is coupled to each of the processor cores. Each of the processor cores, in turn, is configured to selectively provide an interrupt signal, or pulse, on the global signal interconnect. Preferably, each of the processor cores is capable of pulsing the global signal interconnect during any cycle of the processor clock. Additionally, each of the processor cores samples the global signal interconnect to determine whether any processor core has provided an interrupt signal.
- Each of the multiple processor cores is connected to the global signal interconnect, with each core being capable of independently pulsing the signal interconnect. Once pulsed, the processor cores sampling the signal interconnect receive the interrupt substantially simultaneously. Using a logical OR configuration of the contributed pulses from all of the multiple processor cores provides the desired functionality (i.e., the global signal interconnect is asserted if any one of the interconnected processor cores asserts the interconnect).
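The logical-OR behavior of the global signal interconnect can be modeled in a few lines. The sketch below is an idealized illustration that ignores propagation delay; the function names and the per-cycle trace format are assumptions made for the example.

```python
from typing import List, Optional


def line_state(drives: List[bool]) -> bool:
    """Wired-OR: the shared line is asserted if any core drives it."""
    return any(drives)


def first_interrupt_cycle(trace: List[List[bool]]) -> Optional[int]:
    """Return the first clock cycle in which the shared line is asserted.

    `trace[c][i]` is True when core i pulses the line during cycle c. Because
    all cores sample the same line, they all observe the pulse in that cycle.
    """
    for cycle, drives in enumerate(trace):
        if line_state(drives):
            return cycle
    return None


trace = [
    [False, False, False],   # cycle 0: no core pulses the line
    [False, True, False],    # cycle 1: the second core pulses the line
    [True, False, False],    # cycle 2: the first core pulses the line
]
print(first_interrupt_cycle(trace))
```

A single pulsing core is enough to assert the line for every core, which is the property the wired-OR arrangement is chosen for.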
- Each of the processor cores includes respective debug circuitry with supporting extensions that enable concurrent multi-Core debugging. The debug circuitry is responsive to the global signal interconnect being asserted.
- More generally, a multi-core processor architecture includes multiple global signal interconnects. Each of the multiple interconnects is independently configured as the single interconnect described above. Thus, each of the global signal interconnects can be asserted (i.e., pulsed) by any of the multiple processor cores as described above.
- FIG. 1 is a block diagram of an exemplary security appliance 102 that includes a network services processor 100 according to the principles of the present invention. The network services processor 100 is a multi-core processor. The security appliance 102 can be a standalone system that switches packets received at one Ethernet port (Gig E) to another Ethernet port (Gig E). Preferably, the security appliance 102 also performs one or more security functions related to the received packets prior to forwarding the packets. For example, the security appliance 102 can be used to perform security processing on packets received from a Wide Area Network (WAN) 102 prior to forwarding the processed packets to a Local Area Network (LAN) 103. Exemplary network services processors 100 adapted to perform such security processing can include hardware packet processing, buffering, work scheduling, ordering, synchronization, and coherence support to accelerate packet processing tasks according to the principles of the present invention.
- The network services processor 100 generally processes higher layer protocols. For example, the network services processor 100 processes one or more of the Open System Interconnection (OSI) network L2-L7 layer protocols encapsulated in received packets. As is well-known to those skilled in the art, the OSI reference model defines seven network protocol layers: Layers 1-7 (referred to herein as L1-L7). The physical layer (L1) represents an actual physical interface, namely, the electrical and physical attributes that enable a device to be connected to a transmission medium. The data link layer (L2) performs data framing. The network layer (L3) formats the data into packets. The transport layer (L4) handles end to end transport. The session layer (L5) manages communications between devices, for example, whether communication is half-duplex or full-duplex. The presentation layer (L6) manages data formatting and presentation, for example, syntax, control codes, special graphics and character sets. The application layer (L7) permits communication between users, for example, file transfer and electronic mail.
- To support multiple interconnects, the network services processor 100 includes a number of interfaces. For example, the network services processor 100 includes a number of Ethernet Media Access Control interfaces with standard Reduced Gigabit Media Independent Interface (RGMII) connections to off-chip destinations using physical interfaces (PHYs) 104 a, 104 b.
- In operation, the network services processor 100 receives packets from one or more external destinations at one or more respective Ethernet ports (Gig E) through the physical interfaces PHY network services processor 100 then selectively performs L7-L2 network protocol processing on the received packets forwarding processed packets through the physical interfaces bus 106 or a PCI Extended (PCI-X). Other PC busses include Industry Standard Architecture (ISA), Extended ISA (EISA), Micro Channel, VL-bus, NuBus, TURBOchannel, VMEbus, MULTIBUS, STD bus, and proprietary busses. Further, the network protocol processing can include processing of network security protocols such as Firewall, Application Firewall, Virtual Private Network (VPN) including IP Security (IPSec) and/or Secure Sockets Layer (SSL), Intrusion Detection System (IDS) and Anti-virus (AV).
- A DRAM controller in the network services processor 100 controls access to an external Dynamic Random Access Memory (DRAM) 108 that is coupled to the network services processor 100. The DRAM 108 stores data packets received from the PHY interfaces 104 a, 104 b or from a local communications bus, such as the PCI-X interface 106, for processing by the network services processor 100. In one embodiment, the DRAM interface supports 64 or 128 bit Double Data Rate II Synchronous Dynamic Random Access Memory (DDR II SDRAM) operating at speeds up to and including 800 MHz.
- A boot bus 110 can be provided, such that the necessary boot code is accessible, allowing the network services processor 100 to execute the boot code upon power-on and/or reset. Generally, the boot code is stored in a memory, such as a flash memory 112. Application code can also be loaded into the network services processor 100 over the boot bus 110. For example, application code can be loaded from a device 114 implementing the Compact Flash standard, or from another high-volume device, such as a disk, attached via the PCI bus.
- A miscellaneous I/O interface 116 offers auxiliary interfaces such as General Purpose Input/Output (GPIO), Flash, IEEE 802 two-wire Management Interface (MDIO), Universal Asynchronous Receiver-Transmitters (UARTs), and serial interfaces.
- The network services processor 100 can include another memory controller for controlling low latency DRAM 118. The low latency DRAM 118 can be used for Internet Services and Security applications, thereby allowing fast lookups, including the string-matching that may be required for Intrusion Detection System (IDS) or Anti Virus (AV) applications.
-
FIG. 2 is a more-detailed block diagram of an exemplary network services processor 100, such as the one shown in FIG. 1. As discussed above, the network services processor 100 can be adapted to deliver high application performance by including multiple processor cores 202. Network operations can be categorized into data plane operations and control plane operations. A data plane operation includes packet operations for forwarding packets. A control plane operation includes processing of portions of complex higher level protocols such as Internet Protocol Security (IPSec), Transmission Control Protocol (TCP) and Secure Sockets Layer (SSL). Advantageously, in such a network application, selected processor cores 202 can be dedicated to performing respective data plane or control plane operations. A data plane operation can include processing of other portions of these complex higher level protocols.
- A packet input unit 214 can be used to allocate and create a work queue entry for each packet. The work queue entry, in turn, contains a pointer to a buffered packet temporarily stored in memory, such as Level-2 cache 212 or DRAM 108 (FIG. 1).
- Packet Input/Output processing is performed by a respective interface unit input controller 214 and interface units processor cores 202.
- A packet is received by any one of the interface units phase 2 standard of the Optical Internetworking Forum) or an RGMII interface. A packet can also be received by a PCI interface 224. The interface unit packet input unit 214. The pre-processed packet can be forwarded over an input/output (I/O) bus, such as I/O bus 225. The packet input unit 214 can be used to perform additional pre-processing, such as pre-processing of L3 and L4 network protocol headers included in the received packet. The pre-processing can include checksum checks for Transmission Control Protocol (TCP)/User Datagram Protocol (UDP) (L4 network protocols).
- The packet input unit 214 writes packet data into buffers in Level-2 cache 212 or DRAM 108 (FIG. 1) in a format that is convenient to higher-layer software executed in at least one processor core 202 for further processing of higher level network protocols. The packet input unit 214 can support a programmable buffer size and can distribute packet data across multiple buffers to support large packet input sizes.
- The Packet order/work (POW) module (unit) 228 queues and schedules work (i.e., packet processing operations) for the processor cores 202. Work can be defined to be any task to be performed by a processor core 202 that is identified by an entry on a work queue. The task can include packet processing operations, for example, packet processing operations for L4-L7 layers to be performed on a received packet identified by a work queue entry on a work queue. Each separate packet processing operation is a piece of the work to be performed by a processor core 202 on the received packet stored in memory. For example, the work can be the processing of a received Firewall/Virtual Private Network (VPN) packet. The processing of a Firewall/VPN packet includes the following separate packet processing operations (i.e., pieces of work): (1) defragmentation to reorder fragments in the received packet; (2) IPSec decryption; (3) IPSec encryption; and (4) Network Address Translation (NAT) or TCP sequence number adjustment prior to forwarding the packet.
- The POW module 228 selects (i.e., schedules) work for a processor core 202 and returns a pointer to the work queue entry that describes the work to the processor core 202. Each piece of work (i.e., a packet processing operation) has an associated group identifier and a tag.
- Prior to describing the operation of the processor cores 202 in further detail, the other modules in the processor core 202 will be described. After the packet has been processed by the processor cores 202, a packet output unit (PKO) 218 reads the packet data from Level-2 cache 212 or DRAM 108 (FIG. 1), performs L4 network protocol post-processing (e.g., generates a TCP/UDP checksum), forwards the packet through the interface unit 210 and frees the Level-2 cache 212 or DRAM 108 locations used to store the packet.
- The
network services processor 100 can also include application specific co-processors that offload the processor cores 202 so that the network services processor 100 achieves a high-throughput. The application specific co-processors can include a DFA co-processor 244 that performs Deterministic Finite Automata (DFA) and a compression/decompression co-processor 208 that performs compression and decompression. Other co-processors include a Random Number Generator (RNG) 246 and a timer unit 242. The timer unit 242 is particularly useful for TCP applications.
- Each processor core 202 can include a dual-issue, superscalar processor with a respective instruction cache 206, a respective Level-1 data cache 204, and respective built-in hardware acceleration (e.g., a crypto acceleration module) 200 for cryptography algorithms with direct access to low latency memory over the low latency memory bus 230. The low-latency, direct-access path to low-latency memory 118 (FIG. 1) bypasses the Level-2 cache memory 212 and can be directly accessed from both the processor cores 202 and a DFA co-processor 244.
- The network services processor 100 also includes a memory subsystem. The memory subsystem includes the respective Level-1 data cache memory 204 of each of the processor cores 202, respective instruction cache 206 in each of the processor cores 202, a Level-2 cache memory 212, a DRAM controller 216 for external DRAM memory 108 (FIG. 1), and an interface, such as a low-latency bus 230 to external low latency memory (not shown).
- The memory subsystem is configured to support the multiple processor cores 202 and can be tuned to deliver both the high-throughput and the low-latency required by memory-intensive, content-networking applications. Level-2 cache memory 212 and external DRAM memory 108 (FIG. 1) are shared by all of the processor cores 202 and I/O co-processor devices.
- Each of the processor cores 202 can be coupled to the Level-2 cache by a local bus, such as a coherent memory bus 234. Thus, the coherent memory bus 234 can represent the communication channel for memory and I/O transactions between the processor cores 202, an I/O Bridge (IOB) 232, and the Level-2 cache and controller 212.
- A Free-Pool Allocator (FPA) 236 maintains pools of pointers to free memory in Level-2 cache memory 212 and DRAM 108. A bandwidth efficient (Last-In-First-Out (LIFO)) stack is implemented for each free pointer pool.
- The I/O Bridge 232 manages the overall protocol and arbitration and provides coherent I/O partitioning. The I/O Bridge 232 includes a bridge 238 and a Fetch-and-Add Unit (FAU) 240. The bridge 238 includes buffer queues for storing information to be transferred between the I/O bus 225, coherent memory bus 234, the packet input unit 214 and the packet output unit 218.
- The Fetch-and-Add Unit 240 includes a 2 kilobyte (KB) register file supporting read, write, atomic fetch-and-add, and atomic update operations. The Fetch-and-Add Unit 240 can be accessed from both the cores 202 and the packet output unit 218. The registers store highly-used values and thus reduce traffic to access these values. Registers in the Fetch-and-Add Unit 240 are used to maintain lengths of the output queues that are used for forwarding processed packets through the packet output unit 218.
- The PCI interface controller 224 has a Direct Memory Access (DMA) engine that allows the processor cores 202 to move data asynchronously between local memory in the network services processor 100 and remote (PCI) memory (not shown) in both directions.
- In some embodiments, a key memory (KEY) 248 is provided. The key memory 248 is a protected memory coupled to the I/O Bus 225 that can be written/read by the processor cores 202. For example, the key memory can include error checking and correction (ECC). ECC will report single and double bit errors and repair single bit errors. The memory is a single-port memory that can be provided with write precedence. In some embodiments, the key memory 248 can be used to temporarily store Loads, Stores, and I/O pre-fetches.
- A Miscellaneous Input/Output (MIO) unit 226 can also be coupled to the I/O bus 225 to provide interface support for one or more external devices. For example, the MIO unit 226 can support one or more interfaces to a Universal Asynchronous Receiver/Transmitter (UART), to a boot bus, to a General Purpose Input/Output (GPIO) interface for communicating with peripheral devices (not shown), and more generally to a Field-Programmable Gate Array (FPGA) for interfacing with external devices. For example, an FPGA can be used to interface to external Ternary Content-Addressable Memory (TCAM) hardware providing fast-lookup performance. In particular, the MIO 226 can provide an interface to an external debugger console described below.
- The
processor core 202 supports multiple operational modes including: user mode, kernel mode, and debug mode. User mode is most often employed when executing applications programs (e.g., the internal flow of program control). Kernel mode is typically used for handling exceptions and operating system kernel functions, including management of any related coprocessor and Input/Output (I/O) device access. Debug mode is a special operational mode typically used by software developers to examine variables and memory locations, to stop code execution at predefined break points, and to step through the code one line or unit at a time, usually while monitoring variables and memory locations. Debug mode is also different from other operational modes in that there are substantially no restrictions on access to coprocessors, memory areas. Additionally, while in Debug mode, the usual exceptions like address error and interrupt are masked. - A
multi-core processor 100 configured for debugging parallel applications executing on more than one independent processor cores is shown inFIGS. 3A and 3B . For example, themulti-core processor 100 includes three separate global signal interconnects: MCD_0, MCD_1, and MCD_2. Each of the three global signal interconnects MCD_0, MCD_1, and MCD_2 is coupled to each of themultiple processor cores processor core 202, in turn, includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects. Preferably, each of theprocessor cores 202 is configured to independently and selectively assert (i.e., pulse) an interrupt on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2. - Each
processor core 202 also includes sensing circuitry configured to sample each of the global signal interconnects to determine the presence of an asserted interrupt. Preferably, each of theprocessor cores 202 independently samples the global signal interconnects MCD_0, MCD_1, and MCD_2 to determine whether an interrupt has been asserted, and on which of the several global signal interconnects the interrupt has been asserted—interrupts can be asserted on more than one of the global signal interconnects at a time. The global signal interconnects MCD_0, MCD_1, and MCD_2 are preferably sampled continuously, or at least once during each clock cycle to determine the presence of an interrupt. The sensing circuitry can include a register into which the state of the global signal interconnect is latched. For example, a register is configured to store one bit for each of the multiple global signal interconnects, the value of the stored bit indicative of the state of the respective global signal interconnect. - Having more than one global signal interconnect MCD_0, MCD_1, and MCD_2 coupled to each of the
processor cores 202 can provide additional information. For example, with each of the three wires capable of being independently pulsed between two states (e.g., a logical Low or “0” and a logical High or “1”) can provide information corresponding to one of up to eight different messages (i.e., 3=8). - Alternatively, or in addition, the global signal interconnects can be used to communicate with the
processor cores 202 once interrupted. An external debug console 325 hosting a debugger application and providing a user interface can be interconnected to one or more of the processor cores 202 to facilitate debugging of the system 100. Preferably, the global signal interconnects are accessible by the debugger. For example, a debugger can assert a pulse on MCD_1 to instruct the processor cores 202 to check their mailbox location (e.g., in main memory) for an instruction from the debugger. The debugger can assert a pulse on MCD_2 to restart all processor cores 202 after a multi-core interrupt. Thus, usage of the global signal interconnects can minimize disruption of the state contained in the processor cores 202 and in the system 100 while the debugger examines it. This capability can be very useful to isolate the cause of bugs in parallel applications. - The
processor cores 202 are each coupled to the global signal interconnects MCD_0, MCD_1, and MCD_2 in a respective “wired-OR” fashion. Thus, respective interrupt-signal generators of each of the processor cores 202 are all interconnected at a first wired-OR 310a, further connected to the first global signal interconnect MCD_0. Second and third wired-ORs similarly couple the processor cores 202 to the second and third global signal interconnects MCD_1 and MCD_2, respectively. - Thus, should one or more of the
processor cores 202 assert a pulse on any of its respective interrupt-signal generator outputs (e.g., to wired-OR 310a), the pulse will be asserted on the respective global signal interconnect (e.g., MCD_0). Preferably, a pulse can be asserted during any cycle. Using the wired-OR allows any of the processor cores 202 to drive the global signal interconnect (i.e., the wired-OR provides an output of “1” if any of its inputs is “1”), while also minimizing any corresponding delay. Once a pulse is asserted on the global signal interconnects, all of the processor cores 202 sample it, allowing the cores 202 to be interrupted very quickly: at the same time, or at least within a few cycles of the processor clock. Such a rapid interrupt of all of the processor cores 202 preserves the entire state of the parallel application at the time of the interrupt for examination by the debugger. - In other embodiments, each of the
processor cores 202 can be interconnected to the global signal interconnects MCD_0, MCD_1, and MCD_2 using combinational logic, such as a logical OR gate. Such logic, however, represents additional complexity generally resulting in a corresponding delay (e.g., a gate delay due to synchronous logic, and/or a rise time delay due to the capacitance of the logic circuitry). - Each
processor core 202 provides an exception handler. Generally, an exception refers to an error or other special condition detected during normal program execution. The exception handler can interrupt the normal flow of program control in response to receiving an exception. For example, a debug exception handler halts normal operation in response to receiving a debug interrupt. The exception handler then passes control to a debug handler, or software program, that controls operation in debug mode. - Some exemplary exception types include a Debug Single Step (DSS) exception resulting in single step execution in debug mode. A general Debug Interrupt (DINT) results in entry of debug mode and can be caused by the assertion of an external interrupt (e.g., EJ_DINT), or by setting a related bit in a debug register. An interrupt can result from assertion of an unmasked hardware or software interrupt signal. A debug hardware instruction break matched (DIB) exception results in entry of debug mode when an instruction matches a predetermined instruction breakpoint. Similarly, a debug breakpoint instruction (DBp) exception results in entry of debug mode upon execution of a special instruction (e.g., a software debug breakpoint instruction, such as the EJTAG “SDBBP” instruction that places a processor into debug mode and fetches associated handler code from memory). A Data Address Break (address only) or Data Value Break (e.g., DDBL/DDBS) exception results in entry of debug mode when a particular memory address is accessed, or a particular value is written to or read from memory.
- Each of the
processor cores 202 includes respective onboard debug circuitry 318. As shown in FIG. 3A, each of the multiple processor cores 202 can include a respective core Test Access Port (TAP) 320′, 320″, 320′″ (generally 320) for accessing the respective debug circuitry 318. The core TAPs 320 are connected to one system TAP 330. As shown, each of the respective core TAPs 320 and the system TAP 330 can be interconnected in a daisy chain configuration. Additionally, the debug circuitry 318 of all of the interconnected processor cores 202 can be coupled to the external debug console 325. - Once in debug mode, the debug control console can be used to inspect the values stored in registers and memory locations. The debug control console provides a software program that communicates with the
onboard debug circuitry 318 to accomplish inspection of stored values, setting of breakpoints, stopping, restarting and sequentially stepping each of the processor cores 202 in unison. - Alternatively, or in addition, each of the
processor cores 202 can be coupled to the external debug console 325 through one or more Universal Asynchronous Receiver-Transmitter (UART) devices that include receiving and transmitting circuits for asynchronous serial communications, as shown in FIG. 3B. In one embodiment, the multi-core processor 100 includes two UART devices 335 providing communication between the processor cores 202 and external devices, such as the external debug console 325. The UART devices 335 can be included within the Miscellaneous I/O unit (FIG. 2). Thus, each processor core 202 can communicate with another device, such as the external debug console 325, through a respective memory bus interface 340 using one or more of the UART devices 335 accessible through the I/O bridge 238. Advantageously, communicating with the external debug console 325 using the UART device 335 removes constraints that would have otherwise been imposed by using a standard interface, such as the JTAG TAP interface (FIG. 3A). - The
multi-core processor 100 optionally includes a trace buffer 610 (shown in phantom) for selectively monitoring memory transactions of the processor cores 202. For example, the trace buffer 610 is coupled to the coherent memory bus 234 to monitor transactions thereon. Generally, the trace buffer 610 stores information that can be used to assist in any debugging activity. For example, the trace buffer 610 can be configured to store the last “N” transactions on the bus, the oldest transaction being discarded as each new (N+1st) transaction occurs. Further, when using a single trace buffer 610, identification tags can be used to identify the particular processor core 202 associated with each stored transaction. - Beneficially, the
trace buffer 610 is also coupled to each of the one or more global signal interconnects MCD_0, MCD_1, and MCD_2, and configured with sensing circuitry sampling any pulses asserted on the global signal interconnects. The trace buffer 610 also includes a trigger that initiates the starting and/or stopping of monitoring in response to sampling an interrupt signal on the global signal interconnect. Although a single trace buffer 610 supporting multiple processor cores 202 is illustrated, other configurations are possible. For example, multiple trace buffers 610 can be provided with each trace buffer 610 respectively corresponding to one of the multiple processor cores 202. Additionally, the trace buffer 610 can be on-chip, as shown, or off-chip and accessible by a probe. - Alternatively or in addition, the
trace buffer 610 includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2. As shown, the trace buffer 610 can be coupled to the global signal interconnects MCD_0, MCD_1, and MCD_2 through the wired-OR circuits 310. In this configuration, the trace buffer 610 can selectively assert a global interrupt signal on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2, thereby interrupting more than one of the multiple processor cores 202 in response to activity on the coherent memory bus 234. -
FIG. 4 is a more detailed block diagram of an exemplary processor core 202 shown in FIGS. 3A and 3B. In general, a processor core 202 interprets and executes instructions. In some embodiments the processor core 202 is a Reduced Instruction Set Computing (RISC) processor core 202. In more detail, the processor core 202 includes an execution unit 400, an instruction dispatch unit 402, an instruction fetch unit 404, a load/store unit 416, a Memory Management Unit 406, a system interface 408, a write buffer 420 and security accelerators 200. The processor core 202 also includes debug circuitry 318 allowing debug operations to be performed. The system interface 408 controls access to external memory, that is, memory external to the processor core 202, such as the L2 cache memory described in relation to FIG. 2. - Still referring to
FIG. 4, the execution unit 400 includes a multiply/divide unit 412 and at least one register file 414. The multiply/divide unit 412 has a 64-bit register-direct multiply. The instruction fetch unit 404 includes Instruction Cache (ICache) 206. The load/store unit 416 includes Data Cache (DCache) 204. A portion of the data cache 204 can be reserved as local scratch pad/local memory 422. In one embodiment, the instruction cache 206 is 32 Kilobytes, the data cache 204 is 8 Kilobytes and the write buffer 420 is 2 Kilobytes. The memory management unit 406 includes a Translation Lookaside Buffer (TLB) 410. - In one embodiment, the
processor core 202 includes a crypto acceleration module (security accelerators) 200 that includes cryptography acceleration. For example, the cryptography acceleration can include one or more of Triple Data Encryption Standard (3DES), Advanced Encryption Standard (AES), Secure Hash Algorithm (SHA-1), and Message Digest Algorithm #5 (MD5). The crypto acceleration module 200 communicates by moves to and from the main register file 414 in the execution unit 400. Particular algorithms, such as Rivest, Shamir, Adleman (RSA) and Diffie-Hellman (DH), can be implemented and are performed in the multiply/divide unit 412. - In some embodiments, the multi-core processor 100 (
FIG. 2) includes a superscalar processor. A superscalar processor includes a superscalar instruction pipeline that allows more than one instruction to be completed in each cycle of the processor's clock by allowing multiple instructions to be issued simultaneously and dispatched in parallel to multiple execution units 400. The RISC-type processor core 202 has an instruction set architecture that defines instructions by which the programmer interfaces with the RISC-type processor core 202. In one embodiment, the superscalar RISC-type core is an extension of the MIPS64 version 2 core. Only load-and-store instructions access external memory; that is, memory external to the processor core 202. In one embodiment, the external memory is accessed over a coherent memory bus 234 (FIG. 2). All other instructions operate on data stored in the register file 414 within the execution unit 400 of the processor core 202. In some embodiments, the superscalar processor can be a dual-issue processor. - The instruction pipeline is divided into stages, each stage taking one clock cycle to complete. Thus, in a five stage pipeline, it takes five clock cycles to process each instruction and five instructions can be processed concurrently with each instruction being processed by a different stage of the pipeline in any given clock cycle. Typically, a five stage pipeline includes the following stages: fetch, decode, execute, memory and write back.
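The overlap of instructions in the five stage pipeline described above can be made concrete with a small occupancy model. The Python sketch below is illustrative only and rests on assumptions not stated in the text: a single-issue pipeline with no stalls or hazards, and a hypothetical helper named `pipeline_schedule`. Under those assumptions, n instructions finish in 5 + n - 1 cycles.

```python
# Illustrative model of an ideal five stage pipeline (assumed single-issue,
# no stalls). Stage names follow the text; the helper name is hypothetical.
STAGES = ("fetch", "decode", "execute", "memory", "write back")


def pipeline_schedule(num_instructions):
    """Return, for each clock cycle, a dict mapping stage -> instruction index."""
    total_cycles = len(STAGES) + num_instructions - 1
    schedule = []
    for cycle in range(total_cycles):
        occupancy = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # instruction that entered `depth` cycles ago
            if 0 <= instr < num_instructions:
                occupancy[stage] = instr
        schedule.append(occupancy)
    return schedule


if __name__ == "__main__":
    for cycle, occupancy in enumerate(pipeline_schedule(5)):
        print(cycle, occupancy)
```

In cycle 4 of a five-instruction run, every stage holds a different instruction, which is the situation the text describes as five instructions being processed concurrently.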
- During the fetch-stage, the instruction fetch
unit 404 fetches an instruction from instruction cache 206 at a location in instruction cache 206 identified by a memory address stored in a program counter. During the decode-stage, the instruction fetched in the fetch-stage is decoded by the instruction dispatch unit 402 and the address of the next instruction to be fetched for the issuing context is computed. During the execute-stage, the execution unit 400 performs an operation dependent on the type of instruction. For example, the execution unit 400 begins the arithmetic or logical operation for a register-to-register instruction, calculates the virtual address for a load or store operation, or determines whether the branch condition is true for a branch instruction. During the memory-stage, data is aligned by the load/store unit 416 and transferred to its destination in external memory. During the write back-stage, the result of a register-to-register or load instruction is written back to the register file 414. - The
system interface 408 is coupled via the coherent memory bus 234 (FIG. 2) to external memory. In one embodiment, the coherent memory bus 234 has 384 bits and includes four separate buses: (i) an Address/Command Bus; (ii) a Store Data Bus; (iii) a Commit/Response control bus; and (iv) a Fill Data bus. All store data is sent to external memory over the coherent memory bus 234 via a write buffer entry in the write buffer 420. In one embodiment, the write buffer 420 has 16 write buffer entries. - Store data flows from the load/
store unit 416 to the write buffer 420, and from the write buffer 420 through the system interface 408 to external memory. The processor core 202 can generate data to be stored in external memory faster than the system interface 408 can write the store data to the external memory. The write buffer 420 minimizes pipeline stalls by providing a resource for storing data prior to forwarding the data to external memory. - The
write buffer 420 is also used to aggregate data to be stored in external memory over the coherent memory bus 234 into aligned cache blocks to maximize the rate at which the data can be written to the external memory. Furthermore, the write buffer 420 can also merge multiple stores to the same location in external memory, resulting in a single write operation to external memory. The write-merging operation of the write buffer 420 can result in the order of writes to the external memory being different from the order of execution of the store instructions. - The
processor core 202 also includes an exception control system providing circuitry for identifying and managing exceptions. An exception refers to an interruption or change of the normal flow of program control that occurs when an event or other special condition is detected during execution. Exceptions can be caused by a variety of sources, including boundary cases in data, external events, or even program errors, and can be generated (i.e., “raised”) by hardware or software. Exemplary hardware exceptions include resets, interrupts and signals from a memory management unit. Hardware exceptions may be generated by an arithmetic logic unit or floating-point unit for numerical errors such as divide by zero, overflow or underflow, or instruction decoding errors such as privileged, reserved, trap or undefined instructions. Software exceptions are even more varied. For example, a software exception can refer to any kind of error checking that alters the normal behavior of the program. An exception transfers control from code being executed at the instant of the exception to different code, a routine commonly referred to as an exception handler. - A system co-processor can also be provided within the
processor core 202 for providing a diagnostic capability, for controlling the operating mode (i.e., kernel, user, and debug), for configuring interrupts as enabled or disabled, and for storing other configuration information. - The
processor core 202 also includes a Memory Management Unit (MMU) 406 coupled to the instruction fetch unit 404 and the load/store unit 416. The MMU 406 is a hardware device or circuit that supports virtual memory and paging by translating virtual addresses into physical addresses. Thus, the MMU 406 may receive a virtual memory address from program instructions being executed on the processor core 202. The virtual memory address is associated with a read from or a write to physical memory. The MMU 406 translates the virtual address to a physical address to allow a related physical memory location to be accessed by the program. - In a multitasking system all processes compete for the use of memory and of the
MMU 406. In some memory management architectures, however, each process is allowed to have its own area or configuration of the page table, with a mechanism to switch between different mappings on a process switch. This means that all processes can have the same virtual address space rather than require load-time relocation. To accomplish this task, the MMU 406 can include a Translation Lookaside Buffer (TLB) 410. - The
debug circuitry 318 on each processor core 202 can include an onboard debug controller. Having an onboard debug controller facilitates operation of the processor core 202 in the debug mode. For example, the debug controller can allow for single-step execution of the processor core 202. Further, the debug controller can support breakpoints, enabling them to transition the processor core 202 into debug mode. For example, the breakpoints can be one or more of instruction breakpoints, data breakpoints, and virtual address breakpoints. - In some embodiments, the
onboard debug circuitry 318 includes standardized features. For example, the onboard debug circuitry 318 can be compliant with the design philosophy of the Joint Test Action Group (JTAG) interface, a popular standardized interface defined by IEEE Standard 1149.1. In embodiments that utilize processor cores, the onboard controller referred to is the standard MIPS Enhanced JTAG (EJTAG) debug circuitry 318. - Each
processor core 202 includes one or more debug registers, each register including one or more pre-defined fields for storing information (e.g., state bits) related to different aspects of debug mode operation. The debug registers 425 can be located in the instruction fetch unit 404. For example, one of the debug registers 425 is a Debug register 500. The Debug register 500 is illustrated in more detail in FIG. 5. The Debug register 500 includes a DM state bit indicative of whether the processor core 202 is operating in debug mode. Other bits include a DBD state bit indicative of whether the last debug exception or exception in Debug Mode occurred in a branch or jump delay slot. A DDBSImpr bit is indicative of an imprecise debug data break store. A DDBLImpr bit is indicative of an imprecise debug data break load. This bit can be implemented for load value breakpoints. A DExcCode bit is set to one when Debug[DExcCode] is valid and should be interpreted. - Another one of the debug registers 425, a Multi-Core Debug (MCD) register 600, is shown in
FIG. 6. The MCD register 600 includes dedicated multi-core debug state positions 615, one position being provided for each of the respective global signal interconnects MCD_0, MCD_1, and MCD_2. Similarly, the MCD register 600 includes dedicated mask-disable state positions 605, one position being provided for each of the respective global signal interconnects MCD_0, MCD_1, and MCD_2. When set, the mask-disable bits (one bit for each global signal interconnect) disable the effect of sampling a pulse on the corresponding global signal interconnect. - The MCD register 600 also includes respective software-
control bit locations 610 for each of the several global MCD wires. For the exemplary multi-core processor 100, three software-control bit locations 610, referred to as Pls0, Pls1, and Pls2, are reserved. These software-control bit locations 610 correspond to the three global signal interconnects MCD_0, MCD_1, and MCD_2, respectively. Thus, bits written by software into the software-control bit locations 610 can be used to pulse any combination of the three global MCD wires. - In some embodiments, the debug registers 425 (
FIG. 4) include a DEPC register for imprecise debug exceptions and imprecise exceptions in Debug Mode. Imprecise debug data breakpoints are provided for load value compare; otherwise debug data breakpoints are precise. The DEPC register contains an address at which execution should be resumed when returning to Non-Debug Mode. - Exception handlers can be entered for debug processing in a number of ways. First, software such as the processor core instruction set and/or the debugger can include a breakpoint instruction. When the breakpoint instruction is executed by the
execution unit 400, it causes a specific exception. Alternatively or in addition, a set of trap instructions can be provided. When the trap instructions are executed by the execution unit 400, a specific exception will result, but only when certain register value criteria are also satisfied. Further, a pair of optional Watch registers can be programmed to cause a specific exception on a load, store, or instruction fetch access to a specific word (e.g., a 64-bit double word) in virtual memory. Still further, an optional TLB-based MMU 406 can be programmed to “trap,” or otherwise interrupt program execution on any access, or more specifically, on any store to a page of memory. These exceptions generally refer to interrupting operation on any one of the processor cores 202. To interrupt the other processor cores 202, a pulse must be asserted on one or more of the global signal interconnects MCD_0, MCD_1, MCD_2. - In operation, when one or more of the
processor cores 202 asserts a pulse on one of the global signal interconnects MCD_0, MCD_1, and MCD_2, the corresponding signal value can be a high state, or logical one. The respective instruction fetch unit 404 of each of the interconnected processor cores 202 samples the one on the global signal interconnect. In response to sampling the one, the instruction fetch unit 404 sets an internal state bit corresponding to the sampled pulse. The internal state bit, or MCD state bit, can be one of the dedicated multi-core debug state positions 615 in the multi-core debug register 600 (i.e., Multi-Core Debug[MCD0, MCD1, MCD2]). - If any of the multi-core
debug state bits 615 are non-zero on a given processor core 202 (and that processor core 202 is not already in debug mode), the onboard debug circuitry 318 requests a debug exception on its respective processor core 202. With all of the multiple processor cores 202 sampling the same pulse and setting their respective bits 615 at substantially the same time, all of the unmasked processor cores 202 are interrupted at substantially the same time. Preferably, this occurs during the same cycle, but it can also occur within a few clock cycles. Software can later clear Multi-Core Debug[MCD0, MCD1, MCD2] bits by overwriting them (e.g., writing a one to them). Such a provision ensures that no further debug interrupts occur after exiting the debug handler. - In general, interrupts can be assigned different priority values to ensure the desired results in situations in which more than one type of interrupt occurs. In particular, the MCD interrupts can occur at the same priority level as standard debug interrupts provided within the
debug circuitry 318 of each of the processor cores 202. The exception location can also be the same as a debug interrupt, with the multi-core debug bits 615 being similar to the DINT bit of the debug register shown in FIG. 5. - The detailed behavior of the bits, however, is different. For example, the DINT bit is read-only, whereas Multi-Core Debug[MCD0,
MCD1, MCD2] bits can be written to, allowing the bits to be cleared by the debug handler. Further, the DINT bit is cleared when Multi-Core Debug[DExcC] is set, whereas the multi-core debug state bits 615 need not be. - There are at least four ways that the global signal interconnects MCD_0, MCD_1, MCD_2 can be pulsed. First, software can cause initiation of a pulse on the global MCD wires. For example, debugger software running on a
processor core 202 can write one or more values (e.g., logical “1”s) to any combination of the software-control state bits 610 of the MCD register 600. When a “1” is written into one or more of these bits 610, the processor core 202 interprets it as an instruction to assert an interrupt signal, or pulse, on the corresponding global signal interconnects. - The global signal interconnects can also be pulsed by execution of a special instruction. For example, execution of a software breakpoint instruction, such as the SDBBP instruction, by any one of the
processor cores 202 results in that core 202 asserting a pulse on the MCD_0 global signal interconnect. Whether a pulse is actually asserted by a processor core 202 in response to the breakpoint instruction can be further controlled by a global-signal debug bit 618 in the MCD register 600. Thus, a pulse is only asserted in response to the breakpoint instruction when the MCD[GSDB] bit 618 is set. - Alternatively or in addition, the initiation of a pulse on the global signal interconnects can result if one or more bits within a particular register are set and a breakpoint match occurs. When these two conditions occur, the hardware (e.g., the debug circuitry 318) pulses one of the global MCD wires (e.g., the MCD_0 wire). An Instruction Breakpoint Control-n register (IBCn, “n” being a numbered reference to a particular instruction breakpoint) stores a value responsive to a match of an executed breakpoint instruction. Similarly, a Data Breakpoint Control-n (DBCn) register stores a value responsive to a match of a data transaction. The registers IBCn and DBCn generally include special bits (e.g., BE, TE) that can be used to enable the respective breakpoints.
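The pulse sources and the per-core reaction described above can be summarized in a short behavioral sketch. The Python below is a hedged illustration, not the patented circuitry: all function names are hypothetical, each mask is assumed to use bit i for MCD_i, the BE/TE handling follows the combinations given in Table 1, and a mask-disable bit is assumed to prevent the corresponding state bit from being set (the text only says it disables the effect of sampling). Only the three per-core pulse sources described in this passage are modeled.

```python
from typing import NamedTuple

MCD0 = 0b001      # assumed encoding: bit i of a mask stands for wire MCD_i
ALL_WIRES = 0b111


class MatchResult(NamedTuple):
    local_exception: bool  # core enters debug mode on its own breakpoint
    pulse_mcd0: bool       # MCD_0 is pulsed to the other cores
    bs_bits_set: bool      # BS bits are set in IBS/DBS


def on_breakpoint_match(be: int, te: int) -> MatchResult:
    """Behavior on a breakpoint match for the given BE/TE bits (Table 1)."""
    return MatchResult(bool(be), bool(te), bool(be or te))


def wires_to_pulse(pls_bits: int, sdbbp_executed: bool, gsdb: int,
                   bp_match: bool, te: int) -> int:
    """Mask of global wires that one core pulses."""
    mask = pls_bits & ALL_WIRES      # software wrote ones to Pls0..Pls2
    if sdbbp_executed and gsdb:      # SDBBP pulses MCD_0 only if GSDB is set
        mask |= MCD0
    if bp_match and te:              # breakpoint match with TE set
        mask |= MCD0
    return mask


def sample_pulse(state_bits: int, mask_disable: int, wire_level: int) -> int:
    """Set MCD state bits for sampled wires (assumed: masked wires ignored)."""
    return (state_bits | (wire_level & ~mask_disable)) & ALL_WIRES


def needs_debug_exception(state_bits: int, in_debug_mode: bool) -> bool:
    """A debug exception is requested if any state bit is set outside debug mode."""
    return state_bits != 0 and not in_debug_mode


def write_one_to_clear(state_bits: int, written: int) -> int:
    """Software clears MCD state bits by writing a one to them."""
    return state_bits & ~written & ALL_WIRES
```

A debug handler in this model would call `write_one_to_clear` before returning, so that `needs_debug_exception` is false on exit and no further debug interrupt is taken for the same pulse.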
- Table 1 below describes an exemplary embodiment in which the detailed behavior on a breakpoint match is defined based on exemplary register values.
TABLE 1 Breakpoint Match Behavior
BE | TE | Comment
0 | 0 | Nothing happens on a match.
0 | 1 | MCD0 is pulsed on a match. BS bits are also set in IBS/DBS. No direct local exception occurs. (This mode may not be used.)
1 | 0 | A local breakpoint exception occurs due to the breakpoint match, causing the Core to enter debug mode. MCD0 is not pulsed. BS bits are set in IBS/DBS. (This mode will be used when debugging, but not multi-Core.)
1 | 1 | A local breakpoint exception occurs due to the breakpoint match, causing the Core to enter debug mode. MCD0 is also pulsed. BS bits are also set in IBS/DBS. (This mode will be used when debugging multi-Core.)
- An
exemplary TAP controller 700 is shown in FIG. 7A. The TAP controller 700 includes one or more registers 705 for storing instruction, data, and control information relating to the TAP interface 320. The registers 705 allow a user to perform a set up for the onboard debug circuitry 318, and provide important status information during a debug session. The size of the registers 705 depends on the specific implementation, but usually they are at least 32 bits. - The
registers 705 receive information from an external source using the Test Data Input (TDI) input (i.e., pin). The registers also provide information to an external source using the Test Data Output (TDO) output (i.e., pin). Operation of the interface is provided by a TAP controller state machine 710. The TAP controller 700 uses a communications channel, such as a serial communications channel that operates according to a clock signal received on the Test Clock (TCK) input (i.e., pin). Thus, movement of data into and/or out of the registers 705 operates according to the received clock signal. Similarly, operation of the state machine also relies on the received clock. - A more detailed interconnection of respective TAP interfaces 320 on each of the
multiple processor cores 202 is shown in FIG. 7B. A JTAG interface, referred to as a Test Access Port (TAP) 320′, 320″, 320′″ (generally 320), includes at least four signal lines: Test Clock (TCK); Test Mode Select (TMS); Test Data In (TDI); and Test Data Out (TDO). The interface can also include one or more power and ground signal lines (not shown). The JTAG interface is a serial interface that is capable of transferring data according to a clock signal received on the TCK signal line. Operating frequency varies per chip, but is typically defined by a clock signal having a frequency between about 10 MHz and about 100 MHz (i.e., from about 100 nanoseconds to about 10 nanoseconds per bit time). - Configuration of each of the respective debug circuitry 318 (
FIGS. 3A and 3B) can be performed by manipulating an internal state machine. For example, a debug controller state machine within the debug circuitry 318 can be externally manipulated one bit at a time via the TMS signal line of the TAP 330. Data can then be transferred in and out, one bit at a time, during each TCK clock cycle. The data can be received via the TDI signal line, and transmitted out via the TDO signal line, respectively. Different instruction modes can be loaded into the debug controller 318 to read the core identification (ID), to sample input, to drive (and/or float) output, to manipulate functions, and/or to bypass (pipe TDI to TDO to logically shorten chains of multiple chips). - The
respective TAP 320 of each of the multiple processor cores 202 is interconnected to the respective TAP 320 of the other processor cores 202 in a serial, or “daisy chain” configuration. Thus, the TCK signal line of the first TAP 320′ is serially interconnected to the corresponding TCK signal lines of all of the other TAPs 320. The interconnected TCK signal lines are further connected to a corresponding TCK signal line of a system TAP 330. Typically, the system TAP 330 is interconnected to one end of the interconnected processor cores 202 (i.e., processor core 202n or processor core 202a, as shown), that end processor core 202 being referred to as a “master” processor core 202a. For the most part, the remaining TAP signal lines are interconnected in a similar manner, being further connected from the master processor core 202a to corresponding TAP signal lines on the system TAP 330. Interconnection of the TDI and TDO signal lines, however, is different as described in more detail below. - In the daisy chain configuration, the TDI signal line of the
master processor core 202a connects to the corresponding TDI signal line of the system TAP 330, the master processor core 202a receiving data from an external source. The TDO signal line of the master processor core 202a, however, is connected to the TDI signal line of an adjacent processor core 202b. Additional processor cores 202 are connected in a similar manner, the TDO signal line of one processor core 202 being interconnected to the TDI signal line of the next processor core 202, until the TDO signal line of the last processor core 202n in the chain is interconnected to the TDO signal line of the system TAP 330. - A more-detailed diagram illustrating an alternative embodiment of a
processor core 202 including exemplary onboard debug circuitry is shown in FIG. 8. An execution unit 400 (e.g., a combined processor and co-processor) is coupled to a memory (e.g., cache) controller 805 through an MMU 410. The MMU 410 may include a TLB. The memory controller 805 is further coupled to a memory system interface through a bus interface unit 408. Access and control of the onboard debug features is provided through an EJTAG TAP 320. The processing unit 400 includes a number of registers 830 that support debug operation. For example, the processor core 202 includes an MCD register 835 as discussed above, a debug register 836 as also discussed above, a DEPC register 837 and a DESAVE register 838. - A
debug control register 832 is coupled between the registers 830 of the processing unit 400, the memory controller 805, and externally via the EJTAG TAP 320. A hardware breakpoint unit 825 is also coupled between the registers 830 of the execution unit 400, the memory controller 805 and the MMU 410. The hardware breakpoint unit 825 implements memory-mapped registers that control the instruction and data hardware breakpoints. The memory-mapped region containing the hardware breakpoint registers is accessible to software only in debug mode. - The debug features provide compatibility with existing debuggers. The
debug circuitry 318 includes specific extensions that enable concurrent multi-core debugging. For example, controlling logic can be used to interpret the values of the software-control bit locations 610. Upon interpreting a value indicative of a pulse, the controlling logic can write the interpreted values into the corresponding MCD_0, MCD_1, MCD_2 bit locations of the MCD register. The controlling logic can then pulse the one or more corresponding global MCD wires, according to the corresponding values 615. Once the wires are pulsed, the processor cores 202 sample the pulse. The pulse sampling can occur during the next clock cycle after the pulse was written. Once the pulse is sampled, each of the processor cores 202 that is not masked will initiate a debug exception handler routine.
- The debug exception handler can then follow a set of predetermined rules to determine the one or more causes of a given debug exception after reading the Debug and/or Multi-Core Debug registers. For example, the debug exception handler can follow the rules listed in Table 2 below.
TABLE 2: Debug Exception Handler Rules

1. Any of MCD state bit locations (Multi-Core Debug[MCD0, MCD1, MCD2]) could be set at any time, indicating that the corresponding MCD state bit is set.
2. If Multi-Core Debug[DExcC] is set, all of Debug[DDBSImpr, DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] will be clear, and Debug[DExcCode] will contain a valid code. (This is the case for a debug mode exception.)
3. If none of Debug[DDBSImpr, DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] are set, then the exception was either due to MCD*, or Multi-Core Debug[DExcC] being set and Debug[DExcCode] is valid.
4. No more than one of Debug[DIB, DDBS, DDBL, DBp, DSS] can be set.
5. If Multi-Core Debug[DExcC] is clear, any combination of Debug[DDBLImpr, DINT] may be set.
6. At least one of Debug[DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] and Multi-Core Debug[MCD0, MCD1, MCD2, DExcC] will be set.

- While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
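As an informal illustration of the Table 2 rules above (not part of the patent text), the following Python sketch models the Debug and Multi-Core Debug registers as sets of bit names. The set-based representation and the function names `check_rules` and `exception_causes` are assumptions made for this example; only the bit names are taken from Table 2.

```python
# Bit names come from Table 2; modelling registers as sets is an assumption.
DEBUG_CAUSE_BITS = frozenset(
    {"DDBSImpr", "DDBLImpr", "DINT", "DIB", "DDBS", "DDBL", "DBp", "DSS"}
)
EXCLUSIVE_BITS = frozenset({"DIB", "DDBS", "DDBL", "DBp", "DSS"})  # rule 4
RULE6_DEBUG_BITS = DEBUG_CAUSE_BITS - {"DDBSImpr"}  # rule 6 as written omits DDBSImpr
MCD_BITS = frozenset({"MCD0", "MCD1", "MCD2"})


def check_rules(debug, mcd):
    """Return the Table 2 rule numbers that a register state violates.

    `debug` holds the set bits of the Debug register, `mcd` the set bits
    of the Multi-Core Debug register (hypothetical representation).
    """
    debug, mcd = set(debug), set(mcd)
    violations = []
    # Rule 2: DExcC set implies all listed Debug cause bits are clear.
    if "DExcC" in mcd and debug & DEBUG_CAUSE_BITS:
        violations.append(2)
    # Rule 4: no more than one of DIB, DDBS, DDBL, DBp, DSS.
    if len(debug & EXCLUSIVE_BITS) > 1:
        violations.append(4)
    # Rule 6: at least one cause bit is set in either register.
    if not (debug & RULE6_DEBUG_BITS) and not (mcd & (MCD_BITS | {"DExcC"})):
        violations.append(6)
    # Rules 1 and 5 are permissive and rule 3 is an inference, so none of
    # them can be violated by a register state.
    return violations


def exception_causes(debug, mcd):
    """List every cause a handler would report after reading both registers."""
    debug, mcd = set(debug), set(mcd)
    causes = sorted(debug & DEBUG_CAUSE_BITS)
    causes += sorted(mcd & MCD_BITS)  # rule 1: MCD bits may be set at any time
    if "DExcC" in mcd:
        causes.append("DExcC")  # rule 3: Debug[DExcCode] would then be valid
    return causes
```

A handler written along these lines would read both registers once, call something like `exception_causes`, and service each reported cause in turn; `check_rules` is only a sanity check on the register state.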
Claims (23)
1. A multi-core processor comprising:
a plurality of independent processor cores, each processor core executing instructions and operating in parallel to perform work;
each of the plurality of independent processor cores respectively including:
an interrupt-signal sensor; and
an interrupt-signal generator selectively providing an interrupt signal; and
a global interrupt-signal interconnect in electrical communication with each of the plurality of independent processor cores, more than one of the processor cores respectively interrupting its execution of instructions substantially simultaneously responsive to sampling with respective interrupt-signal sensors an interrupt signal on the global interrupt-signal interconnect.
2. The multi-core processor of claim 1, wherein the respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect.
3. The multi-core processor of claim 2, wherein the respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect in a wired-OR configuration.
4. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to a write to a register in one of the plurality of independent processor cores.
5. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to execution of a debug breakpoint instruction in one of the plurality of independent processor cores.
6. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to detection of an instruction or data breakpoint match in one of the plurality of independent processor cores.
7. The multi-core processor of claim 1, wherein the global interrupt-signal interconnect comprises a plurality of independent global interrupt-signal interconnects, each of the independent global interrupt-signal interconnects representing a respective interrupt signal.
8. The multi-core processor of claim 1, further comprising a trace buffer coupled to the global interrupt-signal interconnect, the trace buffer being configured to monitor memory transactions of the independent processor cores in response to an interrupt signal on the global interrupt-signal interconnect.
9. The multi-core processor of claim 1, wherein each of the plurality of independent processor cores comprises a respective register storing information, the register configurable according to the sampled interrupt signal.
10. The multi-core processor of claim 1, further comprising a core-processor clock signal for coordinating execution of the instructions, wherein the interrupt-signal sensor samples the global interrupt-signal interconnect during each cycle of the core-processor clock signal.
11. The multi-core processor of claim 10, wherein each processor core respectively interrupts its execution of instructions within three core-processor clock cycles of sampling an interrupt signal on the global interrupt-signal interconnect.
12. The multi-core processor of claim 1, wherein the global interrupt-signal interconnect is used to communicate after the plurality of processor cores are interrupted.
13. A method of debugging a multi-core processor comprising the steps of:
selectively providing an interrupt signal on a global interrupt-signal interconnect, the global interrupt-signal interconnect coupled to each of a plurality of processor cores comprising the multi-core processor;
sampling the provided interrupt signal at each of the plurality of processor cores; and
interrupting execution of more than one of the plurality of processor cores substantially simultaneously responsive to the sensed interrupt signal.
14. The method of claim 13, wherein the interrupt signal is selectively provided by one of the plurality of processor cores.
15. The method of claim 14, wherein the interrupt signal is provided in response to software control.
16. The method of claim 15, wherein the software control comprises software writing a value to a register.
17. The method of claim 14, wherein the interrupt signal is provided in response to execution of a debug breakpoint instruction.
18. The method of claim 14, wherein the interrupt signal is provided in response to a breakpoint match.
19. The method of claim 13, further comprising entering a debug handler routine at each of the interrupted processor cores.
20. The method of claim 19, wherein each of the interrupted processor cores communicates with an external device responsive to entering the debug handler routine.
21. The method of claim 20, wherein each of the plurality of processor cores communicates with the external device using a Joint Test Action Group (JTAG) test access port.
22. The method of claim 20, further comprising using the global interrupt-signal interconnect to communicate after the plurality of processor cores are interrupted.
23. A multi-core processor comprising:
means for selectively providing an interrupt signal on a global interrupt-signal interconnect, the global interrupt-signal interconnect coupled to each of a plurality of processor cores comprising the multi-core processor;
means for sensing the provided interrupt signal at each of the plurality of processor cores; and
means for interrupting execution of more than one of the plurality of processors substantially simultaneously responsive to a sensed interrupt signal.
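The following Python sketch is an informal illustration only and forms no part of the claims. It models the provide, sample, and interrupt steps on a single global interconnect driven in a wired-OR fashion. The `Core` class, its field names, the one-cycle pulse width, and the per-core `masked` flag (taken from the description rather than from the claims) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical model of one processor core."""
    name: str
    masked: bool = False    # a masked core ignores the global signal
    driving: bool = False   # output of the interrupt-signal generator
    in_debug: bool = False  # True once execution has been interrupted

    def sample(self, wire_level: bool) -> None:
        # Interrupt-signal sensor: sample the shared wire once per cycle.
        if wire_level and not self.masked:
            self.in_debug = True


def wired_or(cores) -> bool:
    # The shared wire is asserted if any core drives it.
    return any(core.driving for core in cores)


def clock_cycle(cores) -> bool:
    """Advance one clock: resolve the wire, let every core sample it."""
    level = wired_or(cores)
    for core in cores:
        core.sample(level)
    for core in cores:
        core.driving = False  # assumption: a pulse lasts one cycle
    return level


cores = [Core("core0"), Core("core1", masked=True), Core("core2")]
cores[2].driving = True  # e.g. core2 hits a breakpoint and pulses the wire
clock_cycle(cores)
interrupted = [core.name for core in cores if core.in_debug]
```

In this model every unmasked core, including the one that raised the signal, stops in the same cycle, which is the sense of "substantially simultaneously" the sketch is meant to convey.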
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/042,476 US20060059286A1 (en) | 2004-09-10 | 2005-01-25 | Multi-core debugger |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US60921104P | 2004-09-10 | 2004-09-10 | |
US11/042,476 US20060059286A1 (en) | 2004-09-10 | 2005-01-25 | Multi-core debugger |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060059286A1 (en) | 2006-03-16 |
Family
ID=38731731
Family Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/015,343 Active US7941585B2 (en) | 2004-09-10 | 2004-12-17 | Local scratchpad and data caching system |
US11/030,010 Abandoned US20060059316A1 (en) | 2004-09-10 | 2005-01-05 | Method and apparatus for managing write back cache |
US11/042,476 Abandoned US20060059286A1 (en) | 2004-09-10 | 2005-01-25 | Multi-core debugger |
US14/159,210 Active US9141548B2 (en) | 2004-09-10 | 2014-01-20 | Method and apparatus for managing write back cache |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/015,343 Active US7941585B2 (en) | 2004-09-10 | 2004-12-17 | Local scratchpad and data caching system |
US11/030,010 Abandoned US20060059316A1 (en) | 2004-09-10 | 2005-01-05 | Method and apparatus for managing write back cache |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/159,210 Active US9141548B2 (en) | 2004-09-10 | 2014-01-20 | Method and apparatus for managing write back cache |
Country Status (2)
Country | Link |
---|---|
US (4) | US7941585B2 (en) |
CN (5) | CN100533372C (en) |
Families Citing this family (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7941585B2 (en) * | 2004-09-10 | 2011-05-10 | Cavium Networks, Inc. | Local scratchpad and data caching system |
US7594081B2 (en) * | 2004-09-10 | 2009-09-22 | Cavium Networks, Inc. | Direct access to low-latency memory |
EP1794979B1 (en) * | 2004-09-10 | 2017-04-12 | Cavium, Inc. | Selective replication of data structure |
US8316431B2 (en) * | 2004-10-12 | 2012-11-20 | Canon Kabushiki Kaisha | Concurrent IPsec processing system and method |
US20070067567A1 (en) * | 2005-09-19 | 2007-03-22 | Via Technologies, Inc. | Merging entries in processor caches |
US20080282034A1 (en) * | 2005-09-19 | 2008-11-13 | Via Technologies, Inc. | Memory Subsystem having a Multipurpose Cache for a Stream Graphics Multiprocessor |
US8079084B1 (en) | 2007-08-10 | 2011-12-13 | Fortinet, Inc. | Virus co-processor instructions and methods for using such |
US8375449B1 (en) | 2007-08-10 | 2013-02-12 | Fortinet, Inc. | Circuits and methods for operating a virus co-processor |
US8286246B2 (en) * | 2007-08-10 | 2012-10-09 | Fortinet, Inc. | Circuits and methods for efficient data transfer in a virus co-processing system |
US7836283B2 (en) * | 2007-08-31 | 2010-11-16 | Freescale Semiconductor, Inc. | Data acquisition messaging using special purpose registers |
US20090106501A1 (en) * | 2007-10-17 | 2009-04-23 | Broadcom Corporation | Data cache management mechanism for packet forwarding |
CN101272334B (en) * | 2008-03-19 | 2010-11-10 | 杭州华三通信技术有限公司 | Method, device and equipment for processing QoS service by multi-core CPU |
CN101282303B (en) * | 2008-05-19 | 2010-09-22 | 杭州华三通信技术有限公司 | Method and apparatus for processing service packet |
JP5202130B2 (en) * | 2008-06-24 | 2013-06-05 | 株式会社東芝 | Cache memory, computer system, and memory access method |
CN101299194B (en) * | 2008-06-26 | 2010-04-07 | 上海交通大学 | Heterogeneous multi-core system thread-level dynamic dispatching method based on configurable processor |
US8041899B2 (en) * | 2008-07-29 | 2011-10-18 | Freescale Semiconductor, Inc. | System and method for fetching information to a cache module using a write back allocate algorithm |
US8996812B2 (en) * | 2009-06-19 | 2015-03-31 | International Business Machines Corporation | Write-back coherency data cache for resolving read/write conflicts |
US8595425B2 (en) * | 2009-09-25 | 2013-11-26 | Nvidia Corporation | Configurable cache for multiple clients |
DK2507951T5 (en) * | 2009-12-04 | 2013-12-02 | Napatech As | DEVICE AND PROCEDURE FOR RECEIVING AND STORING DATA PACKAGES MANAGED BY A CENTRAL CONTROLLER |
US8850404B2 (en) * | 2009-12-23 | 2014-09-30 | Intel Corporation | Relational modeling for performance analysis of multi-core processors using virtual tasks |
CN102141905B (en) * | 2010-01-29 | 2015-02-25 | 上海芯豪微电子有限公司 | Processor system structure |
CN101840328B (en) * | 2010-04-15 | 2014-05-07 | 华为技术有限公司 | Data processing method, system and related equipment |
US8683128B2 (en) | 2010-05-07 | 2014-03-25 | International Business Machines Corporation | Memory bus write prioritization |
US8838901B2 (en) | 2010-05-07 | 2014-09-16 | International Business Machines Corporation | Coordinated writeback of dirty cachelines |
CN102279802A (en) * | 2010-06-13 | 2011-12-14 | 中兴通讯股份有限公司 | Method and device for increasing reading operation efficiency of synchronous dynamic random storage controller |
CN102346661A (en) * | 2010-07-30 | 2012-02-08 | 国际商业机器公司 | Method and system for state maintenance of request queue of hardware accelerator |
US8661227B2 (en) * | 2010-09-17 | 2014-02-25 | International Business Machines Corporation | Multi-level register file supporting multiple threads |
CN102149207B (en) * | 2011-04-02 | 2013-06-19 | 天津大学 | Access point (AP) scheduling method for improving short-term fairness of transmission control protocol (TCP) in wireless local area network (WLAN) |
US20120297147A1 (en) * | 2011-05-20 | 2012-11-22 | Nokia Corporation | Caching Operations for a Non-Volatile Memory Array |
US9936209B2 (en) * | 2011-08-11 | 2018-04-03 | The Quantum Group, Inc. | System and method for slice processing computer-related tasks |
US8898244B2 (en) * | 2011-10-20 | 2014-11-25 | Allen Miglore | System and method for transporting files between networked or connected systems and devices |
US8473658B2 (en) | 2011-10-25 | 2013-06-25 | Cavium, Inc. | Input output bridging |
US8560757B2 (en) * | 2011-10-25 | 2013-10-15 | Cavium, Inc. | System and method to reduce memory access latencies using selective replication across multiple memory ports |
US8850125B2 (en) | 2011-10-25 | 2014-09-30 | Cavium, Inc. | System and method to provide non-coherent access to a coherent memory system |
US9330002B2 (en) * | 2011-10-31 | 2016-05-03 | Cavium, Inc. | Multi-core interconnect in a network processor |
FR2982683B1 (en) * | 2011-11-10 | 2014-01-03 | Sagem Defense Securite | SEQUENCING METHOD ON A MULTICORE PROCESSOR |
US9336000B2 (en) * | 2011-12-23 | 2016-05-10 | Intel Corporation | Instruction execution unit that broadcasts data values at different levels of granularity |
WO2013095618A1 (en) | 2011-12-23 | 2013-06-27 | Intel Corporation | Instruction execution that broadcasts and masks data values at different levels of granularity |
US8693490B1 (en) * | 2012-12-20 | 2014-04-08 | Unbound Networks, Inc. | Parallel processing using multi-core processor |
US9274826B2 (en) * | 2012-12-28 | 2016-03-01 | Futurewei Technologies, Inc. | Methods for task scheduling through locking and unlocking an ingress queue and a task queue |
US9507563B2 (en) | 2013-08-30 | 2016-11-29 | Cavium, Inc. | System and method to traverse a non-deterministic finite automata (NFA) graph generated for regular expression patterns with advanced features |
US9811467B2 (en) * | 2014-02-03 | 2017-11-07 | Cavium, Inc. | Method and an apparatus for pre-fetching and processing work for procesor cores in a network processor |
US9431105B2 (en) | 2014-02-26 | 2016-08-30 | Cavium, Inc. | Method and apparatus for memory access management |
US10110558B2 (en) | 2014-04-14 | 2018-10-23 | Cavium, Inc. | Processing of finite automata based on memory hierarchy |
US10002326B2 (en) * | 2014-04-14 | 2018-06-19 | Cavium, Inc. | Compilation of finite automata based on memory hierarchy |
US8947817B1 (en) | 2014-04-28 | 2015-02-03 | Seagate Technology Llc | Storage system with media scratch pad |
US9443553B2 (en) | 2014-04-28 | 2016-09-13 | Seagate Technology Llc | Storage system with multiple media scratch pads |
US9971686B2 (en) | 2015-02-23 | 2018-05-15 | Intel Corporation | Vector cache line write back processors, methods, systems, and instructions |
GB2540948B (en) * | 2015-07-31 | 2021-09-15 | Advanced Risc Mach Ltd | Apparatus with reduced hardware register set |
CN105072050A (en) * | 2015-08-26 | 2015-11-18 | 联想(北京)有限公司 | Data transmission method and data transmission device |
US10303372B2 (en) | 2015-12-01 | 2019-05-28 | Samsung Electronics Co., Ltd. | Nonvolatile memory device and operation method thereof |
US10223295B2 (en) * | 2016-03-10 | 2019-03-05 | Microsoft Technology Licensing, Llc | Protected pointers |
CN107315563B (en) * | 2016-04-26 | 2020-08-07 | 中科寒武纪科技股份有限公司 | Apparatus and method for performing vector compare operations |
US11853244B2 (en) * | 2017-01-26 | 2023-12-26 | Wisconsin Alumni Research Foundation | Reconfigurable computer accelerator providing stream processor and dataflow processor |
US10740256B2 (en) * | 2017-05-23 | 2020-08-11 | Marvell Asia Pte, Ltd. | Re-ordering buffer for a digital multi-processor system with configurable, scalable, distributed job manager |
CN109542348B (en) * | 2018-11-19 | 2022-05-10 | 郑州云海信息技术有限公司 | Data brushing method and device |
US11243883B2 (en) * | 2019-05-24 | 2022-02-08 | Texas Instruments Incorporated | Cache coherence shared state suppression |
CN110262888B (en) * | 2019-06-26 | 2020-11-20 | 京东数字科技控股有限公司 | Task scheduling method and device and method and device for computing node to execute task |
DE102020127704A1 (en) | 2019-10-29 | 2021-04-29 | Nvidia Corporation | TECHNIQUES FOR EFFICIENT TRANSFER OF DATA TO A PROCESSOR |
US11080051B2 (en) | 2019-10-29 | 2021-08-03 | Nvidia Corporation | Techniques for efficiently transferring data to a processor |
CN111045960B (en) * | 2019-11-21 | 2023-06-13 | 中国航空工业集团公司西安航空计算技术研究所 | Cache circuit for multi-pixel format storage |
US11341066B2 (en) * | 2019-12-12 | 2022-05-24 | Electronics And Telecommunications Research Institute | Cache for artificial intelligence processor |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5193187A (en) * | 1989-12-29 | 1993-03-09 | Supercomputer Systems Limited Partnership | Fast interrupt mechanism for interrupting processors in parallel in a multiprocessor system wherein processors are assigned process ID numbers |
US5613128A (en) * | 1990-12-21 | 1997-03-18 | Intel Corporation | Programmable multi-processor interrupt controller system with a processor integrated local interrupt controller |
US5778236A (en) * | 1996-05-17 | 1998-07-07 | Advanced Micro Devices, Inc. | Multiprocessing interrupt controller on I/O bus |
US5848164A (en) * | 1996-04-30 | 1998-12-08 | The Board Of Trustees Of The Leland Stanford Junior University | System and method for effects processing on audio subband data |
US6115763A (en) * | 1998-03-05 | 2000-09-05 | International Business Machines Corporation | Multi-core chip providing external core access with regular operation function interface and predetermined service operation services interface comprising core interface units and masters interface unit |
US20020029358A1 (en) * | 2000-05-31 | 2002-03-07 | Pawlowski Chester W. | Method and apparatus for delivering error interrupts to a processor of a modular, multiprocessor system |
US6496880B1 (en) * | 1999-08-26 | 2002-12-17 | Agere Systems Inc. | Shared I/O ports for multi-core designs |
US6539522B1 (en) * | 2000-01-31 | 2003-03-25 | International Business Machines Corporation | Method of developing re-usable software for efficient verification of system-on-chip integrated circuit designs |
US6598178B1 (en) * | 1999-06-01 | 2003-07-22 | Agere Systems Inc. | Peripheral breakpoint signaler |
US6675284B1 (en) * | 1998-08-21 | 2004-01-06 | Stmicroelectronics Limited | Integrated circuit with multiple processing cores |
US6718294B1 (en) * | 2000-05-16 | 2004-04-06 | Mindspeed Technologies, Inc. | System and method for synchronized control of system simulators with multiple processor cores |
US20040264077A1 (en) * | 2002-10-02 | 2004-12-30 | Dejan Radosavljevic | Protective device with end of life indicator |
Family Cites Families (96)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4415970A (en) | 1980-11-14 | 1983-11-15 | Sperry Corporation | Cache/disk subsystem with load equalization |
JPS5969826A (en) | 1982-10-15 | 1984-04-20 | Hitachi Ltd | Controlling method of buffer |
US4755930A (en) | 1985-06-27 | 1988-07-05 | Encore Computer Corporation | Hierarchical cache memory system and method |
US5091846A (en) | 1986-10-03 | 1992-02-25 | Intergraph Corporation | Cache providing caching/non-caching write-through and copyback modes for virtual addresses and including bus snooping to maintain coherency |
US5155831A (en) | 1989-04-24 | 1992-10-13 | International Business Machines Corporation | Data processing system with fast queue store interposed between store-through caches and a main memory |
US5119485A (en) | 1989-05-15 | 1992-06-02 | Motorola, Inc. | Method for data bus snooping in a data processing system by selective concurrent read and invalidate cache operation |
US5471593A (en) * | 1989-12-11 | 1995-11-28 | Branigin; Michael H. | Computer processor with an efficient means of executing many instructions simultaneously |
US5432918A (en) * | 1990-06-29 | 1995-07-11 | Digital Equipment Corporation | Method and apparatus for ordering read and write operations using conflict bits in a write queue |
US5404483A (en) | 1990-06-29 | 1995-04-04 | Digital Equipment Corporation | Processor and method for delaying the processing of cache coherency transactions during outstanding cache fills |
US5347648A (en) * | 1990-06-29 | 1994-09-13 | Digital Equipment Corporation | Ensuring write ordering under writeback cache error conditions |
US5404482A (en) * | 1990-06-29 | 1995-04-04 | Digital Equipment Corporation | Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills |
US5276852A (en) | 1990-10-01 | 1994-01-04 | Digital Equipment Corporation | Method and apparatus for controlling a processor bus used by multiple processor components during writeback cache transactions |
US6446164B1 (en) * | 1991-06-27 | 2002-09-03 | Integrated Device Technology, Inc. | Test mode accessing of an internal cache memory |
US5408644A (en) | 1992-06-05 | 1995-04-18 | Compaq Computer Corporation | Method and apparatus for improving the performance of partial stripe operations in a disk array subsystem |
US5590368A (en) | 1993-03-31 | 1996-12-31 | Intel Corporation | Method and apparatus for dynamically expanding the pipeline of a microprocessor |
US5623633A (en) * | 1993-07-27 | 1997-04-22 | Dell Usa, L.P. | Cache-based computer system employing a snoop control circuit with write-back suppression |
US5551006A (en) | 1993-09-30 | 1996-08-27 | Intel Corporation | Low cost writethrough cache coherency apparatus and method for computer systems without a cache supporting bus |
US5509129A (en) * | 1993-11-30 | 1996-04-16 | Guttag; Karl M. | Long instruction word controlling plural independent processor operations |
US5623627A (en) * | 1993-12-09 | 1997-04-22 | Advanced Micro Devices, Inc. | Computer memory architecture including a replacement cache |
US5754819A (en) | 1994-07-28 | 1998-05-19 | Sun Microsystems, Inc. | Low-latency memory indexing method and structure |
GB2292822A (en) * | 1994-08-31 | 1996-03-06 | Hewlett Packard Co | Partitioned cache memory |
US5619680A (en) | 1994-11-25 | 1997-04-08 | Berkovich; Semyon | Methods and apparatus for concurrent execution of serial computing instructions using combinatorial architecture for program partitioning |
JPH08278916A (en) | 1994-11-30 | 1996-10-22 | Hitachi Ltd | Multichannel memory system, transfer information synchronizing method, and signal transfer circuit |
JP3872118B2 (en) | 1995-03-20 | 2007-01-24 | 富士通株式会社 | Cache coherence device |
US5737547A (en) | 1995-06-07 | 1998-04-07 | Microunity Systems Engineering, Inc. | System for placing entries of an outstanding processor request into a free pool after the request is accepted by a corresponding peripheral device |
US5742840A (en) * | 1995-08-16 | 1998-04-21 | Microunity Systems Engineering, Inc. | General purpose, multiple precision parallel operation, programmable media processor |
US6598136B1 (en) * | 1995-10-06 | 2003-07-22 | National Semiconductor Corporation | Data transfer with highly granular cacheability control between memory and a scratchpad area |
WO1997027539A1 (en) * | 1996-01-24 | 1997-07-31 | Sun Microsystems, Inc. | Methods and apparatuses for stack caching |
US6021473A (en) | 1996-08-27 | 2000-02-01 | Vlsi Technology, Inc. | Method and apparatus for maintaining coherency for data transaction of CPU and bus device utilizing selective flushing mechanism |
US5897656A (en) | 1996-09-16 | 1999-04-27 | Corollary, Inc. | System and method for maintaining memory coherency in a computer system having multiple system buses |
US5860158A (en) | 1996-11-15 | 1999-01-12 | Samsung Electronics Company, Ltd. | Cache control unit with a cache request transaction-oriented protocol |
US6134634A (en) | 1996-12-20 | 2000-10-17 | Texas Instruments Incorporated | Method and apparatus for preemptive cache write-back |
US5895485A (en) | 1997-02-24 | 1999-04-20 | Eccs, Inc. | Method and device using a redundant cache for preventing the loss of dirty data |
JP3849951B2 (en) | 1997-02-27 | 2006-11-22 | 株式会社日立製作所 | Main memory shared multiprocessor |
US6018792A (en) | 1997-07-02 | 2000-01-25 | Micron Electronics, Inc. | Apparatus for performing a low latency memory read with concurrent snoop |
US5991855A (en) | 1997-07-02 | 1999-11-23 | Micron Electronics, Inc. | Low latency memory read with concurrent pipe lined snoops |
US6009263A (en) * | 1997-07-28 | 1999-12-28 | Institute For The Development Of Emerging Architectures, L.L.C. | Emulating agent and method for reformatting computer instructions into a standard uniform format |
US6760833B1 (en) | 1997-08-01 | 2004-07-06 | Micron Technology, Inc. | Split embedded DRAM processor |
US7076568B2 (en) | 1997-10-14 | 2006-07-11 | Alacritech, Inc. | Data communication apparatus for computer intelligent network interface card which transfers data between a network and a storage device according designated uniform datagram protocol socket |
US6070227A (en) | 1997-10-31 | 2000-05-30 | Hewlett-Packard Company | Main memory bank indexing scheme that optimizes consecutive page hits by linking main memory bank address organization to cache memory address organization |
US6026475A (en) * | 1997-11-26 | 2000-02-15 | Digital Equipment Corporation | Method for dynamically remapping a virtual address to a physical address to maintain an even distribution of cache page addresses in a virtual address space |
US6253311B1 (en) * | 1997-11-29 | 2001-06-26 | Jp First Llc | Instruction set for bi-directional conversion and transfer of integer and floating point data |
US6560680B2 (en) | 1998-01-21 | 2003-05-06 | Micron Technology, Inc. | System controller with Integrated low latency memory using non-cacheable memory physically distinct from main memory |
JP3751741B2 (en) | 1998-02-02 | 2006-03-01 | 日本電気株式会社 | Multiprocessor system |
US6643745B1 (en) | 1998-03-31 | 2003-11-04 | Intel Corporation | Method and apparatus for prefetching data into cache |
JP3708436B2 (en) | 1998-05-07 | 2005-10-19 | インフィネオン テクノロジース アクチエンゲゼルシャフト | Cache memory for 2D data fields |
TW501011B (en) | 1998-05-08 | 2002-09-01 | Koninkl Philips Electronics Nv | Data processing circuit with cache memory |
US20010054137A1 (en) * | 1998-06-10 | 2001-12-20 | Richard James Eickemeyer | Circuit arrangement and method with improved branch prefetching for short branch instructions |
US6483516B1 (en) * | 1998-10-09 | 2002-11-19 | National Semiconductor Corporation | Hierarchical texture cache |
US6718457B2 (en) | 1998-12-03 | 2004-04-06 | Sun Microsystems, Inc. | Multiple-thread processor for threaded software applications |
US6526481B1 (en) | 1998-12-17 | 2003-02-25 | Massachusetts Institute Of Technology | Adaptive cache coherence protocols |
US6563818B1 (en) | 1999-05-20 | 2003-05-13 | Advanced Micro Devices, Inc. | Weighted round robin cell architecture |
US6279080B1 (en) | 1999-06-09 | 2001-08-21 | Ati International Srl | Method and apparatus for association of memory locations with a cache location having a flush buffer |
US6188624B1 (en) | 1999-07-12 | 2001-02-13 | Winbond Electronics Corporation | Low latency memory sensing circuits |
US6606704B1 (en) | 1999-08-31 | 2003-08-12 | Intel Corporation | Parallel multithreaded processor with plural microengines executing multiple threads each microengine having loadable microcode |
US6401175B1 (en) | 1999-10-01 | 2002-06-04 | Sun Microsystems, Inc. | Shared write buffer for use by multiple processor units |
US6661794B1 (en) * | 1999-12-29 | 2003-12-09 | Intel Corporation | Method and apparatus for gigabit packet assignment for multithreaded packet processing |
US6438658B1 (en) | 2000-06-30 | 2002-08-20 | Intel Corporation | Fast invalidation scheme for caches |
US6654858B1 (en) | 2000-08-31 | 2003-11-25 | Hewlett-Packard Development Company, L.P. | Method for reducing directory writes and latency in a high performance, directory-based, coherency protocol |
US6665768B1 (en) | 2000-10-12 | 2003-12-16 | Chipwrights Design, Inc. | Table look-up operation for SIMD processors with interleaved memory systems |
US6587920B2 (en) | 2000-11-30 | 2003-07-01 | Mosaid Technologies Incorporated | Method and apparatus for reducing latency in a memory system |
US6662275B2 (en) | 2001-02-12 | 2003-12-09 | International Business Machines Corporation | Efficient instruction cache coherency maintenance mechanism for scalable multiprocessor computer system with store-through data cache |
US6647456B1 (en) | 2001-02-23 | 2003-11-11 | Nvidia Corporation | High bandwidth-low latency memory controller |
US6725336B2 (en) * | 2001-04-20 | 2004-04-20 | Sun Microsystems, Inc. | Dynamically allocated cache memory for a multi-processor unit |
US6785677B1 (en) | 2001-05-02 | 2004-08-31 | Unisys Corporation | Method for execution of query to search strings of characters that match pattern with a target string utilizing bit vector |
US7133971B2 (en) * | 2003-11-21 | 2006-11-07 | International Business Machines Corporation | Cache with selective least frequently used or most frequently used cache line replacement |
JP2002358782A (en) * | 2001-05-31 | 2002-12-13 | Nec Corp | Semiconductor memory |
GB2378779B (en) | 2001-08-14 | 2005-02-02 | Advanced Risc Mach Ltd | Accessing memory units in a data processing apparatus |
US6877071B2 (en) | 2001-08-20 | 2005-04-05 | Technology Ip Holdings, Inc. | Multi-ported memory |
US20030110208A1 (en) | 2001-09-12 | 2003-06-12 | Raqia Networks, Inc. | Processing data across packet boundaries |
US6757784B2 (en) | 2001-09-28 | 2004-06-29 | Intel Corporation | Hiding refresh of memory and refresh-hidden memory |
US7072970B2 (en) | 2001-10-05 | 2006-07-04 | International Business Machines Corporation | Programmable network protocol handler architecture |
US7248585B2 (en) | 2001-10-22 | 2007-07-24 | Sun Microsystems, Inc. | Method and apparatus for a packet classifier |
US6944731B2 (en) | 2001-12-19 | 2005-09-13 | Agere Systems Inc. | Dynamic random access memory system with bank conflict avoidance feature |
US6789167B2 (en) | 2002-03-06 | 2004-09-07 | Hewlett-Packard Development Company, L.P. | Method and apparatus for multi-core processor integrated circuit having functional elements configurable as core elements and as system device elements |
US7200735B2 (en) * | 2002-04-10 | 2007-04-03 | Tensilica, Inc. | High-performance hybrid processor with configurable execution units |
GB2388447B (en) * | 2002-05-09 | 2005-07-27 | Sun Microsystems Inc | A computer system method and program product for performing a data access from low-level code |
US6814374B2 (en) * | 2002-06-28 | 2004-11-09 | Delphi Technologies, Inc. | Steering column with foamed in-place structure |
CN1387119A (en) * | 2002-06-28 | 2002-12-25 | 西安交通大学 | Tree chain table for fast search of data and its generating algorithm |
US7493607B2 (en) * | 2002-07-09 | 2009-02-17 | Bluerisc Inc. | Statically speculative compilation and execution |
GB2390950A (en) * | 2002-07-17 | 2004-01-21 | Sony Uk Ltd | Video wipe generation based on the distance of a display position between a wipe origin and a wipe destination |
US6957305B2 (en) * | 2002-08-29 | 2005-10-18 | International Business Machines Corporation | Data streaming mechanism in a microprocessor |
US20040059880A1 (en) | 2002-09-23 | 2004-03-25 | Bennett Brian R. | Low latency memory access method using unified queue mechanism |
US7146643B2 (en) * | 2002-10-29 | 2006-12-05 | Lockheed Martin Corporation | Intrusion detection accelerator |
US7093153B1 (en) * | 2002-10-30 | 2006-08-15 | Advanced Micro Devices, Inc. | Method and apparatus for lowering bus clock frequency in a complex integrated data processing system |
US7055003B2 (en) * | 2003-04-25 | 2006-05-30 | International Business Machines Corporation | Data cache scrub mechanism for large L2/L3 data cache structures |
US20050138276A1 (en) | 2003-12-17 | 2005-06-23 | Intel Corporation | Methods and apparatus for high bandwidth random access using dynamic random access memory |
US7159068B2 (en) * | 2003-12-22 | 2007-01-02 | Phison Electronics Corp. | Method of optimizing performance of a flash memory |
US20050138297A1 (en) * | 2003-12-23 | 2005-06-23 | Intel Corporation | Register file cache |
US7380276B2 (en) * | 2004-05-20 | 2008-05-27 | Intel Corporation | Processor extensions and software verification to support type-safe language environments running with untrusted code |
US7353341B2 (en) * | 2004-06-03 | 2008-04-01 | International Business Machines Corporation | System and method for canceling write back operation during simultaneous snoop push or snoop kill operation in write back caches |
US7594081B2 (en) | 2004-09-10 | 2009-09-22 | Cavium Networks, Inc. | Direct access to low-latency memory |
US7941585B2 (en) * | 2004-09-10 | 2011-05-10 | Cavium Networks, Inc. | Local scratchpad and data caching system |
EP1794979B1 (en) | 2004-09-10 | 2017-04-12 | Cavium, Inc. | Selective replication of data structure |
US20060143396A1 (en) | 2004-12-29 | 2006-06-29 | Mason Cabot | Method for programmer-controlled cache line eviction policy |
US9304767B2 (en) * | 2009-06-02 | 2016-04-05 | Oracle America, Inc. | Single cycle data movement between general purpose and floating-point registers |
Filing date | Country | Application number | Publication | Status |
---|---|---|---|---|
2004-12-17 | US | US11/015,343 | US7941585B2 | Active |
2005-01-05 | US | US11/030,010 | US20060059316A1 | Abandoned |
2005-01-25 | US | US11/042,476 | US20060059286A1 | Abandoned |
2005-09-01 | CN | CNB2005800346066A | CN100533372C | Expired - Fee Related |
2005-09-01 | CN | CN2005800346009A | CN101069170B | Active |
2005-09-01 | CN | CN2005800334834A | CN101036117B | Expired - Fee Related |
2005-09-08 | CN | CN200580034214XA | CN101053234B | Expired - Fee Related |
2005-09-09 | CN | CN2005800304519A | CN101128804B | Expired - Fee Related |
2014-01-20 | US | US14/159,210 | US9141548B2 | Active |
Cited By (278)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060156099A1 (en) * | 2004-12-16 | 2006-07-13 | Sweet James D | Method and system of using a single EJTAG interface for multiple tap controllers |
US7650542B2 (en) * | 2004-12-16 | 2010-01-19 | Broadcom Corporation | Method and system of using a single EJTAG interface for multiple tap controllers |
US7549026B2 (en) * | 2005-03-30 | 2009-06-16 | Intel Corporation | Method and apparatus to provide dynamic hardware signal allocation in a processor |
US20060253660A1 (en) * | 2005-03-30 | 2006-11-09 | Intel Corporation | Method and apparatus to provide dynamic hardware signal allocation in a processor |
US9342468B2 (en) * | 2005-05-16 | 2016-05-17 | Texas Instruments Incorporated | Memory time stamp register external to first and second processors |
US20150026441A1 (en) * | 2005-05-16 | 2015-01-22 | Texas Instruments Incorporated | Method and system of inserting marking values used to correlate trace data as between processor cores |
US7840000B1 (en) * | 2005-07-25 | 2010-11-23 | Rockwell Collins, Inc. | High performance programmable cryptography system |
US8799687B2 (en) | 2005-12-30 | 2014-08-05 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including optimizing C-state selection under variable wakeup rates |
US8898528B2 (en) * | 2006-10-20 | 2014-11-25 | Texas Instruments Incoporated | DDR JTAG interface setting flip-flops in high state at power-up |
US10162003B2 (en) | 2006-10-20 | 2018-12-25 | Texas Instruments Incorporated | DDR TMS/TDI, addressable tap, state machine, and tap state monitor |
US20130254605A1 (en) * | 2006-10-20 | 2013-09-26 | Texas Instruments Incorporated | High speed double data rate jtag interface |
US9817071B2 (en) | 2006-10-20 | 2017-11-14 | Texas Instruments Incorporated | TDI/TMS DDR coupled JTAG domain with 6 preset flip flops |
US20080184150A1 (en) * | 2007-01-31 | 2008-07-31 | Marc Minato | Electronic circuit design analysis tool for multi-processor environments |
US9419867B2 (en) | 2007-03-30 | 2016-08-16 | Blue Coat Systems, Inc. | Data and control plane architecture for network application traffic management device |
US20080239956A1 (en) * | 2007-03-30 | 2008-10-02 | Packeteer, Inc. | Data and Control Plane Architecture for Network Application Traffic Management Device |
US8059532B2 (en) | 2007-06-21 | 2011-11-15 | Packeteer, Inc. | Data and control plane architecture including server-side triggered flow policy mechanism |
US20080316922A1 (en) * | 2007-06-21 | 2008-12-25 | Packeteer, Inc. | Data and Control Plane Architecture Including Server-Side Triggered Flow Policy Mechanism |
US7813277B2 (en) | 2007-06-29 | 2010-10-12 | Packeteer, Inc. | Lockless bandwidth management for multiprocessor networking devices |
US20100241831A1 (en) * | 2007-07-09 | 2010-09-23 | Hewlett-Packard Development Company, L.P. | Data packet processing method for a multi core processor |
WO2009008007A3 (en) * | 2007-07-09 | 2009-03-05 | Hewlett Packard Development Co | Data packet processing method for a multi core processor |
US8799547B2 (en) | 2007-07-09 | 2014-08-05 | Hewlett-Packard Development Company, L.P. | Data packet processing method for a multi core processor |
US20090083517A1 (en) * | 2007-09-25 | 2009-03-26 | Packeteer, Inc. | Lockless Processing of Command Operations in Multiprocessor Systems |
US8279885B2 (en) | 2007-09-25 | 2012-10-02 | Packeteer, Inc. | Lockless processing of command operations in multiprocessor systems |
US20090150695A1 (en) * | 2007-12-10 | 2009-06-11 | Justin Song | Predicting future power level states for processor cores |
US10261559B2 (en) | 2007-12-10 | 2019-04-16 | Intel Corporation | Predicting future power level states for processor cores |
US20090150696A1 (en) * | 2007-12-10 | 2009-06-11 | Justin Song | Transitioning a processor package to a low power state |
US8024590B2 (en) | 2007-12-10 | 2011-09-20 | Intel Corporation | Predicting future power level states for processor cores |
US9285855B2 (en) | 2007-12-10 | 2016-03-15 | Intel Corporation | Predicting future power level states for processor cores |
US8111707B2 (en) | 2007-12-20 | 2012-02-07 | Packeteer, Inc. | Compression mechanisms for control plane—data plane processing architectures |
US10024913B2 (en) | 2009-03-25 | 2018-07-17 | Texas Instruments Incorporated | Tap commandable data register control router inverted TCK, TMS/TDI imputs |
US20150113347A1 (en) * | 2009-03-25 | 2015-04-23 | Texas Instruments Incorporated | Commanded jtag test access port operations |
US9121905B2 (en) * | 2009-03-25 | 2015-09-01 | Texas Instruments Incorporated | TAP with commandable data register control router and routing circuit |
US8959396B2 (en) * | 2009-03-25 | 2015-02-17 | Texas Instruments Incorporated | Commandable data register control router connected to TCK and TDI |
US20100332909A1 (en) * | 2009-06-30 | 2010-12-30 | Texas Instruments Incorporated | Circuits, systems, apparatus and processes for monitoring activity in multi-processing systems |
US8407528B2 (en) | 2009-06-30 | 2013-03-26 | Texas Instruments Incorporated | Circuits, systems, apparatus and processes for monitoring activity in multi-processing systems |
US20120216017A1 (en) * | 2009-11-16 | 2012-08-23 | Fujitsu Limited | Parallel computing apparatus and parallel computing method |
US8549261B2 (en) * | 2009-11-16 | 2013-10-01 | Fujitsu Limited | Parallel computing apparatus and parallel computing method |
US9866463B2 (en) | 2009-12-23 | 2018-01-09 | Citrix Systems, Inc. | Systems and methods for object rate limiting in multi-core system |
US20110153724A1 (en) * | 2009-12-23 | 2011-06-23 | Murali Raja | Systems and methods for object rate limiting in multi-core system |
US8452835B2 (en) * | 2009-12-23 | 2013-05-28 | Citrix Systems, Inc. | Systems and methods for object rate limiting in multi-core system |
US20110161630A1 (en) * | 2009-12-28 | 2011-06-30 | Raasch Steven E | General purpose hardware to replace faulty core components that may also provide additional processor functionality |
US8914672B2 (en) * | 2009-12-28 | 2014-12-16 | Intel Corporation | General purpose hardware to replace faulty core components that may also provide additional processor functionality |
US8112677B2 (en) | 2010-02-26 | 2012-02-07 | UltraSoC Technologies Limited | Method of debugging multiple processes |
US20110214023A1 (en) * | 2010-02-26 | 2011-09-01 | UltraSoC Technologies Limited | Method of Debugging Multiple Processes |
US20150346935A1 (en) * | 2010-03-09 | 2015-12-03 | Avistar Communications Corporation | Scalable high-performance interactive real-time media architectures for virtual desktop environments |
US11269008B2 (en) | 2010-03-10 | 2022-03-08 | Texas Instruments Incorporated | Commanded JTAG test access port operations |
US11604222B2 (en) | 2010-03-10 | 2023-03-14 | Texas Instruments Incorporated | Commanded JTAG test access port operations |
US20110225456A1 (en) * | 2010-03-10 | 2011-09-15 | Texas Instruments Incorporated | Commanded jtag test access port operations |
US10634719B2 (en) | 2010-03-10 | 2020-04-28 | Texas Instruments Incorporated | Commandable data register control router including input coupled to TDI |
US8572433B2 (en) * | 2010-03-10 | 2013-10-29 | Texas Instruments Incorporated | JTAG IC with commandable circuit controlling data register control router |
US20110307741A1 (en) * | 2010-06-15 | 2011-12-15 | National Chung Cheng University | Non-intrusive debugging framework for parallel software based on super multi-core framework |
US9983660B2 (en) | 2010-09-23 | 2018-05-29 | Intel Corporation | Providing per core voltage and frequency control |
US10613620B2 (en) | 2010-09-23 | 2020-04-07 | Intel Corporation | Providing per core voltage and frequency control |
US9983659B2 (en) | 2010-09-23 | 2018-05-29 | Intel Corporation | Providing per core voltage and frequency control |
US8943334B2 (en) | 2010-09-23 | 2015-01-27 | Intel Corporation | Providing per core voltage and frequency control |
US9983661B2 (en) | 2010-09-23 | 2018-05-29 | Intel Corporation | Providing per core voltage and frequency control |
US9939884B2 (en) | 2010-09-23 | 2018-04-10 | Intel Corporation | Providing per core voltage and frequency control |
US9348387B2 (en) | 2010-09-23 | 2016-05-24 | Intel Corporation | Providing per core voltage and frequency control |
US9032226B2 (en) | 2010-09-23 | 2015-05-12 | Intel Corporation | Providing per core voltage and frequency control |
KR101498452B1 (en) * | 2010-12-22 | 2015-03-04 | 인텔 코포레이션 | Debugging complex multi-core and multi-socket systems |
CN103270504A (en) * | 2010-12-22 | 2013-08-28 | 英特尔公司 | Debugging complex multi-core and multi-socket systems |
WO2012087894A3 (en) * | 2010-12-22 | 2013-01-03 | Intel Corporation | Debugging complex multi-core and multi-socket systems |
US8782468B2 (en) * | 2010-12-22 | 2014-07-15 | Intel Corporation | Methods and tools to debug complex multi-core, multi-socket QPI based system |
US20120166882A1 (en) * | 2010-12-22 | 2012-06-28 | Binata Bhattacharyya | Methods and tools to debug complex multi-core, multi-socket qpi based system |
WO2012087894A2 (en) * | 2010-12-22 | 2012-06-28 | Intel Corporation | Debugging complex multi-core and multi-socket systems |
US9075614B2 (en) | 2011-03-21 | 2015-07-07 | Intel Corporation | Managing power consumption in a multi-core processor |
US9069555B2 (en) | 2011-03-21 | 2015-06-30 | Intel Corporation | Managing power consumption in a multi-core processor |
US9846625B2 (en) * | 2011-05-16 | 2017-12-19 | Dawning Information Industry Co., Ltd. | Method and device for debugging a MIPS-structure CPU with southbridge and northbridge chipsets |
US20140157051A1 (en) * | 2011-05-16 | 2014-06-05 | Zongyou Shao | Method and device for debugging a mips-structure cpu with southbridge and northbridge chipsets |
US8683240B2 (en) | 2011-06-27 | 2014-03-25 | Intel Corporation | Increasing power efficiency of turbo mode operation in a processor |
US8793515B2 (en) | 2011-06-27 | 2014-07-29 | Intel Corporation | Increasing power efficiency of turbo mode operation in a processor |
US8904205B2 (en) | 2011-06-27 | 2014-12-02 | Intel Corporation | Increasing power efficiency of turbo mode operation in a processor |
US9081557B2 (en) | 2011-09-06 | 2015-07-14 | Intel Corporation | Dynamically allocating a power budget over multiple domains of a processor |
US8775833B2 (en) | 2011-09-06 | 2014-07-08 | Intel Corporation | Dynamically allocating a power budget over multiple domains of a processor |
US8769316B2 (en) | 2011-09-06 | 2014-07-01 | Intel Corporation | Dynamically allocating a power budget over multiple domains of a processor |
US8688883B2 (en) | 2011-09-08 | 2014-04-01 | Intel Corporation | Increasing turbo mode residency of a processor |
US9032126B2 (en) | 2011-09-08 | 2015-05-12 | Intel Corporation | Increasing turbo mode residency of a processor |
US9032125B2 (en) | 2011-09-08 | 2015-05-12 | Intel Corporation | Increasing turbo mode residency of a processor |
US8954770B2 (en) | 2011-09-28 | 2015-02-10 | Intel Corporation | Controlling temperature of multiple domains of a multi-domain processor using a cross domain margin |
US9074947B2 (en) | 2011-09-28 | 2015-07-07 | Intel Corporation | Estimating temperature of a processor core in a low power state without thermal sensor information |
US8914650B2 (en) | 2011-09-28 | 2014-12-16 | Intel Corporation | Dynamically adjusting power of non-core processor circuitry including buffer circuitry |
US9235254B2 (en) | 2011-09-28 | 2016-01-12 | Intel Corporation | Controlling temperature of multiple domains of a multi-domain processor using a cross-domain margin |
US9501129B2 (en) | 2011-09-28 | 2016-11-22 | Intel Corporation | Dynamically adjusting power of non-core processor circuitry including buffer circuitry |
US9354692B2 (en) | 2011-10-27 | 2016-05-31 | Intel Corporation | Enabling a non-core domain to control memory bandwidth in a processor |
US9026815B2 (en) | 2011-10-27 | 2015-05-05 | Intel Corporation | Controlling operating frequency of a core domain via a non-core domain of a multi-domain processor |
US9939879B2 (en) | 2011-10-27 | 2018-04-10 | Intel Corporation | Controlling operating frequency of a core domain via a non-core domain of a multi-domain processor |
US10248181B2 (en) | 2011-10-27 | 2019-04-02 | Intel Corporation | Enabling a non-core domain to control memory bandwidth in a processor |
US8832478B2 (en) | 2011-10-27 | 2014-09-09 | Intel Corporation | Enabling a non-core domain to control memory bandwidth in a processor |
US9176565B2 (en) | 2011-10-27 | 2015-11-03 | Intel Corporation | Controlling operating frequency of a core domain based on operating condition of a non-core domain of a multi-domain processor |
US10705588B2 (en) | 2011-10-27 | 2020-07-07 | Intel Corporation | Enabling a non-core domain to control memory bandwidth in a processor |
US10037067B2 (en) | 2011-10-27 | 2018-07-31 | Intel Corporation | Enabling a non-core domain to control memory bandwidth in a processor |
US10474218B2 (en) | 2011-10-31 | 2019-11-12 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US9158693B2 (en) | 2011-10-31 | 2015-10-13 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US10613614B2 (en) | 2011-10-31 | 2020-04-07 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US8943340B2 (en) | 2011-10-31 | 2015-01-27 | Intel Corporation | Controlling a turbo mode frequency of a processor |
US10564699B2 (en) | 2011-10-31 | 2020-02-18 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US9471490B2 (en) | 2011-10-31 | 2016-10-18 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US10067553B2 (en) | 2011-10-31 | 2018-09-04 | Intel Corporation | Dynamically controlling cache size to maximize energy efficiency |
US9618997B2 (en) | 2011-10-31 | 2017-04-11 | Intel Corporation | Controlling a turbo mode frequency of a processor |
US9292068B2 (en) | 2011-10-31 | 2016-03-22 | Intel Corporation | Controlling a turbo mode frequency of a processor |
US9753531B2 (en) | 2011-12-05 | 2017-09-05 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state |
US9239611B2 (en) | 2011-12-05 | 2016-01-19 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including balancing power among multi-frequency domains of a processor based on efficiency rating scheme |
US8972763B2 (en) | 2011-12-05 | 2015-03-03 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state |
US9052901B2 (en) | 2011-12-14 | 2015-06-09 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including configurable maximum processor current |
US9372524B2 (en) | 2011-12-15 | 2016-06-21 | Intel Corporation | Dynamically modifying a power/performance tradeoff based on processor utilization |
US9760409B2 (en) | 2011-12-15 | 2017-09-12 | Intel Corporation | Dynamically modifying a power/performance tradeoff based on a processor utilization |
US9170624B2 (en) | 2011-12-15 | 2015-10-27 | Intel Corporation | User level control of power management policies |
US9098261B2 (en) | 2011-12-15 | 2015-08-04 | Intel Corporation | User level control of power management policies |
US9535487B2 (en) | 2011-12-15 | 2017-01-03 | Intel Corporation | User level control of power management policies |
US10372197B2 (en) | 2011-12-15 | 2019-08-06 | Intel Corporation | User level control of power management policies |
US8996895B2 (en) | 2011-12-28 | 2015-03-31 | Intel Corporation | Method, apparatus, and system for energy efficiency and energy conservation including optimizing C-state selection under variable wakeup rates |
US9323316B2 (en) | 2012-03-13 | 2016-04-26 | Intel Corporation | Dynamically controlling interconnect frequency in a processor |
US9354689B2 (en) | 2012-03-13 | 2016-05-31 | Intel Corporation | Providing energy efficient turbo operation of a processor |
US9436245B2 (en) | 2012-03-13 | 2016-09-06 | Intel Corporation | Dynamically computing an electrical design point (EDP) for a multicore processor |
US9547027B2 (en) | 2012-03-30 | 2017-01-17 | Intel Corporation | Dynamically measuring power consumption in a processor |
US10185566B2 (en) | 2012-04-27 | 2019-01-22 | Intel Corporation | Migrating tasks between asymmetric computing elements of a multi-core processor |
US20140019644A1 (en) * | 2012-07-10 | 2014-01-16 | International Business Machines Corporation | Controlling A Plurality Of Serial Peripheral Interface ('SPI') Peripherals Using A Single Chip Select |
US9411770B2 (en) * | 2012-07-10 | 2016-08-09 | Lenovo Enterprise Solutions (Singapore) Pte. Ltd. | Controlling a plurality of serial peripheral interface (‘SPI’) peripherals using a single chip select |
US9063727B2 (en) | 2012-08-31 | 2015-06-23 | Intel Corporation | Performing cross-domain thermal control in a processor |
US10191532B2 (en) | 2012-08-31 | 2019-01-29 | Intel Corporation | Configuring power management functionality in a processor |
US9189046B2 (en) | 2012-08-31 | 2015-11-17 | Intel Corporation | Performing cross-domain thermal control in a processor |
US9235244B2 (en) | 2012-08-31 | 2016-01-12 | Intel Corporation | Configuring power management functionality in a processor |
US11237614B2 (en) | 2012-08-31 | 2022-02-01 | Intel Corporation | Multicore processor with a control register storing an indicator that two or more cores are to operate at independent performance states |
US9760155B2 (en) | 2012-08-31 | 2017-09-12 | Intel Corporation | Configuring power management functionality in a processor |
US10203741B2 (en) | 2012-08-31 | 2019-02-12 | Intel Corporation | Configuring power management functionality in a processor |
US8984313B2 (en) | 2012-08-31 | 2015-03-17 | Intel Corporation | Configuring power management functionality in a processor including a plurality of cores by utilizing a register to store a power domain indicator |
US10877549B2 (en) | 2012-08-31 | 2020-12-29 | Intel Corporation | Configuring power management functionality in a processor |
US9342122B2 (en) | 2012-09-17 | 2016-05-17 | Intel Corporation | Distributing power to heterogeneous compute elements of a processor |
US9335804B2 (en) | 2012-09-17 | 2016-05-10 | Intel Corporation | Distributing power to heterogeneous compute elements of a processor |
US9423858B2 (en) | 2012-09-27 | 2016-08-23 | Intel Corporation | Sharing power between domains in a processor package using encoded power consumption information from a second domain to calculate an available power budget for a first domain |
US9575543B2 (en) | 2012-11-27 | 2017-02-21 | Intel Corporation | Providing an inter-arrival access timer in a processor |
US9183144B2 (en) | 2012-12-14 | 2015-11-10 | Intel Corporation | Power gating a portion of a cache memory |
US9176875B2 (en) | 2012-12-14 | 2015-11-03 | Intel Corporation | Power gating a portion of a cache memory |
US9405351B2 (en) | 2012-12-17 | 2016-08-02 | Intel Corporation | Performing frequency coordination in a multiprocessor system |
US9292468B2 (en) | 2012-12-17 | 2016-03-22 | Intel Corporation | Performing frequency coordination in a multiprocessor system based on response timing optimization |
US9235252B2 (en) | 2012-12-21 | 2016-01-12 | Intel Corporation | Dynamic balancing of power across a plurality of processor domains according to power policy control bias |
US9086834B2 (en) | 2012-12-21 | 2015-07-21 | Intel Corporation | Controlling configurable peak performance limits of a processor |
US9075556B2 (en) | 2012-12-21 | 2015-07-07 | Intel Corporation | Controlling configurable peak performance limits of a processor |
US9671854B2 (en) | 2012-12-21 | 2017-06-06 | Intel Corporation | Controlling configurable peak performance limits of a processor |
US9081577B2 (en) | 2012-12-28 | 2015-07-14 | Intel Corporation | Independent control of processor core retention states |
US9164565B2 (en) | 2012-12-28 | 2015-10-20 | Intel Corporation | Apparatus and method to manage energy usage of a processor |
US9606888B1 (en) * | 2013-01-04 | 2017-03-28 | Marvell International Ltd. | Hierarchical multi-core debugger interface |
US9335803B2 (en) | 2013-02-15 | 2016-05-10 | Intel Corporation | Calculating a dynamically changeable maximum operating voltage value for a processor based on a different polynomial equation using a set of coefficient values and a number of current active cores |
US10394300B2 (en) | 2013-03-11 | 2019-08-27 | Intel Corporation | Controlling operating voltage of a processor |
US9996135B2 (en) | 2013-03-11 | 2018-06-12 | Intel Corporation | Controlling operating voltage of a processor |
US9367114B2 (en) | 2013-03-11 | 2016-06-14 | Intel Corporation | Controlling operating voltage of a processor |
US11175712B2 (en) | 2013-03-11 | 2021-11-16 | Intel Corporation | Controlling operating voltage of a processor |
US11822409B2 (en) | 2013-03-11 | 2023-11-21 | Daedalus Prime LLC | Controlling operating frequency of a processor |
US11507167B2 (en) | 2013-03-11 | 2022-11-22 | Daedalus Prime Llc | Controlling operating voltage of a processor |
US9395784B2 (en) | 2013-04-25 | 2016-07-19 | Intel Corporation | Independently controlling frequency of plurality of power domains in a processor system |
US9377841B2 (en) | 2013-05-08 | 2016-06-28 | Intel Corporation | Adaptively limiting a maximum operating frequency in a multicore processor |
US10146283B2 (en) | 2013-05-31 | 2018-12-04 | Intel Corporation | Controlling power delivery to a processor via a bypass |
US10429913B2 (en) | 2013-05-31 | 2019-10-01 | Intel Corporation | Controlling power delivery to a processor via a bypass |
US9823719B2 (en) | 2013-05-31 | 2017-11-21 | Intel Corporation | Controlling power delivery to a processor via a bypass |
US11157052B2 (en) | 2013-05-31 | 2021-10-26 | Intel Corporation | Controlling power delivery to a processor via a bypass |
US11687135B2 (en) | 2013-05-31 | 2023-06-27 | Tahoe Research, Ltd. | Controlling power delivery to a processor via a bypass |
US10409346B2 (en) | 2013-05-31 | 2019-09-10 | Intel Corporation | Controlling power delivery to a processor via a bypass |
US9348401B2 (en) | 2013-06-25 | 2016-05-24 | Intel Corporation | Mapping a performance request to an operating frequency in a processor |
US10175740B2 (en) | 2013-06-25 | 2019-01-08 | Intel Corporation | Mapping a performance request to an operating frequency in a processor |
US9471088B2 (en) | 2013-06-25 | 2016-10-18 | Intel Corporation | Restricting clock signal delivery in a processor |
US9348407B2 (en) | 2013-06-27 | 2016-05-24 | Intel Corporation | Method and apparatus for atomic frequency and voltage changes |
US9377836B2 (en) | 2013-07-26 | 2016-06-28 | Intel Corporation | Restricting clock signal delivery based on activity in a processor |
US9495001B2 (en) | 2013-08-21 | 2016-11-15 | Intel Corporation | Forcing core low power states in a processor |
US10310588B2 (en) | 2013-08-21 | 2019-06-04 | Intel Corporation | Forcing core low power states in a processor |
US10386900B2 (en) | 2013-09-24 | 2019-08-20 | Intel Corporation | Thread aware power management |
US9594560B2 (en) | 2013-09-27 | 2017-03-14 | Intel Corporation | Estimating scalability value for a specific domain of a multicore processor based on active state residency of the domain, stall duration of the domain, memory bandwidth of the domain, and a plurality of coefficients based on a workload to execute on the domain |
US9405345B2 (en) | 2013-09-27 | 2016-08-02 | Intel Corporation | Constraining processor operation based on power envelope information |
US9684517B2 (en) | 2013-10-31 | 2017-06-20 | Lenovo Enterprise Solutions (Singapore) Pte. Ltd. | System monitoring and debugging in a multi-core processor system |
US20160299859A1 (en) * | 2013-11-22 | 2016-10-13 | Freescale Semiconductor, Inc. | Apparatus and method for external access to core resources of a processor, semiconductor systems development tool comprising the apparatus, and computer program product and non-transitory computer-readable storage medium associated with the method |
US9494998B2 (en) | 2013-12-17 | 2016-11-15 | Intel Corporation | Rescheduling workloads to enforce and maintain a duty cycle |
US9459689B2 (en) | 2013-12-23 | 2016-10-04 | Intel Corporation | Dynamically adapting a voltage of a clock generation circuit |
US9965019B2 (en) | 2013-12-23 | 2018-05-08 | Intel Corporation | Dynamically adapting a voltage of a clock generation circuit |
US9323525B2 (en) | 2014-02-26 | 2016-04-26 | Intel Corporation | Monitoring vector lane duty cycle for dynamic optimization |
WO2015134103A1 (en) * | 2014-03-07 | 2015-09-11 | Cavium, Inc. | Method and system for ordering i/o access in a multi-node environment |
US9372800B2 (en) | 2014-03-07 | 2016-06-21 | Cavium, Inc. | Inter-chip interconnect protocol for a multi-chip system |
US9529532B2 (en) | 2014-03-07 | 2016-12-27 | Cavium, Inc. | Method and apparatus for memory allocation in a multi-node system |
US10169080B2 (en) | 2014-03-07 | 2019-01-01 | Cavium, Llc | Method for work scheduling in a multi-chip system |
US9411644B2 (en) | 2014-03-07 | 2016-08-09 | Cavium, Inc. | Method and system for work scheduling in a multi-chip system |
US10592459B2 (en) | 2014-03-07 | 2020-03-17 | Cavium, Llc | Method and system for ordering I/O access in a multi-node environment |
US10963038B2 (en) | 2014-03-21 | 2021-03-30 | Intel Corporation | Selecting a low power state based on cache flush latency determination |
US10108454B2 (en) | 2014-03-21 | 2018-10-23 | Intel Corporation | Managing dynamic capacitance using code scheduling |
US10198065B2 (en) | 2014-03-21 | 2019-02-05 | Intel Corporation | Selecting a low power state based on cache flush latency determination |
US9665153B2 (en) | 2014-03-21 | 2017-05-30 | Intel Corporation | Selecting a low power state based on cache flush latency determination |
US10417149B2 (en) | 2014-06-06 | 2019-09-17 | Intel Corporation | Self-aligning a processor duty cycle with interrupts |
US9760158B2 (en) | 2014-06-06 | 2017-09-12 | Intel Corporation | Forcing a processor into a low power state |
US10345889B2 (en) | 2014-06-06 | 2019-07-09 | Intel Corporation | Forcing a processor into a low power state |
US10948968B2 (en) | 2014-06-30 | 2021-03-16 | Intel Corporation | Controlling processor performance scaling based on context |
US10216251B2 (en) | 2014-06-30 | 2019-02-26 | Intel Corporation | Controlling processor performance scaling based on context |
US9513689B2 (en) | 2014-06-30 | 2016-12-06 | Intel Corporation | Controlling processor performance scaling based on context |
US9606602B2 (en) | 2014-06-30 | 2017-03-28 | Intel Corporation | Method and apparatus to prevent voltage droop in a computer |
US9575537B2 (en) | 2014-07-25 | 2017-02-21 | Intel Corporation | Adaptive algorithm for thermal throttling of multi-core processors with non-homogeneous performance states |
US10331186B2 (en) | 2014-07-25 | 2019-06-25 | Intel Corporation | Adaptive algorithm for thermal throttling of multi-core processors with non-homogeneous performance states |
US9990016B2 (en) | 2014-08-15 | 2018-06-05 | Intel Corporation | Controlling temperature of a system memory |
US9760136B2 (en) | 2014-08-15 | 2017-09-12 | Intel Corporation | Controlling temperature of a system memory |
US9671853B2 (en) | 2014-09-12 | 2017-06-06 | Intel Corporation | Processor operating by selecting smaller of requested frequency and an energy performance gain (EPG) frequency |
US10339023B2 (en) | 2014-09-25 | 2019-07-02 | Intel Corporation | Cache-aware adaptive thread scheduling and migration |
US9977477B2 (en) | 2014-09-26 | 2018-05-22 | Intel Corporation | Adapting operating parameters of an input/output (IO) interface circuit of a processor |
US9684360B2 (en) | 2014-10-30 | 2017-06-20 | Intel Corporation | Dynamically controlling power management of an on-die memory of a processor |
US10429918B2 (en) | 2014-11-24 | 2019-10-01 | Intel Corporation | Controlling turbo mode frequency operation in a processor |
US9703358B2 (en) | 2014-11-24 | 2017-07-11 | Intel Corporation | Controlling turbo mode frequency operation in a processor |
US9710043B2 (en) | 2014-11-26 | 2017-07-18 | Intel Corporation | Controlling a guaranteed frequency of a processor |
US11079819B2 (en) | 2014-11-26 | 2021-08-03 | Intel Corporation | Controlling average power limits of a processor |
US10048744B2 (en) | 2014-11-26 | 2018-08-14 | Intel Corporation | Apparatus and method for thermal management in a multi-chip package |
US11841752B2 (en) | 2014-11-26 | 2023-12-12 | Intel Corporation | Controlling average power limits of a processor |
US10877530B2 (en) | 2014-12-23 | 2020-12-29 | Intel Corporation | Apparatus and method to provide a thermal parameter report for a multi-chip package |
US11543868B2 (en) | 2014-12-23 | 2023-01-03 | Intel Corporation | Apparatus and method to provide a thermal parameter report for a multi-chip package |
US9847927B2 (en) | 2014-12-26 | 2017-12-19 | Pfu Limited | Information processing device, method, and medium |
US10719326B2 (en) | 2015-01-30 | 2020-07-21 | Intel Corporation | Communicating via a mailbox interface of a processor |
US9639134B2 (en) | 2015-02-05 | 2017-05-02 | Intel Corporation | Method and apparatus to provide telemetry data to a power controller of a processor |
US9910481B2 (en) | 2015-02-13 | 2018-03-06 | Intel Corporation | Performing power management in a multicore processor |
US10775873B2 (en) | 2015-02-13 | 2020-09-15 | Intel Corporation | Performing power management in a multicore processor |
US10234930B2 (en) | 2015-02-13 | 2019-03-19 | Intel Corporation | Performing power management in a multicore processor |
US9874922B2 (en) | 2015-02-17 | 2018-01-23 | Intel Corporation | Performing dynamic power control of platform devices |
US10706004B2 (en) | 2015-02-27 | 2020-07-07 | Intel Corporation | Dynamically updating logical identifiers of cores of a processor |
US11567896B2 (en) | 2015-02-27 | 2023-01-31 | Intel Corporation | Dynamically updating logical identifiers of cores of a processor |
US9842082B2 (en) | 2015-02-27 | 2017-12-12 | Intel Corporation | Dynamically updating logical identifiers of cores of a processor |
US9710054B2 (en) | 2015-02-28 | 2017-07-18 | Intel Corporation | Programmable power management agent |
US10761594B2 (en) | 2015-02-28 | 2020-09-01 | Intel Corporation | Programmable power management agent |
US9760160B2 (en) | 2015-05-27 | 2017-09-12 | Intel Corporation | Controlling performance states of processing engines of a processor |
US10372198B2 (en) | 2015-05-27 | 2019-08-06 | Intel Corporation | Controlling performance states of processing engines of a processor |
US9710041B2 (en) | 2015-07-29 | 2017-07-18 | Intel Corporation | Masking a power state of a core of a processor |
US10001822B2 (en) | 2015-09-22 | 2018-06-19 | Intel Corporation | Integrating a power arbiter in a processor |
US20180217915A1 (en) * | 2015-09-25 | 2018-08-02 | Huawei Technologies Co., Ltd. | Debugging method, multi-core processor, and debugging device |
EP3343377A4 (en) * | 2015-09-25 | 2018-09-12 | Huawei Technologies Co., Ltd. | Debugging method, multi-core processor, and debugging equipment |
US10503629B2 (en) * | 2015-09-25 | 2019-12-10 | Huawei Technologies Co., Ltd. | Debugging method, multi-core processor, and debugging device |
US10409709B2 (en) * | 2015-09-25 | 2019-09-10 | Huawei Technologies Co., Ltd. | Debugging method, multi-core processor and debugging device |
US9983644B2 (en) | 2015-11-10 | 2018-05-29 | Intel Corporation | Dynamically updating at least one power management operational parameter pertaining to a turbo mode of a processor for increased performance |
US9910470B2 (en) | 2015-12-16 | 2018-03-06 | Intel Corporation | Controlling telemetry data communication in a processor |
US10146286B2 (en) | 2016-01-14 | 2018-12-04 | Intel Corporation | Dynamically updating a power management policy of a processor |
US10289188B2 (en) | 2016-06-21 | 2019-05-14 | Intel Corporation | Processor having concurrent core and fabric exit from a low power state |
US10990161B2 (en) | 2016-06-23 | 2021-04-27 | Intel Corporation | Processor having accelerated user responsiveness in constrained environment |
US11435816B2 (en) | 2016-06-23 | 2022-09-06 | Intel Corporation | Processor having accelerated user responsiveness in constrained environment |
US10324519B2 (en) | 2016-06-23 | 2019-06-18 | Intel Corporation | Controlling forced idle state operation in a processor |
US10281975B2 (en) | 2016-06-23 | 2019-05-07 | Intel Corporation | Processor having accelerated user responsiveness in constrained environment |
US10846251B1 (en) * | 2016-07-01 | 2020-11-24 | The Board Of Trustees Of The University Of Illinois | Scratchpad-based operating system for multi-core embedded systems |
US10379596B2 (en) | 2016-08-03 | 2019-08-13 | Intel Corporation | Providing an interface for demotion control information in a processor |
US10234920B2 (en) | 2016-08-31 | 2019-03-19 | Intel Corporation | Controlling current consumption of a processor based at least in part on platform capacitance |
US10379904B2 (en) | 2016-08-31 | 2019-08-13 | Intel Corporation | Controlling a performance state of a processor using a combination of package and thread hint information |
US10423206B2 (en) | 2016-08-31 | 2019-09-24 | Intel Corporation | Processor to pre-empt voltage ramps for exit latency reductions |
US11119555B2 (en) | 2016-08-31 | 2021-09-14 | Intel Corporation | Processor to pre-empt voltage ramps for exit latency reductions |
US10761580B2 (en) | 2016-09-29 | 2020-09-01 | Intel Corporation | Techniques to enable communication between a processor and voltage regulator |
US11402887B2 (en) | 2016-09-29 | 2022-08-02 | Intel Corporation | Techniques to enable communication between a processor and voltage regulator |
US11782492B2 (en) | 2016-09-29 | 2023-10-10 | Intel Corporation | Techniques to enable communication between a processor and voltage regulator |
US10168758B2 (en) | 2016-09-29 | 2019-01-01 | Intel Corporation | Techniques to enable communication between a processor and voltage regulator |
US10877509B2 (en) | 2016-12-12 | 2020-12-29 | Intel Corporation | Communicating signals between divided and undivided clock domains |
EP3333697A1 (en) * | 2016-12-12 | 2018-06-13 | Intel Corporation | Communicating signals between divided and undivided clock domains |
US20180181478A1 (en) * | 2016-12-28 | 2018-06-28 | Arm Limited | Performing diagnostic operations upon a target apparatus |
US10534682B2 (en) * | 2016-12-28 | 2020-01-14 | Arm Limited | Method and diagnostic apparatus for performing diagnostic operations upon a target apparatus using transferred state and emulated operation of a transaction master |
US10678674B2 (en) * | 2017-06-15 | 2020-06-09 | Silicon Laboratories, Inc. | Wireless debugging |
US10963034B2 (en) | 2017-06-28 | 2021-03-30 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management in a processor |
US10429919B2 (en) | 2017-06-28 | 2019-10-01 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management |
US11402891B2 (en) | 2017-06-28 | 2022-08-02 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management |
US10990155B2 (en) | 2017-06-28 | 2021-04-27 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management |
US10990154B2 (en) | 2017-06-28 | 2021-04-27 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management |
US11740682B2 (en) | 2017-06-28 | 2023-08-29 | Intel Corporation | System, apparatus and method for loose lock-step redundancy power management |
US11593544B2 (en) | 2017-08-23 | 2023-02-28 | Intel Corporation | System, apparatus and method for adaptive operating voltage in a field programmable gate array (FPGA) |
US10620266B2 (en) | 2017-11-29 | 2020-04-14 | Intel Corporation | System, apparatus and method for in-field self testing in a diagnostic sleep state |
US10962596B2 (en) | 2017-11-29 | 2021-03-30 | Intel Corporation | System, apparatus and method for in-field self testing in a diagnostic sleep state |
US10620682B2 (en) | 2017-12-21 | 2020-04-14 | Intel Corporation | System, apparatus and method for processor-external override of hardware performance state control of a processor |
US10620969B2 (en) | 2018-03-27 | 2020-04-14 | Intel Corporation | System, apparatus and method for providing hardware feedback information in a processor |
US10739844B2 (en) | 2018-05-02 | 2020-08-11 | Intel Corporation | System, apparatus and method for optimized throttling of a processor |
US11290881B2 (en) | 2018-05-15 | 2022-03-29 | Siemens Aktiengesellschaft | Method for functionally secure connection identification |
US11340687B2 (en) | 2018-06-20 | 2022-05-24 | Intel Corporation | System, apparatus and method for responsive autonomous hardware performance state control of a processor |
US10955899B2 (en) | 2018-06-20 | 2021-03-23 | Intel Corporation | System, apparatus and method for responsive autonomous hardware performance state control of a processor |
US11669146B2 (en) | 2018-06-20 | 2023-06-06 | Intel Corporation | System, apparatus and method for responsive autonomous hardware performance state control of a processor |
US10976801B2 (en) | 2018-09-20 | 2021-04-13 | Intel Corporation | System, apparatus and method for power budget distribution for a plurality of virtual machines to execute on a processor |
US10860083B2 (en) | 2018-09-26 | 2020-12-08 | Intel Corporation | System, apparatus and method for collective power control of multiple intellectual property agents and a shared power rail |
US11656676B2 (en) | 2018-12-12 | 2023-05-23 | Intel Corporation | System, apparatus and method for dynamic thermal distribution of a system on chip |
US11256657B2 (en) | 2019-03-26 | 2022-02-22 | Intel Corporation | System, apparatus and method for adaptive interconnect routing |
US11442529B2 (en) | 2019-05-15 | 2022-09-13 | Intel Corporation | System, apparatus and method for dynamically controlling current consumption of processing circuits of a processor |
US11698812B2 (en) | 2019-08-29 | 2023-07-11 | Intel Corporation | System, apparatus and method for providing hardware state feedback to an operating system in a heterogeneous processor |
US11132283B2 (en) * | 2019-10-08 | 2021-09-28 | Renesas Electronics America Inc. | Device and method for evaluating internal and external system processors by internal and external debugger devices |
US11366506B2 (en) | 2019-11-22 | 2022-06-21 | Intel Corporation | System, apparatus and method for globally aware reactive local power control in a processor |
US11853144B2 (en) | 2019-11-22 | 2023-12-26 | Intel Corporation | System, apparatus and method for globally aware reactive local power control in a processor |
US11132201B2 (en) | 2019-12-23 | 2021-09-28 | Intel Corporation | System, apparatus and method for dynamic pipeline stage control of data path dominant circuitry of an integrated circuit |
US20230027877A1 (en) * | 2020-06-01 | 2023-01-26 | Micron Technology, Inc. | Notifying memory system of host events via modulated reset signals |
US11513835B2 (en) * | 2020-06-01 | 2022-11-29 | Micron Technology, Inc. | Notifying memory system of host events via modulated reset signals |
US11921564B2 (en) | 2022-02-28 | 2024-03-05 | Intel Corporation | Saving and restoring configuration and status information with reduced latency |
Also Published As
Publication number | Publication date |
---|---|
CN101069170A (en) | 2007-11-07 |
CN101040256A (en) | 2007-09-19 |
US9141548B2 (en) | 2015-09-22 |
CN101053234B (en) | 2012-02-29 |
CN101128804B (en) | 2012-02-01 |
US20140317353A1 (en) | 2014-10-23 |
US7941585B2 (en) | 2011-05-10 |
US20060059310A1 (en) | 2006-03-16 |
CN101128804A (en) | 2008-02-20 |
CN100533372C (en) | 2009-08-26 |
CN101069170B (en) | 2012-02-08 |
CN101036117B (en) | 2010-12-08 |
CN101036117A (en) | 2007-09-12 |
US20060059316A1 (en) | 2006-03-16 |
CN101053234A (en) | 2007-10-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20060059286A1 (en) | Multi-core debugger | |
US11055203B2 (en) | Virtualizing precise event based sampling | |
US6378064B1 (en) | Microcomputer | |
KR100439781B1 (en) | A data processor, an operation method thereof, a method of executing the debugging operation, and a method of correcting a disadvantage value among the data processor | |
US6691251B2 (en) | On-chip debugging system emulator | |
US7392431B2 (en) | Emulation system with peripherals recording emulation frame when stop generated | |
US5951696A (en) | Debug system with hardware breakpoint trap | |
US6564339B1 (en) | Emulation suspension mode handling multiple stops and starts | |
US8380966B2 (en) | Method and system for instruction stuffing operations during non-intrusive digital signal processor debugging | |
EP1209565A2 (en) | Multicore dsp device having shared program memory with conditional write protection | |
US20030115506A1 (en) | Apparatus and method for shadowing processor information | |
US7793261B1 (en) | Interface for transferring debug information | |
US6665737B2 (en) | Microprocessor chip includes an addressable external communication port which connects to an external computer via an adapter | |
US6526501B2 (en) | Adapter for a microprocessor | |
US10078113B1 (en) | Methods and circuits for debugging data bus communications | |
US6389498B1 (en) | Microprocessor having addressable communication port | |
US10042729B2 (en) | Apparatus and method for a scalable test engine | |
US6457124B1 (en) | Microcomputer having address diversion means for remapping an on-chip device to an external port | |
JP2002268910A (en) | Semiconductor device having self-test function | |
WO2022235265A1 (en) | Debug channel for communication between a processor and an external debug host | |
US7203799B1 (en) | Invalidation of instruction cache line during reset handling | |
US11119149B2 (en) | Debug command execution using existing datapath circuitry | |
Added | 31.2 Signal Descriptions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: CAVIUM NETWORKS, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BERTONE, MICHAEL S.;CARLSON, DAVID A.;KESSLER, RICHARD E.;AND OTHERS;REEL/FRAME:016974/0203;SIGNING DATES FROM 20050914 TO 20050926 |
AS | Assignment | Owner name: CAVIUM NETWORKS, INC., A DELAWARE CORPORATION, CAL; Free format text: MERGER;ASSIGNOR:CAVIUM NETWORKS, A CALIFORNIA CORPORATION;REEL/FRAME:019014/0174; Effective date: 20070205 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |