US9348775B2 - Out-of-order execution of bus transactions - Google Patents

Out-of-order execution of bus transactions

Info

Publication number
US9348775B2
Authority
US
United States
Prior art keywords
transactions
transaction
received
bus
buffer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US13/422,021
Other versions
US20130246682A1 (en)
Inventor
Krishna S A Jandhyam
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Analog Devices Inc
Original Assignee
Analog Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Analog Devices Inc
Priority to US13/422,021 (patent US9348775B2)
Assigned to ANALOG DEVICES, INC. Assignment of assignors interest (see document for details). Assignors: JANDHYAM, KRISHNA S A
Priority to PCT/US2013/031865 (patent WO2013138683A1)
Publication of US20130246682A1
Application granted
Publication of US9348775B2
Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/161 Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement
    • G06F13/1626 Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement by reordering requests
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution

Definitions

  • Embodiments of the invention generally relate to bus transactions and, in particular, to efficiently scheduling bus transactions to occur in an order different from that in which they are received.
  • SoC: system on a chip
  • Internal buses on the SoC connect the various internal components; unlike traditional, off-chip buses, the on-chip buses need not be the bandwidth-limiting factor in communication between components. For example, while it may be expensive in resources, area, and power to double the bandwidth of an (e.g.) off-chip printed-circuit-board bus, it may be comparatively cheap to do so for an on-chip bus.
  • the less-severe crosstalk, reflections, and/or other noise on-chip buses are exposed to may make it easier to run on-chip buses at higher frequencies (e.g., at the same clock frequencies at which the SoC components themselves run). Special care must be taken, however, to maximize the benefits of the advantages presented by on-chip buses.
  • Out-of-order execution of transactions received by a shared resource is one way to increase the efficiency of on-chip buses.
  • a memory (an example of a bus “slave”)
  • the throughput to and from one master may be relatively high and the throughput to the other master may be relatively low (due to any one of many design factors and considerations).
  • a long series of transactions between the slave and the slow master may disadvantageously delay a later-received transaction between the slave and the fast master (the “fast” transaction may be received after all the “slow” transactions have been received, but are still executing, or may be received during receipt of—or “interleaved” with—the slow transactions).
  • the slow transactions may be temporarily suspended so that the fast transaction may execute.
  • the increase in total execution time for the slow transactions may be negligible, while the fast transaction avoids a potentially significant delay.
  • Multi-layer AXI is an architecture capable of providing the maximum bandwidth between each of the masters and the slaves in a system while requiring only a routing density comparable to that of the SoC components. Every connection in a multi-layer AXI system looks like, and behaves like, a direct master-slave connection; existing peripheral and sub-systems (e.g., those not programmed for the advanced features of multi-layer AXI) may thus be compatibly connected via the architecture.
  • One aspect of multi-layer AXI that enables these features is the association of an identification (“ID”) tag with each bus transaction; transactions having the same IDs have internal dependencies and must be completed in order, while transactions having different IDs may be completed in any order.
  • Multi-layer AXI also supports write-data interleaving, in which groups of write data transactions from two or more masters are received, at a slave, interspersed with each other; the slave tracks and maintains the original sources of the transactions and honors any dependencies therebetween.
  • various aspects of the systems and methods described herein execute bus transactions in an order different from that in which they were received.
  • Groups of in-order transactions (e.g., “burst” transactions)
  • burst transactions are accounted for—i.e., their order is preserved—by storing information regarding their dependencies; when a next transaction is to be selected for execution, only the first transaction of a group of in-order transactions is considered as eligible for execution (along with any other pending out-of-order transactions and/or other groups of in-order transactions).
  • the groups of in-order transactions are stored using a hardware linked list; the first transaction in the group points to the second transaction, the second to the third, and so on.
  • An additional hardware linked list may be used to receive, and account for, interleaved data.
  • a system for executing bus transactions includes address and data buffers and control circuitry.
  • the data buffer stores write data associated with transactions received from a bus
  • the address buffer stores (i) write addresses associated with transactions received from the bus and (ii) information regarding in-order dependencies among the transactions.
  • the control circuitry selects a received transaction for out-of-order execution in accordance with the in-order dependencies.
  • the address buffer may include a linked list, which may include a series of in-order transactions. Selecting the received transaction may include selecting a head of the series of the in-order transactions.
  • the data buffer may include a linked list, which may include a series of related write data received on the bus interleaved with unrelated write data.
  • the control circuitry may include a control buffer for storing information linking write data and write addresses, an arbitration unit for selecting the received transaction, a free-buffer-list FIFO for storing available locations in the data buffer, a burst shaper for chopping a received burst into smaller bursts, and/or a completed list for storing information regarding completed transactions.
  • a method for executing bus transactions includes storing write data and write addresses associated with transactions received from a bus. Information regarding in-order dependencies among the transactions is also stored. A received transaction is selected for out-of-order execution in accordance with (i.e., in a manner that respects) the in-order dependencies.
  • Storing information regarding in-order dependencies may include linking a first in-order transaction to a second in-order transaction. Selecting the received transaction may include selecting a head of a series of linked transactions.
  • storing write data may include linking a series of related write data received on the bus interleaved with unrelated write data.
  • a burst of write data may be shaped in accordance with a slave interface.
  • FIG. 1 illustrates a basic master/slave bus architecture
  • FIG. 2 illustrates a slave interface unit in accordance with an embodiment of the invention
  • FIGS. 3A, 3B, 4A, 4B, and 5 illustrate address buffers in accordance with embodiments of the invention
  • FIG. 6 illustrates a control buffer in accordance with an embodiment of the invention
  • FIG. 7 illustrates a data buffer in accordance with an embodiment of the invention
  • FIGS. 8A, 8B, and 8C illustrate free-buffer-list FIFOs in accordance with embodiments of the invention
  • FIGS. 9 and 10 illustrate implementations of slave interface units in accordance with embodiments of the invention.
  • FIG. 11 illustrates a read data path in accordance with an embodiment of the invention
  • FIGS. 12 and 13 are flowcharts illustrating methods for operating a slave interface unit in accordance with embodiments of the invention.
  • a basic master-slave interface 100 is shown in FIG. 1.
  • a bus master 102 communicates over a bus 104 (such as an AXI bus or any other SoC bus) with an interface unit 106.
  • a slave 108 communicates with the interface unit via a local link 110.
  • the master 102 may send read or write transactions over the bus 104; the interface unit 106 receives the transactions and fulfills them by forwarding them to the slave 108.
  • the interface unit merely forwards the requests to the slave 108 as they are received; as explained in greater detail below, however, the interface unit 106 may include buffers (or other means of temporary storage) to store the incoming transactions and execute them at a later point in time. Only a single master 102 and slave 108 are shown, but any number of masters 102 and slaves 108 is within the scope of the current invention.
  • the bus 104 may be a network or “fabric” of bus connections connecting the various components.
  • the bus 104 may deliver bus transactions from the master 102 in the form of read addresses and/or write addresses and associated write data (each transaction possibly having control/status information bundled therein).
  • a data buffer 202 may be used to store the incoming write data
  • an address buffer 204 may be used to store the incoming write addresses
  • a control buffer 206 may be used to store the incoming control/status information (e.g., an ID tag).
  • the address buffer 204 and the control buffer 206 may be used to store the incoming read address and control/status information; the data buffer 202 may be used to hold the read data once it is read out from the slave 108 .
  • the particular arrangement of the buffers 202 , 204 , 206 is, however, not meant to be limiting, and any combination or permutation of the three buffers 202 , 204 , 206 is within the scope of the current invention.
  • Each buffer 202 , 204 , 206 may be implemented in any kind of storage medium, device, or structure, including (for example) partitions or sections in a random-access memory, hardware registers, flip-flops, or latches.
  • the buffers 202 , 204 , 206 store information related to the interdependencies of bus transactions as well as the actual transactions. For example, a transaction having no dependencies with respect to other transactions may be stored by itself in the buffers 202 , 204 , 206 , while a group of transactions having a dependency (i.e., the transactions in the group must be executed in-order) may have that information encoded into the buffers 202 , 204 , 206 .
  • the address buffer 204 stores this dependency information as a hardware linked list; the first transaction in such a group of dependent transactions is stored with a link to the second transaction, the second is stored with a link to the third, and so on.
  • control buffer 206 may store the ID tag associated with each bus transaction; as a new transaction is received, its ID tag is examined. If it matches the ID tag of one or more transactions already received, the new transaction is added to the end of a linked list representing the rest of the similarly tagged transactions. Received transactions having unique ID tags are simply added to the buffers 202 , 204 , 206 .
  • the data buffer 202 may include linked lists of received data. As interleaved transactions are received (i.e., two sets of unrelated transactions received from two different bus masters interspersed with each other), data associated with the transactions may be stored in the data buffer 202 in the order it is received. As the data entries are stored, however, pointers to previously received, related data are also stored and associated with them (as, e.g., another field in a buffer row, as explained in greater detail below). A group of related data may thus be read from the data buffer 202 by following the links from the first data entry, despite each data entry being stored throughout the data buffer 202 .
  • control logic 208 may be used to select a next transaction, or set of transactions, for execution.
  • the control logic considers each stand-alone transaction and the head of each linked list of dependent transactions when determining a next transaction to execute.
  • Pointers or links 210 (e.g., an address of an entry in a first buffer stored as a data field in a second)
  • the control logic 208 removes it from the buffers 202 , 204 , 206 .
  • One implementation of an address buffer 300 is illustrated in FIG. 3A.
  • the address buffer stores write addresses 302 .
  • the address buffer may be also used to store read addresses.
  • Associated with each write address 302 is a valid bit 304 that indicates whether a given entry holds a valid address (e.g., the valid bit 304 holds a binary 1 to indicate a valid address and a binary 0 for an invalid address).
  • a mask bit 306 indicates whether a valid address should be considered for arbitration and execution; transactions that are not the heads of linked lists, for example, may be masked off.
  • a next-transaction address 308 includes a pointer to a next transaction, if any, in a linked list of transactions.
  • An ID tag 310 stores a transaction ID, such as the ID used in multi-layer AXI buses, for each address.
  • a control-buffer pointer 312 points to a corresponding entry in a control buffer (e.g., the control buffer 206 discussed above).
  • the current invention is not limited to this particular implementation, however, and one of skill in the art will understand that the previously described information may be stored in any of a variety of ways.
  • the address buffer 300 holds a first transaction 314 at an entry 0 in the buffer 300 .
  • ID 310 (in this example, “ID1”)
  • the new transaction 314 is added, marked as valid 304 , and marked as not masked as indicated at 306 .
  • the next-transaction address 308 is set to a constant (e.g., 0xF) to indicate it is the last entry in a linked list having that ID 310 (in this case, it is the first, last, and only entry in the linked list).
  • Its entry in the control-buffer pointer field 312 points to a corresponding entry in the control buffer.
  • a second transaction 316 arrives and is stored in a second entry 1 in the address buffer 300 ; this transaction 316 has the same ID 310 as the first transaction 314 .
  • the next address 308 of the first transaction 314 is modified to include the address (“1”) of the second transaction 316 in the buffer 300 .
  • the second transaction 316 is marked as valid as indicated at 304 , but, because it is not the head of a linked list, its mask bit 306 is marked as masked.
  • the new transaction is linked to the end of the list of existing entries (as indicated, for example, by the entry having a next address 308 of 0xF) and masked.
  • the second transaction 316 being masked, is ineligible for execution (reflecting the in-order nature of the first 314 and second 316 transactions; the first transaction must be executed first).
  • the second transaction 316 may be unmasked when the first transaction 314 has executed.
  • its next address 308 is examined and, if not null (e.g., 0xF), its corresponding transaction (i.e., the second transaction 316 ) is identified and unmasked. Being unmasked, the second transaction 316 is thus eligible for execution. This chain of unmasking transactions continues until the last transaction in the linked list is identified and executed.
  • Another example of an address buffer 400 is illustrated in FIG. 4A.
  • Three transactions are stored in the address buffer 400 : a first transaction 402 at entry 0 (which is the head of a three-member linked list of transactions that also includes the transactions at entries 1 and 3), a second transaction 404 at entry 2, and a third transaction 406 at entry 4.
  • An arbitration unit (within, for example, the control logic 208 described above with reference to FIG. 2 ) decides which of the three transactions 402 , 404 , 406 will next execute. Any suitable arbitration unit and/or functionality is within the scope of the current invention, as one of skill in the art will understand, and the current invention is not limited to any particular means or method of arbitration.
  • the arbitration logic selects the first transaction 402 for execution, and the state of the buffer 400 after said execution is illustrated in FIG. 4B .
  • the valid bit 408 of the first transaction 402 has been cleared to reflect the execution of this transaction, and the next member of the linked list, the entry 410 at location 1 , has its mask bit 412 set to indicate that it is now available for execution.
  • FIG. 5 illustrates an address buffer 500 that includes a data-arrived (or “DA”) field 502 that is asserted when all of the data associated with a given address has arrived (said data being stored in, for example, the data buffer 202 described above with reference to FIG. 2 ).
  • DA: data-arrived
  • a number representing the total number of pieces (or “beats”) of data is sent from a bus master along with the rest of the control information associated with a transaction.
  • As each beat of data arrives, a beat counter in the control logic 208 increments, and the control logic compares the value of the counter with the total number of beats associated with the given transaction.
  • When the numbers match, all the data has arrived, and the data-arrived field 502 is set, indicating that the corresponding transaction is available for arbitration.
  • An illustrative example of a control buffer 600 is shown with another address buffer 602 in FIG. 6.
  • the control buffer 600 stores control information associated with each incoming transaction;
  • the address buffer 602 includes a control-buffer pointer field 604 that links to entries in the control buffer 600 .
  • entry 0 in the address buffer 602 includes a pointer 604 to entry 0 in the control buffer 600
  • entry 1 in the address buffer 602 includes a pointer 604 to entry 5 in the control buffer 600
  • entry 2 in the address buffer 602 includes a pointer 604 to entry 2 in the control buffer 600
  • entry 3 in the address buffer 602 includes a pointer 604 to entry 3 in the control buffer 600
  • entry 4 in the address buffer 602 includes a pointer 604 to entry 4 in the control buffer 600 .
  • Each entry in the control buffer 600 includes a pointer 606 to a corresponding entry in the data buffer; in another embodiment, the control buffer 600 includes two pointers 606 for each entry, wherein one pointer indicates a first data beat associated with a transaction and the other pointer indicates a last data beat associated with a transaction.
  • Each entry in the control buffer 600 may include additional information associated with each transaction.
  • An ID field 608 may store the AXI (or other protocol) ID of a transaction, and a burst profile 610 may contain burst-related information about a transaction (such as, for example, burst length, burst size, burst type, and/or byte lane).
  • a valid bit 612 indicates whether an entry is valid or invalid.
  • a data buffer 700 is illustrated in FIG. 7 (along with an address 702 and a control 704 buffer).
  • the data buffer 700 includes a data field 706 and a next-entry field 708 , which indicates a relationship among interleaved data.
  • interleaved data is maintained as a linked list, in which later-arriving data is linked to previously arriving data.
  • the data 710 at address 6 in the data buffer 700 is linked to additional data 712 at address 9 and data 714 at address 12 via use of the next-entry field 708 , despite other data 716 being received in-between the receipt of the linked data 710 , 712 , 714 .
  • the next-entry field 708 associated with the last item of data in the list holds a value of 0xF, indicating the end of the list.
  • FIG. 7 also illustrates the links 718 between the control buffer 704 and the data buffer 700 .
  • entry 0 of the control buffer 704 links to address 6 in the data buffer 700 (as the first data beat 710 of the associated transaction) and to address 12 in the data buffer 700 (as the last data beat 714 of the same transaction).
  • When an additional beat of data is received for that same transaction, it is stored in the data buffer 700 and linked to the last data item 714 by changing the next-entry field 708 of the last item 714 to reflect the location of the new data.
  • the control buffer 704 is also updated to reflect the new end of the linked list.
  • FIG. 8A illustrates a free-buffer-list FIFO 800 that may be used to track freely available locations in the data buffer.
  • Each available location in the data buffer is an entry in the FIFO; when incoming data arrives, an entry in the FIFO 800 is de-queued and the data is stored at that location. Once data is flushed from the data buffer, its location is queued back into the FIFO 800 .
  • the top 802 of the FIFO 800 holds a value of 1 ; incoming data is thus stored at location 1 in the data buffer.
  • FIG. 8B illustrates the FIFO 800 when the top entry 802 has been de-queued, and the new top entry 804 is 4.
  • FIG. 9 illustrates a system 900 that includes a data buffer 902 , a control buffer 904 , an address buffer 906 , and a free-buffer-list FIFO 908 .
  • the FIFO 908 may be included in the control logic 208 .
  • FIG. 10 illustrates another embodiment 1000 of the invention having a data buffer 1002 , address buffer 1004 , FIFO 1006 , and control buffer 1008 . Also included in FIG. 10 are further details of the control logic 208 shown in FIG. 2 .
  • a completed list 1010 is a FIFO that contains pointers to the control buffer 1008 of transactions that were completed (and/or sent to the slave 108 for completion).
  • the completed list 1010 may store pending, outgoing transactions and be de-queued upon the successful sending of the completed transaction over the bus interface 104 .
  • the completed list 1010 is used to reference the ID tag of a transaction to be sent.
  • the valid bit of the control buffer 1008 may be de-asserted once the transaction response is sent out on the bus 104 .
  • a write request received from the bus 104 is honored if there are free entries in the address 1004 , control 1008 , and data 1002 buffers, and if the completed list 1010 has a free space.
  • a burst shaper 1012 may disburse transactions (i.e., prepare and send for execution) stored in the buffers 1002, 1004, 1008 and store them, upon completion and/or sending, in the completed list 1010.
  • the burst shaper 1012 may be used to chop larger burst sizes into smaller ones to comply with constraints of the slave 108. For example, if a 64 kb burst arrives but the slave 108 supports only 16 kb bursts, the burst shaper 1012 divides the received burst into four 16 kb bursts.
  • When the burst shaper 1012 chops a bigger burst into smaller ones, a burst address generator 1014 outputs the address of the chopped burst (i.e., the “internal” address within the original, larger burst that is now the starting address of a smaller burst).
  • the smaller bursts are then submitted to the slave 108 ; as each is submitted, its data is sequentially removed from the linked list in the data buffer 1002 .
  • a response may be sent out on the bus interface 104 only when all of the chopped, smaller bursts are submitted to the slave 108 ; at this point (which corresponds to completion of the original, larger burst), the valid bit in the address buffer 1004 corresponding to the transaction is made invalid.
  • the burst shaper 1012 may also be used to support incoming transactions that are narrower than a maximum width of the bus 104 ; for example, 8-, 16-, or 32-bit transactions may be received over a 64-bit bus 104 .
  • the burst shaper 1012 may expand these narrow transactions to be compatible with a width of the interface 110 to the slave 108 .
  • the burst shaper 1012 may re-align incoming unaligned transactions and/or support variable data widths (if the slave 108 supports this feature).
  • a transaction controller 1014 (also known as an efficiency controller or arbitration controller)
  • FIG. 11 illustrates a read interface 1100 that includes an address buffer 1102 , a control buffer 1104 , and a completed list 1106 .
  • the data buffer 1108 may be a FIFO because the read data comes from only the slave 108 (i.e., a single source) and there is no interleaving of data. Because of the non-interleaving, the control buffer 1104 may not maintain data pointers for the read data.
  • a method for operating a slave-interface unit in accordance with embodiments of the invention is shown in a flowchart in FIG. 12 .
  • write data associated with transactions received from a bus is stored (in, e.g., a write buffer).
  • write addresses associated with transactions received from the bus are stored (in, e.g., an address buffer), and, in a third step 1206, information regarding in-order dependencies among the transactions is stored (as, e.g., a linked list in the address buffer).
  • a received transaction is selected for out-of-order execution in accordance with the in-order dependencies.
  • FIG. 13 illustrates a corresponding read transaction, in which received read addresses are stored (step 1302), as is information regarding any in-order dependencies (step 1304).
  • a read transaction is selected for execution ( 1306 ), and, when the corresponding read data is received back from the slave, it is stored (in, e.g., a FIFO) and sent back to the master (step 1306 ).
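
The burst shaper and burst address generator described in the list above lend themselves to a small worked example. The following Python sketch is illustrative only: the function name `chop_burst`, its parameters, and the use of a single unit for both addresses and lengths are assumptions made here, not details taken from the patent. It splits one received burst into sub-bursts that fit a slave's maximum burst length and reports the "internal" start address of each piece.

```python
def chop_burst(start_addr, length, max_len):
    """Split one burst into (address, length) sub-bursts of at most max_len.

    Hypothetical helper: addresses and lengths share one unit (e.g., bytes).
    """
    if length <= 0 or max_len <= 0:
        raise ValueError("length and max_len must be positive")
    pieces = []
    offset = 0
    while offset < length:
        # The final piece may be shorter than max_len.
        piece = min(max_len, length - offset)
        # start_addr + offset plays the role of the burst address
        # generator's output: the start address of this smaller burst.
        pieces.append((start_addr + offset, piece))
        offset += piece
    return pieces
```

With `chop_burst(0x1000, 64, 16)` the sketch yields four sub-bursts of length 16, mirroring the 64-to-16 example above; a response to the master would be sent only after the last piece has been submitted.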

Abstract

A slave-interface unit for use with a system-on-a-chip bus (such as an AXI bus) executes received transactions out-of-order while accounting for groups of in-order transactions.

Description

TECHNICAL FIELD
Embodiments of the invention generally relate to bus transactions and, in particular, to efficiently scheduling bus transactions to occur in an order different from that in which they are received.
BACKGROUND
As transistors shrink and die sizes grow, more and more digital-logic system components traditionally implemented as separate chips in discrete packages are being implemented together on a single chip (a so-called “system on a chip” or “SoC”). Internal buses on the SoC connect the various internal components; unlike traditional, off-chip buses, the on-chip buses need not be the bandwidth-limiting factor in communication between components. For example, while it may be expensive in resources, area, and power to double the bandwidth of an (e.g.) off-chip printed-circuit-board bus, it may be comparatively cheap to do so for an on-chip bus. Furthermore, the less-severe crosstalk, reflections, and/or other noise on-chip buses are exposed to may make it easier to run on-chip buses at higher frequencies (e.g., at the same clock frequencies at which the SoC components themselves run). Special care must be taken, however, to maximize the benefits of the advantages presented by on-chip buses.
Out-of-order execution of transactions received by a shared resource is one way to increase the efficiency of on-chip buses. For example, a memory (an example of a bus “slave”) may be shared by two on-chip processors (examples of bus “masters”). The throughput to and from one master may be relatively high and the throughput to the other master may be relatively low (due to any one of many design factors and considerations). In this case, a long series of transactions between the slave and the slow master may disadvantageously delay a later-received transaction between the slave and the fast master (the “fast” transaction may be received after all the “slow” transactions have been received, but are still executing, or may be received during receipt of—or “interleaved” with—the slow transactions). By allowing transactions to execute out-of-order, the slow transactions may be temporarily suspended so that the fast transaction may execute. The increase in total execution time for the slow transactions may be negligible, while the fast transaction avoids a potentially significant delay.
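The trade-off described above can be made concrete with a little arithmetic. The Python sketch below is a simplified model, not the patented mechanism: the function name `finish_times`, the cycle counts, and the assumption that a running transaction is never interrupted (the fast transaction slips in between two slow ones) are all illustrative choices.

```python
def finish_times(slow_count, slow_cycles, fast_cycles, fast_arrival, out_of_order):
    """Return (fast_finish, slow_finish) in cycles for one shared slave.

    Assumes every slow transaction is queued at t=0 and the slave runs one
    transaction at a time without interrupting it.
    """
    t = 0
    fast_finish = None
    for _ in range(slow_count):
        # With reordering, the fast transaction may jump ahead of the
        # remaining slow ones as soon as it has arrived.
        if out_of_order and fast_finish is None and t >= fast_arrival:
            t += fast_cycles
            fast_finish = t
        t += slow_cycles
    slow_finish = t
    if fast_finish is None:
        # Otherwise the fast transaction waits behind every slow one.
        fast_finish = max(t, fast_arrival) + fast_cycles
    return fast_finish, slow_finish
```

For eight slow transactions of 10 cycles each and one 2-cycle fast transaction arriving at cycle 5, the in-order case finishes the fast transaction at cycle 82, while the reordered case finishes it at cycle 12 and delays the slow series by only 2 cycles (82 instead of 80).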
One example of a protocol that supports out-of-order execution is known as the Advanced Microcontroller Bus Architecture (“AMBA”), and specifically an aspect of it called multi-layer Advanced Extensible Interface, or “multi-layer AXI.” Multi-layer AXI is an architecture capable of providing the maximum bandwidth between each of the masters and the slaves in a system while requiring only a routing density comparable to that of the SoC components. Every connection in a multi-layer AXI system looks like, and behaves like, a direct master-slave connection; existing peripheral and sub-systems (e.g., those not programmed for the advanced features of multi-layer AXI) may thus be compatibly connected via the architecture. One aspect of multi-layer AXI that enables these features is the association of an identification (“ID”) tag with each bus transaction; transactions having the same IDs have internal dependencies and must be completed in order, while transactions having different IDs may be completed in any order. Multi-layer AXI also supports write-data interleaving, in which groups of write data transactions from two or more masters are received, at a slave, interspersed with each other; the slave tracks and maintains the original sources of the transactions and honors any dependencies therebetween.
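The ID-tag rule stated above (same ID: complete in order; different IDs: any order) can be expressed as a short check. This Python sketch is a minimal illustration under assumed names; `respects_id_order` and the `(name, id_tag)` tuple representation are hypothetical and are not part of the AXI specification or of the patent.

```python
def respects_id_order(received, completed):
    """Check a completion order against the same-ID-in-order rule.

    Both arguments are lists of (name, id_tag) tuples; `received` gives
    arrival order and `completed` gives completion order.
    """
    # Every received transaction must complete exactly once.
    if sorted(received) != sorted(completed):
        return False
    for tag in {t for _, t in received}:
        arrival = [name for name, t in received if t == tag]
        finish = [name for name, t in completed if t == tag]
        # Transactions sharing an ID must finish in arrival order;
        # the relative order of different IDs is unconstrained.
        if arrival != finish:
            return False
    return True
```

For example, if transactions `a` (ID1), `b` (ID2), and `c` (ID1) arrive in that order, completing `b, a, c` satisfies the rule, whereas completing `c` before `a` does not.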
Any efficient implementation of an SoC bus protocol like multi-layer AXI, if it accommodates out-of-order execution, must therefore account for the design challenges that groups of in-order transactions and/or data interleaving present. Existing designs may use first-in-first-out (“FIFO”) and/or simple buffers to capture bus transaction requests as they are received at a slave, but these designs require sophisticated control logic to account for, and properly deal with, the mixture of in-order and out-of-order transactions as well as control logic to de-interleave received data. These implementations are thus large, inefficient, and power-hungry; a need therefore exists for a small, elegant, low-power implementation.
SUMMARY
In general, various aspects of the systems and methods described herein execute bus transactions in an order different from that in which they were received. Groups of in-order transactions (e.g., “burst” transactions) are accounted for (i.e., their order is preserved) by storing information regarding their dependencies; when a next transaction is to be selected for execution, only the first transaction of a group of in-order transactions is considered as eligible for execution (along with any other pending out-of-order transactions and/or other groups of in-order transactions). In one embodiment, the groups of in-order transactions are stored using a hardware linked list; the first transaction in the group points to the second transaction, the second to the third, and so on. An additional hardware linked list may be used to receive, and account for, interleaved data.
In one aspect, a system for executing bus transactions includes address and data buffers and control circuitry. The data buffer stores write data associated with transactions received from a bus, and the address buffer stores (i) write addresses associated with transactions received from the bus and (ii) information regarding in-order dependencies among the transactions. The control circuitry selects a received transaction for out-of-order execution in accordance with the in-order dependencies.
The address buffer may include a linked list, which may include a series of in-order transactions. Selecting the received transaction may include selecting a head of the series of the in-order transactions. The data buffer may include a linked list, which may include a series of related write data received on the bus interleaved with unrelated write data. The control circuitry may include a control buffer for storing information linking write data and write addresses, an arbitration unit for selecting the received transaction, a free-buffer-list FIFO for storing available locations in the data buffer, a burst shaper for chopping a received burst into smaller bursts, and/or a completed list for storing information regarding completed transactions.
In another aspect, a method for executing bus transactions includes storing write data and write address associated with transactions received from a bus. Information regarding in-order dependencies among the transactions is also stored. A received transaction is selected for out-of-order execution in accordance with (i.e., in a manner that respects) the in-order dependencies.
Storing information regarding in-order dependencies may include linking a first in-order transaction to a second in-order transaction. Selecting the received transaction may include selecting a head of a series of linked transactions. Storing write data may include linking a series of related write data received on the bus interleaved with unrelated write data. A burst of write data may be shaped in accordance with a slave interface.
These and other objects, along with advantages and features of the present invention herein disclosed, will become more apparent through reference to the following description, the accompanying drawings, and the claims. Furthermore, it is to be understood that the features of the various embodiments described herein are not mutually exclusive and can exist in various combinations and permutations.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings, like reference characters generally refer to the same parts throughout the different views. In the following description, various embodiments of the present invention are described with reference to the following drawings, in which:
FIG. 1 illustrates a basic master/slave bus architecture;
FIG. 2 illustrates a slave interface unit in accordance with an embodiment of the invention;
FIGS. 3A, 3B, 4A, 4B, and 5 illustrate address buffers in accordance with embodiments of the invention;
FIG. 6 illustrates a control buffer in accordance with an embodiment of the invention;
FIG. 7 illustrates a data buffer in accordance with an embodiment of the invention;
FIGS. 8A, 8B, and 8C illustrate free-buffer-list FIFOs in accordance with embodiments of the invention;
FIGS. 9 and 10 illustrate implementations of slave interface units in accordance with embodiments of the invention;
FIG. 11 illustrates a read data path in accordance with an embodiment of the invention;
FIGS. 12 and 13 are flowcharts illustrating methods for operating a slave interface unit in accordance with embodiments of the invention.
DETAILED DESCRIPTION
A basic master-slave interface 100 is shown in FIG. 1. A bus master 102 communicates over a bus 104 (such as an AXI bus or any other SoC bus) with an interface unit 106. A slave 108, in turn, communicates with the interface unit via a local link 110. The master 102 may send read or write transactions over the bus 104; the interface unit 106 receives the transactions and fulfills them by forwarding them to the slave 108. In the simplest case, the interface unit merely forwards the requests to the slave 108 as they are received; as explained in greater detail below, however, the interface unit 106 may include buffers (or other means of temporary storage) to store the incoming transactions and execute them at a later point in time. Only a single master 102 and slave 108 are shown, but any number of masters 102 and slaves 108 is within the scope of the current invention. In these more complicated systems, the bus 104 may be a network or “fabric” of bus connections connecting the various components.
A more detailed representation of the interface unit 106 is shown in FIG. 2. The bus 104 may deliver bus transactions from the master 102 in the form of read addresses and/or write addresses and associated write data (each transaction possibly having control/status information bundled therein). In the case of an incoming write transaction, a data buffer 202 may be used to store the incoming write data, an address buffer 204 may be used to store the incoming write addresses, and a control buffer 206 may be used to store the incoming control/status information (e.g., an ID tag). Similarly, in the case of an incoming read transaction, the address buffer 204 and the control buffer 206 may be used to store the incoming read address and control/status information; the data buffer 202 may be used to hold the read data once it is read out from the slave 108. The particular arrangement of the buffers 202, 204, 206 is, however, not meant to be limiting, and any combination or permutation of the three buffers 202, 204, 206 is within the scope of the current invention. Each buffer 202, 204, 206 may be implemented in any kind of storage medium, device, or structure, including (for example) partitions or sections in a random-access memory, hardware registers, flip-flops, or latches.
As explained in greater detail below, the buffers 202, 204, 206 store information related to the interdependencies of bus transactions as well as the actual transactions. For example, a transaction having no dependencies with respect to other transactions may be stored by itself in the buffers 202, 204, 206, while a group of transactions having a dependency (i.e., the transactions in the group must be executed in-order) may have that information encoded into the buffers 202, 204, 206. In one embodiment, the address buffer 204 stores this dependency information as a hardware linked list; the first transaction in such a group of dependent transactions is stored with a link to the second transaction, the second is stored with a link to the third, and so on. In the case of a multi-layer AXI bus, the control buffer 206 may store the ID tag associated with each bus transaction; as a new transaction is received, its ID tag is examined. If it matches the ID tag of one or more transactions already received, the new transaction is added to the end of a linked list representing the rest of the similarly tagged transactions. Received transactions having unique ID tags are simply added to the buffers 202, 204, 206.
Similarly, as also explained in greater detail below, the data buffer 202 may include linked lists of received data. As interleaved transactions are received (i.e., two sets of unrelated transactions received from two different bus masters interspersed with each other), data associated with the transactions may be stored in the data buffer 202 in the order it is received. As the data entries are stored, however, pointers to previously received, related data are also stored and associated with them (as, e.g., another field in a buffer row, as explained in greater detail below). A group of related data may thus be read from the data buffer 202 by following the links from the first data entry, despite each data entry being stored throughout the data buffer 202.
Once the incoming transactions are stored in the buffers 202, 204, 206, control logic 208 may be used to select a next transaction, or set of transactions, for execution. In one embodiment, the control logic considers each stand-alone transaction and the head of each linked list of dependent transactions when determining a next transaction to execute. Pointers or links 210 (e.g., an address of an entry in a first buffer stored as a data field in a second) between the buffers 202, 204, 206 may be used to identify the data, address, and control information associated with each transaction. Once a transaction has been executed, the control logic 208 removes it from the buffers 202, 204, 206.
One implementation of an address buffer 300 is illustrated in FIG. 3A. In this implementation, which for illustrative purposes is explained using a write operation, the address buffer stores write addresses 302. One of skill in the art will understand that the address buffer may also be used to store read addresses. Associated with each write address 302 is a valid bit 304 that indicates whether a given entry holds a valid address (e.g., the valid bit 304 holds a binary 1 to indicate a valid address and a binary 0 for an invalid address). A mask bit 306 indicates whether a valid address should be considered for arbitration and execution; transactions that are not the heads of linked lists, for example, may be masked off. A next-transaction address 308 includes a pointer to a next transaction, if any, in a linked list of transactions. An ID tag 310 stores a transaction ID, such as the ID used in multi-layer AXI buses, for each address. A control-buffer pointer 312 points to a corresponding entry in a control buffer (e.g., the control buffer 206 discussed above). The current invention is not limited to this particular implementation, however, and one of skill in the art will understand that the previously described information may be stored in any of a variety of ways.
The operation of the address buffer 300 will now be explained in greater detail. With reference again to FIG. 3A, the address buffer 300 holds a first transaction 314 at an entry 0 in the buffer 300. When this transaction arrives, its ID 310 (in this example, “ID1”) is not equal to that of any existing entries in the address buffer 300; the new transaction 314 is added, marked as valid 304, and marked as not masked as indicated at 306. In one embodiment, the next-transaction address 308 is set to a constant (e.g., 0xF) to indicate it is the last entry in a linked list having that ID 310 (in this case, it is the first, last, and only entry in the linked list). Its entry in the control-buffer pointer field 312 points to a corresponding entry in the control buffer.
In FIG. 3B, a second transaction 316 arrives and is stored in a second entry 1 in the address buffer 300; this transaction 316 has the same ID 310 as the first transaction 314. To reflect this dependence between the first 314 and second 316 transactions, the next address 308 of the first transaction 314 is modified to include the address (“1”) of the second transaction 316 in the buffer 300. The second transaction 316 is marked as valid as indicated at 304, but, because it is not the head of a linked list, its mask bit 306 is marked as masked. In general, when a transaction arrives that has an ID equal to that of any other entries in the address buffer 300, the new transaction is linked to the end of the list of existing entries (as indicated, for example, by the entry having a next address 308 of 0xF) and masked.
The second transaction 316, being masked, is ineligible for execution (reflecting the in-order nature of the first 314 and second 316 transactions; the first transaction must be executed first). The second transaction 316 may be unmasked when the first transaction 314 has executed. Upon execution of the first transaction 314, its next address 308 is examined and, if not null (e.g., 0xF), its corresponding transaction (i.e., the second transaction 316) is identified and unmasked. Being unmasked, the second transaction 316 is thus eligible for execution. This chain of unmasking transactions continues until the last transaction in the linked list is identified and executed.
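The linking and unmasking behavior described for FIGS. 3A and 3B can be sketched in software. The following is a minimal Python model, not the hardware implementation: the class name, method names, buffer size, and the sample IDs and addresses are assumptions made for illustration, and the mask is modeled as a boolean in which True means "masked off" (ineligible).

```python
END = 0xF  # end-of-list marker, as in the text's example

class AddressBuffer:
    """Software model of the address buffer (names and size are assumptions)."""

    def __init__(self, size=8):
        # size is kept below 0xF so that no slot index collides with END
        self.rows = [dict(valid=False, masked=False, next=END, id=None, addr=None)
                     for _ in range(size)]

    def receive(self, txn_id, addr):
        """Store a new transaction, linking it behind any same-ID entries."""
        # First free slot (raises StopIteration if the buffer is full)
        slot = next(i for i, r in enumerate(self.rows) if not r["valid"])
        # Tail of an existing list: valid, same ID, next pointer equal to END
        tail = next((i for i, r in enumerate(self.rows)
                     if r["valid"] and r["id"] == txn_id and r["next"] == END), None)
        # A transaction that joins an existing list is masked; a new ID is not
        self.rows[slot] = dict(valid=True, masked=tail is not None,
                               next=END, id=txn_id, addr=addr)
        if tail is not None:
            self.rows[tail]["next"] = slot
        return slot

    def eligible(self):
        """Slots that arbitration may consider: valid and not masked."""
        return [i for i, r in enumerate(self.rows) if r["valid"] and not r["masked"]]

    def complete(self, slot):
        """Retire an executed transaction and unmask its successor, if any."""
        row = self.rows[slot]
        row["valid"] = False
        if row["next"] != END:
            self.rows[row["next"]]["masked"] = False
        row["next"] = END

buf = AddressBuffer()
first = buf.receive("ID1", 0x1000)   # slot 0, head of the ID1 list
second = buf.receive("ID1", 0x1004)  # slot 1, linked behind slot 0 and masked
other = buf.receive("ID2", 0x2000)   # slot 2, stand-alone
before = buf.eligible()              # [0, 2]
buf.complete(first)
after = buf.eligible()               # [1, 2]
print(before, after)
```

Only heads and stand-alone entries are ever visible to the arbiter in this model, which is what preserves the order within each ID while leaving different IDs free to be reordered.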
Another example of an address buffer 400 is illustrated in FIG. 4A. Three transactions are stored in the address buffer 400: a first transaction 402 at entry 0 (which is the head of a three-member linked list of transactions that also includes the transactions at entries 1 and 3), a second transaction 404 at entry 2, and a third transaction 406 at entry 4. An arbitration unit (within, for example, the control logic 208 described above with reference to FIG. 2) decides which of the three transactions 402, 404, 406 will next execute. Any suitable arbitration unit and/or functionality is within the scope of the current invention, as one of skill in the art will understand, and the current invention is not limited to any particular means or method of arbitration. In this example, however, the arbitration logic selects the first transaction 402 for execution, and the state of the buffer 400 after said execution is illustrated in FIG. 4B. In this figure, the valid bit 408 of the first transaction 402 has been cleared to reflect the execution of this transaction, and the next member of the linked list, the entry 410 at location 1, has its mask bit 412 set to indicate that it is now available for execution.
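Because any arbitration scheme is permitted, it may help to see that the in-order constraint holds regardless of the policy chosen. The sketch below rebuilds the FIG. 4A state as described above (entries 0, 1, and 3 linked; entries 2 and 4 stand-alone) and drains it under two hypothetical policies. The function names and the policies themselves are assumptions for illustration; True in the `masked` field means ineligible.

```python
END = 0xF  # end-of-list marker

def fig4a():
    """Buffer state described for FIG. 4A: list 0 -> 1 -> 3, plus entries 2 and 4."""
    return {
        0: {"valid": True, "masked": False, "next": 1},
        1: {"valid": True, "masked": True,  "next": 3},
        2: {"valid": True, "masked": False, "next": END},
        3: {"valid": True, "masked": True,  "next": END},
        4: {"valid": True, "masked": False, "next": END},
    }

def candidates(rows):
    """Entries the arbiter may review: valid and not masked."""
    return sorted(i for i, r in rows.items() if r["valid"] and not r["masked"])

def execute(rows, i):
    """Clear the valid bit and unmask the next list member, if any."""
    rows[i]["valid"] = False
    nxt = rows[i]["next"]
    if nxt != END:
        rows[nxt]["masked"] = False

def drain(rows, policy):
    """Execute everything, letting `policy` pick among the candidates each round."""
    order = []
    while (cands := candidates(rows)):
        pick = policy(cands)
        execute(rows, pick)
        order.append(pick)
    return order

print(candidates(fig4a()))   # [0, 2, 4]
print(drain(fig4a(), min))   # [0, 1, 2, 3, 4]
print(drain(fig4a(), max))   # [4, 2, 0, 1, 3]
```

Under either policy, entries 0, 1, and 3 execute in that relative order, while entries 2 and 4 are free to move ahead of or behind them.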
In one embodiment, another field may be included in the address buffer to facilitate the execution of interleaved transactions. FIG. 5 illustrates an address buffer 500 that includes a data-arrived (or “DA”) field 502 that is asserted when all of the data associated with a given address has arrived (said data being stored in, for example, the data buffer 202 described above with reference to FIG. 2). In this embodiment, a number representing the total number of pieces (or “beats”) of data is sent from a bus master along with the rest of the control information associated with a transaction. As each beat of data arrives, a beat counter in the control logic 208 increments, and the control logic compares the value of the counter with the total number of beats associated with the given transaction. When the numbers match, all the data has arrived, and the data-arrived field 502 is set, indicating that the corresponding transaction is available for arbitration.
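The beat-counting comparison described above can be illustrated with a short Python sketch. The class and attribute names are hypothetical, and a single counter per transaction is assumed for simplicity.

```python
class BeatTracker:
    """Tracks arriving data beats for one transaction (illustrative names)."""

    def __init__(self, total_beats):
        self.total_beats = total_beats  # sent by the master with the control info
        self.received = 0               # beat counter
        self.data_arrived = False       # models the DA field

    def on_beat(self):
        """Count one arriving beat; set DA when the count matches the total."""
        self.received += 1
        if self.received == self.total_beats:
            self.data_arrived = True
        return self.data_arrived

txn = BeatTracker(total_beats=4)
flags = [txn.on_beat() for _ in range(4)]
print(flags)  # [False, False, False, True]
```

In this model the transaction becomes available for arbitration only after the fourth beat, matching the condition under which the data-arrived field is set.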
An illustrative example of a control buffer 600 is shown with another address buffer 602 in FIG. 6. As mentioned above, the control buffer 600 stores control information associated with each incoming transaction; the address buffer 602 includes a control-buffer pointer field 604 that links to entries in the control buffer 600. In this example, entry 0 in the address buffer 602 includes a pointer 604 to entry 0 in the control buffer 600, entry 1 in the address buffer 602 includes a pointer 604 to entry 5 in the control buffer 600, entry 2 in the address buffer 602 includes a pointer 604 to entry 2 in the control buffer 600, entry 3 in the address buffer 602 includes a pointer 604 to entry 3 in the control buffer 600, and entry 4 in the address buffer 602 includes a pointer 604 to entry 4 in the control buffer 600.
Each entry in the control buffer 600 includes a pointer 606 to a corresponding entry in the data buffer; in another embodiment, the control buffer 600 includes two pointers 606 for each entry, wherein one pointer indicates a first data beat associated with a transaction and the other pointer indicates a last data beat associated with a transaction. Each entry in the control buffer 600 may include additional information associated with each transaction. An ID field 608 may store the AXI (or other protocol) ID of a transaction, and a burst profile 610 may contain burst-related information about a transaction (such as, for example, burst length, burst size, burst type, and/or byte lane). A valid bit 612 indicates whether an entry is valid or invalid.
A data buffer 700 is illustrated in FIG. 7 (along with an address 702 and a control 704 buffer). The data buffer 700 includes a data field 706 and a next-entry field 708, which indicates a relationship among interleaved data. In one embodiment, interleaved data is maintained as a linked list, in which later-arriving data is linked to previously arriving data. For example, the data 710 at address 6 in the data buffer 700 is linked to additional data 712 at address 9 and data 714 at address 12 via use of the next-entry field 708, despite other data 716 being received in-between the receipt of the linked data 710, 712, 714. The next-entry field 708 associated with the last item of data in the list holds a value of 0xF, indicating the end of the list.
FIG. 7 also illustrates the links 718 between the control buffer 704 and the data buffer 700. For example, entry 0 of the control buffer 704 links to address 6 in the data buffer 700 (as the first data beat 710 of the associated transaction) and to address 12 in the data buffer 700 (as the last data beat 714 of the same transaction). In this example, if an additional beat of data is received for that same transaction, it is stored in the data buffer 700 and linked to the last data item 714 by changing the next-entry field 708 of the last item 714 to reflect the location of the new data. The control buffer 704 is also updated to reflect the new end of the linked list.
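The data-buffer linked list and the first/last pointers held in the control buffer can be modeled as follows. This is a sketch under stated assumptions: the class and method names are invented for illustration, locations are handed out in ascending order by a simple list (rather than the free-buffer-list FIFO), and the two dictionaries stand in for the control-buffer pointers.

```python
END = 0xF  # end-of-list marker

class DataBuffer:
    """Software model of the data buffer with a next-entry field per row."""

    def __init__(self, size=15):
        # size stays below 0xF + 1 so that no location collides with END
        self.data = [None] * size
        self.next_entry = [END] * size
        self.free = list(range(size))  # simplistic allocator (assumption)
        self.first = {}                # transaction -> first beat location
        self.last = {}                 # transaction -> last beat location

    def store(self, txn, value):
        """Store one beat in arrival order and link it to the previous beat."""
        loc = self.free.pop(0)
        self.data[loc] = value
        self.next_entry[loc] = END
        if txn in self.last:
            self.next_entry[self.last[txn]] = loc  # extend the existing list
        else:
            self.first[txn] = loc                  # first beat of this transaction
        self.last[txn] = loc                       # update the end-of-list pointer
        return loc

    def read(self, txn):
        """De-interleave by following next-entry links from the first beat."""
        out, loc = [], self.first[txn]
        while loc != END:
            out.append(self.data[loc])
            loc = self.next_entry[loc]
        return out

buf = DataBuffer()
# Beats from two transactions arrive interleaved
for txn, value in [("A", "A0"), ("B", "B0"), ("A", "A1"), ("B", "B1"), ("A", "A2")]:
    buf.store(txn, value)
print(buf.read("A"))  # ['A0', 'A1', 'A2']
print(buf.read("B"))  # ['B0', 'B1']
```

Keeping a pointer to the last beat lets a newly arrived beat be appended without walking the list, which mirrors the control-buffer update described above.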
FIG. 8A illustrates a free-buffer-list FIFO 800 that may be used to track freely available locations in the data buffer. Each available location in the data buffer is an entry in the FIFO; when incoming data arrives, an entry in the FIFO 800 is de-queued and the data is stored at that location. Once data is flushed from the data buffer, its location is queued back into the FIFO 800. In this example, the top 802 of the FIFO 800 holds a value of 1; incoming data is thus stored at location 1 in the data buffer. FIG. 8B illustrates the FIFO 800 when the top entry 802 has been de-queued, and the new top entry 804 is 4. If entry 2 (for example) in the data buffer becomes available, it is queued into the last position 806 in the FIFO 800. FIG. 9 illustrates a system 900 that includes a data buffer 902, a control buffer 904, an address buffer 906, and a free-buffer-list FIFO 908. The FIFO 908 may be included in the control logic 208.
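The de-queue and re-queue behavior of the free-buffer-list FIFO can be sketched with Python's `collections.deque`. The values 1, 4, and 2 follow the example above; the remaining entry (7) is an assumed value added only to give the FIFO some depth.

```python
from collections import deque

# Free locations in the data buffer; the top of the FIFO is the left end.
free_list = deque([1, 4, 7])   # 7 is an assumed entry

# Incoming data arrives: de-queue the top entry and store the data there.
location = free_list.popleft()
print(location)        # 1
print(free_list[0])    # 4, the new top entry

# Data at location 2 is flushed: queue the location back at the end.
free_list.append(2)
print(list(free_list)) # [4, 7, 2]
```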
FIG. 10 illustrates another embodiment 1000 of the invention having a data buffer 1002, address buffer 1004, FIFO 1006, and control buffer 1008. Also included in FIG. 10 are further details of the control logic 208 shown in FIG. 2. A completed list 1010 is a FIFO that contains pointers to the control buffer 1008 of transactions that were completed (and/or sent to the slave 108 for completion). The completed list 1010 may store pending, outgoing transactions and be de-queued upon the successful sending of the completed transaction over the bus interface 104. In one embodiment, the completed list 1010 is used to reference the ID tag of a transaction to be sent. The valid bit of the control buffer 1008 may be de-asserted once the transaction response is sent out on the bus 104. In general, a write request received from the bus 104 is honored if there are free entries in the address 1004, control 1008, and data 1002 buffers, and if the completed list 1010 has a free space.
A burst shaper 1012 may disburse transactions (i.e., prepare and send for execution) stored in the buffers 1002, 1004, 1008 and store them, upon completion and/or sending, in the completed list 1010. The burst shaper 1012 may be used to chop larger burst sizes into smaller ones to comply with constraints of the slave 108. For example, if a 64 kb burst arrives but the slave 108 supports only 16 kb bursts, the burst shaper 1012 divides the received burst into four 16 kb bursts. When the burst shaper 1012 chops a bigger burst into smaller ones, a burst address generator 1014 outputs the address of the chopped burst (i.e., the “internal” address within the original, larger burst that is now the starting address of a smaller burst). The smaller bursts are then submitted to the slave 108; as each is submitted, its data is sequentially removed from the linked list in the data buffer 1002. A response may be sent out on the bus interface 104 only when all of the chopped, smaller bursts are submitted to the slave 108; at this point (which corresponds to completion of the original, larger burst), the valid bit in the address buffer 1004 corresponding to the transaction is made invalid.
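The chopping and address generation described above can be illustrated with a small Python function. The function name and the starting address are hypothetical, and the sketch assumes that addresses advance in the same units used to express the burst length.

```python
def chop_burst(start_addr, length, max_len):
    """Split a burst into sub-bursts no longer than max_len.

    Returns (start address, length) pairs; each start address is the
    "internal" address within the original burst.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    sub_bursts = []
    offset = 0
    while offset < length:
        n = min(max_len, length - offset)  # the final sub-burst may be shorter
        sub_bursts.append((start_addr + offset, n))
        offset += n
    return sub_bursts

# A burst of 64 units chopped for a slave that accepts at most 16 units
parts = chop_burst(0x1000, 64, 16)
print([(hex(addr), n) for addr, n in parts])
# [('0x1000', 16), ('0x1010', 16), ('0x1020', 16), ('0x1030', 16)]
```

A response for the original burst would be issued only after the last of the returned sub-bursts has been submitted.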
The burst shaper 1012 may also be used to support incoming transactions that are narrower than a maximum width of the bus 104; for example, 8-, 16-, or 32-bit transactions may be received over a 64-bit bus 104. The burst shaper 1012 may expand these narrow transactions to be compatible with a width of the interface 110 to the slave 108. Similarly, the burst shaper 1012 may re-align incoming unaligned transactions and/or support variable data widths (if the slave 108 supports this feature). Finally, a transaction controller 1014 (also known as an efficiency controller or arbitration controller) may be used to select which among a plurality of available transactions will be next executed.
The above discussion relates to write transactions, but one of skill in the art will understand that much of it applies to read transactions as well. FIG. 11 illustrates a read interface 1100 that includes an address buffer 1102, a control buffer 1104, and a completed list 1106. The data buffer 1108 may be a FIFO because the read data comes from only the slave 108 (i.e., a single source) and there is no interleaving of data. Because of the non-interleaving, the control buffer 1104 may not maintain data pointers for the read data.
A method for operating a slave-interface unit in accordance with embodiments of the invention is shown in a flowchart in FIG. 12. In a first step 1202, write data associated with transactions received from a bus is stored (in, e.g., a write buffer). In a second step 1204, write addresses associated with transactions received from the bus are stored (in, e.g., an address buffer), and, in a third step 1206, information regarding in-order dependencies among the transactions is stored (as, e.g., a linked list in the address buffer). In a fourth step 1208, a received transaction is selected for out-of-order execution in accordance with the in-order dependencies. FIG. 13 illustrates a corresponding read transaction, in which received read addresses are stored (step 1302), as is information regarding any in-order dependencies (step 1304). A read transaction is selected for execution (1306), and, when the corresponding read data is received back from the slave, it is stored (in, e.g., a FIFO) and sent back to the master (step 1306).
The terms and expressions employed herein are used as terms and expressions of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described or portions thereof. In addition, having described certain embodiments of the invention, it will be apparent to those of ordinary skill in the art that other embodiments incorporating the concepts disclosed herein may be used without departing from the spirit and scope of the invention. Accordingly, the described embodiments are to be considered in all respects as only illustrative and not restrictive.

Claims (19)

What is claimed is:
1. A system for executing bus transactions, the system comprising:
a data buffer for storing write data associated with a plurality of transactions received from a bus;
an address buffer for storing (i) write addresses associated with the plurality of transactions received from the bus and (ii) information regarding in-order dependencies among the plurality of transactions, wherein the address buffer includes a plurality of linked lists; and
control circuitry for selecting a received transaction from the plurality of transactions for out-of-order execution in accordance with the in-order dependencies thereby interleaving execution of transactions from the plurality of linked lists, wherein selecting the received transaction for out-of-order execution includes reviewing transactions for which an associated mask bit has an on value.
2. The system of claim 1, wherein each of the plurality of linked lists comprises a series of in-order transactions.
3. The system of claim 2, wherein selecting the received transaction comprises selecting a head of one of the series of the in-order transactions.
4. The system of claim 1, wherein the data buffer comprises a linked list.
5. The system of claim 4, wherein the linked list comprises a series of related write data received on the bus interleaved with unrelated write data.
6. The system of claim 1, wherein the control circuitry comprises a control buffer for storing information linking write data and write addresses.
7. The system of claim 1, wherein the control circuitry comprises an arbitration unit for selecting the received transaction.
8. The system of claim 1, wherein the control circuitry comprises a free-buffer-list FIFO for storing available locations in the data buffer.
9. The system of claim 1, wherein the control circuitry comprises a burst shaper for chopping a received burst into smaller bursts.
10. The system of claim 1, wherein the control circuitry comprises a completed list for storing information regarding completed transactions.
11. A method for executing bus transactions, the method comprising:
storing write data associated with a plurality of transactions received from a bus;
storing write addresses associated with the plurality of transactions received from the bus;
storing information regarding in-order dependencies among the plurality of transactions, wherein storing information regarding in-order dependencies includes linking a first series of transactions and linking a second series of transactions,
selecting a received transaction from the plurality of transactions for out-of-order execution in accordance with the in-order dependencies, thereby interleaving execution of transactions from the first series of transactions and transactions from the second series of transactions, wherein selecting the received transaction for out-of-order execution includes reviewing transactions for which an associated mask bit has an on value.
12. The method of claim 11, wherein linking the first series of transactions includes linking a first in-order transaction to a second in-order transaction.
13. The method of claim 12, wherein selecting the received transaction comprises selecting a head of one of the first and second series of linked transactions.
14. The method of claim 11, wherein storing write data comprises linking a series of related write data received on the bus interleaved with unrelated write data.
15. The method of claim 11, further comprising shaping a burst of write data in accordance with a slave interface.
16. The method of claim 13, further comprising removing the selected received transaction from the head of the respective series of linked transactions such that a next transaction in the respective series of linked transactions becomes the head.
17. The method of claim 16, further comprising changing a value of the mask bit for the next transaction to on.
18. The method of claim 11, further comprising:
setting a value of the mask bit to on for a head of the first series of transactions and for a head of the second series of transactions, and setting the value of the mask bit to off for other transactions in the first and second series.
19. A system for executing bus transactions, the system comprising:
a data buffer for storing write data associated with a plurality of transactions received from a bus;
an address buffer for storing (i) write addresses associated with the plurality of transactions received from the bus and (ii) information regarding in-order dependencies among the plurality of transactions, wherein the address buffer includes a plurality of linked lists; and
means for selecting a received transaction from the plurality of transactions for out-of-order execution in accordance with the in-order dependencies thereby interleaving execution of transactions from the plurality of linked lists, wherein selecting the received transaction for out-of-order execution includes reviewing transactions for which an associated mask bit has an on value.
US13/422,021 2012-03-16 2012-03-16 Out-of-order execution of bus transactions Active 2032-08-29 US9348775B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/422,021 US9348775B2 (en) 2012-03-16 2012-03-16 Out-of-order execution of bus transactions
PCT/US2013/031865 WO2013138683A1 (en) 2012-03-16 2013-03-15 Out-of-order execution of bus transactions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/422,021 US9348775B2 (en) 2012-03-16 2012-03-16 Out-of-order execution of bus transactions

Publications (2)

Publication Number Publication Date
US20130246682A1 US20130246682A1 (en) 2013-09-19
US9348775B2 true US9348775B2 (en) 2016-05-24

Family

ID=48045080

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/422,021 Active 2032-08-29 US9348775B2 (en) 2012-03-16 2012-03-16 Out-of-order execution of bus transactions

Country Status (2)

Country Link
US (1) US9348775B2 (en)
WO (1) WO2013138683A1 (en)

Patent Citations (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US1005543A (en) 1910-09-06 1911-10-10 John N Heltzel Mold.
US5875309A (en) 1997-04-18 1999-02-23 3Com Corporation Arbitration system using linked table
US7647441B2 (en) 1998-11-13 2010-01-12 Sonics, Inc. Communications system and method with multilevel connection identification
US6216178B1 (en) 1998-11-16 2001-04-10 Infineon Technologies Ag Methods and apparatus for detecting the collision of data on a data bus in case of out-of-order memory accesses of different times of memory access execution
US6772300B1 (en) 2000-08-30 2004-08-03 Intel Corporation Method and apparatus for managing out of order memory transactions
US6907002B2 (en) * 2000-12-29 2005-06-14 Nortel Networks Limited Burst switching in a high capacity network
US6697923B2 (en) * 2001-06-05 2004-02-24 Via Technologies Inc. Buffer management method and a controller thereof
US8688853B2 (en) * 2001-12-21 2014-04-01 Agere Systems Llc Method and apparatus for maintaining multicast lists in a data network
US6668313B2 (en) 2001-12-21 2003-12-23 Agere Systems, Inc. Memory system for increased bandwidth
US20030217239A1 (en) 2002-05-14 2003-11-20 Jeddeloh Joseph M. Out of order DRAM sequencer
US7043593B1 (en) 2003-04-29 2006-05-09 Advanced Micro Devices, Inc. Apparatus and method for sending in order data and out of order data on a data bus
US7181556B2 (en) * 2003-12-23 2007-02-20 Arm Limited Transaction request servicing mechanism
US7555579B2 (en) * 2004-05-21 2009-06-30 Nortel Networks Limited Implementing FIFOs in shared memory using linked lists and interleaved linked lists
US7243200B2 (en) 2004-07-15 2007-07-10 International Business Machines Corporation Establishing command order in an out of order DMA command queue
US20060123206A1 (en) 2004-12-03 2006-06-08 Barrett Wayne M Prioritization of out-of-order data transfers on shared data bus
US20070067549A1 (en) 2005-08-29 2007-03-22 Judy Gehman Method for request transaction ordering in OCP bus to AXI bus bridge design
US7398361B2 (en) * 2005-08-30 2008-07-08 P.A. Semi, Inc. Combined buffer for snoop, store merging, load miss, and writeback operations
US20090240896A1 (en) 2006-01-12 2009-09-24 Mtekvision Co.,Ltd. Microprocessor coupled to multi-port memory
US7657791B2 (en) 2006-11-15 2010-02-02 Qualcomm Incorporated Method and system for a digital signal processor debugging during power transitions
US7603490B2 (en) 2007-01-10 2009-10-13 International Business Machines Corporation Barrier and interrupt mechanism for high latency and out of order DMA device
EP2126893A1 (en) 2007-03-22 2009-12-02 QUALCOMM Incorporated Pipeline techniques for processing musical instrument digital interface (midi) files
US7663051B2 (en) 2007-03-22 2010-02-16 Qualcomm Incorporated Audio processing hardware elements
US7718882B2 (en) 2007-03-22 2010-05-18 Qualcomm Incorporated Efficient identification of sets of audio parameters
US20090172250A1 (en) 2007-12-28 2009-07-02 Spansion Llc Relocating data in a memory device
US20090177821A1 (en) 2008-01-04 2009-07-09 Robert Michael Dinkjian Cache Intervention on a Separate Data Bus When On-Chip Bus Has Separate Read and Write Data Busses
US20090216993A1 (en) 2008-02-26 2009-08-27 Qualcomm Incorporated System and Method of Data Forwarding Within An Execution Unit
US7804735B2 (en) 2008-02-29 2010-09-28 Qualcomm Incorporated Dual channel memory architecture having a reduced interface pin requirements using a double data rate scheme for the address/control signals
US8145805B2 (en) * 2008-06-09 2012-03-27 Emulex Design & Manufacturing Corporation Method for re-sequencing commands and data between a master and target devices utilizing parallel processing
US8046513B2 (en) * 2008-07-22 2011-10-25 Realtek Semiconductor Corp. Out-of-order executive bus system and operating method thereof
US20100306423A1 (en) 2009-05-26 2010-12-02 Fujitsu Semiconductor Limited Information processing system and data transfer method
US8332564B2 (en) * 2009-10-20 2012-12-11 Arm Limited Data processing apparatus and method for connection to interconnect circuitry
US20110221743A1 (en) 2010-03-11 2011-09-15 Gary Keall Method And System For Controlling A 3D Processor Using A Control List In Memory
US8489794B2 (en) * 2010-03-12 2013-07-16 Lsi Corporation Processor bus bridge for network processors or the like
US8631184B2 (en) * 2010-05-20 2014-01-14 Stmicroelectronics (Grenoble 2) Sas Interconnection method and device, for example for systems-on-chip
US8677045B2 (en) * 2010-09-29 2014-03-18 Stmicroelectronics (Grenoble 2) Sas Transaction reordering system and method with protocol indifference
US20120159037A1 (en) * 2010-12-17 2012-06-21 Kwon Woo Cheol Memory interleaving device and method using reorder buffer
US8656078B2 (en) * 2011-05-09 2014-02-18 Arm Limited Transaction identifier expansion circuitry and method of operation of such circuitry
US20140101340A1 (en) * 2012-10-05 2014-04-10 Analog Devices, Inc. Efficient Scheduling of Transactions from Multiple Masters
US20140101339A1 (en) * 2012-10-05 2014-04-10 Analog Devices, Inc. Efficient Scheduling of Read and Write Transactions in Dynamic Memory Controllers
US8880745B2 (en) * 2012-10-05 2014-11-04 Analog Devices, Inc. Efficient scheduling of transactions from multiple masters
US8886844B2 (en) * 2012-10-05 2014-11-11 Analog Devices, Inc. Efficient scheduling of read and write transactions in dynamic memory controllers

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Hardware/Software Tradeoffs: A General Design Principle?"; 3 pages, Dated Jan. 1985. *
International Search Report and Written Opinion of PCT Application Serial No. PCT/US2013/031865 mailed Jul. 26, 2013, 12 pages.
Xilinx-"AXI Reference Guide"; 82 pages, Dated Mar. 7, 2011. *

Also Published As

Publication number Publication date
US20130246682A1 (en) 2013-09-19
WO2013138683A1 (en) 2013-09-19

Similar Documents

Publication Publication Date Title
US9348775B2 (en) Out-of-order execution of bus transactions
EP3729281B1 (en) Scheduling memory requests with non-uniform latencies
KR102401594B1 (en) high performance transaction-based memory systems
US8775754B2 (en) Memory controller and method of selecting a transaction using a plurality of ordered lists
US7200688B2 (en) System and method asynchronous DMA command completion notification by accessing register via attached processing unit to determine progress of DMA command
US20090138570A1 (en) Method for setting parameters and determining latency in a chained device system
US8412870B2 (en) Optimized arbiter using multi-level arbitration
EP3732578B1 (en) Supporting responses for memory types with non-uniform latencies on same channel
US10146468B2 (en) Addressless merge command with data item identifier
US7512729B2 (en) Method and apparatus for a high efficiency two-stage rotating priority arbiter with predictable arbitration latency
US20140359195A1 (en) Crossbar switch, information processing apparatus, and information processing apparatus control method
US10176126B1 (en) Methods, systems, and computer program product for a PCI implementation handling multiple packets
WO2013052695A1 (en) Inter-processor communication apparatus and method
US20130290984A1 (en) Method for Infrastructure Messaging
AU2003234641B2 (en) Inter-chip processor control plane
TW201303870A (en) Effective utilization of flash interface
CN109062604A (en) A launching method and device for mixed execution of scalar and vector instructions
US20040199706A1 (en) Apparatus for use in a computer systems
US20170308487A1 (en) Data transfer control system, data transfer control method, and program storage medium
US9846662B2 (en) Chained CPP command
US8787368B2 (en) Crossbar switch with primary and secondary pickers
US20120117286A1 (en) Interface Devices And Systems Including The Same
US20140173160A1 (en) Innovative Structure for the Register Group
JP2009194510A (en) Priority arbitration system and priority arbitration method
JP2014194619A (en) Buffer circuit and semiconductor integrated circuit

Legal Events

Date Code Title Description
AS Assignment
Owner name: ANALOG DEVICES, INC., MASSACHUSETTS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: JANDHYAM, KRISHNA S A; REEL/FRAME: 028899/0081
Effective date: 2012-08-30

STCF Information on status: patent grant
Free format text: PATENTED CASE

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 4

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 8