WO1996007139A1 - A multi-port memory system including read and write buffer interfaces - Google Patents

A multi-port memory system including read and write buffer interfaces

Info

Publication number
WO1996007139A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
memory array
memory
write
circuit
Prior art date
Application number
PCT/US1995/010684
Other languages
French (fr)
Inventor
Gary L. Mcalpine
Original Assignee
Mcalpine Gary L
Priority date
Filing date
Publication date
Application filed by Mcalpine Gary L
Priority to AU34122/95A
Publication of WO1996007139A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1015 Read-write modes for single port memories, i.e. having either a random port or a serial port
    • G11C7/103 Read-write modes for single port memories, i.e. having either a random port or a serial port using serially addressed read-write data registers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1642 Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing

Definitions

  • This invention relates to digital electronic system architectures and circuits therefor, particularly to such architectures and circuits for use in applications requiring high performance memory access and data transfer.
  • HDTV high definition television
  • digital system performance is driven by new applications, as well as advances in current applications
  • the implementation of high definition television depends critically on increasing digital system performance so as to achieve fundamental improvements in the quality of the large picture size of HDTV, relative to the current television standard.
  • advances in personal computers also require increases in system performance to accommodate developments such as parallel, superscalar and other advanced processing techniques.
  • Conventional system architectures generally combine a microprocessor, a main memory, and one or more other system components, such as other microprocessors and input/output devices. These architectures generally rely on a separate data communication mechanism that interconnects, and communicates data among, the system components. In particular, these architectures provide for interconnecting components through the data communication mechanism so as to share the main memory among each of the microprocessor and selected other system components.
  • These conventional architectures typically implement the data communication mechanism using either a conventional multi-drop data bus or a multi-port hardware switch. In multi-drop bus implementations, data communication is time-multiplexed among the system components coupled to the bus. In multi-port hardware switch implementations, each of the system components is coupled respectively to one of the switch ports, and data communication between any two components.
  • these architectures typically implement main memory using a plurality of conventional discrete dynamic random access memory (“DRAM”) devices, together with associated access circuitry.
  • DRAM discrete dynamic random access memory
  • These conventional architectures, while suitable for many applications, tend to be inadequate for high performance applications.
  • conventional architectures are inadequate for applications requiring one or more of high system throughput, high system bandwidth, or low system latencies
  • Conventional architectures have nevertheless been employed. To do so, the architectures' performance shortfalls have typically been addressed using custom engineering solutions that adhere to the fundamental confines of the architecture.
  • VRAM video random access memory
  • the implementation of the data communication mechanism is particularly associated with conventional architectures' performance shortfalls.
  • system performance is limited to the bandwidth and throughput of the bus. Bus bandwidth and throughput are subject both to the loading associated with interfacing the bus to system components and to the bus' physical characteristics, e.g., the length of the bus lines.
  • because buses time-multiplex data communications, system performance is limited by the associated latency in access to system data communications, a limitation that compounds with increases in either or both the number of components seeking to communicate and the size of each communication.
  • switches undesirably preclude each component's monitoring, e.g., "snooping", of the other components' memory activities, snooping generally being important to memory protection and cache coherency.
  • switches also tend to substantially preclude the communication of data from one component to a plurality of other components, e.g., multi-cast data communications.
  • SRAM static random access memory
  • the conventional approaches generally seek specifically to close the bandwidth gaps between main memory and microprocessors. Accordingly, the conventional approaches are not directed at improving cooperation among the system components so as to improve system performance. In particular, these approaches are not directed at improving communication of data among the system components or specifically at improving the sharing of main memory among a plurality of system components, all of which components may have bandwidths comparable to high performance microprocessors. Accordingly, there is a need for an improved digital electronic system architecture and, in particular, an architecture that permits implementation of high performance digital electronic systems by improving data communication and main memory sharing among the system components. There is also a need for an improved memory circuit and, particularly, for a memory circuit that permits implementation of high performance digital electronic systems.
  • the present invention meets the aforementioned needs and overcomes the aforementioned limitations by providing a digital electronic system architecture having one or more system components and a memory coupled to selected system components, the memory selectively storing and communicating data among the coupled components.
  • the digital electronic system preferably also has a transaction control bus, coupled to each of the selected system components and to the memory, for communicating command and control signals among the components and memory.
  • the invention also provides a memory circuit having a plurality of ports, each of the ports (i) having an input terminal and an output terminal that transfer data independently of one another, (ii) operating independently of one another and (iii) being coupled respectively to one of the other system components for data communication therewith.
  • the present invention also provides a read interface for a memory array, the interface having a queue for receiving data read from a row of the array and a selection circuit for placing in the queue a contiguous block of the read data, the size of the block and its placement being selectable
  • the read interface preferably comprises a plurality of queues, and the selection circuit preferably is adapted to place independently selectable blocks of the read data in independently selectable positions in selected queues.
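The read-side selection described above can be illustrated with a minimal Python sketch. The function name `place_block`, the list-based model of a row and a queue, and the sample sizes are assumptions made for illustration; the source does not specify an implementation.

```python
def place_block(row, start, size, queue, position):
    """Copy the contiguous block row[start:start+size] into `queue` at `position`.

    Hypothetical model: `row` stands for the data read from one array row and
    `queue` for one out queue register; both are plain Python lists of bits.
    """
    if start < 0 or size < 0 or start + size > len(row):
        raise ValueError("block falls outside the row")
    if position < 0 or position + size > len(queue):
        raise ValueError("block does not fit in the queue at that position")
    # Both the block size and its placement in the queue are selectable.
    queue[position:position + size] = row[start:start + size]
    return queue


# Example: select 3 bits starting at column 5 and place them at queue offset 2.
row = [0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
queue = [0] * 8
place_block(row, start=5, size=3, queue=queue, position=2)
```

With several queues, the same call could be repeated per queue with independent `start`, `size` and `position` values, which is one way to read "independently selectable blocks".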
  • the present invention also provides a write interface for a memory array, the interface having a queue for receiving data to be written to the array and a selection circuit for placing in the array a contiguous block of received data, the size of the block and its placement being selectable.
  • the write interface preferably comprises a plurality of queues, and the selection circuit preferably is adapted to place independently selectable data received from selected queues in independently selectable positions in the memory array.
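The write side can be sketched in the same hedged way. The function `write_block` and its mask-and-shift model are assumptions that loosely mirror the shift count and write mask generator named in Figure 9; they are not taken from the source.

```python
def write_block(array_row, data, position):
    """Return a new row in which `data` replaces the bits starting at `position`.

    Hypothetical model: the write mask marks the columns that receive new
    data; every unmasked column keeps its previous contents.
    """
    size = len(data)
    if position < 0 or position + size > len(array_row):
        raise ValueError("block does not fit in the row at that position")
    # Write mask: True only for the columns covered by the block.
    mask = [position <= col < position + size for col in range(len(array_row))]
    # Shift the queued data so that it lines up with the target columns.
    shifted = [0] * position + list(data) + [0] * (len(array_row) - position - size)
    return [new if m else old for old, new, m in zip(array_row, shifted, mask)]


old_row = [1, 1, 1, 1, 1, 1, 1, 1]
new_row = write_block(old_row, data=[0, 0, 0], position=4)
```

Only columns 4 through 6 change in this example; the remaining columns are written back unchanged.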
  • Figure 1 shows a schematic representation of a generalized digital electronic circuit implemented using an architecture according to the present invention.
  • Figure 2 shows a general block diagram of a memory circuit according to the present invention.
  • Figure 3 shows a block diagram of a specific embodiment of the memory circuit of
  • Figure 4 shows a logical organization of a RAM array according to the present invention.
  • Figure 5 shows a master control and a RAM access control according to the present invention
  • Figure 6 shows an embodiment of a load control according to the present invention
  • Figure 7 shows an embodiment of an unload control according to the present invention
  • Figure 8 shows an embodiment of a write access interface according to the present invention
  • Figure 9 shows an embodiment of a shift count and write mask generator circuit according to the present invention
  • Figure 10 shows an embodiment of a RAM core according to the present invention
  • Figure 11 shows an embodiment of a sense amplifiers and write back registers circuit according to the present invention.
  • Figure 12 shows an embodiment of a read access interface according to the present invention
  • Figure 13 shows a data flow diagram of a memory circuit according to the present invention
  • Figures 14 through 23 show timing diagrams of the operation of a memory circuit according to the present invention.
  • the control and address bus 16 is common to the main memory 12 and the other system components 14, and is sometimes referred to herein as the transaction control bus.
  • the main memory 12 has a plurality of ports 18, each port providing a mechanism for data communication between the main memory 12 and the respective system component 14 coupled to the main memory 12.
  • the storage functions of the main memory 12 preferably are shared by each of the system components 14, that is, the main memory 12 preferably is randomly accessible by the system components 14 through the respective ports 18.
  • the ports 18 preferably operate independently of each other, so as to facilitate data communication, including providing for effectively simultaneous data communication between main memory and any plurality of system components coupled thereto.
  • the other system components 14 include one or more microprocessors, mass storage devices, video controllers, input/output devices, network interfaces, or the like. One or more of these system components 14 may be coupled to one or more peripheral components 20.
  • although the digital electronic system 10 does not include any conventional data bus or hardware switch, it is to be recognized that the system 10 may include such a data bus or switch.
  • the digital electronic system 10 may comprise a conventional multi-drop input/output bus to which mass storage and other peripheral components are coupled, the bus generally being coupled to the main memory 12 by an interposed controller
  • the main memory 12, according to the digital electronic system architecture of the present invention, provides the primary mechanism for data communication among the system components 14 coupled thereto.
  • the transaction control bus 16 communicates command, control and address signals, but no data, among the system components 14.
  • the transaction control bus 16 preferably comprises a system clock, signals combined to form transaction descriptors, and one or more control and arbitration signals coordinating accesses of, respectively, the main memory 12 and the transaction control bus 16.
  • Each transaction descriptor preferably consumes a relatively small portion of the bus' bandwidth.
  • each transaction descriptor communicated over the bus 16 preferably is independent of the other communicated descriptors
  • Each transaction descriptor preferably corresponds to a predefined transaction. To do so, each transaction descriptor preferably includes information identifying the type of transaction, e.g., load, write and read transactions for accesses of the main memory 12, as well as information identifying each participating system component 14, e.g., by the port 18 at which the participating component 14 is respectively coupled to the main memory 12
  • the transaction control bus 16 preferably time multiplexes the communication of transaction descriptors thereover.
  • each of the system components 14 competes for access to the bus 16 when transmitting transaction descriptors associated with accessing the main memory 12.
  • Access to the transaction control bus 16 preferably is determined by a selected arbitration algorithm. Because system throughput is limited principally by time-multiplexed communication of transaction descriptors over the transaction control bus 16 and each such descriptor consumes a relatively small portion of the bus' bandwidth, the transaction control bus 16 provides for communication of descriptors at a relatively high rate.
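The text leaves the arbitration algorithm open. As one possible illustration, the sketch below uses round-robin arbitration, which is an assumption and not the source's stated method; the function name and the boolean request list are likewise hypothetical.

```python
def round_robin_grant(requests, last_granted):
    """Pick the next requesting component after `last_granted`.

    `requests[i]` is True when component i wants to send a transaction
    descriptor. Returns the granted index, or None if nobody is requesting.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if requests[candidate]:
            return candidate
    return None


# Components 0, 2 and 3 request the bus; component 0 was served last.
grant = round_robin_grant([True, False, True, True], last_granted=0)
```

Because each grant covers only a short descriptor, a simple scheme of this kind would need to track nothing more than the last granted component.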
  • each descriptor can control the communication of a relatively large amount of data
  • the system's use of the bus 16 provides for a substantially enhanced system throughput of data
  • Each transmission over the transaction control bus 16 by any of the main memory 12 or a system component 14 preferably is received by each of the other system components 14 and main memory 12, as the case may be.
  • the system 10 supports conventional techniques and technologies including (i) snooping by each system component 14 of each of the other components' activities respecting the main memory 12, such as to maintain memory protection and cache coherency, where implemented, (ii) multi-cast data communication among the system components 14, and (iii) basic arbitration algorithms. More specifically, broadcast command and control communications support use of basic memory protection and cache coherency algorithms, particularly because each system component 14 can monitor the transaction descriptors communicated by the other system components 14. Moreover, broadcast command and control communications make practical the use of basic arbitration algorithms because arbitration need only coordinate accesses to the transaction control bus 16 for defined, relatively short transaction descriptors from a known number of sources.
  • a memory circuit 22, in accordance with the present invention, includes a control interface 24, a write access interface 26, a RAM core 28 and a read access interface 30.
  • the control interface 24 is coupled to the transaction control bus 16, as well as to each of the write access interface 26, the RAM core 28 and the read access interface 30.
  • the write access interface 26 is coupled to the RAM core 28 which, in turn, is coupled to the read access interface 30.
  • the write access interface 26 has a plurality of data input terminals
  • the input and output terminals 32 and 34 have a selected number, the number being designated herein by N.
  • the data input and output terminals 32 and 34 may be grouped to form a selected number of ports 18, the number being designated herein as P
  • the number of ports P is between 1 and N, each port being coupled respectively to one of the system components 14, as shown in Figure 1.
  • the main memory 12 of the system architecture shown in Figure 1 preferably is implemented using one or more memory circuits 22, the circuits 22 being organized to provide a selected word width for each of the ports 18 (word width is designated herein as W).
  • each circuit 22 generally provides a slice of the word width, the slice being N/P bits wide
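The relationship among N, P and W stated above can be worked through with a short Python sketch. The function name `slice_plan` and the requirement that the divisions come out even are assumptions for illustration.

```python
def slice_plan(n_terminals, n_ports, word_width):
    """Return (slice_bits, circuits_needed) for one memory organization.

    n_terminals: N, the number of data input (and output) terminals per circuit
    n_ports:     P, the number of ports the terminals are grouped into
    word_width:  W, the word width required at each port
    """
    if not 1 <= n_ports <= n_terminals:
        raise ValueError("P must lie between 1 and N")
    if n_terminals % n_ports:
        raise ValueError("assumed here: N divides evenly into P ports")
    slice_bits = n_terminals // n_ports          # each circuit's slice is N/P bits
    if word_width % slice_bits:
        raise ValueError("assumed here: W is a whole number of slices")
    return slice_bits, word_width // slice_bits  # circuits needed to reach W


# Example: N = 16 terminals grouped into P = 4 ports, W = 32-bit words.
slice_bits, circuits = slice_plan(16, 4, 32)
```

In this example each circuit contributes a 4-bit slice per port, so eight circuits side by side would provide the 32-bit word.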
  • the control interface 24, in response to signals received over the transaction control bus 16, controls each of the write access interface 26, the RAM core 28 and the read access interface 30. More specifically, the control interface 24 controls the routing of data into and out of the RAM core 28, as well as communication of data at the input and output terminals 32 and 34.
  • the write access interface 26, under control of the control interface 24, provides for buffering, queuing and routing of data for storage in the RAM core 28, the data being communicated to the memory circuit 22 at one or more of the data input terminals 32.
  • the read access interface 30, under control of the control interface 24, provides for routing, queuing and buffering of data stored in the RAM core 28 for communication at one or more of the data output terminals 34.
  • the memory circuit 36 shown in Figure 3 is a specific embodiment of the memory circuit 22 shown generally in Figure 2.
  • the control interface 24 comprises a master control 40, a load control 42, a RAM access control 44, and an unload control 46.
  • the control interface 24, as shown, also comprises a refresh control 48.
  • the refresh control 48 is employed when the RAM core 28 is implemented using dynamic random access memory ("DRAM"). In that case, the refresh control 48 provides, through the RAM access control 44, for refresh of the DRAM cells. Refresh circuits and procedures are known and, accordingly, are not described further herein. It is to be recognized that, if the RAM core is implemented using other than DRAM, the refresh control 48 may be omitted without departing from the principles of the invention.
  • DRAM dynamic random access memory
  • the write access interface 26 comprises a data input interface 50, in queue registers 52 and a write data routing and section write mask circuit 54.
  • the RAM core 28 comprises a RAM array 56, a row access control 58, and sense amplifiers and write back registers 60.
  • the read access interface 30 comprises a read data routing circuit 62, out queue registers 64 and a data output interface 66.
  • the in and out queue registers 52 and 64 preferably are equal in number and in one-to-one relationship with the number of input and output terminals 32 and 34, respectively, such that the registers are N in number.
  • the in and out queue registers 52 and 64 preferably have uniform bit depth, that depth being designated herein by Q.
  • the write data routing and section write mask circuit 54 and the sense amplifiers and write back registers 60 are sometimes referred to herein as the routing/mask circuit 54 and the sense/write circuit 60, respectively.
  • the RAM array 56 preferably is a conventional array, physically organized as R rows and C columns.
  • the RAM array's columns preferably are logically organized into S sections 57.
  • although the sections 57 lie end to end to form each row in the array 56, it is to be recognized that the sections 57 may have other logical organization, including being interleaved bit-by-bit, without departing from the principles of the invention.
  • the number of columns per section 57 in the array 56 preferably is uniform for all sections 57 and equals the bit depth Q of the in and out queue registers 52 and 64
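Given R rows, S sections per row and Q columns per section, each row holds C = S × Q columns. The sketch below decomposes a bit address under two assumptions that are not stated in the source: a linear, row-major bit address, and sections laid end to end. The function name `locate_bit` is hypothetical.

```python
def locate_bit(address, rows, sections, q_depth):
    """Split a linear bit address into (row, section, offset within section)."""
    columns = sections * q_depth            # C = S * Q columns per row
    if not 0 <= address < rows * columns:
        raise ValueError("address outside the array")
    row, column = divmod(address, columns)
    section, offset = divmod(column, q_depth)
    return row, section, offset


# Example: R = 4 rows, S = 4 sections, Q = 8 columns per section (C = 32).
location = locate_bit(77, rows=4, sections=4, q_depth=8)
```

Address 77 falls in row 2 at column 13, which is offset 5 of section 1. A bit-interleaved organization would need a different column-to-section mapping.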
  • the RAM array 56 may be constructed using conventional SRAM or DRAM. Generally, the RAM array 56 may be implemented using any memory technology.
  • control interface's master control 40 preferably has principal functions that include (i) providing configuration information for the memory circuit 36, including the number of ports 18 and associated grouping of elements of the circuit 36, (ii) receiving the external command and control signals carried over the transaction control bus 16 and, in response thereto, generating internal command and control signals, including an internal clock signal, and distributing the signals to the other elements of the circuit 36, and (iii) receiving internal command and control signals from the other elements of the circuit 36 and, in response thereto, generating external command and control signals for transmission over the transaction control bus 16.
  • the control interface's RAM access control 44, in response to internal signals received from the master control 40, generates internal command and control signals and distributes the signals to the appropriate elements of the circuit 36 in accordance with internal timing demands associated with performing each transaction.
  • the RAM access control 44 coordinates the flow of data in and out of the RAM core 28 and controls read and write timing.
  • the RAM access control 44 controls the load and unload controls 42 and 46.
  • the load control 42 operates under the control of the RAM access control 44, together with the master control 40, to control the queuing of data communicated to the circuit 36 at the data input terminals 32, while the unload control 46 operates under the control of the RAM access control 44 to control the unloading of queues of data from the RAM core 28 to the data output terminals 34
  • the control interface 24 provides for communication of external command and control signals carried over the transaction control bus 16, as well as communication of the interface's internally-generated command and control signals.
  • Communication of external command and control signals is provided by coupling the master control 40 with the transaction control bus 16. Communication of the internal signals is provided by coupling the master control 40 directly to the load control 42, and with the RAM access control 44. The internal signals are communicated from the RAM access control 44 by coupling the control 44 both to the load control 42 and to the unload control 46.
  • although the master control 40 is not directly coupled to the unload control 46, it is to be recognized that the master control 40 is indirectly coupled to the unload control 46 through the RAM access control 44. It is also to be recognized that the master control 40 may be coupled directly to the unload control 46 without departing from the principles of the invention, provided the unload control 46 receives command and control signals so as to provide its function.
  • the control interface 24 also provides for distribution of its internally-generated command and control signals to the other elements of the memory circuit 36.
  • the control interface 24 distributes the internal signals to the circuit's write access interface 26 via both the
  • the control interface 24 distributes such internal signals to the circuit's read access interface 30 via both the RAM access control 44 and the unload control 46. Moreover, the control interface 24 distributes the internal signals to the circuit's RAM core 28 via the RAM access control 44. It is to be recognized that the memory circuit's control interface 24 may comprise other or different functional blocks, or other or different interconnections between functional blocks and other elements of the memory circuit 36, or both, without departing from the principles of the invention, the important point being that the control interface 24, in response to signals received over the transaction control bus 16, controls the routing of data into and out of the RAM core 28, as well as communication of data at the input and output terminals 32 and 34.
  • FIGs 5 through 7 show embodiments of the control interface's master control 40, RAM access control 44, load control 42 and unload control 46
  • the master control 40 and the RAM access control 44 are shown in association with the transaction control bus 16
  • the master control 40 and RAM access control 44 preferably comprise respective state machines whose implementation is readily understood to those of ordinary skill in the art, using well known digital design techniques with reference to (i) the functions performed by, and the respective signals into and out of, each such machine, (ii) the structure and function of each functional block of the memory circuit 36 and of the memory circuit 36 overall, and (iii) the timing diagrams shown in Figures 14 through 23, all as described herein
  • the master control 40 and the RAM access control 44 may be implemented as a single state machine, together with one or more other blocks of the circuit 36, without departing from the principles of the invention.
  • the master control 40 and transaction control bus 16 communicate therebetween external command and control signals carried over the bus 16, each signal preferably being buffered in its communication to or from the master control
  • the signals preferably include system clock 68, bank enable 70, byte enable 72, cancel access 74, and tcb 76, each received at the master control 40 from the transaction control bus 16, as well as q_ready 78 and read 80, both received at the transaction control bus 16 from the master control 40.
  • the system clock 68 provides the master clock for the synchronization of data communications at the memory circuit's terminals 32 and 34, as well as for the other command and control signals communicated between the master control 40 and transaction control bus 16. It is to be recognized that the frequency of the system clock 68 may be limited by loading of the transaction control bus and, in that case, data may be communicated at the terminals 32 and 34 on both the rising and falling edges of the system clock 68 so as to maintain data bandwidth, without departing from the principles of the invention.
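The point about transferring on both clock edges is simple arithmetic, sketched below. The function name and the sample figures (a 32-bit port, 50 MHz and 25 MHz clocks) are assumptions chosen only to make the numbers concrete.

```python
def port_bandwidth_bits_per_s(word_width, clock_hz, both_edges=False):
    """Peak data rate at one port: W bits per transfer, 1 or 2 transfers per cycle."""
    transfers_per_cycle = 2 if both_edges else 1
    return word_width * clock_hz * transfers_per_cycle


# A 32-bit port clocked at 50 MHz, one transfer per cycle.
single_edge = port_bandwidth_bits_per_s(32, 50_000_000)
# If bus loading limits the clock to 25 MHz, using both edges restores the rate.
double_edge = port_bandwidth_bits_per_s(32, 25_000_000, both_edges=True)
```

Under these assumed figures both cases give the same peak rate, which is the sense in which using both edges maintains data bandwidth.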
  • the tcb 76 comprises a plurality of signals for communicating transaction descriptors to the memory circuit 36
  • Each transaction descriptor preferably comprises one or more packets of information communicated over the tcb 76, each packet being communicated synchronous with one respective cycle of the system clock 68 and having a preselected size given by the number of signals, the number being designated herein by D.
  • the information associated with each transaction descriptor preferably is predefined.
  • Transaction descriptor information preferably is communicated in predefined fields, which preferably include fields respectively for commands, RAM array addresses, source and destination identifications, and transaction cycle counts.
  • the commands preferably are encoded and correspond to transactions that include load, write and read transactions, while the unload function preferably is included as part of a read transaction and therefore has no separate transaction descriptor.
  • the source and destination identifications preferably are encoded and identify the respective port 18 associated with communicated data. In that regard, if the circuit 36 is employed in a system 10 implemented as shown in Figure 1, the source and destination identifications identify, not only the port 18, but also the respective system component 14 associated with the port.
  • the transaction cycle count preferably describes, for load and read transactions, the number of system clock cycles for communication of data at the transaction's associated port 18 and, for write transactions, the size of the block of data to be written to the RAM array 56.
  • the transaction descriptors may vary in number of packets, while the descriptors' packets may vary in the number and types of fields, in particular depending on the command and, thence, the function of the particular descriptor.
  • the size D of the descriptors' packets preferably is invariate once selected for an application, being selected to optimize packet functionality and system performance while comporting with the design of the digital electronic system employing the transaction control bus 16 and memory circuit 36.
  • the size D in particular, preferably accommodates the addressing requisites of the RAM array 56.
  • RAM array 56 e.g. a write transaction
  • a write transaction preferably comprises four packets, one to communicate the source identification and the command, another to communicate the size of the data block to be written, and the remaining two to communicate the address of the initial bit in the writing of the data to the array 56.
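The four-packet write descriptor can be sketched as follows. The text fixes only what each packet carries; the packet width D = 8, the 3-bit command field, the command encodings, the packet order and the function names are all assumptions made for illustration.

```python
D = 8            # assumed packet width in bits (the text leaves D open)
CMD_BITS = 3     # assumed width of the command field
COMMANDS = {"load": 0, "write": 1, "read": 2}  # assumed encodings


def encode_write(source_port, size, address, d=D):
    """Pack a write transaction into four D-bit packets."""
    packet_mask = (1 << d) - 1
    if not 0 <= source_port < (1 << (d - CMD_BITS)):
        raise ValueError("source port does not fit in packet 0")
    if not 0 <= size <= packet_mask:
        raise ValueError("block size does not fit in one packet")
    if not 0 <= address < (1 << (2 * d)):
        raise ValueError("address does not fit in two packets")
    return [
        (COMMANDS["write"] << (d - CMD_BITS)) | source_port,  # command + source
        size,                                                 # block size
        address >> d,                                         # address, high part
        address & packet_mask,                                # address, low part
    ]


def decode_write(packets, d=D):
    """Recover (source_port, size, address) from four write packets."""
    if len(packets) != 4:
        raise ValueError("a write descriptor is modeled here as four packets")
    if packets[0] >> (d - CMD_BITS) != COMMANDS["write"]:
        raise ValueError("not a write descriptor")
    source_port = packets[0] & ((1 << (d - CMD_BITS)) - 1)
    return source_port, packets[1], (packets[2] << d) | packets[3]


packets = encode_write(source_port=3, size=20, address=0x1234)
```

Splitting the address across two packets is what lets a narrow bus address a larger array, which matches the remark that D should accommodate the addressing requisites of the RAM array 56.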
  • the transaction descriptors' specific definitions are largely a matter of design choice, subject to and informed by, among other things, the transactions to be performed, the applications in which the transactions are performed, and the configuration of both the memory circuit 36 and the system 10, as described above and known in the art. Accordingly, transaction descriptors' definitions are not described further herein.
  • Bank enable 70 enables the circuit's reception of transaction descriptors from the tcb 76.
  • the source of the transaction descriptor asserts the bank enable 70 in conjunction with the source's transmission of the descriptor, preferably in conjunction with the transmission of the descriptor's first word
  • each memory bank has an associated bank enable signal. Accordingly, the bank enable 70 associated with the memory circuit 36 is asserted only if the circuit is in the bank addressed by the transaction descriptor. Byte enable 72, when asserted, enables the circuit's writing of data to the RAM array.
  • Byte enable 72 preferably is used where the memory circuit 36 is one of a plurality of memory circuits 36 organized to provide a selected word width W for one or more ports 18, the word width W being greater than one byte and the circuits 36 providing memory word slices. In such use, each byte of word width W has an associated byte enable signal, so that the particular byte enable signal associated with the memory circuit 36 is asserted only if the circuit 36 provides a slice of the byte addressed by the write transaction descriptor.
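One way to picture the byte enable wiring is to map each slice-providing circuit to the byte of the word it belongs to. The sketch below assumes that slices are assigned to circuits in ascending bit order and that a slice never straddles a byte boundary; neither assumption comes from the source, and the function names are hypothetical.

```python
def byte_lane(circuit_index, slice_bits):
    """Return the index of the byte that this circuit's slice belongs to."""
    if slice_bits <= 0 or 8 % slice_bits:
        raise ValueError("assumed here: the slice width divides one byte evenly")
    return (circuit_index * slice_bits) // 8


def circuits_enabled(byte_index, slice_bits, word_width):
    """List the circuits whose byte enable is asserted for a write to `byte_index`."""
    circuit_count = word_width // slice_bits
    return [i for i in range(circuit_count) if byte_lane(i, slice_bits) == byte_index]


# Eight circuits with 4-bit slices form a 32-bit word; write only byte 2.
enabled = circuits_enabled(byte_index=2, slice_bits=4, word_width=32)
```

Here circuits 4 and 5 together hold byte 2, so only their byte enable would be asserted for that write.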
  • Cancel access 74 provides for the cancellation of read and write transaction descriptors before execution. In a system 10, cancel access preferably is monitored not only by the main memory 12, but also by the system components 14 so as to accurately track memory accesses. Cancel access 74 preferably is generated by an external algorithm monitoring memory transactions for, among other things, invalid accesses.
  • Q_ready 78 is a handshake signal asserted by the memory circuit 36 to indicate readiness to receive another read or write transaction descriptor, and deasserted to indicate receipt of such descriptors.
  • Q_ready 78 preferably is asserted a predetermined number of system clock cycles in advance of when it is able to accept the next read or write transaction descriptor. Advance assertion has particular application when the memory circuit
  • the memory circuit 36 is employed in systems 10 having arbitration algorithms to coordinate time-multiplexing of transaction descriptors over the transaction control bus 16.
  • the memory circuit 36 may be one of a plurality of such circuits forming a memory bank of the main memory 12
  • the q_ready signal of only one circuit 36 per bank
  • the system components 14 preferably monitor the q_ready signals so as to determine whether to transmit data to their associated ports 18 of the main memory 12.
  • Read 80 is another handshake signal asserted upon execution of each read transaction descriptor and deasserted prior to the circuit's communication of read data at one or more of the data output terminals 34.
  • Read 80 preferably is deasserted a predetermined number of system clock cycles in advance of that communication
  • advance deassertion allows the system component 14 that sent a read transaction descriptor to monitor the circuit's read 80 so as to determine when to receive data from the circuit 36.
  • the master control 40 and the RAM access control 44 generate internal command and control signals and communicate some of these signals therebetween.
  • the communicated signals preferably include load controls 82, write enable 84, cancel 86, tcb in 88 and internal clocks 94, each received at the RAM access control 44 from the master control 40, as well as reading 90 and start read 92, both received at the master control 40 from the RAM access control 44.
  • the master control 40, in response to receipt of system clock 68, generates internal clocks 94 which are distributed not only to the RAM access control 44, but also to the elements of the memory circuit 36 generally, so as to synchronize the memory circuit's internal operations.
  • the internal clocks 94, though derived from and preferably synchronized with the system clock 68, need not have the same frequency as the system clock 68.
  • the internal clocks 94 may be obtained by multiplying or dividing the frequency of the system clock 68.
  • Load controls 82 enable loading of each word of the transaction descriptor received by the master control 40 into the RAM access control 44.
  • Write enable 84, cancel 86 and tcb in 88 comprise synchronized versions, respectively, of byte enable 72, cancel access 74 and tcb 76 received over the transaction control bus 16.
  • Write enable 84 preferably determines whether data is replaced in the write back registers of the sense/write circuit 60 during a write transaction.
  • Reading 90 is an internal version of read 80 transmitted from the master control 40 over the transaction control bus 16.
  • Start read 92 enables the start of the read phase of a row access in the RAM array 56.
  • Start read 92 is generated by the RAM access control 44 and communicated both to the master control 40 and to the row access control 58 of the RAM core 28.
  • the master control 40 also generates a load count 96 that is directed to and controls operation of the load control 42. Load count 96 is described hereinafter in the description of the load control 42.
  • the RAM access control 44 generates and communicates internal command and control signals in addition to those directed to the master control 40. These signals preferably include start write 100, base mask enables 102, next mask enables 104, queue select 106, load enable 108, load rcount 110, row address 112, section select 114, base column 116, block size 118, and input block size 120.
  • Start write 100 is directed to the row access control 58 of the RAM core 28 to start the write phase of a RAM array access.
  • Base mask enables 102 are directed to the RAM core 28. Each signal of base mask enables 102 enables bit replacement in the RAM array's addressed row, in particular in the signal's associated section 57. The bits preferably are replaced when the respective signal of the base mask enables 102 is asserted. Because each row in the RAM array 56 preferably is divided into S sections 57, base mask enables 102 preferably comprises S signals.
  • Next mask enables 104 are directed to the RAM core 28. Each signal of next mask enables 104 enables bit replacement in the next-consecutive section 57 of the
  • the bits preferably are replaced when the respective signal of next mask enables 104 is asserted.
  • the next mask enables preferably also comprise S signals, one corresponding to each section 57 in a row of the RAM array 56.
  • Queue select 106 selects one of the in queue registers 52 of the write access interface 26.
  • queue select 106 triggers routing of the selected register's enqueued data to the RAM array 56 during the execution of a write transaction descriptor.
  • queue select 106 preferably comprises log2(N) signals, where N represents the number of in queue registers 52.
  • Load enable 108 controls the loading of data read from the RAM array 56 into a corresponding out queue register 64.
  • the number of signals of the load enable 108 preferably is in one-to-one relationship with the number of out queue registers 64. Accordingly, where the number of out queue registers 64 is N, the number of signals of load enable 108 preferably is N.
  • Load rcount 110 is directed to the unload control 46 in controlling the operation thereof. Load rcount 110 is described hereinafter in the description of the unload control 46.
  • Row address 112, section select 114 and base column 116 comprise the address signals for accessing the RAM array 56 and reading selected data therefrom.
  • Row address 112 is directed to the row access control 58 of the RAM core 28 to control row accesses of the RAM array 56.
  • row address 112 preferably comprises log2(R) signals, where R represents the number of rows in the RAM array 56.
  • Section select 114 signals are directed to the read data routing circuit 62 of the read access interface 30 to identify sections 57 associated with an addressed row of the RAM array 56 from which data is routed to the out queue registers 64.
  • Section select 114 preferably comprises log2(S) signals, where S represents the number of sections 57 per row.
  • Base column 116 is directed to the read data routing circuit 62.
  • Base column 116 selects, within the selected section 57 of the addressed row of the RAM array 56, the particular column where the addressed data begins.
  • Base column 116 is also directed to the routing/mask circuit 54 of the write access interface 26 for generating control signals that provide for writing of data from a particular addressed column in a section 57.
  • Base column 116 preferably comprises log2(Q) signals, where Q represents the number of columns per section 57.
  • In generating row address 112, as well as start read 92 and start write 100, the RAM access control 44 is responsive not only to the signals received from the master control 40, but also to two sections 122.
  • Two sections 122 is generated by the routing/mask circuit 54 of the write access interface 26 and indicates to the RAM access control 44 when a RAM access engenders the crossing of the boundary between two sections of the RAM array 56.
  • when the section select 114 identifies the last section of a row, two sections 122 indicates the crossing of a row boundary.
  • in that case, the RAM access control 44 preferably generates two successive sequences of the access signals row address 112, start read 92 and start write 100.
  • the first sequence of row address 112, start read 92 and start_write 100 provides for access to a row for the first section of data to be written to or read from the RAM array 56.
  • the second sequence of such access signals provides for access to the row having the next section, which preferably is the next consecutive row in the RAM array 56.
  • Two sections 122 is described further herein with respect to the write access interface 26.
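The relationship between two sections 122 and the number of access sequences can be illustrated with a minimal Python sketch. The function name, the integer row and section indices, and the tuple layout are assumptions made for illustration only; the sketch also does not model what happens when the base section lies in the last row of the array.

```python
def plan_row_accesses(row, section, two_sections, sections_per_row):
    """Return a list of (row, sections) accesses for one transaction.

    Hypothetical model: a crossing inside a row needs one row access that
    touches two sections; a crossing at the last section of a row needs a
    second access sequence directed at the next consecutive row.
    """
    if not two_sections:
        return [(row, [section])]
    if section < sections_per_row - 1:
        # Section boundary crossed, but both sections share the same row.
        return [(row, [section, section + 1])]
    # Base section is the last one in its row: the crossing is a row crossing.
    return [(row, [section]), (row + 1, [0])]


print(plan_row_accesses(5, 3, True, 4))
```

In this model only the last case produces two entries, corresponding to the two successive sequences of row address 112, start read 92 and start write 100.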
  • Block size 118 is directed to the routing/mask circuit 54 of the write access interface 26 and describes the size of the block of data associated with a read or write transaction descriptor. That is, block size 118 determines the number of bits to be replaced in or read from each section 57 of a row of the RAM array 56 in, respectively, write and read transactions.
  • Block size 118 preferably comprises log2(Q) signals, where Q represents the number of columns in each section 57 of the RAM array 56.
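As a rough illustration of the signal counts stated above, the following Python sketch computes the number of binary signals needed for each address-type signal. The array dimensions (R, S, Q, N) are hypothetical example values, not figures from the description, and the helper name is an assumption.

```python
def signal_count(n_values):
    """Number of binary signals needed to encode n_values distinct values,
    i.e. ceil(log2(n_values))."""
    if n_values < 1:
        raise ValueError("n_values must be at least 1")
    return (n_values - 1).bit_length()


# Hypothetical dimensions chosen only for illustration.
dims = {"R": 1024, "S": 8, "Q": 32, "N": 16}

widths = {
    "row address 112": signal_count(dims["R"]),     # log2(R)
    "section select 114": signal_count(dims["S"]),  # log2(S)
    "base column 116": signal_count(dims["Q"]),     # log2(Q)
    "block size 118": signal_count(dims["Q"]),      # log2(Q)
    "queue select 106": signal_count(dims["N"]),    # log2(N)
}

for name, width in widths.items():
    print(f"{name}: {width} signals")
```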
  • Input block size 120 is directed to the load control 42 in controlling the operation thereof.
  • Input block size 120 describes the size of the block of data associated with an associated transaction descriptor. Input block size 120 is described further in the following description of the load control 42.
  • the load control 42 operates under the control of the RAM access control 44, together with the master control 40, to control the in queue registers' queuing of data communicated to the circuit 36 at the data input terminals 32.
  • the load control 42 preferably comprises a plurality of element counters 130, each having input block size 120 and load count 96 as inputs thereto, and a shift enable signal 132 as an output therefrom for communication to a respective in queue register 52.
  • the number of element counters 130 preferably is in one-to-one correspondence with the number of in queue registers 52 so that each counter 130 individually controls the operation of a respective register 52 through the generation of the respective shift enable signal 132.
  • the number of element counters 130 preferably is N, where N designates the number of input terminals 32 as previously described.
  • the element counters 130 preferably comprise down counters and each element counter 130 preferably operates independently of the others. Upon execution of a transaction descriptor implicating one or more of the element counters 130, such counters 130 are individually loaded with input block size 120, describing the size of the data block associated with the transaction descriptor.
  • the other element counters 130 may be loaded with a value of input block size 120 corresponding to a previous or succeeding transaction descriptor.
  • the value of the input block size 120, accordingly, may vary from transaction descriptor to transaction descriptor and, thence, from counter 130 to counter 130.
  • the input block size 120 preferably has values ranging from one bit to the full bit depth Q of the in queue registers 52.
  • the input block size 120 preferably comprises log2(Q) signals. It is to be recognized that input block size 120 may be received at the load control 42 from the transaction control bus 16 directly or otherwise, rather than from the RAM access control 44, without departing from the principles of the invention.
  • load count 96 preferably comprises a plurality of signals, one for each element counter 130.
  • load count 96 preferably numbers N signals. It is to be recognized that, when the circuit 36 has a plurality of ports 18 and provides a word slice N/P bits wide, each transaction descriptor associated with receiving data at a particular port preferably engenders generation of N/P signals of load count 96, each of these signals being directed to a respective element counter 130 associated with that receiving port so as to initially load therein the input block size 120.
  • each of the element counters 130 associated with that receiving port will be initially loaded with a common-valued input block size 120 and, while data is to be enqueued, each will generate a shift enable signal 132 to enable the respective register.
  • each element counter 130, while holding a non-zero value, enables the enqueuing of data into the respective in queue register 52 by asserting the shift enable 132 associated therewith. For each bit of data so enqueued in a register 52, the respective element counter 130 decrements once. When the counter 130 decrements to zero, the counter 130 disables enqueuing of data and ceases to decrement.
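The down-counter behavior described above can be sketched in Python as follows. The class and function names are hypothetical, and the in queue register is modeled as a plain list rather than a clocked shift register; this is a behavioral illustration, not the circuit itself.

```python
class ElementCounter:
    """Behavioral model of one element counter 130 (names are illustrative)."""

    def __init__(self):
        self.count = 0

    def load(self, input_block_size):
        # Loaded with input block size 120 when the load count signal fires.
        self.count = input_block_size

    @property
    def shift_enable(self):
        # Shift enable 132 is asserted while the counter holds a non-zero value.
        return self.count > 0

    def tick(self):
        # Decrement once per enqueued bit; stop decrementing at zero.
        if self.count > 0:
            self.count -= 1


def enqueue(counter, register, incoming_bits):
    """Enqueue bits into the register only while shift enable is asserted."""
    for bit in incoming_bits:
        if counter.shift_enable:
            register.append(bit)
            counter.tick()
    return register


counter = ElementCounter()
counter.load(3)
queued = enqueue(counter, [], [1, 0, 1, 1, 0])
print(queued, counter.shift_enable)
```

With a block size of 3, only the first three of the five presented bits are enqueued, after which the counter sits at zero and the register ignores further input.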
  • the unload control 46 operates under the control of the RAM access control 44 to control the queuing of data from the RAM core 28 for communication at the data output terminals 34.
  • the unload control 46, as shown in Figure 7, preferably has substantially the same structure as the load control 42 and operates in a substantially similar manner, except that its operations are directed at controlling the out queue registers 64 in the communication of data from the circuit 36. That is, the unload control 46 comprises a plurality of element counters 134, each preferably being substantially similar to the element counters 130 of the load control 42. These counters 134 have as inputs thereto block size 118 and load rcount 110, both of which have substantially similar functions and parameters as the corresponding input signals of the load control's element counters 130.
  • each element counter 130 and 134 is operable independently of each of the other such counters 130 and 134.
  • the write access interface 26 preferably comprises N input terminals 32, each coupled respectively by a buffer 150 to one of N in queue registers 52.
  • the buffers 150 implement the data input interface 50 of Figure 3.
  • the in queue registers 52 preferably comprise queues, each controlled independently by a respective shift enable signal 132 received from the control interface 24.
  • Each in queue register 52 has a depth Q and receives data serially while enabled by the respective shift enable signal 132, the shift enable signal 132 preferably enabling data reception only while valid data is to be enqueued for the respective transaction.
  • each in queue register 52 receives data synchronously with the system clock 68, either at the clock's frequency or at double that frequency, e.g. at both edges.
  • the in queue registers 52 preferably are grouped in N/P registers 52 per port. In that case, when executing a transaction descriptor identifying a particular port, the associated N/P in queue registers 52 are each enabled and disabled by the descriptor.
  • Each in queue register's enqueued data is received, in parallel, by the routing/mask circuit 54. This reception includes up to Q bits and is controlled by the queue select 106, which the routing/mask circuit 54 receives from the control interface 24. As described above, queue select 106 selects one of the in queue registers 52 for transfer of the data enqueued therein to the routing/mask circuit 54.
  • the routing/mask circuit 54 preferably provides for routing of data from the in queue registers 52 to the addressed locations in the RAM array 56 and, to do so, generates masking control signals that enable only the valid data to be replaced in the write back registers of the sense/write circuit 60.
  • the routing/mask circuit 54 preferably comprises a multiplexer 152, a position shifter 154, and a shift count and write mask generator 156.
  • the multiplexer 152 selectably receives the data enqueued by the particular register 52 identified by the queue select 106 and routes it to the position shifter 154. It is to be recognized that, when the in queue registers 52 are grouped as N/P registers 52 per port, the execution of a write transaction descriptor engenders consecutive retrievals of data from the implicated registers 52.
  • the position shifter 154 preferably comprises a barrel shifter for rotating the data received from the multiplexer 152 and for transferring the rotated data to the RAM core 28.
  • the position shifter 154 is responsive to the shift count signal 158 provided by the routing/mask circuit's shift count and write mask generator 156.
  • the position shifter 154 rotates the data to adjust for the extent the data was pushed into the respective in queue register 52 and to provide for the data's relative position in a section 57 as addressed in the associated transaction descriptor.
  • the position shifter 154 preferably transfers the data to the RAM core 28 in Q parallel bits over write data signals 160.
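A minimal Python sketch of the rotation follows. It assumes, purely for illustration, that each in queue register is a Q-deep shift register in which new bits enter at the highest index, and that the rotation amount is (base column + block size) mod Q; the description does not fix the rotation direction or bit ordering, so both are assumptions, as are the function names.

```python
def shift_in(register, bits):
    """Serially push bits into a fixed-depth register; new bits enter at the end."""
    for bit in bits:
        register = register[1:] + [bit]
    return register


def rotate_right(bits, shift_count):
    """Barrel-rotate a list to the right by shift_count positions."""
    k = shift_count % len(bits)
    return bits[-k:] + bits[:-k] if k else list(bits)


Q = 8                      # hypothetical register depth / columns per section
base_column, block_size = 6, 4
register = shift_in([0] * Q, [1, 0, 1, 1])
write_data = rotate_right(register, (base_column + block_size) % Q)
print(write_data)
```

Under these assumptions the first valid bit lands in column 6, and the last two bits wrap around to columns 0 and 1, where they line up with the start of the next consecutive section.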
  • the routing/mask circuit 54 can be implemented without using the position shifter 154, without departing from the principles of the invention.
  • the data is enqueued into the in queue registers 52 by sequentially loading starting at any appropriate position in such registers 52 during the respective load operation.
  • This alternative relies on implementing a shift function in each in queue register 52. Accordingly, this alternative implicates having additional circuitry in such registers 52 while not having the position shifter 154 in the routing/mask circuit 54.
  • the routing/mask circuit's shift count and write mask generator 156 preferably comprises an adder 170, an end range disables circuit 172, a base range enables circuit 174 and a base section write mask generation circuit 176.
  • the circuits 172, 174 and 176 preferably comprise decoding logic.
  • the generator 156 has base column 116 and block size 118 as input signals, which are received from the control interface 24. Responsive to such signals, the generator 156 generates (i) shift count 158 for routing to the position shifter 154, (ii) two sections 122 for routing to the control interface 24, and (iii) base section mask 178 and next section mask 180.
  • Base section mask 178 and next section mask 180 comprise the masking control signals that enable only the valid data to be replaced in the write back registers of the sense/write circuit 60. More specifically, base section mask 178 selects the bits to be replaced within each selected section 57 associated with the transaction descriptor being executed. To do so, base section mask 178 preferably comprises a map of Q mask bits: each bit corresponds to a respective signal in write data 160 such that, when a mask bit is asserted, the section bit is replaced with the respective bit carried over that write_data signal 160.
  • Next section mask 180 performs a function substantially similar to that of base section mask 178, except it provides for bit replacement in the consecutive section 57 next-following the selected section 57, so as to accommodate a RAM access that crosses the boundary between two sections.
  • the generator's adder 170 adds the base column 116 and the block size 118.
  • the adder's resulting value comprises shift count 158, while the adder's carry out comprises two sections 122.
  • the base range enables circuit 174 decodes base column 116 to generate enables from the addressed base column (i.e., the relative position in a section 57 where valid data begins) to the end of the section 57 associated with the base column 116.
  • the end range disables circuit 172 decodes shift count
  • the end of valid data may fall either in the base column's section 57 or in the next consecutive section.
  • the disables falling in the next consecutive section comprise the next section mask 180.
  • the disables falling in the base column's section 57 are routed to the base section write mask generation circuit 176, together with the enables generated by the base range enables circuit 174.
  • the generation circuit 176, which preferably comprises a set of AND gates, combines the corresponding bits received from the circuits 174 and 172 to generate base section mask 178.
  • the RAM access control 44 preferably generates a second sequence of the access signals row address 112, start read 92 and start_write 100, responsive to two_sections 122 as previously described. However, additional masking control signals preferably are not generated. That is, the RAM access control 44 generates the second sequence of access signals so that the original next section mask 180 can be used to identify the valid data of the next section even though the next section is in a row separate from the base section.
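The mask generation described above can be approximated by the following Python sketch. It works on integers rather than log2(Q)-bit signals, takes block size as a bit count from 1 to Q, and treats two sections as "the block extends past the end of the base section"; the exact encoding of block size 118 and the precise carry-out condition of the adder 170 are not modeled and are assumptions here. All function and key names are hypothetical.

```python
def write_masks(base_column, block_size, q):
    """Behavioral sketch of the shift count and write mask generator 156."""
    if not (0 <= base_column < q and 1 <= block_size <= q):
        raise ValueError("base_column or block_size out of range")

    end = base_column + block_size      # one past the last valid column
    shift_count = end % q               # low-order bits of the adder result
    two_sections = end > q              # block spills into the next section

    # Base range enables: from the base column to the end of the section.
    base_range_enables = [col >= base_column for col in range(q)]
    # End range, base section part: columns before the end of valid data.
    end_range_base = [col < end for col in range(q)]
    # AND the two to obtain the base section mask.
    base_section_mask = [a and b for a, b in zip(base_range_enables, end_range_base)]
    # End range, next section part: columns the block reaches after wrapping.
    next_section_mask = [col < end - q for col in range(q)]

    return {
        "shift_count": shift_count,
        "two_sections": two_sections,
        "base_section_mask": base_section_mask,
        "next_section_mask": next_section_mask,
    }


masks = write_masks(base_column=6, block_size=4, q=8)
print(masks["shift_count"], masks["two_sections"])
```

For a base column of 6 and a 4-bit block in an 8-column section, the base section mask covers columns 6 and 7 and the next section mask covers columns 0 and 1, so the two masks together select exactly block size bits.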
  • the RAM core 28 preferably comprises the RAM array 56 for storing data; the row access control 58 for enabling and controlling accesses of the RAM array 56; and the sense amplifiers and write back registers 60 for both buffering data to and from the RAM array 56 and temporarily storing a row of accessed data.
  • the RAM array 56 as previously described, preferably comprises a conventional memory array, and has R rows and C columns.
  • the row access control 58 preferably comprises decoding logic. The control 58 receives row address 112, start read 92 and start write 100 from the control interface 24, generates row enables 190 and ram write 192 for routing to the RAM array 56, and generates ram read 194 for routing to the sense amplifiers and write back registers 60.
  • Row enables 190, generated from the decode of the row address 112, enable access to the rows of the RAM array 56.
  • Row enables 190 preferably comprises R signals, each signal corresponding to a respective row of the RAM array 56. In operation, preferably only one signal of row enables 190 is asserted at a time so as to limit access of the RAM array 56 to only one row at a time.
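The one-signal-per-row decode can be illustrated with a short Python sketch; the function name and the use of a list of booleans for the R signals are assumptions made for illustration.

```python
def decode_row_enables(row_address, num_rows):
    """Decode a row address into R one-hot row enable signals."""
    if not 0 <= row_address < num_rows:
        raise ValueError("row_address out of range")
    return [r == row_address for r in range(num_rows)]


enables = decode_row_enables(3, 8)
print(enables)
```

Exactly one entry is asserted for any valid address, which matches the stated preference that only one row be accessed at a time.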
  • Ram write 192 and ram read 194 comprise timing signals that control the RAM array 56 and the sense amplifiers and write back registers 60, respectively, in buffering data therebetween.
  • Ram write 192 and ram read 194 each preferably comprise one signal.
  • the row access control 58 is responsive to start write 100, start read 92 and row address 112 in the execution of write and read transaction descriptors. Accordingly, when a RAM access crosses a row boundary, the second sequence of access signals generated by the RAM access control 44 preferably triggers the generation of a corresponding second sequence of ram write and ram read signals 192 and 194.
  • the sense amplifiers and write back registers 60 comprise sense amplifiers 200 and a write back register 202. As shown in Figure 11, both the sense amplifiers 200 and the write back register 202 are logically organized in S sections, each corresponding to a respective section 57 of a RAM array row. Accordingly, each section of the amplifiers 200 and write back register 202 buffers data for Q columns of the RAM array 56, Q being the depth of each section 57. It is to be recognized, however, that the sense amplifiers 200 and write back register 202 preferably have one sense amplifier and one register element respectively for each column of the RAM array 56.
  • the sense amplifiers 200 buffer data to and from the RAM array 56 over ram data 196. If the RAM array 56 is DRAM, a complete row, comprising C bits of data, is read into the sense amplifiers 200 from the array and written back to the array on every access. Accordingly, ram data 196 preferably comprises C signals. Because the sense amplifiers 200 are organized in sections, the signals of ram data 196 preferably are organized in S groups, each group having Q signals.
  • read data 198 preferably comprises C signals that are organized in S groups, each group having Q signals. Each group of Q signals of read data 198 is associated with a respective logical section of the write back register 202.
  • Ram read 194 causes the data sensed by the sense amplifiers 200 to be latched in the write back register 202 for temporary storage, the row being enabled by one signal of row enables 190. If the access corresponds to execution of a read transaction, the one or more sections 57 of data corresponding to the transaction are routed over read data 198 to the read access interface 30 before the read data is written back to the RAM array 56.
  • write back register 202 receives new data from the write access interface 26 over write data 160.
  • write data 160 preferably comprises Q parallel signals, where Q is the depth of each in queue register 52. Accordingly, Q bits of new data, so received, replace the appropriate data in the write back register 202 in each clock cycle preceding writing of the data back to the enabled row of the RAM array 56.
  • Ram write 192 writes all of the data from the write back register 202 to the enabled row of the RAM array 56, whether or not data has been replaced in every section of the register 202.
  • Each read and write transaction preferably is associated with one or two RAM accesses so as to comprise transfer of up to C bits of data, C being the number of columns in a full row of the RAM array 56.
  • the write back register 202 preferably comprises flip flops that select between the output of the sense amplifiers 200 and the bits received from the write access interface 26. As shown in Figure 11, each section of the write back register 202 receives in parallel the bits from the write access interface 26. Each section also receives a respective signal of base mask enables 102, next mask enables 104, base section mask 178 and next section mask 180. If the signal of base mask enables 102 associated with a particular section of the write back register 202 is asserted, bit replacement is enabled for that section.
  • the base section mask 178 determines which bits are replaced in the enabled section. Where the replacing data crosses a section boundary, the signal of next mask enables 104 associated with the next-consecutive section of the write back register 202 is asserted, enabling bit replacement in that section.
  • the next section mask 180 determines which bits are replaced in that enabled next section.
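The interaction of the four mask signals with the write back register can be sketched in Python as follows. The register is modeled as a list of S sections of Q bits each, and the function name and argument layout are illustrative assumptions rather than part of the described circuit.

```python
def replace_bits(register_sections, write_data,
                 base_mask_enables, next_mask_enables,
                 base_section_mask, next_section_mask):
    """Return a copy of the write back register with masked bits replaced.

    A bit in section s, column col is replaced by write_data[col] when either
    the section's base enable and the base section mask bit are both asserted,
    or the section's next enable and the next section mask bit are both asserted.
    """
    updated = []
    for s, section in enumerate(register_sections):
        new_section = []
        for col, old_bit in enumerate(section):
            replace = ((base_mask_enables[s] and base_section_mask[col]) or
                       (next_mask_enables[s] and next_section_mask[col]))
            new_section.append(write_data[col] if replace else old_bit)
        updated.append(new_section)
    return updated


# Hypothetical 2-section, 4-column register; a 3-bit block starts at column 2.
result = replace_bits(
    register_sections=[[0, 0, 0, 0], [0, 0, 0, 0]],
    write_data=[1, 1, 1, 1],
    base_mask_enables=[True, False],
    next_mask_enables=[False, True],
    base_section_mask=[False, False, True, True],
    next_section_mask=[True, False, False, False],
)
print(result)
```

Because the same Q write data bits are presented to every section, it is the per-section enables combined with the per-column masks that confine replacement to the addressed block.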
  • the read access interface 30 preferably comprises N output terminals 34, each coupled respectively by a buffer 210 to one of N out queue registers 64.
  • the buffers 210 implement the data output interface 66 of Figure 3.
  • the out queue registers 64 preferably comprise queues, each controlled independently by a respective out enable signal 136 received from the control interface 24.
  • the out enable signals 136 enable and disable routing of data from the out queue registers 64 to the buffers 210, and control the buffering of that data through the buffers 210 to the data output terminals 34.
  • the out enable signals 136 preferably enable routing only while valid data is enqueued for the respective transaction.
  • Each out queue register 64 has a depth Q and, while enabled by the respective out enable signal 136, serially routes data to the respective buffer 210.
  • each out queue register 64 routes data synchronously with the system clock 68, either at the clock's frequency or at double that frequency, e.g. at both edges.
  • the out queue registers 64 preferably are grouped in N/P registers 64 per port. In that case, execution of a read transaction descriptor identifying a particular port entails enabling and disabling each of the associated N/P out queue registers 64.
  • Each out queue register 64 receives data, in parallel, from the read data routing circuit 62.
  • the read data routing circuit 62 provides for routing of data from the RAM core 28 to the respective out queue register 64 associated with the data's corresponding read transaction.
  • the read data routing circuit 62 receives section select 114 and base column 116 from the control interface 24 and receives data in sections from the RAM core 28 over read data 198.
  • the read data routing circuit 62 comprises a multiplexer 212 and a justify shifter 214.
  • the multiplexer 212 selects the section 57 of RAM array data identified by section select 114, as well as the next consecutive section 57 in order to accommodate crossing of section boundaries by the valid data.
  • the justify shifter 214 receives the two sections of data selected by the multiplexer 212 and, responsive to base column 116, justifies the data so that the initial bit of the valid data is loaded into the first location in the respective out queue register 64. To route the justified data to the appropriate register 64, the justify shifter 214 is coupled in parallel to each out queue register 64. It is to be recognized that, when the out queue registers 64 are grouped as N/P registers 64 per port, the execution of a read transaction descriptor engenders consecutive routings of data from the justify shifter 214 to the implicated registers 64.
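A behavioral Python sketch of the multiplexer 212 and justify shifter 214 is given below. The row is modeled as a list of S sections of Q bits; the function name is hypothetical, and padding with zeros when the selected section is the last in the row is a simplification, since in the described circuit the following data would come from a second row access.

```python
def route_read_data(row_sections, section_select, base_column):
    """Select two consecutive sections and justify valid data to position 0."""
    q = len(row_sections[0])
    base = row_sections[section_select]
    if section_select + 1 < len(row_sections):
        nxt = row_sections[section_select + 1]
    else:
        nxt = [0] * q                   # simplification: next row not modeled
    window = base + nxt                 # multiplexer output: 2 * Q bits
    # Justify: the bit at base_column becomes the first bit of the Q-bit result.
    return window[base_column:base_column + q]


row = [[0, 0, 0, 0], [0, 0, 1, 0], [1, 1, 0, 0]]
justified = route_read_data(row, section_select=1, base_column=2)
print(justified)
```

Here the valid data begins at column 2 of section 1 and continues into section 2; after justification its initial bit sits in the first location of the out queue register.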
  • read data routing circuit 62 can be implemented without using the justify shifter 214, without departing from the principles of the invention.
  • read data can be loaded from the RAM array 56 directly into the out queue registers 64 provided the loaded data may be output from such registers 64 starting at any randomly selectable position therein, such selected position corresponding to the beginning of the valid data.
  • This alternative relies on implementing a random access function in each of the out queue registers 64. Accordingly, this alternative implicates having additional circuitry in such registers 64 while not having the justify shifter 214 in the read data routing circuit 62.
  • the operation of the memory circuit 36 is depicted in Figure 13 with reference to the in queue registers 52, the multiplexer 152 and position shifter 154 of the write data routing and section write mask circuit 54, the write back register 202, the RAM array 56, the multiplexer 212 and the justify shifter 214 of the read data routing circuit 62, and the out queue registers 64.
  • the memory circuit 36 is configured to have N ports 18 and is depicted receiving, at the in queue registers 52, data block 250 from port 251, data block 252 from port 253 and data block 254 from port 255. Data block 252 is received first; data block 250 is received second; and data block 254 is received third.
  • Each of the data blocks 250, 252 and 254 is depicted being routed from the out queue registers 64 at ports 251, 253 and 255, respectively.
  • Data block 250 is routed first; data block 254 is routed second, and data block 252 is routed third.
  • load descriptors 260 One descriptor of each type is contemplated to trigger memory circuit operations respecting each data block 250, 252 and 254.
  • each descriptor type has associated therewith a series of predefined steps.
  • Load descriptors 260 preferably have as a principal step the loading of data into the in queue registers 52.
  • Each load descriptor 260 controls the loading of data into the one or more registers 52 corresponding to the port 18 associated with the descriptor 260.
  • Load descriptors 260 preferably are accepted at any time.
  • each port 18 can load data in response to a load descriptor 260 associated with that port while any or all other ports 18 are loading data in response to load descriptors associated therewith. Moreover, execution of load descriptors 260 is independent of execution of both write descriptors 262 and read descriptors 264.
  • the circuit preferably executes the load operations to completion independent of all other memory circuit activity. If the memory circuit 36 is configured for multiple ports 18, as shown in Figure 13, multiple load descriptors 260 can be in various stages of execution at any given time.
  • the load descriptors 260 can accommodate data blocks ranging from one bit up to Q bits, where Q preferably is equal to the depth of the in queue registers 52.
  • Write descriptors 262 preferably have as principal steps a funnel operation 266, a position operation 268, a replace operation 270 and a store operation 272. Through these steps, each write descriptor 262 provides for transferring data from the in queue registers 52 associated with the descriptor's port to the RAM array 56 for storage at an address specified in the descriptor. Although as shown each port has associated therewith a single in queue register 52, it is to be recognized that each port may have a plurality of associated registers 52, without departing from the principles of the invention.
  • the funnel operation 266 selects the in queue registers 52 associated with the descriptor's port for transfer of the data enqueued, at one register per clock cycle, to the position shifter 154.
  • the funnel operation 266 employs the multiplexer 152 of the routing/mask circuit 54.
  • the position operation 268 shifts the valid data received from each in queue register 52 to provide for positioning the data in a section 57 in accordance with the addressing of the descriptor, or in two consecutive sections 57 when the positioning causes the data to cross a section boundary.
  • the replace operation 270 employs the write back register 202 to replace data read from the RAM array 56 into the write back register 202 with the valid data from the in queue registers 52.
  • the replace operation 270 replaces bits starting with the section 57 in which the descriptor's base address resides and moves through sequential sections, one for each in queue register 52 associated with the write descriptor 262.
  • the mask signals 102, 104, 178 and 180 are employed in this operation to determine which bits get replaced, including when data blocks cross section boundaries in the replace operation 270.
  • the store operation 272 transfers the entire contents of the write back register 202 into the enabled row of the RAM array 56 responsive to the write descriptor 262. As previously described, a single write descriptor may engender two accesses to the RAM array.
  • Figure 13 depicts execution of a sequence of write descriptors 274, 276 and 278 associated with data blocks 252, 254 and 250, respectively.
  • the write descriptor 274 has progressed to the replace operation 270, while the write descriptor 276 is ready to begin the position operation 268 and the write descriptor 278 is completing the funnel operation 266.
  • the progress in execution of the write descriptors 274, 276 and 278 preferably reflects the order of the descriptors' receipt by the memory circuit 36.
  • Read descriptors 264 preferably have as principal steps a fetch operation 280, a funnel operation 282, a justify operation 284 and an unload operation 286.
  • the fetch operation 280 comprises reading a complete row of data from the RAM array 56, as addressed by the read descriptor 264.
  • the funnel operation 282 comprises transferring, to the justify operation's justify shifter 214, two sections of fetched data for each out queue register 64 corresponding to the port 18 of the descriptor 264, each register's two sections being transferred in a single clock cycle.
  • the unload operation 286 comprises routing the justified data from the memory circuit 36 through the out queue registers 64 corresponding to the port 18 associated with the read descriptor 264. Once initiated by the read descriptor 264, the unload operation 286 preferably executes to completion independent of any other memory circuit activity. If the memory circuit 36 is configured for multiple ports 18, as shown in Figure 13, multiple unload operations 286 can be in various stages at any given time.
  • Figure 13 depicts execution of a sequence of read descriptors 288, 290 and 292 associated with data blocks 250, 254 and 252, respectively.
  • the read descriptor 288 has progressed to the unload operation 286.
  • the read descriptor 290 having completed the funnel operation 282, is ready to begin the justify operation 284.
  • the read descriptor 292 has completed the fetch and funnel operations 280 and 282. As shown for data block 250, the justify operation 284 justifies the data into one section even if, as fetched, it crosses section boundaries.
  • the progress in execution of the read descriptors 288, 290 and 292 preferably reflects the order of their receipt by the memory circuit 36.
  • Figures 14 through 23 are timing diagrams further depicting the operation of the memory circuit 36.
  • Figure 14 shows the load timing for one port 18 writing a block of eight words to the in queue registers 52 using a one-cycle transaction descriptor.
  • Figure 15 shows the load timing for one port 18 writing a block of nine or more words to the in queue registers 52 using a two-cycle transaction descriptor.
  • Figure 16 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor.
  • Figure 17 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/2 ports 18, the descriptor being a three-cycle transaction descriptor.
  • Figure 18 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/4 ports 18, the descriptor being a four-cycle transaction descriptor.
  • Figure 19 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/4 ports 18, the descriptor being a four-cycle transaction descriptor.
  • the operations shown in Figure 19 differ from those shown in Figure 18 in that the access crosses a row boundary, with the contents of the first in queue register 52 and of part of the second in queue register 52 written to the end of the addressed row, while the contents of the other part of the second in queue register 52 and of the third and fourth in queue registers 52 are written at the beginning of the next consecutive row.
  • Figure 20 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor.
  • Figure 21 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N/2 ports 18, the descriptor being a four-cycle transaction descriptor.
  • Figure 22 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N/4 ports 18, the descriptor being a three-cycle transaction descriptor.
  • Figure 23 shows a read access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor.
  • the operations shown in Figure 23 differ from those shown in Figure 22 in that the addressed data crosses a row boundary, with the data for the first, second and third out queue registers 64 being read from the end of the addressed row and the data for the fourth out queue register 64 being read from the beginning of the next consecutive row.
  • the memory circuit's control interface 24, in the above Figures, is shown to receive transaction descriptors from the transaction control bus 16 and, in response thereto, to generate command and control signals for communication to the other elements of the memory circuit 36.
  • the write access interface 26 provides buffered data paths for the flow of data into the RAM core 28.
  • the interface, responsive to receipt of load descriptors 260, controls the flow of data into the in queue registers 52, the data from each input terminal 32 being loaded into a respective in queue register 52.
  • the in queue registers 52 can be grouped in association with a respective port 18.
  • the enqueued data is written to the RAM core 28 responsive to receipt of write descriptors.
  • a single write descriptor 262 transfers all valid data to the RAM core 28 from the in queue registers 52 associated with the particular port 18 corresponding to the descriptor 262.
  • the data is routed through the multiplexer 152 and the position shifter 154.
  • These elements provide for writing the valid data into the RAM array 56 starting at any column of an addressed row.
  • the memory circuit 36 provides for placing in the RAM array 56 a block of data, the size of the block being independently selectable and the placement of the block in the RAM array 56 starting at an independently selectable position.
  • the memory circuit 36 provides for storing various blocks of data at independently selectable positions in the RAM array 56.
  • the memory circuit's read access interface 30 provides buffered data paths for the flow of data from the RAM core 28. Responsive to receipt of a read descriptor 264, data is read from the RAM array 56 in a complete row. Sections thereof are routed through the multiplexer 212 and the justify shifter 214 so that one or more complete or partial sections of valid data are selectable to comprise an output block. Block size is independently selectable from read descriptor to read descriptor. Each block of valid data is routed to the out queue registers 64, the placement of the blocks in the registers 64 being selectable. The out queue registers 64 can be grouped in association with a respective port 18. A single read descriptor
  • read 80 is asserted and de-asserted and, thereafter, data is communicated at the respective output terminals 34.
  • the system component 14 that issued the read descriptor 264 receives the data a fixed number of system clock cycles after the de-assertion of read 80.
  • the memory circuit 36 can be packaged in various ways, including having separate data input and output terminals 32 and 34 or having a single set of terminals that are shared for input and output. Separate input and output terminals 32 and 34 allow for full-duplex operation, while shared terminals allow for support of additional ports in a package of fixed pin count.
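By way of illustration only, the position, replace and store operations described above can be sketched in software as a masked read-modify-write of one row. The sketch is not the circuit itself: the section width, the row size and the names `SECTION_BITS`, `SECTIONS_PER_ROW`, `sections_touched` and `write_block` are assumptions made for the example, and the row-crossing case of Figure 19 is deliberately left out.

```python
# Hypothetical sizes chosen for illustration; the text does not fix them.
SECTION_BITS = 8        # assumed width of one section 57
SECTIONS_PER_ROW = 4    # assumed number of sections in one RAM row
ROW_BITS = SECTION_BITS * SECTIONS_PER_ROW


def sections_touched(start_bit, length):
    """Return the indices of the sections a block of `length` >= 1 bits covers."""
    first = start_bit // SECTION_BITS
    last = (start_bit + length - 1) // SECTION_BITS
    return list(range(first, last + 1))


def write_block(ram, row, start_bit, data_bits):
    """Masked read-modify-write of one row (position, replace, store)."""
    if start_bit < 0 or start_bit + len(data_bits) > ROW_BITS:
        raise ValueError("block must fit within one row in this sketch")
    # Read the addressed row into a software stand-in for the write back register.
    write_back = list(ram[row])
    # Position: shift the valid data to start_bit and build a mask that
    # marks which bit positions carry valid data.
    positioned = [0] * ROW_BITS
    mask = [0] * ROW_BITS
    for i, bit in enumerate(data_bits):
        positioned[start_bit + i] = bit
        mask[start_bit + i] = 1
    # Replace: only masked bits are overwritten; all others keep the old row data.
    write_back = [new if m else old
                  for new, m, old in zip(positioned, mask, write_back)]
    # Store: the entire register is written back to the row.
    ram[row] = write_back
    return mask
```

In this sketch a 12-bit block written at bit 6 spans three sections, which corresponds to the case in which positioned data crosses section boundaries; bits outside the mask keep their previous contents.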

Abstract

A digital electronic system architecture (10) having one or more system components (14) and a memory array (12) coupled to selected system components, the memory array selectively storing and communicating data among the coupled components. The digital electronic system preferably also has a transaction control bus (16), coupled to each of the selected system components and to the memory array, for communicating command and control signals among the components and the memory array. The memory array has a plurality of ports (18), each of the ports (i) having an input terminal (32) and an output terminal (34) that transfer data independently of one another, (ii) operating independently of one another and (iii) being coupled respectively to one of the other system components for data communication therewith. Read and write interfaces (30, 26) for the memory array are also provided which have queues (64, 52) for receiving data read from a row of the memory array and data to be written to the memory array, respectively. The interfaces also have selection circuits (62, 54) for placing in the queues a contiguous block of read data and in the memory array a contiguous block of received data, respectively. The size of the block and its placement are selectable.

Description

A Multi-port Memory System Including Read and Write Buffer Interfaces
Background of the Invention
This invention relates to digital electronic system architectures and circuits therefor, particularly to such architectures and circuits for use in applications requiring high performance memory access and data transfer.
Modern digital electronic systems are called upon to provide ever higher system performance, including higher speed data throughput, higher data bandwidth and lower system latencies. Higher system performance is driven by new applications, as well as advances in current applications. For example, the implementation of high definition television ("HDTV") depends critically on increasing digital system performance so as to achieve fundamental improvements in the quality of the large picture size of HDTV, relative to the current television standard. At the same time, advances in personal computers also require increases in system performance to accommodate developments such as parallel, superscalar and other advanced processing techniques.
Increases in system performance ideally keep pace with increases in the performance of components employed in the systems so as to take full advantage of the components' capabilities. In practice, however, system performance lags component performance, being burdened by adherence to conventional architectures in the design of digital electronic systems.
Conventional system architectures generally combine a microprocessor, a main memory, and one or more other system components, such as other microprocessors and input/output devices. These architectures generally rely on a separate data communication mechanism that interconnects, and communicates data among, the system components. In particular, these architectures provide for interconnecting components through the data communication mechanism so as to share the main memory among each of the microprocessor and selected other system components. These conventional architectures typically implement the data communication mechanism using either a conventional multi-drop data bus or a multi-port hardware switch. In multi-drop bus implementations, data communication is time-multiplexed among the system components coupled to the bus. In multi-port hardware switch implementations, each of the system components is coupled respectively to one of the switch ports, and data communication between any two components is routed through the switch. In addition, these architectures typically implement main memory using a plurality of conventional discrete dynamic random access memory ("DRAM") devices, together with associated access circuitry.
These conventional architectures, while suitable for many applications, tend to be inadequate for high performance applications. In particular, conventional architectures are inadequate for applications requiring one or more of high system throughput, high system bandwidth, or low system latencies. Conventional architectures have nevertheless been employed. To do so, the architectures' performance shortfalls have typically been addressed using custom engineering solutions that adhere to the fundamental confines of the architecture.
For example, to provide enhanced video capabilities, personal computers have employed a video controller connected to the microprocessor through a multi-drop data bus, while using a bank of memory separate from main memory, this memory bank being dedicated to video and typically implemented using video random access memory ("VRAM") devices. These custom engineering solutions have significant limitations, including that they inherently address only the performance of individual components or features within the system, rather than the performance of the system as a whole. Accordingly, these solutions generally improve overall system performance to only a limited degree, if at all. Moreover, these solutions become increasingly more difficult to implement as performance demands increase, that difficulty increasing implementation expense. Accordingly, conventional architectures are increasingly inadequate, if viable at all, for high performance applications. The architectures' performance shortfalls are more acute while the architecture-bound solutions suffer from ever greater limitations.
Conventional architectures' performance shortfalls stem, in particular, from constraints on the cooperation of system components. In turn, that cooperation depends in large part on data communication and main memory sharing among system components.
The implementation of the data communication mechanism is particularly associated with conventional architectures' performance shortfalls. When the architectures' data communication mechanism is implemented using a conventional multi-drop data bus, for example, system performance is limited to the bandwidth and throughput of the bus. Bus bandwidth and throughput are subject both to the loading associated with interfacing the bus to system components and to the bus' physical characteristics, e.g., the length of the bus lines. In addition, because buses time-multiplex data communications, system performance is limited by associated latency in access to system data communications, a limitation that compounds with increases in either or both the number of components seeking to communicate and the size of each communication. In practice, system performance degrades as communications between any two components are impaired for any reason.
Implementing the architectures' data communication mechanism using a conventional multi-port hardware switch, rather than a multi-drop bus, can increase system performance. The increase results from the switch's typically higher throughput and bandwidth. However, these switches tend not only to be expensive, but also to introduce other significant problems in system performance. For example, the switches are not well suited either for networks and other applications requiring data communications in variable block sizes, or for HDTV and other applications requiring random accessibility of data in high speed operations. In addition, these switches typically do not provide for communication of control signals among components. Accordingly, these switches undesirably preclude each component's monitoring, e.g., "snooping", of the other components' memory activities, snooping generally being important to memory protection and cache coherency.
Moreover, these switches also tend to substantially preclude the communication of data from one component to a plurality of other components, e.g., multi-cast data communications.
While conventional architectures' performance shortfalls are associated with the implementation of the data communication mechanism, the shortfalls are also associated with implementing a shared main memory. Reliance on conventional discrete DRAM devices to implement main memory significantly limits system performance, for example, as to system bandwidth and throughput. Conventional discrete DRAM devices have bandwidths that are significantly less than those of current microprocessors, as well as those of increasing numbers of other high performance components. Several approaches have been taken toward improving main memory performance.
One approach is to replace conventional discrete DRAM devices with conventional discrete static random access memory ("SRAM") devices in implementing main memory, so as to take advantage of SRAM devices' substantially higher bandwidths. However, using these SRAM devices generally introduces undesirable costs. Because these SRAM devices are approximately four times more expensive per unit memory size than the DRAM devices and because memory size generally is large and is likely to grow, e.g., full feature HDTV sets are expected to require at least 32 megabytes while next generation personal computers generally are expected to require at least 16 megabytes, the cost of implementing main memory using conventional discrete SRAM simply is antithetical to the economics of main memory implementation.
Other conventional approaches to improving main memory performance focus on improving the bandwidth and throughput of discrete DRAM devices. These approaches include incorporating SRAM memory as cache in discrete DRAM devices, bundling memory in proprietary subsystems having internal data bussing, caching and protocols, employing multiple internal memory arrays, and employing alternate input/output modes. While each of these approaches tends to achieve some improvement in the performance of DRAM devices, each also tends to be subject to undesirable limitations. First, incorporating cache in the DRAM devices improves performance only to the extent cache hits occur with substantial regularity. However, cache hits tend to vanish under various circumstances, particularly in applications having main memory rapidly accessed by several components. Second, having multiple internal memory arrays tends to improve performance only if successive memory accesses address different arrays. In addition, to accommodate successive accesses of a single array, additional circuitry must be provided that compensates for the associated timing differences in the device's output of data. Third, alternate output modes, which include page mode, static column mode, and nibble mode, allow faster access to data by outputting the data in bursts, but generally at the undesirable expense of reducing random accessibility, that is, the modes at best provide random access only within the burst.
The above, as well as other, conventional approaches to improving main memory performance also have the significant limitation of being directed narrowly at improving the memory's bandwidth and throughput.
In doing so, the conventional approaches generally seek specifically to close the bandwidth gaps between main memory and microprocessors. Accordingly, the conventional approaches are not directed at improving cooperation among the system components so as to improve system performance. In particular, these approaches are not directed at improving communication of data among the system components or specifically at improving the sharing of main memory among a plurality of system components, all of which components may have bandwidths comparable to high performance microprocessors.
Accordingly, there is a need for an improved digital electronic system architecture and, in particular, an architecture that permits implementation of high performance digital electronic systems by improving data communication and main memory sharing among the system components. There is also a need for an improved memory circuit and, particularly, for a memory circuit that permits implementation of high performance digital electronic systems.
Summary of the Invention
The present invention meets the aforementioned needs and overcomes the aforementioned limitations by providing a digital electronic system architecture having one or more system components and a memory coupled to selected system components, the memory selectively storing and communicating data among the coupled components. The digital electronic system preferably also has a transaction control bus, coupled to each of the selected system components and to the memory, for communicating command and control signals among the components and memory. The invention also provides a memory circuit having a plurality of ports, each of the ports (i) having an input terminal and an output terminal that transfer data independently of one another, (ii) operating independently of one another and (iii) being coupled respectively to one of the other system components for data communication therewith. The present invention also provides a read interface for a memory array, the interface having a queue for receiving data read from a row of the array and a selection circuit for placing in the queue a contiguous block of the read data, the size of the block and its placement being selectable. The read interface preferably comprises a plurality of queues, and the selection circuit preferably is adapted to place independently selectable blocks of the read data in independently selectable positions in selected queues. The present invention also provides a write interface for a memory array, the interface having a queue for receiving data to be written to the array and a selection circuit for placing in the array a contiguous block of received data, the size of the block and its placement being selectable. The write interface preferably comprises a plurality of queues, and the selection circuit preferably is adapted to place independently selectable data received from selected queues in independently selectable positions in the memory array.
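As a rough software analogy for the read interface summarized above, the following sketch reads a whole row, selects a contiguous block of selectable size, and places it at a selectable position in a queue. The function name `read_block`, its parameters and the use of `None` to mark queue positions holding no valid data are assumptions made for the example and do not describe the circuit of the invention.

```python
def read_block(ram_row, start, size, queue_width, queue_offset=0):
    """Select a contiguous block from a fetched row and place it in a queue."""
    if start < 0 or size < 0 or start + size > len(ram_row):
        raise ValueError("block must lie within the row")
    if queue_offset < 0 or queue_offset + size > queue_width:
        raise ValueError("block must fit in the queue at the chosen offset")
    # The complete row is read first, as a row-wide fetch.
    fetched = list(ram_row)
    # Selection: keep only the contiguous block that was asked for.
    block = fetched[start:start + size]
    # Placement: position the block in the queue; None marks invalid slots.
    queue = [None] * queue_width
    queue[queue_offset:queue_offset + size] = block
    return queue
```

For example, with a 16-element row whose elements are labelled 0 through 15, `read_block(row, 6, 4, 8)` yields a queue holding elements 6 through 9 followed by four empty positions.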
Therefore, it is a principal object of the present invention to provide a novel and improved digital electronic system architecture. It is another principal object of the present invention to provide a novel and improved digital electronic memory circuit.
It is another object of the present invention to provide a digital electronic system architecture having improved memory access and data transfer performance. It is a further object of the present invention to provide a digital electronic system architecture wherein main memory performs system data routing.
It is yet another object of the present invention to provide a digital electronic system architecture wherein main memory is shared by a plurality of system components, the main memory being randomly accessible by each of the components through independent data ports of the main memory.
It is yet a further object of the present invention to provide a digital electronic system architecture that can transfer data between main memory and a plurality of other system components simultaneously. It is another object of the present invention to provide a digital electronic system architecture wherein any of a plurality of system components may transmit command and control signals to one or more other system components simultaneously.
It is a further object of the present invention to provide a digital electronic system architecture employing transaction-based command and control among the system components so as to enhance overall system performance.
It is yet another object of the present invention to provide a novel digital electronic system architecture that is compatible with components and techniques employed in conventional digital electronic system architectures.
It is yet a further object of the present invention to provide a digital electronic system architecture that consolidates virtually all system memory functions into a single system memory.
It is still a further object of the present invention to provide a digital electronic system architecture having a multi-port main memory that is scalable in capacity, bandwidth, word width and number of ports. It is another object of the present invention to provide a novel and improved memory circuit with enhanced memory access and data transfer performance.
It is a further object of the present invention to provide a memory circuit that permits implementation, in a digital electronic system, of main memory having enhanced bandwidth, throughput and random accessibility in all data transfer modes. It is yet another object of the present invention to provide a memory circuit having multiple data transfer ports capable of simultaneous and mutually independent data transfer.
It is yet a further object of the present invention to provide a memory circuit capable of supporting ports of selectable word width while providing substantially unrestricted random accessibility to the memory through all ports, in variable size blocks and in both read and write operations.
It is still another object of the present invention to provide a memory circuit that segregates control operations from access operations. It is still a further object of the present invention to provide a memory circuit having a plurality of independent ports and capable of selectively sharing its bandwidth among a plurality of components coupled to respective ports.
It is another object of the present invention to enable broadcast of control information simultaneously with data transfers through dedicated ports. It is a further object of the present invention to enable an order of magnitude increase in achievable main memory performance while maintaining a hardware model consistent with existing operating system software, i.e., wherein all data communications pass through main memory.
It is yet another object of the present invention to provide a discrete memory device with a configurable number of ports and port widths.
The foregoing and other objects, features and advantages of the invention will be more readily understood upon consideration of the following detailed description, taken in conjunction with the accompanying drawings.
Brief Description of the Drawings
Figure 1 shows a schematic representation of a generalized digital electronic circuit implemented using an architecture according to the present invention.
Figure 2 shows a general block diagram of a memory circuit according to the present invention.
Figure 3 shows a block diagram of a specific embodiment of the memory circuit of Figure 2.
Figure 4 shows a logical organization of a RAM array according to the present invention.
Figure 5 shows a master control and a RAM access control according to the present invention.
Figure 6 shows an embodiment of a load control according to the present invention.
Figure 7 shows an embodiment of an unload control according to the present invention.
Figure 8 shows an embodiment of a write access interface according to the present invention.
Figure 9 shows an embodiment of a shift count and write mask generator circuit according to the present invention.
Figure 10 shows an embodiment of a RAM core according to the present invention.
Figure 11 shows an embodiment of a sense amplifiers and write back registers circuit according to the present invention.
Figure 12 shows an embodiment of a read access interface according to the present invention.
Figure 13 shows a data flow diagram of a memory circuit according to the present invention.
Figures 14 through 23 show timing diagrams of the operation of a memory circuit according to the present invention.
Detailed Description
Referring to Figure 1, a generalized digital electronic system 10 implemented using an architecture according to the present invention comprises a main memory 12, a plurality of other system components 14 and a control and address bus 16. The control and address bus 16 is common to the main memory 12 and the other system components 14, and is sometimes referred to herein as the transaction control bus. The main memory 12 has a plurality of ports 18, each port providing a mechanism for data communication between the main memory 12 and the respective system component 14 coupled to the main memory 12. The storage functions of the main memory 12 preferably are shared by each of the system components 14, that is, the main memory 12 preferably is randomly accessible by the system components 14 through the respective ports 18. The ports 18 preferably operate independently of each other, so as to facilitate data communication, including providing for effectively simultaneous data communication between main memory and any plurality of system components coupled thereto.
The other system components 14 include one or more microprocessors, mass storage devices, video controllers, input/output devices, network interfaces, or the like. One or more of these system components 14 may be coupled to one or more peripheral components 20.
Although the digital electronic system 10, as shown, does not include any conventional data bus or hardware switch, it is to be recognized that the system 10 may include such a data bus or switch. For example, the digital electronic system 10 may comprise a conventional multi-drop input/output bus to which mass storage and other peripheral components are coupled, the bus generally being coupled to the main memory 12 by an interposed controller. The important point is that the main memory 12, according to the digital electronic system architecture of the present invention, provides the primary mechanism for data communication among the system components 14 coupled thereto.
The transaction control bus 16 communicates command, control and address signals, but no data, among the system components 14. The transaction control bus 16 preferably comprises a system clock, signals combined to form transaction descriptors, and one or more control and arbitration signals coordinating accesses of, respectively, the main memory 12 and the transaction control bus 16. Each transaction descriptor preferably consumes a relatively small portion of the bus' bandwidth. Moreover, each transaction descriptor communicated over the bus 16 preferably is independent of the other communicated descriptors.
Each transaction descriptor preferably corresponds to a predefined transaction. To do so, each transaction descriptor preferably includes information identifying the type of transaction, e.g., load, write and read transactions for accesses of the main memory 12, as well as information identifying each participating system component 14, e.g., by the port 18 at which the participating component 14 is respectively coupled to the main memory 12.
The transaction control bus 16 preferably time multiplexes the communication of transaction descriptors thereover. In particular, each of the system components 14 competes for access to the bus 16 when transmitting transaction descriptors associated with accessing the main memory 12. Access to the transaction control bus 16 preferably is determined by a selected arbitration algorithm. Because system throughput is limited principally by time-multiplexed communication of transaction descriptors over the transaction control bus 16 and each such descriptor consumes a relatively small portion of the bus' bandwidth, the transaction control bus 16 provides for communication of descriptors at a relatively high rate. Moreover, because each descriptor can control the communication of a relatively large amount of data, the system's use of the bus 16 provides for a substantially enhanced system throughput of data.
Each transmission over the transaction control bus 16 by any of the main memory 12 or a system component 14 preferably is received by each of the other system components 14 and main memory 12, as the case may be.
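The arbitrated, broadcast use of the transaction control bus described above can be illustrated with a small software model. This is a sketch under stated assumptions rather than the bus of the invention: the text leaves the arbitration algorithm open, so round-robin is used here purely as an example, and the names `TransactionDescriptor`, `TransactionControlBus`, `request`, `attach` and `cycle`, as well as the descriptor field layout, are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionDescriptor:
    kind: str         # transaction type: "load", "write" or "read"
    port: int         # port identifying the participating component
    address: int = 0  # hypothetical address field


class TransactionControlBus:
    """Time-multiplexed bus: one descriptor is granted and broadcast per cycle."""

    def __init__(self, num_ports):
        self.pending = [deque() for _ in range(num_ports)]
        self.listeners = []
        self.next_port = 0  # round-robin pointer (assumed arbitration scheme)

    def request(self, descriptor):
        """Queue a descriptor from the component on descriptor.port."""
        self.pending[descriptor.port].append(descriptor)

    def attach(self, listener):
        """Register a callable that receives every broadcast descriptor."""
        self.listeners.append(listener)

    def cycle(self):
        """Grant the bus to one requester and broadcast its descriptor."""
        count = len(self.pending)
        for step in range(count):
            port = (self.next_port + step) % count
            if self.pending[port]:
                descriptor = self.pending[port].popleft()
                self.next_port = (port + 1) % count
                for listener in self.listeners:
                    listener(descriptor)  # every attached party sees it
                return descriptor
        return None  # no component requested the bus this cycle
```

Because every attached listener receives every granted descriptor, a component in this model can observe the other components' memory transactions without any extra signalling.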
Having broadcast command and control communications, the system 10 supports conventional techniques and technologies including (i) snooping by each system component 14 of each of the other components' activities respecting the main memory 12, such as to maintain memory protection and cache coherency, where implemented, (ii) multi-cast data communication among the system components 14, and (iii) basic arbitration algorithms. More specifically, broadcast command and control communications support use of basic memory protection and cache coherency algorithms, particularly because each system component 14 can monitor the transaction descriptors communicated by the other system components 14. Moreover, broadcast command and control communications make practical the use of basic arbitration algorithms because arbitration need only coordinate accesses to the transaction control bus 16 for defined, relatively short transaction descriptors from a known number of sources.
Referring to Figure 2, a memory circuit 22, in accordance with the present invention, includes a control interface 24, a write access interface 26, a RAM core 28 and a read access interface 30. The control interface 24 is coupled to the transaction control bus 16, as well as to each of the write access interface 26, the RAM core 28 and the read access interface 30. The write access interface 26 is coupled to the RAM core 28 which, in turn, is coupled to the read access interface 30. The write access interface 26 has a plurality of data input terminals 32, while the read access interface 30 has a plurality of data output terminals 34. The input and output terminals 32 and 34 have a selected number, the number being designated herein by N.
It is to be recognized that the data input and output terminals 32 and 34 may be grouped to form a selected number of ports 18, the number being designated herein as P. The number of ports P is between 1 and N, each port being coupled respectively to one of the system components 14, as shown in Figure 1. It is also to be recognized that the main memory 12 of the system architecture shown in Figure 1 preferably is implemented using one or more memory circuits 22, the circuits 22 being organized to provide a selected word width for each of the ports 18 (word width is designated herein as W). In such implementation, each circuit 22 generally provides a slice of the word width, the slice being N/P bits wide.
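The slice arithmetic described above can be illustrated with a short sketch. The function name, the requirement that P divide N evenly, and the example word width W=32 are assumptions made for illustration and are not taken from the disclosure.

```python
def slice_layout(n_terminals: int, n_ports: int, word_width: int) -> tuple:
    """Return (slice width N/P in bits, number of circuits needed for width W)."""
    # Assumption: P divides N evenly and W is a whole number of slices.
    if not 1 <= n_ports <= n_terminals or n_terminals % n_ports:
        raise ValueError("P must lie between 1 and N and divide N")
    slice_bits = n_terminals // n_ports
    if word_width % slice_bits:
        raise ValueError("W must be a multiple of the slice width")
    return slice_bits, word_width // slice_bits


# N=8 terminals grouped into P=2 ports, with an assumed word width W=32.
print(slice_layout(8, 2, 32))  # (4, 8)
```

Under these assumptions, eight circuits each contributing a four-bit slice would together provide a 32-bit word at each port.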
In the memory circuit 22, the control interface 24, in response to signals received over the transaction control bus 16, controls each of the write access interface 26, the RAM core 28 and the read access interface 30. More specifically, the control interface 24 controls the routing of data into and out of the RAM core 28, as well as communication of data at the input and output terminals 32 and 34. The write access interface 26, under control of the control interface 24, provides for buffering, queuing and routing of data for storage in the RAM core 28, the data being communicated to the memory circuit 22 at one or more of the data input terminals 32. The read access interface 30, under control of the control interface 24, provides for routing, queuing and buffering of data stored in the RAM core 28 for communication at one or more of the data output terminals 34.
The memory circuit 36 shown in Figure 3 is a specific embodiment of the memory circuit 22 shown generally in Figure 2. In memory circuit 36, the control interface 24 comprises a master control 40, a load control 42, a RAM access control 44, and an unload control 46. The control interface 24, as shown, also comprises a refresh control 48. The refresh control 48 is employed when the RAM core 28 is implemented using dynamic random access memory ("DRAM"). In that case, the refresh control 48 provides, through the RAM access control 44, for refresh of the DRAM cells. Refresh circuits and procedures are known and, accordingly, are not described further herein. It is to be recognized that, if the RAM core is implemented using other than DRAM, the refresh control 48 may be omitted without departing from the principles of the invention. It is also to be recognized that, although the remainder of this disclosure is directed to memory circuits using DRAM, SRAM may be employed, subject only to modifications readily understood by those of ordinary skill in the art by reference to the disclosures hereof and to well known memory design techniques, without departing from the principles of the invention.
In memory circuit 36, the write access interface 26 comprises a data input interface 50, in queue registers 52 and a write data routing and section write mask circuit 54. In turn, the RAM core 28 comprises a RAM array 56, a row access control 58, and sense amplifiers and write back registers 60. Moreover, the read access interface 30 comprises a read data routing circuit 62, out queue registers 64 and a data output interface 66. The in and out queue registers 52 and 64 preferably are equal in number and in one-to-one relationship with the number of input and output terminals 32 and 34, respectively, such that the registers are N in number. The in and out queue registers 52 and 64 preferably have uniform bit depth, that depth being designated herein by Q. The write data routing and section write mask circuit 54 and the sense amplifiers and write back registers 60 are sometimes referred to herein as the routing/mask circuit 54 and the sense/write circuit 60, respectively.
Turning to Figure 4, the RAM array 56 preferably is a conventional array, physically organized as R rows and C columns. The RAM array's columns preferably are logically organized into S sections 57. Although the sections 57, as shown, lie end to end to form each row in the array 56, it is to be recognized that the sections 57 may have other logical organization, including being interleaved bit-by-bit, without departing from the principles of the invention.
The number of columns per section 57 in the array 56 preferably is uniform for all sections 57 and equals the bit depth Q of the in and out queue registers 52 and 64. The number S of sections 57 preferably is a power-of-two integer, and follows the formula S = C/Q. Having this logical organization provides for addressing the array's rows using log2(R) bits, addressing the array's sections 57 using log2(S) bits, and addressing the array's columns within an addressed section using log2(Q) bits.
It is to be recognized that, when the data input and output terminals 32 and 34 are N in number and are grouped to form a selected number P of ports 18, the circuit 36 generally provides a word slice N/P bits wide. In that case, when a transaction descriptor associated with a particular port is executed, N/P in or out queue registers 52 or 64 generally are implicated. Accordingly, the transaction descriptor generally addresses the RAM array 56 in groups of N/P sections 57. This logical organization, then, provides for accessing the RAM array 56 in one or more queue-sized sections 57 in any transaction, while being able to address each column within such section.
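The row, section and column addressing described above can be sketched as a simple bit-field split. The field ordering (row in the high bits, column in the low bits), the function name and the example dimensions are assumptions for illustration; the disclosure does not fix a particular layout.

```python
def split_address(addr: int, rows: int, cols: int, q: int) -> tuple:
    """Split a flat bit address into (row, section, column within section)."""
    sections = cols // q                    # S = C/Q
    col_bits = (q - 1).bit_length()         # log2(Q) for power-of-two Q
    sec_bits = (sections - 1).bit_length()  # log2(S) for power-of-two S
    column = addr & (q - 1)
    section = (addr >> col_bits) & (sections - 1)
    row = addr >> (col_bits + sec_bits)
    if row >= rows:
        raise ValueError("address exceeds the array")
    return row, section, column


# Assumed example: R=4096, C=4096, Q=256, hence S=16.
addr = (5 << 12) | (3 << 8) | 17
print(split_address(addr, rows=4096, cols=4096, q=256))  # (5, 3, 17)
```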
The RAM array 56 may be constructed using conventional SRAM or DRAM. Generally, the RAM array 56 may be implemented using any memory technology.
Referring again to Figure 3, the control interface's master control 40 preferably has principal functions that include (i) providing configuration information for the memory circuit 36, including the number of ports 18 and associated grouping of elements of the circuit 36, (ii) receiving the external command and control signals carried over the transaction control bus 16 and, in response thereto, generating internal command and control signals, including an internal clock signal, and distributing the signals to the other elements of the circuit 36, and (iii) receiving internal command and control signals from the other elements of the circuit 36 and, in response thereto, generating external command and control signals for transmission over the transaction control bus 16.
The control interface's RAM access control 44, in response to internal signals received from the master control 40, generates internal command and control signals and distributes the signals to the appropriate elements of the circuit 36 in accordance with internal timing demands associated with performing each transaction. Among other principal functions so provided, the RAM access control 44 coordinates the flow of data in and out of the RAM core 28 and controls read and write timing. In addition, the RAM access control 44 controls the load and unload controls 42 and 46. The load control 42 operates under the control of the RAM access control 44, together with the master control 40, to control the queuing of data communicated to the circuit 36 at the data input terminals 32, while the unload control 46 operates under the control of the RAM access control 44 to control the unloading of queues of data from the RAM core 28 to the data output terminals 34.
The control interface 24 provides for communication of external command and control signals carried over the transaction control bus 16, as well as communication of the interface's internally-generated command and control signals. Communication of external command and control signals is provided by coupling the master control 40 with the transaction control bus 16. Communication of the internal signals is provided by coupling the master control 40 directly to the load control 42, and with the RAM access control 44. The internal signals are communicated from the RAM access control 44 by coupling the control 44 both to the load control 42 and to the unload control 46. Although the master control 40 is not directly coupled to the unload control 46, it is to be recognized that the master control 40 is indirectly coupled to the unload control 46 through the RAM access control 44.
It is also to be recognized that the master control 40 may be coupled directly to the unload control 46 without departing from the principles of the invention, provided the unload control 46 receives command and control signals so as to provide its function.
The control interface 24 also provides for distribution of its internally-generated command and control signals to the other elements of the memory circuit 36. The control interface 24 distributes the internal signals to the circuit's write access interface 26 via both the RAM access control 44 and the load control 42. In addition, the control interface 24 distributes such internal signals to the circuit's read access interface 30 via both the RAM access control 44 and the unload control 46. Moreover, the control interface 24 distributes the internal signals to the circuit's RAM core 28 via the RAM access control 44.
It is to be recognized that the memory circuit's control interface 24 may comprise other or different functional blocks, or other or different interconnections between functional blocks and other elements of the memory circuit 36, or both, without departing from the principles of the invention, the important point being that the control interface 24, in response to signals received over the transaction control bus 16, controls the routing of data into and out of the RAM core 28, as well as communication of data at the input and output terminals 32 and 34.
It is also to be recognized that the memory circuit 36 may be configured other than as shown in Figure 3, that is, the circuit 36 may have configurations other than two ports (P=2), and eight input and output terminals 32 and 34 (N=8), such that each circuit provides other than a four-bit memory word slice per port (N/P=4).
Figures 5 through 7 show embodiments of the control interface's master control 40, RAM access control 44, load control 42 and unload control 46. In Figure 5, the master control 40 and the RAM access control 44 are shown in association with the transaction control bus 16. The master control 40 and RAM access control 44 preferably comprise respective state machines whose implementation is readily understood by those of ordinary skill in the art, using well known digital design techniques with reference to (i) the functions performed by, and the respective signals into and out of, each such machine, (ii) the structure and function of each functional block of the memory circuit 36 and of the memory circuit 36 overall, and (iii) the timing diagrams shown in Figures 14 through 23, all as described herein. Moreover, using the design techniques, it is to be recognized that the master control 40 and the RAM access control 44 may be implemented as a single state machine, together with one or more other blocks of the circuit 36, without departing from the principles of the invention.
The master control 40 and transaction control bus 16 communicate therebetween external command and control signals carried over the bus 16, each signal preferably being buffered in its communication to or from the master control 40 by a respective buffer 67. The signals preferably include system clock 68, bank enable 70, byte enable 72, cancel access 74, and tcb 76, each received at the master control 40 from the transaction control bus 16, as well as q_ready 78 and read 80, both received at the transaction control bus 16 from the master control 40.
The system clock 68 provides the master clock for the synchronization of data communications at the memory circuit's terminals 32 and 34, as well as for the other command and control signals communicated between the master control 40 and transaction control bus 16. It is to be recognized that the frequency of the system clock 68 may be limited by loading of the transaction control bus and, in that case, data may be communicated at the terminals 32 and 34 on both the rising and falling edges of the system clock 68 so as to maintain data bandwidth, without departing from the principles of the invention.
The tcb 76 comprises a plurality of signals for communicating transaction descriptors to the memory circuit 36. Each transaction descriptor preferably comprises one or more packets of information communicated over the tcb 76, each packet being communicated synchronous with one respective cycle of the system clock 68 and having a preselected size given by the number of signals, the number being designated herein by D. As previously described, the information associated with each transaction descriptor preferably is predefined. Transaction descriptors' information preferably is communicated in predefined fields which, as respects the memory circuit 36, preferably include fields respectively for commands, RAM array addresses, source and destination identifications, and transaction cycle counts. The commands preferably are encoded and correspond to transactions that include load, write and read transactions, while the unload function preferably is included as part of a read transaction and therefore has no separate transaction descriptor. The source and destination identifications preferably are encoded and identify the respective port 18 associated with communicated data. In that regard, if the circuit 36 is employed in a system 10 implemented as shown in Figure 1, the source and destination identifications identify, not only the port 18, but also the respective system component 14 associated with the port. The transaction cycle count preferably describes, for load and read transactions, the number of system clock cycles for communication of data at the transaction's associated port 18 and, for write transactions, the size of the block of data to be written to the RAM array 56.
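The descriptor fields named above can be modeled, purely for illustration, as a small record type. All class, field and enumeration names below are hypothetical; the disclosure leaves the actual encoding and field layout to design choice.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Command(Enum):
    LOAD = "load"
    WRITE = "write"
    READ = "read"  # unloading is part of a read, so it has no command of its own


@dataclass(frozen=True)
class TransactionDescriptor:
    command: Command
    port_id: int                   # source or destination identification
    cycle_count: int               # clock cycles (load, read) or block size (write)
    address: Optional[int] = None  # RAM array address, when the command needs one


desc = TransactionDescriptor(Command.WRITE, port_id=1, cycle_count=64, address=0x1234)
print(desc.command.value, desc.port_id, desc.cycle_count, hex(desc.address))
```

Treating the address as optional is an assumption of this sketch; which commands carry an address depends on the descriptor definitions chosen for the application.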
In any application, the transaction descriptors may vary in number of packets, while the descriptors' packets may vary in the number and types of fields, in particular depending on the command and, thence, the function of the particular descriptor. Conversely, the size D of the descriptors' packets preferably is invariant once selected for an application, being selected to optimize packet functionality and system performance while comporting with the design of the digital electronic system employing the transaction control bus 16 and memory circuit 36. The size D, in particular, preferably accommodates the addressing requisites of the RAM array 56. For example, a transaction descriptor packet having twelve signals, i.e., D=12, should be sufficient for a digital electronic system 10 having a main memory 12 constructed from memory circuits 36 that include eight terminals 32 and 34, i.e., N=8, and a RAM array 56 having 4,096 rows and 4,096 columns, wherein uniquely addressing each row and column requires 12 bits. In this example, then, a transaction descriptor engendering an access to the RAM array 56, e.g. a write transaction, preferably comprises four packets, one to communicate the source identification and the command, another to communicate the size of the data block to be written, and the remaining two to communicate the address of the initial bit in the writing of the data to the array 56.
It is to be recognized that the transaction descriptors' specific definitions are largely a matter of design choice, subject to and informed by, among other things, the transactions to be performed, the applications in which the transactions are performed, and the configuration of both the memory circuit 36 and the system 10, as described above and known in the art. Accordingly, transaction descriptors' definitions are not described further herein.
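The packet count in the example above can be checked with a brief calculation. The function name is hypothetical, and the sketch assumes, as in the example, that the command and source identification share one packet and that the block size fits in one packet.

```python
def write_descriptor_packets(rows: int, cols: int, d: int) -> int:
    """Count D-bit packets in a write descriptor for an R-by-C array."""
    # Address bits: log2(R) for the row plus log2(C) for the column.
    addr_bits = (rows - 1).bit_length() + (cols - 1).bit_length()
    addr_packets = -(-addr_bits // d)  # ceiling division
    # One packet for command and source, one for block size (assumed).
    return 1 + 1 + addr_packets


print(write_descriptor_packets(4096, 4096, 12))  # 4
```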
Bank enable 70 enables the circuit's reception of transaction descriptors from the tcb 76. The source of the transaction descriptor asserts the bank enable 70 in conjunction with the source's transmission of the descriptor, preferably in conjunction with the transmission of the descriptor's first word. Where the memory circuit 36 is one of several such circuits forming memory banks in the main memory 12 of a system 10, each memory bank has an associated bank enable signal. Accordingly, the bank enable 70 associated with the memory circuit 36 is asserted only if the circuit is in the bank addressed by the transaction descriptor.
Byte enable 72, when asserted, enables the circuit's writing of data to the RAM array 56 in response to a write transaction descriptor. When byte enable 72 is not asserted, the circuit 36 performs the operations associated with the write transaction descriptor, but does not write data to the array 56. Byte enable 72 preferably is used where the memory circuit 36 is one of a plurality of memory circuits 36 organized to provide a selected word width W for one or more ports 18, the word width W being greater than one byte and the circuits 36 providing memory word slices. In such use, each byte of word width W has an associated byte enable signal, so that the particular byte enable signal associated with the memory circuit 36 is asserted only if the circuit 36 provides a slice of the byte addressed by the write transaction descriptor.
Cancel access 74 provides for the cancellation of read and write transaction descriptors before execution. In a system 10, cancel access preferably is monitored not only by the main memory 12, but also by the system components 14 so as to accurately track memory accesses. Cancel access 74 preferably is generated by an external algorithm monitoring memory transactions for, among other things, invalid accesses.
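The combined effect of bank enable 70, byte enable 72 and cancel access 74 on a write transaction can be summarized in a simplified sketch. This is a purely combinational model that ignores timing, and the function name and return convention are assumptions for illustration.

```python
def write_outcome(bank_enable: bool, byte_enable: bool, cancel_access: bool) -> tuple:
    """Return (descriptor executes, data is written to the array)."""
    # The descriptor is received only by the enabled bank and may be
    # cancelled before execution.
    executes = bank_enable and not cancel_access
    # Without byte enable the circuit still performs the write
    # operations but leaves the array contents unchanged.
    writes_array = executes and byte_enable
    return executes, writes_array


print(write_outcome(True, False, False))  # (True, False)
```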
While the previously described signals are received at the memory circuit's master control 40, q_ready 78 and read 80 are received at the transaction control bus 16 from the master control 40. Q_ready 78 is a handshake signal asserted by the memory circuit 36 to indicate readiness to receive another read or write transaction descriptor, and deasserted to indicate receipt of such descriptors. Q_ready 78 preferably is asserted a predetermined number of system clock cycles in advance of when the circuit 36 is able to accept the next read or write transaction descriptor. Advance assertion has particular application when the memory circuit 36 is employed in systems 10 having arbitration algorithms to coordinate time-multiplexing of transaction descriptors over the transaction control bus 16. In such systems 10, wherein the memory circuit 36 may be one of a plurality of such circuits forming a memory bank of the main memory 12, it is preferred to employ the q_ready signal of only one circuit 36 per bank. Moreover, in such systems 10 the system components 14 preferably monitor the q_ready signals so as to determine whether to transmit data to their associated ports 18 of the main memory 12.
Read 80 is another handshake signal asserted upon execution of each read transaction descriptor and deasserted prior to the circuit's communication of read data at one or more of the data output terminals 34. Read 80 preferably is deasserted a predetermined number of system clock cycles in advance of that communication. When the memory circuit 36 is employed in a system 10, advance deassertion allows the system component 14 that sent a read transaction descriptor to monitor the circuit's read 80 so as to determine when to receive data from the circuit 36.
The master control 40 and the RAM access control 44 generate internal command and control signals and communicate some of these signals therebetween. The communicated signals preferably include load controls 82, write enable 84, cancel 86, tcb in 88 and internal clocks 94, each received at the RAM access control 44 from the master control 40, as well as reading 90 and start read 92, both received at the master control 40 from the RAM access control 44.
The master control 40, in response to receipt of system clock 68, generates internal clocks 94 which are distributed, not only to the RAM access control 44, but also to the elements of the memory circuit 36 generally so as to synchronize the memory circuit's internal operations. The internal clocks 94, though derived from and preferably synchronized with the system clock 68, need not have the same frequency as the system clock 68. For example, the internal clocks 94 may be obtained by multiplying or dividing the frequency of the system clock 68.
Load controls 82 enable loading of each word of the transaction descriptor received by the master control 40 into the RAM access control 44.
Write enable 84, cancel 86 and tcb in 88 comprise synchronized versions, respectively, of byte enable 72, cancel access 74 and tcb 76 received over the transaction control bus 16. Write enable 84 preferably determines whether data is replaced in the write back registers of the sense/write circuit 60 during a write transaction.
Reading 90 is an internal version of read 80 transmitted from the master control 40 over the transaction control bus 16. Start read 92 enables the start of the read phase of a row access in the RAM array 56. Start read 92 is generated by the RAM access control 44 and communicated both to the master control 40 and to row access control 58 of the RAM core 28.
The master control 40 also generates a load count 96 that is directed to and controls operation of the load control 42. Load count 96 is described hereinafter in the description of the load control 42.
The RAM access control 44 generates and communicates internal command and control signals in addition to those directed to the master control 40. These signals preferably include start_write 100, base mask enables 102, next mask enables 104, queue select 106, load enable 108, load rcount 110, row address 112, section select 114, base column 116, block size 118, and input block size 120.
Start write 100 is directed to the row access control 58 of the RAM core 28 to start the write phase of a RAM array access.
Base mask enables 102 are directed to the RAM core 28. Each signal of base mask enables 102 enables bit replacement in the RAM array's addressed row, in particular in the signal's associated section 57. The bits preferably are replaced when the respective signal of the base mask enables 102 is asserted. Because each row in the RAM array 56 preferably is divided into S sections 57, base mask enables 102 preferably comprises S signals.
Next mask enables 104 are directed to the RAM core 28. Each signal of next mask enables 104 enables bit replacement in the next-consecutive section 57 of the RAM array's addressed row after the section associated with the corresponding signal of the base mask enables 102. The bits preferably are replaced when the respective signal of next mask enables 104 is asserted. The next mask enables preferably also comprise S signals, one corresponding to each section 57 in a row of the RAM array 56.
Queue select 106 selects one of the in queue registers 52 of the write access interface 26 in the transfer of enqueued data to the routing/mask circuit 54. Broadly, queue select 106 triggers routing of the selected register's enqueued data to the RAM array 56 during the execution of a write transaction descriptor. Where the number of in queue registers 52 is N, queue select 106 preferably comprises log2(N) signals.
Load enable 108 controls the loading of data read from the RAM array 56 into a corresponding out queue register 64. The number of signals of the load enable 108 preferably is in one-to-one relationship with the number of out queue registers 64. Accordingly, where the number of out queue registers 64 is N, the number of signals of load enable 108 preferably is N.
Load rcount 110 is directed to the unload control 46 in controlling the operation thereof. Load rcount 110 is described hereinafter in the description of the unload control 46.
Row address 112, section_select 114 and base_column 116 comprise the address signals for accessing the RAM array 56 and reading selected data therefrom. Row address 112 is directed to the row access control 58 of the RAM core 28 to control row accesses of the RAM array 56. As the number of rows in the RAM array 56 is R, row address 112 preferably comprises log2(R) signals. Section select 114 signals are directed to the read data routing circuit 62 of the read access interface 30 to identify sections 57 associated with an addressed row of the RAM array 56 from which data is routed to the out queue registers 64. Section select 114 preferably comprises log2(S) signals where S represents the number of sections 57 per row. Base column 116 is directed to the read data routing circuit 62. Base column 116 selects, within the selected section 57 of the addressed row of the RAM array 56, the particular column where the addressed data begins. Base column 116 is also directed to the routing/mask circuit 54 of the write access interface 26 for generating control signals that provide for writing of data from a particular addressed column in a section 57. Base column 116 preferably comprises log2(Q) signals, where Q represents the number of columns per section 57.
In generating row address 112, as well as start read 92 and start write 100, the RAM access control 44 is responsive not only to the signals received from the master control 40, but also to two sections 122. Two sections 122 is generated by the routing/mask circuit 54 of the write access interface 26 and indicates to the RAM access control 44 when a RAM access engenders the crossing of the boundary between two sections of the RAM array 56. Moreover, if the section select 114 identifies the last section of a row, two_sections 122 indicates the crossing of a row boundary. When a row boundary crossing is so indicated, the RAM access control 44 preferably generates two successive sequences of the access signals row address 112, start read 92 and start write 100. The first sequence of row address 112, start read 92 and start_write 100 provides for access to a row for the first section of data to be written to or read from the RAM array 56. The second sequence of such access signals provides for access to the row having the next section, which preferably is the next consecutive row in the RAM array 56. Two sections 122 is described further herein with respect to the write access interface 26.
Block size 118 is directed to the routing/mask circuit 54 of the write access interface 26 and describes the size of the block of data associated with a read or write transaction descriptor. That is, block size 118 determines the number of bits to be replaced or read from each section 57 of a row of the RAM array 56 in, respectively, write and read transactions. Block_size 118 preferably comprises log2(Q) signals, where Q represents the number of columns in each section 57 of the RAM array 56.
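The section and row boundary conditions described above can be sketched as follows. The sketch assumes that the block occupies consecutive columns starting at the base column, and the function and parameter names are hypothetical; the disclosure does not give the logic in this form.

```python
def boundary_crossing(base_column: int, block_size: int, section: int,
                      q: int, s: int) -> tuple:
    """Return (two_sections, row_crossing) for one access."""
    # Assumption: the block spans columns base_column .. base_column+block_size-1,
    # so it spills into the next section when it runs past column Q-1.
    two_sections = base_column + block_size > q
    # Spilling out of the last section of a row crosses into the next row,
    # which calls for a second row access sequence.
    row_crossing = two_sections and section == s - 1
    return two_sections, row_crossing


# Assumed example: Q=256 columns per section, S=16 sections per row.
print(boundary_crossing(200, 100, section=3, q=256, s=16))   # (True, False)
print(boundary_crossing(200, 100, section=15, q=256, s=16))  # (True, True)
```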
Input block size 120 is directed to the load control 42 in controlling the operation thereof. Input block size 120 describes the size of the block of data associated with a corresponding transaction descriptor. Input block size 120 is described further in the following description of the load control 42.
As previously described, the load control 42 operates under the control of the RAM access control 44, together with the master control 40, to control the in queue registers' queuing of data communicated to the circuit 36 at the data input terminals 32. The load control 42, as shown in Figure 6, preferably comprises a plurality of element counters 130, each having input block size 120 and load count 96 as inputs thereto, and a shift enable signal 132 as an output therefrom for communication to a respective in queue register 52. The number of element counters 130 preferably is in one-to-one correspondence with the number of in queue registers 52 so that each counter 130 individually controls the operation of a respective register 52 through the generation of the respective shift enable signal 132. In particular, because the number of registers 52 preferably corresponds to the number of input terminals 32, the number of element counters 130 preferably is N, where N designates the number of input terminals 32 as previously described.
The element counters 130 preferably comprise down counters and each element counter 130 preferably operates independently of the others. Upon execution of a transaction descriptor implicating one or more of the element counters 130, such counters 130 are individually loaded with input block size 120, describing the size of the data block associated with the transaction descriptor. The other element counters 130 may be loaded with a value of input block size 120 corresponding to a previous or succeeding transaction descriptor. The value of the input block size 120, accordingly, may vary from transaction descriptor to transaction descriptor and, thence, from counter 130 to counter 130. In addition, the input block size 120 preferably has values ranging from one bit to the full bit depth Q of the in queue registers 52. So as to represent block sizes up to Q, the input block size 120 preferably comprises log2(Q) signals. It is to be recognized that input block size 120 may be received at the load control 42 from the transaction control bus 16 directly or otherwise, rather than from the RAM access control 44, without departing from the principles of the invention.
Loading of input_block_size 120 into one or more element counters 130 is triggered by receipt of the load_count 96 associated with that counter 130. Accordingly, load count 96 preferably comprises a plurality of signals, one for each element counter 130. For example, where the element counters 130 number N, load count 96 preferably numbers N signals. It is to be recognized that, when the circuit 36 has a plurality of ports 18 and provides a word slice N/P bits wide, each transaction descriptor associated with receiving data at a particular port preferably engenders generation of N/P signals of load count 96, each of these signals being directed to a respective element counter 130 associated with that receiving port so as to initially load therein the input block size 120. Moreover, in that case, each of the element counters 130 associated with that receiving port will be initially loaded with a common-valued input block size 120 and, while data is to be enqueued, each will generate a shift enable signal 132 to enable the respective register.
In operation, each element counter 130, while holding a non-zero value, enables the enqueuing of data into the respective in queue register 52 by asserting the shift enable 132 associated therewith. For each bit of data so enqueued in a register 52, the respective element counter 130 decrements once. When the counter 130 decrements to zero, the counter 130 disables enqueuing of data and ceases to decrement.
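The counter behavior just described can be sketched in Python as a small behavioral model. This is an illustrative sketch only, not the circuit of the specification; the class name `ElementCounter` and its method names are hypothetical.

```python
class ElementCounter:
    """Behavioral model of one down counter gating one in queue register.

    Hypothetical model for illustration; names are not from the specification.
    """

    def __init__(self):
        self.count = 0

    def load(self, input_block_size):
        # Models assertion of this counter's load count signal.
        self.count = input_block_size

    @property
    def shift_enable(self):
        # Enqueuing is enabled only while the counter holds a non-zero value.
        return self.count != 0

    def clock(self):
        """Advance one bit time; return True if a bit was enqueued."""
        if self.count == 0:
            return False          # counter has stopped; nothing is enqueued
        self.count -= 1           # one decrement per enqueued bit
        return True


counter = ElementCounter()
counter.load(3)                   # block of three bits
enqueued = [counter.clock() for _ in range(5)]
```

In this model the counter enables exactly three bit times and then holds at zero, so later clock cycles enqueue nothing until the counter is loaded again.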
As previously described, the unload control 46 operates under the control of the RAM access control 44 to control the queuing of data from the RAM core 28 for communication at the data output terminals 34. The unload control 46, as shown in Figure 7, preferably has a structure substantially similar to that of the load control 42 and operates in a substantially similar manner, except that its operations are directed at controlling the out queue registers 64 in the communication of data from the circuit 36. That is, the unload control 46 comprises a plurality of element counters 134, each preferably being substantially similar to the element counters 130 of the load control 42. These counters 134 have as inputs thereto block size 118 and load rcount 110, both of which have substantially similar functions and parameters as the corresponding input signals of the load control's element counters 130. These counters 134 have as outputs therefrom out enable signals 136, which are described hereinafter with respect to the read access interface 30. Accordingly, the design and operation of the unload control 46 is readily understood by those of ordinary skill in the art by reference to the description of the load control 42, as well as the disclosures hereof generally.
With respect to the load control 42 and the unload control 46, it is preferred that each element counter 130 and 134 is operable independently of each of the other such counters 130 and 134.
Referring to Figure 8, the write access interface 26 preferably comprises N input terminals 32, each coupled respectively by a buffer 150 to one of N in queue registers 52. The buffers 150 implement the data input interface 50 of Figure 3. The in queue registers 52 preferably comprise queues, each controlled independently by a respective shift enable signal 132 received from the control interface 24. Each in queue register 52 has a depth Q and receives data serially while enabled by the respective shift enable signal 132, the shift enable signal 132 preferably enabling data reception only while valid data is to be enqueued for the respective transaction. As previously described, each in queue register 52 receives data synchronously with the system clock 68, either at the clock's frequency or at double that frequency, e.g. at both edges. In a typical implementation, Q=256 and N=8.
It is to be recognized that, when the data input terminals 32 are grouped to form a selected number P of ports 18 each having an associated word slice N/P bits wide, the in queue registers 52 preferably are grouped in N/P registers 52 per port. In that case, when executing a transaction descriptor identifying a particular port, the associated N/P in queue registers 52 are each enabled and disabled by the descriptor.
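As a minimal sketch of this grouping, the following Python function lists the registers belonging to a port. The assignment of consecutive register indices to each port is an assumption made for illustration; the specification states only that there are N/P registers per port. The function name `registers_for_port` is hypothetical.

```python
def registers_for_port(port, n_registers, n_ports):
    """Return the indices of the in queue registers grouped under one port.

    Assumes (for illustration only) that port p owns a contiguous run of
    N/P register indices.
    """
    if n_ports < 1 or n_registers % n_ports != 0:
        raise ValueError("N must be a whole multiple of P")
    if not 0 <= port < n_ports:
        raise ValueError("port index out of range")
    per_port = n_registers // n_ports      # word slice width, N/P
    start = port * per_port
    return list(range(start, start + per_port))


# With the typical N=8: two ports give a 4-bit word slice per port.
slice_for_port_1 = registers_for_port(1, 8, 2)
```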
Each in queue register's enqueued data is received, in parallel, by the routing/mask circuit 54. This reception includes up to Q bits, and is controlled by the queue select 106 which the routing/mask circuit 54 receives from the control interface 24. As described above, queue select 106 selects one of the in queue registers 52 for transfer of the data enqueued therein to the routing/mask circuit 54.
The routing/mask circuit 54 preferably provides for routing of data from the in queue registers 52 to the addressed locations in the RAM array 56 and, to do so, generates masking control signals that enable only the valid data to be replaced in the write back registers of the sense/write circuit 60. As shown in Figure 8, the routing/mask circuit 54 preferably comprises a multiplexer 152, a position shifter 154, and a shift count and write mask generator 156. The multiplexer 152 selectably receives the data enqueued by the particular register 52 identified by the queue select 106 and routes it to the position shifter 154. It is to be recognized that, when the in queue registers 52 are grouped as N/P registers 52 per port, the execution of a write transaction descriptor engenders consecutive retrievals of data from the implicated registers 52.
The position shifter 154 preferably comprises a barrel shifter for rotating the data received from the multiplexer 152 and for transferring the rotated data to the RAM core 28. The position shifter 154 is responsive to shift_count signal 158 provided by the routing/mask circuit's shift count and write mask generator 156. The position shifter 154 rotates the data to adjust for the extent the data was pushed into the respective in queue register 52 and to provide for the data's relative position in a section 57 as addressed in the associated transaction descriptor. The position shifter 154 preferably transfers the data to the RAM core 28 in Q parallel bits over write data signals 160.
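The rotation can be illustrated with a short Python model. It is a sketch under stated assumptions, not the circuit of Figure 8: it assumes that each bit enters an in queue register at the highest index and shifts toward index 0, and that the shifter rotates toward higher indices with wraparound. The function names `enqueue` and `rotate` are hypothetical. Under those assumptions, a rotation by the sum of the base column and the block size, taken modulo Q (the shift count that the generator 156 is described as producing), places the first valid bit at the base column.

```python
def enqueue(q, block):
    """Serially shift a block into a Q-deep register (assumed direction)."""
    register = [None] * q
    for bit in block:
        # Existing contents move one place toward index 0; the new bit
        # enters at index q - 1.
        register = register[1:] + [bit]
    return register


def rotate(bits, shift_count):
    """Barrel-rotate: the element at index i moves to (i + shift_count) % Q."""
    q = len(bits)
    s = shift_count % q
    return [bits[(i - s) % q] for i in range(q)]


Q = 8
block = ["a", "b", "c"]
register = enqueue(Q, block)                 # valid data sits at indices 5..7

# Base column 2: the block lands at columns 2..4 of the section.
inside = rotate(register, 2 + len(block))

# Base column 6: the block wraps, so "c" lands at column 0, which the
# masks would direct into the next consecutive section.
wrapped = rotate(register, 6 + len(block))
```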
It is to be recognized that the routing/mask circuit 54 can be implemented without using the position shifter 154, without departing from the principles of the invention. For example, the data is enqueued into the in queue registers 52 by sequentially loading starting at any appropriate position in such registers 52 during the respective load operation. This alternative relies on implementing a shift function in each in queue register 52. Accordingly, this alternative implicates having additional circuitry in such registers 52 while not having the position shifter 154 in the routing/mask circuit 54.
Referring to Figures 8 and 9, the routing/mask circuit's shift count and write mask generator 156 preferably comprises an adder 170, an end range disables circuit 172, a base range enables circuit 174 and a base section write mask generation circuit 176. The circuits 172, 174 and 176 preferably comprise decoding logic.
The generator 156 has base column 116 and block size 118 as input signals, which are received from the control interface 24. Responsive to such signals, the generator 156 generates (i) shift count 158 for routing to the position shifter 154, (ii) two sections 122 for routing to the control interface 24, and (iii) base section mask 178 and next section mask 180 for routing to the RAM core 28.
Base section mask 178 and next section mask 180 comprise the masking control signals that enable only the valid data to be replaced in the write back registers of the sense/write circuit 60. More specifically, base section mask 178 selects the bits to be replaced within each selected section 57 associated with the transaction descriptor being executed. To do so, base section mask 178 preferably comprises a map of Q mask bits: each bit corresponds to a respective signal in write data 160 such that, when a mask bit is asserted, the section bit is replaced with the respective bit carried over that write_data signal 160. Next_section_mask 180 performs a function substantially similar to that of base_section_mask 178, except it provides for bit replacement in the consecutive section 57 next-following the selected section 57, so as to accommodate a RAM access that crosses the boundary between two sections.
To generate these masking control signals, the generator's adder 170 adds the base_column 116 and the block_size 118. The adder's resulting value comprises shift count 158, while the adder's carry out comprises two sections 122. The base range enables circuit 174 decodes base column 116 to generate enables from the addressed base column (i.e., the relative position in a section 57 where valid data begins) to the end of the section 57 associated with the base column 116. The end range disables circuit 172 decodes shift count 158 and two sections 122 to obtain, relative to the section 57 of the base column 116, disables for all columns following the end of valid data to the end of the next consecutive section 57. The end of valid data may fall either in the base column's section 57 or in the next consecutive section. The disables falling in the next consecutive section comprise the next section mask 180. The disables falling in the base column's section 57 are routed to the base section write mask generation circuit 176, together with the enables generated by the base range enables circuit 174. The generation circuit 176, which preferably comprises a set of AND gates, combines the corresponding bits received from the circuits 174 and 172 to generate base section mask 178.
Where valid data crosses a row boundary, the RAM access control 44 preferably generates a second sequence of the access signals row address 112, start read 92 and start_write 100, responsive to two_sections 122 as previously described. However, additional masking control signals preferably are not generated. That is, the RAM access control 44 generates the second sequence of access signals so that the original next section mask 180 can be used to identify the valid data of the next section even though the next section is in a row separate from the base section.
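The mask generation performed by the adder 170 and the circuits 172, 174 and 176 can be sketched in Python as follows. This is a behavioral illustration only. It assumes Q is the section depth, that block sizes run from 1 to Q, and that masks are modeled as lists in which True means "replace this bit"; the signal polarities in the actual circuit are not specified here and are an assumption of the sketch. The function name `generate_masks` is hypothetical.

```python
def generate_masks(base_column, block_size, q):
    """Model the shift count and write mask generator for a Q-bit section.

    Returns (shift_count, two_sections, base_section_mask, next_section_mask).
    """
    if not 0 <= base_column < q or not 1 <= block_size <= q:
        raise ValueError("base_column or block_size out of range")

    total = base_column + block_size      # adder
    shift_count = total % q               # adder result, Q-deep
    two_sections = total // q             # adder carry out (0 or 1)

    # Base range enables: from the base column to the end of the base section.
    base_enables = [col >= base_column for col in range(q)]

    # End range disables, decoded from shift_count and two_sections: True for
    # every column at or past the end of valid data, across both sections.
    end = two_sections * q + shift_count
    disables = [pos >= end for pos in range(2 * q)]

    # Base section mask: enabled and not disabled (the AND-gate stage).
    base_section_mask = [en and not dis
                         for en, dis in zip(base_enables, disables[:q])]
    # Next section mask: the next-section columns that are not disabled.
    next_section_mask = [not dis for dis in disables[q:]]
    return shift_count, two_sections, base_section_mask, next_section_mask


# Three bits at base column 6 of an 8-bit section cross into the next section.
shift_count, two_sections, base_mask, next_mask = generate_masks(6, 3, 8)
```

In this model a block that exactly fills the base section to its last column also produces a carry out, but the resulting next section mask selects no bits, so nothing is replaced in the next section.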
Referring to Figures 10 and 11, the RAM core 28 preferably comprises the RAM array 56 for storing data; the row access control 58 for enabling and controlling accesses of the RAM array 56; and the sense amplifiers and write back registers 60 for both buffering data to and from the RAM array 56 and temporarily storing a row of accessed data. The RAM array 56, as previously described, preferably comprises a conventional memory array, and has R rows and C columns. The row access control 58 preferably comprises decoding logic. The control 58 receives row address 112, start read 92 and start_write 100 from the control interface 24, generates row enables 190 and ram write 192 for routing to the RAM array 56, and generates ram read 194 for routing to the sense amplifiers and write back registers 60. Row enables 190, generated from the decode of the row address 112, enable access to the rows of the RAM array 56. Row enables 190 preferably comprises R signals, each signal corresponding to a respective row of the RAM array 56. In operation, preferably only one signal of row enables 190 is asserted at a time so as to limit access of the RAM array 56 to only one row at a time. Ram write 192 and ram read 194 comprise timing signals that control the RAM array 56 and the sense amplifiers and write back registers 60, respectively, in buffering data therebetween. Ram write 192 and ram read 194 each preferably comprise one signal. In generating ram write 192 and ram read 194, the row access control 58 is responsive to start write 100, start read 92 and row address 112 in the execution of write and read transaction descriptors. Accordingly, when a RAM access crosses a row boundary, the second sequence of access signals generated by the RAM access control 44 preferably triggers the generation of a corresponding second sequence of ram write and ram read signals 192 and 194.
The sense amplifiers and write back registers 60 comprise sense amplifiers 200 and a write back register 202. As shown in Figure 11, both the sense amplifiers 200 and the write back register 202 are logically organized in S sections, each corresponding to a respective section 57 of a RAM array row. Accordingly, each section of the amplifiers 200 and write back register 202 buffers data for Q columns of the RAM array 56, Q being the depth of each section 57. It is to be recognized, however, that the sense amplifiers 200 and write back register 202 preferably have one sense amplifier and one register element respectively for each column of the RAM array 56.
The sense amplifiers 200 buffer data to and from the RAM array 56 over ram data 196. If the RAM array 56 is DRAM, a complete row, comprising C bits of data, is read into the sense amplifiers 200 from the array and written back to the array on every access. Accordingly, ram data 196 preferably comprises C signals. Because the sense amplifiers 200 are organized in sections, the signals of ram data 196 preferably are organized in S groups, each group having Q signals.
The write back register 202, in read transactions, routes data to the read access interface 30 over read_data 198. Corresponding to the physical and logical organization of the write back register 202, read data 198 preferably comprises C signals that are organized in S groups, each group having Q signals. Each group of Q signals of read data 198 is associated with a respective logical section of the write back register 202.
Ram read 194 causes the data sensed by the sense amplifiers 200 to be latched in the write back register 202 for temporary storage, the row being enabled by one signal of row_enables 190. If the access corresponds to execution of a read transaction, the one or more sections 57 of data corresponding to the transaction are routed over read data 198 to the read access interface 30 before the read data is written back to the RAM array 56.
If the access corresponds to execution of a write transaction, the write back register 202 receives new data from the write access interface 26 over write data 160. As previously described, write data 160 preferably comprises Q parallel signals, where Q is the depth of each in queue register 52. Accordingly, Q bits of new data, so received, replace the appropriate data in the write back register 202 in each clock cycle preceding writing of the data back to the enabled row of the RAM array 56. Ram write 192 writes all of the data from the write back register 202 to the enabled row of the RAM array 56 whether or not data has been replaced in every section of the register 202.
Each read and write transaction preferably is associated with one or two RAM accesses so as to comprise transfer of up to C bits of data, C being the number of columns in a full row of the RAM array 56. As a first example, if the circuit 36 is configured as one port (P=1), then up to C bits of valid data can be transferred because all of the in queue registers 52 are associated with that port. In that case, if all of the in queue registers 52 are full of data and the data is to be written starting at the beginning of a row, that data will replace the read data in each of the corresponding sections of the write back register 202 prior to writing back to the RAM array 56. Moreover, all of that data will be written to the RAM array 56 in one RAM access. If, however, the data is to be written starting other than at the beginning of a row, two RAM accesses are necessary to write the data to the RAM array 56. As a second example, if the circuit 36 is configured as N ports where N is the number of in queue registers 52, only the valid data in the single in queue register 52 associated with the port replaces data in the write back register 202. Accordingly, fewer than C bits are transferred. Nevertheless, two RAM accesses may be necessary in writing the data to the RAM array 56, depending on where the writing of data is to start relative to the end of a row.
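The two examples can be checked with a small Python helper that counts the RAM accesses a transfer needs. It is a sketch only: the function name `ram_accesses` is hypothetical, and the row width C = 2048 used below is an assumed illustration (N = 8 registers of depth Q = 256 filling one row), not a figure stated in the specification.

```python
def ram_accesses(start_column, n_bits, c):
    """Return 1 if the transfer fits in the addressed row, else 2.

    start_column is the first column written within the row, n_bits the
    amount of valid data, and c the number of columns in a row.
    """
    if not 0 <= start_column < c or not 1 <= n_bits <= c:
        raise ValueError("start_column or n_bits out of range")
    # Data running past the last column spills into the next row, which
    # requires a second access.
    return 1 if start_column + n_bits <= c else 2


C = 2048                                   # assumed row width for illustration
full_row_aligned = ram_accesses(0, C, C)   # first example, start of row
full_row_offset = ram_accesses(1, C, C)    # first example, any other start
single_register = ram_accesses(2000, 256, C)   # second example, near row end
```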
The write back register 202 preferably comprises flip flops that select between the output of the sense amplifiers 200 and the bits received from the write access interface 26. As shown in Figure 11, each section of the write back register 202 receives in parallel the bits from the write access interface 26. Each section also receives a respective signal of base_mask_enables 102, next_mask_enables 104, base section mask 178 and next section mask 180. If the signal of base mask enables 102 associated with a particular section of the write back register 202 is asserted, bit replacement is enabled for that section. The base section mask 178 determines which bits are replaced in the enabled section. Where the replacing data crosses a section boundary, the signal of next mask enables 104 associated with the next consecutive section of the write back register 202 is asserted, enabling bit replacement in that section. The next section mask 180 determines which bits are replaced in that enabled next section.
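The replacement step can be modeled in Python as follows. This is an illustrative sketch, not the circuit of Figure 11: the write back register is modeled as a list of S sections of Q entries, the per-section enables are derived inside the function from a section index, and a boundary crossing out of the last section (which the description handles with a second RAM access) is deliberately not modeled. The function name `replace_bits` is hypothetical.

```python
def replace_bits(write_back, section, write_data, base_mask, next_mask):
    """Return a copy of the write back register with masked bits replaced.

    write_back : list of S sections, each a list of Q bits read from the row
    section    : index of the base section addressed by the descriptor
    write_data : Q rotated bits from the write access interface
    base_mask, next_mask : Q booleans each, True meaning "replace"
    """
    s, q = len(write_back), len(write_data)
    crosses = any(next_mask)
    if crosses and section + 1 >= s:
        # Crossing into the next row needs a second access; not modeled here.
        raise ValueError("row boundary crossing is outside this sketch")

    # One enable per section, asserted only for the base and next sections.
    base_mask_enables = [i == section for i in range(s)]
    next_mask_enables = [crosses and i == section + 1 for i in range(s)]

    updated = [list(sec) for sec in write_back]
    for i in range(s):
        for col in range(q):
            if ((base_mask_enables[i] and base_mask[col])
                    or (next_mask_enables[i] and next_mask[col])):
                updated[i][col] = write_data[col]
    return updated


# Two 8-bit sections of old data "x"; three new bits start at column 6.
row = [["x"] * 8, ["x"] * 8]
write_data = ["c", "?", "?", "?", "?", "?", "a", "b"]   # "?" is invalid data
base_mask = [False] * 6 + [True, True]
next_mask = [True] + [False] * 7
updated = replace_bits(row, 0, write_data, base_mask, next_mask)
```

Every bit not selected by a mask keeps the value read from the row, so writing the whole register back leaves the rest of the row unchanged.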
Referring to Figure 12, the read access interface 30 preferably comprises N output terminals 34, each coupled respectively by a buffer 210 to one of N out queue registers 64. The buffers 210 implement the data output interface 66 of Figure 3. The out queue registers 64 preferably comprise queues, each controlled independently by a respective out enable signal 136 received from the control interface 24. The out enable signals 136 enable and disable routing of data from the out queue registers 64 to the buffers 210, and control the buffering of that data through the buffers 210 to the data output terminals 34. The out enable signals 136 preferably enable routing only while valid data is enqueued for the respective transaction. Each out queue register 64 has a depth Q and, while enabled by the respective out enable signal 136, serially routes data to the respective buffer 210. As previously described, each out queue register 64 routes data synchronously with the system clock 68, either at the clock's frequency or at double that frequency, e.g. at both edges. In a typical implementation, as previously described with respect to the in queue registers 52, Q=256 and N=8. It is to be recognized that, when the data output terminals 34 are grouped to form a selected number P of ports 18 each having an associated word slice N/P bits wide, the out queue registers 64 preferably are grouped in N/P registers 64 per port. In that case, execution of a read transaction descriptor identifying a particular port entails enabling and disabling each of the associated N/P out queue registers 64.
Each out queue register 64 receives data, in parallel, from the read data routing circuit 62. This transmission includes up to Q bits and is controlled by the register's respective load_enable signal 108. When a load_enable signal 108 is asserted, the signal's respective out queue register 64 is enabled to receive data. The read data routing circuit 62 provides for routing of data from the RAM core 28 to the respective out queue register 64 associated with the data's corresponding read transaction. The read data routing circuit 62 receives section select 114 and base column 116 from the control interface 24 and receives data in sections from the RAM core 28 over read data 198. The read data routing circuit 62 comprises a multiplexer 212 and a justify shifter 214. The multiplexer 212 selects the section 57 of RAM array data identified by section select 114, as well as the next consecutive section 57 in order to accommodate crossing of section boundaries by the valid data. The justify shifter 214 receives the two sections of data selected by the multiplexer 212 and, responsive to base column 116, justifies the data so that the initial bit of the valid data is loaded into the first location in the respective out queue register 64. To route the justified data to the appropriate register 64, the justify shifter 214 is coupled in parallel to each out queue register 64. It is to be recognized that, when the out queue registers 64 are grouped as N/P registers 64 per port, the execution of a read transaction descriptor engenders consecutive routings of data from the justify shifter 214 to the implicated registers 64.
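The selection and justification of read data can be sketched in Python as follows. This is a behavioral illustration, not the circuit of Figure 12: a row is modeled as a list of S sections of Q entries, and when the selected section is the last in the row the missing next section is filled with placeholders, since the row crossing case (served by a second RAM access in the description) is not modeled. The function name `route_read` is hypothetical.

```python
def route_read(row_sections, section_select, base_column):
    """Select two consecutive sections and justify to the base column.

    Returns Q entries whose index 0 holds the bit stored at base_column of
    the selected section.
    """
    q = len(row_sections[section_select])
    if not 0 <= base_column < q:
        raise ValueError("base_column out of range")

    # Multiplexer stage: the selected section plus the next consecutive one.
    base = row_sections[section_select]
    if section_select + 1 < len(row_sections):
        following = row_sections[section_select + 1]
    else:
        following = [None] * q            # next row not modeled in this sketch
    two_sections = list(base) + list(following)

    # Justify stage: shift so the first valid bit lands in the first location.
    return two_sections[base_column:base_column + q]


row = [list("abcdefgh"), list("ijklmnop")]
justified = route_read(row, 0, 6)          # data starts at column 6
block = justified[:3]                      # a 3-bit block crossing the boundary
```

In this model the routing stage always delivers Q justified entries; how many of them are valid output is decided separately, which mirrors the role of the unload control's element counters.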
It is to be recognized that the read data routing circuit 62 can be implemented without using the justify shifter 214, without departing from the principles of the invention. For example, read data can be loaded from the RAM array 56 directly into the out queue registers 64 provided the loaded data may be output from such registers 64 starting at any randomly selectable position therein, such selected position corresponding to the beginning of the valid data. This alternative relies on implementing a random access function in each of the out queue registers 64. Accordingly, this alternative implicates having additional circuitry in such registers 64 while not having the justify shifter 214 in the read data routing circuit 62.
The operation of the memory circuit 36 is depicted in Figure 13 with reference to the in queue registers 52, the multiplexer 152 and position shifter 154 of the write data routing and section write mask circuit 54, the write back register 202, the RAM array 56, the multiplexer 212 and the justify shifter 214 of the read data routing circuit 62, and the out queue registers 64. The memory circuit 36 is configured to have N ports 18 and is depicted receiving, at the in queue registers 52, data block 250 from port 251, data block 252 from port 253 and data block 254 from port 255. Data block 252 is received first; data block 250 is received second; and data block 254 is received third. Each of the data blocks 250, 252 and 254 is depicted being routed from the out queue registers 64 at ports 251, 253 and 255, respectively. Data block 250 is routed first; data block 254 is routed second; and data block 252 is routed third.
The memory circuit's operation is illustrated for three types of transaction descriptors: load descriptors 260, write descriptors 262 and read descriptors 264. One descriptor of each type is contemplated to trigger memory circuit operations respecting each data block 250, 252 and 254. Generally, each descriptor type has associated therewith a series of predefined steps.
Load descriptors 260 preferably have as a principal step the loading of data into the in queue registers 52. Each load descriptor 260 controls the loading of data into the one or more registers 52 corresponding to the port 18 associated with the descriptor 260. Load descriptors 260 preferably are accepted at any time. Moreover, each port 18 can load data in response to a load descriptor 260 associated with that port while any or all other ports 18 are loading data in response to load descriptors associated therewith. Moreover, execution of load descriptors 260 is independent of execution of both write descriptors 262 and read descriptors 264. Once the memory circuit 36 initiates a load in response to a load descriptor 260, the circuit preferably executes the load operations to completion independent of all other memory circuit activity. If the memory circuit 36 is configured for multiple ports 18, as shown in Figure 13, multiple load descriptors 260 can be in various stages of execution at any given time. The load descriptors 260 can accommodate data blocks ranging from one bit up to Q bits, where Q preferably is equal to the depth of the in queue registers 52.
Write descriptors 262 preferably have as principal steps a funnel operation 266, a position operation 268, a replace operation 270 and a store operation 272. Through these steps, each write descriptor 262 provides for transferring data from the in queue registers 52 associated with the descriptor's port to the RAM array 56 for storage at an address specified in the descriptor. Although as shown each port has associated therewith a single in queue register 52, it is to be recognized that each port may have a plurality of associated registers 52, without departing from the principles of the invention.
The funnel operation 266 selects the in queue registers 52 associated with the descriptor's port for transfer of the data enqueued, at one register per clock cycle, to the position shifter 154. The funnel operation 266 employs the multiplexer 152 of the routing/mask circuit 54. The position operation 268 shifts the valid data received from each in queue register 52 to provide for positioning the data in a section 57 in accordance with the addressing of the descriptor, or in two consecutive sections 57 when the positioning causes the data to cross a section boundary. The replace operation 270 employs the write back register 202 to replace data read from the RAM array 56 into the write back register 202 with the valid data from the in queue registers 52. The replace operation 270 replaces bits starting with the section 57 in which the descriptor's base address resides and moves through sequential sections, one for each in queue register 52 associated with the write descriptor 262. The mask signals 102, 104, 178 and 180 are employed in this operation to determine which bits get replaced, including when data blocks cross section boundaries in the replacement operation 270. The store operation 272 transfers the entire contents of the write back register into the enabled row of the RAM array 56 responsive to the write descriptor 262. As previously described, a single write descriptor may engender two accesses to the RAM array 56 when data blocks cross a row boundary.
Figure 13 depicts execution of a sequence of write descriptors 274, 276 and 278 associated with data blocks 252, 254 and 250, respectively. The write descriptor 274 has progressed to the replace operation 270, while the write descriptor 276 is ready to begin the position operation 268 and the write descriptor 278 is completing the funnel operation 266.
The progress in execution of the write descriptors 274, 276 and 278 preferably reflects the order of the descriptors' receipt by the memory circuit 36.
Read descriptors 264 preferably have as principal steps a fetch operation 280, a funnel operation 282, a justify operation 284 and an unload operation 286. The fetch operation 280 comprises reading a complete row of data from the RAM array 56, as addressed by the read descriptor 264. The funnel operation 282 comprises transferring, to the justify operation's justify shifter 214, two sections of fetched data for each out queue register 64 corresponding to the port 18 of the descriptor 264, each register's two sections being transferred in a single clock cycle. Funnelling two consecutive sections of data ensures reading all bits of a data block that crosses section boundaries, i.e., if a data block is stored in the RAM array 56 so as to cross a section boundary, the data block is stored in two consecutive sections and can be read from the array by operating on both the addressed section and the next consecutive section. The justify operation 284 justifies the funnelled data so that the initial bit of valid data is loaded into the first location in the respective out queue register 64. The unload operation 286 comprises routing the justified data from the memory circuit 36 through the out queue registers 64 corresponding to the port 18 associated with the read descriptor 264. Once initiated by the read descriptor 264, the unload operation 286 preferably executes to completion independent of any other memory circuit activity. If the memory circuit 36 is configured for multiple ports 18, as shown in Figure 13, multiple unload operations 286 can be in various stages at any given time.
Figure 13 depicts execution of a sequence of read descriptors 288, 290 and 292 associated with data blocks 250, 254 and 252, respectively. The read descriptor 288 has progressed to the unload operation 286. The read descriptor 290, having completed the funnel operation 282, is ready to begin the justify operation 284. The read descriptor 292 has completed the fetch and funneling operations 280 and 282. As shown for data block 250, the justify operation 284 justifies the data into one section even if, as fetched, it crosses section boundaries. The progress in execution of the read descriptors 288, 290 and 292 preferably reflects the order of their receipt by the memory circuit 36. Moreover, when both write and read descriptors 262 and 264 are received by the memory circuit 36, accesses to the RAM array 56 preferably are executed sequentially in the order of the descriptors' receipt. In addition, write and read descriptors 262 and 264 will only be accepted by the memory circuit 36 when q_ready 78 is asserted. Load descriptors 260 preferably are accepted at any time.
Figures 14 through 23 are timing diagrams further depicting the operation of the memory circuit 36. Figure 14 shows the load timing for one port 18 writing a block of eight words to the in queue registers 52 using a one-cycle transaction descriptor. Figure 15 shows the load timing for one port 18 writing a block of nine or more words to the in queue registers 52 using a two-cycle transaction descriptor. Figure 16 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor. Figure 17 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/2 ports 18, the descriptor being a three-cycle transaction descriptor. Figure 18 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/4 ports 18, the descriptor being a four-cycle transaction descriptor. Figure 19 shows an access of the RAM array 56 corresponding to a write descriptor for one port 18 in a memory circuit 36 having N/4 ports 18, the descriptor being a four-cycle transaction descriptor. The operations shown in Figure 19 differ from those shown in Figure 18 in that the access crosses a row boundary, with the contents of the first in queue register 52 and of part of the second in queue register 52 written to the end of the addressed row, while the contents of the other part of the second in queue register 52 and of the third and fourth in queue registers 52 are written at the beginning of the next consecutive row. Figure 20 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor.
Figure 21 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N/2 ports 18, the descriptor being a four-cycle transaction descriptor. Figure 22 shows an access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N/4 ports 18, the descriptor being a three-cycle transaction descriptor. Figure 23 shows a read access of the RAM array 56 corresponding to a read descriptor for one port of a memory circuit 36 having N ports 18, the descriptor being a four-cycle transaction descriptor. The operations shown in Figure 23 differ from those shown in Figure 22 in that the addressed data crosses a row boundary, with the data for the first, second and third out queue registers 64 being read from the end of the addressed row and the data for the fourth out queue register 64 being read from the beginning of the next consecutive row.
The memory circuit's control interface 24, in the above Figures, is shown to receive transaction descriptors from the transaction control bus 16 and, in response thereto, to generate command and control signals for communication to the other elements of the memory circuit 36. The write access interface 26 provides buffered data paths for the flow of data into the RAM core 28. The interface, responsive to receipt of load descriptors 260, controls the flow of data into the in queue registers 52, the data from each input terminal 32 being loaded into a respective in queue register 52. The in queue registers 52 can be grouped in association with a respective port 18. The enqueued data is written to the RAM core 28 responsive to receipt of write descriptors. A single write descriptor 262 transfers all valid data to the RAM core 28 from the in queue registers 52 associated with the particular port 18 corresponding to the descriptor 262. In this transfer, the data is routed through the multiplexer 152 and the position shifter 154. These elements provide for writing the valid data into the RAM array 56 starting at any column of an addressed row. Thence, the memory circuit 36 provides for placing in the RAM array 56 a block of data, the size of the block being independently selectable and the placement of the block in the RAM array 56 starting at an independently selectable position. In addition, the memory circuit 36 provides for storing various blocks of data at independently selectable positions in the RAM array 56.
The memory circuit's read access interface 30 provides buffered data paths for the flow of data from the RAM core 28. Responsive to receipt of a read descriptor 264, data is read from the RAM array 56 in a complete row. Sections thereof are routed through the multiplexer 212 and the justify shifter 214 so that one or more complete or partial sections of valid data are selectable to comprise an output block. Block size is independently selectable from read descriptor to read descriptor. Each block of valid data is routed to the out queue registers 64, the placement of the blocks in the registers 64 being selectable. The out queue registers 64 can be grouped in association with a respective port 18. A single read descriptor 264 transfers all valid data associated with a port 18 corresponding to the descriptor 264 from the RAM array 56 to the one or more out queue registers 64 associated with that port 18. Because the valid data may be stored at independently selectable positions in the RAM array 56, it may be retrieved therefrom. During the execution of the read descriptor 264, read 80 is asserted and de-asserted and, thereafter, data is communicated at the respective output terminals 34. The system component 14 that issued the read descriptor 264 receives the data a fixed number of system clock cycles after the de-assertion of read 80.
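The read path (whole-row read, section selection, and justified placement in the out queue registers) can be modeled in the same simplified way. The function name `read_block`, its parameters, and the use of `None` for unfilled registers are assumptions for illustration; for brevity this sketch also assumes the block lies within a single row.

```python
def read_block(ram, row, start_col, block_size, queue_offset, queue_len):
    """Read one row, select block_size sections, and justify them in a queue.

    Returns a list modeling the out queue registers of one port. Entries
    that receive no data are None.
    """
    full_row = ram[row]  # the array is always read a complete row at a time
    assert start_col + block_size <= len(full_row)
    assert queue_offset + block_size <= queue_len
    # Multiplexer stage: pick the sections that make up the output block.
    selected = [full_row[start_col + i] for i in range(block_size)]
    # Justify stage: place the block at the selected queue position.
    out_queue = [None] * queue_len
    for i, value in enumerate(selected):
        out_queue[queue_offset + i] = value
    return out_queue


ram = [["a", "b", "c", "d", "e", "f"]]  # hypothetical single-row array
print(read_block(ram, row=0, start_col=3, block_size=2,
                 queue_offset=0, queue_len=4))
# prints ['d', 'e', None, None]
```

Changing `block_size` or `queue_offset` from one call to the next corresponds to the block size and placement being selectable from read descriptor to read descriptor.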
The memory circuit 36 can be packaged in various ways, including having separate data input and output terminals 32 and 34 or having a single set of terminals that are shared for input and output. Separate input and output terminals 32 and 34 allow for full-duplex operation, while shared terminals allow for support of additional ports in a package of fixed pin count.
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.

Claims

What is claimed is:
1. A digital electronic system, comprising:
one or more system components, and
a memory, coupled through a plurality of distinct ports to respective said system components, said memory selectively storing and communicating data among said system components coupled thereto.
2. The digital electronic system of claim 1, further comprising a transaction control bus, coupled to each of said selected system components and to said memory, for communicating command and control signals among said system components and said memory.
3. A memory circuit, comprising:
a memory array;
a plurality of communication ports; and
control circuitry for transferring selected sets of data between selected locations within said memory array and selected communication ports.
4. The memory circuit of claim 3, wherein said communication ports comprise a plurality of input terminals, and said control circuitry includes input circuitry for selecting a set of input data from an input terminal and placing said set of input data in a selected location in said memory array.
5. The memory circuit of claim 4, wherein said communication ports further comprise a plurality of output terminals and said control circuitry includes output circuitry for selecting a set of output data from a selected location in said memory array and sending said set of output data to a selected output terminal.
6. The memory circuit of claim 4, wherein said input circuitry comprises a queue for receiving from a selected input terminal input data to be written to said memory array, and a selection circuit for placing in said memory array a contiguous block of said input data, the size of said block being selectable and the placement being selectable.
7. The memory circuit of claim 3, wherein said communication ports comprise a plurality of output terminals and said control circuitry includes output circuitry for selecting a set of output data from a selected location in said memory array and sending said set of output data to a selected output terminal.
8. The memory circuit of claim 7, wherein said output circuitry comprises a queue for receiving output data from said memory array, and a selection circuit for placing in said queue a contiguous block of said output data, the size of said block being selectable and the placement being selectable.
9. The memory circuit of claim 3, wherein the transfer of data at said communication ports is synchronized to a system clock.
10. The memory circuit of claim 3, wherein said control circuitry comprises a control interface for coupling said memory circuit to a transaction control bus so as to communicate command and control signals between said bus and said memory circuit, and for controlling operations of said memory circuit responsive to receipt of command and control signals.
11. A read interface for a memory array having rows and columns, comprising:
a queue for receiving data read from a row of the memory array; and
a selection circuit for placing in said queue a contiguous block of said data, the size of said block being selectable and the placement being selectable.
12. The read interface of claim 11, wherein said selection circuit comprises a justify shifter for controlling placement of said data in said queue.
13. The read interface of claim 11, wherein said queue comprises circuitry for randomly accessing, for output, the received data.
14. The read interface of claim 11, further comprising a plurality of queues and wherein said selection circuit is adapted to place independently selectable blocks of said data in independently selectable positions in selected said queues.
15. The read interface of claim 14, wherein said selection circuit comprises a multiplexer for receiving and selecting data.
16. The read interface of claim 15, wherein said selection circuit comprises a justify shifter, coupled to said multiplexer and to said queues, for controlling placement of said data in said queues.
17. The read interface of claim 15, wherein at least one of said queues comprises circuitry for randomly accessing for output the received data.
18. A write interface for a memory array having rows and columns, comprising:
a queue for receiving data to be written to said memory array; and
a selection circuit for placing in said memory array a contiguous block of said data, the size of said block being selectable and the placement being selectable.
19. The write interface of claim 18, wherein each row of the memory array is organized in one or more sections, each section having a selected number of bits, and said selection circuit comprises a shifting circuit for controlling placement of said data in each of said sections of said memory array.
20. The write interface of claim 19, wherein said selection circuit comprises a write mask generator for controlling writes of said data to selected bits of said sections.
21. The write interface of claim 20, wherein said write mask generator comprises a base section write mask generator and a next section write mask generator for controlling writes of said data to respective, selected bits of a base section and a next section in the memory array.
22. The write interface of claim 18, wherein each row of the memory array is organized into one or more sections, each section having a selected number of bits, and said queue comprises circuitry for enqueuing data into said queue starting at any appropriate position therein, prior to placing said data in said memory array.
23. The write interface of claim 18, further comprising a logic circuit for identifying when a write of said data to the memory array engenders writing across two rows.
24. The write interface of claim 18, further comprising a plurality of queues and wherein said selection circuit is adapted to place independently selectable data from selected said queues in independently selectable positions in said memory array.
25. The write interface of claim 24, wherein said selection circuit comprises a multiplexer for receiving and selecting data.
26. The write interface of claim 25, wherein said selection circuit comprises a position shifter, coupled to said multiplexer and to the memory array, for controlling placement of said data in the memory array.
27. The write interface of claim 24, wherein at least one of said queues comprises circuitry for enqueuing data into said queue starting at any appropriate position therein, prior to placing said data in said memory array.
PCT/US1995/010684 1994-09-01 1995-08-22 A multi-port memory system including read and write buffer interfaces WO1996007139A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU34122/95A AU3412295A (en) 1994-09-01 1995-08-22 A multi-port memory system including read and write buffer interfaces

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US30042194A 1994-09-01 1994-09-01
US08/300,421 1994-09-01

Publications (1)

Publication Number Publication Date
WO1996007139A1 (en) 1996-03-07

Family

ID=23159037

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1995/010684 WO1996007139A1 (en) 1994-09-01 1995-08-22 A multi-port memory system including read and write buffer interfaces

Country Status (3)

Country Link
US (2) US5802580A (en)
AU (1) AU3412295A (en)
WO (1) WO1996007139A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0875854A2 (en) 1997-04-30 1998-11-04 Canon Kabushiki Kaisha Reconfigurable image processing pipeline

Families Citing this family (82)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6108756A (en) * 1997-01-17 2000-08-22 Integrated Device Technology, Inc. Semaphore enhancement to allow bank selection of a shared resource memory device
US6212607B1 (en) 1997-01-17 2001-04-03 Integrated Device Technology, Inc. Multi-ported memory architecture using single-ported RAM
US6006296A (en) * 1997-05-16 1999-12-21 Unisys Corporation Scalable memory controller
US6212597B1 (en) * 1997-07-28 2001-04-03 Neonet Lllc Apparatus for and method of architecturally enhancing the performance of a multi-port internally cached (AMPIC) DRAM array and like
US6067604A (en) * 1997-08-11 2000-05-23 Compaq Computer Corporation Space-time memory
US20010032278A1 (en) 1997-10-07 2001-10-18 Brown Stephen J. Remote generation and distribution of command programs for programmable devices
US6757746B2 (en) 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US7167927B2 (en) 1997-10-14 2007-01-23 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US6658480B2 (en) 1997-10-14 2003-12-02 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
US6427171B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US7237036B2 (en) * 1997-10-14 2007-06-26 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US8621101B1 (en) 2000-09-29 2013-12-31 Alacritech, Inc. Intelligent network storage interface device
US6389479B1 (en) 1997-10-14 2002-05-14 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US6687758B2 (en) 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US7133940B2 (en) 1997-10-14 2006-11-07 Alacritech, Inc. Network interface device employing a DMA command queue
US8782199B2 (en) 1997-10-14 2014-07-15 A-Tech Llc Parsing a packet header
US6226680B1 (en) 1997-10-14 2001-05-01 Alacritech, Inc. Intelligent network interface system method for protocol processing
US6807581B1 (en) 2000-09-29 2004-10-19 Alacritech, Inc. Intelligent network storage interface system
US7076568B2 (en) * 1997-10-14 2006-07-11 Alacritech, Inc. Data communication apparatus for computer intelligent network interface card which transfers data between a network and a storage device according designated uniform datagram protocol socket
US6427173B1 (en) 1997-10-14 2002-07-30 Alacritech, Inc. Intelligent network interfaced device and system for accelerated communication
US6434620B1 (en) 1998-08-27 2002-08-13 Alacritech, Inc. TCP/IP offload network interface device
US6470415B1 (en) * 1999-10-13 2002-10-22 Alacritech, Inc. Queue system involving SRAM head, SRAM tail and DRAM body
US6697868B2 (en) 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US7284070B2 (en) * 1997-10-14 2007-10-16 Alacritech, Inc. TCP offload network interface device
US7185266B2 (en) 2003-02-12 2007-02-27 Alacritech, Inc. Network interface device for error detection using partial CRCS of variable length message portions
US7042898B2 (en) 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US8539112B2 (en) 1997-10-14 2013-09-17 Alacritech, Inc. TCP/IP offload device
US7089326B2 (en) * 1997-10-14 2006-08-08 Alacritech, Inc. Fast-path processing for receiving data on TCP connection offload devices
US6591302B2 (en) 1997-10-14 2003-07-08 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US7174393B2 (en) 2000-12-26 2007-02-06 Alacritech, Inc. TCP/IP offload network interface device
US6480927B1 (en) * 1997-12-31 2002-11-12 Unisys Corporation High-performance modular memory system with crossbar connections
US6675189B2 (en) 1998-05-28 2004-01-06 Hewlett-Packard Development Company, L.P. System for learning and applying integrated task and data parallel strategies in dynamic applications
US7664883B2 (en) 1998-08-28 2010-02-16 Alacritech, Inc. Network interface device that fast-path processes solicited session layer read commands
US6574688B1 (en) * 1999-01-05 2003-06-03 Agere Systems Inc. Port manager controller for connecting various function modules
US6581145B1 (en) * 1999-03-03 2003-06-17 Oak Technology, Inc. Multiple source generic memory access interface providing significant design flexibility among devices requiring access to memory
DE19936080A1 (en) * 1999-07-30 2001-02-15 Siemens Ag Multiprocessor system for performing memory accesses to a shared memory and associated method
DE19937176A1 (en) * 1999-08-06 2001-02-15 Siemens Ag Multiprocessor system
US7010788B1 (en) 2000-05-19 2006-03-07 Hewlett-Packard Development Company, L.P. System for computing the optimal static schedule using the stored task execution costs with recent schedule execution costs
US6684270B1 (en) * 2000-06-02 2004-01-27 Nortel Networks Limited Accelerated file system that recognizes and reroutes uncontested read operations to a second faster path for use in high-capacity data transfer systems
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US6720074B2 (en) * 2000-10-26 2004-04-13 Inframat Corporation Insulator coated magnetic nanoparticulate composites with reduced core loss and method of manufacture thereof
US6560160B1 (en) * 2000-11-13 2003-05-06 Agilent Technologies, Inc. Multi-port memory that sequences port accesses
US6745257B2 (en) * 2001-01-04 2004-06-01 International Business Machines Corporation Method, system, and program for providing status in a multi-processing node system
US7904194B2 (en) 2001-02-09 2011-03-08 Roy-G-Biv Corporation Event management systems and methods for motion control systems
US6826657B1 (en) * 2001-09-10 2004-11-30 Rambus Inc. Techniques for increasing bandwidth in port-per-module memory systems having mismatched memory modules
US7333388B2 (en) * 2001-10-03 2008-02-19 Infineon Technologies Aktiengesellschaft Multi-port memory cells
US6988161B2 (en) * 2001-12-20 2006-01-17 Intel Corporation Multiple port allocation and configurations for different port operation modes on a host
US20030121835A1 (en) * 2001-12-31 2003-07-03 Peter Quartararo Apparatus for and method of sieving biocompatible adsorbent beaded polymers
US7543087B2 (en) 2002-04-22 2009-06-02 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgement that transmit data has been received by a remote device
US7496689B2 (en) 2002-04-22 2009-02-24 Alacritech, Inc. TCP/IP offload device
US7337241B2 (en) 2002-09-27 2008-02-26 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US7191241B2 (en) * 2002-09-27 2007-03-13 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US7571287B2 (en) * 2003-03-13 2009-08-04 Marvell World Trade Ltd. Multiport memory architecture, devices and systems including the same, and methods of using the same
US20060064503A1 (en) 2003-09-25 2006-03-23 Brown David W Data routing systems and methods
US8027349B2 (en) 2003-09-25 2011-09-27 Roy-G-Biv Corporation Database event driven motion systems
US6996070B2 (en) * 2003-12-05 2006-02-07 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
US7143332B1 (en) * 2003-12-16 2006-11-28 Xilinx, Inc. Methods and structures for providing programmable width and error correction in memory arrays in programmable logic devices
US8248939B1 (en) 2004-10-08 2012-08-21 Alacritech, Inc. Transferring control of TCP connections between hierarchy of processing mechanisms
US7738500B1 (en) 2005-12-14 2010-06-15 Alacritech, Inc. TCP timestamp synchronization for network connections that are offloaded to network interface devices
US8234425B1 (en) 2007-06-27 2012-07-31 Marvell International Ltd. Arbiter module
US7949817B1 (en) 2007-07-31 2011-05-24 Marvell International Ltd. Adaptive bus profiler
US8539513B1 (en) 2008-04-01 2013-09-17 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US8131915B1 (en) 2008-04-11 2012-03-06 Marvell Intentional Ltd. Modifying or overwriting data stored in flash memory
US8683085B1 (en) 2008-05-06 2014-03-25 Marvell International Ltd. USB interface configurable for host or device mode
US8341286B1 (en) 2008-07-31 2012-12-25 Alacritech, Inc. TCP offload send optimization
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies
US8423710B1 (en) 2009-03-23 2013-04-16 Marvell International Ltd. Sequential writes to flash memory
US8213236B1 (en) 2009-04-21 2012-07-03 Marvell International Ltd. Flash memory
US8688922B1 (en) 2010-03-11 2014-04-01 Marvell International Ltd Hardware-supported memory management
US8756394B1 (en) 2010-07-07 2014-06-17 Marvell International Ltd. Multi-dimension memory timing tuner
US20150063217A1 (en) * 2013-08-28 2015-03-05 Lsi Corporation Mapping between variable width samples and a frame
US10346168B2 (en) 2015-06-26 2019-07-09 Microsoft Technology Licensing, Llc Decoupled processor instruction window and operand buffer
US10409599B2 (en) 2015-06-26 2019-09-10 Microsoft Technology Licensing, Llc Decoding information about a group of instructions including a size of the group of instructions
US10175988B2 (en) 2015-06-26 2019-01-08 Microsoft Technology Licensing, Llc Explicit instruction scheduler state information for a processor
US10191747B2 (en) 2015-06-26 2019-01-29 Microsoft Technology Licensing, Llc Locking operand values for groups of instructions executed atomically
US9946548B2 (en) 2015-06-26 2018-04-17 Microsoft Technology Licensing, Llc Age-based management of instruction blocks in a processor instruction window
US10409606B2 (en) 2015-06-26 2019-09-10 Microsoft Technology Licensing, Llc Verifying branch targets
US9952867B2 (en) 2015-06-26 2018-04-24 Microsoft Technology Licensing, Llc Mapping instruction blocks based on block size
US10169044B2 (en) 2015-06-26 2019-01-01 Microsoft Technology Licensing, Llc Processing an encoding format field to interpret header information regarding a group of instructions
US10871967B2 (en) 2015-09-19 2020-12-22 Microsoft Technology Licensing, Llc Register read/write ordering
US11681531B2 (en) 2015-09-19 2023-06-20 Microsoft Technology Licensing, Llc Generation and use of memory access instruction order encodings
US10678544B2 (en) 2015-09-19 2020-06-09 Microsoft Technology Licensing, Llc Initiating instruction block execution using a register access instruction

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4630258A (en) * 1984-10-18 1986-12-16 Hughes Aircraft Company Packet switched multiport memory NXM switch node and processing method
US4780812A (en) * 1982-06-05 1988-10-25 British Aerospace Public Limited Company Common memory system for a plurality of computers
US4815038A (en) * 1987-05-01 1989-03-21 Texas Instruments Incorporated Multiport ram memory cell
US4930066A (en) * 1985-10-15 1990-05-29 Agency Of Industrial Science And Technology Multiport memory system
US5210701A (en) * 1989-05-15 1993-05-11 Cascade Design Automation Corporation Apparatus and method for designing integrated circuit modules
US5247649A (en) * 1988-05-06 1993-09-21 Hitachi, Ltd. Multi-processor system having a multi-port cache memory
US5276842A (en) * 1990-04-10 1994-01-04 Mitsubishi Denki Kabushiki Kaisha Dual port memory
US5337414A (en) * 1992-09-22 1994-08-09 Unisys Corporation Mass data storage and retrieval system
US5440523A (en) * 1993-08-19 1995-08-08 Multimedia Communications, Inc. Multiple-port shared memory interface and associated method

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL7110444A (en) * 1971-07-29 1973-01-31 Philips Nv
US4218756A (en) * 1978-06-19 1980-08-19 Bell Telephone Laboratories, Incorporated Control circuit for modifying contents of packet switch random access memory
DE3109767C2 (en) * 1981-03-13 1983-05-11 Siemens AG, 1000 Berlin und 8000 München Time division switching network unit for time-space switching
IT1155660B (en) * 1982-03-24 1987-01-28 Cselt Centro Studi Lab Telecom REFERENCES TO PCM ELEMENTARY SWITCHING MATRICES
US4730305A (en) * 1986-04-11 1988-03-08 American Telephone And Telegraph Company, At&T Bell Laboratories Fast assignment technique for use in a switching arrangement
US5014238A (en) * 1987-07-15 1991-05-07 Distributed Matrix Controls Inc. Universal input/output device
US5214760A (en) * 1988-08-26 1993-05-25 Tektronix, Inc. Adaptable multiple port data buffer
JPH03182140A (en) * 1989-12-11 1991-08-08 Mitsubishi Electric Corp Common buffer type exchange
US5301303A (en) * 1990-04-23 1994-04-05 Chipcom Corporation Communication system concentrator configurable to different access methods
GB9011743D0 (en) * 1990-05-25 1990-07-18 Plessey Telecomm Data element switch
CA2049428C (en) * 1990-08-20 1996-06-18 Yasuro Shobatake Atm communication system
US5260905A (en) * 1990-09-03 1993-11-09 Matsushita Electric Industrial Co., Ltd. Multi-port memory
EP0477595A3 (en) * 1990-09-26 1992-11-19 Siemens Aktiengesellschaft Cache memory device with m bus connections
JPH0630025A (en) * 1991-07-08 1994-02-04 Nec Corp Asynchronous time division exchange system
FR2669496B1 (en) * 1990-11-21 1995-03-31 Alcatel Business Systems TIME SWITCH WITH EXPLODED ARCHITECTURE AND CONNECTION MODULE FOR THE ESTABLISHMENT OF SUCH A SWITCH.
US5297138A (en) * 1991-04-30 1994-03-22 Hewlett-Packard Company Determining physical topology across repeaters and bridges in a computer network
JP3169639B2 (en) * 1991-06-27 2001-05-28 日本電気株式会社 Semiconductor storage device
US5287346A (en) * 1991-10-16 1994-02-15 Carnegie Mellon University Packet switch
JPH0775015B2 (en) * 1991-12-19 1995-08-09 インターナショナル・ビジネス・マシーンズ・コーポレイション Data communication and processing system and data communication processing method
US5375089A (en) * 1993-10-05 1994-12-20 Advanced Micro Devices, Inc. Plural port memory system utilizing a memory having a read port and a write port

Also Published As

Publication number Publication date
US5802580A (en) 1998-09-01
US6167491A (en) 2000-12-26
AU3412295A (en) 1996-03-22

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU BR CA JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA