WO2000016202A1 - Apparatus and method to efficiently implement a switch architecture for a multiprocessor system

Apparatus and method to efficiently implement a switch architecture for a multiprocessor system

Info

Publication number
WO2000016202A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
data
switch device
processors
switch
Prior art date
Application number
PCT/US1999/018784
Other languages
French (fr)
Inventor
Jeff Guan
Jiahung Chen
Yew-Koon Tan
Taner Ozcelik
Shirish Gadre
Original Assignee
Sony Electronics Inc.
Priority date
Filing date
Publication date
Application filed by Sony Electronics Inc.
Priority to AU55701/99A (AU5570199A)
Publication of WO2000016202A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356Indirect interconnection networks
    • G06F15/17368Indirect interconnection networks non hierarchical topologies
    • G06F15/17375One dimensional, e.g. linear array, ring
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q3/00Selecting arrangements
    • H04Q3/42Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker
    • H04Q3/54Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised
    • H04Q3/545Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised using a stored programme
    • H04Q3/54541Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised using a stored programme using multi-processor systems
    • H04Q3/5455Multi-processor, parallelism, distributed systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q3/00Selecting arrangements
    • H04Q3/42Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker
    • H04Q3/54Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised
    • H04Q3/545Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised using a stored programme
    • H04Q3/54541Circuit arrangements for indirect selecting controlled by common circuits, e.g. register controller, marker in which the logic circuitry controlling the exchange is centralised using a stored programme using multi-processor systems
    • H04Q3/54558Redundancy, stand-by
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/13034A/D conversion, code compression/expansion
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/13103Memory
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/13107Control equipment for a part of the connection, distributed control, co-processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/13204Protocols
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/1332Logic circuits
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04QSELECTING
    • H04Q2213/00Indexing scheme relating to selecting arrangements in general and for multiplex systems
    • H04Q2213/13396Signaling in general, in-band signalling

Abstract

An apparatus and method to efficiently implement a switch architecture for a multiprocessor system comprises a switch device (212), a plurality of system processors (216, 226, 232), and corresponding interface sockets (214, 222, 230). Each system processor communicates with the other system processors through the switch device (212) to perform data write operations from a master processor to a slave processor, and also to perform data read operations from a slave processor to a master processor. In multiprocessor systems having more than three processors (216, 226, 232), simultaneous multiple data transfers are permitted between any two pairs of system processors.

Description

APPARATUS AND METHOD TO EFFICIENTLY IMPLEMENT A SWITCH ARCHITECTURE FOR A MULTIPROCESSOR SYSTEM
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to multiprocessor architectures, and relates more particularly to an apparatus and method to efficiently implement a switch architecture for a multiprocessor system.
2. Description of the Background Art
An effective and efficient method for implementing a multiprocessor system architecture is a significant consideration for designers, manufacturers, and users of many modern electronic systems. As system applications and demands increase in complexity, a single processor often becomes insufficient to perform the substantial variety of tasks required by many system users. Multiprocessor system architectures of various descriptions have thus become a significant area of technological development in the field of electronic systems design.
Referring now to FIG. 1, a block diagram illustrating an architecture for a multiprocessor system 110 is shown. In the FIG. 1 embodiment, multiprocessor system 110 includes processor 114, processor 116, processor 118, and processor 120. Each of the FIG. 1 processors 114, 116, 118, and 120 is coupled to, and communicates through, a system bus 112. Therefore, if a particular task performed by system 110 is relatively complicated or extensive, system 110 may divide and allocate portions of the task among processors 114, 116, 118, and 120 to facilitate and expedite performance of the task.
Modern integrated circuit fabrication techniques have progressively reduced the individual component size and corresponding physical circuit block dimensions for manufactured integrated circuits. Since the physical dimensions of the entire integrated circuit device remain relatively unchanged, the smaller components must frequently communicate over system buses that have not been correspondingly reduced in length.
Driving these long system buses using modern components (with reduced physical dimensions) often results in bus loading and delay problems for processors 114, 116, 118, and 120. In fact, in some cases, the wire delay of system bus 112 may exceed one system clock cycle, thereby inducing a critical timing path.
Due to the foregoing problems, system bus 112 becomes very slow and inefficient when servicing multiple processors. The limiting factor for system 110 may thus become the speed of system bus 112, rather than the speed of the individual processors 114, 116, 118, and 120. For all the foregoing reasons, an improved apparatus and method are needed to efficiently implement a switch architecture for a multiprocessor system.
SUMMARY OF THE INVENTION
In accordance with the present invention, an apparatus and method to efficiently implement a switch architecture for a multiprocessor system is disclosed. In one embodiment, the invention includes a host processor, a digital signal processor 1 (DSP1), and a digital signal processor 2 (DSP2) that preferably communicate through a switch. In operation, the host processor, the DSP1, and the DSP2 each communicate with the switch through corresponding interface sockets. In accordance with the present invention, the host processor, the DSP1, or the DSP2 may function as a master processor to initiate either a data read cycle or a data write cycle by generating a data transfer request. Any of the remaining processors (host processor, DSP1, or DSP2) may similarly act as a slave processor to service the data transfer request.
In the data write cycle, a master processor preferably sends a write request and a slave unit number to the switch, which responsively arbitrates the write request and generates a grant signal to the master processor to authorize a write data transfer. Next, the switch creates a data transfer bridge to pass the write data to the slave processor. Then, the master processor sends an address and a data count to the switch, which responsively stores the data count in an internal data counter.
The switch then receives the write data from the master processor and temporarily stores the received write data into an internal FIFO. If the slave processor is ready to accept the write data, then the switch initially sends the address, and subsequently sends one unit of the temporarily stored write data from the FIFO to the slave processor. The switch then decrements the data counter to monitor the amount of write data that remains to be transferred to the slave processor. The switch then continues to transmit units of data from the FIFO to the slave processor. When the current data count stored in the data counter becomes equal to zero, then all the write data has been transferred to the slave processor, and the switch generates a termination signal to end the data write cycle.
In the data read cycle, the master processor preferably sends a read request and a slave unit number to the switch, which responsively arbitrates the read request and generates a grant signal to the master processor to authorize a read data transfer. Next, the master processor sends an address and a data count to the switch, which responsively stores the data count in the internal data counter.
When the slave processor is ready, the switch then receives the read data from the slave processor and temporarily stores the received read data into the internal FIFO. When the master processor is ready to accept the read data, then the switch sends one unit of the temporarily stored read data from the FIFO to the master processor. The switch then decrements the data counter to monitor the amount of read data that remains to be transferred to the master processor. The switch then continues to transmit units of data from the FIFO to the master processor. When the current data count stored in the data counter becomes equal to zero, then all the read data has been transferred to the master processor, and the switch generates a termination signal to end the data read cycle. The present invention thus efficiently and effectively implements a switch architecture for a multiprocessor system.
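For illustration only, the following C sketch models the master processor's side of the write cycle summarized above. The register layout, field names, and polling loops are editorial assumptions, not signal definitions taken from this disclosure; actual signal timing follows FIGS. 11 and 14.

```c
#include <stdint.h>

/* Assumed memory-mapped view of one switch interface socket; the field
 * names and layout are illustrative, not taken from the disclosure. */
typedef struct {
    volatile uint32_t request;  /* asserted by the master to start a cycle  */
    volatile uint32_t unit;     /* slave unit number                        */
    volatile uint32_t wr;       /* 1 = write cycle, 0 = read cycle          */
    volatile uint32_t grant;    /* asserted by the switch after arbitration */
    volatile uint32_t addr;     /* transfer start address                   */
    volatile uint32_t count;    /* data count loaded into the data counter  */
    volatile uint32_t data;     /* data port into the switch FIFO           */
    volatile uint32_t ready;    /* switch can accept the next data unit     */
    volatile uint32_t term;     /* termination signal from the switch       */
} switch_port_t;

/* Master-side write cycle: request, await grant, send address and count,
 * then stream data units into the switch FIFO until termination. */
static void master_write(switch_port_t *sw, uint32_t slave_unit,
                         uint32_t addr, const uint32_t *buf, uint32_t count)
{
    sw->unit = slave_unit;       /* identify the slave processor         */
    sw->wr = 1;                  /* write cycle                          */
    sw->request = 1;             /* raise the write request              */
    while (!sw->grant) ;         /* wait for the arbitration grant       */

    sw->addr = addr;             /* switch latches the address           */
    sw->count = count;           /* switch loads its internal counter    */

    for (uint32_t i = 0; i < count; i++) {
        while (!sw->ready) ;     /* FIFO full: halt until space returns  */
        sw->data = buf[i];       /* one unit into the switch FIFO        */
    }
    while (!sw->term) ;          /* switch signals the cycle is complete */
    sw->request = 0;
}
```

The read cycle is symmetric from the master's point of view: the request is issued with the write/read signal indicating a read, and the master then drains units from the switch FIFO instead of filling it.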
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating an architecture for a multiprocessor system;
FIG. 2 is a block diagram for one embodiment of a multiprocessor system, in accordance with the present invention;
FIG. 3 is a block diagram for one embodiment of the switch of FIG. 2, in accordance with the present invention;
FIG. 4 is a signal table corresponding to the switch of FIG. 3, in accordance with the present invention;
FIG. 5 is a detailed block diagram for one embodiment of the switch of FIG. 2, in accordance with the present invention;
FIG. 6 is a block diagram for one embodiment of the host interface socket of FIG. 3, in accordance with the present invention;
FIG. 7 is a signal table corresponding to the host interface socket of FIG. 6, in accordance with the present invention;
FIG. 8 is a block diagram for one embodiment of the DSP interface sockets of FIG. 3, in accordance with the present invention;
FIG. 9 is a signal table corresponding to the DSP interface sockets of FIG. 8, in accordance with the present invention;
FIG. 10 is a block diagram tracing a basic write pipeline for a data write cycle, in accordance with the present invention;
FIG. 11 is a timing diagram showing exemplary waveforms for a data write cycle, in accordance with the present invention;
FIG. 12 is a flowchart of method steps for one embodiment to perform a data write cycle, in accordance with the present invention;
FIG. 13 is a block diagram tracing a basic read pipeline for a data read cycle, in accordance with the present invention;
FIG. 14 is a timing diagram showing exemplary waveforms for a data read cycle, in accordance with the present invention;
FIG. 15 is a flowchart of method steps for one embodiment to perform a data read cycle, in accordance with the present invention; and
FIG. 16 is a block diagram for one embodiment of a multiprocessor system, in accordance with the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
The present invention relates to an improvement in electronic processor architectures. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. Thus, the present invention is not intended to be limited to the embodiment shown, but is to be accorded the widest scope consistent with the principles and features described herein.
The present invention comprises an apparatus and method to efficiently implement a switch architecture for a multiprocessor system including a switch device, a plurality of processors, and corresponding interface sockets. Each system processor communicates with the other system processors through the switch device to perform data write operations from a master processor to a slave processor, and also to perform data read operations from a slave processor to a master processor. In multiprocessor systems having more than three processors, the present invention permits simultaneous multiple data transfers between any two pairs of system processors.
Referring now to FIG. 2, a block diagram for one embodiment of a multiprocessor system 210 is shown, in accordance with the present invention. Although multiprocessor system 210 may readily be implemented in any appropriate and compatible electronic device, in the preferred embodiment, multiprocessor system 210 is preferably part of an encoder device that encodes data, including audio data, received from a data source. The encoder device (multiprocessor system 210) may then provide the encoded data to a program destination, such as a recordable digital video disc device for storage and subsequent playback by a system user.
The FIG. 2 embodiment includes a host processor 216, a digital signal processor 1 (DSP1) 224, and a digital signal processor 2 (DSP2) 232 that preferably communicate through a switch 212. In operation, host processor 216, DSP1 224, and DSP2 232 each communicate directly with switch 212. For example, host processor 216 and switch 212 communicate bidirectionally through bus 218, host interface socket 214, and bus 220. Similarly, DSP1 224 and switch 212 communicate through bus 226, DSP 1 interface socket 222, and bus 228. Further, DSP2 232 and switch 212 communicate bidirectionally through bus 234, DSP 2 interface socket 230, and bus 236. Switch 212 may thus advantageously receive various data from a source processor (host processor 216, DSP1 224, or DSP2 232), and then responsively relay the received data to a selected destination processor (host processor 216, DSP1 224, or DSP2 232).
The FIG. 2 embodiment thus provides an architecture that avoids the problems discussed above in conjunction with multiprocessor system 110 of FIG. 1. Design independency is achieved by connecting each FIG. 2 processor to an independent port on switch 212 via a separate interface socket. System designers may thus design and test each processor independently. Furthermore, no direct connections exist between any of the FIG. 2 processors. Therefore, system timing analysis is significantly facilitated since only a single timing check is typically required for each FIG. 2 processor. Bus speed in the FIG. 2 embodiment is substantially increased over the FIG. 1 system bus 112 due to the shortened bus lengths and reduced bus loading found in multiprocessor system 210. The FIG. 2 architecture also advantageously exhibits increased circuit modularity. Each FIG. 2 processor circuit block is connected to an independent port on switch 212. Therefore, any of the FIG. 2 processor circuit blocks may readily be replaced or removed from multiprocessor system 210 without affecting the operation of the remaining FIG. 2 processor circuit blocks.
Referring now to FIG. 3, a block diagram for one embodiment of the FIG. 2 switch 212 is shown, in accordance with the present invention. FIG. 3 depicts switch 212, host interface socket (HSK) 214, DSP 1 interface socket (DSK1) 222, and DSP 2 interface socket (DSK2) 230. Also shown in the FIG. 3 embodiment are the respective interface signals that pass between switch 212 and HSK 214, DSK1 222, and DSK2 230.
In the FIG. 3 embodiment, HSK 214, DSK1 222, and DSK2 230 are preferably implemented using an identical or substantially similar configuration to simplify and facilitate interfacing various processors with switch 212. Each respective set of interface signals between switch 212 and HSK 214, DSK1 222, and DSK2 230 is therefore also preferably identical or substantially similar. The interface signals shown in FIG. 3 are further discussed below in conjunction with FIGS. 4-5 and 11-15.
Referring now to FIG. 4, a signal table 410 corresponding to the FIG. 3 switch 212 is shown, in accordance with the present invention. The FIG. 4 signal table 410 describes a set of interface signals 412 through 432, including an interface signal name (corresponding to FIG. 3), an interface signal direction (input or output), and an interface signal description. The timing and functionality of the interface signals described in FIG. 4 are further discussed below in conjunction with FIGS. 5 and 11-15.
Referring now to FIG. 5, a detailed block diagram for one embodiment of the FIG. 2 switch 212 is shown, in accordance with the present invention. In the FIG. 5 embodiment, switch 212 includes switch arbitration logic 510, switch control logic 512, and switch data path 514. Also depicted are data counter 516 in switch control logic 512, and first-in first-out memory (FIFO) 518 in switch data path 514.
In operation, switch arbitration logic 510 receives a request 414 (FIG. 4) for either a read data transfer or a write data transfer between a master processor (host processor 216, DSP1 224, or DSP2 232) and a slave processor (host processor 216, DSP1 224, or DSP2 232). The master processor initiates the data transfer by generating the request 414 to switch 212 and by also sending a unit number 412 to identify the slave processor. Switch 212 then checks a busy signal 430 (FIG. 4) to determine whether the designated slave processor is free to perform the requested transfer. If the slave processor is available, then switch 212 sends a grant signal 416 (FIG. 4) to the master processor to authorize the data transfer. The grant signal 416 may also be generated if an ignore signal 432 is asserted by the master processor.
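The arbitration decision just described can be captured in a brief C sketch; the signal encodings, the three-entry busy array, and the function name are assumptions made for illustration.

```c
#include <stdbool.h>

typedef struct {
    bool request;    /* request 414 from a would-be master processor */
    int  unit;       /* unit number 412 identifying the slave        */
    bool busy[3];    /* busy 430, one flag per candidate slave       */
    bool ignore;     /* ignore 432 asserted by the master            */
} arb_inputs_t;

/* Returns true when switch arbitration logic 510 should assert grant 416. */
static bool arbitrate(const arb_inputs_t *in)
{
    if (!in->request)
        return false;               /* no transfer requested           */
    if (in->ignore)
        return true;                /* master overrides the busy check */
    return !in->busy[in->unit];     /* grant only if the slave is free */
}
```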
Switch control logic 512 preferably receives a data count from the master processor to indicate the number of data units being transferred. The data count is loaded into data counter 516, which is then decremented as each data unit is transferred. Switch control logic 512 may then generate a transfer termination signal when the data count in data counter 516 reaches zero. Implementing data counter 516 centrally within switch 212 significantly reduces the complexity of the switch interfaces required within each processor (host processor 216, DSP1 224, and DSP2 232). Switch control logic 512 also preferably generates control signals for operating FIFO 518.
In response to the grant signal 416, switch data path 514 creates a data transfer bridge connecting the master processor and the slave processor. FIFO 518 temporarily stores the transferred data to maximize performance of switch 212. For example, during a write cycle, if the slave processor is slow or delayed while saving the transfer data into its internal memory, then switch 212 may advantageously store the transfer data into FIFO 518 until the slave processor becomes ready to accept more transfer data. In addition, if FIFO 518 becomes filled to capacity with transfer data, then switch 212 may notify the master processor to temporarily halt the transmission of additional transfer data until FIFO 518 regains storage capacity. Switch 212 may thus utilize FIFO 518 to effectively coordinate the data transfer process by dividing the data transfer process into two separate steps. Furthermore, the physical distance and the corresponding signal propagation time between the master processor and the slave processor are divided in half to permit a system clock to function at twice the rate of similar data transfers performed directly from master processor to slave processor.
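A minimal C sketch of FIFO 518 as a ring buffer illustrates this two-step transfer and its flow control; the depth of 16 and the function names are illustrative assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define FIFO_DEPTH 16   /* assumed depth; the disclosure does not specify one */

typedef struct {
    uint32_t buf[FIFO_DEPTH];
    unsigned head, tail, fill;
} fifo_t;

/* Master side: returns false when full, i.e., the switch must notify the
 * master to temporarily halt transmission of additional transfer data. */
static bool fifo_push(fifo_t *f, uint32_t unit)
{
    if (f->fill == FIFO_DEPTH)
        return false;
    f->buf[f->head] = unit;
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->fill++;
    return true;
}

/* Slave side: returns false when empty, i.e., the destination must wait
 * until more transfer data arrives from the source processor. */
static bool fifo_pop(fifo_t *f, uint32_t *unit)
{
    if (f->fill == 0)
        return false;
    *unit = f->buf[f->tail];
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->fill--;
    return true;
}
```

The two halves decouple the master and slave exactly as described above: each side synchronizes only with the FIFO, never directly with the other processor.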
Referring now to FIG. 6, a block diagram for one embodiment of the FIG. 3 host interface socket (HSK) 214 is shown, in accordance with the present invention. In the FIG. 6 embodiment, host interface socket 214 preferably includes electronic circuitry that receives interface signals generated from switch 212 (FIGS. 3 and 4) and responsively generates a set of corresponding host processor signals to host processor 216. The functionality of the host processor signals is further described below in conjunction with FIG. 7.
Referring now to FIG. 7, a signal table 710 corresponding to the FIG. 6 host interface socket 214 is shown, in accordance with the present invention. The FIG. 7 signal table 710 describes a set of host processor signals 712 through 748, including a host processor signal name (corresponding to FIG. 6), a host processor signal direction (input or output), and a host processor signal description. Many of the host processor signals of FIG. 7 directly correspond to the interface signals (FIG. 4) between host interface socket 214 and switch 212.
The data transfer handshaking protocol between host processor 216 and host interface socket 214 is designed so that host processor 216 may advantageously be implemented without requiring complicated interface circuitry. In practice, host interface socket 214 preferably generates a master enable signal 724 to host processor 216 (when host processor 216 functions as the master processor) and a unit of data is responsively transferred. Similarly, host interface socket 214 preferably generates a slave enable signal 736 to host processor 216 (when host processor 216 functions as the slave processor) and a unit of data is responsively transferred. Host processor 216 may thus assume a relatively passive role in the data transfer process.
Referring now to FIG. 8, a block diagram for one embodiment of the FIG. 3 DSP interface sockets (DSK1 222 and DSK2 230) is shown, in accordance with the present invention. In the FIG. 8 embodiment, DSP interface sockets 222 and 230 preferably each include electronic circuitry that receives interface signals generated from switch 212 (FIGS. 3 and 4) and responsively generates similar sets of corresponding DSP signals to DSP1 224 and to DSP2 232. The functionality of the DSP signals is further described below in conjunction with FIG. 9.
Referring now to FIG. 9, a signal table 810 corresponding to the FIG. 8 DSP interface sockets 222 and 230 is shown, in accordance with the present invention. The FIG. 9 signal table 810 describes a set of DSP signals 812 through 848, including a DSP signal name (corresponding to FIG. 8), a DSP signal direction (input or output), and a DSP signal description. Many of the DSP signals of FIG. 9 directly correspond to the interface signals (FIG. 4) between DSP1 interface socket 222 or DSP2 interface socket 230 and switch 212.
As similarly discussed above, the data transfer handshaking protocol between DSPs 224 and 232 and their DSP interface sockets 222 and 230 is designed so that DSPs 224 and 232 may advantageously be implemented without requiring complicated interface circuitry. In practice, DSP interface socket 222 or 230 preferably generates a master enable signal 824 to its corresponding DSP 224 or 232 (when that DSP functions as the master processor) and a unit of data is responsively transferred. Similarly, DSP interface socket 222 or 230 preferably generates a slave enable signal 836 to its corresponding DSP 224 or 232 (when that DSP functions as the slave processor) and a unit of data is responsively transferred. DSPs 224 and 232 may thus assume relatively passive roles during the data transfer process.
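The passive role of each processor can be shown with a small hedged C sketch: the socket raises one of two enables, and the processor moves exactly one unit of data in response. The struct layout is an assumption; the enable numbers echo signal tables 710 and 810.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    volatile bool     master_enable;  /* master enable 724 / 824        */
    volatile bool     slave_enable;   /* slave enable 736 / 836         */
    volatile uint32_t data;           /* data lines to/from the socket  */
} socket_port_t;

/* Called once per transferred unit; the processor needs no other
 * interface logic, which is the point of the handshaking design. */
static void service_socket(socket_port_t *s, const uint32_t *out, uint32_t *in)
{
    if (s->master_enable)
        s->data = *out;   /* acting as master: drive one unit outward */
    else if (s->slave_enable)
        *in = s->data;    /* acting as slave: latch one unit inward   */
}
```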
Referring now to FIG. 10, a block diagram tracing a basic write pipeline 1010 for a data write cycle is shown, in accordance with the present invention. In the following FIGS. 10 through 12, for the sake of illustration, DSP1 224 is described as the master processor that initiates a write request to transfer write data to slave processor DSP2 232. In other uses of the present invention, any system 210 processor (host processor 216, DSP1 224, or DSP2 232) may function as a master processor to request a write data transfer, and likewise, any system 210 processor may operate as the slave processor to service the write request.
In the FIG. 10 example, DSP1 224 sends an address and data count through DSK1 222 to switch 212, which temporarily latches the address in latch 1012. DSP1 224 then sends write data through DSK1 222 to switch 212, which temporarily latches the write data in latch 1012. At the appropriate time, switch 212 stores the address into latch 1014 of DSK2 230, and also stores the write data into latch 1016 of DSK2 230. When DSP2 232 is ready, DSK2 230 provides the address and the write data to DSP2 232, and the write cycle is complete. The operation of the write cycle is further illustrated and discussed below in conjunction with FIGS. 11 and 12.
Referring now to FIG. 11, a timing diagram 1110 showing exemplary waveforms for a data write cycle is shown, in accordance with the present invention. The waveforms of FIG. 11 correspond to the signals discussed above in conjunction with FIGS. 3 through 5, and are presented to illustrate the operation of one embodiment of the present invention. In alternate embodiments, multiprocessor system 210 may readily generate and operate with various other appropriate timing waveforms. One embodiment for generating and utilizing the FIG. 11 waveforms is further discussed below in conjunction with FIG. 12.
Referring now to FIG. 12, a flowchart of method steps for one embodiment to perform a data write cycle is shown, in accordance with the present invention. In the FIG. 12 embodiment, DSP1 224 functions as the master processor and DSP2 232 functions as the slave processor. However, in alternate embodiments, any system 210 processor (host processor 216, DSP1 224, or DSP2 232) may initiate a write request as the master processor, or service a write request as the slave processor.
Initially, in step 1212, the master processor sends a write request 414 (FIG. 11) and a slave unit number 412 to switch 212 via DSK1 222. Then, in step 1214, switch 212 responsively arbitrates the write request 414, and generates a grant signal 416 to the master processor to authorize a write data transfer, as discussed above in conjunction with FIG. 5.
Next, in step 1216, switch 212 creates a data transfer bridge through switch data path 514 (FIG. 5) to temporarily store and then pass the write data to the slave processor. Then, in step 1218, the master processor sends an address and a data count to switch 212 via DSK1 222, and switch 212 responsively stores the data count in data counter 516 and latches the address in latch 1012.
In step 1220, switch 212 receives the write data from the master processor and temporarily stores the received write data into FIFO 518 (FIG. 5). Then, in step 1222, switch 212 determines whether the slave processor is ready to receive the write data stored in FIFO 518. In the FIG. 12 embodiment, switch 212 preferably checks a busy signal 430 to determine the status of the slave processor.
If the slave processor is ready to accept the write data, then switch 212, in step 1224, initially sends the latched address, and subsequently sends one unit of the temporarily stored write data from FIFO 518 to the slave processor via DSK2 230. In step 1226, switch 212 then decrements the data counter 516 to monitor the amount of write data that remains to be transferred to the slave processor.
In step 1228, switch 212 determines whether the current data count stored in data counter 516 is equal to zero. If the current data count stored in data counter 516 is not equal to zero, then the FIG. 12 process returns to step 1224 to continue transferring the remaining units of write data. However, if the current value stored in data counter 516 is equal to zero, then all the write data has been transferred to DSP2 232, and switch 212 generates a termination signal 426 to end the FIG. 12 data write cycle.
Referring now to FIG. 13, a block diagram tracing a basic read pipeline 1310 for a data read cycle is shown, in accordance with the present invention. In the following FIGS. 13 through 15, for the sake of illustration, DSP1 224 is described as the master processor that initiates a read request to transfer read data from slave processor DSP2 232. In other uses of the present invention, any system 210 processor (host processor 216, DSP1 224, or DSP2 232) may operate as the master processor to request a data read operation, and likewise, any system 210 processor may operate as the slave processor to service the data read request.
In the FIG. 13 embodiment, DSP1 224 sends an address and data count through DSK1 222 to switch 212, which temporarily latches the address in latch 1316. When DSP2 232 is ready, then DSK2 230 provides the address to DSP2 232 via latch 1318 of DSK2 230. DSP2 232 responsively transfers the requested read data to latch 1320 in switch 212. When DSP1 224 is ready to accept the requested read data, then switch 212 transfers the read data to DSP1 224 via latch 1322 in DSK1 222. The operation of the data read cycle is further illustrated and discussed below in conjunction with FIGS. 14 and 15.
Referring now to FIG. 14, a timing diagram 1410 showing exemplary waveforms for a data read cycle is shown, in accordance with the present invention. The waveforms of FIG. 14 correspond to the signals discussed above in conjunction with FIGS. 3 through 5, and are presented to illustrate the operation of one embodiment of the present invention. In alternate embodiments, multiprocessor system 210 may readily generate and function using various other appropriate timing waveforms. One embodiment for generating and utilizing the FIG. 14 waveforms is further discussed below in conjunction with FIG. 15.
Referring now to FIG. 15, a flowchart of method steps for one embodiment to perform a data read cycle is shown, in accordance with the present invention. In the FIG. 15 embodiment, DSP1 224 functions as the master processor and DSP2 232 functions as the slave processor. However, in alternate embodiments, any system 210 processor (host processor 216, DSP1 224, or DSP2 232) may initiate a read request as the master processor, or service a read request as the slave processor.
Initially, in step 1512, the master processor sends a read request 414 and a slave unit number 412 to switch 212 via DSK1 222. In one embodiment, the master processor utilizes a write/read signal 418 (FIG. 14) to indicate whether the request is for a write operation or a read operation. Then, in step 1514, switch 212 responsively arbitrates the read request 414, and generates a grant signal 416 to the master processor to authorize a read data transfer, as discussed above in conjunction with FIG. 5.
Next, in step 1516, switch 212 creates a data transfer bridge through switch data path 514 (FIG. 5) to temporarily store and then pass the read data from the slave processor to the master processor. Then, in step 1518, the master processor sends an address and a data count to switch 212 via DSK1 222, and switch 212 responsively stores the data count in data counter 516 and latches the address in latch 1316.
In step 1520, switch 212 preferably uses a handshaking protocol to determine whether the slave processor is ready to begin the read data transfer. If the slave processor is ready, then switch 212, in step 1522, receives the read data from the slave processor and temporarily stores the received read data into FIFO 518 (FIG. 5). Then, in step 1524, switch 212 preferably uses another handshaking protocol to determine whether the master processor is ready to receive the read data stored in FIFO 518.
If the master processor is ready to accept the read data, then switch 212, in step 1526, sends one unit of the temporarily stored read data from FIFO 518 to the master processor via DSK1 222. In step 1528, switch 212 then decrements the data counter 516 to monitor the amount of read data that remains to be transferred to the master processor.
In step 1530, switch 212 determines whether the current data count stored in data counter 516 is equal to zero. If the current data count stored in data counter 516 is not equal to zero, then the FIG. 15 process returns to step 1526 to continue transferring the remaining units of read data. However, if the current data count stored in data counter 516 is equal to zero, then all the read data has been transferred to DSP1 224, and switch 212 generates a termination signal 426 to end the FIG. 15 data read cycle.
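The write cycle of FIG. 12 and the read cycle of FIG. 15 share the same counted-transfer loop inside switch 212; the C sketch below captures that common structure. The extern functions are editorial placeholders for hardware behavior, not interfaces defined by this disclosure.

```c
#include <stdint.h>
#include <stdbool.h>

/* Placeholder hooks standing in for hardware behavior (assumptions). */
extern bool dest_ready(int unit);             /* busy 430 deasserted?       */
extern bool fifo_take(uint32_t *d);           /* next unit from FIFO 518    */
extern void send_unit(int unit, uint32_t d);  /* drive one unit to the port */
extern void assert_termination(void);         /* termination signal 426     */

/* Stream data_count units to the destination processor, decrementing
 * data counter 516 per unit and terminating the cycle at zero. */
static void run_counted_transfer(int dest_unit, uint32_t data_count)
{
    uint32_t counter = data_count;            /* data counter 516 */
    while (counter != 0) {
        uint32_t d;
        if (!dest_ready(dest_unit))
            continue;                         /* destination not ready: stall */
        if (!fifo_take(&d))
            continue;                         /* FIFO empty: wait for data    */
        send_unit(dest_unit, d);              /* steps 1224 / 1526            */
        counter--;                            /* steps 1226 / 1528            */
    }
    assert_termination();                     /* steps 1228 / 1530            */
}
```

In a write cycle the destination is the slave processor; in a read cycle it is the master processor. Only the direction of the data transfer bridge differs.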
Referring now to FIG. 16, a block diagram for one embodiment of a multiprocessor system 1610 is shown, in accordance with the present invention. The FIG. 16 embodiment includes a switch module 1612 that individually communicates with a processor 1 1614, a processor 2 1622, a processor 3 1638, and a processor 4 1630. In alternate embodiments, system 1610 may readily be configured to include more or fewer than the four processors 1614, 1622, 1630, and 1638 that are illustrated in the FIG. 16 embodiment.
In operation, data read cycles and data write cycles may be performed by the FIG. 16 multiprocessor system 1610 using the same or similar techniques as those discussed above in conjunction with FIGS. 1 through 15. However, the FIG. 16 system 1610 may advantageously also perform simultaneous data transfers between multiple pairs of processors 1630, 1614, 1622, and 1638, in accordance with the present invention.
For example, processor 1 1614 and processor 2 1622 may perform a data transfer, while processor 3 1638 and processor 4 1630 simultaneously perform another separate data transfer. Similarly, processor 1 1614 and processor 3 1638 may perform a data transfer, while processor 2 1622 and processor 4 1630 simultaneously perform another data transfer. Further, processor 1 1614 and processor 4 1630 may perform a data transfer, while processor 3 1638 and processor 2 1622 simultaneously perform a data transfer.
The FIG. 16 system 1610 advantageously creates multiple read or write pipelines to provide powerful simultaneous multiple data transfer capabilities for system 1610. The ability to concurrently process and perform multiple data transfers therefore allows multiprocessor system 1610 to significantly expedite and facilitate complex processing tasks.
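The rule for such concurrency is simply that the two transfers touch disjoint switch ports, as in this small C sketch; the pairing check is an editorial illustration of one way such a scheduler could be expressed.

```c
#include <stdbool.h>

typedef struct { int master, slave; } transfer_t;

/* Two transfers may proceed simultaneously through independent pipelines
 * only when their master/slave port sets do not overlap. */
static bool can_run_concurrently(transfer_t a, transfer_t b)
{
    return a.master != b.master && a.master != b.slave &&
           a.slave  != b.master && a.slave  != b.slave;
}
```

For example, the pairs {processor 1, processor 2} and {processor 3, processor 4} pass this check, matching the first scenario described above.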
The invention has been explained above with reference to a preferred embodiment. Other embodiments will be apparent to those skilled in the art in light of this disclosure. For example, the present invention may readily be implemented using configurations and techniques other than those described in the preferred embodiment above. Additionally, the present invention may effectively be used in conjunction with systems other than the one described above as the preferred embodiment. Therefore, these and other variations upon the preferred embodiments are intended to be covered by the present invention, which is limited only by the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A system for implementing a switch architecture, comprising:
a plurality of processors (216, 226, 232) coupled to said system to process information; and
a switch device (212) configured to provide communications between said plurality of processors (216, 226, 232).
2. The system of claim 1 wherein said plurality of processors (216, 226, 232) and said switch device (212) are implemented as part of an encoder device for encoding said information.
3. The system of claim 2 wherein said information is digital audio data that is encoded by said encoder device for storage on a digital video disc apparatus.
4. The system of claim 1 wherein said plurality of processors (216, 226, 232) includes a host processor (216), a first digital signal processor (224), and a second digital signal processor (232).
5. The system of claim 4 wherein said host processor (216) communicates with said switch device (212) through a host interface socket (214), said first digital signal processor (224) communicates with said switch device (212) through a first DSP interface socket (222), and said second digital signal processor (232) communicates with said switch device (212) through a second DSP interface socket (230).
6. The system of claim 5 wherein said host interface socket (214), said first DSP interface socket (222), and said second DSP interface socket (230) are implemented using an identical configuration.
7. The system of claim 1 wherein said switch device (212) includes an arbitration logic module (510), a control logic module (512), and a data path (514).
8. The system of claim 1 wherein said plurality of processors (216, 226, 232) includes a master processor to initiate a write cycle for transferring write data to a slave processor.
9. The system of claim 8 wherein said master processor initiates a read cycle for obtaining read data from said slave processor.
10. The system of claim 9 wherein said master processor generates a data transfer request to said switch device (212).
11. The system of claim 9 wherein said master processor generates a unit identifier to identify said slave processor from among said plurality of processors (216, 226, 232).
12. The system of claim 10 wherein said switch device (212) arbitrates said data transfer request and generates a request grant signal to said master processor when said slave processor is available for said communications.
13. The system of claim 9 wherein said switch device (212) creates a data transfer bridge to permit said communications between said master processor and said slave processor.
14. The system of claim 9 wherein said master processor sends an address to said slave processor.
15. The system of claim 9 wherein said master processor stores a data count in a data counter (516) inside of said switch device (212), said data count corresponding to data units from one of said write data and said read data.
16. The system of claim 15 wherein said data units are temporarily stored into a memory device (518) inside of said switch device (212).
17. The system of claim 16 wherein said counter decrements said data count each time that one of said data units is transferred from said memory device (518).
18. The system of claim 17 wherein said switch device (212) terminates said write cycle and said read cycle when said data count in said data counter (516) is equal to zero.
19. The system of claim 1 wherein said plurality of processors (216, 226, 232) includes at least four processor devices.
20. The system of claim 19 wherein said switch device (212) simultaneously performs data transfers between at least two pairs of said at least four processor devices (1614, 1622, 1630, 1638).
21. A method for implementing a switch architecture, comprising the steps of:
providing a plurality of processors (216, 226, 232) for processing information; and
configuring a switch device (212) to provide communications between said plurality of processors (216, 226, 232).
22. The method of claim 21 wherein said plurality of processors (216, 226, 232) and said switch device (212) are implemented as part of an encoder device for encoding said information.
23. The method of claim 22 wherein said information is digital audio data that is encoded by said encoder device for storage on a digital video disc apparatus.
24. The method of claim 21 wherein said plurality of processors (216, 226, 232) includes a host processor (216), a first digital signal processor (224), and a second digital signal processor (232).
25. The method of claim 24 wherein said host processor (216) communicates with said switch device (212) through a host interface socket (214), said first digital signal processor (224) communicates with said switch device (212) through a first DSP interface socket (222), and said second digital signal processor (232) communicates with said switch device (212) through a second DSP interface socket (230).
26. The method of claim 25 wherein said host interface socket (214), said first DSP interface socket (222), and said second DSP interface socket (230) are implemented using an identical configuration.
27. The method of claim 21 wherein said switch device (212) includes an arbitration logic module (510), a control logic module (512), and a data path (514).
28. The method of claim 21 wherein said plurality of processors (216, 226, 232) includes a master processor to initiate a write cycle for transferring write data to a slave processor.
29. The method of claim 28 wherein said master processor initiates a read cycle for obtaining read data from said slave processor.
30. The method of claim 29 wherein said master processor generates a data transfer request to said switch device (212).
31. The method of claim 29 wherein said master processor generates a unit identifier to identify said slave processor from among said plurality of processors (216, 226, 232).
32. The method of claim 30 wherein said switch device (212) arbitrates said data transfer request and generates a request grant signal to said master processor when said slave processor is available for said communications.
33. The method of claim 29 wherein said switch device (212) creates a data transfer bridge to permit said communications between said master processor and said slave processor.
34. The method of claim 29 wherein said master processor sends an address to said slave processor.
35. The method of claim 29 wherein said master processor stores a data count in a data counter (516) inside of said switch device (212), said data count corresponding to data units from one of said write data and said read data.
36. The method of claim 35 wherein said data units are temporarily stored into a memory device (518) inside of said switch device (212).
37. The method of claim 36 wherein said counter decrements said data count each time that one of said data units is transferred from said memory device (518).
38. The method of claim 37 wherein said switch device (212) terminates said write cycle and said read cycle when said data count in said data counter (516) is equal to zero.
39. The method of claim 21 wherein said plurality of processors (216, 226, 232) includes at least four processor devices.
40. The method of claim 39 wherein said switch device (212) simultaneously performs data transfers between at least two pairs of said at least four processor devices (1614, 1622, 1630, 1638).
41. A system for implementing a switch architecture, comprising:
means for providing a plurality of processors (216, 226, 232) for processing information; and
means for configuring a switch device (212) to provide communications between said plurality of processors (216, 226, 232).
42. A computer-readable medium comprising program instructions for implementing a multiprocessor system by performing the steps of:
providing a plurality of processors (216, 226, 232) for processing information; and
configuring a switch device (212) to provide communications between said plurality of processors (216, 226, 232).
PCT/US1999/018784 1998-09-16 1999-08-19 Apparatus and method to efficiently implement a switch architecture for a multiprocessor system WO2000016202A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU55701/99A AU5570199A (en) 1998-09-16 1999-08-19 Apparatus and method to efficiently implement a switch architecture for a multiprocessor system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15460098A 1998-09-16 1998-09-16
US09/154,600 1998-09-16

Publications (1)

Publication Number Publication Date
WO2000016202A1 2000-03-23

Family

ID=22551980

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/018784 WO2000016202A1 (en) 1998-09-16 1999-08-19 Apparatus and method to efficiently implement a switch architecture for a multiprocessor system

Country Status (2)

Country Link
AU (1) AU5570199A (en)
WO (1) WO2000016202A1 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5613146A (en) * 1989-11-17 1997-03-18 Texas Instruments Incorporated Reconfigurable SIMD/MIMD processor using switch matrix to allow access to a parameter memory by any of the plurality of processors
US5255264A (en) * 1991-09-26 1993-10-19 Ipc Information Systems, Inc. Distributed control switching network for multi-line telephone communications
US5581600A (en) * 1992-06-15 1996-12-03 Watts; Martin O. Service platform

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005109232A1 (en) * 2004-05-12 2005-11-17 Building 31 Clustering Ab Cluster switch

Also Published As

Publication number Publication date
AU5570199A (en) 2000-04-03

Similar Documents

Publication Publication Date Title
JP5036120B2 (en) Communication system and method with unblocked shared interface
JP4083987B2 (en) Communication system with multi-level connection identification
US7761632B2 (en) Serialization of data for communication with slave in multi-chip bus implementation
US6295568B1 (en) Method and system for supporting multiple local buses operating at different frequencies
US7743186B2 (en) Serialization of data for communication with different-protocol slave in multi-chip bus implementation
KR960006506B1 (en) Computer system and system expansion unit and bus linkage unit and bus access arbitration method
US7814250B2 (en) Serialization of data for multi-chip bus implementation
US5826048A (en) PCI bus with reduced number of signals
KR100231897B1 (en) Dma control circuit receiving size data of dma channel
JP2004318901A (en) High-speed control and data bus system mutually between data processing modules
US7769933B2 (en) Serialization of data for communication with master in multi-chip bus implementation
JPH06231073A (en) Multiport processor provided with peripheral-device interconnection port and with rambus port
JP2005235197A (en) Bus system for connecting subsystem including a plurality of masters with bus based on open core protocol
KR101699784B1 (en) Bus system and operating method thereof
KR910007646B1 (en) Backplane bus
CN110557311B (en) Inter-processor communication method for inter-die access latency in system-in-package
CN111290986B (en) Bus interconnection system based on neural network
JPH0981508A (en) Method and apparatus for communication
US5838995A (en) System and method for high frequency operation of I/O bus
US9104819B2 (en) Multi-master bus architecture for system-on-chip
JPH09160866A (en) Bus interface logic system and synchronization method
US7133958B1 (en) Multiple personality I/O bus
US5717875A (en) Computing device having semi-dedicated high speed bus
WO2008133940A2 (en) Serialization of data in multi-chip bus implementation
EP0588030A2 (en) Master microchannel apparatus for converting to switch architecture

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase