WO1998013759A1

WO1998013759A1 - Data processor and data processing system

Info

Publication number: WO1998013759A1
Application number: PCT/JP1996/002819
Authority: WO
Inventors: Shigezumi Matsui; Susumu Kaneko
Original assignee: Hitachi, Ltd.
Priority date: 1996-09-27
Filing date: 1996-09-27
Publication date: 1998-04-02
Also published as: TW332272B; JP3778573B2

Abstract

A data processor (1) in which an instruction fetching unit (10) fetches an instruction, an instruction decoder (12) interprets the instruction latched in an instruction register (11), and an instruction executing unit (13) executes the instruction based on the results of interpretation by the decoder (12). One of task buffers (16 and 17) each provided with a program storing area (160 and 170) and a pointer (161 and 171) for successively reading out instructions stored in the areas (160 and 170) or the unit (10) is selected through a selector (18). The selection by the selector (18) is controlled by a switching control means (19) in accordance with an internally or externally generated event. Register means (S1 and S2) used exclusively for the task buffers (16 and 17) are provided in the instruction executing unit (13) so as to make the saving of the internal state of the unit (13) unnecessary when the task is switched by selecting a program stored in one task buffer from the outside. Therefore, the speed of switching the task is improved and the burden of the data processor (1) at the time of switching the task is reduced.

Description

Data processor and data processing system

Technical field

The present invention relates to a data processor, and more particularly to multitasking or task switching techniques in a data processor. It is effective when applied to, for example, a data processor that processes a plurality of tasks in a pipeline, and to a data processing system using such a data processor.

Background art

Pipeline processing is one technique for speeding up data processing by a data processor. Pipelining improves data processing throughput by dividing one large process into a plurality of processing elements and starting new processing one after another at intervals of the time required for each processing element, that is, at the pipeline pitch. For example, when the control processing for executing one instruction is divided into instruction fetch, instruction decode, operation, memory access, and register store, each of these is regarded as one pipeline stage. An instruction is fetched at every pipeline pitch of two pipeline stages, so that apparently one instruction is executed per pipeline pitch.
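The throughput benefit described above can be illustrated with a small timing model. This is only a sketch under stated assumptions: time is counted in units of one pipeline stage, there are no stalls, and the function names and the `pitch` parameter are hypothetical, not taken from the specification.

```python
def sequential_time(num_instructions: int, num_stages: int) -> int:
    """Stage times needed if each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_time(num_instructions: int, num_stages: int, pitch: int = 1) -> int:
    """Stage times needed if a new instruction enters the pipeline every `pitch` stage times.

    The first instruction needs all stages; each later one completes `pitch` later.
    Assumes an ideal pipeline with no stalls or task switches.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) * pitch


# The five stages named in the text: fetch, decode, operation, memory access, register store.
STAGES = 5
print(sequential_time(100, STAGES), pipelined_time(100, STAGES, pitch=2))
```

With many instructions the cost per instruction approaches the pipeline pitch, which is why any event that empties the pipeline, such as a slow task switch, directly costs throughput.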

When task switching occurs during such pipeline processing, the values of the program counter, status register, and data registers must be saved to a stack area so that the task currently being executed can be resumed later. Such save processing takes considerable time and can disturb the pipeline. In particular, when complex processing is performed, task switching occurs frequently even when the program execution state is viewed locally. As a result, the throughput of data processing does not improve as expected even though pipeline processing has been adopted. Japanese Patent Publication No. 62-237531 observes that processing becomes complicated when a plurality of programs are executed in a time-sharing manner by means of interrupts, and therefore provides two sets of program ROMs and program counters. The ROM access timing of each set is shifted, the ROM outputs are alternately selected by a selector, and the instructions are supplied to an instruction register, so that a plurality of programs can easily be executed in a time-division manner.

This technology executes programs in a strictly time-sharing manner by switching programs alternately based on a clock signal; it simply assumes that a plurality of programs are apparently executed in parallel. It does not consider switching tasks when specific events occur inside or outside the processor. A data processor generally used for equipment control, for example, must at least take such event-driven switching into account, reduce pipeline disturbance at task switching, and improve throughput.

In addition, a data processor with a superscalar architecture can execute a plurality of instructions simultaneously using a plurality of pipelines. Such a data processor must manage inter-instruction dependencies, such as a data conflict in which an instruction uses the execution result of another instruction. If a data conflict is found between instructions to be executed in parallel, one of the pipelines stops executing its instruction and waits for the other instruction to complete. The present inventors have found that, if the pipeline left idle by the data conflict is to be used for executing another task, the processing time associated with task switching must be shortened and pipeline disturbance must be minimized.

The data processor can be equipped with a cache memory to speed up operand access. When a cache line of the cache memory is rewritten, the corresponding memory contents must also be rewritten. If, for example, only the data processor uses the main memory, the rewritten contents need be reflected in the main memory only when the cache line is replaced. Such an operation is referred to as write-back.

However, a DMA (Direct Memory Access) controller connected outside the data processor may read stale data from a main memory in which the rewriting of the cache memory has not yet been reflected, and transfer that data. This is called the cache coherency problem. To solve it, the cache memory can adopt a write-through method, in which a write to main memory is also performed every time a memory write hits the cache, and can be given a non-blocking configuration using a write buffer. However, when memory writes for cache coherency occur frequently, the data processor consumes the data transfer capacity of the bus connecting the DMA controller and the main memory. As a result, the data transfer speed is limited when the DMA controller performs high-speed data transfer.
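The trade-off described above can be sketched with a toy cache model. It is an illustration only: the `Cache` class, its one-word lines, and the `bus_writes` counter are assumptions made for this example and do not describe the circuits of the specification.

```python
class Cache:
    """Toy cache with one-word lines, in either write-through or write-back mode."""

    def __init__(self, memory, write_through):
        self.memory = memory            # shared main memory (address -> value)
        self.write_through = write_through
        self.lines = {}                 # cached copies (address -> value)
        self.dirty = set()              # addresses rewritten but not yet written back
        self.bus_writes = 0             # number of writes that used the memory bus

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_through:
            self.memory[addr] = value   # every write also goes to main memory
            self.bus_writes += 1
        else:
            self.dirty.add(addr)        # main memory is updated only on replacement

    def evict(self, addr):
        """Replace a line; a write-back cache updates main memory only here."""
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.bus_writes += 1
            self.dirty.discard(addr)
        self.lines.pop(addr, None)


def external_dma_read(memory, addr):
    """An external DMA controller sees only main memory, never the cache."""
    return memory[addr]


memory = {0: 1}
cache = Cache(memory, write_through=False)
cache.write(0, 19)
print(external_dma_read(memory, 0))  # the DMA controller still sees the old value
```

In this model the write-back cache leaves the external reader with stale data until the line is replaced, while the write-through cache keeps memory current at the cost of one bus write per store.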

Therefore, to maintain cache coherency with write-back rather than write-through, a technique can be employed that detects an operation that would break cache coherency and performs the write-back at that time. For example, when the data processor detects (by bus snooping) a read access to data held in the cache memory, it interrupts that operation, writes back the data, and then enables the DMA transfer. However, this increases the burden on the data processor of detecting operations that would break cache coherency.

SUMMARY OF THE INVENTION An object of the present invention is to provide a data processor that can reduce processing associated with task switching and improve data processing capability.

Another object of the present invention is to provide a data processor which can minimize the disturbance of the pipeline at the time of task switching. It is still another object of the present invention to provide a superscalar architecture data processor that can switch a pipeline vacated by a data conflict to another task so that it is used effectively. Another object of the present invention is to provide a data processor capable of minimizing the load for maintaining cache coherency during DMA transfer when a write-back type cache memory is incorporated.

The above and other objects and novel features of the present invention will become apparent from the following description of the present specification.

Disclosure of the invention

In the present invention, as illustrated in FIG. 1, a data processor (1), in which an instruction fetch unit (10) fetches an instruction, an instruction decoder (12) decodes the instruction latched in an instruction register (11), and an instruction execution unit (13) executes the instruction based on the decoding result, includes: a plurality of task buffers (16, 17), each provided with a program storage area (160, 170) and a pointer (161, 171) for sequentially reading the instructions stored in that area; register means (S1, S2) dedicated to the respective task buffers and arranged in the instruction execution unit; a selector (18) for selectively connecting one of the plurality of task buffers and the instruction fetch unit to the instruction register; switching control means (19) for causing the selector to select the instruction fetch unit in an initial state and for selectively controlling the selector according to an internally or externally generated event; and interface means (21, BUS) enabling all or some of the plurality of task buffers to be written from the outside under the control of the instruction execution unit.

Each task buffer has its own pointer, and the instruction execution unit has register means uniquely assigned to each task buffer. Therefore, when switching between normal instruction processing, which follows the program supplied through the instruction fetch unit, and swap task processing, which follows the program in a task buffer, no processing is needed to save or restore the execution state of the interrupted normal instruction processing (for example, the values of the program counter or general-purpose registers) by accessing a stack area in external memory. This achieves faster task switching and less processing associated with task switching, contributing to an improvement in the data processing capability of the data processor.
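The difference between switching with dedicated register means and switching by saving state to a stack can be sketched as follows. Both classes, the register names, and the `memory_accesses` counter are hypothetical constructs for illustration; the specification describes hardware, not this code.

```python
class BankedExecutionUnit:
    """Models register sets S1 and S2 dedicated to the task buffers."""

    def __init__(self):
        self.banks = {
            "normal": {"PC": 0x100, "R0": 7},   # program counter and a general-purpose register
            "S1": {"R0": 0},                    # register set for task buffer 16
            "S2": {"R0": 0},                    # register set for task buffer 17
        }
        self.active = "normal"
        self.memory_accesses = 0                # stack traffic caused by switching

    def switch_to(self, bank):
        # Only the selected bank changes; no register value is copied anywhere.
        self.active = bank

    def registers(self):
        return self.banks[self.active]


class StackSavingUnit:
    """Models a conventional switch that saves and restores registers via a stack."""

    def __init__(self):
        self.regs = {"PC": 0x100, "R0": 7}
        self.stack = []
        self.memory_accesses = 0

    def switch_out(self, new_regs):
        self.stack.append(dict(self.regs))          # save every register to the stack
        self.memory_accesses += len(self.regs)
        self.regs = dict(new_regs)

    def switch_back(self):
        self.regs = self.stack.pop()                # restore every register from the stack
        self.memory_accesses += len(self.regs)


unit = BankedExecutionUnit()
unit.switch_to("S1")
unit.registers()["R0"] = 99       # the swap task uses its own registers
unit.switch_to("normal")
print(unit.registers(), unit.memory_accesses)
```

In the banked model the interrupted state survives untouched and the switch costs no memory traffic, which is the property the paragraph above relies on.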

When the instruction register, the instruction decoder, and the instruction execution unit pipeline the processing of instructions by advancing in units of pipeline stages, the above arrangement minimizes pipeline disturbance. The instruction execution unit outputs an instruction signal (LIR) for latching an instruction in the instruction register, and the selector supplies this signal to the instruction fetch unit or the task buffer selected by the switching control means. The instruction fetch unit may update the instruction to be supplied to the instruction register based on the instruction signal, and the task buffer may update its pointer based on the instruction signal. This simplifies control of the task buffers.

As a method of returning from swap task processing to normal instruction processing, the switching control means can return the selector to the state in which the instruction fetch unit is selected, based on the result of decoding, in the instruction decoder, an instruction supplied from the selected task buffer. That is, the return to normal instruction processing waits for the completion of the selected swap task processing. So that the swap task processing is completed with the highest priority, it is preferable that the switching control means, in response to the selection of a task buffer, output an interrupt disable signal (INH) that invalidates the interrupt signal input to the instruction execution unit, as illustrated in the drawings. Thus, no interrupt request is accepted during swap task processing.

To accept an interrupt while the switching control means (19) has selected a task buffer (16, 17), as in the illustrated data processor (1A), the selector (18) is returned to the state in which the instruction fetch unit is selected by the control signal ICNT, which corresponds to the acceptance of the interrupt by the instruction execution unit (13), and the previous task buffer selection state is saved.

The data processor (1) can be provided with a data cache memory (15) between the instruction execution unit and the outside. As shown in FIG. 20, this data processor is connected via a bus (4) to a plurality of peripheral circuits (2, 5), such as memory, to form a data processing system. When a DMA transfer control program, or a DMA transfer and data conversion control program, is set in a task buffer, the load on the data processor for solving the cache coherency problem can be reduced. That is, once the processing task of the processor has been switched to DMA transfer control processing via the selector, the instruction execution unit itself functions as the DMA controller. Therefore, when a DMA transfer is controlled between external memories of the data processor, or between an external memory and an external input/output circuit, the address signals and access control information for the DMA transfer always pass through the data cache memory. In other words, when the cache memory adopts the write-back method, even if a DMA transfer is started while a rewrite of the cache memory has not yet been reflected in the external memory, the data not yet reflected in the external memory is read from the cache memory into the instruction execution unit and transferred. As a result, the data processor need not detect a DMA transfer operation that would break cache coherency and perform a write-back beforehand, so the processing load of detecting such DMA transfer operations is reduced. Naturally, in the DMA transfer control function realized by the data processor, the transfer data is first read into the data processor.
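Why a DMA transfer executed by the processor itself avoids the coherency problem can be sketched as follows. This is a minimal model, assuming one-word cache lines and a hypothetical `dma_task` routine that stands in for a DMA transfer control program held in a task buffer.

```python
class WriteBackCache:
    """Toy write-back data cache with one-word lines."""

    def __init__(self, memory):
        self.memory = memory      # external memory, modeled as a list of words
        self.lines = {}           # cached copies (address -> value)
        self.dirty = set()        # addresses not yet reflected in external memory

    def read(self, addr):
        if addr not in self.lines:                # miss: fill from external memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                   # hit: the newest value is in the cache

    def write(self, addr, value):
        self.lines[addr] = value                  # external memory is not updated yet
        self.dirty.add(addr)

    def write_back_all(self):
        for addr in sorted(self.dirty):
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


def dma_task(cache, src, dst, length):
    """Copy `length` words; every access passes through the data cache."""
    for i in range(length):
        cache.write(dst + i, cache.read(src + i))


memory = [0] * 16
cache = WriteBackCache(memory)
cache.write(2, 42)                 # rewritten in the cache only; memory[2] is still 0
dma_task(cache, src=0, dst=8, length=4)
print(cache.read(10))              # the transferred word is the up-to-date value
```

Because the copy loop reads through the cache, it picks up the dirty word without any snooping or early write-back, at the price of moving every transferred word through the processor.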

The task switching means can also be applied to the superscalar data processors (1B, 1C) illustrated in FIGS. 14 and 16. That is, a data processor (1B, 1C) that is provided with a plurality of instruction execution control sequences, in which instructions latched in instruction registers (11A, 11B) are decoded by instruction decoders (12A, 12B) and executed by instruction execution units (13A, 13B), that includes an instruction fetch unit (10) for fetching instructions, and that can execute a plurality of instructions in parallel with the plurality of instruction execution control sequences, includes: a plurality of task buffers (16, 17), each provided with a program storage area and a pointer for sequentially reading the instructions stored in that area; register means (S1, S2) dedicated to the respective task buffers and arranged in a specific instruction execution unit; a selector (18) that selects one of the plurality of task buffers and the instruction fetch unit and connects it to the instruction register corresponding to the specific instruction execution unit; and switching control means (19) that causes the selector to select the instruction fetch unit in an initial state and selectively controls the selector according to an internally or externally generated event. In this data processor as well, normal instruction processing and swap task processing can be switched using one instruction execution control sequence, so that faster task switching is achieved and pipeline disturbance is minimized. Therefore, the high data processing capability originally intended by the superscalar architecture can be secured.

In a superscalar data processor that can execute a plurality of instructions in parallel, when dependencies between instructions, such as data conflicts, are arbitrated by hardware, a conflict management unit (25) is provided. Based on the decoding results from the instruction decoders included in the respective instruction execution control sequences, it examines the dependencies between instructions to determine whether instructions can be executed in parallel by different instruction execution control sequences, and delays the execution of an instruction that depends on the execution result of another instruction.
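The dependency check performed on decoded instructions can be sketched in a few lines. The `Decoded` record and the two-instruction issue rule are assumptions chosen for illustration; they cover only the case named in the text, where one instruction uses the result of another.

```python
from typing import NamedTuple, Optional, Tuple


class Decoded(NamedTuple):
    """Hypothetical decode result: the register written and the registers read."""
    dest: Optional[str]
    sources: Tuple[str, ...]


def depends_on(later: Decoded, earlier: Decoded) -> bool:
    """True if `later` reads the register that `earlier` writes (a data conflict)."""
    return earlier.dest is not None and earlier.dest in later.sources


def issue_pair(first: Decoded, second: Decoded) -> Tuple[bool, bool]:
    """Decide which of two decoded instructions may execute in the same cycle.

    The first instruction always proceeds; the second is delayed when it
    depends on the first one's result.
    """
    return (True, not depends_on(second, first))


add = Decoded(dest="R3", sources=("R1", "R2"))   # R3 = R1 + R2
sub = Decoded(dest="R5", sources=("R3", "R4"))   # R5 = R3 - R4, needs the result of add
print(issue_pair(add, sub))
```

A delayed second instruction leaves its pipeline idle for that cycle, which is the slot the following paragraphs propose to hand to a swap task.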

At this time, as shown in FIG. 16, when the conflict management unit delays the execution of a specific instruction because of a data conflict or the like, the switching control means causes the selector (18) to select a task buffer in response to the control signal 250 that reports the delay. The instruction execution control sequence, or pipe, whose processing has been interrupted can thereby be switched from normal instruction processing to swap task processing and used effectively. In particular, as described above, task switching does not require saving the execution state of the normal instruction processing that is interrupted halfway, so processing can move over to the swap task processing without that overhead.

A contention state such as the data conflict is determined by the conflict management unit (25) based on the result of instruction decoding. At that point, the instruction whose processing is to be delayed has already been decoded, and the processing is then switched to swap task processing. However, if the interrupted normal instruction processing and the newly started swap task processing use the same instruction register and instruction decoder, then, as illustrated in FIG. 17, the same instruction as that of the instruction fetch (In) at pipeline stage m in pipe 1 must be fetched again at stage m+2 of pipe 1, and the same instruction as that of the instruction decode (Dn) at pipeline stage m+1 in pipe 1 must be decoded again at stage m+3 of pipe 1. In this sense the pipeline is disturbed: when returning to the interrupted normal instruction processing after the swap task processing, the instruction must be restarted from the instruction fetch.

To avoid any pipeline disturbance when switching from normal instruction processing to swap task processing because of the data conflict described above, an instruction register (11C) and an instruction decoder (12C) dedicated to swap task processing can be added, as in the data processor (1D).

That is, the data processor (1D) is provided with a plurality of instruction execution control sequences, in which instructions latched in instruction registers (11A, 11B) are decoded by instruction decoders (12A, 12B) and executed by instruction execution units (13A, 13B), includes an instruction fetch unit (10) for fetching instructions, and can execute a plurality of instructions in parallel with the plurality of instruction execution control sequences. This data processor (1D) includes: a plurality of task buffers (16, 17), each having a program storage area and a pointer for sequentially reading the instructions stored in that area; a specific-task instruction register (11C) dedicated to the plurality of task buffers; a specific-task instruction decoder (12C) for decoding the instruction latched in the specific-task instruction register; register means (S1, S2) dedicated to the respective task buffers and arranged in a specific instruction execution unit; a first selector (18) that selects one of the plurality of task buffers and the instruction fetch unit and connects it to the instruction register corresponding to the specific instruction execution unit; a second selector (26) that selects one of the plurality of task buffers and connects it to the specific-task instruction register; a third selector (27) that selectively connects the output of the instruction decoder corresponding to the specific instruction execution unit or the output of the specific-task instruction decoder to the specific instruction execution unit; a conflict management unit (25) that, based on the decoding results from the instruction decoders included in the respective instruction execution control sequences, examines the dependencies between instructions to determine whether instructions can be executed in parallel by different instruction execution control sequences, delays the execution of a specific instruction that depends on the execution result of another instruction, and, when it delays the specific instruction, causes the third selector to select the specific-task instruction decoder; and switching control means (19) that, in the initial state, causes the first selector to select the instruction fetch unit and holds the second selector in the non-selected state, controls the selection of the first selector according to an internally or externally generated event, and, in response to the selection of the specific-task instruction decoder by the third selector, causes the second selector to select a task buffer according to the internally or externally generated event.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a block diagram of a data processor according to a first embodiment of the present invention.

FIG. 2 is a block diagram of an example of an instruction fetch unit.

FIG. 3 is a block diagram showing a first example of a swap task buffer.

FIG. 4 is a block diagram showing a second example of a swap task buffer.

FIG. 5 is a block diagram showing a third example of a swap task buffer.

FIG. 6 is a block diagram showing a fourth example of a swap task buffer.

FIG. 7 is an explanatory diagram of an example of a register set included in the instruction execution unit.

FIG. 8 is an explanatory diagram of an example of a task switching operation in the data processor according to the first embodiment.

FIG. 9 is an explanatory diagram of an example of a switching operation between normal instruction processing and interrupt processing.

FIG. 10 is an example timing chart showing the relationship between task switching and the pipeline in the data processor according to the first embodiment.

FIG. 11 is an operation timing chart of an example of the data processor according to the first embodiment, in which no interrupt is accepted during a swap task.

FIG. 12 is a block diagram of a data processor according to a second embodiment of the present invention.

FIG. 13 is an operation timing chart of an example of the data processor according to the second embodiment, in which an interrupt is accepted during a swap task.

FIG. 14 is a block diagram of a data processor according to a third embodiment of the present invention.

FIG. 15 is an example timing chart showing the contents of control and task switching control when a data conflict occurs in the data processor according to the third embodiment.

FIG. 16 is a block diagram of a data processor according to a fourth embodiment of the present invention.

FIG. 17 is a timing chart showing the contents of task switching control when a data conflict occurs in the data processor according to the fourth embodiment.

FIG. 18 is a block diagram of a data processor according to a fifth embodiment of the present invention.

FIG. 19 is a timing chart showing the task switching control performed by the data processor according to the fifth embodiment when a data conflict occurs.

FIG. 20 is an example block diagram of a data processing system to which the data processor of the present invention is applied.

FIG. 21 is an explanatory diagram showing an example of a task based on the DMA transfer control and data conversion control program.

FIG. 22 is an explanatory diagram showing an example of the minimum unit of the program description of the DMA transfer control and data conversion control program.

FIG. 23 is a block diagram of an example of a data processing system including a cache memory employing a write-back method and a DMA controller arranged outside the data processor.

BEST MODE FOR CARRYING OUT THE INVENTION

FIG. 1 shows a block diagram of a data processor according to the first embodiment of the present invention. Although not particularly limited, the data processor 1 shown in FIG. 1 is formed on a single semiconductor substrate, such as single-crystal silicon, by a known semiconductor integrated circuit manufacturing technique.

In FIG. 1, 10 is an instruction fetch unit, 11 is an instruction register, 12 is an instruction decoder, 13 is an instruction execution unit, 14 is an instruction cache memory, 15 is a data cache memory, 16 and 17 are swap task buffers shown as representatives, 18 is a selector, 19 is a switching control circuit, and 20 is a circuit block that generically denotes built-in peripheral modules.

The instruction execution unit 13 includes a program counter PC, general-purpose registers GR, register sets S1 and S2 individually assigned to the swap task buffers 16 and 17, an interrupt control circuit 131, a sequence control circuit 132, an arithmetic circuit 133, and so on.

In the data processor 1 of the present embodiment, the instruction register 11, the instruction decoder 12, and the instruction execution unit 13 advance the processing in units of pipeline stages and execute instructions by pipeline processing. The operation cycles of the instruction register 11, the instruction decoder 12, and the instruction execution unit 13 are controlled by the sequence control circuit 132 in synchronization with the operation reference clock signal (not shown) of the processor 1.

Although not particularly limited, the instruction execution unit 13 is interfaced with the outside through a data cache memory 15 connected to the internal bus BUS. The data cached by the data cache memory 15 is that of the external memory 2 and the like. The data cache memory 15 is illustrated as a circuit block including a cache data section, a cache tag section, and a cache controller (not shown). The cache data section holds a part of the data held in the external memory 2 or the like. The cache tag section holds a part of the address (address tag) as a cache tag in association with the data held by the cache data section. On a cache hit in an external access, the cache controller outputs the hit data from the cache data section to the internal bus BUS, or writes the data into the hit entry of the cache data section. On a cache miss, the data read from the external memory 2 or the like is supplied to the internal bus BUS, or the external memory 2 or the like is accessed for writing. A cache line can be replaced on a cache miss. Although not particularly limited, this cache controller writes the contents of the cache data section rewritten on a cache hit back to the external memory 2 or the like only when the cache line is replaced; that is, it operates by write-back.

The program counter PC holds the address of the instruction to be executed next. Although not particularly limited, the instruction fetch unit 10 prefetches instructions predicted to be executed in the future based on the value of the program counter PC (for example, the instruction specified by the program counter PC and a plurality of instructions following it). The instructions to be fetched are stored in the external memory 3, although this is not a particular limitation. In this embodiment, an instruction cache memory 14 is arranged between the external memory 3 and the instruction fetch unit 10.

The instruction cache memory 14 is shown as a circuit block including a cache data section, a cache tag section, and a cache controller (not shown). The cache data section stores a part of the instructions stored in the external memory 3 or the like. The cache tag section holds a part of the address (address tag) as a cache tag in association with the instruction held by the cache data section. On a cache hit in a memory access by the instruction fetch unit 10, the cache controller transfers the instruction held in the cache data section to the instruction fetch unit 10; on a cache miss, it reads the instruction from the external memory 3 and supplies it to the instruction fetch unit 10.

Although not particularly limited, the instruction fetch unit 10 has a first-in, first-out (FIFO) buffer function and can prefetch a plurality of instructions following the value of the program counter PC. For example, as shown in FIG. 2, four latches 100A to 100D are arranged in series, and each latch can take in an instruction from the outside or from the instruction cache memory 14 directly via the selectors 101A to 101C, without passing through the preceding latch. Reference numeral 102 denotes a control circuit for instruction fetch. It outputs the address of the instruction to be fetched based on the value of the program counter PC and holds the fetched instructions in the latches 100A to 100D so that they are output first-in, first-out. Although not particularly limited, the latches 100A to 100D latch instructions two words at a time, whereas the instruction decoder 12 decodes instructions one word at a time. Accordingly, the output of the latch 100D is divided into a low-order word and a high-order word by the selector 103 and output.
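The prefetch buffer behaviour can be sketched as a four-entry queue whose head entry is emitted one word at a time. This is a simplified model: the `PrefetchQueue` class is hypothetical, the latch chain and bypass selectors are abstracted into a `deque`, and emitting the high-order word before the low-order word is an assumption, since the text does not state the order.

```python
from collections import deque


class PrefetchQueue:
    """First-in, first-out prefetch buffer holding two-word entries."""

    DEPTH = 4  # four latches, 100A to 100D in the text

    def __init__(self):
        self.entries = deque()    # each entry is a (high_word, low_word) pair
        self.take_low = False     # True when the high word of the head was already output

    def can_fetch(self) -> bool:
        return len(self.entries) < self.DEPTH

    def fetch(self, high_word, low_word):
        """Store one two-word entry fetched from memory or the instruction cache."""
        if not self.can_fetch():
            raise RuntimeError("prefetch queue is full")
        self.entries.append((high_word, low_word))

    def next_word(self):
        """Output one word to the decoder, as the word selector would."""
        high, low = self.entries[0]
        if self.take_low:
            self.entries.popleft()   # both halves consumed, release the entry
            self.take_low = False
            return low
        self.take_low = True
        return high


queue = PrefetchQueue()
queue.fetch("I0", "I1")
queue.fetch("I2", "I3")
print([queue.next_word() for _ in range(4)])
```

The queue decouples the two-word fetch width from the one-word decode width, so the fetch side can run ahead of the decoder by up to four entries.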

Each of the swap task buffers 16 and 17 has a program storage area 160, 170 and a pointer 161, 171 for sequentially reading out the instructions stored in that storage area. Although not particularly limited, the program storage area 160 of the swap task buffer 16 can be written by the instruction execution unit 13 via the internal bus BUS. The program storage area 170 of the swap task buffer 17 can be written via a serial interface 21 controlled by the instruction execution unit 13 (the control lines are shown in the figure).

Examples of the swap task buffers 16 and 17 are shown in the figures. In the example of FIG. 3, the storage area 160 (170) consists of a shift register and a selector: a shift register in which a plurality of parallel input/output latches LAT are cascaded, and a selector SEL that selects one bit at a time from the parallel output of each latch LAT and outputs the selected bits in parallel. The pointer 161 (171) causes the selector SEL to select the outputs of the latches LAT in order from the upper or lower side. For example, if n stages of latches LAT of m bits each are provided, instructions can be output sequentially m times in units of n bits. Data is written to the shift register under the control of the serial interface 21 or the instruction execution unit 13. The number of stages of the latches LAT is determined according to the number of bits of an instruction; FIG. 4 shows a configuration in which the number of latch stages differs from that of FIG. 3.

In the example of FIG. 5, a RAM (Random Access Memory) consisting of a memory cell array MCA1, in which dynamic or static memory cells are arranged in a matrix, and an address decoder ADEC1 is used as the storage area 160. The pointer 161 generates the access address for the RAM, and the instruction execution unit 13 controls writing to the RAM. In the example of FIG. 6, a ROM (Read Only Memory) consisting of a memory cell array MCA2, in which nonvolatile memory elements are arranged in a matrix, and an address decoder DEC2 is used as the storage area 160, and the pointer 161 generates the access address for the ROM.

The swap task buffers 16 and 17 each store a program composed of an instruction sequence that implements one self-contained process. If a unit of processing performed by a specific instruction sequence is defined as a task, a processing program for a specific task is stored; for example, a processing program for DMA transfer or a processing program for data compression and decompression is set. Although not particularly limited, the processing programs can be loaded into the swap task buffers 16 and 17 via the serial interface 21 or the instruction execution unit 13 at system initialization, such as a power-on reset.
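The shift-register form of the task buffer, with n latches of m bits read out as m instructions of n bits, can be sketched as follows. The class name, the list-of-bits representation, and the wrap-around of the pointer are assumptions made for this illustration.

```python
class ShiftRegisterTaskBuffer:
    """n cascaded latches of m bits; the selector takes one bit from each latch."""

    def __init__(self, latches):
        # latches: list of n bit-lists, each m bits long
        m = len(latches[0])
        if any(len(latch) != m for latch in latches):
            raise ValueError("all latches must hold the same number of bits")
        self.latches = latches
        self.pointer = 0          # which bit position the selector currently picks

    def read_instruction(self):
        """Return the n-bit instruction at the current pointer position."""
        return [latch[self.pointer] for latch in self.latches]

    def advance(self):
        """Step the pointer; wrapping to the start is an assumption of this sketch."""
        self.pointer = (self.pointer + 1) % len(self.latches[0])


# n = 3 latches of m = 2 bits each: two instructions of three bits.
buffer = ShiftRegisterTaskBuffer([[1, 0], [0, 1], [1, 1]])
print(buffer.read_instruction())
buffer.advance()
print(buffer.read_instruction())
```

The instruction width is set by the number of latches and the program length by the latch width, which matches the statement that the number of latch stages is chosen according to the number of bits of an instruction.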

The selector 18 selects one of the swap task buffers 16 and 17 and the instruction fetch unit 10 and connects it to the instruction register 11. This connection is controlled by the switching control circuit 19. The switching control circuit 19 causes the selector 18 to select the instruction fetch unit 10 at the initialization reset of the data processor 1. Thereafter, it causes the selector 18 to select the output of the swap task buffer 16 or 17 in response to a predetermined event generated internally or externally, for example the interrupt signal 22 from the built-in peripheral circuit module 20 or the notification signal 23 reporting the occurrence of a predetermined external event. Which swap task buffer is selected can be controlled by providing the switching control circuit 19 with a correspondence table between event sources and swap task buffers, or by assigning a swap task buffer to each event occurrence notification signal.

Although not particularly limited, the instruction execution unit 13 outputs an instruction signal LIR that causes the instruction register 11 to latch an instruction. The instruction register 11 latches the instruction in synchronization with the instruction signal LIR. At this time, the selector 18 supplies the instruction signal LIR to whichever of the instruction fetch unit 10 and the swap task buffers 16 and 17 is selected by the switching control circuit 19. On receiving the instruction signal LIR, the instruction fetch unit 10 updates the instruction to be supplied to the instruction register 11. Likewise, on receiving the instruction signal LIR, the swap task buffers 16 and 17 update the pointers 161 and 171. As a result, the pointer 161 or 171 of the swap task buffer 16 or 17 selected by the selector 18 is sequentially updated, and the instruction corresponding to the pointer value is supplied from the storage area 160 or 170 to the instruction register 11.

The switching control circuit 19 recognizes the end of execution of the program stored in the swap task buffer 16 or 17 by means of the end signal 120, which is output when the instruction executed last in the program is decoded by the instruction decoder 12. Upon receiving this decode result (end signal 120), the switching control circuit 19 returns the selector 18 to the state in which the instruction fetch unit 10 is selected.
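The interplay of the selector 18, the switching control circuit 19, the instruction signal LIR, and the end signal 120 described above can be sketched as follows. This is a minimal behavioural model, not the disclosed circuit: the class and function names, the string labels, the `"END"` marker standing for the last instruction of a swap task, the rule that an event is honoured only while the fetch unit is selected, and the pointer wrap-around are all assumptions made for illustration.

```python
FETCH, BUF16, BUF17 = "fetch_unit_10", "swap_buffer_16", "swap_buffer_17"


class SwitchingControl:
    """Hypothetical model of switching control circuit 19 driving selector 18."""

    def __init__(self, event_table):
        # Correspondence table: event source -> swap task buffer.
        self.event_table = dict(event_table)
        self.selected = FETCH  # state after initialization reset

    def notify_event(self, source):
        target = self.event_table.get(source)
        # Assumption: a swap task is started only from normal instruction processing.
        if target is not None and self.selected == FETCH:
            self.selected = target

    def end_signal(self):
        # End signal 120: return to the instruction fetch unit.
        self.selected = FETCH


class InstructionSource:
    """Program storage plus a pointer that advances only when LIR is received."""

    def __init__(self, program):
        self.program = list(program)
        self.pointer = 0

    def on_lir(self):
        instr = self.program[self.pointer]
        self.pointer = (self.pointer + 1) % len(self.program)  # assumed wrap-around
        return instr


def run(ctrl, sources, events, cycles):
    """Return the sequence of instructions latched into the instruction register."""
    trace = []
    for cycle in range(cycles):
        if cycle in events:
            ctrl.notify_event(events[cycle])
        instr = sources[ctrl.selected].on_lir()  # LIR goes only to the selected source
        trace.append(instr)
        if instr == "END":  # the decoder recognizes the last instruction of the task
            ctrl.end_signal()
    return trace
```

Because LIR reaches only the selected source, the position of the normal instruction stream is left untouched while a swap task runs, so normal processing continues from where it stopped once the end signal is issued.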

FIG. 7 shows an example of the register configuration of the instruction execution unit 13. The general-purpose register GR includes the registers SR and R0 to R15. SR is assigned as a status register, R0 to R7 are assigned as data registers and address registers, and R8 to R15 are assigned as data registers, address registers, a stack pointer, and the like. The register set S1 includes the registers S1SR and S1R0 to S1R7, and the register set S2 includes the registers S2SR and S2R0 to S2R7. These register sets S1 and S2 are used in place of the registers SR and R0 to R7 of the general-purpose register GR and have unique register addresses. The register set S1 is dedicated to the execution of the program stored in the swap task buffer 16, and the register set S2 is dedicated to the execution of the program stored in the swap task buffer 17. The registers SR and R0 to R7 in the general-purpose register GR are assigned to the execution of the instructions output from the instruction fetch unit 10.

Although not particularly limited, which of the registers SR and R0 to R7 of the general-purpose register GR, the register set S1, and the register set S2 is used is determined by the register number and the type of task. For example, it is specified in the operand field of the instruction. When the output of the instruction fetch unit 10 is selected, the instruction execution unit 13 uses the registers SR and R0 to R15 for instruction execution. When the output of the swap task buffer 16 is selected, the instruction execution unit 13 uses the registers S1SR and S1R0 to S1R7, and when the output of the swap task buffer 17 is selected, it uses the registers S2SR and S2R0 to S2R7.

As described above, the swap task buffers 16 and 17 have their own pointers 161 and 171, respectively, and the register sets S1 and S2 are uniquely assigned to the respective swap task buffers 16 and 17. Therefore, when the task to be executed is switched between the instruction fetch unit 10 and the swap task buffers 16 and 17, there is no need to access a storage area such as the external memory 2 in order to save or restore the values of the program counter PC and the general-purpose register GR.
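The effect of dedicating a register set to each instruction source can be illustrated with a short model. It is a sketch only: the `RegisterFile` class, the string labels for the instruction sources, and the use of dictionaries as register banks are hypothetical, and the model enforces bank membership in software, whereas the description states that the register is specified by the instruction's own operand field.

```python
FETCH, BUF16, BUF17 = "fetch_unit_10", "swap_buffer_16", "swap_buffer_17"


class RegisterFile:
    """Hypothetical model of GR, S1 and S2 as separate, uniquely addressed banks."""

    def __init__(self):
        self.banks = {
            FETCH: {name: 0 for name in ["SR"] + [f"R{i}" for i in range(16)]},
            BUF16: {name: 0 for name in ["S1SR"] + [f"S1R{i}" for i in range(8)]},
            BUF17: {name: 0 for name in ["S2SR"] + [f"S2R{i}" for i in range(8)]},
        }

    def write(self, source, reg, value):
        bank = self.banks[source]
        if reg not in bank:
            raise KeyError(f"{reg} is not available to {source}")
        bank[reg] = value

    def read(self, source, reg):
        return self.banks[source][reg]
```

In this model a write performed by a swap task never touches SR or R0 to R15, so the state of normal instruction processing survives a task switch without any save or restore step.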

FIG. 8 shows an example of the task switching operation. Suppose that, during execution of instructions from the instruction fetch unit 10 (normal instruction processing), execution of the program stored in the swap task buffer 16 (swap task 1) is requested, for example by the signal 23. The switching control circuit 19 then switches the selection state of the selector 18 to the swap task buffer 16 in synchronization with the switching of the pipeline stage. As a result, the swap task buffer 16 outputs the first instruction of swap task 1, designated by the pointer 161, in synchronization with the instruction signal LIR, and the instruction register 11 latches it. When executing the swap task, the instruction execution unit 13 uses the register set S1 specified by the instruction description of the task. It is therefore possible to move to the execution of swap task 1 without saving the program counter PC and the registers SR and R0 to R7. When the last instruction of swap task 1 is decoded by the instruction decoder 12, the switching control circuit 19 causes the selector 18 to select the instruction fetch unit 10. At this time, the values of the program counter PC, the status register SR, and the data and address registers R0 to R7 are maintained as they were immediately before switching to swap task 1. The registers R8 to R15 are not used in executing the instructions stored in the swap task buffer 16. Therefore, no memory access for restoration is required when switching back to normal instruction processing. By contrast, in the case of switching between normal instruction processing and interrupt processing shown in FIG. 9, a memory access for saving and restoring is required every time switching is performed. This memory access for saving and restoring is an overhead of task switching or pipeline switching.

FIG. 10 shows an example of the state of the pipeline at the time of switching between normal instruction processing and swap task 1. Although not particularly limited, the data processor 1 of this embodiment has five pipeline stages. The pipeline stages in normal instruction processing are instruction fetch (In), instruction decode (Dn), operation (En), memory access (An), and register store (Sn). The pipeline stages in the swap task are instruction transfer (Cs), instruction decode (Ds), operation (Es), memory access (As), and register store (Ss).
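The two five-stage sequences named above can be laid out as a pipeline occupancy chart. The following sketch assumes one instruction enters the pipeline per pipeline pitch and that each stage takes exactly one pitch; the function name and the `"normal"` / `"swap"` labels are hypothetical.

```python
NORMAL_STAGES = ["In", "Dn", "En", "An", "Sn"]  # normal instruction processing
SWAP_STAGES = ["Cs", "Ds", "Es", "As", "Ss"]    # swap task


def pipeline_chart(sources):
    """Build chart[i][t]: the stage occupied by instruction i at pitch t ('' if none).

    sources[t] is 'normal' or 'swap' for the instruction entering at pitch t.
    """
    total = len(sources) + len(NORMAL_STAGES) - 1
    chart = []
    for start, src in enumerate(sources):
        stages = NORMAL_STAGES if src == "normal" else SWAP_STAGES
        row = [""] * total
        for k, label in enumerate(stages):
            row[start + k] = label
        chart.append(row)
    return chart
```

In such a chart a switch of instruction source changes only the label of the entry stage (In versus Cs); an instruction still enters at every pitch, which is the sense in which the switch leaves no gap in the pipeline.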

For example, when execution of swap task 1 is requested in pipeline stage m, the switching control circuit 19 switches the selector 18 at pipeline stage m + 1. In pipeline stage m + 1, the first instruction of swap task 1 is transferred to the instruction register 11 (Cs1). As described above, at the time of task switching it is possible to move to the execution of swap task 1 without saving the program counter PC and the registers SR and R0 to R7. Thereafter, processing advances stage by stage. The instruction execution unit 13 uses the general-purpose register GR for normal instruction processing but uses the register set S1 for the execution of swap task 1. Which register is used is determined by the description of each instruction. When the last instruction of swap task 1 is decoded (Ds1) by the instruction decoder 12 in pipeline stage n, the end signal 120 is supplied to the switching control circuit 19. The switching control circuit 19 causes the selector 18 to select the instruction fetch unit 10 at pipeline stage n + 1, so that from pipeline stage n + 1 onward instructions are supplied to the instruction register 11 from the instruction fetch unit 10. As described above, no memory access for restoration is required when switching back to normal instruction processing, and the pipeline is not disturbed when the task is switched between normal instruction processing and swap task 1.

In FIG. 1, the interrupt control circuit 131 is supplied with an interrupt request signal, of which IRQ is shown as a representative. The interrupt control circuit 131 accepts interrupt requests according to the interrupt priority set in it. In the present embodiment, the switching control circuit 19 enables the interrupt acceptance inhibit signal INH and supplies it to the interrupt control circuit 131 while the swap task buffer 16 or 17 is selected by the selector 18. The interrupt control circuit 131 does not accept any interrupt request while the interrupt acceptance inhibit signal INH is enabled. Therefore, when executing a task according to the program of the swap task buffer 16 or 17, the data processor 1 does not switch tasks until the execution of that task is completed. In other words, the task executed by the program stored in the swap task buffer 16 or 17 is given the highest execution priority. When an interrupt request is accepted, the interrupt control circuit 131 suspends the execution of the current instruction, saves the contents of the program counter PC, the status register SR, and the data and address registers R0 to R15 to the external memory 2 or the like, and then causes a branch to the processing program of the accepted interrupt request.

FIG. 11 shows an operation example in which, as described above, no interrupt is accepted during a swap task. If there is an interrupt request during normal processing, save processing for the return address and the like of the normal processing is performed, and processing then branches to interrupt processing. When the interrupt processing is completed, return processing is performed and normal processing resumes. When there is a request to execute swap task 1 during normal processing, the switching control circuit 19 causes the swap task buffer 16 to be selected, and execution shifts immediately to swap task 1. While swap task 1 is being executed, the interrupt acceptance inhibit signal INH is enabled, so that even if an interrupt request occurs, it is not accepted during that time. The inhibited interrupt request is accepted once the interrupt acceptance inhibit signal INH has been disabled after the execution of swap task 1 is completed. When branching to interrupt processing, the return address of the interrupted normal instruction processing and the register values are first saved, and processing then branches to the interrupt processing. After the interrupt processing is completed, the saved information is restored and processing returns to normal instruction processing.
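The deferral of interrupt acceptance while INH is enabled can be modelled in a few lines. This is a simplified sketch: the class and method names are hypothetical, interrupt priorities are ignored, and pending requests are assumed to be accepted in arrival order as soon as INH is disabled.

```python
class InterruptControl:
    """Hypothetical model of interrupt control circuit 131 gated by signal INH."""

    def __init__(self):
        self.inh = False    # interrupt acceptance inhibit signal INH
        self.pending = []   # requests received but not yet accepted
        self.accepted = []  # (request, cycle at which it was accepted)

    def request(self, irq, cycle):
        self.pending.append(irq)
        self._poll(cycle)

    def set_inh(self, enabled, cycle):
        # Driven by the switching control circuit while a swap task buffer is selected.
        self.inh = enabled
        self._poll(cycle)

    def _poll(self, cycle):
        # Assumption: pending requests are accepted in arrival order, priorities ignored.
        if not self.inh:
            while self.pending:
                self.accepted.append((self.pending.pop(0), cycle))
```

A request that arrives while INH is enabled stays pending and is accepted at the cycle in which INH is disabled, which corresponds to the behaviour described for FIG. 11.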

FIG. 12 shows a second embodiment of the data processor according to the present invention. The data processor 1A shown in the figure differs from the data processor 1 of FIG. 1 in that an interrupt can be accepted even while a task is being executed by the program stored in the swap task buffer 16 or 17. The other points are the same as in FIG. 1; circuit blocks having the same functions are given the same reference numerals, and their detailed description is omitted.

In the data processor 1A, when the interrupt control circuit 131 accepts an interrupt request, it enables the interrupt control signal ICNT and supplies it to the switching control circuit 19. If the selector 18 is selecting the swap task buffer 16 or 17 when the interrupt control signal ICNT is enabled, the switching control circuit 19 switches the selection state of the selector 18 to the instruction fetch unit 10. It also saves the information that identifies the swap task buffer 16 or 17 selected immediately before the switch (swap task selection information). The save destination is desirably a save latch (not shown) inside the switching control circuit 19. The information may instead be saved in a stack area such as the external memory 2, but in that case an external bus access cycle must be started to restore the swap task selection information when returning from the interrupt processing to the swap task, and the return to swap task processing is delayed accordingly.

If an interrupt is accepted during execution of a swap task, a branch from normal instruction processing to the swap task has already been made beforehand. It must therefore be possible to return to the interrupted normal instruction processing after the interrupt processing is completed. For this reason, after the selector 18 has been switched and the swap task selection information has been saved, the return address and the register information of the currently interrupted normal instruction processing are saved, and processing then branches to the interrupt processing program.

FIG. 13 shows an example of operation when an interrupt is accepted during a swap task. If there is an interrupt request in the middle of normal instruction processing, save processing for the return address and the like is performed, processing branches to the interrupt processing, and when the interrupt processing ends, return processing is performed and normal instruction processing resumes. When there is a request to execute swap task 1 during normal instruction processing, the switching control circuit 19 causes the selector 18 to select the swap task buffer 16, and execution proceeds immediately to swap task 1. The interrupt control circuit 131 can accept an interrupt even during execution of swap task 1; on accepting one, it enables the interrupt control signal ICNT and supplies it to the switching control circuit 19. In response, the switching control circuit 19 switches the selection state of the selector 18 to the instruction fetch unit 10 and saves the swap task selection information identifying the swap task buffer selected at that time. The instruction execution unit 13, having accepted the interrupt, then saves to the stack area the return address and the register information of the normal instruction processing that was interrupted earlier (S1), and branches to the interrupt processing program. When the interrupt processing is completed (T1), the interrupt control signal ICNT is disabled, and the switching control circuit 19 causes execution of the interrupted swap task 1 to resume according to the saved swap task selection information. When the last instruction of swap task 1 is decoded by the instruction decoder 12, the end signal 120 is given to the switching control circuit 19, which then causes the selector 18 to select the output of the instruction fetch unit 10. The return processing (S2) following the interrupt processing is then started, the saved return address and register information of the normal instruction processing are restored, and normal instruction processing resumes. The return processing (S2) is thus deferred until the end of swap task 1, which is resumed after the end of the interrupt processing (T1). This is possible because, when the interrupt processing ends (T1), the switching control circuit 19 first switches the selector 18 to the swap task buffer 16 on the basis that swap task selection information has been saved.
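The order of events in FIG. 13 can be traced with a small state model of the switching control circuit and its save latch. The sketch is illustrative only: the class, method, and label names are hypothetical, and the save (S1) and return (S2) processing of the instruction execution unit is represented merely by log labels.

```python
FETCH, BUF16 = "fetch_unit_10", "swap_buffer_16"


class SwitchingControlWithSaveLatch:
    """Hypothetical model of switching control circuit 19 in data processor 1A."""

    def __init__(self):
        self.selected = FETCH
        self.save_latch = None  # swap task selection information

    def start_swap(self, buffer):
        self.selected = buffer

    def icnt_enabled(self):
        # Interrupt accepted: remember the interrupted swap task, select the fetch unit.
        if self.selected != FETCH:
            self.save_latch = self.selected
            self.selected = FETCH

    def icnt_disabled(self):
        # Interrupt processing finished: resume the saved swap task first, if any.
        if self.save_latch is not None:
            self.selected = self.save_latch
            self.save_latch = None

    def end_signal(self):
        self.selected = FETCH


def interrupt_during_swap(ctrl, buffer=BUF16):
    """Drive the controller through the FIG. 13 sequence and log the selected source."""
    log = []
    ctrl.start_swap(buffer)
    log.append(("swap task", ctrl.selected))
    ctrl.icnt_enabled()
    log.append(("save (S1), interrupt processing", ctrl.selected))
    ctrl.icnt_disabled()
    log.append(("swap task resumed", ctrl.selected))
    ctrl.end_signal()
    log.append(("return (S2), normal processing", ctrl.selected))
    return log
```

Keeping the selection information in an internal latch, as in this model, means that resuming the swap task needs no external bus access.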

FIG. 14 shows a third embodiment of the data processor according to the present invention. The data processor 1B shown in the figure has a superscalar architecture and can execute a plurality of instructions in parallel by means of two pipelines. That is, it has a first instruction execution control sequence, in which the instruction latched in the instruction register 11A is decoded by the instruction decoder 12A and executed by the instruction execution unit 13A, and a second instruction execution control sequence, in which the instruction latched in the instruction register 11B is decoded by the instruction decoder 12B and executed by the instruction execution unit 13B. Pipeline processing performed in the first instruction execution control sequence is referred to as pipe 0, and pipeline processing performed in the second instruction execution control sequence is referred to as pipe 1. LIRA is the instruction latch instruction signal for the instruction register 11A, and LIRB is the instruction latch instruction signal for the instruction register 11B; both correspond to the instruction signal LIR.

The instruction execution units 13A and 13B have dedicated sequence control circuits 132A and 132B, respectively, and dedicated arithmetic circuits. Dependencies between instructions, such as data conflicts between pipe 0 and pipe 1, are detected by the conflict management unit 25 on the basis of the decode results of the instruction decoders 12A and 12B. In other words, the conflict management unit 25 examines the dependency relationship to determine, from the decode results of the instruction decoders 12A and 12B, whether the instructions can be executed in parallel by pipe 0 and pipe 1, and controls the sequence control circuits 132A and 132B by means of the control signals ARBA and ARBB so as to delay the execution of an instruction that depends on the execution result of another instruction.
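The dependency check performed on the two decode results can be illustrated as follows. This is a sketch under stated assumptions: the `Decoded` record and the function name are hypothetical, pipe 0 is assumed to hold the earlier instruction, and only the case in which pipe 1 reads the register written by pipe 0 is checked.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Decoded:
    """Hypothetical decode result: destination register and source registers."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def pipe1_must_wait(pipe0: Decoded, pipe1: Decoded) -> bool:
    """Return True if pipe 1's instruction needs the result of pipe 0's instruction.

    Assumption: pipe 0 holds the earlier instruction, and only a read of pipe 0's
    destination register by pipe 1 is treated as a data conflict.
    """
    return pipe0.dest is not None and pipe0.dest in pipe1.srcs
```

In terms of the description, a True result corresponds to the case in which the conflict management unit 25 uses the control signal ARBB to delay the instruction in pipe 1.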

The interrupt control circuit 131, the program counter PC, and the general-purpose register GR are shared by both instruction execution units 13A and 13B. The register sets S1 and S2 are dedicated to the instruction execution unit 13B. The details are the same as those of the data processor in FIG. 1.

In the data processor 1B of the superscalar architecture, the selector 18, the switching control circuit 19, and the swap task buffers 16 and 17 are arranged in correspondence with the instruction execution control sequence of the instruction register 11B. As in the data processor of FIG. 1, an instruction fetch unit 10, an instruction cache memory 14, a built-in peripheral module 20, a data cache memory 15, and the like are provided. In FIG. 14, components having the same functions as those of FIG. 1 are denoted by the same reference characters, and their detailed description is omitted. In the case of FIG. 14, both swap task buffers 16 and 17 are designed so that the program is initially loaded via the internal bus BUS.

FIG. 15 shows an example of the contents of control and task switching control when a data conflict occurs in the data processor 1B.

For example, if the conflict management unit 25 detects a data conflict at the decode stage (m + 1) of the instructions latched in the instruction registers 11A and 11B at pipeline stage m, the execution of the instruction to be executed later is replaced by NOP (no operation) until the execution result of the instruction to be executed first is obtained. That is, the pipeline stages of pipe 1 are NOP until the result of the register store (Sn) of pipe 0 at pipeline stage (m + 4) can be used at the operation stage (En) of pipe 1 at stage (m + 4). When the execution of swap task 1 is requested at pipeline stage m + 3, the switching control circuit 19 switches the selection state of the selector 18 to the swap task buffer 16 at pipeline stage m + 4, and at pipeline stage m + 4 the first instruction of swap task 1 is transferred to the instruction register 11B in pipe 1 (Cs1). As described above, at the time of task switching it is possible to move to the execution of swap task 1 without saving the program counter PC and the registers SR and R0 to R7. Thereafter, processing proceeds sequentially through the pipeline stages of pipe 1. At this time, the instruction execution unit 13B uses the register set S1 to execute swap task 1. Which register is used is determined by the description of each instruction, as in the above example. When the last instruction of swap task 1 is decoded by the instruction decoder 12B at pipeline stage n + 1 in pipe 1, the end signal 120 is supplied to the switching control circuit 19. The switching control circuit 19 causes the selector 18 to select the instruction fetch unit 10 at pipeline stage n + 1, so that after pipeline stage n + 1 of pipe 1 instructions are supplied to the instruction register 11B from the instruction fetch unit 10. As a result, normal instruction processing is resumed in pipe 1. As described above, no memory access for restoration is required when switching back to normal instruction processing, and the pipeline is not disturbed when the task is switched between normal instruction processing and swap task 1.

FIG. 16 shows a fourth embodiment of the data processor according to the present invention. The data processor 1C shown in the figure has a superscalar architecture like the data processor 1B and can execute a plurality of instructions in parallel by means of two pipelines. It differs from the data processor 1B in that the occurrence of a data conflict during normal instruction processing by pipes 0 and 1 is one of the factors that cause switching to a swap task. The conflict management unit 25 supplies a control signal 250, synchronized with the occurrence of a data conflict, to the switching control circuit 19. In response, the switching control circuit 19 causes swap task 1 to be performed using the idle slots of pipe 1 that arise in normal instruction processing because of the data conflict. However, since there is only one set of the instruction register 11B and the instruction decoder 12B on the pipe 1 side, resuming the execution of the instruction whose execution was interrupted by the data conflict requires starting again from the instruction fetch. This control is performed by the sequence control circuit 132B. The rest of the configuration is the same as that of the data processor 1B in FIG. 14, so its detailed description is omitted.

FIG. 17 illustrates the task switching control performed when a data conflict occurs. For example, if the conflict management unit 25 detects a data conflict at the decode stage (m + 1) of the instructions latched in the instruction registers 11A and 11B at pipeline stage m, the execution of the instruction to be executed later is replaced by NOP (no operation) until the execution result of the instruction to be executed first is available. That is, until the result of the register store (Sn) of pipe 0 at pipeline stage (m + 4) becomes available at the operation stage (En) of pipe 1 at stage (m + 4), execution of normal instruction processing in the pipeline stages of pipe 1 is stopped. This is notified to the instruction execution unit 13B by the control signal ARBB. At this time, the conflict management unit 25 activates the control signal 250 and supplies it to the switching control circuit 19. In response, the switching control circuit 19 causes the selector 18 to select the swap task buffer 16. As a result, pipe 1 can process swap task 1 in pipeline stages m + 1 to m + 5. The period allowed for the processing of swap task 1 is the period during which the normal instruction processing of pipe 1 is interrupted by the data conflict; this period is controlled by the conflict management unit 25 and reflected in the control signal 250. When the control signal 250 is deactivated, the selection state of the selector 18 is returned to the original state for normal instruction processing (the state in which the instruction fetch unit 10 is selected). As described above, at the time of task switching it is possible to proceed to swap task 1 without saving the program counter PC and the registers SR and R0 to R7. At this time, the instruction execution unit 13B uses the register set S1 to execute swap task 1. Which register is used is determined by the description of each instruction, as in the above example.

In the example of FIG. 17, the conflict management unit 25 also detects a data conflict, in the same manner as described above, at the decode stage (m + 4) of the instructions latched in the instruction registers 11A and 11B at pipeline stage m + 3. Until the result of the register store (Sn) of pipe 0 at pipeline stage (m + 7) becomes available at the operation stage (En) of pipe 1 at stage (m + 7), execution of normal instruction processing in the pipeline stages of pipe 1 is halted, and pipe 1 processes swap task 1 instead. In this example, the processing of swap task 1 is fragmented and confined to the periods in which a data conflict occurs; it is therefore effective when applied to processing specific to the data conflict or to processing with no restriction on processing time. Further, the control signal 250 may be used as a control signal that defines the timing at which the swap task selected by the signals 22 and 23 is actually processed.
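The way swap task 1 is fragmented across the idle slots of pipe 1 can be sketched as a simple issue schedule. The model is deliberately simplified and its names are hypothetical: one instruction is issued per pitch, the list `stalled` stands for the state of the control signal 250 at each pitch, and the repeated instruction fetch that the data processor 1C needs when normal processing resumes is not modelled.

```python
def pipe1_issue(normal, swap, stalled):
    """Return what pipe 1 issues at each pitch.

    normal  -- normal instructions for pipe 1, in program order
    swap    -- instructions of swap task 1, in order
    stalled -- stalled[t] is True while control signal 250 is active at pitch t
    """
    normal_idx = 0
    swap_idx = 0
    issued = []
    for is_stalled in stalled:
        if is_stalled and swap_idx < len(swap):
            issued.append(swap[swap_idx])  # idle slot is used by the swap task
            swap_idx += 1
        elif not is_stalled and normal_idx < len(normal):
            issued.append(normal[normal_idx])
            normal_idx += 1
        else:
            issued.append("NOP")  # nothing left to issue from the selected source
    return issued
```

For example, with two separate stall periods the swap task advances only during those periods and normal processing continues in between, so the swap task completes without taking any pitch away from normal instruction processing.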

FIG. 18 shows a fifth embodiment of the data processor according to the present invention. The data processor 1D shown in the figure has a superscalar architecture like the data processor 1B and can execute a plurality of instructions in parallel by means of two pipelines. Like the data processor 1C, the data processor 1D treats the occurrence of a data conflict during normal instruction processing by pipe 0 and pipe 1 as one of the factors that cause switching to a swap task. It differs from the data processor 1C in that it has an instruction register 11C and an instruction decoder 12C dedicated to the swap task executed at that time. The input of the instruction register 11C is selected by the selector 26, and the output of the instruction decoder 12B or 12C is selected by the selector 27.

The conflict management unit 25 supplies the control signal 250, enabled in synchronization with the occurrence of a data conflict, to the switching control circuit 19 and the selector 27. As a result, the selector 27 selects the output of the instruction decoder 12C, and the control signal LIRB is supplied to the instruction register 11C as well. The instruction register 11B retains the instruction it currently holds, and the instruction register 11C instead is enabled to latch new instructions according to the control signal LIRB. Further, in response to the control signal 250 in the enabled state, the switching control circuit 19 causes the selector 26 to connect the swap task buffer 16 or 17 to the instruction register 11C. Which of the two is connected may be selectable or fixed. For example, it may be determined according to the operation mode set at the time of the initialization reset of the data processor.

For example, when swap task 1 is processed using the idle slots of pipe 1 that arise in normal instruction processing because of a data conflict, pipe 1 has its own instruction register 11C and instruction decoder 12C for the swap task. Therefore, when resuming the execution of an instruction whose execution was interrupted by the data conflict, it is not necessary to start over from the instruction fetch, unlike in the data processor 1C, and the pipeline is not disturbed. The rest of the configuration is the same as that of the data processor 1C, and its detailed description is omitted.

FIG. 19 illustrates the task switching control performed in the data processor 1D when a data conflict occurs. For example, when the conflict management unit 25 detects a data conflict at the decode stage (m + 1) of the instructions latched in the instruction registers 11A and 11B at pipeline stage m, the execution of the instruction to be executed later is replaced by NOP (no operation) until the execution result of the instruction to be executed first is obtained. That is, until the result of the register store (Sn) of pipe 0 at pipeline stage (m + 4) becomes available at the operation stage (En) of pipe 1 at stage (m + 4), execution of normal instruction processing in the pipeline stages of pipe 1 is stopped. The difference from FIG. 17 is that instruction fetch and decode need not be repeated for the operation stage (En) in pipe 1 at stage m + 4. The stopping of normal instruction processing in the pipeline stages of pipe 1 is notified to the instruction execution unit 13B by the control signal ARBB. At this time, the conflict management unit 25 activates the control signal 250 and supplies it to the switching control circuit 19. In response, the switching control circuit 19 causes the selector 18 to select the swap task buffer 16. As a result, pipe 1 can process swap task 1 in pipeline stages m + 1 to m + 5. The period allowed for the processing of swap task 1 is the period during which the normal instruction processing of pipe 1 is interrupted by the data conflict; this period is controlled by the conflict management unit 25 and reflected in the control signal 250. When the control signal 250 is deactivated, the selection state of the selector 18 is returned to the original state for normal instruction processing (the state in which the instruction fetch unit 10 is selected). As described above, at the time of task switching it is possible to move to the execution of swap task 1 without saving the program counter PC and the registers SR and R0 to R7. At this time, the instruction execution unit 13B uses the register set S1 to execute swap task 1. Which register is used is determined by the description of each instruction, as in the above example.

In the example of FIG. 19, the conflict management unit 25 also detects a data conflict, in the same manner as described above, at the decode stage (m + 4) of the instructions latched in the instruction registers 11A and 11B at pipeline stage m + 3. Until the result of the register store (Sn) of pipe 0 at pipeline stage (m + 7) becomes available at the operation stage (En) of pipe 1 at stage (m + 7), execution of normal instruction processing in the pipeline stages of pipe 1 is halted, and pipe 1 processes swap task 1 instead.

FIG. 20 shows an example of a data processing system to which the data processor 1 is applied. The external memory 2 and the input/output circuit 5 are shown as representative devices connected to the external bus 4 of the data processor 1. The external bus 4 includes an address bus ABUS, a data bus DBUS, and a control bus CBUS. In this system, the swap task buffer 16 of the data processor 1 stores a DMA transfer control and data conversion control program. The event that starts the DMA transfer control and data conversion control program is an interrupt signal 230 assigned to one of the control signals 23. This interrupt signal 230 is supplied from the input/output circuit 5. FIG. 21 shows an example of a task according to the DMA transfer control and data conversion control program. That is, when the interrupt signal 230 is supplied from the input/output circuit 5 to the switching control circuit 19, the processing program of the data processor 1 is switched to the DMA transfer control and data conversion control program stored in the swap task buffer 16. The task processed by this program reads data from the input/output circuit 5, subjects the read data to data conversion (for example, compression or coordinate conversion) in the instruction execution unit 13, and writes the converted data to a predetermined area of the memory 2. The read address and the write address are sequentially updated by the program for each data transfer and data conversion. FIG. 22 shows an example of the minimum unit of the program description of such a DMA transfer control and data conversion control program. As described above, task switching using the swap task buffer requires no save processing, unlike normal interrupt processing, and does not disturb the pipeline, so the processor can respond quickly to events that occur.
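The task of FIG. 21 (read, convert, write, update both addresses) can be outlined in software form. This is only a sketch of the loop structure described above, not the program of FIG. 22: the function name and parameters are hypothetical, the input/output circuit is represented by a list, the memory by a dictionary keyed by address, and the conversion by an arbitrary callable.

```python
def dma_convert_task(io_data, memory, write_addr, count, convert):
    """Transfer `count` items from the I/O source to memory, converting each one.

    io_data    -- sequence standing in for data read from input/output circuit 5
    memory     -- dict standing in for the destination area of memory 2
    write_addr -- first destination address
    convert    -- data conversion applied by the instruction execution unit
    Returns the updated (read_index, write_addr) pair.
    """
    read_index = 0
    for _ in range(count):
        data = io_data[read_index]           # read from the input/output circuit
        memory[write_addr] = convert(data)   # convert, then write to memory
        read_index += 1                      # update the read address
        write_addr += 1                      # update the write address
    return read_index, write_addr
```

A call such as `dma_convert_task([1, 2, 3], mem, 0x100, 3, lambda x: x * 2)` fills three consecutive locations starting at address 0x100 with the converted values; the doubling function merely stands in for compression or coordinate conversion.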

In the above embodiments represented by the data processor 1, when a DMA transfer control program is set in the swap task buffers 16 and 17, the burden on the data processor 1 for solving the cache coherency problem can be reduced compared with the system configuration illustrated in FIG. 23. In the system configuration shown in FIG. 23, when the cache memory 15 adopts the write-back method, cache coherency cannot be maintained if the DMA controller 6 starts a DMA transfer while a rewrite of the cache memory 15 has not yet been reflected in the external memory. The data processor 1E must therefore constantly monitor for the start of a DMA transfer operation that would not maintain cache coherency and, on detecting one, perform a write-back operation in advance; that is, the data processor 1E must bear the processing for detecting operations that do not maintain cache coherency. On the other hand, taking the data processor 1 in FIG. 1 as an example, when the processing task of the data processor 1 is switched to the DMA transfer control processing via the selector 18 or the like, the function of a DMA controller is realized by the instruction execution unit 13. Therefore, when DMA data transfer is controlled between memories external to the data processor 1, or between the external memory and the external input/output circuit, the address signals and access control information for DMA transfer control always pass through the data cache memory 15. As a result, when the cache memory 15 adopts the write-back method, even if a DMA transfer is started while a rewrite of the cache memory 15 has not been reflected in the external memory, the data not yet reflected in the external memory is read from the cache memory 15 into the instruction execution unit 13 and transferred. The data processor 1 therefore does not need to bear the burden of detecting operations that do not maintain cache coherency.
In the DMA transfer control function realized by the data processor 1, the transfer data is read into the data processor 1 on every transfer.
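
The coherency argument can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the cache memory 15 itself: `WriteBackCache` and both transfer functions are hypothetical names, and the cache has no eviction or write-back logic. It contrasts a DMA read that bypasses the cache, as with the external DMA controller 6 of FIG. 23, with a read issued by the instruction execution unit through the cache.

```python
class WriteBackCache:
    """Toy write-back cache: writes stay in the cache until written back."""

    def __init__(self, backing: dict) -> None:
        self.backing = backing  # stands in for the external memory
        self.lines = {}         # address -> value held in the cache
        self.dirty = set()      # addresses not yet reflected in memory

    def write(self, addr: int, value: int) -> None:
        self.lines[addr] = value
        self.dirty.add(addr)

    def read(self, addr: int) -> int:
        if addr not in self.lines:
            self.lines[addr] = self.backing[addr]
        return self.lines[addr]


def dma_bypassing_cache(memory: dict, src: int) -> int:
    """External DMA controller: reads external memory directly (may be stale)."""
    return memory[src]


def dma_through_cache(cache: WriteBackCache, src: int) -> int:
    """DMA run by the execution unit: every access goes through the cache."""
    return cache.read(src)
```

After a write that has not been written back, the bypassing read returns the old memory contents while the read through the cache returns the rewritten value, which is why no separate coherency monitoring is needed in the second case.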

Although the invention made by the present inventors has been specifically described above based on the embodiments, the present invention is not limited thereto, and it goes without saying that various modifications can be made without departing from the gist of the invention.

For example, the number of swap task buffers is not limited to that of the above embodiments and can be changed as appropriate. Also, the cache memory is not limited to a configuration in which the data cache memory and the instruction cache memory are separated, and may be a unified cache memory used for both instructions and data. The number of pipeline stages is not limited to the five stages of the above embodiments. In addition, the number of pipes that can operate in parallel in the superscalar processor is not limited to two and may be greater. The content of the swap task is likewise not limited and can be chosen as needed.

Industrial applicability

As described above, the data processor according to the present invention can be widely applied to various types of data processing systems, particularly systems in which tasks are switched frequently and systems that require improved capabilities, for example, a computer system for embedded device control, such as in a digital camera, that performs image data transfer and data compression as swap tasks.

Claims

1. A data processor in which an instruction fetch unit fetches an instruction, an instruction decoder decodes the instruction latched in an instruction register, and an instruction execution unit executes the instruction based on the decoding result, characterized by comprising:

a plurality of task buffers each having a program storage area and a pointer for sequentially reading instructions stored in the area;

register means dedicated to each of the task buffers and arranged in the instruction execution unit;

a selector for selectively connecting one of the plurality of task buffers and the instruction fetch unit to the instruction register;

switching control means for causing the selector to select the instruction fetch unit in an initial state and for selectively controlling the selector according to an event generated internally or externally; and

interface means for interfacing all or a part of the plurality of task buffers with the outside so that they can be written under the control of the instruction execution unit.

2. The data processor according to claim 1, wherein the instruction register, the instruction decoder and the instruction execution unit perform pipeline processing of instructions by advancing the processing in units of pipeline stages.

3. The data processor according to claim 2, wherein the instruction execution unit outputs an instruction signal for latching an instruction in the instruction register, the selector supplies the instruction signal to the instruction fetch unit or the task buffer selected by the switching control means, the instruction fetch unit updates the instruction to be supplied to the instruction register based on the instruction signal, and the task buffer updates the pointer based on the instruction signal.

4. The data processor according to claim 3, wherein the switching control means returns the selector to the state of selecting the instruction fetch unit based on the result of decoding an instruction supplied to the instruction decoder from the task buffer selected by the switching control means.

5. The data processor according to claim 3, wherein the switching control means outputs an interrupt disable signal for invalidating an interrupt signal input to the instruction execution unit in response to the selection of a task buffer.

6. The data processor according to claim 3, wherein, when a task buffer is selected, the switching control means, in response to acceptance of an interrupt by the instruction execution unit, returns the selector to the state of selecting the instruction fetch unit and saves the immediately preceding task buffer selection state.

7. The data processor according to claim 1, wherein a data cache memory is provided between the instruction execution unit and the outside.

8. A data processing system characterized by comprising: the data processor according to claim 7; an external data bus connected to the data processor; and a memory and an input/output circuit connected to the external data bus.

9. A data processor that includes a plurality of instruction execution control sequences, in each of which an instruction decoder decodes an instruction latched in an instruction register and an instruction execution unit executes the instruction, that includes an instruction fetch unit for fetching instructions, and that can execute instructions in parallel in the plurality of instruction execution control sequences, characterized by comprising:

a plurality of task buffers each having a program storage area and a pointer for sequentially reading instructions stored in the program storage area;

register means dedicated to each of the task buffers and arranged in a specific instruction execution unit;

a selector for selecting one of the plurality of task buffers and the instruction fetch unit and connecting it to the instruction register corresponding to the specific instruction execution unit; and

switching control means for causing the selector to select the instruction fetch unit in an initial state and for selectively controlling the selector according to an event generated internally or externally.

10. The data processor according to claim 9, wherein the instruction register, the instruction decoder and the instruction execution unit included in each instruction execution control sequence perform pipeline processing of instructions by advancing the processing in units of pipeline stages.

11. The data processor according to claim 10, further comprising a conflict management unit that, based on the instruction decoding results from the instruction decoders included in the respective instruction execution control sequences, examines the dependence between instructions to be executed in parallel in different instruction execution control sequences, and delays execution of an instruction that depends on the execution result of another instruction.

12. The data processor according to claim 11, wherein the switching control means causes the selector to select a task buffer when the conflict management unit delays execution of a specific instruction.

13. The data processor according to claim 11, wherein the instruction execution unit included in each of the instruction execution control sequences outputs an instruction signal for latching an instruction in the instruction register, the selector supplies the instruction signal output from the corresponding instruction execution unit to the instruction fetch unit or the task buffer selected by the switching control means, the instruction fetch unit updates the instruction to be supplied to the instruction register based on the instruction signal, and the task buffer updates the pointer based on the instruction signal.

14. A data processor that includes a plurality of instruction execution control sequences, in each of which an instruction decoder decodes an instruction latched in an instruction register and an instruction execution unit executes the instruction, that includes an instruction fetch unit for fetching instructions, and that can execute instructions in parallel in the plurality of instruction execution control sequences, characterized by comprising:

a plurality of task buffers each having a program storage area and a pointer for sequentially reading instructions stored in the area;

a specific task instruction register dedicated to the plurality of task buffers;

a specific task instruction decoder for decoding an instruction latched in the specific task instruction register;

register means dedicated to each of the task buffers and arranged in a specific instruction execution unit;

a first selector for selecting one of the plurality of task buffers and the instruction fetch unit and connecting it to the instruction register corresponding to the specific instruction execution unit;

a second selector for selecting one of the plurality of task buffers and connecting it to the specific task instruction register;

a third selector for selectively connecting the output of the instruction decoder corresponding to the specific instruction execution unit and the output of the specific task instruction decoder to the specific instruction execution unit;

a conflict management unit that, based on the instruction decoding results from the instruction decoders included in the respective instruction execution control sequences, examines the dependence between instructions to be executed in parallel in different instruction execution control sequences, delays execution of a specific instruction that depends on the execution result of another instruction, and causes the third selector to select the specific task instruction decoder when delaying execution of the specific instruction; and

switching control means that, in an initial state, causes the first selector to select the instruction fetch unit and controls the second selector to a non-selecting state, that selectively controls the first selector according to an event generated internally or externally, and that, in response to the selection of the specific task instruction decoder by the third selector, causes the second selector to select a task buffer according to an event generated internally or externally.