WO2010037205A1 - Serial-connected memory system with output delay adjustment

Publication number
WO2010037205A1
Authority
WO
WIPO (PCT)
Prior art keywords
command
clock signal
input
duty cycle
output
Prior art date
Application number
PCT/CA2009/001271
Other languages
French (fr)
Inventor
Hakjune Oh
Original Assignee
Mosaid Technologies Incorporated
Priority date
Filing date
Publication date
Priority claimed from US12/241,960 external-priority patent/US8161313B2/en
Priority claimed from US12/241,832 external-priority patent/US8181056B2/en
Application filed by Mosaid Technologies Incorporated filed Critical Mosaid Technologies Incorporated
Priority to EP09817125A priority Critical patent/EP2329496A4/en
Priority to JP2011528145A priority patent/JP2012504263A/en
Priority to CN200980138194.9A priority patent/CN102165529B/en
Publication of WO2010037205A1 publication Critical patent/WO2010037205A1/en

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/22 Read-write [R-W] timing or clocking circuits; Read-write [R-W] control signal generators or management
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C16/00 Erasable programmable read-only memories
    • G11C16/02 Erasable programmable read-only memories electrically programmable
    • G11C16/06 Auxiliary circuits, e.g. for writing into memory
    • G11C16/32 Timing circuits
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1051 Data output circuits, e.g. read-out amplifiers, data output buffers, data output registers, data output level conversion circuits
    • G11C7/1066 Output synchronization
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1051 Data output circuits, e.g. read-out amplifiers, data output buffers, data output registers, data output level conversion circuits
    • G11C7/1069 I/O lines read out arrangements
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/22 Read-write [R-W] timing or clocking circuits; Read-write [R-W] control signal generators or management
    • G11C7/222 Clock generating, synchronizing or distributing circuits within memory device

Definitions

  • the invention relates generally to solid state memory systems featuring a set of serial-connected memory devices.
  • NAND flash memory systems use a large number of parallel signals for commanding, addressing, and data transfer operations. This has been a very popular way of configuring memory systems and results in very fast system operation. This is particularly true for random access memory devices like DRAM (dynamic random access memory), SRAM (static random access memory), etc.
  • DRAM dynamic random access memory
  • SRAM static random access memory
  • the conventional NAND flash memory communicates with other components using a set of parallel input/output (I/O) pins, numbering 8 or 16 depending on the desired word configuration, which receive command instructions, receive input data and provide output data.
  • I/O input/output
  • This is commonly known as a parallel interface. High speed operation will cause well known
  • Such parallel interfaces use a large number of pins to read and write data. As the number of input pins and wires increases, so do a number of undesired effects. These effects include inter-symbol interference, signal skew and cross talk.
  • Serial-connected memory devices typically have serial in/out data pins along with two control signals that enable and disable a serial input port and a serial output port, respectively, in order to provide a memory controller with the maximum flexibility of serial data communication.
  • Some of these memory system configurations employ a shared bus topology for the system clock distribution, which is referred to as a 'common clock system' or 'multi-drop clocking system'.
  • Some of these architectures use a point-to-point serial-connected clocking architecture featuring a DLL (delay locked loop) or PLL (phase locked loop) in every memory chip in order to synchronize two clock signals in each memory device, one being an input clock received from a preceding device or controller and the other being an output clock transmitted to the next device.
  • the invention provides a method in a slave device of a plurality of serial-connected slave devices, the method comprising: receiving a command from a master device specifying an adjustment to a clock duty cycle;
  • the slave device is a memory device and the master device is a memory controller.
  • the method further comprises: receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device; receiving at least one input signal, the at least one input signal comprising at least the input clock signal; for each of the at least one input signal: generating a delayed version of the input signal in accordance with the command; outputting the delayed version of the input signal, the delayed version of the input clock signal comprising a delayed version of the duty cycle corrected clock signal.
  • receiving a command from a master device specifying an adjustment to a clock duty cycle comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, the command further containing data indicating how to adjust the duty cycle.
  • receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
  • the method further comprises: performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that matches a device address of the slave device; performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that is a broadcast device address.
  • generating a duty cycle corrected clock signal comprises: a) generating a half rate clock signal from the input clock signal; b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
  • the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
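As an illustration of steps a) to c) above, the following is a minimal sketch in Python that operates on sampled waveforms. It assumes the combiner is an exclusive-OR, which the text above does not state, and all function names and sample counts are hypothetical. Under that assumption the output duty cycle equals the selected delay divided by the input clock period, regardless of the input duty cycle.

```python
def make_clock(period, high, cycles):
    """Sampled clock: `high` samples at 1, then `period - high` samples at 0."""
    return [1 if (t % period) < high else 0 for t in range(period * cycles)]


def half_rate(clock):
    """Step a): toggle an output on every rising edge of the input clock."""
    out, state, prev = [], 0, 0
    for sample in clock:
        if sample == 1 and prev == 0:
            state ^= 1
        out.append(state)
        prev = sample
    return out


def delay(signal, n):
    """Step b): delay a sampled signal by n samples, padding with level 0."""
    return [0] * n + signal[:len(signal) - n]


def correct_duty(clock, selected_delay):
    """Step c): combine the half rate clock with its delayed copy.

    XOR is an assumed combiner, chosen because it produces one pulse of
    width `selected_delay` per edge of the half rate clock.
    """
    h = half_rate(clock)
    return [a ^ b for a, b in zip(h, delay(h, selected_delay))]


def duty_cycle(signal, period, skip_cycles=2):
    """Fraction of high samples, ignoring the first start-up cycles."""
    body = signal[period * skip_cycles:]
    return sum(body) / len(body)


if __name__ == "__main__":
    distorted = make_clock(period=20, high=7, cycles=12)
    corrected = correct_duty(distorted, selected_delay=10)
    print(duty_cycle(distorted, 20), duty_cycle(corrected, 20))
```

In this model, choosing a delay of half the input period restores a 50% duty cycle, and other delay selections give proportionally narrower or wider pulses.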
  • the invention provides a method in a memory system comprising a master device and a plurality of serial-connected slave devices comprising at least a first slave device and a last slave device, the method comprising: in the master device: a) outputting a first clock signal that functions as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command; in the first slave device of the plurality of serial-connected slave devices: a) receiving the first clock signal from the master device as the input clock signal of the first slave device; b) generating an output clock signal from the input signal; in each other slave device of the plurality of serial-connected slave devices: a) receiving the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generating an output clock signal from the input clock signal; in each of at least one of the
  • each slave device is a memory device and the master device is a memory controller.
  • the method further comprises: in the master device: a) outputting at least one output signal, the at least one output signal comprising the first clock signal to function as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) determining an amount of phase offset between the first clock signal and the second clock signal; d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
  • generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by any specified one of the plurality of serial-connected slave devices.
  • generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by all of the plurality of serial-connected slave devices.
  • receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, and containing data indicating how to adjust the duty cycle.
  • generating a duty cycle corrected clock signal comprises: a) generating a half rate clock signal from the input clock signal; b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
  • the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
  • the invention provides a slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising: a command input for receiving a command from a master device specifying an adjustment to a duty cycle; a clock input for receiving an input clock signal; a duty cycle correction circuit for generating a duty cycle corrected clock signal from the clock input in accordance with the control command; a clock output for outputting the duty cycle corrected clock signal.
  • the slave device is a memory device.
  • the command input is also for receiving a command from the master device specifying an adjustment to output delay; an output delay adjustment circuit for generating a delayed clock signal from the duty cycle corrected clock signal in accordance with the command; wherein the clock output for outputting the duty cycle corrected clock signal outputs the delayed clock signal.
  • the slave device further comprises: a command processing circuit that processes the command, wherein the command comprises: a command identifier that identifies the command as a duty cycle correction command; and data indicating how to adjust the duty cycle.
  • the slave device further comprises: a device address register; wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device address matches contents of the device address register.
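The command handling described above (a device address, a command identifier and data, with execution on an address match or, as described earlier for the method, on a broadcast address) can be sketched as follows. This is a minimal illustration only: the numeric values of `BROADCAST_ADDRESS` and `CMD_WRITE_DUTY_CYCLE_REGISTER`, the field widths, and the class names are hypothetical and are not taken from the text.

```python
from dataclasses import dataclass

# Hypothetical encodings; the text does not give numeric values.
BROADCAST_ADDRESS = 0xFF
CMD_WRITE_DUTY_CYCLE_REGISTER = 0x01


@dataclass(frozen=True)
class Command:
    device_address: int  # which slave device is to execute the command
    command_id: int      # identifies the kind of command
    data: int            # indicates how to adjust the duty cycle


class SlaveDevice:
    def __init__(self, device_address):
        self.device_address_register = device_address
        self.duty_cycle_register = 0  # assumed to hold the selected delay

    def is_addressed(self, cmd):
        """True on a match with the address register or on broadcast."""
        return cmd.device_address in (
            self.device_address_register,
            BROADCAST_ADDRESS,
        )

    def process(self, cmd):
        """Execute the command if addressed; return True if executed."""
        if not self.is_addressed(cmd):
            return False
        if cmd.command_id == CMD_WRITE_DUTY_CYCLE_REGISTER:
            self.duty_cycle_register = cmd.data
            return True
        return False


if __name__ == "__main__":
    ring = [SlaveDevice(address) for address in range(8)]
    for device in ring:
        device.process(Command(3, CMD_WRITE_DUTY_CYCLE_REGISTER, 5))
    print([device.duty_cycle_register for device in ring])
```

The sketch does not model how a command is forwarded from one device to the next around the ring; it only shows the per-device decision to execute or ignore a command.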
  • the duty cycle correction circuit comprises: a) a clock divider circuit that generates a half rate clock signal from the input clock signal; b) a delay circuit that delays the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) a combiner that
  • the invention provides a system comprising: a plurality of serial-connected devices acting as slave devices according to claim 13 comprising at least a first slave device and a last slave device; a master device connected to the first slave device and to the last slave device; the master device configured to output a first clock signal that functions as an input clock signal of the first slave device; a clock input for receiving a second clock signal that is an output clock signal of the last slave device; a duty detector that determines a duty cycle of the second clock signal; a command generator that generates a duty cycle correction command specifying an adjustment to a clock duty cycle as a function of the duty cycle of the second clock signal; wherein, the first slave device of the plurality of serial-connected devices acting as slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected devices acting as slave devices: a) receives the output clock signal
  • the system is a memory system
  • each slave device is a memory device
  • the master device is a memory controller
  • the memory system further comprises: a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal; wherein the command generator also generates an output delay adjustment command as a function of the amount of phase offset; wherein, the first slave device of the plurality of serial-connected slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected slave devices: a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generates an output clock signal from the input clock signal; wherein at least one of the plurality of serial-connected slave devices: a) receives the output delay adjustment command; b) generates the output clock signal of the device by delaying the input clock signal of the device in accordance with the control command.
  • a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal
  • the command generator
  • the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by a specified one of the plurality of serial-connected devices acting as slave devices.
  • the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by all of the plurality of serial-connected devices acting as slave devices.
  • receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the
  • the invention provides a method in a slave device of a plurality of serial-connected slave devices, the method comprising: receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device; receiving at least one input signal, the at least one input signal comprising at least an input clock signal; for each of the at least one input signal: generating a delayed version of the input signal in accordance with the command; outputting the delayed version of the input signal.
  • the slave device is a memory device and the master device is a memory controller.
  • the method comprises: outputting a data output signal; wherein at least one of the input signals comprises a data input signal and wherein outputting the delayed version of the data input signal is performed as part of outputting the data output signal such that: a) some of the time the data output signal is said delayed version of the data input signal; b) some of the time the data output signal is a delayed version of a signal produced locally to the slave device, after applying the delay to the signal produced locally to the slave device in accordance with the command.
  • receiving a command from a master device specifying an adjustment to a delay to be applied to at least one signal output by the slave device comprises receiving a command containing a command identifier that identifies the command as an output delay adjustment command, the command further containing data indicating how to adjust the delay.
  • receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
  • the method further comprises: performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that matches a device address of the slave device; performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that is a broadcast device address.
  • generating a delayed version of the input signal comprises: a) delaying the input signal by a selected one of a plurality of delays to produce the delayed version of the input signal.
  • the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
  • the plurality of input signals comprise: a clock signal; a command strobe signal; a data strobe signal; a data signal containing commands and data.
  • the invention provides a method in a memory system comprising a master device and a plurality of serial-connected devices acting as slave devices comprising at least a first slave device and a last slave device, the method comprising: in the master device: a) outputting at least one output signal, the at least one output signal comprising a first clock signal to function as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) determining an amount of phase offset between the first clock signal and the second clock signal; d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
  • each slave device is a memory device and the master device is a memory controller.
  • the method further comprises: in the first slave device of the plurality of serial-connected devices acting as slave devices: a) receiving the at least one output signal from the master device as corresponding at least one input signal of the first slave device; b) for each input signal, generating an output signal based on the input signal; in each other slave device of the plurality of serial-connected devices acting as slave devices: a) receiving output signal(s) of a preceding slave device corresponding to at least one input signal of the slave device; b) for each input signal, generating an output signal based on the input signal; in at least one of the slave devices, a) receiving the output delay adjustment command; and b) generating the output signal(s) by generating a delayed version of the input signal(s) in accordance with the output delay adjustment command.
  • the method further comprises: wherein the at least one output signal of the master device comprises a plurality of output signal(s).
  • generating a delay adjustment command comprises generating a delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
  • generating a delay adjustment command comprises generating a delay adjustment command for execution by all of the plurality of serial-connected slave devices.
  • generating a delayed version of the input signal(s) in accordance with the output delay adjustment command comprises generating a delayed version of the input signal(s) delayed by a selected one of a plurality of delays.
  • generating a delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the delay.
  • the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
  • the method further comprises: the master device outputting output delay adjustment commands that adjust delay by adding one unit delay element in one slave device at a time until the phase offset is acceptable.
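The incremental adjustment described above can be sketched as a small simulation. This is an illustrative model only: the fixed per-device delays, the unit delay, the round-robin order in which devices are chosen, the per-device limit `max_taps`, and the acceptance tolerance are all assumptions that do not come from the text. Delays are integers in arbitrary time units (for example picoseconds).

```python
def phase_offset(total_delay, period):
    """Offset of the returned clock from the nearest edge of the sent clock.

    The result lies in the range [-period/2, period/2).
    """
    offset = total_delay % period
    return offset - period if offset >= period / 2 else offset


def align_ring(base_delays, unit_delay, period, tolerance, max_taps):
    """Add one unit delay in one device at a time until the offset is acceptable.

    Returns the number of unit delay elements selected in each device.
    Raises RuntimeError if every device reaches `max_taps` first.
    """
    taps = [0] * len(base_delays)
    device = 0
    while True:
        total = sum(base_delays) + unit_delay * sum(taps)
        if abs(phase_offset(total, period)) <= tolerance:
            return taps
        if all(t >= max_taps for t in taps):
            raise RuntimeError("delay range exhausted before alignment")
        # Skip devices whose delay line is already at its maximum setting.
        while taps[device] >= max_taps:
            device = (device + 1) % len(taps)
        taps[device] += 1  # models one output delay adjustment command
        device = (device + 1) % len(taps)


if __name__ == "__main__":
    settings = align_ring(
        base_delays=[700, 650, 720, 690],
        unit_delay=100,
        period=5000,
        tolerance=50,
        max_taps=8,
    )
    print(settings)
```

In a real system the master would measure the offset with its phase detector rather than compute it from known delays; the computed offset here merely stands in for that measurement.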
  • the plurality of input signals comprise: a clock signal; a command strobe signal; a data strobe signal; a data signal containing commands and data.
  • the invention provides a slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising: a command input for receiving a command from a master device specifying how to perform output delay adjustment; a clock input for receiving an input clock signal; an output delay adjustment circuit for generating a delayed clock signal from the clock input in accordance with the command; a clock output for outputting the delayed clock signal.
  • the slave device is a memory device.
  • the slave device comprises: a command processing circuit that processes the command, wherein the command contains a command identifier that identifies the command as an output delay adjustment command, and contains data indicating how to adjust the output delay.
  • the slave device further comprises: a device address register; wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device address matches contents of the device address register.
  • the output delay adjustment circuit comprises: for each of a plurality of input signals, inclusive of the input clock signal, a delay circuit that delays the input signal by a selected one of a plurality of delays to produce a delayed version of the input signal.
  • the invention provides a memory system comprising: a plurality of serial-connected slave devices comprising at least a first slave device and a last slave device; a master device connected to the first slave device and to the last slave device; the master device configured to output a first clock signal that functions as an input clock signal of the first slave device; a clock input for receiving a second clock signal that is an output clock signal of the last slave device; a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal; a command generator that generates an output delay adjustment command as a function of the amount of phase offset; wherein, the first slave device of the plurality of serial-connected slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected slave devices: a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device; b)
  • the system is a memory system
  • each slave device is a memory device
  • the master device is a memory controller
  • the command generator is configured to generate the output delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
  • the command generator is configured to generate the output delay adjustment command for execution by all of the plurality of serial-connected slave devices.
  • generating an output delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the output delay.
  • Methods and apparatus of clock duty cycle correction and/or phase synchronization are provided that do not require DLL or PLL, for a serial-connected memory system, typically including a memory controller and a plurality of memory chips connected in a ring configuration.
  • the memory controller has a phase/duty cycle detector for detecting phase and duty cycle of a clock signal after having travelled around the ring, and each memory device has one or more controller programmable delay lines that are used to adjust the phase and/or duty cycle of the clock. These are adjusted by commands sent from the memory controller until the phase and duty cycle detected by the memory controller is acceptable.
  • the methods and apparatus described herein can be applied to any kind of semiconductor integrated circuit system having any kind of semiconductor integrated circuit devices as slave devices in a serial-connected configuration with a common interface between adjacent devices.
  • integrated circuit types include central processing units, graphics processing units, display controller ICs, disk drive ICs, and memory devices like NAND Flash EEPROM, NOR Flash EEPROM, AND Flash EEPROM, DiNOR Flash EEPROM, Serial Flash EEPROM, DRAM, SRAM, ROM, EPROM, FRAM, MRAM, PCRAM, etc.
  • Figure 1 is a system block diagram of serial-connected memory system having a controller programmable duty cycle correction scheme
  • Figure 2 is a block diagram of a memory device having controller programmable duty cycle correction scheme
  • Figure 3 is a block diagram of a programmable delay line for duty cycle correction
  • Figure 4 is a timing diagram of controller programmable duty cycle correction
  • Figure 5 is a flowchart of a method of duty cycle correction
  • Figure 6 is a timing diagram for a write duty cycle register command
  • Figure 7 is a block diagram of a programmable delay line for output delay adjustment
  • Figure 8 is a timing diagram of controller programmable output delay adjustment
  • Figure 9 is a flowchart of a method of performing output delay adjustment.
  • Figure 10 is a timing diagram for a write output delay register command.
  • Some of the memory system configurations referred to in the background employ a shared bus topology for the system clock distribution, which is referred to as a 'common clock system' or 'multi-drop clocking system'. If the system clock is applied to too many memory devices in parallel and the clock signal travels too far from the clock source, typically a memory controller, the maximum operating clock frequency may be limited by the total loading of the clock signal and the distance that the clock travels in the memory system's physical layout.
  • Some of the memory system configurations referred to in the background use a point-to-point serial-connected clocking architecture featuring a DLL or PLL in each memory device in order to synchronize two clock signals in the memory device, one being an input clock received from a preceding device or controller and the other being an output clock transmitted to the next device.
  • a DLL or PLL in each memory device can cause a significant amount of power consumption.
  • various chip-to-chip clock delays (caused by various interconnect loadings and different wire bonding loadings such as multi-chip stacking or package) accumulate through a large number of serial-connected devices and may be unacceptable for system operation.
  • Referring to FIG. 1, shown is a system block diagram of a serial-connected memory system generally indicated at 101 employing a controller programmable duty cycle correction scheme.
  • the memory system 101 includes a memory controller 10 as a master device connected to a first memory device 100-1.
  • Memory device 100-1 is the first of a series of slave devices including devices 100-1 through 100-8 that are connected in a ring configuration, with the
  • a highly multiplexed unidirectional point-to-point bus architecture is provided to transfer information such as commands, addresses and data from the memory controller 10 to the memory devices 100-1 to 100-8.
  • This bus architecture includes a link 90 from the memory controller 10 to the first memory device 100-1, and a respective link between each pair of adjacent memory devices, these including links 90-1 through 90-7, and a link 90-8 between the last memory device 100-8 and the memory controller 10.
  • each link includes a set of signals output by a preceding device (the memory controller 10 or a memory device) for receipt by a succeeding device.
  • Each link includes a set of output ports of a preceding device, a set of input ports of a succeeding device, and a set of physical interconnections between the output ports and the input ports.
  • the output ports will be given the same name as the signals they output and the input ports will be given the same name as the signals they receive.
  • the signals (and output ports) of a preceding device are referred to as CSO (Command Strobe Output), DSO (Data Strobe Output), Qn (Data Output), CKO/CKO# (differential clock output signals).
  • the corresponding signals (and input ports) of a succeeding device are referred to as CSI (Command Strobe Input), DSI (Data Strobe Input), Dn (Data Input), CKI/CKI# (differential clock input signals).
  • CSI Command Strobe Input
  • DSI Data Strobe Input
  • Dn Data Input
  • CKI/CKI# differential clock input signals
  • CE# chip enable
  • RST# reset
  • the physical interconnections include differential clock buses S111, S111-1 to S111-8 for differential clock signals, S112, S112-1 to S112-8 for command strobe, S113, S113-1 to S113-8 for data strobe, S114, S114-1 to S114-8 for data.
  • the width of the link may be programmed through a link configuration register to utilize 1, 2, 4, or 8 of a device package's available data input and output pins. This feature allows these memory devices to operate in a ring configuration together with devices that have smaller or larger maximum link widths provided they are all programmed to use the same link width. See for example 'Switching Method of Link and Bit Width' (WO 2008/070978), hereby incorporated by reference in its entirety.
  • CKI/CKI# are input clocks.
  • a Command/Address Packet on the Dn port delineated by CSI is latched on the rising edges of CKI or the falling edges of CKI#.
  • a Write Data Packet on Dn delineated by DSI is latched on the rising edges of CKI or the falling edges of CKI#.
  • CKO/CKO# are output clocks which are delayed versions of CKI/CKI#.
  • CSO, DSO and Qn signals are referenced to the rising edges of CKO or to the falling edges of CKO#; for example, a Read Data Packet on Qn delineated by DSO is referenced at the rising edges of CKO or the falling edges of CKO#.
  • Command Strobe Output (CSO) is an echo signal of CSI. It echoes CSI transitions with a latency tIOL that in a particular implementation is a two clock cycle latency referenced to the rising edges of CKO or to the falling edges of CKO#. Two clock cycle latency is an implementation detail; more generally it could be any number of clock cycles appropriate for a given design.
  • DSI Data Strobe Input
  • When Data Strobe Input (DSI) is HIGH while the memory device is in 'Read-Mode', it enables the read data output path and the Qn buffer (not shown). If DSI is LOW, the Qn buffer holds the previous data accessed. If DSI is HIGH while the memory device is in 'Write-Mode', it enables a Dn buffer and receives a Write Data Packet on the rising edges of CKI or the falling edges of CKI#.
  • Data Strobe Output is an echo signal of DSI. It echoes DSI transitions with latency tIOL referenced to the rising edges of CKO or to the falling edges of CKO#. As indicated above, tIOL is two clock cycles in a particular implementation.
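The echo behaviour of CSO and DSO can be pictured as a fixed-depth pipeline that advances once per clock cycle. The Python sketch below is illustrative only: the function name `echo_with_latency`, the list-of-samples representation and the idle value are assumptions, and the default latency of two cycles is simply the particular tIOL mentioned above.

```python
from collections import deque


def echo_with_latency(samples, latency_cycles=2, idle=0):
    """Return the strobe samples delayed by a fixed number of clock cycles.

    Each list element stands for the strobe level seen on one rising clock edge.
    """
    pipeline = deque([idle] * latency_cycles)  # one entry per cycle of latency
    echoed = []
    for level in samples:
        pipeline.append(level)             # newest input enters the pipeline
        echoed.append(pipeline.popleft())  # oldest value appears at the output
    return echoed


# A three-cycle strobe pulse reappears two cycles later.
print(echo_with_latency([0, 1, 1, 1, 0, 0]))
```

A different design latency is modelled by passing another value for `latency_cycles`.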
  • Q0 is the only valid signal and transmits one byte of a packet in eight clock cycles.
  • Q0 & Q1 are valid signals and transmit one byte of a packet in four clock cycles.
  • Q0, Q1, Q2 & Q3 are valid signals and transmit one byte of a packet in two clock cycles.
  • Q0, Q1, Q2, Q3, Q4, Q5, Q6 & Q7 are all valid signals and transmit one byte of a packet in one clock cycle.
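The relationship between the number of valid Qn signals and the time needed to move one byte is simple arithmetic. The sketch below is a minimal illustration, assuming one bit per valid pin per clock cycle as in the four cases listed above; the function name `cycles_per_byte` is hypothetical.

```python
VALID_LINK_WIDTHS = (1, 2, 4, 8)  # programmable link widths named in the text


def cycles_per_byte(link_width):
    """Clock cycles needed to transfer one byte over `link_width` data pins."""
    if link_width not in VALID_LINK_WIDTHS:
        raise ValueError("link width must be 1, 2, 4 or 8")
    return 8 // link_width


for width in VALID_LINK_WIDTHS:
    print(width, "pin(s):", cycles_per_byte(width), "cycle(s) per byte")
```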
  • serial-connected in this context is referring to the serial arrangement of memory devices, one after the other and not to the nature of the link between each pair of adjacent devices which may be serial or parallel in nature.
  • the memory controller 10 contains a phase detector 11 , a duty detector 13 and a command generator 12. In some embodiments, the memory controller 10 only includes the phase detector 11 in which case only output delay adjustment is performed. In some embodiments, the memory controller 10 includes only the duty detector 13 in which case only duty cycle correction is performed. In some embodiments, both the phase detector 11 and the duty detector 13 are included in which case both output delay adjustment and duty cycle correction may be performed. This last case is assumed in the detailed description which follows.
  • the phase detector 11 and the duty detector 13 are connected to the command generator 12 through signal buses S11 and S12 respectively.
  • the command generator 12 has an output signal bus S13 connected to CSO and Qn ports through which it can output commands.
  • the memory controller 10 drives the differential clock buses, S111, from its port CKO/CKO#, and all eight memory devices 100-1 to 100-8 receive the differential clock buses through their own clock ports, CKI/CKI#, from the previous device's CKO/CKO# ports in a series flow-through manner.
  • the memory controller 10 drives three different buses, S112, S113 and S114 through its ports, CSO, DSO and Qn, respectively.
  • the second memory device 100-2 receives the three buses, S112-1, S113-1 and S114-1, through its input ports, CSI, DSI and Dn, respectively. This approach applies to all of the eight memory
  • the duty detector 13 monitors a duty ratio of CKI/CKI# which is the clock input after it has been passed between all of the devices 100-1 to 100-8 in the ring. If the duty detector 13 detects a duty error from CKI/CKI#, namely a deviation in the duty cycle from a desired duty cycle, it asserts through signal bus S12 either a 'Duty_Add' to indicate the duty cycle is shorter than the desired duty cycle and should be lengthened or 'Duty_Sub' to indicate the duty cycle is longer than the desired duty cycle and should be shortened. In response, the command generator 12 generates an appropriate "Write Duty Cycle Register" command packet.
  • the phase detector 11 monitors the phase of CKI/CKI#. If the phase detector 11 detects a phase error (PE) between CKI/CKI# and CKO/CKO#, it asserts a 'PE' signal through the signal bus S11. In response, the command generator 12 generates an appropriate "Write Output Delay Register" command packet.
  • PE phase error
  • the command generator 12 issues the appropriate command packet according to the received signals on S11 and S12, and sends the command information through signal bus, S13, and CSO, Qn ports.
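The way the command generator reacts to the detector outputs can be summarised in a few lines. This is a sketch under stated assumptions, not the controller's actual logic: the function name `generate_commands`, the tuple representation of a command, the pairing of 'Duty_Add' with "DCR+1" and 'Duty_Sub' with "DCR-1", and the ordering of duty commands before phase commands are all illustrative choices.

```python
def generate_commands(pe=False, duty_add=False, duty_sub=False):
    """Map detector signals (S11: PE, S12: Duty_Add/Duty_Sub) to command packets."""
    if duty_add and duty_sub:
        raise ValueError("Duty_Add and Duty_Sub cannot both be asserted")
    commands = []
    if duty_add:    # duty cycle too short: lengthen it (assumed to mean DCR+1)
        commands.append(("Write Duty Cycle Register", "DCR+1"))
    elif duty_sub:  # duty cycle too long: shorten it (assumed to mean DCR-1)
        commands.append(("Write Duty Cycle Register", "DCR-1"))
    if pe:          # phase error between CKI/CKI# and CKO/CKO#
        commands.append(("Write Output Delay Register", "ODR+1"))
    return commands


print(generate_commands(pe=True, duty_add=True))
```

A controller that contains only one of the two detectors would simply never assert the other detector's signals.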
  • the device includes a memory core 150, command/address packet logic 130, data packet logic 140, and duty cycle correction logic 120.
  • Memory core 150 may be a single bank of memory cell arrays or it could be multiple banks of memory cell arrays, depending on design variations.
  • Data packet logic 140 processes and stores all necessary data transferring information.
  • Command/address packet logic 130 processes all command instructions and/or
  • the device 100 includes clock input receiver 102D for CKI/CKI# which may for example be a differential type input buffer to handle the differential clock inputs, CKI & CKI#.
  • the clock input receiver 102D translates the external interface levels of CKI/CKI# signals to the internal logic levels of an internal clock signal 'cki_i'.
  • the internal clock signal, cki_i may be used in other internal logic blocks for various operations.
  • the duty cycle correction logic 120 takes the internal clock signal, cki_i, and produces a duty cycle corrected clock signal clk_dcc.
  • the duty cycle corrected clock signal, 'clk_dcc' is delayed by a controller programmable delay line, PDL2, 105D, and its delayed signal, 'clk_dcc_d', is finally driven to the input port of an output driver block 108D, which outputs the external clock output signals, CKO/CKO#.
  • the device 100 includes command strobe receiver 102A which generates a buffered signal 'csi_i' from a CSI input signal.
  • the buffered signal, csi_i, is connected to the D port of a D-type flip-flop 103A.
  • the flip-flop 103A is driven by the clock signal, 'cki_i', and latches the status of the 'csi_i' signal at every rising edge of 'cki_i'.
  • the latched signal 'csi_lat' is provided to the command/address packet logic 130, and also is provided to the D port of another flip-flop 103E, whose clock input port is driven by the duty corrected clock signal, clk_dcc.
  • the flip-flop 103E's output signal, 'cso_i', is delayed by a controller programmable delay line, PDL2, 105A, and its delayed signal, 'cso_d', is finally driven to the input port of an output driver block 108A, which then outputs the external signal, CSO.
  • the device 100 includes data strobe input receiver 102C which generates a buffered signal 'dsi_i' from a DSI input signal.
  • the buffered signal, dsi_i, is connected to the D port of a D-type flip-flop 103C.
  • the flip-flop 103C is driven by the clock signal, 'cki_i', and latches the status of the 'dsi_i' signal at every rising edge of 'cki_i'.
  • the latched signal 'dsi_lat' is provided to the command/address packet logic 130 and data packet logic 140, and also is provided to the D port of another flip-flop 103G, whose clock input port is driven by the duty corrected clock signal, clk_dcc.
  • the flip-flop 103G's output signal, 'dso_i' is delayed by a controller programmable delay line, PDL2, 105C, and its delayed signal, 'dso_d', is finally driven to the input port of an output driver block 108C, which outputs the external signal, DSO.
  • PDL2, 105C controller programmable delay line
  • 'dso_d' delayed signal
  • DSO external signal
  • the device 100 includes a data receiver, 102B, for receiving an external signal Dn.
  • the number of receivers 102B can be one or more than one according to the bit width of Dn ports. For example, if Dn ports are designated as D0 through D7 for an 8 bit wide data input/output implementation, the receiver 102B will be repeated eight times.
  • the output of the receiver 102B, 'dn_i' is provided to the D port of a D-type flip-flop 103B.
  • the flip-flop 103B is driven by the clock signal, 'cki_i', and latches the status of the 'dn_i' signal at every rising edge of 'cki_i'.
  • the latched signal 'dn_lat' is provided to the command/address packet logic 130 and also is provided to data packet logic 140.
  • the latched signal, 'dn_lat', is also provided to one input port of a multiplexer 104.
  • the other port of the multiplexer 104 is driven by a signal, 'core_data' from the data packet logic 140.
  • the output of the multiplexer 104 is connected to the D input port of a flip-flop 103F, whose clock input port is driven by the duty corrected clock signal, clk_dcc, and latches the status of the output of the multiplexer 104 at every rising
  • the internal signal dn_i includes both command content (as delineated by the command strobe input) and data input (as delineated by the data strobe input) when present.
  • Each device has a device address, in some embodiments stored in a device address register 131.
  • Each command includes a Device Address portion that contains the device address of one of the memory devices to which the command is addressed. There may also be a broadcast address that requires the command to be processed by all devices.
  • the memory device 100 processes each command by examining the Device Address portion. If the Device Address information in the received command/address packet matches the memory device 100's own stored device address, the command/address packet logic 130 processes the command, and also issues an "id_match" signal to signify that the command is for that memory device.
  • the id_match signal is used to steer the data flow path of the multiplexer 104. If “id_match” is in a HIGH logic state (more generally in a “match state” however that is defined) as a result of device address matching process, the multiplexer 104 selects "core_data" to be outputted, so that the data from the memory core 150 can be transferred to the flip-flop 103F.
  • the multiplexer 104 selects "dn_lat" to be outputted, so that the data received from the data input Dn can be transferred to the flip-flop 103F to be echoed at the output Qn.
  • the multiplexer 104 allows for the selection between a) bypassing data received from the data input Dn by selecting the dn_lat input of the multiplexer 104, and b) outputting the core_data by selecting the core_data input of the
  • the signal 'core_data' is usually transferred from the memory core 150 to the data packet logic 140, for example as part of a 'PAGE READ' operation upon request from the memory controller 10. Then after the 'PAGE READ' operation is done, the memory controller 10 can request a 'BURST READ' operation to the memory device with a command addressed to that memory device. In that case, the memory device processes the 'BURST READ' command and the corresponding address information including Device Address portion. If the Device Address information in the received command/address packet matches the memory device 100's own stored device address, the command/address packet logic 130 issues "id_match" signal in order to steer the data flow path of the multiplexer 104.
  • the multiplexer 104 selects "core_data" to be outputted, so that the data previously transferred from the memory core 150 to the data packet logic 140 can be transferred to the flip-flop 103F.
  • the core_data input of the multiplexer 104 is still selected even though there is no data to output.
  • the core_data signal may be a static signal in such a case. This results in the data input Dn not being echoed to the next device. This can have the effect of reducing power consumption in the subsequent devices by eliminating the need for them to process data associated with commands that are not addressed to them. This is described in further detail in US Application serial no. 12/018,272 filed 01/23/2008 entitled "Semiconductor Device and Method for Reducing Power Consumption in a System Having Interconnected Devices".
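The basic steering performed by the multiplexer 104 can be written as a one-line selection. The sketch below models only the id_match behaviour described above (it does not model the power-saving variant in which Dn is not echoed); the function name `qn_next_value` and the byte values are hypothetical.

```python
def qn_next_value(id_match, core_data, dn_lat):
    """Value presented to flip-flop 103F for one clock cycle.

    id_match HIGH selects core_data (data from the data packet logic);
    otherwise the latched input dn_lat is passed on to be echoed at Qn.
    """
    return core_data if id_match else dn_lat


# Device not addressed: the input bytes are echoed toward Qn.
echoed = [qn_next_value(False, 0x00, byte) for byte in (0x12, 0x34)]
# Device addressed by a read: the bytes come from the data packet logic.
read_out = [qn_next_value(True, byte, 0x00) for byte in (0xAB, 0xCD)]
print(echoed, read_out)
```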
  • a delayed version of the data input signal Dn is produced as one component of a data output signal (Qn). Some of the time the data output signal is the delayed version of the data input signal. For the implementation described, this will be the case when there is content on the data
  • the data output signal comprises a delayed version of a signal produced locally to the memory device, after applying the delay to the signal produced locally to the memory device in accordance with the command.
  • the signal produced locally to the memory device is the so-called core_data output from the data packet logic 140 but other scenarios are possible.
  • the command/address packet logic 130 has a DCR (duty cycle correction register) 132 that produces an output DCR<0:3> to the duty cycle correction circuit 120 to control the amount of duty cycle correction performed as detailed below and has an ODR (output delay register) 134 that produces an output ODR<0:1> to the packet delay lines 105A, 105B, 105C, 105D to control the amount of output delay applied as detailed below.
  • DCR duty cycle correction register
  • ODR output delay register
  • a "Write Duty Cycle Correction Register” command assumes an implementation, as described herein, in which an amount of delay to be applied in performing duty cycle correction is controlled by writing a value to a duty cycle correction register. More generally, any command, referred to herein as a duty cycle correction command, may be employed that has the effect of causing a device to set how duty cycle correction is to be performed. Thus, the described "Write Duty Cycle Correction Register” command is to be considered a specific example of a duty cycle correction command.
  • any command, referred to herein as an output delay adjustment command, may be employed that has the effect of causing a device to set the amount of delay to be applied.
  • the described "Write Output Delay Register" command is to be considered a specific example of an output delay adjustment command.
  • the duty cycle correction circuit 120 includes a clock divider 123, and a controller programmable delay line 121 that includes a '4-to-16 Decoder' block and 'Programmable Delay Line (PDL1)'. Respective outputs clk_ref, clk_del of the clock divider 123 and the controller programmable delay line 121 are input to an XOR gate 122 the output of which is the duty cycle corrected clock clk_dcc.
  • the clock divider 123 derives an output signal 'clk_ref' which has a frequency that is one half that of the input 'cki_i' signal.
  • Clock divider circuits are well known in the art.
  • the clock divider 123 includes a D-type flip-flop 103D that is driven by the internal clock signal, cki_i, through its clock input port.
  • the output port Q of the D-type Flip-Flop 103D is connected to the input port D through inverter logic 124 in order to obtain a half frequency output signal.
  • the controller programmable delay line 121 produces an output signal, clk_del, which is a delayed version of clk_ref.
  • the amount of delay is determined by the '4-to-16 Decoder' logic block's select signals, which are controlled by DCR<0:3> signal information received from command/address packet logic 130.
  • the XOR logic gate 122 receives the two half clock signals, clk_ref and clk_del, and outputs a duty cycle adjusted full clock signal, clk_dcc.
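The effect of XOR-ing the half-rate clock with its delayed copy can be checked with a small idealised simulation. The sketch below assumes ideal edges, a 1 ps time step and a delay shorter than one input period; the function names and the 1800 ps period are illustrative choices, not values from the text. Because clk_ref toggles only on rising edges of cki_i, the duty ratio of the incoming clock drops out of this model, and the HIGH time of clk_dcc in each input period equals the programmed delay.

```python
def clk_ref(t_ps, period_ps):
    """Divide-by-two clock: toggles on every rising edge of cki_i (once per period)."""
    return (t_ps // period_ps) % 2 == 0


def clk_dcc_duty(period_ps, delay_ps):
    """Fraction of time clk_dcc = clk_ref XOR clk_del is HIGH, at 1 ps resolution."""
    window = range(2 * period_ps, 4 * period_ps)  # two input periods after start-up
    high = sum(
        clk_ref(t, period_ps) != clk_ref(t - delay_ps, period_ps)  # XOR gate 122
        for t in window
    )
    return high / len(window)


# With an 1800 ps input period, moving from 800 ps to 900 ps of delay
# brings the corrected clock to a 50% duty ratio.
print(clk_dcc_duty(1800, 800))
print(clk_dcc_duty(1800, 900))
```

In this model the duty ratio is simply the delay divided by the input period, which is why adjusting the delay line by one unit delay changes the duty ratio by one fixed step.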
  • Figure 3 is a block diagram of an example implementation of a programmable delay line 121 for duty cycle correction that may, for example, be used in the duty cycle correction circuit 120 of Figure 2.
  • the unit delay block is composed of two NAND logic gates 1211 and 1212 and one inverter logic gate 1213.
  • the first NAND logic gate 1211 receives the clk_ref input at its first input, and receives an output from a 4-to-16 decoder 1210 at its second input.
  • the output of the first NAND logic gate 1211 is input to a first input of the second logic NAND gate 1212.
  • the second input of the second logic NAND gate 1212 is connected to Vdd.
  • the 4-to-16 Decoder block 1210 has a 4-bit wide input bus, DCR<0:3>, as its input.
  • the decoder block 1210 decodes the input and outputs a 16-bit wide bus, SEL<15:0>, with one line of the bus connected to each of the 16 unit delay blocks.
  • the unit delay logic shown is an example of a known circuit technique that has been used to produce a register controlled delay-locked-loop. Other unit delay logics can alternatively be employed.
  • the '4-to-16 Decoder' logic 1210 produces the 16-bit SEL<15:0> output such that only one of the 16 select signals is in a HIGH logic state and all the other 15 select signals are in LOW logic states. Therefore, only one unit delay block is selected to transfer the 'clk_ref' signal through the unit delay blocks that are to the right of the selected unit delay block.
  • the control input DCR<0:3> is used to select which of the unit delay blocks will process the clk_ref input. The minimum delay is selected by selecting the right most unit delay block
  • the unit delay amount of the illustrated unit delay block is around 100 ps to 150 ps. However, in some embodiments, a finer unit delay circuit block is employed for much higher operating frequency with finer delay tuning capability.
  • the unit delay time is denoted as “tUD” in Figure 3 and the total delay time for the whole programmable delay line is denoted as “tPDL1" which is 16 times "tUD”.
  • a default setting for the power-on initialization is that having a logic HIGH state on the SEL<7> bit, as it is in the middle position of the delay line.
  • the default settings can be different, and it may be recommended to have minimum delay setting in order to be ready for operating at the maximum frequency.
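The decoder and delay-line selection described above can be sketched as a one-hot decode followed by a count of unit delays. The code is a minimal illustration: the function names are hypothetical, list index i stands for select line SEL<i>, and it is assumed that selecting line i routes the signal through i + 1 unit delay blocks, which is consistent with the stated total delay of 16 times tUD.

```python
def decode_one_hot(value, width_bits):
    """N-to-2^N decoder: exactly one select line is HIGH (1), the rest are LOW (0)."""
    lines = 1 << width_bits
    if not 0 <= value < lines:
        raise ValueError("register value out of range")
    return [1 if index == value else 0 for index in range(lines)]


def selected_delay_ps(value, width_bits, t_ud_ps):
    """Delay through the line, assuming SEL<i> inserts i + 1 unit delays."""
    select = decode_one_hot(value, width_bits)
    return (select.index(1) + 1) * t_ud_ps


# Power-on default from the text: DCR<0:3> = 0111b selects SEL<7>.
print(decode_one_hot(0b0111, 4))
print(selected_delay_ps(0b0111, 4, t_ud_ps=100))
```

The same two functions describe a smaller 2-to-4 decoder when `width_bits` is 2.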
  • FIG 4 is an example of a timing diagram of the controller programmable duty cycle correction procedure, where all of the signals are as shown in Figure 3 except CKI which is the raw input clock signal that is to be duty corrected.
  • the timing diagram shows one extremely distorted clock input signal, CKI, at the top, for the sake of example only.
  • the half clock signal, clk_ref is derived from the 'clock divider' block 123 of Figure 2 and its rising and falling edges are aligned with two rising edges of CKI.
  • the clock signal, clk_dcc, would have a distorted duty ratio, such as 45% on, 55% off, for example, in the absence of any change to the DCR<0:3> values which are shown to initially be set to "0111b".
  • the duty cycle of the clock signal, clk_dcc, is corrected to be 50% on and 50% off as the result of a shift in the selection of the controller programmable delay line 121 from SEL<7> being enabled to SEL<8> being enabled.
  • the contents of the DCR 132 are used to control the amount of delay introduced by the controller programmable delay line 121 in the duty cycle correction circuit 120, thereby controlling the duty cycle correction.
  • the contents of the DCR 132 can be written with a 'Write Duty Cycle Register' command.
  • FIG. 5 is a flow chart for the duty cycle correction procedure from the perspective of the controller.
  • the method begins at block 5-1 with power on of the devices. At this point, all of the delay lines are initialized and device addresses for all devices are assigned.
  • the memory controller 10 monitors the duty ratio of CKI/CKI# using the duty detector 13. If there is a duty cycle error, yes path block 5-3, then in block 5-4 the duty detector 13 asserts the "Duty_Add" or the "Duty_Sub" signal S12. After this, the command generator 12 issues the 'Write Duty Cycle Register' command with "DCR+1" or "DCR-1" values.
  • If there is still a duty cycle error, yes path block 5-6, then the method continues back at block 5-4 with a further adjustment to the duty cycle register. If there is no longer a duty cycle error, no path block 5-6, then duty cycle correction is completed at 5-7. Similarly, if no duty cycle error was detected in block 5-3, then at that point the method also is completed at 5-7.
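The controller-side loop of Figure 5 can be sketched as repeated single-step register updates until the measured duty ratio is acceptable. Everything numeric here is an assumption made for illustration: the toy duty model (HIGH time equal to the selected delay), the 100 ps unit delay, the tolerance, the command limit, and the function name `run_duty_correction`.

```python
def run_duty_correction(period_ps, t_ud_ps=100, dcr=7, target=0.5,
                        tolerance=0.03, max_commands=32):
    """Return the DCR values written until the duty error is within tolerance."""

    def measured_duty(value):
        # Toy model (assumption): HIGH time of the clock equals the selected delay.
        return (value + 1) * t_ud_ps / period_ps

    written = []
    for _ in range(max_commands):
        error = measured_duty(dcr) - target
        if abs(error) <= tolerance:
            break              # block 5-7: correction completed
        if error < 0 and dcr < 15:
            dcr += 1           # Duty_Add: write "DCR+1"
        elif error > 0 and dcr > 0:
            dcr -= 1           # Duty_Sub: write "DCR-1"
        else:
            break              # register already at its limit
        written.append(dcr)    # one 'Write Duty Cycle Register' command
    return written


print(run_duty_correction(period_ps=2000))
```

The `max_commands` bound is a safeguard added for the sketch so that a tolerance smaller than one adjustment step cannot loop forever.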
  • Table 1 below is an example command packet definition for writing to the Duty Cycle Register (DCR).
  • a broadcast address is provided, for example FFh. If DA is set to the broadcast address, it means that the command is a broadcasting command, so that every memory device is expected to execute the command. Otherwise, only a specific memory device that is matching the DA will execute the command.
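The device-address check, including the broadcast case, reduces to a single comparison. This is a minimal sketch: the function name `should_execute` and the example device addresses are hypothetical, and FFh is used as the broadcast address only because the text gives it as an example.

```python
BROADCAST_ADDRESS = 0xFF  # example broadcast value given in the text (FFh)


def should_execute(packet_da, own_address, broadcast=BROADCAST_ADDRESS):
    """True if a device with `own_address` must execute a command addressed to `packet_da`."""
    return packet_da == broadcast or packet_da == own_address


# A broadcast command is executed by every device; a unicast one by a single device.
devices = [0x00, 0x01, 0x02]
print([should_execute(0xFF, address) for address in devices])
print([should_execute(0x01, address) for address in devices])
```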
  • a 'Read Duty Cycle Register' command is also implemented in order to give more flexibility to the controller 10.
  • FIG. 6 is an example of a timing diagram of a 'Write Duty Cycle Register' command packet sequence based on SDR (Single Data Rate) operation.
  • SDR Single Data Rate
  • this means that the 'Write Duty Cycle Register' command is a broadcasting command, so that every memory device is expected to execute the command.
  • the broadcasting command is used for Duty Cycle Correction operation.
  • the circuit disclosed also allows for the more flexible
  • tWDCR Write Duty Cycle Register Latency
  • tWDCR value is set as 4 clock cycles as shown in Figure 6.
  • the described programmable delay lines 105A, 105B, 105C, 105D are provided to allow programmably delaying the output signals CSO, Qn, DSO and CKO/CKO# in order to allow phase correction.
  • Figure 1 also shows output delay register signal buses ODR<0:1> connected to a 2-to-4 Decoder logic block 106.
  • the 2-to-4 Decoder logic 106 outputs four select signal buses, SEL2<0:3>. Those SEL2<0:3> select signals are all connected to the four controller programmable delay lines 105A, 105B, 105C and 105D.
  • Figure 7 shows an exemplary circuit block implementation for the output delay adjustment.
  • programmable delay lines 105A, 105B, 105C and 105D are composed of four unit delay elements that are the same as those used in Figure 3. This means that the range of delay adjustment for the output is only 4/16 that of the range of delay of adjustment of the duty
  • Each programmable delay line 105A, 105B, 105C, 105D receives a respective signal cso_i, q_i, dso_i and clk_dcc, as the input of the delay line and produces a respective delayed output cso_d, q_d, dso_d and clk_dcc_d.
  • signals will be increased correspondingly, for example to be 8 in number, and the number of delay line blocks for q_i and q_d, will be increased correspondingly, for example to be 8 in number.
  • the '2-to-4 Decoder' logic 106 produces the SEL2<0:3> output such that only one of the 4 select signals is in a HIGH logic state and all the other 3 select signals are in LOW logic states. Only the selected unit delay block transfers the respective input signal through the remaining unit delay blocks to the right of the selected unit delay block.
  • the control input ODR<0:1> is used to select which of the unit delay blocks will process the respective inputs. The minimum delay is selected by selecting the right most unit delay block UNIT_0 in which case each output signal is the respective input signal delayed by one unit delay block, whereas the maximum delay is selected by selecting the left most unit delay block UNIT_3 in which case each output signal is the respective input signal delayed by four delay unit blocks.
  • the '2-to-4 decoder' logic 106 with four unit delay blocks is implemented in this example circuit design. However more generally, any required number of delay units and the corresponding decoder logic may be used.
  • a default delay setting may be used during the power-on initialization period. In this example, the default selection might be set to SEL2<0>, and the memory device will have the least amount of delay for each output path after power-on or hard reset in some other design variations.
  • the use of 4 unit delay blocks is implementation specific. For example, more generally, an N-to-M decoder might
  • the contents of the ODR 134 are used to control the amount of delay introduced by the delay lines 105A, 105B, 105C, 105D, thereby controlling the amount of output delay adjustment.
  • the contents of the ODR 134 can be written with a 'Write Output Delay Register' command.
  • When the phase detector 11 in the memory controller 10 detects an unacceptable phase difference between its CKI/CKI# and CKO/CKO# signals, the controller 10 will issue one "Write Output Delay Register" command packet with one added unit delay amount to the very first memory device 100-1 of Figure 1. After enough clock cycles for a first memory device, for example for the tWODR (Write Output Delay Register latency) and total tIOL latencies described below with respect to Figure 10, if there is still an unacceptable phase difference, the controller 10 can issue another "Write Output Delay Register" command packet to a second memory device, for example the second memory device 100-2 of Figure 1. This sequence of operations can be continued until the memory controller 10 gets an acceptable phase difference. After the last memory device is instructed to adjust its output delays, the memory controller 10 points to the very first memory device with one more added unit delay value within the command packet, and continues for the rest of the memory devices until the phase difference reaches an acceptable range.
  • the above procedure is shown in the flowchart of Figure 9.
  • the method begins at block 9-1 with power on. At this point, all the delay lines and device addresses are initialized.
  • the memory controller 10 monitors the phase error between CKI/CKI# and CKO/CKO# using the phase detector 11. If there is a phase error, yes path 9-3, then the phase detector 11 asserts the "PE" signal S11 in block 9-4. After that, the command generator 12 issues a 'Write Output Delay Register' command with "ODR+1" value to each memory device from the first to the last, one at a time while monitoring the phase error.
  • In block 9-6, if there is still a phase error, yes path, then the method continues back at block 9-4. If there is no phase error, no path block 9-6, then the phase correction is completed at block 9-7. Similarly, if no phase error was detected in block 9-3, then the method ends, phase correction having been completed at block 9-7.
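The device-by-device sequence of Figure 9 can be sketched as a round-robin walk over the ring that raises one device's output delay register at a time. The model is deliberately simple and rests on assumptions: the phase error is represented as a remaining amount of delay to be added, each "ODR+1" step removes exactly one unit delay from it, and the function name, the 100 ps unit delay, the tolerance and the device numbering are illustrative.

```python
def adjust_output_delays(num_devices, needed_delay_ps, t_ud_ps=100,
                         tolerance_ps=50, max_odr=3):
    """Return (ODR value per device, list of (device number, ODR value) commands)."""
    odr = [0] * num_devices          # all devices start at the minimum delay
    commands = []
    remaining = needed_delay_ps      # toy stand-in for the measured phase error
    for level in range(1, max_odr + 1):      # one more unit delay per pass
        for device in range(num_devices):    # first device to last device
            if abs(remaining) <= tolerance_ps:
                return odr, commands         # block 9-7: correction completed
            odr[device] = level
            commands.append((device + 1, level))  # 'Write Output Delay Register'
            remaining -= t_ud_ps
    return odr, commands


print(adjust_output_delays(num_devices=3, needed_delay_ps=420))
```

Spreading the added delay over the devices in turn, rather than loading one device, mirrors the order described above: every device receives one extra unit before any device receives a second.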
  • Table 3 is an example command packet definition for the Write Output Delay Register command.
  • a broadcast address is provided, for example FFh. If DA is set to the broadcast address, it means that the command is a broadcasting command, so that every memory device is expected to execute the command. Otherwise, only a specific memory device that is matching with DA will execute the command.
  • a 'Read Output Delay Register' is implemented in order to give more flexibility to the controller 10. For example, this can be used by the controller to read the values from all of the memory devices and then rearrange the settings among the devices appropriately, if necessary.
  • FIG 10 is an example of a timing diagram of a 'Write Output Delay Register' command packet sequence based on SDR (Single Data Rate) operation.
  • SDR Single Data Rate
  • tWODR Write Output Delay Register Latency
  • tWODR value is set as 4 clock cycles as shown in Figure 10. After tWODR (for example, at T8), the memory controller 10 can issue any other command packets to the memory device.
  • an embodiment of the application provides for methods and circuits for performing output delay adjustment in which a delayed version of at least one input signal is produced, the at least one input signal including at least the clock signal.
  • generating a delayed version of an input signal for output involves conditionally generating a delayed version of the input signal for output. That is to say, some of the signals may be conditionally conveyed between adjacent devices.
  • a specific example is detailed below in which the input data signal of a memory device is conveyed to the next memory device some of the time.
  • the embodiments described above have assumed the use of programmable delay lines that are composed of identical unit delay blocks.
  • the programmable delay lines are divided into two or more sections, such as "Coarse” and “Fine” delay lines to allow further programmability of the delay adjustment for duty cycle correction and/or output delay adjustment.
  • the output delay lines are located after the last Flip-Flop that is located near an output for each signal. In some embodiments, the output delay line is located before the last flip-flop.
  • the devices that are connected in the serial-connected manner are assumed to be substantially identical. In some embodiments, these are substantially identical memory devices. In other embodiments, different
  • differential clock signals are employed. More generally, single ended or differential clock signals may be used. Similarly, any other input/output signals can be single ended or differential.
  • a single MCP (multi-chip package) is provided that includes the plurality of memory devices and a controller, operable as described.
  • the methods and apparatus described herein have assumed a serial-connected architecture featuring a controller and a set of memory devices connected in a ring.
  • the memory devices are slave devices
  • the memory controller is a master device.
  • the methods and apparatus described herein can be applied to any kind of semiconductor integrated circuit system having any kind of semiconductor integrated circuit devices that are configured as slave devices in the serial-connected configuration with a common interface between adjacent devices, with a device that is configured to act as a master device that controls the duty cycle correction and/or phase correction performed by the slave devices.
  • Examples of integrated circuit types include central processing units, graphics processing units, display controller IC, disk drive IC, memory devices like NAND Flash EEPROM, NOR Flash EEPROM, AND Flash EEPROM, DiNOR Flash EEPROM, Serial Flash EEPROM, DRAM, SRAM, ROM, EPROM, FRAM, MRAM, PCRAM etc.

Abstract

Systems and methods for correcting clock duty cycle and/or performing output delay adjustment are provided for application in serial-connected devices operating as slave devices. A master device provides a clock to the first slave device. Each slave device passes the clock to the next slave device in turn. The last slave device returns the clock to the master device. The master device compares the outgoing and returned clocks and determines if a duty cycle correction and/or an output delay adjustment is needed. If so, the master device generates and outputs commands for slave devices to perform duty cycle and/or output delay adjustment. The slave devices each have a circuit for performing duty cycle correction and/or output delay adjustment. In some implementations, each slave device is a memory device, and the master device is a memory controller.

Description

76181-75
Serial-connected Memory System with Output Delay Adjustment
Field
The invention relates generally to solid state memory systems featuring a set of serial-connected memory devices.
Background
Conventional NAND flash memory systems use a large number of parallel signals for the commanding, addressing, and data transferring operations. This was a very popular way of configuring memory systems and results in very fast system operation. This is particularly true for random access memory devices like DRAM (dynamic random access memory), SRAM (static random access memory), etc.
A disadvantage arises from this approach in that a large number of parallel signal lines need to be routed to each and every memory device in the memory system. Also, the system power supply must have higher capacity in order to deliver higher peak power for parallel signaling. Write and read throughput for conventional NAND flash memory can be directly increased by using a higher operating frequency. For example, the present operating frequency of about 40 MHz (=25ns for tRC in NAND Flash) can be increased to about 100 ~ 200 MHz. While this approach appears to be straightforward, there is a significant problem with signal quality at such high frequencies, which sets a practical limitation on the operating frequency of the conventional NAND flash memory.
In particular, the conventional NAND flash memory communicates with other components using a set of parallel input/output (I/O) pins, numbering 8 or 16 depending on the desired word configuration, which receive command instructions, receive input data and provide output data. This is commonly known as a parallel interface. High speed operation will cause well known communication degrading effects such as cross-talk, signal skew and signal attenuation, for example, which degrade signal quality. Such parallel interfaces use a large number of pins to read and write data. As the number of input pins and wires increases, so do a number of undesired effects. These effects include inter-symbol interference, signal skew and cross talk.
In order to address some of these disadvantages, several serial-connected system configurations featuring a set of memory devices connected in a ring have been provided. These include 'Multiple Independent Serial Link Memory' (US20070076479A1), 'Daisy Chain Cascading Devices' (US20070109833A1), 'Memory with Output Control' (US20070153576A1), 'Daisy chain cascade configuration recognition technique' (US2007233903A1), and 'Independent Link and Bank Selection' (US2007143677A1), all of which are assigned to the same assignee as this application and are hereby incorporated by reference in their entirety. These systems typically have serial in/out data pins along with two control signals for the enabling and disabling of a serial input port and serial output port respectively in order to provide a memory controller with the maximum flexibility of serial data communication. Some of these memory system configurations employ a shared bus topology for the system clock distribution, which is referred to as a 'common clock system' or 'multi-drop clocking system'. Some of these architectures use a point-to-point serial-connected clocking architecture featuring a DLL (delay locked loop) or PLL (phase locked loop) in every memory chip in order to synchronize two clock signals in each memory device, one being an input clock received from a preceding device or controller and the other being an output clock transmitted to the next device.
Summary of the Invention
According to one broad aspect, the invention provides a method in a slave device of a plurality of serial-connected slave devices, the method comprising: receiving a command from a master device specifying an adjustment to a clock duty cycle; receiving an input clock signal; generating a duty cycle corrected clock signal from the input clock signal in accordance with the command; outputting the duty cycle corrected clock signal.
In some embodiments, the slave device is a memory device and the master device is a memory controller.
In some embodiments, the method further comprises: receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device; receiving at least one input signal, the at least one input signal comprising at least the input clock signal; for each of the at least one input signal: generating a delayed version of the input signal in accordance with the command; outputting the delayed version of the input signal, the delayed version of the input clock signal comprising a delayed version of the duty cycle corrected clock signal.
In some embodiments, receiving a command from a master device specifying an adjustment to a clock duty cycle comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, the command further containing data indicating how to adjust the duty cycle.
In some embodiments, receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
In some embodiments, the method further comprises: performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that matches a device address of the slave device; performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that is a broadcast device address.
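The address check described above can be sketched as follows. This is a minimal illustration only: the field layout, the `BROADCAST_ADDRESS` and `CMD_WRITE_DUTY_CYCLE` values, and the function names are assumptions made for the example and are not specified in this description.

```python
from dataclasses import dataclass

# Hypothetical encodings: the description does not define these values.
BROADCAST_ADDRESS = 0xFF
CMD_WRITE_DUTY_CYCLE = 0x01


@dataclass(frozen=True)
class Command:
    device_address: int  # which slave device(s) should execute the command
    command_id: int      # identifies the command type
    data: int            # how to adjust the duty cycle


def should_execute(cmd: Command, own_address: int) -> bool:
    """Return True if the command targets this device or is a broadcast."""
    return cmd.device_address in (own_address, BROADCAST_ADDRESS)


def apply_command(cmd: Command, own_address: int, duty_setting: int) -> int:
    """Return the slave's duty cycle setting after seeing the command."""
    if cmd.command_id == CMD_WRITE_DUTY_CYCLE and should_execute(cmd, own_address):
        return cmd.data
    # Not addressed to this device: keep the current setting.
    return duty_setting
```

A device whose address does not match simply keeps its current setting, while a broadcast command updates every device.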
In some embodiments, generating a duty cycle corrected clock signal comprises: a) generating a half rate clock signal from the input clock signal; b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
In some embodiments, the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
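The three steps a) to c) can be illustrated with a small sampled-signal model. The sketch below assumes that the half rate clock toggles on each rising edge of the input clock and that the combining step is an exclusive-OR; neither detail is stated in this summary, and all function names are hypothetical.

```python
def input_clock(period, high, cycles):
    """Sampled clock: `high` samples at 1, then `period - high` samples at 0."""
    return [1 if (t % period) < high else 0 for t in range(period * cycles)]


def half_rate(clock):
    """Step a): toggle on every rising edge of the input clock."""
    out, state, prev = [], 0, 0
    for level in clock:
        if level == 1 and prev == 0:
            state ^= 1
        out.append(state)
        prev = level
    return out


def delay(signal, samples):
    """Step b): shift the signal later by `samples`, padding with zeros."""
    return [0] * samples + signal[:len(signal) - samples]


def duty_cycle_corrected(clock, delay_samples):
    """Step c): combine the half rate clock with its delayed copy (XOR assumed)."""
    half = half_rate(clock)
    return [a ^ b for a, b in zip(half, delay(half, delay_samples))]


def duty(signal, period, skip_cycles=2):
    """Fraction of time the signal is high, ignoring the start-up cycles."""
    steady = signal[period * skip_cycles:]
    return sum(steady) / len(steady)
```

In this model the output clock keeps the input period, and its high time equals the selected delay, so an input clock with a 30% duty cycle and a period of 20 samples becomes a 50% clock when the selected delay is 10 samples.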
According to another broad aspect, the invention provides a method in a memory system comprising a master device and a plurality of serial-connected slave devices comprising at least a first slave device and a last slave device, the method comprising: in the master device: a) outputting a first clock signal that functions as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command; in the first slave device of the plurality of serial-connected slave devices: a) receiving the first clock signal from the master device as the input clock signal of the first slave device; b) generating an output clock signal from the input clock signal; in each other slave device of the plurality of serial-connected slave devices: a) receiving the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generating an output clock signal from the input clock signal; in each of at least one of the plurality of serial-connected devices acting as slave devices: a) receiving the duty cycle correction command; b) generating a duty cycle corrected clock signal from the input clock signal in accordance with the duty cycle correction command; c) outputting the duty cycle corrected clock signal as the output clock signal of the slave device.
In some embodiments, each slave device is a memory device and the master device is a memory controller.
In some embodiments, the method further comprises: in the master device: a) outputting at least one output signal, the at least one output signal comprising the first clock signal to function as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) determining an amount of phase offset between the first clock signal and the second clock signal; d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
In some embodiments, generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by any specified one of the plurality of serial-connected slave devices.
In some embodiments, generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by all of the plurality of serial-connected slave devices.
In some embodiments, receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, and containing data indicating how to adjust the duty cycle.
In some embodiments, generating a duty cycle corrected clock signal comprises: a) generating a half rate clock signal from the input clock signal; b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
In some embodiments, the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
According to another broad aspect, the invention provides a slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising: a command input for receiving a command from a master device specifying an adjustment to a duty cycle; a clock input for receiving an input clock signal; a duty cycle correction circuit for generating a duty cycle corrected clock signal from the clock input in accordance with the control command; a clock output for outputting the duty cycle corrected clock signal.
In some embodiments, the slave device is a memory device.
In some embodiments, the command input is also for receiving a command from the master device specifying an adjustment to output delay; an output delay adjustment circuit for generating a delayed clock signal from the duty cycle corrected clock signal in accordance with the command; wherein the clock output for outputting the duty cycle corrected clock signal outputs the delayed clock signal.
In some embodiments, the slave device further comprises: a command processing circuit that processes the command, wherein the command comprises: a command identifier that identifies the command as a duty cycle correction command; and data indicating how to adjust the duty cycle.
In some embodiments, the slave device further comprises: a device address register; wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device address matches contents of the device address register.
In some embodiments, the duty cycle correction circuit comprises: a) a clock divider circuit that generates a half rate clock signal from the input clock signal; b) a delay circuit that delays the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal; c) a combiner that combines the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
In some embodiments, the delay circuit comprises M unit delay elements, M>=2, the duty cycle correction circuit further comprising: an N-to-M decoder that decodes signals received on N input lines, N>=1, into a selection of how many of the unit delay elements are to be active in delaying the half rate clock signal to produce the delayed half rate clock signal.
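As a rough illustration of the N-to-M decoding, the sketch below turns N input line levels into M enable flags, one per unit delay element. It assumes that the input lines form a binary code read most significant bit first and that code k activates k+1 elements; the description does not fix this mapping, and the function name is hypothetical.

```python
def decode_delay_select(bits, m):
    """Decode N input line levels (MSB first) into M unit delay enable flags.

    Assumed mapping: binary code k activates the first k + 1 elements.
    """
    n = len(bits)
    if n < 1 or m < 2:
        raise ValueError("need N >= 1 input lines and M >= 2 delay elements")
    value = 0
    for b in bits:
        value = (value << 1) | (1 if b else 0)
    active = value + 1
    if active > m:
        raise ValueError("code selects more elements than the delay line has")
    # Enable flags in thermometer form: active elements first, then inactive.
    return [1] * active + [0] * (m - active)
```

With three input lines and eight unit delay elements, the code 011 enables four elements, so the delayed half rate clock lags by four unit delays under this assumed mapping.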
According to another broad aspect, the invention provides a system comprising: a plurality of serial-connected devices acting as slave devices according to claim 13 comprising at least a first slave device and a last slave device; a master device connected to the first slave device and to the last slave device; the master device configured to output a first clock signal that functions as an input clock signal of the first slave device; a clock input for receiving a second clock signal that is an output clock signal of the last slave device; a duty detector that determines a duty cycle of the second clock signal; a command generator that generates a duty cycle correction command specifying an adjustment to a clock duty cycle as a function of the duty cycle of the second clock signal; wherein, the first slave device of the plurality of serial-connected devices acting as slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected devices acting as slave devices: a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generates an output clock signal from the input clock signal; wherein at least one of the plurality of serial-connected slave devices: a) receives the duty cycle correction command; b) generates a duty cycle corrected clock signal in accordance with the control command; c) outputs the duty cycle corrected clock signal as the output clock signal of the slave device.
In some embodiments, the system is a memory system, each slave device is a memory device and the master device is a memory controller.
In some embodiments, the memory system further comprises: a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal; wherein the command generator also generates an output delay adjustment command as a function of the amount of phase offset; wherein, the first slave device of the plurality of serial-connected slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected slave devices: a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generates an output clock signal from the input clock signal; wherein at least one of the plurality of serial-connected slave devices: a) receives the output delay adjustment command; b) generates the output clock signal of the device by delaying the input clock signal of the device in accordance with the control command.
In some embodiments, the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by a specified one of the plurality of serial-connected devices acting as slave devices.
In some embodiments, the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by all of the plurality of serial-connected devices acting as slave devices.
In some embodiments, receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, and containing data indicating how to adjust the duty cycle.
According to one broad aspect, the invention provides a method in a slave device of a plurality of serial-connected slave devices, the method comprising: receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device; receiving at least one input signal, the at least one input signal comprising at least an input clock signal; for each of the at least one input signal: generating a delayed version of the input signal in accordance with the command; outputting the delayed version of the input signal.
In some embodiments, the slave device is a memory device and the master device is a memory controller.
In some embodiments, the method comprises: outputting a data output signal; wherein at least one of the input signals comprises a data input signal and wherein outputting the delayed version of the data input signal is performed as part of outputting the data output signal such that: a) some of the time the data output signal is said delayed version of the data input signal; b) some of the time the data output signal is a delayed version of a signal produced locally to the slave device, after applying the delay to the signal produced locally to the slave device in accordance with the command.
In some embodiments, receiving a command from a master device specifying an adjustment to a delay to be applied to at least one signal output by the slave device comprises receiving a command containing a command identifier that identifies the command as an output delay adjustment command, the command further containing data indicating how to adjust the delay.
In some embodiments, receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
In some embodiments, the method further comprises: performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that matches a device address of the slave device; performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that is a broadcast device address.
In some embodiments, for each input signal, generating a delayed version of the input signal comprises: a) delaying the input signal by a selected one of a plurality of delays to produce the delayed version of the input signal.
In some embodiments, the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
In some embodiments, the plurality of input signals comprise: a clock signal; a command strobe signal; a data strobe signal; a data signal containing commands and data.
According to another broad aspect, the invention provides a method in a memory system comprising a master device and a plurality of serial-connected devices acting as slave devices comprising at least a first slave device and a last slave device, the method comprising: in the master device: a) outputting at least one output signal, the at least one output signal comprising a first clock signal to function as an input clock signal of the first slave device; b) receiving a second clock signal that is an output clock signal of the last slave device; c) determining an amount of phase offset between the first clock signal and the second clock signal; d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
In some embodiments, each slave device is a memory device and the master device is a memory controller.
In some embodiments, the method further comprises: in the first slave device of the plurality of serial-connected devices acting as slave devices: a) receiving the at least one output signal from the master device as corresponding at least one input signal of the first slave device; b) for each input signal, generating an output signal based on the input signal; in each other slave device of the plurality of serial-connected devices acting as slave devices: a) receiving output signal(s) of a preceding slave device corresponding to at least one input signal of the slave device; b) for each input signal, generating an output signal based on the input signal; in at least one of the slave devices, a) receiving the output delay adjustment command; and b) generating the output signal(s) by generating a delayed version of the input signal(s) in accordance with the output delay adjustment command.
In some embodiments, the method further comprises: wherein the at least one output signal of the master device comprises a plurality of output signal(s).
In some embodiments, generating a delay adjustment command comprises generating a delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
In some embodiments, generating a delay adjustment command comprises generating a delay adjustment command for execution by all of the plurality of serial-connected slave devices.
In some embodiments, generating a delayed version of the input signal(s) in accordance with the output delay adjustment command comprises generating a delayed version of the input signal(s) delayed by a selected one of a plurality of delays.
In some embodiments, generating a delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the delay.
In some embodiments, the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
In some embodiments, the method further comprises: the master device outputting output delay adjustment commands that adjust delay by adding one unit delay element in one slave device at a time until the phase offset is acceptable.
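The incremental adjustment described above might be modelled as in the sketch below. It is an illustration under stated assumptions rather than the method of the description: delays are integers in arbitrary time units, devices are visited in round-robin order, "acceptable" is taken to mean that the returned clock is within `tolerance` of a whole number of clock periods, and all names and parameters are hypothetical.

```python
def phase_offset(total_delay, period):
    """Offset of the returned clock relative to the outgoing clock, in [0, period)."""
    return total_delay % period


def calibrate(base_delays, unit_delay, period, tolerance, max_units_per_device):
    """Add one unit delay in one slave at a time until the offset is acceptable.

    Returns the per-device unit counts, or None if no acceptable setting exists.
    """
    units = [0] * len(base_delays)
    device = 0
    while True:
        total = sum(base_delays) + unit_delay * sum(units)
        offset = phase_offset(total, period)
        # Acceptable when the returned edge is close to an outgoing edge.
        if min(offset, period - offset) <= tolerance:
            return units
        if all(u >= max_units_per_device for u in units):
            return None
        # Skip devices whose delay line is already fully used.
        while units[device] >= max_units_per_device:
            device = (device + 1) % len(units)
        units[device] += 1  # models one output delay adjustment command
        device = (device + 1) % len(units)
```

For example, with three devices contributing 700, 650 and 720 time units, a unit delay of 100 and a clock period of 2500, four commands bring the total ring delay to 2470, which is within 50 units of one full period.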
In some embodiments, the plurality of input signals comprise: a clock signal; a command strobe signal; a data strobe signal; a data signal containing commands and data.
According to another broad aspect, the invention provides a slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising: a command input for receiving a command from a master device specifying how to perform output delay adjustment; a clock input for receiving an input clock signal; an output delay adjustment circuit for generating a delayed clock signal from the clock input in accordance with the command; a clock output for outputting the delayed clock signal.
In some embodiments, the slave device is a memory device.
In some embodiments, the slave device comprises: a command processing circuit that processes the command, wherein the command contains a command identifier that identifies the command as an output delay adjustment command, and contains data indicating how to adjust the output delay.
In some embodiments, the slave device further comprises: a device address register; wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device identifier matches contents of the device address register.
In some embodiments, the output delay adjustment circuit comprises: for each of a plurality of input signals, inclusive of the input clock signal, a delay circuit that delays the input signal by a selected one of a plurality of delays to produce a delayed version of the input signal.
In some embodiments, each output delay circuit comprises M unit delay elements, M>=2, the duty cycle correction circuit further comprising: an N-to-M decoder that decodes signals received on N input lines, N>=1, into a selection of how many of the unit delay elements are to be active in producing the delayed version of the input signal.
According to another broad aspect, the invention provides a memory system comprising: a plurality of serial-connected slave devices comprising at least a first slave device and a last slave device; a master device connected to the first slave device and to the last slave device; the master device configured to output a first clock signal that functions as an input clock signal of the first slave device; a clock input for receiving a second clock signal that is an output clock signal of the last slave device; a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal; a command generator that generates an output delay adjustment command as a function of the amount of phase offset; wherein, the first slave device of the plurality of serial-connected slave devices: a) receives the first clock signal from the master device as the input clock signal of the first slave device; b) generates an output clock signal from the input clock signal; wherein each other slave device of the plurality of serial-connected slave devices: a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device; b) generates an output clock signal from the input clock signal; wherein at least one of the plurality of serial-connected slave devices: a) receives the output delay adjustment command; b) generates the output clock signal of the device by delaying the input clock signal of the device in accordance with the control command.
In some embodiments, the system is a memory system, each slave device is a memory device and the master device is a memory controller.
In some embodiments, the command generator is configured to generate the output delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
In some embodiments, the command generator is configured to generate the output delay adjustment command for execution by all of the plurality of serial-connected slave devices.
In some embodiments, generating an output delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the output delay.
Methods and apparatus of clock duty cycle correction and/or phase synchronization are provided that do not require DLL or PLL, for a serial-connected memory system, typically including a memory controller and a plurality of memory chips connected in a ring configuration. In some embodiments, the memory controller has a phase/duty cycle detector for detecting phase and duty cycle of a clock signal after having travelled around the ring, and each memory device has one or more controller programmable delay lines that are used to adjust the phase and/or duty cycle of the clock. These are adjusted by commands sent from the memory controller until the phase and duty cycle detected by the memory controller are acceptable.
The methods and apparatus described herein can be applied to any kind of semiconductor integrated circuit system having any kind of semiconductor integrated circuit devices as slave devices in a serial-connected configuration with a common interface between adjacent devices. Examples of integrated circuit types include central processing units, graphics processing units, display controller IC, disk drive IC, memory devices like NAND Flash EEPROM, NOR Flash EEPROM, AND Flash EEPROM, DiNOR Flash EEPROM, Serial Flash EEPROM, DRAM, SRAM, ROM, EPROM, FRAM, MRAM, PCRAM etc.
Brief Description of the Drawings
Figure 1 is a system block diagram of serial-connected memory system having a controller programmable duty cycle correction scheme;
Figure 2 is a block diagram of a memory device having controller programmable duty cycle correction scheme;
Figure 3 is a block diagram of a programmable delay line for duty cycle correction;
Figure 4 is a timing diagram of controller programmable duty cycle correction;
Figure 5 is a flowchart of a method of duty cycle correction;
Figure 6 is a timing diagram for a write duty cycle register command;
Figure 7 is a block diagram of a programmable delay line for output delay adjustment;
Figure 8 is a timing diagram of controller programmable output delay adjustment;
Figure 9 is a flowchart of a method of performing output delay adjustment; and
Figure 10 is a timing diagram for a write output delay register command.
Detailed Description
In the following detailed description of sample embodiments of the invention, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific sample embodiments in which the present invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical, and other changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
Some of the memory system configurations referred to in the background employ a shared bus topology for the system clock distribution, which is referred to as a 'common clock system' or 'multi-drop clocking system'. If the system clock is applied to too many memory devices in parallel and the clock signal travels too far from the clock source, typically a memory controller, the maximum operating clock frequency may be limited by the total loading of the clock signal and the distance that the clock travels in the memory system's physical layout.
Some of the memory system configurations referred to in the background use a point-to-point serial-connected clocking architecture featuring a DLL or PLL in each memory device in order to synchronize two clock signals in the memory device, one being an input clock received from a preceding device or controller and the other being an output clock transmitted to the next device. However, having an on-chip DLL or PLL in each memory device can cause a significant amount of power consumption. With an on-chip DLL or PLL, various chip-to-chip clock delays (caused by various interconnect loadings and different wire bonding loadings such as multi-chip stacking or package) accumulate through a large number of serial-connected devices and may be unacceptable for system operation.
Referring now to Figure 1, shown is a system block diagram of a serial-connected memory system generally indicated at 101 employing a controller programmable duty cycle correction scheme. The memory system 101 includes a memory controller 10 as a master device connected to a first memory device 100-1. Memory device 100-1 is the first of a series of slave devices including devices 100-1 through 100-8 that are connected in a ring configuration, with the last device 100-8 connected back to the memory controller 10. In the illustrated example a highly multiplexed unidirectional point-to-point bus architecture is provided to transfer information such as commands, addresses and data from the memory controller 10 to the memory devices 100-1 to 100-8. This bus architecture includes a link 90 from the memory controller 10 to the first memory device 100-1, and a respective link between each pair of adjacent memory devices, these including links 90-1 through 90-7, and a link 90-8 between the last memory device 100-8 and the memory controller 10.
In the illustrated example, each link includes a set of signals output by a preceding device (the memory controller 10 or a memory device) for receipt by a succeeding device. Each link includes a set of output ports of a preceding device, a set of input ports of a succeeding device, and a set of physical interconnections between the output ports and the input ports. For convenience, the output ports will be given the same name as the signals they output and the input ports will be given the same name as the signals they receive. In the illustrated example, the signals (and output ports) of a preceding device are referred to as CSO (Command Strobe Output), DSO (Data Strobe Output), Qn (Data Output), CKO/CKO# (differential clock output signals). The corresponding signals (and input ports) of a succeeding device are referred to as CSI (Command Strobe Input), DSI (Data Strobe Input), Dn (Data Input), CKI/CKI# (differential clock input signals). There may be additional ports or signals (for example, CE# (chip enable), RST# (reset) or power supply pins) that are not shown, for simplicity and ease of understanding. The physical interconnections include differential clock buses S111, S111-1 to S111-8 for differential clock signals, S112, S112-1 to S112-8 for command strobe, S113, S113-1 to S113-8 for data strobe, and S114, S114-1 to S114-8 for data.
In some embodiments, the data output Qn and the data input Dn may have different data widths with n=0 for 1-bit Link setting; n=0, 1 for 2-bit Link setting; n=0, 1, 2, 3 for 4-bit Link setting; n=0, 1, 2, 3, 4, 5, 6, 7 for 8-bit Link setting and so on. In some embodiments, the width of the link may be programmed through a link configuration register to utilize 1, 2, 4, or 8 of a device package's available data input and output pins. This feature allows these memory devices to operate in a ring configuration together with devices that have smaller or larger maximum link widths provided they are all programmed to use the same link width. See for example 'Switching Method of Link and Bit Width' (WO 2008/070978), hereby incorporated by reference in its entirety.
CKI/CKI# are input clocks. A Command/Address Packet on the Dn port delineated by CSI is latched on the rising edges of CKI or the falling edges of CKI#. A Write Data Packet on Dn delineated by DSI is latched on the rising edges of CKI or the falling edges of CKI#.
CKO/CKO# are output clocks which are delayed versions of CKI/CKI#. CSO, DSO and Qn signals are referenced to the rising edges of CKO or to the falling edges of CKO#; for example, a Read Data Packet on Qn delineated by DSO is referenced at the rising edges of CKO or the falling edges of CKO#.
When Command Strobe Input (CSI) is HIGH, Command/Address Packets through Dn are latched on the rising edges of CKI or falling edges of CKI#.
Command Strobe Output (CSO) is an echo signal of CSI. It echoes CSI transitions with a latency tIOL that in a particular implementation is a two clock cycle latency referenced to the rising edges of CKO or to the falling edges of CKO#. Two clock cycle latency is an implementation detail; more generally it could be any number of clock cycles appropriate for a given design.
When Data Strobe Input (DSI) is HIGH while the memory device is in 'Read-Mode', it enables the read data output path and Qn buffer (not shown). If DSI is LOW, the Qn buffer holds the previous data accessed. If DSI is HIGH while the memory device is in 'Write-Mode', it enables a Dn buffer and receives Write Data Packet on the rising edges of CKI or falling edges of CKI#.
Data Strobe Output (DSO) is an echo signal of DSI. It echoes DSI transitions with latency tIOL referenced to the rising edges of CKO or to the falling edges of CKO#. As indicated above, tIOL is two clock cycles in a particular implementation.
Data Input signal Dn (n = 0, 1, 2, 3, 4, 5, 6 or 7) carries command, address and/or input data information. If the chip is configured in '1-bit Link mode', D0 is the only valid signal and receives one byte of a packet in eight clock cycles. If the chip is configured in '2-bit Link mode', D0 & D1 are valid signals and receive one byte of a packet in four clock cycles. If the chip is configured in '4-bit Link mode', D0, D1, D2 & D3 are valid signals and receive one byte of a packet in two clock cycles. If the chip is configured in '8-bit Link mode', D0, D1, D2, D3, D4, D5, D6 & D7 are all valid signals and receive one byte of a packet in one clock cycle.
Data Output signal Qn (n = 0, 1, 2, 3, 4, 5, 6 or 7) carries output data during a read operation or bypasses command, address or input data received on Dn. If the chip is configured in '1-bit Link mode', Q0 is the only valid signal and transmits one byte of a packet in eight clock cycles. If the chip is configured in '2-bit Link mode', Q0 & Q1 are valid signals and transmit one byte of a packet in four clock cycles. If the chip is configured in '4-bit Link mode', Q0, Q1, Q2 & Q3 are valid signals and transmit one byte of a packet in two clock cycles. If the chip is configured in '8-bit Link mode', Q0, Q1, Q2, Q3, Q4, Q5, Q6 & Q7 are all valid signals and transmit one byte of a packet in one clock cycle.
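The relationship between link width and transfer time stated for Dn and Qn can be sketched in a few lines of Python. This is an illustration only: the function names cycles_per_byte and valid_pins are hypothetical and do not appear in the specification.

```python
def cycles_per_byte(link_width):
    """Clock cycles needed to move one 8-bit byte over a link of the given width."""
    if link_width not in (1, 2, 4, 8):
        raise ValueError("link width must be 1, 2, 4 or 8")
    return 8 // link_width


def valid_pins(link_width, prefix="D"):
    """Names of the pins that carry valid data for the given link width."""
    cycles_per_byte(link_width)  # reuse the width check
    return [prefix + str(n) for n in range(link_width)]


if __name__ == "__main__":
    for width in (1, 2, 4, 8):
        print(width, cycles_per_byte(width), valid_pins(width, "Q"))
```

For a 4-bit link, for example, the sketch reports two clock cycles per byte on pins Q0 to Q3, in agreement with the text.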
It should be clearly understood that the number of ports and the signals they contain for transmission between adjacent pairs of devices and serial-connected memory systems are implementation specific and are not necessarily those depicted in Figure 1. More generally, at least a clock signal is conveyed between each pair of consecutive devices. There may be additional signals that are conveyed between the consecutive devices, and specific examples of these have been given above. It is also noted that the particular number of memory devices, eight in the example of Figure 1, is an implementation specific detail. Any appropriate number of devices can be interconnected in the serial-connected architecture. Note that the expression "serial-connected" in this context is referring to the serial arrangement of memory devices, one after the other, and not to the nature of the link between each pair of adjacent devices, which may be serial or parallel in nature.
The memory controller 10 contains a phase detector 11, a duty detector 13 and a command generator 12. In some embodiments, the memory controller 10 only includes the phase detector 11 in which case only output delay adjustment is performed. In some embodiments, the memory controller 10 includes only the duty detector 13 in which case only duty cycle correction is performed. In some embodiments, both the phase detector 11 and the duty detector 13 are included in which case both output delay adjustment and duty cycle correction may be performed. This last case is assumed in the detailed description which follows. The phase detector 11 and the duty detector 13 are connected to the command generator 12 through signal buses S11 and S12 respectively. The command generator 12 has an output signal bus S13 connected to CSO and Qn ports through which it can output commands.
The memory controller 10 drives the differential clock buses, S111, from its port CKO/CKO#, and all eight memory devices 100-1 to 100-8 receive the differential clock buses through their own clock ports, CKI/CKI#, from the previous device's CKO/CKO# ports in a series flow-through manner. The memory controller 10 drives three different buses, S112, S113 and S114 through its ports, CSO, DSO and Qn, respectively. The first memory device 100-1 receives the three buses, S112, S113 and S114, through its ports, CSI, DSI and Dn, respectively, and the first memory device 100-1 re-drives (echoes) three corresponding buses, S112-1, S113-1 and S114-1 through its output ports, CSO, DSO and Qn, respectively, with 2 clock cycles of latency (= tIOL). The second memory device 100-2 receives the three buses, S112-1, S113-1 and S114-1, through its input ports, CSI, DSI and Dn, respectively. This approach applies to all of the eight memory devices 100-1 to 100-8 with the final buses, S112-8, S113-8 and S114-8, connected back to the memory controller 10 through the memory controller's input ports, CSI, DSI and Dn, respectively.
In operation, for duty cycle correction, the duty detector 13 monitors a duty ratio of CKI/CKI# which is the clock input after it has been passed between all of the devices 100-1 to 100-8 in the ring. If the duty detector 13 detects a duty error from CKI/CKI#, namely a deviation in the duty cycle from a desired duty cycle, it asserts through signal bus S12 either a 'Duty_Add' to indicate the duty cycle is shorter than the desired duty cycle and should be lengthened or 'Duty_Sub' to indicate the duty cycle is longer than the desired duty cycle and should be shortened. In response, the command generator 12 generates an appropriate "Write Duty Cycle Register" command packet.
In operation, for output delay adjustment, the phase detector 11 monitors the phase of CKI/CKI#. If the phase detector 11 detects a phase error (PE) between CKI/CKI# and CKO/CKO#, it asserts a 'PE' signal through the signal bus S11. In response, the command generator 12 generates an appropriate "Write Output Delay Register" command packet.
The command generator 12 issues the appropriate command packet according to the signals received on S11 and S12, and sends the command information through signal bus S13 and the CSO and Qn ports.
Referring now to Figure 2, shown is a block diagram of an exemplary implementation of the memory devices 100-1 to 100-8 of Figure 1. The device, generally indicated at 100, includes a memory core 150, command/address packet logic 130, data packet logic 140, and duty cycle correction logic 120. Memory core 150 may be a single bank of memory cell arrays or it could be multiple banks of memory cell arrays, depending on design variations. Data packet logic 140 processes and stores all necessary data transferring information. Command/address packet logic 130 processes all command instructions and/or address information coming through internal signals, 'dn_lat', according to an internal control signal 'csi_lat' as detailed below.
Clock Input Processing
The device 100 includes clock input receiver 102D for CKI/CKI# which may for example be a differential type input buffer to handle the differential clock inputs, CKI & CKI#. The clock input receiver 102D translates the external interface levels of CKI/CKI# signals to the internal logic levels of an internal clock signal 'cki_i'. The internal clock signal, cki_i, may be used in other internal logic blocks for various operations. As will be described in detail below, the duty cycle correction logic 120 takes the internal clock signal, cki_i, and produces a duty cycle corrected clock signal clk_dcc. The duty cycle corrected clock signal, 'clk_dcc', is delayed by a controller programmable delay line, PDL2, 105D, and its delayed signal, 'clk_dcc_d', is finally driven to the input port of an output driver block 108D, which outputs the external clock output signals, CKO/CKO#.
Command Strobe Input Processing
The device 100 includes command strobe receiver 102A which generates a buffered signal 'csi_i' from a CSI input signal. The buffered signal, csi_i, is connected to the D port of a D-type flip-flop 103A. The flip-flop 103A is driven by the clock signal, 'cki_i', and latches the status of the 'csi_i' signal at every rising edge of 'cki_i'. The latched signal 'csi_lat' is provided to the command/address packet logic 130, and also is provided to the D port of another flip-flop 103E, whose clock input port is driven by the duty corrected clock signal, clk_dcc. The flip-flop 103E's output signal, 'cso_i', is delayed by a controller programmable delay line, PDL2, 105A, and its delayed signal, 'cso_d', is finally driven to the input port of an output driver block 108A, which then outputs the external signal, CSO. Two stages of flip-flop logic 103A and 103E provide an input to output latency (= tIOL) of two clock cycles for CSI to CSO bypassing.
Data Strobe Input Processing
The device 100 includes data strobe input receiver 102C which generates a buffered signal 'dsi_i' from a DSI input signal. The buffered signal, dsi_i, is connected to the D port of a D-type flip-flop 103C. The flip-flop 103C is driven by the clock signal, 'cki_i', and latches the status of the 'dsi_i' signal at every rising edge of 'cki_i'. The latched signal 'dsi_lat' is provided to the command/address packet logic 130 and data packet logic 140, and also is provided to the D port of another flip-flop 103G, whose clock input port is driven by the duty corrected clock signal, clk_dcc. The flip-flop 103G's output signal, 'dso_i', is delayed by a controller programmable delay line, PDL2, 105C, and its delayed signal, 'dso_d', is finally driven to the input port of an output driver block 108C, which outputs the external signal, DSO. Two stages of flip-flop logic 103C and 103G provide the same input to output latency (= tIOL) of two clock cycles for DSI to DSO bypassing.
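The two clock cycle input-to-output latency produced by a pair of flip-flops can be illustrated with a small cycle-level model. This is a simplified sketch under stated assumptions: both flip-flops are treated as clocked by the same edge (the distinction between cki_i and clk_dcc is ignored), the programmable delay line is ignored, and the function name echo_two_stage is hypothetical.

```python
def echo_two_stage(samples, initial=0):
    """Model a strobe passing through two flip-flops in series.

    samples[k] is the input level captured by the first flip-flop at rising
    edge k. The returned list holds the level that a downstream device would
    capture from the second flip-flop's output at the same edge k.
    """
    stage1 = initial  # models the latched signal (e.g. csi_lat or dsi_lat)
    stage2 = initial  # models the second flip-flop output (e.g. cso_i or dso_i)
    seen_downstream = []
    for level in samples:
        # The downstream device samples the value that was stable before this edge.
        seen_downstream.append(stage2)
        # Both flip-flops update together: stage 2 takes the old stage-1 value.
        stage2, stage1 = stage1, level
    return seen_downstream


if __name__ == "__main__":
    strobe_in = [0, 1, 1, 1, 0, 0, 0]
    print(echo_two_stage(strobe_in))  # the same pulse, two edges later
```

In this model the output sequence is the input sequence shifted by two edges, which corresponds to tIOL of two clock cycles in the described implementation.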
Data Processing
The device 100 includes a data receiver, 102B, for receiving an external signal Dn. It is noted that the number of receivers 102B can be one or more than one according to the bit width of Dn ports. For example, if Dn ports are designated D0, D1, ..., D7, for an 8 bit wide data input/output implementation, the receiver 102B will be repeated eight times. The output of the receiver 102B, 'dn_i', is provided to the D port of a D-type flip-flop 103B. The flip-flop 103B is driven by the clock signal, 'cki_i', and latches the status of the 'dn_i' signal at every rising edge of 'cki_i'. The latched signal 'dn_lat' is provided to the command/address packet logic 130 and also is provided to data packet logic 140. The latched signal, 'dn_lat', is also provided to one input port of a multiplexer 104. The other port of the multiplexer 104 is driven by a signal, 'core_data', from the data packet logic 140. The output of the multiplexer 104 is connected to the D input port of a flip-flop 103F, whose clock input port is driven by the duty corrected clock signal, clk_dcc, and latches the status of the output of the multiplexer 104 at every rising edge of 'clk_dcc'. The latched signal, 'q_i', is delayed by another controller programmable delay line, PDL2, 105B, and its delayed signal, 'q_d', is finally driven to the input port of an output driver block 108B, which outputs the external signal, Qn. Two stages of flip-flop logic 103B and 103F provide the same input to output latency (= tIOL) of two clock cycles for Dn to Qn bypassing.
The internal signal dn_i includes both command content (as delineated by the command strobe input) and data input (as delineated by the data strobe input) when present. Each device has a device address, in some embodiments stored in a device address register 131. Each command includes a Device Address portion that contains the device address of one of the memory devices to which the command is addressed. There may also be a broadcast address that requires the command to be processed by all devices. The memory device 100 processes each command by examining the Device Address portion. If the Device Address information in the received command/address packet matches the memory device 100's own stored device address, the command/address packet logic 130 processes the command, and also issues an "id_match" signal to signify that the command is for that memory device. The id_match signal is used to steer the data flow path of the multiplexer 104. If "id_match" is in a HIGH logic state (more generally in a "match state" however that is defined) as a result of the device address matching process, the multiplexer 104 selects "core_data" to be outputted, so that the data from the memory core 150 can be transferred to the flip-flop 103F. On the other hand, if "id_match" is in a LOW logic state (more generally in a "no match state" however that is defined) as a result of the device address matching process, the multiplexer 104 selects "dn_lat" to be outputted, so that the data received from the data input Dn can be transferred to the flip-flop 103F to be echoed at the output Qn.
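The steering of the multiplexer 104 by the address match can be written out as a short sketch. The function names device_id_match and mux_104_output are hypothetical, and the behaviour of the match signal for a broadcast address is deliberately left out because the passage above does not state it.

```python
def device_id_match(packet_device_address, own_device_address):
    """Return True (match state) when the packet is addressed to this device."""
    return packet_device_address == own_device_address


def mux_104_output(id_match, core_data, dn_lat):
    """Select what is passed on to the output flip-flop.

    A match selects the locally produced core_data; otherwise the latched
    input dn_lat is bypassed so that it is echoed on the output.
    """
    return core_data if id_match else dn_lat


if __name__ == "__main__":
    own_address = 0x03  # example value chosen for illustration
    for da in (0x03, 0x05):
        match = device_id_match(da, own_address)
        print(hex(da), match, mux_104_output(match, "core", "bypass"))
```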
Thus, the multiplexer 104 allows for the selection between a) bypassing data received from the data input Dn by selecting the dn_lat input of the multiplexer 104, and b) outputting the core_data by selecting the core_data input of the multiplexer 104. The signal 'core_data' is usually transferred from the memory core 150 to the data packet logic 140, for example as part of a 'PAGE READ' operation upon request from the memory controller 10. Then after the 'PAGE READ' operation is done, the memory controller 10 can request a 'BURST READ' operation to the memory device with a command addressed to that memory device. In that case, the memory device processes the 'BURST READ' command and the corresponding address information including the Device Address portion. If the Device Address information in the received command/address packet matches the memory device 100's own stored device address, the command/address packet logic 130 issues the "id_match" signal in order to steer the data flow path of the multiplexer 104. If "id_match" is in a HIGH logic state as a result of the device address matching process, the multiplexer 104 selects "core_data" to be outputted, so that the data previously transferred from the memory core 150 to the data packet logic 140 can be transferred to the flip-flop 103F.
Note that in the case that a command is addressed to the memory device, but the command is not a BURST READ command, in some embodiments the core_data input of the multiplexer 104 is still selected even though there is no data to output. The core_data signal may be a static signal in such a case. This results in the data input Dn not being echoed to the next device. This can have the effect of reducing power consumption in the subsequent devices by eliminating the need for them to process data associated with commands that are not addressed to them. This is described in further detail in US Application serial no. 12/018,272 filed 01/23/2008 entitled "Semiconductor Device and Method for Reducing Power Consumption in a System Having Interconnected Devices".
Thus, in some embodiments, a delayed version of the data input signal Dn is produced as one component of a data output signal (Qn). Some of the time the data output signal is the delayed version of the data input signal. For the implementation described, this will be the case when there is content on the data input signal that is not for the particular memory device but other scenarios are possible. Furthermore, some of the time the data output signal comprises a delayed version of a signal produced locally to the memory device, after applying the delay to the signal produced locally to the memory device in accordance with the command. For the implementation described, the signal produced locally to the memory device is the so-called core_data output from the data packet logic 140 but other scenarios are possible.
The command/address packet logic 130 has a DCR (duty cycle correction register) 132 that produces an output DCR<0:3> to the duty cycle correction circuit 120 to control the amount of duty cycle correction performed as detailed below and has an ODR (output delay register) 134 that produces an output ODR<0:1> to the packet delay lines 105A, 105B, 105C, 105D to control the amount of output delay applied as detailed below. One of the available commands is a "Write Duty Cycle Correction Register" command for writing a value to the DCR 132. Similarly, one of the available commands is a "Write Output Delay Register" command for writing a value to the ODR 134.
Write Duty Cycle Correction Register Command
The use of a "Write Duty Cycle Correction Register" command assumes an implementation, as described herein, in which an amount of delay to be applied in performing duty cycle correction is controlled by writing a value to a duty cycle correction register. More generally, any command, referred to herein as a duty cycle correction command, may be employed that has the effect of causing a device to set how duty cycle correction is to be performed. Thus, the described "Write Duty Cycle Correction Register" command is to be considered a specific example of a duty cycle correction command.
Write Output Delay Register Command
The use of a "Write Output Delay Register" command assumes an implementation, as described, in which an amount of delay to be applied is controlled by writing a value to an output delay register. More generally, any command, referred to herein as an output delay adjustment command, may be employed that has the effect of causing a device to set the amount of delay to be applied. Thus, the described "Write Output Delay Register" command is to be considered a specific example of an output delay adjustment command.
Duty Cycle Correction
In the illustrated example, the duty cycle correction circuit 120 includes a clock divider 123, and a controller programmable delay line 121 that includes a '4-to-16 Decoder' block and 'Programmable Delay Line (PDL1)'. Respective outputs clk_ref, clk_del of the clock divider 123 and the controller programmable delay line 121 are input to an XOR gate 122, the output of which is the duty cycle corrected clock clk_dcc.
The clock divider 123 derives an output signal 'clk_ref' which has a frequency that is one half that of the input 'cki_i' signal. Clock divider circuits are well known in the art. In the particular example illustrated, the clock divider 123 includes a D-type flip-flop 103D that is driven by the internal clock signal, cki_i, through its clock input port. The output port Q of the D-type flip-flop 103D is connected to the input port D through inverter logic 124 in order to obtain a half frequency output signal.
The controller programmable delay line 121 produces an output signal, clk_del, which is a delayed version of clk_ref. The amount of delay is determined by the '4-to-16 Decoder' logic block's select signals, which are controlled by DCR<0:3> signal information received from command/address packet logic 130. The XOR logic gate 122 receives the two half clock signals, clk_ref and clk_del, and outputs a duty cycle adjusted full clock signal, clk_dcc.
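The way a divider, a delay and an XOR gate together set the duty cycle can be checked with a small discrete-time simulation. This is an idealized sketch, not the circuit of the figures: gates have no delay of their own, time advances in integer steps, and the names simulate_dcc and duty are hypothetical. Under these assumptions clk_ref toggles on every rising edge of cki_i, so it has a 50% duty cycle regardless of the input duty cycle, and the XOR output is HIGH for exactly the programmed delay after each edge of clk_ref.

```python
def simulate_dcc(period, high_time, delay, n_cycles=8):
    """Return (cki_i, clk_ref, clk_del, clk_dcc) as lists of 0/1 samples."""
    if not 0 < high_time < period or not 0 <= delay <= period:
        raise ValueError("need 0 < high_time < period and 0 <= delay <= period")
    total = period * n_cycles
    cki = [1 if (t % period) < high_time else 0 for t in range(total)]

    # Clock divider: a flip-flop that toggles on each rising edge of cki_i.
    clk_ref, state, previous = [], 0, 0
    for level in cki:
        if level == 1 and previous == 0:
            state ^= 1
        clk_ref.append(state)
        previous = level

    # Programmable delay line: clk_ref shifted by `delay` time steps.
    clk_del = [0] * delay + clk_ref[: total - delay]
    # XOR gate: HIGH whenever clk_ref and its delayed copy differ.
    clk_dcc = [a ^ b for a, b in zip(clk_ref, clk_del)]
    return cki, clk_ref, clk_del, clk_dcc


def duty(signal, period, skip_cycles=2):
    """Fraction of time the signal is HIGH, ignoring the start-up cycles."""
    window = signal[skip_cycles * period:]
    return sum(window) / len(window)


if __name__ == "__main__":
    for delay in (9, 10):
        cki, _, _, dcc = simulate_dcc(period=20, high_time=6, delay=delay)
        print(delay, duty(cki, 20), duty(dcc, 20))
```

In this model the output duty cycle equals delay divided by the input clock period, so it depends only on the programmed delay and not on the duty cycle of the incoming clock.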
Figure 3 is a block diagram of an example implementation of a programmable delay line 121 for duty cycle correction that may, for example, be used in the duty cycle correction circuit 120 of Figure 2. The half frequency clock signal, clk_ref, is driven to respective inputs of each of 16 unit delay blocks UNIT_0 - UNIT_15. Each unit delay block has an identical structure, and unit delay block UNIT_15 will be described by way of example. The unit delay block is composed of two NAND logic gates 1211 and 1212 and one inverter logic gate 1213. The first NAND logic gate 1211 receives the clk_ref input at its first input, and receives an output from a 4-to-16 decoder 1210 at its second input. The output of the first NAND logic gate 1211 is input to a first input of the second NAND logic gate 1212. For unit delay block UNIT_15, the second input of the second NAND logic gate 1212 is connected to Vdd. For all unit delay blocks except the right most unit delay block, UNIT_0, the output of the second NAND gate 1212 is connected through the inverter 1213 to the second input of the second NAND gate 1212 in the next unit delay block. The output of the second NAND gate of the right most unit delay block UNIT_0 is connected through an inverter and produces the overall output clock signal clk_del. The 4-to-16 Decoder block 1210 has a 4-bit wide input bus, DCR<0:3>, as its input. The decoder block 1210 decodes the input and outputs a 16-bit wide bus, SEL<15:0>, with one line of the bus connected to each of the 16 unit delay blocks. The unit delay logic shown is an example of a known circuit technique that has been used to produce a register controlled delay-locked-loop. Other unit delay logics can alternatively be employed. The use of 16 unit delay blocks is implementation specific. For example, more generally, an N-to-M decoder might be employed to decode signals received on N input lines into M control signals for M unit delay blocks, where N >=1 and M >=2.
In operation, the '4-to-16 Decoder' logic 1210 produces the 16 SEL<15:0> outputs such that only one of the 16 select signals is in a HIGH logic state and all the other 15 select signals are in LOW logic states. Therefore, only one unit delay block is selected to transfer the 'clk_ref' signal through the unit delay blocks that are to the right of the selected unit delay block. The control input DCR<0:3> is used to select which of the unit delay blocks will process the clk_ref input. The minimum delay is selected by selecting the right most unit delay block UNIT_0, in which case clk_del is the clk_ref signal delayed by one unit delay block, whereas the maximum delay is selected by selecting the left most unit delay block UNIT_15, in which case clk_del is the clk_ref signal delayed by all 16 unit delay blocks.
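The one-hot selection and the resulting number of unit delays can be expressed as a brief sketch. The function names decode_one_hot and delay_in_units are hypothetical, and the mapping of a register value k to select line SEL<k> is an assumption made for illustration.

```python
def decode_one_hot(value, n_outputs=16):
    """Model an N-to-M decoder: exactly one select line is HIGH."""
    if not 0 <= value < n_outputs:
        raise ValueError("register value out of range")
    return [1 if index == value else 0 for index in range(n_outputs)]


def delay_in_units(select_lines):
    """Number of unit delay blocks the clock passes through.

    Selecting UNIT_k injects the clock at block k, so it travels through
    blocks k, k-1, ..., 0, which is k + 1 unit delays.
    """
    if sum(select_lines) != 1:
        raise ValueError("exactly one select line must be HIGH")
    return select_lines.index(1) + 1


if __name__ == "__main__":
    for value in (0, 7, 15):
        print(value, delay_in_units(decode_one_hot(value)))
```

With 16 blocks the sketch gives one unit delay for UNIT_0 and 16 unit delays for UNIT_15, matching the minimum and maximum delays described above.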
For most process technologies, the unit delay amount of the illustrated unit delay block is around 100ps ~ 150ps. However, in some embodiments, a finer unit delay circuit block is employed for much higher operating frequency with finer delay tuning capability. The unit delay time is denoted as "tUD" in Figure 3 and the total delay time for the whole programmable delay line is denoted as "tPDL1" which is 16 times "tUD".
In some embodiments, a default setting for the power-on initialization is that having a logic HIGH state on the SEL<7> bit, as it is in the middle position of the delay line. However, in other design variations, the default settings can be different, and it may be recommended to have the minimum delay setting in order to be ready for operating at the maximum frequency.
Figure 4 is an example of a timing diagram of the controller programmable duty cycle correction procedure, where all of the signals are as shown in Figure 3 except CKI which is the raw input clock signal that is to be duty corrected. The timing diagram shows one extremely distorted clock input signal, CKI, at the top, for the sake of example only. The half clock signal, clk_ref, is derived from the 'clock divider' block 123 of Figure 2 and its rising and falling edges are aligned with two rising edges of CKI. It is assumed for this example that the clock signal, clk_dcc, would have a distorted duty ratio, such as 45% on, 55% off, for example, in the absence of any change to the DCR<0:3> values which are shown to initially be set to "0111b". After DCR<0:3> values are changed to "1000b", the duty cycle of the clock signal, clk_dcc, is corrected to be 50% on and 50% off as the result of a shift in the selection of the controller programmable delay line 121 from SEL<7> being enabled to SEL<8> being enabled.
Control of the Duty Cycle Correction
Recall that the contents of the DCR 132 are used to control the amount of delay introduced by the controller programmable delay line 121 in the duty cycle correction circuit 120, thereby controlling the duty cycle correction. As described above, the contents of the DCR 132 can be written with a 'Write Duty Cycle Register' command.
Figure 5 is a flow chart for the duty cycle correction procedure from the perspective of the controller. The method begins at block 5-1 with power on of the devices. At this point, all of the delay lines are initialized and device addresses for all devices are assigned. At block 5-2, the memory controller 10 monitors the duty ratio of CKI/CKI# using the duty detector 13. If there is a duty cycle error, yes path block 5-3, then in block 5-4 the duty detector 13 asserts the "Duty_Add" or the "Duty_Sub" signal S12. After this, the command generator 12 issues the 'Write Duty Cycle Register' command with "DCR+1" or "DCR-1" values. If there is still a duty cycle error, yes path block 5-6, then the method continues back at step 5-4 with the further adjustment to the duty cycle register. If there is no longer a duty cycle error, no path block 5-6, then duty cycle correction is completed at 5-7. Similarly, if no duty cycle error was detected in block 5-3, then at that point the method also is completed at 5-7.
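The controller-side loop of the flow chart can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller's actual logic: duty is expressed as an integer percentage, "Duty_Add" is assumed to map to "DCR+1" and "Duty_Sub" to "DCR-1", and the names correct_duty_cycle and ToyRing, together with the toy relationship between register value and measured duty, are hypothetical.

```python
def correct_duty_cycle(measure_duty, write_dcr, dcr=7, target=50,
                       tolerance=1, max_steps=16):
    """Adjust the duty cycle register until the measured duty is acceptable."""
    for _ in range(max_steps):
        error = measure_duty() - target          # blocks 5-2, 5-3 and 5-6
        if abs(error) <= tolerance:
            return dcr                           # block 5-7: correction done
        # Block 5-4: Duty_Add (duty too short) or Duty_Sub (duty too long).
        dcr = dcr + 1 if error < 0 else dcr - 1
        if not 0 <= dcr <= 15:
            raise RuntimeError("register value left the 4-bit range")
        write_dcr(dcr)                           # 'Write Duty Cycle Register'
    raise RuntimeError("duty cycle did not converge")


class ToyRing:
    """Toy stand-in for the device ring: each register step adds 5% duty."""

    def __init__(self, dcr=7):
        self.dcr = dcr

    def write_dcr(self, value):
        self.dcr = value

    def measure_duty(self):
        return (self.dcr + 1) * 5


if __name__ == "__main__":
    ring = ToyRing()
    print(correct_duty_cycle(ring.measure_duty, ring.write_dcr, dcr=ring.dcr))
```

The callbacks keep the loop independent of how the duty ratio is measured and how the command packet is delivered, both of which are outside the scope of this sketch.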
Table 1 below is an example command packet definition for writing to the Duty Cycle Register (DCR). The first byte is the 'Device Address (= DA)' portion, the second byte is a Command code (= CMD = FAh), and the third byte contains Register Values (= DCR<0:3>). In some embodiments, a broadcast address is provided, for example FFh. If DA is set to the broadcast address, it means that the command is a broadcasting command, so that every memory device is expected to execute the command. Otherwise, only a specific memory device that is matching the DA will execute the command. In some embodiments, a 'Read Duty Cycle Register' command is also implemented in order to give more flexibility to the controller 10.
Table 1. Exemplary Command Packet Definition for Duty Cycle Register
Figure imgf000032_0001
*Notes
1 ) if DA (Device Address) is FFh(=255d) it is a broadcasting command so that every device will respond to the command
2) DA = Device Address
Table 2 is an example bit definition of Duty Cycle Register (= DCR). It is showing purely example definitions, therefore if the system configuration requires more detailed granularity for the unit delay adjustment, this table can be easily expanded in order to accommodate more manageability in terms of programmable delay lines. For example, if Bit<7:0> is entered as "0000 1000b = 08h" from the controller, DCR<3:0> will be accepting only Bit<3:0> (= "1000b") for valid register values and upper four bits Bit<7:4> will be ignored. In other design variations, however, a finer unit delay circuit can be implemented for higher frequency operation, and additional bit assignments may be used.
Figure imgf000032_0002
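The three-byte packet layout and the masking of the register byte described in the text can be sketched as below. Only the values stated in the description are used (command code FAh, broadcast address FFh, lower four bits valid); the function names write_dcr_packet and effective_dcr are hypothetical.

```python
CMD_WRITE_DCR = 0xFA       # command code given in the description
BROADCAST_ADDRESS = 0xFF   # example broadcast address given in the description


def write_dcr_packet(device_address, register_byte):
    """Build the byte sequence DA, CMD, register value."""
    for name, value in (("device_address", device_address),
                        ("register_byte", register_byte)):
        if not 0 <= value <= 0xFF:
            raise ValueError(name + " must fit in one byte")
    return bytes([device_address, CMD_WRITE_DCR, register_byte])


def effective_dcr(register_byte):
    """Keep Bit<3:0> as the register value; Bit<7:4> is ignored."""
    return register_byte & 0x0F


if __name__ == "__main__":
    packet = write_dcr_packet(0x00, 0x08)
    print(packet.hex(), effective_dcr(packet[2]))
    print(write_dcr_packet(BROADCAST_ADDRESS, 0x08).hex())
```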
Figure 6 is an example of a timing diagram of a 'Write Duty Cycle Register' command packet sequence based on SDR (Single Data Rate) operation. In this timing diagram, at time T1, the rising edge of CKI or the falling edge of CKI# latches the HIGH state of CSI and simultaneously latches DA (= Device Address = 00h) information on Dn port(s). If DA is set as FFh (= 255 in decimal), this means that the 'Write Duty Cycle Register' command is a broadcasting command, so that every memory device is expected to execute the command. In some embodiments, the broadcasting command is used for Duty Cycle Correction operation. However, the circuit disclosed also allows for the more flexible adjustment of duty cycle correction operations within individual devices. At the next rising edge time T2, the memory device latches CMD (= Command = FAh) information, and on the third rising edge T3, DCR (= Duty Cycle Register Value = 08h) information. The CSO output and Qn output ports echo the CSI input and Dn input signals, respectively, with a two clock cycle latency of tIOL (= Input-to-Output Latency). There is another latency specification which is tWDCR (= Write Duty Cycle Register Latency), and it is for the processing time of the Write Duty Cycle Register packet in the memory chip and for the processing time of Duty Cycle adjustment in the Controller Programmable Delay Line 121 within the duty cycle correction circuit 120. In some embodiments, the tWDCR value is set as 4 clock cycles as shown in Figure 6. After tWDCR (for example, at T8), the memory controller 10 can issue any other command packet to the memory device.
The embodiments described assume that all of the devices in the serial-connected architecture implement duty cycle correction. More generally, at least one of the devices implements duty cycle correction.
Output Delay Adjustment
Referring again to Figure 1, the described programmable delay lines 105A, 105B, 105C, 105D are provided to allow programmably delaying the output signals CSO, Qn, DSO and CKO/CKO# in order to allow phase correction. Figure 1 also shows output delay register signal buses ODR<0:1> connected to a 2-to-4 Decoder logic block 106. The 2-to-4 Decoder logic 106 outputs four select signal buses, SEL2<0:3>. Those SEL2<0:3> select signals are all connected to the four controller programmable delay lines 105A, 105B, 105C and 105D.
Figure 7 shows an exemplary circuit block implementation for the output delay adjustment. In the illustrated example, programmable delay lines 105A, 105B, 105C and 105D are composed of four unit delay elements that are the same as those used in Figure 3. This means that the range of delay adjustment for the output is only 4/16 that of the range of delay adjustment of the duty cycle. However, this is an implementation detail, and other numbers of delay elements may alternatively be employed. Each programmable delay line 105A, 105B, 105C, 105D receives a respective signal cso_i, q_i, dso_i and clk_dcc, as the input of the delay line and produces a respective delayed output cso_d, q_d, dso_d and clk_dcc_d. If the memory system has a multi-bit output configuration, for example an 8-bit wide I/O configuration, the q_i and q_d signals will be increased correspondingly, for example to be 8 in number, and the number of delay line blocks for q_i and q_d will be increased correspondingly, for example to be 8 in number.
In operation, the '2-to-4 Decoder' logic 106 produces the SEL2<0:3> output such that only one of the 4 select signals is in a HIGH logic state and all the other 3 select signals are in LOW logic states. Only the selected unit delay block transfers the respective input signal through the remaining unit delay blocks to the right of the selected unit delay block. The control input ODR<0:1> is used to select which of the unit delay blocks will process the respective inputs. The minimum delay is selected by selecting the right most unit delay block UNIT_0, in which case each output signal is the respective input signal delayed by one unit delay block, whereas the maximum delay is selected by selecting the left most unit delay block UNIT_3, in which case each output signal is the respective input signal delayed by four unit delay blocks.
The '2-to-4 decoder' logic 106 with four unit delay blocks is implemented in this example circuit design. More generally, however, any required number of delay units and the corresponding decoder logic may be used. A default delay setting may be used during the power-on initialization period. In this example, the default selection might be set to SEL2<0>, so that the memory device has the least amount of delay for each output path after power-on, or after hard reset in some other design variations. The use of 4 unit delay blocks is implementation specific. For example, more generally, an N-to-M decoder might be employed to decode signals received on N input lines into M control signals for M unit delay blocks, where N >= 1 and M >= 2.
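By way of illustration only, the selection behaviour described above can be modelled in software. The following Python sketch is not part of the described circuit: the function names `decode_one_hot` and `output_delay`, the parameter `t_ud` (one unit delay, in arbitrary time units) and the mapping of ODR value k to select line SEL2<k> are assumptions made for the example, chosen to agree with the default SEL2<0> giving the least delay.

```python
def decode_one_hot(odr, m=4):
    """Model of the decoder: exactly one of m select lines is HIGH.

    Assumption for this sketch: ODR value k drives SEL2<k> HIGH.
    """
    if not 0 <= odr < m:
        raise ValueError("ODR value out of range for this decoder")
    return [1 if i == odr else 0 for i in range(m)]


def output_delay(odr, t_ud, m=4):
    """Delay of one programmable delay line for a given ODR value.

    Selecting UNIT_k passes the input through UNIT_k down to UNIT_0,
    so the signal sees k + 1 unit delays.
    """
    sel = decode_one_hot(odr, m)
    selected_unit = sel.index(1)
    return (selected_unit + 1) * t_ud


if __name__ == "__main__":
    for odr in range(4):
        print(odr, decode_one_hot(odr), output_delay(odr, t_ud=1.0))
```

Under these assumptions, an ODR value of 0 yields one unit delay and an ODR value of 3 yields four unit delays, matching the minimum and maximum cases described for UNIT_0 and UNIT_3.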
Figure 8 is an example timing diagram for controller programmable output delay adjustment. Shown is a duty cycle corrected clock clk_dcc, and the delayed version of that clock, clk_dcc_d, before and after a change in the contents of the output delay register. It can be seen that after the change in the output delay register from the value "00b = 0d" to "01b = 1d", the delayed clock is delayed by an amount 2 x tUD, whereas before the adjustment it had been delayed by 1 x tUD. Also shown are the command strobe signal cso_i before the delay adjustment, and the output of the delay adjustment, which is cso_d. Once again, before the change to the output delay register, the delayed command strobe is later by 1 x tUD. After the change to the output delay register, the delayed command strobe is later by an amount 2 x tUD.
Control of Output Delay Adjustment
Recall that the contents of the ODR 134 are used to control the amount of delay introduced by the delay lines 105A, 105B, 105C, 105D, thereby controlling the amount of output delay adjustment. As described above, the contents of the ODR 134 can be written with a 'Write Output Delay Register' command.
When the phase detector 11 in the memory controller 10 detects an unacceptable phase difference between its CKI/CKI# and CKO/CKO# signals, the controller 10 will issue one 'Write Output Delay Register' command packet with one added unit delay amount to the very first memory device 100-1 of Figure 1. After enough clock cycles for the first memory device, for example for the tWODR (Write Output Delay Register latency) and total tIOL latencies described below with respect to Figure 10, if there is still an unacceptable phase difference, the controller 10 can issue another 'Write Output Delay Register' command packet to a second memory device, for example the second memory device 100-2 of Figure 1. This sequence of operations can be continued until the memory controller 10 obtains an acceptable phase difference. After the last memory device has been instructed to adjust its output delays, the memory controller 10 returns to the very first memory device with one more added unit delay value within the command packet, and continues through the rest of the memory devices until the phase difference reaches an acceptable range.
The above procedure is shown in the flowchart of Figure 9. The method begins at block 9-1 with power on. At this point, all the delay lines and device addresses are initialized. In block 9-2, the memory controller 10 monitors the phase error between CKI/CKI# and CKO/CKO# using the phase detector 11. If there is a phase error, yes path 9-3, then the phase detector 11 asserts the "PE" signal S11 in block 9-4. After that, the command generator 12 issues a 'Write Output Delay Register' command with an "ODR+1" value to each memory device from the first to the last, one at a time, while monitoring the phase error. In block 9-6, if there is still a phase error, yes path, then the method continues back at block 9-4. If there is no phase error, no path of block 9-6, then the phase correction is completed at block 9-7. Similarly, if no phase error was detected in block 9-3, then the method ends, phase correction having been completed at block 9-7.
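The procedure above can be sketched in software as follows. This Python example is illustrative only and does not reproduce the controller logic of Figure 9: the function name `calibrate`, the callback `phase_ok` (standing in for the phase detector 11) and the limit `max_odr` are hypothetical. Each issued command is recorded as a (device index, ODR value) pair rather than being sent over a real interface.

```python
def calibrate(num_devices, phase_ok, max_odr=3):
    """Round-robin output delay calibration.

    phase_ok(settings) is a caller-supplied stand-in for the phase
    detector: it returns True when the phase difference is acceptable
    for the given list of per-device ODR values.
    Returns (final ODR settings, list of commands issued).
    """
    settings = [0] * num_devices      # power-on default: least delay
    commands = []
    if phase_ok(settings):
        return settings, commands     # no phase error, nothing to do

    # Add one unit delay to one device at a time, first to last, then
    # start again at the first device with one more unit delay.
    for odr in range(1, max_odr + 1):
        for device in range(num_devices):
            settings[device] = odr
            commands.append((device, odr))
            if phase_ok(settings):
                return settings, commands
    raise RuntimeError("phase error persists at maximum delay setting")


if __name__ == "__main__":
    # Toy phase model (assumption): acceptable once 5 extra unit
    # delays have been added around the ring.
    final, issued = calibrate(3, lambda s: sum(s) >= 5)
    print(final, issued)
```

The phase check is passed in as a callback so that the loop itself stays independent of how the phase error is actually measured.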
Table 3 is an example command packet definition for the Write Output Delay Register command. The first byte is a 'Device Address (= DA)' portion, the second byte contains a Command code (= CMD = FBh), and the third byte contains Register Values (ODR<0:1>). In some embodiments, a broadcast address is provided, for example FFh. If DA is set to the broadcast address, the command is a broadcast command, so that every memory device is expected to execute the command. Otherwise, only the specific memory device whose address matches DA will execute the command. In some embodiments, a 'Read Output Delay Register' command is implemented in order to give more flexibility to the controller 10. For example, this can be used by the controller to read the values from all of the memory devices and then rearrange the settings among the devices appropriately, if necessary.
Table 3. Exemplary Command Packet Definition for Controller Programmable Delay Line Registers
Figure imgf000037_0001
*Notes
1) If DA (Device Address) is FFh (= 255d), the command is a broadcast command, so every device will respond to it.
2) DA = Device Address
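As a non-authoritative illustration of the three-byte packet described for Table 3, the following Python sketch builds a Write Output Delay Register packet and checks whether a given device would execute it. The function names are hypothetical, and the placement of ODR<0:1> in the two low-order bits of the third byte is an assumption of this sketch, not something stated in the table.

```python
WRITE_ODR_CMD = 0xFB   # command code given in the description (FBh)
BROADCAST_DA = 0xFF    # example broadcast device address (FFh)


def build_write_odr_packet(device_address, odr_value):
    """Return the 3-byte packet: DA, CMD, register value."""
    if not 0 <= device_address <= 0xFF:
        raise ValueError("device address must fit in one byte")
    if not 0 <= odr_value <= 0b11:
        raise ValueError("ODR<0:1> holds a 2-bit value")
    # Assumption: the 2-bit ODR value occupies the low bits of byte 3.
    return bytes([device_address, WRITE_ODR_CMD, odr_value])


def device_executes(packet, own_address):
    """True if a device with own_address should execute the packet."""
    da, cmd, _value = packet
    if cmd != WRITE_ODR_CMD:
        return False
    return da == own_address or da == BROADCAST_DA


if __name__ == "__main__":
    pkt = build_write_odr_packet(0x00, 0x01)
    print(pkt.hex(), device_executes(pkt, 0x00), device_executes(pkt, 0x05))
```

With these assumptions, a packet addressed to 00h is executed only by the device holding address 00h, whereas a packet addressed to FFh is executed by every device.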
Table 4 is an example bit definition of the Output Delay Register (= ODR). The definitions shown are purely examples; if the system configuration requires finer granularity for the unit delay adjustment, this table can easily be expanded to provide more control over the programmable delay lines.
Table 4. Exemplary Bit Definition of Duty Cycle Register & Output Delay Register
Figure imgf000037_0002
Figure 10 is an example of a timing diagram of a 'Write Output Delay Register' command packet sequence based on SDR (Single Data Rate) operation. In this timing diagram, at time T1, the rising edge of CKI or the falling edge of CKI# latches the HIGH state of CSI and simultaneously latches DA (= Device Address = 00h) information on the Dn port(s). At the next rising edge, time T2, the memory device latches CMD (= Command = FBh) information, and on the third rising edge, ODR (= Output Delay Register Value = 01h) information. The CSO and Qn output ports echo the CSI and Dn input signals, respectively, with a two-clock latency of tIOL (= Input-to-Output Latency). There is another latency specification, tWODR (= Write Output Delay Register Latency), which covers the processing time of the Write Output Delay Register packet in the memory chip and the processing time of the output delay adjustment in the Controller Programmable Delay Line 2 (= PDL2 105 A-D). In some embodiments, the tWODR value is set as 4 clock cycles as shown in Figure 10. After tWODR (for example, at T8), the memory controller 10 can issue any other command packets to the memory device.
More generally, an embodiment of the application provides methods and circuits performing output delay adjustment in which a delayed version of at least one input signal is produced, the at least one input signal including at least the clock signal. There may be additional input signals conveyed between devices that are not subject to output delay adjustment. For some signals, generating a delayed version of an input signal for output involves conditionally generating a delayed version of the input signal for output. That is to say, some of the signals may be conditionally conveyed between adjacent devices. A specific example is detailed below in which the input data signal of a memory device is conveyed to the next memory device some of the time.
The embodiments described above have assumed the use of programmable delay lines that are composed of identical unit delay blocks. In some embodiments, the programmable delay lines are divided into two or more sections, such as "Coarse" and "Fine" delay lines to allow further programmability of the delay adjustment for duty cycle correction and/or output delay adjustment.
In the detailed examples described, there is a first Flip-Flop near the input, and a second Flip-Flop near the output for each signal. This is what produces the two clock cycle latency. Of course, it is to be understood that other clock latencies may result by including different functionality between the input and the output.
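The two clock cycle latency produced by the two Flip-Flops can be illustrated with a simple cycle-based model. The Python sketch below is an assumption-laden illustration rather than a description of the actual circuit: the function name `echo_through_two_flops` is hypothetical, both flip-flops are assumed to reset to 0, and the returned list holds the value that a downstream device would sample at each rising clock edge.

```python
def echo_through_two_flops(inputs):
    """Model an input flip-flop followed by an output flip-flop.

    inputs[n] is the value present at the input at rising edge n.
    The result[n] is the output value visible at rising edge n,
    i.e. what the next device in the chain would latch at that edge.
    """
    ff_in = 0    # flip-flop near the input (assumed reset to 0)
    ff_out = 0   # flip-flop near the output (assumed reset to 0)
    outputs = []
    for sample in inputs:
        outputs.append(ff_out)   # value held just before this edge
        ff_out = ff_in           # output stage captures input stage
        ff_in = sample           # input stage captures the new sample
    return outputs


if __name__ == "__main__":
    # DA, CMD and ODR bytes of a packet, followed by idle cycles.
    print(echo_through_two_flops([0x00, 0xFB, 0x01, 0, 0]))
```

In this model a value latched at one rising edge is seen at the output two rising edges later; adding further stages between input and output would lengthen the latency accordingly.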
In the embodiments described, the output delay lines are located after the last Flip-Flop that is located near an output for each signal. In some embodiments, the output delay line is located before the last flip-flop.
In some embodiments, the devices that are connected in the serial-connected manner are assumed to be substantially identical. In some embodiments, these are substantially identical memory devices. In other embodiments, different types of memory devices can be utilized as long as they have compatible serial interfaces.
The detailed embodiments have assumed that differential clock signals are employed. More generally, single ended or differential clock signals may be used. Similarly, any other input/output signals can be single ended or differential.
In some embodiments, a single MCP (multi-chip package) is provided that includes the plurality of memory devices and a controller, operable as described.
The methods and apparatus described herein have assumed a serial-connected architecture featuring a controller and a set of memory devices connected in a ring. In such embodiments, the memory devices are slave devices, and the memory controller is a master device. More generally, the methods and apparatus described herein can be applied to any kind of semiconductor integrated circuit system having any kind of semiconductor integrated circuit devices that are configured as slave devices in the serial-connected configuration with a common interface between adjacent devices, with a device that is configured to act as a master device that controls the duty cycle correction and/or phase correction performed by the slave devices. Examples of integrated circuit types include central processing units, graphics processing units, display controller ICs, disk drive ICs, and memory devices such as NAND Flash EEPROM, NOR Flash EEPROM, AND Flash EEPROM, DiNOR Flash EEPROM, Serial Flash EEPROM, DRAM, SRAM, ROM, EPROM, FRAM, MRAM, PCRAM, etc.
Some of the embodiments described herein have assumed single data rate operation. More generally, the embodiments can be applied to systems with other data rates, for example double data rate operation, with appropriate modifications that would be understood by a person skilled in the art upon reading this disclosure.
Numerous modifications and variations of the present invention are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.

Claims

1. A method in a slave device of a plurality of serial-connected slave devices, the method comprising:
receiving a command from a master device specifying an adjustment to a clock duty cycle;
receiving an input clock signal;
generating a duty cycle corrected clock signal from the input clock signal in accordance with the command;
outputting the duty cycle corrected clock signal.
2. The method of claim 1 wherein the slave device is a memory device and the master device is a memory controller.
3. The method of claim 1 further comprising:
receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device;
receiving at least one input signal, the at least one input signal comprising at least the input clock signal;
for each of the at least one input signal:
generating a delayed version of the input signal in accordance with the command;
outputting the delayed version of the input signal, the delayed version of the input clock signal comprising a delayed version of the duty cycle corrected clock signal.
4. The method of claim 1 wherein receiving a command from a master device specifying an adjustment to a clock duty cycle comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, the command further containing data indicating how to adjust the duty cycle.
5. The method of claim 4 wherein receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
6. The method of claim 5 further comprising:
performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that matches a device address of the slave device;
performing the step of generating the duty cycle corrected clock signal in accordance with the command if the command has a device address that is a broadcast device address.
7. The method of claim 4 wherein:
generating a duty cycle corrected clock signal comprises:
a) generating a half rate clock signal from the input clock signal;
b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal;
c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
8. The method of claim 7 wherein the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
9. A method in a memory system comprising a master device and a plurality of serial-connected slave devices comprising at least a first slave device and a last slave device, the method comprising:
in the master device:
a) outputting a first clock signal that functions as an input clock signal of the first slave device;
b) receiving a second clock signal that is an output clock signal of the last slave device;
c) generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command;
in the first slave device of the plurality of serial-connected slave devices:
a) receiving the first clock signal from the master device as the input clock signal of the first slave device;
b) generating an output clock signal from the input signal;
in each other slave device of the plurality of serial-connected slave devices:
a) receiving the output clock signal of a preceding slave device as an input clock signal of the slave device;
b) generating an output clock signal from the input clock signal;
in each of at least one of the plurality of serial-connected devices acting as a slave devices:
a) receiving the duty cycle correction command;
b) generating a duty cycle corrected clock signal from the input clock signal in accordance with the duty cycle correction command;
c) outputting the duty cycle corrected clock signal as the output clock signal of the slave device.
10. The method of claim 9 wherein each slave device is a memory device and the master device is a memory controller.
11. The method of claim 9 or 10 further comprising:
in the master device:
a) outputting at least one output signal, the at least one output signal comprising the first clock signal to function as an input clock signal of the first slave device;
b) receiving a second clock signal that is an output clock signal of the last slave device;
c) determining an amount of phase offset between the first clock signal and the second clock signal;
d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
12. The method of claim 9 or 10 wherein generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by any specified one of the plurality of serial-connected slave devices.
13. The method of claim 12 wherein generating a duty cycle correction command as a function of a duty cycle of the second clock signal and outputting the duty cycle correction command comprises generating a duty cycle correction command for execution by all of the plurality of serial-connected slave devices.
14. The method of claim 9 wherein receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, and containing data indicating how to adjust the duty cycle.
15. The method of claim 14 wherein:
generating a duty cycle corrected clock signal comprises:
a) generating a half rate clock signal from the input clock signal;
b) delaying the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal;
c) combining the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
16. The method of claim 15 wherein the data indicating how to adjust the duty cycle correction comprises an indication of the selected one of the plurality of delays.
17. A slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising:
a command input for receiving a command from a master device specifying an adjustment to a duty cycle;
a clock input for receiving an input clock signal;
a duty cycle correction circuit for generating a duty cycle corrected clock signal from the clock input in accordance with the control command;
a clock output for outputting the duty cycle corrected clock signal.
18. The slave device of claim 17 wherein the slave device is a memory device.
19. The slave device of claim 17 or 18 wherein:
the command input is also for receiving a command from the master device specifying an adjustment to output delay;
an output delay adjustment circuit for generating a delayed clock signal from the duty cycle corrected clock signal in accordance with the command;
wherein the clock output for outputting the duty cycle corrected clock signal outputs the delayed clock signal.
20. The slave device of claim 17 or 18 further comprising:
a command processing circuit that processes the command,
wherein the command comprises:
a command identifier that identifies the command as a duty cycle correction command; and
data indicating how to adjust the duty cycle.
21. The slave device of claim 20 further comprising:
a device address register;
wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device address matches contents of the device address register.
22. The slave device of any one of claims 17 to 21 wherein the duty cycle correction circuit comprises:
a) a clock divider circuit that generates a half rate clock signal from the input clock signal;
b) a delay circuit that delays the half rate clock signal by a selected one of a plurality of delays to produce a delayed half rate clock signal;
c) a combiner that combines the half rate clock signal with the delayed half rate clock signal to produce the duty cycle corrected clock signal.
23. The slave device of claim 22 wherein the delay circuit comprises M unit delay elements, M>=2, the duty cycle correction circuit further comprising:
an N-to-M decoder that decodes signals received on N input lines, N>=1, into a selection of how many of the unit delay elements are to be active in delaying the half rate clock signal to produce the delayed half rate clock signal.
24. A system comprising:
a plurality of serial-connected device acting as slave devices according to claim 17 comprising at least a first slave device and a last slave device;
a master device connected to the first slave device and to the last slave device;
the master device configured to output a first clock signal that functions as an input clock signal of the first slave device;
a clock input for receiving a second clock signal that is an output clock signal of the last slave device;
a duty detector that determines a duty cycle of the second clock signal;
a command generator that generates a duty cycle correction command specifying an adjustment to a clock duty cycle as a function of the duty cycle of the second clock signal;
wherein, the first slave device of the plurality of serial-connected device acting as slave devices:
a) receives the first clock signal from the master device as the input clock signal of the first slave device;
b) generates an output clock signal from the input clock signal;
wherein each other slave device of the plurality of serial-connected device acting as slave devices:
a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device;
b) generates an output clock signal from the input clock signal;
wherein at least one of the plurality of serial-connected slave devices:
a) receives the duty cycle correction command;
b) generates a duty cycle corrected clock signal in accordance with the control command;
c) outputs the duty cycle corrected clock signal as the output clock signal of the slave device.
25. The system of claim 24 wherein the system is a memory system, each slave device is a memory device and the master device is a memory controller.
26. The memory system of claim 24 or 25 further comprising:
a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal;
wherein the command generator also generates an output delay adjustment command as a function of the amount of phase offset;
wherein, the first slave device of the plurality of serial-connected slave devices:
a) receives the first clock signal from the master device as the input clock signal of the first slave device;
b) generates an output clock signal from the input clock signal;
wherein each other slave device of the plurality of serial-connected slave devices:
a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device;
b) generates an output clock signal from the input clock signal;
wherein at least one of the plurality of serial-connected slave devices:
a) receives the output delay adjustment command;
b) generates the output clock signal of the device by delaying the input clock signal of the device in accordance with the control command.
27. The memory system of claim 24 or 25 wherein the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by a specified one of the plurality of serial-connected device acting as slave devices.
28. The memory system of claim 24 or 25 wherein the command generator is configured to generate a duty cycle correction command as a function of a duty cycle of the second clock signal and output the duty cycle correction command by generating a duty cycle correction command for execution by all of the plurality of serial-connected device acting as slave devices.
29. The memory system of any one of claims 24 to 28 wherein receiving the duty cycle correction command comprises receiving a command containing a command identifier that identifies the command as a duty cycle correction command, and containing data indicating how to adjust the duty cycle.
30. A method in a slave device of a plurality of serial-connected slave devices, the method comprising:
receiving a command from a master device specifying how the slave device is to adjust a delay to be applied to at least one signal output by the slave device;
receiving at least one input signal, the at least one input signal comprising at least an input clock signal;
for each of the at least one input signal:
generating a delayed version of the input signal in accordance with the command;
outputting the delayed version of the input signal.
31. The method of claim 30 wherein the slave device is a memory device and the master device is a memory controller.
32. The method of claim 30 or 31 comprising:
outputting a data output signal;
wherein at least one of the input signals comprises a data input signal and wherein outputting the delayed version of the data input signal is performed as part of outputting the data output signal such that:
a) some of the time the data output signal is said delayed version of the data input signal;
b) some of the time the data output signal is a delayed version of a signal produced locally to the slave device, after applying the delay to the signal produced locally to the slave device in accordance with the command.
33. The method of claim 30 or 31 wherein receiving a command from a master device specifying an adjustment to a delay to be applied to at least one signal output by the slave device comprises receiving a command containing a command identifier that identifies the command as an output delay adjustment command, the command further containing data indicating how to adjust the delay.
34. The method of claim 33 wherein receiving a command further comprises receiving a device address indicating which device(s) acting as slave devices is to execute the command.
35. The method of claim 34 further comprising:
performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that matches a device address of the slave device;
performing the step of, for each of the at least one input signal, generating a delayed version of the input signal in accordance with the command if the command has a device address that is a broadcast device address.
36. The method of claim 33 wherein:
for each input signal, generating a delayed version of the input signal comprises:
a) delaying the input signal by a selected one of a plurality of delays to produce the delayed version of the input signal.
37. The method of claim 36 wherein the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
38. The method of claim 30 wherein the plurality of input signals comprise:
a clock signal;
a command strobe signal;
a data strobe signal;
a data signal containing commands and data.
39. A method in a memory system comprising a master device and a plurality of serial-connected device acting as slave devices comprising at least a first slave device and a last slave device, the method comprising:
in the master device:
a) outputting at least one output signal, the at least one output signal comprising a first clock signal to function as an input clock signal of the first slave device;
b) receiving a second clock signal that is an output clock signal of the last slave device;
c) determining an amount of phase offset between the first clock signal and the second clock signal;
d) generating an output delay adjustment command as a function of the phase offset between the first clock signal and the second clock signal and outputting the output delay adjustment command.
40. The method of claim 39 wherein each slave device is a memory device and the master device is a memory controller.
41. The method of claim 39 or 40 further comprising:
in the first slave device of the plurality of serial-connected device acting as slave devices:
a) receiving the at least one output signal from the master device as corresponding at least one input signal of the first slave device;
b) for each input signal, generating an output signal based on the input signal;
in each other slave device of the plurality of serial-connected device acting as slave devices:
a) receiving output signal(s) of a preceding slave device corresponding to at least one input signal of the slave device;
b) for each input signal, generating an output signal based on the input signal;
in at least one of the slave devices,
a) receiving the output delay adjustment command; and
b) generating the output signal(s) by generating a delayed version of the input signal(s) in accordance with the output delay adjustment command.
42. The method of claim 41 further comprising:
wherein the at least one output signal of the master device comprises a plurality of output signal(s).
43. The method of claim 39 or 40 wherein generating a delay adjustment command comprises generating a delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
44. The method of claim 39 or 40 wherein generating a delay adjustment command comprises generating a delay adjustment command for execution by all of the plurality of serial-connected slave devices.
45. The method of claim 41 wherein generating a delayed version of the input signal(s) in accordance with the output delay adjustment command comprises generating a delayed version of the input signals(s) delayed by a selected one of a plurality of delays.
46. The method of claim 45 wherein generating a delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the delay.
47. The method of claim 46 wherein the data indicating how to adjust the delay comprises an indication of the selected one of the plurality of delays.
48. The method of claim 39 or 40 further comprising:
the master device outputting output delay adjustment commands that adjust delay by adding a delay one unit delay element in one slave device at a time until the phase offset is acceptable.
49. The method of claim 39 or 40 wherein the plurality of input signals comprise:
a clock signal;
a command strobe signal;
a data strobe signal;
a data signal containing commands and data.
50. A slave device for use in an arrangement comprising a plurality of serial-connected slave devices, the slave device comprising:
a command input for receiving a command from a master device specifying how to perform output delay adjustment;
a clock input for receiving an input clock signal;
an output delay adjustment circuit for generating a delayed clock signal from the clock input in accordance with the command;
a clock output for outputting the delayed clock signal.
51. The slave device of claim 50 wherein the slave device is a memory device.
52. The slave device of claim 50 or 51 comprising:
a command processing circuit that processes the command, wherein the command contains a command identifier that identifies the command as an output delay adjustment command, and contains data indicating how to adjust the output delay.
53. The slave device of claim 52 further comprising:
a device address register;
wherein the command further comprises a device address indicating which slave device is to execute the command, the slave device configured to execute the command if the device identifier matches contents of the device address register.
54. The slave device of claim 50 or 51 wherein the output delay adjustment circuit comprises:
for each of a plurality of input signals, inclusive of the input clock signal, a delay circuit that delays the input signal by a selected one of a plurality of delays to produce a delayed version of the input signal.
55. The slave device of claim 54 wherein each output delay circuit comprises M unit delay elements, M >=2, the duty cycle correction circuit further comprising:
an N-to-M decoder that decodes signals received on N input lines, N>=1, into a selection of how many of the unit delay elements are to be active in producing the delayed version of the input signal.
56. A memory system comprising:
a plurality of serial-connected slave devices according to claim 47 comprising at least a first slave device and a last slave device;
a master device connected to the first slave device and to the last slave device;
the master device configured to output a first clock signal that functions as an input clock signal of the first slave device;
a clock input for receiving a second clock signal that is an output clock signal of the last slave device;
a phase detector that determines an amount of phase offset between the first clock signal and the second clock signal;
a command generator that generates an output delay adjustment command as a function of the amount of phase offset;
wherein the first slave device of the plurality of serial-connected slave devices:
a) receives the first clock signal from the master device as the input clock signal of the first slave device;
b) generates an output clock signal from the input clock signal;
wherein each other slave device of the plurality of serial-connected slave devices:
a) receives the output clock signal of a preceding slave device as an input clock signal of the slave device;
b) generates an output clock signal from the input clock signal;
wherein at least one of the plurality of serial-connected slave devices:
a) receives the output delay adjustment command;
b) generates the output clock signal of the device by delaying the input clock signal of the device in accordance with the output delay adjustment command.
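The feedback loop of claim 56 can be sketched as a small timing model: the clock passes through each slave device in turn, the master measures the phase offset of the returned clock, and a command is generated from that offset. The sketch assumes the goal is zero phase offset between the first and second clock signals, integer picosecond timing, a uniform unit delay step, and a dictionary-based command; none of these details come from the claims.

```python
def ring_delay_ps(fixed_delays_ps, steps_per_device, unit_delay_ps):
    """Total clock delay through the chain: fixed delay plus adjustable delay per device."""
    return sum(d + s * unit_delay_ps
               for d, s in zip(fixed_delays_ps, steps_per_device))


def phase_offset_ps(total_delay_ps, period_ps):
    """Phase detector model: offset of the returned clock within one clock period."""
    return total_delay_ps % period_ps


def generate_command(offset_ps, period_ps, unit_delay_ps, target_address):
    """Command generator model: choose the delay steps that cancel the offset."""
    extra_ps = (period_ps - offset_ps) % period_ps
    steps = (extra_ps + unit_delay_ps // 2) // unit_delay_ps  # round to nearest step
    return {
        "device_address": target_address,
        "command_id": "OUTPUT_DELAY_ADJUST",  # hypothetical identifier
        "data": steps,
    }
```

For example, with a 5000 ps clock period and four devices whose assumed fixed delays are 1200, 1300, 1100 and 1250 ps, the returned clock is offset by 4850 ps. The generated command asks one device for three 50 ps steps, which brings the total delay to exactly one period.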
57. The system of claim 56 wherein the system is a memory system, each slave device is a memory device and the master device is a memory controller.
58. The memory system of claim 56 or 57 wherein the command generator is configured to generate the output delay adjustment command for execution by a specified one of the plurality of serial-connected slave devices.
59. The memory system of claim 56 or 57 wherein the command generator is configured to generate the output delay adjustment command for execution by all of the plurality of serial-connected slave devices.
60. The memory system of claim 56 or 57 wherein generating an output delay adjustment command comprises generating a command containing a command identifier that identifies the command as an output delay adjustment command, and containing data indicating how to adjust the output delay.
PCT/CA2009/001271 2008-09-30 2009-09-17 Serial-connected memory system with output delay adjustment WO2010037205A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP09817125A EP2329496A4 (en) 2008-09-30 2009-09-17 Serial-connected memory system with output delay adjustment
JP2011528145A JP2012504263A (en) 2008-09-30 2009-09-17 Serially connected memory system with output delay adjustment
CN200980138194.9A CN102165529B (en) 2008-09-30 2009-09-17 Serial-connected memory system with output delay adjustment

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US12/241,960 2008-09-30
US12/241,832 2008-09-30
US12/241,960 US8161313B2 (en) 2008-09-30 2008-09-30 Serial-connected memory system with duty cycle correction
US12/241,832 US8181056B2 (en) 2008-09-30 2008-09-30 Serial-connected memory system with output delay adjustment

Publications (1)

Publication Number Publication Date
WO2010037205A1 (en) 2010-04-08

Family

ID=42072981

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2009/001271 WO2010037205A1 (en) 2008-09-30 2009-09-17 Serial-connected memory system with output delay adjustment

Country Status (6)

Country Link
EP (1) EP2329496A4 (en)
JP (2) JP2012504263A (en)
KR (1) KR20110081958A (en)
CN (1) CN102165529B (en)
TW (1) TW201027556A (en)
WO (1) WO2010037205A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8665665B2 (en) * 2011-03-30 2014-03-04 Mediatek Inc. Apparatus and method to adjust clock duty cycle of memory
US9257164B2 (en) * 2013-03-14 2016-02-09 Altera Corporation Circuits and methods for DQS autogating
JP6232313B2 (en) * 2014-02-25 2017-11-15 新日本無線株式会社 Synchronous serial communication method and slave device
KR20180033368A (en) * 2016-09-23 2018-04-03 삼성전자주식회사 Electronic device comprising storage devices transmitting reference clock via cascade coupling structure
KR20190009534A (en) * 2017-07-19 2019-01-29 에스케이하이닉스 주식회사 Semiconductor device
KR101999125B1 (en) * 2017-11-24 2019-07-11 파밀넷 주식회사 Output signal automatic controller for RS-422 and RS-485 serial communication
KR20200048607A (en) 2018-10-30 2020-05-08 삼성전자주식회사 System on chip performing training of duty cycle of write clock using mode register write command, operating method of system on chip, electronic device including system on chip
JP2020155841A (en) * 2019-03-18 2020-09-24 キオクシア株式会社 Semiconductor integrated circuit and transmitting device
US10937468B2 (en) * 2019-07-03 2021-03-02 Micron Technology, Inc. Memory with configurable die powerup delay
CN112332881B (en) * 2020-10-19 2022-04-26 深圳市信锐网科技术有限公司 Enabling circuit and communication device
CN112698683A (en) * 2020-12-28 2021-04-23 深圳市合信自动化技术有限公司 Method and device for solving error of transmission delay data by configurable bus and PLC

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040148482A1 (en) 2003-01-13 2004-07-29 Grundy Kevin P. Memory chain
US20050058233A1 (en) * 2003-09-12 2005-03-17 Huy Nguyen System and method for adaptive duty cycle optimization
US20060083043A1 (en) * 2003-11-17 2006-04-20 Sun Microsystems, Inc. Memory system topology
US20070046351A1 (en) * 2005-08-30 2007-03-01 Alessandro Minzoni Duty cycle corrector
US20070076479A1 (en) 2005-09-30 2007-04-05 Mosaid Technologies Incorporated Multiple independent serial link memory
US20070109833A1 (en) 2005-09-30 2007-05-17 Pyeon Hong B Daisy chain cascading devices
US20070143677A1 (en) 2005-09-30 2007-06-21 Mosaid Technologies Independent link and bank selection
US20070153576A1 (en) 2005-09-30 2007-07-05 Hakjune Oh Memory with output control
US20070233903A1 (en) 2006-03-28 2007-10-04 Hong Beom Pyeon Daisy chain cascade configuration recognition technique
US20080013662A1 (en) * 1999-07-14 2008-01-17 Stefanos Sidiropoulos Master Device with Time Domains for Slave Devices in Synchronous Memory System
US20080209095A1 (en) * 2007-01-09 2008-08-28 Allen James J Structure for reducing latency associated with read operations in a memory system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000148674A (en) * 1998-11-09 2000-05-30 Sharp Corp Method for transmitting serial data
US6643787B1 (en) * 1999-10-19 2003-11-04 Rambus Inc. Bus system optimization
JP2003140962A (en) * 2001-10-30 2003-05-16 Mitsubishi Electric Corp Signal transmit/receive system
JP3843002B2 (en) * 2001-11-26 2006-11-08 株式会社ルネサステクノロジ Variable delay circuit and system LSI using the variable delay circuit
US6980042B2 (en) * 2004-04-05 2005-12-27 Micron Technology, Inc. Delay line synchronizer apparatus and method
US7389375B2 (en) * 2004-07-30 2008-06-17 International Business Machines Corporation System, method and storage medium for a multi-mode memory buffer device
US8121237B2 (en) * 2006-03-16 2012-02-21 Rambus Inc. Signaling system with adaptive timing calibration
US7673093B2 (en) * 2006-07-26 2010-03-02 International Business Machines Corporation Computer system having daisy chained memory chips
JP5575474B2 (en) * 2006-08-22 2014-08-20 コンバーサント・インテレクチュアル・プロパティ・マネジメント・インコーポレイテッド Scalable memory system
JP4952177B2 (en) * 2006-10-02 2012-06-13 富士通株式会社 Storage device
CN101617371B (en) * 2007-02-16 2014-03-26 莫塞德技术公司 Non-volatile semiconductor memory having multiple external power supplies

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2329496A4 *

Also Published As

Publication number Publication date
JP2012504263A (en) 2012-02-16
JP5599852B2 (en) 2014-10-01
TW201027556A (en) 2010-07-16
CN102165529B (en) 2014-12-31
CN102165529A (en) 2011-08-24
KR20110081958A (en) 2011-07-15
JP2013008386A (en) 2013-01-10
EP2329496A1 (en) 2011-06-08
EP2329496A4 (en) 2012-06-13

Similar Documents

Publication Publication Date Title
US8161313B2 (en) Serial-connected memory system with duty cycle correction
US8181056B2 (en) Serial-connected memory system with output delay adjustment
WO2010037205A1 (en) Serial-connected memory system with output delay adjustment
US9971518B2 (en) Clock mode determination in a memory system
US8504789B2 (en) Bridging device having a frequency configurable clock domain
US8713344B2 (en) Methods and apparatus for clock signal synchronization in a configuration of series connected semiconductor devices
US8837655B2 (en) Memory controller with flexible data alignment to clock
US8139390B2 (en) Mixed data rates in memory devices and systems

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase Ref document number: 200980138194.9; Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 09817125; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase Ref document number: 2009817125; Country of ref document: EP
WWE WIPO information: entry into national phase Ref document number: 2011528145; Country of ref document: JP
ENP Entry into the national phase Ref document number: 20117006956; Country of ref document: KR; Kind code of ref document: A
NENP Non-entry into the national phase Ref country code: DE