WO1999038075A1 - Defect-tolerant memory system - Google Patents

Defect-tolerant memory system

Info

Publication number
WO1999038075A1
WO1999038075A1 (PCT/GB1999/000234)
Authority
WO
WIPO (PCT)
Prior art keywords
memory
module
defective
address
defect
Prior art date
Application number
PCT/GB1999/000234
Other languages
French (fr)
Inventor
Richard Michael Taylor
Original Assignee
Memory Corporation Plc
Priority date
Filing date
Publication date
Application filed by Memory Corporation Plc filed Critical Memory Corporation Plc
Priority to AU21787/99A priority Critical patent/AU2178799A/en
Publication of WO1999038075A1 publication Critical patent/WO1999038075A1/en

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C29/00Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/70Masking faults in memories by using spares or by reconfiguring
    • G11C29/88Masking faults in memories by using spares or by reconfiguring with partially good memories

Definitions

  • the present invention relates to a memory system.
  • the invention relates to a defect-tolerant memory system for use with a memory accessing device such as a processor to provide high speed data storage and retrieval.
  • a defect-tolerant memory system is one which has the advantage that partially working memory devices may be used: partially working memory devices are less expensive and more readily available than perfectly working memory devices.
  • Each memory device is, of course, formed by a multitude of memory locations capable of storing a single bit.
  • a typical prior art defect-tolerant memory system 1 is shown schematically in Fig 1 having a custom circuit (ASIC) 2 mounted on a memory module 4 containing partially working memory devices 6.
  • a memory controller 8 conveys control, address and data information on buses 9 to the module 4.
  • the ASIC 2 intercepts the control, data and address information from the controller 8 and is pre-programmed to detect addresses corresponding to faulty memory locations, and on detection of such an address either maps the address to the address of a new (non-faulty) memory location on a different memory device or accesses the same address on a different memory device to transfer valid data.
  • This technique works well for memory modules having buses 9 conveying information at comparatively slow rates (for example 100MHz) and having low bandwidths (less than 0.5 Gigabytes per second), such as DRAM (dynamic random access memory) memory modules.
  • RAMBUS™ uses a single high speed (800MHz) channel which connects a memory controller to an array of memory modules so that each bit in the channel is connected to every memory module.
  • the single channel uses a small number (typically two bytes) of very high speed signals to convey read/write data.
  • the channel is connected to the memory controller at one end, to a terminator at the opposite end; and memory modules are connected to the single channel between these two ends.
  • Previous fault tolerant techniques using an ASIC cannot be used with such memory systems because the ASIC cannot provide the correct transmission line characteristics for each of the plurality of memory devices being accessed; in addition, because there is only a short predetermined time period for providing the correct data, the ASIC would introduce too much time delay if used in such memory systems.
  • a defect-tolerant high bandwidth memory system comprising: a memory controller; at least one memory module having at least one memory device; a high bandwidth channel for connecting the memory controller to the at least one memory module and for conveying data therebetween; and a non-volatile memory for storing the locations of defective areas of the at least one memory module; where the memory controller is arranged and configured to access the non-volatile memory and to remap physical non-defective areas of memory as a set of continuous logical areas of memory so that for every logical area of memory that may be accessed there is a corresponding non-defective physical area of memory.
  • defect management and address remapping is performed by the memory controller so that the memory controller does not generate defective physical addresses, thus allowing defect-tolerance to be implemented on high bandwidth memory systems which require transmission line matching of the memory devices.
  • a module may be a physically detachable entity for releasable connection to a module socket; however, a module may also be incorporated within and be an integral part of a motherboard so that the module is not detachable: thus, a module refers to one or more memory devices arranged on a circuit board (which may be populated with other components) so that the device or devices can store information.
  • the memory controller may be incorporated within another device which can access memory, such as a CPU or a DMA device.
  • the present invention may be used with memory systems having bandwidths greater than approximately 1 GByte per second; advantageously, the present invention may be used with memory systems having bandwidths greater than approximately 1.5 GBytes per second.
  • the non-volatile memory may be included on each of the at least one memory module.
  • the non-volatile memory may be implemented as a register or as a set of registers in each memory device in the module.
  • the non-volatile memory may be included as an EEPROM on each of the at least one memory module.
  • the non-volatile memory may be stored in any convenient part of the memory system, for example, in the CMOS setup RAM.
  • the non-volatile memory may not be a solid state memory, but may be, for example, a magnetic disk, CD-ROM, a diskette, or such like.
  • the memory controller is arranged and configured to remap physical non-defective areas of memory as a set of continuous logical areas of memory by using a memory in the form of a look-up table (LUT).
  • the LUT is an SRAM, each entry having the address of a logical area of memory (hereinafter "a logical address") and a corresponding address of a non-defective physical area of memory (hereinafter "a physical address").
  • the memory controller may be arranged and configured to remap physical non-defective areas of memory by using a memory in the form of a content addressable memory (CAM).
  • nominally full memory capacity may be provided by having an additional memory device on a memory module. In another embodiment, only a reduced memory capacity is provided.
  • the areas of memory may be banks of rows and columns; alternatively, the areas may be a band of rows or a band of columns.
  • the memory controller may include a look-up table arranged to have logical address entries of either bands of rows or bands of columns, so that for each band of rows and for each band of columns there is a corresponding physical band of rows and band of columns, and any band of rows having a defect is replaced by a similar band of rows from the additional memory device, and any band of columns having a defect is replaced by a similar band of columns from the additional memory device, so that full memory capacity is provided by the memory system.
  • a memory module for connecting to a memory controller to provide a defect-tolerant high bandwidth memory system, the module including non-volatile memory for indicating the areas of defective memory in the module, whereby the memory controller may access the non-volatile memory to retrieve information relating to the areas of defective memory in the module.
  • the non-volatile memory is in the form of one or more registers on each memory device populating a module.
  • the non-volatile memory is in the form of a programmable read only memory (such as an EEPROM, a FLASH EPROM, a ROM, or such like) disposed on the module.
  • a method of accessing non-defective memory in a high bandwidth memory system containing defective memory locations comprising the steps of: identifying defective locations within the memory system; constructing a set of continuous sequential logical addresses of areas of memory corresponding to physical addresses of defect-free areas of memory; storing the set of sequential logical addresses and the corresponding physical addresses as a table; whereby, when a logical address is received, the table is accessed using the received logical address, and the corresponding physical address is thereby determined so that a defect-free area of memory is accessed.
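The claimed method can be sketched in a few lines of Python. This is a hypothetical illustration only: the function names and the list-based table are not from the patent, whose table is an SRAM or CAM inside the memory controller.

```python
# Hypothetical sketch of the claimed method. A defect bit of 0 marks a
# working area, 1 marks a defective area (matching the bitmap convention
# used later in the description).

def build_mapping_table(defect_bits):
    """Construct a contiguous logical->physical table: one entry per
    defect-free area, in increasing physical order."""
    table = []
    for physical, defective in enumerate(defect_bits):
        if not defective:
            table.append(physical)
    return table

def translate(table, logical_address):
    """Look up the defect-free physical area for a logical address."""
    return table[logical_address]

# Example: physical areas 2 and 5 are defective.
table = build_mapping_table([0, 0, 1, 0, 0, 1, 0, 0])
assert table == [0, 1, 3, 4, 6, 7]
assert translate(table, 2) == 3   # logical area 2 skips defective area 2
```

The controller thus never generates a defective physical address, because only defect-free areas ever appear in the table.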
  • Fig 1 is a block diagram of a prior art memory system
  • Fig 2 is a block diagram of a memory system having several memory modules each incorporating several memory devices, according to one embodiment of the present invention
  • Fig 3 is a block diagram of a typical memory device in the Fig 2 system
  • Fig 4 is a block diagram of a typical memory module in the Fig 2 system
  • Fig 5 is a block diagram of an alternative memory module, for use with the memory system of Fig 2;
  • Fig 6 is a block diagram of part of the memory controller of the Fig 2 system for implementing a simple defect-tolerance scheme
  • Fig 7 is a schematic diagram illustrating the simple defect-tolerance scheme of Fig 6; Fig 8 is a block diagram of part of an alternative memory controller for use with the Fig 2 system for implementing a complex defect-tolerance scheme; and
  • Fig 9 is a schematic diagram illustrating the operation of the complex defect-tolerance scheme of Fig 8.
  • Fig 2 is a block diagram of a defect-tolerant high bandwidth memory system 20 according to one embodiment of the present invention.
  • the system 20 is based on RAMBUS™ technology, and has a memory controller 22 connected to three removable memory modules 24 (called RAMBUS In-line Memory Modules [RIMMs]) by a single high bandwidth channel 26.
  • Each module 24 is populated with eight memory devices 28 (called RAMBUS DRAMs [RDRAMs]) and a single EEPROM 30.
  • the EEPROM 30 is a non-volatile memory for storing the locations of defective areas of the module 24.
  • the controller 22 is connected to each EEPROM 30 by a serial channel 32 which is separate from the high bandwidth channel 26.
  • the high bandwidth channel 26 is connected to the modules 24 in such a way that each bit in the high bandwidth channel 26 is connected to every memory module 24; and within each module 24, the memory devices 28 are connected to the high bandwidth channel 26 so that each bit in the high bandwidth channel 26 is connected to every memory device 28. This is achieved by having a high bandwidth channel input 26a to each module 24 and a separate high bandwidth channel output 26b from each module 24, so that the high bandwidth channel 26 loops through each module 24.
  • a termination component 34 is connected to the end of the channel 26 to ensure that the high speed signals propagating on the channel 26 are not reflected back along the channel 26 to the modules 24.
  • the controller 22 includes mapping circuitry 36 as will be described below.
  • Fig 3 is a block diagram of an RDRAM device 28 of Fig 2.
  • the RDRAM device 28 comprises RDRAM control circuitry 40 and a memory area 42.
  • the memory area 42 comprises decode circuitry 44, sense amplifier circuitry 46, and memory arranged in sixteen banks 48.
  • the decode circuitry 44 and sense amplifier circuitry 46 are used to access memory locations within the banks 48.
  • Fig 4 is a block diagram of an RDRAM module 24 populated with RDRAM devices 28 having banks containing defective memory locations.
  • the modules 24 are formatted by having the address space of each module 24 subdivided into mapping units (MU); in this embodiment one MU is one bank 48 (which is 1/16 of the memory space of an entire device 28). Each MU represents the smallest granularity of replacement for the mapping system.
  • each EEPROM 30 has a single bit for each MU (bank 48) to indicate whether that MU (bank 48) is defective or not.
  • a zero indicates a working bank 48 and a one indicates a defective bank 50.
  • the top left bit in the array represents the status (working or defective) of the top left bank. The actual array would be stored as 16 bytes within the EEPROM 30 and would be output on a sequential bit by bit basis on serial channel 32.
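The 16-byte bitmap just described (eight devices 28 of sixteen banks 48 each, one bit per bank, shifted out serially) can be sketched as follows. This is an illustration, not the patent's format: the MSB-first bit order within each byte is an assumption, and the function names are hypothetical.

```python
# Sketch of the per-bank defect bitmap: 8 devices x 16 banks = 128 bits,
# stored as 16 bytes in the EEPROM 30 and output bit by bit on the
# serial channel 32. MSB-first ordering is assumed.

def pack_defect_map(defective_banks, total_banks=128):
    """Build the EEPROM image: bit i set means bank i is defective."""
    data = bytearray(total_banks // 8)
    for bank in defective_banks:
        data[bank // 8] |= 0x80 >> (bank % 8)   # MSB-first within a byte
    return bytes(data)

def unpack_defect_map(data):
    """Replay the serial bit stream: yields one defect bit per bank,
    in increasing order of bank address."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

eeprom = pack_defect_map({6, 23})           # banks 6 and 23 defective
assert len(eeprom) == 16
bits = list(unpack_defect_map(eeprom))
assert bits[6] == 1 and bits[23] == 1 and sum(bits) == 2
```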
  • Fig 5 is a block diagram of an alternative formatted RDRAM module 24b in accordance with another embodiment of the present invention, but which can also be used with the memory system of Fig 2.
  • Module 24b is populated with RDRAM devices 28, each device 28 having non-volatile memory in the form of a set of registers 30b called format registers.
  • Standard RDRAM devices 28 are designed to have some format registers which can be used for storing formatting data. Thus, format registers are already present on standard RDRAM devices 28.
  • once each device 28 has been tested, the set of registers 30b is programmed with bits to indicate the location of defective banks 50 within that device 28.
  • Each set of registers 30b contains two bytes of information; the most significant bit in the two bytes relates to the status (working or defective) of the top left bank of that device 28.
  • the contents of each set of registers 30b is read out by the controller 22 on serial channel 32: the devices 28 are accessed sequentially in a predetermined manner so that the controller 22 knows which particular device corresponds to the set of registers 30b being read.
  • Fig 6 is a block diagram of the mapping circuitry 36 (of Fig 2).
  • the mapping circuitry 36 comprises: a memory 60 in the form of an SRAM look-up (mapping) table, an address counter 62, a mapping counter 64, an address multiplexor 66, and defect map initialisation logic (DMIL) 68.
  • an initialisation routine is executed by the controller 22.
  • the purpose of the initialisation routine is to construct, or set up the contents of, the mapping table 60 so that address translations may be performed during normal operation of the memory system 20.
  • the controller 22 resets (to zero) the address counter 62 and the mapping counter 64, and sets the address multiplexor 66 so that input 70 is routed to output 72.
  • the address counter 62 is routed to the mapping table 60 so that the mapping table 60 is addressed by the contents of the address counter 62.
  • the controller 22 begins to transfer information identifying defective areas in the memory devices 28 (defect information) from the modules 24 to the DMIL 68. This is accomplished by accessing each memory module 24 in turn via the serial channel 32.
  • the defect information sequence of bits is presented in increasing order of position in the address space: the first data bit represents the bank 48 at the lowest part of the address space (the lowest bank address), and the last data bit represents the bank 48 at the highest part of the address space (the highest bank address).
  • one bit of the defect information is received by the DMIL 68. If the first bit (of the defect information) that is received indicates that the associated bank is defect free (that is, the bit is set to zero) then the contents of the mapping counter 64 (zero) are written to (stored in) the memory entry addressed by the contents of the address counter 62 (zero); both counters 62 and 64 are then incremented so that the address counter 62 points to the second entry and the mapping counter 64 contains the address of the second bank.
  • if the first bit (of the defect information) that is received indicates that the associated bank (the first bank) is not defect free (that is, the bit is set to one) then the contents of the mapping counter 64 (zero) are incremented (to one) but no data is written to the mapping table 60.
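The two-counter initialisation described above can be sketched directly in Python. This is a hypothetical model: the dict stands in for the SRAM mapping table 60, and the function name is illustrative.

```python
# Model of the initialisation routine: the address counter 62 indexes
# mapping-table entries, the mapping counter 64 tracks which physical
# bank the current defect bit refers to.

def initialise_mapping_table(defect_bit_stream):
    mapping_table = {}
    address_counter = 0        # next logical entry to fill (counter 62)
    mapping_counter = 0        # physical bank of the current bit (counter 64)
    for bit in defect_bit_stream:
        if bit == 0:                       # bank is defect free
            mapping_table[address_counter] = mapping_counter
            address_counter += 1           # both counters advance
        # a defective bank (bit == 1) advances only the mapping counter
        mapping_counter += 1
    return mapping_table

# First and fourth banks defective: entry 0 must point to physical bank 1.
table = initialise_mapping_table([1, 0, 0, 1, 0])
assert table == {0: 1, 1: 2, 2: 4}
```

Note that a defective bank leaves no hole in the logical address space; it simply never receives a table entry, which is what makes the logical space contiguous.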
  • Fig 7 is a schematic diagram illustrating a completed mapping table 60 for a module consisting of two formatted RDRAM devices 28.
  • the first defective bank 50 in the devices 28 is bank number six (the seventh bank) in device 28a; however, bank entry number six points to bank number seven (the eighth bank) rather than bank number six.
  • Fig 7 shows, as an example, that entry number ten (arrow 80) is mapped to bank number eleven.
  • mapping table 60 provides a set of continuous addresses corresponding to logical banks which can be accessed by the memory controller 22, and for each address corresponding to a logical bank the table 60 provides a corresponding address of a non-defective physical bank 48.
  • the memory system 20 may incorporate both formatted modules (having defective banks and associated defect information) and defect-free modules. If no formatted modules are installed in a memory system 20 then the entire mapping circuitry 36 may be bypassed since it always performs an identity mapping.
  • the size (number of entries) of the mapping table 60 determines the maximum size of addressable physical memory. If each bank is M bytes in size and the mapping table 60 has N entries then the maximum addressable memory size is M multiplied by N (M*N).
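As a worked example of the M*N bound (the concrete figures below are illustrative, not from the patent):

```python
# Capacity bound M * N: bank size times number of mapping-table entries.
M = 1 * 1024 * 1024        # bank size in bytes (illustrative)
N = 512                    # mapping-table entries (illustrative)
max_addressable = M * N
assert max_addressable == 512 * 1024 * 1024   # 512 MiB maximum
```

Defective banks reduce the usable memory below this bound but never change the bound itself, since the table size fixes how many logical banks can exist.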
  • the DMIL 68 sets the address multiplexor 66 so that input 88 is connected to output 72.
  • the system 20 may then be accessed for storing and retrieving information.
  • the system 20 may be accessed by a processor (such as a CPU); or by any other device that can access memory, such as a peripheral having direct memory access (DMA).
  • the address to be accessed (which could be for either a read operation or a write operation) is conveyed to the mapping circuitry 36 via bus 90 (which may be 29 bits wide).
  • This address to be accessed is a logical address and is part of a contiguous address space that does not include any of the defective areas of the main memory (contained within the modules 24) .
  • this logical address must be translated by the mapping circuitry 36 to produce a physical address for accessing the modules 24.
  • bus 90 is split into two buses: a selector bus 92 and an index bus 94.
  • the selector bus 92 conveys signals which determine which bank 48 is being accessed; whereas, the index bus 94 conveys signals which determine the row and column address within the bank 48 being accessed.
  • the signals on index bus 94 are not modified by the mapping circuitry 36.
  • the selector bus 92 conveys signals from the most significant bits of bus 90 and index bus 94 conveys signals from the least significant bits of bus 90.
  • the actual number of bits on bus 90 and buses 92 and 94 depend on the size and configuration of the modules 24. Only the signals on the selector bus 92 are translated by the mapping table 60.
  • when an address to be accessed is received on bus 90, the most significant bits are applied to the mapping table 60 via the selector bus 92 and the address multiplexor 66. The value of these most significant bits is the logical address to be translated. The physical address corresponding to this logical address is output on bus 96. Bus 96 is then combined (concatenated) with bus 94 to form bus 98, which provides the full address for accessing a bank 48 within the modules 24.
  • the memory controller 22 then uses bus 98 to access the memory modules 24 in the same way as a conventional high bandwidth memory system would use bus 90.
  • This mapping technique has the advantage that only the address applied to the modules 24 is modified: the data path to and from the modules 24 is not affected in any way. The technique ensures that only defect-free memory is ever accessed by the memory controller 22. Thus, the full bandwidth potential of the modules 24 is maintained.
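The translation path just described (split bus 90 into a selector and an index, translate only the selector, then recombine) can be sketched as follows. The 19-bit index width is taken from the 10/19-bit example given later for Fig 8; the function and constant names are hypothetical.

```python
# Sketch of the address translation: only the selector bits (bus 92)
# pass through the mapping table; the index bits (bus 94) are untouched.

INDEX_BITS = 19   # width of the index bus 94 (illustrative)

def translate_address(logical_address, mapping_table):
    selector = logical_address >> INDEX_BITS           # bus 92 (MSBs)
    index = logical_address & ((1 << INDEX_BITS) - 1)  # bus 94 (LSBs)
    physical_selector = mapping_table[selector]        # bus 96
    return (physical_selector << INDEX_BITS) | index   # bus 98

# Logical bank 2 is remapped to physical bank 3; the row/column index
# 0x1234 within the bank is unchanged.
table = {0: 0, 1: 1, 2: 3}
addr = (2 << INDEX_BITS) | 0x1234
assert translate_address(addr, table) == (3 << INDEX_BITS) | 0x1234
```

Because the index bits never pass through the table, the data path and the row/column timing within a bank are unaffected, which is what preserves full channel bandwidth.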
  • Fig 8 is a block diagram of alternative mapping circuitry 36b for use with memory controller 22 for implementing a complex defect-tolerance scheme
  • Fig 9 illustrates the effect of mapping circuitry 36b on two RDRAM devices.
  • when mapping circuitry 36b is used, the memory system includes a formatted module 24 which may have an extra RDRAM device 28c so that additional (substitute) banks 48c are available.
  • Mapping circuitry 36b comprises: mapping table 60b, row address selector 100, column address selector 102, row mapping table 104, row base register 106, column mapping table 108, column base register 110, mapping arbitration logic 112, block address multiplexor 114, and initialisation logic (not shown, in the interests of clarity).
  • Mapping circuitry 36b uses three types of MU: block MU, where one block is one bank in this embodiment; row MU, which consists of a band or range of rows within a particular block; and column MU which consists of a band or range of columns within a particular block.
  • the width of the row MU (the number of rows in each band) is dependent on the size of the row mapping table 104.
  • the width of the column MU (the number of columns in each band) is dependent on the size of the column mapping table 108.
  • Mapping circuitry 36b allows defective banks having some working rows or columns (hereafter referred to as partial banks 50b) to be used for data storage. This is achieved by identifying partial banks 50b, monitoring access to these partial banks 50b, and if a defective row or column within a partial bank 50b is being accessed then re-directing access to a substitute bank 48c, otherwise allowing the partial bank 50b to be accessed. Thus, mapping circuitry 36b does not alter the addresses of rows or columns which are defective, but accesses these rows or columns on a substitute bank 48c rather than in the partial bank 50b.
  • Each substitute bank 48c may only contain remapped rows or remapped columns, not both remapped rows and columns, so that intersection problems do not arise.
  • Each substitute bank 48c may also be a partial bank 50b, provided substitute banks 48c containing rows do not have any column faults and substitute banks 48c containing columns do not have any row faults.
  • the substitute banks 48c are marked as being defective in the mapping table 60b so that they do not appear in the usable address space. Address coincidences (see arrows labelled 119 in Fig 9) are dealt with by directing clashing rows or columns to different row or column substitute banks 48c. Access to sequential addresses in the logical address space may require more row access operations than would be expected for all-good devices because adjacent rows may be stored in different substitute banks 48c.
  • the row mapping table 104 and the column mapping table 108 may be implemented as SRAM look-up tables or as content addressable memories (CAMs). If an SRAM is used then there is no intrinsic limit on the number of mappings that can be applied, but making the row/column bands wider reduces the size of the SRAM tables. If a CAM is used then narrow bands may be used (even down to a single row/column width) but the total number of defects is limited by the total number of CAM entries. It is assumed that only one module 24 will be supported using this complex defect-tolerance scheme. Other modules 24 may contain only perfect working devices or use the simple defect-tolerance scheme described above.
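The CAM alternative can be sketched as follows. This is a hypothetical model: a Python dict stands in for the CAM's parallel match, and CAM_CAPACITY models the fixed entry count that limits the total number of remappable defects.

```python
# Sketch of a CAM-style row remap: only defective rows get an entry,
# so single-row granularity is possible, at the cost of a fixed number
# of entries.

CAM_CAPACITY = 8   # illustrative limit on remappable defects

class RowCam:
    def __init__(self):
        self.entries = {}                  # defective row -> substitute row

    def add(self, defective_row, substitute_row):
        if len(self.entries) >= CAM_CAPACITY:
            raise OverflowError("CAM full: too many defects to remap")
        self.entries[defective_row] = substitute_row

    def lookup(self, row):
        # A CAM miss means the row is good and is used as-is.
        return self.entries.get(row, row)

cam = RowCam()
cam.add(41, 1000)                          # row 41 redirected to row 1000
assert cam.lookup(41) == 1000
assert cam.lookup(42) == 42                # unmapped rows pass through
```

An SRAM table, by contrast, would need one entry per row band whether or not the band is defective, which is why wider bands are used to keep it small.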
  • mapping circuitry 36b is initialised in a similar way to mapping circuitry 36.
  • mapping table 60b is initialised in a similar way to mapping table 60; however, table 60b considers partial banks 50b as working banks 48 and includes an additional bit (fine map bit 120) to indicate whether the bank is part of the memory space which may contain a partial bank 50b.
  • the row and column mapping tables 104,108 are initialised directly from an EEPROM (or other non-volatile device) on the module.
  • a large number of bits are required compared with the simple defect-tolerance scheme so it is unlikely that this would be stored on the RDRAM devices.
  • the bits would probably be stored as a direct image in the EEPROM so the initialisation just requires the bits to be copied into the row/column mapping tables 104,108.
  • the row and column base registers 106,110 also require to be initialised from the EEPROM.
  • registers 106,110 are used because not all of the address bits need to be stored in the row/column mapping tables 104,108; registers 106,110 provide the additional bits required to complete the addresses output from tables 104,108 respectively.
  • the row and column address selectors 100,102 are also programmable and are controlled from configuration bits stored in the EEPROM.
  • an address may be applied to bus 90 (Fig 8) by, for example, a microprocessor.
  • bus 90 is split into a selector bus 92 (which may be 10 bits wide) and an index bus 94 (which may be 19 bits wide).
  • Selector bus 92 accesses mapping table 60b and a corresponding entry having a physical address plus one additional bit is supplied therefrom.
  • the physical address is conveyed on bus 90.
  • the additional bit is used to indicate whether the physical address is in the address space which may have substitute banks 48c.
  • the additional bit is conveyed on map line 120 to the mapping arbitration logic 112.
  • the additional bit is used to enable the mapping arbitration logic 112.
  • the status of the additional bit for each entry in mapping table 60b is set (by downloading from the EEPROM) during the initialisation routine.
  • if the additional bit is not asserted (set to zero) then the translated address (from mapping table 60b) is routed through address multiplexor 114 and is combined with bus 94 (which conveys unchanged bits from bus 90) to be conveyed by bus 98.
  • Some bits of the incoming address on bus 90 are also applied to row mapping table 104 and column mapping table 108 via the row address selector 100 and the column address selector 102 respectively.
  • Row address selector 100 selects the row bits from the signal on bus 90; whereas, column address selector 102 selects the column bits from the signal on bus 90.
  • these bits will have the logical bank selections, the least significant bits of the logical device identifiers (enough bits to address the maximum number of devices that are located on the formatted module) , and the most significant physical bits of the row or column. The most significant physical row/column bits are used to maximise the probability of physical local defects appearing in the same row/column replacement band. Only row bits are selected for indexing the row mapping table 104 and only column bits are selected for indexing the column mapping table 108. The other bits will be supplied by the appropriate base register 106,110 for each mapping table 104,108.
  • the total number of index bits applied to the row mapping table 104 is indicated by the label ROWSIZE.
  • the total number of index bits applied to the column mapping table 108 is indicated by the label COLSIZE.
  • the number of selected bits in ROWSIZE and COLSIZE determines the mapping band width.
  • a data value of width SUBSIZE is obtained from the row and column mapping tables 104,108. This value is used to determine if a row or column band currently being accessed needs to be remapped.
  • a predefined value (such as all bit values set to logic one) is used to indicate that no mapping is to be performed.
  • the row substitute value is applied to the mapping arbitration logic 112.
  • the row substitute value is also combined with the row base register value to produce a physical address for accessing the module; this physical address is applied to address multiplexor input A 122.
  • the column substitute value is applied to the mapping arbitration logic 112.
  • the column substitute value is also combined with the column base register value to produce a physical address for accessing the module; this physical address is applied to address multiplexor input B 124.
  • Mapping logic 112 determines which of the block addresses A (physical address for remapping a row), B (physical address for remapping a column), or C (physical address for remapping a bank as per the simple defect-tolerance scheme) is to form the translated address. If no row/column mapping is being performed or the additional bit on map line 120 is not asserted then the signal on bus 90 (input C) is selected. If a row mapping is being performed then input 122 (A) is selected to access the row substitution bank. If a column mapping is being performed then input 124 (B) is selected to access a column substitution bank. If both a row and column are being mapped at the address then the mapping arbitration logic 112 will give priority to one mapping over the other.
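The arbitration rule can be sketched as below. This is a hypothetical model: NO_MAP stands for the predefined "no mapping" value (such as all bits set to one), and rows are arbitrarily given priority over columns here, since the patent leaves the priority choice to the implementation.

```python
# Sketch of mapping arbitration logic 112: choose between the row
# substitute (A), column substitute (B) and block-mapped address (C).

NO_MAP = object()   # stands for the predefined "no mapping" table value

def arbitrate(row_sub, col_sub, block_addr, fine_map_bit):
    if not fine_map_bit:                   # map line 120 not asserted
        return block_addr                  # input C
    if row_sub is not NO_MAP:
        return row_sub                     # input A: row substitution bank
    if col_sub is not NO_MAP:
        return col_sub                     # input B: column substitution bank
    return block_addr                      # input C: no fine mapping hit

assert arbitrate(NO_MAP, NO_MAP, 0x30, fine_map_bit=1) == 0x30
assert arbitrate(0x51, NO_MAP, 0x30, fine_map_bit=1) == 0x51
assert arbitrate(0x51, 0x62, 0x30, fine_map_bit=0) == 0x30
assert arbitrate(0x51, 0x62, 0x30, fine_map_bit=1) == 0x51  # row priority
```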
  • the returned data size of SUBSIZE is likely to be less than BLKSIZE.
  • the remaining data bits are provided by row and column base registers 106,110. These allow the physical position of the substitute banks to be placed with an alignment of 2^SUBSIZE.
  • Row and column substitute banks must occupy contiguous physical bank numbers.
  • the number of row and column substitute banks to be used may be determined by software which generates the mapping information in the module EEPROM.
  • Fig 9 also illustrates a remapping configuration which is implemented when two defective columns have the same column location (an address coincidence) .
  • this remapping configuration at least two substitute banks 48c are allocated for storing remapped columns.
  • embodiments of the present invention have a number of advantages, including the following. Defect-tolerance in a high bandwidth memory system is possible by using a modified controller. Only a low cost addition to the memory controller is required to obtain this defect-tolerance. Formatting of memory devices for such a system is quick and inexpensive. Standard (unmodified) module PCB layouts are used, therefore a standard form factor is preserved. Reduced capacity or full capacity memory systems may be implemented. A contiguous logical address space is maintained. Full channel bandwidth is maintained and the data path is unaffected. No cost is added to the memory modules used in such a system.
  • Various modifications may be made to the above described embodiments. For example, an MU may be larger or smaller than one bank.
  • a small MU increases the effective area of silicon that may be used on each device 28 because less good memory is wasted as a result of being in the same MU as a defect; however, a larger mapping table 60 will be required for the address translation, which increases the cost of the controller 22, has a slower access time than a small table, and occupies more silicon area than a small table.
  • Other means for initialising the mapping table 60 may be used, such as storing an image of the defective locations in non-volatile memory.
  • an MU is a small band or range of rows or columns, so that only a small band of rows or columns which actually include the defect is treated as defective.
  • the non-volatile memory storage on each module may store bits in a different order to that described above; for example, the most significant bit may correspond to the bottom left bank in memory.
  • in another embodiment, when mapping circuitry 36b is used, the memory system includes a formatted module 24 which does not have an extra RDRAM device 28c, but uses one of the RDRAM devices 28 to provide the additional (substitute) banks 48c.
  • This embodiment would have a reduced memory capacity compared with a defect-free module.
  • the memory controller may be incorporated within another device which can access memory, -22- such as a CPU or a DMA device; in these embodiments, the implementation of defect-tolerance using remapping would be the same .

Abstract

A defect-tolerant high bandwidth memory system (20) comprises a controller (22), one or more memory modules (24) each provided with one or more memory devices (28), and a high bandwidth channel (26) for connecting the controller (22) to each module (24) and for carrying data therebetween. A non-volatile memory, which may be an EEPROM (30) or a set of registers (30b), is provided on each module for storing the locations of defective areas of the modules. The controller (22) accesses the non-volatile memory (30) and remaps physical non-defective areas of memory as a set of continuous logical areas of memory. Therefore, the controller does not generate defective physical addresses, thus allowing defect-tolerance to be implemented on high bandwidth memory systems which require transmission line matching of the memory devices.

Description

DEFECT-TOLERANT MEMORY SYSTEM
The present invention relates to a memory system. In particular, the invention relates to a defect-tolerant memory system for use with a memory accessing device such as a processor to provide high speed data storage and retrieval.
A defect-tolerant memory system is one which has the advantage that partially working memory devices may be used: partially working memory devices are less expensive and more readily available than perfectly working memory devices. Each memory device is, of course, formed by a multitude of memory locations capable of storing a single bit.
A typical prior art defect-tolerant memory system 1 is shown schematically in Fig 1 having a custom circuit (ASIC) 2 mounted on a memory module 4 containing partially working memory devices 6. A memory controller 8 conveys control, address and data information on buses 9 to the module 4. The ASIC 2 intercepts the control, data and address information from the controller 8 and is pre-programmed to detect addresses corresponding to faulty memory locations, and on detection of such an address either maps the address to the address of a new (non-faulty) memory location on a different memory device or accesses the same address on a different memory device to transfer valid data. This technique works well for memory modules having buses 9 conveying information at comparatively slow rates (for example 100MHz) and having low bandwidths (less than 0.5 Gigabytes per second), such as DRAM (dynamic random access memory) memory modules.
Recently, however, a high bandwidth (for example, 1.6 Gigabytes per second) architecture for memory systems has been proposed. One example of this high bandwidth architecture
(RAMBUS™) uses a single high speed (800MHz) channel which connects a memory controller to an array of memory modules so that each bit in the channel is connected to every memory module. The single channel uses a small number (typically two bytes) of very high speed signals to convey read/write data.
The channel is connected to the memory controller at one end and to a terminator at the opposite end; memory modules are connected to the single channel between these two ends.
Previous fault tolerant techniques using an ASIC cannot be used with such memory systems because the ASIC cannot provide the correct transmission line characteristics for each of the plurality of memory devices being accessed; in addition, because there is only a short predetermined time period for providing the correct data, the ASIC would introduce too much time delay if used in such memory systems.
Thus, at present, it is not possible to use partially working memory devices in high bandwidth architecture memory systems.
According to a first aspect of the present invention there is provided a defect-tolerant high bandwidth memory system comprising: a memory controller; at least one memory module having at least one memory device; a high bandwidth channel for connecting the memory controller to the at least one memory module and for conveying data therebetween; and a non-volatile memory for storing the locations of defective areas of the at least one memory module; where the memory controller is arranged and configured to access the non-volatile memory and to remap physical non-defective areas of memory as a set of continuous logical areas of memory so that for every logical area of memory that may be accessed there is a corresponding non-defective physical area of memory.
By virtue of the present invention defect management and address remapping is performed by the memory controller so that the memory controller does not generate defective physical addresses, thus allowing defect-tolerance to be implemented on high bandwidth memory systems which require transmission line matching of the memory devices.
A module may be a physically detachable entity for releasable connection to a module socket; however, a module may also be incorporated within and be an integral part of a motherboard so that the module is not detachable: thus, a module refers to one or more memory devices arranged on a circuit board (which may be populated with other components) so that the device or devices can store information.
The memory controller may be incorporated within another device which can access memory, such as a CPU or a DMA device.
The present invention may be used with memory systems having bandwidths greater than approximately 1 GByte per second; advantageously, the present invention may be used with memory systems having bandwidths greater than approximately 1.5 GBytes per second.
The non-volatile memory may be included on each of the at least one memory module. For example, for each of the at least one memory module, the non-volatile memory may be implemented as a register or as a set of registers in each memory device in the module. The non-volatile memory may be included as an EEPROM on each of the at least one memory module. The non-volatile memory may be stored in any convenient part of the memory system, for example, in the CMOS setup RAM.
The non-volatile memory may not be a solid state memory, but may be, for example, a magnetic disk, CD-ROM, a diskette, or such like.
Preferably, the memory controller is arranged and configured to remap physical non-defective areas of memory as a set of continuous logical areas of memory by using a memory in the form of a look-up table (LUT). Conveniently, the LUT is SRAM, each entry having the address of a logical area of memory (hereinafter "a logical address") and a corresponding address of a non-defective physical area of memory (hereinafter "a physical address"). Alternatively, the memory controller may be arranged and configured to remap physical non-defective areas of memory by using a memory in the form of a content addressable memory (CAM).
In one embodiment, nominally full memory capacity may be provided by having an additional memory device on a memory module. In another embodiment, only a reduced memory capacity is provided.
The areas of memory may be banks of rows and columns; alternatively, the areas may be a band of rows or a band of columns.
Where the memory system has an additional memory device to provide full memory capacity, the memory controller may include a look-up table arranged to have logical address entries of either bands of rows or bands of columns, so that for each band of rows and for each band of columns there is a corresponding physical band of rows and band of columns, and any band of rows having a defect is replaced by a similar band of rows from the additional memory device, and any band of columns having a defect is replaced by a similar band of columns from the additional memory device, so that full memory capacity is provided by the memory system.
According to a second aspect of the present invention there is provided a memory module for connecting to a memory controller to provide a defect-tolerant high bandwidth memory system, the module including non-volatile memory for indicating the areas of defective memory in the module, whereby the memory controller may access the non-volatile memory to retrieve information relating to the areas of defective memory in the module.
Preferably, the non-volatile memory is in the form of one or more registers on each memory device populating a module. Alternatively, the non-volatile memory is in the form of a programmable read only memory (such as an EEPROM, a FLASH EPROM, a ROM, or such like) disposed on the module.
According to a third aspect of the present invention there is provided a method of accessing non-defective memory in a high bandwidth memory system containing defective memory locations, the method comprising the steps of: identifying defective locations within the memory system; constructing a set of continuous sequential logical addresses of areas of memory corresponding to physical addresses of defect-free areas of memory; and storing the set of sequential logical addresses and the corresponding physical addresses as a table; whereby, when a logical address is received, the table is accessed using the received logical address, and the corresponding physical address is thereby determined so that a defect-free area of memory is accessed.
These and other aspects of the present invention will be apparent from the following specific description, given by way of example, with reference to the accompanying drawings in which:
Fig 1 is a block diagram of a prior art memory system;
Fig 2 is a block diagram of a memory system having several memory modules each incorporating several memory devices, according to one embodiment of the present invention;
Fig 3 is a block diagram of a typical memory device in the Fig 2 system;
Fig 4 is a block diagram of a typical memory module in the Fig 2 system;
Fig 5 is a block diagram of an alternative memory module, for use with the memory system of Fig 2;
Fig 6 is a block diagram of part of the memory controller of the Fig 2 system for implementing a simple defect-tolerance scheme;
Fig 7 is a schematic diagram illustrating the simple defect-tolerance scheme of Fig 6;
Fig 8 is a block diagram of part of an alternative memory controller for use with the Fig 2 system for implementing a complex defect-tolerance scheme; and
Fig 9 is a schematic diagram illustrating the operation of the complex defect-tolerance scheme of Fig 8.
Fig 2 is a block diagram of a defect-tolerant high bandwidth memory system 20 according to one embodiment of the present invention. The system 20 is based on RAMBUS™ technology, and has a memory controller 22 connected to three removable memory modules 24 (called RAMBUS In-line Memory Modules [RIMMs]) by a single high bandwidth channel 26. Each module 24 is populated with eight memory devices 28 (called RAMBUS DRAMs [RDRAMs]) and a single EEPROM 30. The EEPROM 30 is a non-volatile memory for storing the locations of defective areas of the module 24. The controller 22 is connected to each EEPROM 30 by a serial channel 32 which is separate from the high bandwidth channel 26.
The high bandwidth channel 26 is connected to the modules 24 in such a way that each bit in the high bandwidth channel 26 is connected to every memory module 24; and within each module 24, the memory devices 28 are connected to the high bandwidth channel 26 so that each bit in the high bandwidth channel 26 is connected to every memory device 28. This is achieved by having a high bandwidth channel input 26a to each module 24 and a separate high bandwidth channel output 26b from each module 24, so that the high bandwidth channel 26 loops through each module 24. A termination component 34 is connected to the end of the channel 26 to ensure that the high speed signals propagating on the channel 26 are not reflected back along the channel 26 to the modules 24.
The controller 22 includes mapping circuitry 36 as will be described below.
Fig 3 is a block diagram of an RDRAM device 28 of Fig 2. The RDRAM device 28 comprises RDRAM control circuitry 40 and a memory area 42. The memory area 42 comprises decode circuitry 44, sense amplifier circuitry 46, and memory arranged in sixteen banks 48. The decode circuitry 44 and sense amplifier circuitry 46 are used to access memory locations within the banks 48.
Fig 4 is a block diagram of an RDRAM module 24 populated with RDRAM devices 28 having banks containing defective memory locations.
The modules 24 are formatted by having the address space of each module 24 subdivided into mapping units (MU); in this embodiment one MU is one bank 48 (which is 1/16 of the memory space of an entire device 28). Each MU represents the smallest granularity of replacement for the mapping system.
In this embodiment, if even one defective memory location is present in a bank 48 then the entire bank 48 is considered to be defective. Defective banks are indicated by numeral 50 and are shown in Fig 4 by heavy shading. When a module 24 is tested the defective banks 50 are identified and the EEPROM 30 is programmed with the locations of these defective banks 50. Thus, each EEPROM 30 has a single bit for each MU (bank 48) to indicate whether that MU (bank 48) is defective or not. This is shown in Fig 4 as an array of bits within the EEPROM 30, where a zero indicates a working bank 48 and a one indicates a defective bank 50. In Fig 4, the top left bit in the array represents the status (working or defective) of the top left bank. The actual array would be stored as 16 bytes within the EEPROM 30 and would be output on a sequential bit by bit basis on serial channel 32.
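The 16-byte defect map described above can be sketched in software. The following is an illustrative encoding, not code from the patent: 8 devices of 16 banks give 128 banks, one status bit each, and the ordering (most significant bit of the first byte describing the lowest bank address) is an assumption consistent with the serial presentation described below.

```python
# Illustrative sketch of the EEPROM 30 defect map: one bit per MU (bank),
# packed into 16 bytes, with 1 = defective and 0 = working.

def pack_defect_map(defective_banks, total_banks=128):
    """Pack a set of defective bank numbers into an EEPROM byte image."""
    image = bytearray(total_banks // 8)
    for bank in defective_banks:
        image[bank // 8] |= 1 << (7 - bank % 8)   # MSB-first (an assumption)
    return bytes(image)

def serial_bits(image):
    """Yield the image bit by bit, as serial channel 32 would present it."""
    for byte in image:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1
```

For example, a module whose only defective banks are banks six and thirteen produces an image in which exactly those two serial bit positions are set.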
Fig 5 is a block diagram of an alternative formatted RDRAM module 24b in accordance with another embodiment of the present invention, but which can also be used with the memory system of Fig 2. Module 24b is populated with RDRAM devices 28, each device 28 having non-volatile memory in the form of a set of registers 30b called format registers. Standard RDRAM devices 28 are designed to have some format registers which can be used for storing formatting data. Thus, format registers are already present on standard RDRAM devices 28.
Once each device 28 has been tested, the set of registers 30b is programmed with bits to indicate the location of defective banks 50 within that device 28. Each set of registers 30b contains two bytes of information, the most significant bit in the two bytes relates to the status (working or defective) of the top left bank of that device 28. The contents of each set of registers 30b is read out by the controller 22 on serial channel 32: the devices 28 are accessed sequentially in a predetermined manner so that the controller 22 knows which particular device corresponds to the set of registers 30b being read.
The advantage of storing defect information on each device 28 is that there is no requirement to track parts as they move through a production cycle as each part contains all necessary defect information for that part.
Fig 6 is a block diagram of the mapping circuitry 36 (of Fig 2). The mapping circuitry 36 comprises: a memory 60 in the form of an SRAM look-up (mapping) table, an address counter 62, a mapping counter 64, an address multiplexor 66, and defect map initialisation logic (DMIL) 68.
Each time the memory system 20 is switched on, an initialisation routine is executed by the controller 22. The purpose of the initialisation routine is to construct, or set up the contents of, the mapping table 60 so that address translations may be performed during normal operation of the memory system 20.
At the start of the initialisation routine, the controller 22 resets (to zero) the address counter 62 and the mapping counter 64, and sets the address multiplexor 66 so that input 70 is routed to output 72. Thus, the address counter 62 is routed to the mapping table 60 so that the mapping table 60 is addressed by the contents of the address counter 62. Once the counters 62,64 have been reset, the controller 22 begins to transfer information identifying defective areas in the memory devices 28 (defect information) from the modules 24 to the DMIL 68. This is accomplished by accessing each memory module 24 within the system 20 in turn, and conveying the sequence of bits from the EEPROM 30 for that module 24 to the DMIL 68 via the serial channel 32.
The defect information sequence of bits is presented in increasing order of position in the address space: the first data bit represents the bank 48 at the lowest part of the address space (the lowest bank address), and the last data bit represents the bank 48 at the highest part of the address space (the highest bank address).
On each clock cycle (clock signal is not shown for clarity) , one bit of the defect information is received by the DMIL 68. If the first bit (of the defect information) that is received indicates that the associated bank is defect free (that is, the bit is set to zero) then the contents of the mapping counter 64 (zero) are written to (stored in) the memory entry addressed by the contents of the address counter 62 (zero) ; both counters 62 and 64 are then incremented so that the address counter 62 points to the second entry and the mapping counter 64 contains the address of the second bank.
However, if the first bit (of the defect information) that is received indicates that the associated bank (the first bank) is not defect free (that is, the bit is set to one) then the contents of the mapping counter 64 (zero) are incremented (to one) but no data is written to the mapping table 60.
This procedure of receiving one bit and, depending on the state of the bit, either: writing to the mapping table 60 and incrementing both counters 62,64; or incrementing only the mapping counter 64, is repeated for the second and each subsequent bit until all of the bits in the defect information have been received.
This procedure ensures that the addresses of defective banks 50 are not written to the mapping table 60 so the mapping table 60 only stores mappings to the usable banks 48 in the modules 24.
Fig 7 is a schematic diagram illustrating a completed mapping table 60 for a module consisting of two formatted RDRAM devices 28a,b. The first defective bank 50 in the devices 28 is bank number six (the seventh bank) in device 28a; however, bank number seven is defect-free. Therefore, entry number six (the seventh entry) points to bank number seven (the eighth bank) rather than bank number six. Fig 7 shows as an example, that entry number ten (arrow 80) is mapped to bank number eleven (arrow 82) in device 28a: similarly, entry number twenty four (arrow 84) is mapped to bank number thirteen (arrow 86) in device 28b. Thus, the mapping table 60 provides a set of continuous addresses corresponding to logical banks which can be accessed by the memory controller 22, and for each address corresponding to a logical bank the table 60 provides a physical address of a defect-free bank 48.
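The initialisation procedure can be sketched as follows. This is an illustrative model of the counter behaviour, not the controller's actual logic; the function name is hypothetical.

```python
# Sketch of the DMIL initialisation loop: for each serial defect bit, either
# record the current physical bank against the next logical entry (bit == 0)
# or skip the defective physical bank (bit == 1).

def build_mapping_table(defect_bits):
    """Return table[logical_bank] -> physical_bank, skipping defective banks."""
    table = []               # plays the role of the SRAM mapping table 60
    physical_bank = 0        # plays the role of the mapping counter 64
    for bit in defect_bits:  # the address counter 62 is implicit in append()
        if bit == 0:         # defect free: store mapping, advance both counters
            table.append(physical_bank)
        physical_bank += 1   # defective: advance the mapping counter only
    return table
```

Applied to a bitstream in which bank number six is the first defective bank, this reproduces the Fig 7 behaviour: entry number six points to bank number seven.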
The memory system 20 may incorporate both formatted modules (having defective banks and associated defect information) and defect-free modules. If no formatted modules are installed in a memory system 20 then the entire mapping circuitry 36 may be bypassed since it always performs an identity mapping.
The size (number of entries) of the mapping table 60 determines the maximum size of addressable physical memory. If each bank is M bytes in size and the mapping table 60 has N entries then the maximum addressable memory size is M multiplied by N (M*N).
When the initialisation routine has been completed, the DMIL 68 sets the address multiplexor 66 so that input 88 is connected to output 72. The system 20 may then be accessed for storing and retrieving information. The system 20 may be accessed by a processor (such as a CPU); or by any other device that can access memory, such as a peripheral having direct memory access (DMA).
Referring to Fig 6, the address to be accessed (which could be for either a read operation or a write operation) is conveyed to the mapping circuitry 36 via bus 90 (which may be 29 bits wide). This address to be accessed is a logical address and is part of a contiguous address space that does not include any of the defective areas of the main memory (contained within the modules 24).
This logical address must be translated by the mapping circuitry 36 to produce a physical address for accessing the modules 24. Within mapping circuitry 36, bus 90 is split into two buses: a selector bus 92 and an index bus 94. The selector bus 92 conveys signals which determine which bank 48 is being accessed; whereas, the index bus 94 conveys signals which determine the row and column address within the bank 48 being accessed. The signals on index bus 94 are not modified by the mapping circuitry 36.
The selector bus 92 conveys signals from the most significant bits of bus 90 and index bus 94 conveys signals from the least significant bits of bus 90. The actual number of bits on bus 90 and buses 92 and 94 depend on the size and configuration of the modules 24. Only the signals on the selector bus 92 are translated by the mapping table 60.
When an address to be accessed is received on bus 90, the most significant bits are applied to the mapping table 60 via the selector bus 92 and the address multiplexor 66. The value of these most significant bits is the logical address to be translated. The physical address corresponding to this logical address is output on bus 96. Bus 96 is then combined
(concatenated) with bus 94 to form bus 98, which provides the full address for accessing a bank 48 within the modules 24.
The memory controller 22 then uses bus 98 to access the memory modules 24 in the same way as a conventional high bandwidth memory system would use bus 90.
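The translation step can be sketched as below. The bit widths are assumptions consistent with the widths suggested elsewhere in the description (a 29-bit logical address split into a 10-bit selector and a 19-bit index); the function name is illustrative.

```python
# Sketch of the address translation: only the selector (most significant)
# bits pass through the mapping table; the index bits are not modified.

INDEX_BITS = 19   # assumed width of index bus 94

def translate(logical_address, mapping_table):
    selector = logical_address >> INDEX_BITS            # logical bank (bus 92)
    index = logical_address & ((1 << INDEX_BITS) - 1)   # row/column (bus 94)
    physical_bank = mapping_table[selector]             # table lookup (bus 96)
    return (physical_bank << INDEX_BITS) | index        # concatenation (bus 98)
```

Because only the selector field is remapped, the row and column address within the bank, and the data path, are untouched.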
This mapping technique has the advantage that only the address applied to the modules 24 is modified: the data path to and from the modules 24 is not affected in any way. The technique ensures that only defect-free memory is ever accessed by the memory controller 22. Thus, the full bandwidth potential of the modules 24 is maintained.
Fig 8 is a block diagram of alternative mapping circuitry 36b for use with memory controller 22 for implementing a complex defect-tolerance scheme, and Fig 9 illustrates the effect of mapping circuitry 36b on two RDRAM devices.
When mapping circuitry 36b is used, the memory system includes a formatted module 24 which may have an extra RDRAM device 28c so that additional (substitute) banks 48c are available.
Mapping circuitry 36b comprises: mapping table 60b, row address selector 100, column address selector 102, row mapping table 104, row base register 106, column mapping table 108, column base register 110, mapping arbitration logic 112, block address multiplexor 114, and initialisation logic (not shown, in the interests of clarity).
Mapping circuitry 36b uses three types of MU: block MU, where one block is one bank in this embodiment; row MU, which consists of a band or range of rows within a particular block; and column MU which consists of a band or range of columns within a particular block. The width of the row MU (the number of rows in each band) is dependent on the size of the row mapping table 104. Similarly, the width of the column MU (the number of columns in each band) is dependent on the size of the column mapping table 108.
Mapping circuitry 36b allows defective banks having some working rows or columns (hereafter referred to as partial banks 50b) to be used for data storage. This is achieved by identifying partial banks 50b, monitoring access to these partial banks 50b, and if a defective row or column within a partial bank 50b is being accessed then re-directing access to a substitute bank 48c, otherwise allowing the partial bank 50b to be accessed. Thus, mapping circuitry 36b does not alter the addresses of rows or columns which are defective, but accesses these rows or columns on a substitute bank 48c rather than in the partial bank 50b.
Each substitute bank 48c may only contain remapped rows or remapped columns, not both remapped rows and columns, so that intersection problems do not arise. Each substitute bank 48c may also be a partial bank 50b, provided substitute banks 48c containing rows do not have any column faults and substitute banks 48c containing columns do not have any row faults.
The substitute banks 48c are marked as being defective in the mapping table 60b so that they do not appear in the usable address space. Address coincidences (see arrows labelled 119 in Fig 9) are dealt with by directing clashing rows or columns to different row or column substitute banks 48c. Access to sequential addresses in the logical address space may require more row access operations than would be expected for all-good devices because adjacent rows may be stored in different substitute banks 48c.
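One way to realise the coincidence rule above is a first-fit allocation that never places two bands with the same row/column location in the same substitute bank. This is a hedged sketch, not the allocation software the patent refers to; all names are illustrative.

```python
# Sketch of substitute-bank allocation: bands sharing the same row/column
# location (an address coincidence) must land in different substitute banks.

def assign_substitutes(defective_bands, substitute_banks):
    """Map (bank, band) defects to substitute banks without coincidences."""
    occupied = {sub: set() for sub in substitute_banks}
    assignment = {}
    for bank, band in defective_bands:
        for sub in substitute_banks:
            if band not in occupied[sub]:      # no clash in this substitute
                occupied[sub].add(band)
                assignment[(bank, band)] = sub
                break
        else:
            raise ValueError("not enough substitute banks for these defects")
    return assignment
```

Two defective columns at the same column location are thus guaranteed to be remapped to different substitute banks, as in the Fig 9 configuration.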
The row mapping table 104 and the column mapping table 108 may be implemented as SRAM look-up tables or as content addressable memories (CAMs) . If an SRAM is used then there is no intrinsic limit on the number of mappings that can be applied, but making the row/column bands wider reduces the size of the SRAM tables. If a CAM is used then narrow bands may be used (even down to a single row/column width) but the total number of defects is limited by the total number of CAM entries. It is assumed that only one module 24 will be supported using this complex defect-tolerance scheme. Other modules 24 may contain only perfect working devices or use the simple defect-tolerance scheme described above.
Mapping circuitry 36b is initialised in a similar way to mapping circuitry 36. In particular, mapping table 60b is initialised in a similar way to mapping table 60, but table 60b considers partial banks 50b as working banks 48 and includes an additional bit (fine map bit 120) to indicate whether the bank is part of the memory space which may contain a partial bank 50b.
The row and column mapping tables 104,108 are initialised directly from an EEPROM (or other non-volatile device) on the module. A large number of bits are required compared with the simple defect-tolerance scheme so it is unlikely that this would be stored on the RDRAM devices. The bits would probably be stored as a direct image in the EEPROM so the initialisation just requires the bits to be copied into the row/column mapping tables 104,108. The row and column base registers 106,110 also require to be initialised from the
EEPROM. These registers 106,110 are used because not all of the address bits need to be stored in the row/column mapping tables 104,108; registers 106,110 provide the additional bits required to complete the addresses output from tables 104,108 respectively.
The row and column address selectors 100,102 are also programmable and are controlled from configuration bits stored in the EEPROM.
When the mapping circuitry 36b has been initialised, an address may be applied to bus 90 (Fig 8) by, for example, a microprocessor. In a similar way to the Fig 6 embodiment, the bus 90 is split into a selector bus 92 (which may be 10 bits wide) and an index bus 94 (which may be 19 bits wide).
Selector bus 92 accesses mapping table 60b and a corresponding entry having a physical address plus one additional bit is supplied therefrom. The physical address is conveyed to the block address multiplexor 114. The additional bit is used to indicate whether the physical address is in the address space which may have substitute banks 48c; it is conveyed on map line 120 and used to enable the mapping arbitration logic 112. The status of the additional bit for each entry in mapping table 60b is set (by downloading from the EEPROM) during the initialisation routine.
If the additional bit is not asserted (set to zero) then the translated address (from mapping table 60b) is routed through address multiplexor 114 and is combined with bus 94 (which conveys unchanged bits from bus 90) to be conveyed by bus 98.
Some bits of the incoming address on bus 90 are also applied to row mapping table 104 and column mapping table 108 via the row address selector 100 and the column address selector 102 respectively. Row address selector 100 selects the row bits from the signal on bus 90; whereas, column address selector 102 selects the column bits from the signal on bus 90. Generally, these bits will have the logical bank selections, the least significant bits of the logical device identifiers (enough bits to address the maximum number of devices that are located on the formatted module), and the most significant physical bits of the row or column. The most significant physical row/column bits are used to maximise the probability of physical local defects appearing in the same row/column replacement band. Only row bits are selected for indexing the row mapping table 104 and only column bits are selected for indexing the column mapping table 108. The other bits will be supplied by the appropriate base register 106,110 for each mapping table 104,108.
The total number of index bits applied to the row mapping table 104 is indicated by the label ROWSIZE. The total number of index bits applied to the column mapping table 108 is indicated by the label COLSIZE. The number of selected bits in ROWSIZE and COLSIZE determines the mapping band width.
A data value of width SUBSIZE is obtained from the row and column mapping tables 104,108. This value is used to determine if a row or column band currently being accessed needs to be remapped. A predefined value (such as all bit values set to logic one) is used to indicate that no mapping is to be performed.
The row substitute value is applied to the mapping arbitration logic 112. The row substitute value is also combined with the row base register value to produce a physical address for accessing the module; this physical address is applied to address multiplexor input A 122.
Similarly, the column substitute value is applied to the mapping arbitration logic 112. The column substitute value is also combined with the column base register value to produce a physical address for accessing the module; this physical address is applied to address multiplexor input B 124.
Mapping logic 112 determines which of the block addresses A (physical address for remapping a row), B (physical address for remapping a column), or C (physical address for remapping a bank as per the simple defect-tolerance scheme) is to form the translated address. If no row/column mapping is being performed or the additional bit on map line 120 is not asserted then input C (the translated address from mapping table 60b) is selected. If a row mapping is being performed then input 122 (A) is selected to access the row substitution bank. If a column mapping is being performed then input 124 (B) is selected to access a column substitution bank. If both a row and column are being mapped at the address then the mapping arbitration logic 112 will give priority to one mapping over the other.
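The arbitration just described can be sketched as a small selection function. This is an illustration only: row priority is an assumption (the description requires only that one mapping take a fixed priority over the other), and `NO_MAP` stands in for the predefined "no mapping" SUBSIZE value.

```python
# Sketch of the mapping arbitration logic 112: input C is used unless the
# fine map bit is asserted and a row or column substitution applies.

NO_MAP = None   # stands in for the predefined "no mapping" value

def arbitrate(input_a, input_b, input_c, fine_map_bit, row_sub, col_sub):
    if not fine_map_bit:
        return input_c          # bank is outside the substitute-mapped space
    if row_sub is not NO_MAP:
        return input_a          # row substitution bank (input A)
    if col_sub is not NO_MAP:
        return input_b          # column substitution bank (input B)
    return input_c              # partial bank accessed directly
```

Note that when both a row and a column substitution match the same address, the row mapping wins here; the opposite fixed priority would work equally well.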
To reduce the size of the row/column mapping tables 104,108, the returned data size of SUBSIZE is likely to be less than BLKSIZE. The remaining data bits are provided by row and column base registers 106,110. These allow the physical position of the substitute banks to be placed with an alignment of 2^SUBSIZE.
Row and column substitute banks must occupy contiguous physical bank numbers. The number of row and column substitute banks to be used may be determined by software which generates the mapping information in the module EEPROM.
Fig 9 also illustrates a remapping configuration which is implemented when two defective columns have the same column location (an address coincidence) . In this remapping configuration, at least two substitute banks 48c are allocated for storing remapped columns.
It will be appreciated that embodiments of the present invention have a number of advantages, including the following. Defect-tolerance in a high bandwidth memory system is possible by using a modified controller. Only a low cost addition to the memory controller is required to obtain this defect-tolerance. Formatting of memory devices for such a system is quick and inexpensive. Standard (unmodified) module PCB layouts are used, therefore a standard form factor is preserved. Reduced capacity or full capacity memory systems may be implemented. A contiguous logical address space is maintained. Full channel bandwidth is maintained and the data path is unaffected. No cost is added to the memory modules used in such a system.

Various modifications may be made to the above described embodiments. For example, an MU may be larger or smaller than one bank. A small MU increases the effective area of silicon that may be used on each device 28 because a lesser amount of good memory will be wasted as a result of being in the same MU as a defect; however, a larger mapping table 40 will be required for the address translation, which increases the cost of the controller 22, has a slower access time than a small table, and occupies more silicon area than a small table. Other means for initialising the mapping table 40 may be used, such as storing an image of the defective locations in non-volatile memory.
In other embodiments of the present invention, an MU is a small band or range of rows or columns, so that only a small band of rows or columns which actually include the defect is treated as defective.
In other embodiments, the non-volatile memory storage on each module may store bits in a different order to that described above; for example, the most significant bit may correspond to the bottom left bank in memory.
In other embodiments, when mapping circuitry 36b is used, the memory system includes a formatted module 24 which does not have an extra RDRAM device 28c, but uses one of the RDRAM devices 28 to provide the additional (substitute) banks 48c.
This embodiment would have a reduced memory capacity compared with a defect-free module.
In other embodiments, the memory controller may be incorporated within another device which can access memory, such as a CPU or a DMA device; in these embodiments, the implementation of defect-tolerance using remapping would be the same.

Claims

1. A defect-tolerant high bandwidth memory system (20) comprising: a memory controller (22); at least one memory module (24) having at least one memory device (28); a high bandwidth channel (26) for connecting the memory controller to the at least one memory module and for conveying data therebetween; and a non-volatile memory (30) for storing the locations of defective areas of the at least one memory module; where the memory controller is arranged and configured to access the non-volatile memory and to remap physical non-defective areas of memory as a set of continuous logical areas of memory so that for every logical area of memory that may be accessed there is a corresponding non-defective physical area of memory.
2. A memory system according to claim 1, wherein said at least one memory module (24) is a physically detachable entity for releasable connection to a module socket.
3. A memory system according to claim 1, wherein said at least one memory module (24) is incorporated within and is an integral part of a motherboard.
4. A memory system according to any preceding claim, wherein the memory controller (22) is incorporated within another device which can access memory.
5. A memory system according to any preceding claim, wherein said high bandwidth channel (26) has a bandwidth greater than approximately 1 GBytes per second.
6. A memory system according to claim 5, wherein said high bandwidth channel (26) has a bandwidth greater than approximately 1.5 GBytes per second.
7. A memory system according to any preceding claim, wherein the non-volatile memory (30) is included on the at least one memory module.
8. A memory system according to claim 7, wherein for each said memory module (24b), the non-volatile memory is implemented in the form of at least one register (30b) in each said memory device (28) in the module.
9. A memory system according to claim 7, wherein the non-volatile memory is provided as an EEPROM (30) on the at least one memory module (24).
10. A memory system according to any preceding claim, wherein the memory controller (22) is arranged and configured to remap physical non-defective areas of memory as a set of continuous logical areas of memory by using a memory (60) in the form of a look-up table (LUT).
11. A memory system according to claim 10, wherein the LUT is SRAM, each entry in the LUT having the address of a logical area of memory and a corresponding address of a non-defective physical area of memory.
12. A memory system according to any one of claims 1 to 9, wherein the memory controller (22) is arranged and configured to remap physical non-defective areas of memory by using a memory in the form of a content addressable memory (CAM).
13. A memory system according to any preceding claim, wherein said areas of memory are banks (48) of rows and columns.
14. A memory system according to any of claims 1 to 12, wherein each said area is a band of rows or a band of columns.
15. A memory system according to any preceding claim, wherein an additional memory device (28c) is provided on the at least one memory module and the memory controller includes a look-up table (60b) arranged to have logical address entries of either bands of rows (row MU) or bands of columns (column MU), so that for each band of rows and for each band of columns there is a corresponding physical band of rows and band of columns, and any band of rows having a defect is replaced by a similar band of rows from the additional memory device (28c), and any band of columns having a defect is replaced by a similar band of columns from the additional memory device (28c), so that full memory capacity is provided by the memory system.
16. A memory module (24) for connecting to a memory controller (22) to provide a defect-tolerant high bandwidth memory system (20), the module including non-volatile memory (30) for indicating areas of defective memory in the module, whereby the memory controller may access the non-volatile memory to retrieve information relating to the areas of defective memory in the module.
17. A memory module according to claim 16, wherein the module (24) is populated by a plurality of memory devices (28) and said non-volatile memory is in the form of at least one register (30b) on each said memory device populating the module.
18. A memory module according to claim 16, wherein the non-volatile memory is in the form of a programmable read only memory (30) disposed on the module.
19. A method of accessing non-defective memory in a high bandwidth memory system (20) containing defective memory locations, the method comprising the steps of: identifying defective locations within the memory system; constructing a set of continuous sequential logical addresses of areas of memory corresponding to physical addresses of defect-free areas of memory; storing the set of sequential logical addresses and the corresponding physical addresses as a table; whereby, when a logical address is received, the table is accessed using the received logical address, and the corresponding physical address is thereby determined so that a defect-free area of memory is accessed.
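The table-construction method recited in claim 19 can be sketched as follows. This is an illustrative reduction only, assuming bank-granularity areas and Python names of my own; the claim itself does not prescribe a data structure.

```python
def build_translation_table(num_banks, defective):
    """Map a contiguous, sequential run of logical area addresses onto
    the defect-free physical banks, skipping every bank identified as
    defective (the identification step of the claimed method)."""
    table = {}
    logical = 0
    for physical in range(num_banks):
        if physical in defective:
            continue              # defective area: no logical address maps here
        table[logical] = physical # next sequential logical area -> good bank
        logical += 1
    return table

def access(table, logical_addr):
    # When a logical address is received, the table yields the
    # corresponding physical address of a defect-free area.
    return table[logical_addr]
```

With eight banks of which banks 2 and 5 are defective, the six good banks appear to the accessing device as logical areas 0 through 5 with no gaps, which is the continuous logical address space the claim requires.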
PCT/GB1999/000234 1998-01-26 1999-01-22 Defect-tolerant memory system WO1999038075A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU21787/99A AU2178799A (en) 1998-01-26 1999-01-22 Defect-tolerant memory system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9801654.6 1998-01-26
GBGB9801654.6A GB9801654D0 (en) 1998-01-26 1998-01-26 Memory system

Publications (1)

Publication Number Publication Date
WO1999038075A1 (en) 1999-07-29

Family

ID=10825929

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1999/000234 WO1999038075A1 (en) 1998-01-26 1999-01-22 Defect-tolerant memory system

Country Status (3)

Country Link
AU (1) AU2178799A (en)
GB (1) GB9801654D0 (en)
WO (1) WO1999038075A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7890819B2 (en) * 2000-04-13 2011-02-15 Micron Technology, Inc. Method and apparatus for storing failing part locations in a module

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4376300A (en) * 1981-01-02 1983-03-08 Intel Corporation Memory system employing mostly good memories
EP0336435A2 (en) * 1988-04-08 1989-10-11 Wang Laboratories Inc. Memory diagnostic apparatus and method
US5357621A (en) * 1990-09-04 1994-10-18 Hewlett-Packard Company Serial architecture for memory module control
US5638334A (en) * 1990-04-18 1997-06-10 Rambus Inc. Integrated circuit I/O using a high performance bus interface


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FARMWALD M ET AL: "A FAST PATH TO ONE MEMORY", IEEE SPECTRUM, vol. 29, no. 10, 1 October 1992 (1992-10-01), pages 50 - 51, XP000315471 *


Also Published As

Publication number Publication date
GB9801654D0 (en) 1998-03-25
AU2178799A (en) 1999-08-09

Similar Documents

Publication Publication Date Title
US4926314A (en) Method and apparatus for determining available memory size
EP1183606B1 (en) Address mapping in solid state storage device
US5966727A (en) Combination flash memory and dram memory board interleave-bypass memory access method, and memory access device incorporating both the same
US6469945B2 (en) Dynamically configurated storage array with improved data access
EP0549139B1 (en) Programmable memory timing
US4899272A (en) Addressing multiple types of memory devices
CA2116985C (en) Memory system
US4617624A (en) Multiple configuration memory circuit
US6356991B1 (en) Programmable address translation system
US6236602B1 (en) Dynamic configuration of storage arrays
EP0704801B1 (en) Memory architecture for solid state disc
WO1998012637A1 (en) Dynamic spare column replacement memory system
US6041422A (en) Fault tolerant memory system
US5109360A (en) Row/column address interchange for a fault-tolerant memory system
US7398362B1 (en) Programmable interleaving in multiple-bank memories
US6381686B1 (en) Parallel processor comprising multiple sub-banks to which access requests are bypassed from a request queue when corresponding page faults are generated
US6525987B2 (en) Dynamically configured storage array utilizing a split-decoder
US20070038803A1 (en) Transparent SDRAM in an embedded environment
US4916603A (en) Distributed reference and change table for a virtual memory system
US5586300A (en) Flexible addressing memory controller wherein multiple memory modules may be accessed according to comparison of configuration addresses
US6906978B2 (en) Flexible integrated memory
GB2292236A (en) Improved partial memory engine
US6629219B1 (en) Method and apparatus for providing highly programmable memory mapping and improved interleaving
WO1999038075A1 (en) Defect-tolerant memory system
JP3409056B2 (en) Semiconductor storage device and system using the same

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: KR

122 Ep: pct application non-entry in european phase