US20080229026A1 - System and method for concurrently checking availability of data in extending memories - Google Patents
System and method for concurrently checking availability of data in extending memories
- Publication number
- US20080229026A1 (application US 11/724,568)
- Authority
- US
- United States
- Prior art keywords
- tag
- memory
- cache
- predetermined bit
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0811—Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
- G06F12/0846—Cache with multiple tag or data arrays being simultaneously accessible
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0613—Improving I/O performance in relation to throughput
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0864—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using pseudo-associative means, e.g. set-associative or hashing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1021—Hit rate improvement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/45—Caching of specific data in cache memory
- G06F2212/451—Stack data
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L2224/00—Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
- H01L2224/01—Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
- H01L2224/42—Wire connectors; Manufacturing methods related thereto
- H01L2224/47—Structure, shape, material or disposition of the wire connectors after the connecting process
- H01L2224/48—Structure, shape, material or disposition of the wire connectors after the connecting process of an individual wire connector
- H01L2224/4805—Shape
- H01L2224/4809—Loop shape
- H01L2224/48091—Arched
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L2224/00—Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
- H01L2224/01—Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
- H01L2224/42—Wire connectors; Manufacturing methods related thereto
- H01L2224/47—Structure, shape, material or disposition of the wire connectors after the connecting process
- H01L2224/48—Structure, shape, material or disposition of the wire connectors after the connecting process of an individual wire connector
- H01L2224/481—Disposition
- H01L2224/48135—Connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip
- H01L2224/48145—Connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip the bodies being stacked
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L25/00—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof
- H01L25/03—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes
- H01L25/04—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers
- H01L25/065—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00
- H01L25/0652—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00 the devices being arranged next and on each other, i.e. mixed assemblies
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L25/00—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof
- H01L25/03—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes
- H01L25/04—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers
- H01L25/065—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00
- H01L25/0655—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00 the devices being arranged next to each other
-
- H—ELECTRICITY
- H01—ELECTRIC ELEMENTS
- H01L—SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
- H01L25/00—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof
- H01L25/03—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes
- H01L25/04—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers
- H01L25/065—Assemblies consisting of a plurality of individual semiconductor or other solid state devices ; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00
- H01L25/0657—Stacked arrangements of devices
Definitions
- the present invention relates generally to computer memory architectures, and, more particularly, to a system and method for extending memories in stacked chips with multicore microprocessors.
- SiP: system-in-package
- IC: integrated circuit
- FIGS. 1A and 1B illustrate such SiP devices.
- Referring to FIG. 1A, there are two core dies 110 and 120 mounted on top of a package substrate 100.
- the core dies contain processing units as well as memories serving as Level 1 caches to the processing units.
- an additional SiP memory 130 is also mounted to serve as Level 2 cache to the dual processing units or cores.
- Referring to FIG. 1B, there is another memory die 140 mounted on the same layer as the dual cores 110 and 120.
- Since the memory die 140 is located closer to the dual cores than the SiP memory 130, the memory die 140 may serve as a Level 2 cache, and then the SiP memory 130 may serve as a Level 3 cache.
- FIG. 2 shows how a microprocessor executes data.
- a memory hierarchy 200 includes a hard drive 210 , a main memory 220 , Level 2 caches 230 , Level 1 caches 242 and a register file 244 , which is closest to an execution unit 246 (Arithmetic-Logic Unit or ALU, for example).
- the main memory 220 is typically comprised of dynamic random access memory (DRAM).
- the caches 230 and 242 are smaller, faster memories and usually made of static random access memory (SRAM) that store copies of the data from the most frequently used main memory locations.
- Level 1 cache 242 , the register file 244 and the execution unit 246 reside usually in the same central processing unit (CPU) die 240 .
- Data are fetched through the memory hierarchy 200 from the hard drive 210 , the main memory 220 , the caches 230 and 242 , and the register file 244 to the execution unit 246 for processing.
- Each storage level tends to hold a subset of the data in the next storage device farther from the execution unit 246. The farther a storage device is from the execution unit 246, the larger its capacity, the slower its speed, and the narrower its bandwidth.
- This pyramid scheme trades speed against capacity based on temporal and spatial locality: data blocks used now are likely to be used again soon, and data blocks near those used here are likely to be used next.
- The memory hierarchy 200 applies to instructions as well as data in the caches, main memory and disk storage. At the lowest cache level, instruction and data caches tend to be separate entities (separate caches); at other cache levels, instructions and data are stored in the same storage (a unified cache).
- the memory hierarchy 200 is a commonly used technique in the computer art to achieve high performance while reducing costs.
- Cache memories work as temporary storage. When the processing unit 246 wishes to read from or write to a location in the main memory 220, it first checks whether that memory location is in the Level 1 cache 242. This is done by comparing the address of the memory location to all tags stored in the Level 1 cache 242 that might contain that address. If the processing unit 246 finds the memory location in the cache, the data corresponding to the address is accessed directly from the Level 1 cache 242, and a cache hit has occurred. Otherwise the data is not in the Level 1 cache 242, and it is a cache miss.
- SiP extends computer cache capacity; however, with the aforementioned hierarchical memory management approach, the Level 2 cache 230 cannot be checked simultaneously with the Level 1 cache 242. The execution unit 246 can only check the Level 1 cache 242 directly; for data elsewhere to be accessed, it has to be transferred down to the lower memories in the hierarchy first. This lowers memory management efficiency.
- This invention discloses an extended memory comprising a first tag RAM for storing one or more tags corresponding to data stored in a first storage module, and a second tag RAM for storing one or more tags corresponding to data stored in a second storage module, wherein the first and second storage modules are separated and independent memory units, the numbers of bits in the first and second tag RAMs differ, and an address is concurrently checked against both the first and second tag RAMs using a first predetermined bit field of the address for checking against a first tag from the first tag RAM and using a second predetermined bit field of the address for checking against a second tag from the second tag RAM.
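The core idea above — one address checked concurrently against two tag RAMs whose predetermined bit fields differ — can be illustrated with a short sketch (hypothetical Python, not part of the patent; the 20/9/3 and 16/13/3 splits are the ones used in the embodiment described later):

```python
def split(addr: int, tag_bits: int, index_bits: int, offset_bits: int):
    """Divide a physical address into (tag, index, offset) fields,
    with the tag in the most significant bits."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

addr = 0x12345678
# First tag RAM: 20 tag bits, 9 index bits, 3 offset bits.
tag1, idx1, _ = split(addr, 20, 9, 3)    # tag1 = 0x12345, idx1 = 207
# Second tag RAM: 16 tag bits, 13 index bits, 3 offset bits.
tag2, idx2, _ = split(addr, 16, 13, 3)   # tag2 = 0x1234, idx2 = 2767
```

Because the index widths differ, the same address selects different tag lines holding different tags in the two tag RAMs, so both can be probed at the same time.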
- FIGS. 1A and 1B illustrate cache memory being extended in conventional system-in-package (SiP).
- FIG. 2 illustrates a conventional memory hierarchy
- FIG. 3 is a block diagram illustrating a conventional cache accessing mechanism.
- FIG. 4 is a block diagram illustrating a cache memory management system that can access two caches concurrently according to one embodiment of the present invention.
- FIGS. 5A and 5B are block diagrams illustrating various ways of stacking shared caches for multicore systems.
- FIG. 6 is a flow chart illustrating a method for concurrently checking data availability in two caches according to another embodiment of the present invention.
- the present invention discloses a memory management system and method that can simultaneously check multiple caches either in the same level or in different levels, and hence directly accesses data stored in the caches.
- FIG. 3 is a block diagram illustrating a conventional cache accessing mechanism. Suppose a computer's physical address 302 has 32 bits, divided into 20 tag bits 303, 9 index bits 304 and 3 offset bits 305.
- a cache 308 has a tag random access memory (RAM) 310 , and a data RAM 315 , where actual data are stored.
- The tag RAM 310 has a plurality of tag lines 322, each storing a tag 324 along with its attribute bits 326 for cache coherence operations.
- the attribute bits 326 may contain 4 bits, i.e., a modified bit, an exclusive bit, a share bit and an invalidate bit.
- the 9 index bits 304 are used to select a tag line 322 in the tag RAM 310 .
- First, the attribute bits 326 of the selected tag line are checked by a block 330.
- The modified bit may indicate whether this line of data has been modified, and determines whether the line needs updating when it is swapped back to the hard disk. Any match result may be ignored if the invalidate bit is set.
- The block 330 may be implemented as a multi-bit comparator circuit. After all the attribute bits are checked, the tag portion of the selected line may be compared with the tag bits 303 of the physical address 302, also at the block 330. If the comparison produces a match, then the chunk of data the physical address 302 is intended to address is stored in the cache 308 and can be fetched directly, i.e., a cache hit has occurred.
- The cache 308 illustrated in FIG. 3 has two sets of identical tag RAMs 310[0:1] and data RAMs 315[0:1], as well as two identical blocks 330[0:1], in a two-way set-associative cache configuration. Both tag RAMs 310[0:1] are checked against a physical address at the same time. Since all data stored in the cache 308 have unique locations and their tags are unique, only one block 330 can produce a match at a time. If the block 330[0] produces a match, then a signal Hit0 may be set, which may select data from the data RAM 315[0] to output from a multiplexer 335. Similarly, if the block 330[1] produces a match, then a signal Hit1 may be set, which may select data from the data RAM 315[1] to output from the multiplexer 335.
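A behavioral sketch of this two-way lookup may help fix the idea (illustrative Python with hypothetical names; the patent describes a hardware circuit, not software). Each way's tag RAM is indexed by the same 9-bit field, and a way hits only if its stored tag matches and the line is not invalidated:

```python
class TagLine:
    """One tag-RAM line: a stored tag plus, in this simplified model,
    just the invalidate attribute bit; a set invalidate bit voids
    any tag match."""
    def __init__(self, tag=None, invalid=True):
        self.tag = tag
        self.invalid = invalid

def two_way_lookup(ways, addr, index_bits=9, offset_bits=3):
    """Check both ways at once, as the blocks 330[0:1] do.
    Returns (Hit0, Hit1); at most one is True, since every datum
    has a unique location in the cache."""
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tuple((not way[index].invalid) and way[index].tag == tag
                 for way in ways)
```

On a hit, the set signal would steer the multiplexer 335 to output data from the matching data RAM.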
- FIG. 4 is a block diagram illustrating a cache memory management system 400 that concurrently accesses two caches according to one embodiment of the present invention.
- Both a first cache 410 and a second cache 420 may be implemented as the cache 308 shown in FIG. 3 .
- One physical address 402 is checked concurrently against both caches 410 and 420 , but the bit fields of the physical address 402 are divided differently for different caches.
- tag bits 403 and index bits 404 for the first cache 410 are 20 bits and 9 bits, respectively
- tag bits 405 and index bits 406 for the second cache 420 are 16 bits and 13 bits, respectively.
- Offset bits for both the first and second caches are the same and both are 3 bits.
- With these different field divisions, the same physical address can reach completely different tag-RAM lines holding totally different tags; in this way, the two caches 410 and 420 can be checked concurrently for data availability by the single physical address 402.
- Since both the first and second caches 410 and 420 are implemented with two-way set association, two pairs of hit signals, Hit0[1:2] and Hit1[1:2], may be produced between them and sent to a control logic circuit 430, which controls a multiplexer 440. If either of the signals Hit0[1] and Hit1[1] is set, the multiplexer 440 will output a chunk of line[1] data from the first cache 410. Similarly, if either of the signals Hit0[2] and Hit1[2] is set, the multiplexer 440 will output a chunk of line[2] data from the second cache 420.
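The role of the control logic 430 and multiplexer 440 can be sketched as follows (hypothetical Python; the hit signals would come from the comparator blocks of the two caches):

```python
def select_output(hit0, hit1, line1, line2):
    """Model of control logic 430 plus multiplexer 440.
    hit0[k] and hit1[k] are the per-way hit signals of cache k
    (k = 1 for the first cache 410, k = 2 for the second cache 420).
    Output the line data of whichever cache hit, or None on a miss
    in both."""
    if hit0[1] or hit1[1]:
        return line1          # line[1] data from the first cache
    if hit0[2] or hit1[2]:
        return line2          # line[2] data from the second cache
    return None               # miss in both caches
```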
- the first cache 410 may be a cache internal to a core chip
- the second cache 420 may be a cache external to the core chip.
- the external cache 420 may employ a signal bit, EScache_enable (external shared cache enable), to turn on the external cache and its tag RAM access when the signal is set, and to ignore the external cache when this signal bit is not set.
- LFSR: Linear Feedback Shift Register
- The internal cache is selected when this bit (e.g., one generated by an LFSR) is set, and the external cache when it is not.
- Another embodiment is to use a portion of the physical address to determine whether the internal or the external cache is accessed. For example, according to the physical address, the lowest 8 KB in a page may be assigned to the internal cache, and the rest to the external cache.
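That address-partition policy, combined with the EScache_enable bit described above, might look like this (hypothetical sketch; the patent does not specify the page size, so 64 KB is assumed here purely for illustration):

```python
PAGE_SIZE = 64 * 1024       # assumed page size (not given in the text)
INTERNAL_REGION = 8 * 1024  # lowest 8 KB of each page -> internal cache

def use_internal_cache(addr: int, escache_enable: bool) -> bool:
    """Decide whether an access goes to the internal cache or the
    external shared cache. When EScache_enable is clear, the external
    cache is ignored and every access stays internal."""
    if not escache_enable:
        return True
    return (addr % PAGE_SIZE) < INTERNAL_REGION
```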
- A stacked cache may be slower than an on-die cache and therefore may incur longer latency than the on-die cache.
- The control logic for the stacked caches is better kept on die, while the stacked memory only provides additional data storage.
- The tag RAM for the stacked memory may or may not be on die, though it makes more sense for it to remain on die, given the amount of logic involved in cache operations. With this concurrent accessing method, there is more freedom in how a SiP chip is built.
- FIGS. 5A and 5B are block diagrams illustrating various ways of stacking shared caches for multicore systems.
- FIGS. 5A and 5B present only dual core systems. One having skill in the art would recognize that the present invention is not limited by the number of cores in a SiP system; in fact, the number of shared caches in the SiP system is not limited either.
- a stacked SiP 500 contains two dies, a dual core die 505 and a cache die 506 .
- the dual core die 505 has dual cores 512 and 514 , dual Level 1 caches 522 and 524 for the dual cores 512 and 514 , respectively.
- the cache die 506 serves as an extended Level 2 cache for the dual cores 512 and 514 .
- Level 1 and Level 2 caches are typically on the same die as the core central processing units (CPUs). Stacked dies are more applicable to Level 3 cache.
- a stacked SiP 550 also contains two dies, a dual core die 505 and a cache die 556 which serves as a shared Level 3 cache for the dual cores 512 and 514 . No matter how these caches are organized, according to the present invention described above, all the shared caches may be accessed concurrently.
- FIG. 6 is a flow chart illustrating a method for concurrently checking data availability in two caches according to another embodiment of the present invention.
- The method begins in step 610, where a processing unit selects a first tag line from a first tag RAM of a first cache, using a first predetermined bit field of a physical address as the address of the first tag.
- In step 620, concurrent with step 610, the processing unit selects a second tag line from a second tag RAM of a second cache, using a second predetermined bit field of the physical address as the address of the second tag. The first and second predetermined bit fields therefore serve as indexes into the tag RAMs, and they may have different numbers of bits.
- In step 630, the processing unit checks a third predetermined bit field of the physical address against the first tag line.
- In step 640, concurrent with step 630, the processing unit checks a fourth predetermined bit field of the physical address against the second tag line.
- The third and fourth predetermined bit fields are also called tag fields and may likewise have different numbers of bits.
- The processing unit fetches the chunk of data the physical address is intended to address from a first memory module when the third predetermined bit field matches the first tag line; the first memory module is associated with the first tag RAM.
- Likewise, the processing unit fetches the chunk of data from a second memory module when the fourth predetermined bit field matches the second tag line; the second memory module is associated with the second tag RAM.
- The first and second memory modules may be two separate and independent memory units.
- the first memory module may be a Level 1 or Level 2 cache
- The second memory module may be a Level 3 cache in a stacked die. Either way, data availability in the first and second memory modules may be checked concurrently, hence increasing data access speed.
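The flow above can be put together in one end-to-end sketch (illustrative Python; the dict-based tag RAMs and memory modules are stand-ins for the hardware structures, and the default field splits are taken from the embodiment of FIG. 4):

```python
def check_availability(addr, tag_ram1, mem1, tag_ram2, mem2,
                       split1=(9, 3), split2=(13, 3)):
    """Concurrently index both tag RAMs with different predetermined
    bit fields (steps 610/620), compare the tag fields against the
    selected tag lines (steps 630/640), and fetch from whichever
    memory module matches. Each split is (index_bits, offset_bits)."""
    def index_and_tag(index_bits, offset_bits):
        index = (addr >> offset_bits) & ((1 << index_bits) - 1)
        tag = addr >> (offset_bits + index_bits)
        return index, tag
    idx1, tag1 = index_and_tag(*split1)
    idx2, tag2 = index_and_tag(*split2)
    if tag_ram1.get(idx1) == tag1:
        return mem1[idx1]      # hit in the first memory module
    if tag_ram2.get(idx2) == tag2:
        return mem2[idx2]      # hit in the second memory module
    return None                # miss in both modules
```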
Abstract
This invention discloses an extended memory comprising a first tag RAM for storing one or more tags corresponding to data stored in a first storage module, and a second tag RAM for storing one or more tags corresponding to data stored in a second storage module, wherein the first and second storage modules are separated and independent memory units, the numbers of bits in the first and second tag RAMs differ, and an address is concurrently checked against both the first and second tag RAMs using a first predetermined bit field of the address for checking against a first tag from the first tag RAM and using a second predetermined bit field of the address for checking against a second tag from the second tag RAM.
Description
- The present invention relates generally to computer memory architectures, and, more particularly, to a system and method for extending memories in stacked chips with multicore microprocessors.
- A recent trend to pack more functions into a small form factor is the so-called system-in-package (SiP) technology, which encloses a number of integrated circuit (IC) dies in a single package or module. The dies may be stacked vertically or placed horizontally alongside one another inside the package. They are internally connected by fine wires buried in the package, or joined by solder bumps through flip-chip technology.
- FIGS. 1A and 1B illustrate such SiP devices. Referring to FIG. 1A, there are two core dies 110 and 120 mounted on top of a package substrate 100. The core dies contain processing units as well as memories serving as Level 1 caches to the processing units. On top of the core dies 110 and 120, an additional SiP memory 130 is also mounted to serve as a Level 2 cache to the dual processing units, or cores.
- Referring to FIG. 1B, besides the dual cores 110 and 120 and the SiP memory 130, there is another memory die 140 mounted on the same layer as the dual cores 110 and 120. Since the memory die 140 is located closer to the dual cores 110 and 120 than the SiP memory 130, the memory die 140 may serve as a Level 2 cache, and then the SiP memory 130 may serve as a Level 3 cache.
- These SiPs can greatly extend cache capacity in a computer system. But with added levels of caches, memory management becomes more complicated.
- FIG. 2 shows how a microprocessor executes data. In this computer system, a memory hierarchy 200 includes a hard drive 210, a main memory 220, Level 2 caches 230, Level 1 caches 242 and a register file 244, which is closest to an execution unit 246 (Arithmetic-Logic Unit or ALU, for example). The main memory 220 is typically comprised of dynamic random access memory (DRAM). The caches 230 and 242 are smaller, faster memories, usually made of static random access memory (SRAM), that store copies of the data from the most frequently used main memory locations. The Level 1 cache 242, the register file 244 and the execution unit 246 usually reside in the same central processing unit (CPU) die 240. Data are fetched through the memory hierarchy 200 from the hard drive 210, the main memory 220, the caches 230 and 242, and the register file 244 to the execution unit 246 for processing. Each storage level tends to hold a subset of the data in the next storage device farther from the execution unit 246. The farther a storage device is from the execution unit 246, the larger its capacity, the slower its speed, and the narrower its bandwidth. This pyramid scheme trades speed against capacity based on temporal and spatial locality: data blocks used now are likely to be used again soon, and data blocks near those used here are likely to be used next. The memory hierarchy 200 applies to instructions as well as data in the caches, main memory and disk storage. At the lowest cache level, instruction and data caches tend to be separate entities (separate caches); at other cache levels, instructions and data are stored in the same storage (a unified cache). The memory hierarchy 200 is a commonly used technique in the computer art to achieve high performance while reducing costs.
- Cache memories work as temporary storage. When the processing unit 246 wishes to read from or write to a location in the main memory 220, it first checks whether that memory location is in the Level 1 cache 242. This is done by comparing the address of the memory location to all tags stored in the Level 1 cache 242 that might contain that address. If the processing unit 246 finds the memory location in the cache, the data corresponding to the address is accessed directly from the Level 1 cache 242, and a cache hit has occurred. Otherwise the data is not in the Level 1 cache 242, and it is a cache miss.
- SiP extends computer cache capacity; however, with the aforementioned hierarchical memory management approach, the Level 2 cache 230 cannot be checked simultaneously with the Level 1 cache 242. The execution unit 246 can only check the Level 1 cache 242 directly; for data elsewhere to be accessed, it has to be transferred down to the lower memories in the hierarchy first. This lowers memory management efficiency.
- As such, what is desired is a memory management system and method that can simultaneously check multiple memories either in the same or different levels, and hence directly access data stored in those memories.
- This invention discloses an extended memory comprising a first tag RAM for storing one or more tags corresponding to data stored in a first storage module, and a second tag RAM for storing one or more tags corresponding to data stored in a second storage module, wherein the first and second storage modules are separated and independent memory units, the numbers of bits in the first and second tag RAMs differ, and an address is concurrently checked against both the first and second tag RAMs using a first predetermined bit field of the address for checking against a first tag from the first tag RAM and using a second predetermined bit field of the address for checking against a second tag from the second tag RAM.
- The construction and method of operation of the invention, however, together with additional objectives and advantages thereof will be best understood from the following description of specific embodiments when read in connection with the accompanying drawings.
- The drawings accompanying and forming part of this specification are included to depict certain aspects of the invention. A clearer conception of the invention, and of the components and operation of systems provided with the invention, will become more readily apparent by referring to the exemplary, and therefore non-limiting, embodiments illustrated in the drawings, wherein like reference numbers (if they occur in more than one view) designate the same elements. The invention may be better understood by reference to one or more of these drawings in combination with the description presented herein. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale.
-
FIGS. 1A and 1B illustrate cache memory being extended in conventional system-in-package (SiP). -
FIG. 2 illustrates a conventional memory hierarchy. -
FIG. 3 is a block diagram illustrating a conventional cache accessing mechanism. -
FIG. 4 is a block diagram illustrating a cache memory management system that can access two caches concurrently according to one embodiment of the present invention. -
FIGS. 5A and 5B are block diagrams illustrating various ways of stacking shared caches for multicore systems. -
FIG. 6 is a flow chart illustrating a method for concurrently checking data availability in two caches according to another embodiment of the present invention. - The present invention discloses a memory management system and method that can simultaneously check multiple caches, either in the same level or in different levels, and hence directly access data stored in the caches.
-
FIG. 3 is a block diagram illustrating a conventional cache accessing mechanism. Suppose a computer's physical address 302 has 32 bits, divided into 20 tag bits 303, 9 index bits 304 and 3 offset bits 305. A cache 308 has a tag random access memory (RAM) 310 and a data RAM 315, where the actual data are stored. The tag RAM 310 has a plurality of tag lines 322, each storing a tag 324 along with its attribute bits 326 for cache coherence operations. The attribute bits 326 may contain 4 bits, i.e., a modified bit, an exclusive bit, a share bit and an invalidate bit. The offset 305 has 3 bits, indicating that a cache line 320 has 8 bytes (2^3=8). When data is stored in the cache 308, its corresponding tag is stored in the tag RAM 310. The index bits 304 in the physical address 302 are used to address the tag lines 322 of the tag RAM 310; 9 index bits can address a tag RAM with 512 lines (2^9=512). - When the
physical address 302 is checked against the cache 308, the 9 index bits 304 are used to select a tag line 322 in the tag RAM 310. The attribute bits 326 of the selected tag line are first checked by a block 330. The modified bit may indicate whether this line of data has been modified, which determines whether the line needs to be updated when it is swapped back to a hard disk. Any match result may be ignored if the invalidate bit is set. The block 330 may be implemented as a multi-bit comparator circuit. After all the attribute bits are checked, the tag portion of the selected line is compared with the tag bits 303 of the physical address 302, also at the block 330. If the comparison produces a match, the chunk of data the physical address 302 intends to address is stored in the cache 308 and can be fetched directly, i.e., a cache hit has occurred. - In fact, the
cache 308 illustrated in FIG. 3 has two sets of identical tag RAMs 310[0:1] and data RAMs 315[0:1], as well as two identical blocks 330[0:1], as in a two-way set-associative cache configuration. Both tag RAMs 310[0:1] are checked against a physical address at the same time. Since all data stored in the cache 308 have unique locations and their tags are unique, only one block 330 can produce a match at a time. If the block 330[0] produces a match, a signal Hit0 may be set, which may select data from the data RAM 315[0] to output from a multiplexer 335. Similarly, if the block 330[1] produces a match, a signal Hit1 may be set, which may select data from the data RAM 315[1] to output from the multiplexer 335. -
FIG. 4 is a block diagram illustrating a cache memory management system 400 that concurrently accesses two caches according to one embodiment of the present invention. Both a first cache 410 and a second cache 420 may be implemented as the cache 308 shown in FIG. 3 . One physical address 402 is checked concurrently against both caches 410 and 420. The bits of the physical address 402 are divided differently for the different caches. For illustration purposes, tag bits 403 and index bits 404 for the first cache 410 are 20 bits and 9 bits, respectively, while tag bits 405 and index bits 406 for the second cache 420 are 16 bits and 13 bits, respectively. The offset bits for the first and second caches are the same, both being 3 bits. Then a tag RAM (not shown) for the first cache 410 may have 1024 (2^9*2=1024) lines for a two-way set association, and a tag RAM (also not shown) for the second cache 420 may have 16K (2^13*2=16K) lines for a two-way set association. Since the tag RAMs are relatively small, both tag RAMs for the first and second caches may actually reside in the same core chip for faster checking. - Because different bit fields of the
physical address 402 are used by the different caches 410 and 420, data availability in the caches 410 and 420 can be checked independently and concurrently, each cache using its own bit fields of the physical address 402. - As both the first and
second caches 410 and 420 may generate hit signals, the hit signals are coupled to a control logic circuit 430, which controls a multiplexer 440. If one of the signals Hit0[1] and Hit1[1] is set, the multiplexer 440 will output a chunk of line[1] data from the first cache 410. Similarly, if one of the signals Hit0[2] and Hit1[2] is set, the multiplexer 440 will output a chunk of line[2] data from the second cache 420. - Although only two-way set association is described here, one having skill in the art would recognize that any other set associativity may work with the present invention. - Referring to
FIG. 4 , the first cache 410 may be a cache internal to a core chip, and the second cache 420 may be a cache external to the core chip. The external cache 420 may employ a signal bit, EScache_enable (external shared cache enable), to turn on the external cache and its tag RAM access when the signal is set, and to ignore the external cache when this signal bit is not set. - There should be internal/external cache placement algorithms to prevent both
caches 410 and 420 from storing the same data. - Since off-chip memories have longer interconnects to a mother die, a stacked cache may be slower than an on-die cache. Therefore, the stacked cache may require a longer latency than the on-die cache.
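The placement algorithm itself is left unspecified above. One possible policy, offered here purely as an illustrative assumption and not as the disclosure's method, is exclusive placement, where a fill into one cache first evicts any copy held by the other:

```python
def exclusive_fill(addr, target, other):
    """Hypothetical exclusive-placement policy: drop any copy of `addr`
    from the other cache before filling the target, so the same data is
    never held in both caches (each cache modeled as a set of addresses)."""
    other.discard(addr)
    target.add(addr)

internal_cache, external_cache = set(), set()
exclusive_fill(0x1000, internal_cache, external_cache)
exclusive_fill(0x1000, external_cache, internal_cache)  # the line migrates
assert 0x1000 in external_cache and 0x1000 not in internal_cache
```

A real implementation would operate on cache lines and victim selection in hardware; the sketch only shows the invariant that no address is resident in both caches at once.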
- The control logic for stacked caches is better kept on die, while the stacked memory provides only additional data storage. The tag for the stacked memory may or may not be on die, though it makes more sense for it to remain on die because of the amount of logic involved in cache operations. With this concurrent accessing method, there is more freedom in how a SiP chip can be built.
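As a concrete check of the FIG. 4 arithmetic, the two address splits (20/9/3 bits for the first cache 410, 16/13/3 bits for the second cache 420) and the resulting tag RAM line counts can be sketched in Python (an illustrative model only; the function name is not from the patent):

```python
def split_address(addr, tag_bits, index_bits, offset_bits):
    """Split a 32-bit physical address into its (tag, index, offset) fields."""
    assert tag_bits + index_bits + offset_bits == 32
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

addr = 0x12345678
tag1, idx1, off1 = split_address(addr, 20, 9, 3)   # first cache 410
tag2, idx2, off2 = split_address(addr, 16, 13, 3)  # second cache 420

# The three fields always reassemble into the original address.
assert (tag1 << (9 + 3)) | (idx1 << 3) | off1 == addr

# Tag RAM line counts for two-way set association:
lines1 = (1 << 9) * 2   # 2^9 sets * 2 ways = 1024 lines
lines2 = (1 << 13) * 2  # 2^13 sets * 2 ways = 16K lines
```

Because the index widths differ, the same address selects line 207 in the first tag RAM but line 2767 in the second, which is why the two checks are independent of each other.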
-
FIGS. 5A and 5B are block diagrams illustrating various ways of stacking shared caches for multicore systems. For illustration purposes, FIGS. 5A and 5B present only dual core systems. One having skill in the art would recognize that the present invention is not limited by the number of cores in a SiP system; likewise, the number of shared caches in the SiP system is not limiting. - Referring to
FIG. 5A , a stacked SiP 500 contains two dies, a dual core die 505 and a cache die 506. The dual core die 505 has dual cores with dual Level 1 caches, one for each core, and the cache die 506 serves as an extended Level 2 cache for the dual cores. Level 1 and Level 2 caches are typically on the same die as the core central processing units (CPUs); stacked dies are more applicable to a Level 3 cache. - Referring to
FIG. 5B , a stacked SiP 550 also contains two dies, a dual core die 505 and a cache die 556, which serves as a shared Level 3 cache for the dual cores. -
FIG. 6 is a flow chart illustrating a method for concurrently checking data availability in two caches according to another embodiment of the present invention. The method begins in step 610, where a processing unit selects a first tag line from a first tag RAM of a first cache, using a first predetermined bit field of a physical address as an address of the first tag. In step 620, the processing unit concurrently selects a second tag line from a second tag RAM of a second cache, using a second predetermined bit field of the physical address as an address of the second tag. The first and second predetermined bit fields therefore serve as indexes of the tag RAMs, and they may have different numbers of bits. In step 630, the processing unit checks a third predetermined bit field of the physical address against the first tag line. In step 640, concurrently with step 630, the processing unit checks a fourth predetermined bit field of the physical address against the second tag line. The third and fourth predetermined bit fields are also called tag fields and may likewise have different numbers of bits. Then, as shown in step 650, the processing unit fetches the chunk of data the physical address intends to address from a first memory module when the third predetermined bit field matches the first tag line, wherein the first memory module is associated with the first tag RAM. Alternatively, the processing unit fetches the chunk of data from a second memory module when the fourth predetermined bit field matches the second tag line, wherein the second memory module is associated with the second tag RAM. According to this embodiment of the present invention, the first and second memory modules may be two separated and independent memory units. For instance, the first memory module may be a Level 1 or Level 2 cache, and the second memory module may be a Level 3 cache in a stacked die.
Nevertheless, data availability in the first and second memory modules may be checked concurrently, hence increasing data access speeds. - Although the present disclosure uses cache memories as an embodiment of the present invention, one having skill in the art would appreciate that the present invention can be applied to memory systems where multiple modules exist and tags are used to keep track of the data stored in the modules.
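The method of FIG. 6 can be modeled in a few lines of Python. This is an illustrative software sketch only: the names are assumptions, the patent describes hardware, and the two-way set association of the embodiments is simplified here to direct mapping:

```python
class TagRAM:
    """Minimal model of one tag RAM: maps an index (a bit field of the
    physical address) to the tag stored at that line. One-way (direct
    mapped) for brevity."""

    def __init__(self, index_bits, offset_bits=3):
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        self.lines = {}  # index -> stored tag

    def fields(self, addr):
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return index, tag

    def fill(self, addr):
        index, tag = self.fields(addr)
        self.lines[index] = tag

    def check(self, addr):
        index, tag = self.fields(addr)  # select the line, then compare tags
        return self.lines.get(index) == tag


def concurrent_check(addr, tag_ram1, tag_ram2):
    """Steps 610-650: both tag RAMs are indexed and compared with their own
    bit fields of the same address; a hit names the module to fetch from."""
    hit1 = tag_ram1.check(addr)  # steps 610 and 630
    hit2 = tag_ram2.check(addr)  # steps 620 and 640 (concurrent in hardware)
    if hit1:
        return "module1"  # step 650: fetch from the first memory module
    if hit2:
        return "module2"  # fetch from the second memory module
    return "miss"


ram1 = TagRAM(index_bits=9)    # e.g., an on-die Level 1 or Level 2 cache
ram2 = TagRAM(index_bits=13)   # e.g., a stacked Level 3 cache
ram2.fill(0xCAFE0008)
assert concurrent_check(0xCAFE0008, ram1, ram2) == "module2"
assert concurrent_check(0xDEAD0000, ram1, ram2) == "miss"
```

Because each `TagRAM` derives its own index and tag fields from the one address, the two checks never share state, mirroring the independence that lets the hardware run them in the same cycle.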
- The above illustration provides many different embodiments, or examples, for implementing different features of the invention. Specific embodiments of components and processes are described to help clarify the invention. These are, of course, merely embodiments and are not intended to limit the invention from that described in the claims.
- Although the invention is illustrated and described herein as embodied in one or more specific examples, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made therein without departing from the spirit of the invention and within the scope and range of equivalents of the claims. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the invention, as set forth in the following claims.
Claims (20)
1. A memory system for a multi-core computer system, the memory system comprising:
a first tag random access memory (RAM) for storing one or more tags corresponding to data stored in a first storage module; and
a second tag RAM for storing one or more tags corresponding to data stored in a second storage module,
wherein the first and second storage modules are separated and independent memory units, and
wherein the numbers of bits in the first and second tag RAMs differ, and
wherein a number is concurrently checked against both the first and second tag RAMs using a first predetermined bit field of the number for checking against a first tag from the first tag RAM and using a second predetermined bit field of the number for checking against a second tag from the second tag RAM.
2. The memory system of claim 1 , wherein
the first tag is addressed by a third predetermined bit field of the number; and
the second tag is addressed by a fourth predetermined bit field of the number,
wherein the numbers of bits in the third and fourth predetermined bit fields differ.
3. The memory system of claim 1 , wherein the number is an address of a chunk of data intended to be accessed.
4. The memory system of claim 1 , wherein the first and second predetermined bit fields of the number have one or more overlapping bits.
5. The memory system of claim 1 , wherein the first storage module is a stacked cache memory.
6. The memory system of claim 1 , wherein both the first and second storage modules are stacked cache memories.
7. The memory system of claim 6 , wherein the first and second storage modules are in different stacked dies.
8. The memory system of claim 1 , wherein the first storage module is a Level 1 or Level 2 cache and the second storage module is a Level 3 cache.
9. The memory system of claim 1 , wherein the first and second tag RAMs reside in the same memory die.
10. The memory system of claim 1 , wherein the first or second tag RAM further comprises one or more attribute bits for memory coherent operations.
11. A cache memory system for a multi-core computer system, the cache memory system comprising:
a first tag random access memory (RAM) for storing one or more tags corresponding to data stored in a first cache memory; and
a second tag RAM for storing one or more tags corresponding to data stored in a second cache memory,
wherein the first and second cache memories are separated and are independent memory units, and
wherein the numbers of bits in the first and second tag RAMs differ, and
wherein a data address is concurrently checked against both the first and second tag RAMs using a first predetermined bit field of the data address for checking against a first tag from the first tag RAM and using a second predetermined bit field of the data address for checking against a second tag from the second tag RAM.
12. The memory system of claim 11 , wherein
the first tag is addressed by a third predetermined bit field of the data address; and
the second tag is addressed by a fourth predetermined bit field of the data address,
wherein the numbers of bits in the third and fourth predetermined bit fields differ.
13. The memory system of claim 11 , wherein the first cache memory is a Level 1 cache and the second cache memory is either a Level 2 or Level 3 cache.
14. The memory system of claim 11 , wherein the first and second tag RAMs reside in the same memory die.
15. The memory system of claim 11 , wherein the first and second cache memories are in different stacked dies.
16. A method for concurrently checking availability of data in extended memories of a multi-core computer system, the method comprising:
addressing a first tag line in a first tag random access memory (RAM) by a first predetermined bit field of a data address;
addressing a second tag line in a second tag RAM by a second predetermined bit field of the data address;
comparing a third predetermined bit field of the data address against the first tag line;
comparing a fourth predetermined bit field of the data address against the second tag line;
wherein a chunk of data the data address intends to address is stored in a first memory module, with which the first tag RAM is associated, when the third predetermined bit field matches the first tag line, and the chunk of data is stored in a second memory module when the fourth predetermined bit field matches the second tag line, and
wherein the first and second memory modules are separated and are independent memory units.
17. The method of claim 16 , wherein the numbers of bits in the third and fourth predetermined bit fields differ.
18. The method of claim 16 , wherein the first and second tag RAMs reside in the same memory die.
19. The method of claim 16 , wherein the first and second memory modules are in different stacked dies.
20. The method of claim 16 further comprising accessing one or more attribute bits in the first or second tag RAM for memory coherent operations.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/724,568 US20080229026A1 (en) | 2007-03-15 | 2007-03-15 | System and method for concurrently checking availability of data in extending memories |
US14/835,988 US10310976B2 (en) | 2007-03-15 | 2015-08-26 | System and method for concurrently checking availability of data in extending memories |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/724,568 US20080229026A1 (en) | 2007-03-15 | 2007-03-15 | System and method for concurrently checking availability of data in extending memories |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/835,988 Continuation US10310976B2 (en) | 2007-03-15 | 2015-08-26 | System and method for concurrently checking availability of data in extending memories |
Publications (1)
Publication Number | Publication Date |
---|---|
US20080229026A1 true US20080229026A1 (en) | 2008-09-18 |
Family
ID=39763836
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/724,568 Abandoned US20080229026A1 (en) | 2007-03-15 | 2007-03-15 | System and method for concurrently checking availability of data in extending memories |
US14/835,988 Active 2027-05-14 US10310976B2 (en) | 2007-03-15 | 2015-08-26 | System and method for concurrently checking availability of data in extending memories |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/835,988 Active 2027-05-14 US10310976B2 (en) | 2007-03-15 | 2015-08-26 | System and method for concurrently checking availability of data in extending memories |
Country Status (1)
Country | Link |
---|---|
US (2) | US20080229026A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090224388A1 (en) * | 2008-03-04 | 2009-09-10 | International Business Machines Corporation | Semiconductor chip stacking for redundancy and yield improvement |
US20100015732A1 (en) * | 2007-11-29 | 2010-01-21 | International Business Machines Corporation | Semiconductor chip repair by stacking of a base semiconductor chip and a repair semiconductor chip |
US20120290793A1 (en) * | 2011-05-10 | 2012-11-15 | Jaewoong Chung | Efficient tag storage for large data caches |
US20120297110A1 (en) * | 2011-05-18 | 2012-11-22 | University Of North Texas | Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks |
US20130103892A1 (en) * | 2011-10-20 | 2013-04-25 | Hae Chan PARK | Combined memory block and data processing system having the same |
US20130179642A1 (en) * | 2012-01-10 | 2013-07-11 | Qualcomm Incorporated | Non-Allocating Memory Access with Physical Address |
US9123409B2 (en) | 2009-06-11 | 2015-09-01 | Micron Technology, Inc. | Memory device for a hierarchical memory architecture |
US20160140039A1 (en) * | 2014-11-14 | 2016-05-19 | Avinash Sodani | Providing multiple memory modes for a processor including internal memory |
US20160210243A1 (en) * | 2015-01-16 | 2016-07-21 | Oracle International Corporation | Memory Paging for Processors using Physical Addresses |
US20180300139A1 (en) * | 2015-10-29 | 2018-10-18 | Intel Corporation | Boosting local memory performance in processor graphics |
US10296480B2 (en) | 2011-10-20 | 2019-05-21 | SK Hynix Inc. | Data processing system having combined memory block and stack package |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10042576B2 (en) | 2016-08-17 | 2018-08-07 | Advanced Micro Devices, Inc. | Method and apparatus for compressing addresses |
US10901639B2 (en) * | 2016-11-29 | 2021-01-26 | Sap Se | Memory allocation in multi-core processors |
US10915453B2 (en) * | 2016-12-29 | 2021-02-09 | Intel Corporation | Multi level system memory having different caching structures and memory controller that supports concurrent look-up into the different caching structures |
US11165482B1 (en) * | 2020-08-20 | 2021-11-02 | Nxp Usa, Inc. | Efficient engine and algorithm for control and data multiplexing/demultiplexing in 5G NR devices |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5649154A (en) * | 1992-02-27 | 1997-07-15 | Hewlett-Packard Company | Cache memory system having secondary cache integrated with primary cache for use with VLSI circuits |
US5903908A (en) * | 1994-01-04 | 1999-05-11 | Intel Corporation | Method and apparatus for maintaining cache coherency using a single controller for multiple cache memories |
US6282614B1 (en) * | 1999-04-15 | 2001-08-28 | National Semiconductor Corporation | Apparatus and method for reducing the power consumption of a microprocessor with multiple levels of caches |
US6412038B1 (en) * | 2000-02-14 | 2002-06-25 | Intel Corporation | Integral modular cache for a processor |
US6427188B1 (en) * | 2000-02-09 | 2002-07-30 | Hewlett-Packard Company | Method and system for early tag accesses for lower-level caches in parallel with first-level cache |
US20030154345A1 (en) * | 2002-02-08 | 2003-08-14 | Terry Lyon | Multilevel cache system having unified cache tag memory |
US20040098540A1 (en) * | 2002-11-19 | 2004-05-20 | Renesas Technology Corp. | Cache system and cache memory control device controlling cache memory having two access modes |
US20040162971A1 (en) * | 1999-05-11 | 2004-08-19 | Sun Microsystems, Inc. | Switching method in a multi-threaded processor |
US6848031B2 (en) * | 2002-01-02 | 2005-01-25 | Intel Corporation | Parallel searching for an instruction at multiple cache levels |
US20050033920A1 (en) * | 2003-08-07 | 2005-02-10 | Delan Eric | Cache structure and methodology |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5276826A (en) * | 1988-01-04 | 1994-01-04 | Hewlett-Packard Company | Apparatus for transforming addresses to provide pseudo-random access to memory modules |
US5678020A (en) * | 1994-01-04 | 1997-10-14 | Intel Corporation | Memory subsystem wherein a single processor chip controls multiple cache memory chips |
US6397296B1 (en) * | 1999-02-19 | 2002-05-28 | Hitachi Ltd. | Two-level instruction cache for embedded processors |
US6430655B1 (en) * | 2000-01-31 | 2002-08-06 | Mips Technologies, Inc. | Scratchpad RAM memory accessible in parallel to a primary cache |
EP1486875A1 (en) * | 2003-06-12 | 2004-12-15 | STMicroelectronics Limited | Allowing multiple simultaneous acccesses to a cache |
-
2007
- 2007-03-15 US US11/724,568 patent/US20080229026A1/en not_active Abandoned
-
2015
- 2015-08-26 US US14/835,988 patent/US10310976B2/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5649154A (en) * | 1992-02-27 | 1997-07-15 | Hewlett-Packard Company | Cache memory system having secondary cache integrated with primary cache for use with VLSI circuits |
US5903908A (en) * | 1994-01-04 | 1999-05-11 | Intel Corporation | Method and apparatus for maintaining cache coherency using a single controller for multiple cache memories |
US6282614B1 (en) * | 1999-04-15 | 2001-08-28 | National Semiconductor Corporation | Apparatus and method for reducing the power consumption of a microprocessor with multiple levels of caches |
US20040162971A1 (en) * | 1999-05-11 | 2004-08-19 | Sun Microsystems, Inc. | Switching method in a multi-threaded processor |
US6427188B1 (en) * | 2000-02-09 | 2002-07-30 | Hewlett-Packard Company | Method and system for early tag accesses for lower-level caches in parallel with first-level cache |
US6412038B1 (en) * | 2000-02-14 | 2002-06-25 | Intel Corporation | Integral modular cache for a processor |
US6848031B2 (en) * | 2002-01-02 | 2005-01-25 | Intel Corporation | Parallel searching for an instruction at multiple cache levels |
US20030154345A1 (en) * | 2002-02-08 | 2003-08-14 | Terry Lyon | Multilevel cache system having unified cache tag memory |
US20040098540A1 (en) * | 2002-11-19 | 2004-05-20 | Renesas Technology Corp. | Cache system and cache memory control device controlling cache memory having two access modes |
US20050033920A1 (en) * | 2003-08-07 | 2005-02-10 | Delan Eric | Cache structure and methodology |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100015732A1 (en) * | 2007-11-29 | 2010-01-21 | International Business Machines Corporation | Semiconductor chip repair by stacking of a base semiconductor chip and a repair semiconductor chip |
US8796047B2 (en) | 2007-11-29 | 2014-08-05 | International Business Machines Corporation | Semiconductor chip repair by stacking of a base semiconductor chip and a repair semiconductor chip |
US8679861B2 (en) | 2007-11-29 | 2014-03-25 | International Business Machines Corporation | Semiconductor chip repair by stacking of a base semiconductor chip and a repair semiconductor chip |
US8597960B2 (en) * | 2008-03-04 | 2013-12-03 | International Business Machines Corporation | Semiconductor chip stacking for redundancy and yield improvement |
US20090224388A1 (en) * | 2008-03-04 | 2009-09-10 | International Business Machines Corporation | Semiconductor chip stacking for redundancy and yield improvement |
US8686559B2 (en) | 2008-03-04 | 2014-04-01 | International Business Machines Corporation | Semiconductor chip stacking for redundancy and yield improvement |
DE102009037984B4 (en) * | 2009-06-11 | 2017-10-19 | Micron Technology, Inc. | Memory unit for a hierarchical memory architecture |
US9123409B2 (en) | 2009-06-11 | 2015-09-01 | Micron Technology, Inc. | Memory device for a hierarchical memory architecture |
US10725956B2 (en) | 2009-06-11 | 2020-07-28 | Micron Technology, Inc. | Memory device for a hierarchical memory architecture |
US9626327B2 (en) | 2009-06-11 | 2017-04-18 | Micron Technology, Inc. | Memory device for a hierarchical memory architecture |
US10031879B2 (en) | 2009-06-11 | 2018-07-24 | Micron Technology, Inc. | Memory device for a hierarchical memory architecture |
US20120290793A1 (en) * | 2011-05-10 | 2012-11-15 | Jaewoong Chung | Efficient tag storage for large data caches |
US20120297110A1 (en) * | 2011-05-18 | 2012-11-22 | University Of North Texas | Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks |
US9396135B2 (en) * | 2011-05-18 | 2016-07-19 | University Of North Texas | Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks |
US20130103892A1 (en) * | 2011-10-20 | 2013-04-25 | Hae Chan PARK | Combined memory block and data processing system having the same |
US10296480B2 (en) | 2011-10-20 | 2019-05-21 | SK Hynix Inc. | Data processing system having combined memory block and stack package |
US9552874B2 (en) * | 2011-10-20 | 2017-01-24 | Hynix Semiconductor Inc. | Combined memory block and data processing system having the same |
US20130179642A1 (en) * | 2012-01-10 | 2013-07-11 | Qualcomm Incorporated | Non-Allocating Memory Access with Physical Address |
US9720827B2 (en) * | 2014-11-14 | 2017-08-01 | Intel Corporation | Providing multiple memory modes for a processor including internal memory |
US10346300B2 (en) * | 2014-11-14 | 2019-07-09 | Intel Corporation | Providing multiple memory modes for a processor including internal memory |
US20160140039A1 (en) * | 2014-11-14 | 2016-05-19 | Avinash Sodani | Providing multiple memory modes for a processor including internal memory |
US11526440B2 (en) * | 2014-11-14 | 2022-12-13 | Intel Corporation | Providing multiple memory modes for a processor including internal memory |
US9678872B2 (en) * | 2015-01-16 | 2017-06-13 | Oracle International Corporation | Memory paging for processors using physical addresses |
US20160210243A1 (en) * | 2015-01-16 | 2016-07-21 | Oracle International Corporation | Memory Paging for Processors using Physical Addresses |
US20180300139A1 (en) * | 2015-10-29 | 2018-10-18 | Intel Corporation | Boosting local memory performance in processor graphics |
US10768935B2 (en) * | 2015-10-29 | 2020-09-08 | Intel Corporation | Boosting local memory performance in processor graphics |
US20200371804A1 (en) * | 2015-10-29 | 2020-11-26 | Intel Corporation | Boosting local memory performance in processor graphics |
Also Published As
Publication number | Publication date |
---|---|
US10310976B2 (en) | 2019-06-04 |
US20150363314A1 (en) | 2015-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10310976B2 (en) | System and method for concurrently checking availability of data in extending memories | |
US11074190B2 (en) | Slot/sub-slot prefetch architecture for multiple memory requestors | |
US7558920B2 (en) | Apparatus and method for partitioning a shared cache of a chip multi-processor | |
US8032711B2 (en) | Prefetching from dynamic random access memory to a static random access memory | |
US8868843B2 (en) | Hardware filter for tracking block presence in large caches | |
US11921642B2 (en) | Methods and apparatuses for addressing memory caches | |
US20130046934A1 (en) | System caching using heterogenous memories | |
US8954672B2 (en) | System and method for cache organization in row-based memories | |
US11232039B2 (en) | Cache for storing regions of data | |
CN112558889B (en) | Stacked Cache system based on SEDRAM, control method and Cache device | |
US8234453B2 (en) | Processor having a cache memory which is comprised of a plurality of large scale integration | |
US20130031313A1 (en) | Cache arrangement | |
CN115132238A (en) | Integrated three-dimensional (3D) DRAM cache | |
US10402325B2 (en) | Memory system | |
US20220276969A1 (en) | Sedram-based stacked cache system and device and controlling method therefor | |
US11526448B2 (en) | Direct mapped caching scheme for a memory side cache that exhibits associativity in response to blocking from pinning | |
US20240078041A1 (en) | Die-Based Rank Management | |
US20240070073A1 (en) | Page cache and prefetch engine for external memory | |
CN114556335A (en) | On-chip cache and integrated chip |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: TAIWAN SEMICONDUCTOR MANUFACTURING CO., LTD., TAIW Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CHUNG, SHINE;REEL/FRAME:019115/0647 Effective date: 20070312 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |