WO2005033926A3 - Methods and apparatus for reducing memory latency in a software application - Google Patents
- Publication number
- WO2005033926A3 (PCT/US2004/032212)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- software application
- thread
- helper
- memory latency
- main thread
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2006534105A JP4783291B2 (en) | 2003-10-02 | 2004-09-29 | Method and apparatus for reducing memory latency in software applications |
EP04789368A EP1678610A2 (en) | 2003-10-02 | 2004-09-29 | Methods and apparatus for reducing memory latency in a software application |
CN200480035709XA CN1890635B (en) | 2003-10-02 | 2004-09-29 | Methods and apparatus for reducing memory latency in a software application |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/677,414 US7328433B2 (en) | 2003-10-02 | 2003-10-02 | Methods and apparatus for reducing memory latency in a software application |
US10/677,414 | 2003-10-02 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2005033926A2 (en) | 2005-04-14 |
WO2005033926A3 (en) | 2005-12-29 |
Family
ID=34422137
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2004/032212 WO2005033926A2 (en) | 2003-10-02 | 2004-09-29 | Methods and apparatus for reducing memory latency in a software application |
Country Status (5)
Country | Link |
---|---|
US (1) | US7328433B2 (en) |
EP (1) | EP1678610A2 (en) |
JP (2) | JP4783291B2 (en) |
CN (1) | CN1890635B (en) |
WO (1) | WO2005033926A2 (en) |
Families Citing this family (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040128489A1 (en) * | 2002-12-31 | 2004-07-01 | Hong Wang | Transformation of single-threaded code to speculative precomputation enabled code |
US20040243767A1 (en) * | 2003-06-02 | 2004-12-02 | Cierniak Michal J. | Method and apparatus for prefetching based upon type identifier tags |
US7707554B1 (en) * | 2004-04-21 | 2010-04-27 | Oracle America, Inc. | Associating data source information with runtime events |
US20060080661A1 (en) * | 2004-10-07 | 2006-04-13 | International Business Machines Corporation | System and method for hiding memory latency |
US7506325B2 (en) | 2004-10-07 | 2009-03-17 | International Business Machines Corporation | Partitioning processor resources based on memory usage |
US7752016B2 (en) * | 2005-01-11 | 2010-07-06 | Hewlett-Packard Development Company, L.P. | System and method for data analysis |
US7809991B2 (en) * | 2005-01-11 | 2010-10-05 | Hewlett-Packard Development Company, L.P. | System and method to qualify data capture |
US7849453B2 (en) * | 2005-03-16 | 2010-12-07 | Oracle America, Inc. | Method and apparatus for software scouting regions of a program |
US7950012B2 (en) * | 2005-03-16 | 2011-05-24 | Oracle America, Inc. | Facilitating communication and synchronization between main and scout threads |
US7472256B1 (en) | 2005-04-12 | 2008-12-30 | Sun Microsystems, Inc. | Software value prediction using pendency records of predicted prefetch values |
US20070130114A1 (en) * | 2005-06-20 | 2007-06-07 | Xiao-Feng Li | Methods and apparatus to optimize processing throughput of data structures in programs |
US7784040B2 (en) * | 2005-11-15 | 2010-08-24 | International Business Machines Corporation | Profiling of performance behaviour of executed loops |
US7856622B2 (en) * | 2006-03-28 | 2010-12-21 | Inventec Corporation | Computer program runtime bottleneck diagnostic method and system |
US7383402B2 (en) * | 2006-06-05 | 2008-06-03 | Sun Microsystems, Inc. | Method and system for generating prefetch information for multi-block indirect memory access chains |
US7383401B2 (en) * | 2006-06-05 | 2008-06-03 | Sun Microsystems, Inc. | Method and system for identifying multi-block indirect memory access chains |
US7596668B2 (en) * | 2007-02-20 | 2009-09-29 | International Business Machines Corporation | Method, system and program product for associating threads within non-related processes based on memory paging behaviors |
JP4821907B2 (en) * | 2007-03-06 | 2011-11-24 | 日本電気株式会社 | Memory access control system, memory access control method and program thereof |
US8886887B2 (en) * | 2007-03-15 | 2014-11-11 | International Business Machines Corporation | Uniform external and internal interfaces for delinquent memory operations to facilitate cache optimization |
US8271963B2 (en) * | 2007-11-19 | 2012-09-18 | Microsoft Corporation | Mimicking of functionality exposed through an abstraction |
CN101482831B (en) * | 2008-01-08 | 2013-05-15 | 国际商业机器公司 | Method and equipment for concomitant scheduling of working thread and worker thread |
US8359589B2 (en) * | 2008-02-01 | 2013-01-22 | International Business Machines Corporation | Helper thread for pre-fetching data |
CN101639799B (en) * | 2008-07-31 | 2013-02-13 | 英赛特半导体有限公司 | Integrated circuit characterization system and method |
US8312442B2 (en) * | 2008-12-10 | 2012-11-13 | Oracle America, Inc. | Method and system for interprocedural prefetching |
US20100153934A1 (en) * | 2008-12-12 | 2010-06-17 | Peter Lachner | Prefetch for systems with heterogeneous architectures |
US8327325B2 (en) * | 2009-01-14 | 2012-12-04 | International Business Machines Corporation | Programmable framework for automatic tuning of software applications |
CA2680597C (en) * | 2009-10-16 | 2011-06-07 | Ibm Canada Limited - Ibm Canada Limitee | Managing speculative assist threads |
US8572337B1 (en) * | 2009-12-14 | 2013-10-29 | Symantec Corporation | Systems and methods for performing live backups |
JP5541491B2 (en) * | 2010-01-07 | 2014-07-09 | 日本電気株式会社 | Multiprocessor, computer system using the same, and multiprocessor processing method |
CN101807144B (en) * | 2010-03-17 | 2014-05-14 | 上海大学 | Prospective multi-threaded parallel execution optimization method |
US8423750B2 (en) | 2010-05-12 | 2013-04-16 | International Business Machines Corporation | Hardware assist thread for increasing code parallelism |
US8468531B2 (en) | 2010-05-26 | 2013-06-18 | International Business Machines Corporation | Method and apparatus for efficient inter-thread synchronization for helper threads |
US8612730B2 (en) | 2010-06-08 | 2013-12-17 | International Business Machines Corporation | Hardware assist thread for dynamic performance profiling |
US20120005457A1 (en) * | 2010-07-01 | 2012-01-05 | International Business Machines Corporation | Using software-controlled smt priority to optimize data prefetch with assist thread |
FR2962567B1 (en) * | 2010-07-12 | 2013-04-26 | Bull Sas | Method for optimizing memory access, when re-executing an application, in a microprocessor comprising several logical cores and computer program using such a method |
US8683129B2 (en) * | 2010-10-21 | 2014-03-25 | Oracle International Corporation | Using speculative cache requests to reduce cache miss delays |
US20130086564A1 (en) * | 2011-08-26 | 2013-04-04 | Cognitive Electronics, Inc. | Methods and systems for optimizing execution of a program in an environment having simultaneously parallel and serial processing capability |
US9021152B2 (en) * | 2013-09-30 | 2015-04-28 | Google Inc. | Methods and systems for determining memory usage ratings for a process configured to run on a device |
KR102525295B1 (en) | 2016-01-06 | 2023-04-25 | 삼성전자주식회사 | Method for managing data and apparatus thereof |
JP6845657B2 (en) * | 2016-10-12 | 2021-03-24 | 株式会社日立製作所 | Management server, management method and its program |
CN106776047B (en) * | 2017-01-19 | 2019-08-02 | 郑州轻工业学院 | Group-wise thread forecasting method towards irregular data-intensive application |
US20180260255A1 (en) * | 2017-03-10 | 2018-09-13 | Futurewei Technologies, Inc. | Lock-free reference counting |
US11816500B2 (en) | 2019-03-15 | 2023-11-14 | Intel Corporation | Systems and methods for synchronization of multi-thread lanes |
US11132268B2 (en) | 2019-10-21 | 2021-09-28 | The Boeing Company | System and method for synchronizing communications between a plurality of processors |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5590293A (en) * | 1988-07-20 | 1996-12-31 | Digital Equipment Corporation | Dynamic microbranching with programmable hold on condition, to programmable dynamic microbranching delay minimization |
US5835947A (en) * | 1996-05-31 | 1998-11-10 | Sun Microsystems, Inc. | Central processing unit and method for improving instruction cache miss latencies using an instruction buffer which conditionally stores additional addresses |
US5809566A (en) * | 1996-08-14 | 1998-09-15 | International Business Machines Corporation | Automatic cache prefetch timing with dynamic trigger migration |
US6199154B1 (en) * | 1997-11-17 | 2001-03-06 | Advanced Micro Devices, Inc. | Selecting cache to fetch in multi-level cache system based on fetch address source and pre-fetching additional data to the cache for future access |
US6223276B1 (en) * | 1998-03-31 | 2001-04-24 | Intel Corporation | Pipelined processing of short data streams using data prefetching |
US6643766B1 (en) * | 2000-05-04 | 2003-11-04 | Hewlett-Packard Development Company, L.P. | Speculative pre-fetching additional line on cache miss if no request pending in out-of-order processor |
2003
- 2003-10-02: US application US10/677,414, publication US7328433B2 (not active: Expired - Fee Related)
2004
- 2004-09-29: EP application EP04789368A, publication EP1678610A2 (not active: Withdrawn)
- 2004-09-29: WO application PCT/US2004/032212, publication WO2005033926A2 (active: Application Filing)
- 2004-09-29: JP application JP2006534105A, publication JP4783291B2 (not active: Expired - Fee Related)
- 2004-09-29: CN application CN200480035709XA, publication CN1890635B (not active: Expired - Fee Related)
2010
- 2010-12-22: JP application JP2010286087A, publication JP5118744B2 (not active: Expired - Fee Related)
Non-Patent Citations (4)
Title |
---|
DORAI G ET AL: "Optimizing SMT Processors for High Single-Thread Performance", THE JOURNAL OF INSTRUCTION-LEVEL PARALLELISM, vol. 5, April 2003 (2003-04-01), XP002348824, Retrieved from the Internet <URL:http://www.jilp.org/vol5/v5paper3.pdf> [retrieved on 20051010] * |
HONG WANG ET AL: "Speculative precomputation: exploring the use of multithreading for latency", INTEL TECHNOLOGY JOURNAL, vol. 6, no. 1, 14 February 2002 (2002-02-14), XP002303432 * |
KIM D ET AL: "Design and evaluation of compiler algorithms for pre-execution", ASPLOS. PROCEEDINGS. INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, NEW YORK, NY, US, October 2002 (2002-10-01), pages 159 - 170, XP002311601 * |
LIAO S S W ET AL: "Post-pass binary adaptation for software-based speculative precomputation", ACM SIGPLAN NOTICES, ACM, ASSOCIATION FOR COMPUTING MACHINERY, NEW YORK, NY, US, vol. 37, no. 5, May 2002 (2002-05-01), pages 117 - 128, XP002302652, ISSN: 0362-1340 * |
Also Published As
Publication number | Publication date |
---|---|
US7328433B2 (en) | 2008-02-05 |
JP5118744B2 (en) | 2013-01-16 |
CN1890635B (en) | 2011-03-09 |
CN1890635A (en) | 2007-01-03 |
JP2007507807A (en) | 2007-03-29 |
JP4783291B2 (en) | 2011-09-28 |
JP2011090705A (en) | 2011-05-06 |
EP1678610A2 (en) | 2006-07-12 |
US20050086652A1 (en) | 2005-04-21 |
WO2005033926A2 (en) | 2005-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2005033926A3 (en) | Methods and apparatus for reducing memory latency in a software application | |
EP1702269B1 (en) | Dynamic performance monitoring-based approach to memory management | |
Schoeberl | A time predictable instruction cache for a Java processor | |
US9652230B2 (en) | Computer processor employing dedicated hardware mechanism controlling the initialization and invalidation of cache lines | |
WO2004027605A3 (en) | Post-pass binary adaptation for software-based speculative precomputation | |
Ekman et al. | A robust main-memory compression scheme | |
US6397296B1 (en) | Two-level instruction cache for embedded processors | |
US9274965B2 (en) | Prefetching data | |
US7401188B2 (en) | Method, device, and system to avoid flushing the contents of a cache by not inserting data from large requests | |
Zhuang et al. | Reducing cache pollution via dynamic data prefetch filtering | |
US7278136B2 (en) | Reducing processor energy consumption using compile-time information | |
GB2409747A (en) | Processor cache memory as ram for execution of boot code | |
Luk et al. | Automatic compiler-inserted prefetching for pointer-based applications | |
WO2004068339A3 (en) | Multithreaded processor with recoupled data and instruction prefetch | |
Chen et al. | TEST: a tracer for extracting speculative threads | |
WO2004055667A3 (en) | System and method for data prefetching | |
EP1460532A3 (en) | Computer processor data fetch unit and related method | |
WO2004102376A3 (en) | Apparatus and method to provide multithreaded computer processing | |
McCurdy et al. | Characterizing the impact of prefetching on scientific application performance | |
Guttman et al. | Performance and energy evaluation of data prefetching on intel xeon phi | |
WO2002027498A3 (en) | System and method for identifying and managing streaming-data | |
Kiani et al. | Skerd: Reuse distance analysis for simultaneous multiple gpu kernel executions | |
Lewis et al. | Avoiding initialization misses to the heap | |
Liu et al. | Enhancements for accurate and timely streaming prefetcher | |
Cebrian et al. | Boosting Store Buffer Efficiency with Store-Prefetch Bursts |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 200480035709.X; Country of ref document: CN |
AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
WWE | WIPO information: entry into national phase | Ref document number: 2006534105; Country of ref document: JP |
WWE | WIPO information: entry into national phase | Ref document number: 2004789368; Country of ref document: EP |
WWP | WIPO information: published in national office | Ref document number: 2004789368; Country of ref document: EP |