US20060143241A1 - System and method for scaleable multiplexed transactional log recovery - Google Patents
- Publication number
- US20060143241A1 (application US11/357,333)
- Authority
- US
- United States
- Prior art keywords
- log
- transactional
- region
- multiplexed
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2308—Concurrency control
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99951—File or database maintenance
- Y10S707/99952—Coherency, e.g. same view to multiple users
- Y10S707/99953—Recoverability
Definitions
- Transactional logging involves maintaining a transactional log that durably records a time serial history of transactions in a system.
- the transactional log provides information for restoring a system to a particular state in time prior to a system failure.
- a transactional logging system must be able to reliably and accurately restore logging functionalities after such a system failure.
- transactional logging systems have used a dedicated log to support a single log client.
- Dedicated transactional logging systems are typically very robust and achieve a high performance level.
- the inventor has determined that the high level of reliability of a single log client using a dedicated logging system may actually result in overall performance degradation for a computing environment where multiple log clients are using multiple dedicated logging systems.
- each dedicated logging system independently incurs input/output (I/O) overhead to write and retrieve information.
- the I/O overhead results in adverse performance impact, and the impact is cumulative for each of the independent transactional logging systems.
- An improved transactional logging system is desirable that could overcome some of these performance problems but could still allow reliable system recovery.
- the present invention provides scaleable recovery for a multiplexed transactional log.
- a multiplexed transactional log may include log data from multiple clients.
- log data from different clients may be multiplexed into the multiplexed transactional log in any order.
- the log data associated with a particular client is represented by a virtual log of that client within the multiplexed transactional log.
- the invention is directed to a computer-implemented method for transactional logging using a multiplexed log. The computer-implemented method maintains a multiplexed log for multiple clients using a scaleable logging process.
- the computer-implemented method recovers the multiplexed log using a scaleable recovery process.
- the scaleable recovery process includes an end-of-log locating process for locating the end of each virtual log within the multiplexed log.
- the end-of-log locating process is also scaleable.
- the invention is directed to a computer-implemented method for maintaining a recoverable transactional log.
- a log block containing log data is received from one of the clients.
- the log block is appended to a current region in a flush queue.
- Metadata associated with the current region is updated to account for the newly appended log block in the current region. If the end of the current region is reached, the metadata is appended to the current region in the flush queue.
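The maintenance steps above can be sketched in a few lines. This is a minimal illustration only, not the patented implementation: the names `RegionFlushQueue`, `Region`, and `REGION_BLOCKS`, and the choice of four blocks per region, are assumptions made for the example.

```python
from dataclasses import dataclass, field

REGION_BLOCKS = 4  # hypothetical region capacity, in log blocks


@dataclass
class Region:
    blocks: list = field(default_factory=list)  # (client, payload) pairs
    owners: list = field(default_factory=list)  # region metadata: owner of each block
    closed: bool = False                        # True once the metadata is appended


class RegionFlushQueue:
    """In-memory queue of regions waiting to be written to the log."""

    def __init__(self):
        self.regions = [Region()]

    def append(self, client, payload):
        region = self.regions[-1]
        region.blocks.append((client, payload))
        # Update the metadata to account for the newly appended log block.
        region.owners.append(client)
        if len(region.blocks) == REGION_BLOCKS:
            # End of region reached: the metadata travels with the region
            # in the queue, and a fresh region is started.
            region.closed = True
            self.regions.append(Region())
```

Nothing is written to stable storage here; the sketch only shows how the metadata is kept in step with the blocks and sealed into the region when the region fills.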
- the invention is directed to a computer-implemented method for recovering a transactional log after a system failure.
- a starting point in the transactional log is determined by referring to metadata associated with the transactional log.
- the last valid owner page within the transactional log is located by checking at discrete intervals from the starting point toward the end of the transactional log.
- the method checks the validity of a region in the transactional log associated with the last valid owner page. If the region associated with the last valid owner page is valid, the first invalid log block is located in an incomplete region that lies beyond the last valid region, toward the end of the transactional log. The end of the transactional log is found when the first invalid log block is located.
- the present invention is directed to a transactional logging system that includes a transactional log, a metadata file, and a multiplexed transactional logging component.
- the transactional log is typically stored in a storage unit.
- the transactional log contains log blocks from clients and owner pages that include information on how the log blocks are organized in the transactional log.
- the metadata file includes information about the transactional log.
- the multiplexed transactional logging component is configured to append the log blocks and the owner pages to the transactional log and to recover the transactional log after a system failure using information in the owner pages and the metadata file.
- FIG. 1 illustrates an exemplary computing device that implements the present invention.
- FIG. 2 is a schematic diagram of a multiplexed transactional logging system.
- FIG. 3 is a graphical representation of two exemplary owner pages.
- FIG. 4 is an operational flow diagram of an exemplary process for handling log blocks from a client.
- FIG. 5 is an operational flow diagram of another exemplary process for handling log blocks from a client.
- FIG. 6 is an operational flow diagram of yet another exemplary process for handling log blocks from a client.
- FIG. 7 is an operational flow diagram of an exemplary process for recovering a multiplexed log.
- FIG. 8 is an operational flow diagram of another exemplary process for recovering a multiplexed log.
- logging system recovery is an important aspect of a multiplexed transactional logging system.
- the present invention focuses on recovering a multiplexed log after a system failure and restoring logging functionalities.
- restoring logging functionality typically includes determining the end of each of the virtual logs within the multiplexed log.
- the invention provides a number of methods for locating the end of a multiplexed log and the end of each of the virtual logs within the multiplexed log.
- the manner in which multiplexed logs are recovered in the present invention is very different from the manner in which dedicated logs are recovered.
- a conventional method that scans the entire dedicated log from its last written restart area to locate the end of the log is typically used.
- the dedicated log may be scanned sequentially or logarithmically using a binary search algorithm.
- this conventional method is not practical for recovering a multiplexed log.
- each of the virtual logs within the multiplexed log would have to be located by scanning.
- the number of scans for log recovery proportionally increases with the size and the number of virtual logs within the multiplexed log.
- the amount of time and system resources required by conventional log recovery methods is prohibitive, especially for a large scale multiplexed transactional logging system.
- the present invention provides an improved system and method that enables multiplexed log recovery but requires significantly less time and fewer system resources.
- the maintenance and recovery of the multiplexed log are scaleable (independent of the size of the multiplexed log and the number of clients).
- FIG. 1 illustrates an exemplary computing device 100 that may be used in one exemplary embodiment of the present invention.
- a computing device such as computing device 100 .
- computing device 100 typically includes at least one processing unit 102 and system memory 104 .
- system memory 104 may include volatile memory (such as RAM 106 ), non-volatile memory (such as ROM 110 , flash memory, etc.), and storage unit 130 (such as hard drive or other stable storage devices).
- Computing device 100 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape.
- Computer storage media may include volatile and nonvolatile memory, storage units, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
- computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 100 .
- Computing device 100 may also include input component(s) 140 such as keyboard 122 , mouse 123 , pen, voice input device, touch input device, etc.
- Output component(s) 145 such as a display, speakers, printer, etc. may also be included. These devices are well known in the art and need not be discussed at length here.
- Computing device 100 may also contain communication connection(s) 150 that allow computing device 100 to communicate with other computing devices, such as over one or more network(s) 160 .
- Signals used by communication connection(s) 150 are one example of communication media.
- Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media.
- communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
- the term computer readable media as used herein includes both storage media and communication media.
- System memory 104 typically includes BIOS 111 , operating system 112 , and one or more applications 120 . As shown in the figure, system memory 104 may include multiplexed transactional logging system 114 . Multiplexed transactional logging system 114 is a computer executable component that provides logging services for applications 120 , such as Client A and Client B. For illustrative purposes, multiplexed transactional logging system 114 is shown as a part of the kernel of computing device 100 . But multiplexed transactional logging system 114 may be implemented as a separate application executing in either the kernel mode or the user mode of computing device 100 . Multiplexed transactional logging system 114 may also be implemented as two or more components executing in either mode.
- Multiplexed transactional logging system 114 is configured to maintain and retrieve log data for applications 120 .
- Multiplexed transactional logging system 114 maintains the log data in multiplexed log 134 stored in storage unit 130 .
- the log data from different applications 120 are multiplexed by multiplexed transactional logging system 114 before being appended to multiplexed log 134 .
- log data are organized into fixed size log blocks. Log blocks are grouped into regions that make up the multiplexed log 134 .
- Multiplexed transactional logging system 114 may defer log data in flush queue 116 before appending the log data to multiplexed log 134 .
- Metadata, which is information about the structure and organization of multiplexed log 134, may be included in metadata file 136 stored in storage unit 130. Metadata for multiplexed log 134 may also be appended to multiplexed log 134 as owner pages. Owner pages will be discussed in more detail in conjunction with FIG. 3. Briefly stated, an owner page contains metadata of a particular region in the multiplexed log. One or more owner pages 109 may be stored in volatile memory before they are appended to multiplexed log 134.
- FIG. 2 is a schematic diagram of multiplexed transactional logging system 114 .
- Multiplexed transactional logging system 114 provides logging services to multiple clients, such as Clients A, B and C.
- Each client is an application that maintains a log through the multiplexed transactional logging system 114 . Examples of the clients may be a database application, a transactional file system, or the like.
- Clients A and B, which are also shown in FIG. 1, are applications executing in the same computing device on which multiplexed transactional logging system 114 is executing.
- Client C is an application executing on a remote computing device.
- Multiplexed transactional logging system 114 is capable of providing logging services to remote applications such as Client C.
- Multiplexed transactional logging system 114 is configured to provide the illusion to each of Clients A, B, and C that a separate, dedicated log is being maintained for each client.
- Clients A, B, and C send log data to multiplexed transactional logging system 114 with the expectation that the log data are stored in dedicated logs.
- the illusory dedicated logs are referred to as “virtual logs,” represented in the figure as virtual logs 211 - 213 .
- multiplexed transactional logging system 114 multiplexes and appends the log data from each client to multiplexed log 134 , which is shared by Clients A, B, and C.
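The relationship between the shared log and the per-client virtual logs can be illustrated with a small sketch. The class name `MultiplexedLog` and its methods are hypothetical, and a real system would consult owner-page metadata rather than filtering the whole log as this toy version does.

```python
class MultiplexedLog:
    """One shared append-only log; each client sees only its own blocks."""

    def __init__(self):
        self.blocks = []  # (client, payload) in arrival order

    def append(self, client, payload):
        # Blocks from different clients are interleaved in arrival order.
        self.blocks.append((client, payload))
        return len(self.blocks) - 1  # position in the shared log

    def virtual_log(self, client):
        # The client's "dedicated" log is a view over the shared log.
        return [payload for owner, payload in self.blocks if owner == client]
```

For example, after appending `a1`, `b1`, `a2` for clients A, B, A, the virtual log of A is `[a1, a2]` even though the shared log holds all three blocks.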
- the log data are typically organized into log blocks.
- a log block is a unit of physical log I/O that includes a fixed-sized log block header and a body which may be of any size. In one embodiment, the size of log blocks may be a multiple of the size of a sector associated with storage unit 130.
- Multiplexed transactional logging system 114 may be configured to maintain owner pages, which are data structures that contain information about how log blocks are arranged in multiplexed log 134. Owner pages will be discussed in more detail in conjunction with FIG. 3. Briefly stated, an owner page may include information about the ownership of log blocks in a region of multiplexed log 134. Multiplexed transactional logging system 114 may use the metadata in the owner pages to organize log blocks as virtual logs 211-213 for providing logging services to Clients A, B, and C. Multiplexed transactional logging system 114 may also use the metadata in owner pages for recovering logging functionalities after a system failure. Owner pages may be appended to multiplexed log 134 as shown in the figure. Owner pages may also be appended to metadata file 136, which is a data structure separate from multiplexed log 134.
- multiplexed transactional logging system 114 may receive log blocks from Clients A, B and C at different times and in any order. Multiplexed transactional logging system 114 multiplexes the log blocks by appending them to a single multiplexed log 134. Multiplexed transactional logging system 114 may defer appending the multiplexed log blocks using flush queue 116.
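As a small worked example of the sector-multiple embodiment, the on-disk size of a block can be obtained by rounding the header plus body up to the next sector boundary. The function name and the 512-byte default sector size are assumptions for illustration.

```python
def padded_block_size(header_size, body_size, sector_size=512):
    """Round header + body up to a whole number of sectors."""
    raw = header_size + body_size
    # Ceiling division using integer arithmetic only.
    sectors = -(-raw // sector_size)
    return sectors * sector_size
```

With a 64-byte header and a 1000-byte body, the raw size is 1064 bytes, which occupies three 512-byte sectors, or 1536 bytes.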
- Flush queue 116 is a data structure that represents the multiplexed log blocks that will be appended to multiplexed log 134 . Flush queue 116 is typically stored in volatile memory. Multiplexed transactional logging system 114 may be configured to use flush queue 116 for reducing the need to access storage unit 130 and improving system performance.
- a performance overhead is generated every time multiplexed transactional logging system 114 accesses the hard drive to append multiplexed log blocks.
- Multiplexed logging system 114 may be configured to write log blocks in flush queue 116 to the hard drive only when the user voluntarily requests that the log blocks be forced to the hard disk or when memory tied up by log blocks has exceeded a user-defined flush threshold. By accumulating log blocks in volatile memory using a flush queue, the performance overhead is reduced by amortizing multiple potential accesses to the hard drive with a single hard drive access.
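The flush policy described above (write only on a voluntary force or when a user-defined threshold is exceeded) might look roughly like this. `ThresholdFlushQueue` and the `write_to_disk` callback are hypothetical names, and the callback stands in for whatever single storage access the real system performs.

```python
class ThresholdFlushQueue:
    """Accumulates log blocks in memory and writes them in one batch."""

    def __init__(self, flush_threshold_bytes, write_to_disk):
        self.flush_threshold = flush_threshold_bytes
        self.write_to_disk = write_to_disk  # callable taking a list of blocks
        self.pending = []
        self.pending_bytes = 0

    def append(self, block):
        self.pending.append(block)
        self.pending_bytes += len(block)
        # Flush only when the memory tied up by blocks exceeds the threshold.
        if self.pending_bytes > self.flush_threshold:
            self.force()

    def force(self):
        # Voluntary force requested by a client: one batched write.
        if self.pending:
            self.write_to_disk(self.pending)
            self.pending = []
            self.pending_bytes = 0
```

Each call to `write_to_disk` receives every block accumulated since the last flush, which is how several potential disk accesses collapse into one.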
- Metadata file 136 contains metadata about multiplexed log 134.
- Metadata file 136 is typically stored in a stable storage media, such as storage unit 130 .
- Metadata file 136 may include many different kinds of information.
- metadata may include the owner pages of the regions of multiplexed log 134 .
- the owner pages for the regions are appended in multiplexed log 134 and metadata file 136 includes the location in multiplexed log 134 at which the last owner page in the log is appended. This location enables multiplexed transactional logging system 114 to locate the last owner page for recovering multiplexed log 134 after a system failure.
- Multiplexed transactional logging system 114 may defer one or more regions of log data in flush queue 116 .
- the owner page of the current region in the flush queue may be stored in volatile memory until the current region is filled.
- multiplexed transactional logging system 114 may be configured to immediately append the log blocks in the flush queue 116 to multiplexed log 134 .
- Multiplexed transactional logging system 114 may append the owner page associated with the region to metadata file 136 .
- Multiplexed transactional logging system 114 may also append the owner page to the region before appending the region to multiplexed log 134 .
- multiplexed transactional logging system 114 may be configured to improve performance by reducing the overhead associated with appending log blocks to multiplexed log 134 and owner pages to metadata file 136.
- multiplexed transactional logging system 114 is capable of deferring multiple regions of log blocks. For example, as shown in the figure, multiplexed transactional logging system 114 has appended log blocks to flush queue 116 up to current region 222. After receiving enough log data to fill current region 222, multiplexed transactional logging system 114 creates new owner page 340 for new region 224 and appends current owner page 310 associated with current region 222 to flush queue 116.
- Multiplexed transactional logging system 114 may copy some of the data in current owner page 310 to new owner page 340. Multiplexed transactional logging system 114 may append the log blocks in flush queue 116 to multiplexed log 134 when the size of flush queue 116 reaches a critical value, when a client instructs that its log blocks be immediately appended to multiplexed log 134, or when some other predetermined event occurs.
- the log blocks in flush queue 116 may be appended to multiplexed log 134 in any order.
- each region in flush queue 116 is appended to multiplexed log 134 in sequential order.
- the log blocks in each region may be appended in any order.
- FIG. 3 is a graphical representation of two exemplary owner pages.
- Current owner page 310 is associated with current region 222 shown in FIG. 2 and new owner page 340 is associated with new region 224 .
- an owner page contains information about client ownership of the log blocks in a region.
- the owner page is a special log block with the metadata that associates the log blocks with the clients.
- the owner pages may be stored at specified intervals within the multiplexed log so that the locations of the owner pages may be determined directly, as opposed to scanning the entire multiplexed log.
- An owner page may include an owner referral and an owner array.
- Owner referral 320 maps each client to a range of locations within the multiplexed log where log blocks owned by the client are found.
- owner referral 320 of current owner page 310 contains a minimum location identifier and a maximum location identifier for each client that has log blocks in current region 222 .
- the minimum location identifier identifies a location where the beginning of the client's first log block in the region is found.
- the maximum location identifier identifies a location where the end of the client's last log block in the multiplexed log is found.
- the minimum location identifiers and the maximum location identifiers are strictly monotonically increasing within a client's virtual log.
- Owner array 330 identifies the client owner of each of the sectors associated with current region 222 .
- New owner page 340 is an owner page created for a new region 224 after current region 222 has been filled. For illustrative purposes, no log data have been appended to new region 224 .
- To create new owner page 340 some of the data from current owner page 310 may be copied to new owner page 340 .
- the maximum location identifiers in the owner referral of a current owner page are copied to the owner referral of a new owner page. As shown in the figure, the maximum location identifier for each of the clients in owner referral 320 is copied to owner referral 350 .
- the minimum location identifiers in owner referral 350 are filled with place-holders. In this embodiment, only some of the data and not all the data are copied.
- New owner page 340 initializes its owner array to indicate that nothing has been written to its log region.
- copying maximum location identifiers from a current owner page to a new owner page enables the new owner page to identify where the last log block of each of the clients is located in the multiplexed log.
- the new owner page may be used as a look-up table for finding the end of each of the virtual logs in the multiplexed log.
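The owner page layout and the hand-off from a filled region to a new one can be sketched as follows. The field and function names (`Referral`, `OwnerPage`, `record`, `next_owner_page`) are assumptions for the example, `None` plays the role of the place-holder, and one block per sector is assumed for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Referral:
    min_loc: Optional[int]  # start of the client's first block in this region
    max_loc: Optional[int]  # end of the client's last block in the log so far


@dataclass
class OwnerPage:
    sectors: int
    referral: dict = field(default_factory=dict)  # client -> Referral
    owner_array: list = field(default_factory=list)

    def __post_init__(self):
        if not self.owner_array:
            # Nothing has been written to the region yet.
            self.owner_array = [None] * self.sectors

    def record(self, client, sector, start_loc, end_loc):
        self.owner_array[sector] = client
        ref = self.referral.setdefault(client, Referral(None, None))
        if ref.min_loc is None:
            ref.min_loc = start_loc
        ref.max_loc = end_loc


def next_owner_page(current):
    """Create the owner page for a new region from the current one."""
    new = OwnerPage(sectors=current.sectors)
    for client, ref in current.referral.items():
        # Only the maximum location identifier is carried forward.
        new.referral[client] = Referral(min_loc=None, max_loc=ref.max_loc)
    return new
```

Because every new page inherits the maximum identifiers, the most recent owner page alone is enough to look up where each client's virtual log currently ends.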
- FIG. 4 is an operational flow diagram of an exemplary process 400 for handling log blocks from a client. Moving from a start block, process 400 goes to block 410 where a log block is appended to a flush queue.
- the owner page of the current region is updated.
- This current owner page may be included in a metadata file stored in a stable storage medium.
- the current owner page is immediately modified and flushed to the metadata file to account for the newly appended log block.
- the process continues at decision block 415 .
- process 400 goes to block 420 where a new region is started and the current region is appended to the multiplexed log.
- the current owner page is appended to the current region and is appended to the multiplexed log along with the current region.
- the current owner page in the metadata file is overwritten to create a new owner page.
- certain data from the current owner page are transferred to the new owner page.
- Process 400 continues at decision block 440 .
- at decision block 440, a determination is made whether more log blocks are ready for appending to the multiplexed log. If so, process 400 returns to block 410. If no log block is ready for appending, the process ends.
- because process 400 requires the multiplexed log and the metadata file to be updated for each new log block, the multiplexed log is readily recoverable.
- however, a relatively large amount of system resources would have to be dedicated to constantly accessing the one or more stable storage media where the multiplexed log and the metadata file are stored.
- every log block requires accessing a stable storage medium (e.g. a hard disk) at least twice: once to write the metadata and once to append the log block to the multiplexed log.
- FIG. 5 is an operational flow diagram of another exemplary process 500 for handling log blocks from a client. Moving from start block, process 500 moves to block 510 where a log block is appended to a flush queue. At block 515 , the owner page of the current region is updated. The owner page may be stored in volatile memory. The current owner page is modified to account for the newly appended log data. Process 500 continues at decision block 520 .
- process 500 returns to block 510 . If no log block is ready for appending, the process ends.
- process 500 consumes less system resources and incurs less I/O overhead than process 400 discussed previously in conjunction with FIG. 4 .
- Deferring the log blocks in a flush queue before appending them to the multiplexed log and keeping the owner page of the current region in volatile memory reduce the frequency for accessing one or more stable storage media where the multiplexed log and the metadata file are stored.
- the disadvantage of process 500 is that the flush queue is forced to the multiplexed log at the end of every log region. Forcing the flush queue occurs when the end of a region is reached, as opposed to the voluntary intent of a log client. This is not desirable because during forward progress an efficient logging system should allow its clients to voluntarily determine when to incur a performance penalty associated with forcing the flush queue to a log.
- in process 500, since the multiplexed log and the metadata file are not updated until a complete region is actually appended and forced to non-volatile storage, a process is needed for recovering the multiplexed log in case a system failure occurs while log blocks are stored in the flush queue but before they actually make it to non-volatile storage.
- An exemplary recovery process associated with process 500 will be discussed in conjunction with FIG. 7 .
- FIG. 6 is an operational flow diagram of yet another exemplary process 600 for handling log blocks from a client. Moving from start block, process 600 moves to block 610 where a log block is appended to a flush queue. At block 615 , the owner page of the current region cached in volatile memory is updated to account for the newly appended log block. The process continues at decision block 620 .
- process 600 returns to block 610 . If no log block is ready for appending, the process ends.
- process 600 consumes even fewer system resources than process 500 discussed previously in conjunction with FIG. 5. Unlike process 400 and process 500, process 600 does not force a flush queue to be appended to a multiplexed log when an owner page is appended to the flush queue. Thus, process 600 enables clients to control when the flush queue is forced to stable storage in the multiplexed log.
- Process 600 also allows forward progress of the multiplexed log to incur little or no I/O overhead when compared with a dedicated log system. Thus, forward progress is scaleable because appending owner pages to the flush queue occurs in constant time and does not incur undesirable and unexpected overhead associated with forcing the flush queue to stable storage in the multiplexed log.
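A back-of-the-envelope comparison of the three forward-progress processes follows. It is a rough model under stated assumptions, not a measurement: each forced flush is counted as a single storage access, process 400 is charged its stated minimum of two accesses per block, process 500 is charged one forced flush per completed region, and process 600 is charged only for flushes that a client or the flush threshold triggers. The function name and parameters are hypothetical.

```python
def min_storage_accesses(process, n_blocks, blocks_per_region, forced_flushes=0):
    """Lower-bound count of stable-storage accesses for n_blocks log blocks."""
    if process == 400:
        # One metadata write plus one log append per block.
        return 2 * n_blocks
    if process == 500:
        # The flush queue is forced at the end of every region.
        return n_blocks // blocks_per_region
    if process == 600:
        # Flushes happen only on client request or threshold.
        return forced_flushes
    raise ValueError("unknown process")
```

For 1000 blocks and 100 blocks per region, this model gives at least 2000 accesses for process 400 and 10 for process 500, while process 600 pays only for the flushes actually requested.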
- FIG. 7 is an operational flow diagram of an exemplary process 700 for recovering a multiplexed log.
- Process 700 may be used to recover log blocks appended to a multiplexed log using process 500 described in conjunction with FIG. 5 .
- Process 700 begins after a system failure. Moving from a start block, the process moves to block 710 where the multiplexed log is opened. At block 715 , the last owner page in the multiplexed log is determined. The last owner page and its location in the multiplexed log are determined by referring to metadata associated with the multiplexed log.
- Process 700 continues at block 735 where the end of the multiplexed log is determined.
- the end of the multiplexed log may be determined by sequentially checking each log block from the start of the region associated with the last owner page. The log blocks of the region are sequentially checked until an invalid log block is determined, indicating the end of the multiplexed log.
- the process moves to block 740 where the last cached owner page is updated. For example, some of the entries in the owner page may have to be deleted to account for the log blocks that were not appended to the multiplexed log due to the system failure. Process 700 then ends.
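The sequential end-of-log scan of process 700 can be sketched as below. The source does not say how block validity is checked, so the trailing CRC-32 used here is purely an assumption for the example, as are the names `make_block`, `is_valid`, and `find_end_of_log`.

```python
import zlib


def make_block(payload):
    """Hypothetical block format: payload followed by a 4-byte CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_valid(block):
    if len(block) < 4:
        return False  # too short to hold a checksum, e.g. a torn write
    payload, stored = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


def find_end_of_log(blocks, region_start):
    """Index of the first invalid block at or after region_start."""
    i = region_start
    while i < len(blocks) and is_valid(blocks[i]):
        i += 1
    return i
```

The scan starts at the region named by the last owner page, so its cost depends on the size of one region rather than on the size of the whole log.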
- FIG. 8 is an operational flow diagram of another exemplary process 800 for recovering a multiplexed log.
- Process 800 may be used to recover log blocks appended to a multiplexed log using process 600 described in conjunction with FIG. 6 . Moving from a start block, the process moves to block 810 where the multiplexed log is opened.
- Process 800 moves to block 815 where location information of the last owner page in the multiplexed log is determined.
- the location information of the last owner page is typically stored in a metadata file as metadata. To improve performance, metadata may not be updated very frequently. Thus, the location information may not indicate the location of the last owner page that was actually appended to the multiplexed log. But the indicated location may be used as a starting point.
- the last valid owner page is determined.
- the last valid owner page may be determined beginning from the starting point indicated by the location information determined at block 815 and scanning forward in the multiplexed log at a fixed interval.
- the fixed interval may coincide with the size of the fixed size region. Scanning forward across owner pages may be performed by a linear scan or an exponential back out followed by a binary search of owner pages.
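One way to realize the exponential back-out followed by a binary search is sketched here. It assumes owner pages sit at a fixed interval (so page `i` is at offset `i * region_size`), that the page at the starting point is valid, and that every page after the last valid one is invalid. The names `owner_page_offset`, `last_valid_owner_page`, and the `page_valid` predicate are hypothetical.

```python
def owner_page_offset(index, region_size):
    """Owner pages at fixed intervals can be located by arithmetic alone."""
    return index * region_size


def last_valid_owner_page(start, page_valid):
    """Return the index of the last valid owner page at or after start.

    Assumes page_valid(start) is True and that validity never resumes
    once an invalid owner page has been seen.
    """
    lo, step = start, 1
    # Exponential back-out: double the stride until an invalid page is hit.
    while page_valid(lo + step):
        lo += step
        step *= 2
    hi = lo + step  # known to be invalid
    # Binary search between the last valid and first invalid probes.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if page_valid(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

The number of probes grows with the logarithm of the distance between the stale starting point and the true last owner page, whereas a linear scan would probe every intervening page.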
- process 800 continues at 825 .
- the log blocks in the region associated with the last valid owner page are checked. Many methods for checking data validity may be used. One exemplary method is linearly validating each block in the region.
- Process 800 continues at decision block 835 where a determination is made whether the region is valid. If not, the process goes to block 830 where the prior region is checked and loops back to decision block 835 . The loop continues until a valid region is found. Typically, the last valid region is further down the multiplexed log than the starting point. Then, process 800 moves to block 840 .
- the owner page is reconstructed in memory from the log blocks of the incomplete region and the end of the multiplexed log is determined.
- the last valid log block of multiplexed log may be determined by checking log blocks located after the last valid region. Each log block is checked for validity until an invalid log block is located. Information obtained from checking the log blocks may be used to reconstruct the owner page.
- once the owner page is reconstructed, the end of each of the virtual logs in the multiplexed log is readily determined from the reconstructed owner page. As discussed in conjunction with FIG. 3, maximum location identifiers are copied into the owner referral of a new owner page.
- maximum location identifiers in the owner referral of the reconstructed owner page identify the last log block of each of the clients in the multiplexed log. Thus, the end of each of the virtual logs is readily determined.
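The reconstruction step can be illustrated with a short sketch. It assumes that each log block header carries its owning client and end location, and that the maximum location identifiers carried over from the last valid region are already in hand; the dictionary layout and the name `reconstruct_virtual_log_ends` are hypothetical.

```python
def reconstruct_virtual_log_ends(carried_max, tail_blocks):
    """Rebuild per-client end-of-log positions after a failure.

    carried_max: client -> maximum location identifier taken from the
                 owner referral covering the last valid region.
    tail_blocks: blocks found after the last valid region, in log order,
                 each a dict with 'client', 'end', and 'valid' keys.
    """
    ends = dict(carried_max)
    for block in tail_blocks:
        if not block["valid"]:
            break  # first invalid block marks the end of the multiplexed log
        ends[block["client"]] = block["end"]
    return ends
```

After this pass, finding the end of any client's virtual log is a dictionary lookup, independent of how long the multiplexed log has grown.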
- logging functionality on the multiplexed log is restored and process 800 ends.
- log recovery is scaleable because the process involves a bounded scan of regions towards the end of the multiplexed log. The bound is determined by the flush threshold, which is typically set by the log clients. Finding the end of the multiplexed log and the end of each of the virtual logs is also scaleable because the process is a constant time and space table lookup independent of the size of the multiplexed log and the number of clients. Furthermore, after recovery, little or no I/O overhead is incurred since the owner referral of the last region that was recovered by process 800 is already reconstructed in memory.
- multiplexed log recovery process 800 in conjunction with the forward progress process 600 minimizes the log I/O overhead during forward progress of the multiplexed log at the expense of a more elaborate recovery scheme after system failure.
- the client, not the logging system, determines when the flush queue is forced to stable storage.
- the normal forward progress of the multiplexed log is efficient.
- the advantages of having an efficient forward progress are offset only in the rare event of a log recovery after a system failure. But even this offset is minimal because the multiplexed log recovery process 800 is scaleable.
- the system and method of the present invention optimize normal forward progress of a multiplexed log with the compromise of a more elaborate recovery process in the exceptional case of log recovery. With a recovery process that is scaleable, determination of the end of each of the virtual logs in the multiplexed log requires very little effort.
Abstract
A system and method for providing scaleable recovery for a multiplexed transactional log. Unlike a dedicated log that includes log data of only one client, a multiplexed transactional log may include log data from multiple clients. In a multiplexed transactional log, log data from different clients may be multiplexed into the multiplexed transactional log in any order. The multiplexed log is maintained for multiple clients using a scaleable logging process. After a system failure, the multiplexed log is recovered using a scaleable recovery process. The scaleable recovery process includes an end-of-log locating process for locating the end of the multiplexed log and each of the virtual logs within the multiplexed log. The end-of-log locating process is also scaleable.
Description
- Transactional logging involves maintaining a transactional log that durably records a time serial history of transactions in a system. The transactional log provides information for restoring a system to a particular state in time prior to a system failure. A transactional logging system must be able to reliably and accurately restore logging functionalities after such a system failure.
- Traditionally, transactional logging systems have used a dedicated log to support a single log client. Dedicated transactional logging systems are typically very robust and achieve a high performance level. However, the inventor has determined that the high level of reliability of a single log client using a dedicated logging system may actually result in overall performance degradation for a computing environment where multiple log clients are using multiple dedicated logging systems. One of the reasons for this is that each dedicated logging system independently incurs input/output (I/O) overhead to write and retrieve information. The I/O overhead results in adverse performance impact, and the impact is cumulative for each of the independent transactional logging systems. An improved transactional logging system is desirable that could overcome some of these performance problems but could still allow reliable system recovery.
- Briefly stated, the present invention provides scaleable recovery for a multiplexed transactional log. Unlike a dedicated log that includes log data of only one client, a multiplexed transactional log may include log data from multiple clients. In a multiplexed transactional log, log data from different clients may be multiplexed into the multiplexed transactional log in any order. The log data associated with a particular client is represented by a virtual log of that client within the multiplexed transactional log. In one aspect, the invention is directed to a computer-implemented method for transactional logging using a multiplexed log. The computer-implemented method maintains a multiplexed log for multiple clients using a scaleable logging process. After a system failure, the computer-implemented method recovers the multiplexed log using a scaleable recovery process. The scaleable recovery process includes an end-of-log locating process for locating the end of each virtual log within the multiplexed log. The end-of-log locating process is also scaleable.
- In yet another aspect, the invention is directed to a computer-implemented method for maintaining a recoverable transactional log. A log block containing log data is received from one of the clients. The log block is appended to a current region in a flush queue. Metadata associated with the current region is updated to account for the newly appended log block in the current region. If the end of the current region is reached, the metadata is appended to the current region in the flush queue.
- In still another aspect, the invention is directed to a computer-implemented method for recovering a transactional log after a system failure. A starting point in the transactional log is determined by referring to metadata associated with the transactional log. The last valid owner page within the transactional log is located by checking at discrete intervals from the starting point toward the end of the transactional log. The method checks the validity of a region in the transactional log associated with the last valid owner page. If the region associated with the last valid owner page is valid, the first invalid log block in an incomplete region is located, where the incomplete region lies beyond the last valid region toward the end of the transactional log. The end of the transactional log is found when the first invalid log block is located.
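The recovery steps in this aspect can be illustrated with a minimal Python sketch. It is not the claimed implementation. The in-memory list standing in for the log, the `REGION` size, the placement of the owner page in the last slot of each region, the `ok` flag used as a validity check, and the names `blk`, `page`, `valid` and `recover` are all assumptions made for illustration.

```python
REGION = 4  # assumed slots per region; the last slot holds the owner page


def blk(client, ok=True):
    """Hypothetical log block owned by `client`; `ok` models a validity check."""
    return {"kind": "block", "client": client, "ok": ok}


def page(max_loc, ok=True):
    """Hypothetical owner page carrying each client's maximum location."""
    return {"kind": "owner_page", "max_loc": max_loc, "ok": ok}


def valid(entry, kind=None):
    return entry is not None and entry["ok"] and (kind is None or entry["kind"] == kind)


def recover(log, start_page_loc):
    """Return (end_of_log, end of each virtual log) from a possibly stale start."""
    # 1. Probe owner-page slots at fixed intervals to find the last valid one.
    page_loc = start_page_loc
    while page_loc + REGION < len(log) and valid(log[page_loc + REGION], "owner_page"):
        page_loc += REGION
    # 2. Step back region by region until the whole region validates.
    while page_loc >= REGION - 1 and not all(
        valid(e) for e in log[page_loc - REGION + 1 : page_loc + 1]
    ):
        page_loc -= REGION
    max_loc = dict(log[page_loc]["max_loc"]) if page_loc >= REGION - 1 else {}
    # 3. Scan the incomplete region until the first invalid log block,
    #    rebuilding the per-client maxima along the way.
    end = page_loc + 1
    while end < len(log) and valid(log[end], "block"):
        max_loc[log[end]["client"]] = end
        end += 1
    return end, max_loc
```

In this sketch the scan in step 3 is bounded by the length of the incomplete tail, and the end of each virtual log is read from the rebuilt table rather than found by scanning each virtual log separately.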
- In yet another aspect, the present invention is directed to a transactional logging system that includes a transactional log, a metadata file, and a multiplexed transactional logging component. The transactional log is typically stored in a storage unit. The transactional log contains log blocks from clients and owner pages that include information on how the log blocks are organized in the transactional log. The metadata file includes information about the transactional log. The multiplexed transactional logging component is configured to append the log blocks and the owner pages to the transactional log and to recover the transactional log after a system failure using information in the owner pages and the metadata file.
- FIG. 1 illustrates an exemplary computing device that implements the present invention.
- FIG. 2 is a schematic diagram of a multiplexed transactional logging system.
- FIG. 3 is a graphical representation of two exemplary owner pages.
- FIG. 4 is an operational flow diagram of an exemplary process for handling log blocks from a client.
- FIG. 5 is an operational flow diagram of another exemplary process for handling log blocks from a client.
- FIG. 6 is an operational flow diagram of yet another exemplary process for handling log blocks from a client.
- FIG. 7 is an operational flow diagram of an exemplary process for recovering a multiplexed log.
- FIG. 8 is an operational flow diagram of another exemplary process for recovering a multiplexed log.
- The inventor of the present invention has appreciated that logging system recovery is an important aspect of a multiplexed transactional logging system. Thus, the present invention focuses on recovering a multiplexed log after a system failure and restoring logging functionalities. For a multiplexed log, restoring logging functionality typically includes determining the end of each of the virtual logs within the multiplexed log. The invention provides a number of methods for locating the end of a multiplexed log and the end of each of the virtual logs within the multiplexed log. The manner in which multiplexed logs are recovered in the present invention is very different from the manner in which dedicated logs are recovered. For example, to recover a dedicated log, a conventional method that scans the entire dedicated log from its last written restart area to locate the end of the log is typically used. Generally, the dedicated log may be scanned sequentially or logarithmically using a binary search algorithm. However, this conventional method is not practical for recovering a multiplexed log. Using this conventional method, each of the virtual logs within the multiplexed log would have to be located by scanning. The number of scans for log recovery proportionally increases with the size and the number of virtual logs within the multiplexed log. The amount of time and system resources required by conventional log recovery methods is prohibitive, especially for a large scale multiplexed transactional logging system.
- The present invention provides an improved system and method that enables multiplexed log recovery but requires significantly less time and fewer system resources. In one configuration, the maintenance and recovery of the multiplexed log are scaleable (independent of the size of the multiplexed log and the number of clients). These and other aspects of the invention will become apparent after reading the following detailed description.
- FIG. 1 illustrates an exemplary computing device 100 that may be used in one exemplary embodiment of the present invention. With reference to FIG. 1, one exemplary system for implementing the invention includes a computing device, such as computing device 100. In a very basic configuration, computing device 100 typically includes at least one processing unit 102 and system memory 104. Depending on the exact configuration and type of computing device, system memory 104 may include volatile memory (such as RAM 106), non-volatile memory (such as ROM 110, flash memory, etc.), and storage unit 130 (such as a hard drive or other stable storage devices).
- Computing device 100 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Computer storage media may include volatile and nonvolatile memory, storage units, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Thus, computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 100. Any such computer storage media may be part of computing device 100. Computing device 100 may also include input component(s) 140 such as keyboard 122, mouse 123, pen, voice input device, touch input device, etc. Output component(s) 145 such as a display, speakers, printer, etc. may also be included. These devices are well known in the art and need not be discussed at length here.
- Computing device 100 may also contain communication connection(s) 150 that allow computing device 100 to communicate with other computing devices, such as over one or more network(s) 160. Signals used by communication connection(s) 150 are one example of communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term computer readable media as used herein includes both storage media and communication media.
- System memory 104 typically includes BIOS 111, operating system 112, and one or more applications 120. As shown in the figure, system memory 104 may include multiplexed transactional logging system 114. Multiplexed transactional logging system 114 is a computer executable component that provides logging services for applications 120, such as Client A and Client B. For illustrative purposes, multiplexed transactional logging system 114 is shown as a part of the kernel of computing device 100. But multiplexed transactional logging system 114 may be implemented as a separate application executing in either the kernel mode or the user mode of computing device 100. Multiplexed transactional logging system 114 may also be implemented as two or more components executing in either mode.
- Multiplexed transactional logging system 114 is configured to maintain and retrieve log data for applications 120. Multiplexed transactional logging system 114 maintains the log data in multiplexed log 134 stored in storage unit 130. The log data from different applications 120 are multiplexed by multiplexed transactional logging system 114 before being appended to multiplexed log 134. Typically, log data are organized into fixed size log blocks. Log blocks are grouped into regions that make up the multiplexed log 134.
- Multiplexed transactional logging system 114 may defer log data in flush queue 116 before appending the log data to multiplexed log 134. Metadata, which is information about the structure and organization of the multiplexed log 134, may be included in metadata file 136 stored in storage unit 130. Metadata for multiplexed log 134 may also be appended to multiplexed log 134 as owner pages. Owner pages will be discussed in more detail in conjunction with FIG. 3. Briefly stated, an owner page contains metadata of a particular region in the multiplexed log. One or more owner pages 109 may be stored in volatile memory before they are appended to multiplexed log 134.
- FIG. 2 is a schematic diagram of multiplexed transactional logging system 114. Multiplexed transactional logging system 114 provides logging services to multiple clients, such as Clients A, B and C. Each client is an application that maintains a log through the multiplexed transactional logging system 114. Examples of the clients may be a database application, a transactional file system, or the like. Clients A and B, which are also shown in FIG. 1, are applications executing in the same computing device on which multiplexed transactional logging system 114 is executing. As shown in the figure, Client C is an application executing on a remote computing device. Multiplexed transactional logging system 114 is capable of providing logging services to remote applications such as Client C.
- Multiplexed transactional logging system 114 is configured to provide the illusion to each of Clients A, B, and C that a separate, dedicated log is being maintained for each client. In other words, Clients A, B, and C send log data to multiplexed transactional logging system 114 with the expectation that the log data are stored in dedicated logs. For the purpose of this discussion, the illusory dedicated logs are referred to as “virtual logs,” represented in the figure as virtual logs 211-213. In actuality, multiplexed transactional logging system 114 multiplexes and appends the log data from each client to multiplexed log 134, which is shared by Clients A, B, and C. The log data are typically organized into log blocks. A log block is a unit of physical log I/O that includes a fixed-sized log block header and a body which may be of any size. In one embodiment, the size of log blocks may be a multiple of the size of a sector associated with storage unit 130.
- Multiplexed transactional logging system 114 may be configured to maintain owner pages, which are data structures that contain information about how log blocks are arranged in multiplexed log 134. Owner pages will be discussed in more detail in conjunction with FIG. 3. Briefly stated, an owner page may include information about the ownership of log blocks in a region of multiplexed log 134. Multiplexed transactional logging system 114 may use the metadata in the owner pages to organize log blocks as virtual logs 211-213 for providing logging services to Clients A, B, and C. Multiplexed transactional logging system 114 may also use the metadata in owner pages for recovering logging functionalities after a system failure. Owner pages may be appended to multiplexed log 134 as shown in the figure. Owner pages may also be appended to metadata file 136, which is a data structure separate from multiplexed log 134.
- In operation, multiplexed transactional logging system 114 may receive log blocks from Clients A, B and C at different times and in any order. Multiplexed transactional logging system 114 multiplexes the log blocks by appending them to a single multiplexed log 134. Multiplexed transactional logging system 114 may defer appending the multiplexed log blocks using flush queue 116. Flush queue 116 is a data structure that represents the multiplexed log blocks that will be appended to multiplexed log 134. Flush queue 116 is typically stored in volatile memory. Multiplexed transactional logging system 114 may be configured to use flush queue 116 for reducing the need to access storage unit 130 and improving system performance.
- For example, if storage unit 130 is a hard drive, a performance overhead is incurred every time multiplexed transactional logging system 114 accesses the hard drive to append multiplexed log blocks. Multiplexed transactional logging system 114 may be configured to write log blocks in flush queue 116 to the hard drive only when the user voluntarily requests that the log blocks be forced to the hard drive or when memory tied up by log blocks has exceeded a user-defined flush threshold. By accumulating log blocks in volatile memory using a flush queue, the performance overhead is reduced by amortizing multiple potential accesses to the hard drive into a single hard drive access.
- To facilitate management of log blocks, multiplexed transactional logging system 114 maintains metadata file 136 that contains metadata about the multiplexed log 134. Metadata file 136 is typically stored in a stable storage medium, such as storage unit 130. Metadata file 136 may include many different kinds of information. For example, metadata may include the owner pages of the regions of multiplexed log 134. In one embodiment of the invention, the owner pages for the regions are appended in multiplexed log 134 and metadata file 136 includes the location in multiplexed log 134 at which the last owner page in the log is appended. This location enables multiplexed transactional logging system 114 to locate the last owner page for recovering multiplexed log 134 after a system failure.
- Multiplexed transactional logging system 114 may defer one or more regions of log data in flush queue 116. The owner page of the current region in the flush queue may be stored in volatile memory until the current region is filled. After receiving enough log blocks to fill a region, multiplexed transactional logging system 114 may be configured to immediately append the log blocks in the flush queue 116 to multiplexed log 134. Multiplexed transactional logging system 114 may append the owner page associated with the region to metadata file 136. Multiplexed transactional logging system 114 may also append the owner page to the region before appending the region to multiplexed log 134.
- In one embodiment, multiplexed transactional logging system 114 may be configured to improve performance by reducing the overhead associated with appending log blocks to multiplexed log 134 and owner pages to metadata file 136. In this configuration, multiplexed transactional logging system 114 is capable of deferring multiple regions of log blocks. For example, as shown in the figure, multiplexed transactional logging system 114 has appended log blocks to flush queue 116 up to current region 222. After receiving enough log data to fill current region 222, multiplexed transactional logging system 114 creates new owner page 340 for new region 224 and appends current owner page 310 associated with current region 222 to the flush queue 116. Multiplexed transactional logging system 114 may copy some of the data in current owner page 310 to new owner page 340. Multiplexed transactional logging system 114 may append the log blocks in flush queue 116 to multiplexed log 134 when the size of the flush queue 116 reaches a critical value, when a client instructs that its log blocks be immediately appended to multiplexed log 134, or upon some other predetermined event.
- The log blocks in flush queue 116 may be appended to multiplexed log 134 in any order. Typically, each region in flush queue 116 is appended to multiplexed log 134 in sequential order. The log blocks in each region may be appended in any order.
- FIG. 3 is a graphical representation of two exemplary owner pages. Current owner page 310 is associated with current region 222 shown in FIG. 2 and new owner page 340 is associated with new region 224. Generally stated, an owner page contains information about client ownership of the log blocks in a region. In one embodiment, the owner page is a special log block with the metadata that associates the log blocks with the clients. The owner pages may be stored at specified intervals within the multiplexed log so that the locations of the owner pages may be determined directly, as opposed to scanning the entire multiplexed log. An owner page may include an owner referral and an owner array.
- Owner referral 320 maps each client to a range of locations within the multiplexed log where log blocks owned by the client are found. As shown in the figure, owner referral 320 of current owner page 310 contains a minimum location identifier and a maximum location identifier for each client that has log blocks in current region 222. The minimum location identifier identifies a location where the beginning of the client's first log block in the region is found. The maximum location identifier identifies a location where the end of the client's last log block in the multiplexed log is found. In one embodiment, the minimum location identifiers and the maximum location identifiers are strictly monotonically increasing within a client's virtual log. Owner array 330 identifies the client owner of each of the sectors associated with current region 222.
- New owner page 340 is an owner page created for a new region 224 after current region 222 has been filled. For illustrative purposes, no log data have been appended to new region 224. To create new owner page 340, some of the data from current owner page 310 may be copied to new owner page 340. In one embodiment, the maximum location identifiers in the owner referral of a current owner page are copied to the owner referral of a new owner page. As shown in the figure, the maximum location identifier for each of the clients in owner referral 320 is copied to owner referral 350. The minimum location identifiers in owner referral 350 are filled with place-holders. In this embodiment, only some of the data and not all the data are copied. New owner page 340 initializes its owner array to indicate that nothing has been written to its log region.
- It is to be appreciated that copying maximum location identifiers from a current owner page to a new owner page enables the new owner page to identify where the last log block of each of the clients is located in the multiplexed log. In other words, the new owner page may be used as a look-up table for finding the end of each of the virtual logs in the multiplexed log. A scaleable process that determines the end of a multiplexed log using the owner referral of an owner page will be discussed in detail in conjunction with FIG. 8.
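The owner page described in conjunction with FIG. 3 might be modeled along the following lines. This is only a sketch under stated assumptions: the class name `OwnerPage`, the fields `base`, `sectors`, `referral` and `owners`, the use of absolute sector numbers as location identifiers, and `None` as the place-holder are all hypothetical choices, not details taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OwnerPage:
    """Hypothetical in-memory owner page for one fixed-size region."""
    base: int      # absolute sector number at which the region starts (assumed)
    sectors: int   # number of sectors in the region (assumed)
    # Owner referral: client -> [minimum location, maximum location].
    referral: Dict[str, List[Optional[int]]] = field(default_factory=dict)
    # Owner array: owning client of each sector; None means nothing written.
    owners: List[Optional[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.owners:
            self.owners = [None] * self.sectors

    def record(self, client: str, start: int, length: int) -> None:
        """Account for a log block of `length` sectors written at `start`."""
        entry = self.referral.setdefault(client, [None, None])
        if entry[0] is None:
            entry[0] = start              # beginning of first block in region
        entry[1] = start + length         # end of the client's last block
        for sector in range(start, start + length):
            self.owners[sector - self.base] = client

    def roll_over(self) -> "OwnerPage":
        """Create the next region's page, copying only the maximum locations."""
        new_page = OwnerPage(base=self.base + self.sectors, sectors=self.sectors)
        for client, (_, max_loc) in self.referral.items():
            new_page.referral[client] = [None, max_loc]  # None = place-holder
        return new_page


def end_of_virtual_log(page: OwnerPage, client: str) -> Optional[int]:
    """Table lookup for the end of one client's virtual log."""
    return page.referral[client][1]
```

Because `roll_over` carries every client's maximum location forward, the lookup in `end_of_virtual_log` works on the newest page even for a client that wrote nothing in the newest region.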
- FIG. 4 is an operational flow diagram of an exemplary process 400 for handling log blocks from a client. Moving from a start block, process 400 goes to block 410 where a log block is appended to a flush queue.
- At block 412, the owner page of the current region is updated. This current owner page may be included in a metadata file stored in a stable storage medium. The current owner page is immediately modified and flushed to the metadata file to account for the newly appended log block. The process continues at decision block 415.
- At decision block 415, a determination is made whether the end of the current region is reached. If so, process 400 goes to block 420 where a new region is started and the current region is appended to the multiplexed log. The current owner page is appended to the current region and is appended to the multiplexed log along with the current region. At block 425, the current owner page in the metadata file is overwritten to create a new owner page. At block 430, certain data from the current owner page are transferred to the new owner page. Process 400 continues at decision block 440.
- Returning to decision block 415, if the end of the current region is not reached, the process continues at decision block 440. At decision block 440, a determination is made whether more log blocks are ready for appending to the multiplexed log. If so, process 400 returns to block 410. If no log block is ready for appending, the process ends.
- Since process 400 requires the multiplexed log and the metadata file to be updated for each new log block, the multiplexed log is readily recoverable. However, it is to be appreciated that a relatively large amount of system resources would have to be dedicated to constantly accessing one or more stable storage media where the multiplexed log and the metadata file are stored. In particular, every log block requires accessing a stable storage medium (e.g. a hard disk) at least twice: once to write the metadata and once to append the log block to the multiplexed log.
- FIG. 5 is an operational flow diagram of another exemplary process 500 for handling log blocks from a client. Moving from a start block, process 500 moves to block 510 where a log block is appended to a flush queue. At block 515, the owner page of the current region is updated. The owner page may be stored in volatile memory. The current owner page is modified to account for the newly appended log data. Process 500 continues at decision block 520.
- At decision block 520, a determination is made whether the end of the current region is reached. If so, a new region is started in the flush queue and process 500 goes to block 525 where a new owner page associated with the new region is created in volatile memory. At block 530, certain data from the current owner page are transferred to the new owner page. At block 535, the current owner page in a metadata file is replaced with a new owner page for the new region. The metadata file may be stored in a stable storage medium. At block 540, the current region in the flush queue is forced to the multiplexed log. Process 500 continues at decision block 545.
- At decision block 545, a determination is made whether more log blocks are ready for appending. If so, process 500 returns to block 510. If no log block is ready for appending, the process ends.
- It is to be appreciated that process 500 consumes fewer system resources and incurs less I/O overhead than process 400 discussed previously in conjunction with FIG. 4. Deferring the log blocks in a flush queue before appending them to the multiplexed log and keeping the owner page of the current region in volatile memory reduce the frequency of accessing one or more stable storage media where the multiplexed log and the metadata file are stored. The disadvantage of process 500 is that the flush queue is forced to the multiplexed log at the end of every log region. Forcing the flush queue occurs when the end of a region is reached, as opposed to at the voluntary request of a log client. This is not desirable because during forward progress an efficient logging system should allow its clients to voluntarily determine when to incur a performance penalty associated with forcing the flush queue to a log.
- For process 500, since the multiplexed log and the metadata file are not updated until a complete region is actually appended and forced to non-volatile storage, a process is needed for recovering the multiplexed log in case a system failure occurs while log blocks are stored in the flush queue but before they actually make it to non-volatile storage. An exemplary recovery process associated with process 500 will be discussed in conjunction with FIG. 7.
- FIG. 6 is an operational flow diagram of yet another exemplary process 600 for handling log blocks from a client. Moving from a start block, process 600 moves to block 610 where a log block is appended to a flush queue. At block 615, the owner page of the current region cached in volatile memory is updated to account for the newly appended log block. The process continues at decision block 620.
- At decision block 620, a determination is made whether the end of the current region is reached. If so, a new region is started in the flush queue and process 600 goes to block 625 where a new owner page associated with the new region is created in volatile memory. At block 630, certain data from the current owner page are transferred to the new owner page. At block 635, the current owner page is appended to the flush queue as a log block. It is to be appreciated that process 600 does not require the owner page to be stored separately and immediately in a stable storage medium. The process also enables multiple regions of log blocks to be appended to the flush queue. Thus, process 600 reduces system overhead but still allows recovery of the multiplexed log.
- At decision block 640, a determination is made whether more log blocks are ready for appending. If so, process 600 returns to block 610. If no log block is ready for appending, the process ends.
- It is to be appreciated that process 600 consumes even fewer system resources than process 500 discussed previously in conjunction with FIG. 5. Unlike process 400 and process 500, process 600 does not force a flush queue to be appended to a multiplexed log when an owner page is appended to the flush queue. Thus, process 600 enables clients to control when the flush queue is forced to stable storage in the multiplexed log.
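The forward progress of process 600 can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: the class `MultiplexedLogger`, its attribute names, the use of one location slot per log block, and the list that stands in for stable storage are all assumptions. Comments map each step to the block numbers of FIG. 6.

```python
class MultiplexedLogger:
    """Hypothetical forward-progress logger in the style of process 600."""

    def __init__(self, blocks_per_region):
        self.blocks_per_region = blocks_per_region
        self.log = []            # stands in for the multiplexed log on stable storage
        self.flush_queue = []    # deferred entries held in volatile memory
        self.forced_writes = 0   # number of times the queue was forced to the log
        self.next_location = 0   # next slot in the multiplexed log (assumed unit)
        self.filled = 0          # log blocks in the current region so far
        self.owner_page = {}     # client -> [min location, max location]

    def append(self, client, payload):
        loc = self.next_location
        self.flush_queue.append(("block", client, loc, payload))   # block 610
        self.next_location += 1
        entry = self.owner_page.setdefault(client, [None, None])   # block 615
        if entry[0] is None:
            entry[0] = loc
        entry[1] = loc
        self.filled += 1
        if self.filled == self.blocks_per_region:                  # block 620
            # Blocks 625 and 630: new page keeps only the maximum locations.
            new_page = {c: [None, mx] for c, (_, mx) in self.owner_page.items()}
            # Block 635: the owner page joins the queue as an ordinary entry;
            # nothing is forced to stable storage here.
            snapshot = {c: tuple(v) for c, v in self.owner_page.items()}
            self.flush_queue.append(("owner_page", snapshot))
            self.next_location += 1
            self.owner_page = new_page
            self.filled = 0

    def force(self):
        """Client-requested flush: one write covers every deferred entry."""
        if self.flush_queue:
            self.log.extend(self.flush_queue)
            self.flush_queue.clear()
            self.forced_writes += 1
```

In this sketch several regions can accumulate in `flush_queue` before any write happens, which is the property that lets clients, rather than region boundaries, decide when the I/O cost is paid.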
- Process 600 also allows forward progress of the multiplexed log to incur little or no I/O overhead when compared with a dedicated log system. Thus, forward progress is scaleable because appending owner pages to the flush queue occurs in constant time and does not incur undesirable and unexpected overhead associated with forcing the flush queue to stable storage in the multiplexed log.
- However, because multiple regions of log blocks may be in the flush queue when a system failure occurs, a sophisticated process is required to recover a multiplexed log maintained by process 600. An exemplary recovery process associated with process 600 will be discussed in conjunction with FIG. 8.
- FIG. 7 is an operational flow diagram of an exemplary process 700 for recovering a multiplexed log. Process 700 may be used to recover log blocks appended to a multiplexed log using process 500 described in conjunction with FIG. 5. Process 700 begins after a system failure. Moving from a start block, the process moves to block 710 where the multiplexed log is opened. At block 715, the last owner page in the multiplexed log is determined. The last owner page and its location in the multiplexed log are determined by referring to metadata associated with the multiplexed log.
- Process 700 continues at block 735 where the end of the multiplexed log is determined. The end of the multiplexed log may be determined by sequentially checking each log block from the start of the region associated with the last owner page. The log blocks of the region are sequentially checked until an invalid log block is found, indicating the end of the multiplexed log. After the end of the multiplexed log is determined, the process moves to block 740 where the last cached owner page is updated. For example, some of the entries in the owner page may have to be deleted to account for the log blocks that were not appended to the multiplexed log due to the system failure. Process 700 then ends.
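A minimal sketch of the scan at blocks 735 and 740 follows. The specification does not say how a log block is validated, so the CRC-32 check here is an assumption, as are the dictionary layout of a block, the list that stands in for the log, and the names `make_block`, `is_valid` and `recover_region`.

```python
import zlib


def make_block(client, payload):
    """Hypothetical log block carrying a checksum of its payload."""
    return {"client": client, "payload": payload, "crc": zlib.crc32(payload)}


def is_valid(block):
    """Assumed validity check: the stored CRC-32 matches the payload."""
    return block is not None and zlib.crc32(block["payload"]) == block["crc"]


def recover_region(log, region_start, owner_array):
    """Find the end of the log and trim the cached owner array to match.

    `region_start` is the location of the region associated with the last
    owner page, as read from the metadata. `owner_array` lists the client
    that the cached owner page expects in each slot of that region.
    """
    # Block 735: check log blocks sequentially until the first invalid one.
    end = region_start
    while end < len(log) and is_valid(log[end]):
        end += 1
    # Block 740: drop entries for blocks that never reached stable storage.
    written = end - region_start
    trimmed = [owner if i < written else None for i, owner in enumerate(owner_array)]
    return end, trimmed
```

The scan touches at most one region, so its cost in this sketch depends on the region size and not on the total length of the log.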
- FIG. 8 is an operational flow diagram of another exemplary process 800 for recovering a multiplexed log. Process 800 may be used to recover log blocks appended to a multiplexed log using process 600 described in conjunction with FIG. 6. Moving from a start block, the process moves to block 810 where the multiplexed log is opened.
Process 800 moves to block 815 where location information of the last owner page in the multiplexed log is determined. The location information of the last owner page is typically stored in a metadata file as metadata. To improve performance, the metadata may be updated infrequently. Thus, the location information may not indicate the location of the last owner page that was actually appended to the multiplexed log, but the indicated location may be used as a starting point.
At block 820, the last valid owner page is determined. The last valid owner page may be determined by beginning from the starting point indicated by the location information determined at block 815 and scanning forward in the multiplexed log at a fixed interval. The fixed interval may coincide with the size of the fixed size region. Scanning forward across owner pages may be performed by a linear scan or by an exponential back out followed by a binary search of owner pages. When the last valid owner page is located, process 800 continues at block 825.
At block 825, the log blocks in the region associated with the last valid owner page are checked. Many methods for checking data validity may be used. One exemplary method is linearly validating each block in the region. Process 800 continues at decision block 835 where a determination is made whether the region is valid. If not, the process goes to block 830 where the prior region is checked and loops back to decision block 835. The loop continues until a valid region is found. Typically, the last valid region is further down the multiplexed log than the starting point. Then, process 800 moves to block 840.
At block 840, the owner page is reconstructed in memory from the log blocks of the incomplete region and the end of the multiplexed log is determined. The last valid log block of the multiplexed log may be determined by checking log blocks located after the last valid region. Each log block is checked for validity until an invalid log block is located. Information obtained from checking the log blocks may be used to reconstruct the owner page. When the owner page is reconstructed, the end of each of the virtual logs in the multiplexed log is readily determined from the reconstructed owner page. As discussed in conjunction with FIG. 3, maximum location identifiers are copied into the owner referral of a new owner page. For process 800, maximum location identifiers in the owner referral of the reconstructed owner page identify the last log block of each of the clients in the multiplexed log. Thus, the end of each of the virtual logs is readily determined. At block 845, logging functionality on the multiplexed log is restored and process 800 ends.
It is appreciated that, using process 800, log recovery is scaleable because the process involves a bounded scan of regions towards the end of the multiplexed log. The bound is determined by the flush threshold, which is typically set by the log clients. Finding the end of the multiplexed log and the end of each of the virtual logs is also scaleable because the process is a constant time and space table lookup independent of the size of the multiplexed log and the number of clients. Furthermore, after recovery, little or no I/O overhead is incurred since the owner referral of the last region that was recovered by process 800 is already reconstructed in memory.
It is further appreciated that multiplexed log recovery process 800 in conjunction with the forward progress process 600 minimizes the log I/O overhead during forward progress of the multiplexed log at the expense of a more elaborate recovery scheme after system failure. During normal forward progress, the client, not the logging system, determines when the flush queue is forced to stable storage. Thus, the normal forward progress of the multiplexed log is efficient. The advantages of having an efficient forward progress are offset only in the rare event of a log recovery after a system failure. But even this offset is minimal because the multiplexed log recovery process 800 is scaleable. Thus, the system and method of the present invention optimize normal forward progress of a multiplexed log with the compromise of a more elaborate recovery process in the exceptional case of log recovery. With a recovery process that is scaleable, determination of the end of each of the virtual logs in the multiplexed log requires very little effort.
The above specification, examples and data provide a complete description of the invention. Since many embodiments of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.
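The recovery steps of process 800 (blocks 820 through 840) can be sketched in Python as follows. This is a sketch under stated assumptions rather than the patented implementation. Regions are addressed by integer index, validity is supplied as a caller-provided predicate, the starting owner page is assumed to be valid, and owner-page validity is assumed to be monotone (valid up to the last one written, invalid afterwards). All function names are hypothetical.

```python
def last_valid_owner_page(start, owner_page_valid):
    """Block 820: exponential back out followed by a binary search.
    `start` is the (possibly stale) region index taken from the metadata."""
    lo, step = start, 1
    # Exponential phase: double the stride until an invalid page is hit.
    while owner_page_valid(lo + step):
        lo += step
        step *= 2
    hi = lo + step  # known to be invalid
    # Binary search with the invariant: lo is valid, hi is invalid.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if owner_page_valid(mid):
            lo = mid
        else:
            hi = mid
    return lo


def last_valid_region(candidate, region_valid):
    """Blocks 825 to 835: step back one region at a time until one validates."""
    while candidate >= 0 and not region_valid(candidate):
        candidate -= 1
    return candidate


def reconstruct_owner_page(tail_blocks):
    """Block 840: scan the incomplete region until the first invalid block,
    recording the maximum location seen for each client."""
    max_location = {}
    end = 0
    for offset, block in enumerate(tail_blocks):
        if block is None:  # stand-in for a failed validity check
            break
        client, _payload = block
        max_location[client] = offset
        end = offset + 1
    return max_location, end


# The metadata points at region 2, but owner pages up to region 37 were written.
page = last_valid_owner_page(2, lambda i: i <= 37)
region = last_valid_region(page, lambda r: r <= 35)
owner, end = reconstruct_owner_page(
    [("A", "x"), ("B", "y"), ("A", "z"), None, ("B", "stale")]
)
```

The exponential phase keeps the number of owner-page reads logarithmic in the distance between the stale starting point and the true tail, which is one way to read the description's claim that recovery involves only a bounded scan near the end of the log.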
Claims (21)
1-57. (canceled)
58. A computer-implemented method for maintaining a transactional log that is multiplexed, comprising:
receiving log blocks from clients; wherein the log blocks that are received from each of the clients are unrelated such that recovery of data for one of the clients does not depend upon the log blocks of another client;
appending in any order each of the log blocks to a current region in a flush queue; wherein the flush queue is configured to store a predetermined number of log blocks;
determining when a size of the current region reaches a predetermined value;
updating a current owner page associated with the current region; and
appending the current owner page and the log blocks that are contained within the flush queue to the transactional log.
59. The computer-implemented method of claim 58, wherein the current owner page is updated in response to appending the log blocks to the current region; and wherein the current owner page is stored in a metadata file.
60. The computer-implemented method of claim 59, further comprising determining a location identifier identifying a location of the current owner page in the transactional log and storing the location identifier in the metadata file.
61. The computer-implemented method of claim 58, further comprising starting a new region; and creating a new owner page associated with the new region after appending the current owner page and the log blocks that are contained within the flush queue to the transactional log.
62. The computer-implemented method of claim 58, wherein appending the current owner page and the log blocks that are contained within the flush queue to the transactional log occurs upon the occurrence of a predetermined event.
63. The computer-implemented method of claim 62, wherein the predetermined event occurs when one of the clients issues a command for immediately appending a log block to the transactional log.
64. The computer-implemented method of claim 63, further comprising updating metadata associated with the transactional log.
65. The computer-implemented method of claim 64, wherein the metadata includes the location of an owner page associated with the last region in the transactional log.
66. A computer-readable medium having computer executable instructions for maintaining a transactional log, the instructions comprising:
receiving a first log block from a first client and a second log block from a second client; wherein the first log block and the second log block relate to different programs;
appending the first log block and the second log block to a flush queue; wherein the flush queue is configured to store a predetermined number of log blocks;
determining when an event occurs relating to the flush queue;
updating a current owner page; wherein the current owner page is updated when the first log block is appended to the flush queue and when the second log block is appended to the flush queue; and
appending the current owner page and the log blocks that are contained within the flush queue to the transactional log.
67. The computer-readable medium of claim 66, further comprising storing the current owner page in a metadata file.
68. The computer-readable medium of claim 67, further comprising determining a location identifier identifying a location of the current owner page in the transactional log and storing the location identifier in the metadata file.
69. The computer-readable medium of claim 68, further comprising starting a new region and creating a new owner page when the log blocks that are contained within the flush queue are appended to the transactional log.
70. The computer-readable medium of claim 66, wherein determining when the event occurs relating to the flush queue comprises determining when one of the clients issues a command for immediately appending a log block to the transactional log.
71. The computer-readable medium of claim 66, wherein determining when the event occurs relating to the flush queue comprises determining when a size of the log blocks within the flush queue exceeds a predetermined value.
72. A computer-implemented method for recovering a transactional log after a system failure, comprising:
determining a starting point in the transactional log by referring to metadata associated with the transactional log; wherein the transactional log includes log blocks from clients and wherein the log blocks from different clients that are stored within the transactional log are unrelated;
locating a last valid owner page within the transactional log by checking at discrete intervals from the starting point toward the end of the transactional log;
checking the validity of a region in the transactional log associated with the last valid owner page; and
if the region associated with the last valid owner page is valid, determining a first invalid log block in an incomplete region, wherein the incomplete region is located beyond the valid region toward the end of the transactional log.
73. The computer-implemented method of claim 72, wherein the discrete intervals are the extent of a region.
74. The computer-implemented method of claim 72, further comprising sequentially checking regions toward the beginning of the transactional log until a valid region is found when the region associated with the last valid owner page is not valid.
75. The computer-implemented method of claim 72, further comprising reconstructing a new owner page associated with the incomplete region.
76. The computer-implemented method of claim 72, wherein checking the validity of a region in the transactional log associated with the last valid owner page comprises linearly validating each block in the region.
77. The computer-implemented method of claim 72, further comprising reconstructing in a memory an owner page from log blocks in the incomplete region.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/357,333 US20060143241A1 (en) | 2002-11-27 | 2006-02-17 | System and method for scaleable multiplexed transactional log recovery |
US13/292,972 US8626721B2 (en) | 2002-11-27 | 2011-11-09 | System and method for scaleable multiplexed transactional log recovery |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/305,824 US7003532B2 (en) | 2002-11-27 | 2002-11-27 | System and method for scaleable multiplexed transactional log recovery |
US11/357,333 US20060143241A1 (en) | 2002-11-27 | 2006-02-17 | System and method for scaleable multiplexed transactional log recovery |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/305,824 Continuation US7003532B2 (en) | 2002-11-27 | 2002-11-27 | System and method for scaleable multiplexed transactional log recovery |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/292,972 Division US8626721B2 (en) | 2002-11-27 | 2011-11-09 | System and method for scaleable multiplexed transactional log recovery |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060143241A1 (en) | 2006-06-29 |
Family
ID=32325534
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/305,824 Expired - Lifetime US7003532B2 (en) | 2002-11-27 | 2002-11-27 | System and method for scaleable multiplexed transactional log recovery |
US11/357,333 Abandoned US20060143241A1 (en) | 2002-11-27 | 2006-02-17 | System and method for scaleable multiplexed transactional log recovery |
US13/292,972 Expired - Lifetime US8626721B2 (en) | 2002-11-27 | 2011-11-09 | System and method for scaleable multiplexed transactional log recovery |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/305,824 Expired - Lifetime US7003532B2 (en) | 2002-11-27 | 2002-11-27 | System and method for scaleable multiplexed transactional log recovery |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/292,972 Expired - Lifetime US8626721B2 (en) | 2002-11-27 | 2011-11-09 | System and method for scaleable multiplexed transactional log recovery |
Country Status (1)
Country | Link |
---|---|
US (3) | US7003532B2 (en) |
Also Published As
Publication number | Publication date |
---|---|
US7003532B2 (en) | 2006-02-21 |
US8626721B2 (en) | 2014-01-07 |
US20120078854A1 (en) | 2012-03-29 |
US20040103123A1 (en) | 2004-05-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
 | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MICROSOFT CORPORATION; REEL/FRAME: 034766/0001; Effective date: 20141014 |