US20050015416A1 - Method and apparatus for data recovery using storage based journaling - Google Patents

Method and apparatus for data recovery using storage based journaling

Info

Publication number
US20050015416A1
Authority
US
United States
Prior art keywords
journal
snapshot
data
journal entries
volume
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/621,791
Inventor
Kenji Yamagami
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to US10/621,791 (US20050015416A1)
Assigned to HITACHI, LTD. Assignment of assignors interest (see document for details). Assignors: YAMAGAMI, KENJI
Priority to US10/931,543 (US7398422B2)
Publication of US20050015416A1
Priority to US11/365,096 (US8145603B2)
Priority to US12/143,419 (US7761741B2)
Priority to US12/814,002 (US7979741B2)
Priority to US13/407,322 (US8868507B2)
Priority to US13/551,892 (US9092379B2)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1471Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84Using snapshots, i.e. a logical point-in-time copy of the data

Definitions

  • the present invention is related to computer storage and in particular to the recovery of data.
  • Journaling is a backup and restore technique commonly used in database systems. An image of the data to be backed up is taken. Then, as changes are made to the data, a journal of the changes is maintained. Recovery of data is accomplished by applying the journal to an appropriate image to recover data at any point in time.
  • Typical database systems such as Oracle, can perform journaling.
  • Recovering data at any point in time addresses the following types of administrative requirements. For example, a typical request might be, “I deleted a file by mistake at around 10:00 am yesterday. I have to recover the file just before it was deleted.”
  • the invention is directed to a method and apparatus for data recovery and comprises performing a fast recovery mode operation in conjunction with an undo-able recovery mode operation.
  • in the fast recovery mode operation, after-journal entries are applied to a snapshot to update the snapshot.
  • in the undo-able recovery mode operation, a before-journal entry is taken of the snapshot before applying an after-journal entry to it.
  • a user can perform one or more undo operations when a snapshot has been updated in the undo-able recovery mode.
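  • as an illustration only, the difference between the two modes might be sketched as follows. The snapshot is modeled as a mapping from block address to block contents, and the class and method names (`UndoableSnapshot`, `apply_fast`, `apply_undoable`, `undo`) are hypothetical names that do not appear in the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class UndoableSnapshot:
    """Toy snapshot: a mapping from block address to block contents."""
    blocks: dict = field(default_factory=dict)
    # Stack of before-journal entries: (address, contents before the update).
    before_journal: list = field(default_factory=list)

    def apply_fast(self, address, data):
        # Fast recovery mode: overwrite the block and keep no undo information.
        self.blocks[address] = data

    def apply_undoable(self, address, data):
        # Undo-able mode: record the current contents (a before-journal entry)
        # before applying the after-journal entry.
        self.before_journal.append((address, self.blocks.get(address)))
        self.blocks[address] = data

    def undo(self):
        # Reverse the most recent undo-able update; return False if none is left.
        if not self.before_journal:
            return False
        address, old = self.before_journal.pop()
        if old is None:
            self.blocks.pop(address, None)  # block did not exist before
        else:
            self.blocks[address] = old
        return True
```

Updates applied with `apply_fast` cannot be reversed in this sketch, which mirrors the trade-off described above: the undo-able mode pays for one extra read and one extra journal record per applied entry.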
  • FIG. 1 is a high level generalized block diagram of an illustrative embodiment of the present invention
  • FIG. 2 is a generalized illustration of an illustrative embodiment of a data structure for storing journal entries in accordance with the present invention
  • FIG. 3 is a generalized illustration of an illustrative embodiment of a data structure for managing the snapshot volumes and the journal entry volumes in accordance with the present invention
  • FIG. 3A is a generalized illustration of an illustrative embodiment of a data structure for managing the snapshot volumes and the journal entry volumes in accordance with another aspect of the present invention
  • FIG. 4 is a high level flow diagram highlighting the processing between the recovery manager and the controller in the storage system
  • FIG. 5 illustrates the relationship between a snapshot and a plurality of journal entries
  • FIG. 5A illustrates the relationship among a plurality of snapshots and a plurality of journal entries
  • FIG. 6 is a high level illustration of the data flow when an overflow condition arises
  • FIG. 7 is a high level flow chart highlighting an aspect of the controller in the storage system to handle an overflow condition
  • FIG. 7A illustrates an alternative to a processing step shown in FIG. 7 ;
  • FIG. 8 is a generalized flowchart highlighting data recovery in accordance with another aspect of the invention.
  • FIG. 9 is a flowchart highlighting the steps for phase I recovery
  • FIG. 10 is a diagrammatic illustration of the BEFORE and AFTER journaling.
  • FIG. 11 is provided to illustrate how an “undo” operation can be performed using the journaling shown in FIG. 10 .
  • FIG. 1 is a high level generalized block diagram of an illustrative embodiment of a backup and recovery system according to the present invention.
  • a snapshot is taken for production data volumes (DVOL) 101 .
  • the term “snapshot” in this context conventionally refers to a data image of the data volume at a given point in time.
  • the snapshot can be of the entire data volume, or some portion or portions of the data volume(s); e.g., filesystem(s), file(s), directory(ies), etc.
  • a journal entry is made for every write operation issued from the host to the data volumes. As will be discussed below, by applying a series of journal entries to an appropriate snapshot, data can be recovered at any point in time.
  • the backup and recovery system shown in FIG. 1 includes at least one storage system 100. Though not shown, one of ordinary skill can appreciate that the storage system includes suitable processor(s), memory, and control circuitry to perform I/O between a host 110 and its storage media (e.g., disks). The backup and recovery system also requires at least one host 110. A suitable communication path 130 is provided between the host and the storage system.
  • the host 110 typically will have one or more user applications (APP) 112 executing on it. These applications will read and/or write data to storage media contained in the data volumes 101 of storage system 100 . Thus, applications 112 and the data volumes 101 represent the target resources to be protected. It can be appreciated that data used by the user applications can be stored in one or more data volumes.
  • journal group (JNLG) 102 is defined.
  • the data volumes 101 are organized into the journal group.
  • a journal group is the smallest unit of data volumes where journaling of the write operations from the host 110 to the data volumes is guaranteed.
  • the associated journal records the order of write operations from the host to the data volumes in proper sequence.
  • the journal data produced by the journaling activity can be stored in one or more journal volumes (JVOL) 106.
  • the host 110 also includes a recovery manager (RM) 111 .
  • This component provides high-level coordination of the backup and recovery operations. The recovery manager is discussed further below.
  • the storage system 100 provides a snapshot (SS) 105 of the data volumes comprising a journal group.
  • the snapshot 105 is representative of the data volumes 101 in the journal group 102 at the point in time that the snapshot was taken.
  • Conventional methods are known for producing the snapshot image.
  • One or more snapshot volumes (SVOL) 107 are provided in the storage system which contain the snapshot data.
  • a snapshot can be contained in one or more snapshot volumes.
  • although the disclosed embodiment illustrates separate storage components for the journal data and the snapshot data, it can be appreciated that other implementations can provide a single storage component for storing the journal data and the snapshot data.
  • a management table (MT) 108 is provided to store the information relating to the journal group 102 , the snapshot 105 , and the journal volume(s) 106 .
  • FIG. 3 and the accompanying discussion below reveal additional detail about the management table.
  • a controller component 140 is also provided which coordinates the journaling of write operations and snapshots of the data volumes, and the corresponding movement of data among the different storage components 101 , 106 , 107 . It can be appreciated that the controller component is a logical representation of a physical implementation which may comprise one or more sub-components distributed within the storage system 100 .
  • FIG. 2 shows the data used in an implementation of the journal.
  • a journal is generated in response.
  • the journal comprises a Journal Header 219 and Journal Data 225 .
  • the Journal Header 219 contains information about its corresponding Journal Data 225 .
  • the Journal Data 225 comprises the data (write data) that is the subject of the write operation. This kind of journal is also referred to as an “AFTER journal.”
  • the Journal Header 219 comprises an offset number (JH_OFS) 211 .
  • the offset number identifies a particular data volume 101 in the journal group 102 .
  • the data volumes are ordered as the 0 th data volume, the 1 st data volume, the 2nd data volume and so on.
  • the offset numbers might be 0, 1, 2, etc.
  • an address field (JH_ADR) 212 in the Journal Header 219 stores the starting address, in the data volume identified by the offset number 211, to which the write data is to be written.
  • the address can be represented as a block number (LBA, Logical Block Address).
  • a field in the Journal Header 219 stores a data length (JH_LEN) 213 , which represents the data length of the write data. Typically it is represented as a number of blocks.
  • a field in the Journal Header 219 stores the write time (JH_TIME) 214 , which represents the time when the write request arrives at the storage system 100 .
  • the write time can include the calendar date, hours, minutes, seconds and even milliseconds. This time can be provided by the disk controller 140 or by the host 110 .
  • for example, a host can use a timer, such as the Sysplex Timer, to provide the time in a write command when it is issued.
  • a sequence number (JH_SEQ) 215 is assigned to each write request.
  • the sequence number is stored in a field in the Journal Header 219 . Every sequence number within a given journal group 102 is unique. The sequence number is assigned to a journal entry when it is created.
  • a journal volume identifier (JH_JVOL) 216 is also stored in the Journal Header 219 .
  • the volume identifier identifies the journal volume 106 associated with the Journal Data 225 .
  • the identifier is indicative of the journal volume containing the Journal Data. It is noted that the Journal Data can be stored in a journal volume that is different from the journal volume which contains the Journal Header.
  • a journal data address (JH_JADR) 217 stored in the Journal Header 219 contains the beginning address of the Journal Data 225 in the associated journal volume 106 that contains the Journal Data.
  • FIG. 2 shows that the journal volume 106 comprises two data areas: a Journal Header Area 210 and a Journal Data Area 220 .
  • the Journal Header Area 210 contains only Journal Headers 219
  • Journal Data Area 220 contains only Journal Data 225 .
  • the Journal Header is a fixed size data structure.
  • a Journal Header is allocated sequentially from the beginning of the Journal Header Area. This sequential organization corresponds to the chronological order of the journal entries.
  • data is provided that points to the first journal entry in the list, which represents the “oldest” journal entry. It is typically necessary to find the Journal Header 219 for a given sequence number (as stored in the sequence number field 215 ) or for a given write time (as stored in the time field 214 ).
  • a journal type field (JH_TYPE) 218 identifies the type of journal entry.
  • two types of journal entries are kept: (1) an AFTER journal and (2) a BEFORE journal.
  • An AFTER journal entry contains the data that is contained in the write operation for which a journal entry is made.
  • a BEFORE journal entry contains the original data of the area in storage that is the target of a write operation.
  • a BEFORE journal entry therefore represents the contents “before” the write operation is performed. The purpose of maintaining BEFORE journal entries will be discussed below.
  • Journal Header 219 and Journal Data 225 are contained in chronological order in their respective areas in the journal volume 106 .
  • the order in which the Journal Header and the Journal Data are stored in the journal volume is the same order as the assigned sequence number.
  • an aspect of the present invention is that the journal information 219 , 225 wrap within their respective areas 210 , 220 .
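  • purely as a reading aid, the Journal Header fields described above can be gathered into a single record. The sketch below is an assumption about representation only: the Python types, the `JournalType` enum, and the helper `find_by_sequence` are hypothetical, while the field names follow the JH_ labels and reference numerals used in FIG. 2.

```python
from dataclasses import dataclass
from enum import Enum


class JournalType(Enum):
    AFTER = "AFTER"    # data carried by the write operation
    BEFORE = "BEFORE"  # original data of the area targeted by the write


@dataclass(frozen=True)
class JournalHeader:
    jh_ofs: int           # JH_OFS 211: offset number of the data volume in the journal group
    jh_adr: int           # JH_ADR 212: starting address (e.g., LBA) in that data volume
    jh_len: int           # JH_LEN 213: length of the write data, typically in blocks
    jh_time: float        # JH_TIME 214: time the write request arrived
    jh_seq: int           # JH_SEQ 215: sequence number, unique within the journal group
    jh_jvol: int          # JH_JVOL 216: journal volume holding the Journal Data
    jh_jadr: int          # JH_JADR 217: beginning address of the Journal Data
    jh_type: JournalType  # JH_TYPE 218: AFTER or BEFORE


def find_by_sequence(headers, seq):
    """Scan headers in stored (chronological) order for a given sequence number."""
    for header in headers:
        if header.jh_seq == seq:
            return header
    return None
```

Because the headers are fixed-size and stored sequentially, a linear scan from the oldest header is sufficient for this sketch; nothing in the text above requires a faster index.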
  • FIG. 3 shows detail about the management table 108 ( FIG. 1 ).
  • the management table maintains configuration information about a journal group 102 and the relationship between the journal group and its associated journal volume(s) 106 and snapshot image 105.
  • the management table 300 shown in FIG. 3 illustrates an example management table and its contents.
  • the management table stores a journal group ID (GRID) 310 which identifies a particular journal group 102 in a storage system 100 .
  • a journal group name (GRNAME) 311 can also be provided to identify the journal group with a human recognizable identifier.
  • a journal attribute (GRATTR) 312 is associated with the journal group 102 .
  • two attributes are defined: MASTER and RESTORE.
  • the MASTER attribute indicates the journal group is being journaled.
  • the RESTORE attribute indicates that the journal group is being restored from a journal.
  • a journal status (GRSTS) 315 is associated with the journal group 102 . There are two statuses: ACTIVE and INACTIVE.
  • the management table includes a field to hold a sequence counter (SEQ) 313 .
  • This counter serves as the source of sequence numbers used in the Journal Header 219 .
  • the number (NUM_DVOL) 314 of data volumes 101 contained in a given journal group 102 is stored in the management table.
  • a data volume list (DVOL_LIST) 320 lists the data volumes in a journal group.
  • DVOL_LIST is a pointer to the first entry of a data structure which holds the data volume information. This can be seen in FIG. 3 .
  • Each data volume information comprises an offset number (DVOL_OFFS) 321 .
  • a data volume identifier (DVOL_ID) 322 uniquely identifies a data volume within the entire storage system 100 .
  • a pointer (DVOL_NEXT) 324 points to the data structure holding information for the next data volume in the journal group; it is a NULL value otherwise.
  • the management table includes a field to store the number of journal volumes (NUM_JVOL) 330 that are being used to contain the data (journal header and journal data) associated with a journal group 102 .
  • the Journal Header Area 210 contains the Journal Headers 219 for each journal; likewise for the Journal Data components 225 .
  • an aspect of the invention is that the data areas 210 , 220 wrap. This allows for journaling to continue despite the fact that there is limited space in each data area.
  • the management table includes fields to store pointers to different parts of the data areas 210 , 220 to facilitate wrapping. Fields are provided to identify where the next journal entry is to be stored.
  • a field (JI_HEAD_VOL) 331 identifies the journal volume 106 that contains the Journal Header Area 210 which will store the next new Journal Header 219 .
  • a field (JI_HEAD_ADR) 332 identifies an address on the journal volume of the location in the Journal Header Area where the next Journal Header will be stored.
  • the journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 335 .
  • a field (JI_DATA_ADR) 336 identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331, 332, 335, 336.
  • the management table also includes fields which identify the “oldest” journal entry. The use of this information will be described below.
  • a field (JO_HEAD_VOL) 333 identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219 .
  • a field (JO_HEAD_ADR) 334 identifies the address within the Journal Header Area of the location of the journal header of the oldest journal.
  • a field (JO_DATA_VOL) 337 identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 338 .
  • the management table includes a list of journal volumes (JVOL_LIST) 340 associated with a particular journal group 102 .
  • JVOL_LIST is a pointer to a data structure of information for journal volumes.
  • each data structure comprises an offset number (JVOL_OFS) 341 which identifies a particular journal volume 106 associated with a given journal group 102 .
  • a journal volume identifier (JVOL_ID) 342 uniquely identifies the journal volume within the storage system 100 .
  • a pointer (JVOL_NEXT) 344 points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
  • the management table includes a list (SS_LIST) 350 of snapshot images 105 associated with a given journal group 102 .
  • SS_LIST is a pointer to snapshot information data structures, as indicated in FIG. 3 .
  • Each snapshot information data structure includes a sequence number (SS_SEQ) 351 that is assigned when the snapshot is taken. As discussed above, the number comes from the sequence counter 313 .
  • a time value (SS_TIME) 352 indicates the time when the snapshot was taken.
  • a status (SS_STS) 358 is associated with each snapshot; valid values include VALID and INVALID.
  • a pointer (SS_NEXT) 353 points to the next snapshot information data structure; it is a NULL value otherwise.
  • Each snapshot information data structure also includes a list of snapshot volumes 107 ( FIG. 1 ) used to store the snapshot images 105 .
  • a pointer (SVOL_LIST) 354 to a snapshot volume information data structure is stored in each snapshot information data structure.
  • Each snapshot volume information data structure includes an offset number (SVOL_OFFS) 355 which identifies a snapshot volume that contains at least a portion of the snapshot image. It is possible that a snapshot image will be segmented or otherwise partitioned and stored in more than one snapshot volume.
  • the offset identifies the i th snapshot volume which contains a portion (segment, partition, etc) of the snapshot image.
  • Each snapshot volume information data structure further includes a snapshot volume identifier (SVOL_ID) 356 that uniquely identifies the snapshot volume in the storage system 100 .
  • a pointer (SVOL_NEXT) 357 points to the next snapshot volume information data structure for a given snapshot image.
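  • the NULL-terminated lists hanging off the management table might be modeled as shown below. This is a partial sketch under stated assumptions: only a few of the fields of FIG. 3 are included, identifiers are represented as strings, `None` stands in for the NULL value, and the helper `walk` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class DataVolumeInfo:
    dvol_offs: int                                # DVOL_OFFS 321
    dvol_id: str                                  # DVOL_ID 322
    dvol_next: Optional["DataVolumeInfo"] = None  # DVOL_NEXT 324 (None plays the role of NULL)


@dataclass
class SnapshotInfo:
    ss_seq: int                                   # SS_SEQ 351
    ss_time: float                                # SS_TIME 352
    ss_sts: str = "VALID"                         # SS_STS 358
    ss_next: Optional["SnapshotInfo"] = None      # SS_NEXT 353


@dataclass
class ManagementTable:
    grid: int                                     # GRID 310
    grname: str                                   # GRNAME 311
    grattr: str = "MASTER"                        # GRATTR 312
    seq: int = 0                                  # SEQ 313
    dvol_list: Optional[DataVolumeInfo] = None    # DVOL_LIST 320
    ss_list: Optional[SnapshotInfo] = None        # SS_LIST 350


def walk(head, next_field: str) -> Iterator:
    """Follow a NULL-terminated chain starting at head."""
    node = head
    while node is not None:
        yield node
        node = getattr(node, next_field)


table = ManagementTable(
    grid=1,
    grname="example-group",
    dvol_list=DataVolumeInfo(0, "DVOL-A", DataVolumeInfo(1, "DVOL-B")),
)
volume_ids = [v.dvol_id for v in walk(table.dvol_list, "dvol_next")]
```

The same `walk` helper applies to the journal volume list and to each snapshot's volume list, since all of them use a "next" pointer that is NULL at the end of the chain.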
  • FIG. 4 shows a flowchart highlighting the processing performed by the recovery manager 111 and Storage System 100 to initiate backup processing in accordance with the illustrative embodiment of the invention as shown in the figures.
  • a single sequence of numbers (SEQ) 313 is associated with each of one or more snapshots and journal entries, as they are created. The purpose of associating the same sequence of numbers to both the snapshots and the journal entries will be discussed below.
  • the recovery manager 111 might define, in a step 410, a journal group (JNLG) 102 if one has not already been defined. As indicated in FIG. 1, this may include identifying one or more data volumes (DVOL) 101 for which journaling is performed, and identifying one or more journal volumes (JVOL) 106 which are used to store the journal-related information.
  • the recovery manager performs a suitable sequence of interactions with the storage system 100 to accomplish this.
  • the storage system may create a management table 108 ( FIG. 1 ), incorporating the various information shown in the table detail 300 illustrated in FIG. 3 .
  • the process includes initializing the JVOL_LIST 340 to list the journal volumes which comprise the journal group 102. Likewise, the list of data volumes DVOL_LIST 320 is created.
  • the fields which identify the next journal entry (or in this case where the table is first created, the first journal entry) are initialized.
  • JI_HEAD_VOL 331 might identify the first in the list of journal volumes and JI_HEAD_ADR 332 might point to the first entry in the Journal Header Area 210 located in the first journal volume.
  • JI_DATA_VOL 335 might identify the first in the list of journal volumes and JI_DATA_ADR 336 might point to the beginning of the Journal Data Area 220 in the first journal volume.
  • the header and the data areas 210 , 220 may reside on different journal volumes, so JI_DATA_VOL might identify a journal volume different from the first journal volume.
  • in a step 420, the recovery manager 111 will initiate the journaling process. Suitable communication(s) are made to the storage system 100 to perform journaling.
  • in a step 425, the storage system will make a journal entry (also referred to as an “AFTER journal”) for each write operation that issues from the host 110.
  • making a journal entry includes, among other things, identifying the location for the next journal entry.
  • the fields JI_HEAD_VOL 331 and JI_HEAD_ADR 332 identify the journal volume 106 and the location in the Journal Header Area 210 of the next Journal Header 219 .
  • the sequence counter (SEQ) 313 from the management table is copied to (associated with) the JH_SEQ 215 field of the next header.
  • the sequence counter is then incremented and stored back to the management table.
  • the sequence counter can be incremented first, copied to JH_SEQ, and then stored back to the management table.
  • the fields JI_DATA_VOL 335 and JI_DATA_ADR 336 in the management table identify the journal volume and the beginning of the Journal Data Area 220 for storing the data associated with the write operation.
  • the JI_DATA_VOL and JI_DATA_ADR fields are copied to JH_JVOL 216 and to JH_JADR 217, respectively, of the Journal Header, thus providing the Journal Header with a pointer to its corresponding Journal Data.
  • the data of the write operation is stored.
  • JI_HEAD_VOL 331 and JI_HEAD_ADR 332 fields are updated to point to the next Journal Header 219 for the next journal entry. This involves taking the next contiguous Journal Header entry in the Journal Header Area 210 .
  • the JI_DATA_ADR field (and perhaps JI_DATA_VOL field) is updated to reflect the beginning of the Journal Data Area for the next journal entry. This involves advancing to the next available location in the Journal Data Area.
  • These fields therefore can be viewed as pointing to a list of journal entries. Journal entries in the list are linked together by virtue of the sequential organization of the Journal Headers 219 in the Journal Header Area 210 .
  • the Journal Header 219 for the next journal entry wraps to the beginning of the Journal Header Area.
  • the Journal Data 225 wraps in the same way within the Journal Data Area 220.
  • the present invention provides for a procedure to free up entries in the journal volume 106 . This aspect of the invention is discussed below.
  • the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 are set to contain the contents of their corresponding “JI_” fields.
  • the “JO_” fields point to the oldest journal entry.
  • the “JO_” fields do not advance while the “JI_” fields do advance. Update of the “JO_” fields is discussed below.
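  • the bookkeeping for making a journal entry, as described above, can be sketched as follows. The sketch assumes a single journal volume, a header area counted in fixed-size slots, and a data area counted in blocks; the class name `JournalGroup` and the method `make_entry` are hypothetical. It does not store the write data itself and does not check for overflow, which is treated separately.

```python
class JournalGroup:
    """Minimal model of the JI_ pointers and the sequence counter."""

    def __init__(self, header_slots, data_blocks):
        self.header_slots = header_slots  # capacity of the Journal Header Area
        self.data_blocks = data_blocks    # capacity of the Journal Data Area
        self.seq = 0                      # SEQ 313
        self.ji_head_adr = 0              # JI_HEAD_ADR 332: slot for the next header
        self.ji_data_adr = 0              # JI_DATA_ADR 336: block for the next data
        self.headers = [None] * header_slots

    def make_entry(self, dvol_offs, address, length, write_time):
        header = {
            "JH_OFS": dvol_offs,
            "JH_ADR": address,
            "JH_LEN": length,
            "JH_TIME": write_time,
            "JH_SEQ": self.seq,           # copy the counter into the header ...
            "JH_JADR": self.ji_data_adr,  # pointer to the corresponding Journal Data
            "JH_TYPE": "AFTER",
        }
        self.seq += 1                     # ... then increment and store it back
        self.headers[self.ji_head_adr] = header
        # Advance the JI_ pointers, wrapping at the end of each area.
        # A real implementation would also have to handle a data record
        # that straddles the end of the data area; that is ignored here.
        self.ji_head_adr = (self.ji_head_adr + 1) % self.header_slots
        self.ji_data_adr = (self.ji_data_adr + length) % self.data_blocks
        return header
```

With two header slots, the third call to `make_entry` reuses slot 0, which is why freeing of old entries (advancing the “JO_” fields) has to keep ahead of the “JI_” fields in practice.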
  • when the journaling process has been initiated, all write operations issuing from the host are journaled. Then, in a step 430, the recovery manager 111 will initiate taking a snapshot of the data volumes 101.
  • the storage system 100 receives an indication from the recovery manager to take a snapshot.
  • the storage system performs the process of taking a snapshot of the data volumes. Among other things, this includes accessing SS_LIST 350 from the management table ( FIG. 3 ). A suitable amount of memory is allocated for fields 351 - 354 to represent the next snapshot.
  • the sequence counter (SEQ) 313 is copied to the field SS_SEQ 351 and incremented, in the manner discussed above for JH_SEQ 215 .
  • a sequence of numbers is produced from SEQ 313 , each number in the sequence being assigned either to a journal entry or a snapshot entry.
  • the snapshot is stored in one (or more) snapshot volumes (SVOL) 107 .
  • a suitable amount of memory is allocated for fields 355 - 357 .
  • the information relating to the SVOLs for storing the snapshot are then stored into the fields 355 - 357 . If additional volumes are required to store the snapshot, then additional memory is allocated for fields 355 - 357 .
  • FIG. 5 illustrates the relationship between journal entries and snapshots.
  • the snapshot 520 represents the first snapshot image of the data volumes 101 belonging to a journal group 102 .
  • journal entries ( 510 ) having sequence numbers SEQ 0 and SEQ 1 have been made, and represent journal entries for two write operations. These entries show that journaling has been initiated at a time prior to the snapshot being taken (step 420 ).
  • the recovery manager 111 initiates the taking of a snapshot, and since journaling has been initiated, any write operations occurring during the taking of the snapshot are journaled.
  • the write operations 500 associated with the sequence numbers SEQ 3 and higher show that those operations are being journaled.
  • the journal entries identified by sequence numbers SEQ 0 and SEQ 1 can be discarded or otherwise ignored.
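  • the numbering shown in FIG. 5 can be reproduced with a single shared counter. The following is a sketch only; `SequenceCounter` and the `events` list are hypothetical constructs used to mimic two journaled writes, one snapshot, and one later write.

```python
class SequenceCounter:
    """Single source of sequence numbers for journal entries and snapshots."""

    def __init__(self):
        self._next = 0

    def assign(self):
        number = self._next
        self._next += 1
        return number


seq = SequenceCounter()
events = [
    ("journal", seq.assign()),   # SEQ 0: write journaled before the snapshot
    ("journal", seq.assign()),   # SEQ 1: write journaled before the snapshot
    ("snapshot", seq.assign()),  # SEQ 2: the snapshot itself
    ("journal", seq.assign()),   # SEQ 3: write journaled after the snapshot
]

snapshot_seq = next(n for kind, n in events if kind == "snapshot")
# Entries numbered below the snapshot are already reflected in it.
discardable = [n for kind, n in events if kind == "journal" and n < snapshot_seq]
```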
  • Recovering data typically requires recovering the data state of at least a portion of the data volumes 101 at a specific time. Generally, this is accomplished by applying one or more journal entries to a snapshot that was taken earlier in time relative to the journal entries.
  • the sequence number SEQ 313 is incremented each time it is assigned to a journal entry or to a snapshot. Therefore, it is a simple matter to identify which journal entries can be applied to a selected snapshot; i.e., those journal entries whose associated sequence numbers (JH_SEQ, 215 ) are greater than the sequence number (SS_SEQ, 351 ) associated with the selected snapshot.
  • the administrator may specify some point in time, presumably a time that is earlier than the time (the “target time”) at which the data in the data volume was lost or otherwise corrupted.
  • the time field SS_TIME 352 for each snapshot is searched until a time earlier than the target time is found.
  • the Journal Headers 219 in the Journal Header Area 210 are searched, beginning from the “oldest” Journal Header.
  • the oldest Journal Header can be identified by the “JO_” fields 333 , 334 , 337 , and 338 in the management table.
  • the Journal Headers are searched sequentially in the area 210 for the first header whose sequence number JH_SEQ 215 is greater than the sequence number SS_SEQ 351 associated with the selected snapshot.
  • the selected snapshot is incrementally updated by applying each journal entry, one at a time, to the snapshot in sequential order, thus reproducing the sequence of write operations. This continues as long as the time field JH_TIME 214 of the journal entry is prior to the target time. The update ceases with the first journal entry whose time field 214 is past the target time.
  • a single snapshot is taken. All journal entries subsequent to that snapshot can then be applied to reconstruct the data state at a given time.
  • multiple snapshots can be taken. This is shown in FIG. 5A where multiple snapshots 520 ′ are taken.
  • each snapshot and journal entry is assigned a sequence number in the order in which the object (snapshot or journal entry) is recorded. It can be appreciated that there typically will be many journal entries 510 recorded between each snapshot 520 ′. Having multiple snapshots allows for quicker recovery time for restoring data. The snapshot closest in time to the target recovery time would be selected. The journal entries made subsequent to the snapshot could then be applied to restore the desired data state.
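  • the recovery procedure described above might be sketched as follows, under several assumptions: snapshots and journal entries are plain dictionaries keyed by the field names used in this description, a snapshot image is a mapping from address to data, the journal is already in sequence order, and an entry stamped exactly at the target time is not applied. The function name `recover` is hypothetical.

```python
def recover(snapshots, journal, target_time):
    """Rebuild the data image as of target_time from a snapshot plus journal."""
    # Choose the snapshot closest to, but earlier than, the target time.
    candidates = [s for s in snapshots if s["SS_TIME"] < target_time]
    if not candidates:
        raise ValueError("no snapshot earlier than the target time")
    snapshot = max(candidates, key=lambda s: s["SS_TIME"])

    image = dict(snapshot["image"])  # work on a copy of the snapshot image
    for entry in journal:            # oldest entry first
        if entry["JH_SEQ"] <= snapshot["SS_SEQ"]:
            continue                 # already reflected in the snapshot
        if entry["JH_TIME"] >= target_time:
            break                    # first entry at or past the target: stop
        image[entry["JH_ADR"]] = entry["data"]
    return image
```

With several snapshots available, the loop skips fewer entries and applies fewer writes, which is the quicker recovery time noted above.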
  • FIG. 6 illustrates another aspect of the present invention.
  • a journal entry is made for every write operation issued from the host; this can result in a rather large number of journal entries.
  • the one or more journal volumes 106 defined by the recovery manager 111 for a journal group 102 will eventually fill up. At that time no more journal entries can be made. As a consequence, subsequent write operations would not be journaled and recovery of the data state subsequent to the time the journal volumes become filled would not be possible.
  • FIG. 6 shows that the storage system 100 will apply journal entries to a suitable snapshot in response to detection of an “overflow” condition.
  • An “overflow” is deemed to exist when the available space in the journal volume(s) falls below some predetermined threshold. It can be appreciated that many criteria can be used to determine if an overflow condition exists. A straightforward threshold is based on the total storage capacity of the journal volume(s) assigned for a journal group. When the free space becomes some percentage (say, 10%) of the total storage capacity, then an overflow condition exists. Another threshold might be used for each journal volume.
  • the free space capacity in the journal volume(s) is periodically monitored. Alternatively, the free space can be monitored in an aperiodic manner. For example, the intervals between monitoring can be randomly spaced. As another example, the monitoring intervals can be spaced apart depending on the level of free space; i.e., the monitoring interval can vary as a function of the free space level.
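  • a threshold test and a free-space-dependent monitoring interval could look like the sketch below. The 10% default follows the example above; the linear scaling of the interval and the names `is_overflow` and `monitoring_interval` are assumptions made for illustration, since the text leaves the exact criteria open.

```python
def is_overflow(free_blocks, total_blocks, threshold=0.10):
    """Overflow exists when the free fraction falls below the threshold."""
    return free_blocks / total_blocks < threshold


def monitoring_interval(free_blocks, total_blocks, base_seconds=10.0, min_seconds=1.0):
    """Check more often as free space shrinks (assumed linear scaling)."""
    fraction = free_blocks / total_blocks
    return max(min_seconds, base_seconds * fraction)
```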
  • FIG. 7 highlights the processing which takes place in the storage system 100 to detect an overflow condition.
  • the storage system periodically checks the total free space of the journal volume(s) 106 ; e.g., every ten seconds.
  • the free space can easily be calculated since the pointers (e.g., JI_CTL_VOL 331 , JI_CTL_ADDR 332 ) in the management table 300 maintain the current state of the storage consumed by the journal volumes. If the free space is above the threshold, then the monitoring process simply waits for a period of time to pass and then repeats its check of the journal volume free space.
  • journal entries are applied to a snapshot to update the snapshot.
  • the oldest journal entry(ies) are applied to the snapshot.
  • the Journal Header 219 of the “oldest” journal entry is identified by the JO_HEAD_VOL field 333 and the JO_HEAD_ADR field 334 . These fields identify the journal volume and the location in the journal volume of the Journal Header Area 210 of the oldest journal entry.
  • the Journal Data of the oldest journal entry is identified by the JO_DATA_VOL field 337 and the JO_DATA_ADR field 338 .
  • the journal entry identified by these fields is applied to a snapshot.
  • the snapshot that is selected is the snapshot having an associated sequence number closest to the sequence number of the journal entry and earlier in time than the journal entry.
  • the snapshot having the sequence number closest to but less than the sequence number of the journal entry is selected (i.e., “earlier in time”).
  • the applied journal entry is freed. This can simply involve updating the JO_HEAD_VOL field 333 , JO_HEAD_ADR field 334 , JO_DATA_VOL field 337 , and the JO_DATA_ADR field 338 to the next journal entry.
  • sequence numbers will eventually wrap, and start counting from zero again. It is well within the level of ordinary skill to provide a suitable mechanism for keeping track of this when comparing sequence numbers.
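One common way to compare wrapping counters is serial-number arithmetic, sketched below. The text does not specify a mechanism, so this is only one possibility; the 32-bit width and the requirement that the two numbers be less than half the counter range apart are assumptions.

```python
SEQ_BITS = 32                 # assumed counter width
SEQ_MOD = 1 << SEQ_BITS       # counter wraps to zero at this value
HALF = 1 << (SEQ_BITS - 1)    # comparisons are valid within half the range


def seq_before(a, b):
    """True if sequence number a was issued before b, allowing for wraparound."""
    return a != b and ((b - a) % SEQ_MOD) < HALF
```

Under this rule the last number before the wrap still compares as earlier than zero, so snapshot selection keeps working across the wrap.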
  • the free space can be compared against the threshold criterion used in step 710 .
  • a different threshold can be used. For example, here a higher amount of free space may be required to terminate this process than was used to initiate the process. This avoids invoking the process too frequently, but once invoked the second higher threshold encourages recovering as much free space as is reasonable. It can be appreciated that these thresholds can be determined empirically over time by an administrator.
  • in step 730, if the threshold for stopping the process is met (i.e., free space exceeds threshold), then the process stops. Otherwise, step 720 is repeated for the next oldest journal entry. Steps 730 and 720 are repeated until the free space level meets the threshold criterion used in step 730 .
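Steps 710 through 730 can be sketched as a loop with two thresholds. This is a simplified model under stated assumptions: the journal is a deque of tuples with the oldest entry on the left, a single snapshot older than every entry is used, and the 10% start and 30% stop values are placeholders for thresholds an administrator would tune.

```python
from collections import deque


def relieve_overflow(journal, snapshot, capacity, start_pct=10, stop_pct=30):
    """Apply oldest journal entries to the snapshot until enough space is free.

    journal:  deque of (seq, addr, value, size) tuples, oldest at the left.
    snapshot: dict addr -> value, assumed older than every journal entry.
    Returns the number of journal entries applied and freed.
    """
    free = capacity - sum(entry[3] for entry in journal)
    if free * 100 > start_pct * capacity:       # step 710: no overflow
        return 0
    applied = 0
    # Step 730: continue until free space exceeds the (higher) stop threshold.
    while journal and free * 100 <= stop_pct * capacity:
        seq, addr, value, size = journal.popleft()  # step 720: oldest entry
        snapshot[addr] = value                      # update the snapshot
        free += size                                # the entry's space is freed
        applied += 1
    return applied
```

Using a higher stop threshold than start threshold means the process runs rarely but reclaims a useful amount of space each time it does run.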
  • FIG. 7A highlights sub-steps for an alternative embodiment to step 720 shown in FIG. 7 .
  • Step 720 frees up a journal entry by applying it to the latest snapshot that is not later in time than the journal entry. However, where multiple snapshots are available, it may be possible to avoid the time consuming process of applying the journal entry to a snapshot in order to update the snapshot.
  • FIG. 7A shows details for a step 720 ′ that is an alternate to step 720 of FIG. 7 .
  • a determination is made whether a snapshot exists that is later in time than the oldest journal entry. This determination can be made by searching for the first snapshot whose associated sequence number is greater than that of the oldest journal entry. Alternatively, a snapshot that is a predetermined amount of time later than the oldest journal entry can be selected; for example, the criterion may be that the snapshot must be at least one hour later in time than the oldest journal entry. Still another alternative is to use the sequence numbers associated with the snapshots and the journal entries, rather than time. For example, the criterion might be to select a snapshot whose sequence number is N increments away from the sequence number of the oldest journal entry.
  • journal entries can be removed without having to apply them to a snapshot.
  • the “JO_” fields (JO_HEAD_VOL 333 , JO_HEAD_ADR 334 , JO_DATA_VOL 337 , and JO_DATA_ADR 338 ) are simply moved to a point in the list of journal entries that is later in time than the selected snapshot. If no such snapshot can be found, then in a step 723 the oldest journal entry is applied to a snapshot that is earlier in time than the oldest journal entry, as discussed for step 720 .
  • Still another alternative for step 721 is simply to select the most recent snapshot. All the journal entries whose sequence numbers are less than that of the most recent snapshot can be freed. Again, this simply involves updating the “JO_” fields so they point to the first journal entry whose sequence number is greater than that of the most recent snapshot.
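The pointer move in step 720′ might look like the sketch below. It models the journal as an ascending list of sequence numbers and returns the index that the “JO_” fields would be advanced to; the function name, the list model, and the `most_recent` flag are assumptions for illustration.

```python
import bisect


def new_oldest_index(journal_seqs, snapshot_seqs, most_recent=False):
    """Index of the journal entry that becomes the oldest after freeing.

    journal_seqs:  ascending sequence numbers of the retained journal entries.
    snapshot_seqs: sequence numbers of the available snapshots.
    most_recent:   if True, free everything older than the newest snapshot.
    Returns 0 when no later snapshot exists (fall back to step 723).
    """
    if not journal_seqs:
        return 0
    later = [s for s in snapshot_seqs if s > journal_seqs[0]]
    if not later:
        return 0
    chosen = max(later) if most_recent else min(later)
    # First journal entry whose sequence number is greater than the snapshot's.
    return bisect.bisect_right(journal_seqs, chosen)
```

No journal data is read or written here; only the head position changes, which is why this path is cheaper than applying entries to a snapshot.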
  • an aspect of the invention is being able to recover the data state for any desired point in time. This can be accomplished by storing as many journal entries as possible and then applying the journal entries to a snapshot to reproduce the write operations.
  • This last embodiment has the potential effect of removing large numbers of journal entries, thus reducing the range of time within which the data state can be recovered. Nevertheless, for a particular configuration it may be desirable to remove large numbers of journal entries for a given operating environment.
  • recovery of the production volume(s) 101 can be facilitated by allowing the user to interact with the recovery process.
  • a “fast recovery” can be performed which quickly recovers the data state to a point in time prior to a target time.
  • a more granular recovery procedure can then be performed which allows a user to home in on the target data state.
  • the user can perform “undo-able recoveries” to inspect the data state in a trial and error manner by allowing the user to step forward and backward (undo operation) in time.
  • This aspect of the invention allows a user to be less specific as to the time of the desired data state.
  • the target time specified by the user need only be a time that he is certain is prior to the time of the target data state. It is understood that “the target data state” can refer to any desired state of the data.
  • FIG. 3A shows an illustrative embodiment of a management table 300 ′ according to this aspect of the present invention.
  • the alternative management table 300 ′ includes two sets of fields, one set of fields ( 330 , 331 , 340 ) for managing AFTER journal entries and another set of fields ( 332 , 333 , 341 ) for managing BEFORE journal entries.
  • the fields related to the AFTER journal entries include a field to store the number of journal volumes (NUM_JVOLa) 330 that are used to contain the data (journal header and journal data) associated with the AFTER journal entries for a journal group 102 .
  • the Journal Header Area 210 contains the Journal Headers 219 for each journal; likewise for the Journal Data components 225 .
  • an aspect of the invention is that the data areas 210 , 220 wrap. This allows for journaling to continue despite the fact that there is limited space in each data area.
  • the management table includes fields to store pointers to different parts of the data areas 210 , 220 to facilitate wrapping. Pointer-type information is provided to facilitate identifying where the next journal entry is to be stored. A set of such information (“AFTER journal pointers”) is provided for the AFTER journal entries. A field (JVOL_PTRa) 331 in the management table identifies the location of the AFTER journal pointers.
  • the AFTER journal entries are stored in one or more journal volumes, separate from the BEFORE journal entries.
  • a field (JI_HEAD_VOL) 331 a identifies the journal volume 106 that contains the Journal Header Area 210 from which the next Journal Header 219 will be obtained.
  • a field (JI_HEAD_ADR) 331 b identifies where in the Journal Header Area the next Journal Header is located.
  • the journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 331 e .
  • a field (JI_DATA_ADR) 331 f identifies the specific address in the Journal Data Area where the data will be stored.
  • the next AFTER journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331 a , 331 b , 331 e , 331 f.
  • the AFTER journal pointers also include fields which identify the “oldest” AFTER journal entry. The use of this information will be described below.
  • a field (JO_HEAD_VOL) 331 c identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219 .
  • a field (JO_HEAD_ADR) 331 d identifies the address within the Journal Header Area of the location of the journal header of the oldest journal.
  • a field (JO_DATA_VOL) 331 g identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 331 h.
  • the management table includes a list of journal volumes (JVOL_LISTa) 340 associated with the AFTER journal entries of a journal group 102 .
  • JVOL_LISTa is a pointer to a data structure of information for journal volumes.
  • each data structure comprises an offset number (JVOL_OFS) 340 a which identifies a particular journal volume 106 associated with a given journal group 102 .
  • a journal volume identifier (JVOL_ID) 340 b uniquely identifies the journal volume within the storage system 100 .
  • a pointer (JVOL_NEXT) 340 c points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
  • the management table also includes a set of similar fields for managing the BEFORE journal entries.
  • the fields related to the BEFORE journal entries include a field to store the number of journal volumes (NUM_JVOLb) 332 that are being used to contain the data (journal header and journal data) associated with the BEFORE journal entries for a journal group 102 .
  • an aspect of the invention is that the data areas 210 , 220 wrap.
  • the management table includes fields to store pointers to different parts of the data areas 210 , 220 to facilitate wrapping. Pointer-type information is provided to facilitate identifying where the next BEFORE journal entry is to be stored. A set of such information (“BEFORE journal pointers”) is provided for the BEFORE journal entries.
  • a field (JVOL_PTRb) 333 in the management table identifies the location of the BEFORE journal pointers.
  • the BEFORE journal entries are stored in one or more journal volumes, separate from the journal volume(s) used to store the AFTER journal entries.
  • a field (JI_HEAD_VOL) 332 a identifies the journal volume 106 that contains the Journal Header Area 210 from which the next Journal Header 219 will be obtained.
  • a field (JI_HEAD_ADR) 332 b identifies where in the Journal Header Area the next Journal Header is located.
  • the journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 332 e .
  • a field (JI_DATA_ADR) 332 f identifies the specific address in the Journal Data Area where the data will be stored.
  • the next BEFORE journal entry to be written is “pointed” to by the information contained in the “JI_” fields 332 a , 332 b , 332 e , 332 f.
  • the BEFORE journal pointers also include fields which identify the “oldest” BEFORE journal entry. The use of this information will be described below.
  • a field (JO_HEAD_VOL) 332 c identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219 .
  • a field (JO_HEAD_ADR) 332 d identifies the address within the Journal Header Area of the location of the journal header of the oldest journal.
  • a field (JO_DATA_VOL) 332 g identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 332 h.
  • the management table includes a list of journal volumes (JVOL_LISTb) 341 associated with the BEFORE journal entries of a journal group 102 .
  • JVOL_LISTb is a pointer to a data structure of information for journal volumes.
  • each data structure comprises an offset number (JVOL_OFS) 341 a which identifies a particular journal volume 106 associated with a given journal group 102 .
  • a journal volume identifier (JVOL_ID) 341 b uniquely identifies the journal volume within the storage system 100 .
  • a pointer (JVOL_NEXT) 341 c points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
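The two parallel sets of fields in management table 300′ can be pictured with the following sketch. The class names and Python types are assumptions; the attribute names mirror the field names in the text (NUM_JVOL, JVOL_PTR, JVOL_LIST, the “JI_” and “JO_” pointers, JVOL_OFS, JVOL_ID, JVOL_NEXT), and one `JournalSet` instance would exist for the AFTER entries and one for the BEFORE entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JournalPointers:
    # "JI_" fields: where the next journal entry will be written.
    ji_head_vol: int = 0
    ji_head_adr: int = 0
    ji_data_vol: int = 0
    ji_data_adr: int = 0
    # "JO_" fields: where the oldest journal entry is located.
    jo_head_vol: int = 0
    jo_head_adr: int = 0
    jo_data_vol: int = 0
    jo_data_adr: int = 0


@dataclass
class JournalVolumeNode:
    jvol_ofs: int                                   # offset within the journal group
    jvol_id: str                                    # unique ID within the storage system
    jvol_next: Optional["JournalVolumeNode"] = None  # None plays the role of NULL


@dataclass
class JournalSet:
    num_jvol: int                           # NUM_JVOLa or NUM_JVOLb
    jvol_ptr: JournalPointers               # target of JVOL_PTRa or JVOL_PTRb
    jvol_list: Optional[JournalVolumeNode]  # head of JVOL_LISTa or JVOL_LISTb


def volume_ids(journal_set):
    """Walk the linked list of journal volumes and collect their identifiers."""
    ids, node = [], journal_set.jvol_list
    while node is not None:
        ids.append(node.jvol_id)
        node = node.jvol_next
    return ids
```

Keeping the AFTER and BEFORE sets separate matches the statement that the two kinds of journal entries are stored in separate journal volumes.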
  • the recovery manager 111 provides the following interface to the storage system for the aspect of the invention which provides for “fast” and “undo-able” recovery modes.
  • the interface is shown in a format of an application programmer's interface (API).
  • the functionality and needed information (parameters) are described. It can be appreciated that any suitable programming language can be used.
  • in FIG. 8 , a generalized process flow is shown highlighting the steps for recovering data in accordance with the “fast” and “undo-able” recovery mode aspects of the present invention.
  • the retrieval methods and apparatus disclosed herein are not limited to disaster recovery scenarios.
  • the invention has applicability for users (e.g., system administrators) who might have a need to look at the state of a file or a directory at an earlier point in time.
  • the term “recovery volume” is used in a generic sense to refer to one or more volumes on which the data recovery process is being performed.
  • the recovery manager 111 can include a suitable interface for interaction with a user.
  • An appropriate interface might be a graphical user interface, or a command line interface.
  • voice recognition technology and even virtual reality technology can be used as input and output components of the interface for interacting with a user.
  • the “user” can be a machine (such as a data processing system) rather than a human. In such a case, a suitable machine-machine interface can be readily devised and implemented.
  • the first phase of the recovery process is referred to as “fast” recovery.
  • the idea is to quickly access the data state of the recovery volume at a point in time that is “close” in time to the desired data state, but prior in time to the desired data state.
  • the recovery manager 111 obtains from the user a “target time” that specifies a point in time that is close to the time of the desired data state.
  • a suitable query to the user might inform the user as to the nature of this target time. For example, if the user interacted with a system administrator, she might tell the administrator that she was sure her files were not deleted until after 10:30 AM. The target time would then be 10:30 AM, or earlier.
  • a user interface can obtain such information from a user by presenting a suitable set of queries or prompts. Given the target time, the recovery manager can then issue a RECOVER_PH1 operation to the storage system (e.g., system 100 , FIG. 1 ) that contains the recovery volume.
  • the storage system would initiate phase I recovery.
  • the storage system 100 in response to the RECOVER_PH1 request, would determine in a step 910 whether recovery is possible. Two conditions are checked:
  • recovery target time is in scope: the target time that the user specifies must be between the oldest journal and the newest journal.
  • the recovery volume is set to an offline state.
  • offline is taken to mean that the user, and more generally the host device 110 , cannot access the recovery volume.
  • in the case that the production volume is being used as the recovery volume, it is likely to be desirable that the host 110 be prevented at least from issuing write operations to the volume. Also, the host typically will not be permitted to perform read operations.
  • the storage system itself has full access to the recovery volume in order to perform the recovery task.
  • the snapshot is copied to the recovery volume in preparation for phase I recovery.
  • The production volume itself can be the recovery volume.
  • the recovery manager 111 can allow the user to specify a volume other than the production volume to serve as the target of the data recovery operation.
  • the recovery volume can be the volume on which the snapshot is stored. Using a volume other than the production volume to perform the recovery operation may be preferred where it is desirable to provide continued use of the production volume.
  • one or more AFTER journal entries are applied to update the snapshot volume in the manner as discussed previously. Enough AFTER journal entries are applied to update the snapshot to a point in time up to or prior to the user-specified target time.
  • the storage system 100 can signal the recovery manager (step 820 ) to indicate phase I has completed.
  • the recovery manager 111 would then issue a STOP_RECOVER operation to the storage system.
  • the storage system 100 (step 830 ) would put the recovery volume into an online state.
  • the “online” state is taken to mean that the host device 110 is given access to the recovery volume.
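The phase I exchange between the recovery manager and the storage system can be modeled roughly as below. This is a toy sketch under assumptions: the class name, the method names `recover_ph1` and `stop_recover` (standing in for the RECOVER_PH1 and STOP_RECOVER operations), the use of plain numbers for times, and the dictionary volume model are all illustrative.

```python
class StorageSystemSketch:
    """Toy model of phase I ("fast") recovery on a recovery volume."""

    def __init__(self, snapshot, snapshot_time, after_journal):
        self.snapshot = snapshot            # dict addr -> value
        self.snapshot_time = snapshot_time  # time the snapshot was taken
        # AFTER journal entries as (time, addr, value), kept in time order.
        self.after_journal = sorted(after_journal, key=lambda e: e[0])
        self.recovery_volume = {}
        self.online = True

    def recover_ph1(self, target_time):
        """Handle RECOVER_PH1; returns False if the target time is out of scope."""
        times = [t for t, _, _ in self.after_journal]
        # Scope check: target must lie between the oldest and newest journal.
        if not times or not (times[0] <= target_time <= times[-1]):
            return False
        self.online = False                          # host access is blocked
        self.recovery_volume = dict(self.snapshot)   # copy snapshot to recovery volume
        for t, addr, value in self.after_journal:
            if self.snapshot_time < t <= target_time:
                self.recovery_volume[addr] = value   # replay the write
        return True

    def stop_recover(self):
        """Handle STOP_RECOVER: give the host access to the recovery volume again."""
        self.online = True
```

A recovery manager would call `recover_ph1`, wait for completion, call `stop_recover`, and then let the user inspect the recovery volume.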
  • in a step 840 , the user is given the opportunity to review the state of the data on the recovery volume to determine whether the desired data state has been recovered. At this point, the data state has been recovered to some point in time prior to the time of the desired data state. Additional recovery might be needed to reach the desired data state. If the desired data state has been achieved then the recovery process is stopped. If the desired data state is not achieved, then a determination is made whether another phase I recovery operation is to be performed, or whether a phase II recovery operation is to be performed.
  • phase I recovery involves updating the snapshot by applying the AFTER journal entries to it to reproduce the sequence of write operations made since the snapshot was taken.
  • a phase II recovery operation involves taking a BEFORE journal entry for each AFTER journal entry that is applied.
  • phase II recovery is a slower process than phase I recovery.
  • the decision whether to proceed using phase I recovery mode or phase II recovery mode can be made by the user after she has inspected the recovered data state. For example, she may learn from inspecting the recovered data state that an additional few hours of recovery is needed, in which case she may specify via the recovery manager 111 to perform the faster phase I recovery and provide a refined target time. If the recovered data state seems close to the desired data state, then the user may want to perform the slower phase II recovery to take advantage of the “undo” aspect (see below) provided by a phase II recovery operation.
  • the user interface can algorithmically determine whether to perform phase I or phase II recovery.
  • the interface can input the user's refined target time and compare that against the initial target time. Based on the comparison, the interface can choose an appropriate recovery mode. For example, if the difference in time is X minutes or greater, then a phase I recovery is performed, otherwise a phase II recovery is commenced.
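The comparison described above reduces to a few lines. In this sketch the function name and the 30 minute default are assumptions; the text leaves the value of X open.

```python
from datetime import datetime, timedelta


def choose_mode(initial_target, refined_target, cutoff=timedelta(minutes=30)):
    """Return "phase1" for a large jump in time, "phase2" for a fine adjustment."""
    gap = abs(refined_target - initial_target)
    return "phase1" if gap >= cutoff else "phase2"
```

For example, refining a 10:30 AM target to 1:30 PM would select the fast mode, while refining it to 10:40 AM would select the undo-able mode.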
  • phase I recovery cannot be conveniently “undone.” If the recovered data state is beyond the desired data state, then the only way to reverse the data recovery action is to start again from the original snapshot. This can be time consuming.
  • a phase II recovery in accordance with the present invention can be undone. Thus, if a recovered data state is close to the user's refined time estimate, then a phase II recovery operation may be preferred.
  • FIG. 8 shows a step 850 for the initiation of phase II recovery. This includes taking the recovery volume offline and applying one or more AFTER journal entries to the snapshot as before, in order to move the state of the recovered data forward in time.
  • phase II processing includes the additional step of taking BEFORE journal entries.
  • with BEFORE journaling turned on, a BEFORE journal entry is taken of the snapshot prior to updating the snapshot with an AFTER journal entry; one such BEFORE journal entry is taken for each AFTER journal entry.
  • a BEFORE journal entry records the data that is stored in the target location of the write operation. Consequently, the state of the snapshot is preserved in a BEFORE journal entry prior to updating the snapshot with an AFTER journal entry.
  • pairs of BEFORE journal and AFTER journal entries are created during phase II recovery.
  • sequence numbering provided by the sequence number (SEQ) 313 is associated with each BEFORE journal entry.
  • a STOP_RECOVER operation is issued to put the recovery volume in an online state.
  • the user is then able to inspect the recovery volume. Based on the inspection, if the user determines in a step 870 that the desired data state of the recovery volume is achieved, then the recovery process is complete. If the user determines that the desired data state is not achieved, then a further determination is made whether the data recovery has gone beyond the desired data state. If so, then the snapshot updates are “undone” (step 880 ) by accessing one or more BEFORE journal entries. This combination of taking BEFORE journals and AFTER journals constitutes a phase II recovery.
  • FIG. 10 illustrates how an updated snapshot can be undone.
  • the figure shows that at some point in time a snapshot 1020 of a recovery volume (e.g., data volume 101 , FIG. 1 ) was taken.
  • the figure shows phase II processing where BEFORE and AFTER journal entries are taken.
  • the application of the AFTER journal entry 1012 a to the snapshot is preceded by a BEFORE journal entry 1012 .
  • the BEFORE journal entry contains the original data that is stored in the area of the recovery volume that is the target of the write operation recorded by the AFTER journal entry, prior to performing the write operation.
  • a pair of journal entries is created comprising an AFTER journal entry and a corresponding BEFORE journal entry.
  • the AFTER journal entry 1012 a is then applied to the snapshot to update the snapshot.
  • a BEFORE journal entry 1014 is created to record the original data in the area of the production volume that is the target of the AFTER journal entry before the AFTER journal entry is applied to the snapshot 1020 .
  • a pair of journal entries result: an AFTER journal entry 1014 a and its corresponding BEFORE journal entry 1014 .
  • Similar BEFORE journal entries 1016 and 1018 are created for the AFTER journal entries 1016 a and 1018 a.
  • the snapshot 1020 is updated by the sequential application of the AFTER journal entries 1012 a - 1018 a (along with the creation of the corresponding BEFORE journal entries 1012 - 1018 ).
  • the snapshot 1020 is updated by performing the write operation indicated in the AFTER journal entry 1012 a to produce an updated snapshot 1120 a .
  • the updated snapshot 1120 a is again updated by performing the write operation indicated in the AFTER journal entry 1014 a to produce 1120 b .
  • the updated snapshot 1120 b is subsequently updated in turn by the AFTER journal entries 1016 a and 1018 a to produce snapshots 1120 c and 1120 d.
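The pairing of BEFORE and AFTER journal entries during phase II can be sketched as follows. The function name, the tuple layout, and the dictionary volume model are assumptions; `None` is used here to mark a location that held no data before the write.

```python
def apply_with_before(image, after_entries):
    """Apply AFTER entries in order and return the BEFORE entries taken.

    image:         dict addr -> value, updated in place.
    after_entries: list of (seq, addr, value), in sequence order.
    """
    before_entries = []
    for seq, addr, value in after_entries:
        # Record what the target location holds before the write is applied.
        before_entries.append((seq, addr, image.get(addr)))
        image[addr] = value
    return before_entries
```

Each BEFORE entry carries the same sequence number as the AFTER entry it is paired with, so the two lists line up one to one.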
  • one or more of the BEFORE journal entries can be applied (step 880 ) to the updated snapshot in this manner to perform a “reverse update” of one or more of the AFTER journal entries.
  • the number of BEFORE journal entries to apply can be a fixed number; for example, move back in time by one minute increments, or by some number N of BEFORE journal entries.
  • the user can specify how far back in time to move the data state by specifying a reverse target time (e.g., an absolute time such as 10:34 AM), or an increment of time (e.g., a delta time value such as 10 minutes).
  • the user is given the opportunity to inspect the data state of the recovery volume to determine whether to continue backward in time or to move forward. Repeating this allows the user to restore the desired data state in an iterative and interactive manner by shuffling the data state backward and forward in time.
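A reverse update by a fixed number of BEFORE journal entries might look like the sketch below. The function name and data layout are assumptions; BEFORE entries are taken to be `(seq, addr, old_value)` tuples, oldest first, with `None` meaning the location held no data.

```python
def undo(image, before_entries, count):
    """Reverse the most recent `count` updates using BEFORE journal entries.

    image:          dict addr -> value, modified in place.
    before_entries: list of (seq, addr, old_value), oldest first; the entries
                    that are applied are removed from the end of the list.
    """
    for _ in range(min(count, len(before_entries))):
        seq, addr, old_value = before_entries.pop()  # newest BEFORE entry first
        if old_value is None:
            image.pop(addr, None)    # the location was empty before the write
        else:
            image[addr] = old_value  # restore the original data
    return image
```

Entries are undone newest first, so the image passes back through the same intermediate states it went through on the way forward.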
  • phase II processing will be slower than phase I recovery for the reason that a BEFORE journal entry must be created before applying an AFTER journal entry to update the snapshot. For this reason, phase I recovery is also referred to as “fast recovery.” Since phase II recovery permits the user to undo an updated snapshot, it can be referred to as “undo-able” recovery.
  • the foregoing disclosed embodiments typically can be provided using a combination of hardware and software implementations; e.g., combinations of software, firmware, and/or custom logic such as ASICs (application specific ICs) are possible.
  • One of ordinary skill can readily appreciate that the underlying technical implementation will be determined based on factors including but not limited to or restricted to system cost, system performance, the existence of legacy software and legacy hardware, operating environment, and so on.
  • the disclosed embodiments can be readily reduced to specific implementations without undue experimentation by those of ordinary skill in the relevant art.

Abstract

A storage system maintains a journal and a snapshot of one or more data volumes. Two journal entry types are maintained, an AFTER journal entry and a BEFORE journal entry. Two modes of data recovery are provided: “fast” recovery and “undo-able” recovery. A combination of both recovery modes allows the user to quickly recover a targeted data state.

Description

    CROSS-REFERENCES TO RELATED APPLICATIONS
  • This application is related to the following commonly owned and co-pending U.S. applications:
      • “Method and Apparatus for Data Recovery Using Storage Based Journaling,” Attorney Docket Number 16869B-082700US, and
      • “Method and Apparatus for Synchronizing Applications for Data Recovery Using Storage Based Journaling,” Attorney Docket Number 16869B-082900US, both of which are herein incorporated by reference for all purposes.
    BACKGROUND OF THE INVENTION
  • The present invention is related to computer storage and in particular to the recovery of data.
  • Several methods are conventionally used to prevent the loss of data. Typically, data is backed up in a periodic manner (e.g., once a day) by a system administrator. Many systems are commercially available which provide backup and recovery of data; e.g., Veritas NetBackup, Legato/Networker, and so on. Another technique is known as volume shadowing. This technique produces a mirror image of data onto a secondary storage system as it is being written to the primary storage system.
  • Journaling is a backup and restore technique commonly used in database systems. An image of the data to be backed up is taken. Then, as changes are made to the data, a journal of the changes is maintained. Recovery of data is accomplished by applying the journal to an appropriate image to recover data at any point in time. Typical database systems, such as Oracle, can perform journaling.
  • Except for database systems, however, there are no ways to recover data at any point in time. Even for database systems, applying a journal takes time since the procedure includes:
      • reading the journal data from storage (e.g., disk)
      • analyzing the journal to determine where in the journal the desired data can be found
      • applying the journal data to a suitable image of the data to reproduce the activities performed on the data; this usually involves accessing the image, and writing out data as the journal is applied
        Also, if an application running on the database system interacts with another application (regardless of whether it is a database system or not), then there is no way to recover its data at any point in time. This is because there is no coordination mechanism to recover the data of the other application.
  • Recovering data at any point in time addresses the following types of administrative requirements. For example, a typical request might be, “I deleted a file by mistake at around 10:00 am yesterday. I have to recover the file just before it was deleted.”
  • If the data is not in a database system, this kind of request cannot be conveniently, if at all, serviced. A need therefore exists for processing data in a manner that facilitates recovery of lost data. A need exists for being able to provide data processing that facilitates data recovery in user environments other than in a database application, or database application interacting with other applications.
    SUMMARY OF THE INVENTION
  • The invention is directed to method and apparatus for data recovery and comprises performing a fast recovery mode operation in conjunction with an undo-able recovery mode operation. In the fast recovery mode operation, after-journal entries are applied to a snapshot to update the snapshot. In the undo-able recovery mode operation, a before-journal entry is taken of the snapshot before applying an after-journal entry to it. A user can perform one or more undo operations when a snapshot has been updated in the undo-able recovery mode.
    BRIEF DESCRIPTION OF THE DRAWINGS
  • Aspects, advantages and novel features of the present invention will become apparent from the following description of the invention presented in conjunction with the accompanying drawings:
  • FIG. 1 is a high level generalized block diagram of an illustrative embodiment of the present invention;
  • FIG. 2 is a generalized illustration of an illustrative embodiment of a data structure for storing journal entries in accordance with the present invention;
  • FIG. 3 is a generalized illustration of an illustrative embodiment of a data structure for managing the snapshot volumes and the journal entry volumes in accordance with the present invention;
  • FIG. 3A is a generalized illustration of an illustrative embodiment of a data structure for managing the snapshot volumes and the journal entry volumes in accordance with another aspect of the present invention;
  • FIG. 4 is a high level flow diagram highlighting the processing between the recovery manager and the controller in the storage system;
  • FIG. 5 illustrates the relationship between a snapshot and a plurality of journal entries;
  • FIG. 5A illustrates the relationship among a plurality of snapshots and a plurality of journal entries;
  • FIG. 6 is a high level illustration of the data flow when an overflow condition arises;
  • FIG. 7 is a high level flow chart highlighting an aspect of the controller in the storage system to handle an overflow condition;
  • FIG. 7A illustrates an alternative to a processing step shown in FIG. 7;
  • FIG. 8 is a generalized flowchart highlighting data recovery in accordance with another aspect of the invention;
  • FIG. 9 is a flowchart highlighting the steps for phase I recovery;
  • FIG. 10 is a diagrammatic illustration of the BEFORE and AFTER journaling; and
  • FIG. 11 is provided to illustrate how an “undo” operation can be performed using the journaling shown in FIG. 10.
    DESCRIPTION OF THE SPECIFIC EMBODIMENTS
  • FIG. 1 is a high level generalized block diagram of an illustrative embodiment of a backup and recovery system according to the present invention. When the system is activated, a snapshot is taken for production data volumes (DVOL) 101. The term “snapshot” in this context conventionally refers to a data image of the data volume at a given point in time. Depending on system requirements, implementation, and so on, the snapshot can be of the entire data volume, or some portion or portions of the data volume(s); e.g., filesystem(s), file(s), directory(ies), etc. During the normal course of operation of the system in accordance with the invention, a journal entry is made for every write operation issued from the host to the data volumes. As will be discussed below, by applying a series of journal entries to an appropriate snapshot, data can be recovered at any point in time.
  • The backup and recovery system shown in FIG. 1 includes at least one storage system 100. Though not shown, one of ordinary skill can appreciate that the storage system includes suitable processor(s), memory, and control circuitry to perform I/O between a host 110 and its storage media (e.g., disks). The backup and recovery system also requires at least one host 110. A suitable communication path 130 is provided between the host and the storage system.
  • The host 110 typically will have one or more user applications (APP) 112 executing on it. These applications will read and/or write data to storage media contained in the data volumes 101 of storage system 100. Thus, applications 112 and the data volumes 101 represent the target resources to be protected. It can be appreciated that data used by the user applications can be stored in one or more data volumes.
  • In accordance with the invention, a journal group (JNLG) 102 is defined. The data volumes 101 are organized into the journal group. In accordance with the present invention, a journal group is the smallest unit of data volumes where journaling of the write operations from the host 110 to the data volumes is guaranteed. The associated journal records the order of write operations from the host to the data volumes in proper sequence. The journal data produced by the journaling activity can be stored in one or more journal volumes (JVOL) 106.
  • The host 110 also includes a recovery manager (RM) 111. This component provides a high level coordination of the backup and recovery operations. Additional detail about the recovery manager is provided below.
  • The storage system 100 provides a snapshot (SS) 105 of the data volumes comprising a journal group. For example, the snapshot 105 is representative of the data volumes 101 in the journal group 102 at the point in time that the snapshot was taken. Conventional methods are known for producing the snapshot image. One or more snapshot volumes (SVOL) 107 are provided in the storage system which contain the snapshot data. A snapshot can be contained in one or more snapshot volumes. Though the disclosed embodiment illustrates separate storage components for the journal data and the snapshot data, it can be appreciated that other implementations can provide a single storage component for storing the journal data and the snapshot data.
  • A management table (MT) 108 is provided to store the information relating to the journal group 102, the snapshot 105, and the journal volume(s) 106. FIG. 3 and the accompanying discussion below reveal additional detail about the management table.
  • A controller component 140 is also provided which coordinates the journaling of write operations and snapshots of the data volumes, and the corresponding movement of data among the different storage components 101, 106, 107. It can be appreciated that the controller component is a logical representation of a physical implementation which may comprise one or more sub-components distributed within the storage system 100.
  • FIG. 2 shows the data used in an implementation of the journal. When a write request from the host 110 arrives at the storage system 100, a journal is generated in response. The journal comprises a Journal Header 219 and Journal Data 225. The Journal Header 219 contains information about its corresponding Journal Data 225. The Journal Data 225 comprises the data (write data) that is the subject of the write operation. This kind of journal is also referred to as an “AFTER journal.”
  • The Journal Header 219 comprises an offset number (JH_OFS) 211. The offset number identifies a particular data volume 101 in the journal group 102. In this particular implementation, the data volumes are ordered as the 0th data volume, the 1st data volume, the 2nd data volume and so on. The offset numbers might be 0, 1, 2, etc.
  • A starting address in the data volume (identified by the offset number 211) to which the write data is to be written is stored to a field in the Journal Header 219 to contain an address (JH_ADR) 212. For example, the address can be represented as a block number (LBA, Logical Block Address).
  • A field in the Journal Header 219 stores a data length (JH_LEN) 213, which represents the data length of the write data. Typically it is represented as a number of blocks.
  • A field in the Journal Header 219 stores the write time (JH_TIME) 214, which represents the time when the write request arrives at the storage system 100. The write time can include the calendar date, hours, minutes, seconds and even milliseconds. This time can be provided by the disk controller 140 or by the host 110. For example, in a mainframe computing environment, two or more mainframe hosts share a timer, called the Sysplex Timer, and can provide the time in a write command when it is issued.
  • A sequence number (JH_SEQ) 215 is assigned to each write request. The sequence number is stored in a field in the Journal Header 219. Every sequence number within a given journal group 102 is unique. The sequence number is assigned to a journal entry when it is created.
  • A journal volume identifier (JH_JVOL) 216 is also stored in the Journal Header 219. The volume identifier identifies the journal volume 106 associated with the Journal Data 225. The identifier is indicative of the journal volume containing the Journal Data. It is noted that the Journal Data can be stored in a journal volume that is different from the journal volume which contains the Journal Header.
  • A journal data address (JH_JADR) 217 stored in the Journal Header 219 contains the beginning address of the Journal Data 225 in the associated journal volume 106 that contains the Journal Data.
  • FIG. 2 shows that the journal volume 106 comprises two data areas: a Journal Header Area 210 and a Journal Data Area 220. The Journal Header Area 210 contains only Journal Headers 219, and Journal Data Area 220 contains only Journal Data 225. The Journal Header is a fixed size data structure. A Journal Header is allocated sequentially from the beginning of the Journal Header Area. This sequential organization corresponds to the chronological order of the journal entries. As will be discussed, data is provided that points to the first journal entry in the list, which represents the “oldest” journal entry. It is typically necessary to find the Journal Header 219 for a given sequence number (as stored in the sequence number field 215) or for a given write time (as stored in the time field 214).
  • A journal type field (JH_TYPE) 218 identifies the type of journal entry. In accordance with the invention, two types of journal entries are kept: (1) an AFTER journal and (2) a BEFORE journal. An AFTER journal entry contains the data that is contained in the write operation for which a journal entry is made. A BEFORE journal entry contains the original data of the area in storage that is the target of a write operation. A BEFORE journal entry therefore represents the contents “before” the write operation is performed. The purpose of maintaining BEFORE journal entries will be discussed below.
  • Journal Header 219 and Journal Data 225 are contained in chronological order in their respective areas in the journal volume 106. Thus, the order in which the Journal Header and the Journal Data are stored in the journal volume is the same order as the assigned sequence number. As will be discussed below, an aspect of the present invention is that the journal information 219, 225 wrap within their respective areas 210, 220.
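  • The Journal Header fields described above can be summarized as a record. The following is a minimal sketch only: the Python types, the representation of the write time as a number of seconds, and the example values are assumptions made for illustration and are not specified by the description.

```python
from dataclasses import dataclass
from enum import Enum


class JournalType(Enum):
    AFTER = "AFTER"    # entry holds the data carried by the write operation
    BEFORE = "BEFORE"  # entry holds the original contents of the target area


@dataclass(frozen=True)
class JournalHeader:
    jh_ofs: int           # JH_OFS 211: offset number of the data volume in the journal group
    jh_adr: int           # JH_ADR 212: starting address (e.g., LBA) in the data volume
    jh_len: int           # JH_LEN 213: length of the write data, in blocks
    jh_time: float        # JH_TIME 214: write time (seconds; representation is an assumption)
    jh_seq: int           # JH_SEQ 215: sequence number, unique within the journal group
    jh_jvol: int          # JH_JVOL 216: journal volume that holds the Journal Data
    jh_jadr: int          # JH_JADR 217: beginning address of the Journal Data
    jh_type: JournalType  # JH_TYPE 218: AFTER or BEFORE


# Illustrative values only.
header = JournalHeader(jh_ofs=0, jh_adr=2048, jh_len=8, jh_time=1058400000.0,
                       jh_seq=3, jh_jvol=1, jh_jadr=4096,
                       jh_type=JournalType.AFTER)
```

  • Because the header is a fixed size structure, a concrete implementation would presumably serialize these fields at fixed offsets; that layout is not given here and is left out of the sketch.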
  • FIG. 3 shows detail about the management table 108 (FIG. 1). In order to manage the Journal Header Area 210 and Journal Data Area 220, pointers for each area are needed. As mentioned above, the management table maintains configuration information about a journal group 102 and the relationship between the journal group and its associated journal volume(s) 106 and snapshot image 105. The management table 300 shown in FIG. 3 illustrates an example management table and its contents. The management table stores a journal group ID (GRID) 310 which identifies a particular journal group 102 in a storage system 100. A journal group name (GRNAME) 311 can also be provided to identify the journal group with a human recognizable identifier.
  • A journal attribute (GRATTR) 312 is associated with the journal group 102. In accordance with this particular implementation, two attributes are defined: MASTER and RESTORE. The MASTER attribute indicates the journal group is being journaled. The RESTORE attribute indicates that the journal group is being restored from a journal.
  • A journal status (GRSTS) 315 is associated with the journal group 102. There are two statuses: ACTIVE and INACTIVE.
  • The management table includes a field to hold a sequence counter (SEQ) 313. This counter serves as the source of sequence numbers used in the Journal Header 219. When creating a new journal, the sequence number 313 is read and assigned to the new journal. Then, the sequence number is incremented and written back into the management table.
  • The number (NUM_DVOL) 314 of data volumes 101 contained in a given journal group 102 is stored in the management table.
  • A data volume list (DVOL_LIST) 320 lists the data volumes in a journal group. In a particular implementation, DVOL_LIST is a pointer to the first entry of a data structure which holds the data volume information. This can be seen in FIG. 3. Each data volume information comprises an offset number (DVOL_OFFS) 321. For example, if the journal group 102 comprises three data volumes, the offset values could be 0, 1 and 2. A data volume identifier (DVOL_ID) 322 uniquely identifies a data volume within the entire storage system 100. A pointer (DVOL_NEXT) 324 points to the data structure holding information for the next data volume in the journal group; it is a NULL value otherwise.
  • The management table includes a field to store the number of journal volumes (NUM_JVOL) 330 that are being used to contain the data (journal header and journal data) associated with a journal group 102.
  • As described in FIG. 2, the Journal Header Area 210 contains the Journal Headers 219 for each journal; likewise for the Journal Data components 225. As mentioned above, an aspect of the invention is that the data areas 210, 220 wrap. This allows for journaling to continue despite the fact that there is limited space in each data area.
  • The management table includes fields to store pointers to different parts of the data areas 210, 220 to facilitate wrapping. Fields are provided to identify where the next journal entry is to be stored. A field (JI_HEAD_VOL) 331 identifies the journal volume 106 that contains the Journal Header Area 210 which will store the next new Journal Header 219. A field (JI_HEAD_ADR) 332 identifies an address on the journal volume of the location in the Journal Header Area where the next Journal Header will be stored. The journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 335. A field (JI_DATA_ADR) 336 identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331, 332, 335, 336.
  • The management table also includes fields which identify the “oldest” journal entry. The use of this information will be described below. A field (JO_HEAD_VOL) 333 identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219. A field (JO_HEAD_ADR) 334 identifies the address within the Journal Header Area of the location of the journal header of the oldest journal. A field (JO_DATA_VOL) 337 identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 338.
  • The management table includes a list of journal volumes (JVOL_LIST) 340 associated with a particular journal group 102. In a particular implementation, JVOL_LIST is a pointer to a data structure of information for journal volumes. As can be seen in FIG. 3, each data structure comprises an offset number (JVOL_OFS) 341 which identifies a particular journal volume 106 associated with a given journal group 102. For example, if a journal group is associated with two journal volumes 106, then each journal volume might be identified by a 0 or a 1. A journal volume identifier (JVOL_ID) 342 uniquely identifies the journal volume within the storage system 100. Finally, a pointer (JVOL_NEXT) 344 points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
  • The management table includes a list (SS_LIST) 350 of snapshot images 105 associated with a given journal group 102. In this particular implementation, SS_LIST is a pointer to snapshot information data structures, as indicated in FIG. 3. Each snapshot information data structure includes a sequence number (SS_SEQ) 351 that is assigned when the snapshot is taken. As discussed above, the number comes from the sequence counter 313. A time value (SS_TIME) 352 indicates the time when the snapshot was taken. A status (SS_STS) 358 is associated with each snapshot; valid values include VALID and INVALID. A pointer (SS_NEXT) 353 points to the next snapshot information data structure; it is a NULL value otherwise.
  • Each snapshot information data structure also includes a list of snapshot volumes 107 (FIG. 1) used to store the snapshot images 105. As can be seen in FIG. 3, a pointer (SVOL_LIST) 354 to a snapshot volume information data structure is stored in each snapshot information data structure. Each snapshot volume information data structure includes an offset number (SVOL_OFFS) 355 which identifies a snapshot volume that contains at least a portion of the snapshot image. It is possible that a snapshot image will be segmented or otherwise partitioned and stored in more than one snapshot volume. In this particular implementation, the offset identifies the ith snapshot volume which contains a portion (segment, partition, etc) of the snapshot image. In one implementation, the ith segment of the snapshot image might be stored in the ith snapshot volume. Each snapshot volume information data structure further includes a snapshot volume identifier (SVOL_ID) 356 that uniquely identifies the snapshot volume in the storage system 100. A pointer (SVOL_NEXT) 357 points to the next snapshot volume information data structure for a given snapshot image.
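  • The snapshot list and its per-snapshot volume list are both NULL-terminated linked structures. A minimal sketch of that shape follows; the Python types, the string volume identifiers, and the helper `snapshot_volume_ids` are hypothetical and serve only to show how the SVOL_NEXT chain is walked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnapshotVolumeInfo:
    svol_offs: int                                    # SVOL_OFFS 355
    svol_id: str                                      # SVOL_ID 356
    svol_next: Optional["SnapshotVolumeInfo"] = None  # SVOL_NEXT 357 (None plays NULL)


@dataclass
class SnapshotInfo:
    ss_seq: int                                       # SS_SEQ 351
    ss_time: float                                    # SS_TIME 352
    ss_sts: str                                       # SS_STS 358: "VALID" or "INVALID"
    svol_list: Optional[SnapshotVolumeInfo] = None    # SVOL_LIST 354
    ss_next: Optional["SnapshotInfo"] = None          # SS_NEXT 353


def snapshot_volume_ids(snapshot):
    """Follow SVOL_NEXT from SVOL_LIST and collect the volume identifiers."""
    ids = []
    node = snapshot.svol_list
    while node is not None:
        ids.append(node.svol_id)
        node = node.svol_next
    return ids


# A snapshot image segmented across two snapshot volumes (illustrative values).
second = SnapshotVolumeInfo(svol_offs=1, svol_id="SVOL-B")
first = SnapshotVolumeInfo(svol_offs=0, svol_id="SVOL-A", svol_next=second)
snap = SnapshotInfo(ss_seq=2, ss_time=100.0, ss_sts="VALID", svol_list=first)
volume_ids = snapshot_volume_ids(snap)
```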
  • FIG. 4 shows a flowchart highlighting the processing performed by the recovery manager 111 and Storage System 100 to initiate backup processing in accordance with the illustrative embodiment of the invention as shown in the figures. If journal entries are not recorded during the taking of a snapshot, the write operations corresponding to those journal entries would be lost and data corruption could occur during a data restoration operation. Thus, in accordance with an aspect of the invention, the journaling process is started prior to taking the first snapshot. Doing this ensures that any write operations which occur during the taking of a snapshot are journaled. As a note, any journal entries recorded prior to the completion of the snapshot can be ignored.
  • Further in accordance with the invention, a single sequence of numbers (SEQ) 313 is associated with each of one or more snapshots and journal entries, as they are created. The purpose of associating the same sequence of numbers to both the snapshots and the journal entries will be discussed below.
  • Continuing with FIG. 4, the recovery manager 111 might define, in a step 410, a journal group (JNLG) 102 if one has not already been defined. As indicated in FIG. 1, this may include identifying one or more data volumes (DVOL) 101 for which journaling is performed, and identifying one or more journal volumes (JVOL) 106 which are used to store the journal-related information. The recovery manager performs a suitable sequence of interactions with the storage system 100 to accomplish this. In a step 415, the storage system may create a management table 108 (FIG. 1), incorporating the various information shown in the table detail 300 illustrated in FIG. 3. Among other things, the process includes initializing the JVOL_LIST 340 to list the journal volumes which comprise the journal group 102. Likewise, the list of data volumes DVOL_LIST 320 is created. The fields which identify the next journal entry (or in this case where the table is first created, the first journal entry) are initialized. Thus, JI_HEAD_VOL 331 might identify the first in the list of journal volumes and JI_HEAD_ADR 332 might point to the first entry in the Journal Header Area 210 located in the first journal volume. Likewise, JI_DATA_VOL 335 might identify the first in the list of journal volumes and JI_DATA_ADR 336 might point to the beginning of the Journal Data Area 220 in the first journal volume. Note that the header and the data areas 210, 220 may reside on different journal volumes, so JI_DATA_VOL might identify a journal volume different from the first journal volume.
  • In a step 420, the recovery manager 111 will initiate the journaling process. Suitable communication(s) are made to the storage system 100 to perform journaling. In a step 425, the storage system will make a journal entry (also referred to as an “AFTER journal”) for each write operation that issues from the host 110.
  • With reference to FIG. 3, making a journal entry includes, among other things, identifying the location for the next journal entry. The fields JI_HEAD_VOL 331 and JI_HEAD_ADR 332 identify the journal volume 106 and the location in the Journal Header Area 210 of the next Journal Header 219. The sequence counter (SEQ) 313 from the management table is copied to (associated with) the JH_SEQ 215 field of the next header. The sequence counter is then incremented and stored back to the management table. Of course, the sequence counter can be incremented first, copied to JH_SEQ, and then stored back to the management table.
  • The fields JI_DATA_VOL 335 and JI_DATA_ADR 336 in the management table identify the journal volume and the beginning of the Journal Data Area 220 for storing the data associated with the write operation. The JI_DATA_VOL and JI_DATA_ADR fields are copied to JH_JVOL 216 and to JH_JADR 217, respectively, of the Journal Header, thus providing the Journal Header with a pointer to its corresponding Journal Data. The data of the write operation is stored.
  • The JI_HEAD_VOL 331 and JI_HEAD_ADR 332 fields are updated to point to the next Journal Header 219 for the next journal entry. This involves taking the next contiguous Journal Header entry in the Journal Header Area 210. Likewise, the JI_DATA_ADR field (and perhaps JI_DATA_VOL field) is updated to reflect the beginning of the Journal Data Area for the next journal entry. This involves advancing to the next available location in the Journal Data Area. These fields therefore can be viewed as pointing to a list of journal entries. Journal entries in the list are linked together by virtue of the sequential organization of the Journal Headers 219 in the Journal Header Area 210.
  • When the end of the Journal Header Area 210 is reached, the Journal Header 219 for the next journal entry wraps to the beginning of the Journal Header Area. Similarly for the Journal Data 225. To prevent overwriting earlier journal entries, the present invention provides for a procedure to free up entries in the journal volume 106. This aspect of the invention is discussed below.
  • For the very first journal entry, the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 are set to the contents of their corresponding “JI_” fields. As will be explained, the “JO_” fields point to the oldest journal entry. Thus, as new journal entries are made, the “JO_” fields do not advance while the “JI_” fields do advance. Update of the “JO_” fields is discussed below.
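  • The interplay of the “JI_” and “JO_” pointers in a wrapping Journal Header Area can be sketched as a ring of fixed-size slots. This is a simplified model under stated assumptions: a single journal volume, slot indices standing in for addresses, and a `count` field that is added bookkeeping not named in the description. The class name `JournalHeaderArea` is hypothetical.

```python
class JournalHeaderArea:
    """Fixed number of header slots that wrap (hypothetical model)."""

    def __init__(self, slots):
        self.slots = [None] * slots
        self.ji = 0     # stands in for JI_HEAD_ADR: where the next header is stored
        self.jo = 0     # stands in for JO_HEAD_ADR: the oldest live header
        self.count = 0  # live entries; added bookkeeping, not a field in FIG. 3

    def append(self, header):
        if self.count == len(self.slots):
            # Overwriting would destroy earlier entries; they must be freed first.
            raise RuntimeError("journal header area is full")
        self.slots[self.ji] = header
        self.ji = (self.ji + 1) % len(self.slots)  # wrap to the beginning at the end
        self.count += 1

    def free_oldest(self):
        if self.count == 0:
            raise RuntimeError("journal header area is empty")
        header = self.slots[self.jo]
        self.slots[self.jo] = None
        self.jo = (self.jo + 1) % len(self.slots)  # JO advances only when freeing
        self.count -= 1
        return header


area = JournalHeaderArea(slots=3)
for seq in (0, 1, 2):
    area.append({"JH_SEQ": seq})     # JI advances; JO stays at the first entry
freed = area.free_oldest()           # frees the entry with JH_SEQ 0
area.append({"JH_SEQ": 3})           # JI has wrapped, so this reuses slot 0
```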
  • Continuing with the flowchart of FIG. 4, when the journaling process has been initiated, all write operations issuing from the host are journaled. Then in a step 430, the recovery manager 111 will initiate taking a snapshot of the data volumes 101. The storage system 100 receives an indication from the recovery manager to take a snapshot. In a step 435, the storage system performs the process of taking a snapshot of the data volumes. Among other things, this includes accessing SS_LIST 350 from the management table (FIG. 3). A suitable amount of memory is allocated for fields 351-354 to represent the next snapshot. The sequence counter (SEQ) 313 is copied to the field SS_SEQ 351 and incremented, in the manner discussed above for JH_SEQ 215. Thus, over time, a sequence of numbers is produced from SEQ 313, each number in the sequence being assigned either to a journal entry or a snapshot entry.
  • The snapshot is stored in one (or more) snapshot volumes (SVOL) 107. A suitable amount of memory is allocated for fields 355-357. The information relating to the SVOLs for storing the snapshot are then stored into the fields 355-357. If additional volumes are required to store the snapshot, then additional memory is allocated for fields 355-357.
  • FIG. 5 illustrates the relationship between journal entries and snapshots. The snapshot 520 represents the first snapshot image of the data volumes 101 belonging to a journal group 102. Note that journal entries (510) having sequence numbers SEQ0 and SEQ1 have been made, and represent journal entries for two write operations. These entries show that journaling has been initiated at a time prior to the snapshot being taken (step 420). Thus, at a time corresponding to the sequence number SEQ2, the recovery manager 111 initiates the taking of a snapshot, and since journaling has been initiated, any write operations occurring during the taking of the snapshot are journaled. Thus, the write operations 500 associated with the sequence numbers SEQ3 and higher show that those operations are being journaled. As an observation, the journal entries identified by sequence numbers SEQ0 and SEQ1 can be discarded or otherwise ignored.
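  • The numbering in FIG. 5 follows from drawing journal entries and snapshots from one counter. The sketch below replays that sequence; the class `ManagementTable` is a hypothetical stand-in that holds only the SEQ 313 counter, and the event list is illustrative.

```python
class ManagementTable:
    """Hypothetical stand-in holding only the sequence counter SEQ 313."""

    def __init__(self):
        self.seq = 0

    def next_seq(self):
        # Read the counter for the new object, then increment and store it back.
        assigned = self.seq
        self.seq += 1
        return assigned


mt = ManagementTable()
events = []
# Two writes are journaled, then a snapshot is taken, then two more writes arrive.
for kind in ["journal", "journal", "snapshot", "journal", "journal"]:
    events.append((kind, mt.next_seq()))

snapshot_seq = next(seq for kind, seq in events if kind == "snapshot")
# Entries numbered after the snapshot can be applied to it.
applicable = [seq for kind, seq in events if kind == "journal" and seq > snapshot_seq]
# Entries numbered before the snapshot can be discarded or ignored.
ignorable = [seq for kind, seq in events if kind == "journal" and seq < snapshot_seq]
```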
  • Recovering data typically requires recovering the data state of at least a portion of the data volumes 101 at a specific time. Generally, this is accomplished by applying one or more journal entries to a snapshot that was taken earlier in time relative to the journal entries. In the disclosed illustrative embodiment, the sequence number SEQ 313 is incremented each time it is assigned to a journal entry or to a snapshot. Therefore, it is a simple matter to identify which journal entries can be applied to a selected snapshot; i.e., those journal entries whose associated sequence numbers (JH_SEQ, 215) are greater than the sequence number (SS_SEQ, 351) associated with the selected snapshot.
  • For example, the administrator may specify some point in time, presumably a time that is earlier than the time (the “target time”) at which the data in the data volume was lost or otherwise corrupted. The time field SS_TIME 352 for each snapshot is searched until a time earlier than the target time is found. Next, the Journal Headers 219 in the Journal Header Area 210 are searched, beginning from the “oldest” Journal Header. The oldest Journal Header can be identified by the “JO_” fields 333, 334, 337, and 338 in the management table. The Journal Headers are searched sequentially in the area 210 for the first header whose sequence number JH_SEQ 215 is greater than the sequence number SS_SEQ 351 associated with the selected snapshot. The selected snapshot is incrementally updated by applying each journal entry, one at a time, to the snapshot in sequential order, thus reproducing the sequence of write operations. This continues as long as the time field JH_TIME 214 of the journal entry is prior to the target time. The update ceases with the first journal entry whose time field 214 is past the target time.
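  • The recovery procedure just described can be sketched as follows. Several simplifications are assumptions of the sketch, not of the description: a snapshot image is modeled as a mapping from block address to data, each journal entry writes a single address, sequence numbers have not wrapped, and when several snapshots precede the target time the most recent one is chosen. The function name `recover` is hypothetical.

```python
def recover(snapshots, journal, target_time):
    """Rebuild the data state at target_time.

    snapshots: list of dicts with keys SS_SEQ, SS_TIME, image (address -> data)
    journal:   list of dicts, oldest first, with keys JH_SEQ, JH_TIME, JH_ADR, data
    """
    # Find a snapshot taken earlier than the target time.
    candidates = [s for s in snapshots if s["SS_TIME"] < target_time]
    if not candidates:
        raise ValueError("no snapshot earlier than the target time")
    snapshot = max(candidates, key=lambda s: s["SS_SEQ"])

    image = dict(snapshot["image"])  # work on a copy of the snapshot image
    for entry in journal:            # oldest journal entry first
        if entry["JH_SEQ"] <= snapshot["SS_SEQ"]:
            continue                 # recorded before the snapshot; not applicable
        if entry["JH_TIME"] > target_time:
            break                    # first entry past the target time ends the update
        image[entry["JH_ADR"]] = entry["data"]  # replay the write
    return image


snapshots = [{"SS_SEQ": 2, "SS_TIME": 100, "image": {0: "a", 1: "b"}}]
journal = [
    {"JH_SEQ": 0, "JH_TIME": 90, "JH_ADR": 0, "data": "old"},
    {"JH_SEQ": 3, "JH_TIME": 110, "JH_ADR": 0, "data": "x"},
    {"JH_SEQ": 4, "JH_TIME": 120, "JH_ADR": 1, "data": "y"},
    {"JH_SEQ": 5, "JH_TIME": 130, "JH_ADR": 0, "data": "z"},
]
restored = recover(snapshots, journal, target_time=125)
```

  • In this example the entries with sequence numbers 3 and 4 are replayed, while the entry at time 130 is past the target time and is not applied.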
  • In accordance with one aspect of the invention, a single snapshot is taken. All journal entries subsequent to that snapshot can then be applied to reconstruct the data state at a given time. In accordance with another aspect of the present invention, multiple snapshots can be taken. This is shown in FIG. 5A where multiple snapshots 520′ are taken. In accordance with the invention, each snapshot and journal entry is assigned a sequence number in the order in which the object (snapshot or journal entry) is recorded. It can be appreciated that there typically will be many journal entries 510 recorded between each snapshot 520′. Having multiple snapshots allows for quicker recovery time for restoring data. The snapshot closest in time to the target recovery time would be selected. The journal entries made subsequent to the snapshot could then be applied to restore the desired data state.
  • FIG. 6 illustrates another aspect of the present invention. In accordance with the invention, a journal entry is made for every write operation issued from the host; this can result in a rather large number of journal entries. As time passes and journal entries accumulate, the one or more journal volumes 106 defined by the recovery manager 111 for a journal group 102 will eventually fill up. At that time no more journal entries can be made. As a consequence, subsequent write operations would not be journaled and recovery of the data state subsequent to the time the journal volumes become filled would not be possible.
  • FIG. 6 shows that the storage system 100 will apply journal entries to a suitable snapshot in response to detection of an “overflow” condition. An “overflow” is deemed to exist when the available space in the journal volume(s) falls below some predetermined threshold. It can be appreciated that many criteria can be used to determine if an overflow condition exists. A straightforward threshold is based on the total storage capacity of the journal volume(s) assigned for a journal group. When the free space becomes some percentage (say, 10%) of the total storage capacity, then an overflow condition exists. Another threshold might be used for each journal volume. In an aspect of the invention, the free space capacity in the journal volume(s) is periodically monitored. Alternatively, the free space can be monitored in an aperiodic manner. For example, the intervals between monitoring can be randomly spaced. As another example, the monitoring intervals can be spaced apart depending on the level of free space; i.e., the monitoring interval can vary as a function of the free space level.
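  • A percentage threshold and a free-space-dependent monitoring interval might be expressed as below. Both functions are illustrative assumptions: the names, the 10% default, the choice to treat free space exactly at the threshold as an overflow, and the linear interval policy are not prescribed by the description.

```python
def is_overflow(free_blocks, total_blocks, threshold_percent=10):
    """True when free space is at or below threshold_percent of total capacity.

    Integer arithmetic avoids floating point rounding at the boundary.
    """
    return free_blocks * 100 <= total_blocks * threshold_percent


def next_check_interval(free_blocks, total_blocks, min_seconds=1.0, max_seconds=10.0):
    """One possible policy: check more often as the free space shrinks."""
    fraction_free = free_blocks / total_blocks
    return min_seconds + (max_seconds - min_seconds) * fraction_free


overflow_now = is_overflow(free_blocks=100, total_blocks=1000)
interval = next_check_interval(free_blocks=500, total_blocks=1000)
```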
  • FIG. 7 highlights the processing which takes place in the storage system 100 to detect an overflow condition. Thus, in a step 710, the storage system periodically checks the total free space of the journal volume(s) 106; e.g., every ten seconds. The free space can easily be calculated since the pointers (e.g., JI_HEAD_VOL 331, JI_HEAD_ADR 332) in the management table 300 maintain the current state of the storage consumed by the journal volumes. If the free space is above the threshold, then the monitoring process simply waits for a period of time to pass and then repeats its check of the journal volume free space.
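  • For a single wrapping area, the space in use is the distance from the oldest entry to the next-write position, taken modulo the area size. The sketch below assumes one journal volume and assumes the area is never filled completely, so that equal pointers mean an empty area; the function names are hypothetical.

```python
def used_bytes(ji_adr, jo_adr, area_size):
    """Bytes between the oldest entry (JO address) and the next-write position (JI address).

    Assumes the area is never completely full, so ji_adr == jo_adr means empty.
    """
    return (ji_adr - jo_adr) % area_size


def free_bytes(ji_adr, jo_adr, area_size):
    return area_size - used_bytes(ji_adr, jo_adr, area_size)


# No wrap yet: entries occupy addresses 100..299.
used_plain = used_bytes(ji_adr=300, jo_adr=100, area_size=1000)
# After wrapping: entries occupy 900..999 and 0..99.
used_wrapped = used_bytes(ji_adr=100, jo_adr=900, area_size=1000)
```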
  • If the free space falls below a predetermined threshold, then in a step 720 some of the journal entries are applied to a snapshot to update the snapshot. In particular, the oldest journal entry(ies) are applied to the snapshot.
  • Referring to FIG. 3, the Journal Header 219 of the “oldest” journal entry is identified by the JO_HEAD_VOL field 333 and the JO_HEAD_ADR field 334. These fields identify the journal volume and the location in the journal volume of the Journal Header Area 210 of the oldest journal entry. Likewise, the Journal Data of the oldest journal entry is identified by the JO_DATA_VOL field 337 and the JO_DATA_ADR field 338. The journal entry identified by these fields is applied to a snapshot. The snapshot that is selected is the snapshot having an associated sequence number closest to the sequence number of the journal entry and earlier in time than the journal entry. Thus, in this particular implementation where the sequence number is incremented each time, the snapshot having the sequence number closest to but less than the sequence number of the journal entry is selected (i.e., “earlier in time”). When the snapshot is updated by applying the journal entry to it, the applied journal entry is freed. This can simply involve updating the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 to the next journal entry.
  • As an observation, it can be appreciated by those of ordinary skill, that the sequence numbers will eventually wrap, and start counting from zero again. It is well within the level of ordinary skill to provide a suitable mechanism for keeping track of this when comparing sequence numbers.
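  • The description leaves the wrap-handling mechanism open. One well-known option, offered here only as an example and not as the mechanism of the invention, is serial number arithmetic in the style of RFC 1982: two numbers are compared by their difference modulo the counter range. The 32-bit counter width is an assumption, and the comparison is only meaningful while the two numbers are less than half the number space apart.

```python
SEQ_BITS = 32                 # assumed width of the sequence counter
SEQ_MOD = 1 << SEQ_BITS
HALF = 1 << (SEQ_BITS - 1)


def seq_later(a, b):
    """True if sequence number a was assigned after b, allowing for wrap.

    Valid only while a and b are less than half the number space apart.
    """
    return a != b and ((a - b) % SEQ_MOD) < HALF


plain = seq_later(5, 3)                 # ordinary case, no wrap involved
wrapped = seq_later(2, SEQ_MOD - 1)     # 2 was assigned after the counter wrapped
```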
  • Continuing with FIG. 7, after applying the journal entry to the snapshot to update the snapshot, a check is made of the increase in the journal volume free space as a result of the applied journal entry being freed up (step 730). The free space can be compared against the threshold criterion used in step 710. Alternatively, a different threshold can be used. For example, here a higher amount of free space may be required to terminate this process than was used to initiate the process. This avoids invoking the process too frequently, but once invoked the second higher threshold encourages recovering as much free space as is reasonable. It can be appreciated that these thresholds can be determined empirically over time by an administrator.
  • Thus, in step 730, if the threshold for stopping the process is met (i.e., free space exceeds threshold), then the process stops. Otherwise, step 720 is repeated for the next oldest journal entry. Steps 730 and 720 are repeated until the free space level meets the threshold criterion used in step 730.
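  • Steps 710 through 730, with a higher threshold for stopping than for starting, can be sketched as below. The model is an assumption made for illustration: the journal is a queue of (address, data, size) tuples with the oldest entry first, the snapshot is a mapping from address to data, and thresholds are given in blocks. The function name `handle_overflow` is hypothetical.

```python
from collections import deque


def handle_overflow(journal, snapshot, capacity, start_free, stop_free):
    """Apply the oldest journal entries to the snapshot until enough space is free.

    Returns the number of journal entries applied and freed.
    """
    def free():
        return capacity - sum(size for _, _, size in journal)

    if free() >= start_free:      # step 710: no overflow condition, nothing to do
        return 0

    applied = 0
    while journal:
        address, data, _size = journal.popleft()  # step 720: take the oldest entry
        snapshot[address] = data                  # apply it to update the snapshot
        applied += 1
        if free() > stop_free:                    # step 730: stop threshold met
            break
    return applied


# Ten entries of 10 blocks each fill a 100-block journal completely.
journal = deque((addr, f"w{addr}", 10) for addr in range(10))
snapshot = {}
applied = handle_overflow(journal, snapshot, capacity=100, start_free=10, stop_free=30)
```

  • With these values four entries are applied: after three the free space equals the stop threshold but does not exceed it, so one more entry is freed before the loop ends.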
  • FIG. 7A highlights sub-steps for an alternative embodiment to step 720 shown in FIG. 7. Step 720 frees up a journal entry by applying it to the latest snapshot that is not later in time than the journal entry. However, where multiple snapshots are available, it may be possible to avoid the time consuming process of applying the journal entry to a snapshot in order to update the snapshot.
  • FIG. 7A shows details for a step 720′ that is an alternate to step 720 of FIG. 7. At a step 721, a determination is made whether a snapshot exists that is later in time than the oldest journal entry. This determination can be made by searching for the first snapshot whose associated sequence number is greater than that of the oldest journal entry. Alternatively, this determination can be made by looking for a snapshot that is a predetermined amount of time later than the oldest journal entry; for example, the criterion may be that the snapshot must be at least one hour later in time than the oldest journal entry. Still another alternate is to use the sequence numbers associated with the snapshots and the journal entries, rather than time. For example, the criterion might be to select a snapshot whose sequence number is N increments away from the sequence number of the oldest journal entry.
  • If such a snapshot can be found in step 721, then the earlier journal entries can be removed without having to apply them to a snapshot. Thus, in a step 722, the “JO_” fields (JO_HEAD_VOL 333, JO_HEAD_ADR 334, JO_DATA_VOL 337, and JO_DATA_ADR 338) are simply moved to a point in the list of journal entries that is later in time than the selected snapshot. If no such snapshot can be found, then in a step 723 the oldest journal entry is applied to a snapshot that is earlier in time than the oldest journal entry, as discussed for step 720.
  • Still another alternative for step 721 is simply to select the most recent snapshot. All the journal entries whose sequence numbers are less than that of the most recent snapshot can be freed. Again, this simply involves updating the “JO_” fields so they point to the first journal entry whose sequence number is greater than that of the most recent snapshot. Recall that an aspect of the invention is being able to recover the data state for any desired point in time. This can be accomplished by storing as many journal entries as possible and then applying the journal entries to a snapshot to reproduce the write operations. This last embodiment has the potential effect of removing large numbers of journal entries, thus reducing the range of time within which the data state can be recovered. Nevertheless, for a particular configuration it may be desirable to remove large numbers of journal entries for a given operating environment.
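  • Steps 721 through 723 can be illustrated with sequence numbers alone. The sketch below is an assumption-laden illustration: the function name, the sorted lists of sequence numbers, and the `min_gap` parameter (standing in for the "N increments" criterion) are hypothetical, and the returned index stands in for the new position of the “JO_” fields.

```python
from bisect import bisect_right


def advance_oldest_pointer(journal_seqs, snapshot_seqs, min_gap=0):
    """Decide how to free the oldest journal entry (steps 721-723).

    journal_seqs:  ascending sequence numbers of the stored journal entries
    snapshot_seqs: ascending sequence numbers of the available snapshots
    Returns (new_oldest_index, must_apply): the index in journal_seqs that
    becomes the oldest entry, and whether the freed entry must first be
    applied to an earlier snapshot.
    """
    oldest = journal_seqs[0]
    # Step 721: look for the first snapshot sufficiently later than the oldest entry.
    later = [s for s in snapshot_seqs if s > oldest + min_gap]
    if later:
        chosen = later[0]
        # Step 722: move the pointer past every entry older than the snapshot.
        return bisect_right(journal_seqs, chosen), False
    # Step 723: no such snapshot, so the oldest entry must be applied as in step 720.
    return 1, True
```

  • Selecting `later[-1]` instead of `later[0]` would correspond to the last alternative above, in which the most recent snapshot is chosen and a larger number of journal entries is freed at once.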
  • In another aspect of the present invention, recovery of the production volume(s) 101 can be facilitated by allowing the user to interact with the recovery process. A “fast recovery” can be performed which quickly recovers the data state to a point in time prior to a target time. A more granular recovery procedure can then be performed which allows a user to hone in on the target data state. The user can perform “undo-able recoveries” to inspect the data state in a trial and error manner by allowing the user to step forward and backward (undo operation) in time. This aspect of the invention allows a user to be less specific as to the time of the desired data state. The target time specified by the user need only be a time that he is certain is prior to the time of the target data state. It is understood that “the target data state” can refer to any desired state of the data.
  • FIG. 3A shows an illustrative embodiment of a management table 300′ according to this aspect of the present invention. The alternative management table 300′ includes two sets of fields, one set of fields (330, 331, 340) for managing AFTER journal entries and another set of fields (332, 333, 341) for managing BEFORE journal entries.
  • The fields related to the AFTER journal entries include a field to store the number of journal volumes (NUM_JVOLa) 330 that are used to contain the data (journal header and journal data) associated with the AFTER journal entries for a journal group 102.
  • As described in FIG. 2, the Journal Header Area 210 contains the Journal Headers 219 for each journal; likewise for the Journal Data components 225. As mentioned above, an aspect of the invention is that the data areas 210, 220 wrap. This allows for journaling to continue despite the fact that there is limited space in each data area.
  • The management table includes fields to store pointers to different parts of the data areas 210, 220 to facilitate wrapping. Pointer-type information is provided to facilitate identifying where the next journal entry is to be stored. A set of such information (“AFTER journal pointers”) is provided for the AFTER journal entries. A field (JVOL_PTRa) 331 in the management table identifies the location of the AFTER journal pointers.
  • The AFTER journal entries are stored in one or more journal volumes, separate from the BEFORE journal entries. A field (JI_HEAD_VOL) 331 a identifies the journal volume 106 that contains the Journal Header Area 210 from which the next Journal Header 219 will be obtained. A field (JI_HEAD_ADR) 331 b identifies where in the Journal Header Area the next Journal Header is located. The journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 331 e. A field (JI_DATA_ADR) 331 f identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next AFTER journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331 a, 331 b, 331 e, 331 f.
  • The AFTER journal pointers also include fields which identify the “oldest” AFTER journal entry. The use of this information will be described below. A field (JO_HEAD_VOL) 331 c identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219. A field (JO_HEAD_ADR) 331 d identifies the address within the Journal Header Area of the location of the journal header of the oldest journal. A field (JO_DATA_VOL) 331 g identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 331 h.
  • The management table includes a list of journal volumes (JVOL_LISTa) 340 associated with the AFTER journal entries of a journal group 102. In a particular implementation, JVOL_LISTa is a pointer to a data structure of information for journal volumes. As can be seen in FIG. 3, each data structure comprises an offset number (JVOL_OFS) 340 a which identifies a particular journal volume 106 associated with a given journal group 102. For example, if a journal group is associated with two journal volumes 106, then each journal volume might be identified by a 0 or a 1. A journal volume identifier (JVOL_ID) 340 b uniquely identifies the journal volume within the storage system 100. Finally, a pointer (JVOL_NEXT) 340 c points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
  • The management table also includes a set of similar fields for managing the BEFORE journal entries. The fields related to the BEFORE journal entries include a field to store the number of journal volumes (NUM_JVOLb) 332 that are being used to contain the data (journal header and journal data) associated with the BEFORE journal entries for a journal group 102.
  • As discussed above for the AFTER journal entries, an aspect of the invention is that the data areas 210, 220 wrap. The management table includes fields to store pointers to different parts of the data areas 210, 220 to facilitate wrapping. Pointer-type information is provided to facilitate identifying where the next BEFORE journal entry is to be stored. A set of such information (“BEFORE journal pointers”) is provided for the BEFORE journal entries. A field (JVOL_PTRb) 333 in the management table identifies the location of the BEFORE journal pointers.
  • The BEFORE journal entries are stored in one or more journal volumes, separate from the journal volume(s) used to store the AFTER journal entries. A field (JI_HEAD_VOL) 332 a identifies the journal volume 106 that contains the Journal Header Area 210 from which the next Journal Header 219 will be obtained. A field (JI_HEAD_ADR) 332 b identifies where in the Journal Header Area the next Journal Header is located. The journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 332 e. A field (JI_DATA_ADR) 332 f identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next BEFORE journal entry to be written is “pointed” to by the information contained in the “JI_” fields 332 a, 332 b, 332 e, 332 f.
  • The BEFORE journal pointers also include fields which identify the “oldest” BEFORE journal entry. The use of this information will be described below. A field (JO_HEAD_VOL) 332 c identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219. A field (JO_HEAD_ADR) 332 d identifies the address within the Journal Header Area of the location of the journal header of the oldest journal. A field (JO_DATA_VOL) 332 g identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 332 h.
  • The management table includes a list of journal volumes (JVOL_LISTb) 341 associated with the BEFORE journal entries of a journal group 102. In a particular implementation, JVOL_LISTb is a pointer to a data structure of information for journal volumes. As can be seen in FIG. 3, each data structure comprises an offset number (JVOL_OFS) 341 a which identifies a particular journal volume 106 associated with a given journal group 102. A journal volume identifier (JVOL_ID) 341 b uniquely identifies the journal volume within the storage system 100. Finally, a pointer (JVOL_NEXT) 341 c points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.
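  • The layout of the management table 300′ can be summarized in code. The sketch below is illustrative only: the class names, the Python types, and the grouping of each field set into a `JournalSet` are assumptions, while the attribute names mirror the field names used above.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class JournalPointers:
    """The "JI_" (next entry) and "JO_" (oldest entry) pointer fields."""
    ji_head_vol: int = 0
    ji_head_adr: int = 0
    jo_head_vol: int = 0
    jo_head_adr: int = 0
    ji_data_vol: int = 0
    ji_data_adr: int = 0
    jo_data_vol: int = 0
    jo_data_adr: int = 0


@dataclass
class JournalVolume:
    """One element of a JVOL_LIST linked list."""
    jvol_ofs: int                                # offset within the journal group
    jvol_id: str                                 # unique within the storage system
    jvol_next: Optional["JournalVolume"] = None  # None plays the role of NULL


@dataclass
class JournalSet:
    """NUM_JVOL, JVOL_PTR and JVOL_LIST for one kind of journal entry."""
    num_jvol: int = 0
    jvol_ptr: JournalPointers = field(default_factory=JournalPointers)
    jvol_list: Optional[JournalVolume] = None


@dataclass
class ManagementTable:
    """Two parallel field sets: AFTER (suffix a) and BEFORE (suffix b)."""
    after: JournalSet = field(default_factory=JournalSet)
    before: JournalSet = field(default_factory=JournalSet)


def volume_ids(head: Optional[JournalVolume]) -> Iterator[str]:
    """Walk a JVOL_LIST from its head until the NULL terminator."""
    node = head
    while node is not None:
        yield node.jvol_id
        node = node.jvol_next
```

  • For example, a journal group with two AFTER journal volumes would hold a two-element list whose offsets are 0 and 1, and `volume_ids` would yield their identifiers in that order.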
  • The recovery manager 111 provides the following interface to the storage system for the aspect of the invention which provides for “fast” and “undo-able” recovery modes. The interface is shown in a format of an application programmer's interface (API). The functionality and needed information (parameters) are described. It can be appreciated that any suitable programming language can be used.
  • BACKUP journal_volume
      • This initiates backup processing to commence in the storage system 100. More specifically, the logging of AFTER journal entries is initiated for each write operation to the data volumes 101. The parameter journal_volume identifies the volume 102 that contains the journal entries. As discussed above, an initial snapshot is taken after journaling commences.
  • RECOVER_PH1 journal_volume target_time
      • This initiates a PHASE I recovery process. This recovery is the procedure discussed above. Briefly, AFTER journal entries are applied to an appropriate snapshot. The journal entries are contained in the volume(s) identified by journal_volume. The desired data state is specified by target_time. The target_time can be a time format (e.g., year:month:date:hh:mm). Alternatively, the target_time can be a journal sequence number 215, so that journal entries subsequent to the sequence number associated with the snapshot and up to the specified sequence number are applied. Still another alternative is that the target_time is simply the number of journal entries to be applied to a snapshot (e.g., apply the next one hundred journal entries).
      • Depending on configuration and storage resources, the snapshot can be copied to the production volume. Data recovery can then proceed on the production volume.
  • RECOVER_PH2 journal_volume 1 journal_volume 2 target_time
      • This initiates a PHASE II recovery process. As will be discussed in more detail below, this procedure involves making a BEFORE journal entry before applying each AFTER journal entry to a snapshot. As will be explained below, this recovery process allows for “un-doing” an update operation on a snapshot. The AFTER journal entries are located in the volume identified by journal_volume 1. The BEFORE journal entries are located in the volume identified by journal_volume 2. The desired data state is specified by target_time. The target_time can be a time format (e.g., year:month:date:hh:mm).
      • Alternatively, the target_time can be a journal sequence number 215, so that journal entries subsequent to the sequence number associated with the snapshot and up to the specified sequence number are applied. Still another alternative is that the target_time is simply the number of journal entries to be applied to a snapshot (e.g., apply the next one hundred journal entries).
  • STOP_RECOVER
      • This will cause the storage system to cease recovery processing. Thus, a PHASE I recovery operation or a PHASE II recovery operation will be terminated. In addition, BEFORE journaling is initiated. This will cause BEFORE journal entries to be made each time the host 110 issues a write operation, in addition to making an AFTER journal entry.
  • UNDO_RECOVER journal_volume 1 journal_volume 2 target_time
      • As will be discussed in more detail below, this operation will revert an updated snapshot to an earlier point in time. This is accomplished by “undoing” one or more applications of an AFTER journal entry. The target_time can be any of the forms previously discussed.
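  • The three forms that target_time can take (a time, a journal sequence number, or a count of entries) can be illustrated as follows. This is a sketch under stated assumptions: the function name, the `kind` tags, and the representation of journal entries as `(sequence_number, timestamp)` tuples are hypothetical and are not part of the interface described above.

```python
from datetime import datetime


def entries_to_apply(entries, snapshot_seq, kind, value):
    """Select the AFTER journal entries to apply to a snapshot.

    entries:      list of (sequence_number, timestamp), ascending by sequence
    snapshot_seq: sequence number associated with the snapshot
    kind:         "time", "seq" or "count" (hypothetical tags)
    value:        a datetime, a sequence number, or an entry count
    """
    # Only entries made after the snapshot was taken are candidates.
    pending = [e for e in entries if e[0] > snapshot_seq]
    if kind == "time":
        return [e for e in pending if e[1] <= value]
    if kind == "seq":
        return [e for e in pending if e[0] <= value]
    if kind == "count":
        return pending[:value]
    raise ValueError(f"unknown target_time form: {kind}")
```

  • With a snapshot at sequence number 1 and entries numbered 1 through 4, a "seq" target of 3 selects entries 2 and 3, while a "count" target of 1 selects entry 2 only.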
  • Referring now to FIG. 8, a generalized process flow is shown highlighting the steps for recovering data in accordance with the “fast” and “undo-able” recovery mode aspects of the present invention. One will appreciate from the following that the described technique can be used to recover or otherwise retrieve a desired data state of any data volume(s). The retrieval methods and apparatus disclosed herein are not limited to disaster recovery scenarios. The invention has applicability for users (e.g., system administrators) who might have a need to look at the state of a file or a directory at an earlier point in time. Accordingly, the term “recovery volume” is used in a generic sense to refer to one or more volumes on which the data recovery process is being performed.
  • It can be appreciated that the recovery manager 111 can include a suitable interface for interaction with a user. An appropriate interface might be a graphical user interface, or a command line interface. It can be appreciated that voice recognition technology and even virtual reality technology can be used as input and output components of the interface for interacting with a user. Alternatively, the “user” can be a machine (such as a data processing system) rather than a human. In such a case, a suitable machine-machine interface can be readily devised and implemented.
  • The first phase of the recovery process is referred to as “fast” recovery. The idea is to quickly access the data state of the recovery volume at a point in time that is “close” in time to the desired data state, but prior in time to the desired data state. Thus, in a step 810, the recovery manager 111 obtains from the user a “target time” that specifies a point in time that is close to the time of the desired data state. A suitable query to the user might inform the user as to the nature of this target time. For example, if the user interacted with a system administrator, she might tell the administrator that she was sure her files were not deleted until after 10:30 AM. The target time would then be 10:30 AM, or earlier. Likewise, a user interface can obtain such information from a user by presenting a suitable set of queries or prompts. Given the target time, the recovery manager can then issue a RECOVER_PH1 operation to the storage system (e.g., system 100, FIG. 1) that contains the recovery volume.
  • In response, the storage system would initiate phase I recovery. Referring to FIG. 9 for a moment, the storage system 100 in response to the RECOVER_PH1 request, would determine in a step 910 whether recovery is possible. Two conditions are checked:
      • (1) a good snapshot exists—A snapshot must have been taken between the oldest journal and newest journal. As discussed above, every snapshot has a sequence number. The sequence number can be used to identify a suitable snapshot. If the sequence number of a candidate snapshot is greater than that of the oldest journal and smaller than that of the newest journal, then the snapshot is suitable.
  • (2) recovery target time is in scope—The target time that the user specifies must be between the oldest journal and the newest journal.
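  • The two conditions checked in step 910 can be expressed compactly. The sketch below is illustrative: the function and parameter names are assumptions, and treating the target-time bounds as inclusive is also an assumption, since the text says only "between".

```python
def recovery_possible(snapshot_seqs, oldest_seq, newest_seq,
                      oldest_time, newest_time, target_time):
    """Return True if phase I recovery can proceed (step 910).

    snapshot_seqs:            sequence numbers of the available snapshots
    oldest_seq, newest_seq:   sequence numbers of the oldest and newest journals
    oldest_time, newest_time: times of the oldest and newest journals
    target_time:              the user-specified target time
    """
    # Condition (1): some snapshot lies strictly between the oldest
    # and the newest journal, judged by sequence number.
    good_snapshot = any(oldest_seq < s < newest_seq for s in snapshot_seqs)
    # Condition (2): the target time is within the journaled range
    # (inclusive bounds assumed).
    in_scope = oldest_time <= target_time <= newest_time
    return good_snapshot and in_scope
```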
  • Then in a step 920, the recovery volume is set to an offline state. In the context of the present invention “offline” is taken to mean that the user, and more generally the host device 110, cannot access the recovery volume. For example, in the case that the production volume is being used as the recovery volume, it is likely to be desirable that the host 110 be prevented at least from issuing write operations to the volume. Also, the host typically will not be permitted to perform read operations. Of course, the storage system itself has full access to the recovery volume in order to perform the recovery task.
  • In a step 930, the snapshot is copied to the recovery volume in preparation for phase I recovery. The production volume itself can be the recovery volume. However, it can be appreciated that the recovery manager 111 can allow the user to specify a volume other than the production volume to serve as the target of the data recovery operation. For example, the recovery volume can be the volume on which the snapshot is stored. Using a volume other than the production volume to perform the recovery operation may be preferred where it is desirable to provide continued use of the production volume.
  • In a step 940, one or more AFTER journal entries are applied to update the snapshot volume in the manner as discussed previously. Enough AFTER journal entries are applied to update the snapshot to a point in time up to or prior to the user-specified target time.
  • Returning to FIG. 8, upon completion of phase I recovery, the storage system 100 can signal the recovery manager (step 820) to indicate phase I has completed. The recovery manager 111 would then issue a STOP_RECOVER operation to the storage system. In response, the storage system 100 (step 830) would put the recovery volume into an online state. In the context of the present invention, the “online” state is taken to mean that the host device 110 is given access to the recovery volume.
  • Next, in a step 840, the user is given the opportunity to review the state of the data on the recovery volume to determine whether the desired data state has been recovered. At this point, the data state has been recovered to some point in time prior to the time of the desired data state. Additional recovery might be needed to reach the desired data state. If the desired data state has been achieved then the recovery process is stopped. If the desired data state is not achieved, then a determination is made whether another phase I recovery operation is to be performed, or whether a phase II recovery operation is to be performed.
  • Recall that phase I recovery involves updating the snapshot by applying the AFTER journal entries to it to reproduce the sequence of write operations made since the snapshot was taken. A phase II recovery operation involves taking a BEFORE journal entry for each AFTER journal entry that is applied. It can be appreciated that phase II recovery is a slower process than phase I recovery. The decision whether to proceed using phase I recovery mode or phase II recovery mode can be made by the user after she has inspected the recovered data state. For example, she may learn from inspecting the recovered data state that an additional few hours of recovery is needed, in which case she may specify via the recovery manager 111 to perform the faster phase I recovery and provide a refined target time. If the recovered data state seems close to the desired data state, then the user may want to perform the slower phase II recovery to take advantage of the “undo” aspect (see below) provided by a phase II recovery operation.
  • Alternatively, the user interface can algorithmically determine whether to perform phase I or phase II recovery. The interface can input the user's refined target time and compare that against the initial target time. Based on the comparison, the interface can choose an appropriate recovery mode. For example, if the difference in time is X minutes or greater, then a phase I recovery is performed, otherwise a phase II recovery is commenced.
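  • A minimal sketch of such an algorithmic choice follows. The function name, the returned labels, and the use of the absolute difference between the two target times are assumptions made for illustration; the cutoff corresponds to the "X minutes" mentioned above.

```python
from datetime import datetime, timedelta


def choose_recovery_mode(initial_target, refined_target, cutoff):
    """Pick a recovery mode from the gap between the two target times.

    Returns "PHASE_I" when the refined target is at least `cutoff` away
    from the initial target, and "PHASE_II" otherwise.
    """
    gap = abs(refined_target - initial_target)
    return "PHASE_I" if gap >= cutoff else "PHASE_II"
```

  • For instance, with a 30-minute cutoff, refining a 10:30 target to 13:00 selects the faster phase I mode, whereas refining it to 10:40 selects the undo-able phase II mode.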
  • A factor to consider at this decision point (step 840) is that phase I recovery cannot be conveniently “undone.” If the recovered data state is beyond the desired data state, then the only way to reverse the data recovery action is to start again from the original snapshot. This can be time consuming. A phase II recovery in accordance with the present invention, on the other hand, can be undone. Thus, if a recovered data state is close to the user's refined time estimate, then a phase II recovery operation may be preferred.
  • FIG. 8 shows a step 850 for the initiation of phase II recovery. This includes taking the recovery volume offline and applying one or more AFTER journal entries to the snapshot as before, in order to move the state of the recovered data forward in time. However, phase II processing includes the additional step of taking BEFORE journal entries. With BEFORE journaling turned on, a BEFORE journal entry is taken of the snapshot prior to updating the snapshot with an AFTER journal entry; one such BEFORE journal entry is taken for each AFTER journal entry. As mentioned above, a BEFORE journal entry records the data that is stored in the target location of the write operation. Consequently, the state of the snapshot is preserved in a BEFORE journal entry prior to updating the snapshot with an AFTER journal entry. Thus, pairs of BEFORE journal and AFTER journal entries are created during phase II recovery. In accordance with the invention, the sequence numbering provided by the sequence number (SEQ) 313 is associated with each BEFORE journal entry. Thus, the same sequence of numbers is applied to BEFORE journal entries as well as to AFTER journal entries and snapshots.
  • In a step 860, a STOP_RECOVER operation is issued to put the recovery volume in an online state. The user is then able to inspect the recovery volume. Based on the inspection, if the user determines in a step 870 that the desired data state of the recovery volume is achieved, then the recovery process is complete. If the user determines that the desired data state is not achieved, then a further determination is made whether the data recovery has gone beyond the desired data state. If so, then the snapshot updates are “undone” (step 880) by accessing one or more BEFORE journal entries. This combination of taking BEFORE journals and AFTER journals constitutes a phase II recovery.
  • FIG. 10 illustrates how an updated snapshot can be undone. The figure shows that at some point in time a snapshot 1020 of a recovery volume (e.g., data volume 101, FIG. 1) was taken. The figure shows phase II processing where BEFORE and AFTER journal entries are taken. Thus, the application of the AFTER journal entry 1012 a to the snapshot is preceded by a BEFORE journal entry 1012. The BEFORE journal entry contains the original data that is stored in the area of the recovery volume that is the target of the write operation recorded by the AFTER journal entry, prior to performing the write operation. Thus, a pair of journal entries is created comprising an AFTER journal entry and a corresponding BEFORE journal entry. When the BEFORE journal entry 1012 is recorded, the AFTER journal entry 1012 a is then applied to the snapshot to update the snapshot.
  • Continuing, to the next AFTER journal entry 1014 a, again a BEFORE journal entry 1014 is created to record the original data in the area of the production volume that is the target of the AFTER journal entry before the AFTER journal entry is applied to the snapshot 1020. Again, a pair of journal entries result: an AFTER journal entry 1014 a and its corresponding BEFORE journal entry 1014. Similar BEFORE journal entries 1016 and 1018 are created for the AFTER journal entries 1016 a and 1018 a.
  • Now, with reference to FIG. 11, in accordance with phase II processing, the snapshot 1020 is updated by the sequential application of the AFTER journal entries 1012 a-1018 a (along with the creation of the corresponding BEFORE journal entries 1012-1018). Thus, the snapshot 1020 is updated by performing the write operation indicated in the AFTER journal entry 1012 a to produce an updated snapshot 1120 a. The updated snapshot 1120 a is again updated by performing the write operation indicated in the AFTER journal entry 1014 a to produce 1120 b. The updated snapshot 1120 b is subsequently updated in turn by the AFTER journal entries 1016 a and 1018 a to produce snapshots 1120 c and 1120 d.
  • Referring to FIGS. 10 and 11, an “undo” operation can now be described. The procedure includes applying the information contained in the BEFORE journals to the updated snapshots. The BEFORE journal entries are applied in timewise reverse order. Thus, to restore the snapshot from its state in 1120 d to its previous state in 1120 c, the BEFORE journal entry 1018 is applied to the snapshot 1120 d to reproduce the snapshot 1120 c. To perform another “undo” iteration, the BEFORE journal entry 1016 is applied to the snapshot 1120 c to reproduce the snapshot 1120 b. From this discussion, it can be appreciated that in order to “undo” a snapshot that has been updated by a set of AFTER journals, a BEFORE journal is needed that exists earlier in time than any of the AFTER journals in the set. Phase II processing provides the requisite BEFORE journal entries in order to perform the undo operation.
  • Returning to FIG. 8, one or more of the BEFORE journal entries can be applied (step 880) to the updated snapshot in this manner to perform a “reverse update” of one or more of the AFTER journal entries. This has the effect of moving the state of the recovered data in the recovery volume backward in time. The number of BEFORE journal entries to apply can be a fixed number; for example, move back in time by one minute increments, or by some number N of BEFORE journal entries. Alternatively, the user can specify how far back in time to move the data state by specifying a reverse target time (e.g., an absolute time such as 10:34 AM), or an increment of time (e.g., a delta time value such as 10 minutes). The user is given the opportunity to inspect the data state of the recovery volume to determine whether to continue backward in time or to move forward. Repeating this allows the user to restore the desired data state in an iterative and interactive manner by shuffling the data state backward and forward in time.
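  • The pairing of BEFORE and AFTER journal entries, and the reverse application used for an undo, can be sketched as follows. This is an illustration under simplifying assumptions, not the disclosed implementation: the snapshot is modeled as a dictionary from address to data, journal entries as `(address, data)` tuples, and `None` marks an address that held no data before the write.

```python
def apply_with_before(snapshot, after_entries):
    """Phase II update: record a BEFORE entry, then apply each AFTER entry.

    snapshot:      dict mapping address -> data (updated in place)
    after_entries: list of (address, data) in increasing time order
    Returns the list of BEFORE entries, oldest first.
    """
    before_entries = []
    for address, data in after_entries:
        # Preserve the data currently at the target location ...
        before_entries.append((address, snapshot.get(address)))
        # ... and only then perform the journaled write.
        snapshot[address] = data
    return before_entries


def undo(snapshot, before_entries, count):
    """Reverse the most recent `count` updates, newest first."""
    for _ in range(min(count, len(before_entries))):
        address, old_data = before_entries.pop()
        if old_data is None:
            snapshot.pop(address, None)   # the address was previously unwritten
        else:
            snapshot[address] = old_data
```

  • Calling `undo` with a count of one corresponds to stepping from snapshot 1120 d back to 1120 c; the entries not yet undone remain available, so the user can continue stepping backward after inspecting the data state.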
  • It can be appreciated that phase II processing will be slower than phase I recovery for the reason that a BEFORE journal entry must be created before applying an AFTER journal entry to update the snapshot. For this reason, phase I recovery is also referred to as “fast recovery.” Since phase II recovery permits the user to undo an updated snapshot, it can be referred to as “undo-able” recovery.
  • The foregoing disclosed embodiments typically can be provided using a combination of hardware and software implementations; e.g., combinations of software, firmware, and/or custom logic such as ASICs (application specific ICs) are possible. One of ordinary skill can readily appreciate that the underlying technical implementation will be determined based on factors including but not limited to or restricted to system cost, system performance, the existence of legacy software and legacy hardware, operating environment, and so on. The disclosed embodiments can be readily reduced to specific implementations without undue experimentation by those of ordinary skill in the relevant art.

Claims (23)

1. A method for processing data in a data store comprising:
obtaining a snapshot of a data store;
updating the snapshot with one or more first after-journal entries; and
after updating the snapshot with one or more first after-journal entries, performing one or more subsequent updates of the snapshot with one or more second after-journal entries, each subsequent update of the snapshot including:
storing a before-journal entry; and
after storing the before-journal entry, applying one of the second after-journal entries to the snapshot,
wherein the subsequent updates of the snapshot can be undone.
2. The method of claim 1 further comprising, after performing one or more subsequent updates, applying one or more before-journal entries to the snapshot, wherein one or more updates of the snapshot by the second after-journal entries can be undone.
3. The method of claim 2 further comprising receiving information indicative of an undo request, and in response thereto performing the step of applying one or more before-journal entries to the snapshot.
4. The method of claim 1 wherein the number of first after-journal entries is determined based on a user-provided target time.
5. The method of claim 1 wherein the second after-journal entries are applied in increasing order of time.
6. The method of claim 1 wherein the step of updating the snapshot with one or more first after-journal entries includes further updating the snapshot with one or more additional after-journal entries, wherein the step of further updating is performed in response to receiving information indicative of a fast recovery request.
7. The method of claim 1 wherein the step of obtaining a snapshot includes making a copy of the snapshot on the data store, wherein the updating steps are performed on the copy of the snapshot stored on the data store.
8. The method of claim 1 further comprising receiving information indicative of a user-specified data store, wherein the step of obtaining a snapshot includes making a copy of the snapshot on the user-specified data store, wherein the updating steps are performed on the copy of the snapshot stored on the user-specified data store.
9. A data processing device comprising:
a data store;
a controller;
a data storage component configured to store after-journal entries and before-journal entries, and further configured to provide access to the after-journal entries and the before-journal entries,
the controller configured to access the data store and to access the data storage component,
the controller further configured to perform the method steps of claim 1.
10. A method for processing data comprising:
obtaining a snapshot of at least a portion of a data store;
applying a plurality of first after-journal entries to update the snapshot, including receiving a first time indication from a user, the number of first after-journal entries being based on the first time indication;
providing access to the snapshot so that the user can access the snapshot;
receiving a recovery mode indication and a second time indication from the user;
applying a plurality of second after-journal entries to further update the snapshot, the number of second after-journal entries being based on the second time indication; and
if the recovery mode indication is indicative of an undo-able recovery mode, then for each second after-journal entry, taking a before-journal entry of the snapshot before applying the second after-journal entry to the snapshot.
11. The method of claim 10 further comprising receiving a third time indication from the user and applying one or more before-journal entries to the snapshot, the number of before-journal entries that are applied to the snapshot being dependent on the third time indication.
12. A data processing system comprising:
a host component comprising at least one host processing unit;
a storage component comprising at least one storage control unit;
first program control means contained in the host component for controlling operation of the host processing unit; and
second program control means contained in the storage component for controlling operation of the storage control unit,
the first program control means and the second program control means further for operating, respectively, the host processing unit and the storage control unit to perform the method steps of claim 10.
13. The data processing system of claim 12 wherein the first program control means comprises first program code and the second program control means comprises second program code.
14. A method for processing data on a data store comprising:
receiving input from a user indicative of a first data volume;
receiving input from the user indicative of a second data volume;
obtaining a snapshot of at least a portion of the first data volume;
storing the snapshot on the second data volume;
a first step of updating the snapshot with a plurality of first after-journal entries;
providing user-access to the second data volume;
receiving a first indication from the user, wherein if the first indication is indicative of a fast recovery operation, then repeating the first step of updating the snapshot with a plurality of second after-journal entries; and
subsequent to the first step of updating, a second step of updating the snapshot with a plurality of third after-journal entries, including for each third after-journal entry taking a before-journal entry of the snapshot prior to updating the snapshot with the third after-journal entry,
the first, second, and third after-journal entries being representative of write operations previously performed on the first data volume.
15. The method of claim 14 further comprising receiving input from the user indicative of a target time wherein the number of first after-journal entries is based on the target time.
16. The method of claim 15 further comprising receiving input from the user indicative of a refined target time wherein the number of second after-journal entries is based on the refined target time.
17. The method of claim 15 further comprising receiving input from the user indicative of a refined target time wherein the number of third after-journal entries is based on the refined target time.
18. The method of claim 14 further comprising applying one or more before-journal entries to the snapshot to undo snapshot updates produced by the application of one or more of the third after-journal entries.
19. The method of claim 14 further comprising receiving a second indication from the user and in response thereto, applying one or more before-journal entries to the snapshot to undo snapshot updates produced by the application of one or more of the third after-journal entries.
20. The method of claim 19 further comprising receiving input from the user indicative of a time, wherein the number of before-journal entries is based on the time.
21. The method of claim 19 wherein the one or more before-journal entries are applied sequentially beginning with the most recent before-journal entry.
22. The method of claim 14 wherein the first data volume and the second data volume refer to the same data volume, wherein the snapshot represents a data state of at least a portion of the first data volume at a first point in time.
23. The method of claim 14 wherein the first data volume is a production volume and the second data volume refers to a data volume different from the production volume.
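
The recovery flow recited in claims 10, 11 and 14 through 21 can be illustrated with a minimal sketch. It is not the implementation of the disclosed storage system. The names `JournalEntry`, `RecoverySnapshot`, `roll_forward` and `undo_to` are hypothetical, the snapshot is modeled as an in-memory mapping from block number to data, and time indications are modeled as plain integers. The sketch also assumes that a fast (non-undo-able) update discards any before-journal entries taken earlier, which is a simplification chosen here and is not stated in the claims.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class JournalEntry:
    """One journaled write (hypothetical layout, for illustration only)."""
    time: int              # logical time of the original write
    block: int             # block number the write targeted
    data: Optional[bytes]  # written data; None in a before-journal entry
                           # means the block did not exist yet


class RecoverySnapshot:
    """Snapshot that can be rolled forward and, optionally, rolled back."""

    def __init__(self, blocks: Dict[int, bytes],
                 after_journal: List[JournalEntry]) -> None:
        self.blocks = dict(blocks)
        self.after = sorted(after_journal, key=lambda e: e.time)
        self.pos = 0                           # next after-journal entry to apply
        self.before: List[JournalEntry] = []   # before-journal, oldest first

    def roll_forward(self, target_time: int, undoable: bool = False) -> int:
        """Apply after-journal entries up to target_time; return the count."""
        if not undoable:
            # Assumption of this sketch: a fast update cannot be undone,
            # so earlier before-journal entries are dropped.
            self.before.clear()
        applied = 0
        while self.pos < len(self.after) and self.after[self.pos].time <= target_time:
            entry = self.after[self.pos]
            if undoable:
                # Take a before-journal entry of the snapshot first.
                self.before.append(
                    JournalEntry(entry.time, entry.block,
                                 self.blocks.get(entry.block)))
            self.blocks[entry.block] = entry.data
            self.pos += 1
            applied += 1
        return applied

    def undo_to(self, target_time: int) -> int:
        """Apply before-journal entries, most recent first, to undo every
        undo-able update later than target_time; return the count."""
        undone = 0
        while self.before and self.before[-1].time > target_time:
            entry = self.before.pop()
            if entry.data is None:
                self.blocks.pop(entry.block, None)
            else:
                self.blocks[entry.block] = entry.data
            self.pos -= 1
            undone += 1
        return undone


journal = [JournalEntry(10, 0, b"A1"), JournalEntry(20, 1, b"B1"),
           JournalEntry(30, 0, b"A2"), JournalEntry(40, 1, b"B2")]
snap = RecoverySnapshot({0: b"A0", 1: b"B0"}, journal)
snap.roll_forward(20)                  # fast recovery up to a first target time
snap.roll_forward(40, undoable=True)   # undo-able recovery up to a refined time
snap.undo_to(30)                       # undo the write journaled at time 40
print(snap.blocks)                     # {0: b'A2', 1: b'B1'}
```

In this sketch the number of entries applied in each call depends only on the time indication passed in, and the before-journal is consumed from its most recent entry backward, which mirrors the ordering recited in claim 21.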
US10/621,791 2003-06-26 2003-07-16 Method and apparatus for data recovery using storage based journaling Abandoned US20050015416A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
US10/621,791 US20050015416A1 (en) 2003-07-16 2003-07-16 Method and apparatus for data recovery using storage based journaling
US10/931,543 US7398422B2 (en) 2003-06-26 2004-08-31 Method and apparatus for data recovery system using storage based journaling
US11/365,096 US8145603B2 (en) 2003-07-16 2006-02-28 Method and apparatus for data recovery using storage based journaling
US12/143,419 US7761741B2 (en) 2003-06-26 2008-06-20 Method and apparatus for data recovery system using storage based journaling
US12/814,002 US7979741B2 (en) 2003-06-26 2010-06-11 Method and apparatus for data recovery system using storage based journaling
US13/407,322 US8868507B2 (en) 2003-07-16 2012-02-28 Method and apparatus for data recovery using storage based journaling
US13/551,892 US9092379B2 (en) 2003-06-26 2012-07-18 Method and apparatus for backup and recovery using storage based journaling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/621,791 US20050015416A1 (en) 2003-07-16 2003-07-16 Method and apparatus for data recovery using storage based journaling

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US10/608,391 Continuation-In-Part US7111136B2 (en) 2003-06-26 2003-06-26 Method and apparatus for backup and recovery system using storage based journaling

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US10/931,543 Continuation-In-Part US7398422B2 (en) 2003-06-26 2004-08-31 Method and apparatus for data recovery system using storage based journaling
US11/365,096 Continuation US8145603B2 (en) 2003-07-16 2006-02-28 Method and apparatus for data recovery using storage based journaling

Publications (1)

Publication Number Publication Date
US20050015416A1 (en) 2005-01-20

Family

ID=34063061

Family Applications (3)

Application Number Title Priority Date Filing Date
US10/621,791 Abandoned US20050015416A1 (en) 2003-06-26 2003-07-16 Method and apparatus for data recovery using storage based journaling
US11/365,096 Active 2026-01-31 US8145603B2 (en) 2003-07-16 2006-02-28 Method and apparatus for data recovery using storage based journaling
US13/407,322 Expired - Fee Related US8868507B2 (en) 2003-07-16 2012-02-28 Method and apparatus for data recovery using storage based journaling

Country Status (1)

Country Link
US (3) US20050015416A1 (en)

Cited By (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040267836A1 (en) * 2003-06-25 2004-12-30 Philippe Armangau Replication of snapshot using a file system copy differential
US20070100905A1 (en) * 2005-11-03 2007-05-03 St. Bernard Software, Inc. Malware and spyware attack recovery system and method
US20070112893A1 (en) * 2005-11-15 2007-05-17 Wataru Okada Computer system, management computer, storage system, and backup management method
JP2007140746A (en) * 2005-11-16 2007-06-07 Hitachi Ltd Computer system, management computer and recovery management method
US20070166117A1 (en) * 2006-01-17 2007-07-19 Hsin-Tien Chang Gyration balancing calibration free high-speed boring tool
US20070174681A1 (en) * 1999-10-19 2007-07-26 Idocrase Investments Llc Stored memory recovery system
US20070174569A1 (en) * 2006-01-26 2007-07-26 Infortrend Technology, Inc. Method of managing data snapshot images in a storage system
US20070179994A1 (en) * 2006-01-31 2007-08-02 Akira Deguchi Storage system
US20070198604A1 (en) * 2006-02-21 2007-08-23 Hitachi, Ltd. Computer system, computer system management console, and data recovery management method
US20070214334A1 (en) * 2006-03-07 2007-09-13 Naoko Maruyama Storage system
US20070271422A1 (en) * 2006-05-19 2007-11-22 Nobuyuki Osaki Method and apparatus for data recovery
US20070300013A1 (en) * 2006-06-21 2007-12-27 Manabu Kitamura Storage system having transaction monitoring capability
US20080027998A1 (en) * 2006-07-27 2008-01-31 Hitachi, Ltd. Method and apparatus of continuous data protection for NAS
US20080059732A1 (en) * 2006-09-06 2008-03-06 Wataru Okada Computer system with data recovering, method of managing data with data recovering and managing computer for data recovering
US20080071841A1 (en) * 2006-09-20 2008-03-20 Hitachi, Ltd. Recovery method using CDP
US20080091744A1 (en) * 2006-10-11 2008-04-17 Hidehisa Shitomi Method and apparatus for indexing and searching data in a storage system
US20080098156A1 (en) * 1999-10-19 2008-04-24 Shen Andrew W Operating system and data protection
US7383465B1 (en) * 2004-06-22 2008-06-03 Symantec Operating Corporation Undoable volume using write logging
US20080154914A1 (en) * 2006-05-26 2008-06-26 Nec Corporation Storage system, data protection method, and program
US20080243946A1 (en) * 2007-03-29 2008-10-02 Hitachi, Ltd. Storage system and data recovery method
US7433898B1 (en) * 2004-06-01 2008-10-07 Sanbolic, Inc. Methods and apparatus for shared storage journaling
US20090024871A1 (en) * 2005-11-21 2009-01-22 Hitachi, Ltd. Failure management method for a storage system
US7571350B2 (en) 2006-02-14 2009-08-04 Hitachi, Ltd. Storage system and recovery method thereof
US20090259669A1 (en) * 2008-04-10 2009-10-15 Iron Mountain Incorporated Method and system for analyzing test data for a computer application
US20100169281A1 (en) * 2006-05-22 2010-07-01 Rajeev Atluri Coalescing and capturing data between events prior to and after a temporal window
US20100169283A1 (en) * 2006-05-22 2010-07-01 Rajeev Atluri Recovery point data view formation with generation of a recovery view and a coalesce policy
WO2010096685A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Methods and systems for single instance storage of asset parts
US20100217953A1 (en) * 2009-02-23 2010-08-26 Beaman Peter D Hybrid hash tables
US20100217931A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Managing workflow communication in a distributed storage system
US20100215175A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Methods and systems for stripe blind encryption
US20100274767A1 (en) * 2009-04-23 2010-10-28 Hitachi, Ltd. Backup method for storage system
US20110184918A1 (en) * 2006-05-22 2011-07-28 Rajeev Atluri Recovery point data view shift through a direction-agnostic roll algorithm
US20130138615A1 (en) * 2011-11-29 2013-05-30 International Business Machines Corporation Synchronizing updates across cluster filesystems
US20140019411A1 (en) * 2005-09-21 2014-01-16 Infoblox Inc. Semantic replication
US8788459B2 (en) 2012-05-15 2014-07-22 Splunk Inc. Clustering for high availability and disaster recovery
US8805886B1 (en) * 2004-05-26 2014-08-12 Symantec Operating Corporation Recoverable single-phase logging
US20140279900A1 (en) * 2013-03-15 2014-09-18 Amazon Technologies, Inc. Place snapshots
US8892516B2 (en) 2005-09-21 2014-11-18 Infoblox Inc. Provisional authority in a distributed database
US9098455B2 (en) 2004-06-01 2015-08-04 Inmage Systems, Inc. Systems and methods of event driven recovery management
US9317545B2 (en) 2005-09-21 2016-04-19 Infoblox Inc. Transactional replication
US20160132401A1 (en) * 2010-08-12 2016-05-12 Security First Corp. Systems and methods for secure remote storage
US20160147616A1 (en) * 2014-11-25 2016-05-26 Andre Schefe Recovery strategy with dynamic number of volumes
US9383937B1 (en) * 2013-03-14 2016-07-05 Emc Corporation Journal tiering in a continuous data protection system using deduplication-based storage
US9411717B2 (en) 2012-10-23 2016-08-09 Seagate Technology Llc Metadata journaling with error correction redundancy
US20160239372A1 (en) * 2013-09-26 2016-08-18 Hewlett Packard Enterprise Development Lp Undoing changes made by threads
US9547560B1 (en) * 2015-06-26 2017-01-17 Amazon Technologies, Inc. Amortized snapshots
US9558078B2 (en) 2014-10-28 2017-01-31 Microsoft Technology Licensing, Llc Point in time database restore from storage snapshots
US20170123685A1 (en) * 2015-11-04 2017-05-04 Nimble Storage, Inc. Virtualization of non-volatile random access memory
US9984129B2 (en) 2012-05-15 2018-05-29 Splunk Inc. Managing data searches using generation identifiers
US10162715B1 (en) * 2009-03-31 2018-12-25 Amazon Technologies, Inc. Cloning and recovery of data volumes
US10282260B2 (en) 2015-07-21 2019-05-07 Samsung Electronics Co., Ltd. Method of operating storage system and storage controller
US20190155698A1 (en) * 2017-11-20 2019-05-23 Salesforce.Com, Inc. Distributed storage reservation for recovering distributed data
US10387448B2 (en) 2012-05-15 2019-08-20 Splunk Inc. Replication of summary data in a clustered computing environment
US10534768B2 (en) 2013-12-02 2020-01-14 Amazon Technologies, Inc. Optimized log storage for asynchronous log updates
CN110832490A (en) * 2017-07-10 2020-02-21 美光科技公司 Secure snapshot management for data storage devices
US10698881B2 (en) 2013-03-15 2020-06-30 Amazon Technologies, Inc. Database system with database engine and separate distributed storage service
US10747746B2 (en) 2013-04-30 2020-08-18 Amazon Technologies, Inc. Efficient read replicas
US10872076B2 (en) 2013-05-13 2020-12-22 Amazon Technologies, Inc. Transaction ordering
US11003687B2 (en) 2012-05-15 2021-05-11 Splunk, Inc. Executing data searches using generation identifiers
US11030055B2 (en) 2013-03-15 2021-06-08 Amazon Technologies, Inc. Fast crash recovery for distributed database systems
US11120152B2 (en) 2013-09-20 2021-09-14 Amazon Technologies, Inc. Dynamic quorum membership changes
US11210184B1 (en) * 2017-06-07 2021-12-28 Amazon Technologies, Inc. Online restore to a selectable prior state for database engines
US11334522B2 (en) * 2015-09-11 2022-05-17 Cohesity, Inc. Distributed write journals that support fast snapshotting for a distributed file system
US11429488B2 (en) 2016-10-14 2022-08-30 Tencent Technology (Shenzhen) Company Limited Data recovery method based on snapshots, device and storage medium

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8572040B2 (en) * 2006-04-28 2013-10-29 International Business Machines Corporation Methods and infrastructure for performing repetitive data protection and a corresponding restore of data
US8539253B2 (en) * 2006-07-18 2013-09-17 Netapp, Inc. System and method for securing information by obscuring contents of a persistent image
US8238624B2 (en) * 2007-01-30 2012-08-07 International Business Machines Corporation Hybrid medical image processing
US8331737B2 (en) * 2007-04-23 2012-12-11 International Business Machines Corporation Heterogeneous image processing system
US8462369B2 (en) * 2007-04-23 2013-06-11 International Business Machines Corporation Hybrid image processing system for a single field of view having a plurality of inspection threads
US8326092B2 (en) * 2007-04-23 2012-12-04 International Business Machines Corporation Heterogeneous image processing system
US8675219B2 (en) * 2007-10-24 2014-03-18 International Business Machines Corporation High bandwidth image processing with run time library function offload via task distribution to special purpose engines
US9135073B2 (en) 2007-11-15 2015-09-15 International Business Machines Corporation Server-processor hybrid system for processing data
US20090132582A1 (en) * 2007-11-15 2009-05-21 Kim Moon J Processor-server hybrid system for processing data
US8095827B2 (en) 2007-11-16 2012-01-10 International Business Machines Corporation Replication management with undo and redo capabilities
US20090150556A1 (en) * 2007-12-06 2009-06-11 Kim Moon J Memory to storage communication for hybrid systems
US9332074B2 (en) * 2007-12-06 2016-05-03 International Business Machines Corporation Memory to memory communication and storage for hybrid systems
US8229251B2 (en) * 2008-02-08 2012-07-24 International Business Machines Corporation Pre-processing optimization of an image processing system
US8379963B2 (en) * 2008-03-28 2013-02-19 International Business Machines Corporation Visual inspection system
US8250031B2 (en) * 2008-08-26 2012-08-21 Hitachi, Ltd. Low traffic failback remote copy
US8805953B2 (en) * 2009-04-03 2014-08-12 Microsoft Corporation Differential file and system restores from peers and the cloud
US20100257403A1 (en) * 2009-04-03 2010-10-07 Microsoft Corporation Restoration of a system from a set of full and partial delta system snapshots across a distributed system
JP5939896B2 (en) * 2012-06-12 2016-06-22 キヤノン株式会社 Image forming apparatus
US8495221B1 (en) * 2012-10-17 2013-07-23 Limelight Networks, Inc. Targeted and dynamic content-object storage based on inter-network performance metrics
US10339112B1 (en) * 2013-04-25 2019-07-02 Veritas Technologies Llc Restoring data in deduplicated storage
US10255137B1 (en) * 2013-12-16 2019-04-09 EMC IP Holding Company LLC Point-in-time recovery on deduplicated storage
US9886353B2 (en) 2014-12-19 2018-02-06 International Business Machines Corporation Data asset reconstruction
US11392541B2 (en) * 2019-03-22 2022-07-19 Hewlett Packard Enterprise Development Lp Data transfer using snapshot differencing from edge system to core system
US11163799B2 (en) * 2019-10-29 2021-11-02 Dell Products L.P. Automatic rollback to target for synchronous replication
US11809279B2 (en) * 2019-12-13 2023-11-07 EMC IP Holding Company LLC Automatic IO stream timing determination in live images
US11775391B2 (en) 2020-07-13 2023-10-03 Samsung Electronics Co., Ltd. RAID system with fault resilient storage devices

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5088502A (en) * 1990-12-12 1992-02-18 Cuderm Corporation Skin surface sampling and visualizing device
US5404508A (en) * 1992-12-03 1995-04-04 Unisys Corporation Data base backup and recovery system and method
US6301677B1 (en) * 1996-12-15 2001-10-09 Delta-Tek Research, Inc. System and apparatus for merging a write event journal and an original storage to produce an updated storage using an event map
US20040133575A1 (en) * 2002-12-23 2004-07-08 Storage Technology Corporation Scheduled creation of point-in-time views
US6829819B1 (en) * 1999-05-03 2004-12-14 Western Digital (Fremont), Inc. Method of forming a magnetoresistive device

Family Cites Families (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4077059A (en) 1975-12-18 1978-02-28 Cordi Vincent A Multi-processing system with a hierarchial memory having journaling and copyback
US4823261A (en) 1986-11-24 1989-04-18 International Business Machines Corp. Multiprocessor system for updating status information through flip-flopping read version and write version of checkpoint data
US5065311A (en) 1987-04-20 1991-11-12 Hitachi, Ltd. Distributed data base system of composite subsystem type, and method fault recovery for the system
GB8915875D0 (en) 1989-07-11 1989-08-31 Intelligence Quotient United K A method of operating a data processing system
JPH03103941A (en) 1989-09-18 1991-04-30 Nec Corp Automatic commitment control system
US5479654A (en) 1990-04-26 1995-12-26 Squibb Data Systems, Inc. Apparatus and method for reconstructing a file from a difference signature and an original file
US6816872B1 (en) 1990-04-26 2004-11-09 Timespring Software Corporation Apparatus and method for reconstructing a file from a difference signature and an original file
US5369757A (en) 1991-06-18 1994-11-29 Digital Equipment Corporation Recovery logging in the presence of snapshot files by ordering of buffer pool flushing
JPH052517A (en) 1991-06-26 1993-01-08 Nec Corp Data base journal control system
US5701480A (en) 1991-10-17 1997-12-23 Digital Equipment Corporation Distributed multi-version commitment ordering protocols for guaranteeing serializability during transaction processing
US5263154A (en) 1992-04-20 1993-11-16 International Business Machines Corporation Method and system for incremental time zero backup copying of data
JPH0827754B2 (en) 1992-05-21 1996-03-21 インターナショナル・ビジネス・マシーンズ・コーポレイション File management method and file management system in computer system
US5416915A (en) 1992-12-11 1995-05-16 International Business Machines Corporation Method and system for minimizing seek affinity and enhancing write sensitivity in a DASD array
US5555371A (en) 1992-12-17 1996-09-10 International Business Machines Corporation Data backup copying with delayed directory updating and reduced numbers of DASD accesses at a back up site using a log structured array data storage
JP3130536B2 (en) 1993-01-21 2001-01-31 アップル コンピューター インコーポレーテッド Apparatus and method for transferring and storing data from multiple networked computer storage devices
CA2153770A1 (en) 1993-01-21 1994-08-04 Charles S. Spirakis Method and apparatus for data transfer and storage in a highly parallel computer network environment
JPH0869404A (en) 1994-08-29 1996-03-12 Fujitsu Ltd Backup method for data and data processor utilizing same
US5835953A (en) 1994-10-13 1998-11-10 Vinca Corporation Backup system that takes a snapshot of the locations in a mass storage device that has been identified for updating prior to updating
US5644696A (en) 1995-06-06 1997-07-01 International Business Machines Corporation Recovering multi-volume data sets during volume recovery
US5720029A (en) 1995-07-25 1998-02-17 International Business Machines Corporation Asynchronously shadowing record updates in a remote copy session using track arrays
US5680640A (en) 1995-09-01 1997-10-21 Emc Corporation System for migrating data by selecting a first or second transfer means based on the status of a data element map initialized to a predetermined state
US5870758A (en) 1996-03-11 1999-02-09 Oracle Corporation Method and apparatus for providing isolation levels in a database system
US6889214B1 (en) 1996-10-02 2005-05-03 Stamps.Com Inc. Virtual security device
US6081875A (en) 1997-05-19 2000-06-27 Emc Corporation Apparatus and method for backup of a disk storage system
US6490610B1 (en) 1997-05-30 2002-12-03 Oracle Corporation Automatic failover for clients accessing a resource through a server
US6128630A (en) 1997-12-18 2000-10-03 International Business Machines Corporation Journal space release for log-structured storage systems
JPH11272427A (en) 1998-03-24 1999-10-08 Hitachi Ltd Method for saving data and outside storage device
US6324654B1 (en) 1998-03-30 2001-11-27 Legato Systems, Inc. Computer network remote data mirroring system
US6154852A (en) 1998-06-10 2000-11-28 International Business Machines Corporation Method and apparatus for data backup and recovery
JPH11353215A (en) 1998-06-11 1999-12-24 Nec Corp Journal-after-update collecting process system
US6189016B1 (en) 1998-06-12 2001-02-13 Microsoft Corporation Journaling ordered changes in a storage volume
US6269381B1 (en) 1998-06-30 2001-07-31 Emc Corporation Method and apparatus for backing up data before updating the data and for restoring from the backups
US6298345B1 (en) 1998-07-10 2001-10-02 International Business Machines Corporation Database journal mechanism and method that supports multiple simultaneous deposits
US6260124B1 (en) 1998-08-13 2001-07-10 International Business Machines Corporation System and method for dynamically resynchronizing backup data
US6353878B1 (en) 1998-08-13 2002-03-05 Emc Corporation Remote control of backup media in a secondary storage subsystem through access to a primary storage subsystem
US6269431B1 (en) 1998-08-13 2001-07-31 Emc Corporation Virtual storage and block level direct access of secondary storage for recovery of backup data
US6397351B1 (en) 1998-09-28 2002-05-28 International Business Machines Corporation Method and apparatus for rapid data restoration including on-demand output of sorted logged changes
JP2000155708A (en) 1998-11-24 2000-06-06 Nec Corp Automatic monitoring method for use state of journal file
JP2000284987A (en) 1999-03-31 2000-10-13 Fujitsu Ltd Computer, computer network system and recording medium
US7099875B2 (en) 1999-06-29 2006-08-29 Emc Corporation Method and apparatus for making independent data copies in a data processing system
US6539462B1 (en) 1999-07-12 2003-03-25 Hitachi Data Systems Corporation Remote data copy using a prospective suspend command
TW454120B (en) 1999-11-11 2001-09-11 Miralink Corp Flexible remote data mirroring
US7203732B2 (en) 1999-11-11 2007-04-10 Miralink Corporation Flexible remote data mirroring
US6560614B1 (en) 1999-11-12 2003-05-06 Xosoft Inc. Nonintrusive update of files
US6711409B1 (en) 1999-12-15 2004-03-23 Bbnt Solutions Llc Node belonging to multiple clusters in an ad hoc wireless network
US6526418B1 (en) 1999-12-16 2003-02-25 Livevault Corporation Systems and methods for backing up data files
JP4115060B2 (en) 2000-02-02 2008-07-09 株式会社日立製作所 Data recovery method for information processing system and disk subsystem
US6473775B1 (en) 2000-02-16 2002-10-29 Microsoft Corporation System and method for growing differential file on a base volume of a snapshot
US6587970B1 (en) 2000-03-22 2003-07-01 Emc Corporation Method and apparatus for performing site failover
JP3968207B2 (en) 2000-05-25 2007-08-29 株式会社日立製作所 Data multiplexing method and data multiplexing system
US6711572B2 (en) 2000-06-14 2004-03-23 Xosoft Inc. File system for distributing content in a data network and related methods
US6665815B1 (en) 2000-06-22 2003-12-16 Hewlett-Packard Development Company, L.P. Physical incremental backup using snapshots
US7031986B2 (en) 2000-06-27 2006-04-18 Fujitsu Limited Database system with backup and recovery mechanisms
US6732125B1 (en) 2000-09-08 2004-05-04 Storage Technology Corporation Self archiving log structured volume with intrinsic data protection
US6691245B1 (en) 2000-10-10 2004-02-10 Lsi Logic Corporation Data storage with host-initiated synchronization and fail-over of remote mirror
US20020111891A1 (en) * 2000-11-24 2002-08-15 Woodward Hoffman Accounting system for dynamic state of the portfolio reporting
US7730213B2 (en) 2000-12-18 2010-06-01 Oracle America, Inc. Object-based storage device with improved reliability and fast crash recovery
US6662281B2 (en) 2001-01-31 2003-12-09 Hewlett-Packard Development Company, L.P. Redundant backup device
US6742138B1 (en) 2001-06-12 2004-05-25 Emc Corporation Data recovery method and apparatus
US6978282B1 (en) 2001-09-04 2005-12-20 Emc Corporation Information replication system having automated replication storage
EP1436873B1 (en) 2001-09-28 2009-04-29 Commvault Systems, Inc. System and method for generating and managing quick recovery volumes
US6832289B2 (en) 2001-10-11 2004-12-14 International Business Machines Corporation System and method for migrating data
JP4108973B2 (en) 2001-12-26 2008-06-25 株式会社日立製作所 Backup system
US6898688B2 (en) 2001-12-28 2005-05-24 Storage Technology Corporation Data management appliance
US6839819B2 (en) * 2001-12-28 2005-01-04 Storage Technology Corporation Data management appliance
US7237075B2 (en) 2002-01-22 2007-06-26 Columbia Data Products, Inc. Persistent snapshot methods
US20030177306A1 (en) 2002-03-14 2003-09-18 Cochran Robert Alan Track level snapshot
US7225204B2 (en) 2002-03-19 2007-05-29 Network Appliance, Inc. System and method for asynchronous mirroring of snapshots at a destination using a purgatory directory and inode mapping
US7778958B2 (en) 2002-04-11 2010-08-17 Quantum Corporation Recovery of data on a primary data volume
AU2003214624A1 (en) 2002-04-25 2003-11-10 Kashya Israel Ltd. An apparatus for continuous compression of large volumes of data
US20030220935A1 (en) 2002-05-21 2003-11-27 Vivian Stephen J. Method of logical database snapshot for log-based replication
JP2004013367A (en) 2002-06-05 2004-01-15 Hitachi Ltd Data storage subsystem
US7844577B2 (en) 2002-07-15 2010-11-30 Symantec Corporation System and method for maintaining a backup storage system for a computer system
US6842825B2 (en) 2002-08-07 2005-01-11 International Business Machines Corporation Adjusting timestamps to preserve update timing information for cached data objects
US7020755B2 (en) 2002-08-29 2006-03-28 International Business Machines Corporation Method and apparatus for read-only recovery in a dual copy storage system
US8219777B2 (en) 2002-10-03 2012-07-10 Hewlett-Packard Development Company, L.P. Virtual storage systems, virtual storage methods and methods of over committing a virtual raid storage system
CA2508089A1 (en) 2002-10-07 2004-04-22 Commvault Systems, Inc. System and method for managing stored data
US6981114B1 (en) * 2002-10-16 2005-12-27 Veritas Operating Corporation Snapshot reconstruction from an existing snapshot and one or more modification logs
US20040153558A1 (en) 2002-10-31 2004-08-05 Mesut Gunduc System and method for providing java based high availability clustering framework
US7010645B2 (en) 2002-12-27 2006-03-07 International Business Machines Corporation System and method for sequentially staging received data to a write cache in advance of storing the received data
US7231544B2 (en) 2003-02-27 2007-06-12 Hewlett-Packard Development Company, L.P. Restoring data from point-in-time representations of the data
US20050039069A1 (en) 2003-04-03 2005-02-17 Anand Prahlad Remote disaster data recovery system and method
US20040225689A1 (en) 2003-05-08 2004-11-11 International Business Machines Corporation Autonomic logging support
US7143317B2 (en) 2003-06-04 2006-11-28 Hewlett-Packard Development Company, L.P. Computer event log overwriting intermediate events

Cited By (119)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7516357B2 (en) * 1999-10-19 2009-04-07 Idocrase Investments Llc Stored memory recovery system
US7783923B2 (en) 1999-10-19 2010-08-24 Shen Andrew W Stored memory recovery system
US20080098156A1 (en) * 1999-10-19 2008-04-24 Shen Andrew W Operating system and data protection
US20070174681A1 (en) * 1999-10-19 2007-07-26 Idocrase Investments Llc Stored memory recovery system
US7818617B2 (en) 1999-10-19 2010-10-19 Shen Andrew W Operating system and data protection
US7844855B2 (en) * 1999-10-19 2010-11-30 Shen Andrew W Stored memory recovery system
US7567991B2 (en) * 2003-06-25 2009-07-28 Emc Corporation Replication of snapshot using a file system copy differential
US20040267836A1 (en) * 2003-06-25 2004-12-30 Philippe Armangau Replication of snapshot using a file system copy differential
US8805886B1 (en) * 2004-05-26 2014-08-12 Symantec Operating Corporation Recoverable single-phase logging
US7433898B1 (en) * 2004-06-01 2008-10-07 Sanbolic, Inc. Methods and apparatus for shared storage journaling
US9165157B2 (en) 2004-06-01 2015-10-20 Citrix Systems, Inc. Methods and apparatus facilitating access to storage among multiple computers
US9098455B2 (en) 2004-06-01 2015-08-04 Inmage Systems, Inc. Systems and methods of event driven recovery management
US7383465B1 (en) * 2004-06-22 2008-06-03 Symantec Operating Corporation Undoable volume using write logging
US8892516B2 (en) 2005-09-21 2014-11-18 Infoblox Inc. Provisional authority in a distributed database
US20140019411A1 (en) * 2005-09-21 2014-01-16 Infoblox Inc. Semantic replication
US8874516B2 (en) * 2005-09-21 2014-10-28 Infoblox Inc. Semantic replication
US9317545B2 (en) 2005-09-21 2016-04-19 Infoblox Inc. Transactional replication
US20070100905A1 (en) * 2005-11-03 2007-05-03 St. Bernard Software, Inc. Malware and spyware attack recovery system and method
US7756834B2 (en) * 2005-11-03 2010-07-13 I365 Inc. Malware and spyware attack recovery system and method
US20070112893A1 (en) * 2005-11-15 2007-05-17 Wataru Okada Computer system, management computer, storage system, and backup management method
US7409414B2 (en) * 2005-11-15 2008-08-05 Hitachi, Ltd. Computer system, management computer, storage system, and backup management method
US7693885B2 (en) 2005-11-15 2010-04-06 Hitachi, Ltd. Computer system, management computer, storage system, and backup management method
US20080307021A1 (en) * 2005-11-15 2008-12-11 Wataru Okada Computer system, management computer, storage system, and backup management method
JP2007140746A (en) * 2005-11-16 2007-06-07 Hitachi Ltd Computer system, management computer and recovery management method
US7444545B2 (en) 2005-11-16 2008-10-28 Hitachi, Ltd. Computer system, managing computer and recovery management method
US20090013008A1 (en) * 2005-11-16 2009-01-08 Hitachi, Ltd. Computer System, Managing Computer and Recovery Management Method
US8135986B2 (en) 2005-11-16 2012-03-13 Hitachi, Ltd. Computer system, managing computer and recovery management method
US20090024871A1 (en) * 2005-11-21 2009-01-22 Hitachi, Ltd. Failure management method for a storage system
US20070166117A1 (en) * 2006-01-17 2007-07-19 Hsin-Tien Chang Gyration balancing calibration free high-speed boring tool
US8533409B2 (en) * 2006-01-26 2013-09-10 Infortrend Technology, Inc. Method of managing data snapshot images in a storage system
US20070174569A1 (en) * 2006-01-26 2007-07-26 Infortrend Technology, Inc. Method of managing data snapshot images in a storage system
US7571348B2 (en) 2006-01-31 2009-08-04 Hitachi, Ltd. Storage system creating a recovery request point enabling execution of a recovery
US8327183B2 (en) 2006-01-31 2012-12-04 Hitachi, Ltd. Storage system creating a recovery request enabling execution of a recovery and comprising a switch that detects recovery request point events
US20090276661A1 (en) * 2006-01-31 2009-11-05 Akira Deguchi Storage system creating a recovery request point enabling execution of a recovery
US20070179994A1 (en) * 2006-01-31 2007-08-02 Akira Deguchi Storage system
US7571350B2 (en) 2006-02-14 2009-08-04 Hitachi, Ltd. Storage system and recovery method thereof
US20070198604A1 (en) * 2006-02-21 2007-08-23 Hitachi, Ltd. Computer system, computer system management console, and data recovery management method
US7681001B2 (en) * 2006-03-07 2010-03-16 Hitachi, Ltd. Storage system
US20070214334A1 (en) * 2006-03-07 2007-09-13 Naoko Maruyama Storage system
US7581136B2 (en) 2006-05-19 2009-08-25 Hitachi, Ltd. Method and apparatus for data recovery
US20070271422A1 (en) * 2006-05-19 2007-11-22 Nobuyuki Osaki Method and apparatus for data recovery
US8527470B2 (en) * 2006-05-22 2013-09-03 Rajeev Atluri Recovery point data view formation with generation of a recovery view and a coalesce policy
US8838528B2 (en) * 2006-05-22 2014-09-16 Inmage Systems, Inc. Coalescing and capturing data between events prior to and after a temporal window
US8732136B2 (en) * 2006-05-22 2014-05-20 Inmage Systems, Inc. Recovery point data view shift through a direction-agnostic roll algorithm
US20100169283A1 (en) * 2006-05-22 2010-07-01 Rajeev Atluri Recovery point data view formation with generation of a recovery view and a coalesce policy
US20100169281A1 (en) * 2006-05-22 2010-07-01 Rajeev Atluri Coalescing and capturing data between events prior to and after a temporal window
US20110184918A1 (en) * 2006-05-22 2011-07-28 Rajeev Atluri Recovery point data view shift through a direction-agnostic roll algorithm
US20080154914A1 (en) * 2006-05-26 2008-06-26 Nec Corporation Storage system, data protection method, and program
US20070300013A1 (en) * 2006-06-21 2007-12-27 Manabu Kitamura Storage system having transaction monitoring capability
US20080027998A1 (en) * 2006-07-27 2008-01-31 Hitachi, Ltd. Method and apparatus of continuous data protection for NAS
US20080059732A1 (en) * 2006-09-06 2008-03-06 Wataru Okada Computer system with data recovering, method of managing data with data recovering and managing computer for data recovering
US7698503B2 (en) * 2006-09-06 2010-04-13 Hitachi, Ltd. Computer system with data recovering, method of managing data with data recovering and managing computer for data recovering
US8082232B2 (en) 2006-09-20 2011-12-20 Hitachi, Ltd. Recovery method using CDP
US20080071841A1 (en) * 2006-09-20 2008-03-20 Hitachi, Ltd. Recovery method using CDP
US7467165B2 (en) 2006-09-20 2008-12-16 Hitachi, Ltd. Recovery method using CDP
US20090070390A1 (en) * 2006-09-20 2009-03-12 Hitachi, Ltd. Recovery method using cdp
US20080091744A1 (en) * 2006-10-11 2008-04-17 Hidehisa Shitomi Method and apparatus for indexing and searching data in a storage system
US8060468B2 (en) * 2007-03-29 2011-11-15 Hitachi, Ltd. Storage system and data recovery method
US20080243946A1 (en) * 2007-03-29 2008-10-02 Hitachi, Ltd. Storage system and data recovery method
US20090259669A1 (en) * 2008-04-10 2009-10-15 Iron Mountain Incorporated Method and system for analyzing test data for a computer application
GB2484371B (en) * 2009-02-23 2012-05-30 Iron Mountain Inc Methods and systems for single instance storage of asset parts
US20100217953A1 (en) * 2009-02-23 2010-08-26 Beaman Peter D Hybrid hash tables
WO2010096685A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Methods and systems for single instance storage of asset parts
US8397051B2 (en) 2009-02-23 2013-03-12 Autonomy, Inc. Hybrid hash tables
US20100217931A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Managing workflow communication in a distributed storage system
GB2484371A (en) * 2009-02-23 2012-04-11 Iron Mountain Inc Methods and systems for single instance storage of asset parts
US8145598B2 (en) 2009-02-23 2012-03-27 Iron Mountain Incorporated Methods and systems for single instance storage of asset parts
US20100215175A1 (en) * 2009-02-23 2010-08-26 Iron Mountain Incorporated Methods and systems for stripe blind encryption
US8090683B2 (en) 2009-02-23 2012-01-03 Iron Mountain Incorporated Managing workflow communication in a distributed storage system
US8806175B2 (en) 2009-02-23 2014-08-12 Longsand Limited Hybrid hash tables
US20100228784A1 (en) * 2009-02-23 2010-09-09 Iron Mountain Incorporated Methods and Systems for Single Instance Storage of Asset Parts
US20230070982A1 (en) * 2009-03-31 2023-03-09 Amazon Technologies, Inc. Cloning and recovery of data volumes
US11914486B2 (en) * 2009-03-31 2024-02-27 Amazon Technologies, Inc. Cloning and recovery of data volumes
US11385969B2 (en) * 2009-03-31 2022-07-12 Amazon Technologies, Inc. Cloning and recovery of data volumes
US10162715B1 (en) * 2009-03-31 2018-12-25 Amazon Technologies, Inc. Cloning and recovery of data volumes
US20100274767A1 (en) * 2009-04-23 2010-10-28 Hitachi, Ltd. Backup method for storage system
US8185502B2 (en) * 2009-04-23 2012-05-22 Hitachi, Ltd. Backup method for storage system
US20160132401A1 (en) * 2010-08-12 2016-05-12 Security First Corp. Systems and methods for secure remote storage
US20130138615A1 (en) * 2011-11-29 2013-05-30 International Business Machines Corporation Synchronizing updates across cluster filesystems
US9235594B2 (en) * 2011-11-29 2016-01-12 International Business Machines Corporation Synchronizing updates across cluster filesystems
US20160103850A1 (en) * 2011-11-29 2016-04-14 International Business Machines Corporation Synchronizing Updates Across Cluster Filesystems
US20130138616A1 (en) * 2011-11-29 2013-05-30 International Business Machines Corporation Synchronizing updates across cluster filesystems
US10698866B2 (en) * 2011-11-29 2020-06-30 International Business Machines Corporation Synchronizing updates across cluster filesystems
US9984128B2 (en) 2012-05-15 2018-05-29 Splunk Inc. Managing site-based search configuration data
US8788459B2 (en) 2012-05-15 2014-07-22 Splunk Inc. Clustering for high availability and disaster recovery
US11675810B2 (en) 2012-05-15 2023-06-13 Splunk Inc. Disaster recovery in a clustered environment using generation identifiers
US10474682B2 (en) 2012-05-15 2019-11-12 Splunk Inc. Data replication in a clustered computing environment
US10387448B2 (en) 2012-05-15 2019-08-20 Splunk Inc. Replication of summary data in a clustered computing environment
US9984129B2 (en) 2012-05-15 2018-05-29 Splunk Inc. Managing data searches using generation identifiers
US9160798B2 (en) 2012-05-15 2015-10-13 Splunk, Inc. Clustering for high availability and disaster recovery
US11003687B2 (en) 2012-05-15 2021-05-11 Splunk, Inc. Executing data searches using generation identifiers
US9411717B2 (en) 2012-10-23 2016-08-09 Seagate Technology Llc Metadata journaling with error correction redundancy
US9383937B1 (en) * 2013-03-14 2016-07-05 Emc Corporation Journal tiering in a continuous data protection system using deduplication-based storage
US11500852B2 (en) 2013-03-15 2022-11-15 Amazon Technologies, Inc. Database system with database engine and separate distributed storage service
US10180951B2 (en) * 2013-03-15 2019-01-15 Amazon Technologies, Inc. Place snapshots
US11030055B2 (en) 2013-03-15 2021-06-08 Amazon Technologies, Inc. Fast crash recovery for distributed database systems
AU2017239539B2 (en) * 2013-03-15 2019-08-15 Amazon Technologies, Inc. In place snapshots
US20140279900A1 (en) * 2013-03-15 2014-09-18 Amazon Technologies, Inc. Place snapshots
US10698881B2 (en) 2013-03-15 2020-06-30 Amazon Technologies, Inc. Database system with database engine and separate distributed storage service
US10747746B2 (en) 2013-04-30 2020-08-18 Amazon Technologies, Inc. Efficient read replicas
US10872076B2 (en) 2013-05-13 2020-12-22 Amazon Technologies, Inc. Transaction ordering
US11120152B2 (en) 2013-09-20 2021-09-14 Amazon Technologies, Inc. Dynamic quorum membership changes
US20160239372A1 (en) * 2013-09-26 2016-08-18 Hewlett Packard Enterprise Development Lp Undoing changes made by threads
US10534768B2 (en) 2013-12-02 2020-01-14 Amazon Technologies, Inc. Optimized log storage for asynchronous log updates
US9558078B2 (en) 2014-10-28 2017-01-31 Microsoft Technology Licensing, Llc Point in time database restore from storage snapshots
US20160147616A1 (en) * 2014-11-25 2016-05-26 Andre Schefe Recovery strategy with dynamic number of volumes
US9836360B2 (en) * 2014-11-25 2017-12-05 Sap Se Recovery strategy with dynamic number of volumes
US9547560B1 (en) * 2015-06-26 2017-01-17 Amazon Technologies, Inc. Amortized snapshots
US10019184B2 (en) 2015-06-26 2018-07-10 Amazon Technologies, Inc. Amortized snapshots
US10282260B2 (en) 2015-07-21 2019-05-07 Samsung Electronics Co., Ltd. Method of operating storage system and storage controller
US11741048B2 (en) 2015-09-11 2023-08-29 Cohesity, Inc. Distributed write journals that support fast snapshotting for a distributed file system
US11334522B2 (en) * 2015-09-11 2022-05-17 Cohesity, Inc. Distributed write journals that support fast snapshotting for a distributed file system
US20170123685A1 (en) * 2015-11-04 2017-05-04 Nimble Storage, Inc. Virtualization of non-volatile random access memory
US10019193B2 (en) * 2015-11-04 2018-07-10 Hewlett Packard Enterprise Development Lp Checkpointing a journal by virtualization of non-volatile random access memory
US11429488B2 (en) 2016-10-14 2022-08-30 Tencent Technology (Shenzhen) Company Limited Data recovery method based on snapshots, device and storage medium
US11210184B1 (en) * 2017-06-07 2021-12-28 Amazon Technologies, Inc. Online restore to a selectable prior state for database engines
CN110832490A (en) * 2017-07-10 2020-02-21 美光科技公司 Secure snapshot management for data storage devices
US20190155698A1 (en) * 2017-11-20 2019-05-23 Salesforce.Com, Inc. Distributed storage reservation for recovering distributed data
US10754735B2 (en) * 2017-11-20 2020-08-25 Salesforce.Com, Inc. Distributed storage reservation for recovering distributed data

Also Published As

Publication number Publication date
US20060149798A1 (en) 2006-07-06
US8868507B2 (en) 2014-10-21
US8145603B2 (en) 2012-03-27
US20120166396A1 (en) 2012-06-28

Similar Documents

Publication Publication Date Title
US8868507B2 (en) Method and apparatus for data recovery using storage based journaling
US7979741B2 (en) Method and apparatus for data recovery system using storage based journaling
US7243197B2 (en) Method and apparatus for backup and recovery using storage based journaling
US8296265B2 (en) Method and apparatus for synchronizing applications for data recovery using storage based journaling
CN111316245B (en) Restoring databases using fully hydrated backups
US6665815B1 (en) Physical incremental backup using snapshots
US7310654B2 (en) Method and system for providing image incremental and disaster recovery
US7167880B2 (en) Method and apparatus for avoiding journal overflow on backup and recovery system using storage based journaling
US8015155B2 (en) Non-disruptive backup copy in a database online reorganization environment
WO2003058449A2 (en) Appliance for management of data replication
CN117130827A (en) Restoring databases using fully hydrated backups

Legal Events

Date Code Title Description
AS Assignment
    Owner name: HITACHI, LTD., JAPAN
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAMAGAMI, KENJI;REEL/FRAME:015043/0271
    Effective date: 20030716
STCB Information on status: application discontinuation
    Free format text: EXPRESSLY ABANDONED -- DURING EXAMINATION