US20070061379A1 - Method and apparatus for sequencing transactions globally in a distributed database cluster
- Publication number: US20070061379A1 (application Ser. No. 11/221,752)
- Authority: US (United States)
- Prior art keywords: transactions, queue, replication, global, transaction
- Legal status: Abandoned (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
Description
- This invention relates generally to the sequencing and processing of transactions within a cluster of replicated databases.
- a database has become the core component of most computer application software.
- application software makes use of a single or multiple databases as repositories of data (content) required by the application to function properly.
- the application's operational efficiency and availability are greatly dependent on the performance and availability of these database(s), which can be measured by two metrics: (1) request response time; and (2) transaction throughput.
- the clustering of applications can be achieved readily by running the application software on multiple, interconnected application servers that facilitate the execution of the application software and provide hardware redundancy for high availability, with the application software actively processing requests concurrently.
- database clustering technologies cannot provide the same level of availability and redundancy in a similar active-active configuration. Consequently, database servers are primarily configured as active-standby, meaning that one of the computer systems in the cluster does not process application requests until a failover occurs. An active-standby configuration wastes system resources, extends the window of unavailability and increases the chance of data loss.
- An update conflict refers to two or more database servers updating the same record in the databases that they manage. Since data in these databases must be consistent among them in order to scale out for performance and achieve high availability, the conflict must be resolved.
- there are two different schemes of conflict resolution: (1) time-based resolution; and (2) location-based resolution.
- neither conflict resolution scheme can be enforced without some heuristic decision being made through human intervention. It is not possible to determine these heuristic decision rules without a thorough understanding of the application software's business rules and their implications. Consequently, most clustered database configurations adopt the active-standby model, and fail to achieve high performance and high availability at the same time.
- the systems and methods disclosed herein provide a system for globally managing transaction requests to one or more database servers and to obviate or mitigate at least some of the above presented disadvantages.
- contrary to current database configurations, there is provided a system and method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network.
- the system and method comprise a global queue for storing a number of the received transactions in a first predetermined order.
- the system and method also comprise a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- One aspect provided is a system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising: a global queue for storing a number of the received transactions in a first predetermined order; and a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- a further aspect provided is a system for receiving a plurality of transactions from at least one application server, distributing the transactions to at least two replication queues and applying the transactions to a plurality of databases comprising: a director coupled to each of said at least one application server for capturing a plurality of database calls therefrom as the plurality of transactions; and a controller for receiving each of the plurality of transactions, the controller configured for storing the transactions within a global queue in a predetermined order, for generating a copy of each said transaction for each of said at least two replication queues, and for transmitting in the predetermined order each said copy to each of said at least two replication queues respectively.
- a still further aspect provided is a method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the method comprising: storing a number of the received transactions in a first predetermined order in a global queue; creating a copy of each of the transactions for each of said at least two replication queues; and distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- a still further aspect provided is a system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising: means for storing a number of the received transactions in a first predetermined order; and means for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
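The queue-and-copy mechanism recited in the aspects above can be sketched in a few lines; this is an illustrative model only, and the class and variable names (Sequencer, global_queue, replication_queues) are invented for the sketch rather than taken from the patent:

```python
from collections import deque

class Sequencer:
    """Stores received transactions in a global queue (the first
    predetermined order) and distributes a copy of each to every
    replication queue (the second predetermined order)."""

    def __init__(self, replica_count):
        self.global_queue = deque()          # first predetermined order (FIFO)
        self.replication_queues = [deque() for _ in range(replica_count)]
        self.next_id = 0

    def receive(self, tx):
        # Track each received transaction with a unique sequence number.
        self.next_id += 1
        self.global_queue.append((self.next_id, tx))
        return self.next_id

    def distribute(self):
        # Copy every queued transaction to each replication queue, in order.
        while self.global_queue:
            entry = self.global_queue.popleft()
            for rq in self.replication_queues:
                rq.append(entry)             # one copy per replication queue

seq = Sequencer(replica_count=2)
for tx in ["INSERT A", "UPDATE B"]:
    seq.receive(tx)
seq.distribute()
```

Because every replication queue receives the same entries in the same order, each destination database applies the identical transaction sequence, which is the property the claimed system relies on.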
- FIG. 1A is a block diagram of a system for sequencing transactions
- FIG. 1B is a block diagram of a transaction replicator of the system of FIG. 1A;
- FIGS. 1C, 1D and 1E show an example operation of receiving and processing transactions for the system of FIG. 1A;
- FIG. 2 is a block diagram of a director of the system of FIG. 1A;
- FIG. 3 is a block diagram of a monitor of the system of FIG. 1A;
- FIG. 4 is an example operation of the transaction replicator of FIG. 1B;
- FIG. 5 is an example operation of a global transaction queue and a replication queue of FIG. 1B;
- FIG. 6 is an example operation of the transaction replicator of FIG. 1B for resolving gating and indoubt transactions; and
- FIG. 7 is an example operation of a replication server of FIG. 1B.
- a method and apparatus for sequencing transactions in a database cluster is described for use with computer programs or software applications whose functions are designed primarily to replicate update transactions to one or more databases such that data in these databases are approximately synchronized for read and write access.
- a system 10 comprising a plurality of application servers 7 for interacting with one or more database servers 4 and one or more databases 5 via a transaction replicator 1 .
- each of the application 7 instances represents a client computer.
- each of the application 7 instances represents an application server that is coupled to one or more users (not shown). Accordingly, it is recognized that the transaction replicator 1 can receive transactions from applications 7 , application servers 7 , or a combination thereof.
- the transaction replicator 1 of the system 10 receives transaction requests from the application servers 7 and provides sequenced and replicated transactions using a controller 2 to one or more replication servers 3 , which apply the transactions to the databases 5 .
- the transaction replicator 1 helps to prevent the transaction requests from interfering with each other and facilitates the integrity of the databases 5 .
- a transaction refers to a single logical operation from a user application 7 and typically includes requests to read, insert, update and delete records within a predetermined database 5.
- the controller 2 can be the central command center of the transaction replicator 1 that can run for example on the application servers 7 , the database servers 4 or dedicated hardware.
- the controller 2 may be coupled to a backup controller 9 that is set up to take over the command when the primary controller 2 fails.
- the backup controller 9 is approximately synchronized with the primary controller such that transaction integrity is preserved. It is recognized that the controller 2 and associated transaction replicator 1 can also be configured for use as a node in a peer-to-peer network, as further described below.
- a replica global transaction queue is utilized.
- the backup controller 9 takes over control of transaction replicator 1 upon the failure of the primary controller 2 .
- the primary and backup controllers are installed at different sites and a redundant WAN is recommended between the two sites.
- the controller 2 receives input transactions 11 from a user application 7 and provides sequenced transactions 19 via the replication servers 3 , the sequenced transactions 19 are then ready for commitment to the database servers 4 .
- the controller 2 comprises a resent transaction queue 18 (resent TX queue), an indoubt transaction queue 17 (indoubt TX queue), a global transaction sequencer 12 (global TX sequencer), a global TX queue 13 (global TX queue) and at least one global disk queue 14 .
- the global queue 13 (and other queues if desired) can be configured as a searchable first-in-first-out (FIFO) pipe or as a first-in-any-out (FIAO) pipe, as desired.
- a FIFO queue 13 could be used when the contents of the replication queues 15 are intended for databases 5
- a FIAO queue 13 could be used when the contents of the replication queues 15 are intended for consumption by unstructured data processing environments (not shown).
- the global disk queue 14 can be configured for an indexed and randomly accessible data set.
- the transaction replicator 1 maintains the globally sequenced transactions in two different types of queues: the global TX queue 13 and one or more replication queues 15, equal in number to the database server 4 instances. These queues are created in computer memory with spill-over areas on disk, such as the global disk queue 14 and one or more replication disk queues 16.
- the disk queues serve a number of purposes, including: persisting transactions to avoid transaction loss during failure of a component in the cluster; and acting as a very large transaction store (from gigabytes to terabytes) that computer memory cannot reasonably provide (typically less than 64 gigabytes).
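The memory-plus-disk arrangement described above can be sketched as a queue that spills to persistent storage once a memory threshold is reached. This is an illustrative model: the "disk" here is a plain Python list standing in for the indexed, randomly accessible disk queues, and all names are invented:

```python
from collections import deque

class SpillQueue:
    """Memory-resident queue that spills to a disk queue once a memory
    threshold is reached, so no transaction is dropped when memory fills."""

    def __init__(self, mem_capacity):
        self.mem = deque()
        self.disk = []              # stand-in for the persistent disk queue
        self.mem_capacity = mem_capacity

    def put(self, tx):
        if len(self.mem) < self.mem_capacity:
            self.mem.append(tx)
        else:
            self.disk.append(tx)    # spill over: queued on disk, not lost

    def get(self):
        tx = self.mem.popleft()
        if self.disk:               # refill memory from disk, keeping order
            self.mem.append(self.disk.pop(0))
        return tx

q = SpillQueue(mem_capacity=2)
for i in range(5):
    q.put(i)
drained = [q.get() for _ in range(5)]
```

Note that refilling memory from the head of the disk queue preserves the global ordering even across the spill boundary.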
- the indoubt TX queue 17 is only used when indoubt transactions are detected after certain system failures. Transactions found in this queue have an unknown transaction state and require either human intervention or pre-programmed resolution methods to resolve.
- when the reply containing the transaction ID is lost, the application resends the request, which is then placed in the resent TX queue 18.
- the controller 2 uses the global TX queue 13 to track the status of each of the input transactions and to send the committed transaction for replication in sequence.
- referring to FIGS. 1C, 1D and 1E, shown is an example operation of the system 10 for receiving and processing a new transaction.
- the sequencer 12 assigns a new transaction ID to the received transaction.
- the transaction ID is a globally unique sequence number for each transaction within a replication group.
- the sequence ID for the newly received transaction is “K”.
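A globally unique, monotonically increasing sequence number per replication group can be sketched as below; the class name and group labels are illustrative, not from the patent:

```python
import itertools

class IdGenerator:
    """Issues a globally unique, monotonically increasing transaction ID
    within each replication group, as the sequencer does for each
    received transaction."""

    def __init__(self):
        self._counters = {}

    def next_id(self, group):
        # Each replication group gets its own independent sequence.
        counter = self._counters.setdefault(group, itertools.count(1))
        return next(counter)

gen = IdGenerator()
ids = [gen.next_id("group-A") for _ in range(3)]
other = gen.next_id("group-B")
```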
- when the controller 2 receives the transaction, the transaction and its ID are transferred to the global TX queue 13 if there is space available. Otherwise, if the global TX queue 13 is above a predetermined threshold (full, for example), as shown in FIG. 1C, the transaction K and its ID are stored in the global disk queue 14 (FIG. 1D).
- before accepting any new transactions into the global TX queue, the sequencer distributes the committed transactions from the global TX queue 13 to a first replication server 20 and a second (or more) replication server 23 for execution against the databases.
- the transfer of the transactions to the replication servers can be triggered when at least one of the following two criteria is met: (1) a predetermined transfer time interval has elapsed; or (2) a predetermined threshold for the total number of transactions within the global TX queue 13 is reached.
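The two transfer triggers reduce to a simple predicate; parameter names here are illustrative rather than taken from the patent:

```python
import time

def should_transfer(queue_len, last_transfer, q_threshold, interval, now=None):
    """True when either trigger fires: the global TX queue has reached
    its transaction-count threshold, or the transfer time interval has
    elapsed since the last batch transfer."""
    now = time.monotonic() if now is None else now
    return queue_len >= q_threshold or (now - last_transfer) >= interval

# Neither criterion met: transactions stay in the global queue for now.
hold = should_transfer(5, last_transfer=0.0, q_threshold=10, interval=30.0, now=10.0)
# Count threshold reached: trigger a batch transfer to the replication servers.
flush = should_transfer(10, last_transfer=0.0, q_threshold=10, interval=30.0, now=10.0)
```

Batching on either condition keeps latency bounded (the interval) while amortizing communication overhead (the count threshold).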
- each replication server 20 , 23 has a respective replication queue 21 , 24 and applies the sequenced transactions, obtained from the global queue 13 , at its own rate.
- transaction F is transferred from the global TX queue 13 to the first and second replication servers 20 , 23 .
- the first replication server 20 has a first replication queue 21 and a first replication disk queue 22, and the second replication server 23 has a second replication queue 24 and a second replication disk queue 25.
- the replication queues are an ordered repository of update transactions, stored in computer memory, for executing transactions on a predetermined database. In this case, since the second replication queue 24 is above a predetermined threshold (full, for example), transaction F is transferred to the second replication disk queue 25.
- the unprocessed transaction F in the second replication disk queue 25 is moved to the second replication queue 24 for execution of the transaction request against the data within its respective database.
- the core functions of the controller 2 can be summarized as registering one or more directors 8 and associating them with their respective replication groups; controlling the replication servers' activities; maintaining the global TX queue 13 that holds all the update transactions sent from the directors 8 ; synchronizing the global TX queue 13 with the backup controller 9 (where applicable); managing all replication groups defined; distributing committed transactions to the replication servers 3 ; tracking the operational status of each database server 4 within a replication group; providing system status to a monitor 6 ; and recovering from various system failures.
- the registry function of the controller 2 occurs when applications are enabled on a new application server 7 to access databases 5 in a replication group.
- the director 8 on the new application server contacts the controller 2 and registers itself to the replication group.
- this provides dynamic provisioning of application servers to scale up system capacity on demand. The registration is performed on the first database call made by an application. Subsequently the director 8 communicates with the controller 2 for transaction and server status tracking.
- the replication server control function allows the controller 2 to start the replication servers 3 and monitor their state. For example, when an administrator requests to pause replication to a specific database 5, the controller instructs the replication server to stop applying transactions until an administrator or an automated process requests that replication resume.
- the replication group management function allows the controller 2 to manage one or more groups of databases 5 that require transaction synchronization and data consistency among them.
- the number of replication groups that can be managed and controlled by the controller 2 is dependent upon the processing power of the computer that the controller is operating on and the sum of the transaction rates of all the replication groups.
- the director 8 can be installed on the application server 7 or the client computer.
- the director 8 is for initiating a sequence of operations to track the progress of a transaction.
- the director 8 comprises a first 27 , a second 28 , a third 29 and a fourth 30 functional module.
- the director 8 wraps around a vendor supplied JDBC driver.
- the director 8 is typically installed on the application server 7 in a 3-tier architecture, and on the client computer in a 2-tier architecture.
- the director 8 can act like an ordinary JDBC driver to the applications 7 , for example.
- the system 10 can also support any of the following associated with the transaction requests, such as but not limited to:
- the first module 27 captures all JDBC calls 26 , determines transaction type and boundary, and analyzes the SQLs in the transaction. Once determined to be an update transaction, the director 8 initiates a sequence of operations to track the progress of the transaction until it ends with a commit or rollback. Both DDL and DML are captured for replication to other databases in the same replication group.
- the second module 28 collects a plurality of different statistical elements on transactions and SQL statements for analyzing application execution and performance characteristics.
- the statistics can be exported as comma delimited text file for importing into a spreadsheet.
- the director's third module 29 manages database connections for the applications 7 .
- upon a database 5 failure, the director 8 reroutes transactions to one or more of the remaining databases.
- the director 8 also attempts to re-execute the transactions to minimize in flight transaction loss. Accordingly, the director 8 has the ability to instruct the controller 2 as to which database 5 is the primary database for satisfying the request of the respective application 7 .
- the director 8 routes read transactions to the least busy database server 4 for processing. This also applies when a database server 4 failure has resulted in transaction redirection.
- the director 8 redirects all the read transactions to the least busy database server 4 . Once the disk queue becomes empty, the director 8 subsequently allows read access to that database. Accordingly, the fill/usage status of the replication disk queues in the replication group can be obtained or otherwise received by the director 8 for use in management of through-put rate of transactions applied to the respective databases 5 .
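The read-routing rule described above (reads go to the least busy database server, and a server whose replication disk queue is non-empty is skipped until it catches up) can be sketched as follows; the data layout and names are invented for illustration:

```python
def route_read(servers):
    """Pick a database server for a read transaction: only servers whose
    replication disk queue is empty are eligible (their data is caught
    up), and the least busy eligible server wins.  'servers' maps a
    server name to (disk_queue_depth, active_load)."""
    eligible = {name: load for name, (backlog, load) in servers.items()
                if backlog == 0}
    if not eligible:
        raise RuntimeError("no synchronized database available for reads")
    return min(eligible, key=eligible.get)

choice = route_read({
    "db1": (0, 12),   # synchronized but moderately busy
    "db2": (40, 1),   # disk queue backlog: reads redirected away
    "db3": (0, 3),    # synchronized and least busy
})
```

Skipping servers with a disk-queue backlog is what prevents applications from reading stale data from a database that is still draining queued transactions.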
- if the director 8 or the replication servers 3 fail to communicate with the database servers 4, they report the failure to the controller 2, which may then redistribute transactions or take other appropriate actions to allow continuous operation of the transaction replicator 1.
- upon a database lock down, the controller 2 instructs the replication server 3 to stop applying transactions to it and relays the database lock down status to a monitor 6.
- the transactions start to accumulate within the queues until the database server 4 is repaired and the administrator or an automated process instructs replication to resume via the monitor 6.
- the monitor 6 may also provide other predetermined administrative commands (for example: creating a database alias, updating parameters, or changing workload balancing settings).
- the monitor 6 allows a user to view and monitor the status of the controllers 2 , the replication servers 3 , and the databases 5 .
- the monitor 6 is a web application that is installed on an application or application server 7 and on the same network as the controllers 2 .
- referring to FIG. 3, shown is a diagrammatic view of the system monitor 6 for use with the transaction replicator 1.
- the system monitor 6 receives input data 32 from both primary and backup controllers 2 , 9 (where applicable), replication servers 3 , the database servers 4 and relevant databases 5 within a replication group. This information is used to display an overall system status on a display screen 31 .
- the relevant status of the controller 2 is shown.
- the status of each of the replication servers 3 within a desired replication group is shown.
- a detailed description of the transaction rate, the number of transactions within each replication queue 15, and the number of transactions within each replication disk queue 16 is further shown.
- the monitor 6 further receives data regarding the databases 5 and displays the status of each database 5 and the number of committed transactions.
- the administrator can analyze the above information and choose to manually reroute the transactions. For example, when it is seen that many transactions exist within the replication disk queue 16 of a particular replication server 3, or that the transaction rate of a replication server 3 is slow, the administrator may send output data in the form of a request 33 to distribute the transactions for a specified amount of time to a different database server within the replication group.
- the global TX sequencer 12, also referred to as the sequencer hereafter and shown in FIG. 1B, is the control logic of the transaction replicator 1.
- when the controller 2 is started, it initializes itself by reading, from configuration and property files, the parameters to be used in the current session 101.
- before accepting any new transactions, the sequencer 12 examines the global disk queue 14 to determine if any transactions are left behind from a previous session. For example, if a transaction is found on the global disk queue 14, it implies that at least one database in the cluster is out of synchronization with the others, and the database must be applied with these transactions before it can be accessed by applications. Transactions on the global disk queue 14 are read into the global TX queue 13 in preparation for applying to the database(s) 5.
- the sequencer 12 then starts additional servers called replication servers 3 that create and manage the replication queues 15 . After initialization is complete, the sequencer 12 is ready to accept transactions from the application servers 7 .
- the sequencer 12 examines the incoming transaction to determine whether it is a new transaction or one that has already been recorded in the global TX queue 102. For a new transaction, the sequencer 12 assigns a transaction ID 103 and records the transaction together with this ID in the global TX queue 13. If the new transaction's ID is generated as a result of a lost ID 104, the transaction and the ID are stored in the resent TX queue 109 for use in identifying duplicated transactions. The sequencer 12 checks the usage of the global TX queue 105 to determine if the maximum number of transactions in memory has already been exceeded. The sequencer 12 stores the transaction and its ID in the global TX queue 13 if the memory is not full 106. Otherwise, the sequencer 12 stores them in the global disk queue 107. The sequencer 12 then returns the ID to the application 108 and is ready to process another request from the application.
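The ID-assignment path just described (steps 102-108) can be sketched as below; this is an illustrative model, and the class name, the mem_limit parameter, and the resend flag are invented for the sketch:

```python
from collections import deque

class GlobalSequencer:
    """Sketch of the sequencer's intake path: assign an ID to a new
    transaction, remember resends for duplicate detection, and place
    the entry in the memory queue or the disk queue depending on usage."""

    def __init__(self, mem_limit):
        self.global_queue = deque()      # global TX queue in memory
        self.global_disk_queue = []      # spill-over when memory is full
        self.resent_queue = []           # resent TX queue
        self.mem_limit = mem_limit
        self._next = 0

    def accept(self, tx, resend=False):
        self._next += 1
        tx_id = self._next
        if resend:
            # The ID was regenerated because the original reply was lost:
            # keep a copy for identifying the duplicated transaction later.
            self.resent_queue.append((tx_id, tx))
        if len(self.global_queue) < self.mem_limit:
            self.global_queue.append((tx_id, tx))
        else:
            self.global_disk_queue.append((tx_id, tx))
        return tx_id                     # returned to the application

seq = GlobalSequencer(mem_limit=1)
first = seq.accept("UPDATE t SET x=1")
second = seq.accept("UPDATE t SET x=2", resend=True)
```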
- the sequencer 12 searches and retrieves the entry from either the global TX queue 13 or the disk queue 110 . If this transaction has been committed to the database 111 , the entry's transaction status is set to “committed” 112 by the sequencer 12 , indicating that this transaction is ready for applying to the other databases 200 . If the transaction has been rolled back 113 , the entry's transaction status is marked “for deletion” 114 and as will be described, subsequent processing 200 deletes the entry from the global TX queue.
- the entry's transaction status is set to “indoubt” 115 .
- An alert message is sent to indicate that database recovery may be required 116 .
- Database access is suspended immediately 117 until the indoubt transaction is resolved manually 300 or automatically 400 .
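The three outcomes in steps 111-115 amount to a small status mapping; the function name and outcome strings here are illustrative:

```python
def resolve_status(outcome):
    """Map a transaction's reported completion to the queue-entry status
    used by the sequencer: committed entries are ready for replication,
    rolled-back entries are deleted, and anything else is indoubt."""
    if outcome == "commit":
        return "committed"       # ready for applying to the other databases
    if outcome == "rollback":
        return "for deletion"    # removed from the global TX queue later
    return "indoubt"             # alert sent; database access suspended

statuses = [resolve_status(o) for o in ("commit", "rollback", "unknown")]
```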
- the global TX queue 13 is used to maintain the proper sequencing and states of all update transactions at commit time.
- a replication queue 15 is created by the sequencer 12 for each destination database.
- the sequencer 12 moves committed transactions from the global TX queue to the replication queue based on the following two criteria: (1) a predetermined transaction queue threshold (Q threshold) and (2) a predetermined sleep time (transfer interval).
- the Q threshold is the sole determining criterion for moving committed transactions to the replication queue 201.
- both the Q Threshold and transfer interval are used to make the transfer decision 201 , 213 .
- Transactions are transferred in batches to reduce communication overhead.
- the sequencer 12 prepares a batch of transactions to be moved from the global TX queue 13 to the replication queue 202 . If the batch contains transactions, the sequencer 12 removes all the rolled back transactions from it because they are not to be applied to the other databases 204 . The remaining transactions in the batch are sent to the replication queue for processing 205 .
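Batch preparation (steps 202-205) can be sketched as taking entries from the head of the global TX queue and filtering out rolled-back transactions before transfer; names are illustrative:

```python
from collections import deque

def prepare_batch(global_queue, batch_size):
    """Take up to batch_size entries from the head of the global TX
    queue, drop entries marked for deletion (rolled back), and return
    the rest for transfer to the replication queues."""
    take = min(batch_size, len(global_queue))
    batch = [global_queue.popleft() for _ in range(take)]
    # Rolled-back transactions are not applied to the other databases.
    return [(tx_id, status) for tx_id, status in batch
            if status != "for deletion"]

q = deque([(1, "committed"), (2, "for deletion"), (3, "committed")])
batch = prepare_batch(q, batch_size=3)
```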
- the sequencer 12 searches the global TX queue for any unprocessed transactions (status is "committing") 206. Since transactions are executed in the same order of occurrence, unprocessed transactions typically occur when a previous transaction has not completed, thereby delaying the processing of subsequent transactions. A transaction that is being committed and has not yet returned its completion status is called a gating transaction. A transaction that is being committed and returns a status of unknown is called an indoubt transaction. Both types of transactions remain in the state of "committing" and block processing of subsequent committed transactions, resulting in the transaction batch being empty.
- a gating transaction is transient, meaning that it will eventually become committed, unless a system failure causes it to remain in the gating state indefinitely. Therefore, when the sequencer 12 finds unprocessed transactions 207, it must differentiate between the two types of "committing" transactions 208. For a gating transaction, the sequencer 12 sends out an alert 209 and enters the transaction recovery process 300. Otherwise, the sequencer 12 determines if the transaction was resent from the application 210, 211, and removes the resent transaction from the global TX queue 211. A resent transaction is a duplicated transaction in the global TX queue 13 that has not been moved to the replication queue 15.
- the sequencer 12 then enters a sleep state because there is no transaction to be processed at that time 214.
- the sleep process is executed in its own thread such that it does not stop process 200 from being executed at any time. It provides a second entry point into the global queue size check at 201.
- the sequencer 12 creates the transaction batch 202 for transfer to the replication queue 203 , 204 , 205 .
- referring to FIG. 6, shown is a flow diagram illustrating the method 300 for providing manual recovery of transactions (step 116 shown in FIG. 4).
- the sequencer 12 is unable to resolve gating transactions and indoubt transactions caused by certain types of failure and manual recovery may be needed.
- a gating transaction remains in the global TX queue 13 for an extended period of time, stopping all subsequent committed transactions from being applied to the other databases.
- a transaction status is unknown after some system component failure.
- the sequencer 12 first identifies the transactions that need resolution 301 and sends out an alert 302. The transaction can then be manually analyzed to determine whether it has been committed or rolled back in the database 304 and whether any manual action needs to be taken.
- if the transaction has been rolled back, the transaction entry is deleted manually from the global TX queue 305. If the transaction has been committed to the database, it is manually marked "committed" 306. In both cases the replication process can resume without having to recover the database 500. If the transaction is flagged as indoubt in the database, it must be forced to commit or roll back at the database before performing 304, 305 and 306.
- the process 400 is entered when an indoubt transaction is detected 115, and automatic failover and recovery of a failed database is performed. Unlike a gating transaction, which may resolve itself in the next moment, an indoubt transaction is permanent until it is rolled back or committed by hand or by heuristic rules supported by the database. If the resolution is done with heuristic rules, the indoubt transaction will have been resolved as "committed" or "rolled back" and will not require database failover or recovery. Consequently, the process 400 is only entered when an indoubt transaction cannot be heuristically resolved and an immediate database failover is desirable.
- the database is marked as “needing recovery” 401 , with an alert sent out 402 by the sequencer 12 .
- the sequencer 12 stops the generation of new transaction IDs 403 and moves the indoubt transactions to the indoubt TX queue 404.
- the sequencer 12 replaces the failed database with one of the available databases in the group 405 and re-enables transaction ID generation 406 such that normal global TX queue processing can continue 200.
- the sequencer 12 then executes a user defined recovery procedure to recover the failed database 407 . For example, if the database recovery fails, the recovery process is reentered 408 , 407 .
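The failover sequence of steps 401-406 can be sketched as a single state transition; the dictionary keys and server names below are illustrative stand-ins for the sequencer's internal state:

```python
def fail_over(state, failed_db, standby_dbs):
    """Sketch of steps 401-406: mark the failed database as needing
    recovery, pause ID generation, move indoubt transactions to the
    indoubt TX queue, swap in an available database, then resume."""
    state["needing_recovery"].add(failed_db)            # 401 (alert at 402)
    state["id_generation"] = False                      # 403
    indoubt = [t for t in state["global_queue"] if t[1] == "indoubt"]
    state["indoubt_queue"].extend(indoubt)              # 404
    state["global_queue"] = [t for t in state["global_queue"]
                             if t[1] != "indoubt"]
    state["active_dbs"].remove(failed_db)               # 405: swap in standby
    state["active_dbs"].append(standby_dbs.pop(0))
    state["id_generation"] = True                       # 406: resume
    return state

state = fail_over(
    {"needing_recovery": set(), "id_generation": True,
     "global_queue": [(7, "indoubt"), (8, "committed")],
     "indoubt_queue": [], "active_dbs": ["db1", "db2"]},
    failed_db="db2", standby_dbs=["db3"],
)
```

Pausing ID generation during the swap is what keeps the global sequence gap-free while the indoubt entries are quarantined.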
- Replication queues 15 are managed by the replication servers 3 started by the sequencer 12 .
- One of the replication servers 3 receives batches of transactions from the sequencer 12 .
- the process 500 is entered if a new batch of committed transactions arrives or at any time when queued transactions are to be applied to the databases.
- the batch of transactions is stored in the replication queue in memory 508 , 509 , or in the replication disk queue 511 if the memory queue is full.
- Replication disk queue capacity is determined by the amount of disk space available. If disk usage is above a predetermined threshold or the disk is full 510 , for example, an alert is sent 512 by the sequencer 12 and the database is marked unusable 513 because committed transactions can no longer be queued up.
- the replication server first determines whether there are any unprocessed transactions in the replication queue in memory 502 . If the memory queue is empty but unprocessed transactions are found in the replication disk queue 503 , they are moved from the disk queue to the memory queue in batches for execution 504 , 505 . Upon successful execution of all the transactions in the batch, they are removed from the replication queue by the replication server and another batch of transactions is processed 501 . If there are transactions in the replication disk queue 16 , the processing continues until the disk queue is empty, at which time the replication server 3 waits for more transactions from the global TX queue 501 .
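The queue-draining behaviour described above (drain the in-memory replication queue first, then refill it from the disk queue in batches) can be sketched as a simple loop. The batch size and function names are assumptions for illustration.

```python
# Minimal sketch of the replication-server loop of process 500: apply the
# in-memory queue in order, then pull batches from the disk queue until both
# are empty. BATCH and all names are illustrative assumptions.
from collections import deque

BATCH = 3

def drain(mem_queue, disk_queue, apply_tx, batch=BATCH):
    """Apply queued transactions to a database (via apply_tx) in order."""
    while mem_queue or disk_queue:
        if not mem_queue and disk_queue:
            # 504-505: move a batch from the disk queue into memory.
            for _ in range(min(batch, len(disk_queue))):
                mem_queue.append(disk_queue.popleft())
        while mem_queue:
            apply_tx(mem_queue.popleft())  # execute, then remove from queue
    # Disk queue empty: wait for more transactions from the global TX queue.

applied = []
mem = deque(["A", "B"])
disk = deque(["C", "D", "E", "F"])
drain(mem, disk, applied.append)
```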
- the replication server 3 stops when it is instructed by the sequencer during the apparatus shutdown process 118 , 119 and 120 shown in FIG. 4 .
- the transaction replicators 1 can be configured as a plurality of transaction replicators 1 in a replicator peer-to-peer (P2P) network, in which each database server 4 is assigned or otherwise coupled to at least one principal transaction replicator 1 .
- the distributed nature of the replicator P2P network can increase robustness in case of failure by replicating data over multiple peers (i.e. transaction replicators 1 ), and by enabling peers to find/store the data of the transactions without relying on a centralized index server.
- the application or application servers 7 can communicate with a selected one of the database servers 4 , such that the replicator P2P network of transaction replicators 1 would communicate with one another for load balancing and/or failure mode purposes.
- One example would be one application server 7 sending the transaction request to one of the transaction replicators 1 , which would then send the transaction request to another of the transaction replicators 1 of the replicator P2P network, which in turn would replicate and then communicate the replicated copies of the transactions to the respective database servers 4 .
- the applications/application servers 7 could be configured in an application P2P network such that two or more application computers could share their resources such as storage hard drives, CD-ROM drives, and printers. Resources would then be accessible from every computer on the application P2P network. Because P2P computers have their own hard drives that are accessible by all computers, each computer can act as both a client and a server in the application P2P networks (e.g. both as an application 7 and as a database 4 ). P2P networks are typically used for connecting nodes via largely ad hoc connections.
- P2P networks are useful for many purposes, such as but not limited to sharing content files containing audio, video, data or anything else in digital format, and passing real-time data, such as telephony traffic, using P2P technology.
- A P2P network can also take the form of grid computing.
- a pure P2P file transfer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both “clients” and “servers” to the other nodes on the network.
- This model of network arrangement differs from the client-server model where communication is usually to and from a central server or controller. It is recognized that there are three major types of P2P network, by way of example only, namely:
Abstract
A system and method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network. The system and method comprise a global queue for storing a number of the received transactions in a first predetermined order. The system and method also comprise a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
Description
- This invention relates generally to the sequencing and processing of transactions within a cluster of replicated databases.
- Databases have become the core component of most computer application software. Typically, application software makes use of a single database or multiple databases as repositories of data (content) required by the application to function properly. The application's operational efficiency and availability are greatly dependent on the performance and availability of these databases, which can be measured by two metrics: (1) request response time; and (2) transaction throughput.
- There are several techniques for improving application efficiency based on these two metrics: (1) vertical scale-up of the computer hardware supporting the application, achieved by adding to or replacing existing hardware with faster central processing units (CPUs), random access memory (RAM), disk adapters/controllers, and network; and (2) horizontal scale-out (clustering) of the computer hardware supporting the application, which refers to connecting additional computing hardware to the existing configuration by interconnecting it with a fast network. Although both approaches can address the need to reduce request response time and increase transaction throughput, the scale-out approach can offer higher efficiency at lower cost, thus driving most new implementations toward clustering architectures.
- The clustering of applications can be achieved readily by running the application software on multiple, interconnected application servers that facilitate the execution of the application software and provide hardware redundancy for high availability, with the application software actively processing requests concurrently. However, current database clustering technologies cannot provide the same level of availability and redundancy in a similar active-active configuration. Consequently database servers are primarily configured as active-standby, meaning that one of the computer systems in the cluster does not process application requests until a failover occurs. The active-standby configuration wastes system resources, extends the window of unavailability and increases the chance of data loss.
- To cluster multiple database servers in an active-active configuration, one technical challenge is to resolve update conflicts. An update conflict refers to two or more database servers updating the same record in the databases that they manage. Since data in these databases must be consistent among them in order to scale out for performance and achieve high availability, the conflict must be resolved. Currently there are two different schemes of conflict resolution: (1) time-based resolution; and (2) location-based resolution. However, neither conflict resolution scheme can be enforced without some heuristic decision being made through human intervention. It is not possible to determine these heuristic decision rules unless there is a thorough understanding of the application software's business rules and their implications. Consequently, most clustered database configurations adopt the active-standby model, and fail to achieve high performance and availability at the same time. There is a need for a database management system that uses an active-active configuration and substantially reduces the possibility of update conflicts that may occur when two or more databases attempt to update a record at the same time.
- The systems and methods disclosed herein provide a system for globally managing transaction requests to one or more database servers and to obviate or mitigate at least some of the above presented disadvantages.
- Contrary to current database configurations there is provided a system and method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network. The system and method comprise a global queue for storing a number of the received transactions in a first predetermined order. The system and method also comprise a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- One aspect provided is a system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising: a global queue for storing a number of the received transactions in a first predetermined order; and a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- A further aspect provided is a system for receiving a plurality of transactions from at least one application server, distributing the transactions to at least two replication queues and applying the transactions to a plurality of databases comprising: a director coupled to each of said at least one application server for capturing a plurality of database calls therefrom as the plurality of transactions; and a controller for receiving each of the plurality of transactions, the controller configured for storing the transactions within a global queue in a predetermined order, for generating a copy of each said transaction for each of said at least two replication queues, and for transmitting in the predetermined order each said copy to each of said at least two replication queues respectively.
- A still further aspect provided is a method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the method comprising: storing a number of the received transactions in a first predetermined order in a global queue; creating a copy of each of the transactions for each of said at least two replication queues; and distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
- A still further aspect provided is a system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising: means for storing a number of the received transactions in a first predetermined order; and means for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
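As a rough illustration of the claimed arrangement (a global queue holding received transactions in a first predetermined order, and a sequencer copying each transaction to every replication queue in a second predetermined order), the following sketch uses FIFO order for both. Class and field names are assumptions; nothing here is prescribed by the claims.

```python
# Hypothetical sketch: a global FIFO queue plus a sequencer that distributes
# an independent copy of each transaction to every replication queue.
import copy
from collections import deque

class Sequencer:
    def __init__(self, n_replicas):
        self.global_queue = deque()  # first predetermined order (FIFO)
        self.replication_queues = [deque() for _ in range(n_replicas)]
        self.next_id = 0

    def receive(self, tx):
        # Track each received transaction under a globally unique sequence ID.
        self.next_id += 1
        self.global_queue.append({"id": self.next_id, "tx": tx})

    def distribute(self):
        # Copy each transaction once per replication queue, in sequence.
        while self.global_queue:
            entry = self.global_queue.popleft()
            for rq in self.replication_queues:
                rq.append(copy.deepcopy(entry))

s = Sequencer(n_replicas=2)
s.receive("UPDATE accounts SET b = b - 5 WHERE id = 1")
s.receive("UPDATE accounts SET b = b + 5 WHERE id = 2")
s.distribute()
```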
- Exemplary embodiments of the invention will now be described in conjunction with the following drawings, by way of example only, in which:
- FIG. 1A is a block diagram of a system for sequencing transactions;
- FIG. 1B is a block diagram of a transaction replicator of the system of FIG. 1A ;
- FIGS. 1C, 1D and 1E show an example operation of receiving and processing transactions for the system of FIG. 1A ;
- FIG. 2 is a block diagram of a director of the system of FIG. 1A ;
- FIG. 3 is a block diagram of a monitor of the system of FIG. 1A ;
- FIG. 4 is an example operation of the transaction replicator of FIG. 1B ;
- FIG. 5 is an example operation of a global transaction queue and a replication queue of FIG. 1B ;
- FIG. 6 is an example operation of the transaction replicator of FIG. 1B for resolving gating and indoubt transactions; and
- FIG. 7 is an example operation of a replication server of FIG. 1B .
- A method and apparatus for sequencing transactions in a database cluster is described for use with computer programs or software applications whose functions are designed primarily to replicate update transactions to one or more databases such that data in these databases are approximately synchronized for read and write access.
- Referring to FIG. 1A , shown is a system 10 comprising a plurality of application servers 7 for interacting with one or more database servers 4 and one or more databases 5 via a transaction replicator 1 . It is understood that in two-tier applications, each of the application 7 instances represents a client computer. For three-tiered applications, each of the application 7 instances represents an application server that is coupled to one or more users (not shown). Accordingly, it is recognized that the transaction replicator 1 can receive transactions from applications 7 , application servers 7 , or a combination thereof. - Referring to FIGS. 1A and 1B , the
transaction replicator 1 of the system 10 receives transaction requests from the application servers 7 and provides sequenced and replicated transactions using a controller 2 to one or more replication servers 3 , which apply the transactions to the databases 5 . By providing sequencing of transactions in two- or more-tiered application architectures, the transaction replicator 1 helps to prevent the transaction requests from interfering with each other and facilitates the integrity of the databases 5 . For example, a transaction refers to a single logical operation from a user application 7 and typically includes requests to read, insert, update and delete records within a predetermined database 5 . - Referring again to
FIG. 1A , the controller 2 can be the central command center of the transaction replicator 1 and can run, for example, on the application servers 7 , the database servers 4 or dedicated hardware. The controller 2 may be coupled to a backup controller 9 that is set up to take over the command when the primary controller 2 fails. The backup controller 9 is approximately synchronized with the primary controller such that transaction integrity is preserved. It is recognized that the controller 2 and associated transaction replicator 1 can also be configured for use as a node in a peer-to-peer network, as further described below. - Referring again to
FIG. 1A , when a backup and a primary controller are utilized, a replica global transaction queue is utilized. The backup controller 9 takes over control of the transaction replicator 1 upon the failure of the primary controller 2 . Preferably, the primary and backup controllers are installed at different sites, and a redundant WAN is recommended between the two sites. - As is shown in
FIG. 1B , the controller 2 receives input transactions 11 from a user application 7 and provides sequenced transactions 19 via the replication servers 3 ; the sequenced transactions 19 are then ready for commitment to the database servers 4 . The controller 2 comprises a resent transaction queue 18 (resent TX queue), an indoubt transaction queue 17 (indoubt TX queue), a global transaction sequencer 12 (global TX sequencer), a global transaction queue 13 (global TX queue) and at least one global disk queue 14 . The global TX queue 13 (and other queues if desired) can be configured as a searchable first-in-first-out (FIFO) pipe or as a first-in-any-out (FIAO) pipe, as desired. For example, a FIFO queue 13 could be used when the contents of the replication queues 15 are intended for databases 5 , and a FIAO queue 13 could be used when the contents of the replication queues 15 are intended for consumption by unstructured data processing environments (not shown). Further, it is recognized that the global disk queue 14 can be configured for an indexed and randomly accessible data set. - The
transaction replicator 1 maintains the globally sequenced transactions in two different types of queues: the global TX queue 13 and one or more replication queues 15 , equal in number to the database server 4 instances. These queues are created using computer memory with spill-over areas on disk, such as the global disk queue 14 and one or more replication disk queues 16 . The disk queues serve a number of purposes, including: persisting transactions to avoid transaction loss during failure of a component in the cluster; and acting as a very large transaction store (from gigabytes to terabytes) that computer memory cannot reasonably provide (typically less than 64 gigabytes). Further, the indoubt TX queue 17 is only used when indoubt transactions are detected after certain system failures. Transactions found in this queue have an unknown transaction state and require either human intervention or pre-programmed resolution methods to resolve. - For example, in the event of a temporary communication failure resulting in a lost response from the
global TX sequencer 12 to a transaction ID request, the application resends the request, which is then placed in the resent TX queue 18 . Under this circumstance, there can be two or more transactions with different transaction IDs in the global TX queue 13 , and duplicated transactions are removed subsequently. - In normal operation, the
controller 2 uses the global TX queue 13 to track the status of each of the input transactions and to send the committed transactions for replication in sequence. Referring to FIGS. 1C, 1D and 1E , shown is an example operation of the system 10 for receiving and processing a new transaction. For example, upon receiving a new transaction, the sequencer 12 assigns a new transaction ID to the received transaction. The transaction ID is a globally unique sequence number for each transaction within a replication group. In FIG. 1C , the sequence ID for the newly received transaction is "K". Once the controller 2 receives the transaction, the transaction and its ID are transferred to the global TX queue 13 if there is space available. Otherwise, if the global TX queue 13 is above a predetermined threshold or is full, for example, as shown in FIG. 1C , the transaction K and its ID are stored in the global disk queue 14 ( FIG. 1D ). - Before accepting any new transactions in the global TX queue, the sequencer distributes the committed transactions from the
global TX queue 13 to a first replication server 20 and a second (or more) replication server 23 for execution against the databases. As will be discussed, the transfer of the transactions to the replication servers can be triggered when at least one of the following two criteria occurs: 1) a predetermined transfer time interval elapses; or 2) a predetermined threshold for the total number of transactions within the global TX queue 13 is met. However, each replication server 20 , 23 applies the transactions from its respective replication queue 21 , 24 , populated from the global queue 13 , at its own rate. - For example, when a slower database server is unable to process the transactions at the rate the transactions are distributed by the
controller 2 , the transactions in the corresponding replication queue are spilled over to the replication disk queues. As shown in FIGS. 1C and 1D , transaction F is transferred from the global TX queue 13 to the first and second replication servers 20 , 23 . The first replication server 20 has a first replication queue 21 and a first replication disk queue 22 , and the second replication server 23 has a second replication queue 24 and a second replication disk queue 25 . The replication queues are an ordered repository of update transactions stored in computer memory for executing transactions on a predetermined database. In this case, since the second replication queue 24 is above a predetermined threshold (full, for example), transaction F is transferred to the second replication disk queue 25 . Referring to FIG. 1D and FIG. 1E , once space opens up in the second replication queue 24 as transaction J is applied to its database server, the unprocessed transaction F in the second replication disk queue 25 is moved to the second replication queue 24 for execution of the transaction request against the data within its respective database. In the case where both the replication disk queue and the replication queue are above a preselected threshold (for example, full), an alert is sent by the sequencer 12 and the database is marked unusable until the queues become empty. - The core functions of the
controller 2 can be summarized as: registering one or more directors 8 and associating them with their respective replication groups; controlling the replication servers' activities; maintaining the global TX queue 13 that holds all the update transactions sent from the directors 8 ; synchronizing the global TX queue 13 with the backup controller 9 (where applicable); managing all replication groups defined; distributing committed transactions to the replication servers 3 ; tracking the operational status of each database server 4 within a replication group; providing system status to a monitor 6 ; and recovering from various system failures. - The registry function of the
controller 2 occurs when applications are enabled on a new application server 7 to access databases 5 in a replication group. Here, the director 8 on the new application server contacts the controller 2 and registers itself to the replication group. Advantageously, this provides dynamic provisioning of application servers to scale up system capacity on demand. The registration is performed on the first database call made by an application. Subsequently the director 8 communicates with the controller 2 for transaction and server status tracking. - The replication server control function allows the
controller 2 to start the replication servers 3 and monitor their state. For example, when an administrator requests to pause replication to a specific database 5 , the controller instructs the replication server to stop applying transactions until an administrator or an automated process requests that it resume. - The replication group management function allows the
controller 2 to manage one or more groups of databases 5 that require transaction synchronization and data consistency among them. The number of replication groups that can be managed and controlled by the controller 2 is dependent upon the processing power of the computer that the controller is operating on and the sum of the transaction rates of all the replication groups.
- Referring to
FIG. 2 , shown is a block diagram of thedirector 8 of thesystem 10 ofFIG. 1A . The director can be installed on theapplication server 7 or the client computer. Thedirector 8 is for initiating a sequence of operations to track the progress of a transaction. Thedirector 8 comprises a first 27, a second 28, a third 29 and a fourth 30 functional module. According to an embodiment of thesystem 10, thedirector 8 wraps around a vendor supplied JDBC driver. As discussed earlier, thedirector 8 is typically installed on theapplication server 7 in a 3-tier architecture, and on the client computer in a 2-tier architecture. As a wrapper, thedirector 8 can act like an ordinary JDBC driver to theapplications 7, for example. Further, thesystem 10 can also support any of the following associated with the transaction requests, such as but not limited to: -
- 1. a database access driver/protocol based on SQL for a relational database 5 (ODBC, OLE/DB, ADO.NET, RDBMS native clients, etc.);
- 2. messages sent over message queues of the network;
- 3. XML (and other structured definition languages) based transactions; and
- 4. other data access drivers as desired.
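The director is described above as wrapping a vendor-supplied JDBC driver and intercepting database calls. The analogous idea in Python's DB-API style is sketched below; the statement classification rule, the class names and the stand-in driver are assumptions for illustration only.

```python
# Hypothetical driver-wrapper sketch: a cursor wrapper that classifies each
# statement and records update (DML/DDL) statements for replication tracking.

UPDATE_VERBS = ("insert", "update", "delete", "create", "alter", "drop")

class DirectorCursor:
    """Wraps a vendor cursor, passing calls through while capturing updates."""

    def __init__(self, real_cursor, tracker):
        self._cur = real_cursor
        self._tracker = tracker

    def execute(self, sql, params=()):
        verb = sql.strip().split()[0].lower()
        if verb in UPDATE_VERBS:       # both DML and DDL are captured
            self._tracker.append(sql)  # track for replication to the group
        return self._cur.execute(sql, params)

class FakeCursor:
    """Stands in for the vendor-supplied driver's cursor."""
    def execute(self, sql, params=()):
        return "ok"

captured = []
cur = DirectorCursor(FakeCursor(), captured)
cur.execute("SELECT * FROM t")     # read: passed through, not captured
cur.execute("UPDATE t SET x = 1")  # update: captured for replication
```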
- As an example, the
first module 27 captures all JDBC calls 26 , determines transaction type and boundary, and analyzes the SQLs in the transaction. Once a transaction is determined to be an update transaction, the director 8 initiates a sequence of operations to track its progress until it ends with a commit or rollback. Both DDL and DML are captured for replication to other databases in the same replication group. - The
second module 28 collects a plurality of different statistical elements on transactions and SQL statements for analyzing application execution and performance characteristics. The statistics can be exported as a comma-delimited text file for importing into a spreadsheet. - In addition to intercepting and analyzing transactions and SQL statements, the director's
third module 29 manages database connections for the applications 7 . In the event that one of the databases 5 should fail, the director 8 reroutes transactions to one or more of the remaining databases. Whenever feasible, the director 8 also attempts to re-execute the transactions to minimize in-flight transaction loss. Accordingly, the director 8 has the ability to instruct the controller 2 as to which database 5 is the primary database for satisfying the request of the respective application 7 . - Depending on a database's workload and the relative power settings of the
database servers 4 in a replication group, the director 8 routes read transactions to the least busy database server 4 for processing. This also applies when a database server 4 failure has resulted in transaction redirection. - Similarly, if the replication of transactions to a
database server 4 becomes too slow for any reason, such that the transactions start to build up and spill over to the replication disk queue 16 , the director 8 redirects all the read transactions to the least busy database server 4 . Once the disk queue becomes empty, the director 8 subsequently allows read access to that database. Accordingly, the fill/usage status of the replication disk queues in the replication group can be obtained or otherwise received by the director 8 for use in managing the throughput rate of transactions applied to the respective databases 5 . - For example, when the
director 8 or replication servers 3 fail to communicate with the database servers 4 , they report the failure to the controller 2 , which then may redistribute transactions or take other appropriate actions to allow continuous operation of the transaction replicator 1 . When one of the database servers 4 cannot be accessed, the controller 2 instructs the replication server 3 to stop applying transactions to it and relays the database lock-down status to a monitor 6 . The transactions start to accumulate within the queues until the database server 4 is repaired and the administrator or an automated process instructs replication to resume via the monitor 6 . The monitor 6 may also provide other predetermined administrative commands (for example: create database alias, update parameters, change workload balancing settings).
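The read-routing rules described above (route reads to the least busy server, and skip servers whose replication disk queue still holds unapplied transactions) can be sketched as one selection function. The field names and the load metric are illustrative assumptions.

```python
# Hypothetical read-routing sketch: prefer the least busy database server,
# excluding servers that are down or have a non-empty replication disk queue.

def route_read(servers):
    """Pick a database server for a read transaction."""
    eligible = [s for s in servers
                if s["up"] and s["disk_queue_depth"] == 0]
    if not eligible:
        raise RuntimeError("no database server available for reads")
    return min(eligible, key=lambda s: s["load"])

servers = [
    {"name": "db1", "up": True, "load": 0.7, "disk_queue_depth": 0},
    {"name": "db2", "up": True, "load": 0.2, "disk_queue_depth": 5},  # lagging
    {"name": "db3", "up": True, "load": 0.4, "disk_queue_depth": 0},
]
choice = route_read(servers)
```

Here db2, although least loaded, is skipped until its disk queue drains, mirroring the behaviour described for the director 8.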
- Referring again to
FIG. 1A , themonitor 6 allows a user to view and monitor the status of thecontrollers 2, thereplication servers 3, and thedatabases 5. Preferably, themonitor 6 is a web application that is installed on an application orapplication server 7 and on the same network as thecontrollers 2. - Referring to
FIG. 3 , shown is a diagrammatic view of the system monitor 6 for use with the transaction replicator 1 . The system monitor 6 receives input data 32 from both the primary and backup controllers 2 , 9 (where applicable), the replication servers 3 , the database servers 4 and the relevant databases 5 within a replication group. This information is used to display an overall system status on a display screen 31 . - For example, depending on whether the controller is functioning or a failure has occurred, the relevant status of the
controller 2 is shown. Second, the status of each of the replication servers 3 within a desired replication group is shown. A detailed description of the transaction rate, the number of transactions within each replication queue 15 and the number of transactions within each replication disk queue 16 is further shown. The monitor 6 further receives data regarding the databases 5 and displays the status of each database 5 and the number of committed transactions.
replication disk queue 16 of aparticular replication server 3 or that the transaction rate of areplication server 3 is slow, the administrator may send output data in the form of arequest 33 to distribute the transactions for a specified amount of time to a different database server within the replication group. - Referring to
FIG. 4 , shown is a flow diagram overview of the method 100 for initializing and processing transactions according to the invention. The global TX sequencer 12 , also referred to as the sequencer hereafter and as shown in FIG. 1B , is the control logic of the transaction replicator 1 . - When the
controller 2 is started, it initializes itself by reading from configuration and property files the parameters to be used in the current session 101 . The global TX queue 13 , indoubt TX queue 17 and resent TX queue 18 shown in FIG. 1B are created and emptied in preparation for use. Before accepting any new transactions, the sequencer 12 examines the global disk queue 14 to determine if any transactions are left over from a previous session. For example, if a transaction is found on the global disk queue 14 , it implies that at least one database in the cluster is out of synchronization with the others, and these transactions must be applied to the database before it can be accessed by applications. Transactions on the global disk queue 14 are read into the global TX queue 13 in preparation for applying to the database(s) 5 . The sequencer 12 then starts additional servers, called replication servers 3 , that create and manage the replication queues 15 . After initialization is complete, the sequencer 12 is ready to accept transactions from the application servers 7 . - The
sequencer 12 examines the incoming transaction to determine whether it is a new transaction or one that has already been recorded in the global TX queue 102 . For a new transaction, the sequencer 12 assigns a transaction ID 103 and records the transaction together with this ID in the global TX queue 13 . If the new transaction's ID is generated as a result of a lost ID 104 , the transaction and the ID are stored in the resent TX queue 109 for use in identifying duplicated transactions. The sequencer 12 checks the usage of the global TX queue 105 to determine if the maximum number of transactions in memory has already been exceeded. The sequencer 12 stores the transaction ID in the global TX queue 13 if the memory is not full 106 . Otherwise, the sequencer 12 stores the transaction ID in the global disk queue 107 . The sequencer 12 then returns the ID to the application 108 , and the sequencer 12 is ready to process another request from the application. - When a request from the application or
application server 7 comes in with a transaction that has previously obtained a transaction ID and been recorded in the global TX queue 13, the sequencer 12 searches for and retrieves the entry from either the global TX queue 13 or the disk queue 110. If this transaction has been committed to the database 111, the entry's transaction status is set to "committed" 112 by the sequencer 12, indicating that this transaction is ready for applying to the other databases 200. If the transaction has been rolled back 113, the entry's transaction status is marked "for deletion" 114 and, as will be described, subsequent processing 200 deletes the entry from the global TX queue. If the transaction failed with an indoubt status, the entry's transaction status is set to "indoubt" 115. An alert message is sent to indicate that database recovery may be required 116. Database access is suspended immediately 117 until the indoubt transaction is resolved manually 300 or automatically 400. - Referring to
FIG. 5, shown is a flow diagram of the method 200 for distributing transactions from the global TX queue 13 according to the invention. The global TX queue 13 is used to maintain the proper sequencing and states of all update transactions at commit time. To apply the committed transactions to the other databases, a replication queue 15 is created by the sequencer 12 for each destination database. The sequencer 12 moves committed transactions from the global TX queue to the replication queue based on the following two criteria: (1) a predetermined transaction queue threshold (Q threshold) and (2) a predetermined sleep time (transfer interval). - For a system with sustained workload, the Q threshold is the sole criterion for moving committed transactions to the replication queue 201. For a system with sporadic activity, both the Q threshold and the transfer interval are used to make the
transfer decision 201, 213. Transactions are transferred in batches to reduce communication overhead. When one or both criteria are met, the sequencer 12 prepares a batch of transactions to be moved from the global TX queue 13 to the replication queue 202. If the batch contains transactions, the sequencer 12 removes all the rolled-back transactions from it, because they are not to be applied to the other databases 204. The remaining transactions in the batch are sent to the replication queue for processing 205. If the batch does not contain any transaction 203, the sequencer 12 searches the global TX queue for any unprocessed transactions (status is committing) 206. Since transactions are executed in the same order of occurrence, unprocessed transactions typically occur when a previous transaction has not completed, thereby delaying the processing of subsequent transactions. A transaction that is being committed and has not yet returned its completion status is called a gating transaction. A transaction that is being committed and returns a status of unknown is called an indoubt transaction. Both types of transactions remain in the state of "committing" and block processing of subsequent committed transactions, resulting in the transaction batch being empty. The difference between a gating transaction and an indoubt transaction is that a gating transaction is transient, meaning that it will eventually become committed, unless there is a system failure that causes it to remain in the "gating state" indefinitely. Therefore, when the sequencer 12 finds unprocessed transactions 207, it must differentiate the two types of "committing" transactions 208. For a gating transaction, the sequencer 12 sends out an alert 209 and enters the transaction recovery process 300. Otherwise, the sequencer 12 determines whether the transaction was resent from the application 210, 211, and removes the resent transaction from the global TX queue 211.
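The batch-preparation logic described above can be sketched as follows. This is a minimal illustration only, not the patent's implementation; the entry fields and status labels are assumed for the example:

```python
def build_batch(global_tx_queue):
    """Sketch of steps 202-208: collect finished transactions in order,
    dropping rolled-back entries, and stop at the first entry still in the
    'committing' state, since a gating or indoubt transaction blocks all
    subsequent committed transactions."""
    batch, blocker = [], None
    for entry in global_tx_queue:
        status = entry["status"]
        if status == "committing":
            # Differentiate the two kinds of 'committing' transactions:
            # a gating transaction is transient, an indoubt one is not.
            blocker = "indoubt" if entry.get("indoubt") else "gating"
            break
        if status == "rolled_back":
            continue  # never applied to the other databases (step 204)
        if status == "committed":
            batch.append(entry)
    return batch, blocker

queue = [
    {"id": 1, "status": "committed"},
    {"id": 2, "status": "rolled_back"},
    {"id": 3, "status": "committed"},
    {"id": 4, "status": "committing", "indoubt": False},
    {"id": 5, "status": "committed"},  # blocked behind the gating transaction
]
batch, blocker = build_batch(queue)
```

Note how transaction 5, although committed, is held back until the gating transaction 4 resolves, which is why an empty or short batch triggers the recovery paths described above.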
A resent transaction is a duplicated transaction in the global TX queue 13 that has not been moved to the replication queue 15. The sequencer 12 then enters a sleep because there is no transaction to be processed at the time 214. The sleep is executed in its own thread so that it does not prevent process 200 from being executed at any time; it is a second entry point into the global queue size check at 201. When the sleep time is up, the sequencer 12 creates the transaction batch 202 for transfer to the replication queue 15. - Referring to
FIG. 6, shown is a flow diagram illustrating the method 300 for providing manual recovery of transactions 116 as shown in FIG. 100. There are two scenarios under which the sequencer 12 is unable to resolve gating transactions and indoubt transactions caused by certain types of failure, and manual recovery may be needed. First, a gating transaction remains in the global TX queue 13 for an extended period of time, stopping all subsequent committed transactions from being applied to the other databases. Second, a transaction status is unknown after some system component failure. The sequencer 12 first identifies the transactions needing resolution 301 and sends out an alert 302. The transaction can then be manually analyzed to determine whether it has been committed or rolled back in the database 304 and whether any manual action needs to be taken. If the transaction is found to have been rolled back in the database, the transaction entry is deleted manually from the global TX queue 305. If the transaction has been committed to the database, it is manually marked "committed" 306. In both cases the replication process can resume without having to recover the database 500. If the transaction is flagged as indoubt in the database, it must be forced to commit or roll back at the database before performing 304, 305 and 306. - Referring again to
FIG. 6, the process 400 is entered when an indoubt transaction is detected 115 and automatic failover and recovery of a failed database is performed. Unlike a gating transaction, which may resolve itself a moment later, an indoubt transaction is permanent until the transaction is rolled back or committed by hand or by heuristic rules supported by the database. If the resolution is done with heuristic rules, the indoubt transaction will have been resolved as "committed" or "rolled back" and will not require database failover or recovery. Consequently, the process 400 is only entered when an indoubt transaction cannot be heuristically resolved and an immediate database failover is desirable. Under the automatic recovery process, the database is marked as "needing recovery" 401, with an alert sent out 402 by the sequencer 12. To help prevent further transaction loss, the sequencer 12 stops the generation of new transaction IDs 403 and moves the indoubt transactions to the indoubt TX queue 404. While the database is marked "needing recovery," the sequencer 12 replaces it with one of the available databases in the group 405 and re-enables transaction ID generation 406 such that normal global TX queue processing can continue 200. The sequencer 12 then executes a user-defined recovery procedure to recover the failed database 407. If the database recovery fails, the recovery process is re-entered 408, 407. - Referring to
FIG. 7, shown is a flow diagram illustrating the processing of committed transactions by the replication servers 3 and the management of transactions in the replication queue 15 according to the present invention. Replication queues 15 are managed by the replication servers 3 started by the sequencer 12. One of the replication servers 3 receives batches of transactions from the sequencer 12. The process 500 is entered if a new batch of committed transactions arrives or at any time when queued transactions are to be applied to the databases. - If the process is entered because of
new transactions 501, the batch of transactions is stored in the replication queue in memory, or in the replication disk queue 511 if the memory queue is full. Replication disk queue capacity is determined by the amount of disk space available. If the disk usage is above a predetermined threshold or the disk is full 510, an alert is sent 512 by the sequencer 12 and the database is marked unusable 513, because committed transactions can no longer be queued. - If the process is entered in an attempt to apply transactions in the replication queue to the databases, the replication server first determines whether there is any unprocessed transaction in the replication queue in
memory 502. If the memory queue is empty but unprocessed transactions are found in the replication disk queue 503, they are moved from the disk queue to the memory queue in batches for execution. While transactions remain in the replication disk queue 16, processing continues until the disk queue is empty, at which time the replication server 3 waits for more transactions from the global TX queue 501. During execution of the transactions in the replication queue 15, errors may occur and execution is retried; once the maximum number of retries is exceeded 507, an alert is sent 512 and the database is marked unusable 513. However, even though a database is marked unusable, the system continues to serve application requests. The marked database is inaccessible until the error condition is resolved. The replication server 3 stops when instructed by the sequencer during the apparatus shutdown process of FIG. 4. - It will be evident to those skilled in the art that the
system 10 and its corresponding components can take many forms, and that such forms are within the scope of the invention as claimed. For example, the transaction replicators 1 can be configured as a plurality of transaction replicators 1 in a replicator peer-to-peer (P2P) network, in which each database server 4 is assigned or otherwise coupled to at least one principal transaction replicator 1. The distributed nature of the replicator P2P network can increase robustness in case of failure by replicating data over multiple peers (i.e. transaction replicators 1), and by enabling peers to find/store the data of the transactions without relying on a centralized index server. In the latter case, there may be no single point of failure in the system 10 when using the replicator P2P network. For example, the application or application servers 7 can communicate with a selected one of the database servers 4, such that the replicator P2P network of transaction replicators 1 would communicate with one another for load balancing and/or failure mode purposes. One example would be one application server 7 sending the transaction request to one of the transaction replicators 1, which would then send the transaction request to another of the transaction replicators 1 of the replicator P2P network, which in turn would replicate and then communicate the replicated copies of the transactions to the respective database servers 4. - Further, it is recognized that the applications/
application servers 7 could be configured in an application P2P network such that two or more application computers could share resources such as storage hard drives, CD-ROM drives, and printers. Resources would then be accessible from every computer on the application P2P network. Because P2P computers have their own hard drives that are accessible by all computers, each computer can act as both a client and a server in the application P2P network (e.g. both as an application 7 and as a database 4). P2P networks are typically used for connecting nodes via largely ad hoc connections. Such P2P networks are useful for many purposes, such as, but not limited to, sharing content files containing audio, video, data or anything else in digital format; real-time data, such as telephony traffic, is also passed using P2P technology. The term "P2P network" can also mean grid computing. A pure P2P file transfer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This model of network arrangement differs from the client-server model, where communication is usually to and from a central server or controller. It is recognized that there are, by way of example only, three major types of P2P network, namely:
- 1) Pure P2P, in which peers act as both clients and servers, there is no central server, and there is no central router;
- 2) Hybrid P2P, which has a central server that keeps information on peers and responds to requests for that information; peers are responsible for hosting the information (as the central server does not store files), for letting the central server know what files they want to share, and for providing their shareable resources to peers that request them; route terminals are used as addresses, which are referenced by a set of indices to obtain an absolute address; and
- 3) Mixed P2P, which has both pure and hybrid characteristics. Accordingly, it is recognized that in the application and replicator P2P networks the applications/
application servers 7 and the transaction replicators 1 can operate as both clients and servers, depending upon whether they are the originator or receiver of the transaction request, respectively. Further, it is recognized that both the application and replicator P2P networks can be used in the system 10 alone or in combination, as desired.
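The client/server duality of the replicator P2P network can be illustrated with a small sketch. This is a hedged illustration only: the class name, the forwarding scheme, and the loop guard are assumptions, since no particular routing algorithm is prescribed above:

```python
class TransactionReplicator:
    """Illustrative peer node: it acts as a server when it receives a
    transaction request and as a client when it forwards replicated
    copies to its peer replicators."""

    def __init__(self, name, database_server):
        self.name = name
        self.database_server = database_server
        self.peers = []    # other transaction replicators in the P2P network
        self.applied = []  # transactions applied to the coupled database

    def receive(self, tx, seen=None):
        # 'seen' guards against forwarding loops in the ad hoc peer graph.
        seen = seen if seen is not None else set()
        if self.name in seen:
            return
        seen.add(self.name)
        self.applied.append(tx)  # replicate to the coupled database server
        for peer in self.peers:  # then act as a client toward the peers
            peer.receive(tx, seen)

r1 = TransactionReplicator("r1", "db1")
r2 = TransactionReplicator("r2", "db2")
r3 = TransactionReplicator("r3", "db3")
r1.peers, r2.peers, r3.peers = [r2], [r3], [r1]  # a ring of peer replicators
r1.receive("tx-42")  # an application server sends the request to one peer
```

With no central index server, each replicator both applies the transaction locally and forwards it, so every coupled database receives a copy even though the application contacted only one peer.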
- In view of the above, the spirit and scope of the appended claims should not be limited to the examples or the description of the preferred versions contained herein.
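The replication-queue management described for process 500 above, with an in-memory queue, a disk spill-over queue, and bounded retries, can be sketched as follows. This is a minimal illustration only; the capacities, retry limit, and names are assumed:

```python
class ReplicationServer:
    """Sketch of process 500: batches land in the in-memory replication
    queue, spill to the replication disk queue when memory is full, and the
    database is marked unusable when the disk queue also reaches capacity
    or a transaction keeps failing past the retry limit."""

    def __init__(self, mem_capacity=2, disk_capacity=4, max_retries=3):
        self.mem_q, self.disk_q = [], []
        self.mem_capacity = mem_capacity
        self.disk_capacity = disk_capacity
        self.max_retries = max_retries
        self.db_usable = True
        self.alerts = []

    def enqueue(self, batch):
        for tx in batch:
            if len(self.mem_q) < self.mem_capacity:
                self.mem_q.append(tx)
            elif len(self.disk_q) < self.disk_capacity:
                self.disk_q.append(tx)   # spill to the replication disk queue
            else:
                self.alerts.append("replication queues full")  # send alert
                self.db_usable = False   # committed work can no longer queue
                return

    def drain(self, apply_fn):
        applied = []
        while self.mem_q or self.disk_q:
            if not self.mem_q:
                # Refill the memory queue from the disk queue in batches.
                self.mem_q = self.disk_q[:self.mem_capacity]
                del self.disk_q[:self.mem_capacity]
            tx = self.mem_q.pop(0)
            for _ in range(self.max_retries):
                if apply_fn(tx):
                    applied.append(tx)
                    break
            else:  # retries exhausted: alert and mark the database unusable
                self.alerts.append(f"cannot apply {tx}")
                self.db_usable = False
                return applied
        return applied

rs = ReplicationServer()
rs.enqueue(["t1", "t2", "t3"])       # t3 spills to the disk queue
done = rs.drain(lambda tx: True)     # all three apply in order
```

Even after the database is marked unusable, a real system would keep serving application requests from the remaining databases, as the description notes; this sketch only models the queue bookkeeping.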
Claims (32)
1. A system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising:
a global queue for storing a number of the received transactions in a first predetermined order; and
a sequencer coupled to the global queue for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
2. The system according to claim 1 , wherein the predetermined orders are selected from the group comprising: the first predetermined order is the same as the second predetermined order; and the first predetermined order is different from the second predetermined order.
3. The system according to claim 2 in which the sequencer distributes each said copy at a predetermined time interval.
4. The system according to claim 2 in which the sequencer distributes each said copy when the number of the transactions within the global queue exceeds a predetermined value.
5. The system according to claim 2 in which the sequencer distributes each said copy upon the earlier of:
a predetermined time interval; and
the number of the transactions within the global queue exceeds a predetermined value.
6. The system according to claim 5 in which each of the transactions comprises an update transaction and a unique transaction id assigned by the sequencer.
7. The system according to claim 6 further comprising a global disk queue in communication with the global queue for receiving and storing the transactions when the global queue is above a global threshold.
8. The system according to claim 7 wherein each of said at least two replication queues have a corresponding replication disk queue for receiving and storing the transactions from the global queue when the corresponding replication queue is above a replication threshold.
9. The system according to claim 8 in which the global queue receives the transactions from the global disk queue and other than receives the transactions from said at least one application server when the global disk queue is other than empty.
10. The system according to claim 5 further comprising an indoubt transaction queue in communication with the sequencer for storing the transactions identified as having unknown status by a database server during system failures.
11. The system according to claim 6 wherein the update transaction comprises at least one of a read, insert, update or delete request for at least one database in communication with at least one of said at least two replication queues.
12. The system according to claim 6 further comprising a resent transaction queue for storing the transactions when the transactions repeated the request for the transaction id.
13. The system according to claim 2 , wherein the global queue is configured for receipt of the received transactions from a network entity selected from the group comprising: an application; and an application server.
14. The system according to claim 2 , wherein the global queue is a searchable first-in first-out pipe.
15. The system according to claim 14 further comprising the sequencer configured for assuring the order of transactions in the global queue remain consistent with their execution order at a database server coupled to at least one of the replication queues.
16. The system according to claim 14 , wherein the global disk queue is configured for storing an indexed and randomly accessible data set.
17. The system according to claim 2 , wherein the global queue and sequencer are hosted on a network entity selected from the group comprising: a central control server and a peer-to-peer node.
18. A system for receiving a plurality of transactions from at least one application server, distributing the transactions to at least two replication queues and applying the transactions to a plurality of databases comprising:
a director coupled to each of said at least one application server for capturing a plurality of database calls therefrom as the plurality of transactions; and
a controller for receiving each of the plurality of transactions, the controller configured for storing the transactions within a global queue in a predetermined order, for generating a copy of each said transaction for each of said at least two replication queues, and for transmitting in the predetermined order each said copy to each of said at least two replication queues respectively.
19. The system according to claim 18 further comprising at least two replication servers including said at least two replication queues wherein each of said at least two replication servers is coupled to each of the databases; wherein the director routes each of the transactions to one or more of the databases relative to the workload and transaction throughput.
20. The system according to claim 19 further comprising a backup controller for receiving the transactions from said at least one application server upon failure of the controller, the backup controller including a backup global queue wherein the backup global queue is substantially synchronized with the controller and the backup global queue is a copy of the global queue.
21. A method for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the method comprising:
storing a number of the received transactions in a first predetermined order in a global queue;
creating a copy of each of the transactions for each of said at least two replication queues; and
distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
22. The method according to claim 21 wherein the step of distributing each said copy occurs at a predetermined time interval.
23. The method according to claim 21 wherein the step of distributing each said copy occurs when the number of the transactions within the global queue exceeds a predetermined number.
24. The method according to claim 21 wherein the step of distributing each said copy occurs upon the earlier of: a predetermined time interval; and the number of the transactions within the global queue exceeds a predetermined number.
25. The method according to claim 24 , wherein each of the transactions comprises an update transaction and a unique transaction id assigned by the sequencer.
26. The method according to claim 24 further comprising the step of receiving and storing the transactions within a global disk queue when the global queue storage capacity reaches a global threshold.
27. The method according to claim 21 further comprising the steps of:
determining whether the global disk queue is other than empty; and
receiving the transaction from the global disk queue rather than receiving the transactions from said at least one application server when the global disk queue is other than empty.
28. The method according to claim 21 further comprising the step of storing the transactions within an indoubt transaction queue during system failures.
29. The method according to claim 25 wherein the update transaction comprises at least one of a read, insert, update or delete request for at least one database in communication with at least one of said at least two replication queues.
30. The method according to claim 24 further comprising the steps of:
determining when at least one of said at least two replication queues are above a replication threshold, each of said at least two replication queues having a corresponding replication disk queue;
storing a number of the transactions within said corresponding replication disk queue based upon the determination; and
sending an alert to notify when said at least two replication queues and said corresponding replication disk queue capacity reach a preselected threshold.
31. The method according to claim 30 further comprising the step of: redirecting the transactions to at least one of said at least two replication queues being below said preselected threshold, based on receiving the alert.
32. A system for receiving and tracking a plurality of transactions and distributing the transactions to at least two replication queues over a network, the system comprising:
means for storing a number of the received transactions in a first predetermined order; and
means for creating a copy of each of the transactions for each of said at least two replication queues and for distributing in a second predetermined order each said copy to each of said at least two replication queues respectively, said copy containing one or more of the received transactions.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/221,752 US20070061379A1 (en) | 2005-09-09 | 2005-09-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
PCT/CA2006/001474 WO2007028248A1 (en) | 2005-09-09 | 2006-09-08 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
CA2619778A CA2619778C (en) | 2005-09-09 | 2006-09-08 | Method and apparatus for sequencing transactions globally in a distributed database cluster with collision monitoring |
PCT/CA2006/001475 WO2007028249A1 (en) | 2005-09-09 | 2006-09-08 | Method and apparatus for sequencing transactions globally in a distributed database cluster with collision monitoring |
US12/071,603 US8856091B2 (en) | 2005-09-09 | 2008-02-22 | Method and apparatus for sequencing transactions globally in distributed database cluster |
US12/149,927 US9785691B2 (en) | 2005-09-09 | 2008-05-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/221,752 US20070061379A1 (en) | 2005-09-09 | 2005-09-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
Related Child Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2006/000147 Continuation-In-Part WO2006081672A1 (en) | 2005-02-07 | 2006-02-07 | Database employing biometric indexing and method therefor |
PCT/CA2006/001475 Continuation-In-Part WO2007028249A1 (en) | 2005-09-09 | 2006-09-08 | Method and apparatus for sequencing transactions globally in a distributed database cluster with collision monitoring |
US12/149,927 Continuation US9785691B2 (en) | 2005-09-09 | 2008-05-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070061379A1 true US20070061379A1 (en) | 2007-03-15 |
Family
ID=37835340
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/221,752 Abandoned US20070061379A1 (en) | 2005-09-09 | 2005-09-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
US12/149,927 Active 2028-08-20 US9785691B2 (en) | 2005-09-09 | 2008-05-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/149,927 Active 2028-08-20 US9785691B2 (en) | 2005-09-09 | 2008-05-09 | Method and apparatus for sequencing transactions globally in a distributed database cluster |
Country Status (2)
Country | Link |
---|---|
US (2) | US20070061379A1 (en) |
WO (1) | WO2007028248A1 (en) |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070136389A1 (en) * | 2005-11-29 | 2007-06-14 | Milena Bergant | Replication of a consistency group of data storage objects from servers in a data network |
US20070203910A1 (en) * | 2006-02-13 | 2007-08-30 | Xkoto Inc. | Method and System for Load Balancing a Distributed Database |
US20080114816A1 (en) * | 2006-11-10 | 2008-05-15 | Sybase, Inc. | Replication system with methodology for replicating database sequences |
US20080140734A1 (en) * | 2006-12-07 | 2008-06-12 | Robert Edward Wagner | Method for identifying logical data discrepancies between database replicas in a database cluster |
US7454478B1 (en) * | 2007-11-30 | 2008-11-18 | International Business Machines Corporation | Business message tracking system using message queues and tracking queue for tracking transaction messages communicated between computers |
US20090077016A1 (en) * | 2007-09-14 | 2009-03-19 | Oracle International Corporation | Fully automated sql tuning |
US20090077017A1 (en) * | 2007-09-18 | 2009-03-19 | Oracle International Corporation | Sql performance analyzer |
US20090106219A1 (en) * | 2007-10-17 | 2009-04-23 | Peter Belknap | SQL Execution Plan Verification |
US20090300320A1 (en) * | 2008-05-28 | 2009-12-03 | Jing Zhang | Processing system with linked-list based prefetch buffer and methods for use therewith |
US20090327303A1 (en) * | 2008-06-27 | 2009-12-31 | Microsoft Corporation | Intelligent allocation of file server resources |
US20100005124A1 (en) * | 2006-12-07 | 2010-01-07 | Robert Edward Wagner | Automated method for identifying and repairing logical data discrepancies between database replicas in a database cluster |
US7769722B1 (en) | 2006-12-08 | 2010-08-03 | Emc Corporation | Replication and restoration of multiple data storage object types in a data network |
US20100217840A1 (en) * | 2009-02-25 | 2010-08-26 | Dehaan Michael Paul | Methods and systems for replicating provisioning servers in a software provisioning environment |
US20120124311A1 (en) * | 2009-08-04 | 2012-05-17 | Axxana (Israel) Ltd. | Data Gap Management in a Remote Data Mirroring System |
US20120197961A1 (en) * | 2007-04-13 | 2012-08-02 | Platform Computing Corporation | Method and system for information exchange utilizing an asynchronous persistent store protocol |
CN102841783A (en) * | 2011-06-24 | 2012-12-26 | 镇江华扬信息科技有限公司 | Delphi-based three-layer database system implementation method |
US8706833B1 (en) | 2006-12-08 | 2014-04-22 | Emc Corporation | Data storage server having common replication architecture for multiple storage object types |
US20140279892A1 (en) * | 2013-03-13 | 2014-09-18 | International Business Machines Corporation | Replication group partitioning |
US20150254298A1 (en) * | 2014-03-06 | 2015-09-10 | International Business Machines Corporation | Restoring database consistency integrity |
CN107918620A (en) * | 2016-10-10 | 2018-04-17 | 阿里巴巴集团控股有限公司 | Wiring method and device, the electronic equipment of a kind of database |
US20180143996A1 (en) * | 2016-11-22 | 2018-05-24 | Chen Chen | Systems, devices and methods for managing file system replication |
US10007695B1 (en) * | 2017-05-22 | 2018-06-26 | Dropbox, Inc. | Replication lag-constrained deletion of data in a large-scale distributed data storage system |
CN109299136A (en) * | 2018-11-27 | 2019-02-01 | 佛山科学技术学院 | A kind of real-time synchronization method and device in the database resource pond of intelligence manufacture |
US20190238605A1 (en) * | 2018-01-31 | 2019-08-01 | Salesforce.Com, Inc. | Verification of streaming message sequence |
US10379958B2 (en) | 2015-06-03 | 2019-08-13 | Axxana (Israel) Ltd. | Fast archiving for database systems |
US10592326B2 (en) | 2017-03-08 | 2020-03-17 | Axxana (Israel) Ltd. | Method and apparatus for data loss assessment |
US10621064B2 (en) | 2014-07-07 | 2020-04-14 | Oracle International Corporation | Proactive impact measurement of database changes on production systems |
US10708213B2 (en) | 2014-12-18 | 2020-07-07 | Ipco 2012 Limited | Interface, method and computer program product for controlling the transfer of electronic messages |
US10769028B2 (en) | 2013-10-16 | 2020-09-08 | Axxana (Israel) Ltd. | Zero-transaction-loss recovery for database systems |
EP3754514A4 (en) * | 2018-02-12 | 2020-12-23 | ZTE Corporation | Distributed database cluster system, data synchronization method and storage medium |
US10963882B2 (en) | 2014-12-18 | 2021-03-30 | Ipco 2012 Limited | System and server for receiving transaction requests |
US10997568B2 (en) | 2014-12-18 | 2021-05-04 | Ipco 2012 Limited | System, method and computer program product for receiving electronic messages |
US20210216981A1 (en) * | 2015-01-04 | 2021-07-15 | Tencent Technology (Shenzhen) Company Limited | Method and device for processing virtual cards |
US11080690B2 (en) | 2014-12-18 | 2021-08-03 | Ipco 2012 Limited | Device, system, method and computer program product for processing electronic transaction requests |
US11327932B2 (en) | 2017-09-30 | 2022-05-10 | Oracle International Corporation | Autonomous multitenant database cloud service framework |
US11386058B2 (en) | 2017-09-29 | 2022-07-12 | Oracle International Corporation | Rule-based autonomous database cloud service framework |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8352538B2 (en) * | 2006-10-16 | 2013-01-08 | Siemens Medical Solutions Usa, Inc. | Transaction monitoring system |
US8543863B2 (en) * | 2009-11-18 | 2013-09-24 | Microsoft Corporation | Efficiency of hardware memory access using dynamically replicated memory |
US9110968B2 (en) * | 2010-04-14 | 2015-08-18 | At&T Intellectual Property I, L.P. | Removal of invisible data packages in data warehouses |
US9063969B2 (en) * | 2010-12-28 | 2015-06-23 | Sap Se | Distributed transaction management using optimization of local transactions |
US8788601B2 (en) * | 2011-05-26 | 2014-07-22 | Stratify, Inc. | Rapid notification system |
US8566280B2 (en) | 2011-05-31 | 2013-10-22 | International Business Machines Corporation | Grid based replication |
WO2013019892A1 (en) * | 2011-08-01 | 2013-02-07 | Tagged, Inc. | Generalized reconciliation in a distributed database |
US9495238B2 (en) | 2013-12-13 | 2016-11-15 | International Business Machines Corporation | Fractional reserve high availability using cloud command interception |
US9246840B2 (en) | 2013-12-13 | 2016-01-26 | International Business Machines Corporation | Dynamically move heterogeneous cloud resources based on workload analysis |
US9910733B2 (en) * | 2014-06-26 | 2018-03-06 | Sybase, Inc. | Transaction completion in a synchronous replication environment |
US20170168756A1 (en) * | 2014-07-29 | 2017-06-15 | Hewlett Packard Enterprise Development Lp | Storage transactions |
US9959308B1 (en) | 2014-09-29 | 2018-05-01 | Amazon Technologies, Inc. | Non-blocking processing of federated transactions for distributed data partitions |
US11102313B2 (en) * | 2015-08-10 | 2021-08-24 | Oracle International Corporation | Transactional autosave with local and remote lifecycles |
US10582001B2 (en) | 2015-08-11 | 2020-03-03 | Oracle International Corporation | Asynchronous pre-caching of synchronously loaded resources |
US10419514B2 (en) | 2015-08-14 | 2019-09-17 | Oracle International Corporation | Discovery of federated logins |
US10452497B2 (en) | 2015-08-14 | 2019-10-22 | Oracle International Corporation | Restoration of UI state in transactional systems |
US10582012B2 (en) | 2015-10-16 | 2020-03-03 | Oracle International Corporation | Adaptive data transfer optimization |
US11438224B1 (en) | 2022-01-14 | 2022-09-06 | Bank Of America Corporation | Systems and methods for synchronizing configurations across multiple computing clusters |
CN116302450B (en) * | 2023-05-18 | 2023-09-01 | 深圳前海环融联易信息科技服务有限公司 | Batch processing method and device for tasks, computer equipment and storage medium |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6012059A (en) * | 1997-08-21 | 2000-01-04 | Dataxel Corporation | Method and apparatus for replicated transaction consistency |
US6023720A (en) * | 1998-02-09 | 2000-02-08 | Matsushita Electric Industrial Co., Ltd. | Simultaneous processing of read and write requests using optimized storage partitions for read and write request deadlines |
US20010032282A1 (en) * | 2000-01-13 | 2001-10-18 | Marietta Bryan D. | Bus protocol independent method and structure for managing transaction priority, ordering and deadlocks in a multi-processing system |
US20020133507A1 (en) * | 2001-03-16 | 2002-09-19 | Iti, Inc. | Collision avoidance in database replication systems |
US20020133491A1 (en) * | 2000-10-26 | 2002-09-19 | Prismedia Networks, Inc. | Method and system for managing distributed content and related metadata |
US20030212738A1 (en) * | 2002-05-10 | 2003-11-13 | Wookey Michael J. | Remote services system message system to support redundancy of data flow |
US20040034640A1 (en) * | 2002-08-01 | 2004-02-19 | Oracle International Corporation | Buffered message queue architecture for database management systems with guaranteed at least once delivery |
US20040133591A1 (en) * | 2001-03-16 | 2004-07-08 | Iti, Inc. | Asynchronous coordinated commit replication and dual write with replication transmission and locking of target database on updates only |
US20050021567A1 (en) * | 2003-06-30 | 2005-01-27 | Holenstein Paul J. | Method for ensuring referential integrity in multi-threaded replication engines |
US20070027896A1 (en) * | 2005-07-28 | 2007-02-01 | International Business Machines Corporation | Session replication |
US7177886B2 (en) * | 2003-02-07 | 2007-02-13 | International Business Machines Corporation | Apparatus and method for coordinating logical data replication with highly available data replication |
US20070088970A1 (en) * | 2003-04-10 | 2007-04-19 | Lenovo (Singapore) Pte.Ltd | Recovery from failures within data processing systems |
US7249163B2 (en) * | 2002-05-27 | 2007-07-24 | International Business Machines Corporation | Method, apparatus, system and computer program for reducing I/O in a messaging environment |
US20070174346A1 (en) * | 2006-01-18 | 2007-07-26 | Brown Douglas P | Closed-loop validator |
Family Cites Families (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5613106A (en) | 1989-09-15 | 1997-03-18 | Motorola, Inc. | Method for processing and storing a transaction in a distributed database system |
US5170480A (en) * | 1989-09-25 | 1992-12-08 | International Business Machines Corporation | Concurrently applying redo records to backup database in a log sequence using single queue server per queue at a time |
US5727203A (en) | 1995-03-31 | 1998-03-10 | Sun Microsystems, Inc. | Methods and apparatus for managing a database in a distributed object operating environment using persistent and transient cache |
US6911987B1 (en) * | 1995-07-05 | 2005-06-28 | Microsoft Corporation | Method and system for transmitting data for a shared application |
US5721825A (en) * | 1996-03-15 | 1998-02-24 | Netvision, Inc. | System and method for global event notification and delivery in a distributed computing environment |
US5870761A (en) * | 1996-12-19 | 1999-02-09 | Oracle Corporation | Parallel queue propagation |
US5878414A (en) | 1997-06-06 | 1999-03-02 | International Business Machines Corp. | Constructing a transaction serialization order based on parallel or distributed database log files |
US6374336B1 (en) | 1997-12-24 | 2002-04-16 | Avid Technology, Inc. | Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner |
GB9727463D0 (en) | 1997-12-30 | 1998-02-25 | Orange Personal Comm Serv Ltd | Telecommunications system |
US6792540B1 (en) * | 1998-05-28 | 2004-09-14 | Oracle International Corporation | Data replication security |
US6243715B1 (en) | 1998-11-09 | 2001-06-05 | Lucent Technologies Inc. | Replicated database synchronization method whereby primary database is selected, queries to secondary databases are referred to primary database, primary database is updated, then secondary databases are updated |
TW454120B (en) * | 1999-11-11 | 2001-09-11 | Miralink Corp | Flexible remote data mirroring |
US6826182B1 (en) * | 1999-12-10 | 2004-11-30 | Nortel Networks Limited | And-or multi-cast message routing method for high performance fault-tolerant message replication |
US7065538B2 (en) * | 2000-02-11 | 2006-06-20 | Quest Software, Inc. | System and method for reconciling transactions between a replication system and a recovered database |
US6523036B1 (en) * | 2000-08-01 | 2003-02-18 | Dantz Development Corporation | Internet database system |
US6862595B1 (en) | 2000-10-02 | 2005-03-01 | International Business Machines Corporation | Method and apparatus for implementing a shared message queue using a list structure |
US6920447B2 (en) | 2001-02-15 | 2005-07-19 | Microsoft Corporation | Concurrent data recall in a hierarchical storage environment using plural queues |
US20020194015A1 (en) * | 2001-05-29 | 2002-12-19 | Incepto Ltd. | Distributed database clustering using asynchronous transactional replication |
WO2003044697A1 (en) * | 2001-11-16 | 2003-05-30 | Paralleldb, Incorporated | Data replication system and method |
US7139932B2 (en) * | 2002-01-03 | 2006-11-21 | Hitachi, Ltd. | Data synchronization of multiple remote storage after remote copy suspension |
US20030182464A1 (en) * | 2002-02-15 | 2003-09-25 | Hamilton Thomas E. | Management of message queues |
US7197533B2 (en) * | 2003-01-24 | 2007-03-27 | International Business Machines Corporation | Non-persistent service support in transactional application support environments |
US7707181B2 (en) * | 2003-02-19 | 2010-04-27 | Microsoft Corporation | System and method of distributing replication commands |
US20040199553A1 (en) | 2003-04-02 | 2004-10-07 | Ciaran Byrne | Computing environment with backup support |
GB0315064D0 (en) * | 2003-06-27 | 2003-07-30 | Ibm | Apparatus for returning a data item to a requestor |
US8635256B2 (en) | 2003-07-17 | 2014-01-21 | Silicon Graphics International, Corp. | Network filesystem asynchronous I/O scheduling |
US7490113B2 (en) * | 2003-08-27 | 2009-02-10 | International Business Machines Corporation | Database log capture that publishes transactions to multiple targets to handle unavailable targets by separating the publishing of subscriptions and subsequently recombining the publishing |
US7406487B1 (en) * | 2003-08-29 | 2008-07-29 | Symantec Operating Corporation | Method and system for performing periodic replication using a log |
EP1522932B1 (en) | 2003-10-08 | 2006-07-19 | Alcatel | Fast database replication |
US8156110B1 (en) | 2004-01-29 | 2012-04-10 | Teradata Us, Inc. | Rescheduling of modification operations for loading data into a database system |
EP1577776B1 (en) * | 2004-03-18 | 2007-05-02 | Alcatel Lucent | Method and apparatus for data synchronization in a distributed data base system |
US7359927B1 (en) * | 2004-12-01 | 2008-04-15 | Emc Corporation | Method for performing periodic replication of data on a remote storage system |
US8214353B2 (en) * | 2005-02-18 | 2012-07-03 | International Business Machines Corporation | Support for schema evolution in a multi-node peer-to-peer replication environment |
US7734605B2 (en) | 2005-08-22 | 2010-06-08 | Sun Microsystems, Inc. | Dynamic quota policy for queuing mechanism |
- 2005-09-09 US US11/221,752 patent/US20070061379A1/en not_active Abandoned
- 2006-09-08 WO PCT/CA2006/001474 patent/WO2007028248A1/en active Application Filing
- 2008-05-09 US US12/149,927 patent/US9785691B2/en active Active
Cited By (70)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070136389A1 (en) * | 2005-11-29 | 2007-06-14 | Milena Bergant | Replication of a consistency group of data storage objects from servers in a data network |
US7765187B2 (en) * | 2005-11-29 | 2010-07-27 | Emc Corporation | Replication of a consistency group of data storage objects from servers in a data network |
US20070203910A1 (en) * | 2006-02-13 | 2007-08-30 | Xkoto Inc. | Method and System for Load Balancing a Distributed Database |
US8209696B2 (en) | 2006-02-13 | 2012-06-26 | Teradata Us, Inc. | Method and system for load balancing a distributed database |
US7587435B2 (en) * | 2006-11-10 | 2009-09-08 | Sybase, Inc. | Replication system with methodology for replicating database sequences |
US20080114816A1 (en) * | 2006-11-10 | 2008-05-15 | Sybase, Inc. | Replication system with methodology for replicating database sequences |
US20080140734A1 (en) * | 2006-12-07 | 2008-06-12 | Robert Edward Wagner | Method for identifying logical data discrepancies between database replicas in a database cluster |
US8126848B2 (en) | 2006-12-07 | 2012-02-28 | Robert Edward Wagner | Automated method for identifying and repairing logical data discrepancies between database replicas in a database cluster |
US20100005124A1 (en) * | 2006-12-07 | 2010-01-07 | Robert Edward Wagner | Automated method for identifying and repairing logical data discrepancies between database replicas in a database cluster |
US7769722B1 (en) | 2006-12-08 | 2010-08-03 | Emc Corporation | Replication and restoration of multiple data storage object types in a data network |
US8706833B1 (en) | 2006-12-08 | 2014-04-22 | Emc Corporation | Data storage server having common replication architecture for multiple storage object types |
US9967360B2 (en) | 2007-04-13 | 2018-05-08 | International Business Machines Corporation | Method and system for information exchange utilizing an asynchronous persistent store protocol |
US20120197961A1 (en) * | 2007-04-13 | 2012-08-02 | Platform Computing Corporation | Method and system for information exchange utilizing an asynchronous persistent store protocol |
US9407715B2 (en) * | 2007-04-13 | 2016-08-02 | International Business Machines Corporation | Method and system for information exchange utilizing an asynchronous persistent store protocol |
US9734200B2 (en) | 2007-09-14 | 2017-08-15 | Oracle International Corporation | Identifying high risk database statements in changing database environments |
US9720941B2 (en) | 2007-09-14 | 2017-08-01 | Oracle International Corporation | Fully automated SQL tuning |
US8903801B2 (en) | 2007-09-14 | 2014-12-02 | Oracle International Corporation | Fully automated SQL tuning |
US20090077016A1 (en) * | 2007-09-14 | 2009-03-19 | Oracle International Corporation | Fully automated sql tuning |
US20090077017A1 (en) * | 2007-09-18 | 2009-03-19 | Oracle International Corporation | Sql performance analyzer |
US8341178B2 (en) * | 2007-09-18 | 2012-12-25 | Oracle International Corporation | SQL performance analyzer |
US20090106321A1 (en) * | 2007-10-17 | 2009-04-23 | Dinesh Das | Maintaining and Utilizing SQL Execution Plan Histories |
US8335767B2 (en) | 2007-10-17 | 2012-12-18 | Oracle International Corporation | Maintaining and utilizing SQL execution plan histories |
US10229158B2 (en) | 2007-10-17 | 2019-03-12 | Oracle International Corporation | SQL execution plan verification |
US8600977B2 (en) | 2007-10-17 | 2013-12-03 | Oracle International Corporation | Automatic recognition and capture of SQL execution plans |
US8700608B2 (en) | 2007-10-17 | 2014-04-15 | Oracle International Corporation | SQL execution plan verification |
US20090106320A1 (en) * | 2007-10-17 | 2009-04-23 | Benoit Dageville | Automatic Recognition and Capture of SQL Execution Plans |
US9189522B2 (en) | 2007-10-17 | 2015-11-17 | Oracle International Corporation | SQL execution plan baselines |
US20090106219A1 (en) * | 2007-10-17 | 2009-04-23 | Peter Belknap | SQL Execution Plan Verification |
US7454478B1 (en) * | 2007-11-30 | 2008-11-18 | International Business Machines Corporation | Business message tracking system using message queues and tracking queue for tracking transaction messages communicated between computers |
US8650364B2 (en) * | 2008-05-28 | 2014-02-11 | Vixs Systems, Inc. | Processing system with linked-list based prefetch buffer and methods for use therewith |
US20090300320A1 (en) * | 2008-05-28 | 2009-12-03 | Jing Zhang | Processing system with linked-list based prefetch buffer and methods for use therewith |
US20090327303A1 (en) * | 2008-06-27 | 2009-12-31 | Microsoft Corporation | Intelligent allocation of file server resources |
US20100217840A1 (en) * | 2009-02-25 | 2010-08-26 | Dehaan Michael Paul | Methods and systems for replicating provisioning servers in a software provisioning environment |
US9727320B2 (en) * | 2009-02-25 | 2017-08-08 | Red Hat, Inc. | Configuration of provisioning servers in virtualized systems |
US20120124311A1 (en) * | 2009-08-04 | 2012-05-17 | Axxana (Israel) Ltd. | Data Gap Management in a Remote Data Mirroring System |
US20160378617A1 (en) * | 2009-08-04 | 2016-12-29 | Axxana (Israel) Ltd. | Data gap management in a remote data mirroring system |
US11055183B2 (en) * | 2009-08-04 | 2021-07-06 | Axxana (Israel) Ltd. | Data gap management in a remote data mirroring system |
CN102841783A (en) * | 2011-06-24 | 2012-12-26 | 镇江华扬信息科技有限公司 | Delphi-based three-layer database system implementation method |
US20140279891A1 (en) * | 2013-03-13 | 2014-09-18 | International Business Machines Corporation | Replication group partitioning |
US11157518B2 (en) * | 2013-03-13 | 2021-10-26 | International Business Machines Corporation | Replication group partitioning |
US20140279892A1 (en) * | 2013-03-13 | 2014-09-18 | International Business Machines Corporation | Replication group partitioning |
US11151164B2 (en) * | 2013-03-13 | 2021-10-19 | International Business Machines Corporation | Replication group partitioning |
US10769028B2 (en) | 2013-10-16 | 2020-09-08 | Axxana (Israel) Ltd. | Zero-transaction-loss recovery for database systems |
US9858305B2 (en) * | 2014-03-06 | 2018-01-02 | International Business Machines Corporation | Restoring database consistency integrity |
US9875266B2 (en) * | 2014-03-06 | 2018-01-23 | International Business Machines Corporation | Restoring database consistency integrity |
US20150254296A1 (en) * | 2014-03-06 | 2015-09-10 | International Business Machines Corporation | Restoring database consistency integrity |
US20150254298A1 (en) * | 2014-03-06 | 2015-09-10 | International Business Machines Corporation | Restoring database consistency integrity |
US10621064B2 (en) | 2014-07-07 | 2020-04-14 | Oracle International Corporation | Proactive impact measurement of database changes on production systems |
US10997568B2 (en) | 2014-12-18 | 2021-05-04 | Ipco 2012 Limited | System, method and computer program product for receiving electronic messages |
US11665124B2 (en) | 2014-12-18 | 2023-05-30 | Ipco 2012 Limited | Interface, method and computer program product for controlling the transfer of electronic messages |
US11521212B2 (en) | 2014-12-18 | 2022-12-06 | Ipco 2012 Limited | System and server for receiving transaction requests |
US10708213B2 (en) | 2014-12-18 | 2020-07-07 | Ipco 2012 Limited | Interface, method and computer program product for controlling the transfer of electronic messages |
US11080690B2 (en) | 2014-12-18 | 2021-08-03 | Ipco 2012 Limited | Device, system, method and computer program product for processing electronic transaction requests |
US10999235B2 (en) | 2014-12-18 | 2021-05-04 | Ipco 2012 Limited | Interface, method and computer program product for controlling the transfer of electronic messages |
US10963882B2 (en) | 2014-12-18 | 2021-03-30 | Ipco 2012 Limited | System and server for receiving transaction requests |
US20210216981A1 (en) * | 2015-01-04 | 2021-07-15 | Tencent Technology (Shenzhen) Company Limited | Method and device for processing virtual cards |
US10379958B2 (en) | 2015-06-03 | 2019-08-13 | Axxana (Israel) Ltd. | Fast archiving for database systems |
CN107918620A (en) * | 2016-10-10 | 2018-04-17 | 阿里巴巴集团控股有限公司 | Wiring method and device, the electronic equipment of a kind of database |
US11640384B2 (en) | 2016-10-10 | 2023-05-02 | Alibaba Group Holding Limited | Database processing method, apparatus, and electronic device |
US10725974B2 (en) * | 2016-11-22 | 2020-07-28 | Huawei Technologies Co., Ltd. | Systems, devices and methods for managing file system replication |
US20180143996A1 (en) * | 2016-11-22 | 2018-05-24 | Chen Chen | Systems, devices and methods for managing file system replication |
CN109792453A (en) * | 2016-11-22 | 2019-05-21 | 华为技术有限公司 | Manage the system, apparatus and method of file system duplication |
US10592326B2 (en) | 2017-03-08 | 2020-03-17 | Axxana (Israel) Ltd. | Method and apparatus for data loss assessment |
US10007695B1 (en) * | 2017-05-22 | 2018-06-26 | Dropbox, Inc. | Replication lag-constrained deletion of data in a large-scale distributed data storage system |
US11226954B2 (en) | 2017-05-22 | 2022-01-18 | Dropbox, Inc. | Replication lag-constrained deletion of data in a large-scale distributed data storage system |
US11386058B2 (en) | 2017-09-29 | 2022-07-12 | Oracle International Corporation | Rule-based autonomous database cloud service framework |
US11327932B2 (en) | 2017-09-30 | 2022-05-10 | Oracle International Corporation | Autonomous multitenant database cloud service framework |
US20190238605A1 (en) * | 2018-01-31 | 2019-08-01 | Salesforce.Com, Inc. | Verification of streaming message sequence |
EP3754514A4 (en) * | 2018-02-12 | 2020-12-23 | ZTE Corporation | Distributed database cluster system, data synchronization method and storage medium |
CN109299136A (en) * | 2018-11-27 | 2019-02-01 | 佛山科学技术学院 | A kind of real-time synchronization method and device in the database resource pond of intelligence manufacture |
Also Published As
Publication number | Publication date |
---|---|
US9785691B2 (en) | 2017-10-10 |
WO2007028248A1 (en) | 2007-03-15 |
US20090106323A1 (en) | 2009-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9785691B2 (en) | Method and apparatus for sequencing transactions globally in a distributed database cluster | |
US8856091B2 (en) | Method and apparatus for sequencing transactions globally in distributed database cluster | |
EP1782289B1 (en) | Metadata management for fixed content distributed data storage | |
US10817478B2 (en) | System and method for supporting persistent store versioning and integrity in a distributed data grid | |
US9904605B2 (en) | System and method for enhancing availability of a distributed object storage system during a partial database outage | |
JP4204769B2 (en) | System and method for handling failover | |
US10489412B2 (en) | Highly available search index with storage node addition and removal | |
EP2619695B1 (en) | System and method for managing integrity in a distributed database | |
US20110191300A1 (en) | Metadata management for fixed content distributed data storage | |
US9575975B2 (en) | Cluster-wide unique ID for object access control lists | |
US20030065760A1 (en) | System and method for management of a storage area network | |
US9589002B2 (en) | Content selection for storage tiering | |
US20120078850A1 (en) | System and method for managing scalability in a distributed database | |
US9396076B2 (en) | Centralized version control system having high availability | |
US7120821B1 (en) | Method to revive and reconstitute majority node set clusters | |
US11461201B2 (en) | Cloud architecture for replicated data services | |
CA2619778C (en) | Method and apparatus for sequencing transactions globally in a distributed database cluster with collision monitoring | |
AU2011265370B2 (en) | Metadata management for fixed content distributed data storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: AVOKIA INC., CANADA Free format text: REMOVE 11/211752;ASSIGNORS:WONG, FRANKIE;YU, XIONG;WANG, ELAINE;REEL/FRAME:017583/0030 Effective date: 20051028 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: OPEN INVENTION NETWORK, LLC, NORTH CAROLINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AVOKIA, INC.;REEL/FRAME:023679/0208 Effective date: 20090930 |