WO2014008652A1 - Metadata management method and device

Info

Publication number: WO2014008652A1
Application number: PCT/CN2012/078563
Authority: WO
Inventors: 李熠斌
Original assignee: 华为技术有限公司
Priority date: 2012-07-12
Filing date: 2012-07-12
Publication date: 2014-01-16
Also published as: CN104054294A; CN104054294B

Abstract

The present invention provides a metadata management method and device. The method comprises: storing the currently running metadata set of a node in a current basic shape at the top layer of a metadata heap, where a first leaf node of the current basic shape is used to connect to another current basic shape storing the currently running metadata set of another node, and a second leaf node of the current basic shape is used to connect to a degenerate basic shape storing the metadata set that the node corresponding to the current basic shape ran at a first time point, the first time point being earlier than the current time; and managing the metadata according to the metadata heap. The present invention enables fast execution of an active/standby switchover in a cluster.

Description

Metadata management method and device

Technical Field

The present invention relates to storage technologies, and in particular, to a metadata management method and apparatus.

Background

Metadata is the description of the resources managed by the current device or system, and it plays an important role in the self-maintenance and management of the device. Functional operations performed while a device is running, such as I/O access and resource allocation, need to add, delete, and modify metadata. In a cluster system with a distributed structure, metadata is often scattered among the nodes of the cluster. Moreover, the cluster usually includes a primary node and standby nodes: the primary node maintains a complete metadata set that includes the metadata of all nodes of the cluster, while a standby node maintains only the subset of metadata that it may use itself. The subset is defined relative to the complete set maintained by the primary node, and it is the metadata that a standby node needs in order to keep running normally.

A cluster may need to perform an active/standby switchover. That is, for some reason, a standby node becomes the new primary node and the original primary node becomes a standby node. Data migration then takes place: the metadata set maintained on the original primary node is migrated to the new primary node, for example by copying all of the metadata to the new primary node from the disk that mirrors the metadata set of the original primary node. The more nodes the cluster maintains, the larger the metadata set and the longer the replication takes. However, the time allowed for an active/standby switchover in a cluster is strictly limited. Once the switchover times out, serious consequences follow, such as the I/O services of the cluster being blocked.

Summary of the Invention

The present invention provides a metadata management method and apparatus for quickly implementing active/standby node switching in a distributed cluster.

A first aspect of the present invention provides a metadata management method, where the method is applied to a cluster including a plurality of nodes, and the method includes:

Storing the metadata in the cluster as a metadata heap, where the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all other nodes in the cluster, and the metadata heap includes at least two basic shapes, each of which is a binary tree composed of a vertex, a first leaf node, and a second leaf node;

The storing the metadata in the cluster as a metadata heap includes:

Storing the metadata set currently running on the current node itself in a current basic shape located at the top level of the metadata heap, where the first leaf node of that current basic shape is used to connect another current basic shape storing the metadata set currently running on another node, the first leaf node of the other current basic shape is used to connect a further current basic shape storing the metadata set currently running on a further node, and so on until all nodes in the cluster are connected;

The second leaf node of the current basic shape is used to connect a degenerate basic shape of the node corresponding to the current basic shape, where the degenerate basic shape stores the metadata set that the node ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape of the same node, where the other degenerate basic shape stores the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on;

The management of the metadata is performed according to the metadata heap.

In a first possible implementation manner of the first aspect, when the metadata in the metadata heap is updated, the managing the metadata according to the metadata heap includes: acquiring an updated metadata set currently running on one of the nodes in the cluster other than the current node itself; and storing the updated metadata set in the current basic shape corresponding to that node in the metadata heap.

With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner, after the updated metadata set is stored in the current basic shape corresponding to that node in the metadata heap, the method further includes: storing, in the metadata heap, degraded data corresponding to that node in a degenerate basic shape, where the degraded data is the data that was stored in the current basic shape before the updated metadata set; and connecting the degenerate basic shape to the second leaf node of the current basic shape.
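The update and degradation steps of the first and second implementation manners can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the class `BasicShape`, the functions `find_current` and `apply_update`, and the retention limit `max_levels` are hypothetical names, and metadata sets are modeled as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BasicShape:
    """One basic shape: a vertex plus two leaf 'connection interfaces' (hypothetical model)."""
    node_id: int
    metadata: Dict[str, Any]                     # vertex: the stored metadata set
    first_leaf: Optional["BasicShape"] = None    # next node's current basic shape
    second_leaf: Optional["BasicShape"] = None   # older metadata set of the same node


def find_current(top: BasicShape, node_id: int) -> Optional[BasicShape]:
    """Walk the first-leaf chain to find the current basic shape of a node."""
    shape = top
    while shape is not None and shape.node_id != node_id:
        shape = shape.first_leaf
    return shape


def apply_update(top: BasicShape, node_id: int,
                 new_metadata: Dict[str, Any], max_levels: int = 2) -> None:
    """Store an updated set and push the previous one into a degenerate shape."""
    current = find_current(top, node_id)
    if current is None:
        raise KeyError(node_id)
    # The data held before the update becomes the newest degenerate shape and
    # inherits the older degenerate shapes through its own second leaf.
    degraded = BasicShape(node_id, current.metadata, second_leaf=current.second_leaf)
    current.metadata = dict(new_metadata)
    current.second_leaf = degraded
    # Assumption: keep at most max_levels degenerate shapes per node.
    shape, depth = current, 0
    while shape.second_leaf is not None:
        if depth == max_levels:
            shape.second_leaf = None
            break
        shape = shape.second_leaf
        depth += 1


# Node 1 keeps a heap with its own shape on top and node 2 attached to it.
top = BasicShape(1, {"v": 0}, first_leaf=BasicShape(2, {"v": 0}))
for version in (1, 2, 3):
    apply_update(top, 2, {"v": version})

history = []
shape = find_current(top, 2)
while shape is not None:
    history.append(shape.metadata["v"])
    shape = shape.second_leaf
print(history)  # [3, 2, 1]
```

With `max_levels=2`, the oldest set (version 0) is dropped after the third update, which reflects the trade-off between the number of retained time points and storage space.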

In a third possible implementation manner of the first aspect, when another node in the cluster, other than the current node itself, is split from the cluster, the managing the metadata according to the metadata heap includes: disconnecting the current basic shape corresponding to the other node from the first leaf node of the current basic shape of the current node, and disconnecting the current basic shape corresponding to the other node from the further current basic shape, corresponding to a further node, that is connected to its first leaf node; and connecting the further current basic shape to the first leaf node of the current basic shape of the current node.
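The disconnect-and-reconnect sequence for a node that leaves the cluster resembles removing an element from a singly linked list. The following Python sketch shows one way to express it; `BasicShape`, `split_node`, and `chain_ids` are hypothetical names, and handling a departing node at any position in the chain (not only directly below the current node) is an assumption that goes beyond the text above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class BasicShape:
    """Hypothetical model of a basic shape (vertex plus two leaf nodes)."""
    node_id: int
    metadata: Dict[str, Any]
    first_leaf: Optional["BasicShape"] = None    # next node's current basic shape
    second_leaf: Optional["BasicShape"] = None   # degenerate shape of the same node


def split_node(top: BasicShape, node_id: int) -> BasicShape:
    """Detach the current basic shape of a departing node and close the gap."""
    if top.node_id == node_id:
        raise ValueError("the top-level shape belongs to the current node itself")
    prev = top
    while prev.first_leaf is not None:
        leaving = prev.first_leaf
        if leaving.node_id == node_id:
            # Reconnect the further current basic shape to the predecessor,
            # then cut the departing shape loose from the chain.
            prev.first_leaf = leaving.first_leaf
            leaving.first_leaf = None
            return leaving
        prev = leaving
    raise KeyError(node_id)


def chain_ids(top: BasicShape) -> List[int]:
    """List node ids along the first-leaf chain, top first."""
    ids = []
    shape: Optional[BasicShape] = top
    while shape is not None:
        ids.append(shape.node_id)
        shape = shape.first_leaf
    return ids


node3 = BasicShape(3, {"v": "3"})
node2 = BasicShape(2, {"v": "2"}, first_leaf=node3,
                   second_leaf=BasicShape(2, {"v": "2'"}))
top = BasicShape(1, {"v": "1"}, first_leaf=node2)

removed = split_node(top, 2)
print(chain_ids(top))  # [1, 3]
```

The detached shape keeps its own degenerate shapes through its second leaf, so the history of the departing node leaves the heap together with it.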

In a fourth possible implementation manner of the first aspect, when a new node is added to the cluster, the managing the metadata according to the metadata heap includes: obtaining the currently running metadata set of the new node from the metadata heap stored in the new node that is to join the cluster; establishing a current basic shape corresponding to the new node in the metadata heap of the current node, and storing the currently running metadata set of the new node in that current basic shape; and connecting the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is free.
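Joining a new node can be sketched in the same style. In the Python sketch below, `BasicShape` and `join_node` are hypothetical names; the sketch assumes that the top-level shape of the joining node's own heap holds its currently running metadata set, and that the shape with a free first leaf node is found by walking to the end of the chain.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BasicShape:
    """Hypothetical model of a basic shape (vertex plus two leaf nodes)."""
    node_id: int
    metadata: Dict[str, Any]
    first_leaf: Optional["BasicShape"] = None
    second_leaf: Optional["BasicShape"] = None


def join_node(top: BasicShape, new_node_heap: BasicShape) -> BasicShape:
    """Attach a current basic shape for a joining node to the local heap."""
    # Obtain the currently running metadata set from the top of the new node's heap
    # and copy it into a freshly established current basic shape.
    new_shape = BasicShape(new_node_heap.node_id, dict(new_node_heap.metadata))
    # Find the current basic shape whose first leaf node is still free.
    tail = top
    while tail.first_leaf is not None:
        tail = tail.first_leaf
    tail.first_leaf = new_shape
    return new_shape


# Local heap of node 1 in a two-node cluster; node 3 is about to join.
top = BasicShape(1, {"disk": "a"}, first_leaf=BasicShape(2, {"disk": "b"}))
newcomer_heap = BasicShape(3, {"disk": "c"})

join_node(top, newcomer_heap)
print(top.first_leaf.first_leaf.node_id)  # 3
```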

In a fifth possible implementation manner of the first aspect, the managing the metadata according to the metadata heap includes: separately storing the metadata set run at each time point, where the metadata set includes the metadata set of each node in the cluster and the time points include the current time point, the first time point, and the second time point, so that the metadata set corresponding to a given one of the time points can be run.
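Reading back the metadata sets of one time point could look like the Python sketch below. It assumes that the same depth along each node's second-leaf chain corresponds to the same time point, which the text does not state explicitly; `BasicShape` and `snapshot` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BasicShape:
    """Hypothetical model of a basic shape (vertex plus two leaf nodes)."""
    node_id: int
    metadata: Dict[str, Any]
    first_leaf: Optional["BasicShape"] = None
    second_leaf: Optional["BasicShape"] = None


def snapshot(top: BasicShape, level: int) -> Dict[int, Dict[str, Any]]:
    """Collect each node's metadata set at a degradation level (0 = current)."""
    result: Dict[int, Dict[str, Any]] = {}
    current: Optional[BasicShape] = top
    while current is not None:
        shape: Optional[BasicShape] = current
        # Descend the second-leaf chain 'level' times, stopping if it runs out.
        for _ in range(level):
            if shape is None:
                break
            shape = shape.second_leaf
        if shape is not None:
            result[current.node_id] = shape.metadata
        current = current.first_leaf
    return result


# Three nodes: node 1 has two degradation levels, node 2 has one, node 3 has none.
n3 = BasicShape(3, {"v": "3"})
n2 = BasicShape(2, {"v": "2"}, first_leaf=n3,
                second_leaf=BasicShape(2, {"v": "2'"}))
top = BasicShape(1, {"v": "1"}, first_leaf=n2,
                 second_leaf=BasicShape(1, {"v": "1'"},
                                        second_leaf=BasicShape(1, {"v": "1''"})))

print(snapshot(top, 0))
print(snapshot(top, 1))
```

Nodes that have no degenerate shape at the requested level are simply absent from the result; how a real system should treat such nodes is left open here.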

In a sixth possible implementation manner of the first aspect, the managing the metadata according to the metadata heap includes: when metadata to be repaired exists in the current basic shape corresponding to the current node itself in the metadata heap, obtaining repair data corresponding to the metadata to be repaired from the degenerate basic shape corresponding to the current node itself in the metadata heap, or obtaining the repair data from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replacing the metadata to be repaired in the current basic shape corresponding to the current node in the metadata heap with the obtained repair data.
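The two repair sources can be illustrated with a short Python sketch. The names `BasicShape`, `find_current`, and `repair` are hypothetical, metadata entries are modeled as dictionary keys, and trying the node's own degenerate shape before the peer's copy is an arbitrary ordering chosen for the example; the text only says that either source may be used.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class BasicShape:
    """Hypothetical model of a basic shape (vertex plus two leaf nodes)."""
    node_id: int
    metadata: Dict[str, Any]
    first_leaf: Optional["BasicShape"] = None
    second_leaf: Optional["BasicShape"] = None


def find_current(top: BasicShape, node_id: int) -> Optional[BasicShape]:
    """Walk the first-leaf chain to find the current basic shape of a node."""
    shape: Optional[BasicShape] = top
    while shape is not None and shape.node_id != node_id:
        shape = shape.first_leaf
    return shape


def repair(own_top: BasicShape, bad_keys: Iterable[str],
           peer_top: Optional[BasicShape] = None) -> List[str]:
    """Replace damaged entries in the node's own current shape; return repaired keys."""
    sources: List[Dict[str, Any]] = []
    if own_top.second_leaf is not None:
        sources.append(own_top.second_leaf.metadata)      # own degenerate shape
    if peer_top is not None:
        peer_copy = find_current(peer_top, own_top.node_id)
        if peer_copy is not None:
            sources.append(peer_copy.metadata)            # copy held by another node
    repaired = []
    for key in bad_keys:
        for source in sources:
            if key in source:
                own_top.metadata[key] = source[key]
                repaired.append(key)
                break
    return repaired


# Node 1 has damaged entries "a" and "c"; node 2 holds a copy of node 1's current set.
own = BasicShape(1, {"a": None, "b": 2, "c": None},
                 second_leaf=BasicShape(1, {"a": 1, "b": 2}))
peer = BasicShape(2, {"x": 0}, first_leaf=BasicShape(1, {"a": 9, "b": 2, "c": 3}))

fixed = repair(own, ["a", "c"], peer_top=peer)
print(fixed, own.metadata)
```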

A second aspect of the present invention provides a metadata management apparatus, including:

a storage unit, configured to store the metadata in a cluster as a metadata heap, where the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all other nodes in the cluster, and the metadata heap includes at least two basic shapes, each of which is a binary tree shape composed of a vertex, a first leaf node, and a second leaf node; the storing the metadata as a metadata heap includes:

storing the metadata set currently running on the current node itself in a current basic shape located at the top level of the metadata heap, where the first leaf node of that current basic shape is used to connect another current basic shape storing the metadata set currently running on another node, the first leaf node of the other current basic shape is used to connect a further current basic shape storing the metadata set currently running on a further node, and so on until all nodes in the cluster are connected;

among the current basic shapes that store the metadata sets currently running on the nodes in the cluster, the second leaf node of each current basic shape is used to connect a degenerate basic shape storing the metadata set that the node corresponding to that current basic shape ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on; and

a management unit, configured to manage the metadata according to the metadata heap.

In a first possible implementation manner of the second aspect, the management unit includes: a synchronization subunit, configured to: when the metadata in the metadata heap is updated, acquire an updated metadata set currently running on one of the nodes in the cluster other than the current node itself; and store the updated metadata set in the current basic shape corresponding to that node in the metadata heap.

With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner, the management unit includes: a storage subunit, configured to store, in the metadata heap, degraded data corresponding to that node in a degenerate basic shape, where the degraded data is the data that was stored in the current basic shape before the updated metadata set; and connect the degenerate basic shape to the second leaf node of the current basic shape.

In a third possible implementation manner of the second aspect, the management unit includes: a morphological control subunit, configured to: when another node in the cluster, other than the current node itself, is split from the cluster, disconnect the current basic shape corresponding to the other node from the first leaf node of the current basic shape of the current node, disconnect the current basic shape corresponding to the other node from the further current basic shape, corresponding to a further node, that is connected to its first leaf node, and connect the further current basic shape to the first leaf node of the current basic shape of the current node; and, when a new node is added to the cluster, obtain the currently running metadata set of the new node from the metadata heap stored in the new node that is to join the cluster, establish a current basic shape corresponding to the new node in the metadata heap of the current node, store the currently running metadata set of the new node in that current basic shape, and connect the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is free.

In a fourth possible implementation manner of the second aspect, the management unit includes: a snapshot subunit, configured to separately store the metadata set run at each time point, where the metadata set includes the metadata set of each node in the cluster and the time points include the current time point, the first time point, and the second time point, so that the metadata set corresponding to a given one of the time points can be run.

In a fifth possible implementation manner of the second aspect, the management unit includes: a repair subunit, configured to: when metadata to be repaired exists in the current basic shape corresponding to the current node in the metadata heap, obtain repair data corresponding to the metadata to be repaired from the degenerate basic shape corresponding to the current node in the metadata heap, or obtain the repair data from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replace the metadata to be repaired in the current basic shape corresponding to the current node in the metadata heap with the obtained repair data.

In a sixth possible implementation manner of the second aspect, when the primary node in the cluster is powered off, the management unit is further configured to: determine, according to a predetermined rule, that the current node is the primary node; and modify the primary/standby identifier of the current node to primary.

A third aspect of the present invention provides a metadata management apparatus, including a memory and a processor, where the memory is configured to store the metadata in a cluster as a metadata heap, where the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all other nodes in the cluster, and the metadata heap includes at least two basic shapes, each of which is a binary tree composed of a vertex, a first leaf node, and a second leaf node;

The storing the metadata in the cluster as a metadata heap includes:

The second leaf node of the current basic shape is used to connect a degenerate basic shape of the node corresponding to the current basic shape, where the degenerate basic shape stores the metadata set that the node ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape of the same node, where the other degenerate basic shape stores the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on;

The processor is configured to perform management of the metadata according to the metadata heap.

With the metadata management method and apparatus provided by the present invention, a metadata heap is saved in each node, and each metadata heap stores the current metadata set of the current node itself and the metadata sets of all other nodes in the cluster. Because all nodes in the cluster then have the same structure, any node can serve as the primary node. An active/standby switchover therefore only requires modifying the primary/standby identifier inside the node, after which the node can act as the primary; no data migration between the primary and standby nodes is needed. This avoids the large-scale data migration that the prior art requires during an active/standby switchover and enables the switchover to be executed quickly in the cluster.

Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram showing the configuration of a metadata heap in a node in an embodiment of a metadata management method according to the present invention;

FIG. 2 is a schematic diagram of a metadata heap in a node in an embodiment of a metadata management method according to the present invention;

FIG. 3 is a schematic flowchart of an embodiment of a metadata management method according to the present invention;

FIG. 4 is a schematic diagram of the principle of another embodiment of a metadata management method according to the present invention;

FIG. 5 is a schematic diagram of the principle of still another embodiment of a metadata management method according to the present invention;

FIG. 6 is a schematic diagram of the principle of still another embodiment of a metadata management method according to the present invention;

FIG. 7 is a schematic diagram of the principle of still another embodiment of a metadata management method according to the present invention;

FIG. 8 is schematic diagram 1 of another embodiment of a metadata management method according to the present invention;

FIG. 9 is schematic diagram 2 of another embodiment of a metadata management method according to the present invention;

FIG. 10 is a schematic diagram of the principle of still another embodiment of a metadata management method according to the present invention;

FIG. 11 is a schematic structural diagram of an embodiment of a metadata management apparatus according to the present invention;

FIG. 12 is a schematic structural diagram of another embodiment of a metadata management apparatus according to the present invention.

Detailed Description

Embodiments of the present invention implement metadata management based on fractal theory. A fractal is generally understood as "a rough or fragmented geometric shape that can be divided into several parts, each of which is (at least approximately) a reduced-size copy of the whole"; this property is called self-similarity. A mathematical fractal is based on an iterated equation, that is, a recursion-based feedback system. There are several types of fractals, defined respectively by exact self-similarity, semi-self-similarity, and statistical self-similarity. Fractals generally have the following characteristics: they have fine structure at arbitrarily small scales; they are irregular, both as a whole and in their parts, and are therefore difficult to describe in the language of traditional Euclidean geometry; and they have a self-similar form (at least approximately or statistically).

The following describes how the embodiments of the present invention apply fractal theory to the organization and management of metadata. This organization and management of metadata is applicable to clusters that include multiple nodes, for example metadata management for a cluster of computer hosts, cache metadata management for cluster devices, metadata management for a distributed file system, and all other scenarios that require discrete management of cluster data. The following embodiments use a distributed cluster system, which includes a plurality of nodes, as an example to describe the method. In these embodiments, each node of the cluster internally stores the metadata sets of all nodes of the entire cluster, or at least the metadata sets currently running on all nodes. The metadata is stored in each node as a metadata heap, and the composition of the metadata heap is based on fractal theory.

First, several concepts used in the embodiments of the present invention are briefly described as follows:

Metadata heap: Each node stores a plurality of pieces of metadata, for example the metadata of the node itself and the metadata of other nodes. These pieces of metadata are stored in association with each other, and the whole of all the metadata is called a metadata heap.

Basic shape: A unit that stores metadata. Each basic shape stores one type of metadata. For example, one basic shape is used to store the metadata of the node itself, and another basic shape is used to store the metadata of another node.

Basic shapes include current basic shapes and degenerate basic shapes. A current basic shape stores the metadata that a node is currently running; a degenerate basic shape stores the metadata that a node ran at a time point before the current time point.

Here, the specific meanings of the above respective concepts and their mutual relations will be described in detail in the following embodiments.

FIG. 1 is a schematic diagram showing the configuration of a metadata heap in a node in an embodiment of a metadata management method according to the present invention. The diagram takes a cluster including two nodes, node 1 and node 2, as an example, and shows the principle of the metadata heap inside node 1. The metadata heap includes at least two basic shapes, for example a basic shape 11 for storing the metadata set currently running on node 1 and another basic shape 12 for storing the metadata set currently running on node 2. Because the basic shape 11 and the basic shape 12 store metadata sets currently running on nodes, each may be referred to as a "current basic shape"; the "current basic shape" mentioned in subsequent embodiments likewise refers to a basic shape used to store the metadata set currently running on a node, where the node may be any node in the cluster. Each node in the cluster stores at least the current basic shapes holding the currently running metadata set of every node of the cluster. For example, in a five-node cluster, each node internally includes at least five current basic shapes, one for the currently running metadata set of each node.

As shown in FIG. 1, each basic shape includes a vertex a, a first leaf node b, and a second leaf node c. The basic shape resembles a triangle and may also be referred to as a binary tree shape. For example, the basic shape 12 also has a vertex a, a first leaf node b, and a second leaf node c, and the basic shape 12 is connected to the first leaf node b of the basic shape 11. Similarly, if the cluster also includes node 3, the current basic shape corresponding to node 3 is connected to the first leaf node b of the basic shape 12, and so on until all nodes in the cluster have been traversed.

In fact, the basic shape is introduced into the metadata heap in order to explain more clearly how the stored metadata sets are connected. Specifically, in the basic shape 11 for example, the vertex a can be understood as representing the metadata set currently running on node 1 (the data structure inside the set is not limited); that is, the metadata set stored in each basic shape is represented by its vertex a. The first leaf node b and the second leaf node c of a basic shape can be understood as "connection interfaces" for connecting another metadata set, indicating that the metadata sets are associated with each other. For example, as shown in FIG. 1, the first leaf node b of the basic shape 11 is used to connect the metadata set of node 2; that is, the first leaf node b is the connection interface that links together the metadata sets currently running on the nodes.
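As an illustration of this structure, the Python sketch below models a basic shape as a record with a vertex and two leaf references, and builds the chain of current basic shapes for a three-node cluster. It is a sketch under stated assumptions: `BasicShape`, `build_heap`, and `current_chain` are hypothetical names, and metadata sets are modeled as dictionaries because the text places no limit on their internal data structure.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class BasicShape:
    """One basic shape of the metadata heap (hypothetical model)."""
    node_id: int
    metadata: Dict[str, Any]                     # vertex a: the stored metadata set
    first_leaf: Optional["BasicShape"] = None    # leaf b: current shape of the next node
    second_leaf: Optional["BasicShape"] = None   # leaf c: degenerate shape of this node


def build_heap(own_id: int, current_sets: Dict[int, Dict[str, Any]]) -> BasicShape:
    """Build a heap with the node's own current shape on top."""
    order = [own_id] + [n for n in current_sets if n != own_id]
    top: Optional[BasicShape] = None
    # Link from the last node backwards so each shape's first leaf points at the next.
    for node_id in reversed(order):
        top = BasicShape(node_id, dict(current_sets[node_id]), first_leaf=top)
    assert top is not None
    return top


def current_chain(top: BasicShape) -> List[int]:
    """Node ids reachable through first leaf nodes, starting at the top."""
    ids = []
    shape: Optional[BasicShape] = top
    while shape is not None:
        ids.append(shape.node_id)
        shape = shape.first_leaf
    return ids


sets = {1: {"lun": "a"}, 2: {"lun": "b"}, 3: {"lun": "c"}}
heap_on_node1 = build_heap(1, sets)
heap_on_node2 = build_heap(2, sets)
print(current_chain(heap_on_node1))  # [1, 2, 3]
print(current_chain(heap_on_node2))  # [2, 1, 3]
```

Both heaps hold the same metadata sets; only the shape at the top differs, since each node places its own current basic shape there.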

Because every node of the cluster stores the metadata sets currently running on all nodes of the entire cluster, the metadata stored in each node is the same. Even if an active/standby switchover occurs, no data migration between the original primary node and the new primary node is needed, which enables a fast active/standby switchover.
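The consequence for a switchover can be shown with a small Python sketch in which promoting a node changes only a role flag and copies no metadata. The names `Role`, `Node`, and `switch_over` are hypothetical, the per-node `heap` dictionary is a stand-in for the metadata heap, and choosing the surviving node with the lowest identifier is an assumed example of a predetermined rule.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Role(Enum):
    PRIMARY = "primary"
    STANDBY = "standby"


@dataclass
class Node:
    node_id: int
    role: Role
    # Stand-in for the metadata heap: every node already holds all current sets.
    heap: Dict[int, Dict[str, Any]] = field(default_factory=dict)


def switch_over(nodes: List[Node], failed_id: int) -> Node:
    """Promote a surviving node by changing role flags only."""
    survivors = [n for n in nodes if n.node_id != failed_id]
    if not survivors:
        raise RuntimeError("no surviving node to promote")
    # Assumed rule for the example: the lowest surviving node id becomes primary.
    new_primary = min(survivors, key=lambda n: n.node_id)
    for n in nodes:
        n.role = Role.PRIMARY if n is new_primary else Role.STANDBY
    return new_primary


cluster_sets = {1: {"lun": "a"}, 2: {"lun": "b"}, 3: {"lun": "c"}}
nodes = [
    Node(1, Role.PRIMARY, dict(cluster_sets)),
    Node(2, Role.STANDBY, dict(cluster_sets)),
    Node(3, Role.STANDBY, dict(cluster_sets)),
]
heap_before = nodes[1].heap

promoted = switch_over(nodes, failed_id=1)
print(promoted.node_id, promoted.role.value)  # 2 primary
```

The promoted node keeps the very same heap object it held before the switchover, so the time taken does not grow with the size of the metadata set.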

Optionally, the second leaf node c of each basic shape is used to connect the degraded data of the node, where the degraded data is the metadata set that the node ran at a time point before the current time point. A basic shape that stores degraded data may be referred to as a "degenerate basic shape"; the "degenerate basic shape" mentioned in subsequent embodiments likewise refers to a basic shape used to store the metadata set that a node ran at a time point before the current time point. As shown in FIG. 1, the second leaf node c of the basic shape 11 (which may also be referred to as the current basic shape 11) is connected to the degenerate basic shape 13 corresponding to node 1. The degenerate basic shape 13 stores, for example, the metadata set that node 1 ran at a first time point, the first time point being earlier than the current time point. The second leaf node c of the degenerate basic shape 13 is connected to another degenerate basic shape 15, which stores the metadata set that node 1 ran at a second time point, the second time point being earlier than the first time point. The second leaf node c of the basic shape 12 is connected to the degenerate basic shape 14 corresponding to node 2. As can be seen from the above, for a single node, the second leaf node c of a basic shape is the connection interface that links together the metadata sets the node ran at each time point, where the time points include the current time point and time points before it.

It should be noted that which degraded data is kept is determined according to actual needs. For example, suppose the metadata currently running on node 1 is modified at time point t1 and modified again at time points t2 and t3, where t3 is later than t2 and t2 is later than t1. The data modified at time point t1 and the data modified at time point t2 may both be saved as required, each stored in its own degenerate basic shape and connected according to the connection rule of FIG. 1 described above; alternatively, only the data modified at time point t1 may be saved.

As shown in FIG. 1, A represents the metadata sets of node 1, including the metadata set currently running on node 1 and the metadata sets of node 1 at two time points before the current time point, and B represents the metadata sets of node 2, including the metadata set currently running on node 2 and the metadata set of node 2 at one earlier time point. A and B are connected through the current basic shapes corresponding to node 1 and node 2: the vertex a of the current basic shape 12 of node 2 (representing the metadata set currently running on node 2) is connected to the first leaf node b (the connection interface) of the current basic shape 11 of node 1. The other nodes in the cluster organize their metadata according to the same connection method.

In addition, in FIG. 1 the current basic shapes are connected through the first leaf node on the bottom edge of the basic shape; alternatively, the second leaf node on the bottom edge of the basic shape may be used for this connection instead. Degraded data is also a complete metadata set inside a node; "degraded" only means that the data is not currently running but was run at a time point before the current time point. For example, after the currently running metadata has changed, the data before the change can be stored in a degenerate basic shape. In FIG. 1, node 1 has two levels of degradation (that is, two degenerate basic shapes). In practice, more levels of degenerate basic shapes can be configured as required. The more degradation levels there are, the more time points are covered, but the more storage space is occupied.

As shown in FIG. 1, in the metadata heap, node 2 may be referred to as a "logical neighbor node" of node 1. Two nodes that are connected in the metadata heap are logical neighbors of each other: in FIG. 1, node 1 is a logical neighbor of node 2, and node 2 is a logical neighbor of node 1. If the cluster also includes node 3 and node 3 is connected to the first leaf node of the current basic shape 12 corresponding to node 2, node 3 is referred to as a logical neighbor node of node 2. That is, "logical neighbor" is a definition used within the metadata heap to express the connection relationship between nodes, and it is independent of the actual physical connections between the nodes.

It can also be seen from FIG. 1 that the basic shapes in the metadata heap may include current basic shapes and degenerate basic shapes, and that each basic shape includes a vertex, a first leaf node, and a second leaf node. However, the connection relationships of these three nodes differ between a current basic shape and a degenerate basic shape. For example, the vertex a of the current basic shape 11 represents the metadata set currently running on node 1, the first leaf node b of the current basic shape 11 is used to connect the current basic shape of the logical neighbor node (node 2) of node 1, and the second leaf node c of the current basic shape 11 is used to connect the degenerate basic shape 13, which corresponds to the metadata set of node 1 itself at the previous time point. For the degenerate basic shape 13, the vertex a represents the metadata set that node 1 ran at the previous time point, and the second leaf node c is used to connect the metadata set that node 1 ran at an even earlier time point, but the first leaf node b of the degenerate basic shape 13 has no data attached. These characteristics are also the composition rules of the metadata heap inside each node. In addition, each node internally takes its own current basic shape as the top layer of the metadata heap, and degenerate basic shapes do not necessarily exist in the metadata heap; for example, a newly established cluster, or a cluster whose metadata has never been modified, may have no degenerate basic shapes.

FIG. 2 is a schematic diagram of a metadata heap in a node in an embodiment of a metadata management method according to the present invention. FIG. 2 takes a cluster including three nodes, node 1, node 2, and node 3, as an example and shows the metadata heap inside node 1. It should be noted that FIG. 2 shows only two levels of degraded data for node 1, namely 1' and 1", where 1' is the metadata set that node 1 ran at the first time point, the first time point being earlier than the current time point, and 1" is the metadata set that node 1 ran at the second time point, the second time point being earlier than the first time point. FIG. 2 shows one level of degraded data for node 2, namely 2'. FIG. 2 also shows the metadata set currently running on node 3, but does not show any degraded data of node 3. As described above, FIG. 2 is merely an example, and the degradation level of each node is not limited. For example, node 2 may also have second-level degraded data 2", and node 3 may also have degraded data.

Comparing FIG. 2 with FIG. 1: if the current basic shape of a node 3 were connected to the first leaf node b of the current basic shape 12 in FIG. 1, FIG. 2 and FIG. 1 would be substantially the same. FIG. 2 simply combines the basic shapes that are drawn separately in FIG. 1 for the sake of brevity. For example, for the degenerate basic shape 13 in FIG. 1, that is, the basic shape storing the data 1', the first leaf node b is not connected to any data in FIG. 1, whereas in FIG. 2 the first leaf node b of the degenerate basic shape 13 is drawn overlapping the second leaf node c of the current basic shape 12 of node 2. As stated above, this is done only to make the representation of the metadata heap more concise. For the same reason, the first leaf node b of the degenerate basic shape 15 in FIG. 1 is drawn overlapping the second leaf node c of the degenerate basic shape 14 of node 2. The other overlaps follow the same principle and are not described again. The subsequent embodiments also use the form of metadata heap shown in FIG. 2.

As shown in FIG. 2, the top layer of the metadata heap is the current basic shape storing the metadata set currently running on node 1. Starting from the top layer, the rightmost column of the metadata heap (indicated by C) holds the metadata of all nodes in the current cluster (node 1, node 2, and node 3), that is, the metadata currently running on all nodes. The column indicated by D in FIG. 2 includes the degenerate basic shape storing the degraded data 1' of node 1 and the degenerate basic shape storing the degraded data of node 2, and constitutes the first level of degraded data. The degenerate basic shape 15 storing the degraded data 1" of node 1 in FIG. 2 is the second level of degraded data. The metadata heaps described in the subsequent embodiments have the same properties.
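As a hedged illustration of how the columns of FIG. 2 could be read out of such a heap, the sketch below follows first leaf nodes to collect the current column and follows second leaf nodes a fixed number of times to collect one degradation level. The class `Shape`, the function names, and the string placeholders for metadata sets are hypothetical.

```python
class Shape:
    """Minimal basic shape; attribute names are illustrative."""
    def __init__(self, node_id, metadata):
        self.node_id = node_id
        self.metadata = metadata
        self.first_leaf = None    # next node's current basic shape
        self.second_leaf = None   # next-older degenerate basic shape


def current_column(top):
    """Follow first leaves from the top layer: the column marked C."""
    column, shape = [], top
    while shape is not None:
        column.append(shape)
        shape = shape.first_leaf
    return column


def degraded_level(top, level):
    """Collect the level-th degraded data of every node that has it."""
    result = {}
    for shape in current_column(top):
        older = shape
        for _ in range(level):
            if older is None:
                break
            older = older.second_leaf
        if older is not None:
            result[shape.node_id] = older.metadata
    return result


# Heap inside node 1 as in FIG. 2: node 1 has 1' and 1", node 2 has 2', node 3 has none.
n1, n2, n3 = Shape(1, "1"), Shape(2, "2"), Shape(3, "3")
n1.first_leaf, n2.first_leaf = n2, n3
n1.second_leaf = Shape(1, "1'")
n1.second_leaf.second_leaf = Shape(1, '1"')
n2.second_leaf = Shape(2, "2'")
```

Here `degraded_level(n1, 1)` corresponds to the column marked D, and `degraded_level(n1, 2)` contains only the entry for node 1, since FIG. 2 shows no second-level data for the other nodes.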

The above describes in detail the structure of the metadata heap within a node and how metadata is organized through the metadata heap. On this basis, the following embodiments describe how the metadata is managed according to the metadata heap, for example, how the metadata stored in the metadata heap is kept consistent across the nodes of the cluster, how clusters are split and combined, and how a cluster implements metadata redundancy and fault tolerance.

Embodiment 1

FIG. 3 is a schematic flowchart of an embodiment of the metadata management method according to the present invention. The method may be performed by a node in a cluster. This embodiment describes the method only briefly; for the underlying principle, refer to the composition of the intra-node metadata heap described above. As shown in FIG. 3, the method may include:

301. Store metadata as a metadata heap, where the metadata includes the metadata set currently running on the node itself and the metadata sets currently running on all other nodes in the cluster;

The metadata heap includes at least two basic shapes, each of which is a binary tree shape formed by a vertex, a first leaf node, and a second leaf node. Storing the metadata as a metadata heap includes:

storing the metadata set currently running on the node itself in a current basic shape located at the top layer of the metadata heap, where the first leaf node of this current basic shape is used to connect another current basic shape storing the metadata set currently running on another node, the first leaf node of that other current basic shape is used to connect a further current basic shape storing the metadata set currently running on a further node, and so on until all nodes in the cluster are covered;

among the current basic shapes storing the metadata sets currently running on the nodes in the cluster, the second leaf node of each current basic shape is used to connect a degenerate basic shape storing the metadata set that the node corresponding to that current basic shape ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on.

302. Perform management of the metadata according to the metadata heap.

The management of the metadata includes, for example, metadata management when the nodes of the cluster perform consistency synchronization, metadata management during cluster splitting and combining, and redundancy and fault-tolerance management of cluster metadata.
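Step 301 can be sketched as follows, as one possible reading rather than the claimed implementation. The function `build_heap`, the class `Shape`, and the choice of ordering the other nodes by ascending identifier after the node's own shape are assumptions; the description only requires that the node's own current basic shape is at the top and that the first leaf nodes chain through all nodes of the cluster.

```python
class Shape:
    """Minimal basic shape; attribute names are illustrative."""
    def __init__(self, node_id, metadata):
        self.node_id = node_id
        self.metadata = metadata
        self.first_leaf = None    # next node's current basic shape
        self.second_leaf = None   # degenerate basic shape, if any


def build_heap(self_id, current_sets):
    """Store {node_id: metadata set} as a heap with self_id at the top layer."""
    ids = sorted(current_sets)
    start = ids.index(self_id)
    order = ids[start:] + ids[:start]          # assumed logical order, own node first
    shapes = [Shape(i, current_sets[i]) for i in order]
    for left, right in zip(shapes, shapes[1:]):
        left.first_leaf = right                # chain the current basic shapes
    return shapes[0]


def node_order(top):
    """List node identifiers along the chain of first leaf nodes."""
    order = []
    while top is not None:
        order.append(top.node_id)
        top = top.first_leaf
    return order


sets = {1: {"io": "a"}, 2: {"io": "b"}, 3: {"io": "c"}}
heap_in_node_2 = build_heap(2, sets)
```

With these inputs the heap built inside node 2 chains the nodes in the order 2, 3, 1, while the heap inside node 1 would chain them as 1, 2, 3.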

In this embodiment of the present invention, when the primary node in the cluster is powered off, the method may further include: determining, according to a predetermined rule, that the current node is to be the primary node; and modifying the active/standby identifier of the current node to active. Specifically, the active/standby identifier may be a flag stored inside the node. After the identifier is modified to active, the node becomes the primary node and can manage the metadata in the cluster. Because this node likewise holds the metadata heap, which stores the metadata sets of all nodes in the cluster, the data migration required in the prior art is avoided.

In the metadata management method of this embodiment, each node stores the metadata sets currently running on all nodes in the cluster, so no data migration between the active and standby nodes is needed during an active/standby switchover, and the switchover in the cluster can be executed quickly.
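The switchover can be pictured with the following minimal sketch. The class `Node`, the function `switch_over`, and in particular the rule "the surviving node with the lowest identifier becomes primary" are assumptions standing in for the unspecified predetermined rule; the heap is reduced to a plain dictionary of current metadata sets.

```python
class Node:
    """A cluster node holding the full set of current metadata (illustrative)."""
    def __init__(self, node_id, cluster_metadata, is_primary=False):
        self.node_id = node_id
        self.is_primary = is_primary          # active/standby flag
        self.heap = dict(cluster_metadata)    # every node stores all current sets


def switch_over(nodes, failed_id):
    """Pick a new primary among the survivors and flip the flags only."""
    survivors = [n for n in nodes if n.node_id != failed_id]
    new_primary = min(survivors, key=lambda n: n.node_id)   # assumed rule
    for n in nodes:
        n.is_primary = n is new_primary
    return new_primary


cluster_metadata = {1: {"lun": 4}, 2: {"lun": 7}, 3: {"lun": 9}}
nodes = [Node(i, cluster_metadata, is_primary=(i == 1)) for i in (1, 2, 3)]
new_primary = switch_over(nodes, failed_id=1)
```

No metadata is copied during `switch_over`: the new primary already holds the sets of every node, so only the flag changes.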

In addition, since each node internally stores the metadata sets of all nodes in the cluster, including the currently running metadata sets and the metadata sets that ran at previous time points (that is, degraded data), the metadata of any node of the cluster can be retrieved from the metadata heap of any single node; and if degraded data is stored, the degraded data of the cluster, that is, the degraded data of all nodes of the entire cluster, can also be retrieved. The metadata organization structure of this embodiment therefore makes metadata retrieval very convenient.

Embodiment 2

FIG. 4 is a schematic diagram of the principle of another embodiment of the metadata management method according to the present invention. This embodiment explains how the nodes in the cluster maintain data consistency, that is, how configuration data is synchronized across the nodes.

As shown in FIG. 4, the cluster includes four nodes, namely node 1, node 2, node 3, and node 4, and the figure shows the form of the metadata heap inside each node; the top layer of each metadata heap holds the metadata set currently running on that node. It should be noted that, in FIG. 4, the two levels of degraded data 1' and 1" of node 1 are stored in the metadata heap of node 1, only one level of degraded data 1' of node 1 is stored in the metadata heap of node 3, and no degraded data of node 1 is stored in the metadata heap of node 2. FIG. 4 is only an example; as explained above, the two levels of degraded data 1' and 1" of node 1 may also be stored in node 2 and node 3.

In general, this embodiment may be configured so that a node must save its own degraded data; for example, the metadata heap of node 1 must store 1' and 1". The degraded data of other nodes, however, may be saved selectively. For example, node 1 can selectively save the degraded data of node 3: node 3 actually has two levels of degraded data 3' and 3", but node 1 saves only one level of degraded data 3'. This is acceptable because degraded data of other nodes that is not saved locally can still be obtained from the metadata heaps of those nodes themselves. In summary, this embodiment may be configured so that each node must save its own currently running metadata and its own degraded data, must also save the currently running metadata of the other nodes, and may optionally save the degraded data of the other nodes.

In addition, the degraded data is stored in chronological order. For example, the top of the metadata heap in node 1 stores the currently running metadata set, 1' stores the metadata set that ran at time point T1, and 1" stores the metadata set that ran at time point T2, so that the current time point (for example, 10 o'clock), time point T1 (for example, 9 o'clock), and time point T2 (for example, 8 o'clock) run successively further back in time. If the current metadata set then changes and needs to be saved, the metadata set running at the current time point (10 o'clock) is stored to 1', the metadata set of time point T1 (9 o'clock) originally stored at 1' moves back to 1", and, likewise, the metadata set of time point T2 (8 o'clock) originally stored at 1" moves back to a newly established degenerate basic shape 1"'.

In addition, the number of levels of degraded data to be saved can be set autonomously. Continuing the above example, if the preset is to save two levels of degraded data, that is, 1' and 1", there is no need to create a new degenerate basic shape 1"', and the metadata set of time point T2 (8 o'clock) originally stored at 1" is simply discarded. In addition, to make degraded data easier to look up, the running time point corresponding to each piece of degraded data can be saved together with it; for example, the above time points T1 (for example, 9 o'clock) and T2 (for example, 8 o'clock) are stored as well.
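The shifting of degraded data, together with a configurable number of levels, can be sketched as below. The names `Shape`, `save_change`, `chain`, and the parameter `max_levels` are illustrative, and the time point 11 assigned to the new metadata set is an assumption added so that the example has a fourth value to store.

```python
class Shape:
    """A shape of one node, carrying its running time point (illustrative)."""
    def __init__(self, metadata, time_point):
        self.metadata = metadata
        self.time_point = time_point
        self.second_leaf = None   # next-older degenerate basic shape


def save_change(current, new_metadata, now, max_levels):
    """Demote the current set one level and keep at most max_levels old sets."""
    demoted = Shape(current.metadata, current.time_point)
    demoted.second_leaf = current.second_leaf      # older sets move back one level
    current.metadata, current.time_point = new_metadata, now
    current.second_leaf = demoted
    shape = current
    for _ in range(max_levels):                    # walk to the deepest level kept
        if shape.second_leaf is None:
            return
        shape = shape.second_leaf
    shape.second_leaf = None                       # discard anything older


def chain(current):
    """List (time point, metadata) from the current set to the oldest one."""
    out = []
    while current is not None:
        out.append((current.time_point, current.metadata))
        current = current.second_leaf
    return out


# Current set at 10 o'clock, 1' from 9 o'clock (T1), 1" from 8 o'clock (T2).
current = Shape("set@10", 10)
current.second_leaf = Shape("set@9", 9)
current.second_leaf.second_leaf = Shape("set@8", 8)
save_change(current, "set@11", now=11, max_levels=2)
```

After the call, the 10 o'clock set occupies 1', the 9 o'clock set occupies 1", and the 8 o'clock set has been discarded because only two levels are kept; with `max_levels=3` it would instead move to a new shape 1"'.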

The following describes how the four nodes synchronize metadata. A cluster usually includes a primary node and standby nodes, and if some part of the metadata needs to be changed, the change usually starts from the primary node, that is, the primary node changes first.

Assume that in the cluster shown in FIG. 4, node 1 is the primary node and a change to I/O-related metadata is initiated from node 1 to each node. The nodes in the cluster can communicate with each other, so node 1 may change the corresponding I/O metadata in the other nodes one after another in a certain logical order, or the changes on the other nodes may be performed concurrently. This is called "first-layer synchronization": the metadata at the top layer of the metadata heap in each node is changed.

After the first-layer synchronization is completed, the metadata heap of each node is updated internally: for example, node 1 synchronizes the metadata of the other nodes in its metadata heap in the order 1->2->3->4, node 2 synchronizes the metadata of the other nodes in its metadata heap in the order 2->3->4->1, and so on. The internal update of the metadata heap means the following. After the first-layer synchronization, the metadata stored at the top layer of the metadata heaps of node 2, node 3, and node 4 has changed, and their currently running metadata sets are no longer the metadata sets from before the first-layer synchronization. Therefore, the current basic shapes corresponding to node 2, node 3, and node 4 in the metadata heap inside node 1 (in the metadata heap labeled "1" in FIG. 4, the basic shapes labeled "2", "3", and "4") must also be updated so that their data stays consistent with the data currently running on the corresponding nodes.

Take the internal update of node 1 as an example. Since the nodes in the cluster communicate with each other, node 1 can acquire the updated metadata set currently running on a first node other than itself in the cluster (the first node being node 2, node 3, or node 4); this set can be obtained from the current basic shape at the top of the metadata heap in the first node. The obtained updated metadata set is then stored in the current basic shape corresponding to the first node in the metadata heap of node 1. For example, after node 1 obtains the updated data from the top of the metadata heap of node 2, it stores the data in the current basic shape 20 labeled "2" in its own metadata heap; the other nodes are handled similarly.

Optionally, each node can also save the degraded data of other nodes. For example, after the top layer of the metadata heap of node 2 is updated to a new metadata set, the metadata set stored at the top layer before the first-layer synchronization becomes the degraded data of a time point before the current time point, and node 2 saves this degraded data in a degenerate basic shape in its own metadata heap, for example in the degenerate basic shape 21 labeled 2'. Node 1 can acquire the degraded data from the degenerate basic shape 21 of node 2 and store it in the metadata heap of node 1 itself, specifically in the degenerate basic shape 22 corresponding to node 2; the degenerate basic shape 22 is connected to the second leaf node of the current basic shape 20 corresponding to node 2 (the left node as seen in FIG. 4). If node 1 previously had no degenerate basic shape for node 2, node 1 creates a new degenerate basic shape 22, stores the degraded data in it, and connects it to the second leaf node of the current basic shape 20.

Through the above processing, data synchronization across the nodes of the cluster is achieved, and data consistency in this embodiment is simple to implement.
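The two synchronization phases can be sketched as follows. To keep the example short, each heap is modeled as a dictionary that maps a node identifier to a list, where index 0 stands for the current basic shape and index 1 for the first-level degenerate basic shape; this list model, the class `ClusterNode`, the key `io_depth`, and the choice to copy only the first degradation level from peers are all assumptions made for illustration.

```python
class ClusterNode:
    """One node with a simplified heap: heap[i][0] is node i's current set."""
    def __init__(self, node_id, all_ids):
        self.node_id = node_id
        self.heap = {i: [{"io_depth": 8}] for i in all_ids}

    def apply_change(self, change):
        # First-layer synchronization: change the node's own top layer;
        # the previous set becomes the node's first-level degraded data.
        own = self.heap[self.node_id]
        own.insert(0, {**own[0], **change})

    def pull_from(self, peer, keep_degraded=False):
        # Internal update: copy the peer's top layer into the shape kept for it.
        theirs = peer.heap[peer.node_id]
        mine = self.heap[peer.node_id]
        mine[0] = dict(theirs[0])
        if keep_degraded and len(theirs) > 1:
            mine[1:2] = [dict(theirs[1])]   # create or overwrite the first level


def synchronize(nodes, change, keep_degraded=False):
    for node in nodes:                                # first-layer synchronization
        node.apply_change(change)
    for idx, node in enumerate(nodes):                # e.g. node 2 uses 2->3->4->1
        for peer in nodes[idx + 1:] + nodes[:idx]:
            node.pull_from(peer, keep_degraded)


nodes = [ClusterNode(i, [1, 2, 3, 4]) for i in (1, 2, 3, 4)]
synchronize(nodes, {"io_depth": 16}, keep_degraded=True)
```

After `synchronize` returns, every node holds the same current set for every node of the cluster, and, because `keep_degraded=True` was chosen here, node 1 also holds the pre-change set of node 2 as first-level degraded data.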

Embodiment 3

FIG. 5 is a schematic diagram of another embodiment of a metadata management method according to the present invention. This embodiment illustrates how the metadata heap implements cluster splitting.

As shown in FIG. 5, the cluster again includes four nodes, namely node 1, node 2, node 3, and node 4. The cluster is split into two two-node clusters: a new cluster consisting of node 1 and node 2, and a new cluster consisting of node 3 and node 4. After the cluster is split, the membership of each new cluster has changed, so the metadata heap inside each node of the new cluster must change accordingly. For example, the new cluster consisting of node 1 and node 2 does not include node 3 and node 4, so in the metadata heap inside node 1 the current basic shape corresponding to node 3 should no longer be connected to the current basic shape of node 2: two connected current basic shapes in a metadata heap indicate that the corresponding nodes belong to the same cluster, so the current basic shapes corresponding to node 3 and node 2 must be disconnected.

Specifically, referring to FIG. 5, when the cluster is split, the metadata sets corresponding to node 3 and node 4 need to be cut away from the metadata heaps inside node 1 and node 2, because node 3 and node 4 no longer belong to the new cluster consisting of node 1 and node 2; and the metadata sets corresponding to node 1 and node 2 need to be connected, because in the new cluster the two nodes communicate with each other and must therefore also be connected in the metadata heap.

For example, in node 1, the connection between the current basic shape 31 of node 3 and the current basic shape 32 of node 2 is disconnected. The current basic shape 31 is connected at the first leaf node b of the current basic shape 32, and during splitting this connection is broken (the slanted cut line shown in FIG. 5 indicates the disconnection). Of course, as described above, the degenerate basic shape 33 storing the degraded data of node 3 and the degenerate basic shape 34 of node 2 are not actually connected; the figure merely draws the second leaf node of the degenerate basic shape 33 overlapping the first leaf node of the degenerate basic shape 34. Once the connection at the first leaf node b of the current basic shape 32 is broken, the basic shapes corresponding to node 3 and node 4, on the right side of the cut line, are removed from the metadata heap as a whole, that is, they no longer belong to the metadata heap, because node 3 and node 4 no longer belong to the new cluster in which node 1 is located.

As a further example, in node 2, the connection of the current basic shape 31 of node 3 at the first leaf node b of the current basic shape 32 of node 2 (which may be referred to as the second node) likewise needs to be disconnected. In addition, the connection at the first leaf node b between the current basic shape 35 of node 4 and the current basic shape 36 of node 1 (which may be referred to as the third node) also needs to be disconnected, since node 4 no longer belongs to the new cluster to which node 1 belongs and no longer has a communication connection with node 1. Then, referring to FIG. 5, the current basic shape 36 of node 1 also needs to be connected at the first leaf node b of the current basic shape 32 of node 2, because node 1 and node 2 now form a new cluster and the current basic shapes of the two nodes must therefore be connected.

The splitting of the metadata heaps in the new cluster consisting of node 3 and node 4 follows the same principle as for node 1 and node 2 and is not described again; see FIG. 5. It can be seen that the metadata heap can be cut and divided with the basic shape as the minimum granularity.
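A minimal sketch of the splitting operation is given below. The helper names (`Shape`, `link`, `order`, `split_heap`) are hypothetical, and the sketch assumes that the node owning the heap always remains a member of its new cluster.

```python
class Shape:
    """Minimal basic shape; attribute names are illustrative."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.first_leaf = None    # next node's current basic shape
        self.second_leaf = None   # degenerate basic shape, if any


def link(ids):
    """Build a chain of current basic shapes in the given order."""
    shapes = [Shape(i) for i in ids]
    for left, right in zip(shapes, shapes[1:]):
        left.first_leaf = right
    return shapes[0]


def order(top):
    out = []
    while top is not None:
        out.append(top.node_id)
        top = top.first_leaf
    return out


def split_heap(top, members):
    """Cut every first-leaf link, then reconnect only the remaining members."""
    kept, shape = [], top
    while shape is not None:
        nxt = shape.first_leaf
        shape.first_leaf = None            # disconnect at the first leaf node b
        if shape.node_id in members:
            kept.append(shape)             # degenerate shapes stay attached via leaf c
        shape = nxt
    for left, right in zip(kept, kept[1:]):
        left.first_leaf = right            # e.g. reconnect node 1 under node 2
    return kept[0]


heap_in_node_1 = split_heap(link([1, 2, 3, 4]), members={1, 2})
heap_in_node_2 = split_heap(link([2, 3, 4, 1]), members={1, 2})
```

In the heap of node 2 the shapes of node 3 and node 4 drop out together with anything hanging from their second leaf nodes, and the shape of node 1 is reattached at the first leaf node of the shape of node 2, as in the example of FIG. 5.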

Embodiment 4

FIG. 6 is a schematic diagram of another embodiment of a metadata management method according to the present invention. This embodiment is to explain how the metadata heap implements cluster combination.

As shown in FIG. 6, two two-node clusters are combined into one four-node cluster. For example, a cluster consisting of node 1 and node 2 and a cluster consisting of node 3 and node 4 are combined into a new cluster that includes node 1, node 2, node 3, and node 4; this is equivalent to the reverse of the cluster splitting process described above.

When the clusters are combined, referring to FIG. 6, the metadata sets of node 3 and node 4 need to be added to the metadata heaps of node 1 and node 2, and the metadata sets of node 1 and node 2 need to be added to the metadata heaps of node 3 and node 4, so that the metadata heap inside each node of the new cluster includes at least the currently running metadata set of every node.

For example, node 1 can obtain the metadata set currently running on node 3 from the top layer of the metadata heap inside node 3, and obtain the metadata set currently running on node 4 from the top layer of the metadata heap inside node 4. Node 1 can then establish, in its own metadata heap, a current basic shape 41 corresponding to node 3 and a current basic shape 42 corresponding to node 4. It stores the metadata set currently running on node 3 in the current basic shape 41 and connects the current basic shape 41 at the first leaf node b of the current basic shape 43 of node 2 (this first leaf node b is idle and not yet connected to anything); it stores the metadata set currently running on node 4 in the current basic shape 42 and connects the current basic shape 42 at the first leaf node b of the current basic shape 41 of node 3.

The metadata processing of node 2, node 3, and node 4 during cluster combination is similar to the above and is not described again; see FIG. 6 for details. The degraded data of each node is saved optionally; for example, node 1 may choose to save the first-level degraded data 3' of node 3, and node 2 may choose to save the two levels of degraded data 3' and 3" of node 3. Of course, each node in the cluster can also save the currently running metadata sets and the degraded metadata sets of all nodes, so that the data stored in the metadata heaps of all nodes is identical.

Through the above processing, the metadata heap supports cluster form transformations (such as splitting and combining). Moreover, since the metadata heap is designed on the basis of fractal theory, its basic shape lends itself well to combination and division: a cluster can be divided into groups of any number of nodes, the metadata heap can be split or combined according to the above principle, and cluster splitting and combination can be implemented very simply. In addition, metadata management based on fractal theory is not limited by the number of nodes: because the metadata in each node is stored fractally (that is, in basic shapes), in theory there is no upper limit on the number of cluster nodes as long as each node has enough memory and disk space to store the metadata heap; and because cluster splitting and combination can be implemented quickly and easily, the impact on cluster performance is small as the number of nodes increases.
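The combining operation can be sketched in the same style. The function `join_heap` and the other names are hypothetical; the sketch copies only the currently running metadata set of each joining node, which is the minimum the description requires, and leaves optional degraded data aside.

```python
class Shape:
    """Minimal basic shape; attribute names are illustrative."""
    def __init__(self, node_id, metadata):
        self.node_id = node_id
        self.metadata = metadata
        self.first_leaf = None
        self.second_leaf = None


def order(top):
    out = []
    while top is not None:
        out.append(top.node_id)
        top = top.first_leaf
    return out


def join_heap(top, peer_tops):
    """Append a current basic shape for each joining node to the local heap."""
    tail = top
    while tail.first_leaf is not None:     # find the shape whose leaf b is idle
        tail = tail.first_leaf
    for peer_top in peer_tops:
        # Copy the set read from the top layer of the joining node's own heap.
        new_shape = Shape(peer_top.node_id, dict(peer_top.metadata))
        tail.first_leaf = new_shape
        tail = new_shape
    return top


# Heap inside node 1 before combination: node 1 -> node 2.
heap_1 = Shape(1, {"v": 1})
heap_1.first_leaf = Shape(2, {"v": 2})
# Top layers of the heaps inside node 3 and node 4.
top_of_3, top_of_4 = Shape(3, {"v": 3}), Shape(4, {"v": 4})
join_heap(heap_1, [top_of_3, top_of_4])
```

After the call the heap inside node 1 chains the nodes in the order 1, 2, 3, 4, which mirrors the connection of the current basic shapes 41 and 42 described above.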

Embodiment 5

FIG. 7 is a schematic diagram of another embodiment of the metadata management method according to the present invention. This embodiment describes the principle by which the metadata heap implements snapshots.

As shown in FIG. 7, take the second-level degraded data 51 inside node 1 as an example. The second-level degraded data 51 includes the second-level degraded data of each node of the cluster (which may be referred to as the metadata sets running at a second time point), for example, the degraded data 1" of node 1, the degraded data 2" of node 2, and so on. Node 1 can store the second-level degraded data 51 as a whole, which is called a "snapshot". Similarly, the first-level degraded data 53 (which may be referred to as the metadata sets running at a first time point) and the current running data 52, that is, the currently running metadata sets, may also each be stored as a whole.

When the second-level degraded data 51 needs to be run, that is, when the current running data 52 as a whole (including the metadata set currently running on each node) is to be replaced by the second-level degraded data 51 as a whole, the second-level degraded data 51 stored in the earlier snapshot is moved to the position of the current running data 52; in other words, the second-level degraded data 51 is stored in the current basic shapes of the metadata heap. This is called "rolling forward" (that is, the data moves to the position of a more recent time point). Alternatively, the second-level degraded data 51 may simply be run in place, which is equivalent to selecting it for temporary use: its position is not moved, and it remains stored at the position shown in FIG. 7.

Where the second-level degraded data 51 as a whole replaces the current running data 52 and moves to the position of the current running data 52, the movement of the second-level degraded data 51 may be referred to as "rolling forward". Correspondingly, the current running data 52 must also move, for example, to the former position of the second-level degraded data 51; that is, the metadata sets running in the current basic shapes of the metadata heap are stored as a whole in the degenerate basic shapes that previously held the second-level degraded data 51, so that the two swap positions. This movement of the current running data 52 may be referred to as "rolling back".
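The position swap can be sketched as follows, again with each heap modeled as a dictionary of lists in which index 0 is the current set and index k is the level-k degraded data. This list model and the function names `snapshot` and `swap_with_current` are assumptions; nodes that lack the requested level are simply left unchanged in this sketch.

```python
def snapshot(heap, level):
    """Return one level of every node as a whole (level 0 is the current data)."""
    return {nid: dict(sets[level]) for nid, sets in heap.items() if len(sets) > level}


def swap_with_current(heap, level):
    """Roll the chosen level forward and roll the current data back to its place."""
    for sets in heap.values():
        if len(sets) > level:
            sets[0], sets[level] = sets[level], sets[0]


heap = {
    1: [{"v": "1"}, {"v": "1'"}, {"v": '1"'}],
    2: [{"v": "2"}, {"v": "2'"}, {"v": '2"'}],
}
second_level = snapshot(heap, 2)   # plays the role of the degraded data 51
swap_with_current(heap, 2)
```

After the swap, the sets that formed the second-level snapshot occupy the current positions, the former current sets occupy the second level, and the first level is untouched.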

Through the snapshot implementation described above, data rollback, roll-forward, and similar operations can be processed more quickly.

Embodiment 6

FIG. 8 is a first schematic diagram of another embodiment of the metadata management method according to the present invention, and FIG. 9 is a second schematic diagram of that embodiment. This embodiment illustrates how the metadata heap implements redundancy and fault tolerance.

Referring to FIG. 8, assume that part of the metadata set currently running on node 1 (that is, the data stored in the current basic shape 61 at the top of the metadata heap inside node 1) is lost or damaged and needs to be repaired; this part of the data is called the "metadata to be repaired". Node 1 can first check its own degenerate basic shapes, such as the degenerate basic shape 62, to see whether they contain repair data corresponding to the metadata to be repaired (that is, the data as it was before the loss or damage). If so, the repair data can be obtained immediately and used to replace the metadata to be repaired in the current basic shape, so that node 1 repairs its metadata by itself.

Alternatively, if node 1 cannot find the repair data in its own degenerate basic shapes, it can obtain the repair data directly from the metadata heaps of other nodes. For example, referring to FIG. 9, node 1 can obtain the repair data from the current basic shape 63 corresponding to node 1 in node 2.

In this embodiment, the metadata of each node is stored redundantly: the metadata set currently running on a node is also stored in that node's own degenerate basic shapes, and it is stored in other nodes as well; for example, another node in the cluster, such as node 2, stores the metadata set currently running on node 1. The metadata of node 1 therefore effectively has multiple backups. Node 1 can repair data in multiple ways, which improves the fault tolerance and redundancy level of the data and provides stronger safety assurance.
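The two repair paths can be sketched as below. The sketch assumes that the metadata to be repaired has already been identified by a key, that a metadata set is a dictionary, and that heaps use the same list model as a stand-in for basic shapes (index 0 current, later indices degraded); the function `repair` and the sample keys are hypothetical.

```python
def repair(own_heap, self_id, key, peer_heaps):
    """Restore own_heap[self_id][0][key] from a redundant copy, if one exists."""
    current = own_heap[self_id][0]
    for older in own_heap[self_id][1:]:          # own degenerate basic shapes first
        if key in older:
            current[key] = older[key]
            return "own degenerate basic shape"
    for peer_heap in peer_heaps:                 # then the copies kept by other nodes
        copy = peer_heap[self_id][0]             # peer's current basic shape for this node
        if key in copy:
            current[key] = copy[key]
            return "peer current basic shape"
    return None                                  # no repair data found anywhere


# Case of FIG. 8: the entry "pool" was lost, and node 1 still has it in a degraded set.
heap_a = {1: [{"lun": 4}, {"lun": 4, "pool": "p0"}]}
source_a = repair(heap_a, 1, "pool", peer_heaps=[])

# Case of FIG. 9: node 1 has no degraded data, and node 2 holds a copy of node 1's set.
heap_b = {1: [{"lun": 4}]}
heap_in_node_2 = {1: [{"lun": 4, "pool": "p0"}], 2: [{"lun": 7}]}
source_b = repair(heap_b, 1, "pool", peer_heaps=[heap_in_node_2])
```

Whether a value taken from an older degenerate basic shape is still valid for the current set depends on what changed in between; the sketch does not attempt to check this.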

Embodiment 7

This embodiment mainly illustrates that managing metadata through the metadata heap makes data changes more flexible and allows a more flexible cluster read/write lock to be implemented.

For example, assume that part of the data in the currently running metadata set of node 1, held in the metadata heap inside node 1, is to be modified. According to the prior art, the currently running metadata set of node 1 would be locked, which is equivalent to suspending the current operation of node 1: other nodes can neither read nor write the current running data of node 1, and normal reading and writing cannot resume until the modification is completed. In this embodiment, however, the current running data of node 1 is also stored redundantly in the degenerate basic shapes; for example, the first-level degenerate basic shape of node 1 contains the part of the metadata to be modified. Node 1 may therefore lock and modify only that degenerate basic shape. Other nodes cannot read or write the data in the degenerate basic shape during this time, but the data stored in the current basic shape of node 1 is unaffected. After the modification is completed, the modified new metadata simply replaces the corresponding data in the current basic shape.

As can be seen from the above, the metadata heap of this embodiment makes data changes more flexible without affecting the operation of the cluster.
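The locking idea can be sketched on a single machine with Python's `threading.Lock`. This is only an analogy for the cluster read/write lock, which the description does not specify; the class `NodeMetadata`, the `observer` callback used to look at the state while the lock is held, and the final replacement by reference assignment are assumptions made for illustration.

```python
import threading


class NodeMetadata:
    """Current set plus a redundant first-level copy with its own lock (illustrative)."""
    def __init__(self, current):
        self.current = dict(current)
        self.degraded = dict(current)            # redundant copy in a degenerate shape
        self.degraded_lock = threading.Lock()    # guards the degenerate copy only

    def read_current(self, key):
        # Readers of the current basic shape never take degraded_lock.
        return self.current[key]

    def modify(self, key, value, observer=None):
        with self.degraded_lock:                 # lock only the degenerate basic shape
            self.degraded[key] = value
            if observer is not None:
                observer(self)                   # inspect the state mid-modification
            staged = dict(self.degraded)
        self.current = staged                    # replace the current data afterwards


seen = []
meta = NodeMetadata({"io_depth": 8})
meta.modify(
    "io_depth", 16,
    observer=lambda m: seen.append((m.degraded_lock.locked(), m.read_current("io_depth"))),
)
```

While the degenerate copy is locked, `read_current` still returns the old value without blocking, and the new value becomes visible only after the modification has finished.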

Embodiment 8

FIG. 10 is a schematic diagram of another embodiment of the metadata management method according to the present invention. As shown in FIG. 10, the metadata sets stored in the current basic shapes corresponding to the nodes of the cluster may be kept in memory, and the metadata sets stored in the degenerate basic shapes may be kept on storage media other than memory, to reduce memory usage.

For example, referring to FIG. 10, the metadata heap inside node 1 can keep the metadata of each current basic shape in memory (Mem in the figure denotes memory), while the degraded data, that is, the first-level degraded data 71 and the second-level degraded data 72 in FIG. 10, exists for redundancy and fault tolerance and can therefore be placed on other, slower storage media; for example, the first-level degraded data 71 is placed in a cache, and the second-level degraded data 72 is placed on a solid state disk (SSD) or a disk (DISK), which reduces memory usage. In this storage scheme, data of different degradation levels is stored on different storage media.
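The mapping from degradation level to storage medium can be written down as a small lookup. The tier names and the rule that levels deeper than the second stay on the slowest tier are assumptions; the description gives only the three placements shown in FIG. 10.

```python
# Level 0 is the current data; levels 1 and 2 follow the example of FIG. 10.
TIERS = ["memory", "cache", "ssd_or_disk"]


def medium_for(level):
    """Return the storage medium for a degradation level (0 = current data)."""
    if level < 0:
        raise ValueError("degradation level must not be negative")
    # Assumed rule: anything older than the second level stays on the slowest tier.
    return TIERS[min(level, len(TIERS) - 1)]


placement = {level: medium_for(level) for level in range(4)}
```

A real system would likely also move a set between media when it is demoted from one level to the next; that step is outside this sketch.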

Embodiment 9

FIG. 11 is a schematic structural diagram of an embodiment of a metadata management apparatus according to the present invention. The metadata management apparatus can execute the metadata management method of any embodiment of the present invention. The apparatus is, for example, a control module provided in each node of a cluster; it can be used to store the metadata of the cluster and can also be used for metadata management such as cluster splitting, cluster combining, or metadata repair. This embodiment describes the structure of the apparatus only briefly; for the specific working principle of each functional unit, refer to any of the method embodiments of the present invention.

As shown in FIG. 11, the metadata management apparatus of this embodiment may include: a storage unit 91 and a management unit 92;

The storage unit 91 is configured to store the metadata in the cluster as a metadata heap, where the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all nodes in the cluster other than the current node. The metadata heap includes at least two basic shapes, each basic shape being a binary tree shape composed of a vertex, a first leaf node, and a second leaf node. Storing the metadata as a metadata heap includes:

storing the metadata set currently running on the current node itself in a current basic shape at the top layer of the metadata heap, where the first leaf node of this current basic shape is used to connect another current basic shape storing the metadata set currently running on another node, the first leaf node of that other current basic shape is used to connect a further current basic shape storing the metadata set currently running on a further node, and so on until all nodes in the cluster are connected;

among the current basic shapes storing the metadata sets currently running on the nodes in the cluster, the second leaf node of each current basic shape is used to connect a degenerate basic shape storing the metadata set that the node corresponding to that current basic shape ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on.

The management unit 92 is configured to perform management of the metadata according to the metadata heap.

Further, the management unit 92 may include a synchronization subunit 921, configured to acquire, after the metadata in the metadata heap is updated, the updated metadata set currently running on a node in the cluster other than the current node, and to store the updated metadata set in the current basic shape corresponding to that node in the metadata heap.

Further, the management unit 92 may include a storage subunit 922, configured to store the degraded data corresponding to the node in the metadata heap in a degenerate basic shape, the degraded data being the data stored in the current basic shape before the updated metadata set, and to connect the degenerate basic shape to the second leaf node of the current basic shape.

Further, the storage unit 91 is specifically configured to connect the current basic shape of a first node to the first leaf node of the current basic shape of a second node, and to connect the current basic shape of a third node to the first leaf node of the current basic shape of the first node. Correspondingly, the management unit 92 may include a form control subunit 923. The form control subunit 923 is configured to: when another node other than the current node itself is split off from the cluster, disconnect the other current basic shape corresponding to that other node from the first leaf node of the current basic shape of the current node, disconnect that other current basic shape from the further current basic shape, corresponding to a further node, that is connected at its first leaf node, and connect the further current basic shape of the further node to the first leaf node of the current basic shape of the current node. The form control subunit 923 is also configured to: when a new node is added to the cluster, acquire the metadata set currently running on the new node from the metadata heap stored in the new node; establish a current basic shape corresponding to the new node in the metadata heap of the current node; store the currently running metadata set of the new node in that current basic shape; and connect the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is idle.

Further, the management unit 92 may include: a snapshot subunit 924, a selection subunit 925, and a migration subunit 926;

The snapshot subunit 924 is configured to separately store the entirety of the metadata sets run by the nodes in the cluster at each time point, where the time points include the current time point, the first time point, and the second time point, so that the metadata sets corresponding to a certain one of the time points can be run.

The selection subunit 925 is configured to select, for running, the entirety of the metadata sets corresponding to a certain time point among the time points, where the certain time point is a time point other than the current time, and the entirety of those metadata sets is stored in the degenerate basic shapes of the metadata heap;

The migration subunit 926 is configured to store the entirety of the metadata sets corresponding to the certain time point in the respective current basic shapes of the metadata heap, and to store the entirety of the metadata sets that were running in the current basic shapes of the metadata heap in the degenerate basic shapes of the metadata heap.
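As a non-limiting illustration, the cooperation of the selection subunit 925 and the migration subunit 926 may be sketched as an exchange between current and degenerate storage. The dictionary layout and the function name `migrate_to_time_point` are assumptions; in particular, the embodiment does not state which degenerate basic shape receives the displaced current set, and the sketch simply swaps it into the selected position.

```python
def migrate_to_time_point(heap, index):
    """Swap every node's current set with its degenerate set at `index`.

    `heap` maps node id -> {"current": set, "degenerate": [newest, ..., oldest]},
    so index 0 is the first time point and index 1 the second time point.
    """
    for shapes in heap.values():
        selected = shapes["degenerate"][index]
        shapes["degenerate"][index] = shapes["current"]  # keep the displaced set
        shapes["current"] = selected


heap = {
    "A": {"current": {"v": 3}, "degenerate": [{"v": 2}, {"v": 1}]},
    "B": {"current": {"v": 9}, "degenerate": [{"v": 8}, {"v": 7}]},
}
migrate_to_time_point(heap, 1)  # run the sets from the second time point
```

Because every node is switched in the same pass, the cluster as a whole runs the metadata of a single time point rather than a mixture of time points.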

Further, the management unit 92 may include a repair subunit 927, configured to: when metadata to be repaired exists in the current basic shape corresponding to the current node in the metadata heap, acquire the repair data corresponding to the metadata to be repaired from the degenerate basic shape corresponding to the current node in the metadata heap, or acquire the repair data corresponding to the metadata to be repaired from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replace the metadata to be repaired in the current basic shape corresponding to the current node in the metadata heap with the acquired repair data.
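As a non-limiting illustration, the two repair sources used by the repair subunit 927 may be sketched as follows. The function `repair`, its parameters, and the damage predicate are hypothetical. The embodiment presents the two sources as alternatives; trying the local degenerate basic shape first is an assumption of this sketch.

```python
def repair(current, degenerate, peer_copy, is_damaged):
    """Return a repaired copy of `current` (all names are illustrative)."""
    repaired = dict(current)
    for key, value in current.items():
        if not is_damaged(value):
            continue
        if key in degenerate and not is_damaged(degenerate[key]):
            repaired[key] = degenerate[key]   # from the local degenerate shape
        elif key in peer_copy and not is_damaged(peer_copy[key]):
            repaired[key] = peer_copy[key]    # from another node's metadata heap
    return repaired


current = {"lun0": None, "lun1": "ok-1", "lun2": None}
degenerate = {"lun0": "old-0", "lun1": "old-1"}
peer_copy = {"lun0": "peer-0", "lun2": "peer-2"}
fixed = repair(current, degenerate, peer_copy, lambda v: v is None)
```

Here an entry is treated as damaged when its value is `None`; undamaged entries are left untouched.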

Further, the management unit 92 may include a storage control subunit 928, configured to store, in the memory, the metadata sets stored in the current basic shapes corresponding to the nodes of the cluster in the metadata heap, and to store the metadata sets stored in the degenerate basic shapes of the metadata heap in a storage medium other than the memory.

Further, the management unit 92 may include a read/write control subunit 929, configured to acquire metadata that needs to be changed from a certain degenerate basic shape in the metadata heap, modify the metadata under a lock, and replace the metadata that needs to be changed, as stored in the corresponding current basic shape, with the modified metadata.

Further, when the primary node in the cluster is powered off, the management unit 92 is further configured to: determine, according to a predetermined rule, that the current node is to be the primary node; and modify the primary/standby identifier of the current node to indicate the primary node. Specifically, the primary/standby identifier may be a flag inside the storage node. After the identifier is modified to primary, the node becomes the primary node and can manage the metadata in the cluster, because the node also stores the structure of the metadata heap, in which the metadata sets of all nodes in the cluster are stored. The data migration required in the prior art is thus avoided.
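As a non-limiting illustration, the switchover may be sketched as a flag change with no copying of metadata. The class `Node`, the function `on_primary_power_off`, and the rule "the surviving node with the smallest identifier becomes primary" are assumptions of this sketch; the embodiment only requires that some predetermined rule be applied.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    is_primary: bool = False                   # the primary/standby flag
    heap: dict = field(default_factory=dict)   # metadata sets of all nodes


def on_primary_power_off(me, surviving_ids):
    """Promote `me` if the (assumed) rule selects it; no data is copied."""
    if me.node_id == min(surviving_ids):   # hypothetical predetermined rule
        me.is_primary = True
    return me.is_primary


# Node 1 was the primary and has powered off; nodes 2 and 3 survive.
standby = Node(2, heap={1: {"v": 5}, 2: {"v": 4}, 3: {"v": 6}})
other = Node(3, heap={1: {"v": 5}, 2: {"v": 4}, 3: {"v": 6}})
promoted = on_primary_power_off(standby, [2, 3])
not_promoted = on_primary_power_off(other, [2, 3])
```

The promoted node already holds the metadata sets of every node, including the failed primary, so only the flag changes.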

The metadata management apparatus provided by the present invention saves the currently running metadata sets of all the nodes in the cluster in each node, so that no data migration between the active and standby nodes is required during an active/standby switchover, and the switchover in the cluster can therefore be executed quickly.

Example ten

FIG. 12 is a schematic structural diagram of another embodiment of a metadata management apparatus according to the present invention. The metadata management apparatus may perform the metadata management method according to any embodiment of the present invention. As shown in FIG. 12, the apparatus may include a memory 1201 and a processor 1202, wherein:

The memory 1201 is configured to store the metadata in the cluster as a metadata heap, where the metadata includes the metadata set currently run by the current node itself and the metadata sets currently run by all other nodes in the cluster. The metadata heap includes at least two basic shapes, each of which is a binary tree shape composed of a vertex, a first leaf node, and a second leaf node. Storing the metadata as a metadata heap includes: storing the metadata set currently run by the current node itself in a current basic shape at the top level of the metadata heap, where the first leaf node of the current basic shape is used to connect another current basic shape storing the metadata set currently run by another node, the first leaf node of the other current basic shape is used to connect a further current basic shape storing the metadata set currently run by a further node, and so on until all nodes in the cluster are connected;

Among the current basic shapes storing the metadata sets currently run by the nodes in the cluster, the second leaf node of each current basic shape is used to connect a degenerate basic shape storing the metadata set that the node corresponding to that current basic shape ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerate basic shape is used to connect another degenerate basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on.

The processor 1202 is configured to manage the metadata according to the metadata heap.

Specifically, the processor 1202 may be configured to: when updating the metadata in the metadata heap, acquire the updated metadata set currently run by one of the nodes in the cluster other than the current node itself, and store the updated metadata set in the current basic shape corresponding to that node in the metadata heap.

The processor 1202 is further configured to store the degraded data corresponding to that node in the metadata heap in a degenerate basic shape, where the degraded data is the data that was stored in the current basic shape before the updated metadata set, and to connect the degenerate basic shape to the second leaf node of the current basic shape.

The processor 1202 is further configured to: when another node other than the current node itself is split from the cluster, disconnect the current basic shape corresponding to the other node from the first leaf node of the current basic shape of the current node, disconnect the first leaf node of that other current basic shape from the further current basic shape corresponding to a further node, and connect the further current basic shape to the first leaf node of the current basic shape of the current node. The processor 1202 is also configured to, when a new node is added to the cluster, acquire the currently running metadata set of the new node from the metadata heap stored in the new node that is to join the cluster, establish a current basic shape corresponding to the new node in the metadata heap of the current node, store the currently running metadata set of the new node in that current basic shape, and connect the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is idle.

The processor 1202 is further configured to separately store the metadata sets run at each time point, where each such metadata set includes the metadata sets of every node in the cluster and the time points include the current time point, the first time point, and the second time point, so that the metadata sets corresponding to a certain one of the time points can be run.

The processor 1202 is further configured to: when metadata to be repaired exists in the current basic shape corresponding to the current node in the metadata heap, acquire the repair data corresponding to the metadata to be repaired from the degenerate basic shape corresponding to the current node in the metadata heap, or acquire the repair data corresponding to the metadata to be repaired from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replace the metadata to be repaired in the current basic shape corresponding to the current node itself in the metadata heap with the acquired repair data.

A person skilled in the art can understand that all or part of the steps of the foregoing method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Finally, it should be noted that the above embodiments are only intended to explain the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A metadata management method, characterized in that the method is applied to a cluster containing multiple nodes, and the method includes:

The metadata in the cluster is stored as a metadata heap. The metadata includes the currently running metadata set of the current node itself and the currently running metadata sets of all other nodes in the cluster. The metadata heap includes at least two basic shapes, each basic shape being a binary tree composed of a vertex, a first leaf node and a second leaf node;

The storing of metadata in the cluster as a metadata heap specifically includes:

The current node's own currently running metadata set is stored in the current basic shape located at the top of the metadata heap. The first leaf node of the current basic shape is used to connect another current basic shape that stores the currently running metadata set of another node, the first leaf node of the other current basic shape is used to connect a further current basic shape that stores the currently running metadata set of a further node, and so on until all nodes in the cluster are connected;

The second leaf node of the current basic shape is used to connect a degenerated basic shape of the node corresponding to the current basic shape. The degenerated basic shape stores the metadata set that the node corresponding to the current basic shape ran at a first time point, the first time point being earlier than the current time point. The second leaf node of the degenerated basic shape is used to connect another degenerated basic shape of the node corresponding to the current basic shape, the other degenerated basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on;

The metadata is managed according to the metadata heap.

2. The metadata management method according to claim 1, characterized in that, when the metadata in the metadata heap is updated, managing the metadata according to the metadata heap includes:

Obtain the updated metadata set currently running on one of the nodes other than itself in the cluster;

The updated metadata set is stored in the current basic shape corresponding to one of the nodes in the metadata heap.

3. The metadata management method according to claim 2, characterized in that, after the updated metadata set is stored in the current basic shape corresponding to one of the nodes in the metadata heap, the method further includes:

Store the degraded data corresponding to one of the nodes in the metadata heap in a degenerated basic shape, the degraded data being the data stored in the current basic shape before the updated metadata set;

Connect the degenerated basic shape to the second leaf node of the current basic shape.

4. The metadata management method according to claim 1, characterized in that,

When another node other than the current node in the cluster is split from the cluster, the management of the metadata according to the metadata heap includes:

Disconnect the other current basic shape corresponding to the other node from the first leaf node of the current basic shape of the current node, and disconnect the first leaf node of the other current basic shape from the further current basic shape, corresponding to a further node, to which it is connected;

Connect the further current basic shape to the first leaf node of the current basic shape of the current node.

5. The metadata management method according to claim 1, characterized in that, when a new node is added to the cluster, the management of the metadata according to the metadata heap includes:

Obtain the currently running metadata set of the new node from the metadata heap stored in the new node that is to join the cluster;

Establish the current basic shape corresponding to the new node in the metadata heap of the current node, store the currently running metadata set of the new node in the current basic shape corresponding to the new node, and connect the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is free.

6. The metadata management method according to claim 1, wherein the management of the metadata according to the metadata heap includes:

The metadata sets running at each time point are stored separately, each such metadata set including the metadata sets of every node in the cluster, and the time points including the current time point, the first time point and the second time point, so that the metadata set corresponding to a certain one of the time points can be run.

7. The metadata management method according to claim 1, wherein the management of the metadata according to the metadata heap includes:

When there is metadata that needs to be repaired in the current basic shape corresponding to the current node itself in the metadata heap, obtain the degraded data corresponding to the metadata that needs to be repaired from the degenerated basic shape corresponding to the current node itself in the metadata heap; or obtain the degraded data corresponding to the metadata that needs to be repaired from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replace the metadata that needs to be repaired in the current basic shape corresponding to the current node itself in the metadata heap with the acquired degraded data.

8. A metadata management device, characterized by including:

A storage unit, used to store metadata in the cluster as a metadata heap, where the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all other nodes in the cluster, and the metadata heap includes at least two basic shapes, each basic shape being a binary tree shape composed of a vertex, a first leaf node and a second leaf node; the storage of metadata as a metadata heap includes:

The current node's own currently running metadata set is stored in the current basic shape located at the top of the metadata heap, the first leaf node of the current basic shape is used to connect another current basic shape that stores the currently running metadata set of another node, the first leaf node of the other current basic shape is used to connect a further current basic shape that stores the metadata set currently run by a further node, and so on until all nodes in the cluster are connected;

Among the current basic shapes used to store the metadata sets currently run by the nodes in the cluster, the second leaf node of each current basic shape is used to connect a degenerated basic shape storing the metadata set that the node corresponding to the current basic shape ran at a first time point, the first time point being earlier than the current time point; the second leaf node of the degenerated basic shape is used to connect another degenerated basic shape storing the metadata set that the node ran at a second time point, the second time point being earlier than the first time point, and so on;

A management unit, used to manage the metadata according to the metadata heap.

9. The metadata management device according to claim 8, characterized in that the management unit includes:

A synchronization subunit, used to, when the metadata in the metadata heap is updated, obtain the updated metadata set currently running on one of the nodes in the cluster other than the current node itself, and store the updated metadata set in the current basic shape corresponding to that node in the metadata heap.

10. The metadata management device according to claim 9, characterized in that the management unit includes:

A storage subunit, used to store the degraded data corresponding to one of the nodes in the metadata heap in a degenerated basic shape, the degraded data being the data stored in the current basic shape before the updated metadata set, and to connect the degenerated basic shape to the second leaf node of the current basic shape.

11. The metadata management device according to claim 8, characterized in that the management unit includes:

A shape control subunit, configured to, when another node other than the current node in the cluster is split from the cluster, disconnect the other current basic shape corresponding to the other node from the first leaf node of the current basic shape of the current node, disconnect the first leaf node of the other current basic shape from the further current basic shape, corresponding to a further node, to which it is connected, and connect the further current basic shape to the first leaf node of the current basic shape of the current node;

The shape control subunit is also configured to, when a new node is added to the cluster, obtain the currently running metadata set of the new node from the metadata heap stored in the new node that is to join the cluster, establish the current basic shape corresponding to the new node in the metadata heap of the current node, store the currently running metadata set of the new node in the current basic shape corresponding to the new node, and connect the current basic shape of the new node to a current basic shape in the metadata heap whose first leaf node is free.

12. The metadata management device according to claim 8, characterized in that the management unit includes:

A snapshot subunit, used to separately store the metadata sets running at each time point, each such metadata set including the metadata sets of every node in the cluster, and the time points including the current time point, the first time point and the second time point, so that the metadata set corresponding to a certain one of the time points can be run.

13. The metadata management device according to claim 8, characterized in that the management unit includes:

A repair subunit, used to, when there is metadata that needs to be repaired in the current basic shape corresponding to the current node itself in the metadata heap, obtain the degraded data corresponding to the metadata that needs to be repaired from the degenerated basic shape corresponding to the current node itself in the metadata heap; or obtain the degraded data corresponding to the metadata that needs to be repaired from the current basic shape, corresponding to the current node, in the metadata heap of another node in the cluster; and replace the metadata that needs to be repaired in the current basic shape corresponding to the current node itself in the metadata heap with the acquired degraded data.

14. The metadata management device according to claim 8, characterized in that, when the master node in the cluster is powered off, the management unit is also used to:

Determine the current node as the master node according to predetermined rules;

Modify the primary/standby identifier of the current node to primary.

15. A metadata management device, characterized in that it includes a memory and a processor; the memory is used to store metadata in the cluster as a metadata heap, the metadata includes the metadata set currently running on the current node itself and the metadata sets currently running on all other nodes in the cluster, and the metadata heap includes at least two basic shapes, each basic shape being a binary tree composed of a vertex, a first leaf node and a second leaf node;

The processor is configured to manage the metadata according to the metadata heap.