CN102932331A - Super-safe-storage coding/decoding method applicable to distributed storage system - Google Patents

Info

Publication number
CN102932331A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012103715859A
Other languages
Chinese (zh)
Inventor
张真
刘志明
王义飞
赵庆福
蒋文佼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING INNOVATIVE CLOUD STORAGE TECHNOLOGY Co Ltd
Original Assignee
NANJING INNOVATIVE CLOUD STORAGE TECHNOLOGY Co Ltd
Application filed by NANJING INNOVATIVE CLOUD STORAGE TECHNOLOGY Co Ltd
Priority to CN2012103715859A
Publication of CN102932331A

Abstract

The invention discloses a super-safe-storage coding/decoding method applicable to a distributed storage system. The method combines the Reed-Solomon (RS) algorithm with the distributed storage system and solves the waste of disk space caused by the simple copy backup used in common cloud storage support methods. While preserving the very high performance of a large-scale cloud storage solution, it improves disk utilization, ensures data safety, reduces the total energy consumption of the system, and saves cost. The system is configured with the parameters (M, N). If at most N-M arbitrary block data storage nodes in the system fail, the original data can still be decoded from the remaining nodes, which fully ensures data safety. A metadata server continuously checks the validity of data blocks and automatically restores invalid data blocks to the original data blocks. The redundancy overhead is low, which reduces cost.

Description

Super-safe-storage coding/decoding method applicable to a distributed storage system
Technical field
The invention relates to the field of computer data storage security, and in particular to a super-safe-storage coding/decoding method that ensures high reliability of a distributed storage system.
Background art
At present, cloud storage support methods generally adopt a distributed file system: data is split into blocks that are stored on inexpensive commodity computers. Data safety, that is, disaster tolerance, therefore becomes a major consideration for such methods. Most of them use simple redundant backup: each data block has several identical copies on multiple machines. This approach wastes a large amount of disk space and significantly affects the energy consumption and cost of the system. The present method instead adopts a coding/decoding security strategy. Without reducing system performance, it greatly improves disk space utilization and correspondingly reduces the cost and energy consumption of the system.
RS (Reed-Solomon) codes are a special class of non-binary BCH codes with very strong error-correcting capability. For any positive integer S, a q-ary BCH code of code length n = q^S - 1 can be constructed, where q is a power of some prime. The q-ary BCH code of length n = q - 1 obtained when S = 1 and q > 2 is called an RS code. When q = 2^m (m > 1), the symbols are taken from the finite field GF(2^m); such an RS code can correct burst errors and is the most commonly used RS code. An RS code has the following parameters: block length n = 2^m - 1 symbols; message length k symbols; parity-check length n - k = e symbols; minimum distance d_min = n - k + 1 symbols. By shortening, an (n, k) RS code can be reduced to an (n', k') RS code with the same symbol size, where n' and k' are less than or equal to n and k respectively.
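The parameter relations listed above can be checked with a few lines of C. This is only an illustrative sketch: the type rs_params and the helper rs_params_make are hypothetical names introduced here and do not appear in the patent.

```c
#include <assert.h>
#include <stdint.h>

/* Parameters of an (n, k) RS code over GF(2^m), as listed in the text. */
typedef struct {
    uint32_t n;       /* block length: 2^m - 1 symbols */
    uint32_t k;       /* message length in symbols */
    uint32_t parity;  /* parity-check length: n - k symbols */
    uint32_t dmin;    /* minimum distance: n - k + 1 symbols */
} rs_params;

/* Hypothetical helper: derives n, n - k and d_min from m and k. */
static rs_params rs_params_make(uint32_t m, uint32_t k)
{
    rs_params p;

    assert(m > 1 && m < 32);
    p.n = (UINT32_C(1) << m) - 1;
    assert(k >= 1 && k <= p.n);
    p.k = k;
    p.parity = p.n - k;
    p.dmin = p.n - k + 1;
    return p;
}
```

For example, rs_params_make(8, 223) describes a code with 255 symbols per block, 32 parity symbols and a minimum distance of 33.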
Summary of the invention
The object of the invention is to integrate the RS algorithm into a distributed storage system and to provide a super-safe-storage coding/decoding method. The method solves the waste of disk space caused by the simple copy backup used in general cloud storage support methods. While preserving the very high performance of a large-scale cloud storage solution, it improves disk utilization, ensures data safety, reduces total system energy consumption, and saves cost.
The super-safe-storage coding/decoding scheme of the invention is designed on top of a distributed cloud storage system. One example is the cStor cloud storage system developed by Nanjing Innovative Cloud Storage Technology Co., Ltd.; the technical terms used in the invention, such as metadata management node, client and block data storage node, have the same definitions as in the cStor cloud storage system. Such a system adopts a distributed storage strategy and mainly comprises a metadata management node and multiple block data storage nodes. The metadata management node organizes and schedules the block data storage nodes and stores and manages metadata. The block data storage nodes store the data blocks. For a read or write operation, the client first interacts with the metadata management node to obtain block data storage node information and then exchanges data directly with the block data storage nodes. The metadata management node consists of metadata servers, and each block data storage node consists of a block data server.
The invention specifically adopts the following technical scheme:
A super-safe-storage coding/decoding method applicable to a distributed storage system, comprising:
Configuring codec parameters (M, N), where M is the number of original data blocks and N is the number of data blocks after RS coding;
When the client writes data, the file is divided sequentially into groups of X bytes, X being any positive integer, and each group is divided in order into M blocks of X/M contiguous bytes each; for each group, the client stores the M data blocks of the group on M different block data servers, and the metadata server records the block storage information of the group;
When the client reads data and the target block data server cannot be accessed, the client reads data from any other M accessible block data servers within the group and obtains the target data by RS decoding;
When a block data server receives a coding/decoding task from the metadata server, it copies the M data blocks of the group indicated by the metadata management node to local storage, builds an M-row matrix with each block as one row, and computes from the block index the position within the group of the data block that currently needs coding or decoding. If the position is less than or equal to M, it performs the RS encoding task and produces a redundant data block; if the position is greater than M and less than or equal to N, it performs the RS decoding task and produces an original data block.
Beneficial effects:
With the super-safe-storage coding/decoding method of the invention, the system is configured with the parameters (M, N). If any N-M or fewer block data storage nodes in the system go down, the original data can still be decoded from the remaining nodes, which fully ensures data safety. The metadata server continuously checks the validity of data blocks and automatically restores invalid data blocks to the original data blocks. The redundancy overhead is comparatively small, which further reduces cost.
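The trade-off described above can be put into numbers. The sketch below assumes that every group stores N blocks for M blocks of user data, so the stored volume is N/M times the original; the function names max_failures and storage_percent are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest number of block data storage nodes of a group that may fail
 * while the original data can still be decoded: N - M. */
static uint32_t max_failures(uint32_t m, uint32_t n)
{
    assert(m > 0 && n >= m);
    return n - m;
}

/* Stored bytes per byte of user data, in percent: N / M * 100,
 * rounded down. */
static uint32_t storage_percent(uint32_t m, uint32_t n)
{
    assert(m > 0 && n >= m);
    return (uint32_t)(((uint64_t)n * 100) / m);
}
```

With (M=4, N=6) a group survives 2 node failures at 150% storage. Keeping three identical copies of every block also survives 2 failures but costs 300%, which is what storage_percent(1, 3) returns.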
Description of drawings
Figure 1: client data write flow;
Figure 2: schematic of file data grouping and block distribution (M=4, N=6);
Figure 3: client data read flow;
Figure 4: client decoding read flow;
Figure 5: flow of the metadata service controlling coding/decoding on the block data servers.
Embodiment
The technical scheme of the invention is described in detail below with reference to the drawings:
4.1 Client coding/decoding method
After the client registers successfully with the metadata management node, the metadata management node returns the codec parameters (M, N) to the client, where M is the number of original data blocks and N is the number of data blocks after coding. The client does not encode when it writes data. This preserves client write performance and avoids demanding a high hardware configuration. If, during a read, the block data storage node that holds the original data block cannot be accessed, the client starts the decoding flow: it reads the data blocks of the group from any other M accessible block data storage nodes and decodes them to obtain the target data.
4.1.1 Client data write flow
The client data write flow is shown in Figure 1. The file system driver module receives a write-data message from the kernel and calls back the codec client function, which writes the data into the write buffer; a thread then takes the data out of the buffer and sends it to the block data storage nodes. During a write, the file is first divided sequentially into groups of 64 MB (X = 64 MB is used as the example here). Each group is then divided evenly into data blocks (chunk, 1 chunk = 1024 blocks) of 64 MB/M each, padded with zeros where bytes are lacking for coding/decoding, and the chunks are stored on the M block data storage nodes designated by the metadata management node. The metadata server records the block storage information of each group (the block identifier, the block version number and the storage location of the data block). A write request contains fields such as offset (the offset relative to the start of the file), size (the size of the data currently being written) and data (the actual data of the current request). The client computes the chunk index, the block number and the offset within the block as follows:
1) block_no = offset >> 16 gives the block number;
2) chindx = (block_no >> 10) * N + (block_no % 1024) * M / 1024 gives the chunk index position; the client sends a message to the metadata server, which creates a new slot according to the index and records the metadata of the new chunk;
3) pos = (block_no % (M << 10)) / M gives the offset of the block within the chunk on the block data server;
4) from = offset & 0xFFFF gives the offset within the block.
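The four formulas can be transcribed directly into C. The sketch below follows them literally; the struct write_location and the function locate are hypothetical names. The shift by 16 implies blocks of 2^16 bytes (64 KB), so a 64 MB group spans 1024 consecutive block numbers.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t block_no; /* block number: offset >> 16 */
    uint64_t chindx;   /* chunk index */
    uint64_t pos;      /* block offset within the chunk on the server */
    uint32_t from;     /* byte offset within the block */
} write_location;

/* Literal transcription of steps 1) to 4) for codec parameters (m, n). */
static write_location locate(uint64_t offset, uint32_t m, uint32_t n)
{
    write_location loc;

    assert(m > 0 && n >= m);
    loc.block_no = offset >> 16;
    loc.chindx = (loc.block_no >> 10) * n
               + (loc.block_no % 1024) * m / 1024;
    loc.pos = (loc.block_no % ((uint64_t)m << 10)) / m;
    loc.from = (uint32_t)(offset & 0xFFFF);
    return loc;
}
```

For (M=4, N=6) and a file offset of 80 MB plus 5 bytes (83886085), this gives block number 1280, chunk index 7 (the second chunk of the second group), pos 320 and a byte offset of 5 within the block.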
Figure 2 shows the client chunking process for (M=4, N=6): the client divides 64 MB of data into 4 contiguous chunks (chunk1, chunk2, chunk3, chunk4) of 16 MB each. These 4 original data blocks are stored on 4 different block data servers; chunk5 and chunk6 are the redundant data blocks generated by encoding on 2 block data servers.
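The grouping arithmetic can be sketched as follows. Two assumptions are made that the patent does not state: X is taken to be divisible by M, and the zero padding is taken to fill the tail of the last group. The names group_layout and layout_file are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t groups;     /* number of X-byte groups covering the file */
    uint64_t chunk_size; /* X / M bytes per original chunk */
    uint64_t padding;    /* zero bytes needed to fill the last group */
} group_layout;

/* Assumption: x is a multiple of m and padding goes at the file tail. */
static group_layout layout_file(uint64_t file_size, uint64_t x, uint32_t m)
{
    group_layout g;

    assert(x > 0 && m > 0 && x % m == 0);
    g.groups = (file_size + x - 1) / x;   /* round up */
    g.chunk_size = x / m;
    g.padding = g.groups * x - file_size;
    return g;
}
```

For a 100 MB file with X = 64 MB and M = 4, this yields 2 groups, 16 MB chunks and 28 MB of zero padding in the second group.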
4.1.2 Client data read
The client read flow without decoding is shown in Figure 3. The file system driver module receives a read-data message from the kernel and calls back the codec client function. The client computes the chunk index from the offset, sends a request message to the metadata server to obtain the block data server that holds the data block, and reads the data from that block data server. A read request contains fields such as offset (the offset relative to the start of the file), size (the size of the data currently being read) and data (the buffer for the returned data). When no decoding is needed, the client computes the chunk index as follows:
1) block_no = offset >> 16 gives the block number;
2) indx = (block_no >> 10) * N + (block_no % 1024) * M / 1024 gives the chunk index number;
The client decoding read flow is shown in Figure 4. When the target block data server cannot be accessed, the codec client reads data from any other M accessible block data servers in the group and obtains the target data by decoding.
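The selection step of the decoding read can be sketched as below. The patent only says that any M accessible servers of the group may be used; taking the first M reachable positions in index order is an assumption of this sketch, and pick_sources is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Picks up to m reachable chunk positions of an n-chunk group, skipping
 * the unreachable target position. Returns how many were picked; decoding
 * is possible only if the result equals m. */
static size_t pick_sources(const bool *reachable, size_t n, size_t m,
                           size_t target, size_t *out)
{
    size_t found = 0;

    assert(reachable && out && m > 0 && m <= n && target < n);
    for (size_t i = 0; i < n && found < m; i++) {
        if (i != target && reachable[i])
            out[found++] = i;
    }
    return found;
}
```

With (M=4, N=6) and position 1 unreachable, the sketch selects positions 0, 2, 3 and 4. If three positions are unreachable, only 3 sources remain and the group cannot be decoded, which matches the N-M = 2 failure limit.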
4.2 Metadata management node coding/decoding method
The metadata management node handles the scheduling, management and control of the whole coding/decoding task; it does not perform the actual coding/decoding operations. Coding/decoding tasks on the metadata management node have two trigger points:
1) After writing a data block, the client sends a block-written message to the metadata management node. Because everything the client writes is original data, the metadata management node only needs to generate N-M encoding task records and add them to the encoding record queue; the record structure is s3codechunk. Because the coded blocks of a group must be re-encoded every time the client writes a block, encoding can be deferred: it is triggered only after all M blocks of a group have been written successfully, or the metadata management node starts scheduling the task once a period of time has elapsed. It then sends a coding/decoding task request message, with structure s3code, to a block data server;
2) The metadata management node periodically checks each file. When it finds that a block of a file is empty or that its valid block count is 0, it generates a coding/decoding task and schedules a block data storage node to execute it and restore the valid data of the block. The block data storage node returns the coding/decoding status to the metadata management node, which removes the task record from the queue;
The structure of a single encoding task record on the metadata server is:
#include <stdint.h>
typedef struct _s3codechunk
{
    uint32_t inode;              // file node identifier
    uint64_t chunkid;            // chunk identifier
    uint32_t version;            // chunk version number
    uint32_t indx;               // chunk index
    uint32_t savetime;           // time the record was saved
    uint8_t decodeable;          // encoding or decoding
    uint8_t sesflags;            // session flags
    uint32_t rootinode;          // root node identifier
    uint32_t uid;                // user ID
    uint8_t s3coding;            // whether encoding is in progress
    uint32_t s3codingstarttime;  // encoding start time
    struct _s3codechunk *next;   // next record in the queue
} s3codechunk;
The coding data structures that the metadata server sends to the block data server are:
typedef struct _s3codearg {
    uint32_t chindx;     // chunk index
    uint64_t chunkid;    // chunk identifier
    uint32_t version;    // version number
    void *srceptr;       // block data server information
} s3codearg;
typedef struct _s3code {
    uint32_t indx;       // index of the data block to encode
    uint64_t chunkid;    // chunk identifier
    uint32_t version;    // version number
    uint32_t savetime;   // encoding start time
    s3codearg s3codeargm[M];  // records of the coding group (M entries)
} s3code;
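The first trigger point of section 4.2 (queueing N-M encoding records once a group is written) might look roughly like the sketch below. It uses a reduced record, code_task, with only a few of the s3codechunk fields. The helper names, and the assumption that the redundant chunks of a group occupy indices group * N + M up to group * N + N - 1, are illustrative and not taken from the patent.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced, hypothetical form of the s3codechunk record. */
typedef struct code_task {
    uint32_t indx;       /* index of the redundant chunk to produce */
    uint32_t savetime;   /* time the record was queued */
    struct code_task *next;
} code_task;

/* Pushes one encoding record per redundant chunk of the group onto the
 * head of the queue. Returns the number of records added (n - m unless
 * an allocation fails). */
static uint32_t queue_encode_tasks(code_task **queue, uint32_t group,
                                   uint32_t m, uint32_t n, uint32_t now)
{
    uint32_t added = 0;

    assert(queue && m > 0 && n >= m);
    for (uint32_t r = m; r < n; r++) {
        code_task *t = malloc(sizeof *t);
        if (!t)
            break;
        t->indx = group * n + r;  /* assumed index of the redundant chunk */
        t->savetime = now;
        t->next = *queue;
        *queue = t;
        added++;
    }
    return added;
}

/* Releases every record, as the node would after tasks complete. */
static void free_tasks(code_task *q)
{
    while (q) {
        code_task *next = q->next;
        free(q);
        q = next;
    }
}
```

For (M=4, N=6) and group 1, two records are queued, for chunk indices 10 and 11.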
The codec parameters (M, N) are controlled by the metadata server: when the codec client and the block data storage nodes register with the metadata management node, the metadata management node returns the codec parameters.
4.3 Block data storage node encoding method
After a block data storage node receives a block encoding task from the metadata management node, it generates a coding/decoding task and adds it to the work queue. The task invokes the block copy flow (copying blocks between different block data storage nodes) to copy the M data blocks of the group indicated by the metadata management node to local storage, builds an M-row matrix with each block as one row, performs the encoding task and produces a redundant data block. The codec parameters (M, N) are returned by the metadata management node when the block data storage node registers.
The encoding flow of the block data storage node is shown in Figure 5:
1) The metadata management node sends an encoding request to the block data storage node;
2) The block data storage node copies the M data blocks of the group from any M block data storage nodes;
3) The block data storage node encodes the M data blocks to obtain the target data block and saves it;
4) The block data storage node deletes the M copied data blocks;
5) The block data storage node sends an encoding success message to the metadata management node.
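Step 3 can be illustrated with one well-known two-parity construction over GF(2^8), the P+Q scheme used by RAID-6. This is a stand-in chosen for brevity, not necessarily the encoder of the patent, and it only covers N-M = 2. The names gf_mul and encode_pq and the field polynomial 0x11d are choices of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        /* multiply a by x and reduce if the top bit was set */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return r;
}

/* rows[i] is chunk i of the group, one matrix row per chunk.
 * p receives the XOR of all rows; q receives the sum of 2^i * rows[i]. */
static void encode_pq(const uint8_t *const *rows, size_t m, size_t len,
                      uint8_t *p, uint8_t *q)
{
    assert(rows && p && q && m > 0 && m <= 255);
    for (size_t b = 0; b < len; b++) {
        uint8_t ps = 0, qs = 0, coef = 1;   /* coef holds 2^i */

        for (size_t i = 0; i < m; i++) {
            ps ^= rows[i][b];
            qs ^= gf_mul(coef, rows[i][b]);
            coef = gf_mul(coef, 2);
        }
        p[b] = ps;
        q[b] = qs;
    }
}
```

With M = 4 the two outputs would play the role of chunk5 and chunk6 in Figure 2. A general (M, N) encoder would use a full generator matrix instead of these two fixed rows.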
4.4 Block data storage node decoding method
After a block data storage node receives a block decoding task from the metadata management node, it generates a decoding task and adds it to the work queue. The task invokes the block copy flow (copying blocks between different block data storage nodes) to copy the M data blocks of the group indicated by the metadata management node to local storage, builds an M-row matrix with each block as one row, performs the decoding task and produces an original data block. The codec parameters (M, N) are returned by the metadata management node when the block data storage node registers.
The decoding flow of the block data storage node is shown in Figure 5:
1) The metadata management node sends a decoding request to the block data storage node;
2) The block data storage node copies the M data blocks of the group from any M block data storage nodes;
3) The block data storage node decodes the M data blocks to obtain the target data block and saves it;
4) The block data storage node deletes the M copied data blocks;
5) The block data storage node sends a decoding success message to the metadata management node.
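Step 3 of the decoding flow is shown below for the simplest case only: one original chunk is missing and an XOR parity chunk of the group is available. This is a deliberately simplified stand-in, not general RS decoding, which would solve a linear system over GF(2^m) for up to N-M missing chunks. The function name recover_from_xor is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuilds one missing original chunk from the XOR parity chunk and the
 * surviving original chunks (together M copied blocks). */
static void recover_from_xor(const uint8_t *const *survivors, size_t count,
                             const uint8_t *parity, size_t len, uint8_t *out)
{
    assert(survivors && parity && out);
    for (size_t b = 0; b < len; b++) {
        uint8_t v = parity[b];

        for (size_t i = 0; i < count; i++)
            v ^= survivors[i][b];
        out[b] = v;   /* XOR of parity and survivors is the lost byte */
    }
}
```

For M = 4, the node copies the 3 surviving original chunks plus the parity chunk, that is M blocks in total, and the output is the lost chunk.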

Claims (3)

1. A super-safe-storage coding/decoding method applicable to a distributed storage system, characterized by comprising:
Configuring codec parameters (M, N), where M is the number of original data blocks and N is the number of data blocks after RS coding;
When the client writes data, the file is divided sequentially into groups of X bytes, X being any positive integer, and each group is divided in order into M blocks of X/M contiguous bytes each; for each group, the client stores the M data blocks of the group on M different block data servers, and the metadata server records the block storage information of the group;
When the client reads data and the target block data server cannot be accessed, the client reads data from any other M accessible block data servers within the group and obtains the target data by RS decoding;
When a block data server receives a coding/decoding task from the metadata server, it copies the M data blocks of the group indicated by the metadata management node to local storage, builds an M-row matrix with each block as one row, and computes from the block index the position within the group of the data block that currently needs coding or decoding. If the position is less than or equal to M, it performs the RS encoding task and produces a redundant data block; if the position is greater than M and less than or equal to N, it performs the RS decoding task and produces an original data block.
2. The super-safe-storage coding/decoding method applicable to a distributed storage system according to claim 1, characterized in that the metadata management node periodically checks each file; when it finds that a block of a file is empty or that its valid block count is 0, it generates a coding/decoding task and schedules a block data storage node to execute the task and restore the valid data of the block.
3. The super-safe-storage coding/decoding method applicable to a distributed storage system according to claim 1, characterized in that the codec parameters (M, N) are controlled by the metadata server; when the client and the block data storage nodes register with the metadata management node, the metadata management node returns the codec parameters.
CN2012103715859A 2012-09-29 2012-09-29 Super-safe-storage coding/decoding method applicable to distributed storage system Pending CN102932331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012103715859A CN102932331A (en) 2012-09-29 2012-09-29 Super-safe-storage coding/decoding method applicable to distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012103715859A CN102932331A (en) 2012-09-29 2012-09-29 Super-safe-storage coding/decoding method applicable to distributed storage system

Publications (1)

Publication Number Publication Date
CN102932331A true CN102932331A (en) 2013-02-13

Family

ID=47647033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012103715859A Pending CN102932331A (en) 2012-09-29 2012-09-29 Super-safe-storage coding/decoding method applicable to distributed storage system

Country Status (1)

Country Link
CN (1) CN102932331A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101488104B (en) * 2009-02-26 2011-05-04 北京云快线软件服务有限公司 System and method for implementing high-efficiency security memory
US20110029809A1 (en) * 2009-07-30 2011-02-03 Cleversafe, Inc. Method and apparatus for distributed storage integrity processing
CN102546755A (en) * 2011-12-12 2012-07-04 华中科技大学 Data storage method of cloud storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余林琛等: "《RS纠删码在云存储中的应用》", 《微电子学与计算机》, vol. 28, no. 8, 5 August 2011 (2011-08-05) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103688514A (en) * 2013-02-26 2014-03-26 北京大学深圳研究生院 Coding method for minimum storage regeneration codes and method for restoring of storage nodes
CN103688514B (en) * 2013-02-26 2017-07-11 北京大学深圳研究生院 A kind of minimum memory regenerates the coding and memory node restorative procedure of code
WO2014131148A1 (en) * 2013-02-26 2014-09-04 北京大学深圳研究生院 Method for encoding minimal storage regenerating codes and repairing storage nodes
WO2015066850A1 (en) * 2013-11-06 2015-05-14 华为技术有限公司 Method and device for storing file
CN103797455B (en) * 2013-11-06 2016-11-30 华为技术有限公司 The method and apparatus of storage file
CN103797455A (en) * 2013-11-06 2014-05-14 华为技术有限公司 Method and apparatus for storing files
WO2015180038A1 (en) * 2014-05-27 2015-12-03 北京大学深圳研究生院 Partial replica code construction method and device, and data recovery method therefor
CN107153506A (en) * 2016-03-02 2017-09-12 上海云熵网络科技有限公司 Distributed memory system and processing method based on regeneration code
CN107885615A (en) * 2016-09-30 2018-04-06 上海云熵网络科技有限公司 The restored method and system of distributed storage data
CN107885615B (en) * 2016-09-30 2020-09-04 上海云熵网络科技有限公司 Distributed storage data recovery method and system
CN107065800A (en) * 2017-04-27 2017-08-18 合肥城市云数据中心股份有限公司 Industrial signal data access method based on fixed length block
CN107065800B (en) * 2017-04-27 2019-04-09 合肥城市云数据中心股份有限公司 Industrial signal data access method based on fixed length block
CN110347344A (en) * 2019-07-19 2019-10-18 北京计算机技术及应用研究所 It is a kind of that block storage method is automatically configured based on distributed memory system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130213