CN103281400A - Data segmenting, coding and recovering method used for cloud storage gateway - Google Patents

Info

Publication number
CN103281400A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102420120A
Other languages
Chinese (zh)
Inventor
张尧学
胡宏扬
周悦芝
张迪
刘浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University

Abstract

The invention provides a data segmenting, coding and recovering method for a cloud storage gateway. The method comprises the following steps: receiving a data storage request from a user; obtaining the user's reliability requirement value and calculating, according to it, a plurality of segmentation coding schemes for segmenting the user's data; selecting an optimal segmentation scheme from these schemes according to the redundancy rate and the number of check blocks of each scheme; segmenting and coding the user's data according to the optimal scheme to generate data blocks and check blocks; storing the data blocks and check blocks in order in a plurality of cloud storages; and, when some of the stored data blocks or check blocks are damaged, recovering the user's original data and the damaged blocks from the data blocks and check blocks stored in the other cloud storages. With this method, the user's original data can be recovered when part of it is lost, which improves the availability and reliability of user data stored through the cloud storage gateway.

Description

Data segmenting, coding and recovering method for a cloud storage gateway
Technical field
The present invention relates to the application of network technology, and in particular to a data segmenting, coding and recovering method for a cloud storage gateway.
Background art
Since the concept of cloud computing was proposed, cloud computing technology has developed rapidly. As computing models have been upgraded and computing and storage technologies have improved, storage has evolved from stand-alone storage through network storage and distributed storage to today's cloud storage. Cloud storage uses virtualization software to make a large number of storage devices of different types in a network work together, jointly providing data storage and service access functions. Compared with traditional storage, cloud storage services offer advantages such as device independence, low cost and customizability for users.
Most current public cloud storage services are accessed over Internet protocols, usually HTTP. Because they do not use the traditional storage area network (SAN) or network attached storage (NAS) protocols, existing cloud storage is incompatible with users' existing applications, which makes cloud storage services inconvenient to use. Cloud storage gateways were introduced so that users can access cloud storage as conveniently as a local disk, simplifying the use of cloud storage services.
A cloud storage gateway is a hardware- or software-based device located in the customer's network that serves as a bridge between local applications and cloud storage; local users can access it at LAN speed. The gateway provides basic protocol conversion and simple connectivity so that incompatible technologies can communicate transparently, making cloud storage look like a NAS filer, a block storage array, a backup target, or even an extension of the application itself, and thereby resolving the incompatibility between cloud storage and existing applications. In addition, cloud storage gateways provide functions such as data compression, data encryption, data deduplication, snapshots and version control, which improve the security of cloud storage services.
Two main classes of cloud storage gateway are currently on the market. The first class provides the user with access to one particular cloud storage service only; examples are the Amazon AWS (Amazon Web Services) storage gateway and EMC's Emulex cloud storage gateway. The second class integrates the interfaces of multiple cloud storages so that the user can choose a cloud storage according to their own needs; it mainly comprises the cloud storage gateways released by companies such as Cirtas, Nasuni, StorSimple and TwinStrata.
The AWS storage gateway is a virtual machine image developed by Amazon. After downloading it, the user runs it on a virtual machine created on an on-premises enterprise server. It connects local software or hardware appliances with cloud storage and provides seamless, secure integration between the local IT environment and Amazon's cloud storage infrastructure, so that using Amazon cloud storage is like using a locally mounted volume. With the AWS storage gateway, users can securely store data in the AWS cloud and benefit from scalable, cost-effective storage. The gateway supports the industry-standard storage protocols used by current applications, and it provides low-latency performance by keeping frequently accessed data in a local cache while all data remains securely stored in Amazon S3. It also encrypts all data transferred into and out of the AWS cloud using SSL. All volume and snapshot data is encrypted at rest in Amazon S3 using Advanced Encryption Standard (AES) 256, a secure symmetric-key encryption standard that uses 256-bit keys.
The AWS storage gateway simplifies the use of Amazon's cloud storage service, and the Emulex cloud storage gateway, which has similar functions, lets users access EMC cloud storage as conveniently as a local disk. However, gateways of this class, developed by cloud storage providers, generally connect the user only to the provider's own cloud storage service; other cloud storage services cannot be used through them, so their generality is limited.
The second class of cloud storage gateway, released by companies such as Cirtas, Nasuni, StorSimple and TwinStrata, integrates access interfaces to different cloud storages for the user, who can choose among them according to their own needs; its generality is therefore greater.
Cirtas released its cloud storage gateway, the Bluejet Cloud Storage Controller, in 2010. Bluejet performs thin provisioning automatically on premises, allowing enterprise users to use cloud storage as if it were a local storage array. When installed on a SAN, Bluejet's CloudCache technology migrates data between a high-performance local cache and cloud storage, so that the data the user has used most recently and most frequently is kept on the Bluejet, while infrequently used data is kept in the cloud.
The cloud storage gateway developed by Nasuni, the Nasuni Cloud Storage Gateway, is a virtual NAS appliance that runs on an on-premises enterprise server. Also known as the Nasuni Filer, it is designed to replace traditional NAS equipment and can direct data traffic to cloud storage. Through its simple interface, users can choose their own storage provider, manage performance and volumes, and use custom data protection tools. The gateway can integrate the user's NAS equipment with cloud storage and automatically manage the user's storage space. The system also protects the user's files through its snapshot mechanism and the inherent redundancy of cloud storage.
The cloud storage gateway of TwinStrata, the CloudArray Cloud Storage Gateway, is a stand-alone hardware appliance that runs inside the enterprise intranet. Within a few minutes it can integrate public clouds, private clouds and the enterprise's existing local or remote storage devices into flexible "Cloud SANs", making access to the cloud storage system as fast, secure and simple as access to a local storage system. The enterprise's local applications interact with the CloudArray Cloud Storage Gateway through an iSCSI interface. To improve access speed, CloudArray maintains an internal data cache and reads and writes data from and to cloud storage in the background as needed. During this process the system can apply operations such as encryption, compression and deduplication to the data. CloudArray virtual and physical appliances can use memory, local hard disks or solid-state disks as cache devices, and a large pool of cache devices can be divided into different volumes to meet specific performance requirements.
StorSimple's cloud storage gateway provides enterprise users with enterprise-class storage infrastructure and solutions, offering nearline storage, archiving, backup and disaster recovery for integrating the enterprise with the cloud. Through the StorSimple gateway, inactive data and snapshots are sent to cloud storage, which in effect serves as another storage tier; data is kept in the cloud in application-level form rather than as backup data. The cloud storages supported by the StorSimple gateway include those of Google, Hewlett-Packard and Rackspace, and Amazon's RRS.
Compared with the first class, the greatest advantage of this class of cloud storage gateway is that it integrates the interfaces of multiple cloud storages, so the user can choose a suitable cloud storage according to their own storage needs and factors such as the price of each cloud storage.
For both classes of gateway, the availability and storage reliability of user data depend entirely on the cloud storage itself. Availability refers to whether the user can access and read the data stored in the cloud at any time; reliability refers to whether the data the user has stored in the cloud can be lost or damaged. When a cloud storage is taken out of service for system maintenance, or a fault causes downtime or a system crash so that service is temporarily unavailable, neither class of gateway can help the user access their data. When a cloud storage loses or damages user data for whatever reason, neither class of gateway can help the user recover it. Existing cloud storage gateways of both classes therefore cannot guarantee the availability and reliability of user data.
Summary of the invention
The present invention aims to solve at least one of the above technical defects.
To this end, an object of the invention is to provide a data segmenting, coding and recovering method for a cloud storage gateway.
To achieve this object, embodiments of the invention provide a data segmenting, coding and recovering method for a cloud storage gateway, comprising the following steps: receiving a user's data storage request, wherein the data storage request comprises the name of the user data; obtaining the user's reliability requirement value, and calculating, according to that value, a plurality of segmentation coding schemes for segmenting the user data; selecting an optimal segmentation scheme from the plurality of segmentation coding schemes according to the redundancy rate and the number of check blocks of each scheme; segmenting and coding the user data according to the optimal segmentation scheme to generate data blocks and check blocks; storing the data blocks and check blocks in order in a plurality of cloud storages; and, when some of the data blocks or check blocks stored in the plurality of cloud storages are damaged, recovering the user's original data and the damaged data blocks or check blocks from the data blocks and check blocks stored in the other cloud storages.
In the method according to the embodiments of the invention, the user data is segmented into a plurality of data blocks and check blocks that are stored in separate cloud storages, so that the user's original data can be recovered when part of the data is lost, which improves the availability and reliability of user data stored through the cloud storage gateway.
In one embodiment of the invention, the optimal segmentation scheme is the segmentation coding scheme with the lowest redundancy rate.
In one embodiment of the invention, when more than one segmentation coding scheme has the lowest redundancy rate, the scheme among them with the fewest check blocks is the optimal segmentation scheme.
In one embodiment of the invention, recovering the damaged data from the data blocks and check blocks stored in the other cloud storages specifically comprises: searching the plurality of cloud storages, by the name of the user data, for the undamaged data blocks and check blocks corresponding to that name; recovering all data blocks of the user data from the undamaged data blocks and check blocks; and merging all the recovered data blocks into the complete user data, thereby completing the recovery of the user's original data.
In one embodiment of the invention, segmenting and coding the user data according to the optimal segmentation scheme to generate data blocks and check blocks specifically comprises: cutting the user data into a plurality of data blocks of equal length according to the optimal segmentation scheme; and obtaining check blocks of the same length from the plurality of equal-length data blocks.
In one embodiment of the invention, the data blocks and the check blocks are equal in size.
In one embodiment of the invention, recovering all data blocks of the user data from the undamaged data blocks and check blocks specifically comprises: processing a first matrix, obtained according to the data blocks and check blocks, to obtain a second matrix; putting the second matrix into one-to-one correspondence with the named data blocks and check blocks to obtain a third matrix; and processing the third matrix to recover all data blocks of the user data.
Additional aspects and advantages of the invention are given in part in the following description; in part they will become apparent from that description or be learned through practice of the invention.
Description of drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of a data segmenting, coding and recovering method for a cloud storage gateway according to an embodiment of the invention;
Fig. 2 is a flow chart of segmenting and coding user data according to an embodiment of the invention; and
Fig. 3 is a flow chart of recovering user data according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
In the description of the invention, it should be understood that orientation or positional terms such as "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description of the invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the invention. In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be construed as indicating or implying relative importance.
In the description of the invention, it should be noted that, unless otherwise explicitly specified and limited, the terms "mounted", "linked" and "connected" are to be understood broadly. A connection may, for example, be fixed, detachable or integral; it may be mechanical or electrical; it may be direct, or indirect through an intermediary; and it may be an internal connection between two elements. Those of ordinary skill in the art can understand the specific meaning of these terms in the invention according to the specific circumstances.
Fig. 1 is a flow chart of a data segmenting, coding and recovering method for a cloud storage gateway according to an embodiment of the invention. Fig. 2 is a flow chart of segmenting and coding user data according to an embodiment of the invention. As shown in Fig. 1, the method according to the embodiment of the invention comprises the following steps.
Step 101: receive the user's data storage request, wherein the data storage request comprises the name of the user data.
Specifically, the user's data storage request, the reliability requirement value P and the plurality of cloud storage services that are to hold the data are received through the cloud storage gateway. The data storage request comprises the user data to be stored together with its name and size, and is expressed as (D, N, Size), where D denotes the user data, N denotes the name of the user data, and Size denotes the size of the user data in bytes. The reliability requirement value P lies between 90% and 100%. The information on each cloud storage service that is to hold the data comprises the name of the service and its interface specification.
Step 102: obtain the user's reliability requirement value, and calculate, according to that value, a plurality of segmentation coding schemes for segmenting the user data.
Specifically, choose any number of check blocks i, where i is a positive integer. For the chosen number of check blocks i, choose any number of data blocks j, where j is a positive integer. If j satisfies the inequality
$$\sum_{k=1}^{i} \binom{i+j}{k} (1-99.999\%)^{k} (99.999\%)^{i+j-k} \ge P,$$
where 99.999% is the reliability promised by each cloud storage service provider, then this number of data blocks j and number of check blocks i constitute a segmentation coding scheme, denoted (j, i).
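The feasibility test in this step can be sketched in Python. The sketch is an illustration under stated assumptions, not the patent's implementation: it reads the summand as a binomial term and, as an assumption, starts the sum at k = 0 so that the case in which no block is lost is counted (the printed lower limit is k = 1). The function names `survival_probability` and `feasible_data_counts` and the search bound `max_j` are hypothetical. Exact fractions are used so that values very close to P are compared without rounding error.

```python
from fractions import Fraction
from math import comb

# Per-cloud reliability quoted in the text (99.999%), kept exact.
CLOUD_RELIABILITY = Fraction(99999, 100000)

def survival_probability(j: int, i: int, r: Fraction = CLOUD_RELIABILITY) -> Fraction:
    """Probability that at most i of the i + j stored blocks are lost.

    Assumption: the sum starts at k = 0 so the no-loss case is included.
    """
    total = i + j
    return sum(comb(total, k) * (1 - r) ** k * r ** (total - k) for k in range(i + 1))

def feasible_data_counts(i: int, required: Fraction, max_j: int) -> list[int]:
    """Data-block counts j in 1..max_j for which scheme (j, i) meets `required`."""
    return [j for j in range(1, max_j + 1) if survival_probability(j, i) >= required]

# Reliability requirement of the worked example: 99.9999999%.
P = Fraction(999999999, 1000000000)
print(feasible_data_counts(1, P, 10))
```

Under this reading, a single check block admits one to four data blocks for P = 99.9999999%, which agrees with the (j, 1) schemes listed in the worked example.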
Step 103: select the optimal segmentation scheme from the plurality of segmentation coding schemes according to the redundancy rate and the number of check blocks of each scheme. The optimal segmentation scheme is the segmentation coding scheme with the lowest redundancy rate; if more than one scheme has the lowest redundancy rate, the one among them with the fewest check blocks is the optimal segmentation scheme.
Specifically, first find, among the plurality of segmentation coding schemes, all schemes in which the sum of the number of data blocks and the number of check blocks is not greater than the number of cloud storage services specified by the user. Then calculate the redundancy rate of each of these schemes; the redundancy rate is the ratio of the number of check blocks to the total number of data blocks and check blocks in the scheme. The scheme with the lowest redundancy rate is selected as the optimal segmentation scheme. If there is more than one such scheme, the one among them with the fewest check blocks is selected as the optimal segmentation scheme, which is denoted (n, m).
Step 104: segment and code the user data according to the optimal segmentation scheme to generate data blocks and check blocks.
In one embodiment of the invention, the user data is cut into a plurality of data blocks of equal length according to the optimal segmentation scheme. Check blocks of the same length are then obtained from the equal-length data blocks, so that the data blocks and the check blocks are equal in size.
Specifically, determine whether the size of the user data (Size) is divisible by n. If it is not, append A bytes of padding data to the end of the user data so that its size becomes divisible by n, where A is computed as
$$A = n \times \lceil Size / n \rceil - Size,$$
and $\lceil Size / n \rceil$ denotes Size divided by n, rounded up. Then read the user data sequentially in binary form, $\lceil Size / n \rceil$ bytes at a time, to generate the data blocks d_1, d_2, d_3, …, d_n.
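The padding and cutting just described can be sketched as follows. This is a minimal illustration, assuming the padding length is A = n × ⌈Size/n⌉ − Size; the choice of zero bytes as padding content and the function names `padding_length` and `split_into_blocks` are assumptions, since the text does not specify them.

```python
def padding_length(size: int, n: int) -> int:
    """A = n * ceil(size / n) - size, the number of padding bytes."""
    block_size = -(-size // n)          # ceiling division on integers
    return n * block_size - size

def split_into_blocks(data: bytes, n: int) -> list[bytes]:
    """Pad `data` to a multiple of n bytes and cut it into n equal blocks."""
    pad = padding_length(len(data), n)
    padded = data + b"\x00" * pad       # zero padding is an assumption
    block_size = len(padded) // n
    return [padded[k * block_size:(k + 1) * block_size] for k in range(n)]

# 298819583 bytes is the file size used in the worked example (n = 3).
print(padding_length(298819583, 3))
print(split_into_blocks(b"abcdefgh", 3))
```

For the 298819583-byte file of the worked example and n = 3, the padding length is 1 byte, as that example states.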
According to the optimal segmentation scheme, m check blocks of equal length are encoded from the n data blocks. The m check blocks are denoted c_1, c_2, c_3, …, c_m, and each check block is the same size as a data block. The detailed process is as follows.
Construct the m × n matrix
$$A = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 2^{m-1} & 3^{m-1} & \cdots & n^{m-1} \end{pmatrix}.$$
Then express the equal-length data blocks d_1, d_2, d_3, …, d_n as the n-row matrix
$$D = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{pmatrix}.$$
Matrix A and matrix D are then used to compute the check blocks c_1, c_2, c_3, …, c_m required by the optimal segmentation scheme (n, m):
$$\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix} = A \times D.$$
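The encoding step, multiplying the matrix A by the data blocks, can be illustrated with a short sketch. It makes an assumption the text does not settle: it uses plain integer arithmetic on byte values, so a check symbol may exceed one byte and is returned as a list of integers. The statement that check blocks and data blocks are equal in size suggests that an implementation would use some form of finite-field arithmetic, which the text does not specify. The function names `vandermonde` and `encode_check_blocks` are hypothetical.

```python
def vandermonde(m: int, n: int) -> list[list[int]]:
    """m x n matrix A whose row r (0-based) is 1**r, 2**r, ..., n**r."""
    return [[(col + 1) ** r for col in range(n)] for r in range(m)]

def encode_check_blocks(data_blocks: list[bytes], m: int) -> list[list[int]]:
    """Compute C = A x D position by position over equal-length data blocks."""
    n = len(data_blocks)
    a = vandermonde(m, n)
    length = len(data_blocks[0])
    return [
        [sum(a[r][col] * data_blocks[col][pos] for col in range(n))
         for pos in range(length)]
        for r in range(m)
    ]

# Three 2-byte data blocks encoded into two check blocks.
checks = encode_check_blocks([b"\x01\x02", b"\x03\x04", b"\x05\x06"], m=2)
print(checks)
```

With a single check block (m = 1), the matrix A is one row of ones, so c_1 is simply the position-wise sum of the data blocks.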
Step 105: store the data blocks and check blocks in order in a plurality of cloud storages.
Specifically, name the data blocks d_1, d_2, d_3, …, d_n and the check blocks c_1, c_2, c_3, …, c_m. The data blocks are named N_k1, N_k2, N_k3, …, N_kn, and the named data blocks are expressed in the form (d_1, N_k1), (d_2, N_k2), (d_3, N_k3), …, (d_n, N_kn). The check blocks are named N_x1, N_x2, N_x3, …, N_xm, and the named check blocks are expressed in the form (c_1, N_x1), (c_2, N_x2), (c_3, N_x3), …, (c_m, N_xm).
The named data blocks (d_1, N_k1), (d_2, N_k2), (d_3, N_k3), …, (d_n, N_kn) and check blocks (c_1, N_x1), (c_2, N_x2), (c_3, N_x3), …, (c_m, N_xm) are stored in m+n cloud storages, with each cloud storage holding exactly one data block or check block. The name of the user data (N), its size (Size), the corresponding optimal segmentation scheme (n, m) and the names of the cloud storages used are saved in the cloud storage gateway as a user data cloud storage record, expressed as <N, Size, (n, m), (cloud storage 1, cloud storage 2, …, cloud storage m+n)>.
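The naming convention and the gateway-side record can be sketched as below. This is only an illustration: the dictionary layout, the function names `name_blocks` and `make_record`, and the cloud names in the usage line are hypothetical and not taken from the text.

```python
def name_blocks(name: str, n: int, m: int) -> list[str]:
    """Block names N_k1..N_kn followed by N_x1..N_xm for user data named N."""
    data_names = [f"{name}_k{idx}" for idx in range(1, n + 1)]
    check_names = [f"{name}_x{idx}" for idx in range(1, m + 1)]
    return data_names + check_names

def make_record(name: str, size: int, n: int, m: int, clouds: list[str]) -> dict:
    """Gateway-side record <N, Size, (n, m), (cloud 1, ..., cloud m+n)>."""
    if len(clouds) != n + m:
        raise ValueError("one cloud storage is needed per block")
    # Each cloud storage holds exactly one data block or check block.
    placement = dict(zip(name_blocks(name, n, m), clouds))
    return {"name": name, "size": size, "scheme": (n, m), "placement": placement}

# Hypothetical cloud names, for illustration only.
record = make_record("report", 10, 2, 1, ["cloud-a", "cloud-b", "cloud-c"])
print(record["placement"])
```

Keeping the block-to-cloud mapping in the record lets the read path locate every block by the user data name alone.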
Step 106: when some of the data blocks or check blocks stored in the plurality of cloud storages are damaged, recover the user's original data and the damaged data blocks or check blocks from the data blocks and check blocks stored in the other cloud storages.
The plurality of cloud storages are searched, by the name of the user data, for the undamaged data blocks and check blocks corresponding to that name, and all data blocks of the user data are recovered from them. The process is as follows: a first matrix, obtained according to the data blocks and check blocks, is processed to obtain a second matrix; the second matrix is then put into one-to-one correspondence with the named data blocks and check blocks to obtain a third matrix; the third matrix is then processed to recover all data blocks of the user data. Finally, all the recovered data blocks are merged into the complete user data.
Fig. 3 is a flow chart of recovering user data according to an embodiment of the invention. As shown in Fig. 3, the process is as follows.
A user data read request is obtained through the cloud storage gateway; it contains the name N' of the requested user data. The gateway searches for the user data cloud storage record corresponding to the requested name. If the record is found, the data blocks and check blocks of user data N' are read in turn from the cloud storages according to the record. It is then determined whether the total number of data blocks and check blocks read is not less than the number of data blocks n in the optimal segmentation scheme of this user data. If all data blocks of the user data have been read, the contents of d_1, d_2, d_3, …, d_n are read in order and merged into the complete data.
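The decision made on the read path can be summarized in a small sketch. The function name `plan_read` and the returned labels are hypothetical, and the "unrecoverable" branch is an assumption: the text does not say what the gateway does when fewer than n blocks can be read.

```python
def plan_read(n: int, data_found: int, check_found: int) -> str:
    """Decide how to serve a read, given how many blocks could be fetched."""
    if data_found == n:
        return "merge"          # all data blocks present: concatenate d_1..d_n
    if data_found + check_found >= n:
        return "recover"        # enough blocks survive to rebuild the missing ones
    return "unrecoverable"      # assumption: this case is not described in the text

print(plan_read(3, 3, 1), plan_read(3, 2, 1), plan_read(3, 1, 1))
```

The threshold is n because any n of the m+n blocks are used to rebuild the data in the recovery steps.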
In one embodiment of the invention, if not all data blocks of the user data can be read, that is, if part of the user data is damaged and cannot be read normally, the unreadable user data is recovered as follows.
Step 201: choose n blocks from among the data blocks and check blocks that were read.
Step 202: expand the m × n matrix (the first matrix)
$$A = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 2^{m-1} & 3^{m-1} & \cdots & n^{m-1} \end{pmatrix}$$
into the (m+n) × n matrix (the second matrix)
$$B = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 3 & \cdots & n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 2^{m-1} & 3^{m-1} & \cdots & n^{m-1} \end{pmatrix}.$$
Step 203: let rows 1 to n of matrix B correspond in order to the data blocks (d_1, N_k1), (d_2, N_k2), (d_3, N_k3), …, (d_n, N_kn), and rows n+1 to n+m correspond in order to the check blocks (c_1, N_x1), (c_2, N_x2), (c_3, N_x3), …, (c_m, N_xm).
Step 204: find the rows of matrix B that correspond to the n chosen blocks, and form from these rows an n × n matrix B' (the third matrix). Matrix B' is nonsingular.
Step 205: arrange the n chosen blocks, in the order of their names N_k1, N_k2, N_k3, …, N_kn, N_x1, N_x2, N_x3, …, N_xm, as rows 1 to n of a matrix E'.
Step 206: compute the inverse matrix B'^{-1} of matrix B'. To do so, write B' side by side with an identity matrix of the same order, giving [B' I], where I denotes the identity matrix, and apply elementary row operations until this matrix takes the form [I C]. When B' has been transformed into the identity matrix I, the matrix C written alongside it is the inverse of B', that is, B'^{-1} = C.
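The inversion by elementary row operations described in Step 206 can be sketched as below. The sketch uses exact rational arithmetic from Python's `fractions` module, which is an assumption (the text does not state the number system), and the function name `invert` is hypothetical.

```python
from fractions import Fraction

def invert(matrix: list[list[int]]) -> list[list[Fraction]]:
    """Invert a square matrix by reducing [B' | I] to [I | C]; returns C."""
    n = len(matrix)
    # Write B' side by side with the identity matrix of the same order.
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# B' for scheme (3, 1) when d_1, d_3 and c_1 are the three chosen blocks.
print(invert([[1, 0, 0], [0, 0, 1], [1, 1, 1]]))
```

For this B', the middle row of the inverse expresses d_2 as c_1 − d_1 − d_3, which is what one expects when c_1 is the sum of the three data blocks.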
Step 207: multiply B'^{-1} by E' to obtain the data blocks d_1, d_2, d_3, …, d_n of the user data. The computation is
$$\begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{pmatrix} = B'^{-1} \times E'.$$
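Steps 201 to 207 can be put together in one sketch. Several points are assumptions rather than the patent's stated method: block symbols are treated as integers and solved with exact rational arithmetic; the system B' × D = E' is solved by eliminating on [B' | E'] directly instead of forming B'^{-1} first, which gives the same result; and the names `build_b` and `recover` are hypothetical.

```python
from fractions import Fraction

def build_b(n: int, m: int) -> list[list[int]]:
    """(m+n) x n matrix B: the n x n identity stacked on the m x n matrix A."""
    identity = [[int(r == c) for c in range(n)] for r in range(n)]
    a = [[(c + 1) ** r for c in range(n)] for r in range(m)]
    return identity + a

def recover(n: int, m: int, survivors: dict[int, list[int]]) -> list[list[int]]:
    """Recover d_1..d_n from any n surviving blocks.

    `survivors` maps a 0-based row index of B (0..n-1 for data blocks,
    n..n+m-1 for check blocks) to that block's symbols.
    """
    rows = sorted(survivors)[:n]        # data blocks first, then check blocks
    if len(rows) < n:
        raise ValueError("fewer than n blocks survive; recovery is impossible")
    b = build_b(n, m)
    # Augmented matrix [B' | E'], reduced to [I | D] by row operations.
    aug = [[Fraction(v) for v in b[r]] + [Fraction(s) for s in survivors[r]]
           for r in rows]
    for col in range(n):
        # B' is nonsingular per the text, so a non-zero pivot is expected.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [[int(v) for v in row[n:]] for row in aug]

# Scheme (3, 1): d_2 is lost; d_1, d_3 and c_1 (row index 3 of B) survive.
data = recover(3, 1, {0: [1, 2], 2: [5, 6], 3: [9, 12]})
print(data)
```

After the data blocks are rebuilt, merging them and trimming the result to the recorded Size would presumably remove the padding bytes; the text does not describe that step.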
Table 1 gives the details of a user's data storage request. The user's reliability requirement P is 99.9999999%, and four cloud storage services are specified: Google cloud, Baidu cloud, Alibaba cloud and Grand Cloud.
Name | Type | Size
2008NewYear.zip | Compressed file | 298819583 bytes (285 MB)
Table 1
The whole process of the invention is described below using the user data storage request shown in Table 1.
Step (1.1): calculate all segmentation coding schemes for the user data 2008NewYear.zip that satisfy the user's reliability requirement of 99.9999999%. Each segmentation coding scheme is expressed as the number of data blocks required by the coding and the corresponding number of check blocks.
Specifically, step (1.1.1): choose any number of check blocks i, where i is a positive integer.
Step (1.1.2): for the chosen number of check blocks i, choose any number of data blocks j, where j is a positive integer. If j satisfies the inequality
$$\sum_{k=1}^{i} \binom{i+j}{k} (1-99.999\%)^{k} (99.999\%)^{i+j-k} \ge P,$$
then this number of data blocks j and number of check blocks i constitute a segmentation coding scheme.
Enumerating in this way, for check block numbers 1, 2, 3 and 4, all data block numbers that satisfy the user's reliability requirement gives the following segmentation coding schemes (j, i): (1,1), (2,1), (3,1), (4,1), (1,2), (2,2), (3,2), (4,2), (5,2), (6,2), (7,2), (8,2), (1,3), (2,3), (3,3), (4,3), (5,3), (6,3), (7,3), (8,3), (9,3), (10,3), (11,3), (12,3), (1,4), (2,4), (3,4), (4,4), (5,4), (6,4), (7,4), (8,4), (9,4), (10,4), (11,4), (12,4), (13,4), (14,4), (15,4), (16,4).
Step (1.2): find the optimum cutting scheme among all the cutting encoding schemes obtained in step (1.1). It is expressed as (n, m), where n is the number of data blocks and m is the number of check blocks in the optimum cutting scheme.
Specifically, step (1.2.1): from these cutting encoding schemes, find all those whose data-block count plus check-block count does not exceed the number of cloud storages specified by the user. In this example the user specifies 4 cloud storages, so the qualifying cutting encoding schemes are (1,1), (2,1), (3,1), (1,2), (2,2), (1,3).
Step (1.2.2): compute the redundancy rate of each of these cutting encoding schemes. The redundancy rate is the ratio of the check-block count to the sum of the data-block count and the check-block count. The redundancy rates of (1,1), (2,1), (3,1), (1,2), (2,2), (1,3) are 50%, 33.3%, 25%, 66.7%, 50%, 75% respectively.
Step (1.2.3): select the cutting encoding scheme with the minimum redundancy rate. If more than one scheme has the minimum redundancy rate, select from among them the one with the smallest check-block count. The selected scheme is the optimum cutting scheme, expressed as (n, m); in this example it is (3,1), with a redundancy rate of 25%.
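As an illustrative sketch only (not part of the original disclosure), steps (1.2.1) to (1.2.3) can be expressed in Python as follows. The function name select_optimum_scheme and its parameters are hypothetical. The filter is written as j + i <= cloud_count because the example lists (3,1) as a qualifying scheme for 4 cloud storages.

```python
from fractions import Fraction


def select_optimum_scheme(schemes, cloud_count):
    """Pick the optimum (n, m) following steps (1.2.1)-(1.2.3).

    schemes: iterable of (data_blocks, check_blocks) pairs from step (1.1).
    cloud_count: number of cloud storages specified by the user.
    Returns None when no scheme fits the available cloud storages.
    """
    # Step (1.2.1): keep schemes whose total block count fits the clouds.
    feasible = [(j, i) for (j, i) in schemes if j + i <= cloud_count]
    if not feasible:
        return None
    # Step (1.2.2): redundancy rate = check blocks / (data + check blocks).
    # Step (1.2.3): lowest redundancy rate first, then fewest check blocks.
    return min(feasible, key=lambda s: (Fraction(s[1], s[0] + s[1]), s[1]))


# Reproduces the list enumerated in step (1.1): for i = 1..4, j runs 1..4*i.
schemes = [(j, i) for i in range(1, 5) for j in range(1, 4 * i + 1)]
optimum = select_optimum_scheme(schemes, cloud_count=4)
```

Exact fractions are used for the redundancy rate so that ties such as (1,1) and (2,2), both 50%, compare as equal without floating-point rounding.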
Step (1.3): according to the optimum cutting scheme, cut and encode the user data into 3 data blocks and 1 check block. The data blocks are denoted $d_1$, $d_2$, $d_3$ and the check block is denoted $c_1$. The specific steps are as follows:
Step (1.3.1): according to the optimum cutting scheme, cut the user data into 3 equal-length data blocks, denoted $d_1$, $d_2$, $d_3$ respectively.
Specifically, step (1.3.11): determine whether the size (Size) of the user data is divisible by 3. If it is not, go to step (1.3.12); otherwise, go to step (1.3.13). The user data size is 298819583 bytes, which is not divisible by 3, so go to step (1.3.12).
Step (1.3.12): append A bytes of all-zero data to the end of the user data so that the size of the user data is divisible by 3. A is computed as follows:
[Formula image for A not reproduced in this text.]
Here 298819583 leaves a remainder of 2 on division by 3, so A = 1; that is, 1 byte of all-zero data is appended to the end of the user data.
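Step (1.3.13): read the user data sequentially in binary form, one block-sized portion at a time (the block-size formula image is not reproduced in this text; for three equal-length blocks of the padded data the size is 298819584 / 3 = 99606528 bytes), to generate the data blocks $d_1$, $d_2$, $d_3$.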
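The padding and cutting of steps (1.3.11) to (1.3.13) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name pad_and_split is hypothetical, and because the formula image for A is not available in this text, the padding length is assumed to be the smallest number of zero bytes that makes the size divisible by the number of data blocks.

```python
def pad_and_split(data: bytes, n: int) -> list[bytes]:
    """Zero-pad data to a multiple of n bytes, then cut it into n equal blocks."""
    remainder = len(data) % n
    # Assumed padding rule: smallest pad that makes the size divisible by n
    # (0 when the size is already divisible).
    pad = (n - remainder) % n
    padded = data + bytes(pad)  # bytes(k) yields k zero bytes
    block_size = len(padded) // n
    return [padded[k * block_size:(k + 1) * block_size] for k in range(n)]


# Small stand-in for the example file: 8 bytes, and 8 % 3 == 2, as with
# 298819583 % 3 == 2, so exactly one zero byte is appended.
blocks = pad_and_split(b"ABCDEFGH", 3)
```

For the example file the same rule gives a pad of 1 byte and a block size of 99606528 bytes.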
Step (1.3.2): according to the optimum cutting scheme, use the 3 data blocks $d_1$, $d_2$, $d_3$ produced in step (1.3.1) to encode 1 check block, denoted $c_1$. The check block is the same size as each data block.
Specifically, step (1.3.21): construct the 1×3 matrix $A = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}$.
Step (1.3.22): express the equal-length data blocks $d_1$, $d_2$, $d_3$ as the 3-row matrix $D = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}$.
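Step (1.3.23): use matrix A and matrix D to compute the check block $c_1$, which is the number of check blocks required by the optimum cutting scheme (3,1). The formula image is not reproduced in this text; from the definitions of A and D, the computation is the matrix product:
$$c_1 = A D = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = d_1 + d_2 + d_3$$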
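A minimal sketch of the check-block encoding with A = (1 1 1) follows. The text does not state which arithmetic is applied to the block contents, so this sketch assumes bytewise addition reduced modulo 256, which keeps the check block the same size as each data block. The function name encode_check_block is hypothetical.

```python
def encode_check_block(data_blocks: list[bytes]) -> bytes:
    """Compute c1 = A * D for A = (1 1 ... 1): the bytewise sum of the data blocks.

    Assumption (not stated in the source): each sum is reduced modulo 256
    so that every position of the check block fits in one byte.
    """
    size = len(data_blocks[0])
    if any(len(block) != size for block in data_blocks):
        raise ValueError("data blocks must be of equal length")
    # zip(*data_blocks) yields, for each byte position, the bytes of all blocks.
    return bytes(sum(column) % 256 for column in zip(*data_blocks))


c1 = encode_check_block([b"ABC", b"DEF", b"GH\x00"])
```

Under this assumption the coefficient -1 that appears in the inverse matrix during recovery corresponds to subtraction modulo 256, so a single missing data block can be recomputed exactly.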
Step (1.4): name the data blocks $d_1$, $d_2$, $d_3$ and the check block $c_1$. The data blocks are named 2008NewYear_k1.zip, 2008NewYear_k2.zip and 2008NewYear_k3.zip respectively and are expressed in the form ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip). The check block is named 2008NewYear_x1.zip and is expressed in the form ($c_1$, 2008NewYear_x1.zip). In this way each data block and check block is uniquely identified.
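The naming convention of step (1.4) can be sketched as below. The helper name block_names is hypothetical; the _k and _x suffixes and the position of the suffix before the file extension follow the example names in the text.

```python
import os


def block_names(file_name: str, n: int, m: int) -> tuple[list[str], list[str]]:
    """Derive names for n data blocks (_k1.._kn) and m check blocks (_x1.._xm)."""
    stem, ext = os.path.splitext(file_name)  # "2008NewYear", ".zip"
    data_names = [f"{stem}_k{k}{ext}" for k in range(1, n + 1)]
    check_names = [f"{stem}_x{k}{ext}" for k in range(1, m + 1)]
    return data_names, check_names


data_names, check_names = block_names("2008NewYear.zip", 3, 1)
```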
Step (1.5): store the named data blocks ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip) and the check block ($c_1$, 2008NewYear_x1.zip) in 4 cloud storages respectively (Google cloud, Baidu cloud, Ali cloud and Grand cloud), so that each cloud storage holds only one data block or check block.
Step (1.6): store the name 2008NewYear.zip, the size 298819583 and the corresponding optimum cutting scheme (3,1) of the user data in the cloud storage gateway as a user data cloud storage record, expressed as <2008NewYear.zip, 298819583, (3,1)>.
The user data is recovered as follows.
Step (2.1): obtain, from the cloud storage gateway, the user's read request for the data 2008NewYear.zip.
Step (2.2): in the cloud storage gateway, look up the data cloud storage record <2008NewYear.zip, 298819583, (3,1), (Google cloud, Baidu cloud, Ali cloud, Grand cloud)> corresponding to the user data 2008NewYear.zip requested by the user. If the record is found, go to step (2.3); otherwise, end the lookup and return to the user the result that the data cannot be read.
Step (2.3): according to the requested user data name 2008NewYear.zip and the data cloud storage record <2008NewYear.zip, 298819583, (3,1), (Google cloud, Baidu cloud, Ali cloud, Grand cloud)>, read in turn from the cloud storages the data blocks 2008NewYear_k1.zip, 2008NewYear_k2.zip, 2008NewYear_k3.zip and the check block 2008NewYear_x1.zip of this data. If data blocks and check blocks are read, go to step (2.4); otherwise, the user data cannot be recovered and the procedure ends.
Step (2.4): if the total number of data blocks and check blocks read is not less than 3, the number of data blocks in the optimum cutting scheme corresponding to this user data, go to step (2.5); otherwise, the user data cannot be recovered and the procedure ends.
Step (2.5): according to the names of the data blocks and check blocks, determine whether all data blocks of this user data, ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip), have been read. If all data blocks of this user data have been read, go to step (2.8); otherwise, go to step (2.6).
Step (2.6): from the data blocks and check blocks read in step (2.3), choose 3 data and check blocks: (2008NewYear_k1.zip), (2008NewYear_k3.zip), (2008NewYear_x1.zip).
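The branching in steps (2.4) to (2.6) can be sketched as follows. This is an illustration under stated assumptions: the function name plan_recovery and the labels "fail", "merge" and "decode" are hypothetical, and the sketch assumes that when decoding is needed the first n available blocks are chosen in the order data blocks first, then check blocks.

```python
def plan_recovery(available: set[str], data_names: list[str],
                  check_names: list[str]) -> tuple[str, list[str]]:
    """Decide how to proceed once the readable blocks are known.

    Returns ("fail", []) when too few blocks were read (step 2.4),
    ("merge", data_names) when every data block was read (step 2.5), or
    ("decode", chosen) with n blocks chosen for matrix recovery (step 2.6).
    """
    n = len(data_names)
    if len(available) < n:
        return "fail", []
    if all(name in available for name in data_names):
        return "merge", list(data_names)
    # Assumed selection rule: keep name order k1..kn, x1..xm and take n blocks.
    ordered = [name for name in data_names + check_names if name in available]
    return "decode", ordered[:n]


data_names = ["2008NewYear_k1.zip", "2008NewYear_k2.zip", "2008NewYear_k3.zip"]
check_names = ["2008NewYear_x1.zip"]
# Situation matching the example selection: the k2 block could not be read.
readable = {"2008NewYear_k1.zip", "2008NewYear_k3.zip", "2008NewYear_x1.zip"}
action, chosen = plan_recovery(readable, data_names, check_names)
```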
Step (2.7): use the 3 data and check blocks chosen in step (2.6) to restore all data blocks of the user data: ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip).
Specifically, step (2.7.1): extend the 1×3 matrix $A = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}$ to obtain the 4×3 matrix
$$B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$
Step (2.7.2): let rows 1 to 3 of matrix B correspond in turn to the data blocks ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip), and let row 4 correspond to the check block ($c_1$, 2008NewYear_x1.zip).
Step (2.7.3): find the rows of matrix B that correspond to the 3 selected data and check blocks (2008NewYear_k1.zip), (2008NewYear_k3.zip), (2008NewYear_x1.zip), and form these rows into the 3×3 matrix
$$B' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$
Matrix $B'$ is nonsingular.
Step (2.7.4): arrange the 3 selected data and check blocks in the name order N_k1, N_k2, N_k3, ..., N_kn, N_x1, N_x2, N_x3, ..., N_xm, and express them, from row 1 to row 3, as the matrix
$$E' = \begin{pmatrix} \text{2008NewYear\_k1.zip} \\ \text{2008NewYear\_k3.zip} \\ \text{2008NewYear\_x1.zip} \end{pmatrix}.$$
Step (2.7.5): compute the inverse $B'^{-1}$ of matrix $B'$. To do so, write $B'$ side by side with an identity matrix of the same order, denoted $[B' \mid I]$, where I is the identity matrix, and then apply elementary row operations to bring this matrix to the form $[I \mid C]$. That is, when $B'$ has been transformed into the identity matrix I, the matrix C written beside it is $B'^{-1}$:
$$B'^{-1} = C = \begin{pmatrix} 1 & 0 & 0 \\ -1 & -1 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$
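The row selection of step (2.7.3) and the inversion of step (2.7.5) can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name invert is hypothetical, and exact rational arithmetic (Python's fractions module) is used here as a convenience so that the elementary row operations introduce no rounding.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan elimination: reduce [M | I] to [I | M^-1] by row operations."""
    size = len(matrix)
    # Build the augmented matrix [M | I].
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(size)]
           for r, row in enumerate(matrix)]
    for col in range(size):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, size) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear this column in every other row.
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


B = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
selected_rows = [0, 2, 3]  # rows of B for the k1, k3 and x1 blocks
B_prime = [B[r] for r in selected_rows]
B_prime_inv = invert(B_prime)
```

For the selection in this example the result has only integer entries, which agrees with the inverse matrix given in the text.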
Step (2.7.6): multiply $B'^{-1}$ by $E'$ to obtain the data blocks into which this user data was cut, ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip). The computation is as follows:
$$\begin{pmatrix} \text{2008NewYear\_k1.zip} \\ \text{2008NewYear\_k2.zip} \\ \text{2008NewYear\_k3.zip} \end{pmatrix} = B'^{-1} E' = \begin{pmatrix} 1 & 0 & 0 \\ -1 & -1 & 1 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} \text{2008NewYear\_k1.zip} \\ \text{2008NewYear\_k3.zip} \\ \text{2008NewYear\_x1.zip} \end{pmatrix}.$$
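The product of the inverse matrix and the selected blocks can be sketched as below. The source does not state the arithmetic used on block contents, so this sketch assumes the check block was formed by bytewise addition modulo 256; under that assumption the coefficient -1 becomes subtraction modulo 256. The function name recover_data_blocks and the small byte strings are hypothetical stand-ins for the real block files.

```python
def recover_data_blocks(inverse, selected_blocks):
    """Apply the inverse matrix to the selected blocks, one byte position at a time.

    Assumption (not stated in the source): arithmetic is modulo 256.
    """
    size = len(selected_blocks[0])
    recovered = []
    for row in inverse:
        block = bytes(
            sum(coef * blk[pos] for coef, blk in zip(row, selected_blocks)) % 256
            for pos in range(size)
        )
        recovered.append(block)
    return recovered


B_prime_inv = [[1, 0, 0], [-1, -1, 1], [0, 1, 0]]
# Stand-ins for k1, k3 and x1, where x1 = k1 + k2 + k3 (mod 256) and k2 = b"DEF".
k1, k3, x1 = b"ABC", b"GH\x00", bytes([204, 207, 137])
d1, d2, d3 = recover_data_blocks(B_prime_inv, [k1, k3, x1])
```

The second row of the inverse yields the missing block as x1 - k1 - k3, while the first and third rows simply pass k1 and k3 through unchanged.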
Step (2.8): read the contents of ($d_1$, 2008NewYear_k1.zip), ($d_2$, 2008NewYear_k2.zip), ($d_3$, 2008NewYear_k3.zip) in turn and merge them into the complete data, whose size is 298819584 bytes.
Step (2.9): according to the recorded size of this user data, 298819583 bytes, take the first 298819583 bytes starting from the head of the complete data generated in step (2.8), thereby recovering the original user data 2008NewYear.zip.
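Steps (2.8) and (2.9) amount to concatenating the data blocks and discarding the padding. A minimal sketch, with the hypothetical function name merge_and_truncate and small byte strings standing in for the real blocks:

```python
def merge_and_truncate(data_blocks: list[bytes], original_size: int) -> bytes:
    """Concatenate the data blocks in order and keep the first original_size bytes."""
    merged = b"".join(data_blocks)
    if original_size > len(merged):
        raise ValueError("recorded size exceeds the merged data length")
    return merged[:original_size]


# The recorded size (8 here, 298819583 in the example) drops the padding byte.
restored = merge_and_truncate([b"ABC", b"DEF", b"GH\x00"], 8)
```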
According to the method of the embodiment of the invention, user data is cut into a plurality of data blocks and check blocks, which are stored in a plurality of cloud storages respectively. When part of the user data is lost, all of the user's data can be recovered from the data blocks and check blocks held in the other cloud storages, which improves the availability and reliability of the cloud storage gateway's storage of user data.
Although embodiments of the invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the invention. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the invention without departing from its principle and purpose.

Claims (7)

1. A data cutting, coding and recovery method for a cloud storage gateway, characterized in that it comprises the following steps:
receiving a user's data storage request, wherein the data storage request comprises the name of the user data;
obtaining a value that satisfies the user's reliability requirement, and computing, according to the user's reliability requirement, a plurality of cutting encoding schemes for cutting the user data;
selecting an optimum cutting scheme from the plurality of cutting encoding schemes according to the redundancy rate and the check-block count corresponding to each cutting scheme;
cutting and encoding the user data according to the optimum cutting scheme to generate data blocks and check blocks;
storing the data blocks and check blocks, in order, into a plurality of cloud storages respectively; and
when some of the data blocks or check blocks stored in the plurality of cloud storages are destroyed, recovering the user's original data and the destroyed data blocks or check blocks from the data blocks and check blocks stored in the other cloud storages.
2. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 1, characterized in that the optimum cutting scheme is the cutting encoding scheme with the minimum redundancy rate.
3. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 2, characterized in that, when there is more than one cutting encoding scheme with the minimum redundancy rate, the cutting scheme with the smallest check-block count is the optimum cutting scheme.
4. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 1, characterized in that recovering the destroyed data from the data blocks and check blocks stored in the other cloud storages specifically comprises:
searching the plurality of cloud storages, according to the name of the user data, for the undestroyed data blocks and check blocks corresponding to the name of the user data;
recovering all data blocks of the user data from the undestroyed data blocks and check blocks; and
merging all the recovered data blocks into the complete user data, thereby completing recovery of the user's original data.
5. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 1, characterized in that cutting and encoding the user data according to the optimum cutting scheme to generate data blocks and check blocks specifically comprises:
cutting the user data into a plurality of data blocks of equal length according to the optimum cutting scheme; and
obtaining check blocks of the corresponding equal length from the plurality of data blocks of equal length.
6. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 5, characterized in that the data blocks and the check blocks are equal in size.
7. The data cutting, coding and recovery method for a cloud storage gateway as claimed in claim 4, characterized in that recovering all data blocks of the user data from the undestroyed data blocks and check blocks specifically comprises:
processing a first matrix obtained from the data blocks and check blocks to obtain a second matrix;
placing the second matrix in one-to-one correspondence with the renamed data blocks and check blocks to obtain a third matrix; and
performing processing according to the third matrix to recover all data blocks of the user data.
CN2013102420120A 2013-06-18 2013-06-18 Data segmenting, coding and recovering method used for cloud storage gateway Pending CN103281400A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102420120A CN103281400A (en) 2013-06-18 2013-06-18 Data segmenting, coding and recovering method used for cloud storage gateway

Publications (1)

Publication Number Publication Date
CN103281400A true CN103281400A (en) 2013-09-04

Family

ID=49063845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102420120A Pending CN103281400A (en) 2013-06-18 2013-06-18 Data segmenting, coding and recovering method used for cloud storage gateway

Country Status (1)

Country Link
CN (1) CN103281400A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130904