WO2009097710A1 - Method for organizing and retrieving files, module and system for organizing files and storage media thereof - Google Patents

Method for organizing and retrieving files, module and system for organizing files and storage media thereof

Info

Publication number
WO2009097710A1
WO2009097710A1 PCT/CN2008/071908 CN2008071908W WO2009097710A1 WO 2009097710 A1 WO2009097710 A1 WO 2009097710A1 CN 2008071908 W CN2008071908 W CN 2008071908W WO 2009097710 A1 WO2009097710 A1 WO 2009097710A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
module
index
retrieving
storage node
Prior art date
Application number
PCT/CN2008/071908
Other languages
French (fr)
Chinese (zh)
Inventor
Wentao Yang
Kegang Dou
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2009097710A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Definitions

  • the present invention relates to the field of communications, and in particular, to a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer readable storage medium.
  • before a bill is converted for printing or displayed by a tool, it is stored on the storage module as a file. This type of file is called a bill file.
  • the bill file has the following characteristics: the files are small; the files are numerous; locating a single file is inefficient; modifying the content of each file is inefficient; and the files occupy a large amount of storage space.
  • on the one hand, the number of files is too large, which wastes the storage nodes (storage space) of the file system; on the other hand, searching for and locating a file among a large number of files is inefficient. In a specific application such as the Business Operation Support System (BOSS), the low efficiency of locating bill files makes printing (presentation) and reprinting of bill files slow, which reduces users' satisfaction with the billing service.
  • the technical problem to be solved by the embodiments of the present invention is to provide a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer-readable storage medium, which can solve both the problem of storage space wasted by an excessive number of files and the problem of low file location efficiency.
  • an embodiment of the present invention provides a method for organizing files, including: obtaining at least one file set; merging all the files in each file set under one storage node; and establishing an index for retrieving the merged files under the storage node.
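  • an embodiment of the invention further provides a file retrieval method, including:
  • receiving a read request for reading a file in a file set merged under a storage node;
  • obtaining, according to the read request, an index for retrieving files in the file set merged under the storage node, and the file set;
  • outputting the file in the file set that corresponds to the index.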
  • an embodiment of the present invention further provides a file organization module, including:
  • an obtaining submodule for obtaining at least one file set;
  • a merging submodule for merging all the files in each of the file sets under one storage node;
  • an index establishing submodule for establishing an index for retrieving the merged files under the storage node.
  • an embodiment of the present invention further provides a file retrieval system, including:
  • the storage module corresponds to at least one storage node, configured to store an index for retrieving files in the merged file set under the storage node, and the file set;
  • a master control module configured to receive a read request for reading a file in a file set merged under the storage node, and output corresponding control information according to the read request;
  • a file retrieval module configured to obtain from the storage module, according to the control information of the master control module, the index for retrieving files in the file set merged under the storage node and the file set, and to output the index and the file set;
  • a file output module configured to output a file corresponding to the index in the file set.
  • an embodiment of the present invention further provides a computer-readable storage medium that stores a plurality of executable instructions, where the executable instructions are used to: obtain at least one file set; merge all the files in each file set under one storage node; and establish an index for retrieving the merged files under the storage node.
  • by obtaining at least one file set, merging all the files in each file set under one storage node, and establishing an index for retrieving the merged files under the storage node, the embodiments solve the problem of storage space wasted by an excessive number of files; and, on the basis of this file and index storage relationship, by receiving a read request for a file in the file set merged under the storage node and then locating and outputting the file corresponding to the read request according to the index, the embodiments solve the problem of low file location efficiency.
  • FIG. 1 is a schematic diagram of a method of organizing a file according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a file storage structure established in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a file organization module according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a file retrieval system according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a method of retrieving a file according to an embodiment of the present invention.
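  • FIG. 1 is a schematic diagram of a method for organizing a file according to an embodiment of the present invention, where the method includes:
  • 101: classifying files to obtain at least one file set; each file set corresponds to one classification result, that is, it forms a file set of one or several classes; in a specific implementation, a hash algorithm may be used to classify the files, but the classification is not limited to this;
  • 102: establishing a directory structure for the file sets; the directory files in the directory structure correspond to the file sets; in a specific implementation, a hash algorithm may be used to establish the directory structure for each file set, but is not limited thereto;
  • 103: obtaining the file sets and merging all the files in each file set under one storage node;
  • 104: establishing an index for retrieving the merged files under the storage node; in a specific implementation, the index may be organized as a B+ tree, and this description is equally applicable to other embodiments of the present invention.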
  • the file storage structure established above can be as shown in FIG. 2, including the overall directory structure f(y), the files f(x) in the directories under the directory structure, the index (idx file), the packed and compressed node (.tar.gz), and the merged files (numbered from 1.256900 to 1.659900, from 1.348699 to 1.648699, and so on).
  • as an implementation, between 103 and 104 the method may further include:
  • packaging and compressing the files under each merged storage node.
  • 101 and 102 are optional and may be applied according to the actual situation.
  • an index corresponding to the file may be updated according to an operation of adding, deleting, or modifying the file.
  • the above file may be a bill file, or other type of file.
  • implementing the file organization method of the embodiment of the present invention shown in FIG. 1, which classifies the files, establishes a directory structure, merges the files under a storage node, and establishes an index, can solve the problem of storage space wasted by an excessive number of files.
  • the file organization module includes a classification sub-module 41, a directory establishment sub-module 42, an acquisition sub-module 43, a merge sub-module 44, and an index establishment sub-module 45.
  • the function of each submodule is as follows:
  • a classification sub-module 41 configured to classify files, obtain at least one file set, and each file set corresponds to each classification result, that is, form a file collection of a certain class or classes;
  • the directory creation sub-module 42 is configured to establish a directory structure for the file sets obtained by the classification processing of the classification sub-module 41, and the directory files in the directory structure correspond to the file sets;
  • the algorithm used for the classification processing of the classification sub-module 41 or for the directory structure establishment of the directory creation sub-module 42 may be a hash algorithm;
  • the obtaining sub-module 43 is configured to obtain the file sets produced by the classification processing of the classification sub-module 41;
  • the merging sub-module 44 is configured to merge all the files in each file set obtained by the obtaining sub-module 43 under one storage node;
  • the index establishing sub-module 45 establishes an index for retrieving the merged file under the storage node.
  • the index organization may adopt the form of a B+ tree, and the above description is also applicable to other embodiments of the present invention.
  • each of the above functional units performs a corresponding function, and the created file storage structure can still be as shown in FIG. 2.
  • all files in each merged file set in the merge submodule 44 may be package compressed by a package compression submodule.
  • the classification sub-module 41 and the directory creation sub-module 42 may be selected according to actual conditions.
  • when a file in the established file storage structure is added, deleted, or modified, the index (idx file) corresponding to the file may be updated by an index maintenance submodule according to that operation, so that the index is maintained.
  • the above file may be a bill file, or other type of file.
  • implementing the file organization module of the embodiment of the present invention shown in FIG. 3, in which different sub-modules classify the files, establish a directory structure, merge the files under a storage node, and establish an index, can solve the problem of storage space wasted by an excessive number of files.
  • the embodiment of the present invention further provides a storage module, which stores the storage structure of the file constructed as described above.
  • FIG. 4 is a schematic diagram of a file retrieval system according to an embodiment of the present invention.
  • the system mainly includes a storage module 51, a master control module 52, a file retrieval module 53, a file generation module 54, a file organization module 55, a file output module 56, and a bill presentation processing module 57; the functions of the modules are as follows:
  • the storage module 51 corresponds to at least one storage node and is configured to store the index for retrieving bill files in the file set merged under the storage node, and the file set; the storage structure of the bill files in the storage module 51 may still be as shown in FIG. 2, and details are not described herein again;
  • the receiving submodule in the master control module 52 receives the read request of the bill presentation processing module 57 to read the bill file in the file set merged under the storage node in the storage module 51, and the read request may be a print request or a reprint request for a billing file;
  • the control submodule in the master control module 52 determines whether the bill file corresponding to the read request is in the request queue; if so, it outputs first control information for controlling output of the bill file corresponding to the read request, where the first control information carries a request queue number; otherwise, it outputs second control information for controlling the file retrieval module 53 to obtain from the storage module 51 the index for retrieving the bill file in the file set merged under the storage node and the file set; the second control information includes the read request and the index key value, and the file retrieval module 53 may retrieve the index and the file set according to the index key value;
  • the file output module 56 directly transmits the bill file to the bill presentation processing module 57 in the form of a file stream; the bill presentation processing module 57 may include a bill presentation processing program;
  • the file retrieval module 53, according to the second control information, obtains from the storage module 51 the index for retrieving the bill file in the file set merged under the storage node and the file set, and transmits the obtained index and file set to the file output module 56;
  • the file output module 56 reads the file set and index sent from the file retrieval module 53, selects the corresponding bill file in the file set according to the index, and transmits the bill file to the bill presentation processing module 57 as a file stream, whereupon the bill presentation processing module 57 triggers a print or reprint operation on the bill file to complete the bill presentation;
  • the control submodule in the master control module 52, after receiving result information returned by the file retrieval module 53 indicating that the file retrieval module 53 did not obtain the index and the file set from the storage module 51, sends to the file generation module 54 third control information for controlling generation of the user's bill file;
  • the file generation module 54 generates the user's bill file according to the third control information of the master control module 52; the file generation module 54 may include a file generating program;
  • the file organization module 55 organizes the bill file generated by the file generation module 54 into the storage module 51; the file organization module 55 can process the generated bill file using the functions of the file organization module shown in FIG. 3 and send the processed bill file to the storage module 51 for storage;
  • the file retrieval module 53 can then obtain the generated bill file from the storage module 51; together with the index sent by the master control module 52, the file retrieval module 53 can obtain the index for retrieving the bill file in the file set merged under the storage node and the file set, and transfer them to the file output module 56 to complete the bill presentation.
  • the file output module 56 specifically includes:
  • a compressed stream processing module which, after reading the file set and index sent by the file retrieval module 53 (the bill files in the file set exist in the form of a compressed package), decompresses the bill file corresponding to the index in the file set;
  • the output module transmits the bill file obtained by decompressing the compressed stream processing module to the bill presentation processing module 57 in the form of a file stream.
  • the above file retrieval system can be applied to a bill presentation subsystem in BOSS.
  • the file generation module 54 and the file organization module 55 are optional and may be used according to the actual situation.
  • the bill file may be organized in a fixed format.
  • FIG. 5 is a schematic diagram of a file retrieval method according to an embodiment of the present invention. The method is based on the storage structure of the file created in FIG. 2, referring to FIG. 5, in conjunction with the system shown in FIG. 4, the method includes:
  • the master control module receives, from the bill presentation processing module, a read request to read a bill file in the file set merged under the storage node; the read request may be a print request or a reprint request for the bill file, but is not limited to these;
  • the master control module determines whether the read request is in an existing request queue and, if so, directly notifies the file output module of the request queue number, whereupon the file output module transmits the bill file directly to the bill presentation processing module as a file stream;
  • the request queue in the embodiment of the present invention caches recently used bill files so that they can be dispatched quickly to the bill presentation processing module for processing; bill files in the request queue that have not been used for a long time can be cleared periodically, for example by setting a time limit and clearing a bill file once that limit is exceeded, which saves request queue resources;
  • the master control module, according to the read request, controls the file retrieval module to obtain from the storage module the index for retrieving the bill file in the file set merged under the storage node, and the file set.
  • the control information that the master control module generates for the file retrieval module includes the read request and the index key value, and the file retrieval module may retrieve the index and the file set according to the index key value;
  • the file retrieval module transfers the obtained index and file set to the file output module;
  • the file output module reads the file collection and index sent by the file retrieval module, and selects a corresponding bill file in the file collection according to the index;
  • the file output module transmits the bill file to the bill presentation processing module in a file stream, so that the bill presentation processing module can trigger a printing or reprinting operation on the bill file to complete the bill presentation.
  • the master control module obtains the result information returned by the file retrieval module, where the result information indicates that the file retrieval module did not obtain the index and the file set from the storage module;
  • the master control module controls the file generation module to generate the user's bill file;
  • at the same time, the master control module controls the file organization module to organize the generated bill file;
  • the master control module sends the index corresponding to the generated bill file to the file retrieval module;
  • the file generation module sends the generated bill file of the user to the file organization module, and the file organization module can process the generated bill file according to the process of the organization method of the file shown in FIG. 1;
  • the file organization module sends the processed bill file to the storage module for storage.
  • the file retrieval module can then obtain the generated bill file from the storage module and, together with the index sent by the master control module in 310, perform the corresponding function of 304 to complete the bill presentation.
  • the above 305 is specifically:
  • after the file output module reads the file set and the index sent by the file retrieval module (the bill files in the file set exist in the form of a compressed package), it decompresses the bill file corresponding to the index in the file set and then processes the decompressed bill file as in 306. To improve efficiency, only the corresponding bill file may be decompressed according to the index, without decompressing the whole file set.
  • the embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a plurality of executable instructions, where the executable instructions are used to:
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
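
The request queue described in the list above (recently used bill files are cached, and files unused beyond a set time are cleared) can be sketched as a small time-limited cache. This is only an illustration under stated assumptions: the class name `RequestQueueCache`, the `ttl` parameter, and the injectable `clock` are hypothetical and do not come from the patent text.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RequestQueueCache:
    """Hypothetical request queue: caches bill files and expires unused ones."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl              # assumed time limit, in seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, bill: bytes) -> None:
        """Cache a bill file under a request key and record when it was used."""
        self._entries[key] = (self._clock(), bill)

    def get(self, key: str) -> Optional[bytes]:
        """Return the cached bill file, or None if it has to be retrieved."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        self._entries[key] = (self._clock(), entry[1])  # refresh last-used time
        return entry[1]

    def purge(self) -> int:
        """Clear entries unused for longer than ttl; return how many were cleared."""
        now = self._clock()
        stale = [k for k, (used, _) in self._entries.items() if now - used > self._ttl]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

A cache hit corresponds to the branch where the master control module finds the request in the queue; a miss corresponds to the branch that goes through the file retrieval module.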

Abstract

A method for organizing files is disclosed. The method comprises the steps of: obtaining at least one set of files; merging all the files in each of the sets of files into a storage node; and establishing an index for retrieving the files merged into the storage node. A method for retrieving files, a module for organizing files, a system for retrieving files, and a computer-readable storage medium are also disclosed.

Description

File organization and retrieval methods, file organization module, system, and storage medium

Technical field
The present invention relates to the field of communications, and in particular to a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer-readable storage medium.

Background
In the telecommunications industry today, especially overseas, information about a user's consumption and charges must be shown on a bill provided to the user. Advertisements can also be printed on or inserted into the bill to inform users of new activities and new preferential policies, or for other marketing purposes. Before a bill is converted for printing or displayed by a tool, it is stored on a storage module as a file. Such a file is called a bill file, and bill files have the following characteristics:
A1. The files are small;
A2. The files are numerous;
A3. Locating a single file is inefficient;
A4. Modifying the content of each file is inefficient;
A5. The files occupy a large amount of storage space.
On the one hand, the number of files is too large, which wastes the storage nodes (storage space) of the file system. On the other hand, searching for and locating a file among a large number of files is inefficient. In a specific application such as the Business Operation Support System (BOSS), the low efficiency of locating bill files makes printing (presentation) and reprinting of bill files slow, which reduces users' satisfaction with the billing service.

Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a file organization method, a file retrieval method, a file organization module, a file retrieval system, and a computer-readable storage medium that can solve both the problem of storage space wasted by an excessive number of files and the problem of low file location efficiency.
To solve the above technical problem, an embodiment of the present invention provides a method for organizing files, including: obtaining at least one file set; merging all the files in each file set under one storage node; and establishing an index for retrieving the merged files under the storage node.
An embodiment of the present invention further provides a file retrieval method, including:
receiving a read request for reading a file in a file set merged under a storage node; obtaining, according to the read request, an index for retrieving files in the file set merged under the storage node, and the file set;
outputting the file in the file set that corresponds to the index.
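
The three retrieval steps above can be sketched as a lookup in an index followed by a slice of the merged data. This is a minimal illustration, assuming (the patent does not specify this) that the merged file set is a single byte string and that the index maps a file name to an offset and a length; `retrieve`, `merged_blob`, and `bill_index` are hypothetical names.

```python
from typing import Dict, Tuple

# Assumed index layout: file name -> (offset, length) inside the merged data.
Index = Dict[str, Tuple[int, int]]


def retrieve(file_name: str, index: Index, merged: bytes) -> bytes:
    """Output the file in the merged file set that the index points to."""
    offset, length = index[file_name]  # raises KeyError for an unknown file
    return merged[offset:offset + length]


# Illustrative data: two small bill files merged back to back.
merged_blob = b"bill-Abill-BB"
bill_index: Index = {"a.bill": (0, 6), "b.bill": (6, 7)}
```

Only one merged object and one index have to be located per file set, rather than one file system entry per bill.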
Correspondingly, an embodiment of the present invention further provides a file organization module, including:
an obtaining submodule, configured to obtain at least one file set;
a merging submodule, configured to merge all the files in each file set under one storage node;
an index establishing submodule, configured to establish an index for retrieving the merged files under the storage node.
Correspondingly, an embodiment of the present invention further provides a file retrieval system, including:
a storage module, which corresponds to at least one storage node and is configured to store an index for retrieving files in the file set merged under the storage node, and the file set;
a master control module, configured to receive a read request for reading a file in the file set merged under the storage node, and to output corresponding control information according to the read request;
a file retrieval module, configured to obtain from the storage module, according to the control information of the master control module, the index for retrieving files in the file set merged under the storage node and the file set, and to output the index and the file set;
a file output module, configured to output the file in the file set that corresponds to the index.
Correspondingly, an embodiment of the present invention further provides a computer-readable storage medium that stores a plurality of executable instructions, where the executable instructions are used to:
obtain at least one file set;
merge all the files in each file set under one storage node;
establish an index for retrieving the merged files under the storage node.
By obtaining at least one file set, merging all the files in each file set under one storage node, and establishing an index for retrieving the merged files under the storage node, the embodiments of the present invention solve the problem of storage space wasted by an excessive number of files. On the basis of this file and index storage relationship, by receiving a read request for a file in the file set merged under the storage node and then locating and outputting the file corresponding to the read request according to the index, the embodiments solve the problem of low file location efficiency and improve user satisfaction.

Brief description of the drawings
FIG. 1 is a schematic diagram of a method for organizing files according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a file storage structure established in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a file organization module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a file retrieval system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a method for retrieving files according to an embodiment of the present invention.

Detailed description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a method for organizing files according to an embodiment of the present invention. The method includes:
101: classifying files to obtain at least one file set. Each file set corresponds to one classification result, that is, it forms a file set of one or several classes. In a specific implementation, a hash algorithm may be used to classify the files, but the classification is not limited to this;
102: establishing a directory structure for the file sets. The directory files in the directory structure correspond to the file sets. In a specific implementation, a hash algorithm may be used to establish the directory structure for the file sets, but is not limited to this;
103: obtaining the file sets and merging all the files in each file set under one storage node;
104: establishing an index for retrieving the merged files under the storage node. In a specific implementation, the index may be organized as a B+ tree; this description applies equally to other embodiments of the present invention.
The file storage structure established above can be as shown in FIG. 2, which includes the overall directory structure f(y), the files f(x) in the directories under the directory structure, the index (idx file), the packed and compressed node (.tar.gz), and the merged files (numbered from 1.256900 to 1.659900, from 1.348699 to 1.648699, and so on).
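
Steps 101 to 104 can be sketched end to end as follows. This is an illustration under stated assumptions rather than the patented implementation: CRC32 stands in for the unspecified hash algorithm, the index is a JSON file instead of a B+ tree, and the names `bucket_of`, `organize`, `read_back`, `merged.dat`, and `merged.idx` are hypothetical.

```python
import json
import tempfile
import zlib
from pathlib import Path
from typing import Dict


def bucket_of(name: str, buckets: int = 4) -> str:
    """101/102: classify a file by hashing its name into a directory name."""
    return f"{zlib.crc32(name.encode('utf-8')) % buckets:02d}"


def organize(files: Dict[str, bytes], root: Path, buckets: int = 4) -> None:
    """103/104: merge each file set under one node and write its index."""
    sets: Dict[str, Dict[str, bytes]] = {}
    for name, data in files.items():
        sets.setdefault(bucket_of(name, buckets), {})[name] = data
    for bucket, members in sets.items():
        node = root / bucket
        node.mkdir(parents=True, exist_ok=True)
        index, offset = {}, 0
        with open(node / "merged.dat", "wb") as merged:
            for name in sorted(members):
                merged.write(members[name])
                index[name] = [offset, len(members[name])]
                offset += len(members[name])
        (node / "merged.idx").write_text(json.dumps(index))


def read_back(name: str, root: Path, buckets: int = 4) -> bytes:
    """Locate one file through its bucket directory and the bucket's index."""
    node = root / bucket_of(name, buckets)
    offset, length = json.loads((node / "merged.idx").read_text())[name]
    with open(node / "merged.dat", "rb") as merged:
        merged.seek(offset)
        return merged.read(length)


def demo() -> bool:
    """Organize ten small files and check that each can be read back intact."""
    files = {f"bill_{i}.txt": f"amount={i}".encode("utf-8") for i in range(10)}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        organize(files, root)
        return all(read_back(name, root) == data for name, data in files.items())
```

With four buckets, ten input files become at most four merged files and four index files, which is the reduction in file count that the method is aiming for.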
As an implementation, between 103 and 104 the method may further include:
packaging and compressing the files under each merged storage node.
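
The packaging and compression step can be sketched with Python's standard `tarfile` module, which produces the `.tar.gz` format named in the description of FIG. 2. The helper names `pack_node` and `unpack_one` are hypothetical, and keeping the archive in memory is an assumption made to keep the example short.

```python
import io
import tarfile
from typing import Dict


def pack_node(files: Dict[str, bytes]) -> bytes:
    """Pack the files merged under one storage node into a .tar.gz archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_one(archive_bytes: bytes, name: str) -> bytes:
    """Read a single member without extracting the whole archive to disk."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        member = archive.extractfile(name)  # raises KeyError if name is absent
        if member is None:                  # None means not a regular file
            raise KeyError(name)
        return member.read()
```

Note that a gzip-compressed tar is a single compressed stream, so reading one member still decompresses the data that precedes it; an implementation that needs cheap per-file access would have to choose a format that compresses members independently.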
In addition, 101 and 102 are optional and may be applied according to the actual situation. As an implementation, when a file in the established file storage structure is to be added, deleted, or modified, the index (idx file) corresponding to the file may be updated according to the add, delete, or modify operation, so that the index is maintained.
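
Index maintenance on add, delete, and modify can be sketched as follows. The append-only layout is an assumption chosen for brevity (the patent does not say how the merged data is rewritten), and `MergedStore` is a hypothetical name.

```python
from typing import Dict, Tuple


class MergedStore:
    """Hypothetical merged file set whose index is updated on every change."""

    def __init__(self) -> None:
        self.data = bytearray()                      # merged file contents
        self.index: Dict[str, Tuple[int, int]] = {}  # name -> (offset, length)

    def add(self, name: str, content: bytes) -> None:
        """Append the content and record where it landed."""
        self.index[name] = (len(self.data), len(content))
        self.data += content

    def modify(self, name: str, content: bytes) -> None:
        """Append the new version and repoint the index entry to it."""
        if name not in self.index:
            raise KeyError(name)
        self.add(name, content)

    def delete(self, name: str) -> None:
        """Drop the index entry; the old bytes remain until compaction."""
        del self.index[name]

    def read(self, name: str) -> bytes:
        offset, length = self.index[name]
        return bytes(self.data[offset:offset + length])
```

In this sketch the index is the only structure that must stay consistent with each operation, which mirrors the role the text gives to the idx file.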
As an implementation, the above files may be bill files or files of other types.
Implementing the file organization method of the embodiment of the present invention shown in FIG. 1, which classifies the files, establishes a directory structure, merges the files under a storage node, and establishes an index, can solve the problem of storage space wasted by an excessive number of files.
Correspondingly, the modules and systems of the embodiments of the present invention are described below.
FIG. 3 is a schematic diagram of a file organization module according to an embodiment of the present invention. Referring to FIG. 3, the file organization module includes a classification sub-module 41, a directory creation sub-module 42, an obtaining sub-module 43, a merging sub-module 44, and an index establishing sub-module 45. The function of each sub-module is as follows:
the classification sub-module 41 is configured to classify files to obtain at least one file set, where each file set corresponds to one classification result, that is, it forms a file set of one or several classes;
the directory creation sub-module 42 is configured to establish a directory structure for the file sets obtained by the classification processing of the classification sub-module 41, where the directory files in the directory structure correspond to the file sets;
in a specific implementation, the algorithm used for the classification processing of the classification sub-module 41 or for the directory structure establishment of the directory creation sub-module 42 may be a hash algorithm;
获取子模块 43 ,用于获得所述分类子模块 41分类处理所得的各类文件集合; 合并子模块 44,用于将获取子模块 43获得的每一类文件集合中的所有文件 合并到一个存储节点下;  The obtaining sub-module 43 is configured to obtain various types of file sets obtained by the classification processing of the classification sub-module 41. The merging sub-module 44 is configured to merge all the files in each type of file set obtained by the obtaining sub-module 43 into one storage. Under the node;
索引建立子模块 45 , 建立用于检索所述存储节点下合并的文件的索引, 在 具体实现时, 索引组织可釆用 B+树的形式, 上述说明同样适用于本发明的其它 实施例。  The index establishing sub-module 45 establishes an index for retrieving the merged file under the storage node. In the specific implementation, the index organization may adopt the form of a B+ tree, and the above description is also applicable to other embodiments of the present invention.
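The division of work among sub-modules 41 to 45 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `classify`, `organize`, and `read` are hypothetical, MD5 stands in for the unspecified hash algorithm, one in-memory byte string stands in for each storage node, and a plain dictionary replaces the B+ tree index mentioned above.

```python
import hashlib
from collections import defaultdict


def classify(name, buckets=4):
    """Map a file name to a class number with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def organize(files, buckets=4):
    """Group files by class, merge each class into one blob, and index every file."""
    groups = defaultdict(list)
    for name in sorted(files):
        groups[classify(name, buckets)].append(name)

    nodes = {}  # class number -> merged bytes (stands in for one storage node)
    index = {}  # file name -> (class number, offset, length)
    for cls, names in groups.items():
        merged = bytearray()
        for name in names:
            data = files[name]
            index[name] = (cls, len(merged), len(data))
            merged += data
        nodes[cls] = bytes(merged)
    return nodes, index


def read(nodes, index, name):
    """Locate one file inside its merged blob through the index."""
    cls, offset, length = index[name]
    return nodes[cls][offset:offset + length]
```

With this layout the number of stored objects is bounded by the number of classes rather than by the number of files, which is the storage saving the description aims at.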
When the functional units above perform their respective functions, the resulting file storage structure may still be as shown in FIG. 2. As one implementation, all files in each class of file set merged by the merging sub-module 44 may be packed and compressed by a packing and compression sub-module.
In addition, the classification sub-module 41 and the directory establishing sub-module 42 are optional and may be used as the actual situation requires. As one implementation, when a file in the established file storage structure is to be added, deleted, or modified, an index maintenance sub-module may update the index (idx file) corresponding to that file according to the add, delete, or modify operation, so that the index is maintained. As one implementation, the files may be bill files or files of other types.
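The index maintenance described above might look like the following minimal sketch. The class name `MergedStore` and its append-only layout are assumptions made for illustration; the source only states that the index is updated when a file is added, deleted, or modified.

```python
class MergedStore:
    """Append-only merged blob plus an index of name -> (offset, length)."""

    def __init__(self):
        self.blob = bytearray()
        self.index = {}

    def add(self, name, data):
        """Append a new file and record where it starts and how long it is."""
        if name in self.index:
            raise KeyError(f"{name} already exists")
        self.index[name] = (len(self.blob), len(data))
        self.blob += data

    def modify(self, name, data):
        """Append the new version and re-point the index entry at it."""
        if name not in self.index:
            raise KeyError(name)
        self.index[name] = (len(self.blob), len(data))
        self.blob += data

    def delete(self, name):
        """Drop the index entry; the old bytes simply become unreachable."""
        del self.index[name]

    def read(self, name):
        offset, length = self.index[name]
        return bytes(self.blob[offset:offset + length])
```

In this sketch a modification never rewrites the merged blob in place, so only the index changes; reclaiming the space of deleted or superseded versions would need a separate compaction pass that is not shown.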
By implementing the file organization module of the embodiment shown in FIG. 3, in which different sub-modules classify files, establish a directory structure, merge files under one storage node, and build an index, the problem of storage space being wasted by an excessive number of files can be solved.
On the basis of the file storage structure constructed by the file organization module of the above embodiment, an embodiment of the present invention further provides a storage module that stores the file storage structure constructed as described above.
FIG. 4 is a schematic diagram of the file retrieval system according to an embodiment of the present invention. Referring to FIG. 4, the system mainly includes a storage module 51, a master control module 52, a file retrieval module 53, a file generation module 54, a file organization module 55, a file output module 56, and a bill presentation processing module 57. The functions of the modules are as follows:
The storage module 51 corresponds to at least one storage node and stores the index used to retrieve the bill files in the file set merged under the storage node, together with the file set itself. The storage structure of the bill files in the storage module 51 may still be as shown in FIG. 2 and is not described again here;
The receiving sub-module in the master control module 52 receives, from the bill presentation processing module 57, a read request to read a bill file in the file set merged under a storage node in the storage module 51. The read request may be a print request or a reprint request for the bill file;
The control sub-module in the master control module 52 determines whether the bill file corresponding to the read request is in the request queue. If so, it outputs first control information for controlling output of the bill file corresponding to the read request, the first control information carrying a request queue number. Otherwise, it outputs second control information for controlling the file retrieval module 53 to obtain, from the storage module 51, the index used to retrieve the bill files in the file set merged under the storage node, together with the file set. The second control information includes the read request and an index key value, so that the file retrieval module 53 can retrieve the index and the file set according to the index key value;
The file output module 56, according to the first control information, passes the bill file directly to the bill presentation processing module 57 as a file stream. The bill presentation processing module 57 may include a bill presentation processing program;
The file retrieval module 53, according to the second control information, obtains from the storage module 51 the index used to retrieve the bill files in the file set merged under the storage node, together with the file set, and passes the obtained index and file set to the file output module 56;
The file output module 56 reads in the file set and index passed from the file retrieval module 53, selects the corresponding bill file in the file set according to the index, and passes the bill file to the bill presentation processing module 57 as a file stream, whereupon the bill presentation processing module 57 triggers a print or reprint operation based on the bill file and completes bill presentation;
The control sub-module in the master control module 52, after receiving result information returned by the file retrieval module 53 indicating that the file retrieval module 53 did not obtain the index and the file set from the storage module 51, sends to the file generation module 54 third control information for controlling generation of the user's bill file;
The file generation module 54 generates the user's bill file according to the third control information of the master control module 52. The file generation module 54 may include a file generation program;
The file organization module 55, according to fourth control information of the master control module 52, organizes the bill file generated by the file generation module 54 and stores it in the storage module 51. That is, the file organization module 55 processes the generated bill file using the functions of the file organization module shown in FIG. 3 and sends the processed bill file to the storage module 51 for storage;
The file retrieval module 53 can then obtain the generated bill file from the storage module 51. Together with the index sent by the master control module 52, the file retrieval module 53 thereby obtains the index used to retrieve the bill files in the file set merged under the storage node, together with the file set, and passes them to the file output module 56, so that bill presentation is completed.
As one implementation, when the bill files exist in compressed form, that is, when each file set has been packed and compressed, the file output module 56 specifically includes:
A compressed stream processing module, which reads in the file set and index passed from the file retrieval module 53 (the bill files in the file set being in compressed form) and then decompresses the bill file in the file set that corresponds to the index;
An output module, which passes the bill file decompressed by the compressed stream processing module to the bill presentation processing module 57 as a file stream.
As one implementation, the file retrieval system may be applied in the bill presentation subsystem of a BOSS.
It is worth noting that the file generation module 54 and the file organization module 55 are optional and may be used as the actual situation requires. When the file organization module 55 is not used, the bill files may be organized in a fixed format.
By implementing the file retrieval system of the embodiment shown in FIG. 4, in which different modules receive a read request to read a file in the file set merged under a storage node and, according to the index used to retrieve the files in that file set, obtain and output the file corresponding to the read request, the problem of inefficient file locating can be solved. The master control and module scheduling mechanism further enables fast on-demand scheduling and improves user satisfaction.
FIG. 5 is a schematic diagram of the file retrieval method according to an embodiment of the present invention. The method is based on the file storage structure shown in FIG. 2. Referring to FIG. 5 in combination with the system shown in FIG. 4, the method includes:
301. The master control module receives, from the bill presentation processing module, a read request to read a bill file in the file set merged under a storage node. In a specific implementation, the read request may be, but is not limited to, a print request or a reprint request for the bill file;
302. The master control module determines whether the read request is in an existing request queue. If so, it notifies the file output module of the request queue number directly, and the file output module passes the bill file to the bill presentation processing module as a file stream to complete bill presentation. Otherwise, step 303 is performed;
It should be noted that the request queue in this embodiment caches recently used bill files so that they can be dispatched quickly to the bill presentation processing module. Bill files that remain unused in the request queue for a long time may be cleared periodically. In a specific implementation, a time limit may be set, and a bill file is cleared once that limit is exceeded, in order to save request queue resources;
303. According to the read request, the master control module controls the file retrieval module to obtain, from the storage module, the index used to retrieve the bill files in the file set merged under the storage node, together with the file set. In a specific implementation, the control information that the master control module generates for the file retrieval module includes the read request and an index key value, so that the file retrieval module can retrieve the index and the file set according to the index key value;
304. The file retrieval module passes the obtained index and file set to the file output module;
305. The file output module reads in the file set and index passed from the file retrieval module and selects the corresponding bill file in the file set according to the index;
306. The file output module passes the bill file to the bill presentation processing module as a file stream, whereupon the bill presentation processing module can trigger a print or reprint operation on the bill file and complete bill presentation.
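Steps 301 to 306, including the request queue with timed clearing, can be illustrated with the short sketch below. All names (`RequestQueue`, `handle_read`, `ttl`) are hypothetical, the index is reduced to a dictionary of offsets, and filling the queue after a successful lookup is an assumption about how recently used bill files end up cached.

```python
import time


class RequestQueue:
    """Cache of recently served bill files; entries expire after ttl seconds."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # key -> (data, time of last use)

    def get(self, key):
        self.purge()
        hit = self.entries.get(key)
        if hit is None:
            return None
        data, _ = hit
        self.entries[key] = (data, self.clock())  # refresh the last-use time
        return data

    def put(self, key, data):
        self.entries[key] = (data, self.clock())

    def purge(self):
        """Clear entries that have not been used within the time limit."""
        now = self.clock()
        stale = [k for k, (_, used) in self.entries.items() if now - used > self.ttl]
        for k in stale:
            del self.entries[k]


def handle_read(key, queue, index, file_set):
    """Serve a read request from the queue if possible, else through the index."""
    cached = queue.get(key)
    if cached is not None:              # step 302: request already queued
        return cached
    offset, length = index[key]         # steps 303 to 305: locate via the index
    data = file_set[offset:offset + length]
    queue.put(key, data)                # assumption: cache for later reprints
    return data                         # step 306: hand over for presentation
```

The clock is injectable only so that expiry can be exercised without waiting; a real deployment would more likely run the purge from a timer.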
As one implementation, when the file retrieval module does not obtain the index and the file set from the storage module, the method further includes:
307. The master control module receives result information returned by the file retrieval module, the result information indicating that the file retrieval module did not obtain the index and the file set from the storage module;
308. The master control module controls the file generation module to generate the user's bill file;
309. At the same time, the master control module controls the file organization module to organize the generated bill file as shown in FIG. 1;
310. The master control module sends the index corresponding to the generated bill file to the file retrieval module;
311. The file generation module sends the generated bill file of the user to the file organization module, and the file organization module processes the generated bill file according to the flow of the file organization method shown in FIG. 1;
312. The file organization module sends the processed bill file to the storage module for storage.
After step 312, the file retrieval module can obtain the generated bill file from the storage module. Together with the index sent by the master control module in step 310, the file retrieval module can then perform the corresponding function of step 304, and bill presentation is completed.
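The fallback path of steps 307 to 312 amounts to "generate on a miss, store, then retrieve as usual". A minimal sketch follows, assuming a hypothetical `generate` callback in place of the file generation module and a single growing `bytearray` in place of the storage module; the separate control messages between modules are collapsed into one function.

```python
def fetch_or_generate(user, store, index, generate):
    """Return the user's bill, generating and storing it first on an index miss."""
    if user not in index:                       # step 307: retrieval reported a miss
        data = generate(user)                   # step 308: generate the bill file
        index[user] = (len(store), len(data))   # steps 309 to 310: organize and index
        store.extend(data)                      # step 312: persist in the store
    offset, length = index[user]
    return bytes(store[offset:offset + length])
```

Because the generated file is stored and indexed before it is returned, a later request for the same user follows the normal retrieval path and the generator is not called again.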
As one implementation, when the bill files exist in compressed form, that is, when each file set has been packed and compressed, step 305 is specifically:
The file output module reads in the file set and index passed from the file retrieval module (the bill files in the file set being in compressed form), decompresses the bill file in the file set that corresponds to the index, and then performs step 306 on the decompressed bill file. To improve efficiency, only the bill file that corresponds to the index may be decompressed here, rather than the whole file set.
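One way to make such partial decompression possible is to compress each bill file as an independent member and let the index record where each compressed member sits. The sketch below uses Python's `zlib` for this; the per-member layout and the names `pack` and `extract` are assumptions, since the source does not specify the archive format.

```python
import zlib


def pack(files):
    """Compress each file separately and concatenate the compressed members."""
    blob = bytearray()
    index = {}  # file name -> (offset, length) of the compressed member
    for name, data in files.items():
        packed = zlib.compress(data)
        index[name] = (len(blob), len(packed))
        blob += packed
    return bytes(blob), index


def extract(blob, index, name):
    """Decompress only the member that the index points to."""
    offset, length = index[name]
    return zlib.decompress(blob[offset:offset + length])
```

Compressing members independently usually gives a somewhat worse ratio than compressing the whole set as one stream, but it keeps the cost of serving one bill proportional to that bill's size.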
By implementing the file retrieval method of the embodiment shown in FIG. 5, which receives a read request to read a file in the file set merged under a storage node and, according to the index used to retrieve the files in that file set, obtains and outputs the file corresponding to the read request, the problem of inefficient file locating can be solved. The master control and module scheduling mechanism further enables fast on-demand scheduling and improves user satisfaction.
An embodiment of the present invention also provides a computer-readable storage medium storing a plurality of executable instructions, the executable instructions being used to:
obtain at least one file set;
merge all files in each file set under one storage node;
build an index for retrieving the files merged under the storage node.
In addition, a person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are specific implementations of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and refinements without departing from the principles of the present invention, and such improvements and refinements are also regarded as falling within the protection scope of the present invention.

Claims

1. A file organization method, characterized by comprising:
obtaining at least one file set;
merging all files in each file set under one storage node;
building an index for retrieving the files merged under the storage node.
2. The file organization method according to claim 1, characterized in that the method further comprises:
classifying files to obtain the at least one file set, each file set corresponding to one classification result.
3. The file organization method according to claim 1, characterized in that the method further comprises:
establishing a directory structure for the at least one file set, the directory file in each directory structure corresponding to each file set.
4. The file organization method according to claim 2 or 3, characterized in that a hash algorithm is used to perform the classification or to establish the directory structure.
5. The file organization method according to any one of claims 1 to 3, characterized in that the method further comprises:
packing and compressing the merged files.
6. The file organization method according to any one of claims 1 to 3, characterized in that the method further comprises:
updating the index corresponding to a file according to an add, delete, or modify operation on the file.
7. A file retrieval method, characterized by comprising:
receiving a read request to read a file in a file set merged under a storage node;
obtaining, according to the read request, an index for retrieving the files in the file set merged under the storage node, together with the file set;
outputting the file in the file set that corresponds to the index.
8. The file retrieval method according to claim 7, characterized in that the method further comprises:
determining whether the file corresponding to the read request is in a request queue, and if so, outputting the file in the request queue that corresponds to the read request.
9. The file retrieval method according to claim 7, characterized in that the method further comprises:
when the index and the file set are not obtained, generating the file and outputting the index corresponding to the generated file;
organizing and storing the generated file, and obtaining the stored file.
10. The file retrieval method according to claim 7, characterized in that, when the file exists in compressed form, outputting the file in the file set that corresponds to the index specifically comprises:
decompressing the file in the file set that corresponds to the index;
outputting the decompressed file.
11. The file retrieval method according to any one of claims 7 to 9, characterized in that the file is a bill file.
12. A file organization module, characterized by comprising:
an acquisition sub-module, configured to obtain at least one file set;
a merging sub-module, configured to merge all files in each file set under one storage node;
an index establishing sub-module, configured to build an index for retrieving the files merged under the storage node.
13. The file organization module according to claim 12, characterized by further comprising:
14. The file organization module according to claim 12, characterized by further comprising:
a directory establishing sub-module, configured to establish a directory structure for the at least one file set, the directory file in each directory structure corresponding to each file set.
15. A file retrieval system, characterized by comprising:
a storage module, corresponding to at least one storage node and configured to store an index for retrieving the files in the file set merged under the storage node, together with the file set;
a master control module, configured to receive a read request to read a file in the file set merged under the storage node and to output corresponding control information according to the read request;
a file retrieval module, configured to obtain from the storage module, according to the control information of the master control module, the index for retrieving the files in the file set merged under the storage node, together with the file set, and to output the index and the file set;
a file output module, configured to output the file in the file set that corresponds to the index.
16. The file retrieval system according to claim 15, characterized in that the master control module comprises:
a receiving sub-module, which receives the read request;
a control sub-module, which determines whether the file corresponding to the read request is in a request queue and, if so, outputs first control information for controlling output of the file in the request queue that corresponds to the read request, and otherwise outputs second control information for controlling the file retrieval module to obtain from the storage module the index for retrieving the files in the file set merged under the storage node, together with the file set.
17. The file retrieval system according to claim 15, characterized in that the system further comprises:
a file generation module, which generates the file according to third control information of the master control module when the file retrieval module does not obtain the index and the file set;
a file organization module, which organizes the file generated by the file generation module and stores it in the storage module.
18. The file retrieval system according to claim 15, characterized in that, when the file exists in compressed form, the file output module comprises:
a compressed stream processing module, configured to decompress the file in the file set that corresponds to the index;
an output module, configured to output the decompressed file.
19. The file retrieval system according to any one of claims 15 to 17, characterized in that the file is a bill file and the system is applied in the bill presentation subsystem of a business operation support system.
20. A computer-readable storage medium storing a plurality of executable instructions, characterized in that the executable instructions are used to:
obtain at least one file set;
merge all files in each file set under one storage node;
build an index for retrieving the files merged under the storage node.
PCT/CN2008/071908 2008-02-01 2008-08-07 Method for organizing and retrieving files, module and system for organizing files and storage media thereof WO2009097710A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2008100262348A CN101226546B (en) 2008-02-01 2008-02-01 Method for searching document
CN200810026234.8 2008-02-01

Publications (1)

Publication Number Publication Date
WO2009097710A1 true WO2009097710A1 (en) 2009-08-13

Family

ID=39858541

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2008/071908 WO2009097710A1 (en) 2008-02-01 2008-08-07 Method for organizing and retrieving files, module and system for organizing files and storage media thereof

Country Status (2)

Country Link
CN (1) CN101226546B (en)
WO (1) WO2009097710A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226546B (en) * 2008-02-01 2011-12-21 华为技术有限公司 Method for searching document
CN102148859A (en) * 2011-01-20 2011-08-10 南京烽火星空通信发展有限公司 Network file service method
CN102880677B (en) * 2012-09-11 2016-04-13 珠海金山网络游戏科技有限公司 A kind of packing of the file based on Hash and read method
CN103853791A (en) * 2012-12-07 2014-06-11 腾讯科技(深圳)有限公司 Implementation method and device for quick file retrieving
CN104978330A (en) * 2014-04-04 2015-10-14 西南大学 Data storage method and device
CN107092604B (en) * 2016-02-18 2020-03-20 中国移动通信集团河北有限公司 File processing method and device
CN106547911B (en) * 2016-11-25 2020-07-10 长城计算机软件与系统有限公司 Access method and system for massive small files
CN108549545A (en) * 2018-04-20 2018-09-18 武汉极意网络科技有限公司 A kind of project organization method and system based on tornado frames

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5729743A (en) * 1995-11-17 1998-03-17 Deltatech Research, Inc. Computer apparatus and method for merging system deltas
CN1113304C (en) * 1998-09-18 2003-07-02 英业达股份有限公司 Method for merging files
CN101226546A (en) * 2008-02-01 2008-07-23 华为技术有限公司 Method for organizing and searching document

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100454308C (en) * 2006-08-30 2009-01-21 华为技术有限公司 Method of file distributing and searching and its system

Also Published As

Publication number Publication date
CN101226546A (en) 2008-07-23
CN101226546B (en) 2011-12-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08783901

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08783901

Country of ref document: EP

Kind code of ref document: A1