WO2011074942A1 - System and method of converting data from a multiple table structure into an edoc format - Google Patents

Info

Publication number
WO2011074942A1
Authority
WO
WIPO (PCT)
Prior art keywords
edoc
document
documenttype
fields
character string
Prior art date
Application number
PCT/MY2010/000323
Other languages
French (fr)
Inventor
Kim Seng Kee
Huei Huang Aries Ang
Meing Jye Kelvin Sim
Original Assignee
Emanual System Sdn Bhd
Priority date
Filing date
Publication date
Application filed by Emanual System Sdn Bhd
Publication of WO2011074942A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Definitions

  • the present invention relates to a system and method of converting from a multiple table structure into an eDoc format.
  • PCT MY08/00017 discloses a method of data storage and management.
  • data obtained from a physical document is converted into an electronic document (eDoc) format and appended into an identified destination file in the eDoc format.
  • An eDoc is therefore a document comprising data values in an eDoc format.
  • the destination file eDoc is then stored away in the computer system for future retrieval.
  • the disclosed method is account-centric in that electronic documents of the same account are grouped and stored in the same location (destination file). For example, data obtained from a physical invoice of a particular customer is converted into an eDoc format and appended into a destination file used for storing all invoices of the particular customer.
  • the data is kept as a whole in a single eDoc.
  • This single eDoc may be stored as an entry in a single RDBMS (Relational Database Management System) table. While typical RDBMS tables contain predefined column widths for each field of the data, the method disclosed in PCT MY08/00017 uses only a single table (if it is stored in an RDBMS at all). Column widths in this single table are defined as arbitrary lengths of variable character strings.
  • Embedded within the data are special characters, which serve as field separators when the eDoc is extracted from the table.
  • the data eDoc obtained from the table can then be parsed using a separate dictionary such that each field of data can be identified using the special characters.
  • an eDoc is an extremely flexible storage tool, and can be configured in many ways to suit the user, corporation or industry. It is not fixed in a rigid structure as compared with traditional multiple table based database systems, but can be modified accordingly. In summary, an eDoc comprises a logical relational grouping of data values that can enable users to access and process information more efficiently and effectively.
  • RDBMS data values are already stored in multiple traditional RDBMS tables.
  • the multiple table RDBMS structure does not allow efficient retrieval of information, especially in a complex database. It further requires data normalisation where the data is structured and organised into tables, to eliminate redundancy and to maintain data integrity. As such, techniques of data mining will have to be programmed to obtain specific information comprising data from different tables, from the database.
  • example embodiments of the present invention seek to provide a system and method for converting from a multiple table structure into an eDoc format.
  • a method of converting data from a multi table structure into an eDoc format comprising the steps of: creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, creating one or more Rowtypes defining respective first single character strings; mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; extracting data values from the mapped fields of the multiple tables; storing the extracted data values in the corresponding fields of each first single character string; combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
  • the method may further comprise creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
  • the Rowtype dictionary may define one or more of a group consisting of data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
  • the method may further comprise creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
  • Defining the Documenttype may comprise defining a document ID for uniquely identifying each eDoc document.
  • the document ID may comprise one or more data values of fields of the multiple tables.
  • Extracting the data values from the mapped fields of the multiple tables may comprise generating a list of potential eDoc documents based on the document IDs.
  • Extracting the data values from the mapped fields of the multiple tables may further comprise constructing a query to retrieve the data values from the mapped fields of the multiple tables.
  • One or more Rowtypes may be shared between different Documenttypes.
  • One or more Rowtypes may have multiple occurrences in a single Documenttype.
  • the method may further comprise storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
  • a system for converting data from a multi table structure into an eDoc format comprising: means for creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, means for creating one or more Rowtypes defining respective first single character strings; means for mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; means for extracting data values from the mapped fields of the multiple tables; means for storing the extracted data values in the corresponding fields of each first single character string; means for combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
  • the system may further comprise means for creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
  • the Rowtype dictionary may define one or more of a group consisting of data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
  • the system may further comprise means for creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
  • Means for defining the Documenttype may comprise means for defining a document ID for uniquely identifying each eDoc document.
  • the document ID may comprise one or more data values of fields of the multiple tables.
  • Means for extracting the data values from the mapped fields of the multiple tables may comprise means for generating a list of potential eDoc documents based on the document IDs.
  • Means for extracting the data values from the mapped fields of the multiple tables may further comprise means for constructing a query to retrieve the data values from the mapped fields of the multiple tables.
  • One or more Rowtypes may be shared between different Documenttypes.
  • One or more Rowtypes may have multiple occurrences in a single Documenttype.
  • the system may further comprise means for storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
  • a computer readable data storage medium having stored thereon computer code means for instructing a computer to execute a method of converting data from a multi table structure into an eDoc format; the method comprising the steps of: creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, creating one or more Rowtypes defining respective first single character strings; mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; extracting data values from the mapped fields of the multiple tables; storing the extracted data values in the corresponding fields of each first single character string; combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
  • Figure 1 illustrates the storage of data obtained from a form in a typical multi-table RDBMS database and its corresponding storage in an eDoc format in an example embodiment.
  • Figure 2 illustrates the steps of a method for converting data in a traditional RDBMS into an eDoc format in an example embodiment.
  • Figure 3 illustrates a user display of a conversion tool in an example embodiment.
  • Figure 4 is a flow chart illustrating a method of defining the transformation rules in an example embodiment.
  • Figure 5 illustrates the eDoc syntax represented in EBNF in an example embodiment.
  • Figure 6a shows an example of a form representing a data entry for a database.
  • Figure 6b shows the corresponding storage of the data entry in a traditional table based RDBMS.
  • Figure 7a shows an example embodiment of the same data entry represented in an eDoc document.
  • Figures 7b, 7c and 7d show example embodiments of associated Rowtype dictionaries for interpreting Rowtypes in the eDoc format.
  • Figure 7e shows an example embodiment of an associated Documenttype dictionary for interpreting Documenttypes in the eDoc format.
  • Figure 8 illustrates the storage of multiple eDocs into a single DBMS table in an example embodiment.
  • Figure 9 illustrates the storage of multiple eDocs in a single file in an example embodiment.
  • Figure 10 is a flow chart illustrating a method of transforming data values stored across multiple tables of a DBMS into an eDoc format in an example embodiment.
  • Figure 11 is a schematic diagram of the method and system of the example embodiment implemented on a computer system.
  • Figure 12 is a schematic diagram of the method and system of the example embodiment implemented on a wireless device.
  • the present specification also discloses apparatus for performing the operations of the methods.
  • Such apparatus may be specially constructed for the required purposes, or may comprise a general purpose computer or other device selectively activated or reconfigured by a computer program stored in the computer.
  • the algorithms and displays presented herein are not inherently related to any particular computer or other apparatus.
  • Various general purpose machines may be used with programs in accordance with the teachings herein.
  • the construction of more specialized apparatus to perform the required method steps may be appropriate.
  • the structure of a conventional general purpose computer will appear from the description below.
  • the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code.
  • the computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the disclosure contained herein.
  • the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
  • Such a computer program may be stored on any computer readable medium.
  • the computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a general purpose computer.
  • the computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system.
  • the computer program when loaded and executed on such a general-purpose computer effectively results in an apparatus that implements the steps of the preferred method.
  • Figure 1 shows an example of how the data values 112-138 obtained from an actual form 102 are typically stored as multiple tables 144, 146, 148 in a table based database management system 142 (such as RDBMS) and its corresponding storage in the eDoc document 150 in an eDoc format.
  • data values are split into relevant and logically relational information, and stored in multiple separated tables linked by a key or index.
  • the eDoc format combines and stores the data in a single document / destination, using one or more variable character strings.
  • different data fields are delineated by unique characters termed delineators.
  • Different types of variable character strings are defined by "Rowtypes".
  • Each eDoc document is a combination of one or more of the variable character strings, delineated by unique characters termed delineators.
  • the data values 112-138 obtained from the form 102 are broken up into e.g. the three tables 144, 146 and 148 in a typical database management system such as RDBMS.
  • the data values 112-120 are identified to contain personal data and are stored in a first RDBMS table 144.
  • Data values 122-130 are identified to contain employment details and are stored in a second RDBMS table 146.
  • Data values 132-138 are identified to contain banking references and stored in a third RDBMS table 148.
  • transformation rules which transform the data in the tables 144, 146 and 148 into an eDoc document 150 are implemented in an example embodiment.
  • the first data table 144 is a personal data table, storing data values for the fields/attributes of Name 112, ID number 114, Date of birth 116, Home Address 118 and Home Telephone 120.
  • the second data table 146 is an employment details table, storing data values for the fields/attributes of Name of Employment 122, Office Telephone 124, Office Address 126, Nature of Business 128 and Designation 130.
  • the third data table 148 is a banking reference table, storing data values for the fields/attributes of Bank Name 132, Bank Account Number 134, Type of Account 136, and Date Opened 138.
  • the eDoc document 150 comprises a definitions header 158 for defining the Documenttype, document date, etc. Further, the Rowtype defined variable character strings 152, 154 and 156 which store data from the tables 144, 146 and 148 respectively may be grouped under a particular section 160 within the eDoc document 150.
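A minimal sketch of how a document of this shape could be assembled from Rowtype strings. The delineator characters (`|` between fields, `~` between Rowtype strings), the one-token header and the sample values are assumptions for illustration only; the actual eDoc syntax is the one given in Figure 5, which is not reproduced in this text.

```python
FIELD_SEP = "|"   # assumed delineator between fields
ROW_SEP = "~"     # assumed delineator between Rowtype strings

def build_row(rowtype: str, values: list[str]) -> str:
    """Join a Rowtype code and its field values into one variable character string.

    Assumes the values themselves never contain the delineator characters.
    """
    return FIELD_SEP.join([rowtype, *values])

def build_edoc(documenttype: str, rows: list[str]) -> str:
    """Combine a (simplified) header and the Rowtype strings into one eDoc string."""
    header = build_row(documenttype, [])
    return ROW_SEP.join([header, *rows])

personal = build_row("R101", ["Jane Doe", "ID-0001", "1980-01-31"])
employment = build_row("R102", ["Acme Ltd", "555-0100"])
banking = build_row("R103", ["Example Bank", "000-111"])
edoc = build_edoc("D100", [personal, employment, banking])
```

The result is one character string in which the three Rowtype strings follow the header, mirroring the grouping of strings 152, 154 and 156 under the header 158.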
  • Figure 2 illustrates the steps of a method 200 for converting data in a traditional RDBMS 142 ( Figure 1) into an eDoc format 150 in an example embodiment.
  • Documenttypes and associated Rowtypes are designed.
  • Transformation Rules are defined.
  • a list of document identifiers based on the Transformation Rules is retrieved.
  • SQL statements are constructed to retrieve data from the traditional RDBMS 142 ( Figure 1) based on the Transformation Rules.
  • the data is extracted from the traditional RDBMS 142 ( Figure 1) using the constructed SQL statements.
  • the extracted data is converted into eDoc documents e.g. 213, and a Rowtype dictionary is constructed based on the Transformation Rules.
  • the eDoc documents e.g. 213 and the corresponding Rowtype dictionary are stored in a repository.
  • the eDoc documents 213 may be stored in a RDBMS 215, or alternatively as a single file.
  • data that was stored in a traditional RDBMS 142 ( Figure 1) in separate tables 144, 146, 148 is now consolidated into single eDoc documents.
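The conversion steps above can be sketched end to end with an in-memory SQLite database. This is an illustrative sketch, not the patented tool: the table and column names, the `person_id` linking key, the `RULES` layout and the delineators `|` and `~` are all assumptions.

```python
import sqlite3

FIELD_SEP, ROW_SEP = "|", "~"  # assumed delineators

# Design and transformation rules: Documenttype D100 has three Rowtypes,
# each mapped to a (hypothetical) source table and its columns.
RULES = {
    "R101": ("personal", ["name", "id_no"]),
    "R102": ("employment", ["employer", "designation"]),
    "R103": ("banking", ["bank", "account_no"]),
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE personal   (person_id INTEGER, name TEXT, id_no TEXT);
    CREATE TABLE employment (person_id INTEGER, employer TEXT, designation TEXT);
    CREATE TABLE banking    (person_id INTEGER, bank TEXT, account_no TEXT);
    INSERT INTO personal   VALUES (1, 'Jane Doe', 'ID-0001');
    INSERT INTO employment VALUES (1, 'Acme Ltd', 'Engineer');
    INSERT INTO banking    VALUES (1, 'Example Bank', '000-111');
""")

def convert(conn, documenttype, rules, key):
    """Query each mapped table and combine the rows into one eDoc string."""
    rows = [documenttype]
    for rowtype, (table, columns) in rules.items():
        # Table and column names come from the trusted rules, never from user input.
        sql = f"SELECT {', '.join(columns)} FROM {table} WHERE person_id = ?"
        for record in conn.execute(sql, (key,)):
            rows.append(FIELD_SEP.join([rowtype, *map(str, record)]))
    return ROW_SEP.join(rows)

# Retrieve the list of document identifiers, then convert one document per identifier.
doc_ids = [r[0] for r in conn.execute("SELECT person_id FROM personal ORDER BY person_id")]
edocs = [convert(conn, "D100", RULES, doc_id) for doc_id in doc_ids]
```

Each entry of `edocs` consolidates what was spread over three tables into a single string, which can then be stored as one row or one line.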
  • the defining of the Documenttype and Rowtypes involves defining the constituents of an eDoc document for the storage of the data contained in the form 102 ( Figure 1).
  • a Documenttype is identified and given a name.
  • the Documenttype is a document template or a document design for a particular type of document. It does not contain any data but is a template.
  • the eDoc document 150 represented in Figure 1 may be used for storing personal credit card details and is assigned a Documenttype of D100.
  • Rowtypes of the particular Documenttype e.g. D100 are defined.
  • a Rowtype represents related data in the Documenttype, grouped together to form a set. Fields or Attributes in each Rowtype are also defined.
  • the eDoc document 150 represented in Figure 1 has three Rowtype defined variable character strings 152, 154 and 156.
  • a typical RDBMS may have thousands of fields spread into multiple tables.
  • a consultant may perform either one of the following:
  • A. Obtain an original/source document (be it a GUI interface, a printed form etc) where the data is obtained from, and based on these documents, create the appropriate Documenttype and Rowtype. For example, suppose the RDBMS data is a result of input from the form 102 as illustrated in Figure 1. The consultant may then identify the document as one that stores personal credit card data and assign "D100" as its Documenttype. Rowtypes of e.g. R101, R102 and R103, to respectively store the sets of personal data values 112-120, employment details data values 122-130, and banking references data values 132-138 are also designed.
  • the RDBMS data may be analysed to produce the Documenttype design. This can be useful if the actual form 102 from which the data values were obtained is not available and the first option is not possible.
  • the consultant may design Documenttypes by analyzing the data in the existing RDBMS tables to group related data together based on their business processes/practices.
  • a user display of a conversion tool 300 in an example embodiment is shown.
  • the user may be prompted (field 303 of a frame 304 of the conversion tool 300) to enter a 4 character Documenttype code e.g. D100 for the new Documenttype and click on the 'Add New Document' button.
  • the user is prompted (field 307) to enter a 4 character Rowtype code e.g. R101, R102 or R103, select the Documenttype to add the created Rowtype to (dropdown field 309) and click the 'Add New Rowtype' button 311.
  • Rowtype(s) associated with the created Documenttype are added.
  • the same Rowtype e.g. R101 may be applicable for different data.
  • a Rowtype e.g. R101 may be designed with the fields of "Name of Firm", "Office Telephone", "Office Address", "Nature of Business", etc.
  • the Rowtype R101 may thus be used for storing employment details of an individual, and can be included as a Rowtype in a Documenttype e.g. D101 for storing personal details of an individual.
  • Rowtype R101 may also be used to store the former employment details (or work experience) of the individual and may therefore be reused in the same Documenttype D101, as a separate instance.
  • a different Documenttype e.g.
  • Rowtype R101 originally designed for D101 may be also used in the different Documenttype D102.
  • multiple instances of the same Rowtype in a Documenttype are catered for.
  • Rowtypes can also be reused by different Documenttypes.
  • a Rowtype can be reused for mapping.
  • the reused Rowtype is hereafter referred to as a new "instance" of the same Rowtype.
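One way to picture Rowtype reuse is a small data model in which a Documenttype holds a list of Rowtype instances. The class names, the `add` method and the instance numbering below are hypothetical; the source only states that a Rowtype may occur several times in one Documenttype and may be shared between Documenttypes.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Rowtype:
    code: str
    fields: tuple[str, ...]

@dataclass
class Documenttype:
    code: str
    # Each entry is (Rowtype, instance number); a Rowtype may appear more than once.
    rows: list[tuple[Rowtype, int]] = field(default_factory=list)

    def add(self, rowtype: Rowtype) -> int:
        """Attach a new instance of rowtype and return its 1-based instance number."""
        instance = 1 + sum(1 for rt, _ in self.rows if rt.code == rowtype.code)
        self.rows.append((rowtype, instance))
        return instance

r101 = Rowtype("R101", ("Name of Firm", "Office Telephone", "Office Address", "Nature of Business"))
d101 = Documenttype("D101")
d102 = Documenttype("D102")
current = d101.add(r101)   # current employment details
former = d101.add(r101)    # former employment, a second instance in D101
shared = d102.add(r101)    # the same Rowtype reused by another Documenttype
```

Keeping the Rowtype definition separate from its instances means a single field definition serves every place the Rowtype is used.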
  • the consultant also selects the Documenttype, Rowtype and Field onto which extracted data may be mapped using dropdown tools 334, 336 and 338 respectively.
  • By clicking the "Map" button 340, the new "instance" of the Rowtype for storing data from the mapped fields is created.
  • the consultant also defines which fields can serve as unique document identifiers (DocID) of the Documenttype.
  • the data values in the Name, NRIC and Bank Account Number fields 112, 114, 134 are, in combination, unique and can serve as a good document identifier for the Documenttype D100.
  • a portion of the conversion tool 300 that allows a user to select the DocID of the Documenttype is provided in frame 320.
  • Dropdown list 322 allows the consultant to first select the table where the DocID field may be selected from.
  • Next, dropdown list 324 is populated with the fields available in the selected table, and the consultant then selects the field to be used as a DocID field.
  • a list of presently selected DocID fields is shown in the display window 328.
  • the next step involves the defining of transformation rules 204.
  • This step 204 involves several sub-steps.
  • the sub steps for defining the transformation rules in an example embodiment are illustrated in Figure 4.
  • the database connection parameters are defined. Information such as the type of RDBMS involved, database server address, port number, username and password to access the database is input by the user. Once the information is provided by the user, a list of tables in the relevant RDBMS can be made available for the subsequent sub-steps that follow.
  • the portion of the conversion tool 300 for defining the database connection parameters is provided in frame 302.
  • each field in the RDBMS is linked/mapped to a Rowtype field in a Documenttype.
  • a portion of the conversion tool 300 that allows a user to select a table 308 using a first drop down list of tables identified from the RDBMS database is provided in frame 306.
  • a field 310 is also selected from the selected table 308 from a second drop down list as the source.
  • a previously defined Documenttype 312 e.g. D100 is selected from a third drop down list.
  • a Rowtype 314 of the selected Documenttype 312 e.g. D100 from a last drop down list is selected as the destination.
  • the DocID is a minimum set of parameters to uniquely identify individual documents of a particular Documenttype. That is, a list of potential eDoc documents based on the retrieved DocIDs for each Documenttype is generated from the RDBMS tables. This list 207 of DocIDs serves as a unique document identifier compilation such that the conversion tool can extract data on a per eDoc document basis. For example, for the Documenttype D100 at numeral 150 in Figure 1, the DocID is based on the source data values of the fields of Name 112, NRIC 114 and Bank Account Number 134, as extracted from the relevant tables 144, 146, 148 of the RDBMS 142.
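As a sketch of how such a DocID list might be generated, the query below collects every distinct combination of Name, NRIC and Bank Account Number. The schema, the `person_id` join key and the sample rows are assumptions; the source does not show the actual table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE personal (person_id INTEGER, name TEXT, nric TEXT);
    CREATE TABLE banking  (person_id INTEGER, account_no TEXT);
    INSERT INTO personal VALUES (1, 'Jane Doe', 'N-001'), (2, 'John Roe', 'N-002');
    INSERT INTO banking  VALUES (1, 'A-10'), (1, 'A-11'), (2, 'B-20');
""")

# Each distinct (Name, NRIC, Bank Account Number) combination is one potential eDoc.
doc_ids = conn.execute("""
    SELECT DISTINCT p.name, p.nric, b.account_no
    FROM personal AS p JOIN banking AS b ON b.person_id = p.person_id
    ORDER BY p.name, b.account_no
""").fetchall()
```

Here a person with two bank accounts yields two DocIDs, so two separate eDoc documents would be extracted for that person.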
  • SQL statements are constructed to retrieve data for the respective eDoc documents based on the Transformation Rules defined in step 204.
  • the list 207 of unique DocIDs together with the previously defined transformation rules can be used to construct SQL statements for querying the RDBMS for data extraction purposes as will be appreciated by a person skilled in the art.
  • the conversion process can be configured to construct SQL statements to cater for parallel data extraction for multiple eDocs to improve efficiency rather than a sequential querying per eDoc document.
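A possible shape for the statement construction is sketched here: the rules are grouped by source table and one parameterised `SELECT` is produced per table, covering a batch of DocIDs instead of one query per document. The rule tuple layout, `build_queries` and the `person_id` key column are hypothetical names introduced for the example.

```python
from collections import defaultdict

# Hypothetical transformation rules:
# (source table, source field, Documenttype, Rowtype, Rowtype field).
RULES = [
    ("personal", "name", "D100", "R101", "Name"),
    ("personal", "id_no", "D100", "R101", "ID number"),
    ("banking", "account_no", "D100", "R103", "Bank Account Number"),
]

def build_queries(rules, key_column, batch_size):
    """Return one parameterised SELECT per source table for a batch of DocIDs."""
    columns = defaultdict(list)
    for table, column, _doc, _row, _field in rules:
        columns[table].append(column)
    placeholders = ", ".join("?" * batch_size)
    # Table and column names come from the trusted rules; only values are bound.
    return {
        table: f"SELECT {key_column}, {', '.join(cols)} FROM {table} "
               f"WHERE {key_column} IN ({placeholders})"
        for table, cols in columns.items()
    }

queries = build_queries(RULES, key_column="person_id", batch_size=3)
```

Selecting the key column alongside the mapped fields lets the caller regroup the returned rows by document after a batched query.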
  • the data for the eDoc documents is extracted from the RDBMS using the constructed SQL statements.
  • the retrieved SQL results are then converted into an eDoc format and a Rowtype dictionary is constructed based on the Transformation Rules.
  • an eDoc document e.g. 213 is constructed based on the Documenttype definition.
  • Each data value extracted from the RDBMS is stored in the corresponding ('mapped') field of a variable character string of the type defined by the relevant Rowtype in the Documenttype definition.
  • sets of Rowtype defined variable character strings of the Documenttype can be constructed and combined to form the eDoc document e.g. 213.
  • the data values are extracted as a character string, regardless of its data type, so that it can be combined into a Rowtype defined variable character string.
  • data of the "date" data type is not extracted in the "date" data type format, but as a character string.
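A small sketch of this type-agnostic extraction: every value, whatever its database type, is rendered as text before it is placed in a Rowtype string. The ISO date layout and the empty string for NULL are assumptions; the source does not state which textual form is used.

```python
from datetime import date
from decimal import Decimal

def to_field(value) -> str:
    """Render a database value as a plain character string; NULL becomes empty."""
    if value is None:
        return ""
    if isinstance(value, date):
        return value.isoformat()  # assumed date layout
    return str(value)

record = ("Example Bank", 1234567, date(2009, 12, 18), Decimal("10.50"), None)
fields = [to_field(v) for v in record]
```

The type information dropped here is not lost, since it can be recorded separately in the Rowtype dictionary.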
  • the eDoc document may also contain metadata in a reserved system Rowtype RiD0.
  • the system Rowtype RiD0 may contain data such as the document name, the document creation date, the size and other system related information.
  • the Rowtype dictionary is also created at this stage (step 212).
  • the dictionary contains information for specifying the definition for each data value in the created eDoc document. This information is obtained from the definitions input by the user for transforming fields in the RDBMS tables into Rowtype fields in step 204. It will be appreciated that the eDoc document contains only data values delineated by special characters.
  • the Rowtype dictionary can then be used to parse the eDoc document, so that the value of each field or attribute is identified.
  • the Rowtype dictionary provides definitions for the data type, name and arrangement of each field in the Rowtype defined variable character string.
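The role of the dictionary can be illustrated with a short parser. The type codes "X" (character string), "9" (number) and "D" (date) are the ones this description gives for Figure 7b; the `|` delineator, the list-of-pairs dictionary layout and the ISO date form are assumptions made for the sketch.

```python
from datetime import date

FIELD_SEP = "|"  # assumed delineator

# Hypothetical dictionary for Rowtype R101: field order, name and type code.
R101_DICTIONARY = [("Name", "X"), ("ID number", "9"), ("Date of birth", "D")]

CONVERTERS = {"X": str, "9": int, "D": date.fromisoformat}

def parse_row(row: str, dictionary) -> dict:
    """Split one Rowtype string and label each value using the dictionary."""
    rowtype, *values = row.split(FIELD_SEP)
    if len(values) != len(dictionary):
        raise ValueError(f"{rowtype}: expected {len(dictionary)} fields, got {len(values)}")
    return {name: CONVERTERS[kind](value)
            for (name, kind), value in zip(dictionary, values)}

parsed = parse_row("R101|Jane Doe|19800131|1980-01-31", R101_DICTIONARY)
```

Because the string itself carries only values, changing a field name or type means editing the dictionary rather than rewriting stored documents.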
  • a Documenttype dictionary may also be created to include display information for displaying the eDoc document through a GUI. These may include parameters such as specific X,Y coordinate positioning information, and may be provided by the consultant when the eDoc GUI display is created.
  • the eDoc document e.g. 213 and corresponding Rowtype and Documenttype dictionaries are stored in an RDBMS 215, or alternatively as a single file.
  • data that was stored in a traditional RDBMS 142 ( Figure 1) in separate tables 144, 146, 148 is now consolidated into single eDoc documents.
  • the final step 214 can for example involve storing of the Rowtype and Documenttype dictionaries and eDoc documents by the eLedger Filing System (eLFS) into the storage system.
  • step 214 may involve storing the eDoc documents as a single file and, similarly, the corresponding Rowtype dictionaries as another single file. That is, as the eDoc document is a single data entry of e.g. a form, and comprises a string of characters, it can be stored as a single line of a single e.g. text file as an alternative to storing the eDoc document as a row in an RDBMS table. Similarly, the Rowtype and Documenttype dictionaries can also be respectively stored as a line in a text file as an alternative to storing as a row in an RDBMS table.
  • the eDoc documents and Rowtype dictionaries are stored separately as respective single files, or respective tables. It will be appreciated that while this is the current implementation, it is possible for the eDoc documents and Rowtype and Documenttype dictionaries to be stored together as a single file or table.
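Storing one eDoc per line of a text file could look like the sketch below. The file name, the UTF-8 encoding and the sample strings are assumptions, and the helper names are hypothetical.

```python
import os
import tempfile

def append_edocs(path: str, edocs: list[str]) -> None:
    """Append each eDoc string as one line of the file."""
    with open(path, "a", encoding="utf-8") as handle:
        for edoc in edocs:
            if "\n" in edoc:
                raise ValueError("an eDoc must fit on a single line")
            handle.write(edoc + "\n")

def read_edocs(path: str) -> list[str]:
    """Read the file back, one eDoc per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "edocs.txt")
    append_edocs(path, ["D100~R101|Jane Doe", "D100~R101|John Roe"])
    stored = read_edocs(path)
```

The same two helpers would serve for a separate dictionary file, since a dictionary is likewise described as storable one entry per line.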
  • Figure 6a shows an example of a form 602 representing a data entry for a database.
  • Figure 6b shows the corresponding storage of the data entry in a traditional table based RDBMS 604.
  • the RDBMS 604 comprises four different tables 606, 608, 610 and 612.
  • Figure 7a shows an example embodiment of the same data entry represented in an eDoc document 702.
  • the eDoc document comprises a Documenttype with header DOD0 710, data stored in Rowtype defined variable character strings with headers RiD0 712, RPD0 714, REM0 716, REM0 718 and RBK0 720.
  • the consultant may use the same Rowtype more than once.
  • data in the Employment Details Table 608 and Experience Table 610 have similar fields and may therefore use similar Rowtype definitions.
  • the consultant may have defined the Documenttype such that the created eDoc document has two instances of the same Rowtype.
  • the REM0 Rowtype (headers 716, 718) is used twice in that eDoc document.
  • a "sequence number" field 722, 724 is added.
  • the system Rowtype i.e. the variable character string with header RiD0 712, comprising system information described earlier is also shown.
  • Figures 7b, 7c and 7d show example embodiments of the associated Rowtype dictionaries in the eDoc format.
  • the Rowtype dictionaries 704, 706 and 708 respectively allow the data values in each Rowtype field of the eDoc document 702 to be interpreted.
  • Figure 7b provides Rowtype definitions for interpreting the Rowtype RPD0 714 of the eDoc document 702 ( Figure 7a).
  • Figure 7c provides Rowtype definitions for interpreting the Rowtype REM0 716 of the eDoc document 702 ( Figure 7a).
  • Figure 7d provides Rowtype definitions for interpreting the Rowtype RBK0 720 of the eDoc document 702 ( Figure 7a).
  • each Rowtype dictionary comprises:
  • Each RdC0 line in numeral 732 provides a definition of the "fields" that are present in the particular Rowtype, RPD0.
  • the RdC0s 732 further comprise special identification and checking fields.
  • the first 8 RdC0 lines 732a which provide for system related fields, are:
  • Numeral 734 shows the data type of the field represented by the RdC0 line, where "X" represents a character string, "9" represents a number data type, and "D" represents a Date data type.
  • Numeral 736 shows the field names of the data values stored in the eDoc document 702 for the Rowtype RPD0 714.
  • the last 4 RdC0 lines 732c, which provide for other system related fields such as document size, checksum, etc., in the Rowtype dictionary 704 are:
  • Based on the above description for the Rowtype dictionary 704, it will be appreciated by a person skilled in the art that the Rowtype dictionaries 706 and 708 for identifying the data values in the eDoc document 702 for the Rowtypes with headers REM0 716 and RBK0 720 can similarly be interpreted.
  • a Documenttype dictionary may also be created.
  • Figure 7e shows an example embodiment of a Documenttype dictionary 709.
  • the Documenttype dictionary 709 contains display information for use by e.g. Graphical User Interface (GUI) applications to recreate and display the eDoc data of a particular Documenttype e.g. DOD0 710 of Figure 7a.
  • the Rowtype dictionaries e.g. 704, 706 and 708 of Figures 7b, 7c and 7d respectively and a partial Documenttype dictionary are generated based on the transformation rules.
  • the Documenttype dictionary can be updated to include e.g. display information such as those represented in numerals 742, 744 and 746 using e.g. a design tool, which allows the display of the eDoc document in a desired format.
  • Display information for each Rowtype is stored in blocks, e.g. 740.
  • a block 740 ( Figure 7e) in the Documenttype dictionary 709 stores the display information for the Rowtype with header RPD0 714 ( Figure 7a).
  • the display information includes formatting information 742 such as the data type 744 of each data field, and X-Y coordinate position information 746 of each field when displayed in a GUI application.
  • the block 740 may contain a label 748 for identifying the field being defined by the particular line, improving the readability of the block 740 to the consultant for debugging purposes. It will be appreciated that the lines of the Documenttype dictionary 709 illustrated in Figure 7e are truncated for reproduction purposes and the Figure is therefore not a complete representation of a Documenttype dictionary.
  • Figure 8 illustrates the storage of multiple (two) eDocs into a single DBMS table 800 in an example embodiment.
  • Figure 9 illustrates the storage of multiple (two) eDocs in a single file 900 in an example embodiment.
  • Figures 8 and 9 also include the eDoc 702 of Figure 7 as eDoc 802 and 902 respectively.
  • An additional Rowtype RiFO has also been added.
  • eLFS disclosed in PCT/MY08/00017
  • an eDoc is first being filed to the transaction ledger before it is sent for filing in the master ledger.
  • the purpose of RiFO is to provide a cross-reference trace for the eDoc between these 2 ledgers.
  • Each RiFO has two sets of sequence numbers and status:
  • the RiFO rowtype defined variable character string is created. Both the current running sequence number of the ledger and the filing status (success, fail, pending, suspended) are then inserted to the Tseq# and Tstatus fields in the RiFO defined variable character string respectively.
  • the "new" eDoc with an attached RiFO defined variable character string is then sent to the master ledger for filing (storage).
  • the master ledger the same process is carried out where the sequence number and filing status at the master ledger is inserted into the same RiFO defined variable character string created at the transaction as separate Mseq# and Mstatus fields.
  • each eDoc document is represented as a character string of multiple lines or rows. It will be appreciated that the use of multiple lines or rows in the figures is for illustrative purposes only. In example embodiments, each eDoc takes up only a single line in the text file or a single row in the DBMS table.
  • Figure 10 shows a flowchart illustrating a method 1000 for converting data from a multiple table structure into an eDoc format. At step 1002, one or more Documenttypes defining respective types of eDoc documents are created.
  • For each Documenttype, at step 1004, one or more Rowtypes defining respective first single character strings are created. At step 1006, fields in the multiple tables are mapped to fields of first single character strings defined by respective Rowtypes. At step 1008, data values are extracted from the mapped fields of the multiple tables. At step 1010, extracted data values are stored in the corresponding fields of each first single character string. At step 1012, the first character strings are combined into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
  • the method and system of the example embodiment can be implemented on a computer system 1100, schematically shown in Figure 11. It may be implemented as software, such as a computer program being executed within the computer system 1100, and instructing the computer system 1100 to conduct the method of the example embodiment.
  • the computer system 1100 comprises a computer module 1102, input modules such as a keyboard 1104 and mouse 1106 and a plurality of output devices such as a display 1108, and printer 1110.
  • the computer module 1102 is connected to a computer network 1112 via a suitable transceiver device 1114, to enable access to e.g. the Internet or other network systems such as Local Area Network (LAN) or Wide Area Network (WAN).
  • the computer module 1102 in the example includes a processor 1118, a Random Access Memory (RAM) 1120 and a Read Only Memory (ROM) 1122.
  • the computer module 1102 also includes a number of Input/Output (I/O) interfaces, for example I/O interface 1124 to the display 1108, and I/O interface 1126 to the keyboard 1104.
  • the components of the computer module 1102 typically communicate via an interconnected bus 1128 and in a manner known to the person skilled in the relevant art.
  • the application program is typically supplied to the user of the computer system
  • a data storage medium such as a CD-ROM or flash memory carrier and read utilising a corresponding data storage medium drive of a data storage device 1130.
  • the application program is read and controlled in its execution by the processor 1118.
  • intermediate storage of program data may be accomplished using RAM 1120.
  • the method of the current arrangement can be implemented on a wireless device 1200, schematically shown in Figure 12. It may be implemented as software, such as a computer program being executed within the wireless device 1200, and instructing the wireless device 1200 to conduct the method.
  • the wireless device 1200 comprises a processor module 1202, an input module such as a keypad 1204 and an output module such as a display 1206.
  • the processor module 1202 is connected to a wireless network 1208 via a suitable transceiver device 1210, to enable wireless communication and/or access to e.g. the Internet or other network systems such as Local Area Network (LAN), Wireless Personal Area Network (WPAN) or Wide Area Network (WAN).
  • the processor module 1202 in the example includes a processor 1212, a Random Access Memory (RAM) 1214 and a Read Only Memory (ROM) 1216.
  • the processor module 1202 also includes a number of Input/Output (I/O) interfaces, for example I/O interface 1218 to the display 1206, and I/O interface 1220 to the keypad 1204.
  • the components of the processor module 1202 typically communicate via an interconnected bus 1222 and in a manner known to the person skilled in the relevant art.
  • the application program is typically supplied to the user of the wireless device 1200 encoded on a data storage medium such as a flash memory module or memory card/stick and read utilising a corresponding memory reader-writer of a data storage device 1224.
  • the application program is read and controlled in its execution by the processor 1212. Intermediate storage of program data may be accomplished using RAM 1214.

Abstract

A system and method of converting data from a multi table structure into an eDoc format. The method comprising the steps of creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, creating one or more Rowtypes defining respective first single character strings; mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; extracting data values from the mapped fields of the multiple tables; storing the extracted data values in the corresponding fields of each first single character string; combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.

Description

System and Method of converting data from a
multiple table structure into an eDoc format
FIELD OF INVENTION
The present invention relates to a system and method of converting from a multiple table structure into an eDoc format.
BACKGROUND
PCT MY08/00017 discloses a method of data storage and management. In this method, data obtained from a physical document is converted into an electronic document (eDoc) format and appended into an identified destination file in the eDoc format. An eDoc is therefore a document comprising data values in an eDoc format. The destination file eDoc is then stored away in the computer system for future retrieval. The disclosed method is account-centric in that electronic documents of the same account are grouped and stored in the same location (destination file). For example, data obtained from a physical invoice of a particular customer is converted into an eDoc format and appended into a destination file used for storing all invoices of the particular customer.
In other words, instead of having to break the data obtained from a physical document for storage into different pre-defined tables, with each table further comprising different fields of the data, the data is kept as a whole in a single eDoc. This single eDoc may be stored as an entry in a single RDBMS (Relational Database Management System) table. While typical RDBMS tables contain predefined column widths for each field of the data, the method disclosed in PCT MY08/00017 uses only a single table (if it is stored in an RDBMS at all). Column widths in this single table are defined as arbitrary lengths of variable character strings. Embedded within the data (in eDoc format) are special characters, which serve as field separators when the eDoc is extracted from the table. The eDoc data obtained from the table can then be parsed using a separate dictionary such that each field of data can be identified using the special characters.
Therefore, the eDoc is an extremely flexible storage tool, and can be configured in many ways to suit the user, corporation or industry. It is not fixed in a rigid structure as compared with traditional multiple table based database systems, but can be modified accordingly. In summary, an eDoc comprises a logical relational grouping of data values that can enable users to access and process information more efficiently and effectively. However, as many of today's existing systems utilise RDBMS, data values are already stored in multiple traditional RDBMS tables. The multiple table RDBMS structure does not allow efficient retrieval of information, especially in a complex database. It further requires data normalisation where the data is structured and organised into tables, to eliminate redundancy and to maintain data integrity. As such, techniques of data mining will have to be programmed to obtain specific information comprising data from different tables, from the database.
Therefore, example embodiments of the present invention seek to provide a system and method for converting from a multiple table structure into an eDoc format.
SUMMARY
In accordance with a first aspect of the present invention, there is provided a method of converting data from a multi table structure into an eDoc format; the method comprising the steps of: creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, creating one or more Rowtypes defining respective first single character strings; mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; extracting data values from the mapped fields of the multiple tables; storing the extracted data values in the corresponding fields of each first single character string; combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
The method may further comprise creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
The Rowtype dictionary may define one or more of a group consisting of data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
The method may further comprise creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
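As a minimal sketch of how such a Rowtype dictionary might be used, the Python fragment below interprets one Rowtype defined character string from an ordered list of field definitions. The data type codes "X", "9" and "D" follow the codes described for the Rowtype dictionary of Figure 7b. The "|" field delineator, the YYYYMMDD date layout, the class and function names, and the sample field names and values are assumptions made for illustration only and are not taken from the specification.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Assumption: "|" separates fields; the specification does not name the delineator.
FIELD_SEP = "|"


@dataclass(frozen=True)
class FieldDef:
    """One dictionary entry: field name plus data type code."""
    name: str
    data_type: str  # "X" = character string, "9" = number, "D" = date


def convert(raw, data_type):
    """Convert a raw field value according to its data type code."""
    if data_type == "9":
        return int(raw)
    if data_type == "D":
        # Assumption: dates are stored as YYYYMMDD.
        return datetime.strptime(raw, "%Y%m%d").date()
    return raw  # "X": keep as a character string


def interpret_row(row, dictionary):
    """Split a Rowtype string and name each value using the dictionary order."""
    values = row.split(FIELD_SEP)
    if len(values) != len(dictionary):
        raise ValueError("row does not match the Rowtype dictionary")
    return {d.name: convert(v, d.data_type) for d, v in zip(dictionary, values)}


# Hypothetical dictionary for a personal data Rowtype.
PERSONAL = [
    FieldDef("name", "X"),
    FieldDef("id_number", "X"),
    FieldDef("date_of_birth", "D"),
    FieldDef("dependants", "9"),
]

record = interpret_row("Tan Ah Kow|S1234567A|19800131|2", PERSONAL)
```

The order of the entries in the list plays the role of the "arrangement" of the fields, so the same string can only be read correctly with the dictionary that matches its Rowtype.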
Defining the Documenttype may comprise defining a document ID for uniquely identifying each eDoc document. The document ID may comprise one or more data values of fields of the multiple tables.
Extracting the data values from the mapped fields of the multiple tables may comprise generating a list of potential eDoc documents based on the document IDs.
Extracting the data values from the mapped fields of the multiple tables may further comprise constructing a query to retrieve the data values from the mapped fields of the multiple tables. One or more Rowtypes may be shared between different Documenttypes.
One or more Rowtypes may have multiple occurrences in a single Documenttype. The method may further comprise storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
In accordance with a second aspect of the present invention, there is provided a system for converting data from a multi table structure into an eDoc format; the system comprising: means for creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, means for creating one or more Rowtypes defining respective first single character strings; means for mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; means for extracting data values from the mapped fields of the multiple tables; means for storing the extracted data values in the corresponding fields of each first single character string; means for combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
The system may further comprise means for creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
The Rowtype dictionary may define one or more of a group consisting data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
The system may further comprise means for creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
Means for defining the Documenttype may comprise means for defining a document ID for uniquely identifying each eDoc document. The document ID may comprise one or more data values of fields of the multiple tables.
Means for extracting the data values from the mapped fields of the multiple tables may comprise means for generating a list of potential eDoc documents based on the document IDs.
Means for extracting the data values from the mapped fields of the multiple tables may further comprise means for constructing a query to retrieve the data values from the mapped fields of the multiple tables. One or more Rowtypes may be shared between different Documenttypes.
One or more Rowtypes may have multiple occurrences in a single Documenttype.
The system may further comprise means for storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
In accordance with a third aspect of the present invention, there is provided a computer readable data storage medium having stored thereon computer code means for instructing a computer to execute a method of converting data from a multi table structure into an eDoc format; the method comprising the steps of: creating one or more Documenttypes defining respective types of eDoc documents; for each Documenttype, creating one or more Rowtypes defining respective first single character strings; mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes; extracting data values from the mapped fields of the multiple tables; storing the extracted data values in the corresponding fields of each first single character string; combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will be better understood and readily apparent to one of ordinary skill in the art from the following written description, by way of example only, and in conjunction with the drawings, in which:
Figure 1 illustrates the storage of data obtained from a form in a typical multi-table RDBMS database and its corresponding storage in an eDoc format in an example embodiment.
Figure 2 illustrates the steps of a method for converting data in a traditional RDBMS into an eDoc format in an example embodiment.
Figure 3 illustrates a user display of a conversion tool in an example embodiment.
Figure 4 is a flow chart illustrating a method of defining the transformation rules in an example embodiment.
Figure 5 illustrates the eDoc syntax represented in EBNF in an example embodiment.
Figure 6a shows an example of a form representing a data entry for a database.
Figure 6b shows the corresponding storage of the data entry in a traditional table based RDBMS.
Figure 7a shows an example embodiment of the same data entry represented in an eDoc document.
Figures 7b, 7c and 7d show example embodiments of associated Rowtype dictionaries for interpreting Rowtypes in the eDoc format.
Figure 7e shows an example embodiment of an associated Documenttype dictionary for interpreting Documenttypes in the eDoc format.
Figure 8 illustrates the storage of multiple eDocs into a single DBMS table in an example embodiment.
Figure 9 illustrates the storage of multiple eDocs in a single file in an example embodiment.
Figure 10 is a flow chart illustrating a method of transforming data values stored across multiple tables of a DBMS into an eDoc format in an example embodiment.
Figure 11 is a schematic diagram of the method and system of the example embodiment implemented on a computer system.
Figure 12 is a schematic diagram of the method and system of the example embodiment implemented on a wireless device.
DETAILED DESCRIPTION
Some portions of the description which follows are explicitly or implicitly presented in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to convey most effectively the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities, such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
Unless specifically stated otherwise, and as apparent from the following, it will be appreciated that throughout the present specification, discussions utilizing terms such as "extracting", "combining", "storing", "obtaining", "converting", "processing", "generating", "initializing", "formatting", or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
The present specification also discloses apparatus for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a general purpose computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of more specialized apparatus to perform the required method steps may be appropriate. The structure of a conventional general purpose computer will appear from the description below.
In addition, the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the disclosure contained herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
Furthermore, one or more of the steps of the computer program may be performed in parallel rather than sequentially. Such a computer program may be stored on any computer readable medium. The computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a general purpose computer. The computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system. The computer program when loaded and executed on such a general-purpose computer effectively results in an apparatus that implements the steps of the preferred method.
Figure 1 shows an example of how the data values 112-138 obtained from an actual form 102 are typically stored as multiple tables 144, 146, 148 in a table based database management system 142 (such as RDBMS) and its corresponding storage in the eDoc document 150 in an eDoc format. In typical database management systems, data values are split into relevant and logically relational information, and stored in multiple separated tables linked by a key or index. Instead of splitting data into multiple tables, the eDoc format combines and stores the data in a single document / destination, using one or more variable character strings. In each variable character string, different data fields are delineated by unique characters termed delineators. Different types of variable character strings are defined by "Rowtypes". Each eDoc document is a combination of one or more of the variable character strings, delineated by unique characters termed delineators. Thus, a single eDoc document is also in the form of a variable character string.
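The following Python fragment is a minimal sketch of this layering, in which fields are joined into Rowtype defined character strings and those strings are in turn joined into one eDoc character string. The delineator characters "|" and "~", the function names and the sample values are assumptions for illustration only; the actual eDoc syntax is the one given in EBNF in Figure 5. The Rowtype codes R101, R102 and R103 are the example codes used later in this description.

```python
# Assumed delineators; the specification does not state the actual characters.
FIELD_SEP = "|"  # separates fields inside one Rowtype string
ROW_SEP = "~"    # separates Rowtype strings inside one eDoc


def build_row(rowtype, values):
    """Join a Rowtype code and its field values into one character string."""
    for value in values:
        if FIELD_SEP in value or ROW_SEP in value:
            raise ValueError("field value contains a delineator character")
    return FIELD_SEP.join([rowtype, *values])


def build_edoc(rows):
    """Combine Rowtype strings into a single eDoc character string."""
    return ROW_SEP.join(rows)


def split_edoc(edoc):
    """Recover the rows and their fields from an eDoc character string."""
    return [row.split(FIELD_SEP) for row in edoc.split(ROW_SEP)]


# Hypothetical values for personal data, employment details and banking references.
edoc = build_edoc([
    build_row("R101", ["Tan Ah Kow", "S1234567A"]),
    build_row("R102", ["Acme Trading", "Manager"]),
    build_row("R103", ["First Bank", "001-234567"]),
])
```

Because both levels are plain string joins, the whole document remains one variable character string that fits in a single line of a file or a single column value of a table row.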
As illustrated in Figure 1, the data values 112-138 obtained from the form 102 are broken up into e.g. the three tables 144, 146 and 148 in a typical database management system such as RDBMS. The data values 112-120 are identified to contain personal data and are stored in a first RDBMS table 144. Data values 122-130 are identified to contain employment details and are stored in a second RDBMS table 146. Data values 132-138 are identified to contain banking references and stored in a third RDBMS table 148. In order to obtain the corresponding document in eDoc format representing the same data, transformation rules which transform the data in the tables 144, 146 and 148 into an eDoc document 150 are implemented in an example embodiment.
In the example embodiment, the first data table 144 is a personal data table, storing data values for the fields/attributes of Name 112, ID number 114, Date of Birth 116, Home Address 118 and Home Telephone 120. The second data table 146 is an employment details table, storing data values for the fields/attributes of Name of Employment 122, Office Telephone 124, Office Address 126, Nature of Business 128 and Designation 130. The third data table 148 is a banking reference table, storing data values for the fields/attributes of Bank Name 132, Bank Account Number 134, Type of Account 136, and Date Opened 138.
The eDoc document 150 comprises a definitions header 158 for defining the Documenttype, document date, etc. Further, the Rowtype defined variable character strings 152, 154 and 156 which store data from the tables 144, 146 and 148 respectively may be grouped under a particular section 160 within the eDoc document 150.
Figure 2 illustrates the steps of a method 200 for converting data in a traditional RDBMS 142 (Figure 1) into an eDoc format 150 in an example embodiment. At step 202, Documenttypes and associated Rowtypes are designed. At step 204, Transformation Rules are defined. At step 206, a list of document identifiers based on the Transformation Rules is retrieved. At step 208, SQL statements are constructed to retrieve data from the traditional RDBMS 142 (Figure 1) based on the Transformation Rules. At step 210, the data is extracted from the traditional RDBMS 142 (Figure 1) using the constructed SQL statements. At step 212, the extracted data is converted into eDoc documents e.g. 213, and a Rowtype dictionary is constructed based on the Transformation Rules. Finally, at step 214, the eDoc documents e.g. 213 and the corresponding Rowtype dictionary are stored in a repository. The eDoc documents 213 may be stored in an RDBMS 215, or alternatively as a single file. Thus, in the example embodiment, data that was stored in a traditional RDBMS 142 (Figure 1) in separate tables 144, 146, 148 is now consolidated into single eDoc documents.
Each of the steps will now be described in further detail. The defining of the Documenttype and Rowtypes (step 202), involves defining the constituents of an eDoc document for the storage of the data contained in the form 102 (Figure 1). Firstly, a Documenttype is identified and given a name. The Documenttype is a document template or a document design for a particular type of document. It does not contain any data but is a template. For example, the eDoc document 150 represented in Figure 1 may be used for storing personal credit card details and is assigned a Documenttype of D100. Next, Rowtypes of the particular Documenttype e.g. D100 are defined. A Rowtype represents related data in the Documenttype, grouped together to form a set. Fields or Attributes in each Rowtype are also defined. For example, the eDoc document 150 represented in Figure 1 has three Rowtype defined variable character strings 152, 154 and 156.
A typical RDBMS may have thousands of fields spread into multiple tables. In order to convert these RDBMS tables into eDoc documents, a consultant may perform either one of the following:
A. Obtain an original/source document (be it a GUI interface, a printed form etc.) where the data is obtained from, and based on these documents, create the appropriate Documenttype and Rowtype. For example, suppose the RDBMS data is a result of input from the form 102 as illustrated in Figure 1. The consultant may then identify the document as one that stores personal credit card data and assign "D100" as its Documenttype. Rowtypes of e.g. R101, R102 and R103, to respectively store the sets of personal data values 112-120, employment details data values 122-130, and banking references data values 132-138 are also designed.
B. Alternatively, the RDBMS data may be analysed to produce the Documenttype design. This can be useful if the actual form 102 from which the data values were obtained is not available and the first option is not possible. In such a scenario, the consultant may design Documenttypes by analyzing the data in the existing RDBMS tables to group related data together based on their business processes/practices.
With reference to Figure 3, a user display of a conversion tool 300 in an example embodiment is shown. In the example embodiment, in order to create a new Documenttype, the user may be prompted (field 303 of a frame 304 of the conversion tool 300) to enter a 4 character Documenttype code e.g. D100 for the new Documenttype and click on the 'Add New Document' button. To create a new Rowtype for a Documenttype, the user is prompted (field 307) to enter a 4 character Rowtype code e.g. R101, R102 or R103, select the Documenttype to add the created Rowtype to (dropdown field 309) and click the 'Add New Rowtype' button 311. Thus, for each created Documenttype, Rowtype(s) associated with the created Documenttype are added.
In some scenarios, the same Rowtype e.g. R101 may be applicable for different data. For example, a Rowtype e.g. R101 may be designed with the fields of "Name of Firm", "Office Telephone", "Office Address", "Nature of Business", etc. The Rowtype R101 may thus be used for storing employment details of an individual, and can be included as a Rowtype in a Documenttype e.g. D101 for storing personal details of an individual. Rowtype R101 may also be used to store the former employment details (or work experience) of the individual and may therefore be reused in the same Documenttype D101, as a separate instance. Further, it will be appreciated that a different Documenttype e.g. D102, for storing customer details may require similar fields of "Name of Firm", "Office Telephone", "Office Address", "Nature of Business", as defined in R101. Thus, Rowtype R101, originally designed for D101, may also be used in the different Documenttype D102. In the current example embodiment of the conversion tool 300 illustrated in Figure 3, multiple instances of the same Rowtype in a Documenttype are catered for. Rowtypes can also be reused by different Documenttypes. In the example embodiment, a Rowtype can be reused for mapping. The reused Rowtype is hereafter referred to as a new "instance" of the same Rowtype. Using dropdown tools 330 and 332, the consultant selects the table and field from which data is extracted. The consultant also selects the Documenttype, Rowtype and Field onto which extracted data may be mapped using dropdown tools 334, 336 and 338 respectively. By clicking the "Map" button 340, the new "instance" of the Rowtype for storing data from the mapped fields is created.
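The reuse described above can be pictured with a small data model. The Python sketch below is illustrative only: the class names, the add_instance method and the instance labels are hypothetical and do not describe the conversion tool 300 itself. It shows one Rowtype definition, R101, appearing twice in Documenttype D101 and once in Documenttype D102.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rowtype:
    """A Rowtype template: a code and its ordered field names (no data)."""
    code: str
    fields: tuple


@dataclass
class Documenttype:
    """A Documenttype template holding labelled instances of Rowtypes."""
    code: str
    rows: list = field(default_factory=list)  # list of (label, Rowtype) pairs

    def add_instance(self, label, rowtype):
        """Attach a new instance of an existing Rowtype to this Documenttype."""
        self.rows.append((label, rowtype))


R101 = Rowtype(
    "R101",
    ("Name of Firm", "Office Telephone", "Office Address", "Nature of Business"),
)

# Two instances of R101 in the same Documenttype.
D101 = Documenttype("D101")
D101.add_instance("current_employment", R101)
D101.add_instance("former_employment", R101)

# The same Rowtype shared with a different Documenttype.
D102 = Documenttype("D102")
D102.add_instance("customer_firm", R101)
```

Keeping the Rowtype as a single shared definition means that a single Rowtype dictionary can serve every instance, wherever it appears.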
Further, for each Documenttype, e.g. D101, the consultant also defines which fields can serve as unique document identifiers (DocID) of the Documenttype. For example, in the embodiment illustrated in Figure 1, the data values in the Name, NRIC and Bank Account Number fields 112, 114, 134 are, in combination, unique and can serve as a good document identifier for the Documenttype D100. As illustrated in Figure 3, a portion of the conversion tool 300 that allows a user to select the DocID of the Documenttype is provided in frame 320. Dropdown list 322 allows the consultant to first select the table where the DocID field may be selected from. Next, dropdown list 324 is populated with fields available in the selected table, and the consultant then selects the field to be used as a DocID field. The user or consultant clicks on the 'Add Doc ID' button 326 to confirm the selection of the field as a DocID field. A list of presently selected DocID fields is shown in the display window 328.
Returning to Figure 2, after step 202 of defining the Documenttypes and Rowtypes, the next step involves the defining of transformation rules 204. This step 204 involves several sub-steps. The sub-steps for defining the transformation rules in an example embodiment are illustrated in Figure 4. At sub-step 402, the database connection parameters are defined. Information such as the type of RDBMS involved, database server address, port number, username and password to access the database are input by the user. Once the information is provided by the user, a list of tables in the relevant RDBMS can be made available for the subsequent sub-steps that follow. With reference to Figure 3, the portion of the conversion tool 300 for defining the database connection parameters is provided in frame 302.
Returning to Figure 4, at sub-step 404, the source data in the RDBMS tables are connected to the defined Documenttypes and Rowtypes. In this step 404, each field in the RDBMS is linked/mapped to a Rowtype field in a Documenttype. As illustrated in Figure 3, a portion of the conversion tool 300 that allows a user to select a table 308 using a first drop down list of tables identified from the RDBMS database is provided in frame 306. A field 310 is also selected from the selected table 308 from a second drop down list as the source. Next, a previously defined Documenttype 312, e.g. D100, is selected from a third drop down list. A Rowtype 314 of the selected Documenttype 312, e.g. D100, is selected from a last drop down list as the destination. The user or consultant clicks on the 'Add Field' button 316 to map the source RDBMS field to the destination Rowtype field. This process is repeated until all required fields from the RDBMS tables are mapped into corresponding Rowtypes for each Documenttype. It will be appreciated that not every field in the traditional RDBMS needs to be mapped, but only those that are deemed necessary by the user or consultant according to requirements.
Returning to Figure 2, at step 206, a list of document identifiers based on the Transformation Rules is retrieved. As described earlier, the DocID is a minimum set of parameters to uniquely identify individual documents of a particular Documenttype. That is, a list of potential eDoc documents based on the retrieved DocIDs for each Documenttype is generated from the RDBMS tables. This list 207 of DocIDs serves as a unique document identifier compilation such that the conversion tool can extract data on a per eDoc document basis. For example, for the Documenttype D100 at numeral 150 in Figure 1, the DocID is based on the source data values of the fields of Name 112, NRIC 114 and Bank Account Number 134, as extracted from the relevant tables 144, 146, 1 8 of the RDBMS 142.
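As a rough illustration of step 206, the following sketch retrieves a list of DocIDs from an in-memory SQLite database. The table names, column names, sample rows and the join on `nric` are all hypothetical; the specification does not state how the source tables are related, so the join is an assumption made only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (nric TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE bank   (nric TEXT, account_no TEXT);
    INSERT INTO person VALUES ('N001', 'Alice'), ('N002', 'Bob');
    INSERT INTO bank   VALUES ('N001', 'A-10'), ('N001', 'A-11'), ('N002', 'B-20');
""")

# DocID fields selected by the consultant, as (table, column) pairs.
docid_fields = [("person", "name"), ("person", "nric"), ("bank", "account_no")]

def list_doc_ids(connection, fields):
    """Return one tuple per potential eDoc document, ordered for repeatability."""
    columns = ", ".join(f"{table}.{column}" for table, column in fields)
    # Identifiers come from the consultant's own rules, not from end-user input.
    sql = (
        f"SELECT DISTINCT {columns} FROM person "
        "JOIN bank ON bank.nric = person.nric "
        f"ORDER BY {columns}"
    )
    return connection.execute(sql).fetchall()

doc_ids = list_doc_ids(conn, docid_fields)
```

Each tuple in `doc_ids` identifies one potential eDoc document; here Alice yields two documents because the combination including the account number differs.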
At step 208, SQL statements are constructed to retrieve data for the respective eDoc documents based on the Transformation Rules defined in step 204. The list 207 of unique DocIDs together with the previously defined transformation rules can be used to construct SQL statements for querying the RDBMS for data extraction purposes, as will be appreciated by a person skilled in the art. The conversion process can be configured to construct SQL statements to cater for parallel data extraction for multiple eDocs to improve efficiency, rather than sequential querying per eDoc document. At step 210, the data for the eDoc documents is extracted from the RDBMS using the constructed SQL statements.
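One possible way to construct such statements is sketched below. It is an assumption-laden illustration rather than the tool's actual query builder: the rule tuples, the single key column and the function name `build_queries` are hypothetical. One query per source table is built with an `IN (...)` list, so data for several eDocs is fetched in one round trip instead of one query per document.

```python
from collections import defaultdict

def build_queries(rules, key_column, doc_keys):
    """Build one parameterised SELECT per source table for a batch of documents.

    rules: iterable of (table, column, rowtype, field) mappings.
    """
    columns_by_table = defaultdict(list)
    for table, column, _rowtype, _field in rules:
        columns = columns_by_table[table]
        if column != key_column and column not in columns:
            columns.append(column)
    placeholders = ", ".join("?" for _ in doc_keys)
    queries = []
    for table, columns in columns_by_table.items():
        select_list = ", ".join([key_column] + columns)
        # Table and column names are taken from the trusted transformation rules;
        # only the document keys are passed as bound parameters.
        sql = (f"SELECT {select_list} FROM {table} "
               f"WHERE {key_column} IN ({placeholders})")
        queries.append((sql, tuple(doc_keys)))
    return queries

rules = [
    ("person", "name", "RPD0", "Name"),
    ("person", "nric", "RPD0", "NRIC"),
    ("bank", "account_no", "RBK0", "Bank Account Number"),
]
queries = build_queries(rules, "nric", ["N001", "N002"])
```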
At step 212, the retrieved SQL results are then converted into an eDoc format and a Rowtype dictionary is constructed based on the Transformation Rules. At this stage, the following information has been defined or obtained:
A. Documenttype and Rowtype definitions
B. Source (RDBMS) to destination (eDoc document) transformation information
C. Data required for eDoc document
In this step 212, an eDoc document e.g. 213 is constructed based on the Documenttype definition. Each data value extracted from the RDBMS is stored in the corresponding ('mapped') field of a variable character string of the type defined by the relevant Rowtype in the Documenttype definition. Based on the Transformation Rules, sets of Rowtype defined variable character strings of the Documenttype can be constructed and combined to form the eDoc document e.g. 213. Each data value is extracted as a character string, regardless of its data type, so that it can be combined into a Rowtype defined variable character string. For example, data of the "date" data type is not extracted in the "date" data type format, but as a character string. The eDoc document may also contain metadata in a reserved system Rowtype RiD0. The system Rowtype RiD0 may contain data such as the document name, the document creation date, the size and other system related information. An example embodiment of the eDoc document syntax represented in EBNF (Extended Backus-Naur Form) is shown in Figure 5.
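The construction of an eDoc document from Rowtype defined character strings might look roughly like the sketch below. The separator characters `|` and `~` are placeholders chosen for readability; the actual special characters are given by the syntax of Figure 5, which is not reproduced here. The function names and sample values are likewise hypothetical, and no escaping of separators inside values is attempted.

```python
from datetime import date

FIELD_SEP = "|"  # placeholder field delimiter (assumption)
ROW_SEP = "~"    # placeholder Rowtype-string delimiter (assumption)

def build_row(header, values):
    """Build one Rowtype defined character string; every value becomes text."""
    return FIELD_SEP.join([header] + [str(value) for value in values])

def build_edoc(documenttype_header, rows):
    """Combine the Rowtype strings into a single character string."""
    parts = [documenttype_header] + [build_row(h, v) for h, v in rows]
    return ROW_SEP.join(parts)

edoc = build_edoc("DOD0", [
    ("RPD0", ["Alice", "N001", date(1980, 1, 31)]),  # the date is stored as text
    ("RBK0", ["A-10", 1500]),                        # the number is stored as text
])
# edoc == "DOD0~RPD0|Alice|N001|1980-01-31~RBK0|A-10|1500"
```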
Returning to Figure 2, the Rowtype dictionary is also created at this stage (step 212). The dictionary contains information for specifying the definition for each data value in the created eDoc document. This information is obtained from the definitions input by the user for transforming fields in the RDBMS tables into Rowtype fields in step 204. It will be appreciated that the eDoc document contains only data values delineated by special characters. The Rowtype dictionary can then be used to parse the eDoc document, whereby the value for each field or attribute is identified. In the example embodiment, the Rowtype dictionary provides definitions for the data type, name and arrangement of each field in the Rowtype defined variable character string.
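To illustrate how a Rowtype dictionary lets a reader interpret an eDoc document that holds only delimited values, the following sketch pairs each value with the field name at the same position in the dictionary. The separators, the in-memory dictionary layout and the type labels are assumptions for illustration; the real dictionaries are themselves stored in the eDoc format.

```python
FIELD_SEP = "|"  # placeholder field delimiter (assumption)
ROW_SEP = "~"    # placeholder Rowtype-string delimiter (assumption)

# Hypothetical dictionary: for each Rowtype, the ordered (name, data type) of its fields.
rowtype_dictionary = {
    "RPD0": [("Name", "text"), ("NRIC", "text"), ("Date of Birth", "date")],
    "RBK0": [("Bank Account Number", "text"), ("Balance", "number")],
}

def parse_edoc(edoc, dictionary):
    """Split an eDoc string and label every value using the Rowtype dictionary."""
    documenttype_header, *row_strings = edoc.split(ROW_SEP)
    rows = []
    for row_string in row_strings:
        header, *values = row_string.split(FIELD_SEP)
        names = [name for name, _data_type in dictionary[header]]
        rows.append((header, dict(zip(names, values))))
    return documenttype_header, rows

header, rows = parse_edoc("DOD0~RPD0|Alice|N001|1980-01-31~RBK0|A-10|1500",
                          rowtype_dictionary)
```

The values are still plain text after parsing; converting them according to the recorded data type is a separate step.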
In addition, a Documenttype dictionary may also be created to include display information for displaying the eDoc document through a GUI. These may include parameters such as specific X,Y coordinate positioning information, and may be provided by the consultant when the eDoc GUI display is created.
Finally, at step 214, the eDoc document e.g. 213 and corresponding Rowtype and Documenttype dictionaries are stored in an RDBMS 215, or alternatively as a single file. Thus, in the example embodiment, data that was stored in a traditional RDBMS 142 (Figure 1) in separate tables 144, 146, 168 is now consolidated into single eDoc documents. The final step 214 can for example involve storing of the Rowtype and Documenttype dictionaries and eDoc documents by the eLedger Filing System (eLFS) into the storage system. A description of the details of the eLFS is provided in PCT MY08/00017 and is herein incorporated by reference.
In other example embodiments, step 214 may involve storing the eDoc documents as a single file and, similarly, the corresponding Rowtype dictionaries as another single file. That is, as the eDoc document is a single data entry of e.g. a form, and comprises a string of characters, it can be stored as a single line of a single file, e.g. a text file, as an alternative to storing the eDoc document as a row in an RDBMS table. Similarly, the Rowtype and Documenttype dictionaries can also be respectively stored as a line in a text file as an alternative to storing as a row in an RDBMS table.
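The two storage alternatives can be sketched as follows, using a temporary text file and an in-memory SQLite table. The file name, the table name `edoc_store` and the sample eDoc strings (with placeholder delimiters) are hypothetical.

```python
import os
import sqlite3
import tempfile

edocs = ["DOD0~RPD0|Alice|N001", "DOD0~RPD0|Bob|N002"]

# Alternative 1: one eDoc document per line of a single text file.
path = os.path.join(tempfile.mkdtemp(), "edocs.txt")
with open(path, "w", encoding="utf-8") as handle:
    for edoc in edocs:
        handle.write(edoc + "\n")
with open(path, encoding="utf-8") as handle:
    from_file = [line.rstrip("\n") for line in handle]

# Alternative 2: one eDoc document per row of a single table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edoc_store (edoc TEXT NOT NULL)")
conn.executemany("INSERT INTO edoc_store (edoc) VALUES (?)",
                 [(edoc,) for edoc in edocs])
from_table = [row[0] for row in
              conn.execute("SELECT edoc FROM edoc_store ORDER BY rowid")]
```

Either way, reading back one line or one row yields one complete eDoc document.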
In example embodiments, the eDoc documents and Rowtype dictionaries are stored separately as respective single files, or respective tables. It will be appreciated that while this is the current implementation, it is possible for the eDoc documents and Rowtype and Documenttype dictionaries to be stored together as a single file or table.
For illustrative purposes, Figure 6a shows an example of a form 602 representing a data entry for a database. Figure 6b shows the corresponding storage of the data entry in a traditional table based RDBMS 604. For this example, the RDBMS 604 comprises four different tables 606, 608, 610 and 612.
Figure 7a shows an example embodiment of the same data entry represented in an eDoc document 702. In Figure 7a, the eDoc document comprises a Documenttype with header DOD0 710, and data stored in Rowtype defined variable character strings with headers RiD0 712, RPD0 714, REM0 716, REM0 718 and RBK0 720. In defining the Rowtypes comprised within each Documenttype in e.g. steps 202 and 204 of Figure 2, the consultant may use the same Rowtype more than once. For example, with reference to Figure 6, data in the Employment Details Table 608 and Experience Table 610 have similar fields and may therefore use similar Rowtype definitions. In this regard, the consultant may have defined the Documenttype such that the created eDoc document has two instances of the same Rowtype. As illustrated in Figure 7a, the REM0 Rowtype (header 716, 718) is used twice in that eDoc document. In order to distinguish between the two instances, a "sequence number" field 722, 724 is added. The system Rowtype, i.e. the variable character string with header RiD0 712, comprising the system information described earlier, is also shown.
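A simple way to assign such sequence numbers is sketched below. The numbering scheme (starting at 1 and counting per Rowtype header, for every Rowtype) is an assumption; the specification only states that a sequence number field distinguishes the instances. The sample values are invented.

```python
from collections import Counter

def add_sequence_numbers(rows):
    """Tag each (header, values) pair with a per-Rowtype running number."""
    seen = Counter()
    numbered = []
    for header, values in rows:
        seen[header] += 1
        numbered.append((header, seen[header], values))
    return numbered

numbered_rows = add_sequence_numbers([
    ("RPD0", ["Alice", "N001"]),
    ("REM0", ["Acme Trading", "Retail"]),      # current employment
    ("REM0", ["Old Firm Co", "Logistics"]),    # former employment
    ("RBK0", ["A-10"]),
])
```

The two REM0 instances receive sequence numbers 1 and 2, so they remain distinguishable even though they share one Rowtype definition.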
Figures 7b, 7c and 7d show an example embodiment of the associated Rowtype dictionaries in the eDoc format. The Rowtype dictionaries 704, 706 and 708 respectively allow the data values in each Rowtype field of the eDoc document 702 to be interpreted. Figure 7b provides Rowtype definitions for interpreting the Rowtype RPD0 714 of the eDoc document 702 (Figure 7a). Figure 7c provides Rowtype definitions for interpreting the Rowtype REM0 716 of the eDoc document 702 (Figure 7a). Figure 7d provides Rowtype definitions for interpreting the Rowtype RBK0 720 of the eDoc document 702 (Figure 7a).
With reference to Figure 7b, each Rowtype dictionary comprises:
1) a header [uDDdR0u] 726 for representing the start of document;
2) a header [uSS010u] 728 for representing the start of section;
3) a system Rowtype RiD0 [üRRiD0u ] 730; and
4) one or more RdC0 lines [üRRdC0 ] 732 for defining fields for the Rowtype RPD0.
Each RdC0 line in numeral 732 provides a definition of the "fields" that are present in the particular Rowtype, RPD0. In addition to providing a definition for the fields, the RdC0 lines 732 further comprise special identification and checking fields. As illustrated in Figure 7b, the first 8 RdC0 lines 732a, which provide for system related fields, are:
[Figure imgf000018_0001: listing of the first 8 RdC0 lines 732a; image not reproduced]
After the first 8 RdC0 lines 732a, the fields associated with the data values in the eDoc document for that Rowtype are defined. Numeral 732b shows five RdC0 lines:
[Figure imgf000018_0002: listing of the five RdC0 lines 732b; image not reproduced]
Numeral 734 shows the data type of the field represented by the RdC0 line, where "X" represents a character string, "9" represents a number data type, and "D" represents a Date data type. Numeral 736 shows the field names of the data values stored in the eDoc document 702 for the Rowtype RPD0 714.
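As a sketch of how these data type codes could be used when reading values back out of an eDoc document, the snippet below converts a stored character string according to its code. The textual formats are assumptions: the specification does not say how numbers or dates are written, so decimal text and ISO 8601 dates are used purely for illustration.

```python
from datetime import date
from decimal import Decimal

# Data type code -> converter from the stored character string.
CONVERTERS = {
    "X": str,                  # character string, returned unchanged
    "9": Decimal,              # number, assumed to be plain decimal text
    "D": date.fromisoformat,   # date, assumed to be written as YYYY-MM-DD
}

def convert(type_code, text):
    """Convert one stored value using the data type code from the dictionary."""
    return CONVERTERS[type_code](text)

name = convert("X", "Alice")
balance = convert("9", "1500")
born = convert("D", "1980-01-31")
```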
The last 4 RdC0 lines 732c, which provide for other system related fields such as document size, checksum, etc., in the Rowtype dictionary 704 are:
[Figure imgf000019_0001: listing of the last 4 RdC0 lines 732c; image not reproduced]
Based on the above description for the Rowtype dictionary 704, it will be appreciated by a person skilled in the art that the Rowtype dictionaries 706 and 708 for identifying the data values in the eDoc document 702 for the Rowtypes with headers REM0 716 and RBK0 720 can similarly be interpreted.
In addition, in other embodiments, a Documenttype dictionary may also be created. Figure 7e shows an example embodiment of a Documenttype dictionary 709. The Documenttype dictionary 709 contains display information for use by e.g. Graphical User Interface (GUI) applications to recreate and display the eDoc data of a particular Documenttype e.g. DOD0 710 of Figure 7a. It will be appreciated that the Rowtype dictionaries e.g. 704, 706 and 708 of Figures 7b, 7c and 7d respectively and a partial Documenttype dictionary are generated based on the transformation rules. Subsequently, the Documenttype dictionary can be updated to include e.g. display information such as those represented in numerals 742, 744 and 746 using e.g. a design tool, which allows the display of the eDoc document in a desired format.
Display information for each Rowtype is stored in blocks, e.g. 740. For example, a block 740 (Figure 7e) in the Documenttype dictionary 709 stores the display information for the Rowtype with header RPD0 714 (Figure 7a). The display information includes formatting information 742 such as the data type 744 of each data field, and X-Y coordinate position information 746 of each field when displayed in a GUI application. Further, the block 740 may contain a label 748 for identifying the field being defined by the particular line, improving the readability of the block 740 to the consultant for debugging purposes. It will be appreciated that the lines of the Documenttype dictionary 709 illustrated in Figure 7e are truncated for reproduction purposes and the Figure is therefore not a complete representation of a Documenttype dictionary.
Figure 8 illustrates the storage of multiple (two) eDocs in a single DBMS table 800 in an example embodiment. Figure 9 illustrates the storage of multiple (two) eDocs in a single file 900 in an example embodiment. Figures 8 and 9 also include the eDoc 702 of Figure 7 as eDoc 802 and 902 respectively. An additional Rowtype RiFO has also been added. In the process flow of eLFS disclosed in PCT MY08/00017, an eDoc is first filed to the transaction ledger before it is sent for filing in the master ledger. The purpose of RiFO is to provide a cross-reference trace for the eDoc between these two ledgers. Each RiFO has two sets of sequence numbers and status:
o Tseq# and Tstatus - sequence number and status for filing at the transaction ledger
o Mseq# and Mstatus - sequence number and status for filing at the master ledger
When the eDoc is filed in the transaction ledger, the RiFO Rowtype defined variable character string is created. Both the current running sequence number of the ledger and the filing status (success, fail, pending, suspended) are then inserted into the Tseq# and Tstatus fields in the RiFO defined variable character string respectively.
The "new" eDoc with an attached RiFO defined variable character string is then sent to the master ledger for filing (storage). At the master ledger, the same process is carried out, where the sequence number and filing status at the master ledger are inserted into the same RiFO defined variable character string created at the transaction ledger, as separate Mseq# and Mstatus fields.
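The cross-reference trace carried by RiFO can be modelled roughly as follows. The class names, the in-memory ledgers and the starting sequence number of the master ledger are hypothetical; only the four fields and the four status values come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

STATUSES = {"success", "fail", "pending", "suspended"}

@dataclass
class RiFOTrace:
    tseq: Optional[int] = None      # Tseq#: sequence number at the transaction ledger
    tstatus: Optional[str] = None   # Tstatus: filing status at the transaction ledger
    mseq: Optional[int] = None      # Mseq#: sequence number at the master ledger
    mstatus: Optional[str] = None   # Mstatus: filing status at the master ledger

class Ledger:
    def __init__(self, running_seq=0):
        self.running_seq = running_seq
        self.filed = []

    def file(self, edoc, trace, status="success"):
        """File an eDoc and return this ledger's sequence number and status."""
        if status not in STATUSES:
            raise ValueError(f"unknown filing status: {status}")
        self.running_seq += 1
        self.filed.append((edoc, trace))
        return self.running_seq, status

transaction_ledger = Ledger()
master_ledger = Ledger(running_seq=40)  # arbitrary starting point for illustration

edoc = "DOD0~RPD0|Alice|N001"           # placeholder eDoc string
trace = RiFOTrace()
trace.tseq, trace.tstatus = transaction_ledger.file(edoc, trace)
trace.mseq, trace.mstatus = master_ledger.file(edoc, trace, status="pending")
```

After both filings, the single trace holds the position of the eDoc in each ledger, which is what allows a lookup in one ledger to be followed into the other.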
The Mseq# and Mstatus can then be sent back to the transaction ledger to update the same parameters.
In both example embodiments illustrated in Figures 8 and 9, each eDoc document is represented as a character string of multiple lines or rows. It will be appreciated that the use of multiple lines or rows in the figures is for illustrative purposes only. In example embodiments, each eDoc takes up only a single line in the text file or a single row in the DBMS table.
Figure 10 shows a flowchart illustrating a method 1000 for converting data from a multiple table structure into an eDoc format. At step 1002, one or more Documenttypes defining respective types of eDoc documents are created. For each Documenttype, at step 1004, one or more Rowtypes defining respective first single character strings are created. At step 1006, fields in the multiple tables are mapped to fields of first single character strings defined by respective Rowtypes. At step 1008, data values are extracted from the mapped fields of the multiple tables. At step 1010, extracted data values are stored in the corresponding fields of each first single character string. At step 1012, the first character strings are combined into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
The method and system of the example embodiment can be implemented on a computer system 1100, schematically shown in Figure 11. It may be implemented as software, such as a computer program being executed within the computer system 1100, and instructing the computer system 1100 to conduct the method of the example embodiment.
The computer system 1100 comprises a computer module 1102, input modules such as a keyboard 1104 and mouse 1106 and a plurality of output devices such as a display 1108, and printer 1110.
The computer module 1102 is connected to a computer network 1112 via a suitable transceiver device 1114, to enable access to e.g. the Internet or other network systems such as Local Area Network (LAN) or Wide Area Network (WAN).
The computer module 1102 in the example includes a processor 1118, a Random Access Memory (RAM) 1120 and a Read Only Memory (ROM) 1122. The computer module 1102 also includes a number of Input/Output (I/O) interfaces, for example I/O interface 1124 to the display 1108, and I/O interface 1126 to the keyboard 1104.
The components of the computer module 1102 typically communicate via an interconnected bus 1128 and in a manner known to the person skilled in the relevant art.
The application program is typically supplied to the user of the computer system 1100 encoded on a data storage medium such as a CD-ROM or flash memory carrier and read utilising a corresponding data storage medium drive of a data storage device 1130. The application program is read and controlled in its execution by the processor 1118. Intermediate storage of program data may be accomplished using RAM 1120.
The method of the current arrangement can be implemented on a wireless device 1200, schematically shown in Figure 12. It may be implemented as software, such as a computer program being executed within the wireless device 1200, and instructing the wireless device 1200 to conduct the method. The wireless device 1200 comprises a processor module 1202, an input module such as a keypad 1204 and an output module such as a display 1206.
The processor module 1202 is connected to a wireless network 1208 via a suitable transceiver device 1210, to enable wireless communication and/or access to e.g. the Internet or other network systems such as Local Area Network (LAN), Wireless Personal Area Network (WPAN) or Wide Area Network (WAN).
The processor module 1202 in the example includes a processor 1212, a Random Access Memory (RAM) 1214 and a Read Only Memory (ROM) 1216. The processor module 1202 also includes a number of Input/Output (I/O) interfaces, for example I/O interface 1218 to the display 1206, and I/O interface 1220 to the keypad 1204.
The components of the processor module 1202 typically communicate via an interconnected bus 1222 and in a manner known to the person skilled in the relevant art.
The application program is typically supplied to the user of the wireless device 1200 encoded on a data storage medium such as a flash memory module or memory card/stick and read utilising a corresponding memory reader-writer of a data storage device 1224. The application program is read and controlled in its execution by the processor 1212. Intermediate storage of program data may be accomplished using RAM 1214.
It will be appreciated by a person skilled in the art that numerous variations and/or modifications may be made to the present invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.

Claims

1. A method of converting data from a multi table structure into an eDoc format; the method comprising the steps of:
creating one or more Documenttypes defining respective types of eDoc documents;
for each Documenttype,
creating one or more Rowtypes defining respective first single character strings;
mapping fields in the multiple tables to fields of the first single character strings defined by respective Rowtypes;
extracting data values from the mapped fields of the multiple tables;
storing the extracted data values in the corresponding fields of each first single character string;
combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
2. The method as claimed in claim 1, further comprising creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
3. The method as claimed in claim 2, wherein the Rowtype dictionary defines one or more of a group consisting data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
4. The method as claimed in any one of the preceding claims, further comprising creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
5. The method as claimed in any one of the preceding claims, wherein defining the Documenttype comprises defining a document ID for uniquely identifying each eDoc document.
6. The method as claimed in claim 5, wherein the document ID comprises one or more data values of fields of the multiple tables.
7. The method as claimed in claims 5 or 6, wherein extracting the data values from the mapped fields of the multiple tables comprises generating a list of potential eDoc documents based on the document IDs.
8. The method as claimed in claim 7, wherein extracting the data values from the mapped fields of the multiple tables further comprises constructing a query to retrieve the data values from the mapped fields of the multiple tables.
9. The method as claimed in any one of the preceding claims, wherein one or more Rowtypes are shared between different Documenttypes.
10. The method as claimed in any one of the preceding claims, wherein one or more Rowtypes have multiple occurrences in a single Documenttype.
11. The method as claimed in any one of the preceding claims, further comprising storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
12. A system for converting data from a multi table structure into an eDoc format; the system comprising:
means for creating one or more Documenttypes defining respective types of eDoc documents;
for each Documenttype,
means for creating one or more Rowtypes defining respective first single character strings;
means for mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes;
means for extracting data values from the mapped fields of the multiple tables;
means for storing the extracted data values in the corresponding fields of each first single character string;
means for combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
13. The system as claimed in claim 12, further comprising means for creating a Rowtype dictionary for interpreting the first single character string defined by the respective Rowtype.
14. The system as claimed in claim 13, wherein the Rowtype dictionary defines one or more of a group consisting data type of each field of the first character string, name of each field of the first character string and the arrangement of each field of the first character string.
15. The system as claimed in any one of claims 12 to 14, further comprising means for creating a Documenttype dictionary for defining X-Y coordinates and formatting information for displaying each field of the second character string in a graphical user interface (GUI).
16. The system as claimed in any one of claims 12 to 15, wherein means for defining the Documenttype comprises means for defining a document ID for uniquely identifying each eDoc document.
17. The system as claimed in claim 16, wherein the document ID comprises one or more data values of fields of the multiple tables.
18. The system as claimed in claims 16 or 17, wherein means for extracting the data values from the mapped fields of the multiple tables comprises means for generating a list of potential eDoc documents based on the document IDs.
19. The system as claimed in claim 18, wherein means for extracting the data values from the mapped fields of the multiple tables further comprises means for constructing a query to retrieve the data values from the mapped fields of the multiple tables.
20. The system as claimed in any one of claims 12 to 19, wherein one or more Rowtypes are shared between different Documenttypes.
21. The system as claimed in any one of claims 12 to 20, wherein one or more Rowtypes have multiple occurrences in a single Documenttype.
22. The system as claimed in any one of claims 12 to 21, further comprising means for storing the eDoc documents in respective rows of a single DBMS table or respective single lines of a file.
23. A computer readable data storage medium having stored thereon computer code means for instructing a computer to execute a method of converting data from a multi table structure into an eDoc format; the method comprising the steps of:
creating one or more Documenttypes defining respective types of eDoc documents;
for each Documenttype,
creating one or more Rowtypes defining respective first single character strings;
mapping fields in the multiple tables to fields of first single character strings defined by respective Rowtypes;
extracting data values from the mapped fields of the multiple tables;
storing the extracted data values in the corresponding fields of each first single character string;
combining the first character strings into the eDoc document of said Documenttype such that the eDoc document comprises a second single character string.
PCT/MY2010/000323 2009-12-16 2010-12-16 System and method of converting data from a multiple table structure into an edoc format WO2011074942A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI20095417 2009-12-16
MYPI20095417 2009-12-16

Publications (1)

Publication Number Publication Date
WO2011074942A1 true WO2011074942A1 (en) 2011-06-23

Family

ID=44167511

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2010/000323 WO2011074942A1 (en) 2009-12-16 2010-12-16 System and method of converting data from a multiple table structure into an edoc format

Country Status (1)

Country Link
WO (1) WO2011074942A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016060553A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee A method for converting file format and system thereof
WO2016060549A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee A system for processing data and method thereof
WO2016060547A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee Emulating manual system of filing using electronic document and electronic file
WO2016060548A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee Electronic document and electronic file
WO2017074174A1 (en) * 2015-10-30 2017-05-04 Kim Seng Kee A system and method for processing big data using electronic document and electronic file-based system that operates on rdbms
CN112256691A (en) * 2019-07-22 2021-01-22 珠海金山办公软件有限公司 Data mapping method and device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5909570A (en) * 1993-12-28 1999-06-01 Webber; David R. R. Template mapping system for data translation
US6381600B1 (en) * 1999-09-22 2002-04-30 International Business Machines Corporation Exporting and importing of data in object-relational databases
US20050086235A1 (en) * 2003-10-17 2005-04-21 International Business Machines Corporation Configurable flat file data mapping to a datasbase

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016060553A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee A method for converting file format and system thereof
WO2016060549A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee A system for processing data and method thereof
WO2016060547A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee Emulating manual system of filing using electronic document and electronic file
WO2016060548A1 (en) * 2014-10-13 2016-04-21 Kim Seng Kee Electronic document and electronic file
GB2546912A (en) * 2014-10-13 2017-08-02 Seng Kee Kim Emulating manual system of filing using electronic document and electronic file
GB2548255A (en) * 2014-10-13 2017-09-13 Seng Kee Kim Electronic document and electronic file
WO2017074174A1 (en) * 2015-10-30 2017-05-04 Kim Seng Kee A system and method for processing big data using electronic document and electronic file-based system that operates on rdbms
GB2559909A (en) * 2015-10-30 2018-08-22 Seng Kee Kim A system and method for processing big data using electronic document and electronic file-based system that operates on RDBMS
CN112256691A (en) * 2019-07-22 2021-01-22 珠海金山办公软件有限公司 Data mapping method and device and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10837937

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10837937

Country of ref document: EP

Kind code of ref document: A1