WO2007068279A1

WO2007068279A1 - Method and computer system for updating a database from a server to at least one client

Info

Publication number: WO2007068279A1
Application number: PCT/EP2005/014210
Authority: WO
Original assignee: Piaton, Alain, Nicolas
Priority date: 2005-12-14
Filing date: 2005-12-14
Publication date: 2007-06-21

Abstract

The invention proposes a method for updating a database from a server to at least one client comprising: a) the client receiving database update data and database integrity data; b) inserting said update data at a location of a database of the client; c) verification of the integrity of said database using said integrity data, wherein the database modified at the insertion step is a concatenated table comprising information data, and preferably display data.

Description

METHOD AND COMPUTER SYSTEM FOR UPDATING A DATABASE FROM A SERVER TO AT LEAST ONE CLIENT

FIELD OF THE INVENTION

The invention relates to the field of databases, and more specifically to a method for updating a database from a server to at least one client.

BACKGROUND OF THE INVENTION

A database is usually defined as a collection of data or information organized for rapid search and retrieval, especially by a computer. Databases are structured to facilitate storage, retrieval, modification, and deletion of data in conjunction with various data-processing operations. A database consists of a file or set of files that can be broken down into records, each of which consists of one or more fields. Fields are the basic units of data storage. Users retrieve database information primarily through queries. Using for example keywords and sorting commands, users can rapidly search, rearrange, group, and select fields in many records to retrieve or create reports on particular aggregates of data according to the rules of the database management system being used. The computer program used to manage and query a database is known as a database management system (DBMS). Other solutions, e.g. the so-called PDM solutions, may further be used to automatically store and manage product information. Databases used in PDM systems enable queries to be made on various types of relations between objects stored in the database. Besides, in a related field, a number of search engines are known.

A search engine is a program designed to help find information stored on a computer system such as the WWW, a database, or a personal computer.

A search engine allows one to ask for content meeting specific criteria, such as a given word or phrase, and is adapted to retrieve a list of references matching said criteria. Search engines use regularly updated indexes to operate quickly and efficiently. Typically, a search engine works by sending out a spider to fetch as many documents as possible and extract information from them. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. Search engines use proprietary algorithms to create indices such that, ideally, only meaningful results are returned for each query. A search engine may alternatively work by synchronizing with a database. Metadata can further be used.

Several search engines have been developed for retrieving information, notably on the Internet. For instance, the Alta Vista Company proposed an Internet search site, with a request box where the user may input keywords for retrieving information. More recently, Google Inc. proposed a searching tool for searching HTML files or text documents (in the PDF, Microsoft Word or RTF formats) available through the Internet. The results are returned to the user as a list of web pages. Each result is displayed as a URL, together with an abstract of the document accessible through the URL. The abstract is an extract of sentences or parts of sentences of the document. If a web page is comprised of frames, the result returned to the user may be the URL of the frame, together with an abstract of the frame. Each frame is therefore searched and handled individually by the engine. More generally, a web search engine, such as Google or MSN Search (trademark), provides a way to access information content from a unique index.

Referring back to the general field of databases: it is often necessary to replicate a database. In particular, database replication is a concept closely related to transactions. It is sometimes even possible to duplicate data in real time, for example when using a database that can log its individual actions. The duplicate can be used to improve the performance or availability of the whole database system. Known replication concepts include:

- Server/Client replication, wherein all Write requests are performed on the master and then replicated to the slaves;

- Quorum: the result of Read and Write requests is calculated by querying a majority of replicas; and

- Multimaster: two or more replicas synchronize each other via a transaction identifier.

With modern database systems, installing and maintaining a replicated copy of a given database requires a considerable investment in hardware, software, and services, particularly if the primary database is a large system handling large transaction volumes (examples are discussed throughout). Incidentally, the replication or update process comprises a number of manually intensive steps. In some applications, as will be exemplified hereafter, replicated copies of a master database need to be frequently updated at many different client terminals. When large transaction volumes are involved, the required update rate may further overload the network capacity. In addition, it may take a long time to answer a query on a large replicated database, which is particularly unsuitable when applications involving recurrent queries run at the client terminals. There is therefore a need for a method and computer system for updating a database which resolves the above problems.

SUMMARY OF THE INVENTION

To this aim, the invention proposes a method for updating a database from a server to at least one client comprising:

- the client receiving database update data and database integrity data;

- inserting said update data at a location of a database of the client;

- verification of the integrity of said database using said integrity data, wherein the database modified at the insertion step is a concatenated table comprising information data and preferably display data.

In other embodiments, the method according to the invention may comprise one or more of the following features:

- during said insertion step, the update data are inserted at the end of the client database;

- said integrity data are a checksum;

- if database integrity is confirmed, saving the modified database;

- the method comprises a step in which, if integrity of the modified database is not confirmed, a request is sent from the client to the server requesting the update data for the database to be re-sent;

- the client is adapted to communicate with said server via two separate communication networks, said update data being received via one of said networks, the request for re-sending employing the other of said networks;

- the database modified at the insertion step is a concatenated table comprising at least partially encrypted information data and display data, stored on a non-volatile memory, the method further comprising:
  o accessing the modified database;
  o reading a portion of the modified database; and
  o sequentially scanning said read portion of the database;

- at the step of reading, said read portion amounts to the whole modified database;

- the step of reading said portion further comprises decrypting on-the-fly said read portion, all decrypted data resulting from the decrypting step residing on a volatile memory only, and the step of sequentially scanning comprises sequentially scanning the decrypted data only;

- the read portion is sequentially scanned in an encrypted state;

- the method further comprises, after the step of sequentially scanning, a step of deleting the read portion;

- the method further comprises, before decrypting, verification of the authorization of decryption;

- the verification step uses a decryption key or a hardware device;

- the method further comprises a step of querying the database modified at the insertion step and retrieving data from that database;

- querying said modified database comprises sequentially scanning said database or a read portion thereof;

- the method further comprises a step in which the client displays data retrieved at the retrieving step;

- said concatenated table comprises information in the form of elementary records, and tags structuring said information, wherein said display step is dependent on said tags;

- said query step in said modified database comprises sequentially scanning said database or a read portion thereof without taking account of said tags;

- said concatenated table comprises:
  o information in the form of elementary records in which:
    ■ a first record contains a word,
    ■ a second record comprises phonetic interpretation data in a first language,
    ■ a third record comprises phonetic interpretation data in a second language; and
  o tags structuring said information;

- said concatenated table comprises:
  o information in the form of records, said records comprising one or more fields; and
  o tags structuring said information within said records, wherein at least some of the tags describe both a type of a field and its significance within its parent record;

- information data and display data are the same;

- the method further comprises, before the step of querying the modified database, a step of selecting one or more files according to one or more given criteria, wherein the step of querying the database is restricted to querying the selected file or files only;

- the step of selecting one or more files comprises several iterations;

- the method further comprises a step of querying a relational database;

- the step of querying the relational database occurs while querying the modified database; and

- the step of querying the relational database occurs while or after returning said retrieved data.

The invention further concerns a computer program product suitable for implementing the steps of the method according to the invention.

The invention is also directed to a computerized system implementing the steps of the method according to the invention. It is the belief of the inventor that the efficiency of a method for updating a database, and of the subsequent use of the updated database, is dramatically improved according to the invention.

Furthermore, to the best of the inventor's knowledge, the prior art, whilst suggesting some features and numerous variations relevant to database replication and queries in general, has not disclosed some of the highly advantageous features of the present invention discussed herein.

Various embodiments of the invention are now discussed.

I. METHOD AND COMPUTER SYSTEM FOR UPDATING A DATABASE

1.1 Basics of the method according to the invention

The invention is directed to a method for updating a database from a server to at least one client.

As known, Client/Server refers to a network architecture wherein the client (for example a graphical user interface, likely to run on a computer remote from the server computer) is separated from the server. With such an architecture, each computer or process on the network is either a client or a server. Server software generally runs on computers dedicated exclusively to running a basis application. On the other hand, the client software is likely to run on common computers or workstations. Clients get all or part of their information from the application server and rely on it for various things such as configuration files. Each instance of the client software can preferably send requests to a server or application server, as will be exemplified later. In the present case, several clients are likely to be involved.

The method according to the invention comprises a first step, wherein the client receives database update data and database integrity data. Database update data are preferably divided in blocks, as known in the art.

Furthermore, said integrity data can be a checksum (or even a length), which is a simple measure for protecting the integrity of data. Errors in data sent through the network are thereby detected. Typically, one sends a value resulting from adding up basic components of the database, typically the bytes. Later, the same operation can be performed on the data at the remote PCs; the result is then compared to the authentic checksum and, provided that the sums match, it may be concluded that the update process was not corrupted. Notice that the simplest form of checksum (adding up the bytes in the data) cannot detect a number of types of errors. Thus, one may contemplate using more sophisticated approaches, as known in the art.
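By way of illustration only, the limitation of the simplest checksum can be sketched as follows. The choice of CRC-32 (via Python's standard `zlib` module) as the "more sophisticated approach" is an assumption made for this example; the description does not prescribe any particular algorithm, and the sample data are invented.

```python
import zlib


def byte_sum_checksum(data: bytes) -> int:
    """Simplest checksum: add up the bytes, truncated to 32 bits."""
    return sum(data) & 0xFFFFFFFF


original = b"price=12;price=34"
swapped = b"price=21;price=34"  # two bytes transposed in transit

# The byte sum does not depend on byte order, so the transposition
# goes unnoticed.
same_sum = byte_sum_checksum(original) == byte_sum_checksum(swapped)

# CRC-32 depends on byte positions, so the same error is detected.
same_crc = zlib.crc32(original) == zlib.crc32(swapped)
```

Here `same_sum` is true although the data differ, whereas `same_crc` is false.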

Then, the method comprises: inserting said update data at a location of a database of the client and verification of the integrity of said database using said integrity data. Said location is preferably chosen so as to minimize subsequent reorganization of the table. Preferably, during the above insertion step, the update data are inserted at the end of the client database or table, such that re-structuring the table is not required. The process time is thereby improved.

Furthermore, the database modified at the insertion step is a concatenated table comprising information data and, preferably, display data. Accordingly, said update data are preferably sent in a form (for example blocks) suitable for a straightforward insertion in the table.

Thus, only the information necessary to update the table is sent through the network, whereby the load on the network is reduced. Hence, frequent updates can be contemplated, from a server to several client computers (for example once a day).

Since the table is a concatenated table, though possibly dispersed among several files stored on client computers, adding the update data is easily implemented.

Yet, the table is a searchable table including information data. Thus, owing to such a simple table structure, the table can be locally queried and data retrieved. Furthermore, querying the modified table may comprise sequentially scanning said table, owing to its structure. Sequentially scanning the table can dramatically shorten the usual response time.

Then, if the integrity of the table is confirmed, one may proceed to save the modified table. Thus, a non-corrupted and updated table is available and ready for a next step of update.

If not, that is, if integrity of the modified database is not confirmed, a request might be sent from the client to the server requesting the update data to be re-sent.
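The steps described so far (insertion at the end of the table, integrity verification, then either saving or requesting a re-send) may be sketched as follows. This is a minimal sketch under stated assumptions: the names `Update`, `apply_update` and `request_resend` are hypothetical, the table is held as a byte string, and CRC-32 stands in for whatever integrity data an actual embodiment would use.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Update:
    data: bytes     # update data, already in a form suitable for insertion
    checksum: int   # integrity data: checksum of the table after the update


def apply_update(table: bytes, update: Update, request_resend) -> bytes:
    """Append the update at the end of the table and verify integrity."""
    candidate = table + update.data          # no restructuring required
    if zlib.crc32(candidate) == update.checksum:
        return candidate                     # integrity confirmed: save
    request_resend()                         # otherwise ask the server again
    return table                             # keep the last valid table


# Hypothetical usage with invented price items.
resend_requests = []
table = b"A001;9.90\n"
new_item = b"A002;4.50\n"

good = Update(new_item, zlib.crc32(table + new_item))
table_after_good = apply_update(
    table, good, lambda: resend_requests.append("resend"))

# Same block, but one byte was altered in transit.
corrupted = Update(b"A003;7.0X\n",
                   zlib.crc32(table_after_good + b"A003;7.00\n"))
table_after_bad = apply_update(
    table_after_good, corrupted, lambda: resend_requests.append("resend"))
```

In the two-network variant described below, `request_resend` would use a different network from the one on which the update was received.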

In particular, one may contemplate using two separate communication networks. Hence, said update data are received via one of said networks, while the request for re-sending employs the other of said networks. The load on the network intended for broadcasting the update data is thereby reduced. In addition, the network intended for broadcasting the update data can be a non-public network, whereby risks of piracy can be reduced.

A user at the client PC is likely to display data retrieved at the retrieving step. To this aim, the table may also comprise display data. Furthermore, said concatenated table may comprise both information in the form of elementary records and tags structuring said information, making it possible, for example, to structure said information in a subsequent display (as will be exemplified later). Hence, said display step may make use of said tags.

On the other hand, querying the table and subsequent sequential scan are preferably performed without taking account of said tags, so as to maintain a high scanning rate.

As an example of application, the concatenated table may also comprise information in the form of a suite of elementary records. For example, a first record may contain a word, a second record may comprise phonetic interpretation data in a first language (e.g. French) and a third record may comprise phonetic interpretation data in a second language (e.g. English). Tags can be used to structure said records. Insertion and use of tags will be discussed in detail later.
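The distinction between scanning (which ignores tags) and display (which uses them) may be illustrated by the following sketch. The tag syntax (`<w>`, `<fr>`, `<en>`), the newline-separated layout and the phonetic strings are all assumptions invented for this example; the description does not specify a tag format.

```python
import re

# Hypothetical concatenated table: one word per line, followed by
# phonetic data in two languages, each field wrapped in a tag.
TABLE = (
    "<w>table</w><fr>tabl</fr><en>TAY-buhl</en>\n"
    "<w>garage</w><fr>ga-razh</fr><en>guh-RAHZH</en>\n"
)


def scan(table: str, term: str) -> list:
    """Sequentially scan the raw text; tags are not parsed at this stage."""
    hits = []
    pos = table.find(term)
    while pos != -1:
        start = table.rfind("\n", 0, pos) + 1
        end = table.find("\n", pos)
        if end == -1:
            end = len(table)
        hits.append(table[start:end])
        pos = table.find(term, end)
    return hits


TAG = re.compile(r"<(\w+)>(.*?)</\1>")
LABELS = {"w": "Word", "fr": "French", "en": "English"}


def display(record: str) -> str:
    """Use the tags only when structuring the record for display."""
    return ", ".join(f"{LABELS[tag]}: {value}"
                     for tag, value in TAG.findall(record))


hits = scan(TABLE, "garage")
shown = [display(record) for record in hits]
```

Because the scan treats the table as plain text, a search term that happens to coincide with a tag name would also match; an actual embodiment would have to choose its tag encoding accordingly.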

If necessary, the same data can be used for both information (for example to be scanned) and display, which makes it possible to use more compact tables. Yet, scanning rates remain high. Notice that the same steps as above can be carried out at each one of said clients.

Local queries (performed at client computers) in the database may typically be achieved via a standard GUI, having usual menu bars that contain a set of user-selectable icons, each icon being associated with one or more operations or functions, as known in the art. In particular, said GUI may comprise a query box and a display area for displaying target results.

1.2 Sample application to a decentralized company: networked information system

1.2.1 Table duplication

A goods distribution company with a head office and traveling salespeople distributes goods referenced in a product list and wants better customer management at any time.

Salespeople are equipped with laptop computers connected to the head office, and must at all times have at their disposal all the elements needed to file and manage orders (financial elements in relation to each customer, current order status, list, prices, inventory, etc.); they must also have textual information in relation to their customers (visit reports, exchanged e-mail messages, etc.).

For response time reasons, and traffic load both on phone lines and radio waves, exchanges between the central database and the micro-computers must be limited to a strict minimum, so that it is necessary to replicate the information contained in the central database on the micro-computers.

Trade information can be separated into two categories:

- information common to all salespeople (list, prices, etc.): this information must be identical on all company computers;

- information related to each customer, which must be stored in the computer of each sales engineer for the part that concerns his customers, but also in the head office computers, which store the totality of the company's customer information.

It must be noted that, according to the case, some information is generated at the head office and other information at the level of the salespeople. In terms of flow, it can be said that:

- common information available in the salespeople's computers must be a replication of the information produced at the level of the head office;

- accounting or inventory information is managed at the head office level and partially duplicated, for the part that concerns them, on the salespeople's computers;

- mainly textual information related to customers (e-mail messages, meeting reports, etc.) is produced at the level of the salespeople's computers and, according to the case, must be accessible at the level of the head office.

Such a need is known, and the difficulty resides in the fact that all information must be coherent and updated with a minimum delay in the whole company.

For cost reasons, salespeople cannot use high bandwidth phone communication (wired or wireless) while they are traveling.

For security reasons, all information contained in the laptop computers must be encrypted.

1.2.2 System architecture

1.2.2.1 At the head office

One finds a classical database containing all the necessary elements for centralized inventory management and end-to-end order processing (orders, shipping, billing, etc.), meaning the inventory files, the prices files (called hereafter source prices files), customer files, accounting operations history for the customers, etc. These elements are modified in real time by transactions carried out by the different employees of the company.

In parallel to this database, one finds a second prices file, called hereafter the sequential prices file, which contains all or part of the elements of the prices file contained in the central database, and which contains for each list item an identifier and the corresponding price. According to a feature of the invention, this sequential prices file is constituted by a table in which all the elements, hereafter called elementary items, are stored one after the other. From a request on an element such as an item identification code, a simple sequential scan of the table thus makes it possible to find and display the searched elementary items.

The first time, the sequential prices file is entirely produced by an extraction from the source prices file of the database.

Subsequent updates are made by appending, at the end of the table, an elementary item generated from the latest modification, so that for a given item one may have several elementary items in the file, and only the last elementary item in the table holds valid information.

For instance, when an item is deleted, the last elementary item will be a so-called deletion element, meaning that the corresponding article no longer exists.

Alternatively, another solution consists in marking the item or items to delete with a code indicating that the item is deleted.

Regularly, a simple reorganization of the table allows deleting duplicated items so as to keep, for a given item, only the most recent elements and delete those that have a deletion element.
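The append-only behaviour of the sequential prices file, including deletion elements and the periodic reorganization, may be sketched as follows. The representation of an elementary item as an `(identifier, price)` pair and the use of `None` as the deletion element are assumptions made purely for illustration.

```python
DELETED = None  # hypothetical deletion element


def append_item(table: list, item_id: str, price) -> None:
    """Every modification is appended at the end; nothing is rewritten."""
    table.append((item_id, price))


def lookup(table: list, item_id: str):
    """Sequential scan: the last elementary item for an id is the valid one."""
    price = DELETED
    for current_id, current_price in table:
        if current_id == item_id:
            price = current_price
    return price


def reorganize(table: list) -> list:
    """Keep only the most recent item per id and drop deleted ids."""
    latest = {}
    for item_id, price in table:
        latest.pop(item_id, None)   # re-insert so order follows last update
        latest[item_id] = price
    return [(i, p) for i, p in latest.items() if p is not DELETED]


prices = []
append_item(prices, "A001", 9.90)
append_item(prices, "A002", 4.50)
append_item(prices, "A001", 10.50)    # price change: a new item is appended
append_item(prices, "A002", DELETED)  # deletion element

compacted = reorganize(prices)
```

After these four operations, `prices` still holds four elementary items, while `compacted` holds only the current price of `A001`.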

Other information, essentially of text type, which must be identical for all salespeople is managed in an analogous way, meaning products lists, descriptive entries, data sheets, etc.

1.2.2.2 Salespeople computers

It is typically a laptop computer (or desktop for a head office-based salesperson) running all the usual programs to file orders on line with the central computer.

The salesperson's computer also contains a prices file, called the replica prices file, which is a replica of the sequential prices file.

With traditional databases, getting a replica is very delicate, because one has to either make a copy of its totality, or to simulate all updates, which is complex and slow.

According to the invention, the replica is obtained in the following way. At first, each time a modification is made on the head office sequential prices file, the new elementary item which has just been created, is sent through a telecommunication link to every salesperson's computer using an update data block containing:

- the elementary item

- an update number, which is incremented at each operation

- a control or checksum number, which is computed from all the characters of the table after this elementary item is appended, and which makes it possible to check that both tables are identical.

Each salesperson's computer, in addition to the prices file replica, keeps in its memory the number of the last valid update. This way, when it receives a new update data block, it first checks that the update data block number is indeed the one expected.

If it is the case, the elementary item is appended at the end of the prices file replica, the computer computes the control or checksum number after the update, and checks it is indeed identical to the one from the said update data block.

An analogous procedure may as well be used to process a group of consecutive elementary items instead of a single one.

The computer may also receive several update data blocks out of order; in this case the different blocks are stored in a temporary memory, and the computer waits until the whole correct sequence has been received before beginning the update process from the temporarily stored blocks.

This procedure ensures that all updates are done in the right order, and that the prices files replica is identical to the sequential prices file.
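The handling of numbered update data blocks, including out-of-order reception and repeated broadcasts, may be sketched as follows. The class and function names are hypothetical, the table is modelled as a byte string, and CRC-32 is assumed as the control number; none of these choices is imposed by the description.

```python
import zlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UpdateBlock:
    number: int     # update number, incremented at each operation
    item: bytes     # elementary item (may be empty)
    checksum: int   # control number of the whole table after the append


def make_blocks(items):
    """Head office side: build the sequential table and its update blocks."""
    table, blocks = b"", []
    for number, item in enumerate(items, start=1):
        table += item
        blocks.append(UpdateBlock(number, item, zlib.crc32(table)))
    return table, blocks


@dataclass
class Replica:
    table: bytes = b""
    last_valid: int = 0
    pending: dict = field(default_factory=dict)  # temporary storage

    def receive(self, block: UpdateBlock) -> None:
        if block.number <= self.last_valid:
            return                       # already applied (repeated broadcast)
        self.pending[block.number] = block
        # Apply stored blocks only while the expected sequence is available.
        while self.last_valid + 1 in self.pending:
            nxt = self.pending.pop(self.last_valid + 1)
            candidate = self.table + nxt.item
            if zlib.crc32(candidate) != nxt.checksum:
                break                    # corrupted: wait for a repeat
            self.table = candidate
            self.last_valid = nxt.number


master, blocks = make_blocks([b"A001;9.90\n", b"A002;4.50\n", b"A001;10.50\n"])
replica = Replica()
for block in (blocks[2], blocks[0], blocks[1], blocks[0]):  # disordered
    replica.receive(block)
```

An update data block with an empty `item` passes through the same checks, which corresponds to the alternative described below in which a block carries no elementary item.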

According to an alternative, to ensure the replica prices file is correctly updated, even if there is no update, one may use an update data block not containing any elementary item.

One of the advantages of this procedure is that it is possible to use either a wired or a wireless transmission, only in the direction from the head office to the salespeople, and to work in broadcast mode, that is, like the Teletex method, to have a transmitter or satellite send the same information to everyone.

To allow receivers to correctly obtain any information that could have been badly transmitted for any reason, the emitter may repeatedly send the totality of the update data blocks for the day, making the temporary storage described earlier all the more useful. It is typically only in the case of a computer failing to synchronize with the head office, for instance following a breakdown, that the computer will establish a link to the head office and send it the elements that allow a recovery procedure to be initiated from the last correctly stored elementary item.

1.3 Sample application to document certification

In trade relations, it is sometimes necessary to be able to provide proof of an event such as an order, an acknowledgement of receipt, etc. A way to provide this proof is to involve a trusted third party who himself keeps a copy of the information. However, keeping a copy of all the operations and e-mail exchanges of a large company, which sometimes exceed a hundred million messages per year, and being able to find any information rapidly, necessitates very costly means.

The replication method described here may be used to obtain an equivalent result at a lower cost. Indeed, thanks to the storage system consisting in putting all events one after the other, one gets a memory usage gain of the order of 10 compared to classical relational databases and document management systems, while allowing a fast access to an information following a request.

As for ensuring coherence and inviolability of the replica, it suffices that every transaction sent to the replica bears information forbidding deletions or additions of transactions.

As a complement to other solutions used in this activity, one may rely on the facts that each transaction is time-stamped and carries a sequence number, a checksum on the transaction, and a checksum on the duplicated table after the update; that the program managing the replication checks the correctness of the process; and finally that the programs used by the company and the trust centre are authenticated, which can be verified by means of a system of passwords and encryption keys.
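How the sequence number and the two checksums can reveal a deleted or added transaction may be sketched as follows. SHA-256 is an assumed stand-in for the checksums, the `Entry` layout and the sample payloads are invented, and the authentication of the programs by passwords and keys is deliberately left out of the sketch.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    seq: int            # sequence number
    timestamp: str      # time stamp, supplied by the caller
    payload: bytes      # the transaction itself
    tx_digest: str      # checksum on the transaction
    table_digest: str   # checksum on the duplicated table after the update


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def append_entry(log: list, timestamp: str, payload: bytes) -> None:
    table = b"".join(entry.payload for entry in log) + payload
    log.append(Entry(len(log) + 1, timestamp, payload,
                     digest(payload), digest(table)))


def verify(log: list) -> bool:
    """Recompute every check; any deletion or addition breaks the sequence."""
    table = b""
    for expected_seq, entry in enumerate(log, start=1):
        table += entry.payload
        if (entry.seq != expected_seq
                or entry.tx_digest != digest(entry.payload)
                or entry.table_digest != digest(table)):
            return False
    return True


log = []
append_entry(log, "2005-12-14T09:00:00", b"order 1001")
append_entry(log, "2005-12-14T09:05:00", b"acknowledgement 1001")
append_entry(log, "2005-12-14T09:10:00", b"order 1002")

intact = verify(log)
tampered = verify(log[:1] + log[2:])  # second transaction removed
```

These checks alone would not stop a party able to recompute all digests, which is why the description additionally relies on authenticated programs.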

II. DATABASE DECRYPTION METHOD

II.1. Basic principles of the decryption method

In general, a database is stored on disk, and while it is working, it generates permanent or temporary files on disk, which constitutes an important risk in case of theft when information is particularly confidential.

To remedy this, the invention may propose a concatenated table comprising at least partially encrypted information data and, preferably, display data, stored on a non-volatile memory. Said method may comprise accessing said database, reading a portion of the database and sequentially scanning said read portion of the database.

In particular, the invention may concern a database decryption method. Said method comprises steps of:

- accessing an encrypted database stored on a non-volatile memory, for instance a hard disk of a computer;

- reading at least a portion of the encrypted database and decrypting on-the-fly the read portion, for example if an authorization is confirmed and using a decryption key; and

- storing results of the decrypting step on a volatile memory only.

As the decrypted information is stored on the volatile memory only, the risk of said information being accessed is strongly reduced. Therefore, said method appears convenient for protecting against piracy or theft.

Here again, the encrypted database is a concatenated table comprising at least partially encrypted information data and, preferably, display data, leading to the advantages described above. In particular, this makes it possible to query results of the decryption step and retrieve data contained in said results. Furthermore, querying results of the decryption may comprise sequential scanning of said results.

In an embodiment, at the step of reading, the whole encrypted database is read and decrypted. Loading the whole database in the volatile memory allows for faster scanning of the data.

In addition, said method may further comprise, before the step of decrypting, a step of verification of the authorization of decryption, so as to prevent unauthorized access to information. Said verification step may make use of a decryption key or a hardware device, as known in the art.

As described earlier, an advantage of the invention is the ability to scan at high speed a table which has previously been loaded from the disk into the memory of the computer. When the information is stored on disk, it is typically encrypted by means of an encryption key, and when it is in memory it is in clear text. Therefore, an application that, for instance, is the only one to know the encryption key performs the encryption operation each time the table in memory is saved to disk. Additionally, this application may perform the reverse operation (decryption) during loading into memory, optionally after having checked that this operation is authorized. In spite of all these precautions, there are ways to freeze the decrypted contents of the table in memory and save them on a disk.

Thus, the above-mentioned drawback is corrected by loading all or part of a table in an encrypted state. When the encryption method allows it, the searched information may be encrypted and searched directly in the encrypted data. Otherwise, one has to partially decrypt data as it is searched. This proves particularly useful when information is permanently stored on the volatile memory and never resides in a non-volatile memory.

In this case, just at the moment when a part of a table must be read for scanning, one decrypts this part into another part of the memory, then performs the search by scanning said other part of the memory, then erases the contents of said other part of the memory.
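The decrypt, scan, erase cycle may be sketched as follows. Everything about the cipher is an assumption: the XOR keystream derived from SHA-256 is a toy stand-in chosen only so that the example runs with the standard library, and it must not be taken as a secure scheme or as the encryption used by the invention. Records are assumed to be padded to a fixed `CHUNK` size so that each part can be decrypted independently.

```python
import hashlib

CHUNK = 32  # hypothetical record size, equal to one SHA-256 digest


def keystream(key: bytes, index: int) -> bytes:
    return hashlib.sha256(key + index.to_bytes(8, "big")).digest()


def xor_chunk(key: bytes, index: int, chunk: bytes) -> bytearray:
    """Toy cipher (its own inverse); the result lives in a fresh buffer."""
    return bytearray(b ^ k for b, k in zip(chunk, keystream(key, index)))


def encrypt_table(key: bytes, table: bytes) -> bytes:
    out = bytearray()
    for start in range(0, len(table), CHUNK):
        out += xor_chunk(key, start // CHUNK, table[start:start + CHUNK])
    return bytes(out)


def search_encrypted(key: bytes, encrypted: bytes, term: bytes) -> bool:
    """Decrypt one part at a time, scan it, then erase the clear text."""
    for start in range(0, len(encrypted), CHUNK):
        scratch = xor_chunk(key, start // CHUNK,
                            encrypted[start:start + CHUNK])
        found = term in scratch
        for i in range(len(scratch)):
            scratch[i] = 0           # erase the other part of the memory
        if found:
            return True
    return False


key = b"demo key"
records = [b"A001;9.90", b"A002;4.50"]
plain = b"".join(record.ljust(CHUNK) for record in records)
encrypted = encrypt_table(key, plain)

hit = search_encrypted(key, encrypted, b"A002")
miss = search_encrypted(key, encrypted, b"A003")
```

In Python, overwriting the `bytearray` is only a best-effort erasure, since the interpreter may hold other copies; a real implementation would need a language and platform offering control over memory.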

In the application sample detailed previously and for the same reasons, all update data blocks exchanged on telecommunication links between the head office and the computers are encrypted, so that no clear text information remains on disk concerning the replica prices file.

Furthermore, as described above, one may contemplate making use of tags structuring said information data. Scanning said data may be performed without taking account of said tags. In contrast, displaying said information data (or, if necessary, the only display data therein) would make use of the tags.

II.2. Embedding the decryption method within the update method

The above decryption method may advantageously be implemented together with the method for updating a database described earlier.

In such an embodiment, the decryption method comprises the following steps. At a first step, the client receives database update data and database integrity data. Then, said update data are inserted at a location of a database of the client. The database, as modified at the insertion step, is a concatenated table comprising at least partially encrypted information data and, preferably, display data. The modified database is stored (updated) on a non-volatile memory. Then, the integrity of said database is verified, as described earlier. In a subsequent step, the modified database is accessed and at least a portion of the modified database is read. Then, the read portion is decrypted on-the-fly, preferably if an authorization is confirmed and using a decryption key. Finally, results of the decryption step are stored on a volatile memory only. The advantages described above are therefore jointly achieved.

III. DISTRIBUTED DATABASE

Reconsidering the application sample detailed previously: the computer of each salesperson also has a file, called the customer event file, that essentially groups together text-type information related to the customers (e-mail messages, meeting reports, memos, specific technical appendices, complaints, etc.).

This customer event file is also built in the form of a sequence of elementary event items that may be scanned to find information related to a customer.

This information is produced at the level of the salespeople's computers, and generally will be used by the salespeople themselves; however, it may be useful to make this information available to the head office.

A solution to this problem consists in having an update system similar to the one described above so that the head office server contains a replica of all the customer event files of the salespersons that deal with them.

Another solution, less costly in telecommunication link traffic, consists in having only one instance of the information, in the salesperson's computer, and considering the global company customer event database as a group of elementary databases.

In this respect, a method is proposed for accessing information content in a database delocalized amongst several connected client computers. Thus, each client has its own database, which can be viewed as a part of said delocalized database (each computer may assume the role of server or client depending on the situation).

Furthermore, each client database is a concatenated table (possibly encrypted) with information (and possibly display) data stored as elementary records. Said method for accessing information content comprises a first step in which a number of clients receive a request relative to the information content sought, for example from the server itself or from one of the client computers. Said request is adapted for triggering a sequential scan of all or part of a client computer database. If pertinent information is found, it is then uploaded to the server or requesting PC.

Thus, as soon as one wants to know the information related to a customer, it suffices to send a request to all company computers, which may hold information on this customer, leaving to each computer the scanning of its customer events file, and the sending back of the search results.
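The request-and-scan pattern described above may be illustrated by a small simulation. It is only a sketch: the clients are modelled as in-memory lists rather than networked computers, and the names `scan_events` and `broadcast_request` are hypothetical.

```python
from typing import Dict, List

def scan_events(events: List[str], customer: str) -> List[str]:
    """Sequentially scan one client's customer event file for a customer."""
    return [event for event in events if customer in event]

def broadcast_request(clients: Dict[str, List[str]], customer: str) -> Dict[str, List[str]]:
    """Send the request to every client; each one scans locally and
    returns only its own hits, so no replica is kept on the server."""
    results = {}
    for name, events in clients.items():
        hits = scan_events(events, customer)
        if hits:
            results[name] = hits
    return results

# Hypothetical customer event files held on three salespeople's computers.
clients = {
    "laptop-A": ["ACME: meeting report", "Globex: complaint"],
    "laptop-B": ["Initech: memo"],
    "laptop-C": ["ACME: e-mail about pricing"],
}
found = broadcast_request(clients, "ACME")
```

Only the matching items travel back to the requester, which is what keeps the telecommunication traffic low compared with full replication.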

IV. MAKING USE OF IMBEDDED TAGS

IV.1 Basics

As evoked above and in an embodiment, the invention may propose a data processing method, wherein, according to the result of a user action, the method comprises a step of accessing a database, the database being a concatenated table comprising information data as elementary records and, preferably, display data. Further, the table comprises tags for structuring said information. Said method further comprises, in any order:

- sequentially scanning records in the file while ignoring the tags; and

- sequentially scanning records in the file while taking the tags into account.

Said method is likely to include both steps of sequential scanning, the order of which depends for example on user actions. Thus, according to a first result of a user action, the records in the file are scanned while the tags are ignored, allowing high speed retrieval of content, owing to the features of the database, as discussed above.

On the other hand, the records in the file may be scanned while taking the tags into account, making it possible to access and use data which are structured according to said tags. This second step is particularly convenient for a display step, wherein displayed information needs to be structured for visual comfort.

Notice that the database or file may be partially or fully encrypted, so as to allow for decryption on the fly, as explained earlier. The decrypted results would therefore exist in the volatile memory only, even if displayed, so as to improve the security of the method.

One of the advantages of the method described above is to allow storing data one after the other in an elementary item. Other aspects of the method are described below.

Usually in an elementary item, numerical or textual data are stored according to a predefined format, which is not always simple and may need much memory.

According to what precedes, another way of proceeding consists, at least for part of the data, in using a system of mark-up tags such as the one found in the HTML or XML formats.

IV.1 Examples of applications for embedded tags

In this way, when one is dealing with texts that are to be displayed, it is judicious to insert in the text, information related to formatting. For instance the sequence '0x4-b' (0x4 is a value expressed in hexadecimal) means start of a character zone displayed in bold, and '0x5-b' end of zone, etc.

During the sequential scanning of the table, it is possible to ignore the tag (0x4 or 0x5) and the following character to find a given text.

On the other hand, if one wants to display the text with formatting, tags are taken into account like it is done with texts displayed on the Internet. This allows avoiding duplicating information, once for searching, once for display.
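The two ways of reading the same record may be sketched as follows. The tag values 0x4 and 0x5 come from the example above; the rendering of tags as HTML-like mark-up and the function names are assumptions made for illustration.

```python
TAG_START, TAG_END = 0x04, 0x05  # tag values taken from the example above

def strip_tags(record: bytes) -> bytes:
    """Search view: drop each tag byte and the character that follows it."""
    out = bytearray()
    i = 0
    while i < len(record):
        if record[i] in (TAG_START, TAG_END):
            i += 2  # skip the tag and its one-character argument
        else:
            out.append(record[i])
            i += 1
    return bytes(out)

def render(record: bytes) -> str:
    """Display view: turn tags into HTML-like mark-up (assumed mapping)."""
    out = []
    i = 0
    while i < len(record):
        if record[i] == TAG_START:
            out.append("<" + chr(record[i + 1]) + ">")
            i += 2
        elif record[i] == TAG_END:
            out.append("</" + chr(record[i + 1]) + ">")
            i += 2
        else:
            out.append(chr(record[i]))
            i += 1
    return "".join(out)

record = b"price \x04blist\x05b 2005"
searchable = strip_tags(record)
displayed = render(record)
```

The same bytes serve both purposes: a search for "price list" succeeds on `searchable`, while `displayed` carries the bold zone.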

Advantageously, one may use a tag to specify that a character is equivalent to another during the search by sequential scanning.

For instance, the sequence 'e-0x7-é' means a character comparison with 'e' or 'é'.

One may also use a system of tags to insert near a word or a proper name its phonetic writing, which allows finding people whose names sound the same, like 'dupont' and 'dupond'.

A given tag may mean that the next 4 bytes that follow must be considered as a signed or unsigned integer, as a date, or as an amount in a given currency, for instance in Euros with two digits after the point.
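A typed field of this kind may be sketched as follows. The tag value 0x10, the big-endian byte order and the choice of storing the amount in cents are all assumptions made for this illustration; the text above does not fix them.

```python
import struct
from typing import Tuple

TAG_EURO_CENTS = 0x10  # hypothetical tag value, chosen for this sketch

def encode_amount(cents: int) -> bytes:
    """Tag byte followed by the amount as a signed 32-bit integer (cents)."""
    return bytes([TAG_EURO_CENTS]) + struct.pack(">i", cents)

def decode_amount(buf: bytes, pos: int) -> Tuple[str, int]:
    """Return the amount with two digits after the point and the next position."""
    assert buf[pos] == TAG_EURO_CENTS
    (cents,) = struct.unpack_from(">i", buf, pos + 1)
    sign = "-" if cents < 0 else ""
    text = f"{sign}{abs(cents) // 100}.{abs(cents) % 100:02d}"
    return text, pos + 5  # one tag byte plus four data bytes

field = encode_amount(-1250)
amount, next_pos = decode_amount(field, 0)
```

A scanner that does not care about amounts can simply jump five bytes ahead when it meets the tag.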

Hypertext-type coding allows linking to other data stored remotely, on the Internet for instance, or on disk.

In the case of a table related to customer information, one may insert near the name of the customer, a tag indicating that the n following characters represent his or her account number.

Access rights management tags may indicate if the current user has the right to read, or modify, or delete elementary articles. An access right tag is constituted by any number of group or user identifiers and permission masks. This allows for access right management similar to the one found in classical database or operating systems.

One may also use tags to delimit texts coded on one byte, like the Latin alphabet, or on two bytes, like Chinese, or to code some less frequent accented characters, like accented uppercase characters, with two characters.

In the case where one uses a text content analysis module, one may use a tag system to insert syntactic analysis elements, or logical analysis, according to the method used by the information retrieval tools.

IV.1 Complementarity between database replication and tag embedding

As we have seen above, the replication method allows building an economical and cost-effective way to archive static data, such as the bank transactions of several million customers over several years, and to have a replica of this database off-site or in the custody of a trusted third party.

Using tags allows handling gracefully the transition phase during a currency change (like the French Franc to the Euro). Indeed, using two different tags, one would have for instance:

0x5 - 65666 meaning 656.66 Francs

0x6 - 10000 meaning 100.00 Euros

One or the other of this information may appear, or both may coexist in the same element, without any ambiguity at the moment of searching, and it is therefore unnecessary to reorganize the database. This is already appreciable in the case of huge data sets. It is even more so when the database is replicated to different places, especially if the information is located at a trusted third party without access.
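The coexistence of both currency tags in one element may be sketched as follows. The tag values come from the example above; storing the digits as ASCII text after the tag is an assumption of this sketch, as is the function name.

```python
from decimal import Decimal

CURRENCY_TAGS = {0x05: "FRF", 0x06: "EUR"}  # tag values from the example above

def parse_amounts(element: bytes) -> dict:
    """Collect every tagged amount; the digits after a tag are hundredths."""
    amounts = {}
    i = 0
    while i < len(element):
        tag = element[i]
        i += 1
        if tag in CURRENCY_TAGS:
            j = i
            while j < len(element) and 0x30 <= element[j] <= 0x39:
                j += 1  # consume ASCII digits (assumed encoding)
            if j > i:
                amounts[CURRENCY_TAGS[tag]] = Decimal(element[i:j].decode("ascii")) / 100
            i = j
    return amounts

old_element = b"\x0565666"               # written before the change
mixed_element = b"\x0565666\x0610000"    # both amounts coexist
old_amounts = parse_amounts(old_element)
mixed_amounts = parse_amounts(mixed_element)
```

Elements written before and after the change are read by the same routine, so existing data does not need to be rewritten.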

V. Other sample applications

It has been shown above how one may judiciously use the lower cost and the performance of memory to build information retrieval systems, either on structured data types, avoiding the resort to index tables like the ones used in relational databases, or on text-type data, avoiding the resort to thesauri.

Amongst other advantages, the methods presented earlier have the particularity of using the sequential scanning of a table in memory to find information and, at the moment of displaying the result, using all or a part of the said table. One goal of the present invention is also to show how one may use such methods in combination with disks or relational databases to rapidly access archived information.

V.1 Bank Example

In what follows, one describes the implementation of a method to provide information related to a bank customer, for instance balances at some point in time, bank operations histories, etc., text documents such as e-mail messages from or to this customer, meeting reports concerning the customer, etc.

Typically one deals with past events, and by definition not modifiable, for instance, for an account balance, the balance at a given date. This type of elementary information is hereafter called a customer event.

As we'll see below, it is also possible to build a real-time system, that is, a system capable of managing correctly operations of all nature that may impact the same information in different places, for instance a flight booking system, and bank operations in the current bookkeeping day.

In general, banks use a relational database which contains information on their customers (last name, first name, address, proxies, list of accounts, current account, securities account, etc.) and, for each account, information such as balance, transactions, deposits, withdrawals, etc. In addition, for a given customer, one finds information about him or her distributed in e-mail messages in the mailboxes of different collaborators, as well as in Word or Excel documents stored on their computers.

V.1.1 Disk reading and table scanning

A proposed solution consists in storing sequentially in a table, all the said customer events, in the form of data blocks one after the other and using item identification code, formatting, delimiter and tag techniques. Later on, as described above, thanks to a simple sequential scanning of the said table in memory, one may find very quickly a customer event, and extract the information in order to display it. As the case may be, information is encrypted on disk.

In the case of a bank with for instance 5 million customers, the number of customer events on a 5 year period amounts to tens of billions, which necessitates storage capacities of the order of the terabyte. As such storage capacities are costly, one proceeds as follows.

On a hard disk, one creates a tree of folders corresponding to the geographic structure of the bank, for instance country, state, branch. Then, in the folder corresponding to the branch, one creates one file per customer, and one puts in this file the totality of the customer events concerning him or her. In this way, one will find on the disk for the said bank 5 million files distributed in about one thousand branch folders. Obviously, nothing prevents splitting the branch folder into subfolders, each corresponding to a group of customers. In the same way, the information on each customer may be split into several files, according to the nature of the information, that is, numerical, e-mail, texts, etc. Also, a single file may gather diverse information related to the same customer, or even several customers. To accomplish this, it suffices to generate the file path from the account number and the information requested.
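Generating the file path from the account number may be sketched as follows. The folder layout, the file-name convention and the small conversion table are hypothetical; the text above only requires that the path be derivable from the account number and the kind of information requested.

```python
from pathlib import PurePosixPath

# Hypothetical conversion table: account number -> (country, state, branch).
BRANCH_OF = {"00006457858": ("FR", "IDF", "00734")}

def customer_file_path(account_number: str, kind: str = "events") -> PurePosixPath:
    """Build country/state/branch/<account>.<kind> for one customer file."""
    country, state, branch = BRANCH_OF[account_number]
    return PurePosixPath(country, state, branch, f"{account_number}.{kind}")

events_path = customer_file_path("00006457858")
email_path = customer_file_path("00006457858", kind="email")
```

Because the path is computed rather than looked up in an index, locating a customer's data costs no disk access beyond opening the file itself.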

To avoid having the file system manager move blocks too often during updates, as each file grows in size, it may be judicious to create from the start empty files of sufficient size and fill them progressively as needed. This precaution is necessary if one wants all information related to a customer to be stored physically close together on the disk, allowing fast loading in memory. Regularly, every night for instance, using conversion routines, for each customer, one may extract from the relational database all the customer events of the day. Then, possibly after reshaping, they are added at the beginning or at the end of the file corresponding to the customer.

For documents of e-mail messages type concerning the customer, one uses the appropriate conversion routines. That is, for each message, one extracts the sender addresses, recipient addresses, the date, the subject, the plain text message body, etc., and one creates a customer event which additionally contains the corresponding customer number.

Associating the customer account number to the said customer event may be done automatically from the e-mail addresses contained in the message, or the account manager manually does it at the moment of sending or receiving the message.

For other text-type documents without sender or recipients, one may use a mode similar to the one used for e-mail, with a manual entry of the concerned customer numbers.

The enrichment operation of the said customer file with text-type customer events of the day may be done at night, or progressively at the end of the automatic or the manual customer number process.

When a user searches for information of a given type on a customer, corresponding to a set of criteria, a simple conversion table gives the customer file name and the path of the folder containing it.

Then, it suffices to read the customer file and, according to the needs, load all or a part of its contents in a temporary table in memory. From that moment, a simple sequential scanning of the said temporary table allows to locate the customer events corresponding to the requested criteria.
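Loading a customer file into a temporary table and scanning it may be sketched as follows. The record layout (one event per line, fields separated by `|`, ISO dates) is an assumption made for this example and is not the tagged binary format of the invention.

```python
import os
import tempfile

def load_table(path: str) -> list:
    """Read the whole customer file into a temporary in-memory table."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n").split("|", 2) for line in f if line.strip()]

def scan(table, kind=None, date_from=None, date_to=None) -> list:
    """Sequentially scan the table and keep the events matching the criteria."""
    hits = []
    for date, event_kind, text in table:
        if kind is not None and event_kind != kind:
            continue
        if date_from is not None and date < date_from:
            continue  # ISO dates compare correctly as strings
        if date_to is not None and date > date_to:
            continue
        hits.append((date, event_kind, text))
    return hits

# Hypothetical customer file written to a temporary folder for the demo.
path = os.path.join(tempfile.mkdtemp(), "00006457858.events")
with open(path, "w", encoding="utf-8") as f:
    f.write("2005-10-20|transfer|to 00313-00734\n"
            "2005-11-02|e-mail|question about fees\n"
            "2005-11-10|transfer|from 00313-00999\n")

table = load_table(path)
hits = scan(table, kind="transfer", date_from="2005-10-25", date_to="2005-11-25")
```

The disk is touched once, to load the file; every criterion is then evaluated in memory.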

Finally, from the found customer events, it suffices to build a response message which, possibly after enrichment with other information and formatting will be returned to the user.

This mode of storage and querying is particularly adapted when the search is done from an identifier internal to the company, such as a customer number or an account number, or a service number for personnel, etc.

Response times and the number of requests that can be processed simultaneously depend essentially on the speed of reading one or more files from disk and on the loading of the temporary table in memory, because the scanning time in memory can be very fast. For security, and/or to augment traffic, one or more computers can be added with exactly the same files on disk, working in parallel.

V.1.2 Real-time

As it has been said above, the method is appropriate to store the said customer events.

Furthermore, it also allows building very simply a real-time system, that is, a system capable of managing correctly operations of all nature impacting the same information simultaneously in different places, for instance a flight booking system, and bank operations in the current bookkeeping day.

In the bank example, one may manage an account balance, that is, have at all times an item which gives the account balance, so that a customer making a withdrawal from an ATM cannot perform another operation at the same time. This is implemented simply by using a feature of the file system: when the contents of a customer file are in memory to process a first request that changes the balance, the file system is asked to make the customer file inaccessible to any other request; then, at the end of the said request, the file system is asked to make this file accessible again. This mode of storage and querying is particularly interesting when the information contains an identifier such as a customer account number, a service number for human resources, a phone number for telephone operators, a social security number, etc., and as long as the search is always done on one identifier at a time.
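One possible way of making a customer file temporarily inaccessible is sketched below. It uses a sidecar lock file created atomically with `os.O_EXCL`; this particular mechanism is an assumption chosen for portability, since the text above does not say which file system feature is used. The file layout (a balance stored as text) and the function names are likewise hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def exclusive(path: str):
    """Hold a lock file, created atomically, while the customer file is in use."""
    lock_path = path + ".lock"
    # O_EXCL makes creation fail with FileExistsError if another request holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)

def withdraw(path: str, amount: int) -> bool:
    """Debit the balance only if funds are sufficient, under exclusive access."""
    with exclusive(path):
        with open(path) as f:
            balance = int(f.read())
        if amount > balance:
            return False
        with open(path, "w") as f:
            f.write(str(balance - amount))
        return True

# Hypothetical customer file holding a balance of 100.
account = os.path.join(tempfile.mkdtemp(), "00006457858.balance")
with open(account, "w") as f:
    f.write("100")

first = withdraw(account, 60)
second = withdraw(account, 60)
```

A request arriving while the lock is held fails immediately; a real system would retry or queue it rather than give up.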

V.2 Table scanning, disk reading, table scanning

In some cases, the search is not done on an identifier as described above. For instance, to answer complaints, it may be necessary to provide the list of inter-bank transfers to or from an account external to the bank, of which only the bank coordinates are known.

If one proceeds like above, and if the information related to each customer is stored on disk, a way to know if a customer event contains such a number, is to read all the customer files, check with scanning if the search account number is in the said customer event, which can take a long time.

A simple way to solve the problem consists in limiting the number of customer files to read, as described below. Every night, at the moment when one extracts the customer events of the day from the relational database, one selects the transactions corresponding to inter-bank transfers and, for each one of them, one generates a record called a transfer-item containing two fields:

- the bank references of the source of the transfer

- the bank references of the destination of the transfer

In many countries, bank coordinates are composed only of numbers, and they can be coded directly on 64 bits; else a simple conversion program can do it.

This way, a block of twice 64 bits, or 16 bytes, contains the coordinates of both the bank customer and the other party, who most often will be exterior to the bank. All the blocks corresponding to the transactions of the bookkeeping day are concatenated one after the other, and constitute a block said "transfers of the day block".

In the header of the said block, one also finds a 128-bit block, hereafter called the date-separator, containing a tag such as described in the invention, the date of the bookkeeping day and the number of transfers of the said day.

Finally the said "transfers of the day block" is itself concatenated at the continuation of all the other blocks corresponding to the previous days to constitute what one will call the "inter-bank transfer table", and which preferably will be permanently in memory, in one computer, or more computers working in parallel, in the case where the computer's memory is insufficient, or if one wants to augment the number of requests to process at peak times.

Thus, when one searches for an account number external to the bank, and on a given period, thanks to the information contained in the date separators, one jumps over the set of the transfer-items of the days not in the period without scanning them. When one passes on a date-separator in the searched period, one scans the transfer-items looking for the external account numbers corresponding to the request, in order to constitute the list of account numbers internal to the bank concerned by the search. Then from this list it suffices to do searches in the disk customer files as explained above to obtain the whole or a part of the information related to these transfers.
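The date-separator mechanism may be sketched as follows. The split of the 128-bit separator into a 32-bit tag, a 32-bit date and a 64-bit count, the tag value, and the ordering of the two 64-bit fields in a transfer-item are assumptions of this sketch.

```python
import struct

SEPARATOR = struct.Struct(">IIQ")  # tag, date as YYYYMMDD, number of transfers (16 bytes)
ITEM = struct.Struct(">QQ")        # internal account, external account (16 bytes)
DATE_TAG = 0xDA7E0000              # hypothetical tag value

def day_block(date: int, transfers) -> bytes:
    """Build one "transfers of the day" block: separator followed by items."""
    header = SEPARATOR.pack(DATE_TAG, date, len(transfers))
    return header + b"".join(ITEM.pack(a, b) for a, b in transfers)

def find_internal_accounts(table: bytes, external: int, start: int, end: int) -> list:
    """Scan only the days inside [start, end]; jump over all other days."""
    found = []
    pos = 0
    while pos < len(table):
        tag, date, count = SEPARATOR.unpack_from(table, pos)
        assert tag == DATE_TAG
        pos += SEPARATOR.size
        if start <= date <= end:
            for k in range(count):
                internal, other = ITEM.unpack_from(table, pos + k * ITEM.size)
                if other == external:
                    found.append(internal)
        pos += count * ITEM.size  # the count lets us skip a day without reading it
    return found

table = (day_block(20051024, [(1, 900)])
         + day_block(20051026, [(2, 900), (3, 901)])
         + day_block(20051130, [(4, 900)]))
accounts = find_internal_accounts(table, external=900, start=20051025, end=20051125)
```

The resulting list of internal account numbers is then used to read only the corresponding customer files from disk.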

One may also use all or a part of these results to constitute one or more new search criteria. These criteria, possibly combined with other search results and/or other criteria provided by the user, are then used to start other searches in tables and/or to load tables in memory from files, according to a sequence described in a script, as described later on.

V.3 Complementarity example: table supplemented by a relational database

An inconvenience of searching by scanning is that it is poorly adapted when there are hierarchical relations between different information, as in the following example.

When the customer files contain e-mail messages exchanged between the bank and its customers, as described above, one finds in these customer files, on the one hand, customer events with the different e-mail fields, such as the sender name, recipient names, the subject, the contents, etc., and on the other hand items corresponding to attached documents of text type (file name, author, contents, etc.). Each e-mail message, as well as each attached document, is associated with a unique identifier, which helps to avoid duplicating an attached document that belongs to several messages. To link a message to its attachments, one may store in each e-mail item the list of its attachment identifiers, but if one deletes a message from the table, one may produce orphaned attachments, and the whole loses coherence.

In this respect, one may use a relational database that, for each message, maintains a list of its attachments (parent-child relation), but which can also provide, using a SQL-type request, the list of all the messages that contain a given attached document (child-parent relation).
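The link between messages and attachments may be sketched with an in-memory SQLite database. The table name, column names and identifiers are hypothetical, and SQLite merely stands in for whichever relational database is actually used.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE message_attachment (
        message_id    TEXT NOT NULL,
        attachment_id TEXT NOT NULL,
        PRIMARY KEY (message_id, attachment_id)
    );
""")
db.executemany("INSERT INTO message_attachment VALUES (?, ?)",
               [("msg-1", "doc-A"), ("msg-1", "doc-B"), ("msg-2", "doc-A")])

def attachments_of(message_id: str) -> list:
    """Attachments belonging to one message."""
    rows = db.execute("SELECT attachment_id FROM message_attachment "
                      "WHERE message_id = ? ORDER BY attachment_id", (message_id,))
    return [r[0] for r in rows]

def messages_with(attachment_id: str) -> list:
    """Messages that contain a given attached document."""
    rows = db.execute("SELECT message_id FROM message_attachment "
                      "WHERE attachment_id = ? ORDER BY message_id", (attachment_id,))
    return [r[0] for r in rows]

def orphaned_after_delete(message_id: str) -> list:
    """Remove a message's links and report attachments no message references any more."""
    candidates = attachments_of(message_id)
    db.execute("DELETE FROM message_attachment WHERE message_id = ?", (message_id,))
    return [a for a in candidates if not messages_with(a)]

children = attachments_of("msg-1")
parents = messages_with("doc-A")
orphans = orphaned_after_delete("msg-1")
```

The orphan list tells the scanning side which attachment items may safely be removed from the concatenated table, which keeps the whole coherent.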

In the same way, for a customer having several checking accounts or savings accounts, this device allows passing from one account to another and consolidate all customer information. In a more general way using a relational database allows associating different elements contained in tables according to the invention.

V.4 Other complementarity example: table supplemented by a relational database

Finally, for customer analysis needs, it may be interesting to search for customers of branch B who have element E in their customer event file such as described above (for instance a check transaction of more than a given amount) and at the same time a proxy authorization for their spouse, this latter information being stored in a classical relational database (such as Oracle™ or DB2™). In a similar way, after the search and at the moment of restitution for displaying the answers, it may be useful to display the elements of the customer accounts contained in the relational database (as if one had done a SQL request on this database) as well as some elements from the customer events file. Consequently, the method according to the invention is composed of two phases: a first phase consisting in doing the search in the tables according to the invention, and then a second phase during which a relational database is used to refine the search and/or display information not contained in the tables according to the invention. The classical technique, consisting in first doing the search in the relational database and then, from the obtained results, doing a search in the customer event file (or the reverse), can be extremely slow.

This is why it may be judicious to use the high-speed table sequential scanning described above to accomplish this operation. One proceeds as follows. In the table containing the customer events, one adds the elements extracted from the relational database, such as the account balance, the sales turnover for the past months, and all the data that may be included in a multi-criteria search. The common point between all items containing the information from the relational database or the customer event file is the customer number. Depending on the case, for a given customer having a given customer number, it may be interesting to have a single item containing all the information on that customer, which one obtains easily by reorganizing the said table, for instance once a day, at night.

One may also have several tables containing information of different nature, but using the same identifier to ensure one can bring together the different elements of a customer, which may sometimes imply several scans of the whole or a part of a table.

Such an organization allows managing the search of a customer number according to a series of criteria related to information contained in the relational database as well as in a customer event table, by the means of a table described above, and using the relational database to display or process the information of a given customer.

VI Query language

One of the main disadvantages of relational databases is that their structure depends on the format of the information that is stored, on the relations that connect it, and largely on the kind of requests that one wants to execute with very short delay. In particular, optimizing for fast access to some information is detrimental to other accesses, whereas with the method according to the invention, access times in general depend essentially on the size of the tables to scan in memory.

In addition, the traditional language used to consult databases, namely the SQL language, is very complex, whereas the vast majority of requests may be expressed in a much simpler language, with a syntax like that of the C language, that allows describing a scripted sequence of operations, as has been said above.

For instance, a request corresponding to several steps may be written:

    Primary_Index_Disk_Filename_RIB = LoadFromDisk("....");

    if ( (DATA_BASE = bank_operation)
      && (Primary_Index_Disk_Filename_RIB = 00313-00734-00006457858)
      && (article_code = transfer)
      && (operation_date > 2005-10-25)
      && (operation_date < 2005-11-25)
      && (value_date > 2005-10-25) )
    {
        Action_1(...);
        Action_2(...);
    }

    if (RESULT(Action_1) = true)
    {
        Action_3(...);
    }

    Etc.

Action_i may be an action such as:

- read a table from a disk file, for instance LoadFromDisk(), according to the user's access rights

- extract one element from the item found in a table

- execute a SQL request on a relational database by giving the request text as parameter: Sql("text of the SQL request"), according to the user's access rights.

Thus, it allows specifying the sequence of operations used in the above inter-bank transfers example.

DATA_BASE: keyword indicating the database that contains the bank transactions of the bank's customers.

RESULT: operator to get the status code resulting from an action.

The language allows describing the database according to the invention, either from a specific simple language or from SQL-type commands (CREATE TABLE, CREATE INDEX, etc.).
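To make the intent of such a script more concrete, the same request may be transcribed into Python as a filter over items loaded in memory. This is only an interpretation: the item layout, the sample data, and the roles given to the actions (select, then extract) are assumptions, and `load_from_disk` is a hypothetical stand-in for LoadFromDisk().

```python
from datetime import date

def load_from_disk(path: str) -> list:
    """Hypothetical stand-in for LoadFromDisk(): items of one customer file."""
    return [
        {"article_code": "transfer", "operation_date": date(2005, 11, 2),
         "value_date": date(2005, 11, 3), "amount": 25000},
        {"article_code": "withdrawal", "operation_date": date(2005, 11, 5),
         "value_date": date(2005, 11, 5), "amount": 4000},
        {"article_code": "transfer", "operation_date": date(2005, 9, 30),
         "value_date": date(2005, 10, 1), "amount": 9900},
    ]

def matches(item: dict) -> bool:
    """The condition of the script's first 'if', applied to one item."""
    return (item["article_code"] == "transfer"
            and date(2005, 10, 25) < item["operation_date"] < date(2005, 11, 25)
            and item["value_date"] > date(2005, 10, 25))

def run_request(path: str) -> list:
    table = load_from_disk(path)
    selected = [item for item in table if matches(item)]  # first action: select
    if selected:                                          # status of the first action
        return [item["amount"] for item in selected]      # later action: extract
    return []

amounts = run_request("customer-file")
```

The whole request reduces to one pass over the table with a boolean test per item, which is why a C-like scripting syntax is sufficient to express it.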

It also allows exporting the contents of a traditional database to the format of the database according to the invention, using an export mode of the same kind as the CSV format, but according to the desired format.

VI.1 Complementarity between query language and tag imbedding

If one refers to the above bank transaction example, the simplest approach consists in putting at the head of each item of bank transaction type:

- the account number, on 64 bits (it is the "primary index" of the database in the SQL convention)

- possibly another field, such as an item identifier

In most cases, selections on bank transactions will be done using these two main criteria.

Then, one will find the different item fields, for instance the transaction date, the amount, the transaction details, etc.

Advantageously, one builds a system for referring to the different fields partly based on tags, as explained above. For instance, one supposes that tags reserved for this purpose begin at the 0x80 value, indicating that the next field is the transaction date. One builds a table with 3 columns:

- The 1st column contains the tag value (that is, 0x80).

- The 2nd column corresponds to the type, that is, UTC DATE, a date on 32 bits in the UTC convention; this type will be used in operations on dates.

- The 3rd column contains the text of the field name used in the said language, that is, "transaction date".

If the following field is the value date, the contents of the 3 columns will be: 0x81, UTC DATE, and "value date". Etc.

In this way, whatever the order of the fields in the item, with the exception of the first two obviously, one will know how to:

- starting from an export file of CSV type or other, create an elementary item in the database according to the invention, by converting the source information according to the desired format, for instance to a 32-bit UTC date, and concatenating the table with the 0x80 tag followed by the 32 bits of the said date (same operation for the 2nd date field with the 0x81 tag, etc.)

- relate the field indicated in the script language to the characters read from the table in memory while scanning, and if necessary extract the contents for display
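The three-column table and the two operations it enables may be sketched as follows. The tag values and field names come from the text above; interpreting the 32-bit UTC date as seconds since the Unix epoch, the big-endian byte order, and the function names are assumptions. The 64-bit account number and item identifier at the head of the item are left out for brevity.

```python
import struct
from datetime import datetime, timezone

# Tag value -> (type, field name used in the script language).
FIELDS = {
    0x80: ("UTC_DATE", "transaction date"),
    0x81: ("UTC_DATE", "value date"),
}
NAME_TO_TAG = {name: tag for tag, (_, name) in FIELDS.items()}

def encode_field(name: str, when: datetime) -> bytes:
    """Tag byte followed by the date as 32-bit seconds since the epoch (assumed)."""
    return bytes([NAME_TO_TAG[name]]) + struct.pack(">I", int(when.timestamp()))

def decode_item(item: bytes) -> dict:
    """Relate each tag met while scanning to its field name and typed value."""
    out = {}
    pos = 0
    while pos < len(item):
        _ftype, name = FIELDS[item[pos]]
        (seconds,) = struct.unpack_from(">I", item, pos + 1)
        out[name] = datetime.fromtimestamp(seconds, tz=timezone.utc)
        pos += 5  # one tag byte plus four date bytes
    return out

d1 = datetime(2005, 10, 26, tzinfo=timezone.utc)
d2 = datetime(2005, 10, 28, tzinfo=timezone.utc)
# The fields are deliberately written in the "wrong" order.
item = encode_field("value date", d2) + encode_field("transaction date", d1)
decoded = decode_item(item)
```

Because each field carries its own tag, the decoder recovers both dates correctly regardless of the order in which they were written.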

The invention is furthermore directed to a script, a computer program product and a computerized system comprising code means, suitable for implementing the steps of any of the above-described methods.

Other embodiments than described above can be contemplated by the skilled person, in the scope of the appended claims.

Claims

1. A method for updating a database from a server to at least one client comprising:

- the client receiving database update data and database integrity data;

- inserting said update data at a location of a database of the client;

- verification of the integrity of said database using said integrity data, wherein the database modified at the insertion step is a concatenated table comprising information data and preferably display data.

2. The method according to claim 1 wherein during said insertion step, the update data are inserted at the end of the client database.

3. The method according to claim 1 or 2, wherein said integrity data are a checksum.

4. The method according to claim 1, 2 or 3, further comprising a step wherein: if database integrity is confirmed, saving the modified database.

5. The method according to any one of claims 1 - 4, further comprising a step in which: if integrity of the modified database is not confirmed, sending a request from the client to the server requesting the update data for the database to be re-sent.

6. The method according to claim 5, wherein the client is adapted to communicate with said server via two separate communication networks, said update data being received via one of said networks, the request for re-sending employing the other of said networks.

7. The method according to any one of claims 1 - 6, wherein the database modified at the insertion step is a concatenated table comprising at least partially encrypted information data and display data, stored on a non-volatile memory, the method further comprising:

- accessing the modified database;

- reading a portion of the modified database; and

- sequentially scanning said read portion of the database.

8. The method according to claim 7, wherein at the step of reading, said read portion amounts to the whole modified database.

9. The method according to claim 7 or 8, wherein :

- the step of reading said portion further comprises decrypting on-the-fly said read portion, all decrypted data resulting from the decrypting step residing on a volatile memory only; and

- the step of sequentially scanning comprises sequentially scanning the decrypted data only.

10. The method according to claim 7 or 8, wherein the read portion is sequentially scanned in an encrypted state.

11. The method according to claim 9 or 10, further comprising, after the step of sequentially scanning, a step of:

- deleting the read portion.

12. The method according to claim 9, 10 or 11, further comprising, before decrypting:

- verification of the authorization of decryption.

13. The method according to claim 12, wherein the verification step uses a decryption key or a hardware device.

14. The method according to any one of claims 1 - 13, further comprising a step of querying the database modified at the insertion step and retrieving data from that database.

15. The method according to claim 14, wherein querying said modified database comprises sequentially scanning said database or a read portion thereof.

16. The method according to claim 14 or 15, further comprising a step in which: the client displays data retrieved at the retrieving step.

17. The method according to claim 16, wherein said concatenated table comprises:

- information in the form of elementary records; and

- tags structuring said information, wherein said display step is dependent on said tags.

18. Method according to claim 17, in which said query step in said modified database comprises sequentially scanning said database or a read portion thereof without taking account of said tags.

19. The method according to any one of claims 14 to 18, wherein said concatenated table comprises:

- information in the form of elementary records in which:

- a first record contains a word,

- a second record comprises phonetic interpretation data in a first language,

- a third record comprises phonetic interpretation data in a second language; and

- tags structuring said information.

20. The method according to any one of claims 14 to 19, wherein said concatenated table comprises:

- information in the form of records, said records comprising one or more fields; and

- tags structuring said information within said records, wherein at least some of the tags describe both a type of a field and its significance within its parent record.

21. The method according to any one of claims 14 to 20, wherein information data and display data are the same.

22. The method according to any one of claims 14 to 21, further comprising, before the step of querying the modified database, a step of:

- selecting one or more files according to one or more given criteria, and wherein the step of querying the database is restricted to querying the selected one or more files only.

23. The method according to claim 22, wherein the step of selecting one or more files comprises several iterations.

24. The method according to any one of claims 14 to 22, further comprising a step of querying a relational database.

25. The method according to claim 24, wherein the step of querying the relational database occurs while querying the modified database.

26. The method according to claim 24, wherein the step of querying the relational database occurs while or after returning said retrieved data.

27. A computerized system comprising code means for implementing the steps of the method according to any one of claims 1 to 26.

28. A computer program product suitable for implementing the steps of the method according to any one of claims 1 to 26.

29. A script, allowing for implementation of the method according to any one of claims 1 to 26, upon interpretation by a computer program.