US20150294002A1

US20150294002A1 - Data accelerator for managing data transmission

Info

Publication number: US20150294002A1
Application number: US14/728,817
Authority: US
Inventors: Sean P. Corbett; Edward Philip Edwin Elliott; Matthew P. Clothier
Original assignee: Data Accelerator Ltd
Current assignee: Data Accelerator Ltd
Priority date: 2010-02-22
Filing date: 2015-06-02
Publication date: 2015-10-15
Also published as: GB201004449D0; US9396228B2; WO2011101691A1; GB2478016A; US20110208808A1; GB201216375D0; US20140025648A1; GB201011179D0; GB201103043D0; US8543642B2; US20130325927A1; GB2478189A; US20170046381A1; GB2491751A

Abstract

Systems and methods for managing the flow of data between a client device and a data source system including a data accelerator implemented, at least in part, between the client device and the data source. The data accelerator can function to intercept queries and determine whether responses stored locally with respect to the client device can satisfy, at least in part, the request for data of the client device. If a locally stored response can satisfy the data request at least in part, the data accelerator is configured to retrieve the response from the local storage and send it to the client device. The data accelerator is also configured to modify the query based on whether response are locally stored that can satisfy the request. Specifically, the data accelerator can modify the query to only request the remaining data that is not included as part of the locally stored responses.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 13/880,707, filed Aug. 16, 2013, entitled “Method of Optimizing the Interaction Between a Software Application and a Database Server or Other Kind of Remote Data Source,” which is a National Phase application under 35 U.S.C. §371 of PCT Application No. PCT/GB2011/050342, filed Feb. 22, 2011, entitled “Method of Optimizing the Interaction Between a Software Application and a Database Server or Other Kind of Remote Data Source,” which claims priority to Great Britain Patent Application Nos. 1002961.9, filed Feb. 22, 2010, 1004449.3, filed Mar. 17, 2010, and 1011179.7, Jul. 2, 2010. Further, this application claims priority to U.S. Provisional Application No. 62/006,770, filed Jun. 2, 2014, entitled “Data Accelerator for Managing Data Transmission,” all of which are incorporated herein by reference.

BACKGROUND

Computer architectures have been developed called client-server where the client computer runs an application and the server computer typically runs server software such as database server software. The client applications connect to the database and send requests; they then use the responses to drive the client application.
These client and server systems have had to reside on local network connections or else they perform very slowly or have to be written specifically to handle low network speeds and high amounts of latency. In the last few years there has been a shift in the focus of client-server systems to web based systems where the client connects to a server component which then connects to the database. This means the application can work over slower network links but has a number of disadvantages, the main one being that the application is limited to how much data it can send to the client, so web applications are generally less sophisticated than the original client-server systems. This means that developers have two options, the first is to create client-server systems which gives them the richness of a full application but requires a local network connection to function properly or to write a web based application which means they will work over a remote connection but functionality is poor.
Over the last 3 years especially, new developments have seen a trend whereby software vendors are offering their traditional on-premises software to their customers as a hosted service. This is either being achieved using Server Based Computing or by re-creating a new version of the existing application using Web 2.0 technologies. This is a natural progression as they have moved from core competencies of creating the software, to managing the delivery of the software on behalf of their clients.
Other limitations of the relevant art will become apparent to those of skill in the art upon a reading of the specification and a study of the drawings.

SUMMARY

The following implementations and aspects thereof are described and illustrated in conjunction with systems, tools, and methods that are meant to be exemplary and illustrative, not necessarily limiting in scope. In various implementations one or more of the above-described problems have been addressed, while other implementations are directed to other improvements.
Various implementations include systems and methods for managing the flow of data between a client device and a data source system. Specifically, a data accelerator is implemented between the client device and the data source system. The data accelerator can function to intercept queries requesting data sent from the client device. The data accelerator can determine whether responses stored locally with respect to the client device can satisfy, at least in part, the request for data of the client device. If a locally stored response can satisfy the data request at least in part, the data accelerator is configured to retrieve the response from the local storage and send it to the client device. The data accelerator is also configured to modify the query based on whether response are locally stored that can satisfy the request, at least in part. Specifically, the data accelerator can modify the query to only request the remaining data that is not included as part of the locally stored responses. The data accelerator can then forward the modified query to the data source system and receive the response form the data source system based on the modified request. The response from the data source system can be sent to the client device to fully satisfy the query.
These and other advantages will become apparent to those skilled in the relevant art upon a reading of the following descriptions and a study of the several examples of the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a diagram of an example of a system for managing data traffic.

FIG. 2 depicts a diagram of another example of a system for managing data traffic.

FIG. 3 depicts a diagram of an example of a system for managing data traffic using multiple data accelerators.

FIG. 4 depicts a diagram of an example of a system for transferring an application to a client.

FIG. 5 depicts a diagram of a flowchart of an example of a method for managing the transfer of data.

FIG. 6 depicts a diagram of a flowchart of an example of another method for managing the transfer of data.

FIG. 7 depicts a diagram of a flowchart of an example of another method for managing the transfer of data in receiving multiple queries.

DETAILED DESCRIPTION

FIG. 1 depicts a diagram 100 of an example of a system for managing data traffic. The diagram 100 includes a computer-readable medium 102, a local datastore 104, a data accelerator 106, a client device 108, and a data source system 110.
The local datastore 104, the data accelerator 106, the client device 108, and the data source system 110 are coupled to each other through the computer-readable medium 102. As used in this paper, a “computer-readable medium” is intended to include all mediums that are statutory (e.g., in the United States, under 35 U.S.C. 101) and to specifically exclude all mediums that are non-statutory in nature to the extent the exclusion is necessary for a claim that includes the computer-readable medium to be valid. Known statutory computer-readable mediums include hardware (e.g., registers, random access memory (RAM), non-volatile (NV) storage, to name a few), but may or may not be limited to hardware.
The computer-readable medium 102 is intended to represent a variety of potentially applicable technologies. For example, the computer-readable medium 102 can be used to form a network or part of a network. Where two components are co-located on a device, the computer-readable medium 102 can include a bus or other data conduit or plane. Where a first component is co-located on one device and a second component is located on a different device, the computer-readable medium 102 can include a wireless or wired back-end network or LAN. The computer-readable medium 102 can also encompass a relevant portion of a WAN or other network, if applicable.
The computer-readable medium 102, the data accelerator 106, the client device 108, the data source system 110, and other applicable systems or devices described in this paper can be implemented as a computer system or parts of a computer system or a plurality of computer systems. A computer system, as used in this paper, is intended to be construed broadly. In general, a computer system will include a processor, memory, non-volatile storage, and an interface. A typical computer system will usually include at least a processor, memory, and a device (e.g., a bus) coupling the memory to the processor. The processor can be, for example, a general-purpose central processing unit (CPU), such as a microprocessor, or a special-purpose processor, such as a microcontroller.
The memory can include, by way of example but not limitation, random access memory (RAM), such as dynamic RAM (DRAM) and static RAM (SRAM). The memory can be local, remote, or distributed. The bus can also couple the processor to non-volatile storage. The non-volatile storage is often a magnetic floppy or hard disk, a magnetic-optical disk, an optical disk, a read-only memory (ROM), such as a CD-ROM, EPROM, or EEPROM, a magnetic or optical card, or another form of storage for large amounts of data. Some of this data is often written, by a direct memory access process, into memory during execution of software on the computer system. The non-volatile storage can be local, remote, or distributed. The non-volatile storage is optional because systems can be created with all applicable data available in memory.
Software is typically stored in the non-volatile storage. Indeed, for large programs, it may not even be possible to store the entire program in memory. Nevertheless, it should be understood that for software to run, if necessary, it is moved to a computer-readable location appropriate for processing, and for illustrative purposes, that location is referred to as the memory in this paper. Even when software is moved to the memory for execution, the processor will typically make use of hardware registers to store values associated with the software, and local cache that, ideally, serves to speed up execution. As used herein, a software program is assumed to be stored at an applicable known or convenient location (from non-volatile storage to hardware registers) when the software program is referred to as “implemented in a computer-readable storage medium.” A processor is considered to be “configured to execute a program” when at least one value associated with the program is stored in a register readable by the processor.
In one example of operation, a computer system can be controlled by operating system software, which is a software program that includes a file management system, such as a disk operating system. One example of operating system software with associated file management system software is the family of operating systems known as Windows® from Microsoft Corporation of Redmond, Wash., and their associated file management systems. Another example of operating system software with its associated file management system software is the Linux operating system and its associated file management system. The file management system is typically stored in the non-volatile storage and causes the processor to execute the various acts required by the operating system to input and output data and to store data in the memory, including storing files on the non-volatile storage.
The bus can also couple the processor to the interface. The interface can include one or more input and/or output (I/O) devices. The I/O devices can include, by way of example but not limitation, a keyboard, a mouse or other pointing device, disk drives, printers, a scanner, and other I/O devices, including a display device. The display device can include, by way of example but not limitation, a cathode ray tube (CRT), liquid crystal display (LCD), or some other applicable known or convenient display device. The interface can include one or more of a modem or network interface. It will be appreciated that a modem or network interface can be considered to be part of the computer system. The interface can include an analog modem, isdn modem, cable modem, token ring interface, satellite transmission interface (e.g. “direct PC”), or other interfaces for coupling a computer system to other computer systems. Interfaces enable computer systems and other devices to be coupled together in a network.
The computer systems can be compatible with or implemented as part of or through a cloud-based computing system. As used in this paper, a cloud-based computing system is a system that provides virtualized computing resources, software and/or information to client devices. The computing resources, software and/or information can be virtualized by maintaining centralized services and resources that the edge devices can access over a communication interface, such as a network. “Cloud” may be a marketing term and for the purposes of this paper can include applicable networks described herein. The cloud-based computing system can involve a subscription for services or use a utility pricing model. Users can access the protocols of the cloud-based computing system through a web browser or other container application located on their client device.
As used in this paper, an engine includes one or more processors or a portion thereof. A portion of one or more processors can include some portion of hardware less than all of the hardware comprising any given one or more processors, such as a subset of registers, the portion of the processor dedicated to one or more threads of a multi-threaded processor, a time slice during which the processor is wholly or partially dedicated to carrying out part of the engine's functionality, or the like. As such, a first engine and a second engine can have one or more dedicated processors or a first engine and a second engine can share one or more processors with one another or other engines. Depending upon implementation-specific or other considerations, an engine can be centralized or its functionality distributed. An engine can include hardware, firmware, or software embodied in a computer-readable medium for execution by the processor. The processor transforms data into new data using implemented data structures and methods, such as is described with reference to the FIGS. in this paper.
The engines described in this paper, or the engines through which the systems and devices described in this paper can be implemented as cloud-based engines. As used in this paper, a cloud-based engine is an engine that can run applications and/or functionalities using a cloud-based computing system. All or portions of the applications and/or functionalities can be distributed across multiple computing devices, and need not be restricted to only one computing device. In some embodiments, the cloud-based engines can execute functionalities and/or modules that end users access through a web browser or container application without having the functionalities and/or modules installed locally on the end-users' computing devices.
As used in this paper, datastores are intended to include repositories having an applicable organization of data, including tables, comma-separated values (CSV) files, traditional databases (e.g., SQL), or other applicable known or convenient organizational formats. Datastores can be implemented, for example, as software embodied in a physical computer-readable medium on a general- or specific-purpose machine, in firmware, in hardware, in a combination thereof, or in an applicable known or convenient device or system. Datastore-associated components, such as database interfaces, can be considered “part of” a datastore, part of some other system component, or a combination thereof, though the physical location and other characteristics of datastore-associated components is not critical for an understanding of the techniques described in this paper.
Datastores can include data structures. As used in this paper, a data structure is associated with a particular way of storing and organizing data in a computer so that it can be used efficiently within a given context. Data structures are generally based on the ability of a computer to fetch and store data at a place in its memory, specified by an address, a bit string that can be itself stored in memory and manipulated by the program. Thus, some data structures are based on computing the addresses of data items with arithmetic operations; while other data structures are based on storing addresses of data items within the structure itself. Many data structures use both principles, sometimes combined in non-trivial ways. The implementation of a data structure usually entails writing a set of procedures that create and manipulate instances of that structure. The datastores, described in this paper, can be cloud-based datastores. A cloud-based datastore is a datastore that is compatible with cloud-based computing systems and engines.
The client device 108 is intended to represent a device through which a client can receive data. The data can be received from the data accelerator 106 or from the local datastore 104. In one example the client device 108 is a thin client device or an ultra-thin client device. The client device 108 can include a wireless network interface, through which the client device 108 can receive data from the data accelerator or the local datastore 104.
In the example of FIG. 1, the local datastore 104 is local in that it can be integrated as part of the client device 108 or as part of a local area or smaller network (e.g., LAN, PAN, or the like). For example, the local datastore 104 can be implemented as a cache in the client device 108. As another example, the local datastore 104 can be implemented separately from the client device 108 and coupled to the client device 108 through a LAN connection. The data stored in the local datastore 104 can be received from or sent to the data accelerator 106. In one example, the data stored in the local datastore 104 is received from the data accelerator 106 in response to a query generated by the client device 108.
The data source system 110 can function to store and/or generate data that is sent to the client device 108. The data can include values or variables that are stored in a database. The values or variables can be used by software, such as an application or operating system, in the execution of the software either on the client device 108 or a cloud based system associated with the client device 108. Alternatively, the values or variables can be the output of virtualized applications that are executed on the data source system 110. In one example, a user associated with the client device 108 can interact with the values and variables as if the virtualized applications were being executed on the client device 108 itself. In another example, the data can include executable computer code. The executable computer code can be software for applications or operating systems. The code can be executed on the client device 108 after being sent to the client device 108.
The data source system 110 can include one or more datastores and data servers. For example, the data deployment system 110 can include multiple data servers implemented in the cloud as cloud-based servers. The server can be a server of an applicable type based, for example, on the type of data that is provided by the server. For example, the server can be a database server, a file server, a mail server, a print server, a web server, a gaming server, an application server, or an operating system server. The data source system 110 can be coupled to the client device 108 through a WAN. In a specific implementation, the data source system 110 is coupled to the client device 108 through the Internet.
In a specific implementation, the data accelerator 106 is implemented as an intermediary between the client device 108 and the local datastore 104 and the data source system 110. For example, the data accelerator 106 can be implemented, in whole or in part, on a server that is part of the data source system 110, or a server that is on the same local network as a server of the data source system 110. As another example, the data accelerator 106 can be implemented, in whole or in part, locally with respect to the client device 108. In one example, the data accelerator 106 is separate from the client device but is implemented, in whole or in part, within the same LAN as the client device 108 and the local datastore 104. In another example, the data accelerator 106 is implemented, in whole or in part, on the client device 108. In being implemented, in whole or in part, on the client device 108 the data accelerator 106 can be application or a client install, or a combination of both on a client device 108. The portion of the data acceleration implemented on the client device 108 can be installable in user mode, meaning without the requirement for administrator rights on the client device.
In a specific implementation, the data accelerator 106 functions to intercept queries for data sent from the client device 108 to the data source system 110 and process the intercepted queries based on rules. The queries can include HTTP requests or a request to open a file over a network implemented in the computer-readable medium 102. In one example, as part of processing the intercepted queries, the data accelerator 106 decodes the intercepted queries. Further, in the one example, the data accelerator 106 can decode the intercepted query to determine the content and/or context of the query and the requested data as the client device 108 and/or data source system 110 understand the query. In a specific example, ‘content’ is the actual application or data that the client has asked to be run or the query that the client has asked for. In another specific example, ‘context’ is whether the query is part of a series of queries and the relationships between the queries. Further in the another specific example, ‘context’ can also be associated with an understanding environmental details such as which server the query is going to, which user and workstation sent the query and a number of other details such as specific network link speeds. The context can be used to determine how effective specific rules that are applied by the data accelerator 106 in receiving and processing queries for data from the client device 108 and responses to queries for data from the data source system 110. For example, the effectiveness can be used to determine whether to change the parameters of the rules or even enable or disable the rules applied by the data accelerator in processing and receiving queries and responses to queries.
In a specific implementation, the data accelerator 106 can function to determine whether a portion of the requested data of the query is located in the local datastore 104. In one example, if the data accelerator 106 determines that a portion of the requested data is stored locally in the local datastore 104, then the data accelerator 106 retrieves the portion of the requested data stored locally and returns it to the client device 108. In another example, the data accelerator 106 can modify the query and generate a modified query that is sent to the data source system 110. The modified query can include a request for the portions of the requested data not stored in the local datastore 104.
In a specific implementation, the data source system 110 can retrieve or generate requested data from the modified query. In one example, if the accelerator requests that an application that is running on the data source system 110 be executed in a certain way to generate the requested data, then the data source system 110 can execute the application in the specified way in order to generate the requested data. Further in the one example, the data source system 110 can send a response to the data accelerator 106 that includes the generated and/or retrieved requested data. In another example, the data accelerator 104 can receive the response from the data source system 110 and process the response according to rules. In one example, processing by the data accelerator 106 includes modifying the response and generating a modified response according to the rules. In yet another example, the data accelerator 106 can send the response or modified response received from the data source system 110 to either or both the local data store 104 or the client device 108.
In a specific implementation, in processing queries received from the client device 108 and responses received from the data source system 110, the data accelerator can analyze, change, or ignore the queries and responses. In one example, the data accelerator 106 can process the queries and responses in order to decrease network latency through the various networks that are included as part of the computer-readable medium 102. In another example, the data accelerator 106 can process the queries and responses based on a number of factors including but not limited to, query compilation, data source system 110 load, data access (reading and writing of data to disks) of the data source system 110, and network response time of the networks included as part of the computer-readable medium.
FIG. 2 depicts a diagram 200 of another example of a system for managing data traffic. The system in FIG. 2 includes a computer-readable medium 202, a data accelerator 204, a local datastore 224, a client device 226, and a data source system 228. The data accelerator 204, the local data 224, the client device 226, and the data source system 228 are coupled together through the computer-readable medium 202. The data accelerator 204, the local datastore 224, the client device 226, and the data source system 228 can function according to or be implemented as the data accelerators, local datastores, client devices, and data source systems described in this paper.
In the example of FIG. 2, the data accelerator 204 includes an interceptor engine 206, a protocol specific parser engine 208, a query decomposer engine 210, a rule effectiveness and modification engine 214, a pre-fetch engine 216, a modification engine 218, a compression and encryption engine 220, and a re-connection and redirection engine 222.
In a specific implementation, the interceptor engine 206 is configured to intercept queries sent from the client device 226 and responses sent from the data source system 228. For example, in intercepting queries send form the client device 226, the interceptor engine 206 can function to intercept TCP/IP socket data from the client device. As another example, the interceptor engine 206 can function to pass the intercepted queries to the protocol specific parser engine 208.
In a specific implementation, the protocol specific parser engine 208 functions to take the intercepted queries and responses that the interceptor engine 206 collects and split or merge the queries and responses into a message that is specific to the type of protocol being used. For example, in the tabular data stream (hereinafter referred to as “TDS”) protocol the size of the message is at a byte offset 2 from the beginning of the stream. In one example, based on this protocol, the protocol specific parser engine 208 can split the stream into messages based on the size of the messages specified in the TDS protocol. If the network traffic interceptor does not send enough data for a complete message to be formed according to the specific protocol, then the protocol specific parser engine 208 can buffer the start of the message until enough data is received to form a complete message according to the specified protocol.
In a specific implementation, the protocol specific parser engine 208 functions to decode the messages to determine the properties of the messages. For example, if the query is an SQL request, then the protocol specific parser engine 208 can decode the message so the text of the SQL request can be read.
In a specific implementation, the query decomposer engine 210 functions to add additional properties to the query which although they are not included in the query, are known because of the specific connection or previous queries to gain an understanding of the context of the query. For example in Microsoft® SQL Servers each connection has a service profile identifier (hereinafter referred to as “SPID”). The SPID can be included in the request if the data accelerator determines that the query is being sent to a Microsoft® SQL Server in the data source system 228. By monitoring requests and responses the SPID can be saved and appended to every query that is intercepted by the interceptor engine 206 and is destined for or believed to be destined for a Microsoft® SQL Server. The SPID can then be used by other components to retrieve metadata about the queries directly from the data source system 228 using the SPID.
In a specific implementation, the pre-fetch engine 216 functions to store the queries sent by the client device 226 and the responses to queries sent by the data source system 228 in the local datastore 224. The local datastore 224 can include cache memory. The local datastore 224 can be on the client device 226 or on a peer that is coupled to the client device 226 through a LAN connection. The queries and responses to queries can be previous queries and responses to queries that were generated by the client device 226 or other peers in the LAN of the client device 226. As a result, in one example, the queries and responses to queries are pre-cached, or locally stored before the query is even generated by the client device 226.
In a specific implementation, the pre-fetch engine 216 functions to determine whether an intercepted query has been made before and whether requested data sufficient to satisfy the intercepted query is stored in local datastore 224. In one example, the pre-fetch engine determines whether responses from the data source system 228 are available in the local datastore 224 to satisfy a portion of the intercepted query. In the one example, if requested data sufficient to satisfy a portion of the intercepted query are stored in the local datastore 224, the pre-fetch engine 216 can retrieve the data from the local datastore 224 and send the data to the client device 226 in order to satisfy the query. As a result, the query does not have to be sent upstream to the data source system 228 through a WAN.
In a specific implementation, the pre-fetch engine 216 stores the original query so when it determines that a query may be used, it can pass the original binary data stream to the interceptor engine 206 or an engine within the data accelerator. As a result, the intercepted query can be handled as a normal query from the client device 226 and other rules such as a compression rule can also be applied without the need of applying custom logic so that the pre-fetch engine 216 can access the local datastore 224.
In a specific implementation, the pre-fetch engine 216, in storing and retrieving queries, responses, and data from the local datastore 224, functions according to a simple caching rule. The simple caching rule can includes three parts, the actual caching of data called the “cache data rule”, the serving of cached data called the “cache serve rule” and the diagnostics component “cache diagnostics and management”.
In a specific implementation, the pre-fetch engine 216 functions, in accordance with the cache data rule, to actually store the response from the data source system 228 after the response has been sent back to the client device 228. There are a number of different types of cache that can be used, including but not limited to an in-process and out-of-process or separate machine cache and permanent storage such as a hard disk. In one example, the cache can serve as a hash table lookup with the key being either the SQL command from the request or a hash of that SQL command. In the one example, depending on the type of cache, it will either store a pointer to the first response packet or it will store the actual packets as an array.
In a specific implementation, the simple caching rule includes, before a query or response can be added to the cache, determining whether a query or response is actually cacheable. Whether data is cacheable depends on a number of factors. In one example, certain types of SQL commands, such as an UPDATE or INSERT request, are inherently not cacheable. In another example, certain commands are not cacheable because they need to be looked at in the context that they are being used, for example a data source system will have a command to retrieve the current date and time. In yet another example, if a query is sent to get all records in the future, depending on when the query was next run and if a record was added or deleted, it may or may not have a different set of results. In an example of the specific implementation, if it is determined that a query cannot be cached, it can still be stored in a memory or datastore so further queries do not have to be verified.
In a specific implementation, once a query or response is stored in a cache, the cache serve rule can be applied to queries as they arrive but before they are sent onto the data source system 228. Further in the specific implementation, if the query is in the cache, it is verified to ensure that it is still valid. In the one example, determining whether a query is valid can include determining whether rows have been added to, deleted from or modified in the cached response. In one example, if rows have been added, then the query is not valid. In the specific implementation, the cache serve rule can also include verifying whether the user of the client device 226 or the client device 226 has permission to access the response based on one or more factors, including whether the proper security clearance exists, or whether a valid license exists.
In a specific implementation, the pre-fetch engine 216 functions to verify how well the cache worked in processing and satisfying a query with a response. In one example, the pre-fetch engine 216 can compare the total time it took the data source system 228 to return a response with how long it took to verify that the query is still valid, check security and return a cached response from the local datastore 224 to the client device 226. In the one example, if the time it takes to return a response from the data source system 228 is less than the time it takes to verify that the query is still valid, check security and return a cached response from the local datastore 224, then the pre-fetch engine 216 can determine that the cache did not work well in processing and satisfying a query, and thereby act accordingly to make the cache efficient in satisfying requests.
In a specific implementation, the pre-fetch engine 216 functions to manage the cache. In one example, the pre-fetch engine 216 can manage the cache size by expiring unused or not often used queries as well as expiring items which are no longer valid. Further in the specific implementation, if the pre-fetch engine 216 determines that caching the request is not adding a benefit, in that the cache is not performing well, it can still monitor later requests to see if at some time it does become worth caching. In another example, in managing the cache to see if a query is still valid. For example, the rule can include the pre-fetch engine 216 keeping a record of the items that the query used within the data source system 228 and monitors the record for changes to the queries. In another example, the pre-fetch engine 216 can determine whether the changes affect the response. In the another example, if the pre-fetch engine 216 determines that the changes affect the response, then the rule can include the pre-fetch engine 216 evicting the item from the cache or re-running the query in order to cache the latest response available. In yet another example, the pre-fetch engine 216 can evict items from the cache that have not been used for a specific amount of time. Further in the another example, every time an item is served from the cache, a counter can be incremented and the time noted in order to track the last time an item was used.
In a specific implementation, the pre-fetch engine 216, in storing and retrieving queries, responses, and data from the local datastore 224, can function according to an intelligent cache rule. The intelligent cache rule is similar to the simple cache rule in that it has three components. In one example, in functioning according to the intelligent cache rule, the pre-fetch engine 216 can assess how much of the response to a specific query has changed and if the amount of change is under a certain percentage the cached response can be modified to reflect the changes. In the one example, in satisfying a query from responses in the local datastore 224, the data accelerator 204 can modify the query to only include the portions of a cached response that have changed. Further in the one example, the data accelerator 204 can then send the modified query to the data source system and receive a response from the data source system 228 that includes the changes to the response stored in the local datastore 224. As a result, in the one example, the pre-fetch engine can then merge the received response and the cached response in the local datastore 224, thereby updating the response in the local datastore 224 to include the received changes.
In a specific implementation, the pre-fetch engine 216 can decide on whether to modify the query to receive the changes to a cached response from the data source system 228 and merge the cached response to include the changes based on factors including the size of the cached response stored in the local datastore 224. In one example, the pre-fetch engine 216 can determine that if the response to the query is a small response, that it would be faster to receive the response from the data source system 228 rather than processing the query to determine if there are changes to a cached response for the query and then modifying and sending the query to receive the changes. In another example, the pre-fetch engine 216 can determine how much of the data has changed to determine whether to modify the query to receive only the changes rather than sending the entire query itself and receiving the complete response from the data source system 228. In determining how much of the data is changed in a cached response, the data accelerator 204 can instruct an instance of the data accelerator 204 that is upstream from the data accelerator 204 to physically re-run the request. Once the upstream instance gets the response, it can analyze each packet in turn to see if it has changed at all and if it has what percentage of the packet is different. Once the pre-fetch engine 216 knows how much of the data has changed, it can determine how complicated the changes are. An example of a more complicate change is if the size of a packet has changed, either due to extra rows being returned, or a string changed then details like the packet size and protocol specific information need updating. An example of the a less complicated change is when something has changed but the length of the packet remains the same, for example swapping “Company A” for “Company B” then it is simply a matter of swapping the “A” for “B” which is a less complicated change.
In a specific implementation, the pre-fetch engine 216 can rely on the data source system 228 to split a data file into subsections, for example with Microsoft SQL Server each file is split into a series of 8K pages. In one example, for the requests, instead of the actual tables that were used being monitored for changes, the pages that were read or written to when running the query can be captured, and then if a change happens, only the responses which were built using the changed pages can be expired.
In a specific implementation, the pre-fetch engine 216 functions to cache parts of a query. In one example, in a query that reads from a secure table but also adds an entry to an audit log; the write operation is completely independent of the read operations. As a result, in the one example, the pre-fetch engine 216 can cache the read operations and send the update operations separately.
In a specific implementation, the modification engine 218 functions to modify the queries sent by the client device 226 or responses to queries sent by the data source system 228. For example, the modification engine 218 can apply rules in modifying the queries and responses in order to improve network latency and overall network performance in communication queries and responses to queries between the client device 226 and the data source system 228. In modifying the queries and responses according to rules, the modification engine 218 can function to determine whether the query or response should be modified and then either modify the queries or responses itself or instruct an applicable engine in the data accelerator 204 to modify the query or response in accordance with the rules.
In a specific implementation, the modification engine 218 applies a compression or encryption rule to determine whether to compress or encrypt the query and instructs the compression and encryption engine 220 to modify a query by compressing the query if it determines that it is appropriate to compress the query. In one example, the modification engine 218 functions, in accordance with the compression rule, to determine what speed each of the up and down stream networks are running at and assign a score based on the determined network performance. Over time the performance can be verified to ensure if something changes, or if there is a particularly busy period on a portion of the network, it is taken into consideration. In another example, the modification engine can also check to see how long it takes to compress/decompress packets and compare that to the time it takes to send packets through both the up and down stream networks to determine the best ratio of packet size/compression size over processing cost to decide what to compress. In yet another example, the modification engine 218 can use both the network performance and the CPU compression cost ratios to determine whether a specific query or response should be compressed. If the modification engine 218 determines to compress a query or response, then instructions can be sent from the modification engine 218 to the compression and encryption engine 220 to compress the specific query or response.
In a specific implementation, the modification engine 218 functions, according to a decompression rule, to determine whether to decompress data. In one example, the modification engine 218 can determine whether the link contains a data accelerator instance. In one example, if there is no instance that exists then the data is always uncompressed. As a result, the modification engine 218 can instruct the compression and encryption engine 220 to decompress a query or a response.
In a specific implementation, in modifying queries and responses in accordance with rules, the modification engine 218 can use applicable rules used in carrying out the functions of engines in the data accelerator. In one example, the modification engine 218 can use whether it is determined, by the pre-fetch engine 216 that responses, including data, is stored on the local datastore 224 sufficient to satisfy the entire query or a portion of the query, to modify the query to include the portion of the query that is not satisfied by the responses stored in the local datastore 224. For example, if the query includes a requests for data A and data B, and data A is located on the local datastore 224, the modification engine 218 can modify the query to only include a request for data B. For example, the modification engine 218 can modify the query to include a TOP X clause that only requires a certain amount of data but requests more than it needs.
In another example, if the pre-fetch engine 216 determines that all of the responses, including data, necessary to satisfy the intercepted query are located on the local datastore 224, then the modification engine 218 can hold or ignore the intercepted query. Holding or ignoring the intercepted query can include deleting the query or instructing the engines of the data accelerator 204 to no longer process the intercepted query.
In a specific implementation, modifying by the modification engine 218 can include holding a query and preventing further processing of the query. Specifically, the rule applied by the modification engine 218 can include a query batching rule. The query batching rule can include determining whether duplicate queries have been intercepted at the same time from the same client device 226 or different client devices within the same LAN. In one example, if the modification engine 218 determines that duplicate queries have been intercepted, then the modification engine 204 can instruct the engines in the data accelerator 204 to continue performing their functions in processing the query while holding the duplicate queries and not allowing it to be processed by the data accelerator 204.
In a specific implementation, the modification engine 218 can modify a query or a response in accordance with a pre-validation rule. For example, the modification engine 218 can function to retrieve the SQL commands for a query run the commands through a series of checks to ensure that the query can actually be satisfied. If the modification engine 218 determines that a query cannot be completed then it can function to return an error message to the client device 226. Determining whether a query can be satisfied can include a syntax check on the command to validate that the data source system 228 will actually accept the query. Furthermore, determining whether a query can be satisfied can include checking that the query includes an actual command and is not just a comment. For example, in a typical data source system “/* SELECT * FROM A*/” will not return a result as the command is commented out. Determining whether a query can be satisfied can also include verifying that a user associated with the client device 226 or the client device 226 has the permissions to run the query.
In a specific implementation, the modification engine 218 can apply a string replacement rule to determine whether common strings are present in the query, modified query, or response. If the modification engine 218 determines that commons strings are present, then the modification engine 218 can replace the common strings to minimize the data size that has to travel through the computer-readable medium 202 including both LANs and WANs. Specifically, the modification engine 218 can function to replace common strings with specific id's which decreases the size of the data packets. For example if a company name appears in a number of queries then depending on the length of the company name it can save quite a lot of network traffic by replacing “Company Name Corporation” with “:1:” or some similar identifier.
In a specific implementation, the modification engine 218 can modify queries to only ask for the minimum amount of data that is actually required. For example, the query “SELECT * FROM B” could be modified to “SELECT TOP 10 * FROM B”.
In a specific implementation, the modification engine 218 can apply redirection rules to determine whether to redirect the query, modified query, or response. The redirection rules can be applied based on network information determined by the re-connections and redirection engine 222. The redirection rules can be applied in order to separate specific servers in the data source system 228 and balance the load amongst the servers in the data source system 228. The redirection rules can also be applied in order to separate specific network paths depending on which one is online and fastest.
In a specific implementation, the modification engine 218 can apply pre-validating rules that determine whether errors such as incorrect syntax of the query language or for security issues are present in the query, modified query, or response. In one example, the modification engine 218 can modify the query or response to remove the determined errors, thereby avoiding sending a query to the data source system 228 that will lead to returning a response from the data source system 228 that includes a failed request as a result of the error. In another example, the modification engine 218 can apply rules to address issues such as auditing and logging. Specifically the modification engine 218 can apply a rule to the query or response that calls the auditing or logging systems so that they can still be used. In yet another example, the modification engine 218 can also apply encryption rules to determine whether the query, modified query, or response should be encrypted. If it is determined that the query or response should be encrypted, then the modification engine 218 can instruct the compression and encryption engine to encrypt the query or response.
In a specific implementation, the modification engine 218 can determine whether the intercepted query is a simple request for data that can be satisfied without the need to send the query to the data source system 228. In one example, a query such as “SELECT 1” or “SELECT 10 * 100” always returns the same response, so the modification engine 218 can detect that the query is for a simple response and generate the response locally, in the data accelerator 204, and return it to the client device 226.
In a specific implementation, the data accelerator 204 functions according to a network optimization protocol rule. For example, the data accelerator 204 can apply enhancements at the network layer as opposed to the application layer. In one example, the data accelerator 204 parallelizes a large TCP buffer or it might change the underlying protocol. For example, in applying the network optimization protocol rule, the data accelerator 204 can monitor how effective the network performs against different buffer sizes as well as continually monitoring the network links and parameters of those links to make the best choices for buffer types and sizes.
In a specific implementation, the compression and encryption engine 220 functions to handle compressing and decompressing intercepted queries and responses. The compression and encryption engine 220 can compress and decompress queries and responses in accordance with instructions received from the modification engine 218. In one example, the compression and encryption engine 220 can use a configurable compression method such as the Lempel-Ziv method to compress and decompress the intercepted queries and responses. The queries and responses, after being compressed, can be appended with a header showing that the query and responses are compressed and the method used to compress the query and responses. In another specific implementation, the compression and encryption engine 220 functions to apply extra techniques such as using a pre-known compression table in compressing the queries and responses. In one example, the compression table can be included as part of an application intelligence table (hereinafter referred to as “AIT”) so the table does not need to be transmitted with each set of compressed query or response to further reduce the size of the transmitted queries and responses.
In a specific implementation, the compression and encryption engine 220 functions to encrypt and decrypt intercepted queries and responses. The compression and encryption engine 220 can encrypt and decrypt the queries and responses in accordance with instructions received from the modification engine 218. In one example, the compression and encryption engine 220 encrypts the queries and responses in a blanket approach than the compression of the queries or response as encryption is either required or not require for a given network link in the computer-readable medium over which the query or response will be transmitted. The queries and responses, after being encrypted, can be appended with a header showing that the query or the response is encrypted and the encryption method used as well as details on which certificates etc. should be used to decrypt the data.
In a specific implementation, the re-connection and redirection engine 222 functions to determine whether the network performance statistics. The network performance statistics can be used by the rule effectiveness and modification engine 214 to generate rules, as will be discussed in greater detail later, or used by the modification engine 218 to apply rules. For example, the re-connection and redirection engine 222 can monitor servers within the data source system 228 to determine uptime and performance. Additionally, the re-connection and redirection engine 222 can monitor when servers within the data source system 228 go offline. If the re-connection and redirection engine 222 determines that a server has gone offline, the re-connection and redirection engine 222 can redirect data flow traffic from the offline server. The re-connection and redirection engine 222 can redirect data flow traffic based on instructions received from the modification engine 218. The re-connection and redirection engine 222 can also function to change the way that data, including queries and responses, is transferred to the client device 226 or the data source system 228. For example, the re-connection and redirection engine 222 can change the way that data is transferred from TCP over IP to UDP over IP.
In a specific implementation, the re-connection and redirection engine 222 functions to monitor the connection between servers within the data source system 228 and the data accelerator 204. In one example, the re-connection and redirection engine 222 can determine whether or not the connection has failed by monitoring the read/writes to a network socket. If the re-connection and redirection engine 222 determines that a connection has failed, the re-connection and redirection engine 222 can attempt to repair and reestablish a connection with the failed socket. In another example, the re-connection and redirection engine 222 can connect to a new socket and uses that for the read/writes to the server. In the case of a failure during read, the original query or modified query is re-sent from the data accelerator, and the responses (if there is more than one) returned from the last time data was sent back to the client device 116. There are some situations where the connection cannot be recovered, for example when the request is not cacheable and it needs to be run again so that the reads can be read again. In this case the client device 226 receives a connection termination and can generate and resend the query to the data accelerator.
In a specific implementation, the re-connection and redirection engine 222 functions to establish a number of pre-connections to servers within the data source system 228. As a result, the data accelerator 204 does not need to establish a connection to a server after the query is received as the connection is already established. In forming pre-connections with a server, the re-connection and redirection engine 222 can setup an incoming connection and an outgoing connection with the server. For example, queries or modified queries can be sent over the incoming connection, while responses from the data source system 228 can be sent over the outgoing connection. The pre-established incoming and outgoing connections can be de-coupled from each other. As a result, by way of an example, the incoming connection can be spliced onto a separate outgoing connection.
In a specific implementation, the rule effectiveness and modification engine 214 functions to generate and/or modify rules that are applied by the modification engine 218 and the other engines in the data accelerator 204. In one example, the rule effectiveness and modification engine 214 can generate and/or modify the rules based on network performance data, the type of queries that are intercepted, the type of client device 226 that is coupled to the data accelerator, and the type of data that is requested for in the intercepted queries. In the one example, the rule effectiveness and modification engine 214 in determining a compression rule including whether to compress some data, can base the rule on the average time it takes to compress a piece of data x bytes long and how long it takes to send x bytes over a specified network link. In another example, the rule effectiveness and modification engine 214 can determine how long it takes to compress each packet as they are compressed. In yet another example, the rule effectiveness and modification engine 214 can also determine how long it takes to send different sized packets over the specified network link. The rule effectiveness and modification engine 214 can then use the determination information to generate a compression rule that is used to determine whether to compress the data. In another example, the rule effectiveness and modification engine 214 can also determine how long different types of data take to compress, i.e. a series of 100 0's takes 1 ms but 10,000 random bytes takes 6 ms. This determined data can be used to generate or modify compression rules for determining whether to compress data before sending it.
In a specific implementation, the data accelerator 204, in intercepting, processing, and modifying queries and responses, functions according to custom rules. In one example, in functioning according to custom rules, the data accelerator 204 can help to ensure that data source system functions, such as auditing or logging occur in a data source system 228. In particular, a custom rule can be put in place to run a specific command on the data source system 228 as events occur in the data accelerator 204.
In a typical system, there would be some auditing when a user carried out a specific action, for example if someone retrieved all the annual wages of all employees, the query would need to be audited. However, if the response is retrieved from the local datastore 224, then the request would not be sent to the data source system 228 to be logged. The custom rules item can be configured with a list of queries or events such as data source system 228 load balancing or network path re-routing and then a list of actions such as writing to a log file or sending a separate request to the data source system 228.
In a specific implementation, the data accelerator 204 includes an AIT used by an engine of the data accelerator 204. The AIT contains details of the query, whether the query is cacheable, which base tables are read from and written to in generating a response by the data source system 228 and data about how previous queries have performed. The following is a representation of a subset of what a row in the AIT may contain in a specific implementation:
Query: “Insert Into TableTwo Select * from TableOne”
IsCacheable: No
BaseReadTables: “TableOne”
BaseWriteTables: “TableTwo”
Previous Request Times:
Was Compressed: True, Time: 00:00.22
Was Compressed: True, Time: 00:00.26
Was Compressed: False, Time: 00:00.94
In one example of using the AIT, the rule effectiveness and modification engine 214 can see that the response time is much faster with compression than without, and can therefore modify the compression rule that is applied by the modification engine 218 in determining whether to compress the data.
The AIT can be extended by other engines as necessary. For example the re-connection and redirection engine 222 can add data to the AIT that signifies that one specific server in the data source system 228 is faster in generating a response for a specific query than other servers in the data source system 228. As a result, the AIT can be updated to ensure that the specific query is sent to the server that is fast in generating a response.
In a specific implementation, the AIT is used in managing the local datastore 224 in that when a query runs that includes a base write table, all of the queries stored in the local datastore 224 which have one of the write tables as a read table can be expired.
In one implementation if a query, or a required part or property of a query, is not in the AIT then the rules are used to add different parts of the row until it is complete based on application of the rules by the engines in the data accelerator. For example the AIT can indicate that it has not been determined whether a query is cacheable. Therefore the engines in the data accelerator 204 can process the query to determine if the query is cacheable. The engines can also determine which base read and write tables, if applicable, once the information is known. The AIT can be updated so that other rules and future queries have the same information available. This dynamic updating of the AIT effectively means the AIT can grow and keep up to date as the application changes and/or users perform other operations that haven't been used before. The benefits of the self-learning AIT are that the developers of the application do not have to pre-program the AIT and as the AIT grows and understands more about the application, it can better optimize the application so that for users, it appears to get faster and faster.
In a specific implementation, the example system shown in FIG. 2 can operate in managing data traffic in accordance with various communication protocols. For example, the system can operate in accordance with protocols used by Microsoft® SQL Server in sending queries and receiving queries in the primary query languages used by Microsoft® SQL Server, such as T-SQL and ANSI SQL. In another example, the system can operate in accordance with the Hypertext Transfer Protocol (hereinafter referred to as “HTTP”), the Web Distributed Authoring and Versioning (hereinafter referred to as “WebDAV”), and the Simple Object Access Protocol (hereinafter referred to as “SOAP”).
In a specific implementation, a query is defined by a set of headers and an optional body sent in ASCII text. The headers are used to signify whether or not there is a body in the query, i.e. whether there is the HTTP header Content-Length or other information such as a Content-Type or Chunked-Encoding in the query.
In a specific implementation, the content of the query includes the contents of the HTTP header and possibly the body. In analyzing headers to define the query, there are certain headers which can be ignored for the purposes of caching such as the authorization or referrer headers as these do not uniquely identify a request, rather some extra data that is unique to the client. For example to uniquely identify a request to determine a key for caching, a key is defined as some text which can be used to match requests, to requests/response pairs that are in the local datastore 224. For example, based on the following example HTTP request:
GET/Uri/Uri HTTP/1.0 [Carriage Return] [Line Feed]
Host: www. server.com[Carriage Return][Line Feed]
Content-Length: 0[Carriage Return][Line Feed]
[Carriage Return] [Line Feed]
The unique caching key would be: “GET:/Uri/Uri:1.0”
This shows the query type, “GET”, the requested resource “/Uri/Uri” and the version of the query, “HTTP 1.0”
In a specific implementation, in supporting the WebDAV and SOAP protocols, the system shown in FIG. 2 can be configured to understand the different types of WebDAV packets i.e. PropFind and options and read the XML Body of the Soap message which is used to define a caching key. Additionally, the context information can be determined from headers such as the user-agent which gives information about the client device 226, the client device 226 browser, and the name of the server in the data source system 228 for which the packet was destined.
In a specific implementation, in determining whether a request is cacheable, the data accelerator 204 breaks the data requested in queries into two broad categories. The first is data which is inherently cacheable such as requests for images, documents or files such as CSS files. The second set is data that includes application code such as Java Servlets or Pearl Scripts. In the example when the data includes application code, a number of methods including monitoring the responses and compare to previous similar responses for the same request, analyzing the resources by parsing the text and or decompiling executable files and also using a manual method of having the owner of the resources can be used to determine which resources are cacheable.
In a specific implementation, the example system shown in FIG. 2 can also be configured to determine which queries should be expired or held. Specifically, there are a number of methods which can be utilized, such as the HTTP header If-Modified-Since can indicate whether a response or a query has changed. In one example, this is often effective where a web site is set to not allow caching but the files are the same and transferring the data over the Internet again is a waste of time. A similar method can be used for the data accelerator 204 to send a query or a modified query for a requests for a resource from the data source system 228, if the server responds saying it has expired, but the upstream instance determines that the content of the response is the same as a previously returned response then the upstream instance can tell the downstream instance to use the version it already has.
In a specific implementation, the example system shown in FIG. 2 functions to send WebDAV traffic is sent over the HTTP protocol. Specifically, by understanding the WebDAV extensions, it is possible to further understand the content and the context of queries. This can be seen by having the data accelerator 204 monitoring for queries. In some cases where a server does not support a particular option, such as WebDAV version 2, the Data Accelerator can return that the server does and handle the protocol changes as required to make it look as if the server does support version 2.
The following describes an example of the operation of the system shown in FIG. 2, with reference to a healthcare provider. Not using a data accelerator as is described in the example system shown in FIG. 2 could require that the national healthcare provider either host their data source system 228 in one location and use an expensive remote virtualization solution. Alternatively, not using the data accelerator according to the example system shown in FIG. 2 could require that the national healthcare provider host individual data source systems 228 in each branch office and replicate data between branches which can be inefficient, prone to failure, and expensive.
Using the example system shown in FIG. 2, when a patient goes to reception their details are first loaded so the receptionist's traffic can be processed by the data accelerator 204, as the data that is required is common i.e. there are a number of requests which get the patient records (e.g. names, address, date of birth etc.) As the patient moves to the specific department, the information is already available at the local cache so it can be served immediately.
The following shows another example of the operation of the system shown in FIG. 2, with reference to an insurance company. In the second example of operation, a global insurance company has a number of reports showing the daily claims and policy sales data which are run by various levels of management every day. In using the data accelerator 204, the global insurance company is able to drastically reduce the amount of processing that the data source system 228 needs to do during the online day so it can be used for other processing or a cheaper system can be put in place. The hierarchy of managers who view the reports are as follows:
1×Global Director
5×Regional Directors
50×Country Managers—Each region has an average of 10 countries
2500×District Managers—Each Country has an average of 50 districts
There is one report for each manager so the global director has a global report, regional directors have a report and each country manager has their own report etc. A report consists of one DBMS request. Typically each person views their own report, their peer's reports (district managers peers are those in their country and not in all countries) and also their direct subordinates.
The data is refreshed once overnight and without the present implementation of the invention and request caching the amount of requests the DBMS needs to cope with is:
Global Director=6 Reports—1 Global Report and 5 Regional Reports
Regional Directors=275 Reports—Each regional director views the 5 regional reports and their own countries reports
County Managers=27500 Reports—Each country manager views all 50 country reports and their own districts
District Managers=25000 Reports—Each district manager views their own reports and all the districts in their own country
Total responses=52781
If however local datastores and a pre-fetching engine is used, so that reports are only retrieved once, then we simply count the number of reports that are available:
1 Global Report
5 Regional Reports
50 Country Reports
500 District Reports
Total Responses=556
As a result, only 1.053% of the number of original responses need to be made. Because the same reports are run every day, once the data has been refreshed the data accelerator 204 can employ pre-caching to generate the data the reports require before anyone has even requested the first report.
Deploying this for enterprise reporting solutions often means that it is possible to restrict the use of complicated and expensive pre-aggregating solutions such as online analytical processing (OLAP) cubes.
In another example of the operation of the example system shown in FIG. 2, includes a website which shows dynamic pages directly from a data source system 228. The site described in this example is full time and has pages modified by editors as well as data feeds that continuously update pages. By using the data accelerator 204 performance and operational costs of the site can be improved. Specifically, a page consists of; a site header, site footer, a site tree and the page itself where each item is a separate data source system 228 request.
On average, within the example, 1 Page every 5 minutes is added or deleted which changes the site tree. The header or footer is changed once every 7 days. The site receives 50 page views a minute. Without using the data accelerator 204, the example site handles 2,000 requests/minute which are:
50×Site Tree
50×Site Header
50×Site Footer
50×Pages
These example statistics equate to 12,000 requests per hour, 288,000 per day and 2,016,000 requests a week. Using the system shown in FIG. 2, depending on which pages are shown, in the worst case scenario, where the page requested is always the page that has been modified there is still a massive reduction in requests is achieved. Specifically,
1×Site Tree—every 5 minutes
1×Site Header—every 7 days
1×Site Footer—every 7 days
1×Page—every 5 minutes (if the changed page is not requested then this can be even lower). This equates to 12 requests per hour, 288 requests per day and 2,018 DBMS requests every week. As a result, 0.1% of the original responses are generated in the above described example.
In yet another example of operation of the example system shown in FIG. 2, parallel TCP streams are used. When sending large chunks of TCP data, it is often inefficient to send a series of packets in serial as the time it takes to transfer the data over a network with high latency and also high bandwidth. Specifically, the time to send a packet and receive acknowledgement can be modeled by the following formula: (SizeOfData/MaximumTCPPacketSize) * (TimeToSendPacketOverInternet+TimeToSendAcknowledgement). In an example, where 10,000 bytes need to be sent and assuming the maximum TCP packet size is 1460 bytes and the network links we are using have a combined latency of 50 ms in each direction, the time to send the data and receive the TCP acknowledgement is (10,000/1460) * (50+50)=˜685 ms. If instead, the data is split into chunks no larger than 1460 and sent at the same time in parallel TCP streams, the time to send the data and receive the TCP acknowledgements is (1460/1460) * (50+50)=100 ms. Additionally, as many parallel packets as there is available bandwidth can be sent meaning that it takes roughly the same amount of time to send 10,000 bytes as opposed to 100,000 bytes as long as there is the available bandwidth.
In another example of operation of the example system shown in FIG. 2, the packets can be sent by the data accelerator in accordance with the user datagram protocol (hereinafter referred to as “UDP”). Specifically, in networks with very high latency but very high reliability, the time for the client device 226 to receive the TCP acknowledgements is too slow. As a result the data accelerator 204 can send the data using UDP as no downstream acknowledgement is required, The data accelerator 104 can send a checksum for the data with each packet and its own packet identifier so if data is not received or not received upstream data accelerator 204 instances or the data source system 228 can re-request the missing data.
In yet another example of operation of the system shown in FIG. 2, the data source system 228 includes a multi-terabyte database containing information from a supermarket's store card usage. When a team or internal staff is mining data from the database in order to track trends of customers or products, they may need to repeat many of the same queries but each time with some additional or different information required. By caching the results of the requests each time a team member runs a query they only need the database server to return new results that no one else has already requested. Additionally, each week the managers within the organization run a report of the purchases made and compare the purchases made to historical data. Normally the database server would have to return all of the data required for the reports, including the data necessary to create the historical data. Using the data accelerator 204, when a user runs the report they can access all the historical data from the local datastore 224 that have been run before, and the database server is only accessed to run a small query for the current week's data.
In a specific implementation, the data accelerator 204 can be used to reduce the load on the connection from the client device 226 to the data source system 228. This is achieved by using the various techniques described for each connection that the client device 226 makes to the data source system 228, and eliminating the need for the connection where possible. By being able to improve the performance of the connection between the client device 226 and the data source system 228, it is possible to move the data source system 228 from a local network connection onto a slower WAN connection. As a result, the data source system 228 can be moved into a public data center or public cloud environment or for an enterprise the data source system 228 can be centralized into a private data center or private cloud environment.
In one example of the operation of the example system shown in FIG. 2, a university system may have a database server on each campus and provide a remote desktop connection for students to log in from remote locations. By using the data accelerator 294, the database can be moved into a public cloud that provides low cost infrastructure and the campus locations and remote users or students can access the data using a client application on their local machine. To simplify the deployment of the client software application streaming can be used from a web page deployment. A typical use is for a tutor to download a student's essay, which is stored in a binary format inside the database, so it can be marked.
FIG. 3 depicts a diagram 300 of an example of a system for managing data traffic using multiple data accelerators. The example system in FIG. 3 includes a computer-readable medium 302, a first data accelerator 304, a second data accelerator 306, a local datastore 308, a client device 310, and a data source system 312. While the example system in FIG. 3 is only shown to have two data accelerators, the system can include a number of data accelerators that form a plurality of data accelerators.
The first data accelerator 304, the second data accelerator 306, the local datastore 308, the client device 310, and the data source system 312 are coupled to each other through the computer-readable medium 302. The first and second data accelerators 304 and 306 can include and function according to an applicable data accelerator, including the data accelerators described in this paper. The local datastore 308 can function to store data locally according to an applicable local datastore, including the local datastores described in this paper. The client device 310 can include and function according to an applicable client device, including applicable client devices described in this paper. The data source system 312 can include and function according to a data source, including, for example, applicable data source systems described in this paper.
In one example, the second data accelerator 306 is an instance of the first data accelerator and is implemented upstream from the first data accelerator 304, between the first data accelerator 304 and the data source system 312. As a result, the first data accelerator 304 and the second data accelerator 306 form a chain of data accelerators that is implemented between the client device 310 and the data source system 312.
In a specific implementation, the path through which data travels between the client device 310 and the data source system 312 through a chain of data accelerators is variable. In one example, queries and responses can travel through one or all of the data accelerators in the chain. In the one example, a data accelerator in the chain can change the route to the data source system 312 and the client device 310. In another example, the data accelerator changes the route to the data source system 312 based on the source of the query and/or the specific properties, including content and context, of the queries. The data accelerators within the chain can communicate with each other to describe what operations have been performed in processing the queries and responses. For example, if the first data accelerator 304 compresses a query according to the compression rule, the data accelerator 304 can communicate to the second data accelerator 306 that the query has been compressed and that the second data accelerator 306 needs to decompress the query before it is sent to the data source system 312.
In a specific implementation, in modifying a packet by processing a query or response and sending the modified packet to the second data accelerator 306, the first data accelerator 304 can wrap the contents of the packet in a specific message that the second data accelerator 306 can remove before forwarding the packet to either the data source system 312 or the client device 310. In one example, the data accelerators 304 and 306 in the chain can use a number of methods to determine what rules to apply or allowed to be applied to data packets forwarded through the specific data accelerator. In one example, the first data accelerator 304 sends a specially crafted query to the data source system 312 through the second data accelerator and monitors for a response. In yet another example, each data accelerator in the chain has its own unique id that can be used to modify the data that flows through the chain. For example, the first data accelerator can modify a query to include “SELECT uniqueID.” The second data accelerator 306 can add its own id so the query is modified to include “SELECT uniqueID, uniqueID. The order of the unique ids in the query shows the data path of the query.
In a specific implementation, the data accelerators 304 and 306 within the chain are aware of the chain and of other data accelerators or instances of data accelerators within the chain. As a result, data accelerators within the chain are able to communicate between themselves within the network channel that has already been opened for the client device 310. In one example, the first data accelerator 304 and the second data accelerator 306 can relay between each other, diagnostic and network performance information discovered by each respective data accelerator. In yet another example, the first data accelerator 304 and the second data accelerator 306 can share information about the queries and responses, such as how quickly they are being received at each point. With this information the data accelerators within the chain can dynamically determine how effective or detrimental a specific applied rule has been processing intercepted requests and responses. As a result, the rules and application of the rules (either not apply the rule or change the parameters to the rule or even test a different rule) can be modified to find the optimum rules and applicant of rules in decreasing network latency and otherwise improving network performance.
In a specific implementation, the data accelerators 304 and 306 can apply rules in working together to process intercepted queries and responses that travel through at least part of the chain formed by the data accelerators. In one example, the data accelerators can apply in-flight rules in processing the intercepted requests and responses. In-flight rules can include, for example, rules used in processing queries and responses, including the rules described in this paper. Specifically, in applying in-flight rules, the second data accelerator can apply the compression rule to an intercepted response from the data source system 312 before it is sent to the first data accelerator 304.
In an example of the processing of intercepted requests and responses using a chain of data accelerators, a query is intercepted. The first data accelerator 304 determines that the command is “SELECT a, b, c FROM xyz” from the intercepted query. Then the first data accelerator can further apply the in-flight rules in processing the intercepted query. Specifically, the first data accelerator 304 can apply the caching rule and determine that a corresponding response is in local storage but has expired the query cannot be satisfied with the locally stored response. The first data accelerator 304 can then apply the compression rule and determine that there is an upstream data accelerator, e.g. second data accelerator 306, and the network link is slow. As a result, the first data accelerator 304 can compress and wrap data packet information in a compressed data accelerator packet. The packet is then sent upstream to the second data accelerator 306.
In a specific implementation, the first data accelerator 304 can use the pre-caching rule to determine that normally when the query of this type is sent (another example of the ‘content and/or context’ that we reference earlier), there are an additional 5 queries are always run after satisfying the query. As a result, the first data accelerator 304 can generate the next 5 queries, and the rules can be applied to the next 5 in processing the queries. The first data accelerator 304 can receive a response to the query and determine that the query is compressed and there are no downstream data accelerators between the first data accelerator 304 and the client device 310. As a result, the first data accelerator 304, in accordance with the compression rule, can decompress the response and send the response to the client device 310.
In a specific implementation, post-send rules can be applied by either or both the first data accelerator 304 and the second data accelerator 306. As part of the post-send rules, the data accelerators 304 and 306 can determine that the upstream link is a fast link and there is little or no latency. As a result the data accelerators can turn off the compression rule for packets that are less than 1 k in size.
In a specific implementation, the data accelerators 304 and 306 use packet merging rules to process intercepted queries and responses. For example, if the first data accelerator 304 has to transfer two small packets over two separate connections at the same time, the first data accelerator 304 can create a single packet and send it upstream to be split by the second data accelerator 306. Although the latency involved is the same, instead of sending two small packets in parallel, half of the bandwidth requirement is used while leaving the latency overhead the same by sending the data in a serial fashion.
In a specific implementation, the first data accelerator 304 and the second data accelerator 306 can communicate with each other according to the following. Specifically, the first data accelerator 304 is not sure if upstream form it is another data accelerator (e.g. the second data accelerator 306) or the data source system 312. In such an example, the data accelerator 304 sends the query or modified query with the information that another data accelerator would need to process the query or modified query but that will not actually do anything if the sent query or modified query is received by the data source system 312 instead of another data accelerator. By way of example, when the first data accelerator 304 wants to enumerate the chain of data accelerators and find the speed of each network link, it can send a request such as: “SELECT ‘1 January 2010 09:43:22.02’ As DAInstance4AA5888240B4448e9E20-62A8F70CF595, current date As ServerTime.” The DAInstance4AA5888240B4448e9E20-62A8F70CF595 is the unique id of the first data accelerator. When the response is sent back to the first data accelerator, it will include the time the request was started and the time on the server. This information can be used to determine how long it took to get a response from the network.
In a specific implementation, each instance of a data accelerator adds its own unique ID and time so the received query actually ends up as “SELECT ‘1 January 2010 09:43:22.02’ As DAInstance4AA5888240B4448e9E20-62A8F70CF595, ‘1 January 2010 09:43:22.04’ As DAInstance936C4368DE18405881707A22FDBCFE59, ‘1 January 2010 09:43:23.09’ As DAInstance8F4AEA5AE4D544cd9B56DF16F7563913, current_date As ServerTime”. From this response each data accelerator can determine where it is in the chain and also that the speeds between data accelerators within the chain. It is evident from the above response that the link between the second and third data accelerators is slow, as the second data accelerator received the query at 9:43:22.04 and the third data accelerator received the query at 9:43:23.09.
In a specific implementation if the first data accelerator 304 receives a request such as this, it then knows there is a downstream data accelerator (second data accelerator 306). As a result, the first data accelerator 304, instead of re-running the query, after combining the request it received with the results it already has, can simply share the updated results with both the client device 310 and the upstream data source system 312.
In a specific implementation, the first data accelerator 304 and the second data accelerator 306 can communicate with each other according to the following. Specifically, the first data accelerator 304 can communicate with the second data accelerator 306 knowing that the second data accelerator 306 exists upstream from the first data accelerator 306. The first data accelerator 304 can create a connection by sending a data accelerator control packet to the second data accelerator 306. The data accelerator control packet can instruct the second data accelerator 306 in how to process and route data packets. For example, the data accelerator control packet can instruct the second data accelerator not to forward packets up or down stream but to process and route the packet according to specific rules applied by data accelerators in processing and routing queries and responses.
In one example of the communication between data accelerators in a chain, two separate workgroups, accounting and marketing, both use the same data source system 312 but rarely run the same queries. Each department has their own data accelerator which is coupled to the data source system 312. Because there is no chain the instances cannot communicate by sending requests up the chain but instead communicate by sending data accelerator control packets. In the case of caching, where a request comes in from the marketing which has already been served to accounting, the caching rule, as well as checking its own cache can ask the data accelerator used by the accounting workgroup to determine if it has the query and a response to the query stored in local storage. The data accelerator used by the accounting workgroup can return the response to the query to the data accelerator used by the marketing workgroup which can then satisfy the query with the response. The data accelerator used by the accounting workgroup can process the query from the data accelerator used by the marketing workgroup according to the rules described in this paper. For example, the data accelerator used by the accounting workgroup can determine whether the response has expired before retrieving the response from local storage.
FIG. 4 depicts a diagram 400 of an example of a system for transferring an application to a client. The example system in FIG. 4 includes a computer-readable medium 402, an application accelerator 404, a data accelerator 406, a local datastore 408, a client device 410, and a data source system 412.
In the example of FIG. 4, the application accelerator 404, the data accelerator 406, the local datastore 408, the client device 410, and the data source system 412 are coupled to each other through the computer-readable medium 402. The data accelerator 406 can include and function according to an applicable data accelerator, including the data accelerators described in this paper. In one example, the data accelerator 406 can function to manage the flow of data between the client device 410 and the data source system 412 that are used in the execution of applications on the client device 410. The local datastore 408 can include and function to store data locally according to an applicable local datastore, including the local datastores described in this paper. The client device 410 can include and function according to an applicable client device, including the client devices described in this paper. The data source system 412 can include and function according to an applicable data source, including the data source systems described in this paper.
In a specific implementation, the application accelerator 404 functions to intercept queries from the client device 410 and manage the transfer of applications for execution on the client device 410. For example, the queries can be generated by the client device 410 as part of the beginning of execution or during execution of an application on the client device 410. In one example, the query includes a request for all executable code needed to begin running the application on the client device 410.
In a specific implementation, the application accelerator 404 functions to create an environment on the client device 410 in which the application is executed through which the application accelerator 404 can monitor use and execution of the application. For example, the application accelerator 404 can intercept the response that include the code necessary to run the application on the client device 410 and instruct the data accelerator 406 to return data through which a virtualized version of the application can be run on the client device 410 within the created environment. In being virtualized, only the parts of the application are necessary to execute the application at a given time. As a result, the application accelerator 404 can apply restrictions that limit a user of the client device having access to the parts of the application that are not used at a given time in executing the application. This further protects the software from piracy. Additionally, in creating a virtualized version of the application, the application accelerator 404 can limit potential conflicts with other applications that are running on the client device 410 concurrently. Specifically, if other applications or software is running that creates a conflict with a portion of the application, the application accelerator 404, in creating a virtualized version of the application, can limit the use of the portions of the application that create a conflict with the other software or applications running on the client device 410 concurrently.
In one example of using the application accelerator 404, a user can log in to an account using a web interface. The user credentials are checked against a licensing server, that is included as part of the data source system 412, to ensure that the user has permission to use the required application. The user can launch the virtualization application package by, clicking a button labeled “Launch” which downloads the application accelerator 404 to the client device 410. Once the application accelerator 404 is downloaded, it can create, for example, desktop/miscellaneous shortcuts and ensure the latest version of the files required for the application virtualization package are available, if not it will download them. Each individual virtualized application package has its own set of required actions, such as setting shortcuts and downloading supporting files which are carried out as required. The application accelerator 404 then downloads the correct version of the application virtualization package from the data source system 412 and begins execution of the application.
In a specific implementation, when the user next tries to use their application virtualization package, the application accelerator 404 is started, which runs through the checks to ensure that both itself, and the application virtualization package, is up to date, if necessary it will download updated files and start the application virtualization package. Using this process, the user needs to download the file and run it only once, from that point on the application virtualization package will always be up to date.
In a specific implementation, in managing the execution of the application, the application accelerator 404 can provide highly flexible licensing (e.g. time based usage, limited number of times an application can be run, try and buy etc.). For example, the application accelerator 404 is configured to check if the user has permission to run the application from a central database before allowing access to the application virtualization package. Additionally, in functioning with the data accelerator, the data accelerator 406 can be configured to keep a time based token of remaining usages and then cease to allow access to the package once the time has expired; thus stopping the application from running Furthermore, the data accelerator 406 is configured to decode the traffic through the computer-readable medium 402. Therefore, the application accelerator 404 in combination with the data accelerator 406 can prevent operations like a file copy from being able to be run so that the software cannot be pirated and the license checks cannot be bypassed.
In one example of the operation of the example system in FIG. 4, fictional Solicibert Company supplies law firms with time management and billing software. A lawyer logs onto the portal, their credentials are checked and a list of applications to which they have access shown to the customer, he then clicks on the “Launch” button for the “PerseusTimeTrack” application virtualization package, the launch button downloads the application accelerator 404. The user then runs the application accelerator 404 and it checks in the users profile folder whether or not the required files to run the package exist. If they do not then the files (including data accelerator 406) are downloaded, a desktop shortcut in the users roaming profile is created and the application accelerator 404 is copied to the roaming profile. The application accelerator 404 can then validate that the user has access to the software, start the data accelerator 406 and then start the software. The user then can use the software as if it was installed on the client device 410
When the user goes onto a new machine in the same company or to a home machine, the roaming profile associated with the user has the shortcut to the application and the application accelerator 404, if the user starts the shortcut the package and the data accelerator 406 are downloaded and the application is started. For enhanced performance for larger application virtualization packages and to allow increased control to provide licensing controls, rather than download the entire application virtualization package (which can be several hundred MB) the package can be stored on a central data store and accessed using a remote network share, for example SMB, CIFS, WebDAV or FTP.
If the package were run from a remote network share without optimization the performance of the application during use would be very slow as applications load large amounts of data from the data store that contains the application virtualization package into memory during runtime and unload the data once they are not needed to keep system RAM from being used up unnecessarily. This means that even with a package of 100 MB there could be over 1000 MB of data transferred during use as the same parts are loaded and unloaded from memory during use of the application. The result of this is slow performance of the application as client applications are written with the expectation that the source files are stored on a local disk with high-speed access compared to a remote network share. To overcome these problems the data accelerator 406 can be used to provide the relevant optimization techniques; for example caching of the blocks of the package so that once they are used the first time, they do not need to be pulled over the network again, or compression to reduce the amount of data that needs to be transferred, or pre-fetching so that once one feature of an application has used the blocks of data for the files needed to run a feature that always or normally follows are cached in advance. Using the data accelerator 406 also means when the application virtualization package is updated only the parts that have changed need to be downloaded as the other parts will already be cached. Also the application virtualization package can be enabled to run offline as the data accelerator 406 can be configured to proactively cache all blocks of the package in the background for later use offline.
In one example of operation of the system shown in FIG. 4 can be used in disaster recovery. For example in the event of a disaster, the user can work from a static or roaming device and the application with all the data, will stream down to the user and maintain the full functionality that they had whilst in the original work place.
FIG. 5 depicts a diagram 500 of a flowchart of an example of a method for managing the transfer of data. The flowchart begins at module 502 with intercepting a query that includes a request for data. The query can be generated by a client device and intercepted by a data accelerator.
The flowchart then continues to decision point 504 where it is determine whether part of the requested data is stored locally. In one example, data is stored locally if it is within a local datastore that is part of the client device that generated the query. In another example, data is stored locally if it is within a local datastore that is within the same LAN as the client device that generated the query. Specifically, the local datastore can be implemented as part of another peer's device within a peer-to-peer network. If it is determined, at decision point 504, that part of the requested data is stored locally, then the flowchart continues to module 506.
At module 506, the flowchart includes retrieving and sending the part of the requested data stored locally to the client device that generated the query. The flowchart then continues to module 508 where the query is modified based on the retrieved and sent part of the locally stored data. In one example, the query can be modified to exclude the request for the data stored locally and sent to the client device. As a result, the query can be modified to include only a request for the data that is necessary to satisfy the query. In one example, all of the requested data necessary to satisfy the query is stored locally. If this is the case, then the flowchart ends. In another example, the query can be modified to either not include a request for data or can be ignored, not sent, or deleted, wherein the flowchart ends.
Alternatively, if the query is modified, at module 508, to include a request for data that is not stored locally to satisfy the query or it is determined at decision point 504 that part of the requested data is not stored locally, then the flowchart continues to module 510. At module 510, the query is sent to a data source system. The query can be a query that is modified at module 508, or the query that is intercepted at module 502. The flowchart continues to module 512, where a response to the query is received from the data source system that the query was sent to at module 510. The response to the query includes the requested data that was included as part of either the modified query, created at module 508, or the query intercepted at module 502. The flowchart then continues to module 514, where the requested data that is received from the data source system at module 512 is returned to the client device.
FIG. 6 depicts a diagram 600 of a flowchart of an example of another method for managing the transfer of data. The flowchart begins at module 602 with intercepting, from a client device, a query that includes a request for data. In one example, the query is intercepted by a data accelerator.
The flowchart continues to decision point 604, where it is determined whether the query and a response to the query are stored locally with respect to the client device, from which the query was intercepted. In one example, data is stored locally if it is within a local datastore that is part of the client device that generated the query. In another example, data is stored locally if it is within a local datastore that is within the same LAN as the client device that generated the query. The local datastore can be implemented as part of another peer's device within a peer-to-peer network.
If it is determined at decision point 604, that the query and/or a response to the query are stored remotely, then the flowchart continues to module 610. At module 610, the query is forwarded to a data source system that can satisfy the query by providing the requested data. If at decision point 604, it is determined that the query and the response are stored locally, then the flowchart continues to decision point 606. At decision point 606 it is determined whether the locally stored response to the query is still valid. In one example, a response is still valid if none or a percentage of the data remains unchanged, so that the query can still be satisfied. If it is determined that the locally stored response to the query is not valid, then the flowchart continues to module 610, where the query is forwarded to the data source system. If it is determined that the locally stored response to the query is valid, then the flowchart continues to module 608, where the response is served from the local storage to the client device.
FIG. 7 depicts a diagram 700 of a flowchart of an example of another method for managing the transfer of data in receiving multiple queries. The flowchart begins at module 702 with intercepting a first query from a client device. In one example, the query is intercepted by a data accelerator.
The flowchart continues to module 704, where the first query is forwarded to a data source system. The flowchart continues to module 706, where a second query is intercepted. The second query can be intercepted from the same client device that generated the first query or from a separate client device. In one example, the client devices are part of the same LAN. Next, the flowchart continues to decision point 708 where it is determined if the second query is the same as the first query. In one example, the second query is the same as the first query if both queries are requesting the same data. If it is determined at decision point 708 that the second query is the same as the first query, then the flowchart continues to module 710, where the second query is held, and otherwise not forwarded to the data source system. Alternatively, if it is determined at decision point 708 that the second query is not the same as the first query, then the flowchart continues to module 712, where the second query is forwarded to the data source system.
These and other examples provided in this paper are intended to illustrate but not necessarily to limit the described implementation. As used herein, the term “implementation” means an implementation that serves to illustrate by way of example but not limitation. The techniques described in the preceding text and figures can be mixed and matched as circumstances demand to produce alternative implementations.

Claims

We claim:

1. A method comprising:

intercepting a query that includes a request for requested data from a client device;

determining whether a portion of the requested data is stored in a local storage on a same local area network as the client device;

if it is determined that the portion of the requested data is stored locally:

returning the portion of the requested data to the client device from the local storage;

modifying the query based on the portion of the requested data to not include a request for the portion of the requested data;

sending the modified query to a data source system coupled to the client device through a wide area network;

forwarding a remaining portion of the requested data in response to the modified query from the data source system to the client device.

2. The method of claim 1, further comprising:

storing the remaining portion of the requested data from the data source system in the local storage on the same local area network as the client device.

3. The method of claim 1, wherein determining whether a portion of the requested data is stored locally further includes:

determining whether a locally stored query is the same as the intercepted query;

determining whether the locally stored query is still valid;

if it is determined that the locally stored query is still valid, returning the portion of the requested data to the client device.

4. The method of claim 3, wherein determining whether the locally stored query is still valid includes determining that the local stored query is invalid when rows have been added to, deleted from, or modified in a locally stored response to the locally stored query.

5. The method of claim 1, further comprising:

applying a compression rule to determine whether to compress at least a portion of the modified query before sending it to the data source system;

compressing the at least a portion of the modified query before sending it to the data source system if it is determined to compress the modified query.

6. The method of claim 5, wherein applying the compression rule includes:

determining an upstream network speed between the data source system;

determining an amount of time it takes to compress and decompress packets in the modified query;

determining a ration of compression size over processing cost from the determined upstream network speed and the determined amount of time it take to compress and decompress packets in the modified query;

determining a size of the portion of the modified query to compress based on the ratio of compression size to processing cost in compressing the portion of the modified query.

7. The method of claim 1, further comprising:

determining whether the modified query includes a common string;

replacing the common string with a specific identification to reduce the size of the modified query, the specific identification uniquely associated with the common string and being of a smaller size than the common string.

8. The method of claim 1, further comprising:

applying pre-validating rules to the modified query to determine whether the modified query includes errors that will prevent the data source system from processing the modified query;

removing the errors form the modified query before sending the modified query to the data source system.

9. The method of claim 1, further comprising:

determining an amount of time to return the portion of the requested data to the client device from the local storage;

determining an amount of time to return the portion of the requested data from the data source system to the client device;

returning the portion of the requested data to the client device from the local storage if the amount of time to return the portion of the requested data to the client device from the local storage is less than the amount of time to return the portion of the requested data from the data source system to the client device.

10. The method of claim 1, further comprising:

determining whether the modified query can be satisfied by the data source system;

sending the modified query to the data source system if it is determined that the modified query can be satisfied.

11. A system comprising:

an interceptor engine configured to intercept a query that includes a request for requested data from a client device;

a pre-fetch engine configured to determine whether a portion of the requested data is stored in a local storage on a same local area network as the client device and return the portion of the requested data to the client device from the local storage if it is determined that the portion of the requested data is stored locally;

a modification engine configured to:

modify the query based on the portion of the requested data returned to the client device to not include a request for the portion of the requested data;

send the modified query to a data source system coupled to the client device through a wide area network;

forward a remaining portion of the requested data in response to the modified query from the data source system to the client device.

12. The system of claim 10, wherein the pre-fetch engine is further configured to store the remaining portion of the requested data from the data source in the local storage on the same local area network as the client device.

13. The system of claim 10, wherein the pre-fetch engine is further configured to:

determine whether a locally stored query is the same as the intercepted query;

determine whether the locally stored query is still valid;

return the portion of the requested data to the client device if it is determined that the locally stored query is still valid.

14. The system of claim 13, wherein determining whether the locally stored query is still valid includes determining that the local stored query is invalid when rows have been added to, deleted from, or modified in a locally stored response to the locally stored query.

15. The system of claim 10, further comprising:

the modification engine further configured to apply a compression rule to determine whether to compress at least a portion of the modified query before sending it to the data source system;

a compression and encryption engine configured to compress the at least a portion of the modified query before sending it to the data source system if it is determined to compress the modified query.

16. The system of claim 15, wherein in applying the compression rule the modification engine is further configured to:

determine an upstream network speed between the data source system;

determine an amount of time it takes to compress and decompress packets in the modified query;

determine a ration of compression size over processing cost from the determined upstream network speed and the determined amount of time it take to compress and decompress packets in the modified query;

determine a size of the portion of the modified query to compress based on the ratio of compression size to processing cost in compressing the portion of the modified query.

17. The system of claim 10, wherein the modification engine is further configured to:

determine whether the modified query includes a common string;

replace the common string with a specific identification to reduce the size of the modified query, the specific identification uniquely associated with the common string and being of a smaller size than the common string.

18. The system of claim 10, wherein the modification engine is further configured to:

apply pre-validating rules to the modified query to determine whether the modified query includes errors that will prevent the data source system from processing the modified query;

remove the errors form the modified query before sending the modified query to the data source system.

19. The system of claim 10, wherein the pre-fetch engine is further configured to:

determine an amount of time to return the portion of the requested data to the client device from the local storage;

determine an amount of time to return the portion of the requested data from the data source system to the client device;

return the portion of the requested data to the client device from the local storage if the amount of time to return the portion of the requested data to the client device from the local storage is less than the amount of time to return the portion of the requested data from the data source system to the client device.

20. A system comprising:

means for intercepting a query that includes a request for requested data from a client device;

means for determining whether a portion of the requested data is stored in a local storage on a same local area network as the client device;

if it is determined that the portion of the requested data is stored locally:

means for returning the portion of the requested data to the client device from the local storage;

means for modifying the query based on the portion of the requested data to not include a request for the portion of the requested data;

means for sending the modified query to a data source system coupled to the client device through a wide area network;

means for forwarding a remaining portion of the requested data in response to the modified query from the data source system to the client device.