METHOD FOR STATE PRESERVATION IN HTTP-BASED COMMUNICATIONS
TECHNICAL FIELD
This invention relates to computer communication in a client-server environment under a stateless protocol. More specifically, the invention relates to a method for client state preservation when communicating using the HyperText Transfer Protocol.
BACKGROUND OF THE INVENTION
The HyperText Transfer Protocol ("HTTP") is commonly used on the Internet for requesting and sending documents. Typically, a World Wide Web ("WWW") client browser connects to a server and requests a document using HTTP. HTTP for the WWW is, of course, well known. Its specification is available online at ftp://ftp.isi.edu/in-notes/rfc2616.txt ("HTTP specification"). The specification is hereby incorporated by reference as if fully set forth herein.
HTTP is a stateless protocol. By stateless we mean that a client does not store information regarding a completed information exchange with a server, and therefore does not provide the information to the server during a subsequent request for information. But often it is desirable for the server to have client state information.
For example, it may be desirable to identify the client and to provide client-specific information in response to a request. If the client previously identified him- or herself during a particular session, it would be truly annoying to the client to keep submitting identifying information with each screen. More insidiously, user identification can be used for delivery of targeted advertising and for updating user profiles through tracking of the sites visited by the user.
It may also be desirable to preserve state information in addition to client ID. Thus, because of intermediate caching, the client may be requesting documents while viewing a page other than the last page sent to the client by the server, and a response to the client's request may be page-dependent.
Two of the better known state preservation techniques are "cookies" and Uniform Resource Locator ("URL") rewriting. Cookies are the subject of U.S. Patent Number 5,774,670 to Montulli, assigned to Netscape Communications Corporation (the "Netscape patent"). URL rewriting is the subject of U.S. Patent Number 5,961,601 to Iyengar, assigned to International Business Machines Corporation (the "IBM patent").
Briefly, according to the method of the Netscape patent, the server sends a small file - a cookie - to the client to be stored locally by the client. The cookie is then sent by the client to the server with each subsequent request. The disclosure of the Netscape patent is hereby incorporated by reference as if fully set forth herein. A description of URLs can be found in http://www.rfc-editor.org/rfc/rfc1738.txt ("URL specification"), which document is hereby incorporated by reference as if fully set forth herein.
Using cookies has the disadvantage of requiring user permission for local access; in other words, cookies may be disabled. Another disadvantage of cookies is that they do not appear in the server's log files. And, as noted above, the method has been patented and is therefore unavailable or expensive to use.
According to the method of the IBM patent, when the server receives a request for a particular page, it creates a new page containing all the information of the requested page, and redirects the client to the new page. State information is embedded in the hyperlinks in the new page. When the user clicks on a hyperlink, the client's browser automatically transmits the state information to the server. For example, if the user visits www.amazon.com, the user will be redirected to a URL similar to this: http://www.amazon.com/exec/obidos/subst/home.html/102-7545796-2745608. The trailing numbers carry state information. Viewing the page's source code reveals that all URLs have been rewritten to contain the state information.
The disclosure of the IBM patent is hereby incorporated by reference as if fully set forth herein, including of course the Glossary; but the term "conversation" as used herein has the following meaning: A sequence of communications between a client and server in which the server sends regular or terminal responses to the client's requests; a regular response includes one or more continuations; a terminal response includes no continuations; and the client selects each response from continuations received by the client from the server in the course of the sequence of communications.
The major disadvantage of the method of the IBM patent is that the server must create a brand new page with every request of the client. This takes time and resources, degrading performance. And the method is also patented.
OBJECT OF THE INVENTION
One object of this invention is to provide a new method for preserving client state information during HTTP-based communications. Another object of the invention is to provide a method that is faster than URL rewriting and requires less computational resources.
Yet another object is to create a mechanism for storing state information in server log files, enabling generation of reports on user behavior during a particular session.
SUMMARY OF THE INVENTION
According to the method of this invention, when a client contacts a server, the server redirects the client to the same page with the state information encoded in the URL of the redirected page. Every request made by following a hyperlink of the redirected page will then automatically carry the state information identifying the session in the HTTP Referrer header of the request to the server associated with the link.
If all the links of a page are relative to the virtual state root node, i.e., if the links point only to other pages with the same domain name, the redirection can be performed only once, to a page with an assigned state identifier encoded in the URL at the root. After the initial redirection, all requests to the server will carry the state identifier, which should be recognized and stripped by a special servlet on the server before serving the static files requested by the client.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a non-limiting illustration of the "HTTP Referrer" method. Figure 2 is a non-limiting illustration of the "URL Encoding" method.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT OF THE INVENTION
1. HTTP REFERRER
With this approach, the server maintains state information by encoding it in specially created URLs, and examining the standard HTTP Referrer request header fields of all requests from clients to identify the specific URL included in each request's header.
In the Referrer field, the client can specify the address of the resource that supplied the address of the service requested by the client from the server. The HTTP Referrer is described in more detail in section 14.36 of the HTTP specification.
Initially, a user/client attempts to visit a server's site by sending the server the user's first request for service. In the first request, the Referrer header does not contain any state information, i.e., the header does not contain one of the URLs previously created by the server and recognizable by the server from the state information encoded in them. For this reason, the server treats the request as the beginning of a new conversation, creating a state object to identify the conversation. The server then redirects the client to the same page, but with state information encoded in the page's URL. For example, if the user attempts to view http://www.softcom.com/index.htm, the user may be redirected to http://www.softcom.com/index.htm?state=3476. This is a static page whose URL contains state information. By static we mean that none of the information of the original index.htm page, including hyperlinks, has changed. Only the URL has changed.
Requests for all links and other embedded content referenced by this page will include an HTTP Referrer header containing the referring URL - http://www.softcom.com/index.htm?state=3476. The state information of the client can be obtained by parsing this URL and then used to construct a new URL that the user is then redirected to. Thus, if the user is viewing http://www.softcom.com/index.htm?state=3476 that, for example, contains a link such as <A HREF="foo.htm">, and the user clicks on that link, the browser will employ the HTTP GET method on http://www.softcom.com/foo.htm, with the HTTP Referrer header set to http://www.softcom.com/index.htm?state=3476. The server will then construct a new URL using this state and redirect the browser to, for example, http://www.softcom.com/foo.htm?state=3476, maintaining the session's state. Alternatively, the server will construct a new URL with a new, unique state assigned to it, for example 3477, to track both the session and the particular place within the session. Subsequent requests from the client to the server made in the course of the same conversation will be treated similarly: the client's state will be identified from the HTTP Referrer field, and the client will be redirected to the requested page at a URL encoded with state information.
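The request handling described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names conversations, state_of, with_state, and handle are hypothetical, state identifiers are assumed to be allocated sequentially starting at 3476, and the state is assumed to travel in a query parameter named state, as in the example URLs above. Note that on the wire the HTTP specification spells the header name "Referer".

```python
import itertools
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical allocator; starts at the example value used in the text.
_state_ids = itertools.count(3476)
# Hypothetical store: state identifier -> per-conversation state object.
conversations = {}


def state_of(url):
    """Return the state identifier encoded in the URL's query, or None."""
    if not url:
        return None
    values = parse_qs(urlsplit(url).query).get("state")
    return values[0] if values else None


def with_state(url, state):
    """Return the same URL with state=<state> set in its query string."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["state"] = [state]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def handle(url, referer=None):
    """Decide how to answer a request: ("serve", url) or ("redirect", url)."""
    # The requested URL already carries a known state: serve the static page.
    if state_of(url) in conversations:
        return ("serve", url)
    # Otherwise try to recover the state from the Referer header.
    state = state_of(referer)
    if state not in conversations:
        # No recognizable state anywhere: a new conversation begins.
        state = str(next(_state_ids))
        conversations[state] = {}
    # Redirect to the requested page at a URL encoded with the state.
    return ("redirect", with_state(url, state))
```

In this sketch, a first call to handle("http://www.softcom.com/index.htm") creates a conversation and yields a redirect to the state-encoded URL; a later request for foo.htm whose Referer is that URL is redirected to foo.htm with the same state. The alternative of assigning a fresh state per page (for example 3477) would replace the reuse of the recovered state with a new allocation linked to the same conversation.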
2. URL ENCODING
The HTTP Referrer approach described in the preceding subsection requires one HTTP redirect per request. The approach described in this subsection requires one redirect per conversation, but works only if all hyperlinks are relative to the root, i.e., relative to the root path element. (By "root" I mean "base URL"; relative links - URLs - generally are links that reside on the same server; both "base URL" and "relative URL" concepts are discussed in http://www.ietf.org/rfc/rfc2396.txt, which document is hereby incorporated by reference as if fully set forth herein.) With this approach, the state (session ID) is encoded in each URL at the root so that the browser automatically maintains it in the root path element.
Suppose, as before, that the client/user initially goes to http://www.softcom.com/index.htm. The server assigns a state of "3476" to the session and redirects the client to http://www.softcom.com/3476/index.htm. This is the only redirect taken during the session. Note that the session ID (3476) is now encoded at the root of the URL. If this page contains a link to foo.htm, for example, the browser will request http://www.softcom.com/3476/foo.htm, automatically encoding the state information in its request. A special file, a "servlet," runs on the server to strip the state portion encoded in the path and to serve the static files to the client. This approach works only for relative URLs; if the URLs are absolute, then each would have to be rewritten, defeating an important object of the invention.
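The stripping step performed by the servlet might look like the sketch below. It is an illustration under assumptions rather than the servlet itself: the names split_state and route are hypothetical, state identifiers are assumed to be purely numeric, and the caller is assumed to supply a fresh identifier for new sessions. The call to urljoin from the Python standard library only imitates how a browser resolves the relative link foo.htm against the state-encoded page URL.

```python
import re
from urllib.parse import urljoin

# Assumption: the state is a run of digits forming the first path element.
# A real static directory with an all-digit name would be misread as a state.
_STATE_PREFIX = re.compile(r"^/(\d+)(/.*)$")


def split_state(path):
    """Split '/3476/foo.htm' into ('3476', '/foo.htm').

    Returns (None, path) when the path carries no state prefix.
    """
    match = _STATE_PREFIX.match(path)
    if match:
        return match.group(1), match.group(2)
    return None, path


def route(path, new_state):
    """Return (action, path, state) for a requested path.

    A path without state gets the single redirect of the session;
    a path with state is stripped and the static file is served.
    """
    state, file_path = split_state(path)
    if state is None:
        return ("redirect", "/" + new_state + path, new_state)
    return ("serve", file_path, state)


# How a browser would resolve the relative link on the redirected page.
resolved = urljoin("http://www.softcom.com/3476/index.htm", "foo.htm")
```

Here route("/index.htm", "3476") yields the one redirect to /3476/index.htm, and every later request, such as /3476/foo.htm, is served from /foo.htm with the state recovered from the path.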
The HTTP Referrer and the URL Encoding methods described above encode state information within the URLs accessed. Web servers often record these URLs in their log files. Advantageously, both methods thus allow existing mechanisms to record user behavior during a particular session.
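As a rough illustration of how such log records could be turned into a per-session report, the sketch below groups requested pages by the state identifier found in each logged URL. Everything in it is an assumption made for illustration: the log lines are taken to contain a quoted request line of the form "GET <url> HTTP/1.0", the state identifiers are taken to be numeric, and the names session_and_page and pages_by_session are hypothetical.

```python
import re
from collections import defaultdict

# Assumed log layout: each line contains a quoted request line.
_REQUEST = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')
# State encoded at the root of the path (URL Encoding method).
_PATH_STATE = re.compile(r"^/(\d+)(/[^?]*)")
# State encoded in the query string (HTTP Referrer method).
_QUERY_STATE = re.compile(r"^([^?]*)\?(?:.*&)?state=(\d+)")


def session_and_page(target):
    """Return (state, page) for a logged request target, state may be None."""
    match = _PATH_STATE.match(target)
    if match:
        return match.group(1), match.group(2)
    match = _QUERY_STATE.match(target)
    if match:
        return match.group(2), match.group(1)
    return None, target


def pages_by_session(log_lines):
    """Map each state identifier to the pages requested under it, in order."""
    sessions = defaultdict(list)
    for line in log_lines:
        request = _REQUEST.search(line)
        if not request:
            continue  # not a request record in the assumed layout
        state, page = session_and_page(request.group(1))
        if state is not None:
            sessions[state].append(page)
    return dict(sessions)
```

Given log lines for /3476/index.htm, /index.htm?state=3477, and /3476/foo.htm, the function would report the first and third pages under session 3476 and the second under session 3477.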
The inventive methods are described in this specification in a general manner. Those skilled in the art will be able to devise various modifications that, although not explicitly described or shown herein, embody the principles of the invention and are thus within its spirit and scope.