METHOD OF REFORMATTING WEB PAGE AND METHOD OF PROVIDING WEB
PAGE USING THE SAME
Technical Field
The present invention relates to a web page re-formatter that is capable of extracting necessary items from a web page based on hypertext markup language to allow them to be displayed on a terminal, and a method of reformatting a web page requested by a terminal and transmitting the reformatted web page to the terminal.
Background Art
Nowadays, services that provide information in the form of web pages through the Internet have become popular. A web page is stored on a server, requested by a client, and displayed in the client's browser. In the past, most clients were personal computers (PCs), so that the browser was capable of displaying all the information of the web page. Recently, however, many clients are terminals with various sizes of screens (hereinafter referred to as "terminals"), such as mobile phones, personal digital assistants (PDAs), Internet TVs, smart phones and web screen phones, as well as PCs, so that the functions of their browsers are limited.
In the terminals, the sizes of the display windows are limited, so that it is difficult to display an entire web page in each window. Additionally, since a terminal cannot be provided with many input keys, web pages have to be browsed using a limited number of input keys. Furthermore, the terminals are connected to servers via wireless media, so that the amount of data, the data transmission rate, and connection charges are matters of great concern. As a result, in order to display a web page on a terminal, the web page stored in the server should be edited and then sent to the terminal. This can be performed through a function called page reformatting or information extraction.
This function serves to extract only information selected by a user from contents in a general web page and make a new page, and is required because the sizes of the display windows of the terminals are limited.
Disclosure of the Invention
An object of the invention is to provide a web page reformatting method that is capable of extracting necessary items from a web page based on hypertext markup language (HTML) to allow them to be displayed on a terminal, and a page providing method of reformatting a web page requested by a terminal and transmitting the reformatted page to the terminal.
Fig. 1 is a schematic diagram showing the typical connection between a terminal and a real server through the Internet. A call signal from the terminal is sent to the real server 18 on the Internet through an exchange net 12, an inter-working function (IWF) 14 and a proxy server 16. The proxy server acts as a proxy for the terminal and adapts data to fit the handling ability and transmission capacity of the terminal. When the terminal is connected to the real server and requests a web page, the HTML documents in the real server, which are written for a PC browser, must be converted into documents for the terminal browser. The size of the display window of each terminal, as explained in conjunction with the prior art, is limited, so only necessary items (in most cases, text) should be extracted. The invention relates to a web page re-formatter that is capable of extracting necessary items from a web page based on HTML to allow them to be displayed on a terminal, and a method of reformatting a web page requested by a terminal and transmitting the reformatted page to the terminal.
Further explanation of the invention follows in the "Best Modes for Carrying out the Invention" section.
Brief Description of the Drawings
The above and other objects, features and other advantages of the present invention will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram showing the typical connection between a terminal and a real server through the Internet;
Fig. 2 is a view showing an original HTML document to be reformatted and a tree structure in which the document is analyzed;
Fig. 3 is a structural diagram showing a tree of a menu table in detail;
Figs. 4 to 9 are tree structure diagrams showing the operation of a web page re-formatter;
Fig. 10 is a diagram showing a system for providing a page using the web page reformatting method;
Fig. 11 is a flowchart of a method using the system of Fig. 10;
Fig. 12 is a structural diagram of a configuration database;
Fig. 13 is a diagram showing another system for providing a page using the web page reformatting method; and
Fig. 14 is a flowchart of a method using the system of Fig. 13.
Best Modes for Carrying out the Invention
Method of reformatting page
Fig. 2 is a diagram showing the screen of a web page re-formatter in accordance with the present invention. The web page re-formatter is named a "Node Extractor". The screen is generally comprised of a view section 20 for showing a web page, a tree view section 22 for showing a tree structure of the web page, and an extraction view section 24 for showing items selected and extracted from the web page.
The view section 20 has an address section 26 for inputting an Internet uniform resource locator (URL), so that a URL is inputted in the address section 26 in order to move to a desired site. In Fig. 2, for example, "http://www.yahoo.com" has been inputted in the address section 26. When the site corresponding to the inputted URL is opened, the web page is displayed in the web page view section 20 and the tree structure of the web page is displayed in the tree view section 22.
When a desired item in the web page view section 20 is dragged, the position of the selected node is activated in the tree view section 22. That is, as shown in Fig. 2, if the text "Yahoo! Auctions" in the web page view section 20 is dragged and dropped into the tree view section 22, the check box in front of the corresponding item title is checked. In this case, if the selected node is a leaf node when the dragged item is dropped onto the tree view section 22, the check boxes of the selected node and its parent nodes are checked. If the selected node is not a leaf node, the check boxes of its child nodes are checked as well as the check boxes of the selected node and its parent nodes. Items that can be dragged in the web page view section 20 may be a text, an image or an input. The text can be dragged because it has a link to the tree, and the image can also be dragged because a clover image in front of the image represents its position in the tree. The input can also be dragged because a clover image in an input box (or hidden input tag) represents its position in the tree, like the image. As described above, if check boxes in the tree view section 22 are checked, the contents of the selected items are displayed in the extraction view section 24. That is, the extraction of the node is completed, so the original web page is reformatted and outputted in the extraction view section 24. In addition, if the checks in the check boxes are removed, the corresponding items shown in the extraction view section 24 disappear at once.
The above explanation describes a method of reformatting a page by extracting necessary items from the original web page using the re-formatter. Hereinafter, a method of implementing the web page re-formatter will be explained.
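The check rule described above for items dropped onto the tree view section can be sketched as follows. This is only an illustrative sketch and not the Node Extractor's actual code: the names `Node`, `drop_on_tree` and `extracted_labels` are hypothetical, and "child nodes" is assumed here to mean all descendants of the selected node.

```python
from typing import List, Optional


class Node:
    """One entry of the tree view; `checked` mirrors its check box."""

    def __init__(self, label: str, parent: Optional["Node"] = None):
        self.label = label
        self.parent = parent
        self.children: List["Node"] = []
        self.checked = False

    def add(self, label: str) -> "Node":
        child = Node(label, parent=self)
        self.children.append(child)
        return child


def check_descendants(node: Node) -> None:
    """Check every node below `node` (does nothing for a leaf)."""
    for child in node.children:
        child.checked = True
        check_descendants(child)


def drop_on_tree(node: Node) -> None:
    """Apply the check rule for an item dropped onto the tree view."""
    node.checked = True
    # A non-leaf selection also checks the nodes beneath it
    # (assumption: all descendants, not only direct children).
    check_descendants(node)
    # The parent nodes up to the root are checked as well.
    ancestor = node.parent
    while ancestor is not None:
        ancestor.checked = True
        ancestor = ancestor.parent


def extracted_labels(root: Node) -> List[str]:
    """Labels of checked nodes in document order (the extraction view)."""
    result = [root.label] if root.checked else []
    for child in root.children:
        result.extend(extracted_labels(child))
    return result
```

Under these assumptions, dropping a leaf such as "Yahoo! Auctions" checks only that leaf and the path above it, while dropping its enclosing table would also check every sibling item inside the table.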
As shown in Fig. 3, in order to reformat an original HTML document, first of all, the document is analyzed and then represented in a tree format. Fig. 3 is a view showing the first step of reformatting the homepage of a newspaper company "OO Daily" into a page consisting of necessary items. In Fig. 3, the position of a menu table 28 having "OO Daily" at the top of the web page can be found in the tree structure. At the bottom of the menu table 28, the items "politics | economics | social | international | culture | sports | IT | metropolitan area | national | cartoons | TV" are arranged. The menu table 28 (as in usual HTML home pages) is constructed in hypertext form so that a corresponding page is accessed when each item is selected. Fig. 4 is a view of the tree structure showing in detail the menu table 28 having "OO Daily". In the tree structure of Fig. 4, the position of each node can be represented by enumerating numerals, each of which indicates which child of its parent node the node is. For example, the position of the HTML node situated at the uppermost position is represented as "1". That is, the number "1" means the first child node of the root node. By the same principle, "Text=Politic" can be represented as "1+2+1+2+1+2+1+1".
However, it is difficult to find the position of a certain node by the above method, so a utility program called "Node Finder" is necessary. The node finder analyzes the above tree, and gives results as follows.
[1] <HTML>
[1+1] <HEAD>
[1+1+1] <TITLE>
[1+1+1+1] OO Daily
</TITLE>
[1+1+2] <META >
[1+1+3] <SCRIPT >....</SCRIPT>
</HEAD>
[1+2] <BODY>
[1+2+1] <TABLE>
[1+2+1+1] <TR>
[1+2+1+1+1] <IMG SRC="abc.gif">
[1+2+1+1+2] <IMG SRC="cde.gif">
That is, the Node Finder puts numbers representing the positions of all the nodes to the front of corresponding nodes. Accordingly, a unique ID is assigned to each node.
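The numbering performed by the Node Finder can be sketched as a pre-order walk that extends each parent's ID with the child's 1-based index. The sketch below is illustrative only: it works on an in-memory tree rather than on parsed HTML, and the names `Node`, `assign_ids` and `find_node` are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple


class Node:
    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self.children = children or []


def assign_ids(node: Node, node_id: str = "1") -> Iterator[Tuple[str, Node]]:
    """Yield (ID, node) pairs in document order.

    The top node is "1" (first child of the root); the k-th child of a
    node with ID p receives the ID "p+k".
    """
    yield node_id, node
    for index, child in enumerate(node.children, start=1):
        yield from assign_ids(child, f"{node_id}+{index}")


def find_node(top: Node, node_id: str) -> Node:
    """Follow an ID such as "1+2+1" from the top node down to its node."""
    steps = [int(part) for part in node_id.split("+")]
    if steps[0] != 1:
        raise KeyError(node_id)
    node = top
    for step in steps[1:]:
        if not 1 <= step <= len(node.children):
            raise KeyError(node_id)
        node = node.children[step - 1]
    return node


# The tree of the listing above, rebuilt by hand for illustration.
page = Node("HTML", [
    Node("HEAD", [
        Node("TITLE", [Node("OO Daily")]),
        Node("META"),
        Node("SCRIPT"),
    ]),
    Node("BODY", [
        Node("TABLE", [
            Node("TR", [Node('IMG SRC="abc.gif"'), Node('IMG SRC="cde.gif"')]),
        ]),
    ]),
])

for node_id, node in assign_ids(page):
    print(f"[{node_id}] {node.name}")
```

Because each ID encodes the whole path from the root, the IDs of a node's parent nodes are simply the prefixes of its own ID.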
If the home page of Fig. 3 is desired to be displayed on the terminal as shown in Fig. 5, only the necessary nodes are first indicated in the tree structure as shown in Fig. 6 (nodes indicated in the filled box in Fig. 6). Thereafter, as shown in Fig. 7, all the parent nodes of the indicated nodes are also indicated. Then, as shown in Fig. 8, if the unselected nodes are deleted, a new tree having only the necessary parts is completed. When the tree is arranged and serialized, a desired document arranged as shown in Fig. 9 is obtained. The document contains only the necessary extracted items, so it can be transmitted and provided to the terminal.
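The sequence of Figs. 6 to 9 (indicate the necessary nodes, indicate their parent nodes, delete the rest, serialize) can be sketched as follows. This is a minimal sketch under stated assumptions: nodes are identified by the "1+2+..." position IDs described earlier, the names `Node`, `with_ancestors`, `prune` and `serialize` are hypothetical, the sample page is invented for illustration, and the serializer ignores attributes.

```python
from typing import List, Optional, Set


class Node:
    def __init__(self, tag: str, text: str = "",
                 children: Optional[List["Node"]] = None):
        self.tag = tag            # element name, or "#text" for a text node
        self.text = text
        self.children = children or []


def with_ancestors(selected: Set[str]) -> Set[str]:
    """Fig. 7: add every parent ID, e.g. "1+2+1" also keeps "1+2" and "1"."""
    keep: Set[str] = set()
    for node_id in selected:
        parts = node_id.split("+")
        for end in range(1, len(parts) + 1):
            keep.add("+".join(parts[:end]))
    return keep


def prune(node: Node, keep: Set[str], node_id: str = "1") -> Optional[Node]:
    """Fig. 8: return a copy holding only the kept nodes, or None."""
    if node_id not in keep:
        return None
    kept_children = []
    for index, child in enumerate(node.children, start=1):
        copy = prune(child, keep, f"{node_id}+{index}")
        if copy is not None:
            kept_children.append(copy)
    return Node(node.tag, node.text, kept_children)


def serialize(node: Node) -> str:
    """Fig. 9: turn the pruned tree back into a document string."""
    if node.tag == "#text":
        return node.text
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


# Invented sample page: a title, a two-item menu row and an advertisement.
page = Node("HTML", children=[
    Node("HEAD", children=[Node("TITLE", children=[Node("#text", "OO Daily")])]),
    Node("BODY", children=[
        Node("TABLE", children=[Node("TR", children=[
            Node("TD", children=[Node("#text", "politics")]),
            Node("TD", children=[Node("#text", "economics")]),
        ])]),
        Node("P", children=[Node("#text", "advertisement")]),
    ]),
])

selected = {"1+2+1+1+1+1"}  # the "politics" text node
reformatted = prune(page, with_ancestors(selected))
print(serialize(reformatted))
```

In this sketch only explicitly selected nodes and their parent nodes survive, so a container is kept as an empty shell unless the nodes inside it are selected too.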
Method of providing page using web page re-formatter
With reference to Fig. 10 and the subsequent figures, a method of providing a page using the above web page re-formatter is described. Figs. 10 to 12 describe an embodiment in which a proxy server is placed between a terminal and a real server and page reformatting is performed in the proxy server. Figs. 13 and 14 describe another embodiment, in which a virtual server and a real server are situated on the Internet without a proxy server, and reformatting is performed in the virtual server so that a desired web page can be obtained at the terminal even though no proxy setting is made.
First embodiment
Referring to Fig. 10, the structure and operation of a system in accordance with a first embodiment of the present invention are described.
(1) A setting terminal receives an original web page from a real server through the Internet, and selects items to be extracted from the original web page. For example, the items of Fig. 9 are extracted from the "OO Daily" web page in Fig. 3. (2) The results of the selection by the setting terminal are saved in the proxy server as a configuration file (*.cfg). The proxy server has a configuration database (CFG DB), so that the configuration file from the setting terminal is saved in the database. (3) When a user connects to the Internet and requests a web page configured as above, (4) the proxy server receives the request and forwards it to the real server. (5) The real server searches for the web page requested by the user, and transmits it to the proxy server. (6) The proxy server retrieves the configuration file of the corresponding page from the CFG DB, and converts the page (that is, the server reformats the page according to Figs. 3 to 9). The converted page is transmitted to the terminal.
With reference to Fig. 11, a process of reformatting a web page according to the embodiment is described. Items to be extracted from a web page of a certain site are selected at the setting terminal [100]. Then, a configuration file created according to the result of the selection is sent to and saved in the proxy server [102]. The configuration file is saved in the CFG DB of the proxy server [200] in various forms according to web sites and types of terminals as shown in Fig. 12. This is because various kinds of terminals exist nowadays, and the owner of each kind of terminal should be able to see a given web page on that terminal.
When a user requests a page through the user terminal after the configuration file is saved at the CFG DB of the proxy server [202], the proxy server transmits a request for the corresponding page to the real server [204]. The real server formats the requested page [300], and provides it to the proxy server [302]. Then, the proxy server searches the CFG DB for a configuration file regarding the corresponding page [206]. If the server has the configuration file [210], the server reformats the page according to the configuration file [212] and transmits the page to the user [214]. If the server does not have the configuration file [211], the server converts the requested page into one that can be displayed on the terminal, and transmits it to the user terminal [304].
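The branch of Fig. 11 described above can be sketched as a single request handler. This is only an illustration under stated assumptions: the CFG DB is modeled as a dictionary keyed by (page URL, terminal type), a configuration is modeled as a list of selected node IDs, and `handle_request`, `fetch_page`, `reformat` and `convert_default` are hypothetical names, with the last three supplied by the caller.

```python
from typing import Callable, Dict, List, Tuple

Config = List[str]                        # assumed: IDs of the selected nodes
ConfigDB = Dict[Tuple[str, str], Config]  # assumed key: (page URL, terminal type)


def handle_request(
    url: str,
    terminal_type: str,
    cfg_db: ConfigDB,
    fetch_page: Callable[[str], str],
    reformat: Callable[[str, Config], str],
    convert_default: Callable[[str], str],
) -> str:
    """Return the page that the proxy server sends back to the terminal."""
    # [204], [300], [302]: request the page from the real server and receive it.
    page = fetch_page(url)
    # [206]: search the CFG DB for a configuration file for this page.
    config = cfg_db.get((url, terminal_type))
    if config is not None:
        # [210], [212], [214]: reformat according to the configuration file.
        return reformat(page, config)
    # [211], [304]: no configuration file, so fall back to a generic
    # conversion into a form the terminal can display.
    return convert_default(page)
```

Passing the fetch, reformat and fallback steps in as callables keeps the sketch independent of any particular network library or reformatting implementation.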
In this embodiment, when a user requests a web page, the request is transmitted to a real server through a proxy server, and the proxy server reformats the web page received from the real server according to a configuration file and transmits the page to the terminal of the user.
Second Embodiment
Referring to Fig. 13, the structure and operation of another system in accordance with a second embodiment of the present invention are described.
(1) A setting terminal receives an original web page from a real server through the Internet and selects items to be extracted from the original web page. (2) The result of the selection by the setting terminal is saved in the virtual server as a configuration file (*.cfg). The virtual server is established on the Internet together with the real server. The virtual server has a configuration database (CFG DB), so that the configuration file from the setting terminal is saved in the database. (3) When a user connects to the Internet to access the virtual server, (4) the virtual server requests the web page from the real server. The real server retrieves the web page requested by the virtual server and responds to the request. (5) The virtual server retrieves the configuration file of the corresponding page from the CFG DB, and converts the page (that is, the server reformats the page according to Figs. 3 to 9). (6) The converted page is transmitted to the terminal.
With reference to Fig. 14, a process of reformatting a web page according to the second embodiment is described. Items to be extracted from a web page of a certain site are selected at the setting terminal [104]. Then, a configuration file created according to the result of the selection is sent to and saved in the virtual server [216]. The configuration file is saved in the CFG DB of the virtual server [216] in various forms according to types of terminals, somewhat differently from Fig. 12. In the first embodiment, configuration files are saved in various forms according to web sites as well as types of terminals, while in the second embodiment, configuration files are saved in various forms according to only the types of terminals, because the user accesses web sites by connecting to the real server for himself.
When a user connects to a page through the terminal after the configuration file is saved at the CFG DB of the virtual server [218], the virtual server transmits a request for the corresponding page to the real server [220]. The real server constructs the requested page and provides it to the virtual server [306]. Then, the virtual server searches the CFG DB for a configuration file regarding the corresponding page, and when the virtual server has the configuration file, the server reformats the page according to the configuration file [222] and transmits the page to the user [224].
From the foregoing, the invention serves to extract only selected information from the items of a general web page, reformat the page into a new page and provide it to a user. Accordingly, Internet browsing on terminals, which has been limited due to the relatively small sizes of their display windows, can become widespread, and an Internet service can be provided at a rapid speed (and accordingly at a low price).