XML-DRIVEN AUTOMATED SELF-RECOVERY FILE DELIVERY SYSTEM FOR A DISTRIBUTED COMPUTER NETWORK
RELATED APPLICATIONS
[0001] The present invention claims priority from United States Provisional Application No. 60/297,245 filed June 12, 2001.
BACKGROUND OF THE INVENTION
[0002] The invention relates to a centralized file and/or content distribution scheme that uses meta data, e.g. meta data based on a markup language such as Extensible Markup Language (XML). The meta data may be used as descriptive tagging useful for data network delivery tasks such as file transmission and retransmission processing. This novel use of meta data tagging capability allows a centralized file delivery system to optimize its file self-recovery ability in dealing with file transmission errors, especially where a single centralized file delivery system serves multiple users/customers and multiple file delivery applications/jobs.
[0003] As more users use data networks such as the Internet to accomplish delivery of files and content, prioritization of data transmission and retransmission of dropped or faulty data become increasingly complex. Moreover, efficient use of bandwidth in light of file delivery and network management has also become increasingly complex.
[0004] In many systems, data are transmitted in a multicast manner, i.e. simultaneously to numerous receivers. Packet loss occurs at a non-predictable rate. Most commonly, when receivers fail to receive all of the data packets required, systems utilize a reservation of bandwidth, e.g. forward error correction. In a worst case scenario, an entire file may need to be retransmitted due to a lack of knowledge of the status of file receipt by a receiver. This is aggravated in a multicast environment.
[0005] There is a need to automate a self-recovery system for data transmission which takes into account factors outside the mere transmission of the data and incorporates business rules based decision making.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] These and other features, aspects, and advantages of the present invention will become more fully apparent from the following description, appended claims, and accompanying drawings in which:
[0007] Fig. 1 is a schematic of a system for providing a self-recovery file delivery system for a distributed computer network;
[0008] Fig. 2 is a schematic overview of system components; and
[0009] Fig. 3 is a flowchart of a method of a preferred embodiment.
DETAILED DESCRIPTION OF AN EXEMPLARY EMBODIMENT
[0010] In general, throughout this description, if an item is described as implemented in software, it can equally well be implemented as hardware.
[0011] Referring to Fig. 1, a schematic of a system for providing a self-recovery file delivery system for a distributed computer network, system 10 comprises data network 100; source 20 (generally referred to by the number "20" and shown as 20a and 20b in Fig. 1) of one or more files 22 to be delivered to one or more receivers 30; network operation center 40; and file recovery software 50 executing at least partially in network operation center 40. As used herein, "distributed computer network" and "data network" are equivalent.
[0012] Source 20, receiver 30, and network operation center 40 are all operatively in communication with data network 100. Network operation center 40 is logically disposed intermediate source 20 and receiver 30 and may be used to receive and store files 22 from source 20 and subsequently distribute files 22 to one or more receivers 30.
[0013] Referring additionally to Fig. 2, a schematic overview of system components, in an exemplary embodiment, system 10 further includes file load list 23. File load list 23 contains interrogatable, rule-based file delivery meta data for use by network operation center 40 (Fig. 1) when delivering file 22. In a preferred embodiment, load list 23 and rule-based meta data related to delivery of files 22 are in an XML format, although other formats may be used as well, e.g. hypertext markup language (HTML), Resource Description Framework (RDF), Open Software Distribution (OSD), and the like, or combinations thereof.
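By way of illustration only, and not limitation, an XML formatted load list 23 and its interrogation might be sketched as follows. The element names, attribute names, and the parse_load_list function are hypothetical and are assumed solely for this example; the description above does not prescribe any particular XML vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical load list; every element and attribute name is illustrative only.
LOAD_LIST_XML = """
<loadList jobId="job-001" owner="supplier-20a">
  <priority>2</priority>
  <qualityOfService>guaranteed</qualityOfService>
  <file name="catalog.dat"/>
  <receivers>
    <receiver id="rx-1"/>
    <receiver id="rx-2"/>
  </receivers>
</loadList>
"""


def parse_load_list(xml_text):
    """Extract delivery meta data from an XML load list into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "job_id": root.get("jobId"),
        "owner": root.get("owner"),
        "priority": int(root.findtext("priority")),
        "qos": root.findtext("qualityOfService"),
        "files": [f.get("name") for f in root.findall("file")],
        "receivers": [r.get("id") for r in root.findall("receivers/receiver")],
    }


job = parse_load_list(LOAD_LIST_XML)
```

In this sketch the resulting dictionary simply stands in for whatever internal representation a network operation center might use once the load list has been parsed.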
[0014] In a preferred embodiment, job processor 24 preprocesses file delivery load list 23 by parsing the XML in file delivery load list 23.
[0015] Network operation center 40 (Fig. 1) further comprises schema repository 44. In an embodiment, schema repository 44 comprises data associatable with a plurality of sources 20, a plurality of receivers 30, or a combination thereof. In a preferred embodiment, schema repository 44 data are formatted in XML, RDF, OSD, and the like, or combinations thereof.
[0016] Additionally, schema repository 44 may comprise schema 45 which may include a description of a predetermined structure of file 22, e.g. stored at 59 such as by using storage manager 55, a description of an acceptable data type for file 22, and the like, or a combination thereof. The description of the predetermined structure of file 22 may use XML to describe the structure of a high level XML document in a readable format, e.g. a natural language such as English. Schema 45 may be shared within a centralized file delivery system useable by many users/customers, e.g. schema repository 44.
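As a minimal sketch only, checking meta data against a shared description of required structure and acceptable data types might look like the following. The SCHEMA dictionary is a simplified, hypothetical stand-in for schema 45, and the field names and the validate_meta_data function are assumptions made for this example.

```python
# Hypothetical stand-in for schema 45: required field names and accepted types.
SCHEMA = {
    "job_id": str,
    "owner": str,
    "priority": int,
    "qos": str,
}


def validate_meta_data(meta, schema=SCHEMA):
    """Return a list of problems; an empty list means the meta data conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in meta:
            problems.append(f"missing: {field}")
        elif not isinstance(meta[field], expected_type):
            problems.append(f"invalid type: {field}")
    return problems


good = {"job_id": "job-001", "owner": "supplier-20a", "priority": 2, "qos": "guaranteed"}
bad = {"job_id": "job-001", "priority": "high"}

good_problems = validate_meta_data(good)
bad_problems = validate_meta_data(bad)
```

Because a single description is consulted for every source, the same check can be applied uniformly across many users/customers.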
[0017] In a preferred embodiment, file recovery software 50 (Fig. 1) further comprises job processor 52 for preprocessing the file delivery load list 23, storage manager 55 for managing storage and retrieval of files 22 such as those files 22 stored at file storage 59, and remote status processor 54 for interpreting file error messages returned by receivers 30. File storage 59 may be a persistent data store, e.g. comprising electronic, magnetic, and optical media. In a preferred embodiment, file error messages are formatted at least partially in XML format.
[0018] As used herein, a business rule may comprise one or more directives to be used in scheduling transmission of data into data network 100. These rules may arise from contractual agreements with a customer, e.g. source 20; rules based on operation of data network 100; prioritization schemes whereby data from a source 20a is to be given higher priority than data from another source 20b; and the like; or combinations thereof. Scheduling data may be among the rules in load list 23.
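For illustration only, a prioritization scheme of the kind described above might be represented as follows. The BusinessRule fields, the convention that a lower number means higher priority, and the example delay values are all assumptions made for this sketch and are not taken from any particular contractual agreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    source_id: str
    priority: int           # assumed convention: lower number = higher priority
    max_delay_seconds: int  # hypothetical contractual delivery window


# Hypothetical rules for two sources, 20a and 20b.
RULES = {
    "20a": BusinessRule("20a", priority=1, max_delay_seconds=600),
    "20b": BusinessRule("20b", priority=5, max_delay_seconds=3600),
}


def order_jobs(jobs, rules=RULES):
    """Order (job_id, source_id) pairs so that higher-priority sources go first."""
    # sorted() is stable, so jobs from the same source keep their arrival order.
    return sorted(jobs, key=lambda job: rules[job[1]].priority)


ordered = order_jobs([("job-7", "20b"), ("job-8", "20a"), ("job-9", "20b")])
```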
[0019] Job processor 52 comprises business processing logic to process both incoming load list 23 and retransmission load list 23 from remote status processor 54. For example, job processor 52 may process load list 23a according to business rules specified in load list 23a and also process a load list 23 associated with retransmission results suggested from remote status processor 54. Job dispatcher 52 works cooperatively with storage manager 55 and transmission optimizer 53 to move files 22 or portions of files 22 into transmit queue 56.
[0020] Storage manager 55, job dispatcher 52, and transmission optimizer 53 may share high level definitions of how data in schema 44 are to be structured, e.g. using XML structures.
[0021] Remote site document handler 58 handles processing of incoming messages, including error messages, received from receivers 30 and comprises remote status storage 57. Additionally, remote status storage 57 is accessible from remote status processor 54. Remote status storage 57 may be a transient or persistent data store, e.g. comprising electronic, magnetic, or optical media.
[0022] As used herein, file delivery module 51 generally refers to one or more of job processor 52, transmission optimizer 53, remote status processor 54, storage manager 55, transmit queue 56, and remote status storage 57.
[0023] In the operation of an exemplary embodiment, referring now to Fig. 3, a flowchart of an exemplary method, and generally to Fig. 1 and Fig. 2, file 22 is supplied by source 20, pre-processed at step 200 at job processor 24, and stored according to one or more predetermined business rules, e.g. in schema repository 44 or passed onto storage manager 55. In an embodiment, source 20 may be under the control of a supplier 20a (Fig. 1) or other creator of file 22. Source 20, e.g. supplier 20a, may additionally provide one or more business rules to dictate predetermined delivery parameters relating to file 22, such as in load list 23. These predetermined business rules may be used to dictate the file delivery behavior of system 10 for files 22 from that source 20 while providing flexibility to accommodate a plurality of sources 20, e.g. each with their own business rules. The rules may be provided in an XML formatted load list 23.
[0024] In a preferred embodiment, job processor 24 acts as a business rule manager and is XML based. Job processor 24 may create meta data which may further comprise rules, e.g. policy and business rules, to dictate how file 22 will be treated and processed by network operation center 40, e.g. whether file 22 is capable of delivery to a single receiver 30 or a plurality of receivers 30. Meta data may further comprise a job identifier for file 22, an owner identifier for file 22, a priority descriptor for file 22, a quality of service descriptor for file 22, or the like, or a combination thereof. Using schema 45, job processor 24 can verify and validate the syntax of schema 45 within file 22 as well as interpret the structure and validity of the meta data associated with the content in file 22. Job processor 24 thus acts as a business rule manager and screens schema repository 44 to ensure that schema 45 is uniform within system 10.
[0025] Pre-processed file 22 is then submitted at step 210 to job dispatcher 52 for delivery to a desired receiver 30 or a plurality of receivers 30 over a distributed computer network such as data network 100, e.g. as a job. Job dispatcher 52 analyzes load list 23, e.g. the XML tags that describe the owner of the content in file 22 associated with load list 23, job ID, priority and/or quality of service parameters, and the like.
[0026] As used herein, a submitted job, in an exemplary embodiment, comprises schema 45 implemented as a software object which encompasses both desired content within file 22 as well as meta data which describes that content, such as by using XML meta tags. Additionally, job dispatcher 52 may pass all or some portion of file 22 to storage manager 55 for selectively retrievable storage in file storage 59.
[0027] In dispatching the job into transmit queue 56, job dispatcher 52 may work cooperatively with transmission optimizer 53. Transmission optimizer 53 may analyze numerous predetermined parameters when determining where in transmission queue 56 a job will be placed, including prioritization parameters gleaned from load list 23 as well as characteristics of data network 100 and operational characteristics such as the current number of retransmission retries for a given job. Transmission queue 56 interfaces with data network 100 for placement of data packets into data network 100. Transmission optimizer 53 also communicates and works cooperatively with remote status processor 54.
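One possible way to place jobs in a queue according to a priority parameter and a retry count is sketched below, for illustration only. The TransmitQueue class and the policy that each prior retry raises urgency by one level are assumptions of this example; the description above does not specify how the parameters are weighed.

```python
import heapq
import itertools


class TransmitQueue:
    """Priority queue of job identifiers; a lower effective value is sent first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserves insertion order

    def put(self, job_id, priority, retries=0):
        # Assumed policy: each prior retransmission retry raises urgency by one level.
        effective = priority - retries
        heapq.heappush(self._heap, (effective, next(self._counter), job_id))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = TransmitQueue()
queue.put("new-job-A", priority=3)
queue.put("new-job-B", priority=2)
queue.put("retransmit-C", priority=3, retries=2)
order = [queue.get() for _ in range(len(queue))]
```

In this sketch the retransmission job overtakes both new jobs because its two earlier retries lower its effective value to 1.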
[0028] Receiver 30 receives at step 220 at least a portion of file 22 over a distributed computer network such as data network 100. In a preferred embodiment, receiver 30 maintains a log file representative of a predetermined number of received files 22 or data packets. Receiver 30 determines the validity at step 230 of the received file 22 or portion of file 22 and places a description of the determined validity into a formatted message at step 240 for transmission back to network operations center 40. In a preferred embodiment, the message is created as part of the log file and is formatted in XML. Accordingly, the description of the determined validity may comprise XML tags as XML meta data.
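A minimal sketch of how a receiver might determine validity and format the result as an XML message follows. The use of SHA-256 digests per packet is an assumption of this example, since the description above does not state how validity is determined, and the element and attribute names are likewise hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_status_message(receiver_id, job_id, expected_digests, received_packets):
    """Compare received packets to expected digests and report the result as XML.

    expected_digests: list of hex digests, one per packet index (assumed input).
    received_packets: dict mapping packet index to the payload bytes received.
    """
    root = ET.Element("fileStatus", {"receiverId": receiver_id, "jobId": job_id})
    missing = []
    for index, digest in enumerate(expected_digests):
        payload = received_packets.get(index)
        # A packet is reported if it never arrived or if its digest does not match.
        if payload is None or hashlib.sha256(payload).hexdigest() != digest:
            missing.append(index)
    ET.SubElement(root, "validity").text = "valid" if not missing else "invalid"
    for index in missing:
        ET.SubElement(root, "missingPacket", {"index": str(index)})
    return ET.tostring(root, encoding="unicode")


packets = [b"alpha", b"beta", b"gamma"]
digests = [hashlib.sha256(p).hexdigest() for p in packets]
# Packet 1 never arrived and packet 2 arrived corrupted.
message = build_status_message("rx-1", "job-001", digests, {0: b"alpha", 2: b"gXmma"})
```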
[0029] Receiver 30 reports the XML message(s) at step 250 to network operations center 40 over data network 100. Received messages may first be processed by remote site document handler 58 and then placed into a data store such as remote status storage 57.
[0030] In a preferred embodiment, network operations center 40 deciphers the returned messages at step 260 by analyzing the XML tags placed in the messages by receiver 30, such as by using remote status processor 54. Network operations center 40 can then obtain information it requires from the message, e.g. content owner identifier, job identifier, priority descriptor, quality of service descriptor, or the like, or a combination thereof, that are associated with file 22. Because schema 45 may be shared amongst the various processors, e.g. job dispatcher 52 and transmission optimizer 53, query and search functions internal to system 10 may be enhanced, e.g. through use of a native XML database which is independent from a traditional relational database in the backend. Further, job dispatcher 52 can use schema 45 to detect if received XML files in schema repository 44 have missing data, data that are invalid, or the like, or a combination thereof.
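By way of example only, deciphering a returned message by analyzing its XML tags might be sketched as follows. The message layout shown in STATUS_XML and the decipher_status function are hypothetical and are assumed solely for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical status message as it might be returned by a receiver.
STATUS_XML = (
    '<fileStatus receiverId="rx-1" jobId="job-001">'
    "<owner>supplier-20a</owner>"
    "<priority>2</priority>"
    "<validity>invalid</validity>"
    '<missingPacket index="1"/>'
    '<missingPacket index="2"/>'
    "</fileStatus>"
)


def decipher_status(xml_text):
    """Pull the identifiers and error details out of a receiver status message."""
    root = ET.fromstring(xml_text)
    return {
        "receiver_id": root.get("receiverId"),
        "job_id": root.get("jobId"),
        "owner": root.findtext("owner"),
        "priority": int(root.findtext("priority", default="0")),
        "valid": root.findtext("validity") == "valid",
        "missing": [int(e.get("index")) for e in root.findall("missingPacket")],
    }


status = decipher_status(STATUS_XML)
```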
[0031] Predetermined portions of the deciphered message can then be stored or otherwise processed for additional handling. For example, a portion of network operations center 40, e.g. remote status processor 54, can process the message to determine what file 22, i.e. the entire file 22, or what portion of file 22 needs to be retransmitted as well as the specific receiver 30 or receivers 30 that need that file 22 or that portion of file 22, e.g. the receiver 30 which sent back the message indicating file receiving status or errors. For system 10 comprising a plurality of receivers 30, network operation center 40 can analyze each message to determine which receiver 30 of the plurality of receivers 30 sent an individual message as well as determine which of the plurality of receivers 30 requires retransmission of a common file or just a portion of file 22.
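As an illustrative sketch only, messages from a plurality of receivers might be combined to determine which portions of a file are needed by which receivers. Representing each report as a pair of a receiver identifier and a list of missing packet indices is an assumption of this example.

```python
from collections import defaultdict


def receivers_needing_packets(reports):
    """Map each missing packet index to the set of receivers that reported it.

    reports: iterable of (receiver_id, missing_packet_indices) pairs (assumed shape).
    """
    needed = defaultdict(set)
    for receiver_id, missing in reports:
        for index in missing:
            needed[index].add(receiver_id)
    return dict(needed)


reports = [("rx-1", [1, 2]), ("rx-2", [2]), ("rx-3", [])]
plan = receivers_needing_packets(reports)
```

A portion needed by several receivers, such as packet 2 here, is a candidate for a single common retransmission rather than one retransmission per receiver.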
[0032] If an entire file 22 or a portion of file 22 needs to be retransmitted, network operations center 40, e.g. by communication between remote status processor 54 and transmission optimizer 53, can determine a desired method for retransmission based on predetermined business and algorithmic logic, e.g. multicasting or unicasting. Additionally, remote status processor 54 may communicate a need for a specific file 22 or a portion of file 22 to job dispatcher 52, e.g. through transmission optimizer 53, and job dispatcher 52 may then retrieve the needed file 22 or portion of file 22 from file storage 59, e.g. through storage manager 55.
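For illustration only, one simple piece of algorithmic logic for selecting a retransmission method is sketched below. The threshold of three receivers and the function name are hypothetical; actual logic may weigh many further business and network factors.

```python
def choose_retransmission_method(receiver_ids, multicast_threshold=3):
    """Pick multicast when enough distinct receivers need the same data.

    The threshold is an assumed, illustrative business parameter.
    """
    distinct = set(receiver_ids)
    return "multicast" if len(distinct) >= multicast_threshold else "unicast"


few = choose_retransmission_method(["rx-1", "rx-2"])
many = choose_retransmission_method(["rx-1", "rx-2", "rx-3", "rx-4"])
```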
[0033] In this manner, job dispatcher 52 and transmission optimizer 53 may work cooperatively when placing data into transmission queue 56, including using content of messages sent back to network operation center 40 from receivers 30. This method provides a self-contained file recovery cycle that remains independent from the functions of job dispatcher 52, e.g. works cooperatively with job dispatcher 52 without burdening job dispatcher 52 when placing retransmission packets into transmission queue 56. Priority urgency can be used, e.g. a window in time may exist to fill a retransmission request, and the placement of data into the job queue may be at least partially based on one or more business rules. For example, a business rule may dictate that satellite 110 is to be used for multicasting over a large area or for a large number of receivers 30. Transmission optimizer 53 may also use business rules associated with load list 23 when placing data into the job queue, whether for an initial transmission or retransmission, e.g. placement and prioritization may be at least partially dependent on a contractual agreement with source 20. In this manner, job processor 52 and transmission optimizer 53 may negotiate actual transmission.
[0034] As can be seen from the description of an exemplary embodiment above, a recovery cycle for file 22 where file 22 in receiver 30 contains errors in transmission is independent of system 10 with respect to file delivery, i.e. job dispatcher 52 continues to create new jobs for transmission into data network 100 and transmission optimizer 53 is charged with independently inserting retransmission requests into transmission queue 56 for whole files 22 or for portions of files 22 that need retransmission. Transmission optimizer 53 may work cooperatively with job dispatcher 52, e.g. scheduling transmission of data may be dictated by predetermined operational business rules. Transmission queue 56 may therefore contain both new transmission jobs and retransmission jobs. Use of meta tags such as XML meta tags allows network operation center 40 to analyze status of receiver 30 with respect to files 22 being delivered to receiver 30 and helps minimize retransmission bandwidth allocation and improve transmission efficiency while maintaining file and data integrity, including cross-customer screening of files 22.
[0035] It will be understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated above in order to explain the nature of this invention may be made by those skilled in the art without departing from the principle and scope of the invention as recited in the following claims.