WO2005081408A1 - A method of binary encode that adapts to structured data whose code is automatically generated - Google Patents

Info

Publication number
WO2005081408A1
Authority
WO
WIPO (PCT)
Prior art keywords
bxml
structured data
encoding
tag
code generation
Application number
PCT/CN2004/000122
Other languages
French (fr)
Chinese (zh)
Other versions
WO2005081408A9 (en
Inventor
Wenyuan Li
Original Assignee
Utstarcom (China) Co., Ltd.
Application filed by Utstarcom (China) Co., Ltd.
Priority to CN2004800236517A (patent CN1836374B)
Priority to PCT/CN2004/000122 (patent WO2005081408A1)
Publication of WO2005081408A1
Publication of WO2005081408A9

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8543 Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]

Definitions

  • The present invention relates to a simple, practical encoding technique for structured data, and in particular to a binary encoding method for structured data that is suitable for automatic code generation. Specifically, it covers the description, encoding, and automatic code mapping of structured data.
  • W3C released WBXML (WAP Binary XML), a binary encoding format for XML. Its core idea is to map XML tags, attribute types, attribute values, and string constants to single-byte codes, and to avoid code conflicts by using code spaces (codepages).
  • WBXML was originally designed to improve the encoding efficiency of WML (Wireless Markup Language), the WAP application-layer protocol, but because it is simple, efficient, and independent of any specific application, it is also used in many other settings, such as Wireless Village (http://www.wireless-village.org) and SyncML (http://ww.syncml.org). These uses have different characteristics.
  • In an application environment such as Wireless Village, WBXML serves as the encoding format of a communication protocol and all message structures are defined in advance, whereas a language such as WML describes the structure of WAP pages and its message structure cannot be known in advance. Although WBXML is far more compact than XML, its programming model is unchanged: programs still use the Document Object Model (DOM) or the Simple API for XML (SAX API). This model may suit page browsers, but it is not necessarily suitable where WBXML is applied more broadly (as in Wireless Village), because it does not map directly onto the data structures of a programming language, so developers must write a great deal of code to manipulate application data.
  • DOM Document Object Model
  • SAX API Simple API for XML
  • The technical problem to be solved by the present invention is to provide a binary encoding method for structured data that is suitable for automatic code generation. The invention applies to the exchange of all kinds of application data independently of platform, language, and transport, for example network communication protocols, data synchronization between intelligent devices, and structured data storage.
  • the binary encoding method for structured data suitable for automatic code generation in the present invention refers to the described encoding rule as binary extensible markup language BXML (Binary XML), and includes the following steps:
  • Step 1 define a BXML encoding format
  • Step 2 According to specific application requirements, construct a structured data description file suitable for BXML encoding
  • Step 3 Use a BXML compiler to read the structured data description file, and the BXML compiler generates the source code of a specific computer language according to the command.
  • Step 4 Combine with specific application logic and transmission methods to achieve complete application-layer data exchange.
  • By providing rules for defining the encoding, together with a code generation process and its rules, the method of the present invention enables a compiler that developers build according to these rules to generate encoding and decoding code automatically.
  • The invention is widely applicable, encodes efficiently, is suitable for automatic code generation, and is simple and easy to implement.
  • FIG. 1 is a schematic flowchart of a method according to the present invention
  • FIG. 2 is a further, more specific schematic diagram of the method according to the present invention.
  • the present invention refers to the described coding rules as "BXML (Binary XML)" to show the difference from WBXML.
  • A schematic flowchart of the method according to the present invention is shown in FIG. 1.
  • The basic idea of the present invention is to first establish a BXML encoding format, which defines a version number, a message length, a character set, an indefinite structure, and so on; then to construct a structured data description file according to the BXML encoding format; and then to use a BXML compiler to read that description file.
  • On command, the BXML compiler generates source code in a specific computer language. Combined with the application logic and transport, this maps application data structures onto programming-language data structures and, at the same time, automatically implements the encoding mapping of structure tags.
  • the BXML encoding format proposed by the present invention is as follows:
  • arrayItem ARRAY_ITEM (integer
  • The message length is the number of bytes of BXML encoding that follow, not counting the version number or the message length field itself. It is encoded as a two-byte integer in network byte order. The field exists to make BXML convenient to use over connection-oriented transports such as TCP; it has no effect on the decoder.
  • Character set charset mb_u_int32
  • the character set defines the encoding character set used for all basic types of strings in subsequent BXML encodings.
  • The field itself is encoded as a multi-byte integer whose value is the MIB number assigned by IANA to the character set. A value of zero means that the encoder and decoder have agreed on a default character set in advance.
  • Indefinite structure ANY [SWITCH_PAGE codepage] TAG [struct]
  • the ANY part is a tagged structure, and the decoder can know the type of the structure by the tag value.
  • the TAG value is automatically assigned by the BXML compiler.
  • the BXML compiler always assigns TAG values incrementally starting from 0X05 in the order of the structure definitions in a BXML structure description file.
  • the TAG value of the structure is only valid in the corresponding codepage space.
  • The default codepage value is zero. If the codepage is not zero, SWITCH_PAGE codepage must appear, and the codepage it specifies applies only to the struct that follows. This differs from WBXML, where a codepage stays in effect until the next SWITCH_PAGE codepage appears.
  • The reasoning is that a structure contained in another structure may belong to a different codepage. Under the WBXML rules, SWITCH_PAGE could then appear repeatedly. In BXML the decoder already knows the type of every structure member, so members of a structure need neither a codepage nor a structure TAG. SWITCH_PAGE is therefore defined to take effect only once, which also spares the decoder from having to remember codepage state.
  • TAG is encoded as a single byte, which has the following structure:
  • a TAG is always valid in the space it belongs to. There are three types of TAG spaces, as shown in the following table:
  • Predefined tags are always globally valid and include:
  • the 7th bit of the TAG must be cleared, otherwise it must be set to 1 and ended with an END after the member encoding is finished.
  • Internal tags are used to identify whether a member appears. Its value is also incremented from 0X05.
  • a structure consists of a number of content codes and an END tag.
  • Each content represents a member of a structure.
  • Whether a structure member appears is determined by the application logic.
  • A member that appears may or may not carry a value; this too is decided by the application logic.
  • Content INTERNAL_TAG [integer
  • a string is encoded in the encoding specified by the character set and ends with a single-byte zero.
  • the present invention does not accept character sets whose end tag is not a single-byte zero value in the C language, such as UTF-16.
  • the encoding rules for arbitrary binary data strings are the same as opaque in WBXML, which consists of a length indicator and several bytes of data.
  • the length indicator refers to the number of bytes of the binary data string, excluding the number of bytes of itself, and it is encoded into a multi-byte integer.
  • Union content
  • a union consists of a single content encoding, which can be valued or unvalued.
  • Enum enum integer
  • Enumerations are encoded as multi-byte integers that represent the defined enumeration values.
  • Array *arrayItem END
  • An array consists of the encodings of several array elements followed by an END tag.
  • Array element arrayItem ARRAY_ITEM (integer
  • An array element consists of an ARRAY_ITEM tag followed by the encoding of the element value.
  • the array element type is predictable to the decoder.
  • The BXML encoding format of the present invention inherits the following characteristics from WBXML: the characteristics of WBXML elements, including element nesting, defaults, and single elements without content;
  • the element tag is still encoded as a single byte, and code spaces (codepages) are used to avoid code conflicts;
  • the encoding rules for basic data types are the same as in WBXML, for example multi-byte integers (mb-int), inline strings, and opaque data (opaque).
  • the development process is shown in Figure 2.
  • The present invention explains the principle of automatic code generation using the most common computer languages, C++ and Java.
  • On command, the BXML compiler generates source code in a specific computer language, such as C++ or Java. The generated code has the following main features:
  • Structured data types are directly expressed with the same class name and member name. Developers can use these codes to directly set or extract the content of structured data without indirect access like DOM or SAX API.
  • the code contains codec functions that can be used to generate or parse BXML encoded data.
  • the code can include self-printing functions for easy debugging.
  • The structured data description file describes the structure of structured data that is determined in advance. Its role is similar to that of an XML DTD or Schema file, but unlike those files its purpose is not to validate BXML encodings. Instead, it instructs the compiler to generate program source code automatically and tells the compiler what encoding and decoding code to generate.
  • Context: any data exchange always takes place in a certain context, such as a specific protocol interface.
  • a structured data description is always directed to such a context.
  • a context description can consist of one or more BXML structure description files.
  • a run of the BXML compiler is always directed to one context, and it needs to read all the description files of the context at the same time.
  • Each BXML structure description file must use the "page" keyword at the beginning of the file to specify the file's codepage, which applies to all structures described in the file. Within one context, each codepage must be unique.
  • After the page keyword, each BXML structure description file should specify the Java package name or C++ namespace to be used in the generated source code. These apply to all descriptions in the file. Different description files in one context may specify the same or different Java package names or C++ namespaces.
  • Opaque binary byte sequence: the keyword is binary
  • the present invention describes structured data in a manner similar to the C language header file.
  • The following example illustrates the format of the description file. The underlined parts are keywords, and all keywords are shown.
  • The mapping from a BXML structure description file to a given programming language need not be unique.
  • the present invention first describes general rules; then, it briefly describes typical mappings to C ++ and JAVA languages.
  • all BXML configuration files for a context should usually be placed in the same root directory.
  • An application can have several contexts at the same time. For example, it may have several different types of communication interface and exchange data with different entities. In that case, the BXML compiler must be run separately for each context. Tag values in different contexts will collide, but because they are used on different communication interfaces (addresses), such duplicate tag values cause no problem. Data type names, however, may also collide across contexts; this should be resolved by using different Java package names or C++ namespaces.
  • For each description file, the BXML compiler assigns a structure TAG to each structure in turn, starting from 0X05. If there are too many structures, they can be distributed over different codepages.
  • the BXML compiler assigns internal tags to each member of the structure or union in turn, starting from 0X05.
  • the maximum number of internal tags of a structure or union is:
  • SessionType(const SessionType &other);
  • SessionType &operator=(const SessionType &other); void write(BXMLBuffer &buffer, BXMLWriter &w);
  • a _value integer member variable represents the current enumeration value
  • the write member function is used to encode the enumeration object itself
  • the toString member function is used to output the printable string of the object itself.
  • the string should be represented in a human-readable manner.
  • the enumeration object in the above example may output the string "inband".
  • the parseStatic static member function is used to decode an enum object.
  • SessionDescriptor *get_desc();
  • a structure is mapped to a C ++ class with the same name, which inherits the ANY class;
  • A member pointer variable with the same name, prefixed with " ⁇ ", holds the value of the member. If the corresponding presence variable marks the member as present but its value pointer is null, the member has no value;
  • a xxx_presence variable is used to identify whether the corresponding member appears
  • For each structure member, there should be several overloaded set_xxx member functions that set the value of the member.
  • The set_xxx functions should generally include a form that takes ownership of a pointer and a form that copies the value.
  • the write member function is used to encode the object itself
  • the toString member function is used to output the printable string of the object itself.
  • the parseStatic static member function is used to decode a structure object.
  • IntegerList (const IntegerList &other);
  • IntegerList &operator=(const IntegerList &other);
  • BinaryList &operator=(const BinaryList &other); void clean();
  • The write member function is used to encode the array object itself
  • the toString member function is used to output the printable string of the object itself.
  • a string is mapped to the C ++ STL string type.
  • A binary byte sequence is mapped to the C++ STL vector<BYTE> type.
  • the indefinite structure is mapped to the ANY class in the BXML runtime, which is also the base class where all structures are mapped into C ++ classes.
  • The definition of the ANY class is summarized as follows: class ANY { public: static ANY *parseANY(uBYTE id, BXMLBuffer &buffer, BXMLParser &p); virtual ~ANY() {};
  • In addition, the BXML runtime has two important classes, BXMLWriter and BXMLParser, which support encoding and decoding of top-level BXML data.
  • An integer is mapped to Java's Integer type. In the BXML runtime, the BXMLInt class supports its encoding and decoding and the output of printable strings.
  • a string is mapped into the JAVA String type.
  • a binary byte sequence is mapped to JAVA's byte [] type.
  • the indefinite structure is mapped to the ANY class in the BXML runtime. It is also the base class for all structures mapped to JAVA classes.
  • the definition of the ANY class is summarized as follows: public abstract class ANY ⁇
  • In addition, there are two important classes in the BXML runtime, BXMLWriter and BXMLParser, which support encoding and decoding of top-level BXML data.
  • An application running example uses the BXML structure description file described in the previous example, and then uses the source code generated by the BXML compiler to develop an application that constructs structured data with the Message structure defined in the description file as the top-level structure. Then use the self-print function of this structure to output the data content, and use the encoding function to perform BXML encoding.
  • After LoginRes is encoded by the encoding function, the data is 138 bytes long in total; its content is expressed in hexadecimal:
  • 04 key binary byte sequence has length 4
  • 02 binary byte sequence is 2 in length
  • 03 Binary byte sequence is 3 in length
  • Means codepage is 0
  • the internal tag is explained in a structure and represents a corresponding member of the structure.
  • Depending on what is needed at run time, the internal tag may carry content (that is, the member value), may carry no content (that is, the member appears without a value), or may not appear at all (that is, the member is absent);
  • the content carried by the internal tag needs no further tags, because its type is known in advance, except when the member type is an indefinite structure (ANY);
  • the content carried by the internal label is directly encoded according to the corresponding member type.
  • integers are encoded as multi-byte integers.
  • WBXML which still uses string encoding
  • Attributes that specifically support WBXML are not considered, but attributes can be expressed by adding structure members
  • a binary encoding method of structured data suitable for automatic code generation is proposed.
  • The present invention applies to the exchange of all kinds of application data independently of platform, language, and transport, for example network communication protocols, data synchronization, and structured data storage.
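
The one-shot SWITCH_PAGE rule listed among the definitions above can be sketched from the decoder's side. This is an illustrative sketch only, not the patent's runtime: the names `AnyHeader` and `readAnyHeader` are hypothetical, and the token value 0x00 for SWITCH_PAGE and the one-byte codepage are assumptions borrowed from WBXML, since the definitions here do not list the predefined tag values.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Assumption: SWITCH_PAGE has the WBXML token value 0x00 and the
// codepage that follows it occupies one byte.
constexpr std::uint8_t kSwitchPage = 0x00;

// Hypothetical result type: the codepage and TAG that identify one ANY value.
struct AnyHeader {
    std::uint8_t codepage;
    std::uint8_t tag;
};

// Reads the optional SWITCH_PAGE prefix and the TAG of one ANY value.
// The codepage is returned together with the tag and is never stored,
// so the next ANY value starts again from the default codepage 0.
inline AnyHeader readAnyHeader(const std::vector<std::uint8_t>& in,
                               std::size_t& pos) {
    auto next = [&]() -> std::uint8_t {
        if (pos >= in.size()) throw std::out_of_range("truncated BXML data");
        return in[pos++];
    };
    AnyHeader header{0, 0};          // default codepage is zero
    std::uint8_t byte = next();
    if (byte == kSwitchPage) {       // applies to this one struct only
        header.codepage = next();
        byte = next();
    }
    header.tag = byte;
    return header;
}
```

Because the function keeps no state between calls, a nested ANY member in another codepage cannot leak its codepage into the values that follow it, which is the simplification the definitions describe.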

Abstract

The present invention discloses a binary encoding method for structured data that is suitable for automatic code generation, characterized in that it comprises the following steps: defining a BXML encoding format; constructing, according to the specific application requirements, a structured data description file suitable for BXML encoding; using a BXML compiler to read the structured data description file, the BXML compiler generating source code in a specific computer language on command; and combining that code with the specific application logic and transport to achieve complete application-layer data exchange. By providing rules for defining the encoding and designing a code generation process and its rules, the method enables a compiler that developers build according to these rules to generate encoding and decoding code automatically. The invention is widely applicable, encodes efficiently, is suitable for automatic code generation, and is simple and easy to implement.

Description

A binary encoding method for structured data suitable for automatic code generation

Technical field

The present invention relates to a simple, practical encoding technique for structured data, and in particular to a binary encoding method for structured data that is suitable for automatic code generation. Specifically, it covers the description, encoding, and automatic code mapping of structured data.

Background
Whether on the Internet or in dedicated communication networks, the exchange of application data is unavoidable. Structured data is the most common way of expressing application data in the computer world. For structured application data to be exchanged smoothly between different computer platforms, a data encoding method is needed that is independent of platform, programming language, and transport. XML (Extensible Markup Language), proposed by the W3C, is one widely used data encoding method, and XML-based network communication protocols are also widely adopted, such as XML-RPC (Remote Procedure Call), SOAP (Simple Object Access Protocol), and Jabber (http://www.jabber.org). However, because XML uses a text encoding, it consumes considerable network bandwidth, and parsing it consumes considerable CPU and memory.

W3C released WBXML (WAP Binary XML), a binary encoding format for XML. Its core idea is to map XML tags, attribute types, attribute values, and string constants to single-byte codes, and to avoid code conflicts by using code spaces (codepages). WBXML was originally designed to improve the encoding efficiency of WML (Wireless Markup Language), the WAP application-layer protocol, but because it is simple, efficient, and independent of any specific application, it is also used in many other settings, such as Wireless Village (http://www.wireless-village.org) and SyncML (http://ww.syncml.org). These uses have different characteristics. In an application environment such as Wireless Village, WBXML serves as the encoding format of a communication protocol and all message structures are defined in advance, whereas a language such as WML describes the structure of WAP pages and its message structure cannot be known in advance.

Although WBXML is far more compact than XML, its programming model is unchanged: programs still use the Document Object Model (DOM) or the Simple API for XML (SAX API). This model may suit page browsers, but it is not necessarily suitable where WBXML is applied more broadly (as in Wireless Village), because it does not map directly onto the data structures of a programming language, so developers must write a great deal of code to manipulate application data. In addition, the mapping from XML tags to single-byte codes is specified in ordinary documents, which developers must read and translate into code by hand. These characteristics make development inefficient and error-prone. A tool that automatically maps application data structures onto programming-language data structures, and automatically produces the encoding mapping of structure tags, would greatly improve development efficiency.

Summary of the invention
Based on the above analysis, and because many communication scenarios with message structures known in advance use simple, efficient encoding rules such as WBXML, the technical problem to be solved by the present invention is to provide a binary encoding method for structured data that is suitable for automatic code generation. The invention applies to the exchange of all kinds of application data independently of platform, language, and transport, for example network communication protocols, data synchronization between intelligent devices, and structured data storage.

The binary encoding method of the present invention, whose encoding rules are called BXML (Binary XML), comprises the following steps:
Step 1: define a BXML encoding format;
Step 2: according to the specific application requirements, construct a structured data description file suitable for BXML encoding;
Step 3: use a BXML compiler to read the structured data description file; on command, the BXML compiler generates source code in a specific computer language;
Step 4: combine that code with the specific application logic and transport to achieve complete application-layer data exchange.

By providing rules for defining the encoding, together with a code generation process and its rules, the method of the present invention enables a compiler that developers build according to these rules to generate encoding and decoding code automatically. The invention is widely applicable, encodes efficiently, is suitable for automatic code generation, and is simple and easy to implement.

Brief description of the drawings
FIG. 1 is a schematic flowchart of the method according to the present invention;
FIG. 2 is a further, more specific schematic diagram of the method according to the present invention.

Detailed description of the invention
Specific embodiments of the present invention are described below with reference to the accompanying drawings.

The present invention calls the described encoding rules "BXML (Binary XML)" to distinguish them from WBXML.

A schematic flowchart of the method is shown in FIG. 1. The basic idea of the present invention is to first establish the BXML encoding format, which defines the version number, message length, character set, indefinite structure, and so on; then to construct a structured data description file according to the BXML encoding format; and then to use a BXML compiler to read that description file. On command, the BXML compiler generates source code in a specific computer language. Combined with the application logic and transport, this maps application data structures onto programming-language data structures and, at the same time, automatically implements the encoding mapping of structure tags.

The following sections describe, in turn, establishing the BXML encoding format, constructing the structured data description file, and generating source code in a specific computer language.

The BXML encoding format proposed by the present invention is as follows:
BXMLMessage = version msgLength charset ANY
version = u_int8 containing BXML version number
msgLength = u_int16
charset = mb_u_int32 indicating the charset
ANY = [SWITCH_PAGE codepage] TAG [struct]
struct = *content END
content = INTERNAL_TAG [integer | string | binary | struct | union | enum | array | ANY]
integer = mb_int32
string = string terminated with zero
binary = length *byte
length = integer
union = content
enum = integer
array = *arrayItem END
arrayItem = ARRAY_ITEM (integer | string | binary | struct)

Version number: version = u_int8 containing BXML version number
The first byte of every BXML encoding carries the BXML version number, encoded by the same rule as in WBXML: the upper four bits hold the major version minus one, and the lower four bits hold the minor version. For example, version 2.7 is encoded as 0x17. If the version of the present invention is designated 1.1, it is encoded as 0x01.

Message length: msgLength = u_int16

The message length is the number of bytes of BXML encoding that follow, not counting the bytes occupied by the version number and by the message length field itself. It is encoded as a two-byte integer in network byte order. The field exists to make BXML convenient to use over connection-oriented transports such as TCP; it has no effect on the decoder.

Character set: charset = mb_u_int32

The character set field defines the character encoding used by every string primitive in the rest of the BXML encoding. The field itself is encoded as a multi-byte integer whose value is the MIB number that IANA assigns to the character set. A value of zero means that the encoding and decoding parties have agreed on a default character set in advance.

To keep code generation simple, the present invention does not accept character sets whose C-language string terminator is not a single zero byte, such as UTF-16. In practice such character sets are rarely used to encode transmitted data, and they can always be replaced by another character set, such as UTF-8 or a dedicated character set (for example GB2312).

Indefinite structure: ANY = [SWITCH_PAGE codepage] TAG [struct]

The ANY part is a tagged structure; the decoder learns the type of the structure from the tag value. TAG values are assigned automatically by the BXML compiler, which always assigns them incrementally, starting from 0x05, in the order in which the structures are defined in a BXML structure description file. The TAG value of a structure is valid only within its codepage space.

The default codepage value is zero. If the codepage is not zero, SWITCH_PAGE codepage must appear, and the codepage value it specifies applies only to the struct that immediately follows. This differs from WBXML, where a codepage stays in effect until the next SWITCH_PAGE codepage appears. The reasoning is that the structures nested inside a structure may belong to other codepage spaces, so under the WBXML rule SWITCH_PAGE could appear over and over. Because the present invention assumes the decoder already knows the type of every structure member, members need neither a codepage nor a structure TAG, and SWITCH_PAGE is therefore defined to take effect only once. This also spares the decoder from having to remember codepage state.
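The header fields described above can be illustrated with a minimal C++ sketch. It is not the code the BXML compiler generates: the function names `encodeVersion`, `encodeMbUint32` and `frameMessage` are hypothetical, and the sketch assumes that mb_u_int32 uses the WBXML scheme of seven payload bits per byte with bit 7 as a continuation flag. It also assumes that msgLength covers the charset field plus the ANY body, which is how the text above reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Version byte: high nibble = major version minus one, low nibble = minor version.
std::uint8_t encodeVersion(unsigned major, unsigned minor) {
    assert(major >= 1 && major <= 16 && minor <= 15);
    return static_cast<std::uint8_t>(((major - 1) << 4) | minor);
}

// Multi-byte unsigned integer (assumed WBXML-style): 7 payload bits per byte,
// most significant group first, bit 7 set on every byte except the last.
std::vector<std::uint8_t> encodeMbUint32(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(value & 0x7F));  // last byte, flag clear
    while ((value >>= 7) != 0) {
        out.insert(out.begin(), static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    }
    return out;
}

// Hypothetical helper: prepends version, msgLength and charset to an already
// encoded ANY body. msgLength counts everything after the length field itself.
std::vector<std::uint8_t> frameMessage(unsigned major, unsigned minor,
                                       std::uint32_t charsetMib,
                                       const std::vector<std::uint8_t>& anyBody) {
    std::vector<std::uint8_t> charset = encodeMbUint32(charsetMib);
    std::size_t length = charset.size() + anyBody.size();
    assert(length <= 0xFFFF);  // must fit in a u_int16
    std::vector<std::uint8_t> msg;
    msg.push_back(encodeVersion(major, minor));
    msg.push_back(static_cast<std::uint8_t>(length >> 8));    // network order: high byte first
    msg.push_back(static_cast<std::uint8_t>(length & 0xFF));
    msg.insert(msg.end(), charset.begin(), charset.end());
    msg.insert(msg.end(), anyBody.begin(), anyBody.end());
    return msg;
}
```

For instance, `frameMessage(1, 1, 106, body)` would frame a body whose strings are UTF-8 (IANA MIB number 106) under version 1.1.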
TAG and its space

A TAG is encoded as a single byte with the following structure:

[Figure imgf000007_0001: layout of the TAG byte]

A TAG is always valid only within the space it belongs to. There are three kinds of TAG space, as shown in the following table:
Tag type: Description

Predefined tags: Predefined tags are always globally valid. They include:
• SWITCH_PAGE (0x00): used in the encoding of an ANY type to indicate a non-zero codepage space; the codepage itself, encoded as a single-byte unsigned number, follows immediately.
• END (0x01): marks the end of a structure or array.
• ARRAY_ITEM (0x82): marks the start of an array element.

Structure tags: Structure tag values are assigned automatically by the BXML compiler, which always assigns TAG values incrementally, starting from 0x05, in the order in which the structures are defined in a BXML structure description file. When encoding, if the structure is defined as an empty structure, or none of its members is present at run time, bit 7 of the TAG must be cleared; otherwise bit 7 must be set to 1, and an END tag terminates the structure after its members have been encoded.

Internal tags: An internal tag identifies whether a member is present. Its values are likewise assigned incrementally, starting from 0x05. Every member of a union or structure is automatically assigned an internal tag value by the BXML compiler, and the value is valid only inside that union or structure. When encoding, if the member's value is not present at run time, bit 7 of the internal tag must be cleared; otherwise bit 7 must be set to 1, and the encoding of the member's value follows immediately.

Structure: struct = *content END
A structure consists of the encodings of several contents followed by an END tag. Each content represents one member of the structure. Whether a member is present is decided by the application logic, as is whether a present member carries a value.

Content: content = INTERNAL_TAG [integer | string | binary | struct | union | enum | array | ANY]

A content represents one member of a structure or union. It may be present with or without a value, and it consists of an internal tag followed by the encoding of the corresponding value.

Integer: integer = mb_int32

An integer is encoded as a multi-byte integer by the same rule as in WBXML. It consists of a sequence of bytes in which bit 7 (the most significant bit) of each byte is a continuation flag: 1 means that further bytes of the integer follow, and 0 means that the current byte is the last one. The integer value is obtained by removing the continuation flags and concatenating the remaining bits, most significant first.

String: string = string terminated with a single zero byte

A string is encoded in the encoding specified by the character set and terminated with a single zero byte. To keep code generation simple, the present invention does not accept character sets whose C-language string terminator is not a single zero byte, such as UTF-16. In practice such character sets are rarely used to encode transmitted data, and they can always be replaced by another character set, such as UTF-8 or a dedicated character set (for example GB2312).

Binary data string: binary = length *byte

An arbitrary binary data string is encoded by the same rule as opaque data in WBXML: a length indicator followed by that many bytes of data. The length indicator gives the number of bytes in the binary data string, not counting the bytes of the indicator itself, and is encoded as a multi-byte integer.

Union: union = content

A union consists of a single content encoding, which may be present with or without a value.

Enumeration: enum = integer

An enumeration is encoded as a multi-byte integer that represents the defined enumeration value.

Array: array = *arrayItem END

An array consists of the encodings of several array elements followed by an END tag.

Array element: arrayItem = ARRAY_ITEM (integer | string | binary | struct)

An array element consists of an ARRAY_ITEM tag followed by the encoding of the element value. The element type of an array is known to the decoder in advance.
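The basic-type rules above can be sketched in C++ as follows. This is an illustrative sketch rather than the BXML runtime library: the helper names (`writeInteger`, `readInteger`, `writeString`, `writeBinary`, `writeIntegerArray`) are hypothetical, and the integer is treated as an unsigned 32-bit value, which is an assumption, since the text only says the rule matches WBXML.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

const std::uint8_t TAG_END = 0x01;         // END
const std::uint8_t TAG_ARRAY_ITEM = 0x82;  // ARRAY_ITEM

// integer: 7 payload bits per byte, high-order group first; bit 7 = "more bytes follow".
void writeInteger(Bytes& out, std::uint32_t value) {
    std::uint8_t groups[5];  // 32 bits need at most five 7-bit groups
    int n = 0;
    do {
        groups[n++] = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
    } while (value != 0);
    while (n > 1) out.push_back(static_cast<std::uint8_t>(groups[--n] | 0x80));
    out.push_back(groups[0]);
}

// Reads an integer starting at pos and advances pos past it.
std::uint32_t readInteger(const Bytes& in, std::size_t& pos) {
    std::uint32_t value = 0;
    std::uint8_t b;
    do {
        assert(pos < in.size());
        b = in[pos++];
        value = (value << 7) | (b & 0x7F);
    } while (b & 0x80);
    return value;
}

// string: the encoded characters followed by a single zero byte.
void writeString(Bytes& out, const std::string& s) {
    out.insert(out.end(), s.begin(), s.end());
    out.push_back(0x00);
}

// binary: length (as an integer) followed by the raw bytes.
void writeBinary(Bytes& out, const Bytes& data) {
    writeInteger(out, static_cast<std::uint32_t>(data.size()));
    out.insert(out.end(), data.begin(), data.end());
}

// array of integers: ARRAY_ITEM plus the value for each element, then END.
void writeIntegerArray(Bytes& out, const std::vector<std::uint32_t>& items) {
    for (std::uint32_t v : items) {
        out.push_back(TAG_ARRAY_ITEM);
        writeInteger(out, v);
    }
    out.push_back(TAG_END);
}
```

Note that 0x82 can occur both as the ARRAY_ITEM tag and as a continuation byte of an integer; the two are never confused because the decoder knows from the description file what kind of item it is reading at each position.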
The BXML encoding format of the present invention inherits the following characteristics of WBXML:

• The properties of WBXML elements, including element nesting, defaulting, and single elements without content.
• Element tags are still encoded as a single byte, and code spaces (codepages) are used to avoid encoding conflicts.
• Some of the global tokens, such as SWITCH_PAGE and END.
• The encoding rules for basic data types, such as multi-byte integers (mb_int), inline strings and opaque data, are the same as in WBXML.

Development process and automatic code generation based on BXML encoding
The development process is shown in Figure 2. The present invention explains the principle of automatic code generation for the two most commonly used computer languages, C++ and Java.

First, structured data description files are written according to the requirements of the specific application. The BXML compiler then reads these description files and, on command, generates source code in a specific computer language such as C++ or Java. The automatically generated code provides the following main functions:

• Structured data types are expressed directly through classes and members of the same names, so developers can set or extract the content of structured data directly through this code, without the indirect access required by the DOM or SAX APIs.
• The code contains encoding and decoding functions that generate or parse BXML-encoded data.
• The code can include a self-printing function to ease debugging.

Once the source code has been generated with the BXML compiler, developers combine it with the specific application logic and transport to exchange application data. They no longer need to write any encoding or decoding code, nor do they need to access the structured data indirectly.

Description file for structured data
A structured data description file describes the structure of structured data that has been determined in advance. Its role is similar to that of an XML DTD or Schema file, but its purpose is different: it is not used to validate a BXML encoding, but to direct the compiler in generating program source code automatically and to tell the compiler what encoding and decoding code to generate. The rules for structured data description files are as follows:

1) Every data exchange takes place in some context, such as a specific protocol interface. A structured data description always targets one such context, and a context description may consist of one or more BXML structure description files. A single run of the BXML compiler also always targets one context and must read all description files of that context together.

2) Each BXML structure description file must use the "page" keyword at the beginning of the file to specify the codepage space of the file, which applies to all structures described in the file. Within one context, each codepage must be unique.

3) After the page keyword, each BXML structure description file should state the Java package name or C++ namespace needed when generating program source code; it applies to all descriptions in the file. Different description files of one context may specify the same or different Java package names or C++ namespaces.

4) A BXML structure description file may directly use the data types defined in any BXML description file of the same context, but no two data types in the same context may have the same name.

5) The data types defined by keywords are:
• Integer: keyword int
• String: keyword string
• Opaque binary byte sequence: keyword binary
• Enumeration: keyword enum
• Union: keyword union
• Structure: keyword struct
• Indefinite structure: keyword ANY
• Array: keyword arrayof

6) The order in which structures are defined in a BXML structure description file is significant, because it determines the TAG values that the BXML compiler assigns. Both parties to the data exchange must use the same BXML structure description files.

7) The order in which members are defined inside a structure or union is also significant, because it determines the internal tag values that the BXML compiler assigns. Both parties to the data exchange must use the same BXML structure description files.
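Rules 6) and 7) follow directly from the way tag values are handed out. The C++ sketch below illustrates sequential assignment in definition order; `assignTags` is a hypothetical helper, not part of the BXML compiler, and the assumption that values 0x00 to 0x04 are reserved for predefined tags is inferred from the starting value 0x05.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

const std::uint8_t FIRST_TAG = 0x05;  // assumed: 0x00..0x04 are reserved for predefined tags
const std::uint8_t LAST_TAG = 0x7F;   // bit 7 is a flag, so tag values use 7 bits

// Hypothetical helper: assigns tag values to names in definition order, starting at 0x05.
std::map<std::string, std::uint8_t> assignTags(const std::vector<std::string>& namesInOrder) {
    assert(namesInOrder.size() <= static_cast<std::size_t>(LAST_TAG - FIRST_TAG + 1));
    std::map<std::string, std::uint8_t> tags;
    std::uint8_t next = FIRST_TAG;
    for (const std::string& name : namesInOrder) {
        tags[name] = next++;
    }
    return tags;
}
```

If one party defined UserInfo before SessionDescriptor and the other did not, the same tag byte would name different structures on each side, which is why both parties must compile identical description files.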
To make description files easy for developers to write, the present invention describes structured data in a form similar to a C-language header file. The following example illustrates the format of a description file; every keyword appears in it.
    //file test.bxml, only for test, no actual meaning
    page=0;
    package com.test;           //for java
    namespace com::test;        //for C++

    enum SessionType {
        inband = 1;
        outband = 2;
    }

    union SessionAddress {
        string url;
        int ipAddress;
    }

    struct SessionDescriptor {
        SessionType type;
        SessionAddress address;
        string sessionID;
    }

    struct UserInfo {
        string userID;
        int age;
        binary key;
    }

    arrayof UserInfo UserInfoList;
    arrayof int IntegerList;
    arrayof string StringList;
    arrayof binary BinaryList;

    struct LoginReq {
        string deviceID;
        UserInfoList userList;
        BinaryList blist;
        StringList slist;
        IntegerList ilist;
    }

    struct LoginRes {}

    struct Message {
        SessionDescriptor desc;
        ANY msgBody;
        int time;
        ANY addition;
    }
    //end of the test.xml

General rules for code generation
In fact, there need not be only one way to map a BXML structure description file onto a given programming language. The present invention first describes the general rules and then briefly describes typical mappings to the C++ and Java languages.

1) Every data exchange takes place in some context, such as a specific protocol interface. A structured data description always targets one such context, and a context description may consist of one or more BXML structure description files. A single run of the BXML compiler also always targets one context and must read all description files of that context together. For convenience, all BXML structure description files of one context should normally be placed under the same root directory.

2) An application may have several contexts at once; for example, it may have several different types of communication interface and exchange data with different entities. In that case each context is compiled separately with the BXML compiler. Structure tag values in different contexts will coincide, but because the contexts are used on different communication interfaces (addresses), the repeated tag values cause no problem. Data type names in different contexts may also clash; such clashes should be resolved by using different Java package names or C++ namespaces.

3) When compiling one context, the BXML compiler assigns a structure TAG to each structure in turn, starting from 0x05 separately for each description file. If there are too many structures, they can be spread over several files (codepages). The maximum number of application structures in one context is:

    256 codepages * (128 tags - 5 predefined) = 31488

which is sufficient for the vast majority of applications.

4) The BXML compiler assigns internal tags to the members of each structure or union in turn, starting from 0x05 separately for each structure or union. The maximum number of internal tags in one structure or union is:

    128 tags - 5 predefined = 123

which is also sufficient for the vast majority of structures and unions.

Typical mappings to the C++ and Java languages are described below through concrete examples.

Typical mapping to the C++ language
Mapping of enumerations:

BXML structure description

    enum SessionType {
        inband = 1;
        outband = 2;
    }

Excerpt of the generated C++ header file

    class SessionType
    {
    public:
        DWORD _value;
        static const DWORD _inband = 1;
        static const DWORD _outband = 2;
        SessionType(DWORD value);
        SessionType(const SessionType& other);
        SessionType& operator=(const SessionType& other);
        void write(BXMLBuffer& buffer, BXMLWriter& w);
        string toString(int level);
        static SessionType* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
    };

1) An enumeration type is mapped to a C++ class of the same name.
2) An integer member variable _value holds the current enumeration value.
3) Each enumerated constant is represented by a static integer constant.
4) The copy constructor and the overloaded assignment operator should be supported.
5) The write member function encodes the enumeration object itself.
6) For debugging purposes, the toString member function outputs a printable string for the object itself. The string should be human-readable; for example, the enumeration object in the example above might output the string "inband".
7) The parseStatic static member function decodes an enumeration object.
[Figures imgf000015_0001 and imgf000016_0001: beginning of the structure mapping table. The listing below is the remainder of the generated C++ header file excerpt for the Message structure.]

        SessionDescriptor* _desc;
        ANY* _msgBody;
        DWORD* _time;
        ANY* _addition;
        bool _desc_presence;
        bool _msgBody_presence;
        bool _time_presence;
        bool _addition_presence;

        Message();
        Message(const Message& other);
        Message& operator=(const Message& other);
        virtual ~Message();
        virtual ANY* duplicate();
        virtual uBYTE getCodepage();
        virtual uBYTE getTag();
        virtual void write(BXMLBuffer& buffer, BXMLWriter& w, bool withtag);
        virtual string toString(int level, bool withtag);
        virtual ANY* parse(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
        static Message* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);

        void set_desc(SessionDescriptor* desc);
        void set_desc(const SessionDescriptor& desc);
        void set_msgBody(ANY* msgBody);
        void set_time(DWORD* time);
        void set_time(DWORD time);
        void set_addition(ANY* addition);
        void unset_desc();
        void unset_msgBody();
        void unset_time();
        void unset_addition();
        SessionDescriptor* get_desc();
        ANY* get_msgBody();
        DWORD* get_time();
        ANY* get_addition();
        bool desc_presence();
        bool msgBody_presence();
        bool time_presence();
        bool addition_presence();
    private:
        void init();
    };

1) A structure is mapped to a C++ class of the same name, which inherits from the ANY class.
2) The static member constants _tag and _codepage record the TAG that the BXML compiler assigned to the structure and the codepage specified in the BXML structure description file.
3) For each structure member, a static member constant records the internal tag value that the BXML compiler assigned to it.
4) For each structure member, a member pointer variable of the same name with a "_" prefix holds the member's value. If a member is marked as present by its presence variable but its value pointer is null, the member is present without a value.
5) For each structure member, a _xxx_presence variable indicates whether the member is present.
6) For each structure member, there should be several overloaded set_xxx member functions for setting the member's value; they should normally include a form that takes over a pointer and a form that copies the value.
7) For each structure member, an unset_xxx member function marks the member as not present.
8) The copy constructor and the overloaded assignment operator should be supported.
9) The write member function encodes the structure object itself.
10) For debugging purposes, the toString member function outputs a printable string for the object itself.
11) The parse virtual member function of the base class ANY should be implemented for decoding; it calls the parseStatic function.
12) The parseStatic static member function decodes a structure object.
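Notes 4), 5) and 7) imply three states for a member: absent, present without a value, and present with a value. The C++ sketch below shows one way a generated write function might turn these states into bytes. It is an interpretation, not the patent's code: `IntMember`, `writeMember` and `writeStruct` are hypothetical names, only integer members are handled, and the sketch assumes that an absent member emits nothing at all.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

const std::uint8_t CONTENT_FLAG = 0x80;  // bit 7 of a tag: a value (or members) follows
const std::uint8_t TAG_END = 0x01;       // END

// Multi-byte integer, 7 payload bits per byte, most significant group first.
void writeInteger(Bytes& out, std::uint32_t v) {
    Bytes tmp;
    tmp.push_back(static_cast<std::uint8_t>(v & 0x7F));
    while ((v >>= 7) != 0) {
        tmp.insert(tmp.begin(), static_cast<std::uint8_t>((v & 0x7F) | 0x80));
    }
    out.insert(out.end(), tmp.begin(), tmp.end());
}

// Hypothetical, simplified integer member of a generated class.
struct IntMember {
    std::uint8_t internalTag;    // assigned by the compiler, 0x05 upward
    bool presence;               // plays the role of _xxx_presence
    const std::uint32_t* value;  // plays the role of _xxx; null = present without a value
};

// Absent: nothing. Present without value: bare tag. Present with value: tag|0x80 + value.
void writeMember(Bytes& out, const IntMember& m) {
    if (!m.presence) return;
    if (m.value == nullptr) {
        out.push_back(m.internalTag);
        return;
    }
    out.push_back(static_cast<std::uint8_t>(m.internalTag | CONTENT_FLAG));
    writeInteger(out, *m.value);
}

// Structure TAG with bit 7 clear if no member is present; otherwise bit 7 set,
// the members, and a closing END.
void writeStruct(Bytes& out, std::uint8_t structTag, const std::vector<IntMember>& members) {
    bool anyPresent = false;
    for (const IntMember& m : members) anyPresent = anyPresent || m.presence;
    if (!anyPresent) {
        out.push_back(structTag);
        return;
    }
    out.push_back(static_cast<std::uint8_t>(structTag | CONTENT_FLAG));
    for (const IntMember& m : members) writeMember(out, m);
    out.push_back(TAG_END);
}
```

Using a pointer plus a separate presence flag, as the generated header does, is what makes the middle state (present, no value) expressible; a single nullable pointer could only distinguish two states.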
Mapping of arrays:

BXML structure description

    arrayof UserInfo UserInfoList;
    arrayof int IntegerList;
    arrayof string StringList;
    arrayof binary BinaryList;

Excerpt of the generated C++ header file

    class UserInfoList: public vector< UserInfo* >
    {
    public:
        UserInfoList();
        UserInfoList(const UserInfoList& other);
        ~UserInfoList();
        UserInfoList& operator=(const UserInfoList& other);
        void clean();
        void write(BXMLBuffer& buffer, BXMLWriter& w);
        string toString(int level);
        static UserInfoList* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
        void add(const UserInfo& val);
    };

    class IntegerList: public vector< DWORD >
    {
    public:
        IntegerList();
        IntegerList(const IntegerList& other);
        ~IntegerList();
        IntegerList& operator=(const IntegerList& other);
        void clean();
        void write(BXMLBuffer& buffer, BXMLWriter& w);
        string toString(int level);
        static IntegerList* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
    };

    class StringList: public vector< string* >
    {
    public:
        StringList();
        StringList(const StringList& other);
        ~StringList();
        StringList& operator=(const StringList& other);
        void clean();
        void write(BXMLBuffer& buffer, BXMLWriter& w);
        string toString(int level);
        static StringList* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
        void add(const string& val);
        void add(const char* val);
    };

    class BinaryList: public vector< ByteArray* >
    {
    public:
        BinaryList();
        BinaryList(const BinaryList& other);
        ~BinaryList();
        BinaryList& operator=(const BinaryList& other);
        void clean();
        void write(BXMLBuffer& buffer, BXMLWriter& w);
        string toString(int level);
        static BinaryList* parseStatic(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
        void add(const ByteArray& val);
        void add(const BYTE* val, int len);
    };

1) An array type is mapped to a C++ class of the same name, which inherits from an STL vector template class.
2) The copy constructor and the overloaded assignment operator should be supported.
3) The write member function encodes the array object itself.
4) For debugging purposes, the toString member function outputs a printable string for the object itself.
5) The parseStatic static member function decodes an array object.

Mapping of integers:
An integer is mapped to the C++ DWORD type. The BXML runtime library provides a BXMLInt class that supports encoding, decoding, and printable-string output for it.

String mapping:
A string is mapped to the C++ STL string type. The BXML runtime library provides a BXMLString class that supports encoding, decoding, and printable-string output for it.

Binary byte sequence mapping:
A binary byte sequence is mapped to the C++ STL vector<BYTE> type. The BXML runtime library provides a BXMLBinary class that supports encoding, decoding, and printable-string output for it.

Indefinite structure mapping:
An indefinite structure is mapped to the ANY class in the BXML runtime library, which is also the base class of every C++ class generated from a structure. The definition of the ANY class is summarized as follows:
class ANY
{
public:
    static ANY* parseANY(uBYTE id, BXMLBuffer& buffer, BXMLParser& p);
    virtual ~ANY() {}
    virtual ANY* duplicate() = 0;
    virtual uBYTE getCodepage() = 0;
    virtual uBYTE getTag() = 0;
    virtual ANY* parse(uBYTE id, BXMLBuffer& buffer, BXMLParser& p) = 0;
    virtual void write(BXMLBuffer& buffer, BXMLWriter& w, bool withtag) = 0;
    virtual string toString(int level, bool withtag) = 0;
};
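One plausible way for generated structure classes to plug into such a base class is sketched below. It is only an illustration under stated assumptions: `Any` is a simplified stand-in for ANY with the buffer and parser arguments omitted, and `LoginRes`, `makeLoginRes`, `registry`, and `createByTag` are hypothetical names. The lookup by (codepage, tag) is an assumption about how a parseANY-like function could select the concrete class; the tag value 8 with codepage 0 for LoginRes is taken from the example encoding later in this description.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the ANY base class (buffer/parser arguments omitted).
class Any {
public:
    virtual ~Any() {}
    virtual std::unique_ptr<Any> duplicate() const = 0;
    virtual std::uint8_t getCodepage() const = 0;
    virtual std::uint8_t getTag() const = 0;
    virtual std::string toString() const = 0;
};

// Hypothetical generated class for a structure that has no members.
class LoginRes : public Any {
public:
    std::unique_ptr<Any> duplicate() const override {
        return std::unique_ptr<Any>(new LoginRes(*this));
    }
    std::uint8_t getCodepage() const override { return 0; }
    std::uint8_t getTag() const override { return 8; }
    std::string toString() const override { return "LoginRes"; }
};

using Factory = std::unique_ptr<Any> (*)();
using TagKey = std::pair<std::uint8_t, std::uint8_t>;  // (codepage, tag)

inline std::unique_ptr<Any> makeLoginRes() {
    return std::unique_ptr<Any>(new LoginRes());
}

// Registry that a parseANY-like function could consult (assumed design).
inline std::map<TagKey, Factory>& registry() {
    static std::map<TagKey, Factory> r = {
        {TagKey(0, 8), &makeLoginRes},
    };
    return r;
}

// Returns a new object for the given structure tag, or nullptr if unknown.
inline std::unique_ptr<Any> createByTag(std::uint8_t codepage, std::uint8_t tag) {
    auto it = registry().find(TagKey(codepage, tag));
    if (it == registry().end()) {
        return nullptr;
    }
    return it->second();
}
```

With this arrangement, code that holds an indefinite member only needs the base-class interface; the concrete type is chosen at decode time from the structure tag.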
In addition, the BXML runtime library has two other important classes, BXMLWriter and BXMLParser, which support encoding and decoding of top-level BXML data.

Mapping to the Java language
Enumeration mapping:
Figure imgf000021_0001
Figure imgf000022_0001
Structure mapping:
BXML structure description
struct Message {
    SessionDescriptor desc;
    ANY msgBody;
    int time;
    ANY addition;
Figure imgf000023_0001
    public boolean time_presence()
    public boolean addition_presence()
    public ANY parse(int id, InputStream in, BXMLParser p)
    public static Message parseStatic(int id, InputStream in, BXMLParser p)
    public void write(OutputStream out, BXMLWriter w, boolean withtag)
    public String toString(int level, boolean withtag)
1) The mapping rules are similar to those for C++; refer to the description above.

Array mapping:
BXML structure description
arrayof UserInfo UserInfoList;
arrayof int IntegerList;
arrayof string StringList;
arrayof binary BinaryList;
Generated Java class summary
public final class UserInfoList extends Vector
{
    public UserInfo getItem(int i)
    public static UserInfoList parseStatic(int id, InputStream in, BXMLParser p)
    public void write(OutputStream out, BXMLWriter w)
    public String toString(int level)
}
public final class IntegerList extends Vector
{
    public Integer getItem(int i)
    public static IntegerList parseStatic(int id, InputStream in, BXMLParser p)
    public void write(OutputStream out, BXMLWriter w)
    public String toString(int level)
}
public final class StringList extends Vector
{
    public String getItem(int i)
    public static StringList parseStatic(int id, InputStream in, BXMLParser p)
    public void write(OutputStream out, BXMLWriter w)
Figure imgf000025_0001

Integer mapping:
An integer is mapped to the Java Integer type. The BXML runtime library provides a BXMLInt class that supports encoding, decoding, and printable-string output for it.

String mapping:
A string is mapped to the Java String type. The BXML runtime library provides a BXMLString class that supports encoding, decoding, and printable-string output for it.

Binary byte sequence mapping:
A binary byte sequence is mapped to the Java byte[] type. The BXML runtime library provides a BXMLBinary class that supports encoding, decoding, and printable-string output for it.

Indefinite structure mapping:
An indefinite structure is mapped to the ANY class in the BXML runtime library, which is also the base class of every Java class generated from a structure. The definition of the ANY class is summarized as follows:
public abstract class ANY {
    public static ANY parseANY(int id, InputStream in, BXMLParser p);
    public abstract ANY parse(int id, InputStream in, BXMLParser p);
    public abstract void write(OutputStream out, BXMLWriter w, boolean withtag);
    public abstract int getCodepage();
    public abstract int getTag();
    public abstract String toString(int level, boolean withtag);
}
In addition, the BXML runtime library has two other important classes, BXMLWriter and BXMLParser, which support encoding and decoding of top-level BXML data.
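The top-level framing that such writer and parser classes have to handle can be sketched from the example encoding given later in this description, which starts with a version byte (01), a two-byte length (00 87, meaning 135 following bytes), and a character-set byte (6a, UTF-8). The C++ sketch below is an illustration only: `BxmlHeader` and `readHeader` are hypothetical names, and reading the length as a fixed two-byte big-endian value is an inference from those example bytes, not a statement of the runtime library's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical view of the top-level BXML framing (inferred from the example).
struct BxmlHeader {
    std::uint8_t version;      // 0x01 in the example
    std::uint16_t payloadLen;  // number of bytes that follow the length field
    std::uint8_t charset;      // 0x6a = 106, the IANA MIBenum value for UTF-8
};

// Reads the header and checks the declared length against the buffer size.
// Assumption: the length is a 2-byte big-endian value, as "00 87" suggests.
inline BxmlHeader readHeader(const std::vector<std::uint8_t>& data) {
    if (data.size() < 4) {
        throw std::runtime_error("BXML message too short");
    }
    BxmlHeader h;
    h.version = data[0];
    h.payloadLen = static_cast<std::uint16_t>((data[1] << 8) | data[2]);
    h.charset = data[3];  // first byte of the payload
    if (data.size() != static_cast<std::size_t>(3) + h.payloadLen) {
        throw std::runtime_error("length field does not match message size");
    }
    return h;
}
```

For the 138-byte example message, the three framing bytes plus the declared 135 payload bytes account for the total length.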
An application running example
An application was developed using the BXML structure description file from the earlier example and the source code that the BXML compiler generates from it. The program constructs structured data with the Message structure defined in the description file as the top-level structure, prints the data content with the structure's self-printing function, and then performs BXML encoding with the encoding function.
Self-printed output of the Message structure:
Message
  desc
    type
      outband
    address
      url
        sip:joe.li@utstar.com
    sessionID
      abcd
  msgBody
    LoginReq
      deviceID
        UTStarcomABC
      userList
        UserInfo
          userID
            Joe.li
          age
            29
          key
            0c 16 00 17
        UserInfo
          userID
            Mike
          age
            25
          key
            38 23 0c 43 45
      blist
        4e c8
        15 19 06
      slist
        string1
        string2
      ilist
        30
        800
  time
    540394
  addition
    LoginRes

After encoding by the encoding function, the total length of the data is 138 bytes. Its content, in hexadecimal, is as follows:
01 00 87 6a 89 85 85 02 86 85 73 69 70 3a 6a 6f
65 2e 6c 69 40 75 74 73 74 61 72 2e 63 6f 6d 00
87 61 62 63 64 00 01 86 87 85 55 54 53 74 61 72
63 6f 6d 41 42 43 00 86 86 85 4a 6f 65 2e 6c 69
00 86 1d 87 04 0c 16 00 17 01 86 85 4d 69 6b 65
00 86 19 87 05 38 23 0c 43 45 01 01 87 82 02 4e
c8 82 03 15 19 06 01 88 82 73 74 72 69 6e 67 31
00 82 73 74 72 69 6e 67 32 00 01 89 82 1e 82 86
20 01 01 87 a0 fd 6a 88 08 01

The above BXML encoding is explained in the following table:
Byte sequence                   Explanation
01                              BXML version: 1.1
00 87                           Length of the subsequent encoded data: 135 bytes
6a                              Character set: UTF-8
89                              Structure tag of Message, with content
85                              Internal tag desc of the Message structure, with content
85                              Internal tag type of the SessionDescriptor structure, with content
02                              Value of SessionType: outband
86                              Internal tag address of the SessionDescriptor structure, with content
85                              Internal tag url of the SessionAddress enumeration, with content
73 69 70 3a 6a 6f 65 2e 6c 69   Value of the url string: sip:joe.li@utstar.com
40 75 74 73 74 61 72 2e 63 6f
6d 00
87                              Internal tag sessionID of the SessionDescriptor structure, with content
61 62 63 64 00                  Value of the sessionID string: abcd
01                              End of the SessionDescriptor structure
86                              Internal tag msgBody of the Message structure, with content
87                              Structure tag of LoginReq, with content (no codepage present, meaning codepage 0)
85                              Internal tag deviceID of the LoginReq structure, with content
55 54 53 74 61 72 63 6f 6d 41   Value of the deviceID string: UTStarcomABC
42 43 00
86                              Internal tag userList of the LoginReq structure, with content
82                              ARRAY_ITEM global tag
85                              Internal tag userID of the UserInfo structure, with content
4a 6f 65 2e 6c 69 00            Value of the userID string: Joe.li
86                              Internal tag age of the UserInfo structure, with content
1d                              Value of age: 29
87                              Internal tag key of the UserInfo structure, with content
04                              Length of the key binary byte sequence: 4
0c 16 00 17                     Content of the key binary byte sequence
01                              End of the UserInfo structure
82                              ARRAY_ITEM global tag
85                              Internal tag userID of the UserInfo structure, with content
4d 69 6b 65 00                  Value of the userID string: Mike
86                              Internal tag age of the UserInfo structure, with content
19                              Value of age: 25
87                              Internal tag key of the UserInfo structure, with content
05                              Length of the key binary byte sequence: 5
38 23 0c 43 45                  Content of the key binary byte sequence
01                              End of the UserInfo structure
01                              End of the userList array
87                              Internal tag bList of the LoginReq structure, with content
82                              ARRAY_ITEM global tag
02                              Length of the binary byte sequence: 2
4e c8                           Content of the binary byte sequence
82                              ARRAY_ITEM global tag
03                              Length of the binary byte sequence: 3
15 19 06                        Content of the binary byte sequence
01                              End of the bList array
88                              Internal tag sList of the LoginReq structure, with content
82                              ARRAY_ITEM global tag
73 74 72 69 6e 67 31 00         String: string1
82                              ARRAY_ITEM global tag
73 74 72 69 6e 67 32 00         String: string2
01                              End of the sList array
89                              Internal tag iList of the LoginReq structure, with content
82                              ARRAY_ITEM global tag
1e                              Integer: 30
82                              ARRAY_ITEM global tag
86 20                           Integer: 800
01                              End of the iList array
01                              End of the LoginReq structure
87                              Internal tag time of the Message structure, with content
a0 fd 6a                        Integer: 540394
88                              Internal tag addition of the Message structure, with content
08                              Structure tag of LoginRes, no content (no codepage present, meaning codepage 0)
01                              End of the Message structure

In summary, the improvements of the present invention over WBXML mainly include:
WBXML elements are mapped to the structured data types of a computer language, and every element tag represents the encoding of the corresponding structure type;
The concept of an internal tag is added. An internal tag is interpreted within a structure and represents the corresponding structure member. Depending on run-time needs, an internal tag may carry content (the member value), carry no content (no value present), or not appear at all (the member is absent);
The content carried by an internal tag needs no further tag, because its type is known in advance, unless the member type is the indefinite structure (ANY);
The content carried by an internal tag is encoded directly according to the corresponding member type; for example, an integer is encoded as a multi-byte integer, unlike WBXML, which still uses string encoding;
A predefined global tag, ARRAY_ITEM, is added to separate array elements;
WBXML attributes are not given dedicated support, but attributes can be expressed by adding structure members;
WBXML string constants are not given dedicated support, but they can be expressed with enumeration types.
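The type-directed content encoding described above can be sketched in C++ as follows. This is a minimal illustration, not the generated code: the helper names (`putMbInt`, `getMbInt`, `putString`, `putBinary`, `encodeUserInfoBody`) are hypothetical. The byte layouts are read off the example table (multi-byte integers with 7 bits per byte, NUL-terminated strings, length-prefixed binary, internal tags 5, 6, and 7 for userID, age, and key, and 01 as the end marker). Treating the binary length prefix as a multi-byte integer is an assumption, since the example only shows single-byte lengths.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

const std::uint8_t kEnd = 0x01;         // closes a structure or an array
const std::uint8_t kHasContent = 0x80;  // top bit of a tag: content follows

// Multi-byte integer: 7 bits per byte, most significant group first,
// with the continuation bit 0x80 set on every byte except the last.
void putMbInt(Bytes& out, std::uint32_t v) {
    std::uint8_t groups[5];
    int n = 0;
    do {
        groups[n++] = static_cast<std::uint8_t>(v & 0x7f);
        v >>= 7;
    } while (v != 0);
    while (n > 0) {
        --n;
        out.push_back(static_cast<std::uint8_t>(n > 0 ? (groups[n] | 0x80) : groups[n]));
    }
}

// Decodes a multi-byte integer starting at pos and advances pos past it.
std::uint32_t getMbInt(const Bytes& in, std::size_t& pos) {
    std::uint32_t v = 0;
    std::uint8_t b;
    do {
        b = in.at(pos++);
        v = (v << 7) | (b & 0x7fu);
    } while (b & 0x80);
    return v;
}

// String content: the raw bytes followed by a terminating 0x00.
void putString(Bytes& out, const std::string& s) {
    out.insert(out.end(), s.begin(), s.end());
    out.push_back(0x00);
}

// Binary content: a length prefix (assumed multi-byte integer), then the bytes.
void putBinary(Bytes& out, const Bytes& b) {
    putMbInt(out, static_cast<std::uint32_t>(b.size()));
    out.insert(out.end(), b.begin(), b.end());
}

// Encodes the body of one UserInfo value: internal tag, content, ..., end.
Bytes encodeUserInfoBody(const std::string& userID, std::uint32_t age, const Bytes& key) {
    Bytes out;
    out.push_back(static_cast<std::uint8_t>(kHasContent | 0x05));  // userID
    putString(out, userID);
    out.push_back(static_cast<std::uint8_t>(kHasContent | 0x06));  // age
    putMbInt(out, age);
    out.push_back(static_cast<std::uint8_t>(kHasContent | 0x07));  // key
    putBinary(out, key);
    out.push_back(kEnd);
    return out;
}
```

Calling `encodeUserInfoBody("Joe.li", 29, {0x0c, 0x16, 0x00, 0x17})` yields the bytes 85 4a 6f 65 2e 6c 69 00 86 1d 87 04 0c 16 00 17 01, which match the first UserInfo rows of the example table. No per-value type tag is written, because the decoder already knows each member's type from the structure description.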
Although the method of the present invention has been described by way of example with reference to the accompanying drawings, the present invention is not limited to these details, and this application covers all variations and changes within the scope of the claims.

Industrial applicability
The method of the present invention provides a binary encoding method for structured data that is suitable for automatic code generation. The invention is applicable to the exchange of all kinds of application data independently of platform, language, and transport, for example network communication protocols, data synchronization between smart devices, and structured data storage.

Claims

1. A binary encoding method for structured data suitable for automatic code generation, characterized by comprising the following steps:
Step 1: define the BXML encoding format;
Step 2: according to the specific application requirements, construct a structured data description file suitable for BXML encoding;
Step 3: read the structured data description file with a BXML compiler, which generates source code in a specific computer language according to the command given;
Step 4: combine the generated code with the specific application logic and transport to implement complete application-layer data exchange.
2. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that the structured data description file is constructed according to the following requirements: a structured data description always targets one specific context; a context description may consist of one or more BXML structure description files; and one run of the BXML compiler always targets one context and reads in all description files of that context together.
3. The binary encoding method for structured data suitable for automatic code generation according to claim 2, characterized in that each BXML structure description file must use a keyword at the beginning of the file to specify the encoding space of the file, which applies to all structures described in the file; within one context, each encoding space must be unique.
4. The binary encoding method for structured data suitable for automatic code generation according to claim 3, characterized in that, after the keyword, each BXML structure description file specifies the package name or namespace required when generating program source code, which applies to all descriptions in the file; different description files in one context may specify the same or different package names or namespaces.
5. ... encoding method, characterized in that a BXML structure description file may directly use a data type defined in any BXML description file of the same context, and no two data types in the same context may have the same name.
6. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that defining the BXML encoding format further comprises defining a tag (TAG), the TAG being encoded as a single byte.
7. The binary encoding method for structured data suitable for automatic code generation according to claim 6, characterized in that the most significant bit of the single byte indicates whether the TAG is followed by encoded content.
8. The binary encoding method for structured data suitable for automatic code generation according to claim 6, characterized in that bits 0 to 6 of the single byte define the TAG identifier.
9. The binary encoding method for structured data suitable for automatic code generation according to claim 8, characterized in that a certain range of values of the single byte is reserved for predefined tags.
10. The binary encoding method for structured data suitable for automatic code generation according to any one of claims 6 to 9, characterized in that a TAG is always valid within the space to which it belongs.
11. The binary encoding method for structured data suitable for automatic code generation according to any one of claims 6 to 9, characterized in that there are three types of TAG: predefined tags, which are always globally valid; structure tags, which are assigned automatically by the BXML compiler; and internal tags, which identify whether a member is present.
12. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that defining the BXML encoding format comprises the following data types, each defined by a keyword:
Integer: keyword int
String: keyword string
Opaque binary byte sequence: keyword binary
Enumeration: keyword enum
Union: keyword union
Structure: keyword struct
Indefinite structure: keyword ANY
Array: keyword arrayof.
13. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that a context description may consist of one or more BXML structure description files.
14. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that, corresponding to multiple communication interfaces of different types, an application may hold different contexts at the same time, so as to exchange data with multiple functional entities of different types.
15. ... encoding method, characterized in that, in one context compilation, the BXML compiler assigns a structure TAG to each structure of each description file in turn; if the number of structures in one page exceeds 123, they are distributed over the encoding spaces of different files; the maximum number of application structures in one context is 31488.
16. The binary encoding method for structured data suitable for automatic code generation according to claim 14, characterized in that the BXML compiler assigns internal tags in turn to the members of each structure or union, and the maximum number of internal tags in one structure or union is 123.
17. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that the source code provides a direct mapping from the structured data description to code, and automatically implements the encoding mapping of structure tags and the corresponding BXML encoding, decoding, and self-printing functions.
18. The binary encoding method for structured data suitable for automatic code generation according to claim 1, characterized in that the BXML encoding format comprises descriptions of the version number, message length, and character set, and of basic data types, unions, enumerations, structures, and indefinite structures.
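The single-byte tag layout recited in claims 6 to 8 can be sketched as follows. The helper names (`makeTag`, `tagHasContent`, `tagId`) are hypothetical. The two capacity constants rest on inferences rather than on anything stated outright: the example encoding uses identifiers 1 and 2 for predefined tags and starts internal tags at 5, which suggests that identifiers 0 to 4 are reserved and leaves 128 - 5 = 123 usable identifiers; and 31488 / 123 = 256 suggests 256 encoding spaces per context.

```cpp
#include <cstdint>

// Bit 7 says whether content follows; bits 0 to 6 carry the tag identifier.
constexpr std::uint8_t kContentBit = 0x80;
constexpr std::uint8_t kIdMask = 0x7f;

constexpr std::uint8_t makeTag(std::uint8_t id, bool hasContent) {
    return static_cast<std::uint8_t>((id & kIdMask) | (hasContent ? kContentBit : 0));
}

constexpr bool tagHasContent(std::uint8_t tag) {
    return (tag & kContentBit) != 0;
}

constexpr std::uint8_t tagId(std::uint8_t tag) {
    return static_cast<std::uint8_t>(tag & kIdMask);
}

// Capacity arithmetic. Both constants below are assumptions inferred from
// the example bytes and from the limits 123 and 31488 given in the claims.
constexpr int kReservedIds = 5;                   // assumed: ids 0..4 predefined
constexpr int kIdsPerSpace = 128 - kReservedIds;  // 123
constexpr int kEncodingSpaces = 256;              // assumed: one byte of codepage

static_assert(kIdsPerSpace == 123, "usable identifiers per encoding space");
static_assert(kIdsPerSpace * kEncodingSpaces == 31488, "structures per context");
```

For instance, the byte 89 in the example is identifier 9 (the Message structure tag) with content, while 08 is identifier 8 (the LoginRes structure tag) with no content.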
PCT/CN2004/000122 2004-02-13 2004-02-13 A method of binary encode that adapts to structured data whose code is automatically generated WO2005081408A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2004800236517A CN1836374B (en) 2004-02-13 2004-02-13 Binary encoding method of structured data suitable to generate codes automatically
PCT/CN2004/000122 WO2005081408A1 (en) 2004-02-13 2004-02-13 A method of binary encode that adapts to structured data whose code is automatically generated

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2004/000122 WO2005081408A1 (en) 2004-02-13 2004-02-13 A method of binary encode that adapts to structured data whose code is automatically generated

Publications (2)

Publication Number Publication Date
WO2005081408A1 true WO2005081408A1 (en) 2005-09-01
WO2005081408A9 WO2005081408A9 (en) 2005-11-10

Family

ID=34876874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2004/000122 WO2005081408A1 (en) 2004-02-13 2004-02-13 A method of binary encode that adapts to structured data whose code is automatically generated

Country Status (2)

Country Link
CN (1) CN1836374B (en)
WO (1) WO2005081408A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111258629A (en) * 2018-11-30 2020-06-09 武汉斗鱼网络科技有限公司 Mobile phone code transcoding method, storage medium, electronic device and system
US11599708B2 (en) 2017-10-20 2023-03-07 Hewlett Packard Enterprise Development Lp Encoding of data formatted in human readable text according to schema into binary

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105955066A (en) * 2016-05-30 2016-09-21 北京理工大学 Universal model data coding and decoding method in simulation system
CN110162480B (en) * 2019-05-31 2023-02-24 泛升云微电子(苏州)有限公司 Automatic analysis method for structured diagnosis object

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002063775A2 (en) * 2001-02-05 2002-08-15 Expway Method and system for compressing structured documents
WO2003001811A1 (en) * 2001-06-25 2003-01-03 Siemens Aktiengesellschaft System for the improved encoding/decoding of structured, particularly xml-based, documents and methods and devices for the improved encoding/decoding of binary representations of such documents
US20030046317A1 (en) * 2001-04-19 2003-03-06 Istvan Cseri Method and system for providing an XML binary format

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11599708B2 (en) 2017-10-20 2023-03-07 Hewlett Packard Enterprise Development Lp Encoding of data formatted in human readable text according to schema into binary
CN111258629A (en) * 2018-11-30 2020-06-09 武汉斗鱼网络科技有限公司 Mobile phone code transcoding method, storage medium, electronic device and system
CN111258629B (en) * 2018-11-30 2023-08-11 苏州新看点信息技术有限公司 Mobile phone code transcoding method, storage medium, electronic equipment and system

Also Published As

Publication number Publication date
WO2005081408A9 (en) 2005-11-10
CN1836374B (en) 2010-10-13
CN1836374A (en) 2006-09-20

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200480023651.7

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
COP Corrected version of pamphlet

Free format text: PAGE 1/1, DRAWINGS, REPLACED BY CORRECT PAGE 1/1

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Ref document number: DE

122 Ep: pct application non-entry in european phase