WO2016082367A1 - Method and device for realizing hardware table traversal based on network processor - Google Patents

Info

Publication number
WO2016082367A1
Authority
WO
WIPO (PCT)
Prior art keywords
hardware
cpu
copy
value
entry
Prior art date
Application number
PCT/CN2015/073808
Other languages
French (fr)
Chinese (zh)
Inventor
黄治文
张文军
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2016082367A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a method and apparatus for implementing hardware table traversal based on a network processor.
  • A passive optical network (PON) is a specific type of optical communication network that contains no electronic devices or electronic power supplies, and consists of three parts: the optical line terminal (OLT), optical network units (ONUs), and the optical distribution network (ODN).
  • OLT: optical line terminal
  • ONU: optical network unit
  • ODN: optical distribution network
  • the OLT device is located at the central control station, and multiple ONUs are installed at user sites.
  • the ODN between the OLT and the ONUs contains optical fiber and passive splitters or couplers, but no active electronic equipment. A PON therefore has advantages such as no power-supply burden, strong environmental adaptability, and immunity to electromagnetic and lightning interference.
  • Using PON technology for access network construction can also save the cost of equipment room construction and reduce operation and maintenance costs.
  • EPON passive optical access network has become one of the main choices for the construction of next-generation optical access networks.
  • a network processor is a programmable device that is specifically applied to various tasks of a communication network, such as message processing, protocol analysis, and route lookup.
  • the unique hardware design of the network processor makes it widely used in OLT devices with flexible programming and efficient message processing.
  • many applications in OLT devices require large-capacity hardware tables, such as a hardware MAC address table. When the central processing unit (CPU) needs to obtain all entries in a large-capacity hardware table that have certain characteristics, how to traverse the table quickly and report the matching entries in a batch becomes one of the technical problems in the development of OLT device drivers.
  • the embodiment of the invention provides a method and a device for implementing hardware table traversal based on a network processor, which can effectively reduce the average response time of large-capacity hardware table traversal and matching hardware table batch reporting in the network processor.
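  • an embodiment of the present invention provides a method for implementing hardware table traversal based on a network processor, where the method includes:
  • receiving a CPU message, where the CPU message carries matching information for hardware entries;
  • configuring a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and is equal to the total number of hardware entries;
  • copying the CPU message according to the value of the copy register, and assigning each copy a copy sequence number, which is the ordinal of that copy;
  • comparing the hardware entries in the hardware table with the copied CPU messages one by one, and determining whether the information of each hardware entry matches the matching information of the CPU message;
  • if they match, storing the matching hardware entry in a pre-configured cache table and updating the value of a pre-configured counter.
  • after the step of storing the matching hardware entry and updating the counter, the method further includes: obtaining the saved hardware entries from the cache table according to the value of the counter.
  • the step of configuring the copy register according to the total number of hardware entries includes: detecting the type information of the CPU message; and configuring the copy register according to the type information.
  • the step of comparing the hardware entries with the copied CPU messages one by one includes: obtaining, according to the copy sequence number of a copied CPU message, the hardware entry whose index is equal to that copy sequence number; and determining whether the matching information of the CPU message matches the information of the obtained hardware entry.
  • after the step of storing the matching hardware entry and updating the counter, the method further includes: discarding the CPU message corresponding to the copy sequence number.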
  • An embodiment of the present invention further provides an apparatus for implementing hardware table traversal based on a network processor, where the foregoing apparatus includes:
  • the receiving module is configured to receive a CPU message, and the CPU message carries matching information of the hardware entry;
  • the configuration module is configured to configure a copy register according to the total number of hardware entries, and the value of the copy register is the number of copies of the CPU message to be copied, and the value of the copy register is equal to the total number of hardware entries;
  • the copy module is set to copy the CPU message according to the value of the copy register, and assign a copy sequence number to each copied CPU message, and the copy sequence number is the sequence number when the CPU message is copied;
  • the judging module is configured to compare the hardware entries in the hardware table with the copied CPU packets one by one, and determine whether the information of the hardware entries matches the matching information of the CPU packets;
  • the update module is configured to: when the information of the hardware entry matches the matching information of the CPU message, store the matched hardware entry in a pre-configured cache table, and update the value of the pre-configured counter.
  • the device further comprises:
  • the obtaining module is set to obtain the saved hardware entry from the cache table according to the value of the counter.
  • the configuration module includes:
  • a detecting unit configured to detect type information of a CPU message
  • the startup unit is set to configure the copy register based on the type information.
  • the judging module includes:
  • the first unit is configured to obtain, according to the copy sequence number of the copied CPU message, a hardware entry whose hardware entry index number is equal to the copy sequence number;
  • the second unit is configured to determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
  • the device further comprises:
  • traversing the hardware table with the microcode module and the logical multicast module of the network processor greatly shortens the traversal time, thereby effectively reducing the average response time of large-capacity hardware table traversal and batch reporting of matching hardware entries in the network processor.
  • FIG. 1 is a flowchart of the steps of a method for implementing hardware table traversal based on a network processor according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of the specific steps of step 12 in FIG. 1 according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of the specific steps of step 14 in FIG. 1 according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a line card and a network processor according to an embodiment of the present invention.
  • FIG. 5 shows the entry format of sys_mac_table and mac_cache_table according to an embodiment of the present invention.
  • FIG. 6 shows the header format of a CPU message according to an embodiment of the present invention.
  • FIG. 7 shows the header format of the CPU message MAC_GET_PKT according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of the microcode program mac_iter_process in an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an apparatus for implementing hardware table traversal based on a network processor according to an embodiment of the present invention.
  • the present invention addresses the problem in the prior art that the central processing unit (CPU) traverses a large-capacity hardware table directly, which results in a long response time. It provides a method and device for implementing hardware table traversal based on a network processor, which can effectively reduce the average response time of large-capacity hardware table traversal and batch reporting of matching hardware entries in the network processor.
  • an embodiment of the present invention provides a method for implementing hardware table traversal based on a network processor, where the method includes:
  • Step 11: receive a CPU message, where the CPU message carries matching information for hardware entries.
  • Step 12: configure a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and is equal to the total number of hardware entries.
  • Step 13: copy the CPU message according to the value of the copy register, and assign each copy a copy sequence number, which is the ordinal of that copy.
  • Step 14: compare the hardware entries in the hardware table with the copied CPU messages one by one, and determine whether the information of each hardware entry matches the matching information of the CPU message.
  • Step 15: if they match, store the matching hardware entry in a pre-configured cache table and update the value of the pre-configured counter.
  • the method further includes: acquiring the saved hardware entry from the cache table according to the value of the counter.
  • the method further includes: discarding the CPU message corresponding to the copy sequence number.
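Steps 11 to 15 can be modelled in software as follows. This is only a minimal Python sketch under stated assumptions, not the microcode of the embodiment: the function name `traverse`, the dict-based entry layout, and the field names `valid`, `port_id`, and `vlan_id` are hypothetical, and the replication that the network processor performs in hardware is imitated by an ordinary loop over copy sequence numbers.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, int]


def traverse(hardware_table: List[Entry],
             match_info: Dict[str, int]) -> Tuple[int, List[Entry]]:
    """Return (counter, cache_table) after one simulated traversal."""
    # Step 12: the copy register holds the total number of hardware entries.
    copy_register = len(hardware_table)
    counter = 0
    cache_table: List[Entry] = []

    # Step 13: one copy of the CPU message per copy sequence number.
    for copy_seq in range(copy_register):
        # Step 14: the copy inspects the entry whose index equals its number.
        entry = hardware_table[copy_seq]
        matched = bool(entry["valid"]) and all(
            entry[field] == value for field, value in match_info.items())
        # Step 15: cache the matching entry and update the counter.
        if matched:
            cache_table.append(entry)
            counter += 1
        # The copy is discarded here whether or not the entry matched.

    return counter, cache_table


hardware_table = [
    {"valid": 1, "port_id": 1, "vlan_id": 10},
    {"valid": 1, "port_id": 2, "vlan_id": 10},
    {"valid": 0, "port_id": 1, "vlan_id": 10},   # invalid, never reported
    {"valid": 1, "port_id": 1, "vlan_id": 20},
]
counter, cache_table = traverse(hardware_table, {"port_id": 1})
print(counter, [e["vlan_id"] for e in cache_table])
```

Passing an empty `match_info` returns every valid entry, which mirrors the case where the CPU specifies no field values.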
  • the logical multicast module and the microcode module of the network processor are used to implement fast traversal of the large-capacity hardware table, and the matching entry is cached, so that the CPU can directly read the matching entry.
  • the CPU sends a CPU message to the network processor, and the CPU message header carries the matching feature of the hardware table entry.
  • the matching feature of an entry means that one or several fields of the entry are equal to values specified by the CPU software. If the CPU software does not specify a value for any field of the hardware entry, all valid entries in the hardware table are to be obtained.
  • the CPU message enters the microcode module of the network processor, and the corresponding microcode program configures the copy register.
  • the value of the copy register is the number of copies of the message, and the number of copies is equal to the total number of entries in the hardware table.
  • the microcode module is a packet processing module common to general-purpose network processors and can store multiple microcode programs; by configuration, packets from different sources can be mapped to different microcode programs.
  • the corresponding microcode program begins processing the message.
  • the main task of the microcode program corresponding to the CPU message is to configure the copy register of the CPU message. The copy register stores the number of copies of the message to be made, which should be equal to N, the total number of entries in the hardware table.
  • the CPU message is sent to the logical multicast module for copying, and each copy carries a copy sequence number, which is the ordinal of that copy. The logical multicast module is the hardware unit in a general-purpose network processor that implements message replication.
  • messages whose copy register value is greater than 1 are copied in the logical multicast module, and each copy is assigned a copy sequence number that is stored in a specific register.
  • the CPU message is copied N times in the logical multicast module, and each copy has a copy sequence number between 0 and N-1.
  • each copy returns to the microcode module, and the corresponding microcode program reads the entry in the hardware table whose index is equal to the copy sequence number of that copy.
  • the entry index is the sequence number of the hardware entry in the hardware table. The microcode program determines whether the entry satisfies the matching feature in the message header. If it does, the counter that counts matching entries is incremented by one and the entry is written to the cache table that holds matching entries. If it does not, the entry is not processed. Finally, the microcode program discards the copy regardless of whether the entry matched.
  • each copy carries its copy sequence number back to the microcode module, and the corresponding microcode program is executed once per copy. Each execution reads and checks a different entry of the hardware table, whose index is equal to the copy sequence number of the copy, from 0 to N-1.
  • after the network processor has traversed the hardware table, the CPU, having waited for a certain period of time, directly reads the value of the counter and then reads the matching entries from the cache table according to that value, thereby obtaining the matching entries in a batch. For example, if the value of the counter is M, the CPU reads the first M entries of the cache table, which are all the matching entries. The waiting time of the CPU should be determined from the capacity of the hardware table and the processing speed of the network processor; it can also be determined experimentally.
  • because the microcode program does not report a matching entry to the CPU as soon as it reads one, but counts it with the counter and stores it in the cache table, the CPU's processing flow is reduced and the CPU can conveniently obtain the matching entries in a batch.
  • the initial value of the counter is 0, and the CPU can clear it.
  • the index at which a matching entry is stored in the cache table is determined by the current count value of the counter. In general, adding 1 to a hardware counter and returning the count value is an indivisible atomic operation, which prevents the cache table from being corrupted when the microcode program runs many times and operates on the counter in rapid succession.
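The role of the atomic add-and-return can be illustrated with threads standing in for concurrent runs of the microcode program. The following is a sketch under assumptions rather than a description of the hardware: `AtomicCounter` and `run_concurrent_matches` are hypothetical names, a `threading.Lock` imitates the indivisible hardware operation, and the cache slot is taken as the returned count minus one so that the first match lands at index 0.

```python
import threading
from typing import List, Optional


class AtomicCounter:
    """Software stand-in for a hardware counter with atomic add-and-return."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment_and_get(self) -> int:
        # The lock makes "add 1 and return the new value" indivisible.
        with self._lock:
            self._value += 1
            return self._value

    def read(self) -> int:
        with self._lock:
            return self._value


def run_concurrent_matches(matching_ids: List[int],
                           capacity: int) -> List[Optional[int]]:
    """Let every matching entry claim a cache slot from its own thread."""
    counter = AtomicCounter()
    cache_table: List[Optional[int]] = [None] * capacity

    def on_match(entry_id: int) -> None:
        count = counter.increment_and_get()
        # Each caller receives a distinct count, so no slot is written twice.
        cache_table[count - 1] = entry_id

    threads = [threading.Thread(target=on_match, args=(i,))
               for i in matching_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return cache_table[:counter.read()]


slots = run_concurrent_matches(list(range(100)), capacity=128)
print(len(slots), sorted(slots) == list(range(100)))
```

The order of entries in the cache table depends on scheduling, but the first M slots always hold exactly the M matches, which is all the CPU relies on.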
  • after leaving the microcode module, the copied CPU message returns to the microcode module. This loopback path can be configured when the network processor is initialized.
  • the microcode module processes the message very efficiently, so the processing time of the whole process is far less than the response time of the CPU directly traversing the large-capacity hardware table.
  • the time at which the CPU reads the cache table is only proportional to the actual number of matching entries. In most cases, the actual number of matching entries is much smaller than the total number of hardware table entries. Therefore, the response time of the method of the embodiment of the present invention is much smaller than the response time of the CPU directly traversing the hardware table.
  • the specific steps of step 12 are:
  • Step 21: detect the type information of the CPU message.
  • Step 22: configure the copy register according to the type information.
  • the specific steps of step 14 are:
  • Step 31: according to the copy sequence number of the copied CPU message, obtain the hardware entry whose index is equal to that copy sequence number.
  • Step 32: determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
  • each copy carries its copy sequence number back to the microcode module, and the corresponding microcode program is executed once per copy. Each execution reads and checks a different entry of the hardware table, whose index is equal to the copy sequence number of the copy.
  • an OLT includes a main control card, an uplink card, and a line card.
  • the line card is implemented by a network processor.
  • the line card forwards uplink and downlink packets and learns their MAC addresses.
  • the current MAC address information is recorded in a hardware table sys_mac_table whose size is N, that is, at most N MAC addresses are saved.
  • an entry of sys_mac_table contains a valid flag valid, a MAC address mac_addr, a port value portId, a VLAN value vlanId, and a static/dynamic flag bit static_flag.
  • the CPU can obtain the MAC addresses in sys_mac_table that have certain characteristics, for example the MAC addresses of a given port, of a given VLAN, or of a given port and VLAN combination.
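The entry layout and the three kinds of query (by port, by VLAN, by port and VLAN) can be sketched as a small data structure. This is an illustrative Python model only: the classes `MacEntry` and `MatchCriteria` are hypothetical, the field names are snake_case renderings of the fields listed above, and the real entry is a fixed hardware format (FIG. 5) rather than a Python object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MacEntry:
    """Software model of one sys_mac_table entry."""
    valid: bool
    mac_addr: str
    port_id: int
    vlan_id: int
    static_flag: bool


@dataclass(frozen=True)
class MatchCriteria:
    """Fields the CPU wants to match; None means 'not specified'."""
    port_id: Optional[int] = None
    vlan_id: Optional[int] = None

    def matches(self, entry: MacEntry) -> bool:
        # Invalid entries are never reported.
        if not entry.valid:
            return False
        if self.port_id is not None and entry.port_id != self.port_id:
            return False
        if self.vlan_id is not None and entry.vlan_id != self.vlan_id:
            return False
        # With no field specified, every valid entry matches.
        return True


entry = MacEntry(True, "00:11:22:33:44:55",
                 port_id=1, vlan_id=100, static_flag=False)
print(MatchCriteria(port_id=1).matches(entry))
print(MatchCriteria(port_id=1, vlan_id=200).matches(entry))
print(MatchCriteria().matches(entry))
```

Using `None` for an unspecified field keeps a single comparison routine for all three query types, at the cost of not being able to match on a literal "no value".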
  • a loopback queue is configured on the line card, through which a packet that leaves the microcode module in the network processor chip returns to the microcode module, as shown in FIG.
  • a hardware counter mac_cache_counter for counting matching entries and a hardware table mac_cache_table for caching matching entries are defined.
  • the entry format of the cache table mac_cache_table is the same as that of the MAC address table sys_mac_table, as shown in FIG. 5.
  • the microcode module stores a plurality of microcode programs, including but not limited to the microcode programs ecm_process and mac_iter_process.
  • the microcode program ecm_process processes CPU messages sent by the CPU, and the microcode program mac_iter_process processes messages that return to the microcode module through the loopback queue.
  • the microcode module selects the corresponding microcode program according to the source of the message.
  • the line card CPU obtains all MAC addresses on port 1 of a line card.
  • the line card CPU sends a CPU message for obtaining a MAC address to the network processor chip.
  • the format of the CPU message header is defined as two parts, TYPE and VALUE.
  • the CPU message that obtains the MAC address first enters the microcode module of the network processor chip for processing, and the corresponding microcode program is ecm_process.
  • ecm_process first determines that the message type is MAC_GET_PKT, and then jumps to the corresponding sub-flow seq_get_mac_addr.
  • the sub-flow seq_get_mac_addr configures the replication register of the packet so that the number of copies is equal to N, and the destination queue of the configured packet is the loopback queue, so that the packet can be returned to the microcode module after leaving the microcode module.
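The dispatch from ecm_process to seq_get_mac_addr can be pictured with a short software model. The sketch below is an assumption-laden illustration, not the microcode itself: the `CpuMessage` class, the numeric value of `MAC_GET_PKT`, the queue name `LOOPBACK_QUEUE`, and the table size `TABLE_SIZE_N` are all hypothetical, and a copy register value of 1 is taken to mean that the message is not replicated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

MAC_GET_PKT = 1               # hypothetical numeric code for the message type
LOOPBACK_QUEUE = "loopback"   # hypothetical name of the loopback queue
TABLE_SIZE_N = 8              # illustrative value of N


@dataclass
class CpuMessage:
    msg_type: int                      # TYPE part of the header
    value: Dict[str, int]              # VALUE part of the header
    copy_register: int = 1             # 1 means "do not replicate"
    dest_queue: Optional[str] = None   # where the message goes next


def seq_get_mac_addr(msg: CpuMessage) -> None:
    """Sub-flow: request N copies and route them back through loopback."""
    msg.copy_register = TABLE_SIZE_N
    msg.dest_queue = LOOPBACK_QUEUE


def ecm_process(msg: CpuMessage) -> CpuMessage:
    """Entry point for CPU messages: dispatch on the message type."""
    handlers: Dict[int, Callable[[CpuMessage], None]] = {
        MAC_GET_PKT: seq_get_mac_addr,
    }
    handler = handlers.get(msg.msg_type)
    if handler is not None:
        handler(msg)
    return msg


msg = ecm_process(CpuMessage(MAC_GET_PKT, {"port_id": 1}))
print(msg.copy_register, msg.dest_queue)
```

A table of handlers keyed by message type makes it straightforward to add further CPU message types without touching the dispatch logic.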
  • because the copy register value of the CPU message is greater than 1, the message is copied N times in the logical multicast module, and each copy carries a copy sequence number, which takes the values 0 to N-1 in order.
  • the copy sequence number is saved in a specific register.
  • the subsequent process adds 1 to the counter mac_cache_counter and obtains the returned count value C, writes entry K into the entry of the cache table mac_cache_table whose index is equal to C-1, finally configures the message to be discarded, and the program ends.
  • the counter mac_cache_counter is used to count the number of matching entries.
  • the CPU software should clear mac_cache_counter to zero before each MAC address acquisition. Because adding 1 to the hardware counter and returning the count value is an indivisible atomic operation, when the microcode program mac_iter_process is executed many times and performs this add-1-and-return operation on mac_cache_counter, the count value C returned by each operation correctly reflects the current count. Writes to the cache table mac_cache_table therefore do not interfere with one another, and all M matching entries are correctly written to the first M items of mac_cache_table.
  • the CPU can directly read the counter mac_cache_counter to obtain the total number M of matching MAC addresses, and read the first M items of mac_cache_table.
  • the waiting time of the CPU can be determined experimentally. The waiting time should be such that when the total number of MAC addresses reaches the maximum number of MAC addresses supported by the line card, that is, the table size N of the MAC address table sys_mac_table, all copied messages can be processed by the microcode program mac_iter_process.
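Seen from the CPU, one acquisition is: clear the counter, send the request, wait, read the counter, then read that many cache entries. The following Python sketch models this sequence under assumptions: `FakeLineCard` and `cpu_get_macs` are hypothetical names, the fake card performs the whole replicate-and-match pass synchronously inside `send_mac_get`, and entries are plain dicts rather than the hardware format.

```python
import time
from typing import Dict, List, Optional

Entry = Dict[str, int]


class FakeLineCard:
    """Hypothetical stand-in for the network processor side."""

    def __init__(self, sys_mac_table: List[Entry]) -> None:
        self.sys_mac_table = sys_mac_table
        self.mac_cache_counter = 0
        self.mac_cache_table: List[Optional[Entry]] = (
            [None] * len(sys_mac_table))

    def send_mac_get(self, port_id: int) -> None:
        # Stands in for the replicate-and-match pass done in microcode.
        for entry in self.sys_mac_table:
            if entry["valid"] and entry["port_id"] == port_id:
                self.mac_cache_counter += 1
                self.mac_cache_table[self.mac_cache_counter - 1] = entry


def cpu_get_macs(card: FakeLineCard, port_id: int,
                 wait_seconds: float) -> List[Optional[Entry]]:
    card.mac_cache_counter = 0      # clear the counter before each request
    card.send_mac_get(port_id)      # send the CPU message
    time.sleep(wait_seconds)        # fixed wait, sized for a full table of N
    m = card.mac_cache_counter      # total number M of matching entries
    # Only the first M items of the cache table are read.
    return [card.mac_cache_table[i] for i in range(m)]


card = FakeLineCard([
    {"valid": 1, "port_id": 1, "vlan_id": 10},
    {"valid": 1, "port_id": 2, "vlan_id": 10},
    {"valid": 1, "port_id": 1, "vlan_id": 20},
])
result = cpu_get_macs(card, port_id=1, wait_seconds=0.01)
print(len(result), [e["vlan_id"] for e in result])
```

Because the CPU reads only M cache entries rather than all N table entries, stale items left in the cache table by an earlier request are never returned.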
  • the embodiment of the present invention further provides an apparatus 90 for implementing hardware table traversal based on a network processor, where the apparatus 90 includes:
  • the receiving module 91 is configured to receive a CPU packet, where the CPU packet carries matching information of the hardware entry.
  • the configuration module 92 is configured to configure a copy register according to the total number of hardware entries, and the value of the copy register is the number of copies of the CPU message to be copied, and the value of the copy register is equal to the total number of hardware entries;
  • the copying module 93 is configured to copy the CPU message according to the value of the copy register, and assign a copy sequence number to each copied CPU message, and the copy sequence number is a sequence number when the CPU message is copied;
  • the determining module 94 is configured to compare the hardware entries in the hardware table with the copied CPU messages one by one, and determine whether the information of the hardware entries matches the matching information of the CPU packets;
  • the update module 95 is configured to: when the information of the hardware entry matches the matching information of the CPU message, store the matched hardware entry into the pre-configured cache table, and update the value of the pre-configured counter.
  • the device 90 further includes:
  • the obtaining module is set to obtain the saved hardware entry from the cache table according to the value of the counter.
  • the device 90 further includes:
  • the configuration module 92 includes:
  • a detecting unit configured to detect type information of a CPU message
  • the startup unit is set to configure the copy register based on the type information.
  • the determining module 94 includes:
  • the first unit is configured to obtain, according to the copy sequence number of the copied CPU message, a hardware entry whose hardware entry index number is equal to the copy sequence number;
  • the second unit is configured to determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
  • the functions of the receiving module 91, the configuration module 92, the copying module 93, the determining module 94, and the updating module 95 can be implemented by a receiving module, a microcode module, and a logical multicast module of the network processor.
  • the method and apparatus for implementing hardware table traversal based on a network processor have the following beneficial effect: traversing the hardware table with the microcode module and the logical multicast module of the network processor greatly shortens the traversal time, thereby effectively reducing the average response time of large-capacity hardware table traversal and batch reporting of matching hardware entries in the network processor.

Abstract

The present invention provides a method and device for implementing hardware table traversal based on a network processor. The method comprises: receiving a CPU packet that carries matching information for hardware table entries; configuring a copy register according to the total number of hardware table entries, the value of the copy register being the number of copies of the CPU packet to be made and being equal to the total number of hardware table entries; copying the CPU packet according to the value of the copy register and assigning a copy sequence number to each copy, the copy sequence number being the ordinal of that copy; comparing the hardware table entries in the hardware table with the copied CPU packets one by one and determining whether the information of each hardware table entry matches the matching information of the CPU packet; and, if they match, storing the matching hardware table entry in a pre-configured cache table and updating the value of a pre-configured counter. The method can effectively reduce the average response time of traversing a large-capacity hardware table and batch reporting the matching hardware table entries in the network processor.

Description

Method and device for implementing hardware table traversal based on a network processor
Technical field
The present invention relates to the field of communications technologies, and in particular to a method and device for implementing hardware table traversal based on a network processor.
Background
A passive optical network (PON) is a specific type of optical communication network that contains no electronic devices or electronic power supplies, and consists of three parts: the optical line terminal (OLT), optical network units (ONUs), and the optical distribution network (ODN). The OLT device is located at the central control station, and multiple ONUs are installed at user sites. The ODN between the OLT and the ONUs contains optical fiber and passive splitters or couplers, but no active electronic equipment. A PON therefore has advantages such as no power-supply burden, strong environmental adaptability, and immunity to electromagnetic and lightning interference. Using PON technology to build access networks also saves equipment room construction costs and reduces operation and maintenance costs. With the emergence of Ethernet passive optical network (EPON) technology, the EPON passive optical access network has become one of the main choices for building next-generation optical access networks.
A network processor is a programmable device designed specifically for the tasks of a communication network, such as message processing, protocol analysis, and route lookup. Its distinctive hardware design, with flexible programming and efficient message processing, has led to wide use in OLT devices. Many applications in OLT devices require large-capacity hardware tables, such as a hardware MAC address table. When the central processing unit (CPU) needs to obtain all entries in a large-capacity hardware table that have certain characteristics, how to traverse the table quickly and report the matching entries in a batch becomes one of the technical problems in the development of OLT device drivers.
If traversal relies on the CPU reading the large-capacity hardware table in the chip entry by entry, the CPU and the chip must exchange a large number of table-read messages, which is very inefficient. To traverse a large-capacity hardware table quickly, a dedicated hardware search engine can be designed to traverse the hardware table directly under CPU control, but this approach places higher demands on the chip hardware and increases development cost.
Summary of the invention
The embodiments of the present invention provide a method and device for implementing hardware table traversal based on a network processor, which can effectively reduce the average response time of large-capacity hardware table traversal and batch reporting of matching hardware entries in the network processor.
To achieve the above objective, an embodiment of the present invention provides a method for implementing hardware table traversal based on a network processor, where the method includes:
receiving a CPU message, where the CPU message carries matching information for hardware entries;
configuring a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and is equal to the total number of hardware entries;
copying the CPU message according to the value of the copy register, and assigning each copy a copy sequence number, which is the ordinal of that copy;
comparing the hardware entries in the hardware table with the copied CPU messages one by one, and determining whether the information of each hardware entry matches the matching information of the CPU message;
if they match, storing the matching hardware entry in a pre-configured cache table and updating the value of a pre-configured counter.
After the step of storing the matching hardware entry in the pre-configured cache table and updating the value of the pre-configured counter, the method further includes:
obtaining the saved hardware entries from the cache table according to the value of the counter.
The step of configuring the copy register according to the total number of hardware entries includes:
detecting the type information of the CPU message;
configuring the copy register according to the type information.
The step of comparing the hardware entries in the hardware table with the copied CPU messages one by one and determining whether the information of each hardware entry matches the matching information of the CPU message includes:
obtaining, according to the copy sequence number of a copied CPU message, the hardware entry whose index is equal to that copy sequence number;
determining whether the matching information of the CPU message matches the information of the obtained hardware entry.
After the step of storing the matching hardware entry in the pre-configured cache table and updating the value of the pre-configured counter, the method further includes:
discarding the CPU message corresponding to the copy sequence number.
An embodiment of the present invention further provides an apparatus for implementing hardware table traversal based on a network processor, the apparatus including:
A receiving module, configured to receive a CPU message, where the CPU message carries matching information for hardware entries;
A configuration module, configured to configure a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and equals the total number of hardware entries;
A copy module, configured to copy the CPU message according to the value of the copy register and to assign a copy sequence number to each copied CPU message, where the copy sequence number is the sequence number of the copy in the copying order;
A determining module, configured to compare the hardware entries in the hardware table one by one with the copied CPU messages and to determine whether the information of a hardware entry matches the matching information of the CPU message;
An update module, configured to, when the information of a hardware entry matches the matching information of the CPU message, store the matching hardware entry in a pre-configured cache table and update the value of a pre-configured counter.
The apparatus further includes:
An obtaining module, configured to obtain the saved hardware entries from the cache table according to the value of the counter.
The configuration module includes:
A detecting unit, configured to detect type information of the CPU message;
A startup unit, configured to configure the copy register according to the type information.
The determining module includes:
A first unit, configured to obtain, according to the copy sequence number of a copied CPU message, the hardware entry whose entry index number equals that copy sequence number;
A second unit, configured to determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
The apparatus further includes:
A discarding module, configured to discard the CPU message corresponding to the copy sequence number.
The beneficial effects of the above embodiments of the present invention are as follows:
In the method for implementing hardware table traversal based on a network processor according to the embodiments of the present invention, the hardware table is traversed by the microcode module and the logical multicast module of the network processor. This greatly shortens the time needed to traverse the hardware table, and thereby effectively reduces the average response time for traversing a large-capacity hardware table in the network processor and reporting the matching hardware entries in a batch.
DRAWINGS
FIG. 1 is a flowchart of the steps of a method for implementing hardware table traversal based on a network processor according to an embodiment of the present invention;
FIG. 2 is a flowchart of the specific steps of step 12 in FIG. 1 according to an embodiment of the present invention;
FIG. 3 is a flowchart of the specific steps of step 14 in FIG. 1 according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a line card and a network processor according to an embodiment of the present invention;
FIG. 5 shows the entry format of sys_mac_table and mac_cache_table according to an embodiment of the present invention;
FIG. 6 shows the format of the header of a CPU message according to an embodiment of the present invention;
FIG. 7 shows the format of the header of a CPU message of type MAC_GET_PKT according to an embodiment of the present invention;
FIG. 8 is a flowchart of the microcode program mac_iter_process according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an apparatus for implementing hardware table traversal based on a network processor according to an embodiment of the present invention.
DETAILED DESCRIPTION
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
In the prior art, the central processing unit (CPU) traverses a large-capacity hardware table directly, which makes the response time too long. To address this problem, the present invention provides a method and an apparatus for implementing hardware table traversal based on a network processor, which can effectively reduce the average response time for traversing a large-capacity hardware table in the network processor and reporting the matching hardware entries in a batch.
As shown in FIG. 1, an embodiment of the present invention provides a method for implementing hardware table traversal based on a network processor, the method including:
Step 11: receive a CPU message, where the CPU message carries matching information for hardware entries;
Step 12: configure a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and equals the total number of hardware entries;
Step 13: copy the CPU message according to the value of the copy register, and assign a copy sequence number to each copied CPU message, where the copy sequence number is the sequence number of the copy in the copying order;
Step 14: compare the hardware entries in the hardware table one by one with the copied CPU messages, and determine whether the information of a hardware entry matches the matching information of the CPU message;
Step 15: if they match, store the matching hardware entry in a pre-configured cache table and update the value of a pre-configured counter.
Optionally, in the above embodiment of the present invention, after step 15 is performed, the method further includes: obtaining the saved hardware entries from the cache table according to the value of the counter.
Optionally, in the above embodiment of the present invention, after step 15 is performed, the method further includes: discarding the CPU message corresponding to the copy sequence number.
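The five steps can be modelled in ordinary software to make the data flow concrete. The following Python sketch is an illustration only: on a real network processor, steps 12 to 15 are carried out by microcode and a hardware copying unit, and the function name `traverse_hardware_table`, the dictionary representation of an entry, and the `matches` callback are assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, int]  # hypothetical software stand-in for a hardware entry


def traverse_hardware_table(
    hardware_table: List[Entry],
    matches: Callable[[Entry], bool],
) -> Tuple[int, List[Entry]]:
    """Software model of steps 11-15; returns (counter, cache_table)."""
    # Step 12: the copy register holds the number of copies, which equals
    # the total number of hardware entries.
    copy_register = len(hardware_table)

    # Step 13: copy the CPU message; each copy receives a copy sequence
    # number 0..N-1 in copying order.
    copy_sequence_numbers = range(copy_register)

    counter = 0
    cache_table: List[Entry] = []
    for seq in copy_sequence_numbers:
        # Step 14: each copy checks the entry whose index equals its
        # copy sequence number.
        entry = hardware_table[seq]
        if matches(entry):
            # Step 15: store the matching entry and update the counter.
            cache_table.append(entry)
            counter += 1
        # The copy itself is discarded whether or not the entry matched.
    return counter, cache_table
```

The loop here is sequential only because it is a model; in the described method each copy is handled by a separate run of the microcode program.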
In a specific embodiment of the present invention, the logical multicast module and the microcode module of the network processor are used to traverse a large-capacity hardware table quickly, and the matching entries are cached so that the CPU can read them directly. When a hardware table needs to be traversed, the CPU sends a CPU message to the network processor, and the header of the CPU message carries the matching feature for entries of that hardware table. The matching feature means that one or more fields of an entry equal values specified by the CPU software. If the CPU software specifies no value for any field of the hardware entry, all valid entries in the hardware table are requested.
The CPU message then enters the microcode module of the network processor, where the corresponding microcode program configures the copy register. The value of the copy register is the number of copies of the message to be made, and should equal the total number of entries N in the hardware table. The microcode module is the message processing module common to general network processors and can store multiple microcode programs; through configuration, messages from different sources can be mapped to different microcode programs. When a message enters the microcode module, the corresponding microcode program starts processing it. The main task of the microcode program corresponding to the CPU message is to configure the copy register of that message.
The CPU message is then sent to the logical multicast module for copying. Here the logical multicast module denotes the hardware unit that implements message copying in a general network processor. Any message whose copy register value is greater than 1 is copied in the logical multicast module, and each copy is assigned a copy sequence number, which is the sequence number of that copy in the copying order and is saved in a specific register. The CPU message is thus copied into N copies, each corresponding to one copy sequence number between 0 and N-1.
After the logical multicast module has copied the CPU message, each copy returns to the microcode module. The corresponding microcode program reads the entry of the hardware table whose entry index number equals the copy sequence number of that copy (the entry index number is the position of the entry in the hardware table) and determines whether the entry satisfies the matching feature in the message header. If it does, the counter used to count matching entries is incremented by 1 and the entry is written into the cache table used to cache matching entries; if it does not, no processing is done. Finally, whether or not the entry matches, the microcode program discards the copy.
In a specific embodiment of the present invention, each copy returns to the microcode module carrying a copy sequence number, and the corresponding microcode program is executed once per copy. Each execution reads and checks a different entry of the hardware table, with entry index numbers equal to the copy sequence numbers 0 to N-1. After the microcode program has executed N times, the result is equivalent to a traversal lookup of the hardware table. Because the length of a microcode program may be limited by hardware resources, reading all hardware entries sequentially within one microcode program is generally not feasible; the embodiment therefore executes the same microcode program many times, reading only one entry per execution. After the network processor has traversed the hardware table, the CPU, having waited for a certain time, directly reads the value of the counter and then reads the matching entries from the cache table according to that value, obtaining the matching entries in a batch. For example, if the counter value is M, the CPU reads the first M entries of the cache table, which are all of the matching entries. The CPU's waiting time should be determined from the capacity of the hardware table and the processing speed of the network processor, and can also be determined experimentally.
In a specific embodiment of the present invention, when the microcode program reads a matching entry it does not report it to the CPU immediately. Instead, matches are counted by the counter and stored in the cache table so that they can be reported together, which reduces CPU processing and makes it convenient for the CPU to obtain matching entries in a batch. The initial value of the counter is 0, and the CPU can clear it. The cache table index at which a matching entry is stored is determined by the current count value of the counter. In general, incrementing a hardware counter by 1 and returning the count value is an indivisible atomic operation, which prevents the match counter from being corrupted when many executions of the microcode program operate on it in rapid succession. In addition, the loop by which the CPU message leaves the microcode module and returns to it can be configured when the network processor is initialized.
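The role of the atomic add-and-return can be illustrated in software. The sketch below is an assumption-laden model, not the hardware mechanism: a `threading.Lock` stands in for the indivisibility of the hardware counter operation, threads stand in for concurrent executions of the microcode program, and the names `AtomicCounter`, `store_match`, and `run_concurrent_matches` are hypothetical. The slot index C-1 follows the index rule given later in this description for mac_cache_table.

```python
import threading
from typing import List, Optional, Tuple


class AtomicCounter:
    """Software stand-in for a hardware counter with atomic add-and-return."""

    def __init__(self) -> None:
        self._value = 0  # initial value is 0, as in the description
        self._lock = threading.Lock()

    def clear(self) -> None:
        with self._lock:
            self._value = 0

    def increment_and_get(self) -> int:
        # The lock makes "add 1 and return the new value" indivisible.
        with self._lock:
            self._value += 1
            return self._value

    def read(self) -> int:
        with self._lock:
            return self._value


def store_match(counter: AtomicCounter,
                cache_table: List[Optional[int]],
                entry: int) -> None:
    """Write a matching entry into the slot selected by the returned count."""
    c = counter.increment_and_get()
    cache_table[c - 1] = entry  # every caller receives a distinct C


def run_concurrent_matches(num_matches: int,
                           table_size: int) -> Tuple[int, List[Optional[int]]]:
    """Store num_matches entries from concurrent threads."""
    counter = AtomicCounter()
    cache_table: List[Optional[int]] = [None] * table_size
    threads = [
        threading.Thread(target=store_match, args=(counter, cache_table, i))
        for i in range(num_matches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.read(), cache_table
```

Because each call obtains a unique count, the M matching entries occupy exactly the first M slots, in whatever order the executions happened to run.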
In a specific embodiment of the present invention, the microcode module processes messages very efficiently, so the processing time of the whole procedure is far less than the response time of the CPU directly traversing a large-capacity hardware table. The time the CPU spends reading the cache table is proportional only to the actual number of matching entries. In most cases the actual number of matching entries is much smaller than the total number of entries in the hardware table, so the response time of the method of this embodiment is much shorter than that of the CPU directly traversing the hardware table.
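As a rough illustration only, the comparison can be written as a simple cost model. The symbols are assumptions introduced here and do not appear in the source: t_cpu is the time of one CPU access to a table entry or counter, T_wait is the time the CPU waits while the network processor works, N is the table size, and M is the number of matching entries. The source gives no measured figures.

```latex
T_{\text{direct}} \approx N \, t_{\text{cpu}}, \qquad
T_{\text{proposed}} \approx T_{\text{wait}} + (1 + M) \, t_{\text{cpu}}, \qquad
0 \le M \le N
```

The term (1 + M) counts one read of the counter plus M reads of the cache table. Under this model the proposed method is faster whenever T_wait < (N - M - 1) t_cpu, which is the typical case described above, where M is much smaller than N and the microcode processing is fast.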
As shown in FIG. 2, in the above embodiment of the present invention, the specific steps of step 12 are:
Step 21: detect type information of the CPU message;
Step 22: configure the copy register according to the type information.
As shown in FIG. 3, in the above embodiment of the present invention, the specific steps of step 14 are:
Step 31: obtain, according to the copy sequence number of a copied CPU message, the hardware entry whose entry index number equals that copy sequence number;
Step 32: determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
In a specific embodiment of the present invention, each copy returns to the microcode module carrying a copy sequence number, and the corresponding microcode program is executed once per copy; each execution reads and checks a different entry of the hardware table, whose entry index number equals the copy sequence number of that copy. A specific implementation follows:
In an EPON, an OLT includes a main control card, uplink cards, and line cards. Consider a line card implemented with a network processor. As shown in FIG. 4, the line card forwards uplink and downlink messages and learns their MAC addresses. The current MAC address information is recorded in a hardware table sys_mac_table of size N, that is, at most N MAC addresses are stored. As shown in FIG. 5, an entry of sys_mac_table contains a valid flag bit valid, a MAC address mac_addr, a port value portId, a VLAN value vlanId, and a static/dynamic flag bit static_flag. The CPU may request the MAC addresses in sys_mac_table that match a given feature, for example all MAC addresses on a certain port, on a certain VLAN, or on a certain port plus a certain VLAN.
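For readers who prefer code to a field list, the entry layout can be sketched as a small Python data class. Only the five field names come from the description; the field types, the textual MAC address form, the meaning assigned to the two values of `static_flag`, and the helper `new_mac_table` are assumptions, since the source does not give field widths or encodings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MacEntry:
    """One entry of sys_mac_table (field names as listed for FIG. 5)."""
    valid: bool = False                   # valid flag bit
    mac_addr: str = "00:00:00:00:00:00"   # textual form is an assumption
    portId: int = 0                       # port value
    vlanId: int = 0                       # VLAN value
    static_flag: bool = False             # assumed: True = static, False = dynamic


def new_mac_table(n: int) -> List[MacEntry]:
    """Create a table of n entries, all initially invalid."""
    return [MacEntry() for _ in range(n)]
```

A learned address would then be represented as, for example, `MacEntry(valid=True, mac_addr="00:11:22:33:44:55", portId=1, vlanId=100)`.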
To implement the technical solution of the above embodiment, when the line card is initialized a loopback queue is configured in the network processor chip, through which a message leaves the microcode module and then returns to it, as shown in FIG. 4. A hardware counter mac_cache_counter for counting matching entries and a hardware table mac_cache_table for caching matching entries are also defined in the network processor chip. The entry format of the cache table mac_cache_table is the same as that of the MAC address table sys_mac_table, as shown in FIG. 5. The microcode module stores multiple microcode programs, including but not limited to ecm_process and mac_iter_process. The microcode program ecm_process handles CPU messages sent by the CPU, and the microcode program mac_iter_process handles messages that return to the microcode module through the loopback queue. The microcode module selects the microcode program according to the source of the message.
Consider the case in which the line card CPU obtains all MAC addresses on a port, port1, of the line card. As shown in FIG. 4, the line card CPU sends a CPU message for obtaining MAC addresses to the network processor chip. As shown in FIG. 6, the header of a CPU message consists of two parts, TYPE and VALUE. As shown in FIG. 7, for a CPU message that obtains MAC addresses, the TYPE part is defined as MAC_GET_PKT and the VALUE part is defined as {port_match=1, portId=port1, vlan_match=0, vlanId=0}.
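The header described above can be sketched as follows. The TYPE name MAC_GET_PKT and the four VALUE fields come from the description; the class names, the builder function `build_mac_get_header`, the string encoding of TYPE, and the convention that an omitted field is sent as 0 with its match flag cleared are assumptions for illustration (the last one mirrors the example value vlan_match=0, vlanId=0).

```python
from dataclasses import dataclass
from typing import Optional

MAC_GET_PKT = "MAC_GET_PKT"  # TYPE value named in the text; encoding assumed


@dataclass(frozen=True)
class MacGetValue:
    """VALUE part of a MAC_GET_PKT header."""
    port_match: int
    portId: int
    vlan_match: int
    vlanId: int


@dataclass(frozen=True)
class CpuMessageHeader:
    """CPU message header with its TYPE and VALUE parts."""
    type: str
    value: MacGetValue


def build_mac_get_header(port_id: Optional[int] = None,
                         vlan_id: Optional[int] = None) -> CpuMessageHeader:
    """Build a MAC_GET_PKT header; an omitted field is not matched on."""
    value = MacGetValue(
        port_match=0 if port_id is None else 1,
        portId=0 if port_id is None else port_id,
        vlan_match=0 if vlan_id is None else 1,
        vlanId=0 if vlan_id is None else vlan_id,
    )
    return CpuMessageHeader(type=MAC_GET_PKT, value=value)
```

With port1 represented by the integer 1, `build_mac_get_header(port_id=1)` yields the VALUE shown in the example above.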
As shown in FIG. 4, the CPU message for obtaining MAC addresses first enters the microcode module of the network processor chip, where it is handled by the microcode program ecm_process. ecm_process first determines that the message type is MAC_GET_PKT and then jumps to the corresponding sub-flow seq_get_mac_addr. The sub-flow seq_get_mac_addr configures the copy register of the message so that the number of copies equals N, and sets the destination queue of the message to the loopback queue described above, so that the message returns to the microcode module after leaving it.
As shown in FIG. 4, the copy register value of the CPU message, that is, the number of copies, is greater than 1, so the message is copied into N copies in the logical multicast module. Each copy carries a copy sequence number, taking the values 0 to N-1 in turn, which is saved in a specific register. The copies return to the microcode module through the loopback queue, where they are handled by the microcode program mac_iter_process.
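The first pass through the microcode module and the copying stage can be modelled together. In this hedged Python sketch a message is a plain dictionary; the keys `copy_register`, `dest_queue`, and `copy_seq`, the constant `LOOPBACK_QUEUE`, and the function `logical_multicast` are hypothetical stand-ins for per-message registers and hardware behaviour that the source describes only in prose. The names ecm_process and seq_get_mac_addr are taken from the description.

```python
from typing import Dict, List

LOOPBACK_QUEUE = "loopback"  # hypothetical identifier for the loopback queue


def seq_get_mac_addr(message: Dict, table_size: int) -> None:
    """Sub-flow: set the number of copies to N and route to the loopback queue."""
    message["copy_register"] = table_size
    message["dest_queue"] = LOOPBACK_QUEUE


def ecm_process(message: Dict, table_size: int) -> Dict:
    """Handle a message arriving from the CPU, dispatching on its TYPE."""
    if message["TYPE"] == "MAC_GET_PKT":
        seq_get_mac_addr(message, table_size)
    return message


def logical_multicast(message: Dict) -> List[Dict]:
    """Copy a message whose copy register exceeds 1, numbering the copies."""
    copies = message.get("copy_register", 1)
    if copies <= 1:
        return [message]  # nothing to copy
    # Each copy carries its copy sequence number, 0..N-1 in turn.
    return [dict(message, copy_seq=k) for k in range(copies)]
```

For a table of size 4, a MAC_GET_PKT message passed through `ecm_process` and then `logical_multicast` yields four copies with `copy_seq` values 0, 1, 2, and 3.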
As shown in FIG. 8, the microcode program mac_iter_process first reads the specific register to obtain the copy sequence number k of the message, and then reads the entry K of the MAC address table sys_mac_table whose index equals k. If entry K is invalid, the message is configured to be discarded and the program ends. If entry K is valid, then because port_match=1 in the message header, portId must match, so the program checks whether the portId of entry K equals the portId in the message header (= port1). If the two portId values differ, the message is configured to be discarded and the program ends. If they are equal, then because vlan_match=0 in the message header, vlanId need not match, so the program proceeds: it increments the counter mac_cache_counter by 1 and obtains the returned count value C, writes entry K into the entry of the cache table mac_cache_table whose index equals C-1, and finally configures the message to be discarded, after which the program ends.
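The flow of FIG. 8 as described above can be sketched in Python. This is a single-threaded model under stated assumptions: entries and the header VALUE are dictionaries, the `state` dictionary stands in for the hardware counter and cache table, and the plain `+= 1` replaces the atomic add-and-return of the real counter. The VLAN check generalises the described port-only case to headers with vlan_match=1, which the description says follow the same flow.

```python
from typing import Dict, List

Entry = Dict[str, int]


def mac_iter_process(k: int,
                     value: Dict[str, int],
                     sys_mac_table: List[Entry],
                     state: Dict) -> bool:
    """Handle one copy with copy sequence number k.

    Returns True if entry k matched and was cached. In every case the
    copy itself is discarded afterwards, which this model leaves implicit.
    """
    entry = sys_mac_table[k]          # entry K, whose index equals k
    if not entry["valid"]:
        return False                  # invalid entry: discard the copy
    if value["port_match"] == 1 and entry["portId"] != value["portId"]:
        return False                  # port required but different
    if value["vlan_match"] == 1 and entry["vlanId"] != value["vlanId"]:
        return False                  # VLAN required but different
    # Match: add 1 to the counter, take the returned count C, and write
    # the entry into the cache table at index C-1.
    state["mac_cache_counter"] += 1
    c = state["mac_cache_counter"]
    state["mac_cache_table"][c - 1] = entry
    return True
```

Running this function once for every k from 0 to N-1 corresponds to the N executions of the microcode program, one per copy.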
The counter mac_cache_counter counts the number of matching entries; the CPU software should clear mac_cache_counter before each request for MAC addresses. Because incrementing the hardware counter by 1 and returning the count value is an indivisible atomic operation, when the microcode program mac_iter_process executes many times and performs increment-and-return operations on mac_cache_counter in rapid succession, the count value C returned by each operation correctly reflects the current count. Writes to the cache table mac_cache_table therefore do not collide, and all M matching entries are written correctly into the first M entries of mac_cache_table.
After waiting until all copies have been processed, the CPU can directly read the counter mac_cache_counter to obtain the total number M of matching MAC addresses, and then read and report the first M entries of mac_cache_table. In a specific implementation the CPU's waiting time can be determined experimentally. The waiting time should ensure that, even when the current number of MAC addresses reaches the maximum supported by the line card, that is, the size N of the MAC address table sys_mac_table, all copies have been processed by the microcode program mac_iter_process.
The sequence that starts with the CPU software clearing the counter mac_cache_counter and sending the CPU message for obtaining MAC addresses, and ends with the CPU having read all matching MAC address information, must not be interrupted; the corresponding code section of the CPU software should be protected by a semaphore or a similar mechanism to guarantee mutual exclusion. This prevents race conditions on the counter mac_cache_counter and the cache table mac_cache_table when the CPU performs several MAC address requests in succession. Only the case of obtaining all MAC addresses on a specified port of the line card has been discussed here. The same processing flow also applies to obtaining all MAC addresses on a specified VLAN of the line card, or on a specified port plus a specified VLAN, and is not described again.
To better achieve the above objective, as shown in FIG. 9, an embodiment of the present invention further provides an apparatus 90 for implementing hardware table traversal based on a network processor, the apparatus 90 including:
A receiving module 91, configured to receive a CPU message, where the CPU message carries matching information for hardware entries;
A configuration module 92, configured to configure a copy register according to the total number of hardware entries, where the value of the copy register is the number of copies of the CPU message to be made and equals the total number of hardware entries;
A copy module 93, configured to copy the CPU message according to the value of the copy register and to assign a copy sequence number to each copied CPU message, where the copy sequence number is the sequence number of the copy in the copying order;
A determining module 94, configured to compare the hardware entries in the hardware table one by one with the copied CPU messages and to determine whether the information of a hardware entry matches the matching information of the CPU message;
An update module 95, configured to, when the information of a hardware entry matches the matching information of the CPU message, store the matching hardware entry in a pre-configured cache table and update the value of a pre-configured counter.
The apparatus 90 further includes:
An obtaining module, configured to obtain the saved hardware entries from the cache table according to the value of the counter.
The apparatus 90 further includes:
A discarding module, configured to discard the CPU message corresponding to the copy sequence number.
The configuration module 92 includes:
A detecting unit, configured to detect type information of the CPU message;
A startup unit, configured to configure the copy register according to the type information.
The determining module 94 includes:
A first unit, configured to obtain, according to the copy sequence number of a copied CPU message, the hardware entry whose entry index number equals that copy sequence number;
A second unit, configured to determine whether the matching information of the CPU message matches the information of the obtained hardware entry.
In a specific embodiment of the present invention, the functions of the receiving module 91, the configuration module 92, the copy module 93, the determining module 94, and the update module 95 can all be implemented by the receiving module, the microcode module, and the logical multicast module of the network processor.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.
Industrial Applicability
As described above, the method and apparatus for implementing hardware table traversal based on a network processor provided by the embodiments of the present invention have the following beneficial effect: because the hardware table is traversed by the microcode module and the logical multicast module of the network processor, the time needed to traverse the hardware table is greatly shortened, which effectively reduces the average response time for traversing a large-capacity hardware table in the network processor and reporting the matching hardware entries in a batch.

Claims (10)

  1. 一种基于网络处理器实现硬件表遍历的方法,包括:A method for implementing hardware table traversal based on a network processor, comprising:
    接收CPU报文,所述CPU报文携带硬件表项的匹配信息;Receiving a CPU message, where the CPU message carries matching information of a hardware entry;
    根据所述硬件表项的总数,配置复制寄存器,所述复制寄存器的数值为需要复制所述CPU报文的份数,所述复制寄存器的数值等于硬件表项总数;And configuring, according to the total number of the hardware entries, a copy register, where the value of the copy register is a number of copies of the CPU message, and the value of the copy register is equal to the total number of hardware entries;
    根据所述复制寄存器的数值,复制所述CPU报文,并给复制的每个CPU报文分配一个复制序号,所述复制序号为所述CPU报文复制时的顺序号;Copying, according to the value of the copy register, the CPU message, and assigning a copy sequence number to each copied CPU message, where the copy sequence number is a sequence number when the CPU message is copied;
    逐一将硬件表中的硬件表项与复制后的CPU报文进行比对,判断硬件表项的信息与所述CPU报文的匹配信息是否匹配;Comparing the hardware entries in the hardware table with the copied CPU packets one by one, and determining whether the information of the hardware entries matches the matching information of the CPU packets;
    若匹配,则将匹配的硬件表项存储至预先配置的缓存表中,并更新预先配置的计数器的值。If there is a match, the matching hardware entry is stored in a pre-configured cache table and the value of the pre-configured counter is updated.
  2. 如权利要求1所述的方法,其中,所述若匹配,则将匹配的硬件表项存储至预先配置的缓存表中,并更新预先配置的计数器的值的步骤之后,所述方法还包括:The method of claim 1, wherein, if the matching, the matching hardware entry is stored in a pre-configured cache table, and the value of the pre-configured counter is updated, the method further comprises:
    根据计数器的值,从所述缓存表中获取保存的硬件表项。The saved hardware entry is obtained from the cache table according to the value of the counter.
  3. 如权利要求1所述的方法,其中,所述根据所述硬件表项的总数,配置复制寄存器的步骤包括:The method of claim 1, wherein the step of configuring the copy register according to the total number of the hardware entries comprises:
    检测所述CPU报文的类型信息;Detecting type information of the CPU packet;
    根据所述类型信息,配置复制寄存器。The copy register is configured based on the type information.
  4. The method of claim 1, wherein the step of comparing the hardware table entries in the hardware table with the copied CPU messages one by one, and determining whether the information of each hardware table entry matches the matching information of the CPU message, comprises:
    obtaining, according to the copy sequence number of a copied CPU message, the hardware table entry whose index number is equal to that copy sequence number;
    determining whether the matching information of the CPU message matches the information of the obtained hardware table entry.
  5. The method of claim 4, wherein, after the step of storing the matching hardware table entry in the pre-configured cache table and updating the value of the pre-configured counter if they match, the method further comprises:
    discarding the CPU message corresponding to the copy sequence number.
  6. A device for implementing hardware table traversal based on a network processor, comprising:
    a receiving module, configured to receive a CPU message, wherein the CPU message carries matching information for hardware table entries;
    a configuration module, configured to configure a copy register according to the total number of hardware table entries, wherein the value of the copy register is the number of copies of the CPU message to be made, and the value of the copy register is equal to the total number of hardware table entries;
    a copy module, configured to copy the CPU message according to the value of the copy register, and to assign a copy sequence number to each copied CPU message, wherein the copy sequence number is the ordinal number of that copy in the copying order;
    a judging module, configured to compare the hardware table entries in the hardware table with the copied CPU messages one by one, and to determine whether the information of each hardware table entry matches the matching information of the CPU message;
    an update module, configured to, when the information of a hardware table entry matches the matching information of the CPU message, store the matching hardware table entry in a pre-configured cache table and update the value of a pre-configured counter.
  7. The device of claim 6, wherein the device further comprises:
    an obtaining module, configured to obtain the stored hardware table entries from the cache table according to the value of the counter.
  8. The device of claim 6, wherein the configuration module comprises:
    a detecting unit, configured to detect type information of the CPU message;
    a starting unit, configured to configure the copy register according to the type information.
  9. The device of claim 6, wherein the judging module comprises:
    a first unit, configured to obtain, according to the copy sequence number of a copied CPU message, the hardware table entry whose index number is equal to that copy sequence number;
    a second unit, configured to determine whether the matching information of the CPU message matches the information of the obtained hardware table entry.
  10. The device of claim 9, wherein the device further comprises:
    a discarding module, configured to discard the CPU message corresponding to the copy sequence number.
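
The traversal described in method claims 1 to 5 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the network processor microcode: the names `CpuMessage`, `Traverser`, `msg_type`, and the `"traverse"` type tag are hypothetical, hardware table entries are modelled as dictionaries, and "matching" is assumed to mean that every field in the message's matching information equals the corresponding field of the entry.

```python
from dataclasses import dataclass, field


@dataclass
class CpuMessage:
    msg_type: str        # hypothetical type tag; "traverse" requests a table traversal
    match_info: dict     # fields that a hardware table entry must match
    copy_seq: int = -1   # copy sequence number, assigned when the message is copied


@dataclass
class Traverser:
    hardware_table: list                      # list index plays the role of the entry index number
    cache_table: list = field(default_factory=list)
    counter: int = 0
    copy_register: int = 0

    def configure(self, msg: CpuMessage) -> None:
        # Claims 1 and 3: detect the message type, then set the copy register
        # to the total number of hardware table entries.
        if msg.msg_type == "traverse":
            self.copy_register = len(self.hardware_table)
        else:
            self.copy_register = 0

    def replicate(self, msg: CpuMessage) -> list:
        # Claim 1: make copy_register copies, each tagged with its sequence number.
        return [CpuMessage(msg.msg_type, msg.match_info, seq)
                for seq in range(self.copy_register)]

    def traverse(self, msg: CpuMessage) -> list:
        self.cache_table.clear()
        self.counter = 0
        self.configure(msg)
        for copy in self.replicate(msg):
            # Claim 4: fetch the entry whose index equals the copy sequence number.
            entry = self.hardware_table[copy.copy_seq]
            if all(entry.get(k) == v for k, v in copy.match_info.items()):
                # Claim 1: store the matching entry and update the counter.
                self.cache_table.append(entry)
                self.counter += 1
            # Claim 5: the copy is discarded here (it simply goes out of scope).
        # Claim 2: read back the stored entries according to the counter value.
        return self.cache_table[:self.counter]


table = [
    {"mac": "aa", "vlan": 10},
    {"mac": "bb", "vlan": 20},
    {"mac": "cc", "vlan": 10},
]
traverser = Traverser(table)
hits = traverser.traverse(CpuMessage("traverse", {"vlan": 10}))
print([e["mac"] for e in hits], traverser.counter)  # prints ['aa', 'cc'] 2
```

In this model the copies are processed in a loop; the point of the claimed scheme is that each copy carries its own sequence number, so each lookup is independent of the others and depends only on that number.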
PCT/CN2015/073808 2014-11-25 2015-03-06 Method and device for realizing hardware table traversal based on network processor WO2016082367A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410687355.2 2014-11-25
CN201410687355.2A CN105700859A (en) 2014-11-25 2014-11-25 Network-processor-based hardware table traversal method and apparatus

Publications (1)

Publication Number Publication Date
WO2016082367A1 (en) 2016-06-02

Family

ID=56073446

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/073808 WO2016082367A1 (en) 2014-11-25 2015-03-06 Method and device for realizing hardware table traversal based on network processor

Country Status (2)

Country Link
CN (1) CN105700859A (en)
WO (1) WO2016082367A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111600811A (en) * 2020-04-14 2020-08-28 新华三信息安全技术有限公司 Message processing method and device

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105262854A (en) * 2015-10-15 2016-01-20 上海斐讯数据通信技术有限公司 Method and device for performing unified management on MAC address table on OLT equipment
CN106776107B (en) * 2016-11-30 2019-07-16 迈普通信技术股份有限公司 A kind of parity error correction method and the network equipment
CN107729053B (en) * 2017-10-17 2020-11-27 安徽皖通邮电股份有限公司 Method for realizing high-speed cache table
CN112637062B (en) * 2020-12-22 2022-05-27 新华三技术有限公司合肥分公司 Hardware forwarding table item synchronization method and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101043442A (en) * 2006-11-17 2007-09-26 神州数码网络(北京)有限公司 Method for realizing URPF on Ethernet switch
US7324547B1 (en) * 2002-12-13 2008-01-29 Nvidia Corporation Internet protocol (IP) router residing in a processor chipset
CN101841473A (en) * 2010-04-09 2010-09-22 北京星网锐捷网络技术有限公司 Method and apparatus for updating MAC (Media Access Control) address table
CN103001878A (en) * 2012-11-26 2013-03-27 中兴通讯股份有限公司 Determination method and device for media access control (MAC) address Hash collision

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100687745B1 (en) * 2005-06-14 2007-02-27 한국전자통신연구원 Network processor for IPv6 source-specific multicast packet forwarding and method therefor
CN101841474A (en) * 2010-04-15 2010-09-22 华为技术有限公司 Device for realizing access control lists
CN102932262B (en) * 2011-08-11 2018-02-16 中兴通讯股份有限公司 Network processing unit mirror image implementing method and network processing unit
CN102831140A (en) * 2012-05-18 2012-12-19 浙江大学 Implement method for MAC (Media Access Control) address lookup tables in FPGA (Field Programmable Gate Array)
CN103973571A (en) * 2013-02-05 2014-08-06 中兴通讯股份有限公司 Network processor and routing searching method
US9419895B2 (en) * 2013-02-25 2016-08-16 Brocade Communications Systems, Inc. Techniques for customizing forwarding decisions via a hardware lookup result
CN104038429B (en) * 2013-03-05 2018-01-30 中兴通讯股份有限公司 A kind of method and device that message multicast is carried out in distributed forwarding equipment

Also Published As

Publication number Publication date
CN105700859A (en) 2016-06-22

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15863937; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15863937; Country of ref document: EP; Kind code of ref document: A1)