WO2013153026A1

WO2013153026A1 - A method and a program for automatic diagnosis of problems in telecommunication networks

Info

Publication number: WO2013153026A1
Application number: PCT/EP2013/057294
Authority: WO
Inventors: Javier GONZÁLEZ ORDÁS; Luis GREGORIO MOYANO; Javier GARCÍA ALGARRA
Original assignee: Telefonica, S.A.
Priority date: 2012-04-12
Filing date: 2013-04-08
Publication date: 2013-10-17
Also published as: ES2436654A2; ES2436654B1; ES2436654R1

Abstract

A method and a program for automatic diagnosis of operative problems of devices and systems of a telecommunication network. The method comprises creating and using a Bayesian network for performing said diagnosis of operative problems in said telecommunication network according to probability values, where said probability values are provided or influenced by expertise complementary information regarding said operating problems and/or telecommunication network behavior, wherein the method comprises providing a logical representation of said telecommunication network under diagnosis and associated problems relevant for the telecommunication network and/or at least one diagnosis scenario, and using said logical representation for inputting said expertise complementary information.

Description

A method and a program for automatic diagnosis of problems in telecommunication networks

Field of the art

The present invention generally relates to network diagnosis, and more particularly to a method to generate and use a diagnosis Bayesian network for automatic network diagnosis of problems in telecommunication networks.

Prior State of the Art

Accurate and efficient network diagnosis is of paramount importance for the correct functioning of any network. Misdiagnosed problems increase the time to restore customer service, degrade the quality of customer experience, cause higher maintenance costs to the network operator and may lead to widespread network failures.

Network diagnosis requires the collection of disparate status and test data from many devices and systems, which are then analyzed to identify deviations from normal conditions that may point to specific problems. This process is frequently not entirely automatic. Expert knowledge plays an important role in diagnosing a given network problem. As a consequence, network operators and managers must access the relevant information regarding the problem to be able to assess, evaluate, and finally arrive at an accurate picture of the situation. It is usually not straightforward to determine the relevant information for the problem, precisely because the problem is not known. To mitigate this difficulty, often much more information than needed is passed to the expert and is left to him or her to detect the relevant piece of data needed to pinpoint the network failure or service downgrade. This approach is prone to cause an extra burden on the expert, who in some cases must navigate through the excess of data to arrive at a conclusion, losing valuable time and resources, and decreasing the efficiency of the diagnosis.

Even if the collection of diagnosis data is automated, or there is a reasoning engine able to produce an automated diagnosis, no such reasoning system is able to achieve 100% success rates, so the expert intervention is still required, and the problem of selecting and presenting the correct subset of relevant data still exists. Furthermore, the definition of the diagnosis rules required by the reasoning engine has its own problems, as it involves the transfer of knowledge from diagnosis experts to reasoning and modelling experts. This is a long process that involves exchange of deeply technical documents, long interview sessions and many iterations until a satisfactory reasoning model is generated. In addition to the high cost and complexity of this process, the resulting model is hard to maintain and evolve by its end users (the diagnosis experts) when the network changes, which reduces the value of the model for the users, and typically leads to a less effective automated reasoning system over time.

In this context, it is relevant to give the expert operator a way to simplify the diagnosis process, without losing granularity over the relevant information needed in each case. The problem to overcome is how to simplify the diagnosis process by translating the expert knowledge unobtrusively into a conceptual model which allows the expert to diagnose a given situation in a given network. To achieve this, the information needed to set up the conceptual model of the network, its problems and their associated observations must be reduced to a minimum. Moreover, it is also possible to provide for a certain degree of automation in the diagnosis, for example, through Bayesian reasoning.

The goal of the present invention is to facilitate knowledge capture from network diagnosis experts and to integrate this information with the aim of improving the accuracy of the network diagnosis. Simplified rules are combined with previously defined knowledge about network communication problems and observations (both tests and status information) to generate a diagnosis Bayesian network and use it for automatic network diagnosis and presentation of relevant information to expert operators.

Network diagnosis requires the ability to retrieve information from multiple sources and highly specialized technical knowledge to properly analyse that information and produce an accurate diagnosis. Both the data collection process and the analysis phase may be assisted or performed by automated systems, but significant investments are needed to keep such systems up to date given the rapid pace of evolution of network technologies and configurations.

In this context, a lot of diagnosis tasks are still conducted manually by diagnosis operators, who use their intuition and expertise to decide which information to collect, where to obtain it, how to combine it and which is the most likely cause of a fault. This is a slow process that involves connecting to multiple systems and devices, via different user interfaces, and manually drawing network diagrams before reaching a conclusion. This is an inefficient and error-prone approach that demands great expertise from operators. Data collection systems reduce the burden for operators by providing a single unified interface for data access, or by automatically retrieving all information related to a given fault (US 5333183). However, these systems typically do not perform a selection of relevant information, which still remains a task for operators, who must either explicitly request the information they are interested in, or navigate the complete data set in search of the most relevant pieces.

Expert systems able to automatically produce a diagnosis are well known in the industry but are not frequently used, because they require substantial upfront investments (for training, configuration and modeling) and extensive maintenance to cope with the network evolution. In practice, a combination of automated and manual diagnosis is the most common setup, where expert systems are in charge of the simplest and most common cases, and diagnosis experts work on complex or unusual cases.

Multiple approaches exist in the literature to build expert systems, which differ in how the diagnosis knowledge is captured and modeled, how reasoning is performed, and what outcomes are presented to users. One of them is the rule based approach: these systems rely on a rules engine to determine the state of network entities (faulty or not) from a set of input information. There are different types of rules engines:

• Forward inference engines deduce new knowledge by applying IF-THEN rules to known information. An example of a diagnosis inference rule would be "IF port A is down and port B is down, THEN the cable between A and B has been cut". Inference engines are run on demand when diagnosis is needed [1].

• Event processing engines react to new events by applying event-specific rules that update their view over the status of the network. Event processing engines are permanently active, listening for new events to diagnose any kind of problem in real time [2].

• Goal driven backward inference engines attempt to reach a goal by backward reasoning over a set of rules, and checking if there is information to reach the goal, or asking for specific information to reach it [3]. They are typically based on IF-THEN rules. In a diagnosis scenario, the goal to reach is to decide which fault is present. Goal driven engines are particularly useful to determine which information should be retrieved next.

Other examples are the semantic systems, which are a variation of rule based systems where the knowledge is expressed with an ontology and reasoning is performed over the ontology [4]. Semantic reasoners are able to infer logical consequences from a set of asserted facts. Semantic systems share many of the characteristics of forward inference engines, but adopt a different knowledge representation approach and reasoning logic, both richer and closer to a human representation.

Rule based and semantic systems are typically deterministic in nature. However, it is possible to include a probabilistic dimension in them (US 2010/0049676) or to apply fuzzy logic [5] to better handle uncertainty.
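By way of a non-limiting illustration, the forward inference rule quoted above ("IF port A is down and port B is down, THEN the cable between A and B has been cut") may be sketched as follows. The fact strings, the second rule, the rule table `RULES` and the helper `forward_chain` are hypothetical names chosen for this sketch; they are not taken from any of the cited systems.

```python
# Hypothetical facts and rules; all names are illustrative only.
RULES = [
    # (facts that must all hold, fact that is then concluded)
    ({"port A down", "port B down"}, "cable A-B cut"),
    ({"cable A-B cut"}, "service unavailable"),
]


def forward_chain(facts, rules):
    """Apply IF-THEN rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires only when every one of its conditions is known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"port A down", "port B down"}, RULES)
partial = forward_chain({"port A down"}, RULES)
```

In this sketch `derived` contains the cable cut conclusion, whereas `partial` contains nothing beyond the single input fact: a deterministic rule produces no conclusion at all when one of its inputs is missing.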

Decision diagrams turn the diagnosis process into a flow diagram where branches are selected depending on the value of retrieved information, and leaves contain the different diagnosis conclusions [6]. Decision diagrams assume all the information required at the decision points is available; otherwise, a conclusion cannot be obtained. Diagnosis operators are quite familiar with decision diagrams, and use them frequently to understand and characterize new diagnosis scenarios and during teaching sessions with new operators.

Case-based reasoning works by comparing the situation under diagnosis with previous diagnosis cases, finding the most similar ones, and producing a conclusion based on the outcome of past similar cases, as stated in WO/2006/097675 'A method of fault diagnostics in a case based reasoning system'. In a diagnosis setting, a historical database is constructed with all information associated to each case, and the actual problem found for each. When a new case arrives, the associated information is collected and a similarity algorithm is used to compare it with the historical data set.

Bayesian network inference estimates the likelihood of failures by applying probability theory to causal dependencies between problems and symptoms. Diagnosis knowledge must be codified into a Bayesian network with a priori probabilities for problems and conditional probability tables for symptoms. The value of these probabilities must be learned from a historical data set of diagnosis cases or elicited from diagnosis experts. Some inventions related to Bayesian networks applied to diagnosis are US 6456622, US 2009/0292948 and US 2007/02609.
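As a minimal sketch of this kind of inference, the following applies Bayes' rule to one problem and one symptom. The function name `posterior` and all numeric values are assumptions made for illustration; they are not taken from the cited documents.

```python
def posterior(prior, p_symptom_given_problem, p_symptom_given_no_problem):
    """Bayes' rule: probability of the problem once the symptom is observed."""
    joint_problem = prior * p_symptom_given_problem
    joint_no_problem = (1.0 - prior) * p_symptom_given_no_problem
    return joint_problem / (joint_problem + joint_no_problem)


# Assumed values: the fault has a 1% a priori probability, and the symptom
# appears in 90% of faulty cases and in 5% of healthy cases.
p = posterior(prior=0.01,
              p_symptom_given_problem=0.9,
              p_symptom_given_no_problem=0.05)
print(round(p, 3))  # 0.154
```

Under these assumed values a single positive symptom raises the probability of the fault from 1% to about 15%, which illustrates why a priori probabilities and conditional probability tables must both be available.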

Some problems have been detected with the existing solutions:

- Manual solutions are slow and error prone.

- Automated solutions do not gracefully handle missing or incorrect information, which is fairly typical in network environments. This is particularly relevant for rule based systems, semantic systems and decision diagrams.

- Probabilistic solutions (including Bayesian networks), and case based reasoning, are more robust against missing or incorrect information. When this happens, they still provide a conclusion, but if this is not reliable enough (for instance, a 70% probability), expert involvement is still required.

- Extensive output information clutters the expert's visibility of the problem and complicates reaching a correct diagnosis. This applies to any solution that requires expert intervention in a fraction or the totality of cases.

- Solutions that require collecting all information before starting the analysis delay the diagnosis process and are not appropriate when there are non-negligible costs to obtain information.

- Historical information about diagnosis cases and their solutions is rarely available, and is not usually organized into a structured model that allows comparisons and learning. This is particularly problematic for case based reasoning approaches, and for estimating probabilities in Bayesian networks.

- Different experts reach different conclusions (possibly due to various backgrounds and/or levels of expertise in the concrete problem to be diagnosed) and there may not be a simple way to reach a unique agreement on the diagnosis.

- It is difficult for experts to express their knowledge about a given problem in the appropriate knowledge representation framework.

- Diagnosis experts cannot easily evolve the knowledge model to follow the network evolution and adapt it to changing or new scenarios. This happens because expert knowledge is translated into a reasoning representation not familiar to diagnosis experts. The result is that models are not kept up to date and the quality of automated solutions degrades over time.

Description of the Invention

It is necessary to offer an alternative to the state of the art which covers the gaps found therein, particularly related to the lack of proposals which really allows the simplification of the diagnosis process of a network by translating the expert knowledge unobtrusively into a conceptual model which will allow the expert to diagnose a given situation in said network.

To that end, the present invention provides, in a first aspect, a method for automatic diagnosis of operative problems of devices and systems of a telecommunication network, comprising creating and using a Bayesian network for performing said diagnosis of operative problems in said telecommunication network according to probability values, where said probability values are provided or influenced by expertise complementary information regarding said operating problems and/or telecommunication network behavior.

Contrary to the known proposals, the method of the first aspect comprises providing a logical representation of said telecommunication network under diagnosis and associated problems relevant for the telecommunication network and/or at least one diagnosis scenario, and using said logical representation for inputting said expertise complementary information.

In a preferred embodiment, the method of the first aspect of the invention comprises, in order to provide or influence said probability values, building and providing to the Bayesian network at least one expert model including at least said associated relevant problems.

Other embodiments of the method of the first aspect of the invention are described according to appended claims 2 to 13, and in a subsequent section related to the detailed description of several embodiments.

A second aspect of the present invention concerns a computer program comprising computer program code means adapted to perform the stages of the method according to any of the claims 1 to 13, including the building of the expert model or models, the building of the Bayesian network or networks and their operation, when said program runs on a computer, a digital signal processor, an application-specific integrated circuit, a microprocessor, a microcontroller or any other form of programmable hardware.

Brief Description of the Drawings

The previous and other advantages and features will be more fully understood from the following detailed description of embodiments, with reference to the attached drawings, which must be considered in an illustrative and non-limiting manner, in which:

Figure 1 shows an example of the Bayesian network used in the present invention.

Figure 2 shows the logical steps, from the drawing of the circuit diagram to the display of diagnosis results, according to the logical workflow of the present invention.

Figure 3 shows an example of the Analytical hierarchy process structure used in the present invention.

Figure 4 shows the architecture of a computational device, according to an embodiment of the present invention.

Figure 5 shows an example of the knowledge capture stage.

Figure 6 depicts some examples of the different icons that could be available for the user.

Figure 7 represents an example of a finished circuit, according to an embodiment of the present invention.

Figure 8 shows an example of the circuit stage in the interface used in the present invention.

Figure 9 shows an example of the definition/modeling stage in the interface used in the present invention.

Figure 10 shows an example of the diagnosis stage in the interface used in the present invention.

Figure 11 shows an example of the integration stage in the interface used in the present invention.

Figure 12 shows an example of the weight table edition in the interface used in the present invention.

Detailed Description of Several Embodiments

The present invention includes the construction and usage of a Bayesian network for diagnosis of problems in telecommunication networks. The objectives of the invention are to facilitate the introduction by experts of the diagnosis knowledge, to use this knowledge to reason which are the problems impacting the network, and to provide a condensed summary of the relevant data for the diagnosis case to experts.

The method involves the following logical steps: circuit definition, problem definition, observation definition, model definition, Bayesian network generation, individual diagnosis, knowledge integration and visualization.

1. An expert designs a circuit for the diagnosis scenario, which is a network representation of the diagnosis problem. Supported elements in the design are nodes, cards, ports, links, network segments and backup segments.

2. An expert defines which problems are relevant for the scenario. The expert may select previously defined problems, common with other scenarios, or add new problem definitions. A problem definition includes information about the problem type (e.g. medium cut, port defect or quality degradation), its location on the network representation, and whether it causes a high or low impact on the network.

3. An expert defines which observations, either tests or status information, are available in the scenario. These observations may be pre-existing or newly added ones. The definition of an observation includes information about the values it may have, its location on the network representation, about its normal value when no problems are present, and about which values are indicative of problems.

4. An expert associates each problem with the observations which are impacted by the appearance of that problem. This is called the expert model. This is done over the network representation created in step 1. Each problem may impact multiple observations, and each observation may be impacted by multiple problems. Multiple experts may create expert models of their own, independently of other experts.

5. A Bayesian network for the scenario is generated from each expert model. The rules for creating the Bayesian network take into account the relationships explicitly present in the model (from step 4), generic information about problem types and the problem and observation definitions made during steps 2 and 3.

6. The resulting Bayesian networks are used for performing diagnosis of problems in a network scenario. The scenario observations will be retrieved from different sources, and the Bayesian networks will provide, for every expert, the probability of each scenario problem, given the value of the collected observations. The set of problem probabilities derived from an expert model is known as the expert diagnosis.

7. Expert diagnoses produced by multiple experts are combined to produce what is called a collective diagnosis. The combination of individual diagnoses is done using an Analytical Hierarchy Process (AHP) algorithm that takes into account the individual expertise of each contributor on different areas of the network and on different types of problems.

8. The collective diagnosis is presented to the users over a network representation of the scenario together with relevant observations. The subset of observations is selected based on how much explanatory power they have on the final diagnosis, i.e. how much they contribute to the conclusion. Individual expert diagnoses may also be presented to the users using the same mechanism.

The invention consists of a method and system to, first, build Bayesian networks for telecommunication network diagnosis; second, to use those networks to determine which problems are impacting the network, and, finally, to provide to users a condensed description of the diagnosis case. Multiple diagnosis experts can provide input over a representation of the network to build a model of problems and observations. These models are turned into Bayesian networks which, at diagnosis time, are combined using the Analytical Hierarchy Process (AHP) method to help decide which are the most likely causes of a given network failure. The invention hides the internal modeling logic from diagnosis experts; it allows them to create and modify the reasoning knowledge by directly operating over a logical representation of the network under diagnosis, instead of by configuring the low level, specific aspects of the reasoning engine.

A Bayesian network is a probabilistic graphical model that represents a set of variables, their probabilistic dependencies and the existing conditional independences among them. When applied to failure diagnosis in communication networks, the variables in Bayesian networks represent possible causes of failure and the observed symptoms resulting from them, together with their probabilistic relationship. A priori knowledge about the likelihood of each cause can be combined with observations about the symptoms to make probabilistic inferences about the full status of the communication network and the actual causes of a failure even when provided with incomplete information, noisy data or mistaken inputs.
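The ability to reason with incomplete information can be illustrated with a small sketch that computes problem probabilities by enumerating all problem states. It assumes two independent problems, each with one binary observation; the names `PRIORS`, `P_NOK` and `problem_posteriors`, and all numeric values, are hypothetical and serve only to show the mechanism.

```python
from itertools import product

# Assumed a priori probabilities of two independent problems.
PRIORS = {"medium_cut": 0.02, "port_defect": 0.05}

# Assumed P(observation == "NOK" | state of its single parent problem).
P_NOK = {
    "link_test": ("medium_cut", {True: 0.95, False: 0.03}),
    "port_status": ("port_defect", {True: 0.90, False: 0.02}),
}


def problem_posteriors(evidence):
    """Return P(problem present | evidence) by enumerating all problem states.

    `evidence` maps observation names to "OK" or "NOK". Observations that
    were not collected are left out and contribute no factor.
    """
    names = list(PRIORS)
    totals = {name: 0.0 for name in names}
    norm = 0.0
    for states in product([True, False], repeat=len(names)):
        world = dict(zip(names, states))
        weight = 1.0
        for name, present in world.items():
            weight *= PRIORS[name] if present else 1.0 - PRIORS[name]
        for obs, value in evidence.items():
            parent, table = P_NOK[obs]
            p_nok = table[world[parent]]
            weight *= p_nok if value == "NOK" else 1.0 - p_nok
        norm += weight
        for name, present in world.items():
            if present:
                totals[name] += weight
    return {name: totals[name] / norm for name in names}


# Only the link test was collected; the port status is missing.
result = problem_posteriors({"link_test": "NOK"})
```

With these assumed values the probability of `medium_cut` rises from 0.02 to roughly 0.39, while `port_defect` stays at its a priori value of 0.05 because no observation related to it was collected. Exhaustive enumeration is used here only for brevity; it grows exponentially with the number of problems.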

In the present invention, as depicted in Figure 1, users work with a list of problems 1 (representing root causes 2a, 2b, 2c) and a list of observations 3 (representing obtainable symptoms 4a, 4b, 4c) for a given circuit type. Users can create associations 5a, 5b, 5c from network problems to observations; a problem such as 2a may be associated to multiple observations, and an observation, such as 4c, may be associated to multiple problems. These associations are converted into causal relationships in a Bayesian network for the circuit. The Bayesian network definition is completed with information about the frequency of each problem and about the significance of the problem-observation relationship. Bayesian networks created, by one or multiple users, for a particular circuit type are used, at diagnosis time, to compute the probability of each problem. When several Bayesian networks have been created for the same circuit, an AHP algorithm combines the probabilities given by each of them to reach a decision about the most likely problems.
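The combination of several expert diagnoses may be sketched as follows. This is a simplified stand-in and not the AHP structure of the invention: it assumes a single pairwise comparison matrix of expert expertise, derives weights with the row geometric mean approximation commonly used in AHP, and then forms a weighted sum of the per-expert probabilities. The names `ahp_priorities` and `combine`, the comparison values and the probabilities are all hypothetical.

```python
import math


def ahp_priorities(pairwise):
    """Approximate an AHP priority vector with the row geometric mean method."""
    means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(means)
    return [m / total for m in means]


def combine(expert_diagnoses, weights):
    """Weighted sum of the problem probabilities given by each expert."""
    problems = expert_diagnoses[0].keys()
    return {
        p: sum(w * d[p] for w, d in zip(weights, expert_diagnoses))
        for p in problems
    }


# Assumed judgement: expert 0 is rated three times as expert as expert 1.
pairwise = [[1.0, 3.0],
            [1.0 / 3.0, 1.0]]
weights = ahp_priorities(pairwise)

# Assumed expert diagnoses (problem probabilities from each expert network).
diagnoses = [
    {"medium_cut": 0.80, "port_defect": 0.10},  # expert 0
    {"medium_cut": 0.40, "port_defect": 0.50},  # expert 1
]
collective = combine(diagnoses, weights)
most_likely = max(collective, key=collective.get)
```

With this comparison matrix the weights are 0.75 and 0.25, so the collective probability of `medium_cut` is 0.70 and that of `port_defect` is 0.20. A fuller treatment would use separate comparison matrices per network area and per problem type, as the text indicates.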

Figure 2 provides a high level overview of the invention logical workflow, from the drawing of the circuit diagram to the display of diagnosis results. It involves the following steps:

• Circuit definition (6 in Figure 2)

• Problem definition (7 in Figure 2)

• Observation definition (8 in Figure 2)

• Model definition (9 in Figure 2)

• Bayesian network generation (10 in Figure 2)

• Individual diagnosis (11 in Figure 2)

• Knowledge integration (12 in Figure 2)

• Visualization (13 in Figure 2)

Each of these steps is described in detail below.

Circuit definition

The procedure starts with the creation of a new circuit type. In this phase an expert designs the diagnosis scenario, which is a logical network representation of the diagnosis problem. The expert will define which elements are relevant for diagnosis, how they are connected, and how the circuit should be presented at diagnosis time. Every element in the diagram is characterized by a name (which is unique within a circuit), a type and a set of connections with other elements. Elements available are:

• Nodes: nodes represent physical network elements. The name of nodes is provided by the user. Nodes can contain zero, one or multiple cards and ports. Nodes can be connected by links to other nodes or network segments. Nodes may be part of a backup segment.

• Cards: cards are contained in nodes. Cards are assigned a unique identifier within a node. The name of a card is derived from the name of the node, concatenated with the unique identifier. Cards may contain zero, one or multiple ports. Cards are not connected to other elements.

• Ports: ports are contained in cards or directly in nodes. Ports are assigned a unique identifier within a node. The name of a port is derived from the name of the node, concatenated with the port unique identifier. Ports may terminate a link that connects the containing node.

• Network segments: network segments represent a graph of network elements which is abstracted away from the diagnosis point of view. The name of network segments is provided by the user. Network segments can contain zero, one or multiple ports. Network segments can be connected by links to nodes or other network segments. Network segments may be part of a backup segment.

• Links: links connect two elements. These elements can be nodes or network segments. The name of the link results from the concatenation of the names of the elements it connects. Links may be terminated at ports or directly at nodes or network segments.

• Backup segments: backup segments represent a set of nodes and network segments which are grouped together for network backup purposes. A backup segment contains a primary branch and one or multiple backup branches that provide protection to the primary one. Backup branches within a backup segment are identified by a branch number N increasing from 1. Elements in the backup segment are assigned to one and only one branch. The name of a backup segment is automatically generated based on its placement in the circuit.

• Services: services represent functions required for the circuit to work, but which are not part of a specific node or network segment. The name of a service is provided by the user. Services are not connected to any other element.

The result of the circuit definition is a logical network definition, a representation of the circuit which is used throughout the entire procedure.
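The naming rules for cards, ports and links described above can be sketched as follows. The class `Node`, the helper `link_name` and the separator `SEP` are assumptions of this sketch: the text states only that names are concatenated and does not specify a separator or a data structure.

```python
from dataclasses import dataclass, field

SEP = "-"  # assumed separator; the text only says that names are concatenated


@dataclass
class Node:
    name: str  # provided by the user, unique within the circuit
    cards: list = field(default_factory=list)
    ports: list = field(default_factory=list)

    def add_card(self, card_id):
        """Card name = node name concatenated with the card identifier."""
        name = f"{self.name}{SEP}{card_id}"
        if name in self.cards:
            raise ValueError(f"duplicate card identifier: {card_id}")
        self.cards.append(name)
        return name

    def add_port(self, port_id):
        """Port name = node name concatenated with the port identifier."""
        name = f"{self.name}{SEP}{port_id}"
        if name in self.ports:
            raise ValueError(f"duplicate port identifier: {port_id}")
        self.ports.append(name)
        return name


def link_name(element_a, element_b):
    """Link name = concatenation of the names of the two connected elements."""
    return f"{element_a.name}{SEP}{element_b.name}"


router = Node("RouterA")
switch = Node("SwitchB")
card = router.add_card("card1")
port = router.add_port("port1")
link = link_name(router, switch)
```

Here `card` is `"RouterA-card1"`, `port` is `"RouterA-port1"` and `link` is `"RouterA-SwitchB"`. Deriving names from the containing node keeps them unique within the circuit as long as node names and per-node identifiers are unique.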

Problem definition

An expert defines which problems are relevant for a circuit. The expert may select previously defined problems, common with other circuits, or add new problem definitions. A problem definition includes information about the problem type (e.g. medium cut, port defect, quality degradation), its location on the network diagram drawn in the circuit definition phase, the likelihood of the problem happening and whether it causes a high or low impact on the network.
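A problem definition as described above might be represented as in the following sketch. The class `Problem` and its field names are hypothetical, and the example values are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    problem_type: str   # e.g. "medium cut", "port defect", "quality degradation"
    location: str       # name of an element in the logical network definition
    likelihood: float   # likelihood of the problem happening
    high_impact: bool   # True for high impact, False for low impact

    def __post_init__(self):
        # The likelihood is later used as an a priori probability.
        if not 0.0 <= self.likelihood <= 1.0:
            raise ValueError("likelihood must be a probability in [0, 1]")


cut = Problem("medium cut", "RouterA-SwitchB", likelihood=0.02, high_impact=True)
```

Keeping the definition immutable reflects the fact that the same problem may be reused across circuits and expert models.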

Observation definition

An expert defines which observations, either tests or readable status information, are available in the scenario. These observations may be pre-existing or newly added ones. The definition of an observation includes its name, information about the values it may take, its location on the network diagram, about its normal value when no problems are present, and about which values are indicative of problems. This last set of values is sorted in decreasing order of severity: the values at the top of the list are indicative of more severe problems. The invention method differentiates the following kinds of observation:

• Node observations: these observations are tied to a node. The user has to explicitly enter the name of the observation. Names of node observations must be unique in the scope of each node. By default, these observations take two values: "NOK" and "OK". The normal value is "OK". The user may add additional values, or edit the default ones.

• Service observations: these observations are tied to a service. The user has to explicitly enter the name of the observation. Names of service observations must be unique in the scope of each service. By default, these observations take two values: "NOK" and "OK". The normal value is "OK". The user may add additional values, or edit the default ones.

• Network segment observations: these observations are tied to a network segment. The user has to explicitly enter the name of the observation. Names of the observations must be unique in the scope of each network segment. By default, these observations take two values: "NOK" and "OK". The normal value is "OK". The user may add additional values, or edit the default ones.

• Point to point observations: these observations represent measurements or tests taken between two elements in the network diagram. The user has to explicitly enter the name of the observation. Names of the point to point observations must be unique within the circuit. By default, these observations take two values: "NOK" and "OK". The normal value is "OK". The user may add additional values, or edit the default ones.

• Status of ports and cards: status observations are created automatically for each port and card in the circuit. Status observations can take two values: "NOK" and "OK". The normal value is "OK".

• Manifestation: the manifestation represents the network user view of the behavior of the circuit. A manifestation observation is created for all circuits, independently of the network diagram drawing. The manifestation is not associated to any particular element in the logical network definition. The circuit manifestation can take the following values, from the highest to the lowest severity: "unavailable", "intermittent cuts", "low speed", "packet loss", "working". The normal value is "working".
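The observation definitions listed above share a common shape, which may be sketched as follows. The class `Observation`, its field names and the helper `manifestation` are hypothetical; only the default values, the normal values and the severity ordering come from the text.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    name: str
    location: str                     # network element, or "" when not tied to one
    normal_value: str = "OK"
    problem_values: tuple = ("NOK",)  # values indicative of problems, most severe first

    @property
    def values(self):
        """All values the observation may take, normal value last."""
        return list(self.problem_values) + [self.normal_value]

    def severity_rank(self, value):
        """0 is the most severe value; the normal value has the highest rank."""
        return self.values.index(value)


def manifestation():
    """Manifestation observation, created for every circuit."""
    return Observation(
        name="manifestation",
        location="",
        normal_value="working",
        problem_values=("unavailable", "intermittent cuts",
                        "low speed", "packet loss"),
    )


port_status = Observation(name="status", location="RouterA-port1")
m = manifestation()
```

In this sketch `port_status.values` is `["NOK", "OK"]`, and `m.severity_rank("unavailable")` is 0, reflecting that the values at the top of the list indicate more severe problems.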

Model definition

In this phase, experts build a model of the circuit diagnosis scenario. Multiple experts may provide models for the same circuit. Modeling is based on the logical network designed in the circuit definition phase. To create his or her model, an expert associates each problem with the observations which are impacted by the appearance of that problem. Therefore, an expert model is comprised of a set of problems, each of which is associated to zero, one or many observations.

The user chooses the circuit to work on, and starts by selecting a problem to model from the list of problems created in the problem definition phase. For each problem, the user will indicate which observations at which logical network elements are associated to it. The user will select the most probable value of the observation when the problem appears. By default, there are no associations between problems and observations. An association is created only when the user selects an observation value different from its normal value. An association between a problem and an observation indicates that the observation has a causal dependency on the problem.

A problem may impact multiple observations, and each observation may be impacted by multiple problems.

The result of the model defined by an expert is called the expert model. There may be multiple independent expert models for the same circuit, and each model may reuse problems and observations from other expert models.
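The rule that an association exists only when the selected value differs from the normal value can be sketched as follows. The class `ExpertModel` and its methods are hypothetical names; the problem and observation names are assumed examples.

```python
class ExpertModel:
    """Associations from problems to observations, as entered by one expert."""

    def __init__(self, problems, normal_values):
        # problems: names created in the problem definition phase
        # normal_values: observation name -> normal value
        self.normal_values = dict(normal_values)
        self.associations = {p: {} for p in problems}

    def select(self, problem, observation, value):
        """Record the most probable observation value when `problem` appears."""
        if value == self.normal_values[observation]:
            # Selecting the normal value means there is no causal dependency.
            self.associations[problem].pop(observation, None)
        else:
            self.associations[problem][observation] = value

    def parents(self, observation):
        """Problems on which the observation has a causal dependency."""
        return [p for p, obs in self.associations.items() if observation in obs]


model = ExpertModel(
    problems=["medium_cut", "port_defect"],
    normal_values={"link_test": "OK", "port_status": "OK",
                   "manifestation": "working"},
)
model.select("medium_cut", "link_test", "NOK")
model.select("medium_cut", "manifestation", "unavailable")
model.select("port_defect", "manifestation", "low speed")
model.select("port_defect", "link_test", "OK")  # normal value: no association
```

After these selections, `manifestation` has two parent problems and `link_test` has one, which shows how a problem may impact multiple observations and an observation may be impacted by multiple problems.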

Bayesian network generation

An expert Bayesian network for the circuit is generated from each expert model. The rules for creating the Bayesian network take into account the relationships explicitly present in the model (created in the prior phase), general information about problem types and how they impact telecommunication networks, and the problem and observation definitions previously made.

The procedure to generate a Bayesian network uses a technique known as the generalized noisy OR-gate [7]. Generalized noisy OR-gates allow constructing the conditional probability table by independently defining the intensity of the causal dependency from a parent cause to the symptom, plus leak parameters that specify the probabilities when no problems are present. The main advantage of the OR-gate is that the number of parameters is proportional to the number of causes, while it is exponential in the general conditional probability table case. As a consequence, the OR-gate simplifies knowledge acquisition and evolution.

The procedure for building the Bayesian network, when no backup segments are present in the circuit, is as follows:

i. For each association between a problem and an observation in the expert model, create a causal link from the problem to the observation. In this link, the problem is said to be the parent, and the observation the child.

ii. The a priori probability for the problems is given by the likelihood of the problem occurring, as specified during the problem definition phase.

iii. If an observation has a single parent problem, and it's a problem defined as having high impact on the network, the conditional probability table for the observation will be:

• P(selected observation value / Problem) = P_high (configurable)

• P(non-selected observation / Problem) = (1 - P_high) / (number of observation values - 1)

• P(normal observation value / No Problem) = P_normal (configurable)

• P(non-normal observation / No Problem) = (1 - P_normal) / (number of observation values - 1)

where P_high and P_normal are configurable parameters. Here, "P (Value / Problem)" means the probability that the observation takes that Value if the Problem is present. "Selected observation value" corresponds to the most probable value of the observation, given the problem, as selected during the model definition. "Normal observation value" corresponds to the normal value of the observation as specified during the observation definition.

iv. If an observation has a single parent problem, and it's a problem defined as having low impact on the network, the conditional probability table for the observation will be:

• P(selected observation value / Problem) = P_low

• P(non-selected observation / Problem) = (1 - P_low) / (number of observation values - 1)

• P(normal observation value / No Problem) = P_normal

• P(non-normal observation / No Problem) = (1 - P_normal) / (number of observation values - 1)

where P_low and P_normal are configurable parameters.

v. When an observation has two or more parents, generalized Noisy OR gates are used to generate the conditional probability tables. The individual intensity of the dependency between a problem and the associated observation is computed as described in the rules iii (for high impact problems) and iv (for low impact problems) above, using P_high and P_normal parameters. Leak values for the observation in the absence of problems are controlled by the P_normal configurable parameter.

vi. If a particular problem is not associated to any observation, that problem will be kept unconnected in the Bayesian network.
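The single-parent tables of rules iii and iv, and the multi-parent case of rule v, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the multi-parent function covers just the simplified binary noisy-OR (observation either normal or abnormal) rather than the generalized gate of [7], and taking the leak as 1 - P_normal is an assumption, since the text only says that the leak is controlled by P_normal.

```python
from math import prod


def single_parent_cpt(values, selected, normal, p_impact, p_normal):
    """Conditional probability table for an observation with one parent problem.

    p_impact stands for P_high (rule iii) or P_low (rule iv).
    Returns {problem_present: {observation_value: probability}}.
    """
    if len(values) < 2:
        raise ValueError("an observation needs at least two values")
    if selected not in values or normal not in values:
        raise ValueError("selected and normal must be observation values")
    others = len(values) - 1
    # Problem present: the selected value gets p_impact, the rest share the remainder.
    present = {v: (1 - p_impact) / others for v in values}
    present[selected] = p_impact
    # Problem absent: the normal value gets p_normal, the rest share the remainder.
    absent = {v: (1 - p_normal) / others for v in values}
    absent[normal] = p_normal
    return {True: present, False: absent}


def noisy_or_abnormal(intensities, leak):
    """Binary noisy-OR: P(observation abnormal | the listed problems are present).

    intensities holds one causal intensity per problem that is present;
    leak is P(observation abnormal | no problem present), assumed here
    to be 1 - P_normal.
    """
    return 1 - (1 - leak) * prod(1 - c for c in intensities)


# Example with made-up parameter values.
cpt = single_parent_cpt(["ok", "degraded", "down"], "down", "ok", 0.9, 0.95)
p_abnormal = noisy_or_abnormal([0.9, 0.6], leak=1 - 0.95)
```

With two parents the binary gate needs only two intensities and one leak value, instead of a full table with one entry per combination of parent states, which is the parameter saving described above.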

If the circuit contains backup segments, a protection structure will be inserted into the Bayesian network, as per the following rules:

a. Each branch of the backup segment will be dealt with as if it were an independent circuit, resulting in two or more partial Bayesian networks. These Bayesian networks will only contain problems associated to observations located within the boundaries of each branch. If any of these problems are associated with observations outside the protection segment boundaries, see rule f.

b. To integrate several branches, auxiliary Bayesian network nodes are created: ProtectionProblem, MainBranchProblem and as many BackupBranch[N]Problem as there are backup branches, where [N] is the backup branch number.

c. MainBranchProblem is defined as 'Yes' if there's any problem in the main branch. In the Bayesian network, that means it appears as a child node of all problems in the main branch; its probability table for 'Yes' will have a value of '1' if any of the problems is present, '0' if no problem is present. The probability of 'No', indicating no problems in the branch, is computed as 1 - Probability('Yes').

d. Same for each BackupBranch[N]Problem for its respective branches.

e. ProtectionProblem will be a child node of MainBranchProblem and all BackupBranch[N]Problem nodes. Its probability table (for the 'Yes' value) will take a value of '1' when all MainBranchProblem and BackupBranch[N]Problem nodes are at '1', and '0' otherwise. The probability of 'No', indicating no problems in the backup segment, is computed as 1 - Probability('Yes').

f. If any of the backup segment problems are associated as well with additional observations outside the backup segment, a dependency in the Bayesian network will be created from the ProtectionProblem node to the observation. The values in the probabilities table will be taken from the problem with the highest network impact (high or low) among those associated with a given observation.
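The auxiliary nodes of the protection structure have deterministic probability tables: a branch node behaves as a logical OR of its parent problems, and the protection node as a logical AND of the branch nodes. A minimal sketch of those tables, with hypothetical function names and parent states encoded as tuples of booleans, could look like this:

```python
from itertools import product


def branch_problem_cpt(n_problems):
    """P(branch node = 'Yes' | parent states): 1 if any problem in the branch is present."""
    return {states: 1.0 if any(states) else 0.0
            for states in product([False, True], repeat=n_problems)}


def protection_problem_cpt(n_branches):
    """P(protection node = 'Yes' | branch states): 1 only if every branch has a problem."""
    return {states: 1.0 if all(states) else 0.0
            for states in product([False, True], repeat=n_branches)}


# Main branch with two problems; protection over one main and one backup branch.
main_branch = branch_problem_cpt(2)
protection = protection_problem_cpt(2)
p_no = 1.0 - protection[(True, False)]  # P('No') is always 1 - P('Yes')
```

The tables capture the intent of a backup segment: the segment as a whole is considered faulty only when the main branch and every backup branch are affected at the same time.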

Individual diagnosis

When there is a suspicion that a circuit is affected by a problem, a diagnosis case is created. The diagnosis case contains information about the manifestation of the problem as perceived by the user of the circuit. The case is complemented with information, retrieved from different external sources, about the status of elements and services related to the circuit and with the result of specific tests conducted to get more information about the case. The status and test values are assigned to the observations representing them in each expert Bayesian network. Given these values, Bayesian network inference is performed, for all expert Bayesian networks, to compute the probability of each problem defined for the circuit. The set of problem probabilities obtained for a specific expert Bayesian network is known as the expert diagnosis.
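To illustrate how an expert diagnosis could be computed from observed values, the sketch below performs exact inference by brute-force enumeration over all combinations of problems. It is not the inference engine of the invention: it assumes binary observations (normal or abnormal) linked to problems through a binary noisy-OR, and all names and numbers (`expert_diagnosis`, `fiber_cut`, `los_alarm`, the probabilities) are hypothetical. Enumeration is only practical for a handful of problems.

```python
from itertools import product
from math import prod


def expert_diagnosis(priors, links, leak, evidence):
    """Posterior probability of each problem given the observed values.

    priors:   {problem: P(problem present)}
    links:    {observation: {problem: causal intensity}}
    leak:     P(observation abnormal | none of its parent problems present)
    evidence: {observation: True if abnormal, False if normal}
    """
    names = list(priors)
    posterior = dict.fromkeys(names, 0.0)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        present = dict(zip(names, states))
        # Prior probability of this combination of problems.
        weight = prod(priors[p] if present[p] else 1 - priors[p] for p in names)
        # Likelihood of every observed value under this combination.
        for obs, abnormal in evidence.items():
            p_abn = 1 - (1 - leak) * prod(
                1 - c for p, c in links[obs].items() if present[p])
            weight *= p_abn if abnormal else 1 - p_abn
        total += weight
        for p in names:
            if present[p]:
                posterior[p] += weight
    return {p: posterior[p] / total for p in names}


diagnosis = expert_diagnosis(
    priors={"fiber_cut": 0.01, "card_fault": 0.02},
    links={"los_alarm": {"fiber_cut": 0.9}, "card_led": {"card_fault": 0.9}},
    leak=0.05,
    evidence={"los_alarm": True, "card_led": False},
)
```

In this example the abnormal alarm raises the probability of the problem linked to it above its prior, while the normal reading lowers the probability of the other problem.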

Knowledge integration stage

Expert diagnoses produced by multiple experts are combined to produce what is called a collective diagnosis. The combination of individual diagnoses is obtained using the AHP [8]. In the present invention, the AHP takes into account the individual expertise of each contributor on different areas of the network and on different types of problems, as explained below.

The AHP method has three basic phases: 1) definition of a hierarchical structure of the problem, 2) pairwise comparison of all elements in the hierarchy with respect to a ratio scale, and 3) synthesis of all comparisons to arrive at a global prioritisation for the alternatives to be ranked, representing the correct diagnosis.

The definition phase of the hierarchical structure of the decision problem (14 in Figure 3) within the framework of AHP consists (in a simple case) of the identification of the goal to be achieved by the decision process, called the overall goal 15, and a set of general criteria 16a, 16b..., 16n, and sub-criteria 17a, 17b..., 17n which represent different aspects that affect the decision problem. Finally, a last level of alternatives 18a, 18b..., 18n represents the possible choices to be ranked in the final ordering that is the outcome of the decision process.

The definition of the problem hierarchy in the present invention refers to the setup of a structure representing the different experts, and their relationship to the outcome of the expert diagnosis implemented through the Bayesian reasoning. The overall goal in the hierarchical structure for the present invention refers to the achievement of a single best outcome for the diagnosis in the circuit. Criteria, and as many sub-criteria levels as needed to compose the hierarchical structure, provide the framework for solving the decision problem. In the present invention, criteria refer to the different experts and, if needed, sub-criteria can refer to aspects of the expert's knowledge, for example their degree of expertise in a particular subject. Finally, in the present invention every expert is considered to have the same set of alternatives, which represent the different problems associated to the set of observations in a given circuit. In this way the AHP method can treat all experts consistently to obtain consensus in the diagnosis outcome.

The prioritisation is achieved by applying a modified version of the AHP. In the present invention, a modification to the usual AHP algorithm is made in order to fit the prioritisation to the outcome from the Bayesian reasoning stage. In the traditional implementation of the AHP methodology, pairwise comparisons are performed for every pair of criteria, sub-criteria and alternatives. In the present invention only criteria and sub-criteria are compared, while alternatives are treated separately, as will be explained below.

The pairwise comparison step consists in quantifying the relative importance of each element (in this case, excluding alternatives) in all possible pairs of elements in each level or cluster, always in relation to the element immediately above. In the present invention, the elements or criteria are the experts. Thus, pairwise comparisons are made to quantify how much one expert is favoured over another in relation to the overall goal, which is the best assessment possible for the likely sources of problems in the diagnosis of the circuit.

Experts can be compared in two ways, by means of self-evaluation or by means of peer-evaluation. In the first case, each expert compares herself to every other expert, generating a concrete number which will quantify the relation between the two. In the second case, peers judge the expertise of their colleagues, and a synthesis is made of all the comparisons, for example by means of an arithmetic average. In both cases every pair of experts is represented by a weight, given by an integer between 1 and 9. The weight corresponding to the same pair of experts but in the opposite direction is represented, in the usual AHP fashion, by the inverse of the first weight. The peer evaluation includes the instance of a single individual evaluating every expert.
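The comparison data described above can be held in a reciprocal matrix. The sketch below is one possible way to build it, with hypothetical function and expert names; it assumes that all evaluators state a given pair in the same direction, and that peer judgments are combined with the arithmetic average mentioned in the text (so an averaged weight may be a non-integer between 1 and 9).

```python
def average_judgments(judgment_sets):
    """Arithmetic average, per ordered pair of experts, of several evaluators' weights."""
    sums, counts = {}, {}
    for judgments in judgment_sets:
        for pair, weight in judgments.items():
            sums[pair] = sums.get(pair, 0.0) + weight
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}


def reciprocal_matrix(experts, judgments):
    """Build the pairwise comparison matrix.

    judgments maps (a, b) to the weight of expert a relative to expert b;
    the opposite cell (b, a) is filled with the inverse, and the diagonal is 1.
    """
    index = {expert: i for i, expert in enumerate(experts)}
    n = len(experts)
    matrix = [[1.0] * n for _ in range(n)]
    for (a, b), weight in judgments.items():
        if not 1 <= weight <= 9:
            raise ValueError("weights must lie between 1 and 9")
        matrix[index[a]][index[b]] = float(weight)
        matrix[index[b]][index[a]] = 1.0 / weight
    return matrix


# Two peers judge how "ana" compares to "ben"; their judgments are averaged.
peer = average_judgments([{("ana", "ben"): 3}, {("ana", "ben"): 5}])
matrix = reciprocal_matrix(["ana", "ben"], peer)
```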

In the case of alternatives, which are all the possible problems associated to the circuit's diagnosis, there are no comparisons. In the usual AHP methodology, weights are assigned to every pair of alternatives, and from them it is possible to generate a set of intermediate weights with total sum equal to 1. These intermediate weights are then used as an input to the rest of the AHP methodology. In the present invention, the outcome from each expert's Bayesian reasoning is taken directly and used as the intermediate weights, as if they had been generated by pairwise comparison between alternatives. The only caveat is that the outcome from the Bayesian network must first be normalized, because the problem probabilities need not sum to 1, while the intermediate weights in AHP must.

Finally, in the synthesizing step all the relevant comparisons or scores are integrated to obtain weights by means of the rationale provided by AHP [8] and reach a final prioritisation for all the alternatives, in this case a consensus among the probability of all problems. This consensus is called the collective diagnosis.
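The synthesis can be illustrated with a short sketch. It makes two assumptions that are not stated in the text: expert weights are derived from the pairwise comparison matrix with the row geometric mean method (a common approximation to the principal eigenvector used in [8]), and the collective score of a problem is the weighted sum of the experts' normalized probabilities. Function and variable names are hypothetical.

```python
from math import prod


def priority_weights(matrix):
    """Expert weights from a reciprocal matrix, using row geometric means."""
    n = len(matrix)
    means = [prod(row) ** (1.0 / n) for row in matrix]
    total = sum(means)
    return [m / total for m in means]


def normalize(diagnosis):
    """Rescale an expert diagnosis so that its problem probabilities sum to 1."""
    total = sum(diagnosis.values())
    if total == 0:
        raise ValueError("cannot normalize an all-zero diagnosis")
    return {problem: p / total for problem, p in diagnosis.items()}


def collective_diagnosis(expert_weights, expert_diagnoses):
    """Weighted sum of the normalized expert diagnoses.

    expert_weights:   {expert: weight}, summing to 1
    expert_diagnoses: {expert: {problem: probability}}
    """
    collective = {}
    for expert, diagnosis in expert_diagnoses.items():
        for problem, score in normalize(diagnosis).items():
            collective[problem] = (collective.get(problem, 0.0)
                                   + expert_weights[expert] * score)
    return collective


experts = ["ana", "ben"]
weights = dict(zip(experts, priority_weights([[1.0, 3.0], [1.0 / 3.0, 1.0]])))
result = collective_diagnosis(weights, {
    "ana": {"fiber_cut": 0.6, "card_fault": 0.2},
    "ben": {"fiber_cut": 0.1, "card_fault": 0.3},
})
```

Because each normalized diagnosis sums to 1 and the expert weights sum to 1, the collective scores also sum to 1 and can be read as a ranking of the problems.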

Embodiment of the invention:

The preferred embodiment of the present invention is a computational device comprising the following elements (see Figure 4):

An external source 19 which will feed data to an external memory 20 that will contain them as input data 21. The external source 19 may also feed an external database 22 which in turn can write data in input data 21 in the external memory 20.

The input data 21 will consist in data related to the circuit, to the observations, to the manifestations, to the comparison between experts and any other kind of data necessary for the correct functioning of the present invention. These data will in turn be taken to the calculation module 23, specifically to the internal memory 26 as Bayesian network data 27 or AHP data 28, and will feed the Bayesian network processor 24 and the AHP processor 30 which may or may not be the same processor. The data will be able to move to and from the internal memory 26 through the I/O buses 25 and 29.

Additionally, data will also be able to move to and from the presentation module 31 through I/O bus 32. In turn, the I/O bus 32 will connect to the user interface generator 33 which will display the appropriate information in the user interface 35. The user will be able to select and change part of these data through the user interface 35, and these changes will be processed by the configuration selector 34.

Users will interact with the invention through a web page in any browser, remotely or locally. They will authenticate through a user and password management system, so as to link the knowledge translated into the Bayesian network, and any additional information, with that particular expert.

The knowledge capture 36 consists in 5 stages as shown in Figure 5, comprising: a circuit stage 37, where the user is able to draw the circuit to be diagnosed, a definition stage 38 where the user defines the problems, observations and manifestation available to the circuit, a model stage 39 where the user is able to link problems, observations and manifestation and generates the diagnosis model (i.e. the associated Bayesian network), the diagnosis stage 40 where the Bayesian network is solved to obtain a list of problems with their associated probabilities, and finally an integration stage 41 where the user is able to integrate her results with other results from other experts in order to achieve a consensus for the knowledge captured in every model.

Circuit stage

In the preferred embodiment, the user interface is displayed in a browser or any other online web tool. The front-end will include all the appropriate design elements in order to allow the user to perform all the tasks associated to the design of the circuit.

In the circuit stage the user will draw the circuit to be diagnosed. The circuit is composed by a given number of circuit elements, which are previously defined in the tool. In the preferred embodiment the different elements are represented by different icons, which describe the role of the element in the circuit, as shown in Figure 6, which depicts some examples of the different icons that could be available for the user: an equipment element 42, a network cloud element 43, a backup equipment element 44, a service element 45. Elements eligible for selection and inclusion in the circuit are shown to be so by including an additional visual mark in the corresponding icon, as for instance a small color-coded cross, say a plus-shaped cross (+), as shown in icon 45. To be able to remove any element in the circuit, icons will include whenever appropriate a different visual mark in the corresponding icon, as for instance another small cross, with different color and shape, say an X-shape, as shown in icon 46. In this way the user may select and order in the desired way the different elements comprising the circuit that will be diagnosed. An example of a finished circuit is represented in Figure 7, where the different icons representing elements are automatically connected to form circuit 47.

In the preferred embodiment (Figure 8), the browser will display indicative icons 48 through 52, which will inform the user which of the 5 stages she is currently working on. In the circuit stage the draw icon 48 will be highlighted in an appropriate way. Also, a scroll-down menu 53 is available to allow the user to change among different circuits at any point in the process. Multiple circuits are stored in the database 22 to this end. Additionally, the user may have to log in to access the invention's interface. In this case, the user will be identified with an appropriate icon 54 and will be able to log out with the exit icon 55. The login makes it possible to have different types of users with different attributes, for instance, different permission rights to edit information in the interface.

The user will select, with a mouse or any equivalent tool, from the layout area 58 the appropriate icons among the available element icons 56a, 56b, ... 56n, representing the relevant network equipment, links, etc. described above for the specific circuit that is being analysed. The element icons will be identified with labels 57a, 57b... 57n. After selecting the element icon, the user will be presented with a pop-up window with a text-field entry to name the element being introduced. After naming the element and selecting the OK button in the pop-up window, the icon will be placed after the last drawn icon in draw layout area 59 to give form to said network circuit. Each icon in the draw layout area is designated with the assigned label 60a, 60b... 60n.

Definition stage

In the definition stage the define icon 49 will be highlighted in an appropriate way to inform that the user is currently working in this stage. In this stage the user will create a list of problems associated to the circuit.

The user will be able to compose a list of problems in select problem area layout 61, shown in Figure 9, where all the problems relevant to the circuit are compiled. To this end, the user will select the add feature 63 in the layout area 61. A pop-up window will appear with a text field where the user will type a name or label representing the problem. Then the text field appears in the layout area 61 as for instance problem elements 62a, 62b, 62n. Additionally, the circuit elements chosen in the circuit stage will show by default a given number of observations 68a, 68b, 68c, 68d..., 68m, 68n. Every observation text field will display nearby a small X-shaped mark 69 in order to allow the user the deletion of that particular observation. Also, for each circuit element, a plus-shaped icon mark 70a, 70b..., 70n will be available for the user to add more observations than the default ones. Additionally, an editable horizontal arrow 72 representing a point-to-point test between any two elements in the circuit will be available. By varying the length of the arrow selecting and dragging its tip appropriately, the user will be able to define the point-to-point test between any two parts of the circuit.

Finally, a button or icon 73 will be in place to allow the user to declare that the definition stage has ended and is ready to continue to the next stage.

Modeling stage

In the modeling stage the modeling icon 50 will be highlighted in an appropriate way to inform that the user is currently working in this stage. In this stage the user will be able to associate each of the problems defined in the previous stage with observations in the circuit, circuit elements or part of circuit elements as well as with a manifestation value.

The user will choose any one problem in the layout area 61 by selecting with a mouse or any other means the said problem element, which will change color to indicate that said problem element has been selected. Afterwards, the user will identify observations associated to circuit elements or part of circuit elements (including point-to-point tests) connected with the highlighted problem, again by selecting the corresponding circuit element or part of said circuit element in the layout area 67. Again, the selected circuit element or part of circuit element will be highlighted by changing color, indicating that it has been selected and linked to the highlighted problem. Next, the user will select an associated manifestation in the layout area 64 by selecting the specific icon 65 describing the observation, described by the manifestation label 66. Different icons associated to different manifestations will appear by repeatedly selecting the icon 65 in the layout area 64. Finally, a button or icon 73 will be in place to allow the user to declare that the modeling stage has ended. Upon selection of this button or icon, the Bayesian network will be generated and stored in the database 22 and the user will be ready to continue to the next stage.

Diagnosis stage

In the diagnosis stage the diagnosis icon 51 will be highlighted in an appropriate way to inform that the user is currently working in this stage.

A diagnosis output layout area 74 displays all relevant listed problems in layout area 75, 77a, 77b...77n, which are output by the Bayesian network logic and surpass a given probability threshold set in advance. Additionally, problems whose probabilities are lower than the threshold but are considered unusual by means of the appropriate logic can also be displayed here. An associated probabilities layout area 76 will be displayed alongside the problems layout area 75. In the layout area 76 all associated probabilities 78a, 78b...78n will be displayed to inform the user of the relative relevance of each problem. By selecting each problem the user activates a pop-up window with additional information regarding said problem, including the value of all observations associated to it.
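The selection of problems for display can be sketched as a simple filter. The text does not specify the logic that flags a problem as unusual, so the sketch takes it as an optional caller-supplied predicate (`is_unusual`, hypothetical); sorting by decreasing probability is likewise an assumption about presentation.

```python
def problems_to_display(diagnosis, threshold, is_unusual=lambda problem, p: False):
    """Return (problem, probability) pairs to show, most probable first.

    A problem is shown when its probability surpasses the threshold or when
    the caller-supplied predicate flags it as unusual.
    """
    shown = [(problem, p) for problem, p in diagnosis.items()
             if p > threshold or is_unusual(problem, p)]
    return sorted(shown, key=lambda item: item[1], reverse=True)


listed = problems_to_display(
    {"fiber_cut": 0.62, "card_fault": 0.04, "misconfiguration": 0.30},
    threshold=0.10,
)
```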

At this point the individual user has the entire outcome obtained by her individual diagnosis plus possibly external information. An additional integration stage is available but not mandatory, in which the user will be able to combine the information obtained by her analysis with that of other experts.

Integration stage

In the integration stage the integration icon 52 will be highlighted in an appropriate way to inform that the user is currently working in this stage.

In the integration layout area 79, shown in Figure 11, the current user is presented with two lists comprising other expert users which may include the current user. The left-hand list of eligible users 80, displays all those users 82a, 82b... 82n, which have already provided input models for the same circuit and the same problems. Each user in the eligible user list has a small mark 83a, 83b..., 83n that can be used by the current user to add the eligible user to the right-hand side list 81 which displays the group of user models 84a, 84b..., 84n that will be used to arrive by consensus at a list of probable causes for the problems in the circuit. When the eligible user list 80 is considered to be complete, the current user may need to indicate the relative weights between users, if this has not been done previously by other users. This is done by selecting the weights icon or button 85. The current user is shown another window, shown in Figure 12, where an editable table is presented. The first row and column of the table designate all the eligible users chosen in the previous step. Weights corresponding to the diagonal elements in the table are automatically set to 1. Only numbers 1 through 9 are allowed in the rest of the cells. Whenever a cell is filled with the appropriate number, the cell opposite to it is automatically assigned the inverse of the number introduced, as required by the AHP formalism.

Users may have full edition permissions, partial edition permissions or may not have edition permissions at all. In the case of full permission, the user will be able to set weights comparing any pair of experts. Peer evaluation will need full edition permissions. In the case of partial edition permissions, the user will be able to edit only its own row and column, thus effectively being able only to change its own weight relative to other experts. Self-evaluation will need only partial edition permissions. Finally, a user with no permission to edit any cell in the table will only be able to generate a consensus diagnosis after another user with the proper edition permissions completes the weight table. Additionally, a user may not even be allowed to read the weight table.
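The edition permissions described above amount to a small rule per table cell. A minimal sketch follows; the permission labels `"full"`, `"partial"` and `"none"` and the function name are hypothetical, and treating diagonal cells as never editable is an assumption based on their being fixed at 1.

```python
def can_edit_cell(permission, user, row_expert, col_expert):
    """Decide whether a user may edit the weight comparing row_expert to col_expert."""
    if row_expert == col_expert:
        return False  # diagonal cells are fixed at 1
    if permission == "full":
        return True   # peer evaluation: any pair of experts
    if permission == "partial":
        # self-evaluation: only the user's own row and column
        return user in (row_expert, col_expert)
    return False      # no edition permissions


allowed = can_edit_cell("partial", "ana", "ana", "ben")
```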

When the table is complete, the user finishes the weight assignment procedure by selecting the icon or button 89. When this is done, the users are taken back to the integration window in Figure 11, where they will be able to finish the integration process by selecting the icon or button 86. By selecting this, the AHP procedure is executed and a final ordering for the available problems is generated. At this stage, the current user is taken to the diagnosis stage where the results are displayed in the layout area 74, in the same manner as the individual diagnosis results described before.

Advantages of the Invention:

The main advantage of the present invention is the simplification of the creation of the diagnosis knowledge model. Diagnosis experts can directly enter their understanding of how problems impact circuits over a given logical representation -a circuit- familiar to them, with no involvement from reasoning engine experts. Users do not need to know anything about Bayesian networks to create the Bayesian network model that will govern the diagnosis process. This approach substantially reduces the risk of misunderstandings and speeds up the knowledge capture process. Furthermore, it allows the diagnosis experts to keep control on the underlying reasoning logic, so they can better evolve the diagnosis model together with the network evolution.

The present invention allows automating diagnosis of communication problems by using Bayesian network inference based on the input of diagnosis experts. Diagnosis can be performed even when information is missing or incorrect. In case a reliable conclusion cannot be achieved, the invention still provides guidance, in terms of problem probabilities, to human experts on which problems to focus first, so that diagnosis can finish sooner.

A third advantage of the invention is that the reasoning model for automated diagnosis and the logical representation of the scenario for manual review are created in the same process and kept aligned. This ensures that the information and representation is kept consistent when diagnosis can be automated and when human intervention is required.

The invention achieves a simplification of the interface, presenting minimal but relevant presentation of the underlying observation data over a circuit schematic. The selection of which data to show is based on the same knowledge capture process required to build the reasoning model, so it involves no extra work.

Finally, the invention supports the integration of knowledge from several experts, who can have varying degrees of expertise, and may reach different conclusions when applying their knowledge to a diagnosis case. Moreover, the integration may have a larger degree of control by allowing both self-evaluation and peer evaluation of the users' expertise. In this way, users may modify the influence of each diagnosis model according to their degree of expertise.

ACRONYMS

AHP Analytical Hierarchy Process

BN Bayesian Network

UI User Interface

REFERENCES

[1] C. Forgy, "Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem", Artificial Intelligence, 19, 1 (1982) 17-37.

[2] J. P. Martin-Flatin, G. Jakobson and L. Lewis, "Event Correlation in Integrated Management: Lessons Learned and Outlook", Journal of Network and Systems Management, 15, 4 (2007) 481-502.

[3] D. E. Smith, "Controlling backward inference", Artificial Intelligence, 39, 2 (1989) 145-208.

[4] E. Sirin et al, "Pellet: A practical OWL-DL reasoner", Web Semantics: Science, Services and Agents on the World Wide Web, 5, 2 (2007) 51-53.

[5] R. Isermann, "On fuzzy logic applications for automatic control, supervision, and fault diagnosis", IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 28, 2 (1998) 221-235.

[6] K. W. Przytula, D. Thompson, "Development of Bayesian Diagnostic Models Using Troubleshooting Flow Diagrams", Proceedings of SPIE, Component and Systems Diagnostics, Prognosis, and Health Management, 4389 (2001) 110-120.

[7] F. J. Diez, "Parameter adjustment in Bayes networks. The generalized noisy OR-gate", Proceedings of the 9th conference on uncertainty in Artificial Intelligence (1993) 99-105.

[8] T. L. Saaty, "How to make a decision: The Analytical Hierarchy Process", European Journal of Operational Research, 48 (1990) 9-26.

Claims

1.- A method for automatic diagnosis of operative problems of devices and systems of a telecommunication network, comprising creating and using a Bayesian network for performing said diagnosis of operative problems in said telecommunication network according to probabilities values, where said probabilities values are provided or influenced by expertise complementary information regarding said operating problems and/or telecommunication network behavior, wherein the method is characterized in that it comprises providing a logical representation of said telecommunication network under diagnosis and associated problems relevant for the telecommunication network and/or at least one diagnosis scenario, and using said logical representation for inputting said expertise complementary information.

2.- The method of claim 1, characterized in that it comprises, in order to provide or influence said probabilities values, building and providing to the Bayesian network at least one expert model including at least said associated relevant problems.

3.- The method of claim 2, characterized in that it comprises building said at least one expert model by also including observations regarding operative values of the elements of said logical representation, said operative values including certain values when no problem is present for said elements and other different values when there is a problem present for said elements.

4.- The method of claim 3, wherein said elements are at least nodes and segments linking nodes.

5.- The method of claim 3, further comprising obtaining said operative values by monitoring status information of the element or empirically, by testing the operation of the element.

6.- The method of claim 3, characterized in that it comprises building said at least one expert model by also including association between said problems and said observations regarding the impact the former have in the latter.

7.- The method of any of claims 2 to 6, comprising using said at least one expert model for building said Bayesian network.

8.- The method of claim 7, comprising building a plurality of said expert models, independently from each other, and using said plurality of expert models for building a corresponding plurality of Bayesian networks.

9.- The method of claim 8, comprising combining the probabilities given by each of said plurality of Bayesian networks for a problem, to reach a decision for said automatic diagnosis.

10.- The method of claim 9, comprising using an AHP algorithm for performing said combination.

11.- The method of claim 8, comprising providing a plurality of said logical representation of said telecommunication network under diagnosis and using each of them for building one of said expert models.

12.- The method of any of the previous claims, characterized in that it comprises providing graphical information to a user including said logical representation, and at least part of the automatic diagnosis and/or problems and/or observations associated to the elements forming the logical representation.

13.- The method of claim 12, wherein said logical representation of said telecommunication network comprises a plurality of nodes, cards, ports, links, network segments, backup or a combination thereof.

14.- A computer program comprising code computer program means adapted to perform the stages of the method according to any of the claims 1 to 13, including the building of the expert model or models, the building of the Bayesian network or networks and their operation, when said program runs on a computer, a digital signal processor, an application of an specific integrated microprocessor, microcontroller or any other form of programmable hardware.