WO1999038077A1 - Method for improving system availability after the failure of processors of a processor platform - Google Patents
- Publication number
- WO1999038077A1 (PCT/DE1999/000125)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- processor
- processors
- chain
- task
- failure
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1405—Saving, restoring, recovering or retrying at machine instruction level
- G06F11/1407—Checkpointing the instruction stream
Definitions
- The invention relates to a method according to the preamble of claim 1.
- Modern communication systems have a plurality of processors that work together on particular tasks or subtasks. Such a plurality of processors is also referred to as a processor platform.
- The platform is defined administratively before the communication system is commissioned.
- One of the processors of the processor platform accepts the task to be processed, together with the data required for it, and carries out a first processing step.
- Depending on the result, a further processor is then addressed, and the result of the first processing step is fed to it.
- That processor then carries out further processing and, if necessary, transfers its result to another processor.
- The processing steps of a subsequent processor thus depend directly on the result of its predecessor. This forms a logical chain, in which several processors of the processor platform are usually integrated. These processors form a subset of all processors on the processor platform.
- To handle processor failures, monitoring programs or audits are started at cyclic intervals; they examine the processors of a processor platform for hardware and software errors.
- These monitoring and checking processes are carried out during low-traffic periods.
- The underlying time interval can therefore be long under certain circumstances. For the duration of this interval, a fault remains unnoticed.
- The object of the invention is to show how the failure of one or more processors of a processor platform can be handled efficiently in order to increase the dynamics of the system.
- The advantage of the invention lies in particular in the formation of a further logical chain of processors, which is superimposed on the first logical chain.
- Significant data from a processor arranged in this chain are transferred to the processor downstream in this chain. This happens regardless of which processor of the first logical chain receives the result of the processing.
- This has the advantage that a failed processor, when it restarts, can reload this significant data directly from the processor that follows it in this chain and thus has an image of the data as it was before the failure.
- FIG. 1 shows a processor platform with a total of 30 processors.
- Processors P1 ... P30 of a processor platform are shown by way of example.
- All processors are duplicated so that operation can switch to the redundantly arranged processor in the event of a processor failure, and they are interlinked via connecting lines.
- The processors P1, P10, P15, P28 are now intended to process an upcoming task and thus form a first logical chain in the processor platform in question.
- The task at hand is to establish a connection.
- According to the invention, the processors P1 ... P30 are now arranged in a second logical chain. In the present exemplary embodiment, the beginning of this chain is formed by the processor P1. It is followed, as a further link in this chain, by the processor P2, and so on. The end of the chain is formed by the processor P30.
- The processor platform is thus to be given the task of establishing a connection.
- This task and the data required for it are fed to one of the processors of the first logical chain.
- In this example, this is the processor P1.
- The task is broken down into subtasks, each of which runs on one of the processors P10, P15, P28 involved in the processing.
- The processing carried out by a subsequent processor in the chain depends on the preprocessing done by the processors before it.
- The first subtask is now processed in processor P1. Depending on the result of this processing, the data defining the result are then fed to processor P10, which carries out further processing before the data are fed to processors P15 and P28 and leave the chain again.
- The significant data are data that give a representative image of the physical and logical state of the processor P1. They also describe the current state of the task that is currently being processed in the processor P1.
- The processors that follow in the second logical chain are supplied with significant data from the upstream processor.
- Significant data of the processor P10 are thus stored in the processor that follows it in the second logical chain, significant data of the processor P22 are stored in the processor P23, etc.
- The significant data can be supplied at the same time as the result is transmitted to the next processor in the first logical chain.
- This procedure is not mandatory.
- A cyclic time interval between the processing operations is also conceivable here.
- The significant data are deleted in the successor processor once the task has been processed.
- Suppose that one of the processors fails together with its redundantly arranged counterpart, for example the processor P15. In this case, the data that have just been processed are lost and cannot be made available to the processor P28 for further processing.
- The processor P15 is restarted immediately after the failure. For this purpose, the significant data that were fed to processor P16 are written back to processor P15. The state of knowledge from before the failure is then available again in processor P15, and processing of the task can continue. The result obtained is then fed to the processor P28. This closes the gap in the first logical chain caused by the failure.
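The interplay of the two logical chains described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `Processor`, `Platform`, `process`, `fail`, `recover`, `finish` and `run_demo` are hypothetical, the "significant data" is reduced to a task identifier plus the intermediate result, the copy to the successor is made immediately after each processing step (the description also allows a cyclic interval), and the last processor of the second chain (P30) is assumed to have no successor.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Processor:
    """One (duplicated) processor of the platform; a hypothetical model."""
    pid: int
    state: Optional[dict] = None  # own working state for the current task
    # Significant data held on behalf of the predecessor in the second chain.
    backups: Dict[int, dict] = field(default_factory=dict)


class Platform:
    def __init__(self, n: int = 30) -> None:
        self.n = n
        self.procs = {i: Processor(i) for i in range(1, n + 1)}

    def successor(self, pid: int) -> Optional[int]:
        """Next processor in the second logical chain P1 -> P2 -> ... -> Pn."""
        return pid + 1 if pid < self.n else None  # assumption: Pn has none

    def process(self, pid: int, task_id: str, data: List[int]) -> List[int]:
        """Run one subtask on `pid` and copy its significant data onward."""
        proc = self.procs[pid]
        result = data + [pid]  # stand-in for the real subtask processing
        proc.state = {"task": task_id, "result": result}
        succ = self.successor(pid)
        if succ is not None:
            # Copy goes to the second-chain successor, regardless of which
            # processor receives the result in the first chain.
            self.procs[succ].backups[pid] = dict(proc.state)
        return result

    def fail(self, pid: int) -> None:
        """Processor and its redundant twin both fail: working state is lost."""
        self.procs[pid].state = None

    def recover(self, pid: int) -> dict:
        """On restart, reload the significant data from the successor."""
        succ = self.successor(pid)
        saved = self.procs[succ].backups.get(pid) if succ is not None else None
        if saved is None:
            raise RuntimeError(f"no significant data stored for P{pid}")
        self.procs[pid].state = dict(saved)
        return saved

    def finish(self, task_id: str) -> None:
        """Delete the significant data once the task has been processed."""
        for proc in self.procs.values():
            proc.backups = {
                k: v for k, v in proc.backups.items() if v["task"] != task_id
            }


def run_demo() -> List[int]:
    platform = Platform(30)
    first_chain = [1, 10, 15, 28]  # first logical chain from the example
    data: List[int] = []
    for pid in first_chain[:3]:  # P1, P10 and P15 process their subtasks
        data = platform.process(pid, "connect-1", data)
    platform.fail(15)  # P15 fails before P28 receives the result
    restored = platform.recover(15)  # significant data come back from P16
    data = platform.process(28, "connect-1", restored["result"])
    platform.finish("connect-1")
    return data
```

In this sketch, `run_demo()` walks the example from the description: P15 fails after processing, its state is restored from P16, and P28 then completes the task, so the gap in the first chain is closed without repeating the work of P1 and P10.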
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/600,715 US6625752B1 (en) | 1998-01-20 | 1999-01-19 | Method for improving system availability following the failure of the processors of a processor platform |
CA002319214A CA2319214A1 (en) | 1998-01-20 | 1999-01-19 | Method for improving system availability after the failure of processors in a processor platform |
DE59905317T DE59905317D1 (de) | 1998-01-20 | 1999-01-19 | Verfahren zur verbesserung der systemverfügbarkeit nach dem ausfall von prozessoren einer prozessorplattform |
EP99932437A EP1049978B1 (de) | 1998-01-20 | 1999-01-19 | Verfahren zur verbesserung der systemverfügbarkeit nach dem ausfall von prozessoren einer prozessorplattform |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19801992A DE19801992C2 (de) | 1998-01-20 | 1998-01-20 | Verfahren zur Verbesserung der Systemverfügbarkeit nach dem Ausfall von Prozessoren einer Prozessorplattform |
DE19801992.0 | 1998-01-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1999038077A1 (de) | 1999-07-29 |
Family
ID=7855150
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/DE1999/000125 WO1999038077A1 (de) | 1998-01-20 | 1999-01-19 | Verfahren zur verbesserung der systemverfügbarkeit nach dem ausfall von prozessoren einer prozessorplattform |
Country Status (6)
Country | Link |
---|---|
US (1) | US6625752B1 (de) |
EP (1) | EP1049978B1 (de) |
CA (1) | CA2319214A1 (de) |
DE (2) | DE19801992C2 (de) |
ES (1) | ES2198925T3 (de) |
WO (1) | WO1999038077A1 (de) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19801992C2 (de) | 1998-01-20 | 2000-07-06 | Siemens Ag | Verfahren zur Verbesserung der Systemverfügbarkeit nach dem Ausfall von Prozessoren einer Prozessorplattform |
US6999994B1 (en) | 1999-07-01 | 2006-02-14 | International Business Machines Corporation | Hardware device for processing the tasks of an algorithm in parallel |
JP5948933B2 (ja) * | 2012-02-17 | 2016-07-06 | 日本電気株式会社 | ジョブ継続管理装置、ジョブ継続管理方法、及び、ジョブ継続管理プログラム |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4521847A (en) * | 1982-09-21 | 1985-06-04 | Xerox Corporation | Control system job recovery after a malfunction |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5271013A (en) * | 1990-05-09 | 1993-12-14 | Unisys Corporation | Fault tolerant computer system |
US5214652A (en) * | 1991-03-26 | 1993-05-25 | International Business Machines Corporation | Alternate processor continuation of task of failed processor |
US5815651A (en) * | 1991-10-17 | 1998-09-29 | Digital Equipment Corporation | Method and apparatus for CPU failure recovery in symmetric multi-processing systems |
US5513354A (en) * | 1992-12-18 | 1996-04-30 | International Business Machines Corporation | Fault tolerant load management system and method |
JP2846837B2 (ja) * | 1994-05-11 | 1999-01-13 | インターナショナル・ビジネス・マシーンズ・コーポレイション | 障害を早期検出するためのソフトウェア制御方式のデータ処理方法 |
JPH0887341A (ja) * | 1994-09-16 | 1996-04-02 | Fujitsu Ltd | 自動縮退立ち上げ機能を有したコンピュータシステム |
US5649088A (en) * | 1994-12-27 | 1997-07-15 | Lucent Technologies Inc. | System and method for recording sufficient data from parallel execution stages in a central processing unit for complete fault recovery |
JP3196004B2 (ja) * | 1995-03-23 | 2001-08-06 | 株式会社日立製作所 | 障害回復処理方法 |
JP3247043B2 (ja) * | 1996-01-12 | 2002-01-15 | 株式会社日立製作所 | 内部信号で障害検出を行う情報処理システムおよび論理lsi |
US5758051A (en) * | 1996-07-30 | 1998-05-26 | International Business Machines Corporation | Method and apparatus for reordering memory operations in a processor |
DE19801992C2 (de) | 1998-01-20 | 2000-07-06 | Siemens Ag | Verfahren zur Verbesserung der Systemverfügbarkeit nach dem Ausfall von Prozessoren einer Prozessorplattform |
1998
- 1998-01-20 DE DE19801992A patent/DE19801992C2/de not_active Expired - Lifetime
1999
- 1999-01-19 CA CA002319214A patent/CA2319214A1/en not_active Abandoned
- 1999-01-19 WO PCT/DE1999/000125 patent/WO1999038077A1/de active IP Right Grant
- 1999-01-19 DE DE59905317T patent/DE59905317D1/de not_active Expired - Fee Related
- 1999-01-19 US US09/600,715 patent/US6625752B1/en not_active Expired - Lifetime
- 1999-01-19 ES ES99932437T patent/ES2198925T3/es not_active Expired - Lifetime
- 1999-01-19 EP EP99932437A patent/EP1049978B1/de not_active Expired - Lifetime
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4521847A (en) * | 1982-09-21 | 1985-06-04 | Xerox Corporation | Control system job recovery after a malfunction |
Non-Patent Citations (2)
Title |
---|
CUYVERS R ET AL: "A MODULAR MULTIPROCESSOR KERNEL FOR AUTOMATIC NON-STOP PROCESSING", INTERNATIONAL JOURNAL OF MINI AND MICROCOMPUTERS, vol. 14, no. 1, 1 January 1992 (1992-01-01), pages 9 - 15, XP000281818 * |
KRISHNA KUMAR R ET AL: "A FAULT-TOLERANT MULTI-TRANSPUTER ARCHITECTURE", MICROPROCESSORS AND MICROSYSTEMS, vol. 17, no. 2, 1 January 1993 (1993-01-01), pages 75 - 81, XP000355542 * |
Also Published As
Publication number | Publication date |
---|---|
ES2198925T3 (es) | 2004-02-01 |
EP1049978B1 (de) | 2003-05-02 |
CA2319214A1 (en) | 1999-07-29 |
DE59905317D1 (de) | 2003-06-05 |
DE19801992A1 (de) | 1999-08-05 |
EP1049978A1 (de) | 2000-11-08 |
US6625752B1 (en) | 2003-09-23 |
DE19801992C2 (de) | 2000-07-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE19509150C2 (de) | Verfahren zum Steuern und Regeln von Fahrzeug-Bremsanlagen sowie Fahrzeug-Bremsanlage | |
DE1524239B2 (de) | Schaltungsanordnung zur aufrechterhaltung eines fehler freien betriebes bei einer rechenanlage mit mindestens zwei parallel arbeitenden rechengeraeten | |
WO1985002475A1 (en) | Process for monitoring electronic computing elements, particularly microprocessors | |
EP1810139B1 (de) | Verfahren, betriebssystem und rechengerät zum abarbeiten eines computerprogramms | |
EP1358554B1 (de) | Automatische inbetriebnahme eines clustersystems nach einem heilbaren fehler | |
EP2732347B1 (de) | Verfahren und system zur dynamischen verteilung von programmfunktionen in verteilten steuerungssystemen | |
DE102015222321A1 (de) | Verfahren zum Betrieb eines Mehrkernprozessors | |
EP1049978B1 (de) | Verfahren zur verbesserung der systemverfügbarkeit nach dem ausfall von prozessoren einer prozessorplattform | |
EP1812853B1 (de) | Verfahren, betriebssystem und rechengerät zum abarbeiten eines computerprogramms | |
EP1526420B1 (de) | Synchronisationsverfahren für ein hochverfügbares Automatisierungssystem | |
DE1966991C3 (de) | Ausfallgesicherte Datenverarbeitungsanlage | |
EP4232905A1 (de) | Datenverarbeitungsnetzwerk zur datenverarbeitung | |
EP0961973B1 (de) | Redundant aufgebautes elektronisches geraet mit zertifizierten und nicht zertifizierten kanaelen und verfahren dafür | |
EP3948449B1 (de) | Verfahren und engineering-system zur änderung eines programms einer industriellen automatisierungskomponente | |
DE102017212560A1 (de) | Verfahren zum ausfallsicheren Durchführen einer sicherheitsgerichteten Funktion | |
EP1774417B1 (de) | Verfahren und vorrichtung zum überwachen des ablaufs eines steuerprogramms auf einem rechengerät | |
EP1420341A1 (de) | Verfahren zur Steuerung eines Automatisierungssystems | |
DE102004019371B4 (de) | Verfahren zur Wiederherstellung eines Betriebszustands eines Systems | |
DE4415761A1 (de) | Verfahren zum Behandeln vorübergehend ungleicher Eingangsdaten oder Zwischenergebnisse in redundanten Steuersystemen | |
WO2023066625A1 (de) | Datenverarbeitungsnetzwerk zur datenverarbeitung | |
DE102018214980A1 (de) | Rechnersystem und Betriebsverfahren dafür mit verbesserter Zuverlässigkeit | |
EP1426862A2 (de) | Synchronisation der Datenverarbeitung in redundanten Datenverarbeitungseinheiten eines Datenverarbeitungssystems | |
DE4117693A1 (de) | Fuer ein fehlertolerantes rechnersystem bestimmte funktionseinheit und verbindungsstruktur sowie verfahren zum betrieb eines solchen rechnersystems | |
EP1675005A2 (de) | Verfahren zur Sicherstellung der Verfügbarkeit von Daten auf lokalen Massenspeichern | |
DD247983A1 (de) | Verfahren zur interruptverarbeitung bei asynchron arbeitenden m vom n-rechnermoduln |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
|
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 1999932437 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2319214 Country of ref document: CA Ref country code: CA Ref document number: 2319214 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 09600715 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 1999932437 Country of ref document: EP |
|
WWG | Wipo information: grant in national office |
Ref document number: 1999932437 Country of ref document: EP |