Warning of Service Interruption on a Multiplex Section Caused by Simultaneous SD Clearing in OptiX NGSDH and OCS Products
[Problem Description]
Triggering conditions
SD conditions at the two ends of a span on the same MSP ring are simultaneously cleared.
Symptoms
After SD conditions are cleared, some sites on the MSP ring do not recover and APS-INDI does not end.
Services on the MSP ring are interrupted.
Identification method
When an NE meets the following conditions, the problem addressed in this document occurs on the NE:
1. The NE runs a version specified in the preceding table, and ring MSP is configured for the NE.
2. After SD conditions are cleared, some sites on the MSP ring do not recover and APS-INDI does not end. In addition, services on the MSP ring are interrupted.
3. K bytes 0x8xxx, 0x5xxx, and 0x0xxx are sent cyclically while the problem persists. For details, see the following information:
#0x90406:cfg-get-rmsevent:1;
MSSPR-EVENT-LOG
PG-ID EVENT-NO EVENT-VALUE EVENT-PARA DATE-TIME TIME-STAMP
1 570 K_SENDS 0x512a 2012-4-29 12:33:15 0x022ebdd5
1 571 K_DIR 0x0002 2012-4-29 12:33:15 0x022ebdeb
1 572 STATE_TRANS 0x2405 2012-4-29 12:33:15 0x022ebe76
1 573 K_RECEIVED 0x0010 2012-4-29 12:33:15 0x022ebeb7
1 574 K_DIR 0x0002 2012-4-29 12:33:15 0x022ebebb
1 575 XC_EXECUTE 0x0200 2012-4-29 12:33:15 0x022ec323
1 576 K_SENDS 0x0020 2012-4-29 12:33:15 0x022ec3bf
1 577 K_DIR 0x0002 2012-4-29 12:33:15 0x022ec3d6
1 578 K_SENDS 0x0120 2012-4-29 12:33:15 0x022ec440
1 579 K_DIR 0x0000 2012-4-29 12:33:15 0x022ec450
1 580 STATE_TRANS 0x2100 2012-4-29 12:33:15 0x022ec4d3
1 581 K_RECEIVED 0x821a 2012-4-29 12:33:15 0x022ef3e3
1 582 K_DIR 0x0002 2012-4-29 12:33:15 0x022ef3e9
1 583 K_SENDS 0x1121 2012-4-29 12:33:15 0x022ef71e
1 584 K_DIR 0x0000 2012-4-29 12:33:15 0x022ef723
1 585 K_SENDS 0x8129 2012-4-29 12:33:15 0x022ef7aa
1 586 K_DIR 0x0002 2012-4-29 12:33:15 0x022ef7bf
1 587 STATE_TRANS 0x0408 2012-4-29 12:33:15 0x022ef846
1 588 K_RECEIVED 0x1212 2012-4-29 12:33:15 0x022ef8f9
1 589 K_DIR 0x0000 2012-4-29 12:33:15 0x022ef8fe
1 590 T2_START 0x0000 2012-4-29 12:33:15 0x022ef977
1 591 K_SENDS 0x5122 2012-4-29 12:33:15 0x022efa09
1 592 K_DIR 0x0000 2012-4-29 12:33:15 0x022efa0d
Check whether a WDM device exists between the transmission devices and whether OLP is configured for the WDM device.
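The K byte check in condition 3 can be sketched as a small script. This is only an illustrative sketch, not a vendor tool: it assumes the event log has been saved as plain text in the column layout shown above, and the function names, the regular expression, and the `min_repeats` threshold are all hypothetical. It looks only at the leading hexadecimal digit of each K_SENDS value, which is how this document describes the pattern (0x8xxx, 0x5xxx, 0x0xxx).

```python
import re

# Matches a data row: PG-ID, EVENT-NO, EVENT-VALUE, EVENT-PARA (4 hex digits).
LOG_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]{4})\s")

def sent_k_nibbles(log_text):
    """Return the leading hex digit of every K_SENDS value, in log order."""
    nibbles = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(3) == "K_SENDS":
            nibbles.append(int(match.group(4), 16) >> 12)
    return nibbles

def looks_like_flapping(nibbles, pattern=(0x8, 0x5, 0x0), min_repeats=2):
    """Heuristic: every digit in `pattern` was sent at least `min_repeats` times.

    `min_repeats` is an assumed threshold; tune it to the length of the log.
    """
    return all(nibbles.count(digit) >= min_repeats for digit in pattern)

# A few rows copied from the log above.
SAMPLE_LOG = """\
PG-ID EVENT-NO EVENT-VALUE EVENT-PARA DATE-TIME TIME-STAMP
1 576 K_SENDS 0x0020 2012-4-29 12:33:15 0x022ec3bf
1 583 K_SENDS 0x1121 2012-4-29 12:33:15 0x022ef71e
1 584 K_DIR 0x0000 2012-4-29 12:33:15 0x022ef723
1 585 K_SENDS 0x8129 2012-4-29 12:33:15 0x022ef7aa
1 588 K_RECEIVED 0x1212 2012-4-29 12:33:15 0x022ef8f9
1 591 K_SENDS 0x5122 2012-4-29 12:33:15 0x022efa09
"""

if __name__ == "__main__":
    digits = sent_k_nibbles(SAMPLE_LOG)
    print(digits)                                      # [0, 1, 8, 5]
    print(looks_like_flapping(digits, min_repeats=1))  # True
```

A positive result only indicates that the three K byte values recur in the log. The other identification conditions still have to be checked on the NE.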
[Root Cause]
The SD condition at one end of the span is cleared shortly after the SD condition at the other end is cleared. The long path is long, so K bytes sent on it arrive later than K bytes sent on the short path. As a result, the SD clearing falls between the time a site receives the K byte sent on the short path and the time the same site receives the K byte sent on the long path. Consequently, the same NE receives two different K bytes from the same direction. The standard does not define a field in the K byte sent on the long path that indicates whether the sending end triggered the switchover or is responding to one. An NE therefore sends a response whenever the request received on the long path differs from its local request. In this scenario, the NE at each end receives two different requests on the long path and responds to them because it assumes that the opposite NE sent them, which results in status flapping.
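The timing condition described above can be written as a simple check. This is a simplified sketch under stated assumptions: it models only the ordering of three events at one site, the function and parameter names are hypothetical, and the millisecond values in the example are made up for illustration, not measured.

```python
def clearing_in_race_window(t_clear_ms, t_short_rx_ms, t_long_rx_ms):
    """Return True if the SD clearing falls inside the race window.

    t_short_rx_ms: time at which the site receives the K byte on the short path.
    t_long_rx_ms:  time at which the site receives the K byte on the long path.
    t_clear_ms:    time at which the SD condition is cleared.
    All times are on the same (assumed) clock, in milliseconds.
    """
    if t_short_rx_ms > t_long_rx_ms:
        raise ValueError("the short-path K byte is expected to arrive first")
    return t_short_rx_ms < t_clear_ms < t_long_rx_ms

if __name__ == "__main__":
    # Hypothetical delays: short path 1 ms, long path 20 ms.
    print(clearing_in_race_window(5.0, 1.0, 20.0))   # True: clearing is inside the window
    print(clearing_in_race_window(30.0, 1.0, 20.0))  # False: both K bytes already received
```

The longer the long path, the wider this window becomes, which is why the problem is tied to rings with a long path.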
[Impact and Risk]
The MSP ring is not in the normal state, and the services it carries are interrupted.
[Measures and Solutions]
Recovery measures
Restart the multiplex section protocol throughout the ring.
Workarounds
Workaround 1: Do not use SD as a condition to trigger an MSP switchover.
Advantage: When SD occurs, the problem addressed in this document does not occur.
Disadvantage: If SD occurs on a link and affects services, the services are interrupted and no protection switching is triggered.
Workaround 2: Do not use SD as a condition to trigger an MSP switchover. Reduce the SF threshold to the current SD value.
Advantage: When SD conditions at the two ends of a span are simultaneously cleared, the problem addressed in this document does not occur.
Disadvantage: If SD occurs on multiple sites on the same MSP ring, multiple SF switchovers occur. As a result, isolated areas occur, resulting in multiplex section squelching and service interruption.
Workaround 3: Set the MSP hold-off time (applicable to scenarios with OLP configured).
Advantage: If OLP triggers a switchover for the intermediate WDM device, no MSP switchover will occur. Therefore, the problem addressed in this document does not occur.
Disadvantage: If the hold-off time is set to 100 ms and no OLP switchover is triggered, an MSP switchover takes about 150 ms, which exceeds the allowed time.
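The arithmetic behind the disadvantage of Workaround 3 can be sketched as follows. The 100 ms hold-off and the total of about 150 ms come from this document; the 50 ms switching time is inferred from their difference, and the 50 ms budget is an assumption based on the commonly cited protection switching target, not a value stated here. The function names are hypothetical.

```python
def msp_switchover_time_ms(hold_off_ms, switching_ms=50.0):
    """Total time from fault detection to completed MSP switchover.

    switching_ms is the assumed time of the switchover itself (about 50 ms,
    inferred from 150 ms total minus 100 ms hold-off).
    """
    return hold_off_ms + switching_ms

def exceeds_budget(total_ms, budget_ms=50.0):
    """budget_ms is an assumed allowed switching time; adjust it to the network's requirement."""
    return total_ms > budget_ms

if __name__ == "__main__":
    total = msp_switchover_time_ms(100.0)
    print(total)                  # 150.0
    print(exceeds_budget(total))  # True
```

Under these assumptions, any non-zero hold-off time pushes the MSP switchover past the budget whenever OLP does not switch first, so the hold-off value is a trade-off rather than a free fix.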
Preventive measures
OCS:
Upgrade the NEs involved to V1R6C03SPC200+SPH206 or a later V1R6C03 version, or to V1R6C05SPC201 or a later version.
NGSDH:
For NEs running V100R008C02SPC500, install SPH505 or a later patch version.
For NEs running V100R008C02SPC200, install SPH203 or a later patch version.
Upgrade NGSDH NEs running V100R009 or V100R010 to V100R010C03SPC203 or a later version.
Upgrade NGSDH NEs running V200R011 or V200R012 to V200R012C01 or a later version.
Material handling after replacement
None
[Inspector Applicable or Not]
None
[Rectification Scope and Time Requirements]
None
[Rectification Instructions]
None
[Attachment]
None