Wednesday, October 12, 2016

An SCC Board Fails to Start After Multiple Board Replacement Operations

Product

Fault Symptom

A site at office A in country D is configured with one master subrack and three slave subracks. The master subrack is configured with active and standby SCC boards. After the standby SCC board reports the HARD_BAD alarm, the standby SCC board is replaced twice in succession. After both replacements, the NMS still shows the standby SCC board as malfunctioning.

Network Topology

None.

Cause Analysis

The possible causes are as follows:
  • The new standby SCC boards are defective.
  • The slot that houses the standby SCC board malfunctions, resulting in a startup failure on the new standby SCC boards.
  • The database of the standby SCC board is abnormal, resulting in a startup failure on the new standby SCC boards.

Procedure

  1. Replace the standby SCC board twice in succession.
    The fault persists. This indicates that the fault is not caused by the original standby SCC board.
  2. Inspect the slot for housing a standby SCC board.
    No bent pin is found in the slot. This indicates that the fault is not caused by the slot.
  3. Inspect the PROG indicator on a new standby SCC board.
    The PROG indicator blinks quickly, indicating that the standby SCC board is being repeatedly reset.
    Result: This analysis suggests that the database of the standby SCC board is abnormal, which causes startup failures and repeated resets of the standby SCC board.
    When an SCC board starts in a slave subrack, its data module is not started. Therefore, the fault may be caused by the data module on the SCC boards in the master subrack.
  4. Insert the original standby SCC board from the master subrack into a slave subrack.
    The board starts properly after 5 minutes. You can now determine that the repeated resets of the standby SCC board in the master subrack result from the abnormal data module.
  5. Obtain the package loading logs of the SCC boards in the slave subracks using the UpgradeKit tool.
    According to the logs, downgrade operations have been performed on the SCC boards.
  6. Clear the database of the SCC board as described in the Upgrade Guide, and then insert the original SCC board into the master subrack.
    The board starts properly.
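
The elimination logic in the steps above can be summarized in a short sketch. This is only an illustration of the reasoning, not a vendor tool: the function name, its parameters, and the returned strings are all hypothetical.

```python
def likely_cause(fault_persists_after_replacement, slot_pins_bent, starts_in_slave_subrack):
    """Map the observations from steps 1-4 to the most likely cause (hypothetical helper)."""
    # Step 1: if a replacement board works, the original board was defective.
    if not fault_persists_after_replacement:
        return "defective board"
    # Step 2: bent pins point to the slot rather than the board.
    if slot_pins_bent:
        return "faulty slot"
    # Step 4: the board starts in a slave subrack, where the data module is not
    # started, so the hardware is fine and the database is the suspect.
    if starts_in_slave_subrack:
        return "abnormal database"
    return "undetermined"


# Observations from this case: the fault persists, no bent pins are found,
# and the board starts in a slave subrack.
print(likely_cause(True, False, True))
```

With the observations from this case, the sketch points to the database, which matches the result of step 6.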

Conclusion and Suggestion

  1. NG WDM systems support smooth upgrades but not smooth downgrades. Databases must be cleared before a downgrade. Therefore, check the version of a spare SCC board before using it as a replacement. If the spare part has a later version than the target SCC board and is downgraded without its database being cleared first, the database of the board will fail to start.
  2. Perform the following workaround if an SCC board is incorrectly downgraded and fails to start: Insert the SCC board into a slave subrack of the NE and clear the database by referring to the Upgrade Guide.
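
The version check recommended above could be scripted along the following lines. This is a minimal sketch under stated assumptions: it assumes that versions can be written as dotted numeric strings, which may not match the actual version format of the boards, and the function names and example version numbers are hypothetical.

```python
def parse_version(text):
    """Turn a dotted numeric string such as "5.51.8" into a tuple of ints.

    Assumption: versions are purely numeric and dot-separated.
    """
    return tuple(int(part) for part in text.split("."))


def replacement_plan(spare_version, target_version):
    """Return the action to take before putting a spare SCC board into service."""
    spare = parse_version(spare_version)
    target = parse_version(target_version)
    if spare > target:
        # Smooth downgrades are not supported, so the database must be cleared.
        return "downgrade: clear the database first"
    if spare < target:
        return "upgrade: smooth upgrade is supported"
    return "same version: no version change needed"


# Hypothetical version numbers, for illustration only.
print(replacement_plan("5.52.1", "5.51.8"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "5.9" as later than "5.10".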
