Best Practices when replacing a Disk On Module for the Intel® SSR212MA Storage System
The Disk On Module (DOM) installed in the Intel® SSR212MA Storage System contains the bootstrap, boot loader and boot files for the system and is essential to the system’s correct operation.
A DOM may become faulty during the lifetime of a storage system and need to be replaced. A replacement DOM can be obtained through the Intel® Warranty Process if the product is still under warranty (3 years from purchase date).
It is important to follow the steps below to avoid loss of data on the storage system while replacing a DOM, especially if the system is a stand-alone unit and not part of a cluster with two- or three-way replication.
During the DOM replacement, and until the new DOM is working and configured, you must not remove or replace any of the hard disk drives in the system. The drives are part of the RAID array. Removing or replacing them during a DOM replacement, while the DOM is faulty, or while the DOM configuration is being restored may result in data loss. Regular backups minimize the risk of data loss.
It is strongly recommended to back up the SSM and Management Group configuration after the system is configured for the first time and whenever changes are applied. Regular backups of the DOM configuration are also recommended, so that a current copy exists before any SSM failure occurs.
Backup of the SSM Configuration saves the SSM configuration file so that it can be restored after an SSM failure.
Backup of the Management Group Configuration preserves a record of the management group configuration information and license keys so that they can be restored after an SSM failure.
See the Intel® Storage System SSR212MA Software User Manual for your specific SAN/iQ version for details on backup and restore of the SSM and Management Group Configuration.
Replacing a Disk on Module (DOM) (Intel® Storage System SSR212MA only)
- Power down the SSM.
- Remove the DOM from the SSM. Refer to the Intel® Storage System SSR212MA User Guide for instructions on removing the DOM.
- Install the new DOM in the SSM. Refer to the Intel® Storage System SSR212MA User Guide for instructions on installing the DOM.
- Attach a serial cable to the storage system and connect to a laptop. Open a terminal emulation program to run a text interface, such as HyperTerminal* or ProCommPlus*.
Use the following settings to configure your session:
- Bits per second = 19200
- Data bits = 8
- Parity = None
- Stop bits = 1
- Flow control = None
- Backspace key sends = Del
- Emulation = ANSI
If using HyperTerminal, set the properties for the backspace key and emulation after the session is established. If you exit the session and return to the session in order to use the Configuration Interface, the screen will not open correctly.
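For readers who script their console sessions instead of using HyperTerminal, the session settings listed above can be captured in a small structure. The following is a minimal Python sketch, not part of any Intel tool: the class name `SerialSettings`, its field names and the `shorthand` helper are hypothetical and exist only for illustration. The values themselves are the ones given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerialSettings:
    """Console session settings from this article (names are illustrative)."""
    baud_rate: int = 19200          # Bits per second
    data_bits: int = 8
    parity: str = "None"
    stop_bits: int = 1
    flow_control: str = "None"
    backspace_sends: str = "Del"
    emulation: str = "ANSI"

    def shorthand(self) -> str:
        """Return the common 'baud data-bits/parity/stop-bits' summary."""
        parity_letter = self.parity[0].upper()  # "None" -> "N"
        return f"{self.baud_rate} {self.data_bits}{parity_letter}{self.stop_bits}"


settings = SerialSettings()
print(settings.shorthand())  # prints "19200 8N1"
```

Keeping the values in one place makes it easier to compare them against whatever terminal program is actually in use before powering up the SSM.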
- Power up the SSM with the replacement DOM. From the laptop, you should be able to observe two boot cycles. A boot cycle is indicated by a “Welcome to SAN IQ” message displayed on the screen. On the second boot, the cycle should end with a “DOM replacement logic: OS was restored to DOM on previous boot cycle” message. The logon screen will display, indicating a proper restoration process.
Caution: During reboot, do not execute any keyboard commands, such as <ESC> to view diagnostic messages, <F2> to enter setup, <F12> for a network boot, or <CTRL> <G> to run the RAID BIOS Console, and do not log in to the storage system.
Note: Disregard any failed statuses and failure messages during reboot. These status messages are normal and do not indicate a failure. The entire restoration process, if successful, takes about 30 minutes. Once the two boot cycles have completed, verify that the system has been restored.
- From the laptop or text interface, log in and verify that the IP address and host name of the storage system have not changed. If the storage system uses DHCP, the IP address may have changed.
- Log in to the Intel® Storage System Console, select Edit Config -> Storage -> RAID Setup, and ensure that all disks are online and in their original RAID configuration.
All volumes should be available with data restored to all volumes, and your host should be able to perform an iSCSI login.
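If the terminal program saves the console output to a file, the boot sequence described in the procedure can be checked after the fact. The sketch below is a hedged illustration only: it assumes the two messages appear in the captured log exactly as quoted in this article, which may not match the real console formatting, and the function name `check_restore` and the sample log are made up for the example.

```python
# Message strings as quoted in this article; actual console text may differ.
WELCOME = "Welcome to SAN IQ"
RESTORED = "DOM replacement logic: OS was restored to DOM on previous boot cycle"


def check_restore(console_log: str) -> dict:
    """Summarize a captured console log against the expected boot sequence."""
    boot_cycles = console_log.count(WELCOME)
    restored_message = RESTORED in console_log
    return {
        "boot_cycles": boot_cycles,
        "restored_message": restored_message,
        # The article describes exactly two boot cycles plus the message.
        "looks_restored": boot_cycles == 2 and restored_message,
    }


# Fabricated sample log for illustration, not real SSM output.
sample = "\n".join([WELCOME, "(boot messages)", WELCOME, RESTORED, "login:"])
print(check_restore(sample))
```

A result with `looks_restored` set to `False` is not proof of failure on its own; the IP address, host name and RAID checks in the procedure remain the actual verification.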
Note: In most cases, DOM replacement causes no issues. However, one of the following two conditions may occur if another hardware problem is present. In both cases, refer to the Intel® Storage System SSR212MA User Guide for instructions on removing the replacement DOM and reinstalling the original.
- The replacement DOM cannot access data on the disks. If another system fault exists (e.g., the RAID is seriously degraded or no longer configured, or the RAID controllers, midplane or server board have failed), the replacement DOM boots a single time and the system appears to be newly manufactured. Check the network settings: if they are set to factory defaults, the restoration process has failed because the DOM could not detect a coherent RAID configuration. In this case, the replacement DOM cannot be used again to attempt a system restoration. Reinstall the original DOM, because the problem is not a bad DOM.
- The replacement DOM cannot be written or verified. If the original RAID array is intact but the restoration process is unsuccessful for this reason, the system remains in a reboot cycle while attempting to recover the configuration. If the DOM has not recovered after several reboot cycles, or the process has run for more than an hour without completing, the system cannot recover the original configuration. Power down the system if it is continuously rebooting.
Do not remove or replace the original RAID disks if a DOM replacement seems to have failed. Removing the RAID drives during a DOM replacement process may result in data loss.
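The failure conditions in this note can be summarized as a small decision function. This is only a sketch of the reasoning in this article, not a diagnostic tool: the function name `diagnose` and its parameters are hypothetical, and the `max_cycles` default of 4 is an arbitrary stand-in for "several reboot cycles", which the article does not quantify.

```python
def diagnose(logon_reached: bool,
             factory_defaults: bool,
             boot_cycles: int,
             minutes_elapsed: float,
             max_cycles: int = 4) -> str:
    """Map observations after a DOM swap to the guidance in this article.

    max_cycles is an assumption; the article only says "several".
    """
    if logon_reached and factory_defaults:
        # System looks newly manufactured: no coherent RAID was detected.
        return ("restore failed (RAID not detected): reinstall the original "
                "DOM; do not reuse the replacement DOM")
    if logon_reached:
        return "restored: verify IP address, host name and RAID setup"
    if minutes_elapsed > 60 or boot_cycles > max_cycles:
        # Stuck in a reboot loop: DOM could not be written or verified.
        return ("restore failed (reboot loop): power down; "
                "do not remove or replace the RAID disks")
    return "in progress: wait and do not press any keys"


print(diagnose(logon_reached=False, factory_defaults=False,
               boot_cycles=6, minutes_elapsed=75))
```

In every failure branch the guidance about leaving the RAID disks in place still applies.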