
Fix NetApp AutoSupport

November 20, 2015

I come across this issue too often: you need to fetch some information for a customer from the My AutoSupport website and can’t, because the last AutoSupport message was sent half a year ago.

Check AutoSupport State

When you list the AutoSupport history on the target system you see something similar to this:

# autosupport history show

[Screenshot: output of autosupport history show]

Mail Server Configuration

If AutoSupport is configured to use SMTP, as in this case, then the first place to check is the mail server. The most common cause of the issue is a blocked relay.

There are two things you need to make sure of: the management IPs of the NetApp controllers are whitelisted on the mail server, and the mail server does not require them to authenticate.

To set this up on an Exchange server, go to Exchange Management Console > Server Configuration > Hub Transport, select a Receive Connector (or create one if you don’t already have one for whitelisting), open its properties and add the NetApp IPs on the Network tab.

[Screenshot: Receive Connector properties, Network tab]

Then make sure to enable the Externally Secured authentication type on the Authentication tab.
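Before touching the filer, it can help to check whether the connector accepts unauthenticated relay at all. The following is a minimal sketch, not part of the original procedure: it uses Python's standard smtplib to walk through MAIL FROM and RCPT TO without logging in and without sending any message body. The helper name relay_accepted, the host name and the addresses are placeholders I made up. Note that the result only reflects the host the script runs on, so it says something about the NetApp controllers only if that host's IP is whitelisted in the same way.

```python
import smtplib


def relay_accepted(smtp, sender: str, recipient: str) -> bool:
    """Return True if the server accepts MAIL FROM and RCPT TO without authentication.

    `smtp` is an already connected smtplib.SMTP object. No DATA is sent,
    so no message is actually delivered.
    """
    smtp.ehlo_or_helo_if_needed()
    code, _ = smtp.mail(sender)
    if code != 250:
        return False
    code, _ = smtp.rcpt(recipient)
    smtp.rset()  # abort the transaction instead of sending a message
    return code in (250, 251)


# Example usage (hypothetical host and addresses, adjust to your environment):
#
# with smtplib.SMTP("mail.example.com", 25, timeout=10) as smtp:
#     print(relay_accepted(smtp, "filer01@example.com", "someone@example.net"))
```

A rejected relay usually shows up as a 5xx reply to RCPT TO, in which case the function returns False.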

[Screenshot: Receive Connector properties, Authentication tab]

Confirm AutoSupport is Working

To confirm that the issue is fixed, send an AutoSupport message either from OnCommand System Manager or directly from the console, and make sure that the status shows “sent-successful”.

# options autosupport.doit Test

# autosupport history show

[Screenshot: output of autosupport history show]

 


Replacing hard drives in a NetApp aggregate

May 30, 2013

NetApp uses certain rules to assign hot spares in case of a failure. It always tries to use an exact match, but if none is available, the best available spare is used. “The best” means that if you have an aggregate which consists of 1TB hard drives and you have only a 2TB spare left, then this 2TB spare will be downsized to 1TB and used as a data disk. After that, when you receive a correctly sized replacement from NetApp, you need to exchange the downsized 2TB hard drive for the delivered 1TB spare. To accomplish that, use the following command:

> disk replace start disk_name spare_disk_name
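As an aside, the spare-selection rule described above can be illustrated with a few lines of Python. This is only a toy model under my own assumption that "best available" means the smallest spare that is still large enough; the actual Data ONTAP logic takes more factors into account, and the function name pick_spare is made up for this example.

```python
from typing import List, Optional


def pick_spare(needed_gb: int, spares_gb: List[int]) -> Optional[int]:
    """Return the size of the spare that would be used, or None if nothing fits.

    Assumption (not confirmed by NetApp documentation): among spares that
    are large enough, the smallest one is preferred.
    """
    candidates = [size for size in spares_gb if size >= needed_gb]
    # An exact match is the smallest possible candidate, so min() prefers it.
    return min(candidates) if candidates else None


print(pick_spare(1000, [1000, 2000]))  # exact match available
print(pick_spare(1000, [2000]))        # only a larger spare, it would be downsized
print(pick_spare(1000, [500]))         # nothing large enough
```

The second call corresponds to the situation in this post: a 2TB spare standing in for a 1TB disk until the correct replacement arrives.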

Copying the data takes a considerable amount of time. In my case it was 6.5 hours for a 1TB drive.
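To put that figure in perspective, here is a back-of-the-envelope calculation. It is only a rough sketch: it assumes decimal units (1 TB = 1,000,000 MB) and that copy time scales linearly with disk size, which is my assumption. The real rate depends on system load and disk type. The function names are illustrative.

```python
def copy_rate_mb_per_s(size_tb: float, hours: float) -> float:
    """Average copy rate in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return size_tb * 1_000_000 / (hours * 3600)


def estimated_hours(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy size_tb at a constant rate (linear scaling assumed)."""
    return size_tb * 1_000_000 / rate_mb_per_s / 3600


rate = copy_rate_mb_per_s(1.0, 6.5)
print(f"{rate:.1f} MB/s")  # prints "42.7 MB/s"
print(f"{estimated_hours(2.0, rate):.1f} h for a 2 TB drive at the same rate")  # 13.0 h
```

So a busy production system should be expected to spend the better part of a working day on a single 1TB replacement.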

When the process finishes, the replaced drive becomes a new spare. It’s wise to zero it out right away, so that it can be used again as a spare without delay. Otherwise, when the time comes, you’ll be waiting hours before it can be added in place of a failed drive:

> disk zero spares

As a side note, I want to mention that you cannot take disks out of a RAID group. There is no way to shrink an aggregate. The only thing you can do is replace a hard drive with another one.