Posts Tagged ‘volume’

Chkdsk won’t run upon reboot

December 1, 2011

I ran into a problem where a Windows server wouldn’t run chkdsk on a logical volume even though the volume was corrupted. Some research revealed that there is a registry value, HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\BootExecute, whose default is:

autocheck autochk *

Each time Windows restarts, it runs the Autochk tool, which checks whether any volumes are dirty (that is, potentially corrupted). This is tracked with a per-volume dirty bit. You can query the dirty bit, or set it yourself, with:

fsutil dirty query d:

fsutil dirty set d:

The default value of BootExecute means that every volume is checked for the dirty bit on reboot. When Autochk finds a dirty volume, it runs Chkdsk on it. But it turned out that this doesn’t work on our OS instance, due to a bug or some misconfiguration.

There is a tool called Chkntfs that can schedule a check. But it is of little use here, since it has no effect if the dirty bit isn’t set. You can set the bit yourself using Fsutil, but once again the dirty bit isn’t working for us for some reason. Moreover, setting the dirty bit with Fsutil is enough to force a volume check even without Chkntfs. In fact, Chkntfs is designed for a different purpose: it lets you exclude some volumes from being checked by Chkdsk at all.

One option for us is to run

chkdsk /f /r d:

from the command line. But in some cases that can start an online check, which we would like to avoid. The other option is to adjust the registry. So our solution is to change BootExecute to:

autocheck autochk /P /r \??\D:

Here /P tells Autochk to ignore the dirty bit and force a check of the logical volume.
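
As a rough illustration, the BootExecute string above can be assembled for any drive letter. This Python sketch only builds the string and never touches the registry; the function name boot_execute_entry is made up for this example, and the flags are simply the ones used in this post.

```python
# Default BootExecute value quoted in the post.
DEFAULT_BOOT_EXECUTE = "autocheck autochk *"


def boot_execute_entry(drive_letter: str) -> str:
    """Build the forced-check BootExecute line for one volume (string only)."""
    letter = drive_letter.strip().rstrip(":").upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError(f"expected a single drive letter, got {drive_letter!r}")
    # /P makes Autochk ignore the dirty bit; /r is carried over from the post.
    # \??\D: is the volume path form used in the post.
    return f"autocheck autochk /P /r \\??\\{letter}:"


print(boot_execute_entry("d:"))
```

Keeping the default string next to the builder makes it easy to put the original value back once the forced check has run.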

NetApp storage architecture

October 9, 2011

All of us are used to SATA disk drives connected to our workstations, and we call that storage. Some organizations have RAID arrays. RAID is one level of logical abstraction that combines several hard drives into a logical drive with greater size, reliability, or speed. What would you say if I told you that NetApp has the following terms in its storage architecture: disk, RAID group, plex, aggregate, volume, qtree, LUN, directory, file? Let’s try to understand how all of this works together.

In NetApp terminology, a RAID array is called a RAID group. Unlike ordinary storage systems, NetApp works mostly with RAID 4 and RAID-DP: RAID 4 has one dedicated parity disk and RAID-DP has two. Don’t assume that this leads to performance degradation. NetApp has a very efficient implementation of these RAID levels.
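
To make the parity overhead concrete, here is a small Python sketch of the arithmetic. The 16-disk group size is a hypothetical figure chosen for the example, and data_disks is an illustrative helper, not a NetApp tool.

```python
# Parity disks per RAID group, as described above.
PARITY_DISKS = {"raid4": 1, "raid_dp": 2}


def data_disks(raid_type: str, disks_in_group: int) -> int:
    """Return how many disks in one RAID group hold data rather than parity."""
    parity = PARITY_DISKS[raid_type]
    if disks_in_group <= parity:
        raise ValueError("a RAID group needs at least one data disk")
    return disks_in_group - parity


# Hypothetical 16-disk group: compare the two RAID levels.
print(data_disks("raid4", 16), data_disks("raid_dp", 16))
```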

A plex is a collection of RAID groups and is used for RAID-level mirroring. For instance, if you have two disk shelves and a SyncMirror license, you can create plex0 from the drives in the first shelf and plex1 from the drives in the second. This protects you against the failure of one disk shelf.

An aggregate is simply the highest level of hardware abstraction in NetApp and is used to manage plexes, RAID groups, etc.

A volume is a logical file system. It’s a well-known term in the Windows/Linux/Unix world and serves the same purpose here. A volume may contain files, directories, qtrees and LUNs. It’s the highest level of abstraction from the logical point of view. Data in a volume can be accessed through any of the protocols NetApp supports: NFS, CIFS, iSCSI, FCP, WebDAV, HTTP.

A qtree can contain files, directories and even LUNs, and is used to apply security and quota rules to the objects it contains, with user/group granularity.

A LUN is needed to access data via block-level protocols such as FCP and iSCSI. Files and directories are used with the file-level protocols NFS, CIFS, WebDAV and HTTP.
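
The way these terms nest can be sketched as a few Python dataclasses. This is only a toy model for illustration: all class, disk and object names are invented, placing volumes inside an aggregate and the /vol/... path layout are assumptions not spelled out above, and nothing here talks to a real filer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RaidGroup:
    raid_type: str                      # "raid4" or "raid_dp"
    disks: List[str]


@dataclass
class Plex:
    name: str
    raid_groups: List[RaidGroup] = field(default_factory=list)


@dataclass
class Qtree:
    name: str
    luns: List[str] = field(default_factory=list)


@dataclass
class Volume:
    name: str
    qtrees: List[Qtree] = field(default_factory=list)
    luns: List[str] = field(default_factory=list)


@dataclass
class Aggregate:
    name: str
    plexes: List[Plex] = field(default_factory=list)
    volumes: List[Volume] = field(default_factory=list)

    def disk_count(self) -> int:
        """Count disks across every RAID group in every plex."""
        return sum(len(rg.disks) for p in self.plexes for rg in p.raid_groups)


def lun_paths(volume: Volume) -> List[str]:
    """List LUN paths in a volume, assuming a /vol/<volume>/[<qtree>/]<lun> layout."""
    paths = [f"/vol/{volume.name}/{lun}" for lun in volume.luns]
    for qtree in volume.qtrees:
        paths += [f"/vol/{volume.name}/{qtree.name}/{lun}" for lun in qtree.luns]
    return paths


# Two mirrored plexes, one RAID-DP group each (hypothetical disk names).
plex0 = Plex("plex0", [RaidGroup("raid_dp", ["d1", "d2", "d3", "d4"])])
plex1 = Plex("plex1", [RaidGroup("raid_dp", ["d5", "d6", "d7", "d8"])])
vol1 = Volume("vol1", qtrees=[Qtree("projects", luns=["lun0"])], luns=["lun1"])
aggr1 = Aggregate("aggr1", plexes=[plex0, plex1], volumes=[vol1])

print(aggr1.disk_count())
print(lun_paths(vol1))
```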

Mad IT workdays

February 10, 2010

Today I needed to get Symantec Storage Exec to work with a NetApp filer. This software lets you enforce file blocking and allocation policies on the filer’s volumes.

I spent the whole day resolving numerous problems while integrating them:

  1. When I tried to install Symantec, the installer said that it had been interrupted and the installation had to be rolled back. I couldn’t find ANY information about this issue anywhere on the Internet. I found posts about similar problems with other Symantec products, but they didn’t help. Then I somehow found the installation log and searched for lines from it. Finally I ran into this solution: http://seer.entsupport.symantec.com/docs/284901.htm. So the Symantec uninstaller for some damn reason left keys in the registry, and the product couldn’t be installed a second time because of its own fault.
  2. Then I got “Can’t connect to host (err=10061)” after adding the filer to the list of managed appliances. This link (http://seer.entsupport.symantec.com/docs/326973.htm) says I need to enable HTTP access. What? We don’t even have an HTTP license. After half an hour of playing with the filer’s configuration options I found out that it’s not access to the httpd server, it’s access to the magic filer administration area, which is governed by the httpd.admin.enable option (don’t forget to also add the Storage Exec server’s IP to httpd.admin.access).
  3. The next error was “HTTP POST authorization failed” in Storage Exec and “HTTP XML Authentication failed” on the filer side. It turned out that I also needed a user on the filer with the same user name and password as the account Storage Exec runs under. This user should be in the filer’s Administrators group.
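
For step 2, the two filer options can be written down as console commands. The Python sketch below just generates the command text. It assumes the filer console takes `options <name> <value>` and that httpd.admin.access accepts a `host=<ip>` value, so check both against your Data ONTAP version. The function name and the 192.0.2.10 address are placeholders.

```python
import ipaddress
from typing import List


def admin_access_commands(storage_exec_ip: str) -> List[str]:
    """Return the filer console commands for step 2 (assumed syntax, see above)."""
    # Raises ValueError if the string is not a valid IP address.
    ip = ipaddress.ip_address(storage_exec_ip)
    return [
        "options httpd.admin.enable on",
        f"options httpd.admin.access host={ip}",
    ]


# 192.0.2.10 is a documentation-only address standing in for the Storage Exec server.
for command in admin_access_commands("192.0.2.10"):
    print(command)
```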

Symantec’s documentation doesn’t say a word about any of this. It doesn’t mention access to the filer’s administrative area or the necessary user names. You have to find it all out by yourself. I think Symantec’s docs leave far too much to be desired, and that’s the mildest way to put it. There is also little information about Storage Exec on the Internet. It seems that not many people are actually using it.