Archive for the ‘Linux/Unix’ Category

AIX at first glance

May 19, 2012

Recently I set up AIX 5.1 on an RS/6000 box. Now, after some time working with the OS, I’d like to share my first impressions and the features that distinguish it from Linux.

FYI: Do not try to run AIX 5.1 on x86; it won’t work. It runs only on the PowerPC and POWER RISC architectures.

System Management Services

The very first thing that may surprise you when you start a PowerPC system is the absence of a BIOS. PowerPC uses SMS, which stands for System Management Services. You enter SMS by pressing F2 during server startup. SMS implements the same features as a conventional server’s BIOS, such as configuring the boot sequence and performing simple diagnostics.

AIX default shell

AIX uses KornShell (ksh) by default. Bourne shell (bsh) is also available, but do not confuse it with the Bourne-again shell (bash). Bash was released twelve years earlier (1989) than AIX 5.1 (2001), but it wasn’t included. What’s interesting about ksh is that by default it works in vi editing mode. This means that initially you are in input mode and enter commands by typing and hitting Return as usual. Press ESC to enter control mode. For example, press CTRL+V in control mode and you will see your ksh version. Mine is M-11/16/88f. If you type a backslash (\) in control mode, ksh completes the file path. A shortcoming of ksh88 is that it doesn’t support command completion.

System Management Interface Tool

The AIX operating system is configured using the System Management Interface Tool (SMIT). It’s an equivalent of YaST in SuSE, the redhat-config-* tools in Red Hat or the Windows Control Panel. SMIT is a very thorough configuration tool. For example, the page for adding a user consists of forty fields! SMIT has several handy function keys. For instance, F5 resets a field to its default value, F9 temporarily invokes a command shell, and F4 generates a list if the field implies one, such as the list of packages available to install from a particular directory. Apart from that, SMIT has odd-looking field hints: ‘-’ says that the field is numerical, ‘+’ means a list, ‘/’ is a path. Everything you do in SMIT is logged in /smit.log.

Web-based System Manager

On top of that, AIX has the Web-based System Manager (WebSM), which lets you monitor your system and manage devices, backups, processes and virtually everything else in your operating system. You can do that either from inside the operating system itself or through a standalone client, which is available for Windows and Linux. To manage your AIX host via WebSM, the Manager and Remote Client versions must match. To satisfy that, you can download the Windows version of the Web-based System Manager Remote Client right from the AIX host, using SCP or FTP, from /usr/websm/pc_client/setup.exe. The WebSM Client for AIX 5 is incompatible with Windows 7.

 

Object Data Manager

A feature unique to AIX is the Object Data Manager (ODM) database, which maintains device configuration. ODM consists of the Predefined Configuration Database (PCD) and the Customized Configuration Database (CCD). The Predefined Configuration Database keeps information on supported devices, meaning devices for which AIX has drivers, and the Customized Configuration Database holds information on devices that are currently connected to the system. Data in ODM is stored as objects and their attributes. Access to ODM is implemented via a special API. A user can manage ODM by calling the odmshow, odmadd, odmchange and odmdelete utilities.

Additionally, AIX uses location codes to identify devices. A location code is effectively a path from the motherboard to a device. For example, the location code of a SCSI device has the form AB-CD-EF-G,H. Here AB is the bus type, CD is the slot or adapter number, EF is the connector ID, G is the Control Unit Address of the SCSI device and H is its Logical Unit Address. I have two SCSI hard drives, hdisk0 and hdisk1. For hdisk0 the location code is 04-C0-00-5,0. Here 04 means PCI bus (00 is the CPU bus, 01 the ISA bus, 05 the PCMCIA bus), C0 is the integrated SCSI controller (A0 is the ISA bus, B0 a secondary PCI bus), 00 is the SCSI bus number, 5 is the SCSI ID and 0 is the LUN.
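
To make the AB-CD-EF-G,H layout concrete, here is a minimal sketch that splits a SCSI location code into the fields described above. parse_location_code is a made-up helper name, not an AIX utility, and the field labels in its output are my own shorthand.

```shell
# Split an AIX SCSI location code (AB-CD-EF-G,H) into named fields.
# parse_location_code is a hypothetical helper, not part of AIX.
# The body runs in a subshell so the IFS change does not leak out.
parse_location_code() (
    code=$1
    path=${code%,*}     # everything before the comma: AB-CD-EF-G
    lun=${code#*,}      # everything after the comma: H
    IFS=-
    set -- $path        # split the dash-separated part into $1..$4
    echo "bus=$1 adapter=$2 connector=$3 scsi_id=$4 lun=$lun"
)

parse_location_code "04-C0-00-5,0"
```

On a real system the code to feed in would come from the device listing (for example the location column printed by lsdev -Cc disk).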

Logical Volume Manager

Did you know that LVM was implemented in AIX almost ten years earlier (1989) than in Linux (1998)? In fact, after the AIX version of LVM was developed, its license was bought by HP. Only after that did Heinz Mauelshagen develop the Linux version, with commands similar to the HP version. The Windows Server platform still doesn’t have anything similar, AFAIK.

Journaling File System

Another AIX achievement is the JFS file system, which is journaling by design. The first JFS version was implemented in 1990 in AIX 3.1. Do you remember when ext3 was developed? I believe somewhere in 2001. Journaling NTFS v3 arrived in 2000 with Windows 2000. The JFS file system in AIX 4.2 supported a 64GB file size (that was 1996). With the introduction of JFS2 in 2001, AIX 5 began to support 1TB files. The maximum file size for FAT32 was 4GB. All these facts are explainable: AIX was developed far earlier than Linux and Windows. But it’s still interesting how features first introduced in AIX (and other flavors of UNIX) migrate to younger OSes.

Full system recovery

Unlike Linux, AIX allows you to create full volume group backups with all logical volumes. Even today, in Linux you work with the antique tar, gzip, cpio and dd (or duplicity and bacula if you want something more sophisticated). In 2001 AIX already had savevg for backing up non-rootvg volume groups and mksysb, which lets you back up rootvg along with system-related data. mksysb creates an installable image for full system recovery. I find these tools invaluable. I do not know of a Linux alternative.

User/group administration

Additionally, AIX has several handy user administration features. For example, a user group can be either administrative or standard. If it’s administrative, then only root can add users to it or remove them from it. If it’s standard, ordinary users can administer that group. It’s a feature I sometimes miss in Linux. Groups are configured in /etc/security/group and look like the following:

system:
admin = true

jradmin:
admin = false
adms = pac,xander

Here system is an administrative group and jradmin is a standard one. The admin field identifies the group type and adms contains the list of group administrators (pac and xander).

Also, in AIX you can assign portions of root authority to non-root users. There are several predefined roles, such as ManageAllUsers, ManageShutdown and ManageBackupRestore, defined in /etc/security/roles. A role consists of a number of authorizations, each of which is a set of particular tasks that a user can perform. For example, the ManageAllUsers role consists of the following authorizations: UserAudit, ListAuditClasses, UserAdmin, RoleAdmin, PasswdAdmin, GroupAdmin. You can create your own roles from these authorizations. In AIX 5, Role-Based Access Control (RBAC) is rather primitive and restricted, but it’s better than nothing.
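
Coming back to the /etc/security/group stanzas shown earlier: as a small sketch of how that format can be read from a script, the function below prints the groups marked admin = true. list_admin_groups is a hypothetical helper name, and the sample file is a scratch copy of the two stanzas from this post, not the live file.

```shell
# Print the names of groups whose stanza contains "admin = true".
# list_admin_groups is a hypothetical helper; $1 is a stanza file.
list_admin_groups() {
    awk '
        # A stanza header is a single word ending in ":" (e.g. "system:").
        NF == 1 && $1 ~ /:$/ { group = substr($1, 1, length($1) - 1); next }
        # Attribute lines look like "name = value".
        $1 == "admin" && $2 == "=" && $3 == "true" { print group }
    ' "$1"
}

# Try it on a scratch copy of the sample stanzas.
sample=$(mktemp)
cat > "$sample" <<'EOF'
system:
    admin = true

jradmin:
    admin = false
    adms = pac,xander
EOF
list_admin_groups "$sample"
rm -f "$sample"
```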

Error logging

And the last thing I’d like to talk about is error logging. In Linux, logging is performed by syslogd, and AIX has the same daemon. However, the AIX error logging facility is augmented by errdemon. It is started as part of system initialization and continuously monitors /dev/error. When information is read from /dev/error, errdemon checks its Error Record Template Repository, /var/adm/ras/errtmplt, and if it has any additional information on this error, the daemon writes it to /var/adm/ras/errlog. The log is in binary format. To read it, run the errpt command:

errpt -a -s 0519000012

This will show you detailed information on log entries starting from 00:00 on May 19, 2012. The argument to -s has the form mmddhhmmyy.
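
Since the mmddhhmmyy stamp is easy to get wrong by hand, here is a minimal sketch that assembles it from separate values. errpt_stamp is a made-up helper name; it only builds the string and does not call errpt itself.

```shell
# Build the mmddhhmmyy start stamp for errpt -s.
# errpt_stamp is a hypothetical helper: month day hour minute year.
# awk does the zero padding, so "8" and "08" are both accepted,
# and a four-digit year is reduced to its last two digits.
errpt_stamp() {
    awk -v m="$1" -v d="$2" -v H="$3" -v M="$4" -v y="$5" \
        'BEGIN { printf "%02d%02d%02d%02d%02d\n", m, d, H, M, y % 100 }'
}

errpt_stamp 5 19 0 0 2012
# On the AIX host this could be used as:
#   errpt -a -s "$(errpt_stamp 5 19 0 0 2012)"
```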

Conclusion

My first experience working with AIX (even with such an outdated version) makes me think of it as a sophisticated and very well written operating system. Many major features were developed in AIX much earlier than in Linux and Windows and I believe it’s still true for modern AIX releases. It becomes obvious why Unix is the primary choice for many big organizations with strong IT infrastructure.

Configuring remote access to AIX

May 16, 2012

I work on an old AIX 5.1:

# oslevel -r
5100-03

By default it has only telnet preinstalled, which works out of the box without additional configuration. However, there are several recommended steps to take.

Telnet

First, check that you have a stable network connection. I had problems reconnecting to the AIX box after a connection timeout. It seemed that the telnet session somehow hung on the OS side and didn’t allow me to reconnect. To prevent that, you have two options. If you use PuTTY, go to Settings->Connection and set the number of seconds between keepalive packets to, say, 60. PuTTY will then maintain the connection automatically. Another workaround is to set the TMOUT variable in /etc/profile. By default AIX uses ksh, which uses this parameter to detect idle sessions. If you set this variable to 120, then after two minutes of inactivity ksh will warn that the session will be closed in 60 seconds. In theory this means that if your telnet session breaks, ksh will automatically terminate its shell. (I checked that and it turned out that TMOUT doesn’t help here.)

TCP Wrapper

By default telnet access in AIX is open to everyone. That’s surely not what you want. AIX has a built-in firewall (called AIX TCP/IP Filters), but it’s rather cumbersome to use just to restrict telnet access. I prefer TCP Wrapper, which is standard on Linux but optional on AIX. You can get the AIX LPP package from the Bull AIX freeware site: http://www.bullfreeware.com/index2.php?page=lppaix51. Then simply:

chmod +x tcp_wrappers-7.6.1.0.exe

Extract the package contents by running the executable. Then run smit from the directory where you extracted the files and go to Software Installation and Maintenance -> Install and Update Software -> Install Software. Enter the current directory in “INPUT device / directory for software”. You can view the available software by pressing F4 in the “SOFTWARE to install” field. Change “ACCEPT new license agreements?” to yes and press Enter.

When the package is installed, edit /etc/inetd.conf. Find the telnet line and change it:

#telnet stream tcp6 nowait root /usr/sbin/telnetd telnetd -a
telnet stream tcp6 nowait root /usr/local/bin/tcpd telnetd -a

And restart the inetd service:

stopsrc -s inetd && startsrc -s inetd
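
If you would rather script the inetd.conf change than edit the file by hand, the following is a rough sketch of the substitution. wrap_telnet is a made-up helper name; it is a filter that reads inetd.conf-style text on stdin and prints the edited text, so it never touches the real file. The paths are the ones used in this post.

```shell
# Point the active telnet entry at tcpd instead of telnetd.
# wrap_telnet is a hypothetical filter: stdin in, edited text on stdout.
# Commented-out lines ("#telnet ...") and other services are left alone.
wrap_telnet() {
    sed 's|^\(telnet[[:space:]].*[[:space:]]\)/usr/sbin/telnetd\([[:space:]]\)|\1/usr/local/bin/tcpd\2|'
}

printf '%s\n' \
    'ftp stream tcp6 nowait root /usr/sbin/ftpd ftpd' \
    'telnet stream tcp6 nowait root /usr/sbin/telnetd telnetd -a' | wrap_telnet
```

Review the output before copying it over /etc/inetd.conf, and keep a backup of the original.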

Now, to limit telnet access, create /etc/hosts.allow:

telnetd: 123.234.123.234 234.123.234.123

and /etc/hosts.deny:

ALL:ALL

Secure Shell

Telnet is a completely outdated and insecure protocol, so you’d probably prefer ssh on the server side. I believe SSH is bundled with AIX 5.1, but I simply downloaded it from the Bull site. In addition to the OpenSSH package, you will have to set up the OpenSSL prerequisite. Here are the links:

http://www.bullfreeware.com/affichage.php?id=779
http://sourceforge.net/projects/openssh-aix/files/openssh-aix51/4.1p1/

Install OpenSSL simply by:

rpm -i openssl-0.9.7l-1.aix5.1.ppc.rpm

In the case of OpenSSH you will need to gunzip it, untar it and install it using smit. But if you work on AIX with an old maintenance level (ML3 in my case), you can run into the following error when running the ssh service:

getnameinfo failed: Invalid argument

You can see it if you run sshd with the -D and -d flags. The solution is to download the AIX 5.1 ML9 and POSTML9 fixes from IBM Fix Central, extract them and install them via Software Installation and Maintenance -> Install and Update Software -> Update Installed Software to Latest Level (Update All).

SSH is a standalone service, so you do not need to edit /etc/inetd.conf. Just add a new sshd line to /etc/hosts.allow and you are good to go. However, if your sshd was built without wrapper support, you have a problem. You can check that by calling:

# dump -H /usr/sbin/sshd

/usr/sbin/sshd:

                        ***Loader Section***
                      Loader Header Information
VERSION#         #SYMtableENT     #RELOCent        LENidSTR
0x00000001       0x00000115       0x00000601       0x00000096

#IMPfilID        OFFidSTR         LENstrTBL        OFFstrTBL
0x00000006       0x00006224       0x0000075a       0x000062ba

                        ***Import File Strings***
INDEX  PATH                          BASE                MEMBER
0      /usr/lib:/lib:/opt/freeware/lib
1                                    libc.a              shr.o
2                                    libpthreads.a       shr_comm.o
3                                    libpthreads.a       shr_xpg5.o
4                                    libcrypto.a         libcrypto.so.0.9.7
5                                    libz.a              libz.so.1
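
Instead of reading the import list by eye, the check can be sketched as a small filter. has_libwrap is a made-up helper name; it reads a listing such as the one above on stdin and only looks for the string libwrap.a.

```shell
# Report whether a loader-section listing mentions libwrap.a.
# has_libwrap is a hypothetical helper that reads the listing on stdin.
has_libwrap() {
    if grep -q 'libwrap\.a'; then
        echo "wrapper support: yes"
    else
        echo "wrapper support: no"
    fi
}

# On the AIX host: dump -H /usr/sbin/sshd | has_libwrap
# Here, two lines from the listing above stand in for the real output.
printf '%s\n' '1   libc.a   shr.o' '4   libcrypto.a   libcrypto.so.0.9.7' | has_libwrap
```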

If there is no libwrap.a, then the only option you have is to run sshd under tcpd, which is run by inetd. To accomplish that, add the first line below to /etc/services and the second to /etc/inetd.conf:

ssh 22/tcp
ssh stream tcp6 nowait root /usr/local/bin/tcpd sshd -i

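The ‘-i’ switch tells sshd that it is being started from inetd. In this mode sshd has to generate its server key for every connection, so unless that key is small you will wait a significant amount of time for login prompts. Also, don’t forget to remove the sshd startup and shutdown scripts from /etc/rc.d/rc2.d.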

Migrating physical Linux host to the VMware ESXi

May 2, 2012

Well, perhaps the easiest way to accomplish that is to use VMware Converter from the start. I believe there is a Linux version. However, I took another route. I already had an Acronis backup image, so my solution was to use this image as a source. I fed it into the Windows version of VMware Converter, which in turn converted it to VMware format and created a virtual machine on the ESXi server automatically.

Using this simple procedure you can get a working system. Not in my case, though. The original OS used a software RAID of two hard drives, so I had to boot from a live CD. Then I edited fstab and GRUB’s menu.lst and set /dev/sda1 (root volume) instead of /dev/md0 and /dev/sda2 (swap) in place of /dev/md1. Additionally, I had to reinstall GRUB’s boot files:

grub-install --root-directory=/media/sda1/boot /dev/sda
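
The device renaming in fstab and menu.lst mentioned above can be sketched as a simple filter. unraid_devices is a made-up helper name, and the sketch assumes that md0 and md1 are the only RAID devices in the file (it would also rewrite the prefix of a name like /dev/md10).

```shell
# Swap the software-RAID device names for the plain partitions:
# /dev/md0 -> /dev/sda1 (root), /dev/md1 -> /dev/sda2 (swap).
# unraid_devices is a hypothetical filter: stdin in, edited text on stdout.
unraid_devices() {
    sed -e 's|/dev/md0|/dev/sda1|g' -e 's|/dev/md1|/dev/sda2|g'
}

printf '%s\n' \
    '/dev/md0 / ext3 defaults 1 1' \
    '/dev/md1 swap swap defaults 0 0' | unraid_devices
```

Run it over a copy of each file from the live CD and compare the result with the original before replacing anything.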

Then, if it’s SUSE, you will have to change the “resume” switch in GRUB’s boot menu line to /dev/null. After you boot into the system, recreate the swap partition and point the “resume” switch to it. If you don’t do that, you will end up with the following error during the boot process:

Kernel panic - not syncing: I/O error reading memory image

One tricky issue I had in this whole story was related to the kernel. As I’ve already mentioned, the original operating system ran on top of software RAID, and its initrd image wouldn’t detect an ordinary virtual SCSI hard drive during boot. So I had to boot from the SUSE installation CD and install a standard kernel on top of the original system. That solved the issue. Additionally, I had to choose Russian as the primary language during kernel installation, otherwise I ended up with unreadable symbols inside the system. But that’s not necessary in the majority of cases.

I hope my experience will be helpful for other sysadmins.

Export share in ROCKS

March 14, 2012

In my previous post I described how you can present an iSCSI LUN to a Linux host. I moved all home directories to this NAS share, but later I came to the conclusion that making a separate share would be better. Users should be able to compile applications quickly in their home directories. If home directories are also used as target storage for computational data, then during computation the iSCSI network link can become a bottleneck and slow everything down. That’s why I decided to separate them. It requires exporting an additional share, which can be done very easily in ROCKS.

1. Mount the LUN to, say, /export/scratch

2. Export it by adding the following (all on one line) to /etc/exports

/export/scratch 192.168.111.128(rw,async,no_root_squash) 192.168.111.0/255.255.255.0(rw,async)

3. Restart nfs

/etc/rc.d/init.d/nfs restart

4. Add a line to /etc/auto.share

scratch master.local:/export/&

5. Update 411 config

make -C /var/411

Now the share is accessible to all compute nodes at /share/scratch.

The same process is described in the ROCKS FAQ.

Present NetApp iSCSI LUN to Linux host

March 7, 2012

Consider the following scenario (which is in fact a real case). You have a High Performance Computing (HPC) cluster where users usually generate a hell of a lot of research data. Local hard drives on a frontend node are almost always insufficient. There are two options. The first is presenting an NFS share to both the frontend and all compute nodes. Since compute nodes usually connect only to a private network for communication with the frontend and don’t have public IP addresses, this means a lot of reconfiguration, not to mention possible security implications.

The simpler solution here is to use iSCSI. Unlike NFS, which requires direct communication, with iSCSI you can mount a LUN on the frontend, and the compute nodes will then work with it as an ordinary NFS share through the private network. This implies configuring an iSCSI LUN on a NetApp filer and bringing up an iSCSI initiator in Linux.

iSCSI configuration consists of several steps. First of all, you need to create a FlexVol volume where your LUN will reside and then create a LUN inside it. The second step is the creation of an initiator group, which enables connectivity between the NetApp and a particular host. As the last step, you need to map the LUN to the initiator group. This lets the Linux host see the LUN. In case you disabled iSCSI, don’t forget to enable it on the required interface.

vol create scratch aggrname 1024g
lun create -s 1024g -t linux /vol/scratch/lun0
igroup create -i -t linux hpc
igroup add hpc linux_host_iqn
lun map /vol/scratch/lun0 hpc
iscsi interface enable if_name

Linux host configuration is simple. Install the iscsi-initiator-utils package and add it to init on startup. The iSCSI IQN which the OS uses for connection to iSCSI targets is read from /etc/iscsi/initiatorname.iscsi upon startup. After the iSCSI initiator is up and running, you need to initiate the discovery process, and if everything goes fine you will see a new hard drive in the system (I had to reboot). Then you just create a partition, make a file system and mount it.

iscsiadm -m discovery -t sendtargets -p nas_ip
fdisk /dev/sdc
mke2fs -j /dev/sdc1
mount /dev/sdc1 /state/partition1/home

I use it for the home directories in the ROCKS cluster suite. ROCKS automatically exports /home through NFS to compute nodes, which in turn mount it via autofs. If you intend to use this volume for other purposes, you will need to configure your own NFS export.

Dovecot failing regularly

February 22, 2012

I ran into an issue where Dovecot fails when ntpd moves time backwards on a Linux server. Here is the message which appears in the logs:

dovecot: Fatal: Time just moved backwards by 15 seconds. This might cause a lot of problems, so I’ll just kill myself now. http://wiki.dovecot.org/TimeMovedBackwards

A possible solution is to add the -x flag to the ntpd command line. I did that using the /etc/sysconfig/ntp file. Now each time ntpd changes the time it will do so smoothly, by slewing the clock instead of stepping it.

The root cause of the problem could be the onboard lithium battery. But I hope the -x flag will solve the problem without taking the chassis cover off.
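
As a rough sketch of the -x edit, assuming a Red Hat style /etc/sysconfig/ntp in which the flags live in an OPTIONS="..." line (the variable name differs between distributions, so check your own file first), the flag could be added with a small filter. add_slew_flag is a made-up helper name; it reads the file on stdin, prints the result, and leaves an OPTIONS line that already contains -x alone.

```shell
# Prepend -x to the ntpd flags in an OPTIONS="..." line.
# add_slew_flag is a hypothetical filter; the OPTIONS variable name is an
# assumption about the layout of /etc/sysconfig/ntp.
add_slew_flag() {
    sed -e '/^OPTIONS="/{' \
        -e '/-x/!s/^OPTIONS="/OPTIONS="-x /' \
        -e '}'
}

printf '%s\n' 'OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid"' | add_slew_flag
```

Restart ntpd after changing the file so that the new flag takes effect.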

Amavis whitelisting

February 21, 2012

Just a short note on how to add a sender email address to a whitelist. To do that, just add the following lines to /etc/amavisd.conf and restart amavisd:

map { $whitelist_sender{lc($_)}=1 } (qw(
    sender1@domain.com
    sender2@domain.com
));

Installing Symantec Backup Exec Agent for Linux

October 7, 2011

The Symantec Backup Exec Linux/Unix agent is called RALUS, which stands for Remote Agent for Linux and Unix Servers. I obtained my RALUS installation from the official Symantec CDs. If you don’t have them, you can probably download them from the Symantec web site. Here is the sequence:

  1. Mount the CD or ISO image on your Linux host.
  2. Run the ./installralus script and follow the instructions. I use the defaults. The only thing you have to enter is the Media Server IP address. The installation script adds itself to the rc*.d levels automatically.
  3. After the installation is completed, create a backup user, add it to the beoper group and set its password: # useradd backup -c "User for Symantec Backup Exec"; # usermod -G beoper backup; # passwd backup.
  4. Start the BE agent manually for the first time: # /etc/init.d/VRTSralus.init start

That’s it. Now you can see your server under the Linux/Unix Servers section when creating a backup job.

Add #1: If the agent doesn’t start and you get an error about a missing libstdc++.so.5 in /var/VRTSralus/beremote.service.log, then install compat-libstdc++-33.

Add #2: If you have an active firewall, then you need to open additional ports. For me it was tcp 10000-10200: port 10000 plus the port range you can find on the media server in the Tools->Options->Network and Security tab. For CentOS the firewall rule would be:

-A RH-Firewall-1-INPUT -m tcp -p tcp -s media_server_ip --dport 10000:10200 -j ACCEPT

Add #3: In case you also write firewall rules for the OUTPUT chain, open outgoing tcp 10000:

-A RH-Firewall-1-OUTPUT -m tcp -p tcp -d media_server_ip --dport 10000 -j ACCEPT

If you don’t have RH-Firewall-1-OUTPUT, also add:

:RH-Firewall-1-OUTPUT - [0:0]
-A OUTPUT -j RH-Firewall-1-OUTPUT

I may be wrong, but the SBE documentation says:

Symantec recommends having port 10000 open and available on the Backup Exec media
server as well as on the remote systems.

Additional connections from the media server to the Remote Agent will be initiated on any available port.

I understand that as follows: both the agent and the media server may connect to each other’s port 10000, and the additional 10001:10200 connections are initiated from the media server.

Simple Duplicity backup scheme

September 21, 2011

Here is a quick description of how you can set up a simple backup scheme in Duplicity:

00 22 1 * * /usr/bin/duplicity full --no-encryption /etc/vsftpd file:///var/backup/ftp/conf.bp
00 22 2-31 * * /usr/bin/duplicity inc --no-encryption /etc/vsftpd file:///var/backup/ftp/conf.bp
00 23 1 * * /usr/bin/duplicity remove-older-than 6M --no-encryption --force file:///var/backup/ftp/conf.bp

The first line creates a full backup on the first day of the month. The second makes incremental backups on every other day. The third removes backups older than 6 months.
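
The day-of-month split between the first two crontab lines can also be expressed in a script, which is handy if you prefer a single cron entry that calls a wrapper. This is only a sketch of the decision itself: backup_action is a made-up helper name and it prints the duplicity action without running duplicity.

```shell
# Choose the duplicity action for a day of the month, mirroring the
# schedule above: "full" on day 1, "inc" on days 2-31.
# backup_action is a hypothetical helper.
backup_action() {
    day=${1#0}      # drop one leading zero so "01" is read as 1
    if [ "$day" -eq 1 ]; then
        echo full
    else
        echo inc
    fi
}

# Action for today, using the zero-padded day printed by date.
backup_action "$(date +%d)"
```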

RPMforge for CentOS 5

September 21, 2011

The standard CentOS repository lacks several useful packages. Duplicity is one of them. This backup software can be found in the RPMforge repository. Installing RPMforge on CentOS 5 is as simple as downloading and installing this RPM package: http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el5.rf.x86_64.rpm.