Friday, November 9, 2007

Solaris Notes I: serial console

My wiki at work has a very long SolarisNotes page. We have three X4100s running Solaris that I've already got a number of secondary apps running on, and I'm preparing to move some serious production apps onto them. I've done a lot with Solaris before, but never on x86, never with Solaris 10, and never with stuff like Live Upgrade, (organized) patching, etc. So I've learned a lot doing all this, and have made all kinds of notes.

The Solaris documentation is mostly pretty good, so it's a little surprising to me that I've had to assemble such a collection of notes. But some things aren't brought together in an organized and task-centric way, at all. If you want to find out how to perform a specific operation, that's no problem. But if you don't know what operation to perform in the first place, it might take a while.

A case in point is setting up a Solaris x86 install for serial console management. I'm used to Solaris on SPARC, where the console is always ttya unless somebody's been silly enough to plug in a keyboard. Want to manage the boot process? Halt the machine? ttya is it. Whatever else is wrong, you can always send a BREAK and drop into the PROM monitor. Surely buying a Sun box running Solaris would reproduce that experience, even if it did happen to have an AMD processor instead of SPARC, right?

Wrong. I ended up hard-resetting the first X4100 I unpacked a couple of times before I realized that it actually was booting; it just wasn't printing anything on the serial console, and I needed to load up all the Java KVM redirection gunk just to log in and configure the network. (Never mind my surprise when I realized that the first Galaxy boxes we bought actually had regular PC BIOSes; I had really been hoping for a nice OpenBoot environment or something...)

My preference for a serial console isn't just my curmudgeon side showing through, either. It's actually a substantial pain to navigate through the ILOM interface and launch the Java KVM thing, and its bandwidth requirements are obscene. It's one thing when I've got a DS3 between my MacBook and the servers, and quite another when I'm on flaky coffee-shop WiFi. In the latter case, the ILOM redirection is Not Gonna Happen. But a serial console is only 9600 bps at source, and even the slowest of connections can handle that. Plus a single SSH hop is a lot quicker to establish.

In practice, you have to change three things: an "eeprom" setting (which on x86 doesn't actually go into an EEPROM at all; AFAIK it lands in a plain file, /boot/solaris/bootenv.rc, that also gets packed into the mysterious "boot archive"), the SMF console-login service configuration, and the GRUB configuration, like so:


# eeprom console=ttya
# svccfg -s console-login setprop ttymon/terminal_type = "vt102"
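
One way to double-check the two commands above is to read both settings back. The following is a minimal sketch, not part of the original notes: check_serial_console is a made-up helper name, and it assumes a POSIX-style shell (ksh or bash rather than the legacy /bin/sh) and that eeprom prints the property as "console=ttya". It only reads; it changes nothing. As far as I can tell, the svccfg change only reaches the running service after "svcadm refresh console-login", so if the second check still shows the old value, that is the likely reason; treat that as an assumption to verify on your own box.

```shell
# Read-only sanity check for the serial console settings.
# check_serial_console is a hypothetical helper name, not a Solaris tool.
# Run as root under ksh or bash.
check_serial_console() {
    want_console="${1:-ttya}"
    status=0

    # Assumption: eeprom prints "console=<value>" when the property is set.
    got_console=$(eeprom console 2>/dev/null)
    if [ "$got_console" = "console=$want_console" ]; then
        echo "eeprom: $got_console"
    else
        echo "eeprom: expected console=$want_console, got '$got_console'" >&2
        status=1
    fi

    # Read the terminal type back from the console-login service.
    term_type=$(svcprop -p ttymon/terminal_type console-login 2>/dev/null)
    if [ -n "$term_type" ]; then
        echo "console-login terminal type: $term_type"
    else
        echo "console-login: could not read ttymon/terminal_type" >&2
        status=1
    fi

    return $status
}
```

Calling "check_serial_console ttya" returns non-zero if either setting could not be confirmed, which makes it easy to drop into a post-install script.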


And then in /boot/grub/menu.lst:

Uncomment these lines:


serial --unit=0 --speed=9600
terminal serial


Comment this out:

#splashimage /boot/grub/splash.xpm.gz


And add -B console=ttya to the kernel line of the Solaris failsafe entry too:

kernel /boot/multiboot kernel/unix -s -B console=ttya
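
The first two menu.lst edits are mechanical enough to script. Here is a rough sketch under some assumptions: enable_grub_serial is a hypothetical helper name, the commented-out lines are assumed to start with "#" followed by optional spaces (not tabs), and the function writes an edited copy rather than touching the original, since the stock Solaris sed has no in-place option. The failsafe kernel line is deliberately left for hand editing.

```shell
# enable_grub_serial is a hypothetical helper, not a Solaris or GRUB tool.
# It writes an edited copy of menu.lst and never modifies the source file.
enable_grub_serial() {
    src="$1"    # e.g. /boot/grub/menu.lst
    dst="$2"    # where to write the edited copy

    # 1. Uncomment the "serial --unit=..." line.
    # 2. Uncomment the "terminal serial" line.
    # 3. Comment out an active splashimage line.
    # Lines already in the desired state are left alone, so it is safe to re-run.
    sed -e 's/^#[ ]*\(serial --unit=.*\)$/\1/' \
        -e 's/^#[ ]*\(terminal serial.*\)$/\1/' \
        -e 's/^\(splashimage .*\)$/#\1/' \
        "$src" > "$dst"
}
```

Something like "enable_grub_serial /boot/grub/menu.lst /tmp/menu.lst.new" followed by a diff of the two files shows exactly what would change before you copy the result into place.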


What a mess! And, of course, this wasn't documented in any organized way under "how to make your shiny new Sun box have a serial console like god intended"; I had to rely on comments in the GRUB menu file, scattered bits of documentation, and other people's blog posts (long-forgotten, I'm afraid).

I can appreciate that the idea of a single console device is probably baked in pretty deep in Unix, but it really would be good if we could have active consoles on both the graphical display AND the serial port.

Moreover, a separate chapter in the System Administration Guide or something on setting up your system for remote access and management would be incredibly helpful. This was a piece of cake compared to the things I had to kludge up for monitoring the built-in RAID and hardware sensors, and the documentation is basically useless for solving these kinds of problems.
