January 23, 2009
Building a Server
Most companies pay a lot of money to have third parties build
and maintain their storage infrastructure, often at an enormous markup
beyond the cost of the hardware. At SpiderOak, they do things a little
bit… differently. They have me.
A big part of my job is following the industry and researching and
testing new ways to build storage servers, and if I do say so myself,
I’m no slouch at building systems. I’ve designed and administered a
small computational cluster and built my fair share of desktops and
servers. So if everything worked perfectly, building SpiderOak machines
would be a doddle.
But it doesn’t, and it’s not. A few recent examples:
Since our servers are locked away in a data center, we have an array
of remote monitoring and access controls for our machines, core among
which is the BMC, or Baseboard Management Controller. Its job is to
let you do over a network everything you could do with physical access,
like turning the machine off and on again or looking at the console display. With
one of these in your computer, you can install, configure, run, and
destroy your computer from anywhere in the world. It’s a fantastically
convenient piece of kit when it works properly.
The particular BMC we have is made by a company who shall remain
anonymous, but we’ll call them MuperSicro. Now, MuperSicro’s BMC is
designed to share an Ethernet port so that it doesn’t need a dedicated
port. It does still have its own MAC address, though. Or it
should. This particular unit came to me with a MAC address of
00:00:00:00:00:00. Their solution? “Take the MAC address of LAN2 and add
1.” That works, but I would like it if parts came properly configured. I
ordered a BMC, not a Heathkit for one.
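For illustration, the "add 1" workaround amounts to treating the MAC address as a 48-bit integer and incrementing it. Here is a minimal Python sketch of that arithmetic. The function name and the example address are hypothetical, not part of the vendor's instructions, and the sketch only computes the new address; it does not program it into the BMC.

```python
def increment_mac(mac: str, offset: int = 1) -> str:
    """Return the colon-separated MAC address `mac` plus `offset`.

    The address is treated as a 48-bit unsigned integer, so a carry out of
    the last octet propagates into the next one. Incrementing the highest
    possible address wraps around to all zeros.
    """
    octets = mac.split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")

    # int() raises ValueError if any octet is not valid hexadecimal.
    value = int("".join(octets), 16)
    value = (value + offset) % (1 << 48)

    # Format back to 12 hex digits and re-insert the colons.
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


# Hypothetical LAN2 address, chosen to show the carry into the fifth octet.
lan2_mac = "00:11:22:33:44:ff"
bmc_mac = increment_mac(lan2_mac)
print(bmc_mac)  # 00:11:22:33:45:00
```

Working on the whole address as one integer, rather than bumping only the last octet, keeps the result correct when the last octet is already ff.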
More fun comes from our LSI 8888ELP controller. This is a fantastic
SAS RAID controller — internal and external ports, 512MB of cache,
and excellent OS compatibility. The configuration, though, is a bit
daffy. For their BIOS configuration, you have a choice. You can use
WebBIOS, which poorly imitates a webpage, uses a mouse, and is just
about the worst choice of interface for RAID configuration imaginable.
Alternatively, you can use Preboot CLI, which is MegaCLI in standalone
firmware form. The deficiencies of MegaCLI have been adequately
discussed by others
(http://www.kaltenbrunner.cc/blog/index.php?/archives/4-LSIlogic-MegaRAID-SAS-and-the-self-explaining-CLI.html),
and I can say as a man who uses both mplayer
and ffmpeg frequently, it is bar none the most hideous and inconsistent,
poorly-documented piece of crap command-line tool I have ever used.
Without proper documentation, I willy-nilly decided to enable
DirectPdMapping, figuring that it would allow me to get direct access to
the drives. That particular option is, I might add, not documented in
the MegaRAID SAS User’s Guide available on LSI’s website. It said I
needed to reboot, so I did, and I was greeted with this:
Attached Enclosure doesn't support in controller's Direct mapping mode
Please contact your system support.
System has halted due to unsupported configuration.
The controller decides that it can’t use the enclosure in the way I
asked, so it halts the system. It doesn’t offer to turn direct mapping
off (which would be nice) or offer to load the configuration tool (which
would be expected), it just halts. The solution is to open the case and
unplug the SAS cable to the enclosure, which then allows the machine to
boot so that you can change it, then plug it back in and continue on
your way. If this machine had been in a rack in the data center when this
happened, I’d have had to bother the techs to go open it up and fiddle with
it. It allowed me to get into a situation that was unrecoverable without
physical intervention. That is completely unacceptable. Oh, and neither
WebBIOS nor Preboot CLI can help you turn that feature off, either. I had
to boot into Linux to switch it back.
Invariably you will find foibles like these when building a new
system, which is why we spend time poking and prodding newly built
systems before putting them in our data centers and entrusting them with