RANT: Cheap Server Rail Kits

This is a very specialized topic, but my hand hurts this morning for a very specialized reason, and I wish to rant about it.

Attention cheap white-box server manufacturers: the cheap, flimsy, rickety, ill-fitting, funny-smelling rail kits you continue to ship so you can save $10 are absolute garbage, and you should be ashamed of yourselves. I would so dearly love to name names, but as this post surely proves, I am too pro for that.

You can and should do better. The mechanism responsible for holding something heavy and important should not be made of materials as utterly not-resilient as (and unfavorably comparable to) plastic wrap, talc, or balsa wood. I would rather use duct tape, because at least I know what to expect from duct tape, and it typically does what I expect.

IKEA ships better rail kits to hold a two-pound drawer.

I would shake my fist at you, but I am having trouble making my hand into that shape this morning because of an encounter with one of these travesties of engineering yesterday afternoon.

Please round up all such alleged rail kits and leave them out in the sun, where they will surely melt into a greasy puddle of sadness and embarrassment in under 15 minutes.

This is perhaps the leading reason I do not prefer cheap white-box servers, and have a moment of nausea whenever I know I will have to deal with one. It is never by choice.

Thank you for your time and attention. My soul feels better, but my hand still hurts.

OS X Mirroring and External Displays

I have a not-superfancy 21″ iMac at home. It’s got an old Dell 24″ DVI display attached with a Thunderbolt adapter. I’m generally a big fan of multiple displays on any computer I’m using, and this works out well when I want it, especially if I’m doing something with VMware Fusion that could use a full extra screen. But here’s the thing: In this particular setting, I don’t need or want the external display all the time. The Dell is also connected via VGA to an old Mac mini that we occasionally use, which complicates things a bit more because of input switching. The display is from something like 2007, so it does nothing automatically.

No problem: If I don’t need it, I’ll just turn it off. However, the Thunderbolt adapter means that OS X always sees the display as present, even if it’s powered off. I find this annoying, especially when I launch an application that remembers it has windows on the external display. I have to turn the display on, make sure it’s on the DVI input, blah blah, just to move the window. It’s a small thing, but the grumpy accumulates.

Today, the obvious solution finally occurred to me: Just turn on display mirroring when the external display is unwanted. The shortcut is Command-F1, or Command-fn-F1 if you have media keys disabled.

This gathers everything to the internal iMac display and means OS X (and I) can just ignore the second display. Easy.

I Did Not Know: xargs -n and -P

Say you need to md5sum 46 files, all ending in “.foo” in a single directory. You might use your standard `md5sum *.foo > md5sum.txt` command to checksum them all in one process. Get coffee, it’s done, move on.

Oh, I should mention those 46 files are a little under 6 terabytes in total size. The standard command might take a while. I drink a lot of coffee, but whoa.

Now imagine you have a 16-core server connected via modern InfiniBand to an otherwise idle pre-production parallel filesystem with several hundred disks and multiple controllers, each with its own cache. The odds are tilting in your favor. This is especially true if you read up on this pair of options in xargs(1), which inexplicably, shamefully, I Did Not Know:

--max-args=max-args, -n max-args
       Use at most max-args  arguments  per  command  line.
       Fewer  than  max-args  arguments will be used if the
       size (see the -s option) is exceeded, unless the  -x
       option is given, in which case xargs will exit.
--max-procs=max-procs, -P max-procs
       Run up to max-procs processes at a time; the default
       is 1.  If max-procs is 0, xargs  will  run  as  many
       processes  as possible at a time.  Use the -n option
       with -P; otherwise chances are that  only  one  exec
       will be done.

Sure, I could have run this on more than one such machine connected to the same filesystem. There are a number of tools that can split up work across multiple child processes on a single machine, none of which were installed in this environment. I wanted to see what I could get this single server to do with basic commands.

46 files / 16 cores = 2.875, which rounds up to three files per run, so let’s give this a shot:

find . -type f -name "*.foo" | xargs -P 16 -n 3 md5sum | tee md5sum.out

English: For the files ending in “.foo” under this directory (find also descends into any subdirectories), run md5sum up to 16 times in parallel with up to three files per run, show results as they happen, and save the output.
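For the record, here is how the -n 3 falls out of the numbers, as plain shell integer arithmetic. This is only a sketch of the rounding; the variable names are mine and not part of xargs.

```shell
files=46
cores=16

# Ceiling division: files per md5sum invocation so that no more
# than $cores invocations are needed to cover every file.
per_run=$(( (files + cores - 1) / cores ))

# Number of invocations xargs will start when given -n $per_run.
runs=$(( (files + per_run - 1) / per_run ))

echo "files per run: $per_run, total runs: $runs"
```

With 46 files that is 15 runs of three files plus one run of a single file, so all 16 runs can start at once under -P 16.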

Please Note: This will absolutely not help unless you have the storage infrastructure to handle it. Your Best Buy hard drive will not handle it. It has a strong chance of making your machine unhappy.
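One more caveat on the command itself: a plain find piped to xargs splits on whitespace, so a file name with a space in it gets chopped into pieces. My files had boring names, so it did not matter here. If yours might not, a variant along these lines should be safer. It is a sketch on a scratch directory with made-up file names, and it assumes your find and xargs support -print0, -maxdepth, and -0 (the GNU and BSD versions do).

```shell
# Scratch directory with a few small stand-in files (hypothetical names).
demo=$(mktemp -d)
printf 'one\n'   > "$demo/a.foo"
printf 'two\n'   > "$demo/b c.foo"    # note the space in the name
printf 'three\n' > "$demo/d.foo"

# -print0 and -0 pass names NUL-delimited, so spaces and quotes survive.
# -maxdepth 1 keeps find from descending into subdirectories.
(
  cd "$demo" || exit 1
  find . -maxdepth 1 -type f -name "*.foo" -print0 \
    | xargs -0 -P 2 -n 2 md5sum | tee md5sum.out
)
```

The -P 2 -n 2 values are scaled down for three tiny files; the real run would keep -P 16 -n 3.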

In this case, I got something lovely from top:

  PID S  %CPU COMMAND
29394 R 100.0 md5sum
29396 R 100.0 md5sum
29397 R 100.0 md5sum
29398 R 100.0 md5sum
29399 R 100.0 md5sum
29400 R 100.0 md5sum
29401 R 100.0 md5sum
29402 R 100.0 md5sum
29403 R 100.0 md5sum
29391 R  99.6 md5sum
29392 R  99.6 md5sum
29393 R  99.6 md5sum
29395 R  99.6 md5sum
29404 R  99.6 md5sum
29405 R  99.6 md5sum
29406 R  99.6 md5sum

Early on, there were some D states waiting for cache to warm up, and CPU dropped below 70% for one or two processes, but I’ll take it. I’ll especially take this:

real    31m33.147s

Right on, xargs. A little under 6 terabytes in about 31.5 minutes works out to roughly 3 GB/s of aggregate reads. Easy parallelization on one system for single-file tasks driven from a file list or search.
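A follow-up worth noting: parallel runs finish in no particular order, so the output file will not be sorted by name. If you want a stable list you can diff or verify later, something like this should do it. It is a sketch on throwaway files with hypothetical names, using sort and the -c (check) mode of md5sum, not what I actually ran.

```shell
# Scratch data standing in for the real files (hypothetical names).
work=$(mktemp -d)
printf 'alpha\n' > "$work/x.foo"
printf 'beta\n'  > "$work/y.foo"

(
  cd "$work" || exit 1
  # Sort on the file name (second field) so the checksum list
  # comes out the same no matter which process finished first.
  find . -type f -name "*.foo" | xargs -P 2 -n 1 md5sum | sort -k 2 > md5sum.out
  # Later: re-read every listed file and compare against the saved sums.
  md5sum -c md5sum.out
)
status=$?   # 0 only if every file still matches its saved checksum
```

The check pass is itself one big sequential read, so on multi-terabyte data it is a candidate for the same xargs treatment.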

Good Software Practices Scale Down

Today I revisited some scripts for very very carefully archiving research data with checksums, an audit trail, and other very very careful things like that. I last touched them on December 5, 2011.

One of the requirements for this project is that the first phase of my processing needs to accept input data from a provider. Unfortunately, this input format has never been the same twice. Grr.

Upon receipt of the second variation on July 12, 2011 (six days after I started the project), I took the time to make the script somewhat configurable with an external file.

This was handy in November 2011 when I needed to do a similar set of work for a second research dataset. I put everything in a configuration file stored alongside the input data. Date format strings, headers, fields of interest, key/values for data types, etc. That meant I could share code between datasets as they emerged from the wild.
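To make the idea concrete, here is roughly what "config file stored alongside the input data" can look like. This is an illustrative sketch in shell, not my actual scripts: the file names (dataset.conf, input.csv), the keys (DELIMITER, HEADER_LINES, FIELD_OF_INTEREST), and the parse_input function are all invented for the example.

```shell
# Hypothetical dataset directory: the config file sits next to the input data.
dataset=$(mktemp -d)
cat > "$dataset/dataset.conf" <<'EOF'
# Per-provider settings (all names here are illustrative).
DELIMITER=','
HEADER_LINES=1
FIELD_OF_INTEREST=2
EOF
printf 'id,value\n1,apple\n2,pear\n' > "$dataset/input.csv"

usage() {
  echo "usage: parse_input [-h] DATASET_DIR"
  echo "  reads DATASET_DIR/dataset.conf, then parses DATASET_DIR/input.csv"
}

parse_input() {
  OPTIND=1                      # reset so the function can be called repeatedly
  while getopts h opt; do
    case $opt in
      h) usage; return 0 ;;
      *) usage >&2; return 2 ;;
    esac
  done
  shift $((OPTIND - 1))
  [ -n "${1:-}" ] || { usage >&2; return 2; }   # dataset directory is required
  # All format-specific knowledge comes from the config file, not the code.
  . "$1/dataset.conf"
  tail -n +"$((HEADER_LINES + 1))" "$1/input.csv" \
    | cut -d "$DELIMITER" -f "$FIELD_OF_INTEREST"
}

parse_input "$dataset"
```

A new provider format then means a new dataset.conf, not a new script. Note that sourcing the config runs it as shell code, so this only makes sense for config files you wrote yourself.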

So last week, I got another set of input data. Yep, another unique format. I hadn’t thought about this in over a year, and I have a terrible memory. Today, I got the input data parsed and validated in five minutes after editing a config file, because:

  1. I had one place to do customization
  2. I took steps to encourage code reuse
  3. I wrote good comments and gave myself a -h option

All this despite knowing that I was probably the only one who would ever look at this again. And I have those dates because everything is in a Subversion repository. Did I mention that I wrote it in a language I don’t know very well?

Granted, this is a tiny little thing in the universe of computer things, but my point here is that it’s often worth doing the right thing for the next guy, even for small things, even if the next guy is you. Perhaps especially if it’s you.

No Solaris 11 for Legacy UltraSPARC

I don’t do a lot of Solaris anymore, and though I’m interested in several of the new features in the upcoming Solaris 11, I wasn’t aware until today that it wouldn’t run on most legacy UltraSPARC systems:

Support for legacy systems that have included the UltraSPARC I, II, IIe, III, IIIi, III+, IV and IV+ processor architectures (as reported by the Solaris ‘psrinfo -pv’ command) has been removed. All Oracle SPARC Enterprise M-Series Servers and Oracle SPARC T-Series Servers will continue to be supported.

Source: http://www.oracle.com/technetwork/systems/end-of-notices/eonsolaris11-392732.html

I’ve always liked the T-Series systems, and have personally achieved what I consider impressive workload consolidation using a few of them. It’s not entirely clear what Oracle means by the above statement, but the rumor is that support for the sun4u architecture is gone. That’s a big change from the Solaris 11 Express HCL. I imagine there are lots of places where this means “we will never ever run Solaris 11.”

Reminds me of a certain fruit company. Time marches on.