I Did Not Know: xargs -n and -P

Say you need to md5sum 46 files, all ending in “.foo” in a single directory. You might use your standard `md5sum *.foo > md5sum.txt` command to checksum them all in one process. Get coffee, it’s done, move on.

Oh, I should mention those 46 files are a little under 6 terabytes in total size. The standard command might take a while. I drink a lot of coffee, but whoa.

Now imagine you have a 16-core server connected via modern InfiniBand to an otherwise idle pre-production parallel filesystem with several hundred disks and multiple controllers, each with their own cache. The odds are tilting in your favor. This is especially true if you read up on this pair of options in xargs(1), which inexplicably, shamefully, I Did Not Know:

--max-args=max-args, -n max-args
       Use at most max-args  arguments  per  command  line.
       Fewer  than  max-args  arguments will be used if the
       size (see the -s option) is exceeded, unless the  -x
       option is given, in which case xargs will exit.
--max-procs=max-procs, -P max-procs
       Run up to max-procs processes at a time; the default
       is 1.  If max-procs is 0, xargs  will  run  as  many
       processes  as possible at a time.  Use the -n option
       with -P; otherwise chances are that  only  one  exec
       will be done.

Sure, I could have run this on more than one such machine connected to the same filesystem. There are a number of tools that can split up work across multiple child processes on a single machine, none of which were installed in this environment. I wanted to see what I could get this single server to do with basic commands.

46 files / 16 cores = 2.875, so let’s give this a shot:

find . -type f -name "*.foo" | xargs -P 16 -n 3 md5sum | tee md5sum.out

English: For the files ending in “.foo” under this directory, run up to 16 md5sum processes in parallel with up to three files per process, show results as they happen, and save the output.
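
If any of the filenames might contain spaces or other awkward characters, the same idea works with null-delimited input; a minor variant using GNU find’s -print0 and xargs’s -0:

find . -type f -name "*.foo" -print0 | xargs -0 -P 16 -n 3 md5sum | tee md5sum.out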

Please Note: This will absolutely not help unless you have the storage infrastructure to handle it. Your Best Buy hard drive will not handle it. Trying it anyway has a strong chance of making your machine unhappy.

In this case, I got something lovely from top:

  PID S %CPU COMMAND
29394 R 100.0 md5sum
29396 R 100.0 md5sum
29397 R 100.0 md5sum
29398 R 100.0 md5sum
29399 R 100.0 md5sum
29400 R 100.0 md5sum
29401 R 100.0 md5sum
29402 R 100.0 md5sum
29403 R 100.0 md5sum
29391 R 99.6 md5sum 
29392 R 99.6 md5sum 
29393 R 99.6 md5sum 
29395 R 99.6 md5sum 
29404 R 99.6 md5sum 
29405 R 99.6 md5sum 
29406 R 99.6 md5sum

Early on there were a few processes in D state while the caches warmed up, and CPU usage dropped below 70% for one or two of them, but I’ll take it. I’ll especially take this:

real    31m33.147s

Right on, xargs. Easy parallelization on one system for single-file tasks driven from a file list or search.
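
The same trick applies when the work comes from a prepared file list instead of a find; a sketch, assuming a hypothetical filelist.txt with one simple path per line:

xargs -P 16 -n 3 md5sum < filelist.txt | tee md5sum.out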

No Solaris 11 for Legacy UltraSPARC

I don’t do a lot of Solaris anymore, and though I’m interested in several of the new features in the upcoming Solaris 11, I wasn’t aware until today that it wouldn’t run on most legacy UltraSPARC systems:

Support for legacy systems that have included the UltraSPARC I, II, IIe, III, IIIi, III+, IV and IV+ processor architectures (as reported by the Solaris ‘psrinfo -pv’ command) has been removed. All Oracle SPARC Enterprise M-Series Servers and Oracle SPARC T-Series Servers will continue to be supported.

Source: http://www.oracle.com/technetwork/systems/end-of-notices/eonsolaris11-392732.html

I’ve always liked the T-Series systems, and have personally achieved what I consider impressive workload consolidation using a few of them. It’s not entirely clear what Oracle means by the above statement, but rumor is that support for the sun4u architecture is gone. That’s a big change from the Solaris 11 Express HCL. I imagine there are lots of places where this means “we will never ever run Solaris 11.”

Reminds me of a certain fruit company. Time marches on.

I Did Not Know: pbzip2

I just learned about pbzip2, which lets your multicore computer use more than one core when using the bzip2 compression algorithm.

On my Mac Pro at work, I installed it with MacPorts (`sudo port install pbzip2`). It is this kind of awesome:

$ ls -lh original.tar
-rw-r--r--  1 jmcmurry  staff   2.4G Feb  4 13:47 original.tar
$ time bzip2 -k -v original.tar
original.tar: 36.215:1,  0.221 bits/byte, 97.24% saved, 2604288000 in, 71911733 out.

real	13m3.313s
user	12m50.536s
sys	0m3.773s
$ mv original.tar.bz2 bzip2.tar.bz2
$ time pbzip2 -k -v original.tar
Parallel BZIP2 v1.0.5 - by: Jeff Gilchrist [http://compression.ca]
[Jan. 08, 2009]             (uses libbzip2 by Julian Seward)

# CPUs: 8
BWT Block Size: 900k
File Block Size: 900k
-------------------------------------------
File #: 1 of 1
Input Name: original.tar
Output Name: original.tar.bz2

Input Size: 2604288000 bytes
Compressing data...
-------------------------------------------

Wall Clock: 119.369207 seconds

real	1m59.612s 
user	14m39.090s
sys	0m44.840s

Sweet. Roughly 6.5x faster by adding a “p” to my command line.

The resulting compressed .bz2 files aren’t exactly the same according to md5 (the pbzip2 output is a little larger, which makes sense due to the splitting of the work), but when they decompress, they’re both identical to the original .tar file.
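
That comparison is easy to reproduce; a rough sketch using pbzip2’s -d and -c flags (decompress to stdout), the macOS md5 command, and the file names from above. All three digests should match:

$ pbzip2 -dc original.tar.bz2 | md5
$ bzip2 -dc bzip2.tar.bz2 | md5
$ md5 original.tar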

See also: mgzip.

Solaris 10 u6 has no “-u” on “zfs receive”

Despite what you might read at docs.sun.com, Solaris 10 update 6 doesn’t have a “-u” option for `zfs receive`.

jmcmurry@lemon $ cat /etc/release
Solaris 10 10/08 s10s_u6wos_07b SPARC
Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
Assembled 27 October 2008
jmcmurry@lemon $ zfs receive -u
invalid option 'u'

This could’ve been brought to my attention before I decided to create my ZFS export streams recursively. One at a time, I’d have been fine. What seemed like an awesome way to retrofit a metadata slice for an SVM volume into a whole-disk ZFS root install turned out to be a miserable trail of heartache and pain.

But I can take it; I am a Unix guy.

UPDATE: “-u” seems to be in Solaris 10 update 7, but even there, it’s not in the man page for zfs(1M). Thanks so much for updating the docs without indicating that the “recover your root storage” function only works on the OS release that you made available for download today.
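
For the record, on a release that does have the option, the recursive restore I was after would look something like this (hypothetical pool and snapshot names; -u just tells zfs receive not to mount the received filesystems):

# zfs send -R rpool@backup | zfs receive -u -F newpool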

/grrr

Moving a Solaris 10 zone

Today I wanted to move a Solaris 9 zone running on a Solaris 10 test server to a new, compression-enabled ZFS dataset within the same ZFS pool. This container is an archive of an environment we don’t use very much, normally leave shut down, and intend to delete fairly soon, but since it’s a flash archive of a Solaris 9 machine, it takes up a lot of space on the test system’s local disks.

First I created the new dataset:

# zfs create rpool/zones
# zfs set mountpoint=/zones rpool/zones
# zfs create rpool/zones/foo
# zfs set compression=on rpool/zones/foo
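
A quick sanity check that the new dataset picked up the settings (nothing exotic here):

# zfs get compression,mountpoint rpool/zones/foo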

I thought I should halt the zone, update the zonepath property for the zone, move the files to the new place, and start up the zone. Nope:

zonecfg:foo> set zonepath=/zones/foo
Zone foo already installed; set zonepath not allowed.

Great, now I’m going to have to search through a bunch of docs and maybe remove the zone and redo it all and why can’t they make this easy for me, argh.

Well, they did:

# zoneadm -z foo move /zones/foo
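
The whole sequence, then, assuming a zone named foo that isn’t running (zoneadm won’t move a running zone):

# zoneadm -z foo halt
# zoneadm -z foo move /zones/foo
# zoneadm -z foo boot
# zoneadm list -cv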

I like a lot of the changes in Solaris 10, especially the usage messages and man pages for things like zfs(1M). The commands tend to do the annoying things for you, and the man pages have lots of examples. Nice.

And hey:

# zfs get compressratio rpool/zones/foo
NAME             PROPERTY       VALUE  SOURCE
rpool/zones/foo  compressratio  1.44x  -