Efficient Querying == Speed
Monday, July 15, 2002 @ 19.20 CDT

One thing that always amazes me about writing programs that deal with databases is how profoundly a suboptimal approach to querying can affect performance.

There's an aggregation process in a project I'm working on that does a few standard calculations like min(), max(), and stddev() on a day's worth of performance data, grouped in clumps. It also does things like median and a couple of percentile rankings. Nothing elaborate, but that last set of calculations had to be written in PL/SQL because there are no built-in functions for that on our Oracle RDBMS version. (As in, "Not 8i.")

The first version of this took three hours to run sometimes, and would often fail. I did some quick optimizing one day and it turned into roughly 20 minutes, and it would fail only rarely. Not great, but certainly better.

Today, I rewrote a stored procedure and added another one. Now, it takes roughly 35 seconds. For the same results. It's a lot simpler, too. Basically, rather than doing each custom calculation independently, it now does all the custom calculations on one pass through each clump of data.

If you're keeping score, the new thing runs 30,857% faster than the old thing.

(I might be off with that number, but if so, Donnie will correct me, because he enjoys proofreading web sites for factual errors. Hi Donnie...)

Sometimes efficiency in programming isn't a big deal. Computers are fast and memory is cheap. But if you have a database system in your project (which you usually do,) that's a great place to look when you know it works and now it needs to work faster.

This is a surprise to no one, really. Just a note-to-self to remind me not to write (or allow) inefficient querying in anything important.

All the news