Rabbit holin’ January 13, 2011

Been a blustery packed few days away from penciling thoughts and goings-on, particularly due to the fact that I’m a rabbit-holer: i.e. if there’s s/thing substantial looming – it will suck up all of my attention and time-span until it’s finished, then I’ll surface for air and to see what’s about.

I just finished watching the memorial service in Tucson – which turned out to be an odd bit of cognitive dissonance. Outstanding and inspiring speech by the President, but preceded by a procession of speakers against a backdrop of what sounded like a crowd at a rock concert. The screaming and cheering were really jarring considering it was a memorial service, and as the service progressed I grew increasingly irritated. Eventually it dawned on me that it was on a university campus and the voices I was hearing were primarily students. Which made it a little less odd, though only a little. It at least gave me some sympathy for how difficult it would be to try and wrangle a stadium full of students to any sort of decorum.

But that’s not what’s been occupying my attention since Saturday – that honor went to doing a database merge, sort of. Due to a number of factors and detailia that would bore you to cutting yourself, I had to merge the xls exports of 2 different databases (one with 5000+ records, the other with 6300+ records) into one, including manually de-duping approx 500 duplicates, fixing bad data when/where possible, and accounting for any records that coexisted/overlapped between the two exports. All of this outside of an actual database – just within xls spreadsheets.  Yes. Really. Just trust me – that’s how it had to go down.

It’s a perfect example of what sorts of things lead to rabbit-holing for this particular rabbit. I set out thinking the merge should be reasonably straightforward, and that I could get the desired outcome (an all-inclusive mailing list) in an afternoon, into the evening at worst. It’s cute how hope springs eternal in me on these sorts of things – even if you don’t know me, just know that I should know better. A thousand times over.

First rabbit hole was discovering the format of names in one export was complete nonsense – all mushed into a single field rather than first, middle, last name, etc, AND inclusive occasionally (but not systematically) of Mrs/Mr/Ms, Jr/Sr, Dr/Honorable, star sign, favorite color pony, and Icelandic equivalent of the name. Maybe I’m exaggerating those last three, maybe not. So first pass was space-delimiting one field into 2, 3, 5, or 20, then MANUALLY moving first names into a single column (as those were scattered amongst columns 2, 3, 5, or 20 depending on the amount of other name stuff) for 6300+ records. If I could capitalize digits so it could look like I was shouting 6300+ (like I was shouting MANUALLY) I would, but attempts at capitalizing 6300 come out like ^#)) – and it just doesn’t have the same shouty impact. But that said – I feel I need to take that up with someone, the need for caps’ing digits. Or I could just adjust my font size, but where’s the fun in that.
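For the curious, here is roughly what that first pass could look like if a script did the grunt work instead of two humans and a Star Trek marathon. It is a minimal sketch only, not what I actually ran: the `TITLES` and `SUFFIXES` lists and the `split_name` helper are made up for illustration, and it assumes names arrive in "Title First Middle Last Suffix" order, which the real export most certainly did not guarantee.

```python
import re

# Assumed lists for illustration; the real data had far more variety.
TITLES = {"mr", "mrs", "ms", "dr", "hon", "honorable"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def split_name(raw):
    """Split one mushed-together name field into labelled parts."""
    # Break on whitespace and commas, dropping empty leftovers.
    tokens = [t for t in re.split(r"[\s,]+", raw.strip()) if t]
    parts = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}

    # Peel a title off the front and a suffix off the back, if present.
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        parts["title"] = tokens.pop(0)
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        parts["suffix"] = tokens.pop()

    # Whatever is left: first token is the first name, last token the
    # last name, and anything in between gets lumped into "middle".
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()
    parts["middle"] = " ".join(tokens)
    return parts


print(split_name("Dr. Jane Q. Public Jr."))
```

Anything in "Last, First" order, or with a two-word last name, would still land in the wrong column and need eyeballing, so the MANUALLY part never fully goes away.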

Incidentally – I conscripted my sig to help with the whole monkeys-could-do-this process (trust me, if Dax could type with her little dog toes – I’d have conscripted her too). We each burned through data-ripping 3000+ records while watching Star Trek flicks on SyFy: there was Generations (had forgotten how much I both loved and hated that one), Nemesis (holy crap that was pure nonsense) and Star Trek V: The Final Frontier (odd-numbered Original Trek flicks always = bad news).

Nextly came 2 days of manually trawling through the merged 11k+ records for duplicates and overlaps, axing the former and removing but saving the latter for analysis purposes, plus fixing lots of bad data I came across along the way. All told – I spent about 20 hours on all the data munging. And it makes baby Jesus cry real tears to know that if only we were up and running on our new database platform that this all would’ve been like a dream within a dream (or a dweam wiffin a dweam if you will) – all kittens and puppies and unicorns.
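If you ever find yourself stuck doing the same thing, the duplicates-versus-overlaps sorting might be sketched like so. Big assumptions ahead: the `first`, `last` and `zip` field names and the `match_key` rule are hypothetical (my real matching was eyeballs and judgment calls), and exact-match keys will miss the near-duplicates that made up most of the pain.

```python
def match_key(rec):
    """Hypothetical matching rule: same first name, last name and zip."""
    return (
        rec.get("first", "").strip().lower(),
        rec.get("last", "").strip().lower(),
        rec.get("zip", "").strip(),
    )


def merge_exports(export_a, export_b):
    """Merge two lists of record dicts into one mailing list.

    Returns (merged, dupes, overlaps):
      merged   - the first record seen for each key
      dupes    - repeats of a key first seen in the same export (axed)
      overlaps - records whose key was first seen in the other export
                 (set aside for later analysis)
    """
    seen = {}  # key -> (source label, record)
    dupes, overlaps = [], []
    for source, records in (("a", export_a), ("b", export_b)):
        for rec in records:
            key = match_key(rec)
            if key not in seen:
                seen[key] = (source, rec)
            elif seen[key][0] == source:
                dupes.append(rec)
            else:
                overlaps.append(rec)
    merged = [rec for _, rec in seen.values()]
    return merged, dupes, overlaps
```

The merged list is the mailing list, the dupes go in the bin, and the overlaps pile is what gets saved for the analysis afterwards.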

But let ye not be led astray by all my kvetching to think that I have a mad or sad about it – I actually grew oddly fixated on the whole process – the sheer loveliness of mass quantities of data, all the data manipulation possible. It was like being in a sandbox where I could just grab data by the handfuls and let it run through my fingers, build little castles and cities and roads in it, throw it at whoever’s in the sandbox with me and get it all stuck in their hair in that way that sand does. So yeah – it was oddly enjoyable and therapeutic and…fun, despite all of the vexations and bad-movies-until-2am endured to get it done.

So that’s where I’ve been the last few days, which blotted out the sun of my attention on all other things – including blogging and reading for the most part. I managed to feed myself, the dogs, the chickens and my sig, go to work, go to the gym and that’s about it. Back from that rabbit warren, for now anyway.



1. para selenic - January 13, 2011

Ahhh! You and I have database hell to share! I will have to show you the utter crappitude project of account tracking that I am in– but it is nice when every little square peg gets its own little square hole– something VERY satisfying about making it all fit right. Although, I think that there may be an easier way for me to do it with a csv file, but i haven’t figured it out yet… outside of simple copy-paste-clean up, hrmf. perhaps I can bribe you with furious for ideas?

therealtinlizzy - January 16, 2011

srsly – I want to see/know what db nonsense you’ve been wrangling! And absolutely my knowledge – such as it is – is at your disposal, no Furious needed! But of course any offer of Furious would never be turned down…Seems like a meetup for magic and commiserating is in order this week! and I should be able to take a look at your laptop tomorrow!

