Here I sit, at 12:25am, in the lobby of the New Orleans airport waiting until 6:00am so I can go home. “Why?”, you ask. Well, I’ll tell you. If for no other reason than to look busy so that the nice police officers stop looking at me like I’m some kind of vagrant. It all started a couple of weeks ago, when I noticed that the chuckle-head who was the Network Admin before me had not gotten all the servers onto the latest set of service packs. When I asked my boss about this, he was somewhat surprised, since he’d been told that everything was up to date. Not hardly. In fact, I had servers that went back at least two revisions! So, two weeks ago I prepared to get all the servers updated. Behold, my true troubles begin! I try to get the patches copied to all the servers and discover that it takes literally *days* to copy the files! That’s right, the patches are so big, and our bandwidth is so small, that it takes almost five days to copy the Novell patches to all the servers. But, wait! They didn’t really copy to all the servers!! Yikes! So, the re-copying begins. Now, this is when it starts to get really fun. You see, in most of the cases, the reason the Novell Service Pack didn’t copy is because there just wasn’t enough room on the main volume. (That’s SYS: for all you Novell CNEs out there. Not that there’s many of us left. Certified Netware Engineers are a dying breed, I’m afraid.) Well, I try the obvious solution and recopy everything to one of the other volumes (that’s a “disk drive” to you non-network engineer geeks). At first, that seems to work well, but, on closer examination, I discover that one of the servers has still refused to copy everything. It turns out that this server lost its wide area network connection. *sigh*
Okay, so I try copying the patches yet again, but this time I send them in a compressed format. So far, so good. While all that copying was going on, I started installing the patches on the other servers. Again, most of them worked just fine, but those few with teeny, tiny SYS: volumes didn’t all work. So, at this point, I’ve ended up with four that don’t quite have enough room on the main volume to install patches and one that finally has the patches *copied*, but not installed. This is where the problems that actually led me to the New Orleans airport began. We recently purchased a nifty utility called ServerMagic from PowerQuest. This darn program is the best thing since sliced bread! Among other things, it lets you resize Netware volumes without having to destroy them first! Totally cool, and totally impossible before this tool. So, never having used this before, I installed it on one of the remote servers. I ran it and it asked to reboot the server. (GULP!) I took a deep breath and told the program, “okay”. Bang! The server goes away. Poof! It never comes back up. *sigh* Well, that was just the beginning. I let my boss know what was going on, and then we sat back to wait for Monday morning when that office discovered their crashed server. In the meantime, of course, I installed the patches on the rest of the servers. Or, at least, I *tried*. Some of these servers **still** didn’t have enough room to install all the patches!! Well, I figured that I’d eventually get the space problem worked out, and just left it alone. Oh, boy, was that a mistake! Monday, the first office called to get the server up and running. We got the first signs of life about 9:30am, with some errors, of course. About then, that office’s answering service let them know that I knew about the server and that they should call me.
*sigh* So, okay, we get this pretty well squared away, when the development department tells me about this new, totally redesigned piece of mission-critical software that *I* need to get rolled out. Being the kind of guy I am, I start getting this all configured to run, from scratch. Now, we’ve survived until Tuesday morning without a tragedy.
Well, about the time that I start to work on finishing the big, new, shiny software rollout, another site calls to tell me that their server is down. “How did that happen?”, I wonder. No time to worry about that, though, because we’ve got two whole companies down while this one fileserver is off-line. So, like I always do, I started trying to figure out what happened so that I can undo it. It turns out that they took it upon themselves to reboot the server because there were users having problems logging in. Aha! The new patches had partially loaded, because they were partially installed, and some of the new files didn’t like some of the old files. Blamo! One crashed fileserver. Well, I banged away at it for most of the afternoon, one way or another. I tried loading files from backup directories, and copying files from the backup directories, neither of which worked. Then I tried to use ServerMagic to resize volumes so that I could come up with the extra space I needed to install the patches. Well, that *almost* worked. Apparently, the old disk drive just couldn’t quite handle the new utility and just decided to stop working.
By this time, it’s after 4:30pm and I’ve already figured out that I’m flying out to the remote site. Shortly after I figure that out, my boss confirms it for me. We agree to try a couple more things, but not past 5:30pm. So, now, it’s do-or-die time. Let’s just say that I didn’t “do”. My boss helps me get the travel arranged, and we decide that a day trip should be enough. I gather my tools together and scare up a 4Gig external drive that I can add on to the ailing server. Then, I got the brilliant idea to burn a CD-ROM that has the patches on it. After all, since I have to add disk space anyway, I might as well install the patches while I’m there! That was easier said than done. *sigh* I finally downloaded the required files to my PC and used my own CDR to make what I needed. It took over 6 hours to get the complete download. I got out of the house with just enough time to make the first flight from IAH to MSY. That’s at 6:50am, just in case you’re interested. It gets into New Orleans at about 7:00am. I then spent almost 45 minutes getting from the airport into my rental car. After that, I was ready to drive more than an hour to get to our site in Houma, which I finally did after a few wrong turns.
Finally, I was ready to start working on the server at about 10:30am. It took all of about 15 minutes to get the extra drive installed, configured and running. The fileserver itself took a little longer. I got enough of the new drive allocated to the SYS: volume and then I got the new patches installed from my CD. So far, so good. In fact, at this point I was planning to try and catch an earlier flight home than the 5:20pm flight I was scheduled to fly back on. Oh, well, maybe next time. I rebooted the server, so that the new patches were activated, but started getting errors right away. Apparently, the USR volume, where most of the actual data existed, was damaged. Damn! So, I try every repair utility that I can think of, without any improvement. Eventually, we had to delete most of the data because it was too badly damaged to actually use. Unfortunately, that volume also stored the GroupWise e-mail system data. Whoops! So, now, it was about 2:00pm and I was wolfing down an oyster poboy while trying to get the USR volume back online, when it hit me to try and use ServerMagic to fix the problem. Well, that *almost* worked, but it recovered the drive at the expense of the data it contained. Double damn! We don’t back up the GroupWise data for legal reasons, so now we can’t restore it. But, wait! The local remote admin ran a backup last month that accidentally included the e-mail directories. Hooray!! Well, to shorten a *very* long story, we managed to recover and rebuild the mail databases so at least they’ll have user accounts and mail going back a couple of weeks. Not ideal, but better than nothing.
But, how did I end up missing my flight? Well, to prove that we give the best customer service, I stayed while the Arcserve restores ran. (Okay, it was more like they ran, then got totally screwed up, then we deleted them and recreated the jobs and *then* they ran.) And, the next thing you know, it’s 9:30pm and I’m running to try and make a 10:30pm flight. Obviously, I missed it. *sigh* On a more positive note, I do have a flight booked for 6:00am. Of course, that means that I’ll have just enough time to shower and change before I head back into work so that I can get the super-duper software rollout working. Blech! I plan to get the most out of the damn company picnic on Sunday!