This last week I have had my head buried in a project to merge several databases into a few common ones. One of my clients is finding that a lot of smaller companies are combining their presence for various reasons and with it comes the need to merge what were previously very separate installations. Blah, blah, that is the boring stuff. What it means is that I have to do a bunch of things to make it work. These programs have a long history and as you can imagine with the “evolving” nature of vertical market software they also have a lot of baggage in their implementation. From what I have seen this is just a fact of life. It sure does make for some interesting jobs though!

If I get a chance I might try to blog about some of the other techniques I learnt while working on this project. Lots of SQL manipulations that are pretty cool, lots of mistakes and back tracks and lots of things I just wish there was a tool that could do the job for me. First though, this is something that I hacked together using powershell to rename a directory full of word documents that had a unique identifier embedded in the name.

The Situation

Two databases were being combined, Client records in database A would remain as they were. All Clients from Database B were copied into database A and received a newly generated SysID (Identity column) when they were inserted. The original SysID was stored in a column of the Clients table (ReferralSysID) for later reference. It could have been a temp table but this old ID was going to be needed for other things so it was decided to put it in the table itself.

The Problem

There were loads of issues of course but this particular one was about these pesky client note files. Word documents stored in the file system, file names based on this pattern:

<SysID>_<Date>_<Time>.doc

Of course the filename was also stored in the ClientNotes table in the database. The file names needed to be updated to reflect the newly generated SysID mentioned above.

9_070809_130914.doc –> 5409_070809_130914.doc

Over several thousand files of course…

The Solution

Powershell to the rescue!

Of course they are storing image files in the Word Docs so directory zipped up was half a GB in size. Copying it to my development machine was not an option.

Getting some test data

  1. Select all files in Windows Explorer
  2. Right-Click and select “Copy as path”
  3. Remove the root path so there are just filenames and save as “notes.csv”
    e.g. 
  4. Use Powershell to create a bunch of empty files!

Rename the Files

It turns out that grabbing some data from SQL Server using PowerShell is also pretty easy. So the idea is to connect to the database, read the old ReferallSysID match it to the correct filename and then rename the file to the new SysID. Done!

Do this in a temp directory of course so you don’t get a recursion problem or collisions or something. A little bit of RegEx thrown in to make sure to only match on the beginning of the filename. A quick and dirty DataTable, shuffle the files around a bit and everything is sorted.

Here is the complete script:

Oh, and don’t forget to rename the filename stored in the database. I did this with an SQL script in SSMS when I was converting the rest of the data.

That bit was kind of fun too, hacky SQL string manipulation to follow:

These kinds of jobs are great for learning new things. Next step is to package all this up for the next time someone requests a “one off” database merge 😀

-brahn