Blog Merge
I finally decided that I didn’t need two blogs, so I merged them into one. Blogs do seem to naturally divide into several types, one being the thoughtful occasional post blog, and another being the short blurb with a link type. So, I originally decided to have two different blogs, Tom’s Tech Toys for short links, and Oxyfish for longer items. But managing two blogs is just silly, especially since I don’t post that much anyway. So I merged the other one into this one. Read on if you want technical details.
This turned out not to be quite as easy as I had hoped. The first major problem was that I had a huge amount of comment spam on Tom’s Tech Toys, since it was running Movable Type v2.6. There were just enough comments I wanted to keep around that I didn’t want to blow away all the comments, so I spent some painstaking time combing through and deleting the spam. Since some posts had over 1000 spam comments, and I needed to check each box for comments to delete, I combed the net for a bookmarklet that would “check all” boxes on the page – and I found one. Unfortunately, the ones with over 900 comments wouldn’t work because the request-uri was too long – I guess MT uses a GET method. Those I had to manually remove from the export file.
Exporting was easy, but I still needed to do the manual cleanup, and I’m on a Windows box, so I went looking for my favorite tool, vi. I have used Cygwin in the past, but this time I just wanted a quick and dirty answer, so I found WinVI, which worked great.
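The manual cleanup could probably be scripted too. Here is a minimal sketch, not what I actually ran. It assumes the export follows the usual Movable Type layout, where each comment is a section that starts with a COMMENT: line and ends with a ----- line, and it treats a comment as spam when it matches a keyword pattern. The function name strip_spam_comments, the keyword list, and the sample data are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical spam test: this keyword list is only an illustration.
my $SPAM_RE = qr/\b(?:casino|poker|viagra)\b/i;

# Remove every COMMENT: section (up to its ----- terminator) that matches
# $spam_re. Everything else in the export is left untouched.
sub strip_spam_comments {
    my ($export, $spam_re) = @_;
    $export =~ s{^(COMMENT:\n.*?^-----\n)}{
        my $comment = $1;                  # copy before matching again
        $comment =~ $spam_re ? '' : $comment;
    }msge;
    return $export;
}

# Made-up sample in the assumed export layout.
my $sample = <<'END';
TITLE: Example post
-----
BODY:
Hello.
-----
COMMENT:
AUTHOR: Alice
DATE: 01/01/2005 10:00:00 AM
Nice post!
-----
COMMENT:
AUTHOR: Bob
DATE: 01/02/2005 11:00:00 AM
Cheap online casino here
-----
--------
END

print strip_spam_comments($sample, $SPAM_RE);
```

On the sample, Alice’s comment survives and Bob’s is dropped. A keyword test like this will miss some spam and may catch a real comment, so the output would still deserve a quick look before importing.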
Importing into my WordPress setup was also easy, but then I ran into the next snag: I had used Markdown for a bunch of posts in my old blog. My WordPress didn’t seem to have Markdown, so I started down the route of trying to install Markdown, or Text Control (used the patched version). They installed great, but I found it too tedious to go back and update the text control for 200+ posts.
Besides, I didn’t really want to rely on Markdown anyway. What I really wanted was to de-Markdown my posts entirely. So, I wrote a quick and dirty Perl script to do it:
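#!/usr/bin/perl
use Markdown;
undef $/; # slurp whole file
$entries = <>;
# Run each BODY section, up to the ----- line that ends it, through Markdown.
# The ^ keeps a longer dashed line inside a post from ending the match early.
$entries =~ s/(BODY:\n)(.*?)(^-----$)/$1.Markdown::Markdown($2).$3/msge;
print $entries;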
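To see what the substitution does without needing Markdown.pm, here is a small self-contained sketch of the same technique. The fake_markdown function is a stand-in I made up (it only handles *emphasis* and wraps the text in a paragraph tag). It is not the real Markdown converter, and the sample entry is invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for Markdown::Markdown (an assumption, so this runs anywhere):
# converts *emphasis* and wraps the text in a single <p> element.
sub fake_markdown {
    my ($text) = @_;
    $text =~ s/\*(.+?)\*/<em>$1<\/em>/g;
    $text =~ s/\s+\z//;                    # trim trailing whitespace
    return "<p>$text</p>\n";
}

# Rewrite every BODY section of an export string in place.
sub convert_bodies {
    my ($export) = @_;
    # /m: ^ and $ match at line boundaries   /s: . also matches newlines
    # /g: every BODY section                 /e: replacement is Perl code
    $export =~ s{^(BODY:\n)(.*?)(^-----$)}{
        my ($head, $body, $tail) = ($1, $2, $3);
        $head . fake_markdown($body) . $tail;
    }msge;
    return $export;
}

# Made-up one-entry export.
my $sample = "TITLE: Test\n-----\nBODY:\nSome *emphasized* text.\n-----\n--------\n";
print convert_bodies($sample);
```

Only the text between BODY: and the next ----- line is touched, so titles, dates and comments pass through unchanged.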
I renamed Markdown.pl to Markdown.pm, and deleted all the setup junk for MT and BBEdit to get it to work. This was beautiful! Then I re-imported, and found out that WordPress doesn’t overwrite posts; it just ignores ones it considers dupes. Ugh! How was I going to delete all those posts I had imported incorrectly?
Fortunately, I noticed that the posts had sequential IDs based on when they were added. So, a quick “delete from wp_posts where ID>50;” later, I reimported and was in business. A little category consolidation, and some other cleanup, and things are looking pretty good. If you notice any strangeness in any posts, please let me know in the comments.
One last thing to do: redirect all links from the old blog to this one. That could be tricky, but I’ll work on it.
December 15th, 2007 at 5:59 am
very interesting, but I don’t agree with you
Idetrorce
March 12th, 2008 at 11:19 pm
Cool hack. I like Markdown, but I can see not wanting to be dependent on it in all situations.