New Features here at MR&HBI!

First, allow me to call your attention to the episode immediately before this one. You might notice the little icon is a camera. “huh,” you might be saying to yourself, “I don’t remember seeing that one before.” Very observant, Buckaroo! It’s for a new category, Photography, that I added. “But,” the even more observant amongst you might say, “There are already a handful of episodes in that category.” Right again, Wisenstein! I recategorized a couple episodes that were under The Great Adventure and found a couple in Idle Chit-Chat that were better filed under the new category. I expect there are plenty more; the trick is finding them.

The icon is actually my camera sitting on an opened unabridged dictionary. That may seem staged, but that’s actually where we keep the camera these days. Yes, we have an unabridged dictionary open on a stand at all times. No, that does not make us geeks.

Second, way down at the bottom of the sidebar, there’s a section called Other Muddled Stats (or something like that). That’s a wordpress widget I made that counts all the words in all the episodes, and keeps a tally of how many comments there have been as well. I plan to add other stats as well next time I have the hood open. Perhaps the number of times I’ve said “You don’t have to thank me,” or the number of times I’ve blamed the Chinese for things. (Hm… haven’t done that in a while…) Anything you’d like to know? The number of letters typed? Words in comments? Most prolific commenters? If it’s on these pages, I can count it.

The WordPress plugin itself is hand-crafted by yours truly. I started by downloading a different word-counting plugin, but it counted the words on every page load and didn’t have a sidebar widget. All it was was a database query and a loop. My version only counts when the relevant value changes – it only counts words when a new episode is posted, for instance. Once I tidy it up I’ll be adding it to the WordPress repository, so others can also gather useless stats about their blogs. It’s all about sharing the love.

The Book Review that Wasn’t

Last night I wrote a review of a book. I was pretty pleased with the results. It actually talked about the book for a while. This morning I tweaked it a bit and hit post.

It vanished.

Well, mostly vanished. The title was there, as was the little blurb at the top. Everything else was gone. “Poop!” I said (or something like that).

I use ecto to compose my larger blog episodes; the offline editor is much nicer than any in-browser editor I’ve encountered, especially on my 8-year-old laptop. I don’t call it Ol’ Pokey for nothing. Plus there are times I want to write an episode but the Internet is nowhere in sight. ecto has been working very well for me. Except when it loses my work. This is the second time, but somehow this one hurt more. Also, ecto was recently bought from the original developer and seems to be stagnating.

“Looks like it’s time to give MarsEdit a serious look,” I said, and downloaded the latest. I fired it up and was greeted with “Your trial period has expired.” Dang. I’d launched it once when comparing ecto and MarsEdit back in the day. MarsEdit was missing a particular feature (don’t remember exactly what) and that made ecto the winner. Before it started losing my work.

Lots of people like MarsEdit (lots of people like ecto, too), but am I willing to pay for it without writing a single episode with it? That’s hard to justify. I’m downloading a program called Qumana to rewrite the book review with. We’ll see how that goes.

Edited to add: Nope. Qumana didn’t work. At all. I checked the system requirements, and it should work. But it doesn’t.

New Sidebar Feature – Tag Cloud (sort of)

Most blog systems support tags these days. Put simply, tags are just words that can be used to create informal groups of posts. Tags aren’t as rigidly defined as categories, and so a ramble that covers many topics can have many tags. The purpose of the tags is to allow folks like you to find similar stuff. Since moving to WordPress I’ve started to pay more attention to tags, and at the bottom of each episode you can find a link or three to episodes with similar tags. It’s kind of cool, and it’s search-engine friendly.

Now I have added a widget to the sidebar that provides a ‘tag cloud’ — a list of the tags with the most-used tags in larger font. (I think this is a misuse of ‘cloud’, which in this context is also supposed to show relationships. A true cloud would group tags by how often they are used together.) There are much fancier tag cloud widgets out there, but I was starting to spend way too much time investigating the options. I settled on a nice, simple, colorful widget which is over there now. It’s called “ILW Colorful Tag Cloud” (or something like that). There are a few aesthetic tweaks I’d like to make, like condensing the text, but that shouldn’t be too much trouble.

The widgit’s all right, but the colors are arbitrarily set by me. It would be cool if the colors actually meant something. Since the number of times a tag is used is already represented in the font size, color could be used to show relationships or (better yet) indicate how many times a tag has been clicked. That way the tags more people found interesting would be highlighted.

Another minor problem with the tag cloud as it stands is that most of the 1200 episodes I created with my old blog system have no tags. I’ve gone back to retrofit tags on a few obvious ones, but overall most of this blog is untagged.

But no, not today. No widget modifications, and no more tag retrofitting. I’ve already spent far too much time on this silly feature.

Programming Note

I’ve put in a new anti-spam layer in the comments. It’s supposed to nip spam in the bud before it even reaches the spam-catcher I already have in place. Almost no spam has been getting through to your eyes, but behind the scenes the comments have been building up, and this should simplify administration of the site. In addition the new spam layer helps prevent robots from scraping email addresses off the site and other antisocial behavior (not that I will depend on that stuff). The name of the Plugin is “Bad Behavior”, for those who might want to try it out.

The system uses a variety of techniques that are supposed to be completely invisible to you, but please let me know if you have any trouble leaving comments. My email address is addy

Programming Note: Polls!

I was trying to decide how to spell a word I coined while talking about the preposterometer, and I decided to turn to you, the viewing audience, to get your thoughts on the matter. That required that I finally get polls working.

In fact, there are two polls going on right now, one about spelling and one about how the polls themselves should operate. The widget instructions indicated that I’d be able to show both at once, but the setting just doesn’t seem to be there. Instead, you get one or the other randomly.

While I had the hood up I made the sidebar headers stronger, to help people sort through the long list of stuff over there. It’s definitely more useful, but somehow it doesn’t seem right to me yet. Also, I set up so I can highlight parts of the sidebar with new stuff going on (the current color will not be my final choice, I think). Let me know what you think about any of the changes!

Programming Note: Sweetness

Sometimes I write an episode that I’m particularly pleased with, only to have it greeted by the sound of crickets chirping. It’s possible that while people enjoyed reading it, they didn’t have anything to add afterward, so there are no comments. That’s what I tell myself, anyway. Soon we’ll put that assertion to the test. The results may prove depressing, but I am experimenting with a feature that will allow readers to say “I liked that episode” without actually leaving a comment.

There are definitely some aesthetic issues to resolve, but there is now an option to vote on episodes you like. It’s not a big deal, just a way for you to say, “thanks, Jer, for sharing your genius with us on the topic of the proper way to belch after a meal.” Or whatever world-shaking topic I’ve chosen to tackle in a particular episode. Don’t be shy out there, if you like lots of episodes, feel free to shower me with kudos! Really!

After I get the episode-voting in, I intend to add a similar system for comments, so when someone leaves a particularly good comment the rest of the blogcomm can clap. If no one ever votes for any of my episodes, I will cry silent bitter tears and remove the feature.

Quest for the Perfect Moon Widget

You may have noticed that as of this moment there are three different moon phase widgets over on the sidebar. None of them are perfect, alas (although the Japanese one is perfectly inscrutable). I looked around at other WordPress widgets and did not find one that gave out all the information I was interested in (especially for the eclipse) and was aesthetically pleasing. I thought I might spend a few hours and make my own.

The design was very simple. I would write a little Flash thingie that read XML data from a server and draw the moon with great precision and also look nice doing it. In addition I could put numerical readouts for more interesting (to me) numbers. Piece of cake.

I started my quest looking for a server with current moon info. The US Naval Observatory has all sorts of lunar data available, presumably calculated with far greater precision that I will ever need. The only problem is, they didn’t have data for right now. They had almanac generators and whatnot, but nothing that I could ping and get back a message that said, “at this moment, the moon is…” I couldn’t find anything at NASA, either. I broadened my search and found that nobody seems to be providing this service. “fine, then,” I thought. “I’ll make my own moon server. I’m sure there are plenty of places I can find algorithms for calculating this stuff.”

Only, that didn’t turn out to be so simple, either. The motion of the moon is incredibly complex. There exists a thing called ELP 2000-85 which is the latest attempt to make the math match what the moon actually does. What the thing does is loop through a set of calculations a bazillion times, each time with tweaked coefficients that make smaller and smaller corrections to the calculation. Compiling the tables of coefficients must have been a real pain in the butt. Refining the tables is still ongoing. The accuracy of your calculation comes down to how many times you loop through the coefficients before you decide that the computer power is better used for something else.

Nobody in their right mind would actually use all the tweaks in the ELP 2000 for anything as simple as a moon phase widget, or, for that matter, a moon landing. Along came a guy named Jean Meeus, who published a book full of handy formulas for calculating where things are going to be. He includes simplifications of the ELP 2000 (only looping through 64 iterations), and while they’re not as precise, they’re pretty damn good. I don’t have that book, either.

Time wasted so far: 3 hours. Completion of widget: 0%

But now my search began to bear fruit. I didn’t have Meeus’ formulas, but other people did, and had written software. I found some open-source code that implemented some of his stuff. Yay! I implemented the code, moving it from c to PHP so I could run it on my server. After a few routine hitches the code was up and running and telling me just where the moon was, relative to the Earth, accurate to a couple of arcseconds.

Time wasted so far: 6 hours. Completion of widget: 5%

Unfortunately, it didn’t tell me anything else. This particular code did not provide any information that required data about the sun — like, say, the phase of the moon. Harrumph. Back to the Internet I went. Fairly quickly I found some different code, this time in JavaScript, that also cited Meeus. It was much, much, simpler, ignoring many of the more difficult-to-calculate corrections, but I figured that the first code sample had already done most of that. It was simply a matter of adding the new code to what I already had. Naturally, despite having the same source reference, all the variable names were completely different.

After a great deal of forensics (that’s a big word for ‘wasted time’) I established which quantities I had accurate versions of and which I still needed to calculate. I got everything set up and ran some tests. The results were not good.

Time wasted so far: 12 hours. Completion of widget: 3%

I had expected some problems like this – perhaps in one body of code an angle was expressed in degrees and the other expected radians. Things like that. I started working through things. Only after another day of head-scratching did I test the code I’d based the second half of my project on. It was wrong. So there I was with Frankenstein’s monster of code sewn together from different sources, and one of the sources was broken before I even started. Sigh. Back to the drawing board.

Time wasted so far: 20 hours. Completion of widget: 2%

I should mention along in here somewhere that there are people who sell moon software for quite a bit of money. My little server could potentially put a dent in their sales by bringing accurate calculations to anyone who asks, but its not really the calculations they are selling, but the application around it. I’m not too worried for them.

Back to the Web and by now I was getting better searches because I knew the key terms to look for. I found two more code examples, both of which take precision to the most extreme available. One is a complete implementation of the ELP 2000-82b. This honey consists of 36 files with tables with hundreds of rows of numbers, and a sample program in Fortran that shows how to use them. For ridiculously accurate calculations, I couldn’t do much better. But… It only calculates the position of the moon, just like the first code I implemented. I’d still need to work out the phases and whatnot.

The other code I found is based on earlier math, but really concentrates on what an observer would see from a given point on the Earth. It includes corrections for the optical effects of the atmosphere and for the friggin’ speed of light. It’s got a lot of stuff I don’t need (other planets, for instance), but it has everything I’d be looking for. The thing is, the code is horrible. It’s in c, and the writer apparently never heard of parameters or returning values. Or structs, or anything else that might help organize the information. It is impossible to read a function and know what it does or where all the numbers it uses come from. It would be a big task to translate the pieces I need, mainly because it’s very difficult to tell which pieces I need. Still, it’s an option.

Time wasted so far: 24 hours. Completion of widget: 3%

And that’s where I stand. You know, maybe I’ll wait until I’m on a boat full of moon geeks. I bet one of them even knows a Web site that gives current moon data.

1

Please don’t adjust your set…

Just playing around with background images. You know, for something memorable. Branding. Something that people will look at and say “Now that’s Muddled Ramblings and Half-baked Ideas!” Or, failing that, “Aaaaaaaaaaaaaaaaaargggggggghhhhhh!”

Although my sweetie likes this background, I expect it will be temporary. Let me know what you think!

Upgrading the Search Function

The other day I wondered how many times I’d used the phrase “You don’t have to thank me” in this blog. No problem, I thought, I’d just pop the phrase into the search feature over on the sidebar and let it tell me.

The only problem was, it didn’t give a very good answer. It also included partial matches, which would have been all right if it had either a) ranked the results, or b) shown a little excerpt of the resulting matches with the searched-upon words emphasized. The built-in WordPress search function does neither. Off I went to find alternatives.

One option was to hook up Google to do the search. That’s a pretty good option from a functional standpoint; nobody is as good at ranking results and showing you a bit to help you with your decision. The downside is that it’s pretty ugly. My (very) brief search made it appear that I wouldn’t be able to do much with the results. My search for Google-based solutions was brief because I found another WordPress plugin called “Better Search” which did in fact return ranked answers. Hooray!

Only, not so fast, Sparky. The plugin is still young, and doesn’t provide much in the way of customizing the look of the results, either. The good news was that the source code is right there and I thought it wouldn’t be too tough to rearrange things a bit to make it much easier to customize. The plugin author had already done the mysterious, magical steps to allow a template file to work, all that was left was giving the template the power. So I did that, and sharpened up some PHP skills while I was at it. Now if you do a search, you will see that the results include a relevance ranking. The result page is still pretty ugly, because I haven’t finished tweaking my new template for my site. (I tried to start with a general one that would be useful to others.)

Then, I typed “You don’t have to thank me” into the search box and got… No matches found. What? I know I’ve used that phrase before. I tried removing the word with the apostrophe, in case that had something to do with it. Nope. Eventually I got down to the word “thank”. No matches.

Here’s the thing: MySQL, the database I use, has built this fancy full-text matching thing (which I learned an awful lot about yesterday), but they’ve optimized it for huge sites. There is a list of common words they throw out to reduce the number of matches. Six of those words are “you”, “don’t”, “have”, “to”, “thank”, and “me”. Wow. To make things worse, I can’t change the list. Only the big boys who have their own servers can control the list. Those are the ones least likely to want to change the list, but there you go.

There were some other annoying “features” of the MySQL Full-Text search (exact phrase matching doesn’t work like you’d think, for instance), but some of those I suspect are the result of my provider using an older version of the database.

Now, I can put up with the limits of MySQL (this morning i was coding in my head the algorithm for showing an excerpt with emphasis), or shift focus and let the Goog or it’s new arch-rival bing do the heavy lifting – and the formatting. Why can’t this stuff just be easy?

Edited to Add: Well, that blog episode went obsolete in a hurry. I’m currently using a Google sidebar thingie that is visually acceptable (and adaptable). Play around with it!

There is a feature of the Better Search plugin I was using that I will miss – it kept track of recent searches and produced one-click links in a cloud that showed popularity. I guess it’s not a major loss, since not that many people search here, but I liked it.

1

Blogs and Bloggers

A Facebook friend of mine posted a link to a NY Times article about the high failure rate of blogs. I couldn’t read the article without registering (so I didn’t), but that won’t stop me from commenting on it! You don’t have to thank me; it’s what I do.

As I pondered the short life span of the typical blog, I decided that bloggers fall into a few categories, and failure can (usually) be predicted just by identifying what class the blogger is a member of:

  1. People with nothing to say. Unscientifically, I’d say this is the vast majority of blogs. Many of these blogs might better be described as journals; the content is really meant for the consumption of the writer, not any audience. After a few weeks, anecdotes about the crazy antics of Fluffy the cat get old. After a few months these stories get old even for the blogger and he quits. Some people have a treasury of a few really good stories, and those will keep them going for a while, but when the well runs dry the blog fades away.
  2. People who lack the skill to say what they want. I suspect that this group is fairly small, as most people who lack language skills probably don’t start blogging in the first place. The exceptions to this rule, I suspect (having done no research) are found in sport blogs and political blogs, where passionately held beliefs are undermined by the complete inability of the writer to express himself.
  3. Interesting, articulate people with unrealistic expectations. When the blog doesn’t become famous overnight and the blogger realizes she must devote time to it almost every day for months for it to have even a remote chance of catching on, they quit.
  4. Interesting, articulate people who embrace the medium and do it for the pleasure of doing it. They produce what we in the industry call “good blogs.”
  5. People who, despite having traits from categories 1-3, continue to blog, rehashing old material and catering to a microscopic audience. Even as readership remains constant for several years these writers delude themselves into thinking that their blog sucks less than most blogs.

On a purely unrelated note, as Muddled Ramblings and Half-Baked Ideas celebrates its fifth year of contributing to the noise of the blogosphere, the MuddledRamblings.com business cards I designed say at the bottom, “Sucks less than most blogs!”

Fundamentally, I think most bloggers want to be read. A growing audience is the payoff — more people reading, more people commenting, lively discussions triggered by the words of the blogger. It seems obvious, but when it comes right down to it, most blogs are not read. Personally, I don’t read that many blogs, and comment on fewer still. There are just too many of the damn things. Blogs that don’t produce consistently excellent posts, with some thematic connection between posts, are not going to grow big audiences. (The exception to this is the celebrity blog, where people read just for the name.) I’d have a much better chance at a large readership if I wrote a blog strictly about software engineering on a particular platform rather than just posting whatever drivel pops into my head.

I just like writing drivel, is all.

1

New “About” Page is up.

My “About Jerry Seeger” page is now a long, rambling muddle, as befits the rest of the blog. Let me know what you think! Any mysterious incidents from the past I neglected to distort mention?

Blogging Without a Net

I woke up early this morning, which is even more surprising than usual because apparently this weekend everyone around here decided to set their clocks ahead an hour. Many mornings I wake up to the sound of my Yahoo! chat thingie announcing that it’s time for a little morning dialog with That Girl. Not this morning; my Internet has been down more than up lately and this morning it wasn’t even pretending to try to connect. Not a good sign. I thought maybe I was behind on paying, but I just got a bill and it’s not due yet. It mentions nothing of previous unpaid debts.

Out of habit I sat in front of my computer and stared at it for a little while. It seems my only morning ritual that doesn’t involve the Internet is making tea, which I drink while reading Web comics and checking for comments here. This morning I didn’t even make tea.

Sure, I could have picked up a book, or fired up Jer’s Novel Writer to do a little creative work of my own. I could have used my telephone to communicate with friends and contacts for “Moonlight Sonata”. I could have packed up my stuff, walked down the hill, and reacquainted myself with Café Fuzzy’s breakfast sandwich.

I did none of those things. Instead I fiddled with wires and the DSL box and whatnot, hoping for some magic combination that would bring the Internet back. I did not succeed. After two hours of alternating fiddling and staring blankly at the screen I had to admit to myself that Plan B was called for. A mere three hours after I woke up I stumbled into Little Café Near Home, just so I could tell you this little story.

The First MOH of the New Blog

In the past few months the pivotal role of the Millennial Office Holder has been lost, overwhelmed by all the other news and my own laziness to keep track of this stuff. But now I am re-energized, and on top of that traffic to the blog is way down (Google has not approved of the move), making this business easier to stay on top of.

[Brief writer’s note: there’s an arbitrary rule created by a bunch of hard-asses in the early 20th century that says not to end a sentence with a preposition. The rule is a load of crap designed by a bunch of old men hoping to maintain class distinctions by creating an artificial “high English”, and the above sentence is proof of their folly. What are you going to say? ‘On top of which the business is more easily stayed?’]

Anyway, It’s time to revive the MOH. Visitor 121,001 was a googler from Pennsylvania looking for the recipe for Kofola. Happily, I’m confident that whoever that person was will never succeed in reproducing the most original soft drink in the world. Dr. Pepper is a mainstream chump compared to Kofola. It’s the licorice in Kofola that gets me, I think. Not a big fan.

Speaking of the Goog, I’ve seen an interesting trend lately. There has been a big slowdown in over-easy egg seekers and a huge increase in ‘New York Sucks’ searches. Did something happen over there? (On that topic, while I stand by the core sentiments of my original rant, several people have written in the comments about how damn cool New York can be and I believe them, and there have been people who agree with me completely that just perpetuate the stereotype. Irony abounds.)

A Day of Design

I had other things I needed to do today, and the new blog sucked up WAY too much of my time. I’m working on making the new banner actually look cool, rather than merely function. It’s going… OK, I guess. I’ve already spent a long time trying to figure out colors, when I think the core problem is that the fonts just plain don’t work well together. The guest poem system is mostly done, but I don’t have it displaying the author pictures yet. There’s a bit of a problem there; For most of the poems there’s plenty of room for a picture, but there are a few poems that need a lot of space. I’ll work something out.

I also worked on the comment popup window over there. It’s not great, but it’s a heck of a lot better than it was.

Overall, what do you think? Still to come: sound effects (and a mute button), and a way to play “All for me grog!” I might sneak in a couple of other surprises, too.

Geekery: Transferring this blog from iBlog 2 to WordPress

Note: For those looking to move from iBlog 2 to wordpress, this article and some follow-up can be found at the iBlog survivors’ forum. The complete script is available there for download. You really don’t have to understand all this stuff.

I started using iBlog several years ago, when it was new and I was new to blogging. It had one advantage over other blogging packages: it came free with my .mac account back in the day and it worked on .mac servers, which are, to put it kindly, inflexible.

Two things have happened in the intervening years: first, all the blogging platforms have gotten much better, including the ability to work on the blog while offline. The second is that iBlog made an abortive step forward to iBlog 2, which was a major improvement, but then the whole company stalled before that release was really finished (although by then I was fully committed to it). I will miss iBlog 2, but not as much as I will enjoy getting my stuff onto a faster, more versatile platform.

After a rather exhaustive search of blogging and CMS systems, I settled on WordPress. While it’s not perfect, it is a straightforward MySQL-Apache-php application that is easy to fiddle with, and some of the customizations I was looking for were much easier with WordPress than with others.

WordPress has a whole bunch of tools and instructions for importing your stuff from other blog systems. None of those did me much good at all, however, as iBlog was too obscure for anyone to worry about. After searching the Internet I found some helpful information, but it all applied to iBlog 1 – most people never made the move to the ill-fated upgrade. I was pretty much on my own.

WordPress can import data in a variety of formats, but it was up to me to get the data out of iBlog in a format WrodPress could understand. The most versatile format was one created by the folks at WordPress, which could include information specific to WordPress. Cool! Decision made, I was on my way.

Except… the folks at WordPress have never bothered to document the structure of their files. Apparently It’s something they’ve been meaning to get around to eventually (though the people writing translation software for the other major blogging software have long since muddled through it). I did what everyone else has had to do to export data: copy one of WordPress’s files and fiddle with it until it works. Not only is this a pain in the patoot, there might be tags that don’t appear in my examples that could nonetheless be useful to me. Oh, well.

I needed my import file to include definitions of categories, and then each of the blog entries, with correct category associations. My example file had a lot of fields that seemed redundant for my purposes, but without documentation I wasn’t going to waste time trying to figure out which tags were required and which weren’t.

Here is a very small (one episode) export file. We’ll go into the details of things like nicename later:

<rss>
<channel>
    <title>Muddled Ramblings and Half-Baked Ideas</title>
    <link>http://jerssoftwarehut.com/muddled</link>
    <description>blog!</description>
    <pubDate>Thu, 28 Jun 2007 21:32:21 +0000</pubDate>
    <generator>Jers Very Clever Script</generator>
    <language>en</language>
    <wp:wxr_version>1.0</wp:wxr_version>
    <wp:base_site_url>http://jerssoftwarehut.com/muddled</wp:base_site_url>
    <wp:base_blog_url>http://jerssoftwarehut.com/muddled</wp:base_blog_url>
 
<wp:category>
    <wp:category_nicename>bars-of-the-world-tour</wp:category_nicename>
    <wp:category_parent></wp:category_parent>
    <wp:posts_private>0</wp:posts_private>
    <wp:links_private>0</wp:links_private>
    <wp:cat_name><![CDATA[Bars of the World Tour]]></wp:cat_name>
    <wp:category_description><![CDATA[blah blah blah]]></wp:category_description>
</wp:category>
 
<item>
    <title>Delayed by Weather</title>
    <link></link>
    <pubDate>2007-03-27 18:23:57</pubDate>
    <dc:creator><![CDATA[Jerry]]></dc:creator>
    <category><![CDATA[Bars of the World Tour]]></category>
    <category domain="category" nicename="bars-of-the-world-tour"><![CDATA[Bars of the World Tour]]></category>
    <content:encoded><![CDATA[<p>The Weather Channel is calling the roads around here "a big mess", so I'm going to take time out from driving and catch up on some writing. Unfortunately, TWC is also calling for dangerous surf and "rough bar conditions". I'd better leave the laptop in my room.</p>]]></content:encoded>
    <excerpt:encoded><![CDATA[&amp;nbsp;]]></excerpt:encoded>
    <wp:post_id>1065</wp:post_id>
    <wp:post_date>2007-03-27 18:23:57</wp:post_date>
    <wp:post_date_gmt>2007-03-27 18:23:57</wp:post_date_gmt>
    <wp:comment_status>open</wp:comment_status>
    <wp:ping_status>open</wp:ping_status>
    <wp:post_name>Delayed by Weather</wp:post_name>
    <wp:status>publish</wp:status>
    <wp:post_parent>0</wp:post_parent>
    <wp:post_type>post</wp:post_type>
</item>
 
</channel>
</rss>

But how to create the file? The data for iBlog 2 is distributed over (literally) thousands of files. Writing a program to track down all the information and make sense of it would be a major chore. That’s where AppleScript came in. iBlog’s programmer took the time to provide access to the iBlog data through the Apple Scripting system. I was able to let iBlog read all of its silly scattered files and make sense of them, then provide the data to me in a coherent fashion. So far, so good. All I needed to do was loop through all the episodes, pull out the data I needed, and shovel it into a text file that WordPress could read.

[IMPORTANT NOTE: I’ve tried to go back and reconstruct the scripts as they were at the appropriate stage in development, but the snippets are untested.]

[ALSO IMPORTANT: you don’t really have to understand the code. If you are in this boat, I will help you. You should understand the challenges, but I’m here for you.]

on run

set exportFile to 0

try

set exportFile to open for access “Users:JerryTi:Documents:scripts:” & niceName & “.xml” with write permission

set eof of exportFile to 0

tell application “iBlog” to set cats to the categories of the first blog

repeat with cat in cats

tell application “iBlog” to set catname to (the name of cat) as text

set niceName to the first word of catname

write rssHead to exportFile as «class utf8» — xml/rss header stuff that’s always the same

set catDescription to “blah blah blah”

write out the category info

tell application “iBlog” to set nextText to “<wp:category>” & newLine & tab & “<wp:category_nicename>” & niceName & “</wp:category_nicename>” & newLine & tab & “<wp:category_parent></wp:category_parent>” & newLine & tab & “<wp:posts_private>0</wp:posts_private>” & newLine & tab & “<wp:links_private>0</wp:links_private>” & newLine & tab & “<wp:cat_name><![CDATA[” & catname & “]]></wp:cat_name>” & newLine & tab & “<wp:category_description><![CDATA[” & catDescription & “]]></wp:category_description>” & newLine & “</wp:category>” & newLine & newLine

write nextTex
t
to exportFile as «class utf8» — have to coerce the text from 16-bit unicode

tell application “iBlog” to set ents to the entries of cat

repeat with ent in ents

get the stuff in iBlog’s world, work with it here

tell application “iBlog”

set titl to (the title of ent)

set desc to (the summary of ent)

set bod to (the body of ent)

set postDate to the post date of ent

end tell

set nextText to (((“<item>” & newLine & tab & “<title>” & titl & “</title>” & newLine & tab & “<link></link>” & newLine & tab & “<pubDate>” & postDate) & “</pubDate>” & newLine & tab & “<dc:creator><![CDATA[Jerry]]></dc:creator>” & newLine & tab & “<category><![CDATA[” & the name of cat & “]]></category>” & newLine & tab & “<category domain=”category” nicename=”” & niceName & “”><![CDATA[” & the name of cat & “]]></category>” & newLine & tab & “<content:encoded><![CDATA[” & bod & “]]></content:encoded>” & newLine & tab & “<excerpt:encoded><![CDATA[” & desc & “]]></excerpt:encoded>” & newLine & tab & “<wp:post_id></wp:post_id>” & newLine & tab & “<wp:post_date>” & postDate) & “</wp:post_date>” & newLine & tab & “<wp:post_date_gmt>” & postDate) & “</wp:post_date_gmt>” & newLine & tab & “<wp:comment_status>open</wp:comment_status>” & newLine & tab & “<wp:ping_status>open</wp:ping_status>” & newLine & tab & “<wp:post_name>” & titl & “</wp:post_name>” & newLine & tab & “<wp:status>publish</wp:status>” & newLine & tab & “<wp:post_parent>0</wp:post_parent>” & newLine & tab & “<wp:post_type>post</wp:post_type>” & newLine & “</item>” & newLine & newLine

write nextText to exportFile as «class utf8»

end repeat

end repeat

write rssTail to exportFile as «class utf8» — xml/rss file closing stuff

on error errStr number errorNumber

if exportFile is not equal to 0 then

close access exportFile

set exportFile to 0

end if

error errStr number errorNumber

end try

if exportFile is not equal to 0 then

close access exportFile

set exportFile to 0

end if

end run

So far things are pretty simple. The script loops through the categories, and in each category it pulls out all the episodes. Only it kept stalling. It turns out that sometimes iBlog took so long to respond that the script gave up waiting. I added

with timeout of 600 seconds

at the start to make the script wait a full ten minutes for iBlog to respond. Yes, iBlog certainly is no jackrabbit of a program.

Now the program ran! The only problem is, the resulting file doesn’t work. Hm. The first thing the importer reports is that it can’t read the dates the way AppleScript formats them. So, I added a function to reformat all the dates to match the example. Then it was importing categories, but not items. Why not?

Um… actually I don’t remember the answer to that one. Let’s just say that it took a lot of fiddling and testing to get it right. Eventually, hurrah! There in my WordPress installation were episodes from iBlog.

And they looked like crap. The thing is, that iBlog included unnecessary HTML tags around the blog title, excerpt, and body. It’s going to be a lot easier to clean them up now, while we’re mucking with each bit of text anyway, so back to AppleScript’s lousy string functions we go to clean up iBlog’s mess. Now, after we get all the data from iBlog, we call a series of functions to clean it all up:

set titl to stripParagraphTags(titl)

set desc to stripParagraphTags(desc)

set postDate to formatDate(postDate)

set bod to fixBlogBodyText(bod, postDate)


The actual functions are available in the attached final script.

Things are looking better, but still not very good. Much of this is due to some junk iBlog did when converting my older episodes into iBlog 2 format. One thing it did was to insert hard line breaks in the text of the blog body. No idea why. Maybe they were there all along and I had no way to see them. WordPress helpfully assumes that if you have a line break in the data it imports, you want a line break when it shows on the screen. So, every line break is replaced by a <br /> tag when imported into WordPress. This will not do. Additionally, iBlog replaced paragraph breaks </p><p> with a pair of break tags: <br /><br />. Once again, the reason for this is a mystery. The latter issue is less important, but we may as well address it while the hood is up.

Back we go into the fixBlogBodyText function, to repair more silly iBlog formatting. The resulting function looks like this:

on fixBlogBodyText(s, postDate)

this assumes that if an episode is supposed to start with a div, it will have a style or class

if (the offset of “<div>” in s) is equal to 1 then

set s to text 6 thru (the (length of s) – 6) of s

in some cases there was an extra line feed at the end of the text as well

if the last character of s is “<” then

set s to text 1 thru (the (length of s) – 1) of s

end if

set s to “<p>” & s & “</p>”

end if

clean up iBlog junk (lots of this stuff is the result of upgrading to iBlog 2 – the conversion was not clean

replace all line breaks with spaces

set s to replaceAll(s, “

“, ” “)

replace all double-break tags with paragraph tags

set s to replaceAll(s, “<br /><br />”, “</p>” & newLine & “<p>”)

replace all old-fashioned double-break tags with paragraph tags

set s to replaceAll(s, “<br><br>”, “</p>” & newLine & “<p>”)

get rid of some pointless span class info


set s to replaceAll(s, ” class=”Apple-style-span””, “”)

return s

end fixBlogBodyText

note: replaceAll is a utility function I wrote that does pretty much what it says. You will find it in the attached source file. newLine is a variable I defined because left to it’s own devices AppleScript uses the obsolete Mac OS 9 line endings. What’s up with that?

At this point the text is importing mostly nicely. But wait! I was running my tests just working with one category to save time. When I looked at Allison in Anime on WordPress, some really weird things started happening. It turns out that when importing the data, you need line breaks every now and then, otherwise the importer will insert them. That would be nice to put in the documentation somewhere! In one of my episodes, the newline was inserted right in the middle of a <div> tag, which led to all kinds of trouble. So, to the above script I added a line that inserts a line break between </p><p> tags. As long as any one paragraph isn’t too long, I’ll be all right.

set s to replaceAll(s, “</p><p>”, “</p>” & newLine & “<p>”)

And with that, we’ve done it! We’ve written a script that will export all the data from iBlog 2 and format it in a way that WordPress can accept. Time to run it on the whole blog, go take a little break, and come back and see how things went…

Dang. Didn’t work. There’s a maximum file size for import, and my blog is too damn big. Not a huge problem, just a bit of modification to make each category a separate file. Now, at last, the data is imported, the text looks nice, and we’re ready to make the move to our new home.

Except…

The images don’t show up, and links between episodes are broken. Also, it would be nice if people could still read the old Haloscan comments. I guess we’re not done yet.

Image links were the easiest to repair. In iBlog 2 the source code always looks for the image at path /http://muddledramblings.com/wp-content/uploads/iblog/. We just have to find those links and replace them with new info. I used Automator to find all the image files in the iBlog data folders, then I copied them all up to a directory on the WordPress server, and pointed all the links there. Worked like a charm! (Icerabbit goes into more detail on that process here. I used different tools, but the process is the same.)

Links between episodes turned out to be a lot trickier. It came down to this: How do I know what the URL of the episode is going to be when I load it into WordPress? I had to either know what the episode’s id was going to be, or I had to know what its nicename was going to be.

Nicename is a modified title that can be used in URL’s – no spaces and whatnot. “Rumblings from the Secret Labs” becomes “rumblings-from-the-secret-labs”. If I set up wordpress to use the nicename to link to an episode rather than the ID number, it would have some advantages, but I can get long-winded (have you noticed?) and that applies to my episode titles as well. The URL’s for my episodes could get really long. Therefore, I’d rather use the episode’s ID for its permalink. (If you try the icerabbit link above, you will see the nicename version of a link.)

Happily, the import file format allows me to specify the id of episodes I upload. (I don’t know what it does if there’s already an episode with that ID.) After some fiddling I managed to specify reliably what ID to give each episode. Now in my script I make a big table with the iBlog paths to each episode and the ID I will assign it. Before the main loop I have another that builds the table:

first loop

set postID to firstPostID

set idTableRef to a reference to episodeIDTable

tell application “iBlog” to set cats to the categories of the first blog

repeat with cat in cats

set cat to item 1 of cats

tell application “iBlog” to set catFolderName to the folder name of cat

display dialog catFolderName

copy {catFolderName, -1} to the end of idTableRef

tell application “iBlog” to set ents to the entries of cat

repeat with ent in ents

tell application “iBlog” to set episodeFolderName to the folder name of ent

set episodePath to catFolderName & “/” & episodeFolderName

copy {episodePath, postID} to the end of idTableRef

set postID to postID + 1

end repeat

end repeat


Now it’s possible to look up the id of any episode, and build the new link. The lookup code is in the attached script, and also handles the special cases of linking to a category page and to the main page. For category pages, I just hand-built a table of the category ID’s I needed based on previous import tests.

Finally, there is the task of preserving the links to the old comment system. Happily, those Haloscan comments are also connected based on the file path of the episode. (Though it looks like really old comments are not accessible, anyway, which is a bummer.)

In the main loop, after the body text has been cleaned up, tack the link to Haloscan on the end, complete with hooks to allow CSS formatting:

set bod to bod & newLine & newLine & “<div class=”jsOldCommentBlock”><span>Legacy Comment System:</span> <a href=”javascript:HaloScan(‘” & entFolder & “‘);”><script type=”text/javascript”>postCount(‘” & entFolder & “‘); </script></a></div>”

Not mentioned above are functions for logging errors and a few other utililties that are in the main script file. They should be pretty obvious. The script includes code that is specific to issues I encountered, but it should be a good start for anyone who wants to export iBlog 2 data for import into another system. It SHOULD be safe to execute on your iBlog data; it doesn’t change anything on the iBlog side of things. I don’t know if there’s anyone else in the world even using iBlog 2 anymore, but if you would like help with this script, let me know.