When Coding, Always Use Descriptive Function Names

This evening I’ve been coding up something on a tight deadline. A few minutes ago I wrote a function named:

handlePluralTypeNameBecauseRESTWasInventedByIdiots()

REST is an acronym for… you can look it up if you care. It’s a way for code running in your browser to communicate with services off somewhere else. Some guy got his PhD making this nonsense up, and it has now become an industry standard. That guy is very lucky I was not one of the ones to review his dissertation, and the rest of the world is very unlucky that I wasn’t.

Like with HTML before it, someone came up with a half-assed solution to a real problem, and before the smart people in the room could say, “hey, that has some pretty serious flaws, but with only a little more effort we could fix most of them” the whole world went romping off with the flawed solution. And here we are.

Not only does REST violate previously-existing standards, it does so for no technical advantage. Servers and programming languages had to be updated to accommodate those violations. Maybe that should have been a red flag.

It would be SO DAMN EASY to fix most of the problems with REST. Use your head(ers)!

But here we are sweating over REST. And here’s a fun thing: for no technical advantage people who use this standard-violating standard have to understand the rules of pluralization in American English. At least in any implementation of REST I’ve had the pleasure of working with. Not only is that fucking annoying, it’s exclusionary. Sorry, kid in Senegal, we’re making a standard that disadvantages you.

Sure, you can get a code library to do plurals for you, and with any luck the rules in the browser code will match the rules on the server. Up until now, I’ve chosen the approach, “always name your data types in a way that just adds ‘s’ for the plural.” Tonight that wasn’t an option, so I made a way for specific REST servers to keep only the rules relevant to them. More efficient and more reliable than someone else’s library.
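A minimal sketch of what "each server keeps only its own rules" might look like, written in PHP for consistency with the rest of these posts. The function name, the rule map, and the example service are all hypothetical; none of them come from the original code.

```php
<?php
// Hypothetical sketch: each REST service carries only the irregular
// plurals it actually needs. All names here are illustrative.

/**
 * Return the plural form used in a resource path for a singular type name.
 *
 * @param string $typeName   singular type name, e.g. "person"
 * @param array  $irregulars map of singular => plural for this one service
 */
function pluralTypeName(string $typeName, array $irregulars = []): string {
    // Service-specific exceptions win over the default rule.
    if (isset($irregulars[$typeName])) {
        return $irregulars[$typeName];
    }
    // Default rule: just add "s".
    return $typeName . 's';
}

// Assumed exceptions for one imaginary service.
$inventoryServicePlurals = ['person' => 'people', 'category' => 'categories'];

echo pluralTypeName('widget', $inventoryServicePlurals), "\n";   // widgets
echo pluralTypeName('category', $inventoryServicePlurals), "\n"; // categories
```

A service whose type names all take a plain "s" passes no map at all, so it pays nothing for rules it never uses.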

And as I’ve often said, your code should express what it does without resorting to comments.

Optimization

Tonight it hit me, as I was giving advice, that I was dispensing wisdom I would do well to listen to myself. It was a question about optimization. Whenever you use that word, you have to know what you are optimizing for. Sometimes I forget.

One of the approximations of social media I engage in is a place called StackOverflow.com. That site and dozens of its sisters are an indispensable resource for programmers and tech folk in general. Some nights, like tonight, when I’m a little adrift, I will stop by over there and see if there are any questions pending I am qualified to answer.

When you’ve been around a while, you realize that there are almost no new questions, and in the major categories there are semi-pro question-answerers who can swiftly point the new question to the ancient answer. For php, you will see the name Barmar time and again, always helpful, always gentle.

But there is a category of question that in general is found to be annoying on Stack Overflow, but that I can answer well. These are questions that go beyond the nuts-and-bolts of a language to get to the heart of what programming really is. I will see the code they post, often a mess, and I can sort through it and figure out what is the actual question.

While others will downvote the question for being a mess, I find an opportunity to teach. I see someone who, with a little help, will see the mess also. My answers are long, and meticulous, but if I feel like I’m answering a homework question I leave out the key stones in the path.

A recent example: Someone had code that took the result of a database query and built an html table. But the result of the query didn’t directly match what was to go into the table. So while drawing the table, the code was also trying to figure shit out.

I suggested that while there was no way I was going to figure out what was wrong with the code as written, I had a simple concept to apply: Think, then Draw. Get all the information set up beforehand so that creating the html was just a brain-dead loop.
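Here is a rough illustration of Think, then Draw. The rows, column names, and function names are invented for the example and are not the code from the Stack Overflow question; the point is only that every decision is made before the first tag is written.

```php
<?php
// Hypothetical rows, standing in for a database query result.
$rows = [
    ['name' => 'Ann', 'item' => 'pen', 'qty' => 2],
    ['name' => 'Bob', 'item' => 'ink', 'qty' => 1],
    ['name' => 'Ann', 'item' => 'pad', 'qty' => 3],
];

// THINK: work out everything the table needs before any HTML is produced.
function totalQuantityByName(array $rows): array {
    $totals = [];
    foreach ($rows as $row) {
        $totals[$row['name']] = ($totals[$row['name']] ?? 0) + $row['qty'];
    }
    return $totals;
}

// DRAW: a brain-dead loop with no decisions left to make.
function drawTable(array $totals): string {
    $html = "<table>\n";
    foreach ($totals as $name => $total) {
        $html .= '<tr><td>' . htmlspecialchars($name) . '</td><td>' . (int) $total . "</td></tr>\n";
    }
    return $html . "</table>\n";
}

echo drawTable(totalQuantityByName($rows));
```

When the table comes out wrong, the two halves can be checked separately: dump the prepared array first, and only look at the drawing loop if the array is right.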

I have discovered that I have axioms. One is “Keep the guards close to the gate.” Another, apparently, is “Think, then draw.”

Tonight I came across a question asking for how to optimize a data manipulation operation. Massive data in, then a rearrangement of the data to be more useful later. The code to accomplish the data transform included nested loops, but was pretty tight and pretty clean. “How do I optimize this?” was the question.

I have come to realize that optimization of code is an economic judgement, not a technical one. The most obvious tradeoff: You can use more memory to make your process run faster.
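As one small, made-up illustration of the memory-for-speed trade: instead of searching a list inside a loop, build a lookup array once and pay for it in memory. The data and variable names below are assumptions, not the code from the question.

```php
<?php
// Hypothetical data: orders refer to customers by id.
$customers = [
    ['id' => 1, 'name' => 'Ann'],
    ['id' => 2, 'name' => 'Bob'],
];
$orders = [
    ['customerId' => 2, 'total' => 10],
    ['customerId' => 1, 'total' => 25],
    ['customerId' => 2, 'total' => 5],
];

// Spend memory: build an id => name lookup once...
$nameById = array_column($customers, 'name', 'id');

// ...so the main loop needs no inner search through $customers.
$totalByName = [];
foreach ($orders as $order) {
    $name = $nameById[$order['customerId']];
    $totalByName[$name] = ($totalByName[$name] ?? 0) + $order['total'];
}

print_r($totalByName);
```

Whether the extra array is worth it depends entirely on the sizes involved, which is the economic judgement again.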

But there is another, more subtle, economic tradeoff. Tonight I answered the optimization question with another economic argument: that Engineer Hours are worth more than CPU milliseconds. I told that person that if their code worked, and there wasn’t a specific problem caused by it, then it was perfect and they should move on to the next problem. I told that programmer to value their own time. Not just for personal reasons, but to better judge how their time could best help their endeavor.

Value your time, but also recognize the cost of your time.

My friends, this is something I do not do well. Where I work, I am moving our stuff from an environment where it was hosted for free, to a place where we will have to pay for the resources. I am, at heart, a cheap bastard. While that does me well at the poker table, it may not always serve the interests of my employer. I am constantly aware that all resources I request will now be billed.

So I spend time, hours of my time, trying to guess how small a footprint I can squeeze into on behalf of our group. I spend hours on elaborate schemes to do more with less. But I am not serving my clients well. I’m not including the cost of my time in the value equation. Things are too slow? I could spend a week doing clever stuff, at a cost to my company of thousands of dollars in my compensation while I defer projects that actually will make things run better, or I could just submit a form and get more RAM for the database, at an incremental cost for the department each month.

In my defense, I HATE filling out forms. I HATE the call that comes after while someone else re-enters the form into a different system, and invariably makes a mistake. I will take a week with the code over an afternoon of nonsense any time. But that is a weakness, not a strength. That is me allowing my own preferences to make decisions that are not properly optimized.

I am, at heart, a person who can take a problem and slice it and slice it and slice it until each part is a simple question, and I can take the answers to those questions and combine them to arrive at the solution. While I’m at it, I’ll make tests so that each part proves itself. When you see my code at the flea market, it will be on the shelf behind the table, where you will have to ask to get a closer look. I am proud of this. But it is not always a good thing.

There was a time I was VP of Software Engineering at a small company, and I did all right in that role. During the dot-com boom I held a pretty dang good team together. I didn’t have to fill out forms then, either. We were all just building something awesome. The calculus of the value of my time was different then; Engineer-hours — my hours — were a commodity; the value only to be realized if the venture succeeded. (Picture Facebook, before Facebook, only far more dynamic, with the fatal flaw of being private.)

Good times, good times. The sleep-optional days of youth. But if I’m going to justify that little walk in the past in this ramble, I have to tie it to the present.

Back then, I was paid well enough but there was never doubt that Engineer Hours were the currency we were paying to make our name. Our data center was a few racks in a room that had once been a bank vault (AWS was decades away yet). Everything, everything, was optimized for minimum CPU.

Tonight, while I was (very gracefully I’m sure) reminding a young coder about the value of CPU cycles relative to neuron flashes, I realized that I have not been very good at making that judgement myself. I need to do better. Not just for my team, but for myself. I need to fill out the goddam form, bluster my way to 4x the resources I actually think we need (magnificently tiny, in the scale of my company), and recognize that my time is better spent making new things, rather than helping the old things go.

A Pair of Coding Aphorisms

I write software for a living, and I take great pleasure when fixing a problem means reducing the number of lines of code in the system. In the last two days, I have come up with a couple of observations:

Every line of code is a pre-cancerous cell in the body of your application.

Now, “line of code” can be a deceptive measurement, as cramming a whole bunch of logic into a single line will certainly not make the application more robust. There are even robots that can comb through your code and sniff out overly-complex bits. But just as in humans weight is a proxy for a host of more meaningful health measurements, lines of code is the proxy for a host of complexity measurements.

But the point stands. I recently had to fix a bug where someone had copy/pasted code from one place to another. Then the original was modified, but not the copy. All those apparently-safe lines of code (already tested and everything!) were a liability, where instead a function call so everyone used the same code would have been more compact, easier to read, and much easier to maintain. There’s even an acronym for this type of practice: DRY — Don’t Repeat Yourself.
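A tiny, invented example of the function-call version: two places need the same formatting, and both call one function rather than each carrying its own pasted copy. The function names and the price format are assumptions for illustration only.

```php
<?php
// Hypothetical example: one shared function instead of two pasted copies.
function formatPrice(int $cents): string {
    return '$' . number_format($cents / 100, 2);
}

// Both call sites use the same code, so a fix lands everywhere at once.
function invoiceLine(string $item, int $cents): string {
    return $item . ': ' . formatPrice($cents);
}

function receiptTotal(array $centsPerItem): string {
    return 'Total: ' . formatPrice(array_sum($centsPerItem));
}

echo invoiceLine('pen', 250), "\n";
echo receiptTotal([250, 1999]), "\n";
```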

While that’s one of the more flagrant ways code bloat happens, there are plenty of others, mostly symptoms of not thinking the problem through carefully at the get-go, or not stopping to reconsider an approach as the problem is better-understood. Stopping and thinking will almost always get the project done sooner — and smaller.

One important thing to keep in mind is that programming languages are for the benefit of humans, not the machines that will eventually execute the program. If the purpose of your code is not obvious from reading it, go back and do it again. Comments explaining the code are generally an indication that the code itself is poorly written.

No software is so well-written that it ages gracefully.

I work on a lot of old code written by others, and I know people who work on old code written by me. In some cases, the code was shit to start with, but in others time has simply moved on, requirements have changed, and the code has been fiddled and futzed until the pristine original is lost to a host of semi-documented tweaks.

Naturally no code I have ever written falls into the “shit to start with” category (how could you even think that?), but that doesn’t mean the people who have to maintain that old stuff won’t be cursing my name now and then, as some clever optimization I did back in the day now completely breaks with a new requirement I didn’t have to deal with at the time.

And sometimes even if the code itself is still just fine, the platform it runs on will change, and break stuff. Jer’s Novel Writer was pretty elegant back in the day, but now when I compile I get literally hundreds of warnings about “That’s not how we do things anymore.” Some parts of JersNW are simply broken now. When I no longer work where I do, I will likely rebuild the whole thing from scratch.

Speaking of work, I am very fortunate to work in an environment that allows us to trash applications and rebuild them from scratch every now and then. Having a tiny user base helps in this regard. And as we build the new apps, we can apply what we’ve learned and maybe the next system will age a little better than the one before. Maybe. But sure as the sun rises at the end of a long day of coding, someone will be cursing the new system before too long.

A Guide to Commenting Your Code

I spend a lot of time working with code that someone else wrote. The code has lots of comments, but they actually do little to improve the understandability of the work. I’m here to provide a concise set of examples to demonstrate the proper way to comment your code so that those who follow will be able to understand it easily and get to work.

These examples are in php, but the principles transcend language.

WRONG:

// get the value of the thing
$val = gtv();

RIGHT:

$thingValue = getTheValueOfTheThing();

WRONG:

// get the value of the thing
$val = getTheValueOfTheThing();

RIGHT:

$thingValue = getTheValueOfTheThing();

Oh so very WRONG:

// Let's get the value of the thing
$val = getTheValueOfTheThing();

We’re not pals on an adventure here.

RIGHT:

$thingValue = getTheValueOfTheThing();

You might have noticed that so far all my examples of the proper way to comment your code don’t have comments at all. They have code that doesn’t need a comment in the first place.

Computer languages were not created to make things easier for the machine to understand; they exist so that humans can write sets of instructions other humans can read, and that (secondarily) tell the computer what to do. So, if the code syntax is for the benefit of humans, treat it that way.

If you have to write a comment to explain what is going on in your code, you probably wrote it wrong. Or at the very least, if you need to write a comment, it means you’re not finished. I write many comments that start TODO, which my tools recognize and give me as a to-do list.

Stopping to come up with the perfect name for a variable, class, or function is an important part of programming. It’s more than a simple label, it’s an understanding of what that symbol means, and how it works in the system. If you can’t name it, you’re not ready to code it.

There is a special category of comments in code called doc blocks. These are massive comments above every function that robots can harvest to generate documentation. It’s a beautiful idea.

Here’s my world (not a standard doc block format but that’s irrelevant):

/*
|--------------------------------------------------------------------------
| @name "doSomething"
|--------------------------------------------------------------------------
| @expects "id (int)"
|--------------------------------------------------------------------------
| @returns "widget"
|--------------------------------------------------------------------------
| @description "returns the widget of the frangipani."
|--------------------------------------------------------------------------
*/
public function doSomething($id, $otherId) {
    $frangipani = getFrangipani($id);
    multiplex($frangipani, $otherId);
 
    return $frangipani->widgets();
}

The difficulty with the above is that the laborious description of what the function does is harmfully wrong. The @expects line says it needs one parameter, when actually it needs two. It says it returns a widget but in fact the function returns an array of widgets. If you were to try to understand the function by the doc block, you would waste a ton of time.

It happens all the time – a programmer changes the code but neglects to update the doc block. And if you’re not using robots to generate documentation, the doc block is useless if you write your code well.

public function getFrangipaniWidgets($id, $multiplexorId) {
    $frangipani = getFrangipani($id);
    multiplex($frangipani, $multiplexorId);
 
    return $frangipani->widgets();
}

Doc blocks are a commitment, and if you don’t have a programmer or tech writer personally responsible for their accuracy, the harm they cause will far surpass any potential benefit.

I have only one exception to the “comments indicate where you have more work to do” rule: Don’t try this at home.

public function getFrangipaniWidgets($id, $multiplexorId) {
    $frangipani = getFrangipani($id);
 
    // monoplex causes data rehash, invalidating the frangipani
    multiplex($frangipani, $multiplexorId);
 
    return $frangipani->widgets();
}

This is useful only when the obvious, simple solution to a problem had a killing flaw that is not obvious. This is a warning sign to the programmer coming after you that you have tried the obvious. Often, when leaving notes like this, and explaining why I did something the hard way, I realize that the easy way would have worked after all. At which point I fix my code and delete the comment. But at least in that case the comment did something useful.

Time Not Well-Spent

Here it is, Whiskey-Exemption Thursday, and my weight is on-target so I can even have beer. The purpose of Thursday is to devote an evening to pushing the writing forward, and hang the consequences.

What have I been writing this fine evening? I’ve been trying to come up with the least-objectionable way to emulate Swift’s extensions to Protocols in php. The answer: there is no way.

Begin geek

Coding with php is coding with flint knives and bearskins; the power of php is in its wham-bam-thank-you-ma’am ability to do a quick task and then to go away.

Bless the movers behind php, they’re trying to evolve their language to catch up with the way people are using it these days. If they had known Drupal was coming along, they might not have been so quick-and-dirty before. Drupal might be slightly less awful as a result.

There are design patterns enabled by Swift that I get a little misty contemplating. Being able to add extensions (with executable code!) to protocols is enormously powerful. Having experienced that, I wanted to do the same thing in php, creating a trait “taggable” and having classes that used it automatically injected with the implementation. Injected, not inherited. Ain’t gonna happen.
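For the curious, a sketch of what is probably the nearest PHP approximation, assuming PHP 7.1 or later: an interface for the contract plus a trait for the shared implementation. The interface, trait, and class names are hypothetical. Note what is missing compared to Swift: the class has to name the trait itself, and nothing arrives just because it declares the interface.

```php
<?php
// Hypothetical names throughout; "Taggable" echoes the trait mentioned above.
interface Taggable {
    public function addTag(string $tag): void;
    public function tags(): array;
}

// The trait carries the implementation, roughly what a Swift protocol
// extension would supply by default.
trait TaggableImplementation {
    private $tags = [];

    public function addTag(string $tag): void {
        $this->tags[$tag] = true; // keys keep the tags unique
    }

    public function tags(): array {
        return array_keys($this->tags);
    }
}

// The catch: every class has to opt in with "use". Nothing is injected
// just because the class declares the interface.
class Episode implements Taggable {
    use TaggableImplementation;
}

$episode = new Episode();
$episode->addTag('php');
$episode->addTag('php');
$episode->addTag('swift');
print_r($episode->tags());
```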

End geek

At least now I’m writing prose about writing the code rather than writing the code itself. Progress, I guess.

Defensive Programming: Put the Guards Near the Gate

We can file this one under “not interesting to pretty much anyone who reads this blog,” but it’s an important concept for writing robust code. This is part of a discipline called Defensive Programming.

Let’s say you build yourself a castle in a clearing in the woods. There is one path to the front gate, and you need to guard it. “Hah!” you think, “I’ll put the guards where the path comes out of the woods, to stop shenanigans before they even get close!” You post the guards out there in a little guardhouse, secure in the knowledge that no bad guys will reach your gate.

Until someone makes a new path. Perhaps when the new path is created the path-maker will notice that there are guards on the other path and put a little guardhouse on the new path as well. But perhaps not.

In software, it’s the difference between code that says, “when all conditions are right, call function x”, and having function x test to make sure everything is OK before proceeding.

Putting the guard by the trees:

    function x(myParameter) {
        myParameter.doSomething();
    }

    thing = null;

    // ... other stuff that might or might not set 'thing'

    if (thing != null) {
        x(thing);
    }

This is fine as long as everything that calls function x knows to check to make sure the parameter is not null first. It might even seem like a good idea because if ‘thing’ is not set you can save the trouble of calling the function at all. But if some other programmer comes along and doesn’t know this rule, she might not do the check.

    // elsewhere in the code...

    anotherThing = null;

    // ... other stuff that might or might not set 'anotherThing'

    x(anotherThing); // blammo!

Better to move the guards close to the gate:

    function x(myParameter) {
        if (myParameter != null) {
            myParameter.doSomething();
        }
    }

Now when someone else writes code that calls function x, you can be confident that your guards will catch any trouble. That doesn’t mean you can’t ALSO put guards out by the edge of the forest, but you shouldn’t rely on them.
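The same idea as a runnable PHP sketch, assuming PHP 7.1 or later for the nullable type. The Thing class and the boolean return value are additions for the sake of the example, not part of the pseudocode above.

```php
<?php
// Hypothetical class standing in for whatever 'thing' might be.
class Thing {
    public $timesCalled = 0;

    public function doSomething(): void {
        $this->timesCalled++;
    }
}

// The guard lives inside the function, so no caller can forget it.
function x(?Thing $myParameter): bool {
    if ($myParameter === null) {
        return false; // nothing to do; the caller can check the result if it cares
    }
    $myParameter->doSomething();
    return true;
}

var_dump(x(null));        // bool(false), and no blammo
var_dump(x(new Thing())); // bool(true)
```

Returning a flag is only one option. Throwing an exception from the guard is another, for cases where a null really should stop the show.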

Maybe if I Assert Harder…

When I unleash the testing robots on my code, I very often see messages like this:

Failed asserting that false is true.

It seems like there should be a metaphor in there somewhere.

Pretty Badass…

I just put the following comment in my code:

// now SHIT GETS REAL

Haloscan comments to WordPress – the nitty gritty.

As I mentioned in the previous episode, I recently had to move more than 8000 comments from my old comment system, Haloscan, and import them into WordPress. Haloscan served me well back in the day, but they are going away, and all my more recent comments are in the WordPress system anyway. Nice to have them all in one place.

The process turned out to be pretty easy. I found a script for importing comments from a different system, modified it, modified it some more, found a fundamental problem with it, fixed that, and in the end not much of the code remained from the example, except the part where the WordPress logo is displayed on the screen. I assume that part came from the code the guy copied to make the code that I copied.

Along the way I learned a couple of things. PHP is a pretty flexible language, but running a loop that sets up 8500 data structures and runs 25500 database queries exposes PHP’s primary weakness: memory management. The whiz kids who invented PHP designed it for a load/compile/execute/exit-and-clean-up flow. Memory allocated during execution is cleaned up when the program is done running (usually when the Web page is delivered). When you try to do heavy lifting with PHP, you have to start paying attention to getting your memory back before the traditional clean-up time.
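A small sketch of what "getting your memory back early" can look like, using the built-in memory_get_usage(), unset(), and gc_collect_cycles(). The batch-building function and the sizes are made up for illustration and are not the import script's own code.

```php
<?php
// Hypothetical stand-in for one batch of imported comments.
function buildBatch(int $size): array {
    $batch = [];
    for ($i = 0; $i < $size; $i++) {
        $batch[] = ['id' => $i, 'text' => str_repeat('x', 200)];
    }
    return $batch;
}

$before = memory_get_usage();

$batch = buildBatch(5000);
$during = memory_get_usage();

// ... save the batch here ...

// Hand the memory back now instead of waiting for the script to exit.
unset($batch);
gc_collect_cycles(); // also sweeps up any reference cycles
$after = memory_get_usage();

printf("before: %d, holding batch: %d, after unset: %d\n", $before, $during, $after);
```

The exact byte counts vary by PHP version and platform, so the numbers are only useful relative to each other.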

The code I started with did a direct database query to add the comment to the comments table, but that got things out of sync with other tables. (The posts table keeps track of the number of comments that apply to it, presumably for performance reasons.) I dug into the core WordPress code and found the method they call to post comments, and I made my code call that function. I have no idea what all the bookkeeping chores are that function does, and really I don’t care as long as they get done.

I didn’t worry about performance too much at first (after all, it only has to run once), but one of the database queries I did was really expensive (scanning all the posts for a specific set of characters). Even running on my local server it was slow, and I knew that if I tried something like that on my actual Web host alarms would go off and they’d shut me down for a while. I did a little optimization on that front, and it was enough.

The following script has some Muddle-specific code in it, but it might come in handy for others who need to move Haloscan comments to a new system. The part that parses Haloscan XML is pretty generic and would work for anyone; the part that saves the comments might be useful as a guide as well. The main difference others will have to deal with is where to get the proper post_id based on the thread field in the XML. In my case I had a link in each blog episode back to the Haloscan thread.

The HTML bit in the middle of the file is not essential, but it puts a nice WordPress logo on the screen when the script starts up. I inherited that from the script I started with.

NOTE: While this script has code in it specific to me, I am available to customize it for others who need to move their comments from Haloscan into another environment, or, for that matter, from any structured source into WordPress. Drop me a line!

<?php
 
if (!file_exists('../wp-config.php')) die("There doesn't seem to be a wp-config.php file. You must install WordPress before you import any comments.");
require('../wp-config.php');
 
function saveCommentToWP($comment, $dbRef, &$postThreads) {
    //echo "here's where the comment save happens <br/><br />";
    $thread = $comment['thread'];
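    // use the cached post ID if this thread has been seen before (avoids an undefined-index notice)
    $postID = isset($postThreads[$thread]) ? $postThreads[$thread] : 0;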
    if (!isset($postThreads[$thread])) {
        $query = "SELECT * FROM wp_posts WHERE post_content LIKE '%".$thread."%' AND post_status='publish'";
        $postID = $dbRef->get_var($query, 0);
        $postThreads[$thread] = $postID ? $postID : 0;
        if ($postThreads[$thread] == 0)
            echo ("<br />Thread $thread has no post!");
        else
            echo "<br />Thread $thread";
        flush();       // got to have real-time updates!
    }
 
    if ($postID && $postID != 0) {
        $userId = $comment['email'] == '[email protected]' ? 1 : 0;
 
        //set up the data the way wp_insert_comment expects it.
        $wp_commentData = array();
        $wp_commentData['comment_post_ID'] = (int) $postID;
        $wp_commentData['user_id'] = (int) $userId;
        $wp_commentData['comment_parent'] = 0;
        $wp_commentData['comment_author_IP'] = $comment['ip'];
        $wp_commentData['comment_agent'] = 'Haloscan';
        $wp_commentData['comment_date'] = $comment['datetime'];
        $wp_commentData['comment_date_gmt'] = $comment['datetime'];
        $wp_commentData['comment_approved'] = '1';
        $wp_commentData['comment_content'] = $comment['text'];
        $wp_commentData['comment_author'] = $comment['name'];
        $wp_commentData['comment_author_email'] = $comment['email'];
        $wp_commentData = wp_filter_comment($wp_commentData);
 
        $comment_ID = wp_insert_comment($wp_commentData);
 
        //echo ("<strong>saved comment $comment_ID</strong>");
    }
 
    // try to reclaim some memory
    unset($wp_commentData);
    unset($comment);
}
 
header( 'Content-Type: text/html; charset=utf-8' );
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>WordPress &rsaquo; Import Comments from RSS</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style media="screen" type="text/css">
    body {
        font-family: Georgia, "Times New Roman", Times, serif;
        margin-left: 20%;
        margin-right: 20%;
    }
    #logo {
        margin: 0;
        padding: 0;
        background-image: url(http://wordpress.org/images/logo.png);
        background-repeat: no-repeat;
        height: 60px;
        border-bottom: 4px solid #333;
    }
    #logo a {
        display: block;
        text-decoration: none;
        text-indent: -100em;
        height: 60px;
    }
    p {
        line-height: 140%;
    }
    </style>
</head><body> 
<h1 id="logo"><a href="http://wordpress.org/">WordPress</a></h1> 
 
<?php
 
// Bring in the data
$reader = new XMLReader();
if ($reader->open('export-8.xml')) {
    $postThreads = array();
    $thread = '';
    while ($reader->read()) {
        //echo "<br />read node type: ".$reader->nodeType.';     '.$reader->name.': '.$reader->value;
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'thread') {
            $thread = $reader->getAttribute('id');
        }
        if ($thread) {
            if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'comment') {
                // begin building comment
                $comment = array('thread' => $thread);
                $reader->read();
                while ( !($reader->nodeType == XMLReader::END_ELEMENT && $reader->name == 'comment') ) {
                    if ($reader->nodeType == XMLReader::ELEMENT) {
                        $property = $reader->name;
                        $reader->read(); // assumes text element following element tag has the data
                        $comment[$property] = $reader->value;
                    }
                    $reader->read();
                }
                saveCommentToWP($comment, $wpdb, $postThreads);
            }
        }
    }
    $reader->close();
}
 
?>
 
 
</body>
</html>
