Archive for ‘Rumblings from the Secret Labs’


An Online Community that I can Get Behind

February 2nd, 2012

Since there are others using my server now, I thought it would be a good idea to upgrade my backup practices. I looked around a bit, hoping for a solution that was free, butt-simple to set up, and automatic, so I would never have to think about it again. I don’t like thinking when I don’t have to.

I came across CrashPlan, the backup solution my employer uses. Turns out their software is free to chumps like me; they make their cash providing a place for you to put that valuable information.

There are two parts to any backup plan: you must gather your data together and you must put it somewhere safe that you can get to later. The CrashPlan software handles the gathering part, making it easy, for instance, to save all my stuff to the external hard drive sitting on my desk, but if the house burns down that won’t do me much good.

Happily CrashPlan also makes it easy to talk to remote computers, provided they have the software installed. I put CrashPlan on my server in a bunker somewhere in Nevada, and now this site and a couple of others are saved automatically to my drive in California as well. Easy peasy! Any computer signed up under my account can make backups to any other.

But wait! There’s more! The cool idea CrashPlan came up with was letting friends back each other up. I give you a special code and you can put backups of your stuff on my system. I can’t see what you saved, it’s all encrypted. But unless both our houses burn down at the same time, there’s always a safe copy.

Sure, if you pay you get more features and they will store your stuff in a safe place where you don’t have to wait if I happen to be on vacation, but for free that’s not bad at all. The idea of friends getting together and forming a backup community appeals to me as well. It’s a great way for geeks to look out for one another.


Step-by-Step LAMP server from scratch with MacPorts

January 26th, 2012
A works-every-time guide to getting everything installed and configured.

Getting Apache, PHP, and MySQL installed and talking to each other is pretty simple — until something doesn’t come out right. This guide takes things one step at a time and checks each step along the way.

January 14, 2011

Using MacPorts to build a LAMP server from scratch

About this tutorial:

There are other step-by-step guides out there, and some of them are pretty dang good. But I’ve never found one that I could go through and reach the promised land without a hitch. (Usually the hitches happen around MySQL.) Occasionally key points are glossed over, but I think mostly things have changed and the tutorials haven’t been updated. Now, however, I’ve done this enough times that there are no hitches anymore for me. Since MacPorts occasionally changes things, I’ll put at the top of this page the last time this recipe was used exactly as written here.

This guide breaks things down into very small steps, but each step is simple. I include tests for each stage of the installation, so problems can be spotted while they're easy to trace. We get each piece working before moving on to the next. I spend a little time telling you what it is you accomplish with each step, because a little understanding can really help when it’s time to troubleshoot, and if things are slightly different you have a better chance of working through them.

Audience: This guide is designed to be useful to people with only a passing familiarity with the terminal. More sophisticated techno-geeks may just want to go through the sequence of commands, and read the surrounding material only when something doesn't make sense to them. The goal: follow these steps and it will work every damn time.

MacOS X versions: Tiger, Leopard, Snow Leopard, Lion. Maybe others, too. The beauty of this method is it doesn’t really matter which OS X version you have.

The advantages of this approach

There are multiple options for setting up a Mac running OS X to be a Web server. Many of the necessary tools are even built right in. Using the built-in stuff might be the way for you to go, but there are problems: It’s difficult to customize (Search on install apc Mac and you’ll see what I mean), you don’t control the versions of the software you install, and when you upgrade MacOS versions things could change out from under you.

Also simple to set up is MAMP, which is great for developing but not so much for deployment. For simple Web development on your local machine, it’s hard to beat.

But when it comes right down to it, for a production server you want control and you want predictability. For that, it’s best to install all the parts yourself in a known, well-documented configuration, that runs close to the metal. That’s where MacPorts comes in. Suddenly installing stuff gets a lot easier, and there’s plenty of documentation.

Holy schnikies! A new timing exploit on OpenSSL! It may be months before Apple's release fixes it. I want it sooner!

What you lose:

If you’re running OS X Server (suddenly an affordable option), you get some slick remote management tools. You’ll be saying goodbye to them if you take this route. In fact, you’ll be saying goodbye to all your friendly windows and checkboxes.

Also, I have never, ever, succeeded in setting up a mail server, MacPorts or otherwise, and I’ve tried a few different ways (all on the same box, so problems left over from one may have torpedoed the next), and no one I've ever met, even sophisticated IT guys, likes this chore. If serving mail is a requirement, then OS X Server is probably worth the loss of control. Just don’t ever upgrade your server to the next major version. (Where’s MySQL!?! Ahhhhhh! I hear frustrated sys admins shout.)

So, here we go!

Document conventions

There are commands you type, lines of code you put in files, and other code-like things. I've tried to make it all clear with text styles.

This is something you type into the terminal. Do not type the leading prompt character; it's just there to represent the prompt in your own terminal window. Once you've typed the text (or pasted it in from here), hit return.
This is a line of code in a file. Either you will be looking for a line like this, or adding a line like this.
This is a reference to a file or a path.

Prepare the Box

  1. Turn off unneeded services on the server box. Open System Preferences and select Sharing.
    • turn on remote login
    • (optional) turn on Screen Sharing
    • Turn off everything else - especially Web Sharing and File Sharing
  2. Install XCode. This provides tools that MacPorts uses to build the programs for your machine. You can get XCode for free from the app store. It’s a huge download. Note that after you download it, you have to run the installer. It may launch XCode when the install is done, but you can just Quit out of it.
  3. Install MacPorts. You can download the installer from http://www.MacPorts.org/install.php (make sure you choose the .dmg that matches the version of MacOS you are running). Run the installer and get ready to start typing.
  4. Now it’s time to make sure MacPorts itself is up-to-date. Open terminal and type
    1. sudo port selfupdate
    2. password: <enter your admin password>
    MacPorts will contact the mother ship and update itself.
    • If you're not familiar with sudo, you will be soon. It gives you temporary permission to act as the root user for this machine. Every once in a while during this process you will need to type your admin password again.
  5. May as well get into the habit of updating the installed software while we’re at it. Type
    1. sudo port upgrade outdated
    and you will most likely see a message that looks like an error but really says that there was nothing to upgrade. No biggie.
    • Make a habit of running these commands regularly. One of the reasons you're doing this whole thing is to make sure your server stays up-to-date. This is how you do it.
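
If you'd rather not type both maintenance commands by hand every time, they can be wrapped in one small shell function. This is only a sketch: update_macports and the PORT_CMD override are names I made up for illustration (they are not part of MacPorts), and the override exists only so you can do a dry run with echo before letting it loose.

```shell
#!/bin/sh
# Sketch only: update_macports is a hypothetical helper name, not part of MacPorts.
# PORT_CMD is an assumed override so the function can be dry-run with "echo".
PORT_CMD="${PORT_CMD:-sudo port}"

update_macports() {
    # Refresh MacPorts itself and its list of available ports.
    $PORT_CMD selfupdate || return 1
    # Upgrade whatever is outdated. If this step reports a failure, say so
    # but do not treat it as fatal; read the output to see what happened.
    $PORT_CMD upgrade outdated || echo "upgrade step reported a problem; check the output above"
}

# Usage:    update_macports
# Dry run:  PORT_CMD=echo; update_macports
```

Dropping the function into ~/.profile would make it available in every new terminal session.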

Install Apache

  1. Now it’s time to get down to business. All the stuff we’ve installed so far is just setting up the tools to make the rest of the job easier. Let’s start with Apache!
    1. sudo port install apache2
    • This may take a little while. It’s actually downloading code and compiling a version of the server tailored to your system. First it figures out all the other little pieces Apache needs and makes sure they’re all installed correctly. Hop up and grab a sandwich, or, if you're really motivated, do something else productive while you wait.
  2. When the install is done, you will see a prompt to execute a command that will make Apache start up automatically when the computer is rebooted. Usually you will want to do this. The command has changed in the past, so be sure to check for the message in your terminal window. As of this writing, the command is:
    1. sudo port load apache2
  3. Create an alias to the correct apachectl. apachectl is a utility that allows you to do things like restart Apache after you make changes. The thing is, the built-in Apache has its own apachectl. To avoid confusion, you can either type the full path to the new apachectl every time, or you can set up an alias. Aliases are commands you define. In this case you will define a new command that executes the proper apachectl.
    1. In your home directory (~/) you will find a file called .profile - if you didn’t have one before, MacPorts made one for you. Note the dot at the start. That makes the file invisible; Finder will not show it. In terminal you can see it by typing
      1. ls -a ~/
      You will get back a list of all the files in your home directory, including the hidden ones that start with a dot (.).
    2. Edit ~/.profile and add the following line:
      1. alias apache2ctl='sudo /opt/local/apache2/bin/apachectl'
      • Edit how? See below for a brief discussion about editing text files and dealing with file permissions.
      • ~/.profile isn't the only place you can put the alias, but it works.
    3. You need to reload the profile info for it to take effect in this terminal session.
      1. source ~/.profile
    4. Now anywhere in the docs it says to use apachectl, just type apache2ctl instead, and you will be sure to be working on the correct server.
  4. Start Apache:
    1. apache2ctl start
    You might see a warning or two, probably a notification about the server's name. That's fine.
  5. Test the Apache installation. At this point, you should be able to go to http://127.0.0.1/ and see a simple message: “It works!”
  6. MILESTONE - Apache is up and running!
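
The alias step above is easy to do twice by accident, leaving duplicate lines in ~/.profile. Here is a minimal sketch of one way to add a line only if it is not already there. add_line_once is a hypothetical helper name of my own, not a standard command; it relies only on grep and printf.

```shell
#!/bin/sh
# Sketch only: add_line_once is a hypothetical helper, not a standard command.
add_line_once() {
    file="$1"
    line="$2"
    # Create the file if it does not exist yet.
    [ -e "$file" ] || : > "$file"
    # -F: treat the line as a fixed string, -x: match whole lines only, -q: quiet.
    if ! grep -Fxq -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (the alias is the one from step 3 above):
#   add_line_once ~/.profile "alias apache2ctl='sudo /opt/local/apache2/bin/apachectl'"
#   . ~/.profile
```

Running it a second time with the same line changes nothing, which is the whole point.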

Install PHP

  1. Use MacPorts to build PHP 5:
    1. sudo port install php5 +pear
    • You could install the MySQL extensions to PHP now (sudo port install php5-mysql), but that will cause MySQL to be installed as well. It’s no biggie, but I like to make sure each piece is working before moving on to the next. It makes problem-solving a lot easier. So, let’s hold off on that.
    • +pear adds an industry-standard way to load other PHP addons later.
  2. Choose your php.ini file. There are a couple of different options that trade off security for convenience (error reporting and whatnot). As of this writing there is php.ini-development (more debugging information, less secure) and php.ini-production. Copy the one you want to use and name it php.ini:
    1. sudo cp /opt/local/etc/php5/php.ini-development /opt/local/etc/php5/php.ini
    You will be editing this file a little bit later, but mostly it’s just a bunch of settings you’ll never need to understand.
  3. Test the PHP install
    1. On the command line, type
      1. php -i
    2. A bunch of information will dump out. Hooray!
  4. Now it’s time to get Apache and PHP talking to each other. Apache needs to know that PHP is there, and when to use it. There’s a lot of less-than-ideal advice out there about how to do this.
    1. httpd.conf is the heart of the Apache configuration. Mess this up, Apache won’t run. It’s important, therefore, that you MAKE A BACKUP (there’s actually a spare copy in the install, but you never rely on that, do you?)
      1. cd /opt/local/apache2/conf
      2. sudo cp httpd.conf httpd.conf.backup
    2. First run a little utility installed with Apache that supposedly sets things up for you, but actually doesn’t do the whole job:
      1. cd /opt/local/apache2/modules
      2. sudo /opt/local/apache2/bin/apxs -a -e -n "php5" libphp5.so
    3. The utility added the line in the Apache config file that tells it that the PHP module is available. It does not tell Apache when to use it. There is an extra little config file for that job, but it’s not loaded (as far as I can tell), and it’s not really right anyway. Let's take matters into our own hands.
      • It won't let me save! See below for a brief discussion about editing text files and dealing with permissions.
    4. Time to edit! Open /opt/local/apache2/conf/httpd.conf with permission to edit it. We need to add three lines; one to tell it that PHP files are text files (not strictly necessary but let’s be rigorous here), and two lines to tell it what to do when it encounters a PHP file.
      1. Search for the phrase AddType in the file. After the comments (lines that start with #) add:
        1. AddType text/html .php
      2. Search for AddHandler (it’s just a few lines down) and add:
        1. AddHandler application/x-httpd-php .php
        2. AddHandler application/x-httpd-php-source .phps
        The second of those is just to let you display PHP source code in a Web page without actually running it.
      3. Finally, we need to tell Apache that index.php is every bit as good as index.html. Search in the config file for index.html and you should find a line that says DirectoryIndex index.html. Right after the html file put index.php:
        • Before:
          1. DirectoryIndex index.html
        • After:
          1. DirectoryIndex index.html index.php
      4. (Optional) As long as we’re in here, let’s make one more change for improved security. Search for the line that specifies the default options for Apache and remove Indexes:
        • Before:
          1. Options Indexes FollowSymLinks
        • After:
          1. Options FollowSymLinks
        This prevents outsiders from seeing a list of everything in a directory that has no index file.
      5. Save the file.
    5. Check the config file syntax by typing
      1. /opt/local/apache2/bin/httpd -t
      You will probably get a warning about the server’s name again, but that’s OK, as long as you see the magical Syntax OK message. If there is an error, the file and line number should be listed.
    6. Restart Apache:
      1. apache2ctl restart
  5. Test whether PHP and Apache can be friends. We will modify the “It Works!” file to dump out a bunch of info about your PHP installation.
    1. Currently the default Apache directory is /opt/local/apache2/htdocs
    2. Start by renaming index.html to index.php:
      1. cd /opt/local/apache2/htdocs
      2. sudo mv index.html index.php
    3. Edit the file, and after the It Works! bit add a PHP call so the result looks like this:
      <html>
          <body>
              <h1>It works!</h1>
              <?php phpinfo(); ?>
          </body>
      </html>
    4. Save the file
    5. Go to http://127.0.0.1 - you should see a huge dump of everything you wanted to know about your PHP but were afraid to ask.
  6. MILESTONE - Apache and PHP are installed and talking nice to each other.
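
If the PHP info page does not show up, it helps to confirm that the httpd.conf edits actually landed. The sketch below greps a config file for the three things this section relies on. check_php_conf is a hypothetical name of mine, and I am assuming the line apxs adds looks like LoadModule php5_module ...; if yours reads differently, adjust the first pattern.

```shell
#!/bin/sh
# Sketch only: check_php_conf is a hypothetical helper, not shipped with Apache.
# Assumption: apxs wrote a "LoadModule php5_module ..." line into the config.
check_php_conf() {
    conf="$1"
    missing=0
    for pattern in \
        '^[[:space:]]*LoadModule[[:space:]]+php5_module' \
        '^[[:space:]]*AddHandler[[:space:]]+application/x-httpd-php[[:space:]]+\.php' \
        '^[[:space:]]*DirectoryIndex[[:space:]].*index\.php'
    do
        # -E: extended regular expressions, -q: quiet (exit status only).
        if ! grep -Eq "$pattern" "$conf"; then
            echo "missing: $pattern"
            missing=1
        fi
    done
    # Exit status 0 means all three directives were found.
    return $missing
}

# Usage: check_php_conf /opt/local/apache2/conf/httpd.conf && echo "looks good"
```

This only checks that the lines exist and are not commented out; the httpd -t syntax check from step 4 is still the real test.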

Install and configure MySQL

  1. Use MacPorts to install MySQL database and server and start it automatically when the machine boots:
    1. sudo port install mysql5-server
    2. sudo port load mysql5-server
  2. Now we get to the trickiest part of the whole operation. There's nothing here that's difficult, but I've spent hours going in circles before, and I'm here so you won't find yourself in that boat as well. MySQL requires some configuration before it can run at all, and it can be a huge bother figuring out what’s going on if it doesn’t work the first time. We start by running a little init script:
    1. sudo -u _mysql mysql_install_db5
  3. As with Apache, you can create a set of aliases to simplify working with MySQL. There are some commands you will run frequently; things get easier if you don’t have to type the full path to the command every time. Open up ~/.profile again and add the following three lines:
    1. alias mysqlstart='sudo /opt/local/share/mysql5/mysql/mysql.server start'
    2. alias mysql='/opt/local/lib/mysql5/bin/mysql'
    3. alias mysqladmin='/opt/local/lib/mysql5/bin/mysqladmin'
    When you're done, save and
    1. source ~/.profile
  4. Start MySQL server:
    1. mysqlstart
  5. Next we need to deal with making the database secure and setting the first all-important password. The most complete way to do this is running another utility that takes you through the decisions.
    1. /opt/local/lib/mysql5/bin/mysql_secure_installation
    The script offers to delete some test users and databases that in my experience are totally useless anyway. Take the advice offered and get rid of all that junk.
    Remember the password you set for the root user!
    • You now have a MySQL account named root which is not the same as the root user for the machine itself. When using sudo you will use your admin password for the machine (as you have been all along), but when invoking mysql or mysqladmin you will enter the password for the database root account.
  6. As with PHP above, MySQL has example config files for you to choose from. The config file can be placed in a bunch of different places, and depending on where you put it, it will override settings in other config files. If you follow this install procedure, you don’t actually need to do anything with the config files; we’ll just be using the factory defaults. But things will work better down the road if you choose a config that roughly matches the way the database will be used.
    1. Find where the basedir is. As of this writing it’s /opt/local, and that’s not likely to change anytime soon, but why take that for granted when we can find out for sure? Let's make a habit of finding facts when they're available instead of relying on recipes like this one.
      1. mysqladmin -u root -p variables
      2. password: <enter MySQL root user's password>
      A bunch of info will spew across your screen. At this moment, there are two interesting nuggets: basedir and socket. Make a note of them for later.
    2. Now it’s time to choose which example config file you want to start with. The examples are in /opt/local/share/mysql5/mysql/, and each has a brief explanation at the top that says what circumstances it’s optimized for. You can read those, or just choose one based on the name. If you have no idea how big your database is going to be, medium sounds nice. You can always swap it out later.
      1. sudo cp /opt/local/share/mysql5/mysql/my-medium.cnf <basedir>/my.cnf
      Fill in <basedir> with the basedir you learned in the previous step.
  7. Test MySQL
    1. On the command line, type
      1. mysql -u root -p
      2. password: <enter MySQL root user's password>
      and enter the MySQL root user password when prompted. No errors? Cool. We’re done here. Type
      1. exit
      at the prompt.
  8. MILESTONE - MySQL server is running and happily talking to itself.
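
Picking basedir and socket out of the wall of text in step 6 can be done with a small filter instead of by eye. This is a sketch that assumes mysqladmin prints its variables as a table with | separators (name in the first column, value in the second). mysql_var is a hypothetical name, and the path in the usage comment is a made-up sample, not what your install will print.

```shell
#!/bin/sh
# Sketch only: mysql_var is a hypothetical helper, not part of MySQL.
# Assumption: input rows look like "| name | value |".
mysql_var() {
    awk -F'|' -v name="$1" '
        {
            key = $2
            val = $3
            # Trim the padding spaces around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == name) print val
        }'
}

# Usage:
#   mysqladmin -u root -p variables | mysql_var socket
#   mysqladmin -u root -p variables | mysql_var basedir
# Sample row it understands (made-up path):
#   | socket | /example/mysqld.sock |
```

Border rows made of + and - have no second column, so they are silently skipped.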

Teach PHP where to find MySQL

  1. The database is up and running; now we need to give PHP the info it needs to access it. There's a thing called a socket that the two use to talk to each other. Like a lot of things in UNIX the socket looks like a file.

    The default MySQL location for the socket is in /tmp, but MacPorts doesn’t play that way. There are a couple of reasons that /tmp is not an ideal place for the socket anyway, so we’ll do things the MacPorts way and tell PHP that the socket is not at the default location. To do this we edit /opt/local/etc/php5/php.ini.

    There are three places where sockets are specified, and they all need to point to the correct place. Remember when you saved the socket variable from MySQL before? Copy that path and then search in your php.ini file for three places where it says default_socket:

    1. pdo_mysql.default_socket = <paste here>
    2. . . .
    3. mysql.default_socket = <paste here>
    4. . . .
    5. mysqli.default_socket = <paste here>

    In each case the whatever = part will already be in the ini file; you just need to find each line and paste in the correct path.

  2. While we’re editing the file, you may want to set a default time zone. This will alleviate hassles with date functions later.
  3. Finally, we need to install the PHP module that provides PHP with the code to operate on MySQL databases.
    1. sudo port install php5-mysql
  4. Restart Apache:
    1. apache2ctl restart
  5. Test the connection.
    1. Typing
      1. php -i | grep -i 'mysql'
      should get you a list of a few mysterious lines of stuff.
    2. Second test: The whole bag of marbles. You ready for this?
      1. In the Apache’s document root (where the index.php file you made before lives), create a new file named testmysql.php
      2. In the file, paste the following:
        <?php
        // Connection details for the local MySQL server.
        $dbhost = 'localhost';
        $dbuser = 'root';
        $dbpass = 'MYSQL_ROOT_PASSWRD';
        // Connect using the PHP 5 mysql extension installed above.
        $conn = mysql_connect($dbhost, $dbuser, $dbpass);
        if ($conn) {
            echo 'CONNECT OK';
        } else {
            die ('Error connecting to mysql');
        }
        // Select the built-in "mysql" database on the open connection.
        $dbname = 'mysql';
        mysql_select_db($dbname);
      3. Edit the file to replace MYSQL_ROOT_PASSWRD with the password you set for the root database user.
      4. Save the file.
    3. In your browser, go to http://127.0.0.1/testmysql.php
    4. You should see a message saying “CONNECT OK”
  6. MILESTONE - Apache, PHP, and MySQL are all working together. High-five yourself, bud! You are an IT God!
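
The three socket edits in step 1 can also be made in one pass rather than by hunting through php.ini. What follows is a sketch under a couple of assumptions: the three default_socket lines already exist and are not commented out, and the socket path contains no # or & characters (they would confuse sed). set_mysql_sockets is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: set_mysql_sockets is a hypothetical helper, not part of PHP.
# Assumptions: the three settings exist uncommented; the path has no '#' or '&'.
set_mysql_sockets() {
    ini="$1"
    sock="$2"
    tmp="$ini.new"
    # Rewrite only the lines that start with one of the three setting names.
    # The result is written back with cat so the file keeps its owner and mode.
    sed -E "s#^(pdo_mysql|mysqli|mysql)\.default_socket[[:space:]]*=.*#\1.default_socket = $sock#" "$ini" > "$tmp" \
        && cat "$tmp" > "$ini" \
        && rm -f "$tmp"
}

# Usage (run with enough permission to write the file; back it up first):
#   set_mysql_sockets /opt/local/etc/php5/php.ini "<socket path from mysqladmin>"
```

Afterwards, php -i | grep default_socket is a quick way to see what PHP picked up.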

Set up virtual hosts.

Finally, we will set up virtual hosts. This allows your server to handle more than one domain name. Even if you don't think you need more than one domain, it's a safe bet that before long you'll be glad you took care of this ahead of time.

We will create a file that tells Apache how to decide which directory to use for what request. There is an example file already waiting for us, so it gets pretty easy.

  1. Tell Apache to use the vhosts file. To do this we make one last edit to httpd.conf. After this, all our tweaks will be in a separate file so we don’t have to risk accidentally messing something up in the master file.
    1. In /opt/local/apache2/conf/httpd.conf, find the line that says
      1. #Include conf/extra/httpd-vhosts.conf
      and remove the #.
    2. The # told Apache to ignore the include command. Take a look at all those other files it doesn’t include by default. Some of them might come in handy someday...
    3. Save the file and restart Apache
    4. Test by going to your old friend http://127.0.0.1
    5. Forbidden! What the heck!?! Right now, that's actually OK. The vhosts file is pointing to a folder that doesn't exist and even if it did it would be off-limits. All we have to do is modify the vhosts file to point to a directory that actually does exist, and tell Apache it's OK to load files from there.
  2. Before going further, it's probably a good idea to figure out where you plan to put the files for your Web sites. I've taken to putting them in /opt/local/www/domain.com/public/ - not through any particular plan, but /opt/local/www is the default location for phpMyAdmin and I just went with it. The public part is so you can have other files associated with the site that are not reachable from the outside.
  3. Set up the default host directory
    1. Open /opt/local/apache2/conf/extra/httpd-vhosts.conf for editing.
    2. You will see two example blocks for two different domains. Important to note that if Apache can’t match any of the domains listed, it will default to the first in the list. This may be an important consideration for thwarting mischief.

      The examples provided in the file accomplish one of the two things we need to get done — they tell Apache what directory to use for each domain, but they do nothing to address what permissions Apache has in those directories. A lot of people put the permissions stuff in the main httpd.conf, but why not keep it all in one place and simplify maintenance while we reduce risk?

      Here's an example:

      <VirtualHost *:80>
          ServerAdmin [email protected]
          DocumentRoot "/opt/local/www/mydomain.com/public"
          ServerName mydomain.com
          ServerAlias www.mydomain.com
          ErrorLog "logs/mydomain.com-error_log"
          CustomLog "logs/mydomain.com-access_log" common
          <Directory "/opt/local/www/mydomain.com/public">
              Options FollowSymLinks
              AllowOverride None
              Order allow,deny
              Allow from all
          </Directory>
      </VirtualHost>

      You can see where it sets what directory to go to, where it says to treat www.mydomain.com the same as mydomain.com, and then in the Directory block it sets permissions. The actual permissions instructions are pretty arcane. The most important thing to note is the line

      1. AllowOverride none
      This is not typical, but it's better, as long as you don't forget you did it.

      Here's the skinny: A lot of web apps like WordPress and Drupal need to set special rules about how certain requests are handled. They use a file called .htaccess to set those rules. By setting AllowOverride none you're telling Apache to ignore those files. Instead, you can put those rules right in the <Directory> blocks in your vhosts file. It saves Apache the trouble of searching for .htaccess files on every request, and it's a more difficult target for hackers. .htaccess is for people who don't control the server. You do control the server, so you can do better.

      1. If others will be putting sites on the server and you don't want them fiddling with the config files, you can allow .htaccess to override specific parameters. Read up in the Apache docs to learn more.
      2. If you are using SSL, you also need to set up a VirtualHost entry for port 443. That entry will also include the locations of the SSL certificates.
    3. Add further blocks that match the domains you will be hosting.
    4. Restart Apache and test your setup. http://127.0.0.1 should go to your default directory. Testing the domains is trickier if you don’t have any DNS entries set up for that server. I’ll write up a separate document about using /etc/hosts to create local domains for this sort of test.
  4. MILESTONE - You have done it. A fully operational LAMP environment on your Mac, suitable for professional Web hosting.
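
Once you are hosting more than a couple of domains, typing those blocks by hand gets old. Below is a sketch of a generator that prints a block in the same shape as the example above for a given domain and document root. make_vhost is a hypothetical helper of mine, example.com is a placeholder, and I have left out ServerAdmin, so add that line yourself if you want it.

```shell
#!/bin/sh
# Sketch only: make_vhost is a hypothetical helper, not part of Apache.
# It prints a VirtualHost block modeled on the example in this section.
make_vhost() {
    domain="$1"
    docroot="$2"
    cat <<EOF
<VirtualHost *:80>
    DocumentRoot "$docroot"
    ServerName $domain
    ServerAlias www.$domain
    ErrorLog "logs/$domain-error_log"
    CustomLog "logs/$domain-access_log" common
    <Directory "$docroot">
        Options FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
EOF
}

# Usage: print a block, look it over, then paste it into httpd-vhosts.conf:
#   make_vhost example.com /opt/local/www/example.com/public
```

Printing to the terminal rather than appending straight to the vhosts file is deliberate; it leaves you a chance to review the block and run the syntax check before Apache ever sees it.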

(Optional) Install phpMyAdmin

phpMyAdmin makes some database operations much easier. There have been security issues in the past, so you might reconsider on a production machine, but on a development server it can be a real time saver.

    1. sudo port install phpMyAdmin
  1. Update your Virtual Hosts with the domain you want to use to access phpMyAdmin, which is by default at /opt/local/www/phpmyadmin/
  2. test - log in as root.
  3. Configure - configuring phpMyAdmin fills me with a rage hotter than a thousand suns. It just never goes smoothly for me, whether I use their helper scripts or hand-roll it while poring over the docs. Maybe if I do it a few more times I’ll be ready to write a cookie-cutter guide for that, too. In the meantime, you’re better off getting advice on that one elsewhere.

Wrapping Up

I hope this guide was useful to you. I'm the kind of guy who learns by doing, and I've made plenty of mistakes in the past getting this stuff working. Funny thing is, when it goes smoothly, you wonder what the big deal was. Hopefully you're wondering that now.

If you find errors in this guide, please let me know. Things change and move, and I'd like this page to change and move with them.

Keep up to date: One of the big advantages of this install method is that updates to key software packages get to your server faster. Use that power. Run the update commands from the Prepare the Box section (sudo port selfupdate and sudo port upgrade outdated) regularly.

  1. The script that tests the PHP-MySQL connection is based on one I found at http://www.pinoytux.com/linux/tip-testing-your-phpmysql-connection

Appendices

Appendix 1: A brief explanation of sudo

In the UNIX world, access to every little thing is carefully controlled. There's only one user who can change anything they want, and that user is named root.

When you log in on a Mac, you're not root, and good thing, too. But as an administrator, you can temporarily assume the root role. You do this by preceding your command with sudo. (That's an oversimplification, and you will have earned another Geek Point when you understand why. In the meantime, just go with it. sudo gives you power.)

When you use sudo, you type your password and if the system recognizes you as an administrator it will let you be root for that command.

For convenience, you only have to type your password every five minutes, but you do need to repeat 'sudo' for each command.

Just remember, as root you can really mess things up.

Appendix 2: On editing text files and permissions

Jerry told me to edit the file, you lament, but he didn't say how. Kind of strange, considering the minute detail of the rest of the guide. The thing is, there's not one easy answer.

Let's start with the two kinds of text editors. There are editors like vim and pico that run right in terminal. They are powerful, really useful for editing files on a remote box, and if you know how to use them you're not reading this footnote. The other option is a windowed plain-text editor. TextEdit is NOT a plain-text editor. There are a lot of plain-text editors out there, and they all have their claims to fame. You can use any of them to edit these files.

Whoops! That brings us to the gotcha: permissions. In UNIX, who can change what is tightly controlled. Many of the files we need to edit are owned by root, the God of the Machine, so we need to get special permission to save our changes. Many of the plain-text editors out there will let you open the file, but when it comes time to save... they can't. You don't have permission.

Some editors handle this gracefully, however, and let you type your admin password and carry on. BBEdit and its (free) little brother TextWrangler give you a chance to type your password and save the file. I'm sure there are plenty of others that do as well.

BBEdit and TextWrangler also allow you to launch the editor from the command line, so where I say above edit ~/.profile, you can actually type edit ~/.profile and if you have TextWrangler installed, it will fire right up and you'll have taken care of the permissions issue. (If you decided to pay for BBEdit, the command is bbedit ~/.profile.) I'm sure there are plenty of other editors that do that too.

I'm really not endorsing BBEdit and TextWrangler here; they just happen to be the tools I picked up first. Over time I have become comfortable with their (let's call them) quirks. Alas, finding your text editing answer is up to you. If you're starting down this path, it's only a matter of time before you pick up rudimentary vim or pico skills; eventually you'll be using your phone to tweak files while you're on the road. It's pretty empowering. But is now the time to start learning that stuff? Maybe not. It's your call.


I Just Slid Wikipedia a Couple of Bucks

December 14th, 2011
Because it's useful to me.

I use Wikipedia regularly, and apparently it’s costing them a bundle to keep the servers going. While I have on occasion had issues with the way they run things, overall this is shaping up to be a humanity-changing effort. So I slid them a couple of bucks. If you use Wikipedia a few times a week, you should too. They’re looking for big donations, but if everyone voluntarily pays just a little we get closer to the utopian ideal.

They ARE Watching You

November 21st, 2011
I'm late to the party with this particular hand-wringing, but it seems the tempest blew over without much discussion of the actual problem.

Near the beginning of the novel 1984, Winston Smith is in his apartment, doing his state-mandated exercises in front of the TV. Suddenly a voice blares from the speaker and reprimands him for not making more of an effort. We learn at that moment that the telescreen is a two-way device; it watches you as you’re watching it.

Now we call that machine Kinect for XBOX Live.

Some of this is old news in privacy circles; it was more than a year ago that Microsoft first bragged to investors that the Kinect platform could be used to gather data on people using their product — what people are wearing, and things like that. This is what happens when you have a Web-cam in the house that’s always connected to the Internet, and someone you don’t know is on the other end.

Well, as you might expect, these revelations raised quite a kerfuffle. Microsoft very quickly and very loudly promised not to use data gathered through the camera in your home for targeted advertising. In the articles I read, journalists took two approaches:

  1. Whew! I’m sure glad Microsoft promised not to be evil!
  2. You know, targeted advertising isn’t as bad as people keep claiming. Relax and get information tailored to you.

The commentary, and Microsoft’s reassurances, miss the point entirely. With the government pulling flagrant rights violations like National Security Letters, how long before the video feed in your living room is handed over to the FBI? Hell, it might have happened already. Microsoft would be legally barred from telling anyone it even happened. This is the state of our constitution these days.

(If the government really thinks this is all cool and the public wouldn’t mind, why do they work so hard to keep it secret?)

There are ways to prevent the video feed from reaching the outside world, but as I understand it, the default is always on. Not only can it report what game (or political convention) you’re watching, it can report when you cheer. Better think twice about that Che Guevara poster on the far wall from the TV. My video-game playing, dope-smoking neighbors may not be too concerned about privacy anyway (judging by the clouds drifting through the neighborhood), but I doubt they’d feel great about knowing they have a live video feed that any government monkey with a frightening letter will be able to watch.

Let me repeat that just so I’m clear: Any government monkey with a frightening letter will have access to a live video feed from your living room, as well as every email you’ve ever sent and what you checked out at the library. Things are bad enough without handing them the most invasive tool yet to pry into your lives.

I would LOVE to see a big company like Microsoft stand up to the government and publish a policy that states that they will not surrender the feed without a legal warrant signed by a judge. The chances of that actually happening are zero, unless Microsoft thinks it’s losing a very large amount of business due to those privacy concerns. That’s not an indictment of Microsoft; I doubt any major US corporation is ready to go to the mat with the Feds on this one.

Microsoft once more finds itself in the very familiar position of creating something that sounds really cool without considering all the consequences, much like when they put into Microsoft Office a system specifically tailored for adding executable code to Office documents. Office automation, they called it. A great time-saver. “Capital idea!” shouted the virus writers with glee. Now once more Microsoft has come up with something that is almost magic in how it works (e.g., parental controls based on the metrics of the people in the room), but those things require the camera to be on, even when you’re just watching TV.

If someone gave me a free Kinect and XBOX, I’d probably use it. But I’d be very, very careful about when the Internet connection is active. And, while exercising I’ll be sure to give it my all.

Class A, Baby!

November 6th, 2011
Where have all the numbers gone?

Usually I blame the Chinese for every shortage or surfeit, and while they are definitely participating in this particular drought, it would be difficult to pin the blame wholly on them. Much of the problem lies closer to home.

You see, the world is running out of IP addresses. An IP address is like a computer’s phone number on the Internet. When you type muddledramblings.com, you start a complicated series of interactions (“I don’t know where that is, but I know who to ask…”) out there in the Interwebs and eventually it is resolved that what you’re looking for is computer 173.245.60.121. You get the same answer for JersSoftwareHut.com and jerryseeger.com. (That’s actually an IP owned by CloudFlare, who sends things on to the actual IP of 66.116.108.197. But that’s not what matters here…)

At the time of this writing, jer.is-a-geek.com resolves to 98.210.116.58, the IP of my home router. The actual number may change, but there will always be an IP address used up by the router. (Don’t bother going there; there’s nothing to see unless you use ssh and already have a key installed on your computer. (The key file itself is locked with a password I may have forgotten.))

Anyway, the IP address is a finite number, and so there is a limit to the total number of computers connected directly to the Internet. This is a very, very big number, but when they came up with the number they didn’t think people’s toasters (and telephones, and cars) would be connected to the Internet. (In your house, most likely your computers and other gadgets go through a router or a modem. That router has to have a unique ID, but the rest of your network uses a special range of IPs reserved for internal networks. So, your household only eats up one of the limited supply.)
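For the numerically inclined, here's a small Python sketch of that limit and of the internal-network ranges, using the standard ipaddress module. The 192.168.1.20 address is a made-up example of a gadget behind a home router, and the helper name is_internal is mine.

```python
import ipaddress

# An IPv4 address is a 32-bit number, so the supply is capped at 2**32.
TOTAL_IPV4 = 2 ** 32


def is_internal(address):
    """True if the address falls in a range reserved for private networks."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    print(f"{TOTAL_IPV4:,} possible IPv4 addresses")
    # A made-up gadget sitting behind a home router.
    print("192.168.1.20 internal?", is_internal("192.168.1.20"))
    # The home router address quoted earlier in this post.
    print("98.210.116.58 internal?", is_internal("98.210.116.58"))
```

The first address can be reused in every house on the block; the second is the one that actually counts against the global supply.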

We are starting to reach the limits of the IP system, just as in the US there was a shortage of telephone numbers. (Some of the reasons we ran out of phone numbers are similar as well, as I’ll mention in a bit.)

With phone numbers they split areas into smaller chunks, and created new area codes. While there was the inconvenience of people’s area codes changing, everything still worked.

The Techno-Wizards who run the Internet saw the IP problem coming some time ago, and set out to solve it. What they came up with was IPv6 (currently we are using IPv4). The only problem: the two systems are not compatible. So now a new network based on IPv6 is being deployed, and the people on it can’t look at Web sites that have IPv4 addresses without some sort of middleman. Sucks to be one of those guys. (Muddled Ramblings is now visible on the IPv6 network thanks to CloudFlare.)

Meanwhile, at work, my team needed an IP address for one of our servers. We were advised by a coworker to just go ahead and grab a block of 256 addresses, so we’d have them if we needed them. Really? When IP addresses are running out?

Yep. It turns out that long ago, organizations who were on the ball could buy up huge blocks of IP addresses on the cheap. MIT bought a Class A* block, as did Stanford (who has given it back, I believe), the Army National Guard, IBM, HP (they have DEC’s block now, too, I think), and Apple. Each Class A block has almost 17 million IP addresses, and represents a significant chunk of all the IP addresses available.
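The arithmetic behind "almost 17 million" is simple enough to sketch in Python. Nothing here comes from any registry; it's just the powers of two implied by the prefix length, and the function names are mine.

```python
def addresses_in_block(prefix_length):
    """Number of IPv4 addresses in a block with the given prefix length."""
    return 2 ** (32 - prefix_length)


def share_of_ipv4(prefix_length):
    """Fraction of the whole IPv4 space that one such block represents."""
    return addresses_in_block(prefix_length) / 2 ** 32


if __name__ == "__main__":
    # A Class A block is a /8; the 256-address block from work is a /24.
    print(f"/8 block:  {addresses_in_block(8):,} addresses")
    print(f"/24 block: {addresses_in_block(24):,} addresses")
    print(f"one /8 is {share_of_ipv4(8):.4%} of all IPv4 addresses")
```

Put another way, there are only 256 blocks of that size in the whole IPv4 space, so each one handed out is a noticeable bite.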

The US military has several blocks, and the British military has some as well.

Oh, and Amateur Radio Digital Communications has a Class A, along with Prudential Securities. Ford and Daimler. Three or four pharmaceutical companies. (I imagine Merck or whoever bought one, and their competitors followed suit out of habit.)

I think you might now be getting a glimpse of a core problem. The huge blocks of IP addresses were allotted to whoever asked for them, with no requirement that the organization actually show that they needed them or would not hoard them. Does Eli Lilly have a side business as a data center?

A possibly-apocryphal story I was told the other day: Back when IPs were up for grabs, someone at Apple proposed that they snag a Class A. The powers that be decided against the move, so he got the purchase of the block wedged into the budget for something completely unrelated. It turns out to have been a pretty savvy move. Now every IP address that starts 17. belongs to Apple.

Of the companies on that list, I’d certainly say Apple has more business owning a Class A block than many of the others. Whether the US Military really needs all those huge blocks I’m not qualified to argue. But the fact remains that while we would be running out of IP addresses eventually anyway, the careless and haphazard way they were originally handed out exacerbated the problem mightily.

I mean, does the Department of Social Security in the UK really need 16.7 million IP addresses? Really?

* The term ‘Class A’ is a little out of date, but reads better than ‘/8 block’

Note 1: I got my information here and there on the Internet, then found it all here.

Note 2: This episode contains a lot of parenthetical comments, part of my crusade to address the global overabundance of parentheses. I encourage you to use a few extras as well, until supply is back in balance with demand. (As usual, I blame the Chinese for the surfeit.)

Science

October 25th, 2011
It's not the answer to everything. Real scientists know that. Fake scientists don't even know what that means.

A few years ago I was at a party, and I was talking to a guy I’d met a few times before. “I don’t believe in X,” he said (I have no recollection what X was), “just like I don’t believe in relativity.”

I was young, and perhaps naïve, but I didn’t think relativity was a candidate to be part of a belief system. “What do you mean, you don’t believe in relativity?” I asked. Here was a chance, I thought, to explain the principle to someone who didn’t understand it.

I failed. I failed and got very frustrated, angry at myself for not explaining things better. Angry that I had not even put doubt into the non-believer. It went like this: He explained something he called “the inertia problem.” I assumed he’d picked it up from a book by some ‘rogue’ physicist (more on them later). He described the inertia problem. It was nonsensical and even if you helped it along a bit with incorrect terminology, it still had absolutely nothing to do with relativity.

In retrospect, I enumerated a few options for how to proceed:

  • Ask, “What does that have to do with relativity?” and address the incorrect linkages specifically.
  • Say, “Look, relativity has been measured over and over, in different ways, from the orbit of Mercury to clocks in the Apollo capsules. The work my own father does would simply break without it.”
  • Ask “Do you believe in gravity? Because that’s a hell of a lot more mysterious than relativity.”
  • Say, “Fortunately, relativity doesn’t need your faith to work.”
  • I could treat the “inertia problem” as a credible theory, work my ass off to recast it in terms that actually meant something, then demonstrate that my construct was, in fact, not in disagreement with relativity.

I think you can guess which course I took. Perhaps all of the above would have failed (more on that later, too), but just mentioning personal experience and giving a taste of the enormous pile of things that have verified relativity in the past century might have provided enough skepticism that at least the Unbeliever would not spread his Unfaith as fervently. (I wonder if he uses a GPS now? I wonder if he knows he’s using relativity?)

This guy thought of himself as a skeptic, as someone who didn’t just believe what everyone else did. In fact, he was not a skeptic at all. He was a Rogue wanna-be. The way to convince him of something was to start with, “The establishment doesn’t want me to say…” and then say something that implies special knowledge that no one else has. Some idiot whose concept of physics is mired in the 1850s writes a book saying that relativity is bogus, and members of the Rebel Dalliance hoist him on their shoulders. Stick it to the man! Believe a quack for no other reason than he says the establishment is wrong!

There’s never been a moon landing! Never mind that the junk is up there, in plain sight. For some reason Russia and China continue to cooperate with the US to perpetuate a hoax forty years later. Why do people believe that? Because it’s fun to style oneself as a rogue. As long as you only talk to other members of the Rebel Dalliance, you don’t have to discover that you’re an idiot.

Which brings me to evolution. Lots of people in this country don’t believe in it. As I could have said to the guy who didn’t believe in relativity, evolution doesn’t require their faith to work. The part that sticks in my craw is the large number of anti-evolution salesmen who claim that there are other scientifically-viable theories. Intelligent design and whatnot. A handful of ‘rogue’ scientists have done well for themselves proposing plausible-sounding stories and selling them as science. People will pay you to tell them what they want to hear.

Those theories are not science. In fact, they’re not even theories. A better name for ‘rogue scientist’ is ‘salesman’. Anyone who claims to be a scientist must always be ready to listen to more evidence and modify or scrap his favorite theory. It happens. But in science, even the guys who are wrong are improving the process, bringing up proposals and, most importantly, new tests to challenge the status quo. Sometimes (well, often) pride gets tangled up in things, but even then they are not rogues, they are stubborn scientists.

Science is about letting go. People who say science is messed up because people used to believe one thing but now believe something else are in fact demonstrating the strength of science. We learn. We grow. We change.

“I believe God made Adam from clay,” is perfectly all right with me. I have no difficulty with faith; it’s about the unknowable, about the places science can’t reach. Just don’t try to clothe faith in science and wedge it into the science curriculum at my local school.

If your theory can’t be tested, it’s not science. This is currently a hot topic at the most esoteric level of physics. The math works, but it’s hard to test without exploding suns to get the energy required. There are a lot of folks, promoters and skeptics alike, searching for planet-earth size experiments to test the math.

So, scientific theories have to be testable. Even that’s not enough, though. How many times have you started a sentence with “A study showed that…?” A bunch of times, right? Me, too. And I will again. Some of those studies are pretty crazy. But while you do it, remember this: A study has never shown anything. Ever. A single study is so vulnerable to mistakes and misinterpretation that you can never draw broad conclusions. The study has to be replicated, by someone else, using methods that answer questions raised by outsiders about the first study.

Remember cold fusion? Some guys were so excited about the result of their experiment that they bypassed normal science channels and went mainstream. The economic implications of their study were so world-changing that the entire scientific community dropped what they were doing to try to replicate that experiment in a hundred different ways. Turns out, the original experiment was flawed. (Somewhere, there’s a ROGUE SCIENTIST selling books telling of the coverup of cold fusion.)

Scientific evidence has to be repeatable. Predictably repeatable. Every measurement has to have an estimate of the likelihood that it’s wrong.

The biggest problem with teaching creationism alongside evolution in schools is that it clouds what science even is. Creationism as an ‘alternate theory’ totally confuses the definition of ‘theory’. When discussing science, creationism is most certainly not a theory. It can’t be tested. I don’t care what you think about dinosaurs; you could leave them out of the curriculum and I wouldn’t mind that much (the kids will supplement their own education on that score), but please, please, teach what science is, and even more importantly, what it isn’t.

Sooner or later our government will be filled with people who don’t even understand the nature of science, its strengths and weaknesses, yet they will be making critical decisions based on science. Ah, shit. That’s happened already.

If we all knew what science was, then when some oil-company-funded pundit comes on TV to ‘debunk’ global warming with feel-good talk about economic growth, the token scientist in studio to rebut could simply say, “that’s not science,” and the nation would nod and disregard the previous bloviations. “Now,” the anchor will say, “We can get to the real debate: what to do about it.”

The Rise and Fall of Adobe Flash

October 10th, 2011
The next generation has arrived.

A long, long time ago, I wanted to make lava lamp buttons for my Web site. I wanted the shape of the lava blobs to be random and mathematically controlled, and it had to be done with vector graphics – animated gifs would have been huge to provide something that even remotely felt random, and back in those days most people connected with dialup modems.

I searched high and low for a vector animation tool and couldn’t find one. There was Macromedia Director, which I used extensively back then, which put out files for Web play in a format called Shockwave, but it wasn’t a true vector-based program. Not the right tool for lava lamp buttons, that was for sure. I’d started playing with a Java applet to draw my buttons, but it seemed like vector animation was something the Web really needed. I mentioned this to a friend of mine, and he said, “Oh I know some guys with the tool you’re looking for.” At the time it was called FutureSplash.

I mentioned FutureSplash to my boss. It was going to be huge, I predicted. His response: “Maybe we should buy them.” (Ah, those dot-com boom days, how I miss them.) Three days later Macromedia announced that they had bought FutureSplash (for a lot more than we could have paid) and contracted the name to Flash.

The rest is history — until the present.

There was even a time when I imagined that a lot of the Web would end up as Flash. Or at least it should. Flash had a lot of things right that HTML had managed to screw up. You could do a lot more, and with Flash the Web experience began to approach the quality of experience people had in other parts of their computing lives.

Macromedia and later Adobe seemed to go out of their way to prevent Flash from taking over the Web. Creating Flash became ever more complex and ever more expensive. Nowhere was the simple “baby Flash” that Joe Amateur could use to build a nice site without first getting extensive training and shelling out a few hundred bucks for tools.

Meanwhile, Flash designers didn’t help in those early years, either. So much Flash became “look what I can do” rather than “look how I can make your visit to my Web site better” that Jane Surfer started resenting Flash. “I waited 60 seconds to download this?” A good example of that sort of waste is at the top of this page, in fact. There are a couple of fun things in the banner, but they don’t enhance the Muddled Experience very much.

Now, the world is shifting again. If you’re reading this site from your iPad, you don’t see the banner at all. No Flash in iOS. This is something the other tablet manufacturers have made a big deal of—but maybe not for very much longer. Microsoft’s next tablet OS won’t support Flash, either.

HTML, the platform I get paid to dislike, is becoming HTML, the platform I get paid to deal with. HTML5, CSS3, full SVG support, and robust JavaScript libraries make possible just about everything Flash can do, without Flash. That’s a lot of things to learn and manage to get a job done, however. Before, a designer could just master Flash and be confident that their work would look right wherever the Flash plugin was installed.

What’s needed is a tool like Flash that, after you’re done designing, outputs your masterpiece in Web-standard format, with HTML, CSS, and JavaScript. When something like that comes out, the handwriting will be on the wall for Flash.

And here it is. Adobe, makers of Flash, have announced Edge, the animation tool that will eventually replace Flash. It looks pretty good. It doesn’t do anything remotely close to what Flash does (no mention of audio that I’ve found, for instance, so my banner would have to forgo the theme song, and interactivity will have to be handled outside the tool as well, as far as my first glance tells me), but it does a great deal, and when you’re done the product will work in all modern browsers, including mobile ones. Adobe has applied their long, long experience making animation tools to make the user interface slick and clean (though you will want a really big monitor).

Flash will be around a long, long time yet; it still lets a developer build Web-based user interfaces that would be a pain in the butt to create from HTML and the rest of the alphabet soup. That gap is narrowing, however, and as Edge gains in features (and, alas, complexity), the marginalization of Flash will accelerate. I’m impressed that Adobe said, “If Flash dies, we’ll be the ones to kill it.” They really are the right people for the job. Now all we need is “baby Edge.”

Seven? Really?

October 7th, 2011
The truth behind the number surge.

A few days ago the Firefox team let forth a new major release. 7.0.1. Seven. That’s a lot of progress since earlier this year when they floated Firefox 4.

Most software companies would have labeled this release 4.3. The Firefox team has eschewed the first dot and has decided to make any release with a feature change a new major release. There is no n.1; the first decimal digit is entirely vestigial. There was no 4.1. There was no 5.1 or 6.1. There will be no 7.1, just 7.0.1. This might sound stupid, unless you have Inside Information. Which I have, thanks to Wikipedia.

The Internet Explorer team at Microsoft, sworn rivals of Firefox, are nonetheless ok guys who want to make this whole Web thing work. Back in the day when the Firefox team kicked the ass of the web world and released a browser that not only defined standards but provided the tools to help Web developers code to those standards, team FF were the guys to beat. On the release of FF3, the boys at Microsoft sent the team a cake. Firefox 4 was similarly honored. And FF5. And so on.

And now we see the real reason behind the accelerated numbering. Each major release gets a cake. If I was in charge, there’d be a new major version every Thursday.

* The Firefox team joked about sending a cake to Microsoft to honor IE 8 (or 7 or 9 and you shouldn’t ask me to remember shit like that), but they would send the cake along with the recipe. Open-source cake. But (as far as history records) they didn’t. Would’a been funny. There’s talk and there’s action, and seriously you don’t want to be on the losing side of that with Microsoft.

Then there’s Incapsula

September 24th, 2011

I’ve written about CloudFlare in the past. I think it’s a no-brainer for small-time bloggers like me who control their own domain name registry. My writing has attracted the attention of another company, Incapsula, who offer a similar service.

Incapsula would love for me to give them a try, so I can write about them, too. They’re under the impression that I have some sort of influence in the world. Ha! They’ve even offered me a free upgrade to the ‘pro’ level of the service. One really cool thing about the upgrade: out-of-the-box SSL, which means you don’t have to get your own certificate to handle commerce. Certificates can be a real hassle, and a considerable expense.

The thing is, I’m pretty happy with CloudFlare. As of today, people on IPv6 can read these words. (Much like telephone numbers in some areas, the world is running out of IP addresses.) I’ve worked out one kink with the system and things are running smoothly. Does Incapsula have code to install on the server to make it play well with others? I don’t know.

Also, I don’t really need any of the advanced services of either system. I don’t do e-commerce, which could be a compelling reason to switch and grab my free upgrade.

I have a couple of terrifically minor quibbles about CloudFlare’s user interface and flexibility blocking IP ranges, but nothing worth even mentioning here. Logically, I should just stick with CloudFlare and leave it at that.

Except…

That guy they think I am? The one whose words can shift the balance of power in an emerging new market? I’m not that guy. I’ll never be that guy unless I devote myself to the task, and I’ve got other things to write about that are probably more interesting to most of you. But still I want to be the guy they think I am. I want to write the CloudFlare vs. Incapsula smackdown article to which all the pundits refer.

To do something like that, I’d have to set up a site to use Incapsula, but I don’t want to rock the Muddled Boat. I have jerryseeger.com, but what sort of test do I get out of a site that no one ever visits? It’s a site where acceleration hardly matters because the whole thing is so simple, and there’s no sign of e-commerce on the horizon. The thing barely even gets spammed.

Still, I have to think of something… the public demands it!

Your Most Important Password

September 16th, 2011

I’ve mentioned passwords before, but today I’d like to tell you about the most important password in your possession, the single password that keeps the hordes at bay.

Take a moment to think about the passwords you use for your various secret stuff. If you’re like me, you have your ordinary password for unimportant stuff, then you ratchet up the entropy for sites that involve money. For a long time I had two passwords, my ‘secure’ one and my ‘other’ one. Now I’ve started taking my passwords a lot more seriously, which means keeping a file of all my passwords, itself protected with massive encryption and the most awesome passphrase ever. No one’s getting into that file.

But here’s the thing: they don’t have to. There’s another password I have that’s just as powerful and easier for a bad guy to use. My primary email password.

How does that password drop my trousers universally? Simple: if someone had access to my email, they could click “I forgot my password” on every site in the world and harvest the responses. If the evil robot cleared out the emails before I read them, I’d be none the wiser. And I’d be fucked.

You might think your online banking password is the one you must protect most diligently, but your email password will hand them your bank account along with everything else. This is the password to protect and change regularly.

As an aside, you can make things a little tougher for bad guys by modifying your email address when you register for stuff. For instance, if I register at xyz.com, I might use [email protected] for my email address. The cool thing about ‘+’ is that it doesn’t change the delivery (the above will go to [email protected]) but you can sort your email based on the suffix, and you can track who gave your email address away. Most significantly, if some wrongdoer has your email password, they still have to guess the +suffix part for each site before they can use the “I forgot my password” feature. If your email password gets out, that second line of defense could really save your ass.*
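If you want to automate the +suffix trick, a minimal Python sketch might look like this. The address someone@example.com and both function names are placeholders of mine, and it assumes your provider delivers plus-tagged mail to the base address.

```python
def tagged_address(address, tag):
    """Insert '+tag' before the @, e.g. for registering at a particular site."""
    local, _, domain = address.rpartition("@")
    return f"{local}+{tag}@{domain}"


def split_tag(address):
    """Return (base address, tag) so incoming mail can be sorted by tag."""
    local, _, domain = address.rpartition("@")
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", tag


if __name__ == "__main__":
    # Placeholder address; substitute your own.
    signup = tagged_address("someone@example.com", "xyz")
    print(signup)
    print(split_tag(signup))
```

A tag that matches the site name is easy to remember but also easy for a bad guy to guess; something less obvious makes the second line of defense a little sturdier.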

Also, know that if your email provider gets hacked, you could be hosed. There is one major company (rhymes with achoo!**) that seems to have a hard time keeping the wrong guys out of your account (although I think it’s the address book that has been compromised, and not direct access to your emails). There are likely others that do a better job keeping their names out of the press when they spill your information.

So, to flog the horse: If bad guys get access to your email, they own you. Protect that password diligently. Change it fairly often. Use [email protected] when you sign up for stuff. In databases around the globe, your email is quite literally your entire identity.

* I read somewhere that hotmail and some others don’t support the + in emails. I haven’t tested personally, but if your provider is one of those, drop them immediately and find a better service.

** I’m pretty sure I have stock in a company that ends oo!, so I’m not just slinging mud here.

Bad Behavior, CloudFlare and Google Bot

August 29th, 2011
They can all get along, but it might take some fiddling.

This blog has several layers of protection from the evils of the outside world, but those layers don’t always get along. One problem that I had is pretty common among CloudFlare users, and the documentation provided by the relevant players has a hole in it – a key nugget of information that can make all the difference.

The nugget follows in due course.

My first line of defense from ne’er-do-wells and miscreants is CloudFlare. They stop most of the bad guys before they even reach my site. Still, for some sorts of attacks, when there’s doubt it’s better to let the bad guy through. It may turn out to be a good guy.

A program called Bad Behavior is my next line of defense. It sits on my server and quickly spots liars and weasels. For dangerous-looking attacks, that’s the limit. But, when there’s doubt and the site itself is not at risk, Bad Behavior will let the attack through.

At this point, ‘attack’ means ‘comment spam’. Everything else is stopped before it reaches this stage. Most of the comment spam has been stopped as well, but some has been given the benefit of the doubt. That’s where Akismet comes in. This layer spots the rest of the comment spam, and it can be much more aggressive since it doesn’t actually delete the spam, it puts it into a bin for future review. So, legitimate comments can be rescued by an alert blog admin.

It works pretty well. Three spams actually got through all the layers last week, the first time any have gotten through in quite some time. Somewhere, a spammer popped a bottle of bubbly.

So comment spam is pretty well thwarted. Hooray! Unfortunately, for a while I had a pretty big problem. Search engine robots were being denied. I fell off Google and Yahoo! and all the rest, and traffic to this site dwindled.

Here’s what was going on:

  1. Googlebot said ‘hey, muddledramblings.com, show me page x’.
  2. The request must get past CloudFlare. No problem. They see it’s the real Google bot and pass the request on to my server.
  3. Bad Behavior is next. They look at the incoming message and see something that claims to be a Google bot but it’s not coming from Google. It’s coming through CloudFlare. Bad Behavior says, “You are a lying sack of shit and a false Google bot. You are obviously evil and you may not pass.” Google is shut out. The other legitimate robots are cut off as well.

This problem is pretty easy to fix, but not quite as easy as WordPress admins would like to hope. CloudFlare has code that you can install on your server that will straighten the whole problem out. Basically it tweaks incoming messages so that the original source appears instead of CloudFlare. This bit of fix-it code is available as a WordPress plugin, so you can install the plugin and rest easy.
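To make that "tweak" concrete, here is a minimal sketch of the idea in Python. It is not CloudFlare's actual code. It assumes the original visitor address arrives in the `CF-Connecting-IP` request header, and the proxy ranges below are placeholder documentation addresses, not CloudFlare's real ones.

```python
import ipaddress

# Placeholder proxy ranges for illustration only (RFC 5737 documentation
# blocks); a real setup would use the ranges the proxy service publishes.
TRUSTED_PROXIES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def real_client_ip(remote_addr, headers):
    """Return the visitor's address, undoing the proxy hop when it is safe to."""
    peer = ipaddress.ip_address(remote_addr)
    forwarded = headers.get("CF-Connecting-IP")
    # Only believe the header when the connection really came from the proxy;
    # otherwise anyone could forge it and pretend to be someone else.
    if forwarded and any(peer in net for net in TRUSTED_PROXIES):
        try:
            return str(ipaddress.ip_address(forwarded.strip()))
        except ValueError:
            pass  # malformed header: fall through to the socket address
    return remote_addr
```

The trust check matters: if the header were honored from any source, a spammer could simply claim to be a search engine.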

But that’s the thing that tripped me up and is not explained in the docs. In the case of working with Bad Behavior, the WordPress Plugin is not enough.

The catch is that Bad Behavior does its magic before the CloudFlare plugin can do its magic. So, even with the CloudFlare plugin firmly installed, Bad Behavior will reject Google bot and all his pals.
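The ordering problem is easier to see in a toy model. The sketch below is purely illustrative: the addresses, the `restore_ip` and `bot_check` stages, and the request dictionary are all made up, and the real Bad Behavior check is surely more involved than a single comparison.

```python
# Hypothetical addresses: one stands in for the proxy, one for a real crawler.
PROXY_IP = "203.0.113.7"
CRAWLER_IP = "198.51.100.20"


def restore_ip(request):
    """Swap the proxy's address for the original one it forwarded."""
    original = request.get("forwarded_for")
    if request["ip"] == PROXY_IP and original:
        request = dict(request, ip=original)
    return request


def bot_check(request):
    """Reject anything claiming to be the crawler from the wrong address."""
    if "Googlebot" in request["agent"] and request["ip"] != CRAWLER_IP:
        raise PermissionError("false Google bot")
    return request


def handle(request, stages):
    """Run the request through each stage in order."""
    for stage in stages:
        request = stage(request)
    return "page served"


incoming = {"ip": PROXY_IP, "forwarded_for": CRAWLER_IP, "agent": "Googlebot"}

# Fix runs first (the Apache-module situation): the crawler gets its page.
print(handle(incoming, [restore_ip, bot_check]))

# Check runs first (the plugin situation): the crawler is turned away.
try:
    handle(incoming, [bot_check, restore_ip])
except PermissionError as err:
    print("blocked:", err)
```

Same two pieces of code, same request; only the order differs.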

There are two simple solutions: 1) Install the CloudFlare Apache module, which kicks in before anything else is run. This is preferable to the WordPress plugin anyway, because it’s a system-wide solution. 2) If you don’t have that level of control over your server, turn off Bad Behavior. It’s a shame to lose that layer of protection, but not devastating; there’s some overlap between what CloudFlare stops and what Bad Behavior stops. You still have two layers and your own alert management to fall back on.


How This Blog Works

August 20th, 2011
Were it not for a couple of bad Web hosts, I might never have learned all this stuff.

Over the years, the technology behind this blog has gone from cave-dwelling stone-knives-and-bearskin static pages to cloud-city jet-packs-and-lightsaber dynamic yumminess. That transformation starts with WordPress but does not end there. Not by a long shot.

I started the Muddled Media Empire using a tool called iBlog, because it was free and worked with Apple’s hosting service, which I was already paying for. iBlog’s claim to fame was that it didn’t require a database – every time you made a change it went through and regenerated all pages that were affected. Toward the end, that was getting to be thousands of pages in some cases, each of which had to be uploaded individually. When iBlog’s support and development faltered, it was already past time for me to move on.

WordPress is an enormously popular Web-publishing platform. It comes in two flavors: you can host your blog on their super-duper servers and accept their terms of service and the slightly limited customization options, or you can install the code on your own server and go nuts. I chose the latter, mainly because I wanted to be able to touch the code. I’m a tinkerer.

So I signed up for a cheap Web host and set to work building what you see now. At first things were great, but after a while the host started having issues, and the once-great customer service withered up and vanished. So much for LiveRack. I think they just didn’t want to be in the hosting business anymore. I moved to iPage.

iPage was cheap, but I was crammed onto a server with a bunch of other people and sometimes my blog would take an agonizing time to load. Like, almost a minute. Then there was the time a very popular Geek site linked to my CSS border-radius table and iPage shut me down because the demand on the server was too much. Ouch! My moment in the sun became my moment at the bottom of a well.

I set out to find ways to make this blog more server-friendly and more user-friendly at the same time. Step 1: caching. WordPress doesn’t store Web pages, it stores data and the instructions on how to build a Web page. So, every time you ask to load a page here, WordPress fires up a program that reads from the database and assembles all the parts to the page. The thing is, that takes longer than just finding the requested file and sending it back, the way iBlog did. Caching is a way for the server to say, “hey, wait a minute – I just did this page and nothing’s changed. I’ll just send the same thing I did last time.” That can lead to big savings, both in time and server load.
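For the curious, the core of that idea fits in a few lines. This is only a sketch of page caching in general, not how WordPress or any particular plugin does it; `PageCache` and `slow_render` are names invented for the example.

```python
class PageCache:
    """Keep rendered pages until the underlying content changes."""

    def __init__(self, render):
        self.render = render      # the slow "build it from the database" step
        self.pages = {}           # path -> finished HTML
        self.builds = 0           # how many times we did the slow work

    def get(self, path):
        if path not in self.pages:
            self.pages[path] = self.render(path)
            self.builds += 1
        return self.pages[path]

    def invalidate(self, path):
        """Call when a post is edited or gets a new comment."""
        self.pages.pop(path, None)


def slow_render(path):
    # Stand-in for the database queries and template assembly.
    return f"<html><body>Contents of {path}</body></html>"


cache = PageCache(slow_render)
cache.get("/about")
cache.get("/about")          # served from the cache, no rebuild
cache.invalidate("/about")
cache.get("/about")          # rebuilt because something changed
print(cache.builds)          # prints 2
```

Three requests, two builds. On a busy page the ratio gets much better than that.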

I looked at a few WordPress caching programs and eventually chose W3 Total Cache, because it does far more than just cache data. For instance, it will minify scripts and CSS files (remove extra spaces and crunch them down) and combine the files together so the browser only has to make one request. It will zip the data, meaning fewer 1s and 0s moving down the pipe, and it does a few other things as well, one of which I will get to shortly.
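Here is a rough sketch of the minify, combine, and zip steps using nothing but the Python standard library. The minifier is deliberately crude (it would mangle whitespace inside quoted strings, for one) and is not what W3 Total Cache actually runs; `minify_css` and `bundle` are names made up for the example.

```python
import gzip
import re


def minify_css(css):
    """Crude minifier: drop comments and squeeze out extra whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    css = re.sub(r"\s+", " ", css)
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)
    return css.strip()


def bundle(stylesheets):
    """Minify each sheet, join them into one file, and gzip the result."""
    combined = "".join(minify_css(sheet) for sheet in stylesheets)
    return combined, gzip.compress(combined.encode("utf-8"))


sheets = [
    "/* layout */\nbody {\n  margin: 0;\n}\n",
    "h1 {\n  color: red;\n}\n",
]
combined, packed = bundle(sheets)
print(combined)
```

On files this tiny the gzip overhead can outweigh the savings; the payoff shows up on real stylesheets and scripts.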

I installed W3 Total Cache, and although some settings broke a couple of javascripts (for reasons I have yet to figure out – I’ll get to that someday), the features I could turn on definitely made a difference. Hooray!

But Muddled Ramblings and Half-Baked Ideas was still way too slow. I continued my search for ways to speed things up. I also began a search for a host that sucked less than iPage. (iPage was also starting to have outages that lasted a day or more. Not acceptable.) I decided I was willing to pay extra to be sure I wasn’t on an overwhelmed machine.

I’m not sure which came first – new server or Amazon Simple Storage Service. S3 is a pretty basic concept – you put your stuff on their super-duper servers, and when people need it they will get it really quickly. Things that don’t change, like images and even some scripts, can live there and your server doesn’t have to worry about them.

This is where W3 Total Cache earned my donation to their cause. You see, you can sign up for Amazon S3, and then put your account info into the proper W3TC panel and Bob’s Your Uncle. W3TC goes through your site, finds images and whatnot, puts them in your S3 bucket, and automatically changes all the links in your Web pages to point to your bucket instead of your own server. (Sometimes I find I have to copy the image to my S3 bucket manually, but that’s a small price to pay.)
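The link-changing part is conceptually just a search and replace. A minimal sketch, assuming static files live under WordPress's usual `/wp-content/uploads/` folder; both hostnames are invented, and the real plugin does considerably more than this.

```python
import re

# Both hostnames are made up for the example.
SITE = "https://blog.example.com"
BUCKET = "https://example-bucket.s3.amazonaws.com"

# Only rewrite static files under the uploads folder; pages stay put.
STATIC = re.compile(
    re.escape(SITE) + r"(/wp-content/uploads/[\w./-]+\.(?:png|jpe?g|gif|css|js))"
)


def point_at_bucket(html):
    """Swap the site's own host for the bucket's on static-file URLs."""
    return STATIC.sub(lambda m: BUCKET + m.group(1), html)


page = (
    '<img src="https://blog.example.com/wp-content/uploads/2011/08/raptor.jpg">'
    '<a href="https://blog.example.com/about/">About</a>'
)
print(point_at_bucket(page))
```

The image now points at the bucket while the ordinary page link is left alone.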

Now a lot of the stuff on my blog, like the picture of me with the Utahraptors the other day, sits on a different, high-performance server out there somewhere, and no matter how overwhelmed my server happens to be at the moment those parts will arrive to you lickety-split. Amazon S3 is not free, however – each month I get an invoice for two or three cents. Should Muddled Ramblings suddenly become wildly popular, that number would increase.

About that server – the next stop on my quest for a good host was a place called GreenGeeks. I wanted to upgrade to a VPS, which means I get a dedicated slice of a server that acts just like it was my very own machine. There is a lot to like about those, but my blog just wouldn’t run in the base level of RAM they offered. I upgraded and reorganized so that different requests would not take up more RAM than they needed. Still, I had outages. Sometimes the server would just stop freeing up memory and eventually choke and die. Since it was a virtual server in a standard configuration, logic says it was caused by something I was doing, but all my efforts to figure it out were fruitless, and GreenGeeks ran out of patience trying to help me figure it out.

The server software itself is Apache. At this point I considered using nginx (rhymes with ‘bingin’ ex’) instead. It’s supposedly faster, lighter, and easier to configure. But, I already know Apache. I may move to nginx in the future, but it’s not urgent anymore.

During the GreenGeeks era I came across another service that improves the performance of Web sites while reducing the load on the servers. I recently wrote glowingly about CloudFlare, but I will repeat myself a bit here for completeness. CloudFlare is a service that has a network of servers all over the world, and they stand between you the viewer and my server. They stash bits of my site all around the world, and much of the time they will have a copy of what you need on hand, and won’t even need to trouble my server with a request. About half of all requests to muddledramblings.com are magically and speedily taken care of without troubling my server at all. They also block a couple thousand bogus requests to my server each day, so I don’t have to deal with them (or pay for the bandwidth). It’s sweet, and the base service is free.

Unfortunately, it was not enough to keep my GreenGeeks server from crashing. Once more I began a search for a new host. I found through word of mouth a place called macminicolo. Apple employees get a discount, but I wasn’t an Apple employee yet. It was still a bargain. For what turned out to be the same monthly cost of sharing part of a machine at GreenGeeks, I get an entire server, all to myself, with plenty of RAM. I’ve set up several servers on Mac using MacPorts, and I knew just how to get things up and running well. It costs less than half what a co-located server costs anywhere else I have found (Mac, Windows, or Linux). (Co-location has up-front costs, but in the long term saves money.) So I have that going for me.

The only thing missing is that at GreenGeeks I had a fancy control panel that made it much simpler to share the machine with my friends. I do miss that, but I’m ready now to host friend and family sites at a very reasonable cost.

So there you have it! This is just your typical Apache/WordPress/W3 Total Cache/Amazon S3/CloudFlare site run off a Mac mini located somewhere in Nevada. Load times are less than 5% of what they were a year ago. Five percent! Conservatively. Typically it’s more like 1/50th of the load time. Traffic is up. Life is good.

Now I have no incentive at all to learn more about optimization.


Ubiquity Solutions: Evil or merely Overwhelmed?

August 14th, 2011

Note: Wow. This got long, and somewhat technical. For today, some of you might want to look at cute pictures of cats instead. I won’t mind.

I noticed the other day a huge rush of spam comments from IP addresses starting 108.62. I did a lookup and found that the whole block is owned by an outfit called Nobis Technology Group. Most of the addresses also mentioned Ubiquity Server Solutions. They are a massive hosting and colocation service. Basically, they supply the hardware and infrastructure, and their customers set up Web servers and whatnot.

Some of those customers (or the customers of the customers) send out a lot of spam. A truckload. In some cases the customer of a customer of a customer might have been lax and his server got hacked and turned into an unwitting spambot. In other cases the people using Ubiquity’s servers are likely institutional spammers.

Brief aside: Why does comment spam even exist in the first place? Google plays a big role there, with a number called Page Rank. Part of Page Rank (at least historically) was that more links pointing to a page make it land higher in Google searches. So, the spam comment isn’t to get readers of a blog to buy Doc Marten shoes, it’s to get that particular site to land higher in Google’s results when someone searches for them.

The thing is, Google doesn’t publish page rank numbers anymore, and they steadfastly maintain that the comment spamming actually hurts your results in a search. That hasn’t stopped many companies from promising higher sales and taking people’s money in return for smearing their name all over the Internet.

Google could go a long way toward eliminating this sort of spam by publishing page rank again, only now include the amount the rank was hurt by spamming activities. My shoe salesman above is not going to keep paying when Google shows the opposite of the desired result.

So anyway, using CloudFlare’s threat control, I blocked an entire range of IP addresses allocated to Ubiquity’s servers. Then another. I didn’t like this solution; I had no idea how many legitimate potential blog visitors I was blocking. After reading more, the answer surprised me.
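Blocking a range boils down to asking "is this address inside that network?" A small sketch with Python's `ipaddress` module follows. The `/16` below is only an assumption, reading "addresses starting 108.62" as the whole block; the ranges actually blocked may well have been narrower.

```python
import ipaddress

# Assumption: "addresses starting 108.62" is read here as the whole /16.
# The ranges actually blocked may have been narrower.
BLOCKED = [ipaddress.ip_network("108.62.0.0/16")]


def is_blocked(address):
    """True when the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in BLOCKED)


print(is_blocked("108.62.10.5"))   # True
print(is_blocked("108.63.0.1"))    # False
```

Adding "then another" range is just one more entry in the list.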

The folks at Ubiquity point out that they have terms of service that prohibit using their infrastructure to spam people. When I sent them a complaint, they were professional and courteous. They asked for more specifics, then said they’d sent a complaint to the culprit. Only after they’d asked what my domain name was.

Question: Did they send a message to the culprit saying ‘stop spamming people’ or did it say ‘stop spamming that guy?’

On other blogs where people have ranted about Ubiquity, representatives of the company have responded with measured, rational responses, explaining what a huge uphill battle it is for them, and asking the community to keep sending reports when spam comes from their range. Those reports make it possible for them to put sanctions on clients who are in violation of their terms of service. It is a huge problem and not easily solved.

And yet. Other hosting companies don’t seem as bad, from where I’m sitting.

One of those responses from a Ubiquity representative threw out the argument (I’m paraphrasing from this) “While it’s theoretically possible to monitor all data to weed out the 500MB/s of spam from the 2GB/s of legitimate traffic, that would be really expensive and we wouldn’t be able to compete in this market.” My first takeaway: they think 20% of the traffic from their servers is unethical. Wow. Now, that’s reading a lot into a statement like that, so take it with a grain of salt. Also, it was in a comment to a blog post and may well have been a typo in the first place.
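A quick check of that arithmetic, since the figure depends on how the quote is read. This assumes 2GB/s means 2000MB/s, and it shows both readings: the 2GB/s is legitimate traffic only, or it is the grand total.

```python
spam = 500            # MB/s, from the quoted comment
legitimate = 2000     # MB/s, treating 2 GB/s as 2000 MB/s

share_if_separate = spam / (spam + legitimate)   # 2 GB/s is only the legit part
share_if_included = spam / legitimate            # 2 GB/s is the grand total

print(f"{share_if_separate:.0%} or {share_if_included:.0%}")   # prints "20% or 25%"
```

The 20% figure matches the first reading, where spam is counted on top of the legitimate traffic.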

But still, it makes me wonder. And a request coming in to a server for data (legitimate traffic like a request to load a Web page) is fundamentally different than robots on a server sending unrequested data OUT (a high percentage of which will be spam), and sending emails (almost all of which will be spam). A small random sampling of GET and POST messages outbound from their data centers would probably smoke out the most egregious violators pretty quickly, and not require a lot of hardware to implement. (Not sure how I feel about this from a privacy standpoint.)

Once I got the message that Ubiquity had sent their complaint to the spammer involved, I unblocked that range. Sure enough, in a few minutes more spam came through. I sent the report and back up went the blockade. In my casting around the Internet I read assertions that were not contradicted (so must be true!) that said that NO legitimate traffic would come from those IPs anyway; they were the addresses of big servers and not IPs that would appear when Joe User is surfing. So there’s no downside to blocking them. (I’ll put the blocked ranges in a comment below, if you want to follow suit.)

Although, as I put the blockade back up, I had a thought: If I complain about every violation, and cc Google, then the cost of NOT clamping down more effectively on the host’s clients goes up. At some point, if enough people complain enough times, the cost of fixing the problem at the source becomes less than the cost of continuing to do business the way they are now.

That goes not just for Ubiquity, but for all hosts, and for Google and the other search engines. There is no incentive for them to play nice unless we create one.

Yep, I’m proposing fighting spam with a deluge of emails, and I’m probably too lazy to do it effectively.

Of course, this blog is hosted at a data center that almost inevitably will have spammers. Do I want to pay more for my own hosting because my data center has to install a bunch of spam detectors? In my case, I’d be willing to pay a bit more to know my host is doing the right thing, but I think I’d be in the minority. That makes it really difficult for one host to unilaterally decide to take the high road. And you’d be alienating about 20% of your customers, if Ubiquity’s off-the-cuff numbers are an indication.


CloudFlare = Awesome

August 13th, 2011
Too good, but true.

So by now you’ve probably heard of “the cloud”, but you might be vague on what the cloud actually is. That’s OK, the cloud is by nature vague. In short, it’s just a name that applies to what the Internet has been trying to do for a long time: information without location. You put a photo up in the cloud, and it’s just “out there”, not on any particular server, not in any particular data center, not in any given country. Could be there are copies of it all over the place, and when someone wants to look at the picture, The Cloud serves up the copy closest (in Internet miles) to the person who wants to see it.

This requires a lot of expensive equipment. Google and Amazon are the biggies in the cloud, but there are others as well, who, for a price, will host your data in a ‘cloudy’ way. In return, people around the world can load your stuff faster.

This humble blog is in the cloud. When you load a page here, roughly half the time the request doesn’t even reach my server (protected in a bunker somewhere in Nevada), but is instead served up from one of CloudFlare’s data centers around the globe. It’s pretty sweet, and has reduced the strain on my server (not that it’s working that hard anyway) while improving the Muddled Experience. The cost for this service? Nothing. It’s free.

I totally win.

CloudFlare also blocks a few hundred spammers each week, before my server has to go to the trouble of blocking them. They compile usage stats and provide other interesting information, and cut the load time for the blog about in half.

They’re a friendly bunch, too; when I suggested upgrades to their interface they wrote back with specific questions as well as thanks. A site they hosted was attacked from China a while back, and it brought down part of their network. They were right up front about the issue and what they were doing about it, and advised people on how to ‘de-cloud’ until the crisis was over. Not everyone was happy, but I was impressed. Soon after reading those communications I signed up.

How can they offer something like this for free? It’s the upsell, of course; they offer premium services. In addition they create a platform for other companies to sell stuff to me. Some of those services are pretty cool, too, though I haven’t dipped my toe in those waters yet (for instance, there’s a free service that checks your site now and then to see if it’s been hacked).

Overall, I can’t think of any reason NOT to use CloudFlare. Check ‘em out and tell them Jerry sent you!


OK, This is a Little Spooky

August 11th, 2011
Might as well send in your fingerprints.

I guess I knew this intellectually already, but reading this article really brought it home. We all know that info we post on Facebook and other social media sites is more or less public, no matter what security settings you use. The stuff just leaks out. Your birthday, gender, and zip code are enough to uniquely identify most of us in this country. Once someone has that, they can start to gather more information about you and share it with their friends.

But there’s another piece of information that most of us have shared all over the Internet, which when combined with the above, gives unscrupulous (or are they?) enterprises the ability to gather vast amounts of information about what you do even when you’re not using a computer.

What nugget of information is that? Your face. If you’ve used modern photo software, you may already have noticed that it’s getting pretty good at recognizing not just where faces are in your pictures, but whose faces they are.

Let’s say I own a big store, something like Target. I already have security cameras scattered liberally around the place. Imagine that now I can buy a list of faces in the zip codes close to my store. Suddenly I’m able to keep a record of which departments in my store each customer visits. The next time they come back, I can put a tease on a video screen as they walk in, tailored to their purchasing habits, or I can alert security if the person is a suspected shoplifter.

Of course, your friendly neighborhood government can use technology like that, too, and they already have your picture on file.

What to do about it? Realistically, nothing. The train has left the station, and there’s no calling it back. We could try to pass laws about this stuff, but they’d be pretty much impossible to enforce. You could try to scour the Internet and remove every picture in which you’re identified, but good luck with that.

The only counter-strategy I can think of off the top of my head is misinformation — tagging a whole bunch of different faces with your id, to create uncertainty over who the “real” you is. That only goes so far, however; once your face and credit card are linked at a retailer you’re done. It’s probably time to coach our children to not make the same mistake we did, instead to take a page out of Harlean’s book. She is a fiction. The Internet is no place for real people.

Coda:
The front panel of the article linked to above is about breaking the security on iPhones. It’s worth noting that while the article is correct, the same advice applies to anything protected with a password. The obvious thing missed in the article is that most people don’t put any password on their phone, rendering the rest of the warning moot. I use an Android, and my screen lock thingie has even fewer permutations than a 4-digit number. I’m not out to stop the pros; I put the lock on the phone when I read that California has ruled that searching a phone doesn’t require a warrant, even though searching a briefcase does. My lock is to stop prying during routine traffic stops. I don’t have anything to hide, but it’s important that everyone protects privacy, not just people with something to hide.

A closing note about passwords: