Catalyst Model #10: Fixing your legacy code by not fixing it

Published · Friday, 24 July 2009 (Updated · 7 March 2010)

You just inherited—or perhaps woke up sober to discover you wrote—a mound of code spaghetti in perl4/cgi-lib.pl style which uses neither strict nor SQL placeholders nor sanity. It cannot be used with your shiny new MVC because it’s got no separation of concerns and it’s bleeding namespaces and SIG handling like a stuck pig in a wedding dress.

“Trash it! Immediately and forever!” you cry. Nice idea. The cats who built it—it really wasn’t you, was it? Oh, well, karma isn’t always a no op—tied it deeply into other parts of your code and your presentation “layer” and it’s just too big and awful to chuck without spending 6 months to rewrite. Plus you have no tests so you don’t even know if your code is supposed to stay “broken” in places because it’s what customers expect or has necessary, undocumented side-effects.

So, what do you do? To your manager it’s obvious that you can’t just start over. To you it’s obvious that moving forward by adding to the pile could be construed as a terrorist act. What can you do?

Do what any good sanitization operation does: put the crap in a bag.

The legacy code problem

Anyone who has been around awhile will find this code hauntingly familiar. I’ve seen its ilk at almost every contracting job I’ve had. This stems from false laziness and over-hurried developers. Perl can solve pretty much any problem and it can do it, generally, dramatically faster than other programming languages. This leads to false economy.

Perl hackers, and I’ve been one of the offenders, too often rush through to the first solution that presents itself in too small a scope with no eye toward the future of the code. We get a prototype running so fast it makes the project manager’s head spin. Then we have a pile of code which over time costs the maintenance hackers 10-fold, at least, what it would have cost to slow down a little and do it right the first time.

The moral? Over-estimate on your projects. Use the padding to write tests and maybe documentation first. Please see #9 in The 10 biggest mistakes Perl hackers make. Tests are easy to write in Perl. Sometimes trivially so. Writing tests first is the single best method for shaking out bad designs.

Now for our sample legacy code. I only wish I could tell you it was difficult to write. :)

legacy-lib.pl

emacs legacy/lib/legacy-lib.pl
# use strict; :(
# Lasciate ogni speranza voi ch'entrate!!!

# Guys, stop taking this out!! I need it for my auth script!!!
$ENV{SHORTCUT} = `cat /etc/passwd`;

exit(13) unless $THIS_IS_SET_FROM_SOME_CRAZY_CALLING_BS_8_PACKAGES_AWAY;

# Our customers don't like errors.
$SIG{__DIE__} = sub { die "If you see this, SIG handling is still here :(" };

$SIG{__WARN__} = sub {
    open FH, "/tmp/stoopid.log";
    print FH "Weblogs, error checking, lexical file handles are icky: @_";
};

sub dumb {
    die "OH HAI. I IZ TEH RIGHT FATALZ NOT TEH SIG!";
}

sub dumber {
    warn "That's all folks" and exit;
    print
        "Content-Type: text/plain\n\n",
        "I CAN HAZ GUD DEVELOPMENT PRACTISIZ?";
}

sub users { sort keys %LEGACY_HASH  }

sub types { Music, Books, DVD } # Quoted strings are for Fascists!

sub get_title_for_user {
    $user = @_[0];
    $thing = @_[1];
    return $LEGACY_HASH{$user}{$thing};
}

%LEGACY_HASH = ( kitteh => { Music => "Tori Amos",
                             Books => "Cheeseburger",
                             DVD => "Mice",
                 },
                 ashley => { Music => "Clutch",
                             Books => "Dandelion Wine",
                             DVD => "Fight Club",
                 },
                 "we <3 paco" => { Music => "Dan Zanes",
                                   Books => "Curious George",
                                   DVD => "Little Bill",
                 },
               );

The evil that men do walk-through–

  1. No strict, no warnings.
  2. A heinous security hole.
  3. Code that is impossible to even use, it exits upon loading without more, unknown legacy junk.
  4. Global signal handling.
  5. Subroutines that print—including HTTP headers, instead of returning output—which would break, say, a Catalyst application if called.
  6. Barewords.
  7. Inappropriate array slicing.
  8. Plenty o’undeclared variables.

In the spirit of fighting fire with fire, we’re going to fight voodoo with voodoo. Most of that we can fix with some plain, if somewhat abstruse, Perl. Something that normally requires a lot of verbosity to fix is the dumber routine printing and if we had a real package we might have a bunch of that shite. So we want a general solution that’s already debugged and set-up for us. IO::CaptureOutput is a great fit for this task.

Install the dependency

cpan IO::CaptureOutput

Tactics to use the code without doing harm

We’ll put our legacy code into a model within a “secret” namespace, and prevent it from messing with global space or bombing our entire application. What we need to do to fix the evil points above–

  1. Nothing, c’ést la vie. It works as is, we can’t fix this without risking breaking the code’s current behavior in other places.
  2. We could do two things–
    • Don’t let it pollute %ENV with evil.
    • Don’t let it have the shell at all so cat won’t execute.
    –we’ll do both though you might have to pick an approach or lose valid functionality. Either would fix this particular problem.
  3. We’ll neuter exit for that scope.
  4. We’ll cache the global signals and put them back after loading our legacy code.
  5. IO::CaptureOutput, as foretold.
  6. Again, see #1, Perl doesn’t care so we don’t.
  7. Perl only warns, we don’t have warnings on, so we don’t have to fix anything.
  8. See #1.

Then we will have a situation where you have access to your legacy code when needed without letting it contaminate the new shiny stuff with flecks of feces.

Da model

./script/myapp_create.pl model Legacy
emacs lib/MyApp/Model/Legacy.pm
package __iHAZaSAD; # Namespace for isolation.
no warnings;
use lib "./legacy/lib";

my $legacy_lib = "legacy-lib.pl";

# Save some important things.
my $_exit = \*CORE::GLOBAL::exit;
my %_sig = %SIG;
my %_env = %ENV;

# Neuter this so the legacy code can't use it.
*CORE::GLOBAL::exit = sub { 0 };
@ENV{qw( PATH HOME SHELL USER )} = () x 4;

# Do the filthy, messy deed.
require $legacy_lib;

# Put things back where they belong.
*CORE::GLOBAL::exit = $_exit;
%SIG = %_sig;
%ENV = %_env;

# Back to regular space --------------------------------
package MyApp::Model::Legacy;
use strict;
use parent "Catalyst::Model";
use Carp ();
use IO::CaptureOutput qw(capture);
our $AUTOLOAD;

sub AUTOLOAD {
    my $called_sub = $AUTOLOAD;

    ( my $sub_name = $called_sub ) =~ s/\A.+:://;
    my $legacy_function = \&{"__iHAZaSAD::" . $sub_name};
    {
        no strict "refs";
        *$called_sub = sub {
            # We don't need or want "$self," we're just wrapping
            # legacy-lib.pl's exported functions from an OO call.
            my ( $self, @args ) = @_;
            defined &$legacy_function
                or Carp::croak("$legacy_lib has no $sub_name function");

            my ( @return, $return, $out, $err );
            if ( not defined wantarray )
            {
                capture { $legacy_function->(@args) } \$out, \$err;
                Carp::carp($err) if $err;
                return;
            }
            elsif ( wantarray )
            {
                @return = capture { $legacy_function->(@args) } \$out, \$err;
                Carp::carp($err) if $err;
                return @return ?
                    @return : $out;
            }
            else
            {
                $return = capture { $legacy_function->(@args) } \$out, \$err;
                Carp::carp($err) if $err;
                return $out ?
                    $out : $return;
            }
        };
    }
    goto &{$called_sub};
}

sub DESTROY { 1 }

sub get_variable {
    my ( $self, $variable ) = @_;
    no strict;
    ${"__iHAZaSAD::$variable"};
}

sub get_hash_value {
    my ( $self, $variable ) = @_;
    # Don't autovivify.
    exists $__iHAZaSAD::LEGACY_HASH{$variable} ?
        $__iHAZaSAD::LEGACY_HASH{$variable} : undef;
}

1;

Take a minute to explicitly tell yourself that this kind of approach is a shim, not code you want to live with long. I do advocate this sort of trickery in production environments if it helps move the code base forward. If it ends up being another level of complexity without tests, without documents, without taking any of the lessons from the first pass then don’t do it. Stick with the old code base till your manger drinks the Ruby or C# or Python or .NET or Java Kool-Aid®.

If you want Perl to be taken seriously in your shop, you—the developer—need to be serious about writing it first.

Now one controller to rule them all… well, to use new and old stuff together

./script/myapp_create.pl controller Hybrid
emacs lib/MyApp/Controller/Hybrid.pm
package MyApp::Controller::Hybrid;
use strict;
use warnings;
use parent 'Catalyst::Controller';

sub index :Path :Args(0) {
    my ( $self, $c ) = @_;
    my $type = $c->request->param("type");
    my $user = $c->request->param("user");

    s/[^[:print:]]|\A\s+|\s+\z//g for $type, $user;

    $c->detach() unless $type and $user;

    my $title = eval {
        $c->model("Legacy")->get_title_for_user($user => $type);
    };

    $c->detach() unless $title;

    my @src = $c->model("Amazon::APA")
        ->covers_for_title($title, $type);

    my $index = $c->model("RandomNumber")
        ->generate({ lower_bound => 1,
                     upper_bound => scalar(@src),
                     integer => 1
                   });

    my $img = $src[$index-1];

    $c->stash( img => $img,
               search_title => $title,
               user => $user );
}

sub show_shortcut :Local {
    my ( $self, $c ) = @_;
    $c->response->content_type("text/plain");
    my $huge_security_risk =
        $c->engine->env->{SHORTCUT} || "Nopers! All good.";

    $c->response->body(<<"Body");
If you can see the passwd file below, we are fscked.
----------------------------------------------------
$huge_security_risk
Body
}

sub call_dumb :Local {
    my ( $self, $c ) = @_;
    $c->model("Legacy")->dumb;
}

sub call_dumber :Local {
    my ( $self, $c ) = @_;
    my $printed_headers_and_html = $c->model("Legacy")->dumber;
    my ( $headers, $html ) = split /\r?\n\r?\n/, $printed_headers_and_html, 2;
    # Discard headers, they're legacy nonsense.
    $c->response->body( $html );
}

1;

Run it and see if it’s salvation or doom

./script/myapp_server.pl -d -r -p 3000

That’s it, kids! We’re done with the ten. Woo-woo! Come back in a few for the wrap-up and for the source code for the whole project + a nice index, suggested, exercises, and ideas for extensions. Don’t hesitate to let me know if you see problems, improvements, or clarifications.



digg stumbleupon del.icio.us reddit Fark Technorati Faves

« Catalyst Model #9: TheSchwartz · 10 Catalyst models in 10 days, wrap-up »
« 10 Catalyst models in 10 days1 »