Perl resources, modules, and sample code

The 10 biggest mistakes Perl hackers make

Friday, 13 June 2008; subject to ongoing, unannounced updates

Modern Perl: get it at Powell’s, Barnes and Noble, or Amazon.com, or download Modern Perl for free and then recommend it to your friends, write a great review, or otherwise spread the word.

Avoid these problems and you will save a huge amount of time, maybe years, in your journey toward Perl guru. Laziness, hubris, and impatience being the core Perl virtues, be lazy and learn from me. I know what I’m talking about. Now hurry up and get reading.

1. Not Reading the Fµ¢&¡^% Manual – perldoc

Perl has among the best, most accessible documentation in the hacking world. All core packages and functions are well documented and every module of even marginal quality comes with docs too. They are accessible any number of ways, including the command line—perldoc—and many websites—perldoc.perl.org and search.cpan.org for starters.

Don’t whine when an experienced hacker hits you on the nose with the rolled up newspaper known as RTFM. It might not feel like it but he’s being a friend.

2. No strictures. No warnings.

If you can parse this in your head–

undef $/; print <>^<>;

–then you are entitled to leave strictures out of your code if you feel like it.

The rest of you—if your every script does not begin with use strict;—or a module that imports it for you—you are a bad person and you are making the world a worse place. Save yourself even more trouble by always adding use warnings; and maybe use diagnostics; immediately under it.

use strict;
use warnings;
# use diagnostics; <-- turn on if you want more detailed warnings.

Alternatives, alike in spirit though not in actual effect, include–

use Moose;
# –or–
use Modern::Perl;
# –or–
use common::sense;
# –or–
use strictures;
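
If you want to see what you are buying, here is a minimal sketch of the classic failure strictures prevent: a typo'd variable name. The names $total and $totl are invented for the example, and the string eval is only a demo device so the compile error can be captured and printed instead of killing the script.

```perl
use strict;
use warnings;

# A typo'd variable name, compiled with and without strictures.
# $total and $totl are made-up names; string eval is used only so the
# compile-time error can be caught and shown.
my $code = q{ my $total = 10; $totl = $total + 5; $total };

# Without strictures the typo quietly creates a new global and nobody notices.
my $lax = eval "no strict; no warnings; $code";
print "without strict: total is still $lax, the typo went unnoticed\n";

# With strictures the same code refuses to compile.
my $strict = eval "use strict; use warnings; $code";
print "with strict: $@" unless defined $strict;
```

The second eval fails with a "Global symbol ... requires explicit package name" error, which is exactly the nudge you want before the code ever runs.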

3. Not using the CPAN.

This goes hand-in-hand with #7 below. I know, your host makes it difficult to install new modules. I know, your manager is a prick and thinks you should be able to parse URIs without installing URI::QueryParam. I know, you think a regular expression is all you need to convert some HTML to wiki text.

A smart Perl hacker once remarked that 90% of every Perl application is already written. He meant that the CPAN already has pretty much every conceivable piece of the puzzle just waiting for you to put together with a little design and some tests.

4. Using global variables.

Global variables, and by extension, singletons, are almost always a design mistake. Though designing software well is harder than writing any given bit of code well, it’s worth getting right. Well-designed software is easy to maintain, debug, and extend.

A variable introduced 5 packages away which can be modified anywhere in 50,000 lines of code is bullshit with teeth. Variables should be as limited in scope as possible. Like a good subroutine, if you can’t see it all on one terminal screen then it’s probably too spread out.
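
One way to keep state out of the global namespace, sketched under the assumption that all you need is a running counter: hold the value in a lexical and hand out a closure. The names make_counter and $next_id are hypothetical.

```perl
use strict;
use warnings;

# Instead of a package global that any code anywhere can change,
# keep the state in a lexical that only the returned closure can see.
sub make_counter {
    my ($start) = @_;
    my $count = defined $start ? $start : 0;   # private to this call
    return sub { return ++$count };            # the only code that can touch $count
}

my $next_id = make_counter(100);
print $next_id->(), "\n";   # prints 101
print $next_id->(), "\n";   # prints 102

{
    my $scratch = 'only visible inside this block';
}
# $scratch does not exist out here; under strict, using it is a compile error.
```

Nothing outside the closure can reset or corrupt the count, so there are far fewer places to look when something goes wrong.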

5. Flouting standards.

Though there are a few dozen styles possible in Perl there are some pretty solid standards out there. Adhere to them or be recognized not as a genius or trail-blazer but as the jerk who makes life a living Hell for the dev who inherits your code.

Names of packages and subs.
Packages (modules) in Perl are named with camel case, e.g., MooseX and HTML::TokeParser. Subroutines are named with lowercase words joined with underscores, e.g., content_type, path_query. A leading underscore on a sub indicates the sub/method is private and not to be used because its implementation (arguments, return values, side-effects) is not part of the module’s sanctioned functionality and is subject to unannounced or undocumented changes.
The only exceptions which make sense are when you’re adhering to an underlying standard. XML::LibXML uses camel case for methods—e.g., firstChild—because pretty much everything related to DOM hacking does, including the C library for which it’s an interface.
Variable names.
Variable names are like sub names (lowercase words joined by underscores) and should be meaningful; cf. literate, as in literate programming. Many hackers from C or Java bring bad naming habits like $c, $x, %n or $ThisDoesSomething and @ICantReadThisCrap. The first set inhibits understanding. The second inhibits legibility.
Package variables meant to be used anywhere in a package are title cased, e.g., $Widely_Used_Variable, or declared with use constant if appropriate. Constants are generally in ALLCAPS.
%hash is an awful variable name. In addition to being redundant, it conveys no information about its purpose. Pretend you are writing a story or description of what your code is doing. Good variable names make it much easier for the next hacker. Abbreviations and mindless entries like $data are false economy.
Style considerations.
There are some basic Perl idioms. Learn them. You should know about $_ and @_ and the contexts in which they are relevant. You should be aware that foreach is just for typed by someone who has time to waste. You should know that C-style loops are rarely necessary and harder to read when they aren’t. You should be able to recognize a Schwartzian transform and not bitch about it being difficult to read when you encounter one. Yes, it’s confusing the first time you see one but it’s a standard Perl idiom and once you know it, it can be easier to read than the 30 lines of code it might otherwise occupy.
Consistency.
Even where styles diverge, every Perl hacker with half a brain agrees, consistency of style is important. Keep your code consistent. Maintain the style of others’ code when you hack on it. Don’t rewrite someone else’s perfectly functional code just because it isn’t written the way you would have done it.
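
For anyone who has not met the Schwartzian transform mentioned under style considerations, here is a small sketch. The file names are made-up data; the shape (map, sort, map, read bottom to top) is the idiom itself.

```perl
use strict;
use warnings;

# Hypothetical data: sort file names by the number buried in each name.
my @file_names = qw( report-10.txt report-2.txt report-33.txt report-1.txt );

# Read from the bottom up.
my @sorted_by_number =
    map  { $_->[1] }                      # 3. unwrap the original string
    sort { $a->[0] <=> $b->[0] }          # 2. compare the precomputed keys
    map  { [ /(\d+)/ ? $1 : 0, $_ ] }     # 1. pair each name with its sort key
    @file_names;

print "$_\n" for @sorted_by_number;
```

The point of the idiom is that the sort key is extracted once per item rather than on every comparison, which matters when the key is expensive to compute.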

6. Comments instead of Pod.

Compare this–

#############################################################
# FormMail                        Version 1.92              #
# Copyright 1995-2002 Matt Wright mattw@scriptarchive.com   #
#############################################################

–with–

=head1 NAME

FormMail - Don’t ever use this script!

=head1 VERSION

1.92

=head1 COPYRIGHT

Copyright E<copy> 1995-2002 Matt Wright, C<mattw@scriptarchive.com>.

=cut

Now you may ask, “Why care?” They both have (approximately) the same information. Here’s the most obvious reason to care. This is what everyone sees when they look up the documentation for the first example–

 

 

That’s right, nothing. Code comments are not documentation.

This is what they see when they look up the documentation for the second–

NAME

FormMail - Don’t ever use this script!

VERSION

1.92

COPYRIGHT

Copyright © 1995-2002 Matt Wright, mattw@scriptarchive.com.

If you’re not documenting with Pod, you’re a bad Perl hacker. You think it’s too hard?

=head1 [ALL CAPS TITLES], =head2 [SUB HEADING]. =over [#] and =back for indentation. =item [text] for items like methods, functions, or lists. L<> for links to other packages or URIs, e.g.: L<Moose> and L<http://sedition.com>. C<code>, B<bold>, I<italic>, E<> for character entities like E<gt> and E<copy>. =cut to end a section of Pod and return to code. Indent with an arbitrary number of spaces to get verbatim output. Help if you need it: perlpod.

That’s it! If you can do that, you can do Pod. If you can’t do that, may I recommend PHP?
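
Put together, those few directives look something like this. It is a sketch of a hypothetical script, greet.pl, with an invented greeting function; the Pod directives are the ones listed above.

```perl
use strict;
use warnings;

=head1 NAME

greet.pl - hypothetical script showing the Pod basics

=head1 SYNOPSIS

    perl greet.pl

=head1 FUNCTIONS

=over 4

=item greeting( $name )

Returns a greeting for C<$name>, or for C<world> when no name is given.
See L<perlpod> for the full syntax.

=back

=cut

sub greeting {
    my ($name) = @_;
    $name = 'world' unless defined $name;
    return "Hello, $name!";
}

print greeting('Perl'), "\n";
```

Save it as a file and run perldoc on that file to see the rendered result; perl itself skips everything between the first =head1 and =cut.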

7. Reinventing the wheel.

Listen, man. All the templating packages have a learning curve. All the XML parsers take a few hours or a few days to master. URLs are much more complicated than they look. Even the regular expressions you think you understand are probably really quite bad and might even be dangerous in production. A really bad regex can take a few thousand years to search a string if conditions are right.

You think it’s going to be easier to roll your own solutions to these common problems. You’re wrong on several levels. It won’t be easier. You will do a poor job compared to what’s been done by better hackers and improved for years by their peers and code consumers. You will code in a bubble and bad design habits will radiate through all your code. It will waste time you could be doing real work. It takes eyeballs and potential bug fixes away from the packages you should be using.
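
A small illustration of how the home-made version goes wrong, assuming a comma-separated line with a quoted field. The input is invented, and Text::ParseWords is used here only because it ships with Perl; for real CSV work, Text::CSV on the CPAN is the usual tool.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);   # ships with Perl

# Hypothetical input: the second field contains a quoted comma.
my $line = 'widget,"Smith, Jane",42';

# The home-made approach: looks fine until a field contains the delimiter.
my @naive = split /,/, $line;

# The library approach: quoting is already handled (0 = strip the quotes).
my @parsed = parse_line( ',', 0, $line );

printf "split found %d fields, parse_line found %d\n",
    scalar @naive, scalar @parsed;
print "second field: $parsed[1]\n";
```

The split version reports four fields because it cuts the name in half; the library version reports three. That is one edge case out of many that somebody else already found, fixed, and tested.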

I know it seems fun. I know it’s hard to resist. I know writing your own templating system is all but Perl’s version of B’nai Mitzvah. Still, the hours and pain you’ll save by skipping this series of (mis)steps would serve you better doing anything else.

8. Cargo cult.

Never cut and paste code just because “it works.” Don’t use, or write, code you don’t understand. You’re responsible for what it does. If you don’t know what it does, it’s the hacking equivalent of firing a gun into the dark. You probably won’t kill anyone but if you do the cops won’t care that you’ve done it a thousand times before without hitting anyone.

9. Not writing tests.

Tests in Perl are so easy to write, and there are so many good examples to learn from in the test directories of various modules, that skipping them is the union of sloth and self-loathing. This is how easy it is–

use Test::More tests => 2;
ok(1, "The number one is true");
cmp_ok(1, ">", 0, "One is greater than zero");

Recent example: I wrote three modules for work last week that had an SQL WHERE clause builder. I wrote tests alongside all the code. About 200 total. As you can see above, each one took about 30 seconds to write. Yesterday I realized that I was being stupid because we had SQL::Abstract installed on all the machines so I switched the code to use it instead. About 15 of the tests failed after the change. Fixing each one was easy and since the tests were comprehensive, everything was fixed in 30 minutes.

A half hour to reconcile a major change to the internals of three modules. Without the tests it would have taken a day to track things down and I probably would have missed something which QA might have missed before it ended up in the laps of customers.
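
To give a feel for what such tests look like, here is a sketch built around a deliberately tiny, hypothetical where_clause function. It is not the code from the story and it is not SQL::Abstract; it only gives the tests something to aim at.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical, deliberately tiny WHERE-clause builder. It exists only
# so the tests below have a target.
sub where_clause {
    my (%condition) = @_;
    return ( '', [] ) unless %condition;
    my @columns = sort keys %condition;              # stable column order
    my $sql = 'WHERE ' . join( ' AND ', map { "$_ = ?" } @columns );
    return ( $sql, [ @condition{@columns} ] );
}

my ( $sql, $bind ) = where_clause( status => 'open', owner => 'ashley' );
is( $sql, 'WHERE owner = ? AND status = ?', 'columns joined with AND' );
is_deeply( $bind, [ 'ashley', 'open' ], 'bind values match column order' );

my ( $empty_sql, $empty_bind ) = where_clause();
is( $empty_sql, '', 'no conditions, no WHERE clause' );
is_deeply( $empty_bind, [], 'no conditions, no bind values' );

done_testing();
```

Because the tests pin down behavior rather than implementation, the body of where_clause could be swapped for a call into a real module and the same tests would say, within seconds, what changed.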

10. Spelling Perl, PERL.

It’s trivial and it might seem unimportant but, like farting at a fancy dinner, it shows who belongs at the table and who doesn’t. It’s only an acronym by accident, not design. The language is “Perl.” The compiler/interpreter is “perl.” The proof of someone who is unfamiliar with the language, the compiler, and the community is “PERL.”

11. Not using Perl Monks.

This is one for which you can’t be blamed. It falls under extra credit and credibility. Probably no one told you about it yet. Listen up.

There is no better technical support and discussion forum on the Internet. Have you ever Googled for some simple piece of XSLT or CSS and found yourself at some rip-off site to be told you just have to pay $X or fill out a huge registration to sign-up and get a probably wrong answer? You can go to perlmonks.org right now and get $500 worth of advice for free by posting an anonymous question. No lie.

Spending a year at Perl Monks—reading, asking questions, and trying to answer questions when you can—is better than a wall papered with Perl training certificates and it will cost you at least $10,000 less.

Get thee to a nunnery, er, monastery.

Update: StackOverflow’s UI is excellent and their perl tag is becoming the second best place to get Perl help.

Obligatory push for the crawlers: Top 10 Mistakes in Perl.