NB: These pages were mostly written in 2001 or so. The résumé dates are accurate, but the code has aged and, unlike whiskey, 8-year-old code doesn't usually taste better. For a look at my current skills and to see my CPAN modules, sample code, and code discussions, please see these pages instead: Perl resources and sample code and PangyreSoft.
Smart quotes with Perl

Introduction

You want your HTML to be typographically correct (notes here). But constantly typing entities like &lsquo;this&rsquo; to get ‘this’ instead of "this" is ludicrous. Perl to the rescue.

Sample text file, named “plain-quote.html”

Which looks so in your browser…

Apply our smart quoter to it
G4:jinx[786]/www/perl>quote-fixer plain-quote.html
And we get back

Which will now render in a browser…

Here is the code to do it

Discussion

This is a great utility but it is a simplistic one. Still, an algorithm that works for 99% of given cases is a good one. This one will fail in certain places because it’s not looking for delimited quotes. It is therefore not smart enough to know what to do with open contractions like ‘cause and ‘Burque, and it picks the wrong quote. It also fails when we want the original symbols, as in, “I am 6'2" and change.”
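Both failure modes are easy to reproduce. The snippet below is a minimal sketch, not the original quote-fixer code: `naive_curl` is a hypothetical helper that assumes the simplest context rule (a quote at the start of the string, or after whitespace or an opening bracket, opens; every other quote closes) and assumes the output is written as numeric character references.

```perl
use strict;
use warnings;

# naive_curl is a hypothetical stand-in for a context-only quoter.
# Assumed rule: a quote at the start of the string, or after whitespace
# or an opening bracket, opens; every other quote closes.
sub naive_curl {
    my ($text) = @_;
    $text =~ s/(^|[\s(\[{])"/${1}&#8220;/g;   # opening double quotes
    $text =~ s/"/&#8221;/g;                   # remaining doubles close
    $text =~ s/(^|[\s(\[{])'/${1}&#8216;/g;   # opening single quotes
    $text =~ s/'/&#8217;/g;                   # remaining singles close
    return $text;
}

# The leading apostrophe is mistaken for an opening quote.
print naive_curl("'cause I said so"), "\n";

# The foot and inch marks are curled even though we wanted them left alone.
print naive_curl(q{I am 6'2" and change.}), "\n";
```

Nothing in the surrounding characters tells the rule that the first apostrophe is a contraction or that the marks after the digits are measurements, which is why context alone cannot fix these cases.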

You can see in the code that we need to keep track of what tags we’ve descended into. If we fix quotes within an HTML tag, we’ll break it. Fixing them in a style declaration will break the CSS. If we fix quotes in comments, we’ll break SSIs. If we fix quotes in scripts, we’ll break them. If we fix quotes in pre, code, or textarea tags, we’ll change the literal meaning of demonstrated code. So we skip the attributes of tags altogether and avoid messing with the content of pre|code|script|textarea|style tags.
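The tracking described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the original listing: the name `smarten_html`, the opening-versus-closing rule, and the choice of numeric character references for output are all assumptions. Only the list of protected tags comes from the text.

```perl
use strict;
use warnings;

# Elements whose content must stay literal (the list given in the text).
my $PROTECTED = qr/^(?:pre|code|script|textarea|style)$/i;

# Assumed output format: numeric character references.
my %CURLY = (
    '"' => { open => '&#8220;', close => '&#8221;' },
    "'" => { open => '&#8216;', close => '&#8217;' },
);

# smarten_html is a hypothetical name for this sketch.
sub smarten_html {
    my ($html) = @_;
    my $depth  = 0;    # how many protected elements we are inside
    my $prev   = '';   # last text character seen, carried across tags
    my $opened = 0;    # true if that character was an opening quote
    my $out    = '';

    # The capturing group makes split return the tags and comments too.
    for my $chunk (split /(<!--.*?-->|<[^>]*>)/s, $html) {
        if ($chunk =~ /^</) {
            # Tags and comments pass through untouched; we only track depth.
            if (my ($slash, $name) = $chunk =~ m{^<(/?)([A-Za-z][A-Za-z0-9]*)}) {
                if ($name =~ $PROTECTED) {
                    if ($slash) { $depth-- if $depth > 0 }
                    else        { $depth++ }
                }
            }
            $out .= $chunk;
            next;
        }
        if ($depth > 0) {
            # Inside pre, code, script, textarea or style: leave as is.
            $out .= $chunk;
            ($prev, $opened) = (substr($chunk, -1), 0) if length $chunk;
            next;
        }
        for my $c (split //, $chunk) {
            if (my $pair = $CURLY{$c}) {
                # Assumed rule: a quote opens at the start of the text,
                # after whitespace or an opening bracket, or directly
                # after another opening quote.
                my $opens = $prev eq '' || $prev =~ /[\s(\[{]/ || $opened;
                $out   .= $opens ? $pair->{open} : $pair->{close};
                $opened = $opens ? 1 : 0;
            }
            else {
                $out   .= $c;
                $opened = 0;
            }
            $prev = $c;
        }
    }
    return $out;
}

print smarten_html(q{<p class="x">"Hello," she said. <code>print "hi";</code> It's fine.</p>}), "\n";
```

The attribute quotes in `class="x"` and the string inside `code` survive because tags are copied through verbatim and the depth counter suppresses any rewriting until the matching close tag is seen. The sketch ignores self-closing forms and tag-like text inside scripts.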

A fully robust solution might make use of recursive regexes or Text::Balanced, but it would be “breakable” by bad user input (incorrect punctuation). The one above is much simpler and works quite well as is.
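To see why pairing delimiters is fragile, here is a small sketch that matches quotes as delimited pairs. It deliberately uses a plain, non-recursive regex rather than Text::Balanced or a recursive pattern, and `curl_pairs` is a hypothetical name; the point is only to show how one missing quote derails the pairing.

```perl
use strict;
use warnings;

# curl_pairs is a hypothetical helper: it curls only double quotes that
# come in matched pairs on the same line.
sub curl_pairs {
    my ($text) = @_;
    $text =~ s/"([^"\n]*)"/&#8220;${1}&#8221;/g;
    return $text;
}

# Well-punctuated input pairs up as expected.
print curl_pairs(q{She said "yes" and "no".}), "\n";

# One missing closing quote makes the wrong quotes pair up and leaves
# a straight quote behind.
print curl_pairs(q{She said "yes and "no".}), "\n";
```

In the second call the quote that should open “no” is consumed as the closer for “yes and”, so the output is wrong in a way that is harder to spot than a single mis-curled apostrophe.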

Text, original code, fonts, and graphics ©1990-2009 Ashley Pond V.