Move your CSS from stylesheets to inline with Perl

What this code does

This is a proof of concept. There are a couple of other ways you could do it, but this one is pretty succinct and functional. It takes an HTML page (which must be well-formed XHTML for the sample code to run on it) and applies a CSS file against it, changing every instance of “class” into an inline “style” attribute.

The main use I can see for this, and the reason I wrote it, is to transform the description fields of RSS/Atom feeds into XHTML with inline styles.

use strict;
use warnings;
use CSS::Tiny;
use HTML::Selector::XPath;
use XML::LibXML;

my $usage = "Usage:\n\t$0 [css_file] [html_file]\n\n";
my $sheet = shift || die $usage;
my $html  = shift || die $usage;

my $css = CSS::Tiny->read($sheet) or die CSS::Tiny->errstr;
# Pseudo-elements such as :before and :after are not handled yet

my $parser = XML::LibXML->new();
my $doc = $parser->parse_html_file($html);
my $root = $doc->documentElement();

my %xpath_to_style;
for my $rule ( reverse sort keys %{$css} ) {
    my $selector = HTML::Selector::XPath->new($rule);
    $xpath_to_style{$selector->to_xpath} = $css->{$rule};
    # Uncomment if you want to see the CSS-->xpath strings
    #    printf("%s\n%s\n%s\n\n",
    #           $rule,
    #           $selector->to_xpath,
    #           format_css($css->{$rule})
    #           );
}

for my $xpath ( keys %xpath_to_style )
{
    my $style = $xpath_to_style{$xpath};
    for my $node ( $root->findnodes( $xpath ) )
    {
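        # Work on a per-node copy so that one node's inline style
        # does not leak into the rule shared by later nodes.
        my %merged = %{$style};
        if ( my $inline_css = $node->getAttributeNode("style") )
        {
            # Wrap the inline declarations in a fake rule so CSS::Tiny can parse them.
            my $fake_sheet = $node->nodeName .
                "{" . $inline_css->getValue . "}";
            my $inline = CSS::Tiny->read_string($fake_sheet);
            next unless $inline;
            # Existing inline declarations win over the stylesheet rule.
            %merged = (
                       %merged,
                       %{ $inline->{$node->nodeName} || {} }
                       );
        }
        $node->setAttribute("style", format_css(\%merged));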
        $node->removeAttribute("class");
    }
}

print $doc->toString, $/;

exit 0;

# subs -----------------------------------------
sub format_css {
    my $css = shift || return '';
    # Sort the properties so the output is stable from run to run.
    my @pairs = map { "$_:$css->{$_}" } sort keys %{$css};
    return join "; ", @pairs;
}
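
The merge step in the inner loop is the part most likely to surprise, so here is a minimal sketch of it on its own, using only core Perl. The helpers parse_declarations and merge_style are hypothetical names made up for this illustration; they are not part of CSS::Tiny, and the naive split on ";" assumes no semicolons inside values. The point is only to show the precedence: declarations already inline on the element override the ones coming from the stylesheet rule, and the rule hash itself is left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "color: red; margin: 0" into a hash of declarations.
# Assumes values contain no semicolons (a simplification for this sketch).
sub parse_declarations {
    my $text = shift // '';
    my %decl;
    for my $pair ( split /;/, $text ) {
        my ( $prop, $val ) = split /:/, $pair, 2;
        next unless defined $val;
        s/^\s+|\s+$//g for $prop, $val;    # trim both sides
        next unless length $prop and length $val;
        $decl{ lc $prop } = $val;
    }
    return \%decl;
}

# Hypothetical helper: merge a stylesheet rule with a node's inline style.
# The inline declarations are listed last, so they win on conflicts.
# The merge happens in a fresh hash, so $rule is never modified.
sub merge_style {
    my ( $rule, $inline_text ) = @_;
    my %merged = ( %{$rule}, %{ parse_declarations($inline_text) } );
    return join "; ", map { "$_:$merged{$_}" } sort keys %merged;
}

my %rule = ( color => 'red', 'font-weight' => 'bold' );
print merge_style( \%rule, 'color: blue; margin: 0' ), "\n";
# prints "color:blue; font-weight:bold; margin:0"
```

Because the merge is done in a copy, the same rule can be applied to the next matching element without carrying over anything from the previous one.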