Catalyst Model #7: Page view counter/tracker

Published · Tuesday, 21 July 2009 (Updated · 7 March 2010)

Six models in and we haven’t even touched DBIx::Class yet. Notice? It has become commonly believed that Catalyst and DBIx::Class and Template are tied somehow or that Catalyst only works with DBIC as the model driver and Template as the view. It should be obvious by now that this is completely wrong.

The pervasiveness of DBIC and Template in the various examples and docs is because they are such excellent kits that they are often adopted. As you’ve seen, you can design and run a Catalyst application with no relational database or view class or with a mix of any.

It’s time to use one though. Let’s see how easy a page view counter and tracker is with Catalyst + DBIx::Class. First we need–

The database schema + required modules

We’re going to use SQLite. It’s small, portable, fast, and public domain to boot. You will not need to install it directly. It’s now bundled with its Perl engine—DBD::SQLite. Catalyst::Model::DBIC::Schema will be our model glue between the schema and the app. When you go to install it, it will prompt you to install the DBIx::Class things upon which it relies.

cpan DBD::SQLite
cpan Catalyst::Model::DBIC::Schema

My background is mostly MySQL so this was easier for me to sling than SQLite SQL. Also, MySQL, PostgreSQL, and friends support foreign keys. SQLite does not so it’s not going to let us generate our relationships automatically from Perl. We need to use an engine (for the purposes of generating the code, once it’s done, anyone can use it) which supports foreign keys–

DROP TABLE IF EXISTS `page`;
CREATE TABLE `page` (
  `id` int() unsigned NOT NULL auto_increment,
  `path` VARCHAR(255),
  `query` VARCHAR(255),
  PRIMARY KEY  (`id`),
  UNIQUE KEY `path` (`path`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `view`;
CREATE TABLE `view` (
  `id` int() unsigned NOT NULL auto_increment,
  `page` INT() NOT NULL,
  `agent` VARCHAR(255),
  `referer` VARCHAR(255),
  `ip` VARCHAR(32),
  `created` DATETIME NOT NULL,
  PRIMARY KEY  (`id`),
  FOREIGN KEY (`page`) REFERENCES page(`id`),
) ENGINE=InnoDB CHARSET=utf8;

Note: VARCHAR(255) is an unrealistic definition for real world URI data. Query strings can be something like 2K characters in most browsers. SQLite doesn’t actually enforce this size limit though. Just making sure we mention it so you don’t take it as the right™ way.

The great thing here is that DBIx::Class has tools—including DBIx::Class::Schema::Loader—to migrate across DB engines and write engine neutral schemata in Perl.

So, I don’t even have to bother learning any SQLite syntax. I just load the MySQL definitions into a new database called “page_views.” Then run this code against it–

use DBIx::Class::Schema::Loader "make_schema_at";
make_schema_at(
               "MyApp::Schema::PageView",
               {
                   dump_directory => "./lib",
                   use_namespaces => 0, # Easier for this example.
               },
               [ "dbi:mysql:page_view", $user, $pass ],
              );

And that spits out the schema classes we’ll use as a model. It writes the following–

  • MyApp::Schema::PageView
  • MyApp::Schema::PageView::Page
  • MyApp::Schema::PageView::View

Though the schema modules live inside the MyApp directory they are functionally distinct from your Catalyst application. They can be used without starting your app at all. This is important because you do not want to bind anything to your app in a way that it’s impossible to reuse. This is often called tight coupling; or even monolithic architecture. It means the app is one giant block of stuff and that makes everything harder: updates, changes, reuse. MVC is an excellent weapon against this.

So, keep in mind that the schema is not the model. The model is a consumer of the schema. Like models we’ve done already this model will have little code. It’s just a shim to define connection parameters and funnel through API requests from the application context. It’s the glue between the service—MyApp::Schema::PageVeiw + DB—and the the application—MyApp + MyApp::Model::PageView.

So, here are the classes just as they were written for us by DBIC::S::L->make_schema_at

MyApp::Schema::PageView

package MyApp::Schema::PageView;

use strict;
use warnings;

use base 'DBIx::Class::Schema';

__PACKAGE__->load_classes;

# Created by DBIx::Class::Schema::Loader v0.04006 @ 2009-06-23 21:36:34
# DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:MZfMk2qBkRdbVtnHIk+Fgg

# You can replace this text with custom content, and it will be preserved on regeneration
1;

MyApp::Schema::PageView::Page

package MyApp::Schema::PageView::Page;

use strict;
use warnings;

use base 'DBIx::Class';

__PACKAGE__->load_components("Core");
__PACKAGE__->table("page");
__PACKAGE__->add_columns(
  "id",
  { data_type => "INT", default_value => undef, is_nullable => 0, size => 10 },
  "path",
  {
    data_type => "VARCHAR",
    default_value => undef,
    is_nullable => 1,
    size => 255,
  },
  "query",
  {
    data_type => "VARCHAR",
    default_value => undef,
    is_nullable => 1,
    size => 255,
  },
);
__PACKAGE__->set_primary_key("id");
__PACKAGE__->add_unique_constraint("path", ["path"]);
__PACKAGE__->has_many(
  "views",
  "MyApp::Schema::PageView::View",
  { "foreign.page" => "self.id" },
);

# Created by DBIx::Class::Schema::Loader v0.04006 @ 2009-06-23 21:36:34
# DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:sxMMvDKvfO0MJ1cztjyH8A

# You can replace this text with custom content, and it will be preserved on regeneration
1;

MyApp::Schema::PageView::View

package MyApp::Schema::PageView::View;

use strict;
use warnings;

use base 'DBIx::Class';

__PACKAGE__->load_components("Core");
__PACKAGE__->table("view");
__PACKAGE__->add_columns(
  "id",
  { data_type => "INT", default_value => undef, is_nullable => 0, size => 10 },
  "page",
  { data_type => "INT", default_value => undef, is_nullable => 0, size => 10 },
  "agent",
  {
    data_type => "VARCHAR",
    default_value => undef,
    is_nullable => 1,
    size => 255,
  },
  "referer",
  {
    data_type => "VARCHAR",
    default_value => undef,
    is_nullable => 1,
    size => 255,
  },
  "ip",
  {
    data_type => "VARCHAR",
    default_value => undef,
    is_nullable => 1,
    size => 32,
  },
  "created",
  {
    data_type => "DATETIME",
    default_value => undef,
    is_nullable => 0,
    size => 19,
  },
);
__PACKAGE__->set_primary_key("id");
__PACKAGE__->belongs_to("page", "MyApp::Schema::PageView::Page", { id => "page" });

# Created by DBIx::Class::Schema::Loader v0.04006 @ 2009-06-23 21:36:34
# DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:sr8c7VkpiV/7w9ELr8fjEg

# You can replace this text with custom content, and it will be preserved on regeneration
1;

Notice the reason we needed the foreign keys to be there. The relationships are set-up for us in the code. MyApp::Schema::PageView::Page contains this, for example–

__PACKAGE__->has_many(
  "views",
  "MyApp::Schema::PageView::View",
  { "foreign.page" => "self.id" },
);

Something awesome about this approach is that it means it when it says–

You can replace this text with custom content, and it will be preserved on regeneration

If you have a database which supports foreign keys—SQLite alas does not—and your DB schema is clean then you never have to write any Perl to set it up. This is a life-saver when requirements are rapidly changing or you’re prototyping and playing with the table design. You can tweak the DB all you want without having to do anything in the code but rerun a schema generation script. You can save the Perl for the more fun tasks like beefing up your model with extended searches or fat-model style business logic.

Gluing the schema into our application by creating a model is snap and there is a helpler to do it–

./script/myapp_create.pl model PageView DBIC::Schema MyApp::Schema::PageView

That’s it. We’re just gluing a DBIx::Class schema into our Cat app via a model. All the model needs to know is how to connect and that is information that is not—unless you’re doing things wrong—contained in your schema.

cat lib/MyApp/Model/PageView.pm
package MyApp::Model::PageView;
use strict;
use base 'Catalyst::Model::DBIC::Schema';

__PACKAGE__->config(
    schema_class => 'MyApp::Schema::PageView',
);

1;

Now add the connection information to the config file–

emacs myappl.yml
Model::PageView:
  connect_info:
    - dbi:SQLite:__path_to(etc/views.sqlt)__
    - ~
    - ~
    - RaiseError: 1
      AutoCommit: 1
      ChopBlanks: 1

DBIx::Class::Schema gives us a deploy method. It will take connection information and attempt to create the database tables. This is especially easy to use with sqlite as it’s just a file write to create the database and tables. We just need a writable location where we can deploy.

And speaking of “a writable location,” do this–

mkdir etc

Otherwise __path_to(etc/views.sqlt)__ is not going to be able to find a valid place to put the DB: views.sqlt.

Detour: Testing applications with DBIx::Class DBs

This is not directly related to what we’re doing in the model but well-worth discussing.

The ability to set the DB like this is powerful. You can set it in this config, in another config, override it in another, and use environment variables to point it at still another. Which means you could have a production DB that’s on Oracle and override it during tests to connect to an SQLite DB without changing a line of code in the application, just setting up the extra config and pointing the test at it before the application loads. You could run tests against this application with a different DB so it doesn’t pollute your real tracking stats with test views.

Create a myapp_test.yml config file

The only change from our real config is in bold. We connect to a new temporary DB. This is a convenient example that will work on most all *nixes but it I would only recommending using File::Temp or a similar strategy to make sure your DB is really temporary and not going to collide with another test running at the same time.

Model::PageView:
  connect_info:
    - dbi:SQLite:/tmp/test_views.sqlt
    - ~
    - ~
    - RaiseError: 1
      AutoCommit: 1
      ChopBlanks: 1

Point tests at the test DB

Put this at the top of tests. It will make your application ignore myapp_local.yml, if you have one, in favor of loading myapp_test.yml. See Catalyst::Plugin::ConfigLoader.

BEGIN {
    # To get myapp_test.yml to load.
    # Test specific configuration data is there.
    $ENV{MYAPP_CONFIG_LOCAL_SUFFIX} = "test";
}

Controllers to manage the view tracking

./script/myapp_create.pl controller ViewTrack
emacs lib/MyApp/Controller/ViewTrack.pm
package MyApp::Controller::ViewTrack;
use strict;
use warnings;
use parent 'Catalyst::Controller';

sub index :Path :Args(0) {
    my ( $self, $c ) = @_;
    my $pages = $c->model("PageView::Page")->search();
    $c->stash( pages => $pages );
}

sub process :Private {
    my ( $self, $c ) = @_;
    my $uri = $c->request->uri;

    my $page = $c->model("PageView::Page")
        ->find_or_create({ path => $uri->path });

    $page->query($uri->query) if $uri->query;
    $page->update();

    my $view = $c->model("PageView::View")
        ->create({ page => $page,
                   agent => $c->request->user_agent || "",
                   referer => $c->request->referer || "",
                   ip => $c->request->address || "",
                   created => \"DATETIME('NOW')", # SQLite syntax
                 });

    $view->update();
}

1;

That’s doing two things. It gives us a view to the tracking data in the index at /viewtrack. It processes a tracking request whenever called without a specific action. We will need to deal with automatic deployment in the Root space next.

We didn’t have to write the SQLite SQL because SQL::Translator and DBIx::Class::Schema did it for us. If you want to see what it ended up doing or run it by hand, this is it–

CREATE TABLE page (
  id INTEGER PRIMARY KEY NOT NULL,
  path VARCHAR(255),
  query VARCHAR(255)
);
CREATE TABLE view (
  id INTEGER PRIMARY KEY NOT NULL,
  page INT(10) NOT NULL,
  agent VARCHAR(255),
  referer VARCHAR(255),
  ip VARCHAR(32),
  created DATETIME(19) NOT NULL
);

We still don’t have anything doing the tracking. We’ll add a tracking call to the top of our app at Root->auto which means it will get called for everything. And this is really where we need to do our auto-deployment of the database. We do this with RenderView’s render+end combination.

Add to MyApp::Controller::Root

sub auto :Private {
    my ( $self, $c ) = @_;

    # ->process is called by default. We did not specify an action so
    # this will call MyApp::Controller::ViewTrack->process
    $c->forward($c->controller("ViewTrack"))
        if $self->{viewtrack_enabled};

    return 1;
}

# index and default omitted for space.

sub render : ActionClass('RenderView') {}

sub end :Private {
    my ( $self, $c ) = @_;

    # If we get this error, the PageView DB is not there.
    if ( grep { /no such table/i } @{$c->error} )
    {
        $c->model("PageView")->schema->deploy;
        # Probably safe to clear errors. DB will be there next request.
        $c->clear_errors() unless $c->stash->{db_deploy}++;
    }
    $c->forward("render");
}

1;

This will automatically deploy our database if it turns out it’s not there. If it’s not there it will throw an error, we’ll intercept, deploy the DB, and retry rendering the view.

Note that we have changed the original Root.pm’s end method, replacing it with a render. The render method, being a RenderView controller, handle automatic view dispatching where reasonable. The end method gives us a chance to deal with errors. We use the db_deploy variable in the stash as a safety valve so we don’t accidentally get an infinite loop if there is a deployment problem.

As soon as that’s in, you’ve got working page view tracking. It’s still not on though. Look above. I’s checking if it’s enabled or not with viewtrack_enabled. So, add that to your config to start tracking–

Controller::Root:
  viewtrack_enabled: 1

On an amusing side note. I have had the demo app up on my box visible externally for a week. I haven’t had port 80 on this box open for web requests for like three years. It was up about an hour before the /viewtrack index was already starting to log of this kind of crap–

  • /awstats/awstats.pl
  • /xmlrpc.php
  • /adxmlrpc.php
  • /_vti_bin/_vti_aut/author.dll
  • /-O
  • /phpMyAdmin/

There is a caveat with this implementation. Doing the processing of the view in auto is probably the wrong place. You won’t be able to capture page or status errors in your tracking until the end. There is a larger more general “why” you shouldn’t do it this way. It can slow down a site to insert a tracking record on every request—though this is not likely to be the case in this simple SQLite version—and there are wonderful, much deeper versions already out there for free which you can integrate into your site with nearly zero load by adding some JavaScript. We use and recommend Google Analytics. Most of the commercial tools aren’t as good.

Even though the discussion got a bit involved, we just added pretty sweet tracking to an entire site in just a few lines of hand-rolled code plus a little help from DBIC and SQLite.

Have at it

./script/myapp_server.pl -d -r -p 3000

Tomorrow is #8: Titles in real typefaces on demand with Imager. Lucky!



digg stumbleupon del.icio.us reddit Fark Technorati Faves

« Catalyst Model #6: Log file model–Apache access log · Catalyst Model #8: Titles in real typefaces on demand with Imager »
« 10 Catalyst models in 10 days1 »