June 2009 Archives

I'm not a big Facebook fan. In fact, I haven't used it long enough to form any opinion about it, other than the objection that it's a silo from which I can't get my data back out. But since most of the users in my library are using it, I decided to see how hard it would be to expose part of our library catalog on Facebook.

The easiest thing to do was to make a Facebook application which would fetch an RSS feed with results from Koha and present it inside Facebook.

Facebook applications (in canvas mode, which I'm using) are in fact simple web pages with a bit of custom Facebook markup. My initial gripe about applications was that they were slow. Now that I know that they are running somewhere else and not on Facebook, I understand why they are slow.

So, to sum it up: if you know how to make a simple CGI script, you will be fine with Facebook applications. They will be even slower than the same application served directly from your server, and if you do something popular you might have problems with server load.

Generally we just like to bitch about the state of X. However, I would like to point you towards two presentations from linux.conf.au 2009:

Introducing the Re-Built Linux Desktop by Keith Packard

You will find out what GEM is, why we have it, and how it influenced Linux kernel development. DRI2 and KMS are also explained. So now you can run non-root X servers, multiple X servers with acceleration and other fun stuff. Look, glxgears on compiz sphere!

From click to pixel: A tour of the Linux graphics pipeline by Carl Worth

The video file contains only the first half hour of the presentation, but here are some interesting highlights from the slides:

  • Visually inspecting GTK+ updates
    ./configure --enable-debug=yes # for GTK+
    GTK_DEBUG=updates ./my-program
    
  • Tracing cairo calls

    Install cairo 1.9 or later

    cairo-trace ./my-program
    See results in my-program.$PID.trace

  • Inspecting Render protocol
    xtrace -D :5 > my-program.xtrace
    DISPLAY=:5 my-program
    
  • Finding software fallbacks in EXA

    Edit xserver/exa/exa_priv.h:

    #define DEBUG_TRACE_FALL 1
    Recompile xserver and examine Xorg.0.log file

  • Finding software fallbacks in xf86-video-intel

    In "device" section of xorg.conf:

       Option   "FallbackDebug"     "true"
    Examine Xorg.0.log file

  • Inspecting 3D state (for Intel)

    INTEL_DEBUG=fall,batch,sync
    fall: Show software fallbacks
    batch: Show decoded batchbuffers
    sync: Wait for idle after each batchbuffer
    (see intel_context.c debug_control[] for more)

  • Inspecting GEM state
    cat /proc/dri/0/gem_objects
    cat /proc/dri/0/i915_gem_interrupt
    

Good stuff, well worth two hours of your time to get to know your X, stop bitching and start reporting bugs...

I have been watching videos from linux.conf.au 2009 and stumbled upon Conrad Parker's Ogg Chopping: techniques for programming correctness and efficiency, which is a great lecture if you want to know something about the current state of video on the web, Ogg or Haskell.

I have been thinking about the poor state of Linux video for quite some time (bear in mind that I do have real-life experience with U-matic type equipment) but it seems that things are moving in the right direction. Here is a quick compilation of useful links from this presentation:

This is very cool! The only problem for me right now is that the server side is written in Python, with which I haven't had good experience (it's just my bias). But then again, the Pad.ma JavaScript API seems easy enough that I could roll my own server implementation if I find time to play with it.

Update: Are we there yet?

After a bit more watching, I also stumbled upon Collaborative Video for Wikipedia by Michael Dale, which introduces the following tools related to video editing:

  • Mv_Embed allows support of browsers without <video> tag with annotation editor
  • MetaVidWiki offers another interface, but I couldn't find any good demo to link from here

The Internet is not a single network. Some parts of it are hidden behind firewalls, and some services allow access only from a specific range of IP addresses. To solve that, we are using proxy servers, but what do you do when you want to give your users easy access to resources which are not directly accessible?

For a long time, I was a fan of CGIProxy, a single CGI script which allows you to access all web resources which are visible from the machine on which CGIProxy is installed. However, modern web pages have many, many elements, and soon enough the overhead of CGI execution for each element proved to be too much for our users' patience. It was slow...

I decided to take a look at mod_perl2 as a solution, since it provides a long-living perl interpreter inside the Apache 2 server. I was on the right track: Apache2::ModProxyPerlHtml provides an easy-to-configure HTML rewriter using Apache 2 and mod_perl2. I tested it and immediately saw a speedup compared to the previous CGIProxy-based solution.

But this was only half of the problem. I also needed to solve user authentication somehow. With the old system we used an LDAP server as the login method, but this time I needed to check user passwords against the Koha database, where they are stored as base64-encoded MD5 hashes. Base64 is a somewhat unfortunate choice because MySQL doesn't have built-in base64 encoding. If it did, I could just use Apache::AuthDBI, craft SQL queries and be ready to go.

My first idea was to write an Apache2 auth module which would connect to Koha directly. That would work, but it would also require a secure connection between the proxy and Koha (we are transferring passwords), and the proxy would need to have credentials to access the Koha database. None of that seemed very clean or secure, so I decided to split it into two parts:

  • Apache auth module which requests credential verification from the Koha server over https
  • CGI script on Koha which verifies the user and returns a status

With this approach, plain-text passwords never travel between the proxy and Koha (and even the MD5 hash of the password is transferred over SSL), and the proxy server doesn't need any Koha-specific configuration.

Here is a small Apache authentication module which will transfer the userid and base64-encoded password hash to a CGI script on the Koha server over https:
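
The value that goes over the wire in place of the password can be sketched like this. It is only an illustration: the password is made up, and escape_param is a hypothetical helper of mine, not part of Koha or Digest::MD5. It shows that the hash is 22 base64 characters without padding, and why it should be escaped before it is put into a query string (a bare "+" would be decoded as a space on the receiving side):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_base64);

# Percent-encode every character outside the URL "unreserved" set,
# so that "+" and "/" from base64 survive a trip through a query string.
# (hypothetical helper, for illustration only)
sub escape_param {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Made-up password, not a real credential.
my $hash = md5_base64('not-a-real-password');

# md5_base64 returns 22 characters and no trailing "=" padding.
print "hash sent to Koha: $hash\n";
print "safe in a query:   ", escape_param($hash), "\n";
```

In a real module URI::Escape's uri_escape does the same job; the hand-rolled version above is only there to keep the sketch free of non-core modules.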

package Apache2::AuthKoha;
  
use strict;
use warnings;
  
use Apache2::Access ();
use Apache2::RequestUtil ();
  
use Apache2::Const -compile => qw(OK DECLINED HTTP_UNAUTHORIZED);

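use Digest::MD5 qw/md5_base64/;
use LWP::Simple qw/get/;        # https needs LWP::Protocol::https (Crypt::SSLeay on older LWP)
use URI::Escape qw/uri_escape/;

sub handler {
        my $r = shift;

        my ($status, $password) = $r->get_basic_auth_pw;
        return $status unless $status == Apache2::Const::OK;

        # escape both values: base64 may contain "+" and "/", and an
        # unescaped "+" would be decoded as a space on the Koha side;
        # get() returns undef on failure, so only a successful response counts
        return Apache2::Const::OK if get(
                'https://koha.example.com/koha-auth?userid=' . uri_escape($r->user) .
                ';password=' . uri_escape(md5_base64($password))
        );

        $r->note_basic_auth_failure;
        #return Apache2::Const::DECLINED; # allow other authentication handlers
        return Apache2::Const::HTTP_UNAUTHORIZED;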
}

1;
And this is a small CGI script on the Koha server's side which checks the userid and password hash and returns an appropriate status:
#!/usr/bin/perl

# ScriptAlias /koha-auth /srv/koha-auth/auth.cgi

use warnings;
use strict;

use CGI;
use DBI;

our $dsn      = 'DBI:mysql:dbname=koha';
our $user     = 'koha-database-user';
our $passwd   = 'koha-database-password';

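my $q = CGI->new;

# print HTTP status with a short body and finish the request
sub out {
        my ($status,$text) = @_;
        print $q->header( -status => $status ), "$text\r\n";
        exit;
}

my ( $userid, $password ) = ( scalar $q->param('userid'), scalar $q->param('password') );
out( 500, "NO PARAMS" ) unless defined $userid && defined $password;

# an unescaped "+" in the query string is decoded as a space, so restore it
$password =~ s{ }{+}g;

# RaiseError makes DBI die on any database error
my $dbh = DBI->connect($dsn, $user, $passwd, { RaiseError => 1, AutoCommit => 1 });

# selectrow_array returns an empty list when no borrower matches;
# $sth->rows is not reliable for select statements
my ($found) = $dbh->selectrow_array(q{
        select 1 from borrowers where userid = ? and password = ?
}, undef, $userid, $password );
$dbh->disconnect;

if ( $found ) {
        out( 200, "OK" );
} else {
        out( 404, "ERROR" );
}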
To complete this setup, we also have to define a virtual host on the proxy server which will tie our components together:
<VirtualHost *:443>
        SSLEngine on

        SSLCertificateFile    /etc/apache2/proxy.example.org.pem
        SSLCertificateKeyFile /etc/apache2/proxy.example.org.pem

        ServerName proxy.example.org

        ProxyRequests Off
        ProxyPreserveHost Off

        PerlInputFilterHandler Apache2::ModProxyPerlHtml
        PerlOutputFilterHandler Apache2::ModProxyPerlHtml
        SetHandler perl-script
        PerlSetVar ProxyHTMLVerbose "On"

        <Proxy *>
                Order deny,allow
                Allow from all
        </Proxy>

        PerlAuthenHandler Apache2::AuthenDBMCache Apache2::AuthKoha
        PerlSetVar AuthenDBMCache_File  /tmp/auth-cache
        PerlSetVar AuthenDBMCache_TTL   3600
        PerlSetVar AuthenDBMCache_Debug On


        ProxyPass /secure/ http://secure.example.com/
        <Location /secure/>
                ProxyPassReverse /
                PerlAddVar ProxyHTMLURLMap "/ /secure/"
                PerlAddVar ProxyHTMLURLMap "http://secure.example.com /secure"

                AuthName Proxy
                AuthType Basic
                require valid-user
        </Location>
</VirtualHost>
This will enable you to open https://proxy.example.org/secure/ and get access to http://secure.example.com/

You will also notice that I'm using Apache2::AuthenDBMCache to prevent the proxy from checking user credentials for every page element (which would be slow). At first, this setup didn't work well. I would get No access to /tmp/auth-cache at -e line 0 because the client browser was opening multiple connections at the same time and perl's dbmopen didn't like that. Fortunately, it was easy to fix: I just added use DB_File; in Apache2::AuthenDBMCache, which forced dbmopen to use Berkeley DB (which allows multiple readers) instead of the default GDBM.
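
The idea behind such a cache can be sketched in a few lines. This is not the actual code of Apache2::AuthenDBMCache, and the record layout (expiry time plus password hash) is my assumption; it just shows a hash tied to a Berkeley DB file through DB_File, with entries that expire after a TTL. The remember and is_cached functions and the temporary file location are hypothetical:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
use Digest::MD5 qw(md5_base64);
use File::Temp qw(tempdir);

my $ttl  = 3600;                                    # seconds, same as AuthenDBMCache_TTL above
my $file = tempdir( CLEANUP => 1 ) . '/auth-cache'; # throwaway location for this sketch

# Tie a hash to a Berkeley DB file, so entries are stored on disk.
tie my %cache, 'DB_File', $file, O_RDWR|O_CREAT, 0600, $DB_HASH
    or die "Cannot open $file: $!";

# Store "expiry:hash" after the backend has verified the password.
sub remember {
    my ($user, $password, $now) = @_;
    $cache{$user} = join ':', $now + $ttl, md5_base64($password);
    return;
}

# True only for an unexpired entry with a matching password hash.
sub is_cached {
    my ($user, $password, $now) = @_;
    my $entry = $cache{$user} or return 0;
    my ($expires, $hash) = split /:/, $entry, 2;
    return ( $now < $expires && $hash eq md5_base64($password) ) ? 1 : 0;
}
```

An auth handler would call is_cached first and ask Koha only on a miss, followed by remember when Koha says the credentials are fine.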

Once again, perl proved to be the duct tape of the Internet. With a few lines of code and some configuration you can make wonderful things. So, why don't you? :-)

I'm working on a Linux version of Sun storage machines, using commodity hardware, OpenVZ and Fuse-ZFS. I do have a working system in my Sysadmin Cookbook, so I might as well write a little bit of documentation about it.

My basic requirements are:

This makes it a self-running system which won't fall over itself, so let's see how it looks:

root@opl:~# zpool status
  pool: opl
 state: ONLINE
 scrub: resilver completed after 1h59m with 0 errors on Wed Jun  3 15:29:50 2009
config:

        NAME        STATE     READ WRITE CKSUM
        opl         ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0

errors: No known data errors
root@opl:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
opl                            183G  35.7G    21K  /opl
opl/backup                     180G  35.7G    22K  /opl/backup
opl/backup/212052             76.1G  35.7G  8.12G  /opl/backup/212052
opl/backup/212052@2009-05-01  5.69G      -  7.50G  -
opl/backup/212052@2009-05-10  5.69G      -  7.67G  -
opl/backup/212052@2009-05-15  5.57G      -  7.49G  -
opl/backup/212052@2009-05-22  3.54G      -  7.74G  -
opl/backup/212052@2009-05-25  3.99G      -  8.38G  -
opl/backup/212052@2009-05-26  3.99G      -  8.38G  -
...
opl/backup/212052@2009-06-05  3.72G      -  8.09G  -
opl/backup/212052@2009-06-06      0      -  8.12G  -
opl/backup/212056             1.42G  35.7G   674M  /opl/backup/212056
opl/backup/212056@2009-05-30  37.1M      -   688M  -
opl/backup/212056@2009-05-31  47.3M      -   747M  -
opl/backup/212056@2009-06-01  40.9M      -   762M  -
opl/backup/212056@2009-06-02  62.4M      -   787M  -
...
opl/backup/212056@2009-06-05  12.1M      -  1.02G  -
opl/backup/212056@2009-06-06      0      -   674M  -
opl/backup/212226              103G  35.7G  26.8G  /opl/backup/212226
opl/backup/212226@2009-05-05  4.29G      -  26.7G  -
opl/backup/212226@2009-05-10  4.04G      -  26.6G  -
opl/backup/212226@2009-05-15  4.19G      -  26.6G  -
opl/backup/212226@2009-05-22  4.12G      -  26.7G  -
opl/backup/212226@2009-05-23  4.12G      -  26.7G  -
opl/backup/212226@2009-05-24  4.09G      -  26.6G  -
opl/backup/212226@2009-05-25  4.14G      -  26.7G  -
opl/backup/212226@2009-05-26  4.13G      -  26.7G  -
...
opl/backup/212226@2009-06-05  4.20G      -  26.8G  -
opl/backup/212226@2009-06-06      0      -  26.8G  -
opl/clone                      719M  35.7G    25K  /opl/clone
opl/clone/212056-60018         666M  35.7G  1.39G  /opl/clone/212056-60018
opl/clone/212226-60017        53.0M  35.7G  26.7G  /opl/clone/212226-60017
opl/vz                        1.59G  35.7G  43.5K  /opl/vz
opl/vz/private                1.59G  35.7G    22K  /opl/vz/private
opl/vz/private/60014           869M  35.7G   869M  /opl/vz/private/60014
opl/vz/private/60015           488M  35.7G   488M  /opl/vz/private/60015
opl/vz/private/60016           275M  35.7G   275M  /opl/vz/private/60016
There are several conventions here which are useful:
  • the pool is named the same as the machine (borrowing from the Debian way of naming LVM volume groups), which makes it easy to export/import pools on different machines (I did run it with a mirror over nbd for a while)
  • snapshot names are the dates of the snapshots, for easy overview
  • clones (writable snapshots) are named using a combination of the backup ID and the new container ID
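
These conventions are easy to script. Here is a minimal sketch which only builds the zfs command lines and prints them instead of running them. The function names are hypothetical, and which snapshot a given clone was really made from is my assumption for the sake of the example:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $pool = 'opl';    # pool is named the same as the machine

# Snapshot of a backup, named by date (defaults to today).
sub snapshot_cmd {
    my ($backup_id, $date) = @_;
    $date = strftime( '%Y-%m-%d', localtime ) unless defined $date;
    return "zfs snapshot $pool/backup/$backup_id\@$date";
}

# Clone of a snapshot, named after the backup ID and the new container ID.
sub clone_cmd {
    my ($backup_id, $date, $new_ctid) = @_;
    return "zfs clone $pool/backup/$backup_id\@$date "
         . "$pool/clone/${backup_id}-${new_ctid}";
}

# Dry run: print the commands instead of executing them.
print snapshot_cmd( '212056', '2009-06-06' ), "\n";
print clone_cmd( '212056', '2009-06-06', '60018' ), "\n";
```

Printing first and executing later makes it easy to check the generated names against zfs list before anything touches the pool.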

There are several things which I wouldn't be able to get without zfs:

  • clones can grow as much as they need
  • data is compressed, which increases disk IO throughput as a result
  • zfs and zpool commands are a really nice and intuitive way to issue commands to the filesystem
  • zpool history is a great idea: all filesystem operations are written to an internal log
  • the ability to resilver (read/write all data on the platters) together with checksums makes it robust against disk errors

About this Archive

This page is an archive of entries from June 2009 listed from newest to oldest.
