I'm not a big Facebook fan. In fact, I haven't used it long enough to form any opinion about it other than the objection that it's a silo from which I can't get my data back out. But since most of the users in my library are using it, I decided to take a look at how hard it would be to expose part of our library catalog on Facebook.

The easiest thing to do was to make a Facebook application which would fetch an RSS feed with results from Koha and present it inside Facebook.

Facebook applications (in canvas mode, which I'm using) are in fact simple web pages with a bit of custom Facebook markup. My initial gripe about applications was that they were slow. Now that I know that they are running somewhere else and not on Facebook, I understand why they are slow.

So, to sum it up: if you know how to make a simple CGI script, you will be fine with Facebook applications. They will be even slower than the same application served directly from your server, and if you make something popular you might have problems with server load.
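
The core of such a canvas page can be sketched in a few lines of Python. This is only an illustration, assuming the catalog search returns a plain RSS 2.0 feed: the sample feed, its record URLs and the function name `rss_to_html` are made up here, and a real page would fetch the feed from Koha over HTTP instead of using an inline string.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample of an RSS 2.0 feed such as a catalog search might return.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Search results</title>
    <item><title>Perl &amp; CGI</title><link>http://koha.example.com/bib/1</link></item>
    <item><title>Linux in a Nutshell</title><link>http://koha.example.com/bib/2</link></item>
  </channel>
</rss>"""


def rss_to_html(feed_xml):
    """Turn the <item> elements of an RSS 2.0 feed into an HTML list."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="#")
        # Escape everything that comes from the feed before putting it in HTML.
        rows.append('<li><a href="%s">%s</a></li>'
                    % (html.escape(link, quote=True), html.escape(title)))
    return "<ul>\n%s\n</ul>" % "\n".join(rows)


if __name__ == "__main__":
    print(rss_to_html(SAMPLE_FEED))
```

Everything Facebook-specific (the custom markup around the list) is left out on purpose; the point is only that the page itself is an ordinary script producing HTML.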

Generally we just like to bitch about the state of X. However, I would like to point you towards two presentations from linux.conf.au 2009:

Introducing the Re-Built Linux Desktop by Keith Packard

You will find out what GEM is, why we have it and how it influenced Linux kernel development. DRI2 and KMS are also explained. So now you can run non-root X servers, multiple X servers with acceleration and other fun stuff. Look, glxgears on compiz sphere!

From click to pixel: A tour of the Linux graphics pipeline by Carl Worth

The video file contains only the first half hour of the presentation, but here are interesting highlights from the slides:

  • Visually inspecting GTK+ updates
    ./configure --enable-debug=yes # for GTK+
    GTK_DEBUG=updates ./my-program
    
  • Tracing cairo calls

    Install cairo 1.9 or later

    cairo-trace ./my-program
    See results in my-program.$PID.trace

  • Inspecting Render protocol
    xtrace -D :5 > my-program.xtrace
    DISPLAY=:5 my-program
    
  • Finding software fallbacks in EXA

    Edit xserver/exa/exa_priv.h:

    #define DEBUG_TRACE_FALL 1
    Recompile xserver and examine Xorg.0.log file

  • Finding software fallbacks in xf86-video-intel

    In "device" section of xorg.conf:

       Option   "FallbackDebug"     "true"
    Examine Xorg.0.log file

  • Inspecting 3D state (for Intel)

    INTEL_DEBUG=fall,batch,sync
    fall: Show software fallbacks
    batch: Show decoded batchbuffers
    sync: Wait for idle after each batchbuffer
    (see intel_context.c debug_control[] for more)

  • Inspecting GEM state
    cat /proc/dri/0/gem_objects
    cat /proc/dri/0/i915_gem_interrupt
    

Good stuff, well worth two hours of your time to get to know your X, stop bitching and start reporting bugs...

I have been watching videos from linux.conf.au 2009 and stumbled upon Conrad Parker's Ogg Chopping: techniques for programming correctness and efficiency, which is a great lecture if you want to know something about the current state of video on the web, Ogg or Haskell.

I have been thinking about the poor state of Linux video for quite some time (bear in mind that I do have real-life experience with U-matic type equipment), but it seems that things are moving in the right direction. Here is a quick compilation of useful links from this presentation:

This is very cool! The only problem for me right now is that the server side is written in Python, with which I haven't had good experience (it's just my bias). But then again, the Pad.ma JavaScript API seems easy enough that I could roll my own server implementation if I find time to play with it.

Update: Are we there yet?

After a bit more watching, I also stumbled upon Collaborative Video for Wikipedia by Michael Dale, which introduces the following tools related to video editing:

  • Mv_Embed allows support of browsers without <video> tag with annotation editor
  • MetaVidWiki offers another interface, but I couldn't find any good demo to link from here

The Internet is not a single network. Some parts of it are hidden behind firewalls, and some services allow access only from a specific range of IP addresses. To solve that, we are using proxy servers, but what do you do when you want to give your users easy access to resources which are not directly accessible?

For a long time, I was a fan of CGIProxy: a single CGI script which allows you to access all web resources which are visible from the machine on which CGIProxy is installed. However, modern web pages have many, many elements, and soon enough the overhead of CGI execution for each element proved to be too much for our users' patience. It was slow...

I decided to take a look at mod_perl2 as a solution, since it provides a long-living perl interpreter inside the Apache 2 server. I was on the right track: Apache2::ModProxyPerlHtml provides an easy to configure HTML rewriter using Apache 2 and mod_perl2. I tested it and immediately saw a speedup compared to the previous CGIProxy based solution.

But this was only half of the problem. I also needed to solve user authorization somehow. With the old system, we had an LDAP server as the login method, but this time I needed to somehow check user passwords in the Koha database, where each one is stored as a base64 encoded md5 hash of the password. Base64 is a somewhat unfortunate choice because MySQL doesn't have built-in base64 encoding. If it did, I could just use Apache::AuthDBI, craft SQL queries and I would be ready to go.
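
To make that storage format concrete, here is a small Python sketch of how such a value can be computed. It assumes the stored value matches what Perl's Digest::MD5 `md5_base64` produces (base64 without the trailing `=` padding) and that passwords are encoded as UTF-8; the function name is made up for this example.

```python
import base64
import hashlib


def koha_password_hash(password):
    """Base64-encoded MD5 digest of the password, without '=' padding.

    Assumption: this mirrors Perl's Digest::MD5::md5_base64, which does
    not pad its output, and the password is encoded as UTF-8.
    """
    digest = hashlib.md5(password.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    print(koha_password_hash("password"))  # X03MO1qnZdYdgyfeuILPmQ
```

An MD5 digest is always 16 bytes, so the unpadded result is always 22 characters long and may contain `+` and `/`.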

My first idea was to write an Apache2 auth module which would connect to Koha directly. That would work, but it would also require a secure connection between the proxy and Koha (we are transferring passwords), and the proxy would need to have credentials to access the Koha database. None of that seemed very clean or secure, so I decided to split it into two parts:

  • Apache auth module which requests credential verification from the Koha server over https
  • CGI script on Koha which verifies the user and returns a status
With this approach, plain-text passwords never travel across the network (and even the md5 hash of the password is transferred over ssl), and the proxy server doesn't need any Koha specific configuration.

Here is a small Apache authorization module which will transfer the userid and base64 encoded password hash to a CGI script on the Koha server over https:

package Apache2::AuthKoha;
  
use strict;
use warnings;
  
use Apache2::Access ();
use Apache2::RequestUtil ();
  
use Apache2::Const -compile => qw(OK DECLINED HTTP_UNAUTHORIZED);

use Digest::MD5 qw/md5_base64/;
use LWP::Simple qw/get/;
use URI::Escape qw/uri_escape/;

sub handler {
        my $r = shift;
  
        my ($status, $password) = $r->get_basic_auth_pw;
        return $status unless $status == Apache2::Const::OK;

        # escape both values so characters like '&', ';' or '+' survive the query string
        return Apache2::Const::OK if get(
                'https://koha.example.com/koha-auth?userid=' . uri_escape($r->user) .
                ';password=' . uri_escape(md5_base64($password))
        );
 
        $r->note_basic_auth_failure;
        #return Apache2::Const::DECLINED; # allow other authentication
        return Apache2::Const::HTTP_UNAUTHORIZED;
}

1;
And this is a small CGI script on the Koha server's side which checks the userid and password hash and returns the appropriate status:
#!/usr/bin/perl

# ScriptAlias /koha-auth /srv/koha-auth/auth.cgi

use warnings;
use strict;

use CGI;
use DBI;

our $dsn      = 'DBI:mysql:dbname=koha';
our $user     = 'koha-database-user';
our $passwd   = 'koha-database-password';

my $q = CGI->new;

my $status = 200;

sub out {
        my ($status,$text) = @_;
        print $q->header( -status => $status ), "$text\r\n";
        exit;
}

out( 500, "NO PARAMS" ) unless $q->param;

my $dbh = DBI->connect($dsn, $user,$passwd, { RaiseError => 1, AutoCommit => 0 }) || die $DBI::errstr;

my $sth = $dbh->prepare(q{
        select 1 from borrowers where userid = ? and password = ?
});

my ( $userid, $password ) = ( $q->param('userid'), $q->param('password') );
$password =~ s{ }{+}g;

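$sth->execute( $userid, $password );

# fetch the row instead of relying on rows(), which DBI does not guarantee for SELECT
my ($found) = $sth->fetchrow_array;

if ( $found ) {
        out( 200, "OK" );
} else {
        out( 404, "ERROR" );
}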
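
The `s{ }{+}g` line in the script above is there because a literal `+` in a query string is decoded as a space, and base64 output can contain `+`. The following Python sketch reproduces the effect with the standard library; the userid and the hash value are invented for the demonstration.

```python
from urllib.parse import parse_qs, quote

# Hypothetical hash value containing '+' and '/', both legal base64 characters.
password_hash = "ab+cd/ef+gh"

# Sent as-is: form decoding turns each '+' into a space.
raw_query = "userid=jdoe&password=" + password_hash
decoded_raw = parse_qs(raw_query)["password"][0]

# Percent-encoded first: the value survives the round trip unchanged.
safe_query = "userid=jdoe&password=" + quote(password_hash, safe="")
decoded_safe = parse_qs(safe_query)["password"][0]

# The server-side workaround: put the '+' characters back.
repaired = decoded_raw.replace(" ", "+")

if __name__ == "__main__":
    print(repr(decoded_raw))
    print(repr(decoded_safe))
    print(repr(repaired))
```

The workaround is safe here only because a base64 string never contains a real space, so every space seen on the server side must have been a `+`.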
To complete this setup, we also have to define a virtual host on the proxy server which will tie our components together:
<VirtualHost *:443>
        SSLEngine on

        SSLCertificateFile    /etc/apache2/proxy.example.org.pem
        SSLCertificateKeyFile /etc/apache2/proxy.example.org.pem

        ServerName proxy.example.org

        ProxyRequests Off
        ProxyPreserveHost Off

        PerlInputFilterHandler Apache2::ModProxyPerlHtml
        PerlOutputFilterHandler Apache2::ModProxyPerlHtml
        SetHandler perl-script
        PerlSetVar ProxyHTMLVerbose "On"

        <Proxy *>
                Order deny,allow
                Allow from all
        </Proxy>

        PerlAuthenHandler Apache2::AuthenDBMCache Apache2::AuthKoha
        PerlSetVar AuthenDBMCache_File  /tmp/auth-cache
        PerlSetVar AuthenDBMCache_TTL   3600
        PerlSetVar AuthenDBMCache_Debug On


        ProxyPass /secure/ http://secure.example.com/
        <Location /secure/>
                ProxyPassReverse /
                PerlAddVar ProxyHTMLURLMap "/ /secure/"
                PerlAddVar ProxyHTMLURLMap "http://secure.example.com /secure"

                AuthName Proxy
                AuthType Basic
                require valid-user
        </Location>
</VirtualHost>
This will enable you to access https://proxy.example.org/secure/ (the virtual host listens on port 443 with SSL) and get the content of http://secure.example.com/
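
To illustrate what the two ProxyHTMLURLMap lines are meant to achieve, here is a simplified Python sketch of prefix-based link rewriting. It is not how Apache2::ModProxyPerlHtml is implemented; the regular expression, the function names and the first-match-wins rule are assumptions chosen to keep the example short.

```python
import re

# (prefix, replacement) pairs taken from the configuration above;
# the absolute form is checked first, then site-relative paths.
URL_MAP = [
    ("http://secure.example.com", "/secure"),
    ("/", "/secure/"),
]

# Very rough attribute matcher, good enough for a demonstration only.
ATTR_RE = re.compile(r'\b(href|src|action)=(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_url(url, url_map=URL_MAP):
    """Replace the first matching prefix so the link points back at the proxy."""
    for prefix, replacement in url_map:
        if url.startswith(prefix):
            return replacement + url[len(prefix):]
    return url  # relative links already resolve under /secure/


def rewrite_html(page):
    """Rewrite link-carrying attributes in an HTML fragment."""
    def fix(match):
        name, quote_char, url = match.groups()
        return "%s=%s%s%s" % (name, quote_char, rewrite_url(url), quote_char)
    return ATTR_RE.sub(fix, page)


if __name__ == "__main__":
    print(rewrite_html('<a href="http://secure.example.com/a">a</a> <img src="/logo.png">'))
```

Without this kind of rewriting, any absolute or site-relative link in the proxied page would send the browser straight to the hidden server, which it cannot reach.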

You will also notice that I'm using Apache2::AuthenDBMCache to prevent the proxy from checking user credentials for every page element (which would be slow). At first, this setup didn't work well. I would get No access to /tmp/auth-cache at -e line 0 because the client browser was opening multiple connections at the same time and perl's dbmopen didn't like that. Fortunately, it was easy to fix: I just added use DB_File; in Apache2::AuthenDBMCache, which forced dbmopen to use Berkeley DB (which allows multiple readers) instead of the default GDBM.
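
The idea behind such a cache can be sketched in Python as follows. This is an in-memory illustration only, not the Apache2::AuthenDBMCache implementation (which keeps its entries in a DBM file shared between Apache processes); the class and function names are made up, and the 3600 second default simply mirrors the TTL from the configuration above.

```python
import time


class TTLCache:
    """Remember successful logins for a limited time (in-memory only)."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the cache can be tested
        self._expires = {}

    def remember(self, key):
        self._expires[key] = self.clock() + self.ttl

    def is_fresh(self, key):
        expires = self._expires.get(key)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self._expires[key]  # entry is too old, forget it
            return False
        return True


def authenticate(user, password_hash, cache, verify):
    """Ask the slow backend only when the cache has no fresh entry."""
    key = (user, password_hash)
    if cache.is_fresh(key):
        return True
    if verify(user, password_hash):
        cache.remember(key)
        return True
    return False
```

Here `verify` stands for the slow call to the Koha server. Only successful checks are cached, so a wrong password is always checked against the backend again.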

Once again, perl proved to be the duct tape of the Internet. With a few lines of code and some configuration you can make wonderful things. So, why don't you? :-)

I'm working on a Linux version of Sun storage machines, using commodity hardware, OpenVZ and Fuse-ZFS. I do have a working system in my Sysadmin Cookbook, so I might as well write a little bit of documentation about it.

My basic requirements are:

This makes it a self-running system which won't fall over itself, so let's see how it looks:

root@opl:~# zpool status
  pool: opl
 state: ONLINE
 scrub: resilver completed after 1h59m with 0 errors on Wed Jun  3 15:29:50 2009
config:

        NAME        STATE     READ WRITE CKSUM
        opl         ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0

errors: No known data errors
root@opl:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
opl                            183G  35.7G    21K  /opl
opl/backup                     180G  35.7G    22K  /opl/backup
opl/backup/212052             76.1G  35.7G  8.12G  /opl/backup/212052
opl/backup/212052@2009-05-01  5.69G      -  7.50G  -
opl/backup/212052@2009-05-10  5.69G      -  7.67G  -
opl/backup/212052@2009-05-15  5.57G      -  7.49G  -
opl/backup/212052@2009-05-22  3.54G      -  7.74G  -
opl/backup/212052@2009-05-25  3.99G      -  8.38G  -
opl/backup/212052@2009-05-26  3.99G      -  8.38G  -
...
opl/backup/212052@2009-06-05  3.72G      -  8.09G  -
opl/backup/212052@2009-06-06      0      -  8.12G  -
opl/backup/212056             1.42G  35.7G   674M  /opl/backup/212056
opl/backup/212056@2009-05-30  37.1M      -   688M  -
opl/backup/212056@2009-05-31  47.3M      -   747M  -
opl/backup/212056@2009-06-01  40.9M      -   762M  -
opl/backup/212056@2009-06-02  62.4M      -   787M  -
...
opl/backup/212056@2009-06-05  12.1M      -  1.02G  -
opl/backup/212056@2009-06-06      0      -   674M  -
opl/backup/212226              103G  35.7G  26.8G  /opl/backup/212226
opl/backup/212226@2009-05-05  4.29G      -  26.7G  -
opl/backup/212226@2009-05-10  4.04G      -  26.6G  -
opl/backup/212226@2009-05-15  4.19G      -  26.6G  -
opl/backup/212226@2009-05-22  4.12G      -  26.7G  -
opl/backup/212226@2009-05-23  4.12G      -  26.7G  -
opl/backup/212226@2009-05-24  4.09G      -  26.6G  -
opl/backup/212226@2009-05-25  4.14G      -  26.7G  -
opl/backup/212226@2009-05-26  4.13G      -  26.7G  -
...
opl/backup/212226@2009-06-05  4.20G      -  26.8G  -
opl/backup/212226@2009-06-06      0      -  26.8G  -
opl/clone                      719M  35.7G    25K  /opl/clone
opl/clone/212056-60018         666M  35.7G  1.39G  /opl/clone/212056-60018
opl/clone/212226-60017        53.0M  35.7G  26.7G  /opl/clone/212226-60017
opl/vz                        1.59G  35.7G  43.5K  /opl/vz
opl/vz/private                1.59G  35.7G    22K  /opl/vz/private
opl/vz/private/60014           869M  35.7G   869M  /opl/vz/private/60014
opl/vz/private/60015           488M  35.7G   488M  /opl/vz/private/60015
opl/vz/private/60016           275M  35.7G   275M  /opl/vz/private/60016
There are several conventions here which are useful:
  • the pool is named the same as the machine (borrowing from the Debian way of naming LVM volume groups), which makes it easy to export/import pools on different machines (I did run it with a mirror over nbd for a while)
  • snapshot names are the dates of the snapshots, for easy overview
  • clones (writable snapshots) are named using a combination of the backup ID and the new container ID
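
These conventions are simple enough to express as a few helper functions. The Python sketch below only builds names and the argument list for `zfs clone <snapshot> <filesystem>`; it does not run anything. The function names are invented, and pairing clone 212056-60018 with the 2009-06-06 snapshot is an assumption made for the example, since the listing above does not show which snapshot each clone came from.

```python
import datetime


def snapshot_name(pool, container_id, day):
    """Snapshot of a backup dataset, named by date: pool/backup/ID@YYYY-MM-DD."""
    return "%s/backup/%s@%s" % (pool, container_id, day.isoformat())


def clone_name(pool, backup_id, new_id):
    """Clone dataset named from the backup ID and the new container ID."""
    return "%s/clone/%s-%s" % (pool, backup_id, new_id)


def clone_command(pool, backup_id, new_id, day):
    """Argument list for 'zfs clone <snapshot> <filesystem>'."""
    return ["zfs", "clone",
            snapshot_name(pool, backup_id, day),
            clone_name(pool, backup_id, new_id)]


if __name__ == "__main__":
    day = datetime.date(2009, 6, 6)
    print(snapshot_name("opl", 212056, day))
    print(" ".join(clone_command("opl", 212056, 60018, day)))
```

Because ISO dates sort correctly as plain text, `zfs list` shows the snapshots of each backup in chronological order without any extra work.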

There are several things which I wouldn't be able to get without zfs:

  • clones can grow as much as they need
  • data is compressed, which increases effective disk IO throughput as a result
  • zfs and zpool commands are a really nice and intuitive way to issue commands to the filesystem
  • zpool history is a great idea: all filesystem operations are written to an internal log
  • the ability to resilver (read/write all data on platters), together with checksums, makes it robust against disk errors

So, you think that your network is slow. But how would you test that? You can feel that the speed between different hosts is different, but you need some data to find the problem. Here is my take on this...

First, select a subset of machines to test network speed on and install netpipe-tcp. Then run NPtcp on the target machines and NPtcp -h hostname -u 1048576 -o /tmp/hostname.np on the machine from which you are testing bandwidth. Several iterations later, you will have a bunch of *.np files which are ready for analysis.

You can do it by hand, but this handy perl script will convert *.np files into a Graphviz dot file, which looks like this: netpipe-grahviz.png

GraphViz will do its auto-layout magic, and just by looking at the picture you will immediately notice that there is a 100Mbit/s link somewhere in between the machines... Pictures can really replace thousands of words...
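
The conversion itself is not complicated. Here is a rough Python sketch of the same idea (it is not the perl script mentioned above). It assumes that each data row in a *.np file is whitespace-separated with the throughput in Mbps in the second column, and the host names in the usage example are hypothetical.

```python
def peak_mbps(np_text):
    """Highest throughput (second column) found in NetPIPE output text."""
    best = 0.0
    for line in np_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            best = max(best, float(fields[1]))
        except ValueError:
            continue  # skip anything that is not a data row
    return best


def to_dot(source, peaks):
    """Build a Graphviz digraph with one labeled edge per tested target."""
    lines = ["digraph netpipe {"]
    for target in sorted(peaks):
        lines.append('  "%s" -> "%s" [label="%.0f Mbps"];'
                     % (source, target, peaks[target]))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "1 0.5 0.00001\n1024 89.2 0.0001\n1048576 94.1 0.08\n"
    print(to_dot("testhost", {"target1": peak_mbps(sample)}))
```

Feeding the result to `dot -Tpng` produces a picture in which a slow link stands out by its label.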

For quite some time I wanted to try PXE booting. After all, I did write a bootp and tftp server for ADSL modems, so how complicated can it be?

I decided to use dnsmasq as the server, and added the following configuration options to dnsmasq:

enable-tftp
tftp-root=/srv/sysadmin-cookbook/recepies/pxe/tftpboot/
dhcp-boot=pxelinux.0
Then, I created tftpboot from upstream Debian netboot:
wget -nc ftp.hr.debian.org/debian/dists/lenny/main/installer-i386/current/images/netboot/netboot.tar.gz \
&& mkdir tftpboot && cd tftpboot && tar xvfz ../netboot.tar.gz
It all seemed nice and well, so I decided to try it using an Eee PC 701. And it didn't work. I didn't have any network link, tshark -i eth0 didn't report any network traffic, and everything suggested that the BIOS didn't power on the network card.

I even tried the latest BIOS upgrade, but it didn't help. I was quite sure that the configuration was correct (it's so simple, after all), so I tried to boot a ThinkPad. Which worked...

So, I had a PXE environment which worked, just not with the Eee PC. Fortunately, there is an alternative to buggy PXE boots: gPXE. It comes with a bootable USB version which, to my amazement, worked perfectly on the Eee PC. If you want to know all the gory details about gPXE, watch this video. It's well worth your time...

As you might have guessed by now, I have been playing with file systems for a backup appliance. So, against my better judgment, I decided to try btrfs to see how ready it is to replace my zfs-fuse configuration with a real in-kernel file system (zfs-fuse is not slow, because disks are much slower than any piece of software).

So far, I have found the following annoyances in btrfs:

  1. snapshots can't be removed (I'm doing incremental forever backups, so this is not a show-stopper)
    You can remove all files in the snapshot directory, but not the directory itself. I would guess that removing files would just increase disk usage, because it's a copy-on-write filesystem, but I didn't test that.
  2. there is no indication which directory is a snapshot (if you didn't write down in a log which one is a snapshot, you are out of luck)
  3. it seeks quite a lot (there is 40-70% wait time in vmstat while running rsync, which I guess is seeking, because there are no block input/output operations at the same time)
  4. it will oops your (Debian 2.6.29-2-686) kernel:
    Message from syslogd@klin at May 16 00:42:31 ...
     kernel:[ 4057.994566]  [<c0119e0f>] kmap_atomic_prot+0xbd/0xdd
    Message from syslogd@klin at May 16 00:42:31 ...
     kernel:[ 4057.994576]  [<c0119d30>] kunmap_atomic+0x58/0x7a
    Message from syslogd@klin at May 16 00:42:31 ...
     kernel:[ 4057.994586]  [<f83a61a2>] btrfs_cow_block+0x134/0x13d [btrfs]
    Message from syslogd@klin at May 16 00:42:31 ...
     kernel:[ 4057.994608]  [<f83a8b4b>] btrfs_search_slot+0x1f0/0x622 [btrfs]
    Messag./pull-snapshot-backup.sh: line 8:  4316 Segmentation fault      rsync -ravHC --numeric-ids --delete $from:/mnt/vz-backup/private/$1/ /$pool/$1/
    
    dmesg-btrfs-bug.txt

After that I concluded that the warning about the alpha state of btrfs is there for a reason. I didn't fully appreciate Theodore Ts'o's warning about the development status of btrfs until I got a kernel oops.

Let me first explain background to the story: you want a system to implement distributed printing. It has local accounts (it can fetch users from LDAP) and does routing of printed documents to printers which have card readers so that users can pick up printouts after they identify with a card.

Sounds complicated? O.K., let's consider that we have a system and we are trying to deploy it. At this point it doesn't matter whether you already paid for it or whether it's open or closed source. Really.

2008-05-08_virtworkshop.jpg

You are trying to configure it. It's Java (because it's an enterprise system) and it seems that most things are configured using .ini files. After four weeks of trying to make it work, you have the following facts:

  • configuration options are not honored in all parts of the system: some options exist but aren't used consistently (in this case, although there is an objectclass option for LDAP entries, and it's changed to HrEduPerson, the system sometimes uses Person)
  • some configuration options have special limits within application logic: in our case, if we turn on the flag to disable negative credits on cards, the system doesn't allow users to use it without 10 credits. This doesn't make sense, because there is an administration interface for this option, and it shows 0.00
First, let me emphasize that this problem might be the same for both types of software. Every piece of software is reliable only in the environment in which it's tested, and I know that very well from my experience with Open Source. However...

If system is closed product without source

You can exchange several e-mails with the help-desk, which is really first level customer support that is more or less working from a cookbook. I have seen such help-desks at both previous jobs, so I don't really expect deep technical expertise about the application. However, that resulted in a painful trial-and-error process because configuration options are somewhat cryptic and sparsely documented.

If system has source available

If I could look into source of application I could fix configuration option names. Or improve documentation. Probably even fix problems that I found and submit patch to improve upstream project (or pick another one because this one just isn't worth it).

So the real moral of this story is: closed source projects limit your flexibility. They will drain your time and bring you a half-working solution without the ability to fix it yourself. I really honestly cannot understand why someone would choose that.

Closed hardware - open source driver

I also have another example of a company within the same industry (printing) with closed hardware which at least got the driver part right: Dualys has source code for its CUPS driver. I still haven't found time to try it out, but I was afraid that making a custom card printer would be more work than syncing a closed source commercial application with LDAP, right?

Freedom as right of the user

Isn't it funny that Richard Stallman's Crusade for Free Software started with a printer?

It seems that it surprises me a little every year, but DORS/CLUC was once again somewhat different this year from the last one. If I have to sum it up in one sentence: there is a certain number of people in Croatia who really understand the topics discussed at this conference. The only problem is that we actually all agree, so the discussions are perhaps not critical enough :-)

This year we had a series of interesting lightning talks, one of which was mine: Everything you wanted to know about RFID but were afraid to ask... in 5 minutes.

The next day I tried to get the audience interested in liberating some piece of hardware. If you found my talk interesting, you will probably also like the talk from this year's FOSDEM about how Gnash (a free Flash player) was written.
