Results tagged “perl”

Redis is another in-memory hash store with disk persistence. It's different from memcached or Tokyo Cabinet because it has native list and set types. Even better, it has atomic operations like set intersection, increment/decrement, sort (!) and so on. This makes it suitable for implementing real applications without traditional relational databases.

If you want to try it with perl, head over to the Redis perl bindings and grab redis itself from the git repository at github. The bindings support tie to hashes or arrays if you don't want to use the Redis API directly.
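Here is a minimal sketch of both styles (the tie class names are those shipped with the bindings; treat the details as illustrative, since the API is still settling):

  use Redis;
  use Redis::Hash;

  # direct API: connects to redis on 127.0.0.1:6379 by default
  my $r = Redis->new;
  $r->set( counter => 42 );
  $r->incr('counter');              # atomic increment
  print $r->get('counter'), "\n";   # prints 43

  # tie interface: ordinary hash operations become redis commands
  tie my %data, 'Redis::Hash', 'prefix_';
  $data{foo} = 'bar';               # stored in redis as prefix_foo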

I just finished implementing the current version of the protocol (which will change soon, it seems) and I would like to write some sample scripts before I push it to a CPAN near you.

This weekend I worked on LDAP mungling: we needed to roll out Koha with LDAP support, but at the same time the central LDAP server didn't have all the data (date of birth, gender and address) needed for a full user entry in Koha. We had that data available as a CSV export from other systems.

There are many ways to tackle this problem, from modifying Koha's LDAP support (which I was somewhat reluctant to do) to importing the data back into LDAP. However, none of those things can be done in just one day, so I decided to write a small LDAP proxy which would do the data merging for me.

I started with a Net::LDAP::Server based solution implemented by the LDAP::Virtual module, but soon I stumbled upon a problem with compareRequest which I couldn't implement correctly. Since it's required for login, this was a show-stopper.

After a bit of searching, I found simple-proxy.pl which is part of Net::LDAP. This is a simpler script which operates directly on sockets and the ASN.1 encoding of entries. It's very useful for debugging, so I decided to re-implement modification of searchResEntry responses from the LDAP server as ldap-rewrite.pl with the following changes:

  • augment LDAP entries with data from YAML files (with a configurable prefix for attribute names)
  • support SSL to the upstream LDAP server
  • expand attributes with multiple values into a separate attribute for each occurrence (to enable easy import of, say, the second value of the address attribute as address_1; it's 0-based, same as perl arrays)
  • generate additional attributes by concatenating a prefix: from the data with the attribute name (to get hrEduPersonUniqueNumber_JMBG from the attribute hrEduPersonUniqueNumber which has JMBG: 1234567890 as its value)

To keep it really KISS I used yaml files named after the dn (for example, uid=dpavlin,dc=example,dc=com.yaml) as a simple, human readable file format. This enabled me to separate the data-mungling part into csv2yaml.pl. In this script, I convert values from a #-delimited CSV into separate attributes, detect whether phone numbers are mobile or fixed, and do other tweaks (like gender mapping). YAML files are also nice if I want to implement an audit trail of changes: I can just import them all into git and be done with it.
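For example, an overlay file for a single user could look something like this (the attribute names here are made up for illustration, not the ones our schema actually uses):

  # uid=dpavlin,dc=example,dc=com.yaml
  dateOfBirth: 1970-01-01
  gender: M
  address: 'Example street 42, Zagreb'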

For future versions, I can envision the overlay data also being fetched from a database, so I could add additional attributes to LDAP entries directly from the Koha database. This will be useful when connecting copiers which require LDAP with a card number for each user, something which isn't available in the upstream LDAP directory.

Not bad for 4k of perl code :-) I hope this will help you use LDAP as a directory for different data as opposed to just a login service. Don't forget to push all useful data back to the LDAP server, so that all applications can benefit from it without worrying about the source data format.

I'm preparing walk-through screencasts for a workshop about virtualization, so I needed an easy way to produce console screencasts.

First, I found TTYShare which displays ttyrec files using flash, but I really wanted to copy/paste parts of commands and disliked the flash plugin requirement.

It seems that I wasn't the only one who wanted a JavaScript client for ttyrec. jsttplay can produce screencasts like this one about OpenVZ using plain copy/paste friendly JavaScript within the browser.

But, for some parts (like Debian installation under VirtualBox) I really needed movie capture. The x11grab option from ffmpeg seemed easy enough to use:

ffmpeg -f x11grab -s 640x480 -r 10 -i :0.0 /tmp/screencast.avi
but getting the right window size isn't something I want to do by hand. So, I wrote a small perl script, record-screencast.pl, which extracts the window position and size using xwininfo and passes them to ffmpeg so I don't have to.
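The whole trick fits in a few lines; here is a rough sketch of the idea (the real record-screencast.pl may differ in details like frame rate and output path):

  #!/usr/bin/perl
  use strict;
  use warnings;

  # xwininfo waits for a mouse click on a window, then reports its geometry
  my $info = `xwininfo`;
  my ($x) = $info =~ /Absolute upper-left X:\s+(\d+)/;
  my ($y) = $info =~ /Absolute upper-left Y:\s+(\d+)/;
  my ($w) = $info =~ /Width:\s+(\d+)/;
  my ($h) = $info =~ /Height:\s+(\d+)/;

  # most codecs want even dimensions
  $_ -= $_ % 2 for ( $w, $h );

  exec 'ffmpeg', '-f', 'x11grab',
      '-s', "${w}x${h}", '-r', 10,
      '-i', ":0.0+$x,$y", '/tmp/screencast.avi';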

Wouldn't it be nice to have a CGI script which would convert a bunch of SQL queries into an XLS file on-the-fly? And while we are at it, let's have multiple reports, each in its own directory?

In a sense, it's a simple REST API to SQL files on disk which produces Excel files. I first wrote something like this back in 2002, but until now, I didn't have a subversion repository for it or announce it to the world.

Each file in the current directory which ends in *.sql will be converted to an Excel sheet. If you want a specific order, you can prefix the filenames with numbers, which will be stripped when creating sheet names.

Comments in sql files (lines beginning with --) will be placed in the first line, in bold.

To specify the database on which an SQL query is executed, the \c database syntax is supported.
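So a report file might look like this (the database, table and column names are made up for this example):

  -- Borrowers per branch
  \c koha
  select branchcode, count(*) from borrowers group by branchcode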

You can also run the script from the command line, and it will produce a sql_reports.xls file.

If run within a directory, it will use the files in that directory to produce the output file.

When called as a CGI, a directory name can be appended to the script name to produce a report for any sub-directory within the directory where sql2xls.cgi is installed.

INSTALLATION

The only required file is the script itself, sql2xls.cgi.

If your server is configured to execute .cgi files, you can drop this script anywhere, but you can also add something like

   ScriptAlias /xls-reports /srv/SQL2XLS/sql2xls.cgi

in Apache's virtual host configuration to get nice URLs.

To configure the default database, user, password and other settings, create a config.pl file in the same directory as sql2xls.cgi, with something like this:

  $dsn      = 'DBI:mysql:dbname=';
  $database = 'database';
  $user     = 'user';
  $passwd   = 'password';
  $path     = 'sql_reports.xls';

  $db_encoding     = 'utf-8';
  $xls_date_format = 'dd.mm.yyyy';
  $debug = 1;

SECURITY

There is none. Use Apache's auth modules if you need it.

Publish your data with Exhibit

As you might remember, back in 2007 I wrote about Exhibit, which has in the meantime released version 2.0 and moved to google code.

This time, I made a few more converters which enable you to:

[image: simile-svn.png]

This is probably the best test of JavaScript speed in your browser. Exhibit seems to work best with around 500 items in older browsers, but Firefox 3.1b2 works with 3000 objects, even on an EeePC :-)

Last week I was playing around with understanding the serial protocol between an RFID reader and the computer. For a start, I had a Windows application which could communicate with the RFID reader over a USB serial adapter (which is included in the device, so it looks like a USB device).

First, I needed to sniff the USB serial traffic under Windows to understand the protocol between the device and the program.

Then I wrote a simple script to reformat the output from portmon into a more readable format, and found out that the packets have a two-byte checksum in them.

After I tried all the simple combinations to produce a valid checksum, I decided to ask a question about checksum guessing at stackoverflow.com. This was a great idea, because selwyn stepped in and confirmed that my checksum is CCITT.

Having all those parts in place, the next step was to write a perl script using Device::SerialPort to communicate with the serial port, and thus the RFID reader. Right now, I'm pondering how to integrate it with Koha, but that's a topic for another post...
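The gist of it looks something like this; the port settings and command bytes below are made-up placeholders rather than the reader's actual protocol, and the checksum byte order is an assumption:

  use Device::SerialPort;
  use Digest::CRC qw(crcccitt);

  my $port = Device::SerialPort->new('/dev/ttyUSB0')
      or die "can't open serial port: $!";
  $port->baudrate(9600);    # placeholder settings
  $port->databits(8);
  $port->parity('none');
  $port->stopbits(1);
  $port->write_settings or die "can't apply settings";

  # build a packet: payload followed by the two-byte CCITT checksum
  my $payload = "\xAA\x02\x20";    # hypothetical command bytes
  my $packet  = $payload . pack 'n', crcccitt($payload);
  $port->write($packet);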

My mind is just too accustomed to RDBMS engines to accept that I can't have GROUP BY in my shell pipes. So I wrote one: groupby.pl.


Aside from the fact that it somewhat looks like perl golfing (which I'm somewhat proud of), let's see how it looks:


dpavlin@llin:~/private/perl$ ps axv | ./groupby.pl 'sum:($6+$7+$8),10,count:10,min:($6+$7+$8),max:($6+$7+$8)' | sort -k1 -nr | head -10 | align
440947 /usr/lib/iceweasel/firefox-bin 1 440947 440947
390913 /usr/sbin/apache2 11 22207 39875
180943 /usr/bin/X 1 180943 180943
135279 /usr/bin/pinot-dbus-daemon 1 135279 135279
122254 mocp 2 25131 97123
84887 pinot 1 84887 84887
78279 postgres: 5 10723 21971
70030 /usr/bin/perl 6 6959 15615
50213 /bin/bash 7 6351 7343
49266 /usr/lib/postgresql/8.2/bin/postgres 2 24631 24635

This displays the total memory usage per process, its name, the number of such processes and the range of memory usage. We can then use our old friend sum.pl to produce a console graph, but I already wrote about that.


So, let's move to another example, this time for OpenVZ. Let's see how much memory each virtual machine is using (and get the number of processes for free):



$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1'
2209504127 0 265
611768242 212024 38
162484775 212037 19
170797534 212052 38
104853258 212226 26
712007227 212253 21

But wouldn't it be nice to display hostnames instead of VEID numbers? We can, using the --join and --on options (which are really backticks on steroids):

$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1' --join 'sudo vzlist -H -o veid,hostname' --on 2
2146263206 0 259
675835528 saturn.ffzg.hr 40
162484775 arh.rot13.org 19
170797534 koha-dev.rot13.org 38
104853258 koha.ffzg.hr 26
712011323 zemlja.ffzg.hr 21

Which brings us to the final result:

$ vzps -E axv --no-headers | ./groupby.pl 'sum:($7+$8+$9*1024),1,count:1' --join 'sudo vzlist -H -o veid,hostname' --on 2 | sort -rn | align | ./sum.pl -h
0 260 2105M OOOOOOOOOOOOOOOOOOO 2105M
zemlja.ffzg.hr 21 679M OOOOOO------------------- 2784M
saturn.ffzg.hr 35 512M OOOO-------------------------- 3296M
koha-dev.rot13.org 38 162M O------------------------------ 3459M
arh.rot13.org 19 154M O-------------------------------- 3614M
koha.ffzg.hr 26 99M ---------------------------------- 3714M

So, here you have it: an SQL-like query language for your shell pipes.
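If you are curious how little is needed for the core of this, here is a stripped-down sketch of the aggregation idea; the real groupby.pl parses a query spec instead of hard-coding which columns to sum and group by:

  # sum column 1 grouped by column 2, with a count per group
  use strict;
  use warnings;

  my ( %sum, %count );
  while (<STDIN>) {
      my @col = split;
      my $key = $col[1];
      $sum{$key}   += $col[0];
      $count{$key} += 1;
  }
  print join( "\t", $sum{$_}, $_, $count{$_} ), "\n" for keys %sum;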

First of all, we had the first ever Croatian perl workshop. Thanks to all the people who showed up; we had an attendance of about ten.

Organizing a workshop turned out to be much more work than I anticipated, and various other tasks stopped me from preparing for it as well as I should have. Also, the small number of people forced me to reconsider my lectures about perl. On one hand, I really, really tried to spread perl (and had the good fortune of being at the right place at the right time to get Zagreb.pm off the ground), but with such low attendance, I must conclude that perl is used by only about 20 people in Zagreb. This seems somehow disturbing. Comparing the size of Zagreb with Moscow turned out to show about the same proportion, so I was just overly optimistic.

I also gave a half-hour presentation about Jifty, based on Building a Jifty app in a jiffy by Kevin Falcone, and showed some examples of my jifty apps (I actually didn't talk about the last one, just mentioned it as an integration of external javascript -- CodePress in this case).

I also have to thank Andrew Shitov from Moscow.pm, who managed to prepare several very interesting topics which, in my opinion, made this event worthwhile. If it wasn't free I would ask for my money back :-\

Today at Razmjena vještina I gave a marathon four-hour lecture which was, I hope, somewhat useful. Unfortunately, we didn't manage to go into as much detail as I would have liked, but if nothing else I again used pgrestraier (which somehow indexes way too slowly, I'll have to look into why) and another handy little project I wrote last year for students in Zadar, pg-getfeed, which is really a small perl stored procedure that lets you run SQL queries against RSS feeds.

Initially created in 2006, this handy tool is best described by its original commit message:

IRC bot which replace human memory

Here is a quick run-down of the available features:

  • web archive with search
  • irc commands: last, grep/search, stat, poll/count
  • tags// in normal irc messages (tagcloud, filter by tag, export as RSS feed)
  • announce /me messages to Twitter (yes, lame, but that was a year ago)
  • tags are available as html links for embedding (in wikis)
  • RSS feed from messages with tags (also nice for embedding)
  • irssi log import (useful for recovery in case of failure of machine or service :-)
  • announce new messages from RSS feeds (nice for wiki changes, blog entries or commits)

It has grown quite a bit from the initial vision of recalling last messages on the web (and it does go through some hoops to produce a nice web archive). Adding tags allowed easy recall of interesting topics, but in a way it now provides a central hub for different content connected to irc.

It's written in perl using POE, and it's probably not the best example of POE usage. It is also somewhat PostgreSQL specific, but works well for our small community at the #razmjenavjestina irc channel. Since I have seen some interest in it, this blog post might serve as an announcement of its existence.
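For the curious, the basic shape of such a bot follows the standard POE::Component::IRC pattern; this is a bare skeleton to show the event flow, not the actual bot code (the server name is a placeholder):

  use strict;
  use warnings;
  use POE qw(Component::IRC);

  my $irc = POE::Component::IRC->spawn(
      nick   => 'membot',
      server => 'irc.example.org',
  );

  POE::Session->create(
      package_states => [ main => [qw(_start irc_001 irc_public)] ],
  );

  sub _start {
      $irc->yield( register => 'all' );
      $irc->yield( connect  => {} );
  }

  sub irc_001 {    # connected to server, safe to join channels
      $irc->yield( join => '#razmjenavjestina' );
  }

  sub irc_public { # the real bot stores each message in PostgreSQL here
      my ( $who, $where, $what ) = @_[ ARG0, ARG1, ARG2 ];
      print "$where->[0] <$who> $what\n";
  }

  POE::Kernel->run();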

I will probably add some documentation to its wiki page and add real multi-channel support (most of the code is there, but the web archive needs filtering by channel). If you want to /invite it to your channel, drop me a note.

Zagreb.pm first meeting

Yesterday, 2007-10-29, we had the first ever meeting of Zagreb PerlMongers. There were three of us, and while that's not much, it's a start in organizing perl users here in Zagreb, Croatia.

So, now that you know that Zagreb.pm exists, watch out for the official listing at Perl Mongers: Europe. We are not there yet, but soon...

We will have another meeting as soon as we set up a mailing list, and we'll try to do a better public announcement before it... If you are a perl monger near Zagreb, feel free to get in touch with me so we can keep you posted about progress.

Update: We now have a (Croatian speaking) list at Google Groups, so if you are interested in the next meeting, subscribe to the list...

I wanted to review my route to a recent conference, and there are a lot of map servers out there. But none of them offers a good way to print a map (analog copies are always useful :-)

I didn't really want the whole map, just the part which I had viewed. And then I noticed that those parts had already arrived on my laptop (because I can view them in the browser). Parts? Packets? With a quick trip to tshark (wireshark, but for the terminal) I captured a network trace of my browsing of the route.

So, now I had all the tiles I needed to print out (to find my way along the route) in pcap capture format. A quick CPAN search and I found Net::Analysis, which seemed to fit the bill. One simple HTTP transaction listener later, I had tiles saved in directories named after zoom levels, with each tile carrying its coordinates in the filename.
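The listener itself is tiny. From memory it looks roughly like the sketch below; the exact Net::Analysis callback names and transaction accessors should be double-checked against its documentation, and the tile URL pattern is just an example:

  package TileSaver;
  use base 'Net::Analysis::Listener::Base';

  # called once for each reassembled HTTP request/response pair
  sub http_transaction {
      my ( $self, $args ) = @_;
      my $t   = $args->{transaction};
      my $uri = $t->req->uri;    # e.g. /12/2231/1420.png
      if ( my ( $zoom, $x, $y ) = $uri =~ m{/(\d+)/(\d+)/(\d+)\.png} ) {
          mkdir $zoom;
          open my $fh, '>', "$zoom/$x-$y.png" or die $!;
          print $fh $t->resp->content;
      }
  }

  1;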

Do I want to stitch them together by hand? Of course not, a handy misspelled tool will do that for you. It's written as cleanly as possible, and might be a good place to look at one possible way to write perl code (don't write too much!).

If you didn't check the links to the source code, keep in mind that the whole working implementation is under 3k of perl code!

It does have dependencies on CPAN, but just imagine how much time I would have spent writing TCP session reassembly and/or multi-format image handling if I didn't use CPAN.

Off to the road now...

Subversion tools

In an effort to continue my hack-of-the-week series, here is a quick overview of a few subversion hacks I have worked on lately:

  • svn-ignore.sh is a tiny shell script which will bring all unversioned files in the current svn or svk repository into your $EDITOR and add the result of your edit to svn:ignore
  • svndump-move.pl is a more complex perl script which allows you to reorganize the directory layout of your repository while preserving revision history -- it solves problems like: oh, if only the root of my repository was the subdirectory foo...
  • svn2cvs is a bit older tool which received attention when Bartek Teodorczyk very patiently started to report problems with it. As a result, it now has a test suite, and it's much more robust

Most of the documentation for those tools is hidden in subversion commit messages. If you think they are useful, take a peek there...

Exhibit facet browsing

We have a few mp3 players which no longer work, but are still under warranty. So the idea was to pick another device (which will hopefully work longer). However, on-line shops leave a lot to be desired if you just want to do quick filtering of data.

By a very fortunate accident, I stumbled upon Exhibit from the SIMILE project at MIT, which brought us such nice tools as Timeline and Potluck.

So, I scraped the web, converted the data to CSV and tried to do something with it. In the process I revisited the problem of semi-structured data again: while the data is separated into columns, one column holds the generic description, the player name and all the characteristics.

So, what did I do? Well, I started with CPAN, and a few hours later I had a script which is rather good at parsing semi-structured CSV files. It supports the following (see the sketch after this list):

  • guess the CSV delimiter on its own (using Text::CSV::Separator)
  • recognize 10 Kb and similar sizes and normalize them (using Number::Bytes::Human)
  • splitting of comma (,) separated values within single field
  • strip common prefix from all values in one column
  • group values and produce additional properties in data
  • generate specified number of groups for numeric data, useful for price ranges
  • produce JSON output for Exhibit using JSON::Syck
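A stripped-down version of the core loop, showing just the delimiter guessing and the JSON output (the grouping and normalization steps are left out, and Text::CSV_XS is my assumption for the parser):

  use strict;
  use warnings;
  use Text::CSV_XS;
  use Text::CSV::Separator qw(get_separator);
  use JSON::Syck;

  my $file = shift @ARGV;
  my ($sep) = get_separator( path => $file );   # best delimiter guess
  my $csv = Text::CSV_XS->new( { sep_char => $sep, binary => 1 } );

  open my $fh, '<', $file or die $!;
  my @header = @{ $csv->getline($fh) };
  my @items;
  while ( my $row = $csv->getline($fh) ) {
      my %item;
      @item{@header} = @$row;
      push @items, \%item;
  }

  # Exhibit consumes JSON with all records under the items key
  print JSON::Syck::Dump( { items => \@items } );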


So how does it look?

In the end, it is very similar to the way Dabble DB parses your input. But, I never actually had any luck importing data into Dabble DB, so this one works better for me :-)

This will probably evolve into a universal munger from CSV to arbitrary hash structures. What would be a good name? Text::CSV::Mungler?

This is the first post in a series which will cover one hack a week on my blog. It will (hopefully) force me to write at least one post a week, and provide some historic trace of my work for later.

AMV free decoder

I'm quite pleased to announce that my efforts to decode the .amv format supported by most Chinese mp4 video players are over: I have a working decoder :-)

First, let's clean up some misconceptions: AMV is very similar to mjpeg video (frames are jpeg frames without a quantization table, same as mjpeg frames) and IMA ADPCM is used to encode audio (not mp3 frames, as often mentioned). The audio format isn't exactly IMA ADPCM, because it includes 8 bytes at the beginning of each frame which are used to seed the ADPCM decoder.

Let's for a moment consider the hardware of those players: there is a Z80 and a DSP. When playing mp3 files, the DSP is used to decode audio, but when playing video the DSP decodes jpeg frames (this also limits the maximum frame rate and picture size). So the choice of ADPCM was logical, since I suppose the Z80 decodes audio while playing a movie.

All this was done using clean-room reverse engineering, which means that I only used output from the Windows encoder. No disassembly of Windows code was involved (hey! why would someone waste his time on that?).

So the current decoder decodes frames into jpeg images (flipping them using jpegtran), decodes audio to a 16-bit linear .au file, and uses ffmpeg to produce the final .avi movie.
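The final assembly step boils down to something like the command below; the frame rate and file layout here are assumptions for illustration, not the script's actual parameters:

  ffmpeg -r 12 -i frames/%06d.jpg -i audio.au movie.avi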

As a result of this, there is enough knowledge embedded in the script (and in this post) to create an encoder. I probably won't have time for this in the next few weeks, but my idea is to make a perl script which will open any movie file supported by ffmpeg, resize it on-the-fly and then mangle the avi output stream to produce amv. You can think of it as cp which converts a movie for your player on-the-fly.

If somebody knows whether this ADPCM variant is supported by ffmpeg (reading the source didn't help much), I would be grateful for the info.

Tails from the past: Orao

This post is about a perl emulator of a 6502-based machine called Orao.

[image: screen.png]

It was my first computer (I attended a BASIC introduction course several times just so I could work on the machine). I never owned one (and almost nobody did back then), since most of the machines were installed in schools (like the one near my place offering basic computer literacy courses).

When I found out that Josip Perušanec wrote an Orao Emulator, I was very excited. After all, who can resist the temptation of running his old BASIC programs? But, since his version runs only on Windows (to be fair, it also runs under wine), I had to write my own. Credit where credit is due: without Josip's emulator (and especially the ROM images) you wouldn't be looking at the screenshot above.

I wanted to write it in perl, so at first I used Acme::6502 for processor emulation and wrote the screen interface in SDL. However, that seemed slowish, so in the end I wrote perl bindings for the great 6502 emulator M6502 and implemented the Orao emulation as an embedded perl interpreter inside the CPU emulator. I used Extending and Embedding Perl by Tim Jenness and Simon Cozens as my guide, and I can really recommend this book if you want to learn perl-C interaction. If I didn't already own a pdf copy, I would surely buy both the pdf and the paper copy. It's that good.

Both versions can start up Orao, but the pure perl version probably won't receive much love and care. As you can see in the screenshot, on the left is the display (Orao didn't have a text mode, this is a graphic display) and on the right is a graphic representation of the memory map. It doesn't support the keyboard at this moment, but that's just a few commits away :-)

Update: Changed the link to the source code. It's part of the bigger VRač project, but that will have to wait for another post...

status line for dwm

I will be working on battery for most of next week, so I spent some time tweaking my setup. I have been running 2.6.21 because of my efforts to make fglrx play nice with CONFIG_PARAVIRT, so I had the tickless kernel needed for powertop. To my horror, most of the interrupts on my laptop were created by PostgreSQL (which I will stop when on battery) and ACPI! And that's because I'm using dwm with a primitive shell grep/sed pipe to produce the status line.

So, polling ACPI every 3 seconds (which is a reasonable refresh time for me, even for the clock) is too much. And then I started dreaming: network traffic would be nice. And disk! Battery status when (dis)charging goes without saying. So, in the end, I also added temperature, and got something like this (when on power):

2007-05-27 22:15:40 | 0.30 0.13 0.06 |   5M D 1k   |  32b > 54b  | 59 C

The perl code is nice and short, but completely broken when it comes to estimating charging time (how does the acpi command-line tool calculate that?).
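The idea is simply to gather numbers from /proc and hand the formatted string to dwm via xsetroot; here is a trimmed-down sketch without the network, disk and battery parts (the /proc/acpi path varies between machines):

  use strict;
  use warnings;
  use POSIX qw(strftime);

  sub slurp { local ( @ARGV, $/ ) = @_; <> }

  my @load = ( split ' ', slurp('/proc/loadavg') )[ 0 .. 2 ];
  my ($temp) = slurp('/proc/acpi/thermal_zone/THM0/temperature') =~ /(\d+)/;

  my $status = join ' | ',
      strftime( '%Y-%m-%d %H:%M:%S', localtime ),
      "@load", "$temp C";

  # dwm displays the X root window name as its status line
  system 'xsetroot', '-name', $status;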

When started as a stand-alone utility, it can be a poor man's replacement for dstat :-)

Lately, it seems that different programming languages mostly differ by community (yes, that's a controversial statement, but bear with me).

For example, I just must brag about how easy it was to submit a patch to the perl module which kept Jifty from handling my multi-line entries:

18:14 < dpavlin> obra: I found out what's wrong with multi-line text areas in Jifty. It turned out that HTTP::Server::Simple bit me.
18:15 < dpavlin> obra: around line 310 in Simple.pm is $request_uri =~ /([^?]*)(?:\?(.*))?/ which should really be $request_uri =~ 
                 /([^?]*)(?:\?(.*))?/s since my request has %0a in URL which gets decoded to \n before this regex gets it.
18:16 < dpavlin> obra: should I file CPAN bug or something? (write test for it? ;-)
...
18:30 < obra> dpavlin: send me mail and I'll fix it in the next hourish?
...
19:50 < obra> dpavlin: thanks! released
...
20:42 < dpavlin> obra++
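To see why the missing /s flag matters here, a tiny test is enough: without it, . refuses to match the newline that the decoded %0a put into the query string.

  my $uri = "/entry?text=line1\nline2";      # %0a already URL-decoded
  my ($q1) = $uri =~ /[^?]*(?:\?(.*))?/;     # query stops at the newline
  my ($q2) = $uri =~ /[^?]*(?:\?(.*))?/s;    # /s lets . match \n as well
  print "without /s: $q1\n";                 # text=line1
  print "with /s: $q2\n";                    # text=line1 followed by line2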

If only all communication could be so useful... I might even change my mind about irc after all.

Fuse for perl on FreeBSD

I have been the maintainer of the fuse perl bindings for some time now. One of the things I'm somewhat proud of is the limited but working support for FreeBSD. Since I use FreeBSD only to test the fuse bindings for perl, I tend to forget all the little things which are different on it. So, here is a little reminder for my poor brain -- how to get my FreeBSD box up-to-date:

root@freebsd:/root# cvsup -L 2 /root/ports-supfile
root@freebsd:/# cd /usr/ports/sysutils/fusefs-libs
root@freebsd:/usr/ports/sysutils/fusefs-libs# make install
root@freebsd:/usr/ports/sysutils/fusefs-libs# cd /usr/ports/sysutils/fusefs-kmod/
root@freebsd:/usr/ports/sysutils/fusefs-kmod# make install

Currently the bindings still compile against perl 5.8.7, and I'm compiling perl 5.8.8 with threading to give it a try. I think I will really need to fix the tests to run on FreeBSD first (help from BSD users would be greatly appreciated).