January 2010 Archives

When you are working as a system architect or systems librarian, your job is to design systems. My initial idea was to create a small Google out of 12 machines which are dedicated to be web kiosks. I decided to strictly follow the loosely coupled principle, mostly to provide horizontal scaling for my data processing needs. I wanted to be able to add a machine or two if my query is too slow. This easily translates into "how long will I have to wait for my page to generate results".

I decided to split my system into three logical parts: network booting, data store, and quick reporting. So, let's take a look at each component separately:

  • PXElator
    • supported protocols: bootp, dhcp, tftp, http, amt, wol, syslog
    • boot kiosks using Webconverger (Debian Live based kiosk distribution)
    • provides web user interface for overview of network segment for audit
    • configuration is stored as files on disk, suitable for management with git or other source control management
  • MongoDB
    • NoSQL storage component which supports ad-hoc queries, indexes and other goodies
    • simple store for perl hashes which PXElator generates every time we see a network packet from one of the clients using one of the supported protocols
  • Sack
    • fastest possible way to execute a snippet of perl code over multiple machines
    • this involves sharing information to nodes, executing code on all of them and collecting results back, all in under 3 seconds!
    • web user interface for cloud overview and graph generation using gnuplot
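
The Sack idea (share data with nodes, run the same snippet everywhere, collect results) can be sketched in a few lines. This is only an illustration in Python, with threads standing in for machines; Sack itself is written in perl and its real interface is not shown here. The function names and the sample audit records are made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_shards(records, nodes):
    """Deal records round-robin into one shard per node."""
    shards = [[] for _ in range(nodes)]
    for i, record in enumerate(records):
        shards[i % nodes].append(record)
    return shards


def run_on_nodes(shards, snippet):
    """Run the same snippet over every shard in parallel, return partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return list(pool.map(snippet, shards))


def count_by_protocol(shard):
    """The 'snippet': count audit records per protocol in one shard."""
    return Counter(record["protocol"] for record in shard)


# Hypothetical audit records, one per observed network packet.
records = [
    {"ip": "10.60.0.11", "protocol": "dhcp"},
    {"ip": "10.60.0.11", "protocol": "tftp"},
    {"ip": "10.60.0.12", "protocol": "dhcp"},
    {"ip": "10.60.0.13", "protocol": "http"},
    {"ip": "10.60.0.12", "protocol": "tftp"},
]

partials = run_on_nodes(make_shards(records, nodes=3), count_by_protocol)
total = sum(partials, Counter())  # merge step on the master
print(dict(total))
```

The point of the shape is that the snippet only ever sees its own shard, so adding a machine means adding a shard, and only the small partial results travel back.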

When I started implementing this system last summer, I decided to use CouchDB for the storage layer. This wasn't really a good choice, since I didn't need transactions, MVCC or replication. Heck, I even implemented forking for documents stored in CouchDB to provide faster response to clients in PXElator.

Moving to much faster MongoDB I got ad-hoc queries which are usable (as in I can wait for them to finish) and if that's too slow, I can move data to Sack and query it directly from memory. As a happy side effect, making shards from MongoDB is much faster than using CouchDB bulk HTTP API, and it will allow me to feed shards directly from MongoDB to Sack nodes, without first creating shards on disk.

I'm quite happy with how it all turned out. I can configure any host using a small snippet of perl code in PXElator, issue ad-hoc queries on its audit data in MongoDB, or move data to Sack if I want to do data munging using perl.

As you noticed by now, I'm using a live distribution for kiosks, and the machines do have hard drives in them. The idea was to use those disks as storage with something like Sheepdog, which seems like a perfect fit. With it in place, I will have a real distributed, building size computer :-).

I have been using CouchDB for some time now, mostly as audit storage for PXElator. Audit data stores are most useful for ad-hoc queries (hum, when did I see that host last time?), and CouchDB map/reduces took half an hour or more. I wrote a small script, couchdb2mongodb.pl, to migrate my data over to MongoDB (in 26 minutes) and ran the first query I could write after reading the MongoDB documentation about advanced queries. It took only 30 seconds, compared to 30 minutes or more in CouchDB. I was amazed.

This was a NoSQL database which I can understand and tune. MongoDB has indexes and a profiler, so tuning the query down to three seconds was a simple matter of adding an index. All my RDBMS knowledge was reusable here, so I decided to take a look at why it is so much faster than CouchDB for the same data...
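
To see why an index turns a full scan into a lookup, here is a toy model in Python. It is not MongoDB code: documents are plain dicts and the "index" is a dict keyed on the queried field. The field names (`ip`, `time`) and the sample documents are made up for illustration.

```python
from collections import defaultdict

# Hypothetical audit documents; field names are illustrative only.
audit = [
    {"ip": "10.60.0.11", "time": 100},
    {"ip": "10.60.0.12", "time": 105},
    {"ip": "10.60.0.11", "time": 230},
    {"ip": "10.60.0.13", "time": 231},
    {"ip": "10.60.0.12", "time": 400},
]


def last_seen_scan(docs, ip):
    """Without an index: look at every document."""
    times = [d["time"] for d in docs if d["ip"] == ip]
    return max(times) if times else None


def build_index(docs, field):
    """Built once: map each value of `field` to positions of matching documents."""
    index = defaultdict(list)
    for pos, doc in enumerate(docs):
        index[doc[field]].append(pos)
    return index


def last_seen_indexed(docs, index, ip):
    """With an index: touch only the documents for this ip."""
    positions = index.get(ip, [])
    return max((docs[p]["time"] for p in positions), default=None)


by_ip = build_index(audit, "ip")
print(last_seen_scan(audit, "10.60.0.11"),
      last_seen_indexed(audit, by_ip, "10.60.0.11"))
```

Both functions return the same answer; the difference is that the scan cost grows with the whole collection while the indexed lookup grows only with the number of documents for that one host, at the price of building the index once.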

To be honest, "MongoDB, High-Performance SQL-Free Database" by Dwight Merriman, CEO of 10gen, won me over to finally try MongoDB. It was technical enough to make me think about MongoDB architecture and benefits. It's clearly a pragmatic "let's re-think horizontally scalable hash storage with ad-hoc queries" model, but with a funny twist: close coupling with language types, all encoded in BSON format, which is very similar to Google's protocol buffers.

First, let's have a look at the raw size of data on disk. At some level, it will translate to the number of IO operations involving rotating platters and usage of buffer cache.

root@opr:~# du -hc /var/lib/couchdb/0.9.0/.pxelator* /var/lib/couchdb/0.9.0/pxelator.couch
655M    /var/lib/couchdb/0.9.0/.pxelator_design
23M     /var/lib/couchdb/0.9.0/.pxelator_temp
7.8G    /var/lib/couchdb/0.9.0/pxelator.couch
8.4G    total

root@opr:~# du -hc /var/lib/mongodb/pxelator.*
65M     /var/lib/mongodb/pxelator.0
129M    /var/lib/mongodb/pxelator.1
257M    /var/lib/mongodb/pxelator.2
513M    /var/lib/mongodb/pxelator.3
513M    /var/lib/mongodb/pxelator.4
513M    /var/lib/mongodb/pxelator.5
17M     /var/lib/mongodb/pxelator.ns
2.0G    total
Here is a first hint about performance: MongoDB's 2G of data (which is used directly as mmap memory, leaving flushes and caching to the OS layer) is an almost perfect fit for the 3G of RAM I have in this machine.
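
The arithmetic behind that hint can be checked quickly. The sketch below only re-adds the numbers from the du output above (sizes as du rounded them, so the result is approximate); the variable names are mine.

```python
# Sizes in MB, copied from the du output for /var/lib/mongodb/pxelator.*
mongodb_files_mb = [65, 129, 257, 513, 513, 513, 17]
couchdb_total_gb = 8.4  # du total for the CouchDB files
ram_gb = 3              # RAM in this machine

mongodb_total_gb = sum(mongodb_files_mb) / 1024
print(f"MongoDB on disk: {mongodb_total_gb:.2f}G")
print(f"CouchDB is {couchdb_total_gb / mongodb_total_gb:.1f}x larger")
print(f"fits in RAM: {mongodb_total_gb < ram_gb}")
```

So the MongoDB files come to just under 2G, a bit more than four times smaller than the CouchDB files, and the whole data set can stay in the OS buffer cache.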

MongoDB has a mongodump utility which dumps BSON for backup, and the dump is even smaller:

root@opr:~# du -hcs dump/pxelator/*
1.1G    dump/pxelator/audit.bson
4.0K    dump/pxelator/system.indexes.bson
76K     dump/pxelator/system.profile.bson
1.1G    total

So I switched PXElator to use MongoDB as storage. I had never pushed anything into production after just one day of testing, but the first query speedup from 30 min to 30 sec, and the ability to cut it down to 3 sec by adding an index (which took about 13 sec to create), gives me a powerful analytical tool I didn't have before.

I had an OmniKey CardMan 5321 reader sitting on my desk for quite some time. The first time I tried it, I had a problem with the proprietary binary driver, which expected pcscd to be compiled without hal support to make it work.

Fortunately, we now have pcsc-omnikey package in Debian which should make usage of this reader much easier. But, I really wanted more low-level implementation, allowing me to muck with cards without need to pass through whole smart card stack (since I'm really only interested in RFID part of this reader).

So, I did some searching and found out that librfid - A Free Software RFID stack implements support for this reader, so here is a quick overview of how to get started:

# build dependency
dpavlin@klin:/rest/cvs/librfid$ sudo apt-get install libusb-dev

# checkout source
dpavlin@klin:/rest/cvs$ svn co https://svn.gnumonks.org/trunk/librfid/
dpavlin@klin:/rest/cvs$ cd librfid/
dpavlin@klin:/rest/cvs/librfid$ ./autogen.sh

# build
dpavlin@klin:/rest/cvs/librfid$ ./configure --enable-ccid
dpavlin@klin:/rest/cvs/librfid$ make
Now we can test if our reader is working:
dpavlin@klin:/rest/cvs/librfid$ sudo ./utils/librfid-tool -s
lt-librfid-tool - (C) 2005-2008 by Harald Welte
This program is Free Software and has ABSOLUTELY NO WARRANTY

initializing librfid
opening reader handle OpenPCD, CM5x21
No OpenPCD found
scanning for RFID token...
Layer 2 success (ISO 15693):  eb 6e 77 1f 00 01 04 e0
And, that's not all. We can also read content of our tag:
dpavlin@klin:/rest/cvs/librfid$ sudo ./utils/librfid-tool -r -1
lt-librfid-tool - (C) 2005-2008 by Harald Welte
This program is Free Software and has ABSOLUTELY NO WARRANTY

initializing librfid
opening reader handle OpenPCD, CM5x21
No OpenPCD found
Layer2 init ok
Layer 2 success (ISO 15693)[8]: ' eb 6e 77 1f 00 01 04 e0'
block[  0:00]sec:0x8 data(4):  04 11 00 01
block[  1:01]sec:0x8 data(4):  31 33 30 32
block[  2:02]sec:0x8 data(4):  30 32 39 37
block[  3:03]sec:0x8 data(4):  31 30 00 00
block[  4:04]sec:0x8 data(4):  00 00 00 00
...
block[ 26:1a]sec:0x8 data(4):  00 00 00 00
block[ 27:1b]sec:0x8 data(4):  57 5f 4f 4b
no data(read_block(28)>> -1)
It's exactly what I was looking for: ability to do low-level block transfer with RFID card.
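
Once the blocks are readable, the dump is easy to post-process. Here is a small Python sketch that parses a few of the block lines shown above and prints each block as hex next to its printable ASCII. The regular expression is written against this particular output format and is an assumption about it, not something librfid documents.

```python
import re

# A few lines copied from the librfid-tool output above.
dump = """\
block[  0:00]sec:0x8 data(4):  04 11 00 01
block[  1:01]sec:0x8 data(4):  31 33 30 32
block[  2:02]sec:0x8 data(4):  30 32 39 37
block[  3:03]sec:0x8 data(4):  31 30 00 00
block[ 27:1b]sec:0x8 data(4):  57 5f 4f 4b
"""

BLOCK = re.compile(
    r"block\[\s*(\d+):[0-9a-f]+\]\S*\s+data\(\d+\):\s+((?:[0-9a-f]{2} ?)+)"
)


def parse_blocks(text):
    """Return {block number: bytes} for every block line in the dump."""
    blocks = {}
    for match in BLOCK.finditer(text):
        blocks[int(match.group(1))] = bytes.fromhex(match.group(2))
    return blocks


def printable(data):
    """Show printable ASCII, replace everything else with a dot."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


blocks = parse_blocks(dump)
for number, data in sorted(blocks.items()):
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    print(f"{number:2d}  {hex_bytes}  {printable(data)}")
```

On this tag, blocks 1 to 3 decode to the ASCII digits 1302029710 followed by zero bytes, and block 27 decodes to W_OK.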

This is great news since I don't have to carry the bulky 3M reader and antenna with me to conferences to demonstrate RFID. Since I didn't find librfid the first time I searched for software to drive this reader, I hope that this post will be helpful to someone. If you intend to buy an RFID reader, take a look at OpenPCD instead of this one :-)

I have been developing web applications since 1995, and things have changed a lot since then. Back in the old days, we tried to render well in text-only browsers, and machines were much slower back then. Why is that important? I had an interesting problem today: write a csv file filtering application in an hour.

Basically, I had one hour to make a semi-formatted csv file somehow searchable. It had sub-totals, and you wanted to type some words and search for them within each section (they all have to be in the same section to make the section visible as a result). Since the data is really a textual report from an SQL database, it seemed somehow logical to put it back into a database and write a complex SQL query to return just parts of it. Or store it in full-text search. In 1990 or so...

But, let's take a look in the directory: I have a 600K csv file. I decided to write a simple cgi wrapper which understands sections (and ignores the rest), basically streaming csv converted to an html table back to the client (which provided nice incremental rendering for bigger result sets). I had a simple input-box, submit button, html table application in one hour. And it was used 522 times today, speeding up manual search and retrieve work with papers.
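
The filtering logic is small enough to sketch. The version below is Python rather than the original cgi script, and the report layout is an assumption: I pretend each section ends with a row whose first column is "Total", which is made up for the example. Generators keep it streaming, so rows can be written to the client as soon as a section matches.

```python
import csv
import html
import io


def sections(rows, is_subtotal):
    """Group rows into sections; a sub-total row closes the current section."""
    current = []
    for row in rows:
        current.append(row)
        if is_subtotal(row):
            yield current
            current = []
    # rows after the last sub-total belong to no section and are ignored


def matching_sections(rows, words, is_subtotal):
    """Yield sections in which every search word occurs somewhere."""
    words = [w.lower() for w in words]
    for section in sections(rows, is_subtotal):
        text = " ".join(" ".join(row) for row in section).lower()
        if all(word in text for word in words):
            yield section


def to_html_rows(section):
    """Render one section as escaped HTML table rows."""
    for row in section:
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        yield f"<tr>{cells}</tr>"


# Hypothetical report: the layout and the "Total" marker are assumptions.
report = """\
paper,A4 white,120.50
toner,black,80.00
Total,,200.50
paper,A3 white,60.00
folder,blue,15.25
Total,,75.25
"""

rows = csv.reader(io.StringIO(report))
is_subtotal = lambda row: bool(row) and row[0] == "Total"
hits = list(matching_sections(rows, ["paper", "blue"], is_subtotal))
for section in hits:
    print("\n".join(to_html_rows(section)))
```

Searching for "paper blue" returns only the second section, because the first one contains "paper" but not "blue".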

Let's think about this situation again. There is only one simple rule for agile: scale your solution to your problem, not to your technological preferences. Rest of the day, I spent fighting utf-8 encoding (in 2010 still, sigh!), aligning columns and formatting floating numbers (with few style="color:gray" here and there) and added highlight to lines within section which match search words.

At the end of the day, I added simple summaries at the bottom. Not bad for a 1 hour/1 day application.

As you know by now, I have been playing with Dell's remote consoles in hope that I will be able to connect from my Linux to Dell's RAC reliably. Currently, I have to run Windows XP with Internet Explorer and Java in kvm to have access to my servers, and that's clearly not reliable combination.

DRAC is a PCI card which is presented to the system as VGA and which then transfers screen updates over the network to the client. It also allows virtual media, but in a sense, it's a mix-up of http over ssl and a few proprietary protocols:

  • 443 - https interface
  • 3368 - virtual media (proprietary)
  • 5900 - keyboard and mouse (ssl encrypted)
  • 5901 - video redirection (optionally ssl encrypted)
It's very strange that all documentation calls 5900 video redirection port and 5901 keyboard/mouse redirection when all traces of traffic between client and server clearly show that ports are swapped in implementation.

Did you notice the ssl encrypted keyboard/mouse channel? I first decided to tackle this problem with the well known SSL man in the middle approach. I decided to try the simplest possible approach first, using something like:

apt-get install stunnel

openssl req -new -x509 -days 365 -nodes -out cert.pem -keyout cert.pem

# https mitm
stunnel -p cert.pem -d 443 -r 5443
stunnel -c -d 5443 -r 10.60.0.100:443

# 5900 mitm
stunnel -p cert.pem -d 5900 -r 5999
stunnel -c -d 5999 -r 10.60.0.100:5900
and then recording all traffic using tshark (the command-line version of wireshark):
sudo tshark -w /tmp/drac.pcap -i any 'port 5999 or port 5901 or port 5443'
This allowed me to capture all unencrypted traffic into a single pcap file, which proved very useful for initial protocol analysis using wireshark. In short, you have to do the following:
  1. make https connection to https://drac/cgi-bin/webcgi/winvkvm?state=1 and acquire vKvmSessionId console redirection authentication key
  2. connect to keyboard/mouse port 5900 forcing SSL_cipher_list to supported RC4-MD5 cipher and send vKvmSessionId
  3. connect to video port 5901
Finding a supported cipher for communication between us and the server was a real problem. They are using openssl-0.9.7f and I had to downgrade all the way to Debian woody to make stunnel work. The same problem is visible with the latest firmware update for DRAC, where the ActiveX plugin doesn't offer an old enough configuration in the SSL handshake and doesn't work any more. The Java plugin, on the other hand, provides many more cipher options, so one of them still works. ssldump was very useful for finding such problems.

Fortunately, kost was much more persistent than me, and he found out that adding 'SSL_cipher_list' => 'RC4-MD5' will force the supported cipher. Armed with that new finding, I was able to modify kost's ssl mitm script up to the point where I can see decrypted key-presses, mouse movements and video settings. Heck, I even wrote drac-vkvm.pl, an async client which does the steps outlined above.

All is not well, unfortunately. When sending the authentication request, we need vKvmSessionId which we get from the web server, but the packet which is sent also contains two bytes which change with each session. I haven't been able to figure this part out, and since the same two byte sequence is needed to open the video channel (to see VGA output), I'm stuck.
Bytes don't look like crc16, and source code doesn't provide any hints about secondary 16-bit auth info. It seems that client calculates it somehow, since both connections close when I try to send different values for it.
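
For anyone who wants to repeat the crc16 guess on their own captures, here is one way such a check could be automated in Python. It tries three common CRC-16 variants in both byte orders against a payload. The payload and the two "mystery" bytes below are made-up placeholders, not values from a real DRAC session, and which part of the packet would be the CRC input is itself an assumption.

```python
import binascii


def crc16_arc(data):
    """CRC-16/ARC: polynomial 0x8005 (reflected 0xA001), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


VARIANTS = {
    "CRC-16/XMODEM": lambda d: binascii.crc_hqx(d, 0x0000),
    "CRC-16/CCITT-FALSE": lambda d: binascii.crc_hqx(d, 0xFFFF),
    "CRC-16/ARC": crc16_arc,
}


def matching_variants(payload, observed):
    """Return names of CRC variants whose result equals the two observed
    bytes, trying both byte orders."""
    candidates = {
        int.from_bytes(observed, "big"),
        int.from_bytes(observed, "little"),
    }
    return [name for name, crc in VARIANTS.items() if crc(payload) in candidates]


# Sanity check against the standard test string before trusting the functions.
assert crc16_arc(b"123456789") == 0xBB3D
assert binascii.crc_hqx(b"123456789", 0) == 0x31C3

# Hypothetical capture: a made-up session id and made-up trailing bytes.
session_id = b"0123456789abcdef"
mystery = bytes([0x12, 0x34])
print(matching_variants(session_id, mystery) or "no CRC-16 variant matches")
```

An empty result over several captured sessions would support the observation that the two bytes are not a plain CRC-16 of the session id, although it cannot rule out a less common polynomial or a different input.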

I could write a session recorder, but that isn't terribly useful, because it still forces me to use the Windows+Java setup to access my console. I will collect useful snippets about Dell's RAC protocol on the wiki.

About this Archive

This page is an archive of entries from January 2010 listed from newest to oldest.
