Configuring shibboleth is always somewhat confusing for me, so I decided to write this blog post to document how the shibboleth configuration for dspace is done and debugged.

This information is scattered over the documentation and the dspace-tech mailing list, so hopefully this will be useful to someone, at least to me if I ever need to do this again.

The first step is to install mod-shib for apache:

apt install libapache2-mod-shib

Now there are two files which have to be modified with your information: /etc/shibboleth/shibboleth2.xml, which defines the configuration, and /etc/shibboleth/attribute-map.xml, which defines which information will be passed from shibboleth to the application.
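
Apache also has to be told which paths to hand over to mod_shib. A minimal sketch, assuming the default Shibboleth handler location and DSpace 7's REST login endpoint (the exact path to protect is an assumption and depends on your deployment):

<Location /Shibboleth.sso>
    SetHandler shib
</Location>

# the path below is an assumption; adjust it to whatever your DSpace install uses
<Location /server/api/authn/shibboleth>
    AuthType shibboleth
    ShibRequestSetting requireSession 1
    Require shibboleth
</Location>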

attribute-map.xml

Here we have to define the headers which dspace expects, so it can get information from the upstream identity provider.

diff --git a/shibboleth/attribute-map.xml b/shibboleth/attribute-map.xml
index 1a4a3b0..a8680da 100644
--- a/shibboleth/attribute-map.xml
+++ b/shibboleth/attribute-map.xml
@@ -163,4 +163,34 @@
</Attribute>
-->

+ <!-- In addition to the attribute mapping, DSpace expects the following Shibboleth Headers to be set:
+ - SHIB-NETID
+ - SHIB-MAIL
+ - SHIB-GIVENNAME
+ - SHIB-SURNAME
+ These are set by mapping the respective IdP attribute (left hand side) to the header attribute (right hand side).
+ -->
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.1" id="SHIB-NETID"/>
+ <Attribute name="urn:mace:dir:attribute-def:uid" id="SHIB-NETID"/>
+ <Attribute name="hrEduPersonPersistentID" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-NETID"/>
+
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="SHIB-MAIL"/>
+ <Attribute name="urn:mace:dir:attribute-def:mail" id="SHIB-MAIL"/>
+ <Attribute name="mail" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-MAIL"/>
+
+ <Attribute name="urn:oid:2.5.4.42" id="SHIB-GIVENNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:givenName" id="SHIB-GIVENNAME"/>
+ <Attribute name="givenName" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-GIVENNAME"/>
+
+ <Attribute name="urn:oid:2.5.4.4" id="SHIB-SURNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:sn" id="SHIB-SURNAME"/>
+ <Attribute name="sn" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-SURNAME"/>
+
</Attributes>

shibboleth2.xml

This is the main configuration file for shibd. First we need to add the OutOfProcess and InProcess sections with loggers, so that shibd.log includes information which is very useful when debugging.

diff --git a/shibboleth/shibboleth2.xml b/shibboleth/shibboleth2.xml
index ddfb98a..7b55987 100644
--- a/shibboleth/shibboleth2.xml
+++ b/shibboleth/shibboleth2.xml
@@ -2,15 +2,46 @@
xmlns:conf="urn:mace:shibboleth:3.0:native:sp:config"
clockSkew="180">

- <OutOfProcess tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a" />
+ <!-- The OutOfProcess section contains properties affecting the shibd daemon. -->
+ <OutOfProcess logger="shibd.logger" tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a">
+ <!--
+ <Extensions>
+ <Library path="odbc-store.so" fatal="true"/>
+ </Extensions>
+ -->
+ </OutOfProcess>

+
+ <!--
+ The InProcess section contains settings affecting web server modules.
+ Required for IIS, but can be removed when using other web servers.
+ -->
+ <InProcess logger="native.logger">
+ <ISAPI normalizeRequest="true" safeHeaderNames="true">
+ <!--
+ Maps IIS Instance ID values to the host scheme/name/port. The name is
+ required so that the proper <Host> in the request map above is found without
+ having to cover every possible DNS/IP combination the user might enter.
+ -->
+ <Site id="1" name="sp.example.org"/>
+ <!--
+ When the port and scheme are omitted, the HTTP request's port and scheme are used.
+ If these are wrong because of virtualization, they can be explicitly set here to
+ ensure proper redirect generation.
+ -->
+ <!--
+ <Site id="42" name="virtual.example.org" scheme="https" port="443"/>
+ -->
+ </ISAPI>
+ </InProcess>
+
<!--
By default, in-memory StorageService, ReplayCache, ArtifactMap, and SessionCache
are used. See example-shibboleth2.xml for samples of explicitly configuring them.
-->

<!-- The ApplicationDefaults element is where most of Shibboleth's SAML bits are defined. -->
- <ApplicationDefaults entityID="https://sp.example.org/shibboleth"
+ <ApplicationDefaults entityID="https://repository.clarin.hr/Shibboleth.sso/Metadata"
REMOTE_USER="eppn subject-id pairwise-id persistent-id"
cipherSuites="DEFAULT:!EXP:!LOW:!aNULL:!eNULL:!DES:!IDEA:!SEED:!RC4:!3DES:!kRSA:!SSLv2:!SSLv3:!TLSv1:!TLSv1.1">

@@ -31,8 +62,8 @@
entityID property and adjust discoveryURL to point to discovery service.
You can also override entityID on /Login query string, or in RequestMap/htaccess.
-->
- <SSO entityID="https://idp.example.org/idp/shibboleth"
- discoveryProtocol="SAMLDS" discoveryURL="https://ds.example.org/DS/WAYF">
+ <SSO
+ discoveryProtocol="SAMLDS" discoveryURL="https://discovery.clarin.eu">
SAML2
</SSO>

@@ -68,6 +99,10 @@
<!--
<MetadataProvider type="XML" validate="true" path="partner-metadata.xml"/>
-->
+ <MetadataProvider type="XML" url="https://login.aaiedu.hr/shib/saml2/idp/metadata.php" backingFilePath="aaieduhr-metadata.xml" maxRefreshDelay="3600" />

<!-- Example of remotely supplied batch of signed metadata. -->
<!--

certificates

To make the upstream identity provider connect to us, we need a valid certificate, so we need /etc/shibboleth/sp-encrypt-cert.pem and /etc/shibboleth/sp-encrypt-key.pem.
Since we are using Let's Encrypt for certificates, I'm using a shell script to copy them over and change permissions so shibd will accept them.

dpavlin@repository:/etc/shibboleth$ cat update-certs.sh
#!/bin/sh -xe

cp -v /etc/letsencrypt/live/repository.clarin.hr/privkey.pem sp-encrypt-key.pem
cp -v /etc/letsencrypt/live/repository.clarin.hr/cert.pem sp-encrypt-cert.pem
chown -v _shibd:_shibd sp-*.pem
/etc/init.d/shibd restart

dspace configuration

We are using the upstream dspace docker image which pulls in local.cfg from outside using a docker bind mount, but it also re-defines the shibboleth headers, so we need to restore them to the names defined above in /etc/shibboleth/attribute-map.xml.

diff --git a/docker/local.cfg b/docker/local.cfg
index 168ab0dd42..6f71be32cf 100644
--- a/docker/local.cfg
+++ b/docker/local.cfg
@@ -14,3 +14,22 @@
# test with: /dspace/bin/dspace dsprop -p rest.cors.allowed-origins

handle.prefix = 20.500.14615
+
+shibboleth.discofeed.url = https://repository.clarin.hr/Shibboleth.sso/DiscoFeed
+
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.ShibAuthentication
+
+# in sync with definitions from /etc/shibboleth/attribute-map.xml
+authentication-shibboleth.netid-header = SHIB-NETID
+authentication-shibboleth.email-header = SHIB-MAIL
+authentication-shibboleth.firstname-header = SHIB-GIVENNAME
+authentication-shibboleth.lastname-header = SHIB-SURNAME
+# Should we allow new users to be registered automatically?
+authentication-shibboleth.autoregister = true

example of working login

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:18:40 INFO XMLTooling.StorageService : purged 2 expired record(s) from storage

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.AuthnRequest|||https://login.aaiedu.hr/shib/saml2/idp/metadata.php||||||urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||||||

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:27:42 INFO Shibboleth.SessionCache [1] [default]: new session created: ID (_e37d7f0b8ea3ff4d718b3e2c68d81e45) IdP (https://login.aaiedu.hr/shib/saml2/idp/metadata.php) Protocol(urn:oasis:names:tc:SAML:2.0:protocol) Address (141.138.31.16)

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.Login|https://login.aaiedu.hr/shib/saml2/idp/metadata.php!https://repository.clarin.hr/Shibboleth.sso/Metadata!303a3b0f72c5e29bcbdf35cab3826e62|_e37d7f0b8ea3ff4d718b3e2c68d81e45|https://login.aaiedu.hr/shib/saml2/idp/metadata.php|_3b8a8dc87db05b5e6abf8aaf9d5c67e6ebc62a2eed|urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport|2024-11-18T17:01:57|SHIB-GIVENNAME(1),SHIB-MAIL(2),SHIB-NETID(1),SHIB-SURNAME(1),persistent-id(1)|303a3b0f72c5e29bcbdf35cab3826e62|urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||urn:oasis:names:tc:SAML:2.0:status:Success|||Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0|141.138.31.16

broken utf-8 in shibboleth user first name or surname

You might think that now everything works, but dspace will try to re-encode utf-8 characters received from upstream shibboleth using iso-8859-1, breaking accented characters in the process. There is a configuration option for this, but by default it's set to false.

root@4880a3097115:/usr/local/tomcat/bin# /dspace/bin/dspace dsprop -p authentication-shibboleth.reconvert.attributes
false

So we need to modify dspace-angular/docker/local.cfg and turn it on:

# incoming utf-8 from shibboleth AAI
authentication-shibboleth.reconvert.attributes=true

This is a story about our mail server which is coming close to its disk space capacity:

root@mudrac:/home/prof/dpavlin# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        20G  7.7G   11G  42% /
/dev/vdb        4.0T  3.9T   74G  99% /home
/dev/vdc        591G  502G   89G  85% /home/stud

You might say that it's easy to resize the disk and provide more storage, but unfortunately it's not so easy. We are using ganeti for our virtualization platform, and the current version of ganeti has a limit of 4T for a single drbd disk.

This can be solved by increasing the third (vdc) disk and moving some users to it, but this is not ideal. Another possibility is to use dovecot's zlib plugin to compress mails. However, since our Maildir filenames don't have the required S=12345 part which describes the size of each mail, this solution also wasn't applicable to us.
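
For reference, the dovecot side of that rejected approach is only a couple of settings; a sketch for dovecot 2.x (not used here precisely because the S=<size> part shown below is what our Maildir filenames lack):

# in dovecot.conf / conf.d: compress newly saved mails (existing mails stay as-is)
mail_plugins = $mail_plugins zlib
plugin {
  zlib_save = gz
  zlib_save_level = 6
}
# example Maildir filename with the message size recorded (S=12345), which ours lack:
# 1623456789.M20046P2137.mudrac,S=12345,W=12421:2,S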

Installing lvm inside the instance would allow us to use more than one disk to provide additional storage, but since ganeti already uses lvm to provide virtual disks to instances, this also isn't ideal.

OpenZFS comes to the rescue

Another solution is to use OpenZFS to pool multiple disks into a single filesystem, and at the same time provide disk compression. Let's create a pool:

zpool create -o ashift=9 mudrac /dev/vdb
zfs create mudrac/mudrac
zfs set compression=zstd-6 mudrac
zfs set atime=off mudrac
We are using an ashift of 9 instead of 12 since that uses 512-byte blocks on storage (which our SSD storage supports), and it saves quite a bit of space:
root@t1:~# df | grep mudrac
Filesystem      1K-blocks       Used Available Use% Mounted on
mudrac/mudrac  3104245632 3062591616  41654016  99% /mudrac/mudrac # ashift=12
m2/mudrac      3104303872 2917941376 186362496  94% /m2/mudrac     # ashift=9
This is a saving of 137Gb just by choosing the smaller ashift.

Most of our e-mail consists of messages kept on the server but rarely accessed. Because of that, I opted for zstd-6 (instead of the default zstd-3) to compress them as much as possible. But, to be sure it's the right choice, I also tested zstd-12 and zstd-19; the results are below:

LEVEL     USED            COMP   H:S
zstd-6    2987971933184   60%    11:2400
zstd-12   2980591115776   59%    15:600
zstd-19   2972514841600   59%    52:600
Compression levels higher than 6 seem to need at least 6 cores to compress data, so zstd-6 seemed like the best performance/space tradeoff, especially if we take into account the additional time needed for compression to finish.
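
The used space and effective compression for a dataset can also be read directly from zfs (compressratio is reported as a ratio rather than a percentage):

zfs get used,compressratio mudrac/mudrac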

bullseye kernel for zfs and systemd-nspawn

To have zfs we need a recent kernel. Instead of upgrading the whole server to bullseye at this moment, I decided to boot bullseye with zfs and start the unmodified installation using systemd-nspawn. This is easy using the following command line:

systemd-nspawn --directory /mudrac/mudrac/ --boot --machine mudrac --network-interface=eth1010 --hostname mudrac
but it's not ideal for automatic startup of the machine, so a better solution is to use machinectl and a systemd service for this. Converting this command line into an .nspawn file is non-trivial, but after reading man systemd.nspawn the needed configuration is:
root@t1:~# cat /etc/systemd/nspawn/mudrac.nspawn
[Exec]
Boot=on
#WorkingDirectory=/mudrac/mudrac
# ln -s /mudrac/mudrac /var/lib/machines/
# don't chown files
PrivateUsers=false

[Network]
Interface=eth1010
Please note that we are not using WorkingDirectory (which would copy files from /var/lib/machines/name) but instead just created a symlink to the zfs filesystem in /var/lib/machines/.

To enable and start container on boot, we can use:

systemctl enable systemd-nspawn@mudrac
systemctl start systemd-nspawn@mudrac

Keep network device linked to mac address

The predictable network device names which bullseye uses should provide stable interface names. This seems like a clean solution, but in testing I figured out that adding additional disks changes the names of network devices. Previously, Debian used udev to map network interface names to device MAC addresses using /etc/udev/rules.d/70-persistent-net.rules. Since this is no longer the case, the solution is to define a similar mapping using a systemd .link file like this:

root@t1:~# cat /etc/systemd/network/11-eth1010.link
[Match]
MACAddress=aa:00:00:39:90:0f

[Link]
Name=eth1010

Increasing disk space

When we do run out of disk space again, we can add a new disk to the zfs pool using:

root@t2:~# zpool set autoexpand=on mudrac
root@t2:~# zpool add mudrac /dev/vdc
Thanks to autoexpand=on above, this will automatically make the new space available. However, if we instead grow an existing disk up to 4T, the new space isn't visible immediately since zfs has a partition table on the disk, so we need to expand the device to use all available space using:
root@t2:~# zpool online -e mudrac vdb

zfs snapshots for backup

Now that we have zfs under our mail server, it's logical to also use zfs snapshots to provide nice, low overhead incremental backups. It's as easy as:

zfs snap mudrac/mudrac@$( date +%Y-%m-%d )
in cron.daily and then shipping the snapshots to the backup machine. I did look into existing zfs snapshot solutions, but they all seemed a little bit too complicated for my use case, so I wrote zfs-snap-to-dr.pl which copies snapshots to the backup site.
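
Shipping a snapshot is essentially an incremental zfs send piped over ssh; a sketch of what zfs-snap-to-dr.pl automates (the backup host name and destination dataset are assumptions):

# send the delta between yesterday's and today's snapshot to the backup pool
zfs send -i mudrac/mudrac@2021-06-01 mudrac/mudrac@2021-06-02 | \
    ssh backup.example.com zfs receive -F backup/mudrac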

To keep just the last two snapshots on the mail server, a simple shell snippet is enough:

zfs list -r -t snapshot -o name -H mudrac/mudrac > /dev/shm/zfs.all
tail -2 /dev/shm/zfs.all > /dev/shm/zfs.tail-2
grep -v -f /dev/shm/zfs.tail-2 /dev/shm/zfs.all | xargs -i zfs destroy {}
Using the shell to create and expire snapshots and a simpler script to just transfer them seems to me like a better and more flexible solution than implementing it all in a single perl script. In a sense, it's the unix way of small tools which do one thing well. The only feature which zfs-snap-to-dr.pl has aside from snapshot transfer is the ability to keep just a configurable number of snapshots on the destination, which keeps disk usage in check (and re-uses already collected data about snapshots).

This was an interesting journey. In the future, we will migrate the mail server to bullseye and remove systemd-nspawn (it feels like we are twisting its arm by using it like this). But it does work, and it's a simple solution which will come in handy in the future.

My point of view

First, let me explain my position. I worked for quite a few years in a big corporation, and followed EMC storage systems (one from the end of the last century, and the improvements that Clarion brought to our production SAP deployment). I even visited the EMC factory in Cork, Ireland, and it was a very eye-opening experience. They claim that 95% of customers who visit the factory buy EMC storage, and I believe them (we did upgrade to Clarion, btw).

In my Linux based deployments on HP, Compaq and IBM hardware I did various crazy RAID configurations (RAID5 across disks on one controller and then a stripe across the other controller, for example). Those were the easy parts: you got a RAID controller with DRAM cache (~256Mb) and some kind of battery backup which greatly improved write performance.

Later on, in CARNet we had HP EVA storage which proved quite flaky. I heard from a friend at one enterprise deployment that they use them only for testing. And you know, it's just a shelf of disks with redundant controllers and a fiber interface...

In the meantime, on the Linux software RAID front, I used the md implementation of RAID1 and RAID5 back in the days when Linux distributions couldn't handle that.

Solid state drives

However, solid state drives changed a lot of that. I still haven't had the pleasure of using Intel SSDs, which are supposed to be good, but USB sticks are also flash storage, although with quirky characteristics.

This particular one is reported by Linux as ID 0951:1603 Kingston Technology Data Traveler 1GB/2GB Pen Drive, but is in fact an 8Gb model which seems to have 128Mb of memory which is writable at about 6Mb/s; after that, write speed drops to 45K/s.
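
A rough way to see such a write-speed cliff for yourself is to write more data than the fast part can absorb and let dd report the average rate, which collapses once the fast 128Mb is exhausted (the mount point and size are assumptions):

# conv=fsync makes dd include the final flush in the reported rate
dd if=/dev/zero of=/media/usb/testfile bs=1M count=256 conv=fsync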

On the other hand, there is the ZFS on FUSE project which enables some really interesting applications of Sun's (and now Oracle's) file-system. I do have to mention Sun at this point. Ever since I heard about Oracle's acquisition of Sun, I have wondered what will happen with ZFS. I might even suspect that ZFS is the main reason why Oracle bought Sun. Let me explain...

Sunshine Oracle

If you look at the database market (where Oracle is), the only interesting way to improve relational databases is to make them extremely fast. And that revolution is already here. Don MacAskill from SmugMug makes a compelling case about the performance of SSD storage. If you don't believe words, watch this video from 24:50 to see the solution to the MySQL storage performance problem: hardware! Sun's hardware. Do you think that Oracle didn't notice that?

Enterprise storage cheaply

Did you watch the video? I really don't agree that it's the hardware. Come on! It's Opteron boxes with custom-built SSD disks optimized for write speed. SSDs with super-capacitors instead of the batteries of an old RAID controller.

But, to make it really fun, I will try to re-create at least some of those capabilities using commodity hardware in my university environment. I have Dell OptiPlex boxes which come loaded with a lot of goodies to put together a commodity storage cluster:

  • Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz
  • 3Gb RAM
  • 2 SATA disks with ~80Mb/s of read/write performance
  • multi-card reader and 8 USB slots
  • fake software RAID on Intel chipset (supported by dmraid, but even its documentation suggests not to use it)

ZFS

Why ZFS? Isn't btrfs the way to go? For this particular application, I don't think so. Let me list the features of ZFS which excite me (a short command sketch follows the list):

  • ability to store log to separate (mirror) device (SSD, USB sticks if that helps)
  • scrub: read all bytes on disk and rewrite them (beats smartctl -t long because it also re-allocates bad blocks; I've seen 80Mb/s scrub)
  • balancing of IO over devices (I will use this over nbd to split mirror between machines for fail-over)
  • arbitrary number of copies (nice for bigger clusters of storage machines)
  • nice snapshots which display their size and can be cloned into writable ones
  • snapshot send/receive to make off-site backup copies
  • L2ARC - balance read and write cache over SSD devices with different characteristics (USB sticks have fast read and slow write, so they might be good fit)
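
A sketch of what some of these look like in practice, on an imaginary pool named tank (pool, dataset and device names are assumptions):

zpool add tank log mirror /dev/sdc /dev/sdd     # separate mirrored log device
zpool add tank cache /dev/sde                   # L2ARC read cache
zfs set copies=2 tank/backup                    # keep two copies of every block
zpool scrub tank                                # read and verify everything on disk
zfs snapshot tank/backup@today                  # cheap snapshot
zfs clone tank/backup@today tank/scratch        # writable clone of the snapshot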

You might think of it as git with POSIX file-system semantics.

But, it's in user space, you say, it must be slow! It isn't. Really. Linux user-space is much faster than disk speed, and having a separate process is nice for monitoring purposes. File-system overhead gets counted as user time, not system time, so system time is a clear indicator of driver (hardware) activity rather than file-system overhead.
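
One way to watch that split is to monitor the file-system daemon itself, assuming the sysstat package is installed and the daemon process is called zfs-fuse:

# %usr is file-system overhead, %system is mostly real I/O work
pidstat -u -p "$(pidof zfs-fuse)" 1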

I have most parts of this setup ready, and I'm using it to back up OpenVZ containers. So, I'm running an OpenVZ kernel and I can even create virtual machines from backup snapshots to recover to some point in time. After I finish this setup, expect a detailed guide (it will probably be part of my upcoming virtualization workshop as an alternative to LVM).

I'm preparing walk-through screencasts for a workshop about virtualization, so I needed an easy way to produce console screencasts.

First, I found TTYShare which displays ttyrec files using flash, but I really wanted to copy/paste parts of commands and disliked the flash plugin requirement.

It seems that I wasn't the only one who wanted a JavaScript client for ttyrec. jsttplay can produce screencasts like this one about OpenVZ using plain copy/paste-friendly JavaScript within the browser.

But, for some parts (like Debian installation under VirtualBox) I really needed movie capture. The x11grab option of ffmpeg seemed easy enough to use:

ffmpeg -f x11grab -s 640x480 -r 10 -i :0.0 /tmp/screencast.avi
but getting the right window size isn't something I want to do by hand. So, I wrote a small perl script, record-screencast.pl, which extracts the window position and size using xwininfo and passes them to ffmpeg so I don't have to.
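
The idea behind it, sketched in shell (the window name is an assumption; the real script is perl):

# grab geometry of the window we want to record
eval $( xwininfo -name "VirtualBox" | awk '
    /Absolute upper-left X/ { print "X=" $NF }
    /Absolute upper-left Y/ { print "Y=" $NF }
    /Width:/  { print "W=" $NF }
    /Height:/ { print "H=" $NF }' )
# record just that window
ffmpeg -f x11grab -s ${W}x${H} -r 10 -i :0.0+${X},${Y} /tmp/screencast.avi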

vz-tools

For quite some time, I have been a happy user of OpenVZ virtualization. And for almost the same amount of time, I have been writing vz-tools, a small set of utilities which make working with OpenVZ a joy:

  • vz-optimize.pl was written first, and it tries to intelligently increase beancounter limits for a VE until the virtual machine has the resources it requires.
    Best usage would be to run some kind of stress-test on your VE, and then re-run vz-optimize.pl afterwards to allocate enough resources for the VE.
  • vz-create.pl is the second useful tool; it will create a new VE from an upstream Debian mirror and tweak it a bit (the way I like it)
  • vz-clone.pl is the latest addition to the set of tools; it allows you to create a temporary clone of a virtual machine for testing (of upgrades?) or QA

While those tools work well for me, they don't have much error checking, and most of the parameters in them are actually hard-coded. Since I am giving a presentation about virtualization on Linux at HrOUG this year, I will probably improve these tools a bit more.

However, I'm not quite sure if these tools would be of general use to somebody else. If they are, please drop me a note, and I would love to improve them a bit more...