We live in troubling times, as web crawlers have become so prevalent in internet traffic that they can cause denial-of-service attacks on Koha instances.

The simplest way to prevent this is the following Apache rule:

    <LocationMatch "^/cgi-bin/koha/(opac-search\.pl|opac-shelves\.pl|opac-export\.pl|opac-reserve\.pl)$">
        # Block requests without a referer header
        RewriteEngine On
        RewriteCond %{HTTP_REFERER} ^$
        RewriteRule .* - [F,L]

        # Optional: custom body for the 403 Forbidden response that [F] returns
        ErrorDocument 403 "Access Forbidden: Direct access to this resource is not allowed."
    </LocationMatch>

This helps to mitigate problems like the one shown in this graph: [image: apache_processes-week.png]
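
To see which requests the rule above rejects, its logic can be mirrored in a few lines of Python. This is only a sketch for reasoning about the rule, not something Apache runs; the function name is made up for illustration, and the path is assumed to be the URL path without the query string.

```python
import re

# Same pattern as in the LocationMatch directive.
PROTECTED = re.compile(
    r"^/cgi-bin/koha/(opac-search\.pl|opac-shelves\.pl|opac-export\.pl|opac-reserve\.pl)$"
)


def is_blocked(path, referer=""):
    """Return True if the rule would answer 403 for this request.

    `referer` is the value of the Referer header, "" when it is missing
    (that is what the RewriteCond pattern ^$ matches).
    """
    return bool(PROTECTED.match(path)) and referer == ""
```

Only the four listed scripts are affected, and only when the Referer header is empty. A visitor who opens a bookmarked search link sends no Referer either, so that request is rejected in the same way as a crawler's.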

Configuring shibboleth always confuses me somewhat, so I decided to write this blog post to document how the configuration for dspace is done and debugged.

This information is scattered across the documentation and the dspace-tech mailing list, so hopefully this will be useful to someone, at least to me if I ever need to do this again.

The first step is to install mod-shib for apache:

    apt install libapache2-mod-shib

Now there are two files which have to be modified with your information: /etc/shibboleth/shibboleth2.xml, which defines the configuration, and /etc/shibboleth/attribute-map.xml, which defines which information will be passed from shibboleth to the application.

attribute-map.xml

Here we have to define the headers which dspace expects, so it can get information from the upstream identity provider.

diff --git a/shibboleth/attribute-map.xml b/shibboleth/attribute-map.xml
index 1a4a3b0..a8680da 100644
--- a/shibboleth/attribute-map.xml
+++ b/shibboleth/attribute-map.xml
@@ -163,4 +163,34 @@
</Attribute>
-->

+ <!-- In addition to the attribute mapping, DSpace expects the following Shibboleth Headers to be set:
+ - SHIB-NETID
+ - SHIB-MAIL
+ - SHIB-GIVENNAME
+ - SHIB-SURNAME
+ These are set by mapping the respective IdP attribute (left hand side) to the header attribute (right hand side).
+ -->
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.1" id="SHIB-NETID"/>
+ <Attribute name="urn:mace:dir:attribute-def:uid" id="SHIB-NETID"/>
+ <Attribute name="hrEduPersonPersistentID" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-NETID"/>
+
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="SHIB-MAIL"/>
+ <Attribute name="urn:mace:dir:attribute-def:mail" id="SHIB-MAIL"/>
+ <Attribute name="mail" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-MAIL"/>
+
+ <Attribute name="urn:oid:2.5.4.42" id="SHIB-GIVENNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:givenName" id="SHIB-GIVENNAME"/>
+ <Attribute name="givenName" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-GIVENNAME"/>
+
+ <Attribute name="urn:oid:2.5.4.4" id="SHIB-SURNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:sn" id="SHIB-SURNAME"/>
+ <Attribute name="sn" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-SURNAME"/>
+
</Attributes>
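
A quick way to check that all four headers are really mapped is to parse the file and list what it defines. The following is a minimal sketch using only the Python standard library; the function names, the shortened sample document and its namespace are assumptions for illustration, and on a real system you would read /etc/shibboleth/attribute-map.xml instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened stand-in for attribute-map.xml (namespace assumed for the example).
SAMPLE = """\
<Attributes xmlns="urn:mace:shibboleth:2.0:attribute-map">
    <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="SHIB-MAIL"/>
    <Attribute name="urn:mace:dir:attribute-def:mail" id="SHIB-MAIL"/>
    <Attribute name="urn:oid:2.5.4.42" id="SHIB-GIVENNAME"/>
</Attributes>
"""

REQUIRED = ("SHIB-NETID", "SHIB-MAIL", "SHIB-GIVENNAME", "SHIB-SURNAME")


def mapped_headers(xml_text):
    """Map each header id to the list of IdP attribute names feeding it."""
    headers = defaultdict(list)
    for element in ET.fromstring(xml_text).iter():
        # Tags look like "{namespace}Attribute"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "Attribute":
            headers[element.get("id")].append(element.get("name"))
    return dict(headers)


def missing_headers(xml_text, required=REQUIRED):
    """Return the required header ids that no <Attribute> maps to."""
    found = mapped_headers(xml_text)
    return [header for header in required if header not in found]
```

Commented-out `<Attribute>` elements are ignored by the parser, so a mapping that is still inside `<!-- ... -->` shows up as missing.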

shibboleth2.xml

This is the main configuration file for shibd. First we need to expand OutOfProcess and add InProcess so that shibd.log includes more useful information.

diff --git a/shibboleth/shibboleth2.xml b/shibboleth/shibboleth2.xml
index ddfb98a..7b55987 100644
--- a/shibboleth/shibboleth2.xml
+++ b/shibboleth/shibboleth2.xml
@@ -2,15 +2,46 @@
xmlns:conf="urn:mace:shibboleth:3.0:native:sp:config"
clockSkew="180">

- <OutOfProcess tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a" />
+ <!-- The OutOfProcess section contains properties affecting the shibd daemon. -->
+ <OutOfProcess logger="shibd.logger" tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a">
+ <!--
+ <Extensions>
+ <Library path="odbc-store.so" fatal="true"/>
+ </Extensions>
+ -->
+ </OutOfProcess>

+
+ <!--
+ The InProcess section contains settings affecting web server modules.
+ Required for IIS, but can be removed when using other web servers.
+ -->
+ <InProcess logger="native.logger">
+ <ISAPI normalizeRequest="true" safeHeaderNames="true">
+ <!--
+ Maps IIS Instance ID values to the host scheme/name/port. The name is
+ required so that the proper <Host> in the request map above is found without
+ having to cover every possible DNS/IP combination the user might enter.
+ -->
+ <Site id="1" name="sp.example.org"/>
+ <!--
+ When the port and scheme are omitted, the HTTP request's port and scheme are used.
+ If these are wrong because of virtualization, they can be explicitly set here to
+ ensure proper redirect generation.
+ -->
+ <!--
+ <Site id="42" name="virtual.example.org" scheme="https" port="443"/>
+ -->
+ </ISAPI>
+ </InProcess>
+
<!--
By default, in-memory StorageService, ReplayCache, ArtifactMap, and SessionCache
are used. See example-shibboleth2.xml for samples of explicitly configuring them.
-->

<!-- The ApplicationDefaults element is where most of Shibboleth's SAML bits are defined. -->
- <ApplicationDefaults entityID="https://sp.example.org/shibboleth"
+ <ApplicationDefaults entityID="https://repository.clarin.hr/Shibboleth.sso/Metadata"
REMOTE_USER="eppn subject-id pairwise-id persistent-id"
cipherSuites="DEFAULT:!EXP:!LOW:!aNULL:!eNULL:!DES:!IDEA:!SEED:!RC4:!3DES:!kRSA:!SSLv2:!SSLv3:!TLSv1:!TLSv1.1">

@@ -31,8 +62,8 @@
entityID property and adjust discoveryURL to point to discovery service.
You can also override entityID on /Login query string, or in RequestMap/htaccess.
-->
- <SSO entityID="https://idp.example.org/idp/shibboleth"
- discoveryProtocol="SAMLDS" discoveryURL="https://ds.example.org/DS/WAYF">
+ <SSO
+ discoveryProtocol="SAMLDS" discoveryURL="https://discovery.clarin.eu">
SAML2
</SSO>

@@ -68,6 +99,10 @@
<!--
<MetadataProvider type="XML" validate="true" path="partner-metadata.xml"/>
-->
+ <MetadataProvider type="XML" url="https://login.aaiedu.hr/shib/saml2/idp/metadata.php" backingFilePath="aaieduhr-metadata.xml" maxRefreshDelay="3600" />

<!-- Example of remotely supplied batch of signed metadata. -->
<!--

certificates

For the upstream identity provider to connect to us, we need a valid certificate, so we need /etc/shibboleth/sp-encrypt-cert.pem and /etc/shibboleth/sp-encrypt-key.pem.
Since we are using Let's Encrypt for certificates, I'm using a shell script to copy them over and change ownership so that shibd will accept them.

dpavlin@repository:/etc/shibboleth$ cat update-certs.sh
#!/bin/sh -xe

cp -v /etc/letsencrypt/live/repository.clarin.hr/privkey.pem sp-encrypt-key.pem
cp -v /etc/letsencrypt/live/repository.clarin.hr/cert.pem sp-encrypt-cert.pem
chown -v _shibd:_shibd sp-*.pem
/etc/init.d/shibd restart

dspace configuration

We are using the upstream dspace docker image, which includes local.cfg from outside using a docker bind mount, but which also re-defines the shibboleth headers, so we need to restore them to the default names defined earlier in /etc/shibboleth/attribute-map.xml.

diff --git a/docker/local.cfg b/docker/local.cfg
index 168ab0dd42..6f71be32cf 100644
--- a/docker/local.cfg
+++ b/docker/local.cfg
@@ -14,3 +14,22 @@
# test with: /dspace/bin/dspace dsprop -p rest.cors.allowed-origins

handle.prefix = 20.500.14615
+
+shibboleth.discofeed.url = https://repository.clarin.hr/Shibboleth.sso/DiscoFeed
+
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.ShibAuthentication
+
+# in sync with definitions from /etc/shibboleth/attribute-map.xml
+authentication-shibboleth.netid-header = SHIB-NETID
+authentication-shibboleth.email-header = SHIB-MAIL
+authentication-shibboleth.firstname-header = SHIB-GIVENNAME
+authentication-shibboleth.lastname-header = SHIB-SURNAME
+# Should we allow new users to be registered automatically?
+authentication-shibboleth.autoregister = true

example of working login

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:18:40 INFO XMLTooling.StorageService : purged 2 expired record(s) from storage

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.AuthnRequest|||https://login.aaiedu.hr/shib/saml2/idp/metadata.php||||||urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||||||

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:27:42 INFO Shibboleth.SessionCache [1] [default]: new session created: ID (_e37d7f0b8ea3ff4d718b3e2c68d81e45) IdP (https://login.aaiedu.hr/shib/saml2/idp/metadata.php) Protocol(urn:oasis:names:tc:SAML:2.0:protocol) Address (141.138.31.16)

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.Login|https://login.aaiedu.hr/shib/saml2/idp/metadata.php!https://repository.clarin.hr/Shibboleth.sso/Metadata!303a3b0f72c5e29bcbdf35cab3826e62|_e37d7f0b8ea3ff4d718b3e2c68d81e45|https://login.aaiedu.hr/shib/saml2/idp/metadata.php|_3b8a8dc87db05b5e6abf8aaf9d5c67e6ebc62a2eed|urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport|2024-11-18T17:01:57|SHIB-GIVENNAME(1),SHIB-MAIL(2),SHIB-NETID(1),SHIB-SURNAME(1),persistent-id(1)|303a3b0f72c5e29bcbdf35cab3826e62|urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||urn:oasis:names:tc:SAML:2.0:status:Success|||Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0|141.138.31.16
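
The transaction.log lines are easier to read once they are split back into the fields named in tranLogFormat. The sketch below assumes, based on the lines above, that every entry starts with a timestamp and an event name followed by the format fields in order; the function names and the shortened sample line are made up for illustration.

```python
# The tranLogFormat value from shibboleth2.xml.
TRAN_LOG_FORMAT = "%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a"

# Shortened, anonymised stand-in for a Login line.
SAMPLE = (
    "2024-11-18 17:27:42|Shibboleth-TRANSACTION.Login|principal|_session|"
    "https://idp.example.org/idp|_assertion|"
    "urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport|"
    "2024-11-18T17:01:57|"
    "SHIB-GIVENNAME(1),SHIB-MAIL(2),SHIB-NETID(1),SHIB-SURNAME(1),persistent-id(1)|"
    "nameid|urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||"
    "urn:oasis:names:tc:SAML:2.0:status:Success|||Mozilla/5.0|192.0.2.1"
)


def parse_transaction_line(line, fmt=TRAN_LOG_FORMAT):
    """Split one log line into a dict keyed by the format codes (without %)."""
    keys = [key.lstrip("%") for key in fmt.split("|")]
    timestamp, event, *values = line.rstrip("\n").split("|")
    record = dict(zip(keys, values))
    record["timestamp"] = timestamp
    record["event"] = event
    return record


def released_attributes(record):
    """Turn 'SHIB-MAIL(2),SHIB-NETID(1)' into {'SHIB-MAIL': 2, 'SHIB-NETID': 1}."""
    counts = {}
    for item in filter(None, record.get("attr", "").split(",")):
        name, _, count = item.rpartition("(")
        counts[name] = int(count.rstrip(")"))
    return counts
```

If one of the SHIB-* names is absent from the attr field of a Login line, the IdP did not release that attribute or attribute-map.xml does not map it, and dspace will not see the corresponding header.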

broken utf-8 in shibboleth user first name or surname

You might think that everything works now, but utf-8 characters received from upstream shibboleth end up decoded as iso-8859-1, which breaks accented characters. dspace has a configuration option to re-encode them, but it is false by default.

root@4880a3097115:/usr/local/tomcat/bin# /dspace/bin/dspace dsprop -p authentication-shibboleth.reconvert.attributes
false

So we need to modify dspace-angular/docker/local.cfg and turn it on:

# incoming utf-8 from shibboleth AAI
authentication-shibboleth.reconvert.attributes=true
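
The kind of damage involved is easy to reproduce. The sketch below only illustrates the general utf-8 versus iso-8859-1 mix-up and how re-encoding reverses it; it is not dspace's code, and the function names and the sample name are made up for the example.

```python
def misread_as_latin1(text):
    """Interpret the utf-8 bytes of `text` as iso-8859-1 (the broken form)."""
    return text.encode("utf-8").decode("iso-8859-1")


def reconvert(text):
    """Undo the damage: back to the original bytes, then decode as utf-8."""
    return text.encode("iso-8859-1").decode("utf-8")


name = "Željko"
broken = misread_as_latin1(name)  # the two-byte "Ž" becomes two characters

assert broken != name
assert len(broken) == len(name) + 1
assert reconvert(broken) == name
```

The round trip is lossless because iso-8859-1 assigns a character to every possible byte value, which is why the conversion can be reversed after the fact.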

Let's assume that you inherited a WordPress installation (or three) with tens of instances (or hundreds in this case) which are generating spam through comments. I will try to describe the problem here and suggest a solution which doesn't require clicking around in WordPress but instead uses wp cli, which is faster and easier, especially if you don't have an administrative account on all those WordPress instances. Interested? Read on.

WordPress comment spam

If you try googling around how to prevent WordPress comment spam, you will soon arrive at two solutions:

  • changing default_comment_status to closed which will apply to all new posts
  • changing comment_status on all existing posts to closed

However, this is not a full solution, since media in WordPress can also have comments enabled, and the two steps above won't stop spam on media. There are plugins to disable media comments, but since we have many WordPress instances I wanted to find a solution which doesn't require modifying each of them. And there is a simple solution: the close_comments_for_old_posts option, which basically does the same thing after close_comments_days_old days (14 by default).

So, in summary, all this can easily be done using following commands in wp cli:

wp post list --post_status=publish --post_type=post --comment_status=open --format=ids \
        | xargs -d ' ' -I % wp post update % --comment_status=closed

wp option update default_comment_status closed

wp option update close_comments_for_old_posts 1
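
With many instances, the two option updates can be generated per install instead of typed by hand. This is a minimal sketch which assumes each instance lives in its own directory and is selected with wp cli's global --path parameter; the function names and the directory used in the example are hypothetical.

```python
import shlex


def wp_commands(path):
    """Build the two option updates for the WordPress install at `path`."""
    base = ["wp", f"--path={path}"]
    return [
        base + ["option", "update", "default_comment_status", "closed"],
        base + ["option", "update", "close_comments_for_old_posts", "1"],
    ]


def dry_run(paths):
    """Return the shell lines that would be executed, one per command."""
    return [shlex.join(command) for path in paths for command in wp_commands(path)]
```

Printing the result of `dry_run(["/var/www/site1"])` shows the commands before anything is changed; to actually apply them, each list from `wp_commands()` can be passed to `subprocess.run(command, check=True)`.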

If wp cli doesn't work for you (for example if your WordPress instance is so old that wp cli is returning errors for some plugins instead of working) you can achieve same thing using SQL (this assumes that wp db query is working, but if it doesn't you can always connect using mysql and login and password from wp-config.php):

cat << __SQL__ | wp db query
update wp_posts set comment_status='closed' where comment_status != 'closed' ;
update wp_options set option_value = 'closed' where option_name = 'default_comment_status' and option_value != 'closed' ;
update wp_options set option_value = 1 where option_name = 'close_comments_for_old_posts' and option_value != 1
__SQL__

This is also the faster option, because all SQL queries are invoked using a single wp db query call (and thus a single php startup, which can take some time).

Cleaning up held or spam comments

After you have stopped new comment spam, you will be left with some number of comments which are marked as spam or left in held status if your WordPress admins didn't do anything about them. To clean up the database, you can use the following to delete spam or held comments:

wp comment delete $(wp comment list --status=spam --format=ids) --force

wp comment delete $(wp comment list --status=hold --format=ids) --force

Disabling contact form spam

Not all spam comes from comments; some of it might come through a contact form. To stop it, you can deactivate the contact form plugin, which will leave ugly markup on pages that relied on it, but the spam will stop.

# see which contact plugins are active
wp plugin list | grep contact
contact-form-7  active  none    5.7.5.1
contact-form-7-multilingual     active  none    1.2.1

# disable them
wp plugin deactivate contact-form-7

freeradius testing and logging

If you are put in front of a working radius server which you want to upgrade, but this is your first encounter with radius, the following notes might be useful to get you started.

The goal is to upgrade the system and test whether everything still works after the upgrade.

radtest

First way to test radius is radtest which comes with freeradius and enables you to verify if login/password combination results in successful auth.

You have to ensure that you have a 127.0.0.1 client defined, in our case in the /etc/freeradius/3.0/clients-local.conf file:

client 127.0.0.1 {
    ipv4addr    = 127.0.0.1
    secret      = testing123
    shortname   = test-localhost
}

Restart freeradius and test:

# systemctl restart freeradius


# radtest username@example.com PASSword 127.0.0.1 0 testing123

Sent Access-Request Id 182 from 0.0.0.0:45618 to 127.0.0.1:1812 length 86
    User-Name = "username@example.com"
    User-Password = "PASSword"
    NAS-IP-Address = 193.198.212.8
    NAS-Port = 0
    Message-Authenticator = 0x00
    Cleartext-Password = "PASSword"
Received Access-Accept Id 182 from 127.0.0.1:1812 to 127.0.0.1:45618 length 115
    Connect-Info = "NONE"
    Configuration-Token = "djelatnik"
    Callback-Number = "username@example.com"
    Chargeable-User-Identity = 0x38343431636162353262323566356663643035613036373765343630333837383135653766376434
    User-Name = "username@example.com"

# tail /var/log/freeradius/radius.log
Tue Dec 27 19:41:15 2022 : Info: rlm_ldap (ldap-aai): Opening additional connection (11), 1 of 31 pending slots used
Tue Dec 27 19:41:15 2022 : Auth: (9) Login OK: [user@example.com] (from client test-localhost port 0)

This will also test the connection to LDAP in this case.

radsniff -x

To get dump of radius traffic on production server to stdout, use radsniff -x.

This is useful, but won't get you encrypted parts of EAP.

freeradius logging

To see all protocol decode from freeradius, you can run it with -X flag in terminal which will run it in foreground with debug output.

# freeradius -X

If you have the ability to run an isolated freeradius for testing, this is the easiest way to see all of the parsed configuration (and warnings!) and the decoded EAP traffic.

generating more verbose log file

Adding -x to /etc/default/freeradius or to the radius command line will generate a debug log in the log file. Be mindful of disk space usage for the additional logging! To get enough debugging in the logs to see which EAP type is unsupported, like:

dpavlin@deenes:~/radius-tools$ grep 'unsupported EAP type' /var/log/freeradius/radius.log
(27) eap-aai: Peer NAK'd asking for unsupported EAP type PEAP (25), skipping...
(41) eap-aai: Peer NAK'd asking for unsupported EAP type PEAP (25), skipping...
(82) eap-aai: Peer NAK'd asking for unsupported EAP type PEAP (25), skipping...
(129) eap-aai: Peer NAK'd asking for unsupported EAP type PEAP (25), skipping...
(142) eap-aai: Peer NAK'd asking for unsupported EAP type PEAP (25), skipping...

you will need to use -xx (x twice). Again, monitor disk usage carefully.
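
Once the log contains these lines, counting them per EAP type shows which methods clients ask for most often. This is a small sketch whose regular expression is written against the log lines shown above; the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "... Peer NAK'd asking for unsupported EAP type PEAP (25), skipping..."
NAK = re.compile(r"Peer NAK'd asking for unsupported EAP type (\S+) \((\d+)\)")


def unsupported_eap_types(lines):
    """Count NAK lines per (EAP type name, EAP type number)."""
    counts = Counter()
    for line in lines:
        match = NAK.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

The function accepts any iterable of lines, so an open file object for /var/log/freeradius/radius.log can be passed directly.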

EAP radius testing using eapol_test from wpa_supplicant

To test EAP we need to build eapol_test tool from wpa_supplicant.

wget http://w1.fi/releases/wpa_supplicant-2.10.tar.gz
tar xf wpa_supplicant-2.10.tar.gz

cd wpa_supplicant-2.10/wpa_supplicant
$ cp defconfig .config
$ vi .config

CONFIG_EAPOL_TEST=y

# install development libraries needed
apt install libssl-dev libnl-3-dev libnl-genl-3-dev libnl-route-3-dev

make eapol_test

EAP/TTLS

Now we need a configuration file for wpa_supplicant which tests EAP:

ctrl_interface=/var/run/wpa_supplicant
ap_scan=1

network={
    ssid="eduroam"
    proto=WPA2
    key_mgmt=WPA-EAP
    pairwise=CCMP
    group=CCMP
    eap=TTLS
    anonymous_identity="anonymous@example.com"
    phase2="auth=PAP"
    identity="username@example.com"
    password="PASSword"
}

Now we can test against our radius server (with optional certificate test):

# ./wpa_supplicant-2.10/wpa_supplicant/eapol_test -c ffzg.conf -s testing123
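
and, with -o, saving the certificate chain presented by the server to a file (eapol_test writes to this path, so use a scratch file such as the illustrative /tmp/server-cert.pem here; to verify against your custom CA cert, set ca_cert in the network block):

# ./wpa_supplicant-2.10/wpa_supplicant/eapol_test -c ffzg.conf -s testing123 -o /tmp/server-cert.pem
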
This will generate a lot of output, but in the radius log you should see:

Tue Dec 27 20:00:33 2022 : Auth: (9)   Login OK: [username@example.com] (from client test-localhost port 0 cli 02-00-00-00-00-01 via TLS tunnel)
Tue Dec 27 20:00:33 2022 : Auth: (9) Login OK: [username@example.com] (from client test-localhost port 0 cli 02-00-00-00-00-01)

GTC

This seems to be a piece of tribal knowledge (passed to me by another sysadmin), but to make GTC work, change default_eap_type to gtc under ttls and add a gtc section:

        ttls {
                # ... rest of config...
                default_eap_type = gtc
                # ... rest of config...
        }

        gtc {
                challenge = "Password: "
                auth_type = LDAP
        }

and change the wpa_supplicant configuration to:

deenes:/home/dpavlin# cat eduroam-ttls-gtc.conf
ctrl_interface=/var/run/wpa_supplicant
ap_scan=1

network={
        ssid="eduroam"
        proto=WPA2
        key_mgmt=WPA-EAP
        pairwise=CCMP
        group=CCMP
        eap=TTLS
        anonymous_identity="anonymous@example.com"
        phase2="autheap=GTC"
        identity="username@example.com"
        password="PASSword"
}

PEAP

To make PEAP GTC work, I needed to add:

diff --git a/freeradius/3.0/mods-available/eap-aai b/freeradius/3.0/mods-available/eap-aai
index 245b7eb..6b7cefb 100644
--- a/freeradius/3.0/mods-available/eap-aai
+++ b/freeradius/3.0/mods-available/eap-aai
@@ -73,5 +73,11 @@ eap eap-aai {
                auth_type = LDAP
        }

+       # XXX 2023-01-06 dpavlin - peap
+       peap {
+               tls = tls-common
+               default_eap_type = gtc
+               virtual_server = "default"
+       }

 }

which can then be tested with:

network={
        ssid="wired"
        key_mgmt=IEEE8021X
        eap=PEAP
        anonymous_identity="anonymous@example.com"
        identity="username@example.com"
        password="PASSword"
}

What do you do when you have bind as a caching resolver which forwards to your DNS servers, which do recursive resolving and host the primary and secondary for your local domains, and the upstream link goes down?

To my surprise, caching server can't resolve your local domains although both primary and secondary of those domains are still available on your network and can resolve your domains without problem (when queried directly).

That's because caching server tries to do recursive resolving using root servers which aren't available if your upstream link is down, so even your local domains aren't available to clients using caching server.

Solution is simple if you know what it is. Simply add your local zones on caching server with type forward:

zone "ffzg.hr" {
    type forward;
    forwarders {
        193.198.212.8;
        193.198.213.8;
    };
};

zone "ffzg.unizg.hr" {
    type forward;
    forwarders {
        193.198.212.8;
        193.198.213.8;
    };
};

This works because queries for those zones are now forwarded directly to the listed servers instead of being resolved recursively, so they don't need the root servers, which aren't available without the upstream link.
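
With more than a couple of local zones, the stanzas above can be generated rather than copied by hand. The following is a minimal sketch that reproduces the same layout; the function name is made up, and the zone names and forwarder addresses are simply passed in as arguments.

```python
def forward_zone(zone, forwarders):
    """Return a bind 'type forward' stanza for one zone."""
    servers = "".join(f"        {address};\n" for address in forwarders)
    return (
        f'zone "{zone}" {{\n'
        f"    type forward;\n"
        f"    forwarders {{\n"
        f"{servers}"
        f"    }};\n"
        f"}};\n"
    )


def forward_zones(zones, forwarders):
    """Return the stanzas for all zones, separated by blank lines."""
    return "\n".join(forward_zone(zone, forwarders) for zone in zones)
```

Calling `forward_zones(["ffzg.hr", "ffzg.unizg.hr"], ["193.198.212.8", "193.198.213.8"])` yields the two stanzas shown above. The generated file can be checked with named-checkconf before reloading bind.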