dspace shibboleth config and debug

Configuring shibboleth is always somewhat confusing for me, so I decided to write this blog post to document how configuration for dspace is done and debugged.

This information is scattered over documentation and dspace-tech mailing list so hopefully this will be useful to someone, at least me if I ever needed to do this again.

First step is to install mod-shib for apache: apt install libapache2-mod-shib Now there are two files which have to be modified with your information, /etc//shibboleth/shibboleth2.xml which defines configuration and /etc/shibboleth/attribute-map.xml to define which information will be passwd from shibboleth to application.

attribute-map.xml

Here we have to define headers which dspace expects, so it can get information from upstream idenitity provider.

diff --git a/shibboleth/attribute-map.xml b/shibboleth/attribute-map.xml
index 1a4a3b0..a8680da 100644
--- a/shibboleth/attribute-map.xml
+++ b/shibboleth/attribute-map.xml
@@ -163,4 +163,34 @@
</Attribute>
-->

+ <!-- In addition to the attribute mapping, DSpace expects the following Shibboleth Headers to be set:
+ - SHIB-NETID
+ - SHIB-MAIL
+ - SHIB-GIVENNAME
+ - SHIB-SURNAME
+ These are set by mapping the respective IdP attribute (left hand side) to the header attribute (right hand side).
+ -->
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.1" id="SHIB-NETID"/>
+ <Attribute name="urn:mace:dir:attribute-def:uid" id="SHIB-NETID"/>
+ <Attribute name="hrEduPersonPersistentID" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-NETID"/>
+
+ <Attribute name="urn:oid:0.9.2342.19200300.100.1.3" id="SHIB-MAIL"/>
+ <Attribute name="urn:mace:dir:attribute-def:mail" id="SHIB-MAIL"/>
+ <Attribute name="mail" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-MAIL"/>
+
+ <Attribute name="urn:oid:2.5.4.42" id="SHIB-GIVENNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:givenName" id="SHIB-GIVENNAME"/>
+ <Attribute name="givenName" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-GIVENNAME"/>
+
+ <Attribute name="urn:oid:2.5.4.4" id="SHIB-SURNAME"/>
+ <Attribute name="urn:mace:dir:attribute-def:sn" id="SHIB-SURNAME"/>
+ <Attribute name="sn" nameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic" id="SHIB-SURNAME"/>
+
</Attributes>

shibboleth2.xml

This is main configuration file for shibd. First we need to add OutOfProcess and InProcess to include useful information in shibd.log which are very useful.

diff --git a/shibboleth/shibboleth2.xml b/shibboleth/shibboleth2.xml
index ddfb98a..7b55987 100644
--- a/shibboleth/shibboleth2.xml
+++ b/shibboleth/shibboleth2.xml
@@ -2,15 +2,46 @@
xmlns:conf="urn:mace:shibboleth:3.0:native:sp:config"
clockSkew="180">

- <OutOfProcess tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a" />
+ <!-- The OutOfProcess section contains properties affecting the shibd daemon. -->
+ <OutOfProcess logger="shibd.logger" tranLogFormat="%u|%s|%IDP|%i|%ac|%t|%attr|%n|%b|%E|%S|%SS|%L|%UA|%a">
+ <!--
+ <Extensions>
+ <Library path="odbc-store.so" fatal="true"/>
+ </Extensions>
+ -->
+ </OutOfProcess>

+
+ <!--
+ The InProcess section contains settings affecting web server modules.
+ Required for IIS, but can be removed when using other web servers.
+ -->
+ <InProcess logger="native.logger">
+ <ISAPI normalizeRequest="true" safeHeaderNames="true">
+ <!--
+ Maps IIS Instance ID values to the host scheme/name/port. The name is
+ required so that the proper <Host> in the request map above is found without
+ having to cover every possible DNS/IP combination the user might enter.
+ -->
+ <Site id="1" name="sp.example.org"/>
+ <!--
+ When the port and scheme are omitted, the HTTP request's port and scheme are used.
+ If these are wrong because of virtualization, they can be explicitly set here to
+ ensure proper redirect generation.
+ -->
+ <!--
+ <Site id="42" name="virtual.example.org" scheme="https" port="443"/>
+ -->
+ </ISAPI>
+ </InProcess>
+
<!--
By default, in-memory StorageService, ReplayCache, ArtifactMap, and SessionCache
are used. See example-shibboleth2.xml for samples of explicitly configuring them.
-->

<!-- The ApplicationDefaults element is where most of Shibboleth's SAML bits are defined. -->
- <ApplicationDefaults entityID="https://sp.example.org/shibboleth"
+ <ApplicationDefaults entityID="https://repository.clarin.hr/Shibboleth.sso/Metadata"
REMOTE_USER="eppn subject-id pairwise-id persistent-id"
cipherSuites="DEFAULT:!EXP:!LOW:!aNULL:!eNULL:!DES:!IDEA:!SEED:!RC4:!3DES:!kRSA:!SSLv2:!SSLv3:!TLSv1:!TLSv1.1">

@@ -31,8 +62,8 @@
entityID property and adjust discoveryURL to point to discovery service.
You can also override entityID on /Login query string, or in RequestMap/htaccess.
-->
- <SSO entityID="https://idp.example.org/idp/shibboleth"
- discoveryProtocol="SAMLDS" discoveryURL="https://ds.example.org/DS/WAYF">
+ <SSO
+ discoveryProtocol="SAMLDS" discoveryURL="https://discovery.clarin.eu">
SAML2
</SSO>

@@ -68,6 +99,10 @@
<!--
<MetadataProvider type="XML" validate="true" path="partner-metadata.xml"/>
-->
+ <MetadataProvider type="XML" url="https://login.aaiedu.hr/shib/saml2/idp/metadata.php" backingFilePath="aaieduhr-metadata.xml" maxRefreshDelay="3600" />

<!-- Example of remotely supplied batch of signed metadata. -->
<!--

certificates

To make upstream identity provider connect to us, we need valid certificate so we need /etc//shibboleth/sp-encrypt-cert.pem and /etc/shibboleth/sp-encrypt-key.pem.
Since we are using Let's encrypt for certificates, I'm using shell script to move them over and change permissions so shibd will accept them.

dpavlin@repository:/etc/shibboleth$ cat update-certs.sh
#!/bin/sh -xe

cp -v /etc/letsencrypt/live/repository.clarin.hr/privkey.pem sp-encrypt-key.pem
cp -v /etc/letsencrypt/live/repository.clarin.hr/cert.pem sp-encrypt-cert.pem
chown -v _shibd:_shibd sp-*.pem
/etc/init.d/shibd restart

dspace configuration

We are using upstream dspace docker which includes local.cfg from outside using docker bind, but also re-defines shibboleth headers so we need to restore them to default names defined before in /etc/shibboleth/attribute-map.xml.

diff --git a/docker/local.cfg b/docker/local.cfg
index 168ab0dd42..6f71be32cf 100644
--- a/docker/local.cfg
+++ b/docker/local.cfg
@@ -14,3 +14,22 @@
# test with: /dspace/bin/dspace dsprop -p rest.cors.allowed-origins

handle.prefix = 20.500.14615
+
+shibboleth.discofeed.url = https://repository.clarin.hr/Shibboleth.sso/DiscoFeed
+
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.PasswordAuthentication
+plugin.sequence.org.dspace.authenticate.AuthenticationMethod = org.dspace.authenticate.ShibAuthentication
+
+# in sync with definitions from /etc/shibboleth/attribute-map.xml
+authentication-shibboleth.netid-header = SHIB-NETID
+authentication-shibboleth.email-header = SHIB-MAIL
+authentication-shibboleth.firstname-header = SHIB-GIVENNAME
+authentication-shibboleth.lastname-header = SHIB-SURNAME
+# Should we allow new users to be registered automatically?
+authentication-shibboleth.autoregister = true

example of working login

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:18:40 INFO XMLTooling.StorageService : purged 2 expired record(s) from storage

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.AuthnRequest|||https://login.aaiedu.hr/shib/saml2/idp/metadata.php||||||urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||||||

==> /var/log/shibboleth/shibd.log <==
2024-11-18 17:27:42 INFO Shibboleth.SessionCache [1] [default]: new session created: ID (_e37d7f0b8ea3ff4d718b3e2c68d81e45) IdP (https://login.aaiedu.hr/shib/saml2/idp/metadata.php) Protocol(urn:oasis:names:tc:SAML:2.0:protocol) Address (141.138.31.16)

==> /var/log/shibboleth/transaction.log <==
2024-11-18 17:27:42|Shibboleth-TRANSACTION.Login|https://login.aaiedu.hr/shib/saml2/idp/metadata.php!https://repository.clarin.hr/Shibboleth.sso/Metadata!303a3b0f72c5e29bcbdf35cab3826e62|_e37d7f0b8ea3ff4d718b3e2c68d81e45|https://login.aaiedu.hr/shib/saml2/idp/metadata.php|_3b8a8dc87db05b5e6abf8aaf9d5c67e6ebc62a2eed|urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport|2024-11-18T17:01:57|SHIB-GIVENNAME(1),SHIB-MAIL(2),SHIB-NETID(1),SHIB-SURNAME(1),persistent-id(1)|303a3b0f72c5e29bcbdf35cab3826e62|urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST||urn:oasis:names:tc:SAML:2.0:status:Success|||Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0|141.138.31.16

boken urf-8 in shibboleth user first name or surname

You might think that now everything works, but dspace will try to re-encode utf-8 characters received from upstream shibboleth using iso-8859-1 breaking accented characters in process. It has configuration for it, but by default it's false.

root@4880a3097115:/usr/local/tomcat/bin# /dspace/bin/dspace dsprop -p authentication-shibboleth.reconvert.attributes
false

So we need to modify dspace-angular/docker/local.cfg and turn it on:

# incomming utf-8 from shibboleth AAI
authentication-shibboleth.reconvert.attributes=true