
DSpace Installation and
Systems Administration Procedures for the
http://www.era.lib.ed.ac.uk/
Version: 1.9
Author: Richard Jones (r.d.jones@ed.ac.uk)
DSpace Version: 1.1.1
Tapir Version: 0.3 (pre-release)
© 2004 Edinburgh University Library
1.2. Notation and Fonts for this Document
1.3. Short Hand for this Document
3.3. Starting and Stopping PostgreSQL
5.2. Tomcat Configuration as Standalone Web-Server
5.3. Starting and Stopping Tomcat
6.1. Generating SSL Certificates with OpenSSL
6.2. Importing the Signed Certificate
6.3. Creating a Self-Signed Certificate with the Java Keytool
6.4. Creating a Self-Signed Certificate with OpenSSL
7.2. Starting and Stopping the Handle Server
8.1. Installing the Custom Design
8.2. Installing the New Log Reporter
9.1. Configuring the OAI Interface
10.1. Installation from Source
11.1. Taking the Data from Source DSpace
11.2. Installing the Data in Target DSpace
Required Ports
Operating System: Red Hat
Java Version: 1.4.2_04
Ant Version: 1.5.2
PostgreSQL Version: 7.4.2
DSpace Version: 1.1.1
IP and Server Name
129.215.166.124
agrona.lib.ed.ac.uk
Normal Sans-Serif Font - The main body of the text of this document.
Italic Sans-Serif Font - Notes that should be paid attention to.
Fixed-Width Font - Examples of commands or items that you might see on the computer screen. For example, directory and file names as well as installation commands.
Italic Fixed-Width Font - The user that you should be logged in as when performing tasks. You should always be logged in as the user most recently specified on the left-hand side of the page in the current section.
The following short-hand notations are used throughout this document to represent certain types of information. When these are encountered in the main text they should always be replaced with the values found in this section.
[postgres] = The location of the PostgreSQL installation
[dspace] = The location of the DSpace installation
[dspace-source] = The location of the source code used to create the DSpace installation
[dspace-home] = The dspace user's home directory on the system
[database-pw] = The dspace database password for the dspace user
[admin-email] = The email address of the system administrator who will receive all administrative requests and emails from the site.
[admin-fn] = The First Name of the administrator's account
[admin-ln] = The Last Name of the administrator's account
[admin-pw] = The DSpace administrator's login password
[era-url] = The full URL of the instance of DSpace, including http://
[era-host] = The full URL of the instance of DSpace without the http:// or any trailing slashes
[handle] = The Handle Prefix given to the organisation by CNRI
[tomcat] = The installation directory of the Tomcat web-server
[handle-server] = The installation directory of the Handle Server
[CNRI Contact] = Your contact at the CNRI to whom to send support requests
[tapir-source] = The pre-installation location of the Tapir source code
[machine-ip] = The IP address of the machine on which your DSpace installation is going
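The substitution of short-hand tokens described above can also be done mechanically. The following is a minimal sketch only: the function name expand_shorthand, the variable names and the default paths are all made up for illustration and are not values from the ERA installation. It covers three of the tokens; the others would follow the same pattern.

```shell
#!/bin/sh
# Sketch only: every path below is a hypothetical example value.
POSTGRES_DIR="${POSTGRES_DIR:-/usr/local/pgsql}"
DSPACE_DIR="${DSPACE_DIR:-/opt/dspace}"
TOMCAT_DIR="${TOMCAT_DIR:-/opt/tomcat}"

# expand_shorthand: read text on stdin and replace the short-hand tokens
# [postgres], [dspace] and [tomcat] with the variables set above.
# Other tokens (for example [dspace-source]) are left untouched.
# The values must not contain the "|" character, which sed uses here
# as its delimiter.
expand_shorthand() {
    sed -e "s|\[postgres\]|${POSTGRES_DIR}|g" \
        -e "s|\[dspace\]|${DSPACE_DIR}|g" \
        -e "s|\[tomcat\]|${TOMCAT_DIR}|g"
}
```

For example, echo '[postgres]/bin/initdb -D [postgres]/data' | expand_shorthand prints the command with both tokens replaced by the value of POSTGRES_DIR.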
Users that will have to exist on the machine onto which DSpace is to be installed are as follows:
The user you need to be logged in as when performing the actions laid out in this document will be given on the left hand side of the page before the set of actions which this user needs to perform.
This section details the order in which things ought to be done in order to have a fully working DSpace with Tapir in the form of the Edinburgh Research Archive.
1. Before starting the installation ensure that the machine has the prerequisite software for DSpace installed:
a. Red Hat
b. Tomcat 4.0.30
c. Java 1.4.2_04
d. Ant 1.5.2
The installation of this software is outside the scope of this document.
2. Install PostgreSQL as per the section PostgreSQL: Installation from Source. Once this has been done you should configure PostgreSQL as per the section PostgreSQL: PostgreSQL Configuration before starting the database server as per the section PostgreSQL: Starting and Stopping PostgreSQL.
3. Set up the PostgreSQL cron job as per the section PostgreSQL: Cron Jobs.
4. Install DSpace as per the section DSpace: Installation from Source.
5. Generate and install a Self-Signed SSL Certificate with OpenSSL as per the section SSL Certificates: Creating a Self-Signed Certificate with OpenSSL.
6. Configure Tomcat to use the SSL Certificate as per the section Tomcat: Configuring SSL.
7. Configure the pre-installed Handle server as per the section Handle Server: Configuration, then start it as per the section Handle Server: Starting and Stopping.
8. Make the localisation updates to integrate the system with local design decisions as per the section Localisation: Installing the Custom Design. Update the default JSPs to have the new system name as per the section Localisation: System Wide Name Change.
9. To replace the broken Perl script that does log analysis, follow the section Localisation: Installing the New Log Reporter.
10. Test that the new OAI interface is working correctly as per the section OAI-PMH: Configuring the OAI Interface.
11. Install Tapir as per the section Tapir: Installation from Source.
12. Import the data from another instance of DSpace as per the section Data Import.
postgres
gunzip postgresql-7.4.2.tar.gz
tar -xvf postgresql-7.4.2.tar
In [postgres]/postgresql-7.4.2:
./configure --prefix=[postgres]
gmake
gmake check
gmake install
PATH=$PATH:[postgres]/bin
[postgres]/bin/initdb -D [postgres]/data
The following settings need to be entered in the file [postgres]/data/postgresql.conf:
1. tcpip_socket = true
To allow PostgreSQL to be accessed via TCP/IP
2. max_connections = 400
To allow 400 connections to be opened to the database
3. shared_buffers = 3000
To provide normal usage in a live environment
Note that if these values need to be any larger, the kernel parameter SHMMAX, which limits the size of a shared memory segment, must be increased.
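To judge whether SHMMAX is likely to be a problem, a rough estimate of the memory taken by shared_buffers can be computed. The sketch below assumes the default PostgreSQL buffer size of 8192 bytes and ignores the additional per-connection overhead, so it gives a lower bound only; the function names are made up for illustration.

```shell
#!/bin/sh
# shared_buffer_bytes: rough lower bound, in bytes, on the shared memory
# needed for the shared_buffers setting given as $1, assuming each buffer
# is 8192 bytes (the PostgreSQL default block size).
shared_buffer_bytes() {
    echo $(( $1 * 8192 ))
}

# shmmax_is_sufficient: succeed if the SHMMAX value given as $2 (in bytes)
# is at least the estimate for the shared_buffers value given as $1.
shmmax_is_sufficient() {
    [ "$2" -ge "$(shared_buffer_bytes "$1")" ]
}
```

On Linux the current limit can usually be read from /proc/sys/kernel/shmmax and passed as the second argument. The comparison relies on shell integer arithmetic, so extremely large limits may not compare correctly in every shell.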
postgres
[postgres]/bin/postmaster -D [postgres]/data
pg_ctl stop -m fast -D [postgres]/data
or (only if above fails):
kill `head -1 [postgres]/data/postmaster.pid`
dspace
#vacuum full analyze the database every night
0 2 * * * [postgres]/bin/vacuumdb -f -z
crontab mycron
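Because crontab mycron replaces the user's whole crontab, it can be safer to build mycron from the existing entries. The following is a sketch under that assumption; merge_cron_entry is a hypothetical helper name, not part of DSpace or PostgreSQL.

```shell
#!/bin/sh
# merge_cron_entry: copy an existing crontab from stdin to stdout and
# append the entry given as $1, unless an identical line is already there.
merge_cron_entry() {
    entry="$1"
    existing="$(cat)"
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole line, -q: quiet
    if ! printf '%s\n' "$existing" | grep -Fqx -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}
```

A possible use is: crontab -l 2>/dev/null | merge_cron_entry '0 2 * * * [postgres]/bin/vacuumdb -f -z' > mycron, followed by crontab mycron as above.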
root
dspace
./createuser -U postgres -d -A -P dspace
Password: [database-pw]
./createdb -U dspace dspace
ant
ant fresh_install
Note: If you wish to perform this step manually, then in [dspace-source]/bin:
./dsrun org.dspace.storage.rdbms.InitializeDatabase /u01/dspace-home/dspace-1.1.1/etc/database_schema.sql
./dsrun org.dspace.administer.RegistryLoader -bitstream /u01/dspace/config/registries/bitstream-formats.xml
./dsrun org.dspace.administer.RegistryLoader -dc /u01/dspace/config/registries/dublin-core-types.xml
./install-configs
./create-administrator
Using the values
Email: [admin-email]
First Name: [admin-fn]
Last Name: [admin-ln]
Password: [admin-pw]
./index-all
The following settings are included in the DSpace config file dspace.cfg in [dspace]/config (or [dspace-source]/config prior to install). It is important to ensure that there are no extraneous spaces after any of the properties.
# DSpace installation directory
dspace.dir = [dspace]
# DSpace base URL
dspace.url = [era-url]
# DSpace host name - should match base URL
dspace.hostname = [era-host]
# Name of the site
dspace.name = Edinburgh Research Archive
config.template.log4j.properties = [dspace]/config/log4j.properties
config.template.log4j-handle-plugin.properties = [dspace]/config/log4j-handle-plugin.properties
config.template.oaicat.properties = [dspace]/config/oaicat.properties
config.template.oai-web.xml = [dspace]/oai/WEB-INF/web.xml
##### Database settings #####
# URL for connecting to database
db.url = jdbc:postgresql://localhost:5432/dspace
# JDBC Driver
db.driver = org.postgresql.Driver
# Database username and password
db.username = dspace
db.password = [database-pw]
##### Email settings ######
# SMTP mail server
mail.server=mailrelay.ed.ac.uk
# From address for mail
mail.from.address = [admin-email]
# Currently limited to one recipient!
feedback.recipient = [admin-email]
# General site administration (Webmaster) e-mail
mail.admin = [admin-email]
# Recipient for server errors and alerts
alert.recipient = [admin-email]
#### File Storage ######
# Asset (bitstream) store number 0 (zero)
assetstore.dir = [dspace]/assetstore
# Specify extra asset stores like this, counting from 1 upwards:
# assetstore.dir.1 = /second/assetstore
# assetstore.dir.2 = /third/assetstore
# Specify the number of the store to use for new bitstreams with this
# property. The default is 0 (zero) which corresponds to the
# 'assetstore.dir' above
# assetstore.incoming = 1
# Directory for history serializations
history.dir = [dspace]/history
# Where to put search index files
search.dir = [dspace]/search
# Where to put the logs
log.dir = [dspace]/log
# Where to temporarily store uploaded files
upload.temp.dir = /tmp
# Maximum size of uploaded files in bytes, must be positive
# 512Mb
upload.max = 536870912
##### Handle settings ######
# CNRI Handle prefix
handle.prefix = [handle]
# Directory for installing Handle server files
handle.dir = [handle-server]
##### Web UI Settings ######
# The site authenticator - must implement
# org.dspace.app.webui.SiteAuthenticator
webui.site.authenticator = org.dspace.app.webui.SimpleAuthenticator
# Certificate authority
webui.cert.ca = [dspace]/etc/certificate-ca.pem
# If a user presents a valid Web certificate, but does not have an
# e-person record, should they automatically be given a new e-person
# record?
webui.cert.autoregister = true
# Should the submit UI block submissions marked as theses?
webui.submit.blocktheses = false
##### SFX Server #####
# SFX query is appended to this URL. If this property is commented
# out or omitted, SFX support is switched off.
# sfx.server.url = http://sfx.myu.edu:8888/sfx?
##### Ingest settings #####
# Default language for content of submissions
default.language = en
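Since extraneous spaces after a property can cause problems, it may help to check dspace.cfg for them before installing. The following is a small sketch, with a made-up function name, that lists every non-comment line ending in a space or tab.

```shell
#!/bin/sh
# find_trailing_spaces: print "line-number:line" for every line of the
# file $1 that does not start with "#" and ends in a space or tab.
# Prints nothing (and returns non-zero) if no such line exists.
find_trailing_spaces() {
    grep -n '^[^#].*[[:space:]]$' "$1"
}
```

For example, find_trailing_spaces [dspace]/config/dspace.cfg should print nothing for a clean file.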
dspace
#send out subscription emails at 01.00 every day
0 1 * * * [dspace]/bin/sub-daily
crontab mycron
export PATH=$PATH:/usr/java/j2sdk1.4.2_04/bin
dspace
ln -s [dspace]/oai dspace-oai
ln -s [dspace]/jsp ROOT
dspace
<!-- Define the Tomcat Stand-Alone Service -->
Under:
<!-- Define a non-SSL Coyote HTTP/1.1 Connector on port xx -->
ensure the next line reads (the port number is the important bit):
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
port="8080" ...
Then under:
<!-- Define a SSL Coyote HTTP/1.1 Connector on port xx -->
ensure the next line reads (the port number is the important bit), and has also been uncommented:
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
port="8443" ...
At the top of the Tomcat connector examples section, inside the virtualhost configuration for the Tomcat Standalone service, enter the following (the first section should replace the current Tomcat ROOT context):
<!-- Tomcat Root Context - DSpace -->
<Context path="" docBase="ROOT" debug="0"
reloadable="true" crossContext="true">
<Resources className="org.apache.naming.resources.FileDirContext"
allowLinking="true" />
</Context>
<!-- DSpace OAI Interface -->
<Context path="/dspace-oai" docBase="dspace-oai" debug="0"
reloadable="true" crossContext="true">
<Resources className="org.apache.naming.resources.FileDirContext"
allowLinking="true" />
</Context>
root
To set up IP tables to route 80 to 8080 and 443 to 8443:
iptables -t nat -A PREROUTING -p tcp -i eth0 -d [machine-ip] --dport 80 -j DNAT --to [machine-ip]:8080
iptables -A FORWARD -p tcp -i eth0 -d [machine-ip] --dport 8080 -j ACCEPT
iptables -t nat -A PREROUTING -p tcp -i eth0 -d [machine-ip] --dport 443 -j DNAT --to [machine-ip]:8443
iptables -A FORWARD -p tcp -i eth0 -d [machine-ip] --dport 8443 -j ACCEPT
The web server will now route communication from ports 80 and 443 to 8080 and 8443 respectively.
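The four rules follow one pattern per port pair, so they can be generated rather than typed. The sketch below only prints the commands so that they can be reviewed before being run as root; the function name is hypothetical, and the interface eth0 is carried over from the rules above.

```shell
#!/bin/sh
# print_redirect_rules: print (but do not run) the two iptables commands
# that redirect TCP traffic for address $1 from port $2 to port $3 on eth0.
print_redirect_rules() {
    ip="$1"; from="$2"; to="$3"
    echo "iptables -t nat -A PREROUTING -p tcp -i eth0 -d $ip --dport $from -j DNAT --to $ip:$to"
    echo "iptables -A FORWARD -p tcp -i eth0 -d $ip --dport $to -j ACCEPT"
}
```

Calling print_redirect_rules [machine-ip] 80 8080 and then print_redirect_rules [machine-ip] 443 8443 reproduces the four commands listed above.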
dspace
Before starting or stopping tomcat for the first time, create shell scripts to do the job for you, as this will be much quicker.
[tomcat]/bin/startup.sh
[tomcat]/bin/shutdown.sh
chmod 764 starttomcat stoptomcat
PATH=$PATH:$HOME/bin
You can now start and stop tomcat from anywhere on the system by issuing one of the commands:
starttomcat
stoptomcat
Note that it is worth restarting tomcat after every system change as otherwise updates may not be reflected.
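The contents of the starttomcat and stoptomcat scripts are not reproduced here. One plausible form, assuming each script does nothing more than call the corresponding Tomcat script, is sketched below; make_wrapper is a made-up helper and the 764 permissions are the ones used above.

```shell
#!/bin/sh
# make_wrapper: create an executable script at path $1 that simply runs
# the program $2, passing on any arguments it is given.
make_wrapper() {
    wrapper="$1"
    target="$2"
    printf '#!/bin/sh\nexec "%s" "$@"\n' "$target" > "$wrapper"
    chmod 764 "$wrapper"
}
```

For example, make_wrapper $HOME/bin/starttomcat [tomcat]/bin/startup.sh and make_wrapper $HOME/bin/stoptomcat [tomcat]/bin/shutdown.sh, assuming the directory $HOME/bin already exists.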
To force Tomcat to update changed files it can be necessary to remove its cached files. We create a shell script to do this job for us, as this will be much quicker.
dspace
rm -r [tomcat]/work/Standalone/localhost/_
chmod 764 removecache
dspace
1. Generate the self-signed certificate for Tomcat as per the section SSL Certificates: Creating a Self-Signed Certificate with OpenSSL.
2. Set up an SSL connector for Tomcat.
In [tomcat]/conf/server.xml set up an HTTPS connector on port 8443:
<!-- Define a SSL Coyote HTTP/1.1 Connector on port 8443 -->
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
port="8443"
minProcessors="5" maxProcessors="75"
enableLookups="true" acceptCount="100" debug="0"
scheme="https"
secure="true"
useURIValidationHack="false"
disableUploadTimeout="true">
<Factory className="org.apache.coyote.tomcat4.CoyoteServerSocketFactory"
clientAuth="false"
protocol="TLS"
keystoreFile="[dspace-home]/keystore.tomcat"
keystorePass="changeit" />
</Connector>
dspace
openssl genrsa -out [era-host].key 1024
openssl req -new -key [era-host].key -out [era-host].csr
Using the following values:
Country Name: GB
State or Province Name: Mid-Lothian
Locality Name:
Organisation Name:
Organisational Unit Name: Library Systems
Common Name:
Email Address: [admin-email]
Password: <none>
Company: <none>
http://www.ucs.ed.ac.uk/fmd/unix/docs/certificates/sign_cert.html
dspace
keytool -import -alias dspace -file [dspace-home]/certificates/dspace.pem
When a password is requested use: changeit
The resulting keystore file is located here:
[dspace-home]/.keystore
dspace
keytool -genkey -alias tomcat -keyalg RSA
keytool -selfcert -alias tomcat
keytool -certreq -keyalg RSA -alias tomcat -file <place to put certificate request file>
This is the method that has been used to generate the certificate that is currently in use in ERA.
dspace
1. Generate the key:
openssl genrsa -des3 -out pass.key 1024
2. Generate the server key (a copy of the key with the passphrase removed):
openssl rsa -in pass.key -out server.key
3. Sign the certificate yourself (valid for 999 days):
openssl req -new -key server.key -x509 -out server.crt -days 999
4. Generate the DER key file:
openssl pkcs8 -topk8 -nocrypt -in server.key -out server.key.der -outform der
5. Generate the DER certificate file:
openssl x509 -in server.crt -out server.crt.der -outform der
6. Use the ImportKey utility in Java to import the key into the keystore:
java -cp importkey.jar comu.ImportKey server.key.der server.crt.der
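Before importing, it can be worth confirming that the key and the certificate belong together. The following sketch compares the RSA modulus of the two PEM files produced by the steps above; it assumes an unencrypted RSA key such as server.key, and the function name is made up for illustration.

```shell
#!/bin/sh
# key_matches_cert: succeed if the unencrypted RSA private key in PEM file
# $1 and the certificate in PEM file $2 have the same modulus.
# Returns 2 if either file cannot be read by openssl.
key_matches_cert() {
    key_mod="$(openssl rsa -noout -modulus -in "$1" 2>/dev/null)" || return 2
    crt_mod="$(openssl x509 -noout -modulus -in "$2" 2>/dev/null)" || return 2
    [ -n "$key_mod" ] && [ "$key_mod" = "$crt_mod" ]
}
```

For example, key_matches_cert server.key server.crt && echo "key and certificate match".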
dspace
./make-handle-config
Note: This may fail with an error like "Warning: data not encrypted". In this case it is necessary to do the installation directly:
./dsrun net.handle.server.SimpleSetup [handle-server]
Caching or Regular: 1
Primary Server: y
IP Address: [era-host]
Port Number: 2641
Log all access: y
Version/Serial No: 1
Description:
Disable UDP: y
Server Key: n
Admin Key: n
[handle-server]/sitebndl.zip
and email to CNRI:
[cnri-contact]
quoting the identifier number registered for ERA: [handle]
Find the section server_config and do the following updates:
"storage_type" = "CUSTOM"
"storage_class" = "org.dspace.handle.HandlePlugin"
In addition, replace all instances of YOUR_NAMING_AUTHORITY with [handle] for the whole file.
dspace
To start the handle server, in [dspace]/bin use the command:
./start-handle-server
To stop the handle server it is necessary to use the unix command kill. Get the process ID for the line starting:
/bin/sh /u01/dspace/bin/dsrun -Dlog4j.configuration=log4j-handle-plugin.properties net.handle.server.Main ...
and use:
kill <process id>
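Finding the process ID can be scripted. The sketch below filters a process listing for the Handle server's main class; it assumes a listing with the PID in the first column, such as the output of ps -eo pid,args, and the function name is hypothetical.

```shell
#!/bin/sh
# handle_server_pids: read a process listing on stdin (PID in the first
# column, command line after it) and print the PID of every process whose
# command line mentions net.handle.server.Main.
handle_server_pids() {
    awk '/net\.handle\.server\.Main/ { print $1 }'
}
```

For example, ps -eo pid,args | handle_server_pids lists the candidate process IDs. More than one may be printed, since the shell wrapper and the Java process it starts can both match, so review the list before passing anything to kill.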
dspace
1. Copy all of the design JSPs into [dspace]/jsp/local.
2. Copy the custom image directory eul-image into [dspace]/jsp.
3. Replace the file [dspace]/jsp/styles.css.jsp with the custom version of the same name.
4. Remove the file [dspace]/jsp/local/styles.css.jsp.
5. Replace the files in [dspace]/config/emails with the custom version of each file.
6. Replace the file [dspace]/jsp/favicon.ico with the custom version of this file.
7. Replace the files in [dspace]/jsp/image/submit with the custom version of each file.
8. Replace the file [dspace]/config/default.licence with the custom version of this file.
dspace
1. Insert the following line near the end of the file [dspace-home]/.bash_profile:
export LC_ALL=C
2. Replace the file [dspace]/bin/log-reporter with the custom version of this file
3. Ensure that the permissions to run this file are set by running:
chmod 764 log-reporter
To replace all instances of "DSpace" with instances of "ERA" we perform a recursive find and replace on the source code.
dspace
1. In [dspace]/jsp:
grep -l " DSpace" `find | grep jsp$` | xargs perl -pi -e 's/ DSpace/ ERA/g'
All instances of DSpace have now been replaced with ERA.
2. To reverse this process use the following commands in [dspace]/jsp:
grep -l " ERA" `find | grep jsp$` | xargs perl -pi -e 's/ ERA/ DSpace/g'
All instances of ERA have now been replaced with DSpace.
The ERA OAI-PMH interface is available at the address:
http://www.era.lib.ed.ac.uk/dspace-oai
Almost no configuration should be required to get the OAI interface working. A couple of tests should suffice to ensure that things are behaving as they should:
http://www.era.lib.ed.ac.uk/dspace-oai?verb=Identify
http://www.era.lib.ed.ac.uk/dspace-oai?verb=ListRecords&from=<insert start date>&until=<insert end date>&metadataPrefix=oai_dc
If the results of these queries come back looking OK then the OAI interface is working correctly.
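The two test queries can be built from the base address as in the sketch below. The function names are made up, and the dates are whatever range you wish to test, in the form YYYY-MM-DD.

```shell
#!/bin/sh
# oai_identify_url: print the Identify request URL for the OAI base URL $1.
oai_identify_url() {
    echo "$1?verb=Identify"
}

# oai_list_records_url: print the ListRecords request URL for base URL $1,
# start date $2 and end date $3, using the oai_dc metadata format.
oai_list_records_url() {
    echo "$1?verb=ListRecords&from=$2&until=$3&metadataPrefix=oai_dc"
}
```

If a command-line HTTP client such as curl is installed (an assumption; it is not one of the prerequisites listed in this document), a query can be issued with curl -s "$(oai_identify_url http://www.era.lib.ed.ac.uk/dspace-oai)".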
Tomcat can only bind to a privileged port when it is started as root, and it does not come with the facility to fork to another user afterwards. It is therefore necessary to find a solution that allows Tomcat to run on port 8080 while all communications to port 80 (the default HTTP port) are routed to port 8080 without causing any security problems; likewise for 443 and 8443. Possible solutions to this problem are as follows:
1. Use Apache httpd on ports 80 and 443 to redirect requests to ports 8080 and 8443
2. Use IPTables to do port pre-forwarding from 80 to 8080 and 443 to 8443
3. Use xinetd to start Tomcat bound to port 80 and 443 but not running as root.
4. Use mod_jk to connect Apache httpd to Tomcat.
Tapir is the Edinburgh University Library's customised add-on to the DSpace system to provide E-Theses functionality.
For ERA v1.0 we install the Tapir version that lies between 0.2.1 and 0.3 and is not an official release.
dspace
1. Upload the Tapir source to the directory [tapir-source]
2. In the directory [tapir-source]/java run
ant -Ddslib=[dspace]/lib install
This generates two warnings that can be ignored.
3. In the directory [tapir-source]/java run
ant -Dconfig=[dspace]/config/dspace.cfg database
Note: to do this manually, in [dspace]/bin:
./dsrun org.dspace.storage.rdbms.InitializeDatabase [tapir-source]/java/etc/workspace.sql
./dsrun org.dspace.storage.rdbms.InitializeDatabase [tapir-source]/java/etc/submissions.sql
4. Insert the contents of [tapir-source]/jsp/WEB-INF/web.xml into [dspace]/jsp/WEB-INF/web.xml. Note that all the sections of this file must be in the correct order or Tomcat will not start; it is therefore necessary to manually move each of the relevant parts in the Tapir file to the relevant part in the DSpace file.
5. Insert the contents of [tapir-source]/jsp/WEB-INF/dspace-tags.tld into [dspace]/jsp/WEB-INF/dspace-tags.tld
6. At this stage there exists one local JSP which has been customised in the section Localisation and is also present in the Tapir source. The file [tapir-source]/jsp/layout/navbar-admin.jsp must be merged with the file [dspace]/jsp/local/layout/navbar-admin.jsp
7. Copy all of the files in [tapir-source]/jsp/ into [dspace]/jsp/local (except navbar-admin.jsp):
cp -r [tapir-source]/jsp/* [dspace]/jsp/local
8. Remove the directory [dspace]/jsp/local/WEB-INF
9. Update the DSpace config file to contain additional Tapir information. In [dspace]/config/dspace.cfg insert the lines of text contained in the file [tapir-source]/config/dspace.cfg:
cat [tapir-source]/config/dspace.cfg >> [dspace]/config/dspace.cfg
10. Copy all of the new licences into [dspace]/config. In [tapir-source]/config run:
cp default.licence licence.* [dspace]/config/
11. Add the Tapir specific styles in [dspace]/jsp/local/styles.css.jsp to the base style sheet for DSpace at [dspace]/jsp/styles.css.jsp
12. Remove the file [dspace]/jsp/local/styles.css.jsp
13. Ensure the copies of all JSPs provided by Tapir that are new files to the system and not updated versions of DSpace native files are duplicated in [dspace]/jsp as well as appearing in the local directory. The directory structuring of these files should also be reflected.
To see a list of the new files look at the file [tapir-source]/docs/FILE_LIST.txt.
This section describes how we move data from one instance of DSpace on one machine to another instance on another machine. This process should be adaptable to most import or upgrade tasks.
This procedure assumes similar structure on each machine, and notation is relative to the machine in question.
dspace
1. Stop tomcat.
2. Shut down Tomcat to prevent any further changes to the data or the database.
3. In [dspace] bundle up the assetstore for moving:
tar -cvf assetstore.tar assetstore/
gzip -v9 assetstore.tar
This produces a file called assetstore.tar.gz which contains all of the archived files. This file can be very large (e.g. 1.7 Gb for ~ 380 items)
postgres
4. Back up the contents and structure of the database. In [dspace] run:
pg_dump dspace > era-data
This produces a file called era-data which contains all of the database. This file is generally of reasonable size (e.g. 3.2Mb for ~ 380 items).
dspace
5. Compress the era-data file for convenience:
gzip -v9 era-data
This generates a file called era-data.gz which could be as little as 30% the size of the original era-data file.
6. Backup assetstore.tar.gz and era-data.gz to another machine (ideally not the source or the target machines).
7. You can now restart Tomcat. Changes made after this will not be reflected in the target machine, and this procedure must be performed again for new updates.
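Because the archives can be large, it may be worth checking that they are intact before and after copying them between machines. The sketch below uses gzip's built-in test mode; the function name is made up for illustration.

```shell
#!/bin/sh
# archives_are_intact: run gzip's integrity test on every file named as an
# argument. Stops at the first file that is missing or fails the test,
# prints its name and returns non-zero.
archives_are_intact() {
    for archive in "$@"; do
        if ! gzip -t "$archive" 2>/dev/null; then
            echo "corrupt or missing: $archive"
            return 1
        fi
    done
}
```

For example, archives_are_intact assetstore.tar.gz era-data.gz && echo "archives OK".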
dspace
1. Stop Tomcat.
2. Upload the files assetstore.tar.gz and era-data.gz created in the previous section to the directory [dspace-home]/era-data/
3. In [dspace] rename the old assetstore:
mv assetstore assetstore_old
4. Copy [dspace-home]/era-data/assetstore.tar.gz to [dspace]
5. Unzip and untar the new assetstore:
gunzip assetstore.tar.gz
tar -xvf assetstore.tar
This will create a new directory called assetstore which is an exact copy of the one in the source DSpace installation.
postgres
6. Backup the old data in the database. In [dspace] run:
pg_dump dspace > era-data-old
This generates a file called era-data-old which contains the old database contents.
7. Prepare to drop the old dspace database by stopping and starting postgres to kill all latent connections to the database (still logged in as postgres):
pg_ctl stop -m fast -D [postgres]/data
postmaster -i -D [postgres]/data
8. Drop the old DSpace database (still logged in as postgres):
dropdb dspace
dspace
9. Now log in as dspace and reinitialise the DSpace database as you would do in a normal installation. In [postgres]/bin run:
./createdb -U dspace dspace
postgres
10. Prepare to import the data into the DSpace database. Copy [dspace-home]/era-data/era-data.gz to [postgres]
11. Unzip the era-data.gz file:
gunzip era-data.gz
12. Import the data into the DSpace database:
psql dspace < era-data
This will create the database as an exact replica of the source installation's database.
13. You can now restart Tomcat.