Updating a Database from Existing Data

From Dryad wiki
Jump to: navigation, search

You may have an instance of Dryad already installed that you would like to be able to copy the data from. There are two sources of data that you'll need to copy: the DSpace assetstore and the DSpace PostgreSQL database.

  • To copy the PostgreSQL database, do a standard Backup and Restore (it helps if your db and db user are the same in the both instances). Dumping as SQL text is generally the most reliable, especially if the local PostreSQL installation is a different version from the source of the dump. More detailed instructions:
# (on source machine) dump the postgres database to a file (the production server does this regularly)
pg_dump -F c -U postgres dryad_repo >dryadDBexport.sql
# (on target machine) stop tomcat so nothing interferes with the reload
tomcat-stop.sh
# (target machine) remove old database from target server (if it exists) and rebuild
# may need to restart postgres to disable any connections to it
/etc/init.d/postgresql restart
sudo dropdb -U postgres dryad_repo
sudo -u dryad createdb -U postgres -E UNICODE dryad_repo
# (on target machine) load the database into target server
# this step may take an hour or more; do it as a batch process to keep it running
# even if you lose your connection
# you can check on progress by starting psql and seeing which tables have content
at now
at> pg_restore -d dryad_repo -U postgres dryadDBexport.sql >import.log
at> (press Ctrl-D)
  • To copy the assetstore, do an rsync from /opt/dryad/assetstore on one machine to the same location on the other.
# (source machine, assuming dryaddev2.lib.ncsu.edu is the target machine)
rsync -azv /opt/dryad/assetstore/ dryad@dryaddev2.lib.ncsu.edu:/opt/dryad-demo/assetstore
# may need to include switches --delete --rsh=ssh --rsync-path=/usr/local/bin/rsync
  • The statistics need to be synced from the source machine
# (running on target machine, in /opt/dryad/solr)
rsync -azv --delete --rsh=ssh --rsync-path=/usr/local/bin/rsync dryad@dryad.lib.ncsu.edu:/opt/dryad/solr/statistics/* statistics
  • After installing or updating a database, the solr index will need to be (re)built.
# ensure Tomcat is running
tomcat-start.sh
# rebuild the solr index
/opt/dryad/bin/dspace update-discovery-index

Cleaning the database on a Vagrant VM

To clear the postgres db on the Vagrant VM: First, kill all open sessions from psql:

SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'dryad_repo'
AND pid <> pg_backend_pid();

Then quit psql and dump the db:

dropdb -U dryad_app dryad_repo -W

Then halt the vagrant machine and reprovision:

vagrant halt
vagrant up
vagrant provision

Then reinstall the dryad db from inside the VM:

bash ~/bin/install_dryad_database.sh

Then redeploy.

Importing a pg dump file

The following commands can be used to import a pg dump file (e.g., from the prod database) into a Vagrant-hosted application:

dropdb dryad_repo
createdb -U dryad_app -E UNICODE dryad_repo
pg_restore -j $THREADS -d dryad_repo -U dryad_app $DUMPFILE

where $DUMPFILE is the .sql file produced by pg_dump, and $THREADS are the number of threads pg_restore can use for the job.