Troubleshooting

Troubleshooting issues with a Dryad installation

If an error isn't listed here, check the DSpace Troubleshooting Page.

Compilation issues
If Maven fails to run correctly, ensure you are using the same version of Maven that previously built the code successfully. Even minor Maven updates can turn small issues into larger problems (e.g., something that was only a warning becomes a full error and stops the build).
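
To make version drift easier to spot, it can help to record the Maven version that last built cleanly and compare against it. The following is a minimal sketch, assuming that `mvn -version` prints a line of the form "Apache Maven X.Y.Z ...". The function names and the pinned version 3.0.4 are hypothetical placeholders, not values taken from the Dryad setup.

```shell
# Print the Maven version number found in `mvn -version` output read from stdin.
# Assumes the output contains a line of the form "Apache Maven X.Y.Z ...".
maven_version() {
    sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical pinned version; replace with the version that last built cleanly.
EXPECTED_MAVEN_VERSION="3.0.4"

# Return 0 if the given version matches the pinned one, 1 otherwise.
check_maven_version() {
    if [ "$1" = "$EXPECTED_MAVEN_VERSION" ]; then
        echo "Maven version OK: $1"
    else
        echo "Maven version mismatch: found '$1', expected '$EXPECTED_MAVEN_VERSION'"
        return 1
    fi
}
```

Typical use would be: check_maven_version "$(mvn -version | maven_version)"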

Apache/redirection issues
Ensure that the Apache proxy is configured for the correct server name, and that the server name is correct in the dspace.cfg setting for dspace hostname.

Install setroubleshoot. It should email you the actions blocked by SELinux, along with how to fix them:

yum install setroubleshoot
echo myemail@address.com >> /var/lib/setroubleshoot/email_alert_recipients

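You can add the following to your Apache configuration to log what the rewrite rules are doing:

RewriteLog "rewrite.log"
RewriteLogLevel 1

These directives exist only in Apache 2.2 and earlier. Apache 2.4 replaced them with per-module logging via the LogLevel directive (e.g., LogLevel alert rewrite:trace3).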

SELinux issues
If SELinux is denying access to something, use the audit2allow command to establish new policies.
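
As a rough sketch of the audit2allow workflow: denials are read from the audit log, turned into a local policy module, reviewed, and then loaded. The function names and the module name are hypothetical, and the log path assumes the default auditd location. Both functions need to run as root.

```shell
# Generate a local policy module from denials recorded in the audit log.
# $1 = text to match in the log (e.g. a process name such as httpd)
# $2 = name for the generated module (hypothetical; choose your own)
# Writes <name>.te (readable rules) and <name>.pp (compiled module)
# to the current directory.
generate_selinux_module() {
    grep "$1" /var/log/audit/audit.log | audit2allow -M "$2"
}

# Load a previously generated module, after its .te file has been reviewed.
# $1 = module name used above
load_selinux_module() {
    semodule -i "$1.pp"
}
```

For example: generate_selinux_module httpd dryad_local, then read dryad_local.te to confirm it only allows what is intended, then load_selinux_module dryad_local.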

If SELinux is denying Apache access to a port (e.g., 9999), run: sudo semanage port -a -t http_port_t -p tcp 9999

SOLR issues
The most common reason for the home page failing to load is a corrupted SOLR index. This happens most frequently on secundus, when a SOLR index is modified while it is being replicated. This is very difficult to avoid for the statistics index, since that index is updated every time someone views an item page. For the other indexes, there is only a problem if the index is copied while curators are working.

Basic repair of secundus SOLR indexes:

1. Ensure the SOLR index isn't being updated (curators are not currently modifying anything).

2. On the production server, copy the SOLR indexes to the Amazon backup (this process excludes the statistics index):

/home/ubuntu/dryad-utils/aws-tools/backup-sql-solr.sh

3. On secundus, install the new SOLR indexes:

/home/ubuntu/bin/tomcat_stop.sh; /home/ubuntu/dryad-utils/import_recent_dryad.sh >/dev/null; sleep 10; /home/ubuntu/bin/tomcat_start.sh

If the statistics index is the source of the problem, you will need to shut down tomcat (so the index is no longer being updated), and manually copy it.

Connection refused
Sometimes, the home page will not load, with an error of:

java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

This usually indicates that the home page cannot contact the SOLR statistics system. Verify that SOLR is running, and that the dspace.cfg file has the correct addresses for connecting to SOLR (in at least two places).
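
One way to find every SOLR address in the configuration is to search for uncommented lines that mention SOLR and contain a URL. This is a minimal sketch: the function name is hypothetical, and the path to dspace.cfg is passed as an argument because it depends on the installation.

```shell
# List non-comment lines of a config file that mention "solr" and contain
# an http(s) URL. Each line is prefixed with its line number in the file.
# $1 = path to the configuration file
list_solr_addresses() {
    grep -n -i 'solr' "$1" |
        grep -v '^[0-9]*:[[:space:]]*#' |
        grep -i 'https\{0,1\}://'
}
```

For example: list_solr_addresses /opt/dryad/config/dspace.cfg (the path is an assumption based on the /opt/dryad install directory used elsewhere on this page). Each address listed can then be requested with a tool such as curl to confirm that SOLR responds.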

Unknown protocol: resource
java.net.MalformedURLException: unknown protocol: resource

This is usually related to DSpace bug #239. Don't set the root-level logging to DEBUG.

Home page doesn't load, no message
One cause of this is running out of PermGen space. The error will show up in the DSpace log, though the problem is in the Tomcat configuration. Adding the following Java configuration options seems to resolve this:

-XX:+CMSClassUnloadingEnabled -XX:PermSize=512M -XX:MaxPermSize=512M

Errors loading certain types of pages
Sometimes an entire class of pages fails to load (e.g., all item pages, or all static pages).

There are many possible causes. Things to check:


 * Tomcat version
 * Java version AND Java type -- Dryad is known to work with Sun/Oracle Java 1.6. Other versions of Java have been known to cause strange errors in the xmlui.
 * Permissions. All files in the install directory should be owned by the dryad user, and Tomcat should run as this user.

APIs and encoded URLs
Some API calls rely on URLs that contain encoded slashes. By default, both Apache and Tomcat disallow this practice, even though it is valid according to RFC 3986.

For Apache, locate the httpd.conf file for your server. In the VirtualHost section, add: AllowEncodedSlashes On

For Tomcat, locate the script that starts tomcat (typically catalina.sh). Somewhere in the beginning of this script, add: export CATALINA_OPTS="-Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true"
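
To illustrate what an encoded slash looks like: an identifier that contains a slash must have it percent-encoded as %2F when the identifier is used as a single URL path segment. The sketch below shows only that encoding step. The function name and the sample identifier are hypothetical.

```shell
# Percent-encode every "/" in the argument as %2F so that the value can be
# used as a single path segment in a URL.
encode_slashes() {
    printf '%s\n' "$1" | sed 's|/|%2F|g'
}

# Hypothetical identifier, for illustration only.
encode_slashes "doi:10.5061/dryad.example"
# prints: doi:10.5061%2Fdryad.example
```

Requests whose paths contain %2F are the ones that Apache and Tomcat reject unless the settings above are applied.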

Item Review Pages
Sometimes, an item review page doesn't render properly. This is usually caused by a permissions problem. Items in review should have the following permissions:


 * Package: READ for COLLECTION_2_WORKFLOW_ROLE_curator
 * File: READ for COLLECTION_2_WORKFLOW_ROLE_curator
 * Bundle: READ for COLLECTION_2_WORKFLOW_ROLE_curator
 * Bitstream: READ for COLLECTION_2_WORKFLOW_ROLE_curator

Items in review should have entries (for the package and all files) in the workflowitem table, and no entries in the workspaceitem table.

The Wing Exception, "give exception", and "give item"
org.dspace.app.xmlui.wing.WingException: The available object manager is unable to manage the give object.

These exceptions are often caused by the SOLR index being out of sync with the database. Try: /opt/dryad/bin/dspace update-discovery-index -f

Errors rendering the data package page
If the data package is not correctly displaying the associated data files:


 * verify that the package contains correct haspart references to the files
 * verify that the DOI database has the correct listing for the package and each file (see DOI Services Technology)

When initializing a new submission, the "select a collection" page displays
Verify that the correct "Data Package" collection exists, 10255/3.

Verify the rights for the 10255/3 collection:


 * 1) Log in as an administrator
 * 2) Go to http://localhost:8000/handle/10255/3
 * 3) Choose "Edit Collection" from the menu
 * 4) At the top of the page, choose "Assign Roles"
 * 5) Verify that there is a group set for the Submitters role.
 * 6) Verify that everyone is a member of the group from step 5.

In Dryad, the Submitters role is assigned to a group called COLLECTION_2_SUBMIT. One of the members of COLLECTION_2_SUBMIT is the group Anonymous. This allows anyone to create a new submission.

NOTE: It would be a good idea to repeat all of the above steps with collection 10255/2, which should be the "Data Files" collection.

Errors in the search system
You may need to rebuild the index:

[dspace]/bin/dspace update-discovery-index -f

Afterwards, you can optimize the index (this is not necessary):

[dspace]/bin/dspace update-discovery-index -o

DOI Issues
If DOIs are not resolving correctly:


 * 1) Run the DOI synchronization script on the server DOI database:

./dspace dsrun org.dspace.doi.DOIDbSync

Options: -s to synchronize and report; -r to produce the report.

 * 2) Check the definition of the DOIs in the EZID server

Logging issues
The dspace.cfg file specifies which log configuration file is used. Normally, we use the log4j.xml file to configure logging. Note that the beginning section just sets up many appenders, while the lower section actually associates loggers with the appenders.

Items that do not appear in workflow
UPDATED:


 * 1) Find the item's internal ID
 * 2) Run python ~/scripts/dryad-utils/fix_tasklistitems.py doi-of-item
 * 3) Run /opt/dryad/bin/dspace update-discovery-index -i ITEM_ID
 * 4) As a curator, go to the internal item page: http://datadryad.org/internal-item?itemID=xxxx and claim the task.

OLD NOTES:


 * 1) Find the item's internal ID
 * 2) Re-index the item: run /opt/dryad/bin/dspace update-discovery-index -i ITEM_ID (This uses IndexClient)
 * 3) If the item is still broken, go to the admin page: http://datadryad.org/admin/item?itemID=ITEM_ID
 * 4) Claim this item and return it to the pool.
 * 5) If the item is still broken, go back to the item page and choose the menu option "Change workflow step".
 * 6) If the item is still broken, compare its database entries with those on Workflow State in Database and repair them. Then re-index it again.

Returning items from the pool to the review stage

 * 1) Go to Workflow Overview and locate the item
 * 2) Click to open the item's internal-item page
 * 3) From the menu, select Change Workflow Step
 * 4) At the bottom of the page, set the stage to reviewStep
 * 5) Find the message that was sent with the link to the review item (it should be sent to all curators). Get the review token.
 * 6) Edit the item's metadata. Add the review token to workflow.step.reviewerKey

Items with inconsistent database state
These items typically fail to show up in the curation system, or they show up "incorrectly", with the data package pages not displaying the associated data files.

Ensure that the item is listed in either the workspaceitem table or the workflowitem table, but not both. See Extreme Curation Techniques for more information on changing the database state of an item.

The typical process for fixing these items is:

 * Find the affected item IDs, usually by searching the metadata table for a DOI or other appropriate metadata:

select item_id, metadata_field_id, text_value from metadatavalue where text_value like 'SOME_DOI';

 * Note the item IDs for files. These have a metadata_field_id of 42.
 * For each file item ID, delete the corresponding row in workspaceitem:

delete from workspaceitem where item_id=FILE_ITEM_ID;

 * For each file item ID, add a corresponding row in workflowitem:

INSERT INTO workflowitem(workflow_id,item_id,collection_id,multiple_titles,published_before,multiple_files) VALUES (getnextid('workflowitem'),FILE_ITEM_ID,1,'f','f','f');

 * Give curators an admin link to the package item ID:

http://datadryad.org/internal-item?itemID=PACKAGE_ITEM_ID

Changing email address
After a user's email address is changed, the user may be unable to see any of their own submissions on the My Submissions page. Confirm by searching Dryad for the new and old email addresses.

Reindex the user's submissions:


 * 1) Get a list of all the user's submitted items from the database:
 * 2) run  on each one
 * 3) Verify that the items appear when searching for the new email address.

Fixing permissions on returned submissions
See Fixing returned submissions.

Fixing items lost in approval
See Fixing items lost in approval.

DatabaseManager issues
If you receive an exception from DatabaseManager (often at line 222), the problem is that the database connection is not available. Check that:


 * The configuration files have the correct information for connecting to the database.
 * The code that's running has not closed a Connection object or Context object.

If you find an issue in the core DSpace...
File a ticket in the DSpace bug tracker.

If it is simple to fix, first fix and test it. The best way to do this is to copy the original DSpace source file(s) into the appropriate modules directory, make the changes, compile, commit (which will deploy on our development server), and test.

Once you have a fix, you can add a patch to the bug report in the DSpace bug tracker.

Creating a patch for DSpace
Ensure you have an up-to-date copy of the DSpace trunk code.

Edit the trunk code to include the fix.

From the base directory of the trunk code, run: svn diff > ~/fix_ugly_bug.diff

Verify that the fix_ugly_bug.diff contains the expected differences.

Upload the diff file to the bug tracker.

Server Load Issues
To see which java thread is causing the most CPU use:
 * run top, note the PID of the java process
 * press Shift-H to enable Threads View
 * get PID of the thread with highest CPU
 * convert PID to HEX
 * get stack dump of the java process (using "jstack PID", with the PID from the first step)
 * in stack dump look for thread with the matching HEX PID.
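
The conversion and lookup steps above can be sketched as follows. jstack reports each thread's native ID as a hexadecimal nid=0x... field, so the decimal thread ID shown by top has to be converted before searching the dump. The function names are hypothetical, and the 20 lines of context shown per thread is an arbitrary choice.

```shell
# Convert a decimal thread ID (as shown by top in Threads View) to the
# hexadecimal form used in the nid= field of a jstack thread dump.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Print the start of one thread's stack from a jstack dump.
# $1 = PID of the java process, $2 = decimal ID of the busy thread.
# The trailing space in the pattern avoids matching longer IDs that
# merely begin with the same hex digits.
show_thread_stack() {
    jstack "$1" | grep -A 20 "nid=$(tid_to_nid "$2") "
}
```

For example, a thread ID of 6699 in top corresponds to nid=0x1a2b in the stack dump.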

Postgres issues
To see current postgres processes: select pid, backend_start, state, query from pg_stat_activity order by backend_start;

Bots and malicious scrapers
If a particular user or bot is overloading the server, you can change the Apache settings so that its requests are redirected or blocked:

sudo emacs /etc/httpd/conf/httpd.dryad.conf
sudo apachectl restart

Or block it from the server entirely:

sudo emacs /etc/sysconfig/iptables
sudo service iptables restart
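
Before blocking anything, it helps to confirm which client is responsible. The sketch below counts requests per client address in an Apache access log, assuming the common or combined log format, in which the client address is the first field on each line. The function name is hypothetical.

```shell
# Print the busiest client addresses in an Apache access log, one per line
# as "<request count> <address>", busiest first.
# $1 = path to the access log
# $2 = how many addresses to show (default 10)
top_clients() {
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$1" |
        sort -rn | head -n "${2:-10}"
}
```

For example: top_clients /var/log/httpd/access_log 5 (the log path is an assumption; use the location configured for your server).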