How to make your OA repository work really well!
Checklist to help you get the best out of your DSpace open access repository

EIFL Open Access Programme Manager, Iryna Kuchma, shares a checklist for fine-tuning open access repositories that are built with DSpace software. The checklist is drawn from a series of seven webinars organized by EIFL, the Institute of Development Studies (IDS, United Kingdom) and Stellenbosch University (South Africa), from January to May 2016.

One of the EIFL Open Access Programme goals for 2016 - 2017 is to enhance open access repositories in EIFL partner countries. This means ensuring that they work well together with other systems and platforms and include new repository functionalities that make repositories more user-friendly and enable easier sharing of research outputs.  

Together with Nason Bimbe (IDS) and Hilton Gibson (Stellenbosch University), we hosted seven webinars in which we shared suggestions and good practices in setting up and running open access repositories with DSpace free and open source software, which is the most commonly used repository software in EIFL partner countries.

Based on the webinars, we have drafted a checklist to support repository development and management.

Here’s the checklist - if you have any feedback or suggestions for improving the checklist, please contact me at iryna.kuchma@eifl.net.

Which version of the DSpace software do you use?

Always run your repository on the latest software version (or no more than one version behind). The latest release at the time of writing is DSpace 5.5, a bug-fix release for the 5.x platform, which can be downloaded from GitHub (dspace-5.5 release). Release notes are available at DSpace 5.x Release Notes, and documentation at DSpace 5.x Documentation. Starting with DSpace 5.x, the upgrade process is easier than in prior versions (1.x.x, 3.x or 4.x).

How to upgrade DSpace: eifl.net/resources/webinar-how-upgrade-dspace

Have you enabled the handle service?

The handle service allows you to apply a short URL, which is persistent, for the purposes of citation and discovery on the web. See http://hdl.handle.net; http://www.handle.net/documentation.html; and http://wiki.lib.sun.ac.za/index.php/SUNScholar/Handle_Server
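As a small illustration of how a handle maps to a persistent URL, the sketch below builds the resolver link from a handle string. The function name and the example handle `123456789/42` are hypothetical; the only detail taken from the text above is the http://hdl.handle.net resolver address.

```python
import re

# A handle has the form "prefix/suffix"; the prefix is assigned to the
# institution and the suffix identifies the item within the repository.
HANDLE_PATTERN = re.compile(r"(?P<prefix>[^/\s]+)/(?P<suffix>\S+)")


def handle_url(handle, resolver="http://hdl.handle.net"):
    """Return the persistent URL for a handle such as '123456789/42'."""
    handle = handle.strip()
    # Tolerate the common "hdl:" citation prefix.
    if handle.lower().startswith("hdl:"):
        handle = handle[4:]
    if HANDLE_PATTERN.fullmatch(handle) is None:
        raise ValueError("Not a handle of the form prefix/suffix: %r" % handle)
    return resolver.rstrip("/") + "/" + handle


# Hypothetical handle, for illustration only.
print(handle_url("hdl:123456789/42"))
```

Because the URL goes through the resolver rather than pointing at the repository directly, it should keep working in citations even if the repository later moves to a new address.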

What are your backup/restore procedures and policies for disaster recovery?

Since your repository is now the vehicle for maintaining the permanent digital academic research record of your institution, you will be concerned about its sustainability. You will want to make sure it is backed up and monitored correctly. Read more at: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Disaster_Recovery

Have you enabled the OAI-PMH server?

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH: https://www.openarchives.org/pmh/) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. Configuration for the OAI-PMH server is held in the OAI-PMH crosswalks and in [dspacesource]/config/modules/oai.cfg. Note: unless you need to change the default behaviour, you may not need to alter either of them.

To enable DSpace's OAI-PMH server, just make sure the [dspace]/webapps/oai/ web application is available from your Servlet Container (usually Tomcat). You can test that it is working by sending a request to: http://[full-URL-to-OAI-PMH]/request?verb=Identify
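The Identify test above can also be scripted. The following is a minimal sketch, not part of DSpace: the function names and the repository URL `http://repository.example.org/oai/request` are hypothetical, and the sample response is a trimmed, made-up Identify reply used only to show the parsing step.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url):
    """Build the Identify request URL for an OAI-PMH base URL."""
    return base_url.rstrip("/") + "?" + urlencode({"verb": "Identify"})


def parse_identify(xml_text):
    """Extract a few fields from an Identify response, or raise ValueError."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        error = root.find("oai:error", OAI_NS)
        reason = error.text if error is not None else "no Identify element"
        raise ValueError("Not a valid Identify response: %s" % reason)
    fields = ("repositoryName", "baseURL", "protocolVersion")
    return {
        name: identify.findtext("oai:" + name, default="", namespaces=OAI_NS)
        for name in fields
    }


def check_oai_server(base_url, timeout=10):
    """Send the Identify request over HTTP and parse the reply."""
    with urlopen(identify_url(base_url), timeout=timeout) as response:
        return parse_identify(response.read().decode("utf-8"))


# Made-up sample reply, so the parsing can be tried without a live server.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">http://repository.example.org/oai/request</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://repository.example.org/oai/request</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""

print(identify_url("http://repository.example.org/oai/request"))
print(parse_identify(SAMPLE_RESPONSE)["repositoryName"])
```

Against a live installation you would call `check_oai_server()` with your own base URL; a `ValueError` or an HTTP error suggests the OAI web application is not deployed or not reachable.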

If you are harvesting content (bitstreams and metadata) from an external DSpace installation via OAI-PMH and OAI-ORE, you should first verify that the external DSpace installation allows OAI-ORE harvesting. More details are at https://wiki.lyrasis.org/display/DSDOC5x/OAI and http://wiki.lib.sun.ac.za/index.php/SUNScholar/Remote_Harvest

Improving discoverability through search engines

Search engines use software programmes, called crawlers or bots, that gather information from websites for indexing. They are good at finding HTML files, but since DSpace content is held in a database system, you will need to tell the crawler how to get to that content.

First, make sure that you generate sitemaps. This can be done through a cron job:

# Generate sitemaps at 6:00 am local time each day
0 6 * * * [dspace]/bin/dspace generate-sitemaps

This will generate the sitemaps, which are accessible at http://{your-DSpace-URL}/sitemap and http://{your-DSpace-URL}/htmlmap.

You will also need a robots.txt file placed in the root of your DSpace site. Make sure it contains directives pointing to these sitemap paths, as well as directives for what can and cannot be indexed. Details of robots.txt structure and instructions are at https://wiki.lyrasis.org/display/DSDOC5x/Search+Engine+Optimization
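The robots.txt check described above can be sketched as a small script. This is an illustrative helper, not a DSpace tool: the function names and the site URL `http://repository.example.org` are hypothetical, and it only looks for `Sitemap:` lines that match the /sitemap and /htmlmap paths mentioned above.

```python
def sitemap_directives(robots_txt):
    """Return the URLs named in 'Sitemap:' lines of a robots.txt file."""
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def missing_sitemaps(robots_txt, site_url):
    """List the expected DSpace sitemap URLs that robots.txt does not declare."""
    base = site_url.rstrip("/")
    expected = [base + "/sitemap", base + "/htmlmap"]
    declared = set(sitemap_directives(robots_txt))
    return [url for url in expected if url not in declared]


# Hypothetical robots.txt content that declares only one of the two sitemaps.
EXAMPLE_ROBOTS = """User-agent: *
Disallow: /discover
Sitemap: http://repository.example.org/sitemap
"""

print(missing_sitemaps(EXAMPLE_ROBOTS, "http://repository.example.org"))
```

An empty list means both sitemap paths are declared; anything else is a line to add to robots.txt.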

Register your repository with Google Scholar and all known ‘reputable’ aggregators – OpenDOAR, ROAR, etc.

More information: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Repository_Website_Metrics

Analytics and basic web traffic insights

Knowing how much content is in your repository, how it is used, and where users come from is crucial to securing the sustainability of your platform. Good repository statistics therefore support advocacy, buy-in and business case building. At a minimum, you should know: the number of downloads and views; where these are coming from (geo-location); and content analysis, for example how many items are in the repository by various dimensions (item type, subject, language, etc.).

DSpace provides statistics such as the number of downloads, number of items, number of failed logins, etc. All the data is available in the Solr indexes, which can also be queried directly.
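To give an idea of what querying the Solr indexes directly might look like, here is a hedged sketch that builds a facet query for downloads by country and reads the counts from the JSON reply. The core address `http://localhost:8080/solr/statistics` and the field names (`type`, `isBot`, `statistics_type`, `countryCode`) are assumptions about the statistics core; check them against the schema of your own installation before relying on the numbers.

```python
from urllib.parse import urlencode


def downloads_by_country_url(solr_core_url):
    """Build a Solr select URL that facets bitstream views by country."""
    params = [
        ("q", "type:0"),                 # assumed: 0 marks bitstream (file) events
        ("fq", "-isBot:true"),           # assumed: exclude traffic flagged as bots
        ("fq", "statistics_type:view"),  # assumed: keep view/download events only
        ("rows", "0"),                   # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", "countryCode"),
        ("wt", "json"),
    ]
    return solr_core_url.rstrip("/") + "/select?" + urlencode(params)


def facet_counts(solr_json, field):
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = solr_json["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Made-up reply in Solr's default JSON facet layout, for illustration only.
SAMPLE_REPLY = {
    "response": {"numFound": 15, "docs": []},
    "facet_counts": {"facet_fields": {"countryCode": ["ZA", 10, "GB", 5]}},
}

print(downloads_by_country_url("http://localhost:8080/solr/statistics"))
print(facet_counts(SAMPLE_REPLY, "countryCode"))
```

The same pattern works for other dimensions: change the `facet.field` value to another field in the statistics core.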

Use Google Analytics (registration required) or Piwik (an open source web analytics platform) to collect information about site visits. Standard JavaScript-based Google Analytics has limitations in capturing downloads. As of DSpace 5, there is integration with Google Analytics via the Google Analytics API, so statistics are accessible in DSpace itself, but you will need to configure DSpace for this to happen; see https://wiki.lyrasis.org/display/DSDOC5x/DSpace+Google+Analytics+Statistics

To enable visit tracking, you will need to acquire a Google Analytics key from Google - see Google Webmaster Tools. Add the key to the DSpace configuration in dspace.cfg under the setting jspui.google.analytics.key=UA-XXXXXX-X (for JSPUI) or xmlui.google.analytics.key=UA-XXXXXX-X (for XMLUI), replacing UA-XXXXXX-X with the key you are given by Google Analytics. The statistics can be viewed on the Google Analytics web application at https://marketingplatform.google.com/about/analytics/.

Third-party tools you could consider: Content Usage Analysis module, a DSpace add-on from Atmire (not open source); MyDash from Harvard University, a stand-alone open source web application (https://github.com/oscharvard/mydash); and altmetrics such as those from Altmetric.com or PlumX. You could start with a free tool from Altmetric.com, the Bookmarklet for Researchers, and instantly get article-level metrics for any recent paper: https://www.altmetric.com/products/free-tools/bookmarklet/. Assess what sort of insights you want from your DSpace, and set up or acquire the appropriate visualisation and analysis tools.

More information: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Google

Does your repository offer a good user experience for smartphones and tablets?

The Mirage 2 theme implements this capability by providing a distinct look for each of three categories of screen size: mobile phone, tablet and desktop. How to enable and customize the Mirage 2 responsive XMLUI for a DSpace repository: eifl.net/resources/webinar-enabling-and-customizing-mirage-2-responsive-xmlui-dspace-repository

Does your repository have an open access repository policy?

In the open access repository policy you should define: an overall vision for your institutional repository; a collection policy; a submission policy; the content types that you will include; a deposit licence and policy, and a re-use licence; take-down policies and embargoes; a preservation policy; and rights, responsibilities and repository services. A publicly stated policy on the permitted re-use of deposited items, submission of items, long-term preservation and so on simplifies matters for organisations wishing to provide search services, which in turn increases the visibility and impact of the repository. More information: http://wiki.lib.sun.ac.za/index.php/Open_Access

Researcher identification:  Do you allow your researchers to register to deposit in your repository with their ORCID?

ORCID (Open Researcher and Contributor ID) is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors. The ORCID integration adds ORCID compatibility to the existing solutions for Authority control in DSpace. More information: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Researcher_Identification; http://wiki.lib.sun.ac.za/index.php/SUNScholar/Researcher_Identification/5.X/ORCID and https://wiki.lyrasis.org/display/DSDOC5x/ORCID+Integration
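When researchers type in their ORCID at registration, it can help to check the identifier's format before storing it. The sketch below is an assumption-laden illustration rather than part of the DSpace ORCID integration: it checks the 16-character pattern and the final check character using the ISO 7064 MOD 11-2 calculation that ORCID documents for its identifiers. The function names are hypothetical.

```python
import re

# Four groups of four characters; only the last character may be "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_character(first_15_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is well formed and its check character matches."""
    if ORCID_PATTERN.fullmatch(orcid) is None:
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


# 0000-0002-1825-0097 is the sample identifier used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
```

A passing check only shows that the identifier is well formed; it does not confirm that the ORCID record exists or belongs to the person registering.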

 

FURTHER RESOURCES

Take a look at the expert tips for setting up and managing a DSpace repository, and resources on the following topics: DSpace installation, post-installation tasks, how to upgrade DSpace, DSpace system administration, enabling a responsive user interface and customizing DSpace.