3. Deploying the IMI data catalogue¶

Recipe Overview

Reading Time

20 minutes

Executable Code

Yes

Difficulty

Deploying the IMI data catalogue

Recipe Type

Hands-on

Audience

Software Developer, Data Manager, System Administrator

Maturity Level & Indicator

[F+MM-1.1C] [F+MM-1.2C]

Cite me with FCB048

3.1. Main Objectives¶

This recipe is a step-by-step guide on how to deploy the IMI Data Catalogue in Docker.

3.2. Introduction¶

For a more general introduction to data catalogues, their elements and data models, see the data catalogue recipe. This recipe is a set of step-by-step instructions for deploying, via Docker, the IMI Data Catalogue developed at the Luxembourg Centre for Systems Biomedicine. The overall purpose of the data catalogue is to host dataset-level metadata for a wide range of IMI projects. Datasets are FAIRified and searchable by a range of facets. The catalogue is not intended to hold the actual data, although it provides links to where the data is hosted, together with information on any access restrictions.

3.3. Requirements¶

The following need to be installed on the machine the deployment is run on:

Git
Docker

3.4. Ingredients¶

IMI data catalogue code

Check out the code to your local machine by running the following command in a terminal:

$ git clone git@github.com:FAIRplus/imi-data-catalogue.git

docker-compose manages all the components (Solr and the web server) required to run the application.

3.5. Step-by-step guide¶

Unless otherwise specified, all the following commands should be run in a terminal from the base directory of the data catalogue code.

3.5.1. Building¶

The prefixes (local) and (web container) indicate where a command is executed: on your local machine or inside the web container.

First, generate the certificates that will be used to enable HTTPS in the reverse proxy. To do so, execute:

$ cd docker/nginx/
$ ./generate_keys.sh

Warning

⚡ Please note that if you run this command outside the nginx directory, the certificate and key will be generated in the wrong location.
This command relies on OpenSSL. If you don’t plan to use HTTPS or just want to see the demo running, you can skip this step.

Warning

⚡ However, be aware that if you skip this step, the HTTPS connection will not be secure!

Return to the root directory, then copy datacatalog/settings.py.template to datacatalog/settings.py:

$ cd ../..
$ cp datacatalog/settings.py.template datacatalog/settings.py

Edit the settings.py file and set the SECRET_KEY attribute to a random string of characters. For maximum security, generate this key in Python as follows:

import os
# Print 24 random bytes; copy the printed value into SECRET_KEY
print(os.urandom(24))
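If a printable key is easier to paste into settings.py, the standard-library secrets module offers an alternative. The following is a minimal sketch, not the catalogue's own tooling: the helper name generate_secret_key is hypothetical, and it assumes SECRET_KEY accepts a plain string.

```python
import secrets


def generate_secret_key(num_bytes: int = 24) -> str:
    """Return a random hex string built from num_bytes of OS-level randomness."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Assumption: SECRET_KEY in datacatalog/settings.py accepts a plain string.
    print(f"SECRET_KEY = '{generate_secret_key()}'")
```

Each byte is rendered as two hexadecimal characters, so the default of 24 bytes yields a 48-character string.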

Build and start the docker containers by running:

(local) $ docker-compose up --build

That will create:

* a container with the datacatalog web application
* a container for Solr

Note

⚡ The data will be persistent between runs.

In a new terminal, create the Solr cores:

(local) $ docker-compose exec solr solr create_core -c datacatalog
(local) $ docker-compose exec solr solr create_core -c datacatalog_test
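Once both commands have finished, one way to confirm that the cores exist is to query Solr's CoreAdmin STATUS endpoint. The sketch below is an illustration and not part of the catalogue code: it assumes Solr's HTTP port is published on localhost:8983 (Solr's default), which may not match the docker-compose configuration, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Solr is reachable on localhost:8983 (Solr's default port).
SOLR_STATUS_URL = "http://localhost:8983/solr/admin/cores?action=STATUS&wt=json"
EXPECTED_CORES = ("datacatalog", "datacatalog_test")


def missing_cores(status_response: dict, expected=EXPECTED_CORES) -> list:
    """Return the expected core names absent from a CoreAdmin STATUS response."""
    present = status_response.get("status", {})
    return [name for name in expected if name not in present]


def fetch_status(url: str = SOLR_STATUS_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the CoreAdmin STATUS response as a dictionary."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        missing = missing_cores(fetch_status())
    except (OSError, ValueError) as error:
        # Solr not reachable on the assumed port, or the reply was not JSON.
        print(f"Could not query Solr: {error}")
    else:
        print("All cores present" if not missing else f"Missing cores: {missing}")
```

If Solr is not exposed on the host, adjust SOLR_STATUS_URL or run the check from a container that can reach it.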

Then, still in the second terminal, open a shell in the web container and load the data into the Solr cores:

(local) $ docker-compose exec web /bin/bash

(web container) $ python manage.py init_index 
(web container) $ python manage.py import_entities Json dataset 

Tip

⚡ To leave the container shell, press “CTRL+D” or type “exit” in the terminal.

The web application should now be available, with the data loaded, at http://localhost and, over an SSL connection, at https://localhost.
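To check both addresses without opening a browser, a small probe script can help. This is a minimal sketch under stated assumptions: the function name probe is hypothetical, and certificate verification is switched off only because the certificate generated earlier is self-signed.

```python
import ssl
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    # The locally generated certificate is self-signed, so verification is
    # disabled here. Do this only for a local check, never in production.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, timeout, DNS failure, and similar


if __name__ == "__main__":
    for url in ("http://localhost", "https://localhost"):
        print(url, "->", probe(url))
```

A numeric status means the server answered; None means nothing is listening at that address yet.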

Warning

⚡ Most browsers display a warning for, or block, self-signed certificates.

3.5.2. Maintenance of docker-compose¶

A Docker container keeps the application in the state it was in when built. Therefore, if you change any files in the project, the containers have to be rebuilt for the changes to appear in the application:

$ docker-compose up --build

To delete the Solr data, run:

$ docker-compose down --volumes

This will remove any persisted data. You must then rerun solr create_core (see the previous section) to recreate the Solr cores.

3.5.3. Modifying the datasets¶

The datasets are all defined in the file tests/data/records.json. This file can be modified to add, delete and modify datasets. After saving the file, rebuild and restart docker-compose.
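Because a malformed file would only surface at import time, it can be worth checking that the edited file still parses as JSON before rebuilding. The following is a minimal sketch: the helper check_json is hypothetical, and it makes no assumption about the catalogue's record schema beyond the file being valid JSON.

```python
import json
from pathlib import Path


def check_json(path) -> int:
    """Parse the file at path as JSON and return the number of top-level entries.

    Raises json.JSONDecodeError (with line and column) if the file is malformed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the top level is a list or an object; other types count as 1.
    return len(data) if isinstance(data, (list, dict)) else 1


if __name__ == "__main__":
    records = Path("tests/data/records.json")
    if records.exists():
        print(f"{records}: valid JSON, {check_json(records)} top-level entries")
    else:
        print(f"{records} not found; run this from the base directory of the code")
```

This only verifies syntax. Whether each record has the fields the importer expects is still determined by import_entities.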

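First, stop all the containers by pressing CTRL+C in the terminal where docker-compose is running, or by running the following from a second terminal:

$ docker-compose stop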

Then rebuild and restart the containers:

$ docker-compose up --build

Finally, reindex the datasets using:

(local) $ docker-compose exec web /bin/bash

(web container) $ python manage.py import_entities Json dataset

Tip

⚡ To leave the container shell, press “CTRL+D” or type “exit” in the terminal.

3.6. Single Docker deployment¶

In some cases, you might not want Solr and Nginx to run (for example, if there are multiple instances of the Data Catalog running). In that case, build and run only the web application image:

(local) $ docker build . -t "data-catalog"
(local) $ docker run --name data-catalog --entrypoint "gunicorn" -p 5000:5000 -t data-catalog -t 600 -w 2 datacatalog:app --bind 0.0.0.0:5000

3.7. Manual deployment¶

If you’d rather not use Docker and prefer to compile and run the data catalogue manually, please follow the instructions in the README file.

3.8. Conclusion¶

This recipe provides a step-by-step guide to deploying the IMI data catalogue, developed at the University of Luxembourg as part of IMI FAIRplus, to a local system.

3.8.1. What should I read next?¶

3.9. References¶

3.10. Authors¶

Authors

Name	Affiliation	Contribution
Danielle Welter	University of Luxembourg	Writing - Original Draft
Valentin Grouès	University of Luxembourg	Writing - Original Draft
Wei Gu	University of Luxembourg	Writing - Review & Editing
Venkata P. Satagopam	University of Luxembourg	Writing - Review & Editing
Philippe Rocca-Serra	University of Oxford	Writing - Review & Editing
