The Ingredients of Good ELN Maintenance are Context Dependent
Good maintenance is important for the safety and sustainability of your ELN or research platform, but what counts as good maintenance depends on the environment and purpose of that platform. An ELN that runs on a computer in a small lab and cannot be reached from the public internet has different security requirements than an ELN or research platform hosted in the cloud or with a hosting provider. In most cases, however, the following features are important for good maintenance.
- Appropriate security
- Regular backups of the data
Some additional features that are desirable in many cases include the following.
- Regular backups of the entire ELN or research platform
- A development or staging environment that would allow development or testing of new plugins
Open Source Provides Potential Advantages for Data Exchange
For a research project, the collected data is often the most critical part of an ELN or research platform. Therefore, a key practice of good hygiene for an ELN or research platform is backing up the data. This is usually an advantage of an open source platform like WordPress for an ELN, since the backup process can be more transparent and the data can often be exported in several formats if desired. This is often not the case for proprietary platforms.
However, even with open source platforms, finding the best approach to exporting data can be challenging. Although there are some standard approaches for WordPress coding, plugin developers are free to take different approaches to storing, exporting, and importing data. So, if the ELN uses several different plugins when storing research data, then it may be challenging to keep all of the data backed up in a consistent manner. This is why it is preferable to use one plugin to manage all of the data and associated files if possible. This is the approach that we take with the Directories Pro plugin, as described below.
Importance of Keeping Data Linked with Files, Images, and Videos
If you are using an ELN with multiple types of media (e.g., text, document files, images, videos), then there is the additional challenge of not only backing up the different types of media but also keeping them correctly linked to each other. This is not inherently difficult, but it should be planned carefully before the study if possible. For backups, it is not enough to back up all of the data to a spreadsheet and then copy all of the images, videos, and other linked files without any thought about filenames, file paths, and so forth. The data in the spreadsheet, for example, needs to stay linked to the appropriate images, videos, and files.
As a simple example, suppose a researcher is collecting data about the migration habits of zebras. The basic data about each zebra could be kept in a spreadsheet or table in the ELN, and the filenames of the images and videos of an individual zebra could be stored in that same spreadsheet or table. However, two image files may share the same filename and be distinguished only by their directory paths. Situations like this should be taken into account when performing backups so that the links between records and files are not lost. Ideally, the backup of data should also be as automated as possible while still keeping the link information intact, since manual input of data is usually more error prone.
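The duplicate-filename situation can be checked automatically. The following is a minimal sketch in Python, not part of any plugin: it records every media file by its path relative to a media folder, together with a checksum, and reports filenames that occur in more than one directory. The function names and the folder layout are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def build_manifest(media_root):
    """Map each file's path (relative to media_root) to its SHA-256 digest."""
    root = Path(media_root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def duplicate_basenames(manifest):
    """Return {filename: [relative paths]} for filenames used more than once."""
    by_name = defaultdict(list)
    for rel in manifest:
        by_name[Path(rel).name].append(rel)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

If the spreadsheet stores the relative path rather than the bare filename, a record can still be matched to exactly one file after a restore, and the checksum shows whether that file was changed.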
Directories Pro Keeps Data Associations Intact and Encapsulated
A significant advantage that we found in using the Directories Pro plugin for all of our data collection needs is that it keeps files associated with the appropriate data records and encapsulates all of the data and files in one ZIP file for export or import. The plugin's export function creates a ZIP file with all of the data and keeps the images and other files linked to their data records. It also has a restore feature that allows the ELN to restore the data from the ZIP file while keeping the links to the images and files intact. Keeping all data and associated files in one ZIP file simplifies data management significantly. We have successfully exported and imported over 1500 data points using this approach. One behavior that we noted, though, is that the import was more successful when the number of records to import was set to a value greater than the number of data points than when the import was attempted in smaller pieces. In other words, in our case we had to set the number of imports to more than 1500 in order to get all of the data points to import.
Regular Backups of the Entire WordPress Website
The basic backup of a WordPress website is straightforward in the sense that we can copy the code with the Linux command “sudo cp -R ...” and the database with the MySQL command “sudo mysqldump ...”. This produces files that allow us to recreate the website, as long as we use the same URL for the website. However, the WordPress restriction of using the same URL as the original site is limiting in many ways. It makes it difficult to recreate, test, or compare the backup without moving many files around.
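The same two steps can be scripted so that each backup lands in its own timestamped folder. The sketch below is one possible Python version and makes several assumptions: the site directory, database name, and backup folder are hypothetical parameters, the mysqldump client is installed, and the database credentials come from a MySQL option file such as ~/.my.cnf rather than from the command line.

```python
import shutil
import subprocess
from datetime import datetime
from pathlib import Path


def mysqldump_command(db_name, dump_file):
    """Build the mysqldump argument list; credentials come from the MySQL option file."""
    # --single-transaction gives a consistent dump for InnoDB tables without locking them.
    return ["mysqldump", "--single-transaction", f"--result-file={dump_file}", db_name]


def backup_site(site_dir, db_name, backup_root, run_dump=True):
    """Copy the WordPress files and dump the database into a timestamped folder."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"backup-{stamp}"
    shutil.copytree(site_dir, target / "files")  # plays the role of `cp -R`
    if run_dump:
        subprocess.run(mysqldump_command(db_name, target / "database.sql"), check=True)
    return target
```

A backup made this way has the same limitation as the manual commands: the dump still contains the original site URL.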
As an aside, changing the URL for a backup of a Drupal site is straightforward since the Drupal URL is normally only used in the settings.php file. With WordPress, however, the URL is often stored by different plugins and possibly in other places. Therefore, if we want a WordPress backup that we can easily test to make sure it is valid, we need a specialized plugin that can modify the database fields that contain the URL. The plugin that we chose for making backups that can be restored to a different URL is the Duplicator Pro plugin from SnapCreek. The use of the Duplicator Pro plugin is well documented by its creator.
Overall Backup Procedure
For backing up an ELN site, one approach is to back up the site and then back up the data in the ELN separately. Backing up the data separately allows us to use the data easily in separate analysis tools. For example, since the data export separates the data values in a CSV file from the images, videos, and other files, the raw data values in the CSV file can be used directly with statistical and other analytical tools. This procedure of separate backups for the site and the data is described below.
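As an illustration of using the exported values in an analysis tool, the following Python sketch reads one numeric column from a CSV file and computes simple summary statistics with the standard library. The column name passed in is hypothetical and depends on the fields defined in your own ELN.

```python
import csv
import statistics


def column_summary(csv_path, column):
    """Return count, mean, and sample standard deviation for one numeric column."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        # Blank cells are skipped; non-numeric cells raise ValueError.
        values = [
            float(row[column])
            for row in csv.DictReader(handle)
            if row[column].strip()
        ]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```

For example, column_summary("zebras.csv", "distance_km") would summarize a hypothetical distance_km field across all exported records.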
The Duplicator Pro plugin allows us to choose which files and which database tables of a WordPress website to back up. In our case, we chose to back up the entire database and all of the WordPress files except for selected files and subdirectories in the wp-content directory. In the wp-content directory, we chose not to back up the cache subdirectory, the debug.log file, and the uploads subdirectories that represent a year, such as 2022 or 2023. The uploads subdirectories named by year contain mostly the files that are associated with the Directories Pro data, so those files are backed up when we back up the Directories Pro data separately.
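The exclusion rules above can be written down as a small filter, which is useful for checking a file list against the intended backup scope. This Python sketch is only an illustration of the rules as we described them, not the Duplicator Pro implementation; the function name is hypothetical, and paths are assumed to be relative to the WordPress root with forward slashes.

```python
import re
from pathlib import PurePosixPath

YEAR_DIR = re.compile(r"^\d{4}$")  # matches directory names such as 2022 or 2023


def is_excluded(rel_path):
    """Return True if a path relative to the WordPress root is left out of the site backup."""
    parts = PurePosixPath(rel_path).parts
    if len(parts) < 2 or parts[0] != "wp-content":
        return False  # everything outside wp-content is backed up
    if parts[1] == "cache":
        return True
    if parts[1:] == ("debug.log",):
        return True
    # uploads/<year>/... holds the files that are exported separately
    return len(parts) >= 3 and parts[1] == "uploads" and bool(YEAR_DIR.match(parts[2]))
```

Under these rules, wp-content/uploads/2023/05/zebra.jpg is excluded, while wp-content/uploads/custom/logo.png and wp-config.php are kept.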
Note that the Duplicator Pro plugin provides a sophisticated scan step that displays large files that may be difficult to back up successfully on certain platforms. It also provides a simple process to exclude any of those large files and rescan. If any of those large files are excluded, though, the restored site may have problems unless the files are restored in some other way. That is why we chose to back up all of the files except those mentioned above.
After we backed up the essential parts of the website using the Duplicator Pro plugin, we also made a backup of the data and associated files from the Directories Pro plugin. This is done with the Export step for the directory that we created. We selected all of the fields to be exported and then set the formats in the following step. We used the default formats with two exceptions: for the Date and time format we used the Formatted date/time string with a format of Y-m-d, and for the taxonomy we used Title instead of slugs. This produces a .zip file that contains the data values in a CSV file along with the associated images and other files in .zip files within the main .zip file.
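Before relying on an export, it can help to confirm that the archive contains the expected number of records. The sketch below uses Python's standard zipfile and csv modules to count the data rows in each CSV file and to list the nested .zip files. It assumes only the layout described above (CSV files plus nested .zip files inside one main .zip file) and that the first CSV row is a header; the function name is hypothetical.

```python
import csv
import io
import zipfile


def summarize_export(zip_path):
    """Count CSV data rows and list nested archives inside an export ZIP."""
    summary = {"csv_rows": {}, "nested_zips": []}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            lower = name.lower()
            if lower.endswith(".csv"):
                text = archive.read(name).decode("utf-8-sig")
                rows = list(csv.reader(io.StringIO(text, newline="")))
                # Assumes the first row is a header, so it is not counted.
                summary["csv_rows"][name] = max(len(rows) - 1, 0)
            elif lower.endswith(".zip"):
                summary["nested_zips"].append(name)
    return summary
```

Comparing the row count with the number of records shown in the ELN gives a quick check that nothing was dropped, for instance before an import of more than 1500 data points.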
To restore the site to another URL, we set up the new URL with a simple index.php file. Then we copied over the installer.php file and the ZIP archive that Duplicator Pro had created for our original site. In our case, we also created a new database beforehand. During the first step of the restoration process, we set the database name, login, and password. After the restoration, we were able to log in with the site fully restored. We even had the images and attached documents for the data. That was because we were using the WP Offload Media plugin to store assets on a DigitalOcean Spaces volume, so those files were served from there rather than from the site's local files.