To minimize the risk of data loss if the data center that hosts XRay’s Solr cluster fails, you must periodically copy the data in Solr to a separate data center. You can do this with the Backup and Restore features of the XRay Database Maintenance Utility, a command-line utility that you run on a schedule to back up the necessary XRay Solr collections. (See the XRay Database Maintenance Utility Guide for details.)
By default, the utility backs up all non-reference XRay Solr collections. (Reference XRay Solr collections are those whose names start with “jkoolref”. There’s no need to back these up, as they are initialized as part of XRay installation setup and are not modified by XRay.) Rather than running the backup utility with the default setting, however, it is recommended that you configure it to back up only specific collections. This approach lets you run multiple instances of the utility, each backing up a specific group of XRay Solr collections at its own interval, depending on your requirements.
There are three classes of XRay Solr collections (in addition to the reference ones):
- Administration collections (those whose collection name starts with “jkooladmin”).
- Definition collections, which hold XRay object definitions, like Sets and Triggers. These collections are updated as a result of user interactions.
- Streaming-data collections, which hold records generated from processing streamed data. These collections include:
Although the jkool.logs collection is not updated as a direct result of streaming, much of the activity that occurs in XRay is logged here, and it can grow quite large. So for backup purposes, it should be included with streaming-data collections, if needed.
At any given time, there must be only one active backup running per collection. Keep in mind that backups of larger collections may take some time. Therefore, deciding how often to run the backups is a tradeoff between how much data loss you’re willing to accept in the event of a disaster and how long the backups will take. The general recommendation is to define two separate CRON jobs (see About CRON below) that run at different times. Because the two jobs share a common log file, schedule them so that their executions do not overlap:
- One for the administration and definition collections that would run the backups synchronously, twice daily. The Database Maintenance Utility would be run as follows:
jkool-db-maint.sh -backup -src:http://<solr-host>:8983 -f:/XrayBackups -name:XrayProd -tables:jkooladmin.registeredusers,jkooladmin.organization,jkooladmin.repositories,jkooladmin.accesstokens,jkooladmin.volumes,jkool.actions,jkool.dictionaries,jkool.inputdatarules,jkool.mlmodel,jkool.providers,jkool.scripts,jkool.sets,jkool.triggers,jkool.views,jkool.viewtemplates
- One for the streaming-data collections that would run the backups asynchronously, once weekly, preferably during a time of low streaming volume. The Database Maintenance Utility would be run as follows:
jkool-db-maint.sh -backup -src:http://<solr-host>:8983 -f:/XrayBackups -name:XrayProd -async -tables:jkool.activities,jkool.datasets,jkool.events,jkool.relationships,jkool.resources,jkool.snapshots,jkool.sources
The above examples omit the following Solr collections, since this information is not critical for XRay:
If maintaining this information is important for your situation, you can back up these collections along with the streaming-data collections (example 2 above), since they have the potential to grow quite large.
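When collections are added to or removed from a backup group, the long -tables list can be easier to maintain if it is assembled from a readable list. The following is a minimal sketch, not part of the utility: join_tables and backup_command are hypothetical helper names, and the flags and collection names are copied from the weekly example above. It only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: build the -tables argument from a readable list of collections.
# join_tables and backup_command are hypothetical helpers for illustration.

# Join all arguments with commas: a b c -> a,b,c
join_tables() (
  IFS=,
  printf '%s\n' "$*"
)

# Streaming-data collections, as listed in the weekly example above.
STREAMING_TABLES="jkool.activities jkool.datasets jkool.events
jkool.relationships jkool.resources jkool.snapshots jkool.sources"

# Print (do not run) the asynchronous backup command.
# Usage: backup_command <solr-url> <collection>...
backup_command() (
  src=$1
  shift
  printf 'jkool-db-maint.sh -backup -src:%s -f:/XrayBackups -name:XrayProd -async -tables:%s\n' \
    "$src" "$(join_tables "$@")"
)
```

For example, `backup_command "http://<solr-host>:8983" $STREAMING_TABLES jkool.logs` prints the weekly command with jkool.logs appended to the list. STREAMING_TABLES is deliberately left unquoted there so that the shell splits it into one argument per collection.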
The synchronous backups in example 1 above will log the success or failure of the backup to the log file. The asynchronous ones only log that they are started, so the log files do not indicate whether they were successful. You must monitor the status of these manually, using jkool-db-maint.sh -status. See the XRay Database Maintenance Utility Guide for details.
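One way to keep a record of these manual checks is to wrap the status command so that its output is appended, with a timestamp, to a separate file. This is a sketch under assumptions: MAINT_CMD, STATUS_LOG, and log_backup_status are hypothetical names, the output format of -status is not assumed, and the status command may need additional arguments that are described in the XRay Database Maintenance Utility Guide.

```shell
#!/bin/sh
# Sketch: record the output of the status check with a timestamp.
# MAINT_CMD and STATUS_LOG are hypothetical settings for this example.
MAINT_CMD=${MAINT_CMD:-jkool-db-maint.sh}
STATUS_LOG=${STATUS_LOG:-/tmp/xray-backup-status.log}

log_backup_status() {
  stamp=$(date '+%Y-%m-%d %H:%M:%S')
  # Capture stdout and stderr together, and remember the exit code.
  if output=$("$MAINT_CMD" -status 2>&1); then
    rc=0
  else
    rc=$?
  fi
  # Append one timestamped entry per check.
  printf '[%s] exit=%s\n%s\n' "$stamp" "$rc" "$output" >> "$STATUS_LOG"
  return "$rc"
}
```

Calling `log_backup_status` some time after the asynchronous job has started leaves an entry that can be reviewed later. The function returns the exit code of the status command, so a caller can react to a non-zero result.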
CRON is the process that runs scheduled jobs. These jobs are defined in a file named “crontab.” Refer to the links below for more information.
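As an illustration of the two-job schedule recommended earlier, the sketch below shows a small guard that skips a run while another one is still in progress, followed by sample crontab entries in comments. Everything named here is an assumption made for the example: run_exclusive, LOCK_DIR, the script paths, and the times of day. The guard relies on the fact that creating a directory with mkdir is atomic.

```shell
#!/bin/sh
# Sketch: run a command only if no other run currently holds the lock.
# LOCK_DIR and run_exclusive are hypothetical names for this example.
LOCK_DIR=${LOCK_DIR:-/tmp/xray-backup.lock}

run_exclusive() {
  # mkdir fails if the directory already exists, so only one caller wins.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "another backup appears to be running; skipping" >&2
    return 75
  fi
  "$@"
  rc=$?
  rmdir "$LOCK_DIR"
  return "$rc"
}

# Sample crontab entries (minute hour day-of-month month day-of-week command).
# Paths and times are placeholders:
#   0 6,18 * * *  /path/to/backup-definitions.sh   # twice daily, 06:00 and 18:00
#   0 2 * * 0     /path/to/backup-streaming.sh     # weekly, Sunday at 02:00
```

Each scheduled script would call, for example, `run_exclusive jkool-db-maint.sh -backup ...` with the arguments shown in the examples above. Two limitations apply: a run that is killed leaves the lock directory behind until it is removed by hand, and for -async backups the lock covers only the launch of the utility, not the backup itself.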