Content Staging with Apache Solr

We all love test and staging environments, but they become a problem when Solr is integrated in your project and you have a test, a staging and a production Solr core. Re-indexing a site is not a big problem, but what if you want to go live and switch indexes immediately?

Drupal.org has wrestled with this problem, so I wanted to show you how they do it.

Create three Solr cores:

- mysite_core1 (will be used for tests)
- mysite_core2 (will be used for staging initially)
- mysite_core3 (will be used for production initially)

Make sure your staging site indexes as if it were the production site. There are a number of tricks you can apply.

Site hash

Make sure both site hashes are exactly the same; otherwise the production site won't correctly find the content that was indexed from staging.

  variable_set('apachesolr_site_hash', 'mysite_custom_hash');

Base URL

When Solr indexes content, it will use absolute URLs. To trick Solr, you could uncomment $base_url in settings.php on staging and set it to reflect production.

  $base_url = 'http://www.example.com'; // NO trailing slash!

Alter the documents that are being sent

Instead of the base_url approach, you could also hardcode the fields you want to change. Using hook_apachesolr_index_documents_alter(), you could rewrite any of the staging URLs to reflect production URLs. csevb10 created something similar but used a different hook. Below is more or less the same code, using the documents_alter hook.

  /**
   * Implements hook_apachesolr_index_documents_alter().
   */
  function my_module_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type, $env_id) {
    $start_url = variable_get('apachesolr_url_switch_start_url', 'http://staging');
    $end_url = variable_get('apachesolr_url_switch_end_url', 'http://production');

    // Fields that may contain absolute staging URLs.
    $elems = array(
      'site',
      'url',
      'content',
      'teaser',
    );
    foreach ($documents as $id => $document) {
      foreach ($elems as $elem) {
        // Only rewrite fields that are actually set on this document.
        if (isset($document->{$elem})) {
          $documents[$id]->{$elem} = str_replace($start_url, $end_url, $document->{$elem});
        }
      }
    }
  }
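To see what the replacement does without a Drupal site at hand, here is a minimal standalone sketch. The function name rewrite_staging_urls() and the sample document are made up for illustration, and a plain array stands in for the Solr document object.

```php
<?php
/**
 * Hypothetical helper: rewrite staging URLs to production URLs in the
 * given fields. A plain array stands in for a Solr document here.
 */
function rewrite_staging_urls(array $fields, $start_url, $end_url, array $keys) {
  foreach ($keys as $key) {
    // Skip fields this document does not carry.
    if (!isset($fields[$key])) {
      continue;
    }
    // str_replace() accepts a string or an array of strings as subject,
    // so multi-valued fields are handled as well.
    $fields[$key] = str_replace($start_url, $end_url, $fields[$key]);
  }
  return $fields;
}

// Sample data, invented for this example.
$doc = array(
  'url' => 'http://staging/node/1',
  'content' => 'See <a href="http://staging/about">about</a>.',
  'label' => 'http://staging is not in the field list',
);
$keys = array('site', 'url', 'content', 'teaser');
$result = rewrite_staging_urls($doc, 'http://staging', 'http://production', $keys);
```

Only the listed fields are touched, and fields that are missing from the document stay missing. Keep in mind that this is a plain string replacement: a start URL of 'http://staging' would also match 'http://staging2.example.com', so pick a start URL that is specific enough.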

Create an update function and let it set the correct index position

  function my_module_update_7001() {
    // I am assuming your environment is called "core".
    // Mark all content to be reindexed so our table is in sync with the
    // transferred nodes.
    apachesolr_index_node_solr_reindex('core');
    // Fill in your current apachesolr_index_last value from staging.
    $staging_index_last = array();
    variable_set('apachesolr_index_last', $staging_index_last);
    // Fill in your current apachesolr_index_updated value from staging.
    $staging_updated = array();
    variable_set('apachesolr_index_updated', $staging_updated);
  }

Switching the core

Once you move to production, make sure you switch the cores, so that mysite_core2 becomes production and mysite_core3 becomes staging. You could also copy the complete data dir from Solr to your third core, but that might not be automatic enough for some of you.
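One possible way to do the switch on the Solr side is the CoreAdmin SWAP action, which exchanges the names of two cores. I am not claiming this is what Drupal.org does; it is a sketch under assumptions. The helper name solr_swap_url() is made up, and the Solr base URL and the default /admin/cores admin path are assumptions about your setup.

```php
<?php
/**
 * Hypothetical helper: build the CoreAdmin URL that swaps two Solr cores.
 * Assumes the default admin path (/admin/cores) below the Solr base URL.
 */
function solr_swap_url($solr_base, $core, $other) {
  $params = array(
    'action' => 'SWAP',
    'core' => $core,
    'other' => $other,
  );
  // Pass '&' explicitly so the result does not depend on php.ini settings.
  $query = http_build_query($params, '', '&');
  return rtrim($solr_base, '/') . '/admin/cores?' . $query;
}

// The base URL is an assumption; adjust it to your Solr server.
$url = solr_swap_url('http://localhost:8983/solr/', 'mysite_core2', 'mysite_core3');
// When you are ready to switch, request $url, for example with
// file_get_contents($url) or curl.
```

Note that SWAP exchanges the core names rather than the site configuration: the index that was built in mysite_core2 is afterwards served under the name mysite_core3, so the Solr URLs configured in Drupal can stay as they are. If you would rather point the production site at mysite_core2, change the environment URL in the site configuration instead.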

Enjoy