We all love test and staging environments, but they become a problem once Solr is integrated into your project and you have a test, a staging and a production Solr core. Re-indexing a site is not a big problem, but what if you want to go live and switch indexes immediately?

Drupal.org has wrestled with this problem, so I wanted to show you how they do it.

Create three Solr cores

  • mysite_core1 (will be used for tests)
  • mysite_core2 (will be used for staging initially)
  • mysite_core3 (will be used for production initially)

Make sure your staging site indexes as if it were the production site. There are a number of tricks you can apply.

Site hash

Make sure both site hashes are exactly the same; otherwise the production site won't correctly find the indexed content.

<?php
variable_set('apachesolr_site_hash', 'mysite_custom_hash');

Base URL

When content is indexed in Solr, absolute URLs are used. To trick Solr, you could set the base URL in settings.php to reflect production.

<?php
$base_url = 'http://www.example.com';  // NO trailing slash!

Alter the documents that are being sent

Instead of the base_url approach, you could also hardcode the fields you want to change. Using hook_apachesolr_index_documents_alter(), you could rewrite any of the staging URLs to reflect production URLs. csevb10 created something similar but used a different hook. Below is more or less the same code, using the documents_alter hook.

<?php
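function my_module_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type, $env_id) {
  // Replace "my_module" with the name of your module. $documents is taken by
  // reference so the changes reach the indexer.
  $start_url = variable_get('apachesolr_url_switch_start_url', 'http://staging');
  $end_url = variable_get('apachesolr_url_switch_end_url', 'http://production');

  // Fields that may contain staging URLs.
  $elems = array(
    'site',
    'url',
    'content',
    'teaser',
  );
  foreach ($documents as $id => $document) {
    foreach ($elems as $elem) {
      // Skip fields this document does not have.
      if (isset($document->{$elem})) {
        $documents[$id]->{$elem} = str_replace($start_url, $end_url, $document->{$elem});
      }
    }
  }
}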
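
To see what the replacement does without a running Drupal site, here is a minimal stand-alone sketch. It uses plain stdClass objects in place of the Solr document objects, and rewrite_urls() is a hypothetical helper made up for this illustration, not part of the apachesolr module. The URLs are placeholders.

```php
<?php
// Stand-alone sketch: stdClass stands in for the Solr document objects.
// rewrite_urls() is a hypothetical helper, not an apachesolr function.
function rewrite_urls(array $documents, $start_url, $end_url, array $fields) {
  foreach ($documents as $document) {
    foreach ($fields as $field) {
      // Skip fields this document does not carry.
      if (isset($document->{$field})) {
        $document->{$field} = str_replace($start_url, $end_url, $document->{$field});
      }
    }
  }
  return $documents;
}

$doc = new stdClass();
$doc->url = 'http://staging/node/1';
$doc->content = 'See <a href="http://staging/about">about</a>.';

$fields = array('site', 'url', 'content', 'teaser');
$result = rewrite_urls(array($doc), 'http://staging', 'http://production', $fields);

print $result[0]->url . "\n";  // prints http://production/node/1
```

Fields that are missing on a document (here site and teaser) are simply left alone.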

Create an update function and let it set the correct index position

<?php
function my_module_update_7001() {
  // The reindex function lives in apachesolr.index.inc, which may not be loaded yet.
  module_load_include('inc', 'apachesolr', 'apachesolr.index');
  // I am assuming your environment is called "core".
  // Mark all content to be reindexed so our table is in sync with the transferred nodes.
  apachesolr_index_node_solr_reindex('core');
  // Fill in your current apachesolr_index_last from staging.
  $staging_index_last = array();
  variable_set('apachesolr_index_last', $staging_index_last);
  // Fill in your current apachesolr_index_updated from staging.
  $staging_updated = array();
  variable_set('apachesolr_index_updated', $staging_updated);
}

Switching the core

Once you move to production, make sure you switch the cores, so that mysite_core2 becomes production and mysite_core3 becomes staging. You could also copy the complete data dir from Solr to your third core, but that might not be automatic enough for some of you.
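
If your Solr setup exposes the CoreAdmin API, its SWAP action is one possible way to exchange two cores in a single request. This is only a sketch and not necessarily how Drupal.org does it. It just builds the request URL; the host, the port and the helper name solr_swap_url() are assumptions, so adjust them to your installation.

```php
<?php
// Hypothetical helper: builds a Solr CoreAdmin SWAP request URL.
function solr_swap_url($solr_base, $core, $other) {
  $query = http_build_query(array(
    'action' => 'SWAP',
    'core' => $core,
    'other' => $other,
  ));
  // Avoid a double slash if the base URL ends with one.
  return rtrim($solr_base, '/') . '/admin/cores?' . $query;
}

// Assumed Solr location; replace it with your own.
$url = solr_swap_url('http://localhost:8983/solr', 'mysite_core2', 'mysite_core3');
print $url . "\n";
```

Keep in mind that SWAP exchanges the core names: after the request, a site that points at mysite_core3 sees the index that staging built in mysite_core2, without any change to the Drupal configuration.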

Enjoy