December 15, 2014
Drupal

Apache Solr and Drupal - Part I: Set up Apache Solr to enhance Drupal search


Today most websites have search functionality. With the help of Apache Solr, the time spent waiting for search results can be radically reduced. In this article we are going to set up a basic search infrastructure on a *nix-based system.


This tutorial shows a quick and easy way to set up a Solr search infrastructure powered by Jetty on an Ubuntu 14, Debian 7 or Red Hat Enterprise Linux 7 server. Jetty is a pure Java web server and servlet container that ships with Solr, so you do not need to install a third-party servlet container. Apache Solr needs Java Runtime Environment 1.7 or higher. Check your system:
 

java -version
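
If you would rather script this check, here is a minimal sketch. The helper name java_version_ok is made up for this example, and it assumes that java -version prints a quoted version string such as "1.7.0_71".

```shell
# Hypothetical helper: succeeds when a version string such as "1.7.0_71"
# denotes Java 1.7 or newer.
java_version_ok() {
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # second numeric component
  # Refuse anything that is empty or not purely numeric
  case "$major" in ''|*[!0-9]*) return 1 ;; esac
  case "$minor" in ''|*[!0-9]*) return 1 ;; esac
  [ "$major" -gt 1 ] || [ "$minor" -ge 7 ]
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr, so redirect it before parsing
  installed="$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')"
  if java_version_ok "$installed"; then
    echo "Java $installed is new enough for Solr"
  else
    echo "Java $installed is too old (or could not be parsed)"
  fi
else
  echo "java was not found on the PATH"
fi
```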


If you don't have Java, you can install it from a package...
 

# RHEL
sudo yum install java-1.7.0-openjdk
# Debian/Ubuntu
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-set-default


...restart the terminal or reload your configuration...
 

source ~/.profile
# or
source ~/.bashrc


...and give Java a try:
 

echo $JAVA_HOME


If the output is empty, add the following lines to ~/.profile or ~/.bashrc:
 

# RHEL
JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71-2.5.3.1.el7_0.x86_64/jre
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME
export PATH
# Debian/Ubuntu
JAVA_HOME=/usr/lib/jvm/java-7-oracle
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME
export PATH


If you cannot find Java's home folder, try locating it with
 

readlink -f "$( which java )"
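
The readlink output is the path of the java binary itself, not the home folder. As a rough sketch, the folder can be derived by cutting off the trailing /bin/java. The helper name java_home_from_binary is invented for this example.

```shell
# Hypothetical helper: strip the trailing /bin/java from a resolved binary path.
java_home_from_binary() {
  printf '%s\n' "${1%/bin/java}"
}

if command -v java >/dev/null 2>&1; then
  # Follow symlinks (for example /etc/alternatives) to the real binary
  resolved="$(readlink -f "$(command -v java)")"
  echo "JAVA_HOME candidate: $(java_home_from_binary "$resolved")"
fi
```

Depending on the package, the result may end in /jre (as in the RHEL example above) or point at the JDK folder. Either should work as long as bin/java exists below it.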


...and on Debian/Ubuntu install the daemon package for the Solr service:
 

# On Debian/Ubuntu only
sudo apt-get install daemon


After these prerequisites are installed, it's time to install and configure Apache Solr. All Solr versions can be found in the archive: http://archive.apache.org/dist/lucene/solr/. I recommend using 4.3.1 or a higher version on a Drupal 7 site. First download and decompress Solr:
 

# tar -C needs an existing target folder
mkdir -p /home/drupal/solr431
wget https://archive.apache.org/dist/lucene/solr/4.3.1/solr-4.3.1.tgz
tar -xzf solr-4.3.1.tgz -C /home/drupal/solr431
mv /home/drupal/solr431/solr-4.3.1/* /home/drupal/solr431/
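
Before going on, it may be worth checking that the files landed where the later steps expect them. The sketch below only looks for example/start.jar and example/multicore, the two paths used further down in this tutorial. The helper name solr_install_ok is made up.

```shell
# Hypothetical helper: succeeds when the folder looks like the unpacked Solr tree
# used in this tutorial (Jetty launcher plus the multicore example).
solr_install_ok() {
  [ -f "$1/example/start.jar" ] && [ -d "$1/example/multicore" ]
}

if solr_install_ok /home/drupal/solr431; then
  echo "Solr files are in place"
else
  echo "start.jar or the multicore folder is missing under /home/drupal/solr431"
fi
```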


In case you are as lazy as I am, you will also need a service which can be easily managed like any other (httpd, mysql, memcached etc.) on your system.
 


1. RHEL7

Download the Jetty service startup script,
 

sudo curl -k -o /etc/init.d/solr https://cheppers.com/sites/default/files/attachments/jetty_0.txt
sudo chmod 0755 /etc/init.d/solr


...change references from Jetty configuration to Solr,
 

sudo perl -pi -e 's/\/default\/jetty/\/sysconfig\/solr/g' /etc/init.d/solr


...set up variables by inserting the lines below (if you previously did not set JAVA_HOME, uncomment the first line in your configuration),
 

sudo vi /etc/sysconfig/solr
# JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71-2.5.3.1.el7_0.x86_64/jre
JAVA_OPTIONS="-Djetty.port=8984 -Dsolr.solr.home=/home/drupal/solr431/example/multicore $JAVA_OPTIONS"
JETTY_HOME=/home/drupal/solr431/example
JETTY_USER=drupal
JETTY_LOGS=/var/log/solr


...and finally enable the Solr service:
 

sudo chkconfig solr on


2. Debian/Ubuntu


Get the Jetty service startup script,
 

sudo curl -k -o /etc/init.d/solr https://cheppers.com/sites/default/files/attachments/solr_0.txt
sudo chmod 0755 /etc/init.d/solr


...edit the following part of the script you have just downloaded (set the install folder, the port and the Solr home):
 

start () {
  echo -n "Starting solr..."
  # Start the daemon
  daemon --chdir='/home/drupal/solr431/example' --command "/usr/bin/java -Djetty.port=8984 -Dsolr.solr.home=multicore -jar start.jar" --respawn --output=/var/log/solr/solr.log --name=solr --verbose
  RETVAL=$?


...and enable the Solr service (if we reboot our system, it will start automatically):
 

sudo update-rc.d solr defaults


With the last command you start the service. From now on you can manage it with the usual service commands, and Solr itself from its admin page:

sudo service solr start

http://yoursite.com:8984/solr
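
If you have no browser at hand, Solr's CoreAdmin STATUS request can serve as a quick check from the shell. This is only a sketch: the two helper names are invented, and it assumes curl is installed and the port is 8984 as configured above.

```shell
# Hypothetical helper: build the CoreAdmin STATUS URL for a host and port.
solr_status_url() {
  printf 'http://%s:%s/solr/admin/cores?action=STATUS\n' "$1" "$2"
}

# Hypothetical helper: print the HTTP status code Solr answers with.
solr_http_code() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$(solr_status_url "$1" "$2")"
}

# Example call on the Solr server itself:
#   solr_http_code localhost 8984
```

A 200 response means Solr is up and answering. Anything else suggests the service did not start or the port is blocked.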

Hint: if you are using an Amazon instance, don't forget to open that port!


Because we want to use Solr with Drupal, copy the Solr configuration files from the Search API Solr Search module into your core:

 

cp -Rf /your/drupal/site/folder/sites/all/modules/contrib/search_api_solr/solr-conf/4.x/* /home/drupal/solr431/example/multicore/core0/conf/
sudo service solr restart
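
A quick way to confirm the copy worked is to look for the two files every Solr core needs, schema.xml and solrconfig.xml. The sketch below uses the multicore folder from the install steps above, and the helper name core_conf_ok is made up.

```shell
# Hypothetical helper: succeeds when a core's conf folder holds the two
# files Solr cannot load a core without.
core_conf_ok() {
  [ -f "$1/schema.xml" ] && [ -f "$1/solrconfig.xml" ]
}

conf_dir=/home/drupal/solr431/example/multicore/core0/conf
if core_conf_ok "$conf_dir"; then
  echo "Core configuration found in $conf_dir"
else
  echo "schema.xml or solrconfig.xml is missing in $conf_dir"
fi
```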


And that's all folks! If you need other Solr cores (for example, you have several staging sites which need different cores), you can copy an existing core into a new folder next to it, configure Solr to load the core when it (re)starts and set up the new one on the admin page:

 

cp -R /home/drupal/solr431/example/multicore/core0/ /home/drupal/solr431/example/multicore/othercore/
vi /home/drupal/solr431/example/multicore/solr.xml


Add a new line inside the <cores> element, below the existing cores:
 

<core instanceDir="othercore" name="othercore"></core>


Then add the new core on the admin page.
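
Solr also exposes core management over its CoreAdmin HTTP API, so the same step can be sketched from the shell. The helper name solr_create_core_url is invented, and the host, port and core name are the ones used in this tutorial. Whether you need the call at all depends on whether the core was already loaded from solr.xml after a restart.

```shell
# Hypothetical helper: build a CoreAdmin CREATE URL. The core name doubles as
# the instance folder, as with "othercore" above.
solr_create_core_url() {
  printf 'http://%s:%s/solr/admin/cores?action=CREATE&name=%s&instanceDir=%s\n' \
    "$1" "$2" "$3" "$3"
}

# Example call on the Solr server itself:
#   curl "$(solr_create_core_url localhost 8984 othercore)"
```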

If you need separate Solr services on your server (one for your development sites, one for your production sites etc.), just copy the Solr folder and the init script, set a new home folder and a new port in the config and start the new service as well:
 

cp -R /home/drupal/solr431/ /home/drupal/solr431_live/
sudo cp /etc/init.d/solr /etc/init.d/solr_live
# Edit the config file (on Ubuntu/Debian the init script, on RHEL the also copied /etc/sysconfig/solr_live configuration file)
sudo update-rc.d solr_live defaults
sudo service solr_live start
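
The edit of the copied config can be sketched with sed. The helper name retarget_solr_config and the port 8985 are assumptions made for this example. The function only rewrites the -Djetty.port value and the /home/drupal/solr431/ paths used in this tutorial.

```shell
# Hypothetical helper: point a copied init script or sysconfig file at a new
# port and a new Solr folder. Arguments: file, new port, new Solr folder.
retarget_solr_config() {
  file="$1"; port="$2"; new_home="$3"
  sed -e "s/-Djetty\.port=[0-9][0-9]*/-Djetty.port=${port}/" \
      -e "s#/home/drupal/solr431/#${new_home}/#g" \
      "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&   # overwrite in place to keep the 0755 mode
    rm -f "$file.tmp"
}

# Example call from a root shell (8985 is an arbitrary free port):
#   retarget_solr_config /etc/init.d/solr_live 8985 /home/drupal/solr431_live
```

On Debian/Ubuntu the copied script probably also needs its own --name and --output values, so that the two daemons do not share a name or a log file. That part is left out of the sketch.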


 


In our next blog post one of my colleagues will show you how you can set up Solr to search in uploaded files as well, so don't forget to check back later!
