December 22, 2014
Drupal

Apache Solr and Drupal - Part II: How to set up Drupal and Solr to search in attachments


What do you do when you need to search in files as well? For a recent project I had to enable users to search the content of attached files, mainly in .pdf format. Apache Solr combined with Tika seemed to be a good solution.


This guide is based on Drupal 7 with Search API 7.x-1.1.3, Solr search 7.x-1.6, Search views 7.x-1.13, Search API attachments 7.x-1.4 and Views 7.x-3.8.
 

  1. Install Solr
    If you haven't installed Solr yet, check our blog post on how to easily set up a basic Solr service on your *nix system, or read the official installation instructions.
     
  2. Install Drupal modules
    Install and enable the following modules: Search API, Solr search, Search views, Search API attachments and Views.
  3. Download Tika
    Download the Tika app .jar file (tika-app-1.6.jar at the time of this post) and copy it to
     
    sites/all/libraries/tika

    or, if you build your site as an install profile, copy it to
     
    profiles/{my_profile}/libraries/tika

    Make sure that you have the Java JDK installed. If you use Ubuntu like I do, you can read the "Installing default JRE/JDK" section here for further info. Important: once you have downloaded the .jar file, you may need to adjust its permissions.
     
  4. Set up Search API
    Once you are done, go to
     
    admin/config/search/search_api
     
    • Add a new server.
    • Add an index to the newly created server.
    • On the Filters tab, enable File attachments.
    • On the Fields tab, select the fields to be indexed.
    • Open the Search API Attachments tab, select the Tika extraction method and fill in the Tika Extraction Settings section. Save the configuration when you are done.
       
  5. Create a view
    Go to
     
    admin/structure/views

    and add a new view.
    • From the Show list, select the index you created earlier.
    • For the display format, you might select Rendered entity.
    • Click Continue & edit.
    • In the Filter Criteria section, set up your filter.
    • Check "Expose this filter to visitors" and select the Searched fields from the list.
       

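If no attachment text shows up in the index, it can help to run the Tika app from step 3 by hand before digging into the Drupal settings. The following is only a minimal sketch in Python: the function names, the Drupal root and the sample file are hypothetical, and it assumes that `java` is on the PATH. It builds the two library paths mentioned in step 3 and calls the Tika app with its `--text` option, which prints the extracted plain text.

```python
import os
import subprocess


def tika_jar_candidates(drupal_root, jar_name="tika-app-1.6.jar", profile=None):
    """Return the library locations from step 3 where the Tika jar may live."""
    paths = [os.path.join(drupal_root, "sites", "all", "libraries", "tika", jar_name)]
    if profile:
        # Install-profile layout: profiles/{my_profile}/libraries/tika
        paths.append(
            os.path.join(drupal_root, "profiles", profile, "libraries", "tika", jar_name)
        )
    return paths


def build_tika_command(jar_path, document_path):
    """Build the Tika app command line; --text prints the extracted plain text."""
    return ["java", "-jar", jar_path, "--text", document_path]


def extract_text(jar_path, document_path):
    """Run Tika on one document and return the extracted text."""
    # A missing or unreadable jar is the permission problem step 3 warns about.
    if not os.access(jar_path, os.R_OK):
        raise PermissionError("Tika jar is missing or not readable: " + jar_path)
    result = subprocess.run(
        build_tika_command(jar_path, document_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise if java or Tika exits with an error
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text()` with one of the paths returned by `tika_jar_candidates()` and a sample .pdf should return the document text. Running it as the same system user as your web server is the more telling test, because that user is the one who has to be able to read the .jar file.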
And basically that's it. Congratulations, you have just set up a full text search for attached files.
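
To double-check the result outside Drupal, you can also ask Solr directly whether a word that occurs only inside an attachment is found. This is a rough sketch, not part of the module setup: the base URL is an assumption (adjust it to your Solr core), the function names are made up for illustration, and whether a bare term matches the attachment text depends on the default search field of your Solr configuration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(solr_base_url, term, rows=5):
    """Build a URL for Solr's select handler that returns JSON."""
    params = urlencode({"q": term, "rows": rows, "wt": "json"})
    return solr_base_url.rstrip("/") + "/select?" + params


def count_matches(solr_base_url, term):
    """Return how many indexed documents Solr reports for the term."""
    # rows=0 asks only for the hit count, not for the documents themselves.
    with urlopen(build_select_url(solr_base_url, term, rows=0)) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["response"]["numFound"]
```

For example, `count_matches("http://localhost:8983/solr", "invoice")` returns the number of hits for a word you know appears in one of the indexed files. A count of zero after reindexing suggests that the extraction step, rather than the view, needs another look.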
