Garrett Bluma

Web Developer

Joomla Sphinx Search Plugin

Tuesday, February 16th, 2010

Beware, this is beta quality software. Use at your own risk.

As many of us already know, Joomla doesn’t have relevance-based searching. It’s a shame, since everything else is designed well; the problem lies in the implementation. This article explains how to set up Sphinx to index your Joomla articles and how to use my plugin to search them from within Joomla.

In my own experiments I built a plugin that accomplished this. It works splendidly, and I had every intention of releasing the code to the public, but got distracted. The process is currently unrefined, but it works. I’m a fan of revision anyway, so if you see a way to improve this article, hit me up (email).

So why doesn’t Joomla have relevant searching already? Unfortunately, it doesn’t seem like they had much choice about it. Joomla relies on MySQL. MySQL is only now getting full-text search mechanisms, and those that do exist are spotty. For example, full-text indexes are only available with the MyISAM storage engine and not at all with InnoDB. Plus, I’ve heard some complaints that it isn’t even terribly good at its own job.

A system like Sphinx can go beyond what MySQL supports anyway. It can use word stems and substitutions, so I could specify CPP to be used interchangeably with C++ in any search query and it doesn’t get confused about plural or singular versions of words. It also supports weights, so I can tell Sphinx to prefer matches found in titles and move them higher in the search results.

Install & Configure Sphinx

There are a number of good articles on the web about this. I may revisit how to install Sphinx in detail, but for now I’ll defer to the following links:

  • Sphinx Homepage: Obviously the first place to start looking is the Sphinx homepage. Good documentation on Sphinx operation and internals.
  • Build a custom search engine with PHP: IBM has a great article on using Sphinx in-the-field. This article describes setup of Sphinx as well as testing and integration with PHP.

Here is the configuration I’ve been using for my Joomla setup. It is basically the same as the configuration in IBM’s article, but adapted to Joomla. There are a few things you’ll need to edit, though:

  • exampleuser: This should be the username Sphinx uses to connect to your database.
  • examplepassword: This should be the password for the username just described.
  • exampledb: This is the database Sphinx will be indexing.
  • Check your paths. I’m making some assumptions here. Your Sphinx data should be outside of your web root and readable/writable by whatever user searchd is running as.
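
Since it is easy to miss one of these edits, a quick check for leftover placeholders may help. This is only a sketch and not part of the plugin: the function name leftover_placeholders is made up for illustration, and it simply greps for the three placeholder strings used in the config below.

```shell
#!/bin/sh
# Sketch: print config lines that still contain this article's placeholder
# values. The function name is hypothetical (not part of Sphinx or the plugin).
leftover_placeholders() {
    # -n prefixes each match with its line number;
    # grep exits 0 only if at least one placeholder is still present
    grep -nE 'exampleuser|examplepassword|exampledb' "$1"
}

# Usage:
# leftover_placeholders /var/www/website/sphinx/config.conf \
#     && echo "config still contains placeholder values"
```

If the function prints nothing, none of the three placeholders remain in the file.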

/var/www/website/sphinx/config.conf:


source mywebsite_articles
{
    type            = mysql
    sql_host        = localhost
    sql_user        = exampleuser
    sql_pass        = examplepassword
    sql_db          = exampledb
    sql_sock        = /var/run/mysqld/mysqld.sock
    sql_port        = 3306

    # indexer query
    # document_id MUST be the very first field
    # document_id MUST be positive (non-zero, non-negative)
    # document_id MUST fit into 32 bits
    # document_id MUST be unique

    sql_query = \
        SELECT `id`, `title`, `alias`, `introtext`, `fulltext`, `created`, `modified`, `hits` \
        FROM jos_content ;

    # document info query
    # ONLY used by search utility to display document information
    # MUST be able to fetch document info by its id, therefore
    # MUST contain '$id' macro
    #

    sql_query_info = \
        SELECT * \
        FROM jos_content  \
        WHERE id=$id
}

index mywebsite_articles
{
    source                  = mywebsite_articles
    path                    = /var/www/website/data
    morphology              = none

    min_word_len            = 3
    min_prefix_len          = 0
    min_infix_len           = 3
}

searchd
{
    port          = 3312
    log           = /var/www/website/sphinx/search.log
    query_log     = /var/www/website/sphinx/query.log
    pid_file      = /var/www/website/sphinx/searchd.pid
}

Testing Sphinx

How do we know if Sphinx is set up correctly? We test it!

First we need to generate the index. So fire up your terminal and run the following (assuming your sphinx install lives in /usr/local/sphinx):


/usr/local/sphinx/bin/indexer \
--config /var/www/website/sphinx/config.conf \
--all --rotate

Second we need to start the search daemon. This will process search requests.


/usr/local/sphinx/bin/searchd \
--config /var/www/website/sphinx/config.conf

If you get any errors, I’d recommend going back and reviewing the configuration, re-running the indexer, and then starting searchd again.
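
It can also help to confirm that the daemon actually stayed up after starting. The following is a rough sketch rather than anything shipped with Sphinx: it reads the PID from the pid_file path set in the searchd section of the config and checks whether that process is alive. The function name searchd_is_running is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the PID recorded in searchd's pid_file belongs to a
# live process. The function name is hypothetical.
searchd_is_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1    # no pid file: not started, or wrong path
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether we could signal the PID.
    # It also fails if the process belongs to another user, so run this as
    # the user that started searchd (or as root).
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Usage:
# searchd_is_running /var/www/website/sphinx/searchd.pid && echo "searchd is up"
```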

If we don’t have any errors, we can run a test outside of Joomla to see if everything is working correctly. For example:

/usr/local/sphinx/bin/search \
--config /var/www/website/sphinx/config.conf \
"Hello world"

Install & Configure Joomla Plugin

You can download the plugin here (Download) or at the bottom of this article.

Sphinx Plugin Upload

Enable the plugin. Be sure to turn off other search plugins while you’re at it. It is much easier to tell where your search results come from when only one plugin is active.

Enable Plugin

Configure Sphinx Plugin

If you are hosting Sphinx on the same server as your website, you’ll probably want to keep these settings similar to what I have. Otherwise, be sure to fill in the appropriate fields here and configure any firewalls that sit between the web server and the Sphinx server.

You will also want to be careful to use the same index name that is defined in your config file. In my example I’m using mywebsite_articles. This is what I fill in for Resource Name.
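
If you are unsure which name to enter, the index names can be read straight out of the config file. The snippet below is a small sketch that assumes each index is declared on its own line as "index NAME", as in the config above; the function name list_index_names is made up for this example.

```shell
#!/bin/sh
# Sketch: print the name of every index declared in a Sphinx config file.
# Assumes declarations of the form "index NAME" on a line of their own.
# The function name is hypothetical.
list_index_names() {
    # Match only lines whose first word is exactly "index", so "source ..."
    # lines and settings inside the blocks are ignored.
    awk '$1 == "index" && NF >= 2 { print $2 }' "$1"
}

# Usage:
# list_index_names /var/www/website/sphinx/config.conf
```

For the config in this article, the only name printed should be the one declared on the index line, which is the value to use for Resource Name.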

Test Sphinx Plugin

The only thing left to do now is to actually test the thing. Go to your website and do a search!

Completing the solution

There are a few things still missing from this setup that I want you to be aware of.

First, if you don’t update your index periodically, you will only see the content that existed when you first indexed. One solution is to set up a scheduled task (or cron job) to re-index your content. Notice as well that I put --rotate at the end of my indexing operation. With that flag, the indexer builds the new index alongside the old one and signals the running searchd to switch over once it is finished. You don’t need to restart your searchd process, just update the index.
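
A scheduled re-index could look roughly like the sketch below. It reuses the indexer command from the testing section and adds a simple lock so that two runs cannot overlap. The function name reindex, the lock directory, and the script path in the usage note are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: re-index all Sphinx indexes, skipping the run if a previous one
# is still in progress. Paths follow the ones used in this article; the
# function name and lock directory are hypothetical.
INDEXER=/usr/local/sphinx/bin/indexer
CONFIG=/var/www/website/sphinx/config.conf
LOCKDIR=/tmp/sphinx-reindex.lock

reindex() {
    # mkdir either creates the directory or fails, so it works as a
    # simple lock against overlapping runs
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "reindex already running, skipping" >&2
        return 1
    fi
    "$INDEXER" --config "$CONFIG" --all --rotate
    status=$?
    rmdir "$LOCKDIR"
    return $status
}
```

Saved as a script that ends with a call to reindex (say /var/www/website/sphinx/reindex.sh, a made-up path), it could be run hourly with a crontab entry such as "0 * * * * /var/www/website/sphinx/reindex.sh". The hourly interval is arbitrary; choose one that matches how often your content changes.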

Second, you should be aware that if your server restarts you’ll need searchd to start automatically. An init script would work great, even something as simple as this:

/etc/init.d/searchd


#! /bin/sh
NAME=sphinx
DAEMON=/usr/local/sphinx/bin/searchd
CONFIG=/var/www/website/sphinx/config.conf
[ -x "$DAEMON" ] || exit 0
case "$1" in
  start)
    echo "Stopping any running daemons..."
    $DAEMON --config $CONFIG --stop
    echo "Starting sphinx search daemon..."
    $DAEMON --config $CONFIG
    ;;
  stop)
    echo "Stopping sphinx search daemon..."
    $DAEMON --config $CONFIG --stop
    ;;
  *)
    echo "Usage: $NAME {start|stop}"
    exit 3
    ;;
esac
:


  • Earx
    Hi!

    Thank you for the wonderful plugin.

    There is one small issue: if no results are found, an error is displayed.

    To fix it, open the {joomla_root}\plugins\search\sphinxContent.php file and, at the end of the plgSearchSphinxContent function, change

    if (!empty($ids)) {
    ..
    }

    to

    if (!empty($ids)) {
    ..
    // return results back to Joomla
    return $results;
    }

    return array();
  • Thanks for the feedback. I'll add that in once I get back into the office.
  • This is very nice and detailed explanation of sphinx for joomla .
  • drmehdi
    hi
    Thanks for your great effort. Because of the development of CCKs like K2 for Joomla, some of us need a solution for Sphinx. Could you work on it and, if necessary, release some commercial plugins? I will pay for it. Please contact me if you can.