In Elastic Path 6.1.1, we added support for search server clustering, which greatly improved scalability and reliability. In our testing since then, we've found that large-scale deployments that cluster the search server can benefit from certain optimizations to index replication and search request distribution. In 6.1.2, we also upgraded to Solr 1.3, which brought overall performance gains, including faster indexing. Finally, we've updated the replication scripts to use the new snapshot "check" functionality, which makes them more efficient by not duplicating index data that hasn't changed.
Master Index Replication Interval
How often you run the snapshot script depends on how often the slave machines need updated indexes. You also need to consider the performance impact on the system, which is driven directly by the size of your indexes and the frequency of updates. If you have large indexes, consider a longer interval between snapshots. If the indexes need to be updated frequently, you may require a shorter interval. Keep in mind that frequent index replication across slave servers can affect performance. You may want to experiment to find the interval that best balances update frequency against performance.
The snapshooter.ep script is responsible for taking a snapshot of the current master index. There are two ways to run it:
- By running a cron job at a predefined interval (this is the default)
- By using Solr's postCommit hook, which fires the script after a commit is complete.
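As a rough illustration of the cron approach, the sketch below builds a crontab entry that runs the snapshot script every minute. It assumes that snapshooter.ep.start is the entry point cron should call and reuses the placeholder install path from the listener example; adjust both to match your deployment.

```shell
#!/bin/sh
# Placeholder install path; substitute your own search server location.
SOLR_BIN="/path/to/searchserver/WEB-INF/solrHome/bin"

# Crontab fields: minute hour day-of-month month day-of-week command.
# Five asterisks mean "every minute".
CRON_LINE="* * * * * ${SOLR_BIN}/snapshooter.ep.start"

# Print the entry for review before installing it.
printf '%s\n' "$CRON_LINE"

# To install it for the current user without clobbering existing entries:
#   (crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE") | crontab -
```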
Typically, we recommend using a cron job, unless you need to replicate your indexes more often than once a minute or you've changed how often your Quartz indexers check for new objects to index (the default is every 5 seconds). With the recent changes from the Solr 1.3 scripts, the snapshooter.ep script now compares the previous snapshot files with the current index files and duplicates an index only if it has changed. Because of this, you can set a short interval without the overhead of duplicating unchanged indexes. For example, in a cron job, you can run the script every minute.
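To make the "snapshot only if changed" idea concrete, here is a minimal sketch of one way such a check could work. It is not the actual snapshooter.ep code: the data directory layout, the snapshot.TIMESTAMP naming, and the function names are assumptions for illustration, and a plain recursive copy stands in for whatever copy mechanism the real script uses.

```shell
#!/bin/sh
# Sketch only: directory layout and "snapshot.*" naming are assumptions.

# Exit status 0 means a new snapshot is needed.
snapshot_needed() {
    data_dir=$1
    # Most recent snapshot directory, if any (timestamped names sort
    # chronologically, so the last one in reverse order is the newest).
    previous=$(ls -d "$data_dir"/snapshot.* 2>/dev/null | sort -r | head -n 1)
    # No previous snapshot: always take one.
    if [ -z "$previous" ]; then
        return 0
    fi
    # Needed only if the previous snapshot differs from the live index.
    ! diff -rq "$previous" "$data_dir/index" >/dev/null 2>&1
}

take_snapshot_if_changed() {
    data_dir=$1
    if snapshot_needed "$data_dir"; then
        name="$data_dir/snapshot.$(date +%Y%m%d%H%M%S)"
        # Plain copy for portability in this sketch.
        cp -R "$data_dir/index" "$name"
        echo "snapshot taken: $name"
    else
        echo "index unchanged, skipping snapshot"
    fi
}
```

Calling take_snapshot_if_changed twice in a row against an unchanged index copies data only on the first call, which is why a one-minute interval adds little overhead.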
The advantage of using the postCommit hook is that you have much more control over when a snapshot is taken, because the script runs only when an actual change is made to an index. A snapshot can be taken post-commit, post-optimize, or both. To use postCommit replication, do the following:
- Modify snapshooter.ep.start to accept an argument that specifies which index you want to replicate (instead of having them all replicate with one call). For example, you would specify snapshooter.ep.start -i product to create only a product index snapshot.
- Update the Solr config file for that specific index so that a snapshot is taken after each commit to it. For example, to replicate the product index on post-commit, edit WEB-INF/solrHome/conf/product.config.xml and add the following:
product.config.xml Example

    <!-- A postCommit event is fired after every commit or optimize command -->
    <listener event="postCommit" class="solr.RunExecutableListener">
      <str name="exe">/path/to/searchserver/WEB-INF/solrHome/bin/snapshooter.ep.start</str>
      <str name="dir">/path/to/searchserver/WEB-INF/solrHome/bin</str>
      <bool name="wait">true</bool>
      <arr name="args"><str>-i</str><str>product</str></arr>
      <arr name="env"></arr>
    </listener>
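The first step above, teaching snapshooter.ep.start to accept an index argument, could be handled with standard getopts parsing. The sketch below is hypothetical: the parse_indexes helper and the list of index names are illustrative assumptions, not the contents of the shipped script.

```shell
#!/bin/sh
# Hypothetical sketch of option handling for snapshooter.ep.start.
# The index names below are illustrative assumptions.
ALL_INDEXES="product category customer"

parse_indexes() {
    indexes=$ALL_INDEXES   # default: snapshot every index
    OPTIND=1               # reset so the function can be called repeatedly
    while getopts "i:" opt; do
        case $opt in
            i) indexes=$OPTARG ;;
            *) echo "usage: snapshooter.ep.start [-i index]" >&2; return 1 ;;
        esac
    done
    echo "$indexes"
}

# Example: only the product index is selected.
for index in $(parse_indexes -i product); do
    echo "would snapshot index: $index"
done
```

Falling back to every index when -i is omitted keeps an existing cron entry working unchanged while the postCommit listener passes -i product.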
Note: In 6.1.2, postCommit replication now works properly. Solr 1.3 included a fix for an issue that caused each commit call to be run twice. Before the fix, using a postCommit snapshot would produce multiple snapshots, wasting CPU time and disk space.
Slave Index Installation Interval
snappuller.ep keeps track of the last index it pulled, so it will not pull down indexes from the master that have already been pulled. As a result, this script can be run at fairly regular intervals. snapinstaller.ep likewise keeps track of the last index it installed and will not attempt to re-install indexes that have already been installed. Therefore, like snappuller.ep, it can be run at fairly regular intervals with little overhead.
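The pattern that makes frequent runs cheap can be sketched as follows. This is an illustration of the general "remember the last one handled" technique, not the actual snappuller.ep or snapinstaller.ep code; the status file, the install_if_new function, and the snapshot names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: the status file and snapshot names are hypothetical.

# install_if_new STATUS_FILE SNAPSHOT_NAME
# Records the last snapshot handled and skips repeats.
install_if_new() {
    status_file=$1
    snapshot=$2
    last=""
    if [ -f "$status_file" ]; then
        last=$(cat "$status_file")
    fi
    if [ "$snapshot" = "$last" ]; then
        echo "skip $snapshot (already installed)"
        return 0
    fi
    # The real pull or install work would happen here.
    printf '%s\n' "$snapshot" > "$status_file"
    echo "installed $snapshot"
}
```

When the latest snapshot name matches the recorded one, the run ends after a single file read, so scheduling the scripts often costs very little.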
HTTP Search Request Distribution over Master/Slave(s)
Although the master search server is fully capable of handling search requests from the storefront and CM client, we typically recommend that only the slave nodes handle search requests. Keep in mind that index replication isn't instantaneous. There's a delay between the time the master search server indexes a new object and the time it becomes searchable on the slaves. When requests go only to the slaves, this delay is consistent and all objects appear at the same time on all nodes. Sending searches only to the slaves also reduces the load on the master server, letting it focus on continuous indexing.
To take advantage of these changes, be sure to download the 6.1.2 search server cluster/failover scripts from the downloads area at http://grep.elasticpath.com/docs/DOC-1278 (requires login).
