Technical Blog

3 Posts tagged with the clustering tag

Clustering Quartz Jobs

Posted by Darren Pendery Sep 19, 2011

Overview

The standard Elastic Path Quartz jobs are tied to the JVM and are not distributed.  If configured on multiple servers they will be executed on multiple servers simultaneously.  The challenge is that jobs which update entities should not be executed on multiple servers simultaneously due to risks of concurrent access issues.

This can be alleviated by tying them to a specific server when deploying to production.  However, this introduces a single point of failure and is a risk to the reliability and scalability of the system.

To resolve this issue, use the persisted scheduler implementation available in the Quartz framework, which uses database tables to persist the job trigger schedules and to control which servers execute which jobs.

It is rather simple to implement this.

  1. Place the attached RowLockSemaphore and JobStoreTX classes into the core library of your development environment
    • These are necessary to resolve a known issue with the locking logic in Quartz 5.1.
  2. Place the attached AbstractProcessorJob class into the core library of your development environment
  3. Extend AbstractProcessorJob for each Quartz job to be distributed
  4. Place the Quartz SQL script appropriate for your RDBMS into your development environment
    • This can be found in the Quartz distribution
  5. Add a Spring module property that specifies the JDBC data source name (usually jdbc/epjndi)
  6. Configure the scheduler factory as well as trigger and job beans in your quartz.xml file

 

 

Warning:  You should rename the packages in the Java files to suit your project and to avoid conflicts if these files are ever included in the EP code base.

 

 

 

 

 

 

The following is an example of a persisted scheduler factory in a quartz.xml file:

 

 

 

<bean id="myPersistedSchedulerFactory" class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
  <property name="applicationContextSchedulerContextKey"><value>applicationContext</value></property>
  <property name="quartzProperties">
    <props>
      <prop key="org.quartz.scheduler.instanceName">ClusteredScheduler-cmserver</prop>
      <prop key="org.quartz.scheduler.instanceId">AUTO</prop>
      <!-- ThreadPool -->
      <prop key="org.quartz.threadPool.class">org.quartz.simpl.SimpleThreadPool</prop>
      <prop key="org.quartz.threadPool.threadCount">10</prop>
      <prop key="org.quartz.threadPool.threadPriority">5</prop>
      <!-- Job store -->
      <prop key="org.quartz.jobStore.misfireThreshold">30000</prop>
      <prop key="org.quartz.jobStore.class">com.customer.ep.quartz.JobStoreTX</prop>
      <prop key="org.quartz.jobStore.driverDelegateClass">org.quartz.impl.jdbcjobstore.StdJDBCDelegate</prop>
      <prop key="org.quartz.jobStore.useProperties">true</prop>
      <prop key="org.quartz.jobStore.dataSource">myDataSourceName</prop>
      <prop key="org.quartz.jobStore.isClustered">true</prop>
      <prop key="org.quartz.jobStore.clusterCheckinInterval">20000</prop>
      <prop key="org.quartz.jobStore.selectWithLockSQL">UPDATE {0}LOCKS SET LOCK_NAME = ? WHERE LOCK_NAME = ?</prop>
      <!-- Configure Plugin -->
      <prop key="org.quartz.plugin.shutdownhook.class">org.quartz.plugins.management.ShutdownHookPlugin</prop>
      <prop key="org.quartz.plugin.shutdownhook.cleanShutdown">true</prop>
      <!-- Datasource: the name segment must match the value of org.quartz.jobStore.dataSource -->
      <prop key="org.quartz.dataSource.myDataSourceName.jndiURL">${ep.quartz.datasource}</prop>
      <prop key="org.quartz.dataSource.myDataSourceName.validationQuery">select 0 from dual</prop>
    </props>
  </property>
  <property name="triggers">
    <list>
      <ref bean="customJobTrigger"/>
    </list>
  </property>
</bean>

 

 

Place this scheduler configuration in each application that will act as the container for the clustered jobs.

 

Warning:  Refer to the section below on multiple clustered schedulers when you need separate sets of clustered jobs.

 

 

 

Distributing a Job Bean

Distributed Quartz job beans are serialized and persisted in the Quartz database tables.  As a result, the Spring injection mechanism will not work for these beans.  Therefore, job beans must retrieve any beans they use from the bean factory directly.  The typical MethodInvokingJobDetailFactoryBean configuration will not work.
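The serialization constraint can be illustrated without Quartz or Spring at all. The following is a minimal sketch using plain Java serialization; ImportService, InjectedJobState, LookupJobState and the BEAN_FACTORY map are hypothetical stand-ins invented for this illustration, not classes from Elastic Path, Spring or Quartz. It shows that state holding a direct reference to a non-serializable service cannot be persisted, while state holding only a bean name can be stored, restored and resolved on whichever node executes the job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a container-managed service; deliberately NOT Serializable.
class ImportService {
    String launch() { return "import launched"; }
}

// Job state with an "injected" service reference: cannot be written to a job store.
class InjectedJobState implements Serializable {
    private static final long serialVersionUID = 1L;
    final ImportService service;
    InjectedJobState(ImportService service) { this.service = service; }
}

// Job state that carries only the bean name: safe to persist.
class LookupJobState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String beanName;
    LookupJobState(String beanName) { this.beanName = beanName; }
}

class SerializedJobSketch {
    // Hypothetical stand-in for the bean factory available on the executing node.
    static final Map<String, Object> BEAN_FACTORY = new HashMap<>();
    static { BEAN_FACTORY.put("importJobProcessor", new ImportService()); }

    // Serialize the state the way a persistent store would have to.
    static byte[] persist(Object state) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        return bytes.toByteArray();
    }

    // True if the state can be serialized, false if it drags in a non-serializable object.
    static boolean canPersist(Object state) {
        try {
            persist(state);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Store the state, restore it, then look the service up by name and call it.
    static String runAfterRoundTrip(LookupJobState state) {
        try {
            byte[] stored = persist(state);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
                LookupJobState restored = (LookupJobState) in.readObject();
                ImportService service = (ImportService) BEAN_FACTORY.get(restored.beanName);
                return service.launch();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canPersist(new InjectedJobState(new ImportService())));
        System.out.println(runAfterRoundTrip(new LookupJobState("importJobProcessor")));
    }
}
```

The lookup-by-name variant is the pattern the rest of this section relies on: the persisted job carries nothing but names, and the executing server resolves them locally.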

The attached AbstractProcessorJob Java class, which extends QuartzJobBean, provides a simple framework that allows extensions to easily get to the Elastic Path bean factory.

Extend this class to implement a custom Quartz job class and override the executeProcess method to implement the job logic.

 

 

Note:  Preferably the job logic itself should be implemented in a separate service bean and the custom job bean just calls the necessary method on that service.

 

The following code snippet shows an extension class that calls importJobProcessor.launchImportJob().

 

 

import org.springframework.context.ApplicationContext;

// AbstractProcessorJob and ImportJobProcessor are imported from your own project packages.
public class ImportProcessorJob extends AbstractProcessorJob {

    @Override
    protected void executeProcess(final ApplicationContext context) {
        try {
            // Retrieve the service from the bean factory by name rather than by injection.
            ImportJobProcessor importJobProcessor =
                    (ImportJobProcessor) context.getBean("importJobProcessor");
            importJobProcessor.launchImportJob();
        } catch (Exception e) {
            // Log the error and handle as appropriate.
        } finally {
            // Any appropriate finally logic.
        }
    }
}

 

Once the job class has been implemented, configure the Quartz trigger for the job to use the new job class.

 

 

<bean id="processImportJobTrigger" class="org.springframework.scheduling.quartz.SimpleTriggerBean">
  <property name="jobDetail">
    <ref bean="processImportJob" />
  </property>
  <property name="startDelay" value="10000" />
  <property name="repeatInterval" value="5000" />
  <property name="group" value="MflClusteredScheduler-cmserver"/>
</bean>

<bean name="processImportJob" class="org.springframework.scheduling.quartz.JobDetailBean">
  <property name="jobClass" value="com.myproject.ep.cmserver.quartz.ImportProcessorJob" />
</bean>

 

This differs from the OOTB configurations, which use Spring's MethodInvokingJobDetailFactoryBean to specify a service class and method.  Because AbstractProcessorJob extends QuartzJobBean, it is already a job bean class, and because it retrieves beans explicitly from the bean factory rather than holding injected references, the job bean can be serialized.

Distributing OOTB Quartz Jobs

The OOTB Quartz jobs in the CM Server are, by default, not distributed.  They should be clustered in a production environment to provide reliability.

To cluster them, use the AbstractProcessorJob described above to implement custom extensions for each CM Server job bean and configure them appropriately.

These include the following jobs in the CM Server quartz.xml:

  • topSeller
  • productRecommendation
  • demoProductRecommendation
  • releaseShipment
  • cleanupOrderLocks
  • processImportJob
  • importJobCleanup
  • staleImportJob
  • cleanupSessions

Multiple Clustered Schedulers

It may sometimes be necessary to set up different Quartz job containers for different sets of clustered jobs.  For example, the EP Connect application may contain a completely different set of jobs than the CM Server.  In these cases it is necessary to configure different persisted schedulers for the different sets.

However, there is a catch:  Quartz requires separate database tables for each persisted scheduler.  Even if you specify different scheduler names and triggers, all triggers from all schedulers are stored in the same database tables, and each scheduler instance will attempt to execute all of them.  For example, a job in the EP Connect scheduler whose class resides in the EP Connect application will cause a ClassNotFoundException when the CM Server scheduler tries to execute it.

To get around this limitation you will need to create separate sets of Quartz tables in your database, one for each persisted scheduler.

Copy the Quartz SQL scripts to create separate script files for each scheduler.  Modify them to change the prefix on each table:  the default prefix is QRTZ_.  The following excerpt is from a scheduler for the CM Server application and specifies "qrtz_cm_" as the prefix:

 

 

CREATE TABLE qrtz_cm_job_details ...

 

A separate scheduler for the EP Connect application might then use "qrtz_connect_".

Add these scripts to your development and deployment processes.
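If you prefer to generate the per-scheduler scripts rather than edit them by hand, the prefix change is a simple text substitution. The following is a minimal sketch, assuming the script text uses the default QRTZ_ prefix in any letter case; TablePrefixSketch and withPrefix are hypothetical names made up for this illustration and are not part of Quartz or Elastic Path.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TablePrefixSketch {
    // The default Quartz table prefix, matched regardless of case.
    private static final Pattern DEFAULT_PREFIX =
            Pattern.compile("qrtz_", Pattern.CASE_INSENSITIVE);

    // Returns the SQL text with every occurrence of the default prefix replaced by newPrefix.
    static String withPrefix(String sql, String newPrefix) {
        return DEFAULT_PREFIX.matcher(sql).replaceAll(Matcher.quoteReplacement(newPrefix));
    }

    public static void main(String[] args) {
        System.out.println(withPrefix("CREATE TABLE qrtz_job_details", "qrtz_cm_"));
        System.out.println(withPrefix("DELETE FROM QRTZ_TRIGGERS", "qrtz_connect_"));
    }
}
```

Because the substitution touches every occurrence, review the generated script before running it, particularly index and constraint names that embed the prefix.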

Then, specify the table prefix in the corresponding scheduler factory configuration in quartz.xml:

<bean id="schedulerFactoryMfl"
  class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
...
<prop key="org.quartz.jobStore.tablePrefix">QRTZ_CM_</prop>
...
</bean>

 

 

 

The prefix is case-insensitive.

 

 

Operations Implications

 

Changes to the Quartz schedules require the scheduler tables to be cleared, because Quartz may not always update the persisted trigger schedules from the XML configuration when the scheduler starts up.

 

This typically impacts new deployments.

Include the attached reset SQL script in your operations manual and add steps to your deployment manuals to execute it.

 

Note:  Be sure to modify the script to include the prefix if you have specified one. Also, it may be advisable to create separate scripts for each persisted scheduler when configuring multiple clustered schedulers.

 

 

 

 


Apache HTTP Server is a very effective tool for caching static content and, if configured properly, can improve performance of your Elastic Path deployment by up to 30%! Furthermore, Apache does a great job of load balancing a cluster of storefront nodes, giving you even more throughput and scalability, without resorting to expensive hardware load balancers. Obviously, Apache will never perform like a hardware load balancer, but it is a little more affordable (read: free). So really, what more can you ask for from an HTTP server?

 

In this post, we'll look at using Apache to load balance our storefront servers. We'll also look at enabling caching of static content at the Apache level, removing a lot of network and CPU load from our application servers and giving a faster load time to browsers. Before we begin, make sure you have the following:

 

  • Apache HTTP Server 2.2.10+ with either JBoss 4.2+ or Tomcat 5.5+ (using Apache with WebLogic is more complicated and requires the use of a specific Oracle-WebLogic Apache plug-in.)
  • An Apache build that includes the following modules: mod_proxy, mod_proxy_ajp, mod_proxy_balancer, mod_cache, mod_disk_cache.

 

Configuring a Proxy and Static Content Cache

Let's start by creating a proxy server and caching static content at the Apache level. This is relatively easy to set up, but important to understand before moving on to load balancing. We'll assume Apache is the front-most facing component to the user's browser. The architecture will look something like the following diagram.

 

ApacheSimple.jpg

 

Let's examine a request working its way through this architecture. A typical first request from a shopper's browser, such as viewing a product page, will flow through Apache (bypassing all caches since they're empty) and arrive at the application server. The application server will gather and serve the necessary HTML and the embedded objects it references (images, js, css, etc). These objects will pass back through Apache and to the user's browser. The key process here, however, is that as these static objects pass back through Apache, Apache will cache them based on their cache control headers.

 

When a request comes in for the same product page (or any request for the same set of static HTML objects), Apache will serve the static objects straight back to the user's browser from its cache. Only the dynamic HTML and other dynamic content will come from the app server. Although a second load of the same page on the same user's browser will already be cached at the user's browser level, it will be very useful for new sessions that have an empty browser cache.

 

Unfortunately, there are a couple of issues we need to think about before we can implement this setup, such as:

 

  • How do we communicate between Apache and the app server?
  • What protocol do we use between Apache and the app server, HTTP or AJP?
  • How do we support Acegi security, which is required by the storefront application servers?

 

Don't worry! We did a fair amount of performance testing to answer these questions, and came up with the following diagram.

 

ApacheProtocols.jpg

 

The key here is that a) we're using AJP between Apache and the app server, a fast binary protocol, and b) we're using two separate AJP connectors on the app server, one non-secure for HTTP traffic and one considered "secure" for HTTPS traffic. This allows Acegi to know that a request is "secure" so that it will not try to redirect endlessly to a secure port (a typical problem we see). I'm putting "secure" in quotes because it's really no different than the insecure channel (it's not encrypted). It simply has additional header information stating it's a secure channel.

 

In order to implement this, there are a number of items to configure such as Apache's mod_proxy and mod_cache, as well as any cache control configuration that needs to be done on the application server.

 

mod_proxy

We need to allow requests that come in to Apache to pass through to the application server and then return to the user. This is done using Apache's mod_proxy module. The full mod_proxy documentation is here: http://httpd.apache.org/docs/2.2/mod/mod_proxy.html. It's a recommended read. We'll also be using the mod_proxy_ajp module for AJP support.

 

The first step is to enable the two AJP connectors on the application server, in server.xml (or jboss-server.xml):

<Connector enableLookups="false" port="8009" protocol="AJP/1.3"/>
<Connector enableLookups="false" port="8010" protocol="AJP/1.3" scheme="https" secure="true"/>

 

Note the secure parameters for port 8010. This fools Acegi into thinking that anything coming over this port with AJP is a secure connection and it will not redirect it.

 

The second step is to ensure Acegi knows it may receive connections over port 80 and its secure mapped port is then 443 (the typical HTTP and HTTPS ports). To do this, we edit the storefront web app's WEB-INF/conf/spring/security/acegi.xml file and add an additional port mapping to the portMapper bean as follows:

    <!-- port # are specified in default.xml -->
    <bean id="portMapper" class="org.acegisecurity.util.PortMapperImpl">
        <property name="portMappings">
            <map>
                <entry key="80"><value>443</value></entry>
                <entry key="8080"><value>8443</value></entry>
            </map>
        </property>
    </bean>

 

 

In the third and final step, we want to configure the HTTP and HTTPS virtual hosts on Apache to listen to ports 80 and 443.


LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

<VirtualHost 10.10.90.54:80>
        ServerName 10.10.90.54
        ProxyPreserveHost On
        ProxyPass /storefront ajp://10.10.90.54:8009/storefront keepalive=On
</VirtualHost>

<VirtualHost 10.10.90.54:443>
        ServerName 10.10.90.54
        # Enable/Disable SSL for this virtual host if you want to terminate SSL here
        ProxyPreserveHost On
        ProxyPass /storefront ajp://10.10.90.54:8010/storefront keepalive=On
</VirtualHost>

 

There's a lot going on here, so let's have a look at the HTTPS:443 virtual host as it's the more complex one here:

  1. Clearly, one would want to configure the virtual hosts to listen on the specific machine's port.
  2. Within here is where we would do any SSL termination before passing the request over AJP to the app server.
  3. "ProxyPreserveHost On" ensures the Host header is maintained as it's passed to the app server. This is required for Elastic Path 6.1 and later to be able to handle multi-store requests.
  4. The ProxyPass directive is the key here. This tells Apache to pass any requests coming in matching /storefront to the app server's AJP connector under /storefront.
  5. There are a large number of options for this directive, including maintaining keepalive, as we've done here.
  6. Note that the storefront server doesn't have to be the localhost. We'll see this later when we begin load balancing.

 

At this point, after restarting the app server and Apache, you should be able to hit Apache on port 80 and pull up your storefront.

mod_cache

Next, we want to cache any static content we can on the Apache side. To do this, we'll use mod_cache, or more specifically mod_disk_cache. There is also mod_mem_cache, which is a memory-based cache, but we've actually found better performance results with mod_disk_cache, and the fact that its cache files persist on disk is a bonus.

 

Adding the httpd.conf directives for a disk cache is fairly straightforward. Let's try the following:

 

CacheEnable disk /storefront/
CacheRoot /var/www/cache
CacheDirLevels 5
CacheDirLength 2
CacheIgnoreHeaders Set-Cookie

 

Looking at the lines in detail:

 

  1. Enable the disk cache on the URL /storefront/.
  2. Specify the cache location on the local disk, in this case /var/www/cache. You'll want to make sure the Apache user can write to that directory.
  3. The number of directory levels in the cache tree structure.
  4. The number of characters in each directory name.
  5. Finally, we specify which headers we DO NOT want to cache. This is essential. If we don't set this for cookies, we will end up getting someone else's session!
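To picture what CacheDirLevels and CacheDirLength do, it helps to see how a hashed cache key is cut up into a directory path. The following is a rough sketch only: it uses a hex MD5 digest of the URL as the hash, which is an assumption made for illustration (Apache uses its own key and encoding), and CachePathSketch and pathFor are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CachePathSketch {
    // Splits a hash of the URL into dirLevels directories of dirLength characters each;
    // whatever remains of the hash becomes the file name.
    static String pathFor(String url, int dirLevels, int dirLength) {
        String hash = md5Hex(url);
        if (dirLevels * dirLength >= hash.length()) {
            throw new IllegalArgumentException("levels * length must be shorter than the hash");
        }
        StringBuilder path = new StringBuilder();
        int pos = 0;
        for (int level = 0; level < dirLevels; level++) {
            path.append(hash, pos, pos + dirLength).append('/');
            pos += dirLength;
        }
        return path.append(hash.substring(pos)).toString();
    }

    // Hex MD5 digest (32 characters); an illustrative choice, not Apache's actual encoding.
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // With CacheDirLevels 5 and CacheDirLength 2: five 2-character directories, then the file.
        System.out.println(pathFor("/storefront/template-resources/css/style.css", 5, 2));
    }
}
```

More levels spread the cached files across more directories, so no single directory grows too large.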

 

At this point, after restarting Apache, we will begin to cache any static objects with cache control headers. In order to expand on what is (or isn't) cached, let's move on to the next section.

 

Cache Control Header Config

Finally, we want the application server, or more specifically, the deployed applications, to tell Apache if there's anything to cache. This is typically done by using cache control headers, such as max-age.

 

For the storefront web application, you can use the Caching Control Filter to add the max-age cache control header to requests for specific types of content (based on URL patterns). The Caching Control Filter configuration is in the storefront's conf/spring/web/filter-config.xml file, in the cachingControlFilter bean definition. The cachingControlEntries list contains bean definitions that represent the URL patterns to test and max-age value to set.

 

The following is an example of caching all /renderImage.image dynamic image calls, all /template-resources/ calls (css, js, etc) and any dynamic content assets under /content/:

 

<bean id="cachingControlFilter"
     class="com.elasticpath.commons.filter.impl.CachingControlFilter">
     <property name="cachingControlEntries">
          <list>
               <bean class="com.elasticpath.commons.filter.impl.CachingControlFilter$CachingControlEntry">
                    <property name="urlPattern">
                         <value>^.*renderImage\.image.*$</value>
                    </property>
                    <property name="maxAge">
                         <value>86400</value>
                    </property>
               </bean>
               <bean class="com.elasticpath.commons.filter.impl.CachingControlFilter$CachingControlEntry">
                    <property name="urlPattern">
                         <value>^.*template-resources.*$</value>
                    </property>
                    <property name="maxAge">
                         <value>86400</value>
                    </property>
               </bean>
               <bean class="com.elasticpath.commons.filter.impl.CachingControlFilter$CachingControlEntry">
                    <property name="urlPattern">
                         <value>^.*content.*$</value>
                    </property>
                    <property name="maxAge">
                         <value>86400</value>
                    </property>
               </bean>
          </list>
     </property>
</bean>
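To see which URLs these entries would actually catch, here is a minimal sketch of the matching logic. It reuses the three patterns and the max-age value from the configuration above, but the class itself is hypothetical: it assumes the first matching entry wins and that the whole URL is tested against each pattern, which may differ from what the real CachingControlFilter does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class CacheControlSketch {
    // One URL pattern paired with the max-age (in seconds) to apply when it matches.
    static final class Entry {
        final Pattern urlPattern;
        final int maxAge;
        Entry(String regex, int maxAge) {
            this.urlPattern = Pattern.compile(regex);
            this.maxAge = maxAge;
        }
    }

    private static final List<Entry> ENTRIES = new ArrayList<>();
    static {
        ENTRIES.add(new Entry("^.*renderImage\\.image.*$", 86400));
        ENTRIES.add(new Entry("^.*template-resources.*$", 86400));
        ENTRIES.add(new Entry("^.*content.*$", 86400));
    }

    // Returns the Cache-Control header value for the URL, or null if no entry matches.
    static String cacheControlFor(String url) {
        for (Entry entry : ENTRIES) {
            if (entry.urlPattern.matcher(url).matches()) {
                return "max-age=" + entry.maxAge;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(cacheControlFor("/storefront/template-resources/css/style.css"));
        System.out.println(cacheControlFor("/storefront/cart.ep"));
    }
}
```

Note that a pattern like ^.*content.*$ matches any URL containing the word "content", not just assets under /content/, so tighten it if you have dynamic pages with that word in the path.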

 

 

Now, after restarting the storefront application and Apache, you should have a fully functioning Apache proxy with proper caching of static content.

 

 

Configuring Load Balancing

Load balancing is an easy extension once our proxy is set up. Essentially, with load balancing, instead of passing the request through to the same machine each time, we pass it to a cluster of machines (two or more) based on a certain algorithm. I recommend reading the complete Apache documentation on mod_proxy_balancer, which is the module we'll use to enable load balancing. It can be found here: http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html

 

Let's first lay out the Apache configuration, adding to our existing VirtualHost entries.

<VirtualHost 10.10.90.54:80>

        ServerName 10.10.90.54

        # ProxyPreserveHost On
        RequestHeader set Host mars.elasticpath.net

        <Proxy balancer://tomcatservers>
                BalancerMember ajp://localhost:9009 route=node1 loadfactor=90
                BalancerMember ajp://10.10.90.51:9009 route=node2 loadfactor=100
                BalancerMember ajp://10.10.90.52:9009 route=node3 loadfactor=100
                BalancerMember ajp://10.10.90.53:9009 route=node4 loadfactor=100
        </Proxy>

        ProxyPass /storefront balancer://tomcatservers/storefront stickysession=JSESSIONID nofailover=Off
        ProxyPass /server-status !

</VirtualHost>

<VirtualHost 10.10.90.54:443>

        ServerName 10.10.90.54

        LogLevel warn
        #CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
        LogFormat "%h %l %u %t \"%r\" %>s %b" common
        CustomLog logs/ssl_access_log common
        ErrorLog logs/ssl_error_log

        #   SSL Engine Switch:
        #   Enable/Disable SSL for this virtual host.
        SSLEngine on
        SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL
        SSLCertificateFile "/usr/local/apache2/conf/server.crt"
        SSLCertificateKeyFile "/usr/local/apache2/conf/server.key"

        #DocumentRoot    "/var/www/html/one"

        # ProxyPreserveHost On
        RequestHeader set HOST mars.elasticpath.net

        SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0

        <Proxy balancer://tomcatservers-ssl>
                BalancerMember ajp://localhost:9010 route=node1 loadfactor=90
                BalancerMember ajp://10.10.90.51:9010 route=node2 loadfactor=100
                BalancerMember ajp://10.10.90.52:9010 route=node3 loadfactor=100
                BalancerMember ajp://10.10.90.53:9010 route=node4 loadfactor=100
        </Proxy>

        ProxyPass /storefront balancer://tomcatservers-ssl/storefront stickysession=JSESSIONID nofailover=off

</VirtualHost>

 

We've seen the VirtualHost entries before, so let's just look at the Proxy balancer configuration in detail. We'll zoom in on this below for the insecure, port 80 connector:

 

...
        <Proxy balancer://tomcatservers>
                BalancerMember ajp://localhost:9009 route=node1 loadfactor=90
                BalancerMember ajp://10.10.90.51:9009 route=node2 loadfactor=100
                BalancerMember ajp://10.10.90.52:9009 route=node3 loadfactor=100
                BalancerMember ajp://10.10.90.53:9009 route=node4 loadfactor=100
        </Proxy>

        ProxyPass /storefront balancer://tomcatservers/storefront stickysession=JSESSIONID nofailover=Off
        ProxyPass /server-status !
...

 

Within the Proxy balancer directive, we've named our balancer (<Proxy balancer://tomcatservers>) and defined our load balancer members, for example BalancerMember ajp://10.10.90.51:9009 route=node2 loadfactor=100. In this case, we have 4 storefronts included, 1 being on the localhost (the same machine as the Apache install). We've opened up AJP port 9009 on these machines and configured all to have an even load factor, except the first node, which has a slightly lower factor to give breathing room for Apache on the same machine.
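To get a feel for what the load factors mean in practice, here is a small simulation of request-count balancing. It is a sketch modeled on the request counting algorithm described in the mod_proxy_balancer documentation, not Apache's code; the class and method names are hypothetical, it ignores sticky sessions and failed members, and it assumes load factors of 90, 100, 100 and 100.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ByRequestsSketch {
    private final String[] routes;
    private final int[] loadFactors;
    private final int[] status; // running balance per member

    ByRequestsSketch(String[] routes, int[] loadFactors) {
        this.routes = routes.clone();
        this.loadFactors = loadFactors.clone();
        this.status = new int[routes.length];
    }

    // Picks the member for the next request: every member's balance grows by its
    // load factor, the highest balance wins, and the winner pays back the total.
    String next() {
        int total = 0;
        int candidate = -1;
        for (int i = 0; i < routes.length; i++) {
            status[i] += loadFactors[i];
            total += loadFactors[i];
            if (candidate < 0 || status[i] > status[candidate]) {
                candidate = i;
            }
        }
        status[candidate] -= total;
        return routes[candidate];
    }

    // Counts how many of the given number of requests each node receives.
    static Map<String, Integer> distribute(int requests) {
        ByRequestsSketch balancer = new ByRequestsSketch(
                new String[] {"node1", "node2", "node3", "node4"},
                new int[] {90, 100, 100, 100});
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < requests; i++) {
            counts.merge(balancer.next(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(distribute(390));
    }
}
```

Over 390 requests, node1 ends up with 90 and each of the other nodes with 100, so the node sharing a machine with Apache carries roughly 10% less traffic than its peers.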

 

In the next directive, ProxyPass /storefront balancer://tomcatservers/storefront, we tell Apache to pass requests for /storefront* to the balancer's /storefront*.  Note that we're also specifying the name of the cookie (stickysession=JSESSIONID) that keeps a session stuck to one storefront node, so subsequent requests return to the same node.

 

The last key setting in here is the route=nodeN on each BalancerMember. This is the name you configure for a node's jvmRoute within the app's server.xml. This allows Apache and the application server to identify which requests will go to which node. Without this setting (and/or the stickysession setting), the user's session may bounce between storefront nodes. This will cause strange behavior, like getting bounced back to the homepage.
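The link between the cookie and the route is easier to see in code. The following is a minimal sketch, assuming the app server appends its jvmRoute to the session ID after a dot (for example 0123ABCD.node2); StickyRouteSketch and routeOf are hypothetical names, and this is an illustration of the idea rather than Apache's implementation.

```java
class StickyRouteSketch {
    // Returns the route suffix of a session ID such as "0123ABCD.node2",
    // or null when the ID carries no route (the balancer is then free to pick any node).
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.indexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(routeOf("0123ABCD.node2"));
        System.out.println(routeOf("0123ABCD"));
    }
}
```

If the suffix doesn't match any route= value in the balancer, there is nothing to stick to, which is why the jvmRoute and BalancerMember names have to agree.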

 

To set the jvmRoute within the server.xml, look for a commented-out line like the following:

 

<!-- You should set jvmRoute to support load-balancing via AJP ie :
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">        
-->  

 

Uncomment this and set jvmRoute to the same value as the route of the corresponding BalancerMember entry (or vice versa). The same configuration is used for the secure connectors, which, in this case, are on port 9010.

 

After restarting Apache, you should be getting load balanced to a specific node in the cluster and stay on that node for subsequent requests. Your HTML assets will also be getting cached at the Apache layer as they pass through the proxy.

 

Now you can cache and load balance storefront servers with Apache HTTP Server. Go ahead and try it. Once you're set up, I would recommend tailing your Apache and app server access logs to watch your requests pass through Apache and your app server and ensure they're using sticky sessions correctly. Increasing the access log level on Apache and the app server to output cookie names/values is handy if you need to debug any sticky session config issues.

 

Some Final Considerations

  • There are some known issues around keep-alive and some older versions of Apache HTTP and Tomcat where the AJP connections between the two will not get released, causing the connection pool to fill and not allow new requests.
  • Consider using Apache's htcacheclean, which runs as a daemon or a one-time job, to control the size of your Apache cache on the disk. If your website has a small, finite number of cacheable HTML objects, this typically isn't a huge issue. On the other hand, if you have many GBs of assets and want to keep your cache to, say, 500 MB, htcacheclean is your tool. See the documentation for full details: http://httpd.apache.org/docs/2.2/programs/htcacheclean.html
  • Test, test, test. Make sure you do proper functional testing on a staging environment to ensure there are no strange redirects or odd behavior after putting another layer between your ecommerce site and the user. And just as importantly, proper performance testing will ensure there are no capacity issues between Apache and the app server. This will allow you to fine tune your connection pools for maximum performance, both on the Apache side and the app server side.

In Elastic Path 6.1.1, we added support for search server clustering, which greatly improved scalability and reliability. In our testing since that time, we've found that large-scale deployments that are clustering the search server can benefit from certain optimizations to index replication and search request distribution. Also, in 6.1.2, we upgraded to Solr 1.3, which brought overall performance gains, including improved indexing performance. Finally, we've updated the replication scripts to use the new snapshot "check" functionality, making the scripts more efficient by duplicating less redundant data.

 

Master Index Replication Interval

The frequency for running the snapshot script depends on how often the slave machines need the updated indexes. Of course, you need to consider the performance impact on the system, and the size of your indexes and the frequency of updates have a direct impact. If you have large indexes, consider a longer interval between snapshots. If the indexes need to be updated frequently, you may require a shorter interval. Keep in mind that frequent index replication across slave servers can affect performance. You may want to experiment to determine the optimal interval that balances your needs for frequency of updates against performance.


The snapshooter.ep script has the responsibility of taking a snapshot of the current master index. There are two ways to run it:

  • By running a cron job at a predefined interval (this is the default)
  • By using Solr's postCommit hook, which fires the script after a commit is complete.


Typically, we recommend using a cron job, unless you need to replicate your indexes more often than once a minute or you've changed the frequency at which your Quartz indexers check for new objects to index (the default is 5 seconds). With recent changes from the Solr 1.3 scripts, the snapshooter.ep script now compares the previous snapshot files with the current index files and will only duplicate an index if it has changed. With this new feature, you can set your interval to be quite frequent without the additional overhead of duplicating unchanged indexes. For example, in a cron job, you can run this every minute.
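The idea behind the snapshot check can be sketched in a few lines. The real snapshooter.ep is a shell script, and the comparison it performs may differ; this Java illustration simply assumes that an index has changed when the file names or file sizes in the index directory differ from those in the previous snapshot. SnapshotCheckSketch and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

class SnapshotCheckSketch {
    // Lists the regular files in a directory as a map of file name to size in bytes.
    static Map<String, Long> describe(Path dir) {
        Map<String, Long> files = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.put(entry.getFileName().toString(), Files.size(entry));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    // A new snapshot is only worth taking when there is no previous snapshot
    // or the current index files differ from it.
    static boolean shouldSnapshot(Map<String, Long> previousSnapshot, Map<String, Long> currentIndex) {
        return previousSnapshot == null || !previousSnapshot.equals(currentIndex);
    }
}
```

With a check like this in front of the copy step, a job that fires every minute does almost no work while the index is idle.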

 

The advantage of using the postCommit hooks is that there is much more control over when you take a snapshot because they're only run when an actual change is made to an index. A snapshot can be taken post-commit, post-optimize, or both. To use postCommit replication, do the following:

 

  1. Modify snapshooter.ep.start to accept an argument that specifies which index you want to replicate (instead of having them all replicate with one call). For example, you would specify snapshooter.ep.start -i product to create only a product index snapshot.
  2. Update the Solr config file for that specific index to cause replication on a product post-commit. For example, edit WEB-INF/solrHome/conf/product.config.xml to add the following example:

   

product.config.xml Example
<!-- A postCommit event is fired after every commit or optimize command -->
     <listener event="postCommit" class="solr.RunExecutableListener">
       <str name="exe">/path/to/searchserver/WEB-INF/solrHome/bin/snapshooter.ep.start</str>
       <str name="dir">/path/to/searchserver/WEB-INF/solrHome/bin</str>
       <bool name="wait">true</bool>
       <arr name="args"><str>-i</str><str>product</str></arr>
       <arr name="env"></arr>
     </listener>

 

Note: In 6.1.2, postCommit replication is now done properly; Solr 1.3 included a fix for an issue that caused each commit call to be run twice. Prior to the fix, using a postCommit snapshot would produce multiple snapshots, wasting CPU time and disk space.

Slave Index Installation Interval

snappuller.ep maintains a status of the last index that was pulled, and therefore, will not unnecessarily pull down indexes on the master that have already been pulled. As a result, this script can be run at fairly regular intervals. snapinstaller.ep also maintains a status of the last index installed and will not attempt to re-install indexes that have already been installed. Therefore, similar to snappuller.ep, this script can be run at fairly regular intervals with little overhead.

 

HTTP Search Request Distribution over Master/Slave(s)

Although the master search server is fully capable of handling search requests from the storefront and CM client, we typically recommend that only the slave nodes handle search requests. Keep in mind that index replication isn't instantaneous. There's a delay between the time the master search server indexes a new object and the time it becomes searchable on the slave. With requests only going through to the slaves, this delay is consistent and all objects will show at the same time on all nodes. Also, you'll want to reduce the load on the master server and let it focus on continuous indexing, if necessary.

 

To take advantage of these changes, be sure to download the 6.1.2 search server cluster/failover scripts from the downloads area at http://grep.elasticpath.com/docs/DOC-1278 (requires login).
