1/06/2014

Fixing Solr on ColdFusion

I work with computers a lot.

Occasionally (sic) I need to fix freshly installed software on servers.

ColdFusion is an app engine hailing from the dark days of the Internet around 1995. It's a bit like PHP for government, education and a few businesses.

It currently acts as a simpler, gentler developer interface to the Java (J2EE) application server that hosts it on Windows or Linux.

We run ColdFusion 8, 9 and 10 on Linux.

At one time ColdFusion bundled software services from other vendors, like Verity. Verity is a "full text" search engine. Recently Adobe "unbundled" Verity by ending its contract, which left developers to migrate off this search engine and replace it with another one. They recommended developers begin using an open source search engine called Solr (sounds like "Solar"), and they began bundling it with their installers.

The only problem is that since Solr is not developed internally by the ColdFusion group (neither was Verity), the documentation on how to implement, maintain and use it is somewhat lacking. Worse, there appear to be some bugs in the integration pieces for starting and stopping the service (a provided run control script) and in the example solr.xml configuration file provided by the ColdFusion group.

Essentially here is how I got it working:

1. If you are running CF8, you have Verity; Solr was never bundled with it. CF8 is no longer supported and is probably unsafe to keep using from a security standpoint. We have since migrated off of it and on to CF9.

2. CF9 came in five flavors: CF9.0 alpha, CF9.0.1 alpha, CF9.0.2, CF9.0 beta and CF9.0.1 beta. What happened was that CF9 got released, and then the updates CF9.0.1 alpha and CF9.0.2 came out (CF9.0.2 never had a bundled copy of Verity). Then they cancelled their distribution agreement with Verity and were required to remove Verity from the CF product installers. So they re-released CF9.0 alpha and CF9.0.1 alpha as CF9.0 beta and CF9.0.1 beta (except they are not labelled as alpha or beta, and you have to figure this out). Then they default-pointed or relabelled CF9.0.2 as CF9.0, so now if you attempt to download CF9.0 you actually get CF9.0.2 (except it is not labelled as CF9.0 or CF9.0.2, and you have to figure this out).

The main way to determine whether you have a CF9.0 alpha or CF9.0.1 alpha edition installed, or one of the beta editions, is whether the ColdFusion Administrator has a "Migrate Verity Collections" option under "Data & Services".

If you still have a "Migrate Verity Collections" option then you still have one of the "alpha" editions and should probably consider re-installing with one of the newer editions that do not have it.

Re-installing is not exactly easy if your ColdFusion server configuration is highly customized. First you have to manually record all your custom settings, or export them. Then "uninstall" and re-install with the new installer and finally re-configure manually or by importing your custom settings.

One thing you might consider is "is it worth the trouble?"

In a word [ yes ], because the newer "beta" editions include a slightly newer version of Solr 1.4.1, which does not have a few of the bugs found in the version of Solr bundled with the earlier "alpha" editions of the ColdFusion installers.

Solr also has a bit of recent history, in that it was originally a commercial venture that got gifted to the Apache Foundation as open source and then merged with the Lucene release cycle. So while ColdFusion includes versions 1.4.x, the more recent Solr open source releases carry 3.x version numbers to match the recent Lucene version numbers. So you might be asking, since it's a standalone service separate from the ColdFusion service, why not install the latest Solr open source release? Simply because the ColdFusion group went to the trouble of integrating Solr 1.4.1 into the ColdFusion Administrator application, such that you can create "Solr Collections" and maintain them from the ColdFusion Administrator application without doing a lot at a command prompt.

And that brings up another good point: ColdFusion terminology has been around a lot longer than Solr. So what is called a "Collection" in ColdFusion parlance is called a "Core" in Solr terminology.

And Solr can be run in two modes, single "Core" or multi "Core" -- the latter supports multiple "Collections" in ColdFusion. The default configuration in ColdFusion is multi "Core".

3. The default install location of the bundled Solr on Linux is /opt/coldfusion9/solr, and the configuration file is /opt/coldfusion9/solr/multicore/solr.xml. In CF9.0.2 this file ships with an incorrect configuration line that uses a Windows-style path:

  <core name="solr1" instanceDir="F:/temp/solr1\"/>

This needs to be changed to point at a real Linux path.

One of the annoying things about this ColdFusion x Solr arrangement is that the Solr admin page will actually "say" you have a bad path in the configuration file, but unless you have a local web browser running on the server and consult the default Solr web page for administrators, you're unlikely to see it.

To add difficulty to the situation, the default install of Solr does not listen for remote web browser connections [ on purpose ] to be more secure, and most Linux firewalls block port 8983 unless explicitly un-blocked.

As a result the Solr admin page is usually not accessible, unless additional steps are taken.

ColdFusion does not perform sanity checks on the Solr configuration file it provides, so this can lead to a logical conundrum as to why the service appears inaccessible.
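Since nothing checks the file for you, a small check of your own can catch a bad path before you start the service. What follows is only a sketch, not something that ships with ColdFusion or Solr: the function name check_instance_dirs is made up for illustration, and it assumes each instanceDir attribute sits on one line and holds an absolute path.

```shell
#!/bin/sh
# Hypothetical helper (not part of ColdFusion or Solr): print every
# instanceDir value in a solr.xml that is not an existing directory.
# Assumes one instanceDir="..." attribute per line. Relative paths are
# tested against the current directory, so absolute paths work best.
check_instance_dirs() {
    xml_file=$1
    # Extract the attribute values, one per line, and keep only the
    # ones that do not exist as directories.
    bad=$(sed -n 's/.*instanceDir="\([^"]*\)".*/\1/p' "$xml_file" |
        while IFS= read -r dir; do
            [ -d "$dir" ] || printf '%s\n' "$dir"
        done)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Example call (path taken from the text above):
#   check_instance_dirs /opt/coldfusion9/solr/multicore/solr.xml
```

It prints the offending paths and returns non-zero when any are missing, so it can be run by hand or dropped in front of the start command.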

You can correct the bad path by first creating a default Core or Collection this way:

# mkdir -p /opt/coldfusion9/collections/solr1
# cp -R /opt/coldfusion9/solr/multicore/templates/* /opt/coldfusion9/collections/solr1
# chown -R coldfusion:coldfusion /opt/coldfusion9/collections

And changing the configuration line this way:

  <core name="solr1" instanceDir="/opt/coldfusion9/collections/solr1/"/>
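If you would rather script the change than edit the file by hand, something like the following might do. It is a minimal sketch under a few assumptions: the function name set_solr1_instance_dir is hypothetical, the core is named "solr1" as in the line above, and the new path contains no "|", "&" or backslash characters (they are special to the sed command used here).

```shell
#!/bin/sh
# Hypothetical helper: rewrite the instanceDir of the "solr1" core.
# Keeps a copy of the original file as <file>.bak.
# Assumes the <core .../> element is on one line and that the new path
# contains no "|", "&" or backslash characters.
set_solr1_instance_dir() {
    xml_file=$1
    new_dir=$2
    cp -p "$xml_file" "$xml_file.bak" || return 1
    # Only touch the line that declares the solr1 core.
    sed "/<core name=\"solr1\"/ s|instanceDir=\"[^\"]*\"|instanceDir=\"$new_dir\"|" \
        "$xml_file.bak" > "$xml_file"
}

# Example call (paths taken from the text above):
#   set_solr1_instance_dir /opt/coldfusion9/solr/multicore/solr.xml \
#       /opt/coldfusion9/collections/solr1/
```

Writing back into the existing file, rather than replacing it, keeps its owner and permissions as they were.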

4. The ColdFusion group also provided a run control script for systems using a System V type run control system, here:

/opt/coldfusion9/solr/cfsolr

It needs to be modified to include the path to the Java virtual machine runtime you are using, if it is not the one that comes with JRun.

For example:

#SOLR_JVM="/opt/coldfusion9/runtime/jre"
SOLR_JVM="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre"
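Before relying on that setting it may be worth confirming that the directory really contains a usable java binary, since the script will append /bin/java to it. A small sketch of such a check follows; the function name check_solr_jvm is hypothetical and not part of the cfsolr script.

```shell
#!/bin/sh
# Hypothetical check: succeed only if <jvm_dir>/bin/java exists,
# is a file rather than a directory, and is executable.
check_solr_jvm() {
    jvm_dir=$1
    if [ -x "$jvm_dir/bin/java" ] && [ ! -d "$jvm_dir/bin/java" ]; then
        echo "ok: $jvm_dir/bin/java"
    else
        echo "not an executable file: $jvm_dir/bin/java" >&2
        return 1
    fi
}

# Example call (path taken from the example above):
#   check_solr_jvm /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
```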


And the script does not compensate for being run outside the /opt/coldfusion9/solr directory, so you have to modify the "Linux" section for starting and stopping this way:

[note: pay close attention to the "[quotes]"  after the $SUCMD argument the differences are subtle but very important, this code is cut and pasted directly from a known working script.]

Linux)
    OS=Linux
    # With SELinux, have to use runuser command
    if [ -x /sbin/runuser ]; then
        SUCMD="/sbin/runuser -s /bin/sh $RUNTIME_USER -c"
    else
        SUCMD="su -s /bin/sh $RUNTIME_USER -c"
    fi

    if [ $ID -eq 0 ]; then
        SOLRSTART='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar" >> $SOLR/logs/start.log 2>&1'
        SOLRSTOP='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar --stop" >> $SOLR/logs/start.log 2>&1'
    else
        SOLRSTART='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar" >> $SOLR/logs/start.log 2>&1'
        SOLRSTOP='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar --stop" >> $SOLR/logs/start.log 2>&1'
    fi
;;
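To see why the inner double quotes matter, remember that the -c option takes the command to run as a single argument. The sketch below is only a demonstration: it uses "sh -c" as a stand-in for "su ... -c" so it can be run without root, and the function names are made up. The real su or runuser will react differently in the details (the stray words may be parsed as options), but the one-argument rule is the same.

```shell
#!/bin/sh
# Stand-in for the real SUCMD ("su ... -c" or "runuser ... -c").
# "sh -c" is used here only so the demo runs without root.
SUCMD="sh -c"

# Correct: the whole command line reaches -c as one argument.
with_quotes() {
    eval '$SUCMD "echo start.jar --stop"'
}

# Broken: only "echo" reaches -c; "start.jar" and "--stop" become
# separate arguments and never reach the command being run.
without_quotes() {
    eval '$SUCMD echo start.jar --stop'
}

with_quotes      # prints: start.jar --stop
without_quotes   # prints an empty line
```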


5. And finally copy the script, activate it and start the service:

# cp /opt/coldfusion9/solr/cfsolr /etc/init.d
# chkconfig cfsolr on
# service cfsolr start

Technically that is all you have to do, but you may also want to visit the ColdFusion Administrator application and check the settings for the Solr server, to verify it's pointed at the correct server and at the correct path to copy the multi "Core" template files from when it's creating new collections.

Notice the Solr Host Name is [ localhost ]. Normally this is because:

a. There may be a firewall around other servers running Solr protecting their port 8983 service, so it is likely you'll only want to connect to a Solr service running on your localhost.

b. The Solr service is hosted in a minimal Java servlet container called "Jetty", another open source project (not an Apache one; it is now maintained at the Eclipse Foundation), installed on the localhost and listening on port 8983.

Solr comes with a built-in administrator web interface that you may need to refer to, or want to access in order to practice exercises from the Solr documentation wiki.

You may not have a browser on the ColdFusion localhost, so to make it accessible from remote systems, first make sure your firewall permits remote access to port 8983, or temporarily shut it off like so:

# service iptables stop

Then re-configure the Jetty server to "listen" for remote connections on all interfaces like this:

# vi /opt/coldfusion9/solr/etc/jetty.xml

look for

 <New class="org.mortbay.jetty.bio.SocketConnector">
       <Set name="Host">127.0.0.1</Set>


change to

 <New class="org.mortbay.jetty.bio.SocketConnector">
       <Set name="Host">0.0.0.0</Set>
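The same edit can be scripted, which also makes it easy to switch back to 127.0.0.1 once you are done with the admin page. This is a minimal sketch: the function name set_jetty_host is hypothetical, and it assumes the Host element sits on a single line as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: change the <Set name="Host"> value in a jetty.xml.
# Keeps the original file as <file>.bak. Assumes the element is on one
# line and changes every such line in the file.
set_jetty_host() {
    jetty_xml=$1
    new_host=$2
    cp -p "$jetty_xml" "$jetty_xml.bak" || return 1
    sed "s|<Set name=\"Host\">[^<]*</Set>|<Set name=\"Host\">$new_host</Set>|" \
        "$jetty_xml.bak" > "$jetty_xml"
}

# Example calls (path taken from the text above):
#   set_jetty_host /opt/coldfusion9/solr/etc/jetty.xml 0.0.0.0     # open up
#   set_jetty_host /opt/coldfusion9/solr/etc/jetty.xml 127.0.0.1   # lock down again
```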


Restart the cfsolr service

# service cfsolr stop
# service cfsolr start

and then visit the URL

http://<coldfusionserver>:8983/solr/

Take care to understand that running without a firewall is not a best practice, and that listening on all interfaces might not be a good thing to do all the time.


From this interface you can run "ad-hoc" queries or browse the schema.

Normally a new collection will not contain data, and a search will return nothing.

A migrated Verity collection will return search results.

You get things into a collection by "publishing" to a web service URL, using either a Curl script or a ColdFusion script to put things into the collection.
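As a rough illustration of the Curl route, the sketch below builds a one-document add message and posts it to a core's XML update handler. Treat the details as assumptions: the helper names are made up, the core URL assumes the "solr1" collection from earlier, and the field names "uid" and "title" are placeholders that must be swapped for whatever the collection's schema.xml actually defines.

```shell
#!/bin/sh
# Build a minimal Solr <add> message for one document.
# "uid" and "title" are placeholder field names; the real ones must
# match the schema.xml of the collection. Values are not XML-escaped.
build_add_xml() {
    printf '<add><doc><field name="uid">%s</field><field name="title">%s</field></doc></add>\n' "$1" "$2"
}

# POST an XML message to the update handler of one core.
# $1 is the core URL, for example http://localhost:8983/solr/solr1
post_to_core() {
    curl -s -H 'Content-Type: text/xml; charset=utf-8' \
        --data-binary "$2" "$1/update"
}

# Example calls (need a running Solr service, so left commented out):
#   post_to_core http://localhost:8983/solr/solr1 "$(build_add_xml 1 'Hello Solr')"
#   post_to_core http://localhost:8983/solr/solr1 '<commit/>'
```

The document only becomes searchable after the commit message has been sent.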

Solr is a standalone service, bundled with ColdFusion, but starting and stopping it is a separate operation from starting and stopping the ColdFusion service.

Solr can be installed as a "standalone" service from an installer from the ColdFusion group at Adobe, but it appears to be an older one than the version that comes bundled in the CF9.0.2 and beta editions.

Solr can be installed from source or from packages specific to Linux distros, but then you lose the integration features like the start up script and the tools within ColdFusion Administrator for creating and managing Solr collections. The ColdFusion group "integrated" the bundled version just enough that it is slightly different from a mainstream install of Solr.

Finally you should be aware that the Solr search "language" or semantics are slightly different from the search "language" or semantics that work with Verity. The CFML language tag cfsearch has been updated to accommodate the differences, but you have to update any cfsearch calls in your code to use the new semantics.

addendum: 

Migrating a Verity collection is challenging.

If you had an "alpha" edition between CF8 and CF9.0.2 then you had a built in tool for migrating from Verity to Solr. 

If you have since upgraded (aka re-installed) with a "beta" edition then the option is gone.

But there is a tool to migrate CF8, CF9.0 and CF9.0.1 Verity collections manually. Copy the ColdFusion tool folder onto the older server (if it still has a Verity service) and export the Verity collection to a CSV file. Then copy the tool folder with the CSV file to the new ColdFusion server running Solr, and import the exported Verity collection into a fresh Solr collection.


This tool works with CF8, CF9.0 and CF9.0.1 for exporting. But it does have some bugs.

On CF8.0 it may report that the Verity service is not running even though it is; clicking the Export button still succeeds.


On CF9.0.1 "alpha" an Import will not succeed, but this is not surprising since the Solr bundled with that edition was 11 months older than the one bundled with the "beta" editions and had a pathing problem.

On CF9.0.2 Imports succeed as expected.

The tool is not available from the Adobe ColdFusion download site, but is referenced many times in the Community forums as an unsupported "possibility".

The title of the article mentions CF10, but the tool does indeed work for CF8 to CF9.0.2 migrations, and presumably CF10:

ColdFusion 10: Verity to Solr migration  

A migration tool for Verity to Solr for ColdFusion collections 


addendum 2:

There is an uninstall script in the Solr directory which will uninstall only the Solr service. 

If you do that, however, you may need to re-install, and may want to do so without disturbing the ColdFusion service or its configuration. You can do this by re-running the installer and choosing an EAR/WAR type of install. This compiles a separate install of the ColdFusion service intended to be copied to a different server.

One of the benefits of choosing the EAR/WAR type install is it leaves the current install alone, and you can direct the storage path for EAR/WAR result away from any current path in your existing ColdFusion service.

It also proceeds to offer up the choice of "installing" the Solr service and ColdFusion documentation again. I uncheck the options I don't need and let it only re-install the Solr service. Afterwards the Solr service will need the fixes mentioned in the current blog post.

One other thing about the ColdFusion "installers" you should be aware of is that they are bi-modal, in that they can be run in an Xwindows GUI mode or, using a command line argument, in a textual CLI mode.

If you have a remote desktop on the same network running Xwindows, you can invoke the command and it will forward the display to your local desktop. If you're remote and use X over an ssh connection, that will work as well. If you're running a Windows desktop, try installing Xming, Xming fonts and Xlauncher over PuTTY.

Xming X Server 

I tend to mention these options because the GUI is a bit more clear about the options, but the CLI is perfectly capable of performing the same options.

addendum 3:

A few common mistakes are:

1. Incomplete or wrong entry in the [solr.xml] file. I frequently leave off the "/solr1/" part of the path, resulting in an inaccessible service once it's started.

2. Missing the closing quotes in the /etc/init.d/cfsolr script. The quoting is very important; otherwise on start up the "su" command may glob and interpret the rest of the argument string in the wrong way, returning an error message that goes to /dev/null where you will never see it. The only way to see it is to remove the stderr redirects and watch the output from the start or stop commands, usually something about option -X "unknown".

3. Not looking at the Solr admin webpage. It takes effort to reconfigure the Jetty service, open up the firewall and hit it with a remote browser. But the ColdFusion service is not taking care of the Solr service, and the ColdFusion Administrator application is not providing troubleshooting tips. Solr is treated as a completely separate service, so you have to learn to take care of it on your own.

Solr is highly configurable and runs in its own JVM instance, which means it can become memory sensitive; it may require tuning of the JVM environment, like raising the initial heap size and max memory. Also, Solr 3.x requires either a more advanced Jetty container or a full-blown J2EE hosting container to provide the additional services it needs.

So upgrading from the bundled Solr 1.4.x included with ColdFusion 9 or ColdFusion 10 might not be such a good idea until the ColdFusion group has resolved the dependencies for you. 

Also, each Solr version seems to have slight dialect changes in its query language, which may or may not figure into the cfsearch tag support for the proper semantics. I have read good things from those moving up independently of the ColdFusion bundled version, but those same people seem to know a great deal about their Solr instances, beyond what a casual user would be interested in spending a lot of time reconciling.