1/24/2014

Literally saving Books


I've been a fan of the paperless office concept for some time.

But people never seemed to apply it to books, until online digital libraries like those of Google Books and Amazon started to grab headlines.

Regardless of the copyright and authors' rights issues, there is an inherent 'fear of loss' and 'convenience' factor in having a personal copy backed up as a digital file.

To this end, since online sharing is, for this moment in history, effectively 'banned', we each have to choose how we deal with it.

Being from a generation with access to physical books, and aware there is a lot of good reference material in them, I set out to back things up: first by duplicating paper xerox copies of pages and chapters, then by using digital scanners and software to archive and turn those into files, then turning those files into PDF files, and finally into OCR'd PDF files optimized for size and archival-quality accessibility in some far distant future, in the PDF/A format.
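As a sketch of that final step, here is roughly how the PDF/A conversion can be done with the open source Ghostscript tool; the file names are placeholders, and strict PDF/A conformance may also need an ICC color profile supplied through a PDFA_def.ps file, so treat this as a starting point rather than the exact recipe I used.

# Sketch: convert an ordinary PDF into PDF/A with Ghostscript
# (strict conformance may also require an ICC profile via PDFA_def.ps)
gs -dPDFA -dBATCH -dNOPAUSE \
   -sDEVICE=pdfwrite \
   -dPDFACompatibilityPolicy=1 \
   -sColorConversionStrategy=UseDeviceIndependentColor \
   -sOutputFile=book-pdfa.pdf book.pdf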

I've even recently become interested in M-disc technology from Millenniata, which proposes that with the right DVD media, files can be backed up that will last 1000 years or more.

Millenniata - Write Once, Read Forever

LG initially was the only company that produced drives with the power to scribe M-discs, as it takes a higher power laser to etch the information into a non-dye layer.

But Sony/NEC apparently also dabbled in "Archival Quality" DVD burners in the recent past.

Sony Optiarc

After NEC and then Sony exited the business, they left behind a fine drive with overburn capabilities, the 5280S-CB-PLUS.

And the Vinpower Digital company licensed these drives to continue providing them through outlets like Amazon.com.

They serve multiple purposes: aside from being able to burn M-disc media, they can also copy traditional discs like those for game consoles.

The Sony/NEC Optiarc alliance was founded on the idea of producing a more stable, predictable drive that could burn media more reliably and produce fewer badly burned discs or "coasters". They include a stronger direct drive and gear train, and more closely monitored and regulated drive speed.

They are not cheap by today's standards, costing about twice what a $15 ordinary consumer grade burner would cost.

But back to "Saving books"

Converting, copying or saving a book implies it's digitally scanned into a computer file format.

In order to do that, the book has to be set up under photographic conditions suitable for a capture sensor to take an image.

In some scanners this requires de-binding, or literally destroying the book, by cleaving the binding off the back of the spine to free the pages so they can be fed into a sheet feeder, or laid flat one at a time on a flatbed scanner.

I've been down the sheet-fed path with one book. One alternative to cleaving is to "de-glue" the pages from the binding; it is sometimes recommended to use a "heat gun" or "hair dryer" to melt the waxy glue and pull the pages free. But I try not to collect lots of hardware I may never use again unless absolutely necessary, so I tried what I already had on hand: a clothes iron. I heated it up and slid it along the backside of a paperbound book, and it did in fact melt the glue, and I could pull the pages free.

I still had to trim the pages, however, since it is nearly impossible to remove all of the glue, and I couldn't tolerate letting that glue get into the gears and rollers of an expensive sheet-fed scanner. It would simply ruin it, and I would never complete the task.

To trim hundreds of pages, you need a near-professional paper cutter made of a hard material like steel.

In my case this led to purchasing a discount paper cutter from Amazon.com


It is possible to find them "used" and on sale, which I was fortunate enough to do.

I was aware you could take the books to someplace like a Kinkos or printing company with a professional paper cutter that would charge $0.50 to $1.00 per book. But some print shops will do it in a less than satisfactory manner, ask questions, or even refuse on the basis of a possible act of piracy. Further, you don't always have control over how much of the border well between pages will remain, and you could end up losing text or images.. so I chose to de-bind and trim the pages myself.

Another alternative is a "near edgeless" scanner like the Plustek OpticBook 3600.

I briefly owned one of these, but being a flatbed optimized for the mass market and cost savings, it was just too slow. I couldn't imagine archiving a book while having to lift and heave the spine and entire mass of the book up and down over long periods of time.

I kept a careful eye on the open hardware designs Google had released for hands-free, automated scanning of a book. But they appeared too ad hoc and hard to reproduce, and they took up a lot of space.

That left me with the Atiz BookSnap option.

I had purchased a BookSnap at the beginning of 2009, but a family death occupied my time and it sat unused for a while.

A surprisingly innovative and useful device, it is essentially a bundled set of equipment and software that uses a pair of Canon PowerShot cameras to acquire images, then post-processes those into PDF files.

It's somewhat intimidating from a lot of angles, but ultimately I concluded it was the correct path to take for most of my book archiving needs.

First it was intimidating because the BookSnap was not cheap; then, PowerShot cameras are not cheap, and acquiring two of the same model was especially not cheap.

Add to this that support for the product was not great, as it had recently been cancelled because Canon had removed a feature that allowed the Canon PowerShot cameras to remotely download images over USB cables to a computer. This had been a standard feature for a number of years, and then was abruptly removed in a Canon driver "update" for the cameras after they were released. This caused some confusion in the market and for the end user, unless you watched what went into the Canon drivers available for download. So basically you needed to "know" not to upgrade your Canon drivers directly from the Canon website to the latest and greatest versions.

Canon apparently had shifted the feature sets around a bit, and PowerShots would no longer be usable as remote capture devices. This feature does remain in the Canon driver sets for DSLR cameras, a different and much higher cost camera type, which they perceive as being adopted by a less frugal middle class.

The documentation that came with the equipment was also not the smoothest or easiest to understand.

First there was the hardware setup and alignment of the cameras. For various reasons you needed to have the cameras equipped with SD cards and batteries and then cabled to a computer over a long USB run; the cables provided with the cameras and the hub that came with the BookSnap were just too short.

Ultimately I learned the Canon G10s which I had also had an optional battery replacement with a tiny trap door to run a power cable out to an external power brick, avoiding the inconvenience of rechargeable batteries. And I learned Monoprice had "balun" impedance-protected and corrected USB cables, and a twin long-distance USB repeater cable that took the place of the Atiz hub and delivered good USB signals over a long distance.

After all of that I had wires running everywhere: power lines for the cameras, USB lines for the data, and power cables for the overhead lights. I found an online source for tiny black cable raceway with adhesive tape and a side-grooved lock that allowed running the cables down the side of the BookSnap.

Then it was time to deal with the software.

BookSnap comes with BookScan and BookEdit software, but it's not labeled that way.

The BookScan software is called "BookDrive Capture Control", and changed names in later products to "BookDrive Capture".

The BookEdit software was versioned from V3 to V6 as "BookDrive Editor Pro", and up until V4 was "locked" during activation to one computer's hardware, so it could not be run on multiple machines.
In V6 of the software they adopted a USB dongle key for "enabling" the software wherever it was installed. However, this was not available to me, and I was stuck on V4 as the last upgrade available for the BookSnap.

It took some time.

But I eventually figured out the workflow: capture the images whole cloth as JPEG or TIFF files.

The BookScan software allows for some configuration to auto-name and save sequential files to Left and Right, or combined Left & Right, folders of BookScan images, and some minimal "cropping" of the images during capture (this isn't as important, as more post-processing, including cropping, can be done in the BookEdit software).

Then the workflow continues after the BookScan capture session has ended and the BookEdit post-processing session begins.

Opening BookEdit, you find both a large central preview region for Left and Right pages, and a Left navigation bar that indicates which book is being worked on. Clicking the L or R or L&R buttons allows designating the file folders or "sources" of images for a "book". A wrench icon for the book opens the post-processing tools and permits customizing how the pages for that book will be post-processed "en masse" during a Batch session.

At the bottom of the Left navigation is a large ]> arrow key for kicking off the Batch session to process all the pages in all of the books set up in the Left navigation column.

And finally, a tiny Export button in the Left navigation will open another utility program for "binding" all of the post-processed images for a "book" into a multipage PDF or multipage TIFF file.

Within the export tool are options for choosing JPEG or TIFF compression schemes, or optimization of the book to render it a smaller size rather than just a large concatenation of images.

However, you might not want to compress or optimize too much until "after" running the multipage document through an OCR engine like the one in Abbyy FineReader Pro or Adobe Acrobat Pro to enable full text indexing. These can often also perform the final optimization step.
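If neither FineReader nor Acrobat is at hand, those last two steps can be sketched with open source tools, ImageMagick and Tesseract; this is only an illustration with made-up file names, and it assumes Tesseract 3.03 or later, the first version that can emit searchable PDFs.

# Sketch: combine the processed page images into one multipage TIFF (ImageMagick)
convert page_*.tif book.tif

# OCR it and emit a searchable book.pdf (Tesseract 3.03+)
tesseract book.tif book pdf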

Once all of this was understood, I began to become aware of the importance of "White Balance", "Exposure" and "Focus", as well as "Image Stabilization" control.

The software with BookSnap manages a few settings, but things like the camera "Mode", which determines whether Auto White Balance or a Custom White Balance is applied during capture, are not managed.

Auto Focus can also be a mixed blessing in small quarters, unless it is turned off and set manually, or helped to perform its function with something like an optional laser beam to focus on.

Atiz products also have a number of separate options that can be purchased, but I'm not sure any of them would work with the BookSnap or its software.

Normally, for example, the trigger options for capturing a set of pages with the cameras are tapping the keyboard "Enter" key, or setting up an interval timer to auto-capture a set of images every 2, 3 or 5 seconds.

But an optional USB proximity switch could be installed on the frame such that when the V-shaped transparent sled is brought down to flatten the pages, an "Enter" keypress is sent to the BookScan software to initiate image capture. These aren't unique to Atiz, however; they emulate an additional USB HID input device, a second keyboard, and can be configured to send different keystrokes. Other such options are a foot switch, or a big-button photobooth or kiosk device.

In all, it opens up some exciting possibilities.

Recently I became aware of a Kickstarter project, by way of an Amazon.com offering, for a manual scanning stand called Fopydo - which stands for [ Foto Copy Document ] scanning stand.


Essentially it is a corrugated "black" plastic frame that normally stores flat, and which you can toss in a backpack or briefcase.

It is, for all intents and purposes, a portable "V-cradle", which as was mentioned before is a superior platform for conducting non-destructive scanning of books and other documents.

What caught my eye was a YouTube video of the device in use 



 


The BookScan portion of the capture would depend upon a cell phone, Android or iPhone, or could be a Canon PowerShot or DSLR camera.

The raw images could then be offloaded to a computer by USB or SD memory card transfer, and post-processed in the same way as with the BookSnap, practically distortion free.

And since it's so light, you'd be almost as likely to have it in your pack as you would your cell phone.

This struck a chord with me because I often visit libraries, labs or places where quick access to a scanner or copy machine might not be available, and I'd prefer to have something a little better than just a handheld phone snapshot of a book or document.

It's lighter and less bulky than a tripod, and the cell phone is practically optimized for on-the-spot shooting with minimal adjustments for the environment.

The same BookEdit software that came with BookSnap should also be able to process these images, but even if that is not an option, Fopydo has a suite of software available for free download on its website to post-process the images. Then an OCR engine like Abbyy FineReader or Adobe Acrobat could be used to index and optimize it.

And another option for slightly distorted images could be to use Booksorber to "de-warp" the images.




It is a more specialized piece of software for taking cell phone images, de-warping them, and paring them down for storage.


1/22/2014

VMware - Time travel with (Red Hat 7.2)


In the year 2014 running a version of Linux from 2001 in VMware is challenging.




Basically, installing and running Red Hat 7.2 (code name 'Enigma') in VMware Player 5.0 on Windows 7 is not easy.

However, it can be done.

Why?

I had a curiosity about ColdFusion history and wanted to install and run ColdFusion 5.0 from 2001. Then I found an installer for ColdFusion 5.0 targeting RH7.2 on a CD-ROM in the back of a book.

But, back to the main story...

First, VMware Player 5.0 will let you create a virtual machine and boot from a Red Hat 7.2 .iso image to install it.

But you have to [catch] the [boot: ] prompt and type [ text ] to avoid running the GUI installer.

Then complete the install of Red Hat 7.2 using the text based user interface.

After installing, if you want to run an X server, those provided by Red Hat 7.2 will not be compatible with VMware Player 5.0. They will crash if you try to start them.

But you can go to VMware.com, download an older copy of VMware Workstation 4.5, and install it someplace temporarily (so that it will extract its bundled vmware-tools .iso files). Then copy the Linux vmware-tools .iso from "the" older VMware Workstation 4.5 install path

C:\Program Files\VMware\tools-linux\linux.iso

to the Windows 7 system, mount it in the Red Hat 7.2 virtual machine, and then install the "older" vmware-tools manually: copy the tar file to the local file system, expand it, and run the setup script vmware-config.pl.
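A sketch of those guest-side steps follows; the exact archive and script names vary a little between vmware-tools releases, so treat them as assumptions to check against what is actually on the .iso.

# Run as root inside the Red Hat 7.2 guest, with linux.iso attached as the CD drive.
mount /dev/cdrom /mnt/cdrom
cp /mnt/cdrom/vmware-linux-tools.tar.gz /tmp      # copy the tools tarball to local disk
cd /tmp
tar xzf vmware-linux-tools.tar.gz                 # expand it
cd vmware-tools-distrib
./vmware-install.pl                               # installer, which ends by running the config script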

As part of the vmware-tools configuration process, it will install an X server compatible with VMware Player 5.0, and it will also offer to set the screen resolution; I'd recommend 1024x768.

At the command prompt run the [startx] command to try it out.

And finally, change the /etc/inittab [ id:3:initdefault ] run level to 5 if you want to boot to a desktop.
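That run level entry is a one-character edit; after the change, the line in /etc/inittab should read:

# boot to run level 5 (graphical desktop) instead of 3 (text console)
id:5:initdefault: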



Some tips

Red Hat Linux 7.1 and 7.2 Installation Guidelines

"Installation sometimes hangs, due to a kernel bug, a workaround is available.

Power off the virtual machine and close the VMware Workstation window. Open the virtual machine's configuration file (.vmx file on a Windows host or .cfg file on a Linux host) in a text editor and add the following line:

cdrom.minvirtualtime=100

Save the file. Now install the guest operating system. After installing the guest, remove this setting, it may have a performance impact."

1/13/2014

Fixing Microsoft Windows, when 'Eight is Enough'

Windows 8 has a public relations problem.

Or rather, the Public just can't seem to relate "to it".

It seems rather obvious in hindsight, but I think they should have done something just slightly more subtle and "Obvious" to the unconscious bystander.

Instead of "copying" everything around them, try "adopting" everything around them.

Windows is a great platform for code development and emulation, prototyping and educating people about the computer space and now the network space.

Essentially its history is one of "adoption", be that Seattle Computer Products' "DOS" or the Xerox PARC desktop/mouse metaphor.

Why then didn't they just "Emulate" the Windows Phone applications on the desktop as a "Widget", or gadget if they simply must trademark everything?

Linux has a grand tradition of "build it (the kernel) and they will come", so it scales from a lowly service handler at the core of many server operating systems, to supporting the X Window or Gnome desktops.. and then down to the microspace with BusyBox.

The Java JVM seemed to recognize this and had everything from Micro to J2EE editions and everything in between: JRE, JDK, etc.

Windows evolved from a DOS kernel into a multitasking kernel with a similar Window Manager.

Why then completely "reinvent the wheel" and try to re-task a window manager as a cell phone interface? How brain dead is that concept?

The "fear" that a Widget platform would not be "accepted by developers" is just plain silly.

Imagine a platform you could compile to.. that worked on your portable PDA? Wait.. PalmPilot already did that.. see it worked.

As for targeting the cell phone device specifically, first you have to get it in the hands of the user, then the developers will follow. And today that means, a free platform.

Microsoft should have released, even subsidized, the cell phones that they have.. or even, "shock and awe", come up with an API, not an OS, that would allow Windows Widget apps to run on other operating systems, like Linux, Android, OSX, iOS. But like .NET, a language with only one adopter is doomed to irrelevance. Mostly because it enriches only those few "licensed" and "exclusive" to the inner circle who are "allowed" to use it. That unfortunately is the worst of the damage Sun Microsystems did to Microsoft.. and it's ongoing.

There was a time when Microsoft was a "Languages" company: Basic, Assembly, C, C++. Those are "like" API factories for getting libraries and their features into the hands of developers. And developers will pay good money for tools to make goods they can sell to their employers, customers, and interested parties.

But somewhere along the way it got lost, bought into "we must own everything and control everything", and totally abandoned "innovation and collaboration".

Its profits aren't in its "possessions" but in its potentials and being at the center of change, just as the fortunes of Silicon Valley are built on "possibilities" and not fundamentals.

1/12/2014

Fixing Firefox crashes on Flash or Youtube videos


This has been driving me crazy for weeks, as Firefox puts the screws on Plugins and Addons (tell me, what's the difference?).

It seems Flash video, or video that depends on a Flash plugin, is getting crazier and more unreliable.

Then FF 26 came out and it got orders of magnitude worse; other Plugins and Addons started freaking out.

Well, here's the probable solution:

See, there is more than one [Activation, Allow, Block] control in Firefox; yep, they figured just one wasn't enough. Worse, it's hidden and kind of hard to get to.

We all know about Tools>Addons


Or (ok, so what's a Plugin, Addon, or Extension? AlphaBabel Soup please?)


We know about About:config



And we know about About:plugins




But "Did You Know" about [Per-Site/Per-Page] Plugin "Permissions" ???


Everything can be set up perfectly: latest version of Firefox installed, latest version of the Flash and Shockwave plugins installed, fully up to date and fully patched, fully activated in all the traditional pages.

But if the "Permissions" for that plugin are set to "Block" for that site/page, here is what you'll see:



No clue that the problem is not the new nebulous Firefox overlord "we know best" embedded defaults (you can't change), no clue it is not the Firefox browser global defaults (you can change), not the Adobe plugin (actually crashing); it's a "Permissions!!!" problem set on the page by the browser.

Try threading "that Needle Sherlock!"

Now how do you get to that page?

While your browser is crashing and flickering you have to

[Right-Click > View Page Info > Permissions]

Survey all of your Plugins on the "chance" that this is the problem, when it could be many others.

[Oh, and Firefox is helpfully "bouncing" back and forth between the crashing window with the Flash plugin and the View Page Info window.. can you say Seizure?]

How a Permission gets set like that is beyond me, but I suspect it's one of the new defaults in that Blog entry that looks like a EULA, which pops up every time a long running script is slowing down Firefox and offers to [Stop |or| Continue].

And to make matters worse, this persists across both updating Firefox and downgrading Firefox, and even if you clear your browser cache and clear your cookies, per site and all cookies.

And it appears to "propagate or inherit" if you hit a page with the "Block" permission for a plugin and then link from there to other pages with no custom permission settings sheet.. so if you go through a portal, it's really bad.

To be sort of fair, the message seems to be generated by Adobe Flash, but how do they know what blocked them? A browser defaults block, a custom property triggered on foul language, a tear sheet with an additional ACL of Permissions per page or element? And that would mean without Adobe's message you would have "no clue" the video was even failing, or missing.

And this applies to [All Plugins]..

Seriously deficient in the common sense communications department.

Do we really need NTFS-style per file/page Permissions to protect "Us" from other people's webpages? What is it, a Filesystem?

1/06/2014

Fixing Solr on ColdFusion

I work with computers a lot.

Occasionally (sic) I need to fix freshly installed software on servers.

ColdFusion is an app engine hailing from the dark days of the Internet, around 1995. It's a bit like PHP for Government, Education and a few businesses.

It currently acts as a simpler, gentler developer interface to the J2EE virtual machine that hosts it on Windows or Linux.

We run ColdFusion 8, 9 and 10 on Linux.

At one time ColdFusion bundled software services from other vendors, like Verity. Verity is a "full text" search engine. Recently they "unbundled" their contract with Verity, and as a result left developers to migrate off of and replace this search engine with another one. They recommended developers begin using an open source search engine called Solr (sounds like "Solar"), and they began bundling it with their installers.

The only problem is, since Solr is not developed internally by the ColdFusion group (neither was Verity), the documentation on how to implement, maintain and use it is somewhat lacking. Worse, there appear to be some bugs in the integration pieces for starting and stopping the service (a provided run control script) and in the example solr.xml configuration file provided by the ColdFusion group.

Essentially here is how I got it working:

1. If running CF8 you have Verity, and Solr was never bundled with ColdFusion; CF8 is no longer supported and is probably unsafe, from a security standpoint, to be using at this point. We have since migrated off of it and on to CF9.

2. CF9 came in five flavors: CF9.0 alpha, CF9.0.1 alpha, CF9.0.2, CF9.0 beta and CF9.0.1 beta. What happened was CF9.0 got released, and then the updates came out: CF9.0.1 alpha and CF9.0.2 (CF9.0.2 never had a bundled copy of Verity). Then they cancelled their distribution agreement with Verity and were required to remove Verity from the CF product installers. So they re-released CF9.0 alpha and CF9.0.1 alpha as CF9.0 beta and CF9.0.1 beta (except they're not labelled as alpha or beta, and you have to figure this out). Then they default pointed, or relabeled, CF9.0.2 as CF9.0 -- so now if you attempt to download CF9.0 you actually get CF9.0.2 (except it's not labeled as CF9.0 or CF9.0.2, and you have to figure this out).

The main way to determine whether you have a CF9.0 alpha or CF9.0.1 alpha edition installed, or one of the beta editions, is whether you have a ColdFusion Administrator "Migrate Verity Collections" option under "Data & Services".

If you still have a "Migrate Verity Collections" option, then you still have one of the "alpha" editions and should probably consider re-installing with one of the newer editions that do not have it.

Re-installing is not exactly easy if your ColdFusion server configuration is highly customized. First you have to manually record all your custom settings, or export them. Then "uninstall", re-install with the new installer, and finally re-configure manually or by importing your custom settings.

One thing you might consider is: "is it worth the trouble?"

In a word, [ yes ], because the newer "beta" editions include a slightly newer version of Solr, 1.4.1, which fixes a few bugs present in the version of Solr bundled with the earlier "alpha" editions of the ColdFusion installers.

Solr also has a bit of recent history, in that it was originally a commercial venture that got gifted to the Apache Foundation as open source and then merged with the Lucene release cycle. So while ColdFusion includes version 1.4.x, the more recent Solr open source releases carry 3.x version numbers to match the recent Lucene version numbers. So you might be asking: since it's a standalone service separate from the ColdFusion service, why not install the latest Solr open source release? Simply because the ColdFusion group went to the trouble of integrating Solr 1.4.1 into the ColdFusion Administrator application, such that you can create "Solr Collections" and maintain them from the ColdFusion Administrator application without doing a lot at a command prompt.

And that brings up another good point: ColdFusion terminology has been around a lot longer than Solr. So what is called a "Collection" in ColdFusion parlance is called a "Core" in Solr terminology.

And Solr can be run in two modes, single "Core" or multi "Core" -- the latter supports multiple "Collections" in ColdFusion. The default configuration in ColdFusion is multi "Core".

3. The default install location of the bundled Solr on "Linux" is /opt/coldfusion9/solr, and the configuration file is /opt/coldfusion9/solr/multicore/solr.xml, which in CF9.0.2 has an incorrect configuration line left over from Windows servers:

  <core name="solr1" instanceDir="F:/temp/solr1\"/>

This needs to be changed to point at a real "Linux" path.

One of the annoying things about this ColdFusion x Solr arrangement is that the Solr admin page will actually "say" you have a bad path in the configuration file, but unless you have a local web browser running on the server and consult the default Solr administrator web page, you're unlikely to see it.

To add difficulty to the situation, the default install of Solr does not listen for remote web browser connections [ on purpose ], to be more secure, and most Linux firewalls block port 8983 unless explicitly un-blocked.

As a result the Solr admin page is usually not accessible, unless additional steps are taken.

 ColdFusion does not perform sanity checks on the Solr configuration file they provide, so this can lead to a logical conundrum as to why the service appears inaccessible.

You can correct this by first creating a default Core or Collection this way:

# mkdir -p /opt/coldfusion9/collections/solr1
# cp -R /opt/coldfusion9/solr/multicore/templates/* /opt/coldfusion9/collections/solr1
# chown -R coldfusion:coldfusion /opt/coldfusion9/collections

And changing the configuration line this way:

  <core name="solr1" instanceDir="/opt/coldfusion9/collections/solr1/"/>

4. The ColdFusion group also provided a run control script, for systems using a System V type run control system, here:

/opt/coldfusion9/solr/cfsolr

It needs to be modified to include the path to the Java virtual machine runtime you are using, if it is not JRun.

For example:

#SOLR_JVM="/opt/coldfusion9/runtime/jre"
SOLR_JVM="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre"


And the script does not compensate for being run outside the /opt/coldfusion9/solr directory, so you have to modify the "Linux" section for starting and stopping this way:

[note: pay close attention to the "[quotes]" after the $SUCMD argument; the differences are subtle but very important. This code is cut and pasted directly from a known working script.]

Linux)
    OS=Linux
    # With SELinux, have to use runuser command
    if [ -x /sbin/runuser ]; then
        SUCMD="/sbin/runuser -s /bin/sh $RUNTIME_USER -c"
    else
        SUCMD="su -s /bin/sh $RUNTIME_USER -c"
    fi

    if [ $ID -eq 0 ]; then
        SOLRSTART='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar" >> $SOLR/logs/start.log 2>&1'
        SOLRSTOP='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar --stop" >> $SOLR/logs/start.log 2>&1'
    else
        SOLRSTART='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar" >> $SOLR/logs/start.log 2>&1'
        SOLRSTOP='cd $SOLR;$SUCMD "$SOLR_JVM/bin/java $JVMARGS -jar start.jar --stop" >> $SOLR/logs/start.log 2>&1'
    fi
    ;;


5. And finally copy the script, activate it and start the service:

# cp /opt/coldfusion9/solr/cfsolr /etc/init.d
# chkconfig cfsolr on
# service cfsolr start

Technically that is all you have to do, but you may also want to visit the ColdFusion Administrator application and check the settings for the Solr server, to verify it's pointed at the correct server and the correct path to copy the multi "Core" template files from when it's creating new collections.

Notice the Solr Host Name is [ localhost ]. Normally this is because:

a. there may be a firewall around other servers running Solr protecting their port 8983 service, so it is likely you'll only want to connect to a Solr service running on your localhost

b. the Solr service is hosted in a minimal Java servlet container called "Jetty" (the org.mortbay Jetty project), installed on the localhost and listening on port 8983.

Solr comes with a built-in administrator web interface you may need to refer to, or want to access in order to practice exercises from the Solr documentation wiki.

You may not have a browser on the ColdFusion localhost, so to make it accessible from remote systems, first make sure your firewall permits remote access to port 8983, or temporarily shut it off like so:

# service iptables stop

Then re-configure the Jetty server to "listen" for remote connections on all interfaces like this:

# vi /opt/coldfusion9/solr/etc/jetty.xml

look for

 <New class="org.mortbay.jetty.bio.SocketConnector">
       <Set name="Host">127.0.0.1</Set>


change to

 <New class="org.mortbay.jetty.bio.SocketConnector">
       <Set name="Host">0.0.0.0</Set>


restart the cfsolr service

# service cfsolr stop
# service cfsolr start

 and then visit the URL

http://<coldfusionserver>:8983/solr/

Take care that you understand that running without a firewall is not a best practice, and listening on all interfaces might not be a good thing to do all the time.


From this interface you can run "ad-hoc" queries or browse the schema.

Normally a new collection will not contain data, and a search will return nothing.

A migrated Verity collection will return search results.

You get things into a collection by "publishing" to a web service URL, using either a Curl script or a ColdFusion script to put things into the collection.
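As a minimal sketch of "publishing" with curl against the Solr 1.4 XML update handler (the core name solr1 matches the configuration above, but the field names here are hypothetical and must match the collection's schema.xml):

# Add (or replace) one document in the "solr1" core
curl 'http://localhost:8983/solr/solr1/update' -H 'Content-Type: text/xml' \
     --data-binary '<add><doc><field name="uid">doc1</field><field name="title">A test page</field></doc></add>'

# Commit, so the new document becomes visible to searches
curl 'http://localhost:8983/solr/solr1/update' -H 'Content-Type: text/xml' \
     --data-binary '<commit/>'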

Solr is a standalone service, bundled with ColdFusion, but starting and stopping it is a separate operation from starting and stopping the ColdFusion service.

Solr can be installed as a "standalone" service from an installer provided by the ColdFusion group at Adobe, but it appears to be older than the version that comes bundled in the CF9.0.2 and beta editions.

Solr can be installed from source or from packages specific to Linux distros, but then you lose the integration features, like the startup script and the tools within ColdFusion Administrator for creating and managing Solr collections. The ColdFusion group "integrated" the bundled version just enough that it is slightly different from a mainstream install of Solr.

Finally you should be aware that the Solr search "language" or semantics are slightly different from the search "language" or semantics that work with Verity. The CFML language tag cfsearch has been updated to accommodate the differences, but you have to update any cfsearch calls in your code to use the new semantics.

addendum: 

Migrating a Verity collection is challenging.

If you had an "alpha" edition between CF8 and CF9.0.2, then you had a built-in tool for migrating from Verity to Solr.

If you have since upgraded (aka re-installed) with a "beta" edition then the option is gone.

But there is a tool to migrate CF8, CF9.0 and CF9.0.1 Verity collections "manually": copy a ColdFusion tool folder onto the older server (if it still has a Verity service), export the Verity collection to a CSV file, then copy the tool folder with the CSV file to the new ColdFusion server running Solr and import the exported Verity collection into a fresh Solr collection.


This tool works with CF8, CF9.0 and CF9.0.1 for exporting. But it does have some bugs.

On CF8.0 it may report the Verity service is not running even though it is; clicking the Export button still succeeds.


On CF9.0.1 "alpha", an Import will not succeed, but this is not surprising, since the Solr bundled with that edition was 11 months older than the one bundled with the "beta" editions and had a pathing problem.

On CF9.0.2 Imports succeed as expected.

The tool is not available from the Adobe ColdFusion download site, but is referenced many times in the Community forums as an unsupported "possibility".

The title mentions CF10, but it does indeed work for CF8 to CF9.0.2 migrations, and presumably CF10.

ColdFusion 10: Verity to Solr migration  

A migration tool for Verity to Solr for ColdFusion collections 


addendum 2:

There is an uninstall script in the Solr directory which will uninstall only the Solr service. 

If you do that, however, you may need to re-install, and may want to do so without disturbing the ColdFusion service or its configuration. You can do this by re-running the installer and choosing an EAR/WAR type of install; this compiles a separate install of the ColdFusion service intended to be copied to a different server.

One of the benefits of choosing the EAR/WAR type install is that it leaves the current install alone, and you can direct the storage path for the EAR/WAR result away from any current path in your existing ColdFusion service.

It also proceeds to offer up the choice of "installing" the Solr service and ColdFusion documentation again. I uncheck the options I don't need and let it re-install only the Solr service. Afterwards, the Solr service will need the fixes mentioned in this blog post.

One other thing you should be aware of about the ColdFusion "installers" is that they are bi-modal, in that they can be run in an X Windows GUI mode or a textual CLI mode using a command line argument.

If you have a desktop on the same network running X Windows, you can invoke the command remotely and it will forward the display to your local desktop. If you're remote and use X over an ssh connection, that will work as well. If you're running a Windows desktop, try installing Xming, the Xming fonts and XLaunch over PuTTY.

Xming X Server 

I tend to mention these options because the GUI is a bit more clear about the options, but the CLI is perfectly capable of performing the same steps.
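The Linux ColdFusion installers of this era are InstallAnywhere-based, so the mode switch should be the standard InstallAnywhere one; a sketch, with the installer file name as a placeholder for whichever edition you downloaded:

# GUI mode (needs an X display, local or forwarded)
./coldfusion-installer.bin -i gui

# Textual CLI mode, for a bare ssh session
./coldfusion-installer.bin -i console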

addendum 3:

A few common mistakes are:

1. An incomplete or wrong entry in the [solr.xml] file. I frequently leave off the "/solr1/" part of the path, resulting in an inaccessible service once it's started.

2. Missing the closing quotes in the /etc/init.d/cfsolr script. The quoting is very important; otherwise, on start up, the "su" command may glob and interpret the rest of the argument string in the wrong way, returning an error message that goes to /dev/null, and you will never see it. The only way to see it is to remove the stderr redirects and watch the output from the start or stop commands (see the sketch after this list).. usually something about option -X "unknown".

3. Not looking at the Solr admin webpage. It takes effort to reconfigure the Jetty service, open up the firewall, and hit it with a remote browser. But the ColdFusion service is not taking care of the Solr service, and the ColdFusion Administrator application is not providing troubleshooting tips. Solr is treated as a completely separate service, so you have to learn to take care of it on your own.
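Following up on mistake 2, a quick way to see the swallowed error is to run by hand what SOLRSTART runs, minus the log redirects, so the java error prints straight to the terminal; a sketch, assuming the coldfusion runtime user and the OpenJDK path shown earlier:

# Run as root: reproduce what the init script does, without the redirects
cd /opt/coldfusion9/solr
su -s /bin/sh coldfusion -c \
   "/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java -jar start.jar"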

Solr is highly configurable and runs in its own JVM instance, which means it can become memory sensitive; it may require tuning of the JVM environment, like raising initial heap sizes and max memory. Also, Solr 3.x requires either an advanced Jetty container or a full blown J2EE hosting container to provide the additional services that Solr 3.x depends on.

So upgrading from the bundled Solr 1.4.x included with ColdFusion 9 or ColdFusion 10 might not be such a good idea until the ColdFusion group has resolved the dependencies for you. 

Also, each Solr version seems to have slight dialect changes in its query language, which may or may not figure into the cfsearch tag support for the proper semantics. I have read good things from those moving up independently of the ColdFusion bundled version, but those same people seem to know a great deal about their Solr instances, beyond what a casual user would be interested in spending a lot of time reconciling.

1/05/2014

Mastering the Stuff of Life


My fascination with managing blood sugar continued. A few years ago I was under a lot of stress at work, and my Father passed away.

To make matters more interesting, my Doctor informed me I was losing control of my blood sugar, and if I didn't do something, Diabetes was probably not far off.

Perhaps fortunately, it gave me something to distract my attention and focus on, and I happened upon reducing table sugar in the diet. This led to people assuming I was "on a diet" and guessing it was "Low Carb".. a popular fad at the time.

The changes in blood chemistry were somewhat dramatic for me. And later I'd run into a YouTube video by Robert Lustig on a theory that it wasn't the sugar exactly, but the "Fructose" bound up in the sugar that was sabotaging all the finely tuned regulatory systems in our bodies, causing a cascading chain of events that.. coincidentally.. leads to a symptom called Obesity.

But the real problem is the release of Insulin, which is rather like the blinds on a window to a body cell: when it's released, it flips the blinds open so that glucose can enter the cell. The cells then "burn or store" the glucose. But when this happens too much, the cells become saturated and the Insulin "blind flipping" doesn't work any more. The glucose floods the blood, and the liver struggles to reduce the load on the arterial system by linking the glucose molecules together into chains, called Fatty Acids; specialized cells called fat cells scoop these out of the blood so there is room for other vital supplies to get moved around the body.

Too much glucose (or blood sugar) is like a traffic jam to the body: the transport system becomes derailed, and time-critical or life-critical messages and supplies can't get through. Tissues can die, regular processes can get thrown off track.. really bad things can happen. Not the least of which is that the lingering glucose and fatty acid soup can begin to "React" with raw body materials, "glycating" and permanently cementing various pathways and organs so that they never work again.

Managing blood sugar is a delicate dance between Life and Death. You need blood sugar to live. But too much and you don't.

Rather, the Liver is the ultimate machine for restoring an imbalanced body system. It has a lot of tricks up its sleeve, but with too much Fructose it can become pre-occupied or "stunned", such that it can't keep up with the emergency that a flood of blood sugar can become.

Fructose, ironically, is "what makes table sugar sweet"; it's relatively rare in fruits and berries or elsewhere in nature. But due to our growing expertise in chemistry outside the body, we have come up with large quantities of it and put it back into all kinds of foods that never had it before.

Fructose dulls the body's sensitivity to glucose; it cannot be used by any other part of the body except the Liver, which must convert it to fat and nasty by-products to dispose of it. Meanwhile, the rest of the body has to deal with the excess glucose that might accompany the Fructose without any help from the Liver.. it draws out the high blood sugar episode, wreaking more damage.

Worse, Fructose appears to disable the feedback loop that tells the body it's had enough food when eating.. hence it causes the body to draw in even larger meal quantities than it can actually use.. a kind of self-inflicted wound upon a person as they eat.

I've had a rather myopic view of Fructose for some time, but I think managing it is more important than managing the glucose or the table sugar. It just so happens we currently oversimplify and say anything sweet has sugar in it.. not quite true.. it probably has Fructose in it.. and the more it has, the worse it is for you.

Artificial sweeteners are also drawing my attention.

At first I thought some were okay, but the deeper I looked the more I realized many break down later in the gut.. due to bacteria we carry around inside us.. turning them back into simple sugars, or worse by-products.. which can promote some pretty bad things further down the digestive system.

We are only just now considering what happens inside us to the foodstuffs we can't digest.. fiber and such are not necessarily "inert" or harmless.

This year we kind of mapped the three big kinds of bacteria, the sulphur, methane and hydrogen by-product producers, and where they reside inside us. We recognized that they compete with bad bacteria and other microbes that would do us harm in an active, symbiotic way.. one that is more "current" than "any" static antibiotic medicine. Someday antibiotics will be looked upon with the simplicity of chemotherapy.. just too simplistic to ever believe they were effective.

In the same way vaccines "manipulate" a far more complex immune system into preparing for an assault the immune system cannot see coming, food choices can prepare the gut for an assault by chemicals and foodstuffs the gut bacteria cannot see coming.

Life gets more complicated.

February, March, April

The hits just kept on coming.

Shuttlecraft Galileo, the real one from Star Trek, made its final approach and landed at the Johnson Space Center near Houston, Texas.

I actually got to drive down there and stay one night across the street at a fine hotel.

Guests for the unveiling included Doctor Who (Sylvester McCoy), Captain Lochley from Babylon 5 - Season 5 (Tracy Scoggins), Buck Rogers (Gil Gerard), the Babylon 5 Ranger Marcus Cole, and Spock's co-pilot on the Galileo (Don Marshall).

Originally conceived as a "Helicopter" with antigravs and booster rockets, it was the original Star Trek commuter van.

They plan to build a replica of the hangar deck of the Enterprise 1701 at Space Center Houston to permanently display this visitor from The Last Generation.

The year in Review

New Year's is past, but it's still the dead of Winter. So pondering the recent past is a popular thing to do right now.

January 2013 started off with a Bang!

For over 5 years I'd been watching and participating in a "Fringe Group" online in a chatroom. It was an interesting time to be a 'Cord Cutter'. Broadcast TV had finally stopped treating online streaming like a freakish anomaly and started making episodes of recently broadcast commercial shows available within 2-3 hours on iTunes and other media outlets.



Just enough time, if you missed the 'live' event in the chatroom, to 'catch up' and re-watch the show before a post show held later that weekend to discuss it, with live 'Hosts' Darrell and Clint providing contextual commentary on the recent episodes and discussing how they fit into the overall story arc of the season and the series.

Up until the post show, you could call in or email voice files to be played and edited into a downloadable stream of the post show.. which people all across the World could then download to play 'catch-up' with the rest of the exclusive club of viewers participating in this Fringe "Group".

Even more awesome, the actual Stars, Writers and Producers of the show would call in just like viewers and participate in whatever fashion they pleased.. stealing fire from the heavens.. very Prometheus-like.

But alas, all good things come to an end.

Darrell and Clint hosted a live Party in "Real Space" at "The End of the World" in Oklahoma City, Oklahoma.

People from everywhere, across the country, across the World.. even Canada.. showed up in person.. and we got to watch the final Broadcast of the series Finale together.

 It was not a night to be missed.

And oh yeah, the Breakfast Club has nothing on this group - redoing the hallway scene outside the convention center.


Bittersweet, nothing since has been quite like it.


2014 a fresh start


Time to freshen things up a bit, with a new domain name and a new name for this site.

As of today it's been changed to my name [ John Willis ] and www.johnwillis.com; by chance the domain came up for renewal at the end of 2013 and it was recently transferred.

Hopefully it will bring with it a renewed interest in posting and developing a more robust website through which to communicate with the world.

For now Let's Just Say, this is where I hang my Hat.