Showing posts with label Apache. Show all posts

Saturday, 1 June 2013

Configuring Solr 4.0 to index data from a MySQL database


In a previous post, I documented the steps in setting up Solr 4.0 on a Tomcat 7.0.28 instance. In this post, I'll describe the steps involved in configuring the same Solr instance to index data from a MySQL database.

1. Set up another Solr core.

A Solr core holds the configuration details for connecting to a data store as well as the indexed data from that data store.
The simplest way to set up a core is to copy the collection1 core and rename the copy 'test_core'.
To make Solr aware of this new core, bring up the Solr Admin console at http://localhost:8080/apache-solr-4.0.0/#/
Select the Core Admin option, click Add Core, and supply the parameters for name, instanceDir, dataDir, config and schema.

As a result of adding this core, the solr.xml file in the root of the solr_home directory gains a <core> entry for test_core.
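As a rough sketch, assuming the core was registered under the name test_core with the default file names, the entry added inside the <cores> element would look something like this. The attribute values are assumptions that mirror the parameters typed into the Add Core form; yours may differ.

```xml
<!-- Hypothetical values: they mirror the parameters supplied in the Add Core form -->
<core name="test_core" instanceDir="test_core" dataDir="data" config="solrconfig.xml" schema="schema.xml"/>
```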

2. Set up the required libraries

Indexing data from the MySQL database is done through Solr's DataImportHandler, so we need the additional Solr libraries. These jars are available in the unzipped Solr download at c:\apache-solr-4.0.0\dist
Copy all jar files from this directory into a lib directory under your solr_home. The solr_home directory should look similar to the screenshot below



I could have transferred these jars to my Tomcat/lib and had Solr use them from there, but I was worried about conflicts with existing jars.
Finally, tell Solr about this lib directory by adding the sharedLib="lib" attribute to the root <solr> element in the solr.xml file.
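A sketch of how the resulting solr.xml might look in Solr 4.0 is given below. Only sharedLib="lib" and the test_core name come from the steps above; the other attribute values are assumed defaults from the stock example file and may differ in your copy.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Sketch only: everything except sharedLib and test_core is an assumed default -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1"/>
    <core name="test_core" instanceDir="test_core"/>
  </cores>
</solr>
```

The sharedLib path is resolved relative to the Solr home directory, so "lib" here points at the lib folder under solr_home.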


3. Configure your test_core to talk to the database.

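To enable this, follow the instructions given on http://wiki.apache.org/solr/DIHQuickStart
In summary, you will:
(a) Modify the solrconfig.xml and add the definition for a DataImportHandler

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>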

(b) Create a data-config.xml file in the same directory as your solrconfig.xml and specify the database connection parameters


  
  
    
      
      
      
    
  

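As a minimal sketch of such a data-config.xml, following the layout used on the DIHQuickStart page: the database name, credentials, table name and the id and name columns below are all placeholders I have assumed for illustration; only the 'description' field comes from this post.

```xml
<dataConfig>
  <!-- Connection details are placeholders: substitute your own database, user and password -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/test"
              user="db_user"
              password="db_password"/>
  <document>
    <!-- 'vehicle' and the id/name columns are hypothetical; 'description' is the field mentioned in this post -->
    <entity name="vehicle" query="SELECT id, name, description FROM vehicle">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>
```

For the driver class to load, the MySQL Connector/J jar also has to be on Solr's classpath, for example in the shared lib directory set up in step 2.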
Ensure that the name values in the <field> elements have matching field definitions in the schema.xml.
For example, the field 'description' must be defined in the schema.xml.

Finally, we are all set up to import records from our MySQL test schema.

4. Select the test_core in the Solr Admin console and click on the DataImport option, which should be the last option available in the menu.
Clicking on the DataImport option brings up the DataImporter interface. Select the Clean and Commit checkboxes and click on Execute Import.

When the indexing is complete, you will see a message giving the number of documents added or deleted.
To check that the indexes have been created as expected and that we are getting sensible results back, perform a search using the 'Query' option available under 'test_core'.

In my case, I searched for the term 'Hyundai' as I knew of one record having that particular value in the column that has been indexed.
And we get 1 record returned.



Thursday, 25 April 2013

Setting up Solr 4.0 on Tomcat 7.0.28

Almost every user-centric application requires a search capability. With Google easily leading the search domain, user expectations are generally high when they come to a web application and perform a search. If giving users the Google experience is not possible, consider Apache Solr, a popular open-source Java-based search tool that runs Lucene under the hood.
While Solr comes packaged with Jetty and has a basic tutorial that provides pointers on its different features, setting it up to run within an existing Apache Tomcat instance can require a bit of effort. There are several files and folders, and it is not immediately clear which files control the configuration of the tool. This post documents the steps involved in setting up Solr to run with Apache Tomcat 7.0.28 on a Windows server.

Step 1: Download a copy of Solr and unzip it to apache-solr-4.0.0.

Step 2: Create a new directory, c:\solr_home, which will be the home of Solr, and copy the following files and folders into it. (You will find them within the unzipped apache-solr-4.0.0 folder.)
bin
collection1
solr.xml
zoo.cfg

collection1 is an existing example which we will load into our Tomcat. Within the collection1 directory you'll find two other folders, conf and data. Just verify that these are present.

Now we'll connect Solr to Tomcat.

Step 3: Copy the apache-solr-4.0.0.war (which you will find in apache-solr-4.0.0\dist) into your Tomcat webapps directory.

Step 4: Tell Solr where the existing search configuration files for collection1 reside.
To do this add the location of the Solr Home directory to your Tomcat JAVA_OPTS. I am setting this value in the setenv.bat file and the declaration looks like this :

set JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF-8 -server -Xms1536m -Xmx1536m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+DisableExplicitGC -Dsolr.solr.home=C:/solr_home

Step 5: Start Tomcat and browse to your loaded Solr instance at http://localhost:8080/apache-solr-4.0.0/#/. You should see the Solr Admin screen as below:
Solr Admin Console for Collection 1
To check that Solr has been configured and set up correctly, select collection1 in the left-hand sidebar and select the Query option. Enter 'solr' in the textbox labelled 'q' as shown in the screenshot below.

Default Search Interface
On submitting the query, one result is returned by Solr. On clicking on the result link, the record is displayed as XML.
Search Result for 'Solr'


In the next post, I'll document the steps of integrating Solr with an existing web-application which is running on the same Tomcat instance with a MySQL back-end.

Thursday, 17 January 2013

Apache Commons Lang Library : Managing Strings without NullPointers

The Apache Commons Lang 3.0 package is a useful set of utilities and interface definitions for String and Character manipulation. With respect to String manipulation, the StringUtils, StringEscapeUtils, RandomStringUtils, StrTokenizer and WordUtils classes provide a number of utility methods for manipulating and managing Strings.

The StringUtils class has some interesting functions such as:
  • IsEmpty/IsBlank - checks if a String contains text. The main difference is that IsBlank also treats a String made up only of whitespace as blank, while IsEmpty only checks for null or zero length.
  • Trim/Strip - removes leading and trailing whitespace
  • Equals - compares two strings null-safe
  • startsWith - check if a String starts with a prefix null-safe
  • endsWith - check if a String ends with a suffix null-safe
  • IndexOf/LastIndexOf/Contains - null-safe index-of checks
  • IndexOfAny/LastIndexOfAny/IndexOfAnyBut/LastIndexOfAnyBut - index-of any of a set of Strings
  • ContainsOnly/ContainsNone/ContainsAny - does the String contain only/none/any of these characters
  • Substring/Left/Right/Mid - null-safe substring extractions
  • SubstringBefore/SubstringAfter/SubstringBetween - substring extraction relative to other strings
  • Split/Join - splits a String into an array of substrings and vice versa
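While these functions may appear similar to the ones provided by the String class itself, the StringUtils versions are null-safe, which eliminates the possibility of those horrible NPEs. A null input is handled quietly rather than throwing an exception: methods that return a String generally return null (for example, StringUtils.trim(null) returns null), while the boolean checks return a plain true or false (StringUtils.isEmpty(null) returns true).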

Thursday, 11 October 2007

Configuring Middlegen to generate Hibernate files from MySQL

Following on from my earlier post, I'll now show you how to configure Middlegen to talk to your MySQL database. You will need to have ANT installed in order to run the ANT tasks that I customized to build the hbms and the Java objects.

I performed the Middlegen connection tasks using ANT version 1.7, MySQL version 5.0.24-community-nt, MySQL client version 5.1.11 and Middlegen 2.1.

In your build.xml, set the following Middlegen property:
<property name="Middlegen.home" value="${lib}/Middlegen"/>

The lib directory has 2 main jars, Middlegen-2.1.jar and Middlegen-hibernate-plugin-2.1.jar.

Both these jars are required to
(1) Run Middlegen and connect to MySQL
(2) Create Hbm mappings from the database and convert the hbm mappings into Java objects.

First, use the Middlegen-init ANT task to create the directory where the generated files will be stored.

<target name="Middlegen-init"
        description="Initializes everything, creates directories, etc.">
  <mkdir dir="${gen.java}" />
</target>


The next task is Middlegen, which talks to the database. In order to do so, it needs to know where the database is located, which driver file to use and which connection properties to use. Supply all of this by defining the following properties in the build.xml:

<property name="database.initialise.script" value="${main.resources}/database/ddl/FULL_DROP_INITIALISE.sql"/>
<property name="database.driver.file" value="${lib}/mysql-connector-java-3.0.14-production-bin.jar"/>
<property name="database.driver.classpath" value="${database.driver.file}"/>
<property name="database.driver" value="org.gjt.mm.mysql.Driver"/>
<property name="database.url" value="jdbc:mysql://localhost/testdatabase"/>
<property name="database.userid" value="root"/>
<property name="database.password" value="root"/>
<property name="database.schema" value="testdatabase"/>
<property name="database.catalog" value=""/>

You'll notice that the properties talk about a 'database.driver' which should be on the 'database.driver.classpath'.
This driver can be found within the mysql-connector-java-3.0.14-production-bin.jar so it should be downloaded and made available to the application.
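The taskdefs in this post load their classes through a classpath reference named lib.class.path, which is not defined anywhere in the post. One way to define it, as a sketch that assumes the driver jar, the Middlegen jars and the Hibernate tools jars all live somewhere under ${lib}, is a plain Ant path:

```xml
<!-- Assumption: every jar the tasks need sits under ${lib}, including sub-folders such as ${lib}/Middlegen -->
<path id="lib.class.path">
  <fileset dir="${lib}">
    <include name="**/*.jar"/>
  </fileset>
</path>
```

Defining the path at the top level of build.xml makes it available to any target that uses classpathref="lib.class.path".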

Now, write the following ANT tasks in your build.xml

<!-- Middlegen related Tasks -->
<!-- =================================================================== -->
<!-- Run Middlegen -->
<!-- =================================================================== -->
<target
  name="Middlegen"
  description="Run Middlegen"
  unless="Middlegen.skip"
  depends="Middlegen-init"
>

  <!-- The task class lives in the lowercase 'middlegen' package -->
  <taskdef
    name="Middlegen"
    classname="middlegen.MiddlegenTask"
    classpathref="lib.class.path"
  />

  <Middlegen
    appname="${name}"
    prefsdir="${Middlegen.prefs}"
    gui="${gui}"
    databaseurl="${database.url}"
    initialContextFactory="${java.naming.factory.initial}"
    providerURL="${java.naming.provider.url}"
    datasourceJNDIName="${datasource.jndi.name}"
    driver="${database.driver}"
    username="${database.userid}"
    password="${database.password}"
    schema="${database.schema}"
    catalog="${database.catalog}"
    includeViews="false"
  >

    <!-- Sets up the hibernate plug-in for Middlegen -->

    <hibernate
      destination="${gen.java}"
      package="${name}.persistence"
      genXDocletTags="true"
      javaTypeMapper="middlegen.plugins.hibernate.HibernateJavaTypeMapper"
    />
  </Middlegen>

</target>

<!-- =================================================================== -->
<!-- Run hbm2java -->
<!-- =================================================================== -->
<target name="hbm2java" description="Generate .hbm and then .java from .hbm files.">
  <taskdef
    name="hbm2java"
    classname="net.sf.hibernate.tool.hbm2java.Hbm2JavaTask"
    classpathref="lib.class.path"
  />
  <!-- Reads every mapping file under ${gen.java} and writes the Java sources alongside -->
  <hbm2java output="${gen.java}">
    <fileset dir="${gen.java}">
      <include name="**/*.hbm.xml"/>
    </fileset>
  </hbm2java>
</target>
<!-- End of Middlegen related Tasks -->
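The targets above also rely on a few properties that this post does not define: name, gui, gen.java and Middlegen.prefs. A sketch of how they might be declared is shown here; every value is a placeholder I have assumed for illustration, so adjust them to your own project layout.

```xml
<!-- All values below are hypothetical placeholders -->
<!-- Application name; also used as the package prefix in ${name}.persistence -->
<property name="name" value="testapp"/>
<!-- Whether Middlegen should open its GUI before generating -->
<property name="gui" value="false"/>
<!-- Directory that receives the generated .hbm.xml and .java files -->
<property name="gen.java" value="${basedir}/build/gen-src"/>
<!-- Directory where Middlegen stores its preferences -->
<property name="Middlegen.prefs" value="${basedir}/src"/>
```

Because name is used to build a Java package name, it should be a valid package identifier.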

To create the HBM mappings for ALL tables in your 'testdatabase' schema:
Run the 'Middlegen' task which will
(1) create a directory to store the generated files
(2) connect to your MySQL database based on the params supplied and generate the hbm mapping files.

You can customize the hbm files according to your requirements. Then run the hbm2java mapper task on these files by invoking the target 'hbm2java'. The java files will be generated and stored in the same directory alongside their respective hbms.

Well, that's all there is to it.

Middlegen is a very useful tool as it takes away the pain of manually creating these files and helps you set up your Hibernate environment within a couple of hours.

For more info on the subject refer to the Middlegen homepage.

Saturday, 23 June 2007

My experience on building projects with Maven

I have been using Maven for the last couple of projects and have begun to realise the cool things that are possible. Claims that Maven is much more than a build tool and rather a project management tool appear justified.

Maven is best suited for setting up a project structure (I mean the essential directory structure for a web-based project) with minimum fuss. There are several public repositories such as http://www.ibiblio.com available for downloading project dependencies and associating them with the project. If your required jar isn't available, you don't need to fret. Download it from somewhere and update the ibiblio repository, so the next time anyone needs it, they can get it.

Another advantage compared to its traditional rival ANT is that Maven follows the OO paradigm and supports the concept of a super class (read Parent POM). Low-level tasks like code compilation are carried out by Maven without needing any additional tweaking from the developer.

Maven also breaks new ground with its project site generation, reporting, source control management and well.. a lot of stuff. I could go on and on singing praises but hey, why not try something out for yourself.
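To illustrate the Parent POM idea, here is a minimal sketch of a child pom.xml that inherits from a parent. The group, artifact and version values are made-up placeholders, not taken from any real project.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical parent: the child inherits its settings much like a subclass inherits from a super class -->
  <parent>
    <groupId>com.example</groupId>
    <artifactId>example-parent</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>

  <!-- groupId and version are inherited from the parent, so only the artifactId is declared -->
  <artifactId>example-webapp</artifactId>
  <packaging>war</packaging>
</project>
```

Shared dependencies and plugin configuration can then be declared once in the parent rather than repeated in each module.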
This tutorial has a handy Maven command reference, to get you started.