Saturday, 29 October 2011

Using 'R' for Statistical Computing

I recently tried out the 'R' toolkit for performing some statistical operations and was impressed with its power and extensibility in manipulating terabyte-sized data and its ability to produce publication-ready graphs. It is open-source, available under the GNU GPL, and can be downloaded from a CRAN site and set up to run on Windows, UNIX or Mac OS.
The toolkit provides a command-line environment for loading data into in-memory arrays and manipulating them, after which they can be subjected to further statistical analysis. While the majority of users see 'R' as a statistical package, it can also be used as a modelling tool for linear and non-linear models. Apart from providing some high-level functions for statistical computing, 'R' also has well-developed constructs for looping and conditional execution. It is extensible too, so users can write their own functions. The user interface is easy to get used to if you are coming from a UNIX background, but it might take some getting used to for Windows users.

The 'Help' module is well designed and easy to use. Documentation on any function can be loaded from the command line by preceding the function name with a '?' (for example, ?mean). Overall, 'R' is a powerful toolkit for statistical computing and an excellent choice for manipulating large data sets.

Tuesday, 18 October 2011

GeoTools - GIS for Java Developers

GeoTools is an excellent Geographical Information System (GIS) toolkit for Java. It is open-source and is updated regularly. The latest version at the time of writing is 2.7, released on the 7th of October 2011. The toolkit also serves as an engine for other open-source GIS tools such as uDig, Geomajas and GeoServer. GeoTools has good supporting documentation, with tutorials for setting up the toolkit in Eclipse / Maven. The tutorials also give some pointers on generic geospatial concepts, such as Features, Layers and Maps.

A Feature in the GIS world is any real-world entity that can be represented on a Map. Valid examples of such entities are rivers, buildings, roads, etc. In GeoTools, a Feature is an instance of a FeatureType. Thus, Features can be considered analogous to (Java) Objects, and FeatureTypes can be modelled as (Java) Classes. Features have attributes, similar to the Field concept in Java. For example, a river may have length, depth and water salinity as its attributes. Features also define operations, which are analogous to Methods in the Java world. This close analogy between Java concepts and the GIS toolkit implementation makes it easy for Java programmers to work with the toolkit.
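
The analogy can be sketched in plain Java. This is only an illustration: the River class, its fields and its methods are hypothetical names made up for this post and are not part of the GeoTools API, which has its own types for describing Features.

```java
// Hypothetical illustration only: River is not a GeoTools class.
class FeatureAnalogyDemo {
    public static void main(String[] args) {
        // Each instance plays the role of a Feature.
        River example = new River("Example River", 120.0, 4.5);
        System.out.println(example.describe());
        System.out.println(example.isLongerThan(100.0));
    }
}

// The class plays the role of a FeatureType: it fixes the attributes
// and operations that every river Feature has.
class River {
    // Attributes of the Feature, analogous to Java fields.
    private final String name;
    private final double lengthKm;
    private final double depthM;

    River(String name, double lengthKm, double depthM) {
        this.name = name;
        this.lengthKm = lengthKm;
        this.depthM = depthM;
    }

    double depthM() {
        return depthM;
    }

    // Operations of the Feature, analogous to Java methods.
    String describe() {
        return name + " (" + lengthKm + " km)";
    }

    boolean isLongerThan(double km) {
        return lengthKm > km;
    }
}
```

Here the class stands in for the FeatureType and each object created from it stands in for one Feature on the Map.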

Data in GIS (and GeoTools) can be represented in two forms: Raster or Vector. Raster data refers to digital images that can be transformed to a grid representation. Vector data uses Geometry to represent real-world elements. Geometry can be reduced to three main forms: Point, Line and Polygon. Geometry also gives a location attribute to the element it refers to.

Thus, in GeoTools, defining a Feature using a Vector datatype requires a Geometry component specification. A Feature also has a Style component associated with it. The Style component defines the rendering and look and feel of the Feature on the Map. Features are connected to a Map using Layers. A Map can have several Layers. A Layer contains a set of Features and their associated Styles. It is possible to overlay several Layers on a Map and thus have different objects visible in the same view. Usually, the Feature data (geometry and attributes) is stored in a file known as a ShapeFile, while the styling is defined separately.
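
The relationship between Maps, Layers, Features and Styles can be pictured with a small plain-Java model. This is a conceptual sketch only: SketchMap, SketchLayer, SketchFeature and GeometryKind are hypothetical names invented for this post, not GeoTools classes. As a simplification, it assumes one Style per Layer, described by a plain string.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only: none of these types belong to GeoTools.
class MapModelDemo {
    public static void main(String[] args) {
        SketchMap map = new SketchMap();

        SketchLayer rivers = new SketchLayer("rivers", "blue line");
        rivers.add(new SketchFeature("River A", GeometryKind.LINE));

        SketchLayer buildings = new SketchLayer("buildings", "grey fill");
        buildings.add(new SketchFeature("Town hall", GeometryKind.POLYGON));

        // Layers are overlaid in the order they are added.
        map.addLayer(rivers);
        map.addLayer(buildings);

        for (String line : map.drawOrder()) {
            System.out.println(line);
        }
    }
}

// The three main Geometry forms used by Vector data.
enum GeometryKind { POINT, LINE, POLYGON }

// A Feature with a name and the kind of Geometry that locates it.
class SketchFeature {
    final String name;
    final GeometryKind geometry;

    SketchFeature(String name, GeometryKind geometry) {
        this.name = name;
        this.geometry = geometry;
    }
}

// A Layer groups Features with the Style used to draw them.
class SketchLayer {
    final String name;
    final String style;
    private final List<SketchFeature> features = new ArrayList<>();

    SketchLayer(String name, String style) {
        this.name = name;
        this.style = style;
    }

    void add(SketchFeature feature) {
        features.add(feature);
    }

    List<SketchFeature> features() {
        return features;
    }
}

// A Map holds several Layers, drawn from first added to last added.
class SketchMap {
    private final List<SketchLayer> layers = new ArrayList<>();

    void addLayer(SketchLayer layer) {
        layers.add(layer);
    }

    List<String> drawOrder() {
        List<String> out = new ArrayList<>();
        for (SketchLayer layer : layers) {
            for (SketchFeature f : layer.features()) {
                out.add(layer.name + ": " + f.name + " (" + f.geometry
                        + ") drawn with " + layer.style);
            }
        }
        return out;
    }
}
```

In this model the Map knows nothing about individual Features; it only holds Layers, which is what makes overlaying several Layers in one view straightforward.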

To summarise the process of rendering a Map using GeoTools, the following steps need to be followed:

1. Create a Map, using a DefaultMapContext
2. Load the ShapeFile
3. Extract the FeatureSource from the ShapeFile  [FileDataStoreFinder.getDataStore(shapeFileLocation).getFeatureSource()]
4. Create a Style using the FeatureSource. GeoTools has some good tutorials on getting this done.
  A Style involves creating FeatureTypeStyles and associating Symbolizers and Rules with the defined FeatureTypeStyles.
5. Add the Style to a Layer of the Map.
6. Add the Map and a Renderer to a JMapPane, which extends JPanel.
7. Display the JMapPane.

GeoTools also has tools available for creating ShapeFiles from CSV files. Overall, GeoTools simplifies the process of creating GIS applications in Java and enables applications to be up and running within a matter of hours.