Saturday, December 31, 2011

Apache Tika - a content analysis toolkit

Apache Tika 1.0 has been released. This release removes all deprecated pre-1.0 API methods and makes several configuration and OSGi improvements.

Tika provides a single API for extracting text and metadata and for detecting language from arbitrary input formats, such as PDFs, images, text documents, and spreadsheets. Even audio and video formats are supported to a certain degree.

Java SE 7 Update 2

Oracle has released Java SE 7 Update 2. The update includes a new version of HotSpot that improves performance and reliability, and it adds support for Solaris 11.

In addition, Java 7 Update 2 includes the SDK for developing JavaFX applications and, more importantly, the JavaFX Runtime is now installed with the JRE.