Apache Tika 1.0 has been released. This release removes all deprecated pre-1.0 API methods and makes several configuration and OSGi improvements.
Tika provides a single API for extracting data and detecting language from arbitrary input formats, such as PDFs, images, text documents, and spreadsheets. Audio and video formats are also supported to a certain degree.