PDF to text

September 22, 2011

This service extracts the text content from a PDF file using the pdftotext executable from Xpdf (http://www.foolabs.com/xpdf/). The extracted text often contains characters that are not valid in XML, so the service returns it in binary or Base64-encoded form. Clean the decoded text before sending it to another web service.
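
The cleaning step mentioned above could look roughly like the following Python sketch. It is not part of the service: the function name decode_and_clean, the UTF-8 default encoding, and the choice to replace each invalid character with a space are all assumptions made for illustration. It decodes a Base64 payload and removes every character that falls outside the XML 1.0 character range.

```python
import base64
import re

# Matches any character outside the XML 1.0 "Char" production. Allowed are
# tab, LF, CR, and the ranges U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def decode_and_clean(b64_text, encoding="utf-8", replacement=" "):
    """Decode a Base64 payload and replace XML-invalid characters.

    Hypothetical helper: the encoding is an assumption, since the service
    description does not say which encoding the extracted text uses.
    """
    raw = base64.b64decode(b64_text)
    # Undecodable bytes become U+FFFD, which is itself valid in XML.
    text = raw.decode(encoding, errors="replace")
    # Control characters such as form feed (\x0c) and NUL are replaced here.
    return _XML_INVALID.sub(replacement, text)


if __name__ == "__main__":
    # Simulated service response: two "pages" separated by a form feed.
    payload = base64.b64encode(b"Page 1\x0cPage 2").decode("ascii")
    print(decode_and_clean(payload))
```

Replacing rather than deleting keeps words on either side of a control character from running together; pass replacement="" if plain removal is preferred.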

Name: PDF to text
Documentation: http://gnode1.mib.man.ac.uk:8080/ArticleSectionClassifierWebApp/
Protocol: SOAP
WSDL:
Endpoint: http://gnode1.mib.man.ac.uk:8080/FullTextWebServices/PdfToTextService
Topic: General
Type: Text Mining
Original source: BioCatalogue
