Extract data from single input

POST /documents/single

Receives a single file input and attempts to classify the document (identify its type), then extract structured data from it. Structured data is presented as a collection of inter-related nodes forming a directed acyclic graph. Relationships between nodes are expressed by using the URI of another node as a property value.
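Because relationships are expressed as URI-valued properties, a client usually has to index the nodes by URI before it can follow a reference. The following is a minimal sketch of that step. The node shape is an assumption made for illustration: the "uri" key, the other property names, and the example URIs are hypothetical and not taken from this endpoint's schema. It also handles only single-valued properties.

```python
from typing import Any

# Hypothetical response shape: each node is a dict with a "uri" key, and a
# property may hold the URI of another node. All key names here are assumed.
nodes = [
    {
        "uri": "urn:node:invoice",
        "type": "Invoice",
        "supplier": "urn:node:supplier",
        "total": "120.00",
    },
    {"uri": "urn:node:supplier", "type": "Organization", "name": "Acme Ltd"},
]


def index_by_uri(nodes: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Map each node's URI to the node itself."""
    return {node["uri"]: node for node in nodes}


def resolve(node: dict[str, Any], index: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Return a copy of node with URI-valued properties replaced by the
    nodes they point to. Recursion terminates because the graph is acyclic."""
    resolved: dict[str, Any] = {}
    for key, value in node.items():
        # Leave the node's own URI alone; follow any other string property
        # whose value matches a known node URI.
        if key != "uri" and isinstance(value, str) and value in index:
            resolved[key] = resolve(index[value], index)
        else:
            resolved[key] = value
    return resolved
```

With the sample data, `resolve(index_by_uri(nodes)["urn:node:invoice"], index_by_uri(nodes))` yields an invoice whose "supplier" property is the full supplier node rather than its URI.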

Ensure either a content type or a file extension is supplied with the form data. A PDF with no text content, or an image, is processed with machine learning to identify text and its bounding boxes before classification and extraction. A PDF with text content uses that content directly and is slightly faster.
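One way to satisfy the content type or file extension requirement is to set both on the file part of the form data. The sketch below builds a multipart/form-data body with the Python standard library only. The form field name "file" and the helper name `build_form_data` are assumptions for illustration; check the Request section for the field name the endpoint actually expects.

```python
import mimetypes
import uuid


def build_form_data(filename: str, data: bytes, field: str = "file") -> tuple[bytes, str]:
    """Encode one file as multipart/form-data.

    Returns (body, value for the request's Content-Type header).
    The default field name "file" is an assumption, not documented here.
    """
    # Guess the part's content type from the file extension; fall back to a
    # generic binary type when the extension is unknown.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be passed to any HTTP client when sending the POST request. Keeping the original filename in the part means the extension is still available even if the content type guess falls back to the generic value.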

The returned nodes vary depending on the type of document and data within.

This endpoint is synchronous and will return a response within roughly ten seconds for most documents. For multi-page image capture or long PDFs, use an extraction session instead, which allows background processing while new pages are being captured.

Request

Responses

Document extraction completed without error