Document Detection

Overview

Document Detection is one of the intelligence services of the Filestack platform. It detects a document in an image, warps it to fill the whole frame, and can preprocess it (for example, de-noising and distortion reduction) to increase the accuracy of the OCR engine in text extraction.

Resources

The Document Detection API accepts only images with a resolution of at most 2000×2000 pixels. You can chain a Resize task before it to bring a larger image within this limit.
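As a rough illustration of the size limit above, the following Python sketch computes dimensions that fit within 2000×2000 pixels and builds a matching resize task. The helper names are hypothetical, and the sketch assumes that resizing by height alone (the form used in the chained example on this page) preserves the aspect ratio.

```python
from typing import Optional, Tuple

MAX_SIDE = 2000  # limit stated above for Document Detection input


def fit_within_limit(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Scale (width, height) down to fit max_side x max_side, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= max_side and height <= max_side:
        return width, height  # already compatible, nothing to do
    if width >= height:
        # Width is the limiting side; integer math avoids rounding above the limit.
        return max_side, max(1, height * max_side // width)
    return max(1, width * max_side // height), max_side


def resize_task(width: int, height: int) -> Optional[str]:
    """Return a resize task segment (hypothetical helper), or None if no resize is needed."""
    new_width, new_height = fit_within_limit(width, height)
    if (new_width, new_height) == (width, height):
        return None
    # Assumption: resizing by height alone keeps the aspect ratio.
    return f"resize=h:{new_height}"


print(fit_within_limit(2912, 1804))
print(resize_task(2912, 1804))
```

The returned segment would be placed before the doc_detection task in the URL.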

Processing API

Document Detection is available as a synchronous operation in the Processing API using the following task:

doc_detection=coords:<coords>,preprocess:<preprocess>

The coords and preprocess parameters in the signature above are optional. If you omit them and use only /doc_detection/, the default values coords:false and preprocess:true apply.

To use this task in the Processing API, you must provide a security policy and signature. Read more about security policies here.
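The following Python sketch shows one way a policy and signature pair might be generated. It is not taken from this page: it assumes the scheme described in Filestack's general security documentation (a Base64URL-encoded JSON policy, signed with HMAC-SHA256 using the application secret), and the "convert" call permission is an assumption. Check the security policy documentation for the permissions your account requires.

```python
import base64
import hashlib
import hmac
import json
from typing import List, Tuple


def make_policy_and_signature(app_secret: str, expiry: int, calls: List[str]) -> Tuple[str, str]:
    """Return (policy, signature) strings for the security=p:...,s:... task.

    Assumptions (not confirmed by this page): the policy is Base64URL-encoded
    JSON with "expiry" and "call" fields, and the signature is the hex
    HMAC-SHA256 of the encoded policy keyed with the app secret.
    """
    policy_json = json.dumps({"expiry": expiry, "call": calls})
    policy = base64.urlsafe_b64encode(policy_json.encode("utf-8")).decode("ascii")
    signature = hmac.new(
        app_secret.encode("utf-8"), policy.encode("ascii"), hashlib.sha256
    ).hexdigest()
    return policy, signature


# Placeholder secret and expiry (Unix timestamp); "convert" is an assumed permission.
policy, signature = make_policy_and_signature("APP_SECRET", 1893456000, ["convert"])
print(f"security=p:{policy},s:{signature}")
```

Generate these values on a server so that the application secret is never exposed to clients.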

Parameters

coords	boolean (default: false)	Whether the task returns the coordinates of the detected document in the image.
preprocess	boolean (default: true)	Whether the task returns the preprocessed image (true) or only the warped one (false).
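The optional parameters above can be assembled into a task segment programmatically. This is a minimal Python sketch; the function name is hypothetical, and parameters left as None are omitted so that the service defaults described above apply.

```python
from typing import Optional


def doc_detection_task(coords: Optional[bool] = None, preprocess: Optional[bool] = None) -> str:
    """Build the doc_detection task segment (hypothetical helper).

    Parameters left as None are omitted, so the documented defaults
    (coords:false, preprocess:true) are applied by the service.
    """
    params = []
    if coords is not None:
        params.append(f"coords:{str(coords).lower()}")  # True -> "true"
    if preprocess is not None:
        params.append(f"preprocess:{str(preprocess).lower()}")
    if not params:
        return "doc_detection"
    return "doc_detection=" + ",".join(params)


print(doc_detection_task())
print(doc_detection_task(coords=True))
print(doc_detection_task(coords=False, preprocess=False))
```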

Response

Original image:

/doc_detection=coords:true/

{
    "coords": {
        "x": 0,
        "y": 300,
        "width": 2912,
        "height": 1804
    }
}

Response Parameters

coords	object	Defines the bounding box of the document detected in your image.
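To make the bounding box concrete, this Python sketch converts a coords response into the four corner points of the box. It assumes that x and y refer to the top-left corner, as stated in the Workflows response description on this page; the function name is hypothetical.

```python
import json
from typing import Dict, Tuple


def bounding_box_corners(response_text: str) -> Dict[str, Tuple[int, int]]:
    """Convert a coords response into corner points (hypothetical helper).

    Assumes x/y is the top-left corner of the box, in pixels.
    """
    coords = json.loads(response_text)["coords"]
    left, top = coords["x"], coords["y"]
    right = left + coords["width"]
    bottom = top + coords["height"]
    return {
        "top_left": (left, top),
        "top_right": (right, top),
        "bottom_right": (right, bottom),
        "bottom_left": (left, bottom),
    }


# Sample response copied from the example above.
sample = '{"coords": {"x": 0, "y": 300, "width": 2912, "height": 1804}}'
print(bounding_box_corners(sample))
```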

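/doc_detection=coords:false,preprocess:true/ returns the preprocessed, warped document image.

/doc_detection=coords:false,preprocess:false/ returns the warped document image without preprocessing.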

Examples

Get the coordinates of the detected document in the image (the result is the same for either value of preprocess):

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:true,preprocess:true/<HANDLE>

Get the preprocessed warped document from your original image:

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:false,preprocess:true/<HANDLE>

Get the warped document from your original image:

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:false,preprocess:false/<HANDLE>

Use doc_detection in a chain with other tasks such as resize:

https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/resize=h:<HEIGHT>/doc_detection=coords:false,preprocess:true/<HANDLE>

Use doc_detection with an external URL:

https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:<COORDS>,preprocess:<PREPROCESS>/<EXTERNAL_URL>

Use doc_detection with Storage Aliases:

https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:<COORDS>,preprocess:<PREPROCESS>/src://<STORAGE_ALIAS>/<PATH_TO_FILE>
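The URL patterns above all share the same layout, which the following Python sketch assembles from its parts. The function and argument names are hypothetical. The source is inserted as-is, so whether an external URL needs additional encoding is left open here.

```python
from typing import Optional, Sequence

BASE_URL = "https://cdn.filestackcontent.com"


def doc_detection_url(
    source: str,
    policy: str,
    signature: str,
    coords: bool = False,
    preprocess: bool = True,
    api_key: Optional[str] = None,
    pre_tasks: Sequence[str] = (),
) -> str:
    """Assemble a Processing API URL following the examples above (hypothetical helper).

    source is a file handle, an external URL, or a src://<alias>/<path> reference;
    the examples above include api_key for the latter two.
    """
    segments = [BASE_URL]
    if api_key:
        segments.append(api_key)
    segments.append(f"security=p:{policy},s:{signature}")
    segments.extend(pre_tasks)  # e.g. a resize task placed before detection
    segments.append(
        f"doc_detection=coords:{str(coords).lower()},preprocess:{str(preprocess).lower()}"
    )
    segments.append(source)
    return "/".join(segments)


print(doc_detection_url("<HANDLE>", "<POLICY>", "<SIGNATURE>", coords=True))
print(doc_detection_url("<HANDLE>", "<POLICY>", "<SIGNATURE>", pre_tasks=["resize=h:1000"]))
```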

Workflows Task Configuration

Visit the Creating Workflows tutorial to learn how to use the Workflows UI to configure your tasks and the logic between them.

The Document Detection task is available under the Intelligence tasks category.

Workflows Parameters

Task Name	string	Unique name of the task. It is included in the webhook response and can be used to build the logic below.
coords	boolean (default: false)	Whether the task returns the coordinates of the detected document in the image.
preprocess	boolean (default: true)	Whether the task returns the preprocessed image (true) or only the warped one (false).

Logic

The Document Detection task returns the following responses to the workflow:

If coords is enabled:

{
    "data": {
        "coords": {
            "x": "X coordinate of the top left corner of the bounding box",
            "y": "Y coordinate of the top left corner of the bounding box",
            "width": "Width of the bounding box",
            "height": "Height of the bounding box"
        }
    }
}

If coords is not enabled:

{
    "url": "the URL where the image is stored",
    "mimetype": "image/<image_format>",
    "size": "image size"
}

Logic Parameters

data	dictionary	Contains the coordinates of the detected document.
coords	object	Bounding box of the detected document: the x and y coordinates of its top-left corner, plus its width and height.
url	string	URL of the result file.
mimetype	string	MIME type of the result file.
size	integer	Size of the result file in bytes.
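Because the task returns one of two response shapes, code that consumes the result needs to tell them apart. This Python sketch shows one way to do so, assuming the result has already been parsed into a dictionary with the fields listed above; the function name and the sample values are illustrative only.

```python
def summarize_result(result: dict) -> str:
    """Describe a Document Detection task result (hypothetical helper).

    Handles both shapes described above: a bounding box under data.coords
    when coords is enabled, or file details when it is not.
    """
    data = result.get("data")
    if isinstance(data, dict) and "coords" in data:
        box = data["coords"]
        return f"document at ({box['x']}, {box['y']}), {box['width']}x{box['height']} px"
    if "url" in result:
        return f"{result['mimetype']} file of {result['size']} bytes at {result['url']}"
    raise ValueError("unrecognized Document Detection result")


# Illustrative values only, not real task output.
print(summarize_result({"data": {"coords": {"x": 0, "y": 300, "width": 2912, "height": 1804}}}))
print(summarize_result({"url": "https://example.com/out.png", "mimetype": "image/png", "size": 350000}))
```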

Based on the response from the task, you can build logic that tells the workflow how dependent tasks should be executed. For example, to run another task only if the image size is greater than or equal to a specific value, you can use the following rule:

size gte 300000

In the Workflows UI, this rule looks like the following example:

Visit the Creating Workflows tutorial to learn how to use the Workflows UI to configure your tasks and the logic between them.
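To illustrate what the rule size gte 300000 means, the following Python sketch evaluates a rule of that form against a task response locally. Workflows evaluates rules on the platform itself; this is only a model of the comparison. Only gte appears on this page, and the other operator names in the table are assumptions.

```python
import operator

# Only "gte" is taken from this page; the other names are assumed for illustration.
OPERATORS = {
    "gte": operator.ge,
    "gt": operator.gt,   # assumption
    "lte": operator.le,  # assumption
    "lt": operator.lt,   # assumption
}


def evaluate_rule(rule: str, response: dict) -> bool:
    """Evaluate a '<field> <operator> <integer>' rule against a response dictionary."""
    field, op_name, raw_value = rule.split()
    if op_name not in OPERATORS:
        raise ValueError(f"unsupported operator: {op_name}")
    return OPERATORS[op_name](response[field], int(raw_value))


# Illustrative response values, not real task output.
print(evaluate_rule("size gte 300000", {"size": 350000}))
print(evaluate_rule("size gte 300000", {"size": 120000}))
```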

Webhook

Below you can find an example webhook payload for a Document Detection task on a sample image:

[Image: screenshot of a Document Detection task in Filestack]