⚙ī¸Options

The options object in Extracta.ai's API requests provides additional customization for your data extraction process. By setting different properties within this object, you can tailor the extraction to suit the specific needs of your documents. Here's how each option affects your data extraction:

Options Overview

hasTable

  • Type: Boolean

  • Not required

  • Default: false

  • Description: Indicates whether the document to be processed contains tables. When set to true, the extraction process includes an additional step specifically designed to analyze and extract information from tables within the document. This option ensures that table data is accurately recognized and extracted, providing structured information that's easy to use and analyze.

handwrittenTextRecognition

  • Type: Boolean

  • Not required

  • Default: false

  • Description: Determines if the document includes handwritten text that needs to be recognized and extracted. Setting this option to true initiates a specialized step in the extraction process focused on analyzing handwritten text. This feature leverages advanced OCR and machine learning techniques to convert handwritten notes into digital text, enhancing the comprehensiveness of the data extraction.

checkboxRecognition

  • Type: Boolean

  • Not required

  • Default: false

  • Description: Determines if the document contains checkboxes that need to be recognized and their states (checked or unchecked) extracted. When set to true, the extraction process includes a specialized step focused on identifying checkboxes within the document and accurately determining their status.

specificPageProcessing

  • Type: Boolean

  • Not required

  • Default: false

  • Description: When set to true, only a specified range of pages from a PDF document is extracted and processed, rather than the entire document. This is particularly useful for large PDF files where only certain sections or pages are relevant to the task at hand.

specificPageProcessingOptions

  • Type: Map

  • Required only if specificPageProcessing is true

  • Description: When specificPageProcessing is enabled, this map defines the range of pages to process through its from and to parameters. The specified pages are extracted from the original PDF into a new document that contains only those pages. This new PDF is treated as a separate file and goes through the usual workflow, whether for storage, analysis, or further manipulation.

  • Example: Suppose you have a 100-page PDF document, but the information you want to extract is on pages 1 to 3. Specifying this range reduces processing time and cost.

{
  "options": {
    "specificPageProcessing": true,
    "specificPageProcessingOptions": {
      "from": 1,
      "to": 3
    }
  }
}
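The same fragment can be built programmatically so that a bad range is caught before a request is sent. The following is a minimal Python sketch, not part of Extracta.ai's tooling: the helper name page_range_options is hypothetical, and the checks assume that pages are numbered from 1 and that the range is inclusive, which the example above suggests but does not state outright.

```python
import json


def page_range_options(first_page: int, last_page: int) -> dict:
    """Build an options fragment that limits processing to a page range.

    Hypothetical helper. Assumes 1-based, inclusive page numbers.
    """
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    if last_page < first_page:
        raise ValueError("last_page must not be smaller than first_page")
    return {
        "options": {
            "specificPageProcessing": True,
            "specificPageProcessingOptions": {
                "from": first_page,
                "to": last_page,
            },
        }
    }


# Reproduces the JSON fragment shown above for pages 1 to 3.
payload_fragment = page_range_options(1, 3)
print(json.dumps(payload_fragment, indent=2))
```

Setting specificPageProcessing and specificPageProcessingOptions together in one helper keeps the two from drifting apart, since the second is required whenever the first is true.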

Using the options Object

To use these options, include the options object in your API request payload and set each property according to your preferences. The example below sets hasTable, handwrittenTextRecognition, and checkboxRecognition explicitly to their default values:

{
  "options": {
    "hasTable": false,
    "handwrittenTextRecognition": false,
    "checkboxRecognition": false
  }
}

Adjust the values according to the needs of your document. For instance, if your document includes tables and handwritten notes, your options object would look like this:

{
  "options": {
    "hasTable": true,
    "handwrittenTextRecognition": true,
    "checkboxRecognition": false
  }
}
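If you assemble requests in code, the recognition flags can be merged over their documented defaults. The sketch below is one possible approach in Python, offered as an illustration only: the names DEFAULT_OPTIONS and build_options are hypothetical, and the type and name checks are local safeguards rather than documented API behavior.

```python
import json

# Defaults as listed in the Options Overview.
DEFAULT_OPTIONS = {
    "hasTable": False,
    "handwrittenTextRecognition": False,
    "checkboxRecognition": False,
}


def build_options(**overrides: bool) -> dict:
    """Merge the given flags over the defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    for name, value in overrides.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean")
    return {"options": {**DEFAULT_OPTIONS, **overrides}}


# A document with tables and handwritten notes, as in the example above.
fragment = build_options(hasTable=True, handwrittenTextRecognition=True)
print(json.dumps(fragment, indent=2))
```

Rejecting unknown names locally catches typos such as hasTables before they reach the API, where a misspelled property might otherwise go unnoticed.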

Conclusion

The options object lets you adapt the extraction process to the characteristics of your documents. Whether you are dealing with tables, handwritten notes, checkboxes, or a limited range of pages, enabling the matching options adds the processing steps those documents need and improves the accuracy and relevance of the extracted data.

Remember to review your documents' needs and set the options accordingly to take full advantage of the customized extraction capabilities offered by Extracta.ai.
