Fields

The fields parameter defines the data you want to extract with the Extracta.ai API. It lets you specify the exact structure of the information you're interested in, so the extraction is tailored to your needs. There are three field types: string, object, and array. Understanding how they differ and when to use each is key to using the Extracta.ai API effectively.

String Type

The string type is the simplest form and is used to extract text-based information. Each string field requires a key, a description, an example, and a type.

Example:

{
  "key": "name",
  "description": "Name of the person",
  "example": "Alex Smith",
  "type": "string"
}

This defines a single piece of data you wish to extract, such as a name or email address, where the value is expected to be a text string.
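As a minimal sketch, the four required entries of a string field can be assembled with a small helper. The function name `string_field` is hypothetical and not part of the Extracta.ai API; it only builds the dictionary shown in the example above.

```python
def string_field(key: str, description: str, example: str) -> dict:
    """Build a string field definition with the four required entries.

    Hypothetical helper for illustration; not an Extracta.ai function.
    """
    if not key:
        raise ValueError("key must be a non-empty string")
    return {
        "key": key,
        "description": description,
        "example": example,
        "type": "string",
    }


# Reproduces the "name" field from the example above.
name_field = string_field("name", "Name of the person", "Alex Smith")
```

Building fields through one helper keeps the type entry consistent and makes it harder to forget the example value.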

Object Type

The object type is used for structured data that includes multiple related properties. An object field has a key, a description, a type, and a list of properties. Each property is itself a field definition and can be a string, another object, or an array.

Example:

{
  "key": "personal_info",
  "description": "Personal information of the person",
  "type": "object",
  "properties": [
    {
      "key": "name",
      "description": "Name of the person",
      "example": "Alex Smith",
      "type": "string"
    },
    {
      "key": "email",
      "description": "Email of the person",
      "example": "alex.smith@gmail.com",
      "type": "string"
    }
  ]
}

This structure is ideal for extracting grouped data, such as personal information, that contains multiple attributes.
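The grouping described above could be sketched in Python as follows. The helper `object_field` is a hypothetical name introduced for illustration; it simply wraps a list of field definitions in the object structure and serializes the result to JSON for inspection.

```python
import json


def object_field(key: str, description: str, properties: list) -> dict:
    """Build an object field that groups related field definitions.

    Hypothetical helper for illustration; not an Extracta.ai function.
    """
    return {
        "key": key,
        "description": description,
        "type": "object",
        "properties": list(properties),  # copy so the caller's list is not shared
    }


personal_info = object_field(
    "personal_info",
    "Personal information of the person",
    [
        {"key": "name", "description": "Name of the person",
         "example": "Alex Smith", "type": "string"},
        {"key": "email", "description": "Email of the person",
         "example": "alex.smith@gmail.com", "type": "string"},
    ],
)

# Serialize to JSON text to compare against the example above.
personal_info_json = json.dumps(personal_info, indent=2)
```

Because each property is an ordinary field definition, the same helper can nest another object (or an array) inside the properties list.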

Array Type

The array type is used when the data to be extracted is a list of items, which can be either simple string values or structured objects. An array field includes a key, a description, a type, and an items entry that specifies the type of the elements in the array.

Array of Strings Example:

{
  "key": "languages",
  "description": "Languages spoken by the person",
  "type": "array",
  "items": {
    "type": "string",
    "example": "English"
  }
}

This format is used for lists where each item is a text string, like languages or skills.

Array of Objects Example:

{
  "key": "items",
  "description": "The items in the invoice",
  "type": "array",
  "items": {
    "type": "object",
    "properties": [
      {
        "key": "name",
        "description": "The name of the item",
        "example": "Item 1",
        "type": "string"
      },
      {
        "key": "quantity",
        "description": "The quantity of the item",
        "example": "1",
        "type": "string"
      },
      {
        "key": "unit_price",
        "description": "The unit price of the item. Return only the number as a string.",
        "example": "100.00",
        "type": "string"
      }
    ]
  }
}

This structure supports extracting a list of complex items, each with its own set of attributes, such as invoice items.
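The rules for the three field types can be checked before sending a request. The sketch below is an assumption-based validator: the required entries are inferred from the descriptions and examples on this page, the names `validate_field` and `validate_items` are hypothetical, and the API's own validation may differ.

```python
# Required entries per field type, as described on this page (an inference,
# not an official schema).
REQUIRED = {
    "string": {"key", "description", "example", "type"},
    "object": {"key", "description", "type", "properties"},
    "array": {"key", "description", "type", "items"},
}


def validate_items(items: dict, path: str) -> list:
    """Check an array's items entry, which has no key or description."""
    itype = items.get("type")
    if itype == "string":
        return []
    if itype == "object":
        if "properties" not in items:
            return [f"{path}.items: missing ['properties']"]
        errors = []
        for prop in items["properties"]:
            errors.extend(validate_field(prop, f"{path}[]"))
        return errors
    return [f"{path}.items: unsupported type {itype!r}"]


def validate_field(field: dict, path: str = "fields") -> list:
    """Return a list of problems in one field definition (empty if none)."""
    ftype = field.get("type")
    if ftype not in REQUIRED:
        return [f"{path}: unknown type {ftype!r}"]
    errors = []
    missing = REQUIRED[ftype] - set(field)
    if missing:
        errors.append(f"{path}: missing {sorted(missing)}")
    here = f"{path}.{field.get('key', '?')}"
    if ftype == "object":
        for prop in field.get("properties", []):
            errors.extend(validate_field(prop, here))
    elif ftype == "array":
        errors.extend(validate_items(field.get("items", {}), here))
    return errors


languages = {
    "key": "languages",
    "description": "Languages spoken by the person",
    "type": "array",
    "items": {"type": "string", "example": "English"},
}
incomplete = {"key": "name", "type": "string"}  # no description or example

problems_ok = validate_field(languages)
problems_bad = validate_field(incomplete)
```

Here `problems_ok` is empty, while `problems_bad` reports the missing description and example. Collecting messages in a list rather than raising on the first problem lets a caller report every issue in a nested definition at once.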

Conclusion

Understanding the distinction between the string, object, and array types is fundamental when defining the fields parameter for Extracta.ai. By structuring your fields carefully, you can shape the API's output to match the requirements of your application and capture precisely the data you need.
