# Document Types

In Extracta's document classification system, a `Document Type` defines a category of documents you expect to classify, such as invoices, receipts, contracts, or purchase orders. Each document type helps the system determine how to interpret and route the documents you upload.

## 🧩 What's a Document Type?

A `documentType` is a JSON object with the following fields:

<table><thead><tr><th width="179.61328125">Field</th><th width="161.2734375">Type</th><th width="104.8828125">Required</th><th>Description</th></tr></thead><tbody><tr><td><code>name</code></td><td><code>string</code></td><td>✅</td><td>A clear, human-readable label for the document type (e.g., <code>"Invoice"</code>).</td></tr><tr><td><code>description</code></td><td><code>string</code></td><td>✅</td><td>A short explanation of the type of documents this represents.</td></tr><tr><td><code>uniqueWords</code></td><td><code>list&#x3C;string></code></td><td>✅</td><td>A list of key terms or phrases likely to appear in this document type.</td></tr><tr><td><code>extractionId</code></td><td><code>string</code></td><td>❌</td><td>ID of a pre-configured extraction template to auto-extract data.</td></tr></tbody></table>
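The field rules in the table can be checked locally before a definition is sent anywhere. The following is a minimal sketch, assuming only what the table states (three required fields, one optional); `validate_document_type` and its error messages are hypothetical helpers for illustration, not part of Extracta's API.

```python
from typing import Any

# Field names and types come from the table above.
REQUIRED_FIELDS = {"name": str, "description": str, "uniqueWords": list}
OPTIONAL_FIELDS = {"extractionId": str}


def validate_document_type(doc_type: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the object looks valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in doc_type:
            problems.append(f"missing required field: {field}")
        elif not isinstance(doc_type[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in doc_type and not isinstance(doc_type[field], expected):
            problems.append(f"{field} must be of type {expected.__name__}")
    # uniqueWords is documented as a list of strings, so check the items too.
    words = doc_type.get("uniqueWords")
    if isinstance(words, list) and not all(isinstance(w, str) for w in words):
        problems.append("uniqueWords must contain only strings")
    return problems


print(validate_document_type({"name": "Invoice"}))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a definition at once.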

## 📌 Why Unique Words Matter

The `uniqueWords` array is critical to helping our classification model distinguish between document types. It lists **keywords or phrases** commonly found in documents of that type; think of them as clues for classification. For an invoice, the array might look like this:

```json
"uniqueWords": [
    "invoice number",
    "bill to",
    "total amount"
]
```

Terms like these help the classifier recognize invoices from text that typically appears on the page.
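Because these terms are meant to set one document type apart from another, it can be useful to check whether two types list the same term. The sketch below is a local sanity check only; `shared_unique_words` is a hypothetical helper, and how Extracta's classifier treats overlapping terms is not described on this page.

```python
from collections import defaultdict


def shared_unique_words(document_types):
    """Return terms listed by more than one type, mapped to the type names."""
    owners = defaultdict(set)
    for doc_type in document_types:
        for word in doc_type["uniqueWords"]:
            # Normalize case and whitespace so "Total Amount" matches "total amount".
            owners[word.strip().lower()].add(doc_type["name"])
    return {word: sorted(names) for word, names in owners.items() if len(names) > 1}


document_types = [
    {"name": "Invoice", "uniqueWords": ["invoice number", "bill to", "total amount"]},
    {"name": "Receipt", "uniqueWords": ["receipt", "paid", "Total Amount"]},
]

print(shared_unique_words(document_types))
# → {'total amount': ['Invoice', 'Receipt']}
```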

## ⚙️ Linking to Extraction Templates

You can optionally link a `documentType` to an existing **extraction template** using `extractionId`. With a linked template, the workflow looks like this:

1. You upload a batch of documents for classification.
2. Each document is automatically classified (e.g., as an "Invoice").
3. **Data extraction** is automatically triggered for that document using the associated template.

This creates a powerful end-to-end flow: **classification ➝ structured data extraction**.
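The link between a classification result and its template is a lookup from the type's `name` to its `extractionId`. The service performs this step for you; the sketch below only illustrates the mapping on the client side. The function name `extraction_id_for` is hypothetical, and matching on `name` is an assumption made for the example.

```python
from typing import Optional


def extraction_id_for(document_types, classified_name) -> Optional[str]:
    """Return the extractionId linked to a classified type, or None if unlinked."""
    for doc_type in document_types:
        if doc_type["name"] == classified_name:
            # extractionId is optional, so a matching type may still return None.
            return doc_type.get("extractionId")
    return None


document_types = [
    {"name": "Invoice", "extractionId": "invoiceExtractionId"},
    {"name": "Receipt"},
]

print(extraction_id_for(document_types, "Invoice"))  # invoiceExtractionId
print(extraction_id_for(document_types, "Receipt"))  # None
```

A type without an `extractionId` is still classified; it has no template to trigger extraction.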

## 🧾 Example Definition

Here’s how a `documentTypes` array might look:

```json
"documentTypes": [
  {
    "name": "Invoice",
    "description": "Standard commercial invoice from vendors or suppliers.",
    "uniqueWords": ["invoice number", "bill to", "total amount"],
    "extractionId": "invoiceExtractionId"
  },
  {
    "name": "Purchase Order",
    "description": "Internal or external purchase order documents.",
    "uniqueWords": ["PO number", "item description", "quantity ordered"]
  },
  {
    "name": "Receipt",
    "description": "Retail or online transaction receipts.",
    "uniqueWords": ["receipt", "paid", "transaction id"]
  }
]
```
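If you assemble these definitions in code, a small builder can leave out `extractionId` when no template is linked. This is a minimal sketch: `make_document_type` is a hypothetical helper, and the enclosing object with a single `documentTypes` key is an assumption, since the fragment above does not show the rest of the request body.

```python
import json


def make_document_type(name, description, unique_words, extraction_id=None):
    """Build one documentType object, omitting extractionId when it is not set."""
    doc_type = {
        "name": name,
        "description": description,
        "uniqueWords": list(unique_words),
    }
    if extraction_id is not None:
        doc_type["extractionId"] = extraction_id
    return doc_type


# Assumed wrapper: the real request may carry additional fields not shown here.
payload = {
    "documentTypes": [
        make_document_type(
            "Invoice",
            "Standard commercial invoice from vendors or suppliers.",
            ["invoice number", "bill to", "total amount"],
            extraction_id="invoiceExtractionId",
        ),
        make_document_type(
            "Receipt",
            "Retail or online transaction receipts.",
            ["receipt", "paid", "transaction id"],
        ),
    ]
}

print(json.dumps(payload, indent=2))
```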

