# `PhoenixKitCatalogue.Schemas.PdfExtraction`
[🔗](https://github.com/BeamLabEU/phoenix_kit_catalogue/blob/0.8.0/lib/phoenix_kit_catalogue/schemas/pdf_extraction.ex#L1)

Extraction state for the content of one unique PDF file.

Keyed by `file_uuid` (the primary key): one row per unique
`phoenix_kit_files.uuid`, regardless of how many times that content
was uploaded under different filenames. The worker's state machine
lives here, not on `Pdf`, so two uploads of the same content share
one extraction job and one extracted page set.
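The keying described above can be illustrated with a minimal in-memory sketch. It is not the library's implementation: `ExtractionDedupSketch`, `ensure_extraction/2`, the plain map standing in for the table, and the sample UUID strings are all hypothetical, and the real schema is backed by Ecto and a database.

```elixir
defmodule ExtractionDedupSketch do
  # Hypothetical stand-in for the extraction table: a map keyed by file_uuid.
  # Map.put_new/3 leaves an existing entry untouched, so a second upload of
  # the same content does not create a second extraction row.
  def ensure_extraction(extractions, file_uuid) do
    Map.put_new(extractions, file_uuid, %{
      file_uuid: file_uuid,
      extraction_status: "pending"
    })
  end
end

# Two uploads share the same content (same file_uuid) under different names.
uploads = [
  %{filename: "catalogue.pdf", file_uuid: "uuid-a"},
  %{filename: "catalogue-copy.pdf", file_uuid: "uuid-a"},
  %{filename: "other.pdf", file_uuid: "uuid-b"}
]

extractions =
  Enum.reduce(uploads, %{}, fn upload, acc ->
    ExtractionDedupSketch.ensure_extraction(acc, upload.file_uuid)
  end)

# Three uploads, but only two distinct contents, hence two extraction entries.
IO.inspect(map_size(extractions))
```

Keeping the state on the content-keyed row rather than on each upload is what lets duplicate uploads reuse a single job in this sketch.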

Status flow: `pending → extracting → extracted | scanned_no_text | failed`.
The row is removed by cascade when its file row is hard-deleted.
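The status flow can be read as a small transition table. The sketch below is only an illustration under stated assumptions: `StatusFlowSketch` and `valid_transition?/2` are hypothetical names, and treating `extracted`, `scanned_no_text`, and `failed` as terminal is an assumption (the actual worker may, for example, allow a retry from `failed`).

```elixir
defmodule StatusFlowSketch do
  # Hypothetical helper, not part of PhoenixKitCatalogue.
  # Transitions taken from the documented flow:
  #   pending -> extracting -> extracted | scanned_no_text | failed
  # Assumption: the three final statuses have no outgoing transitions.
  @transitions %{
    "pending" => ["extracting"],
    "extracting" => ["extracted", "scanned_no_text", "failed"],
    "extracted" => [],
    "scanned_no_text" => [],
    "failed" => []
  }

  # Returns true only when `to` is a documented successor of `from`.
  # Unknown statuses have no successors, so they always yield false.
  def valid_transition?(from, to) do
    to in Map.get(@transitions, from, [])
  end
end

IO.inspect(StatusFlowSketch.valid_transition?("pending", "extracting"))
IO.inspect(StatusFlowSketch.valid_transition?("pending", "extracted"))
```

Under these assumptions the first call returns `true` and the second returns `false`, since `pending` must pass through `extracting` first.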

# `t`

```elixir
@type t() :: %PhoenixKitCatalogue.Schemas.PdfExtraction{
  __meta__: term(),
  error_message: term(),
  extracted_at: term(),
  extraction_status: term(),
  file_uuid: term(),
  inserted_at: term(),
  page_count: term(),
  updated_at: term()
}
```

# `changeset`

```elixir
@spec changeset(
  t()
  | %PhoenixKitCatalogue.Schemas.PdfExtraction{
      __meta__: term(),
      error_message: term(),
      extracted_at: term(),
      extraction_status: term(),
      file_uuid: term(),
      inserted_at: term(),
      page_count: term(),
      updated_at: term()
    },
  map()
) :: Ecto.Changeset.t(t())
```

# `status_changeset`

```elixir
@spec status_changeset(t(), map()) :: Ecto.Changeset.t(t())
```

# `statuses`

```elixir
@spec statuses() :: [String.t()]
```

---

*Consult [api-reference.md](api-reference.md) for complete listing*