0.3

@simonw released this 30 Jun 00:44
· 25 commits to main since this release

First non-alpha release.

  • Breaking change: the order of arguments for s3-ocr index <bucket> <database_file> has been swapped, for consistency with other commands. #9
  • Breaking change: the start command no longer defaults to processing every .pdf file in the bucket. It now accepts a list of keys; pass the --all option to process every PDF file. #10
  • New s3-ocr fetch <bucket> <path> command for fetching the raw OCR JSON data for the specified file. #7
  • New s3-ocr text <bucket> <path> command for outputting just the extracted OCR text for a specified file. #8
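
The changes above can be sketched as a short shell session. This is an illustration under assumptions, not output copied from the tool: the bucket name, key, and database filename are hypothetical, the index argument order is taken to be the one shown in the first bullet, and placing the bucket before the keys in the start command is an assumption based on the other commands. The run helper prints each command instead of executing it unless DRY_RUN=0 is set, so the sketch can be read without AWS credentials.

```shell
#!/bin/sh
# Hypothetical values: replace with your own bucket, key, and database file.
BUCKET="my-bucket"
KEY="path/to/document.pdf"
DB="index.db"

# Print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

demo() {
  # start now needs explicit keys (bucket-first position is an assumption)...
  run s3-ocr start "$BUCKET" "$KEY"
  # ...or --all to process every PDF file in the bucket.
  run s3-ocr start "$BUCKET" --all
  # New in 0.3: raw OCR JSON for one file.
  run s3-ocr fetch "$BUCKET" "$KEY"
  # New in 0.3: just the extracted text for one file.
  run s3-ocr text "$BUCKET" "$KEY"
  # index takes the bucket first, then the database file.
  run s3-ocr index "$BUCKET" "$DB"
}

demo
```

Scripts written for earlier alpha releases that call start with no keys, or that pass the index arguments in the old order, would need updating along these lines.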