127 Commits

Author SHA1 Message Date
ad95684771 Update --ocr-* args, enable OCR'ing images 2022-01-08 14:24:50 -05:00
b37e5a4ad4 Fix some warnings in media.c 2022-01-08 11:06:14 -05:00
15ae2190cf Fix tesseract lang validation, update README.md, fix tesseract memory leak 2022-01-08 11:04:52 -05:00
255bc2d689 Tweak MIN_OCR_SIZE behavior, update gitignore 2022-01-08 10:33:02 -05:00
cd2a44e016
Update ocr.h
Fix minimum image size validation in ocr_extract_text
2022-01-08 10:24:57 -05:00
Yatao Li
94a5e0ac59 refactor: split ocr_extract_text from ebook 2022-01-07 23:20:35 +08:00
81008d8936 Add --list-file argument 2021-12-29 18:54:13 -05:00
f2fd7ccf41 Fix raw parsing maybe, fix index picker css 2021-12-25 11:08:52 -05:00
08b2ca9d43 Update lcms -> lcms2 2021-11-12 11:29:50 -05:00
61ab68ce15 Update argparse repo URL 2021-11-07 09:42:17 -05:00
a41b5dcc1f Remove libscan git submodule 2021-11-07 09:30:14 -05:00
06f21d5f0f Remove libscan submodule 2021-11-07 09:17:02 -05:00
0887046b41 Fix sidecar files, better error handling in store_write 2021-09-20 20:34:05 -04:00
17fda1e540 Support for rewind buffer 2021-09-11 20:46:40 -04:00
34b363bfd8 Add argument to calculate checksums 2021-09-11 14:31:48 -04:00
7267d4bd2c Add basic JSON/NDJSON support 2021-09-07 08:14:32 -04:00
27560a82bb Basic support for WordPerfect files 2021-09-06 14:08:53 -04:00
f16ead1902 Parse page numbers from .docx files 2021-09-06 09:50:00 -04:00
f4e1d90a6b web UI rewrite, switch to ndjson.zst index format 2021-09-05 09:49:25 -04:00
9c0f3e0e31 Fix .docx segmentation fault 2021-08-16 17:50:54 -04:00
78f3c897e2 libscan version sync 2021-07-10 12:52:24 -04:00
3da2c8cae3 Update CI scripts, Dockerfiles, enable arm64 build again 2021-06-14 14:02:16 -04:00
c6fee7f6e2 update argparse 2021-06-13 09:41:18 -04:00
5b8c13fd13 Handle GPS metadata in the UI 2021-06-11 20:41:05 -04:00
81670ee107 Fix subtitle problems 2021-06-11 10:05:33 -04:00
f8d9b718c0 Fix memory leak in RAW parsing 2021-06-09 08:22:31 -04:00
6f5fdc2935 Fix for segfault in some comic files 2021-06-07 09:01:46 -04:00
a01f6dff1f Use 16-bit ints for meta keys (wip) 2021-06-07 08:40:12 -04:00
fc7f30d670 Add tests for subtitle 2021-05-05 16:10:55 -04:00
71f9dfcfe0 sync libscan 2021-05-05 14:21:01 -04:00
50771bd1dc Read subtitles from media files, fix bug in text_buffer 2021-03-26 19:48:16 -04:00
bc884e137c Change encoding for antiword PDF 2021-01-16 12:17:43 -05:00
ce1e241dea Workaround for UTF8 .doc files 2021-01-16 12:13:56 -05:00
f87eac1f90 Update submodules 2020-12-31 10:26:05 -05:00
576140e542 fix submodules 2020-12-31 10:26:05 -05:00
050c1283a3 Remove UUID dep, fix incremental scan, use MD5(path) as unique id, version bump 2020-12-31 10:26:05 -05:00
c6e1ba03bc Better support for .doc files 2020-12-31 10:26:05 -05:00
51a40c8819 Add .doc support 2020-12-31 10:26:05 -05:00
95bbe39afc Update libmupdf 2020-10-25 09:44:30 -04:00
641a8ec90c sidecar files #114, version bump 2020-10-25 09:44:30 -04:00
12f162d760 Fix #110 2020-10-25 09:44:30 -04:00
f7887f24d1 sync libscan 2020-09-13 16:16:26 -04:00
0cd2523b05 arm64 build 2020-09-13 16:16:26 -04:00
6a8027789a Limited support for UTF16 2020-08-29 10:17:28 -04:00
b1d16d8abf Fix #100 2020-08-29 10:17:28 -04:00
a32c68cba8 Build fixes 2020-08-25 10:38:38 -04:00
d116cf9d91 Default index for web & exec 2020-08-25 10:38:38 -04:00
1c898640cf Fix #88 2020-08-25 10:38:38 -04:00
02ad035b09 Workaround when first ebook page is blank 2020-08-25 10:38:38 -04:00
c11feb213d Gracefully handle archive errors in comic.c 2020-08-25 10:38:38 -04:00