Compare commits

...

165 Commits

Author SHA1 Message Date
4e1109c528 Merge pull request #288 from simon987/dev
v2.12.1
2022-04-23 10:30:19 -04:00
f87de89275 Version bump 2022-04-23 10:29:50 -04:00
1205981a11 CURL error handling, fix ES version handling, support for ES8, add --es-insecure-ssl argument 2022-04-23 10:29:31 -04:00
09613eaaf9 import magic database as a blob as last resort to make it work 2022-04-18 12:55:22 -04:00
a74726be55 Merge pull request #285 from simon987/dependabot/npm_and_yarn/sist2-vue/async-2.6.4
Bump async from 2.6.3 to 2.6.4 in /sist2-vue
2022-04-17 13:42:40 -04:00
dependabot[bot]
cb228052d2 Bump async from 2.6.3 to 2.6.4 in /sist2-vue
Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4.
- [Release notes](https://github.com/caolan/async/releases)
- [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md)
- [Commits](https://github.com/caolan/async/compare/v2.6.3...v2.6.4)

---
updated-dependencies:
- dependency-name: async
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-17 17:41:14 +00:00
fe56da95d5 Merge pull request #284 from simon987/dev
v2.12.0
2022-04-17 13:38:42 -04:00
9f2ad58f78 bump version 2022-04-17 12:30:14 -04:00
84d9bf4323 Fix cmake libmobi build maybe 2022-04-17 12:23:45 -04:00
90aa90f3f3 Update antiword 2022-04-17 11:47:33 -04:00
3fad07360c Merge pull request #283 from simon987/dependabot/npm_and_yarn/sist2-vue/minimist-1.2.6
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
2022-04-17 10:12:10 -04:00
dependabot[bot]
00c3a640d0 Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-17 12:53:12 +00:00
730e495bde Enable highlight in document info modal, remove /d/ endpoint 2022-04-16 16:11:17 -04:00
54df1dfcf7 Fix spacebar not working in search bar 2022-04-16 13:51:36 -04:00
a75675ecea Fix thumbnail copy bug, update tests 2022-04-16 11:48:43 -04:00
901035da15 Build libmobi with cmake, update to 0.10 2022-04-15 16:01:40 -04:00
ceb7265639 Fix max_analyzed_offset (again?) 2022-04-15 15:35:39 -04:00
036ed9ea1e Update libmagic cmake things 2022-04-15 15:35:20 -04:00
779303a2f7 Print body response when task id cannot be read 2022-04-14 16:24:56 -04:00
23aee14c07 Fix exec-script & fix memory leak in exec_args_validate 2022-04-14 15:43:24 -04:00
50b9201be3 Merge pull request #279 from simon987/dependabot/npm_and_yarn/sist2-vue/minimist-1.2.6
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
2022-04-05 20:12:03 -04:00
dependabot[bot]
14cfb15661 Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-31 23:28:25 +00:00
125c85d9bb localize tag filter bar 2022-03-18 09:15:07 -04:00
474eb95aff Update antiword 2022-03-17 15:08:55 -04:00
acf7453057 Add test for large msdoc 2022-03-17 15:05:48 -04:00
9a949d2694 Use TRUE rather than 1 2022-03-17 09:13:19 -04:00
dbdc75dcb8 Add filter bar in tag picker 2022-03-17 09:12:43 -04:00
c575fca91d Do not store duration or bitrate when the value is 0 or for images 2022-03-05 21:24:59 -05:00
0bf4244683 Do blank search on page reload when media tab auto-reload is disabled 2022-03-05 20:56:02 -05:00
eea5ce75f3 Fix query args updating outside of the search page 2022-03-05 20:42:13 -05:00
9b81856353 Fix some errors in keyboard handler 2022-03-05 20:33:45 -05:00
a10d6952ba Fix segfault in print_errors() 2022-03-05 20:33:21 -05:00
2b639bd4ac Error handling in get_es_version() 2022-03-05 14:59:37 -05:00
e9f92330fd Cleanup macros 2022-03-05 11:18:07 -05:00
cb37a6e6c1 Fix thumbnail bug in serve 2022-03-05 11:18:07 -05:00
b82c26f0fb Add mt_ int_ prefixes in InfoTable 2022-03-05 11:18:06 -05:00
16a4fb4874 Rework document IDs 2022-03-05 11:18:06 -05:00
cdc4c0ad3d Cap maximum thumbnail count to 1000 2022-03-05 11:18:06 -05:00
d034851ecb Setup keyboard shortcuts for Lightbox, add option to disable animations 2022-03-05 11:18:06 -05:00
ea7dfe7c84 Update to mongoose 7.6 2022-03-05 11:18:05 -05:00
8bfd010f4b Update dev ES docker script 2022-03-05 11:18:05 -05:00
499eb2b2e4 Un-break raw file thumbnails 2022-03-05 11:18:05 -05:00
25ab883063 Merge pull request #263 from simon987/dependabot/npm_and_yarn/sist2-vue/url-parse-1.5.10
Bump url-parse from 1.5.4 to 1.5.10 in /sist2-vue
2022-02-28 09:26:15 -05:00
dependabot[bot]
6ab606203f Bump url-parse from 1.5.4 to 1.5.10 in /sist2-vue
Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.5.4 to 1.5.10.
- [Release notes](https://github.com/unshiftio/url-parse/releases)
- [Commits](https://github.com/unshiftio/url-parse/compare/1.5.4...1.5.10)

---
updated-dependencies:
- dependency-name: url-parse
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-28 04:23:32 +00:00
6ec98046fa Merge pull request #262 from yatli/fix_261
fix #261: inherit index id from base index when using incremental scan
2022-02-26 11:37:16 -05:00
Yatao Li
4fac81ca6a fix #261: new index ids generated for incremental scan 2022-02-27 00:25:23 +08:00
2882741926 Fix multiple content metadata bug (but without compilation error this time) 2022-02-20 10:52:22 -05:00
edba9b7917 Fix multiple content metadata bug 2022-02-20 10:43:34 -05:00
e89964d592 Fix antiword build 2022-02-20 09:37:24 -05:00
329afcbe4f Update docs & UI stuff 2022-02-20 09:13:19 -05:00
2a2664a5cd Disable debug in docker image oops 2022-02-20 09:01:17 -05:00
0d18637e88 Merge pull request #257 from simon987/dev
v2.11.7
2022-02-20 08:34:26 -05:00
8ad9fc9e32 Fix caption path 2022-02-19 14:11:40 -05:00
f075b542fe Tweak mem-throttle option 2022-02-19 14:05:50 -05:00
3d4331b27d Add thumbnail-count option 2022-02-19 13:45:31 -05:00
a0db49e7d8 Add file page endpoint 2022-02-19 13:43:44 -05:00
065146ff8a Docker fixes 2022-02-19 13:43:44 -05:00
d58fcbc788 Merge pull request #246 from yatli/mem_cap_dev
scan memory threshold
2022-02-13 13:26:18 -05:00
b483447b1c Merge pull request #251 from yatli/example_systemd
add systemd integration example
2022-02-13 11:56:16 -05:00
Yatao Li
0d68d5fc7f use --index-incremental 2022-02-14 00:47:01 +08:00
Yatao Li
1813bf505c add systemd integration example 2022-02-13 19:05:13 +08:00
Yatao Li
9a6e7c7c47 reset throttle timer for each work item 2022-02-13 18:43:25 +08:00
Yatao Li
68252b4e80 query page size on tpool creation 2022-02-13 18:43:25 +08:00
Yatao Li
d1f13f2c84 stop scanning gracefully if memory limit target cannot be met 2022-02-13 18:43:25 +08:00
Yatao Li
6075c21a3a do not throttle writer/index thread pools 2022-02-13 18:43:23 +08:00
Yatao Li
f3674ffa02 stop threadpool when the memory limit is too low for any worker thread to proceed 2022-02-13 18:42:54 +08:00
Yatao Li
de187eff1c minor fix 2022-02-13 18:42:54 +08:00
Yatao Li
8e96174e1f scan memory threshold 2022-02-13 18:42:54 +08:00
8fa34da02f Fix some memory leaks, fix tests, fix --print regression 2022-02-11 11:09:29 -05:00
37919932de Merge pull request #238 from yatli/dev
incremental scan: build delete index. only load from main & original; incremental indexing;
2022-02-11 10:13:26 -05:00
8ab8124370 CSS tweaks 2022-02-10 21:25:16 -05:00
bfd080943d Disable automatic mime map update by default 2022-02-10 21:18:54 -05:00
c6820b6cc6 Fix CSS border & checkbox bug of index picker 2022-02-10 21:17:32 -05:00
3c09c45694 Merge pull request #249 from yatli/index_script_fix
do not log arg script if null
2022-02-07 15:44:16 -05:00
Yatao Li
bb5c17ec78 do not log arg script if null 2022-02-08 04:24:56 +08:00
Yatao Li
501064da10 parse: fix full scan regression 2022-01-25 19:03:25 +08:00
Yatao Li
8f7edf3190 incremental_delete: read from index file so that we have parent info 2022-01-25 19:03:25 +08:00
Yatao Li
e65905a165 only add new entries into new_table to save memory 2022-01-25 19:03:25 +08:00
Yatao Li
2cb57f3634 index: bulk delete 2022-01-25 19:03:25 +08:00
Yatao Li
679e12f786 unify READ_INDICES to reduce clutter 2022-01-25 19:03:25 +08:00
Yatao Li
291d307689 index: incremental indexing, add stub for index entries removal 2022-01-25 19:03:25 +08:00
Yatao Li
7d40b9e959 incremental scan: build delete index. only load from main & original. 2022-01-25 19:03:25 +08:00
cf56bdfb74 Add configuration option to use a date picker instead of date slider 2022-01-22 14:41:01 -05:00
b799a2e976 Fix for infinite reload in mime picker when automatic update is enabled 2022-01-22 13:30:48 -05:00
727b57b78a Fix dependabot issue I think 2022-01-22 13:21:34 -05:00
61cb845a0e Hotfix to patch segmentation fault when specifying a very long script 2022-01-22 13:17:47 -05:00
dad14fb66d Replace "File not found" messages with LOG_FATAL calls 2022-01-22 12:56:03 -05:00
c98a09d264 Version bump 2022-01-22 12:55:41 -05:00
b978132ee0 Update readme 2022-01-09 10:20:49 -05:00
4dedd281f1 Push compiled vue changes 2022-01-09 09:30:31 -05:00
65c499e477 Merge pull request #231 from simon987/dev
v2.11.6
2022-01-09 09:28:24 -05:00
625f3d0d6e Option to update media type tab in real time, add media type table in details 2022-01-08 18:23:22 -05:00
64b8aab8bf Validate that all the tesseract data files are in the same folder 2022-01-08 15:04:07 -05:00
ad95684771 Update --ocr-* args, enable OCR'ing images 2022-01-08 14:24:50 -05:00
b37e5a4ad4 Fix some warnings in media.c 2022-01-08 11:06:14 -05:00
15ae2190cf Fix tesseract lang validation, update README.md, fix tesseract memory leak 2022-01-08 11:04:52 -05:00
255bc2d689 Tweak MIN_OCR_SIZE behavior, update gitignore 2022-01-08 10:33:02 -05:00
fe1aa6dd4c Merge pull request #227 from yatli/dev
refactor: split ocr_extract_text from ebook
2022-01-08 10:25:41 -05:00
cd2a44e016 Update ocr.h
Fix minimum image size validation in ocr_extract_text
2022-01-08 10:24:57 -05:00
ed2a3f342a Localize tag add/delete, fix some translations, add LanguageIcon, add --lang arg, fix lightbox slideshow time, fix gif hover 2022-01-08 10:03:38 -05:00
1107fe9a53 Remove libscan hash debug info 2022-01-08 10:00:34 -05:00
a96e65d039 Add zh-CN option in language dropdown 2022-01-07 17:44:49 -05:00
87936eecd4 Merge pull request #229 from yatli/master
add zh-CN translation
2022-01-07 13:55:14 -05:00
Yatao Li
d817a0e9dd add zh-CN translation 2022-01-08 01:39:50 +08:00
Yatao Li
94a5e0ac59 refactor: split ocr_extract_text from ebook 2022-01-07 23:20:35 +08:00
d40f5052f9 static link for libasan in debug build 2021-12-29 19:25:03 -05:00
ee9a8fa514 Add thread lock for incremental_mark_file_for_copy() 2021-12-29 19:18:10 -05:00
81008d8936 Add --list-file argument 2021-12-29 18:54:13 -05:00
52466d5d8a Update tesseract datapaths 2021-12-25 11:12:00 -05:00
5f73fc024b Version bump, update readme 2021-12-25 11:08:52 -05:00
f2fd7ccf41 Fix raw parsing maybe, fix index picker css 2021-12-25 11:08:52 -05:00
d87fee8e00 Merge pull request #214 from dpieski/patch-2
Update USAGE.md
2021-12-22 09:55:24 -05:00
Andrew
672d1344d7 Update USAGE.md
Get-WmiObject is deprecated in favor of Get-CimInstance
2021-12-15 15:00:36 -06:00
27e32db1ed Fix attempt for excludes 2021-11-17 20:18:48 -05:00
bb91139ffb console log fixes, version bump 2021-11-15 20:52:24 -05:00
70cfa8c37c Fix Dockerfile.arm64 2021-11-13 18:25:24 -05:00
7493dedc8c Merge pull request #208 from simon987/dev
v2.11.4
2021-11-13 17:37:47 -05:00
c786a31bb2 Merge remote-tracking branch 'origin/master' into dev
# Conflicts:
#	README.md
2021-11-13 17:36:55 -05:00
48d024e751 Update dockerfiles 2021-11-13 17:36:30 -05:00
08b2ca9d43 Update lcms -> lcms2 2021-11-12 11:29:50 -05:00
ed8b4f4fad Add natural sorting support 2021-11-12 10:33:51 -05:00
66de93a8bd Language & formatting 2021-11-12 10:17:32 -05:00
e3f78fb693 Shift click & select all/none in index picker 2021-11-12 10:12:25 -05:00
030643cee0 Move CI scripts to script folder 2021-11-12 09:05:37 -05:00
b17b9439df Print progress bar in index module 2021-11-07 13:20:05 -05:00
414f65346c Update docker command in README.md 2021-11-07 13:18:32 -05:00
be8eedc9c7 Skip subtree of excluded directories 2021-11-07 11:56:09 -05:00
5b62fe77f2 Update demo URL 2021-11-07 09:52:28 -05:00
61ab68ce15 Update argparse repo URL 2021-11-07 09:42:17 -05:00
82ecb8bb85 Update gitignore 2021-11-07 09:36:39 -05:00
a41b5dcc1f Remove libscan git submodule 2021-11-07 09:30:14 -05:00
06f21d5f0f Remove libscan submodule 2021-11-07 09:17:02 -05:00
e82a388d1e Don't show resolution badge on narrow images 2021-10-22 10:21:35 -04:00
bf02e571b3 Forgot to add that file two commits ago 2021-10-22 09:44:56 -04:00
750a392a61 Show reduced ResuldCard when there are no results 2021-10-22 09:32:17 -04:00
3d7b977a82 Read ES version, handle legacy versions, add notice & debug info 2021-10-21 19:14:43 -04:00
cd71551a22 Some documentation updates 2021-09-25 09:30:53 -04:00
58741058cf Merge pull request #200 from simon987/dev
v2.11.3
2021-09-24 20:56:00 -04:00
0a7e59b646 Some documentation updates 2021-09-24 20:55:08 -04:00
43a566fe2f Version bump 2021-09-24 20:33:19 -04:00
b2631a86c8 Rework index picker 2021-09-24 20:31:11 -04:00
d0a1deca30 Fix thumbnail in DocInfoModal.vue 2021-09-24 19:40:06 -04:00
b03ce90a05 Fix max_analyzed_offset error 2021-09-20 21:01:23 -04:00
a5eacb4950 Set list item color for sub-documents 2021-09-20 20:40:48 -04:00
0887046b41 Fix sidecar files, better error handling in store_write 2021-09-20 20:34:05 -04:00
17fda1e540 Support for rewind buffer 2021-09-11 20:46:40 -04:00
34b363bfd8 Add argument to calculate checksums 2021-09-11 14:31:48 -04:00
c9aa4bed72 Add argument to calculate checksums 2021-09-11 14:31:31 -04:00
7267d4bd2c Add basic JSON/NDJSON support 2021-09-07 08:14:32 -04:00
43470e9ce6 Add basic JSON/NDJSON support 2021-09-06 21:27:17 -04:00
0331d46fff Merge pull request #186 from simon987/dev
v2.11.2
2021-09-06 14:14:51 -04:00
bbf1aca936 Version bump 2021-09-06 14:14:00 -04:00
27560a82bb Basic support for WordPerfect files 2021-09-06 14:08:53 -04:00
f16ead1902 Parse page numbers from .docx files 2021-09-06 09:50:00 -04:00
e2e07e80c7 Install libasan5 in Dockerfile 2021-09-06 09:25:01 -04:00
9499c6b189 Add v prefix in version badge 2021-09-06 09:18:28 -04:00
c5cd00b76c Update USAGE.md 2021-09-05 20:26:09 -04:00
ec5f07cab8 Merge pull request #184 from simon987/dev
v2.11.1
2021-09-05 20:06:18 -04:00
f098f7916a Version bump 2021-09-05 20:05:46 -04:00
85d67a9393 null checks in sig_handler 2021-09-05 20:03:42 -04:00
c5ac89813f Fix UI bug when losing focus of tags/mime tree 2021-09-05 19:59:01 -04:00
ec5642a3df Fix docker build for arm64 2021-09-05 13:41:08 -04:00
c1de74e7eb Fix build_arm64.sh (again) 2021-09-05 12:58:49 -04:00
f31f138f2e Set default tagline when none is specified 2021-09-05 12:53:52 -04:00
6a48b219e6 Fix build_arm64.sh & update README 2021-09-05 12:19:44 -04:00
144 changed files with 12161 additions and 3212 deletions

View File

@@ -27,4 +27,5 @@ sist2
**/ext_libmobi
**/ext_libwpd
**/core
*.a
*.a
tmp_scan/

View File

@@ -10,22 +10,7 @@ steps:
- name: build
image: simon987/sist2-build
commands:
- ./ci/build.sh
- name: docker
image: plugins/docker
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: simon987/sist2
context: ./
dockerfile: ./Dockerfile
auto_tag: true
auto_tag_suffix: x64-linux
when:
event:
- tag
- ./scripts/build.sh
- name: scp files
image: appleboy/drone-scp
settings:
@@ -42,6 +27,21 @@ steps:
- ./VERSION
- ./sist2-x64-linux
- ./sist2-x64-linux-debug
- name: docker
image: plugins/docker
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: simon987/sist2
context: ./
dockerfile: ./Dockerfile
auto_tag: true
auto_tag_suffix: x64-linux
when:
event:
- tag
---
kind: pipeline
@@ -55,7 +55,7 @@ steps:
- name: build
image: simon987/sist2-build-arm64
commands:
- ./ci/build_arm64.sh
- ./scripts/build_arm64.sh
- name: scp files
image: appleboy/drone-scp
settings:
@@ -80,7 +80,7 @@ steps:
from_secret: DOCKER_PASSWORD
repo: simon987/sist2
context: ./
dockerfile: ./Dockerfile
dockerfile: ./Dockerfile.arm64
auto_tag: true
auto_tag_suffix: arm64-linux
when:

8
.gitignore vendored
View File

@@ -10,17 +10,19 @@ Makefile
LOG
sist2*
!sist2-vue/
index.sist2/
*.sist2/
bundle*.css
bundle.js
*.a
vgcore.*
build/
third-party/
third-party/argparse
*.idx/
VERSION
git_hash.h
Testing/
test_i
test_i_inc
node_modules/
node_modules/
.cmake/
i_inc/

14
.gitmodules vendored
View File

@@ -1,6 +1,12 @@
[submodule "third-party/libscan"]
path = third-party/libscan
url = https://github.com/simon987/libscan
[submodule "third-party/argparse"]
path = third-party/argparse
url = https://github.com/cofyc/argparse
url = https://github.com/simon987/argparse
[submodule "third-party/libscan/third-party/utf8.h"]
path = third-party/libscan/third-party/utf8.h
url = https://github.com/sheredom/utf8.h
[submodule "third-party/libscan/third-party/antiword"]
path = third-party/libscan/third-party/antiword
url = https://github.com/simon987/antiword
[submodule "third-party/libscan/third-party/libmobi"]
path = third-party/libscan/third-party/libmobi
url = https://github.com/bfabiszewski/libmobi

View File

@@ -4,6 +4,7 @@ set(CMAKE_C_STANDARD 11)
project(sist2 C)
option(SIST_DEBUG "Build a debug executable" on)
option(SIST_FAST "Enable more optimisation flags" off)
option(SIST_FAKE_STORE "Disable IO operations of LMDB stores for debugging purposes" 0)
add_compile_definitions(
@@ -21,10 +22,6 @@ set(ARGPARSE_SHARED off)
add_subdirectory(third-party/argparse)
add_executable(sist2
# argparse
third-party/argparse/argparse.h third-party/argparse/argparse.c
src/main.c
src/sist.h
src/io/walk.h src/io/walk.c
@@ -41,7 +38,11 @@ add_executable(sist2
src/log.c src/log.h
src/cli.c src/cli.h
src/stats.c src/stats.h src/ctx.c
src/parsing/sidecar.c src/parsing/sidecar.h)
src/parsing/sidecar.c src/parsing/sidecar.h
# argparse
third-party/argparse/argparse.h third-party/argparse/argparse.c
)
target_link_directories(sist2 PRIVATE BEFORE ${_VCPKG_INSTALLED_DIR}/${VCPKG_TARGET_TRIPLET}/lib/)
set(CMAKE_FIND_LIBRARY_SUFFIXES .a .lib)
@@ -54,6 +55,10 @@ find_package(lmdb CONFIG REQUIRED)
find_package(cJSON CONFIG REQUIRED)
find_package(unofficial-mongoose CONFIG REQUIRED)
find_package(CURL CONFIG REQUIRED)
find_library(MAGIC_LIB
NAMES libmagic.so.1 magic
PATHS /usr/lib/x86_64-linux-gnu/ /usr/lib/aarch64-linux-gnu/
)
target_include_directories(
@@ -86,16 +91,29 @@ if (SIST_DEBUG)
sist2
PRIVATE
-fsanitize=address
-static-libasan
)
set_target_properties(
sist2
PROPERTIES
OUTPUT_NAME sist2_debug
)
elseif (SIST_FAST)
target_compile_options(
sist2
PRIVATE
-Ofast
-march=native
-fno-stack-protector
-fomit-frame-pointer
-freciprocal-math
)
else ()
target_compile_options(
sist2
PRIVATE
-Ofast
-fno-stack-protector
-fomit-frame-pointer
@@ -120,11 +138,12 @@ target_link_libraries(
CURL::libcurl
pthread
magic
c
scan
${MAGIC_LIB}
)
add_custom_target(

View File

@@ -5,13 +5,11 @@ WORKDIR /build/
COPY . .
RUN cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE=/vcpkg/scripts/buildsystems/vcpkg.cmake .
RUN make -j$(nproc)
RUN strip sist2
RUN ls -lh
RUN ls -lh sist2-vue/dist/
RUN strip sist2 || mv sist2_debug sist2
FROM ubuntu:20.10
FROM --platform="linux/amd64" ubuntu:21.10
RUN apt update && apt install -y curl
RUN apt update && apt install -y curl libasan5 libmagic1 && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
@@ -22,9 +20,9 @@ RUN mkdir -p /usr/share/tessdata && \
curl -o /usr/share/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata
COPY --from=build /build/sist2 /root/sist2
ENTRYPOINT ["/root/sist2"]
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENTRYPOINT ["/root/sist2"]
COPY --from=build /build/sist2 /root/sist2

View File

@@ -7,9 +7,9 @@ RUN cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE
RUN make -j$(nproc)
RUN strip sist2
FROM ubuntu:20.10
FROM --platform="linux/arm64/v8" ubuntu:21.10
RUN apt update && apt install -y curl
RUN apt update && apt install -y curl libasan5 && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
@@ -20,9 +20,9 @@ RUN mkdir -p /usr/share/tessdata && \
curl -o /usr/share/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata
COPY --from=build /build/sist2 /root/sist2
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENTRYPOINT ["/root/sist2"]
ENTRYPOINT ["/root/sist2"]
COPY --from=build /build/sist2 /root/sist2

View File

@@ -2,7 +2,7 @@
[![CodeFactor](https://www.codefactor.io/repository/github/simon987/sist2/badge?s=05daa325188aac4eae32c786f3d9cf4e0593f822)](https://www.codefactor.io/repository/github/simon987/sist2)
[![Development snapshots](https://ci.simon987.net/api/badges/simon987/sist2/status.svg)](https://files.simon987.net/.gate/sist2/simon987_sist2/)
**Demo**: [sist2.simon987.net](https://sist2.simon987.net/?i=Demo%20files)
**Demo**: [sist2.simon987.net](https://sist2.simon987.net/)
# sist2
@@ -10,7 +10,7 @@ sist2 (Simple incremental search tool)
*Warning: sist2 is in early development*
![sist2.png](docs/sist2.png)
![search panel](docs/sist2.png)
## Features
@@ -33,12 +33,11 @@ sist2 (Simple incremental search tool)
## Getting Started
1. Have an Elasticsearch (>= 6.X.X) instance running
1. Have an Elasticsearch (>= 6.8.X, ideally >=7.14.0) instance running
1. Download [from official website](https://www.elastic.co/downloads/elasticsearch)
1. *(or)* Run using docker:
```bash
docker run -d --name es1 --net sist2_net -p 9200:9200 \
-e "discovery.type=single-node" elasticsearch:7.14.0
docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.14.0
```
1. *(or)* Run using docker-compose:
```yaml
@@ -49,9 +48,11 @@ sist2 (Simple incremental search tool)
- "ES_JAVA_OPTS=-Xms1G -Xmx2G"
```
1. Download sist2 executable
1. Download the [latest sist2 release](https://github.com/simon987/sist2/releases) *
1. *(or)* Download a [development snapshot](https://files.simon987.net/.gate/sist2/simon987_sist2/) *(Not recommended!)*
1. *(or)* `docker pull simon987/sist2:2.10.3-x64-linux`
1. Download the [latest sist2 release](https://github.com/simon987/sist2/releases).
Select the file corresponding to your CPU architecture and mark the binary as executable with `chmod +x` *
2. *(or)* Download a [development snapshot](https://files.simon987.net/.gate/sist2/simon987_sist2/) *(Not
recommended!)*
3. *(or)* `docker pull simon987/sist2:2.12.1-x64-linux`
1. See [Usage guide](docs/USAGE.md)
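
As a rough illustration of the version requirement in step 1, the sketch below compares a version string against a minimum. The helper name `version_at_least` is hypothetical and not part of sist2, and the sketch assumes GNU `sort` with version sorting (`-V`) is available.

```bash
#!/bin/bash
# Hypothetical helper (not part of sist2): succeed when CURRENT >= MINIMUM.
# Assumes GNU sort, whose -V flag orders dotted version strings numerically.
version_at_least() {
    local current="$1" minimum="$2"
    local lowest
    # The smaller of the two versions sorts first; if that is MINIMUM,
    # then CURRENT is equal to it or newer.
    lowest="$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
}

# Example: check a version string against the 6.8 floor mentioned above.
if version_at_least "7.14.0" "6.8.0"; then
    echo "Elasticsearch version is recent enough"
fi
```

Assuming an instance is listening on `localhost:9200` and `jq` is installed, the running version could be read with `curl -s http://localhost:9200 | jq -r .version.number` and passed as the first argument.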
@@ -67,21 +68,23 @@ See [Usage guide](docs/USAGE.md) for more details
## Format support
File type | Library | Content | Thumbnail | Metadata
:---|:---|:---|:---|:---
pdf,xps,fb2,epub | MuPDF | text+ocr | yes | author, title |
cbz,cbr | *(none)* | - | yes | - |
`audio/*` | ffmpeg | - | yes | ID3 tags |
`video/*` | ffmpeg | - | yes | title, comment, artist |
`image/*` | ffmpeg | - | yes | [Common EXIF tags](https://github.com/simon987/sist2/blob/efdde2734eca9b14a54f84568863b7ffd59bdba3/src/parsing/media.c#L190), GPS tags |
raw, rw2, dng, cr2, crw, dcr, k25, kdc, mrw, pef, xf3, arw, sr2, srf, erf | LibRaw | - | yes | Common EXIF tags, GPS tags |
ttf,ttc,cff,woff,fnt,otf | Freetype2 | - | yes, `bmp` | Name & style |
`text/plain` | *(none)* | yes | no | - |
html, xml | *(none)* | yes | no | - |
tar, zip, rar, 7z, ar ... | Libarchive | yes\* | - | no |
docx, xlsx, pptx | *(none)* | yes | if embedded | creator, modified_by, title |
doc (MS Word 97-2003) | antiword | yes | yes | author, title |
mobi, azw, azw3 | libmobi | yes | no | author, title |
| File type | Library | Content | Thumbnail | Metadata |
|:--------------------------------------------------------------------------|:-----------------------------------------------------------------------------|:---------|:------------|:---------------------------------------------------------------------------------------------------------------------------------------|
| pdf,xps,fb2,epub | MuPDF | text+ocr | yes | author, title |
| cbz,cbr | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | - | yes | - |
| `audio/*` | ffmpeg | - | yes | ID3 tags |
| `video/*` | ffmpeg | - | yes | title, comment, artist |
| `image/*` | ffmpeg | ocr | yes | [Common EXIF tags](https://github.com/simon987/sist2/blob/efdde2734eca9b14a54f84568863b7ffd59bdba3/src/parsing/media.c#L190), GPS tags |
| raw, rw2, dng, cr2, crw, dcr, k25, kdc, mrw, pef, xf3, arw, sr2, srf, erf | LibRaw | no | yes | Common EXIF tags, GPS tags |
| ttf,ttc,cff,woff,fnt,otf | Freetype2 | - | yes, `bmp` | Name & style |
| `text/plain` | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | no | - |
| html, xml | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | no | - |
| tar, zip, rar, 7z, ar ... | Libarchive | yes\* | - | no |
| docx, xlsx, pptx | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | if embedded | creator, modified_by, title |
| doc (MS Word 97-2003) | antiword | yes | yes | author, title |
| mobi, azw, azw3 | libmobi | yes | no | author, title |
| wpd (WordPerfect) | libwpd | yes | no | *planned* |
| json, jsonl, ndjson | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | - | - |
\* *See [Archive files](#archive-files)*
@@ -100,18 +103,24 @@ scan is also supported.
### OCR
You can enable OCR support for pdf,xps,fb2,epub file types with the
`--ocr <lang>` option. Download the language data files with your package manager (`apt install tesseract-ocr-eng`) or
You can enable OCR support for ebook (pdf,xps,fb2,epub) or image file types with the
`--ocr-lang <lang>` option in combination with `--ocr-images` and/or `--ocr-ebooks`.
Download the language data files with your package manager (`apt install tesseract-ocr-eng`) or
directly [from Github](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files).
The `simon987/sist2` image comes with common languages
(hin, jpn, eng, fra, rus, spa) pre-installed.
Examples
You can use the `+` separator to specify multiple languages. The language
name must be identical to the `*.traineddata` file installed on your system
(use `chi_sim` rather than `chi-sim`).
Examples:
```bash
sist2 scan --ocr jpn ~/Books/Manga/
sist2 scan --ocr eng ~/Books/Textbooks/
sist2 scan --ocr-ebooks --ocr-lang jpn ~/Books/Manga/
sist2 scan --ocr-images --ocr-lang eng ~/Images/Screenshots/
sist2 scan --ocr-ebooks --ocr-images --ocr-lang eng+chi_sim ~/Chinese-Bilingual/
```
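
Because the language names must match the installed `*.traineddata` files, it can help to check a `+`-separated value before starting a long scan. The following is a minimal sketch, not something sist2 provides: `check_ocr_langs` is a hypothetical helper, and the default directory `/usr/share/tessdata` is an assumption that may differ on your system.

```bash
#!/bin/bash
# Hypothetical helper (not part of sist2): verify that every language in a
# "+"-separated --ocr-lang value has a matching <lang>.traineddata file.
# The default tessdata directory is an assumption; pass yours as $2.
check_ocr_langs() {
    local spec="$1"
    local tessdata="${2:-/usr/share/tessdata}"
    local missing=0
    local lang
    local -a langs
    # Split "eng+chi_sim" into the array (eng chi_sim).
    IFS='+' read -r -a langs <<< "$spec"
    for lang in "${langs[@]}"; do
        if [ ! -f "$tessdata/$lang.traineddata" ]; then
            echo "missing: $tessdata/$lang.traineddata" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_ocr_langs eng+chi_sim && sist2 scan --ocr-images --ocr-lang eng+chi_sim ~/Images/` would only start the scan when both data files are present in the assumed directory.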
## Build from source
@@ -124,7 +133,7 @@ You can compile **sist2** by yourself if you don't want to use the pre-compiled
git clone --recursive https://github.com/simon987/sist2/
cd sist2
docker build . -f ./Dockerfile -t my-sist2-image
docker run --rm my-sist2-image cat /root/sist2 > sist2-x64-linux
docker run --rm --entrypoint cat my-sist2-image /root/sist2 > sist2-x64-linux
```
### On a linux computer
@@ -134,14 +143,14 @@ docker run --rm my-sist2-image cat /root/sist2 > sist2-x64-linux
```bash
apt install gcc g++ python3 yasm ragel automake autotools-dev wget libtool libssl-dev curl zip unzip tar xorg-dev libglu1-mesa-dev libxcursor-dev libxml2-dev libxinerama-dev gettext nasm git
```
1. Apply vcpkg patches, as per [sist2-build](https://github.com/simon987/sist2-build) Dockerfile
1. Install vcpkg dependencies
```bash
vcpkg install curl[core,openssl]
vcpkg install lmdb cjson glib brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf gtest mongoose libuuid libmagic libraw jasper lcms gumbo
vcpkg install lmdb cjson glib brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf gtest mongoose libmagic libraw jasper lcms gumbo
```
1. Build

7
contrib/systemd/Makefile Normal file
View File

@@ -0,0 +1,7 @@
install:
install sist2-update-all.sh /usr/bin/sist2-update-all.sh
install sist2-update-files.sh /usr/bin/sist2-update-files.sh
install sist2-update-nextcloud.sh /usr/bin/sist2-update-nextcloud.sh
install sist2-update.service /etc/systemd/system/sist2-update.service
install sist2-update.timer /etc/systemd/system/sist2-update.timer
systemctl daemon-reload

31
contrib/systemd/README.md Normal file
View File

@@ -0,0 +1,31 @@
# Systemd integration example
This example contains my (yatli) personal configuration for sist2 auto-updating.
The following indices are involved in this configuration:
| Index | Path | Description |
|-----------|------------------|--------------------------------------------|
| files | /zpool/files | Main file repository |
| nextcloud | /zpool/nextcloud | Externally synchronized to a cloud account |
The systemd integration runs sist2 scanning and indexing automatically every day at 3:00 AM.
### Tailoring the configuration for yourself
`sist2-update-all.sh` calls the update script for each sist2 index. Add or remove
update scripts to suit your needs. Each update script (e.g.
`sist2-update-files.sh`) defines its key parameters at the top, so
make sure to edit them to point to your files and index locations.
### Installation
```bash
# install the services and scripts
sudo make install
# enable & start the timer
sudo systemctl enable sist2-update.timer
sudo systemctl start sist2-update.timer
# verify that the timer has been enabled
systemctl list-timers --all
```
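
The update scripts in this example swap the freshly scanned index into place with `rm -rf` followed by `mv`. If you adapt them, a slightly more defensive swap might look like the sketch below. `rotate_index` is a hypothetical helper, not part of sist2 or of the scripts here; it only refuses to delete the old index when the new one is missing.

```bash
#!/bin/bash
# Hypothetical helper (not part of sist2): replace the old index directory
# with the new one, but only if the new index actually exists.
rotate_index() {
    local orig="$1" new="$2"
    if [ ! -d "$new" ]; then
        echo "new index '$new' not found, keeping '$orig'" >&2
        return 1
    fi
    # Same two steps as the example scripts, with quoting added so that
    # paths containing spaces are handled correctly.
    rm -rf -- "$orig"
    mv -- "$new" "$orig"
}
```

In an update script this would be called as `rotate_index "$ORIG" "$NEW"` after the scan step and before `sist2 index`.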

View File

@@ -0,0 +1,9 @@
#!/bin/bash
set -e
__dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "Update index: Files"
source ${__dir}/sist2-update-files.sh
echo "Update index: Nextcloud"
source ${__dir}/sist2-update-nextcloud.sh
echo "Done. Restarting sist2."
docker restart sist2-sist2-1

View File

@@ -0,0 +1,34 @@
#!/bin/bash
set -e
DATE=$(date +%Y_%m_%d)
CONTENT=/zpool/files
ORIG=/mnt/ssd/sist-index/files.idx
NEW=/mnt/ssd/sist-index/files_$DATE.idx
EXCLUDE='ZArchives|TorrentStore|TorrentDownload|624f0c59-1fef-44f6-95e9-7483296f2833|ubuntu-full-2021-12-07'
NAME=Files
#REWRITE_URL="http://localhost:33333/activate?collection=$NAME&path="
REWRITE_URL=""
sist2 scan \
--threads 14 \
--mem-throttle 32768 \
--quality 1.0 \
--name $NAME \
--ocr-lang=eng+chi_sim \
--ocr-ebooks \
--ocr-images \
--exclude=$EXCLUDE \
--rewrite-url=$REWRITE_URL \
--incremental=$ORIG \
--output=$NEW \
$CONTENT
echo ">>> Scan complete"
rm -rf $ORIG
mv $NEW $ORIG
unset http_proxy
unset https_proxy
unset HTTP_PROXY
unset HTTPS_PROXY
sist2 index $ORIG --incremental-index
echo ">>> Index complete"

View File

@@ -0,0 +1,33 @@
#!/bin/bash
set -e
DATE=$(date +%Y_%m_%d)
CONTENT=/zpool/nextcloud/v-yadli
ORIG=/mnt/ssd/sist-index/nextcloud.idx
NEW=/mnt/ssd/sist-index/nextcloud_$DATE.idx
EXCLUDE='Yatao|.*263418493\\/Image\\/.*'
NAME=NextCloud
# REWRITE_URL="http://localhost:33333/activate?collection=$NAME&path="
REWRITE_URL=""
sist2 scan \
--threads 14 \
--mem-throttle 32768 \
--quality 1.0 \
--name $NAME \
--ocr-lang=eng+chi_sim \
--ocr-ebooks \
--ocr-images \
--exclude=$EXCLUDE \
--rewrite-url=$REWRITE_URL \
--incremental=$ORIG \
--output=$NEW \
$CONTENT
echo ">>> Scan complete"
rm -rf $ORIG
mv $NEW $ORIG
unset http_proxy
unset https_proxy
unset HTTP_PROXY
unset HTTPS_PROXY
sist2 index $ORIG --incremental-index

View File

@@ -0,0 +1,6 @@
[Unit]
Description=sist2-update
[Service]
User=yatli
ExecStart=/bin/bash /usr/bin/sist2-update-all.sh

View File

@@ -0,0 +1,10 @@
[Unit]
Description=sist2-update
[Timer]
OnCalendar=*-*-* 3:00:00
Persistent=true
Unit=sist2-update.service
[Install]
WantedBy=timers.target

View File

@@ -13,7 +13,7 @@
* [options](#web-options)
* [examples](#web-examples)
* [rewrite_url](#rewrite_url)
* [link to specific indices](#link-to-specific-indices)
* [elasticsearch](#elasticsearch)
* [exec-script](#exec-script)
* [tagging](#tagging)
* [sidecar files](#sidecar-files)
@@ -25,53 +25,65 @@ Usage: sist2 scan [OPTION]... PATH
or: sist2 exec-script [OPTION]... INDEX
Lightning-fast file system indexer and search tool.
-h, --help show this help message and exit
-v, --version Show version and exit
--verbose Turn on logging
--very-verbose Turn on debug messages
-h, --help show this help message and exit
-v, --version Show version and exit
--verbose Turn on logging
--very-verbose Turn on debug messages
Scan options
-t, --threads=<int> Number of threads. DEFAULT=1
-q, --quality=<flt> Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. DEFAULT=5
--size=<int> Thumbnail size, in pixels. Use negative value to disable. DEFAULT=500
--content-size=<int> Number of bytes to be extracted from text documents. Use negative value to disable. DEFAULT=32768
--incremental=<str> Reuse an existing index and only scan modified files.
-o, --output=<str> Output directory. DEFAULT=index.sist2/
--rewrite-url=<str> Serve files from this url instead of from disk.
--name=<str> Index display name. DEFAULT: (name of the directory)
--depth=<int> Scan up to DEPTH subdirectories deep. Use 0 to only scan files in PATH. DEFAULT: -1
--archive=<str> Archive file mode (skip|list|shallow|recurse). skip: Don't parse, list: only get file names as text, shallow: Don't parse archives inside archives. DEFAULT: recurse
--ocr=<str> Tesseract language (use tesseract --list-langs to see which are installed on your machine)
-e, --exclude=<str> Files that match this regex will not be scanned
--fast Only index file names & mime type
--treemap-threshold=<str> Relative size threshold for treemap (see USAGE.md). DEFAULT: 0.0005
--mem-buffer=<int> Maximum memory buffer size per thread in MB for files inside archives (see USAGE.md). DEFAULT: 2000
--read-subtitles Read subtitles from media files
-t, --threads=<int> Number of threads. DEFAULT=1
--mem-throttle=<int> Total memory threshold in MiB for scan throttling. DEFAULT=0
-q, --thumbnail-quality=<flt> Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. DEFAULT=1
--thumbnail-size=<int> Thumbnail size, in pixels. DEFAULT=500
--thumbnail-count=<int> Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT=1
--content-size=<int> Number of bytes to be extracted from text documents. Set to 0 to disable. DEFAULT=32768
--incremental=<str> Reuse an existing index and only scan modified files.
-o, --output=<str> Output directory. DEFAULT=index.sist2/
--rewrite-url=<str> Serve files from this url instead of from disk.
--name=<str> Index display name. DEFAULT: (name of the directory)
--depth=<int> Scan up to DEPTH subdirectories deep. Use 0 to only scan files in PATH. DEFAULT: -1
--archive=<str> Archive file mode (skip|list|shallow|recurse). skip: Don't parse, list: only get file names as text, shallow: Don't parse archives inside archives. DEFAULT: recurse
--archive-passphrase=<str> Passphrase for encrypted archive files
--ocr-lang=<str> Tesseract language (use 'tesseract --list-langs' to see which are installed on your machine)
--ocr-images Enable OCR'ing of image files.
--ocr-ebooks Enable OCR'ing of ebook files.
-e, --exclude=<str> Files that match this regex will not be scanned
--fast Only index file names & mime type
--treemap-threshold=<str> Relative size threshold for treemap (see USAGE.md). DEFAULT: 0.0005
--mem-buffer=<int> Maximum memory buffer size per thread in MiB for files inside archives (see USAGE.md). DEFAULT: 2000
--read-subtitles Read subtitles from media files.
--fast-epub Faster but less accurate EPUB parsing (no thumbnails, metadata)
--checksums Calculate file checksums when scanning.
--list-file=<str> Specify a list of newline-delimited paths to be scanned instead of normal directory traversal. Use '-' to read from stdin.
Index options
-t, --threads=<int> Number of threads. DEFAULT=1
--es-url=<str> Elasticsearch url with port. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
-p, --print Just print JSON documents to stdout.
--script-file=<str> Path to user script.
--mappings-file=<str> Path to Elasticsearch mappings.
--settings-file=<str> Path to Elasticsearch settings.
--async-script Execute user script asynchronously.
--batch-size=<int> Index batch size. DEFAULT: 100
-f, --force-reset Reset Elasticsearch mappings and settings. (You must use this option the first time you use the index command)
-t, --threads=<int> Number of threads. DEFAULT=1
--es-url=<str> Elasticsearch url with port. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
-p, --print Just print JSON documents to stdout.
--incremental-index Conduct incremental indexing, assumes that the old index is already digested by Elasticsearch.
--script-file=<str> Path to user script.
--mappings-file=<str> Path to Elasticsearch mappings.
--settings-file=<str> Path to Elasticsearch settings.
--async-script Execute user script asynchronously.
--batch-size=<int> Index batch size. DEFAULT: 100
-f, --force-reset Reset Elasticsearch mappings and settings. (You must use this option the first time you use the index command)
Web options
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--bind=<str> Listen on this address. DEFAULT=localhost:4090
--auth=<str> Basic auth in user:password format
--tag-auth=<str> Basic auth in user:password format for tagging
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--bind=<str> Listen on this address. DEFAULT=localhost:4090
--auth=<str> Basic auth in user:password format
--tag-auth=<str> Basic auth in user:password format for tagging
--tagline=<str> Tagline in navbar
--dev Serve html & js files from disk (for development)
--lang=<str> Default UI language. Can be changed by the user
Exec-script options
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--script-file=<str> Path to user script.
--async-script Execute user script asynchronously.
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--script-file=<str> Path to user script.
--async-script Execute user script asynchronously.
Made by simon987 <me@simon987.net>. Released under GPL-3.0
```
@@ -80,14 +92,22 @@ Made by simon987 <me@simon987.net>. Released under GPL-3.0
### Scan options
* `-t, --threads`
Number of threads for file parsing. **Do not set a number higher than `$(nproc)` or `$(Get-WmiObject Win32_ComputerSystem).NumberOfLogicalProcessors` in Windows!**
* `-q, --quality`
Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. *Does not affect PDF thumbnails quality*
* `--size`
Number of threads for file parsing. **Do not set a number higher than `$(nproc)` or `$(Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors` in Windows!**
* `--mem-throttle`
Total memory threshold in MiB for scan throttling. Worker threads will not start a new parse job
until the total memory usage of sist2 is below this threshold. Set to 0 to disable. DEFAULT=0
* `-q, --thumbnail-quality`
Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best.
* `--thumbnail-size`
Thumbnail size in pixels.
* `--thumbnail-count`
Maximum number of thumbnails to generate. When set to a value >= 2, thumbnails for video previews
will be generated. The actual number of thumbnails generated depends on the length of the video (maximum 1 image
every ~7s). Set to 0 to completely disable thumbnails.
* `--content-size`
Number of bytes of text to be extracted from the content of files (plain text and PDFs).
Number of bytes of text to be extracted from the content of files (plain text, PDFs etc.).
Repeated whitespace and special characters do not count toward this limit.
Set to 0 to completely disable content parsing.
* `--incremental`
Specify an existing index. Information about files in this index that were not modified (based on *mtime* attribute)
will be copied to the new index and will not be parsed again.
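The `--thumbnail-count` cap described above can be sketched as simple arithmetic. This is only an illustration: the helper name `maxVideoThumbnails`, the exact 7-second interval and the minimum of one thumbnail are assumptions, not sist2's actual implementation.

```typescript
// Hypothetical helper: upper bound on the number of thumbnails for one video.
// Assumption: at most one frame per 7 s, and at least one thumbnail when enabled.
const SECONDS_PER_THUMBNAIL = 7; // the text says "~7s"; the exact value is assumed

function maxVideoThumbnails(requestedCount: number, durationSeconds: number): number {
    if (requestedCount <= 0) {
        return 0; // --thumbnail-count=0 disables thumbnails entirely
    }
    // Number of frames the video length allows, never less than one
    const byDuration = Math.max(1, Math.floor(durationSeconds / SECONDS_PER_THUMBNAIL));
    // The requested count is a maximum, so take the smaller of the two
    return Math.min(requestedCount, byDuration);
}
```

Under these assumptions, a 35-second clip scanned with `--thumbnail-count=10` would get at most 5 preview frames.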
@@ -100,7 +120,7 @@ Made by simon987 <me@simon987.net>. Released under GPL-3.0
* list: Only get file names as text
* shallow: Don't parse archives inside archives.
* recurse: Scan archives recursively (default)
* `--ocr` See [OCR](../README.md#OCR)
* `--ocr-lang`, `--ocr-ebooks`, `--ocr-images` See [OCR](../README.md#OCR)
* `-e, --exclude` Regex pattern to exclude files. A file is excluded if the pattern matches any
part of the full absolute path.
@@ -120,11 +140,15 @@ Made by simon987 <me@simon987.net>. Released under GPL-3.0
In effect, smaller `treemap-threshold` values will yield a more detailed
(but also a more cluttered and harder to read) visualization.
* `--mem-buffer` Maximum memory buffer size in MB (per thread) for files inside archives. Media files
* `--mem-buffer` Maximum memory buffer size in MiB (per thread) for files inside archives. Media files
larger than this number will be read sequentially and no *seek* operations will be supported.
To check if a media file can be parsed without *seek*, execute `cat file.mp4 | ffprobe -`
* `--read-subtitles` When enabled, will attempt to read the subtitles stream from media files.
* `--fast-epub` Much faster but less accurate EPUB parsing. When enabled, sist2 will use a simple HTML parser to read epub files instead of the MuPDF library. No thumbnails are generated and author/title metadata are not parsed.
* `--checksums` Calculate file checksums (SHA1) when scanning files. This option does not cause any additional read
operations. Checksums are not calculated for all file types, unless the file is inside an archive. When enabled, duplicate
files are hidden in the web UI (this behaviour can be toggled in the Configuration page).
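How the web UI hides duplicates is not shown in this diff. A minimal sketch of one way to do it, keeping only the first hit for each checksum, could look like the following. The `Hit` shape and the `hideDuplicates` name are hypothetical.

```typescript
// Hypothetical hit shape: only the fields needed for this sketch.
interface Hit {
    id: string;
    checksum?: string; // SHA1 hex digest; absent when no checksum was calculated
}

// Keep the first hit for each checksum and drop later hits with the same value.
function hideDuplicates(hits: Hit[]): Hit[] {
    const seen = new Set<string>();
    return hits.filter((hit) => {
        if (hit.checksum === undefined) {
            return true; // nothing to compare against, always keep
        }
        if (seen.has(hit.checksum)) {
            return false; // same content already shown
        }
        seen.add(hit.checksum);
        return true;
    });
}
```

Hits without a checksum are always kept, which matches the note that checksums are not calculated for every file type.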
### Scan examples
@@ -145,15 +169,11 @@ sist2 scan --incremental ./orig_idx/ -o ./updated_idx/ ~/Documents
### Index format
A typical `binary` type index structure looks like this:
A typical `ndjson` type index structure looks like this:
```
documents.idx/
├── descriptor.json
├── _index_139965416830720
├── _index_139965425223424
├── _index_139965433616128
├── _index_139965442008832
├── _index_139965442008832
├── _index_main.ndjson.zst
├── treemap.csv
├── agg_mime.csv
├── agg_date.csv
@@ -169,9 +189,7 @@ documents.idx/
└── lock.mdb
```
The `_index_*` files contain the raw binary index data and are not meant to be
read by other applications. The format is generally compatible across different
sist2 versions.
The `_index_*.ndjson.zst` files contain the document data as newline-delimited JSON (one document per line), compressed with Zstandard.
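As a rough sketch of what "newline-delimited" means for a consumer: once the file has been decompressed (decompression is not shown here), each non-empty line parses as one JSON document. The `parseNdjson` helper and the sample fields are illustrative assumptions.

```typescript
// Parse already-decompressed NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): Record<string, unknown>[] {
    return text
        .split("\n")
        .filter((line) => line.trim() !== "") // tolerate a trailing newline
        .map((line) => JSON.parse(line) as Record<string, unknown>);
}

// Illustrative sample, not real sist2 output
const sample = '{"name":"a","size":1}\n{"name":"b","size":2}\n';
const docs = parseNdjson(sample);
```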
The `thumbs/` folder is a [LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database)
database containing the thumbnails.
@@ -181,66 +199,6 @@ following fields are safe to modify manually: `root`, `name`, [rewrite_url](#rew
The `.csv` are pre-computed aggregations necessary for the stats page.
*Advanced usage*
Instead of using the `scan` module, you can also import an index generated
by a third party application. The 'external' index must have the following format:
```
my_index/
├── descriptor.json
├── _index_0
└── thumbs/
| ├── data.mdb
| └── lock.mdb
└── meta/
└── <empty>
```
*descriptor.json*:
```json
{
"uuid": "<valid UUID4>",
"version": "_external_v1",
"root": "(optional)",
"name": "<name>",
"rewrite_url": "(optional)",
"type": "json",
"timestamp": 1578971024
}
```
*_index_0*: NDJSON format (One json object per line)
```json
{
"_id": "unique uuid for the file",
"index": "index uuid4 (same one as descriptor.json!)",
"mime": "application/x-cbz",
"size": 14341204,
"mtime": 1578882996,
"extension": "cbz",
"name": "my_book",
"path": "path/to/books",
"content": "text contents of the book",
"title": "Title of the book",
"tag": ["genre.fiction", "author.someguy", "etc..."],
"_keyword": [
{"k": "ISBN", "v": "ABCD34789231"}
],
"_text": [
{"k": "other", "v": "This will be indexed as text"}
]
}
```
You can find the full list of supported fields [here](../src/io/serialize.c#L90)
The `_keyword.*` items will be indexed and searchable as **keyword** fields (only full matches allowed).
The `_text.*` items will be indexed and searchable as **text** fields (fuzzy searching allowed)
*thumbs/*:
LMDB key-value store. Keys are **binary** 16-byte md5 hash* (`_id` field)
@@ -248,9 +206,6 @@ and values are raw image bytes.
*\* Hash is calculated from the full path of the file, including the extension, relative to the index root*
Importing an external `binary` type index is technically possible but
it is currently unsupported and has no guarantees of backward/forward compatibility.
## Index
### Index options
@@ -261,6 +216,9 @@ it is currently unsupported and has no guaranties of back/forward compatibility.
Elasticsearch index name. DEFAULT=sist2
* `-p, --print`
Print index in JSON format to stdout.
* `--incremental-index`
Conduct incremental indexing. Assumes that the old index is already ingested in Elasticsearch.
Only the new changes since the last scan will be sent.
* `--script-file`
Path to user script. See [Scripting](scripting.md).
* `--mappings-file`
@@ -276,6 +234,7 @@ it is currently unsupported and has no guaranties of back/forward compatibility.
down the process.
* `-f, --force-reset`
Reset Elasticsearch mappings and settings.
* `-t, --threads` Number of threads to use. Ideally, choose a number equal to the number of logical cores of the machine hosting Elasticsearch.
### Index examples
@@ -305,7 +264,11 @@ sist2 index --print ./my_index/ | jq | less
* `--auth=<str>` Basic auth in user:password format
* `--tag-auth=<str>` Basic auth in user:password format. Works the same way as the
`--auth` argument, but authentication is only applied to the `/tag/` endpoint.
* `--tagline=<str>` When specified, will replace the default tagline in the navbar.
* `--dev` Serve html & js files from disk (for development, used to modify frontend files without having to recompile)
* `--lang=<str>` Set the default web UI language (see #180 for a list of supported languages; the default
is `en`). The user can change the language in the configuration page.
### Web examples
**Single index**
@@ -324,14 +287,19 @@ sist2 web index1 index2 index3 index4
When the `rewrite_url` field is not empty, the web module ignores the `root`
field and will return a HTTP redirect to `<rewrite_url><path>/<name><extension>`
instead of serving the file from disk.
Both the `root` and `rewrite_url` fields are safe to manually modify from the
Both the `root` and `rewrite_url` fields are safe to manually modify from the
`descriptor.json` file.
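The redirect target pattern `<rewrite_url><path>/<name><extension>` can be illustrated with a small helper. This is a sketch, not the web module's code: `buildRedirectUrl` is a hypothetical name, the extension is assumed to be stored without its leading dot, and URL encoding is left out.

```typescript
// Hypothetical helper mirroring the documented pattern
// <rewrite_url><path>/<name><extension>.
function buildRedirectUrl(rewriteUrl: string, path: string, name: string, extension: string): string {
    // Files at the index root have an empty path, so no separator is added
    const dir = path === "" ? "" : path + "/";
    // Assumption: the extension is stored without its leading dot
    const ext = extension === "" ? "" : "." + extension;
    return rewriteUrl + dir + name + ext;
}
```

For example, `buildRedirectUrl("https://example.com/files/", "path/to/books", "my_book", "cbz")` would produce `https://example.com/files/path/to/books/my_book.cbz`.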
### Link to specific indices
# Elasticsearch
To link to specific indices, you can add a list of comma-separated index names to
the URL: `?i=<name>,<name>`. By default, indices with `"(nsfw)"` in their name are
not displayed.
Elasticsearch versions >=6.8.0, 7.X.X and 8.X.X are supported by sist2.
Using a version >=7.14.0 is recommended to enable the following features:
- Bug fix for large documents (See #198)
When using a legacy version of ES, a notice will be displayed next to the sist2 version in the web UI.
If you don't care about the features above, you can ignore it or disable it in the configuration page.
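The version rules above (supported from 6.8.0 through 8.x, legacy below 7.14.0) can be sketched as a small classifier. The function and type names are hypothetical, and how sist2 itself derives these flags is not shown here.

```typescript
interface EsVersion {
    major: number;
    minor: number;
    patch: number;
}

// Parse "7.14.0" into numeric parts; missing or invalid parts become 0.
function parseEsVersion(text: string): EsVersion {
    const parts = text.split(".").map((p) => parseInt(p, 10));
    return { major: parts[0] || 0, minor: parts[1] || 0, patch: parts[2] || 0 };
}

// True when v >= major.minor.patch
function atLeast(v: EsVersion, major: number, minor: number, patch: number): boolean {
    if (v.major !== major) return v.major > major;
    if (v.minor !== minor) return v.minor > minor;
    return v.patch >= patch;
}

// supported: >= 6.8.0 and no newer than 8.x; legacy: anything below 7.14.0
function classifyEsVersion(text: string): { supported: boolean; legacy: boolean } {
    const v = parseEsVersion(text);
    return {
        supported: atLeast(v, 6, 8, 0) && v.major <= 8,
        legacy: !atLeast(v, 7, 14, 0),
    };
}
```

With these rules, `6.8.0` is supported but legacy, while `7.14.0` and `8.1.2` are supported and not legacy.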
## exec-script
@@ -367,7 +335,7 @@ See [scripting](scripting.md) documentation.
# Sidecar files
When scanning, sist2 will read metadata from `.s2meta` JSON files and overwrite the
original document's metadata. Sidecar metadata files will also work inside archives.
original document's indexed metadata (does not modify the actual file). Sidecar metadata files will also work inside archives.
Sidecar files themselves are not saved in the index.
This feature is useful to leverage third-party applications such as speech-to-text or

Binary file not shown (size before: 3.9 KiB, after: 35 KiB).

Binary file not shown (size before: 889 KiB, after: 1011 KiB).

View File

@@ -4,6 +4,10 @@
"type": "keyword",
"doc_values": true
},
"checksum": {
"type": "keyword",
"index": false
},
"_depth": {
"type": "integer"
},
@@ -35,7 +39,7 @@
"index": false
},
"thumbnail": {
"type": "keyword",
"type": "integer",
"index": false
},
"videoc": {
@@ -74,6 +78,7 @@
"name": {
"analyzer": "content_analyzer",
"type": "text",
"fielddata": true,
"fields": {
"nGram": {
"type": "text",

View File

@@ -2,7 +2,8 @@
"index": {
"refresh_interval": "30s",
"codec": "best_compression",
"number_of_replicas": 0
"number_of_replicas": 0,
"highlight.max_analyzed_offset": 1000000
},
"analysis": {
"tokenizer": {
@@ -15,7 +16,7 @@
"delimiter": "."
},
"my_nGram_tokenizer": {
"type": "nGram",
"type": "ngram",
"min_gram": 3,
"max_gram": 3
}

View File

@@ -0,0 +1,58 @@
{
"index": {
"refresh_interval": "30s",
"codec": "best_compression",
"number_of_replicas": 0
},
"analysis": {
"tokenizer": {
"path_tokenizer": {
"type": "path_hierarchy",
"delimiter": "/"
},
"tag_tokenizer": {
"type": "path_hierarchy",
"delimiter": "."
},
"my_nGram_tokenizer": {
"type": "nGram",
"min_gram": 3,
"max_gram": 3
}
},
"analyzer": {
"path_analyzer": {
"tokenizer": "path_tokenizer",
"filter": [
"lowercase"
]
},
"tag_analyzer": {
"tokenizer": "tag_tokenizer",
"filter": [
"lowercase"
]
},
"case_insensitive_kw_analyzer": {
"tokenizer": "keyword",
"filter": [
"lowercase"
]
},
"my_nGram": {
"tokenizer": "my_nGram_tokenizer",
"filter": [
"lowercase",
"asciifolding"
]
},
"content_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
}

View File

@@ -5,6 +5,6 @@ rm -rf index.sist2/
python3 scripts/mime.py > src/parsing/mime_generated.c
python3 scripts/serve_static.py > src/web/static_generated.c
python3 scripts/index_static.py > src/index/static_generated.c
python3 scripts/magic_static.py > src/magic_generated.c
printf "static const char *const Sist2CommitHash = \"%s\";\n" $(git rev-parse HEAD) > src/git_hash.h
printf "static const char *const LibScanCommitHash = \"%s\";\n" $(cd third-party/libscan/ && git rev-parse HEAD) >> src/git_hash.h
printf "static const char *const Sist2CommitHash = \"%s\";\n" $(git rev-parse HEAD) > src/git_hash.h

View File

@@ -14,4 +14,4 @@ rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG=on -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" .
make -j $(nproc)
strip sist2
mv sist2 sist2-arm64-linux-debug
mv sist2_debug sist2-arm64-linux-debug

View File

@@ -3,6 +3,7 @@ import json
files = [
"schema/mappings.json",
"schema/settings.json",
"schema/settings_legacy.json",
"schema/pipeline.json",
]

8
scripts/magic_static.py Normal file
View File

@@ -0,0 +1,8 @@
try:
    with open("/usr/lib/file/magic.mgc", "rb") as f:
        data = f.read()
except OSError:
    # Fall back to an empty blob when the magic database cannot be read
    data = bytes([])
print("char magic_database_buffer[%d] = {%s};" % (len(data), ",".join(str(int(b)) for b in data)))

View File

@@ -22,6 +22,7 @@ application/java-archive, jar
application/java, class
application/javascript,
application/json, json
application/ndjson, jsonl|ndjson
application/marc, mrc
application/mbedlet, mbd
application/mime, aps
@@ -78,9 +79,7 @@ application/vocaltec-media-desc, vmd
application/vocaltec-media-file, vmf
application/warc, warc
application/winhelp, hlp
application/wordperfect6.0, w60
application/wordperfect6.1, w61
application/wordperfect, wp|wp5|wp6|wpd
application/wordperfect, wp|wp5|wp6|wpd|w60|w61
application/x-123, wk1
application/x-7z-compressed, 7z
application/x-aim, aim

3
scripts/start_dev_es.sh Executable file
View File

@@ -0,0 +1,3 @@
docker run --rm -it --name "sist2-dev-es" \
-p 9200:9200 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:7.14.0

3
scripts/start_dev_es_6.sh Executable file
View File

@@ -0,0 +1,3 @@
docker run --rm -it --name "sist2-dev-es-6" \
-p 9202:9200 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:6.8.0

3
scripts/start_dev_es_8.sh Executable file
View File

@@ -0,0 +1,3 @@
docker run --rm -it --name "sist2-dev-es" \
-p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:8.1.2

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

View File

@@ -8,10 +8,9 @@
},
"dependencies": {
"@egjs/vue-infinitegrid": "3.3.0",
"axios": "^0.21.1",
"axios": "^0.25.0",
"bootstrap-vue": "^2.21.2",
"core-js": "^3.6.5",
"crypto-es": "^1.2.7",
"d3": "^5.16.0",
"date-fns": "^2.21.3",
"dom-to-image": "^2.6.0",
@@ -22,7 +21,6 @@
"vue-color": "^2.8.1",
"vue-i18n": "^8.24.4",
"vue-masonry-wall": "^0.3.2",
"vue-multiselect": "^2.1.6",
"vue-router": "^3.2.0",
"vue-simple-suggest": "^1.11.1",
"vuex": "^3.4.0"

View File

@@ -146,6 +146,7 @@ html, body {
.theme-black .nav-tabs .nav-link {
color: #e0e0e0;
border-radius: 0;
}
.theme-black .nav-tabs .nav-item.show .nav-link, .theme-black .nav-tabs .nav-link.active {
@@ -309,4 +310,8 @@ mark {
display: inline-block;
width: 40%;
}
.pointer {
cursor: pointer;
}
</style>

View File

@@ -1,6 +1,5 @@
import axios from "axios";
import {ext, strUnescape, lum} from "./util";
import CryptoES from 'crypto-es';
export interface EsTag {
id: string
@@ -30,7 +29,6 @@ export interface EsHit {
_index: string
_id: string
_score: number
_path_md5: string
_type: string
_tags: Tag[]
_seq: number
@@ -50,6 +48,8 @@ export interface EsHit {
height: number
duration: number
tag: string[]
checksum: string
thumbnail: string
}
_props: {
isSubDocument: boolean
@@ -60,6 +60,9 @@ export interface EsHit {
isPlayableImage: boolean
isAudio: boolean
hasThumbnail: boolean
hasVidPreview: boolean
/** Number of thumbnails available */
tnNum: number
}
highlight: {
name: string[] | undefined,
@@ -130,6 +133,15 @@ class Sist2Api {
if ("thumbnail" in hit._source) {
hit._props.hasThumbnail = true;
if (Number.isNaN(Number(hit._source.thumbnail))) {
// Backwards compatibility
hit._props.tnNum = 1;
hit._props.hasVidPreview = false;
} else {
hit._props.tnNum = Number(hit._source.thumbnail);
hit._props.hasVidPreview = hit._props.tnNum > 1;
}
}
switch (mimeCategory) {
@@ -235,11 +247,6 @@ class Sist2Api {
res.hits.hits.forEach((hit: EsHit) => {
hit["_source"]["name"] = strUnescape(hit["_source"]["name"]);
hit["_source"]["path"] = strUnescape(hit["_source"]["path"]);
hit["_path_md5"] = CryptoES.MD5(
hit["_source"]["path"] +
(hit["_source"]["path"] ? "/" : "") +
hit["_source"]["name"] + ext(hit)
).toString();
this.setHitProps(hit);
this.setHitTags(hit);
@@ -250,20 +257,31 @@ class Sist2Api {
});
}
getMimeTypes() {
return this.esQuery({
aggs: {
mimeTypes: {
terms: {
field: "mime",
size: 10000
}
getMimeTypes(query = undefined) {
const AGGS = {
mimeTypes: {
terms: {
field: "mime",
size: 10000
}
},
size: 0,
}).then(resp => {
}
};
if (!query) {
query = {
aggs: AGGS,
size: 0,
};
} else {
query.size = 0;
query.aggs = AGGS;
}
return this.esQuery(query).then(resp => {
const mimeMap: any[] = [];
resp["aggregations"]["mimeTypes"]["buckets"].sort((a: any, b: any) => a.key > b.key).forEach((bucket: any) => {
const buckets = resp["aggregations"]["mimeTypes"]["buckets"];
buckets.sort((a: any, b: any) => a.key.localeCompare(b.key)).forEach((bucket: any) => {
const tmp = bucket["key"].split("/");
const category = tmp[0];
const mime = tmp[1];
@@ -283,11 +301,18 @@ class Sist2Api {
});
if (!category_exists) {
mimeMap.push({"text": category, children: [child]});
mimeMap.push({text: category, children: [child], id: category});
}
})
return mimeMap;
mimeMap.forEach(node => {
if (node.children) {
node.children.sort((a, b) => a.id.localeCompare(b.id));
}
})
mimeMap.sort((a, b) => a.id.localeCompare(b.id))
return {buckets, mimeMap};
});
}
@@ -311,10 +336,6 @@ class Sist2Api {
};
}
getDocInfo(docId: string) {
return axios.get(`${this.baseUrl}d/${docId}`);
}
getTags() {
return this.esQuery({
aggs: {
@@ -348,8 +369,7 @@ class Sist2Api {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
delete: false,
name: tag,
doc_id: hit["_id"],
path_md5: hit._path_md5
doc_id: hit["_id"]
});
}
@@ -357,8 +377,7 @@ class Sist2Api {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
delete: true,
name: tag,
doc_id: hit["_id"],
path_md5: hit._path_md5
doc_id: hit["_id"]
});
}
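The backwards-compatibility branch added to `setHitProps` in this file can be read in isolation as the sketch below. It mirrors the logic in the hunk, but the standalone `thumbnailProps` function and its return shape are hypothetical, and reporting `tnNum: 0` when there is no thumbnail is an assumption (the diff leaves the field unset in that case).

```typescript
interface ThumbnailProps {
    hasThumbnail: boolean;
    tnNum: number;
    hasVidPreview: boolean;
}

// Derive thumbnail properties from a document's `thumbnail` field.
// Newer indices store a thumbnail count; older ones may store a non-numeric value.
function thumbnailProps(source: { thumbnail?: string | number }): ThumbnailProps {
    if (!("thumbnail" in source)) {
        return { hasThumbnail: false, tnNum: 0, hasVidPreview: false };
    }
    const count = Number(source.thumbnail);
    if (Number.isNaN(count)) {
        // Backwards compatibility: treat a non-numeric value as a single thumbnail
        return { hasThumbnail: true, tnNum: 1, hasVidPreview: false };
    }
    // More than one thumbnail means a video preview is available
    return { hasThumbnail: true, tnNum: count, hasVidPreview: count > 1 };
}
```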

View File

@@ -43,6 +43,20 @@ const SORT_MODES = {
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.size
},
nameAsc: {
mode: [
{name: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
},
nameDesc: {
mode: [
{name: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
}
} as any;
@@ -55,7 +69,7 @@ interface SortMode {
class Sist2Query {
searchQuery(): any {
searchQuery(blankSearch: boolean = false): any {
const getters = store.getters;
@@ -73,26 +87,12 @@ class Sist2Query {
const selectedMimeTypes = getters.selectedMimeTypes;
const selectedTags = getters.selectedTags;
const legacyES = store.state.sist2Info.esVersionLegacy;
const filters = [
{terms: {index: selectedIndexIds}}
] as any[];
if (sizeMin && sizeMax) {
filters.push({range: {size: {gte: sizeMin, lte: sizeMax}}})
} else if (sizeMin) {
filters.push({range: {size: {gte: sizeMin}}})
} else if (sizeMax) {
filters.push({range: {size: {lte: sizeMax}}})
}
if (dateMin && dateMax) {
filters.push({range: {mtime: {gte: dateMin, lte: dateMax}}})
} else if (dateMin) {
filters.push({range: {mtime: {gte: dateMin}}})
} else if (dateMax) {
filters.push({range: {mtime: {lte: dateMax}}})
}
const fields = [
"name^8",
"content^3",
@@ -112,20 +112,39 @@ class Sist2Query {
fields.push("name.nGram^3");
}
const path = pathText.replace(/\/$/, "").toLowerCase(); //remove trailing slashes
if (path !== "") {
filters.push({term: {path: path}})
}
if (!blankSearch) {
if (sizeMin && sizeMax) {
filters.push({range: {size: {gte: sizeMin, lte: sizeMax}}})
} else if (sizeMin) {
filters.push({range: {size: {gte: sizeMin}}})
} else if (sizeMax) {
filters.push({range: {size: {lte: sizeMax}}})
}
if (selectedMimeTypes.length > 0) {
filters.push({terms: {"mime": selectedMimeTypes}});
}
if (dateMin && dateMax) {
filters.push({range: {mtime: {gte: dateMin, lte: dateMax}}})
} else if (dateMin) {
filters.push({range: {mtime: {gte: dateMin}}})
} else if (dateMax) {
filters.push({range: {mtime: {lte: dateMax}}})
}
if (selectedTags.length > 0) {
if (getters.optTagOrOperator) {
filters.push({terms: {"tag": selectedTags}});
} else {
selectedTags.forEach((tag: string) => filters.push({term: {"tag": tag}}));
const path = pathText.replace(/\/$/, "").toLowerCase(); //remove trailing slashes
if (path !== "") {
filters.push({term: {path: path}})
}
if (selectedMimeTypes.length > 0) {
filters.push({terms: {"mime": selectedMimeTypes}});
}
if (selectedTags.length > 0) {
if (getters.optTagOrOperator) {
filters.push({terms: {"tag": selectedTags}});
} else {
selectedTags.forEach((tag: string) => filters.push({term: {"tag": tag}}));
}
}
}
@@ -166,7 +185,7 @@ class Sist2Query {
size: size,
} as any;
if (!empty) {
if (!empty && !blankSearch) {
q.query.bool.must = query;
}
@@ -189,6 +208,11 @@ class Sist2Query {
font_name: {},
}
};
if (!legacyES) {
q.highlight.max_analyzed_offset = 999_999;
}
if (getters.optSearchInPath) {
q.highlight.fields["path.text"] = {};
q.highlight.fields["path.nGram"] = {};
@@ -216,7 +240,7 @@ class Sist2Query {
}
}
if (!empty) {
if (!empty && !blankSearch) {
q.query.function_score.query.bool.must.push(query);
}
}
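The size and date filters in `searchQuery` repeat the same three-way branch on the minimum and maximum bounds. One possible way to express that pattern once is sketched below. The `rangeFilter` helper is hypothetical and not part of this change. Like the original code, it treats `0` and `undefined` as "no bound".

```typescript
type Bounds = { gte?: number; lte?: number };
type RangeFilter = { range: Record<string, Bounds> };

// Build an Elasticsearch range filter, or null when neither bound is set.
function rangeFilter(field: string, min?: number, max?: number): RangeFilter | null {
    const bounds: Bounds = {};
    if (min) {
        bounds.gte = min; // truthiness check: 0 counts as unset, as in the diff
    }
    if (max) {
        bounds.lte = max;
    }
    if (bounds.gte === undefined && bounds.lte === undefined) {
        return null;
    }
    return { range: { [field]: bounds } };
}
```

A caller would push the result onto `filters` only when it is not `null`, once for `size` and once for `mtime`.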

View File

@@ -1,5 +1,31 @@
<template>
<div id="dateSlider"></div>
<div v-if="$store.state.optUseDatePicker">
<b-row>
<b-col sm="6">
<b-form-datepicker
value-as-date
:date-format-options="{ year: 'numeric', month: '2-digit', day: '2-digit' }"
:locale="$store.state.optLang"
class="mb-2"
:value="dateMin" @input="setDateMin"></b-form-datepicker>
</b-col>
<b-col sm="6">
<b-form-datepicker
value-as-date
:date-format-options="{ year: 'numeric', month: '2-digit', day: '2-digit' }"
:locale="$store.state.optLang"
class="mb-2"
:value="dateMax" @input="setDateMax"></b-form-datepicker>
</b-col>
</b-row>
</div>
<div v-else>
<b-row>
<b-col style="height: 70px;">
<div id="dateSlider"></div>
</b-col>
</b-row>
</div>
</template>
<script>
@@ -10,11 +36,36 @@ import {mergeTooltips} from "@/util-js";
export default {
name: "DateSlider",
methods: {
setDateMin(val) {
const epochDate = Math.ceil(+val / 1000);
this.$store.commit("setDateMin", epochDate);
},
setDateMax(val) {
const epochDate = Math.ceil(+val / 1000);
this.$store.commit("setDateMax", epochDate);
},
},
computed: {
dateMin() {
const dateMin = this.$store.state.dateMin ? this.$store.state.dateMin : this.$store.state.dateBoundsMin;
return new Date(dateMin * 1000)
},
dateMax() {
const dateMax = this.$store.state.dateMax ? this.$store.state.dateMax : this.$store.state.dateBoundsMax;
return new Date(dateMax * 1000)
}
},
mounted() {
this.$store.subscribe((mutation) => {
if (mutation.type === "setDateBoundsMax") {
const elem = document.getElementById("dateSlider");
if (elem === null) {
// Using b-form-datepicker, skip initialisation of slider
return
}
if (elem.children.length > 0) {
return;
}
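The date picker works with `Date` objects while the store keeps epoch seconds. The conversion used by `setDateMin`/`setDateMax` and by the `dateMin`/`dateMax` computed properties can be sketched as a pair of helpers. The helper names are hypothetical.

```typescript
// Date -> epoch seconds, rounding up as setDateMin/setDateMax do
function dateToEpochSeconds(date: Date): number {
    return Math.ceil(date.getTime() / 1000);
}

// Epoch seconds -> Date, as in the dateMin/dateMax computed properties
function epochSecondsToDate(epochSeconds: number): Date {
    return new Date(epochSeconds * 1000);
}
```

Because of the `Math.ceil`, a date with a fractional second is rounded up to the next whole second.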

View File

@@ -5,7 +5,6 @@
<b-card-body>
<!-- TODO: ES connectivity, Link to GH page -->
<b-table :items="tableItems" small borderless responsive="md" thead-class="hidden" class="mb-0"></b-table>
<hr />
@@ -16,7 +15,7 @@
<script>
import IndexDebugInfo from "@/components/IndexDebugInfo";
import DebugIcon from "@/components/DebugIcon";
import DebugIcon from "@/components/icons/DebugIcon";
export default {
name: "DebugInfo.vue",
@@ -28,10 +27,13 @@ export default {
{key: "platform", value: this.$store.state.sist2Info.platform},
{key: "debugBinary", value: this.$store.state.sist2Info.debug},
{key: "sist2CommitHash", value: this.$store.state.sist2Info.sist2Hash},
{key: "libscanCommitHash", value: this.$store.state.sist2Info.libscanHash},
{key: "esIndex", value: this.$store.state.sist2Info.esIndex},
{key: "tagline", value: this.$store.state.sist2Info.tagline},
{key: "dev", value: this.$store.state.sist2Info.dev},
{key: "mongooseVersion", value: this.$store.state.sist2Info.mongooseVersion},
{key: "esVersion", value: this.$store.state.sist2Info.esVersion},
{key: "esVersionSupported", value: this.$store.state.sist2Info.esVersionSupported},
{key: "esVersionLegacy", value: this.$store.state.sist2Info.esVersionLegacy},
]
}
}

View File

@@ -1,5 +1,6 @@
<template>
<div class="doc-card" :class="{'sub-document': doc._props.isSubDocument}" :style="`width: ${width}px`">
<div class="doc-card" :class="{'sub-document': doc._props.isSubDocument}" :style="`width: ${width}px`"
@click="$store.commit('busTnTouchStart', null)">
<b-card
no-body
img-top
@@ -10,36 +11,11 @@
<ContentDiv :doc="doc"></ContentDiv>
<!-- Thumbnail-->
<div v-if="doc._props.hasThumbnail" class="img-wrapper" @mouseenter="onTnEnter()" @mouseleave="onTnLeave()">
<div v-if="doc._props.isAudio" class="card-img-overlay" :class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div v-if="doc._props.isImage && !hover" class="card-img-overlay" :class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ `${doc._source.width}x${doc._source.height}` }}</span>
</div>
<div v-if="(doc._props.isVideo || doc._props.isGif) && doc._source.duration > 0 && !hover" class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div v-if="doc._props.isPlayableVideo" class="play">
<svg viewBox="0 0 494.942 494.942" xmlns="http://www.w3.org/2000/svg">
<path d="m35.353 0 424.236 247.471-424.236 247.471z"/>
</svg>
</div>
<img v-if="doc._props.isPlayableImage || doc._props.isPlayableVideo"
:src="(doc._props.isGif && hover) ? `f/${doc._id}` : `t/${doc._source.index}/${doc._id}`"
alt=""
class="pointer fit card-img-top" @click="onThumbnailClick()">
<img v-else :src="`t/${doc._source.index}/${doc._id}`" alt=""
class="fit card-img-top">
</div>
<FullThumbnail :doc="doc" :small-badge="smallBadge" @onThumbnailClick="onThumbnailClick()"></FullThumbnail>
<!-- Audio player-->
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls :type="doc._source.mime"
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls
:type="doc._source.mime"
:src="`f/${doc._id}`"
@play="onAudioPlay()"></audio>
@@ -66,31 +42,19 @@ import TagContainer from "@/components/TagContainer.vue";
import DocFileTitle from "@/components/DocFileTitle.vue";
import DocInfoModal from "@/components/DocInfoModal.vue";
import ContentDiv from "@/components/ContentDiv.vue";
import FullThumbnail from "@/components/FullThumbnail";
export default {
components: {ContentDiv, DocInfoModal, DocFileTitle, TagContainer},
components: {FullThumbnail, ContentDiv, DocInfoModal, DocFileTitle, TagContainer},
props: ["doc", "width"],
data() {
return {
ext: ext,
showInfo: false,
hover: false
}
},
computed: {
placeHolderStyle() {
const tokens = this.doc._source.thumbnail.split(",");
const w = Number(tokens[0]);
const h = Number(tokens[1]);
const MAX_HEIGHT = 400;
return {
height: `${Math.min((h / w) * this.width, MAX_HEIGHT)}px`,
}
},
smallBadge() {
return this.width < 150;
}
@@ -112,28 +76,10 @@ export default {
}
});
},
onTnEnter() {
this.hover = true;
},
onTnLeave() {
this.hover = false;
}
},
}
</script>
<style>
.img-wrapper {
position: relative;
}
.img-wrapper:hover svg {
fill: rgba(0, 0, 0, 1);
}
.pointer {
cursor: pointer;
}
.fit {
display: block;
min-width: 64px;
@@ -143,15 +89,17 @@ export default {
width: auto;
height: auto;
}
.audio-fit {
height: 39px;
vertical-align: bottom;
display: inline;
width: 100%;
}
</style>
<style scoped>
.card-img-top {
border-top-left-radius: 0;
border-top-right-radius: 0;
}
.padding-03 {
padding: 0.3rem;
}
@@ -169,55 +117,11 @@ export default {
padding: 0.3rem;
}
.thumbnail-placeholder {
}
.card-img-overlay {
pointer-events: none;
padding: 0.75rem;
bottom: unset;
top: 0;
left: unset;
right: unset;
}
.badge-resolution {
color: #212529;
background-color: #FFC107;
}
.play {
position: absolute;
width: 25px;
height: 25px;
left: 50%;
top: 50%;
transform: translate(-50%, -50%);
pointer-events: none;
}
.play svg {
fill: rgba(0, 0, 0, 0.7);
}
.doc-card {
padding-left: 3px;
padding-right: 3px;
}
.small-badge {
padding: 1px 3px;
font-size: 70%;
}
.audio-fit {
height: 39px;
vertical-align: bottom;
display: inline;
width: 100%;
}
.sub-document .card {
background: #AB47BC1F !important;
}

View File

@@ -2,9 +2,13 @@
<b-modal :visible="show" size="lg" :hide-footer="true" static lazy @close="$emit('close')" @hide="$emit('close')"
>
<template #modal-title>
<h5 class="modal-title" :title="doc._source.name + ext(doc)">{{ doc._source.name + ext(doc) }}</h5>
<h5 class="modal-title" :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
<router-link :to="`/file?byId=${doc._id}`">#</router-link>
</h5>
</template>
<img :src="`t/${doc._source.index}/${doc._id}`" alt="" class="fit card-img-top">
<img v-if="doc._props.hasThumbnail" :src="`t/${doc._source.index}/${doc._id}`" alt="" class="fit card-img-top">
<InfoTable :doc="doc"></InfoTable>

View File

@@ -1,10 +1,13 @@
<template>
<b-list-group-item class="flex-column align-items-start mb-2">
<b-list-group-item class="flex-column align-items-start mb-2" :class="{'sub-document': doc._props.isSubDocument}"
@mouseenter="onTnEnter()" @mouseleave="onTnLeave()">
<!-- Info modal-->
<DocInfoModal :show="showInfo" :doc="doc" @close="showInfo = false"></DocInfoModal>
<div class="media ml-2">
<!-- Thumbnail-->
<div v-if="doc._props.hasThumbnail" class="align-self-start mr-2 wrapper-sm">
<div class="img-wrapper">
<div v-if="doc._props.isPlayableVideo" class="play">
@@ -25,6 +28,7 @@
<FileIcon></FileIcon>
</div>
<!-- Doc line-->
<div class="doc-line ml-3">
<div style="display: flex">
<span class="info-icon" @click="showInfo = true"></span>
@@ -40,9 +44,11 @@
</div>
<div v-if="doc._source.pages || doc._source.author" class="path-row text-muted">
<span v-if="doc._source.pages">{{ doc._source.pages }} {{ doc._source.pages > 1 ? $t("pages") : $t("page") }}</span>
<span v-if="doc._source.pages">{{ doc._source.pages }} {{
doc._source.pages > 1 ? $t("pages") : $t("page")
}}</span>
<span v-if="doc._source.author && doc._source.pages" class="mx-1">-</span>
<span v-if="doc._source.author">{{doc._source.author}}</span>
<span v-if="doc._source.author">{{ doc._source.author }}</span>
</div>
</div>
</div>
@@ -54,7 +60,7 @@ import TagContainer from "@/components/TagContainer";
import DocFileTitle from "@/components/DocFileTitle";
import DocInfoModal from "@/components/DocInfoModal";
import ContentDiv from "@/components/ContentDiv";
import FileIcon from "@/components/FileIcon";
import FileIcon from "@/components/icons/FileIcon";
export default {
name: "DocListItem",
@@ -83,12 +89,26 @@ export default {
return this.doc.highlight["path.nGram"] + "/"
}
return this.doc._source.path + "/"
}
},
onTnEnter() {
this.hover = true;
},
onTnLeave() {
this.hover = false;
},
}
}
</script>
<style scoped>
.sub-document {
background: #AB47BC1F !important;
}
.theme-black .sub-document {
background: #37474F !important;
}
.list-group {
margin-top: 1em;
}
@@ -137,6 +157,7 @@ export default {
.list-group-item .img-wrapper {
width: 88px;
height: 88px;
position: relative;
}
.fit-sm {

View File

@@ -0,0 +1,173 @@
<template>
<div v-if="doc._props.hasThumbnail" class="img-wrapper" @mouseenter="onTnEnter()" @mouseleave="onTnLeave()"
@touchstart="onTouchStart()">
<div v-if="doc._props.isAudio" class="card-img-overlay" :class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div
v-if="doc._props.isImage && !hover && doc._props.tnW / doc._props.tnH < 5"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ `${doc._source.width}x${doc._source.height}` }}</span>
</div>
<div v-if="(doc._props.isVideo || doc._props.isGif) && doc._source.duration > 0 && !hover"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div v-if="doc._props.isPlayableVideo" class="play">
<svg viewBox="0 0 494.942 494.942" xmlns="http://www.w3.org/2000/svg">
<path d="m35.353 0 424.236 247.471-424.236 247.471z"/>
</svg>
</div>
<img ref="tn"
v-if="doc._props.isPlayableImage || doc._props.isPlayableVideo"
:src="tnSrc"
alt=""
:style="{height: (doc._props.isGif && hover) ? `${tnHeight()}px` : undefined}"
class="pointer fit card-img-top" @click="onThumbnailClick()">
<img v-else :src="tnSrc" alt=""
class="fit card-img-top">
<ThumbnailProgressBar v-if="hover && doc._props.hasVidPreview"
:progress="(currentThumbnailNum + 1) / (doc._props.tnNum)"
></ThumbnailProgressBar>
</div>
</template>
<script>
import {humanTime} from "@/util";
import ThumbnailProgressBar from "@/components/ThumbnailProgressBar";
export default {
name: "FullThumbnail",
props: ["doc", "smallBadge"],
components: {ThumbnailProgressBar},
data() {
return {
hover: false,
currentThumbnailNum: 0,
timeoutId: null
}
},
created() {
this.$store.subscribe((mutation) => {
if (mutation.type === "busTnTouchStart" && mutation.payload !== this.doc._id) {
this.onTnLeave();
}
});
},
computed: {
tnSrc() {
const doc = this.doc;
const props = doc._props;
if (props.isGif && this.hover) {
return `f/${doc._id}`;
}
return (this.currentThumbnailNum === 0)
? `t/${doc._source.index}/${doc._id}`
: `t/${doc._source.index}/${doc._id}${String(this.currentThumbnailNum).padStart(4, "0")}`;
},
},
methods: {
humanTime: humanTime,
onThumbnailClick() {
this.$emit("onThumbnailClick");
},
tnHeight() {
return this.$refs.tn.height;
},
tnWidth() {
return this.$refs.tn.width;
},
onTnEnter() {
this.hover = true;
if (this.doc._props.hasVidPreview) {
this.currentThumbnailNum += 1;
this.scheduleNextTnNum();
}
},
onTnLeave() {
this.currentThumbnailNum = 0;
this.hover = false;
if (this.timeoutId !== null) {
window.clearTimeout(this.timeoutId);
this.timeoutId = null;
}
},
scheduleNextTnNum() {
const INTERVAL = this.$store.state.optVidPreviewInterval ?? 700;
this.timeoutId = window.setTimeout(() => {
if (!this.hover) {
return;
}
this.scheduleNextTnNum();
if (this.currentThumbnailNum === this.doc._props.tnNum - 1) {
this.currentThumbnailNum = 0;
} else {
this.currentThumbnailNum += 1;
}
}, INTERVAL);
},
onTouchStart() {
this.$store.commit("busTnTouchStart", this.doc._id);
if (!this.hover) {
this.onTnEnter()
}
},
}
}
</script>
<style scoped>
.img-wrapper {
position: relative;
}
.img-wrapper:hover svg {
fill: rgba(0, 0, 0, 1);
}
.card-img-top {
border-top-left-radius: 0;
border-top-right-radius: 0;
}
.play {
position: absolute;
width: 25px;
height: 25px;
left: 50%;
top: 50%;
transform: translate(-50%, -50%);
pointer-events: none;
}
.play svg {
fill: rgba(0, 0, 0, 0.7);
}
.badge-resolution {
color: #212529;
background-color: #FFC107;
}
.card-img-overlay {
pointer-events: none;
padding: 0.75rem;
bottom: unset;
top: 0;
left: unset;
right: unset;
}
.small-badge {
padding: 1px 3px;
font-size: 70%;
}
</style>
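
The hover preview in the component above boils down to two small rules: which thumbnail URL to request for a given frame, and how the frame counter wraps. A minimal sketch of both, pulled out as standalone functions; the names `thumbnailSrc` and `nextFrame` are hypothetical and are not part of the component.

```javascript
// Hypothetical helper mirroring the tnSrc computed property:
// frame 0 is the plain thumbnail, later frames append a 4-digit, zero-padded number.
function thumbnailSrc(index, id, frame) {
    const base = `t/${index}/${id}`;
    return frame === 0 ? base : base + String(frame).padStart(4, "0");
}

// Hypothetical helper mirroring the counter update in scheduleNextTnNum:
// wrap back to frame 0 after the last frame, otherwise advance by one.
function nextFrame(current, frameCount) {
    return current === frameCount - 1 ? 0 : current + 1;
}
```

Keeping these two rules free of component state makes them easy to check without mounting anything; the timer handling itself stays in the component.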

View File

@@ -1,93 +1,191 @@
<template>
<VueMultiselect
multiple
label="name"
:value="selectedIndices"
:options="indices"
:close-on-select="indices.length <= 1"
:placeholder="$t('indexPickerPlaceholder')"
@select="addItem"
@remove="removeItem">
<div v-if="isMobile">
<b-form-select
:value="selectedIndicesIds"
@change="onSelect($event)"
:options="indices" multiple :select-size="6" text-field="name"
value-field="id"></b-form-select>
</div>
<div v-else>
<template slot="option" slot-scope="idx">
<b-row>
<b-col>
<span class="mr-1">{{ idx.option.name }}</span>
<SmallBadge pill :text="idx.option.version"></SmallBadge>
</b-col>
</b-row>
<b-row class="mt-1">
<b-col>
<span>{{ formatIdxDate(idx.option.timestamp) }}</span>
</b-col>
</b-row>
</template>
<div class="d-flex justify-content-between align-content-center">
<span>
{{ selectedIndices.length }}
{{ selectedIndices.length === 1 ? $t("indexPicker.selectedIndex") : $t("indexPicker.selectedIndices") }}
</span>
</VueMultiselect>
<div>
<b-button variant="link" @click="selectAll()"> {{ $t("indexPicker.selectAll") }}</b-button>
<b-button variant="link" @click="selectNone()"> {{ $t("indexPicker.selectNone") }}</b-button>
</div>
</div>
<b-list-group id="index-picker-desktop" class="unselectable">
<b-list-group-item
v-for="idx in indices"
@click="toggleIndex(idx, $event)"
@click.shift="shiftClick(idx, $event)"
class="d-flex justify-content-between align-items-center list-group-item-action pointer"
:class="{active: lastClickIndex === idx}"
>
<div class="d-flex">
<b-checkbox style="pointer-events: none" :checked="isSelected(idx)"></b-checkbox>
{{ idx.name }}
<span class="text-muted timestamp-text ml-2">{{ formatIdxDate(idx.timestamp) }}</span>
</div>
<b-badge class="version-badge">v{{ idx.version }}</b-badge>
</b-list-group-item>
</b-list-group>
</div>
</template>
<script lang="ts">
import VueMultiselect from "vue-multiselect"
import SmallBadge from "./SmallBadge.vue"
import {mapActions, mapGetters} from "vuex";
import {Index} from "@/Sist2Api";
import Vue from "vue";
import {format} from "date-fns";
export default Vue.extend({
components: {
VueMultiselect,
SmallBadge
},
data() {
return {
loading: true
loading: true,
lastClickIndex: null
}
},
computed: {
...mapGetters([
"indices", "selectedIndices"
]),
selectedIndicesIds() {
return this.selectedIndices.map(idx => idx.id)
},
isMobile() {
return window.innerWidth <= 650;
}
},
methods: {
...mapActions({
setSelectedIndices: "setSelectedIndices"
}),
removeItem(val: Index): void {
this.setSelectedIndices(this.selectedIndices.filter((item: Index) => item !== val))
shiftClick(index, e) {
if (this.lastClickIndex === null) {
return;
}
const select = this.isSelected(this.lastClickIndex);
let leftBoundary = this.indices.indexOf(this.lastClickIndex);
let rightBoundary = this.indices.indexOf(index);
if (rightBoundary < leftBoundary) {
let tmp = leftBoundary;
leftBoundary = rightBoundary;
rightBoundary = tmp;
}
for (let i = leftBoundary; i <= rightBoundary; i++) {
if (select) {
if (!this.isSelected(this.indices[i])) {
this.setSelectedIndices([this.indices[i], ...this.selectedIndices]);
}
} else {
this.setSelectedIndices(this.selectedIndices.filter(idx => idx !== this.indices[i]));
}
}
},
addItem(val: Index): void {
this.setSelectedIndices([...this.selectedIndices, val])
selectAll() {
this.setSelectedIndices(this.indices);
},
selectNone() {
this.setSelectedIndices([]);
},
onSelect(value) {
this.setSelectedIndices(this.indices.filter(idx => value.includes(idx.id)));
},
formatIdxDate(timestamp: number): string {
return format(new Date(timestamp * 1000), "yyyy-MM-dd");
},
toggleIndex(index, e) {
if (e.shiftKey) {
return;
}
this.lastClickIndex = index;
if (this.isSelected(index)) {
this.setSelectedIndices(this.selectedIndices.filter(idx => idx.id != index.id));
} else {
this.setSelectedIndices([index, ...this.selectedIndices]);
}
},
isSelected(index) {
return this.selectedIndices.find(idx => idx.id == index.id) != null;
}
},
})
</script>
<style src="vue-multiselect/dist/vue-multiselect.min.css"></style>
<style>
.multiselect__option {
padding: 5px 10px;
<style scoped>
.timestamp-text {
line-height: 24px;
font-size: 80%;
}
.multiselect__content-wrapper {
overflow: hidden;
.theme-black .version-badge {
color: #eee !important;
background: none;
}
.theme-black .multiselect__tags {
background: #37474F;
border: 1px solid #616161 !important
.version-badge {
color: #222 !important;
background: none;
}
.theme-black .multiselect__input {
color: #dbdbdb;
background: #37474F;
.list-group-item {
padding: 0.2em 0.4em;
}
.theme-black .multiselect__content-wrapper {
border: none
#index-picker-desktop {
overflow-y: auto;
max-height: 132px;
}
.btn-link:focus {
box-shadow: none;
}
.unselectable {
user-select: none;
-ms-user-select: none;
-moz-user-select: none;
-webkit-user-select: none;
}
.list-group-item.active {
z-index: 2;
background-color: inherit;
color: inherit;
}
.theme-black .list-group-item {
border: 1px solid rgba(255,255,255, 0.1);
}
.theme-black .list-group-item:first-child {
border: 1px solid rgba(255,255,255, 0.05);
}
.theme-black .list-group-item.active {
z-index: 2;
background-color: inherit;
color: inherit;
border: 1px solid rgba(255,255,255, 0.3);
border-radius: 0;
}
.theme-black .list-group {
border-radius: 0;
}
</style>
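
The `shiftClick` handler above applies the state of the last clicked index to every index between it and the shift-clicked one. The same rule can be sketched as a pure function that computes the new selection in one pass instead of dispatching inside the loop. This is an illustration only: `applyShiftRange` is a hypothetical name, and it compares by `id` as `isSelected` does.

```javascript
// Hypothetical pure version of the shift-click range rule.
// indices:  every index, in display order
// selected: the currently selected indices
// anchor:   the index clicked last (without shift)
// target:   the index that was shift-clicked
function applyShiftRange(indices, selected, anchor, target) {
    const a = indices.indexOf(anchor);
    const b = indices.indexOf(target);
    if (a === -1 || b === -1) {
        return selected;
    }
    const [lo, hi] = a <= b ? [a, b] : [b, a];
    const range = indices.slice(lo, hi + 1);
    const isSelected = (idx) => selected.some(s => s.id === idx.id);

    if (isSelected(anchor)) {
        // Anchor is selected: select everything in the range that is not selected yet
        return [...range.filter(idx => !isSelected(idx)), ...selected];
    }
    // Anchor is not selected: deselect everything in the range
    return selected.filter(s => !range.some(idx => idx.id === s.id));
}
```

The result could then be passed to `setSelectedIndices` once, which avoids relying on the store being updated synchronously between loop iterations.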

View File

@@ -1,9 +1,8 @@
<template>
<b-table :items="tableItems" small borderless responsive="md" thead-class="hidden" class="mb-0 mt-4">
<template #cell(value)="data">
<span v-if="'html' in data.item" v-html="data.item.html"></span>
<span v-else>{{data.value}}</span>
<span v-else>{{ data.value }}</span>
</template>
</b-table>
</template>
@@ -33,12 +32,18 @@ function dmsToDecimal(dms, ref) {
export default {
name: "InfoTable",
props: ["doc"],
data() {
return {
indexName: "loading..."
}
},
computed: {
tableItems() {
this.indexName;
const src = this.doc._source;
const items = [
{key: "index", value: `[${this.$store.getters.indexMap[src.index].name}]`},
{key: "index", value: `[${this.indexName}]`},
{key: "mtime", value: humanDate(src.mtime)},
{key: "mime", value: src.mime},
{key: "size", value: humanFileSize(src.size)},
@@ -57,7 +62,8 @@ export default {
"bitrate", "artist", "album", "album_artist", "genre", "font_name", "author",
"modified_by", "pages", "tag",
"exif_make", "exif_software", "exif_exposure_time", "exif_fnumber", "exif_focal_length",
"exif_user_comment", "exif_iso_speed_ratings", "exif_model", "exif_datetime",
"exif_user_comment", "exif_iso_speed_ratings", "exif_model", "exif_datetime",
"checksum"
];
fields.forEach(field => {
@@ -66,6 +72,12 @@ export default {
}
});
Object.keys(src).forEach(key => {
if (key.startsWith("mt_") || key.startsWith("int_")) {
items.push({key: key, value: src[key]});
}
});
// Exif GPS
if ("exif_gps_longitude_dec" in src) {
items.push({
@@ -76,15 +88,24 @@ export default {
items.push({
key: "Exif GPS",
html: makeGpsLink(
dmsToDecimal(src["exif_gps_latitude_dms"], src["exif_gps_latitude_ref"]),
dmsToDecimal(src["exif_gps_longitude_dms"], src["exif_gps_longitude_ref"]),
),
dmsToDecimal(src["exif_gps_latitude_dms"], src["exif_gps_latitude_ref"]),
dmsToDecimal(src["exif_gps_longitude_dms"], src["exif_gps_longitude_ref"]),
),
});
}
return items;
}
}
},
mounted() {
if (this.$store.getters.indexMap[this.doc._source.index]) {
this.indexName = this.$store.getters.indexMap[this.doc._source.index].name
}
window.setTimeout(() => {
this.indexName = this.$store.getters.indexMap[this.doc._source.index].name
}, 500)
},
}
</script>

View File

@@ -1,11 +1,13 @@
<template>
<Preloader v-if="loading"></Preloader>
<div v-else-if="content" class="content-div">{{ content }}</div>
<div v-else-if="content" class="content-div" v-html="content"></div>
</template>
<script>
import Sist2Api from "@/Sist2Api";
import Preloader from "@/components/Preloader";
import Sist2Query from "@/Sist2Query";
import store from "@/store";
export default {
name: "LazyContentDiv",
@@ -18,10 +20,72 @@ export default {
}
},
mounted() {
Sist2Api.getDocInfo(this.docId).then(src => {
this.content = src.data.content;
const query = Sist2Query.searchQuery();
if (this.$store.state.optHighlight) {
const fields = this.$store.state.fuzzy
? {"content.nGram": {}}
: {content: {}};
query.highlight = {
pre_tags: ["<mark>"],
post_tags: ["</mark>"],
number_of_fragments: 0,
fields,
};
if (!store.state.sist2Info.esVersionLegacy) {
query.highlight.max_analyzed_offset = 999_999;
}
}
if ("function_score" in query.query) {
query.query = query.query.function_score.query;
}
if (!("must" in query.query.bool)) {
query.query.bool.must = [];
} else if (!Array.isArray(query.query.bool.must)) {
query.query.bool.must = [query.query.bool.must];
}
query.query.bool.must.push({match: {_id: this.docId}});
delete query["sort"];
delete query["aggs"];
delete query["search_after"];
delete query.query["function_score"];
query._source = {
includes: ["content", "name", "path", "extension"]
}
query.size = 1;
Sist2Api.esQuery(query).then(resp => {
this.loading = false;
})
if (resp.hits.hits.length === 1) {
this.content = this.getContent(resp.hits.hits[0]);
} else {
console.log("FIXME: could not get content")
console.log(resp)
}
});
},
methods: {
getContent(doc) {
if (!doc.highlight) {
return doc._source.content;
}
if (doc.highlight["content.nGram"]) {
return doc.highlight["content.nGram"][0];
}
if (doc.highlight.content) {
return doc.highlight.content[0];
}
}
}
}
</script>
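
The `mounted()` hook above turns the current search query into a single-document query so that Elasticsearch returns the whole content with highlights. A minimal sketch of that transformation as a pure function, assuming the base query holds a `bool` query once any `function_score` wrapper is removed; `buildContentQuery` and its option names are hypothetical.

```javascript
// Hypothetical pure version of the query rewrite done in mounted().
// Works on a deep copy so the caller's query object is left untouched.
function buildContentQuery(baseQuery, docId, {highlight, fuzzy, legacyEs}) {
    const query = JSON.parse(JSON.stringify(baseQuery));

    if (highlight) {
        query.highlight = {
            pre_tags: ["<mark>"],
            post_tags: ["</mark>"],
            // 0 fragments: return the whole field instead of snippets
            number_of_fragments: 0,
            fields: fuzzy ? {"content.nGram": {}} : {content: {}},
        };
        if (!legacyEs) {
            query.highlight.max_analyzed_offset = 999999;
        }
    }

    // Scoring is irrelevant for a single document: unwrap function_score
    if ("function_score" in query.query) {
        query.query = query.query.function_score.query;
    }

    // Normalise bool.must to an array, then restrict the query to one document
    const bool = query.query.bool;
    if (!("must" in bool)) {
        bool.must = [];
    } else if (!Array.isArray(bool.must)) {
        bool.must = [bool.must];
    }
    bool.must.push({match: {_id: docId}});

    delete query.sort;
    delete query.aggs;
    delete query.search_after;

    query._source = {includes: ["content", "name", "path", "extension"]};
    query.size = 1;
    return query;
}
```

Keeping the user's original clauses in `must` is what lets the highlighter mark the same terms in the modal that matched in the search results.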

View File

@@ -1,7 +1,7 @@
<template>
<div>
<!-- TODO: Set slideshowTime as a configurable option-->
<div :class="{'disable-animations': $store.state.optSimpleLightbox}">
<FsLightbox
ref="lightbox"
:key="lightboxKey"
:toggler="showLightbox"
:sources="lightboxSources"
@@ -10,8 +10,8 @@
:types="lightboxTypes"
:source-index="lightboxSlide"
:custom-toolbar-buttons="customButtons"
:slideshow-time="1000 * 10"
:zoom-increment="0.5"
:slideshow-time="$store.getters.optLightboxSlideDuration * 1000"
:zoom-increment="0.25"
:load-only-current-source="$store.getters.optLightboxLoadOnlyCurrent"
:on-close="onClose"
:on-open="onShow"
@@ -30,6 +30,7 @@ export default {
components: {FsLightbox},
data() {
return {
disableAnimations: true,
customButtons: [
{
viewBox: "0 0 384.928 384.928",
@@ -65,7 +66,84 @@ export default {
return this.$store.getters["uiLightboxTypes"];
}
},
mounted() {
const listener = document.onkeydown;
document.onkeydown = (e) => {
const ret = this.keyDownListener(e)
if (listener && ret) {
return listener(e);
}
};
},
methods: {
keyDownListener(e) {
const isLightboxClosed = this.$refs.lightbox === undefined || this.$refs.lightbox.$el.tagName === undefined;
if (isLightboxClosed) {
return true;
}
const lightboxStore = this.$refs.lightbox.fsLightboxStore.slice(-1)[0];
switch (e.key) {
case " ": {
e.preventDefault();
e.stopPropagation();
e.stopImmediatePropagation();
// Find video at current slide, toggle play/pause
[...document.getElementsByClassName("fslightbox-absoluted")].forEach(elem => {
if (elem.style.transform === "translate(0px)" || elem.style.transform === "translate(0px, 0px)") {
const vid = elem.getElementsByTagName("video")[0];
if (vid) {
if (vid.paused) {
vid.play();
} else {
vid.pause()
}
}
}
return false;
});
return false;
}
case "ArrowUp":
case "k": {
if (!lightboxStore.data.isThumbing && lightboxStore.core.thumbsToggler) {
lightboxStore.core.thumbsToggler.toggleThumbs();
}
return false;
}
case "ArrowDown":
case "j": {
if (lightboxStore.data.isThumbing && lightboxStore.core.thumbsToggler) {
lightboxStore.core.thumbsToggler.toggleThumbs();
}
return false;
}
case "h": {
if (lightboxStore.core.stageManager.getPreviousSlideIndex) {
lightboxStore.core.slideIndexChanger.jumpTo(lightboxStore.core.stageManager.getPreviousSlideIndex());
}
return false;
}
case "l": {
if (lightboxStore.core.stageManager.getNextSlideIndex) {
lightboxStore.core.slideIndexChanger.jumpTo(lightboxStore.core.stageManager.getNextSlideIndex());
}
return false;
}
}
return true;
},
onDownloadClick() {
const url = this.lightboxSources[this.lightboxSlide];
@@ -126,4 +204,20 @@ export default {
.fslightbox-toolbar-button:nth-child(7) {
order: 7;
}
.disable-animations .fslightbox-container {
background: rgba(30,30,30,.9);
}
.disable-animations .fslightbox-transform-transition {
transition: none;
}
.disable-animations .fslightbox-fade-in-strong {
animation: none;
}
.fslightbox-container video, .fslightbox-container img {
cursor: unset !important;
}
</style>
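
The keyboard handling above has two parts: a key-to-action mapping, and a wrapper that only forwards an event to the previously registered `document.onkeydown` listener when the lightbox did not handle it. A rough sketch of both without any DOM or FsLightbox internals; `lightboxKeyAction`, `chainKeyDown` and the action names are hypothetical.

```javascript
// Hypothetical mapping of a key to the action the listener above performs.
// "none" means the key is consumed but nothing changes;
// "passThrough" means the key is left to other listeners.
function lightboxKeyAction(key, isThumbing) {
    switch (key) {
        case " ":
            return "togglePlayback";
        case "ArrowUp":
        case "k":
            return isThumbing ? "none" : "showThumbs";
        case "ArrowDown":
        case "j":
            return isThumbing ? "hideThumbs" : "none";
        case "h":
            return "previous";
        case "l":
            return "next";
        default:
            return "passThrough";
    }
}

// Hypothetical wrapper: call the previous listener only when the new
// handler returns true, as the mounted() hook above does.
function chainKeyDown(previous, handler) {
    return (e) => {
        const passThrough = handler(e);
        if (previous && passThrough) {
            return previous(e);
        }
    };
}
```

A handler built this way could be assigned to `document.onkeydown` in place of the inline closure; the behaviour is intended to match, though this has not been verified against FsLightbox itself.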

View File

@@ -3,7 +3,7 @@
<p>
<b>{{
`[${$store.getters.indices.find(i => i.id === hit._source.index).name}]`
}}</b>{{ `/${hit._source.path}/${hit._source.name}${ext(hit)}` }}
}}</b>{{ `${hit._source.path === '' ? '' : '/'}${hit._source.path}/${hit._source.name}${ext(hit)}` }}
</p>
<p style="margin-top: -1em">
<span v-if="hit._source.width">{{ `${hit._source.width}x${hit._source.height}`}}</span>

View File

@@ -7,48 +7,113 @@ import InspireTree from "inspire-tree";
import InspireTreeDOM from "inspire-tree-dom";
import "inspire-tree-dom/dist/inspire-tree-light.min.css";
import {getSelectedTreeNodes} from "@/util";
import {getSelectedTreeNodes, getTreeNodeAttributes} from "@/util";
import Sist2Api from "@/Sist2Api";
import Sist2Query from "@/Sist2Query";
export default {
name: "MimePicker",
data() {
return {
mimeTree: null,
stashedMimeTreeAttributes: null,
updateBusy: false
}
},
mounted() {
this.$store.subscribe((mutation) => {
if (mutation.type === "setUiMimeMap") {
const mimeMap = mutation.payload.slice();
this.mimeTree = new InspireTree({
selection: {
mode: 'checkbox'
},
data: mimeMap
});
new InspireTreeDOM(this.mimeTree, {
target: '#mimeTree'
});
this.mimeTree.on("node.state.changed", this.handleTreeClick);
this.mimeTree.deselect();
if (this.$store.state._onLoadSelectedMimeTypes.length > 0) {
this.$store.state._onLoadSelectedMimeTypes.forEach(mime => {
this.mimeTree.node(mime).select();
});
}
if (mutation.type === "setUiMimeMap" && this.mimeTree === null) {
this.initializeTree();
} else if (mutation.type === "busSearch") {
this.updateTree();
}
});
},
methods: {
handleTreeClick(node, e) {
if (e === "indeterminate" || e === "collapsed") {
if (e === "indeterminate" || e === "collapsed" || e === 'rendered' || e === "focused") {
return;
}
if (this.updateBusy) {
return;
}
this.$store.commit("setSelectedMimeTypes", getSelectedTreeNodes(this.mimeTree));
},
updateTree() {
if (this.$store.getters.optUpdateMimeMap === false) {
return;
}
if (this.updateBusy) {
return
}
this.updateBusy = true;
if (this.stashedMimeTreeAttributes === null) {
this.stashedMimeTreeAttributes = getTreeNodeAttributes(this.mimeTree);
}
const query = Sist2Query.searchQuery();
Sist2Api.getMimeTypes(query).then(({buckets, mimeMap}) => {
this.$store.commit("setUiMimeMap", mimeMap);
this.$store.commit("setUiDetailsMimeAgg", buckets);
this.mimeTree.removeAll();
this.mimeTree.addNodes(mimeMap);
// Restore selected mimes
if (this.stashedMimeTreeAttributes === null) {
// NOTE: This happens when successive fast searches are triggered
this.stashedMimeTreeAttributes = {};
// Always add the selected mime types
this.$store.state.selectedMimeTypes.forEach(mime => {
this.stashedMimeTreeAttributes[mime] = {
checked: true
}
});
}
Object.entries(this.stashedMimeTreeAttributes).forEach(([mime, attributes]) => {
if (this.mimeTree.node(mime)) {
if (attributes.checked) {
this.mimeTree.node(mime).select();
}
if (attributes.collapsed === false) {
this.mimeTree.node(mime).expand();
}
}
});
this.stashedMimeTreeAttributes = null;
this.updateBusy = false;
});
},
initializeTree() {
const mimeMap = this.$store.state.uiMimeMap;
this.mimeTree = new InspireTree({
selection: {
mode: "checkbox"
},
data: mimeMap
});
new InspireTreeDOM(this.mimeTree, {
target: "#mimeTree"
});
this.mimeTree.on("node.state.changed", this.handleTreeClick);
this.mimeTree.deselect();
if (this.$store.state._onLoadSelectedMimeTypes.length > 0) {
this.$store.state._onLoadSelectedMimeTypes.forEach(mime => {
this.mimeTree.node(mime).select();
});
}
}
}
}
</script>
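
`updateTree()` above rebuilds the MIME tree after each search and then restores the checked and expanded state from a stash, skipping MIME types that no longer appear in the results. The restore step can be sketched independently of InspireTree; `planRestore` is a hypothetical name, and the stash is assumed to have the `{mime: {checked, collapsed}}` shape that the code above reads.

```javascript
// Hypothetical pure version of the restore loop in updateTree().
// stashed:        {mime: {checked?: boolean, collapsed?: boolean}}
// availableMimes: Set of node ids present in the rebuilt tree
function planRestore(stashed, availableMimes) {
    const select = [];
    const expand = [];
    Object.entries(stashed).forEach(([mime, attributes]) => {
        if (!availableMimes.has(mime)) {
            // Node is gone after the new search, nothing to restore
            return;
        }
        if (attributes.checked) {
            select.push(mime);
        }
        if (attributes.collapsed === false) {
            expand.push(mime);
        }
    });
    return {select, expand};
}
```

The caller would then invoke `node(mime).select()` and `node(mime).expand()` for the returned ids, as the component already does.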

View File

@@ -8,7 +8,8 @@
</b-navbar-brand>
<span class="badge badge-pill version" v-if="$store && $store.state.sist2Info">
{{ sist2Version() }}<span v-if="isDebug()">-dbg</span>
v{{ sist2Version() }}<span v-if="isDebug()">-dbg</span><span v-if="isLegacy() && !hideLegacy()">-<a
href="https://github.com/simon987/sist2/blob/master/docs/USAGE.md#elasticsearch" target="_blank">legacyES</a></span>
</span>
<span v-if="$store && $store.state.sist2Info" class="tagline" v-html="tagline()"></span>
@@ -19,7 +20,8 @@
</template>
<script>
import Sist2Icon from "@/components/Sist2Icon";
import Sist2Icon from "@/components/icons/Sist2Icon";
export default {
name: "NavBar",
components: {Sist2Icon},
@@ -32,6 +34,12 @@ export default {
},
isDebug() {
return this.$store.state.sist2Info.debug;
},
isLegacy() {
return this.$store.state.sist2Info.esVersionLegacy;
},
hideLegacy() {
return this.$store.state.optHideLegacy;
}
}
}
@@ -95,7 +103,7 @@ export default {
}
}
.theme-light .btn-link{
.theme-light .btn-link {
color: #222;
}
</style>

View File

@@ -3,31 +3,56 @@
<span>{{ hitCount }} {{ hitCount === 1 ? $t("hit") : $t("hits") }}</span>
<div style="float: right">
<b-button v-b-toggle.collapse-1 variant="primary" class="not-mobile">{{ $t("details") }}</b-button>
<b-button v-b-toggle.collapse-1 variant="primary" class="not-mobile" @click="onToggle()">{{
$t("details")
}}
</b-button>
<SortSelect class="ml-2"></SortSelect>
<template v-if="hitCount !== 0">
<SortSelect class="ml-2"></SortSelect>
<DisplayModeToggle class="ml-2"></DisplayModeToggle>
<DisplayModeToggle class="ml-2"></DisplayModeToggle>
</template>
</div>
<b-collapse id="collapse-1" class="pt-2" style="clear:both;">
<b-card>
<b-table :items="tableItems" small borderless thead-class="hidden" class="mb-0"></b-table>
<b-table :items="tableItems" small borderless bordered thead-class="hidden" class="mb-0"></b-table>
<br/>
<h4>
{{$t("mimeTypes")}}
<b-button size="sm" variant="primary" class="float-right" @click="onCopyClick"><ClipboardIcon/></b-button>
</h4>
<Preloader v-if="$store.state.uiDetailsMimeAgg == null"></Preloader>
<b-table
v-else
sort-by="doc_count"
:sort-desc="true"
thead-class="hidden"
:items="$store.state.uiDetailsMimeAgg" small bordered class="mb-0"
></b-table>
</b-card>
</b-collapse>
</b-card>
</template>
<script lang="ts">
import {EsResult} from "@/Sist2Api";
import Sist2Api, {EsResult} from "@/Sist2Api";
import Vue from "vue";
import {humanFileSize, humanTime} from "@/util";
import {humanFileSize} from "@/util";
import DisplayModeToggle from "@/components/DisplayModeToggle.vue";
import SortSelect from "@/components/SortSelect.vue";
import Preloader from "@/components/Preloader.vue";
import Sist2Query from "@/Sist2Query";
import ClipboardIcon from "@/components/icons/ClipboardIcon.vue";
export default Vue.extend({
name: "ResultsCard",
components: {SortSelect, DisplayModeToggle},
components: {ClipboardIcon, Preloader, SortSelect, DisplayModeToggle},
created() {
},
computed: {
lastResultsLoaded() {
return this.$store.state.lastQueryResults != null;
@@ -52,6 +77,39 @@ export default Vue.extend({
totalSize() {
return humanFileSize((this.$store.state.lastQueryResults as EsResult).aggregations.total_size.value);
},
onToggle() {
const show = !document.getElementById("collapse-1").classList.contains("show");
this.$store.commit("setUiShowDetails", show);
if (show && this.$store.state.uiDetailsMimeAgg == null && !this.$store.state.optUpdateMimeMap) {
// Mime aggs are not updated automatically, update now
this.forceUpdateMimeAgg();
}
},
onCopyClick() {
let tsvString = "";
this.$store.state.uiDetailsMimeAgg.slice().sort((a,b) => b["doc_count"] - a["doc_count"]).forEach(row => {
tsvString += `${row["key"]}\t${row["doc_count"]}\n`;
});
navigator.clipboard.writeText(tsvString);
this.$bvToast.toast(
this.$t("toast.copiedToClipboard"),
{
title: null,
noAutoHide: false,
toaster: "b-toaster-bottom-right",
headerClass: "hidden",
bodyClass: "toast-body-info",
});
},
forceUpdateMimeAgg() {
const query = Sist2Query.searchQuery();
Sist2Api.getMimeTypes(query).then(({buckets}) => {
this.$store.commit("setUiDetailsMimeAgg", buckets);
});
}
},
});
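
`onCopyClick()` above serialises the MIME aggregation as tab-separated values, sorted by descending document count. A small sketch of that serialisation on its own, assuming buckets of the form `{key, doc_count}` as used above; `mimeAggToTsv` is a hypothetical name.

```javascript
// Hypothetical pure version of the TSV export in onCopyClick().
// Sorts a copy so the store's array is not reordered.
function mimeAggToTsv(buckets) {
    return buckets
        .slice()
        .sort((a, b) => b.doc_count - a.doc_count)
        .map(row => `${row.key}\t${row.doc_count}\n`)
        .join("");
}
```

The resulting string is what would be handed to `navigator.clipboard.writeText`.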

View File

@@ -19,6 +19,14 @@
{{ $t("sort.sizeDesc") }}
</b-dropdown-item>
<b-dropdown-item :class="{'dropdown-active': sort === 'nameDesc'}" @click="onSelect('nameDesc')">
{{ $t("sort.nameDesc") }}
</b-dropdown-item>
<b-dropdown-item :class="{'dropdown-active': sort === 'nameAsc'}" @click="onSelect('nameAsc')">
{{ $t("sort.nameAsc") }}
</b-dropdown-item>
<b-dropdown-item :class="{'dropdown-active': sort === 'random'}" @click="onSelect('random')">
{{ $t("sort.random") }}
</b-dropdown-item>

View File

@@ -51,7 +51,7 @@
>{{ tag.text.split(".").pop() }}</span>
<b-popover :target="hit._id+tag.rawText" triggers="focus blur" placement="top">
<b-button variant="danger" @click="onTagDeleteClick(tag, $event)">Delete</b-button>
<b-button variant="danger" @click="onTagDeleteClick(tag, $event)">{{$t("deleteTag")}}</b-button>
</b-popover>
</div>
@@ -63,7 +63,7 @@
</template>
<!-- Add button -->
<small v-if="showAddButton" class="badge add-tag-button" @click="tagAdd()">Add</small>
<small v-if="showAddButton" class="badge add-tag-button" @click="tagAdd()">{{$t("addTag")}}</small>
<!-- Size tag-->
<small v-else class="text-muted badge-size">{{

View File

@@ -1,5 +1,13 @@
<template>
<div id="tagTree"></div>
<div>
<b-input-group v-if="showSearchBar" id="tag-picker-filter-bar">
<b-form-input :value="filter"
:placeholder="$t('tagFilter')"
@input="onFilter($event)"></b-form-input>
</b-input-group>
<div id="tagTree"></div>
</div>
</template>
<script>
@@ -112,15 +120,17 @@ function addTag(map, tag, id, count) {
export default {
name: "TagPicker",
props: ["showSearchBar"],
data() {
return {
tagTree: null,
loadedFromArgs: false,
filter: ""
}
},
mounted() {
this.$store.subscribe((mutation) => {
if (mutation.type === "setUiMimeMap") {
if (mutation.type === "setUiMimeMap" && this.tagTree === null) {
this.initializeTree();
this.updateTree();
} else if (mutation.type === "busUpdateTags") {
@@ -129,6 +139,10 @@ export default {
});
},
methods: {
onFilter(value) {
this.filter = value;
this.tagTree.search(value);
},
initializeTree() {
const tagMap = [];
this.tagTree = new InspireTree({
@@ -147,6 +161,7 @@ export default {
this.tagTree.on("node.state.changed", this.handleTreeClick);
},
updateTree() {
// TODO: remember which tags are selected and restore?
const tagMap = [];
Sist2Api.getTags().then(tags => {
tags.forEach(tag => addTag(tagMap, tag.id, tag.id, tag.count));
@@ -162,7 +177,8 @@ export default {
});
},
handleTreeClick(node, e) {
if (e === "indeterminate" || e === "collapsed" || e === 'rendered') {
if (e === "indeterminate" || e === "collapsed" || e === 'rendered' || e === "focused"
|| e === "matched" || e === "hidden") {
return;
}
@@ -179,7 +195,15 @@ export default {
}
</style>
<style>
.inspire-tree .focused>.wholerow {
.inspire-tree .focused > .wholerow {
border: none;
}
#tag-picker-filter-bar {
padding: 10px 4px 4px;
}
.theme-black .inspire-tree .matched > .wholerow {
background: rgba(251, 191, 41, 0.25);
}
</style>
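
The `handleTreeClick` change above returns early for tree state changes that do not correspond to the user toggling a tag. A minimal sketch of the same check written as a set lookup follows; `IGNORED_TREE_EVENTS` and `isSelectionEvent` are hypothetical names used for illustration and are not part of the component.

```typescript
// Hypothetical helper: the state names that handleTreeClick returns early on.
const IGNORED_TREE_EVENTS: ReadonlySet<string> = new Set([
    "indeterminate", "collapsed", "rendered", "focused", "matched", "hidden",
]);

// Returns true when a state change should be treated as a selection change.
function isSelectionEvent(event: string): boolean {
    return !IGNORED_TREE_EVENTS.has(event);
}
```

Keeping the names in one set would make it easier to add further states (such as the `matched` and `hidden` states triggered by the new filter bar) without growing the condition.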


@@ -0,0 +1,40 @@
<template>
<div class="thumbnail-progress-bar" :style="{width: `${percentProgress}%`}"></div>
</template>
<script>
export default {
name: "ThumbnailProgressBar",
props: ["doc", "progress"],
computed: {
percentProgress() {
return Math.min(Math.max(this.progress * 100, 0), 100);
}
}
}
</script>
<style scoped>
.thumbnail-progress-bar {
position: absolute;
left: 0;
bottom: 0;
height: 4px;
background: #2196f3AA;
z-index: 9;
}
.theme-black .thumbnail-progress-bar {
background: rgba(0, 188, 212, 0.95);
}
.sub-document .thumbnail-progress-bar {
max-width: calc(100% - 8px);
left: 4px;
}
</style>
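
The `percentProgress` computed property above clamps its input before it is used as a CSS width. The same arithmetic as a standalone function, assuming `progress` is a fraction between 0 and 1 (the function name is reused here only for illustration):

```typescript
// Convert a 0..1 progress fraction to a percentage clamped to [0, 100].
function percentProgress(progress: number): number {
    return Math.min(Math.max(progress * 100, 0), 100);
}
```

Out-of-range inputs therefore never produce a bar wider than its container or a negative width.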


@@ -0,0 +1,21 @@
<template>
<svg style="width:24px;height:24px" viewBox="0 0 24 24">
<path
fill="currentColor"
d="M17,9H7V7H17M17,13H7V11H17M14,17H7V15H14M12,3A1,1 0 0,1 13,4A1,1 0 0,1 12,5A1,1 0 0,1 11,4A1,1 0 0,1 12,3M19,3H14.82C14.4,1.84 13.3,1 12,1C10.7,1 9.6,1.84 9.18,3H5A2,2 0 0,0 3,5V19A2,2 0 0,0 5,21H19A2,2 0 0,0 21,19V5A2,2 0 0,0 19,3Z"/>
</svg>
</template>
<script>
export default {
name: "ClipboardIcon"
}
</script>
<style scoped>
svg {
display: inline-block;
width: 20px;
height: 20px;
}
</style>


@@ -0,0 +1,21 @@
<template>
<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24">
<path
fill="currentColor"
d="M12 0c-6.627 0-12 5.373-12 12s5.373 12 12 12 12-5.373 12-12-5.373-12-12-12zm1 16.057v-3.057h2.994c-.059 1.143-.212 2.24-.456 3.279-.823-.12-1.674-.188-2.538-.222zm1.957 2.162c-.499 1.33-1.159 2.497-1.957 3.456v-3.62c.666.028 1.319.081 1.957.164zm-1.957-7.219v-3.015c.868-.034 1.721-.103 2.548-.224.238 1.027.389 2.111.446 3.239h-2.994zm0-5.014v-3.661c.806.969 1.471 2.15 1.971 3.496-.642.084-1.3.137-1.971.165zm2.703-3.267c1.237.496 2.354 1.228 3.29 2.146-.642.234-1.311.442-2.019.607-.344-.992-.775-1.91-1.271-2.753zm-7.241 13.56c-.244-1.039-.398-2.136-.456-3.279h2.994v3.057c-.865.034-1.714.102-2.538.222zm2.538 1.776v3.62c-.798-.959-1.458-2.126-1.957-3.456.638-.083 1.291-.136 1.957-.164zm-2.994-7.055c.057-1.128.207-2.212.446-3.239.827.121 1.68.19 2.548.224v3.015h-2.994zm1.024-5.179c.5-1.346 1.165-2.527 1.97-3.496v3.661c-.671-.028-1.329-.081-1.97-.165zm-2.005-.35c-.708-.165-1.377-.373-2.018-.607.937-.918 2.053-1.65 3.29-2.146-.496.844-.927 1.762-1.272 2.753zm-.549 1.918c-.264 1.151-.434 2.36-.492 3.611h-3.933c.165-1.658.739-3.197 1.617-4.518.88.361 1.816.67 2.808.907zm.009 9.262c-.988.236-1.92.542-2.797.9-.89-1.328-1.471-2.879-1.637-4.551h3.934c.058 1.265.231 2.488.5 3.651zm.553 1.917c.342.976.768 1.881 1.257 2.712-1.223-.49-2.326-1.211-3.256-2.115.636-.229 1.299-.435 1.999-.597zm9.924 0c.7.163 1.362.367 1.999.597-.931.903-2.034 1.625-3.257 2.116.489-.832.915-1.737 1.258-2.713zm.553-1.917c.27-1.163.442-2.386.501-3.651h3.934c-.167 1.672-.748 3.223-1.638 4.551-.877-.358-1.81-.664-2.797-.9zm.501-5.651c-.058-1.251-.229-2.46-.492-3.611.992-.237 1.929-.546 2.809-.907.877 1.321 1.451 2.86 1.616 4.518h-3.933z"/>
</svg>
</template>
<script>
export default {
name: "LanguageIcon"
}
</script>
<style scoped>
svg {
display: inline-block;
width: 20px;
height: 20px;
}
</style>


@@ -1,16 +1,22 @@
export default {
en: {
filePage: {
notFound: "Not found"
},
searchBar: {
simple: "Search",
advanced: "Advanced search",
fuzzy: "Fuzzy"
},
addTag: "Add",
deleteTag: "Delete",
download: "Download",
and: "and",
page: "page",
pages: "pages",
mimeTypes: "Media types",
tags: "Tags",
tagFilter: "Filter tags",
help: {
simpleSearch: "Simple search",
advancedSearch: "Advanced search",
@@ -62,7 +68,14 @@ export default {
lightboxLoadOnlyCurrent: "Do not preload full-size images for adjacent slides in image viewer.",
slideDuration: "Slide duration",
resultSize: "Number of results per page",
tagOrOperator: "Use OR operator when specifying multiple tags."
tagOrOperator: "Use OR operator when specifying multiple tags.",
hideDuplicates: "Hide duplicate results based on checksum",
hideLegacy: "Hide the 'legacyES' Elasticsearch notice",
updateMimeMap: "Update the Media Types tree in real time",
useDatePicker: "Use a Date Picker component rather than a slider",
vidPreviewInterval: "Video preview frame duration in ms",
simpleLightbox: "Disable animations in image viewer",
showTagPickerFilter: "Display the tag filter bar"
},
queryMode: {
simple: "Simple",
@@ -70,7 +83,8 @@ export default {
},
lang: {
en: "English",
fr: "Français"
fr: "Français",
"zh-CN": "简体中文",
},
displayMode: {
grid: "Grid",
@@ -124,18 +138,21 @@ export default {
esQueryErr: "Could not parse or execute query, please check the Advanced search documentation. " +
"See server logs for more information.",
dupeTagTitle: "Duplicate tag",
dupeTag: "This tag already exists for this document."
dupeTag: "This tag already exists for this document.",
copiedToClipboard: "Copied to clipboard"
},
saveTagModalTitle: "Add tag",
saveTagPlaceholder: "Tag name",
confirm: "Confirm",
indexPickerPlaceholder: "Select indices",
indexPickerPlaceholder: "Select an index",
sort: {
relevance: "Relevance",
dateAsc: "Date (Older first)",
dateDesc: "Date (Newer first)",
sizeAsc: "Size (Smaller first)",
sizeDesc: "Size (Larger first)",
nameAsc: "Name (A-z)",
nameDesc: "Name (Z-a)",
random: "Random",
},
d3: {
@@ -143,20 +160,32 @@ export default {
mimeSize: "Size distribution by media type",
dateHistogram: "File modification time distribution",
sizeHistogram: "File size distribution",
}
},
indexPicker: {
selectNone: "Select None",
selectAll: "Select All",
selectedIndex: "selected index",
selectedIndices: "selected indices",
},
},
fr: {
filePage: {
notFound: "Fichier introuvable"
},
searchBar: {
simple: "Recherche",
advanced: "Recherche avancée",
fuzzy: "Approximatif"
},
addTag: "Ajouter",
deleteTag: "Supprimer",
download: "Télécharger",
and: "et",
page: "page",
pages: "pages",
mimeTypes: "Types de médias",
tags: "Tags",
tagFilter: "Filtrer les tags",
help: {
simpleSearch: "Recherche simple",
advancedSearch: "Recherche avancée",
@@ -209,7 +238,14 @@ export default {
lightboxLoadOnlyCurrent: "Désactiver le chargement des diapositives adjacentes pour le visualiseur d'images",
slideDuration: "Durée des diapositives",
resultSize: "Nombre de résultats par page",
tagOrOperator: "Utiliser l'opérateur OU lors de la spécification de plusieurs tags"
tagOrOperator: "Utiliser l'opérateur OU lors de la spécification de plusieurs tags",
hideDuplicates: "Masquer les résultats en double",
hideLegacy: "Masquer la notice 'legacyES' Elasticsearch",
updateMimeMap: "Mettre à jour l'arbre de Types de médias en temps réel",
useDatePicker: "Afficher un composant « Date Picker » plutôt qu'un slider",
vidPreviewInterval: "Durée des images d'aperçu vidéo en millisecondes",
simpleLightbox: "Désactiver les animations du visualiseur d'images",
showTagPickerFilter: "Afficher le filtre dans l'onglet Tags"
},
queryMode: {
simple: "Simple",
@@ -217,7 +253,8 @@ export default {
},
lang: {
en: "English",
fr: "Français"
fr: "Français",
"zh-CN": "简体中文",
},
displayMode: {
grid: "Grille",
@@ -272,7 +309,8 @@ export default {
esQueryErr: "Impossible d'analyser ou d'exécuter la requête, veuillez consulter la documentation sur la " +
"recherche avancée. Voir les journaux du serveur pour plus d'informations.",
dupeTagTitle: "Tag en double",
dupeTag: "Ce tag existe déjà pour ce document."
dupeTag: "Ce tag existe déjà pour ce document.",
copiedToClipboard: "Copié dans le presse-papier"
},
saveTagModalTitle: "Ajouter un tag",
saveTagPlaceholder: "Nom du tag",
@@ -284,6 +322,8 @@ export default {
dateDesc: "Date (Plus récent)",
sizeAsc: "Taille (Plus petit)",
sizeDesc: "Taille (Plus grand)",
nameAsc: "Nom (A-z)",
nameDesc: "Nom (Z-a)",
random: "Aléatoire",
},
d3: {
@@ -291,6 +331,181 @@ export default {
mimeSize: "Distribution des tailles de fichiers par type de média",
dateHistogram: "Distribution des dates de modification",
sizeHistogram: "Distribution des tailles de fichier",
}
}
}
},
indexPicker: {
selectNone: "Sélectionner aucun",
selectAll: "Sélectionner tout",
selectedIndex: "indice sélectionné",
selectedIndices: "indices sélectionnés",
},
},
"zh-CN": {
filePage: {
notFound: "未找到"
},
searchBar: {
simple: "搜索",
advanced: "高级搜索",
fuzzy: "模糊搜索"
},
addTag: "添加",
deleteTag: "删除",
download: "下载",
and: "与",
page: "页",
pages: "页",
mimeTypes: "文件类型",
tags: "标签",
tagFilter: "筛选标签",
help: {
simpleSearch: "简易搜索",
advancedSearch: "高级搜索",
help: "帮助",
term: "<关键词>",
and: "与操作",
or: "或操作",
not: "反选单个关键词",
quotes: "括起来的部分视为一个关键词,保序",
prefix: "在词尾使用时,匹配前缀",
parens: "表达式编组",
tildeTerm: "匹配编辑距离以内的关键词",
tildePhrase: "匹配短语,容忍一些非匹配词",
example1:
"例如: <code>\"番茄\" +(炒蛋 | 牛腩) -饭</code> 将匹配" +
"短语 <i>番茄炒蛋</i>、<i>炒蛋</i> 或者 <i>牛腩</i>,而忽略任何带有" +
"<i>饭</i>的关键词.",
defaultOperator:
"表达式中无<code>+</code>或者<code>|</code>时,默认使用" +
"<code>+</code>(与操作)。",
fuzzy:
"选中<b>模糊搜索</b>选项时返回部分匹配的结果(3-grams)。",
moreInfoSimple: "详细信息:<a target=\"_blank\" " +
"rel=\"noreferrer\" href=\"//www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html\">Elasticsearch文档</a>",
moreInfoAdvanced: "高级搜索模式文档:<a target=\"_blank\" rel=\"noreferrer\" href=\"//www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax\">Elasticsearch文档</a>"
},
config: "配置",
configDescription: "配置在此浏览器中实时保存。",
configReset: "重置所有设置",
searchOptions: "搜索选项",
treemapOptions: "树状图选项",
displayOptions: "显示选项",
opt: {
lang: "语言",
highlight: "启用高亮",
fuzzy: "默认使用模糊搜索",
searchInPath: "匹配文档路径",
suggestPath: "搜索框启用自动补全",
fragmentSize: "高亮上下文大小",
queryMode: "搜索模式",
displayMode: "显示",
columns: "列数",
treemapType: "树状图类属性",
treemapTiling: "树状图平铺",
treemapColorGroupingDepth: "树状图颜色编组深度(展开)",
treemapColor: "树状图颜色(折叠)",
treemapSize: "树状图大小",
theme: "主题",
lightboxLoadOnlyCurrent: "在图片查看器中,不要预读相邻的全图",
slideDuration: "幻灯片时长",
resultSize: "每页结果数",
tagOrOperator: "使用或操作(OR)匹配多个标签。",
hideDuplicates: "使用校验码隐藏重复结果",
hideLegacy: "隐藏'legacyES' Elasticsearch 通知",
updateMimeMap: "媒体类型树的实时更新",
useDatePicker: "使用日期选择器组件而不是滑块",
vidPreviewInterval: "视频预览帧的持续时间,以毫秒为单位",
simpleLightbox: "在图片查看器中,禁用动画",
showTagPickerFilter: "显示标签过滤栏"
},
queryMode: {
simple: "简单",
advanced: "高级",
},
lang: {
en: "English",
fr: "Français",
"zh-CN": "简体中文",
},
displayMode: {
grid: "网格",
list: "列表",
},
columns: {
auto: "自动"
},
treemapType: {
cascaded: "折叠",
flat: "平铺(紧凑)"
},
treemapSize: {
small: "小",
medium: "中",
large: "大",
xLarge: "加大",
xxLarge: "加加大",
custom: "自订",
},
treemapTiling: {
binary: "Binary",
squarify: "Squarify",
slice: "Slice",
dice: "Dice",
sliceDice: "Slice & Dice",
},
theme: {
light: "亮",
black: "暗"
},
hit: "命中",
hits: "命中",
details: "详细信息",
stats: "统计信息",
queryTime: "查询时间",
totalSize: "总大小",
pathBar: {
placeholder: "过滤路径",
modalTitle: "选择路径"
},
debug: "调试信息",
debugDescription: "对调试除错有用的信息。 若您遇到bug或者想建议新功能，请提交新Issue到" +
"<a href='https://github.com/simon987/sist2/issues/new/choose'>这里</a>.",
tagline: "标签栏",
toast: {
esConnErrTitle: "Elasticsearch连接错误",
esConnErr: "sist2 web 模块连接Elasticsearch出错。" +
"查看服务日志以获取更多信息。",
esQueryErrTitle: "查询错误",
esQueryErr: "无法识别或执行查询,请查阅高级搜索文档。" +
"查看服务日志以获取更多信息。",
dupeTagTitle: "重复标签",
dupeTag: "该标签已存在于此文档。",
copiedToClipboard: "复制到剪贴板"
},
saveTagModalTitle: "增加标签",
saveTagPlaceholder: "标签名",
confirm: "确认",
indexPickerPlaceholder: "选择一个索引",
sort: {
relevance: "相关度",
dateAsc: "日期(由旧到新)",
dateDesc: "日期(由新到旧)",
sizeAsc: "大小(从小到大)",
sizeDesc: "大小(从大到小)",
nameAsc: "名字(A-z)",
nameDesc: "名字(Z-a)",
random: "随机",
},
d3: {
mimeCount: "各类文件数量分布",
mimeSize: "各类文件大小分布",
dateHistogram: "文件修改时间分布",
sizeHistogram: "文件大小分布",
},
indexPicker: {
selectNone: "清空",
selectAll: "全选",
selectedIndex: "选中索引",
selectedIndices: "选中索引",
},
},
}
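
With a third locale added, keys such as `sort.nameAsc` or `indexPicker.selectAll` have to be kept in sync across `en`, `fr` and `zh-CN`. The sketch below shows one way such a parity check could be written; `Messages`, `keyPaths` and `missingKeys` are hypothetical helpers and are not part of this change set.

```typescript
// Nested message catalogue: leaves are strings, branches are sub-catalogues.
type Messages = { [key: string]: string | Messages };

// Collect dotted key paths (for example "sort.nameAsc") from a catalogue.
function keyPaths(messages: Messages, prefix = ""): string[] {
    const paths: string[] = [];
    for (const key of Object.keys(messages)) {
        const value = messages[key];
        const path = prefix ? `${prefix}.${key}` : key;
        if (typeof value === "string") {
            paths.push(path);
        } else {
            paths.push(...keyPaths(value, path));
        }
    }
    return paths;
}

// Keys present in the reference locale but absent from the other locale.
function missingKeys(reference: Messages, other: Messages): string[] {
    const present = new Set(keyPaths(other));
    return keyPaths(reference).filter(path => !present.has(path));
}
```

Running `missingKeys` with the English catalogue as the reference against each other locale would list the strings still waiting for a translation.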


@@ -3,6 +3,7 @@ import VueRouter, {RouteConfig} from "vue-router"
import StatsPage from "../views/StatsPage.vue"
import Configuration from "../views/Configuration.vue"
import SearchPage from "@/views/SearchPage.vue";
import FilePage from "@/views/FilePage.vue";
Vue.use(VueRouter)
@@ -21,6 +22,11 @@ const routes: Array<RouteConfig> = [
path: "/config",
name: "Configuration",
component: Configuration
},
{
path: "/file",
name: "File",
component: FilePage
}
]


@@ -4,6 +4,8 @@ import VueRouter, {Route} from "vue-router";
import {EsHit, EsResult, EsTag, Index, Tag} from "@/Sist2Api";
import {deserializeMimes, serializeMimes} from "@/util";
const CONF_VERSION = 2;
Vue.use(Vuex)
export default new Vuex.Store({
@@ -24,12 +26,14 @@ export default new Vuex.Store({
sortMode: "score",
fuzzy: false,
size: 60,
optLang: "en",
optLangIsDefault: true,
optHideDuplicates: true,
optTheme: "light",
optDisplay: "grid",
optSize: 60,
optHighlight: true,
optTagOrOperator: false,
optFuzzy: true,
@@ -45,6 +49,12 @@ export default new Vuex.Store({
optTreemapColor: "PuBuGn",
optLightboxLoadOnlyCurrent: false,
optLightboxSlideDuration: 15,
optHideLegacy: false,
optUpdateMimeMap: false,
optUseDatePicker: false,
optVidPreviewInterval: 700,
optSimpleLightbox: true,
optShowTagPickerFilter: true,
_onLoadSelectedIndices: [] as string[],
_onLoadSelectedMimeTypes: [] as string[],
@@ -69,9 +79,14 @@ export default new Vuex.Store({
uiLightboxSlide: 0,
uiReachedScrollEnd: false,
uiDetailsMimeAgg: null,
uiShowDetails: false,
uiMimeMap: [] as any[]
},
mutations: {
setUiShowDetails: (state, val) => state.uiShowDetails = val,
setUiDetailsMimeAgg: (state, val) => state.uiDetailsMimeAgg = val,
setUiReachedScrollEnd: (state, val) => state.uiReachedScrollEnd = val,
setTags: (state, val) => state.tags = val,
setPathText: (state, val) => state.pathText = val,
@@ -79,7 +94,11 @@ export default new Vuex.Store({
setSizeMax: (state, val) => state.sizeMax = val,
setSist2Info: (state, val) => state.sist2Info = val,
setSeed: (state, val) => state.seed = val,
setOptLang: (state, val) => state.optLang = val,
setOptHideDuplicates: (state, val) => state.optHideDuplicates = val,
setOptLang: (state, val) => {
state.optLang = val;
state.optLangIsDefault = false;
},
setSortMode: (state, val) => state.sortMode = val,
setIndices: (state, val) => {
state.indices = val;
@@ -134,7 +153,7 @@ export default new Vuex.Store({
setOptSuggestPath: (state, val) => state.optSuggestPath = val,
setOptFragmentSize: (state, val) => state.optFragmentSize = val,
setOptQueryMode: (state, val) => state.optQueryMode = val,
setOptResultSize: (state, val) => state.size = val,
setOptResultSize: (state, val) => state.optSize = val,
setOptTagOrOperator: (state, val) => state.optTagOrOperator = val,
setOptTreemapType: (state, val) => state.optTreemapType = val,
@@ -142,8 +161,15 @@ export default new Vuex.Store({
setOptTreemapColorGroupingDepth: (state, val) => state.optTreemapColorGroupingDepth = val,
setOptTreemapSize: (state, val) => state.optTreemapSize = val,
setOptTreemapColor: (state, val) => state.optTreemapColor = val,
setOptHideLegacy: (state, val) => state.optHideLegacy = val,
setOptUpdateMimeMap: (state, val) => state.optUpdateMimeMap = val,
setOptUseDatePicker: (state, val) => state.optUseDatePicker = val,
setOptVidPreviewInterval: (state, val) => state.optVidPreviewInterval = val,
setOptSimpleLightbox: (state, val) => state.optSimpleLightbox = val,
setOptShowTagPickerFilter: (state, val) => state.optShowTagPickerFilter = val,
setOptLightboxLoadOnlyCurrent: (state, val) => state.optLightboxLoadOnlyCurrent = val,
setOptLightboxSlideDuration: (state, val) => state.optLightboxSlideDuration = val,
setUiMimeMap: (state, val) => state.uiMimeMap = val,
@@ -153,8 +179,24 @@ export default new Vuex.Store({
busUpdateTags: () => {
// noop
},
busSearch: () => {
// noop
},
busTouchEnd: () => {
// noop
},
busTnTouchStart: (doc_id) => {
// noop
},
},
actions: {
setSist2Info: (store, val) => {
store.commit("setSist2Info", val);
if (store.state.optLangIsDefault) {
store.commit("setOptLang", val.lang);
}
},
loadFromArgs({commit}, route: Route) {
if (route.query.q) {
@@ -203,6 +245,11 @@ export default new Vuex.Store({
}
},
async updateArgs({state}, router: VueRouter) {
if (router.currentRoute.path !== "/") {
return;
}
await router.push({
query: {
q: state.searchText.trim() ? state.searchText.trim().replace(/\s+/g, " ") : undefined,
@@ -231,6 +278,8 @@ export default new Vuex.Store({
}
});
conf["version"] = CONF_VERSION;
localStorage.setItem("sist2_configuration", JSON.stringify(conf));
},
loadConfiguration({state}) {
@@ -238,6 +287,11 @@ export default new Vuex.Store({
if (confString) {
const conf = JSON.parse(confString);
if (!("version" in conf) || conf["version"] != CONF_VERSION) {
localStorage.removeItem("sist2_configuration");
window.location.reload();
}
Object.keys(state).forEach((key) => {
if (key.startsWith("opt")) {
(state as any)[key] = conf[key];
@@ -274,6 +328,7 @@ export default new Vuex.Store({
commit("setUiLightboxTypes", []);
commit("setUiLightboxCaptions", []);
commit("setUiLightboxKey", 0);
commit("setUiDetailsMimeAgg", null);
}
},
modules: {},
@@ -298,7 +353,7 @@ export default new Vuex.Store({
searchText: state => state.searchText,
pathText: state => state.pathText,
fuzzy: state => state.fuzzy,
size: state => state.size,
size: state => state.optSize,
sortMode: state => state.sortMode,
lastQueryResult: state => state.lastQueryResults,
lastDoc: function (state): EsHit | null {
@@ -317,6 +372,7 @@ export default new Vuex.Store({
uiLightboxKey: state => state.uiLightboxKey,
uiLightboxSlide: state => state.uiLightboxSlide,
optHideDuplicates: state => state.optHideDuplicates,
optLang: state => state.optLang,
optTheme: state => state.optTheme,
optDisplay: state => state.optDisplay,
@@ -335,6 +391,12 @@ export default new Vuex.Store({
optTreemapColor: state => state.optTreemapColor,
optLightboxLoadOnlyCurrent: state => state.optLightboxLoadOnlyCurrent,
optLightboxSlideDuration: state => state.optLightboxSlideDuration,
optResultSize: state => state.size,
optResultSize: state => state.optSize,
optHideLegacy: state => state.optHideLegacy,
optUpdateMimeMap: state => state.optUpdateMimeMap,
optUseDatePicker: state => state.optUseDatePicker,
optVidPreviewInterval: state => state.optVidPreviewInterval,
optSimpleLightbox: state => state.optSimpleLightbox,
optShowTagPickerFilter: state => state.optShowTagPickerFilter,
}
})
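
The store now stamps the saved configuration with `CONF_VERSION` and discards anything saved under a different version. A minimal sketch of that check as a pure function is shown below. `parseStoredConfig` is a hypothetical helper, and the handling of malformed JSON is an addition that the store code above does not perform.

```typescript
const CONF_VERSION = 2;

// Parse a stored configuration string. Returns null when the value is missing,
// is not a JSON object, or was written under a different configuration version.
function parseStoredConfig(confString: string | null): Record<string, unknown> | null {
    if (!confString) {
        return null;
    }
    let conf: unknown;
    try {
        conf = JSON.parse(confString);
    } catch {
        // Assumption: treat unparsable data like an outdated configuration.
        return null;
    }
    if (typeof conf !== "object" || conf === null) {
        return null;
    }
    const record = conf as Record<string, unknown>;
    return record["version"] === CONF_VERSION ? record : null;
}
```

A caller could remove the `sist2_configuration` entry from `localStorage` whenever this returns `null` for a non-empty value, which mirrors the reset performed in `loadConfiguration`.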


@@ -1,8 +1,12 @@
import {EsHit} from "@/Sist2Api";
export function ext(hit: EsHit) {
return Object.prototype.hasOwnProperty.call(hit._source, "extension")
&& hit["_source"]["extension"] !== "" ? "." + hit["_source"]["extension"] : "";
return srcExt(hit._source)
}
export function srcExt(src) {
return Object.prototype.hasOwnProperty.call(src, "extension")
&& src["extension"] !== "" ? "." + src["extension"] : "";
}
export function strUnescape(str: string): string {
@@ -97,6 +101,30 @@ export function getSelectedTreeNodes(tree: any) {
return Array.from(selectedNodes);
}
export function getTreeNodeAttributes(tree: any) {
const nodes = tree.selectable();
const attributes = {};
for (let i = 0; i < nodes.length; i++) {
let id = null;
if (nodes[i].text.indexOf("(") !== -1 && nodes[i].values) {
id = nodes[i].values.slice(-1)[0];
} else {
id = nodes[i].id
}
attributes[id] = {
checked: nodes[i].itree.state.checked,
collapsed: nodes[i].itree.state.collapsed,
}
}
return attributes;
}
export function serializeMimes(mimes: string[]): string | undefined {
if (mimes.length == 0) {
return undefined;


@@ -15,11 +15,8 @@
<h4>{{ $t("displayOptions") }}</h4>
<b-card>
<b-form-checkbox :checked="optLightboxLoadOnlyCurrent" @input="setOptLightboxLoadOnlyCurrent">
{{ $t("opt.lightboxLoadOnlyCurrent") }}
</b-form-checkbox>
<label>{{ $t("opt.lang") }}</label>
<label><LanguageIcon/><span style="vertical-align: middle">&nbsp;{{ $t("opt.lang") }}</span></label>
<b-form-select :options="langOptions" :value="optLang" @input="setOptLang"></b-form-select>
<label>{{ $t("opt.theme") }}</label>
@@ -30,11 +27,44 @@
<label>{{ $t("opt.columns") }}</label>
<b-form-select :options="columnsOptions" :value="optColumns" @input="setOptColumns"></b-form-select>
<div style="height: 10px"></div>
<b-form-checkbox :checked="optLightboxLoadOnlyCurrent" @input="setOptLightboxLoadOnlyCurrent">
{{ $t("opt.lightboxLoadOnlyCurrent") }}
</b-form-checkbox>
<b-form-checkbox :checked="optHideLegacy" @input="setOptHideLegacy">
{{ $t("opt.hideLegacy") }}
</b-form-checkbox>
<b-form-checkbox :checked="optUpdateMimeMap" @input="setOptUpdateMimeMap">
{{ $t("opt.updateMimeMap") }}
</b-form-checkbox>
<b-form-checkbox :checked="optUseDatePicker" @input="setOptUseDatePicker">
{{ $t("opt.useDatePicker") }}
</b-form-checkbox>
<b-form-checkbox :checked="optSimpleLightbox" @input="setOptSimpleLightbox">{{
$t("opt.simpleLightbox")
}}
</b-form-checkbox>
<b-form-checkbox :checked="optShowTagPickerFilter" @input="setOptShowTagPickerFilter">{{
$t("opt.showTagPickerFilter")
}}
</b-form-checkbox>
</b-card>
<br/>
<h4>{{ $t("searchOptions") }}</h4>
<b-card>
<b-form-checkbox :checked="optHideDuplicates" @input="setOptHideDuplicates">{{
$t("opt.hideDuplicates")
}}
</b-form-checkbox>
<b-form-checkbox :checked="optHighlight" @input="setOptHighlight">{{ $t("opt.highlight") }}</b-form-checkbox>
<b-form-checkbox :checked="optTagOrOperator" @input="setOptTagOrOperator">{{
$t("opt.tagOrOperator")
@@ -65,6 +95,10 @@
<label>{{ $t("opt.slideDuration") }}</label>
<b-form-input :value="optLightboxSlideDuration" type="number" min="1"
@input="setOptLightboxSlideDuration"></b-form-input>
<label>{{ $t("opt.vidPreviewInterval") }}</label>
<b-form-input :value="optVidPreviewInterval" type="number" min="50"
@input="setOptVidPreviewInterval"></b-form-input>
</b-card>
<h4 class="mt-3">{{ $t("treemapOptions") }}</h4>
@@ -108,15 +142,15 @@
</template>
<script>
import Vue from "vue";
import {mapGetters, mapMutations} from "vuex";
import {mapActions, mapGetters, mapMutations} from "vuex";
import DebugInfo from "@/components/DebugInfo.vue";
import Preloader from "@/components/Preloader.vue";
import sist2 from "@/Sist2Api";
import GearIcon from "@/components/GearIcon.vue";
import GearIcon from "@/components/icons/GearIcon.vue";
import LanguageIcon from "@/components/icons/LanguageIcon";
export default {
components: {GearIcon, DebugInfo, Preloader},
components: {LanguageIcon, GearIcon, DebugInfo, Preloader},
data() {
return {
loading: true,
@@ -124,6 +158,7 @@ export default {
langOptions: [
{value: "en", text: this.$t("lang.en")},
{value: "fr", text: this.$t("lang.fr")},
{value: "zh-CN", text: this.$t("lang.zh-CN")},
],
queryModeOptions: [
{value: "simple", text: this.$t("queryMode.simple")},
@@ -206,10 +241,16 @@ export default {
"optTreemapSize",
"optLightboxLoadOnlyCurrent",
"optLightboxSlideDuration",
"optContainerWidth",
"optResultSize",
"optTagOrOperator",
"optLang"
"optLang",
"optHideDuplicates",
"optHideLegacy",
"optUpdateMimeMap",
"optUseDatePicker",
"optVidPreviewInterval",
"optSimpleLightbox",
"optShowTagPickerFilter",
]),
clientWidth() {
return window.innerWidth;
@@ -217,7 +258,7 @@ export default {
},
mounted() {
sist2.getSist2Info().then(data => {
this.$store.commit("setSist2Info", data)
this.setSist2Info(data);
this.loading = false;
});
@@ -228,6 +269,9 @@ export default {
});
},
methods: {
...mapActions({
setSist2Info: "setSist2Info",
}),
...mapMutations([
"setOptTheme",
"setOptDisplay",
@@ -245,10 +289,16 @@ export default {
"setOptTreemapSize",
"setOptLightboxLoadOnlyCurrent",
"setOptLightboxSlideDuration",
"setOptContainerWidth",
"setOptResultSize",
"setOptTagOrOperator",
"setOptLang"
"setOptLang",
"setOptHideDuplicates",
"setOptHideLegacy",
"setOptUpdateMimeMap",
"setOptUseDatePicker",
"setOptVidPreviewInterval",
"setOptSimpleLightbox",
"setOptShowTagPickerFilter",
]),
onResetClick() {
localStorage.removeItem("sist2_configuration");


@@ -0,0 +1,149 @@
<template>
<div style="margin-left: auto; margin-right: auto;" class="container">
<Preloader v-if="loading"></Preloader>
<b-card v-else-if="!loading && found">
<b-card-title :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
</b-card-title>
<!-- Thumbnail-->
<div style="position: relative; margin-left: auto; margin-right: auto; text-align: center">
<FullThumbnail :doc="doc" :small-badge="false" @onThumbnailClick="onThumbnailClick()"></FullThumbnail>
</div>
<!-- Audio player-->
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls
:type="doc._source.mime"
:src="`f/${doc._id}`"></audio>
<InfoTable :doc="doc" v-if="doc"></InfoTable>
<div v-if="doc._source.content" class="content-div">{{ doc._source.content }}</div>
</b-card>
<div v-else>
<b-card>
<b-card-title>{{ $t("filePage.notFound") }}</b-card-title>
</b-card>
</div>
</div>
</template>
<script>
import Preloader from "@/components/Preloader.vue";
import InfoTable from "@/components/InfoTable.vue";
import Sist2Api from "@/Sist2Api";
import {ext} from "@/util";
import Vue from "vue";
import sist2 from "@/Sist2Api";
import FullThumbnail from "@/components/FullThumbnail";
export default Vue.extend({
name: "FilePage",
components: {
FullThumbnail,
Preloader,
InfoTable
},
data() {
return {
loading: true,
found: false,
doc: null
}
},
methods: {
ext: ext,
onThumbnailClick() {
window.open(`/f/${this.doc._id}`, "_blank");
},
findByCustomField(field, id) {
return {
query: {
bool: {
must: [
{
match: {
[field]: id
}
}
]
}
},
size: 1
}
},
findById(id) {
return {
query: {
bool: {
must: [
{
match: {
"_id": id
}
}
]
}
},
size: 1
}
},
findByName(name) {
return {
query: {
bool: {
must: [
{
match: {
"name": name
}
}
]
}
},
size: 1
}
}
},
mounted() {
if (this.$store.state.sist2Info === null) {
sist2.getSist2Info().then(data => {
this.$store.dispatch("setSist2Info", data);
this.$store.commit("setIndices", data.indices);
});
}
let query = null;
if (this.$route.query.byId) {
query = this.findById(this.$route.query.byId);
} else if (this.$route.query.byName) {
query = this.findByName(this.$route.query.byName);
} else if (this.$route.query.by && this.$route.query.q) {
query = this.findByCustomField(this.$route.query.by, this.$route.query.q)
}
if (query) {
Sist2Api.esQuery(query).then(result => {
if (result.hits.hits.length === 0) {
this.found = false;
} else {
this.doc = result.hits.hits[0];
this.found = true;
}
this.loading = false;
});
} else {
this.loading = false;
this.found = false;
}
}
});
</script>
<style scoped>
.img-wrapper {
display: inline-block;
}
</style>
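
`findById`, `findByName` and `findByCustomField` above build the same single-result `bool`/`must`/`match` query and differ only in the field name. A sketch of how they could share one builder follows; `MatchQuery` and `findByField` are hypothetical names, and the query shape is copied from the component.

```typescript
// Shape of the query objects built by the FilePage lookup methods.
interface MatchQuery {
    query: { bool: { must: Array<{ match: Record<string, string> }> } };
    size: number;
}

// Build a single-result match query on an arbitrary field.
function findByField(field: string, value: string): MatchQuery {
    return {
        query: {bool: {must: [{match: {[field]: value}}]}},
        size: 1,
    };
}

// The two fixed-field lookups become thin wrappers.
const findById = (id: string): MatchQuery => findByField("_id", id);
const findByName = (name: string): MatchQuery => findByField("name", name);
```

Under this sketch, `/file?byId=...` and `/file?byName=...` map to the wrappers, while `/file?by=<field>&q=<value>` calls `findByField` directly.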


@@ -19,11 +19,7 @@
</b-row>
<b-row>
<b-col sm="6">
<b-row>
<b-col style="height: 70px;">
<DateSlider></DateSlider>
</b-col>
</b-row>
<DateSlider></DateSlider>
<b-row>
<b-col>
<IndexPicker></IndexPicker>
@@ -31,21 +27,25 @@
</b-row>
</b-col>
<b-col>
<b-tabs>
<b-tabs justified>
<b-tab :title="$t('mimeTypes')">
<MimePicker></MimePicker>
</b-tab>
<b-tab :title="$t('tags')">
<TagPicker></TagPicker>
<TagPicker :show-search-bar="$store.state.optShowTagPickerFilter"></TagPicker>
</b-tab>
</b-tabs>
</b-col>
</b-row>
</b-card>
<Preloader v-if="searchBusy && docs.length === 0" class="mt-3"></Preloader>
<div v-show="docs.length === 0 && !uiLoading">
<Preloader v-if="searchBusy" class="mt-3"></Preloader>
<div v-else-if="docs.length > 0">
<ResultsCard></ResultsCard>
</div>
<div v-if="docs.length > 0">
<ResultsCard></ResultsCard>
<DocCardWall v-if="optDisplay==='grid'" :docs="docs" :append="appendFunc"></DocCardWall>
@@ -56,7 +56,7 @@
<script lang="ts">
import Preloader from "@/components/Preloader.vue";
import {mapGetters, mapMutations} from "vuex";
import {mapActions, mapGetters, mapMutations} from "vuex";
import sist2 from "../Sist2Api";
import Sist2Api, {EsHit, EsResult} from "../Sist2Api";
import SearchBar from "@/components/SearchBar.vue";
@@ -91,6 +91,7 @@ export default Vue.extend({
search: undefined as any,
docs: [] as EsHit[],
docIds: new Set(),
docChecksums: new Set(),
searchBusy: false,
Sist2Query: Sist2Query,
showHelp: false
@@ -99,6 +100,10 @@ export default Vue.extend({
...mapGetters(["indices", "optDisplay"]),
},
mounted() {
// Handle touch events
window.ontouchend = () => this.$store.commit("busTouchEnd");
window.ontouchcancel = () => this.$store.commit("busTouchEnd");
this.search = _debounce(async (clear: boolean) => {
if (clear) {
await this.clearResults();
@@ -108,10 +113,6 @@ export default Vue.extend({
}, 350, {leading: false});
Sist2Api.getMimeTypes().then(mimeMap => {
this.$store.commit("setUiMimeMap", mimeMap);
});
this.$store.dispatch("loadFromArgs", this.$route).then(() => {
this.$store.subscribe(() => this.$store.dispatch("updateArgs", this.$router));
this.$store.subscribe((mutation) => {
@@ -137,17 +138,25 @@ export default Vue.extend({
sist2.getSist2Info().then(data => {
this.setSist2Info(data);
this.setIndices(data.indices);
this.uiLoading = false;
this.search(true);
const doBlankSearch = !this.$store.state.optUpdateMimeMap;
Sist2Api.getMimeTypes(Sist2Query.searchQuery(doBlankSearch)).then(({mimeMap}) => {
this.$store.commit("setUiMimeMap", mimeMap);
this.uiLoading = false;
this.search(true);
});
}).catch(() => {
this.showErrorToast();
});
});
},
methods: {
...mapMutations({
...mapActions({
setSist2Info: "setSist2Info",
}),
...mapMutations({
setIndices: "setIndices",
setDateBoundsMin: "setDateBoundsMin",
setDateBoundsMax: "setDateBoundsMax",
@@ -178,6 +187,7 @@ export default Vue.extend({
async searchNow(q: any) {
this.searchBusy = true;
await this.$store.dispatch("incrementQuerySequence");
this.$store.commit("busSearch");
Sist2Api.esQuery(q).then(async (resp: EsResult) => {
await this.handleSearch(resp);
@@ -193,16 +203,29 @@ export default Vue.extend({
async clearResults() {
this.docs = [];
this.docIds.clear();
this.docChecksums.clear();
await this.$store.dispatch("clearResults");
this.$store.commit("setUiReachedScrollEnd", false);
},
async handleSearch(resp: EsResult) {
if (resp.hits.hits.length == 0) {
if (resp.hits.hits.length == 0 || resp.hits.hits.length < this.$store.state.optSize) {
this.$store.commit("setUiReachedScrollEnd", true);
}
resp.hits.hits = resp.hits.hits.filter(hit => !this.docIds.has(hit._id));
resp.hits.hits.forEach(hit => this.docIds.add(hit._id));
if (this.$store.state.optHideDuplicates) {
resp.hits.hits = resp.hits.hits.filter(hit => {
if (!("checksum" in hit._source)) {
return true;
}
const isFirstOccurrence = !this.docChecksums.has(hit._source.checksum);
this.docChecksums.add(hit._source.checksum);
return isFirstOccurrence;
});
}
for (const hit of resp.hits.hits) {
if (hit._props.isPlayableImage || hit._props.isPlayableVideo) {
@@ -225,6 +248,8 @@ export default Vue.extend({
this.$store.commit("setLastQueryResult", resp);
this.docs.push(...resp.hits.hits);
resp.hits.hits.forEach(hit => this.docIds.add(hit._id));
},
getDateRange(): Promise<{ min: number, max: number }> {
return sist2.esQuery({
@@ -266,6 +291,11 @@ export default Vue.extend({
border: none;
}
.toast-header-info, .toast-body-info {
background: #2196f3;
color: #fff !important;
}
.toast-header-error, .toast-body-error {
background: #a94442;
color: #f2dede !important;
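
The `optHideDuplicates` branch of `handleSearch` keeps only the first hit seen for each checksum and carries the set of seen checksums across result pages. A standalone sketch of that filter is given below. `Hit` is a reduced stand-in for `EsHit`, and `filterDuplicates` is a hypothetical name.

```typescript
// Reduced stand-in for EsHit with only the fields the filter needs.
interface Hit {
    _id: string;
    _source: { checksum?: string };
}

// Keep the first hit for each checksum. Hits without a checksum are always kept.
// `seen` is mutated so that later pages are filtered against earlier ones.
function filterDuplicates<T extends Hit>(hits: T[], seen: Set<string>): T[] {
    return hits.filter(hit => {
        const checksum = hit._source.checksum;
        if (checksum === undefined) {
            return true;
        }
        const isFirstOccurrence = !seen.has(checksum);
        seen.add(checksum);
        return isFirstOccurrence;
    });
}
```

Because filtering can shrink a page below the requested size, the end-of-results check in `handleSearch` has to run on the unfiltered hit count, as it does above.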

src/cli.c

@@ -5,7 +5,8 @@
#define DEFAULT_OUTPUT "index.sist2/"
#define DEFAULT_CONTENT_SIZE 32768
#define DEFAULT_QUALITY 1
#define DEFAULT_SIZE 300
#define DEFAULT_THUMBNAIL_SIZE 500
#define DEFAULT_THUMBNAIL_COUNT 1
#define DEFAULT_REWRITE_URL ""
#define DEFAULT_ES_URL "http://localhost:9200"
@@ -19,9 +20,12 @@
#define DEFAULT_MAX_MEM_BUFFER 2000
#define DEFAULT_THROTTLE_MEMORY_THRESHOLD 0
const char *TESS_DATAPATHS[] = {
"/usr/share/tessdata/",
"/usr/share/tesseract-ocr/tessdata/",
"/usr/share/tesseract-ocr/4.00/tessdata/",
"./",
NULL
};
@@ -64,6 +68,10 @@ void index_args_destroy(index_args_t *args) {
if (args->es_settings_path) {
free(args->es_settings);
}
if (args->index_path != NULL) {
free(args->index_path);
}
free(args);
}
@@ -73,6 +81,11 @@ void web_args_destroy(web_args_t *args) {
}
void exec_args_destroy(exec_args_t *args) {
if (args->index_path != NULL) {
free(args->index_path);
}
free(args);
}
@@ -84,13 +97,12 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
char *abs_path = abspath(argv[1]);
if (abs_path == NULL) {
fprintf(stderr, "File not found: %s\n", argv[1]);
return 1;
LOG_FATALF("cli.c", "Invalid PATH argument. File not found: %s", argv[1])
} else {
args->path = abs_path;
}
if (args->incremental != NULL) {
if (args->incremental != OPTION_VALUE_UNSPECIFIED) {
args->incremental = abspath(args->incremental);
if (args->incremental == NULL) {
sist_log("main.c", LOG_SIST_WARNING, "Could not open original index! Disabled incremental scan feature.");
@@ -98,32 +110,42 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
}
}
if (args->quality == 0) {
args->quality = DEFAULT_QUALITY;
} else if (args->quality < 1 || args->quality > 31) {
fprintf(stderr, "Invalid quality: %f\n", args->quality);
if (args->tn_quality == OPTION_VALUE_UNSPECIFIED) {
args->tn_quality = DEFAULT_QUALITY;
} else if (args->tn_quality < 1.0f || args->tn_quality > 31.0f) {
fprintf(stderr, "Invalid value for --thumbnail-quality argument: %f. Must be within [1.0, 31.0].\n",
args->tn_quality);
return 1;
}
if (args->size == 0) {
args->size = DEFAULT_SIZE;
} else if (args->size > 0 && args->size < 32) {
printf("Invalid size: %d\n", args->content_size);
if (args->tn_size == OPTION_VALUE_UNSPECIFIED) {
args->tn_size = DEFAULT_THUMBNAIL_SIZE;
} else if (args->tn_size < 32) {
printf("Invalid value for --thumbnail-size argument: %d. Must be at least 32 pixels.\n", args->tn_size);
return 1;
}
if (args->content_size == 0) {
if (args->tn_count == OPTION_VALUE_UNSPECIFIED) {
args->tn_count = DEFAULT_THUMBNAIL_COUNT;
} else if (args->tn_count == OPTION_VALUE_DISABLE) {
args->tn_count = 0;
} else if (args->tn_count > 1000) {
printf("Invalid value for --thumbnail-count argument: %d. Must be <= 1000.\n", args->tn_count);
return 1;
}
if (args->content_size == OPTION_VALUE_UNSPECIFIED) {
args->content_size = DEFAULT_CONTENT_SIZE;
}
if (args->threads == 0) {
args->threads = 1;
} else if (args->threads < 0) {
fprintf(stderr, "Invalid threads: %d\n", args->threads);
fprintf(stderr, "Invalid value for --threads: %d. Must be a positive number\n", args->threads);
return 1;
}
if (args->output == NULL) {
if (args->output == OPTION_VALUE_UNSPECIFIED) {
args->output = malloc(strlen(DEFAULT_OUTPUT) + 1);
strcpy(args->output, DEFAULT_OUTPUT);
} else {
@@ -142,19 +164,19 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
args->depth += 1;
}
if (args->name == NULL) {
if (args->name == OPTION_VALUE_UNSPECIFIED) {
args->name = g_path_get_basename(args->output);
} else {
char* tmp = malloc(strlen(args->name) + 1);
char *tmp = malloc(strlen(args->name) + 1);
strcpy(tmp, args->name);
args->name = tmp;
}
if (args->rewrite_url == NULL) {
if (args->rewrite_url == OPTION_VALUE_UNSPECIFIED) {
args->rewrite_url = DEFAULT_REWRITE_URL;
}
if (args->archive == NULL || strcmp(args->archive, "recurse") == 0) {
if (args->archive == OPTION_VALUE_UNSPECIFIED || strcmp(args->archive, "recurse") == 0) {
args->archive_mode = ARC_MODE_RECURSE;
} else if (strcmp(args->archive, "list") == 0) {
args->archive_mode = ARC_MODE_LIST;
@@ -167,17 +189,50 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
return 1;
}
if (args->tesseract_lang != NULL) {
TessBaseAPI *api = TessBaseAPICreate();
if (args->ocr_images && args->tesseract_lang == OPTION_VALUE_UNSPECIFIED) {
fprintf(stderr, "You must specify --ocr-lang <LANG> to use --ocr-images\n");
return 1;
}
char filename[128];
sprintf(filename, "%s.traineddata", args->tesseract_lang);
const char *path = find_file_in_paths(TESS_DATAPATHS, filename);
if (path == NULL) {
LOG_FATAL("cli.c", "Could not find tesseract language file!");
if (args->ocr_ebooks && args->tesseract_lang == OPTION_VALUE_UNSPECIFIED) {
fprintf(stderr, "You must specify --ocr-lang <LANG> to use --ocr-ebooks\n");
return 1;
}
if (args->tesseract_lang != OPTION_VALUE_UNSPECIFIED) {
if (!args->ocr_ebooks && !args->ocr_images) {
fprintf(stderr, "You must specify at least one of --ocr-ebooks, --ocr-images\n");
return 1;
}
ret = TessBaseAPIInit3(api, path, args->tesseract_lang);
TessBaseAPI *api = TessBaseAPICreate();
const char *trained_data_path = NULL;
char *lang_buf = malloc(strlen(args->tesseract_lang) + 1);
strcpy(lang_buf, args->tesseract_lang);
// Keep the malloc'd pointer: strtok() returns NULL once the string is exhausted,
// so freeing the iteration variable would free(NULL) and leak the buffer
char *lang = strtok(lang_buf, "+");
while (lang != NULL) {
char filename[128];
sprintf(filename, "%s.traineddata", lang);
const char *path = find_file_in_paths(TESS_DATAPATHS, filename);
if (path == NULL) {
LOG_FATALF("cli.c", "Could not find tesseract language file: %s!", filename);
}
if (trained_data_path != NULL && path != trained_data_path) {
LOG_FATAL("cli.c", "When specifying more than one tesseract language, all the traineddata "
"files must be in the same folder")
}
trained_data_path = path;
lang = strtok(NULL, "+");
}
free(lang_buf);
ret = TessBaseAPIInit3(api, trained_data_path, args->tesseract_lang);
if (ret != 0) {
fprintf(stderr, "Could not initialize tesseract with lang '%s'\n", args->tesseract_lang);
return 1;
@@ -185,10 +240,10 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
TessBaseAPIEnd(api);
TessBaseAPIDelete(api);
args->tesseract_path = path;
args->tesseract_path = trained_data_path;
}
if (args->exclude_regex != NULL) {
if (args->exclude_regex != OPTION_VALUE_UNSPECIFIED) {
const char *error;
int error_offset;
@@ -208,18 +263,36 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
ScanCtx.exclude = NULL;
}
if (args->treemap_threshold_str == 0) {
if (args->treemap_threshold_str == OPTION_VALUE_UNSPECIFIED) {
args->treemap_threshold = DEFAULT_TREEMAP_THRESHOLD;
} else {
args->treemap_threshold = atof(args->treemap_threshold_str);
}
if (args->max_memory_buffer == 0) {
args->max_memory_buffer = DEFAULT_MAX_MEM_BUFFER;
if (args->max_memory_buffer_mib == OPTION_VALUE_UNSPECIFIED) {
args->max_memory_buffer_mib = DEFAULT_MAX_MEM_BUFFER;
}
LOG_DEBUGF("cli.c", "arg quality=%f", args->quality)
LOG_DEBUGF("cli.c", "arg size=%d", args->size)
if (args->scan_mem_limit_mib == OPTION_VALUE_UNSPECIFIED || args->scan_mem_limit_mib == OPTION_VALUE_DISABLE) {
args->scan_mem_limit_mib = DEFAULT_THROTTLE_MEMORY_THRESHOLD;
}
if (args->list_path != OPTION_VALUE_UNSPECIFIED) {
if (strcmp(args->list_path, "-") == 0) {
args->list_file = stdin;
LOG_DEBUG("cli.c", "Using stdin as list file")
} else {
args->list_file = fopen(args->list_path, "r");
if (args->list_file == NULL) {
LOG_FATALF("main.c", "List file could not be opened: %s (%s)", args->list_path, strerror(errno));
}
}
}
LOG_DEBUGF("cli.c", "arg tn_quality=%f", args->tn_quality)
LOG_DEBUGF("cli.c", "arg tn_size=%d", args->tn_size)
LOG_DEBUGF("cli.c", "arg tn_count=%d", args->tn_count)
LOG_DEBUGF("cli.c", "arg content_size=%d", args->content_size)
LOG_DEBUGF("cli.c", "arg threads=%d", args->threads)
LOG_DEBUGF("cli.c", "arg incremental=%s", args->incremental)
@@ -236,7 +309,8 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
LOG_DEBUGF("cli.c", "arg fast=%d", args->fast)
LOG_DEBUGF("cli.c", "arg fast_epub=%d", args->fast_epub)
LOG_DEBUGF("cli.c", "arg treemap_threshold=%f", args->treemap_threshold)
LOG_DEBUGF("cli.c", "arg max_memory_buffer=%d", args->max_memory_buffer)
LOG_DEBUGF("cli.c", "arg max_memory_buffer_mib=%d", args->max_memory_buffer_mib)
LOG_DEBUGF("cli.c", "arg list_path=%s", args->list_path)
return 0;
}
@@ -287,11 +361,9 @@ int index_args_validate(index_args_t *args, int argc, const char **argv) {
char *index_path = abspath(argv[1]);
if (index_path == NULL) {
fprintf(stderr, "File not found: %s\n", argv[1]);
return 1;
LOG_FATALF("cli.c", "Invalid PATH argument. File not found: %s", argv[1])
} else {
args->index_path = argv[1];
free(index_path);
args->index_path = index_path;
}
if (args->es_url == NULL) {
@@ -326,10 +398,19 @@ int index_args_validate(index_args_t *args, int argc, const char **argv) {
LOG_DEBUGF("cli.c", "arg es_url=%s", args->es_url)
LOG_DEBUGF("cli.c", "arg es_index=%s", args->es_index)
LOG_DEBUGF("cli.c", "arg es_insecure_ssl=%d", args->es_insecure_ssl)
LOG_DEBUGF("cli.c", "arg index_path=%s", args->index_path)
LOG_DEBUGF("cli.c", "arg script_path=%s", args->script_path)
LOG_DEBUGF("cli.c", "arg async_script=%s", args->async_script)
LOG_DEBUGF("cli.c", "arg script=%s", args->script)
LOG_DEBUGF("cli.c", "arg async_script=%d", args->async_script)
if (args->script) {
char log_buf[5000];
strncpy(log_buf, args->script, sizeof(log_buf));
*(log_buf + sizeof(log_buf) - 1) = '\0';
LOG_DEBUGF("cli.c", "arg script=%s", log_buf)
}
LOG_DEBUGF("cli.c", "arg print=%d", args->print)
LOG_DEBUGF("cli.c", "arg es_mappings_path=%s", args->es_mappings_path)
LOG_DEBUGF("cli.c", "arg es_mappings=%s", args->es_mappings)
@@ -362,11 +443,15 @@ int web_args_validate(web_args_t *args, int argc, const char **argv) {
args->es_index = DEFAULT_ES_INDEX;
}
if (args->tagline == NULL) {
args->tagline = DEFAULT_TAGLINE;
}
if (args->lang == NULL) {
args->lang = DEFAULT_LANG;
}
if (strlen(args->lang) != 2) {
if (strlen(args->lang) != 2 && strlen(args->lang) != 5) {
fprintf(stderr, "Invalid --lang value, see usage\n");
return 1;
}
@@ -422,13 +507,13 @@ int web_args_validate(web_args_t *args, int argc, const char **argv) {
for (int i = 0; i < args->index_count; i++) {
char *abs_path = abspath(args->indices[i]);
if (abs_path == NULL) {
fprintf(stderr, "File not found: %s\n", args->indices[i]);
return 1;
LOG_FATALF("cli.c", "Index not found: %s", args->indices[i])
}
}
LOG_DEBUGF("cli.c", "arg es_url=%s", args->es_url)
LOG_DEBUGF("cli.c", "arg es_index=%s", args->es_index)
LOG_DEBUGF("cli.c", "arg es_insecure_ssl=%d", args->es_insecure_ssl)
LOG_DEBUGF("cli.c", "arg tagline=%s", args->tagline)
LOG_DEBUGF("cli.c", "arg dev=%d", args->dev)
LOG_DEBUGF("cli.c", "arg listen=%s", args->listen_address)
@@ -463,11 +548,9 @@ int exec_args_validate(exec_args_t *args, int argc, const char **argv) {
char *index_path = abspath(argv[1]);
if (index_path == NULL) {
fprintf(stderr, "File not found: %s\n", argv[1]);
return 1;
LOG_FATALF("cli.c", "Invalid index PATH argument. File not found: %s", argv[1])
} else {
args->index_path = argv[1];
free(index_path);
args->index_path = index_path;
}
if (args->es_url == NULL) {
@@ -487,6 +570,11 @@ int exec_args_validate(exec_args_t *args, int argc, const char **argv) {
}
LOG_DEBUGF("cli.c", "arg script_path=%s", args->script_path)
LOG_DEBUGF("cli.c", "arg script=%s", args->script)
char log_buf[5000];
strncpy(log_buf, args->script, sizeof(log_buf));
*(log_buf + sizeof(log_buf) - 1) = '\0';
LOG_DEBUGF("cli.c", "arg script=%s", log_buf)
return 0;
}
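
The `--ocr-lang` handling in `scan_args_validate` above splits a value such as `eng+fra` on `+` and looks up one `<lang>.traineddata` file per language. The following is a minimal, standalone sketch of that splitting step only. It assumes nothing about sist2 internals: `traineddata_names` and `TRAINEDDATA_NAME_LEN` are hypothetical names introduced for illustration, and the lookup in `TESS_DATAPATHS` is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TRAINEDDATA_NAME_LEN 128 /* hypothetical; matches the 128-byte filename buffer in the diff */

/*
 * Hypothetical helper: splits a Tesseract language spec ("eng+fra") on '+'
 * and writes "<lang>.traineddata" into out[0..max_out-1].
 * Returns the number of names written, or -1 if the working copy
 * could not be allocated.
 */
static int traineddata_names(const char *spec, char out[][TRAINEDDATA_NAME_LEN], int max_out) {
    assert(spec != NULL && out != NULL);

    /* strtok() modifies its input, so work on a private copy */
    char *copy = malloc(strlen(spec) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, spec);

    int count = 0;
    for (char *lang = strtok(copy, "+");
         lang != NULL && count < max_out;
         lang = strtok(NULL, "+")) {
        /* snprintf truncates instead of overflowing on very long language codes */
        snprintf(out[count], TRAINEDDATA_NAME_LEN, "%s.traineddata", lang);
        count++;
    }

    /* free the pointer returned by malloc, not the strtok cursor */
    free(copy);
    return count;
}
```

Calling `traineddata_names("eng+fra", names, 4)` with a `char names[4][TRAINEDDATA_NAME_LEN]` buffer fills the first two entries. Empty segments (as in `eng++fra`) are skipped, because that is how `strtok` treats consecutive delimiters.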

View File

@@ -5,11 +5,15 @@
#include "libscan/arc/arc.h"
#define OPTION_VALUE_DISABLE (-1)
#define OPTION_VALUE_UNSPECIFIED (0)
typedef struct scan_args {
float quality;
int size;
float tn_quality;
int tn_size;
int content_size;
int threads;
int scan_mem_limit_mib;
char *incremental;
char *output;
char *rewrite_url;
@@ -21,13 +25,20 @@ typedef struct scan_args {
char *archive_passphrase;
char *tesseract_lang;
const char *tesseract_path;
int ocr_images;
int ocr_ebooks;
char *exclude_regex;
int fast;
const char* treemap_threshold_str;
double treemap_threshold;
int max_memory_buffer;
int max_memory_buffer_mib;
int read_subtitles;
/** Number of thumbnails to generate */
int tn_count;
int fast_epub;
int calculate_checksums;
char *list_path;
FILE *list_file;
} scan_args_t;
scan_args_t *scan_args_create();
@@ -39,7 +50,8 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv);
typedef struct index_args {
char *es_url;
char *es_index;
const char *index_path;
int es_insecure_ssl;
char *index_path;
const char *script_path;
char *script;
const char *es_settings_path;
@@ -51,11 +63,13 @@ typedef struct index_args {
int async_script;
int force_reset;
int threads;
int incremental;
} index_args_t;
typedef struct web_args {
char *es_url;
char *es_index;
int es_insecure_ssl;
char *listen_address;
char *credentials;
char *tag_credentials;
@@ -73,7 +87,8 @@ typedef struct web_args {
typedef struct exec_args {
char *es_url;
char *es_index;
const char *index_path;
int es_insecure_ssl;
char *index_path;
const char *script_path;
int async_script;
char *script;

View File

@@ -2,6 +2,9 @@
ScanCtx_t ScanCtx = {
.stat_index_size = 0,
.stat_tn_size = 0,
.dbg_current_files = NULL,
.pool = NULL
};
WebCtx_t WebCtx;
IndexCtx_t IndexCtx;

View File

@@ -14,7 +14,10 @@
#include "libscan/mobi/scan_mobi.h"
#include "libscan/raw/raw.h"
#include "libscan/msdoc/msdoc.h"
#include "libscan/wpd/wpd.h"
#include "libscan/json/json.h"
#include "src/io/store.h"
#include "src/index/elastic.h"
#include <glib.h>
#include <pcre.h>
@@ -31,12 +34,16 @@ typedef struct {
int threads;
int depth;
int calculate_checksums;
size_t mem_limit;
size_t stat_tn_size;
size_t stat_index_size;
GHashTable *original_table;
GHashTable *copy_table;
GHashTable *new_table;
pthread_mutex_t copy_table_mu;
pcre *exclude;
pcre_extra *exclude_extra;
@@ -60,6 +67,8 @@ typedef struct {
scan_mobi_ctx_t mobi_ctx;
scan_raw_ctx_t raw_ctx;
scan_msdoc_ctx_t msdoc_ctx;
scan_wpd_ctx_t wpd_ctx;
scan_json_ctx_t json_ctx;
} ScanCtx_t;
typedef struct {
@@ -70,6 +79,8 @@ typedef struct {
typedef struct {
char *es_url;
int es_insecure_ssl;
es_version_t *es_version;
char *es_index;
int batch_size;
tpool_t *pool;
@@ -77,11 +88,17 @@ typedef struct {
GHashTable *tags;
store_t *meta_store;
GHashTable *meta;
/**
* Set to false when using --print
*/
int needs_es_connection;
} IndexCtx_t;
typedef struct {
char *es_url;
es_version_t *es_version;
char *es_index;
int es_insecure_ssl;
int index_count;
char *auth_user;
char *auth_pass;
@@ -89,7 +106,7 @@ typedef struct {
int tag_auth_enabled;
char *tagline;
struct index_t indices[256];
char lang[3];
char lang[10];
int dev;
} WebCtx_t;

View File

@@ -15,28 +15,45 @@ typedef struct es_indexer {
} es_indexer_t;
static __thread es_indexer_t *Indexer;
static __thread es_indexer_t *Indexer = NULL;
void delete_queue(int max);
void free_queue(int max);
void elastic_flush();
void elastic_cleanup() {
elastic_flush();
if (Indexer != NULL) {
free(Indexer->es_index);
free(Indexer->es_url);
free(Indexer);
void print_error(response_t *r);
void destroy_indexer(es_indexer_t *indexer) {
if (indexer == NULL) {
return;
}
LOG_DEBUG("elastic.c", "Destroying indexer")
if (indexer->es_url != NULL) {
free(indexer->es_url);
free(indexer->es_index);
}
free(indexer);
}
void print_json(cJSON *document, const char id_str[MD5_STR_LENGTH]) {
void elastic_cleanup() {
if (IndexCtx.needs_es_connection) {
elastic_flush();
}
destroy_indexer(Indexer);
}
void print_json(cJSON *document, const char id_str[SIST_DOC_ID_LEN]) {
cJSON *line = cJSON_CreateObject();
cJSON_AddStringToObject(line, "_id", id_str);
cJSON_AddStringToObject(line, "_index", IndexCtx.es_index);
cJSON_AddStringToObject(line, "_type", "_doc");
// cJSON_AddStringToObject(line, "_type", "_doc");
cJSON_AddItemReferenceToObject(line, "_source", document);
char *json = cJSON_PrintUnformatted(line);
@@ -52,13 +69,24 @@ void index_json_func(void *arg) {
elastic_index_line(line);
}
void index_json(cJSON *document, const char index_id_str[MD5_STR_LENGTH]) {
void delete_document(const char* document_id_str, void* UNUSED(_data)) {
es_bulk_line_t *bulk_line = malloc(sizeof(es_bulk_line_t));
bulk_line->type = ES_BULK_LINE_DELETE;
bulk_line->next = NULL;
strcpy(bulk_line->doc_id, document_id_str);
tpool_add_work(IndexCtx.pool, index_json_func, bulk_line);
}
void index_json(cJSON *document, const char doc_id[SIST_DOC_ID_LEN]) {
char *json = cJSON_PrintUnformatted(document);
size_t json_len = strlen(json);
es_bulk_line_t *bulk_line = malloc(sizeof(es_bulk_line_t) + json_len + 2);
bulk_line->type = ES_BULK_LINE_INDEX;
memcpy(bulk_line->line, json, json_len);
memcpy(bulk_line->path_md5_str, index_id_str, MD5_STR_LENGTH);
strcpy(bulk_line->doc_id, doc_id);
*(bulk_line->line + json_len) = '\n';
*(bulk_line->line + json_len + 1) = '\0';
bulk_line->next = NULL;
@@ -67,7 +95,7 @@ void index_json(cJSON *document, const char index_id_str[MD5_STR_LENGTH]) {
tpool_add_work(IndexCtx.pool, index_json_func, bulk_line);
}
void execute_update_script(const char *script, int async, const char index_id[MD5_STR_LENGTH]) {
void execute_update_script(const char *script, int async, const char index_id[SIST_INDEX_ID_LEN]) {
if (Indexer == NULL) {
Indexer = create_indexer(IndexCtx.es_url, IndexCtx.es_index);
@@ -82,16 +110,16 @@ void execute_update_script(const char *script, int async, const char index_id[MD
cJSON *term_obj = cJSON_AddObjectToObject(query, "term");
cJSON_AddStringToObject(term_obj, "index", index_id);
char *str = cJSON_Print(body);
char *str = cJSON_PrintUnformatted(body);
char bulk_url[4096];
char url[4096];
if (async) {
snprintf(bulk_url, sizeof(bulk_url), "%s/%s/_update_by_query?wait_for_completion=false", Indexer->es_url,
snprintf(url, sizeof(url), "%s/%s/_update_by_query?wait_for_completion=false", Indexer->es_url,
Indexer->es_index);
} else {
snprintf(bulk_url, sizeof(bulk_url), "%s/%s/_update_by_query", Indexer->es_url, Indexer->es_index);
snprintf(url, sizeof(url), "%s/%s/_update_by_query", Indexer->es_url, Indexer->es_index);
}
response_t *r = web_post(bulk_url, str);
response_t *r = web_post(url, str, IndexCtx.es_insecure_ssl);
if (!async) {
LOG_INFOF("elastic.c", "Executed user script <%d>", r->status_code);
}
@@ -111,13 +139,18 @@ void execute_update_script(const char *script, int async, const char index_id[MD
if (async) {
cJSON *task = cJSON_GetObjectItem(resp, "task");
if (task == NULL) {
LOG_FATALF("elastic.c", "FIXME: Could not get task id: %s", r->body);
}
LOG_INFOF("elastic.c", "User script queued: %s/_tasks/%s", Indexer->es_url, task->valuestring);
}
cJSON_Delete(resp);
}
void *create_bulk_buffer(int max, int *count, size_t *buf_len) {
void *create_bulk_buffer(int max, int *count, size_t *buf_len, int legacy) {
es_bulk_line_t *line = Indexer->line_head;
*count = 0;
@@ -125,30 +158,56 @@ void *create_bulk_buffer(int max, int *count, size_t *buf_len) {
size_t buf_cur = 0;
char *buf = malloc(8192);
size_t buf_capacity = 8192;
#define GROW_BUF(delta) \
while (buf_size + (delta) > buf_capacity) { \
buf_capacity *= 2; \
buf = realloc(buf, buf_capacity); \
} \
buf_size += (delta); \
// see: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
// ES_BULK_LINE_INDEX: two lines, 1st action, 2nd content
// ES_BULK_LINE_DELETE: one line
while (line != NULL && *count < max) {
char action_str[256];
snprintf(
action_str, sizeof(action_str),
"{\"index\":{\"_id\":\"%s\",\"_type\":\"_doc\",\"_index\":\"%s\"}}\n",
line->path_md5_str, Indexer->es_index
);
if (line->type == ES_BULK_LINE_INDEX) {
size_t action_str_len = strlen(action_str);
size_t line_len = strlen(line->line);
if (legacy) {
snprintf(
action_str, sizeof(action_str),
"{\"index\":{\"_id\":\"%s\",\"_type\":\"_doc\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
);
} else {
snprintf(
action_str, sizeof(action_str),
"{\"index\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
);
}
while (buf_size + line_len + action_str_len > buf_capacity) {
buf_capacity *= 2;
buf = realloc(buf, buf_capacity);
size_t action_str_len = strlen(action_str);
size_t line_len = strlen(line->line);
GROW_BUF(action_str_len + line_len);
memcpy(buf + buf_cur, action_str, action_str_len);
buf_cur += action_str_len;
memcpy(buf + buf_cur, line->line, line_len);
buf_cur += line_len;
} else if (line->type == ES_BULK_LINE_DELETE) {
snprintf(
action_str, sizeof(action_str),
"{\"delete\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
);
size_t action_str_len = strlen(action_str);
GROW_BUF(action_str_len);
memcpy(buf + buf_cur, action_str, action_str_len);
buf_cur += action_str_len;
}
buf_size += line_len + action_str_len;
memcpy(buf + buf_cur, action_str, action_str_len);
buf_cur += action_str_len;
memcpy(buf + buf_cur, line->line, line_len);
buf_cur += line_len;
line = line->next;
(*count)++;
}
@@ -169,7 +228,13 @@ void print_errors(response_t *r) {
*(tmp + r->size) = '\0';
cJSON *ret_json = cJSON_Parse(tmp);
if (cJSON_GetObjectItem(ret_json, "errors")->valueint != 0) {
cJSON *errors = cJSON_GetObjectItem(ret_json, "errors");
if (errors == NULL) {
char *str = cJSON_Print(ret_json);
LOG_ERRORF("elastic.c", "%s\n", str);
cJSON_free(str);
} else if (errors->valueint != 0) {
cJSON *err;
cJSON_ArrayForEach(err, cJSON_GetObjectItem(ret_json, "items")) {
if (cJSON_GetObjectItem(cJSON_GetObjectItem(err, "index"), "status")->valueint != 201) {
@@ -207,11 +272,11 @@ void _elastic_flush(int max) {
size_t buf_len;
int count;
void *buf = create_bulk_buffer(max, &count, &buf_len);
void *buf = create_bulk_buffer(max, &count, &buf_len, IS_LEGACY_VERSION(IndexCtx.es_version));
char bulk_url[4096];
snprintf(bulk_url, sizeof(bulk_url), "%s/%s/_bulk?pipeline=tie", Indexer->es_url, Indexer->es_index);
response_t *r = web_post(bulk_url, buf);
response_t *r = web_post(bulk_url, buf, IndexCtx.es_insecure_ssl);
if (r->status_code == 0) {
LOG_FATALF("elastic.c", "Could not connect to %s, make sure that elasticsearch is running!\n", IndexCtx.es_url)
@@ -220,10 +285,10 @@ void _elastic_flush(int max) {
if (r->status_code == 413) {
if (max <= 1) {
LOG_ERRORF("elastic.c", "Single document too large, giving up: {%s}", Indexer->line_head->path_md5_str)
LOG_ERRORF("elastic.c", "Single document too large, giving up: {%s}", Indexer->line_head->doc_id)
free_response(r);
free(buf);
delete_queue(1);
free_queue(1);
if (Indexer->queued != 0) {
elastic_flush();
}
@@ -248,13 +313,13 @@ void _elastic_flush(int max) {
} else if (r->status_code != 200) {
print_errors(r);
delete_queue(Indexer->queued);
free_queue(Indexer->queued);
} else {
print_errors(r);
LOG_INFOF("elastic.c", "Indexed %d documents (%zukB) <%d>", count, buf_len / 1024, r->status_code);
delete_queue(max);
LOG_DEBUGF("elastic.c", "Indexed %d documents (%zukB) <%d>", count, buf_len / 1024, r->status_code);
free_queue(max);
if (Indexer->queued != 0) {
elastic_flush();
@@ -265,7 +330,7 @@ void _elastic_flush(int max) {
free(buf);
}
void delete_queue(int max) {
void free_queue(int max) {
for (int i = 0; i < max; i++) {
es_bulk_line_t *tmp = Indexer->line_head;
Indexer->line_head = tmp->next;
@@ -309,16 +374,22 @@ void elastic_index_line(es_bulk_line_t *line) {
es_indexer_t *create_indexer(const char *url, const char *index) {
char *es_url = malloc(strlen(url) + 1);
strcpy(es_url, url);
char *es_index = malloc(strlen(index) + 1);
strcpy(es_index, index);
es_indexer_t *indexer = malloc(sizeof(es_indexer_t));
indexer->es_url = es_url;
indexer->es_index = es_index;
if (IndexCtx.needs_es_connection) {
char *es_url = malloc(strlen(url) + 1);
strcpy(es_url, url);
char *es_index = malloc(strlen(index) + 1);
strcpy(es_index, index);
indexer->es_url = es_url;
indexer->es_index = es_index;
} else {
indexer->es_url = NULL;
indexer->es_index = NULL;
}
indexer->queued = 0;
indexer->line_head = NULL;
indexer->line_tail = NULL;
@@ -331,7 +402,7 @@ void finish_indexer(char *script, int async_script, char *index_id) {
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_refresh", IndexCtx.es_url, IndexCtx.es_index);
response_t *r = web_post(url, "");
response_t *r = web_post(url, "", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Refresh index <%d>", r->status_code);
free_response(r);
@@ -340,38 +411,104 @@ void finish_indexer(char *script, int async_script, char *index_id) {
free(script);
snprintf(url, sizeof(url), "%s/%s/_refresh", IndexCtx.es_url, IndexCtx.es_index);
r = web_post(url, "");
r = web_post(url, "", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Refresh index <%d>", r->status_code);
free_response(r);
}
snprintf(url, sizeof(url), "%s/%s/_forcemerge", IndexCtx.es_url, IndexCtx.es_index);
r = web_post(url, "");
r = web_post(url, "", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Merge index <%d>", r->status_code);
free_response(r);
snprintf(url, sizeof(url), "%s/%s/_settings", IndexCtx.es_url, IndexCtx.es_index);
r = web_put(url, "{\"index\":{\"refresh_interval\":\"1s\"}}");
r = web_put(url, "{\"index\":{\"refresh_interval\":\"1s\"}}", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Set refresh interval <%d>", r->status_code);
free_response(r);
}
void elastic_init(int force_reset, const char* user_mappings, const char* user_settings) {
es_version_t *elastic_get_version(const char *es_url, int insecure) {
response_t *r = web_get(es_url, 30, insecure);
char *tmp = malloc(r->size + 1);
memcpy(tmp, r->body, r->size);
*(tmp + r->size) = '\0';
cJSON *response = cJSON_Parse(tmp);
free(tmp);
if (response == NULL) {
free_response(r);
return NULL;
}
if (cJSON_GetObjectItem(response, "error") != NULL) {
LOG_WARNING("elastic.c", "Could not get Elasticsearch version")
print_error(r);
free_response(r);
cJSON_Delete(response);
return NULL;
}
free_response(r);
if (cJSON_GetObjectItem(response, "version") == NULL ||
cJSON_GetObjectItem(cJSON_GetObjectItem(response, "version"), "number") == NULL) {
cJSON_Delete(response);
return NULL;
}
char *version_str = cJSON_GetObjectItem(cJSON_GetObjectItem(response, "version"), "number")->valuestring;
es_version_t *version = malloc(sizeof(es_version_t));
const char *tok = strtok(version_str, ".");
version->major = atoi(tok);
tok = strtok(NULL, ".");
version->minor = atoi(tok);
tok = strtok(NULL, ".");
version->patch = atoi(tok);
cJSON_Delete(response);
return version;
}
void elastic_init(int force_reset, const char *user_mappings, const char *user_settings) {
es_version_t *es_version = elastic_get_version(IndexCtx.es_url, IndexCtx.es_insecure_ssl);
IndexCtx.es_version = es_version;
if (es_version == NULL) {
LOG_FATAL("elastic.c", "Could not get ES version")
}
LOG_INFOF("elastic.c",
"Elasticsearch version is %s (supported=%d, legacy=%d)",
format_es_version(es_version), IS_SUPPORTED_ES_VERSION(es_version), IS_LEGACY_VERSION(es_version));
if (!IS_SUPPORTED_ES_VERSION(es_version)) {
LOG_FATAL("elastic.c", "This elasticsearch version is not supported!")
}
char *settings = NULL;
if (IS_LEGACY_VERSION(es_version)) {
settings = settings_legacy_json;
} else {
settings = settings_json;
}
// Check if index exists
char url[4096];
snprintf(url, sizeof(url), "%s/%s", IndexCtx.es_url, IndexCtx.es_index);
response_t *r = web_get(url, 30);
response_t *r = web_get(url, 30, IndexCtx.es_insecure_ssl);
int index_exists = r->status_code == 200;
free_response(r);
if (!index_exists || force_reset) {
r = web_delete(url);
r = web_delete(url, IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Delete index <%d>", r->status_code);
free_response(r);
snprintf(url, sizeof(url), "%s/%s", IndexCtx.es_url, IndexCtx.es_index);
r = web_put(url, "");
r = web_put(url, "", IndexCtx.es_insecure_ssl);
if (r->status_code != 200) {
print_error(r);
@@ -382,17 +519,17 @@ void elastic_init(int force_reset, const char* user_mappings, const char* user_s
free_response(r);
snprintf(url, sizeof(url), "%s/%s/_close", IndexCtx.es_url, IndexCtx.es_index);
r = web_post(url, "");
r = web_post(url, "", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Close index <%d>", r->status_code);
free_response(r);
snprintf(url, sizeof(url), "%s/_ingest/pipeline/tie", IndexCtx.es_url);
r = web_put(url, pipeline_json);
r = web_put(url, pipeline_json, IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Create pipeline <%d>", r->status_code);
free_response(r);
snprintf(url, sizeof(url), "%s/%s/_settings", IndexCtx.es_url, IndexCtx.es_index);
r = web_put(url, user_settings ? user_settings : settings_json);
r = web_put(url, user_settings ? user_settings : settings, IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Update ES settings <%d>", r->status_code);
if (r->status_code != 200) {
print_error(r);
@@ -400,8 +537,13 @@ void elastic_init(int force_reset, const char* user_mappings, const char* user_s
}
free_response(r);
snprintf(url, sizeof(url), "%s/%s/_mappings/_doc?include_type_name=true", IndexCtx.es_url, IndexCtx.es_index);
r = web_put(url, user_mappings ? user_mappings : mappings_json);
if (IS_LEGACY_VERSION(es_version)) {
snprintf(url, sizeof(url), "%s/%s/_mappings/_doc?include_type_name=true", IndexCtx.es_url, IndexCtx.es_index);
} else {
snprintf(url, sizeof(url), "%s/%s/_mappings", IndexCtx.es_url, IndexCtx.es_index);
}
r = web_put(url, user_mappings ? user_mappings : mappings_json, IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Update ES mappings <%d>", r->status_code);
if (r->status_code != 200) {
print_error(r);
@@ -410,7 +552,7 @@ void elastic_init(int force_reset, const char* user_mappings, const char* user_s
free_response(r);
snprintf(url, sizeof(url), "%s/%s/_open", IndexCtx.es_url, IndexCtx.es_index);
r = web_post(url, "");
r = web_post(url, "", IndexCtx.es_insecure_ssl);
LOG_INFOF("elastic.c", "Open index <%d>", r->status_code);
free_response(r);
}
@@ -420,7 +562,7 @@ cJSON *elastic_get_document(const char *id_str) {
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_doc/%s", WebCtx.es_url, WebCtx.es_index, id_str);
response_t *r = web_get(url, 3);
response_t *r = web_get(url, 3, WebCtx.es_insecure_ssl);
cJSON *json = NULL;
if (r->status_code == 200) {
char *tmp = malloc(r->size + 1);
@@ -438,7 +580,7 @@ char *elastic_get_status() {
snprintf(url, sizeof(url),
"%s/_cluster/state/metadata/%s?filter_path=metadata.indices.*.state", WebCtx.es_url, WebCtx.es_index);
response_t *r = web_get(url, 30);
response_t *r = web_get(url, 30, WebCtx.es_insecure_ssl);
cJSON *json = NULL;
char *status = malloc(128 * sizeof(char));
status[0] = '\0';
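
The comments in `create_bulk_buffer` above note that an index operation takes two lines in the bulk body (action, then content) while a delete takes one, and that only legacy Elasticsearch versions get a `_type` field. The sketch below isolates the action-line formatting so the three variants can be compared side by side. It is an illustration, not the project's code: `format_bulk_action` and the `BULK_LINE_*` constants are hypothetical names that mirror `ES_BULK_LINE_INDEX` and `ES_BULK_LINE_DELETE`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BULK_LINE_INDEX 0  /* hypothetical; mirrors ES_BULK_LINE_INDEX */
#define BULK_LINE_DELETE 1 /* hypothetical; mirrors ES_BULK_LINE_DELETE */

/*
 * Hypothetical helper: writes the newline-terminated action line of one
 * bulk operation into out.
 * Returns the number of characters written, or -1 if the type is unknown,
 * the document id is empty, or the buffer is too small.
 */
static int format_bulk_action(char *out, size_t out_len, int type, int legacy,
                              const char *doc_id, const char *es_index) {
    assert(out != NULL && doc_id != NULL && es_index != NULL);

    if (strlen(doc_id) == 0) {
        return -1;
    }

    int n;
    if (type == BULK_LINE_INDEX && legacy) {
        /* legacy versions still expect an explicit _type */
        n = snprintf(out, out_len,
                     "{\"index\":{\"_id\":\"%s\",\"_type\":\"_doc\",\"_index\":\"%s\"}}\n",
                     doc_id, es_index);
    } else if (type == BULK_LINE_INDEX) {
        n = snprintf(out, out_len,
                     "{\"index\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
                     doc_id, es_index);
    } else if (type == BULK_LINE_DELETE) {
        /* a delete has no content line, only this action line */
        n = snprintf(out, out_len,
                     "{\"delete\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
                     doc_id, es_index);
    } else {
        return -1;
    }

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t) n >= out_len) {
        return -1;
    }
    return n;
}
```

For an index operation the caller would append the document JSON (already newline-terminated by `index_json`) right after this line. For a delete, the action line is the whole operation.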

View File

@@ -3,12 +3,38 @@
#include "src/sist.h"
#define ES_BULK_LINE_INDEX 0
#define ES_BULK_LINE_DELETE 1
typedef struct es_bulk_line {
struct es_bulk_line *next;
char path_md5_str[MD5_STR_LENGTH];
char doc_id[SIST_DOC_ID_LEN];
int type;
char line[0];
} es_bulk_line_t;
typedef struct {
int major;
int minor;
int patch;
} es_version_t;
#define VERSION_GE(version, maj, min) ((version)->major > (maj) || ((version)->major == (maj) && (version)->minor >= (min)))
#define VERSION_LT(version, maj, min) (!VERSION_GE(version, maj, min))
#define IS_SUPPORTED_ES_VERSION(es_version) ((es_version) != NULL && VERSION_GE((es_version), 6, 8) && VERSION_LT((es_version), 9, 0))
#define IS_LEGACY_VERSION(es_version) ((es_version) != NULL && VERSION_LT((es_version), 7, 14))
__always_inline
static const char *format_es_version(es_version_t *version) {
static char buf[64];
snprintf(buf, sizeof(buf), "%d.%d.%d", version->major, version->minor, version->patch);
return buf;
}
/**
* Note: indexer is *not* thread safe
*/
@@ -16,9 +42,11 @@ typedef struct es_indexer es_indexer_t;
void elastic_index_line(es_bulk_line_t *line);
void print_json(cJSON *document, const char index_id_str[MD5_STR_LENGTH]);
void print_json(cJSON *document, const char index_id_str[SIST_INDEX_ID_LEN]);
void index_json(cJSON *document, const char index_id_str[MD5_STR_LENGTH]);
void index_json(cJSON *document, const char doc_id[SIST_INDEX_ID_LEN]);
void delete_document(const char *document_id_str, void* data);
es_indexer_t *create_indexer(const char *url, const char *index);
@@ -31,6 +59,8 @@ cJSON *elastic_get_document(const char *id_str);
char *elastic_get_status();
void execute_update_script(const char *script, int async, const char index_id[MD5_STR_LENGTH]);
es_version_t *elastic_get_version(const char *es_url, int insecure);
void execute_update_script(const char *script, int async, const char index_id[SIST_INDEX_ID_LEN]);
#endif
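The version macros above gate on `major` and `minor` only; `patch` is carried along for display. The following is a minimal sketch of how they classify a version string. The macros and the struct are copied from the header; `es_class_t` and `classify_es_version` are hypothetical names introduced only for this illustration, and parsing with `sscanf` is an assumption rather than what `elastic_get_version` necessarily does.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct {
    int major;
    int minor;
    int patch;
} es_version_t;

/* Macros copied from the header in the diff. */
#define VERSION_GE(version, maj, min) ((version)->major > (maj) || ((version)->major == (maj) && (version)->minor >= (min)))
#define VERSION_LT(version, maj, min) (!VERSION_GE(version, maj, min))
#define IS_SUPPORTED_ES_VERSION(es_version) ((es_version) != NULL && VERSION_GE((es_version), 6, 8) && VERSION_LT((es_version), 9, 0))
#define IS_LEGACY_VERSION(es_version) ((es_version) != NULL && VERSION_LT((es_version), 7, 14))

/* Hypothetical result type, not part of the diff. */
typedef enum {
    ES_UNPARSEABLE,
    ES_UNSUPPORTED,
    ES_SUPPORTED_LEGACY,
    ES_SUPPORTED
} es_class_t;

/* Hypothetical helper: parse "major.minor.patch" and classify it. */
es_class_t classify_es_version(const char *str) {
    es_version_t parsed;
    const es_version_t *v = &parsed;

    if (sscanf(str, "%d.%d.%d", &parsed.major, &parsed.minor, &parsed.patch) != 3) {
        return ES_UNPARSEABLE;
    }
    if (!IS_SUPPORTED_ES_VERSION(v)) {
        return ES_UNSUPPORTED;      /* below 6.8, or 9.0 and above */
    }
    /* Within the supported range, anything below 7.14 counts as legacy. */
    return IS_LEGACY_VERSION(v) ? ES_SUPPORTED_LEGACY : ES_SUPPORTED;
}
```

Note that `format_es_version` in the header returns a pointer to a `static` buffer, so each call overwrites the previous result.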



@@ -22,7 +22,7 @@ void free_response(response_t *resp) {
free(resp);
}
void web_post_async_poll(subreq_ctx_t* req) {
void web_post_async_poll(subreq_ctx_t *req) {
fd_set fdread;
fd_set fdwrite;
fd_set fdexcep;
@@ -34,7 +34,7 @@ void web_post_async_poll(subreq_ctx_t* req) {
CURLMcode mc = curl_multi_fdset(req->multi, &fdread, &fdwrite, &fdexcep, &maxfd);
if(mc != CURLM_OK) {
if (mc != CURLM_OK) {
req->done = TRUE;
return;
}
@@ -47,7 +47,7 @@ void web_post_async_poll(subreq_ctx_t* req) {
struct timeval timeout = {1, 0};
int rc = select(maxfd + 1, &fdread, &fdwrite, &fdexcep, &timeout);
switch(rc) {
switch (rc) {
case -1:
req->done = TRUE;
break;
@@ -64,6 +64,10 @@ void web_post_async_poll(subreq_ctx_t* req) {
req->response->size = req->response_buf.cur;
curl_easy_getinfo(req->handle, CURLINFO_RESPONSE_CODE, &req->response->status_code);
if (req->response->status_code == 0) {
LOG_ERRORF("web.c", "CURL Error: %s", req->curl_err_buffer)
}
curl_multi_cleanup(req->multi);
curl_easy_cleanup(req->handle);
curl_slist_free_all(req->headers);
@@ -71,7 +75,7 @@ void web_post_async_poll(subreq_ctx_t* req) {
}
}
subreq_ctx_t *web_post_async(const char *url, char *data) {
subreq_ctx_t *web_post_async(const char *url, char *data, int insecure) {
subreq_ctx_t *req = calloc(1, sizeof(subreq_ctx_t));
req->response = calloc(1, sizeof(response_t));
req->data = data;
@@ -84,6 +88,11 @@ subreq_ctx_t *web_post_async(const char *url, char *data) {
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
curl_easy_setopt(curl, CURLOPT_POST, 1);
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
}
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, req->curl_err_buffer);
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Content-Type: application/json");
@@ -100,7 +109,7 @@ subreq_ctx_t *web_post_async(const char *url, char *data) {
return req;
}
response_t *web_get(const char *url, int timeout) {
response_t *web_get(const char *url, int timeout, int insecure) {
response_t *resp = malloc(sizeof(response_t));
CURL *curl;
@@ -112,14 +121,24 @@ response_t *web_get(const char *url, int timeout) {
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout);
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
}
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Content-Type: application/json");
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
char err_buffer[CURL_ERROR_SIZE + 1] = {};
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, err_buffer);
curl_easy_perform(curl);
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &resp->status_code);
if (resp->status_code == 0) {
LOG_ERRORF("web.c", "CURL Error: %s", err_buffer)
}
curl_easy_cleanup(curl);
curl_slist_free_all(headers);
@@ -128,7 +147,7 @@ response_t *web_get(const char *url, int timeout) {
return resp;
}
response_t *web_post(const char *url, const char *data) {
response_t *web_post(const char *url, const char *data, int insecure) {
response_t *resp = malloc(sizeof(response_t));
@@ -141,6 +160,12 @@ response_t *web_post(const char *url, const char *data) {
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
curl_easy_setopt(curl, CURLOPT_POST, 1);
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
}
char err_buffer[CURL_ERROR_SIZE + 1] = {};
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, err_buffer);
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Content-Type: application/json");
@@ -151,17 +176,21 @@ response_t *web_post(const char *url, const char *data) {
curl_easy_perform(curl);
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &resp->status_code);
curl_easy_cleanup(curl);
curl_slist_free_all(headers);
resp->body = buffer.buf;
resp->size = buffer.cur;
if (resp->status_code == 0) {
LOG_ERRORF("web.c", "CURL Error: %s", err_buffer)
}
curl_easy_cleanup(curl);
curl_slist_free_all(headers);
return resp;
}
response_t *web_put(const char *url, const char *data) {
response_t *web_put(const char *url, const char *data, int insecure) {
response_t *resp = malloc(sizeof(response_t));
@@ -175,7 +204,10 @@ response_t *web_put(const char *url, const char *data) {
curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PUT");
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
curl_easy_setopt(curl, CURLOPT_DNS_USE_GLOBAL_CACHE, 0);
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURLOPT_DNS_LOCAL_IP4 );
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURLOPT_DNS_LOCAL_IP4);
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
}
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "Content-Type: application/json");
@@ -194,7 +226,7 @@ response_t *web_put(const char *url, const char *data) {
return resp;
}
response_t *web_delete(const char *url) {
response_t *web_delete(const char *url, int insecure) {
response_t *resp = malloc(sizeof(response_t));
@@ -207,6 +239,9 @@ response_t *web_delete(const char *url) {
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "DELETE");
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
}
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "");
struct curl_slist *headers = NULL;


@@ -25,14 +25,15 @@ typedef struct {
response_t *response;
int running_handles;
int done;
char curl_err_buffer[CURL_ERROR_SIZE + 1];
} subreq_ctx_t;
response_t *web_get(const char *url, int timeout);
response_t *web_post(const char * url, const char * data);
response_t *web_get(const char *url, int timeout, int insecure);
response_t *web_post(const char * url, const char * data, int insecure);
void web_post_async_poll(subreq_ctx_t* req);
subreq_ctx_t *web_post_async(const char *url, char *data);
response_t *web_put(const char *url, const char *data);
response_t *web_delete(const char *url);
subreq_ctx_t *web_post_async(const char *url, char *data, int insecure);
response_t *web_put(const char *url, const char *data, int insecure);
response_t *web_delete(const char *url, int insecure);
void free_response(response_t *resp);


@@ -38,6 +38,8 @@ char *get_meta_key_text(enum metakey meta_key) {
return "parent";
case MetaExifMake:
return "exif_make";
case MetaExifDescription:
return "exif_description";
case MetaExifSoftware:
return "exif_software";
case MetaExifExposureTime:
@@ -74,6 +76,8 @@ char *get_meta_key_text(enum metakey meta_key) {
return "exif_gps_latitude_dms";
case MetaExifGpsLatitudeDec:
return "exif_gps_latitude_dec";
case MetaChecksum:
return "checksum";
default:
LOG_FATALF("serialize.c", "FIXME: Unknown meta key: %d", meta_key)
}
@@ -120,15 +124,14 @@ char *build_json_string(document_t *doc) {
cJSON_AddStringToObject(json, "path", "");
}
char md5_str[MD5_STR_LENGTH];
buf2hex(doc->path_md5, MD5_DIGEST_LENGTH, md5_str);
cJSON_AddStringToObject(json, "_id", md5_str);
cJSON_AddStringToObject(json, "_id", doc->doc_id);
// Metadata
meta_line_t *meta = doc->meta_head;
while (meta != NULL) {
switch (meta->key) {
case MetaThumbnail:
case MetaPages:
case MetaWidth:
case MetaHeight:
@@ -148,6 +151,7 @@ char *build_json_string(document_t *doc) {
case MetaFontName:
case MetaParent:
case MetaExifMake:
case MetaExifDescription:
case MetaExifSoftware:
case MetaExifExposureTime:
case MetaExifFNumber:
@@ -158,13 +162,13 @@ char *build_json_string(document_t *doc) {
case MetaExifModel:
case MetaAuthor:
case MetaModifiedBy:
case MetaThumbnail:
case MetaExifGpsLongitudeDMS:
case MetaExifGpsLongitudeDec:
case MetaExifGpsLongitudeRef:
case MetaExifGpsLatitudeDMS:
case MetaExifGpsLatitudeDec:
case MetaExifGpsLatitudeRef:
case MetaChecksum:
case MetaTitle: {
cJSON_AddStringToObject(json, get_meta_key_text(meta->key), meta->str_val);
buffer_size_guess += (int) strlen(meta->str_val);
@@ -392,7 +396,7 @@ void read_index_bin_handle_line(const char *line, const char *index_id, index_fu
}
}
void read_index_ndjson(const char *path, const char *index_id, index_func func) {
void read_lines(const char *path, const line_processor_t processor) {
dyn_buffer_t buf = dyn_buffer_create();
// Initialize zstd things
@@ -421,7 +425,7 @@ void read_index_ndjson(const char *path, const char *index_id, index_func func)
if (c == '\n') {
dyn_buffer_write_char(&buf, '\0');
read_index_bin_handle_line(buf.buf, index_id, func);
processor.func(buf.buf, processor.data);
buf.cur = 0;
} else {
dyn_buffer_write_char(&buf, c);
@@ -448,20 +452,29 @@ void read_index_ndjson(const char *path, const char *index_id, index_func func)
fclose(file);
}
void read_index(const char *path, const char index_id[MD5_STR_LENGTH], const char *type, index_func func) {
void read_index_ndjson(const char *line, void *_data) {
void **data = _data;
const char *index_id = data[0];
index_func func = data[1];
read_index_bin_handle_line(line, index_id, func);
}
void read_index(const char *path, const char index_id[SIST_INDEX_ID_LEN], const char *type, index_func func) {
if (strcmp(type, INDEX_TYPE_NDJSON) == 0) {
read_index_ndjson(path, index_id, func);
read_lines(path, (line_processor_t) {
.data = (void *[2]) {(void *) index_id, func},
.func = read_index_ndjson,
});
}
}
static __thread GHashTable *IncrementalReadTable = NULL;
void json_put_incremental(cJSON *document, UNUSED(const char id_str[MD5_STR_LENGTH])) {
void json_put_incremental(cJSON *document, UNUSED(const char doc_id[SIST_DOC_ID_LEN])) {
const char *path_md5_str = cJSON_GetObjectItem(document, "_id")->valuestring;
const int mtime = cJSON_GetObjectItem(document, "mtime")->valueint;
incremental_put_str(IncrementalReadTable, path_md5_str, mtime);
incremental_put(IncrementalReadTable, path_md5_str, mtime);
}
void incremental_read(GHashTable *table, const char *filepath, index_descriptor_t *desc) {
@@ -470,16 +483,15 @@ void incremental_read(GHashTable *table, const char *filepath, index_descriptor_
}
static __thread GHashTable *IncrementalCopyTable = NULL;
static __thread GHashTable *IncrementalNewTable = NULL;
static __thread store_t *IncrementalCopySourceStore = NULL;
static __thread store_t *IncrementalCopyDestinationStore = NULL;
void incremental_copy_handle_doc(cJSON *document, UNUSED(const char id_str[MD5_STR_LENGTH])) {
void incremental_copy_handle_doc(cJSON *document, UNUSED(const char id_str[SIST_DOC_ID_LEN])) {
const char *path_md5_str = cJSON_GetObjectItem(document, "_id")->valuestring;
unsigned char path_md5[MD5_DIGEST_LENGTH];
hex2buf(path_md5_str, MD5_STR_LENGTH - 1, path_md5);
const char *doc_id = cJSON_GetObjectItem(document, "_id")->valuestring;
if (cJSON_GetObjectItem(document, "parent") != NULL || incremental_get_str(IncrementalCopyTable, path_md5_str)) {
if (cJSON_GetObjectItem(document, "parent") != NULL || incremental_get(IncrementalCopyTable, doc_id)) {
// Copy index line
cJSON_DeleteItemFromObject(document, "index");
char *json_str = cJSON_PrintUnformatted(document);
@@ -493,9 +505,9 @@ void incremental_copy_handle_doc(cJSON *document, UNUSED(const char id_str[MD5_S
// Copy tn store contents
size_t buf_len;
char *buf = store_read(IncrementalCopySourceStore, (char *) path_md5, sizeof(path_md5), &buf_len);
char *buf = store_read(IncrementalCopySourceStore, (char *) doc_id, SIST_DOC_ID_LEN, &buf_len);
if (buf_len != 0) {
store_write(IncrementalCopyDestinationStore, (char *) path_md5, sizeof(path_md5), buf, buf_len);
store_write(IncrementalCopyDestinationStore, (char *) doc_id, SIST_DOC_ID_LEN, buf, buf_len);
free(buf);
}
}
@@ -518,3 +530,33 @@ void incremental_copy(store_t *store, store_t *dst_store, const char *filepath,
read_index(filepath, "", INDEX_TYPE_NDJSON, incremental_copy_handle_doc);
}
void incremental_delete_handle_doc(cJSON *document, UNUSED(const char id_str[SIST_DOC_ID_LEN])) {
char doc_id_n[SIST_DOC_ID_LEN + 1];
doc_id_n[SIST_DOC_ID_LEN] = '\0';
doc_id_n[SIST_DOC_ID_LEN - 1] = '\n';
const char *doc_id = cJSON_GetObjectItem(document, "_id")->valuestring;
// do not delete archive virtual entries
if (cJSON_GetObjectItem(document, "parent") == NULL
&& !incremental_get(IncrementalCopyTable, doc_id)
&& !incremental_get(IncrementalNewTable, doc_id)
) {
memcpy(doc_id_n, doc_id, SIST_DOC_ID_LEN - 1);
zstd_write_string(doc_id_n, sizeof(doc_id_n));
}
}
void incremental_delete(const char *del_filepath, const char *index_filepath,
GHashTable *copy_table, GHashTable *new_table) {
if (WriterCtx.out_file == NULL) {
initialize_writer_ctx(del_filepath);
}
IncrementalCopyTable = copy_table;
IncrementalNewTable = new_table;
read_index(index_filepath, "", INDEX_TYPE_NDJSON, incremental_delete_handle_doc);
}
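The refactor above turns `read_lines` into a generic line iterator that receives a `line_processor_t`: a function pointer plus an opaque `data` pointer handed back on every call. The sketch below shows the same callback-with-context pattern in isolation. `for_each_line`, `line_stats_t`, `collect_stats` and `stats_for` are hypothetical names for this illustration, and the input is an in-memory buffer rather than a zstd-compressed file.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the line_processor_t introduced in the diff. */
typedef struct line_processor {
    void *data;
    void (*func)(const char *, void *);
} line_processor_t;

/* Hypothetical stand-in for read_lines(): walks a writable in-memory
 * buffer and calls the processor once per newline-terminated line. */
void for_each_line(char *text, const line_processor_t processor) {
    char *line = text;
    char *newline;
    while ((newline = strchr(line, '\n')) != NULL) {
        *newline = '\0';                       /* terminate the current line */
        processor.func(line, processor.data);  /* context is passed through */
        line = newline + 1;
    }
}

/* Hypothetical context: counts lines and tracks the longest one. */
typedef struct {
    int count;
    size_t longest;
} line_stats_t;

void collect_stats(const char *line, void *data) {
    line_stats_t *stats = data;
    size_t len = strlen(line);
    stats->count += 1;
    if (len > stats->longest) {
        stats->longest = len;
    }
}

/* Usage: build the processor with a compound literal, as read_index() does. */
line_stats_t stats_for(char *text) {
    line_stats_t stats = {0, 0};
    for_each_line(text, (line_processor_t) {
            .data = &stats,
            .func = collect_stats,
    });
    return stats;
}
```

In the diff, `read_index` packs two values (`index_id` and `func`) into a `void *[2]` array for the same purpose; a small struct, as used here, is an alternative that keeps the fields named.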


@@ -7,14 +7,24 @@
#include <sys/syscall.h>
#include <glib.h>
typedef void(*index_func)(cJSON *, const char[MD5_STR_LENGTH]);
typedef struct line_processor {
void* data;
void (*func)(const char*, void*);
} line_processor_t;
typedef void(*index_func)(cJSON *, const char[SIST_DOC_ID_LEN]);
void incremental_copy(store_t *store, store_t *dst_store, const char *filepath,
const char *dst_filepath, GHashTable *copy_table);
void incremental_delete(const char *del_filepath, const char* index_filepath,
GHashTable *copy_table, GHashTable *new_table);
void write_document(document_t *doc);
void read_index(const char *path, const char[MD5_STR_LENGTH], const char *type, index_func);
void read_lines(const char *path, const line_processor_t processor);
void read_index(const char *path, const char index_id[SIST_INDEX_ID_LEN], const char *type, index_func);
void incremental_read(GHashTable *table, const char *filepath, index_descriptor_t *desc);
@@ -29,4 +39,18 @@ void write_index_descriptor(char *path, index_descriptor_t *desc);
index_descriptor_t read_index_descriptor(char *path);
#endif
// caller ensures char file_path[PATH_MAX]
#define READ_INDICES(file_path, index_path, action_ok, action_main_fail, cond_original) \
snprintf(file_path, PATH_MAX, "%s_index_main.ndjson.zst", index_path); \
if (access(file_path, R_OK) == 0) { \
action_ok; \
} else { \
action_main_fail; \
} \
snprintf(file_path, PATH_MAX, "%s_index_original.ndjson.zst", index_path); \
if ((cond_original) && access(file_path, R_OK) == 0) { \
action_ok; \
} \
#endif


@@ -4,6 +4,7 @@
store_t *store_create(const char *path, size_t chunk_size) {
store_t *store = malloc(sizeof(struct store_t));
mkdir(path, S_IWUSR | S_IRUSR | S_IXUSR);
strcpy(store->path, path);
#if (SIST_FAKE_STORE != 1)
store->chunk_size = chunk_size;
@@ -22,7 +23,6 @@ store_t *store_create(const char *path, size_t chunk_size) {
}
store->size = (size_t) store->chunk_size;
ScanCtx.stat_tn_size = 0;
mdb_env_set_mapsize(store->env, store->size);
// Open dbi
@@ -52,13 +52,7 @@ void store_flush(store_t *store) {
void store_write(store_t *store, char *key, size_t key_len, char *buf, size_t buf_len) {
if (LogCtx.very_verbose) {
if (key_len == MD5_DIGEST_LENGTH) {
char path_md5_str[MD5_STR_LENGTH];
buf2hex((unsigned char *) key, MD5_DIGEST_LENGTH, path_md5_str);
LOG_DEBUGF("store.c", "Store write {%s} %lu bytes", path_md5_str, buf_len)
} else {
LOG_DEBUGF("store.c", "Store write {%s} %lu bytes", key, buf_len)
}
LOG_DEBUGF("store.c", "Store write %s@{%s} %lu bytes", store->path, key, buf_len)
}
#if (SIST_FAKE_STORE != 1)
@@ -78,27 +72,57 @@ void store_write(store_t *store, char *key, size_t key_len, char *buf, size_t bu
int put_ret = mdb_put(txn, store->dbi, &mdb_key, &mdb_value, 0);
ScanCtx.stat_tn_size += buf_len;
int db_full = FALSE;
int should_abort_transaction = FALSE;
if (put_ret == MDB_MAP_FULL) {
mdb_txn_abort(txn);
db_full = TRUE;
should_abort_transaction = TRUE;
} else {
int commit_ret = mdb_txn_commit(txn);
if (commit_ret == MDB_MAP_FULL) {
db_full = TRUE;
}
}
if (db_full) {
LOG_INFOF("store.c", "Updating mdb mapsize to %lu bytes", store->size)
if (should_abort_transaction) {
mdb_txn_abort(txn);
}
pthread_rwlock_unlock(&store->lock);
// Cannot resize while a transaction is open.
// The resize takes effect on the next commit.
pthread_rwlock_wrlock(&store->lock);
store->size += store->chunk_size;
mdb_env_set_mapsize(store->env, store->size);
int resize_ret = mdb_env_set_mapsize(store->env, store->size);
if (resize_ret != 0) {
LOG_ERROR("store.c", mdb_strerror(resize_ret))
}
mdb_txn_begin(store->env, NULL, 0, &txn);
put_ret = mdb_put(txn, store->dbi, &mdb_key, &mdb_value, 0);
int put_ret_retry = mdb_put(txn, store->dbi, &mdb_key, &mdb_value, 0);
if (put_ret_retry != 0) {
LOG_ERROR("store.c", mdb_strerror(put_ret_retry))
}
int ret = mdb_txn_commit(txn);
if (ret != 0) {
LOG_FATALF("store.c", "FIXME: Could not commit to store %s: %s (%d), %d, %d %d",
store->path, mdb_strerror(ret), ret,
put_ret, put_ret_retry);
}
LOG_INFOF("store.c", "Updated mdb mapsize to %lu bytes", store->size)
}
mdb_txn_commit(txn);
pthread_rwlock_unlock(&store->lock);
if (put_ret != 0) {
} else if (put_ret != 0) {
LOG_ERROR("store.c", mdb_strerror(put_ret))
}
pthread_rwlock_unlock(&store->lock);
#endif
}
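The `store_write` change above follows a grow-and-retry shape: if the first put reports that the map is full, enlarge the map by one chunk and try the same put again, checking the retry's own return code. Here is a minimal sketch of that control flow without LMDB. `fake_store_t`, `fake_put` and `put_with_resize` are hypothetical stand-ins, and locking and transactions are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical in-memory store standing in for the LMDB environment. */
typedef struct {
    size_t used;        /* bytes written so far */
    size_t size;        /* current capacity, like store->size */
    size_t chunk_size;  /* growth step, like store->chunk_size */
} fake_store_t;

enum { PUT_OK = 0, PUT_FULL = 1 };

/* Hypothetical put: fails when the value does not fit. */
static int fake_put(fake_store_t *store, size_t len) {
    if (store->used + len > store->size) {
        return PUT_FULL;
    }
    store->used += len;
    return PUT_OK;
}

/* Grow by one chunk and retry once, reporting the retry's own status. */
int put_with_resize(fake_store_t *store, size_t len) {
    int ret = fake_put(store, len);
    if (ret == PUT_FULL) {
        store->size += store->chunk_size;
        ret = fake_put(store, len);  /* may still fail if len exceeds the new free space */
    }
    return ret;
}
```

Keeping the first and second return codes apart, as the diff does with `put_ret` and `put_ret_retry`, makes it possible to log which attempt failed.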


@@ -6,12 +6,12 @@
#include <glib.h>
#define STORE_SIZE_TN 1024 * 1024 * 5
#define STORE_SIZE_TAG 1024 * 16
#define STORE_SIZE_TN (1024 * 1024 * 5)
#define STORE_SIZE_TAG (1024 * 1024)
#define STORE_SIZE_META STORE_SIZE_TAG
typedef struct store_t {
char *path;
char path[PATH_MAX];
char *tmp_path;
MDB_dbi dbi;
MDB_env *env;


@@ -4,6 +4,8 @@
#include <ftw.h>
#define STR_STARTS_WITH(x, y) (strncmp(y, x, strlen(y) - 1) == 0)
__always_inline
parse_job_t *create_fs_parse_job(const char *filepath, const struct stat *info, int base) {
int len = (int) strlen(filepath);
@@ -20,43 +22,114 @@ parse_job_t *create_fs_parse_job(const char *filepath, const struct stat *info,
job->vfile.info = *info;
memset(job->parent, 0, MD5_DIGEST_LENGTH);
job->parent[0] = '\0';
job->vfile.filepath = job->filepath;
job->vfile.read = fs_read;
// Filesystem reads are always rewindable
job->vfile.read_rewindable = fs_read;
job->vfile.reset = fs_reset;
job->vfile.close = fs_close;
job->vfile.fd = -1;
job->vfile.is_fs_file = TRUE;
job->vfile.has_checksum = FALSE;
job->vfile.rewind_buffer_size = 0;
job->vfile.rewind_buffer = NULL;
job->vfile.calculate_checksum = ScanCtx.calculate_checksums;
return job;
}
int sub_strings[30];
#define EXCLUDED(str) (pcre_exec(ScanCtx.exclude, ScanCtx.exclude_extra, filepath, strlen(filepath), 0, 0, sub_strings, sizeof(sub_strings)) >= 0)
#define EXCLUDED(str) (pcre_exec(ScanCtx.exclude, ScanCtx.exclude_extra, str, strlen(str), 0, 0, sub_strings, sizeof(sub_strings)) >= 0)
int handle_entry(const char *filepath, const struct stat *info, int typeflag, struct FTW *ftw) {
if (typeflag == FTW_F && S_ISREG(info->st_mode) && ftw->level <= ScanCtx.depth) {
if (ftw->level > ScanCtx.depth) {
if (typeflag == FTW_D) {
return FTW_SKIP_SUBTREE;
}
return FTW_CONTINUE;
}
if (ScanCtx.exclude != NULL && EXCLUDED(filepath)) {
LOG_DEBUGF("walk.c", "Excluded: %s", filepath)
if (ScanCtx.exclude != NULL && EXCLUDED(filepath)) {
LOG_DEBUGF("walk.c", "Excluded: %s", filepath)
if (typeflag == FTW_F && S_ISREG(info->st_mode)) {
pthread_mutex_lock(&ScanCtx.dbg_file_counts_mu);
ScanCtx.dbg_excluded_files_count += 1;
pthread_mutex_unlock(&ScanCtx.dbg_file_counts_mu);
return 0;
} else if (typeflag == FTW_D) {
return FTW_SKIP_SUBTREE;
}
return FTW_CONTINUE;
}
if (typeflag == FTW_F && S_ISREG(info->st_mode)) {
parse_job_t *job = create_fs_parse_job(filepath, info, ftw->base);
tpool_add_work(ScanCtx.pool, parse, job);
}
return 0;
return FTW_CONTINUE;
}
#define MAX_FILE_DESCRIPTORS 64
int walk_directory_tree(const char *dirpath) {
return nftw(dirpath, handle_entry, MAX_FILE_DESCRIPTORS, FTW_PHYS | FTW_DEPTH);
return nftw(dirpath, handle_entry, MAX_FILE_DESCRIPTORS, FTW_PHYS | FTW_ACTIONRETVAL);
}
int iterate_file_list(void *input_file) {
char buf[PATH_MAX];
struct stat info;
while (fgets(buf, sizeof(buf), input_file) != NULL) {
// Remove trailing newline
*(buf + strlen(buf) - 1) = '\0';
int stat_ret = stat(buf, &info);
if (stat_ret != 0) {
LOG_ERRORF("walk.c", "Could not stat file %s (%s)", buf, strerror(errno));
continue;
}
if (!S_ISREG(info.st_mode)) {
LOG_ERRORF("walk.c", "Is not a regular file: %s", buf);
continue;
}
char *absolute_path = canonicalize_file_name(buf);
if (absolute_path == NULL) {
LOG_FATALF("walk.c", "FIXME: Could not get absolute path of %s", buf);
}
if (ScanCtx.exclude != NULL && EXCLUDED(absolute_path)) {
LOG_DEBUGF("walk.c", "Excluded: %s", absolute_path)
if (S_ISREG(info.st_mode)) {
pthread_mutex_lock(&ScanCtx.dbg_file_counts_mu);
ScanCtx.dbg_excluded_files_count += 1;
pthread_mutex_unlock(&ScanCtx.dbg_file_counts_mu);
}
continue;
}
if (!STR_STARTS_WITH(absolute_path, ScanCtx.index.desc.root)) {
LOG_FATALF("walk.c", "File is not a child of the root folder (%s): %s", ScanCtx.index.desc.root, buf);
}
int base = (int) (strrchr(buf, '/') - buf) + 1;
parse_job_t *job = create_fs_parse_job(absolute_path, &info, base);
free(absolute_path);
tpool_add_work(ScanCtx.pool, parse, job);
}
return 0;
}
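Two details in `iterate_file_list` are easy to misread. `STR_STARTS_WITH(x, y)` compares only the first `strlen(y) - 1` characters of `y`, and the newline removal always overwrites the last character of the buffer. The sketch below reproduces the macro as written and sets it next to two hypothetical helpers, `starts_with_full` and `strip_trailing_newline`, which are not part of the diff and are offered only as one possible stricter variant.

```c
#include <string.h>

/* Macro as it appears in the diff: ignores the last character of y. */
#define STR_STARTS_WITH(x, y) (strncmp(y, x, strlen(y) - 1) == 0)

/* Hypothetical stricter variant: compares the whole prefix. */
int starts_with_full(const char *str, const char *prefix) {
    return strncmp(prefix, str, strlen(prefix)) == 0;
}

/* Hypothetical helper: truncates at the first '\n' if there is one, so a
 * final line without a newline keeps its last character. */
void strip_trailing_newline(char *buf) {
    buf[strcspn(buf, "\n")] = '\0';
}
```

With a root of `/data/`, the macro accepts `/database/a.txt` because only `/data` is compared, whereas the full-prefix check rejects it. Whether that leniency is intended (for example, to tolerate a root stored with or without a trailing slash) is not stated in the diff.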


@@ -5,4 +5,6 @@
int walk_directory_tree(const char *);
int iterate_file_list(void* input_file);
#endif


@@ -48,6 +48,12 @@ void vsist_logf(const char *filepath, int level, char *format, va_list ap) {
size_t maxsize = sizeof(log_str) - log_len;
log_len += vsnprintf(log_str + log_len, maxsize, format, ap);
if (log_len >= maxsize) {
fprintf(stderr, "([%s] FIXME: Log string is too long to display: %dB)\n",
log_levels[level], log_len);
return;
}
if (is_tty) {
log_len += sprintf(log_str + log_len, "\033[0m\n");
} else {
@@ -55,10 +61,14 @@ void vsist_logf(const char *filepath, int level, char *format, va_list ap) {
log_len += 1;
}
int ret = write(STDERR_FILENO, log_str, log_len);
if (ret == -1) {
LOG_FATALF("serialize.c", "Could not write index descriptor: %s", strerror(errno))
if (PrintingProgressBar) {
PrintingProgressBar = FALSE;
memmove(log_str + 1, log_str, log_len);
log_str[0] = '\n';
log_len += 1;
}
write(STDERR_FILENO, log_str, log_len);
}
void sist_logf(const char *filepath, int level, char *format, ...) {
@@ -104,8 +114,12 @@ void sist_log(const char *filepath, int level, char *str) {
);
}
int ret = write(STDERR_FILENO, log_str, log_len);
if (ret == -1) {
LOG_FATALF("serialize.c", "Could not write index descriptor: %s", strerror(errno));
if (PrintingProgressBar) {
PrintingProgressBar = FALSE;
memmove(log_str + 1, log_str, log_len);
log_str[0] = '\n';
log_len += 1;
}
write(STDERR_FILENO, log_str, log_len);
}
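When a progress bar is on screen, the logging change above shifts the formatted message right by one byte and puts a newline in front of it. A minimal sketch of that step follows, with an explicit capacity check added as an assumption; the diff itself does not show one. `prepend_newline` is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper mirroring the memmove pattern in the diff: shifts
 * `len` bytes right by one and puts '\n' in front. Returns the new length,
 * or `len` unchanged if the buffer has no room for the extra byte. */
size_t prepend_newline(char *buf, size_t len, size_t capacity) {
    if (len + 1 > capacity) {
        return len;
    }
    memmove(buf + 1, buf, len);  /* source and destination overlap, so memcpy is not safe here */
    buf[0] = '\n';
    return len + 1;
}
```

The length is tracked separately from the buffer, as in the diff, so the result does not need to be NUL-terminated before being passed to `write`.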

src/magic_generated.c vendored Normal file

File diff suppressed because one or more lines are too long


@@ -14,6 +14,9 @@
#include "parsing/mime.h"
#include "parsing/parse.h"
#include <signal.h>
#include <unistd.h>
#include "stats.h"
#define DESCRIPTION "Lightning-fast file system indexer and search tool."
@@ -29,44 +32,50 @@ static const char *const usage[] = {
NULL,
};
#include<signal.h>
#include<unistd.h>
static __sighandler_t sigsegv_handler = NULL;
static __sighandler_t sigabrt_handler = NULL;
void sig_handler(int signum) {
LogCtx.verbose = 1;
LogCtx.very_verbose = 1;
LogCtx.verbose = TRUE;
LogCtx.very_verbose = TRUE;
LOG_ERROR("*SIGNAL HANDLER*", "=============================================\n\n");
LOG_ERRORF("*SIGNAL HANDLER*", "Uh oh! Caught fatal signal: %s", strsignal(signum));
GHashTableIter iter;
g_hash_table_iter_init(&iter, ScanCtx.dbg_current_files);
if (ScanCtx.dbg_current_files != NULL) {
GHashTableIter iter;
g_hash_table_iter_init(&iter, ScanCtx.dbg_current_files);
void *key;
void *value;
while (g_hash_table_iter_next(&iter, &key, &value)) {
parse_job_t *job = value;
void *key;
void *value;
while (g_hash_table_iter_next(&iter, &key, &value)) {
parse_job_t *job = value;
if (isatty(STDERR_FILENO)) {
LOG_DEBUGF(
"*SIGNAL HANDLER*",
"Thread \033[%dm[%04llX]\033[0m was working on job '%s'",
31 + ((unsigned int) key) % 7, key, job->filepath
);
} else {
LOG_DEBUGF(
"*SIGNAL HANDLER*",
"THREAD [%04llX] was working on job %s",
key, job->filepath
);
if (isatty(STDERR_FILENO)) {
LOG_DEBUGF(
"*SIGNAL HANDLER*",
"Thread \033[%dm[%04llX]\033[0m was working on job '%s'",
31 + ((unsigned int) key) % 7, key, job->filepath
);
} else {
LOG_DEBUGF(
"*SIGNAL HANDLER*",
"THREAD [%04llX] was working on job %s",
key, job->filepath
);
}
}
}
tpool_dump_debug_info(ScanCtx.pool);
if (ScanCtx.pool != NULL) {
tpool_dump_debug_info(ScanCtx.pool);
}
if (IndexCtx.pool != NULL) {
tpool_dump_debug_info(IndexCtx.pool);
}
LOG_INFO(
"*SIGNAL HANDLER*",
@@ -94,7 +103,7 @@ void sig_handler(int signum) {
exit(-1);
}
void init_dir(const char *dirpath) {
void init_dir(const char *dirpath, scan_args_t *args) {
char path[PATH_MAX];
snprintf(path, PATH_MAX, "%sdescriptor.json", dirpath);
@@ -102,9 +111,18 @@ void init_dir(const char *dirpath) {
strcpy(ScanCtx.index.desc.version, Version);
strcpy(ScanCtx.index.desc.type, INDEX_TYPE_NDJSON);
unsigned char index_md5[MD5_DIGEST_LENGTH];
MD5((unsigned char *) &ScanCtx.index.desc.timestamp, sizeof(ScanCtx.index.desc.timestamp), index_md5);
buf2hex(index_md5, MD5_DIGEST_LENGTH, ScanCtx.index.desc.id);
if (args->incremental != NULL) {
// copy old index id
char descriptor_path[PATH_MAX];
snprintf(descriptor_path, PATH_MAX, "%sdescriptor.json", args->incremental);
index_descriptor_t original_desc = read_index_descriptor(descriptor_path);
memcpy(ScanCtx.index.desc.id, original_desc.id, sizeof(original_desc.id));
} else {
// generate new index id based on timestamp
unsigned char index_md5[MD5_DIGEST_LENGTH];
MD5((unsigned char *) &ScanCtx.index.desc.timestamp, sizeof(ScanCtx.index.desc.timestamp), index_md5);
buf2hex(index_md5, MD5_DIGEST_LENGTH, ScanCtx.index.desc.id);
}
write_index_descriptor(path, &ScanCtx.index.desc);
}
@@ -161,6 +179,9 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.dbg_current_files = g_hash_table_new_full(g_int64_hash, g_int64_equal, NULL, NULL);
pthread_mutex_init(&ScanCtx.dbg_current_files_mu, NULL);
pthread_mutex_init(&ScanCtx.dbg_file_counts_mu, NULL);
pthread_mutex_init(&ScanCtx.copy_table_mu, NULL);
ScanCtx.calculate_checksums = args->calculate_checksums;
// Archive
ScanCtx.arc_ctx.mode = args->archive_mode;
@@ -177,40 +198,50 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.comic_ctx.log = _log;
ScanCtx.comic_ctx.logf = _logf;
ScanCtx.comic_ctx.store = _store;
ScanCtx.comic_ctx.tn_size = args->size;
ScanCtx.comic_ctx.tn_qscale = args->quality;
ScanCtx.comic_ctx.enable_tn = args->tn_count > 0;
ScanCtx.comic_ctx.tn_size = args->tn_size;
ScanCtx.comic_ctx.tn_qscale = args->tn_quality;
ScanCtx.comic_ctx.cbr_mime = mime_get_mime_by_string(ScanCtx.mime_table, "application/x-cbr");
ScanCtx.comic_ctx.cbz_mime = mime_get_mime_by_string(ScanCtx.mime_table, "application/x-cbz");
// Ebook
pthread_mutex_init(&ScanCtx.ebook_ctx.mupdf_mutex, NULL);
ScanCtx.ebook_ctx.content_size = args->content_size;
ScanCtx.ebook_ctx.tn_size = args->size;
ScanCtx.ebook_ctx.enable_tn = args->tn_count > 0;
ScanCtx.ebook_ctx.tn_size = args->tn_size;
ScanCtx.ebook_ctx.tesseract_lang = args->tesseract_lang;
ScanCtx.ebook_ctx.tesseract_path = args->tesseract_path;
ScanCtx.ebook_ctx.log = _log;
ScanCtx.ebook_ctx.logf = _logf;
ScanCtx.ebook_ctx.store = _store;
ScanCtx.ebook_ctx.fast_epub_parse = args->fast_epub;
ScanCtx.ebook_ctx.tn_qscale = args->quality;
ScanCtx.ebook_ctx.tn_qscale = args->tn_quality;
// Font
ScanCtx.font_ctx.enable_tn = args->size > 0;
ScanCtx.font_ctx.enable_tn = args->tn_count > 0;
ScanCtx.font_ctx.log = _log;
ScanCtx.font_ctx.logf = _logf;
ScanCtx.font_ctx.store = _store;
// Media
ScanCtx.media_ctx.tn_qscale = args->quality;
ScanCtx.media_ctx.tn_size = args->size;
ScanCtx.media_ctx.tn_qscale = args->tn_quality;
ScanCtx.media_ctx.tn_size = args->tn_size;
ScanCtx.media_ctx.tn_count = args->tn_count;
ScanCtx.media_ctx.log = _log;
ScanCtx.media_ctx.logf = _logf;
ScanCtx.media_ctx.store = _store;
ScanCtx.media_ctx.max_media_buffer = (long) args->max_memory_buffer * 1024 * 1024;
ScanCtx.media_ctx.max_media_buffer = (long) args->max_memory_buffer_mib * 1024 * 1024;
ScanCtx.media_ctx.read_subtitles = args->read_subtitles;
ScanCtx.media_ctx.read_subtitles = args->tn_count;
if (args->ocr_images) {
ScanCtx.media_ctx.tesseract_lang = args->tesseract_lang;
ScanCtx.media_ctx.tesseract_path = args->tesseract_path;
}
init_media();
// OOXML
ScanCtx.ooxml_ctx.enable_tn = args->tn_count > 0;
ScanCtx.ooxml_ctx.content_size = args->content_size;
ScanCtx.ooxml_ctx.log = _log;
ScanCtx.ooxml_ctx.logf = _logf;
@@ -227,7 +258,8 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.text_ctx.logf = _logf;
// MSDOC
ScanCtx.msdoc_ctx.tn_size = args->size;
ScanCtx.msdoc_ctx.enable_tn = args->tn_count > 0;
ScanCtx.msdoc_ctx.tn_size = args->tn_size;
ScanCtx.msdoc_ctx.content_size = args->content_size;
ScanCtx.msdoc_ctx.log = _log;
ScanCtx.msdoc_ctx.logf = _logf;
@@ -236,6 +268,7 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.threads = args->threads;
ScanCtx.depth = args->depth;
ScanCtx.mem_limit = (size_t) args->scan_mem_limit_mib * 1024 * 1024;
strncpy(ScanCtx.index.path, args->output, sizeof(ScanCtx.index.path));
strncpy(ScanCtx.index.desc.name, args->name, sizeof(ScanCtx.index.desc.name));
@@ -245,44 +278,112 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.fast = args->fast;
// Raw
ScanCtx.raw_ctx.tn_qscale = args->quality;
ScanCtx.raw_ctx.tn_size = args->size;
ScanCtx.raw_ctx.tn_qscale = args->tn_quality;
ScanCtx.raw_ctx.enable_tn = args->tn_count > 0;
ScanCtx.raw_ctx.tn_size = args->tn_size;
ScanCtx.raw_ctx.log = _log;
ScanCtx.raw_ctx.logf = _logf;
ScanCtx.raw_ctx.store = _store;
// Wpd
ScanCtx.wpd_ctx.content_size = args->content_size;
ScanCtx.wpd_ctx.log = _log;
ScanCtx.wpd_ctx.logf = _logf;
ScanCtx.wpd_ctx.wpd_mime = mime_get_mime_by_string(ScanCtx.mime_table, "application/wordperfect");
// Json
ScanCtx.json_ctx.content_size = args->content_size;
ScanCtx.json_ctx.log = _log;
ScanCtx.json_ctx.logf = _logf;
ScanCtx.json_ctx.json_mime = mime_get_mime_by_string(ScanCtx.mime_table, "application/json");
ScanCtx.json_ctx.ndjson_mime = mime_get_mime_by_string(ScanCtx.mime_table, "application/ndjson");
}
/**
* Loads an existing index as the baseline for incremental scanning.
* 1. load old index files (original+main) => original_table
* 2. allocate empty table => copy_table
* 3. allocate empty table => new_table
* the original_table/copy_table/new_table will be populated in parsing/parse.c:parse
* and consumed in main.c:save_incremental_index
*
* Note: the existing index may or may not itself be an incremental index.
*/
void load_incremental_index(const scan_args_t *args) {
char file_path[PATH_MAX];
ScanCtx.original_table = incremental_get_table();
ScanCtx.copy_table = incremental_get_table();
DIR *dir = opendir(args->incremental);
if (dir == NULL) {
LOG_FATALF("main.c", "Could not open original index for incremental scan: %s", strerror(errno))
}
ScanCtx.new_table = incremental_get_table();
char descriptor_path[PATH_MAX];
snprintf(descriptor_path, PATH_MAX, "%s/descriptor.json", args->incremental);
snprintf(descriptor_path, PATH_MAX, "%sdescriptor.json", args->incremental);
index_descriptor_t original_desc = read_index_descriptor(descriptor_path);
if (strcmp(original_desc.version, Version) != 0) {
LOG_FATALF("main.c", "Version mismatch! Index is %s but executable is %s", original_desc.version, Version)
}
struct dirent *de;
while ((de = readdir(dir)) != NULL) {
if (strncmp(de->d_name, "_index", sizeof("_index") - 1) == 0) {
char file_path[PATH_MAX];
snprintf(file_path, PATH_MAX, "%s%s", args->incremental, de->d_name);
incremental_read(ScanCtx.original_table, file_path, &original_desc);
}
}
closedir(dir);
READ_INDICES(
file_path,
args->incremental,
incremental_read(ScanCtx.original_table, file_path, &original_desc),
LOG_FATALF("main.c", "Could not open original main index for incremental scan: %s", strerror(errno)),
TRUE
);
LOG_INFOF("main.c", "Loaded %d items into the mtime table.", g_hash_table_size(ScanCtx.original_table))
}
/**
* Saves an incremental index.
* Before calling this function, the scanner should have finished writing the main index.
* 1. Build original_table - new_table => delete_table
* 2. Incrementally copy from old index files [(original+main) /\ copy_table] => index_original.ndjson.zst & store
*/
void save_incremental_index(scan_args_t *args) {
char dst_path[PATH_MAX];
char store_path[PATH_MAX];
char file_path[PATH_MAX];
char del_path[PATH_MAX];
snprintf(store_path, PATH_MAX, "%sthumbs", args->incremental);
snprintf(dst_path, PATH_MAX, "%s_index_original.ndjson.zst", ScanCtx.index.path);
store_t *source = store_create(store_path, STORE_SIZE_TN);
LOG_INFOF("main.c", "incremental_delete: original size = %u, copy size = %u, new size = %u",
g_hash_table_size(ScanCtx.original_table),
g_hash_table_size(ScanCtx.copy_table),
g_hash_table_size(ScanCtx.new_table));
snprintf(del_path, PATH_MAX, "%s_index_delete.list.zst", ScanCtx.index.path);
READ_INDICES(file_path, args->incremental,
incremental_delete(del_path, file_path, ScanCtx.copy_table, ScanCtx.new_table),
perror("incremental_delete"), 1);
writer_cleanup();
READ_INDICES(file_path, args->incremental,
incremental_copy(source, ScanCtx.index.store, file_path, dst_path, ScanCtx.copy_table),
perror("incremental_copy"), 1);
writer_cleanup();
store_destroy(source);
snprintf(store_path, PATH_MAX, "%stags", args->incremental);
snprintf(dst_path, PATH_MAX, "%stags", ScanCtx.index.path);
store_t *source_tags = store_create(store_path, STORE_SIZE_TAG);
store_copy(source_tags, dst_path);
store_destroy(source_tags);
}
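Step 1 of the comment on save_incremental_index (original_table - new_table => delete_table) is a set difference over file paths. The following is a minimal sketch of that idea only, assuming plain string arrays instead of the hash tables used in the diff; `contains` and `diff_paths` are hypothetical helper names and are not part of sist2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `key` appears in `set`. Linear scan for brevity;
 * the real code keeps these sets in hash tables. */
int contains(const char *const *set, size_t n, const char *key) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(set[i], key) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical helper: writes into `out` every entry of `original` that is
 * absent from `current` and returns how many entries were written.
 * `out` must have room for `n_original` entries.
 */
size_t diff_paths(const char *const *original, size_t n_original,
                  const char *const *current, size_t n_current,
                  const char **out) {
    assert(original != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n_original; i++) {
        if (!contains(current, n_current, original[i])) {
            out[count++] = original[i];
        }
    }
    return count;
}
```

Calling `diff_paths` with the paths of the old index as `original` and the paths seen by the new scan as `current` yields the candidates for the "delete" list; entries present in both are the ones that would be carried over or rescanned.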
/**
* An index can be either incremental or non-incremental (initial index).
* For an initial index, there is only the "main" index.
* For an incremental index, there are, additionally:
* - An "original" index, referencing all files unchanged since the previous index.
* - A "delete" index, referencing all files that exist in the previous index but have been deleted since then.
* Therefore, for an incremental index, "main"+"original" covers all the current files in the live filesystem,
* and is orthogonal to the "delete" index. When building an incremental index upon an old incremental index,
* the old "delete" index can be safely ignored.
*/
void sist2_scan(scan_args_t *args) {
ScanCtx.mime_table = mime_get_mime_table();
@@ -290,7 +391,7 @@ void sist2_scan(scan_args_t *args) {
initialize_scan_context(args);
init_dir(ScanCtx.index.path);
init_dir(ScanCtx.index.path, args);
char store_path[PATH_MAX];
snprintf(store_path, PATH_MAX, "%sthumbs", ScanCtx.index.path);
@@ -305,16 +406,26 @@ void sist2_scan(scan_args_t *args) {
load_incremental_index(args);
}
ScanCtx.pool = tpool_create(args->threads, thread_cleanup, TRUE, TRUE);
ScanCtx.pool = tpool_create(ScanCtx.threads, thread_cleanup, TRUE, TRUE, ScanCtx.mem_limit);
tpool_start(ScanCtx.pool);
ScanCtx.writer_pool = tpool_create(1, writer_cleanup, TRUE, FALSE);
ScanCtx.writer_pool = tpool_create(1, writer_cleanup, TRUE, FALSE, 0);
tpool_start(ScanCtx.writer_pool);
int walk_ret = walk_directory_tree(ScanCtx.index.desc.root);
if (walk_ret == -1) {
LOG_FATALF("main.c", "walk_directory_tree() failed! %s (%d)", strerror(errno), errno)
if (args->list_path) {
// Scan using file list
int list_ret = iterate_file_list(args->list_file);
if (list_ret != 0) {
LOG_FATALF("main.c", "iterate_file_list() failed! (%d)", list_ret)
}
} else {
// Scan directory recursively
int walk_ret = walk_directory_tree(ScanCtx.index.desc.root);
if (walk_ret == -1) {
LOG_FATALF("main.c", "walk_directory_tree() failed! %s (%d)", strerror(errno), errno)
}
}
tpool_wait(ScanCtx.pool);
tpool_destroy(ScanCtx.pool);
@@ -324,35 +435,11 @@ void sist2_scan(scan_args_t *args) {
LOG_DEBUGF("main.c", "Skipped files: %d", ScanCtx.dbg_skipped_files_count)
LOG_DEBUGF("main.c", "Excluded files: %d", ScanCtx.dbg_excluded_files_count)
LOG_DEBUGF("main.c", "Failed files: %d", ScanCtx.dbg_failed_files_count)
LOG_DEBUGF("main.c", "Thumbnail store size: %lu", ScanCtx.stat_tn_size)
LOG_DEBUGF("main.c", "Index size: %lu", ScanCtx.stat_index_size)
if (args->incremental != NULL) {
char dst_path[PATH_MAX];
snprintf(store_path, PATH_MAX, "%sthumbs", args->incremental);
snprintf(dst_path, PATH_MAX, "%s_index_original.ndjson.zst", ScanCtx.index.path);
store_t *source = store_create(store_path, STORE_SIZE_TN);
DIR *dir = opendir(args->incremental);
if (dir == NULL) {
perror("opendir");
return;
}
struct dirent *de;
while ((de = readdir(dir)) != NULL) {
if (strncmp(de->d_name, "_index_", sizeof("_index_") - 1) == 0) {
char file_path[PATH_MAX];
snprintf(file_path, PATH_MAX, "%s%s", args->incremental, de->d_name);
incremental_copy(source, ScanCtx.index.store, file_path, dst_path, ScanCtx.copy_table);
}
}
closedir(dir);
store_destroy(source);
writer_cleanup();
snprintf(store_path, PATH_MAX, "%stags", args->incremental);
snprintf(dst_path, PATH_MAX, "%stags", ScanCtx.index.path);
store_t *source_tags = store_create(store_path, STORE_SIZE_TAG);
store_copy(source_tags, dst_path);
store_destroy(source_tags);
save_incremental_index(args);
}
generate_stats(&ScanCtx.index, args->treemap_threshold, ScanCtx.index.path);
@@ -362,17 +449,20 @@ void sist2_scan(scan_args_t *args) {
}
void sist2_index(index_args_t *args) {
char file_path[PATH_MAX];
IndexCtx.es_url = args->es_url;
IndexCtx.es_index = args->es_index;
IndexCtx.es_insecure_ssl = args->es_insecure_ssl;
IndexCtx.batch_size = args->batch_size;
IndexCtx.needs_es_connection = !args->print;
if (!args->print) {
if (IndexCtx.needs_es_connection) {
elastic_init(args->force_reset, args->es_mappings, args->es_settings);
}
char descriptor_path[PATH_MAX];
snprintf(descriptor_path, PATH_MAX, "%s/descriptor.json", args->index_path);
snprintf(descriptor_path, PATH_MAX, "%sdescriptor.json", args->index_path);
index_descriptor_t desc = read_index_descriptor(descriptor_path);
@@ -388,11 +478,11 @@ void sist2_index(index_args_t *args) {
}
char path_tmp[PATH_MAX];
snprintf(path_tmp, sizeof(path_tmp), "%s/tags", args->index_path);
snprintf(path_tmp, sizeof(path_tmp), "%stags", args->index_path);
IndexCtx.tag_store = store_create(path_tmp, STORE_SIZE_TAG);
IndexCtx.tags = store_read_all(IndexCtx.tag_store);
snprintf(path_tmp, sizeof(path_tmp), "%s/meta", args->index_path);
snprintf(path_tmp, sizeof(path_tmp), "%smeta", args->index_path);
IndexCtx.meta_store = store_create(path_tmp, STORE_SIZE_META);
IndexCtx.meta = store_read_all(IndexCtx.meta_store);
@@ -403,32 +493,33 @@ void sist2_index(index_args_t *args) {
f = index_json;
}
void (*cleanup)();
if (args->print) {
cleanup = NULL;
} else {
cleanup = elastic_cleanup;
}
IndexCtx.pool = tpool_create(args->threads, cleanup, FALSE, FALSE);
IndexCtx.pool = tpool_create(args->threads, elastic_cleanup, FALSE, args->print == 0, 0);
tpool_start(IndexCtx.pool);
struct dirent *de;
while ((de = readdir(dir)) != NULL) {
if (strncmp(de->d_name, "_index_", sizeof("_index_") - 1) == 0) {
char file_path[PATH_MAX];
snprintf(file_path, PATH_MAX, "%s/%s", args->index_path, de->d_name);
read_index(file_path, desc.id, desc.type, f);
READ_INDICES(file_path, args->index_path, {
read_index(file_path, desc.id, desc.type, f);
LOG_DEBUGF("main.c", "Read index file %s (%s)", file_path, desc.type);
}, {}, !args->incremental);
// Only read the _delete index if we're sending data to ES
if (!args->print) {
snprintf(file_path, PATH_MAX, "%s_index_delete.list.zst", args->index_path);
if (0 == access(file_path, R_OK)) {
read_lines(file_path, (line_processor_t) {
.data = NULL,
.func = delete_document
});
LOG_DEBUGF("main.c", "Read index file %s (%s)", file_path, desc.type)
}
}
closedir(dir);
tpool_wait(IndexCtx.pool);
tpool_destroy(IndexCtx.pool);
if (!args->print) {
if (IndexCtx.needs_es_connection) {
finish_indexer(args->script, args->async_script, desc.id);
}
@@ -443,11 +534,13 @@ void sist2_exec_script(exec_args_t *args) {
LogCtx.verbose = TRUE;
char descriptor_path[PATH_MAX];
snprintf(descriptor_path, PATH_MAX, "%s/descriptor.json", args->index_path);
snprintf(descriptor_path, PATH_MAX, "%sdescriptor.json", args->index_path);
index_descriptor_t desc = read_index_descriptor(descriptor_path);
IndexCtx.es_url = args->es_url;
IndexCtx.es_index = args->es_index;
IndexCtx.es_insecure_ssl = args->es_insecure_ssl;
IndexCtx.needs_es_connection = TRUE;
LOG_DEBUGF("main.c", "descriptor version %s (%s)", desc.version, desc.type)
@@ -459,6 +552,7 @@ void sist2_web(web_args_t *args) {
WebCtx.es_url = args->es_url;
WebCtx.es_index = args->es_index;
WebCtx.es_insecure_ssl = args->es_insecure_ssl;
WebCtx.index_count = args->index_count;
WebCtx.auth_user = args->auth_user;
WebCtx.auth_pass = args->auth_pass;
@@ -466,7 +560,7 @@ void sist2_web(web_args_t *args) {
WebCtx.tag_auth_enabled = args->tag_auth_enabled;
WebCtx.tagline = args->tagline;
WebCtx.dev = args->dev;
strcpy(WebCtx.lang, "en");
strcpy(WebCtx.lang, args->lang);
for (int i = 0; i < args->index_count; i++) {
char *abs_path = abspath(args->indices[i]);
@@ -486,13 +580,34 @@ void sist2_web(web_args_t *args) {
WebCtx.indices[i].desc = read_index_descriptor(path_tmp);
strcpy(WebCtx.indices[i].path, abs_path);
printf("Loaded index: %s\n", WebCtx.indices[i].desc.name);
LOG_INFOF("main.c", "Loaded index: [%s]", WebCtx.indices[i].desc.name)
free(abs_path);
}
serve(args->listen_address);
}
/**
* Callback to handle options such that
*
* Unspecified -> 0: Set to default value
* Specified "0" -> -1: Disable the option (ex. don't generate thumbnails)
* Negative number -> Raise error
* Specified a valid number -> Continue as normal
*/
int set_to_negative_if_value_is_zero(struct argparse *self, const struct argparse_option *option) {
int specified_value = *(int *) option->value;
if (specified_value == 0) {
*((int *) option->data) = OPTION_VALUE_DISABLE;
}
if (specified_value < 0) {
fprintf(stderr, "error: option `--%s` Value must be >= 0\n", option->long_name);
exit(1);
}
// The callback is declared to return int; report success explicitly
return 0;
}
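The mapping described in the callback's comment (a specified "0" disables the option, a negative number is an error, anything else passes through) can be sketched in isolation. This is an illustration under stated assumptions, not the argparse callback itself: `normalize_specified_option` and `SKETCH_VALUE_DISABLE` are hypothetical names, the sentinel value -1 is taken from the comment, and the error is reported through a return code rather than by calling `exit`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the "disabled" sentinel is -1, as stated in the callback's comment. */
#define SKETCH_VALUE_DISABLE (-1)

/*
 * Normalizes an option value that the user explicitly specified.
 * Returns 0 on success and stores the result in *out.
 * Returns -1 for a negative input and leaves *out untouched.
 */
int normalize_specified_option(int specified, int *out) {
    assert(out != NULL);
    if (specified < 0) {
        return -1; /* negative values are rejected */
    }
    /* An explicit 0 means "disable"; any positive value is kept as-is. */
    *out = (specified == 0) ? SKETCH_VALUE_DISABLE : specified;
    return 0;
}
```

Keeping "unspecified" (left at 0, later replaced by the default) distinct from "explicitly 0" (mapped to the sentinel) is what lets a single integer field express default, disabled, and a concrete value.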
int main(int argc, const char *argv[]) {
sigsegv_handler = signal(SIGSEGV, sig_handler);
@@ -508,6 +623,7 @@ int main(int argc, const char *argv[]) {
int arg_version = 0;
char *common_es_url = NULL;
int common_es_insecure_ssl = 0;
char *common_es_index = NULL;
char *common_script_path = NULL;
int common_async_script = 0;
@@ -522,12 +638,21 @@ int main(int argc, const char *argv[]) {
OPT_GROUP("Scan options"),
OPT_INTEGER('t', "threads", &common_threads, "Number of threads. DEFAULT=1"),
OPT_FLOAT('q', "quality", &scan_args->quality,
"Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. DEFAULT=3"),
OPT_INTEGER(0, "size", &scan_args->size,
"Thumbnail size, in pixels. Use negative value to disable. DEFAULT=500"),
OPT_INTEGER(0, "mem-throttle", &scan_args->scan_mem_limit_mib,
"Total memory threshold in MiB for scan throttling. DEFAULT=0",
set_to_negative_if_value_is_zero, (intptr_t) &scan_args->scan_mem_limit_mib),
OPT_FLOAT('q', "thumbnail-quality", &scan_args->tn_quality,
"Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. DEFAULT=1",
set_to_negative_if_value_is_zero, (intptr_t) &scan_args->tn_quality),
OPT_INTEGER(0, "thumbnail-size", &scan_args->tn_size,
"Thumbnail size, in pixels. DEFAULT=500",
set_to_negative_if_value_is_zero, (intptr_t) &scan_args->tn_size),
OPT_INTEGER(0, "thumbnail-count", &scan_args->tn_count,
"Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT=1",
set_to_negative_if_value_is_zero, (intptr_t) &scan_args->tn_count),
OPT_INTEGER(0, "content-size", &scan_args->content_size,
"Number of bytes to be extracted from text documents. Use negative value to disable. DEFAULT=32768"),
"Number of bytes to be extracted from text documents. Set to 0 to disable. DEFAULT=32768",
set_to_negative_if_value_is_zero, (intptr_t) &scan_args->content_size),
OPT_STRING(0, "incremental", &scan_args->incremental,
"Reuse an existing index and only scan modified files."),
OPT_STRING('o', "output", &scan_args->output, "Output directory. DEFAULT=index.sist2/"),
@@ -541,24 +666,34 @@ int main(int argc, const char *argv[]) {
OPT_STRING(0, "archive-passphrase", &scan_args->archive_passphrase,
"Passphrase for encrypted archive files"),
OPT_STRING(0, "ocr", &scan_args->tesseract_lang, "Tesseract language (use tesseract --list-langs to see "
"which are installed on your machine)"),
OPT_STRING(0, "ocr-lang", &scan_args->tesseract_lang,
"Tesseract language (use 'tesseract --list-langs' to see "
"which are installed on your machine)"),
OPT_BOOLEAN(0, "ocr-images", &scan_args->ocr_images, "Enable OCR'ing of image files."),
OPT_BOOLEAN(0, "ocr-ebooks", &scan_args->ocr_ebooks, "Enable OCR'ing of ebook files."),
OPT_STRING('e', "exclude", &scan_args->exclude_regex, "Files that match this regex will not be scanned"),
OPT_BOOLEAN(0, "fast", &scan_args->fast, "Only index file names & mime type"),
OPT_STRING(0, "treemap-threshold", &scan_args->treemap_threshold_str, "Relative size threshold for treemap "
"(see USAGE.md). DEFAULT: 0.0005"),
OPT_INTEGER(0, "mem-buffer", &scan_args->max_memory_buffer,
"Maximum memory buffer size per thread in MB for files inside archives "
OPT_INTEGER(0, "mem-buffer", &scan_args->max_memory_buffer_mib,
"Maximum memory buffer size per thread in MiB for files inside archives "
"(see USAGE.md). DEFAULT: 2000"),
OPT_BOOLEAN(0, "read-subtitles", &scan_args->read_subtitles, "Read subtitles from media files."),
OPT_BOOLEAN(0, "fast-epub", &scan_args->fast_epub,
"Faster but less accurate EPUB parsing (no thumbnails, metadata)"),
OPT_BOOLEAN(0, "checksums", &scan_args->calculate_checksums, "Calculate file checksums when scanning."),
OPT_STRING(0, "list-file", &scan_args->list_path, "Specify a list of newline-delimited paths to be scanned"
" instead of normal directory traversal. Use '-' to read"
" from stdin."),
OPT_GROUP("Index options"),
OPT_INTEGER('t', "threads", &common_threads, "Number of threads. DEFAULT=1"),
OPT_STRING(0, "es-url", &common_es_url, "Elasticsearch url with port. DEFAULT=http://localhost:9200"),
OPT_BOOLEAN(0, "es-insecure-ssl", &common_es_insecure_ssl, "Do not verify SSL connections to Elasticsearch."),
OPT_STRING(0, "es-index", &common_es_index, "Elasticsearch index name. DEFAULT=sist2"),
OPT_BOOLEAN('p', "print", &index_args->print, "Just print JSON documents to stdout."),
OPT_BOOLEAN(0, "incremental-index", &index_args->incremental,
"Conduct incremental indexing. Assumes that the old index is already ingested in Elasticsearch."),
OPT_STRING(0, "script-file", &common_script_path, "Path to user script."),
OPT_STRING(0, "mappings-file", &index_args->es_mappings_path, "Path to Elasticsearch mappings."),
OPT_STRING(0, "settings-file", &index_args->es_settings_path, "Path to Elasticsearch settings."),
@@ -569,15 +704,18 @@ int main(int argc, const char *argv[]) {
OPT_GROUP("Web options"),
OPT_STRING(0, "es-url", &common_es_url, "Elasticsearch url. DEFAULT=http://localhost:9200"),
OPT_BOOLEAN(0, "es-insecure-ssl", &common_es_insecure_ssl, "Do not verify SSL connections to Elasticsearch."),
OPT_STRING(0, "es-index", &common_es_index, "Elasticsearch index name. DEFAULT=sist2"),
OPT_STRING(0, "bind", &web_args->listen_address, "Listen on this address. DEFAULT=localhost:4090"),
OPT_STRING(0, "auth", &web_args->credentials, "Basic auth in user:password format"),
OPT_STRING(0, "tag-auth", &web_args->tag_credentials, "Basic auth in user:password format for tagging"),
OPT_STRING(0, "tagline", &web_args->tagline, "Tagline in navbar"),
OPT_BOOLEAN(0, "dev", &web_args->dev, "Serve html & js files from disk (for development)"),
OPT_STRING(0, "lang", &web_args->lang, "Default UI language. Can be changed by the user"),
OPT_GROUP("Exec-script options"),
OPT_STRING(0, "es-url", &common_es_url, "Elasticsearch url. DEFAULT=http://localhost:9200"),
OPT_BOOLEAN(0, "es-insecure-ssl", &common_es_insecure_ssl, "Do not verify SSL connections to Elasticsearch."),
OPT_STRING(0, "es-index", &common_es_index, "Elasticsearch index name. DEFAULT=sist2"),
OPT_STRING(0, "script-file", &common_script_path, "Path to user script."),
OPT_BOOLEAN(0, "async-script", &common_async_script, "Execute user script asynchronously."),
@@ -607,6 +745,10 @@ int main(int argc, const char *argv[]) {
index_args->es_index = common_es_index;
exec_args->es_index = common_es_index;
web_args->es_insecure_ssl = common_es_insecure_ssl;
index_args->es_insecure_ssl = common_es_insecure_ssl;
exec_args->es_insecure_ssl = common_es_insecure_ssl;
index_args->script_path = common_script_path;
exec_args->script_path = common_script_path;
index_args->threads = common_threads;
@@ -650,9 +792,8 @@ int main(int argc, const char *argv[]) {
sist2_exec_script(exec_args);
} else {
fprintf(stderr, "Invalid command: '%s'\n", argv[0]);
argparse_usage(&argparse);
goto end;
LOG_FATALF("main.c", "Invalid command: '%s'\n", argv[0])
}
printf("\n");

View File

@@ -35,427 +35,426 @@ enum mime {
application_mime=655387,
application_mspowerpoint=655388,
application_msword=655389,
application_netmc=655390,
application_octet_stream=655391,
application_oda=655392,
application_ogg=655393,
application_pdf=655394 | 0x40000000,
application_pgp_keys=655395,
application_pgp_signature=655396,
application_pkcs7_signature=655397,
application_pkix_cert=655398,
application_postscript=655399,
application_pro_eng=655400,
application_ringing_tones=655401,
application_smil=655402,
application_solids=655403,
application_sounder=655404,
application_step=655405,
application_streamingmedia=655406,
application_vda=655407,
application_vnd_amazon_mobi8_ebook=655408 | 0x02000000,
application_vnd_coffeescript=655409,
application_vnd_fdf=655410,
application_vnd_font_fontforge_sfd=655411,
application_vnd_hp_hpgl=655412,
application_vnd_iccprofile=655413,
application_vnd_lotus_1_2_3=655414,
application_vnd_ms_cab_compressed=655415,
application_vnd_ms_excel=655416,
application_vnd_ms_fontobject=655417,
application_vnd_ms_opentype=655418 | 0x20000000,
application_vnd_ms_outlook=655419,
application_vnd_ms_pki_certstore=655420,
application_vnd_ms_pki_pko=655421,
application_vnd_ms_pki_seccat=655422,
application_vnd_ms_powerpoint=655423,
application_vnd_ms_project=655424,
application_vnd_oasis_opendocument_base=655425,
application_vnd_oasis_opendocument_formula=655426,
application_vnd_oasis_opendocument_graphics=655427,
application_vnd_oasis_opendocument_presentation=655428,
application_vnd_oasis_opendocument_spreadsheet=655429,
application_vnd_oasis_opendocument_text=655430,
application_vnd_openxmlformats_officedocument_presentationml_presentation=655431 | 0x04000000,
application_vnd_openxmlformats_officedocument_spreadsheetml_sheet=655432 | 0x04000000,
application_vnd_openxmlformats_officedocument_wordprocessingml_document=655433 | 0x04000000,
application_vnd_symbian_install=655434,
application_vnd_tcpdump_pcap=655435,
application_vnd_wap_wmlc=655436,
application_vnd_wap_wmlscriptc=655437,
application_vnd_xara=655438,
application_vocaltec_media_desc=655439,
application_vocaltec_media_file=655440,
application_warc=655441,
application_winhelp=655442,
application_wordperfect=655443,
application_wordperfect6_0=655444,
application_wordperfect6_1=655445,
application_x_123=655446,
application_x_7z_compressed=655447 | 0x10000000,
application_x_aim=655448,
application_x_apple_diskimage=655449,
application_x_arc=655450 | 0x10000000,
application_x_archive=655451,
application_x_atari_7800_rom=655452,
application_x_authorware_bin=655453,
application_x_authorware_map=655454,
application_x_authorware_seg=655455,
application_x_avira_qua=655456,
application_x_bcpio=655457,
application_x_bittorrent=655458,
application_x_bsh=655459,
application_x_bytecode_python=655460,
application_x_bzip=655461,
application_x_bzip2=655462 | 0x08000000,
application_x_cbr=655463,
application_x_cbz=655464,
application_x_cdlink=655465,
application_x_chat=655466,
application_x_chrome_extension=655467,
application_x_cocoa=655468,
application_x_conference=655469,
application_x_coredump=655470,
application_x_cpio=655471,
application_x_dbf=655472,
application_x_dbt=655473,
application_x_debian_package=655474,
application_x_deepv=655475,
application_x_director=655476,
application_x_dmp=655477,
application_x_dosdriver=655478,
application_x_dosexec=655479,
application_x_dvi=655480,
application_x_elc=655481,
application_ndjson=655390,
application_netmc=655391,
application_octet_stream=655392,
application_oda=655393,
application_ogg=655394,
application_pdf=655395 | 0x40000000,
application_pgp_keys=655396,
application_pgp_signature=655397,
application_pkcs7_signature=655398,
application_pkix_cert=655399,
application_postscript=655400,
application_pro_eng=655401,
application_ringing_tones=655402,
application_smil=655403,
application_solids=655404,
application_sounder=655405,
application_step=655406,
application_streamingmedia=655407,
application_vda=655408,
application_vnd_amazon_mobi8_ebook=655409 | 0x02000000,
application_vnd_coffeescript=655410,
application_vnd_fdf=655411,
application_vnd_font_fontforge_sfd=655412,
application_vnd_hp_hpgl=655413,
application_vnd_iccprofile=655414,
application_vnd_lotus_1_2_3=655415,
application_vnd_ms_cab_compressed=655416,
application_vnd_ms_excel=655417,
application_vnd_ms_fontobject=655418,
application_vnd_ms_opentype=655419 | 0x20000000,
application_vnd_ms_outlook=655420,
application_vnd_ms_pki_certstore=655421,
application_vnd_ms_pki_pko=655422,
application_vnd_ms_pki_seccat=655423,
application_vnd_ms_powerpoint=655424,
application_vnd_ms_project=655425,
application_vnd_oasis_opendocument_base=655426,
application_vnd_oasis_opendocument_formula=655427,
application_vnd_oasis_opendocument_graphics=655428,
application_vnd_oasis_opendocument_presentation=655429,
application_vnd_oasis_opendocument_spreadsheet=655430,
application_vnd_oasis_opendocument_text=655431,
application_vnd_openxmlformats_officedocument_presentationml_presentation=655432 | 0x04000000,
application_vnd_openxmlformats_officedocument_spreadsheetml_sheet=655433 | 0x04000000,
application_vnd_openxmlformats_officedocument_wordprocessingml_document=655434 | 0x04000000,
application_vnd_symbian_install=655435,
application_vnd_tcpdump_pcap=655436,
application_vnd_wap_wmlc=655437,
application_vnd_wap_wmlscriptc=655438,
application_vnd_xara=655439,
application_vocaltec_media_desc=655440,
application_vocaltec_media_file=655441,
application_warc=655442,
application_winhelp=655443,
application_wordperfect=655444,
application_x_123=655445,
application_x_7z_compressed=655446 | 0x10000000,
application_x_aim=655447,
application_x_apple_diskimage=655448,
application_x_arc=655449 | 0x10000000,
application_x_archive=655450,
application_x_atari_7800_rom=655451,
application_x_authorware_bin=655452,
application_x_authorware_map=655453,
application_x_authorware_seg=655454,
application_x_avira_qua=655455,
application_x_bcpio=655456,
application_x_bittorrent=655457,
application_x_bsh=655458,
application_x_bytecode_python=655459,
application_x_bzip=655460,
application_x_bzip2=655461 | 0x08000000,
application_x_cbr=655462,
application_x_cbz=655463,
application_x_cdlink=655464,
application_x_chat=655465,
application_x_chrome_extension=655466,
application_x_cocoa=655467,
application_x_conference=655468,
application_x_coredump=655469,
application_x_cpio=655470,
application_x_dbf=655471,
application_x_dbt=655472,
application_x_debian_package=655473,
application_x_deepv=655474,
application_x_director=655475,
application_x_dmp=655476,
application_x_dosdriver=655477,
application_x_dosexec=655478,
application_x_dvi=655479,
application_x_elc=655480,
application_x_empty=1,
application_x_envoy=655482,
application_x_esrehber=655483,
application_x_excel=655484,
application_x_executable=655485,
application_x_font_gdos=655486,
application_x_font_pf2=655487,
application_x_font_pfm=655488,
application_x_font_sfn=655489,
application_x_font_ttf=655490 | 0x20000000,
application_x_fptapplication_x_dbt=655491,
application_x_freelance=655492,
application_x_gamecube_rom=655493,
application_x_gdbm=655494,
application_x_gettext_translation=655495,
application_x_git=655496,
application_x_gsp=655497,
application_x_gss=655498,
application_x_gtar=655499,
application_x_gzip=655500,
application_x_hdf=655501,
application_x_helpfile=655502,
application_x_httpd_imap=655503,
application_x_ima=655504,
application_x_innosetup=655505,
application_x_internett_signup=655506,
application_x_inventor=655507,
application_x_ip2=655508,
application_x_java_applet=655509,
application_x_java_commerce=655510,
application_x_java_image=655511,
application_x_java_jmod=655512,
application_x_java_keystore=655513,
application_x_kdelnk=655514,
application_x_koan=655515,
application_x_latex=655516,
application_x_livescreen=655517,
application_x_lotus=655518,
application_x_lz4=655519 | 0x08000000,
application_x_lz4_json=655520,
application_x_lzh=655521,
application_x_lzh_compressed=655522,
application_x_lzip=655523 | 0x08000000,
application_x_lzma=655524 | 0x08000000,
application_x_lzop=655525 | 0x08000000,
application_x_lzx=655526,
application_x_mach_binary=655527,
application_x_mach_executable=655528,
application_x_magic_cap_package_1_0=655529,
application_x_mathcad=655530,
application_x_maxis_dbpf=655531,
application_x_meme=655532,
application_x_midi=655533,
application_x_mif=655534,
application_x_mix_transfer=655535,
application_x_mobipocket_ebook=655536 | 0x02000000,
application_x_ms_compress_szdd=655537,
application_x_ms_pdb=655538,
application_x_ms_reader=655539,
application_x_msaccess=655540,
application_x_n64_rom=655541,
application_x_navi_animation=655542,
application_x_navidoc=655543,
application_x_navimap=655544,
application_x_navistyle=655545,
application_x_nes_rom=655546,
application_x_netcdf=655547,
application_x_newton_compatible_pkg=655548,
application_x_nintendo_ds_rom=655549,
application_x_object=655550,
application_x_omc=655551,
application_x_omcdatamaker=655552,
application_x_omcregerator=655553,
application_x_pagemaker=655554,
application_x_pcl=655555,
application_x_pgp_keyring=655556,
application_x_pixclscript=655557,
application_x_pkcs7_certreqresp=655558,
application_x_pkcs7_signature=655559,
application_x_project=655560,
application_x_qpro=655561,
application_x_rar=655562 | 0x10000000,
application_x_rpm=655563,
application_x_sdp=655564,
application_x_sea=655565,
application_x_seelogo=655566,
application_x_setupscript=655567,
application_x_shar=655568,
application_x_sharedlib=655569,
application_x_shockwave_flash=655570,
application_x_snappy_framed=655571,
application_x_sprite=655572,
application_x_sqlite3=655573,
application_x_stargallery_thm=655574,
application_x_stuffit=655575,
application_x_sv4cpio=655576,
application_x_sv4crc=655577,
application_x_tar=655578 | 0x10000000,
application_x_tbook=655579,
application_x_terminfo=655580,
application_x_terminfo2=655581,
application_x_tex_tfm=655582,
application_x_texinfo=655583,
application_x_ustar=655584,
application_x_visio=655585,
application_x_vnd_audioexplosion_mzz=655586,
application_x_vnd_ls_xpix=655587,
application_x_vrml=655588,
application_x_wais_source=655589,
application_x_wine_extension_ini=655590,
application_x_wintalk=655591,
application_x_world=655592,
application_x_wri=655593,
application_x_x509_ca_cert=655594,
application_x_xz=655595 | 0x08000000,
application_x_zip=655596,
application_x_zstd=655597 | 0x08000000,
application_x_zstd_dictionary=655598,
application_xml=655599,
application_zip=655600 | 0x10000000,
application_zlib=655601,
audio_basic=458994 | 0x80000000,
audio_it=458995,
audio_make=458996,
audio_mid=458997,
audio_midi=458998,
audio_mp4=458999,
audio_mpeg=459000,
audio_ogg=459001,
audio_s3m=459002,
audio_tsp_audio=459003,
audio_tsplayer=459004,
audio_vnd_qcelp=459005,
audio_voxware=459006,
audio_x_aiff=459007,
audio_x_flac=459008,
audio_x_gsm=459009,
audio_x_hx_aac_adts=459010,
audio_x_jam=459011,
audio_x_liveaudio=459012,
audio_x_m4a=459013,
audio_x_midi=459014,
audio_x_mod=459015,
audio_x_mp4a_latm=459016,
audio_x_mpeg_3=459017,
audio_x_mpequrl=459018,
audio_x_nspaudio=459019,
audio_x_pn_realaudio=459020,
audio_x_psid=459021,
audio_x_realaudio=459022,
audio_x_s3m=459023,
audio_x_twinvq=459024,
audio_x_twinvq_plugin=459025,
audio_x_voc=459026,
audio_x_wav=459027,
audio_x_xbox_executable=459028 | 0x80000000,
audio_x_xbox360_executable=459029 | 0x80000000,
audio_xm=459030,
font_otf=327959 | 0x20000000,
font_sfnt=327960 | 0x20000000,
font_woff=327961 | 0x20000000,
font_woff2=327962 | 0x20000000,
image_bmp=524571,
image_cmu_raster=524572,
image_fif=524573,
image_florian=524574,
image_g3fax=524575,
image_gif=524576,
image_heic=524577,
image_ief=524578,
image_jpeg=524579,
image_jutvision=524580,
image_naplps=524581,
image_pict=524582,
image_png=524583,
image_svg=524584 | 0x80000000,
image_svg_xml=524585 | 0x80000000,
image_tiff=524586,
image_vnd_adobe_photoshop=524587 | 0x80000000,
image_vnd_djvu=524588 | 0x80000000,
image_vnd_fpx=524589,
image_vnd_microsoft_icon=524590,
image_vnd_rn_realflash=524591,
image_vnd_rn_realpix=524592,
image_vnd_wap_wbmp=524593,
image_vnd_xiff=524594,
image_webp=524595,
image_wmf=524596,
image_x_3ds=524597,
image_x_adobe_dng=524598 | 0x00800000,
image_x_award_bioslogo=524599,
image_x_canon_cr2=524600 | 0x00800000,
image_x_canon_crw=524601 | 0x00800000,
image_x_cmu_raster=524602,
image_x_cur=524603,
image_x_dcraw=524604 | 0x00800000,
image_x_dwg=524605,
image_x_eps=524606,
image_x_epson_erf=524607 | 0x00800000,
image_x_exr=524608,
image_x_fuji_raf=524609 | 0x00800000,
image_x_gem=524610,
image_x_icns=524611,
image_x_icon=524612 | 0x80000000,
image_x_jg=524613,
image_x_jps=524614,
image_x_kodak_dcr=524615 | 0x00800000,
image_x_kodak_k25=524616 | 0x00800000,
image_x_kodak_kdc=524617 | 0x00800000,
image_x_minolta_mrw=524618 | 0x00800000,
image_x_ms_bmp=524619,
image_x_niff=524620,
image_x_nikon_nef=524621 | 0x00800000,
image_x_olympus_orf=524622 | 0x00800000,
image_x_panasonic_raw=524623 | 0x00800000,
image_x_pcx=524624,
image_x_pentax_pef=524625 | 0x00800000,
image_x_pict=524626,
image_x_portable_bitmap=524627,
image_x_portable_graymap=524628,
image_x_portable_pixmap=524629,
image_x_quicktime=524630,
image_x_rgb=524631,
image_x_sigma_x3f=524632 | 0x00800000,
image_x_sony_arw=524633 | 0x00800000,
image_x_sony_sr2=524634 | 0x00800000,
image_x_sony_srf=524635 | 0x00800000,
image_x_tga=524636,
image_x_tiff=524637,
image_x_win_bitmap=524638,
image_x_xcf=524639 | 0x80000000,
image_x_xpixmap=524640 | 0x80000000,
image_x_xwindowdump=524641,
message_news=196962,
message_rfc822=196963,
model_vnd_dwf=65892,
model_vnd_gdl=65893,
model_vnd_gs_gdl=65894,
model_vrml=65895,
model_x_pov=65896,
application_x_envoy=655481,
application_x_esrehber=655482,
application_x_excel=655483,
application_x_executable=655484,
application_x_font_gdos=655485,
application_x_font_pf2=655486,
application_x_font_pfm=655487,
application_x_font_sfn=655488,
application_x_font_ttf=655489 | 0x20000000,
application_x_fptapplication_x_dbt=655490,
application_x_freelance=655491,
application_x_gamecube_rom=655492,
application_x_gdbm=655493,
application_x_gettext_translation=655494,
application_x_git=655495,
application_x_gsp=655496,
application_x_gss=655497,
application_x_gtar=655498,
application_x_gzip=655499,
application_x_hdf=655500,
application_x_helpfile=655501,
application_x_httpd_imap=655502,
application_x_ima=655503,
application_x_innosetup=655504,
application_x_internett_signup=655505,
application_x_inventor=655506,
application_x_ip2=655507,
application_x_java_applet=655508,
application_x_java_commerce=655509,
application_x_java_image=655510,
application_x_java_jmod=655511,
application_x_java_keystore=655512,
application_x_kdelnk=655513,
application_x_koan=655514,
application_x_latex=655515,
application_x_livescreen=655516,
application_x_lotus=655517,
application_x_lz4=655518 | 0x08000000,
application_x_lz4_json=655519,
application_x_lzh=655520,
application_x_lzh_compressed=655521,
application_x_lzip=655522 | 0x08000000,
application_x_lzma=655523 | 0x08000000,
application_x_lzop=655524 | 0x08000000,
application_x_lzx=655525,
application_x_mach_binary=655526,
application_x_mach_executable=655527,
application_x_magic_cap_package_1_0=655528,
application_x_mathcad=655529,
application_x_maxis_dbpf=655530,
application_x_meme=655531,
application_x_midi=655532,
application_x_mif=655533,
application_x_mix_transfer=655534,
application_x_mobipocket_ebook=655535 | 0x02000000,
application_x_ms_compress_szdd=655536,
application_x_ms_pdb=655537,
application_x_ms_reader=655538,
application_x_msaccess=655539,
application_x_n64_rom=655540,
application_x_navi_animation=655541,
application_x_navidoc=655542,
application_x_navimap=655543,
application_x_navistyle=655544,
application_x_nes_rom=655545,
application_x_netcdf=655546,
application_x_newton_compatible_pkg=655547,
application_x_nintendo_ds_rom=655548,
application_x_object=655549,
application_x_omc=655550,
application_x_omcdatamaker=655551,
application_x_omcregerator=655552,
application_x_pagemaker=655553,
application_x_pcl=655554,
application_x_pgp_keyring=655555,
application_x_pixclscript=655556,
application_x_pkcs7_certreqresp=655557,
application_x_pkcs7_signature=655558,
application_x_project=655559,
application_x_qpro=655560,
application_x_rar=655561 | 0x10000000,
application_x_rpm=655562,
application_x_sdp=655563,
application_x_sea=655564,
application_x_seelogo=655565,
application_x_setupscript=655566,
application_x_shar=655567,
application_x_sharedlib=655568,
application_x_shockwave_flash=655569,
application_x_snappy_framed=655570,
application_x_sprite=655571,
application_x_sqlite3=655572,
application_x_stargallery_thm=655573,
application_x_stuffit=655574,
application_x_sv4cpio=655575,
application_x_sv4crc=655576,
application_x_tar=655577 | 0x10000000,
application_x_tbook=655578,
application_x_terminfo=655579,
application_x_terminfo2=655580,
application_x_tex_tfm=655581,
application_x_texinfo=655582,
application_x_ustar=655583,
application_x_visio=655584,
application_x_vnd_audioexplosion_mzz=655585,
application_x_vnd_ls_xpix=655586,
application_x_vrml=655587,
application_x_wais_source=655588,
application_x_wine_extension_ini=655589,
application_x_wintalk=655590,
application_x_world=655591,
application_x_wri=655592,
application_x_x509_ca_cert=655593,
application_x_xz=655594 | 0x08000000,
application_x_zip=655595,
application_x_zstd=655596 | 0x08000000,
application_x_zstd_dictionary=655597,
application_xml=655598,
application_zip=655599 | 0x10000000,
application_zlib=655600,
audio_basic=458993 | 0x80000000,
audio_it=458994,
audio_make=458995,
audio_mid=458996,
audio_midi=458997,
audio_mp4=458998,
audio_mpeg=458999,
audio_ogg=459000,
audio_s3m=459001,
audio_tsp_audio=459002,
audio_tsplayer=459003,
audio_vnd_qcelp=459004,
audio_voxware=459005,
audio_x_aiff=459006,
audio_x_flac=459007,
audio_x_gsm=459008,
audio_x_hx_aac_adts=459009,
audio_x_jam=459010,
audio_x_liveaudio=459011,
audio_x_m4a=459012,
audio_x_midi=459013,
audio_x_mod=459014,
audio_x_mp4a_latm=459015,
audio_x_mpeg_3=459016,
audio_x_mpequrl=459017,
audio_x_nspaudio=459018,
audio_x_pn_realaudio=459019,
audio_x_psid=459020,
audio_x_realaudio=459021,
audio_x_s3m=459022,
audio_x_twinvq=459023,
audio_x_twinvq_plugin=459024,
audio_x_voc=459025,
audio_x_wav=459026,
audio_x_xbox_executable=459027 | 0x80000000,
audio_x_xbox360_executable=459028 | 0x80000000,
audio_xm=459029,
font_otf=327958 | 0x20000000,
font_sfnt=327959 | 0x20000000,
font_woff=327960 | 0x20000000,
font_woff2=327961 | 0x20000000,
image_bmp=524570,
image_cmu_raster=524571,
image_fif=524572,
image_florian=524573,
image_g3fax=524574,
image_gif=524575,
image_heic=524576,
image_ief=524577,
image_jpeg=524578,
image_jutvision=524579,
image_naplps=524580,
image_pict=524581,
image_png=524582,
image_svg=524583 | 0x80000000,
image_svg_xml=524584 | 0x80000000,
image_tiff=524585,
image_vnd_adobe_photoshop=524586 | 0x80000000,
image_vnd_djvu=524587 | 0x80000000,
image_vnd_fpx=524588,
image_vnd_microsoft_icon=524589,
image_vnd_rn_realflash=524590,
image_vnd_rn_realpix=524591,
image_vnd_wap_wbmp=524592,
image_vnd_xiff=524593,
image_webp=524594,
image_wmf=524595,
image_x_3ds=524596,
image_x_adobe_dng=524597 | 0x00800000,
image_x_award_bioslogo=524598,
image_x_canon_cr2=524599 | 0x00800000,
image_x_canon_crw=524600 | 0x00800000,
image_x_cmu_raster=524601,
image_x_cur=524602,
image_x_dcraw=524603 | 0x00800000,
image_x_dwg=524604,
image_x_eps=524605,
image_x_epson_erf=524606 | 0x00800000,
image_x_exr=524607,
image_x_fuji_raf=524608 | 0x00800000,
image_x_gem=524609,
image_x_icns=524610,
image_x_icon=524611 | 0x80000000,
image_x_jg=524612,
image_x_jps=524613,
image_x_kodak_dcr=524614 | 0x00800000,
image_x_kodak_k25=524615 | 0x00800000,
image_x_kodak_kdc=524616 | 0x00800000,
image_x_minolta_mrw=524617 | 0x00800000,
image_x_ms_bmp=524618,
image_x_niff=524619,
image_x_nikon_nef=524620 | 0x00800000,
image_x_olympus_orf=524621 | 0x00800000,
image_x_panasonic_raw=524622 | 0x00800000,
image_x_pcx=524623,
image_x_pentax_pef=524624 | 0x00800000,
image_x_pict=524625,
image_x_portable_bitmap=524626,
image_x_portable_graymap=524627,
image_x_portable_pixmap=524628,
image_x_quicktime=524629,
image_x_rgb=524630,
image_x_sigma_x3f=524631 | 0x00800000,
image_x_sony_arw=524632 | 0x00800000,
image_x_sony_sr2=524633 | 0x00800000,
image_x_sony_srf=524634 | 0x00800000,
image_x_tga=524635,
image_x_tiff=524636,
image_x_win_bitmap=524637,
image_x_xcf=524638 | 0x80000000,
image_x_xpixmap=524639 | 0x80000000,
image_x_xwindowdump=524640,
message_news=196961,
message_rfc822=196962,
model_vnd_dwf=65891,
model_vnd_gdl=65892,
model_vnd_gs_gdl=65893,
model_vrml=65894,
model_x_pov=65895,
sist2_sidecar=2,
text_PGP=590185,
text_asp=590186,
text_css=590187,
text_html=590188 | 0x01000000,
text_javascript=590189,
text_mcf=590190,
text_pascal=590191,
text_plain=590192,
text_richtext=590193,
text_rtf=590194,
text_scriplet=590195,
text_tab_separated_values=590196,
text_troff=590197,
text_uri_list=590198,
text_vnd_abc=590199,
text_vnd_fmi_flexstor=590200,
text_vnd_wap_wml=590201,
text_vnd_wap_wmlscript=590202,
text_webviewhtml=590203,
text_x_Algol68=590204,
text_x_asm=590205,
text_x_audiosoft_intra=590206,
text_x_awk=590207,
text_x_bcpl=590208,
text_x_c=590209,
text_x_c__=590210,
text_x_component=590211,
text_x_diff=590212,
text_x_fortran=590213,
text_x_java=590214,
text_x_la_asf=590215,
text_x_lisp=590216,
text_x_m=590217,
text_x_m4=590218,
text_x_makefile=590219,
text_x_ms_regedit=590220,
text_x_msdos_batch=590221,
text_x_objective_c=590222,
text_x_pascal=590223,
text_x_perl=590224,
text_x_php=590225,
text_x_po=590226,
text_x_python=590227,
text_x_ruby=590228,
text_x_sass=590229,
text_x_scss=590230,
text_x_server_parsed_html=590231,
text_x_setext=590232,
text_x_sgml=590233 | 0x01000000,
text_x_shellscript=590234,
text_x_speech=590235,
text_x_tcl=590236,
text_x_tex=590237,
text_x_uil=590238,
text_x_uuencode=590239,
text_x_vcalendar=590240,
text_x_vcard=590241,
text_xml=590242 | 0x01000000,
video_MP2T=393635,
video_animaflex=393636,
video_avi=393637,
video_avs_video=393638,
video_mp4=393639,
video_mpeg=393640,
video_quicktime=393641,
video_vdo=393642,
video_vivo=393643,
video_vnd_rn_realvideo=393644,
video_vosaic=393645,
video_webm=393646,
video_x_amt_demorun=393647,
video_x_amt_showrun=393648,
video_x_atomic3d_feature=393649,
video_x_dl=393650,
video_x_dv=393651,
video_x_fli=393652,
video_x_flv=393653,
video_x_isvideo=393654,
video_x_jng=393655 | 0x80000000,
video_x_m4v=393656,
video_x_matroska=393657,
video_x_mng=393658,
video_x_motion_jpeg=393659,
video_x_ms_asf=393660,
video_x_msvideo=393661,
video_x_qtc=393662,
video_x_sgi_movie=393663,
x_epoc_x_sisx_app=721344,
text_PGP=590184,
text_asp=590185,
text_css=590186,
text_html=590187 | 0x01000000,
text_javascript=590188,
text_mcf=590189,
text_pascal=590190,
text_plain=590191,
text_richtext=590192,
text_rtf=590193,
text_scriplet=590194,
text_tab_separated_values=590195,
text_troff=590196,
text_uri_list=590197,
text_vnd_abc=590198,
text_vnd_fmi_flexstor=590199,
text_vnd_wap_wml=590200,
text_vnd_wap_wmlscript=590201,
text_webviewhtml=590202,
text_x_Algol68=590203,
text_x_asm=590204,
text_x_audiosoft_intra=590205,
text_x_awk=590206,
text_x_bcpl=590207,
text_x_c=590208,
text_x_c__=590209,
text_x_component=590210,
text_x_diff=590211,
text_x_fortran=590212,
text_x_java=590213,
text_x_la_asf=590214,
text_x_lisp=590215,
text_x_m=590216,
text_x_m4=590217,
text_x_makefile=590218,
text_x_ms_regedit=590219,
text_x_msdos_batch=590220,
text_x_objective_c=590221,
text_x_pascal=590222,
text_x_perl=590223,
text_x_php=590224,
text_x_po=590225,
text_x_python=590226,
text_x_ruby=590227,
text_x_sass=590228,
text_x_scss=590229,
text_x_server_parsed_html=590230,
text_x_setext=590231,
text_x_sgml=590232 | 0x01000000,
text_x_shellscript=590233,
text_x_speech=590234,
text_x_tcl=590235,
text_x_tex=590236,
text_x_uil=590237,
text_x_uuencode=590238,
text_x_vcalendar=590239,
text_x_vcard=590240,
text_xml=590241 | 0x01000000,
video_MP2T=393634,
video_animaflex=393635,
video_avi=393636,
video_avs_video=393637,
video_mp4=393638,
video_mpeg=393639,
video_quicktime=393640,
video_vdo=393641,
video_vivo=393642,
video_vnd_rn_realvideo=393643,
video_vosaic=393644,
video_webm=393645,
video_x_amt_demorun=393646,
video_x_amt_showrun=393647,
video_x_atomic3d_feature=393648,
video_x_dl=393649,
video_x_dv=393650,
video_x_fli=393651,
video_x_flv=393652,
video_x_isvideo=393653,
video_x_jng=393654 | 0x80000000,
video_x_m4v=393655,
video_x_matroska=393656,
video_x_mng=393657,
video_x_motion_jpeg=393658,
video_x_ms_asf=393659,
video_x_msvideo=393660,
video_x_qtc=393661,
video_x_sgi_movie=393662,
x_epoc_x_sisx_app=721343,
};
char *mime_get_mime_text(unsigned int mime_id) {switch (mime_id) {
case application_arj: return "application/arj";
@@ -482,6 +481,7 @@ case application_java_archive: return "application/java-archive";
case application_java: return "application/java";
case application_javascript: return "application/javascript";
case application_json: return "application/json";
case application_ndjson: return "application/ndjson";
case application_marc: return "application/marc";
case application_mbedlet: return "application/mbedlet";
case application_mime: return "application/mime";
@@ -537,8 +537,6 @@ case application_vocaltec_media_desc: return "application/vocaltec-media-desc";
case application_vocaltec_media_file: return "application/vocaltec-media-file";
case application_warc: return "application/warc";
case application_winhelp: return "application/winhelp";
case application_wordperfect6_0: return "application/wordperfect6.0";
case application_wordperfect6_1: return "application/wordperfect6.1";
case application_wordperfect: return "application/wordperfect";
case application_x_123: return "application/x-123";
case application_x_7z_compressed: return "application/x-7z-compressed";
@@ -934,6 +932,8 @@ g_hash_table_insert(ext_table, "inf", (gpointer)application_inf);
g_hash_table_insert(ext_table, "jar", (gpointer)application_java_archive);
g_hash_table_insert(ext_table, "class", (gpointer)application_java);
g_hash_table_insert(ext_table, "json", (gpointer)application_json);
g_hash_table_insert(ext_table, "jsonl", (gpointer)application_ndjson);
g_hash_table_insert(ext_table, "ndjson", (gpointer)application_ndjson);
g_hash_table_insert(ext_table, "mrc", (gpointer)application_marc);
g_hash_table_insert(ext_table, "mbd", (gpointer)application_mbedlet);
g_hash_table_insert(ext_table, "aps", (gpointer)application_mime);
@@ -1008,12 +1008,12 @@ g_hash_table_insert(ext_table, "vmd", (gpointer)application_vocaltec_media_desc)
g_hash_table_insert(ext_table, "vmf", (gpointer)application_vocaltec_media_file);
g_hash_table_insert(ext_table, "warc", (gpointer)application_warc);
g_hash_table_insert(ext_table, "hlp", (gpointer)application_winhelp);
g_hash_table_insert(ext_table, "w60", (gpointer)application_wordperfect6_0);
g_hash_table_insert(ext_table, "w61", (gpointer)application_wordperfect6_1);
g_hash_table_insert(ext_table, "wp", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "wp5", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "wp6", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "wpd", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "w60", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "w61", (gpointer)application_wordperfect);
g_hash_table_insert(ext_table, "wk1", (gpointer)application_x_123);
g_hash_table_insert(ext_table, "7z", (gpointer)application_x_7z_compressed);
g_hash_table_insert(ext_table, "aim", (gpointer)application_x_aim);
@@ -1478,6 +1478,7 @@ g_hash_table_insert(mime_table, "application/java-archive", (gpointer)applicatio
g_hash_table_insert(mime_table, "application/java", (gpointer)application_java);
g_hash_table_insert(mime_table, "application/javascript", (gpointer)application_javascript);
g_hash_table_insert(mime_table, "application/json", (gpointer)application_json);
g_hash_table_insert(mime_table, "application/ndjson", (gpointer)application_ndjson);
g_hash_table_insert(mime_table, "application/marc", (gpointer)application_marc);
g_hash_table_insert(mime_table, "application/mbedlet", (gpointer)application_mbedlet);
g_hash_table_insert(mime_table, "application/mime", (gpointer)application_mime);
@@ -1533,8 +1534,6 @@ g_hash_table_insert(mime_table, "application/vocaltec-media-desc", (gpointer)app
g_hash_table_insert(mime_table, "application/vocaltec-media-file", (gpointer)application_vocaltec_media_file);
g_hash_table_insert(mime_table, "application/warc", (gpointer)application_warc);
g_hash_table_insert(mime_table, "application/winhelp", (gpointer)application_winhelp);
g_hash_table_insert(mime_table, "application/wordperfect6.0", (gpointer)application_wordperfect6_0);
g_hash_table_insert(mime_table, "application/wordperfect6.1", (gpointer)application_wordperfect6_1);
g_hash_table_insert(mime_table, "application/wordperfect", (gpointer)application_wordperfect);
g_hash_table_insert(mime_table, "application/x-123", (gpointer)application_x_123);
g_hash_table_insert(mime_table, "application/x-7z-compressed", (gpointer)application_x_7z_compressed);
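
The enum values in this file appear to pack two things into one integer: a major type in bits 16 and up (8 for `image_*`, 9 for `text_*`, 6 for `video_*`), and flags OR-ed into the top bits (for instance `0x00800000` on the camera raw formats). Below is a minimal sketch of how such an ID could be decoded, assuming the major type fits in bits 16 to 22. The macro and function names and both masks are guesses inferred from the listed values, not taken from the sist2 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks, inferred from the enum values in the listing. */
#define SKETCH_MAJOR_MASK 0x007F0000u /* bits 16..22: major type   */
#define SKETCH_FLAG_MASK  0xFF800000u /* bits 23..31: OR-ed flags  */

/* Returns the major type number (e.g. 8 for image_*, 9 for text_*). */
static unsigned int sketch_major(uint32_t mime_id) {
    return (mime_id & SKETCH_MAJOR_MASK) >> 16;
}

/* Returns non-zero when every bit of `flag` is set in `mime_id`. */
static int sketch_has_flag(uint32_t mime_id, uint32_t flag) {
    assert(flag != 0 && (flag & ~SKETCH_FLAG_MASK) == 0);
    return (mime_id & flag) == flag;
}
```

With these helpers, `image_jpeg` (524578) and `image_x_canon_cr2` (524599 | 0x00800000) both decode to major type 8, and only the second carries the `0x00800000` flag.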


@@ -5,30 +5,40 @@
#include "mime.h"
#include "src/io/serialize.h"
#include "src/parsing/sidecar.h"
#include "src/magic_generated.c"
#include <magic.h>
#define MIN_VIDEO_SIZE 1024 * 64
#define MIN_IMAGE_SIZE 1024 * 2
#define MIN_VIDEO_SIZE (1024 * 64)
#define MIN_IMAGE_SIZE (512)
int fs_read(struct vfile *f, void *buf, size_t size) {
if (f->fd == -1) {
SHA1_Init(&f->sha1_ctx);
f->fd = open(f->filepath, O_RDONLY);
if (f->fd == -1) {
LOG_ERRORF(f->filepath, "open(): [%d] %s", errno, strerror(errno))
return -1;
}
}
return read(f->fd, buf, size);
int ret = (int) read(f->fd, buf, size);
if (ret != 0 && f->calculate_checksum) {
f->has_checksum = TRUE;
safe_sha1_update(&f->sha1_ctx, (unsigned char *) buf, ret);
}
return ret;
}
#define CLOSE_FILE(f) if ((f).close != NULL) {(f).close(&(f));};
void fs_close(struct vfile *f) {
if (f->fd != -1) {
SHA1_Final(f->sha1_digest, &f->sha1_ctx);
close(f->fd);
}
}
@@ -60,31 +70,41 @@ void parse(void *arg) {
doc->base = (short) job->base;
char *rel_path = doc->filepath + ScanCtx.index.desc.root_len;
MD5((unsigned char *) rel_path, strlen(rel_path), doc->path_md5);
generate_doc_id(rel_path, doc->doc_id);
doc->meta_head = NULL;
doc->meta_tail = NULL;
doc->mime = 0;
doc->size = job->vfile.info.st_size;
doc->mtime = job->vfile.info.st_mtim.tv_sec;
doc->mtime = (int) job->vfile.info.st_mtim.tv_sec;
int inc_ts = incremental_get(ScanCtx.original_table, doc->path_md5);
int inc_ts = incremental_get(ScanCtx.original_table, doc->doc_id);
if (inc_ts != 0 && inc_ts == job->vfile.info.st_mtim.tv_sec) {
incremental_mark_file_for_copy(ScanCtx.copy_table, doc->path_md5);
pthread_mutex_lock(&ScanCtx.copy_table_mu);
incremental_mark_file(ScanCtx.copy_table, doc->doc_id);
pthread_mutex_unlock(&ScanCtx.copy_table_mu);
pthread_mutex_lock(&ScanCtx.dbg_file_counts_mu);
ScanCtx.dbg_skipped_files_count += 1;
pthread_mutex_unlock(&ScanCtx.dbg_file_counts_mu);
CLOSE_FILE(job->vfile)
free(doc->filepath);
free(doc);
return;
}
if (ScanCtx.new_table != NULL) {
pthread_mutex_lock(&ScanCtx.copy_table_mu);
incremental_mark_file(ScanCtx.new_table, doc->doc_id);
pthread_mutex_unlock(&ScanCtx.copy_table_mu);
}
char *buf[MAGIC_BUF_SIZE];
if (LogCtx.very_verbose) {
char path_md5_str[MD5_STR_LENGTH];
buf2hex(doc->path_md5, MD5_DIGEST_LENGTH, path_md5_str);
LOG_DEBUGF(job->filepath, "Starting parse job {%s}", path_md5_str)
LOG_DEBUGF(job->filepath, "Starting parse job {%s}", doc->doc_id)
}
if (job->vfile.info.st_size == 0) {
@@ -93,18 +113,17 @@ void parse(void *arg) {
doc->mime = mime_get_mime_by_ext(ScanCtx.ext_table, job->filepath + job->ext);
}
int bytes_read = 0;
if (doc->mime == 0 && !ScanCtx.fast) {
// Get mime type with libmagic
if (!job->vfile.is_fs_file) {
if (job->vfile.read_rewindable == NULL) {
LOG_WARNING(job->filepath,
"Guessing mime type with libmagic inside archive files is not currently supported");
"File does not support rewindable reads, cannot guess Media type");
goto abort;
}
bytes_read = job->vfile.read(&job->vfile, buf, MAGIC_BUF_SIZE);
int bytes_read = job->vfile.read_rewindable(&job->vfile, buf, MAGIC_BUF_SIZE);
if (bytes_read < 0) {
if (job->vfile.is_fs_file) {
@@ -113,16 +132,27 @@ void parse(void *arg) {
LOG_ERRORF(job->filepath, "(virtual) read(): [%d] %s", bytes_read, archive_error_string(job->vfile.arc))
}
CLOSE_FILE(job->vfile)
pthread_mutex_lock(&ScanCtx.dbg_file_counts_mu);
ScanCtx.dbg_failed_files_count += 1;
pthread_mutex_unlock(&ScanCtx.dbg_file_counts_mu);
CLOSE_FILE(job->vfile)
free(doc->filepath);
free(doc);
return;
}
magic_t magic = magic_open(MAGIC_MIME_TYPE);
magic_load(magic, NULL);
const char *magic_buffers[1] = {magic_database_buffer,};
size_t sizes[1] = {sizeof(magic_database_buffer),};
int load_ret = magic_load_buffers(magic, (void **) &magic_buffers, sizes, 1);
if (load_ret != 0) {
LOG_FATALF("parse.c", "Could not load libmagic database: (%d)", load_ret)
}
const char *magic_mime_str = magic_buffer(magic, buf, bytes_read);
if (magic_mime_str != NULL) {
@@ -135,7 +165,9 @@ void parse(void *arg) {
}
}
job->vfile.reset(&job->vfile);
if (job->vfile.reset != NULL) {
job->vfile.reset(&job->vfile);
}
magic_close(magic);
}
@@ -149,7 +181,7 @@ void parse(void *arg) {
} else if ((mmime == MimeVideo && doc->size >= MIN_VIDEO_SIZE) ||
(mmime == MimeImage && doc->size >= MIN_IMAGE_SIZE) || mmime == MimeAudio) {
parse_media(&ScanCtx.media_ctx, &job->vfile, doc);
parse_media(&ScanCtx.media_ctx, &job->vfile, doc, mime_get_mime_text(doc->mime));
} else if (IS_PDF(doc->mime)) {
parse_ebook(&ScanCtx.ebook_ctx, &job->vfile, mime_get_mime_text(doc->mime), doc);
@@ -169,7 +201,7 @@ void parse(void *arg) {
IS_ARC(doc->mime) ||
(IS_ARC_FILTER(doc->mime) && should_parse_filtered_file(doc->filepath, doc->ext))
)) {
parse_archive(&ScanCtx.arc_ctx, &job->vfile, doc);
parse_archive(&ScanCtx.arc_ctx, &job->vfile, doc, ScanCtx.exclude, ScanCtx.exclude_extra);
} else if ((ScanCtx.ooxml_ctx.content_size > 0 || ScanCtx.media_ctx.tn_size > 0) && IS_DOC(doc->mime)) {
parse_ooxml(&ScanCtx.ooxml_ctx, &job->vfile, doc);
} else if (is_cbr(&ScanCtx.comic_ctx, doc->mime) || is_cbz(&ScanCtx.comic_ctx, doc->mime)) {
@@ -179,18 +211,24 @@ void parse(void *arg) {
} else if (doc->mime == MIME_SIST2_SIDECAR) {
parse_sidecar(&job->vfile, doc);
CLOSE_FILE(job->vfile)
free(doc->filepath);
free(doc);
return;
} else if (is_msdoc(&ScanCtx.msdoc_ctx, doc->mime)) {
parse_msdoc(&ScanCtx.msdoc_ctx, &job->vfile, doc);
} else if (is_json(&ScanCtx.json_ctx, doc->mime)) {
parse_json(&ScanCtx.json_ctx, &job->vfile, doc);
} else if (is_ndjson(&ScanCtx.json_ctx, doc->mime)) {
parse_ndjson(&ScanCtx.json_ctx, &job->vfile, doc);
}
abort:
//Parent meta
if (!md5_digest_is_null(job->parent)) {
meta_line_t *meta_parent = malloc(sizeof(meta_line_t) + MD5_STR_LENGTH);
if (job->parent[0] != '\0') {
meta_line_t *meta_parent = malloc(sizeof(meta_line_t) + SIST_INDEX_ID_LEN);
meta_parent->key = MetaParent;
buf2hex(job->parent, MD5_DIGEST_LENGTH, meta_parent->str_val);
strcpy(meta_parent->str_val, job->parent);
APPEND_META((doc), meta_parent)
doc->has_parent = TRUE;
@@ -198,9 +236,15 @@ void parse(void *arg) {
doc->has_parent = FALSE;
}
write_document(doc);
CLOSE_FILE(job->vfile)
if (job->vfile.has_checksum) {
char sha1_digest_str[SHA1_STR_LENGTH];
buf2hex((unsigned char *) job->vfile.sha1_digest, SHA1_DIGEST_LENGTH, (char *) sha1_digest_str);
APPEND_STR_META(doc, MetaChecksum, (const char *) sha1_digest_str);
}
write_document(doc);
}
void cleanup_parse() {
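
The new `fs_read` feeds every chunk it reads into a running SHA-1 context, so the checksum comes for free while the parsers consume the file. The pattern can be sketched without OpenSSL by swapping in 32-bit FNV-1a as a stand-in hash; the struct and function names below are hypothetical. Note that the `ret != 0` check in the diff is also true when `read()` returns -1. The sketch uses `ret > 0` instead, on the assumption that a failed read should not feed the hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the vfile fields used by fs_read. */
typedef struct {
    int fd;
    int calculate_checksum;
    int has_checksum;
    uint32_t checksum; /* FNV-1a state; the real code keeps a SHA-1 context */
} hashed_file_t;

static void hashed_file_init(hashed_file_t *f, int fd, int calculate_checksum) {
    assert(f != NULL);
    f->fd = fd;
    f->calculate_checksum = calculate_checksum;
    f->has_checksum = 0;
    f->checksum = 2166136261u; /* FNV-1a 32-bit offset basis */
}

/* Folds `len` bytes into the running FNV-1a state. */
static void fnv1a_update(uint32_t *state, const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        *state ^= buf[i];
        *state *= 16777619u; /* FNV-1a 32-bit prime */
    }
}

/* Reads up to `size` bytes and hashes only the bytes actually read. */
static int hashed_read(hashed_file_t *f, void *buf, size_t size) {
    int ret = (int) read(f->fd, buf, size);
    if (ret > 0 && f->calculate_checksum) {
        f->has_checksum = 1;
        fnv1a_update(&f->checksum, (const unsigned char *) buf, (size_t) ret);
    }
    return ret;
}
```

Because the hash is updated per chunk, the digest is only meaningful once the whole file has been read sequentially; a parser that seeks or stops early would produce a checksum of a partial stream.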


@@ -3,7 +3,7 @@
#include "../sist.h"
#define MAGIC_BUF_SIZE 4096 * 6
#define MAGIC_BUF_SIZE (4096 * 6)
int fs_read(struct vfile *f, void *buf, size_t size);
void fs_close(struct vfile *f);
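
Several hunks in this comparison wrap size macros in parentheses (`MAGIC_BUF_SIZE`, `MIN_VIDEO_SIZE`). A small sketch of why that matters: without parentheses the macro text is pasted into the surrounding expression, and operator precedence can change the result. The macro and function names here are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE_UNSAFE 4096 * 6   /* expands textually, no grouping */
#define BUF_SIZE_SAFE   (4096 * 6) /* always evaluates as one value  */

/* Expands to total / 4096 * 6, i.e. (total / 4096) * 6. */
static size_t chunks_unsafe(size_t total) {
    return total / BUF_SIZE_UNSAFE;
}

/* Expands to total / (4096 * 6), which is what the name suggests. */
static size_t chunks_safe(size_t total) {
    assert(BUF_SIZE_SAFE == 24576);
    return total / BUF_SIZE_SAFE;
}
```

For `total = 100000` the unparenthesized form yields 144 while the parenthesized form yields 4. Declarations such as `char *buf[MAGIC_BUF_SIZE]` happen to work either way, which is why this kind of bug can stay hidden for a long time.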


@@ -23,13 +23,19 @@ void parse_sidecar(vfile_t *vfile, document_t *doc) {
}
char *json_str = cJSON_PrintUnformatted(json);
unsigned char path_md5[MD5_DIGEST_LENGTH];
MD5((unsigned char *) vfile->filepath + ScanCtx.index.desc.root_len, doc->ext - 1 - ScanCtx.index.desc.root_len,
path_md5);
char assoc_doc_id[SIST_DOC_ID_LEN];
store_write(ScanCtx.index.meta_store, (char *) path_md5, sizeof(path_md5), json_str, strlen(json_str) + 1);
char rel_path[PATH_MAX];
size_t rel_path_len = doc->ext - 1 - ScanCtx.index.desc.root_len;
memcpy(rel_path, vfile->filepath + ScanCtx.index.desc.root_len, rel_path_len);
*(rel_path + rel_path_len) = '\0';
generate_doc_id(rel_path, assoc_doc_id);
store_write(ScanCtx.index.meta_store, assoc_doc_id, sizeof(assoc_doc_id), json_str,
strlen(json_str) + 1);
cJSON_Delete(json);
free(json_str);
free(buf);
}
}
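
The sidecar change derives the ID of the associated document from the sidecar's own path: strip the index root from the front, then strip the dot and extension from the end. The diff copies into a `PATH_MAX` buffer without checking the length. Below is a sketch of the same slicing with explicit bounds checks; the function name and the `.s2meta` example filename are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies filepath[root_len .. ext_offset - 1) into `out`, where ext_offset
 * is the index of the first character after the final dot.
 * Returns 0 on success, -1 if the offsets are inconsistent or `out` is too small.
 */
static int sidecar_target_rel_path(const char *filepath, size_t root_len,
                                   size_t ext_offset, char *out, size_t out_size) {
    assert(filepath != NULL && out != NULL);

    size_t path_len = strlen(filepath);
    if (ext_offset > path_len || ext_offset < root_len + 1) {
        return -1;
    }

    size_t rel_len = ext_offset - 1 - root_len; /* drop root and ".ext" */
    if (rel_len + 1 > out_size) {
        return -1;
    }

    memcpy(out, filepath + root_len, rel_len);
    out[rel_len] = '\0';
    return 0;
}
```

For `/data/pics/cat.jpg.s2meta` with a root of `/data/` (length 6) and an extension starting at index 19, the result is `pics/cat.jpg`, which is the relative path that would then be hashed into the associated document ID.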


@@ -1,6 +1,8 @@
#ifndef SIST_H
#define SIST_H
#define _GNU_SOURCE
#ifndef FALSE
#define FALSE (0)
#define BOOL int
@@ -25,8 +27,6 @@
#define UNUSED(x) __attribute__((__unused__)) x
#define MD5_STR_LENGTH 33
#include "util.h"
#include "log.h"
#include "types.h"
@@ -49,13 +49,15 @@
#include <ctype.h>
#include "git_hash.h"
#define VERSION "2.11.0"
#define VERSION "2.12.1"
static const char *const Version = VERSION;
#ifndef SIST_PLATFORM
#define SIST_PLATFORM unknown
#endif
#define EXPECTED_MONGOOSE_VERSION "7.6"
#define Q(x) #x
#define QUOTE(x) Q(x)


@@ -20,7 +20,7 @@ typedef struct {
long count;
} agg_t;
void fill_tables(cJSON *document, UNUSED(const char index_id[MD5_STR_LENGTH])) {
void fill_tables(cJSON *document, UNUSED(const char index_id[SIST_INDEX_ID_LEN])) {
if (cJSON_GetObjectItem(document, "parent") != NULL) {
return;
@@ -96,16 +96,8 @@ void fill_tables(cJSON *document, UNUSED(const char index_id[MD5_STR_LENGTH])) {
}
void read_index_into_tables(index_t *index) {
DIR *dir = opendir(index->path);
struct dirent *de;
while ((de = readdir(dir)) != NULL) {
if (strncmp(de->d_name, "_index_", sizeof("_index_") - 1) == 0) {
char file_path[PATH_MAX];
snprintf(file_path, PATH_MAX, "%s%s", index->path, de->d_name);
read_index(file_path, index->desc.id, index->desc.type, fill_tables);
}
}
closedir(dir);
char file_path[PATH_MAX];
READ_INDICES(file_path, index->path, read_index(file_path, index->desc.id, index->desc.type, fill_tables), {}, 1);
}
static size_t rfind(const char *str, int c) {


@@ -28,6 +28,9 @@ typedef struct tpool {
int work_cnt;
int done_cnt;
int busy_cnt;
int throttle_stuck_cnt;
size_t mem_limit;
size_t page_size;
int free_arg;
int stop;
@@ -114,13 +117,44 @@ int tpool_add_work(tpool_t *pool, thread_func_t func, void *arg) {
return 1;
}
/**
* see: https://github.com/htop-dev/htop/blob/f782f821f7f8081cb43bbad1c37f32830a260a81/linux/LinuxProcessList.c
*/
__always_inline
static size_t _get_total_mem(tpool_t* pool) {
FILE* statmfile = fopen("/proc/self/statm", "r");
if (!statmfile)
return 0;
long int dummy, dummy2, dummy3, dummy4, dummy5, dummy6;
long int m_resident;
int r = fscanf(statmfile, "%ld %ld %ld %ld %ld %ld %ld",
&dummy, /* m_virt */
&m_resident,
&dummy2, /* m_share */
&dummy3, /* m_trs */
&dummy4, /* unused since Linux 2.6; always 0 */
&dummy5, /* m_drs */
&dummy6); /* unused since Linux 2.6; always 0 */
fclose(statmfile);
if (r == 7) {
return m_resident * pool->page_size;
} else {
return 0;
}
}
/**
* Thread worker function
*/
static void *tpool_worker(void *arg) {
tpool_t *pool = arg;
int stuck_notified = 0;
int throttle_ms = 0;
while (1) {
while (TRUE) {
pthread_mutex_lock(&pool->work_mutex);
if (pool->stop) {
break;
@@ -138,10 +172,35 @@ static void *tpool_worker(void *arg) {
pthread_mutex_unlock(&(pool->work_mutex));
if (work != NULL) {
stuck_notified = 0;
throttle_ms = 0;
while(!pool->stop && pool->mem_limit > 0 && _get_total_mem(pool) >= pool->mem_limit) {
if (!stuck_notified && throttle_ms >= 90000) {
// notify the pool that this thread is stuck.
pthread_mutex_lock(&(pool->work_mutex));
pool->throttle_stuck_cnt += 1;
if (pool->throttle_stuck_cnt == pool->thread_cnt) {
LOG_ERROR("tpool.c", "Throttle memory limit too low, cannot proceed!");
pool->stop = TRUE;
}
pthread_mutex_unlock(&(pool->work_mutex));
stuck_notified = 1;
}
usleep(10000);
throttle_ms += 10;
}
if (pool->stop) {
break;
}
// we are not stuck anymore. cancel our notification.
if (stuck_notified) {
pthread_mutex_lock(&(pool->work_mutex));
pool->throttle_stuck_cnt -= 1;
pthread_mutex_unlock(&(pool->work_mutex));
}
work->func(work->arg);
if (pool->free_arg) {
free(work->arg);
@@ -177,7 +236,7 @@ static void *tpool_worker(void *arg) {
}
void tpool_wait(tpool_t *pool) {
LOG_INFO("tpool.c", "Waiting for worker threads to finish")
LOG_DEBUG("tpool.c", "Waiting for worker threads to finish")
pthread_mutex_lock(&(pool->work_mutex));
while (TRUE) {
if (pool->done_cnt < pool->work_cnt) {
@@ -191,7 +250,9 @@ void tpool_wait(tpool_t *pool) {
}
}
}
progress_bar_print(1.0, ScanCtx.stat_tn_size, ScanCtx.stat_index_size);
if (pool->print_progress) {
progress_bar_print(1.0, ScanCtx.stat_tn_size, ScanCtx.stat_index_size);
}
pthread_mutex_unlock(&(pool->work_mutex));
LOG_INFO("tpool.c", "Worker threads finished")
@@ -241,18 +302,21 @@ void tpool_destroy(tpool_t *pool) {
* Create a thread pool
* @param thread_cnt Worker threads count
*/
tpool_t *tpool_create(int thread_cnt, void cleanup_func(), int free_arg, int print_progress) {
tpool_t *tpool_create(int thread_cnt, void cleanup_func(), int free_arg, int print_progress, size_t mem_limit) {
tpool_t *pool = malloc(sizeof(tpool_t));
pool->thread_cnt = thread_cnt;
pool->work_cnt = 0;
pool->done_cnt = 0;
pool->busy_cnt = 0;
pool->throttle_stuck_cnt = 0;
pool->mem_limit = mem_limit;
pool->stop = FALSE;
pool->free_arg = free_arg;
pool->cleanup_func = cleanup_func;
pool->threads = calloc(sizeof(pthread_t), thread_cnt);
pool->print_progress = print_progress;
pool->page_size = getpagesize();
pthread_mutex_init(&(pool->work_mutex), NULL);
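
The throttle in `tpool_worker` depends on `_get_total_mem`, which reads the second field of `/proc/self/statm` (resident pages) and multiplies it by the page size. The parsing step can be sketched on its own against a string, which makes it testable without touching `/proc`. The function name is hypothetical, and `%*ld` is used to skip the first field instead of the dummy variables in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parses the contents of /proc/<pid>/statm and returns the resident set
 * size in bytes, or 0 if the line cannot be parsed.
 */
static size_t statm_resident_bytes(const char *statm_line, size_t page_size) {
    assert(statm_line != NULL);

    long resident_pages = 0;
    /* Field 1 (total program size) is skipped; field 2 is resident pages. */
    if (sscanf(statm_line, "%*ld %ld", &resident_pages) != 1 || resident_pages < 0) {
        return 0;
    }
    return (size_t) resident_pages * page_size;
}
```

Returning 0 on a parse failure matches the behaviour in the diff: the `>= pool->mem_limit` comparison is then false, so a missing or unreadable `/proc` disables throttling rather than stalling the workers.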


@@ -8,7 +8,7 @@ typedef struct tpool tpool_t;
typedef void (*thread_func_t)(void *arg);
tpool_t *tpool_create(int num, void (*cleanup_func)(), int free_arg, int print_progress);
tpool_t *tpool_create(int num, void (*cleanup_func)(), int free_arg, int print_progress, size_t mem_limit);
void tpool_start(tpool_t *pool);
void tpool_destroy(tpool_t *pool);


@@ -4,7 +4,7 @@
#define INDEX_TYPE_NDJSON "ndjson"
typedef struct index_descriptor {
char id[MD5_STR_LENGTH];
char id[SIST_INDEX_ID_LEN];
char version[64];
long timestamp;
char root[PATH_MAX];


@@ -84,11 +84,13 @@ char *expandpath(const char *path) {
return expanded;
}
int PrintingProgressBar = 0;
void progress_bar_print(double percentage, size_t tn_size, size_t index_size) {
static int last_val = -1;
int val = (int) (percentage * 100);
if (last_val == val || val > 100 || index_size < 1024) {
if (last_val == val || val > 100) {
return;
}
last_val = val;
@@ -114,13 +116,21 @@ void progress_bar_print(double percentage, size_t tn_size, size_t index_size) {
index_unit = 'M';
}
printf(
"\r%3d%%[%.*s>%*s] TN:%3d%c IDX:%3d%c",
val, lpad, PBSTR, rpad, "",
(int) tn_size, tn_unit,
(int) index_size, index_unit
);
fflush(stdout);
if (tn_size == 0 && index_size == 0) {
fprintf(stderr,
"\r%3d%%[%.*s>%*s]",
val, lpad, PBSTR, rpad, ""
);
} else {
fprintf(stderr,
"\r%3d%%[%.*s>%*s] TN:%3d%c IDX:%3d%c",
val, lpad, PBSTR, rpad, "",
(int) tn_size, tn_unit,
(int) index_size, index_unit
);
}
PrintingProgressBar = TRUE;
}
GHashTable *incremental_get_table() {
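
The progress bar format string `"%3d%%[%.*s>%*s]"` relies on two `printf` features: `%.*s` prints at most `lpad` characters of a fill string, and `%*s` pads an empty string to `rpad` spaces. Below is a sketch that renders the bar into a buffer so the result can be inspected. The fill string, the bar width parameter and the way `lpad` is computed are assumptions, since `PBSTR` and the padding arithmetic are not part of this hunk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fill string standing in for PBSTR. */
#define SKETCH_BAR_FILL "========================================"

/*
 * Renders e.g. " 50%[=====>     ]" into `out`.
 * Returns the value of snprintf (length the full bar needs).
 */
static int render_bar(double fraction, int width, char *out, size_t out_size) {
    assert(fraction >= 0.0 && fraction <= 1.0);
    assert(width >= 0 && (size_t) width <= strlen(SKETCH_BAR_FILL));

    int val = (int) (fraction * 100);
    int lpad = (int) (fraction * width); /* filled cells  */
    int rpad = width - lpad;             /* empty cells   */

    return snprintf(out, out_size, "%3d%%[%.*s>%*s]",
                    val, lpad, SKETCH_BAR_FILL, rpad, "");
}
```

Writing the bar to `stderr`, as the new code does, keeps it out of any data piped from `stdout`; the leading `\r` in the original format then redraws the same terminal line on each update.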


@@ -10,8 +10,6 @@
#include "third-party/utf8.h/utf8.h"
#include "libscan/scan.h"
#define MD5_STR_LENGTH 33
char *abspath(const char *path);
@@ -19,6 +17,8 @@ char *expandpath(const char *path);
dyn_buffer_t url_escape(char *str);
extern int PrintingProgressBar;
void progress_bar_print(double percentage, size_t tn_size, size_t index_size);
GHashTable *incremental_get_table();
@@ -92,49 +92,37 @@ static void buf2hex(const unsigned char *buf, size_t buflen, char *hex_string) {
__always_inline
static int md5_digest_is_null(const unsigned char digest[MD5_DIGEST_LENGTH]) {
return (*(int64_t *) digest) == 0 && (*((int64_t *) digest + 1)) == 0;
static void generate_doc_id(const char *rel_path, char *doc_id) {
unsigned char md[MD5_DIGEST_LENGTH];
MD5((unsigned char *) rel_path, strlen(rel_path), md);
buf2hex(md, sizeof(md), doc_id);
}
__always_inline
static void incremental_put(GHashTable *table, const unsigned char path_md5[MD5_DIGEST_LENGTH], int mtime) {
char *ptr = malloc(MD5_STR_LENGTH);
buf2hex(path_md5, MD5_DIGEST_LENGTH, ptr);
static void incremental_put(GHashTable *table, const char doc_id[SIST_DOC_ID_LEN], int mtime) {
char *ptr = malloc(SIST_DOC_ID_LEN);
strcpy(ptr, doc_id);
g_hash_table_insert(table, ptr, GINT_TO_POINTER(mtime));
}
__always_inline
static void incremental_put_str(GHashTable *table, const char *path_md5, int mtime) {
char *ptr = malloc(MD5_STR_LENGTH);
strcpy(ptr, path_md5);
g_hash_table_insert(table, ptr, GINT_TO_POINTER(mtime));
}
__always_inline
static int incremental_get(GHashTable *table, const unsigned char path_md5[MD5_DIGEST_LENGTH]) {
static int incremental_get(GHashTable *table, const char doc_id[SIST_DOC_ID_LEN]) {
if (table != NULL) {
char md5_str[MD5_STR_LENGTH];
buf2hex(path_md5, MD5_DIGEST_LENGTH, md5_str);
return GPOINTER_TO_INT(g_hash_table_lookup(table, md5_str));
return GPOINTER_TO_INT(g_hash_table_lookup(table, doc_id));
} else {
return 0;
}
}
/**
* Marks a file by adding it to a table.
* !!Not thread safe.
*/
__always_inline
static int incremental_get_str(GHashTable *table, const char *path_md5) {
if (table != NULL) {
return GPOINTER_TO_INT(g_hash_table_lookup(table, path_md5));
} else {
return 0;
}
}
__always_inline
static int incremental_mark_file_for_copy(GHashTable *table, const unsigned char path_md5[MD5_DIGEST_LENGTH]) {
char *ptr = malloc(MD5_STR_LENGTH);
buf2hex(path_md5, MD5_DIGEST_LENGTH, ptr);
static int incremental_mark_file(GHashTable *table, const char doc_id[SIST_DOC_ID_LEN]) {
char *ptr = malloc(SIST_DOC_ID_LEN);
strcpy(ptr, doc_id);
return g_hash_table_insert(table, ptr, GINT_TO_POINTER(1));
}
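
`generate_doc_id` stores the hex form of a 16-byte MD5 digest, which explains the old `MD5_STR_LENGTH` of 33: 32 hex characters plus the terminating NUL. The body of `buf2hex` is not shown in this hunk, so the following is a hypothetical equivalent for illustration; whether the real helper emits lowercase digits is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Writes 2 * buflen hex characters plus a terminating NUL into hex_out.
 * The caller must provide at least 2 * buflen + 1 bytes.
 */
static void sketch_buf2hex(const unsigned char *buf, size_t buflen, char *hex_out) {
    static const char digits[] = "0123456789abcdef";
    assert(buf != NULL && hex_out != NULL);

    for (size_t i = 0; i < buflen; i++) {
        hex_out[2 * i] = digits[buf[i] >> 4];       /* high nibble */
        hex_out[2 * i + 1] = digits[buf[i] & 0x0F]; /* low nibble  */
    }
    hex_out[2 * buflen] = '\0';
}
```

Keeping the ID as a NUL-terminated string is what allows the new `incremental_put`, `incremental_get` and `incremental_mark_file` to use `strcpy` and pass the ID straight to the hash table, instead of converting a raw digest on every call.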


@@ -8,12 +8,23 @@
#include <src/ctx.h>
#define HTTP_SERVER_HEADER "Server: sist2/" VERSION "\r\n"
#define HTTP_TEXT_TYPE_HEADER "Content-Type: text/plain;charset=utf-8\r\n"
#define HTTP_REPLY_NOT_FOUND mg_http_reply(nc, 404, HTTP_SERVER_HEADER HTTP_TEXT_TYPE_HEADER, "Not found");
static struct mg_http_serve_opts DefaultServeOpts = {
.fs = NULL,
.ssi_pattern = NULL,
.root_dir = NULL,
.mime_types = ""
};
static void send_response_line(struct mg_connection *nc, int status_code, size_t length, char *extra_headers) {
mg_printf(
nc,
"HTTP/1.1 %d %s\r\n"
"Server: sist2/" VERSION "\r\n"
HTTP_SERVER_HEADER
"Content-Length: %d\r\n"
"%s\r\n\r\n",
status_code, "OK",
@@ -25,7 +36,7 @@ static void send_response_line(struct mg_connection *nc, int status_code, size_t
index_t *get_index_by_id(const char *index_id) {
for (int i = WebCtx.index_count; i >= 0; i--) {
if (strncmp(index_id, WebCtx.indices[i].desc.id, MD5_STR_LENGTH) == 0) {
if (strncmp(index_id, WebCtx.indices[i].desc.id, SIST_INDEX_ID_LEN) == 0) {
return &WebCtx.indices[i];
}
}
@@ -50,7 +61,7 @@ store_t *get_tag_store(const char *index_id) {
void search_index(struct mg_connection *nc, struct mg_http_message *hm) {
if (WebCtx.dev) {
mg_http_serve_file(nc, hm, "sist2-vue/dist/index.html", "text/html", NULL);
mg_http_serve_file(nc, hm, "sist2-vue/dist/index.html", &DefaultServeOpts);
} else {
send_response_line(nc, 200, sizeof(index_html), "Content-Type: text/html");
mg_send(nc, index_html, sizeof(index_html));
@@ -59,23 +70,23 @@ void search_index(struct mg_connection *nc, struct mg_http_message *hm) {
void stats_files(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != MD5_STR_LENGTH + 4) {
mg_http_reply(nc, 404, "", "");
if (hm->uri.len != SIST_INDEX_ID_LEN + 4) {
HTTP_REPLY_NOT_FOUND
return;
}
char arg_md5[MD5_STR_LENGTH];
memcpy(arg_md5, hm->uri.ptr + 3, MD5_STR_LENGTH);
*(arg_md5 + MD5_STR_LENGTH - 1) = '\0';
char arg_index_id[SIST_INDEX_ID_LEN];
memcpy(arg_index_id, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index_id + SIST_INDEX_ID_LEN - 1) = '\0';
index_t *index = get_index_by_id(arg_md5);
index_t *index = get_index_by_id(arg_index_id);
if (index == NULL) {
mg_http_reply(nc, 404, "", "");
HTTP_REPLY_NOT_FOUND
return;
}
const char *file;
switch (atoi(hm->uri.ptr + 3 + MD5_STR_LENGTH)) {
switch (atoi(hm->uri.ptr + 3 + SIST_INDEX_ID_LEN)) {
case 1:
file = "treemap.csv";
break;
@@ -100,12 +111,13 @@ void stats_files(struct mg_connection *nc, struct mg_http_message *hm) {
strcpy(full_path, index->path);
strcat(full_path, file);
mg_http_serve_file(nc, hm, full_path, "text/csv", disposition);
struct mg_http_serve_opts opts = {};
mg_http_serve_file(nc, hm, full_path, &opts);
}
void javascript(struct mg_connection *nc, struct mg_http_message *hm) {
if (WebCtx.dev) {
mg_http_serve_file(nc, hm, "sist2-vue/dist/js/index.js", "application/javascript", NULL);
mg_http_serve_file(nc, hm, "sist2-vue/dist/js/index.js", &DefaultServeOpts);
} else {
send_response_line(nc, 200, sizeof(index_js), "Content-Type: application/javascript");
mg_send(nc, index_js, sizeof(index_js));
@@ -114,7 +126,7 @@ void javascript(struct mg_connection *nc, struct mg_http_message *hm) {
void javascript_vendor(struct mg_connection *nc, struct mg_http_message *hm) {
if (WebCtx.dev) {
mg_http_serve_file(nc, hm, "sist2-vue/dist/js/chunk-vendors.js", "application/javascript", NULL);
mg_http_serve_file(nc, hm, "sist2-vue/dist/js/chunk-vendors.js", &DefaultServeOpts);
} else {
send_response_line(nc, 200, sizeof(chunk_vendors_js), "Content-Type: application/javascript");
mg_send(nc, chunk_vendors_js, sizeof(chunk_vendors_js));
@@ -138,32 +150,50 @@ void style_vendor(struct mg_connection *nc, struct mg_http_message *hm) {
void thumbnail(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != 68) {
LOG_DEBUGF("serve.c", "Invalid thumbnail path: %.*s", (int) hm->uri.len, hm->uri.ptr)
mg_http_reply(nc, 404, "", "Not found");
return;
int has_thumbnail_index = FALSE;
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2) {
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2 + 4) {
LOG_DEBUGF("serve.c", "Invalid thumbnail path: %.*s", (int) hm->uri.len, hm->uri.ptr)
HTTP_REPLY_NOT_FOUND
return;
}
has_thumbnail_index = TRUE;
}
char arg_file_md5[MD5_STR_LENGTH];
char arg_index[MD5_STR_LENGTH];
char arg_doc_id[SIST_DOC_ID_LEN];
char arg_index[SIST_INDEX_ID_LEN];
memcpy(arg_index, hm->uri.ptr + 3, MD5_STR_LENGTH);
*(arg_index + MD5_STR_LENGTH - 1) = '\0';
memcpy(arg_file_md5, hm->uri.ptr + 3 + MD5_STR_LENGTH, MD5_STR_LENGTH);
*(arg_file_md5 + MD5_STR_LENGTH - 1) = '\0';
unsigned char md5_buf[MD5_DIGEST_LENGTH];
hex2buf(arg_file_md5, MD5_STR_LENGTH - 1, md5_buf);
memcpy(arg_index, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(arg_doc_id, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, SIST_DOC_ID_LEN);
*(arg_doc_id + SIST_DOC_ID_LEN - 1) = '\0';
store_t *store = get_store(arg_index);
if (store == NULL) {
LOG_DEBUGF("serve.c", "Could not get store for index: %s", arg_index)
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
char *data;
size_t data_len = 0;
char *data = store_read(store, (char *) md5_buf, sizeof(md5_buf), &data_len);
if (has_thumbnail_index) {
const char *tn_index = hm->uri.ptr + SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2;
char tn_key[sizeof(arg_doc_id) + sizeof(char) * 4];
memcpy(tn_key, arg_doc_id, sizeof(arg_doc_id));
memcpy(tn_key + sizeof(arg_doc_id) - 1, tn_index, sizeof(char) * 4);
*(tn_key + sizeof(tn_key) - 1) = '\0';
data = store_read(store, (char *) tn_key, sizeof(tn_key), &data_len);
} else {
data = store_read(store, (char *) arg_doc_id, sizeof(arg_doc_id), &data_len);
}
if (data_len != 0) {
send_response_line(
nc, 200, data_len,
@@ -173,7 +203,7 @@ void thumbnail(struct mg_connection *nc, struct mg_http_message *hm) {
mg_send(nc, data, data_len);
free(data);
} else {
mg_http_reply(nc, 404, "Content-Type: text/plain;charset=utf-8\r\n", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
}
@@ -182,7 +212,7 @@ void search(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->body.len == 0) {
LOG_DEBUG("serve.c", "Client sent empty body, ignoring request")
mg_http_reply(nc, 500, "", "Invalid request");
mg_http_reply(nc, 400, HTTP_SERVER_HEADER HTTP_TEXT_TYPE_HEADER, "Invalid request");
return;
}
@@ -193,7 +223,7 @@ void search(struct mg_connection *nc, struct mg_http_message *hm) {
char url[4096];
snprintf(url, 4096, "%s/%s/_search", WebCtx.es_url, WebCtx.es_index);
nc->fn_data = web_post_async(url, body);
nc->fn_data = web_post_async(url, body, WebCtx.es_insecure_ssl);
}
void serve_file_from_url(cJSON *json, index_t *idx, struct mg_connection *nc) {
@@ -226,6 +256,11 @@ void serve_file_from_url(cJSON *json, index_t *idx, struct mg_connection *nc) {
void serve_file_from_disk(cJSON *json, index_t *idx, struct mg_connection *nc, struct mg_http_message *hm) {
if (strcmp(MG_VERSION, EXPECTED_MONGOOSE_VERSION) != 0) {
LOG_WARNING("serve.c", "sist2 was not linked with latest mongoose version, "
"serving file from disk might not work as expected.")
}
const char *path = cJSON_GetObjectItem(json, "path")->valuestring;
const char *name = cJSON_GetObjectItem(json, "name")->valuestring;
const char *ext = cJSON_GetObjectItem(json, "extension")->valuestring;
@@ -246,21 +281,54 @@ void serve_file_from_disk(cJSON *json, index_t *idx, struct mg_connection *nc, s
char disposition[8192];
snprintf(disposition, sizeof(disposition),
"Content-Disposition: inline; filename=\"%s%s%s\"\r\nAccept-Ranges: bytes\r\n",
HTTP_SERVER_HEADER "Content-Disposition: inline; filename=\"%s%s%s\"\r\n"
"Accept-Ranges: bytes\r\nCache-Control: no-store\r\n",
name, strlen(ext) == 0 ? "" : ".", ext);
mg_http_serve_file(nc, hm, full_path, mime, disposition);
char mime_mapping[1024];
snprintf(mime_mapping, sizeof(mime_mapping), "%s=%s", ext, mime);
struct mg_http_serve_opts opts = {
.extra_headers = disposition,
.mime_types = mime_mapping
};
mg_http_serve_file(nc, hm, full_path, &opts);
}
void cache_es_version() {
static int is_cached = FALSE;
if (is_cached == TRUE) {
return;
}
es_version_t *es_version = elastic_get_version(WebCtx.es_url, WebCtx.es_insecure_ssl);
if (es_version != NULL) {
WebCtx.es_version = es_version;
is_cached = TRUE;
}
}
void index_info(struct mg_connection *nc) {
cache_es_version();
const char *es_version = "0.0.0";
if (WebCtx.es_version != NULL) {
es_version = format_es_version(WebCtx.es_version);
}
cJSON *json = cJSON_CreateObject();
cJSON *arr = cJSON_AddArrayToObject(json, "indices");
cJSON_AddStringToObject(json, "mongooseVersion", MG_VERSION);
cJSON_AddStringToObject(json, "esIndex", WebCtx.es_index);
cJSON_AddStringToObject(json, "version", Version);
cJSON_AddStringToObject(json, "esVersion", es_version);
cJSON_AddBoolToObject(json, "esVersionSupported", IS_SUPPORTED_ES_VERSION(WebCtx.es_version));
cJSON_AddBoolToObject(json, "esVersionLegacy", IS_LEGACY_VERSION(WebCtx.es_version));
cJSON_AddStringToObject(json, "platform", QUOTE(SIST_PLATFORM));
cJSON_AddStringToObject(json, "sist2Hash", Sist2CommitHash);
cJSON_AddStringToObject(json, "libscanHash", LibScanCommitHash);
cJSON_AddStringToObject(json, "lang", WebCtx.lang);
cJSON_AddBoolToObject(json, "dev", WebCtx.dev);
#ifdef SIST_DEBUG
@@ -291,55 +359,19 @@ void index_info(struct mg_connection *nc) {
}
void document_info(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != MD5_STR_LENGTH + 2) {
LOG_DEBUGF("serve.c", "Invalid document_info path: %.*s", (int) hm->uri.len, hm->uri.ptr)
mg_http_reply(nc, 404, "", "Not found");
return;
}
char arg_md5[MD5_STR_LENGTH];
memcpy(arg_md5, hm->uri.ptr + 3, MD5_STR_LENGTH);
*(arg_md5 + MD5_STR_LENGTH - 1) = '\0';
cJSON *doc = elastic_get_document(arg_md5);
cJSON *source = cJSON_GetObjectItem(doc, "_source");
cJSON *index_id = cJSON_GetObjectItem(source, "index");
if (index_id == NULL) {
cJSON_Delete(doc);
mg_http_reply(nc, 404, "", "Not found");
return;
}
index_t *idx = get_index_by_id(index_id->valuestring);
if (idx == NULL) {
cJSON_Delete(doc);
mg_http_reply(nc, 404, "", "Not found");
return;
}
char *json_str = cJSON_PrintUnformatted(source);
send_response_line(nc, 200, (int) strlen(json_str), "Content-Type: application/json");
mg_send(nc, json_str, (int) strlen(json_str));
free(json_str);
cJSON_Delete(doc);
}
void file(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != MD5_STR_LENGTH + 2) {
if (hm->uri.len != SIST_DOC_ID_LEN + 2) {
LOG_DEBUGF("serve.c", "Invalid file path: %.*s", (int) hm->uri.len, hm->uri.ptr)
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
char arg_md5[MD5_STR_LENGTH];
memcpy(arg_md5, hm->uri.ptr + 3, MD5_STR_LENGTH);
*(arg_md5 + MD5_STR_LENGTH - 1) = '\0';
char arg_doc_id[SIST_DOC_ID_LEN];
memcpy(arg_doc_id, hm->uri.ptr + 3, SIST_DOC_ID_LEN);
*(arg_doc_id + SIST_DOC_ID_LEN - 1) = '\0';
const char *next = arg_md5;
const char *next = arg_doc_id;
cJSON *doc = NULL;
cJSON *index_id = NULL;
cJSON *source = NULL;
@@ -350,7 +382,7 @@ void file(struct mg_connection *nc, struct mg_http_message *hm) {
index_id = cJSON_GetObjectItem(source, "index");
if (index_id == NULL) {
cJSON_Delete(doc);
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
cJSON *parent = cJSON_GetObjectItem(source, "parent");
@@ -364,7 +396,7 @@ void file(struct mg_connection *nc, struct mg_http_message *hm) {
if (idx == NULL) {
cJSON_Delete(doc);
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
@@ -390,7 +422,6 @@ void status(struct mg_connection *nc) {
typedef struct {
char *name;
int delete;
char *path_md5_str;
char *doc_id;
} tag_req_t;
@@ -410,12 +441,6 @@ tag_req_t *parse_tag_request(cJSON *json) {
return NULL;
}
cJSON *arg_path_md5 = cJSON_GetObjectItem(json, "path_md5");
if (arg_path_md5 == NULL || !cJSON_IsString(arg_path_md5) ||
strlen(arg_path_md5->valuestring) != MD5_STR_LENGTH - 1) {
return NULL;
}
cJSON *arg_doc_id = cJSON_GetObjectItem(json, "doc_id");
if (arg_doc_id == NULL || !cJSON_IsString(arg_doc_id)) {
return NULL;
@@ -424,33 +449,32 @@ tag_req_t *parse_tag_request(cJSON *json) {
tag_req_t *req = malloc(sizeof(tag_req_t));
req->delete = arg_delete->valueint;
req->name = arg_name->valuestring;
req->path_md5_str = arg_path_md5->valuestring;
req->doc_id = arg_doc_id->valuestring;
return req;
}
void tag(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != MD5_STR_LENGTH + 4) {
if (hm->uri.len != SIST_INDEX_ID_LEN + 4) {
LOG_DEBUGF("serve.c", "Invalid tag path: %.*s", (int) hm->uri.len, hm->uri.ptr)
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
char arg_index[MD5_STR_LENGTH];
memcpy(arg_index, hm->uri.ptr + 5, MD5_STR_LENGTH);
*(arg_index + MD5_STR_LENGTH - 1) = '\0';
char arg_index[SIST_INDEX_ID_LEN];
memcpy(arg_index, hm->uri.ptr + 5, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
if (hm->body.len < 2 || hm->method.len != 4 || memcmp(&hm->method, "POST", 4) == 0) {
LOG_DEBUG("serve.c", "Invalid tag request")
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
store_t *store = get_tag_store(arg_index);
if (store == NULL) {
LOG_DEBUGF("serve.c", "Could not get tag store for index: %s", arg_index)
mg_http_reply(nc, 404, "", "Not found");
HTTP_REPLY_NOT_FOUND
return;
}
@@ -471,7 +495,7 @@ void tag(struct mg_connection *nc, struct mg_http_message *hm) {
cJSON *arr = NULL;
size_t data_len = 0;
const char *data = store_read(store, arg_req->path_md5_str, MD5_STR_LENGTH, &data_len);
const char *data = store_read(store, arg_req->doc_id, SIST_DOC_ID_LEN, &data_len);
if (data_len == 0) {
arr = cJSON_CreateArray();
} else {
@@ -507,7 +531,7 @@ void tag(struct mg_connection *nc, struct mg_http_message *hm) {
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, arg_req->doc_id);
nc->fn_data = web_post_async(url, buf);
nc->fn_data = web_post_async(url, buf, WebCtx.es_insecure_ssl);
} else {
cJSON_AddItemToArray(arr, cJSON_CreateString(arg_req->name));
@@ -527,11 +551,11 @@ void tag(struct mg_connection *nc, struct mg_http_message *hm) {
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, arg_req->doc_id);
nc->fn_data = web_post_async(url, buf);
nc->fn_data = web_post_async(url, buf, WebCtx.es_insecure_ssl);
}
char *json_str = cJSON_PrintUnformatted(arr);
store_write(store, arg_req->path_md5_str, MD5_STR_LENGTH, json_str, strlen(json_str) + 1);
store_write(store, arg_req->doc_id, SIST_DOC_ID_LEN, json_str, strlen(json_str) + 1);
store_flush(store);
free(arg_req);
@@ -593,10 +617,8 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
return;
}
tag(nc, hm);
} else if (mg_http_match_uri(hm, "/d/*")) {
document_info(nc, hm);
} else {
mg_http_reply(nc, 404, "", "Page not found");
HTTP_REPLY_NOT_FOUND
}
} else if (ev == MG_EV_POLL) {
@@ -626,7 +648,8 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
free(tmp);
}
mg_http_reply(nc, 500, "", "");
mg_http_reply(nc, 500, HTTP_SERVER_HEADER HTTP_TEXT_TYPE_HEADER,
"Elasticsearch error, see server logs.");
}
free_response(r);
@@ -640,7 +663,7 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
void serve(const char *listen_address) {
printf("Starting web server @ http://%s\n", listen_address);
LOG_INFOF("serve.c", "Starting web server @ http://%s", listen_address)
struct mg_mgr mgr;
mg_mgr_init(&mgr);

Some files were not shown because too many files have changed in this diff.