Compare commits

...

308 Commits

Author SHA1 Message Date
Shy
670dad185e Fix #521 2025-03-19 19:22:17 -04:00
Shy
bbbd727e6a Update sist2-python version 2025-03-19 18:38:21 -04:00
Shy
d800effad9
Merge pull request #511 from dpieski/patch-5
Update README.md
2025-02-06 17:58:36 -05:00
Shy
371e9c408e
Merge pull request #512 from dpieski/patch-6
Update README.md
2025-02-06 17:58:07 -05:00
Andrew
ee1b1d8bb4
Update README.md
Moved README references from simon987 to sist2app
2025-02-03 15:09:11 -06:00
Andrew
63a097a463
Update README.md
Update to the docker-compose.yml example.
2025-02-03 15:00:03 -06:00
Shy
7a03a2202e Fix #481 2025-01-24 19:40:08 -05:00
Shy
050fc500ce Fix #462 2025-01-24 19:22:01 -05:00
Shy
d44679131b Update compose file to avoid confusion. Fixes #490 2025-01-23 21:45:01 -05:00
Shy
4dd5e70406 Fix #492 2025-01-23 21:40:37 -05:00
Shy
5a82581992 Fix magic database problem 2025-01-23 21:40:27 -05:00
Shy
0dc18a56c0 Fix #509 2025-01-23 19:10:17 -05:00
Shy
258b2e31e6 Version bump 2025-01-23 19:10:02 -05:00
Shy
c726074029 Update tessdata paths 2025-01-23 19:09:54 -05:00
Shy
7873ef003d Fix CI build attempt 6 2025-01-22 22:16:42 -05:00
Shy
d41266e136 Fix CI build attempt 5 2025-01-22 22:15:37 -05:00
Shy
0e946092eb Fix CI build attempt 4 2025-01-22 21:58:55 -05:00
Shy
95b19e2e67 Fix CI build attempt 3 2025-01-22 21:55:09 -05:00
Shy
bd98eb2522 Fix CI build attempt 2 2025-01-22 21:51:59 -05:00
Shy
3d99add79e Fix CI build 2025-01-22 21:43:23 -05:00
Shy
2d6553d5d2 Update magic gen script 2025-01-22 21:39:23 -05:00
Shy
7d67354b96 Update CI build config 2025-01-22 21:32:54 -05:00
Shy
1b77daef16 Update repository URLs 2025-01-22 21:27:27 -05:00
Shy
d7038be35b Fix #506 2025-01-16 18:32:33 -05:00
Shy
c1573a803e Update third-party dependencies 2025-01-12 11:55:14 -05:00
2436e52a62
Merge pull request #479 from Kiskadee-dev/master
Update README.md
2024-04-26 10:03:12 -04:00
Matheus Victor
c3a09d0683
Update README.md 2024-04-26 10:41:25 -03:00
b9f82593ce Fix onnx 2024-04-03 20:24:30 -04:00
59bc418a95 Fix loadModel 2024-04-03 20:03:46 -04:00
fc06b3e378 Fix crash for leftover documents in sqlite index 2024-04-03 18:38:09 -04:00
89e1968994
Merge pull request #474 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/express-4.19.2
Bump express from 4.18.2 to 4.19.2 in /sist2-admin/frontend
2024-04-03 16:10:02 -04:00
7009c082e1
Merge pull request #473 from simon987/dependabot/npm_and_yarn/sist2-vue/webpack-dev-middleware-5.3.4
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-vue
2024-04-03 16:09:57 -04:00
64d6bc04a7
Merge pull request #472 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/webpack-dev-middleware-5.3.4
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-admin/frontend
2024-04-03 16:09:48 -04:00
a2655edf2f
Merge pull request #470 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/follow-redirects-1.15.6
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-admin/frontend
2024-04-03 16:09:40 -04:00
86212ece64
Merge pull request #469 from simon987/dependabot/npm_and_yarn/sist2-vue/follow-redirects-1.15.6
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-vue
2024-04-03 16:09:31 -04:00
61170ce503 Update README 2024-04-03 16:08:40 -04:00
7ae410dcc7 fix package-lock.json (again) 2024-04-03 15:51:30 -04:00
dependabot[bot]
8714e7e41a
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-vue
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)

---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 19:46:41 +00:00
dependabot[bot]
4a804b7319
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-vue
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 19:46:38 +00:00
4f83a044c7 Fix aarch64 build 2024-04-03 15:45:32 -04:00
6e15201a05 Fix package-lock.json 2024-04-03 15:44:35 -04:00
6bb12a563a Version bump 2024-04-03 14:40:03 -04:00
4567f52668 Add toggle for verbose web logs 2024-04-03 14:39:44 -04:00
774efe062f Fix for newer node version in debug script 2024-04-03 14:27:17 -04:00
7a7a0686c2 Fixes for new mongoose version 2024-04-03 14:26:54 -04:00
7bc2ef9e6c Add debug print statement.. 2024-04-03 14:26:33 -04:00
f65cca5a02 Fix NULL mime SQLite index 2024-04-03 14:26:20 -04:00
6423643e24 Fix right click on images in lightbox, update lightbox 2024-04-03 14:24:54 -04:00
f99ea74e3f Passthrough frontend logs to stdout 2024-04-03 11:18:54 -04:00
1f8f65044c 3rd party lib updates 2024-04-03 11:18:24 -04:00
0981a1f421 Update compose file to add ES persistence.. 2024-04-03 09:22:17 -04:00
ff066a3962 Fix build for GCC 12 2024-04-03 09:15:00 -04:00
dependabot[bot]
1e778b6f2a
Bump express from 4.18.2 to 4.19.2 in /sist2-admin/frontend
Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-28 17:15:39 +00:00
dependabot[bot]
ff27a540eb
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-admin/frontend
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)

---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-23 11:35:55 +00:00
dependabot[bot]
83259eedee
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-admin/frontend
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-16 23:11:52 +00:00
simon987
aff69fb3eb Force user to not have both --auth and --tag-auth at the same time in the UI #453 2024-01-30 10:49:44 -05:00
simon987
08b6323176 Add error message when frontend does not start 2024-01-30 10:42:16 -05:00
2307fc6e15
Merge pull request #455 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/follow-redirects-1.15.4
Bump follow-redirects from 1.15.0 to 1.15.4 in /sist2-admin/frontend
2024-01-14 09:15:19 -05:00
d679e4c3ca
Merge pull request #456 from simon987/dependabot/npm_and_yarn/sist2-vue/follow-redirects-1.15.4
Bump follow-redirects from 1.15.2 to 1.15.4 in /sist2-vue
2024-01-14 09:15:13 -05:00
f423a17543
Merge pull request #458 from SystemZ/fix-tail-width
fix tail horizontal scrolling
2024-01-14 09:15:04 -05:00
Michał Frąckiewicz
1bdf4d71dd
fix tail horizontal scrolling
Before this change, debugging via logs was hard due to clipping width of the log box
2024-01-13 11:25:35 +01:00
dependabot[bot]
f58e66352c
Bump follow-redirects from 1.15.2 to 1.15.4 in /sist2-vue
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.2 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-10 11:22:12 +00:00
dependabot[bot]
a672822811
Bump follow-redirects from 1.15.0 to 1.15.4 in /sist2-admin/frontend
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.0 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.0...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-10 00:13:02 +00:00
simon987
ae317e590d Update MIN_OCR_LEN to 3 2024-01-07 09:16:40 -05:00
simon987
410283f14a Remove debug print 2023-12-10 09:21:14 -05:00
simon987
2936240df8 Disable OSD, add preserve_interword_spaces for chi_sim OCR (#443) 2023-12-10 09:20:43 -05:00
af5059f366
Update USAGE.md 2023-12-02 09:25:03 -05:00
simon987
03983ce00a Fix for #439 2023-11-19 15:46:26 -05:00
simon987
80528857e9 Duplicate media_comment field, fixes #440 2023-11-18 10:53:42 -05:00
ffa7f2ae84 Add button for full reindex, fixes #403 2023-11-18 10:53:42 -05:00
6ade3395d5
Merge pull request #437 from simon987/dependabot/npm_and_yarn/sist2-vue/axios-1.6.0
Bump axios from 0.25.0 to 1.6.0 in /sist2-vue
2023-11-11 09:26:01 -05:00
a2d5e774b3
Merge pull request #438 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/axios-1.6.0
Bump axios from 0.27.2 to 1.6.0 in /sist2-admin/frontend
2023-11-11 09:25:49 -05:00
dependabot[bot]
19ea1169ff
Bump axios from 0.27.2 to 1.6.0 in /sist2-admin/frontend
Bumps [axios](https://github.com/axios/axios) from 0.27.2 to 1.6.0.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.27.2...v1.6.0)

---
updated-dependencies:
- dependency-name: axios
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-11 06:17:01 +00:00
dependabot[bot]
1225fd6bac
Bump axios from 0.25.0 to 1.6.0 in /sist2-vue
Bumps [axios](https://github.com/axios/axios) from 0.25.0 to 1.6.0.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.25.0...v1.6.0)

---
updated-dependencies:
- dependency-name: axios
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-10 23:42:56 +00:00
687b645840
Merge pull request #434 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/babel/traverse-7.23.2
Bump @babel/traverse from 7.20.5 to 7.23.2 in /sist2-admin/frontend
2023-10-19 08:36:23 -04:00
d2c8f9209d
Merge pull request #433 from simon987/dependabot/npm_and_yarn/sist2-vue/babel/traverse-7.23.2
Bump @babel/traverse from 7.20.12 to 7.23.2 in /sist2-vue
2023-10-19 08:36:13 -04:00
dependabot[bot]
3ea375b37d
Bump @babel/traverse from 7.20.5 to 7.23.2 in /sist2-admin/frontend
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.20.5 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse)

---
updated-dependencies:
- dependency-name: "@babel/traverse"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-18 20:02:12 +00:00
dependabot[bot]
bff89d93e6
Bump @babel/traverse from 7.20.12 to 7.23.2 in /sist2-vue
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.20.12 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse)

---
updated-dependencies:
- dependency-name: "@babel/traverse"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-18 17:03:04 +00:00
f423863acb Add option to search in path for sqlite #402 2023-10-16 21:14:46 -04:00
49a21a5a25 Version bump 2023-10-13 19:02:26 -04:00
560aa82ce7
add discord invite 2023-10-09 18:10:51 -04:00
b8c905bd64 expose indexRoot value to documents 2023-10-08 21:00:03 -04:00
8299237ea0 version bump 2023-10-07 15:04:19 -04:00
31646a2747 Fix CURL error 2023-10-07 13:16:22 -04:00
d9d77de47f Update docs 2023-10-07 11:07:13 -04:00
5f0957d029 Update readme 2023-10-07 10:15:41 -04:00
1cc48f7f33 Version bump 2023-10-07 10:15:03 -04:00
e1e22fd79a Add missing image 2023-09-28 17:52:57 -04:00
786bbc3859 Make sure dateMin and dateMax are not equal with sqlite frontend 2023-09-27 19:49:43 -04:00
9698ea0c37 Fix #419 2023-09-26 19:51:17 -04:00
f345fc1a9a version bump, 2023-09-26 17:58:43 -04:00
660fbf75d8 Potential tpool_wait fix for #416, #397, #399 2023-09-26 17:58:07 -04:00
33ae585879 Fix #426 2023-09-25 21:36:50 -04:00
5729cbd6b4 update gitignore 2023-09-16 09:44:29 -04:00
a19ec3305a Fix #415, fix sqlite-index error 2023-09-16 09:43:55 -04:00
8fdb832c85 refactor index schema, remove sidecar parsing, remove TS 2023-09-05 18:59:18 -04:00
b81ccebdb1 Fix #406 2023-08-20 19:53:50 -04:00
b2d214a19a
Merge pull request #412 from simon987/dependabot/npm_and_yarn/sist2-vue/protobufjs-6.11.4
Bump protobufjs from 6.11.3 to 6.11.4 in /sist2-vue
2023-08-20 19:45:09 -04:00
69438464bf Fix python path for user scripts 2023-08-19 17:04:47 -04:00
dependabot[bot]
aa60b526f4
Bump protobufjs from 6.11.3 to 6.11.4 in /sist2-vue
Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 6.11.3 to 6.11.4.
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md)
- [Commits](https://github.com/protobufjs/protobuf.js/commits)

---
updated-dependencies:
- dependency-name: protobufjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-08-19 20:09:14 +00:00
2ea8b51b34
Merge pull request #411 from simon987/embeddings
rework user scripts, add support for embedding search
2023-08-19 16:08:50 -04:00
64b4b201d5
Merge branch 'master' into embeddings 2023-08-19 15:48:15 -04:00
857f3315c2 Rework user scripts, update DB schema to support embeddings 2023-08-19 15:46:19 -04:00
5771693b1a
Update README.md 2023-07-28 07:57:34 -04:00
27188b6fa0 wip 2023-07-24 19:36:20 -04:00
f56cfb0f2f Version bump, fix #392 2023-07-15 11:57:18 -04:00
70242846ae Rework sist2-admin UI 2023-07-14 21:19:59 -04:00
b833acf522 remove trailing slash in other endpoints 2023-07-14 20:13:32 -04:00
5fa2da5eef Update readme 2023-07-14 12:17:59 -04:00
ec518887ee version bump 2023-07-14 12:14:05 -04:00
0b0b7fe951 Fix stats page #387 2023-07-14 12:13:13 -04:00
ba863e4e6c Version bump 2023-07-14 11:35:11 -04:00
cbab4c2841 Remove leading slash in sist2-admin API requests #384 2023-07-14 11:20:05 -04:00
930361e78c Fix #388 2023-07-13 21:38:43 -04:00
92478ec47c Remove debug statement 2023-07-13 21:12:43 -04:00
0d81d7c43b Update openssl api #374 2023-07-13 20:49:45 -04:00
9f175cb0f0 Speedup CI build 2023-07-13 16:46:43 -04:00
6225cf81de Fix CI build scripts (pt. 2) 2023-07-13 16:29:30 -04:00
d7058ab645 Fix CI build scripts, update caniuse database 2023-07-13 16:23:04 -04:00
84958502b1 Fix websocket for #384 2023-07-12 19:39:52 -04:00
a0b6eed037 Fix arm64 dockerfile 2023-07-12 19:37:15 -04:00
06d6910151 Add test image 2023-07-12 19:36:17 -04:00
b99e4ddf13 npm audit fix 2023-07-12 19:36:17 -04:00
d14139ba44
Merge pull request #385 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/semver-5.7.2
Bump semver from 5.7.1 to 5.7.2 in /sist2-admin/frontend
2023-07-12 16:14:39 -04:00
dependabot[bot]
13960337aa
Bump semver from 5.7.1 to 5.7.2 in /sist2-admin/frontend
Bumps [semver](https://github.com/npm/node-semver) from 5.7.1 to 5.7.2.
- [Release notes](https://github.com/npm/node-semver/releases)
- [Changelog](https://github.com/npm/node-semver/blob/v5.7.2/CHANGELOG.md)
- [Commits](https://github.com/npm/node-semver/compare/v5.7.1...v5.7.2)

---
updated-dependencies:
- dependency-name: semver
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-12 15:28:54 +00:00
2596361af5 Use mupdf's OCR methods rather than raw tesseract, various fixes 2023-07-10 21:40:58 -04:00
5a1a04629f Fix #376 2023-07-01 09:21:02 -04:00
242dd67416 Fix #378 2023-07-01 09:06:03 -04:00
54d902146a Free tag json if parsing failed 2023-07-01 08:38:30 -04:00
3b0ab3679a
Merge pull request #380 from jeaneric/Fix-tag
Fix tag
2023-07-01 08:37:21 -04:00
jeaneric
58ce0ef414
Fix tag
On my setup the cJSON_Delete corrupted the req object, releasing after fixed it.
2023-06-30 19:48:34 -04:00
f984baf7fd Fix #373 2023-06-10 10:59:25 -04:00
ce242d1053 Fix #372 2023-06-09 08:16:10 -04:00
71deab7fa2
Merge pull request #371 from dpieski/patch-3
Only remove files with job_name.
2023-06-08 18:02:20 -04:00
Andrew
b0462f9378
Only remove files with job_name. 2023-06-08 11:50:39 -05:00
ca845d80e8 Version bump 2023-06-07 20:40:45 -04:00
e2025df2c0 sist2-admin: don't set status to failed when using debug binary 2023-06-07 20:40:11 -04:00
7eb064162e close db connection before loop #346 2023-06-07 20:25:13 -04:00
7bc4b73e43 Use relative paths in sist2-admin #369 2023-06-07 19:59:50 -04:00
ca2e308d89 Bug fixes 2023-06-04 10:11:46 -04:00
c03c148273 SQLite backend support for sist2-admin #366 2023-06-03 18:33:44 -04:00
5522bcfa9b Add log cleanup features 2023-05-27 18:54:54 -04:00
f0fd708082 Fix timestamps in sist2-admin #359 2023-05-25 20:51:06 -04:00
6bf2b4c74d Merge remote-tracking branch 'origin/master' 2023-05-25 19:56:56 -04:00
d907576406 Update ES mapping for mtime #364 2023-05-25 19:56:46 -04:00
7659b481fa
Merge pull request #365 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/socket.io-parser-4.2.3
Bump socket.io-parser from 4.2.2 to 4.2.3 in /sist2-admin/frontend
2023-05-24 08:52:16 -04:00
dependabot[bot]
e81e5ee457
Bump socket.io-parser from 4.2.2 to 4.2.3 in /sist2-admin/frontend
Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.2.2 to 4.2.3.
- [Release notes](https://github.com/socketio/socket.io-parser/releases)
- [Changelog](https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md)
- [Commits](https://github.com/socketio/socket.io-parser/compare/4.2.2...4.2.3)

---
updated-dependencies:
- dependency-name: socket.io-parser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-24 01:24:25 +00:00
d4820d2fad updates 2023-05-22 10:52:53 -04:00
b3b3005692 Update thumbnail-quality parameter in sist2-admin 2023-05-20 13:16:07 -04:00
610882112d Use WEBP to encode thumbnails 2023-05-20 13:12:12 -04:00
e2e0cf260f Skip encrypted files when no passphrase is supplied 2023-05-18 20:09:17 -04:00
3ffa30cc6f Only fetch ES aggregations on the first request (#357) 2023-05-18 19:56:40 -04:00
7920318406 Add polish localization (pt.2) 2023-05-18 19:52:15 -04:00
41ef940623 Add polish localization 2023-05-18 19:42:39 -04:00
cdec1cebc6 Add .devcontainer folder 2023-05-18 15:10:24 -04:00
0ce341d8e6 Fix print_errors in elastic.c 2023-05-18 15:10:00 -04:00
7d96d62983
Merge pull request #363 from simon987/sqlite-index
SQLite search backend
2023-05-18 14:29:34 -04:00
72293d26f2 code quality fixes 2023-05-18 14:26:49 -04:00
944c224904 SQLite search backend 2023-05-18 14:16:11 -04:00
63027dd5ca
Merge pull request #354 from dpieski/patch-3
Use es-index WebOption when calling sist2.web
2023-04-27 14:46:59 -04:00
Andrew
ac942947e4
Use es-index WebOption when calling sist2.web 2023-04-27 12:57:52 -05:00
1cfceba518 wip 2023-04-25 08:49:50 -04:00
35cfd3b3b1 version bump 2023-04-23 18:26:55 -04:00
b286c652ad Fix tesseract language paths (pt 2) 2023-04-23 18:26:34 -04:00
2d8685f8f5 Fix tesseract language paths 2023-04-23 17:29:55 -04:00
c930ef7840 Update readme 2023-04-23 16:39:34 -04:00
d32bda0d68 Bug fixes 2023-04-23 14:15:31 -04:00
499ed0be79 Fix readme link 2023-04-23 12:54:33 -04:00
dc39c0ec4b Add NER support 2023-04-23 12:53:27 -04:00
b5cdd9a5df Work on README, optimize database storage 2023-04-22 16:02:19 -04:00
a8b6886f7b Fix stats page 2023-04-16 19:46:01 -04:00
a7e9b6af96 Flush documents in index 2023-04-15 13:49:18 -04:00
0710dc6d3d Update readme / format support 2023-04-15 13:39:25 -04:00
75b66b5982 Fix #351 2023-04-15 13:06:13 -04:00
9813646c11 Fix #343 2023-04-15 12:39:47 -04:00
ebc9468251 Fix some memory leaks 2023-04-15 11:54:56 -04:00
7baaca5078 add_work fix for problem in #349 pt 2 2023-04-15 09:48:50 -04:00
6c4bdc87cf add_work fix for problem in #349 2023-04-15 09:18:17 -04:00
1ea78887c3 Fix aarch64 build 2023-04-14 21:53:25 -04:00
886fa720ec Fix for ES 8.X #302 2023-04-14 21:48:29 -04:00
d43aac735f Add build flag to toggle debug info in web module 2023-04-14 21:07:48 -04:00
faf438a798 Add error message in home page on ES connection error #331 2023-04-14 20:51:35 -04:00
5b3b9911bd Bug fix for delete iterator 2023-04-13 18:35:36 -04:00
237d55ec9c
Merge pull request #348 from simon987/dependabot/npm_and_yarn/sist2-vue/d3-color-and-d3-3.1.0
Bump d3-color and d3 in /sist2-vue
2023-04-10 20:40:35 -04:00
dependabot[bot]
ced4c7de88
Bump d3-color and d3 in /sist2-vue
Bumps [d3-color](https://github.com/d3/d3-color) to 3.1.0 and updates ancestor dependency [d3](https://github.com/d3/d3). These dependencies need to be updated together.


Updates `d3-color` from 1.4.1 to 3.1.0
- [Release notes](https://github.com/d3/d3-color/releases)
- [Commits](https://github.com/d3/d3-color/compare/v1.4.1...v3.1.0)

Updates `d3` from 5.16.0 to 7.8.4
- [Release notes](https://github.com/d3/d3/releases)
- [Changelog](https://github.com/d3/d3/blob/main/CHANGES.md)
- [Commits](https://github.com/d3/d3/compare/v5.16.0...v7.8.4)

---
updated-dependencies:
- dependency-name: d3-color
  dependency-type: indirect
- dependency-name: d3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-04-11 00:35:05 +00:00
90ee318981 Update CI build script 2023-04-10 20:07:52 -04:00
785121e46c
Merge pull request #347 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/webpack-5.78.0
Bump webpack from 5.75.0 to 5.78.0 in /sist2-admin/frontend
2023-04-10 19:57:27 -04:00
585c57a2ad Fix antiword version 2023-04-10 19:57:02 -04:00
42abbbce95 Fix libmobi version 2023-04-10 19:54:05 -04:00
dependabot[bot]
e8607df26f
Bump webpack from 5.75.0 to 5.78.0 in /sist2-admin/frontend
Bumps [webpack](https://github.com/webpack/webpack) from 5.75.0 to 5.78.0.
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](https://github.com/webpack/webpack/compare/v5.75.0...v5.78.0)

---
updated-dependencies:
- dependency-name: webpack
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-04-10 23:51:06 +00:00
f1726ca0a9
Merge pull request #342 from simon987/dependabot/npm_and_yarn/sist2-vue/webpack-5.76.1
Bump webpack from 5.75.0 to 5.76.1 in /sist2-vue
2023-04-10 19:50:40 -04:00
3ef675abcf
Merge pull request #345 from simon987/process-pool
Process pool
2023-04-10 19:50:23 -04:00
01490d1cbf Update sist2-admin for 3.x.x, more fixes 2023-04-10 19:45:08 -04:00
6182338f29 Update dependencies, fix some build issues 2023-04-10 15:10:56 -04:00
300c70883d Fixes and cleanup 2023-04-10 11:04:16 -04:00
fc36f33d52 use sqlite to save index, major thread pool refactor 2023-04-03 21:39:50 -04:00
dependabot[bot]
81658efb19
Bump webpack from 5.75.0 to 5.76.1 in /sist2-vue
Bumps [webpack](https://github.com/webpack/webpack) from 5.75.0 to 5.76.1.
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](https://github.com/webpack/webpack/compare/v5.75.0...v5.76.1)

---
updated-dependencies:
- dependency-name: webpack
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-15 10:25:29 +00:00
ca973d63a4 Still WIP.. 2023-03-12 11:38:31 -04:00
f8abffba81 process pool mostly works, still WIP 2023-03-09 22:11:21 -05:00
60c77678b4
Merge pull request #339 from einfachTobi/patch-1
Update messages.ts
2023-02-28 17:40:45 -05:00
einfachTobi
bf1d2f7d55
Update messages.ts 2023-02-28 11:24:02 +01:00
8c662bb8f8 Adjust some structs 2023-02-27 20:44:25 -05:00
9c40dddd41 remove deprecated note 2023-02-26 11:03:29 -05:00
d259b95017 Update sist2-admin database schema, fix thumbnail-size 2023-02-26 10:42:20 -05:00
707bac86b3 Fix #329, version bump 2023-02-23 21:21:54 -05:00
8b9b067c06 Fix #332 2023-02-23 19:53:05 -05:00
b17f3ff924
Merge pull request #338 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/sideway/formula-3.0.1
Bump @sideway/formula from 3.0.0 to 3.0.1 in /sist2-admin/frontend
2023-02-23 19:32:09 -05:00
e44fbf741c update libscan-test-files 2023-02-23 18:13:27 -05:00
fa14efbeb6 Handle zipbomb files 2023-02-22 22:25:21 -05:00
c510162dd9 Fix duration formatting in sist2-admin 2023-02-16 21:07:30 -05:00
f5c664507f use index name in sist2-admin auto-named dir 2023-02-16 09:03:06 -05:00
2805fd509f Fix tag-auth param in sist2-admin #337 2023-02-13 20:19:24 -05:00
20adcce4a9 Remove default tags, add configurable featured line 2023-02-13 20:14:11 -05:00
1e6e24111b Add german in loading page 2023-02-13 20:13:07 -05:00
dependabot[bot]
5a76b855c9
Bump @sideway/formula from 3.0.0 to 3.0.1 in /sist2-admin/frontend
Bumps [@sideway/formula](https://github.com/sideway/formula) from 3.0.0 to 3.0.1.
- [Release notes](https://github.com/sideway/formula/releases)
- [Commits](https://github.com/sideway/formula/compare/v3.0.0...v3.0.1)

---
updated-dependencies:
- dependency-name: "@sideway/formula"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-09 08:44:17 +00:00
6f759642fc Rework duration/resolution badge style 2023-02-07 20:39:12 -05:00
587c9a2c90 Add de lang option in config page 2023-02-03 09:27:30 -05:00
821a571ecf
Merge pull request #335 from einfachTobi/UI-localization-german
UI localization german + equations in tesseract
2023-02-03 09:19:33 -05:00
einfachTobi
9020246a01
Merge branch 'simon987:master' into UI-localization-german 2023-02-03 10:18:54 +01:00
einfachTobi
200c000c5a
Update Dockerfile 2023-02-03 10:18:43 +01:00
einfachTobi
a43f930d00
Update messages.ts 2023-02-03 10:12:24 +01:00
abe120197a Remove generated files from repo, build vue frontends in Dockerfile 2023-02-02 20:31:16 -05:00
9e0d7bf992 Add test files as submodule, remove support for msword thumbnails 2023-02-02 19:52:37 -05:00
einfachTobi
959d4b4386
Update messages.ts 2023-02-01 14:55:37 +01:00
einfachTobi
742a50be03
Update messages.ts 2023-02-01 12:54:06 +01:00
87ecc5ef6d
Update USAGE.md 2023-01-29 12:47:17 -05:00
2e3d648796 Update --thumbnail-quality argument, add documentation 2023-01-29 11:24:34 -05:00
9972e21fcc Fix lightbox 2023-01-26 20:20:58 -05:00
c625c03552 Fix #328 2023-01-25 21:30:18 -05:00
5863b9cd6e
Merge pull request #327 from simon987/auth0
Add support for auth0
2023-01-24 19:56:05 -05:00
86ca9f1ecb Add support for auth0 2023-01-24 19:55:16 -05:00
b9f008603a OCR fixes 2023-01-13 20:13:20 -05:00
a074d8cf10 cleanup/fixes 2022-12-19 19:05:35 -05:00
795b6e2e2e
Merge pull request #321 from simon987/dependabot/npm_and_yarn/sist2-vue/express-4.18.2
Bump express from 4.17.1 to 4.18.2 in /sist2-vue
2022-12-19 18:33:55 -05:00
dependabot[bot]
59fd0f935c
Bump express from 4.17.1 to 4.18.2 in /sist2-vue
Bumps [express](https://github.com/expressjs/express) from 4.17.1 to 4.18.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.17.1...4.18.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-14 19:55:36 +00:00
3bd1f593b0 Fix dockerfile 2022-12-03 18:00:55 -05:00
8e93d50d9e
Merge pull request #320 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/loader-utils-1.4.2
Bump loader-utils from 1.4.0 to 1.4.2 in /sist2-admin/frontend
2022-12-03 17:11:48 -05:00
dependabot[bot]
e093a8a05c
Bump loader-utils from 1.4.0 to 1.4.2 in /sist2-admin/frontend
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 1.4.0 to 1.4.2.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v1.4.2/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v1.4.0...v1.4.2)

---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-03 22:11:40 +00:00
588b4df164
Merge pull request #319 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/ejs-and-vue/cli-plugin-babel-and-vue/cli-plugin-eslint-and-vue/cli-plugin-router-and-vue/cli-plugin-vuex-and-vue/cli-service--removed
Bump ejs, @vue/cli-plugin-babel, @vue/cli-plugin-eslint, @vue/cli-plugin-router, @vue/cli-plugin-vuex and @vue/cli-service in /sist2-admin/frontend
2022-12-03 17:11:06 -05:00
dependabot[bot]
8b1740958b
Bump ejs, @vue/cli-plugin-babel, @vue/cli-plugin-eslint, @vue/cli-plugin-router, @vue/cli-plugin-vuex and @vue/cli-service
Removes [ejs](https://github.com/mde/ejs). It's no longer used after updating ancestor dependencies [ejs](https://github.com/mde/ejs), [@vue/cli-plugin-babel](https://github.com/vuejs/vue-cli/tree/HEAD/packages/@vue/cli-plugin-babel), [@vue/cli-plugin-eslint](https://github.com/vuejs/vue-cli/tree/HEAD/packages/@vue/cli-plugin-eslint), [@vue/cli-plugin-router](https://github.com/vuejs/vue-cli/tree/HEAD/packages/@vue/cli-plugin-router), [@vue/cli-plugin-vuex](https://github.com/vuejs/vue-cli/tree/HEAD/packages/@vue/cli-plugin-vuex) and [@vue/cli-service](https://github.com/vuejs/vue-cli/tree/HEAD/packages/@vue/cli-service). These dependencies need to be updated together.


Removes `ejs`

Updates `@vue/cli-plugin-babel` from 4.5.17 to 5.0.8
- [Release notes](https://github.com/vuejs/vue-cli/releases)
- [Changelog](https://github.com/vuejs/vue-cli/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/vuejs/vue-cli/commits/v5.0.8/packages/@vue/cli-plugin-babel)

Updates `@vue/cli-plugin-eslint` from 4.5.17 to 5.0.8
- [Release notes](https://github.com/vuejs/vue-cli/releases)
- [Changelog](https://github.com/vuejs/vue-cli/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/vuejs/vue-cli/commits/v5.0.8/packages/@vue/cli-plugin-eslint)

Updates `@vue/cli-plugin-router` from 4.5.17 to 5.0.8
- [Release notes](https://github.com/vuejs/vue-cli/releases)
- [Changelog](https://github.com/vuejs/vue-cli/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/vuejs/vue-cli/commits/v5.0.8/packages/@vue/cli-plugin-router)

Updates `@vue/cli-plugin-vuex` from 4.5.17 to 5.0.8
- [Release notes](https://github.com/vuejs/vue-cli/releases)
- [Changelog](https://github.com/vuejs/vue-cli/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/vuejs/vue-cli/commits/v5.0.8/packages/@vue/cli-plugin-vuex)

Updates `@vue/cli-service` from 4.5.17 to 5.0.8
- [Release notes](https://github.com/vuejs/vue-cli/releases)
- [Changelog](https://github.com/vuejs/vue-cli/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/vuejs/vue-cli/commits/v5.0.8/packages/@vue/cli-service)

---
updated-dependencies:
- dependency-name: ejs
  dependency-type: indirect
- dependency-name: "@vue/cli-plugin-babel"
  dependency-type: direct:development
- dependency-name: "@vue/cli-plugin-eslint"
  dependency-type: direct:development
- dependency-name: "@vue/cli-plugin-router"
  dependency-type: direct:development
- dependency-name: "@vue/cli-plugin-vuex"
  dependency-type: direct:development
- dependency-name: "@vue/cli-service"
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-03 22:10:39 +00:00
fe25ad5459
Merge pull request #317 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/eventsource-1.1.2
Bump eventsource from 1.1.0 to 1.1.2 in /sist2-admin/frontend
2022-12-03 17:09:12 -05:00
dependabot[bot]
83d9e0fb4b
Bump eventsource from 1.1.0 to 1.1.2 in /sist2-admin/frontend
Bumps [eventsource](https://github.com/EventSource/eventsource) from 1.1.0 to 1.1.2.
- [Release notes](https://github.com/EventSource/eventsource/releases)
- [Changelog](https://github.com/EventSource/eventsource/blob/master/HISTORY.md)
- [Commits](https://github.com/EventSource/eventsource/compare/v1.1.0...v1.1.2)

---
updated-dependencies:
- dependency-name: eventsource
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-03 22:09:00 +00:00
6c8e6ac0b3
Merge pull request #313 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/decode-uri-component-0.2.2
Bump decode-uri-component from 0.2.0 to 0.2.2 in /sist2-admin/frontend
2022-12-03 17:07:57 -05:00
dependabot[bot]
2de2f87f16
Bump decode-uri-component from 0.2.0 to 0.2.2 in /sist2-admin/frontend
Bumps [decode-uri-component](https://github.com/SamVerschueren/decode-uri-component) from 0.2.0 to 0.2.2.
- [Release notes](https://github.com/SamVerschueren/decode-uri-component/releases)
- [Commits](https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.0...v0.2.2)

---
updated-dependencies:
- dependency-name: decode-uri-component
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-03 22:07:23 +00:00
61eb311577
Merge pull request #292 from simon987/dependabot/npm_and_yarn/sist2-vue/eventsource-1.1.1
Bump eventsource from 1.1.0 to 1.1.1 in /sist2-vue
2022-12-03 17:06:45 -05:00
f3a4598cfd
Merge pull request #298 from simon987/dependabot/npm_and_yarn/sist2-vue/terser-4.8.1
Bump terser from 4.8.0 to 4.8.1 in /sist2-vue
2022-12-03 17:06:39 -05:00
e135efaa4b
Merge pull request #303 from simon987/dependabot/npm_and_yarn/sist2-vue/d3-color-and-d3-3.1.0
Bump d3-color and d3 in /sist2-vue
2022-12-03 17:06:32 -05:00
5d488acd77
Merge pull request #310 from simon987/dependabot/npm_and_yarn/sist2-vue/minimatch-3.1.2
Bump minimatch from 3.0.4 to 3.1.2 in /sist2-vue
2022-12-03 17:06:26 -05:00
79b78a92f8
Merge pull request #312 from simon987/dev
v2.13.0
2022-12-03 17:06:19 -05:00
ad18b4d7fd cleanup 2022-11-27 10:14:42 -05:00
d221de5a94 Revert "cleanup"
This reverts commit 13e7ea188b6556a3c1072e9241e49587ef1cece0.
2022-11-27 10:14:28 -05:00
13e7ea188b cleanup 2022-11-27 09:24:29 -05:00
cb4bd9f05a Add sist2-admin, update Dockerfile & docker-compose 2022-11-26 21:27:43 -05:00
c0b8a9c467 Fix #311 2022-11-23 20:51:56 -05:00
c18557e360 Fix thumbnail copying for incremental index, fix incremental index when there are no new updates, add option for JSON logs output 2022-11-23 20:45:47 -05:00
4ec54c9a32 Generate random seed when ?sort=random param is specified 2022-11-23 20:43:11 -05:00
dependabot[bot]
1baf3861f7
Bump minimatch from 3.0.4 to 3.1.2 in /sist2-vue
Bumps [minimatch](https://github.com/isaacs/minimatch) from 3.0.4 to 3.1.2.
- [Release notes](https://github.com/isaacs/minimatch/releases)
- [Commits](https://github.com/isaacs/minimatch/compare/v3.0.4...v3.1.2)

---
updated-dependencies:
- dependency-name: minimatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-11-12 14:27:07 +00:00
dependabot[bot]
25fb912f69
Bump d3-color and d3 in /sist2-vue
Bumps [d3-color](https://github.com/d3/d3-color) to 3.1.0 and updates ancestor dependency [d3](https://github.com/d3/d3). These dependencies need to be updated together.


Updates `d3-color` from 1.4.1 to 3.1.0
- [Release notes](https://github.com/d3/d3-color/releases)
- [Commits](https://github.com/d3/d3-color/compare/v1.4.1...v3.1.0)

Updates `d3` from 5.16.0 to 7.6.1
- [Release notes](https://github.com/d3/d3/releases)
- [Changelog](https://github.com/d3/d3/blob/main/CHANGES.md)
- [Commits](https://github.com/d3/d3/compare/v5.16.0...v7.6.1)

---
updated-dependencies:
- dependency-name: d3-color
  dependency-type: indirect
- dependency-name: d3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-30 00:54:51 +00:00
dependabot[bot]
e7f27cfd13
Bump terser from 4.8.0 to 4.8.1 in /sist2-vue
Bumps [terser](https://github.com/terser/terser) from 4.8.0 to 4.8.1.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-21 19:34:05 +00:00
dependabot[bot]
6e38653f2f
Bump eventsource from 1.1.0 to 1.1.1 in /sist2-vue
Bumps [eventsource](https://github.com/EventSource/eventsource) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/EventSource/eventsource/releases)
- [Changelog](https://github.com/EventSource/eventsource/blob/master/HISTORY.md)
- [Commits](https://github.com/EventSource/eventsource/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: eventsource
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-06-01 22:27:52 +00:00
38fba363f2 Upgrade to mongoose 7.7 2022-05-22 10:25:40 -04:00
c7b3d11a6d Version bump, fix mime serve for files with no extensions 2022-04-27 22:01:30 -04:00
4e1109c528
Merge pull request #288 from simon987/dev
v2.12.1
2022-04-23 10:30:19 -04:00
f87de89275 Version bump 2022-04-23 10:29:50 -04:00
1205981a11 CURL error handling, fix ES version handling, support for ES8, add --es-insecure-ssl argument 2022-04-23 10:29:31 -04:00
09613eaaf9 import magic database as a blob as last resort to make it work 2022-04-18 12:55:22 -04:00
a74726be55
Merge pull request #285 from simon987/dependabot/npm_and_yarn/sist2-vue/async-2.6.4
Bump async from 2.6.3 to 2.6.4 in /sist2-vue
2022-04-17 13:42:40 -04:00
dependabot[bot]
cb228052d2
Bump async from 2.6.3 to 2.6.4 in /sist2-vue
Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4.
- [Release notes](https://github.com/caolan/async/releases)
- [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md)
- [Commits](https://github.com/caolan/async/compare/v2.6.3...v2.6.4)

---
updated-dependencies:
- dependency-name: async
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-17 17:41:14 +00:00
fe56da95d5
Merge pull request #284 from simon987/dev
v2.12.0
2022-04-17 13:38:42 -04:00
9f2ad58f78 bump version 2022-04-17 12:30:14 -04:00
84d9bf4323 Fix cmake libmobi build maybe 2022-04-17 12:23:45 -04:00
90aa90f3f3 Update antiword 2022-04-17 11:47:33 -04:00
3fad07360c
Merge pull request #283 from simon987/dependabot/npm_and_yarn/sist2-vue/minimist-1.2.6
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
2022-04-17 10:12:10 -04:00
dependabot[bot]
00c3a640d0
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-17 12:53:12 +00:00
730e495bde Enable highlight in document info modal, remove /d/ endpoint 2022-04-16 16:11:17 -04:00
54df1dfcf7 Fix spacebar not working in search bar 2022-04-16 13:51:36 -04:00
a75675ecea Fix thumbnail copy bug, update tests 2022-04-16 11:48:43 -04:00
901035da15 Build libmobi with cmake, update to 0.10 2022-04-15 16:01:40 -04:00
ceb7265639 Fix max_analyzed_offset (again?) 2022-04-15 15:35:39 -04:00
036ed9ea1e Update libmagic cmake things 2022-04-15 15:35:20 -04:00
779303a2f7 Print body response when task id cannot be read 2022-04-14 16:24:56 -04:00
23aee14c07 Fix exec-script & fix memory leak in exec_args_validate 2022-04-14 15:43:24 -04:00
50b9201be3
Merge pull request #279 from simon987/dependabot/npm_and_yarn/sist2-vue/minimist-1.2.6
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
2022-04-05 20:12:03 -04:00
dependabot[bot]
14cfb15661
Bump minimist from 1.2.5 to 1.2.6 in /sist2-vue
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-31 23:28:25 +00:00
125c85d9bb localize tag filter bar 2022-03-18 09:15:07 -04:00
474eb95aff Update antiword 2022-03-17 15:08:55 -04:00
acf7453057 Add test for large msdoc 2022-03-17 15:05:48 -04:00
9a949d2694 Use TRUE rather than 1 2022-03-17 09:13:19 -04:00
dbdc75dcb8 Add filter bar in tag picker 2022-03-17 09:12:43 -04:00
c575fca91d Do not store duration or bitrate when the value is 0 or for images 2022-03-05 21:24:59 -05:00
0bf4244683 Do blank search on page reload when media tab auto-reload is disabled 2022-03-05 20:56:02 -05:00
eea5ce75f3 Fix query args updating outside of the search page 2022-03-05 20:42:13 -05:00
9b81856353 Fix some errors in keyboard handler 2022-03-05 20:33:45 -05:00
a10d6952ba Fix segfault in print_errors() 2022-03-05 20:33:21 -05:00
2b639bd4ac Error handling in get_es_version() 2022-03-05 14:59:37 -05:00
e9f92330fd Cleanup macros 2022-03-05 11:18:07 -05:00
cb37a6e6c1 Fix thumbnail bug in serve 2022-03-05 11:18:07 -05:00
b82c26f0fb Add mt_ int_ prefixes in InfoTable 2022-03-05 11:18:06 -05:00
16a4fb4874 Rework document IDs 2022-03-05 11:18:06 -05:00
cdc4c0ad3d Cap maximum thumbnail count to 1000 2022-03-05 11:18:06 -05:00
d034851ecb Setup keyboard shortcuts for Lightbox, add option to disable animations 2022-03-05 11:18:06 -05:00
ea7dfe7c84 Update to mongoose 7.6 2022-03-05 11:18:05 -05:00
8bfd010f4b Update dev ES docker script 2022-03-05 11:18:05 -05:00
499eb2b2e4 Un-break raw file thumbnails 2022-03-05 11:18:05 -05:00
25ab883063
Merge pull request #263 from simon987/dependabot/npm_and_yarn/sist2-vue/url-parse-1.5.10
Bump url-parse from 1.5.4 to 1.5.10 in /sist2-vue
2022-02-28 09:26:15 -05:00
dependabot[bot]
6ab606203f
Bump url-parse from 1.5.4 to 1.5.10 in /sist2-vue
Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.5.4 to 1.5.10.
- [Release notes](https://github.com/unshiftio/url-parse/releases)
- [Commits](https://github.com/unshiftio/url-parse/compare/1.5.4...1.5.10)

---
updated-dependencies:
- dependency-name: url-parse
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-28 04:23:32 +00:00
6ec98046fa
Merge pull request #262 from yatli/fix_261
fix #261: inherit index id from base index when using incremental scan
2022-02-26 11:37:16 -05:00
Yatao Li
4fac81ca6a fix #261: new index ids generated for incremental scan 2022-02-27 00:25:23 +08:00
240 changed files with 48202 additions and 30456 deletions

View File

@ -8,13 +8,13 @@ Testing/
**/cmake_install.cmake
**/CMakeCache.txt
**/CMakeFiles/
.cmake
LICENSE
Makefile
**/*.md
**/*.cbp
VERSION
**/node_modules/
.git/
sist2-*-linux-debug
sist2-*-linux
sist2_debug
@ -28,4 +28,16 @@ sist2
**/ext_libwpd
**/core
*.a
tmp_scan/
tmp_scan/
Dockerfile
Dockerfile.arm64
docker-compose.yml
state.db
*-journal
build/
__pycache__/
sist2-vue/dist
sist2-admin/frontend/dist
*.fts
.git
third-party/libscan/third-party/ext_*/*

View File

@ -7,11 +7,36 @@ platform:
arch: amd64
steps:
- name: submodules
image: alpine/git
commands:
- git submodule update --init --recursive
- name: docker
image: plugins/docker
depends_on:
- submodules
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: sist2app/sist2
context: ./
dockerfile: ./Dockerfile
auto_tag: true
auto_tag_suffix: x64-linux
when:
event:
- tag
- name: build
image: simon987/sist2-build
image: sist2app/sist2-build
depends_on:
- submodules
commands:
- ./scripts/build.sh
- name: scp files
depends_on:
- build
image: appleboy/drone-scp
settings:
host:
@ -22,26 +47,11 @@ steps:
from_secret: SSH_USER
key:
from_secret: SSH_KEY
target: /files/sist2/${DRONE_REPO_OWNER}_${DRONE_REPO_NAME}/${DRONE_BRANCH}_${DRONE_BUILD_NUMBER}_${DRONE_COMMIT}/
target: ~/files/sist2/${DRONE_REPO_OWNER}_${DRONE_REPO_NAME}/${DRONE_BRANCH}_${DRONE_BUILD_NUMBER}_${DRONE_COMMIT}/
source:
- ./VERSION
- ./sist2-x64-linux
- ./sist2-x64-linux-debug
- name: docker
image: plugins/docker
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: simon987/sist2
context: ./
dockerfile: ./Dockerfile
auto_tag: true
auto_tag_suffix: x64-linux
when:
event:
- tag
---
kind: pipeline
@ -52,11 +62,36 @@ platform:
arch: arm64
steps:
- name: submodules
image: alpine/git
commands:
- git submodule update --init --recursive
- name: docker
image: plugins/docker
depends_on:
- submodules
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: sist2app/sist2
context: ./
dockerfile: ./Dockerfile.arm64
auto_tag: true
auto_tag_suffix: arm64-linux
when:
event:
- tag
- name: build
image: simon987/sist2-build-arm64
image: sist2app/sist2-build-arm64
depends_on:
- submodules
commands:
- ./scripts/build_arm64.sh
- name: scp files
depends_on:
- build
image: appleboy/drone-scp
settings:
host:
@ -67,22 +102,7 @@ steps:
from_secret: SSH_USER
key:
from_secret: SSH_KEY
target: /files/sist2/${DRONE_REPO_OWNER}_${DRONE_REPO_NAME}/arm_${DRONE_BRANCH}_${DRONE_BUILD_NUMBER}_${DRONE_COMMIT}/
target: ~/files/sist2/${DRONE_REPO_OWNER}_${DRONE_REPO_NAME}/arm_${DRONE_BRANCH}_${DRONE_BUILD_NUMBER}_${DRONE_COMMIT}/
source:
- ./sist2-arm64-linux
- ./sist2-arm64-linux-debug
- name: docker
image: plugins/docker
settings:
username:
from_secret: DOCKER_USER
password:
from_secret: DOCKER_PASSWORD
repo: simon987/sist2
context: ./
dockerfile: ./Dockerfile.arm64
auto_tag: true
auto_tag_suffix: arm64-linux
when:
event:
- tag

3
.gitattributes vendored
View File

@ -1,3 +0,0 @@
CMakeModules/* linguist-vendored
**/*_generated.c linguist-vendored
**/*_generated.h linguist-vendored

23
.gitignore vendored
View File

@ -3,6 +3,7 @@ thumbs
*.cbp
CMakeCache.txt
CMakeFiles
cmake-build-default-event-trace
cmake-build-debug
cmake_install.cmake
Makefile
@ -10,6 +11,9 @@ Makefile
LOG
sist2*
!sist2-vue/
!sist2-admin
!sist2_admin
!sist2.py
*.sist2/
bundle*.css
bundle.js
@ -25,4 +29,21 @@ test_i
test_i_inc
node_modules/
.cmake/
i_inc/
i_inc/
state.db
*.pyc
!sist2-admin/frontend/dist
*.js.map
sist2-vue/dist
sist2-admin/frontend/dist
.ninja_deps
.ninja_log
build.ninja
src/web/static_generated.c
src/magic_generated.c
src/index/static_generated.c
*.sist2
*-shm
*-journal
.vscode
*.fts

6
.gitmodules vendored
View File

@ -7,3 +7,9 @@
[submodule "third-party/libscan/third-party/antiword"]
path = third-party/libscan/third-party/antiword
url = https://github.com/simon987/antiword
[submodule "third-party/libscan/third-party/libmobi"]
path = third-party/libscan/third-party/libmobi
url = https://github.com/bfabiszewski/libmobi
[submodule "third-party/libscan/libscan-test-files"]
path = third-party/libscan/libscan-test-files
url = https://github.com/simon987/libscan-test-files

View File

@ -1,10 +1,11 @@
cmake_minimum_required(VERSION 3.7)
project(sist2)
set(CMAKE_C_STANDARD 11)
project(sist2 C)
option(SIST_DEBUG "Build a debug executable" on)
option(SIST_FAKE_STORE "Disable IO operations of LMDB stores for debugging purposes" 0)
option(SIST_FAST "Enable more optimisation flags" off)
option(SIST_DEBUG_INFO "Turn on debug information in web interface" on)
add_compile_definitions(
"SIST_PLATFORM=${SIST_PLATFORM}"
@ -14,46 +15,67 @@ if (SIST_DEBUG)
add_compile_definitions(
"SIST_DEBUG=${SIST_DEBUG}"
)
endif()
set(VCPKG_BUILD_TYPE debug)
else ()
set(VCPKG_BUILD_TYPE release)
endif ()
if (SIST_DEBUG_INFO)
add_compile_definitions(
"SIST_DEBUG_INFO=${SIST_DEBUG_INFO}"
)
endif ()
add_subdirectory(third-party/libscan)
set(ARGPARSE_SHARED off)
add_subdirectory(third-party/argparse)
add_executable(sist2
add_executable(
sist2
# argparse
third-party/argparse/argparse.h third-party/argparse/argparse.c
src/main.c
src/sist.h
src/io/walk.h src/io/walk.c
src/io/store.h src/io/store.c
src/tpool.h src/tpool.c
src/parsing/parse.h src/parsing/parse.c
src/parsing/magic_util.c src/parsing/magic_util.h
src/io/serialize.h src/io/serialize.c
src/parsing/mime.h src/parsing/mime.c src/parsing/mime_generated.c
src/index/web.c src/index/web.h
src/web/serve.c src/web/serve.h
src/web/web_util.c src/web/web_util.h
src/index/elastic.c src/index/elastic.h
src/util.c src/util.h
src/ctx.h src/types.h
src/ctx.c src/ctx.h
src/types.h
src/log.c src/log.h
src/cli.c src/cli.h
src/stats.c src/stats.h src/ctx.c
src/parsing/sidecar.c src/parsing/sidecar.h
src/database/database.c src/database/database.h
src/parsing/fs_util.h
# argparse
third-party/argparse/argparse.h third-party/argparse/argparse.c
)
src/auth0/auth0_c_api.h src/auth0/auth0_c_api.cpp
src/database/database_stats.c
src/database/database_schema.c
src/database/database_fts.c
src/web/web_fts.c
src/database/database_embeddings.c)
set_target_properties(sist2 PROPERTIES LINKER_LANGUAGE C)
target_link_directories(sist2 PRIVATE BEFORE ${_VCPKG_INSTALLED_DIR}/${VCPKG_TARGET_TRIPLET}/lib/)
set(CMAKE_FIND_LIBRARY_SUFFIXES .a .lib)
find_package(PkgConfig REQUIRED)
pkg_search_module(GLIB REQUIRED glib-2.0)
find_package(lmdb CONFIG REQUIRED)
find_package(cJSON CONFIG REQUIRED)
find_package(unofficial-mongoose CONFIG REQUIRED)
find_package(CURL CONFIG REQUIRED)
find_library(MAGIC_LIB NAMES libmagic.a REQUIRED)
find_package(unofficial-sqlite3 CONFIG REQUIRED)
find_package(OpenBLAS CONFIG REQUIRED)
target_include_directories(
@ -62,7 +84,6 @@ target_include_directories(
${CMAKE_SOURCE_DIR}/third-party/utf8.h/
${CMAKE_SOURCE_DIR}/third-party/libscan/
${CMAKE_SOURCE_DIR}/
${GLIB_INCLUDE_DIRS}
)
target_compile_options(
@ -80,7 +101,7 @@ if (SIST_DEBUG)
-fno-omit-frame-pointer
-fsanitize=address
-fno-inline
# -O2
# -O2
)
target_link_options(
sist2
@ -93,13 +114,27 @@ if (SIST_DEBUG)
PROPERTIES
OUTPUT_NAME sist2_debug
)
elseif (SIST_FAST)
target_compile_options(
sist2
PRIVATE
-Ofast
-march=native
-fno-stack-protector
-fomit-frame-pointer
-freciprocal-math
)
else ()
target_compile_options(
sist2
PRIVATE
-Ofast
# -g
-fno-stack-protector
-fomit-frame-pointer
-w
)
endif ()
@ -112,20 +147,19 @@ add_dependencies(
target_link_libraries(
sist2
m
z
lmdb
cjson
argparse
${GLIB_LDFLAGS}
unofficial::mongoose::mongoose
CURL::libcurl
pthread
magic
c
scan
${MAGIC_LIB}
unofficial::sqlite3::sqlite3
OpenBLAS::OpenBLAS
)
add_custom_target(

View File

@ -1,28 +1,52 @@
FROM simon987/sist2-build as build
MAINTAINER simon987 <me@simon987.net>
FROM sist2app/sist2-build as build
WORKDIR /build/
COPY . .
RUN cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE=/vcpkg/scripts/buildsystems/vcpkg.cmake .
RUN make -j$(nproc)
RUN strip sist2 || mv sist2_debug sist2
FROM --platform="linux/amd64" ubuntu:21.10
COPY scripts scripts
COPY schema schema
COPY CMakeLists.txt .
COPY third-party third-party
COPY src src
COPY sist2-vue sist2-vue
COPY sist2-admin sist2-admin
RUN apt update && apt install -y curl libasan5 && rm -rf /var/lib/apt/lists/*
RUN cd sist2-vue/ && npm install && npm run build
RUN cd sist2-admin/frontend/ && npm install && npm run build
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
curl -o /usr/share/tessdata/hin.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/hin.traineddata &&\
curl -o /usr/share/tessdata/jpn.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/jpn.traineddata &&\
curl -o /usr/share/tessdata/eng.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/eng.traineddata &&\
curl -o /usr/share/tessdata/fra.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/fra.traineddata &&\
curl -o /usr/share/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata
RUN mkdir build && cd build && cmake -DSIST_PLATFORM=x64_linux_docker -DSIST_DEBUG_INFO=on -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE=/vcpkg/scripts/buildsystems/vcpkg.cmake ..
RUN cd build && make -j$(nproc)
RUN strip build/sist2 || mv build/sist2_debug build/sist2
ENTRYPOINT ["/root/sist2"]
FROM --platform="linux/amd64" ubuntu@sha256:965fbcae990b0467ed5657caceaec165018ef44a4d2d46c7cdea80a9dff0d1ea
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
COPY --from=build /build/sist2 /root/sist2
ENTRYPOINT ["/root/sist2"]
RUN apt update && DEBIAN_FRONTEND=noninteractive apt install -y curl libasan5 libmagic1 python3 \
python3-pip git tesseract-ocr && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
curl -o /usr/share/tesseract-ocr/4.00/tessdata/hin.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/hin.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/jpn.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/jpn.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/eng.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/fra.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/fra.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/osd.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/osd.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/deu.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/deu.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/equ.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/equ.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/pol.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/pol.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/chi_sim.traineddata
# sist2
COPY --from=build /build/build/sist2 /root/sist2
# sist2-admin
WORKDIR /root/sist2-admin
COPY sist2-admin/requirements.txt /root/sist2-admin/
RUN ln /usr/bin/python3 /usr/bin/python
RUN python -m pip install --no-cache -r /root/sist2-admin/requirements.txt
COPY --from=build /build/sist2-admin/ /root/sist2-admin/

View File

@ -1,28 +1,53 @@
FROM simon987/sist2-build-arm64 as build
MAINTAINER simon987 <me@simon987.net>
FROM sist2app/sist2-build-arm64 as build
WORKDIR /build/
COPY scripts scripts
COPY schema schema
COPY CMakeLists.txt .
COPY third-party third-party
COPY src src
COPY sist2-vue sist2-vue
COPY sist2-admin sist2-admin
RUN cd sist2-vue/ && npm install && npm run build
RUN cd sist2-admin/frontend/ && npm install && npm run build
WORKDIR /build/
ADD . /build/
RUN cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE=/vcpkg/scripts/buildsystems/vcpkg.cmake .
RUN make -j$(nproc)
RUN strip sist2
RUN mkdir build && cd build && cmake -DSIST_PLATFORM=arm64_linux_docker -DSIST_DEBUG_INFO=on -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE=/vcpkg/scripts/buildsystems/vcpkg.cmake ..
RUN cd build && make -j$(nproc)
RUN strip build/sist2 || mv build/sist2_debug build/sist2
FROM --platform="linux/arm64/v8" ubuntu:21.10
FROM --platform=linux/arm64/v8 ubuntu@sha256:537da24818633b45fcb65e5285a68c3ec1f3db25f5ae5476a7757bc8dfae92a3
RUN apt update && apt install -y curl libasan5 && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
curl -o /usr/share/tessdata/hin.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/hin.traineddata &&\
curl -o /usr/share/tessdata/jpn.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/jpn.traineddata &&\
curl -o /usr/share/tessdata/eng.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/eng.traineddata &&\
curl -o /usr/share/tessdata/fra.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/fra.traineddata &&\
curl -o /usr/share/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata
WORKDIR /root
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENTRYPOINT ["/root/sist2"]
COPY --from=build /build/sist2 /root/sist2
RUN apt update && apt install -y curl libasan5 libmagic1 tesseract-ocr python3-pip python3 git && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /usr/share/tessdata && \
cd /usr/share/tessdata/ && \
curl -o /usr/share/tesseract-ocr/4.00/tessdata/hin.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/hin.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/jpn.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/jpn.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/eng.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/fra.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/fra.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/rus.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/rus.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/osd.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/osd.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/spa.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/spa.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/deu.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/deu.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/equ.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/equ.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/pol.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/pol.traineddata &&\
curl -o /usr/share/tesseract-ocr/4.00/tessdata/chi_sim.traineddata https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/chi_sim.traineddata
# sist2
COPY --from=build /build/build/sist2 /root/sist2
# sist2-admin
COPY sist2-admin/requirements.txt sist2-admin/
RUN python3 -m pip install --no-cache -r sist2-admin/requirements.txt
COPY --from=build /build/sist2-admin/ sist2-admin/

182
README.md
View File

@ -1,22 +1,24 @@
![GitHub](https://img.shields.io/github/license/simon987/sist2.svg)
[![CodeFactor](https://www.codefactor.io/repository/github/simon987/sist2/badge?s=05daa325188aac4eae32c786f3d9cf4e0593f822)](https://www.codefactor.io/repository/github/simon987/sist2)
![GitHub](https://img.shields.io/github/license/sist2app/sist2.svg)
[![CodeFactor](https://www.codefactor.io/repository/github/sist2app/sist2/badge?s=05daa325188aac4eae32c786f3d9cf4e0593f822)](https://www.codefactor.io/repository/github/sist2app/sist2)
[![Development snapshots](https://ci.simon987.net/api/badges/simon987/sist2/status.svg)](https://files.simon987.net/.gate/sist2/simon987_sist2/)
**Demo**: [sist2.simon987.net](https://sist2.simon987.net/)
**Community URL:** [Discord](https://discord.gg/2PEjDy3Rfs)
# sist2
sist2 (Simple incremental search tool)
*Warning: sist2 is in early development*
![search panel](docs/sist2.png)
![search panel](docs/sist2.gif)
## Features
* Fast, low memory usage, multi-threaded
* Manage & schedule scan jobs with simple web interface (Docker only)
* Mobile-friendly Web interface
* Portable (all its features are packaged in a single executable)
* Extracts text and metadata from common file types \*
* Generates thumbnails \*
* Incremental scanning
@ -24,67 +26,92 @@ sist2 (Simple incremental search tool)
* Recursive scan inside archive files \*\*
* OCR support with tesseract \*\*\*
* Stats page & disk utilisation visualization
* Named-entity recognition (client-side) \*\*\*\*
\* See [format support](#format-support)
\*\* See [Archive files](#archive-files)
\*\*\* See [OCR](#ocr)
![stats](docs/stats.png)
\*\*\* See [OCR](#ocr)
\*\*\*\* See [Named-Entity Recognition](#NER)
## Getting Started
1. Have an Elasticsearch (>= 6.8.X, ideally >=7.14.0) instance running
1. Download [from official website](https://www.elastic.co/downloads/elasticsearch)
1. *(or)* Run using docker:
```bash
docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.14.0
```
1. *(or)* Run using docker-compose:
```yaml
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
environment:
- discovery.type=single-node
- "ES_JAVA_OPTS=-Xms1G -Xmx2G"
```
1. Download sist2 executable
1. Download the [latest sist2 release](https://github.com/simon987/sist2/releases).
Select the file corresponding to your CPU architecture and mark the binary as executable with `chmod +x` *
2. *(or)* Download a [development snapshot](https://files.simon987.net/.gate/sist2/simon987_sist2/) *(Not
recommended!)*
3. *(or)* `docker pull simon987/sist2:2.11.7-x64-linux`
### Using Docker Compose *(Windows/Linux/Mac)*
1. See [Usage guide](docs/USAGE.md)
```yaml
services:
elasticsearch:
image: elasticsearch:7.17.9
restart: unless-stopped
volumes:
# This directory must have 1000:1000 permissions (or update PUID & PGID below)
- /data/sist2-es-data/:/usr/share/elasticsearch/data
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
- "PUID=1000"
- "PGID=1000"
sist2-admin:
image: sist2app/sist2:x64-linux
restart: unless-stopped
volumes:
- /data/sist2-admin-data/:/sist2-admin/
- /<path to index>/:/host
ports:
- 4090:4090
# NOTE: Don't expose this port publicly!
- 8080:8080
working_dir: /root/sist2-admin/
entrypoint: python3
command:
- /root/sist2-admin/sist2_admin/app.py
```
\* *Windows users*: **sist2** runs under [WSL](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux)
Navigate to http://localhost:8080/ to configure sist2-admin.
## Example usage
### Using the executable file *(Linux/WSL only)*
See [Usage guide](docs/USAGE.md) for more details
1. Choose search backend (See [comparison](#search-backends)):
* **Elasticsearch**: have an Elasticsearch (version >= 6.8.X, ideally >=7.14.0) instance running
1. Download [from official website](https://www.elastic.co/downloads/elasticsearch)
2. *(or)* Run using docker:
```bash
docker run -d -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.17.9
```
* **SQLite**: No installation required
1. Scan a directory: `sist2 scan ~/Documents -o ./docs_idx`
1. Push index to Elasticsearch: `sist2 index ./docs_idx`
1. Start web interface: `sist2 web ./docs_idx`
2. Download the [latest sist2 release](https://github.com/sist2app/sist2/releases).
Select the file corresponding to your CPU architecture and mark the binary as executable with `chmod +x`.
3. See [usage guide](docs/USAGE.md) for command line usage.
Example usage:
1. Scan a directory: `sist2 scan ~/Documents --output ./documents.sist2`
2. Prepare search index:
* **Elasticsearch**: `sist2 index --es-url http://localhost:9200 ./documents.sist2`
* **SQLite**: `sist2 sqlite-index --search-index ./search.sist2 ./documents.sist2`
3. Start web interface:
* **Elasticsearch**: `sist2 web ./documents.sist2`
* **SQLite**: `sist2 web --search-index ./search.sist2 ./documents.sist2`
## Format support
| File type | Library | Content | Thumbnail | Metadata |
|:--------------------------------------------------------------------------|:-----------------------------------------------------------------------------|:---------|:------------|:---------------------------------------------------------------------------------------------------------------------------------------|
| pdf,xps,fb2,epub | MuPDF | text+ocr | yes | author, title |
| cbz,cbr | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | - | yes | - |
| cbz,cbr | [libscan](https://github.com/sist2app/sist2/tree/master/third-party/libscan) | - | yes | - |
| `audio/*` | ffmpeg | - | yes | ID3 tags |
| `video/*` | ffmpeg | - | yes | title, comment, artist |
| `image/*` | ffmpeg | ocr | yes | [Common EXIF tags](https://github.com/simon987/sist2/blob/efdde2734eca9b14a54f84568863b7ffd59bdba3/src/parsing/media.c#L190), GPS tags |
| `image/*` | ffmpeg | ocr | yes | [Common EXIF tags](https://github.com/sist2app/sist2/blob/efdde2734eca9b14a54f84568863b7ffd59bdba3/src/parsing/media.c#L190), GPS tags |
| raw, rw2, dng, cr2, crw, dcr, k25, kdc, mrw, pef, xf3, arw, sr2, srf, erf | LibRaw | no | yes | Common EXIF tags, GPS tags |
| ttf,ttc,cff,woff,fnt,otf | Freetype2 | - | yes, `bmp` | Name & style |
| `text/plain` | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | no | - |
| html, xml | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | no | - |
| `text/plain` | [libscan](https://github.com/sist2app/sist2/tree/master/third-party/libscan) | yes | no | - |
| html, xml | [libscan](https://github.com/sist2app/sist2/tree/master/third-party/libscan) | yes | no | - |
| tar, zip, rar, 7z, ar ... | Libarchive | yes\* | - | no |
| docx, xlsx, pptx | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | if embedded | creator, modified_by, title |
| doc (MS Word 97-2003) | antiword | yes | yes | author, title |
| mobi, azw, azw3 | libmobi | yes | no | author, title |
| docx, xlsx, pptx | [libscan](https://github.com/sist2app/sist2/tree/master/third-party/libscan) | yes | if embedded | creator, modified_by, title |
| doc (MS Word 97-2003) | antiword | yes | no | author, title |
| mobi, azw, azw3 | libmobi | yes | yes | author, title |
| wpd (WordPerfect) | libwpd | yes | no | *planned* |
| json, jsonl, ndjson | [libscan](https://github.com/simon987/sist2/tree/master/third-party/libscan) | yes | - | - |
| json, jsonl, ndjson | [libscan](https://github.com/sist2app/sist2/tree/master/third-party/libscan) | yes | - | - |
\* *See [Archive files](#archive-files)*
@ -108,11 +135,11 @@ You can enable OCR support for ebook (pdf,xps,fb2,epub) or image file types with
Download the language data files with your package manager (`apt install tesseract-ocr-eng`) or
directly [from Github](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files).
The `simon987/sist2` image comes with common languages
(hin, jpn, eng, fra, rus, spa) pre-installed.
The `sist2app/sist2` image comes with common languages
(hin, jpn, eng, fra, rus, spa, chi_sim, deu, pol) pre-installed.
You can use the `+` separator to specify multiple languages. The language
name must be identical to the `*.traineddata` file installed on your system
name must be identical to the `*.traineddata` file installed on your system
(use `chi_sim` rather than `chi-sim`).
Examples:
@ -123,39 +150,80 @@ sist2 scan --ocr-images --ocr-lang eng ~/Images/Screenshots/
sist2 scan --ocr-ebooks --ocr-images --ocr-lang eng+chi_sim ~/Chinese-Bilingual/
```
### Search backends
sist2 v3.0.7+ supports SQLite search backend. The SQLite search backend has
fewer features and generally comparable query performance for medium-size
indices, but it uses much less memory and is easier to set up.
| | SQLite | Elasticsearch |
|----------------------------------------------|:---------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------:|
| Requires separate search engine installation | | ✓ |
| Memory footprint | ~20MB | >500MB |
| Query syntax | [fts5](https://www.sqlite.org/fts5.html) | [query_string](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) |
| Fuzzy search | | ✓ |
| Media Types tree real-time updating | | ✓ |
| Manual tagging | ✓ | ✓ |
| User scripts | ✓ | ✓ |
| Media Type breakdown for search results | | ✓ |
| Embeddings search | ✓ *O(n)* | ✓ *O(logn)* |
### NER
sist2 v3.0.4+ supports named-entity recognition (NER). Simply add a supported repository URL to
**Configuration** > **Machine learning options** > **Model repositories**
to enable it.
The text processing is done in your browser, no data is sent to any third-party services.
See [sist2app/sist2-ner-models](https://github.com/sist2app/sist2-ner-models) for more details.
#### List of available repositories:
| URL | Maintainer | Purpose |
|---------------------------------------------------------------------------------------------------------|-----------------------------------------|---------|
| [sist2app/sist2-ner-models](https://raw.githubusercontent.com/sist2app/sist2-ner-models/main/repo.json) | [sist2app](https://github.com/sist2app) | General |
<details>
<summary>Screenshot</summary>
![ner](docs/ner.png)
</details>
## Build from source
You can compile **sist2** by yourself if you don't want to use the pre-compiled binaries
### With docker (recommended)
### Using docker
```bash
git clone --recursive https://github.com/simon987/sist2/
git clone --recursive https://github.com/sist2app/sist2/
cd sist2
docker build . -f ./Dockerfile -t my-sist2-image
docker build . -t my-sist2-image
# Copy sist2 executable from docker image
docker run --rm --entrypoint cat my-sist2-image /root/sist2 > sist2-x64-linux
```
### On a linux computer
### Using a linux computer
1. Install compile-time dependencies
```bash
apt install gcc g++ python3 yasm ragel automake autotools-dev wget libtool libssl-dev curl zip unzip tar xorg-dev libglu1-mesa-dev libxcursor-dev libxml2-dev libxinerama-dev gettext nasm git
apt install gcc g++ python3 yasm ragel automake autotools-dev wget libtool libssl-dev curl zip unzip tar xorg-dev libglu1-mesa-dev libxcursor-dev libxml2-dev libxinerama-dev gettext nasm git nodejs
```
1. Apply vcpkg patches, as per [sist2-build](https://github.com/simon987/sist2-build) Dockerfile
1. Install vcpkg dependencies
2. Install vcpkg using my fork: https://github.com/sist2app/vcpkg
3. Install vcpkg dependencies
```bash
vcpkg install curl[core,openssl]
vcpkg install lmdb cjson glib brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf gtest mongoose libmagic libraw jasper lcms gumbo
vcpkg install openblas curl[core,openssl] sqlite3[core,fts5,json1] cpp-jwt pcre cjson brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf[ocr] gtest mongoose libmagic libraw gumbo ffmpeg[core,avcodec,avformat,swscale,swresample,webp,opus,mp3lame,vpx,zlib]
```
1. Build
4. Build
```bash
git clone --recursive https://github.com/simon987/sist2/
git clone --recursive https://github.com/sist2app/sist2/
(cd sist2-vue; npm install; npm run build)
(cd sist2-admin/frontend; npm install; npm run build)
cmake -DSIST_DEBUG=off -DCMAKE_TOOLCHAIN_FILE=<VCPKG_ROOT>/scripts/buildsystems/vcpkg.cmake .
make
```

View File

@ -12,7 +12,7 @@ REWRITE_URL=""
sist2 scan \
--threads 14 \
--mem-throttle 32768 \
--quality 1.0 \
--thumbnail-quality 2 \
--name $NAME \
--ocr-lang=eng+chi_sim \
--ocr-ebooks \

View File

@ -12,7 +12,7 @@ REWRITE_URL=""
sist2 scan \
--threads 14 \
--mem-throttle 32768 \
--quality 1.0 \
--thumbnail-quality 2 \
--name $NAME \
--ocr-lang=eng+chi_sim \
--ocr-ebooks \

29
docker-compose.yml Normal file
View File

@ -0,0 +1,29 @@
version: "3"
services:
elasticsearch:
image: elasticsearch:7.17.9
container_name: sist2-es
volumes:
# This directory must have 1000:1000 permissions (or update PUID & PGID below)
- /data/sist2-es-data/:/usr/share/elasticsearch/data
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
- "PUID=1000"
- "PGID=1000"
sist2-admin:
build:
context: .
container_name: sist2-admin
volumes:
- /data/sist2-admin-data/:/sist2-admin/
- /<path to index>/:/host
ports:
- 4090:4090
# NOTE: Don't export this port publicly!
- 8080:8080
working_dir: /root/sist2-admin/
entrypoint: python3
command:
- /root/sist2-admin/sist2_admin/app.py

View File

@ -1,154 +1,89 @@
# Usage
*More examples (specifically with docker/compose) are in progress*
* [scan](#scan)
* [options](#scan-options)
* [examples](#scan-examples)
* [index format](#index-format)
* [index](#index)
* [options](#index-options)
* [examples](#index-examples)
* [web](#web)
* [options](#web-options)
* [examples](#web-examples)
* [rewrite_url](#rewrite_url)
* [elasticsearch](#elasticsearch)
* [exec-script](#exec-script)
* [tagging](#tagging)
* [sidecar files](#sidecar-files)
```
Usage: sist2 scan [OPTION]... PATH
or: sist2 index [OPTION]... INDEX
or: sist2 sqlite-index [OPTION]... INDEX
or: sist2 web [OPTION]... INDEX...
or: sist2 exec-script [OPTION]... INDEX
Lightning-fast file system indexer and search tool.
-h, --help show this help message and exit
-v, --version Show version and exit
--verbose Turn on logging
--very-verbose Turn on debug messages
-v, --version Print version and exit.
--verbose Turn on logging.
--very-verbose Turn on debug messages.
--json-logs Output logs in JSON format.
Scan options
-t, --threads=<int> Number of threads. DEFAULT=1
--mem-throttle=<int> Total memory threshold in MiB for scan throttling. DEFAULT=0
-q, --thumbnail-quality=<flt> Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best. DEFAULT=1
--thumbnail-size=<int> Thumbnail size, in pixels. DEFAULT=500
--thumbnail-count=<int> Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT=1
--content-size=<int> Number of bytes to be extracted from text documents. Set to 0 to disable. DEFAULT=32768
--incremental=<str> Reuse an existing index and only scan modified files.
-o, --output=<str> Output directory. DEFAULT=index.sist2/
-t, --threads=<int> Number of threads. DEFAULT: 1
-q, --thumbnail_count-quality=<int> Thumbnail quality, on a scale of 0 to 100, 100 being the best. DEFAULT: 50
--thumbnail_count-size=<int> Thumbnail size, in pixels. DEFAULT: 552
--thumbnail_count-count=<int> Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT: 1
--content-size=<int> Number of bytes to be extracted from text documents. Set to 0 to disable. DEFAULT: 32768
-o, --output=<str> Output index file path. DEFAULT: index.sist2
--incremental If the output file path exists, only scan new or modified files.
--optimize-index Defragment index file after scan to reduce its file size.
--rewrite-url=<str> Serve files from this url instead of from disk.
--name=<str> Index display name. DEFAULT: (name of the directory)
--name=<str> Index display name. DEFAULT: index
--depth=<int> Scan up to DEPTH subdirectories deep. Use 0 to only scan files in PATH. DEFAULT: -1
--archive=<str> Archive file mode (skip|list|shallow|recurse). skip: Don't parse, list: only get file names as text, shallow: Don't parse archives inside archives. DEFAULT: recurse
--archive=<str> Archive file mode (skip|list|shallow|recurse). skip: don't scan, list: only save file names as text, shallow: don't scan archives inside archives. DEFAULT: recurse
--archive-passphrase=<str> Passphrase for encrypted archive files
--ocr-lang=<str> Tesseract language (use 'tesseract --list-langs' to see which are installed on your machine)
--ocr-images Enable OCR'ing of image files.
--ocr-ebooks Enable OCR'ing of ebook files.
-e, --exclude=<str> Files that match this regex will not be scanned
--fast Only index file names & mime type
-e, --exclude=<str> Files that match this regex will not be scanned.
--fast Only index file names & mime type.
--treemap-threshold=<str> Relative size threshold for treemap (see USAGE.md). DEFAULT: 0.0005
--mem-buffer=<int> Maximum memory buffer size per thread in MiB for files inside archives (see USAGE.md). DEFAULT: 2000
--read-subtitles Read subtitles from media files.
--fast-epub Faster but less accurate EPUB parsing (no thumbnails, metadata)
--fast-epub Faster but less accurate EPUB parsing (no thumbnails, metadata).
--checksums Calculate file checksums when scanning.
--list-file=<str> Specify a list of newline-delimited paths to be scanned instead of normal directory traversal. Use '-' to read from stdin.
Index options
-t, --threads=<int> Number of threads. DEFAULT=1
--es-url=<str> Elasticsearch url with port. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
-p, --print Just print JSON documents to stdout.
--incremental-index Conduct incremental indexing, assumes that the old index is already digested by Elasticsearch.
-t, --threads=<int> Number of threads. DEFAULT: 1
--es-url=<str> Elasticsearch url with port. DEFAULT: http://localhost:9200
--es-insecure-ssl Do not verify SSL connections to Elasticsearch.
--es-index=<str> Elasticsearch index name. DEFAULT: sist2
-p, --print Print JSON documents to stdout instead of indexing to elasticsearch.
--incremental-index Conduct incremental indexing. Assumes that the old index is already ingested in Elasticsearch.
--script-file=<str> Path to user script.
--mappings-file=<str> Path to Elasticsearch mappings.
--settings-file=<str> Path to Elasticsearch settings.
--async-script Execute user script asynchronously.
--batch-size=<int> Index batch size. DEFAULT: 100
-f, --force-reset Reset Elasticsearch mappings and settings. (You must use this option the first time you use the index command)
--batch-size=<int> Index batch size. DEFAULT: 70
-f, --force-reset Reset Elasticsearch mappings and settings.
sqlite-index options
--search-index=<str> Path to search index. Will be created if it does not exist yet.
Web options
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--bind=<str> Listen on this address. DEFAULT=localhost:4090
--es-url=<str> Elasticsearch url. DEFAULT: http://localhost:9200
--es-insecure-ssl Do not verify SSL connections to Elasticsearch.
--search-index=<str> Path to SQLite search index.
--es-index=<str> Elasticsearch index name. DEFAULT: sist2
--bind=<str> Listen for connections on this address. DEFAULT: localhost:4090
--auth=<str> Basic auth in user:password format
--auth0-audience=<str> API audience/identifier
--auth0-domain=<str> Application domain
--auth0-client-id=<str> Application client ID
--auth0-public-key-file=<str> Path to Auth0 public key file extracted from <domain>/pem
--tag-auth=<str> Basic auth in user:password format for tagging
--tagline=<str> Tagline in navbar
--dev Serve html & js files from disk (for development)
--lang=<str> Default UI language. Can be changed by the user
Exec-script options
--es-url=<str> Elasticsearch url. DEFAULT=http://localhost:9200
--es-index=<str> Elasticsearch index name. DEFAULT=sist2
--script-file=<str> Path to user script.
--async-script Execute user script asynchronously.
Made by simon987 <me@simon987.net>. Released under GPL-3.0
```
## Scan
#### Thumbnail database size estimation
### Scan options
See chart below for rough estimate of thumbnail_count size vs. thumbnail_count size & quality arguments:
* `-t, --threads`
Number of threads for file parsing. **Do not set a number higher than `$(nproc)` or `$(Get-CimInstance Win32_ComputerSystem).NumberOfLogicalProcessors` in Windows!**
* `--mem-throttle`
Total memory threshold in MiB for scan throttling. Worker threads will not start a new parse job
until the total memory usage of sist2 is below this threshold. Set to 0 to disable. DEFAULT=0
* `-q, --thumbnail-quality`
Thumbnail quality, on a scale of 1.0 to 31.0, 1.0 being the best.
* `--thumbnail-size`
Thumbnail size in pixels.
* `--thumbnail-count`
Maximum number of thumbnails to generate. When set to a value >= 2, thumbnails for video previews
will be generated. The actual number of thumbnails generated depends on the length of the video (maximum 1 image
every ~5s). Set to 0 to completely disable thumbnails.
* `--content-size`
Number of bytes of text to be extracted from the content of files (plain text, PDFs etc.).
Repeated whitespace and special characters do not count toward this limit.
Set to 0 to completely disable content parsing.
* `--incremental`
Specify an existing index. Information about files in this index that were not modified (based on *mtime* attribute)
will be copied to the new index and will not be parsed again.
* `-o, --output` Output directory.
* `--rewrite-url` Set the `rewrite_url` option for the web module (See [rewrite_url](#rewrite_url))
* `--name` Set the `name` option for the web module
* `--depth` Maximum scan dept. Set to 0 only scan files directly in the root directory, set to -1 for infinite depth
* `--archive` Archive file mode.
* skip: Don't parse
* list: Only get file names as text
* shallow: Don't parse archives inside archives.
* recurse: Scan archives recursively (default)
* `--ocr-lang`, `--ocr-ebooks`, `--ocr-images` See [OCR](../README.md#OCR)
* `-e, --exclude` Regex pattern to exclude files. A file is excluded if the pattern matches any
part of the full absolute path.
Examples:
* `-e ".*\.ttf"`: Ignore ttf files
* `-e ".*\.(ttf|rar)"`: Ignore ttf and rar files
* `-e "^/mnt/backups/"`: Ignore all files in the `/mnt/backups/` directory
* `-e "^/mnt/Data[12]/"`: Ignore all files in the `/mnt/Data1/` and `/mnt/Data2/` directory
* `-e "(^/usr/)|(^/var/)|(^/media/DRIVE-A/tmp/)|(^/media/DRIVE-B/Trash/)"` Exclude the
`/usr`, `/var`, `/media/DRIVE-A/tmp`, `/media/DRIVE-B/Trash` directories
* `--fast` Only index file names and mime type
* `--treemap-threshold` Directories smaller than (`treemap-threshold` * `<total size of the index>`)
will not be considered for the disk utilisation visualization; their size will be added to
the parent directory. If the parent directory is still smaller than the threshold, it will also be "merged upwards"
and so on.
In effect, smaller `treemap-threshold` values will yield a more detailed
(but also a more cluttered and harder to read) visualization.
* `--mem-buffer` Maximum memory buffer size in MiB (per thread) for files inside archives. Media files
larger than this number will be read sequentially and no *seek* operations will be supported.
For example, `--thumbnail_count-size=500`, `--thumbnail_count-quality=50` for a directory with 8 million images will create a thumbnail_count database
that is about `8000000 * 11.8kB = 94.4GB`.
To check if a media file can be parsed without *seek*, execute `cat file.mp4 | ffprobe -`
* `--read-subtitles` When enabled, will attempt to read the subtitles stream from media files.
* `--fast-epub` Much faster but less accurate EPUB parsing. When enabled, sist2 will use a simple HTML parser to read epub files instead of the MuPDF library. No thumbnails are generated and author/title metadata are not parsed.
* `--checksums` Calculate file checksums (SHA1) when scanning files. This option does not cause any additional read
operations. Checksums are not calculated for all file types, unless the file is inside an archive. When enabled, duplicate
files are hidden in the web UI (this behaviour can be toggled in the Configuration page).
![thumbnail_size](thumbnail_size.png)
### Scan examples
@ -157,131 +92,70 @@ Simple scan
sist2 scan ~/Documents
sist2 scan \
--threads 4 --content-size 16000000 --quality 1.0 --archive shallow \
--threads 4 --content-size 16000000 --thumbnail_count-quality 2 --archive shallow \
--name "My Documents" --rewrite-url "http://nas.domain.local/My Documents/" \
~/Documents -o ./documents.idx/
~/Documents -o ./documents.sist2
```
Incremental scan
```
sist2 scan --incremental ./orig_idx/ -o ./updated_idx/ ~/Documents
```
### Index format
A typical `ndjson` type index structure looks like this:
```
documents.idx/
├── descriptor.json
├── _index_main.ndjson.zst
├── treemap.csv
├── agg_mime.csv
├── agg_date.csv
├── add_size.csv
├── thumbs/
| ├── data.mdb
| └── lock.mdb
├── tags/
| ├── data.mdb
| └── lock.mdb
└── meta/
├── data.mdb
└── lock.mdb
```
The `_index_*.ndjson.zst` files contain the document data in JSON format, in a compressed newline-delemited file.
The `thumbs/` folder is a [LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database)
database containing the thumbnails.
The `descriptor.json` file contains general information about the index. The
following fields are safe to modify manually: `root`, `name`, [rewrite_url](#rewrite_url) and `timestamp`.
The `.csv` are pre-computed aggregations necessary for the stats page.
*thumbs/*:
LMDB key-value store. Keys are **binary** 16-byte md5 hash* (`_id` field)
and values are raw image bytes.
*\* Hash is calculated from the full path of the file, including the extension, relative to the index root*
## Index
### Index options
* `--es-url`
Elasticsearch url and port. If you are using docker, make sure that both containers are on the
same network.
* `--es-index`
Elasticsearch index name. DEFAULT=sist2
* `-p, --print`
Print index in JSON format to stdout.
* `--incremental-index`
Conduct incremental indexing. Assumes that the old index is already ingested in Elasticsearch.
Only the new changes since the last scan will be sent.
* `--script-file`
Path to user script. See [Scripting](scripting.md).
* `--mappings-file`
Path to custom Elasticsearch mappings. If none is specified, [the bundled mappings](https://github.com/simon987/sist2/tree/master/schema) will be used.
* `--settings-file`
Path to custom Elasticsearch settings. *(See above)*
* `--async-script`
Use `wait_for_completion=false` elasticsearch option while executing user script.
(See [Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html))
* `--batch-size=<int>`
Index batch size. Indexing is generally faster with larger batches, but payloads that
are too large will fail and additional overhead for retrying with smaller sizes may slow
down the process.
* `-f, --force-reset`
Reset Elasticsearch mappings and settings.
* `-t, --threads` Number of threads to use. Ideally, choose a number equal to the number of logical cores of the machine hosting Elasticsearch.
### Index examples
**Push to elasticsearch**
If the index file does not exist, `--incremental` has no effect.
```bash
sist2 index --force-reset --batch-size 1000 --es-url http://localhost:9200 ./my_index/
sist2 index ./my_index/
sist scan ~/Documents -o ./documents.sist2
sist scan ~/Documents -o ./documents.sist2 --incremental
# or
sist scan ~/Documents -o ./documents.sist2 --incremental
sist scan ~/Documents -o ./documents.sist2 --incremental
```
### Index documents to Elasticsearch search backend
```bash
sist2 index --force-reset --batch-size 1000 --es-url http://localhost:9200 ./my_index.sist2
sist2 index ./my_index.sist2
```
#### Index documents to SQLite search backend
```bash
# The search index will be created if it does not exist already
sist2 sqlite-index ./index1.sist2 --search-index search.sist2
sist2 sqlite-index ./index2.sist2 --search-index search.sist2
```
**Save index in JSON format**
```bash
sist2 index --print ./my_index/ > my_index.ndjson
sist2 index --print ./my_index.sist2 > my_index.ndjson
```
**Inspect contents of an index**
```bash
sist2 index --print ./my_index/ | jq | less
sist2 index --print ./my_index.sist2 | jq | less
```
## Web
### Web options
* `--es-url=<str>` Elasticsearch url.
* `--es-index`
Elasticsearch index name. DEFAULT=sist2
* `--bind=<str>` Listen on this address.
* `--auth=<str>` Basic auth in user:password format
* `--tag-auth=<str>` Basic auth in user:password format. Works the same way as the
`--auth` argument, but authentication is only applied the `/tag/` endpoint.
* `--tagline=<str>` When specified, will replace the default tagline in the navbar.
* `--dev` Serve html & js files from disk (for development, used to modify frontend files without having to recompile)
* `--lang=<str>` Set the default web UI language (See #180 for a list of supported languages, default
is `en`). The user can change the language in the configuration page
### Web examples
**Single index**
**Single index (Elasticsearch backend)**
```bash
sist2 web --auth admin:hunter2 --bind 0.0.0.0:8888 my_index
sist2 web --auth admin:hunter2 --bind 0.0.0.0:8888 my_index.sist2
```
**Multiple indices**
**Multiple indices (Elasticsearch backend)**
```bash
# Indices will be displayed in this order in the web interface
sist2 web index1 index2 index3 index4
sist2 web index1.sist2 index2.sist2 index3.sist2 index4.sist2
```
**SQLite search backend**
```bash
sist2 web --search-index search.sist2 index1.sist2
```
#### Auth0 authentication
See [auth0.md](auth0.md)
### rewrite_url
When the `rewrite_url` field is not empty, the web module ignores the `root`
@ -292,18 +166,43 @@ Both the `root` and `rewrite_url` fields are safe to manually modify from the
# Elasticsearch
Elasticsearch versions >=6.8.0, <8.0.0 are supported by sist2.
Elasticsearch versions >=6.8.0, 7.X.X and 8.X.X are supported by sist2.
Using a version >=7.14.0 is recommended to enable the following features:
- Bug fix for large documents (See #198)
Using a version >=8.0.0 is recommended to enable the following features:
- Approximate KNN search for Embeddings search (faster queries).
When using a legacy version of ES, a notice will be displayed next to the sist2 version in the web UI.
If you don't care about the features above, you can ignore it or disable it in the configuration page.
## exec-script
# Embeddings search
The `exec-script` command is used to execute a user script for an index that has already been imported to Elasticsearch with the `index` command. Note that the documents will not be reset to their default state before each execution as the `index` command does: if you make undesired changes to the documents by accident, you will need to run `index` again to revert to the original state.
Since v3.2.0, User scripts can be used to generate _embeddings_ (vector of float32 numbers) which are stored in the .sist2 index file
(see [scripting](scripting.md)). Embeddings can be used for:
* Nearest-neighbor queries (e.g. "return the documents most similar to this one")
* Semantic searches (e.g. "return the documents that are most closely related to the given topic")
In theory, embeddings can be created for any type of documents (image, text, audio etc.).
For example, the [clip](https://github.com/sist2app/sist2-script-clip) User Script, generates 512-d embeddings of images
(videos are also supported using the thumbnails generated by sist2). When the user enters a query in the "Embeddings Search"
textbox, the query's embedding is generated in their browser, leveraging the ONNX web runtime.
<details>
<summary>Screenshots</summary>
![embeddings-1](embeddings-1.png)
![embeddings-2](embeddings-2.png)
1. Embeddings search bar. You can select the model using the dropdown on the left.
2. This icon appears for indices with embeddings search enabled.
3. Documents with this icon have embeddings. Click on the icon to perform KNN search.
</details>
# Tagging
@ -331,42 +230,3 @@ See [Automatic tagging](#automatic-tagging) for information about tag
### Automatic tagging
See [scripting](scripting.md) documentation.
# Sidecar files
When scanning, sist2 will read metadata from `.s2meta` JSON files and overwrite the
original document's indexed metadata (does not modify the actual file). Sidecar metadata files will also work inside archives.
Sidecar files themselves are not saved in the index.
This feature is useful to leverage third-party applications such as speech-to-text or
OCR to add additional metadata to a file.
**Example**
```
~/Documents/
├── Video.mp4
└── Video.mp4.s2meta
```
The sidecar file must have exactly the same file path and the `.s2meta` suffix.
`Video.mp4.s2meta`:
```json
{
"content": "This sidecar file will overwrite some metadata fields of Video.mp4",
"author": "Some author",
"duration": 12345,
"bitrate": 67890,
"some_arbitrary_field": [1,2,3]
}
```
```
sist2 scan ~/Documents -o ./docs.idx
sist2 index ./docs.idx
```
*NOTE*: It is technically possible to overwrite the `tag` value using sidecar files, however,
it is not currently possible to restore both manual tags and sidecar tags without user scripts
while reindexing.

19
docs/auth0.md Normal file
View File

@ -0,0 +1,19 @@
# Authentication with Auth0
1. Create a new Auth0 application (Single page app)
2. Create a new Auth0 API:
1. Choose `RS256` signing algorithm
2. Set identifier (audience) to `https://sist2`
3. Download the Auth0 certificate from https://<domain>.auth0.com/pem (you can find the domain Applications->Basic information)
4. Extract the public key from the certificate using `openssl x509 -pubkey -noout -in cert.pem > pubkey.txt`
5. Start the sist2 web server
Example options:
```bash
sist2 web \
--auth0-client-id XXX \
--auth0-audience https://sist2 \
--auth0-domain YYY.auth0.com \
--auth0-public-key-file /ZZZ/pubkey.txt
```

BIN
docs/embeddings-1.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 90 KiB

BIN
docs/embeddings-2.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 996 KiB

BIN
docs/ner.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 448 KiB

View File

@ -1,18 +1,47 @@
## User scripts
*This document is under construction, more in-depth guide coming soon*
User scripts are used to augment your sist2 index with additional metadata, neural network embeddings, tags etc.
Since version 3.2.0, user scripts are written in Python, and are ran against the sist2 index file. User scripts do not
need a connection to the search backend.
You can create a user script based on a template from the sist2-admin interface:
![sist2-admin-scripts](sist2-admin-scripts.png)
User scripts leverage the [sist2-python](https://github.com/simon987/sist2-python) library to interface with the
index file*. You can find sist2-python documentation and examples
here: [sist2-python.readthedocs.io](https://sist2-python.readthedocs.io/).
If you are not using the sist2-admin interface, you can run user scripts manually from the command line:
```
pip install git+https://github.com/simon987/sist2-python.git
python my_script.py /path/to/my_index.sist2
```
\* It is possible to manually update the index using raw SQL queries, but the database schema is not stable and
can change at any time; it is recommended to use the more stable sist2-python wrapper instead.
<hr>
<details>
<summary>Legacy user scripts (sist2 version < 3.2.0)</summary>
During the `index` step, you can use the `--script-file <script>` option to
modify documents or add user tags. This option is mainly used to
implement automatic tagging based on file attributes.
The scripting language used
([Painless Scripting Language](https://www.elastic.co/guide/en/elasticsearch/painless/7.4/index.html))
The scripting language used
([Painless Scripting Language](https://www.elastic.co/guide/en/elasticsearch/painless/7.4/index.html))
is very similar to Java, but you should be able to create user scripts
without programming experience at all if you're somewhat familiar with
regex.
This is the base structure of the documents we're working with:
```json
{
"_id": "e171405c-fdb5-4feb-bb32-82637bc32084",
@ -34,7 +63,8 @@ This is the base structure of the documents we're working with:
**Example script**
This script checks if the `genre` attribute exists, if it does
it adds the `genre.<genre>` tag.
it adds the `genre.<genre>` tag.
```Java
ArrayList tags = ctx._source.tag = new ArrayList();
@ -47,21 +77,23 @@ You can use `.` to create a hierarchical tag tree:
![scripting/genre_example](genre_example.png)
To use regular expressions, you need to add this line in `/etc/elasticsearch/elasticsearch.yml`
```yaml
script.painless.regex.enabled: true
```
Or, if you're using docker add `-e "script.painless.regex.enabled=true"`
**Tag color**
You can specify the color for an individual tag by appending an
You can specify the color for an individual tag by appending an
hexadecimal color code (`#RRGGBBAA`) to the tag name.
### Examples
If `(20XX)` is in the file name, add the `year.<year>` tag:
```Java
ArrayList tags = ctx._source.tag = new ArrayList();
@ -72,6 +104,7 @@ if (m.find()) {
```
Use default *Calibre* folder structure to infer author.
```Java
ArrayList tags = ctx._source.tag = new ArrayList();
@ -84,8 +117,9 @@ if (ctx._source.name.contains("-") && ctx._source.extension == "pdf") {
}
```
If the file matches a specific pattern `AAAA-000 fName1 lName1, <fName2 lName2>...`, add the `actress.<actress>` and
If the file matches a specific pattern `AAAA-000 fName1 lName1, <fName2 lName2>...`, add the `actress.<actress>` and
`studio.<studio>` tag:
```Java
ArrayList tags = ctx._source.tag = new ArrayList();
@ -102,16 +136,18 @@ if (m.find()) {
```
Set the name of the last folder (`/path/to/<studio>/file.mp4`) to `studio.<studio>` tag
```Java
ArrayList tags = ctx._source.tag = new ArrayList();
if (ctx._source.path != "") {
String[] names = ctx._source.path.splitOnToken('/');
String[] names = ctx._source.path.splitOnToken('/');
tags.add("studio." + names[names.length-1]);
}
```
Parse `EXIF:F Number` tag
```Java
if (ctx._source?.exif_fnumber != null) {
String[] values = ctx._source.exif_fnumber.splitOnToken(' ');
@ -124,6 +160,7 @@ if (ctx._source?.exif_fnumber != null) {
```
Display year and months from `EXIF:DateTime` tag
```Java
if (ctx._source?.exif_datetime != null) {
SimpleDateFormat parser = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
@ -140,3 +177,6 @@ if (ctx._source?.exif_datetime != null) {
}
```
</details>

Binary file not shown.

After

Width:  |  Height:  |  Size: 78 KiB

BIN
docs/sist2.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.7 MiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1011 KiB

BIN
docs/thumbnail_size.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 169 KiB

View File

@ -67,7 +67,8 @@
"index": false
},
"mtime": {
"type": "integer"
"type": "date",
"format": "epoch_second"
},
"size": {
"type": "long"
@ -201,6 +202,46 @@
},
"modified_by": {
"type": "text"
},
"emb.384.*": {
"type": "dense_vector",
"dims": 384
},
"emb.idx_384.*": {
"type": "dense_vector",
"dims": 384,
"index": true,
"similarity": "cosine"
},
"emb.idx_512.clip": {
"type": "dense_vector",
"dims": 512,
"index": true,
"similarity": "cosine"
},
"emb.512.*": {
"type": "dense_vector",
"dims": 512
},
"emb.idx_768.*": {
"type": "dense_vector",
"dims": 768,
"index": true,
"similarity": "cosine"
},
"emb.768.*": {
"type": "dense_vector",
"dims": 768
},
"emb.idx_1024.*": {
"type": "dense_vector",
"dims": 1024,
"index": true,
"similarity": "cosine"
},
"emb.1024.*": {
"type": "dense_vector",
"dims": 1024
}
}
}

View File

@ -3,7 +3,7 @@
"refresh_interval": "30s",
"codec": "best_compression",
"number_of_replicas": 0,
"highlight.max_analyzed_offset": 10000000
"highlight.max_analyzed_offset": 1000000
},
"analysis": {
"tokenizer": {
@ -16,7 +16,7 @@
"delimiter": "."
},
"my_nGram_tokenizer": {
"type": "nGram",
"type": "ngram",
"min_gram": 3,
"max_gram": 3
}

View File

@ -1,9 +1,13 @@
#!/usr/bin/env bash
rm -rf index.sist2/
(
cd ..
rm -rf index.sist2
python3 scripts/mime.py > src/parsing/mime_generated.c
python3 scripts/serve_static.py > src/web/static_generated.c
python3 scripts/index_static.py > src/index/static_generated.c
python3 scripts/mime.py > src/parsing/mime_generated.c
python3 scripts/serve_static.py > src/web/static_generated.c
python3 scripts/index_static.py > src/index/static_generated.c
python3 scripts/magic_static.py > src/magic_generated.c
printf "static const char *const Sist2CommitHash = \"%s\";\n" $(git rev-parse HEAD) > src/git_hash.h
printf "static const char *const Sist2CommitHash = \"%s\";\n" $(git rev-parse HEAD) > src/git_hash.h
)

View File

@ -2,16 +2,34 @@
VCPKG_ROOT="/vcpkg"
git submodule update --init --recursive
(
cd sist2-vue/
npm install
npm run build
) &
rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" .
make -j $(nproc)
strip sist2
./sist2 -v > VERSION
mv sist2 sist2-x64-linux
(
cd sist2-admin/frontend/
npm install
npm run build
) &
rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG=on -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" .
make -j $(nproc)
mv sist2_debug sist2-x64-linux-debug
wait
mkdir build
(
cd build
cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG_INFO=on -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" ..
make -j $(nproc)
strip sist2
./sist2 -v > VERSION
)
mv build/sist2 sist2-x64-linux
(
cd build
rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=x64_linux -DSIST_DEBUG_INFO=on -DSIST_DEBUG=on -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" ..
make -j $(nproc)
)
mv build/sist2_debug sist2-x64-linux-debug

View File

@ -4,14 +4,33 @@ VCPKG_ROOT="/vcpkg"
git submodule update --init --recursive
rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" .
make -j $(nproc)
strip sist2
mv sist2 sist2-arm64-linux
(
cd sist2-vue/
npm install
npm run build
) &
(
cd sist2-admin/frontend/
npm install
npm run build
) &
wait
mkdir build
(
cd build
cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG_INFO=on -DSIST_DEBUG=off -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" ..
make -j $(nproc)
strip sist2
)
mv build/sist2 sist2-arm64-linux
rm -rf CMakeFiles CMakeCache.txt
cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG=on -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" .
make -j $(nproc)
strip sist2
mv sist2_debug sist2-arm64-linux-debug
(
cd build
cmake -DSIST_PLATFORM=arm64_linux -DSIST_DEBUG_INFO=on -DSIST_DEBUG=on -DBUILD_TESTS=off -DCMAKE_TOOLCHAIN_FILE="${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake" ..
make -j $(nproc)
)
mv build/sist2_debug sist2-arm64-linux-debug

16
scripts/magic_static.py Normal file
View File

@ -0,0 +1,16 @@
MAGIC_PATHS = [
"/vcpkg/installed/x64-linux/share/libmagic/misc/magic.mgc",
"/work/vcpkg/installed/x64-linux/share/libmagic/misc/magic.mgc",
"/usr/lib/file/magic.mgc"
]
for path in MAGIC_PATHS:
try:
with open(path, "rb") as f:
data = f.read()
break
except:
continue
print("char magic_database_buffer[%d] = {%s};" % (len(data), ",".join(str(int(b)) for b in data)))

View File

@ -1,3 +1,4 @@
application/x-matlab-data,mat
application/arj, arj
application/base64, mme
application/binhex, hqx
@ -29,7 +30,7 @@ application/mime, aps
application/mspowerpoint, ppz
application/msword, doc|dot|w6w|wiz|word
application/netmc, mcp
application/octet-stream, bin|dump|gpg
application/octet-stream, bin|dump|gpg|pack|idx
application/oda, oda
application/ogg, ogv
application/pdf, pdf
@ -243,7 +244,7 @@ audio/make, funk|my|pfunk
audio/midi, kar
audio/mid, rmi
audio/mp4, m4b
audio/mpeg, m2a|mpa
audio/mpeg, m2a|mpa|mpga
audio/ogg, ogg
audio/s3m, s3m
audio/tsp-audio, tsi
@ -346,6 +347,8 @@ text/mcf, mcf
text/pascal, pas
text/PGP,
text/plain, com|cmd|conf|def|g|idc|list|lst|mar|sdml|text|txt|md|groovy|license|properties|desktop|ini|rst|cmake|ipynb|readme|less|lo|go|yml|d|cs|hpp|srt|nfo|sfv|m3u|csv|eml|make|log|markdown|yaml
text/x-script.python, pyx
text/csv,
application/vnd.coffeescript, coffee
text/richtext, rt|rtf|rtx
text/rtf,
@ -382,7 +385,7 @@ text/x-pascal, p
text/x-perl, pl
text/x-php, php
text/x-po, po
text/x-python, py
text/x-python, py|pyi
text/x-ruby, rb
text/x-sass, sass
text/x-scss, scss
@ -446,5 +449,4 @@ image/x-sigma-x3f, xf3
image/x-sony-arw, arw
image/x-sony-sr2, sr2
image/x-sony-srf, srf
image/x-epson-erf, erf
sist2/sidecar, s2meta
image/x-epson-erf, erf
1 application/arj application/x-matlab-data arj mat
1 application/x-matlab-data mat
2 application/arj application/arj arj arj
3 application/base64 application/base64 mme mme
4 application/binhex application/binhex hqx hqx
30 application/mspowerpoint application/mspowerpoint ppz ppz
31 application/msword application/msword doc|dot|w6w|wiz|word doc|dot|w6w|wiz|word
32 application/netmc application/netmc mcp mcp
33 application/octet-stream application/octet-stream bin|dump|gpg bin|dump|gpg|pack|idx
34 application/oda application/oda oda oda
35 application/ogg application/ogg ogv ogv
36 application/pdf application/pdf pdf pdf
244 audio/midi audio/midi kar kar
245 audio/mid audio/mid rmi rmi
246 audio/mp4 audio/mp4 m4b m4b
247 audio/mpeg audio/mpeg m2a|mpa m2a|mpa|mpga
248 audio/ogg audio/ogg ogg ogg
249 audio/s3m audio/s3m s3m s3m
250 audio/tsp-audio audio/tsp-audio tsi tsi
347 text/pascal text/pascal pas pas
348 text/PGP text/PGP
349 text/plain text/plain com|cmd|conf|def|g|idc|list|lst|mar|sdml|text|txt|md|groovy|license|properties|desktop|ini|rst|cmake|ipynb|readme|less|lo|go|yml|d|cs|hpp|srt|nfo|sfv|m3u|csv|eml|make|log|markdown|yaml com|cmd|conf|def|g|idc|list|lst|mar|sdml|text|txt|md|groovy|license|properties|desktop|ini|rst|cmake|ipynb|readme|less|lo|go|yml|d|cs|hpp|srt|nfo|sfv|m3u|csv|eml|make|log|markdown|yaml
350 text/x-script.python pyx
351 text/csv
352 application/vnd.coffeescript application/vnd.coffeescript coffee coffee
353 text/richtext text/richtext rt|rtf|rtx rt|rtf|rtx
354 text/rtf text/rtf
385 text/x-perl text/x-perl pl pl
386 text/x-php text/x-php php php
387 text/x-po text/x-po po po
388 text/x-python text/x-python py py|pyi
389 text/x-ruby text/x-ruby rb rb
390 text/x-sass text/x-sass sass sass
391 text/x-scss text/x-scss scss scss
449 image/x-sony-arw image/x-sony-arw arw arw
450 image/x-sony-sr2 image/x-sony-sr2 sr2 sr2
451 image/x-sony-srf image/x-sony-srf srf srf
452 image/x-epson-erf image/x-epson-erf erf erf
sist2/sidecar s2meta

View File

@ -1,6 +1,9 @@
import zlib
mimes = {}
noparse = set()
ext_in_hash = set()
mime_ids = {}
major_mime = {
"sist2": 0,
@ -100,6 +103,9 @@ cnt = 1
def mime_id(mime):
if mime in mime_ids:
return mime_ids[mime]
global cnt
major = mime.split("/")[0]
mime_id = str((major_mime[major] << 16) + cnt)
@ -125,9 +131,7 @@ def mime_id(mime):
elif mime == "application/x-empty":
cnt -= 1
return "1"
elif mime == "sist2/sidecar":
cnt -= 1
return "2"
mime_ids[mime] = mime_id
return mime_id
@ -135,24 +139,40 @@ def clean(t):
return t.replace("/", "_").replace(".", "_").replace("+", "_").replace("-", "_")
def crc(s):
return zlib.crc32(s.encode()) & 0xffffffff
with open("scripts/mime.csv") as f:
for l in f:
mime, ext_list = l.split(",")
if l.startswith("!"):
mime = mime[1:]
noparse.add(mime)
ext = [x.strip() for x in ext_list.split("|")]
ext = [x.strip() for x in ext_list.split("|") if x.strip() != ""]
mimes[mime] = ext
seen_crc = set()
for ext in mimes.values():
for e in ext:
if crc(e) in seen_crc:
raise Exception("CRC32 collision")
seen_crc.add(crc(e))
seen_crc = set()
for mime in mimes.keys():
if crc(mime) in seen_crc:
raise Exception("CRC32 collision")
seen_crc.add(crc(mime))
print("// **Generated by mime.py**")
print("#ifndef MIME_GENERATED_C")
print("#define MIME_GENERATED_C")
print("#include <glib.h>\n")
print("#include <stdlib.h>\n")
# Enum
print("enum mime {")
for mime, ext in sorted(mimes.items()):
print(" " + clean(mime) + "=" + mime_id(mime) + ",")
print(f"{clean(mime)}={mime_id(mime)},")
print("};")
# Enum -> string
@ -163,20 +183,28 @@ with open("scripts/mime.csv") as f:
print("default: return NULL;}}")
# Ext -> Enum
print("GHashTable *mime_get_ext_table() {"
"GHashTable *ext_table = g_hash_table_new(g_str_hash, g_str_equal);")
print("unsigned int mime_extension_lookup(unsigned long extension_crc32) {"
"switch (extension_crc32) {")
for mime, ext in mimes.items():
for e in [e for e in ext if e]:
print("g_hash_table_insert(ext_table, \"" + e + "\", (gpointer)" + clean(mime) + ");")
if e in ext_in_hash:
raise Exception("extension already in hash: " + e)
ext_in_hash.add(e)
print("return ext_table;}")
if len(ext) > 0:
for e in ext:
print(f"case {crc(e)}:", end="")
print(f"return {clean(mime)};")
print("default: return 0;}}")
# string -> Enum
print("GHashTable *mime_get_mime_table() {"
"GHashTable *mime_table = g_hash_table_new(g_str_hash, g_str_equal);")
for mime, ext in mimes.items():
print("g_hash_table_insert(mime_table, \"" + mime + "\", (gpointer)" + clean(mime) + ");")
print("return mime_table;}")
print("unsigned int mime_name_lookup(unsigned long mime_crc32) {"
"switch (mime_crc32) {")
for mime in mimes.keys():
print(f"case {crc(mime)}: return {clean(mime)};")
print("default: return 0;}}")
# mime list
mime_list = ",".join(mime_id(x) for x in mimes.keys()) + ",0"
print(f"unsigned int mime_ids[] = {{{mime_list}}};")
print("unsigned int* get_mime_ids() { return mime_ids; }")
print("#endif")

View File

@ -1,6 +0,0 @@
#!/usr/bin/env bash
make clean
rm -rf CMakeFiles/ CMakeCache.txt Makefile \
third-party/libscan/CMakeFiles third-party/libscan/CMakeCache.txt third-party/libscan/third-party/ext_ffmpeg \
third-party/libscan/third-party/ext_libmobi third-party/libscan/Makefile

View File

@ -0,0 +1,84 @@
#include <sqlite3ext.h>
#include <string.h>
#include <stdlib.h>
SQLITE_EXTENSION_INIT1
static int sep_rfind(const char *str) {
for (int i = (int) strlen(str); i >= 0; i--) {
if (str[i] == '/') {
return i;
}
}
return -1;
}
void path_parent_func(sqlite3_context *ctx, int argc, sqlite3_value **argv) {
if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_TEXT) {
sqlite3_result_error(ctx, "Invalid parameters", -1);
}
const char *value = (const char *) sqlite3_value_text(argv[0]);
int stop = sep_rfind(value);
if (stop == -1) {
sqlite3_result_null(ctx);
return;
}
char parent[4096 * 3];
strncpy(parent, value, stop);
sqlite3_result_text(ctx, parent, stop, SQLITE_TRANSIENT);
}
void random_func(sqlite3_context *ctx, int argc, sqlite3_value **argv) {
if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_INTEGER) {
sqlite3_result_error(ctx, "Invalid parameters", -1);
}
char state_buf[32] = {0,};
struct random_data buf;
int result;
long seed = sqlite3_value_int64(argv[0]);
initstate_r((int) seed, state_buf, sizeof(state_buf), &buf);
random_r(&buf, &result);
sqlite3_result_int(ctx, result);
}
int sqlite3_extension_init(
sqlite3 *db,
char **pzErrMsg,
const sqlite3_api_routines *pApi
) {
SQLITE_EXTENSION_INIT2(pApi);
sqlite3_create_function(
db,
"path_parent",
1,
SQLITE_UTF8,
NULL,
path_parent_func,
NULL,
NULL
);
sqlite3_create_function(
db,
"random_seeded",
1,
SQLITE_UTF8,
NULL,
random_func,
NULL,
NULL
);
return SQLITE_OK;
}

View File

@ -0,0 +1 @@
gcc -I/mnt/work/vcpkg/installed/x64-linux/include -g -fPIC -shared sqlite_extension.c -o sist2funcs.so

View File

@ -1,2 +1,3 @@
docker run --rm -it -p 9200:9200 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:7.14.0
docker run --rm -it --name "sist2-dev-es3"\
-p 9200:9200 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:7.17.9

3
scripts/start_dev_es_6.sh Executable file
View File

@ -0,0 +1,3 @@
docker run --rm -it --name "sist2-dev-es-6"\
-p 9202:9200 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:6.8.0

3
scripts/start_dev_es_8.sh Executable file
View File

@ -0,0 +1,3 @@
docker run --rm -it --name "sist2-dev-es3"\
-p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" \
-e "ES_JAVA_OPTS=-Xms8g -Xmx8g" elasticsearch:8.7.0

View File

@ -0,0 +1,7 @@
docker build . -t tmp
docker run --rm -it\
-v $(pwd):/host \
tmp \
scan --ocr-lang eng --ocr-ebooks -t6 --incremental --very-verbose \
-o /host/docker.sist2 /host/third-party/libscan/libscan-test-files/test_files/

View File

@ -0,0 +1,5 @@
module.exports = {
presets: [
'@vue/cli-plugin-babel/preset'
]
}

19056
sist2-admin/frontend/package-lock.json generated Normal file

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,49 @@
{
"name": "sist2-admin-vue",
"version": "0.1.0",
"private": true,
"scripts": {
"serve": "vue-cli-service serve",
"build": "vue-cli-service build",
"watch": "vue-cli-service build --watch"
},
"dependencies": {
"axios": "^1.6.0",
"bootstrap-vue": "^2.21.2",
"core-js": "^3.6.5",
"moment": "^2.29.3",
"socket.io-client": "^4.5.1",
"vue": "^2.6.14",
"vue-i18n": "^8.24.4",
"vue-router": "^3.5.4",
"vuex": "^3.4.0"
},
"devDependencies": {
"@vue/cli-plugin-babel": "~5.0.8",
"@vue/cli-plugin-router": "~5.0.8",
"@vue/cli-plugin-vuex": "~5.0.8",
"@vue/cli-service": "~5.0.8",
"babel-eslint": "^10.1.0",
"bootstrap": "^4.5.2",
"vue-template-compiler": "^2.6.11"
},
"eslintConfig": {
"root": true,
"env": {
"node": true
},
"extends": [
"plugin:vue/essential",
"eslint:recommended"
],
"parserOptions": {
"parser": "babel-eslint"
},
"rules": {}
},
"browserslist": [
"> 1%",
"last 2 versions",
"not dead"
]
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 15 KiB

View File

@ -0,0 +1,17 @@
<!DOCTYPE html>
<html lang="">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<link rel="icon" href="<%= BASE_URL %>serve_favicon_ico.ico">
<title>sist2-admin</title>
</head>
<body>
<noscript>
<strong>We're sorry but <%= htmlWebpackPlugin.options.title %> doesn't work properly without JavaScript enabled. Please enable it to continue.</strong>
</noscript>
<div id="app"></div>
<!-- built files will be auto injected -->
</body>
</html>

View File

@ -0,0 +1,105 @@
<template>
<div id="app">
<NavBar></NavBar>
<b-container class="pt-4">
<b-alert show dismissible variant="info">
This is a beta version of sist2-admin. Please submit bug reports, usability issues and feature requests
to the <a href="https://github.com/sist2app/sist2/issues/new/choose" target="_blank">issue tracker on
Github</a>. Thank you!
</b-alert>
<router-view v-if="$store.state.sist2AdminInfo"/>
</b-container>
</div>
</template>
<script>
import NavBar from "@/components/NavBar";
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
components: {NavBar},
data() {
return {
socket: null
}
},
mounted() {
Sist2AdminApi.getSist2AdminInfo()
.then(resp => this.$store.commit("setSist2AdminInfo", resp.data));
this.$store.dispatch("loadBrowserSettings");
this.connectNotifications();
// this.socket.onclose = this.connectNotifications;
},
methods: {
connectNotifications() {
if (window.location.protocol === "https:") {
this.socket = new WebSocket(`wss://${window.location.host}/notifications`);
} else {
this.socket = new WebSocket(`ws://${window.location.host}/notifications`);
}
this.socket.onopen = () => {
this.socket.send("Hello from client");
}
this.socket.onmessage = e => {
const notification = JSON.parse(e.data);
if (notification.message) {
notification.messageString = this.$t(notification.message).toString();
}
this.$store.dispatch("notify", notification)
}
}
}
}
</script>
<style>
html, body {
height: 100%;
}
#app {
/*font-family: Avenir, Helvetica, Arial, sans-serif;*/
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
/*text-align: center;*/
color: #2c3e50;
padding-bottom: 1em;
min-height: 100%;
}
.info-icon {
width: 1rem;
min-width: 1rem;
margin-right: 0.2rem;
cursor: pointer;
line-height: 1rem;
height: 1rem;
min-height: 1rem;
background-image: url();
filter: brightness(45%);
display: block;
}
.tabs {
margin-top: 10px;
}
.modal-title {
text-overflow: ellipsis;
overflow: hidden;
white-space: nowrap;
}
@media screen and (min-width: 1500px) {
.container {
max-width: 1440px;
}
}
label {
margin-top: 0.5rem;
margin-bottom: 0;
}
</style>

View File

@ -0,0 +1,179 @@
import axios from "axios";
class Sist2AdminApi {
constructor() {
this.baseUrl = window.location.protocol + "//" + window.location.host;
}
getJobs() {
return axios.get(`${this.baseUrl}/api/job`);
}
getFrontends() {
return axios.get(`${this.baseUrl}/api/frontend`);
}
getTasks() {
return axios.get(`${this.baseUrl}/api/task`);
}
killTask(taskId) {
return axios.post(`${this.baseUrl}/api/task/${taskId}/kill`)
}
getTaskHistory() {
return axios.get(`${this.baseUrl}/api/task/history`);
}
/**
* @param {string} name
*/
getJob(name) {
return axios.get(`${this.baseUrl}/api/job/${name}`);
}
getSearchBackend(name) {
return axios.get(`${this.baseUrl}/api/search_backend/${name}`);
}
updateSearchBackend(name, data) {
return axios.put(`${this.baseUrl}/api/search_backend/${name}`, data);
}
getSearchBackends() {
return axios.get(`${this.baseUrl}/api/search_backend`);
}
deleteBackend(name) {
return axios.delete(`${this.baseUrl}/api/search_backend/${name}`)
}
createBackend(name) {
return axios.post(`${this.baseUrl}/api/search_backend/${name}`);
}
getFrontend(name) {
return axios.get(`${this.baseUrl}/api/frontend/${name}`);
}
/**
* @param {string} name
*/
startFrontend(name) {
return axios.post(`${this.baseUrl}/api/frontend/${name}/start`);
}
/**
* @param {string} name
*/
stopFrontend(name) {
return axios.post(`${this.baseUrl}/api/frontend/${name}/stop`);
}
/**
* @param {string} name
* @param job
*/
updateJob(name, job) {
return axios.put(`${this.baseUrl}/api/job/${name}`, job);
}
/**
* @param {string} name
* @param frontend
*/
updateFrontend(name, frontend) {
return axios.put(`${this.baseUrl}/api/frontend/${name}`, frontend);
}
/**
* @param {string} name
* @param {bool} full
*/
runJob(name, full) {
return axios.get(`${this.baseUrl}/api/job/${name}/run`, {
params: {full}
});
}
/**
* @param {string} name
*/
deleteJob(name) {
return axios.delete(`${this.baseUrl}/api/job/${name}`);
}
/**
* @param {string} name
*/
deleteFrontend(name) {
return axios.delete(`${this.baseUrl}/api/frontend/${name}`);
}
/**
* @param {string} name
*/
createJob(name) {
return axios.post(`${this.baseUrl}/api/job/${name}`);
}
/**
* @param {string} name
*/
createFrontend(name) {
return axios.post(`${this.baseUrl}/api/frontend/${name}`);
}
pingEs(url, insecure) {
return axios.get(`${this.baseUrl}/api/ping_es`, {params: {url, insecure}});
}
getSist2AdminInfo() {
return axios.get(`${this.baseUrl}/api`);
}
getLogsToDelete(jobName, n) {
return axios.get(`${this.baseUrl}/api/job/${jobName}/logs_to_delete`, {
params: {n: n}
});
}
deleteTaskLogs(taskId) {
return axios.post(`${this.baseUrl}/api/task/${taskId}/delete_logs`);
}
getUserScripts() {
return axios.get(`${this.baseUrl}/api/user_script`);
}
getUserScript(name) {
return axios.get(`${this.baseUrl}/api/user_script/${name}`);
}
createUserScript(name, template) {
return axios.post(`${this.baseUrl}/api/user_script/${name}`, null, {
params: {
template: template
}
});
}
updateUserScript(name, data) {
return axios.put(`${this.baseUrl}/api/user_script/${name}`, data);
}
deleteUserScript(name) {
return axios.delete(`${this.baseUrl}/api/user_script/${name}`);
}
testUserScript(name, job) {
return axios.get(`${this.baseUrl}/api/user_script/${name}/run`, {
params: {
job: job
}
});
}
}
export default new Sist2AdminApi()

View File

@ -0,0 +1,31 @@
<template>
<b-list-group-item action :to="`/frontend/${frontend.name}`">
<div class="d-flex w-100 justify-content-between">
<h5 class="mb-1" style="display: block">
{{ frontend.name }}
<b-badge variant="light">{{ formatBindAddress(frontend.web_options.bind) }}</b-badge>
</h5>
<div>
<b-badge v-if="frontend.running" variant="success">{{$t("online")}}</b-badge>
<b-badge v-else variant="secondary">{{$t("offline")}}</b-badge>
</div>
</div>
</b-list-group-item>
</template>
<script>
import {formatBindAddress} from "@/util";
export default {
name: "FrontendListItem",
props: ["frontend"],
data() {
return {
formatBindAddress
}
}
}
</script>

View File

@ -0,0 +1,54 @@
<template>
<div>
<h5>{{ $t("selectJobs") }}</h5>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-form-group v-else>
<b-form-checkbox-group
v-if="jobs.length > 0"
:checked="frontend.jobs"
@input="frontend.jobs = $event; $emit('input')"
>
<div v-for="job in jobs" :key="job.name">
<b-form-checkbox :disabled="job.status !== 'indexed'"
:value="job.name">
<template #default><span
:title="job.status !== 'indexed' ? $t('jobOptions.notIndexed') : ''"
>[{{ job.name }}]</span></template>
</b-form-checkbox>
<br/>
</div>
</b-form-checkbox-group>
<div v-else>
<span class="text-muted">{{ $t('jobOptions.noJobAvailable') }}</span>
<router-link to="/">{{ $t("create") }}</router-link>
</div>
</b-form-group>
</div>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "JobCheckboxGroup",
props: ["frontend"],
mounted() {
Sist2AdminApi.getJobs().then(resp => {
this._jobs = resp.data;
this.loading = false;
});
},
computed: {
jobs() {
return this._jobs
.filter(job => job.index_options.search_backend === this.frontend.web_options.search_backend)
}
},
data() {
return {
loading: true,
_jobs: null
}
}
}
</script>

View File

@ -0,0 +1,55 @@
<template>
<b-list-group-item class="flex-column align-items-start" action :to="`job/${job.name}`">
<div class="d-flex w-100 justify-content-between">
<div>
<h5 class="mb-1">
{{ job.name }}
</h5>
</div>
<div>
<b-row>
<b-col>
<small v-if="job.last_index_date">
{{ $t("scanned") }} {{ formatLastIndexDate(job.last_index_date) }}</small>
<div v-else>&nbsp;</div>
</b-col>
</b-row>
<b-row v-if="job.schedule_enabled">
<b-col>
<small><code>{{job.cron_expression }}</code></small>
</b-col>
</b-row>
<b-row v-else>
<b-col>
&nbsp;
</b-col>
</b-row>
</div>
</div>
</b-list-group-item>
</template>
<script>
import moment from "moment";
export default {
name: "JobListItem",
props: ["job"],
methods: {
formatLastIndexDate(dateString) {
if (dateString === null) {
return "";
}
return moment.utc(dateString).local().fromNow();
}
}
}
</script>
<style scoped>
</style>

View File

@ -0,0 +1,94 @@
<template>
<div>
<b-form-checkbox :checked="desktopNotificationsEnabled" @change="updateNotifications($event)">
{{ $t("jobOptions.desktopNotifications") }}
</b-form-checkbox>
<b-form-checkbox v-model="job.schedule_enabled" @change="update()">
{{ $t("jobOptions.scheduleEnabled") }}
</b-form-checkbox>
<label>{{ $t("jobOptions.cron") }}</label>
<b-form-input class="text-monospace" :state="cronValid" v-model="job.cron_expression"
:disabled="!job.schedule_enabled" @change="update()"></b-form-input>
<label>{{ $t("jobOptions.keepNLogs") }}</label>
<b-input-group>
<b-form-input type="number" v-model="job.keep_last_n_logs" @change="update()"></b-form-input>
<b-input-group-append>
<b-button variant="danger" @click="onDeleteNowClick()">{{ $t("jobOptions.deleteNow") }}</b-button>
</b-input-group-append>
</b-input-group>
</div>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "JobOptions",
props: ["job"],
data() {
return {
cronValid: undefined,
logsToDelete: null
}
},
computed: {
desktopNotificationsEnabled() {
return this.$store.state.jobDesktopNotificationMap[this.job.name];
}
},
mounted() {
this.cronValid = this.checkCron(this.job.cron_expression)
},
methods: {
checkCron(expression) {
return /((((\d+,)+\d+|(\d+([/-])\d+)|\d+|\*) ?){5,7})/.test(expression);
},
updateNotifications(value) {
this.$store.dispatch("setJobDesktopNotification", {
job: this.job.name,
enabled: value
});
},
update() {
if (this.job.schedule_enabled) {
this.cronValid = this.checkCron(this.job.cron_expression);
} else {
this.cronValid = undefined;
}
if (this.cronValid !== false) {
this.$emit("change", this.job);
}
},
onDeleteNowClick() {
Sist2AdminApi.getLogsToDelete(this.job.name, this.job.keep_last_n_logs).then(resp => {
const toDelete = resp.data;
const message = `Delete ${toDelete.length} log files?`;
this.$bvModal.msgBoxConfirm(message, {
title: this.$t("confirmation"),
size: "sm",
buttonSize: "sm",
okVariant: "danger",
okTitle: this.$t("delete"),
cancelTitle: this.$t("cancel"),
footerClass: "p-2",
hideHeaderClose: false,
centered: true
}).then(value => {
if (value) {
toDelete.forEach(row => {
Sist2AdminApi.deleteTaskLogs(row["id"]);
});
}
});
})
}
},
}
</script>

View File

@ -0,0 +1,34 @@
<template>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<span v-else-if="jobs.length === 0"></span>
<b-form-select v-else :options="jobs" text-field="name" value-field="name"
@change="$emit('change', $event)" :value="$t('selectJob')"></b-form-select>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "JobSelect",
mounted() {
Sist2AdminApi.getJobs().then(resp => {
this._jobs = resp.data;
this.loading = false;
});
},
computed: {
jobs() {
return [
{name: this.$t("selectJob"), disabled: true},
...this._jobs.filter(job => job.index_path)
]
}
},
data() {
return {
loading: true,
_jobs: null
}
}
}
</script>

View File

@ -0,0 +1,69 @@
<template>
<b-navbar>
<b-navbar-brand to="/">
<Sist2Icon></Sist2Icon>
</b-navbar-brand>
<b-button class="ml-auto" to="/task" variant="link">{{ $t("tasks") }}</b-button>
</b-navbar>
</template>
<script>
import Sist2Icon from "@/components/icons/Sist2Icon";
export default {
name: "NavBar",
components: {Sist2Icon},
methods: {
tagline() {
return this.$store.state.sist2Info.tagline;
},
sist2Version() {
return this.$store.state.sist2Info.version;
},
isDebug() {
return this.$store.state.sist2Info.debug;
},
isLegacy() {
return this.$store.state.sist2Info.esVersionLegacy;
},
hideLegacy() {
return this.$store.state.optHideLegacy;
}
}
}
</script>
<style scoped>
.navbar {
box-shadow: 0 0.125rem 0.25rem rgb(0 0 0 / 8%) !important;
border-radius: 0;
}
.theme-black .navbar {
background: #546b7a30;
border-bottom: none;
}
.navbar-brand {
color: #222 !important;
font-size: 1.75rem;
padding: 0;
}
.navbar-brand:hover {
color: #000 !important;
}
.version {
color: #222 !important;
margin-left: -18px;
margin-top: -14px;
font-size: 11px;
font-family: monospace;
}
.btn-link {
color: #222;
}
</style>

View File

@ -0,0 +1,110 @@
<template>
<div>
<label>{{ $t("scanOptions.path") }}</label>
<b-form-input v-model="options.path" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.threads") }}</label>
<b-form-input type="number" min="1" v-model="options.threads" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.thumbnailQuality") }}</label>
<b-form-input type="number" min="0" max="100" v-model="options.thumbnail_quality" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.thumbnailCount") }}</label>
<b-form-input type="number" min="0" max="1000" v-model="options.thumbnail_count" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.thumbnailSize") }}</label>
<b-form-input type="number" min="100" v-model="options.thumbnail_size" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.contentSize") }}</label>
<b-form-input type="number" min="0" v-model="options.content_size" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.rewriteUrl") }}</label>
<b-form-input v-model="options.rewrite_url" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.depth") }}</label>
<b-form-input type="number" min="0" v-model="options.depth" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.archive") }}</label>
<b-form-select :options="['skip', 'list', 'shallow', 'recurse']" v-model="options.archive"
@change="update()"></b-form-select>
<label>{{ $t("scanOptions.archivePassphrase") }}</label>
<b-form-input v-model="options.archive_passphrase" @change="update()"></b-form-input>
<label>{{ $t("scanOptions.ocrLang") }}</label>
<b-alert variant="danger" show v-if="selectedOcrLangs.length === 0 && !disableOcrLang">{{ $t("scanOptions.ocrLangAlert") }}</b-alert>
<b-checkbox-group :disabled="disableOcrLang" v-model="selectedOcrLangs" @input="onOcrLangChange">
<b-checkbox v-for="lang in ocrLangs" :key="lang" :value="lang">{{ lang }}</b-checkbox>
</b-checkbox-group>
<!-- <b-form-input readonly v-model="options.ocr_lang" @change="update()"></b-form-input>-->
<div style="height: 10px"></div>
<b-form-checkbox v-model="options.ocr_images" @change="update()">
{{ $t("scanOptions.ocrImages") }}
</b-form-checkbox>
<b-form-checkbox v-model="options.ocr_ebooks" @change="update()">
{{ $t("scanOptions.ocrEbooks") }}
</b-form-checkbox>
<label>{{ $t("scanOptions.exclude") }}</label>
<b-form-input v-model="options.exclude" @change="update()"
:placeholder="$t('scanOptions.excludePlaceholder')"></b-form-input>
<div style="height: 10px"></div>
<b-form-checkbox v-model="options.fast" @change="update()">
{{ $t("scanOptions.fast") }}
</b-form-checkbox>
<b-form-checkbox v-model="options.checksums" @change="update()">
{{ $t("scanOptions.checksums") }}
</b-form-checkbox>
<b-form-checkbox v-model="options.read_subtitles" @change="update()">
{{ $t("scanOptions.readSubtitles") }}
</b-form-checkbox>
<b-form-checkbox v-model="options.optimize_index" @change="update()">
{{ $t("scanOptions.optimizeIndex") }}
</b-form-checkbox>
<label>{{ $t("scanOptions.treemapThreshold") }}</label>
<b-form-input type="number" min="0" v-model="options.treemap_threshold" @change="update()"></b-form-input>
</div>
</template>
<script>
export default {
name: "ScanOptions",
props: ["options"],
data() {
return {
disableOcrLang: false,
selectedOcrLangs: []
}
},
computed: {
ocrLangs() {
return this.$store.state.sist2AdminInfo?.tesseract_langs || [];
}
},
methods: {
onOcrLangChange() {
this.options.ocr_lang = this.selectedOcrLangs.join("+");
this.update();
},
update() {
this.disableOcrLang = this.options.ocr_images === false && this.options.ocr_ebooks === false;
this.$emit("change", this.options);
},
},
mounted() {
this.disableOcrLang = this.options.ocr_images === false && this.options.ocr_ebooks === false;
this.selectedOcrLangs = this.options.ocr_lang ? this.options.ocr_lang.split("+") : [];
}
}
</script>

View File

@ -0,0 +1,24 @@
<template>
<b-list-group-item action :to="`/searchBackend/${backend.name}`">
<div class="d-flex w-100 justify-content-between">
<h5 class="mb-1">
{{ backend.name }}
</h5>
<div>
<b-badge v-if="backend.backend_type === 'sqlite'" variant="info">SQLite</b-badge>
<b-badge v-else variant="info">Elasticsearch</b-badge>
</div>
</div>
</b-list-group-item>
</template>
<script>
export default {
name: "SearchBackendListItem",
props: ["backend"],
}
</script>

View File

@ -0,0 +1,37 @@
<template>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<div v-else>
<label>{{$t("backendOptions.searchBackend")}}</label>
<b-select :options="options" :value="value" @change="$emit('change', $event)"></b-select>
</div>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "SearchBackendSelect",
props: ["value"],
data() {
return {
loading: true,
backends: null,
}
},
computed: {
options() {
return this.backends.map(backend => backend.name)
}
},
mounted() {
Sist2AdminApi.getSearchBackends().then(resp => {
this.loading = false;
this.backends = resp.data
})
}
}
</script>
<style scoped>
</style>

View File

@ -0,0 +1,57 @@
<template>
<b-list-group-item>
<b-row style="height: 50px">
<b-col><h5>{{ task.display_name }}</h5></b-col>
<b-col class="shrink">
<router-link class="btn btn-link" :to="`/log/${task.id}`">{{ $t("logs") }}</router-link>
</b-col>
<b-col class="shrink">
<b-btn variant="link" @click="killTask(task.id)">{{ $t("kill") }}</b-btn>
</b-col>
</b-row>
<b-row>
<b-col>
<b-progress :max="task.progress.count">
<b-progress-bar :value="task.progress.done" :label-html="label" :striped="!task.progress.waiting"/>
</b-progress>
</b-col>
</b-row>
</b-list-group-item>
</template>
<script>
import sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "TaskListItem",
props: ["task"],
computed: {
label() {
const count = this.task.progress.count;
const done = this.task.progress.done;
return `<span>${done}/${count}</span>`
}
},
methods: {
killTask(taskId) {
sist2AdminApi.killTask(taskId).then(() => {
this.$bvToast.toast(this.$t("killConfirmation"), {
title: this.$t("killConfirmationTitle"),
variant: "success",
toaster: "b-toaster-bottom-right"
});
});
}
}
}
</script>
<style scoped>
.shrink {
flex-grow: inherit;
}
</style>

View File

@ -0,0 +1,18 @@
<template>
<b-list-group-item action :to="`/userScript/${script.name}`">
<div class="d-flex w-100 justify-content-between">
<h5 class="mb-1">
{{ script.name }}
</h5>
</div>
</b-list-group-item>
</template>
<script>
export default {
name: "UserScriptListItem",
props: ["script"],
}
</script>

View File

@ -0,0 +1,88 @@
<template>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-row v-else>
<b-col cols="6">
<h5>Selected scripts</h5>
<b-list-group>
<b-list-group-item v-for="script in selectedScripts" :key="script"
button
@click="onRemoveScript(script)"
class="d-flex justify-content-between align-items-center">
{{ script }}
<b-button-group>
<b-button variant="light" @click.stop="moveUpScript(script)"></b-button>
<b-button variant="light" @click.stop="moveDownScript(script)"></b-button>
</b-button-group>
</b-list-group-item>
</b-list-group>
</b-col>
<b-col cols="6">
<h5>Available scripts</h5>
<b-list-group>
<b-list-group-item v-for="script in availableScripts" :key="script" button
@click="onSelectScript(script)">
{{ script }}
</b-list-group-item>
</b-list-group>
</b-col>
</b-row>
<!-- <b-checkbox-group v-else :options="scripts" stacked :checked="selectedScripts"-->
<!-- @input="$emit('change', $event)"></b-checkbox-group>-->
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "UserScriptPicker",
props: ["selectedScripts"],
data() {
return {
loading: true,
scripts: []
}
},
computed: {
availableScripts() {
return this.scripts.filter(script => !this.selectedScripts.includes(script))
}
},
mounted() {
Sist2AdminApi.getUserScripts().then(resp => {
this.scripts = resp.data.map(script => script.name);
this.loading = false;
});
},
methods: {
onSelectScript(name) {
this.selectedScripts.push(name);
this.$emit("change", this.selectedScripts)
},
onRemoveScript(name) {
this.selectedScripts.splice(this.selectedScripts.indexOf(name), 1);
this.$emit("change", this.selectedScripts);
},
moveUpScript(name) {
const index = this.selectedScripts.indexOf(name);
if (index > 0) {
this.selectedScripts.splice(index, 1);
this.selectedScripts.splice(index - 1, 0, name);
}
this.$emit("change", this.selectedScripts);
},
moveDownScript(name) {
const index = this.selectedScripts.indexOf(name);
if (index < this.selectedScripts.length - 1) {
this.selectedScripts.splice(index, 1);
this.selectedScripts.splice(index + 1, 0, name);
}
this.$emit("change", this.selectedScripts);
}
}
}
</script>
<style scoped>
</style>

View File

@ -0,0 +1,73 @@
<template>
<div>
<h4>{{ $t("webOptions.title") }}</h4>
<b-card>
<label>{{ $t("webOptions.lang") }}</label>
<b-form-select v-model="options.lang" :options="['en', 'fr', 'zh-CN', 'pl', 'de']"
@change="update()"></b-form-select>
<label>{{ $t("webOptions.bind") }}</label>
<b-form-input v-model="options.bind" @change="update()"></b-form-input>
<label>{{ $t("webOptions.tagline") }}</label>
<b-form-textarea v-model="options.tagline" @change="update()"></b-form-textarea>
<label>{{ $t("webOptions.auth") }}</label>
<b-form-input v-model="options.auth" @change="update()"></b-form-input>
<label>{{ $t("webOptions.tagAuth") }}</label>
<b-form-input v-model="options.tag_auth" @change="update()" :disabled="Boolean(options.auth)"></b-form-input>
<b-form-checkbox v-model="options.verbose" @change="update()">
{{$t("webOptions.verbose")}}
</b-form-checkbox>
</b-card>
<br>
<h4>Auth0 options</h4>
<b-card>
<label>{{ $t("webOptions.auth0Audience") }}</label>
<b-form-input v-model="options.auth0_audience" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0Domain") }}</label>
<b-form-input v-model="options.auth0_domain" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0ClientId") }}</label>
<b-form-input v-model="options.auth0_client_id" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0PublicKey") }}</label>
<b-textarea rows="10" v-model="options.auth0_public_key" @change="update()"></b-textarea>
</b-card>
</div>
</template>
<script>
export default {
name: "WebOptions",
props: ["options", "frontendName"],
data() {
return {
showEsTestAlert: false,
esTestOk: false,
esTestMessage: ""
}
},
methods: {
update() {
console.log(this.options)
if (this.options.auth && this.options.tag_auth) {
// If both are set, remove tagAuth
this.options.tag_auth = "";
}
this.$emit("change", this.options);
},
}
}
</script>
<style scoped>
</style>

View File

@ -0,0 +1,40 @@
<template>
<svg
xmlns="http://www.w3.org/2000/svg"
width="27.868069mm"
height="7.6446671mm"
viewBox="0 0 27.868069 7.6446671"
>
<g transform="translate(-4.5018313,-4.1849793)">
<g
style="fill: currentColor;fill-opacity:1;stroke:none;stroke-width:0.26458332">
<path
d="m 6.3153296,11.829646 q -0.7717014,0 -1.8134983,-0.337619 v -0.916395 q 1.0128581,0.511252 1.803852,0.511252 0.5643067,0 0.901926,-0.236334 0.3376194,-0.236333 0.3376194,-0.63183 0,-0.3424428 -0.2845649,-0.5498376 Q 6.980922,9.4566645 6.3635609,9.3264399 L 5.9921796,9.2492698 Q 5.2301245,9.0949295 4.8732126,8.7428407 4.5211238,8.3859288 4.5211238,7.7733908 q 0,-0.7765245 0.5305447,-1.1961372 0.5305447,-0.4196126 1.5096409,-0.4196126 0.829579,0 1.6061036,0.3183268 V 7.3441319 Q 7.4101809,6.9004036 6.5854251,6.9004036 q -1.1671984,0 -1.1671984,0.7958171 0,0.2604492 0.1012858,0.4147895 0.1012858,0.1495171 0.3858507,0.2556261 0.2845649,0.1012858 0.8392253,0.2122179 l 0.3569119,0.067524 q 1.3408312,0.2652724 1.3408312,1.4614098 0,0.80064 -0.5691298,1.263661 -0.5691298,0.458197 -1.5578722,0.458197 z"
style="stroke-width:0.26458332"
/>
<path
d="m 11.943927,5.3087694 q -0.144694,0 -0.144694,-0.144694 V 4.3296733 q 0,-0.144694 0.144694,-0.144694 h 0.694531 q 0.144694,0 0.144694,0.144694 v 0.8344021 q 0,0.144694 -0.144694,0.144694 z M 13.5645,11.728361 q -0.795817,0 -1.234722,-0.511253 -0.434082,-0.516075 -0.434082,-1.4469398 V 6.9823969 H 10.714028 V 6.2878656 h 2.069124 v 3.4823026 q 0,0.5884228 0.221864,0.8971028 0.221865,0.308681 0.6463,0.308681 h 1.036974 v 0.752409 z"
style="stroke-width:0.26458332"
/>
<path
d="m 18.209178,11.829646 q -0.771701,0 -1.813498,-0.337619 v -0.916395 q 1.012858,0.511252 1.803852,0.511252 0.564306,0 0.901926,-0.236334 0.337619,-0.236333 0.337619,-0.63183 0,-0.3424428 -0.284565,-0.5498376 Q 18.87477,9.4566645 18.257409,9.3264399 l -0.371381,-0.07717 Q 17.123973,9.0949295 16.767061,8.7428407 16.414972,8.3859288 16.414972,7.7733908 q 0,-0.7765245 0.530545,-1.1961372 0.530545,-0.4196126 1.509641,-0.4196126 0.829579,0 1.606103,0.3183268 v 0.8681641 q -0.757232,-0.4437283 -1.581988,-0.4437283 -1.167198,0 -1.167198,0.7958171 0,0.2604492 0.101286,0.4147895 0.101286,0.1495171 0.385851,0.2556261 0.284565,0.1012858 0.839225,0.2122179 l 0.356912,0.067524 q 1.340831,0.2652724 1.340831,1.4614098 0,0.80064 -0.56913,1.263661 -0.56913,0.458197 -1.557872,0.458197 z"
style="stroke-width:0.26458332"
/>
<path
d="m 25.207545,11.709068 q -0.993565,0 -1.408355,-0.40032 -0.409966,-0.405143 -0.409966,-1.3794164 V 6.9775737 H 21.947107 V 6.2878656 h 1.442117 V 4.8746874 l 0.887457,-0.3858507 v 1.7990289 h 2.016069 v 0.6897081 h -2.016069 v 2.9517579 q 0,0.5932454 0.226687,0.8344024 0.226687,0.236333 0.790994,0.236333 h 0.998388 v 0.709001 z"
style="stroke-width:0.26458332"
/>
<path
d="m 27.995317,11.043476 q 0,-0.178456 0.120578,-0.299035 0.274919,-0.289388 0.651123,-0.684885 0.376205,-0.4003199 0.805464,-0.8681638 0.327973,-0.356912 0.491959,-0.5353679 0.16881,-0.1832791 0.255626,-0.2845649 0.09164,-0.1012858 0.178456,-0.2073948 0.255626,-0.3086805 0.405144,-0.5257215 0.15434,-0.2170411 0.250803,-0.4292589 0.168809,-0.3762045 0.168809,-0.7524089 0,-0.5980686 -0.352089,-0.935688 -0.356911,-0.3424425 -0.979096,-0.3424425 -0.863341,0 -1.938899,0.6414768 V 4.8361023 q 0.491959,-0.2363335 0.979096,-0.3569119 0.47749,-0.1205783 0.945334,-0.1205783 0.501606,0 0.940511,0.1350477 0.438905,0.1350478 0.766878,0.4244358 0.289388,0.2556261 0.463021,0.6270074 0.173633,0.3665582 0.173633,0.829579 0,0.4726671 -0.212218,0.9501574 -0.106109,0.2411567 -0.274919,0.4726671 -0.163986,0.2266873 -0.424435,0.540191 Q 31.270225,8.501684 31.077299,8.718725 30.884374,8.9357661 30.628748,9.2106847 30.445469,9.4084332 30.286305,9.5675966 30.131965,9.72676 29.958332,9.9003928 29.7847,10.069203 29.558012,10.300713 29.336148,10.5274 29.012998,10.869843 h 3.356901 v 0.819932 h -4.374582 z"
style="stroke-width:0.26458332"
/>
</g>
</g>
</svg>
</template>
<script>
export default {
name: "Sist2Icon"
}
</script>

View File

@ -0,0 +1,143 @@
export default {
en: {
start: "Start",
stop: "Stop",
go: "Go",
online: "online",
offline: "offline",
view: "View",
delete: "Delete",
runNow: "Index now",
runNowFull: "Full re-index",
create: "Create",
cancel: "Cancel",
test: "Test",
confirmation: "Confirmation",
jobTitle: "job configuration",
tasks: "Tasks",
runningTasks: "Running tasks",
frontends: "Frontends",
jobDisabled: "There is no valid index for this job",
status: "Status",
taskHistory: "Task history",
taskName: "Task name",
taskStarted: "Started",
taskDuration: "Duration",
taskStatus: "Status",
logs: "Logs",
kill: "Kill",
killConfirmation: "SIGTERM signal sent to sist2 process",
killConfirmationTitle: "Confirmation",
follow: "Follow",
wholeFile: "Whole file",
logLevel: "Log level",
logMode: "Follow mode",
logFile: "Reading log file",
jobs: "Jobs",
newJobName: "New job name",
newJobHelp: "Create a new job to get started!",
newFrontendName: "New frontend name",
scanned: "last scan",
autoStart: "Start automatically",
runJobConfirmationTitle: "Task queued",
runJobConfirmation: "Check the Tasks page to monitor the status.",
extraQueryArgs: "Extra query arguments when launching from sist2-admin",
customUrl: "Custom URL when launching from sist2-admin",
searchBackends: "Search backends",
searchBackendTitle: "search backend configuration",
newBackendName: "New search backend name",
frontendTab: "Frontend",
backendTab: "Backend",
scripts: "User Scripts",
script: "User Script",
testScript: "Test/debug User Script",
newScriptName: "New script name",
scriptType: "Script type",
scriptCode: "Script code (Python)",
scriptOptions: "User scripts",
gitRepository: "Git repository URL",
extraArgs: "Extra command line arguments",
couldNotStartFrontend: "Could not start frontend",
couldNotStartFrontendBody: "Unable to start the frontend, check server logs for more details.",
selectJobs: "Available jobs",
selectJob: "Select a job",
webOptions: {
title: "Web options",
lang: "UI Language",
bind: "Listen address",
tagline: "Tagline in navbar",
auth: "Basic auth in user:password format",
tagAuth: "Basic auth in user:password format for tagging",
auth0Audience: "Auth0 audience",
auth0Domain: "Auth0 domain",
auth0ClientId: "Auth0 client ID",
auth0PublicKey: "Auth0 public key",
verbose: "Verbose logs"
},
backendOptions: {
title: "Search backend options",
searchBackend: "Search backend",
type: "Search backend type",
esUrl: "Elasticsearch URL",
esIndex: "Elasticsearch index name",
esInsecure: "Do not verify SSL connections to Elasticsearch.",
threads: "Number of threads",
batchSize: "Index batch size",
script: "User script",
searchIndex: "Search index file location"
},
scanOptions: {
title: "Scanning options",
path: "Path",
threads: "Number of threads",
memThrottle: "Total memory threshold in MiB for scan throttling",
thumbnailQuality: "Thumbnail quality, on a scale of 0 to 100, 100 being the best",
thumbnailCount: "Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails.",
thumbnailSize: "Thumbnail size, in pixels",
contentSize: "Number of bytes to be extracted from text documents. Set to 0 to disable",
rewriteUrl: "Serve files from this url instead of from disk",
depth: "Scan up to this many subdirectories deep",
archive: "Archive file mode",
archivePassphrase: "Passphrase for encrypted archive files",
ocrLang: "Tesseract language",
ocrLangAlert: "You must select at least one language",
ocrEbooks: "Enable OCR'ing of ebook files",
ocrImages: "Enable OCR'ing of image files",
exclude: "Files that match this regex will not be scanned",
excludePlaceholder: "Exclude",
fast: "Only index file names & mime type",
checksums: "Calculate file checksums when scanning",
readSubtitles: "Read subtitles from media files",
memBuffer: "Maximum memory buffer size per thread in MiB for files inside archives",
treemapThreshold: "Relative size threshold for treemap",
optimizeIndex: "Defragment index file after scan to reduce its file size."
},
jobOptions: {
title: "Job options",
cron: "Job schedule",
keepNLogs: "Keep last N log files. Set to -1 to keep all logs.",
deleteNow: "Delete now",
scheduleEnabled: "Enable scheduled re-scan",
noJobAvailable: "No jobs available for this search backend.",
notIndexed: "Has not been indexed yet",
noBackendError: "You must select a search backend to run this job",
desktopNotifications: "Desktop notifications"
},
frontendOptions: {
title: "Advanced options",
noJobSelectedWarning: "You must select at least one job to start this frontend"
},
notifications: {
indexCompleted: "Task completed for [$JOB$]"
}
}
}

View File

@ -0,0 +1,31 @@
import Vue from 'vue'
import { BootstrapVue, IconsPlugin } from 'bootstrap-vue'
import "bootstrap/dist/css/bootstrap.min.css"
import "bootstrap-vue/dist/bootstrap-vue.min.css"
Vue.use(BootstrapVue);
Vue.use(IconsPlugin);
import App from './App.vue';
import router from './router';
import store from './store';
import VueI18n from "vue-i18n";
import messages from "@/i18n/messages";
Vue.use(VueI18n);
const i18n = new VueI18n({
locale: "en",
messages: messages
});
Vue.config.productionTip = false
new Vue({
router,
store,
i18n,
render: h => h(App)
}).$mount('#app')

View File

@ -0,0 +1,57 @@
import Vue from 'vue'
import VueRouter from 'vue-router'
import Home from '../views/Home.vue'
import Job from "@/views/Job";
import Tasks from "@/views/Tasks";
import Frontend from "@/views/Frontend";
import Tail from "@/views/Tail";
import SearchBackend from "@/views/SearchBackend.vue";
import UserScript from "@/views/UserScript.vue";
Vue.use(VueRouter);
const routes = [
{
path: "/task",
name: "Tasks",
component: Tasks
},
{
path: "/:tab?",
name: "Home",
component: Home
},
{
path: "/job/:name",
name: "Job",
component: Job
},
{
path: "/frontend/:name",
name: "Frontend",
component: Frontend
},
{
path: "/searchBackend/:name",
name: "SearchBackend",
component: SearchBackend
},
{
path: "/userScript/:name",
name: "UserScript",
component: UserScript
},
{
path: "/log/:taskId",
name: "Tail",
component: Tail
},
]
const router = new VueRouter({
mode: "hash",
base: process.env.BASE_URL,
routes
})
export default router

View File

@ -0,0 +1,63 @@
import Vue from "vue";
import Vuex from "vuex";
Vue.use(Vuex);
function saveBrowserSettings(state) {
const settings = {
jobDesktopNotificationMap: state.jobDesktopNotificationMap
};
localStorage.setItem("sist2-admin-settings", JSON.stringify(settings));
console.log("SAVED");
console.log(settings);
}
export default new Vuex.Store({
state: {
sist2AdminInfo: null,
jobDesktopNotificationMap: {}
},
mutations: {
setSist2AdminInfo: (state, payload) => state.sist2AdminInfo = payload,
setJobDesktopNotificationMap: (state, payload) => state.jobDesktopNotificationMap = payload,
},
actions: {
notify: async ({state}, notification) => {
if (!state.jobDesktopNotificationMap[notification.job]) {
console.log("pass");
return;
}
new Notification(notification.messageString.replace("$JOB$", notification.job));
},
setJobDesktopNotification: async ({state}, {job, enabled}) => {
if (enabled === true) {
const permission = await Notification.requestPermission()
if (permission !== "granted") {
return false;
}
}
state.jobDesktopNotificationMap[job] = enabled;
saveBrowserSettings(state);
return true;
},
loadBrowserSettings({commit}) {
const settingString = localStorage.getItem("sist2-admin-settings");
if (!settingString) {
return;
}
const settings = JSON.parse(settingString);
commit("setJobDesktopNotificationMap", settings["jobDesktopNotificationMap"]);
}
},
modules: {}
})

View File

@ -0,0 +1,8 @@
export function formatBindAddress(address) {
if (address.startsWith("0.0.0.0")) {
return address.slice("0.0.0.0".length)
}
return address
}

View File

@ -0,0 +1,145 @@
<template>
<b-card>
<b-card-title>
{{ name }}
<small style="vertical-align: top">
<b-badge v-if="!loading && frontend.running" variant="success">{{ $t("online") }}</b-badge>
<b-badge v-else-if="!loading" variant="secondary">{{ $t("offline") }}</b-badge>
</small>
</b-card-title>
<!-- Action buttons-->
<div class="mb-3" v-if="!loading">
<b-button class="mr-1" :disabled="frontend.running || !valid" variant="success" @click="start()">{{
$t("start")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="danger" @click="stop()">{{
$t("stop")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="primary" :href="frontendUrl" target="_blank">
{{ $t("go") }}
</b-button>
<b-button variant="danger" @click="deleteFrontend()">{{ $t("delete") }}</b-button>
</div>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card-body v-else>
<h4>{{ $t("backendOptions.title") }}</h4>
<b-card>
<b-alert v-if="!valid" variant="warning" show>{{ $t("frontendOptions.noJobSelectedWarning") }}</b-alert>
<SearchBackendSelect :value="frontend.web_options.search_backend"
@change="onBackendSelect($event)"></SearchBackendSelect>
<br>
<JobCheckboxGroup :frontend="frontend" @input="update()"></JobCheckboxGroup>
</b-card>
<br/>
<WebOptions :options="frontend.web_options" :frontend-name="$route.params.name"
@change="update()"></WebOptions>
<br/>
<h4>{{ $t("frontendOptions.title") }}</h4>
<b-card>
<b-form-checkbox v-model="frontend.auto_start" @change="update()">
{{ $t("autoStart") }}
</b-form-checkbox>
<label>{{ $t("extraQueryArgs") }}</label>
<b-form-input v-model="frontend.extra_query_args" @change="update()"></b-form-input>
<label>{{ $t("customUrl") }}</label>
<b-form-input v-model="frontend.custom_url" @change="update()" placeholder="http://"></b-form-input>
</b-card>
</b-card-body>
</b-card>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
import JobCheckboxGroup from "@/components/JobCheckboxGroup";
import WebOptions from "@/components/WebOptions";
import SearchBackendSelect from "@/components/SearchBackendSelect.vue";
export default {
name: 'Frontend',
components: {SearchBackendSelect, JobCheckboxGroup, WebOptions},
data() {
return {
loading: true,
frontend: null,
}
},
computed: {
valid() {
return !this.loading && this.frontend.jobs.length > 0;
},
frontendUrl() {
if (this.frontend.custom_url) {
return this.frontend.custom_url + this.args;
}
if (this.frontend.web_options.bind.startsWith("0.0.0.0")) {
return window.location.protocol + "//" + window.location.hostname + ":" + this.port + this.args;
}
return window.location.protocol + "//" + this.frontend.web_options.bind + this.args;
},
name() {
return this.$route.params.name;
},
port() {
return this.frontend.web_options.bind.split(":")[1]
},
args() {
const args = this.frontend.extra_query_args;
if (args !== "") {
return "#" + (args.startsWith("?") ? (args) : ("?" + args));
}
return "";
}
},
mounted() {
Sist2AdminApi.getFrontend(this.name).then(resp => {
this.frontend = resp.data;
this.loading = false;
});
},
methods: {
start() {
Sist2AdminApi.startFrontend(this.name).then(() => {
this.frontend.running = true;
}).catch(() => {
this.$bvToast.toast(this.$t("couldNotStartFrontendBody"), {
title: this.$t("couldNotStartFrontend"),
variant: "danger",
toaster: "b-toaster-bottom-right"
});
});
},
stop() {
this.frontend.running = false;
Sist2AdminApi.stopFrontend(this.name)
},
deleteFrontend() {
Sist2AdminApi.deleteFrontend(this.name).then(() => {
this.$router.push("/");
});
},
update() {
Sist2AdminApi.updateFrontend(this.name, this.frontend);
},
onBackendSelect(backend) {
this.frontend.web_options.search_backend = backend;
this.frontend.jobs = [];
this.update();
}
}
}
</script>

View File

@ -0,0 +1,240 @@
<template>
<div>
<b-tabs content-class="mt-3" v-model="tab" @input="onTabChange($event)">
<b-tab :title="$t('backendTab')">
<b-card>
<b-card-title>{{ $t("searchBackends") }}</b-card-title>
<b-row>
<b-col>
<b-input v-model="newBackendName" :placeholder="$t('newBackendName')"></b-input>
</b-col>
<b-col>
<b-button variant="primary" @click="createBackend()"
:disabled="!backendNameValid(newBackendName)">
{{ $t("create") }}
</b-button>
</b-col>
</b-row>
<hr/>
<b-progress v-if="backendsLoading" striped animated value="100"></b-progress>
<b-list-group v-else>
<SearchBackendListItem v-for="backend in backends"
:key="backend.name" :backend="backend"></SearchBackendListItem>
</b-list-group>
</b-card>
<br/>
<b-card>
<b-card-title>{{ $t("jobs") }}</b-card-title>
<b-row>
<b-col>
<b-input id="new-job" v-model="newJobName" :placeholder="$t('newJobName')"></b-input>
<b-popover
:show.sync="showHelp"
target="new-job"
placement="top"
triggers="manual"
variant="primary"
:content="$t('newJobHelp')"
></b-popover>
</b-col>
<b-col>
<b-button variant="primary" @click="createJob()" :disabled="!jobNameValid(newJobName)">
{{ $t("create") }}
</b-button>
</b-col>
</b-row>
<hr/>
<b-progress v-if="jobsLoading" striped animated value="100"></b-progress>
<b-list-group v-else>
<JobListItem v-for="job in jobs" :key="job.name" :job="job"></JobListItem>
</b-list-group>
</b-card>
</b-tab>
<b-tab :title="$t('scripts')">
<b-progress v-if="scriptsLoading" striped animated value="100"></b-progress>
<b-card v-else>
<b-card-title>{{ $t("scripts") }}</b-card-title>
<label>Select template</label>
<b-form-radio-group stacked :options="scriptTemplates" v-model="scriptTemplate"></b-form-radio-group>
<br>
<b-row>
<b-col>
<b-form-input v-model="newScriptName" :disabled="!scriptTemplate" :placeholder="$t('newScriptName')"></b-form-input>
</b-col>
<b-col>
<b-button variant="primary" @click="createScript()"
:disabled="!scriptNameValid(newScriptName)">
{{ $t("create") }}
</b-button>
</b-col>
</b-row>
<hr/>
<b-list-group>
<UserScriptListItem v-for="script in scripts"
:key="script.name" :script="script"></UserScriptListItem>
</b-list-group>
</b-card>
</b-tab>
<b-tab :title="$t('frontendTab')">
<b-card>
<b-card-title>{{ $t("frontends") }}</b-card-title>
<b-row>
<b-col>
<b-input v-model="newFrontendName" :placeholder="$t('newFrontendName')"></b-input>
</b-col>
<b-col>
<b-button variant="primary" @click="createFrontend()"
:disabled="!frontendNameValid(newFrontendName)">
{{ $t("create") }}
</b-button>
</b-col>
</b-row>
<hr/>
<b-progress v-if="frontendsLoading" striped animated value="100"></b-progress>
<b-list-group v-else>
<FrontendListItem v-for="frontend in frontends"
:key="frontend.name" :frontend="frontend"></FrontendListItem>
</b-list-group>
</b-card>
</b-tab>
</b-tabs>
</div>
</template>
<script>
import JobListItem from "@/components/JobListItem";
import {formatBindAddress} from "@/util";
import Sist2AdminApi from "@/Sist2AdminApi";
import FrontendListItem from "@/components/FrontendListItem";
import SearchBackendListItem from "@/components/SearchBackendListItem.vue";
import UserScriptListItem from "@/components/UserScriptListItem.vue";
export default {
name: "Jobs",
components: {UserScriptListItem, SearchBackendListItem, JobListItem, FrontendListItem},
data() {
return {
jobsLoading: true,
newJobName: "",
jobs: [],
frontendsLoading: true,
frontends: [],
formatBindAddress,
newFrontendName: "",
backends: [],
backendsLoading: true,
newBackendName: "",
scripts: [],
scriptTemplates: [],
newScriptName: "",
scriptTemplate: null,
scriptsLoading: true,
showHelp: false,
tab: 0
}
},
mounted() {
this.loading = true;
if (this.$route.params.tab) {
console.log("mounted " + this.$route.params.tab)
window.setTimeout(() => {
this.tab = Math.round(Number(this.$route.params.tab));
}, 1)
}
this.reload();
},
methods: {
jobNameValid(name) {
if (this.jobs.some(job => job.name === name)) {
return false;
}
return /^[a-zA-Z0-9-_,.; ]+$/.test(name);
},
frontendNameValid(name) {
if (this.frontends.some(frontend => frontend.name === name)) {
return false;
}
return /^[a-zA-Z0-9-_,.; ]+$/.test(name);
},
backendNameValid(name) {
if (this.backends.some(backend => backend.name === name)) {
return false;
}
return /^[a-zA-Z0-9-_,.; ]+$/.test(name);
},
scriptNameValid(name) {
if (this.scripts.some(script => script.name === name)) {
return false;
}
if (name.length > 16) {
return false;
}
return /^[a-zA-Z0-9-_,.; ]+$/.test(name);
},
reload() {
Sist2AdminApi.getJobs().then(resp => {
this.jobs = resp.data;
this.jobsLoading = false;
this.showHelp = this.jobs.length === 0;
});
Sist2AdminApi.getFrontends().then(resp => {
this.frontends = resp.data;
this.frontendsLoading = false;
});
Sist2AdminApi.getSearchBackends().then(resp => {
this.backends = resp.data;
this.backendsLoading = false;
})
Sist2AdminApi.getUserScripts().then(resp => {
this.scripts = resp.data;
this.scriptTemplates = this.$store.state.sist2AdminInfo.user_script_templates;
this.scriptsLoading = false;
})
},
createJob() {
Sist2AdminApi.createJob(this.newJobName).then(this.reload);
},
createFrontend() {
Sist2AdminApi.createFrontend(this.newFrontendName).then(this.reload)
},
createBackend() {
Sist2AdminApi.createBackend(this.newBackendName).then(this.reload);
},
createScript() {
Sist2AdminApi.createUserScript(this.newScriptName, this.scriptTemplate).then(this.reload)
},
onTabChange(tab) {
if (this.$route.params.tab != tab) {
this.$router.push({params: {tab: tab}})
}
}
}
}
</script>

View File

@ -0,0 +1,138 @@
<template>
<b-card>
<b-card-title>
[{{ getName() }}]
{{ $t("jobTitle") }}
</b-card-title>
<div class="mb-3">
<b-dropdown
split
split-variant="primary"
variant="primary"
:text="$t('runNow')"
class="mr-1"
:disabled="!valid"
@click="runJob()"
>
<b-dropdown-item href="#" @click="runJob(true)">{{ $t("runNowFull") }}</b-dropdown-item>
</b-dropdown>
<b-button variant="danger" @click="deleteJob()">{{ $t("delete") }}</b-button>
</div>
<div v-if="job">
{{ $t("status") }}: <code>{{ job.status }}</code>
</div>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card-body v-else>
<h4>{{ $t("jobOptions.title") }}</h4>
<b-card>
<JobOptions :job="job" @change="update"></JobOptions>
</b-card>
<br/>
<h4>{{ $t("backendOptions.title") }}</h4>
<b-card>
<b-alert v-if="!valid" variant="warning" show>{{ $t("jobOptions.noBackendError") }}</b-alert>
<SearchBackendSelect :value="job.index_options.search_backend"
@change="onBackendSelect($event)"></SearchBackendSelect>
</b-card>
<br/>
<h4>{{ $t("scriptOptions") }}</h4>
<b-card>
<UserScriptPicker :selected-scripts="job.user_scripts"
@change="onScriptChange($event)"></UserScriptPicker>
</b-card>
<br/>
<h4>{{ $t("scanOptions.title") }}</h4>
<b-card>
<ScanOptions :options="job.scan_options" @change="update()"></ScanOptions>
</b-card>
</b-card-body>
</b-card>
</template>
<script>
import ScanOptions from "@/components/ScanOptions";
import Sist2AdminApi from "@/Sist2AdminApi";
import JobOptions from "@/components/JobOptions";
import SearchBackendSelect from "@/components/SearchBackendSelect.vue";
import UserScriptPicker from "@/components/UserScriptPicker.vue";
export default {
name: "Job",
components: {
UserScriptPicker,
SearchBackendSelect,
ScanOptions,
JobOptions
},
data() {
return {
loading: true,
job: null,
console: console
}
},
methods: {
getName() {
return this.$route.params.name;
},
update() {
Sist2AdminApi.updateJob(this.getName(), this.job);
},
runJob(full = false) {
Sist2AdminApi.runJob(this.getName(), full).then(() => {
this.$bvToast.toast(this.$t("runJobConfirmation"), {
title: this.$t("runJobConfirmationTitle"),
variant: "success",
toaster: "b-toaster-bottom-right"
});
});
},
deleteJob() {
Sist2AdminApi.deleteJob(this.getName())
.then(() => {
this.$router.push("/");
})
.catch(err => {
this.$bvToast.toast("Cannot delete job " +
"because it is referenced by a frontend", {
title: "Error",
variant: "danger",
toaster: "b-toaster-bottom-right"
});
})
},
onBackendSelect(backend) {
this.job.index_options.search_backend = backend;
this.update();
},
onScriptChange(scripts) {
this.job.user_scripts = scripts;
this.update();
}
},
mounted() {
Sist2AdminApi.getJob(this.getName()).then(resp => {
this.loading = false;
this.job = resp.data;
})
},
computed: {
valid() {
return this.job?.index_options.search_backend != null;
}
}
}
</script>

View File

@ -0,0 +1,123 @@
<template>
<b-card>
<b-card-title>
<span class="text-monospace">{{ getName() }}</span>
{{ $t("searchBackendTitle") }}
</b-card-title>
<div class="mb-3">
<b-button variant="danger" @click="deleteBackend()">{{ $t("delete") }}</b-button>
</div>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card-body v-else>
<label>{{ $t("backendOptions.type") }}</label>
<b-select :options="backendTypeOptions" v-model="backend.backend_type" @change="update()"></b-select>
<hr/>
<template v-if="backend.backend_type === 'elasticsearch'">
<b-alert :variant="esTestOk ? 'success' : 'danger'" :show="showEsTestAlert" class="mt-1">
{{ esTestMessage }}
</b-alert>
<label>{{ $t("backendOptions.esUrl") }}</label>
<b-input-group>
<b-form-input v-model="backend.es_url" @change="update()"></b-form-input>
<b-input-group-append>
<b-button variant="outline-primary" @click="testEs()">{{ $t("test") }}</b-button>
</b-input-group-append>
</b-input-group>
<b-form-checkbox v-model="backend.es_insecure_ssl" :disabled="!this.backend.es_url.startsWith('https')"
@change="update()">
{{ $t("backendOptions.esInsecure") }}
</b-form-checkbox>
<label>{{ $t("backendOptions.esIndex") }}</label>
<b-form-input v-model="backend.es_index" @change="update()"></b-form-input>
<label>{{ $t("backendOptions.threads") }}</label>
<b-form-input v-model="backend.threads" type="number" min="1" @change="update()"></b-form-input>
<label>{{ $t("backendOptions.batchSize") }}</label>
<b-form-input v-model="backend.batch_size" type="number" min="1" @change="update()"></b-form-input>
</template>
<template v-else>
<label>{{ $t("backendOptions.searchIndex") }}</label>
<b-form-input v-model="backend.search_index" disabled></b-form-input>
</template>
</b-card-body>
</b-card>
</template>
<script>
import sist2AdminApi from "@/Sist2AdminApi";
import Sist2AdminApi from "@/Sist2AdminApi";
export default {
name: "SearchBackend",
data() {
return {
showEsTestAlert: false,
esTestOk: false,
esTestMessage: "",
loading: true,
backend: null,
backendTypeOptions: [
{
text: "Elasticsearch",
value: "elasticsearch"
},
{
text: "SQLite",
value: "sqlite"
}
]
}
},
mounted() {
Sist2AdminApi.getSearchBackend(this.getName()).then(resp => {
this.backend = resp.data;
this.loading = false;
});
},
methods: {
getName() {
return this.$route.params.name;
},
testEs() {
sist2AdminApi.pingEs(this.backend.es_url, this.backend.es_insecure_ssl)
.then((resp) => {
this.showEsTestAlert = true;
this.esTestOk = resp.data.ok;
this.esTestMessage = resp.data.message;
});
},
update() {
Sist2AdminApi.updateSearchBackend(this.getName(), this.backend);
},
deleteBackend() {
Sist2AdminApi.deleteBackend(this.getName())
.then(() => {
this.$router.push("/");
})
.catch(err => {
this.$bvToast.toast("Cannot delete search backend " +
"because it is referenced by a job or frontend", {
title: "Error",
variant: "danger",
toaster: "b-toaster-bottom-right"
});
})
}
}
}
</script>
<style scoped>
</style>

View File

@ -0,0 +1,175 @@
<template>
<b-card>
<b-card-body>
<h4 class="mb-3">{{ taskId }} {{ $t("logs") }}</h4>
<div v-if="$store.state.sist2AdminInfo">
{{ $t("logFile") }}
<code>{{ $store.state.sist2AdminInfo.logs_folder }}/sist2-{{ taskId }}.log</code>
<br/>
<br/>
</div>
<b-row>
<b-col>
<span>{{ $t("logLevel") }}</span>
<b-select :options="levels.slice(0, -1)" v-model="logLevel" @input="connect()"></b-select>
</b-col>
<b-col>
<span>{{ $t("logMode") }}</span>
<b-select :options="modeOptions" v-model="mode" @input="connect()"></b-select>
</b-col>
</b-row>
<div id="log-tail-output" class="mt-3 ml-1"></div>
</b-card-body>
</b-card>
</template>
<script>
export default {
name: "Tail",
data() {
return {
logLevel: "DEBUG",
levels: ["DEBUG", "INFO", "WARNING", "ERROR", "ADMIN", "FATAL"],
socket: null,
mode: "follow",
modeOptions: [
{
"text": this.$t('follow'),
"value": "follow"
},
{
"text": this.$t('wholeFile'),
"value": "wholeFile"
}
]
}
},
computed: {
taskId: function () {
return this.$route.params.taskId;
}
},
methods: {
connect() {
let lineCount = 0;
const outputElem = document.getElementById("log-tail-output")
outputElem.replaceChildren();
if (this.socket !== null) {
this.socket.close();
}
const n = this.mode === "follow" ? 32 : 9999999999;
if (window.location.protocol === "https:") {
this.socket = new WebSocket(`wss://${window.location.host}/log/${this.taskId}?n=${n}`);
} else {
this.socket = new WebSocket(`ws://${window.location.host}/log/${this.taskId}?n=${n}`);
}
this.socket.onopen = () => {
this.socket.send("Hello from client");
}
this.socket.onmessage = e => {
let message;
try {
message = JSON.parse(e.data);
} catch {
console.error(e.data)
return;
}
if ("ping" in message) {
return;
}
if (message.level === undefined) {
if ("stderr" in message) {
message.level = "ERROR";
message.message = message["stderr"];
} else if ("stdout" in message) {
message.level = "INFO";
message.message = message["stdout"];
} else {
message.level = "ADMIN";
message.message = message["sist2-admin"];
}
message.datetime = ""
message.filepath = ""
}
if (this.levels.indexOf(message.level) < this.levels.indexOf(this.logLevel)) {
return;
}
const logLine = `${message.datetime} [${message.level} ${message.filepath}] ${message.message}`;
const span = document.createElement("span");
span.setAttribute("class", message.level);
span.appendChild(document.createTextNode(logLine));
outputElem.appendChild(span);
lineCount += 1;
if (this.mode === "follow" && lineCount >= n) {
outputElem.firstChild.remove();
}
}
}
},
mounted() {
this.connect()
}
}
</script>
<style>
#log-tail-output span {
display: block;
}
span.DEBUG {
color: #9E9E9E;
}
span.WARNING {
color: #FFB300;
}
span.INFO {
color: #039BE5;
}
span.ERROR {
color: #F4511E;
}
span.FATAL {
color: #F4511E;
}
span.ADMIN {
color: #ee05ff;
}
#log-tail-output {
font-size: 13px;
font-family: monospace;
padding: 6px;
background-color: #f5f5f5;
border: 1px solid #ccc;
border-radius: 4px;
margin: 3px;
white-space: pre;
color: #000;
overflow-y: hidden;
}
</style>

View File

@ -0,0 +1,171 @@
<template>
<div>
<b-card v-if="tasks.length > 0">
<h2>{{ $t("runningTasks") }}</h2>
<b-list-group>
<TaskListItem v-for="task in tasks" :key="task.id" :task="task"></TaskListItem>
</b-list-group>
</b-card>
<b-card class="mt-4">
<b-card-title>{{ $t("taskHistory") }}</b-card-title>
<br/>
<b-table
id="task-history"
:items="historyItems"
:fields="historyFields"
:current-page="historyCurrentPage"
:tbody-tr-class="rowClass"
:per-page="10"
>
<template #cell(logs)="data">
<template v-if="data.item._row.has_logs">
<b-button variant="link" size="sm" :to="`/log/${data.item.id}`">
{{ $t("view") }}
</b-button>
/
<b-button variant="link" size="sm" @click="deleteLogs(data.item.id)">
{{ $t("delete") }}
</b-button>
</template>
</template>
<template #cell(delete)="data">
</template>
</b-table>
<b-pagination limit="20" v-model="historyCurrentPage" :total-rows="historyItems.length"
:per-page="10"></b-pagination>
</b-card>
</div>
</template>
<script>
import TaskListItem from "@/components/TaskListItem";
import Sist2AdminApi from "@/Sist2AdminApi";
import moment from "moment";
const DAY = 3600 * 24;
const HOUR = 3600;
const MINUTE = 60;
function humanDuration(sec_num) {
sec_num = sec_num / 1000;
const days = Math.floor(sec_num / DAY);
sec_num -= days * DAY;
const hours = Math.floor(sec_num / HOUR);
sec_num -= hours * HOUR;
const minutes = Math.floor(sec_num / MINUTE);
sec_num -= minutes * MINUTE;
const seconds = Math.floor(sec_num);
if (days > 0) {
return `${days} days ${hours}h ${minutes}m ${seconds}s`;
}
if (hours > 0) {
return `${hours}h ${minutes}m ${seconds}s`;
}
if (minutes > 0) {
return `${minutes}m ${seconds}s`;
}
if (seconds > 0) {
return `${seconds}s`;
}
return "<1s";
}
export default {
name: 'Tasks',
components: {TaskListItem},
data() {
return {
loading: true,
tasks: [],
taskHistory: [],
timerId: null,
historyFields: [
{key: "name", label: this.$t("taskName")},
{key: "time", label: this.$t("taskStarted")},
{key: "duration", label: this.$t("taskDuration")},
{key: "status", label: this.$t("taskStatus")},
{key: "logs", label: this.$t("logs")},
],
historyCurrentPage: 1,
historyItems: []
}
},
props: {
msg: String
},
mounted() {
this.loading = true;
this.update().then(() => this.loading = false);
this.timerId = window.setInterval(this.update, 1000);
this.updateHistory();
},
destroyed() {
if (this.timerId) {
window.clearInterval(this.timerId);
}
},
methods: {
rowClass(row) {
if (row.status === "failed") {
return "table-danger";
}
return null;
},
updateHistory() {
Sist2AdminApi.getTaskHistory().then(resp => {
this.historyItems = resp.data.map(row => ({
id: row.id,
name: row.name,
duration: this.taskDuration(row),
time: moment.utc(row.started).local().format("dd, MMM Do YYYY, HH:mm:ss"),
logs: null,
status: row.return_code === 0 ? "ok" : "failed",
_row: row
}));
});
},
update() {
return Sist2AdminApi.getTasks().then(resp => {
this.tasks = resp.data;
})
},
taskDuration(task) {
const start = moment.utc(task.started);
const end = moment.utc(task.ended);
return humanDuration(end.diff(start))
},
deleteLogs(taskId) {
Sist2AdminApi.deleteTaskLogs(taskId).then(() => {
this.updateHistory();
})
}
}
}
</script>
<style scoped>
#task-history {
font-family: monospace;
font-size: 12px;
}
.btn-link {
padding: 0;
}
</style>

View File

@ -0,0 +1,117 @@
<template>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card v-else>
<b-card-title>
{{ $route.params.name }}
{{ $t("script") }}
</b-card-title>
<div class="mb-3">
<b-button variant="danger" @click="deleteScript()">{{ $t("delete") }}</b-button>
</div>
<b-card>
<h5>{{ $t("testScript") }}</h5>
<b-row>
<b-col cols="11">
<JobSelect @change="onJobSelect($event)"></JobSelect>
</b-col>
<b-col cols="1">
<b-button :disabled="!selectedTestJob" variant="primary" @click="testScript()">{{ $t("test") }}
</b-button>
</b-col>
</b-row>
</b-card>
<br/>
<label>{{ $t("scriptType") }}</label>
<b-form-select :options="['git', 'simple']" v-model="script.type" @change="update()"></b-form-select>
<template v-if="script.type === 'git'">
<label>{{ $t("gitRepository") }}</label>
<b-form-input v-model="script.git_repository" placeholder="https://github.com/example/example.git"
@change="update()"></b-form-input>
<label>{{ $t("extraArgs") }}</label>
<b-form-input v-model="script.extra_args" @change="update()" class="text-monospace"></b-form-input>
</template>
<template v-if="script.type === 'simple'">
<label>{{ $t("scriptCode") }}</label>
<p>Find sist2-python documentation <a href="https://sist2-python.readthedocs.io/" target="_blank">here</a></p>
<b-textarea rows="15" class="text-monospace" v-model="script.script" @change="update()" spellcheck="false"></b-textarea>
</template>
<template v-if="script.type === 'local'">
<!-- TODO-->
</template>
</b-card>
</template>
<script>
import Sist2AdminApi from "@/Sist2AdminApi";
import JobOptions from "@/components/JobOptions.vue";
import JobCheckboxGroup from "@/components/JobCheckboxGroup.vue";
import JobSelect from "@/components/JobSelect.vue";
export default {
name: "UserScript",
components: {JobSelect, JobCheckboxGroup, JobOptions},
data() {
return {
loading: true,
script: null,
selectedTestJob: null
}
},
methods: {
update() {
Sist2AdminApi.updateUserScript(this.name, this.script);
},
onJobSelect(job) {
this.selectedTestJob = job;
},
deleteScript() {
Sist2AdminApi.deleteUserScript(this.name)
.then(() => {
this.$router.push("/");
})
.catch(err => {
this.$bvToast.toast("Cannot delete user script " +
"because it is referenced by a job", {
title: "Error",
variant: "danger",
toaster: "b-toaster-bottom-right"
});
})
},
testScript() {
Sist2AdminApi.testUserScript(this.name, this.selectedTestJob)
.then(() => {
this.$bvToast.toast(this.$t("runJobConfirmation"), {
title: this.$t("runJobConfirmationTitle"),
variant: "success",
toaster: "b-toaster-bottom-right"
});
})
}
},
mounted() {
Sist2AdminApi.getUserScript(this.name).then(resp => {
this.script = resp.data;
this.loading = false;
});
},
computed: {
name() {
return this.$route.params.name;
},
},
}
</script>

View File

@ -0,0 +1,5 @@
module.exports = {
publicPath: "",
filenameHashing: false,
productionSourceMap: false,
};

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,7 @@
fastapi
git+https://github.com/simon987/hexlib.git
uvicorn
websockets
pycron
GitPython
git+https://github.com/sist2app/sist2-python.git@2.1

View File

@ -0,0 +1,595 @@
import asyncio
import os
import signal
from datetime import datetime
from time import sleep
from urllib.parse import urlparse
import requests
import uvicorn
from fastapi import FastAPI, HTTPException
from hexlib.db import PersistentState
from requests import ConnectionError
from requests.exceptions import SSLError
from starlette.middleware.cors import CORSMiddleware
from starlette.responses import RedirectResponse
from starlette.staticfiles import StaticFiles
from starlette.websockets import WebSocket
from websockets.exceptions import ConnectionClosed
import cron
from config import LOG_FOLDER, logger, WEBSERVER_PORT, DATA_FOLDER, SIST2_BINARY
from jobs import Sist2Job, Sist2ScanTask, TaskQueue, Sist2IndexTask, JobStatus, Sist2UserScriptTask
from notifications import Subscribe, Notifications
from sist2 import Sist2, Sist2SearchBackend
from state import migrate_v1_to_v2, RUNNING_FRONTENDS, TESSERACT_LANGS, DB_SCHEMA_VERSION, migrate_v3_to_v4, \
get_log_files_to_remove, delete_log_file, create_default_search_backends
from web import Sist2Frontend
from script import UserScript, SCRIPT_TEMPLATES
from util import tail_sync, pid_is_running
sist2 = Sist2(SIST2_BINARY, DATA_FOLDER)
db = PersistentState(dbfile=os.path.join(DATA_FOLDER, "state.db"))
notifications = Notifications()
task_queue = TaskQueue(sist2, db, notifications)
app = FastAPI()
app.add_middleware(
CORSMiddleware,
allow_credentials=True,
allow_origins=["*"],
allow_methods=["*"],
allow_headers=["*"],
)
app.mount("/ui/", StaticFiles(directory="./frontend/dist", html=True), name="static")
@app.get("/")
async def home():
return RedirectResponse("ui")
@app.get("/api")
async def api():
return {
"tesseract_langs": TESSERACT_LANGS,
"logs_folder": LOG_FOLDER,
"user_script_templates": list(SCRIPT_TEMPLATES.keys())
}
@app.get("/api/job/{name:str}")
async def get_job(name: str):
job = db["jobs"][name]
if not job:
raise HTTPException(status_code=404)
return job
@app.get("/api/frontend/{name:str}")
async def get_frontend(name: str):
frontend = db["frontends"][name]
frontend: Sist2Frontend
if frontend:
frontend.running = frontend.name in RUNNING_FRONTENDS
return frontend
raise HTTPException(status_code=404)
@app.get("/api/job")
async def get_jobs():
return list(db["jobs"])
@app.put("/api/job/{name:str}")
async def update_job(name: str, new_job: Sist2Job):
new_job.last_modified = datetime.utcnow()
job = db["jobs"][name]
if not job:
raise HTTPException(status_code=404)
args_that_trigger_full_scan = [
"path",
"thumbnail_count",
"thumbnail_quality",
"thumbnail_size",
"content_size",
"depth",
"archive",
"archive_passphrase",
"ocr_lang",
"ocr_images",
"ocr_ebooks",
"fast",
"checksums",
"read_subtitles",
]
for arg in args_that_trigger_full_scan:
if getattr(new_job.scan_options, arg) != getattr(job.scan_options, arg):
new_job.do_full_scan = True
db["jobs"][name] = new_job
@app.put("/api/frontend/{name:str}")
async def update_frontend(name: str, frontend: Sist2Frontend):
db["frontends"][name] = frontend
return "ok"
@app.get("/api/task")
async def get_tasks():
return list(map(lambda t: t.json(), task_queue.tasks()))
@app.get("/api/task/history")
async def task_history():
return list(db["task_done"].sql("ORDER BY started DESC"))
@app.post("/api/task/{task_id:str}/kill")
async def kill_job(task_id: str):
return task_queue.kill_task(task_id)
@app.post("/api/task/{task_id:str}/delete_logs")
async def delete_task_logs(task_id: str):
if not db["task_done"][task_id]:
raise HTTPException(status_code=404)
delete_log_file(db, task_id)
return "ok"
def _run_job(job: Sist2Job):
job.last_modified = datetime.utcnow()
if job.status == JobStatus("created"):
job.status = JobStatus("started")
db["jobs"][job.name] = job
scan_task = Sist2ScanTask(job, f"Scan [{job.name}]")
index_depends_on = scan_task
script_tasks = []
for script_name in job.user_scripts:
script = db["user_scripts"][script_name]
task = Sist2UserScriptTask(script, job, f"Script <{script_name}> [{job.name}]", depends_on=scan_task)
script_tasks.append(task)
index_depends_on = task
index_task = Sist2IndexTask(job, f"Index [{job.name}]", depends_on=index_depends_on)
task_queue.submit(scan_task)
for task in script_tasks:
task_queue.submit(task)
task_queue.submit(index_task)
@app.get("/api/job/{name:str}/run")
async def run_job(name: str, full: bool = False):
job: Sist2Job = db["jobs"][name]
if not job:
raise HTTPException(status_code=404)
if full:
job.do_full_scan = True
_run_job(job)
return "ok"
@app.get("/api/user_script/{name:str}/run")
def run_user_script(name: str, job: str):
script = db["user_scripts"][name]
if not script:
raise HTTPException(status_code=404)
job = db["jobs"][job]
if not job:
raise HTTPException(status_code=404)
script_task = Sist2UserScriptTask(script, job, f"Script <{name}> [{job.name}]")
task_queue.submit(script_task)
return "ok"
@app.get("/api/job/{name:str}/logs_to_delete")
async def task_history(n: int, name: str):
return get_log_files_to_remove(db, name, n)
@app.delete("/api/job/{name:str}")
async def delete_job(name: str):
job: Sist2Job = db["jobs"][name]
if not job:
raise HTTPException(status_code=404)
if any(name in frontend.jobs for frontend in db["frontends"]):
raise HTTPException(status_code=400, detail="in use (frontend)")
try:
os.remove(job.previous_index)
except:
pass
del db["jobs"][name]
return "ok"
@app.delete("/api/frontend/{name:str}")
async def delete_frontend(name: str):
if name in RUNNING_FRONTENDS:
try:
os.kill(RUNNING_FRONTENDS[name], signal.SIGTERM)
except ProcessLookupError:
pass
del RUNNING_FRONTENDS[name]
frontend = db["frontends"][name]
if frontend:
del db["frontends"][name]
else:
raise HTTPException(status_code=404)
@app.post("/api/job/{name:str}")
async def create_job(name: str):
if db["jobs"][name]:
raise ValueError("Job with the same name already exists")
job = Sist2Job.create_default(name)
db["jobs"][name] = job
return job
@app.post("/api/frontend/{name:str}")
async def create_frontend(name: str):
if db["frontends"][name]:
raise ValueError("Frontend with the same name already exists")
frontend = Sist2Frontend.create_default(name)
db["frontends"][name] = frontend
return frontend
@app.get("/api/ping_es")
async def ping_es(url: str, insecure: bool):
return check_es_version(url, insecure)
def check_es_version(es_url: str, insecure: bool):
try:
url = urlparse(es_url)
if url.username:
auth = (url.username, url.password)
es_url = f"{url.scheme}://{url.hostname}:{url.port}"
else:
auth = None
r = requests.get(es_url, verify=not insecure, auth=auth)
except SSLError:
return {
"ok": False,
"message": "Invalid SSL certificate"
}
except ConnectionError as e:
return {
"ok": False,
"message": "Connection refused"
}
except ValueError as e:
return {
"ok": False,
"message": str(e)
}
if r.status_code == 401:
return {
"ok": False,
"message": "Authentication failure"
}
try:
return {
"ok": True,
"message": "Elasticsearch version " + r.json()["version"]["number"]
}
except:
return {
"ok": False,
"message": "Could not read version"
}
def start_frontend_(frontend: Sist2Frontend):
frontend.web_options.indices = [
os.path.join(DATA_FOLDER, db["jobs"][j].index_path)
for j in frontend.jobs
]
backend_name = frontend.web_options.search_backend
search_backend = db["search_backends"][backend_name]
if search_backend is None:
logger.error(
f"Error while running task: search backend not found: {backend_name}")
return -1
logger.debug(f"Fetched search backend options for {backend_name}")
pid = sist2.web(frontend.web_options, search_backend, frontend.name)
sleep(0.2)
if not pid_is_running(pid):
frontend_log = frontend.get_log_path(LOG_FOLDER)
logger.error(f"Frontend exited too quickly, check {frontend_log} for more details:")
for line in tail_sync(frontend.get_log_path(LOG_FOLDER), 3):
logger.error(line.strip())
return False
RUNNING_FRONTENDS[frontend.name] = pid
return True
@app.post("/api/frontend/{name:str}/start")
async def start_frontend(name: str):
frontend = db["frontends"][name]
if not frontend:
raise HTTPException(status_code=404)
ok = start_frontend_(frontend)
if not ok:
raise HTTPException(status_code=500)
return "ok"
@app.post("/api/frontend/{name:str}/stop")
async def stop_frontend(name: str):
if name in RUNNING_FRONTENDS:
os.kill(RUNNING_FRONTENDS[name], signal.SIGTERM)
del RUNNING_FRONTENDS[name]
@app.get("/api/frontend")
async def get_frontends():
res = []
for frontend in db["frontends"]:
frontend: Sist2Frontend
frontend.running = frontend.name in RUNNING_FRONTENDS
res.append(frontend)
return res
@app.get("/api/search_backend")
async def get_search_backends():
return list(db["search_backends"])
@app.put("/api/search_backend/{name:str}")
async def update_search_backend(name: str, backend: Sist2SearchBackend):
if not db["search_backends"][name]:
raise HTTPException(status_code=404)
db["search_backends"][name] = backend
return "ok"
@app.get("/api/search_backend/{name:str}")
def get_search_backend(name: str):
backend = db["search_backends"][name]
if not backend:
raise HTTPException(status_code=404)
return backend
@app.delete("/api/search_backend/{name:str}")
def delete_search_backend(name: str):
backend: Sist2SearchBackend = db["search_backends"][name]
if not backend:
raise HTTPException(status_code=404)
if any(frontend.web_options.search_backend == name for frontend in db["frontends"]):
raise HTTPException(status_code=400, detail="in use (frontend)")
if any(job.index_options.search_backend == name for job in db["jobs"]):
raise HTTPException(status_code=400, detail="in use (job)")
del db["search_backends"][name]
try:
os.remove(os.path.join(DATA_FOLDER, backend.search_index))
except:
pass
return "ok"
@app.post("/api/search_backend/{name:str}")
def create_search_backend(name: str):
if db["search_backends"][name] is not None:
return HTTPException(status_code=400, detail="already exists")
backend = Sist2SearchBackend.create_default(name)
db["search_backends"][name] = backend
return backend
@app.delete("/api/user_script/{name:str}")
def delete_user_script(name: str):
if db["user_scripts"][name] is None:
return HTTPException(status_code=404)
if any(name in job.user_scripts for job in db["jobs"]):
raise HTTPException(status_code=400, detail="in use (job)")
script: UserScript = db["user_scripts"][name]
script.delete_dir()
del db["user_scripts"][name]
return "ok"
@app.post("/api/user_script/{name:str}")
def create_user_script(name: str, template: str):
if db["user_scripts"][name] is not None:
return HTTPException(status_code=400, detail="already exists")
script = SCRIPT_TEMPLATES[template](name)
db["user_scripts"][name] = script
return script
@app.get("/api/user_script")
async def get_user_scripts():
return list(db["user_scripts"])
@app.get("/api/user_script/{name:str}")
async def get_user_script(name: str):
backend = db["user_scripts"][name]
if not backend:
raise HTTPException(status_code=404)
return backend
@app.put("/api/user_script/{name:str}")
async def update_user_script(name: str, script: UserScript):
previous_version: UserScript = db["user_scripts"][name]
if previous_version and previous_version.git_repository != script.git_repository:
script.force_clone = True
db["user_scripts"][name] = script
return "ok"
def tail(filepath: str, n: int):
with open(filepath) as file:
reached_eof = False
buffer = []
line = ""
while True:
tmp = file.readline()
if tmp:
line += tmp
if line.endswith("\n"):
if reached_eof:
yield line
else:
if len(buffer) > n:
buffer.pop(0)
buffer.append(line)
line = ""
else:
if not reached_eof:
reached_eof = True
yield from buffer
yield None
@app.websocket("/notifications")
async def ws_tail_log(websocket: WebSocket):
await websocket.accept()
try:
await websocket.receive_text()
async with Subscribe(notifications) as ob:
async for notification in ob.notifications():
await websocket.send_json(notification)
except ConnectionClosed:
return
@app.websocket("/log/{task_id}")
async def ws_tail_log(websocket: WebSocket, task_id: str, n: int):
log_file = os.path.join(LOG_FOLDER, f"sist2-{task_id}.log")
await websocket.accept()
try:
await websocket.receive_text()
except ConnectionClosed:
return
while True:
for line in tail(log_file, n):
try:
if line:
await websocket.send_text(line)
else:
await websocket.send_json({"ping": ""})
await asyncio.sleep(0.1)
except ConnectionClosed:
return
def main():
uvicorn.run(app, port=WEBSERVER_PORT, host="0.0.0.0", timeout_graceful_shutdown=0)
def initialize_db():
db["sist2_admin"]["info"] = {"version": DB_SCHEMA_VERSION}
frontend = Sist2Frontend.create_default("default")
db["frontends"]["default"] = frontend
create_default_search_backends(db)
logger.info("Initialized database.")
def start_frontends():
for frontend in db["frontends"]:
frontend: Sist2Frontend
if frontend.auto_start and len(frontend.jobs) > 0:
start_frontend_(frontend)
if __name__ == '__main__':
if not db["sist2_admin"]["info"]:
initialize_db()
if db["sist2_admin"]["info"]["version"] == "1":
logger.info("Migrating to v2 database schema")
migrate_v1_to_v2(db)
if db["sist2_admin"]["info"]["version"] == "2":
logger.error("Cannot migrate database from v2 to v3. Delete state.db to proceed.")
exit(-1)
if db["sist2_admin"]["info"]["version"] == "3":
logger.info("Migrating to v4 database schema")
migrate_v3_to_v4(db)
if db["sist2_admin"]["info"]["version"] != DB_SCHEMA_VERSION:
raise Exception(f"Incompatible database {db.dbfile}. "
f"Automatic migration is not available, please delete the database file to continue.")
start_frontends()
cron.initialize(db, _run_job)
logger.info("Started sist2-admin. Hello!")
main()

View File

@ -0,0 +1,32 @@
import os
import logging
import sys
from logging import StreamHandler
from logging.handlers import RotatingFileHandler
MAX_LOG_SIZE = 1 * 1024 * 1024
SIST2_BINARY = os.environ.get("SIST2_BINARY", "/root/sist2")
DATA_FOLDER = os.environ.get("DATA_FOLDER", "/sist2-admin/")
LOG_FOLDER = os.path.join(DATA_FOLDER, "logs")
SCRIPT_FOLDER = os.path.join(DATA_FOLDER, "scripts")
WEBSERVER_PORT = 8080
os.makedirs(LOG_FOLDER, exist_ok=True)
os.makedirs(SCRIPT_FOLDER, exist_ok=True)
os.makedirs(DATA_FOLDER, exist_ok=True)
logger = logging.Logger("sist2-admin")
_log_file = os.path.join(LOG_FOLDER, "sist2-admin.log")
_log_fmt = "%(asctime)s [%(levelname)s] %(message)s"
_log_formatter = logging.Formatter(_log_fmt, datefmt='%Y-%m-%d %H:%M:%S')
console_handler = StreamHandler(sys.stdout)
console_handler.setFormatter(_log_formatter)
file_handler = RotatingFileHandler(_log_file, mode="a", maxBytes=MAX_LOG_SIZE, backupCount=1)
file_handler.setFormatter(_log_formatter)
logger.addHandler(console_handler)
logger.addHandler(file_handler)

View File

@ -0,0 +1,35 @@
from threading import Thread
import pycron
import time
from hexlib.db import PersistentState
from config import logger
from jobs import Sist2Job
def _check_schedule(db: PersistentState, run_job):
jobs = list(db["jobs"])
for job in jobs:
job: Sist2Job
if job.schedule_enabled:
if pycron.is_now(job.cron_expression):
logger.info(f"Submit scan task to queue for [{job.name}]")
run_job(job)
def _cron_thread(db, run_job):
time.sleep(60 - (time.time() % 60))
start = time.time()
while True:
_check_schedule(db, run_job)
time.sleep(60 - ((time.time() - start) % 60))
def initialize(db, run_job):
t = Thread(target=_cron_thread, args=(db, run_job), daemon=True, name="timer")
t.start()

View File

@ -0,0 +1,411 @@
import json
import logging
import os.path
import shlex
import signal
import uuid
from datetime import datetime
from enum import Enum
from io import TextIOWrapper
from logging import FileHandler
from subprocess import Popen
import subprocess
from threading import Lock, Thread
from time import sleep
from typing import List
from uuid import uuid4, UUID
from hexlib.db import PersistentState
from pydantic import BaseModel
from config import logger, LOG_FOLDER, DATA_FOLDER
from notifications import Notifications
from sist2 import ScanOptions, IndexOptions, Sist2
from state import RUNNING_FRONTENDS, get_log_files_to_remove, delete_log_file
from web import Sist2Frontend
from script import UserScript
class JobStatus(Enum):
CREATED = "created"
STARTED = "started"
INDEXED = "indexed"
FAILED = "failed"
class Sist2Job(BaseModel):
name: str
scan_options: ScanOptions
index_options: IndexOptions
user_scripts: List[str] = []
cron_expression: str
schedule_enabled: bool = False
keep_last_n_logs: int = -1
previous_index: str = None
index_path: str = None
previous_index_path: str = None
last_index_date: datetime = None
status: JobStatus = JobStatus("created")
last_modified: datetime
etag: str = None
do_full_scan: bool = False
def __init__(self, **kwargs):
super().__init__(**kwargs)
@staticmethod
def create_default(name: str):
return Sist2Job(
name=name,
scan_options=ScanOptions(path="/"),
index_options=IndexOptions(),
last_modified=datetime.utcnow(),
cron_expression="0 0 * * *"
)
class Sist2TaskProgress:
def __init__(self, done: int = 0, count: int = 0, index_size: int = 0, tn_size: int = 0, waiting: bool = False):
self.done = done
self.count = count
self.index_size = index_size
self.store_size = tn_size
self.waiting = waiting
def percent(self):
return (self.done / self.count) if self.count else 0
class Sist2Task:
def __init__(self, job: Sist2Job, display_name: str, depends_on: uuid.UUID = None):
self.job = job
self.display_name = display_name
self.progress = Sist2TaskProgress()
self.id = uuid4()
self.pid = None
self.started = None
self.ended = None
self.depends_on = depends_on
self._logger = logging.Logger(name=f"{self.id}")
self._logger.addHandler(FileHandler(os.path.join(LOG_FOLDER, f"sist2-{self.id}.log")))
def json(self):
return {
"id": self.id,
"job": self.job,
"display_name": self.display_name,
"progress": self.progress,
"started": self.started,
"ended": self.ended,
"depends_on": self.depends_on,
}
def log_callback(self, log_json):
if "progress" in log_json:
self.progress = Sist2TaskProgress(**log_json["progress"])
elif self._logger:
self._logger.info(json.dumps(log_json))
def run(self, sist2: Sist2, db: PersistentState):
self.started = datetime.utcnow()
logger.info(f"Started task {self.display_name}")
def set_pid(self, pid):
self.pid = pid
class Sist2ScanTask(Sist2Task):
def run(self, sist2: Sist2, db: PersistentState):
super().run(sist2, db)
self.job.scan_options.name = self.job.name
if self.job.index_path is not None and not self.job.do_full_scan:
self.job.scan_options.output = self.job.index_path
else:
self.job.scan_options.output = None
return_code = sist2.scan(self.job.scan_options, logs_cb=self.log_callback, set_pid_cb=self.set_pid)
self.ended = datetime.utcnow()
is_ok = (return_code in (0, 1)) if "debug" in sist2.bin_path else (return_code == 0)
if not is_ok:
self._logger.error(json.dumps({"sist2-admin": f"Process returned non-zero exit code ({return_code})"}))
logger.info(f"Task {self.display_name} failed ({return_code})")
else:
self.job.index_path = self.job.scan_options.output
self.job.last_index_date = datetime.utcnow()
self.job.do_full_scan = False
db["jobs"][self.job.name] = self.job
self._logger.info(json.dumps({"sist2-admin": f"Save last_index_date={self.job.last_index_date}"}))
logger.info(f"Completed {self.display_name} ({return_code=})")
# Remove old index
if is_ok:
if self.job.previous_index_path is not None and self.job.previous_index_path != self.job.index_path:
self._logger.info(json.dumps({"sist2-admin": f"Remove {self.job.previous_index_path=}"}))
try:
os.remove(self.job.previous_index_path)
except FileNotFoundError:
pass
self.job.previous_index_path = self.job.index_path
db["jobs"][self.job.name] = self.job
if is_ok:
return 0
return return_code
class Sist2IndexTask(Sist2Task):
def __init__(self, job: Sist2Job, display_name: str, depends_on: Sist2Task):
super().__init__(job, display_name, depends_on=depends_on.id)
def run(self, sist2: Sist2, db: PersistentState):
super().run(sist2, db)
self.job.index_options.path = self.job.scan_options.output
search_backend = db["search_backends"][self.job.index_options.search_backend]
if search_backend is None:
logger.error(f"Error while running task: search backend not found: {self.job.index_options.search_backend}")
return -1
logger.debug(f"Fetched search backend options for {self.job.index_options.search_backend}")
return_code = sist2.index(self.job.index_options, search_backend, logs_cb=self.log_callback, set_pid_cb=self.set_pid)
self.ended = datetime.utcnow()
duration = self.ended - self.started
ok = return_code in (0, 1)
if ok:
self.restart_running_frontends(db, sist2)
# Update status
self.job.status = JobStatus("indexed") if ok else JobStatus("failed")
self.job.previous_index_path = self.job.index_path
db["jobs"][self.job.name] = self.job
self._logger.info(json.dumps({"sist2-admin": f"Sist2Scan task finished {return_code=}, {duration=}, {ok=}"}))
logger.info(f"Completed {self.display_name} ({return_code=})")
return return_code
def restart_running_frontends(self, db: PersistentState, sist2: Sist2):
for frontend_name, pid in RUNNING_FRONTENDS.items():
frontend = db["frontends"][frontend_name]
frontend: Sist2Frontend
try:
os.kill(pid, signal.SIGTERM)
except ProcessLookupError:
pass
try:
os.wait()
except ChildProcessError:
pass
backend_name = frontend.web_options.search_backend
search_backend = db["search_backends"][backend_name]
if search_backend is None:
logger.error(f"Error while running task: search backend not found: {backend_name}")
return -1
logger.debug(f"Fetched search backend options for {backend_name}")
frontend.web_options.indices = [
os.path.join(DATA_FOLDER, db["jobs"][j].index_path)
for j in frontend.jobs
]
pid = sist2.web(frontend.web_options, search_backend, frontend.name)
RUNNING_FRONTENDS[frontend_name] = pid
self._logger.info(json.dumps({"sist2-admin": f"Restart frontend {pid=} {frontend_name=}"}))
class Sist2UserScriptTask(Sist2Task):
def __init__(self, user_script: UserScript, job: Sist2Job, display_name: str, depends_on: Sist2Task = None):
super().__init__(job, display_name, depends_on=depends_on.id if depends_on else None)
self.user_script = user_script
def run(self, sist2: Sist2, db: PersistentState):
super().run(sist2, db)
try:
self.user_script.setup(self.log_callback, self.set_pid)
except Exception as e:
logger.error(f"Setup for {self.user_script.name} failed: ")
logger.exception(e)
self.log_callback({"sist2-admin": f"Setup for {self.user_script.name} failed: {e}"})
return -1
executable = self.user_script.get_executable()
index_path = os.path.join(DATA_FOLDER, self.job.index_path)
extra_args = self.user_script.extra_args
args = [
executable,
index_path,
*shlex.split(extra_args)
]
self.log_callback({"sist2-admin": f"Starting user script with {executable=}, {index_path=}, {extra_args=}"})
proc = Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=self.user_script.script_dir())
self.set_pid(proc.pid)
t_stderr = Thread(target=self._consume_logs, args=(self.log_callback, proc, "stderr", False))
t_stderr.start()
self._consume_logs(self.log_callback, proc, "stdout", True)
self.ended = datetime.utcnow()
return 0
@staticmethod
def _consume_logs(logs_cb, proc, stream, wait):
pipe_wrapper = TextIOWrapper(getattr(proc, stream), encoding="utf8", errors="ignore")
try:
for line in pipe_wrapper:
if line.strip() == "":
continue
if line.startswith("$PROGRESS"):
progress = json.loads(line[len("$PROGRESS "):])
logs_cb({"progress": progress})
continue
logs_cb({stream: line})
finally:
if wait:
proc.wait()
pipe_wrapper.close()
class TaskQueue:
def __init__(self, sist2: Sist2, db: PersistentState, notifications: Notifications):
self._lock = Lock()
self._sist2 = sist2
self._db = db
self._notifications = notifications
self._tasks = {}
self._queue = []
self._sem = 0
self._thread = Thread(target=self._check_new_task, daemon=True)
self._thread.start()
def _tasks_failed(self):
done = set()
for row in self._db["task_done"].sql("WHERE return_code != 0"):
done.add(uuid.UUID(row["id"]))
return done
def _tasks_done(self):
done = set()
for row in self._db["task_done"]:
done.add(uuid.UUID(row["id"]))
return done
def _check_new_task(self):
while True:
with self._lock:
for task in list(self._queue):
task: Sist2Task
if self._sem >= 1:
break
if not task.depends_on or task.depends_on in self._tasks_done():
self._queue.remove(task)
if task.depends_on in self._tasks_failed():
# The task which we depend on failed, continue
continue
self._sem += 1
t = Thread(target=self._run_task, args=(task,))
self._tasks[task.id] = {
"task": task,
"thread": t,
}
t.start()
break
sleep(1)
def tasks(self):
return list(map(lambda t: t["task"], self._tasks.values()))
def kill_task(self, task_id):
task = self._tasks.get(UUID(task_id))
if task:
pid = task["task"].pid
logger.info(f"Killing task {task_id} (pid={pid})")
os.kill(pid, signal.SIGTERM)
return True
return False
def _run_task(self, task: Sist2Task):
task_result = task.run(self._sist2, self._db)
with self._lock:
del self._tasks[task.id]
self._sem -= 1
self._db["task_done"][task.id] = {
"ended": task.ended,
"started": task.started,
"name": task.display_name,
"return_code": task_result,
"has_logs": 1
}
logs_to_delete = get_log_files_to_remove(self._db, task.job.name, task.job.keep_last_n_logs)
for row in logs_to_delete:
delete_log_file(self._db, row["id"])
if isinstance(task, Sist2IndexTask):
self._notifications.notify({
"message": "notifications.indexCompleted",
"job": task.job.name
})
def submit(self, task: Sist2Task):
logger.info(f"Submitted task to queue {task.display_name}")
with self._lock:
self._queue.append(task)

View File

@ -0,0 +1,40 @@
import asyncio
from typing import List
class Notifications:
def __init__(self):
self._subscribers: List[Subscribe] = []
def subscribe(self, ob):
self._subscribers.append(ob)
def unsubscribe(self, ob):
self._subscribers.remove(ob)
def notify(self, notification: dict):
for ob in self._subscribers:
ob.notify(notification)
class Subscribe:
def __init__(self, notifications: Notifications):
self._queue = []
self._notifications = notifications
async def __aenter__(self):
self._notifications.subscribe(self)
return self
async def __aexit__(self, exc_type, exc_val, exc_tb):
self._notifications.unsubscribe(self)
def notify(self, notification: dict):
self._queue.append(notification)
async def notifications(self):
while True:
try:
yield self._queue.pop(0)
except IndexError:
await asyncio.sleep(0.1)

View File

@ -0,0 +1,130 @@
import os
import shutil
import stat
import subprocess
from enum import Enum
from git import Repo
from pydantic import BaseModel
from config import SCRIPT_FOLDER
class ScriptType(Enum):
LOCAL = "local"
SIMPLE = "simple"
GIT = "git"
def set_executable(file):
os.chmod(file, os.stat(file).st_mode | stat.S_IEXEC)
def _initialize_git_repository(url, path, log_cb, force_clone, set_pid_cb):
log_cb({"sist2-admin": f"Cloning {url}"})
if force_clone or not os.path.exists(os.path.join(path, ".git")):
if force_clone:
shutil.rmtree(path, ignore_errors=True)
Repo.clone_from(url, path)
else:
repo = Repo(path)
repo.remote("origin").pull()
setup_script = os.path.join(path, "setup.sh")
if setup_script:
log_cb({"sist2-admin": f"Executing setup script {setup_script}"})
set_executable(setup_script)
proc = subprocess.Popen([setup_script], cwd=path, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
set_pid_cb(proc.pid)
proc.wait()
stdout = proc.stdout.read()
for line in stdout.split(b"\n"):
if line:
log_cb({"stdout": line.decode()})
log_cb({"stdout": f"Executed setup script {setup_script}, return code = {proc.returncode}"})
if proc.returncode != 0:
raise Exception("Error when running setup script!")
log_cb({"sist2-admin": f"Initialized git repository in {path}"})
class UserScript(BaseModel):
name: str
type: ScriptType
git_repository: str = None
force_clone: bool = False
script: str = None
extra_args: str = ""
def script_dir(self):
return os.path.join(SCRIPT_FOLDER, self.name)
def setup(self, log_cb, set_pid_cb):
os.makedirs(self.script_dir(), exist_ok=True)
if self.type == ScriptType.GIT:
_initialize_git_repository(self.git_repository, self.script_dir(), log_cb, self.force_clone, set_pid_cb)
self.force_clone = False
elif self.type == ScriptType.SIMPLE:
self._setup_simple()
set_executable(self.get_executable())
def _setup_simple(self):
with open(self.get_executable(), "w") as f:
f.write(
"#!/bin/bash\n"
"python run.py \"$@\""
)
with open(os.path.join(self.script_dir(), "run.py"), "w") as f:
f.write(self.script)
def get_executable(self):
return os.path.join(self.script_dir(), "run.sh")
def delete_dir(self):
shutil.rmtree(self.script_dir(), ignore_errors=True)
SCRIPT_TEMPLATES = {
"CLIP - Generate embeddings to predict the most relevant image based on the text prompt": lambda name: UserScript(
name=name,
type=ScriptType.GIT,
git_repository="https://github.com/sist2app/sist2-script-clip",
extra_args="--num-tags=1 --tags-file=general.txt --color=#dcd7ff"
),
"Whisper - Speech to text with OpenAI Whisper": lambda name: UserScript(
name=name,
type=ScriptType.GIT,
git_repository="https://github.com/simon987/sist2-script-whisper",
extra_args="--model=base --num-threads=4 --color=#51da4c --tag"
),
"Hamburger - Simple script example": lambda name: UserScript(
name=name,
type=ScriptType.SIMPLE,
script=
'from sist2 import Sist2Index\n'
'import sys\n'
'\n'
'index = Sist2Index(sys.argv[1])\n'
'for doc in index.document_iter():\n'
' doc.json_data["tag"] = ["hamburger.#00FF00"]\n'
' index.update_document(doc)\n'
'\n'
'index.sync_tag_table()\n'
'index.commit()\n'
'\n'
'print("Done!")\n'
),
"(Blank)": lambda name: UserScript(
name=name,
type=ScriptType.SIMPLE,
script=""
)
}

View File

@ -0,0 +1,363 @@
import datetime
import json
import logging
import os.path
import sys
from datetime import datetime
from enum import Enum
from io import TextIOWrapper
from logging import FileHandler, StreamHandler
from subprocess import Popen, PIPE
from tempfile import NamedTemporaryFile
from threading import Thread
from typing import List
from pydantic import BaseModel
from config import logger, LOG_FOLDER, DATA_FOLDER
class Sist2Version:
def __init__(self, version: str):
self._version = version
self.major, self.minor, self.patch = [int(x) for x in version.split(".")]
def __str__(self):
return f"{self.major}.{self.minor}.{self.patch}"
class SearchBackendType(Enum):
SQLITE = "sqlite"
ELASTICSEARCH = "elasticsearch"
class Sist2SearchBackend(BaseModel):
backend_type: SearchBackendType = SearchBackendType("elasticsearch")
name: str
search_index: str = ""
es_url: str = "http://elasticsearch:9200"
es_insecure_ssl: bool = False
es_index: str = "sist2"
threads: int = 1
batch_size: int = 70
@staticmethod
def create_default(name: str, backend_type: SearchBackendType = SearchBackendType("elasticsearch")):
return Sist2SearchBackend(
name=name,
search_index=f"search-index-{name.replace('/', '_')}.sist2",
backend_type=backend_type
)
class IndexOptions(BaseModel):
path: str = None
incremental_index: bool = True
search_backend: str = None
def __init__(self, **kwargs):
super().__init__(**kwargs)
def args(self, search_backend):
absolute_path = os.path.join(DATA_FOLDER, self.path)
if search_backend.backend_type == SearchBackendType("sqlite"):
search_index_absolute = os.path.join(DATA_FOLDER, search_backend.search_index)
args = ["sqlite-index", absolute_path, "--search-index", search_index_absolute]
else:
args = ["index", absolute_path, f"--threads={search_backend.threads}",
f"--es-url={search_backend.es_url}",
f"--es-index={search_backend.es_index}",
f"--batch-size={search_backend.batch_size}"]
if search_backend.es_insecure_ssl:
args.append(f"--es-insecure-ssl")
if self.incremental_index:
args.append(f"--incremental-index")
return args
ARCHIVE_SKIP = "skip"
ARCHIVE_LIST = "list"
ARCHIVE_SHALLOW = "shallow"
ARCHIVE_RECURSE = "recurse"
class ScanOptions(BaseModel):
path: str
threads: int = 1
thumbnail_quality: int = 50
thumbnail_size: int = 552
thumbnail_count: int = 1
content_size: int = 32768
depth: int = -1
archive: str = ARCHIVE_RECURSE
archive_passphrase: str = None
ocr_lang: str = None
ocr_images: bool = False
ocr_ebooks: bool = False
exclude: str = None
fast: bool = False
treemap_threshold: float = 0.0005
mem_buffer: int = 2000
read_subtitles: bool = False
fast_epub: bool = False
checksums: bool = False
incremental: bool = True
optimize_index: bool = False
output: str = None
name: str = None
rewrite_url: str = None
list_file: str = None
def __init__(self, **kwargs):
super().__init__(**kwargs)
def args(self):
output_path = os.path.join(DATA_FOLDER, self.output)
args = ["scan", self.path, f"--threads={self.threads}", f"--thumbnail-quality={self.thumbnail_quality}",
f"--thumbnail-count={self.thumbnail_count}", f"--thumbnail-size={self.thumbnail_size}",
f"--content-size={self.content_size}", f"--output={output_path}", f"--depth={self.depth}",
f"--archive={self.archive}", f"--mem-buffer={self.mem_buffer}"]
if self.incremental:
args.append(f"--incremental")
if self.optimize_index:
args.append(f"--optimize-index")
if self.rewrite_url:
args.append(f"--rewrite-url={self.rewrite_url}")
if self.name:
args.append(f"--name={self.name}")
if self.archive_passphrase:
args.append(f"--archive-passphrase={self.archive_passphrase}")
if self.ocr_lang:
args.append(f"--ocr-lang={self.ocr_lang}")
if self.ocr_ebooks:
args.append(f"--ocr-ebooks")
if self.ocr_images:
args.append(f"--ocr-images")
if self.exclude:
args.append(f"--exclude={self.exclude}")
if self.fast:
args.append(f"--fast")
if self.treemap_threshold:
args.append(f"--treemap-threshold={self.treemap_threshold}")
if self.read_subtitles:
args.append(f"--read-subtitles")
if self.fast_epub:
args.append(f"--fast-epub")
if self.checksums:
args.append(f"--checksums")
if self.list_file:
args.append(f"--list_file={self.list_file}")
return args
class Sist2Index:
def __init__(self, path):
self.path = path
with open(os.path.join(path, "descriptor.json")) as f:
self._descriptor = json.load(f)
def to_json(self):
return {
"path": self.path,
"version": self.version(),
"timestamp": self.timestamp(),
"name": self.name()
}
def version(self) -> Sist2Version:
return Sist2Version(self._descriptor["version"])
def timestamp(self) -> datetime:
return datetime.fromtimestamp(self._descriptor["timestamp"])
def name(self) -> str:
return self._descriptor["name"]
class WebOptions(BaseModel):
indices: List[str] = []
search_backend: str = "elasticsearch"
bind: str = "0.0.0.0:4090"
auth: str = None
tag_auth: str = None
tagline: str = "Lightning-fast file system indexer and search tool"
dev: bool = False
lang: str = "en"
auth0_audience: str = None
auth0_domain: str = None
auth0_client_id: str = None
auth0_public_key: str = None
auth0_public_key_file: str = None
verbose: bool = False
def __init__(self, **kwargs):
super().__init__(**kwargs)
def args(self, search_backend: Sist2SearchBackend):
args = ["web", f"--bind={self.bind}", f"--tagline={self.tagline}",
f"--lang={self.lang}"]
if search_backend.backend_type == SearchBackendType("sqlite"):
search_index_absolute = os.path.join(DATA_FOLDER, search_backend.search_index)
args.append(f"--search-index={search_index_absolute}")
else:
args.append(f"--es-url={search_backend.es_url}")
args.append(f"--es-index={search_backend.es_index}")
if search_backend.es_insecure_ssl:
args.append(f"--es-insecure-ssl")
if self.auth0_audience:
args.append(f"--auth0-audience={self.auth0_audience}")
if self.auth0_domain:
args.append(f"--auth0-domain={self.auth0_domain}")
if self.auth0_client_id:
args.append(f"--auth0-client-id={self.auth0_client_id}")
if self.auth0_public_key_file:
args.append(f"--auth0-public-key-file={self.auth0_public_key_file}")
if self.auth:
args.append(f"--auth={self.auth}")
if self.tag_auth:
args.append(f"--tag-auth={self.tag_auth}")
if self.dev:
args.append(f"--dev")
if self.verbose:
args.append(f"--very-verbose")
args.extend(self.indices)
return args
class Sist2:
def __init__(self, bin_path: str, data_directory: str):
self.bin_path = bin_path
self._data_dir = data_directory
def index(self, options: IndexOptions, search_backend: Sist2SearchBackend, logs_cb, set_pid_cb):
args = [
self.bin_path,
*options.args(search_backend),
"--json-logs",
"--very-verbose"
]
logs_cb({"sist2-admin": f"Starting sist2 command with args {args}"})
proc = Popen(args, stdout=PIPE, stderr=PIPE)
set_pid_cb(proc.pid)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, None, proc))
t_stderr.start()
self._consume_logs_stdout(logs_cb, proc)
t_stderr.join()
return proc.returncode
def scan(self, options: ScanOptions, logs_cb, set_pid_cb):
if options.output is None:
options.output = f"scan-{options.name.replace('/', '_')}-{datetime.utcnow()}.sist2"
args = [
self.bin_path,
*options.args(),
"--json-logs",
"--very-verbose"
]
logs_cb({"sist2-admin": f"Starting sist2 command with args {args}"})
proc = Popen(args, stdout=PIPE, stderr=PIPE)
set_pid_cb(proc.pid)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, None, proc))
t_stderr.start()
self._consume_logs_stdout(logs_cb, proc)
t_stderr.join()
return proc.returncode
@staticmethod
def _consume_logs_stderr(logs_cb, exit_cb, proc):
pipe_wrapper = TextIOWrapper(proc.stderr, encoding="utf8", errors="ignore")
try:
for line in pipe_wrapper:
if line.strip() == "":
continue
logs_cb({"stderr": line})
finally:
return_code = proc.wait()
if exit_cb:
exit_cb(return_code)
pipe_wrapper.close()
@staticmethod
def _consume_logs_stdout(logs_cb, proc):
pipe_wrapper = TextIOWrapper(proc.stdout, encoding="utf8", errors="ignore")
for line in pipe_wrapper:
try:
if line.strip() == "":
continue
log_object = json.loads(line)
logs_cb(log_object)
except Exception as e:
try:
logs_cb({"sist2-admin": f"Could not decode log line: {line}; {e}"})
except NameError:
pass
def web(self, options: WebOptions, search_backend: Sist2SearchBackend, name: str):
if options.auth0_public_key:
with NamedTemporaryFile("w", prefix="sist2-admin", suffix=".txt", delete=False) as f:
f.write(options.auth0_public_key)
options.auth0_public_key_file = f.name
else:
options.auth0_public_key_file = None
args = [
self.bin_path,
*options.args(search_backend)
]
web_logger = logging.Logger(name=f"sist2-frontend-{name}")
web_logger.addHandler(FileHandler(os.path.join(LOG_FOLDER, f"frontend-{name}.log")))
web_logger.addHandler(StreamHandler())
def logs_cb(message):
web_logger.info(json.dumps(message))
def exit_cb(return_code):
logger.info(f"Web frontend exited with return code {return_code}")
logger.info(f"Starting frontend {' '.join(args)}")
proc = Popen(args, stdout=PIPE, stderr=PIPE)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, exit_cb, proc))
t_stderr.start()
t_stdout = Thread(target=self._consume_logs_stdout, args=(logs_cb, proc))
t_stdout.start()
return proc.pid

View File

@ -0,0 +1,136 @@
from typing import Dict
import os
import shutil
from hexlib.db import Table, PersistentState
import pickle
from tesseract import get_tesseract_langs
import sqlite3
from config import LOG_FOLDER, logger
from sist2 import SearchBackendType, Sist2SearchBackend
RUNNING_FRONTENDS: Dict[str, int] = {}
TESSERACT_LANGS = get_tesseract_langs()
DB_SCHEMA_VERSION = "5"
from pydantic import BaseModel
def _serialize(item):
if isinstance(item, BaseModel):
return pickle.dumps(item)
if isinstance(item, bytes):
raise Exception("FIXME: bytes in PickleTable")
return item
def _deserialize(item):
if isinstance(item, bytes):
return pickle.loads(item)
return item
class PickleTable(Table):
def __getitem__(self, item):
row = super().__getitem__(item)
if row:
return dict((k, _deserialize(v)) for k, v in row.items())
return row
def __setitem__(self, key, value):
value = dict((k, _serialize(v)) for k, v in value.items())
super().__setitem__(key, value)
def __iter__(self):
for row in super().__iter__():
yield dict((k, _deserialize(v)) for k, v in row.items())
def sql(self, where_clause, *params):
for row in super().sql(where_clause, *params):
yield dict((k, _deserialize(v)) for k, v in row.items())
def get_log_files_to_remove(db: PersistentState, job_name: str, n: int):
if n < 0:
return []
counter = 0
to_remove = []
for row in db["task_done"].sql("WHERE has_logs=1 ORDER BY started DESC"):
if row["name"].endswith(f"[{job_name}]"):
counter += 1
if counter > n:
to_remove.append(row)
return to_remove
def delete_log_file(db: PersistentState, task_id: str):
db["task_done"][task_id] = {
"has_logs": 0
}
try:
os.remove(os.path.join(LOG_FOLDER, f"sist2-{task_id}.log"))
except:
pass
def migrate_v1_to_v2(db: PersistentState):
shutil.copy(db.dbfile, db.dbfile + "-before-migrate-v2.bak")
# Frontends
db._table_factory = PickleTable
frontends = [row["frontend"] for row in db["frontends"]]
del db["frontends"]
db._table_factory = Table
for frontend in frontends:
db["frontends"][frontend.name] = frontend
list(db["frontends"])
# Jobs
db._table_factory = PickleTable
jobs = [row["job"] for row in db["jobs"]]
del db["jobs"]
db._table_factory = Table
for job in jobs:
db["jobs"][job.name] = job
list(db["jobs"])
db["sist2_admin"]["info"] = {
"version": "2"
}
def create_default_search_backends(db: PersistentState):
es_backend = Sist2SearchBackend.create_default(name="elasticsearch",
backend_type=SearchBackendType("elasticsearch"))
db["search_backends"]["elasticsearch"] = es_backend
sqlite_backend = Sist2SearchBackend.create_default(name="sqlite", backend_type=SearchBackendType("sqlite"))
db["search_backends"]["sqlite"] = sqlite_backend
def migrate_v3_to_v4(db: PersistentState):
shutil.copy(db.dbfile, db.dbfile + "-before-migrate-v4.bak")
create_default_search_backends(db)
try:
conn = sqlite3.connect(db.dbfile)
conn.execute("ALTER TABLE task_done ADD COLUMN has_logs INTEGER DEFAULT 1")
conn.commit()
conn.close()
except Exception as e:
logger.exception(e)
db["sist2_admin"]["info"] = {
"version": "4"
}

View File

@ -0,0 +1,14 @@
import subprocess
def get_tesseract_langs():
res = subprocess.check_output([
"tesseract",
"--list-langs"
]).decode()
languages = res.split("\n")[1:]
return list(filter(lambda lang: lang and lang != "osd", languages))

View File

@ -0,0 +1,41 @@
from glob import glob
import os
from config import DATA_FOLDER
def get_old_index_files(name):
files = glob(os.path.join(DATA_FOLDER, f"scan-{name.replace('/', '_')}-*.sist2"))
files = list(sorted(files, key=lambda f: os.stat(f).st_mtime))
files = files[-1:]
return files
def tail_sync(filename, lines=1, _buffer=4098):
with open(filename) as f:
lines_found = []
block_counter = -1
while len(lines_found) < lines:
try:
f.seek(block_counter * _buffer, os.SEEK_END)
except IOError:
f.seek(0)
lines_found = f.readlines()
break
lines_found = f.readlines()
block_counter -= 1
return lines_found[-lines:]
def pid_is_running(pid):
try:
os.kill(pid, 0)
except OSError:
return False
return True

View File

@ -0,0 +1,28 @@
import os.path
from typing import List
from pydantic import BaseModel
from sist2 import WebOptions
class Sist2Frontend(BaseModel):
name: str
jobs: List[str]
web_options: WebOptions
running: bool = False
auto_start: bool = False
extra_query_args: str = ""
custom_url: str = None
def get_log_path(self, log_folder: str):
return os.path.join(log_folder, f"frontend-{self.name}.log")
@staticmethod
def create_default(name: str):
return Sist2Frontend(
name=name,
web_options=WebOptions(),
jobs=[]
)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,3 +0,0 @@
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1,user-scalable=no"><title>sist2</title><link href="css/chunk-vendors.css" rel="preload" as="style"><link href="css/index.css" rel="preload" as="style"><link href="js/chunk-vendors.js" rel="preload" as="script"><link href="js/index.js" rel="preload" as="script"><link href="css/chunk-vendors.css" rel="stylesheet"><link href="css/index.css" rel="stylesheet"></head><body><noscript><style>body {
height: initial;
}</style><div style="text-align: center; margin-top: 100px"><strong>We're sorry but sist2 doesn't work properly without JavaScript enabled. Please enable it to continue.</strong><br><strong>Nous sommes désolés mais sist2 ne fonctionne pas correctement si JavaScript est activé. Veuillez l'activer pour continuer.</strong></div></noscript><div id="app"></div><script src="js/chunk-vendors.js"></script><script src="js/index.js"></script></body></html>

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

Binary file not shown.

27448
sist2-vue/package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@ -1,22 +1,23 @@
{
"name": "sist2",
"version": "2.11.0",
"version": "1.0.0",
"private": true,
"scripts": {
"serve": "vue-cli-service serve",
"build": "vue-cli-service build --mode production"
},
"dependencies": {
"@auth0/auth0-spa-js": "^2.0.2",
"@egjs/vue-infinitegrid": "3.3.0",
"axios": "^0.25.0",
"axios": "^1.6.0",
"bootstrap-vue": "^2.21.2",
"core-js": "^3.6.5",
"crypto-es": "^1.2.7",
"d3": "^5.16.0",
"d3": "^5.6.1",
"date-fns": "^2.21.3",
"dom-to-image": "^2.6.0",
"fslightbox-vue": "file:../../../mnt/Hatchery/main/projects/sist2/fslightbox-vue-pro-1.3.1.tgz",
"fslightbox-vue": "fslightbox-vue.tgz",
"nouislider": "^15.2.0",
"onnxruntime-web": "1.15.1",
"underscore": "^1.13.1",
"vue": "^2.6.12",
"vue-color": "^2.8.1",
@ -27,12 +28,12 @@
"vuex": "^3.4.0"
},
"devDependencies": {
"@babel/polyfill": "^7.11.5",
"@vue/cli-plugin-babel": "~4.5.0",
"@vue/cli-plugin-router": "~4.5.0",
"@vue/cli-plugin-typescript": "~4.5.0",
"@vue/cli-plugin-vuex": "~4.5.0",
"@vue/cli-service": "~4.5.0",
"@babel/polyfill": "^7.12.1",
"@types/underscore": "^1.11.6",
"@vue/cli-plugin-babel": "~5.0.8",
"@vue/cli-plugin-router": "~5.0.8",
"@vue/cli-plugin-vuex": "~5.0.8",
"@vue/cli-service": "^5.0.8",
"@vue/test-utils": "^1.0.3",
"bootstrap": "^4.5.2",
"inspire-tree": "^4.3.1",
@ -42,8 +43,7 @@
"portal-vue": "^2.1.7",
"sass": "^1.26.11",
"sass-loader": "^10.0.2",
"typescript": "~4.1.5",
"vue-cli-plugin-bootstrap-vue": "~0.7.0",
"vue-cli-plugin-bootstrap-vue": "~0.8.2",
"vue-template-compiler": "^2.6.11"
},
"browserslist": [

View File

@ -19,12 +19,6 @@
We're sorry but <%= htmlWebpackPlugin.options.title %> doesn't work properly without JavaScript enabled.
Please enable it to continue.
</strong>
<br/>
<strong>
Nous sommes désolés mais <%= htmlWebpackPlugin.options.title %> ne fonctionne pas correctement
si JavaScript est activé.
Veuillez l'activer pour continuer.
</strong>
</div>
</noscript>
<div id="app"></div>

View File

@ -1,317 +1,410 @@
<template>
<div id="app" :class="getClass()">
<NavBar></NavBar>
<router-view v-if="!configLoading"/>
</div>
<div id="app" :class="getClass()" v-if="!authLoading">
<NavBar></NavBar>
<router-view v-if="!configLoading"/>
</div>
<div class="loading-page" v-else>
<div class="loading-spinners">
<b-spinner type="grow" variant="primary"></b-spinner>
<b-spinner type="grow" variant="primary"></b-spinner>
<b-spinner type="grow" variant="primary"></b-spinner>
</div>
<div class="loading-text">
Loading Chargement 装载 Wird geladen Ładowanie
</div>
</div>
</template>
<script>
import NavBar from "@/components/NavBar";
import {mapGetters} from "vuex";
import {mapActions, mapGetters, mapMutations} from "vuex";
import Sist2Api from "@/Sist2Api";
import ModelsRepo from "@/ml/modelsRepo";
import {setupAuth0} from "@/main";
import Sist2ElasticsearchQuery from "@/Sist2ElasticsearchQuery";
import Sist2SqliteQuery from "@/Sist2SqliteQuery";
export default {
components: {NavBar},
data() {
return {
configLoading: false
}
},
computed: {
...mapGetters(["optTheme"]),
},
mounted() {
this.$store.dispatch("loadConfiguration").then(() => {
this.$root.$i18n.locale = this.$store.state.optLang;
});
components: {NavBar},
data() {
return {
configLoading: false,
authLoading: true,
sist2InfoLoading: true
}
},
computed: {
...mapGetters(["optTheme"]),
},
mounted() {
this.$store.dispatch("loadConfiguration").then(() => {
this.$root.$i18n.locale = this.$store.state.optLang;
ModelsRepo.init(this.$store.getters.mlRepositoryList).catch(err => {
this.$bvToast.toast(
this.$t("ml.repoFetchError"),
{
title: this.$t("ml.repoFetchErrorTitle"),
noAutoHide: true,
toaster: "b-toaster-bottom-right",
headerClass: "toast-header-warning",
bodyClass: "toast-body-warning",
});
});
});
this.$store.subscribe((mutation) => {
if (mutation.type === "setOptLang") {
this.$root.$i18n.locale = mutation.payload;
this.configLoading = true;
window.setTimeout(() => this.configLoading = false, 10);
}
});
},
methods: {
getClass() {
return {
"theme-light": this.optTheme === "light",
"theme-black": this.optTheme === "black",
}
this.$store.subscribe((mutation) => {
if (mutation.type === "setOptLang") {
this.$root.$i18n.locale = mutation.payload;
this.configLoading = true;
window.setTimeout(() => this.configLoading = false, 10);
}
if (mutation.type === "setAuth0Token") {
this.authLoading = false;
}
});
Sist2Api.getSist2Info().then(data => {
if (data.auth0Enabled) {
this.authLoading = true;
setupAuth0(data.auth0Domain, data.auth0ClientId, data.auth0Audience)
this.$auth.$watch("loading", loading => {
if (loading === false) {
if (!this.$auth.isAuthenticated) {
this.$auth.loginWithRedirect();
return;
}
// Remove "code" param
window.history.replaceState({}, "", "/" + window.location.hash);
this.$store.dispatch("loadAuth0Token");
}
});
} else {
this.authLoading = false;
}
this.setSist2Info(data);
this.setIndices(data.indices)
if (Sist2Api.backend() === "sqlite") {
Sist2Api.init(Sist2SqliteQuery.searchQuery);
this.$store.commit("setUiSqliteMode", true);
} else {
Sist2Api.init(Sist2ElasticsearchQuery.searchQuery);
}
});
},
methods: {
...mapActions(["setSist2Info",]),
...mapMutations(["setIndices",]),
getClass() {
return {
"theme-light": this.optTheme === "light",
"theme-black": this.optTheme === "black",
}
}
}
}
,
,
}
</script>
<style>
html, body {
height: 100%;
height: 100%;
}
#app {
/*font-family: Avenir, Helvetica, Arial, sans-serif;*/
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
/*text-align: center;*/
color: #2c3e50;
padding-bottom: 1em;
min-height: 100%;
/*font-family: Avenir, Helvetica, Arial, sans-serif;*/
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
/*text-align: center;*/
color: #2c3e50;
padding-bottom: 1em;
min-height: 100%;
}
/*Black theme*/
.theme-black {
background-color: #000;
background-color: #000;
}
.theme-black .card, .theme-black .modal-content {
background: #212121;
color: #e0e0e0;
border-radius: 1px;
border: none;
background: #212121;
color: #e0e0e0;
border-radius: 1px;
border: none;
}
.theme-black .table {
color: #e0e0e0;
color: #e0e0e0;
}
.theme-black .table td, .theme-black .table th {
border: none;
border: none;
}
.theme-black .table thead th {
border-bottom: 1px solid #646464;
border-bottom: 1px solid #646464;
}
.theme-black .custom-select {
overflow: auto;
background-color: #37474F;
border: 1px solid #616161;
color: #bdbdbd;
overflow: auto;
background-color: #37474F;
border: 1px solid #616161;
color: #bdbdbd;
}
.theme-black .custom-select:focus {
border-color: #757575;
outline: 0;
box-shadow: 0 0 0 .2rem rgba(0, 123, 255, .25);
border-color: #757575;
outline: 0;
box-shadow: 0 0 0 .2rem rgba(0, 123, 255, .25);
}
.theme-black .inspire-tree .selected > .wholerow, .theme-black .inspire-tree .selected > .title-wrap:hover + .wholerow {
background: none !important;
background: none !important;
}
.theme-black .inspire-tree .icon-expand::before, .theme-black .inspire-tree .icon-collapse::before {
background-color: black !important;
background-color: black !important;
}
.theme-black .inspire-tree .title {
color: #eee;
color: #eee;
}
.theme-black .inspire-tree {
font-weight: 400;
font-size: 14px;
font-family: Helvetica, Nueue, Verdana, sans-serif;
max-height: 350px;
overflow: auto;
font-weight: 400;
font-size: 14px;
font-family: Helvetica, Nueue, Verdana, sans-serif;
max-height: 350px;
overflow: auto;
}
.inspire-tree [type=checkbox] {
left: 22px !important;
top: 7px !important;
left: 22px !important;
top: 7px !important;
}
.theme-black .form-control {
background-color: #37474F;
border: 1px solid #616161;
color: #dbdbdb !important;
background-color: #37474F;
border: 1px solid #616161;
color: #dbdbdb !important;
}
.theme-black .form-control:focus {
background-color: #546E7A;
color: #fff;
background-color: #546E7A;
color: #fff;
}
.theme-black .input-group-text, .theme-black .default-input {
background: #37474F !important;
border: 1px solid #616161 !important;
color: #dbdbdb !important;
background: #37474F !important;
border: 1px solid #616161 !important;
color: #dbdbdb !important;
}
.theme-black ::placeholder {
color: #BDBDBD !important;
opacity: 1;
color: #BDBDBD !important;
opacity: 1;
}
.theme-black .nav-tabs .nav-link {
color: #e0e0e0;
border-radius: 0;
color: #e0e0e0;
border-radius: 0;
}
.theme-black .nav-tabs .nav-item.show .nav-link, .theme-black .nav-tabs .nav-link.active {
background-color: #212121;
border-color: #616161 #616161 #212121;
color: #e0e0e0;
background-color: #212121;
border-color: #616161 #616161 #212121;
color: #e0e0e0;
}
.theme-black .nav-tabs .nav-link:focus, .theme-black .nav-tabs .nav-link:focus {
border-color: #616161 #616161 #212121;
color: #e0e0e0;
border-color: #616161 #616161 #212121;
color: #e0e0e0;
}
.theme-black .nav-tabs .nav-link:focus, .theme-black .nav-tabs .nav-link:hover {
border-color: #e0e0e0 #e0e0e0 #212121;
color: #e0e0e0;
border-color: #e0e0e0 #e0e0e0 #212121;
color: #e0e0e0;
}
.theme-black .nav-tabs {
border-bottom: #616161;
border-bottom: #616161;
}
.theme-black a:hover, .theme-black .btn:hover {
color: #fff;
color: #fff;
}
.theme-black .b-dropdown a:hover {
color: inherit;
color: inherit;
}
.theme-black .btn {
color: #eee;
color: #eee;
}
.theme-black .modal-header .close {
color: #e0e0e0;
text-shadow: none;
color: #e0e0e0;
text-shadow: none;
}
.theme-black .modal-header {
border-bottom: 1px solid #646464;
border-bottom: 1px solid #646464;
}
/* -------------------------- */
#nav {
padding: 30px;
padding: 30px;
}
#nav a {
font-weight: bold;
color: #2c3e50;
font-weight: bold;
color: #2c3e50;
}
#nav a.router-link-exact-active {
color: #42b983;
color: #42b983;
}
.mobile {
display: none;
display: none;
}
.container {
padding-top: 1em;
padding-top: 1em;
}
@media (max-width: 650px) {
.mobile {
display: initial;
}
.mobile {
display: initial;
}
.not-mobile {
display: none;
}
.not-mobile {
display: none;
}
.grid-single-column .fit {
max-height: none !important;
}
.grid-single-column .fit {
max-height: none !important;
}
.container {
padding-left: 0;
padding-right: 0;
padding-top: 0
}
.container {
padding-left: 0;
padding-right: 0;
padding-top: 0
}
.lightbox-caption {
display: none;
}
.lightbox-caption {
display: none;
}
}
.info-icon {
width: 1rem;
margin-right: 0.2rem;
cursor: pointer;
line-height: 1rem;
height: 1rem;
background-image: url();
filter: brightness(45%);
display: block;
width: 1rem;
min-width: 1rem;
margin-right: 0.2rem;
cursor: pointer;
line-height: 1rem;
height: 1rem;
min-height: 1rem;
background-image: url();
filter: brightness(45%);
display: block;
}
.theme-black .info-icon {
filter: brightness(80%);
}
.tabs {
margin-top: 10px;
margin-top: 10px;
}
.modal-title {
text-overflow: ellipsis;
overflow: hidden;
white-space: nowrap;
text-overflow: ellipsis;
overflow: hidden;
white-space: nowrap;
}
@media screen and (min-width: 1500px) {
.container {
max-width: 1440px;
}
.container {
max-width: 1440px;
}
}
.noUi-connects {
border-radius: 1px !important;
border-radius: 1px !important;
}
mark {
background: #fff217;
border-radius: 0;
padding: 1px 0;
color: inherit;
background: #fff217;
border-radius: 0;
padding: 1px 0;
color: inherit;
}
.theme-black mark {
background: rgba(251, 191, 41, 0.25);
border-radius: 0;
padding: 1px 0;
color: inherit;
background: rgba(251, 191, 41, 0.25);
border-radius: 0;
padding: 1px 0;
color: inherit;
}
.theme-black .content-div mark {
background: rgba(251, 191, 41, 0.40);
color: white;
background: rgba(251, 191, 41, 0.40);
color: white;
}
.content-div {
font-family: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
font-size: 13px;
padding: 1em;
background-color: #f5f5f5;
border: 1px solid #ccc;
border-radius: 4px;
margin: 3px;
white-space: normal;
color: #000;
overflow: hidden;
font-family: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
font-size: 13px;
padding: 1em;
background-color: #f5f5f5;
border: 1px solid #ccc;
border-radius: 4px;
margin: 3px;
white-space: normal;
color: #000;
overflow: hidden;
}
.theme-black .content-div {
background-color: #37474F;
border: 1px solid #616161;
color: #E0E0E0FF;
background-color: #37474F;
border: 1px solid #616161;
color: #E0E0E0FF;
}
.graph {
display: inline-block;
width: 40%;
display: inline-block;
width: 40%;
}
.pointer {
cursor: pointer;
cursor: pointer;
}
.loading-page {
display: flex;
justify-content: center;
align-items: center;
flex-direction: column;
height: 100%;
gap: 15px
}
.loading-spinners {
display: flex;
gap: 10px;
}
.loading-text {
text-align: center;
}
</style>

628
sist2-vue/src/Sist2Api.js Normal file
View File

@ -0,0 +1,628 @@
import axios from "axios";
import {strUnescape, lum, sid} from "./util";
import Sist2Query from "@/Sist2ElasticsearchQuery";
import store from "@/store";
class Sist2Api {
baseUrl;
sist2Info;
queryfunc;
constructor(baseUrl) {
this.baseUrl = baseUrl;
}
init(queryFunc) {
this.queryfunc = queryFunc;
}
backend() {
return this.sist2Info.searchBackend;
}
models() {
const allModels = this.sist2Info.indices
.map(idx => idx.models)
.flat();
return allModels
.filter((v, i, a) => a.findIndex(v2 => (v2.id === v.id)) === i)
}
getSist2Info() {
return axios.get(`${this.baseUrl}i`).then(resp => {
this.sist2Info = resp.data;
return resp.data;
})
}
setHitProps(hit) {
hit["_props"] = {};
const mimeCategory = hit._source.mime == null ? null : hit._source.mime.split("/")[0];
if ("parent" in hit._source) {
hit._props.isSubDocument = true;
}
if ("thumbnail" in hit._source && hit._source.thumbnail > 0) {
hit._props.hasThumbnail = true;
if (Number.isNaN(Number(hit._source.thumbnail))) {
// Backwards compatibility
hit._props.tnNum = 1;
hit._props.hasVidPreview = false;
} else {
hit._props.tnNum = Number(hit._source.thumbnail);
hit._props.hasVidPreview = hit._props.tnNum > 1;
}
}
switch (mimeCategory) {
case "image":
if (hit._source.videoc === "gif") {
hit._props.isGif = true;
} else {
hit._props.isImage = true;
}
if ("width" in hit._source && !hit._props.isSubDocument && hit._source.videoc !== "tiff"
&& hit._source.videoc !== "raw" && hit._source.videoc !== "ppm") {
hit._props.isPlayableImage = true;
}
if ("width" in hit._source && "height" in hit._source) {
hit._props.imageAspectRatio = hit._source.width / hit._source.height;
}
break;
case "video":
if ("videoc" in hit._source) {
hit._props.isVideo = true;
}
if (hit._props.isVideo) {
const videoc = hit._source.videoc;
const mime = hit._source.mime;
hit._props.isPlayableVideo = mime != null &&
mime.startsWith("video/") &&
!hit._props.isSubDocument &&
hit._source.extension !== "mkv" &&
hit._source.extension !== "avi" &&
hit._source.extension !== "mov" &&
videoc !== "hevc" &&
videoc !== "mpeg1video" &&
videoc !== "mpeg2video" &&
videoc !== "wmv3";
}
break;
case "audio":
if ("audioc" in hit._source && !hit._props.isSubDocument) {
hit._props.isAudio = true;
}
break;
}
}
setHitTags(hit) {
const tags = [];
// User tags
if ("tag" in hit._source) {
hit._source.tag.forEach(tag => {
tags.push(this.createUserTag(tag));
})
}
hit._tags = tags;
}
createUserTag(tag) {
const tokens = tag.split(".");
const colorToken = tokens.pop();
const bg = colorToken;
const fg = lum(colorToken) > 50 ? "#000" : "#fff";
return {
style: "user",
fg: fg,
bg: bg,
text: tokens.join("."),
rawText: tag,
userTag: true,
};
}
search() {
if (this.backend() === "sqlite") {
return this.ftsQuery(this.queryfunc())
} else {
return this.esQuery(this.queryfunc());
}
}
_getIndexRoot(indexId) {
return this.sist2Info.indices.find(idx => idx.id === indexId).root;
}
esQuery(query) {
return axios.post(`${this.baseUrl}es`, query).then(resp => {
const res = resp.data;
if (res.hits?.hits) {
res.hits.hits.forEach((hit) => {
hit["_source"]["name"] = strUnescape(hit["_source"]["name"]);
hit["_source"]["path"] = strUnescape(hit["_source"]["path"]);
hit["_source"]["indexRoot"] = this._getIndexRoot(hit["_source"]["index"]);
this.setHitProps(hit);
this.setHitTags(hit);
});
}
return res;
});
}
ftsQuery(query) {
return axios.post(`${this.baseUrl}fts/search`, query).then(resp => {
const res = resp.data;
if (res.hits.hits) {
res.hits.hits.forEach(hit => {
hit["_source"]["name"] = strUnescape(hit["_source"]["name"]);
hit["_source"]["path"] = strUnescape(hit["_source"]["path"]);
this.setHitProps(hit);
this.setHitTags(hit);
if ("highlight" in hit) {
hit["highlight"]["name"] = [hit["highlight"]["name"]];
hit["highlight"]["content"] = [hit["highlight"]["content"]];
}
});
}
return res;
});
}
getMimeTypesEs(query) {
const AGGS = {
mimeTypes: {
terms: {
field: "mime",
size: 10000
}
}
};
if (!query) {
query = {
aggs: AGGS,
size: 0,
};
} else {
query.size = 0;
query.aggs = AGGS;
}
return this.esQuery(query).then(resp => {
return resp["aggregations"]["mimeTypes"]["buckets"].map(bucket => ({
mime: bucket.key,
count: bucket.doc_count
}));
});
}
getMimeTypesSqlite() {
return axios.get(`${this.baseUrl}fts/mimetypes`)
.then(resp => {
return resp.data;
});
}
async getMimeTypes(query = undefined) {
let buckets;
if (this.backend() === "sqlite") {
buckets = await this.getMimeTypesSqlite();
} else {
buckets = await this.getMimeTypesEs(query);
}
const mimeMap = [];
buckets.sort((a, b) => a.mime > b.mime).forEach((bucket) => {
const tmp = bucket.mime.split("/");
const category = tmp[0];
const mime = tmp[1];
let category_exists = false;
const child = {
"id": bucket.mime,
"text": `${mime} (${bucket.count})`
};
mimeMap.forEach(node => {
if (node.text === category) {
node.children.push(child);
category_exists = true;
}
});
if (!category_exists) {
mimeMap.push({text: category, children: [child], id: category});
}
})
mimeMap.forEach(node => {
if (node.children) {
node.children.sort((a, b) => a.id.localeCompare(b.id));
}
})
mimeMap.sort((a, b) => a.id.localeCompare(b.id))
return {buckets, mimeMap};
}
_createEsTag(tag, count) {
const tokens = tag.split(".");
if (/.*\.#[0-9a-fA-F]{6}/.test(tag)) {
return {
id: tokens.slice(0, -1).join("."),
color: tokens.pop(),
isLeaf: true,
count: count
};
}
return {
id: tag,
count: count,
isLeaf: false,
color: undefined
};
}
getTagsEs() {
return this.esQuery({
aggs: {
tags: {
terms: {
field: "tag",
size: 65535
}
}
},
size: 0,
}).then(resp => {
return resp["aggregations"]["tags"]["buckets"]
.sort((a, b) => a["key"].localeCompare(b["key"]))
.map((bucket) => this._createEsTag(bucket["key"], bucket["doc_count"]));
});
}
getTagsSqlite() {
return axios.get(`${this.baseUrl}fts/tags`)
.then(resp => {
return resp.data.map(tag => this._createEsTag(tag.tag, tag.count))
});
}
async getTags() {
let tags;
if (this.backend() === "sqlite") {
tags = await this.getTagsSqlite();
} else {
tags = await this.getTagsEs();
}
// Remove duplicates (same tag with different color)
const seen = new Set();
return tags.filter((t) => {
if (seen.has(t.id)) {
return false;
}
seen.add(t.id);
return true;
});
}
saveTag(tag, hit) {
return axios.post(`${this.baseUrl}tag/${sid(hit)}`, {
delete: false,
name: tag,
});
}
deleteTag(tag, hit) {
return axios.post(`${this.baseUrl}tag/${sid(hit)}`, {
delete: true,
name: tag,
});
}
searchPaths(indexId, minDepth, maxDepth, prefix = null) {
if (this.backend() === "sqlite") {
return this.searchPathsSqlite(indexId, minDepth, minDepth, prefix);
} else {
return this.searchPathsEs(indexId, minDepth, maxDepth, prefix);
}
}
searchPathsSqlite(indexId, minDepth, maxDepth, prefix) {
return axios.post(`${this.baseUrl}fts/paths`, {
indexId, minDepth, maxDepth, prefix
}).then(resp => {
return resp.data;
});
}
searchPathsEs(indexId, minDepth, maxDepth, prefix) {
const query = {
query: {
bool: {
filter: [
{term: {index: indexId}},
{range: {_depth: {gte: minDepth, lte: maxDepth}}},
]
}
},
aggs: {
paths: {
terms: {
field: "path",
size: 10000
}
}
},
size: 0
};
if (prefix != null) {
query["query"]["bool"]["must"] = {
prefix: {
path: prefix,
}
};
}
return this.esQuery(query).then(resp => {
const buckets = resp["aggregations"]["paths"]["buckets"];
if (!buckets) {
return [];
}
return buckets
.map(bucket => ({
path: bucket.key,
count: bucket.doc_count
}));
});
}
getDateRangeSqlite() {
return axios.get(`${this.baseUrl}fts/dateRange`)
.then(resp => ({
min: resp.data.dateMin,
max: (resp.data.dateMax === resp.data.dateMin)
? resp.data.dateMax + 1
: resp.data.dateMax,
}));
}
getDateRange() {
if (this.backend() === "sqlite") {
return this.getDateRangeSqlite();
} else {
return this.getDateRangeEs();
}
}
getDateRangeEs() {
return this.esQuery({
// TODO: filter current selected indices
aggs: {
dateMin: {min: {field: "mtime"}},
dateMax: {max: {field: "mtime"}},
},
size: 0
}).then(res => {
const range = {
min: res.aggregations.dateMin.value / 1000,
max: res.aggregations.dateMax.value / 1000,
}
if (range.min == null) {
range.min = 0;
range.max = 1;
} else if (range.min === range.max) {
range.max += 1;
}
return range;
});
}
getPathSuggestionsSqlite(text) {
return axios.post(`${this.baseUrl}fts/paths`, {
prefix: text,
minDepth: 1,
maxDepth: 10000
}).then(resp => {
return resp.data.map(bucket => bucket.path);
})
}
getPathSuggestionsEs(text) {
return this.esQuery({
suggest: {
path: {
prefix: text,
completion: {
field: "suggest-path",
skip_duplicates: true,
size: 10000
}
}
}
}).then(resp => {
return resp["suggest"]["path"][0]["options"]
.map(opt => opt["_source"]["path"]);
});
}
getPathSuggestions(text) {
if (this.backend() === "sqlite") {
return this.getPathSuggestionsSqlite(text);
} else {
return this.getPathSuggestionsEs(text)
}
}
getTreemapStat(indexId) {
return `${this.baseUrl}s/${indexId}/TMAP`;
}
getMimeStat(indexId) {
return `${this.baseUrl}s/${indexId}/MAGG`;
}
getSizeStat(indexId) {
return `${this.baseUrl}s/${indexId}/SAGG`;
}
getDateStat(indexId) {
return `${this.baseUrl}s/${indexId}/DAGG`;
}
getDocumentEs(sid, highlight, fuzzy) {
const query = Sist2Query.searchQuery();
if (highlight) {
const fields = fuzzy
? {"content.nGram": {}}
: {content: {}};
query.highlight = {
pre_tags: ["<mark>"],
post_tags: ["</mark>"],
number_of_fragments: 0,
fields,
};
if (!store.state.sist2Info.esVersionLegacy) {
query.highlight.max_analyzed_offset = 999_999;
}
}
if ("knn" in query) {
query.query = {
bool: {
must: []
}
};
delete query.knn;
}
if ("function_score" in query.query) {
query.query = query.query.function_score.query;
}
if (!("must" in query.query.bool)) {
query.query.bool.must = [];
} else if (!Array.isArray(query.query.bool.must)) {
query.query.bool.must = [query.query.bool.must];
}
query.query.bool.must.push({match: {_id: sid}});
delete query["sort"];
delete query["aggs"];
delete query["search_after"];
delete query.query["function_score"];
query._source = {
includes: ["content", "name", "path", "extension"]
}
query.size = 1;
return this.esQuery(query).then(resp => {
if (resp.hits.hits.length === 1) {
return resp.hits.hits[0];
}
return null;
});
}
getDocumentSqlite(sid) {
return axios.get(`${this.baseUrl}fts/d/${sid}`)
.then(resp => ({
_source: resp.data
}));
}
getDocument(sid, highlight, fuzzy) {
if (this.backend() === "sqlite") {
return this.getDocumentSqlite(sid);
} else {
return this.getDocumentEs(sid, highlight, fuzzy);
}
}
getTagSuggestions(prefix) {
if (this.backend() === "sqlite") {
return this.getTagSuggestionsSqlite(prefix);
} else {
return this.getTagSuggestionsEs(prefix);
}
}
getTagSuggestionsSqlite(prefix) {
return axios.post(`${this.baseUrl}fts/suggestTags`, prefix)
.then(resp => (resp.data));
}
getTagSuggestionsEs(prefix) {
return this.esQuery({
suggest: {
tag: {
prefix: prefix,
completion: {
field: "suggest-tag",
skip_duplicates: true,
size: 10000
}
}
}
}).then(resp => {
const result = [];
resp["suggest"]["tag"][0]["options"].map(opt => opt["_source"]["tag"]).forEach(tags => {
tags.forEach(tag => {
const t = tag.slice(0, -8);
if (!result.find(x => x.slice(0, -8) === t)) {
result.push(tag);
}
});
});
return result;
});
}
getEmbeddings(sid, modelId) {
return axios.post(`${this.baseUrl}e/${sid}/${modelId.toString().padStart(3, '0')}`)
.then(resp => (resp.data));
}
}
export default new Sist2Api("");

View File

@ -1,414 +0,0 @@
import axios from "axios";
import {ext, strUnescape, lum} from "./util";
import CryptoES from 'crypto-es';
export interface EsTag {
id: string
count: number
color: string | undefined
isLeaf: boolean
}
export interface Tag {
style: string
text: string
rawText: string
fg: string
bg: string
userTag: boolean
}
export interface Index {
name: string
version: string
id: string
idPrefix: string
timestamp: number
}
export interface EsHit {
_index: string
_id: string
_score: number
_path_md5: string
_type: string
_tags: Tag[]
_seq: number
_source: {
path: string
size: number
mime: string
name: string
extension: string
index: string
_depth: number
mtime: number
videoc: string
audioc: string
parent: string
width: number
height: number
duration: number
tag: string[]
checksum: string
thumbnail: string
}
_props: {
isSubDocument: boolean
isImage: boolean
isGif: boolean
isVideo: boolean
isPlayableVideo: boolean
isPlayableImage: boolean
isAudio: boolean
hasThumbnail: boolean
hasVidPreview: boolean
/** Number of thumbnails available */
tnNum: number
}
highlight: {
name: string[] | undefined,
content: string[] | undefined,
}
}
function getIdPrefix(indices: Index[], id: string): string {
for (let i = 4; i < 32; i++) {
const prefix = id.slice(0, i);
if (indices.filter(idx => idx.id.slice(0, i) == prefix).length == 1) {
return prefix;
}
}
return id;
}
export interface EsResult {
took: number
hits: {
// TODO: ES 6.X ?
total: {
value: number
}
hits: EsHit[]
}
aggregations: any
}
class Sist2Api {
private baseUrl: string
constructor(baseUrl: string) {
this.baseUrl = baseUrl;
}
getSist2Info(): Promise<any> {
return axios.get(`${this.baseUrl}i`).then(resp => {
const indices = resp.data.indices as Index[];
resp.data.indices = indices.map(idx => {
return {
id: idx.id,
name: idx.name,
timestamp: idx.timestamp,
version: idx.version,
idPrefix: getIdPrefix(indices, idx.id)
} as Index;
});
return resp.data;
})
}
setHitProps(hit: EsHit): void {
hit["_props"] = {} as any;
const mimeCategory = hit._source.mime == null ? null : hit._source.mime.split("/")[0];
if ("parent" in hit._source) {
hit._props.isSubDocument = true;
}
if ("thumbnail" in hit._source) {
hit._props.hasThumbnail = true;
if (Number.isNaN(Number(hit._source.thumbnail))) {
// Backwards compatibility
hit._props.tnNum = 1;
hit._props.hasVidPreview = false;
} else {
hit._props.tnNum = Number(hit._source.thumbnail);
hit._props.hasVidPreview = hit._props.tnNum > 1;
}
}
switch (mimeCategory) {
case "image":
if (hit._source.videoc === "gif") {
hit._props.isGif = true;
} else {
hit._props.isImage = true;
}
if ("width" in hit._source && !hit._props.isSubDocument && hit._source.videoc !== "tiff"
&& hit._source.videoc !== "raw" && hit._source.videoc !== "ppm") {
hit._props.isPlayableImage = true;
}
break;
case "video":
if ("videoc" in hit._source) {
hit._props.isVideo = true;
}
if (hit._props.isVideo) {
const videoc = hit._source.videoc;
const mime = hit._source.mime;
hit._props.isPlayableVideo = mime != null &&
mime.startsWith("video/") &&
!hit._props.isSubDocument &&
hit._source.extension !== "mkv" &&
hit._source.extension !== "avi" &&
hit._source.extension !== "mov" &&
videoc !== "hevc" &&
videoc !== "mpeg1video" &&
videoc !== "mpeg2video" &&
videoc !== "wmv3";
}
break;
case "audio":
if ("audioc" in hit._source && !hit._props.isSubDocument) {
hit._props.isAudio = true;
}
break;
}
}
setHitTags(hit: EsHit): void {
const tags = [] as Tag[];
const mimeCategory = hit._source.mime == null ? null : hit._source.mime.split("/")[0];
switch (mimeCategory) {
case "image":
case "video":
if ("videoc" in hit._source && hit._source.videoc) {
tags.push({
style: "video",
text: hit._source.videoc.replace(" ", ""),
userTag: false
} as Tag);
}
break
case "audio":
if ("audioc" in hit._source && hit._source.audioc) {
tags.push({
style: "audio",
text: hit._source.audioc,
userTag: false
} as Tag);
}
break;
}
// User tags
if ("tag" in hit._source) {
hit._source.tag.forEach(tag => {
tags.push(this.createUserTag(tag));
})
}
hit._tags = tags;
}
createUserTag(tag: string): Tag {
const tokens = tag.split(".");
const colorToken = tokens.pop() as string;
const bg = colorToken;
const fg = lum(colorToken) > 50 ? "#000" : "#fff";
return {
style: "user",
fg: fg,
bg: bg,
text: tokens.join("."),
rawText: tag,
userTag: true,
} as Tag;
}
esQuery(query: any): Promise<EsResult> {
return axios.post(`${this.baseUrl}es`, query).then(resp => {
const res = resp.data as EsResult;
if (res.hits?.hits) {
res.hits.hits.forEach((hit: EsHit) => {
hit["_source"]["name"] = strUnescape(hit["_source"]["name"]);
hit["_source"]["path"] = strUnescape(hit["_source"]["path"]);
hit["_path_md5"] = CryptoES.MD5(
hit["_source"]["path"] +
(hit["_source"]["path"] ? "/" : "") +
hit["_source"]["name"] + ext(hit)
).toString();
this.setHitProps(hit);
this.setHitTags(hit);
});
}
return res;
});
}
getMimeTypes(query = undefined) {
const AGGS = {
mimeTypes: {
terms: {
field: "mime",
size: 10000
}
}
};
if (!query) {
query = {
aggs: AGGS,
size: 0,
};
} else {
query.size = 0;
query.aggs = AGGS;
}
return this.esQuery(query).then(resp => {
const mimeMap: any[] = [];
const buckets = resp["aggregations"]["mimeTypes"]["buckets"];
buckets.sort((a: any, b: any) => a.key > b.key).forEach((bucket: any) => {
const tmp = bucket["key"].split("/");
const category = tmp[0];
const mime = tmp[1];
let category_exists = false;
const child = {
"id": bucket["key"],
"text": `${mime} (${bucket["doc_count"]})`
};
mimeMap.forEach(node => {
if (node.text === category) {
node.children.push(child);
category_exists = true;
}
});
if (!category_exists) {
mimeMap.push({text: category, children: [child], id: category});
}
})
mimeMap.forEach(node => {
if (node.children) {
node.children.sort((a, b) => a.id.localeCompare(b.id));
}
})
mimeMap.sort((a, b) => a.id.localeCompare(b.id))
return {buckets, mimeMap};
});
}
_createEsTag(tag: string, count: number): EsTag {
const tokens = tag.split(".");
if (/.*\.#[0-9a-f]{6}/.test(tag)) {
return {
id: tokens.slice(0, -1).join("."),
color: tokens.pop(),
isLeaf: true,
count: count
};
}
return {
id: tag,
count: count,
isLeaf: false,
color: undefined
};
}
getDocInfo(docId: string) {
return axios.get(`${this.baseUrl}d/${docId}`);
}
getTags() {
return this.esQuery({
aggs: {
tags: {
terms: {
field: "tag",
size: 10000
}
}
},
size: 0,
}).then(resp => {
const seen = new Set();
const tags = resp["aggregations"]["tags"]["buckets"]
.sort((a: any, b: any) => a["key"].localeCompare(b["key"]))
.map((bucket: any) => this._createEsTag(bucket["key"], bucket["doc_count"]));
// Remove duplicates (same tag with different color)
return tags.filter((t: EsTag) => {
if (seen.has(t.id)) {
return false;
}
seen.add(t.id);
return true;
});
});
}
saveTag(tag: string, hit: EsHit) {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
delete: false,
name: tag,
doc_id: hit["_id"],
path_md5: hit._path_md5
});
}
deleteTag(tag: string, hit: EsHit) {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
delete: true,
name: tag,
doc_id: hit["_id"],
path_md5: hit._path_md5
});
}
getTreemapCsvUrl(indexId: string) {
return `${this.baseUrl}s/${indexId}/1`;
}
getMimeCsvUrl(indexId: string) {
return `${this.baseUrl}s/${indexId}/2`;
}
getSizeCsv(indexId: string) {
return `${this.baseUrl}s/${indexId}/3`;
}
getDateCsv(indexId: string) {
return `${this.baseUrl}s/${indexId}/4`;
}
}
export default new Sist2Api("");

View File

@ -1,5 +1,5 @@
import store from "./store";
import {EsHit, Index} from "@/Sist2Api";
import store from "@/store";
import Sist2Api from "@/Sist2Api";
const SORT_MODES = {
score: {
@ -7,69 +7,62 @@ const SORT_MODES = {
{_score: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._score
key: (hit) => hit._score
},
random: {
mode: [
{_score: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._score
key: (hit) => hit._score
},
dateAsc: {
mode: [
{mtime: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.mtime
key: (hit) => hit._source.mtime
},
dateDesc: {
mode: [
{mtime: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.mtime
key: (hit) => hit._source.mtime
},
sizeAsc: {
mode: [
{size: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.size
key: (hit) => hit._source.size
},
sizeDesc: {
mode: [
{size: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.size
key: (hit) => hit._source.size
},
nameAsc: {
mode: [
{name: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
key: (hit) => hit._source.name
},
nameDesc: {
mode: [
{name: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
key: (hit) => hit._source.name
}
} as any;
};
interface SortMode {
text: string
mode: any[]
key: (hit: EsHit) => any
}
class Sist2ElasticsearchQuery {
class Sist2Query {
searchQuery(): any {
searchQuery(blankSearch = false) {
const getters = store.getters;
@ -83,31 +76,17 @@ class Sist2Query {
const fuzzy = getters.fuzzy;
const size = getters.size;
const after = getters.lastDoc;
const selectedIndexIds = getters.selectedIndices.map((idx: Index) => idx.id)
const selectedIndexIds = getters.selectedIndices.map((idx) => idx.id)
const selectedMimeTypes = getters.selectedMimeTypes;
const selectedTags = getters.selectedTags;
const sortMode = getters.embedding ? "score" : getters.sortMode;
const legacyES = store.state.sist2Info.esVersionLegacy;
const hasKnn = store.state.sist2Info.esVersionHasKnn;
const filters = [
{terms: {index: selectedIndexIds}}
] as any[];
if (sizeMin && sizeMax) {
filters.push({range: {size: {gte: sizeMin, lte: sizeMax}}})
} else if (sizeMin) {
filters.push({range: {size: {gte: sizeMin}}})
} else if (sizeMax) {
filters.push({range: {size: {lte: sizeMax}}})
}
if (dateMin && dateMax) {
filters.push({range: {mtime: {gte: dateMin, lte: dateMax}}})
} else if (dateMin) {
filters.push({range: {mtime: {gte: dateMin}}})
} else if (dateMax) {
filters.push({range: {mtime: {lte: dateMax}}})
}
];
const fields = [
"name^8",
@ -128,20 +107,39 @@ class Sist2Query {
fields.push("name.nGram^3");
}
const path = pathText.replace(/\/$/, "").toLowerCase(); //remove trailing slashes
if (path !== "") {
filters.push({term: {path: path}})
}
if (!blankSearch) {
if (sizeMin && sizeMax) {
filters.push({range: {size: {gte: sizeMin, lte: sizeMax}}})
} else if (sizeMin) {
filters.push({range: {size: {gte: sizeMin}}})
} else if (sizeMax) {
filters.push({range: {size: {lte: sizeMax}}})
}
if (selectedMimeTypes.length > 0) {
filters.push({terms: {"mime": selectedMimeTypes}});
}
if (dateMin && dateMax) {
filters.push({range: {mtime: {gte: dateMin, lte: dateMax, format: "epoch_second"}}})
} else if (dateMin) {
filters.push({range: {mtime: {gte: dateMin, format: "epoch_second"}}})
} else if (dateMax) {
filters.push({range: {mtime: {lte: dateMax, format: "epoch_second"}}})
}
if (selectedTags.length > 0) {
if (getters.optTagOrOperator) {
filters.push({terms: {"tag": selectedTags}});
} else {
selectedTags.forEach((tag: string) => filters.push({term: {"tag": tag}}));
const path = pathText.replace(/\/$/, "").toLowerCase(); //remove trailing slashes
if (path !== "") {
filters.push({term: {path: path}})
}
if (selectedMimeTypes.length > 0) {
filters.push({terms: {"mime": selectedMimeTypes}});
}
if (selectedTags.length > 0) {
if (getters.optTagOrOperator) {
filters.push({terms: {"tag": selectedTags}});
} else {
selectedTags.forEach((tag) => filters.push({term: {"tag": tag}}));
}
}
}
@ -166,31 +164,76 @@ class Sist2Query {
const q = {
_source: {
excludes: ["content", "_tie"]
excludes: ["content", "_tie", "emb.*"]
},
query: {
bool: {
filter: filters,
}
},
sort: SORT_MODES[getters.sortMode].mode,
aggs:
{
total_size: {"sum": {"field": "size"}},
total_count: {"value_count": {"field": "size"}}
},
sort: SORT_MODES[sortMode].mode,
size: size,
} as any;
};
if (!empty) {
q.query.bool.must = query;
if (!after) {
q.aggs = {
total_size: {"sum": {"field": "size"}},
total_count: {"value_count": {"field": "size"}}
};
}
if (!empty && !blankSearch) {
if (getters.embedding) {
filters.push(query)
} else {
q.query.bool.must = query;
}
}
if (getters.embedding) {
delete q.query;
const field = "emb." + Sist2Api.models().find(m => m.id === getters.embeddingsModel).path;
if (hasKnn) {
// Use knn (8.8+)
q.knn = {
field: field,
query_vector: getters.embedding,
k: 600,
num_candidates: 600,
filter: filters
}
} else {
// Use brute-force as a fallback
filters.push({exists: {field: field}});
q.query = {
function_score: {
query: {
bool: {
must: filters,
}
},
script_score: {
script: {
source: `cosineSimilarity(params.query_vector, "${field}") + 1.0`,
params: {query_vector: getters.embedding}
}
}
}
}
}
}
if (after) {
q.search_after = [SORT_MODES[getters.sortMode].key(after), after["_id"]];
q.search_after = [SORT_MODES[sortMode].key(after), after["_id"]];
}
if (getters.optHighlight) {
if (getters.optHighlight && !getters.embedding) {
q.highlight = {
pre_tags: ["<mark>"],
post_tags: ["</mark>"],
@ -207,7 +250,7 @@ class Sist2Query {
};
if (!legacyES) {
q.highlight.max_analyzed_offset = 9_999_999;
q.highlight.max_analyzed_offset = 999_999;
}
if (getters.optSearchInPath) {
@ -216,7 +259,7 @@ class Sist2Query {
}
}
if (getters.sortMode === "random") {
if (sortMode === "random") {
q.query = {
function_score: {
query: {
@ -237,7 +280,7 @@ class Sist2Query {
}
}
if (!empty) {
if (!empty && !blankSearch) {
q.query.function_score.query.bool.must.push(query);
}
}
@ -246,4 +289,5 @@ class Sist2Query {
}
}
export default new Sist2Query();
export default new Sist2ElasticsearchQuery();

View File

@ -0,0 +1,116 @@
import store from "@/store";
const SORT_MODES = {
score: {
"sort": "score",
},
random: {
"sort": "random"
},
dateAsc: {
"sort": "mtime"
},
dateDesc: {
"sort": "mtime",
"sortAsc": false
},
sizeAsc: {
"sort": "size",
},
sizeDesc: {
"sort": "size",
"sortAsc": false
},
nameAsc: {
"sort": "name",
},
nameDesc: {
"sort": "name",
"sortAsc": false
}
};
class Sist2ElasticsearchQuery {
searchQuery() {
const getters = store.getters;
const searchText = getters.searchText;
const pathText = getters.pathText;
const sizeMin = getters.sizeMin;
const sizeMax = getters.sizeMax;
const dateMin = getters.dateMin;
const dateMax = getters.dateMax;
const size = getters.size;
const after = getters.lastDoc;
const selectedIndexIds = getters.selectedIndices.map((idx) => idx.id)
const selectedMimeTypes = getters.selectedMimeTypes;
const selectedTags = getters.selectedTags;
const q = {
"pageSize": size
}
Object.assign(q, SORT_MODES[getters.sortMode]);
if (!after) {
q["fetchAggregations"] = true;
}
if (searchText) {
q["query"] = searchText;
}
if (pathText) {
q["path"] = pathText.endsWith("/") ? pathText.slice(0, -1) : pathText;
}
if (sizeMin) {
q["sizeMin"] = sizeMin;
}
if (sizeMax) {
q["sizeMax"] = sizeMax;
}
if (dateMin) {
q["dateMin"] = dateMin;
}
if (dateMax) {
q["dateMax"] = dateMax;
}
if (after) {
q["after"] = after.sort;
}
if (selectedIndexIds.length > 0) {
q["indexIds"] = selectedIndexIds;
}
if (selectedMimeTypes.length > 0) {
q["mimeTypes"] = selectedMimeTypes;
}
if (selectedTags.length > 0) {
q["tags"] = selectedTags
}
if (getters.sortMode === "random") {
q["seed"] = getters.seed;
}
if (getters.optHighlight) {
q["highlight"] = true;
q["highlightContextSize"] = Number(getters.optFragmentSize);
}
if (getters.embedding) {
q["model"] = getters.embeddingsModel;
q["embedding"] = getters.embedding;
q["sort"] = "embedding";
q["sortAsc"] = false;
} else if (getters.sortMode === "embedding") {
q["sort"] = "sort"
q["sortAsc"] = true;
}
q["searchInPath"] = getters.optSearchInPath;
return q;
}
}
export default new Sist2ElasticsearchQuery();

View File

@ -0,0 +1,21 @@
<template>
<span :style="getStyle()">{{span.text}}</span>
</template>
<script>
import ModelsRepo from "@/ml/modelsRepo";
export default {
name: "AnalyzedContentSpan",
props: ["span", "text"],
methods: {
getStyle() {
return ModelsRepo.data[this.$store.getters.nerModel.name].labelStyles[this.span.label];
}
}
}
</script>
<style scoped></style>

View File

@ -0,0 +1,75 @@
<template>
<div>
<b-card class="mb-2">
<AnalyzedContentSpan v-for="span of legend" :key="span.id" :span="span"
class="mr-2"></AnalyzedContentSpan>
</b-card>
<div class="content-div">
<AnalyzedContentSpan v-for="span of mergedSpans" :key="span.id" :span="span"></AnalyzedContentSpan>
</div>
</div>
</template>
<script>
import AnalyzedContentSpan from "@/components/AnalyzedContentSpan.vue";
import ModelsRepo from "@/ml/modelsRepo";
export default {
name: "AnalyzedContentSpanContainer",
components: {AnalyzedContentSpan},
props: ["spans", "text"],
computed: {
legend() {
return Object.entries(ModelsRepo.data[this.$store.state.nerModel.name].legend)
.map(([label, name]) => ({
text: name,
id: label,
label: label
}));
},
mergedSpans() {
const spans = this.spans;
const merged = [];
let lastLabel = null;
let fixSpace = false;
for (let i = 0; i < spans.length; i++) {
if (spans[i].label !== lastLabel) {
let start = spans[i].wordIndex;
const nextSpan = spans.slice(i + 1).find(s => s.label !== spans[i].label)
let end = nextSpan ? nextSpan.wordIndex : undefined;
if (end !== undefined && this.text[end - 1] === " ") {
end -= 1;
fixSpace = true;
}
merged.push({
text: this.text.slice(start, end),
label: spans[i].label,
id: spans[i].wordIndex
});
if (fixSpace) {
merged.push({
text: " ",
label: "O",
id: end
});
fixSpace = false;
}
lastLabel = spans[i].label;
}
}
return merged;
},
},
}
</script>
<style scoped></style>

Some files were not shown because too many files have changed in this diff Show More