Compare commits

..

77 Commits

Author SHA1 Message Date
dependabot[bot]
847cdaf3e5 Bump micromatch from 4.0.5 to 4.0.8 in /sist2-admin/frontend
Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/master/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-01 02:05:31 +00:00
2436e52a62 Merge pull request #479 from Kiskadee-dev/master
Update README.md
2024-04-26 10:03:12 -04:00
Matheus Victor
c3a09d0683 Update README.md 2024-04-26 10:41:25 -03:00
b9f82593ce Fix onnx 2024-04-03 20:24:30 -04:00
59bc418a95 Fix loadModel 2024-04-03 20:03:46 -04:00
fc06b3e378 Fix crash for leftover documents in sqlite index 2024-04-03 18:38:09 -04:00
89e1968994 Merge pull request #474 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/express-4.19.2
Bump express from 4.18.2 to 4.19.2 in /sist2-admin/frontend
2024-04-03 16:10:02 -04:00
7009c082e1 Merge pull request #473 from simon987/dependabot/npm_and_yarn/sist2-vue/webpack-dev-middleware-5.3.4
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-vue
2024-04-03 16:09:57 -04:00
64d6bc04a7 Merge pull request #472 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/webpack-dev-middleware-5.3.4
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-admin/frontend
2024-04-03 16:09:48 -04:00
a2655edf2f Merge pull request #470 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/follow-redirects-1.15.6
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-admin/frontend
2024-04-03 16:09:40 -04:00
86212ece64 Merge pull request #469 from simon987/dependabot/npm_and_yarn/sist2-vue/follow-redirects-1.15.6
Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-vue
2024-04-03 16:09:31 -04:00
61170ce503 Update README 2024-04-03 16:08:40 -04:00
7ae410dcc7 fix package-lock.json (again) 2024-04-03 15:51:30 -04:00
dependabot[bot]
8714e7e41a Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-vue
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)

---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 19:46:41 +00:00
dependabot[bot]
4a804b7319 Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-vue
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-03 19:46:38 +00:00
4f83a044c7 Fix aarch64 build 2024-04-03 15:45:32 -04:00
6e15201a05 Fix package-lock.json 2024-04-03 15:44:35 -04:00
6bb12a563a Version bump 2024-04-03 14:40:03 -04:00
4567f52668 Add toggle for verbose web logs 2024-04-03 14:39:44 -04:00
774efe062f Fix for newer node version in debug script 2024-04-03 14:27:17 -04:00
7a7a0686c2 Fixes for new mongoose version 2024-04-03 14:26:54 -04:00
7bc2ef9e6c Add debug print statement.. 2024-04-03 14:26:33 -04:00
f65cca5a02 Fix NULL mime SQLite index 2024-04-03 14:26:20 -04:00
6423643e24 Fix right click on images in lightbox, update lightbox 2024-04-03 14:24:54 -04:00
f99ea74e3f Passthrough frontend logs to stdout 2024-04-03 11:18:54 -04:00
1f8f65044c 3rd party lib updates 2024-04-03 11:18:24 -04:00
0981a1f421 Update compose file to add ES persistence.. 2024-04-03 09:22:17 -04:00
ff066a3962 Fix build for GCC 12 2024-04-03 09:15:00 -04:00
dependabot[bot]
1e778b6f2a Bump express from 4.18.2 to 4.19.2 in /sist2-admin/frontend
Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2)

---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-28 17:15:39 +00:00
dependabot[bot]
ff27a540eb Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /sist2-admin/frontend
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)

---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-23 11:35:55 +00:00
dependabot[bot]
83259eedee Bump follow-redirects from 1.15.4 to 1.15.6 in /sist2-admin/frontend
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-16 23:11:52 +00:00
simon987
aff69fb3eb Force user to not have both --auth and --tag-auth at the same time in the UI #453 2024-01-30 10:49:44 -05:00
simon987
08b6323176 Add error message when frontend does not start 2024-01-30 10:42:16 -05:00
2307fc6e15 Merge pull request #455 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/follow-redirects-1.15.4
Bump follow-redirects from 1.15.0 to 1.15.4 in /sist2-admin/frontend
2024-01-14 09:15:19 -05:00
d679e4c3ca Merge pull request #456 from simon987/dependabot/npm_and_yarn/sist2-vue/follow-redirects-1.15.4
Bump follow-redirects from 1.15.2 to 1.15.4 in /sist2-vue
2024-01-14 09:15:13 -05:00
f423a17543 Merge pull request #458 from SystemZ/fix-tail-width
fix tail horizontal scrolling
2024-01-14 09:15:04 -05:00
Michał Frąckiewicz
1bdf4d71dd fix tail horizontal scrolling
Before this change, debugging via logs was hard due to clipping width of the log box
2024-01-13 11:25:35 +01:00
dependabot[bot]
f58e66352c Bump follow-redirects from 1.15.2 to 1.15.4 in /sist2-vue
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.2 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-10 11:22:12 +00:00
dependabot[bot]
a672822811 Bump follow-redirects from 1.15.0 to 1.15.4 in /sist2-admin/frontend
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.0 to 1.15.4.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.15.0...v1.15.4)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-10 00:13:02 +00:00
simon987
ae317e590d Update MIN_OCR_LEN to 3 2024-01-07 09:16:40 -05:00
simon987
410283f14a Remove debug print 2023-12-10 09:21:14 -05:00
simon987
2936240df8 Disable OSD, add preserve_interword_spaces for chi_sim OCR (#443) 2023-12-10 09:20:43 -05:00
af5059f366 Update USAGE.md 2023-12-02 09:25:03 -05:00
simon987
03983ce00a Fix for #439 2023-11-19 15:46:26 -05:00
simon987
80528857e9 Duplicate media_comment field, fixes #440 2023-11-18 10:53:42 -05:00
ffa7f2ae84 Add button for full reindex, fixes #403 2023-11-18 10:53:42 -05:00
6ade3395d5 Merge pull request #437 from simon987/dependabot/npm_and_yarn/sist2-vue/axios-1.6.0
Bump axios from 0.25.0 to 1.6.0 in /sist2-vue
2023-11-11 09:26:01 -05:00
a2d5e774b3 Merge pull request #438 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/axios-1.6.0
Bump axios from 0.27.2 to 1.6.0 in /sist2-admin/frontend
2023-11-11 09:25:49 -05:00
dependabot[bot]
19ea1169ff Bump axios from 0.27.2 to 1.6.0 in /sist2-admin/frontend
Bumps [axios](https://github.com/axios/axios) from 0.27.2 to 1.6.0.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.27.2...v1.6.0)

---
updated-dependencies:
- dependency-name: axios
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-11 06:17:01 +00:00
dependabot[bot]
1225fd6bac Bump axios from 0.25.0 to 1.6.0 in /sist2-vue
Bumps [axios](https://github.com/axios/axios) from 0.25.0 to 1.6.0.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/v1.x/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.25.0...v1.6.0)

---
updated-dependencies:
- dependency-name: axios
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-10 23:42:56 +00:00
687b645840 Merge pull request #434 from simon987/dependabot/npm_and_yarn/sist2-admin/frontend/babel/traverse-7.23.2
Bump @babel/traverse from 7.20.5 to 7.23.2 in /sist2-admin/frontend
2023-10-19 08:36:23 -04:00
d2c8f9209d Merge pull request #433 from simon987/dependabot/npm_and_yarn/sist2-vue/babel/traverse-7.23.2
Bump @babel/traverse from 7.20.12 to 7.23.2 in /sist2-vue
2023-10-19 08:36:13 -04:00
dependabot[bot]
3ea375b37d Bump @babel/traverse from 7.20.5 to 7.23.2 in /sist2-admin/frontend
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.20.5 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse)

---
updated-dependencies:
- dependency-name: "@babel/traverse"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-18 20:02:12 +00:00
dependabot[bot]
bff89d93e6 Bump @babel/traverse from 7.20.12 to 7.23.2 in /sist2-vue
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.20.12 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse)

---
updated-dependencies:
- dependency-name: "@babel/traverse"
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-18 17:03:04 +00:00
f423863acb Add option to search in path for sqlite #402 2023-10-16 21:14:46 -04:00
49a21a5a25 Version bump 2023-10-13 19:02:26 -04:00
560aa82ce7 add discord invite 2023-10-09 18:10:51 -04:00
b8c905bd64 expose indexRoot value to documents 2023-10-08 21:00:03 -04:00
8299237ea0 version bump 2023-10-07 15:04:19 -04:00
31646a2747 Fix CURL error 2023-10-07 13:16:22 -04:00
d9d77de47f Update docs 2023-10-07 11:07:13 -04:00
5f0957d029 Update readme 2023-10-07 10:15:41 -04:00
1cc48f7f33 Version bump 2023-10-07 10:15:03 -04:00
e1e22fd79a Add missing image 2023-09-28 17:52:57 -04:00
786bbc3859 Make sure dateMin and dateMax are not equal with sqlite frontend 2023-09-27 19:49:43 -04:00
9698ea0c37 Fix #419 2023-09-26 19:51:17 -04:00
f345fc1a9a version bump, 2023-09-26 17:58:43 -04:00
660fbf75d8 Potential tpool_wait fix for #416, #397, #399 2023-09-26 17:58:07 -04:00
33ae585879 Fix #426 2023-09-25 21:36:50 -04:00
5729cbd6b4 update gitignore 2023-09-16 09:44:29 -04:00
a19ec3305a Fix #415, fix sqlite-index error 2023-09-16 09:43:55 -04:00
8fdb832c85 refactor index schema, remove sidecar parsing, remove TS 2023-09-05 18:59:18 -04:00
b81ccebdb1 Fix #406 2023-08-20 19:53:50 -04:00
b2d214a19a Merge pull request #412 from simon987/dependabot/npm_and_yarn/sist2-vue/protobufjs-6.11.4
Bump protobufjs from 6.11.3 to 6.11.4 in /sist2-vue
2023-08-20 19:45:09 -04:00
69438464bf Fix python path for user scripts 2023-08-19 17:04:47 -04:00
dependabot[bot]
aa60b526f4 Bump protobufjs from 6.11.3 to 6.11.4 in /sist2-vue
Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 6.11.3 to 6.11.4.
- [Release notes](https://github.com/protobufjs/protobuf.js/releases)
- [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md)
- [Commits](https://github.com/protobufjs/protobuf.js/commits)

---
updated-dependencies:
- dependency-name: protobufjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-08-19 20:09:14 +00:00
2ea8b51b34 Merge pull request #411 from simon987/embeddings
rework user scripts, add support for embedding search
2023-08-19 16:08:50 -04:00
119 changed files with 9132 additions and 18497 deletions

View File

@@ -1,9 +0,0 @@
FROM simon987/sist2-build
RUN curl -fsSL https://deb.nodesource.com/setup_16.x | bash
RUN apt update -y; apt install -y nodejs && rm -rf /var/lib/apt/lists/*
ENV DEBIAN_FRONTEND=noninteractive
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8

View File

@@ -1,16 +0,0 @@
{
"name": "sist2-dev",
"dockerComposeFile": [
"docker-compose.yml"
],
"service": "sist2-dev",
"customizations": {
"vscode": {
"extensions": [
"ms-vscode.cpptools-extension-pack"
]
}
},
"remoteUser": "root",
"workspaceFolder": "/app/"
}

View File

@@ -1,8 +0,0 @@
version: "3"
services:
  sist2-dev:
    build: .
    command: sleep infinity
    volumes:
      - ../:/app

View File

@@ -38,4 +38,6 @@ build/
__pycache__/
sist2-vue/dist
sist2-admin/frontend/dist
*.fts
*.fts
.git
third-party/libscan/third-party/ext_*/*

.gitignore (vendored, 1 change)
View File

@@ -3,6 +3,7 @@ thumbs
*.cbp
CMakeCache.txt
CMakeFiles
cmake-build-default-event-trace
cmake-build-debug
cmake_install.cmake
Makefile

View File

@@ -53,7 +53,6 @@ add_executable(
src/types.h
src/log.c src/log.h
src/cli.c src/cli.h
src/parsing/sidecar.c src/parsing/sidecar.h
src/database/database.c src/database/database.h
src/parsing/fs_util.h

View File

@@ -48,5 +48,6 @@ COPY --from=build /build/build/sist2 /root/sist2
# sist2-admin
WORKDIR /root/sist2-admin
COPY sist2-admin/requirements.txt /root/sist2-admin/
RUN python3 -m pip install --no-cache -r /root/sist2-admin/requirements.txt
RUN ln /usr/bin/python3 /usr/bin/python
RUN python -m pip install --no-cache -r /root/sist2-admin/requirements.txt
COPY --from=build /build/sist2-admin/ /root/sist2-admin/

View File

@@ -4,6 +4,8 @@
**Demo**: [sist2.simon987.net](https://sist2.simon987.net/)
**Community URL:** [Discord](https://discord.gg/2PEjDy3Rfs)
# sist2
sist2 (Simple incremental search tool)
@@ -42,20 +44,28 @@ services:
elasticsearch:
image: elasticsearch:7.17.9
restart: unless-stopped
volumes:
# This directory must have 1000:1000 permissions (or update PUID & PGID below)
- /data/sist2-es-data/:/usr/share/elasticsearch/data
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
- "PUID=1000"
- "PGID=1000"
sist2-admin:
image: simon987/sist2:3.1.4-x64-linux
image: simon987/sist2:3.4.2-x64-linux
restart: unless-stopped
volumes:
- ./sist2-admin-data/:/sist2-admin/
- /data/sist2-admin-data/:/sist2-admin/
- /:/host
ports:
- 4090:4090 # sist2
- 8080:8080 # sist2-admin
- 4090:4090
# NOTE: Don't expose this port publicly!
- 8080:8080
working_dir: /root/sist2-admin/
entrypoint: python3 /root/sist2-admin/sist2_admin/app.py
entrypoint: python3
command:
- /root/sist2-admin/sist2_admin/app.py
```
Navigate to http://localhost:8080/ to configure sist2-admin.
@@ -80,8 +90,10 @@ Example usage:
1. Scan a directory: `sist2 scan ~/Documents --output ./documents.sist2`
2. Prepare search index:
* **Elasticsearch**: `sist2 index --es-url http://localhost:9200 ./documents.sist2`
* **SQLite**: `sist2 index --search-index ./search.sist2 ./documents.sist2`
3. Start web interface: `sist2 web ./documents.sist2`
* **SQLite**: `sist2 sqlite-index --search-index ./search.sist2 ./documents.sist2`
3. Start web interface:
* **Elasticsearch**: `sist2 web ./documents.sist2`
* **SQLite**: `sist2 web --search-index ./search.sist2 ./documents.sist2`
## Format support
@@ -153,10 +165,10 @@ indices, but it uses much less memory and is easier to set up.
| Query syntax | [fts5](https://www.sqlite.org/fts5.html) | [query_string](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax) |
| Fuzzy search | | ✓ |
| Media Types tree real-time updating | | ✓ |
| Search in file `path` | [WIP](https://github.com/simon987/sist2/issues/402) | ✓ |
| Manual tagging | ✓ | ✓ |
| User scripts | ✓ | ✓ |
| Media Type breakdown for search results | | ✓ |
| Embeddings search | ✓ *O(n)* | ✓ *O(logn)* |
### NER
@@ -206,7 +218,7 @@ docker run --rm --entrypoint cat my-sist2-image /root/sist2 > sist2-x64-linux
3. Install vcpkg dependencies
```bash
vcpkg install openblas curl[core,openssl] sqlite3[core,fts5] cpp-jwt pcre cjson brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf[ocr] gtest mongoose libmagic libraw gumbo ffmpeg[core,avcodec,avformat,swscale,swresample,webp,opus,mp3lame,vpx,zlib]
vcpkg install openblas curl[core,openssl] sqlite3[core,fts5,json1] cpp-jwt pcre cjson brotli libarchive[core,bzip2,libxml2,lz4,lzma,lzo] pthread tesseract libxml2 libmupdf[ocr] gtest mongoose libmagic libraw gumbo ffmpeg[core,avcodec,avformat,swscale,swresample,webp,opus,mp3lame,vpx,zlib]
```
4. Build

View File

@@ -4,15 +4,20 @@ services:
elasticsearch:
image: elasticsearch:7.17.9
container_name: sist2-es
volumes:
# This directory must have 1000:1000 permissions (or update PUID & PGID below)
- /data/sist2-es-data/:/usr/share/elasticsearch/data
environment:
- "discovery.type=single-node"
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
- "PUID=1000"
- "PGID=1000"
sist2-admin:
build:
context: .
container_name: sist2-admin
volumes:
- /mnt/array/sist2-admin-data/:/sist2-admin/
- /data/sist2-admin-data/:/sist2-admin/
- /:/host
ports:
- 4090:4090

View File

@@ -16,9 +16,9 @@ Lightning-fast file system indexer and search tool.
Scan options
-t, --threads=<int> Number of threads. DEFAULT: 1
-q, --thumbnail-quality=<int> Thumbnail quality, on a scale of 0 to 100, 100 being the best. DEFAULT: 50
--thumbnail-size=<int> Thumbnail size, in pixels. DEFAULT: 552
--thumbnail-count=<int> Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT: 1
-q, --thumbnail_count-quality=<int> Thumbnail quality, on a scale of 0 to 100, 100 being the best. DEFAULT: 50
--thumbnail_count-size=<int> Thumbnail size, in pixels. DEFAULT: 552
--thumbnail_count-count=<int> Number of thumbnails to generate. Set a value > 1 to create video previews, set to 0 to disable thumbnails. DEFAULT: 1
--content-size=<int> Number of bytes to be extracted from text documents. Set to 0 to disable. DEFAULT: 32768
-o, --output=<str> Output index file path. DEFAULT: index.sist2
--incremental If the output file path exists, only scan new or modified files.
@@ -78,9 +78,9 @@ Made by simon987 <me@simon987.net>. Released under GPL-3.0
#### Thumbnail database size estimation
See chart below for rough estimate of thumbnail size vs. thumbnail size & quality arguments:
See chart below for rough estimate of thumbnail_count size vs. thumbnail_count size & quality arguments:
For example, `--thumbnail-size=500`, `--thumbnail-quality=50` for a directory with 8 million images will create a thumbnail database
For example, `--thumbnail_count-size=500`, `--thumbnail_count-quality=50` for a directory with 8 million images will create a thumbnail_count database
that is about `8000000 * 11.8kB = 94.4GB`.
![thumbnail_size](thumbnail_size.png)
@@ -92,7 +92,7 @@ Simple scan
sist2 scan ~/Documents
sist2 scan \
--threads 4 --content-size 16000000 --thumbnail-quality 2 --archive shallow \
--threads 4 --content-size 16000000 --thumbnail_count-quality 2 --archive shallow \
--name "My Documents" --rewrite-url "http://nas.domain.local/My Documents/" \
~/Documents -o ./documents.sist2
```
@@ -172,9 +172,39 @@ Using a version >=7.14.0 is recommended to enable the following features:
- Bug fix for large documents (See #198)
Using a version >=8.0.0 is recommended to enable the following features:
- Approximate KNN search for Embeddings search (faster queries).
When using a legacy version of ES, a notice will be displayed next to the sist2 version in the web UI.
If you don't care about the features above, you can ignore it or disable it in the configuration page.
# Embeddings search
Since v3.2.0, User scripts can be used to generate _embeddings_ (vector of float32 numbers) which are stored in the .sist2 index file
(see [scripting](scripting.md)). Embeddings can be used for:
* Nearest-neighbor queries (e.g. "return the documents most similar to this one")
* Semantic searches (e.g. "return the documents that are most closely related to the given topic")
In theory, embeddings can be created for any type of document (image, text, audio, etc.).
For example, the [clip](https://github.com/simon987/sist2-script-clip) User Script generates 512-d embeddings of images
(videos are also supported, using the thumbnails generated by sist2). When the user enters a query in the "Embeddings Search"
textbox, the query's embedding is generated in their browser, leveraging the ONNX web runtime.
<details>
<summary>Screenshots</summary>
![embeddings-1](embeddings-1.png)
![embeddings-2](embeddings-2.png)
1. Embeddings search bar. You can select the model using the dropdown on the left.
2. This icon appears for indices with embeddings search enabled.
3. Documents with this icon have embeddings. Click on the icon to perform KNN search.
</details>
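
To make the *O(n)* and *O(log n)* entries in the comparison table above concrete: a brute-force nearest-neighbour lookup has to score every stored embedding. The sketch below is illustrative only and is not sist2's implementation; the `embedding` table layout follows the benchmark schema that appears later in this diff, and the raw float32 blob encoding is an assumption.

```python
# Illustrative sketch, not sist2 code. Assumes embeddings are stored as raw
# float32 blobs in an `embedding(id, model_id, start, end, embedding)` table.
import sqlite3
import numpy as np

def knn_search(index_file, query, model_id, k=10):
    db = sqlite3.connect(index_file)
    rows = db.execute(
        "SELECT id, embedding FROM embedding WHERE model_id = ?", (model_id,)
    ).fetchall()

    ids = [r[0] for r in rows]
    vecs = np.stack([np.frombuffer(r[1], dtype=np.float32) for r in rows])

    # Cosine similarity against every stored embedding: one pass over n rows, hence O(n).
    sims = (vecs @ query) / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(query))
    return [(ids[i], float(sims[i])) for i in np.argsort(-sims)[:k]]
```

An Elasticsearch 8.x backend can answer the same kind of query with approximate KNN instead of a linear scan, which is what the *O(log n)* entry in the table refers to.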
# Tagging
### Manual tagging
@@ -200,42 +230,3 @@ See [Automatic tagging](#automatic-tagging) for information about tag
### Automatic tagging
See [scripting](scripting.md) documentation.
# Sidecar files
When scanning, sist2 will read metadata from `.s2meta` JSON files and overwrite the
original document's indexed metadata (does not modify the actual file). Sidecar metadata files will also work inside archives.
Sidecar files themselves are not saved in the index.
This feature is useful to leverage third-party applications such as speech-to-text or
OCR to add additional metadata to a file.
**Example**
```
~/Documents/
├── Video.mp4
└── Video.mp4.s2meta
```
The sidecar file must have exactly the same file path and the `.s2meta` suffix.
`Video.mp4.s2meta`:
```json
{
"content": "This sidecar file will overwrite some metadata fields of Video.mp4",
"author": "Some author",
"duration": 12345,
"bitrate": 67890,
"some_arbitrary_field": [1,2,3]
}
```
```
sist2 scan ~/Documents -o ./docs.sist2
sist2 index ./docs.sist2
```
*NOTE*: It is technically possible to overwrite the `tag` value using sidecar files, however,
it is not currently possible to restore both manual tags and sidecar tags without user scripts
while reindexing.

docs/embeddings-1.png: new binary file (image, 90 KiB; not shown)

docs/embeddings-2.png: new binary file (image, 996 KiB; not shown)

Unnamed binary image file (78 KiB; not shown)

View File

@@ -1,131 +0,0 @@
import sqlite3
import orjson as json
import os
import string
from hashlib import md5
import random
from tqdm import tqdm
schema = """
CREATE TABLE thumbnail (
id TEXT NOT NULL CHECK (
length(id) = 32
),
num INTEGER NOT NULL,
data BLOB NOT NULL,
PRIMARY KEY(id, num)
) WITHOUT ROWID;
CREATE TABLE version (
id INTEGER PRIMARY KEY AUTOINCREMENT,
date TEXT NOT NULL DEFAULT (CURRENT_TIMESTAMP)
);
CREATE TABLE document (
id TEXT PRIMARY KEY NOT NULL CHECK (
length(id) = 32
),
marked INTEGER NOT NULL DEFAULT (1),
version INTEGER NOT NULL REFERENCES version(id),
mtime INTEGER NOT NULL,
size INTEGER NOT NULL,
json_data TEXT NOT NULL CHECK (
json_valid(json_data)
)
);
CREATE TABLE delete_list (
id TEXT PRIMARY KEY CHECK (
length(id) = 32
)
) WITHOUT ROWID;
CREATE TABLE tag (
id TEXT NOT NULL,
tag TEXT NOT NULL,
PRIMARY KEY (id, tag)
);
CREATE TABLE document_sidecar (
id TEXT PRIMARY KEY NOT NULL, json_data TEXT NOT NULL
) WITHOUT ROWID;
CREATE TABLE descriptor (
id TEXT NOT NULL, version_major INTEGER NOT NULL,
version_minor INTEGER NOT NULL, version_patch INTEGER NOT NULL,
root TEXT NOT NULL, name TEXT NOT NULL,
rewrite_url TEXT, timestamp INTEGER NOT NULL
);
CREATE TABLE stats_treemap (
path TEXT NOT NULL, size INTEGER NOT NULL
);
CREATE TABLE stats_size_agg (
bucket INTEGER NOT NULL, count INTEGER NOT NULL
);
CREATE TABLE stats_date_agg (
bucket INTEGER NOT NULL, count INTEGER NOT NULL
);
CREATE TABLE stats_mime_agg (
mime TEXT NOT NULL, size INTEGER NOT NULL,
count INTEGER NOT NULL
);
CREATE TABLE embedding (
id TEXT REFERENCES document(id),
model_id INTEGER NOT NULL references model(id),
start INTEGER NOT NULL,
end INTEGER,
embedding BLOB NOT NULL,
PRIMARY KEY (id, model_id, start)
);
CREATE TABLE model (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL UNIQUE CHECK (
length(name) < 16
),
url TEXT,
path TEXT NOT NULL UNIQUE,
size INTEGER NOT NULL,
type TEXT NOT NULL CHECK (
type IN ('flat', 'nested')
)
);
"""
content = "".join(random.choices(string.ascii_letters, k=500))
def gen_document():
return [
md5(random.randbytes(8)).hexdigest(),
json.dumps({
"content": content,
"mime": "image/jpeg",
"extension": "jpeg",
"name": "test",
"path": "",
})
]
if __name__ == "__main__":
DB_NAME = "big_index.sist2"
SIZE = 30_000_000
os.remove(DB_NAME)
db = sqlite3.connect(DB_NAME)
db.executescript(schema)
db.executescript("""
PRAGMA journal_mode = OFF;
PRAGMA synchronous = 0;
""")
for _ in tqdm(range(SIZE), total=SIZE):
db.execute(
"INSERT INTO document (id, version, mtime, size, json_data) VALUES (?, 1, 1000000, 10000, ?)",
gen_document()
)
# 1. Enable rowid from document
# 2. CREATE TABLE marked (
# id INTEGER PRIMARY KEY,
# marked int
# );
# 3. Set FK for document_sidecar, embedding, tag, thumbnail
# 4. Toggle FK if debug
db.commit()

View File

@@ -449,5 +449,4 @@ image/x-sigma-x3f, xf3
image/x-sony-arw, arw
image/x-sony-sr2, sr2
image/x-sony-srf, srf
image/x-epson-erf, erf
sist2/sidecar, s2meta
image/x-epson-erf, erf

View File

@@ -3,6 +3,7 @@ import zlib
mimes = {}
noparse = set()
ext_in_hash = set()
mime_ids = {}
major_mime = {
"sist2": 0,
@@ -102,6 +103,9 @@ cnt = 1
def mime_id(mime):
if mime in mime_ids:
return mime_ids[mime]
global cnt
major = mime.split("/")[0]
mime_id = str((major_mime[major] << 16) + cnt)
@@ -127,9 +131,7 @@ def mime_id(mime):
elif mime == "application/x-empty":
cnt -= 1
return "1"
elif mime == "sist2/sidecar":
cnt -= 1
return "2"
mime_ids[mime] = mime_id
return mime_id
@@ -197,4 +199,12 @@ with open("scripts/mime.csv") as f:
print(f"case {crc(mime)}: return {clean(mime)};")
print("default: return 0;}}")
# mime list
mime_list = ",".join(mime_id(x) for x in mimes.keys()) + ",0"
print(f"unsigned int mime_ids[] = {{{mime_list}}};")
print("unsigned int* get_mime_ids() { return mime_ids; }")
print("#endif")

File diff suppressed because it is too large.

View File

@@ -8,7 +8,7 @@
"watch": "vue-cli-service build --watch"
},
"dependencies": {
"axios": "^0.27.2",
"axios": "^1.6.0",
"bootstrap-vue": "^2.21.2",
"core-js": "^3.6.5",
"moment": "^2.29.3",

View File

@@ -89,9 +89,12 @@ class Sist2AdminApi {
/**
* @param {string} name
* @param {bool} full
*/
runJob(name) {
return axios.get(`${this.baseUrl}/api/job/${name}/run`);
runJob(name, full) {
return axios.get(`${this.baseUrl}/api/job/${name}/run`, {
params: {full}
});
}
/**

View File

@@ -95,6 +95,7 @@ export default {
methods: {
onOcrLangChange() {
this.options.ocr_lang = this.selectedOcrLangs.join("+");
this.update();
},
update() {
this.disableOcrLang = this.options.ocr_images === false && this.options.ocr_ebooks === false;

View File

@@ -1,59 +1,70 @@
<template>
<div>
<h4>{{ $t("webOptions.title") }}</h4>
<b-card>
<label>{{ $t("webOptions.lang") }}</label>
<b-form-select v-model="options.lang" :options="['en', 'fr', 'zh-CN', 'pl', 'de']"
@change="update()"></b-form-select>
<div>
<h4>{{ $t("webOptions.title") }}</h4>
<b-card>
<label>{{ $t("webOptions.lang") }}</label>
<b-form-select v-model="options.lang" :options="['en', 'fr', 'zh-CN', 'pl', 'de']"
@change="update()"></b-form-select>
<label>{{ $t("webOptions.bind") }}</label>
<b-form-input v-model="options.bind" @change="update()"></b-form-input>
<label>{{ $t("webOptions.bind") }}</label>
<b-form-input v-model="options.bind" @change="update()"></b-form-input>
<label>{{ $t("webOptions.tagline") }}</label>
<b-form-textarea v-model="options.tagline" @change="update()"></b-form-textarea>
<label>{{ $t("webOptions.tagline") }}</label>
<b-form-textarea v-model="options.tagline" @change="update()"></b-form-textarea>
<label>{{ $t("webOptions.auth") }}</label>
<b-form-input v-model="options.auth" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth") }}</label>
<b-form-input v-model="options.auth" @change="update()"></b-form-input>
<label>{{ $t("webOptions.tagAuth") }}</label>
<b-form-input v-model="options.tag_auth" @change="update()"></b-form-input>
</b-card>
<label>{{ $t("webOptions.tagAuth") }}</label>
<b-form-input v-model="options.tag_auth" @change="update()" :disabled="Boolean(options.auth)"></b-form-input>
<br>
<h4>Auth0 options</h4>
<b-card>
<label>{{ $t("webOptions.auth0Audience") }}</label>
<b-form-input v-model="options.auth0_audience" @change="update()"></b-form-input>
<b-form-checkbox v-model="options.verbose" @change="update()">
{{$t("webOptions.verbose")}}
</b-form-checkbox>
</b-card>
<label>{{ $t("webOptions.auth0Domain") }}</label>
<b-form-input v-model="options.auth0_domain" @change="update()"></b-form-input>
<br>
<h4>Auth0 options</h4>
<b-card>
<label>{{ $t("webOptions.auth0Audience") }}</label>
<b-form-input v-model="options.auth0_audience" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0ClientId") }}</label>
<b-form-input v-model="options.auth0_client_id" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0Domain") }}</label>
<b-form-input v-model="options.auth0_domain" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0PublicKey") }}</label>
<b-textarea rows="10" v-model="options.auth0_public_key" @change="update()"></b-textarea>
</b-card>
</div>
<label>{{ $t("webOptions.auth0ClientId") }}</label>
<b-form-input v-model="options.auth0_client_id" @change="update()"></b-form-input>
<label>{{ $t("webOptions.auth0PublicKey") }}</label>
<b-textarea rows="10" v-model="options.auth0_public_key" @change="update()"></b-textarea>
</b-card>
</div>
</template>
<script>
export default {
name: "WebOptions",
props: ["options", "frontendName"],
data() {
return {
showEsTestAlert: false,
esTestOk: false,
esTestMessage: "",
}
},
methods: {
update() {
this.$emit("change", this.options);
},
name: "WebOptions",
props: ["options", "frontendName"],
data() {
return {
showEsTestAlert: false,
esTestOk: false,
esTestMessage: ""
}
},
methods: {
update() {
console.log(this.options)
if (this.options.auth && this.options.tag_auth) {
// If both are set, remove tagAuth
this.options.tag_auth = "";
}
this.$emit("change", this.options);
},
}
}
</script>

View File

@@ -8,6 +8,7 @@ export default {
view: "View",
delete: "Delete",
runNow: "Index now",
runNowFull: "Full re-index",
create: "Create",
cancel: "Cancel",
test: "Test",
@@ -64,6 +65,9 @@ export default {
gitRepository: "Git repository URL",
extraArgs: "Extra command line arguments",
couldNotStartFrontend: "Could not start frontend",
couldNotStartFrontendBody: "Unable to start the frontend, check server logs for more details.",
selectJobs: "Available jobs",
selectJob: "Select a job",
webOptions: {
@@ -77,6 +81,7 @@ export default {
auth0Domain: "Auth0 domain",
auth0ClientId: "Auth0 client ID",
auth0PublicKey: "Auth0 public key",
verbose: "Verbose logs"
},
backendOptions: {
title: "Search backend options",

View File

@@ -1,63 +1,63 @@
<template>
<b-card>
<b-card-title>
{{ name }}
<small style="vertical-align: top">
<b-badge v-if="!loading && frontend.running" variant="success">{{ $t("online") }}</b-badge>
<b-badge v-else-if="!loading" variant="secondary">{{ $t("offline") }}</b-badge>
</small>
</b-card-title>
<b-card>
<b-card-title>
{{ name }}
<small style="vertical-align: top">
<b-badge v-if="!loading && frontend.running" variant="success">{{ $t("online") }}</b-badge>
<b-badge v-else-if="!loading" variant="secondary">{{ $t("offline") }}</b-badge>
</small>
</b-card-title>
<!-- Action buttons-->
<div class="mb-3" v-if="!loading">
<b-button class="mr-1" :disabled="frontend.running || !valid" variant="success" @click="start()">{{
$t("start")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="danger" @click="stop()">{{
$t("stop")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="primary" :href="frontendUrl" target="_blank">
{{ $t("go") }}
</b-button>
<b-button variant="danger" @click="deleteFrontend()">{{ $t("delete") }}</b-button>
</div>
<!-- Action buttons-->
<div class="mb-3" v-if="!loading">
<b-button class="mr-1" :disabled="frontend.running || !valid" variant="success" @click="start()">{{
$t("start")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="danger" @click="stop()">{{
$t("stop")
}}
</b-button>
<b-button class="mr-1" :disabled="!frontend.running" variant="primary" :href="frontendUrl" target="_blank">
{{ $t("go") }}
</b-button>
<b-button variant="danger" @click="deleteFrontend()">{{ $t("delete") }}</b-button>
</div>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card-body v-else>
<b-progress v-if="loading" striped animated value="100"></b-progress>
<b-card-body v-else>
<h4>{{ $t("backendOptions.title") }}</h4>
<b-card>
<b-alert v-if="!valid" variant="warning" show>{{ $t("frontendOptions.noJobSelectedWarning") }}</b-alert>
<h4>{{ $t("backendOptions.title") }}</h4>
<b-card>
<b-alert v-if="!valid" variant="warning" show>{{ $t("frontendOptions.noJobSelectedWarning") }}</b-alert>
<SearchBackendSelect :value="frontend.web_options.search_backend"
@change="onBackendSelect($event)"></SearchBackendSelect>
<SearchBackendSelect :value="frontend.web_options.search_backend"
@change="onBackendSelect($event)"></SearchBackendSelect>
<br>
<JobCheckboxGroup :frontend="frontend" @input="update()"></JobCheckboxGroup>
</b-card>
<br>
<JobCheckboxGroup :frontend="frontend" @input="update()"></JobCheckboxGroup>
</b-card>
<br/>
<br/>
<WebOptions :options="frontend.web_options" :frontend-name="$route.params.name"
@change="update()"></WebOptions>
<br/>
<WebOptions :options="frontend.web_options" :frontend-name="$route.params.name"
@change="update()"></WebOptions>
<br/>
<h4>{{ $t("frontendOptions.title") }}</h4>
<b-card>
<b-form-checkbox v-model="frontend.auto_start" @change="update()">
{{ $t("autoStart") }}
</b-form-checkbox>
<h4>{{ $t("frontendOptions.title") }}</h4>
<b-card>
<b-form-checkbox v-model="frontend.auto_start" @change="update()">
{{ $t("autoStart") }}
</b-form-checkbox>
<label>{{ $t("extraQueryArgs") }}</label>
<b-form-input v-model="frontend.extra_query_args" @change="update()"></b-form-input>
<label>{{ $t("extraQueryArgs") }}</label>
<b-form-input v-model="frontend.extra_query_args" @change="update()"></b-form-input>
<label>{{ $t("customUrl") }}</label>
<b-form-input v-model="frontend.custom_url" @change="update()" placeholder="http://"></b-form-input>
</b-card>
</b-card-body>
</b-card>
<label>{{ $t("customUrl") }}</label>
<b-form-input v-model="frontend.custom_url" @change="update()" placeholder="http://"></b-form-input>
</b-card>
</b-card-body>
</b-card>
</template>
<script>
@@ -68,71 +68,78 @@ import WebOptions from "@/components/WebOptions";
import SearchBackendSelect from "@/components/SearchBackendSelect.vue";
export default {
name: 'Frontend',
components: {SearchBackendSelect, JobCheckboxGroup, WebOptions},
data() {
return {
loading: true,
frontend: null,
}
},
computed: {
valid() {
return !this.loading && this.frontend.jobs.length > 0;
},
frontendUrl() {
if (this.frontend.custom_url) {
return this.frontend.custom_url + this.args;
}
if (this.frontend.web_options.bind.startsWith("0.0.0.0")) {
return window.location.protocol + "//" + window.location.hostname + ":" + this.port + this.args;
}
return window.location.protocol + "//" + this.frontend.web_options.bind + this.args;
},
name() {
return this.$route.params.name;
},
port() {
return this.frontend.web_options.bind.split(":")[1]
},
args() {
const args = this.frontend.extra_query_args;
if (args !== "") {
return "#" + (args.startsWith("?") ? (args) : ("?" + args));
}
return "";
}
},
mounted() {
Sist2AdminApi.getFrontend(this.name).then(resp => {
this.frontend = resp.data;
this.loading = false;
});
},
methods: {
start() {
this.frontend.running = true;
Sist2AdminApi.startFrontend(this.name)
},
stop() {
this.frontend.running = false;
Sist2AdminApi.stopFrontend(this.name)
},
deleteFrontend() {
Sist2AdminApi.deleteFrontend(this.name).then(() => {
this.$router.push("/");
});
},
update() {
Sist2AdminApi.updateFrontend(this.name, this.frontend);
},
onBackendSelect(backend) {
this.frontend.web_options.search_backend = backend;
this.frontend.jobs = [];
this.update();
}
name: 'Frontend',
components: {SearchBackendSelect, JobCheckboxGroup, WebOptions},
data() {
return {
loading: true,
frontend: null,
}
},
computed: {
valid() {
return !this.loading && this.frontend.jobs.length > 0;
},
frontendUrl() {
if (this.frontend.custom_url) {
return this.frontend.custom_url + this.args;
}
if (this.frontend.web_options.bind.startsWith("0.0.0.0")) {
return window.location.protocol + "//" + window.location.hostname + ":" + this.port + this.args;
}
return window.location.protocol + "//" + this.frontend.web_options.bind + this.args;
},
name() {
return this.$route.params.name;
},
port() {
return this.frontend.web_options.bind.split(":")[1]
},
args() {
const args = this.frontend.extra_query_args;
if (args !== "") {
return "#" + (args.startsWith("?") ? (args) : ("?" + args));
}
return "";
}
},
mounted() {
Sist2AdminApi.getFrontend(this.name).then(resp => {
this.frontend = resp.data;
this.loading = false;
});
},
methods: {
start() {
Sist2AdminApi.startFrontend(this.name).then(() => {
this.frontend.running = true;
}).catch(() => {
this.$bvToast.toast(this.$t("couldNotStartFrontendBody"), {
title: this.$t("couldNotStartFrontend"),
variant: "danger",
toaster: "b-toaster-bottom-right"
});
});
},
stop() {
this.frontend.running = false;
Sist2AdminApi.stopFrontend(this.name)
},
deleteFrontend() {
Sist2AdminApi.deleteFrontend(this.name).then(() => {
this.$router.push("/");
});
},
update() {
Sist2AdminApi.updateFrontend(this.name, this.frontend);
},
onBackendSelect(backend) {
this.frontend.web_options.search_backend = backend;
this.frontend.jobs = [];
this.update();
}
}
}
</script>

View File

@@ -6,7 +6,19 @@
</b-card-title>
<div class="mb-3">
<b-button class="mr-1" variant="primary" @click="runJob()" :disabled="!valid">{{ $t("runNow") }}</b-button>
<b-dropdown
split
split-variant="primary"
variant="primary"
:text="$t('runNow')"
class="mr-1"
:disabled="!valid"
@click="runJob()"
>
<b-dropdown-item href="#" @click="runJob(true)">{{ $t("runNowFull") }}</b-dropdown-item>
</b-dropdown>
<b-button variant="danger" @click="deleteJob()">{{ $t("delete") }}</b-button>
</div>
@@ -69,6 +81,7 @@ export default {
return {
loading: true,
job: null,
console: console
}
},
methods: {
@@ -78,8 +91,8 @@ export default {
update() {
Sist2AdminApi.updateJob(this.getName(), this.job);
},
runJob() {
Sist2AdminApi.runJob(this.getName()).then(() => {
runJob(full = false) {
Sist2AdminApi.runJob(this.getName(), full).then(() => {
this.$bvToast.toast(this.$t("runJobConfirmation"), {
title: this.$t("runJobConfirmationTitle"),
variant: "success",

View File

@@ -170,6 +170,6 @@ span.ADMIN {
margin: 3px;
white-space: pre;
color: #000;
overflow: hidden;
overflow-y: hidden;
}
</style>
</style>

View File

@@ -81,7 +81,7 @@ function humanDuration(sec_num) {
return `${seconds}s`;
}
return "<0s";
return "<1s";
}
export default {
@@ -134,7 +134,7 @@ export default {
duration: this.taskDuration(row),
time: moment.utc(row.started).local().format("dd, MMM Do YYYY, HH:mm:ss"),
logs: null,
status: [0,1].includes(row.return_code) ? "ok" : "failed",
status: row.return_code === 0 ? "ok" : "failed",
_row: row
}));
});

File diff suppressed because it is too large.

View File

@@ -2,6 +2,7 @@ import asyncio
import os
import signal
from datetime import datetime
from time import sleep
from urllib.parse import urlparse
import requests
@@ -25,6 +26,7 @@ from state import migrate_v1_to_v2, RUNNING_FRONTENDS, TESSERACT_LANGS, DB_SCHEM
get_log_files_to_remove, delete_log_file, create_default_search_backends
from web import Sist2Frontend
from script import UserScript, SCRIPT_TEMPLATES
from util import tail_sync, pid_is_running
sist2 = Sist2(SIST2_BINARY, DATA_FOLDER)
db = PersistentState(dbfile=os.path.join(DATA_FOLDER, "state.db"))
@@ -169,11 +171,14 @@ def _run_job(job: Sist2Job):
@app.get("/api/job/{name:str}/run")
async def run_job(name: str):
job = db["jobs"][name]
async def run_job(name: str, full: bool = False):
job: Sist2Job = db["jobs"][name]
if not job:
raise HTTPException(status_code=404)
if full:
job.do_full_scan = True
_run_job(job)
return "ok"
@@ -222,7 +227,10 @@ async def delete_job(name: str):
@app.delete("/api/frontend/{name:str}")
async def delete_frontend(name: str):
if name in RUNNING_FRONTENDS:
os.kill(RUNNING_FRONTENDS[name], signal.SIGTERM)
try:
os.kill(RUNNING_FRONTENDS[name], signal.SIGTERM)
except ProcessLookupError:
pass
del RUNNING_FRONTENDS[name]
frontend = db["frontends"][name]
@@ -318,7 +326,18 @@ def start_frontend_(frontend: Sist2Frontend):
logger.debug(f"Fetched search backend options for {backend_name}")
pid = sist2.web(frontend.web_options, search_backend, frontend.name)
sleep(0.2)
if not pid_is_running(pid):
frontend_log = frontend.get_log_path(LOG_FOLDER)
logger.error(f"Frontend exited too quickly, check {frontend_log} for more details:")
for line in tail_sync(frontend.get_log_path(LOG_FOLDER), 3):
logger.error(line.strip())
return False
RUNNING_FRONTENDS[frontend.name] = pid
return True
@app.post("/api/frontend/{name:str}/start")
@@ -327,7 +346,12 @@ async def start_frontend(name: str):
if not frontend:
raise HTTPException(status_code=404)
start_frontend_(frontend)
ok = start_frontend_(frontend)
if not ok:
raise HTTPException(status_code=500)
return "ok"
@app.post("/api/frontend/{name:str}/stop")

View File

@@ -120,6 +120,10 @@ class Sist2Task:
logger.info(f"Started task {self.display_name}")
def set_pid(self, pid):
self.pid = pid
class Sist2ScanTask(Sist2Task):
@@ -133,13 +137,10 @@ class Sist2ScanTask(Sist2Task):
else:
self.job.scan_options.output = None
def set_pid(pid):
self.pid = pid
return_code = sist2.scan(self.job.scan_options, logs_cb=self.log_callback, set_pid_cb=set_pid)
return_code = sist2.scan(self.job.scan_options, logs_cb=self.log_callback, set_pid_cb=self.set_pid)
self.ended = datetime.utcnow()
is_ok = return_code in (0, 1)
is_ok = (return_code in (0, 1)) if "debug" in sist2.bin_path else (return_code == 0)
if not is_ok:
self._logger.error(json.dumps({"sist2-admin": f"Process returned non-zero exit code ({return_code})"}))
@@ -165,6 +166,9 @@ class Sist2ScanTask(Sist2Task):
self.job.previous_index_path = self.job.index_path
db["jobs"][self.job.name] = self.job
if is_ok:
return 0
return return_code
@@ -185,7 +189,7 @@ class Sist2IndexTask(Sist2Task):
logger.debug(f"Fetched search backend options for {self.job.index_options.search_backend}")
return_code = sist2.index(self.job.index_options, search_backend, logs_cb=self.log_callback)
return_code = sist2.index(self.job.index_options, search_backend, logs_cb=self.log_callback, set_pid_cb=self.set_pid)
self.ended = datetime.utcnow()
duration = self.ended - self.started
@@ -200,7 +204,7 @@ class Sist2IndexTask(Sist2Task):
self.job.previous_index_path = self.job.index_path
db["jobs"][self.job.name] = self.job
self._logger.info(json.dumps({"sist2-admin": f"Sist2Scan task finished {return_code=}, {duration=}"}))
self._logger.info(json.dumps({"sist2-admin": f"Sist2Scan task finished {return_code=}, {duration=}, {ok=}"}))
logger.info(f"Completed {self.display_name} ({return_code=})")
@@ -249,7 +253,7 @@ class Sist2UserScriptTask(Sist2Task):
super().run(sist2, db)
try:
self.user_script.setup(self.log_callback)
self.user_script.setup(self.log_callback, self.set_pid)
except Exception as e:
logger.error(f"Setup for {self.user_script.name} failed: ")
logger.exception(e)
@@ -269,7 +273,7 @@ class Sist2UserScriptTask(Sist2Task):
self.log_callback({"sist2-admin": f"Starting user script with {executable=}, {index_path=}, {extra_args=}"})
proc = Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=self.user_script.script_dir())
self.pid = proc.pid
self.set_pid(proc.pid)
t_stderr = Thread(target=self._consume_logs, args=(self.log_callback, proc, "stderr", False))
t_stderr.start()
@@ -316,7 +320,7 @@ class TaskQueue:
def _tasks_failed(self):
done = set()
for row in self._db["task_done"].sql("WHERE return_code NOT IN (0,1)"):
for row in self._db["task_done"].sql("WHERE return_code != 0"):
done.add(uuid.UUID(row["id"]))
return done

View File

@@ -20,7 +20,7 @@ def set_executable(file):
os.chmod(file, os.stat(file).st_mode | stat.S_IEXEC)
def _initialize_git_repository(url, path, log_cb, force_clone):
def _initialize_git_repository(url, path, log_cb, force_clone, set_pid_cb):
log_cb({"sist2-admin": f"Cloning {url}"})
if force_clone or not os.path.exists(os.path.join(path, ".git")):
@@ -36,14 +36,18 @@ def _initialize_git_repository(url, path, log_cb, force_clone):
log_cb({"sist2-admin": f"Executing setup script {setup_script}"})
set_executable(setup_script)
result = subprocess.run([setup_script], cwd=path, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in result.stdout.split(b"\n"):
proc = subprocess.Popen([setup_script], cwd=path, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
set_pid_cb(proc.pid)
proc.wait()
stdout = proc.stdout.read()
for line in stdout.split(b"\n"):
if line:
log_cb({"stdout": line.decode()})
log_cb({"stdout": f"Executed setup script {setup_script}, return code = {result.returncode}"})
log_cb({"stdout": f"Executed setup script {setup_script}, return code = {proc.returncode}"})
if result.returncode != 0:
if proc.returncode != 0:
raise Exception("Error when running setup script!")
log_cb({"sist2-admin": f"Initialized git repository in {path}"})
@@ -60,11 +64,11 @@ class UserScript(BaseModel):
def script_dir(self):
return os.path.join(SCRIPT_FOLDER, self.name)
def setup(self, log_cb):
def setup(self, log_cb, set_pid_cb):
os.makedirs(self.script_dir(), exist_ok=True)
if self.type == ScriptType.GIT:
_initialize_git_repository(self.git_repository, self.script_dir(), log_cb, self.force_clone)
_initialize_git_repository(self.git_repository, self.script_dir(), log_cb, self.force_clone, set_pid_cb)
self.force_clone = False
elif self.type == ScriptType.SIMPLE:
self._setup_simple()

View File

@@ -2,10 +2,11 @@ import datetime
import json
import logging
import os.path
import sys
from datetime import datetime
from enum import Enum
from io import TextIOWrapper
from logging import FileHandler
from logging import FileHandler, StreamHandler
from subprocess import Popen, PIPE
from tempfile import NamedTemporaryFile
from threading import Thread
@@ -200,6 +201,7 @@ class WebOptions(BaseModel):
auth0_client_id: str = None
auth0_public_key: str = None
auth0_public_key_file: str = None
verbose: bool = False
def __init__(self, **kwargs):
super().__init__(**kwargs)
@@ -231,6 +233,8 @@ class WebOptions(BaseModel):
args.append(f"--tag-auth={self.tag_auth}")
if self.dev:
args.append(f"--dev")
if self.verbose:
args.append(f"--very-verbose")
args.extend(self.indices)
@@ -243,7 +247,7 @@ class Sist2:
self.bin_path = bin_path
self._data_dir = data_directory
def index(self, options: IndexOptions, search_backend: Sist2SearchBackend, logs_cb):
def index(self, options: IndexOptions, search_backend: Sist2SearchBackend, logs_cb, set_pid_cb):
args = [
self.bin_path,
@@ -255,7 +259,9 @@ class Sist2:
logs_cb({"sist2-admin": f"Starting sist2 command with args {args}"})
proc = Popen(args, stdout=PIPE, stderr=PIPE)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, proc))
set_pid_cb(proc.pid)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, None, proc))
t_stderr.start()
self._consume_logs_stdout(logs_cb, proc)
@@ -282,7 +288,7 @@ class Sist2:
set_pid_cb(proc.pid)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, proc))
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, None, proc))
t_stderr.start()
self._consume_logs_stdout(logs_cb, proc)
@@ -292,7 +298,7 @@ class Sist2:
return proc.returncode
@staticmethod
def _consume_logs_stderr(logs_cb, proc):
def _consume_logs_stderr(logs_cb, exit_cb, proc):
pipe_wrapper = TextIOWrapper(proc.stderr, encoding="utf8", errors="ignore")
try:
for line in pipe_wrapper:
@@ -300,7 +306,9 @@ class Sist2:
continue
logs_cb({"stderr": line})
finally:
proc.wait()
return_code = proc.wait()
if exit_cb:
exit_cb(return_code)
pipe_wrapper.close()
@staticmethod
@@ -334,15 +342,19 @@ class Sist2:
web_logger = logging.Logger(name=f"sist2-frontend-{name}")
web_logger.addHandler(FileHandler(os.path.join(LOG_FOLDER, f"frontend-{name}.log")))
web_logger.addHandler(StreamHandler())
def logs_cb(message):
web_logger.info(json.dumps(message))
def exit_cb(return_code):
logger.info(f"Web frontend exited with return code {return_code}")
logger.info(f"Starting frontend {' '.join(args)}")
proc = Popen(args, stdout=PIPE, stderr=PIPE)
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, proc))
t_stderr = Thread(target=self._consume_logs_stderr, args=(logs_cb, exit_cb, proc))
t_stderr.start()
t_stdout = Thread(target=self._consume_logs_stdout, args=(logs_cb, proc))

View File

@@ -0,0 +1,41 @@
from glob import glob
import os
from config import DATA_FOLDER
def get_old_index_files(name):
    files = glob(os.path.join(DATA_FOLDER, f"scan-{name.replace('/', '_')}-*.sist2"))
    files = list(sorted(files, key=lambda f: os.stat(f).st_mtime))
    files = files[-1:]

    return files


def tail_sync(filename, lines=1, _buffer=4098):
    # Read the last `lines` lines of a file by seeking backwards in fixed-size blocks.
    with open(filename) as f:
        lines_found = []
        block_counter = -1

        while len(lines_found) < lines:
            try:
                f.seek(block_counter * _buffer, os.SEEK_END)
            except IOError:
                # File is smaller than one block; read it whole.
                f.seek(0)
                lines_found = f.readlines()
                break

            lines_found = f.readlines()
            block_counter -= 1

    return lines_found[-lines:]


def pid_is_running(pid):
    # Signal 0 does no work; os.kill raises OSError if the PID does not exist.
    try:
        os.kill(pid, 0)
    except OSError:
        return False

    return True
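
A minimal usage sketch of the two helpers above, mirroring how `app.py` calls them in this changeset when a frontend exits right after starting; the PID and log path here are made-up examples.

```python
# Hypothetical values for illustration; see start_frontend_() in app.py for the real call sites.
pid = 12345
if not pid_is_running(pid):
    for line in tail_sync("/sist2-admin/log/frontend-demo.log", lines=3):
        print(line.strip())
```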

Binary file not shown.

sist2-vue/package-lock.json (generated, 15383 changes)

File diff suppressed because it is too large.

View File

@@ -9,7 +9,7 @@
"dependencies": {
"@auth0/auth0-spa-js": "^2.0.2",
"@egjs/vue-infinitegrid": "3.3.0",
"axios": "^0.25.0",
"axios": "^1.6.0",
"bootstrap-vue": "^2.21.2",
"core-js": "^3.6.5",
"d3": "^5.6.1",
@@ -17,7 +17,7 @@
"dom-to-image": "^2.6.0",
"fslightbox-vue": "fslightbox-vue.tgz",
"nouislider": "^15.2.0",
"onnxruntime-web": "^1.15.1",
"onnxruntime-web": "1.15.1",
"underscore": "^1.13.1",
"vue": "^2.6.12",
"vue-color": "^2.8.1",
@@ -32,7 +32,6 @@
"@types/underscore": "^1.11.6",
"@vue/cli-plugin-babel": "~5.0.8",
"@vue/cli-plugin-router": "~5.0.8",
"@vue/cli-plugin-typescript": "^5.0.8",
"@vue/cli-plugin-vuex": "~5.0.8",
"@vue/cli-service": "^5.0.8",
"@vue/test-utils": "^1.0.3",
@@ -44,7 +43,6 @@
"portal-vue": "^2.1.7",
"sass": "^1.26.11",
"sass-loader": "^10.0.2",
"typescript": "^4.9.5",
"vue-cli-plugin-bootstrap-vue": "~0.8.2",
"vue-template-compiler": "^2.6.11"
},

View File

@@ -1,116 +1,20 @@
import axios from "axios";
import {ext, strUnescape, lum} from "./util";
import {strUnescape, lum, sid} from "./util";
import Sist2Query from "@/Sist2ElasticsearchQuery";
import store from "@/store";
export interface EsTag {
id: string
count: number
color: string | undefined
isLeaf: boolean
}
export interface Tag {
style: string
text: string
rawText: string
fg: string
bg: string
userTag: boolean
}
export interface Index {
name: string
version: string
id: string
idPrefix: string
timestamp: number
models: []
}
export interface EsHit {
_index: string
_id: string
_score: number
_type: string
_tags: Tag[]
_seq: number
_source: {
path: string
size: number
mime: string
name: string
extension: string
index: string
_depth: number
mtime: number
videoc: string
audioc: string
parent: string
width: number
height: number
duration: number
tag: string[]
checksum: string
thumbnail: string
}
_props: {
isSubDocument: boolean
isImage: boolean
isGif: boolean
isVideo: boolean
isPlayableVideo: boolean
isPlayableImage: boolean
isAudio: boolean
hasThumbnail: boolean
hasVidPreview: boolean
imageAspectRatio: number
/** Number of thumbnails available */
tnNum: number
}
highlight: {
name: string[] | undefined,
content: string[] | undefined,
}
}
function getIdPrefix(indices: Index[], id: string): string {
for (let i = 4; i < 32; i++) {
const prefix = id.slice(0, i);
if (indices.filter(idx => idx.id.slice(0, i) == prefix).length == 1) {
return prefix;
}
}
return id;
}
export interface EsResult {
took: number
hits: {
// TODO: ES 6.X ?
total: {
value: number
}
hits: EsHit[]
}
aggregations: any
}
class Sist2Api {
private readonly baseUrl: string
private sist2Info: any
private queryfunc: () => EsResult;
baseUrl;
sist2Info;
queryfunc;
constructor(baseUrl: string) {
constructor(baseUrl) {
this.baseUrl = baseUrl;
}
init(queryFunc: () => EsResult) {
init(queryFunc) {
this.queryfunc = queryFunc;
}
@@ -127,29 +31,16 @@ class Sist2Api {
.filter((v, i, a) => a.findIndex(v2 => (v2.id === v.id)) === i)
}
getSist2Info(): Promise<any> {
getSist2Info() {
return axios.get(`${this.baseUrl}i`).then(resp => {
const indices = resp.data.indices as Index[];
resp.data.indices = indices.map(idx => {
return {
id: idx.id,
name: idx.name,
timestamp: idx.timestamp,
version: idx.version,
models: idx.models,
idPrefix: getIdPrefix(indices, idx.id),
} as Index;
});
this.sist2Info = resp.data;
return resp.data;
})
}
setHitProps(hit: EsHit): void {
hit["_props"] = {} as any;
setHitProps(hit) {
hit["_props"] = {};
const mimeCategory = hit._source.mime == null ? null : hit._source.mime.split("/")[0];
@@ -157,7 +48,7 @@ class Sist2Api {
hit._props.isSubDocument = true;
}
if ("thumbnail" in hit._source) {
if ("thumbnail" in hit._source && hit._source.thumbnail > 0) {
hit._props.hasThumbnail = true;
if (Number.isNaN(Number(hit._source.thumbnail))) {
@@ -213,8 +104,8 @@ class Sist2Api {
}
}
setHitTags(hit: EsHit): void {
const tags = [] as Tag[];
setHitTags(hit) {
const tags = [];
// User tags
if ("tag" in hit._source) {
@@ -226,10 +117,10 @@ class Sist2Api {
hit._tags = tags;
}
createUserTag(tag: string): Tag {
createUserTag(tag) {
const tokens = tag.split(".");
const colorToken = tokens.pop() as string;
const colorToken = tokens.pop();
const bg = colorToken;
const fg = lum(colorToken) > 50 ? "#000" : "#fff";
@@ -241,25 +132,30 @@ class Sist2Api {
text: tokens.join("."),
rawText: tag,
userTag: true,
} as Tag;
};
}
search(): Promise<EsResult> {
if (this.backend() == "sqlite") {
search() {
if (this.backend() === "sqlite") {
return this.ftsQuery(this.queryfunc())
} else {
return this.esQuery(this.queryfunc());
}
}
esQuery(query: any): Promise<EsResult> {
_getIndexRoot(indexId) {
return this.sist2Info.indices.find(idx => idx.id === indexId).root;
}
esQuery(query) {
return axios.post(`${this.baseUrl}es`, query).then(resp => {
const res = resp.data as EsResult;
const res = resp.data;
if (res.hits?.hits) {
res.hits.hits.forEach((hit: EsHit) => {
res.hits.hits.forEach((hit) => {
hit["_source"]["name"] = strUnescape(hit["_source"]["name"]);
hit["_source"]["path"] = strUnescape(hit["_source"]["path"]);
hit["_source"]["indexRoot"] = this._getIndexRoot(hit["_source"]["index"]);
this.setHitProps(hit);
this.setHitTags(hit);
@@ -270,9 +166,9 @@ class Sist2Api {
});
}
ftsQuery(query: any): Promise<EsResult> {
ftsQuery(query) {
return axios.post(`${this.baseUrl}fts/search`, query).then(resp => {
const res = resp.data as any;
const res = resp.data;
if (res.hits.hits) {
res.hits.hits.forEach(hit => {
@@ -293,7 +189,7 @@ class Sist2Api {
});
}
private getMimeTypesEs(query) {
getMimeTypesEs(query) {
const AGGS = {
mimeTypes: {
terms: {
@@ -322,7 +218,7 @@ class Sist2Api {
});
}
private getMimeTypesSqlite(): Promise<[{ mime: string, count: number }]> {
getMimeTypesSqlite() {
return axios.get(`${this.baseUrl}fts/mimetypes`)
.then(resp => {
return resp.data;
@@ -332,15 +228,15 @@ class Sist2Api {
async getMimeTypes(query = undefined) {
let buckets;
if (this.backend() == "sqlite") {
if (this.backend() === "sqlite") {
buckets = await this.getMimeTypesSqlite();
} else {
buckets = await this.getMimeTypesEs(query);
}
const mimeMap: any[] = [];
const mimeMap = [];
buckets.sort((a: any, b: any) => a.mime > b.mime).forEach((bucket: any) => {
buckets.sort((a, b) => a.mime > b.mime).forEach((bucket) => {
const tmp = bucket.mime.split("/");
const category = tmp[0];
const mime = tmp[1];
@@ -374,7 +270,7 @@ class Sist2Api {
return {buckets, mimeMap};
}
_createEsTag(tag: string, count: number): EsTag {
_createEsTag(tag, count) {
const tokens = tag.split(".");
if (/.*\.#[0-9a-fA-F]{6}/.test(tag)) {
@@ -394,7 +290,7 @@ class Sist2Api {
};
}
private getTagsEs() {
getTagsEs() {
return this.esQuery({
aggs: {
tags: {
@@ -407,21 +303,21 @@ class Sist2Api {
size: 0,
}).then(resp => {
return resp["aggregations"]["tags"]["buckets"]
.sort((a: any, b: any) => a["key"].localeCompare(b["key"]))
.map((bucket: any) => this._createEsTag(bucket["key"], bucket["doc_count"]));
.sort((a, b) => a["key"].localeCompare(b["key"]))
.map((bucket) => this._createEsTag(bucket["key"], bucket["doc_count"]));
});
}
private getTagsSqlite() {
getTagsSqlite() {
return axios.get(`${this.baseUrl}/fts/tags`)
.then(resp => {
return resp.data.map(tag => this._createEsTag(tag.tag, tag.count))
});
}
async getTags(): Promise<EsTag[]> {
async getTags() {
let tags;
if (this.backend() == "sqlite") {
if (this.backend() === "sqlite") {
tags = await this.getTagsSqlite();
} else {
tags = await this.getTagsEs();
@@ -430,7 +326,7 @@ class Sist2Api {
// Remove duplicates (same tag with different color)
const seen = new Set();
return tags.filter((t: EsTag) => {
return tags.filter((t) => {
if (seen.has(t.id)) {
return false;
}
@@ -439,31 +335,29 @@ class Sist2Api {
});
}
saveTag(tag: string, hit: EsHit) {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
saveTag(tag, hit) {
return axios.post(`${this.baseUrl}tag/${sid(hit)}`, {
delete: false,
name: tag,
doc_id: hit["_id"]
});
}
deleteTag(tag: string, hit: EsHit) {
return axios.post(`${this.baseUrl}tag/` + hit["_source"]["index"], {
deleteTag(tag, hit) {
return axios.post(`${this.baseUrl}tag/${sid(hit)}`, {
delete: true,
name: tag,
doc_id: hit["_id"]
});
}
searchPaths(indexId, minDepth, maxDepth, prefix = null) {
if (this.backend() == "sqlite") {
if (this.backend() === "sqlite") {
return this.searchPathsSqlite(indexId, minDepth, minDepth, prefix);
} else {
return this.searchPathsEs(indexId, minDepth, maxDepth, prefix);
}
}
private searchPathsSqlite(indexId, minDepth, maxDepth, prefix) {
searchPathsSqlite(indexId, minDepth, maxDepth, prefix) {
return axios.post(`${this.baseUrl}fts/paths`, {
indexId, minDepth, maxDepth, prefix
}).then(resp => {
@@ -471,7 +365,7 @@ class Sist2Api {
});
}
private searchPathsEs(indexId, minDepth, maxDepth, prefix): Promise<[{ path: string, count: number }]> {
searchPathsEs(indexId, minDepth, maxDepth, prefix) {
const query = {
query: {
@@ -516,23 +410,25 @@ class Sist2Api {
});
}
private getDateRangeSqlite() {
getDateRangeSqlite() {
return axios.get(`${this.baseUrl}fts/dateRange`)
.then(resp => ({
min: resp.data.dateMin,
max: resp.data.dateMax,
max: (resp.data.dateMax === resp.data.dateMin)
? resp.data.dateMax + 1
: resp.data.dateMax,
}));
}
getDateRange(): Promise<{ min: number, max: number }> {
if (this.backend() == "sqlite") {
getDateRange() {
if (this.backend() === "sqlite") {
return this.getDateRangeSqlite();
} else {
return this.getDateRangeEs();
}
}
private getDateRangeEs() {
getDateRangeEs() {
return this.esQuery({
// TODO: filter by the currently selected indices
aggs: {
@@ -549,7 +445,7 @@ class Sist2Api {
if (range.min == null) {
range.min = 0;
range.max = 1;
} else if (range.min == range.max) {
} else if (range.min === range.max) {
range.max += 1;
}
@@ -557,7 +453,7 @@ class Sist2Api {
});
}
private getPathSuggestionsSqlite(text: string) {
getPathSuggestionsSqlite(text) {
return axios.post(`${this.baseUrl}fts/paths`, {
prefix: text,
minDepth: 1,
@@ -567,7 +463,7 @@ class Sist2Api {
})
}
private getPathSuggestionsEs(text) {
getPathSuggestionsEs(text) {
return this.esQuery({
suggest: {
path: {
@@ -585,31 +481,31 @@ class Sist2Api {
});
}
getPathSuggestions(text: string): Promise<string[]> {
if (this.backend() == "sqlite") {
getPathSuggestions(text) {
if (this.backend() === "sqlite") {
return this.getPathSuggestionsSqlite(text);
} else {
return this.getPathSuggestionsEs(text)
}
}
getTreemapStat(indexId: string) {
getTreemapStat(indexId) {
return `${this.baseUrl}s/${indexId}/TMAP`;
}
getMimeStat(indexId: string) {
getMimeStat(indexId) {
return `${this.baseUrl}s/${indexId}/MAGG`;
}
getSizeStat(indexId: string) {
getSizeStat(indexId) {
return `${this.baseUrl}s/${indexId}/SAGG`;
}
getDateStat(indexId: string) {
getDateStat(indexId) {
return `${this.baseUrl}s/${indexId}/DAGG`;
}
private getDocumentEs(docId: string, highlight: boolean, fuzzy: boolean) {
getDocumentEs(sid, highlight, fuzzy) {
const query = Sist2Query.searchQuery();
if (highlight) {
@@ -648,7 +544,7 @@ class Sist2Api {
query.query.bool.must = [query.query.bool.must];
}
query.query.bool.must.push({match: {_id: docId}});
query.query.bool.must.push({match: {_id: sid}});
delete query["sort"];
delete query["aggs"];
@@ -669,35 +565,35 @@ class Sist2Api {
});
}
private getDocumentSqlite(docId: string): Promise<EsHit> {
return axios.get(`${this.baseUrl}/fts/d/${docId}`)
getDocumentSqlite(sid) {
return axios.get(`${this.baseUrl}/fts/d/${sid}`)
.then(resp => ({
_source: resp.data
} as EsHit));
}));
}
getDocument(docId: string, highlight: boolean, fuzzy: boolean): Promise<EsHit | null> {
if (this.backend() == "sqlite") {
return this.getDocumentSqlite(docId);
getDocument(sid, highlight, fuzzy) {
if (this.backend() === "sqlite") {
return this.getDocumentSqlite(sid);
} else {
return this.getDocumentEs(docId, highlight, fuzzy);
return this.getDocumentEs(sid, highlight, fuzzy);
}
}
getTagSuggestions(prefix: string): Promise<string[]> {
if (this.backend() == "sqlite") {
getTagSuggestions(prefix) {
if (this.backend() === "sqlite") {
return this.getTagSuggestionsSqlite(prefix);
} else {
return this.getTagSuggestionsEs(prefix);
}
}
private getTagSuggestionsSqlite(prefix): Promise<string[]> {
getTagSuggestionsSqlite(prefix) {
return axios.post(`${this.baseUrl}/fts/suggestTags`, prefix)
.then(resp => (resp.data));
}
private getTagSuggestionsEs(prefix): Promise<string[]> {
getTagSuggestionsEs(prefix) {
return this.esQuery({
suggest: {
tag: {
@@ -723,8 +619,8 @@ class Sist2Api {
});
}
getEmbeddings(indexId, docId, modelId) {
return axios.post(`${this.baseUrl}/e/${indexId}/${docId}/${modelId.toString().padStart(3, '0')}`)
getEmbeddings(sid, modelId) {
return axios.post(`${this.baseUrl}/e/${sid}/${modelId.toString().padStart(3, '0')}`)
.then(resp => (resp.data));
}
}
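A hedged usage sketch of the sid-based endpoints introduced above (assumes the default export of @/Sist2Api is an already-initialized instance and that `hit` is an EsHit from a previous search; the tag string and model id are made up):
import Sist2Api from "@/Sist2Api";
import {sid} from "@/util";

// Both endpoints now take the composite "indexId.docId" sid rather than
// separate index and document ids.
Sist2Api.saveTag("projects.todo.#2196f3", hit);
Sist2Api.getEmbeddings(sid(hit), 1)  // model id 1 is hypothetical
    .then(embeddings => {
        // embeddings payload as returned by the /e endpoint
    });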

View File

@@ -1,5 +1,5 @@
import store from "./store";
import sist2Api, {EsHit, Index} from "@/Sist2Api";
import store from "@/store";
import Sist2Api from "@/Sist2Api";
const SORT_MODES = {
score: {
@@ -7,62 +7,62 @@ const SORT_MODES = {
{_score: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._score
key: (hit) => hit._score
},
random: {
mode: [
{_score: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._score
key: (hit) => hit._score
},
dateAsc: {
mode: [
{mtime: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.mtime
key: (hit) => hit._source.mtime
},
dateDesc: {
mode: [
{mtime: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.mtime
key: (hit) => hit._source.mtime
},
sizeAsc: {
mode: [
{size: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.size
key: (hit) => hit._source.size
},
sizeDesc: {
mode: [
{size: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.size
key: (hit) => hit._source.size
},
nameAsc: {
mode: [
{name: {order: "asc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
key: (hit) => hit._source.name
},
nameDesc: {
mode: [
{name: {order: "desc"}},
{_tie: {order: "asc"}}
],
key: (hit: EsHit) => hit._source.name
key: (hit) => hit._source.name
}
} as any;
};
class Sist2ElasticsearchQuery {
searchQuery(blankSearch: boolean = false): any {
searchQuery(blankSearch = false) {
const getters = store.getters;
@@ -76,7 +76,7 @@ class Sist2ElasticsearchQuery {
const fuzzy = getters.fuzzy;
const size = getters.size;
const after = getters.lastDoc;
const selectedIndexIds = getters.selectedIndices.map((idx: Index) => idx.id)
const selectedIndexIds = getters.selectedIndices.map((idx) => idx.id)
const selectedMimeTypes = getters.selectedMimeTypes;
const selectedTags = getters.selectedTags;
const sortMode = getters.embedding ? "score" : getters.sortMode;
@@ -86,7 +86,7 @@ class Sist2ElasticsearchQuery {
const filters = [
{terms: {index: selectedIndexIds}}
] as any[];
];
const fields = [
"name^8",
@@ -138,7 +138,7 @@ class Sist2ElasticsearchQuery {
if (getters.optTagOrOperator) {
filters.push({terms: {"tag": selectedTags}});
} else {
selectedTags.forEach((tag: string) => filters.push({term: {"tag": tag}}));
selectedTags.forEach((tag) => filters.push({term: {"tag": tag}}));
}
}
}
@@ -173,7 +173,7 @@ class Sist2ElasticsearchQuery {
},
sort: SORT_MODES[sortMode].mode,
size: size,
} as any;
};
if (!after) {
q.aggs = {
@@ -193,7 +193,7 @@ class Sist2ElasticsearchQuery {
if (getters.embedding) {
delete q.query;
const field = "emb." + sist2Api.models().find(m => m.id == getters.embeddingsModel).path;
const field = "emb." + Sist2Api.models().find(m => m.id === getters.embeddingsModel).path;
if (hasKnn) {
// Use knn (8.8+)

View File

@@ -1,5 +1,4 @@
import store from "./store";
import {EsHit, Index} from "@/Sist2Api";
import store from "@/store";
const SORT_MODES = {
score: {
@@ -29,18 +28,12 @@ const SORT_MODES = {
"sort": "name",
"sortAsc": false
}
} as any;
interface SortMode {
text: string
mode: any[]
key: (hit: EsHit) => any
}
};
class Sist2ElasticsearchQuery {
searchQuery(): any {
searchQuery() {
const getters = store.getters;
@@ -52,7 +45,7 @@ class Sist2ElasticsearchQuery {
const dateMax = getters.dateMax;
const size = getters.size;
const after = getters.lastDoc;
const selectedIndexIds = getters.selectedIndices.map((idx: Index) => idx.id)
const selectedIndexIds = getters.selectedIndices.map((idx) => idx.id)
const selectedMimeTypes = getters.selectedMimeTypes;
const selectedTags = getters.selectedTags;
@@ -95,7 +88,7 @@ class Sist2ElasticsearchQuery {
if (selectedTags.length > 0) {
q["tags"] = selectedTags
}
if (getters.sortMode == "random") {
if (getters.sortMode === "random") {
q["seed"] = getters.seed;
}
if (getters.optHighlight) {
@@ -108,11 +101,13 @@ class Sist2ElasticsearchQuery {
q["embedding"] = getters.embedding;
q["sort"] = "embedding";
q["sortAsc"] = false;
} else if (getters.sortMode == "embedding") {
} else if (getters.sortMode === "embedding") {
q["sort"] = "sort"
q["sortAsc"] = true;
}
q["searchInPath"] = getters.optSearchInPath;
return q;
}
}

View File

@@ -9,8 +9,7 @@
<script>
import * as d3 from "d3";
import {burrow} from "@/util-js"
import {humanFileSize} from "@/util";
import {humanFileSize, burrow} from "@/util";
import Sist2Api from "@/Sist2Api";
import domtoimage from "dom-to-image";

View File

@@ -31,8 +31,7 @@
<script>
import noUiSlider from 'nouislider';
import 'nouislider/dist/nouislider.css';
import {humanDate} from "@/util";
import {mergeTooltips} from "@/util-js";
import {humanDate, mergeTooltips} from "@/util";
export default {
name: "DateSlider",

View File

@@ -16,7 +16,7 @@
<!-- Audio player-->
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls
:type="doc._source.mime"
:src="`f/${doc._source.index}/${doc._id}`"
:src="`f/${sid(doc)}`"
@play="onAudioPlay()"></audio>
<b-card-body class="padding-03">
@@ -43,7 +43,7 @@
</template>
<script>
import {ext, humanFileSize, humanTime} from "@/util";
import {ext, humanFileSize, humanTime, sid} from "@/util";
import TagContainer from "@/components/TagContainer.vue";
import DocFileTitle from "@/components/DocFileTitle.vue";
import DocInfoModal from "@/components/DocInfoModal.vue";
@@ -69,13 +69,14 @@ export default {
}
},
methods: {
sid: sid,
humanFileSize: humanFileSize,
humanTime: humanTime,
onInfoClick() {
this.showInfo = true;
},
onEmbeddingClick() {
Sist2Api.getEmbeddings(this.doc._source.index, this.doc._id, this.$store.state.embeddingsModel).then(embeddings => {
Sist2Api.getEmbeddings(sid(this.doc), this.$store.state.embeddingsModel).then(embeddings => {
this.$store.commit("setEmbeddingText", "");
this.$store.commit("setEmbedding", embeddings);
this.$store.commit("setEmbeddingDoc", this.doc);

View File

@@ -1,15 +1,16 @@
<template>
<GridLayout
ref="grid-layout"
:options="gridOptions"
@append="append"
@layout-complete="$emit('layout-complete')"
>
<DocCard v-for="doc in docs" :key="doc._id" :doc="doc" :width="width"></DocCard>
</GridLayout>
<GridLayout
ref="grid-layout"
:options="gridOptions"
@append="append"
@layout-complete="$emit('layout-complete')"
>
<DocCard v-for="doc in docs" :key="sid(doc)" :doc="doc" :width="width"></DocCard>
</GridLayout>
</template>
<script>
import {sid} from "@/util";
import Vue from "vue";
import DocCard from "@/components/DocCard";
@@ -18,50 +19,53 @@ import VueInfiniteGrid from "@egjs/vue-infinitegrid";
Vue.use(VueInfiniteGrid);
export default Vue.extend({
components: {
DocCard,
},
props: ["docs", "append"],
data() {
return {
width: 0,
gridOptions: {
align: "center",
margin: 0,
transitionDuration: 0,
isOverflowScroll: false,
isConstantSize: false,
useFit: false,
// Indicates whether the number of DOM elements is kept constant. If useRecycle is 'true', the number of DOM
// elements stays the same; if it is 'false', the number of DOM elements grows as card elements are added.
useRecycle: false
}
}
},
computed: {
colCount() {
const columns = this.$store.getters["optColumns"];
if (columns === "auto") {
return Math.round(this.$refs["grid-layout"].$el.scrollWidth / 300)
}
return columns;
components: {
DocCard,
},
},
mounted() {
this.width = this.$refs["grid-layout"].$el.scrollWidth / this.colCount;
props: ["docs", "append"],
data() {
return {
width: 0,
gridOptions: {
align: "center",
margin: 0,
transitionDuration: 0,
isOverflowScroll: false,
isConstantSize: false,
useFit: false,
// Indicates whether the number of DOM elements is kept constant. If useRecycle is 'true', the number of DOM
// elements stays the same; if it is 'false', the number of DOM elements grows as card elements are added.
useRecycle: false
}
}
},
methods: {
sid,
},
computed: {
colCount() {
const columns = this.$store.getters["optColumns"];
if (this.colCount === 1) {
this.$refs["grid-layout"].$el.classList.add("grid-single-column");
}
if (columns === "auto") {
return Math.round(this.$refs["grid-layout"].$el.scrollWidth / 300)
}
return columns;
},
},
mounted() {
this.width = this.$refs["grid-layout"].$el.scrollWidth / this.colCount;
this.$store.subscribe((mutation) => {
if (mutation.type === "busUpdateWallItems" && this.$refs["grid-layout"]) {
this.$refs["grid-layout"].updateItems();
}
});
},
if (this.colCount === 1) {
this.$refs["grid-layout"].$el.classList.add("grid-single-column");
}
this.$store.subscribe((mutation) => {
if (mutation.type === "busUpdateWallItems" && this.$refs["grid-layout"]) {
this.$refs["grid-layout"].updateItems();
}
});
},
});
</script>

View File

@@ -1,5 +1,5 @@
<template>
<a :href="`f/${doc._source.index}/${doc._id}`"
<a :href="`f/${sid(doc)}`"
:class="doc._source.embedding ? 'file-title-anchor-with-embedding' : 'file-title-anchor'" target="_blank">
<div class="file-title" :title="doc._source.path + '/' + doc._source.name + ext(doc)"
v-html="fileName() + ext(doc)"></div>
@@ -7,12 +7,13 @@
</template>
<script>
import {ext} from "@/util";
import {ext, sid} from "@/util";
export default {
name: "DocFileTitle",
props: ["doc"],
methods: {
sid: sid,
ext: ext,
fileName() {
if (!this.doc.highlight) {

View File

@@ -1,33 +1,34 @@
<template>
<b-modal :visible="show" size="lg" :hide-footer="true" static lazy @close="$emit('close')" @hide="$emit('close')"
>
<template #modal-title>
<h5 class="modal-title" :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
<router-link :to="`/file?byId=${doc._id}`">#</router-link>
</h5>
</template>
<b-modal :visible="show" size="lg" :hide-footer="true" static lazy @close="$emit('close')" @hide="$emit('close')"
>
<template #modal-title>
<h5 class="modal-title" :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
<router-link :to="`/file?byId=${doc._id}`">#</router-link>
</h5>
</template>
<img v-if="doc._props.hasThumbnail" :src="`t/${doc._source.index}/${doc._id}`" alt="" class="fit card-img-top">
<img v-if="doc._props.hasThumbnail" :src="`t/${sid(doc)}`" alt="" class="fit card-img-top">
<InfoTable :doc="doc"></InfoTable>
<InfoTable :doc="doc"></InfoTable>
<LazyContentDiv :doc-id="doc._id"></LazyContentDiv>
</b-modal>
<LazyContentDiv :sid="sid(doc)"></LazyContentDiv>
</b-modal>
</template>
<script>
import {ext} from "@/util";
import {ext, sid} from "@/util";
import InfoTable from "@/components/InfoTable";
import LazyContentDiv from "@/components/LazyContentDiv";
export default {
name: "DocInfoModal",
components: {LazyContentDiv, InfoTable},
props: ["doc", "show"],
methods: {
ext: ext,
}
name: "DocInfoModal",
components: {LazyContentDiv, InfoTable},
props: ["doc", "show"],
methods: {
ext: ext,
sid: sid
}
}
</script>

View File

@@ -1,45 +1,49 @@
<template>
<b-list-group class="mt-3">
<DocListItem v-for="doc in docs" :key="doc._id" :doc="doc"></DocListItem>
</b-list-group>
<b-list-group class="mt-3">
<DocListItem v-for="doc in docs" :key="sid(doc)" :doc="doc"></DocListItem>
</b-list-group>
</template>
<script lang="ts">
<script>
import {sid} from "@/util";
import DocListItem from "@/components/DocListItem.vue";
import Vue from "vue";
export default Vue.extend({
name: "DocList",
components: {DocListItem},
props: ["docs", "append"],
mounted() {
window.addEventListener("scroll", () => {
const threshold = 400;
const app = document.getElementById("app");
name: "DocList",
components: {DocListItem},
props: ["docs", "append"],
mounted() {
window.addEventListener("scroll", () => {
const threshold = 400;
const app = document.getElementById("app");
if ((window.innerHeight + window.scrollY) >= app.offsetHeight - threshold) {
this.append();
}
});
}
if ((window.innerHeight + window.scrollY) >= app.offsetHeight - threshold) {
this.append();
}
});
},
methods: {
sid: sid
}
});
</script>
<style>
.theme-black .list-group-item {
background: #212121;
color: #e0e0e0;
background: #212121;
color: #e0e0e0;
border-bottom: none;
border-left: none;
border-right: none;
border-radius: 0;
padding: .25rem 0.5rem;
border-bottom: none;
border-left: none;
border-right: none;
border-radius: 0;
padding: .25rem 0.5rem;
}
.theme-black .list-group-item:first-child {
border-top: none;
border-top: none;
}
</style>

View File

@@ -17,10 +17,10 @@
</div>
<img v-if="doc._props.isPlayableImage || doc._props.isPlayableVideo"
:src="(doc._props.isGif && hover) ? `f/${doc._source.index}/${doc._id}` : `t/${doc._source.index}/${doc._id}`"
:src="(doc._props.isGif && hover) ? `f/${sid(doc)}` : `t/${sid(doc)}`"
alt=""
class="pointer fit-sm" @click="onThumbnailClick()">
<img v-else :src="`t/${doc._source.index}/${doc._id}`" alt=""
<img v-else :src="`t/${sid(doc)}`" alt=""
class="fit-sm">
</div>
</div>
@@ -70,6 +70,7 @@ import FileIcon from "@/components/icons/FileIcon";
import FeaturedFieldsLine from "@/components/FeaturedFieldsLine";
import MLIcon from "@/components/icons/MlIcon.vue";
import Sist2Api from "@/Sist2Api";
import {sid} from "@/util";
export default {
name: "DocListItem",
@@ -82,12 +83,13 @@ export default {
}
},
methods: {
sid: sid,
async onThumbnailClick() {
this.$store.commit("setUiLightboxSlide", this.doc._seq);
await this.$store.dispatch("showLightbox");
},
onEmbeddingClick() {
Sist2Api.getEmbeddings(this.doc._source.index, this.doc._id, this.$store.state.embeddingsModel).then(embeddings => {
Sist2Api.getEmbeddings(sid(this.doc), this.$store.state.embeddingsModel).then(embeddings => {
this.$store.commit("setEmbeddingText", "");
this.$store.commit("setEmbedding", embeddings);
this.$store.commit("setEmbeddingDoc", this.doc);

View File

@@ -1,188 +1,189 @@
<template>
<div v-if="doc._props.hasThumbnail" class="img-wrapper" @mouseenter="onTnEnter()" @mouseleave="onTnLeave()"
@touchstart="onTouchStart()">
<div v-if="doc._props.isAudio" class="card-img-overlay" :class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
<div v-if="doc._props.hasThumbnail" class="img-wrapper" @mouseenter="onTnEnter()" @mouseleave="onTnLeave()"
@touchstart="onTouchStart()">
<div v-if="doc._props.isAudio" class="card-img-overlay" :class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div
v-if="doc._props.isImage && doc._props.imageAspectRatio < 5"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ `${doc._source.width}x${doc._source.height}` }}</span>
</div>
<div v-if="(doc._props.isVideo || doc._props.isGif) && doc._source.duration > 0"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div v-if="doc._props.isPlayableVideo" class="play">
<svg viewBox="0 0 494.942 494.942" xmlns="http://www.w3.org/2000/svg">
<path d="m35.353 0 424.236 247.471-424.236 247.471z"/>
</svg>
</div>
<img ref="tn"
v-if="doc._props.isPlayableImage || doc._props.isPlayableVideo"
:src="tnSrc"
alt=""
:style="{height: (doc._props.isGif && hover) ? `${tnHeight()}px` : undefined}"
class="pointer fit card-img-top" @click="onThumbnailClick()">
<img v-else :src="tnSrc" alt=""
class="fit card-img-top">
<ThumbnailProgressBar v-if="hover && doc._props.hasVidPreview"
:progress="(currentThumbnailNum + 1) / (doc._props.tnNum)"
></ThumbnailProgressBar>
</div>
<div
v-if="doc._props.isImage && doc._props.imageAspectRatio < 5"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ `${doc._source.width}x${doc._source.height}` }}</span>
</div>
<div v-if="(doc._props.isVideo || doc._props.isGif) && doc._source.duration > 0"
class="card-img-overlay"
:class="{'small-badge': smallBadge}">
<span class="badge badge-resolution">{{ humanTime(doc._source.duration) }}</span>
</div>
<div v-if="doc._props.isPlayableVideo" class="play">
<svg viewBox="0 0 494.942 494.942" xmlns="http://www.w3.org/2000/svg">
<path d="m35.353 0 424.236 247.471-424.236 247.471z"/>
</svg>
</div>
<img ref="tn"
v-if="doc._props.isPlayableImage || doc._props.isPlayableVideo"
:src="tnSrc"
alt=""
:style="{height: (doc._props.isGif && hover) ? `${tnHeight()}px` : undefined}"
class="pointer fit card-img-top" @click="onThumbnailClick()">
<img v-else :src="tnSrc" alt=""
class="fit card-img-top">
<ThumbnailProgressBar v-if="hover && doc._props.hasVidPreview"
:progress="(currentThumbnailNum + 1) / (doc._props.tnNum)"
></ThumbnailProgressBar>
</div>
</template>
<script>
import {humanTime} from "@/util";
import {humanTime, sid} from "@/util";
import ThumbnailProgressBar from "@/components/ThumbnailProgressBar";
export default {
name: "FullThumbnail",
props: ["doc", "smallBadge"],
components: {ThumbnailProgressBar},
data() {
return {
hover: false,
currentThumbnailNum: 0,
timeoutId: null
name: "FullThumbnail",
props: ["doc", "smallBadge"],
components: {ThumbnailProgressBar},
data() {
return {
hover: false,
currentThumbnailNum: 0,
timeoutId: null
}
},
created() {
this.$store.subscribe((mutation) => {
if (mutation.type === "busTnTouchStart" && mutation.payload !== this.doc._id) {
this.onTnLeave();
}
});
},
computed: {
tnSrc() {
return this.getThumbnailSrc(this.currentThumbnailNum);
},
},
methods: {
sid: sid,
getThumbnailSrc(thumbnailNum) {
const doc = this.doc;
const props = doc._props;
if (props.isGif && this.hover) {
return `f/${sid(doc)}`;
}
return (this.currentThumbnailNum === 0)
? `t/${sid(doc)}`
: `t/${sid(doc)}/${String(thumbnailNum).padStart(4, "0")}`;
},
humanTime: humanTime,
onThumbnailClick() {
this.$emit("onThumbnailClick");
},
tnHeight() {
return this.$refs.tn.height;
},
tnWidth() {
return this.$refs.tn.width;
},
onTnEnter() {
this.hover = true;
const start = Date.now()
if (this.doc._props.hasVidPreview) {
let img = new Image();
img.src = this.getThumbnailSrc(this.currentThumbnailNum + 1);
img.onload = () => {
this.currentThumbnailNum += 1;
this.scheduleNextTnNum(Date.now() - start);
}
}
},
onTnLeave() {
this.currentThumbnailNum = 0;
this.hover = false;
if (this.timeoutId !== null) {
window.clearTimeout(this.timeoutId);
this.timeoutId = null;
}
},
scheduleNextTnNum(offset = 0) {
const INTERVAL = (this.$store.state.optVidPreviewInterval ?? 700) - offset;
this.timeoutId = window.setTimeout(() => {
const start = Date.now();
if (!this.hover) {
return;
}
if (this.currentThumbnailNum === this.doc._props.tnNum - 1) {
this.currentThumbnailNum = 0;
this.scheduleNextTnNum();
} else {
let img = new Image();
img.src = this.getThumbnailSrc(this.currentThumbnailNum + 1);
img.onload = () => {
this.currentThumbnailNum += 1;
this.scheduleNextTnNum(Date.now() - start);
}
}
}, INTERVAL);
},
onTouchStart() {
this.$store.commit("busTnTouchStart", this.doc._id);
if (!this.hover) {
this.onTnEnter()
}
},
}
},
created() {
this.$store.subscribe((mutation) => {
if (mutation.type === "busTnTouchStart" && mutation.payload !== this.doc._id) {
this.onTnLeave();
}
});
},
computed: {
tnSrc() {
return this.getThumbnailSrc(this.currentThumbnailNum);
},
},
methods: {
getThumbnailSrc(thumbnailNum) {
const doc = this.doc;
const props = doc._props;
if (props.isGif && this.hover) {
return `f/${doc._source.index}/${doc._id}`;
}
return (this.currentThumbnailNum === 0)
? `t/${doc._source.index}/${doc._id}`
: `t/${doc._source.index}/${doc._id}/${String(thumbnailNum).padStart(4, "0")}`;
},
humanTime: humanTime,
onThumbnailClick() {
this.$emit("onThumbnailClick");
},
tnHeight() {
return this.$refs.tn.height;
},
tnWidth() {
return this.$refs.tn.width;
},
onTnEnter() {
this.hover = true;
const start = Date.now()
if (this.doc._props.hasVidPreview) {
let img = new Image();
img.src = this.getThumbnailSrc(this.currentThumbnailNum + 1);
img.onload = () => {
this.currentThumbnailNum += 1;
this.scheduleNextTnNum(Date.now() - start);
}
}
},
onTnLeave() {
this.currentThumbnailNum = 0;
this.hover = false;
if (this.timeoutId !== null) {
window.clearTimeout(this.timeoutId);
this.timeoutId = null;
}
},
scheduleNextTnNum(offset = 0) {
const INTERVAL = (this.$store.state.optVidPreviewInterval ?? 700) - offset;
this.timeoutId = window.setTimeout(() => {
const start = Date.now();
if (!this.hover) {
return;
}
if (this.currentThumbnailNum === this.doc._props.tnNum - 1) {
this.currentThumbnailNum = 0;
this.scheduleNextTnNum();
} else {
let img = new Image();
img.src = this.getThumbnailSrc(this.currentThumbnailNum + 1);
img.onload = () => {
this.currentThumbnailNum += 1;
this.scheduleNextTnNum(Date.now() - start);
}
}
}, INTERVAL);
},
onTouchStart() {
this.$store.commit("busTnTouchStart", this.doc._id);
if (!this.hover) {
this.onTnEnter()
}
},
}
}
</script>
<style scoped>
.img-wrapper {
position: relative;
position: relative;
}
.img-wrapper:hover svg {
fill: rgba(0, 0, 0, 1);
fill: rgba(0, 0, 0, 1);
}
.card-img-top {
border-top-left-radius: 0;
border-top-right-radius: 0;
border-top-left-radius: 0;
border-top-right-radius: 0;
}
.play {
position: absolute;
width: 25px;
height: 25px;
left: 50%;
top: 50%;
transform: translate(-50%, -50%);
pointer-events: none;
position: absolute;
width: 25px;
height: 25px;
left: 50%;
top: 50%;
transform: translate(-50%, -50%);
pointer-events: none;
}
.play svg {
fill: rgba(0, 0, 0, 0.7);
fill: rgba(0, 0, 0, 0.7);
}
.badge-resolution {
color: #c6c6c6;
background-color: #272727CC;
padding: 2px 3px;
color: #c6c6c6;
background-color: #272727CC;
padding: 2px 3px;
}
.card-img-overlay {
pointer-events: none;
padding: 2px 6px;
bottom: 4px;
top: unset;
left: unset;
right: 0;
pointer-events: none;
padding: 2px 6px;
bottom: 4px;
top: unset;
left: unset;
right: 0;
}
.small-badge {
padding: 1px 3px;
font-size: 70%;
padding: 1px 3px;
font-size: 70%;
}
</style>

View File

@@ -27,11 +27,12 @@
@click.shift="shiftClick(idx, $event)"
class="d-flex justify-content-between align-items-center list-group-item-action pointer"
:class="{active: lastClickIndex === idx}"
:key="idx.id"
>
<div class="d-flex">
<b-checkbox style="pointer-events: none" :checked="isSelected(idx)"></b-checkbox>
{{ idx.name }}
<div style="vertical-align: center; margin-left: 5px">
<div v-if="hasEmbeddings(idx)" style="vertical-align: center; margin-left: 5px">
<MLIcon small style="top: -1px; position: relative"></MLIcon>
</div>
<span class="text-muted timestamp-text ml-2"
@@ -43,7 +44,7 @@
</div>
</template>
<script lang="ts">
<script>
import SmallBadge from "./SmallBadge.vue"
import {mapActions, mapGetters} from "vuex";
import Vue from "vue";
@@ -111,7 +112,7 @@ export default Vue.extend({
onSelect(value) {
this.setSelectedIndices(this.indices.filter(idx => value.includes(idx.id)));
},
formatIdxDate(timestamp: number): string {
formatIdxDate(timestamp) {
return format(new Date(timestamp * 1000), "yyyy-MM-dd");
},
toggleIndex(index, e) {
@@ -121,14 +122,17 @@ export default Vue.extend({
this.lastClickIndex = index;
if (this.isSelected(index)) {
this.setSelectedIndices(this.selectedIndices.filter(idx => idx.id != index.id));
this.setSelectedIndices(this.selectedIndices.filter(idx => idx.id !== index.id));
} else {
this.setSelectedIndices([index, ...this.selectedIndices]);
}
},
isSelected(index) {
return this.selectedIndices.find(idx => idx.id == index.id) != null;
}
return this.selectedIndices.find(idx => idx.id === index.id) != null;
},
hasEmbeddings(index) {
return index.models.length > 0;
},
},
})
</script>

View File

@@ -59,7 +59,7 @@ export default {
const fields = [
"title", "duration", "audioc", "videoc",
"bitrate", "artist", "album", "album_artist", "genre", "font_name", "author",
"bitrate", "artist", "album", "album_artist", "genre", "font_name", "author", "media_comment",
"modified_by", "pages", "tag",
"exif_make", "exif_software", "exif_exposure_time", "exif_fnumber", "exif_focal_length",
"exif_user_comment", "exif_iso_speed_ratings", "exif_model", "exif_datetime",

View File

@@ -10,7 +10,7 @@
>{{ $t("ml.analyzeText") }}
</b-button>
<b-select :disabled="mlPredictionsLoading || mlLoading" class="ml-2" v-model="nerModel">
<b-select-option :value="opt.value" v-for="opt of ModelsRepo.getOptions()">{{ opt.text }}
<b-select-option :value="opt.value" v-for="opt of ModelsRepo.getOptions()" :key="opt.value">{{ opt.text }}
</b-select-option>
</b-select>
</b-form>
@@ -46,7 +46,7 @@ import {mapGetters, mapMutations} from "vuex";
export default {
name: "LazyContentDiv",
components: {AnalyzedContentSpansContainer, Preloader},
props: ["docId"],
props: ["sid"],
data() {
return {
ModelsRepo,
@@ -70,7 +70,7 @@ export default {
}
Sist2Api
.getDocument(this.docId, this.$store.state.optHighlight, this.$store.state.fuzzy)
.getDocument(this.sid, this.$store.state.optHighlight, this.$store.state.fuzzy)
.then(doc => {
this.loading = false;

View File

@@ -77,6 +77,7 @@ export default {
return listener(e);
}
};
},
methods: {
keyDownListener(e) {

View File

@@ -43,8 +43,7 @@
</b-card>
</template>
<script lang="ts">
import Sist2Api, {EsResult} from "@/Sist2Api";
<script>
import Vue from "vue";
import {humanFileSize} from "@/util";
import DisplayModeToggle from "@/components/DisplayModeToggle.vue";
@@ -52,6 +51,7 @@ import SortSelect from "@/components/SortSelect.vue";
import Preloader from "@/components/Preloader.vue";
import Sist2Query from "@/Sist2ElasticsearchQuery";
import ClipboardIcon from "@/components/icons/ClipboardIcon.vue";
import Sist2Api from "@/Sist2Api";
export default Vue.extend({
name: "ResultsCard",
@@ -64,7 +64,7 @@ export default Vue.extend({
return this.$store.state.lastQueryResults != null;
},
hitCount() {
return (this.$store.state.firstQueryResults as EsResult).aggregations.total_count.value;
return (this.$store.state.firstQueryResults).aggregations.total_count.value;
},
tableItems() {
const items = [];
@@ -79,10 +79,10 @@ export default Vue.extend({
},
methods: {
took() {
return (this.$store.state.lastQueryResults as EsResult).took + "ms";
return (this.$store.state.lastQueryResults).took + "ms";
},
totalSize() {
return humanFileSize((this.$store.state.firstQueryResults as EsResult).aggregations.total_size.value);
return humanFileSize((this.$store.state.firstQueryResults).aggregations.total_size.value);
},
onToggle() {
const show = !document.getElementById("collapse-1").classList.contains("show");

View File

@@ -5,8 +5,7 @@
<script>
import noUiSlider from 'nouislider';
import 'nouislider/dist/nouislider.css';
import {humanFileSize} from "@/util";
import {mergeTooltips} from "@/util-js";
import {humanFileSize, mergeTooltips} from "@/util";
export default {
name: "SizeSlider",

View File

@@ -2,7 +2,7 @@
<b-badge variant="secondary" :pill="pill">{{ text }}</b-badge>
</template>
<script lang="ts">
<script>
import Vue from "vue";
export default Vue.extend({

View File

@@ -1,5 +1,5 @@
<template>
<div class="thumbnail-progress-bar" :style="{width: `${percentProgress}%`}"></div>
<div class="thumbnail_count-progress-bar" :style="{width: `${percentProgress}%`}"></div>
</template>
<script>
@@ -16,7 +16,7 @@ export default {
<style scoped>
.thumbnail-progress-bar {
.thumbnail_count-progress-bar {
position: absolute;
left: 0;
bottom: 0;
@@ -27,11 +27,11 @@ export default {
z-index: 9;
}
.theme-black .thumbnail-progress-bar {
.theme-black .thumbnail_count-progress-bar {
background: rgba(0, 188, 212, 0.95);
}
.sub-document .thumbnail-progress-bar {
.sub-document .thumbnail_count-progress-bar {
max-width: calc(100% - 8px);
left: 4px;
}

View File

@@ -22,7 +22,9 @@ export class CLIPTransformerModel {
async loadModel(onProgress) {
ort.env.wasm.wasmPaths = ORT_WASM_PATHS;
ort.env.wasm.numThreads = 2;
if (window.crossOriginIsolated) {
ort.env.wasm.numThreads = 2;
}
let buf = await ModelStore.get(this._modelUrl);
if (!buf) {

View File

@@ -21,7 +21,7 @@ const authGuard = (to, from, next) => {
next();
}
const routes: Array<RouteConfig> = [
const routes = [
{
path: "/",
name: "SearchPage",

View File

@@ -1,20 +1,18 @@
import Vue from "vue"
import Vuex from "vuex"
import VueRouter, {Route} from "vue-router";
import {EsHit, EsResult, EsTag, Index} from "@/Sist2Api";
import {deserializeMimes, randomSeed, serializeMimes} from "@/util";
import {getInstance} from "@/plugins/auth0.js";
const CONF_VERSION = 3;
Vue.use(Vuex)
Vue.use(Vuex);
export default new Vuex.Store({
state: {
seed: 0,
indices: [] as Index[],
tags: [] as EsTag[],
sist2Info: null as any,
indices: [],
tags: [],
sist2Info: null,
sizeMin: undefined,
sizeMax: undefined,
@@ -64,12 +62,12 @@ export default new Vuex.Store({
optAutoAnalyze: false,
optMlDefaultModel: null,
_onLoadSelectedIndices: [] as string[],
_onLoadSelectedMimeTypes: [] as string[],
_onLoadSelectedTags: [] as string[],
selectedIndices: [] as Index[],
selectedMimeTypes: [] as string[],
selectedTags: [] as string[],
_onLoadSelectedIndices: [],
_onLoadSelectedMimeTypes: [],
_onLoadSelectedTags: [],
selectedIndices: [],
selectedMimeTypes: [],
selectedTags: [],
lastQueryResults: null,
firstQueryResults: null,
@@ -80,10 +78,10 @@ export default new Vuex.Store({
uiSqliteMode: false,
uiLightboxIsOpen: false,
uiShowLightbox: false,
uiLightboxSources: [] as string[],
uiLightboxThumbs: [] as string[],
uiLightboxCaptions: [] as any[],
uiLightboxTypes: [] as string[],
uiLightboxSources: [],
uiLightboxThumbs: [],
uiLightboxCaptions: [],
uiLightboxTypes: [],
uiLightboxKey: 0,
uiLightboxSlide: 0,
uiReachedScrollEnd: false,
@@ -91,7 +89,7 @@ export default new Vuex.Store({
uiDetailsMimeAgg: null,
uiShowDetails: false,
uiMimeMap: [] as any[],
uiMimeMap: [],
auth0Token: null,
nerModel: {
@@ -122,7 +120,7 @@ export default new Vuex.Store({
if (state._onLoadSelectedIndices.length > 0) {
state.selectedIndices = val.filter(
(idx: Index) => state._onLoadSelectedIndices.some(prefix => idx.id.startsWith(prefix))
(idx) => state._onLoadSelectedIndices.some(id => id === idx.id.toString(16))
);
} else {
state.selectedIndices = val;
@@ -145,18 +143,18 @@ export default new Vuex.Store({
setSelectedIndices: (state, val) => state.selectedIndices = val,
setSelectedMimeTypes: (state, val) => state.selectedMimeTypes = val,
setSelectedTags: (state, val) => state.selectedTags = val,
setUiLightboxIsOpen: (state, val: boolean) => state.uiLightboxIsOpen = val,
_setUiShowLightbox: (state, val: boolean) => state.uiShowLightbox = val,
setUiLightboxKey: (state, val: number) => state.uiLightboxKey = val,
_setKeySequence: (state, val: number) => state.keySequence = val,
_setQuerySequence: (state, val: number) => state.querySequence = val,
setUiLightboxIsOpen: (state, val) => state.uiLightboxIsOpen = val,
_setUiShowLightbox: (state, val) => state.uiShowLightbox = val,
setUiLightboxKey: (state, val) => state.uiLightboxKey = val,
_setKeySequence: (state, val) => state.keySequence = val,
_setQuerySequence: (state, val) => state.querySequence = val,
addLightboxSource: (state, {source, thumbnail, caption, type}) => {
state.uiLightboxSources.push(source);
state.uiLightboxThumbs.push(thumbnail);
state.uiLightboxCaptions.push(caption);
state.uiLightboxTypes.push(type);
},
setUiLightboxSlide: (state, val: number) => state.uiLightboxSlide = val,
setUiLightboxSlide: (state, val) => state.uiLightboxSlide = val,
setUiLightboxSources: (state, val) => state.uiLightboxSources = val,
setUiLightboxThumbs: (state, val) => state.uiLightboxThumbs = val,
@@ -230,7 +228,7 @@ export default new Vuex.Store({
store.commit("setOptLang", val.lang);
}
},
loadFromArgs({commit}, route: Route) {
loadFromArgs({commit}, route) {
if (route.query.q) {
commit("setSearchText", route.query.q);
@@ -265,11 +263,11 @@ export default new Vuex.Store({
}
if (route.query.m) {
commit("_setOnLoadSelectedMimeTypes", deserializeMimes(route.query.m as string));
commit("_setOnLoadSelectedMimeTypes", deserializeMimes(route.query.m));
}
if (route.query.t) {
commit("_setOnLoadSelectedTags", (route.query.t as string).split(","));
commit("_setOnLoadSelectedTags", (route.query.t).split(","));
}
if (route.query.sort) {
@@ -280,7 +278,7 @@ export default new Vuex.Store({
commit("setSeed", Number(route.query.seed));
}
},
async updateArgs({state}, router: VueRouter) {
async updateArgs({state}, router) {
if (router.currentRoute.path !== "/") {
return;
@@ -290,14 +288,14 @@ export default new Vuex.Store({
query: {
q: state.searchText.trim() ? state.searchText.trim().replace(/\s+/g, " ") : undefined,
fuzzy: state.fuzzy ? null : undefined,
i: state.selectedIndices ? state.selectedIndices.map((idx: Index) => idx.idPrefix) : undefined,
i: state.selectedIndices ? state.selectedIndices.map((idx) => idx.id.toString(16)) : undefined,
dMin: state.dateMin,
dMax: state.dateMax,
sMin: state.sizeMin,
sMax: state.sizeMax,
path: state.pathText ? state.pathText : undefined,
m: serializeMimes(state.selectedMimeTypes),
t: state.selectedTags.length == 0 ? undefined : state.selectedTags.join(","),
t: state.selectedTags.length === 0 ? undefined : state.selectedTags.join(","),
sort: state.sortMode === "score" ? undefined : state.sortMode,
seed: state.sortMode === "random" ? state.seed.toString() : undefined
}
@@ -306,11 +304,11 @@ export default new Vuex.Store({
});
},
updateConfiguration({state}) {
const conf = {} as any;
const conf = {};
Object.keys(state).forEach((key) => {
if (key.startsWith("opt")) {
conf[key] = (state as any)[key];
conf[key] = (state)[key];
}
});
@@ -323,14 +321,14 @@ export default new Vuex.Store({
if (confString) {
const conf = JSON.parse(confString);
if (!("version" in conf) || conf["version"] != CONF_VERSION) {
if (!("version" in conf) || conf["version"] !== CONF_VERSION) {
localStorage.removeItem("sist2_configuration");
window.location.reload();
}
Object.keys(state).forEach((key) => {
if (key.startsWith("opt")) {
(state as any)[key] = conf[key];
(state)[key] = conf[key];
}
});
}
@@ -386,7 +384,7 @@ export default new Vuex.Store({
indices: state => state.indices,
sist2Info: state => state.sist2Info,
indexMap: state => {
const map = {} as any;
const map = {};
state.indices.forEach(idx => map[idx.id] = idx);
return map;
},
@@ -405,12 +403,12 @@ export default new Vuex.Store({
size: state => state.optSize,
sortMode: state => state.sortMode,
lastQueryResult: state => state.lastQueryResults,
lastDoc: function (state): EsHit | null {
lastDoc: function (state) {
if (state.lastQueryResults == null) {
return null;
}
return (state.lastQueryResults as unknown as EsResult).hits.hits.slice(-1)[0];
return (state.lastQueryResults).hits.hits.slice(-1)[0];
},
uiShowLightbox: state => state.uiShowLightbox,
uiLightboxSources: state => state.uiLightboxSources,
@@ -451,7 +449,7 @@ export default new Vuex.Store({
optMlRepositories: state => state.optMlRepositories,
mlRepositoryList: state => {
const repos = state.optMlRepositories.split("\n")
return repos[0] == "" ? [] : repos;
return repos[0] === "" ? [] : repos;
},
optMlDefaultModel: state => state.optMlDefaultModel,
optAutoAnalyze: state => state.optAutoAnalyze,

View File

@@ -1,139 +0,0 @@
export function mergeTooltips(slider, threshold, separator, fixTooltips) {
const isMobile = window.innerWidth <= 650;
if (isMobile) {
threshold = 25;
}
const textIsRtl = getComputedStyle(slider).direction === 'rtl';
const isRtl = slider.noUiSlider.options.direction === 'rtl';
const isVertical = slider.noUiSlider.options.orientation === 'vertical';
const tooltips = slider.noUiSlider.getTooltips();
const origins = slider.noUiSlider.getOrigins();
// Move tooltips into the origin element. The default stylesheet handles this.
tooltips.forEach(function (tooltip, index) {
if (tooltip) {
origins[index].appendChild(tooltip);
}
});
slider.noUiSlider.on('update', function (values, handle, unencoded, tap, positions) {
const pools = [[]];
const poolPositions = [[]];
const poolValues = [[]];
let atPool = 0;
// Assign the first tooltip to the first pool, if the tooltip is configured
if (tooltips[0]) {
pools[0][0] = 0;
poolPositions[0][0] = positions[0];
poolValues[0][0] = values[0];
}
for (let i = 1; i < positions.length; i++) {
if (!tooltips[i] || (positions[i] - positions[i - 1]) > threshold) {
atPool++;
pools[atPool] = [];
poolValues[atPool] = [];
poolPositions[atPool] = [];
}
if (tooltips[i]) {
pools[atPool].push(i);
poolValues[atPool].push(values[i]);
poolPositions[atPool].push(positions[i]);
}
}
pools.forEach(function (pool, poolIndex) {
const handlesInPool = pool.length;
for (let j = 0; j < handlesInPool; j++) {
const handleNumber = pool[j];
if (j === handlesInPool - 1) {
let offset = 0;
poolPositions[poolIndex].forEach(function (value) {
offset += 1000 - 10 * value;
});
const direction = isVertical ? 'bottom' : 'right';
const last = isRtl ? 0 : handlesInPool - 1;
const lastOffset = 1000 - 10 * poolPositions[poolIndex][last];
offset = (textIsRtl && !isVertical ? 100 : 0) + (offset / handlesInPool) - lastOffset;
// Center this tooltip over the affected handles
tooltips[handleNumber].innerHTML = poolValues[poolIndex].join(separator);
tooltips[handleNumber].style.display = 'block';
tooltips[handleNumber].style[direction] = offset + '%';
} else {
// Hide this tooltip
tooltips[handleNumber].style.display = 'none';
}
}
});
if (fixTooltips) {
const isMobile = window.innerWidth <= 650;
const len = isMobile ? 20 : 5;
if (positions[0] < len) {
tooltips[0].style.right = `${(1 - ((positions[0]) / len)) * -35}px`
} else {
tooltips[0].style.right = "0"
}
if (positions[1] > (100 - len)) {
tooltips[1].style.right = `${((positions[1] - (100 - len)) / len) * 35}px`
} else {
tooltips[1].style.right = "0"
}
}
});
}
export function burrow(table, addSelfDir, rootName) {
const root = {};
table.forEach(row => {
let layer = root;
row.taxonomy.forEach(key => {
layer[key] = key in layer ? layer[key] : {};
layer = layer[key];
});
if (Object.keys(layer).length === 0) {
layer["$size$"] = row.size;
} else if (addSelfDir) {
layer["."] = {
"$size$": row.size,
};
}
});
const descend = function (obj, depth) {
return Object.keys(obj).filter(k => k !== "$size$").map(k => {
const child = {
name: k,
depth: depth,
value: 0,
children: descend(obj[k], depth + 1)
};
if ("$size$" in obj[k]) {
child.value = obj[k]["$size$"];
}
return child;
});
};
return {
name: rootName,
children: descend(root, 1),
value: 0,
depth: 0,
}
}

329  sist2-vue/src/util.js  Normal file
View File

@@ -0,0 +1,329 @@
export function mergeTooltips(slider, threshold, separator, fixTooltips) {
const isMobile = window.innerWidth <= 650;
if (isMobile) {
threshold = 25;
}
const textIsRtl = getComputedStyle(slider).direction === 'rtl';
const isRtl = slider.noUiSlider.options.direction === 'rtl';
const isVertical = slider.noUiSlider.options.orientation === 'vertical';
const tooltips = slider.noUiSlider.getTooltips();
const origins = slider.noUiSlider.getOrigins();
// Move tooltips into the origin element. The default stylesheet handles this.
tooltips.forEach(function (tooltip, index) {
if (tooltip) {
origins[index].appendChild(tooltip);
}
});
slider.noUiSlider.on('update', function (values, handle, unencoded, tap, positions) {
const pools = [[]];
const poolPositions = [[]];
const poolValues = [[]];
let atPool = 0;
// Assign the first tooltip to the first pool, if the tooltip is configured
if (tooltips[0]) {
pools[0][0] = 0;
poolPositions[0][0] = positions[0];
poolValues[0][0] = values[0];
}
for (let i = 1; i < positions.length; i++) {
if (!tooltips[i] || (positions[i] - positions[i - 1]) > threshold) {
atPool++;
pools[atPool] = [];
poolValues[atPool] = [];
poolPositions[atPool] = [];
}
if (tooltips[i]) {
pools[atPool].push(i);
poolValues[atPool].push(values[i]);
poolPositions[atPool].push(positions[i]);
}
}
pools.forEach(function (pool, poolIndex) {
const handlesInPool = pool.length;
for (let j = 0; j < handlesInPool; j++) {
const handleNumber = pool[j];
if (j === handlesInPool - 1) {
let offset = 0;
poolPositions[poolIndex].forEach(function (value) {
offset += 1000 - 10 * value;
});
const direction = isVertical ? 'bottom' : 'right';
const last = isRtl ? 0 : handlesInPool - 1;
const lastOffset = 1000 - 10 * poolPositions[poolIndex][last];
offset = (textIsRtl && !isVertical ? 100 : 0) + (offset / handlesInPool) - lastOffset;
// Center this tooltip over the affected handles
tooltips[handleNumber].innerHTML = poolValues[poolIndex].join(separator);
tooltips[handleNumber].style.display = 'block';
tooltips[handleNumber].style[direction] = offset + '%';
} else {
// Hide this tooltip
tooltips[handleNumber].style.display = 'none';
}
}
});
if (fixTooltips) {
const isMobile = window.innerWidth <= 650;
const len = isMobile ? 20 : 5;
if (positions[0] < len) {
tooltips[0].style.right = `${(1 - ((positions[0]) / len)) * -35}px`
} else {
tooltips[0].style.right = "0"
}
if (positions[1] > (100 - len)) {
tooltips[1].style.right = `${((positions[1] - (100 - len)) / len) * 35}px`
} else {
tooltips[1].style.right = "0"
}
}
});
}
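For context, a minimal wiring sketch of mergeTooltips with a noUiSlider instance (the element id and range values are assumptions; the actual callers are the DateSlider and SizeSlider components):
import noUiSlider from "nouislider";

const el = document.getElementById("size-slider");  // hypothetical element
noUiSlider.create(el, {
    start: [0, 1000],
    connect: true,
    range: {min: 0, max: 1000},
    tooltips: [true, true],
});
// Merge the two handle tooltips into one when they are within 15% of each other.
mergeTooltips(el, 15, " - ", true);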
export function burrow(table, addSelfDir, rootName) {
const root = {};
table.forEach(row => {
let layer = root;
row.taxonomy.forEach(key => {
layer[key] = key in layer ? layer[key] : {};
layer = layer[key];
});
if (Object.keys(layer).length === 0) {
layer["$size$"] = row.size;
} else if (addSelfDir) {
layer["."] = {
"$size$": row.size,
};
}
});
const descend = function (obj, depth) {
return Object.keys(obj).filter(k => k !== "$size$").map(k => {
const child = {
name: k,
depth: depth,
value: 0,
children: descend(obj[k], depth + 1)
};
if ("$size$" in obj[k]) {
child.value = obj[k]["$size$"];
}
return child;
});
};
return {
name: rootName,
children: descend(root, 1),
value: 0,
depth: 0,
}
}
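For illustration, burrow turns flat rows carrying a taxonomy path and a size into a nested tree (the rows below are made up):
const table = [
    {taxonomy: ["home", "photos"], size: 100},
    {taxonomy: ["home", "music"], size: 50},
];
burrow(table, false, "root");
// => {name: "root", depth: 0, value: 0, children: [
//      {name: "home", depth: 1, value: 0, children: [
//        {name: "photos", depth: 2, value: 100, children: []},
//        {name: "music",  depth: 2, value: 50,  children: []}]}]}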
export function ext(hit) {
return srcExt(hit._source)
}
export function srcExt(src) {
return Object.prototype.hasOwnProperty.call(src, "extension")
&& src["extension"] !== "" ? "." + src["extension"] : "";
}
export function strUnescape(str) {
let result = "";
for (let i = 0; i < str.length; i++) {
const c = str[i];
const next = str[i + 1];
if (c === "]") {
if (next === "]") {
result += c;
i += 1;
} else {
result += String.fromCharCode(parseInt(str.slice(i, i + 2), 16));
i += 2;
}
} else {
result += c;
}
}
return result;
}
const thresh = 1000;
const units = ["k", "M", "G", "T", "P", "E", "Z", "Y"];
export function humanFileSize(bytes) {
if (bytes === 0) {
return "0 B"
}
if (Math.abs(bytes) < thresh) {
return bytes + ' B';
}
let u = -1;
do {
bytes /= thresh;
++u;
} while (Math.abs(bytes) >= thresh && u < units.length - 1);
return bytes.toFixed(1) + units[u];
}
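A few sample values (1000-based units, no space before the unit suffix):
humanFileSize(0);        // => "0 B"
humanFileSize(999);      // => "999 B"
humanFileSize(1536);     // => "1.5k"
humanFileSize(2500000);  // => "2.5M"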
export function humanTime(sec_num) {
sec_num = Math.floor(sec_num);
const hours = Math.floor(sec_num / 3600);
const minutes = Math.floor((sec_num - (hours * 3600)) / 60);
const seconds = sec_num - (hours * 3600) - (minutes * 60);
if (sec_num < 60) {
return `${sec_num}s`
}
if (sec_num < 3600) {
return `${minutes < 10 ? "0" : ""}${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
return `${hours < 10 ? "0" : ""}${hours}:${minutes < 10 ? "0" : ""}${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
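Sample outputs for the three duration formats:
humanTime(45);    // => "45s"
humanTime(75);    // => "01:15"
humanTime(3725);  // => "01:02:05"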
export function humanDate(numMilis) {
const date = (new Date(numMilis * 1000));
return date.getUTCFullYear() + "-" + ("0" + (date.getUTCMonth() + 1)).slice(-2) + "-" + ("0" + date.getUTCDate()).slice(-2)
}
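Note that humanDate takes a Unix timestamp in seconds (the parameter name is misleading), e.g.:
humanDate(0);           // => "1970-01-01"
humanDate(1712102400);  // => "2024-04-03"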
export function lum(c) {
c = c.substring(1);
const rgb = parseInt(c, 16);
const r = (rgb >> 16) & 0xff;
const g = (rgb >> 8) & 0xff;
const b = (rgb >> 0) & 0xff;
return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}
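lum computes the relative luminance of a "#rrggbb" color; createUserTag uses it to pick black or white foreground text:
lum("#ffffff");  // => 255
lum("#000000");  // => 0
lum("#ff9800");  // ≈ 163, so createUserTag picks "#000" for the foreground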
export function getSelectedTreeNodes(tree) {
const selectedNodes = new Set();
const selected = tree.selected();
for (let i = 0; i < selected.length; i++) {
if (selected[i].id === "any") {
return ["any"]
}
// Only get children
if (selected[i].text.indexOf("(") !== -1) {
if (selected[i].values) {
selectedNodes.add(selected[i].values.slice(-1)[0]);
} else {
selectedNodes.add(selected[i].id);
}
}
}
return Array.from(selectedNodes);
}
export function getTreeNodeAttributes(tree) {
const nodes = tree.selectable();
const attributes = {};
for (let i = 0; i < nodes.length; i++) {
let id = null;
if (nodes[i].text.indexOf("(") !== -1 && nodes[i].values) {
id = nodes[i].values.slice(-1)[0];
} else {
id = nodes[i].id
}
attributes[id] = {
checked: nodes[i].itree.state.checked,
collapsed: nodes[i].itree.state.collapsed,
}
}
return attributes;
}
export function serializeMimes(mimes) {
if (mimes.length === 0) {
return undefined;
}
return mimes.map(mime => compressMime(mime)).join("");
}
export function deserializeMimes(mimeString) {
return mimeString
.replaceAll(/([IVATUF])/g, "$$$&")
.split("$")
.map(mime => decompressMime(mime))
.slice(1) // Ignore the first (empty) token
}
export function compressMime(mime) {
return mime
.replace("image/", "I")
.replace("video/", "V")
.replace("application/", "A")
.replace("text/", "T")
.replace("audio/", "U")
.replace("font/", "F")
.replace("+", ",")
.replace("x-", "X")
}
export function decompressMime(mime) {
return mime
.replace("I", "image/")
.replace("V", "video/")
.replace("A", "application/")
.replace("T", "text/")
.replace("U", "audio/")
.replace("F", "font/")
.replace(",", "+")
.replace("X", "x-")
}
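Round-trip sketch of the MIME (de)serialization used for the `m` query-string argument:
serializeMimes(["image/jpeg", "video/x-matroska"]);  // => "IjpegVXmatroska"
deserializeMimes("IjpegVXmatroska");                 // => ["image/jpeg", "video/x-matroska"]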
export function randomSeed() {
return Math.round(Math.random() * 100000);
}
export function sid(doc) {
if (doc._id.includes(".")) {
return doc._id
}
const num = BigInt(doc._id);
const indexId = (num >> BigInt(32));
const docId = num & BigInt(0xFFFFFFFF);
return indexId.toString(16).padStart(8, "0") + "." + docId.toString(16).padStart(8, "0");
}
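sid() packs and unpacks the composite document id used by the new endpoints: a numeric Elasticsearch _id holds the index id in the upper 32 bits and the document id in the lower 32 bits, rendered as "indexId.docId" in hex:
sid({_id: "4294967297"});         // (1n << 32n) + 1n  => "00000001.00000001"
sid({_id: "00000001.00000001"});  // already a sid, returned as-is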

View File

@@ -1,177 +0,0 @@
import {EsHit} from "@/Sist2Api";
export function ext(hit: EsHit) {
return srcExt(hit._source)
}
export function srcExt(src) {
return Object.prototype.hasOwnProperty.call(src, "extension")
&& src["extension"] !== "" ? "." + src["extension"] : "";
}
export function strUnescape(str: string): string {
let result = "";
for (let i = 0; i < str.length; i++) {
const c = str[i];
const next = str[i + 1];
if (c === "]") {
if (next === "]") {
result += c;
i += 1;
} else {
result += String.fromCharCode(parseInt(str.slice(i, i + 2), 16));
i += 2;
}
} else {
result += c;
}
}
return result;
}
const thresh = 1000;
const units = ["k", "M", "G", "T", "P", "E", "Z", "Y"];
export function humanFileSize(bytes: number): string {
if (bytes === 0) {
return "0 B"
}
if (Math.abs(bytes) < thresh) {
return bytes + ' B';
}
let u = -1;
do {
bytes /= thresh;
++u;
} while (Math.abs(bytes) >= thresh && u < units.length - 1);
return bytes.toFixed(1) + units[u];
}
export function humanTime(sec_num: number): string {
sec_num = Math.floor(sec_num);
const hours = Math.floor(sec_num / 3600);
const minutes = Math.floor((sec_num - (hours * 3600)) / 60);
const seconds = sec_num - (hours * 3600) - (minutes * 60);
if (sec_num < 60) {
return `${sec_num}s`
}
if (sec_num < 3600) {
return `${minutes < 10 ? "0" : ""}${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
return `${hours < 10 ? "0" : ""}${hours}:${minutes < 10 ? "0" : ""}${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}
export function humanDate(numMilis: number): string {
const date = (new Date(numMilis * 1000));
return date.getUTCFullYear() + "-" + ("0" + (date.getUTCMonth() + 1)).slice(-2) + "-" + ("0" + date.getUTCDate()).slice(-2)
}
export function lum(c: string) {
c = c.substring(1);
const rgb = parseInt(c, 16);
const r = (rgb >> 16) & 0xff;
const g = (rgb >> 8) & 0xff;
const b = (rgb >> 0) & 0xff;
return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}
export function getSelectedTreeNodes(tree: any) {
const selectedNodes = new Set();
const selected = tree.selected();
for (let i = 0; i < selected.length; i++) {
if (selected[i].id === "any") {
return ["any"]
}
//Only get children
if (selected[i].text.indexOf("(") !== -1) {
if (selected[i].values) {
selectedNodes.add(selected[i].values.slice(-1)[0]);
} else {
selectedNodes.add(selected[i].id);
}
}
}
return Array.from(selectedNodes);
}
export function getTreeNodeAttributes(tree: any) {
const nodes = tree.selectable();
const attributes = {};
for (let i = 0; i < nodes.length; i++) {
let id = null;
if (nodes[i].text.indexOf("(") !== -1 && nodes[i].values) {
id = nodes[i].values.slice(-1)[0];
} else {
id = nodes[i].id
}
attributes[id] = {
checked: nodes[i].itree.state.checked,
collapsed: nodes[i].itree.state.collapsed,
}
}
return attributes;
}
export function serializeMimes(mimes: string[]): string | undefined {
if (mimes.length == 0) {
return undefined;
}
return mimes.map(mime => compressMime(mime)).join("");
}
export function deserializeMimes(mimeString: string): string[] {
return mimeString
.replaceAll(/([IVATUF])/g, "$$$&")
.split("$")
.map(mime => decompressMime(mime))
.slice(1) // Ignore the first (empty) token
}
export function compressMime(mime: string): string {
return mime
.replace("image/", "I")
.replace("video/", "V")
.replace("application/", "A")
.replace("text/", "T")
.replace("audio/", "U")
.replace("font/", "F")
.replace("+", ",")
.replace("x-", "X")
}
export function decompressMime(mime: string): string {
return mime
.replace("I", "image/")
.replace("V", "video/")
.replace("A", "application/")
.replace("T", "text/")
.replace("U", "audio/")
.replace("F", "font/")
.replace(",", "+")
.replace("X", "x-")
}
export function randomSeed(): number {
return Math.round(Math.random() * 100000);
}

View File

@@ -81,6 +81,7 @@
<li><code>doc.artist</code></li>
<li><code>doc.title</code></li>
<li><code>doc.genre</code></li>
<li><code>doc.media_comment</code></li>
<li><code>doc.album_artist</code></li>
<li><code>doc.exif_make</code></li>
<li><code>doc.exif_model</code></li>
@@ -136,7 +137,7 @@
{{ $t("opt.fuzzy") }}
</b-form-checkbox>
<b-form-checkbox :disabled="uiSqliteMode" :checked="optSearchInPath" @input="setOptSearchInPath">{{
<b-form-checkbox :checked="optSearchInPath" @input="setOptSearchInPath">{{
$t("opt.searchInPath")
}}
</b-form-checkbox>

View File

@@ -1,142 +1,142 @@
<template>
<div style="margin-left: auto; margin-right: auto;" class="container">
<Preloader v-if="loading"></Preloader>
<b-card v-else-if="!loading && found">
<b-card-title :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
</b-card-title>
<div style="margin-left: auto; margin-right: auto;" class="container">
<Preloader v-if="loading"></Preloader>
<b-card v-else-if="!loading && found">
<b-card-title :title="doc._source.name + ext(doc)">
{{ doc._source.name + ext(doc) }}
</b-card-title>
<!-- Thumbnail-->
<div style="position: relative; margin-left: auto; margin-right: auto; text-align: center">
<FullThumbnail :doc="doc" :small-badge="false" @onThumbnailClick="onThumbnailClick()"></FullThumbnail>
</div>
<!-- Thumbnail-->
<div style="position: relative; margin-left: auto; margin-right: auto; text-align: center">
<FullThumbnail :doc="doc" :small-badge="false" @onThumbnailClick="onThumbnailClick()"></FullThumbnail>
</div>
<!-- Audio player-->
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls
:type="doc._source.mime"
:src="`f/${doc._source.index}/${doc._id}`"></audio>
<!-- Audio player-->
<audio v-if="doc._props.isAudio" ref="audio" preload="none" class="audio-fit fit" controls
:type="doc._source.mime"
:src="`f/${sid(doc)}`"></audio>
<InfoTable :doc="doc" v-if="doc"></InfoTable>
<InfoTable :doc="doc" v-if="doc"></InfoTable>
<div v-if="doc._source.content" class="content-div">{{ doc._source.content }}</div>
</b-card>
<div v-else>
<b-card>
<b-card-title>{{ $t("filePage.notFound") }}</b-card-title>
</b-card>
<div v-if="doc._source.content" class="content-div">{{ doc._source.content }}</div>
</b-card>
<div v-else>
<b-card>
<b-card-title>{{ $t("filePage.notFound") }}</b-card-title>
</b-card>
</div>
</div>
</div>
</template>
<script>
import Preloader from "@/components/Preloader.vue";
import InfoTable from "@/components/InfoTable.vue";
import Sist2Api from "@/Sist2Api";
import {ext} from "@/util";
import {ext, sid} from "@/util";
import Vue from "vue";
import sist2 from "@/Sist2Api";
import FullThumbnail from "@/components/FullThumbnail";
export default Vue.extend({
name: "FilePage",
components: {
FullThumbnail,
Preloader,
InfoTable
},
data() {
return {
loading: true,
found: false,
doc: null
}
},
methods: {
ext: ext,
onThumbnailClick() {
window.open(`/f/${this.doc.index}/${this.doc._id}`, "_blank");
name: "FilePage",
components: {
FullThumbnail,
Preloader,
InfoTable
},
findByCustomField(field, id) {
return {
query: {
bool: {
must: [
{
match: {
[field]: id
}
}
]
}
},
size: 1
}
data() {
return {
loading: true,
found: false,
doc: null
}
},
findById(id) {
return {
query: {
bool: {
must: [
{
match: {
"_id": id
}
}
]
}
methods: {
ext: ext,
sid: sid,
onThumbnailClick() {
window.open(`/f/${sid(this.doc)}`, "_blank");
},
size: 1
}
},
findByName(name) {
return {
query: {
bool: {
must: [
{
match: {
"name": name
}
}
]
}
findByCustomField(field, id) {
return {
query: {
bool: {
must: [
{
match: {
[field]: id
}
}
]
}
},
size: 1
}
},
size: 1
}
}
},
mounted() {
let query = null;
if (this.$route.query.byId) {
query = this.findById(this.$route.query.byId);
} else if (this.$route.query.byName) {
query = this.findByName(this.$route.query.byName);
} else if (this.$route.query.by && this.$route.query.q) {
query = this.findByCustomField(this.$route.query.by, this.$route.query.q)
}
if (query) {
Sist2Api.esQuery(query).then(result => {
if (result.hits.hits.length === 0) {
this.found = false;
} else {
this.doc = result.hits.hits[0];
this.found = true;
findById(id) {
return {
query: {
bool: {
must: [
{
match: {
"_id": id
}
}
]
}
},
size: 1
}
},
findByName(name) {
return {
query: {
bool: {
must: [
{
match: {
"name": name
}
}
]
}
},
size: 1
}
}
this.loading = false;
});
} else {
this.loading = false;
this.found = false;
},
mounted() {
let query = null;
if (this.$route.query.byId) {
query = this.findById(this.$route.query.byId);
} else if (this.$route.query.byName) {
query = this.findByName(this.$route.query.byName);
} else if (this.$route.query.by && this.$route.query.q) {
query = this.findByCustomField(this.$route.query.by, this.$route.query.q)
}
if (query) {
Sist2Api.esQuery(query).then(result => {
if (result.hits.hits.length === 0) {
this.found = false;
} else {
this.doc = result.hits.hits[0];
this.found = true;
}
this.loading = false;
});
} else {
this.loading = false;
this.found = false;
}
}
}
});
</script>
<style scoped>
.img-wrapper {
display: inline-block;
display: inline-block;
}
</style>

View File

@@ -60,6 +60,7 @@
</template>
<script>
import {sid} from "@/util";
import Preloader from "@/components/Preloader.vue";
import {mapActions, mapGetters, mapMutations} from "vuex";
import SearchBar from "@/components/SearchBar.vue";
@@ -92,7 +93,6 @@ export default Vue.extend({
SizeSlider, PathTree, ResultsCard, MimePicker, Lightbox, DocCardWall, IndexPicker, SearchBar, Preloader
},
data: () => ({
loading: false,
uiLoading: true,
search: undefined,
docs: [],
@@ -105,6 +105,9 @@ export default Vue.extend({
}),
computed: {
...mapGetters(["indices", "optDisplay"]),
hasEmbeddings() {
return Sist2Api.models().length > 0;
},
},
mounted() {
// Handle touch events
@@ -172,12 +175,6 @@ export default Vue.extend({
setDateBoundsMax: "setDateBoundsMax",
setTags: "setTags",
}),
hasEmbeddings() {
if (!this.loading) {
return false;
}
return Sist2Api.models().some();
},
showErrorToast() {
this.$bvToast.toast(
this.$t("toast.esConnErr"),
@@ -248,9 +245,9 @@ export default Vue.extend({
if (hit._props.isPlayableImage || hit._props.isPlayableVideo) {
hit._seq = await this.$store.dispatch("getKeySequence");
this.$store.commit("addLightboxSource", {
source: `f/${hit._source.index}/${hit._id}`,
thumbnail: hit._props.hasThumbnail
? `t/${hit._source.index}/${hit._id}`
source: `f/${sid(hit)}`,
thumbnail_count: hit._props.hasThumbnail
? `t/${sid(hit)}`
: null,
caption: {
component: LightboxCaption,

View File

@@ -1,40 +0,0 @@
{
"compilerOptions": {
"target": "esnext",
"module": "esnext",
"strict": false,
"jsx": "preserve",
"importHelpers": true,
"moduleResolution": "node",
"experimentalDecorators": true,
"skipLibCheck": true,
"esModuleInterop": true,
"allowSyntheticDefaultImports": true,
"sourceMap": false,
"baseUrl": ".",
"types": [
"webpack-env",
],
"paths": {
"@/*": [
"src/*"
]
},
"lib": [
"esnext",
"dom",
"dom.iterable",
"scripthost"
]
},
"include": [
"src/**/*.ts",
"src/**/*.tsx",
"src/**/*.vue",
"tests/**/*.ts",
"tests/**/*.tsx"
],
"exclude": [
"node_modules"
]
}

View File

@@ -1,3 +1,5 @@
#!/usr/bin/env bash
export NODE_OPTIONS=--openssl-legacy-provider
./node_modules/@vue/cli-service/bin/vue-cli-service.js build --watch

View File

@@ -74,6 +74,21 @@ void sqlite_index_args_destroy(sqlite_index_args_t *args) {
free(args);
}
char *add_trailing_slash(char *abs_path) {
if (strcmp(abs_path, "/") == 0) {
// Special case: don't add trailing slash for "/"
return abs_path;
}
char *new_abs_path = realloc(abs_path, strlen(abs_path) + 2);
if (new_abs_path == NULL) {
LOG_FATALF("cli.c", "FIXME: realloc() failed for abs_path=%s", abs_path);
}
strcat(new_abs_path, "/");
return new_abs_path;
}
int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
if (argc < 2) {
fprintf(stderr, "Required positional argument: PATH.\n");
@@ -83,15 +98,10 @@ int scan_args_validate(scan_args_t *args, int argc, const char **argv) {
char *abs_path = abspath(argv[1]);
if (abs_path == NULL) {
LOG_FATALF("cli.c", "Invalid PATH argument. File not found: %s", argv[1]);
} else {
char *new_abs_path = realloc(abs_path, strlen(abs_path) + 2);
if (new_abs_path == NULL) {
LOG_FATALF("cli.c", "FIXME: realloc() failed for argv[1]=%s, abs_path=%s", argv[1], abs_path);
}
strcat(new_abs_path, "/");
args->path = new_abs_path;
}
args->path = add_trailing_slash(abs_path);
if (args->tn_quality == OPTION_VALUE_UNSPECIFIED) {
args->tn_quality = DEFAULT_QUALITY;
} else if (args->tn_quality < 0 || args->tn_quality > 100) {

View File

@@ -1,8 +1,6 @@
#include "ctx.h"
ScanCtx_t ScanCtx = {
.stat_index_size = 0,
.stat_tn_size = 0,
.pool = NULL,
.index.path = {0,},
};

View File

@@ -31,9 +31,6 @@ typedef struct {
int depth;
int calculate_checksums;
size_t stat_tn_size;
size_t stat_index_size;
pcre *exclude;
pcre_extra *exclude_extra;
int fast;

View File

@@ -4,6 +4,7 @@
#include <string.h>
#include <pthread.h>
#include "src/util.h"
#include "src/parsing/mime.h"
#include <time.h>
@@ -64,9 +65,11 @@ static int sep_rfind(const char *str) {
}
void path_parent_func(sqlite3_context *ctx, int argc, sqlite3_value **argv) {
#ifdef SIST_DEBUG
if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_TEXT) {
sqlite3_result_error(ctx, "Invalid parameters", -1);
}
#endif
const char *value = (const char *) sqlite3_value_text(argv[0]);
@@ -82,28 +85,27 @@ void path_parent_func(sqlite3_context *ctx, int argc, sqlite3_value **argv) {
}
void random_func(sqlite3_context *ctx, int argc, UNUSED(sqlite3_value **argv)) {
#ifdef SIST_DEBUG
if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_INTEGER) {
sqlite3_result_error(ctx, "Invalid parameters", -1);
}
#endif
char state_buf[128] = {0,};
struct random_data buf;
int result;
char state_buf[8] = {0,};
long seed = sqlite3_value_int64(argv[0]);
initstate_r((int) seed, state_buf, sizeof(state_buf), &buf);
initstate((int) seed, state_buf, sizeof(state_buf));
random_r(&buf, &result);
sqlite3_result_int(ctx, result);
sqlite3_result_int(ctx, (int) random());
}
void save_current_job_info(sqlite3_context *ctx, int argc, sqlite3_value **argv) {
#ifdef SIST_DEBUG
if (argc != 1 || sqlite3_value_type(argv[0]) != SQLITE_TEXT) {
sqlite3_result_error(ctx, "Invalid parameters", -1);
}
#endif
database_ipc_ctx_t *ipc_ctx = sqlite3_user_data(ctx);
@@ -146,6 +148,12 @@ void database_open(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "PRAGMA temp_store = memory;", NULL, NULL, NULL));
}
#ifdef SIST_DEBUG
// CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "PRAGMA foreign_keys = ON;", NULL, NULL, NULL));
#else
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "PRAGMA ignore_check_constraints = ON;", NULL, NULL, NULL));
#endif
if (db->type == INDEX_DATABASE) {
// Prepare statements;
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
@@ -154,16 +162,15 @@ void database_open(database_t *db) {
&db->select_thumbnail_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"UPDATE document SET marked=1 WHERE id=? AND mtime=? RETURNING id",
"UPDATE marked SET marked=1 WHERE id=(SELECT ROWID FROM document WHERE path=?) AND mtime=? RETURNING id",
-1,
&db->mark_document_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"REPLACE INTO document_sidecar (id, json_data) VALUES (?,?)", -1,
&db->write_document_sidecar_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"REPLACE INTO document (id, mtime, size, json_data, version) VALUES (?, ?, ?, ?, (SELECT max(id) FROM version));",
"INSERT INTO document (path, parent, mime, mtime, size, thumbnail_count, json_data, version) "
"VALUES (?, (SELECT id FROM document WHERE path=?), ?, ?, ?, ?, ?, (SELECT max(id) FROM version)) "
"ON CONFLICT (path) DO UPDATE SET json_data=excluded.json_data "
"RETURNING id;",
-1,
&db->write_document_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
@@ -173,7 +180,12 @@ void database_open(database_t *db) {
&db->write_thumbnail_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db, "SELECT json_data FROM document WHERE id=?", -1,
db->db, "SELECT json_set(json_data, "
"'$._id', CAST (doc.id AS TEXT),"
"'$.thumbnail', doc.thumbnail_count,"
"'$.mime', m.name,"
"'$.size', doc.size"
") FROM document doc LEFT JOIN mime m ON m.id=doc.mime WHERE doc.id=?", -1,
&db->get_document, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
@@ -184,6 +196,12 @@ void database_open(database_t *db) {
db->db, "SELECT embedding FROM embedding WHERE id=? AND model_id=? AND start=0", -1,
&db->get_embedding, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"INSERT INTO tag (id, tag) VALUES (?,?) ON CONFLICT DO NOTHING;",
-1,
&db->write_tag_stmt, NULL));
// Create functions
sqlite3_create_function(
db->db,
@@ -228,7 +246,7 @@ void database_open(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"DELETE FROM index_job WHERE id = (SELECT MIN(id) FROM index_job)"
" RETURNING doc_id,type,line;",
" RETURNING sid,type,line;",
-1, &db->pop_index_job_stmt, NULL
));
@@ -243,7 +261,7 @@ void database_open(database_t *db) {
db->db, "INSERT INTO parse_job (filepath,mtime,st_size) VALUES (?,?,?);", -1,
&db->insert_parse_job_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db, "INSERT INTO index_job (doc_id,type,line) VALUES (?,?,?);", -1,
db->db, "INSERT INTO index_job (sid,type,line) VALUES (?,?,?);", -1,
&db->insert_index_job_stmt, NULL));
} else if (db->type == FTS_DATABASE) {
@@ -294,6 +312,12 @@ void database_open(database_t *db) {
db->db, "SELECT mime, sum(count) FROM mime_index WHERE mime is not NULL GROUP BY mime", -1,
&db->fts_get_mimetypes, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"INSERT INTO tag (id, index_id, tag) VALUES (?,?,?) ON CONFLICT DO NOTHING;",
-1,
&db->fts_write_tag_stmt, NULL));
sqlite3_create_function(
db->db,
"random_seeded",
@@ -340,13 +364,6 @@ void database_open(database_t *db) {
}
if (db->type == FTS_DATABASE || db->type == INDEX_DATABASE) {
// Tag table is the same schema for FTS database & index database
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"INSERT INTO tag (id, tag) VALUES (?,?) ON CONFLICT DO NOTHING;",
-1,
&db->write_tag_stmt, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare_v2(
db->db,
"DELETE FROM tag WHERE id=? AND tag=?;",
@@ -356,7 +373,7 @@ void database_open(database_t *db) {
}
void database_close(database_t *db, int optimize) {
LOG_DEBUGF("database.c", "Closing database %s", db->filename);
LOG_DEBUGF("database.c", "Closing database %s (%p)", db->filename, db->db);
if (optimize) {
LOG_DEBUG("database.c", "Optimizing database");
@@ -376,8 +393,8 @@ void database_close(database_t *db, int optimize) {
db = NULL;
}
void *database_read_thumbnail(database_t *db, const char *id, int num, size_t *return_value_len) {
sqlite3_bind_text(db->select_thumbnail_stmt, 1, id, -1, SQLITE_STATIC);
void *database_read_thumbnail(database_t *db, int doc_id, int num, size_t *return_value_len) {
sqlite3_bind_int(db->select_thumbnail_stmt, 1, doc_id);
sqlite3_bind_int(db->select_thumbnail_stmt, 2, num);
int ret = sqlite3_step(db->select_thumbnail_stmt);
@@ -410,7 +427,7 @@ void database_write_index_descriptor(database_t *db, index_descriptor_t *desc) {
sqlite3_prepare_v2(db->db, "INSERT INTO descriptor (id, version_major, version_minor, version_patch,"
" root, name, rewrite_url, timestamp) VALUES (?,?,?,?,?,?,?,?);", -1, &stmt, NULL);
sqlite3_bind_text(stmt, 1, desc->id, -1, SQLITE_STATIC);
sqlite3_bind_int(stmt, 1, desc->id);
sqlite3_bind_int(stmt, 2, desc->version_major);
sqlite3_bind_int(stmt, 3, desc->version_minor);
sqlite3_bind_int(stmt, 4, desc->version_patch);
@@ -433,7 +450,7 @@ index_descriptor_t *database_read_index_descriptor(database_t *db) {
CRASH_IF_STMT_FAIL(sqlite3_step(stmt));
const char *id = (char *) sqlite3_column_text(stmt, 0);
int id = sqlite3_column_int(stmt, 0);
int v_major = sqlite3_column_int(stmt, 1);
int v_minor = sqlite3_column_int(stmt, 2);
int v_patch = sqlite3_column_int(stmt, 3);
@@ -443,7 +460,7 @@ index_descriptor_t *database_read_index_descriptor(database_t *db) {
int timestamp = sqlite3_column_int(stmt, 7);
index_descriptor_t *desc = malloc(sizeof(index_descriptor_t));
strcpy(desc->id, id);
desc->id = id;
snprintf(desc->version, sizeof(desc->version), "%d.%d.%d", v_major, v_minor, v_patch);
desc->version_major = v_major;
desc->version_minor = v_minor;
@@ -461,7 +478,8 @@ index_descriptor_t *database_read_index_descriptor(database_t *db) {
database_iterator_t *database_create_delete_list_iterator(database_t *db) {
sqlite3_stmt *stmt;
sqlite3_prepare_v2(db->db, "SELECT id FROM delete_list;", -1, &stmt, NULL);
sqlite3_prepare_v2(db->db, "SELECT doc.id FROM delete_list "
"INNER JOIN document doc ON doc.ROWID = delete_list.id;", -1, &stmt, NULL);
database_iterator_t *iter = malloc(sizeof(database_iterator_t));
@@ -471,14 +489,11 @@ database_iterator_t *database_create_delete_list_iterator(database_t *db) {
return iter;
}
char *database_delete_list_iter(database_iterator_t *iter) {
int database_delete_list_iter(database_iterator_t *iter) {
int ret = sqlite3_step(iter->stmt);
if (ret == SQLITE_ROW) {
const char *id = (const char *) sqlite3_column_text(iter->stmt, 0);
char *id_heap = malloc(strlen(id) + 1);
strcpy(id_heap, id);
return id_heap;
return sqlite3_column_int(iter->stmt, 0);
}
if (ret != SQLITE_DONE) {
@@ -491,7 +506,7 @@ char *database_delete_list_iter(database_iterator_t *iter) {
iter->stmt = NULL;
return NULL;
return 0;
}
database_iterator_t *database_create_document_iterator(database_t *db) {
@@ -501,27 +516,31 @@ database_iterator_t *database_create_document_iterator(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(
sqlite3_prepare_v2(
db->db,
"WITH doc (j) AS (SELECT CASE"
" WHEN emb.embedding IS NULL THEN"
" json_set(document.json_data, "
" '$._id', document.id, "
" '$.size', document.size, "
" '$.mtime', document.mtime, "
" '$.tag', json_group_array((SELECT tag FROM tag WHERE document.id = tag.id)))"
" ELSE"
" json_set(document.json_data,"
" '$._id', document.id,"
" '$.size', document.size,"
" '$.mtime', document.mtime,"
" '$.tag', json_group_array((SELECT tag FROM tag WHERE document.id = tag.id)),"
" '$.emb', json_group_object(m.path, json(emb_to_json(emb.embedding))),"
" '$.embedding', 1)"
" END"
"WITH doc (id, j) AS ("
"SELECT"
" document.id,"
" json_set(document.json_data,"
" '$._id', document.id,"
" '$.index', (SELECT id FROM descriptor),"
" '$.size', document.size,"
" '$.mtime', document.mtime,"
" '$.mime', mim.name,"
" '$.thumbnail', document.thumbnail_count,"
" '$.tag', json_group_array(t.tag))"
" FROM document"
" LEFT JOIN embedding emb ON document.id = emb.id"
" LEFT JOIN model m ON emb.model_id = m.id"
" LEFT JOIN mime mim ON mim.id = document.mime"
" LEFT JOIN tag t ON t.id = document.id"
" GROUP BY document.id)"
" SELECT json_set(j, '$.index', (SELECT id FROM descriptor)) FROM doc",
"SELECT CASE"
" WHEN emb.embedding IS NULL THEN j"
" ELSE json_set(j,"
" '$.emb', json_group_object(m.path, json(emb_to_json(emb.embedding))),"
" '$.embedding', 1"
" ) END"
" FROM doc"
" LEFT JOIN embedding emb ON doc.id = emb.id"
" LEFT JOIN model m ON emb.model_id = m.id"
" GROUP BY doc.id",
-1, &stmt, NULL));
database_iterator_t *iter = malloc(sizeof(database_iterator_t));
@@ -573,43 +592,49 @@ cJSON *database_document_iter(database_iterator_t *iter) {
cJSON *database_incremental_scan_begin(database_t *db) {
LOG_DEBUG("database.c", "Preparing database for incremental scan");
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "UPDATE document SET marked=0;", NULL, NULL, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "DELETE FROM marked;", NULL, NULL, NULL));
LOG_DEBUG("database.c", "Preparing database for incremental scan (create marked table)");
CRASH_IF_NOT_SQLITE_OK(
sqlite3_exec(db->db, "INSERT INTO marked SELECT id, 0, mtime FROM document;", NULL, NULL, NULL));
}
cJSON *database_incremental_scan_end(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"DELETE FROM delete_list WHERE id IN (SELECT id FROM document WHERE marked=1);",
"DELETE FROM delete_list WHERE id IN (SELECT id FROM marked WHERE marked = 1);",
NULL, NULL, NULL
));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"DELETE FROM thumbnail WHERE id IN (SELECT id FROM document WHERE marked=0);",
"DELETE FROM thumbnail WHERE EXISTS ("
" SELECT document.id FROM document INNER JOIN marked m ON m.id = document.ROWID"
" WHERE marked=0 and document.id = thumbnail.id)",
NULL, NULL, NULL
));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"INSERT INTO delete_list (id) SELECT id FROM document WHERE marked=0;",
"INSERT INTO delete_list (id) "
"SELECT id FROM marked WHERE marked=0 ON CONFLICT DO NOTHING;",
NULL, NULL, NULL
));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"DELETE FROM document_sidecar WHERE id IN (SELECT id FROM document WHERE marked=0);",
"DELETE FROM document WHERE ROWID IN (SELECT id FROM marked WHERE marked=0);",
NULL, NULL, NULL
));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"DELETE FROM document WHERE marked=0;",
"DELETE FROM marked;",
NULL, NULL, NULL
));
}
int database_mark_document(database_t *db, const char *id, int mtime) {
sqlite3_bind_text(db->mark_document_stmt, 1, id, -1, SQLITE_STATIC);
int database_mark_document(database_t *db, const char *path, int mtime) {
sqlite3_bind_text(db->mark_document_stmt, 1, path, -1, SQLITE_STATIC);
sqlite3_bind_int(db->mark_document_stmt, 2, mtime);
pthread_mutex_lock(&db->ipc_ctx->index_db_mutex);
@@ -631,31 +656,38 @@ int database_mark_document(database_t *db, const char *id, int mtime) {
CRASH_IF_STMT_FAIL(ret);
}
void database_write_document(database_t *db, document_t *doc, const char *json_data) {
sqlite3_bind_text(db->write_document_stmt, 1, doc->doc_id, -1, SQLITE_STATIC);
sqlite3_bind_int(db->write_document_stmt, 2, doc->mtime);
sqlite3_bind_int64(db->write_document_stmt, 3, (long) doc->size);
sqlite3_bind_text(db->write_document_stmt, 4, json_data, -1, SQLITE_STATIC);
int database_write_document(database_t *db, document_t *doc, const char *json_data) {
const char *rel_path = doc->filepath + ScanCtx.index.desc.root_len;
const char *parent_rel_path = doc->parent[0] != '\0'
? doc->parent + ScanCtx.index.desc.root_len
: NULL;
// path, parent, mime, mtime, size, thumbnail_count, json_data
sqlite3_bind_text(db->write_document_stmt, 1, rel_path, -1, SQLITE_STATIC);
sqlite3_bind_text(db->write_document_stmt, 2, parent_rel_path, -1, SQLITE_STATIC);
sqlite3_bind_int64(db->write_document_stmt, 3, doc->mime);
sqlite3_bind_int(db->write_document_stmt, 4, doc->mtime);
sqlite3_bind_int64(db->write_document_stmt, 5, (long) doc->size);
sqlite3_bind_int(db->write_document_stmt, 6, doc->thumbnail_count);
if (json_data) {
sqlite3_bind_text(db->write_document_stmt, 7, json_data, -1, SQLITE_STATIC);
} else {
sqlite3_bind_null(db->write_document_stmt, 7);
}
pthread_mutex_lock(&db->ipc_ctx->index_db_mutex);
CRASH_IF_STMT_FAIL(sqlite3_step(db->write_document_stmt));
int id = sqlite3_column_int(db->write_document_stmt, 0);
CRASH_IF_NOT_SQLITE_OK(sqlite3_reset(db->write_document_stmt));
pthread_mutex_unlock(&db->ipc_ctx->index_db_mutex);
return id;
}
void database_write_document_sidecar(database_t *db, const char *id, const char *json_data) {
sqlite3_bind_text(db->write_document_sidecar_stmt, 1, id, -1, SQLITE_STATIC);
sqlite3_bind_text(db->write_document_sidecar_stmt, 2, json_data, -1, SQLITE_STATIC);
pthread_mutex_lock(&db->ipc_ctx->index_db_mutex);
CRASH_IF_STMT_FAIL(sqlite3_step(db->write_document_sidecar_stmt));
CRASH_IF_NOT_SQLITE_OK(sqlite3_reset(db->write_document_sidecar_stmt));
pthread_mutex_unlock(&db->ipc_ctx->index_db_mutex);
}
void database_write_thumbnail(database_t *db, const char *id, int num, void *data, size_t data_size) {
sqlite3_bind_text(db->write_thumbnail_stmt, 1, id, -1, SQLITE_STATIC);
void database_write_thumbnail(database_t *db, int doc_id, int num, void *data, size_t data_size) {
sqlite3_bind_int(db->write_thumbnail_stmt, 1, doc_id);
sqlite3_bind_int(db->write_thumbnail_stmt, 2, num);
sqlite3_bind_blob(db->write_thumbnail_stmt, 3, data, (int) data_size, SQLITE_STATIC);
@@ -716,7 +748,7 @@ job_t *database_get_work(database_t *db, job_type_t job_type) {
} else {
job->bulk_line = malloc(sizeof(es_bulk_line_t));
}
strcpy(job->bulk_line->doc_id, (const char *) sqlite3_column_text(db->pop_index_job_stmt, 0));
strcpy(job->bulk_line->sid, (const char *) sqlite3_column_text(db->pop_index_job_stmt, 0));
job->bulk_line->type = sqlite3_column_int(db->pop_index_job_stmt, 1);
job->bulk_line->next = NULL;
@@ -767,7 +799,7 @@ void database_add_work(database_t *db, job_t *job) {
} while (ret != SQLITE_DONE && ret != SQLITE_OK);
} else if (job->type == JOB_BULK_LINE) {
do {
sqlite3_bind_text(db->insert_index_job_stmt, 1, job->bulk_line->doc_id, -1, SQLITE_STATIC);
sqlite3_bind_text(db->insert_index_job_stmt, 1, job->bulk_line->sid, -1, SQLITE_STATIC);
sqlite3_bind_int(db->insert_index_job_stmt, 2, job->bulk_line->type);
if (job->bulk_line->type != ES_BULK_LINE_DELETE) {
sqlite3_bind_text(db->insert_index_job_stmt, 3, job->bulk_line->line, -1, SQLITE_STATIC);
@@ -808,24 +840,25 @@ void database_add_work(database_t *db, job_t *job) {
pthread_mutex_unlock(&db->ipc_ctx->mutex);
}
void database_write_tag(database_t *db, char *doc_id, char *tag) {
sqlite3_bind_text(db->write_tag_stmt, 1, doc_id, -1, SQLITE_STATIC);
sqlite3_bind_text(db->write_tag_stmt, 2, tag, -1, SQLITE_STATIC);
void database_write_tag(database_t *db, long sid, char *tag) {
sqlite3_bind_int64(db->write_tag_stmt, 1, sid);
sqlite3_bind_int(db->write_tag_stmt, 2, (int) (sid >> 32));
sqlite3_bind_text(db->write_tag_stmt, 3, tag, -1, SQLITE_STATIC);
CRASH_IF_STMT_FAIL(sqlite3_step(db->write_tag_stmt));
CRASH_IF_NOT_SQLITE_OK(sqlite3_reset(db->write_tag_stmt));
}
void database_delete_tag(database_t *db, char *doc_id, char *tag) {
sqlite3_bind_text(db->delete_tag_stmt, 1, doc_id, -1, SQLITE_STATIC);
void database_delete_tag(database_t *db, long sid, char *tag) {
sqlite3_bind_int64(db->delete_tag_stmt, 1, sid);
sqlite3_bind_text(db->delete_tag_stmt, 2, tag, -1, SQLITE_STATIC);
CRASH_IF_STMT_FAIL(sqlite3_step(db->delete_tag_stmt));
CRASH_IF_NOT_SQLITE_OK(sqlite3_reset(db->delete_tag_stmt));
}
cJSON *database_get_document(database_t *db, char *doc_id) {
sqlite3_bind_text(db->get_document, 1, doc_id, -1, SQLITE_STATIC);
cJSON *database_get_document(database_t *db, int doc_id) {
sqlite3_bind_int(db->get_document, 1, doc_id);
int ret = sqlite3_step(db->get_document);
CRASH_IF_STMT_FAIL(ret);
@@ -833,7 +866,7 @@ cJSON *database_get_document(database_t *db, char *doc_id) {
cJSON *json;
if (ret == SQLITE_ROW) {
const char *json_str = sqlite3_column_text(db->get_document, 0);
const char *json_str = (char *) sqlite3_column_text(db->get_document, 0);
json = cJSON_Parse(json_str);
} else {
json = NULL;
@@ -847,4 +880,24 @@ cJSON *database_get_document(database_t *db, char *doc_id) {
void database_increment_version(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db, "INSERT INTO version DEFAULT VALUES", NULL, NULL, NULL));
}
void database_sync_mime_table(database_t *db) {
unsigned int *cur = get_mime_ids();
sqlite3_stmt *stmt;
CRASH_IF_NOT_SQLITE_OK(sqlite3_prepare(
db->db,
"REPLACE INTO mime (id, name) VALUES (?,?)", -1, &stmt, NULL));
while (*cur != 0) {
sqlite3_bind_int64(stmt, 1, (long) *cur);
sqlite3_bind_text(stmt, 2, mime_get_mime_text(*cur), -1, NULL);
CRASH_IF_STMT_FAIL(sqlite3_step(stmt));
sqlite3_reset(stmt);
cur += 1;
}
sqlite3_finalize(stmt);
}

View File

@@ -81,7 +81,6 @@ typedef struct database {
sqlite3_stmt *mark_document_stmt;
sqlite3_stmt *write_document_stmt;
sqlite3_stmt *write_document_sidecar_stmt;
sqlite3_stmt *write_thumbnail_stmt;
sqlite3_stmt *get_document;
sqlite3_stmt *get_models;
@@ -103,9 +102,9 @@ typedef struct database {
sqlite3_stmt *fts_get_document;
sqlite3_stmt *fts_suggest_tag;
sqlite3_stmt *fts_get_tags;
sqlite3_stmt *fts_write_tag_stmt;
sqlite3_stmt *fts_model_size;
char **tag_array;
database_ipc_ctx_t *ipc_ctx;
@@ -133,15 +132,15 @@ void database_close(database_t *, int optimize);
void database_increment_version(database_t *db);
void database_write_thumbnail(database_t *db, const char *id, int num, void *data, size_t data_size);
void database_write_thumbnail(database_t *db, int doc_id, int num, void *data, size_t data_size);
void *database_read_thumbnail(database_t *db, const char *id, int num, size_t *return_value_len);
void *database_read_thumbnail(database_t *db, int doc_id, int num, size_t *return_value_len);
void database_write_index_descriptor(database_t *db, index_descriptor_t *desc);
index_descriptor_t *database_read_index_descriptor(database_t *db);
void database_write_document(database_t *db, document_t *doc, const char *json_data);
int database_write_document(database_t *db, document_t *doc, const char *json_data);
database_iterator_t *database_create_document_iterator(database_t *db);
@@ -154,10 +153,10 @@ cJSON *database_document_iter(database_iterator_t *);
database_iterator_t *database_create_delete_list_iterator(database_t *db);
char *database_delete_list_iter(database_iterator_t *iter);
int database_delete_list_iter(database_iterator_t *iter);
#define database_delete_list_iter_foreach(element, iter) \
for (char *(element) = database_delete_list_iter(iter); (element) != NULL; (element) = database_delete_list_iter(iter))
for (int (element) = database_delete_list_iter(iter); (element) != 0; (element) = database_delete_list_iter(iter))
cJSON *database_incremental_scan_begin(database_t *db);
@@ -166,8 +165,6 @@ cJSON *database_incremental_scan_end(database_t *db);
int database_mark_document(database_t *db, const char *id, int mtime);
void database_write_document_sidecar(database_t *db, const char *id, const char *json_data);
database_iterator_t *database_create_treemap_iterator(database_t *db, long threshold);
treemap_row_t database_treemap_iter(database_iterator_t *iter);
@@ -206,7 +203,7 @@ void database_fts_index(database_t *db);
void database_fts_optimize(database_t *db);
cJSON *database_fts_get_paths(database_t *db, const char *index_id, int depth_min, int depth_max, const char *prefix,
cJSON *database_fts_get_paths(database_t *db, int index_id, int depth_min, int depth_max, const char *prefix,
int suggest);
cJSON *database_fts_get_mimetypes(database_t *db);
@@ -215,18 +212,20 @@ database_summary_stats_t database_fts_get_date_range(database_t *db);
cJSON *database_fts_search(database_t *db, const char *query, const char *path, long size_min,
long size_max, long date_min, long date_max, int page_size,
char **index_ids, char **mime_types, char **tags, int sort_asc,
int *index_ids, char **mime_types, char **tags, int sort_asc,
fts_sort_t sort, int seed, char **after, int fetch_aggregations,
int highlight, int highlight_context_size, int model,
const float *embedding, int embedding_size);
void database_write_tag(database_t *db, char *doc_id, char *tag);
void database_write_tag(database_t *db, long sid, char *tag);
void database_delete_tag(database_t *db, char *doc_id, char *tag);
void database_fts_write_tag(database_t *db, long sid, char *tag);
void database_delete_tag(database_t *db, long sid, char *tag);
void database_fts_detach(database_t *db);
cJSON *database_fts_get_document(database_t *db, char *doc_id);
cJSON *database_fts_get_document(database_t *db, long sid);
database_summary_stats_t database_fts_sync_tags(database_t *db);
@@ -234,7 +233,7 @@ cJSON *database_fts_suggest_tag(database_t *db, char *prefix);
cJSON *database_fts_get_tags(database_t *db);
cJSON *database_get_document(database_t *db, char *doc_id);
cJSON *database_get_document(database_t *db, int doc_id);
void cosine_sim_func(sqlite3_context *ctx, int argc, sqlite3_value **argv);
@@ -242,6 +241,8 @@ cJSON *database_get_models(database_t *db);
int database_fts_get_model_size(database_t *db, int model_id);
cJSON *database_get_embedding(database_t *db, char *doc_id, int model_id);
cJSON *database_get_embedding(database_t *db, int doc_id, int model_id);
void database_sync_mime_table(database_t *db);
#endif

View File

@@ -69,9 +69,9 @@ cJSON *database_get_models(database_t *db) {
return json;
}
cJSON *database_get_embedding(database_t *db, char *doc_id, int model_id) {
cJSON *database_get_embedding(database_t *db, int doc_id, int model_id) {
sqlite3_bind_text(db->get_embedding, 1, doc_id, -1, SQLITE_STATIC);
sqlite3_bind_int(db->get_embedding, 1, doc_id);
sqlite3_bind_int(db->get_embedding, 2, model_id);
int ret = sqlite3_step(db->get_embedding);
CRASH_IF_STMT_FAIL(ret);

View File

@@ -42,21 +42,23 @@ void database_fts_index(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"WITH docs AS ("
" SELECT document.id as id, (SELECT id FROM descriptor) as index_id, size,"
" SELECT "
" ((SELECT id FROM descriptor) << 32) | document.id as id,"
" (SELECT id FROM descriptor) as index_id,"
" size,"
" document.json_data ->> 'name' as name,"
" document.json_data ->> 'path' as path,"
" mtime,"
" document.json_data ->> 'mime' as mime,"
" json_set(document.json_data, "
" '$._id',document.id,"
" '$.size',document.size, "
" '$.mtime',document.mtime)"
" m.name as mime,"
" thumbnail_count,"
" document.json_data"
" FROM document"
" LEFT JOIN mime m ON m.id=document.mime"
" )"
" INSERT"
" INTO fts.document_index (id, index_id, size, name, path, mtime, mime, json_data)"
" INTO fts.document_index (id, index_id, size, name, path, mtime, mime, thumbnail_count, json_data)"
" SELECT * FROM docs WHERE true"
" on conflict (id, index_id) do update set "
" on conflict (id) do update set "
" size=excluded.size, mtime=excluded.mtime, mime=excluded.mime, json_data=excluded.json_data;",
NULL, NULL, NULL));
@@ -64,13 +66,14 @@ void database_fts_index(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"REPLACE INTO fts.embedding (id, model_id, start, end, embedding)"
" SELECT id, model_id, start, end, embedding FROM embedding", NULL, NULL, NULL));
"REPLACE INTO fts.model (id, size)"
" SELECT id, size FROM model", NULL, NULL, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"INSERT INTO fts.model (id, size)"
" SELECT id, size FROM model WHERE TRUE ON CONFLICT (id) DO NOTHING", NULL, NULL, NULL));
"REPLACE INTO fts.embedding (id, model_id, start, end, embedding)"
" SELECT (SELECT id FROM descriptor) << 32 | id, model_id, start, end, embedding FROM embedding "
" WHERE TRUE ON CONFLICT (id, model_id, start) DO NOTHING;", NULL, NULL, NULL));
// TODO: delete old embeddings
@@ -99,7 +102,9 @@ void database_fts_index(database_t *db) {
db->db, "DELETE FROM fts.mime_index;", NULL, NULL, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db, "INSERT INTO fts.mime_index (index_id, mime, count) "
"SELECT index_id, mime, count(*) FROM fts.document_index GROUP BY index_id, mime",
"SELECT index_id, mime, count(*) FROM fts.document_index "
"WHERE mime IS NOT NULL "
"GROUP BY index_id, mime",
NULL, NULL, NULL));
LOG_DEBUG("database_fts.c", "Generating path index");
@@ -157,7 +162,8 @@ void database_fts_index(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"INSERT INTO search(rowid, name, content, title) SELECT id, name, content, title from document_view",
"INSERT INTO search(rowid, name, content, title, path) "
"SELECT id, name, content, title, path from document_view",
NULL, NULL, NULL));
}
@@ -172,7 +178,7 @@ void database_fts_optimize(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db, "PRAGMA fts.optimize;", NULL, NULL, NULL));
}
cJSON *database_fts_get_paths(database_t *db, const char *index_id, int depth_min, int depth_max, const char *prefix,
cJSON *database_fts_get_paths(database_t *db, int index_id, int depth_min, int depth_max, const char *prefix,
int suggest) {
sqlite3_stmt *stmt;
@@ -192,7 +198,7 @@ cJSON *database_fts_get_paths(database_t *db, const char *index_id, int depth_mi
} else if (prefix) {
stmt = db->fts_search_paths_w_prefix;
if (index_id) {
sqlite3_bind_text(stmt, 1, index_id, -1, SQLITE_STATIC);
sqlite3_bind_int(stmt, 1, index_id);
} else {
sqlite3_bind_null(stmt, 1);
}
@@ -207,7 +213,7 @@ cJSON *database_fts_get_paths(database_t *db, const char *index_id, int depth_mi
} else {
stmt = db->fts_search_paths;
if (index_id) {
sqlite3_bind_text(stmt, 1, index_id, -1, SQLITE_STATIC);
sqlite3_bind_int(stmt, 1, index_id);
} else {
sqlite3_bind_null(stmt, 1);
}
@@ -290,7 +296,6 @@ const char *date_where_clause(long date_min, long date_max) {
}
int array_length(char **arr) {
if (arr == NULL) {
return 0;
}
@@ -301,6 +306,17 @@ int array_length(char **arr) {
return count;
}
int int_array_length(const int *arr) {
if (arr == NULL) {
return 0;
}
int count = -1;
while (arr[++count] != 0);
return count;
}
#define INDEX_ID_PARAM_OFFSET (10)
#define MIME_PARAM_OFFSET (INDEX_ID_PARAM_OFFSET + 1000)
@@ -351,8 +367,8 @@ char *build_where_clause(const char *path_where, const char *size_where, const c
return where;
}
char *index_ids_where_clause(char **index_ids) {
int param_count = array_length(index_ids);
char *index_ids_where_clause(int *index_ids) {
int param_count = int_array_length(index_ids);
char *clause = malloc(13 + 2 + 6 * param_count);
@@ -483,7 +499,7 @@ int database_fts_get_model_size(database_t *db, int model_id) {
cJSON *database_fts_search(database_t *db, const char *query, const char *path, long size_min,
long size_max, long date_min, long date_max, int page_size,
char **index_ids, char **mime_types, char **tags, int sort_asc,
int *index_ids, char **mime_types, char **tags, int sort_asc,
fts_sort_t sort, int seed, char **after, int fetch_aggregations,
int highlight, int highlight_context_size, int model,
const float *embedding, int embedding_size) {
@@ -524,13 +540,21 @@ cJSON *database_fts_search(database_t *db, const char *query, const char *path,
const char *json_object_sql;
if (highlight && query_where != NULL) {
json_object_sql = "json_set(json_remove(doc.json_data, '$.content'),"
"'$._id', CAST(doc.id AS TEXT),"
"'$.index', doc.index_id,"
"'$.thumbnail', doc.thumbnail_count,"
"'$.mime', doc.mime,"
"'$.size', doc.size,"
"'$.embedding', (CASE WHEN emb.id IS NOT NULL THEN 1 ELSE 0 END),"
"'$._highlight.name', snippet(search, 0, '<mark>', '</mark>', '', ?6),"
"'$._highlight.content', snippet(search, 1, '<mark>', '</mark>', '', ?6))";
} else {
json_object_sql = "json_set(json_remove(doc.json_data, '$.content'),"
"'$._id', CAST(doc.id AS TEXT),"
"'$.index', doc.index_id,"
"'$.thumbnail', doc.thumbnail_count,"
"'$.mime', doc.mime,"
"'$.size', doc.size,"
"'$.embedding', (CASE WHEN emb.id IS NOT NULL THEN 1 ELSE 0 END))";
}
@@ -592,7 +616,7 @@ cJSON *database_fts_search(database_t *db, const char *query, const char *path,
if (index_ids) {
array_foreach(index_ids) {
sqlite3_bind_text(stmt, INDEX_ID_PARAM_OFFSET + i, index_ids[i], -1, SQLITE_STATIC);
sqlite3_bind_int(stmt, INDEX_ID_PARAM_OFFSET + i, index_ids[i]);
}
}
if (mime_types) {
@@ -692,7 +716,7 @@ cJSON *database_fts_search(database_t *db, const char *query, const char *path,
if (index_ids) {
array_foreach(index_ids) {
sqlite3_bind_text(agg_stmt, INDEX_ID_PARAM_OFFSET + i, index_ids[i], -1, SQLITE_STATIC);
sqlite3_bind_int(agg_stmt, INDEX_ID_PARAM_OFFSET + i, index_ids[i]);
}
}
if (mime_types) {
@@ -764,19 +788,20 @@ database_summary_stats_t database_fts_sync_tags(database_t *db) {
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"DELETE FROM fts.tag WHERE"
" (id, tag) NOT IN (SELECT id, tag FROM tag)",
" (id, index_id, tag) NOT IN (SELECT ((SELECT id FROM descriptor) << 32) | id, (SELECT id FROM descriptor), tag FROM tag)"
" AND index_id = (SELECT id FROM descriptor)",
NULL, NULL, NULL));
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(
db->db,
"INSERT INTO fts.tag (id, tag) "
" SELECT id, tag FROM tag "
" WHERE (id, tag) NOT IN (SELECT * FROM fts.tag)",
"INSERT INTO fts.tag (id, index_id, tag) "
" SELECT (((SELECT id FROM descriptor) << 32) | id) as sid, (SELECT id FROM descriptor), tag FROM tag "
" WHERE (sid, tag) NOT IN (SELECT id, tag FROM fts.tag)",
NULL, NULL, NULL));
}
cJSON *database_fts_get_document(database_t *db, char *doc_id) {
sqlite3_bind_text(db->fts_get_document, 1, doc_id, -1, NULL);
cJSON *database_fts_get_document(database_t *db, long sid) {
sqlite3_bind_int64(db->fts_get_document, 1, sid);
int ret = sqlite3_step(db->fts_get_document);
cJSON *json = NULL;
@@ -844,3 +869,11 @@ cJSON *database_fts_get_tags(database_t *db) {
return json;
}
void database_fts_write_tag(database_t *db, long sid, char *tag) {
sqlite3_bind_int64(db->fts_write_tag_stmt, 1, sid);
sqlite3_bind_int(db->fts_write_tag_stmt, 2, (int) (sid >> 32));
sqlite3_bind_text(db->fts_write_tag_stmt, 3, tag, -1, SQLITE_STATIC);
CRASH_IF_STMT_FAIL(sqlite3_step(db->fts_write_tag_stmt));
CRASH_IF_NOT_SQLITE_OK(sqlite3_reset(db->fts_write_tag_stmt));
}
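Note on the new ids: the FTS queries above pack the 32-bit index (descriptor) id into the upper half of a 64-bit integer and the document rowid into the lower half (`(SELECT id FROM descriptor) << 32 | id`), and database_fts_write_tag() recovers the index id with `sid >> 32`. A minimal standalone sketch of that packing follows; make_sid, sid_index_id and sid_doc_id are hypothetical helper names, not part of the project.
#include <stdint.h>
#include <stdio.h>
/* Sketch of the sid layout used by the queries above:
 * upper 32 bits = index (descriptor) id, lower 32 bits = document rowid. */
static int64_t make_sid(uint32_t index_id, uint32_t doc_id) {
    return ((int64_t) index_id << 32) | doc_id;
}
static uint32_t sid_index_id(int64_t sid) {
    return (uint32_t) (sid >> 32); // same shift as the bind in database_fts_write_tag()
}
static uint32_t sid_doc_id(int64_t sid) {
    return (uint32_t) (sid & 0xFFFFFFFF);
}
int main(void) {
    int64_t sid = make_sid(3, 1234);
    printf("sid=%lld index_id=%u doc_id=%u\n",
           (long long) sid, sid_index_id(sid), sid_doc_id(sid));
    return 0;
}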

View File

@@ -1,57 +1,63 @@
#ifdef SIST_DEBUG
#define STRICT " STRICT"
#else
#define STRICT ""
#endif
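The conditional above means the `")"STRICT";"` terminators used throughout the schema strings below only add the STRICT keyword in debug builds. A small standalone sketch of how the expansion behaves (illustrative only, not project code):
#include <stdio.h>
/* Mirrors the build-time switch defined above:
 * tables are declared STRICT only when SIST_DEBUG is defined. */
#ifdef SIST_DEBUG
#define STRICT " STRICT"
#else
#define STRICT ""
#endif
int main(void) {
    // Debug builds print "...) STRICT;", release builds print "...);"
    printf("CREATE TABLE IF NOT EXISTS stats ("
           " mtime_min INTEGER,"
           " mtime_max INTEGER"
           ")" STRICT ";\n");
    return 0;
}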
const char *FtsDatabaseSchema =
"CREATE TABLE IF NOT EXISTS document_index ("
" id TEXT NOT NULL,"
" index_id TEXT NOT NULL,"
" id INTEGER PRIMARY KEY,"
" index_id INTEGER NOT NULL,"
" size INTEGER NOT NULL,"
" name TEXT NOT NULL,"
" path TEXT NOT NULL,"
" mtime INTEGER NOT NULL,"
" mime TEXT,"
" json_data TEXT NOT NULL,"
" PRIMARY KEY (id, index_id)"
");"
" thumbnail_count INTEGER NOT NULL,"
" json_data TEXT NOT NULL"
")"STRICT";"
""
"CREATE TABLE IF NOT EXISTS stats ("
" mtime_min INTEGER,"
" mtime_max INTEGER"
");"
")"STRICT";"
""
"CREATE TABLE IF NOT EXISTS path_index ("
" path TEXT,"
" index_id TEXT,"
" index_id INTEGER,"
" count INTEGER NOT NULL,"
" depth INTEGER NOT NULL,"
" PRIMARY KEY (path, index_id)"
");"
")"STRICT";"
""
"CREATE TABLE IF NOT EXISTS mime_index ("
" index_id TEXT,"
" index_id INTEGER,"
" mime TEXT,"
" count INT,"
" count INTEGER,"
" PRIMARY KEY(index_id, mime)"
");"
")"STRICT";"
""
"CREATE TABLE IF NOT EXISTS tag ("
" id TEXT NOT NULL,"
" id INTEGER NOT NULL,"
" index_id INTEGER NOT NULL,"
" tag TEXT NOT NULL,"
" PRIMARY KEY (id, tag)"
");"
")"STRICT";"
"CREATE INDEX IF NOT EXISTS tag_tag_idx ON tag(tag);"
"CREATE INDEX IF NOT EXISTS tag_id_idx ON tag(id);"
""
"CREATE TABLE IF NOT EXISTS embedding ("
" id TEXT REFERENCES document(id),"
" id INTEGER REFERENCES document_index(id),"
" model_id INTEGER NOT NULL REFERENCES model(id),"
" start INTEGER NOT NULL,"
" end INTEGER,"
" embedding BLOB NOT NULL,"
" PRIMARY KEY (id, model_id, start)"
");"
")"STRICT";"
""
"CREATE TABLE IF NOT EXISTS model ("
" id INTEGER PRIMARY KEY CHECK (id > 0 AND id < 1000),"
" size INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TRIGGER IF NOT EXISTS tag_write_trigger"
" AFTER INSERT ON tag"
@@ -69,23 +75,25 @@ const char *FtsDatabaseSchema =
" WHERE id = OLD.id;"
" END;"
""
"CREATE VIEW IF NOT EXISTS document_view (id, name, content, title)"
"CREATE VIEW IF NOT EXISTS document_view (id, name, content, title, path)"
" AS"
" SELECT rowid,"
" SELECT id,"
" json_data->>'name',"
" json_data->>'content',"
" json_data->>'title'"
" json_data->>'title',"
" json_data->>'path'"
" FROM document_index;"
""
"CREATE VIRTUAL TABLE IF NOT EXISTS search USING fts5 ("
" name,"
" content,"
" title,"
" path,"
" content='document_view',"
" content_rowid='id'"
");"
// name^8, content^3, title^8
"INSERT INTO search(search, rank) VALUES('rank', 'bm25(8, 3, 8)');"
// name^8, content^3, title^8, path^5
"INSERT INTO search(search, rank) VALUES('rank', 'bm25(8, 3, 8, 5)');"
"";
const char *IpcDatabaseSchema =
@@ -94,18 +102,18 @@ const char *IpcDatabaseSchema =
" filepath TEXT NOT NULL,"
" mtime INTEGER NOT NULL,"
" st_size INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE index_job ("
" id INTEGER PRIMARY KEY,"
" doc_id TEXT NOT NULL CHECK ( length(doc_id) = 32 ),"
" sid TEXT NOT NULL,"
" type INTEGER NOT NULL,"
" line TEXT"
");";
")"STRICT";";
const char *IndexDatabaseSchema =
"CREATE TABLE thumbnail ("
" id TEXT NOT NULL CHECK ( length(id) = 32 ),"
" id INTEGER REFERENCES document(id),"
" num INTEGER NOT NULL,"
" data BLOB NOT NULL,"
" PRIMARY KEY(id, num)"
@@ -114,34 +122,46 @@ const char *IndexDatabaseSchema =
"CREATE TABLE version ("
" id INTEGER PRIMARY KEY AUTOINCREMENT,"
" date TEXT NOT NULL DEFAULT (CURRENT_TIMESTAMP)"
");"
")"STRICT";"
""
"CREATE TABLE mime("
" id INTEGER PRIMARY KEY,"
" name TEXT"
")"STRICT";"
"CREATE UNIQUE INDEX mime_name_idx ON mime(name);"
""
"CREATE TABLE document ("
" id TEXT PRIMARY KEY CHECK ( length(id) = 32 ),"
" marked INTEGER NOT NULL DEFAULT (1),"
" id INTEGER PRIMARY KEY,"
" parent INTEGER REFERENCES document(id),"
" mime INTEGER REFERENCES mime(id),"
" path TEXT NOT NULL,"
" version INTEGER NOT NULL REFERENCES version(id),"
" mtime INTEGER NOT NULL,"
" size INTEGER NOT NULL,"
" json_data TEXT NOT NULL CHECK ( json_valid(json_data) )"
") WITHOUT ROWID;"
" thumbnail_count INTEGER NOT NULL,"
" json_data TEXT CHECK ( json_data IS NULL OR json_valid(json_data) )"
")"STRICT";"
"CREATE UNIQUE INDEX document_path_idx ON document(path);"
"CREATE TABLE marked ("
" id INTEGER PRIMARY KEY,"
" marked INTEGER NOT NULL,"
" mtime INTEGER NOT NULL"
")"STRICT";"
""
"CREATE INDEX marked_marked ON marked(marked);"
""
"CREATE TABLE delete_list ("
" id TEXT PRIMARY KEY CHECK ( length(id) = 32 )"
") WITHOUT ROWID;"
" id INTEGER PRIMARY KEY"
")"STRICT";"
""
"CREATE TABLE tag ("
" id TEXT NOT NULL,"
" id INTEGER NOT NULL REFERENCES document(id),"
" tag TEXT NOT NULL,"
" PRIMARY KEY (id, tag)"
");"
""
"CREATE TABLE document_sidecar ("
" id TEXT PRIMARY KEY NOT NULL,"
" json_data TEXT NOT NULL"
") WITHOUT ROWID;"
")"STRICT";"
""
"CREATE TABLE descriptor ("
" id TEXT NOT NULL,"
" id INTEGER PRIMARY KEY,"
" version_major INTEGER NOT NULL,"
" version_minor INTEGER NOT NULL,"
" version_patch INTEGER NOT NULL,"
@@ -149,37 +169,37 @@ const char *IndexDatabaseSchema =
" name TEXT NOT NULL,"
" rewrite_url TEXT,"
" timestamp INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE stats_treemap ("
" path TEXT NOT NULL,"
" size INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE stats_size_agg ("
" bucket INTEGER NOT NULL,"
" count INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE stats_date_agg ("
" bucket INTEGER NOT NULL,"
" count INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE stats_mime_agg ("
" mime TEXT NOT NULL,"
" size INTEGER NOT NULL,"
" count INTEGER NOT NULL"
");"
")"STRICT";"
""
"CREATE TABLE embedding ("
" id TEXT REFERENCES document(id),"
" id INTEGER REFERENCES document(id),"
" model_id INTEGER NOT NULL references model(id),"
" start INTEGER NOT NULL,"
" end INTEGER,"
" embedding BLOB NOT NULL,"
" PRIMARY KEY (id, model_id, start)"
");"
")"STRICT";"
""
"CREATE TABLE model ("
" id INTEGER PRIMARY KEY CHECK (id > 0 AND id < 1000),"
@@ -188,5 +208,5 @@ const char *IndexDatabaseSchema =
" path TEXT NOT NULL UNIQUE,"
" size INTEGER NOT NULL,"
" type TEXT NOT NULL CHECK ( type IN ('flat', 'nested') )"
");";
")"STRICT";";

View File

@@ -98,10 +98,10 @@ void database_generate_stats(database_t *db, double treemap_threshold) {
// mime aggregation
sqlite3_prepare_v2(db->db, "INSERT INTO stats_mime_agg"
" SELECT"
" (json_data->>'mime') as bucket,"
" m.name as bucket,"
" sum(size),"
" count(*)"
" FROM document"
" FROM document INNER JOIN mime m ON m.id=document.mime"
" WHERE bucket IS NOT NULL"
" GROUP BY bucket", -1, &stmt, NULL);
CRASH_IF_STMT_FAIL(sqlite3_step(stmt));
@@ -117,8 +117,8 @@ void database_generate_stats(database_t *db, double treemap_threshold) {
// flat map
CRASH_IF_NOT_SQLITE_OK(sqlite3_exec(db->db,
"INSERT INTO tm (path, size) SELECT json_data->>'path' as path, sum(size)"
" FROM document WHERE json_data->>'parent' IS NULL GROUP BY path;",
"INSERT INTO tm (path, size) SELECT path, sum(size)"
" FROM document WHERE parent IS NULL GROUP BY path;",
NULL, NULL, NULL));
// Merge up

View File

@@ -47,7 +47,7 @@ void elastic_cleanup() {
destroy_indexer(Indexer);
}
void print_json(cJSON *document, const char id_str[SIST_DOC_ID_LEN]) {
void print_json(cJSON *document, const char id_str[SIST_SID_LEN]) {
cJSON *line = cJSON_CreateObject();
@@ -64,12 +64,12 @@ void print_json(cJSON *document, const char id_str[SIST_DOC_ID_LEN]) {
cJSON_Delete(line);
}
void delete_document(const char *document_id) {
void delete_document(const char *sid) {
es_bulk_line_t bulk_line;
bulk_line.type = ES_BULK_LINE_DELETE;
bulk_line.next = NULL;
strcpy(bulk_line.doc_id, document_id);
strcpy(bulk_line.sid, sid);
tpool_add_work(IndexCtx.pool, &(job_t) {
.type = JOB_BULK_LINE,
@@ -78,14 +78,14 @@ void delete_document(const char *document_id) {
}
void index_json(cJSON *document, const char doc_id[SIST_DOC_ID_LEN]) {
void index_json(cJSON *document, const char doc_id[SIST_SID_LEN]) {
char *json = cJSON_PrintUnformatted(document);
size_t json_len = strlen(json);
es_bulk_line_t *bulk_line = malloc(sizeof(es_bulk_line_t) + json_len + 2);
bulk_line->type = ES_BULK_LINE_INDEX;
memcpy(bulk_line->line, json, json_len);
strcpy(bulk_line->doc_id, doc_id);
strcpy(bulk_line->sid, doc_id);
*(bulk_line->line + json_len) = '\n';
*(bulk_line->line + json_len + 1) = '\0';
bulk_line->next = NULL;
@@ -124,13 +124,13 @@ void *create_bulk_buffer(int max, int *count, size_t *buf_len, int legacy) {
snprintf(
action_str, sizeof(action_str),
"{\"index\":{\"_id\":\"%s\",\"_type\":\"_doc\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
line->sid, Indexer->es_index
);
} else {
snprintf(
action_str, sizeof(action_str),
"{\"index\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
line->sid, Indexer->es_index
);
}
@@ -148,7 +148,7 @@ void *create_bulk_buffer(int max, int *count, size_t *buf_len, int legacy) {
snprintf(
action_str, sizeof(action_str),
"{\"delete\":{\"_id\":\"%s\",\"_index\":\"%s\"}}\n",
line->doc_id, Indexer->es_index
line->sid, Indexer->es_index
);
size_t action_str_len = strlen(action_str);
@@ -236,7 +236,7 @@ void _elastic_flush(int max) {
if (r->status_code == 413) {
if (max <= 1) {
LOG_ERRORF("elastic.c", "Single document too large, giving up: {%s}", Indexer->line_head->doc_id);
LOG_ERRORF("elastic.c", "Single document too large, giving up: {%s}", Indexer->line_head->sid);
free_response(r);
free(buf);
free_queue(1);
@@ -348,7 +348,7 @@ es_indexer_t *create_indexer(const char *url, const char *index) {
return indexer;
}
void finish_indexer(char *index_id) {
void finish_indexer(int index_id) {
char url[4096];

View File

@@ -8,7 +8,7 @@
typedef struct es_bulk_line {
struct es_bulk_line *next;
char doc_id[SIST_DOC_ID_LEN];
char sid[SIST_SID_LEN];
int type;
char line[0];
} es_bulk_line_t;
@@ -44,16 +44,16 @@ typedef struct es_indexer es_indexer_t;
void elastic_index_line(es_bulk_line_t *line);
void print_json(cJSON *document, const char index_id_str[SIST_INDEX_ID_LEN]);
void print_json(cJSON *document, const char doc_id[SIST_SID_LEN]);
void index_json(cJSON *document, const char doc_id[SIST_INDEX_ID_LEN]);
void index_json(cJSON *document, const char doc_id[SIST_SID_LEN]);
void delete_document(const char *document_id);
void delete_document(const char *sid);
es_indexer_t *create_indexer(const char *url, const char *index);
void elastic_cleanup();
void finish_indexer(char *index_id);
void finish_indexer(int index_id);
void elastic_init(int force_reset, const char* user_mappings, const char* user_settings);

View File

@@ -90,6 +90,7 @@ subreq_ctx_t *web_post_async(const char *url, char *data, int insecure) {
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0);
}
curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, req->curl_err_buffer);
@@ -123,6 +124,7 @@ response_t *web_get(const char *url, int timeout, int insecure) {
curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout);
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0);
}
struct curl_slist *headers = NULL;
@@ -162,6 +164,7 @@ response_t *web_post(const char *url, const char *data, int insecure) {
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0);
}
char err_buffer[CURL_ERROR_SIZE + 1] = {};
@@ -203,10 +206,11 @@ response_t *web_put(const char *url, const char *data, int insecure) {
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PUT");
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
curl_easy_setopt(curl, CURLOPT_DNS_USE_GLOBAL_CACHE, 0);
curl_easy_setopt(curl, CURLOPT_SHARE, 0);
curl_easy_setopt(curl, CURLOPT_IPRESOLVE, CURLOPT_DNS_LOCAL_IP4);
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0);
}
struct curl_slist *headers = NULL;
@@ -241,6 +245,7 @@ response_t *web_delete(const char *url, int insecure) {
curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
if (insecure) {
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0);
}
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "");
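The hunks above add CURLOPT_SSL_VERIFYHOST next to the existing CURLOPT_SSL_VERIFYPEER toggle, so insecure mode now skips both the certificate-chain check and the hostname check. A minimal self-contained sketch of that option pair (illustrative only; fetch_insecure is a hypothetical name, not a function in web.c):
#include <curl/curl.h>
/* Illustrative sketch of the insecure path: disable both peer (certificate
 * chain) and host (hostname match) verification, as web.c now does. */
int fetch_insecure(const char *url) {
    CURL *curl = curl_easy_init();
    if (curl == NULL) {
        return -1;
    }
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_USERAGENT, "sist2");
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L); // skip certificate chain check
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 0L); // skip hostname match check
    CURLcode ret = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return ret == CURLE_OK ? 0 : -1;
}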

View File

@@ -30,10 +30,10 @@ char *get_meta_key_text(enum metakey meta_key) {
return "genre";
case MetaTitle:
return "title";
case MetaMediaComment:
return "media_comment";
case MetaFontName:
return "font_name";
case MetaParent:
return "parent";
case MetaExifMake:
return "exif_make";
case MetaExifDescription:
@@ -58,8 +58,6 @@ char *get_meta_key_text(enum metakey meta_key) {
return "author";
case MetaModifiedBy:
return "modified_by";
case MetaThumbnail:
return "thumbnail";
case MetaPages:
return "pages";
case MetaExifGpsLongitudeRef:
@@ -81,21 +79,23 @@ char *get_meta_key_text(enum metakey meta_key) {
}
}
char *build_json_string(document_t *doc) {
typedef struct {
meta_line_t *meta_head;
meta_line_t *meta_tail;
} linked_list_t;
void write_document(document_t *doc) {
linked_list_t thumbnails_to_write = {.meta_head = NULL, .meta_tail = NULL};
cJSON *json = cJSON_CreateObject();
int buffer_size_guess = 8192;
const char *mime_text = mime_get_mime_text(doc->mime);
if (mime_text == NULL) {
cJSON_AddNullToObject(json, "mime");
} else {
cJSON_AddStringToObject(json, "mime", mime_text);
}
// Ignore root directory in the file path
doc->ext = (short) (doc->ext - ScanCtx.index.desc.root_len);
doc->base = (short) (doc->base - ScanCtx.index.desc.root_len);
char *filepath = doc->filepath + ScanCtx.index.desc.root_len;
char filepath[PATH_MAX * 3];
strcpy(filepath, doc->filepath + ScanCtx.index.desc.root_len);
cJSON_AddStringToObject(json, "extension", filepath + doc->ext);
@@ -125,7 +125,6 @@ char *build_json_string(document_t *doc) {
while (meta != NULL) {
switch (meta->key) {
case MetaThumbnail:
case MetaPages:
case MetaWidth:
case MetaHeight:
@@ -143,7 +142,6 @@ char *build_json_string(document_t *doc) {
case MetaAlbumArtist:
case MetaGenre:
case MetaFontName:
case MetaParent:
case MetaExifMake:
case MetaExifDescription:
case MetaExifSoftware:
@@ -163,11 +161,17 @@ char *build_json_string(document_t *doc) {
case MetaExifGpsLatitudeDec:
case MetaExifGpsLatitudeRef:
case MetaChecksum:
case MetaMediaComment:
case MetaTitle: {
cJSON_AddStringToObject(json, get_meta_key_text(meta->key), meta->str_val);
buffer_size_guess += (int) strlen(meta->str_val);
break;
}
case MetaThumbnail: {
// Keep a list of thumbnails to write after we know the document's row id
APPEND_THUMBNAIL(&thumbnails_to_write, meta->str_val, meta->size);
break;
}
default:
LOG_FATALF("serialize.c", "Invalid meta key: %x %s", meta->key, get_meta_key_text(meta->key));
}
@@ -180,13 +184,19 @@ char *build_json_string(document_t *doc) {
char *json_str = cJSON_PrintBuffered(json, buffer_size_guess, FALSE);
cJSON_Delete(json);
return json_str;
}
void write_document(document_t *doc) {
char *json_str = build_json_string(doc);
database_write_document(ProcData.index_db, doc, json_str);
int doc_id = database_write_document(ProcData.index_db, doc, json_str);
free(doc);
free(json_str);
// Write thumbnails
meta = thumbnails_to_write.meta_head;
int index_num = 0;
while (meta != NULL) {
database_write_thumbnail(ProcData.index_db, doc_id, index_num, meta->str_val, meta->size);
meta_line_t *tmp = meta;
meta = meta->next;
free(tmp);
index_num += 1;
}
}

View File

@@ -11,7 +11,6 @@
#include "web/serve.h"
#include "parsing/mime.h"
#include "parsing/parse.h"
#include "auth0/auth0_c_api.h"
#include <signal.h>
#include <pthread.h>
@@ -39,7 +38,7 @@ void database_scan_begin(scan_args_t *args) {
index_descriptor_t *original_desc = database_read_index_descriptor(db);
// copy original index id
strcpy(desc->id, original_desc->id);
desc->id = original_desc->id;
if (original_desc->version_major != VersionMajor) {
LOG_FATALF("main.c", "Version mismatch! Index is %s but executable is %s", original_desc->version, Version);
@@ -67,7 +66,7 @@ void database_scan_begin(scan_args_t *args) {
desc->version_patch = VersionPatch;
// generate new index id based on timestamp
md5_hexdigest(&ScanCtx.index.desc.timestamp, sizeof(ScanCtx.index.desc.timestamp), ScanCtx.index.desc.id);
desc->id = (int) ScanCtx.index.desc.timestamp;
database_initialize(db);
database_open(db);
@@ -75,14 +74,11 @@ void database_scan_begin(scan_args_t *args) {
}
database_increment_version(db);
database_sync_mime_table(db);
database_close(db, FALSE);
}
void write_thumbnail_callback(char *key, int num, void *buf, size_t buf_len) {
database_write_thumbnail(ProcData.index_db, key, num, buf, buf_len);
}
void log_callback(const char *filepath, int level, char *str) {
if (level == LEVEL_FATAL) {
sist_log(filepath, level, str);
@@ -140,7 +136,6 @@ void initialize_scan_context(scan_args_t *args) {
// Comic
ScanCtx.comic_ctx.log = log_callback;
ScanCtx.comic_ctx.logf = logf_callback;
ScanCtx.comic_ctx.store = write_thumbnail_callback;
ScanCtx.comic_ctx.enable_tn = args->tn_count > 0;
ScanCtx.comic_ctx.tn_size = args->tn_size;
ScanCtx.comic_ctx.tn_qscale = args->tn_quality;
@@ -157,7 +152,6 @@ void initialize_scan_context(scan_args_t *args) {
}
ScanCtx.ebook_ctx.log = log_callback;
ScanCtx.ebook_ctx.logf = logf_callback;
ScanCtx.ebook_ctx.store = write_thumbnail_callback;
ScanCtx.ebook_ctx.fast_epub_parse = args->fast_epub;
ScanCtx.ebook_ctx.tn_qscale = args->tn_quality;
@@ -165,7 +159,6 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.font_ctx.enable_tn = args->tn_count > 0;
ScanCtx.font_ctx.log = log_callback;
ScanCtx.font_ctx.logf = logf_callback;
ScanCtx.font_ctx.store = write_thumbnail_callback;
// Media
ScanCtx.media_ctx.tn_qscale = args->tn_quality;
@@ -173,7 +166,6 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.media_ctx.tn_count = args->tn_count;
ScanCtx.media_ctx.log = log_callback;
ScanCtx.media_ctx.logf = logf_callback;
ScanCtx.media_ctx.store = write_thumbnail_callback;
ScanCtx.media_ctx.max_media_buffer = (long) args->max_memory_buffer_mib * 1024 * 1024;
ScanCtx.media_ctx.read_subtitles = args->read_subtitles;
ScanCtx.media_ctx.read_subtitles = args->tn_count;
@@ -189,13 +181,11 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.ooxml_ctx.content_size = args->content_size;
ScanCtx.ooxml_ctx.log = log_callback;
ScanCtx.ooxml_ctx.logf = logf_callback;
ScanCtx.ooxml_ctx.store = write_thumbnail_callback;
// MOBI
ScanCtx.mobi_ctx.content_size = args->content_size;
ScanCtx.mobi_ctx.log = log_callback;
ScanCtx.mobi_ctx.logf = logf_callback;
ScanCtx.mobi_ctx.store = write_thumbnail_callback;
ScanCtx.mobi_ctx.enable_tn = args->tn_count > 0;
ScanCtx.mobi_ctx.tn_size = args->tn_size;
ScanCtx.mobi_ctx.tn_qscale = args->tn_quality;
@@ -209,7 +199,6 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.msdoc_ctx.content_size = args->content_size;
ScanCtx.msdoc_ctx.log = log_callback;
ScanCtx.msdoc_ctx.logf = logf_callback;
ScanCtx.msdoc_ctx.store = write_thumbnail_callback;
ScanCtx.msdoc_ctx.msdoc_mime = mime_get_mime_by_string("application/msword");
ScanCtx.threads = args->threads;
@@ -228,7 +217,6 @@ void initialize_scan_context(scan_args_t *args) {
ScanCtx.raw_ctx.tn_size = args->tn_size;
ScanCtx.raw_ctx.log = log_callback;
ScanCtx.raw_ctx.logf = logf_callback;
ScanCtx.raw_ctx.store = write_thumbnail_callback;
// Wpd
ScanCtx.wpd_ctx.content_size = args->content_size;
@@ -271,9 +259,6 @@ void sist2_scan(scan_args_t *args) {
tpool_wait(ScanCtx.pool);
tpool_destroy(ScanCtx.pool);
LOG_DEBUGF("main.c", "Thumbnail store size: %lu", ScanCtx.stat_tn_size);
LOG_DEBUGF("main.c", "Index size: %lu", ScanCtx.stat_index_size);
database_t *db = database_create(args->output, INDEX_DATABASE);
database_open(db);
@@ -316,16 +301,15 @@ void sist2_index(index_args_t *args) {
database_open(db);
database_iterator_t *iterator = database_create_document_iterator(db);
database_document_iter_foreach(json, iterator) {
char doc_id[SIST_DOC_ID_LEN];
strcpy(doc_id, cJSON_GetObjectItem(json, "_id")->valuestring);
char sid[SIST_SID_LEN];
int doc_id = cJSON_GetObjectItem(json, "_id")->valueint;
cJSON_DeleteItemFromObject(json, "_id");
// TODO: delete tag if empty
format_sid(sid, desc->id, doc_id);
if (args->print) {
print_json(json, doc_id);
print_json(json, sid);
} else {
index_json(json, doc_id);
index_json(json, sid);
cnt += 1;
}
cJSON_Delete(json);
@@ -334,10 +318,12 @@ void sist2_index(index_args_t *args) {
free(iterator);
if (!args->print) {
char sid[SIST_SID_LEN];
database_iterator_t *del_iter = database_create_delete_list_iterator(db);
database_delete_list_iter_foreach(id, del_iter) {
delete_document(id);
free(id);
database_delete_list_iter_foreach(doc_id, del_iter) {
format_sid(sid, desc->id, doc_id);
delete_document(sid);
}
free(del_iter);
}
@@ -366,7 +352,6 @@ void sist2_sqlite_index(sqlite_index_args_t *args) {
database_fts_optimize(db);
database_close(db, FALSE);
database_close(search_db, FALSE);
}
void sist2_web(web_args_t *args) {
@@ -439,6 +424,8 @@ int set_to_negative_if_value_is_zero(UNUSED(struct argparse *self), const struct
fprintf(stderr, "error: option `--%s` Value must be >= 0\n", option->long_name);
exit(1);
}
return 0;
}
int main(int argc, const char *argv[]) {
@@ -533,7 +520,8 @@ int main(int argc, const char *argv[]) {
OPT_BOOLEAN('f', "force-reset", &index_args->force_reset, "Reset Elasticsearch mappings and settings."),
OPT_GROUP("sqlite-index options"),
OPT_STRING(0, "search-index", &common_search_index, "Path to search index. Will be created if it does not exist yet."),
OPT_STRING(0, "search-index", &common_search_index,
"Path to search index. Will be created if it does not exist yet."),
OPT_GROUP("Web options"),
OPT_STRING(0, "es-url", &common_es_url, "Elasticsearch url. DEFAULT: http://localhost:9200"),
@@ -557,7 +545,7 @@ int main(int argc, const char *argv[]) {
OPT_END(),
};
struct argparse argparse;
struct argparse argparse = {};
argparse_init(&argparse, options, usage, 0);
argparse_describe(
&argparse,


@@ -61,4 +61,6 @@ unsigned int mime_get_mime_by_ext(const char *ext);
unsigned int mime_get_mime_by_string(const char *str);
unsigned int* get_mime_ids();
#endif


@@ -365,7 +365,6 @@ model_vnd_gdl=65893,
model_vnd_gs_gdl=65894,
model_vrml=65895,
model_x_pov=65896,
sist2_sidecar=2,
text_PGP=590185,
text_asp=590186,
text_css=590187,
@@ -909,7 +908,6 @@ case image_x_sony_arw: return "image/x-sony-arw";
case image_x_sony_sr2: return "image/x-sony-sr2";
case image_x_sony_srf: return "image/x-sony-srf";
case image_x_epson_erf: return "image/x-epson-erf";
case sist2_sidecar: return "sist2/sidecar";
default: return NULL;}}
unsigned int mime_extension_lookup(unsigned long extension_crc32) {switch (extension_crc32) {
case 2495639202:return application_x_matlab_data;
@@ -1293,7 +1291,6 @@ case 1698465774:return image_x_sony_arw;
case 2083014127:return image_x_sony_sr2;
case 271503362:return image_x_sony_srf;
case 142938048:return image_x_epson_erf;
case 287571459:return sist2_sidecar;
default: return 0;}}
unsigned int mime_name_lookup(unsigned long mime_crc32) {switch (mime_crc32) {
case 3272851765: return application_x_matlab_data;
@@ -1747,6 +1744,7 @@ case 3060720351: return image_x_sony_arw;
case 2944016606: return image_x_sony_sr2;
case 3279729971: return image_x_sony_srf;
case 1665206815: return image_x_epson_erf;
case 521139448: return sist2_sidecar;
default: return 0;}}
unsigned int mime_ids[] = {655530,655363,655364,655365,655366,655362,655361,655367,655368,655369,655370,655371,655372 | 0x40000000,655373,655374,655375,655376 | 0x08000000,655377,655378,655379,655380,655382,655381,655383,655384,655390,655385,655386,655387,655388,655389,655391,655392,655393,655394,655395 | 0x40000000,655396,655397,655398,655399,655400,655401,655402,655403,655404,655405,655406,655407,655408,655411,655412,655413,655414,655415,655416,655417,655418,655419 | 0x20000000,655421,655422,655423,655424,655425,655426,655427,655428,655429,655430,655431,655432 | 0x04000000,655433 | 0x04000000,655434 | 0x04000000,655435,655436,655437,655438,655439,655440,655441,655442,655443,655444,655445,655446 | 0x10000000,655447,655448,655449 | 0x10000000,655450,655451,655452,655453,655454,655455,655456,655457,655458,655459,655461 | 0x08000000,655460,655462,655463,655464,655465,655466,655467,655468,655469,655470,655471,655472,655473,655474,655475,655476,655477,655478,655479,655480,1,655481,655482,655483,655484,655485,655486,655487,655488,655489 | 0x20000000,655490,655491,655492,655493,655494,655495,655496,655497,655498,655499,655500,655501,655502,655503,655504,655505,655506,655507,655508,655509,655510,655511,655512,655513,655514,655515,655516,655517,655519,655518 | 0x08000000,655521,655520,655522 | 0x08000000,655523 | 0x08000000,655524 | 0x08000000,655525,655526,655527,655528,655529,655531,655532,655533,655534,655535,655599,655536 | 0x02000000,655409 | 0x02000000,655540,655537,655538,655539,655541,655542,655543,655544,655545,655546,655547,655548,655549,655550,655552,655551,655553,655554,655555,655556,655557,655558,655559,655560,655561,655562 | 0x10000000,655563,655564,655565,655566,655567,655569,655568,655570,655571,655572,655573,655574,655575,655576,655577,655578 | 0x10000000,655579,655580,655581,655583,655582,655584,655585,655586,655587,655588,655589,655590,655591,655592,655593,655594,655595 | 0x08000000,655596,655597 | 0x08000000,655600 | 0x10000000,655601,458994 | 0x80000000,458995,458996,458998,458997,458999,459000,459001,459002,459003,459004,459005,459006,459007,459008,459009,459010,459011,459012,459013,459014,459015,459016,459017,459018,459030,459019,459020,459021,459022,459023,459025,459024,459026,459027,459029 | 0x80000000,459028 | 0x80000000,327959 | 0x20000000,327960 | 0x20000000,327962 | 0x20000000,327961 | 0x20000000,524571,524572,524573,524574,524575,524576,524577,524578,524579,524580,524581,524582,524583,524584 | 0x80000000,524585 | 0x80000000,524586,524587 | 0x80000000,524588 | 0x80000000,524589,524590,524591,524592,524593,524594,524595,524596,524597,524599,524602,524603,524605,524606,524608,524610,524611,524612 | 0x80000000,524613,524614,524619,524620,524624,524626,524627,524628,524629,524630,524631,524636,524637,524638,524639 | 0x80000000,524640 | 0x80000000,524641,196962,196963,65892,65893,65894,65895,65896,590186,590187,590189 | 0x01000000,590190,590191,590192,590185,590193,590231,590188,655410,590194,590195,590196,590197,590198,590199,590200,590201,590203,590202,590204,590205,590206,590207,590208,590209,590210,590211,590212,590213,590214,590215,590216,590217,590219,590220,590244 | 0x01000000,590218,590222,590221,590223,590224,590225,590226,590227,590228,590229,590230,590232,590233,590234,590235 | 0x01000000,590236,590237,590238,590239,590240,590241,590242,590243,393638,393639,393640,393637,393641,393642,393643,393644,393645,393646,393647,393648,393649,393650,393651,393652,393653,393654,393655,393656,393657 | 
0x80000000,393658,393659,393660,393661,393662,393663,393664,393665,721346,655598,655420,524622 | 0x00800000,524621 | 0x00800000,524609 | 0x00800000,524623 | 0x00800000,524598 | 0x00800000,524600 | 0x00800000,524601 | 0x00800000,524604 | 0x00800000,524615 | 0x00800000,524616 | 0x00800000,524617 | 0x00800000,524618 | 0x00800000,524625 | 0x00800000,524632 | 0x00800000,524633 | 0x00800000,524634 | 0x00800000,524635 | 0x00800000,524607 | 0x00800000,0};
unsigned int* get_mime_ids() { return mime_ids; }
#endif


@@ -4,10 +4,8 @@
#include "src/ctx.h"
#include "mime.h"
#include "src/io/serialize.h"
#include "src/parsing/sidecar.h"
#include "src/parsing/fs_util.h"
#include "src/parsing/magic_util.h"
#include <pthread.h>
#define MIN_VIDEO_SIZE (1024 * 64)
@@ -27,7 +25,6 @@ typedef enum {
FILETYPE_OOXML,
FILETYPE_COMIC,
FILETYPE_MOBI,
FILETYPE_SIST2_SIDECAR,
FILETYPE_MSDOC,
FILETYPE_JSON,
FILETYPE_NDJSON,
@@ -63,8 +60,6 @@ file_type_t get_file_type(unsigned int mime, size_t size, const char *filepath)
return FILETYPE_COMIC;
} else if (IS_MOBI(mime)) {
return FILETYPE_MOBI;
} else if (mime == MIME_SIST2_SIDECAR) {
return FILETYPE_SIST2_SIDECAR;
} else if (is_msdoc(&ScanCtx.msdoc_ctx, mime)) {
return FILETYPE_MSDOC;
} else if (is_json(&ScanCtx.json_ctx, mime)) {
@@ -157,7 +152,8 @@ void parse(parse_job_t *job) {
doc->size = job->vfile.st_size;
doc->mtime = MAX(job->vfile.mtime, 0);
doc->mime = get_mime(job);
generate_doc_id(doc->filepath + ScanCtx.index.desc.root_len, doc->doc_id);
doc->thumbnail_count = 0;
strcpy(doc->parent, job->parent);
if (doc->mime == GET_MIME_ERROR_FATAL) {
CLOSE_FILE(job->vfile)
@@ -165,16 +161,12 @@ void parse(parse_job_t *job) {
return;
}
if (database_mark_document(ProcData.index_db, doc->doc_id, doc->mtime)) {
if (database_mark_document(ProcData.index_db, doc->filepath + ScanCtx.index.desc.root_len, doc->mtime)) {
CLOSE_FILE(job->vfile)
free(doc);
return;
}
if (LogCtx.very_verbose) {
LOG_DEBUGF(job->filepath, "Starting parse job {%s}", doc->doc_id);
}
switch (get_file_type(doc->mime, doc->size, doc->filepath)) {
case FILETYPE_RAW:
parse_raw(&ScanCtx.raw_ctx, &job->vfile, doc);
@@ -195,6 +187,10 @@ void parse(parse_job_t *job) {
parse_font(&ScanCtx.font_ctx, &job->vfile, doc);
break;
case FILETYPE_ARCHIVE:
// Insert the document now so that child documents can link to an existing ID
database_write_document(ProcData.index_db, doc, NULL);
parse_archive(&ScanCtx.arc_ctx, &job->vfile, doc, ScanCtx.exclude, ScanCtx.exclude_extra);
break;
case FILETYPE_OOXML:
@@ -206,11 +202,6 @@ void parse(parse_job_t *job) {
case FILETYPE_MOBI:
parse_mobi(&ScanCtx.mobi_ctx, &job->vfile, doc);
break;
case FILETYPE_SIST2_SIDECAR:
parse_sidecar(&job->vfile, doc);
CLOSE_FILE(job->vfile)
free(doc);
return;
case FILETYPE_MSDOC:
parse_msdoc(&ScanCtx.msdoc_ctx, &job->vfile, doc);
break;
@@ -225,14 +216,6 @@ void parse(parse_job_t *job) {
break;
}
//Parent meta
if (job->parent[0] != '\0') {
meta_line_t *meta_parent = malloc(sizeof(meta_line_t) + SIST_INDEX_ID_LEN);
meta_parent->key = MetaParent;
strcpy(meta_parent->str_val, job->parent);
APPEND_META((doc), meta_parent);
}
CLOSE_FILE(job->vfile)
if (job->vfile.has_checksum) {


@@ -1,40 +0,0 @@
#include "sidecar.h"
#include "src/ctx.h"
void parse_sidecar(vfile_t *vfile, document_t *doc) {
LOG_DEBUGF("sidecar.c", "Parsing sidecar file %s", vfile->filepath);
size_t size;
char *buf = read_all(vfile, &size);
if (buf == NULL) {
LOG_ERRORF("sidecar.c", "Read error for %s", vfile->filepath);
return;
}
buf = realloc(buf, size + 1);
*(buf + size) = '\0';
cJSON *json = cJSON_Parse(buf);
if (json == NULL) {
LOG_ERRORF("sidecar.c", "Could not parse JSON sidecar %s", vfile->filepath);
return;
}
char *json_str = cJSON_PrintUnformatted(json);
char assoc_doc_id[SIST_DOC_ID_LEN];
char rel_path[PATH_MAX];
size_t rel_path_len = doc->ext - 1 - ScanCtx.index.desc.root_len;
memcpy(rel_path, vfile->filepath + ScanCtx.index.desc.root_len, rel_path_len);
*(rel_path + rel_path_len) = '\0';
generate_doc_id(rel_path, assoc_doc_id);
database_write_document_sidecar(ProcData.index_db, assoc_doc_id, json_str);
cJSON_Delete(json);
free(json_str);
free(buf);
}


@@ -1,8 +0,0 @@
#ifndef SIST2_SIDECAR_H
#define SIST2_SIDECAR_H
#include "src/sist.h"
void parse_sidecar(vfile_t *vfile, document_t *doc);
#endif


@@ -3,19 +3,19 @@
#define _GNU_SOURCE
#ifndef FALSE
#define FALSE (0)
#ifndef FALSE
#define FALSE (0)
#define BOOL int
#endif
#ifndef TRUE
#define TRUE (!FALSE)
#ifndef TRUE
#define TRUE (!FALSE)
#endif
#undef MAX
#undef MAX
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#undef MIN
#undef MIN
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#ifndef PATH_MAX
@@ -23,7 +23,7 @@
#endif
#undef ABS
#define ABS(a) (((a) < 0) ? -(a) : (a))
#define ABS(a) (((a) < 0) ? -(a) : (a))
#define UNUSED(x) __attribute__((__unused__)) x
@@ -51,17 +51,17 @@
#include <ctype.h>
#include "git_hash.h"
#define VERSION "3.2.0"
#define VERSION "3.4.2"
static const char *const Version = VERSION;
static const int VersionMajor = 3;
static const int VersionMinor = 2;
static const int VersionPatch = 0;
static const int VersionMinor = 4;
static const int VersionPatch = 2;
#ifndef SIST_PLATFORM
#define SIST_PLATFORM unknown
#endif
#define EXPECTED_MONGOOSE_VERSION "7.7"
#define EXPECTED_MONGOOSE_VERSION "7.13"
#define Q(x) #x
#define QUOTE(x) Q(x)


@@ -77,14 +77,14 @@ static void worker_thread_loop(tpool_t *pool) {
job_t *job = database_get_work(ProcData.ipc_db, pool->shm->job_type);
if (job != NULL) {
pthread_mutex_lock(&(pool->shm->data_mutex));
pool->shm->busy_count += 1;
pthread_mutex_unlock(&(pool->shm->data_mutex));
if (pool->shm->stop) {
break;
}
pthread_mutex_lock(&(pool->shm->data_mutex));
pool->shm->busy_count += 1;
pthread_mutex_unlock(&(pool->shm->data_mutex));
if (job->type == JOB_PARSE_JOB) {
parse(job->parse_job);
} else if (job->type == JOB_BULK_LINE) {
@@ -110,11 +110,11 @@ static void worker_thread_loop(tpool_t *pool) {
if (LogCtx.json_logs) {
progress_bar_print_json(done,
count,
ScanCtx.stat_tn_size,
ScanCtx.stat_index_size, pool->shm->waiting);
0,
0, pool->shm->waiting);
} else {
progress_bar_print((double) done / count,
ScanCtx.stat_tn_size, ScanCtx.stat_index_size);
0, 0);
}
}
@@ -200,11 +200,11 @@ static void *tpool_worker(void *arg) {
pool->shm->ipc_ctx.completed_job_count += 1;
pthread_mutex_unlock(&(pool->shm->ipc_ctx.mutex));
pthread_mutex_lock(&(pool->shm->data_mutex));
pool->shm->busy_count -= 1;
pthread_mutex_unlock(&(pool->shm->data_mutex));
if (WIFSIGNALED(status)) {
pthread_mutex_lock(&(pool->shm->data_mutex));
pool->shm->busy_count -= 1;
pthread_mutex_unlock(&(pool->shm->data_mutex));
int crashed_thread_id = -1;
for (int i = 0; i < MAX_THREADS; i++) {
if (pool->shm->thread_id_to_pid_mapping[i] == pid) {
@@ -265,14 +265,14 @@ void tpool_wait(tpool_t *pool) {
if (pool->shm->ipc_ctx.job_count > 0) {
pthread_cond_wait(&(pool->shm->done_working_cond), &pool->shm->mutex);
} else {
if (pool->shm->ipc_ctx.job_count == 0 && pool->shm->busy_count == 0) {
if (pool->shm->ipc_ctx.job_count == 0 && pool->shm->busy_count <= 0) {
pool->shm->stop = TRUE;
break;
}
}
}
if (pool->print_progress && !LogCtx.json_logs) {
progress_bar_print(1.0, ScanCtx.stat_tn_size, ScanCtx.stat_index_size);
progress_bar_print(1.0, 0, 0);
}
pthread_mutex_unlock(&pool->shm->mutex);


@@ -4,7 +4,7 @@
typedef struct database database_t;
typedef struct index_descriptor {
char id[SIST_INDEX_ID_LEN];
int id;
char version[64];
int version_major;
int version_minor;
@@ -24,4 +24,11 @@ typedef struct index_t {
char path[PATH_MAX];
} index_t;
typedef struct {
int doc_id;
int index_id;
long sid_int64;
char sid_str[SIST_SID_LEN];
} sist_id_t;
#endif


@@ -7,6 +7,7 @@
#include "third-party/utf8.h/utf8.h"
#include "libscan/scan.h"
#include "types.h"
#include <openssl/evp.h>
@@ -18,7 +19,8 @@ dyn_buffer_t url_escape(char *str);
extern int PrintingProgressBar;
void progress_bar_print_json(size_t done, size_t count, size_t tn_size, size_t index_size, int waiting);
void progress_bar_print_json(size_t done, size_t count, size_t tn_size, size_t index_size, int waiting);
void progress_bar_print(double percentage, size_t tn_size, size_t index_size);
const char *find_file_in_paths(const char **paths, const char *filename);
@@ -87,24 +89,6 @@ static void buf2hex(const unsigned char *buf, size_t buflen, char *hex_string) {
*s = '\0';
}
static void md5_hexdigest(const void *data, size_t size, char *output) {
EVP_MD_CTX *md_ctx = EVP_MD_CTX_new();
EVP_DigestInit_ex(md_ctx, EVP_md5(), NULL);
EVP_DigestUpdate(md_ctx, data, size);
unsigned char digest[MD5_DIGEST_LENGTH];
EVP_DigestFinal_ex(md_ctx, digest, NULL);
EVP_MD_CTX_free(md_ctx);
buf2hex(digest, MD5_DIGEST_LENGTH, output);
}
__always_inline
static void generate_doc_id(const char *rel_path, char *doc_id) {
md5_hexdigest(rel_path, strlen(rel_path), doc_id);
}
#define MILLISECOND 1000
struct timespec timespec_add(struct timespec ts1, long usec);
@@ -125,6 +109,29 @@ struct timespec timespec_add(struct timespec ts1, long usec);
} while (0)
#define array_foreach(arr) \
for (int i = 0; (arr)[i] != NULL; i++)
for (int i = 0; (arr)[i] != 0; i++)
#define format_sid(out, index_id, doc_id) \
sprintf((out), "%08x.%08x", (index_id), (doc_id))
static int parse_sid(sist_id_t *sid, const char doc_sid_str[SIST_SID_LEN]) {
if (doc_sid_str[8] != '.') {
return FALSE;
}
char tmp[9];
memcpy(tmp, doc_sid_str, 8);
sid->index_id = (int) strtol(tmp, NULL, 16);
memcpy(tmp, doc_sid_str + 9, 8);
sid->doc_id = (int) strtol(tmp, NULL, 16);
memcpy(sid->sid_str, doc_sid_str, SIST_SID_LEN - 1);
*(sid->sid_str + SIST_SID_LEN - 1) = '\0';
sid->sid_int64 = ((long) sid->index_id << 32) | sid->doc_id;
return TRUE;
}
#endif
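To make the new composite SID format concrete, below is a small standalone usage sketch of format_sid()/parse_sid() as defined above. It assumes SIST_SID_LEN is 18 (eight hex digits, a dot, eight more hex digits, plus the NUL terminator implied by the "%08x.%08x" format); the main() wrapper and the explicit zero-initialization of the scratch buffer before strtol() are additions for illustration only.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SIST_SID_LEN 18  /* assumed: "%08x.%08x" plus '\0' */

typedef struct {
    int doc_id;
    int index_id;
    long sid_int64;
    char sid_str[SIST_SID_LEN];
} sist_id_t;

#define format_sid(out, index_id, doc_id) \
    sprintf((out), "%08x.%08x", (index_id), (doc_id))

/* Same parsing logic as the hunk above, with the scratch buffer
 * explicitly NUL-terminated before strtol() for clarity. */
static int parse_sid(sist_id_t *sid, const char *sid_str) {
    if (sid_str[8] != '.') {
        return 0;
    }
    char tmp[9] = {0};
    memcpy(tmp, sid_str, 8);
    sid->index_id = (int) strtol(tmp, NULL, 16);
    memcpy(tmp, sid_str + 9, 8);
    sid->doc_id = (int) strtol(tmp, NULL, 16);
    memcpy(sid->sid_str, sid_str, SIST_SID_LEN - 1);
    sid->sid_str[SIST_SID_LEN - 1] = '\0';
    sid->sid_int64 = ((long) sid->index_id << 32) | sid->doc_id;
    return 1;
}

int main(void) {
    char sid_str[SIST_SID_LEN];
    format_sid(sid_str, 0x1a2b3c4d, 7);   /* -> "1a2b3c4d.00000007" */

    sist_id_t sid;
    if (parse_sid(&sid, sid_str)) {
        printf("index=%08x doc=%d packed=%016lx\n",
               (unsigned) sid.index_id, sid.doc_id, (unsigned long) sid.sid_int64);
    }
    return 0;
}

Elsewhere in this diff, the printable sid_str form is what gets sent to Elasticsearch (elastic_write_tag / elastic_delete_tag), while the packed sid_int64 is what the SQLite FTS backend receives (database_fts_write_tag, database_fts_get_document).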


@@ -48,30 +48,24 @@ void get_embedding(struct mg_connection *nc, struct mg_http_message *hm) {
WebCtx.es_version->major, WebCtx.es_version->minor, WebCtx.es_version->patch);
}
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2 + 4) {
LOG_DEBUGF("serve.c", "Invalid thumbnail path: %.*s", (int) hm->uri.len, hm->uri.ptr);
sist_id_t sid;
if (hm->uri.len != SIST_SID_LEN + 2 + 4 || !parse_sid(&sid, hm->uri.ptr + 3)) {
LOG_DEBUGF("serve.c", "Invalid embedding path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
char doc_id[SIST_DOC_ID_LEN];
char index_id[SIST_INDEX_ID_LEN];
int model_id = (int) strtol(hm->uri.ptr + SIST_SID_LEN + 3, NULL, 10);
memcpy(index_id, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(index_id + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(doc_id, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, SIST_DOC_ID_LEN);
*(doc_id + SIST_DOC_ID_LEN - 1) = '\0';
int model_id = (int) strtol(hm->uri.ptr + SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 3, NULL, 10);
database_t *db = web_get_database(index_id);
database_t *db = web_get_database(sid.index_id);
if (db == NULL) {
LOG_DEBUGF("serve.c", "Could not get database for index: %s", index_id);
LOG_DEBUGF("serve.c", "Could not get database for index: %s", sid.index_id);
HTTP_REPLY_NOT_FOUND
return;
}
cJSON *json = database_get_embedding(db, doc_id, model_id);
cJSON *json = database_get_embedding(db, sid.doc_id, model_id);
if (json == NULL) {
HTTP_REPLY_NOT_FOUND
@@ -84,17 +78,19 @@ void get_embedding(struct mg_connection *nc, struct mg_http_message *hm) {
void stats_files(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != SIST_INDEX_ID_LEN + 7) {
if (hm->uri.len != 17) {
HTTP_REPLY_NOT_FOUND
return;
}
char arg_index_id[SIST_INDEX_ID_LEN];
char index_id_str[9];
char arg_stat_type[5];
memcpy(arg_index_id, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index_id + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(arg_stat_type, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, 4);
memcpy(index_id_str, hm->uri.ptr + 3, 8);
*(index_id_str + 8) = '\0';
int index_id = (int) strtol(index_id_str, NULL, 16);
memcpy(arg_stat_type, hm->uri.ptr + 3 + 9, 4);
*(arg_stat_type + sizeof(arg_stat_type) - 1) = '\0';
database_stat_type_d stat_type = database_get_stat_type_by_mnemonic(arg_stat_type);
@@ -103,16 +99,15 @@ void stats_files(struct mg_connection *nc, struct mg_http_message *hm) {
return;
}
database_t *db = web_get_database(arg_index_id);
database_t *db = web_get_database(index_id);
if (db == NULL) {
LOG_DEBUGF("serve.c", "Could not get database for index: %s", arg_index_id);
LOG_DEBUGF("serve.c", "Could not get database for index: %d", index_id);
HTTP_REPLY_NOT_FOUND
return;
}
cJSON *json = database_get_stats(db, stat_type);
mg_send_json(nc, json);
cJSON_Delete(json);
}
@@ -152,19 +147,19 @@ void serve_chunk_vendors_css(struct mg_connection *nc, struct mg_http_message *h
web_serve_asset_chunk_vendors_css(nc);
}
void serve_thumbnail(struct mg_connection *nc, struct mg_http_message *hm, const char *arg_index,
const char *arg_doc_id, int arg_num) {
void serve_thumbnail(struct mg_connection *nc, struct mg_http_message *hm, int index_id,
int doc_id, int arg_num) {
database_t *db = web_get_database(arg_index);
database_t *db = web_get_database(index_id);
if (db == NULL) {
LOG_DEBUGF("serve.c", "Could not get database for index: %s", arg_index);
LOG_DEBUGF("serve.c", "Could not get database for index: %d", index_id);
HTTP_REPLY_NOT_FOUND
return;
}
size_t data_len = 0;
void *data = database_read_thumbnail(db, arg_doc_id, arg_num, &data_len);
void *data = database_read_thumbnail(db, doc_id, arg_num, &data_len);
if (data_len != 0) {
web_send_headers(
@@ -173,6 +168,7 @@ void serve_thumbnail(struct mg_connection *nc, struct mg_http_message *hm, const
"Cache-Control: max-age=31536000"
);
mg_send(nc, data, data_len);
nc->is_resp = 0;
free(data);
} else {
HTTP_REPLY_NOT_FOUND
@@ -181,44 +177,29 @@ void serve_thumbnail(struct mg_connection *nc, struct mg_http_message *hm, const
}
void thumbnail_with_num(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2 + 5) {
sist_id_t sid;
if (hm->uri.len != SIST_SID_LEN + 2 + 4 || !parse_sid(&sid, hm->uri.ptr + 3)) {
LOG_DEBUGF("serve.c", "Invalid thumbnail path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
char arg_doc_id[SIST_DOC_ID_LEN];
char arg_index[SIST_INDEX_ID_LEN];
char arg_num[5] = {0};
int num = (int) strtol(hm->uri.ptr + SIST_SID_LEN + 3, NULL, 10);
memcpy(arg_index, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(arg_doc_id, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, SIST_DOC_ID_LEN);
*(arg_doc_id + SIST_DOC_ID_LEN - 1) = '\0';
memcpy(arg_num, hm->uri.ptr + SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 3, 4);
int num = (int) strtol(arg_num, NULL, 10);
serve_thumbnail(nc, hm, arg_index, arg_doc_id, num);
serve_thumbnail(nc, hm, sid.index_id, sid.doc_id, num);
}
void thumbnail(struct mg_connection *nc, struct mg_http_message *hm) {
sist_id_t sid;
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2) {
if (hm->uri.len != 20 || !parse_sid(&sid, hm->uri.ptr + 3)) {
LOG_DEBUGF("serve.c", "Invalid thumbnail path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
char arg_doc_id[SIST_DOC_ID_LEN];
char arg_index[SIST_INDEX_ID_LEN];
memcpy(arg_index, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(arg_doc_id, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, SIST_DOC_ID_LEN);
*(arg_doc_id + SIST_DOC_ID_LEN - 1) = '\0';
serve_thumbnail(nc, hm, arg_index, arg_doc_id, 0);
serve_thumbnail(nc, hm, sid.index_id, sid.doc_id, 0);
}
void search(struct mg_connection *nc, struct mg_http_message *hm) {
@@ -236,6 +217,7 @@ void search(struct mg_connection *nc, struct mg_http_message *hm) {
snprintf(url, 4096, "%s/%s/_search", WebCtx.es_url, WebCtx.es_index);
nc->fn_data = web_post_async(url, body, WebCtx.es_insecure_ssl);
nc->is_resp = 1;
}
void serve_file_from_url(cJSON *json, index_t *idx, struct mg_connection *nc) {
@@ -382,11 +364,15 @@ void index_info(struct mg_connection *nc) {
cJSON *idx_json = cJSON_CreateObject();
cJSON_AddStringToObject(idx_json, "name", idx->desc.name);
cJSON_AddStringToObject(idx_json, "version", idx->desc.version);
cJSON_AddStringToObject(idx_json, "id", idx->desc.id);
cJSON_AddNumberToObject(idx_json, "id", idx->desc.id);
cJSON_AddStringToObject(idx_json, "rewriteUrl", idx->desc.rewrite_url);
cJSON_AddNumberToObject(idx_json, "timestamp", (double) idx->desc.timestamp);
cJSON_AddItemToArray(arr, idx_json);
#ifdef SIST_DEBUG_INFO
cJSON_AddStringToObject(idx_json, "root", idx->desc.root);
#endif
cJSON *models = database_get_models(idx->db);
cJSON_AddItemToObject(idx_json, "models", models);
}
@@ -397,23 +383,18 @@ void index_info(struct mg_connection *nc) {
cJSON_AddStringToObject(json, "searchBackend", "elasticsearch");
}
char *json_str = cJSON_PrintUnformatted(json);
web_send_headers(nc, 200, strlen(json_str), "Content-Type: application/json");
mg_send(nc, json_str, strlen(json_str));
free(json_str);
mg_send_json(nc, json);
cJSON_Delete(json);
}
cJSON *get_root_document_by_id(const char *index_id, const char *doc_id) {
cJSON *get_root_document_by_id(int index_id, int doc_id) {
database_t *db = web_get_database(index_id);
if (!db) {
return NULL;
}
char next_id[SIST_DOC_ID_LEN];
strcpy(next_id, doc_id);
int next_id = doc_id;
while (TRUE) {
cJSON *doc = database_get_document(db, next_id);
@@ -423,38 +404,36 @@ cJSON *get_root_document_by_id(const char *index_id, const char *doc_id) {
}
cJSON *parent = cJSON_GetObjectItem(doc, "parent");
if (parent == NULL || cJSON_IsNull(parent)) {
if (parent == NULL || !cJSON_IsNumber(parent)) {
return doc;
}
strcpy(next_id, parent->valuestring);
cJSON_Delete(parent);
next_id = parent->valueint;
cJSON_Delete(doc);
}
}
void file(struct mg_connection *nc, struct mg_http_message *hm) {
sist_id_t sid;
if (hm->uri.len != SIST_INDEX_ID_LEN + SIST_DOC_ID_LEN + 2) {
if (hm->uri.len != 20 || !parse_sid(&sid, hm->uri.ptr + 3)) {
LOG_DEBUGF("serve.c", "Invalid file path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
char arg_doc_id[SIST_DOC_ID_LEN];
char arg_index[SIST_INDEX_ID_LEN];
memcpy(arg_index, hm->uri.ptr + 3, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
memcpy(arg_doc_id, hm->uri.ptr + 3 + SIST_INDEX_ID_LEN, SIST_DOC_ID_LEN);
*(arg_doc_id + SIST_DOC_ID_LEN - 1) = '\0';
index_t *idx = web_get_index_by_id(arg_index);
index_t *idx = web_get_index_by_id(sid.index_id);
if (idx == NULL) {
HTTP_REPLY_NOT_FOUND
return;
}
cJSON *source = get_root_document_by_id(arg_index, arg_doc_id);
cJSON *source = get_root_document_by_id(sid.index_id, sid.doc_id);
if (source == NULL) {
HTTP_REPLY_NOT_FOUND
return;
}
if (strlen(idx->desc.rewrite_url) == 0) {
serve_file_from_disk(source, idx, nc, hm);
@@ -473,12 +452,12 @@ void status(struct mg_connection *nc) {
}
free(status);
nc->is_resp = 0;
}
typedef struct {
char *name;
int delete;
char *doc_id;
} tag_req_t;
tag_req_t *parse_tag_request(cJSON *json) {
@@ -501,20 +480,14 @@ tag_req_t *parse_tag_request(cJSON *json) {
return NULL;
}
cJSON *arg_doc_id = cJSON_GetObjectItem(json, "doc_id");
if (arg_doc_id == NULL || !cJSON_IsString(arg_doc_id)) {
return NULL;
}
tag_req_t *req = malloc(sizeof(tag_req_t));
req->delete = arg_delete->valueint;
req->name = arg_name->valuestring;
req->doc_id = arg_doc_id->valuestring;
return req;
}
subreq_ctx_t *elastic_delete_tag(const tag_req_t *req) {
subreq_ctx_t *elastic_delete_tag(const char *sid, const tag_req_t *req) {
char *buf = malloc(sizeof(char) * 8192);
snprintf(buf, 8192,
"{"
@@ -529,12 +502,12 @@ subreq_ctx_t *elastic_delete_tag(const tag_req_t *req) {
);
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, req->doc_id);
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, sid);
return web_post_async(url, buf, WebCtx.es_insecure_ssl);
}
subreq_ctx_t *elastic_write_tag(const tag_req_t *req) {
subreq_ctx_t *elastic_write_tag(const char *sid, const tag_req_t *req) {
char *buf = malloc(sizeof(char) * 8192);
snprintf(buf, 8192,
"{"
@@ -549,21 +522,18 @@ subreq_ctx_t *elastic_write_tag(const tag_req_t *req) {
);
char url[4096];
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, req->doc_id);
snprintf(url, sizeof(url), "%s/%s/_update/%s", WebCtx.es_url, WebCtx.es_index, sid);
return web_post_async(url, buf, WebCtx.es_insecure_ssl);
}
void tag(struct mg_connection *nc, struct mg_http_message *hm) {
if (hm->uri.len != SIST_INDEX_ID_LEN + 4) {
sist_id_t sid;
if (hm->uri.len != 22 || !parse_sid(&sid, hm->uri.ptr + 5)) {
LOG_DEBUGF("serve.c", "Invalid tag path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
char arg_index[SIST_INDEX_ID_LEN];
memcpy(arg_index, hm->uri.ptr + 5, SIST_INDEX_ID_LEN);
*(arg_index + SIST_INDEX_ID_LEN - 1) = '\0';
char *body = malloc(hm->body.len + 1);
memcpy(body, hm->body.ptr, hm->body.len);
*(body + hm->body.len) = '\0';
@@ -575,36 +545,36 @@ void tag(struct mg_connection *nc, struct mg_http_message *hm) {
return;
}
database_t *db = web_get_database(arg_index);
database_t *db = web_get_database(sid.index_id);
if (db == NULL) {
LOG_DEBUGF("serve.c", "Could not get database for index: %s", arg_index);
LOG_DEBUGF("serve.c", "Could not get database for index: %d", sid.index_id);
HTTP_REPLY_NOT_FOUND
return;
}
tag_req_t *req = parse_tag_request(json);
if (req == NULL) {
LOG_DEBUGF("serve.c", "Could not parse tag request", arg_index);
LOG_DEBUG("serve.c", "Could not parse tag request");
cJSON_Delete(json);
HTTP_REPLY_BAD_REQUEST
return;
}
if (req->delete) {
database_delete_tag(db, req->doc_id, req->name);
database_delete_tag(db, sid.doc_id, req->name);
if (WebCtx.search_backend == SQLITE_SEARCH_BACKEND) {
database_delete_tag(WebCtx.search_db, req->doc_id, req->name);
database_delete_tag(WebCtx.search_db, sid.sid_int64, req->name);
HTTP_REPLY_OK
} else {
nc->fn_data = elastic_delete_tag(req);
nc->fn_data = elastic_delete_tag(sid.sid_str, req);
}
} else {
database_write_tag(db, req->doc_id, req->name);
database_write_tag(db, sid.doc_id, req->name);
if (WebCtx.search_backend == SQLITE_SEARCH_BACKEND) {
database_write_tag(WebCtx.search_db, req->doc_id, req->name);
database_fts_write_tag(WebCtx.search_db, sid.sid_int64, req->name);
HTTP_REPLY_OK
} else {
nc->fn_data = elastic_write_tag(req);
nc->fn_data = elastic_write_tag(sid.sid_str, req);
}
}
@@ -660,7 +630,7 @@ int check_auth0(struct mg_http_message *hm) {
return TRUE;
}
static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(void *fn_data)) {
static void ev_router(struct mg_connection *nc, int ev, void *ev_data) {
if (ev == MG_EV_HTTP_MSG) {
struct mg_http_message *hm = (struct mg_http_message *) ev_data;
@@ -739,11 +709,11 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
if (mg_http_match_uri(hm, "/status")) {
status(nc);
} else if (mg_http_match_uri(hm, "/f/*/*")) {
} else if (mg_http_match_uri(hm, "/f/*")) {
file(nc, hm);
} else if (mg_http_match_uri(hm, "/t/*/*/*")) {
thumbnail_with_num(nc, hm);
} else if (mg_http_match_uri(hm, "/t/*/*")) {
thumbnail_with_num(nc, hm);
} else if (mg_http_match_uri(hm, "/t/*")) {
thumbnail(nc, hm);
} else if (mg_http_match_uri(hm, "/s/*/*")) {
stats_files(nc, hm);
@@ -752,7 +722,7 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
return;
}
tag(nc, hm);
} else if (mg_http_match_uri(hm, "/e/*/*/*")) {
} else if (mg_http_match_uri(hm, "/e/*/*")) {
get_embedding(nc, hm);
return;
} else {
@@ -771,6 +741,7 @@ static void ev_router(struct mg_connection *nc, int ev, void *ev_data, UNUSED(vo
if (r->status_code == 200) {
web_send_headers(nc, 200, r->size, "Content-Type: application/json");
mg_send(nc, r->body, r->size);
nc->is_resp = 0;
} else if (r->status_code == 0) {
sist_log("serve.c", LOG_SIST_ERROR, "Could not connect to elasticsearch!");


@@ -3,7 +3,7 @@
#include "src/web/web_util.h"
typedef struct {
char *index_id;
int index_id;
char *prefix;
int min_depth;
int max_depth;
@@ -23,7 +23,7 @@ typedef struct {
double date_min;
double date_max;
int page_size;
char **index_ids;
int *index_ids;
char **mime_types;
char **tags;
int sort_asc;
@@ -108,7 +108,7 @@ static json_value get_json_bool(cJSON *object, const char *name) {
return (json_value) {item, FALSE};
}
static json_value get_json_float_array(cJSON *object, const char *name) {
static json_value get_json_number_array(cJSON *object, const char *name) {
cJSON *item = cJSON_GetObjectItem(object, name);
if (item == NULL || cJSON_IsNull(item)) {
return (json_value) {NULL, FALSE};
@@ -147,7 +147,6 @@ static json_value get_json_array(cJSON *object, const char *name) {
}
char **json_array_to_c_array(cJSON *json) {
cJSON *element;
char **arr = calloc(cJSON_GetArraySize(json) + 1, sizeof(char *));
int i = 0;
@@ -158,6 +157,17 @@ char **json_array_to_c_array(cJSON *json) {
return arr;
}
int *json_number_array_to_c_array(cJSON *json) {
cJSON *element;
int *arr = calloc(cJSON_GetArraySize(json) + 1, sizeof(int));
int i = 0;
cJSON_ArrayForEach(element, json) {
arr[i++] = (int) element->valuedouble;
}
return arr;
}
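As a quick check of the new helper's contract, the sketch below (a hypothetical standalone program written against the public cJSON API, not project code) parses a JSON array of index ids into the zero-terminated int array the search code now expects, then walks it the way the array_foreach macro does. Like the helper above, it treats 0 as the terminator, so it assumes 0 is never a valid index id.

#include <stdio.h>
#include <stdlib.h>
#include <cjson/cJSON.h>  /* header path assumes a system-wide cJSON install */

/* Same shape as json_number_array_to_c_array() in the hunk above:
 * a calloc'd, zero-terminated array of ints. */
static int *number_array_from_json(cJSON *json) {
    int *arr = calloc(cJSON_GetArraySize(json) + 1, sizeof(int));
    int i = 0;
    cJSON *element;
    cJSON_ArrayForEach(element, json) {
        arr[i++] = (int) element->valuedouble;
    }
    return arr;
}

int main(void) {
    cJSON *json = cJSON_Parse("[1713819000, 1713820000, 1713830000]");
    int *index_ids = number_array_from_json(json);

    /* Iterate until the 0 terminator, mirroring the array_foreach macro. */
    for (int i = 0; index_ids[i] != 0; i++) {
        printf("indexId[%d] = %d\n", i, index_ids[i]);
    }

    free(index_ids);
    cJSON_Delete(json);
    return 0;
}

calloc() zero-fills the extra slot, which is what provides the terminating 0 for free.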
#define DEFAULT_HIGHLIGHT_CONTEXT_SIZE 20
fts_search_req_t *get_search_req(struct mg_http_message *hm) {
@@ -169,7 +179,8 @@ fts_search_req_t *get_search_req(struct mg_http_message *hm) {
json_value req_query, req_path, req_size_min, req_size_max, req_date_min, req_date_max, req_page_size,
req_index_ids, req_mime_types, req_tags, req_sort_asc, req_sort, req_seed, req_after,
req_fetch_aggregations, req_highlight, req_highlight_context_size, req_embedding, req_model;
req_fetch_aggregations, req_highlight, req_highlight_context_size, req_embedding, req_model,
req_search_in_path;
if (!cJSON_IsObject(json) ||
(req_query = get_json_string(json, "query")).invalid ||
@@ -184,11 +195,12 @@ fts_search_req_t *get_search_req(struct mg_http_message *hm) {
(req_seed = get_json_number(json, "seed")).invalid ||
(req_fetch_aggregations = get_json_bool(json, "fetchAggregations")).invalid ||
(req_sort_asc = get_json_bool(json, "sortAsc")).invalid ||
(req_index_ids = get_json_array(json, "indexIds")).invalid ||
(req_index_ids = get_json_number_array(json, "indexIds")).invalid ||
(req_mime_types = get_json_array(json, "mimeTypes")).invalid ||
(req_highlight = get_json_bool(json, "highlight")).invalid ||
(req_search_in_path = get_json_bool(json, "searchInPath")).invalid ||
(req_highlight_context_size = get_json_number(json, "highlightContextSize")).invalid ||
(req_embedding = get_json_float_array(json, "embedding")).invalid ||
(req_embedding = get_json_number_array(json, "embedding")).invalid ||
(req_model = get_json_number(json, "model")).invalid ||
(req_tags = get_json_array(json, "tags")).invalid) {
cJSON_Delete(json);
@@ -242,7 +254,6 @@ fts_search_req_t *get_search_req(struct mg_http_message *hm) {
fts_search_req_t *req = malloc(sizeof(fts_search_req_t));
req->sort = sort;
req->query = req_query.val ? strdup(req_query.val->valuestring) : NULL;
req->path = req_path.val ? strdup(req_path.val->valuestring) : NULL;
req->size_min = req_size_min.val ? req_size_min.val->valuedouble : 0;
req->size_max = req_size_max.val ? req_size_max.val->valuedouble : 0;
@@ -251,7 +262,7 @@ fts_search_req_t *get_search_req(struct mg_http_message *hm) {
req->date_max = req_date_max.val ? req_date_max.val->valuedouble : 0;
req->page_size = (int) req_page_size.val->valuedouble;
req->sort_asc = req_sort_asc.val ? req_sort_asc.val->valueint : TRUE;
req->index_ids = req_index_ids.val ? json_array_to_c_array(req_index_ids.val) : NULL;
req->index_ids = req_index_ids.val ? json_number_array_to_c_array(req_index_ids.val) : NULL;
req->after = req_after.val ? json_array_to_c_array(req_after.val) : NULL;
req->mime_types = req_mime_types.val ? json_array_to_c_array(req_mime_types.val) : NULL;
req->tags = req_tags.val ? json_array_to_c_array(req_tags.val) : NULL;
@@ -261,6 +272,16 @@ fts_search_req_t *get_search_req(struct mg_http_message *hm) {
? req_highlight_context_size.val->valueint
: DEFAULT_HIGHLIGHT_CONTEXT_SIZE;
req->model = req_model.val ? req_model.val->valueint : 0;
if (req_search_in_path.val->valueint == FALSE && req_query.val) {
if (asprintf(&req->query, "- path : %s", req_query.val->valuestring) == -1) {
cJSON_Delete(json);
return NULL;
}
} else {
req->query = req_query.val ? strdup(req_query.val->valuestring) : NULL;
}
req->embedding = req_model.val
? get_float_buffer(req_embedding.val, &req->embedding_size)
: NULL;
@@ -282,7 +303,9 @@ void destroy_search_req(fts_search_req_t *req) {
free(req->query);
free(req->path);
destroy_array(req->index_ids);
if (req->index_ids) {
free(req->index_ids);
}
destroy_array(req->mime_types);
destroy_array(req->tags);
@@ -303,7 +326,7 @@ fts_search_paths_req_t *get_search_paths_req(struct mg_http_message *hm) {
json_value req_index_id, req_min_depth, req_max_depth, req_prefix;
if (!cJSON_IsObject(json) ||
(req_index_id = get_json_string(json, "indexId")).invalid ||
(req_index_id = get_json_number(json, "indexId")).invalid ||
(req_prefix = get_json_string(json, "prefix")).invalid ||
(req_min_depth = get_json_number(json, "minDepth")).val == NULL ||
(req_max_depth = get_json_number(json, "maxDepth")).val == NULL) {
@@ -313,19 +336,16 @@ fts_search_paths_req_t *get_search_paths_req(struct mg_http_message *hm) {
fts_search_paths_req_t *req = malloc(sizeof(fts_search_paths_req_t));
req->index_id = req_index_id.val ? strdup(req_index_id.val->valuestring) : NULL;
req->index_id = req_index_id.val ? req_index_id.val->valueint : 0;
req->prefix = req_prefix.val ? strdup(req_prefix.val->valuestring) : NULL;
req->min_depth = req_min_depth.val->valueint;
req->max_depth = req_max_depth.val->valueint;
req->prefix = req_prefix.val ? strdup(req_prefix.val->valuestring) : NULL;
cJSON_Delete(json);
return req;
}
void destroy_search_paths_req(fts_search_paths_req_t *req) {
if (req->index_id) {
free(req->index_id);
}
if (req->prefix) {
free(req->prefix);
}
@@ -398,11 +418,15 @@ void fts_search(struct mg_connection *nc, struct mg_http_message *hm) {
void fts_get_document(struct mg_connection *nc, struct mg_http_message *hm) {
char doc_id[SIST_DOC_ID_LEN];
memcpy(doc_id, hm->uri.ptr + 7, SIST_INDEX_ID_LEN);
*(doc_id + SIST_INDEX_ID_LEN - 1) = '\0';
sist_id_t sid;
cJSON *json = database_fts_get_document(WebCtx.search_db, doc_id);
if (hm->uri.len != 24 || !parse_sid(&sid, hm->uri.ptr + 7)) {
LOG_DEBUGF("serve.c", "Invalid /fts/d/ path: %.*s", (int) hm->uri.len, hm->uri.ptr);
HTTP_REPLY_NOT_FOUND
return;
}
cJSON *json = database_fts_get_document(WebCtx.search_db, sid.sid_int64);
if (!json) {
HTTP_REPLY_NOT_FOUND


@@ -5,43 +5,49 @@
void web_serve_asset_index_html(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(index_html), HTTP_CROSS_ORIGIN_HEADERS "Content-Type: text/html");
mg_send(nc, index_html, sizeof(index_html));
nc->is_resp = 0;
}
void web_serve_asset_index_js(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(index_js), "Content-Type: application/javascript");
mg_send(nc, index_js, sizeof(index_js));
nc->is_resp = 0;
}
void web_serve_asset_chunk_vendors_js(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(chunk_vendors_js), "Content-Type: application/javascript");
mg_send(nc, chunk_vendors_js, sizeof(chunk_vendors_js));
nc->is_resp = 0;
}
void web_serve_asset_favicon_ico(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(favicon_ico), "Content-Type: image/x-icon");
mg_send(nc, favicon_ico, sizeof(favicon_ico));
nc->is_resp = 0;
}
void web_serve_asset_style_css(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(index_css), "Content-Type: text/css");
mg_send(nc, index_css, sizeof(index_css));
nc->is_resp = 0;
}
void web_serve_asset_chunk_vendors_css(struct mg_connection *nc) {
web_send_headers(nc, 200, sizeof(chunk_vendors_css), "Content-Type: text/css");
mg_send(nc, chunk_vendors_css, sizeof(chunk_vendors_css));
nc->is_resp = 0;
}
index_t *web_get_index_by_id(const char *index_id) {
index_t *web_get_index_by_id(int index_id) {
for (int i = WebCtx.index_count; i >= 0; i--) {
if (strncmp(index_id, WebCtx.indices[i].desc.id, SIST_INDEX_ID_LEN) == 0) {
if (index_id == WebCtx.indices[i].desc.id) {
return &WebCtx.indices[i];
}
}
return NULL;
}
database_t *web_get_database(const char *index_id) {
database_t *web_get_database(int index_id) {
index_t *idx = web_get_index_by_id(index_id);
if (idx != NULL) {
return idx->db;
@@ -92,6 +98,7 @@ void mg_send_json(struct mg_connection *nc, const cJSON *json) {
web_send_headers(nc, 200, strlen(json_str), "Content-Type: application/json");
mg_send(nc, json_str, strlen(json_str));
nc->is_resp = 0;
free(json_str);
}


@@ -10,15 +10,32 @@
// See https://web.dev/coop-coep/
#define HTTP_CROSS_ORIGIN_HEADERS "Cross-Origin-Embedder-Policy: require-corp\r\nCross-Origin-Opener-Policy: same-origin\r\n"
index_t *web_get_index_by_id(const char *index_id);
index_t *web_get_index_by_id(int index_id);
database_t *web_get_database(const char *index_id);
database_t *web_get_database(int index_id);
__always_inline
static char *web_address_to_string(struct mg_addr *addr) {
static char address_to_string_buf[INET6_ADDRSTRLEN];
static char address_to_string_buf[64];
return mg_ntoa(addr, address_to_string_buf, sizeof(address_to_string_buf));
if (addr->is_ip6) {
snprintf(address_to_string_buf, sizeof(address_to_string_buf),
"%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x",
addr->ip[0], addr->ip[1],
addr->ip[2], addr->ip[3],
addr->ip[4], addr->ip[5],
addr->ip[6], addr->ip[7],
addr->ip[8], addr->ip[9],
addr->ip[10], addr->ip[11],
addr->ip[12], addr->ip[13],
addr->ip[14], addr->ip[15]);
} else {
snprintf(address_to_string_buf, sizeof(address_to_string_buf),
"%d.%d.%d.%d",
addr->ip[0], addr->ip[1], addr->ip[2], addr->ip[3]);
}
return address_to_string_buf;
}
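Since the helper above now formats client addresses by hand rather than calling mg_ntoa, here is a standalone sketch of the same formatting logic. The fake_addr_t struct is a stand-in for the fields the hunk reads from Mongoose's struct mg_addr (an is_ip6 flag and a 16-byte ip array); it is illustration only, not the real Mongoose type.

#include <stdio.h>
#include <stdint.h>

/* Stand-in for the fields of struct mg_addr that the helper reads. */
typedef struct {
    uint8_t ip[16];
    int is_ip6;
} fake_addr_t;

/* Same formatting as web_address_to_string() in the hunk above. */
static const char *addr_to_string(const fake_addr_t *addr, char *buf, size_t len) {
    if (addr->is_ip6) {
        snprintf(buf, len,
                 "%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x",
                 addr->ip[0], addr->ip[1], addr->ip[2], addr->ip[3],
                 addr->ip[4], addr->ip[5], addr->ip[6], addr->ip[7],
                 addr->ip[8], addr->ip[9], addr->ip[10], addr->ip[11],
                 addr->ip[12], addr->ip[13], addr->ip[14], addr->ip[15]);
    } else {
        snprintf(buf, len, "%d.%d.%d.%d",
                 addr->ip[0], addr->ip[1], addr->ip[2], addr->ip[3]);
    }
    return buf;
}

int main(void) {
    char buf[64];
    fake_addr_t v4 = {.ip = {192, 168, 1, 10}, .is_ip6 = 0};
    fake_addr_t v6 = {.ip = {0x20, 0x01, 0x0d, 0xb8, [15] = 0x01}, .is_ip6 = 1};

    printf("%s\n", addr_to_string(&v4, buf, sizeof(buf))); /* 192.168.1.10 */
    printf("%s\n", addr_to_string(&v6, buf, sizeof(buf))); /* fully expanded IPv6 form */
    return 0;
}

As in the original helper, the result lives in a static buffer there, so the returned pointer is only valid until the next call and is not thread-safe.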
void web_send_headers(struct mg_connection *nc, int status_code, size_t length, char *extra_headers);


@@ -106,12 +106,33 @@ find_library(MUPDF_LIB NAMES liblibmupdf.a)
find_library(CMS_LIB NAMES lcms2)
find_library(JAS_LIB NAMES jasper)
find_library(GUMBO_LIB NAMES gumbo)
find_library(GOMP_LIB NAMES libgomp.a gomp PATHS /usr/lib/gcc/x86_64-linux-gnu/11/ /usr/lib/gcc/x86_64-linux-gnu/5/ /usr/lib/gcc/x86_64-linux-gnu/9/ /usr/lib/gcc/x86_64-linux-gnu/10/ /usr/lib/gcc/aarch64-linux-gnu/7/ /usr/lib/gcc/aarch64-linux-gnu/9/ /usr/lib/gcc/x86_64-linux-gnu/7/ /usr/lib/gcc/aarch64-linux-gnu/11/ /usr/lib/gcc/x86_64-linux-gnu/8/ /usr/lib/gcc/aarch64-linux-gnu/8/)
find_library(GOMP_LIB NAMES libgomp.a gomp
PATHS
/usr/lib/gcc/x86_64-linux-gnu/5/
/usr/lib/gcc/x86_64-linux-gnu/6/
/usr/lib/gcc/x86_64-linux-gnu/7/
/usr/lib/gcc/x86_64-linux-gnu/8/
/usr/lib/gcc/x86_64-linux-gnu/9/
/usr/lib/gcc/x86_64-linux-gnu/10/
/usr/lib/gcc/x86_64-linux-gnu/11/
/usr/lib/gcc/x86_64-linux-gnu/12/
/usr/lib/gcc/aarch64-linux-gnu/5/
/usr/lib/gcc/aarch64-linux-gnu/6/
/usr/lib/gcc/aarch64-linux-gnu/7/
/usr/lib/gcc/aarch64-linux-gnu/8/
/usr/lib/gcc/aarch64-linux-gnu/9/
/usr/lib/gcc/aarch64-linux-gnu/10/
/usr/lib/gcc/aarch64-linux-gnu/11/
/usr/lib/gcc/aarch64-linux-gnu/12/
)
find_package(Leptonica CONFIG REQUIRED)
find_package(FFMPEG REQUIRED)
find_package(libraw CONFIG REQUIRED)
find_package(Freetype REQUIRED)
find_package(FFMPEG REQUIRED)
list(REMOVE_ITEM FFMPEG_LIBRARIES /usr/lib/x86_64-linux-gnu/libm.a)
list(REMOVE_ITEM FFMPEG_LIBRARIES /usr/lib/aarch64-linux-gnu/libm.a)
target_compile_options(
scan
@@ -166,7 +187,6 @@ target_link_libraries(
${WPD_LIB_DIR}/libwpd-0.9.a
${WPD_LIB_DIR}/libwpd-stream-0.9.a
${FREETYPE_LIB}
${HARFBUZZ_LIB}
${JBIG2DEC_LIB}


@@ -191,7 +191,7 @@ scan_code_t parse_archive(scan_arc_ctx_t *ctx, vfile_t *f, document_t *doc, pcre
sub_job->vfile.logf = ctx->logf;
sub_job->vfile.has_checksum = FALSE;
sub_job->vfile.calculate_checksum = f->calculate_checksum;
strcpy(sub_job->parent, doc->doc_id);
strcpy(sub_job->parent, doc->filepath);
while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
struct stat entry_stat = *archive_entry_stat(entry);


@@ -20,7 +20,6 @@ typedef struct {
parse_callback_t parse;
log_callback_t log;
logf_callback_t logf;
store_callback_t store;
char passphrase[4096];
} scan_arc_ctx_t;


@@ -54,7 +54,6 @@ void parse_comic(scan_comic_ctx_t *ctx, vfile_t *f, document_t *doc) {
.max_media_buffer = 0,
.log = ctx->log,
.logf = ctx->logf,
.store = ctx->store,
};
ret = store_image_thumbnail(&media_ctx, buf, entry_size, doc, file_path);


@@ -7,7 +7,6 @@
typedef struct {
log_callback_t log;
logf_callback_t logf;
store_callback_t store;
int enable_tn;
int tn_size;

Some files were not shown because too many files have changed in this diff.