Simon
|
2f1b0c96f1
|
retry if deleting docs fails
|
2018-07-14 21:34:25 -04:00 |
|
Simon
|
f452d0f8b2
|
file lists now deleted after indexing
|
2018-07-14 20:41:20 -04:00 |
|
Simon
|
51a47b3628
|
Removed debug line
|
2018-07-14 17:36:16 -04:00 |
|
Simon
|
fe1d29aaea
|
Crawl tasks are now fetched by the crawlers instead of pushed by the server
|
2018-07-14 17:31:18 -04:00 |
|
Simon
|
d9e9f53f92
|
Website should stay online even if elasticsearch is down / timing out
|
2018-07-12 12:06:45 -04:00 |
|
Simon
|
f202caece8
|
Increased cooldown time for indexing
|
2018-07-12 11:45:50 -04:00 |
|
Simon
|
c0327fecda
|
Increased delete_docs timeout value
|
2018-07-12 11:30:16 -04:00 |
|
Simon
|
290322dfa7
|
Indexing is a bit gentler on server resources and some pages have been memory cached
|
2018-07-12 11:26:20 -04:00 |
|
Simon
|
1b743e7aba
|
Updated README and captchas are now toggled from the config
|
2018-07-10 22:57:52 -04:00 |
|
Simon
|
123f38e65d
|
Updated README with web server information
|
2018-07-10 22:35:54 -04:00 |
|
Simon
|
2b83698292
|
Changed 'no results' text in search page
|
2018-07-10 21:09:21 -04:00 |
|
Simon
|
1ee1c3c35d
|
safer document deletion
|
2018-07-10 19:26:06 -04:00 |
|
Simon
|
35b40f002d
|
Modified timeout values
|
2018-07-10 13:07:54 -04:00 |
|
Simon
|
d138db8f06
|
Added filter to check if a website can be scanned from its parent directory
|
2018-07-10 10:14:23 -04:00 |
|
Simon
|
f226b82f5a
|
Increased website count per page in /website/
|
2018-07-08 10:44:25 -04:00 |
|
Simon
|
711e8282ef
|
'Go to random website' button, and navigation in the website list
|
2018-07-08 10:42:14 -04:00 |
|
Simon
|
9ff21e7943
|
Fixed indentation error
|
2018-06-28 22:47:14 -04:00 |
|
Simon
|
4c9d79fdbf
|
Added filter for large files in stats
|
2018-06-28 10:40:54 -04:00 |
|
Simon
|
2638e47360
|
Only log searches in es
|
2018-06-27 15:39:48 -04:00 |
|
Simon
|
5383ad6aea
|
Searches are not saved to database
|
2018-06-27 15:29:50 -04:00 |
|
Simon
|
14037c5f21
|
Added more extension types and adjusted global stats histograms
|
2018-06-27 11:53:32 -04:00 |
|
Simon
|
10e1afb2e4
|
Small fix to allow uppercase in extension names
|
2018-06-27 10:12:59 -04:00 |
|
Simon
|
6a3d540de2
|
Added date filter in search options and github banner on homepage
|
2018-06-27 10:05:33 -04:00 |
|
Simon
|
b570e81bec
|
More search options
|
2018-06-26 21:38:26 -04:00 |
|
Simon
|
b1ad39c204
|
bugfix when crawl server is timing out
|
2018-06-26 20:25:28 -04:00 |
|
Simon
|
4abd8d12e2
|
Added size filter
|
2018-06-26 20:21:24 -04:00 |
|
Simon
|
f859bd3f8d
|
Removed unmaintained tests
|
2018-06-26 19:07:14 -04:00 |
|
Simon
|
8ea57967e6
|
Filter by extension type
|
2018-06-26 19:02:46 -04:00 |
|
Simon
|
a0bd45c829
|
Increased ES timeouts
|
2018-06-26 17:01:17 -04:00 |
|
Simon
|
e384efd403
|
Bugfix for http crawler
|
2018-06-25 20:36:31 -04:00 |
|
Simon
|
d7ce1670a8
|
Logging and bugfix for http crawler
|
2018-06-25 14:36:16 -04:00 |
|
Simon
|
5fd00f22af
|
Task logs now stored on main server
|
2018-06-24 20:32:02 -04:00 |
|
Simon
|
059d9fd366
|
Added button to queue empty websites
|
2018-06-24 19:33:15 -04:00 |
|
Simon
|
f6ee338c0f
|
Removed unused statement
|
2018-06-24 18:24:41 -04:00 |
|
Simon
|
e11343de23
|
More FTP crawler bug fixes
|
2018-06-24 18:05:30 -04:00 |
|
Simon
|
ab35ce96cc
|
FTP crawler bug fixes
|
2018-06-24 16:44:21 -04:00 |
|
Simon
|
f603f41754
|
Updated readme
|
2018-06-24 14:27:44 -04:00 |
|
Simon
|
8e937e69c0
|
Should fix some FTP errors
|
2018-06-24 13:50:55 -04:00 |
|
Simon
|
a6d753c6ee
|
Added redispatch button and fixed typo in load balancing code
|
2018-06-24 10:07:46 -04:00 |
|
Simon
|
1ac510ff53
|
Slots can be updated without removing & adding
|
2018-06-24 09:39:44 -04:00 |
|
Simon
|
348914aba9
|
Removing unused module
|
2018-06-22 17:34:10 -04:00 |
|
Simon
|
e824b2bf3c
|
Updated readme and UI fixes
|
2018-06-22 13:22:58 -04:00 |
|
Simon
|
9d3fc2d71b
|
typo (again)
|
2018-06-21 21:26:44 -04:00 |
|
Simon
|
efd1981e6f
|
typo
|
2018-06-21 21:00:50 -04:00 |
|
Simon
|
7a4432e4d0
|
More bugfixes for looping directories, some work on task dispatching
|
2018-06-21 20:50:26 -04:00 |
|
Simon
|
14d384e366
|
Decentralised crawling should work in theory + temporary fix for going further than the maximum 10k results elasticsearch allows by default
|
2018-06-21 19:44:27 -04:00 |
|
Simon
|
098ad2be72
|
Should fix unknown encoding errors + removed https warnings
|
2018-06-21 19:23:01 -04:00 |
|
Simon
|
80aa8933e6
|
Added rescan button
|
2018-06-21 13:02:16 -04:00 |
|
Simon
|
073551df3c
|
Attempt to handle looping directories
|
2018-06-21 11:54:40 -04:00 |
|
Simon
|
dd93d40a55
|
Small bugfix for ftp crawler
|
2018-06-20 21:56:38 -04:00 |
|