| Simon | 55a0fde19d | Skip 'Parent directory' links more efficiently | 2018-07-17 11:20:58 -04:00 |
| Simon | 756e331c83 | Fixed bug in crawler when file count in a directory is greater than 150 | 2018-07-17 11:03:10 -04:00 |
| Simon | cf96d1697d | Fixed bug when submitting | 2018-07-16 20:34:42 -04:00 |
| Simon | a8a658f55b | Crawl server names that are numeric now show up in stats page | 2018-07-15 21:33:37 -04:00 |
| Simon | 3b3661cae2 | Typo | 2018-07-15 21:23:49 -04:00 |
| Simon | 0227684a53 | Added API commands | 2018-07-15 21:21:57 -04:00 |
| Simon | 8a19fa0ce7 | Decreased bulk enqueue limit | 2018-07-15 15:31:21 -04:00 |
| Simon | e4cb91376f | Fixed typo in readme | 2018-07-15 12:54:28 -04:00 |
| Simon | c35491cb15 | Multi threading for bulk enqueue | 2018-07-15 12:50:26 -04:00 |
| Simon | 08c3e119f0 | Typo | 2018-07-15 10:52:04 -04:00 |
| Simon | 112400886e | Crawler no longer crashes when website has no files | 2018-07-15 10:46:48 -04:00 |
| Simon | e18ded7ac1 | Temporarily removed logger in async methods (https://stackoverflow.com/questions/37907350) | 2018-07-15 10:35:13 -04:00 |
| Simon | 152a6f20fb | Re-enabled multi threaded file requests for large directories | 2018-07-15 08:54:36 -04:00 |
| Simon | 2f1b0c96f1 | retry if deleting docs fail | 2018-07-14 21:34:25 -04:00 |
| Simon | f452d0f8b2 | file lists now deleted after indexing | 2018-07-14 20:41:20 -04:00 |
| Simon | 51a47b3628 | Removed debug line | 2018-07-14 17:36:16 -04:00 |
| Simon | fe1d29aaea | Crawl tasks are now fetched by the crawlers instead of pushed by the server | 2018-07-14 17:31:18 -04:00 |
| Simon | d9e9f53f92 | Website should stay online even if elasticsearch is down / timing out | 2018-07-12 12:06:45 -04:00 |
| Simon | f202caece8 | Increased cooldown time for indexing | 2018-07-12 11:45:50 -04:00 |
| Simon | c0327fecda | Increased delete_docs timeout value | 2018-07-12 11:30:16 -04:00 |
| Simon | 290322dfa7 | Indexing is a bit gentler on server resources and some pages have been memory cached | 2018-07-12 11:26:20 -04:00 |
| Simon | 1b743e7aba | Updated README and captchas are now toggled from the config | 2018-07-10 22:57:52 -04:00 |
| Simon | 123f38e65d | Updated README with web server information | 2018-07-10 22:35:54 -04:00 |
| Simon | 2b83698292 | Changed 'no results' text in search page | 2018-07-10 21:09:21 -04:00 |
| Simon | 1ee1c3c35d | safer document deletion | 2018-07-10 19:26:06 -04:00 |
| Simon | 35b40f002d | Modified timeout values | 2018-07-10 13:07:54 -04:00 |
| Simon | d138db8f06 | Added filter to check if a website can be scanned from its parent directory | 2018-07-10 10:14:23 -04:00 |
| Simon | f226b82f5a | Increased website count per page in /website/ | 2018-07-08 10:44:25 -04:00 |
| Simon | 711e8282ef | 'Go to random website' button, and navigation in the website list | 2018-07-08 10:42:14 -04:00 |
| Simon | 9ff21e7943 | Fixed indentation error | 2018-06-28 22:47:14 -04:00 |
| Simon | 4c9d79fdbf | Added filter for large files in stats | 2018-06-28 10:40:54 -04:00 |
| Simon | 2638e47360 | Only log searches in es | 2018-06-27 15:39:48 -04:00 |
| Simon | 5383ad6aea | Searches are not saved to database | 2018-06-27 15:29:50 -04:00 |
| Simon | 14037c5f21 | Added more extension types and adjusted global stats histograms | 2018-06-27 11:53:32 -04:00 |
| Simon | 10e1afb2e4 | Small fix to allow uppercase in extension names | 2018-06-27 10:12:59 -04:00 |
| Simon | 6a3d540de2 | Added date filter in search options and github banner on homepage | 2018-06-27 10:05:33 -04:00 |
| Simon | b570e81bec | More search options | 2018-06-26 21:38:26 -04:00 |
| Simon | b1ad39c204 | bugfix when crawl server is timing out | 2018-06-26 20:25:28 -04:00 |
| Simon | 4abd8d12e2 | Added size filter | 2018-06-26 20:21:24 -04:00 |
| Simon | f859bd3f8d | Removed unmaintained tests | 2018-06-26 19:07:14 -04:00 |
| Simon | 8ea57967e6 | Filter by extension type | 2018-06-26 19:02:46 -04:00 |
| Simon | a0bd45c829 | Increased ES timeouts | 2018-06-26 17:01:17 -04:00 |
| Simon | e384efd403 | Bugfix for http crawler | 2018-06-25 20:36:31 -04:00 |
| Simon | d7ce1670a8 | Logging and bugfix for http crawler | 2018-06-25 14:36:16 -04:00 |
| Simon | 5fd00f22af | Task logs now stored on main server | 2018-06-24 20:32:02 -04:00 |
| Simon | 059d9fd366 | Added button to queue empty websites | 2018-06-24 19:33:15 -04:00 |
| Simon | f6ee338c0f | Removed unused statement | 2018-06-24 18:24:41 -04:00 |
| Simon | e11343de23 | More FTP crawler bug fixes | 2018-06-24 18:05:30 -04:00 |
| Simon | ab35ce96cc | FTP crawler bug fixes | 2018-06-24 16:44:21 -04:00 |
| Simon | f603f41754 | Updated readme | 2018-06-24 14:27:44 -04:00 |