305 Commits

Author SHA1 Message Date
Simon
bbe8ed07a8 Reset page number on search 2018-08-14 16:20:00 -04:00
Simon
c92f2f4937 Should fix export problem 2018-08-14 12:21:34 -04:00
Simon
5c386707ed Should fix import error 2018-08-13 14:03:22 -04:00
Simon
edede200f4 Decreased number of indexed documents per second 2018-08-12 14:58:27 -04:00
Simon
cc4c70f400 Request content is read all at once 2018-08-11 13:05:24 -04:00
Simon
78d1b7a5bd Next and previous buttons now work with captcha 2018-08-10 16:30:40 -04:00
Simon
bab68819df Increased stats generation interval 2018-08-10 15:27:37 -04:00
Simon
aab1abba54 Fixed websites link 2018-08-10 15:24:43 -04:00
Simon
c29af180c5 Captcha for searches 2018-08-10 12:46:40 -04:00
Simon
c94cf5b313 Adjusted timeout values (again) 2018-08-10 11:46:16 -04:00
Simon
a6b1d9cba3 More help when no search results 2018-08-09 21:43:07 -04:00
Simon
faeff701de Increased search timeout value 2018-08-09 18:33:35 -04:00
Simon
42d858b62a Queue can be emptied more easily pt.2 2018-08-09 17:14:17 -04:00
Simon
5a084cb857 Queue can be emptied more easily 2018-08-09 17:12:43 -04:00
Simon
ffeed4192e Refresh index before reddit comment callback 2018-08-09 16:19:21 -04:00
Simon
8ffd9179d2 Increased stats timeout value 2018-08-09 14:26:22 -04:00
Simon
f729b462f0 od_util can be used when od-database is a submodule part 2 2018-08-08 23:31:50 -04:00
Simon
88166054ad od_util can be used when od-database is a submodule 2018-08-08 23:07:09 -04:00
Simon
89e378ffd9 Reddit comment callback is now an edit instead of a new comment 2018-08-08 22:41:25 -04:00
Simon
458641654c Minimal configuration for reddit comment callback 2018-08-08 21:24:55 -04:00
Simon
1ff1e039f5 Increased stats generation interval 2018-07-25 12:29:14 -04:00
Simon
65738b0e70 Small fix 2018-07-25 12:05:07 -04:00
Simon
5ff198b88a Fix for negative sizes 2018-07-25 11:37:12 -04:00
Simon
49206af566 Updated requirements 2018-07-25 11:35:41 -04:00
Simon
3d63184287 Increased website scatter size 2018-07-25 11:33:21 -04:00
Simon
34d1f375a8 Crawler performance improvements 2018-07-25 11:27:50 -04:00
Simon
fbbe952e4d Stats are generated in background and stored to file instead of on-demand 2018-07-24 20:29:25 -04:00
Simon Fortier
bf82478fee Delete blacklist.txt 2018-07-21 11:23:59 -04:00
Simon
f12d5d524a Exceptions during push_result are logged instead of raised 2018-07-21 10:45:17 -04:00
Simon
d43cf3b0ce Empty queue timeout increased so that workers do not all die before the website is dropped 2018-07-20 14:11:17 -04:00
Simon
d3801adf74 Typo 2018-07-20 13:39:23 -04:00
Simon
1df5d194d2 Very slow websites are skipped. Should fix infinite waiting bug 2018-07-20 13:34:40 -04:00
Simon
004ade8935 Misc bug fixes 2018-07-20 10:35:17 -04:00
Simon
df5b01dc83 Bug when directory is empty with new file upload (server side) 2018-07-17 18:28:50 -04:00
Simon
8ef1d36c9d Bug when directory is empty with new file upload 2018-07-17 18:24:05 -04:00
Simon
898ffcf410 File upload is made in small chunks 2018-07-17 17:52:17 -04:00
Simon
73afebec28 Added API commands 2018-07-17 13:12:20 -04:00
Simon
55a0fde19d Skip 'Parent directory' links more efficiently 2018-07-17 11:20:58 -04:00
Simon
756e331c83 Fixed bug in crawler when file count in a directory is greater than 150 2018-07-17 11:03:10 -04:00
Simon
cf96d1697d Fixed bug when submitting 2018-07-16 20:34:42 -04:00
Simon
a8a658f55b Crawl server names that are numeric now show up in stats page 2018-07-15 21:33:37 -04:00
Simon
3b3661cae2 Typo 2018-07-15 21:23:49 -04:00
Simon
0227684a53 Added API commands 2018-07-15 21:21:57 -04:00
Simon
8a19fa0ce7 Decreased bulk enqueue limit 2018-07-15 15:31:21 -04:00
Simon
e4cb91376f Fixed typo in readme 2018-07-15 12:54:28 -04:00
Simon
c35491cb15 Multi threading for bulk enqueue 2018-07-15 12:50:26 -04:00
Simon
08c3e119f0 Typo 2018-07-15 10:52:04 -04:00
Simon
112400886e Crawler no longer crashes when website has no files 2018-07-15 10:46:48 -04:00
Simon
e18ded7ac1 Temporarily removed logger in async methods (https://stackoverflow.com/questions/37907350) 2018-07-15 10:35:13 -04:00
Simon
152a6f20fb Re-enabled multi threaded file requests for large directories 2018-07-15 08:54:36 -04:00