223 Commits

Author SHA1 Message Date
Simon Fortier
fff013f253 Update README.md 2018-09-20 19:31:26 -04:00
Simon
bbd5c7694c Fixed typo in title 2018-09-13 17:17:55 -04:00
Simon
85437b1ef9 Merge remote-tracking branch 'origin/master' 2018-09-06 19:46:56 -04:00
Simon
53db765856 Static export file now managed by nginx 2018-09-06 19:38:07 -04:00
Simon
8b13de4a6b Re-init curl handle on error 2018-08-25 16:46:53 -04:00
Simon
faa9ac3ccb Closing curl handle manually just to make sure 2018-08-23 12:48:15 -04:00
Simon
dff4125c9f Bugfix post-pycurl update pt. 3 (Sorry!) 2018-08-23 12:47:17 -04:00
Simon
25e1e58828 Bugfix post-pycurl update pt. 3 (Sorry!) 2018-08-23 12:46:00 -04:00
Simon
6ffc43601b Bugfix post-pycurl update pt. 2 2018-08-23 12:40:13 -04:00
Simon
484a0baf9d Bugfix post-pycurl update 2018-08-23 12:37:27 -04:00
Simon
cadaf14c1b Small bugfix 2018-08-23 12:12:23 -04:00
Simon
54b4d2d5b4 removed debug lines 2018-08-23 12:02:07 -04:00
Simon
d42be56dee More debug info 2018-08-23 11:59:23 -04:00
Simon Fortier
8dc8627f78 Update README.md 2018-08-23 11:51:48 -04:00
Simon
85c3aa918d replaced requests by pycurl 2018-08-23 11:47:09 -04:00
Simon
8f218f3c9d Bug fix for pages buttons pt.2 2018-08-16 13:24:00 -04:00
Simon
a2327bac7c Bug fix for pages buttons 2018-08-16 13:13:34 -04:00
Simon
6d27cbca02 xz -> lzma for export 2018-08-15 11:32:36 -04:00
Simon
bbe8ed07a8 Reset page number on search 2018-08-14 16:20:00 -04:00
Simon
c92f2f4937 Should fix export problem 2018-08-14 12:21:34 -04:00
Simon
5c386707ed Should fix import error 2018-08-13 14:03:22 -04:00
Simon
edede200f4 Decreased number of indexed documents per second 2018-08-12 14:58:27 -04:00
Simon
cc4c70f400 Request content is read all at once 2018-08-11 13:05:24 -04:00
Simon
78d1b7a5bd Next and previous buttons now work with captcha 2018-08-10 16:30:40 -04:00
Simon
bab68819df Increased stats generation interval 2018-08-10 15:27:37 -04:00
Simon
aab1abba54 Fixed websites link 2018-08-10 15:24:43 -04:00
Simon
c29af180c5 Captcha for searches 2018-08-10 12:46:40 -04:00
Simon
c94cf5b313 Adjusted timeout values (again) 2018-08-10 11:46:16 -04:00
Simon
a6b1d9cba3 More help when no search results 2018-08-09 21:43:07 -04:00
Simon
faeff701de Increased search timeout value 2018-08-09 18:33:35 -04:00
Simon
42d858b62a Queue can be emptied more easily pt.2 2018-08-09 17:14:17 -04:00
Simon
5a084cb857 Queue can be emptied more easily 2018-08-09 17:12:43 -04:00
Simon
ffeed4192e Refresh index before reddit comment callback 2018-08-09 16:19:21 -04:00
Simon
8ffd9179d2 Increased stats timeout value 2018-08-09 14:26:22 -04:00
Simon
f729b462f0 od_util can be used when od-database is a submodule part 2 2018-08-08 23:31:50 -04:00
Simon
88166054ad od_util can be used when od-database is a submodule 2018-08-08 23:07:09 -04:00
Simon
89e378ffd9 Reddit comment callback is now an edit instead of a new comment 2018-08-08 22:41:25 -04:00
Simon
458641654c Minimal configuration for reddit comment callback 2018-08-08 21:24:55 -04:00
Simon
1ff1e039f5 Increased stats generation interval 2018-07-25 12:29:14 -04:00
Simon
65738b0e70 Small fix 2018-07-25 12:05:07 -04:00
Simon
5ff198b88a Fix for negative sizes 2018-07-25 11:37:12 -04:00
Simon
49206af566 Updated requirements 2018-07-25 11:35:41 -04:00
Simon
3d63184287 Increased website scatter size 2018-07-25 11:33:21 -04:00
Simon
34d1f375a8 Crawler performance improvements 2018-07-25 11:27:50 -04:00
Simon
fbbe952e4d Stats are generated in background and stored to file instead of on-demand 2018-07-24 20:29:25 -04:00
Simon Fortier
bf82478fee Delete blacklist.txt 2018-07-21 11:23:59 -04:00
Simon
f12d5d524a exceptions during push_result are logged instead of raised 2018-07-21 10:45:17 -04:00
Simon
d43cf3b0ce Empty queue timeout increased so that workers do not all die before the website is dropped 2018-07-20 14:11:17 -04:00
Simon
d3801adf74 Typo 2018-07-20 13:39:23 -04:00
Simon
1df5d194d2 Very slow websites are skipped. Should fix infinite waiting bug 2018-07-20 13:34:40 -04:00