26 Commits

Author SHA1 Message Date
Richard Patel
326e29e5e4
Reset to stable branch 2019-02-22 05:37:45 +01:00
Richard Patel
771d49f2dd
Fix WaitGroup deadlock 2019-02-03 17:14:20 +01:00
Richard Patel
dbd787aa81
Fix WaitGroup crash 2019-02-03 17:09:43 +01:00
Richard Patel
cea6c1658b
Bugfix: Don't schedule new tasks during shutdown 2019-02-03 17:02:44 +01:00
terorie
885af5bb3b
Beta task resuming 2019-02-03 16:50:08 +01:00
Richard Patel
b846498030
Delete URL queues after crawling 2018-11-20 03:05:43 +01:00
Richard Patel
4f3140a39f
Fix queue_count in log 2018-11-20 02:49:03 +01:00
Richard Patel
85d2aac9d4
Performance patch 2018-11-20 02:33:50 +01:00
Richard Patel
b6c0a45900
Job queue disk offloading 2018-11-20 02:03:10 +01:00
Richard Patel
339175220d
Refactor uploading & chunk size parameter 2018-11-18 00:19:43 +01:00
Richard Patel
8060556089
Fix: make crawled dir 2018-11-17 13:36:35 +01:00
Richard Patel
7b29da9340
Fix file uploads 2018-11-17 12:47:16 +01:00
Richard Patel
d596882b40
Fix ton of bugs 2018-11-17 04:18:22 +01:00
Richard Patel
718f9d7fbc
Rename project 2018-11-17 01:33:15 +01:00
Richard Patel
f1687679ab
Unescape results & don't recrawl 404 2018-11-17 01:21:20 +01:00
Richard Patel
145d37f84a
Fix wait, add back crawl command 2018-11-17 00:49:09 +01:00
Richard Patel
3f85cf679b
Getting tasks 2018-11-16 04:47:08 +01:00
Richard Patel
ffde1a9e5d
Timeout and results saving 2018-11-15 20:14:31 +01:00
Richard Patel
a268c6dbcf
Reduce WaitQueue usage 2018-11-12 00:38:22 +01:00
Richard Patel
add6581804
Add resource stats logging 2018-11-05 22:41:17 +01:00
Richard Patel
fa37d45378
Remove too many crawler block; More logging 2018-10-28 18:17:04 +01:00
Richard Patel
b1c40767e0
Remember scanned URLs 2018-10-28 17:07:30 +01:00
Richard Patel
ddfdce9d0f
Refactor a bit 2018-10-28 13:43:45 +01:00
Richard Patel
faad19f121
more stuff 2018-10-28 03:41:16 +01:00
Richard Patel
a507110787
Add stats interval parameter 2018-10-28 02:47:20 +02:00
Richard Patel
79f540bf29
Scheduler 2018-10-28 02:40:12 +02:00