Commit Graph

14 Commits

Author | SHA1 | Message | Date
Richard Patel | 8f6f8fd17f | fasthttp uri | 2018-11-16 04:10:45 +01:00
Richard Patel | ffde1a9e5d | Timeout and results saving | 2018-11-15 20:14:31 +01:00
Richard Patel | a268c6dbcf | Reduce WaitQueue usage | 2018-11-12 00:38:22 +01:00
Richard Patel | 4c071171eb | Exclude dups in dir instead of keeping hashes of links | 2018-11-11 23:11:30 +01:00
Richard Patel | a8c27b2d21 | Hash links | 2018-11-06 02:01:53 +01:00
Richard Patel | ed5e35f005 | Performance improvements | 2018-11-06 00:34:22 +01:00
Richard Patel | 77cb45dbec | Detect directory symlinks | 2018-10-28 18:37:18 +01:00
Richard Patel | b1c40767e0 | Remember scanned URLs | 2018-10-28 17:07:30 +01:00
Richard Patel | ab5874129f | Don't retry on 401/403 | 2018-10-28 03:47:29 +01:00
Richard Patel | faad19f121 | more stuff | 2018-10-28 03:41:16 +01:00
Richard Patel | 4ea5f8a410 | Handle HTTP statuses | 2018-10-28 03:22:25 +01:00
Richard Patel | 1c33346f45 | Fix crawl descent | 2018-10-28 03:06:18 +01:00
Richard Patel | a507110787 | Add stats interval parameter | 2018-10-28 02:47:20 +02:00
Richard Patel | 79f540bf29 | Scheduler | 2018-10-28 02:40:12 +02:00