Richard Patel | 24ee6fcba2 | Quickfix: Revert FTP give back | 2018-11-17 12:43:30 +01:00
Richard Patel | bfb18d62b2 | mini fix | 2018-11-17 05:27:09 +01:00
Richard Patel | f4054441ab | Return FTP tasks | 2018-11-17 05:07:52 +01:00
Richard Patel | f8d2bf386d | Fix FTP error ignore | 2018-11-17 04:57:19 +01:00
Richard Patel | f41198b00c | Ignore FTP URLs | 2018-11-17 04:50:59 +01:00
Richard Patel | 7fdffff58f | Update config.yml | 2018-11-17 04:19:04 +01:00
Richard Patel | d596882b40 | Fix ton of bugs | 2018-11-17 04:18:22 +01:00
Richard Patel | 0fe97a8058 | Update README.md | 2018-11-17 01:36:07 +01:00
Richard Patel | 718f9d7fbc | Rename project | 2018-11-17 01:33:15 +01:00
Richard Patel | f1687679ab | Unescape results & don't recrawl 404 | 2018-11-17 01:21:20 +01:00
Richard Patel | 145d37f84a | Fix wait, add back crawl command | 2018-11-17 00:49:09 +01:00
Richard Patel | cc777bcaeb | redblackhash: Use bytes.Compare | 2018-11-16 21:17:39 +01:00
Simon | 1e78cea7e7 | Saved path should not contain file name | 2018-11-16 13:58:12 -05:00
Richard Patel | 3f85cf679b | Getting tasks | 2018-11-16 04:47:08 +01:00
Richard Patel | 3c39f0d621 | Random hacks | 2018-11-16 03:22:51 +01:00
Richard Patel | 50952791c5 | Almost done | 2018-11-16 03:12:26 +01:00
Richard Patel | 30bf98ad34 | Fix tests | 2018-11-16 03:02:10 +01:00
Richard Patel | ccaf758e90 | Remove URL.Opaque | 2018-11-16 01:53:16 +01:00
Richard Patel | f668365edb | Add tests | 2018-11-16 01:51:34 +01:00
Richard Patel | 1db8ff43bb | Bump version | 2018-11-16 00:25:11 +01:00
Richard Patel | 82234f949e | Less tokenizer allocations | 2018-11-16 00:22:40 +01:00
Richard Patel | 084b3a5903 | Optimizing with hexa :P | 2018-11-15 23:51:31 +01:00
Richard Patel | ac0b8d2d0b | Blacklist all paths with a query parameter | 2018-11-15 23:36:41 +01:00
Richard Patel | ffde1a9e5d | Timeout and results saving | 2018-11-15 20:14:31 +01:00
Richard Patel | a268c6dbcf | Reduce WaitQueue usage | 2018-11-12 00:38:22 +01:00
Richard Patel | 4c071171eb | Exclude dups in dir instead of keeping hashes of links | 2018-11-11 23:11:30 +01:00
Richard Patel | 9c8174dd8d | Fix header parsing | 2018-11-11 18:53:17 +01:00
Richard Patel | 93272e1da1 | Update README.md | 2018-11-06 02:41:20 +01:00
Richard Patel | 0344a120ff | fasturl: Remove path escape | 2018-11-06 02:15:09 +01:00
Richard Patel | 6e6afd771e | fasturl: Remove query | 2018-11-06 02:11:22 +01:00
Richard Patel | a8c27b2d21 | Hash links | 2018-11-06 02:01:53 +01:00
Richard Patel | ed5e35f005 | Performance improvements | 2018-11-06 00:34:22 +01:00
Richard Patel | a12bca01c8 | fasturl: Discard UserInfo | 2018-11-06 00:33:57 +01:00
Richard Patel | ba9c818461 | fasturl: Don't parse username and password | 2018-11-06 00:28:42 +01:00
Richard Patel | 9cf31b1d81 | fasturl: Remove fragment | 2018-11-06 00:17:10 +01:00
Richard Patel | ed0d9c681f | fasturl: Replace scheme with enum | 2018-11-06 00:15:12 +01:00
Richard Patel | b88d45fc21 | fasturl: Remove allocs from Parse | 2018-11-05 23:05:21 +01:00
Richard Patel | 4989adff9f | Add net/url package | 2018-11-05 22:57:57 +01:00
Richard Patel | add6581804 | Add resource stats logging | 2018-11-05 22:41:17 +01:00
Richard Patel | 395a6f30b2 | Fix pprof | 2018-11-05 21:55:07 +01:00
Richard Patel | a4e53053b9 | Add LICENSE; oi m8 got a loicense for that | 2018-11-05 21:42:59 +01:00
Richard Patel | e39565377e | Add pprof debug server | 2018-11-05 21:39:15 +01:00
Richard Patel | 77cb45dbec | Detect directory symlinks | 2018-10-28 18:37:18 +01:00
Richard Patel | fa37d45378 | Remove too many crawler block; More logging | 2018-10-28 18:17:04 +01:00
Richard Patel | bfd7302be8 | Add urfave/cli app | 2018-10-28 17:59:46 +01:00
Richard Patel | b1c40767e0 | Remember scanned URLs | 2018-10-28 17:07:30 +01:00
Richard Patel | c196b6f20d | Better config | 2018-10-28 14:19:09 +01:00
Richard Patel | ddfdce9d0f | Refactor a bit | 2018-10-28 13:43:45 +01:00
Richard Patel | 7c4ed9d41e | Remove WIP disclaimer | 2018-10-28 03:48:33 +01:00
Richard Patel | ab5874129f | Don't retry on 401/403 | 2018-10-28 03:47:29 +01:00