| Simon | 0227684a53 | Added API commands | 2018-07-15 21:21:57 -04:00 |
| Simon | 5afdfb2b3c | fixed navbar icon for mobile | 2018-06-19 21:13:36 -04:00 |
| Simon | e54609972c | Overwrite document on re-index, update website last_modified on task complete, delete website files on index complete | 2018-06-19 11:24:28 -04:00 |
| Simon | 8768e39f08 | Added stats page | 2018-06-18 19:56:25 -04:00 |
| Simon | 8a73142ff8 | Support for more than just utf-8 and removed some debug info | 2018-06-18 13:44:19 -04:00 |
| Simon | b63c7190c3 | Improved external link detection | 2018-06-18 12:14:05 -04:00 |
| Simon | 344e7274d7 | Simplified url joining and splitting, switched from lxml to html.parser, various memory usage optimizations | 2018-06-17 22:10:46 -04:00 |
| Simon | 1283cc9599 | Should fix memory usage problem when crawling (part three) | 2018-06-16 20:32:50 -04:00 |
| Simon | 1bd58468eb | Bug fixes for FTP crawler | 2018-06-13 15:54:45 -04:00 |
| Simon | af2601ee70 | Fixed file duplication problem | 2018-06-12 15:55:52 -04:00 |
| Simon | d61fd75890 | Tasks can now be queued from the web interface. Tasks are dispatched to the crawl server(s) | 2018-06-12 13:44:03 -04:00 |
| Simon | 6d48f1f780 | Task crawl result now logged in a database | 2018-06-12 11:03:45 -04:00 |
| Simon | 72495275b0 | Elasticsearch search engine (import from json) | 2018-06-11 22:35:49 -04:00 |
| Simon | d849227798 | barebones crawl_server microservice | 2018-06-11 19:00:43 -04:00 |