| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Simon | 458641654c | Minimal configuration for reddit comment callback | 2018-08-08 21:24:55 -04:00 |
| Simon | 1ff1e039f5 | Increased stats generation interval | 2018-07-25 12:29:14 -04:00 |
| Simon | 65738b0e70 | Small fix | 2018-07-25 12:05:07 -04:00 |
| Simon | 5ff198b88a | Fix for negative sizes | 2018-07-25 11:37:12 -04:00 |
| Simon | 49206af566 | Updated requirements | 2018-07-25 11:35:41 -04:00 |
| Simon | 3d63184287 | Increased website scatter size | 2018-07-25 11:33:21 -04:00 |
| Simon | 34d1f375a8 | Crawler performance improvements | 2018-07-25 11:27:50 -04:00 |
| Simon | fbbe952e4d | Stats are generated in background and stored to file instead of on-demand | 2018-07-24 20:29:25 -04:00 |
| Simon Fortier | bf82478fee | Delete blacklist.txt | 2018-07-21 11:23:59 -04:00 |
| Simon | f12d5d524a | exceptions during push_result are logged instead of raised | 2018-07-21 10:45:17 -04:00 |
| Simon | d43cf3b0ce | Empty queue timeout increased to avoid that all workers die before the website is dropped | 2018-07-20 14:11:17 -04:00 |
| Simon | d3801adf74 | Typo | 2018-07-20 13:39:23 -04:00 |
| Simon | 1df5d194d2 | Very slow websites are skipped. Should fix infinite waiting bug | 2018-07-20 13:34:40 -04:00 |
| Simon | 004ade8935 | Misc bug fixes | 2018-07-20 10:35:17 -04:00 |
| Simon | df5b01dc83 | Bug when directory is empty with new file upload (server side) | 2018-07-17 18:28:50 -04:00 |
| Simon | 8ef1d36c9d | Bug when directory is empty with new file upload | 2018-07-17 18:24:05 -04:00 |
| Simon | 898ffcf410 | File upload is made in small chunks | 2018-07-17 17:52:17 -04:00 |
| Simon | 73afebec28 | Added API commands | 2018-07-17 13:12:20 -04:00 |
| Simon | 55a0fde19d | Skip 'Parent directory' links more efficiently | 2018-07-17 11:20:58 -04:00 |
| Simon | 756e331c83 | Fixed bug in crawler when file count in a directory is greater than 150 | 2018-07-17 11:03:10 -04:00 |
| Simon | cf96d1697d | Fixed bug when submitting | 2018-07-16 20:34:42 -04:00 |
| Simon | a8a658f55b | Crawl server names that are numeric now show up in stats page | 2018-07-15 21:33:37 -04:00 |
| Simon | 3b3661cae2 | Typo | 2018-07-15 21:23:49 -04:00 |
| Simon | 0227684a53 | Added API commands | 2018-07-15 21:21:57 -04:00 |
| Simon | 8a19fa0ce7 | Decreased bulk enqueue limit | 2018-07-15 15:31:21 -04:00 |
| Simon | e4cb91376f | Fixed typo in readme | 2018-07-15 12:54:28 -04:00 |
| Simon | c35491cb15 | Multi threading for bulk enqueue | 2018-07-15 12:50:26 -04:00 |
| Simon | 08c3e119f0 | Typo | 2018-07-15 10:52:04 -04:00 |
| Simon | 112400886e | Crawler no longer crashes when website has no files | 2018-07-15 10:46:48 -04:00 |
| Simon | e18ded7ac1 | Temporarily removed logger in async methods (https://stackoverflow.com/questions/37907350) | 2018-07-15 10:35:13 -04:00 |
| Simon | 152a6f20fb | Re-enabled multi threaded file requests for large directories | 2018-07-15 08:54:36 -04:00 |
| Simon | 2f1b0c96f1 | retry if deleting docs fail | 2018-07-14 21:34:25 -04:00 |
| Simon | f452d0f8b2 | file lists now deleted after indexing | 2018-07-14 20:41:20 -04:00 |
| Simon | 51a47b3628 | Removed debug line | 2018-07-14 17:36:16 -04:00 |
| Simon | fe1d29aaea | Crawl tasks are now fetched by the crawlers instead of pushed by the server | 2018-07-14 17:31:18 -04:00 |
| Simon | d9e9f53f92 | Website should stay online even if elasticsearch is down / timing out | 2018-07-12 12:06:45 -04:00 |
| Simon | f202caece8 | Increased cooldown time for indexing | 2018-07-12 11:45:50 -04:00 |
| Simon | c0327fecda | Increased delete_docs timeout value | 2018-07-12 11:30:16 -04:00 |
| Simon | 290322dfa7 | Indexing is a bit gentler on server resources and some pages have been memory cached | 2018-07-12 11:26:20 -04:00 |
| Simon | 1b743e7aba | Updated README and captchas are now toggled from the config | 2018-07-10 22:57:52 -04:00 |
| Simon | 123f38e65d | Updated README with web server information | 2018-07-10 22:35:54 -04:00 |
| Simon | 2b83698292 | Changed 'no results' text in search page | 2018-07-10 21:09:21 -04:00 |
| Simon | 1ee1c3c35d | safer document deletion | 2018-07-10 19:26:06 -04:00 |
| Simon | 35b40f002d | Modified timeout values | 2018-07-10 13:07:54 -04:00 |
| Simon | d138db8f06 | Added filter to check if a website can be scanned from its parent directory | 2018-07-10 10:14:23 -04:00 |
| Simon | f226b82f5a | Increased website count per page in /website/ | 2018-07-08 10:44:25 -04:00 |
| Simon | 711e8282ef | 'Go to random website' button, and navigation in the website list | 2018-07-08 10:42:14 -04:00 |
| Simon | 9ff21e7943 | Fixed indentation error | 2018-06-28 22:47:14 -04:00 |
| Simon | 4c9d79fdbf | Added filter for large files in stats | 2018-06-28 10:40:54 -04:00 |
| Simon | 2638e47360 | Only log searches in es | 2018-06-27 15:39:48 -04:00 |