From 8b9d8bfd177699962760616538cbc2a9b81c5c25 Mon Sep 17 00:00:00 2001
From: Richard Patel
Date: Fri, 22 Feb 2019 06:04:10 +0100
Subject: [PATCH] Fix README.md format

---
 README.md | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 96dd2d8..9cb37f3 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ https://od-db.the-eye.eu/
    - Start with `./od-database-crawler server `
 3. With Docker
-   ```dockerfile
+   ```bash
    docker run \
    -e OD_SERVER_URL=xxx \
    -e OD_SERVER_TOKEN=xxx \
@@ -37,20 +37,18 @@ https://od-db.the-eye.eu/

 Here are the most important config flags. For more fine control, take a look at `/config.yml`.

-| Flag/Config             | Environment/Docker         | Description | Example |
-| ----------------------- | -------------------------- | ------------------------------------------------------------ | ----------------------------------- |
-| `server.url` | `OD_SERVER_URL` | OD-DB Server URL | `https://od-db.mine.the-eye.eu/api` |
-| `server.token` | `OD_SERVER_TOKEN` | OD-DB Server Access Token | _Ask Hexa **TM**_ |
-| `server.recheck` | `OD_SERVER_RECHECK` | Job Fetching Interval | `3s` |
-| `output.crawl_stats` | `OD_OUTPUT_CRAWL_STATS` | Crawl Stats Logging Interval (0 = disabled) | `500ms` |
-| `output.resource_stats` | `OD_OUTPUT_RESORUCE_STATS` | Resource Stats Logging Interval (0 = disabled) | `8s` |
-| `output.log` | `OD_OUTPUT_LOG` | Log File (none = disabled) | `crawler.log` |
-| `crawl.tasks` | `OD_CRAWL_TASKS` | Max number of sites to crawl concurrently | `500` |
-| `crawl.connections` | `OD_CRAWL_CONNECTIONS` | HTTP connections per site | `1` |
-| `crawl.retries` | `OD_CRAWL_RETRIES` | How often to retry after a temporary failure (e.g. `HTTP 429` or timeouts) | `5` |
-| `crawl.dial_timeout` | `OD_CRAWL_DIAL_TIMEOUT` | TCP Connect timeout | `5s` |
-| `crawl.timeout` | `OD_CRAWL_TIMEOUT` | HTTP request timeout | `20s` |
-| `crawl.user-agent` | `OD_CRAWL_USER_AGENT` | HTTP Crawler User-Agent | `googlebot/1.2.3` |
-| `crawl.job_buffer` | `OD_CRAWL_JOB_BUFFER` | Number of URLs to keep in memory/cache, per job. The rest is offloaded to disk. Decrease this value if the crawler uses too much RAM. (0 = Disable Cache, -1 = Only use Cache) | `5000` |
-
-
+| Flag/Environment | Description | Example |
+| ------------------------------------------------------- | ------------------------------------------------------------ | ----------------------------------- |
+| `server.url`<br>`OD_SERVER_URL` | OD-DB Server URL | `https://od-db.mine.the-eye.eu/api` |
+| `server.token`<br>`OD_SERVER_TOKEN` | OD-DB Server Access Token | _Ask Hexa **TM**_ |
+| `server.recheck`<br>`OD_SERVER_RECHECK` | Job Fetching Interval | `3s` |
+| `output.crawl_stats`<br>`OD_OUTPUT_CRAWL_STATS` | Crawl Stats Logging Interval (0 = disabled) | `500ms` |
+| `output.resource_stats`<br>`OD_OUTPUT_RESORUCE_STATS` | Resource Stats Logging Interval (0 = disabled) | `8s` |
+| `output.log`<br>`OD_OUTPUT_LOG` | Log File (none = disabled) | `crawler.log` |
+| `crawl.tasks`<br>`OD_CRAWL_TASKS` | Max number of sites to crawl concurrently | `500` |
+| `crawl.connections`<br>`OD_CRAWL_CONNECTIONS` | HTTP connections per site | `1` |
+| `crawl.retries`<br>`OD_CRAWL_RETRIES` | How often to retry after a temporary failure (e.g. `HTTP 429` or timeouts) | `5` |
+| `crawl.dial_timeout`<br>`OD_CRAWL_DIAL_TIMEOUT` | TCP Connect timeout | `5s` |
+| `crawl.timeout`<br>`OD_CRAWL_TIMEOUT` | HTTP request timeout | `20s` |
+| `crawl.user-agent`<br>`OD_CRAWL_USER_AGENT` | HTTP Crawler User-Agent | `googlebot/1.2.3` |
+| `crawl.job_buffer`<br>`OD_CRAWL_JOB_BUFFER` | Number of URLs to keep in memory/cache, per job. The rest is offloaded to disk. Decrease this value if the crawler uses too much RAM. (0 = Disable Cache, -1 = Only use Cache) | `5000` |
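
A note outside the patch itself: judging from the table, each config flag maps to its environment/Docker variable by uppercasing it and replacing `.` and `-` with `_`, under an `OD_` prefix. The `to_env` helper below is a hypothetical illustration of that convention, not part of the crawler:

```bash
# Sketch of the flag -> env var naming convention implied by the table above.
# 'a-z' maps to 'A-Z'; '.' and '-' both map to '_'; the OD_ prefix is added.
to_env() {
  printf 'OD_%s\n' "$(printf '%s' "$1" | tr 'a-z.-' 'A-Z__')"
}

to_env crawl.user-agent    # OD_CRAWL_USER_AGENT
to_env output.crawl_stats  # OD_OUTPUT_CRAWL_STATS
```

This only demonstrates the naming pattern; the actual variables recognized by the crawler are the ones listed in the table.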