From 88856c1c190dc51184907848955416cf2f260578 Mon Sep 17 00:00:00 2001
From: Richard Patel <terorie@alphakevin.club>
Date: Fri, 22 Feb 2019 05:59:59 +0100
Subject: [PATCH] Flag explanation in README.md

---
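Note on the flag-to-environment mapping: the flag reference table added below pairs every flag with an `OD_`-prefixed variable. Judging only from those pairs, the name appears to be the flag upper-cased with `.` and `-` replaced by `_`. The sketch below reproduces that rule for checking the table; `flag_to_env` is a hypothetical helper and not part of the crawler, and the crawler's real lookup logic is not shown in this patch.

```shell
# Hypothetical helper: derive the environment variable name for a flag,
# assuming the rule inferred from the table (upper-case, '.' and '-' -> '_').
flag_to_env() {
    printf 'OD_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.-' '__')"
}

flag_to_env crawl.user-agent   # prints OD_CRAWL_USER_AGENT
```

Running it over each flag in the table is a quick way to spot a misspelled variable name in the second column.
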
 README.md | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 1b90ae2..96dd2d8 100644
--- a/README.md
+++ b/README.md
@@ -9,16 +9,48 @@
 
 https://od-db.the-eye.eu/
 
-#### Usage
+## Usage
+
+### Deployment
 
  1. With Config File (if `config.yml` found in working dir)
     - Download [default config](https://github.com/terorie/od-database-crawler/blob/master/config.yml)
     - Set `server.url` and `server.token`
     - Start with `./od-database-crawler server --config <file>`
- 
+
  2. With Flags or env
     - Override config file if it exists
     - `--help` for list of flags
     - Every flag is available as an environment variable:
       `--server.crawl_stats` ➡️ `OD_SERVER_CRAWL_STATS`
     - Start with `./od-database-crawler server <flags>`
+
+ 3. With Docker
+    ```shell
+    docker run \
+        -e OD_SERVER_URL=xxx \
+        -e OD_SERVER_TOKEN=xxx \
+        terorie/od-database-crawler
+    ```
+
+### Flag reference
+
+Here are the most important config flags. For finer control, take a look at `/config.yml`.
+
+| Flag/Config             | Environment/Docker         | Description                                                  | Example                             |
+| ----------------------- | -------------------------- | ------------------------------------------------------------ | ----------------------------------- |
+| `server.url`            | `OD_SERVER_URL`            | OD-DB Server URL                                             | `https://od-db.mine.the-eye.eu/api` |
+| `server.token`          | `OD_SERVER_TOKEN`          | OD-DB Server Access Token                                    | _Ask Hexa **TM**_                   |
+| `server.recheck`        | `OD_SERVER_RECHECK`        | Job Fetching Interval                                        | `3s`                                |
+| `output.crawl_stats`    | `OD_OUTPUT_CRAWL_STATS`    | Crawl Stats Logging Interval (0 = disabled)                  | `500ms`                             |
+| `output.resource_stats` | `OD_OUTPUT_RESOURCE_STATS` | Resource Stats Logging Interval (0 = disabled)               | `8s`                                |
+| `output.log`            | `OD_OUTPUT_LOG`            | Log File (none = disabled)                                   | `crawler.log`                       |
+| `crawl.tasks`           | `OD_CRAWL_TASKS`           | Max number of sites to crawl concurrently                    | `500`                               |
+| `crawl.connections`     | `OD_CRAWL_CONNECTIONS`     | HTTP connections per site                                    | `1`                                 |
+| `crawl.retries`         | `OD_CRAWL_RETRIES`         | Number of retries after a temporary failure (e.g. `HTTP 429` or timeouts) | `5`                                 |
+| `crawl.dial_timeout`    | `OD_CRAWL_DIAL_TIMEOUT`    | TCP Connect timeout                                          | `5s`                                |
+| `crawl.timeout`         | `OD_CRAWL_TIMEOUT`         | HTTP request timeout                                         | `20s`                               |
+| `crawl.user-agent`      | `OD_CRAWL_USER_AGENT`      | HTTP Crawler User-Agent                                      | `googlebot/1.2.3`                   |
+| `crawl.job_buffer`      | `OD_CRAWL_JOB_BUFFER`      | Number of URLs to keep in memory/cache, per job. The rest are offloaded to disk. Decrease this value if the crawler uses too much RAM. (0 = disable cache, -1 = only use cache) | `5000`                              |
+
+