diff --git a/README.md b/README.md
index 4272992..7a194e3 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,29 @@
 # reddit_feed
 
-Fetches comments & submissions from reddit and publishes
- serialised JSON to RabbitMQ for real-time ingest.
+Fault-tolerant daemon that fetches comments &
+submissions from reddit and publishes serialised
+ JSON to RabbitMQ for real-time ingest.
 
-Can read up to 100 items per second, as per the API limits
-(60 requests per minute, 100 items per request).
+Can read up to 100 items per second (~6k/min), as per the API limits for
+a single client.
+
+Can optionally push monitoring data to InfluxDB. Below is an
+example of Grafana being used to display it.
+
+The daemon will attempt to stay exactly `REALTIME_DELAY` (60 by default)
+seconds 'behind' realtime items. Of course, it can lag several minutes
+behind when the number of events exceeds 100/s. In this case,
+it will eventually catch up to the target time during lower traffic
+hours.
+
+![monitoring](monitoring.png)
+
+Tested on GNU/Linux amd64, it uses ~60Mb of memory.
+
+### Usage
+```
+python run.py
+```
+
+(It is recommended to run with `supervisord`,
+see [supervisord_reddit_feed.ini](supervisord_reddit_feed.ini))
\ No newline at end of file
diff --git a/monitoring.png b/monitoring.png
new file mode 100644
index 0000000..d1ae398
Binary files /dev/null and b/monitoring.png differ
diff --git a/run.py b/run.py
old mode 100644
new mode 100755
index 62358d7..838975c
--- a/run.py
+++ b/run.py
@@ -1,3 +1,5 @@
+#!/bin/env python
+
 import datetime
 import json
 import logging
diff --git a/supervisord_reddit_feed.ini b/supervisord_reddit_feed.ini
new file mode 100644
index 0000000..467b77b
--- /dev/null
+++ b/supervisord_reddit_feed.ini
@@ -0,0 +1,13 @@
+; Move this file to /etc/supervisor.d/ to enable it
+; Make sure to change the RabbitMQ host, the path and the user!.
+; Logs will be saved to directory/reddit_feed.log.
+
+; To enable it:
+; sudo supervisorctl
+; >update
+; >start reddit_feed
+
+[program:reddit_feed]
+command=/path/to/reddit_feed/run.py 172.17.0.2
+directory=/path/to/reddit_feed
+user=some_user
\ No newline at end of file