diff --git a/README.md b/README.md
new file mode 100644
index 0000000..e70f1be
--- /dev/null
+++ b/README.md
@@ -0,0 +1,20 @@
+# yt-metadata
+Script to import [youtube-dl](https://github.com/rg3/youtube-dl) metadata to PostgreSQL.
+The actual `.jpg` files for the thumbnails are saved into the database as byte arrays (Only the **default**
+thumbnail saved by **youtube-dl**)
+
+### Scraping metadata using youtube-dl
+This tool expects the files to be in the format that this bash script will output:
+```bash
+id="$1"
+mkdir "$id"; cd "$id"
+youtube-dl -v --print-traffic --restrict-filename --write-description --write-info-json --write-annotations --write-thumbnail --all-subs --write-sub --skip-download --ignore-config --ignore-errors --geo-bypass --youtube-skip-dash-manifest https://www.youtube.com/watch?v=$id
+```
+
+### Setup instructions:
+* Create the database and schema with the tool of your choice using `schema.sql`
+* Change the directory in `import.py` so it points to the location of your youtube metadata
+* Run `import.py`
+
+### Schema:
+![schema](https://user-images.githubusercontent.com/7120851/42966031-cb83f216-8b69-11e8-9c9e-a8bcefdc7456.png)
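
### Example: reading the scraped files
The scraping script puts each video in its own directory named after the video id, and youtube-dl writes the metadata there as `*.info.json` next to the `.jpg` thumbnail. The sketch below shows one way such a directory tree could be walked. It is not the code in `import.py`: the function name `collect_video_files` and the returned dictionary keys are hypothetical, and it assumes at most one `.info.json` and one `.jpg` per video directory.

```python
import json
from pathlib import Path


def collect_video_files(root):
    """Yield one dict per video directory found under root.

    Assumes the layout produced by the scraping script: one
    sub-directory per video, named after the video id.
    (Hypothetical helper, not part of import.py.)
    """
    for video_dir in sorted(Path(root).iterdir()):
        if not video_dir.is_dir():
            continue  # ignore stray files next to the video directories

        info_files = sorted(video_dir.glob("*.info.json"))
        thumbnails = sorted(video_dir.glob("*.jpg"))
        if not info_files:
            continue  # nothing was scraped for this id

        with info_files[0].open(encoding="utf-8") as fh:
            info = json.load(fh)

        yield {
            "id": video_dir.name,
            "info": info,
            # Raw bytes of the thumbnail, or None if none was saved.
            "thumbnail": thumbnails[0].read_bytes() if thumbnails else None,
        }
```

Calling `list(collect_video_files("/path/to/metadata"))` would return one entry per scraped video. The `thumbnail` value is plain `bytes`, which is the form a byte-array column needs.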