Re-wrote much of the script and updated README.md

dolphinhats 2018-09-16 11:39:31 -05:00
parent 6b1e42e2ec
commit 2de3b78c4b


@@ -1,29 +1,60 @@
Meant for use with the discord chat program available here: [Discord](https://discordapp.com/)
# Discord Channel Scraper
**Requires**
* [Rapptz's discord python API](https://github.com/Rapptz/discord.py)
* Note: can be installed by running the following command as root:
```python3 -m pip install -U discord.py```
**See official documentation for further customization.**
This simple script logs in the user and scrapes up to `--limit` messages from a specified channel. A variety of features are included to support different use cases.
**Usage**
## Requires
- [Rapptz's discord python API](https://github.com/Rapptz/discord.py)
- Available through pip:
```bash
pip3 install -U discord.py
```
**See official documentation for further installation instructions**
1. Modify the final line in the script to include both your discord email and password
2. Run the script
* Note: a rather cryptic error can occur if you have two-factor authentication enabled, [bug report here](https://github.com/Rapptz/discord.py/issues/235)
3. Logged in as the same user the script logs in as, go to the channel you wish to save and type ```!yank``` to start the scraping process
4. If successful, the file is saved in the directory you ran the script from as `<channel name>.txt`
* Note: This works with both server channels and private channels; however, private conversations are saved under the filename 'None.txt', while everything else is saved under the name of the channel.
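
The 'None.txt' filename noted above is what Python produces when a missing name is formatted into a string. The sketch below is illustrative only: `log_filename` is a hypothetical helper, not a function from the script, and it assumes that a private conversation reports its channel name as `None`.

```python
def log_filename(channel_name):
    """Build the log filename from a channel name (hypothetical helper).

    Assumption: private conversations have no name, so channel_name is None.
    """
    # str.format turns None into the literal text "None"
    return "{}.txt".format(channel_name)


print(log_filename("general"))  # a named server channel
print(log_filename(None))       # a private conversation without a name
```

If this is indeed the cause, falling back to another identifier when the name is `None` would avoid several private conversations overwriting the same file.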
## Usage
**Current Issues**
```
$ python3.5 scrape-logs.py --help
```
- [ ]: On_ready fails for some unknown reason (probably asyncio related).
Project is on hold in the meantime.
**Future plans**
```
usage: scrape-logs.py [-h] [--username USERNAME] [--flag FLAG] [--quiet]
                      [--server SERVER] [--channel CHANNEL] [--limit LIMIT]
                      [--output OUTPUT] [--logging {10,20,30,40,50}]

Scrapes the logs from a Discord channel.

optional arguments:
  -h, --help            show this help message and exit
  --username USERNAME, -u USERNAME
                        Username to login under. If not specified, username
                        will be prompted for.
  --flag FLAG, -f FLAG  An alternative to specifying the server and channel,
                        specify a piece of regex which when matched against a
                        message sent by the target user, will trigger scraping
                        of the channel the message was posted in. Useful for
                        private messages and private chats. Default value is
                        "!yank", activates by default if no server is
                        specified.
  --quiet, -q           Suppress messages in Discord
  --server SERVER, --guild SERVER, -s SERVER
                        Discord server name to scrape from (user must be a
                        member of the server and have history privileges).
                        This field is case sensitive. If channel is not
                        specified the entire server will be scraped.
  --channel CHANNEL, -c CHANNEL
                        Discord channel name to scrape from (user must have
                        history privileges for the particular channel). This
                        field is case sensitive.
  --limit LIMIT, -l LIMIT
                        Number of messages to save. Default is 1000000
  --output OUTPUT, -o OUTPUT
                        Outputs all logs into a single file. If not specified,
                        logs are saved under the format: <server
                        name>-<channel name>.txt.
  --logging {10,20,30,40,50}
                        Change the logging level. Defaults to 20, info.
```
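
For readers who want to see how an interface like the one in the help text can be declared, here is a minimal `argparse` sketch. It is a reconstruction from the help output only, not the script's actual source: `build_parser`, `flag_matches` and `default_filename` are hypothetical names, and the use of `re.search` (rather than `re.match`) for the `--flag` check is an assumption.

```python
import argparse
import re


def build_parser():
    """Declare the options listed in the help text (reconstructed sketch)."""
    parser = argparse.ArgumentParser(
        prog="scrape-logs.py",
        description="Scrapes the logs from a Discord channel.",
    )
    parser.add_argument("--username", "-u",
                        help="Username to login under.")
    parser.add_argument("--flag", "-f", default="!yank",
                        help="Regex that triggers scraping when matched.")
    parser.add_argument("--quiet", "-q", action="store_true",
                        help="Suppress messages in Discord")
    # --guild is an alias; all three spellings store into args.server
    parser.add_argument("--server", "--guild", "-s",
                        help="Discord server name to scrape from.")
    parser.add_argument("--channel", "-c",
                        help="Discord channel name to scrape from.")
    parser.add_argument("--limit", "-l", type=int, default=1000000,
                        help="Number of messages to save.")
    parser.add_argument("--output", "-o",
                        help="Outputs all logs into a single file.")
    parser.add_argument("--logging", type=int, default=20,
                        choices=[10, 20, 30, 40, 50],
                        help="Change the logging level.")
    return parser


def flag_matches(pattern, message_content):
    """Return True if the --flag regex is found in a message (assumed search)."""
    return re.search(pattern, message_content) is not None


def default_filename(server_name, channel_name):
    """Default log name in the documented <server name>-<channel name>.txt form."""
    return "{}-{}.txt".format(server_name, channel_name)


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--server", "Example Server", "--limit", "500"])
print(args.server, args.limit, args.flag)
```

Because the `--logging` choices are the numeric values of Python's standard logging levels, a parsed value could be passed directly to `logging.basicConfig(level=args.logging)`.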
## Future plans
- [ ]: Add 2FA support
- [ ]: Solve the 2FAuth issue.
- [ ]: More robust than the hacked-together mess it currently is
- [x]: *Quiet mode* - doesn't require messages
- [x]: Command line arguments
- [ ]: Remake discord API in C so it works more than half the time.