I forgot to consider that I would be turning this code over when I wrote it, so to help make sense of the bloated mess, here are some notes:
If you want to run it:
1. Download a Wikipedia XML dump. I recommend http://mattmahoney.net/dc/enwik9.zip due to its size; save the extracted file as wikidata/enwik9
2. Run wikidata/process_data.py, it should fill the wikidata folder with 120 data_{n}.txt files
3. Run main.py
4. Optional: Run train_more.py
5. Adjust and run predict.py at your leisure
Some notes about most files in case they cause head-scratching:
main.py
Run once to generate a model, do some initial training, and produce some output
train_more.py
Run repeatedly after main.py to do more intense training
predict.py
Uses the saved one_step_model (from main or train_more) to generate text.
play_maker.py
If the code has been adjusted to use the Shakespeare data, this makes plays; otherwise, mostly nonsense.
parameter.py
An attempt to isolate the important parameters for some exploration.
data/data.py
A data bundle; I wanted to simplify passing data things around. (Needs to be rethought.)
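For reference, the kind of bundle meant here might look like the following; the field names are my assumptions for illustration, not necessarily what data/data.py actually contains:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class DataBundle:
    """Groups everything the training scripts pass around.

    Field names are guesses; the real data/data.py may differ.
    """
    ids_from_chars: Any = None   # character -> id lookup for encoding input
    chars_from_ids: Any = None   # inverse lookup for decoding model output
    dataset: Any = None          # batched (input, target) training pairs
    vocab_size: int = 0          # number of distinct characters
```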
my_model/construct.py
Picks a model and constructs it; this keeps one piece of code in charge of which model everything uses.
my_model/model_two_lstm.py
The model used for most testing. The others can be ignored.
one_step/one_step.py
Model for compounding predictions.
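"Compounding predictions" here means feeding each predicted character back in as input. A toy sketch of that loop with a stand-in predictor (the real one_step.py presumably wraps the trained network; the helper names below are mine):

```python
def generate(one_step_fn, seed, length):
    """Repeatedly apply a single-character predictor to grow text.

    one_step_fn takes the text so far and returns the next character;
    each prediction is appended and fed back in, which is the
    compounding this file implements.
    """
    text = seed
    for _ in range(length):
        text += one_step_fn(text)
    return text

# Stand-in predictor: just cycles the alphabet from the last character.
def toy_next_char(text):
    return chr((ord(text[-1]) - ord("a") + 1) % 26 + ord("a"))

# generate(toy_next_char, "a", 3) -> "abcd"
```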