An implementation of offline RL for recommender systems
@author: misajie @update: 20220123
File organization:
- RecEnv
- ClassicalRL
- OfflineRL
In progress:
- Build classical off-policy models and apply them to existing environments (RecSim, Virtual Taobao)
- Reconstruct a simulator-free model, e.g. FeedRec
- Modify RecSim to fit the WeChat short-video dataset, run the off-policy models on it, and evaluate the results
- Generate replay samples from the short-video recommendation environment
- Build classical offline models
- Build an original offline model
- Evaluate the new model
- Add AutoML
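
As a rough illustration of the sample-generation item above, the sketch below logs transitions from a toy environment under a uniform-random behavior policy, producing the kind of fixed dataset an offline model would train on. Everything in it is an assumption made for illustration: `ToyVideoEnv`, `Transition`, and `collect_replay` are hypothetical names, not part of this repository, RecSim, or Virtual Taobao, and the click model is invented.

```python
import random
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One logged interaction: (s, a, r, s', done)."""
    state: Tuple[int, ...]
    action: int
    reward: float
    next_state: Tuple[int, ...]
    done: bool


class ToyVideoEnv:
    """Hypothetical stand-in for a short-video recommendation environment."""

    def __init__(self, num_items: int = 5, session_len: int = 10, seed: int = 0):
        self.num_items = num_items
        self.session_len = session_len
        self.rng = random.Random(seed)
        self.reset()

    def reset(self) -> Tuple[int, ...]:
        self.history: List[int] = []
        # Hidden per-item click probability for the current simulated user.
        self.click_prob = [self.rng.random() for _ in range(self.num_items)]
        return self._state()

    def _state(self) -> Tuple[int, ...]:
        # State = the last three recommended items, left-padded with -1.
        recent = self.history[-3:]
        return tuple([-1] * (3 - len(recent)) + recent)

    def step(self, action: int):
        # Reward is 1.0 on a simulated click, 0.0 otherwise.
        reward = 1.0 if self.rng.random() < self.click_prob[action] else 0.0
        self.history.append(action)
        done = len(self.history) >= self.session_len
        return self._state(), reward, done


def collect_replay(env: ToyVideoEnv, num_episodes: int, seed: int = 0) -> List[Transition]:
    """Roll out a uniform-random behavior policy and log every transition."""
    policy_rng = random.Random(seed)
    buffer: List[Transition] = []
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy_rng.randrange(env.num_items)
            next_state, reward, done = env.step(action)
            buffer.append(Transition(state, action, reward, next_state, done))
            state = next_state
    return buffer


if __name__ == "__main__":
    replay = collect_replay(ToyVideoEnv(), num_episodes=4)
    print(len(replay), "transitions logged")
```

A real pipeline would swap the toy environment for the modified simulator and the random policy for whatever behavior policy produced the logs; the logged tuple layout is the part that carries over.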