An open-source framework for training and inference with hybrid MLM-diffusion language models. It generates text in parallel with a masked language model, then refines the output with a diffusion process.
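The parallel generation step can be sketched as iterative mask-predict decoding: predict every masked position at once, keep the high-confidence tokens, and re-mask the rest for the next round. This is a minimal illustrative sketch, not this framework's API; the `MASK` sentinel, `score_fn` interface, and confidence heuristic are all assumptions.

```python
import random

MASK = -1  # hypothetical mask-token id, for illustration only

def mask_predict_decode(score_fn, length, steps=4):
    """Fill a fully masked sequence in `steps` parallel rounds,
    re-masking the least confident positions between rounds."""
    tokens = [MASK] * length
    confidence = [0.0] * length
    for step in range(steps, 0, -1):
        # Predict a token and a confidence for every masked position.
        for i, t in enumerate(tokens):
            if t == MASK:
                tokens[i], confidence[i] = score_fn(tokens, i)
        # Re-mask the lowest-confidence positions; keep progressively
        # more tokens as the remaining step budget shrinks.
        n_mask = length * (step - 1) // steps
        if n_mask:
            worst = sorted(range(length), key=lambda i: confidence[i])[:n_mask]
            for i in worst:
                tokens[i] = MASK
    return tokens

# Toy stand-in for a model: a random token with a random confidence.
def toy_score_fn(tokens, i):
    return random.randrange(100), random.random()

out = mask_predict_decode(toy_score_fn, length=16)
```

After the final round no positions remain masked, so `out` is a complete sequence of length 16.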
The current small model has:
- 12 layers
- 1024 hidden dimension
- 16 attention heads
- 4352 FFN dimension
- ~225M parameters
Planned work:
- Proper documentation and README
- Better support for custom datasets and hyperparameters
- Performance optimization
- Training larger models (1B, 2B, 7B, etc.)
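As a sanity check on the ~225M figure, a rough dense-transformer parameter count from the numbers above is consistent. The ~32k vocabulary and untied input/output embeddings are assumptions for illustration, not values taken from the config.

```python
def transformer_param_count(layers, hidden, ffn, vocab, tied_embeddings=False):
    """Rough parameter count for a standard transformer encoder
    (weights + biases + LayerNorms; positional embeddings omitted)."""
    attn = 4 * hidden * hidden + 4 * hidden       # Q, K, V, output projections
    ffn_params = 2 * hidden * ffn + ffn + hidden  # up- and down-projections
    norms = 2 * 2 * hidden                        # two LayerNorms per block
    per_layer = attn + ffn_params + norms
    # Assumed untied: separate input embedding and output head matrices.
    embeddings = vocab * hidden * (1 if tied_embeddings else 2)
    return layers * per_layer + embeddings

n = transformer_param_count(layers=12, hidden=1024, ffn=4352, vocab=32768)
# ≈ 224.6M, in line with the ~225M figure above
```

The head count does not affect the total, since the attention projections are sized by the hidden dimension regardless of how it is split across heads.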