[2401.08967] ReFT: Reasoning with Reinforced Fine-Tuning
04-14-2025
Link:
https://arxiv.org/abs/2401.08967
Note:
A paper from ByteDance on reinforced fine-tuning.