Valentin Radovich

Tag: reinforcement-learning

All the articles with the tag "reinforcement-learning".

How a raw genius trained himself to be a reasoning LLM model (almost)
The DeepSeek RL algorithm for training a reasoning LLM.