newsence
來源篩選

@MiniMax_AI: A key leap in M2.5 comes from large-scale RL training across hundreds of thousands of complex enviro...

Twitter

A key leap in M2.5 comes from large-scale RL training across hundreds of thousands of complex environments. This significantly improves performance in environment adaptation, long-horizon tasks, agent alignment, and inference efficiency. Plenty of ups, downs, and surprises along the way 🫨 Definitely worth the watch! @olive_jy_song

newsence

Loading

Fetching article data