Sep 17, 2024
Agreed, I believe they used that during the training of the model. During inference, it just uses the trajectories learned through RL at training time.
3x Top writer in AI | AI Book: https://rb.gy/xc8m46 | LinkedIn: https://www.linkedin.com/in/vishal-rajput-999164122/ | https://x.com/RealAIGuys