2 Comments
Daniel Popescu / ⧉ Pluralisk

This article comes at the perfect time. The issues you highlight with pre-trained LLMs and objective misalignment are crucial. Thanks for this excellent, detailed guide on RLHF; it's truly important work.

Dr. Ashish Bamania

Thank you! I’m glad that it helped.
