GPT-4 with Reflexion prompting gets 90% correct on the HumanEval coding benchmark. The paper this is based on is misleading at best.