If they can do this for math, why can't they do it for general reasoning?
https://youtu.be/Bhoy_arJvaE?si=OLomRfCVUguhx3rx