16:31, March 24, 2016 Ngocminh.oss . . (3,279 bytes) (+835) . . (Parameter space exploring)
12:34, February 20, 2016 Ngocminh.oss . . (2,444 bytes) (+1,064) . . (Direct gradient-based methods)
12:02, February 19, 2016 Ngocminh.oss . . (1,380 bytes) (+157) . . (add David Silver's slide)
10:35, February 3, 2016 Ngocminh.oss . . (1,223 bytes) (+694) . . (add explanations) (VisualEditor)
20:01, May 17, 2015 Minhlab . . (529 bytes) (+529) . . (Created page with ""REINFORCE learns much more slowly than RL methods using value functions and has received relatively little attention. Learning a value function and using it to reduce the var...")

Retrieved from "https://natural-language-understanding.fandom.com/wiki/Policy_gradient"
Community content is available under CC-BY-SA unless otherwise noted.