Lurkin whitehats, can we please get access to State Dept. Decimal File 316-19-1120 and report 861.00/5339? Pleases and thankyous.
Yeah, I've dug through all the State Dept. archives, can't find shit. They've axed it.
This is the closest I've got to finding out about the second one.
https://wikileaks.org/gifiles/docs/18/1891211_-analytical-and-intelligence-comments-wars-.html
Q-learning is a reinforcement learning technique used in machine learning. The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances. It does not require a model of the environment and can handle problems with stochastic transitions and rewards, without requiring adaptations.
For any finite Markov decision process (FMDP), Q-learning finds a policy that is optimal in the sense that it maximizes the expected value of the total reward over all successive steps, starting from the current state.[1] Q-learning can identify an optimal action-selection policy for any given FMDP, given infinite exploration time and a partly random policy. "Q" names the function the algorithm learns: the expected cumulative reward for taking a given action in a given state. It can be said to stand for the "quality" of that action.
https://en.wikipedia.org/wiki/Q-learning
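For anyone who wants to poke at what that excerpt describes, here is a minimal sketch of tabular Q-learning. Everything specific in it is an assumption made for illustration and is not from the linked article: the 5-state corridor environment, the `step`, `train` and `greedy_policy` helper names, and the values of `alpha` (learning rate), `gamma` (discount factor) and `epsilon` (exploration rate). The epsilon-greedy choice plays the role of the "partly random policy", and the agent only ever sees sampled transitions, which is the model-free part.

```python
import random

# Hypothetical toy environment: a corridor of 5 states.
# The agent starts in state 0; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = step left, 1 = step right
GOAL = N_STATES - 1


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table from sampled transitions only (no model of step)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                # Break ties at random so early all-zero rows do not bias the walk.
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Q-learning update:
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

On this toy corridor the learned greedy policy should be "step right" in every non-terminal state, i.e. `[1, 1, 1, 1]`. A list-of-lists table is enough here because the state and action sets are tiny; anything bigger would need a different representation.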