This is the set of tools and knowledge, based on artificial intelligence (AI), that the giants of the digital world, supported by web infrastructure, are using and developing in applications that yield large enterprises, money, or power. Examples: Google, Facebook, Microsoft, governments, social hackers, etc.
“A mathematical agent is a metaphorical idealized actor, that is, an idealized actor in the source domain of a metaphor characterizing some aspect of mathematics”
The good part is that these model agents turn into computer programs that learn by themselves to execute highly complex tasks in the real world, such as driving a car or playing a brilliant game of Go.
A computer program that inhabits some dynamic environment, senses and acts autonomously within it, and, by using its own learning and memory resources, learns to accomplish the set of goals or tasks for which it was designed.
T3_player is a set of program modules that assemble into a tic-tac-toe virtual environment, where a clearly defined software agent, built from artificial neurons, learns through its own effort to play brilliantly, with human-level ingenuity and creativity. During its learning phase, the agent follows the mathematical principles embodied in the Bellman equation. During its operating phase, the agent behaves as an intelligent Markov process.
MIND T3-Player is a computer program capable of exploring a tic-tac-toe environment and learning brilliant solutions by itself. This agent-oriented software easily reaches expert human playing strength through a combination of:
- Artificial Neural Nets.
- Gradient Descent.
- Reinforcement Learning.
An applied mathematical concept that guarantees the maximal obtainable value when controlling a sequence of events in a complex environment with an underlying logic and with rewards scattered through space-time. In this sense, a Bellman agent must always look into the future during its learning journey. This universal principle is currently applied in relevant areas such as self-driving cars, robotics, business management, education, computer systems, engineering, animation, etc.
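As a sketch in standard textbook notation (the symbols below are the conventional ones, not taken from this article), the Bellman optimality equation expresses the value of a state as the best action's immediate reward plus the discounted value of the future:

```latex
V^*(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[\, R(s, a, s') + \gamma\, V^*(s') \,\bigr]
```

Here $s$ is the current state, $a$ an action, $s'$ a possible next state, $P$ the transition probability, $R$ the reward, and $\gamma \in [0,1)$ the discount factor. The "looking into the future" mentioned above is precisely the $\gamma\, V^*(s')$ term.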
They are complex networks that can learn intricate behaviors from examples or by themselves.
Learning is guaranteed by proven learning algorithms, which in turn produce efficient computer models.
If we know the derivative of the transfer function of an artificial neural network, then gradient descent makes it possible to change the connection weights massively and in an orderly way, reducing the error (loss) of the underlying global network. Along the way it provides a most valuable processing ability: inference capacity, that is, the capacity to give good answers (outputs) to questions (inputs) never seen during the learning period.
“You make an inference when you use clues from the story to figure out something that the author doesn’t directly tell you”
Reinforcement learning (RL) is essentially the numerical computer solution of the Bellman equation, which deals with the optimal control of differential-difference (time-lag) processes. If we write our computer program following its principles, we obtain a solid approximation to the optimal control of any given process.
Instructions for easily installing Borland C++ are available at http://www2.hawaii.edu/~sdunan/study_guides/bcc.html
You can download the installer here
Below is the main code of Module 1. The external libraries and all the code can be downloaded here:
After compiling «T3_Player_module_1.cpp» with Borland C++, an .exe file is generated.
The program loads the trained weights by default.
The «c» key loads the already-trained weights.
The «s» key saves the weights from a new training run.