This is a very basic AI-based tic-tac-toe game. Good accuracy is achieved by letting the agents play against each other about 10,000 times. The agent learns by interacting with the environment, which it observes as a state, and reward is the incentive that drives the agent towards its goal of winning. The state in which the agent wins is assigned a reward of 1, and the rewards of all other states leading to that final state are calculated iteratively.
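The self-play training described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `Agent`, `play_game`, and `train`, the learning rate `alpha`, the exploration rate `epsilon`, the default value of 0.0 for unseen states, and the rewards for a loss (0.0) and a draw (0.5) are all assumptions. Only the win reward of 1 and the roughly 10,000 self-play games come from the description.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

class Agent:
    def __init__(self, symbol, rng, alpha=0.2, epsilon=0.1):
        self.symbol = symbol
        self.rng = rng
        self.alpha = alpha      # learning rate (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.values = {}        # board string -> estimated value
        self.history = []       # states this agent reached in the current game

    def _after(self, board, move):
        """Board string that results from playing `move`."""
        nxt = board[:]
        nxt[move] = self.symbol
        return "".join(nxt)

    def choose_move(self, board):
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if self.rng.random() < self.epsilon:
            move = self.rng.choice(moves)  # explore
        else:
            self.rng.shuffle(moves)        # random tie-breaking
            move = max(moves,
                       key=lambda m: self.values.get(self._after(board, m), 0.0))
        self.history.append(self._after(board, move))
        return move

    def learn(self, reward):
        """Propagate the final reward backwards through the visited states."""
        target = reward
        for state in reversed(self.history):
            old = self.values.get(state, 0.0)
            self.values[state] = old + self.alpha * (target - old)
            target = self.values[state]
        self.history.clear()

def play_game(agent_x, agent_o):
    """Play one game; return the winning symbol, or None for a draw."""
    board = [" "] * 9
    players = [agent_x, agent_o]
    for turn in range(9):
        player = players[turn % 2]
        board[player.choose_move(board)] = player.symbol
        if winner(board) == player.symbol:
            player.learn(1.0)                   # win reward from the description
            players[(turn + 1) % 2].learn(0.0)  # loss reward (assumed)
            return player.symbol
    agent_x.learn(0.5)  # draw reward (assumed)
    agent_o.learn(0.5)
    return None

def train(episodes=10_000, seed=0):
    """Train two agents by self-play and return them."""
    rng = random.Random(seed)
    agent_x, agent_o = Agent("X", rng), Agent("O", rng)
    for _ in range(episodes):
        play_game(agent_x, agent_o)
    return agent_x, agent_o
```

Calling `train()` runs the self-play loop and returns both agents. Each agent's `values` dictionary then holds the learned estimate for every board state it visited, with states closer to a win pulled towards 1.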
shub124/Reinforcement_learning