Adaptive AI, an experiment

AI discussion, ideas, and SDK help.
Post Reply
Kampar
Lux Newbie
Posts: 1
Joined: Tue Nov 04, 2008 10:49 am
Location: Australia

Adaptive AI, an experiment

Post by Kampar » Tue Nov 04, 2008 11:20 am

Hey guys, I'm a complete newbie here in the Lux community, but I have a little bit of Java experience and a lot of interest in AI in general.

My idea goes like this: I would like to create an AI plugin which I didn't actually have to code with any specific strategy myself, but which would instead develop behaviour based on experience.

There are a few ways to do this, but I think it would be possible to implement a neural network to handle the bot's actions/reactions, with game data as inputs, and the neural network's topology and weights determined through a genetic algorithm. Each time the bot ran, its fitness would be judged by its result in a game, and its genotype sent into a collective, separate population gene pool. It would be easy to evolve initial behaviour by running thousands of games against the built-in AIs.
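The evolutionary loop could be sketched roughly like this (a minimal sketch; all class names, parameters, and the genome-as-weight-array encoding are my own illustration, not anything from the Lux SDK):

```java
import java.util.Random;

// Hypothetical sketch: a genome is a flat array of neural-net weights,
// evolved by tournament selection and Gaussian mutation. Fitness for
// each genome would come from the bot's game results.
public class GenePool {
    static final Random RNG = new Random();

    // Copy a parent genome, perturbing each weight with small Gaussian
    // noise with probability `rate`.
    static double[] mutate(double[] parent, double rate, double sigma) {
        double[] child = parent.clone();
        for (int i = 0; i < child.length; i++) {
            if (RNG.nextDouble() < rate) {
                child[i] += RNG.nextGaussian() * sigma;
            }
        }
        return child;
    }

    // Tournament selection: pick two genomes at random and return the
    // index of the fitter one.
    static int tournament(double[] fitness) {
        int a = RNG.nextInt(fitness.length);
        int b = RNG.nextInt(fitness.length);
        return fitness[a] >= fitness[b] ? a : b;
    }
}
```

Each finished game would then add (genome, fitness) to the pool, and new bots would be bred from tournament winners.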

It's a bit ambitious, but I already have some solid neural net and GA classes written in Java which I could use. My questions arise from my limited experience with Java plugins and the Lux AI plugin system in general.

The collective gene pool would have to be stored outside of any individual instance of the bot, and the bot would have to grab its gene set from that pool at the start of every game to determine the structure of its neural net.

I've never done networking code in Java before, but I assume it would be similar to C (i.e. sockets). Is it possible within the Lux AI plugin framework for the bot to 'dial out' at the start of a game and grab data from somewhere (open a socket connection, etc.), then dial back in at the end of the game to report its results? Or could it even store files on Lux's servers or something? Any way to keep a persistent gene pool outside of individual instances of the bot would work.

I think this kind of bot could be a really interesting experiment. Once it had been trained against the built-in AIs, it could be released to the public and would adapt to their play, perhaps eventually settling into perfect strategy? Better than any human could write in??

Ahh, getting ahead of myself :P.

Sorry, this is quite an epic first post. If you made it this far, thank you for reading, and I really appreciate any guidance you can give me!

PS: If you've never heard of artificial neural networks or genetic algorithms before, I probably sound like a complete moron blabbing on in gibberish!


The Wontrob
Ninja Doughboy
Posts: 2792
Joined: Wed Oct 03, 2007 9:56 pm
Location: The Pan-Holy Church, frollicking
Contact:

Re: Adaptive AI, an experiment

Post by The Wontrob » Tue Nov 04, 2008 12:10 pm

Enter dustin. Sorry for responding without proper information.

Oh, and welcome to lux :)
Last edited by The Wontrob on Tue Nov 04, 2008 5:52 pm, edited 1 time in total.

dustin
Lux Creator
Lux Creator
Posts: 10916
Joined: Thu May 15, 2003 2:01 am
Location: Cascadia
Contact:

Post by dustin » Tue Nov 04, 2008 3:02 pm

You can do anything that Java allows: socket connections or local storage.

Easiest way is probably to start with loading/storing data in a file on the filesystem.

If you look in the Lux SDK it also has some simple methods for storing variables between launches.
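The file-based option could look something like this (an illustrative sketch only; the class, file name, and text format are assumptions, not part of the Lux SDK):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical sketch of the simplest persistence option: keep the
// genome (one weight per line) in a plain text file between launches.
public class BrainFile {
    static final Path PATH = Paths.get("darwin-brain.txt");

    // Write each weight on its own line.
    static void save(double[] genome) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (double w : genome) sb.append(w).append('\n');
        Files.write(PATH, sb.toString().getBytes());
    }

    // Read the weights back into an array.
    static double[] load() throws IOException {
        List<String> lines = Files.readAllLines(PATH);
        double[] genome = new double[lines.size()];
        for (int i = 0; i < genome.length; i++) {
            genome[i] = Double.parseDouble(lines.get(i));
        }
        return genome;
    }
}
```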

guest
Luxer
Posts: 189
Joined: Fri Dec 17, 2004 9:17 pm
Location: Southern NH
Contact:

Post by guest » Tue Nov 04, 2008 7:24 pm

I was looking at building a bot using a self-training NN and letting it loose on the world, but I want more than the player's final position at the end of the game to train it with.

Andrew97
Lux Newbie
Posts: 1
Joined: Thu Nov 06, 2008 1:56 am
Location: Australia

Post by Andrew97 » Thu Nov 06, 2008 2:35 am

Hi, this is Kampar. I have lost access to the account I initially signed up and posted with above: I made a typo while trying to switch the email address to my Gmail account, so the reactivation email never arrives and I'm locked out. Oh well..

This account corresponds to my Lux name as well, so maybe this is better anyway.

Thanks for the replies, guys. I think the best plan is to store the gene pool/brain on an external, centralised server. A PHP/MySQL database could store the bot-brain gene pool, and it would be a matter of the bot sending a simple GET request to the server and working with the reply. The reply would be some compressed/encoded version of the bot's neural network as a string, plus an ID number. When it's received, the bot builds the network, uses the board state as input, plays out the game, then sends its result back to the server with a POST or another GET URL. The server could handle all the natural selection/mutation of its database (gene pool), and would also provide a handy web-based way to manage/control/review the bot and its performance.
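The client side of that protocol might be sketched like this (purely illustrative: the server URL, the "id;w1,w2,..." reply format, and the query-string reporting scheme are all assumptions):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch of the bot's side of the gene-pool protocol:
// GET a genome string plus an ID from a central server, then report
// the game's fitness result back.
public class PoolClient {
    // Decode a server reply assumed to look like "id;w1,w2,w3,...".
    static double[] decode(String reply) {
        String[] weights = reply.split(";")[1].split(",");
        double[] genome = new double[weights.length];
        for (int i = 0; i < genome.length; i++) {
            genome[i] = Double.parseDouble(weights[i]);
        }
        return genome;
    }

    // Fetch one line of text from a URL.
    static String fetch(String urlString) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(urlString).openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            return in.readLine();
        }
    }

    // Simplest reporting scheme: encode the result in the query string.
    static void report(String baseUrl, int id, double fitness) throws IOException {
        fetch(baseUrl + "?id=" + id + "&fitness=" + fitness);
    }
}
```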

There are some hurdles I can see at this point, however (hardest to easiest, probably):

1. Modeling the game state in a way that the neural net can take as inputs. I.e. what does the bot need to know to play its turn? How can you map that information onto a fixed number of inputs to the net? (the countries array?)

2. Mapping the NN's outputs to real actions/sequences of actions in the game. (Perhaps have it return a 'desirability' between -1 and 1 for every territory on the map, then have some simple finite-state logic carry out its requests. This way the NN deals with high-level strategy, while some other bot code does the dirty work of actually fulfilling its motives. Or maybe have a separate net for both!?)

3. A way to determine a fitness score for each game the bot plays, i.e. a way to quantify and wrap up how many rounds it survived, its final place, the skill/rank of its opponents, etc. into a single fitness score.

4. What to do if the bot cannot download its brain before the game begins.
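The mapping in point 2 could be sketched like this (a minimal illustration with made-up names: the net scores every territory, and simple follow-up logic would then act on the best-scored one):

```java
// Hypothetical sketch for point 2: the NN outputs one desirability
// score in [-1, 1] per territory; simple non-NN logic then targets the
// highest-scored territory.
public class Desirability {
    // Index of the most desirable territory according to the net's output.
    static int mostDesirable(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) best = i;
        }
        return best;
    }
}
```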

I think the final point is easily manageable: the bot could revert to either the last brain it downloaded (saved to local storage as suggested) or fall back to the basic strategy of a prewritten AI, like KillBot. To start with, it would make sense to have only one part of the bot's behaviour controlled by a NN, and build upon that in a modular fashion: starting with KillBot's place/fortify/card code and using a NN to control just the attack phase, for example. This would make it easier for the bot to evolve some initial sensible behaviour.
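That fallback order could be sketched as follows (class and method names are hypothetical; the point is just the try-server, then-cache, then-prewritten chain):

```java
// Hypothetical sketch of the brain-loading fallback chain:
// 1. try the central server, 2. try the locally cached brain,
// 3. fall back to a prewritten default (e.g. KillBot-style behaviour).
public class BrainLoader {
    interface Source {
        double[] get() throws Exception;
    }

    static double[] loadBrain(Source server, Source localCache, double[] fallback) {
        try { return server.get(); } catch (Exception e) { /* network down */ }
        try { return localCache.get(); } catch (Exception e) { /* no cache yet */ }
        return fallback; // revert to prewritten behaviour
    }
}
```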

By the way, I think I shall call him Darwin :D. Let me know if you have thoughts on the problems I mentioned above. I can't wait to get a working version out and then run him against Reaper 50000 times :P.

Bertrand
Reaper Creator
Posts: 568
Joined: Mon Nov 28, 2005 4:35 pm
Location: Montreal

Post by Bertrand » Thu Nov 06, 2008 12:53 pm

Hi Kampar. Good luck in your attempt to write a learning bot. Here are a few pointers that might help you.

In my attempt to "retrofit" learning code in Reaper, I originally wanted a unique, global database, but I soon changed my mind.

The problem is "bot firsting": in most games, the humans will conspire to kill the bots first, either explicitly or implicitly. I do not see how a bot can learn from this: even a perfectly good strategy cannot overcome the teaming humans. So by updating the global database with the supposedly failed strategy, you will end up punishing the bot for nothing, effectively corrupting the database.

In order for the bot to learn, its opponents need to be consistent, reliable, and non-teaming. Translation: other bots! I let Reaper play tons of games against the other bots and slowly evolve a winning strategy. I then "hard coded" the resulting database into Reaper's code.

The learning in Reaper is in the high level logic only. Examples are: killing a player, conquering a continent, popping a continent, placing to kill, placing to pop, placing to conquer, etc.

Lastly, winning strategy in Lux is very dependent on the various settings and maps: card progression, continent bonus progression, map size, number of continents. A high card game in a classic sized map requires a very different strategy than a game on a large map. For that reason, the learning database has to be separately tuned for each type of game.

EDIT: Just reread my post, and noticed that I forgot to address one of your questions.

You asked for the best way to evaluate your bot's performance in a game. It's very simple: win = good, lose = bad. The ending position when losing means nothing.

If you reward the bot for its final ranking, then the bot will evolve strategies to do just that: turtling and a lack of aggression, for example. Those strategies make you finish higher, but they do not make you win.

Post Reply