> When people play Avalon, the games are usually rich in “cheap talk,” such as defending oneself, accusing others, or debunking others’ claims [10]. In this work, we do not consider the strategic implications of natural language communication...
In the 2189 mixed human/agent games we collected, all humans knew which players were human and which were DeepRole. There were no restrictions on chat usage for the human players, but DeepRole did not say anything and did not process sent messages.
I think the fun part about Avalon and other hidden role games really comes from the "cheap talk", where people try to convince each other that their picks / approvals make sense as a member of the good team, as opposed to making the decisions from picks and approvals alone. Though from the results it seems that those concrete actions are already enough to outperform the humans.
There's also the consideration that because the humans know the identity of DeepRole as a bot, they play differently: "That's what I'm gonna pick, because that's what bots do" [1]. I wonder if a combined DeepRole + human-for-chatting-only team would outperform either alone.
> ...The Resistance: Avalon, the most popular hidden role game.
I question this; it seems more likely to me that the grade-school classics (Mafia/Werewolf) beat The Resistance, or any expansion/variant of it, for popularity.
The text of the paper is a little more precise than the abstract: they picked R:A because it's "the most highly rated hidden role game on boardgamegeek.com."
Just as a player of these games, I'm curious how much weight DeepRole gives to the results of the team Proposal Votes. I find they hold a lot of useful but hard-to-parse information, which depends heavily on the skill/metagame of your opponents.
This seems like it would run into a lot of the usual AI-in-games problems, where computers are better at memory and ingesting information than people generally are.
Games like Avalon or Secret Hitler tend to be much less fun if people are playing with notes. A lot of it involves seeing what sort of lies/adversarial behavior you can get away with because people aren't paying attention to every single thing in the flood of information.
A tight example of this is if you watch TotalBiscuit's old series of games of Secret Hitler with friends on YouTube, they essentially ended up quitting partially because one of the guys started taking notes on exactly who had voted what way on everything, and counting cards. It turned out this both made him really good at the game at first (patterns in who votes with whom is valuable) and made the game less fun for everyone else, since they now had to play more "perfectly" and more directly do things to tamper with the dataset, instead of relying on no one remembering the exact votes from 15 mins earlier.
In the end, I somewhat question if the computer is actually better at any of the parts of the game that are fun/interesting, as much as just better at pattern recognition. For instance, in a game like Avalon, you'll often not be able to piece together who some of the less-important roles are, mostly because it's not worth the time, but a computer is likely able to overtly or passively track things like that because it has no reason not to.
Paper author here: one cool thing about our technique is that our bot doesn't keep "notes" :)
The only thing it maintains for the entire game is this length-60 belief vector - a summary of who it thinks is evil and good. How people act influences this belief vector, but it can't look back at the game history. This leads to awkward play sometimes - it will propose missions that have failed in the past, etc. I think it's cool that we (humans) can summarize the state of the game with such little information, and that the bot does something similar :)
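For readers wondering where the number 60 comes from: in 5-player Avalon there are 2 evil players (one of whom is the Assassin) and 3 good players (one of whom is Merlin), which gives exactly 60 possible role assignments. A quick sketch of my reading of this (the encoding is my own, not the paper's):

```python
from itertools import combinations

# Hypothetical enumeration of the hidden-role assignments in 5-player
# Avalon: choose 2 evil players, pick which evil player is the Assassin,
# and pick which of the 3 good players is Merlin.
players = range(5)
assignments = []
for evil_pair in combinations(players, 2):        # C(5,2) = 10 evil pairs
    for assassin in evil_pair:                    # 2 choices of Assassin
        good = [p for p in players if p not in evil_pair]
        for merlin in good:                       # 3 choices of Merlin
            assignments.append((frozenset(evil_pair), assassin, merlin))

print(len(assignments))  # 10 * 2 * 3 = 60
```

The belief vector then just assigns one probability to each of these 60 assignments.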
Interesting, but surely it has to retain some information about the past to be able to know how to update that belief vector? Outside of directly being on a mission, the only major "good" or "evil" actions are voting for/against missions, once you know if the mission succeeded or failed. Do you just not take that into account?
It's possible this is in the paper - a lot of the more math/modeling parts went a bit over my head, so feel free to point me to a section to specifically read if I missed out.
If it's literally just a representation of the outcomes of the missions and who went on them, then isn't the belief vector just the venn diagram of how every mission went, with some iterative statistics laid over it? I would have assumed any regular/competitive players would be fairly good at keeping that mental model themselves, which makes it confusing to me that the agent would be better at it. Unless it's essentially saying the game is better if you play purely logically and ignore all context, which defeats the fun of playing it?
> Interesting, but surely it has to retain some information about the past to be able to know how to update that belief vector?
The belief vector is updated on the fly. When players take moves, we use the belief vector and our CFR-generated move probabilities to perform Bayes' rule. Once the belief vector is updated, we throw out all information related to the specific move they took.
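The update described here is just Bayes' rule applied per assignment. A minimal sketch, where `likelihood` stands in for the paper's CFR-generated probability of the observed action under each assignment (in the real system those come from the learned strategies, not a hand-supplied table):

```python
# Hedged sketch of the on-the-fly belief update: posterior is proportional
# to prior times P(observed action | assignment). `likelihood` is a
# placeholder for the CFR-derived move probabilities.
def update_belief(belief, likelihood):
    """belief: dict assignment -> prior prob.
    likelihood: dict assignment -> P(observed action | assignment).
    Returns the normalized posterior."""
    posterior = {a: p * likelihood[a] for a, p in belief.items()}
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("observed action impossible under every assignment")
    return {a: p / total for a, p in posterior.items()}

belief = {"A": 0.5, "B": 0.5}
# Suppose the observed vote is twice as likely under assignment A:
posterior = update_belief(belief, {"A": 0.8, "B": 0.4})
print(posterior)  # A ends up twice as likely as B
```

Once the posterior is computed, the action itself can be discarded: the belief vector is a sufficient summary going forward.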
> Outside of directly being on a mission, the only major "good" or "evil" actions are voting for/against missions, once you know if the mission succeeded or failed. Do you just not take that into account?
DeepRole takes all player actions into account - the key to good performance in Avalon is knowing how to interpret the voting/proposal actions of all the players. We explored this in our paper: LogicBot only uses the mission fail results to deduce who is good, and has a lower win rate than DeepRole in all situations.
> If it's literally just a representation of the outcomes of the missions and who went on them, then isn't the Belief Vector just the venn diagram of how every mission went with some iterative statistics laid over it?
While you can tease out the "venn diagram" aspect out of the belief vector (it will assign 0 probability to impossible assignments), it's far richer than that - it weights the possible assignments based on all of the moves it has observed.
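The "assign 0 probability to impossible assignments" part is the purely deductive piece. A toy illustration (the assignment encoding is hypothetical, and here belief is kept only over the evil pair): a failed mission means at least one participant was evil, so any assignment in which every mission member is good is ruled out.

```python
# Toy deduction over a belief keyed by frozenset-of-evil-players.
def rule_out(belief, mission_members, failed):
    """Zero out assignments contradicted by a failed mission, renormalize."""
    if not failed:
        # A success rules nothing out: evil players may choose not to fail.
        return belief
    posterior = {evil: (0.0 if evil.isdisjoint(mission_members) else p)
                 for evil, p in belief.items()}
    total = sum(posterior.values())
    return {evil: p / total for evil, p in posterior.items()}

belief = {frozenset({0, 1}): 0.5, frozenset({2, 3}): 0.5}
print(rule_out(belief, {0, 4}, failed=True))
# {2, 3} is ruled out: the fail proves 0 or 4 is evil
```

The soft, probabilistic weighting from observed votes/proposals sits on top of this hard logical filtering.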
In some sense, DeepRole is playing with one of its hands tied behind its back. All it knows about the state of the game is this belief vector, the number of succeeds, the number of fails, and the proposal count. It doesn't know the specific moves that led to that point in the game. The fact that we can summarize everyone's previous moves into this belief vector is somewhat surprising, considering human players can look back at the game history and re-synthesize it for new insights.
I don't think this is true. On ProAvalon, human players can see the full history of the game at all times [1], and use it to make decisions. DeepRole, on the other hand, can only use its internal belief state. This belief state is only a summary of what has happened in the game - DeepRole has no way of knowing who went on previous missions, or how people voted, or who proposed what. See above for more detail.
I would like to find the right game to do some research with CFR, as I think cooperation in incomplete-information games is one of the most interesting fields in AI.
I would like a game that is:
- Turn-based
- Incomplete information (fog of war)
- Team-based
- One where strong cooperation and coordination between players is required to win
I started to create a 2D CSGO, but I would like to know if any similar games already exist.
Instead of a total collective reward being the goal, you’d have team-based scores. You’d need cooperation within the team, and aggressive action against the enemy team.
[1] From Appendix F of the paper: https://www.youtube.com/watch?v=9RkUFHYTo_s