My Texas Hold'em ML algorithm keeps deciding that the best strategy for making the most money is to lose the least, which is why it constantly folds after a certain point... *sigh*

Kids, don't gamble. Math has spoken, you'd be wise to listen.

Comments
  • 3
    Sage advice I guess...

    How do you define "lose the least amount of money"? Is that per game? Per hand? Per year? Do you take into account how many chips its opponent(s) have?
  • 4
    @loopback It does take into account the chips the other players have, as well as a number of other things. It is measured per x rounds played, where x is less than or equal to the number of players at that table.

    Once x rounds have been played, the evolutionary algo kicks in and breeds the next generation of players based on global standings (there are thousands of agents playing at hundreds of tables).
  • 5
    Maaan, I would love to take a look at that project. Any chance you have it on GitHub?
  • 2
    📌
  • 1
    @AleCx04 Yep, check my profile for a link to my GitHub, you'll see it under my pinned repos.
  • 1
    @rsync LOL no, that's reserved for fourth year students, and idk how deep into ML they go. Have to update my profile...

    I learnt that after watching some YouTube vids by 3Blue1Brown. He's got a 4-episode series on feed-forward networks, and it's so well explained and represented (the math portion of the networks is explained in the fourth episode; it requires some linear algebra and multivariable calc, but it's fairly easy to get a grasp of if you have differential calculus down). If you're interested in starting with ML I recommend his series, it's amazingly well visualized and explained.

    That being said, you may struggle with it for a while, it took me four days to get it to click, and I was sinking tons of hours a day into it.
  • 2
    I haven't dealt with ML and stuff. But shouldn't it be possible to penalize folding x times in a row, or to incentivize taking all the chips?
  • 2
    @Kubernatural Yeah, it's what I have to work on. It's tricky though, because when it gets good (if ever), it will be playing bad hands just because I've hard-coded that it must. So maybe after x generations I can lift those restrictions, still working on that.

    Currently more concerned with optimizing the program since the code is a bit shitty and I don't want to be waiting months to train an agent...
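
For anyone following along, here is a rough sketch of the two ideas discussed in the comments above: breeding the next generation from global standings, and penalizing long fold streaks with a penalty that fades out after a number of generations. This is not the poster's actual code. Every name in it (`Agent`, `fold_penalty_weight`, `fitness`, `breed`) and every number (the starting weight of 50, the 100-generation fade, the 3 free folds, the 25% survivor cut) is a made-up assumption, and the simple crossover-plus-mutation step is only one of many ways to breed agents.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent record; the real project may store very different data."""
    genome: list                 # strategy parameters (e.g. network weights)
    net_chips: float = 0.0       # chips won minus chips lost over the x-round window
    max_fold_streak: int = 0     # longest run of consecutive folds in that window


def fold_penalty_weight(generation, start=50.0, anneal_generations=100):
    """Penalty per excess fold, decaying linearly to zero (assumed schedule)."""
    remaining = max(0.0, 1.0 - generation / anneal_generations)
    return start * remaining


def fitness(agent, generation, free_folds=3):
    """Net chips minus a penalty for fold streaks longer than free_folds."""
    excess = max(0, agent.max_fold_streak - free_folds)
    return agent.net_chips - fold_penalty_weight(generation) * excess


def breed(population, generation, rng, survivor_fraction=0.25, mutation_scale=0.1):
    """Rank all agents globally, keep the top slice as parents, refill by crossover.

    Assumes the population holds at least two agents with equal-length genomes.
    """
    ranked = sorted(population, key=lambda a: fitness(a, generation), reverse=True)
    n_parents = max(2, int(len(ranked) * survivor_fraction))
    parents = ranked[:n_parents]

    children = []
    while len(children) < len(population):
        mom, dad = rng.sample(parents, 2)
        # Uniform crossover: take each gene from one parent, then add Gaussian noise.
        genome = [rng.choice(pair) + rng.gauss(0.0, mutation_scale)
                  for pair in zip(mom.genome, dad.genome)]
        children.append(Agent(genome=genome))
    return children
```

With these made-up numbers, an agent that loses 10 chips but folds 10 times in a row scores worse in generation 0 than one that loses 40 chips while actually playing. By generation 100 the penalty is gone and only chips count, which is the "lift the restrictions after x generations" idea without a hard cutoff.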