Comments

  • What the models were trained on?
    That is correct. What I mentioned only applies to the AI predictions and would not lend itself to a context where the handicapper applies varying angles/methods on-the-fly. The latter can only be tested manually over time. I think we are on the same page.
  • What the models were trained on?
    Thanks, Dave, for the detailed explanation. I naturally have an inquisitive mind, so I'm trying to get a better read of the system at hand and avoid making wrong assumptions. Of course, it is for you to judge what parts you consider trade secrets...no issues there.

    Also, I am not in any way trying to deny the early betting results you shared with us last week. They do provide evidence of valuable signals for picking winners and identifying profitable bets.

    About out-of-sample testing: usually when folks build machine learning models, they set some subset of the data aside, e.g., a randomly sampled 10%. Then the model is trained on the remainder using any appropriate algorithm for the task at hand. After training, the out-of-sample test data is fed to the model to judge how it performs on data it has never seen before it is used in real life. The whole idea is to avoid overfitting, as most ML models are prone to it, some more than others.
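
    To make the random hold-out idea concrete, here is a minimal sketch in Python with scikit-learn. The file and column names are made up purely for illustration; the point is just that the 10% test slice never touches the fit.

      # Hypothetical example: random 10% hold-out for out-of-sample testing
      import pandas as pd
      from sklearn.model_selection import train_test_split
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      df = pd.read_csv("races.csv")                 # hypothetical tabular dataset
      X = df[["speed_fig", "pace_fig", "odds"]]     # made-up feature columns
      y = df["won"]                                  # 1 if the horse won, else 0

      # Hold out a random 10% that the model never sees during training
      X_train, X_test, y_train, y_test = train_test_split(
          X, y, test_size=0.10, random_state=42, stratify=y)

      model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

      # Out-of-sample performance is the honest estimate of real-life behavior
      print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))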

    Another way to do it is setting aside a certain time slice instead of a randomly selected % of the observations/rows. For instance, I may be working with historical data from 2017 to 2023 for a given problem, so I use the data from 2017 to 2022 to train my model and then validate it against the 2023 data. Because the model has not been exposed to the 2023 data during training, this gives me a sense of its real-life performance. Some algos are very good at memorizing training datasets, which can result in superb in-sample accuracy that cannot be replicated against new data. Hope this explains...
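
    And the same idea with a time slice instead of a random sample, again just a sketch with made-up file and column names:

      # Hypothetical example: time-slice validation (train 2017-2022, test 2023)
      import pandas as pd
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      df = pd.read_csv("races.csv")                 # hypothetical dataset with a "year" column
      features = ["speed_fig", "pace_fig", "odds"]  # made-up feature columns

      train = df[df["year"] <= 2022]
      test = df[df["year"] == 2023]

      model = LogisticRegression(max_iter=1000).fit(train[features], train["won"])

      # 2023 was never seen during training, so this approximates live performance
      preds = model.predict_proba(test[features])[:, 1]
      print("2023 AUC:", roc_auc_score(test["won"], preds))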
  • What's the best way to use Grades?
    Thanks, Tom.
    "I am not sure that is correct. My understanding is the AI Line is how the public (mostly whales) should bet, whereas the AI BH Pct is how the AI predicts the win probability for each horse. $Net/Grades are a function of both of these."

    Ah, therein lies my confusion...I took some notes while watching the first set of videos some time back and I must have misinterpreted things.

    It would be interesting to get a better feel for how often AI Line is on the mark (within a reasonable deviation) and how often AI BH is. I suppose The Neal has been devised to, in a way, take the potential AI Line uncertainty out of the equation, focusing solely on AI BH + the user's own analysis. If that is consistently achieved/achievable, I see no reason not to employ The Neal instead of AI Line by default...especially in predicted Chaos races. If my chain of thought is correct, we're back to square one: Grades are not that meaningful after all :)
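
    If the results could be exported with those fields, the kind of check I have in mind might look like the sketch below. The file and column names are purely hypothetical (I don't know what the real export looks like); it just measures how often AI Line lands near the actual odds and how well AI BH Pct behaves as a win probability.

      # Hypothetical check of AI Line accuracy and AI BH Pct calibration
      import pandas as pd

      df = pd.read_csv("detr_results.csv")   # made-up export with one row per horse

      # AI Line accuracy: share of horses whose final odds land within +/-20% of the AI Line
      within = (df["final_odds"] - df["ai_line_odds"]).abs() <= 0.20 * df["ai_line_odds"]
      print("AI Line within 20%:", within.mean())

      # AI BH calibration: Brier score of the predicted win probability (lower is better)
      brier = ((df["ai_bh_pct"] / 100.0 - df["won"]) ** 2).mean()
      print("AI BH Brier score:", brier)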
  • Working Groups to learn deTerminator
    I strive to be data-driven rather than rely on static rules. What the data has shown me is that pace handicapping is for real and all other factors are dubious at best. I was never able to make much sense of trainer stats. Is a 22% trainer (read: 78% loser) that much better than a 15%er? I suspect yes, but the difference is negligible. And if you get into the weeds, you end up with too few observations to rely on. I like the BRIS PPs format in general and have come to believe Thorograph figures and form factors are also useful, but they are no panacea on their own. If they show Horse A is really a slow animal that has to run a lifetime best, it's usually so in reality. Rich Strike was in the ballpark that day, btw.
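
    To put a rough number on the "too few observations" point, here is a back-of-the-envelope two-proportion z-test in Python. The starts and wins are made up for illustration, roughly a year of data for many barns.

      # Hypothetical check: is a 22% trainer statistically distinguishable from a 15% one?
      from math import sqrt

      def win_rate_diff_z(w1, n1, w2, n2):
          """Two-proportion z-statistic for the difference in win rates."""
          p1, p2 = w1 / n1, w2 / n2
          p = (w1 + w2) / (n1 + n2)                   # pooled win rate
          se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error of the difference
          return (p1 - p2) / se

      # 20-for-90 (22%) vs 12-for-80 (15%) -- made-up sample sizes
      z = win_rate_diff_z(20, 90, 12, 80)
      print(round(z, 2))   # about 1.2, well under 1.96, i.e., not statistically distinguishable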
  • Working Groups to learn deTerminator
    Wednesday evening is a good option for me. If not, Monday or Thursday. I don't have a preference on what the number limit should be for beta release...deferring to Dave on that.
  • Working Groups to learn deTerminator
    Hi, fellow deTERMINATORs. My name is Atakan (Kahn is the nick) and I am based in Las Vegas. I grew up in Turkey, where I first went to horse races with my dad as a kid. I've been in the U.S. since the mid-90s...first to study, then for work. I have been in the software industry in Product Manager/Marketing Manager roles, including stints with Big Tech as well as startups. Currently, I'm with a Machine Learning software startup helping enterprises make sense of their data, e.g., fraud detection.
    As a handicapper, I've mostly been betting on big days due to time constraints and have had a mixed bag of results, not profitable in aggregate. During the pandemic, I developed my own statistical models on top of Brisnet data and had better success...work in progress. I'm here to improve and better understand the fascinating problem of handicapping, which strikes me as majority science and minority art/luck. If things turn profitable too, I won't be too upset about it! :)
  • Ques004: AI BEST HORSE - ODDS SENSITIVE (LONGSHOTS)
    Dave is right about Deep Learning/Neural Nets being prone to overfitting to the data at hand. In my experience, when we are dealing with tabular data (as opposed to hard-to-feature-engineer data types like images or video input), even simpler algorithms like Logistic Regression can outperform overly complicated and brittle pure Deep Learning/Neural Net approaches.
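
    A quick way to sanity-check that claim on any tabular dataset is to cross-validate a plain Logistic Regression against a small neural net on the same features. This is only a sketch with made-up file and column names, not a claim about how deTERMINATOR's models are built.

      # Hypothetical comparison on tabular data: Logistic Regression vs a small neural net
      import pandas as pd
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.linear_model import LogisticRegression
      from sklearn.neural_network import MLPClassifier
      from sklearn.model_selection import cross_val_score

      df = pd.read_csv("races.csv")                 # hypothetical tabular dataset
      X = df[["speed_fig", "pace_fig", "odds"]]     # made-up feature columns
      y = df["won"]

      logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
      mlp = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0))

      # On tabular problems the simpler model often holds its own (or wins) out of sample
      print("logreg AUC:", cross_val_score(logreg, X, y, cv=5, scoring="roc_auc").mean())
      print("mlp AUC:   ", cross_val_score(mlp, X, y, cv=5, scoring="roc_auc").mean())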
  • 2022 Belmont Stakes
    Nest had her troubles but what a filly...great effort!
  • 2022 Belmont Stakes
    A lot to digest here. I have been a Thorograph sheets user for a number of years, and it seems they really like Nest in this race, drawing a line through her Oaks performance and emphasizing that she'll have a weight advantage against the boys. Plus the Curlin factor = bred for the distance. Not to be overlooked...the odds will probably have shortened by the time the race goes off.
  • Chaos
    Thanks, RanchWest. The Derby sure felt like a chaotic race to my bankroll! :))
  • Chaos
    Does this approach work on races with many starters like the Kentucky Derby? For instance, was the Mine That Bird derby a chaos race?