12
erebus
6y

I got 0.99 accuracy and a 0.98 F1 score on a text classification task, only to realize that I'd fit the TF-IDF vectorizers on the entire corpus (train+test).

Now my professor is furious -_-
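
For anyone wondering what the fix looks like, here's a minimal sketch. It assumes a plain whitespace tokenizer and a smoothed IDF, and the names `fit_idf` and `transform` are made up for illustration (not from any library or from my actual project). The only point is that the vocabulary and IDF weights come from the training split, and the test split just gets transformed with them.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Learn the vocabulary and smoothed IDF weights from training docs only."""
    n_docs = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        # Count each term once per document.
        doc_freq.update(set(doc.lower().split()))
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1
        for term, count in doc_freq.items()
    }


def transform(docs, idf):
    """Map docs to sparse TF-IDF dicts; terms unseen at fit time are dropped."""
    vectors = []
    for doc in docs:
        term_freq = Counter(doc.lower().split())
        vectors.append(
            {term: count * idf[term] for term, count in term_freq.items() if term in idf}
        )
    return vectors


# Toy data, purely illustrative.
train_docs = ["good movie", "bad movie", "good plot"]
test_docs = ["good ending"]

idf = fit_idf(train_docs)                # fit on the training split only
train_vecs = transform(train_docs, idf)
test_vecs = transform(test_docs, idf)    # the test split is only transformed
```

Fitting on train+test instead would let test-set terms and document frequencies leak into the features, which is probably part of why the scores looked so good.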

Comments
  • 2
    Lol 😂 😂 just use the test set for training and overfit to achieve 100% accuracy
  • 0
    @phreakyphoenix Tbh I don't see the difference from what I already did!
  • 1
    @erebus Right now you may be overfitting, but we want to go that extra mile, you know 😉, just to be sure. Training a small network on the entire dataset would just confuse it a teeny tiny bit, compared to using just the test set and a ginormous network.

    See, for this project, we gotta follow worst practices lol.
  • 3
    Years ago, when I was a programming kid, I wrote a very complex piece of code. I tried it for the first time and... it worked!

    Something was surely wrong: my code _never_ worked on the first try. Turned out that I was implementing something that was already being taken care of by my shell, and my code was never executed.

    TL;DR: when things seem to work so well, something must be tremendously wrong 😛
  • 0
    @stacked I'm gonna frame this TL;DR and hang it on my work desk.