We don't talk enough about Type 2 error! So many papers everywhere are pure trash because they don't account for it, and people are so fucking oblivious to it that they don't even catch the obvious ones. Even researchers and publications that are supposed to properly review their articles simply fail to ask the obvious question: "Did you measure the segment that doesn't fit either of your variables?"

  • 1
    Umm, what?

    You want people to report their false positives AND false negatives, instead of just giving you a total accuracy?

    I mean, depending on the application it may become necessary, I get that. But it's not always the important part.
  • 0
    @NoMad It's mathematically impossible to prove A <=> B without measuring all intersections between A, B, Aᶜ, and Bᶜ (some of these measurements might be infinite, but only if their complement is measurably finite).

    Even if you have high total accuracy, you can still have significantly unspecific, insensitive, or imprecise results.
  • 0
    @hitko 🤔 I think you need to read your last sentence again, and look up how "accuracy" is actually calculated. You can't have "significance" without having some measure of distance for this "significance". And what you're saying is not relevant to Type 2 error. Type 2 does not discuss significance at all, as far as I know.
  • 0
    Not to be confused with the beta significance that we calculate on total errors, aka on the "number of cases that are Type 2".
  • 0
    Actually, I just got super confused with the whole thing. Let's scrap this and start over. You're saying you want the Type 2 error reported, which is false negatives. While accuracy is (tp+tn)/total, which by default includes the fn in the total. But you don't want that. You want some "distance" of these samples to be reported.

    Am I getting that right so far?
  • 0
    @NoMad (tp+tn)/total doesn't give the whole picture. Consider the following:

    (5+100)/110 = 0.954, fp+fn = 5

    Now, you might have fn = 5, fp = 0, in which case you only identified 50% of the positive cases, or you might have fn = 0, fp = 5, in which case there's a 50% chance someone you identified as positive is actually negative, which can be equally worthless. And yet, in both cases you have 95.4% accuracy.
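    The two scenarios above can be sketched in a few lines. The counts below (tp=5, tn=100, etc.) are one way to realize them consistently with (5+100)/110, and the helper name `metrics` is mine:

    ```python
    def metrics(tp, tn, fp, fn):
        """Accuracy, recall (sensitivity), and precision from confusion counts."""
        total = tp + tn + fp + fn
        accuracy = (tp + tn) / total
        recall = tp / (tp + fn) if tp + fn else 0.0      # share of real positives found
        precision = tp / (tp + fp) if tp + fp else 0.0   # share of positive calls that are real
        return accuracy, recall, precision

    # Scenario 1: fn=5, fp=0 -> half the real positives are missed (Type 2 error)
    a1, r1, p1 = metrics(tp=5, tn=100, fp=0, fn=5)
    # Scenario 2: fn=0, fp=5 -> half the positive calls are wrong (Type 1 error)
    a2, r2, p2 = metrics(tp=5, tn=100, fp=5, fn=0)

    print(a1, r1, p1)  # accuracy ~0.954, recall 0.5, precision 1.0
    print(a2, r2, p2)  # accuracy ~0.954, recall 1.0, precision 0.5
    ```

    Same accuracy in both runs; only recall and precision expose which kind of error you're making.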
  • 1
    Accuracy is a problem if your labels are unbalanced...which they generally are. That’s why it’s better to report recall or precision. F1 score can be used, but it’s a bit of a blunt tool. Better to be guided by your problem to recall or precision.
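    The F1 score mentioned above is just the harmonic mean of precision and recall, which is what makes it blunt: it punishes whichever of the two is low without telling you which one. A minimal sketch (the function name is mine):

    ```python
    def f1(precision, recall):
        """Harmonic mean of precision and recall."""
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

    # Perfect precision but 50% recall still drags F1 down to ~0.667
    print(f1(1.0, 0.5))
    # ...and so does the mirror case, perfect recall with 50% precision
    print(f1(0.5, 1.0))
    ```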
  • 1

    Type 2 errors are very important, almost central in some problems, but don’t matter so much in others.

    In detection problems where we attempt to detect something rare, for example, Type 2 is central. (Like in the (5+100)/110 example above, if we cared about the class of 5, i.e. the class for which the probability of an observation existing is lower.)

    In detection problems where we want to detect something that is common, not so much. This is because in our observations, the majority will be the ones which come from the common class so precision is good enough. (Unbalanced data here again.)

    For classification type problems, both must be reported. Especially if the data is unbalanced.

    (I hope this makes sense. It’s 3AM and I’m unable to fall asleep.)