11
NoMad
22h

LOVE how these companies keep publishing their "AI models" without verifying or explaining their architecture, their data collection process, or any validation scores.

Like, am I supposed to just take your word for it working "better" because of a study done by the school of PSYCHED(🤪)?

Comments
  • 6
    Taking someone's word for it? That's how you know your computer is secure. Or your router. Or your car.

    And the funny thing is, as long as everyone believes them, they are safe.
  • 2
    They invented their own open source flavors. These days if others can possess your software it's considered open already. Lunduke has a beautiful episode about it. Maybe LLMs make people buy more books instead of fewer. They introduce you to a lot of subjects to get interested in. So maybe they have an advertising effect. Who knows. But look how companies don't give a fuck about rights. They screwed over nearly every big author in the world. If they screw them, they'll screw you, guaranteed. They commit crimes in daylight, no fucks given. If I ask a very detailed question, a phrase from a book, it freaking knows it. A random unimportant event. Then you ask where it got that from. It mentions exactly that book. You ask if it was part of the training data - error 404/500 basically. Maybe models above a certain size should get audited or smth.
  • 0
    they do explain the architecture, and the test scores are externally verifiable. They don't disclose their data collection practices because they have none. They argue it's fair use the same way transformative works and parody are, because even though that's utter bullshit, it's the only legal categorization that would, if it made any sense, permit the kind of treatment they put the entire internet through.
  • 0
    you know it's not fair use BTW because the individual pieces of copyrighted input data aren't given any focus. All categories declared as fair use are supposed to focus on the original, because the reason fair use as a legal concept exists is solely to prevent authors from wielding copyright law to censor public discourse sparked by their work.
  • 0
    I'm actually open to declaring statistical analysis as fair use, because important truths can be revealed by statistics, but only if the process and its results are public domain and only the presentation and further processing of these results may be licensed, to ensure that the statistical process can be scrutinized by members of the public.
  • 0
    Publishing the statistical process would also ground in basic fact all debates about whether the input data has been sufficiently transformed as to be unrecoverable. These discussions are currently happening in complete darkness, and even if lawsuits come to a conclusion, the specifics of the argument will be hidden from the public.
  • 0
    (pretending for now that it's possible to understand the reversibility of an adversarially designed statistical process because surely hiding a cipher in plain sight is pretty hard)