3
sleek
325d

ChatGPT is too politically correct, and I hate that I'm paying for an API that refuses certain prompts because they're considered inappropriate, or because it decides it shouldn't be giving me its analysis of a certain subject.

Has anyone dabbled with using an open-source LLM and made their own lite version of ChatGPT, minus all the restrictions?

I know it's not going to be as good, but at the very least it'd be free from the constraints.

Comments
  • 3
    so what kinda stuff are you asking it?
  • 4
    Honestly, you can run Nous-Hermes-2-SOLAR-10.7B or, even better, Mixtral-8x7B on a local machine at this point using just the CPU. You will need a lot of RAM. Not sure how much SOLAR takes up, but Mixtral needs about 24 GB or so (depending on the quant type), so your RAM will need to cover that.

    Or, if you run it through llama.cpp, you can offload part of it to your GPU, so it only has to fit into RAM + GPU memory combined.

    Either way, once you get it running, you have a fully uncensored, fully offline LLM with really good output (both of those rival GPT-3.5, with Mixtral getting close to GPT-4, except it's not multimodal yet afaik), and even on CPU they run at several tokens a second, which is perfectly usable (you can get SOLAR responding almost instantly on a Mac M2 device).

    I'd recommend getting an LLM running locally and seeing what you can get away with; it's fun, open, and free.
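
    For a rough idea, here's a minimal sketch using the llama-cpp-python bindings (my own suggestion rather than anything required; the GGUF filename and layer count are placeholders for whatever quant you actually download):

        from llama_cpp import Llama  # pip install llama-cpp-python

        # Load a quantized GGUF model. n_gpu_layers > 0 offloads part of the model
        # to the GPU, so only the remainder has to fit in system RAM.
        llm = Llama(
            model_path="./mixtral-8x7b-instruct.Q4_K_M.gguf",  # placeholder path
            n_ctx=4096,       # context window
            n_gpu_layers=20,  # 0 = CPU only; raise it until your VRAM is full
        )

        out = llm("Explain GGUF quantization in two sentences.", max_tokens=128)
        print(out["choices"][0]["text"])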
  • 1
    @Hazarth Links, please! Relevant to my interests.

    Yes I want to be spoonfed. Ty
  • 3
    @NeatNerdPrime If you're on Linux, the simplest and fastest way to get running is Ollama:

    https://ollama.ai/

    You can then get either solar:

    https://ollama.ai/library/solar

    or Mixtral 8x7b:

    https://ollama.ai/library/mixtral
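
    Once you've pulled one of those (ollama pull solar / ollama pull mixtral), a minimal way to call it from Python is the ollama pip package (that part is my own suggestion; the CLI or the plain REST API works just as well):

        import ollama  # pip install ollama; assumes the Ollama server is running locally

        # Chat with a locally pulled model; swap "solar" for "mixtral" if that's what you pulled.
        response = ollama.chat(
            model="solar",
            messages=[{"role": "user", "content": "Summarize what a GGUF quant is."}],
        )
        print(response["message"]["content"])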

    You might be able to figure out how to get it running on WSL2 on Windows yourself. Or you can try GPT4All; it doesn't have all the up-to-date models built in, but you can always get Solar or Mixtral from Hugging Face (needs an account).

    Though the best performance, in my experience, comes from the latest llama.cpp:

    https://github.com/ggerganov/...

    but you will have to compile it yourself, and it's just a CLI interface.

    Honestly, the best way to run LLMs right now is Linux, so if you're on Windows you're going to have to do some extra work to get things running (except GPT4All, which just works; Solar and Mixtral will still work there too, but you have to fetch them from HF first, in GGUF format).
  • 1
    I've been messing around with GPT4All, which nicely packages a UI with automatic model downloads. You can also get additional models from Hugging Face (huggingface.co).
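
    If you'd rather script it than click around the UI, the gpt4all Python bindings expose the same models; here's a minimal sketch (the model filename is only an example; point it at whatever GGUF you downloaded if it isn't in GPT4All's catalog):

        from gpt4all import GPT4All  # pip install gpt4all

        # Downloads the model into GPT4All's cache on first use if it isn't there already.
        model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")  # example model name

        with model.chat_session():
            print(model.generate("Give me one sentence on local LLM inference.", max_tokens=100))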
  • 0
    @Hazarth many thanks, the information is yummy!
  • 0
    @Hazarth

    Hello bro, I want to ask: does the model only use resources when we use it?
  • 1
    @novatomic Depends on the runner. I think both Ollama and GPT4All only use resources while generating and then free most of them, but if you're running llama.cpp, it stays loaded in memory until you stop it.
  • 1
    Yeah, no matter what I do, I can't get it to admit that the reason I'm single is because the Rothschilds paid every woman ever to ignore me. Nor will it admit France was a mistake.
  • 0
    If I were to use an LLM, it would be to strip it down to use far fewer resources and to understand its inner workings!
  • 0
    @Hazarth I ran Llama 2, and it REFUSED to comply with my prompts :(

    It's not about the prompts themselves; I just want a completely open LLM with no restrictions, no firewall, no woke-filter.

    Will try Mixtral to see if it complies.
  • 0
    Most LLMs are just shit tons of matrix multiplication, after all.

    The links between neurons can be represented by their weights (probabilities) as one giant table (a matrix).
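
    As a toy illustration (NumPy, my own example), a layer's connections are one weight matrix, and a forward pass through that layer is a single matrix multiplication:

        import numpy as np

        hidden = 768                        # hidden size, arbitrary for the example
        x = np.random.rand(1, hidden)       # one token's activation vector
        W = np.random.rand(hidden, hidden)  # the "giant table" of connection weights

        h = x @ W                           # the whole layer is one matrix multiplication
        print(h.shape)                      # (1, 768)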