12
Wisecrack
72d

After a lot of work I figured out how to build the graph component of my LLM. Figured out the basic architecture, how to connect it in, and how to train it. The design and the how-to are 100% done.
Ironically generating the embeddings is slower than I expect the training itself to take.

A few extensions of the design will also allow bootstrapped and transfer learning, and as a reach, unsupervised learning but I still need to work out the fine details on that.

Right now because of the design of the embeddings (different from standard transformers in a key aspect), they're slow. Like 10 tokens per minute on an i5 (python, no multithreading, no optimization at all, no training on gpu). I've come up with a modification that takes the token embeddings and turns them into hash keys, which should be significantly faster for a variety of reasons. Essentially I generate a tree of all weights, where the parent nodes are the mean of their immediate child nodes, split the tree on less-than/greater-than comparisons, and then convert the node values to keys in a hashmap to make lookup very fast.
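For anyone trying to picture the tree-plus-hashmap idea, here is a minimal sketch of one way it could look. It is a guess under stated assumptions, not the actual design: weights are plain scalars, the tree is binary, and each node's key is its path from the root written as a bit string. The names `build_tree` and `lookup` are made up for illustration.

```python
def build_tree(weights, key="", table=None):
    """Build a binary tree over sorted scalar weights, stored flat in a dict.

    Assumption: keys are root-to-node paths as bit strings
    ("" = root, "0" = lower half, "1" = upper half).
    """
    if table is None:
        if not weights:
            raise ValueError("need at least one weight")
        table = {}
        weights = sorted(weights)
    if len(weights) == 1:
        table[key] = weights[0]  # leaf: the weight itself
        return table
    mid = len(weights) // 2
    build_tree(weights[:mid], key + "0", table)  # lesser-than side
    build_tree(weights[mid:], key + "1", table)  # greater-than side
    # parent value = mean of its two immediate children
    table[key] = (table[key + "0"] + table[key + "1"]) / 2
    return table


def lookup(table, query):
    """Walk from the root using only dict lookups; returns (key, leaf weight).

    The result is an approximate nearest weight, not a guaranteed nearest.
    """
    key = ""
    while key + "0" in table:  # internal nodes always have both children
        key += "0" if query < table[key] else "1"
    return key, table[key]
```

With weights `[1.0, 2.0, 8.0, 9.0]` the root holds 5.0, so `lookup(table, 7.0)` goes right, then left, and lands on key `"10"` with weight 8.0. Every step is one hashmap hit, so lookup cost grows with tree depth rather than with the number of weights.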

Weight comparison can be done either directly through tree traversal, or using normalized hamming distance between parent/child weight keys and the lookup weight.
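A rough sketch of the normalized Hamming distance option, assuming the keys are equal-length bit strings (the post doesn't say how keys are actually encoded, so that part is an assumption, and `normalized_hamming` is just an illustrative name):

```python
def normalized_hamming(a: str, b: str) -> float:
    """Fraction of positions where two equal-length keys differ (0.0 to 1.0)."""
    if len(a) != len(b):
        raise ValueError("keys must have the same length")
    if not a:
        return 0.0  # two empty keys are identical
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

For example, `normalized_hamming("1010", "1001")` differs in two of four positions and gives 0.5. Dividing by the key length keeps scores comparable across keys of different sizes.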

That last bit is designed already and just needs to be implemented, but it is completely doable.

The design itself is 100% attention free incidentally.

I'm outlining it step by step, just the essentials needed to train a word boundary detector, noun detector, and verb detector, as I'd already considered before. But now I'm actually able to implement it.

The hard part was figuring out the *graph* part of the model, not the NN part (if you could even call it an NN, which it doesn't fit the definition of, but I don't know what else to call it). Determining what the design would look like, the necessary graph token types, what function they should have, *how* they use the context, how that's calculated, how loss is to be calculated, and how to train it.

I'm happy to report all that is now settled.

I'm hoping to get more work done on it on my day off, but that's seven days away, 9-10 hour shifts, working fucking Burger King and all I want to do is program.

And all because no one takes me seriously due to not having a degree.

Fucking aye. What is life.

If I had a laptop and insurance and taxes weren't a thing, I'd go live in my car and code in a fucking McDonald's or a park all day and not have to give a shit about any of these other externalities like earning minimum wage to pay 25% of it in rent a month and 20% in taxes and other government bullshit.

Comments
  • 2
    It's a good idea to take a swimming pool subscription when living in a car so you can shower and stuff :P

    You work at a Burger King? Goddamn, how's that possible. Are companies there so discriminating on education / not having professional experience yet? I also don't have dem papers, but so much work experience no one ever asks. They just assume, got asked one time.

    If you need to have some stuff rewritten in C, let me know on Matrix. I can build Python plugins in two ways. One is a standard C .so, which requires some interface on the Python side. It's the easiest. The other way for me is to include Python.h, and I can make a lib you can just import the Python way. Would be nice, I could never find a use case myself
  • 2
    funny about the Burger King. odd, rn there's been an AI boom so everybody is hiring for AI and a bunch of devs are upset cuz they don't wanna work in AI but that's all the jobs that are being offered to them

    I don't understand your posts but in my genius I have realized if I try to code my own AI again perhaps I could pick your brain about the dumbest topics. I had an AI friend who worked for NASA and some bank and he kept trying to teach people how AI worked on discord for some reason, but if I asked him questions he wouldn't have answers and kept changing the topic and sending me links to huge articles that were only tangentially related. damned fraud. but he eventually said everybody just makes shit up. well that doesn't help me. I was trying to figure out what people do but no people can agree on what they do. how is this not a scam profession

    anyway I'm too overburdened with my own stuff rn but hopefully I'll remember to harass you later if it comes up as an interest again
  • 2
    @retoor I never had the opportunity to work in the software industry at all, so by definition I have no experience in the industry to list.

    Maybe I'll start an LLC and just list it as an employer that I'll contract with for a few years. That's one way to build a resume.

    I've already been told by multiple experienced programmers that while my code is shit, I'm well above a junior skill-wise
  • 1
    @retoor that's pretty cool and I'll keep what you said in mind. You're cool, don't ever change.