Search - "transformers"
Getting really sick of brand naming conventions these days. YouTube just suggested a video to me called "Samsung Galaxy S20 Ultra vs iPhone 11 Pro Max!"
Like what are those, Transformer names? 🙄
Today on forgotten games – Ballance.
The game is absolutely outstanding. The graphics are amazing even though the game was developed in 2004. The sound effects are perfect: I can literally feel the wooden ball rolling on steel rails. The background music is also amazing; we're talking Alexander Brandon level here.
The game is about rolling a ball through the levels while trying not to fall off. There are three balls: the stone one, the wooden one and the paper one, differing in weight, speed and momentum.
I admire the clever level design. It uses in-game map features in a multi-purpose way; for example, some levels use ball transformers (the things that transform the ball from one kind to another) as a trap that makes your ball lose momentum. It even seems like the levels were designed by some crazy modders for advanced players, but they weren't, and traveling through them feels like being a pro gamer playing custom levels.
Even though the levels seem simple at first glance, they allow non-linear gameplay and different playing styles.
The gameplay itself is pure meditation. But even though the concept seems straightforward – just follow the level and don't fall – it's not. You have to use all three ball types: there are air vents to float over, which only the paper ball can do, there are obstacles to push, which only the stone ball can do, and so on.
For additional sonic satisfaction, the levels even feature metal domes that serve no purpose other than to be bumped into for the amazing gong sound they make.
I like that when you get cocky and think "that's easy, I got this", the game quickly puts you in your place. It basically tells you that you ain't shit and you've got nothing on it.
Overall it's a mesmerizing journey through cleverly designed levels, surrounded by relaxing music and outstanding graphics.
Definitely a must-have for mechanical keyboard gamers: it's pure satisfaction playing this game with the level of precision and control a mechanical keyboard allows.
Search for "ballance widescreen fix" for modern display support.
"Just go to GPT and get me the code today!!!"
Like wtf, why does every non-tech guy think GPT is the solution to every problem?
So, a workmate wanted a full-stack system ready in 2 days and argued that it's easy with GPT. I mean yeah, GPT helps sometimes, on rare occasions. But it can't be the solution to everything. Ask it to set up the whole backend/frontend integration, will it? Ask it to manage assets, will it? Ask it to write complex code and it fails, miserably (okay, like me, but that's the point). Just because random transformers pop up each week containing some random shitty weights (not that everyone agrees they're shitty) doesn't mean they can replace a full-stack developer. These 'attention'-seeking models are just another tool, a bit more efficient, but at the end of the day it's just a tool: all it does is cut down the typing time while introducing more errors, so the average time remains the same. And my non-tech manager can't understand why he couldn't just do it himself earlier using GPT. Succkkkkkkssss!
After a lot of work I figured out how to build the graph component of my LLM. I figured out the basic architecture, how to connect it in, and how to train it. The design and the how-to are 100% done.
Ironically generating the embeddings is slower than I expect the training itself to take.
A few extensions of the design will also allow bootstrapped and transfer learning, and as a reach, unsupervised learning but I still need to work out the fine details on that.
Right now, because of the design of the embeddings (different from standard transformers in a key aspect), they're slow. Like 10 tokens per minute on an i5 (Python, no multithreading, no optimization at all, no training on GPU). I've come up with a modification that takes the token embeddings and turns them into hash keys, which should be significantly faster for a variety of reasons. Essentially I generate a tree of all weights, where the parent nodes are the mean of their immediate child nodes, split the tree on lesser-than/greater-than values, and then convert the node values to keys in a hashmap to make lookup very fast.
Weight comparison can be done either directly through tree traversal, or using the normalized Hamming distance between parent/child weight keys and the lookup weight.
That last bit is designed already and just needs to be implemented, but it's completely doable.
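Roughly what that tree-plus-hash-key idea could look like, as a minimal sketch; the function names, the bit-string key format and the toy data are illustrative guesses, not the actual implementation:

```python
import numpy as np

def build_tree(weights):
    # Each parent stores the mean of the values below it; leaves store single weights.
    if len(weights) == 1:
        return {"value": float(weights[0]), "children": None}
    mid = float(np.mean(weights))
    left, right = weights[weights <= mid], weights[weights > mid]
    if len(left) == 0 or len(right) == 0:   # all values equal: nothing left to split
        return {"value": mid, "children": None}
    return {"value": mid, "children": (build_tree(left), build_tree(right))}

def to_key(vector, tree):
    # Walk the tree with the vector's mean; record 0 for "went left", 1 for "went right".
    bits, node, v = [], tree, float(np.mean(vector))
    while node["children"] is not None:
        go_right = v > node["value"]
        bits.append("1" if go_right else "0")
        node = node["children"][1 if go_right else 0]
    return "".join(bits)

def hamming(a, b):
    # Normalized Hamming distance between two (padded) bit-string keys.
    n = max(len(a), len(b))
    a, b = a.ljust(n, "0"), b.ljust(n, "0")
    return sum(x != y for x, y in zip(a, b)) / n

# Toy usage: build the tree from fake weights, hash two fake embeddings, compare keys.
rng = np.random.default_rng(0)
tree = build_tree(rng.normal(size=256))
emb_a, emb_b = rng.normal(size=16), rng.normal(size=16)
key_a, key_b = to_key(emb_a, tree), to_key(emb_b, tree)
table = {key_a: "token_a"}                      # hashmap lookup on the key
print(key_a, key_b, hamming(key_a, key_b), table.get(key_b, "miss"))
```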
The design itself is 100% attention free incidentally.
I'm outlining the steps, only the essentials, to train a word boundary detector, a noun detector and a verb detector, as I already considered before. But now I'm actually able to implement it.
The hard part was figuring out the *graph* part of the model, not the NN part (if you can even call it an NN; it doesn't fit the definition, but I don't know what else to call it): determining what the design would look like, the necessary graph token types, what function they should have, *how* they use the context, how that's calculated, how loss is to be calculated, and how to train it.
I'm happy to report all that is now settled.
I'm hoping to get more work done on it on my day off, but that's seven days away, 9-10 hour shifts, working at fucking Burger King, and all I want to do is program.
And all because no one takes me seriously due to not having a degree.
Fucking aye. What is life.
If I had a laptop, and insurance and taxes weren't a thing, I'd go live in my car and code in a fucking McDonald's or a park all day, and not have to give a shit about any of these other externalities, like earning minimum wage just to pay 25% of it in rent each month and 20% in taxes and other government bullshit.
In 2015 I sent an email to Google labs describing how pareidolia could be implemented algorithmically.
The basis is that a noise function put through a discriminator could be used to train a generative function.
And now we have transformers.
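That loop, as a toy sketch (the layer sizes, the 1-D "real" data and the hyperparameters are all made up for illustration): noise goes through a generator, and the discriminator's judgment is the only training signal the generator gets.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))   # generator: noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator: sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 2 + 3       # toy "real" data drawn from N(3, 2)
    fake = G(torch.randn(64, 8))            # noise pushed through the generator

    # Train the discriminator to separate real from generated samples.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generator to fool it: the discriminator is the only training signal.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())    # should drift toward the real mean (~3) as training progresses
```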
I also told them that if they looked back at the research they would very likely discover that dendrites were analog hubs, not just individual switches. That's turned out to be true, too.
I wrote to them in an email as far back as 2009 that attention was an under-researched topic. In 2017 someone finally got around to writing "Attention Is All You Need".
I wrote that there were very likely basic correlates in the human brain for things like numbers, and simple concepts like color, shape and basic relationships, that the brain used to bootstrap learning. We found out years later, based on research, that this is the case.
I wrote almost a decade ago that personality systems were a means that genes could use to value-seek efficient behaviors in unknowable environments, a form of adaptation. We later found out that is probably true as well.
I came up with the "winning lottery ticket" hypothesis back in 2011, for why certain subgraphs of networks seem to naturally learn faster than others. I didn't call it that though; it was just a question that arose because of all the "architecture thrashing" I saw in the research: why there were apparently large or marginal gains in slightly different architectures when we had an explosion of different approaches. It seemed to me that the most important difference between countless architectures was initialization.
This thinking flowed naturally from some ideas about network sparsity (namely that it made no sense that networks should be fully connected, and we could probably train networks by intentionally dropping connections).
All the way back in 2007 I thought this was comparable to masking inputs in training, or a bottleneck architecture, though I didn't think to put an encoder and decoder back to back.
Nevertheless it goes to show, if you follow the research really closely, how much low-hanging fruit is actually out there to be discovered and worked on.
And to this day, Google never fucking once got back to me.
I wonder if anyone ever actually read those emails...
Wait till they figure out "attention is all you need" isn't actually all you need.
P.S. Something I read recently got me thinking. Decoders can also be viewed as resolving a manifold closer to an ideal form for some joint distribution. Think of your data as points on a balloon (the output of the bottleneck), and decoding as the process of inflating the balloon. In absolute terms, as the balloon expands, your points grow apart; but as long as the data points are not uniformly distributed, *some* points will grow closer together *relatively*, even as the surface expands and pushes points apart in the absolute.
In other words, for some symmetry, the encoder and bottleneck introduce an isotropy, and this step also happens to tease out anisotropy: information that was missed or produced by the encoder, distortions introduced by the architecture/approach, features of the data that got passed through the bottleneck, or essentially hidden features.
wow, I have to say, tail recursion, slices, iterator transformers and shadowing are great building blocks for parsers. I'm way faster at defining and modifying procedural parsers with Rust's tools than I am with Chumsky. I actually understand which of my checks are greedy and which are lazy now!
Apparently, under the correct architecture, Hopfield networks reduce down to the attention mechanism in transformers.
Very damn cool discovery. Surprised that I'm just reading about it.
Image is a snapshot from the article.
Whole article here:
https://nature.com/articles/...
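A small numpy sketch of that correspondence (my own toy illustration, not code from the article): a single modern Hopfield retrieval step has the same form as single-query attention with the keys and values both set to the stored patterns.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
d, n = 8, 5                      # pattern dimension, number of stored patterns
X = rng.normal(size=(n, d))      # stored patterns (rows) play the role of keys and values
xi = rng.normal(size=d)          # current state, i.e. the query
beta = 1.0 / np.sqrt(d)          # inverse temperature, matching attention's 1/sqrt(d) scaling

hopfield_update = X.T @ softmax(beta * (X @ xi))          # one Hopfield retrieval step
attention_out   = softmax((xi @ X.T) / np.sqrt(d)) @ X    # single-query attention with K = V = X

print(np.allclose(hopfield_update, attention_out))         # True: identical computation
```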
nested ternary operators
like/dislike?
I used to hate them cuz I would have to break them apart just to understand them, but now I use ternary operators so much that nesting at least one level is OK for me.
But I'm the only person who reads my code, so what's the consensus... is nesting one level bad?
I wouldn't want someone reviewing my code if they couldn't wrap their head around a simple ternary, so if only myself or people more experienced than myself will ever read them, then fuck it, I'm using them.
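For reference, this is roughly the kind of one-level nesting in question; the post doesn't name a language, so Python conditional expressions stand in for the ternary operator, and the example itself is made up:

```python
# Hypothetical example: one level of nesting, then the spelled-out equivalent.
def status_label(code: int) -> str:
    # One nested conditional expression, kept readable by staying on a single line.
    return "success" if 200 <= code < 300 else ("client error" if 400 <= code < 500 else "other")

def status_label_verbose(code: int) -> str:
    # The version you'd "break apart" to understand.
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "client error"
    return "other"

assert status_label(204) == status_label_verbose(204) == "success"
assert status_label(418) == status_label_verbose(418) == "client error"
```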
There is nothing in this world that can run at 100% efficiency except transformers at power plants.
@Wisecrack
Dude, it seems someone has actually done 1-bit quantization for a transformer model:
https://arxiv.org/pdf/...
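For anyone wondering what 1-bit weight quantization means in broad strokes, here's a toy sketch (my own illustration of the general idea, not the linked paper's actual method): keep only the sign of each weight plus a single per-tensor scale.

```python
import numpy as np

def quantize_1bit(w: np.ndarray):
    # One float per tensor plus one bit per weight (True means negative).
    scale = float(np.abs(w).mean())
    bits = np.signbit(w)
    return bits, scale

def dequantize_1bit(bits: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct each weight as +scale or -scale.
    return np.where(bits, -scale, scale)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.02, size=(4, 4))          # stand-in for a transformer weight matrix
bits, scale = quantize_1bit(W)
W_hat = dequantize_1bit(bits, scale)
print(W.round(3), W_hat.round(3), sep="\n\n")    # ~32x smaller storage, at some accuracy cost
```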
Hello folks, I wanted to know:
Which series do you like better, Dragon Ball or Transformers, and why?
I started developing my skills to a pro level about a year and a half ago. My skillset is focused on backend development + data science (especially deep learning), some sort of machine learning engineer. I've been filling my GitHub with personal projects for the last 5 months, and I'm currently working on a very exciting project that involves all of my skills: developing and deploying a deep learning model for image deblurring.
I started looking for work two months ago. I applied to dozens of jobs at startups; no response. I changed my strategy a bit, focusing on early-stage startups that don't have infinite money to pay all those senior devs. Nothing; not even those startups want me on their teams. I even applied to 2 or 3 and offered to do the job for little pay, arguing that I'm not in it for the money but for the experience. Nothing. I never got a reply back, not an interview; the few that did reach back (like 3, out of 3 or 4 dozen startups) only did so to say they were not interested in me.
This is frustrating. What I do with my days is just push my personal projects forward without rest. I will be broke a few months from now if I don't get a job. I'm still young, I'm 21, but I don't have economic support from my parents anymore (they are already broke). I truly don't know what to do. Currently my brother is helping me with money, but he will be broke in a few months too, as I said.
The worst part of all this is that I feel capable of getting things done; I have skills and I trust myself. This is not about me having doubts about my skills, but about startups that don't care. They are not interested in me, and the other worst thing is that my profile is in high demand, at least at startups; they always look for backend devs with machine learning knowledge. I'm nothing to them. I only want to land that first job, but it seems to be impossible.
To add to this situation, I'm from South America, Venezuela, and I'm only able to get a remote job, because my country basically has no tech industry, just agencies everywhere underpaying devs, which, by extension, don't care about my profile either!!! This is ridiculous: not even those almost-dead agencies that contract devs for very little pay in my country are interested in me! On top of that, my economic situation doesn't allow me to relocate; I simply can't afford it. I'm planning to do it, but only after landing some job for a few months. Anyway, coronavirus seems to have finally set remote work as the default, so maybe this is not a huge factor right now.
I try to find work as a freelancer. I check the freelancer sites (Freelancer, Guru and so on) every week more or less, but at least from what I see, there are no backend-only gigs for Python devs. They always ask for full-stack developers, and machine learning gigs I won't even mention.
Maybe I'm missing something obvious, but it feels incredible that someone who has skills can't land even a freelance job. Maybe I'm blind, or maybe I'm asking too much (I feel the latter is not the case). Or maybe I'm overestimating myself? I think about that from time to time, but it's not possible: I have knowledge of REST/GraphQL API development using frameworks like Flask or Django (but I like Flask more than Django; I feel awesome with its microframework approach). I'm familiar with containerization and Docker. I can mention knowledge of SQL and DBs (PostgreSQL), ORMs (SQLAlchemy), OAuth, CI/CD, unit testing, Git, soft DevOps skills, design patterns like MVC or MTV, serverless environments, and deep learning solutions end to end: data gathering, preprocessing, data analysis, model architecture design, training and fine-tuning. I'm familiar with SotA techniques widely used nowadays: GANs, transformers, residual networks, U-Nets, sequence data, image data and other high-dimensional data, data augmentation, regularization, dropout, all kinds of loss functions and non-linear functions. My toolset is based around Python, with TensorFlow as the main framework, supported by other libraries like pandas, numpy and other data-science-oriented utils.
I know a lot of stuff; is that not enough to get a junior-level, underpaid job? I truly don't get it. What is required to get a job? Is this not even enough to get an interview?
I have some dev friends and they all seem to be able to land jobs, so why am I not landing even an interview?
I will keep pushing my dev career; it's that or starve to death. But I would love to read your suggestions! How can I approach this?
I'll leave my relevant social presence here:
https://linkedin.com/in/...
https://github.com/ElPapi42
Thanks in advance!
I want to build a code transpiler/transformer as a side project. I looked around but I can't find where to start.
Does anyone have any useful info on this?
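One common starting point, sketched below under the assumption you'd prototype in Python: parse source into an AST, rewrite the tree, and emit source again with the standard-library ast module. The specific rewrites here (renaming a hypothetical identifier and folding constant integer additions) are just illustrative choices.

```python
import ast

class Rewriter(ast.NodeTransformer):
    def visit_Name(self, node: ast.Name) -> ast.AST:
        # Rename a hypothetical identifier wherever it appears.
        if node.id == "old_name":
            return ast.copy_location(ast.Name(id="new_name", ctx=node.ctx), node)
        return node

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)                  # transform children first
        # Fold integer additions with two constant operands into one constant.
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant) and isinstance(node.left.value, int)
                and isinstance(node.right, ast.Constant) and isinstance(node.right.value, int)):
            return ast.copy_location(ast.Constant(node.left.value + node.right.value), node)
        return node

source = "result = old_name(1 + 2, x)"
tree = ast.fix_missing_locations(Rewriter().visit(ast.parse(source)))
print(ast.unparse(tree))                          # -> result = new_name(3, x)
```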