Search - "problem in context switching"
-
PM : "I will not tolerate this."
Me : "I don't like it either inside callbacks."
Fellow dev sitting next to me : *facepalm*
-
My best skill is:
*** problem solving ***
Really, at least in all the teams I've worked in until now, I'm always surprised by myself: how fast I am at spotting the root of a problem and finding or suggesting a solution, even on things I have almost no knowledge of.
My worst skill is:
*** problem solving ***
Being so effective makes me everybody's slave.
Everybody always relies on me for any kind of weird shit. If I try to "outsource" the problem, after one day it bounces back to me and I solve it in no time.
So I've no time for anything other than solving other people's problems.
Constant interruptions and context switching.
And worse, my bosses don't understand why I don't finish my tasks. And I cannot blame my team.
-
Here's some research into a new LLM architecture I recently built and have had actual success with.
The idea is simple. You do the standard thing of generating random vectors for your dictionary of tokens; we'll call these numbers your 'weights'. Then, for whatever sentence you want to use as input, you generate a context embedding by looking up those tokens and putting their vectors into a list.
Next, you do the same for the output you want to map to; let's call it the decoder embedding.
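A minimal sketch of those first two steps, to make the setup concrete. The vocabulary, the vector size, the whitespace tokenizer and the names `build_weights` and `embed` are all assumptions for illustration; none of them come from the pastebin code.

```python
import random

def build_weights(vocab, dim=4, seed=0):
    """Give every token a random vector, components uniform in [0, 1)."""
    rng = random.Random(seed)
    return {tok: [rng.random() for _ in range(dim)] for tok in vocab}

def embed(sentence, weights):
    """Look up each whitespace-separated token and collect its vector in a list."""
    return [weights[tok] for tok in sentence.split()]

# Hypothetical toy vocabulary covering both the input and the target sentence.
vocab = ["the", "cat", "sat", "le", "chat", "assis"]
weights = build_weights(vocab)

context_embedding = embed("the cat sat", weights)    # input side
decoder_embedding = embed("le chat assis", weights)  # output side to map to
```

Both embeddings are just lists of per-token vectors, so the later steps can walk them index by index.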
You then loop and generate a 'noise embedding': for each vector (individual token) in the context embedding, you subtract that token's noise value from that token's embedding value or specific weight.
You find the weight index in the weight dictionary (one entry per word or token in your token dictionary) that's closest to this embedding. You use a version of cuckoo hashing where similar values are stored near each other, and the canonical weight values are actually the key of each key:value pair in your token dictionary. When doing this you align all the random-numbered keys in the dictionary (a uniform sample from 0 to 1) and look at the Hamming distance between the context embedding + noise embedding (called the encoder embedding) and the canonical keys, with each digit from left to right penalized by some factor f (because digits further left carry larger magnitudes). You then penalize or reward based on the numeric closeness of each individual digit of the encoder embedding to the digit at the same index of any given weight i.
You then substitute the canonical weight in place of this encoder embedding, look up that weight's index (in my earliest version), and then use that index to look up the word|token in the token dictionary and compare it to the word at the current index of the training output to match against.
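A rough sketch of that lookup, under loud assumptions: each token has a single scalar key in [0, 1), the hashing is replaced by a plain linear scan, and the digit comparison is one possible reading of the description (absolute digit difference, divided by f once per position to the right). The names `digits`, `digit_distance` and `nearest_token` are made up for this example.

```python
def digits(x, n=6):
    """First n decimal digits of a number in [0, 1), by truncation."""
    scaled = min(int(x * 10**n), 10**n - 1)
    return [int(c) for c in str(scaled).zfill(n)]

def digit_distance(a, b, f=10.0, n=6):
    """Digit-by-digit difference; leftmost digits weigh the most (assumed form)."""
    pairs = zip(digits(a, n), digits(b, n))
    return sum(abs(da - db) / f**i for i, (da, db) in enumerate(pairs))

def nearest_token(value, keys):
    """keys maps canonical weight -> token. Linear scan, not the hashed version."""
    best = min(keys, key=lambda k: digit_distance(value, k))
    return best, keys[best]

# Hypothetical canonical keys and one training step for a single token.
keys = {0.125: "cat", 0.5: "sat", 0.875: "the"}
embedding, noise = 0.625, 0.0625
encoder = embedding - noise               # subtract the token's noise value
canonical, token = nearest_token(encoder, keys)
matches = token == "sat"                  # compare with the expected output word
```

A linear scan costs one distance computation per dictionary entry, which is the cost the hashed layout is meant to avoid.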
Of course by switching to the hash version the lookup is significantly faster, but I digress.
That introduces a problem.
If each input token matches one output token how do we get variable length outputs, how do we do n-to-m mappings of input and output?
One of the things I explored was using pseudo-Markovian processes, where there's one node, A, with two links to itself, B and C.
B is a transition matrix, and A holds its own state. At any given timestep, A may use either the default transition matrix (training data encoder embeddings) with B, or it may generate new ones, using C and a context window of A's prior states.
C can be used to modify A, or it can be used as a noise embedding to modify B.
A can take on the state of both A and C or A and B. In fact we do both, and measure which is closest to the correct output during training.
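One way to picture the "do both and keep the closer one" step, as a toy sketch only. Everything concrete here is an assumption: A's state is a short vector, B is applied as a matrix-vector product, C is the mean of the last few prior states added onto A, and closeness is squared Euclidean distance. The real code may do something quite different.

```python
def mat_vec(m, v):
    """Plain matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def step(a, b_matrix, history, target, window=2):
    """Try both self-links of A and keep whichever candidate is closer to target."""
    via_b = mat_vec(b_matrix, a)                           # link B: transition matrix
    recent = history[-window:]                             # context window of prior states
    c = [sum(col) / len(recent) for col in zip(*recent)]   # assumed form of C
    via_c = [x + y for x, y in zip(a, c)]                  # link C: modify A directly
    if sq_dist(via_b, target) <= sq_dist(via_c, target):
        return "B", via_b
    return "C", via_c

a = [1.0, 0.0]
b_matrix = [[0.0, 1.0], [1.0, 0.0]]      # toy transition: swaps the two components
history = [[0.0, 0.0], [0.0, 2.0]]       # must be non-empty in this sketch
link, new_a = step(a, b_matrix, history, target=[0.0, 1.0])
```

Note that the state still advances one step per input token, which is why this alone gives no variable-length output.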
What this *doesn't* do is give us variable length encodings or decodings.
So I thought a while and said, if we're using noise embeddings, why can't we use multiple?
And if we're doing multiple, what if we used a middle layer, let's call it the 'key', took its mean over *many* training examples, and used it to map from the variance of an input (query) to the variance and mean of a training or inference output (value)?
But how does that tell us when to stop or continue generating tokens for the output?
Posted on pastebin if you want to read the whole thing (DR wouldn't post for some reason).
In any case I wasn't sure if I was dreaming or if I was off in left field, so I went and built the damn thing (the autoencoder part). I wasn't even sure I could, but I did, and it just works. I'm still scratching my head.
https://pastebin.com/xAHRhmfH
-
Interviewing with three companies. First one extended an offer. I'm expecting an offer from at least one, possibly both, of the others (On-site with Second was yesterday and expecting an offer tomorrow or Mon, phone tech interview (they also had a tech screen) with Three was today and I /rocked/ it, expecting an onsite invite for next week).
The problem with being a badass is that the choice paralysis is SO OVERWHELMING. All three have features that I like, so how do I choose?
I think I'm being overly influenced by the weekly massage, onsite barista, free nice breakfast/lunch, and ideal location of Second (the domain is finance, they have $$$). Oh and fucking 25 vacation days and amazing 401k matching. I mean how would I say no to an offer? But what if the work is actually beyond me? But they have seriously cranked their benefits package up to 11.
First is an in house product with external clients. The domain I don't find super interesting, but it has amazing Glassdoor reviews, seems like a decent environment, and really seems like a place to progress and grow as a professional. It is also the lowest salary of the three (both others are through Hired, so I know what they are offering).
Third is a consultancy where I'd really get to keep my skills relevant. Seems mad fast paced, which is a bit intimidating, and I don't know how well I'd handle the context switching of being on multiple projects at a time.
I mean, all of this is counting my chickens before they hatch. But I have a really good feeling about my chances with Second, though I suppose I still have a chance to botch my onsite with Third.
Ahhhh. Dev Rant, how did you go about choosing between offers that can't be evaluated on a single axis?
-
To add a bit more context to my last rant.
The following situation happened today and similar situations are at the moment common as fuck.
The situation started roughly 1 1/2 months ago when a deployment failed.
Seemed to be a DNS problem for the devs, so my basic assumption was that they checked their shit.
As I was, and still am, more than swamped, I told them it had to wait if it was a DNS issue...
Well.
Backstabbing product manager complained to upper management as it took so long.
Backstabbing manager even went so far as to propose alternative solutions - think of switching products to work around the issue and throwing away a year of development by a 5 man team...
So in addition to my work I had to deescalate and prevent complete nonsense.
Today I finally found time for the problem.
After 2-3 hours of turning every stone inside the DNS setup, cloudflare, loadbalancers, etc...
Well. Devs. Don't trust them.
Turned out the devs misconfigured the environment entirely.
It's not so obvious in this product, as it is rather complicated, though the devs' documentation explicitly mentions that if one overrides the configuration for e.g. several languages, one has to make sure to set two env variables for TLS mode...
There was only one set.
:(
8 fucking weeks of backstabbing and blaming others while they could have just read their own fucking documentation and fixed that shit in 5 minutes.
-
Hi fellow ranters, I humbly request your opinion on a matter.
I am a CS student in his last year of college, and currently developing a Node.js app as his final year project with a partner. The project has potential, and we've been at it for about three weeks, but the problem is that the more I code, the less I see myself doing Node in the future.
I was a total noob in CSS before starting the project, and I have learnt a ton in just 3 short weeks, but that has taken a toll on me, because I fell pretty far behind our schedule. However, for as much time and effort as I have put in, my partner has put in a lot more (and he knows way more than me), thus increasing the gap.
My partner and I have (for the moment) different views on the amount of effort that we want to put into the project, since I see it as "slightly more than just another subject" (9 hours a week), and he sees it as a real passion project (endless hours). This could be due to the burnout of the first weeks, but I'm really not that excited about the project anymore, and I find myself thinking that I am wasting both of our time (I don't want to be dead weight), and that if I worked on a project that I was really passionate about, such as a compiler, a runtime environment, or a new programming language, I wouldn't mind putting in the hours that he does. Just to give more context, this whole project was his idea, and although I find it a great idea, and I know he is capable of building an amazing product, I am not sure whether I would be useful, or even if I want to be useful. Again, this could all be because of burnout.
Has anyone had such an experience?
TL;DR: I am working on a final project with a partner (it was his idea, and I found it interesting), but I think I would be happier switching to a project of my own.