devRant - A fun community for developers to connect over code, tech & life as a programmer

Search - "one-based index"

46

Hel8y

335

9y

A few weeks ago a client called me. His application contains a lot of data, including email addresses (local part and domain stored separately in an SQL database). The application can filter data based on the domain part of the addresses. He asked me why sub.example.com is not included when he asks the application for example.com. I said: No problem, I can add this feature to the application, but the process will take longer.

Client: No problem, please add this ASAP.

So, the next day I changed some of the SQL queries to do the lookup with the LIKE operator.

After a week the client called again: The process is really slow, how can this be?

Me: Well, you asked me to filter the subdomains as well. Before, the application could easily find all the domains (SQL index), but now it has to compare every stored domain to check whether it ends with the domain you are looking for.

Client: Okay, but why is it a lot slower than before?

Me: Do you have a dictionary in your office?

<Client searches for a dictionary, comes back with one>

Me: Give me the definition of the word "time"

<Client gives definition of time>

Me: Give me the definition of all words ending with "time"

Client: But, ...

Never heard from him again on this issue :-P
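
For anyone hitting the same wall: one common workaround (not what was done in the story above) is to store the domain reversed, so that "ends with example.com" turns into "starts with moc.elpmaxe", which is a range an ordinary index can usually serve. A minimal sketch using Python's sqlite3, where the table name emails, the column domain_rev and the helper names are all made up for illustration:

```python
import sqlite3

def reverse_domain(domain: str) -> str:
    """Lowercase and reverse a domain, e.g. 'example.com' -> 'moc.elpmaxe'."""
    return domain.lower()[::-1]

# Hypothetical schema: the real application's tables are not known.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (local_part TEXT, domain TEXT, domain_rev TEXT)")
conn.execute("CREATE INDEX idx_domain_rev ON emails (domain_rev)")

rows = [
    ("alice", "example.com"),
    ("bob", "sub.example.com"),
    ("carol", "notexample.com"),
    ("dave", "example.org"),
]
conn.executemany(
    "INSERT INTO emails VALUES (?, ?, ?)",
    [(local, dom, reverse_domain(dom)) for local, dom in rows],
)

def find_by_domain(conn, domain):
    """Return rows for the domain itself and for any of its subdomains."""
    rev = reverse_domain(domain)
    lo = rev + "."  # reversed subdomains start with 'moc.elpmaxe.'
    hi = rev + "/"  # '/' is the ASCII character right after '.'
    cur = conn.execute(
        "SELECT local_part, domain FROM emails "
        "WHERE domain_rev = ? OR (domain_rev >= ? AND domain_rev < ?) "
        "ORDER BY domain, local_part",
        (rev, lo, hi),
    )
    return cur.fetchall()

matches = find_by_domain(conn, "example.com")
```

Here matches contains alice and bob but not carol, because notexample.com is not a subdomain of example.com. A plain LIKE '%example.com' would have matched it as well.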

undefined

5
45

JBSnorro

1215

9y

I wanted to print the second and third page of some document, so in the relevant field of the printer dialog I enter "1, 2" and I walk off to the printer.

My first thought when I saw the printer had printed the wrong pages was
"F*ing buggy software"
Second thought:
"Oh... right"
Third thought:
"Right, in the real world, one-based indices are the rule rather than the exception."
Fourth thought:
"Dumb real world"

undefined one-based index

3
8

gymmerDeveloper

2306

7y

So this web company I joined had a page load time in minutes. The free text search (inverted index search, based on Elasticsearch) queries would return results in 10-45 seconds (should be milliseconds always). The indexes had no schema. And they would crawl data and feed it into an MSSQL DB, which had a 2 GB/DB limit on the free version. So every time the DB hit the limit, a new DB was created and the name was incremented by one.

Had a very tough time cleaning up that mess. Plus the architect who had made this architecture was on his way out and unhelpful to the core.

What was worse was that most of the changes I made were very simple changes that should have been done long ago. Basic sanity changes.

rant bad architecture wk133 bad design

4
6

Wisecrack

9354

1y

Here's some research into a new LLM architecture I recently built and have had actual success with.

The idea is simple: you do the standard thing of generating random vectors for your dictionary of tokens; we'll call these numbers your 'weights'. Then, for whatever sentence you want to use as input, you generate a context embedding by looking up those tokens and putting them into a list.

Next, you do the same for the output you want to map to; let's call it the decoder embedding.
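
A rough sketch of those two lookups, standard library only. The vocabulary, the vector size of 4 and the names build_weights and embed are placeholders made up for illustration, not taken from the pastebin code:

```python
import random

def build_weights(vocab, dim=4, seed=0):
    """Assign each token a random vector with components in [0, 1)."""
    rng = random.Random(seed)
    return {tok: [rng.random() for _ in range(dim)] for tok in vocab}

def embed(sentence, weights):
    """Look up each whitespace-separated token and collect the vectors in a list."""
    return [weights[tok] for tok in sentence.split()]

# Toy vocabulary, assumed for the example only.
vocab = ["the", "cat", "sat", "le", "chat", "assis"]
weights = build_weights(vocab)

context_embedding = embed("the cat sat", weights)    # input side
decoder_embedding = embed("le chat assis", weights)  # desired output side
```

Both embeddings are just lists of the per-token vectors, in sentence order.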

You then loop and generate a 'noise embedding': for each vector or individual token in the context embedding, you subtract that token's noise value from that token's embedding value or specific weight.
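
One way to read that step, as a sketch. The post doesn't say how the noise is distributed, so the uniform noise and the scale of 0.05 are assumptions, as is the function name apply_noise:

```python
import random

def apply_noise(context_embedding, scale=0.05, seed=1):
    """Draw one noise value per component and subtract it from the embedding.

    Assumption: noise is uniform in [-scale, scale]; the original post does
    not specify the distribution.
    """
    rng = random.Random(seed)
    noise = [[rng.uniform(-scale, scale) for _ in vec] for vec in context_embedding]
    encoder = [
        [v - n for v, n in zip(vec, nvec)]
        for vec, nvec in zip(context_embedding, noise)
    ]
    return noise, encoder

# Small hand-written context embedding (two tokens, two components each).
context_embedding = [[0.2, 0.8], [0.5, 0.1]]
noise_embedding, encoder_embedding = apply_noise(context_embedding)
```

The result of the subtraction is what the next paragraph refers to as the encoder embedding.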

You find the weight index in the weight dictionary (one entry per word or token in your token dictionary) that's closest to this embedding. You use a version of cuckoo hashing where similar values are stored near each other, and the canonical weight values are actually the key of each key:value pair in your token dictionary. When doing this you align all random numbered keys in the dictionary (a uniform sample from 0 to 1), and look at the Hamming distance between the context embedding+noise embedding (called the encoder embedding) and the canonical keys, with each digit from left to right being penalized by some factor f (because digits further left carry larger magnitudes), and then penalize or reward based on the numeric closeness of any given individual digit of the encoder embedding at the same index of any given weight i.
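
A sketch of one possible reading of that distance, with a plain linear scan swapped in for the cuckoo-hash lookup. Everything here is an interpretation: scalar weights per token, 4 decimal digits, f = 0.5, position weight f**i, and the names to_digits, digit_distance and nearest_token are all assumptions:

```python
def to_digits(x, n=6):
    """First n decimal digits of a value, clamped to [0, 1), as a list of ints."""
    scaled = max(0, min(round(x * 10**n), 10**n - 1))
    return [int(ch) for ch in str(scaled).zfill(n)]

def digit_distance(a, b, f=0.5, n=6):
    """Digit-wise distance: matching digits cost 0, mismatches cost their
    numeric gap, and position i is weighted by f**i so that leftmost
    (largest magnitude) digits count the most when f < 1."""
    da, db = to_digits(a, n), to_digits(b, n)
    return sum((f ** i) * abs(p - q) for i, (p, q) in enumerate(zip(da, db)))

def nearest_token(value, weights, f=0.5, n=6):
    """Linear scan for the token whose canonical weight is closest.
    (The post uses a hash-based lookup instead; this is the slow version.)"""
    return min(weights, key=lambda tok: digit_distance(value, weights[tok], f, n))

# Toy scalar weights, one per token (assumed values).
weights = {"cat": 0.1234, "dog": 0.9021, "sat": 0.1299}

predicted = nearest_token(0.1240, weights, n=4)  # a noisy encoder value
target = "cat"                                   # word expected by the training output
matches_target = predicted == target
```

With these toy numbers the noisy value 0.1240 lands on "cat", since its first two digits match 0.1234 exactly and the later digits are close.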

You then substitute the canonical weight in place of this encoder embedding, look up that weight's index (in my earliest version), and then use that index to look up the word|token in the token dictionary and compare it to the word at the current index of the training output to match against.

Of course by switching to the hash version the lookup is significantly faster, but I digress.

That introduces a problem.
If each input token matches one output token, how do we get variable length outputs? How do we do n-to-m mappings of input and output?

One of the things I explored was using pseudo-markovian processes, where there's one node, A, with two links to itself, B and C.
B is a transition matrix, and A holds its own state. At any given timestep, A may use either the default transition matrix (training data encoder embeddings) with B, or it may generate new ones, using C and a context window of A's prior states.

C can be used to modify A, or it can be used as a noise embedding to modify B.

A can take on the state of both A and C or A and B. In fact we do both, and measure which is closest to the correct output during training.

What this *doesn't* do is give us variable length encodings or decodings.

So I thought a while and said, if we're using noise embeddings, why can't we use multiple?

And if we're doing multiple, what if we used a middle layer, let's call it the 'key', and took its mean over *many* training examples, and used it to map from the variance of an input (query) to the variance and mean of a training or inference output (value).

But how does that tell us when to stop or continue generating tokens for the output?

Posted on pastebin if you want to read the whole thing (DR wouldn't post for some reason).

In any case I wasn't sure if I was dreaming or if I was off in left field, so I went and built the damn thing, the autoencoder part, wasn't even sure I could, but I did, and it just works. I'm still scratching my head.

https://pastebin.com/xAHRhmfH

random llm machine learning

33
4

galena

7445

3y

There is an element in mappView that lets you display a specific image from a list based on an index. It has the parameters selectedImageIndex and selectedIndex. One is used to set, the other to read. It's not like you need RW separation. It's done with variable binding. So why the FUCK is it there????

rant

2


devRant © 2021 Hexical Labs LLC