Search - "masking"
-
LONG RANT AHEAD!
In my workplace (dev company) I am the only dev using Linux on my workstation. I joined project XX, a senior dev onboarded me. Downloaded the code, built the source, launched the app,.. BAM - an exception in catalina.out. ORM framework failed to map something.
mvn clean && mvn install
Same thing happens again. I raise this with the sr dev and the response is "well.... it works on my machine and has worked for all other devs. It must be your environment issue. Prolly linux is to blame?" So I spend another hour trying to dig up the bug. Narrowed it down to a single data model with an ORM mapping annotation that looked somewhat off. Fixed it.
mvn clean && mvn install
the app now works perfectly. Apparently this bug has been in the codebase for years and Windows used to mask it somehow w/o throwing an exception. God knows what undefined behaviour was happening in the background...
Months fly by and I'm invited to join another project. Sounds really cool! I get access, check out the code, build it (after crossing the hell of VPNs on Linux). Run component 1/4 -- all good. Run components 2 and 3/4 -- looks perfect. Run component 4/4 -- BAM: LinkageError. Turns out there is something wrong with the OSGi dependencies, as the ClassLoader attempts to load the same class twice, from 2 different sources. Coworkers with Windows and Macs have never seen this kind of exception and the lead dev replies with "I think you should use a normal environment for work rather than playing with your Linux". Wtf... It's Java. Every env is a "normal env" for the JVM!
I do some digging. One day passes by.. second one.. third.. the weekend.. The next Friday comes and I still haven't managed to launch component #4. Eventually I give up (since I cannot charge a client for a week I spent trying to set up my env) and walk away from that project. Ever since, this LinkageError was always on my mind; for some reason I could not let it go. It was driving me CRAZY!
So half a year passes by and one of the project devs gets a new MB Pro. 2 days later I get a PM: "umm.. were you the one who used to get LinkageError while starting component #4 up?". You guys have NO IDEA how happy his message made me. I mean... I was frickin HIGH: all smiling, singing, even dancing behind my desk!! Apparently the guy had the same problem I did. Except he was quite familiar with the project. It took him 3 more days to figure out what was wrong and fix it. And it indeed was an error in the project -- not my "abnormal Linux env"! And again, for hell knows what reason, Windows was masking a mistake in the codebase and not raising an error where it should have. Linux, on the other hand, found the error and crashed the app immediately so the product would not be shipped with God knows what bugs...
I do not mean to start a flame war or smth, but it's obvious I've kind of saved 2 projects from "undefined magical behaviour" by just using Linux. I guess what I really wanted to say is that no matter how good a dev you are, whether you are a sr, lead or chief dev, if your coworker (be it another sr or a jr dev) says he gets an error and YOU cannot figure out what the heck is wrong, you should not blame the dev or an environment w/o knowing it for a fact. If something is not working - figure out the WHATs and WHYs first. Analyze, compare data to other envs,... Not only will you help a new guy join your team, but you'll also learn something new. And in some cases something crucial, e.g. a serious messup in the codebase.
-
Boy do I hate office politics...
A client asked our company to fix perf issues on their product. Our colleagues had been picked for the job [being led by another 3rd-party, as per client's request]. Aaand they dropped the ball. The deadline is in 2 weeks, nothing is working.
Mgmt engaged us to put out the fire, but strictly at the scope the other guys were working in.
On the first day of testing we uncovered an elephant-sized perf issue that's as easy to fix as brainlessly changing 4 values in config. And that elephant is masking all the other perf issues.
We got a firm NO for config changes as that is out of the defined scope. And we're asked to continue testing.
I mean, the elephant is THAT huge that any further testing is moot - all other bottlenecks are hidden behind it. And just changing those 4 values would reduce the resources required by a factor of ~10.
But that's out of scope...
Client is desperate, lost and honestly asking us, pros in the field, for help.. We know how to help.. It takes 10 seconds to apply the fix..
But our mgmt forbids us to step out of the scope :/
As a result we have to pretend to be dummies who hardly know what to do, and hide from the customer the truth they so desperately want.
This is frustrating. And wrong. And imo unprofessional.
-
The solution for this one isn't nearly as amusing as the journey.
I was working for one of the largest retailers in NA as an architect. Said retailer had over a thousand big box stores, IT maintenance budget of $200M/year. The kind of place that just reeks of waste and mismanagement at every level.
They had installed a system to distribute training and instructional videos to every store, as well as recorded daily broadcasts to all store employees as a way of reducing management time spent with employees in the morning. This system had cost a cool 400M USD, not including labor and upgrades for round 1. Round 2 was another 100M to add a storage buffer to each store because they'd failed to account for the fact that the internet connections at the stores and the outbound pipe from the DC weren't capable of running the public facing e-commerce and streaming all the video data to every store in realtime. Typical massive enterprise clusterfuck.
Then security gets involved. Each device at stores had a different address on a private megawan. The stores didn't generally phone home, home phoned them as an access control measure; stores calling the DC was verboten. This presented an obvious problem for the video system because it needed to pull updates.
The brilliant Infosys resources had a bright idea to solve this problem:
- Treat each device IP as an access key for that device (avg 15 per store).
- Verify the request ip, then issue a redirect with ANOTHER ip unique to that device that the firewall would ingress only to the video subnet
- Do it all with the F5
A few months later, the networking team comes back and announces that after months of work and 10s of people years they can't implement the solution because iRules have a size limit and they would need more than 60,000 lines or 15,000 rules to implement it. Sad trombones all around.
Then, a wild DBA appears, steps up to the plate and says he can solve the problem with the power of ORACLE! A few months later he comes back with some absolutely batshit solution that stored the individual octets of an IPv4 address and ran multiple nested queries against the same table to emulate subnet masking through some temp-table-spanning voodoo. Time to complete: 2-4 minutes per request. He too eventually gives up the fight, sort of, in that backhanded way DBAs tend to do everything. I wish I had paid more attention to that abortion because the rationale and its mechanics were just staggeringly Rube Goldberg and should have been documented for posterity.
So I catch wind of this sitting in a CAB meeting. I hear them talking about how there's "no way to solve this problem, it's too complex, we're going to need a lot more databases to handle this." I tune in and gather all it really needs to do, since the ingress firewall is handling the origin IP checks, is convert the request IP to video ingress IP, 302 and call it a day.
While they're all grandstanding and pontificating, I fire up visual studio and:
- write a method that encodes the incoming request IP into a single uint32
- write an http module that keeps an in-memory dictionary of uint32,string (request IP to response IP), converts the request ip and 302s the call, with blackhole support
- convert all the mappings in the spreadsheet attached to the meetings into a csv, dump to disk
- write a wpf application to allow for easily managing the IP database in the short term
- deploy the solution to one of our stage boxes
- add a TODO to eventually move this to a database
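The steps in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the original code (which was written in Visual Studio and is not shown in the post). The names `ip_to_uint32`, `load_mappings` and `redirect_for`, the two-column CSV layout, and the choice to answer unknown IPs with a 404 as the "blackhole" are all assumptions made for the example.

```python
import csv
import io
from typing import Dict, Optional, Tuple


def ip_to_uint32(ip: str) -> int:
    """Pack a dotted-quad IPv4 string into one unsigned 32-bit integer."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    a, b, c, d = octets
    # Most significant octet first: a.b.c.d -> 0xAABBCCDD
    return (a << 24) | (b << 16) | (c << 8) | d


def load_mappings(csv_text: str) -> Dict[int, str]:
    """Build the in-memory table from CSV rows of request_ip,target_ip.

    The column layout is hypothetical; the post only says the spreadsheet
    was converted to a CSV.
    """
    table: Dict[int, str] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        request_ip, target_ip = (cell.strip() for cell in row)
        table[ip_to_uint32(request_ip)] = target_ip
    return table


def redirect_for(request_ip: str, table: Dict[int, str]) -> Tuple[int, Optional[str]]:
    """Return (status, location): 302 to the mapped IP, else a dead end."""
    target = table.get(ip_to_uint32(request_ip))
    if target is None:
        # Assumed meaning of "blackhole": unknown devices get nothing useful.
        return 404, None
    return 302, f"http://{target}/"
```

The point of the encoding is that each lookup is a single dictionary hit on an integer key, which is why no rule engine or database is needed for a table of a few tens of thousands of entries.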
All this took about 5 minutes. I interrupt their conversation to ask them to retarget their test to the port I exposed on the stage box. Then watch them stare in stunned silence as the crow grows cold.
According to a friend who still works there, that code is still running in production on a single node to this day. And still running on the same static file database.
#TheValueOfEngineers
-
Summary: Burnout, and everything's broken.
I don't feel like doing a damn thing today. I look at the code and cringe. I look at Slack and think "ugh. i can't." Mental capitals are even too much work.
(I've started reading "Zen and the Art of Motorcycle Maintenance" to try and combat burnout. I'll write a rant/story about it here if I find it helpful. but all I want to do today is drink tea and read.)
But onto the story:
Heroku is deprecating support for and will automatically upgrade any old versions of Postgres running on its platform after August something (like five days from now).
I performed the upgrade to PG10 on Sunday (and late into the night), provisioning a new follower, blah blah blah.
However, the version of Rails we're using (4.2.x) doesn't support PG10 sequences, so I manually added in support via a monkeypatch. I did this on our QA servers first, obviously, and everything worked as expected. After half a day of no issues, I did the same on production, and again: everything worked as expected.
But today? I keep hearing about new things that are broken. One specific type of alert doesn't work for one specific person (wat). Can't send [redacted] at all. Can't update merchants! Yet there are magically no errors logged.
That last one (well, two) are just great; let me explain: when there's an error concerning merchants, the error gets caught, isn't logged or recorded anywhere so it just disappears, and the rescue block triggers a json response instead and happily exits. This is for an internal admin tool, so returning a user-friendly error is kinda stupid anyway, but masking what actually happened? fuck that dev with an obelisk made from spikes and solidified pain. That json response is also lovely: it's a 200 OK returning {status: 1, data: "[generic message containing incorrect IT jargon]"}. Doesn't even say "error" anywhere. Bloody everything about this pattern is absolutely wrong. Even the friggin' text.
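To make the pattern concrete, here is a small sketch of the two behaviours. The codebase in the story is Rails; this is Python for illustration only, and the function names, the logger name, the success payload and the 500 status are hypothetical choices, not what the real app does.

```python
import logging

logger = logging.getLogger("admin_tool")  # hypothetical logger name


def update_merchant_swallowing(update):
    """The pattern being ranted about: catch, log nothing, return 200."""
    try:
        update()
    except Exception:
        # The exception vanishes here; nothing is logged or recorded.
        return 200, {"status": 1, "data": "Something went wrong"}
    return 200, {"status": 0, "data": "ok"}


def update_merchant_logged(update):
    """One possible alternative: log the traceback and report a real error."""
    try:
        update()
    except Exception as exc:
        # logger.exception records the message together with the traceback.
        logger.exception("merchant update failed")
        return 500, {"status": "error", "error": type(exc).__name__, "message": str(exc)}
    return 200, {"status": "ok"}
```

With the second version the failure shows up both in the logs and in the HTTP status, so a broken merchant update cannot pass for a successful one.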
Fucking hell. I want to pipe the entire codebase into shred and walk out the door.
But I digress. So many things are broken, my motivation is waning to a sliver, and I have a conference call today where I'll undoubtedly be asked why everything is smoking and/or on fire, and my huge and overly productive week last week will ofc mean nothing by contrast.
Ugh.
`shred ~/dev/work -zfu -n 32 &; ./brew tea --hot && wine ~/takeabreak.exe`
-
Dynamically typed languages are barbaric to me.
It's pretty much universally understood that programmers program with types in mind (if you have a method that takes a name, it's a string. You don't want a name that's an integer).
Even if you don't like the verbosity of type annotations, that's fine. It adds maybe seconds of time to type, which is negligible in my opinion, but it's a discussion to be had.
If that's the case, use Crystal. It's statically typed, and no type annotations are required (it looks nearly identical to Ruby).
So many errors are fixed by static typing and compilers. I know a person who migrated most of the Python std library to Haskell and found typing errors in it. *In their standard library*. If the developers of Python can't be trusted to avoid simple typing errors with all their unit tests, how can anyone?
Plus, even if unit testing universally guarded against typing errors, why would you prefer that? It takes far less time to add a type annotation (and even less time to write nothing in Crystal), and you get the benefit of knowing types at compile time.
I've had some super weird type experiences in Ruby. You can mock out the return of the type check to be what you want. I've been unit testing in Ruby before, tried mocking a method on a type, didn't work as I expected. Checked the type, it lines up.
Turns out, nested away in some obscure place was a factory that was generating types and masking them as different types because we figured "since it responds to all the same methods, it's practically the same type right?", but not in the unit test. Took 45 minutes of my time when it could've taken ~0 seconds in a statically typed language.
-
Wordpress is absolute garbage trash. The devs who made the core appear to have been drunk 24/7 when they wrote it, and don't get me started on these fucking shit plugins asking you to GO PRO, GET THE PREMO VERSION, MOAR FEATURES!!!! Fuck this bullshit wordpress, masking itself as a "one size fits all" "Just add a plugin BrO" piece of shit. I hope this cancer stops. Plugin devs think this is some place for their own personal billboard to advertise their dumb fucking products. Take a look at any plugin and look at the "Pro features": makes me want to die. Pieces of trash, fuck all of you.
-
Inappropriate experience at work: One of our project managers got arrested one day for fraud. Apparently an employee had been in the middle of an online purchase and walked away from their desk. He happened to see the unmasked entry of the CC info (this was before websites cared about masking sensitive form inputs). I guess the temptation was too great…and he was too stupid to realize he’d get caught…and he jotted it all down. He made thousands of dollars in purchases which, naturally, eventually led back to him.
The same guy, before he got arrested, had made a joke when someone in an office team email said “Feel free to have some cake in the break room.” He replied “No need to do anything to me for the cake.” His first name was “Free”.
-
I can't stop myself from thinking like a computer when I'm sick.
The OS that runs my body is kinda fucked up right now. It was very vulnerable and now it got infected by viral executables sent out by an agent which happens to be on same work network that I'm connected to. Well, it executed and populated feelings of infatuation and crush in my heart drive. ( pun intended )
As a precaution, I patched the vulnerabilities by masking response of my Emotions API.
To further secure my system, I'll be executing memory intensive tasks that will also push my hardware to its limits. According to my estimates, this will stall further execution of this infection and eventually kill it while rewarding me with upgraded hardware.
-
Decided to install new CentOS to prepare for Red Hat exams.
a) had to disable VirtualBox Audio and USB otherwise it got stuck during boot (lvm2 masking did not help)
b) First command - "dnf update". Crashed in middle of the process and completely screwed dnf/yum (TWICE!). Went through just fine when executed from runlevel 3.
So far it has lived up to the name Enterprise Linux, because this is the exact out-of-the-box clusterfuck I would expect from a corporate.
-
In 2015 I sent an email to Google labs describing how pareidolia could be implemented algorithmically.
The basis is that a noise function put through a discriminator, could be used to train a generative function.
And now we have transformers.
I also told them if they looked back at the research they would very likely discover that dendrites were analog hubs, not just individual switches. That's turned out to be true too.
I wrote to them in an email as far back as 2009 that attention was an under-researched topic. In 2017 someone finally got around to writing "attention is all you need."
I wrote that there were very likely basic correlates in the human brain for things like numbers, and simple concepts like color, shape, and basic relationships, that the brain used to bootstrap learning. We found out years later based on research, that this is the case.
I wrote almost a decade ago that personality systems were a means that genes could use to value-seek for efficient behaviors in unknowable environments, a form of adaption. We later found out that is probably true as well.
I came up with the "winning lottery ticket" hypothesis back in 2011, for why certain subgraphs of networks seemed to naturally learn faster than others. I didn't call it that though, it was just a question that arose because of all the "architecture thrashing" I saw in the research, why there were apparent large or marginal gains in slightly different architectures, when we had an explosion of different approaches. It seemed to me the most important difference between countless architectures, was initialization.
This thinking flowed naturally from some ideas about network sparsity (namely that it made no sense that networks should be fully connected, and we could probably train networks by intentionally dropping connections).
All the way back in 2007 I thought this was comparable to masking inputs in training, or a bottleneck architecture, though I didn't think to put an encoder and decoder back to back.
Nevertheless it goes to show, if you follow research real closely, how much low hanging fruit is actually out there to be discovered and worked on.
And to this day, google never fucking once got back to me.
I wonder if anyone ever actually read those emails...
Wait till they figure out "attention is all you need" isn't actually all you need.
p.s. something I read recently got me thinking. Decoders can also be viewed as resolving a manifold closer to an ideal form for some joint distribution. Think of it like your data as points on a balloon (the output of the bottleneck), and decoding as the process of expanding the balloon. In absolute terms, as the balloon expands, your points grow apart, but as long as the datapoints are not uniformly distributed, then *some* points will grow closer together *relatively* even as the surface expands and pushes points apart in the absolute.
In other words, for some symmetry, the encoder and bottleneck introduce an isotropy, and this step also happens to tease out anisotropy, information that was missed or produced by the encoder, which is distortions introduced by the architecture/approach, features of the data that got passed on through the bottleneck, or essentially hidden features.
-
Trying to implement a dynamic data masking solution for our databases, to filter out sensitive data.
This seems like a problem which should've been solved decades ago. But it isn't. All DDMs, proxies, seeders, maskers... they all suck balls.
Which makes me wonder, how many devs walk around with MacBooks with half a million credit card numbers on them...
-
I had the idea that part of the problem of NN and ML research is we all use the same standard loss and nonlinear functions. In theory most NN architectures are universal approximators. But there's a big gap between symbolic and numeric computation.
But some of our bigger leaps in improvement weren't just from new architectures, but entire new approaches to how data is transformed, and how we calculate loss, for example KL divergence.
And it occurred to me all we really need is training/test/validation data and with the right approach we can let the system discover the architecture (been done before), but also the nonlinear and loss functions itself, and see what pops out the other side as a result.
If a network can instrument its own code as it were, maybe it'd find new and useful nonlinear functions and losses. Networks wouldn't just specify a conv layer here, or a maxpool there, but derive implementations of these all on their own.
More importantly with a little pruning, we could even use successful examples for bootstrapping smaller more efficient algorithms, all within the graph itself, and use genetic algorithms to mix and match nodes at training time to discover what works or doesn't, or do training, testing, and validation in batches, to anneal a network in the correct direction.
By generating variations of successful nodes and graphs, and using substitution, we can use comparison to minimize error (for some measure of error over accuracy and precision), and select the best graph variations, without strictly having to do much point mutation within any given node, minimizing deleterious effects, sort of like how gene expression leads to unexpected but fitness-improving results for an entire organism, while point-mutations typically cause disease.
It might seem like this wouldn't work out the gate, just on the basis of intuition, but I think the benefit of working through node substitutions or entire subgraph substitution, is that we can check test/validation loss before training is even complete.
If we train a network to specify a known loss, we can even have that evaluate the networks themselves, and run variations on our network loss node to find better losses during training time, and at some point let nodes refer to these same loss calculation graphs, within themselves, switching between them dynamically..via variation and substitution.
I could even envision probabilistic lists of jump addresses, or mappings of value ranges to jump addresses, or having await() style opcodes on some nodes that upon being encountered, queue-up ticks from upstream nodes whose calculations the await()ed node relies on, to do things like emergent convolution.
I've written all the classes and started on the interpreter itself, just a few things that need fleshed out now.
Here's my shitty little partial sketch of the opcodes and ideas.
https://pastebin.com/5yDTaApS
I think I'll teach it to do convolution, color recognition, maybe try mnist, or teach it step by step how to do sequence masking and prediction, dunno yet.
-
My friend and I sit next to each other in class.
One day, he told me about his family. They have a code equivalent for most common English words.
When the COVID-19 pandemic hit our country, his father warned everyone in the house by saying
stay.at("127.0.0.1")
wear("255.255.255.0")
everyone started to
search(mask)
return tohome;
========================================
127.0.0.1 is a loopback address. aka localhost
IP masking is a way to hide your real IP.
255.255.255.0 is an example of subnet mask for IP
We used Ruby in this story.
-
I still have no idea how bit shifting and masking work. I don't have to use them in my day-to-day anymore, but I briefly worked as a game developer and still occasionally do side gigs and personal game projects. When I was working on games as my day job I had to do a fair amount of masking for a bunch of different reasons. But I've never gotten the hang of it. Every time I have to create a mask I have to Google it and then I'm like "oh yeah of course that's simple enough". But inevitably the next time I have to do it I end up back at square one.
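For reference, the handful of operations that usually get Googled fit in a few lines. This is a generic sketch, not tied to any engine mentioned in the post; the flag names and the RGB packing are made-up examples.

```python
# Each flag owns one bit; 1 << n is "a 1 shifted n places to the left".
FLAG_VISIBLE = 1 << 0  # 0b001
FLAG_SOLID = 1 << 1    # 0b010
FLAG_ENEMY = 1 << 2    # 0b100


def set_flag(value: int, flag: int) -> int:
    """OR turns the flag's bit on and leaves every other bit alone."""
    return value | flag


def clear_flag(value: int, flag: int) -> int:
    """AND with the inverted flag turns that one bit off."""
    return value & ~flag


def has_flag(value: int, flag: int) -> bool:
    """AND with the flag keeps only that bit; non-zero means it was set."""
    return (value & flag) != 0


def pack_rgb(r: int, g: int, b: int) -> int:
    """Shift each 8-bit channel into its own slot: 0xRRGGBB."""
    return (r << 16) | (g << 8) | b


def green_of(color: int) -> int:
    """Shift the wanted slot down to the bottom, then mask off the rest."""
    return (color >> 8) & 0xFF
```

The recurring recipe is: shift to line a value up with its slot, OR to write, AND with a mask to read or to clear.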
-
It’s majority rules as far as the nature of people that sticks in someone’s mind, sure guy. But was going to say what appears nice all over is usually masking the face of a soul sculpted lovingly from dogshit.
-
I am trying to extract data from a Pub/Sub subscription and then, once the data is extracted, do some transformation. Currently it's in bytes format. I have tried multiple ways to extract the data as JSON using a custom schema, but it fails with an error:
TypeError: __main__.MySchema() argument after ** must be a mapping, not str [while running 'Map to MySchema']
**readPubSub.py**
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
import json
import typing

class MySchema(typing.NamedTuple):
    user_id: str
    event_ts: str
    create_ts: str
    event_id: str
    ifa: str
    ifv: str
    country: str
    chip_balance: str
    game: str
    user_group: str
    user_condition: str
    device_type: str
    device_model: str
    user_name: str
    fb_connect: bool
    is_active_event: bool
    event_payload: str

TOPIC_PATH = "projects/nectar-259905/topics/events"

def run(pubsub_topic):
    options = PipelineOptions(
        streaming=True
    )
    runner = 'DirectRunner'
    print("I reached before pipeline")
    with beam.Pipeline(runner, options=options) as pipeline:
        message = (
            pipeline
            | "Read from Pub/Sub topic" >> beam.io.ReadFromPubSub(subscription='projects/triple-nectar-259905/subscriptions/bq_subscribe')#.with_output_types(bytes)
            | 'UTF-8 bytes to string' >> beam.Map(lambda msg: msg.decode('utf-8'))
            | 'Map to MySchema' >> beam.Map(lambda msg: MySchema(**msg)).with_output_types(MySchema)
            | "Writing to console" >> beam.Map(print))
    print("I reached after pipeline")
    result = message.run()
    result.wait_until_finish()

run(TOPIC_PATH)
If I print the decoded message directly, as below:
        message = (
            pipeline
            | "Read from Pub/Sub topic" >> beam.io.ReadFromPubSub(subscription='projects/triple-nectar-259905/subscriptions/bq_subscribe')#.with_output_types(bytes)
            | 'UTF-8 bytes to string' >> beam.Map(lambda msg: msg.decode('utf-8'))
            | "Writing to console" >> beam.Map(print))
I get output as
{
'user_id': '102105290400258488',
'event_ts': '2021-05-29 20:42:52.283 UTC',
'event_id': 'Game_Request_Declined',
'ifa': '6090a6c7-4422-49b5-8757-ccfdbad',
'ifv': '3fc6eb8b4d0cf096c47e2252f41',
'country': 'US',
'chip_balance': '9140',
'game': 'gru',
'user_group': '[1, 36, 529702]',
'user_condition': '[1, 36]',
'device_type': 'phone',
'device_model': 'TCL 5007Z',
'user_name': 'Minnie',
'fb_connect': True,
'event_payload': '{"competition_type":"normal","game_started_from":"result_flow_rematch","variant":"target"}',
'is_active_event': True
}
{
'user_id': '102105290400258488',
'event_ts': '2021-05-29 20:54:38.297 UTC',
'event_id': 'Decline_Game_Request',
'ifa': '6090a6c7-4422-49b5-8757-ccfdbad',
'ifv': '3fc6eb8b4d0cf096c47e2252f41',
'country': 'US',
'chip_balance': '9905',
'game': 'gru',
'user_group': '[1, 36, 529702]',
'user_condition': '[1, 36]',
'device_type': 'phone',
'device_model': 'TCL 5007Z',
'user_name': 'Minnie',
'fb_connect': True,
'event_payload': '{"competition_type":"normal","game_started_from":"result_flow_rematch","variant":"target"}',
'is_active_event': True
}
Please let me know if I'm doing something wrong while parsing the data to JSON. Also, I am looking for examples of data masking and running some SQL within Apache Beam.
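A likely cause of the TypeError is that `msg.decode('utf-8')` yields a `str`, and `MySchema(**msg)` needs a mapping, so the text has to be parsed into a dict first. The sketch below shows that parsing step in plain Python, without Beam. Two things in it are assumptions drawn from the printed sample: the payload uses single quotes and `True`, which looks like a Python dict repr rather than strict JSON (hence the `ast.literal_eval` fallback), and `create_ts` is absent from the sample (hence the `None` defaults). The helper name `parse_message` is made up for the example.

```python
import ast
import json
import typing


class MySchema(typing.NamedTuple):
    # Same fields as in the question. The None defaults are an assumption,
    # added so a record missing a key (create_ts in the sample) still parses.
    user_id: typing.Optional[str] = None
    event_ts: typing.Optional[str] = None
    create_ts: typing.Optional[str] = None
    event_id: typing.Optional[str] = None
    ifa: typing.Optional[str] = None
    ifv: typing.Optional[str] = None
    country: typing.Optional[str] = None
    chip_balance: typing.Optional[str] = None
    game: typing.Optional[str] = None
    user_group: typing.Optional[str] = None
    user_condition: typing.Optional[str] = None
    device_type: typing.Optional[str] = None
    device_model: typing.Optional[str] = None
    user_name: typing.Optional[str] = None
    fb_connect: typing.Optional[bool] = None
    is_active_event: typing.Optional[bool] = None
    event_payload: typing.Optional[str] = None


def parse_message(raw: bytes) -> MySchema:
    """Decode one Pub/Sub payload and turn it into a MySchema row."""
    text = raw.decode("utf-8")
    try:
        record = json.loads(text)
    except json.JSONDecodeError:
        # Not strict JSON: try reading it as a Python literal instead.
        record = ast.literal_eval(text)
    if not isinstance(record, dict):
        raise ValueError("payload did not decode to a mapping")
    # Ignore keys the schema does not know about.
    known = {k: v for k, v in record.items() if k in MySchema._fields}
    return MySchema(**known)
```

In the pipeline, the two `beam.Map` steps that decode and build the schema could then be replaced by a single `beam.Map(parse_message)` step; that wiring is untested here and is offered only as a starting point.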