Search - "mnist"
-
I had the idea that part of the problem in NN and ML research is that we all use the same standard loss and nonlinear functions. In theory most NN architectures are universal approximators, but there's a big gap between symbolic and numeric computation.
But some of our bigger leaps in improvement haven't come just from new architectures, but from entirely new approaches to how data is transformed and how we calculate loss, for example KL divergence.
And it occurred to me that all we really need is training/test/validation data, and with the right approach we can let the system discover not only the architecture (that's been done before) but also the nonlinear and loss functions themselves, and see what pops out the other side as a result.
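Something like this is roughly what I mean by deriving the nonlinearities instead of hard-coding them; a throwaway sketch with a made-up primitive pool (none of these names come from any library):

```python
# Toy sketch: "deriving" a nonlinearity by composing it out of a pool of
# primitive ops instead of hard-coding relu/tanh. Every name here is made up.
import random
import numpy as np

PRIMITIVES = {
    "neg":    lambda x: -x,
    "square": lambda x: x * x,
    "max0":   lambda x: np.maximum(x, 0.0),
    "exp":    lambda x: np.exp(np.clip(x, -10.0, 10.0)),
    "log1p":  lambda x: np.log1p(np.abs(x)),
    "tanh":   np.tanh,
}

def random_activation(depth=3):
    """Compose a candidate nonlinearity out of `depth` randomly chosen primitives."""
    recipe = [random.choice(list(PRIMITIVES)) for _ in range(depth)]
    def f(x):
        for name in recipe:
            x = PRIMITIVES[name](x)
        return x
    f.recipe = recipe   # keep the "genome" around so it can be mutated/substituted later
    return f

act = random_activation()
print(act.recipe, act(np.linspace(-2.0, 2.0, 5)))
```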
If a network can instrument its own code, as it were, maybe it'd find new and useful nonlinear functions and losses. Networks wouldn't just specify a conv layer here or a maxpool there, but derive implementations of these all on their own.
More importantly, with a little pruning we could even use successful examples to bootstrap smaller, more efficient algorithms, all within the graph itself, and use genetic algorithms to mix and match nodes at training time to discover what works or doesn't, or do training, testing, and validation in batches to anneal a network in the correct direction.
By generating variations of successful nodes and graphs, and using substitution, we can use comparison to minimize error (for some measure of error over accuracy and precision) and select the best graph variations, without strictly having to do much point mutation within any given node, minimizing deleterious effects, sort of like how gene expression leads to unexpected but fitness-improving results for an entire organism, while point mutations typically cause disease.
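To make the substitution/selection loop concrete, here's a rough toy of it in PyTorch: a genome is just a list of activation nodes, a child swaps exactly one node, and a deliberately short training run plus validation loss decides who survives. The toy task and all the names are mine, nothing standard:

```python
# Rough toy of the node-substitution + selection loop in PyTorch.
# Genome = one activation "node" per hidden layer; a child substitutes one node,
# and a short, cheap training run plus validation loss picks the survivor.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

ACT_POOL = [nn.ReLU, nn.Tanh, nn.SiLU, nn.Softsign, nn.Hardshrink]

def make_data(n=512):
    x = torch.randn(n, 2)
    y = ((x[:, 0] * x[:, 1]) > 0).long()          # XOR-ish toy labels
    return x[: n // 2], y[: n // 2], x[n // 2 :], y[n // 2 :]

def build(genome):
    layers, in_dim, width = [], 2, 16
    for idx in genome:
        layers += [nn.Linear(in_dim, width), ACT_POOL[idx]()]
        in_dim = width
    layers.append(nn.Linear(in_dim, 2))
    return nn.Sequential(*layers)

def score(genome, xtr, ytr, xva, yva, steps=200):
    """Short training run: we only need a ranking signal, not convergence."""
    net = build(genome)
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(net(xtr), ytr).backward()
        opt.step()
    with torch.no_grad():
        return F.cross_entropy(net(xva), yva).item()

xtr, ytr, xva, yva = make_data()
best = [random.randrange(len(ACT_POOL)) for _ in range(3)]
best_loss = score(best, xtr, ytr, xva, yva)
for gen in range(10):
    child = best[:]
    child[random.randrange(len(child))] = random.randrange(len(ACT_POOL))  # substitute one node
    child_loss = score(child, xtr, ytr, xva, yva)
    if child_loss < best_loss:                     # select on validation loss, not training loss
        best, best_loss = child, child_loss
    print(gen, round(best_loss, 4), [ACT_POOL[i].__name__ for i in best])
```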
It might seem like this wouldn't work out of the gate, just on the basis of intuition, but I think the benefit of working through node substitutions, or entire subgraph substitutions, is that we can check test/validation loss before training is even complete.
If we train a network to approximate a known loss, we can even have it evaluate the networks themselves, and run variations on our network's loss node to find better losses during training time, and at some point let nodes refer to these same loss-calculation graphs within themselves, switching between them dynamically via variation and substitution.
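Same trick applied to the loss node itself: swap candidate losses in and out and let validation accuracy (which doesn't care what loss produced the weights) do the judging. Again just a toy sketch with my own naming:

```python
# Substituting the loss node: train the same tiny net under each candidate loss
# and rank the candidates by validation accuracy. Toy sketch only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def onehot(y, k=2):
    return F.one_hot(y, k).float()

LOSS_POOL = {
    "xent":  lambda logits, y: F.cross_entropy(logits, y),
    "mse":   lambda logits, y: F.mse_loss(torch.softmax(logits, 1), onehot(y)),
    "l1":    lambda logits, y: F.l1_loss(torch.softmax(logits, 1), onehot(y)),
    "hinge": lambda logits, y: torch.clamp(1 - (2 * onehot(y) - 1) * logits, min=0).mean(),
}

x = torch.randn(512, 2)
y = ((x[:, 0] * x[:, 1]) > 0).long()
xtr, ytr, xva, yva = x[:256], y[:256], x[256:], y[256:]

for name, loss_fn in LOSS_POOL.items():
    net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 2))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss_fn(net(xtr), ytr).backward()
        opt.step()
    acc = (net(xva).argmax(1) == yva).float().mean().item()
    print(f"{name:6s} val acc {acc:.3f}")   # the best-scoring loss node "wins" this round
```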
I could even envision probabilistic lists of jump addresses, or mappings of value ranges to jump addresses, or having await()-style opcodes on some nodes that, upon being encountered, queue up ticks from the upstream nodes whose calculations the await()ed node relies on, to do things like emergent convolution.
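To be clear, that part is pure hand-waving so far; a toy of what I mean by the await() behaviour (nowhere near the real opcode set, just the shape of it) might be:

```python
# Hypothetical toy of the await() idea: a node that, when ticked before its
# upstream values exist, queues up ticks for its dependencies and only fires
# once all of their values have arrived. Not the actual interpreter.
from collections import deque

class Node:
    def __init__(self, name, fn, deps=()):
        self.name, self.fn, self.deps = name, fn, list(deps)
        self.value = None

    def ready(self):
        return all(d.value is not None for d in self.deps)

def run(output_node):
    """Tick the graph until the requested node has a value."""
    queue = deque([output_node])
    while queue:
        node = queue.popleft()
        if node.value is not None:
            continue
        if node.ready():
            node.value = node.fn(*(d.value for d in node.deps))
        else:
            # await(): re-queue this node behind the upstream work it needs
            queue.extend(d for d in node.deps if d.value is None)
            queue.append(node)
    return output_node.value

# tiny example graph: out = (a + b) * a
a = Node("a", lambda: 3.0)
b = Node("b", lambda: 4.0)
s = Node("sum", lambda x, y: x + y, deps=(a, b))
c = Node("out", lambda x, y: x * y, deps=(s, a))
print(run(c))   # 21.0
```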
I've written all the classes and started on the interpreter itself; just a few things that need fleshing out now.
Here's my shitty little partial sketch of the opcodes and ideas.
https://pastebin.com/5yDTaApS
I think I'll teach it to do convolution, color recognition, maybe try MNIST, or teach it step by step how to do sequence masking and prediction, dunno yet.
-
Trained a hand-digit recognition model on the MNIST dataset. Got ~97% accuracy (woohoo!). Tried predicting my own digits, and it's fucked up! Every ML model's story (?)
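The usual culprit (as far as I can tell) isn't the model at all, it's that my own digits don't look like MNIST: they need to be grayscale, 28x28, white-on-black and roughly centered before predicting. A quick sketch of that massaging, assuming a Keras-style model and a hypothetical "my_digit.png":

```python
# Sketch: preprocess an arbitrary digit photo into MNIST format before predicting.
# "my_digit.png" and `model` are placeholders, not real files/objects.
import numpy as np
from PIL import Image

def to_mnist_format(path):
    img = Image.open(path).convert("L").resize((28, 28))    # grayscale, 28x28
    arr = np.asarray(img, dtype=np.float32) / 255.0
    if arr.mean() > 0.5:     # MNIST is a white digit on black; photos are usually the opposite
        arr = 1.0 - arr
    return arr.reshape(1, 28, 28, 1)                        # add batch + channel dims

# x = to_mnist_format("my_digit.png")
# print(model.predict(x).argmax())
```

MNIST digits are also size-normalized and centered by center of mass, so a digit sitting off in a corner of the photo will still throw the model even after this.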
-
Anyone ever tried using user-to-user collaborative filtering to classify the MNIST digits dataset?
This is about as far as I got:
https://hastebin.com/obinoyutuw.py
It's literal copy-and-paste frankencode because this is only the second time I've ever done something like this, so pardon the hatchet job.
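The rough shape I was going for, if anyone's curious, is treating every training image as a "user" whose pixels are "ratings", then labelling a test digit by a similarity-weighted vote of its most similar users. Something like this (a rewritten sketch, not the hastebin code; loading via Keras is just what I had handy):

```python
# User-to-user collaborative filtering on MNIST: mean-center and normalise each
# "user" (image), find the k most similar training users for a test digit,
# and take a similarity-weighted vote on their labels. Sketch only.
import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(len(x_train), -1).astype(np.float32) / 255.0
x_test = x_test.reshape(len(x_test), -1).astype(np.float32) / 255.0

def normalise(m):
    m = m - m.mean(axis=1, keepdims=True)     # Pearson/cosine-style similarity via dot product
    return m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-9)

train_n = normalise(x_train[:10000])          # subset; the full 60k x 10k similarity matrix is heavy
labels = y_train[:10000]

def predict(img, k=20):
    sims = train_n @ normalise(img.reshape(1, -1))[0]   # similarity to every training "user"
    top = np.argsort(sims)[-k:]                          # k most similar users
    votes = np.bincount(labels[top], weights=sims[top], minlength=10)
    return votes.argmax()

hits = sum(predict(x_test[i]) == y_test[i] for i in range(200))
print("accuracy on 200 test digits:", hits / 200)
```

It's basically k-nearest-neighbours wearing a recommender hat, which is probably why it sort of works at all.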
-
I feel really lost in neural network theory.
The MNIST sample made sense, but now I'm looking at GANs and CNNs, and all of a sudden I'm lost.
The same goes for the examples I'm finding of something I know I was able to get working once upon a time, when I was more at peace: WaveNet for text-to-speech.
I used the ONNX model back then, which was very easy to implement, but I quickly get lost looking at the TensorFlow and PyTorch code; even though it's very short, I feel intimidated.
The SSD MobileNet documentation is also pretty straightforward, but when I look for WaveNet information about what format to provide the input in and how to interpret the output, I'm having some trouble.
It's frustrating.
I'm tense, I'm poorly rested, I'm sick of having to redo crap and I'm surrounded by people who make me hypervigilant, skin crawly and tense.
How do I overcome these things when I'm not at peace at all?
I don't know. Pushing through it isn't compatible with the mindset I've been forced into.