Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "fuzzy searching"
-
Person A: "Add a search box, one where you can type anything in. You know, a hairy search or something"
Person B: "You mean a fuzzy search..." -
Two things before this all:
- I fucking love gitlab so far
- I miss the fuzzy searching from sublime text, as vsCode still can't do it properly..
I was fed up with all the shitty overbloated git deployment scripts, sync scripts, automatic backup solutions and hosted git servers out there, so now my own solution is:
- remote git cloned local files
- local files are synced via dropbox, to easily edit them on any device
- all changes and deleted files are saved up to 1 year on dropbox
- remote has gitlab running and webhooks setup
- the webhooks point to my node scripts, which then rebase the code to its dedicated dev server
- daily server backup with 7 days roll
- cold storage backup each 30 days
Sounds like overkill, but from my experience, you really can't have enough places that have a backup, especially coldstorage backups.
My goal in general though is to have everything on my computer backupped and ready to go asap, if something happens.
I wanted to just use a virtual machine for development stuff, but that wouldnt be able to run on my laptop, so I need a more general solution, where I sync all configs and all projects across. (and have some sort of basic list of tools needed, so I dont need to remember them)
Found for example something for vscode to sync its settings and plugins via any sort of git, will give it a try in near future too.7 -
When I wrote my first algorithm that learns...
So in order to on board our customers onto our software we have to link the product on their data base to the products on ours. This seems easy enough but when you actually start looking at their data you find it's a fuck up of duplication's, bad naming conventions and only 10% or so have distinct identifiers like a suppler code,model no or barcode. After a week or 2 they find they can't do it and ask for our help and we take over. On average it took 2 of our staff 1-2 weeks to complete the task manually searching one record of theirs against our db at a time. This was a big problem since we only had enough resources to on board 2-4 customers a month meaning slow growth.
I realized when looking at different customers databases that although the data was badly captured - it was consistently badly captured similar to how crap file names will usually contain the letters 'asd' because its typed with the left hand.
I then wrote an algorithm that fuzzy matched against our data and the past matches of other customers data creating a ranking algorithm similar to google page search. After auto matching the majority of results the top 10 ranked search results for each product on their db is shown to a human 1 at a time and they either click the the correct result or select "no match" and repeat until it is done at which point the algo will include the captured data in ranking future results.
It now takes a single staff member 1-2 hours to fully on board a customer with 10-15k products and will continue to get faster and adapt to changes in language and naming conventions. Making it learn wasn't really my intention at the time and more a side effect of what I was trying to achieve. Completely blew my mind.