14
kurtr
6y

When I wrote my first algorithm that learns...

So in order to on board our customers onto our software we have to link the product on their data base to the products on ours. This seems easy enough but when you actually start looking at their data you find it's a fuck up of duplication's, bad naming conventions and only 10% or so have distinct identifiers like a suppler code,model no or barcode. After a week or 2 they find they can't do it and ask for our help and we take over. On average it took 2 of our staff 1-2 weeks to complete the task manually searching one record of theirs against our db at a time. This was a big problem since we only had enough resources to on board 2-4 customers a month meaning slow growth.

I realized when looking at different customers databases that although the data was badly captured - it was consistently badly captured similar to how crap file names will usually contain the letters 'asd' because its typed with the left hand.

I then wrote an algorithm that fuzzy matched against our data and the past matches of other customers data creating a ranking algorithm similar to google page search. After auto matching the majority of results the top 10 ranked search results for each product on their db is shown to a human 1 at a time and they either click the the correct result or select "no match" and repeat until it is done at which point the algo will include the captured data in ranking future results.

It now takes a single staff member 1-2 hours to fully on board a customer with 10-15k products and will continue to get faster and adapt to changes in language and naming conventions. Making it learn wasn't really my intention at the time and more a side effect of what I was trying to achieve. Completely blew my mind.

Comments
Add Comment