C0D46236652dLoad up the csv in a temp / staging table present this to the user and make them edit or confirm before pushing it through and processing the data.
spongessuck494952dYes I've done something very similar. I'd only do it if it's easy and you're sure they're actually gonna check it or else you're probably wasting your time.
UnicornPoo32652dIt’s a good idea but it really depends on your user and the amount of data they are updating.
My experience is that most users don’t really get why their data needs to be accurate and will fully expect you to clear it up if they take a shit in your db.
However at least if you give them the option of validation then less of the responsibility is on you if/when they send in half arsed, inaccurate data.
Or I might just be a bit old and cynical!
hjk101292452dIf the dataset is really huge an excerpt of the records will suffice. Staging table is highly recommended even for processing trimming/validation on your end. You can display erroneous records without having to fish them out of your life db. Also remember that you need to do a lot of sanitisation for security reasons.
netikras2659252dLast I had to do this:
1. Validate the data once loaded; fail-fast asap to save resources:
2. Push the csv to the queue for async processing;
3. Return a 200 to the user;
4. Process the entries from the queue asynchronously, not overloading the system;
5. As soon as the last entry is processed - update the respective csv-uploads-history entry with a success status.
We only had a handful of servers and we had to deal with huuuuge loads. That did the trick.
Thanks everyone for your input and examples. All good stuff to think about.
Right now I think I'm going to go with:
- Customer uploads spreadsheet.
- I'm going to give them a sample (or all) of the entries to preview how we processed them (they can spot check if they like).
- They hit "yeah that looks good" and then I'll send them to the sever and I'll handle that all async and such.
On the preview page I sort of toyed with and built a quick option for them to change data if it was wrong BUT I thought about it and that just smelled like a bad idea. I really am against that now.
I didn't want the customers CSV, the possibly edited preview, and then the DB all... all to be the same thing but also ... possibly different.
I want the onus to be on the customer's CSV to be right, and if it's wrong, they fix it and then just run it again.
hjk101292450d@N00bPancakes inline editing only makes sense on small data sets like a Christmas greetings service where customers copy paste a list of addresses.
If users complian about it you can always add it later. Easily uploading it again is a must though, a lot of times it will be a structural problem like wrong column name or encoding issue.