Random opinion question:

I'm working on a thing where the user provides a big CSV and we process it and put it in the database, or update existing records.

This data impacts other things, but it isn't front and center as a group in the application for them to notice / see again (well, they can query for it).

I'm thinking of taking the CSV and then presenting them with a table showing how we processed that data giving them a chance to review it before they commit it to the database...

I like this idea for two reasons:

1. If something goes sideways there's a chance someone will see it, and I'm not sure I can do enough validation on a big-ass CSV from god knows where to be sure we're going to process it right... (I'm going to do some validation, but I just can't cover it all)

2. It takes some of the mystery out of what happened / is happening for the user.

Anyone try this in the past? Seems reasonable, but lots of things do before they go sideways.

  • 9
    Load the CSV into a temp / staging table, present this to the user, and make them edit or confirm before pushing it through and processing the data.
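A minimal sketch of the staging-table idea, assuming SQLite and made-up table/column names (`staging_rows`, `items`, `name`, `qty`):

```python
import csv
import io
import sqlite3

def load_into_staging(conn, csv_text):
    """Parse the CSV into a staging table; the returned rows are what
    you'd render back to the user as the review table."""
    conn.execute("CREATE TABLE IF NOT EXISTS staging_rows (name TEXT, qty INTEGER)")
    rows = [(r["name"], int(r["qty"])) for r in csv.DictReader(io.StringIO(csv_text))]
    conn.executemany("INSERT INTO staging_rows VALUES (?, ?)", rows)
    return rows

def commit_staging(conn):
    """After the user confirms, move staging rows into the real table."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, qty INTEGER)")
    conn.execute("INSERT INTO items SELECT * FROM staging_rows")
    conn.execute("DELETE FROM staging_rows")
```

Nothing touches the real `items` table until the confirm step, so abandoning the upload is just dropping the staging rows.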
  • 1
    Yes I've done something very similar. I'd only do it if it's easy and you're sure they're actually gonna check it or else you're probably wasting your time.
  • 1
    It’s a good idea but it really depends on your user and the amount of data they are updating.
    My experience is that most users don’t really get why their data needs to be accurate and will fully expect you to clear it up if they take a shit in your db.
    However at least if you give them the option of validation then less of the responsibility is on you if/when they send in half arsed, inaccurate data.
    Or I might just be a bit old and cynical!
  • 1
    If the dataset is really huge, an excerpt of the records will suffice. A staging table is highly recommended even for processing / trimming / validation on your end. You can display erroneous records without having to fish them out of your live db. Also remember that you need to do a lot of sanitisation for security reasons.
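Splitting rows into good and erroneous ones (so the bad ones can be displayed without touching the live db) might look like this; the rules here (non-empty name, integer qty ≥ 0) are placeholders, since real validation depends on your schema:

```python
def validate_rows(rows):
    """Partition parsed CSV rows into (good, bad); each entry is
    (row_number, row, errors) so erroneous rows can be shown to the user."""
    good, bad = [], []
    for i, row in enumerate(rows, start=1):
        errors = []
        if not row.get("name", "").strip():
            errors.append("missing name")
        try:
            if int(row.get("qty", "")) < 0:
                errors.append("negative qty")
        except ValueError:
            errors.append("qty is not an integer")
        (bad if errors else good).append((i, row, errors))
    return good, bad
```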
  • 3
    Last I had to do this:
    1. Validate the data once loaded; fail fast ASAP to save resources;
    2. Push the csv to the queue for async processing;
    3. Return a 200 to the user;
    4. Process the entries from the queue asynchronously, not overloading the system;
    5. As soon as the last entry is processed - update the respective csv-uploads-history entry with a success status.

    We only had a handful of servers and we had to deal with huuuuge loads. That did the trick.
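The steps above can be sketched with a plain in-process queue and worker thread (a real setup would use a broker like RabbitMQ or a task runner like Celery; the names `handle_upload`, `upload_history`, and the statuses are made up):

```python
import queue
import threading

jobs = queue.Queue()
upload_history = {}  # upload_id -> status, the "csv-uploads-history" record

def handle_upload(upload_id, rows):
    """Fail fast on validation, enqueue the work, and return immediately
    (the "return a 200 to the user" step)."""
    if not rows:
        upload_history[upload_id] = "rejected"
        return 400
    upload_history[upload_id] = "processing"
    jobs.put((upload_id, rows))
    return 200

def worker():
    """Drain the queue one upload at a time so the system isn't overloaded."""
    while True:
        upload_id, rows = jobs.get()
        for row in rows:
            pass  # insert/update the real tables here
        upload_history[upload_id] = "success"  # last entry processed
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
```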
  • 1
    @netikras This worked for us too. And add an email or system notification for errors or successful confirmation.
  • 1
    Thanks everyone for your input and examples. All good stuff to think about.

    Right now I think I'm going to go with:

    - Customer uploads spreadsheet.

    - I'm going to give them a sample (or all) of the entries to preview how we processed them (they can spot check if they like).

    - They hit "yeah that looks good" and then I'll send it to the server and handle it all async and such.

    On the preview page I toyed with and built a quick option for them to change data if it was wrong, BUT I thought about it and that just smelled like a bad idea. I'm really against it now.

    I didn't want the customer's CSV, the possibly edited preview, and then the DB all supposed to be the same thing but also... possibly different.

    I want the onus to be on the customer's CSV to be right, and if it's wrong, they fix it and then just run it again.
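A sketch of that upload → preview → confirm flow, keeping the original CSV as the single source of truth (the token scheme and function names are invented for illustration):

```python
import csv
import hashlib
import io

pending = {}  # token -> raw CSV text, held until the user confirms

def start_upload(csv_text, sample_size=5):
    """Store the raw CSV and return a token plus a sample of processed
    rows for the user to spot-check."""
    token = hashlib.sha256(csv_text.encode()).hexdigest()[:12]
    pending[token] = csv_text
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return token, rows[:sample_size]

def confirm_upload(token):
    """User hit "yeah that looks good": re-process from the original CSV,
    so the preview can never drift from what gets committed."""
    csv_text = pending.pop(token)
    return list(csv.DictReader(io.StringIO(csv_text)))  # hand off to async processing
```

Because there's no edit step, a wrong preview means the customer fixes their CSV and runs it again, matching the "onus on the CSV" decision above.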
  • 1
    @N00bPancakes inline editing only makes sense on small data sets like a Christmas greetings service where customers copy paste a list of addresses.
    If users complain about it you can always add it later. Easily uploading it again is a must, though; a lot of the time it will be a structural problem like a wrong column name or an encoding issue.