Comments
-
bkwilliams: @iAmNaN SQL Server 2016. Getting the data from there is easy. It's doing the compares against the data from the API that is killing it.
-
iAmNaN: @bkwilliams Can you pull the data from the API into SQL Server, into a temp table, and let SQL Server do the work? I'm assuming SQL Server is your DW.
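A minimal sketch of that "let SQL Server do the compare" idea: once the API rows are staged in the database, the mismatch detection becomes a single set-based statement instead of row-by-row work in C#. All table and column names below (ApiStaging, DwHistory, CompareResults, UserId, LoginTime, Value) are hypothetical, since the real schema isn't shown in the thread.

```csharp
using System.Data.SqlClient;

static void CompareInSqlServer(string connectionString)
{
    // Set-based compare: rows whose joined keys match but whose compared value differs.
    const string sql = @"
        INSERT INTO dbo.CompareResults (UserId, LoginTime, ApiValue, DwValue)
        SELECT a.UserId, a.LoginTime, a.Value, d.Value
        FROM   dbo.ApiStaging a
        JOIN   dbo.DwHistory  d
               ON d.UserId    = a.UserId
              AND d.LoginTime = a.LoginTime
        WHERE  ISNULL(a.Value, '') <> ISNULL(d.Value, '');";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        cmd.CommandTimeout = 0;   // large compares can run long
        cmd.ExecuteNonQuery();    // the compare runs entirely inside SQL Server
    }
}
```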
-
bkwilliams: @iAmNaN That was going to be my next test. I don't know if I can bulk load into a temp table, plus there would have to be X of them, one for each thread.
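For what it's worth, SqlBulkCopy can target a local temp table as long as the table is created and used on the same open connection, and the per-thread requirement can be handled by naming. The column layout and names below are assumptions; the DataTable being loaded would need a matching schema.

```csharp
using System.Data;
using System.Data.SqlClient;

static void BulkLoadToTemp(SqlConnection openConn, DataTable apiRows, int threadId)
{
    string tempName = $"#ApiStaging_{threadId}";   // one temp table per thread

    // Create the temp table on the already-open connection so it stays visible.
    using (var create = new SqlCommand(
        $"CREATE TABLE {tempName} (UserId INT, LoginTime DATETIME2, Value NVARCHAR(400));",
        openConn))
    {
        create.ExecuteNonQuery();
    }

    // Stream the DataTable into tempdb; column order/types must match the temp table.
    using (var bulk = new SqlBulkCopy(openConn) { DestinationTableName = tempName })
    {
        bulk.WriteToServer(apiRows);
    }

    // The temp table lives for the lifetime of openConn (or until dropped),
    // so the compare query can run afterwards on this same connection.
}
```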
-
bkwilliams: @iAmNaN No, it's based on the number of users' logins to the API that I have to pull. A control table sets the active users, and I automate pulling the data for them. (The API doesn't keep history; my DW does.)
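A small sketch of the control-table pattern described above: read the active users, then drive one API pull per login. The table and column names (ApiPullControl, UserLogin, IsActive) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

static List<string> GetActiveUsers(string connectionString)
{
    var users = new List<string>();
    const string sql = "SELECT UserLogin FROM dbo.ApiPullControl WHERE IsActive = 1;";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                users.Add(reader.GetString(0));   // each active login gets its own API pull
        }
    }
    return users;
}
```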
-
iAmNaN: @bkwilliams Can you write the API data to flat files and use an ETL process to load them into the DB? I've seen that as one option. We extract data from a mainframe database (ADABAS) to flat files, and then use ETL to clean and load the data into Oracle.
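A sketch of that flat-file option: dump each API DataTable to a delimited file that an ETL job (SSIS, BULK INSERT, or similar) can pick up. The pipe delimiter and file layout here are assumptions, and a real export would also need to escape delimiters inside values.

```csharp
using System.Data;
using System.IO;
using System.Linq;

static void WriteToFlatFile(DataTable apiRows, string path)
{
    using (var writer = new StreamWriter(path))
    {
        // Header row with the column names.
        writer.WriteLine(string.Join("|", apiRows.Columns
            .Cast<DataColumn>()
            .Select(c => c.ColumnName)));

        // Pipe-delimited data rows.
        foreach (DataRow row in apiRows.Rows)
            writer.WriteLine(string.Join("|", row.ItemArray));
    }
}
```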
Related Rants
So I did this https://devrant.io/rants/797965/... which works fine up to medium-sized data.
However, for large data the ETL pegs a 6-core Xeon (2.2 GHz) with 50 GB of RAM, because it ends up doing six threaded compares, so 12 different data sets. Other than "pull less data", any tips?
The code (C#) is basically a LINQ multi-column join between two DataTables; when the compared columns don't match, the result (a var) is turned into a third DataTable that gets SqlBulkCopy-loaded into the DB.
Table1 is the external API return data (no windowing) and Table2 is from our DW.
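A rough reconstruction of the pattern the rant describes: a LINQ join over two DataTables on multiple key columns, keeping rows whose compared column differs, then materializing a third DataTable for SqlBulkCopy. The key and value column names are invented for illustration; the per-row field extraction and anonymous-key allocation in this style of join is exactly the work that ends up burning CPU on large inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static DataTable FindMismatches(DataTable apiTable, DataTable dwTable)
{
    // Multi-column join on (UserId, LoginTime); keep rows where the compared value differs.
    var mismatches = from a in apiTable.AsEnumerable()
                     join d in dwTable.AsEnumerable()
                         on new { UserId = a.Field<int>("UserId"),
                                  LoginTime = a.Field<DateTime>("LoginTime") }
                         equals new { UserId = d.Field<int>("UserId"),
                                      LoginTime = d.Field<DateTime>("LoginTime") }
                     where a.Field<string>("Value") != d.Field<string>("Value")
                     select a;

    // CopyToDataTable throws on an empty sequence, so guard for that case.
    List<DataRow> rows = mismatches.ToList();
    return rows.Count > 0 ? rows.CopyToDataTable() : apiTable.Clone();
}
```

The resulting DataTable is what would then be handed to SqlBulkCopy, as described in the rant.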
Tags: ask a dev, peg a server, scalability