I have to build a data platform with Azure Data Factory, and I am new to data platforms.

Any advice on what to look out for?

Could you please point me to any good use cases I can look at, or any obvious pitfalls, such as things that drain all the credits?

I only have a vague idea of what Azure Data Factory can do.

  • 3
    Only advice I can give is don't lock yourself in to that ecosystem. You'll be married to what is in effect a proprietary wrapper around Apache YARN, Beam and Spark. You'll also pay a premium.
  • 0

    Isn't creating any data platform like being married to it? I cannot think of any other way.
  • 0
    If you build it directly against the Apache products listed, you can move it anywhere: other clouds, on-prem, etc. Spend some time and host it yourself, save money, get more control.

    If you build it on Azure Data Factory, moving involves a ground-up rebuild because all your work will be done through MS abstractions. Very little of it will be reusable.
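
    As a rough illustration of what "reusable" could mean here: if the transformation logic lives in plain code with no cloud SDK calls, whichever tool schedules it can be swapped later. This is only a sketch; the `Order` row type and the `id`, `customer` and `amount` field names are made up for the example and are not from any real schema.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Order:
    """Hypothetical warehouse row, used only for illustration."""
    order_id: int
    customer: str
    amount_cents: int


def transform(rows: Iterable[dict]) -> Iterator[Order]:
    """Pure transformation: no orchestrator or cloud calls inside."""
    for row in rows:
        # Skip rows without a usable key instead of failing the whole batch.
        if row.get("id") in (None, ""):
            continue
        yield Order(
            order_id=int(row["id"]),
            customer=str(row.get("customer", "")).strip().title(),
            amount_cents=round(float(row.get("amount", 0)) * 100),
        )


if __name__ == "__main__":
    sample = [
        {"id": "1", "customer": " alice ", "amount": "12.50"},
        {"id": "", "customer": "no key", "amount": "1"},
    ]
    for order in transform(sample):
        print(order)
```

    A function like this can be unit tested locally and called from any scheduler, so only the thin layer that reads and writes data would need rewriting in a move.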
  • 0

    That is what I am afraid of. Azure Data Factory is too good at locking you in with its GUI.

    I was thinking of first moving all the on-site data to cloud storage such as Blob Storage. Then I would do the transformations with Data Factory operations and finally load the result into Azure SQL Database.

    What do you think of my approach?

    I am tasked with creating a single database (data warehouse) for analysis purposes, based on multiple databases created by different teams.

    I cannot think of any rollback model other than scrapping the whole data warehouse and starting over again if I make a mistake in a transformation.
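
    One possible alternative to scrapping everything, sketched under assumptions: keep the staged raw copy untouched, load each transformation run into its own versioned table, and let analysts query a view that points at the current version. Rolling back then means repointing the view. Here `sqlite3` only stands in for the warehouse, and the `customers` table and the function names are hypothetical; the same idea would need adapting to Azure SQL Database.

```python
import sqlite3
from typing import Iterable, List, Tuple


def point_view_at(conn: sqlite3.Connection, version: int) -> None:
    """Analysts query the 'customers' view; repointing it is the rollback."""
    conn.execute("DROP VIEW IF EXISTS customers")
    # int() ensures only a number is interpolated into the table name.
    conn.execute(
        f"CREATE VIEW customers AS SELECT * FROM customers_v{int(version)}"
    )
    conn.commit()


def publish(conn: sqlite3.Connection, version: int,
            rows: Iterable[Tuple[int, str]]) -> None:
    """Load one transformation run into its own table, then switch the view."""
    table = f"customers_v{int(version)}"
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    point_view_at(conn, version)


def current_names(conn: sqlite3.Connection) -> List[str]:
    """Read through the view, as an analyst would."""
    cursor = conn.execute("SELECT name FROM customers ORDER BY id")
    return [name for (name,) in cursor]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    raw = [(1, " alice "), (2, "BOB")]  # staged copy, never modified
    publish(conn, 1, [(i, n.strip().title()) for i, n in raw])
    publish(conn, 2, [(i, n.upper()) for i, n in raw])  # a "bad" run
    point_view_at(conn, 1)  # roll back without reloading anything
    print(current_names(conn))
```

    The trade-off is extra storage for old versions, so in practice only the last few would be kept.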