We are building a big-data engine for a client's product. We were running it on a GCP cluster, and they were billed ~$1100 for last month's usage.
The CTO, the CHIEF FUCKING TECHNOLOGY OFFICER, told us to hook up 5-6 laptops in our server room and build our own cluster because they cannot afford such a large bill.

  • 12
    I'd go full beast mode and cluster 5-6 more toasters onto that, just in case 😀
  • 4
    The Cloud is just Someone Else's Computer(tm). Most people don't realize how expensive AWS/Azure actually are. A lot of the bigger players move back to their own data centers in order to save on costs.
  • 2
    @djsumdog Dude, you are absolutely right, but we are using Dataproc, which is a PaaS running Spark.

    It's not very easy to manage a cluster of computers for distributed computing!
  • 3
    @djsumdog Yes, for the same performance it's often more expensive. But when you start to require some of the more advanced services, like redundancy, handling large variations in load, or geographic load balancing, the up-front cost of your own equipment and personnel starts to rise exponentially until you reach critical mass.

    So it all depends on if and why you need the cloud.

    But if you still install and run complete servers, then yes, the cloud is often more expensive.
  • 2
    We are building our data pipeline and machine learning platform on GCP.
    One of the reasons is that manually configuring services like Spark, HDFS, Lucene, TensorFlow, and Kubernetes is really exhausting, and you can easily mess up and break things.
    So, I'd recommend the cloud with this advice: don't spin up expensive XL resources like Cloud TPUs, Spanner, or NVIDIA GPUs. Instead, I prefer small GPUs and deleting them immediately after training.
  • 2
    Anyone for another raspberry pi cluster?