5
RaduAmh
7d

Why are all the resources on deep learning and machine learning based on Python? I know, for instance, that C# has ML.NET, but you don't find books on it. Does anyone know good resources that are more language-agnostic?

Comments
  • 0
    ML.NET is very new, only about 3 months old. The current version is 0.4, I believe, so a lot can still change.

    Does it actually use the GPU to process everything?
  • 3
    Python is great for writing very high-level code and orchestrating the execution of low-level libraries.
    And it does all this out of the box in a nice, lightweight environment. Also, it's one of the easiest languages to pick up and very popular in the scientific community.

    Also, Python's ML ecosystem is great: there are tons of very high-quality packages to choose from. Other platforms with booming ecosystems are R and the JVM (sort of). Languages like C# and C++ aren't really widely used for this, and I mention the JVM only because of Spark's ML capabilities and deeplearning4j.

    Andrew Ng's Stanford course is great. The lecture notes are online and so is a set of video recordings of the class (YouTube). He uses Octave, but you can implement the algorithms in any language because he starts from scratch.
  • 1
    @RememberMe I can see your point, but aren't you going to lose speed if you use out-of-the-box tools like the image or data preprocessing from the Keras library? I mean, suppose you can write the algorithm in memory-leak-free C++ code, isn't it going to be faster?
  • 3
    @RaduAmh the algorithms are written in C/C++; Python just orchestrates them. Nobody does heavy numerical work in pure Python, that's just pointless. There are tools like Numba or Cython to make Python code faster, though.

    For example, the TensorFlow Python API is just a wrapper over the C++ TensorFlow library, NumPy wraps C code and C/Fortran implementations of BLAS, Keras is a very high-level library that runs on top of TensorFlow, and so on. This is a common pattern with Python: you have LLVM wrappers, rendering engine wrappers (Ogre), Qt application framework wrappers (PyQt), CUDA wrappers (Numba, PyCUDA), etc.

    Look at Blender: the core of the software (rendering engine, core operations) is written in C/C++, but a ton of other stuff is done through the Python API. That makes Blender very flexible and open to plugins while retaining good performance.
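
    A minimal sketch of the "Python orchestrates, C does the work" idea, using only the standard library. It assumes CPython, where the built-in `sum` is implemented in C; `python_sum` and `data` are made-up names for illustration, and the timings will vary by machine.

```python
import timeit


def python_sum(values):
    """Add numbers with an explicit loop executed by the Python interpreter."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(1_000_000))

# Both approaches give the same answer; only where the loop runs differs.
assert python_sum(data) == sum(data)

# Time five runs of each version. In CPython the built-in sum loops in C.
loop_time = timeit.timeit(lambda: python_sum(data), number=5)
builtin_time = timeit.timeit(lambda: sum(data), number=5)

print(f"interpreted loop: {loop_time:.3f}s")
print(f"built-in sum (C): {builtin_time:.3f}s")
```

    On a typical CPython install the built-in version tends to come out noticeably faster, but measure on your own machine rather than taking that on faith.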
  • 2
    @RaduAmh also, raw speed matters a lot less than you think. You also need to know where performance gains actually pay off.

    Python's development cycle is really, really fast. No compiling, it just runs out of the box. It's high-level, so you can express your ideas very quickly with reasonable performance. In many cases that's more valuable than squeezing performance out of your code (e.g., isn't it a bad thing to waste time writing stuff in a low-level language if you end up having to throw it away and use something else?).

    Also, the biggest performance gains come from optimizing "hot" code - the very small part of your program that is responsible for most of the processing. Imagine the tight inner loop in a data processing engine that adds matrices together, something like that. In Python's model that hot code would be written in C/C++, while Python does the initialization, display, reading/writing data, top-level function calls, etc.: all the boilerplate stuff.
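
    One way to find the hot code is to profile before optimizing anything. A rough sketch with the standard library's `cProfile`, where `add_matrices`, `make_matrix`, and `main` are hypothetical stand-ins for a real program:

```python
import cProfile
import pstats


def add_matrices(a, b):
    """Hot code: the nested loop that does almost all of the work."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def make_matrix(n, value):
    """Boilerplate: build an n-by-n matrix filled with one value."""
    return [[value] * n for _ in range(n)]


def main():
    a = make_matrix(200, 1)
    b = make_matrix(200, 2)
    result = a
    # Repeat the addition so the inner loop dominates the run time.
    for _ in range(50):
        result = add_matrices(result, b)
    return result


profiler = cProfile.Profile()
result = profiler.runcall(main)

# Show the five entries with the largest cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

    In the printed report, `add_matrices` should account for nearly all of the cumulative time, which marks it as the part worth moving to C/C++ (or handing to a library that already did).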
  • 2
    @RememberMe Thanks for taking the time to answer, the situation is a bit clearer now.
  • 1
    Google the FANN library. It's written in C and has bindings for many languages.