I want to build a neural network (with TensorFlow) that takes an audio file and recognizes every spoken word, giving each one a start and end timestamp.

Any suggestions on how I should approach this? What kind of network would be a good fit? Do I need a separate network to split the audio into words first? I've done the MNIST tutorial but not much more.

Comments
  • 0
    What's your motivation?
    Do you just need the audio files transcribed, or do you want to learn how to do that yourself?
  • 2
    @heyheni I want to learn it. I specifically need the timestamps for each word.

    I want to create a program where you type a sentence, and it creates a movie in which each word is taken from a random speech.
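To make the goal concrete, here is a rough sketch of the lookup step only, assuming the word timestamps already exist (which is exactly the part the network would have to produce). The names `WordClip`, `build_index`, and `pick_clips` are hypothetical, made up for illustration, and the actual video cutting is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WordClip:
    """One spoken word located in a source recording (hypothetical structure)."""
    word: str
    source: str   # path or id of the speech recording
    start: float  # start time in seconds
    end: float    # end time in seconds


def build_index(clips):
    """Group clips by lowercased word so lookups are case-insensitive."""
    index = {}
    for clip in clips:
        index.setdefault(clip.word.lower(), []).append(clip)
    return index


def pick_clips(sentence, index, rng=random):
    """Pick one random clip per word; raise KeyError if a word has no clip."""
    chosen = []
    for word in sentence.lower().split():
        candidates = index.get(word)
        if not candidates:
            raise KeyError(f"no clip available for word: {word!r}")
        chosen.append(rng.choice(candidates))
    return chosen
```

Each returned `WordClip` says which recording to cut and where, so the output of the timestamping network would only need to be a list of (word, source, start, end) records.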