ML and NLP people I need some help bc I’m stupid

I want to build a model that is trained on a set of images and the captions for those specific images.

The images are of a single person in different positions and different environments.

How do I train something like this? Do I use an object recognition model to understand what’s in the image and use that as its features?

Comments
  • 0
    Well, I'd say the most fundamental problem is you haven't said what you want to accomplish...

    As in, what algo you use, what features you use, and what labels you use are all defined by that.
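To make that concrete, here's a rough sketch of how the same image/caption set turns into different (input, label) pairs depending on the goal. Everything in it is made up for illustration: the file names, the captions, the `POSES` list, and the three example goals (captioning, pose classification, image-caption matching) are assumptions, not something the question specified.

```python
from dataclasses import dataclass


@dataclass
class Example:
    image_path: str  # hypothetical file name, not a real dataset
    caption: str


# Placeholder data standing in for "one person, different poses and places".
DATA = [
    Example("img_001.jpg", "person sitting on a bench in a park"),
    Example("img_002.jpg", "person standing in a kitchen"),
    Example("img_003.jpg", "person sitting at a desk in an office"),
]

# Assumed pose vocabulary, only used for the classification goal.
POSES = ("sitting", "standing", "lying")


def captioning_pairs(data):
    """Goal: generate a caption. Input = image, label = the full caption."""
    return [(ex.image_path, ex.caption) for ex in data]


def pose_classification_pairs(data):
    """Goal: classify the pose. Input = image, label = pose word in the caption."""
    pairs = []
    for ex in data:
        words = ex.caption.split()
        pose = next((p for p in POSES if p in words), None)
        if pose is not None:  # skip captions that mention no known pose
            pairs.append((ex.image_path, pose))
    return pairs


def matching_pairs(data):
    """Goal: match images to captions. Input = (image, caption), label = 1 if they belong together."""
    return [
        ((img.image_path, cap.caption), int(i == j))
        for i, img in enumerate(data)
        for j, cap in enumerate(data)
    ]
```

Same data, three different sets of labels. Which one applies (or whether it's something else entirely) depends on what the model is supposed to output, and that in turn decides whether a pretrained object recognition model makes sense as the feature extractor.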