Using camera and image recognition to give translations of signs?

Can you explain a specific example of how this might work from a museum guest’s perspective? I’m not understanding how the image matching is supposed to function.

You can do this with Thunkable, but it's going to involve a lot of API work for image recognition and translation. ChatGPT might be able to handle both, though I don't know how useful its image recognition output will be, especially if you're trying to match it against an existing image database. It depends on how similar or different each image is and what tags you provide in the database for ChatGPT's output to be matched against. You might need to give ChatGPT those same tags in the form of custom instructions so that it chooses the most likely one instead of coming up with its own.
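To make the "give ChatGPT your tags as custom instructions" idea concrete, here's a rough Python sketch of the flow rather than Thunkable blocks (in Thunkable you'd wire up the same request with the Web API component). Everything in it is an assumption for illustration: the model name, the tag names, the prepared translations, and the idea of storing translations in a simple lookup table.

```python
# Hypothetical sketch: send a photo of a sign to a vision-capable chat model,
# ask it to pick the closest tag from a list we maintain, then look up a
# prepared translation for that tag. Model name, tags, and translations are
# all placeholder examples.
import base64
import requests

API_KEY = "sk-..."  # your own OpenAI API key

# Tags you maintain yourself, one per sign in the museum, with prepared translations.
SIGN_TRANSLATIONS = {
    "dinosaur_hall_entrance": "Welcome to the Dinosaur Hall",
    "no_flash_photography": "Flash photography is not permitted",
    "gift_shop_hours": "The gift shop is open 9am to 5pm",
}

def identify_sign(photo_path: str):
    """Ask the model to match the photo to exactly one of our known tags."""
    with open(photo_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    # The "custom instructions" idea from the post: give the model the exact
    # tag list so it chooses one instead of inventing its own description.
    prompt = (
        "Match this photo of a museum sign to exactly one of these tags, "
        "or reply NONE if nothing fits: " + ", ".join(SIGN_TRANSLATIONS)
    )

    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "gpt-4o-mini",  # any vision-capable model
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }],
        },
        timeout=30,
    )
    tag = resp.json()["choices"][0]["message"]["content"].strip()
    return tag if tag in SIGN_TRANSLATIONS else None

tag = identify_sign("sign_photo.jpg")
print(SIGN_TRANSLATIONS[tag] if tag else "Sign not recognized")
```

The key design point is that the app never trusts free-form text from the model: it only accepts an answer that exactly matches one of your own tags, and everything the guest actually sees comes from translations you wrote in advance.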

This is going to take A LOT of time to get right. I’ve done some of this work before, for example with this project. I’d expect to plan for at least 40 hours to code, test, and fully build such an app.
