Trying to build a simple app that takes photos and generates keywords from the photo

I’m trying to build a super simple app that takes a photo, sends it to Google’s Cloud Vision API, and uses the labeling feature to spit out what it sees in the picture. I’m an absolute beginner to development, and all I could do so far was create an app that takes a photo, stores it in Cloudinary, and generates a URL. I’m completely lost on how to send this URL to the Google Cloud Vision API and get a response. Can anyone please help out with setting up the blocks for this use case? Thanks in advance!

Each API is different enough that it’s hard to suggest an approach other than: (1) read the API documentation, especially anything that mentions “endpoints” or REST; (2) watch my tutorial video; and (3) post a link to the API documentation, along with a screenshot of any blocks you’ve tried after that.
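
For Cloud Vision specifically, it may help to see the shape of the request before translating it into blocks. The following is only a rough Python sketch of the REST call, not a blocks solution: it assumes the `images:annotate` endpoint with an API key passed as a `key` query parameter, and that the Cloudinary URL is publicly reachable. The function names (`build_label_request`, `extract_labels`, `detect_labels`) are made up for illustration.

```python
import json
import urllib.request

# Cloud Vision REST endpoint for image annotation requests.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_url, max_results=10):
    """Build the JSON body asking for labels on an image hosted at a URL."""
    return {
        "requests": [
            {
                # The image is referenced by URL (e.g. the Cloudinary link).
                "image": {"source": {"imageUri": image_url}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def extract_labels(response_body):
    """Pull the label descriptions out of a parsed JSON response."""
    first = (response_body.get("responses") or [{}])[0]
    return [label["description"] for label in first.get("labelAnnotations", [])]


def detect_labels(image_url, api_key):
    """POST the request and return the labels (needs a valid API key)."""
    payload = json.dumps(build_label_request(image_url)).encode("utf-8")
    request = urllib.request.Request(
        f"{VISION_ENDPOINT}?key={api_key}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_labels(json.load(response))
```

In a blocks-based builder, the same pieces would presumably map onto whatever web request component it offers: the endpoint plus key as the URL, POST as the method, the JSON from `build_label_request` as the body, and the `labelAnnotations` list as the part of the response to read.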