Image Recognition & Classification using Artificial Intelligence

I have been building an app that can classify ant species from a photo using Ximilar for AI image classification. (Thanks to @muneer, @manyone and @Jared_Gibb for pushing me to learn to use this tool in Thunkable and for their assistance!) My daughter knows a ton about ants and has started her own colony so she had this idea to make an app like LeafSnap.

Thunkable provides access to powerful APIs, but each API is implemented differently. A tutorial like this should help anyone who wants to learn to use Ximilar, but the steps here may not apply to other APIs. I did investigate other cloud-based AI image recognition tools but ultimately this one seemed the easiest and most straightforward to use. It’s also free to start with, although the free credits do run out pretty quickly: credits are used for API calls and also for each image you upload to train the AI engine.

Here’s the process I used to get started with Ximilar, set up categories for classification, train the AI engine, communicate with the API, and display the results in Thunkable. Note that I am only covering Custom Image Recognition, Categorization & Tagging. If you need help with other aspects of Ximilar, I am not promising to know anything about them or to help you with them!

Part 1 - Training the Ximilar AI Engine

1. Sign up for Ximilar

Go to https://app.ximilar.com and click Sign Up.

I use that URL whenever accessing Ximilar. For some reason, I find it hard to navigate from https://ximilar.com to the AI engine Dashboard.

I used my Google account to sign up.

2. Add a Custom Image Recognition

Click on Custom Image Recognition

image

Click on Categorization & Tagging

image

3. Add a Task

Click on Tasks in the sidebar

image

Click Create New Task:

image

Give your task a name. I recommend labeling it with something like “Step 1” which will be handy when we get to Flows later on:

image

Click on Categorization:

(You might be excited about Tagging which is also an option on this screen but as I mentioned above, I am only focusing on one aspect of Ximilar. Still, learning all of this may help you figure out other features of Ximilar.)

image

4. Add a Category

Okay, we have to pause for a moment so I can explain how I set up categories for classifying ant species. I’ll save you some time so you don’t have to figure this out on your own. I knew very little about AI going into this and I’m still learning a lot.

Initially, I set up three categories, one for each species of ant that I wanted to classify: Odorous House Ant, Lasius Niger, and Argentine Ant. There are obviously many more species of ants but I used those as testing categories. As I began to train and test the AI engine (more on that below!), I found something interesting was happening. Ximilar was doing a pretty good job of classifying ant photos into one of the three categories. Ximilar provides probability percentages so for any given ant photo, it might tell you that the photo is most likely an Argentine Ant with a probability of 72% but that it could also be an Odorous House Ant (21%) or a Lasius Niger ant (7%). Very cool! But um, what about something that very clearly is not an ant, such as a photo of a frisbee or a dog? So I gave it a photo of a frisbee and it happily told me that it was most definitely a Lasius Niger ant (67%) but very possibly could be an Argentine Ant (17%) or an Odorous House Ant (16%).

Why am I telling you all of this? Because it was my first mistake and my first major hurdle. How can I correctly classify ants if my AI can’t even tell that a frisbee or a house or a cloud is not an ant? I ended up contacting the good folks at Ximilar and they were very helpful in explaining the solution to me: I need to have two tiers of categorization. First, the AI should decide if my photo is an ant or not. Then it should figure out which species of ant. Okay, that made a lot of sense to me. But how?

That’s where Flows come into play. You can tell Ximilar to run a category check and then based on the result, run the next category check. But we’re getting ahead of ourselves! You still with me? Good! Here we go…
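If it helps to see the two-tier idea as logic before we build it in Ximilar, here’s a minimal Python sketch. The two classifier functions are hypothetical stand-ins I made up for illustration (they are not Ximilar calls and the numbers are invented); the point is only that the species check runs when, and only when, the first check says “Ant”.

```python
from typing import Callable, Dict, Optional

Scores = Dict[str, float]  # category name -> probability


def two_tier(photo: str,
             is_ant: Callable[[str], Scores],
             species: Callable[[str], Scores]) -> Optional[str]:
    """Run the species check only if the first check says 'Ant'."""
    first = is_ant(photo)
    if max(first, key=first.get) != "Ant":
        return None  # not an ant, so never ask for a species
    second = species(photo)
    return max(second, key=second.get)


# Hypothetical stand-ins for the two trained tasks (made-up numbers).
def fake_is_ant(photo: str) -> Scores:
    if "ant" in photo:
        return {"Ant": 0.9, "Not Ant": 0.1}
    return {"Ant": 0.05, "Not Ant": 0.95}


def fake_species(photo: str) -> Scores:
    # A species-only task always picks some species, even for a frisbee.
    return {"Lasius Niger": 0.67, "Argentine Ant": 0.17, "Odorous House Ant": 0.16}


print(two_tier("frisbee.jpg", fake_is_ant, fake_species))      # None
print(two_tier("ant_closeup.jpg", fake_is_ant, fake_species))  # Lasius Niger
```

Without the first tier, the frisbee would go straight to `fake_species` and come back as a Lasius Niger, which is exactly the mistake I ran into.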

Click on Create New:

image

Give your Category a name:

image

Don’t worry about the Description or Output name. I think the Output name could be pretty useful but I can’t get it to show up in the JSON responses so I’ve just ignored it and everything works fine without it.

You can either add photos now or just click the back arrow to go back. I’ll discuss adding photos later so it’s probably best to wait on that for now.

5. Add a second category

Create another new category and name it:

image

6. Add photos

You’ll need to add a variety of images that represent that category. So in this case, I would add a bunch of photos (Ximilar recommends at least 100 but suggests starting with at least 20) that are clearly not ants. What does a “not ant” look like? Well, I decided to add photos of frisbees and dogs, houses, clouds, dirt, people, cars, boats, trees, and whatever else I could think of. That worked pretty well. But after testing the AI engine extensively, I found that I needed more nuanced photos in that category, because Ximilar could easily figure out that a hamburger wasn’t an ant but it struggled to know for sure that a grasshopper wasn’t an ant.

And that makes sense. If the AI engine is trying to classify what it means to be an ant, it might easily confuse another insect with legs. Or it might think that a black spot of paint on a sidewalk is an ant (that happened before I gave it photos of sidewalks, dirt, roads, etc.). I know I’m talking a lot about ants right now and you may not care one bit about ants (or you may care somewhat but probably not as much as my daughter; I’ve learned so much from her!), but I think it helps to follow this example through from beginning to end since (a) I already have it working somewhat well and (b) it allows me to provide real-world examples of what is and is not working along the way.

Either click to add photos or drag them onto the bar that shows up:

image

After the images have completed uploading, you can see the updated photo count at the bottom-right of the list of images:

image

Go back and add photos to the other category you created. In this case, that’s the “Ant” category so I’ll want to upload photos of ants. It doesn’t much matter what types of ants I’m uploading at this point. I just want to make sure they are ants. If I mistakenly add a termite or a fly or a spider, that is going to end up confusing the AI engine. And I’m not entirely sure what constitutes a useful image. For example, should I include photos of a bunch of ants or only close-up individual ants? Should I include a photo of an ant in someone’s hand or will that make the AI engine think that people’s hands are also ants? I really don’t know! Like I said, I have a lot to learn about artificial intelligence and image recognition.

7. Train the AI engine

Just adding photos doesn’t really do anything. You have to tell Ximilar to train the AI engine. I keep using the term “AI engine.” Perhaps there is a better descriptor for this but that makes sense to me. Each time you train the AI, it evaluates the photos in the categories within that task. If you add more photos to a category, you should re-train the AI for that task.

Click on Tasks in the sidebar, click on the task (Step 1: Ant or Not Ant) and then click TRAIN:

image

Training the AI can take a long time. As in 20 minutes or more. And that’s just for 50-100 photos per category. I have no idea how long it takes if you give the task 1,000 photos or 10,000 photos.

The TRAIN button will change to a loading animation but I like to check the Models section below that:

When the training has completed, the Status will show “Trained.” This tells you that the task is ready to be used. The Accuracy column will show a percentage for how accurate the task should be in determining if a photo is an ant or not. Click on ACTIVATE and then click YES:

image

8. Testing the task

We’ve arrived at the exciting moment when we get to give the AI a photo and see if it can tell if it is an ant or not. This is what my daughter calls “the computer acting like a 2-year-old.” The fact that the AI initially had trouble determining that a frisbee wasn’t an ant and, more surprisingly, that it was certain the frisbee was a Lasius Niger ant was not lost on my kid’s intellect and humor.

Are you wondering what all of this has to do with Thunkable? Worry not! We will get there. There’s a lot of setup that needs to happen in Ximilar but honestly, once it’s set up, it’s really easy to improve the AI capabilities by just dragging and dropping a whole bunch of photos into each category and re-training the task. The app I built can now take a photo using the phone’s camera and tell me if it’s an ant or not and fairly reliably what species of ant it is.

As an aside, here’s a funny story which you’re welcome to skip if you’re getting tired of all my banter… I was feeling better about Ximilar being able to correctly classify ants after I set up a Flow and provided the task with enough photos but I still wasn’t convinced it was working well enough. So I took a photo of a patch of dirt outside of my office. I ran it through the classification in my app and it told me – surprisingly – that the patch of dirt was an ant and it even told me which species (I forget now which one it was). Well, I was quite disappointed. I got home from work and told my daughter that the app was working pretty well but it needed improvement. She said to me “Can I see the photo?” I asked her if she meant the photo of dirt. She said yes and I passed my phone to her. After about 5 seconds, she said “There’s an ant.” “What?!” I exclaimed. “Where?!” She zoomed in and pointed to a spot in the dirt where, sure enough, there was an ant. So I guess the AI was working. I also know my daughter is pretty good at spotting ants and… not ants.

Okay, aside over.

Click on Classify in the sidebar. Then drag an image onto the bar or paste in an image URL and click Process:

Scroll down to see the results:

Uh… Ximilar thinks this cute dog is an ant. And it tells me it has a 61% probability of being right. So clearly, I need to give it more photos of “ants” and “not ants”. For this test, I used about 50 of each. For my current app, I am using 100-300 images for each category and it’s working much better.

There’s also a JSON response. This is unbelievably helpful if you are familiar with JSON and especially if you have used APIs with Thunkable before. In fact, I just lost many of you on this long-winded tutorial because you (I see you sneaking away…!) are suddenly setting up API blocks to parse out the JSON properties in Thunkable. That’s what I’d be doing right about now. But we’ll get to that in a moment.

But what good is a JSON response without knowing the correct way to call the API? Well, the Classify feature also provides a curl command:

This can be used to test the API response in Terminal on a Mac, in Postman, or better yet to access the API from within Thunkable.
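If you’d rather test from a script, the same request can be sketched in Python. Treat everything specific here as an assumption: the endpoint URL, the `Authorization: Token ...` header format and the `task_id`/`records` body shape are my best understanding, and the token and task ID are placeholders. Copy the real values from the curl command that Classify shows for your own task. The sketch only builds the request; it doesn’t send it.

```python
import json
import urllib.request

# Placeholders: copy the real values from the curl command in Classify.
API_URL = "https://api.ximilar.com/recognition/v2/classify"  # assumed endpoint
API_TOKEN = "YOUR_API_TOKEN"
TASK_ID = "YOUR_TASK_ID"


def build_request(image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one image URL."""
    body = {"task_id": TASK_ID, "records": [{"_url": image_url}]}  # assumed shape
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Token {API_TOKEN}",  # assumed header format
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("https://example.com/ant.jpg")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req), then json.load() the response.
```

In Thunkable, the same three pieces (URL, headers, body) go into the Web API component’s properties and blocks.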

Here’s how I’ve set up the blocks in Thunkable:

Now here’s where I get a little lazy. After carefully walking you through every step of the process, I’ve skipped way ahead. The Web API’s URL includes the word “flows” which means that it’s using a flow I set up. But I haven’t shown you that process yet. So these blocks will not work for you. You’re just gonna have to trust me and set up the rest of this and eventually, they will work.

8b. Tell me about flows!!!

Whoa… we’re getting there. Remember how I talked about a two-tiered categorization process? First, we have to see if the photo is an ant or not and then we have to figure out what species it is. For now, we’re going to work on that second task and eventually we’ll combine them into a single flow that we can access with an API call.

9. Add another task

You’re going to need to create another Task. Call this one “Step 2: Ant Species”:

image

Click to add a new category within that task. Give it a name:

image

Add a bunch of photos that are representative of that category. You’ll want to be a little careful here. If you just drag in a bunch of photos of ants that you think are Odorous House Ants (yes, they are really called that and for a specific reason; Google it), you may end up confusing the AI engine. So… be careful to assign only photos that are accurate representations of that category. This was less important with that first task because it doesn’t really matter whether a photo shows a motorbike or a bicycle or an electric scooter… the AI should still be able to tell that it’s not an ant. But now that we’re at the level of species classification (or maybe you’re trying to have it recognize Hondas vs. Teslas or shoes with laces vs. shoes without laces), it’s important that you’re sure that the photos you provide for this category are actually supposed to go in this category.

10. Add more categories

Now repeat this process to create at least one other category in the second task. Using my example, that would be “Lasius Niger” and “Argentine Ant”.

Add photos to those categories as well.

11. Train the second task

Click on Tasks in the sidebar, click on “Step 2: Ant Species” and then click TRAIN. Go have some coffee or go for a walk. When you come back, scroll down to check if the model has finished training and if so, click ACTIVATE.

12. You’re ready for flows. Really?! Yes, really.

Flows are so powerful. I love flows. Click on Flows in the sidebar and then click on Flow Definitions:

image

Click Create New Flow:

image

Give your flow a name. I like to call mine something that shows the steps for classifying an image:

image

Click “Add the First Step”:

image

Click “Branch Selector”:

image

I found that by trial and error. It was the only one that made sense and allowed me to set up the two-tiered approach. I have no idea what all of the options there do. If you find a good use for them, let me know. :wink:

Click Choose Recognition Task:

image

Click “Step 1: Ant or Not Ant” (the name of your first task):

image

It’s going to complain about needing an output field name so type something to the right of the task name (I used “Ant or Not”):

image

Click on the task name and then click on “Ant” (or the name of whichever category is the key one for your classification; if your flow was “Car or Not Car → Car Brand Name” then you would click “Car” here):

image

Click on “Branch Selector”:

image

And then click on “Choose Recognition Task”:

image

Click on “Step 2: Ant Species” (or whatever your second task is called):

image

Be sure to give your output field a name:

image

Your flow is ready to be tested!

13. Test your flow

In the sidebar, click on Flows and then on Test Flows.

Note: Your first and second tasks must have completed training and be activated for you to be able to test your flow.

Drag an image onto the bar or paste in an image URL and then click “Test Flow”:

While you can and should test “not ant” images, it’s going to be much more interesting to give the AI engine an “ant” image and see if it can correctly identify it as an ant and then also correctly identify its species.

When you test a task using the Classify feature, it shows you percentages for each category. The Test Flow command does not do that. Instead, you have to look at the JSON Response:

{
  "records": [
    {
      "_url": "https://upload.wikimedia.org/wikipedia/commons/3/3f/Linepithema_Argentine_ant.jpg",
      "_status": {
        "code": 200,
        "text": "OK",
        "request_id": "f4d0b0fe-1ad9-4380-9c6d-81ab233cbed5"
      },
      "_id": "a90578f0-2e2e-492e-a54a-e9a6a116f7fd",
      "_width": 541,
      "_height": 434,
      "Ant or Not": [
        {
          "prob": 0.95225,
          "name": "MAIN: Ants",
          "id": "a6fb69da-6bfb-4d7c-9d60-ece512c5830e"
        },
        {
          "prob": 0.04775,
          "name": "MAIN: Not Ants",
          "id": "7cbaee93-798a-4ae5-bd0a-97614205dc39"
        }
      ],
      "Ant Species": [
        {
          "prob": 0.6526,
          "name": "SPECIES: Lasius Niger",
          "id": "9c54fb93-fccb-4bfc-b688-d94607cb4cdc"
        },
        {
          "prob": 0.17538,
          "name": "SPECIES: Argentine Ant",
          "id": "7716d8ea-0723-474f-9faf-7cceafc9ef1e"
        },
        {
          "prob": 0.17202,
          "name": "SPECIES: Odorous House Ant",
          "id": "b503dfb2-d63d-4508-80e4-acccd0f89634"
        }
      ]
    }
  ],
  "flow": "c145d0fb-39b2-401a-ca80-d97631c027e5",
  "version": "2022-04-26 04:04:08.521753+00:00",
  "model_format": "",
  "status": {
    "code": 200,
    "text": "OK",
    "request_id": "f4d0b0fe-1ad9-4380-9c6d-81ab233cbed5",
    "proc_id": "d91eec50-6af6-46c2-a946-40d9c4c436b6"
  },
  "statistics": {
    "processing time": 0.7643306255340576
  }
}

I pasted the JSON as text above instead of a screenshot for two reasons – (1) It simply doesn’t fit in a single screenshot and (2) any time you are working with JSON and especially if you need help on the forums, it’s best to post the response as text and to include three back ticks ``` before and after the text.

Full disclosure: when doing this with my test set of images (about 30 per category) that I created for this tutorial, the flow failed. It incorrectly classified an Argentine ant as “Not Ant.” Oops! If that happens to you and you have your flow set up the way I do, you won’t get the second part (task two) of the JSON response.

So I switched to my production version of my flow that has 100-300 images per category and it did much, much better, as you can see in the JSON response above.

14. Parsing the JSON (manually)

If you’ve gotten this far but haven’t dealt with JSON responses before, I highly recommend taking the time to read/watch through my API tutorial and then coming back to this tutorial.

Alright, assuming you understand how to read through JSON, the next step is to copy the JSON response from Ximilar and paste it into a JSON viewer such as Best JSON Viewer and JSON Beautifier Online.

Then we can see some very interesting things on the right side of that website:

The “prob” property is a value from 0 to 1 representing the probability that the image belongs to that category. If you multiply that by 100, you get a percentage from 0% to 100%. The highlighted property “Ant or Not” in the screenshot is the output field name I typed for my first task when setting up the flow (the task itself is “Step 1: Ant or Not Ant”). Below that is the output field for my second task, “Ant Species” (the task is “Step 2: Ant Species”).

So what did the AI engine figure out? The probability this image is an ant is 95% (0.95225). That’s really high! The probability it’s a Lasius Niger ant is 65% (0.6526), the probability it’s an Argentine ant is 18% (0.17538) and the probability that it’s an Odorous House Ant is 17% (0.17202). The three species probabilities add up to 100%, or to 1 if you haven’t converted to percentages (0.6526 + 0.17538 + 0.17202 = 1).
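Here’s that arithmetic as a short Python sketch, using the species values from the JSON response above (I’ve left out the ids to keep it short). The `to_percent` helper is just a name I made up for the example.

```python
import math

# The "Ant Species" list from the JSON response above (ids omitted).
species = [
    {"prob": 0.6526, "name": "SPECIES: Lasius Niger"},
    {"prob": 0.17538, "name": "SPECIES: Argentine Ant"},
    {"prob": 0.17202, "name": "SPECIES: Odorous House Ant"},
]


def to_percent(prob: float) -> int:
    """Convert a 0-1 probability to a whole-number percentage."""
    return round(prob * 100)


for item in species:
    print(f'{item["name"]}: {to_percent(item["prob"])}%')

# The probabilities within one task should sum to 1 (allowing for float rounding).
total = sum(item["prob"] for item in species)
print(math.isclose(total, 1.0))  # True
```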

Part 2 - Handling the API response in Thunkable

15. Setting up blocks to call the Ximilar API

I’ll be using the newer Drag & Drop interface for this tutorial. If you haven’t worked with APIs yet, I’m going to again recommend my API tutorial. I’m not going to explain all about APIs here so having that background will be useful.

I posted a screenshot of the Web API POST call blocks above but here it is again:

The second part of that Web_API1's Post block is the “then do” section that runs when the API returns a response:

You really have many options in this section. But to start with, you should aim to get a value from a property of the JSON response and display it in a label. If you’ve seen some of my answers on the forums, you’ll notice that I often tell people “keep it simple.” Before you try to make a Data Viewer List with the JSON data or assign it to Firebase users or anything really complicated, try to just get it to work. You can always get fancy later! But if you throw in too many blocks, it gets really hard to figure out what is causing the problem when things don’t work. Okay, enough soap box!

This is the key to parsing the JSON data, at least to begin with:

image

That set of blocks returns the name of the category with the highest probability in the first task. Huh? Sorry if that was hard to follow. Remember that the JSON response when formatted looks something like this:

Ximilar’s API sorts the items in the first task (“Ant or Not”) by probability, highest first. So the first list item will always have the higher probability (“prob” property) and its “name” property will tell us which category that is.
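If you’re working with the response outside of Thunkable and would rather not depend on that ordering, you can pick the item with the largest “prob” explicitly. This is a minimal Python sketch; the response text is a trimmed copy of my JSON that I’ve deliberately put in the “wrong” order, and `top_category` is a helper name I made up.

```python
import json

# Trimmed copy of the response, deliberately reordered for the example.
response_text = """
{"records": [{"Ant or Not": [
    {"prob": 0.04775, "name": "MAIN: Not Ants"},
    {"prob": 0.95225, "name": "MAIN: Ants"}
]}]}
"""


def top_category(response: dict, field: str) -> dict:
    """Return the item with the highest "prob" in the given output field."""
    items = response["records"][0][field]
    return max(items, key=lambda item: item["prob"])


best = top_category(json.loads(response_text), "Ant or Not")
print(best["name"], best["prob"])  # MAIN: Ants 0.95225
```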

Okay, so what are we supposed to do with those blocks then? Start by attaching them to a label block:

If that works and your label shows the value from the JSON response as expected, you can start to use that value in other ways such as using an If/else block to check if the image we submitted is an Ant or Not Ant:

Note: Your property values may differ from mine! I apologize for switching it up… my earlier screenshots were from my sample project I worked with for this tutorial. My later screenshots are from my working app which is a bit more sophisticated now. So be sure to double-check your formatted JSON response to see the exact spelling of your property values.
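For anyone who thinks better in text than in blocks, the If/else check looks roughly like this in Python. It’s a sketch under a few assumptions: the field names (“Ant or Not”, “Ant Species”) and the category name “MAIN: Ants” come from my JSON response, so swap in your own, and the “not ant” record below uses made-up numbers. It also covers the case I mentioned earlier where the second task is missing from the response because the flow never branched to it.

```python
ANT_LABEL = "MAIN: Ants"  # must match your own category name exactly


def describe(record: dict) -> str:
    """Turn one record from the flow response into a short message."""
    first = record["Ant or Not"][0]  # first item has the highest prob
    if first["name"] != ANT_LABEL:
        return "Not an ant"
    species = record.get("Ant Species")  # missing when the branch didn't run
    if not species:
        return "Ant (species unknown)"
    return f'Ant: {species[0]["name"]}'


ant = {
    "Ant or Not": [{"prob": 0.95225, "name": "MAIN: Ants"},
                   {"prob": 0.04775, "name": "MAIN: Not Ants"}],
    "Ant Species": [{"prob": 0.6526, "name": "SPECIES: Lasius Niger"}],
}
# Made-up example of a response where the second task never ran.
not_ant = {
    "Ant or Not": [{"prob": 0.9, "name": "MAIN: Not Ants"},
                   {"prob": 0.1, "name": "MAIN: Ants"}],
}

print(describe(ant))      # Ant: SPECIES: Lasius Niger
print(describe(not_ant))  # Not an ant
```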

Okay, I’m going to take a deep breath here because this has gotten very, very long and I think I’m almost done. But also because a common question on the forums is “how do I get more than one value at a time from a list (aka array) in the JSON response?” In terms of Ximilar, those values are name and probability (“prob”) properties within the second task:

I’m going to post a screenshot of the blocks I’m using but this is a warning that they are complicated and I’m not going to explain them all in this tutorial, at least not right now. I’m out of energy! :rofl:

That’s going to add a list item for each entry in the second task and that list item will contain the name and probability of each category in the second task. Later, I’ll use the make text from list block in the List drawer to convert the list to a text string and… you guessed it… display it in a label.
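The same idea, written out as a Python sketch rather than blocks: loop through the second task’s list, build one “name: percent” string per entry, then join the list into a single text string. The record below is the “Ant Species” part of my JSON response with the ids removed, and `species_lines` is a name I made up for the example.

```python
record = {
    "Ant Species": [
        {"prob": 0.6526, "name": "SPECIES: Lasius Niger"},
        {"prob": 0.17538, "name": "SPECIES: Argentine Ant"},
        {"prob": 0.17202, "name": "SPECIES: Odorous House Ant"},
    ]
}


def species_lines(record: dict, field: str = "Ant Species") -> list:
    """Build one 'name: percent' string per category in the given field."""
    lines = []
    for item in record.get(field, []):  # empty list if the field is missing
        lines.append(f'{item["name"]}: {item["prob"] * 100:.0f}%')
    return lines


# Equivalent of "make text from list": join the items with line breaks.
text = "\n".join(species_lines(record))
print(text)
```

Using `record.get(field, [])` means a “not ant” response, which has no second task, simply produces an empty list instead of an error.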

Well, that’s the end of the tutorial. I hope you enjoyed it and I can’t wait to see what you make using Ximilar and Thunkable!

Further Considerations

Please don’t ask me to share a project link for this tutorial. Yes, I have a project with all of this inside. But I put in dozens of hours getting all of this to work and the project contains my private tokens, and creative approaches to handling data and displaying it. I value that time and I’m not willing to give it away for free. You’re just going to have to put in the time to create this yourself. Heck, this tutorial alone took me several hours to make. I like to give back to this community because others have helped me along the way and I hope that if you find this useful, you’ll also find a way to give back to someone else here.

And honestly, you’ll learn so much more and be better at it by creating this from scratch – with the help above – than by simply using what I’ve made. And if that sounds like a teacher’s voice well, I’m a teacher so there you go! :wink:

Some thoughts beyond the tutorial…

Well, what if you want to classify ant species but you also want to be able to tell if something is another type of insect? For example, say I take a photo of a butterfly. I might want the app to be able to tell me it’s a butterfly but then also tell me what species of ant another photo is. How would I do that? I would need a three-tier/task approach. So the first task in the flow would determine “insect or not”, the second task would branch to (assuming it’s an insect) “ant or not” and the third task (assuming it’s an ant) would branch to “Lasius Niger or Odorous House Ant or Argentine Ant”. I’m pretty sure that’s possible and I’m also pretty sure there are other ways to organize that logic. But it would be possible to make a pretty sophisticated image classification app given enough time and enough image input (photos).
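As a rough sketch of how that three-tier logic might be organized (in Python, not in Ximilar), you could treat each “gate” task as a classifier plus the category that lets the photo continue. Everything here is hypothetical: the three classifier functions are stand-ins with made-up numbers, and this is just one of several ways to arrange it.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]  # category name -> probability
# Each gate tier: (classifier, category that lets the photo continue)
Tier = Tuple[Callable[[str], Scores], str]


def run_tiers(photo: str, tiers: List[Tier],
              final: Callable[[str], Scores]) -> List[str]:
    """Walk the gate tiers in order; stop at the first one that doesn't pass."""
    path = []
    for classify, pass_label in tiers:
        scores = classify(photo)
        best = max(scores, key=scores.get)
        path.append(best)
        if best != pass_label:
            return path
    scores = final(photo)
    path.append(max(scores, key=scores.get))
    return path


# Hypothetical stand-ins for three trained tasks (made-up numbers).
def insect_or_not(photo: str) -> Scores:
    if photo == "frisbee.jpg":
        return {"Insect": 0.1, "Not Insect": 0.9}
    return {"Insect": 0.9, "Not Insect": 0.1}


def ant_or_not(photo: str) -> Scores:
    if photo == "ant.jpg":
        return {"Ant": 0.8, "Not Ant": 0.2}
    return {"Ant": 0.2, "Not Ant": 0.8}


def ant_species(photo: str) -> Scores:
    return {"Lasius Niger": 0.65, "Argentine Ant": 0.18, "Odorous House Ant": 0.17}


gates = [(insect_or_not, "Insect"), (ant_or_not, "Ant")]
print(run_tiers("frisbee.jpg", gates, ant_species))    # ['Not Insect']
print(run_tiers("butterfly.jpg", gates, ant_species))  # ['Insect', 'Not Ant']
print(run_tiers("ant.jpg", gates, ant_species))        # ['Insect', 'Ant', 'Lasius Niger']
```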


:open_mouth:WHOA! this is an awesome tutorial! (and it’s just part 1!). i’m getting ant-sy already, i’m itching for part 2 (excuse the puns)! thanks in advance for doing this! you are truly a teacher. you write good!


Awesome work and excellent tutorial.
