Retrieve text from JSON object

Hi everyone,

I get a “something has gone wrong. check your blocks and reset this page” error when I run the following block. The response is from Azure OCR, and I’m trying to retrieve all the ‘text’ values, which is the equivalent of this Python code:

analysis = response_final.json()
time.sleep(1)  # Wait for a second to avoid overloading the server.

# Extract the recognized text from the response.
lines = [line["text"] for line in analysis["analyzeResult"]["readResults"][0]["lines"]]
for line in lines:
    print(line)

I’m quite new to Thunkable. **I believe the “for each item k in list” block is causing the error**, but I’m not sure how to fix it. I hope someone can shed some light here.

Thanks!!

This is the JSON response:

{‘status’: ‘succeeded’, ‘createdDateTime’: ‘2023-07-07T07:39:34Z’, ‘lastUpdatedDateTime’: ‘2023-07-07T07:39:35Z’, ‘analyzeResult’: {‘version’: ‘3.0.0’, ‘readResults’: [{‘page’: 1, ‘angle’: 1.3974, ‘width’: 3024, ‘height’: 4032, ‘unit’: ‘pixel’, ‘lines’: [{‘boundingBox’: [2482, 63, 2863, 75, 2860, 164, 2480, 150], ‘text’: ‘message’, ‘words’: [{‘boundingBox’: [2482, 64, 2860, 75, 2855, 165, 2480, 150], ‘text’: ‘message’, ‘confidence’: 0.984}]}, {‘boundingBox’: [1006, 563, 1406, 580, 1403, 667, 1003, 647], ‘text’: ‘set app’, ‘words’: [{‘boundingBox’: [1006, 564, 1170, 571, 1168, 656, 1004, 647], ‘text’: ‘set’, ‘confidence’: 0.987}, {‘boundingBox’: [1198, 572, 1384, 579, 1382, 667, 1195, 657], ‘text’: ‘app’, ‘confidence’: 0.986}]}, {‘boundingBox’: [1516, 576, 1874, 578, 1872, 673, 1514, 668], ‘text’: ‘variable’, ‘words’: [{‘boundingBox’: [1517, 577, 1871, 579, 1867, 674, 1515, 666], ‘text’: ‘variable’, ‘confidence’: 0.983}]}, {‘boundingBox’: [2029, 585, 2164, 593, 2159, 681, 2026, 675], ‘text’: ‘64’, ‘words’: [{‘boundingBox’: [2041, 586, 2161, 592, 2156, 682, 2036, 675], ‘text’: ‘64’, ‘confidence’: 0.559}]}, {‘boundingBox’: [2173, 611, 2950, 617, 2949, 719, 2173, 710], ‘text’: ‘post_message "’, ‘words’: [{‘boundingBox’: [2180, 611, 2828, 619, 2832, 719, 2182, 704], ‘text’: ‘post_message’, ‘confidence’: 0.939}, {‘boundingBox’: [2852, 619, 2943, 618, 2947, 718, 2856, 719], ‘text’: ‘"’, ‘confidence’: 0.851}]}, {‘boundingBox’: [976, 1130, 2412, 1158, 2410, 1252, 975, 1218], ‘text’: “set post api text 's Text to”, ‘words’: [{‘boundingBox’: [976, 1131, 1157, 1131, 1156, 1224, 976, 1220], ‘text’: ‘set’, ‘confidence’: 0.983}, {‘boundingBox’: [1174, 1131, 1360, 1133, 1359, 1229, 1173, 1224], ‘text’: ‘post’, ‘confidence’: 0.985}, {‘boundingBox’: [1378, 1133, 1512, 1136, 1510, 1232, 1377, 1229], ‘text’: ‘api’, ‘confidence’: 0.979}, {‘boundingBox’: [1529, 1136, 1744, 1142, 1743, 1237, 1528, 1232], ‘text’: ‘text’, ‘confidence’: 0.955}, {‘boundingBox’: [1837, 1145, 1954, 1149, 1952, 
1242, 1836, 1240], ‘text’: “'s”, ‘confidence’: 0.983}, {‘boundingBox’: [1989, 1150, 2204, 1160, 2201, 1248, 1987, 1243], ‘text’: ‘Text’, ‘confidence’: 0.965}, {‘boundingBox’: [2302, 1165, 2413, 1171, 2410, 1252, 2300, 1250], ‘text’: ‘to’, ‘confidence’: 0.94}]}, {‘boundingBox’: [2592, 1189, 2976, 1190, 2976, 1277, 2592, 1276], ‘text’: ‘generate’, ‘words’: [{‘boundingBox’: [2593, 1191, 2973, 1192, 2971, 1278, 2594, 1276], ‘text’: ‘generate’, ‘confidence’: 0.981}]}, {‘boundingBox’: [359, 1881, 1262, 1900, 1259, 1996, 358, 1971], ‘text’: ‘when Web_Viewer1’, ‘words’: [{‘boundingBox’: [359, 1881, 619, 1884, 618, 1978, 359, 1973], ‘text’: ‘when’, ‘confidence’: 0.986}, {‘boundingBox’: [668, 1885, 1261, 1906, 1256, 1997, 666, 1979], ‘text’: ‘Web_Viewer1’, ‘confidence’: 0.715}]}, {‘boundingBox’: [1526, 1920, 2278, 1948, 2275, 2043, 1523, 2010], ‘text’: ‘Receives Message’, ‘words’: [{‘boundingBox’: [1526, 1921, 1885, 1932, 1883, 2022, 1523, 2009], ‘text’: ‘Receives’, ‘confidence’: 0.982}, {‘boundingBox’: [1902, 1932, 2278, 1950, 2276, 2044, 1900, 2023], ‘text’: ‘Message’, ‘confidence’: 0.985}]}, {‘boundingBox’: [2656, 2136, 3021, 2151, 3019, 2228, 2654, 2217], ‘text’: ‘message’, ‘words’: [{‘boundingBox’: [2657, 2136, 3016, 2152, 3016, 2228, 2654, 2216], ‘text’: ‘message’, ‘confidence’: 0.984}]}…}]}]}]}}

It’s a little hard for me to tell but it seems like the property “lineText” might be a sub-property of the property “readResults”. So you’d need to set up the blocks differently.

I can’t tell for sure because the JSON you posted is invalid. Can you paste the full JSON response into https://codebeautify.org/jsonviewer and make sure it’s valid? And then post the updated JSON here?
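If pasting into an online viewer isn't convenient, the same validity check can be run locally. The sketch below uses only Python's standard `json` module; `check_json` is a hypothetical helper name chosen for this example, not something from Thunkable or Azure.

```python
import json

def check_json(text):
    """Return (True, parsed) if text is valid JSON, else (False, message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the first character the parser rejected.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"

# Double-quoted keys and strings are valid JSON.
ok, result = check_json('{"status": "succeeded"}')
print(ok, result)

# Single-quoted keys (as in a printed Python dict) are not valid JSON.
ok, result = check_json("{'status': 'succeeded'}")
print(ok, result)
```

The reported line and column are usually enough to spot a stray quote or a truncated response.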

Thanks, @tatiang
Here’s the sample JSON response. I would like to retrieve all the “text” values.
I have another related question: I set an app variable to the full JSON response and set the text to that app variable. I noticed the displayed text gets cut off and doesn’t contain the full JSON response. Is there a limit on the text length in an app variable or in displayed text?
Thanks!

{
“status”: “succeeded”,
“createdDateTime”: “2021-02-04T06:32:08.2752706+00:00”,
“lastUpdatedDateTime”: “2021-02-04T06:32:08.7706172+00:00”,
“analyzeResult”: {
“version”: “3.2”,
“readResults”: [
{
“page”: 1,
“angle”: 2.1243,
“width”: 502,
“height”: 252,
“unit”: “pixel”,
“lines”: [
{
“boundingBox”: [
58,
42,
314,
59,
311,
123,
56,
121
],
“text”: “Tabs vs”,
“appearance”: {
“style”: {
“name”: “handwriting”,
“confidence”: 0.96
}
},
“words”: [
{
“boundingBox”: [
68,
44,
225,
59,
224,
122,
66,
123
],
“text”: “Tabs”,
“confidence”: 0.933
},
{
“boundingBox”: [
241,
61,
314,
72,
314,
123,
239,
122
],
“text”: “vs”,
“confidence”: 0.977
}
]
}
]
}
]
}
}

Are you displaying the JSON response in a Text Input component? I haven’t had a problem with that getting cut off. If it does, you might need to use a browser or a third-party tool like Postman to get the full response.

I have a feeling that you copied the JSON string from a formatted page.
I loaded your string in Notepad and changed every left double quote and right double quote (which are different characters in printed documents!) to a regular straight quote. Then I formatted it as pre-formatted text and got this:

{
  "status": "succeeded",
  "createdDateTime": "2021-02-04T06:32:08.2752706+00:00",
  "lastUpdatedDateTime": "2021-02-04T06:32:08.7706172+00:00",
  "analyzeResult": {
    "version": "3.2",
    "readResults": [
      {
        "page": 1,
        "angle": 2.1243,
        "width": 502,
        "height": 252,
        "unit": "pixel",
        "lines": [
          {
            "boundingBox": [58, 42, 314, 59, 311, 123, 56, 121],
            "text": "Tabs vs",
            "appearance": {
              "style": {
                "name": "handwriting",
                "confidence": 0.96
              }
            },
            "words": [
              {
                "boundingBox": [68, 44, 225, 59, 224, 122, 66, 123],
                "text": "Tabs",
                "confidence": 0.933
              },
              {
                "boundingBox": [241, 61, 314, 72, 314, 123, 239, 122],
                "text": "vs",
                "confidence": 0.977
              }
            ]
          }
        ]
      }
    ]
  }
}

You may have a stray illegal quote mark somewhere in your block, or maybe your returned JSON string has an illegal double-quote character somewhere.
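For anyone who wants to do the quote cleanup described above without a text editor, here is a minimal Python sketch. The helper names `straighten_quotes` and `find_smart_quotes` are made up for this example, and the approach assumes the typographic quotes only appear where JSON expects straight ones.

```python
import json

# Map typographic quotes to their ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}

def straighten_quotes(text):
    """Replace every typographic quote in text with a straight quote."""
    return text.translate(str.maketrans(SMART_QUOTES))

def find_smart_quotes(text):
    """Return (index, character) pairs for every typographic quote in text."""
    return [(i, ch) for i, ch in enumerate(text) if ch in SMART_QUOTES]

# A tiny stand-in for JSON copied from a formatted page.
pasted = "{\u201cstatus\u201d: \u201csucceeded\u201d}"
print(find_smart_quotes(pasted))              # where the offending characters are
print(json.loads(straighten_quotes(pasted)))  # parses once the quotes are fixed
```

One caveat: if a string value legitimately contains a curly quote, blindly straightening it would break that string, so it is worth checking the reported positions first. Straightening also won't rescue a response that uses single quotes around keys, since JSON requires double quotes.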

As @tatiang suggested, show us the block that performs the call, then show us the full response as pre-formatted text.

Thanks, @manyone, I didn’t mention it but I did format the text that was posted above using the </> button in the forums toolbar (to remove smart quotes) but it still wasn’t valid JSON.

Using the re-formatted JSON that @manyone posted, I was able to view the JSON tree as follows:

When you click the “text” property in the “words” property list, the path to that property is shown at the top. Translating that path into Thunkable blocks is a pain but possible; I prefer to use a property string. The property you’d get from the JSON object is “analyzeResult.readResults[1].lines[1].words[**1**].text”, because JSON lists are numbered starting at 0 but Thunkable lists are numbered starting at 1. To iterate through the “words” list, use a variable in place of the last 1, which I bolded above. Set its starting value to 1 and increment it by 1 each time through the loop. Use a Join text block to connect the first part of the property string + the variable + the last part.
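To make the indexing concrete, here is a rough Python sketch of the same idea. The `sample` object is a trimmed copy of the JSON posted earlier in the thread, and `line_texts` and `word_paths` are hypothetical helper names. The 1-based property strings follow the convention described in this post and are built the same way the Join text block would build them; this is an illustration, not Thunkable code.

```python
import json

# Trimmed version of the sample response from this thread.
sample = json.loads("""
{"analyzeResult": {"readResults": [{"lines": [
  {"text": "Tabs vs",
   "words": [{"text": "Tabs"}, {"text": "vs"}]}
]}]}}
""")

def line_texts(response):
    """Collect the "text" of every line on every page (Python lists are 0-based)."""
    return [
        line["text"]
        for page in response["analyzeResult"]["readResults"]
        for line in page["lines"]
    ]

def word_paths(response, page=0, line=0):
    """Build a 1-based property string for each word of one line."""
    words = response["analyzeResult"]["readResults"][page]["lines"][line]["words"]
    # First part of the property string, shifted from 0-based to 1-based.
    prefix = f"analyzeResult.readResults[{page + 1}].lines[{line + 1}].words["
    # Join: first part + loop variable (starting at 1) + last part.
    return [prefix + str(k) + "].text" for k in range(1, len(words) + 1)]

print(line_texts(sample))
print(word_paths(sample))
```

The same counting pattern applies one level up: putting the variable in the `lines[...]` position instead of `words[...]` would walk the line-level “text” values, which is what the original Python snippet retrieves.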


(pardon me for cluttering up the thread with junk! where did that come from!?).

I saw that and wasn’t sure what it was. :yum:

No worries… happy Saturday!