1. ## A Pragmatic Approach to Reading Text with Tesseract

Foreword | Introduction

Despite writing this post, I know next to nothing about how Tesseract works its magic under the hood,
and I still have a lot to learn about it and about OCR in general. That said, you don't need to know
much about how it works to use it. I remember feeling like Tesseract was this sort of super complex
thing that I wouldn't have time for. This tutorial only aims to provide enough 'activation energy' to
get started with reading text with Tesseract (as it did for me), and enough is very little. That's
what I'm going to try to show today.

Enjoy!

Basic Idea

We'll be using a single function call, tesseractGetText(), which takes a TBox (the area that will
be searched for text) and a filter constant that serves as the 'criteria' for accurately detecting
our text. It returns a string.

In this tutorial, I'm going to show you an alternative method to get prayer info for your
character (convenient for a tutorial, I suppose), and ways to massage the data (if applicable)
into something you can use.

Tools Used

We'll be using the TesseractTool by @Olly to help us test and create our filters.

Screenshot software: I personally like ShareX, but for the purposes of this tutorial, I'll be
using the Windows Snipping Tool. This will be used in tandem with the TesseractTool.

Step 1: Screencap

So, we know that we are going to be trying to read in our prayer info. It's got fairly contrasting
colors between the background and font, which makes it a very suitable victim for Tesseract.

Target acquired.

I'm going to be trying to read in both sides of the '/' there, getting both current and maximum
prayer points. To get this process underway, we're going to screencap just about the tightest
crop we can of the text we want to read. In this case, I'm going to go ahead and use Snipping Tool
to get a capture of it.

I took a screencap of myself screencapping.

After getting a good clean capture, go ahead and Right Click -> Copy from Snipping Tool. Don't use save
with the Snipping Tool: the image becomes too compressed and loses a bit of its quality.

Man, I really oughtta train prayer someday.

From there, you're going to want to simply tab back to your instance of TesseractTool (referred
to as 'TT' from now on), and CTRL+V-slap that copied image right in there to begin carving out your filter.

Blown away.

Step 2: Make Filter

Anyways, here comes the sorta trial-and-error part of this process. From here, you need to essentially
play around with these settings until you get something that accurately resembles what you're
looking for. You're going to stretch, squeeze, and carve your way to victory, and I believe in you.

The 'Resize' area of the TT deals with scaling the width or height up from the 1:1 ratio you start at.
This is useful for creating more pixels for the tool to work with. Remember, the filter we're going to
create from this really just amounts to helping a math function. You set which scale parameters you want
to try out, then hit the resize button. NOTE: Bigger isn't always better.
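To make the 'more pixels' idea a bit more concrete, here's a minimal sketch in Python (not Simba) of scaling a tiny image by separate width and height factors. The function name `upscale` and the list-of-rows image format are made up for illustration, and it uses plain nearest-neighbour repetition; I don't know which resampling method the TT actually uses, so treat that part as an assumption.

```python
def upscale(pixels, scale_w, scale_h):
    """Nearest-neighbour upscale of a 2D grid (a list of rows).

    ASSUMPTION: plain pixel repetition, which may not match the TT's resize.
    """
    if scale_w < 1 or scale_h < 1:
        raise ValueError("scales must be >= 1")
    out = []
    for row in pixels:
        # Repeat every pixel scale_w times to widen the row...
        wide_row = [value for value in row for _ in range(scale_w)]
        # ...then repeat the widened row scale_h times to make it taller.
        out.extend(list(wide_row) for _ in range(scale_h))
    return out


tiny = [[0, 255],
        [255, 0]]
big = upscale(tiny, 3, 2)  # 3x wider, 2x taller
print(len(big[0]), len(big))  # width, height
```

Scaling width and height independently is what lets you 'squash' a character like '/' without blowing up the whole image.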

Next up is the 'Threshold' area of the TT, which deals with masking out the pixels we want masked out
(anything that isn't a character, pretty much). You'll see an invert toggle, which helps very
situationally (in my experience, it's worth a try in lower contrast challenges). You can keep the
drop down on the default 'TM_Mean' (I don't know a lot about this one tbh). Then comes the amount, which is
arguably the only setting you need to touch in this area. Essentially, the higher the threshold amount, the
more it 'carves' away at the letters, which might be good for getting rid of straggling colors or pulling
together a lopsided '1' or something. A typical good range I start at is 20-40. Hit Apply Threshold to see
what's up, which also lets you hit that big ol' Tesseract! button next. This will print out what it saw.

Rinse + wash + repeat until desired results!
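Since the threshold amount is the setting you'll touch most, here's a rough Python sketch of what a 'mean' threshold could be doing. Big caveat: this is only my guess (compare each pixel with the image's mean brightness plus the amount); it is not taken from the plugin's source, and `mean_threshold` is a made-up name.

```python
def mean_threshold(gray, amount=30, invert=False):
    """Binarize a grayscale grid around (mean brightness + amount).

    ASSUMPTION: a guess at what a 'mean' threshold does, for illustration only.
    """
    values = [v for row in gray for v in row]
    cutoff = sum(values) / len(values) + amount
    on, off = (0, 255) if invert else (255, 0)
    return [[on if v > cutoff else off for v in row] for row in gray]


# Bright text (200) on a dark background (20), plus one stray pixel (90).
sample = [[20, 200, 20],
          [20, 200, 90],
          [20, 200, 20]]

print(mean_threshold(sample, amount=0))   # stray pixel survives
print(mean_threshold(sample, amount=30))  # stray pixel is carved away
```

In this toy example, a higher amount raises the cutoff, so fewer pixels count as 'text'. That would line up with the 'carving' behaviour described above, if the assumption holds.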

1. At this point, since this is such a physically small bit of text without a lot of pixels to work with, one
can't readily expect the default filter to work. fig. 1

fig. 1

Default settings.

2. I suppose I'll try straight-up enlargement of our sample by setting both width and height scales to 5
(retaining the proportions as well). As you can see, this was maybe too much, as our '/' now reads as a capital i. fig. 2

fig. 2

Enhance.

3. From experience, I know that, typically, one can get more accurate '/' readings by having more width, as it
squashes it down to be more horizontal, so I'm going to try that next. I'll also kick the threshold up a bit
just for fun. fig. 3

fig. 3

Squash.

4. Maybe I was a little too heavy handed with that adjustment, but at this point, I'm just going to try going
to 3W 3H and see what happens, as well as bring the threshold amount up to 30 (I probably seem like a crazy
person, but I'm kinda winging it too). fig. 4

fig. 4

Dehance.

5. Wow! It's completely accurate! However, don't fully celebrate yet! The next step is to change what the text says,
re-screencap, re-copy+paste into the TT and make sure the same settings work for different text! fig. 5

fig. 5

Double check.

Wooo! In an ideal world, you want to be certain this works with 3 or 4+ different caps of text. If you have
to adjust your filter for the NEWEST cap, go back and cross check it against your older text samples to make
sure the adjusted filter still works for every one of them. This makes your filter quite robust and less likely to
return data you don't want. With that said, I'm pretty confident that for this situation, two iterations are going
to be good enough, so I'll move on to the last part of this step...
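If you like to keep score while cross checking, the idea boils down to 'run the same filter over every saved cap and list the misses'. Here's a small Python sketch of that bookkeeping. Everything in it is hypothetical: `check_filter`, the file names, and the stand-in reader (a real one would run OCR on each image).

```python
def check_filter(read_text, samples):
    """Return the samples a reader gets wrong as (image, expected, got) tuples.

    read_text: any callable taking an image and returning a string
    samples:   list of (image, expected_text) pairs
    """
    failures = []
    for image, expected in samples:
        got = read_text(image)
        if got != expected:
            failures.append((image, expected, got))
    return failures


# Stand-in reader for demonstration: pretends the '/' in the second
# capture was misread as a capital I. These names are made up.
fake_results = {"cap1.png": "12/43", "cap2.png": "7I43", "cap3.png": "43/43"}
samples = [("cap1.png", "12/43"), ("cap2.png", "7/43"), ("cap3.png", "43/43")]

for image, expected, got in check_filter(fake_results.get, samples):
    print(f"{image}: expected {expected!r}, got {got!r}")
```

An empty list means the current settings read every sample correctly, which is the point where I'd stop fiddling.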

Generating that there fancy-shmancy TTesseractFilter!!! Wooooooooooooooo!

Go to 'Tesseract' -> 'Write Filter'.

In the console area of TT you should see something like const myFilter: TTesseractFilter = [ s t u f f ].

Hey, guess what? We now get to see a little bit of code (well, actually, you could've set up your Simba project before
you took screencaps and fiddled your way around the TT, but you probably didn't, did you? If you did, damn, ur good).

Anyways, let's go ahead and mock up a quick function that returns a string and call it something like 'readPrayer',
which also has a TTesseractFilter const matching TT's output.

Simba Code:
function readPrayer: string;
const
  filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
var
  searchArea: TBox;
begin
  result := tesseractGetText(searchArea, filter);
end;

Oh wait, oh no! We're not ready! We don't yet have any values for our TBox - tragedy! What will we do!?
Oh wait, don't worry we can do that now ez pz.

Step 3: Define Search Area

Finding the points that make up our box is likely no news to you, but I'ma walk through it anyways.

I'm simply going to make sure Simba is paired with our SMART client, then hover my mouse over the two points that
make up the box we want to read our text from, reading the x and y coordinates from the Simba window.
You generally want this to be as tight to the data as you can get, but I usually like to leave a 1 or 2 pixel buffer
around the bounds of the text. If you end up cutting into your text, you won't be getting accurate results!
In this case, I'll hover over the top left and bottom right corners of my prayer stats there on the action bar.

Top left corner:

<- Hover, Coords ->

Bottom right corner:

<- Hover, Coords ->

(Optional: Load a full viewport screencap from your SMART client into something like photoshop or gimp where you
can zoom in and see a coordinate system for super accurate and super tight search boxes for added consistency)

Now with these two points, (318, 311) and (364, 323), I'll use a simple intToBox(x1, y1, x2, y2) to
assign our searchArea. In the var section of our function, we now have this.

Simba Code:
var
  searchArea: TBox := intToBox(318, 311, 364, 323);
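The 1 or 2 pixel buffer mentioned above is easy to do by hand, but if you'd rather compute it, here's a quick Python sketch. `pad_box` is a made-up helper, and the 'tight' coordinates in the example are hypothetical numbers chosen so that a 2 pixel margin lands on the box used in this tutorial.

```python
def pad_box(x1, y1, x2, y2, margin=2, width=None, height=None):
    """Grow a box by `margin` pixels per side.

    If the client's width/height are given, the box is clamped so it
    never reaches outside the client area.
    """
    x1, y1 = max(0, x1 - margin), max(0, y1 - margin)
    x2, y2 = x2 + margin, y2 + margin
    if width is not None:
        x2 = min(width - 1, x2)
    if height is not None:
        y2 = min(height - 1, y2)
    return x1, y1, x2, y2


# Hypothetical tight bounds of the text itself, padded by 2 pixels.
print(pad_box(320, 313, 362, 321))  # (318, 311, 364, 323)
```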

Now we're on to finally testing this in game to see if everything has seemed to come out as planned!

Step 4: Test

For the purposes of testing, I've put our function into a repeat..until false; with a wait to keep printing
out results for us.

Simba Code:
program ReadSomeDangPrayer;
{$DEFINE SMART}
{$I SRL-6/srl.simba}

function readPrayer: string;
const
  filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
var
  searchArea: TBox := intToBox(318, 311, 364, 323);
begin
  result := tesseractGetText(searchArea, filter);
end;

begin
  setupSRL();

  repeat
    writeLn(readPrayer);
    wait(1000);
  until false;
end.

Now, if all goes well, you should be getting something like this output in your Simba console:

Which would be great soooo we can move on to the next step, an oh-so-sensual step ...

Step 5: Massage

Don't let the name or others fool you. This step is strictly for business only.

Now that we have that straightened out, we can move on to turning this string into smaller, more digestible pieces.
First off, we're going to want to expand our var section a bit: let's go ahead and add a TStringArray named arr. You know. Like a pirate.
We're going to be storing the stringed numbers that represent our Current and Maximum prayer points, respectively, in this array with a
handy-dandy function called explodeWrap(). We'll also create a local string variable for us to mangle and contort in any way
we please before handing it over as a result. Here's what our function looks like now ...

Simba Code:
function readPrayer: string;
const
  filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
var
  searchArea: TBox := intToBox(318, 311, 364, 323);
  ourString: string;
  arr: TStringArray;
begin
  ourString := tesseractGetText(searchArea, filter);

  // split our string at the slash into our array in order
  explodeWrap('/', ourString, arr);

  // left side of our slash is current prayer
  // therefore arr[0] is our current prayer
  writeLn('Current Prayer Points: ' + arr[0]);

  // right side of our slash is maximum prayer
  // therefore arr[1] is our maximum prayer
  writeLn('Maximum Prayer Points: ' + arr[1]);

  result := ourString;
end;

... and with that, when we run the script ...

#dataisbeautiful

Very nice! However, this doesn't seem very useful yet, so let's see if we can do something about that. Let's add
a boolean parameter to our function that picks whether it returns the current or the maximum point value.
I'll have it default to true for me, but you can do whatever you wish. It's your script. You should know this.

BONUS TIP: Sometimes you may end up with characters you don't want, like spaces, commas, or other stray marks that shouldn't be where they are.
Aside from further honing your filter, you can also use something like replace() to get rid of those insubordinates.
With that said, you should make sure your filter is as good as you can get it first before this. An example to remove spaces would be:
ourString := replace(ourString, ' ', '', [rfReplaceAll]); There's also an rfIgnoreCase flag you could add to that array for other uses!
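A slightly different take on the same cleanup, sketched in Python rather than Simba: instead of listing every character you want gone, keep only the characters that can legitimately appear in a 'current/maximum' reading. `clean_reading` and its `allowed` parameter are names I made up for this sketch.

```python
def clean_reading(text, allowed="0123456789/"):
    """Drop every character that cannot appear in a 'current/maximum' reading."""
    return "".join(ch for ch in text if ch in allowed)


print(clean_reading(" 1,2 / 43. "))  # 12/43
```

Note that this only removes junk. It can't rescue a misread character (a '/' read as a capital i just disappears), which is one more reason to get the filter right first.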

Here's what my final program looks like:

Simba Code:
program ReadSomeDangPrayer;
{$DEFINE SMART}
{$I SRL-6/srl.simba}

// our text finding function
function readPrayer(current: boolean = true): string;
const
  filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
var
  searchArea: TBox := intToBox(318, 311, 364, 323);
  ourString: string;
  arr: TStringArray;
begin
  ourString := tesseractGetText(searchArea, filter);

  // split our string at the slash into our array in order
  explodeWrap('/', ourString, arr);

  if (current) then
    begin
      writeLn('Current Prayer Points: ' + arr[0]);
      result := arr[0];
    end
  else
    begin
      writeLn('Maximum Prayer Points: ' + arr[1]);
      result := arr[1];
    end;
end;

begin
  setupSRL();

  repeat
    writeLn(readPrayer);
    wait(1000);
  until false;
end.

And here's what it might return:

Current

Maximum

Now THAT'S some usable data right there, I tell ye!
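One step further, if you want numbers instead of strings: the sketch below (Python, with the made-up name `parse_prayer`) splits the reading at the slash, converts both halves to integers, and returns None when the reading doesn't look like 'current/maximum' at all. That last check is my own addition; a reading without a '/' leaves nothing sensible for the maximum, so it seems worth guarding against.

```python
def parse_prayer(text):
    """Turn 'current/maximum' into a pair of ints, or None if it doesn't parse."""
    parts = text.strip().split("/")
    if len(parts) != 2:
        return None  # no slash, or more than one
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None  # one side wasn't a number


print(parse_prayer("12/43"))  # (12, 43)
print(parse_prayer("12I43"))  # None
```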

Conclusion

Now, this tutorial was purposefully pretty simple, but the main takeaway should be that if you
can predict where you might see text on screen, and you wish to read it, Tesseract is a very viable solution
without a whole lot of effort. This post might be several real feet long, but it probably translates into
about a 5 minute process for simple applications like this. The real time-and-complexity-suck comes from the last
step, massaging whatever data you get into the data you want.

Good luck, and thanks for reading!

- Lama
Last edited by Lama; 07-20-2018 at 02:40 AM.

2. Very nice tutorial, Tesseract has been somewhat of an enigma to me and outside of SRL-6 I have little understanding of how to implement it. Thanks for this.

It would be nice to see the Tesseract plugin used in other includes and scripts. This tutorial makes accomplishing something of that nature much easier, if anyone is so inclined.

3. Great tutorial, I didn't even know the TesseractTool existed. Have you had any luck using this approach on non-black and dynamic backgrounds? I'm writing a bot for a non-RS game that relies heavily on OCR, and I've been having really poor performance due to the dynamic background the text is on. I can simply subtract all pixels which have shifted to eliminate this problem, but I can't seem to find a built-in way to do that with Simba, even with SRL-6. I'm probably going to switch to a Python implementation soon, which is too bad because I love using Simba for all of my botting needs

4. Originally Posted by argothes
Great tutorial, I didn't even know the TesseractTool existed. Have you had any luck using this approach on non-black and dynamic backgrounds? I'm writing a bot for a non-RS game that relies heavily on OCR, and I've been having really poor performance due to the dynamic background the text is on. I can simply subtract all pixels which have shifted to eliminate this problem, but I can't seem to find a built-in way to do that with Simba, even with SRL-6. I'm probably going to switch to a Python implementation soon, which is too bad because I love using Simba for all of my botting needs
Tesseract (https://github.com/tesseract-ocr/) was originally created as an AIO text reader, so it's not limited to Simba/Runescape at all! As for dynamic backgrounds, if the color of your text is static, then you can replace every non-text pixel with black and have Tesseract search that. If that's not the case, then you can calculate the horizontal/vertical gradient (https://en.wikipedia.org/wiki/Image_gradient) of the pixel change and run Tesseract on that. Also, you can use Simba with Python: check out PyMML on Wizzup's github. It has all the color finding techniques that are regularly used in Simba.
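A minimal sketch of the 'replace every non-text color with black' idea, in plain Python so it stays self-contained. The function name, the tuple-per-pixel image format, and the per-channel tolerance are all assumptions for illustration; a real script would do this on a captured bitmap before handing it to Tesseract.

```python
def isolate_text(pixels, text_color, tolerance=0):
    """Keep pixels within `tolerance` of text_color (per channel); blacken the rest."""
    def matches(px):
        return all(abs(a - b) <= tolerance for a, b in zip(px, text_color))

    return [[px if matches(px) else (0, 0, 0) for px in row] for row in pixels]


YELLOW = (255, 255, 0)  # hypothetical static text color
frame = [[(12, 80, 40), YELLOW, (200, 30, 30)],
         [YELLOW, (250, 252, 4), (90, 90, 90)]]

print(isolate_text(frame, YELLOW, tolerance=5))
```

A small tolerance helps when the text color varies slightly, for example from anti-aliasing, at the cost of letting in background pixels that happen to be close to it.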

5. Originally Posted by argothes
Great tutorial, I didn't even know the TesseractTool existed. Have you had any luck using this approach on non-black and dynamic backgrounds? I'm writing a bot for a non-RS game that relies heavily on OCR, and I've been having really poor performance due to the dynamic background the text is on. I can simply subtract all pixels which have shifted to eliminate this problem, but I can't seem to find a built-in way to do that with Simba, even with SRL-6. I'm probably going to switch to a Python implementation soon, which is too bad because I love using Simba for all of my botting needs
Here is a plugin that will let you deal with shifted pixels: https://villavu.com/forum/showthread...42#post1358842
If you could post some images of the problematic text, others could chip in and do some tests of their own.
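For reference, the 'subtract all pixels which have shifted' idea from the question can be sketched in a few lines of Python. This is not the linked plugin, just an illustration with a made-up function name, and it assumes the text stays put while the background changes between two captures.

```python
def stable_pixels(frame_a, frame_b, fill=0):
    """Keep pixels identical in both frames; replace changed ones with `fill`."""
    return [[a if a == b else fill for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


# Two captures of the same area: the 255s are static text, the rest moved.
first = [[10, 255, 30], [255, 40, 255]]
second = [[90, 255, 70], [255, 15, 255]]

print(stable_pixels(first, second))  # [[0, 255, 0], [255, 0, 255]]
```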