Thread: A Pragmatic Approach to Reading Text with Tesseract

  1. #1

    A Pragmatic Approach to Reading Text with Tesseract


    Foreword | Introduction



    Despite this post, I know next to nothing about how this works its magic under the hood.
    I still have a lot to learn about Tesseract and OCR in general. With that said, one
    doesn't need to know a lot about how it works to use it. I remember feeling like Tesseract was this
    sort of super complex thing that I wouldn't have time for. This tutorial serves only as a way to
    provide enough 'activation energy' to get started with reading text with Tesseract (as it did with me),
    and enough is very little. That's what I'm going to try to show today.

    Enjoy!

    Basic Idea



    We'll be using a single function call, tesseractGetText(), that takes in a TBox (the area that
    will be searched for text), as well as a filter constant that will be used as the 'criteria'
    for accurately detecting our text. It returns a string.

    In this tutorial, I'm going to show you an alternative method to get prayer info for your
    character (convenient for tutorialing I suppose), and ways to massage (if applicable) the data
    more into something you can use.

    Tools Used



    We'll be using the TesseractTool by @Olly to help us test and create our filters.

    Screenshot software: I personally like ShareX, but for the purposes of this tutorial, I'll be
    using the Windows Snipping Tool. This will be used in tandem with the TesseractTool.

    Step 1: Screencap



    So, we know that we are going to be trying to read in our prayer info. It has fairly contrasting
    colors between the background and font, which makes it a very suitable victim for Tesseract.


    Target acquired.

    Go ahead and run TesseractTool.simba

    I'm going to be trying to read in both sides of the '/' there, getting both current and maximum
    prayer points. To get this process underway, we're going to be screencapping just about the
    tightest cropping of the text you are going to want to read. In this case, I'm going to go ahead
    and use Snipping Tool to get a capture of it.


    I took a screencap of myself screencapping.

    After getting a good clean capture, go ahead and Right Click -> Copy from Snipping Tool. Don't use Save
    in the Snipping Tool: the image becomes too compressed and loses a bit of its quality.


    Man, I really oughtta train prayer someday.

    From there, you're going to want to simply tab back to your instance of TesseractTool (referred
    to as 'TT' from now on), and CTRL+V-slap that copied image right in there to begin carving out your filter.


    Blown away.

    Step 2: Make Filter



    Anyways, here comes the sorta trial-and-error part of this process. From here, you need to essentially
    play around with these settings until you get something that accurately resembles what you're
    looking for. You're going to stretch, squeeze, and carve your way to victory, and I believe in you.

    The 'Resize' area of the TT deals with scaling the width and/or height up from the 1:1 ratio you start at.
    This is useful for creating more pixels for the tool to use. Remember, the filter we're going to create
    from this really just amounts to helping a math function do its job. You set which scale
    parameters you want to try out, then hit the resize button. NOTE: Bigger isn't always better.

    Next up is the 'Threshold' area of the TT, which deals with accurately masking out the pixels it thinks we
    want masked out (anything that isn't a character, pretty much). You'll see an invert toggle, which
    helps very situationally (in my experience, it's worth a try in lower contrast challenges). You can leave the
    drop down at the default 'TM_Mean' (I don't know a lot about this one tbh). Then comes the amount, which is
    arguably the only setting you need to touch in this area. Essentially, the higher the threshold amount, the more it
    'carves' away at the letters, which might be good for getting rid of straggling colors or pulling together
    a lopsided '1' or something. A typical good range I start at is 20-40. Hit Apply Threshold to see
    what's up, which also lets you hit that big ol Tesseract! button next. This will print out what it saw with
    your settings.

    Rinse + wash + repeat until desired results!

    1. At this point, since this is such a physically small bit of text, not a lot of pixels to work with, one can
      readily expect the default filter to work. fig. 1

      fig. 1


      Default settings.

    2. I suppose I'll try straight-up enlargement of our sample by setting both width and height scales to 5
      (retaining the proportions as well). As you can see, this was maybe too much as our '/' now reads as a capital i. fig. 2

      fig. 2


      Enhance.

    3. From experience, I know that, typically, one can get more accurate '/' readings by having more width, as it
      squashes it down to be more horizontal, so I'm going to try that next. I'll also kick the threshold up a bit
      just for fun. fig. 3

      fig. 3


      Squash.

    4. Maybe I was a little too heavy handed with that adjustment, but at this point, I'm just going to try going
      to 3W 3H and see what happens, as well as bring the threshold amount up to 30 (I probably seem like a crazy
      person, but I'm kinda winging it too). fig. 4

      fig. 4


      Dehance.

    5. Wow! It's completely accurate! However, don't fully celebrate yet! The next step is to change what the text says,
      re-screencap, re-copy+paste into the TT and make sure the same settings work for different text! fig. 5

      fig. 5


      Double check.


    Wooo! In an ideal world, you want to be certain this works with 3 or 4+ different caps of text, and that if you had
    to adjust your filter for the NEWEST cap, that you'd go back and cross check it with your older text samples to make
    sure your adjusted filter works for every text sample still. This makes your filter quite robust and less likely to
    return data you don't want to be finding. With that said, I'm pretty confident that for this situation, two
    iterations is going to be pretty good, so I'll move on to the last part of this step...

    Generating that there fancy-shmancy TTesseractFilter!!! Wooooooooooooooo -

    Simple enough - once you've found your filter settings that you feel satisfied with, head on up to your menu and
    go to 'Tesseract' -> 'Write Filter'.



    In the console area of TT you should see something like const myFilter: TTesseractFilter = [ s t u f f ].



    Hey, guess what? We now get to see a little bit of code (well actually, you could've set up your Simba project before
    you took screencaps and fiddled your way around the TT, but you probably didn't, did you? If you did - damn ur good).

    Anyways, let's go ahead and mock up a quick function that returns a string and call it something like 'readPrayer',
    which also has a TTesseractFilter const matching TT's output.

    Simba Code:
    function readPrayer: string;
    const
      filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
    var
      searchArea: TBox;
    begin
      result := tesseractGetText(searchArea, filter);
    end;

    Oh wait, oh no! We're not ready! We don't yet have any values for our TBox - tragedy! What will we do!?
    Oh wait, don't worry we can do that now ez pz.

    Step 3: Define Search Area



    Finding where the points that make up our box are is likely no new news to you, but I'ma walk through it anyways.

    I'm simply going to make sure Simba is paired with our SMART client, then hover my mouse over the two points which
    make up the box we want to read our text from, reading the x and y coordinates from the Simba window.
    You generally want this to be as tight to the data as you can get, but I usually like to leave a 1 or 2 pixel buffer
    around the bounds of the text. If you end up cutting into your text, you won't be getting accurate results!
    In this case, I'll hover over the top left and bottom right corners of my prayer stats there on the action bar.

    Top left corner:

    <- Hover, Coords ->

    Bottom right corner:

    <- Hover, Coords ->

    (Optional: Load a full viewport screencap from your SMART client into something like photoshop or gimp where you
    can zoom in and see a coordinate system for super accurate and super tight search boxes for added consistency)

    Now with these two points, (318, 311) and (364, 323), I'll use a simple intToBox(x1, y1, x2, y2) to
    assign our searchArea. In the var section of our function, we now have this.

    Simba Code:
    var
      searchArea: TBox := intToBox(318, 311, 364, 323);
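
    If you happened to measure your corners right on the edge of the text, a tiny helper could add that 1 or 2 pixel
    buffer for you. This is only a sketch: padBox is a made-up name (not something from SRL-6), and it assumes TBox
    exposes its corners as x1, y1, x2 and y2.

    Simba Code:
    // hypothetical helper, not part of SRL-6:
    // grows a box by 'pad' pixels on every side
    function padBox(b: TBox; pad: integer): TBox;
    begin
      result := intToBox(b.x1 - pad, b.y1 - pad, b.x2 + pad, b.y2 + pad);
    end;

    You'd then write something like searchArea := padBox(intToBox(x1, y1, x2, y2), 2); with your own tight corners.
    Don't go overboard with the padding, since extra background gives Tesseract more junk to misread.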

    Now we're on to finally testing this in game to see if everything has seemed to come out as planned!

    Step 4: Test



    For the purposes of testing, I've put our function into a repeat..until false; with a wait to keep printing
    out results for us.

    Simba Code:
    program ReadSomeDangPrayer;
    {$DEFINE SMART}
    {$I SRL-6/srl.simba}

    function readPrayer: string;
    const
      filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
    var
      searchArea: TBox := intToBox(318, 311, 364, 323);
    begin
      result := tesseractGetText(searchArea, filter);
    end;

    begin
      setupSRL();

      repeat
        writeLn(readPrayer);
        wait(1000);
      until false;
    end.

    Now, if all goes well, you should be getting something like this output in your Simba console:



    Which would be great soooo we can move on to the next step, an oh-so-sensual step ...

    Step 5: Massage



    Don't let the name or others fool you. This step is strictly for business only.

    Now that we have that straightened out, we can move on to turning this string into smaller, more digestible pieces.
    First off, we're going to want to expand our var section a bit: let's go ahead and add a TStringArray named arr. You know. Like a pirate.
    We're going to be storing the stringed numbers that represent our Current and Maximum prayer points, respectively, in this array with a
    handy-dandy function called explodeWrap(). We'll also create a local string variable for us to mangle and contort in any way
    we please before handing it over as a result. Here's what our function looks like now ...

    Simba Code:
    function readPrayer: string;
    const
      filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
    var
      searchArea: TBox := intToBox(318, 311, 364, 323);
      ourString: string;
      arr: TStringArray;
    begin
      ourString := tesseractGetText(searchArea, filter);

      // split our string at the slash into our array in order
      explodeWrap('/', ourString, arr);

      // left side of our slash is current prayer
      // therefore arr[0] is our current prayer
      writeLn('Current Prayer Points: ' + arr[0]);

      // right side of our slash is maximum prayer
      // therefore arr[1] is our maximum prayer
      writeLn('Maximum Prayer Points: ' + arr[1]);

      result := ourString;
    end;

    ... and with that, when we run the script ...


    #dataisbeautiful

    Very nice! However, this doesn't seem very useful... really at all, yet, so let's see if we can do something about that. Let's add
    a boolean parameter to our function that picks whether it returns the current or the maximum point value.
    I'll have it default to true for me, but you can do whatever you wish. It's your script. You should know this.

    BONUS TIP: Sometimes you may end up with characters you don't want, like spaces, commas, or other stray marks that shouldn't be where they are.
    Aside from further honing your filter, you can also use something like replace() to get rid of those insubordinates.
    With that said, you should make sure your filter is as good as you can get it before resorting to this. An example to remove spaces would be:
    ourString := replace(ourString, ' ', '', [rfReplaceAll]); There's also an rfIgnoreCase flag you could add to that array for other uses!
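
    If you find yourself stripping more than one kind of stray character, it might be tidier to wrap the cleanup in its own
    little function. A rough sketch follows: cleanText is a made-up name, and I'm assuming replace() hands back the
    modified string rather than changing it in place, so double check that against your Simba version.

    Simba Code:
    // hypothetical helper: strips spaces and commas from an OCR result
    function cleanText(s: string): string;
    begin
      // assumption: replace() returns the new string
      result := replace(s, ' ', '', [rfReplaceAll]);
      result := replace(result, ',', '', [rfReplaceAll]);
    end;

    You'd call it right after reading, e.g. ourString := cleanText(tesseractGetText(searchArea, filter));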


    Here's what my final program looks like:

    Simba Code:
    program ReadSomeDangPrayer;
    {$DEFINE SMART}
    {$I SRL-6/srl.simba}

    // our text finding function
    function readPrayer(current: boolean = true): string;
    const
      filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
    var
      searchArea: TBox := intToBox(318, 311, 364, 323);
      ourString: string;
      arr: TStringArray;
    begin
      ourString := tesseractGetText(searchArea, filter);

      // split our string at the slash into our array in order
      explodeWrap('/', ourString, arr);

      if (current) then
        begin
          writeLn('Current Prayer Points: ' + arr[0]);
          result := arr[0];
        end
      else
        begin
          writeLn('Maximum Prayer Points: ' + arr[1]);
          result := arr[1];
        end;
    end;

    begin
      setupSRL();

      repeat
        writeLn(readPrayer);
        wait(1000);
      until false;
    end.

    And here's what it might return:


    Current


    Maximum

    Now THAT'S some usable data right there, I tell ye!
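
    If you'd rather compare numbers than strings, one possible next step is converting the piece you want to an integer.
    The sketch below is not part of the program above: readPrayerPoints is a made-up name, and I'm assuming strToIntDef()
    is available in your Simba build. It also bails out when the split doesn't give two pieces, which is what
    would happen if the '/' gets misread (like that capital i from earlier).

    Simba Code:
    // hypothetical variant of readPrayer:
    // returns the points as an integer, or -1 on a bad read
    function readPrayerPoints(current: boolean = true): integer;
    const
      filter: TTesseractFilter = [3, 3, [False, 30, TM_Mean]];
    var
      searchArea: TBox := intToBox(318, 311, 364, 323);
      arr: TStringArray;
    begin
      result := -1;
      explodeWrap('/', tesseractGetText(searchArea, filter), arr);

      // a misread slash leaves us with fewer than two pieces
      if (length(arr) < 2) then
        exit;

      // strToIntDef falls back to -1 if the piece isn't a clean number
      if (current) then
        result := strToIntDef(arr[0], -1)
      else
        result := strToIntDef(arr[1], -1);
    end;

    With that, something like if (readPrayerPoints() < 100) then ... becomes possible, as long as you treat -1 as "couldn't read it".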

    Conclusion



    Now, this tutorial was purposefully pretty simple, but the main takeaway should be that if you
    can predict where you might see text on screen, and you wish to read it, Tesseract is a very viable solution without
    a whole lot of effort. This post might be several real feet long, but it probably translates into
    about a 5 minute process for simple applications like this. The real time-and-complexity-suck comes from your last
    step, massaging whatever data you get into the data you want.

    Good luck, and thanks for reading!

    - Lama
    Last edited by Lama; 07-20-2018 at 02:40 AM.

  2. #2

    Very nice tutorial, Tesseract has been somewhat of an enigma to me and outside of SRL-6 I have little understanding of how to implement it. Thanks for this.

    It would be nice to see the Tesseract plugin used in other includes and scripts, this tutorial makes accomplishing something of that nature much easier, if anyone is so inclined.

  3. #3

    Great tutorial, I didn't even know the TesseractTool existed. Have you had any luck using this approach on non-black and dynamic backgrounds? I'm writing a bot for a non-RS game that relies heavily on OCR, and I've been having really poor performance due to the dynamic background the text is on. I can simply subtract all pixels which have shifted to eliminate this problem, but I can't seem to find a built-in way to do that with Simba, even with SRL-6. I'm probably going to switch to a Python implementation soon, which is too bad because I love using Simba for all of my botting needs

  4. #4
    Join Date
    Dec 2011
    Location
    Toronto, Ontario
    Posts
    6,424
    Mentioned
    84 Post(s)
    Quoted
    863 Post(s)

    Default

    Quote: Originally posted by argothes (post #3)
    Tesseract (https://github.com/tesseract-ocr/) was originally created as an all-in-one text reader, so it's not limited to Simba/Runescape at all! As for dynamic backgrounds, if the color of your text is static, then you can replace every pixel that isn't the text color with black and have Tesseract search that. If that's not the case, then you can calculate the horizontal/vertical gradient (https://en.wikipedia.org/wiki/Image_gradient) of the pixel change and have Tesseract run on that. Also, you can use Simba with Python: check out PyMML on Wizzup's GitHub. It has all the color finding techniques that are regularly used in Simba.

  5. #5

    Quote: Originally posted by argothes (post #3)
    Here is a plugin that will let you deal with shifted pixels: https://villavu.com/forum/showthread...42#post1358842
    If you could post some images of the problematic text, others could chip in and do some tests of their own.
