Thread: multiple 'between'

  1. #1

    I need to extract multiple things between tags into a TStringArray, but the Between function returns only the first one. I was thinking about moving a buffer through the whole string, but that would be too slow, as the string is very long. Is there any other way to do this?
    Simba Code:
    program new;
    var
    a,b :string;
    c :Tstringarray;
    begin
     a := '<tag>test1</tag>.....<tag>test2</tag>.....<tag>test3</tag>.....<tag>test4</tag>.....';
     b := between('<tag>','</tag>',a);
     writeln(b);
    end.

    output should be:
    c = ['test1','test2','test3','test4']

  2. #2

    Regular expressions. That will be the best way.

  3. #3

    Hmm, I have an idea: use ReplaceRegExpr to replace everything that is not (<tag>[anything]</tag>) with one symbol, then use the Explode function to get a TStringArray, and finally call Between on every element of the array. Is that the best way?

  4. #4

    RegExpr for sure..

    I say.. use RegExpr + Pos to iterate through the string and match.. sort of like what I did for the StringToTPA function.. Or just look up PregMatchAll (PHP's preg_match_all) and write one for Pascal..

    I had one in the CurlLib library and in Pascal, but I lost it when I reformatted to Windows 8 :c
    I am Ggzz..
    Hackintosher

  5. #5

    The problem with iteration is that the string has a length of 5000+ characters, so I think it could be too slow.

    How do I negate this expression?
    Simba Code:
    b := ReplaceRegExpr('<tag>.{1,8}</tag>',a,'~',TRUE);
    I mean to replace with ~ everything which is not (<tag>.{1,8}</tag>). I read the tutorial, and I know how to negate a single character, but not a whole sequence.
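
    One way around the negation problem is to skip it entirely: match the tagged part itself, put the wanted text in a capture group, and collect group 1 from every match. The sketch below assumes Free Pascal with its RegExpr unit (the TRegExpr class) rather than Simba's built-in regex wrappers, so treat the unit and class as assumptions about your environment. CollectBetween and the local TStringArray type are made-up names for illustration, and the tags are assumed to contain no regex metacharacters.

```pascal
program CollectWithRegex;
{$mode objfpc}{$H+}

uses
  RegExpr;

type
  TStringArray = array of string;

// Returns the text captured by group 1 for every match of Pattern in S.
// (Hypothetical helper, not part of Simba or SRL.)
function CollectBetween(const Pattern, S: string): TStringArray;
var
  re: TRegExpr;
  n: Integer;
begin
  Result := nil;
  n := 0;
  re := TRegExpr.Create;
  try
    re.Expression := Pattern;
    // Exec finds the first match, ExecNext continues after the previous one.
    if re.Exec(S) then
      repeat
        SetLength(Result, n + 1);
        Result[n] := re.Match[1];
        Inc(n);
      until not re.ExecNext;
  finally
    re.Free;
  end;
end;

var
  items: TStringArray;
  i: Integer;
begin
  items := CollectBetween('<tag>(.{1,8})</tag>',
    '<tag>test1</tag>.....<tag>test2</tag>.....<tag>test3</tag>');
  for i := 0 to High(items) do
    WriteLn(items[i]);
end.
```

    Each search resumes where the previous match ended, so the string is walked once from left to right, which should not be a problem at 5000+ characters.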

  6. #6

    Why is this needed? If it is for storing vars in a file, I suggest using an .ini file.

  7. #7

    Originally posted by masterBB:
    "Why is this needed? If it is for storing vars in a file, I suggest using an .ini file."

    Grabbing data from a website.

  8. #8

    Use between, replace, repeat if you don't want to use regexes.
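
    A rough sketch of that take-one, cut, repeat idea, written for Free Pascal rather than Simba (so it uses only the built-in Pos, Copy and Delete). AllBetweenByCutting and the local TStringArray type are hypothetical names used for illustration only.

```pascal
program BetweenCutRepeat;
{$mode objfpc}{$H+}

type
  TStringArray = array of string;

// Takes the first s1..s2 pair, cuts everything up to and including that s2
// off the front of the working copy, and repeats until no pair is left.
// s is passed by value, so the caller's string is not modified.
function AllBetweenByCutting(const s1, s2: string; s: string): TStringArray;
var
  p1, p2: SizeInt;
  n: Integer;
begin
  Result := nil;
  n := 0;
  if (s1 = '') or (s2 = '') then
    Exit;
  repeat
    p1 := Pos(s1, s);
    if p1 = 0 then
      Break;
    // Drop everything up to the end of the opening delimiter.
    Delete(s, 1, p1 + Length(s1) - 1);
    p2 := Pos(s2, s);
    if p2 = 0 then
      Break;
    SetLength(Result, n + 1);
    Result[n] := Copy(s, 1, p2 - 1);
    Inc(n);
    // Drop the extracted text and its closing delimiter, then go again.
    Delete(s, 1, p2 + Length(s2) - 1);
  until False;
end;

var
  items: TStringArray;
  i: Integer;
begin
  items := AllBetweenByCutting('<tag>', '</tag>',
    '<tag>test1</tag>.....<tag>test2</tag>.....');
  for i := 0 to High(items) do
    WriteLn(items[i]);
end.
```

    Every Delete shifts the rest of the string, so this does more copying than a scan that only moves a search offset, but for a few thousand characters the difference should be negligible.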

  9. #9

    Originally posted by Sex:
    "Use between, replace, repeat if you don't want to use regexes."

    I want to use regexes, but I'm stuck on the syntax.

  10. #10

    Hello beginner5,

    I quickly wrote a function called "AllBetween"..

    Code:
    function AllBetween(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) or (s1 = s2) then
        Exit;
      repeat
        start := PosEx(s1, s, start + 1);
        if start > 0 then
        begin
          finish := PosEx(s2, s, start);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + 1);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start) + 1));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;
    
    var
      TSA: TStringArray;
      h, i: Integer;
      s: string;
    
    begin
      ClearDebug;
      s := Between('<td class="alt1"><span class="smallfont"> <a href="http://villavu.com/forum/member.php', '	</tr>', GetPage('http://villavu.com/'));
      TSA := AllBetween('">', '</span></a>', s);
      h := High(TSA);
      WriteLn('List of Active Members @SRL-Forums [' + IntToStr(h + 1) + ']:');
      for i := 0 to h do
        WriteLn(TSA[i]);
      SetLength(TSA, 0);
    end.
    The test script will get the Active Members of SRL-Forums.. :P Just to show that it does work..

    It still needs some improving, of course (it would be better if it were based on regex; right now it works with PosEx).
    NOTE: It requires that s1 <> s2. They cannot be the same, or else the function exits without returning anything.

    -Jani

    Here is a test script using your <tag>, </tag> example:

    Code:
    function AllBetween(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) or (s1 = s2) then
        Exit;
      repeat
        start := PosEx(s1, s, start + 1);
        if start > 0 then
        begin
          finish := PosEx(s2, s, start);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + 1);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start) + 1));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;
    
    var
      TSA: TStringArray;
      h, i: Integer;
      s: string;
    
    begin
      ClearDebug;
      s := '<tag>test1</tag>.....<tag>test2</tag>.....<tag>test3</tag>.....<tag>test4</tag>.....';
      TSA := AllBetween('<tag>', '</tag>', s);
      h := High(TSA);
      for i := 0 to h do
        WriteLn(TSA[i]);
      SetLength(TSA, 0);
    end.
    Output:
    test1
    test2
    test3
    test4
    Successfully executed (13.7896 ms)

    So TSA would be: ['test1', 'test2', 'test3', 'test4'].
    Last edited by Janilabo; 03-11-2012 at 07:09 PM.

  11. #11

    Another way would be to use Explode

  12. #12

    Zyt3x, I think you just had a brain fart here.. Meaning that Explode won't be a solution for what he is requesting here. ..But of course you can prove me wrong, mate...

  13. #13

    Originally posted by Janilabo:
    "Zyt3x, I think you just had a brain fart here.. Meaning that Explode won't be a solution for what he is requesting here. ..But of course you can prove me wrong, mate..."

    Simba Code:
    const
      S = '<tag>test1</tag>.....<tag>test2</tag>.....<tag>test3</tag>.....<tag>test4</tag>';

    var
      sArr : TStringArray;
      H, I : Integer;

    begin
      sArr := Explode('<', S);
      H := High(sArr);
      for I := 0 to H do
        sArr[I] := Between('>', '#END#', sArr[I] + '#END#');

      WriteLn(sArr);
      WriteLn(Implode('', sArr));
    end.

    Code:
    Compiled successfully in 31 ms.
    ['', 'test1', '.....', 'test2', '.....', 'test3', '.....', 'test4', '']
    test1.....test2.....test3.....test4
    Successfully executed.
    Edit: Ah, I see now where I was wrong. I thought he just wanted to remove the <tag>s and </tag>s.

  14. #14

    Nice code, Zyt3x..

    That is not exactly what he requested, though (it doesn't work the way he wanted)..

    He doesn't want the code to pick up '.....' or anything else that is outside his tags (<tag>, </tag>). He wants the strings from inside the tags.. So the results would be only 'test1', 'test2', 'test3', 'test4'.

    But your code does look good.. Clean and short!

  15. #15

    This can be done both with and without RegEx.. You're going to need a hell of a good understanding of RegEx if I post a RegEx solution, as it will use NON-Greedy regular expressions.. In the simplest terms, that means each match stops at the first closing tag it finds instead of running on to the last one..

    Solution:
    Simba Code:
    program new;
    {$I SRL/SRL.Simba}

    //<tag>.*?</tag>    A NON-Greedy RegExpr..

    // Non-RegEx version: collects every non-empty string found between
    // Between1 and Between2, scanning HayStack once from left to right.
    Function GrabText(HayStack: String; Between1, Between2: String): TStringArray;
    var
      Needle: String;
      Start, Ending, Tracker: Integer;
    begin
      Tracker := 0;
      Start := PosEx(Between1, HayStack, 1);
      while (Start > 0) do
      begin
        // The closing tag must come after the opening tag we just found.
        Ending := PosEx(Between2, HayStack, Start + Length(Between1));
        if (Ending <= 0) then
          Break;
        Needle := Copy(HayStack, Start + Length(Between1), Ending - Start - Length(Between1));

        // Skip empty results here instead of deleting them afterwards.
        if (Needle <> '') then
        begin
          SetLength(Result, Tracker + 1);
          Result[Tracker] := Needle;
          Inc(Tracker);
        end;

        // Continue searching after the closing tag.
        Start := PosEx(Between1, HayStack, Ending + Length(Between2));
      end;
    end;

    begin
      SetupSRL;
      ClearDebug;
      writeln(GrabText('<tag>test1</tag>....<tag>test2dgtsgtsgdsgs</tag>....<tag>test3</tag>.....<tag>test4</tag>.....', '<tag>', '</tag>'));
    end.
    This would be an example of a NON-Greedy RegExpr for the curious guys: http://villavu.com/forum/showpost.ph...93&postcount=5
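
    To make the greedy vs. non-greedy difference concrete, here is a small sketch. It assumes Free Pascal's RegExpr unit (TRegExpr) rather than Simba's regex wrappers, and FirstCapture is a hypothetical helper written only for this comparison.

```pascal
program GreedyVsLazy;
{$mode objfpc}{$H+}

uses
  RegExpr;

// Returns capture group 1 of the first match of Pattern in S, or '' if none.
// (Hypothetical helper for illustration.)
function FirstCapture(const Pattern, S: string): string;
var
  re: TRegExpr;
begin
  Result := '';
  re := TRegExpr.Create;
  try
    re.Expression := Pattern;
    if re.Exec(S) then
      Result := re.Match[1];
  finally
    re.Free;
  end;
end;

const
  Sample = '<tag>test1</tag>.....<tag>test2</tag>';

begin
  // Greedy: .* grabs as much as possible, so the match runs to the LAST </tag>.
  WriteLn(FirstCapture('<tag>(.*)</tag>', Sample));   // test1</tag>.....<tag>test2
  // Non-greedy: .*? grabs as little as possible, so it stops at the FIRST </tag>.
  WriteLn(FirstCapture('<tag>(.*?)</tag>', Sample));  // test1
end.
```

    With the non-greedy form, repeating the search after each match yields one result per tag pair, which is what the original question asks for.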

  16. #16

    Hey Janilabo,

    I really appreciate what you have done here! It works perfectly and it's fast.
    "It requires that s1 <> s2. They cannot be the same, else it wont execute the function."
    That's not a big problem; in most cases it can easily be worked around.

    I made something with it:

    Simba Code:
    program new;

    function AllBetween(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) or (s1 = s2) then
        Exit;
      repeat
        start := PosEx(s1, s, start + 1);
        if start > 0 then
        begin
          finish := PosEx(s2, s, start);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + 1);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start) + 1));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;

    Procedure PrintPlayerStats(PlayerName : string);
    var
    client,a :integer;
    s :string;
    c ,cc: TStringArray;
    begin
      client:=InitializeHTTPClientWrap(TRUE);
      AddPostVariable(Client,'user1',PlayerName);
      s := PostHTTPPageEx(Client,'http://services.runescape.com/m=hiscore/g=runescape/compare.ws');
      c := Allbetween('<span class="columnLevel">','</span>',s);
      cc := Allbetween('<span><span><span>','</span></span></span>',s);
      for a := 1 to 25 do
        writeln(cc[a]+' : '+c[a+1]);
      FreeHTTPClient(client);
    end;


    begin
      PrintPlayerStats('dragon');
    end.

    Edit: Thanks to ggzz too. I'm now trying to do it with RegExpr.. The version below works, but it can easily be broken...

    Simba Code:
    program new;
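    function AllBetween(s1, s2, s: string): TStringArray;
    var
      a: string;
      p: TStringList;
      i, c: Integer;
    begin
      p := TStringList.Create;
      // Split on either tag; the wanted strings are then the odd-numbered pieces.
      a := s1 + '|' + s2;
      SplitRegExpr(a, s, p);
      c := p.Count;
      SetLength(Result, c div 2);
      // Step through pieces 1, 3, 5, ... without modifying a for-loop variable.
      i := 1;
      while i < c do
      begin
        Result[(i - 1) div 2] := p.Strings[i];
        i := i + 2;
      end;
      p.Free;
    end;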

    var
    a : string;
    begin
     a := ',.,.,<tag>test1</tag>......<tag>test2</tag>.....<tag>test3</tag>.....<tag>test4</tag>.....';
     writeln( AllBetween('<tag>','</tag>',a) );
    end.
    Last edited by bg5; 03-11-2012 at 09:34 PM.

  17. #17

    Hey beginner5,

    Glad it helped you out!
    But yeah, of course it could & should be improved (regex, s1=s2..).

    Btw, naaaise player stats grabber!

    Edit:

    Modified the AllBetween function a bit.. Added support for s1 = s2, but... there needs to be an equal number of start and end 'brackets' in order to get correct results (with s1=s2).. :\ This is something where regexes will do a much better job, for sure..

    Here is the improved AllBetween:

    Code:
    function AllBetween2(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) then
        Exit;
      start := PosEx(s1, s, 1);
      repeat
        if finish > 0 then
          start := PosEx(s1, s, (finish + s2L));
        if start > 0 then
        begin
          finish := PosEx(s2, s, start + 1);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + s1L);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start)));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;
    And a small example/test script:

    Code:
    function AllBetween2(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) then
        Exit;
      start := PosEx(s1, s, 1);
      repeat
        if finish > 0 then
          start := PosEx(s1, s, (finish + s2L));
        if start > 0 then
        begin
          finish := PosEx(s2, s, start + 1);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + s1L);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start)));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;
    
    var
      s: string;
      TSA: TStringArray;
      h, i: Integer;
    
    begin
      s := '"user1", "user2", "user3", "user4"';
      TSA := AllBetween2('"', '"', s);
      h := High(TSA);
      for i := 0 to h do
        WriteLn(TSA[i]);
      SetLength(TSA, 0);
    end.
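
    For the s1 = s2 case, the pairing rule can be stated very simply: the 1st delimiter opens, the 2nd closes, the 3rd opens again, and so on. Below is a stripped-down sketch of just that rule, written for Free Pascal with StrUtils.PosEx instead of Simba; BetweenPairs and the local TStringArray type are hypothetical names, and this is not a replacement for AllBetween2.

```pascal
program PairedDelimiters;
{$mode objfpc}{$H+}

uses
  StrUtils;

type
  TStringArray = array of string;

// Pairs identical delimiters in order of appearance: 1st with 2nd,
// 3rd with 4th, and so on. A trailing unpaired delimiter is ignored.
function BetweenPairs(const delim, s: string): TStringArray;
var
  openPos, closePos: SizeInt;
  n: Integer;
begin
  Result := nil;
  n := 0;
  if delim = '' then
    Exit;
  openPos := PosEx(delim, s, 1);
  while openPos > 0 do
  begin
    closePos := PosEx(delim, s, openPos + Length(delim));
    if closePos = 0 then
      Break;
    SetLength(Result, n + 1);
    Result[n] := Copy(s, openPos + Length(delim),
      closePos - openPos - Length(delim));
    Inc(n);
    // The next opening delimiter is searched for after the closing one.
    openPos := PosEx(delim, s, closePos + Length(delim));
  end;
end;

var
  items: TStringArray;
  i: Integer;
begin
  items := BetweenPairs('"', '"user1", "user2", "user3", "user4"');
  for i := 0 to High(items) do
    WriteLn(items[i]);
end.
```

    If one quote is missing somewhere in the middle, every pair after it shifts by one, which is why the number of start and end 'brackets' has to match.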
    Small forum widget, that I made for fun (active users browsing SRL-forums...), using AllBetween[2] and TSAToParts functions:

    Code:
    function TSAToParts(TSA: TStringArray; partSize: Integer): array of TStringArray;
    var
      i, i2, r, h, d: Integer;
    begin
      h := High(TSA);
      if (h >= 0) and (partSize > 0) then
        if partSize <= h then
        begin
          Inc(h);
          r := (h div partSize);
          if (r * partSize) < h then
            Inc(r);
          SetLength(Result, r);
          for i := 0 to (r - 1) do
            for i2 := 0 to (partSize - 1) do
            begin
              SetLength(Result[i], partSize);
              if d < h then
              begin
                Result[i][i2] := TSA[d];
                Inc(d);
              end else
              begin
                SetLength(Result[i], i2);
                Exit;
              end;
            end;
        end else
          Result := [TSA];
    end;
    
    function AllBetween(s1, s2, s: string): TStringArray;
    var
      sL, s1L, s2L, old_start, start, finish: Integer;
      str: string;
    begin
      s1L := Length(s1);
      s2L := Length(s2);
      sL := Length(s);
      if (s1 = '') or (s2 = '') or (s = '') or (s1L >= sL) or (s2L >= sL) or ((s1L + s2L) > sL) then
        Exit;
      start := PosEx(s1, s, 1);
      repeat
        if finish > 0 then
          start := PosEx(s1, s, (finish + s2L));
        if start > 0 then
        begin
          finish := PosEx(s2, s, start + 1);
          if finish <= 0 then
            Break;
          repeat
            old_start := start;
            start := PosEx(s1, s, old_start + s1L);
          until (start >= finish) or (start <= 0);
          start := old_start;
          if str <> '' then
            str := str + '{ab_NS}';
          str := str + Between(s1, s2, Copy(s, start, s2L + (finish - start)));
        end else
          Break;
      until False;
      if str <> '' then
        Result := Explode('{ab_NS}', str);
      str := '';
    end;
    
    var
      TSA: TStringArray;
      h, h2, i, i2: Integer;
      s: string;
      T2DSA: array of TStringArray;
    
    begin
      ClearDebug;
      s := Between('<td class="alt1"><span class="smallfont"> <a href="http://villavu.com/forum/member.php', '	</tr>', GetPage('http://villavu.com/'));
      TSA := AllBetween('">', '</span></a>', s);
      s := '';
      T2DSA := TSAToParts(TSA, 9);
      WriteLn('List of Active Members @SRL-Forums [' + IntToStr(High(TSA) + 1) + ']:');
      SetLength(TSA, 0);
      h := High(T2DSA);
      for i := 0 to h do
      begin
        h2 := High(T2DSA[i]);
        for i2 := 0 to h2 do
          s := s + T2DSA[i][i2] + ', ';
        SetLength(T2DSA[i], 0);
        if (i = h) then
          WriteLn(Copy(s, 0, Length(s) - 2) + '.')
        else
          WriteLn(s);
        s := '';
      end;
      SetLength(T2DSA, 0);
    end.
    Feel free to use!
    Last edited by Janilabo; 03-12-2012 at 01:32 AM. Reason: There was a bug in AllBetween2 function - FIXED now.
