parse from year: 2019

Wednesday 9th January, 2019

GiuseppeChillemi

19:31I wish to parse strings like:

the-path: "#theserver:theDB/thetable/thefield"
the-path: "theDB/thetable/thefield"

caratteri-path: charset [#"A" - #"Z" #"a" - #"z" #"0" - #"9"]
	parse the-path [
		copy first-char
		0 1
		[ "!" | "#" ]
		copy first-word some caratteri-path (print ["First Word" first-word])
		any [1 [ "/" | ":" ]]
		copy bword some [some caratteri-path  (print ["bword" bword]) | 1 "/"]
	]

I don't understand why the last copy is not working

First Word theserver
*** Script Error: bword has no value
*** Where: print
*** Stack: decode-path-parse

Just note: the parse rule is not complete; I am experimenting and got stuck at this point.

19:35note: ":" is possible only after the first word , then only "/" are allowed

19:37note: the first char could be # or !, but also a regular char; I still haven't understood how to handle this

9214

20:09

>> split "#theserver:theDB/thetable/thefield" charset "#:/"
== ["" "theserver" "theDB" "thetable" "thefield"]
>> split "theDB/thetable/thefield" charset "#:/"
== ["theDB" "thetable" "thefield"]

GiuseppeChillemi

20:29Need to know the character before the Word

9214

20:31...

>> first "#a:b/c"
== #"#"
>> first "!a:b/c"
== #"!"
>> first "b/c"
== #"b"

GiuseppeChillemi

20:37I mean: those characters are meaningful, I need to know what they precede

20:42I know how to extract the first char, but I want to act on the following word in a different way if it is preceded by # or !, and only one ":" can be in the path

9214

20:42Then combine the two snippets above: check the first character, split the string, then figure out which character preceded the word by observing the first non-empty string.
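
A minimal sketch of that suggestion for the Red console. The function name `decode-path` and the shape of its result are assumptions made for illustration, and "!" is added to the delimiter set so that both prefixes are stripped the same way:

```red
; Hypothetical helper, not from the chat: returns the prefix character
; (#"#", #"!" or none) followed by the block of path elements.
decode-path: func [path [string!] /local prefix parts] [
    prefix: first path
    ; anything other than "#" or "!" is an ordinary first character
    unless find "#!" prefix [prefix: none]
    parts: split path charset "#!:/"
    ; a leading delimiter leaves an empty first element, so drop it
    if empty? first parts [remove parts]
    reduce [prefix parts]
]

probe decode-path "#theserver:theDB/thetable/thefield"
probe decode-path "theDB/thetable/thefield"
```

This does not enforce the "only one `:` after the first word" constraint; that check would still need Parse or an extra test.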

GiuseppeChillemi

20:42Pl

20:42Qqp

20:42QZ

9214

20:43Quit this spamming please.

20:45If you still insist on using Parse, here's a less convoluted version:

chars:     charset [#"A" - #"Z" #"a" - #"z" #"0" - #"9"]
element:   [keep some chars]
prefix:    [keep ["#" | "!"] element ":"]
delimiter: ["/" | end]
path:      [some [element delimiter]]

foreach test [
    "#this:that/end"
    "!this:that/end"
    "this/that/end"
][
    probe new-line/all parse test [collect [opt prefix path]] off
]

GiuseppeChillemi

20:46> Quit this spamming please.

Just standing up from bed with the phone open on gitter in my hand... hit the screen without noticing it.

20:47I'll check it after dinner, thanks.

20:48However: why didn't this "copy" work in my code?

copy bword some [some caratteri-path  (print ["bword" bword]) | 1 "/"]

I need to learn...

9214

20:51

some [copy bword some caratteri-path  (print ["bword" bword]) | 1 "/"]

20:55You're trying to copy the rest of the path instead of copying each element separately, which never happens because your rule inevitably fails - there's no matching "/" at the very end. My rule fails for the same reason, so path is never actually parsed. But each element of path is matched successfully, so copy and print work as expected for the rest of the input.
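
A small sketch of the timing issue, assuming (as in Rebol) that `copy` assigns its word only after the copied rule has finished matching. The names `seen` and `chars` are introduced here for illustration:

```red
seen: copy []
bword: none

; The paren runs on every iteration, while the enclosing `copy` is still in progress,
; so it only ever observes the value bword had before parsing started.
parse "abc" [copy bword some [skip (append seen bword)]]
probe seen      ; values of bword observed inside the copied rule
probe bword     ; value once the whole copied rule has matched

; With `copy` inside the loop, the word is set before the paren runs.
chars: charset [#"A" - #"Z" #"a" - #"z" #"0" - #"9"]
parse "theDB/thetable" [some [copy bword some chars (print ["bword" bword]) | "/"]]
```

In the original rule `bword` had never been set at all, which would explain the "bword has no value" error raised by `print`.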

GiuseppeChillemi

21:30So, copy happens only if the rule is successful in every occurrence? As it would fail only in the last part of the path, when END is encountered

21:32No, it seems I am wrong

21:47@9214

I have added "end", still with no success

some [some caratteri-path  (print ["bword" bword]) | 1 "/" | end]

I supposed:

some caratteri-path   (print ["bword" bword])

Would match at least one character in the bitset or a complete word until something different is found. If at least one character is found then "copy bword" is triggered and print executed.

If no bitset character is found then

1 "/"

checks for "/"

If it is not found

"end"

is checked and if we are at the end the rule succeeds

The starting

some [c

Instructs parse that the whole rule should match at least once, and this happens because "end" is always encountered

21:49Found the problem as you pointed

21:49

some [ copy bword some caratteri-path  (print ["bword" bword]) | 1 "/" | end]

21:50Copy now copies when a word is encountered

21:58@9214

This is a working version, need to refine it:

aword: copy []
	parse the-path [
		copy first-char
		0 1
		[ "!" | "#" ] (append aword first-char)
		copy first-word some caratteri-path (print ["First Word" first-word] (append aword first-word))
		any [1 [ "/" | ":" ]]
		some [ copy bword some caratteri-path  (print ["bword" bword] append aword bword) | 1 "/" (append aword "/")| end]
		(probe aword)
	]

First Word theserver
bword theDB
bword thetable
bword thefield
["#" "theserver" "theDB" "/" "thetable" "/" "thefield"]
>

The last modification needed is accepting a path starting with a character from the caratteri-path bitset, and I am done.

21:59Oooops.... missing ":"

22:09

parse the-path [
		0 1
		[copy first-char "!" (append aword first-char) | copy first-char "#" (append aword first-char)] 
		copy first-word some caratteri-path (print ["First Word" first-word] (append aword first-word))
		any  [copy bword [1 [ "/" | ":" ]] (append aword bword)]
		some [ copy bword some caratteri-path  (print ["bword" bword] append aword bword) | 1 "/" (append aword "/")| end]
		(probe aword)
	]

This works !

22:10Need to be cleaner but concept understood !

Thursday 10th January, 2019

GiuseppeChillemi: 03:55@9214 a last note: I see parse collect is different from the standard one. I am learning how it works to understand your example.
gltewalt: 06:01This might be of some help. Under the Extraction section.
https://www.red-lang.org/2013/11/041-introducing-parse.html
GiuseppeChillemi: 08:16@gltewalt Just reading
08:17However, I am feeling more comfortable with Red parse. I do not know why...

Sunday 24th February, 2019

vazub

17:05Hi guys! A layman's question here. Does parse support binary! input to iterate through it byte by byte? I am currently struggling to grok it by means of parsing a file, read as binary, and it doesn't seem to work as I would expect it to. From the looks of it, you can match the input only as a whole, not iterate through it. Or am I mistaken here?

toomasv

17:09@vazub I never used it but seems to work:

>> parse #{00100101} [some [s: (probe s/1) skip]]
0
16
1
1
== true

>> rejoin parse #{497420776F726B7321} [collect some [s: keep (to-char s/1) skip]]
== "It works!"

9214

17:35@vazub Parse has supported binary! input for a while, so perhaps you're doing something wrong. What binary format are you trying to parse? You don't need Parse to iterate through a series, though.

>> get also 'match parse #{AA AA DEADBEEF FF FF} [2 #{AA} thru #{DEAD} copy match 2 skip to end]
== #{BEEF}

vazub

18:06Thanks @toomasv and @9214 . I initially failed to acknowledge, that parse returns false, even if the match is found before the end of the input, due to rule failure afterwards. I was expecting true on match and therefore assumed something was not working as expected. Using parse-trace helped me to get better understanding of how parse works.
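
A short sketch of the return-value behaviour described above. The data is a made-up stand-in for the result of reading a file as binary, and the signature is the first four bytes of the PNG magic number, used here only as an example:

```red
signature: #{89504E47}                 ; first four bytes of a PNG file
data: #{89504E470D0A1A0A0000000D}      ; stand-in for the bytes of a real file

prefix-only: parse data [signature]          ; rule matches, but input is not exhausted
whole-input: parse data [signature to end]   ; rule matches and reaches the end

probe prefix-only
probe whole-input
```

Parse reports success only when the rules also consume the input up to its end, so a trailing `to end` turns a prefix check into a successful parse.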
@9214 I am trying to read a file's "magic number" and compare it to a known signature. Your example does seem to cater to similar case, thanks!

Monday 25th February, 2019

vazub

19:09Continuing my previous question: is there a way to set the number for rule iteration dynamically from the word set by one of the previous parse rules? I have a code like this:

print parse some-bin-file [
    collect set x [n: keep (to integer! to binary! reverse reduce [n/1 n/2]) 2 skip]
    x collect [o: keep (to integer! to binary! reverse reduce [o/1 o/2 o/3 o/4]) 4 skip]
]

What I essentially am looking for is to make sure the second rule is applied x number of times. However, this code throws an error:

*** Script Error: PARSE - invalid rule or usage of rule:

hiiamboris

19:14

>> parse to-binary "aaaaa" [(x: 3) collect [x keep skip]]
== [97 97 97]

9214

19:39

>> parse mold quote ((( o ))) [copy match some "(" (many: length? match) skip many ")"]
== true
>> parse [3 a b c][set count integer! count word!]
== true
>> parse [3 word! a b c 2 refinement! /d /e][some [set count integer! set type! word! (type!: get type!) count type!]]
== true

19:48@vazub bear in mind that set takes only the first value from the matched portion of the input, while copy takes the whole portion from start to end. So, if your length is specified by 2 bytes, you need to use copy, because set takes only the first byte (as integer! value).

19:51

>> binary: append/dup #{ABCD} 0FFh 0ABCDh ; 43981 number of FFh
== #{
ABCDFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFF...
>> parse binary [copy count 2 skip (count: to integer! count) count #{FF}]
== true
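
Putting the pieces together for the original question, here is a hedged sketch. The layout (a 2-byte little-endian count followed by that many 4-byte little-endian integers) and the sample bytes are assumptions based on the `reverse` calls in the question, not a known file format:

```red
; hypothetical input: count = 2, then the values 1 and 2, all little-endian
data: #{0200 01000000 02000000}

ok?: parse data [
    ; copy both count bytes, then convert them after fixing the byte order
    copy n 2 skip (count: to integer! reverse n)
    ; the integer held by `count` is used as the repetition count
    collect set values [count [copy v 4 skip keep (to integer! reverse v)]]
]

probe ok?
probe values
```

The count has to be assigned inside a paren before the rule that uses it is reached, since Parse looks the word up only when it gets to it.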

Tuesday 26th February, 2019

vazub: 06:29Thanks for the info! Much appreciated!

Saturday 9th March, 2019

toomasv

09:16Why doesn't Red parse read input through the end?
R2 and R3:

>> parse "abc" [some [end (probe "end") break | s: (probe s) skip]]
"abc"
"bc"
"c"
"end"
== true

Red (latest and earlier ones also):

>> parse "abc" [some [end (probe "end") break | s: (probe s) skip]]
"abc"
"bc"
"c"
== true

There is a workaround:

>> parse "abc" [some [s: (probe s) skip [end (probe "end") | none]]]
"abc"
"bc"
"c"
"end"
== true

.. but it is cumbersome to use if an end-check is needed, e.g. for inserting something at the end, when there are many paths to reach the end depending on input.

hiiamboris

10:03is it really a problem? why can't you unite the *many paths* into one rule and place an end check after?

toomasv

10:06I can. Never mind. Just additional quirk to remember.

hiiamboris

10:07Could be an optimization. Esp. makes sense for tiny blocks which are many..

toomasv

10:11Probably needs to be added to "Differences...". Also a bit ugly, because [end ... | none] or [end ...|] is needed, otherwise will fail.

hiiamboris

10:22why not parse "abc" [some [s: (probe s) skip] end (probe "end")]

10:23and [rule |] is opt rule

10:27you're writing a rule that should work in both Red and Rebol?

toomasv

10:34Ah, good. Somehow missed that :blush:

Monday 11th March, 2019

greggirwin: 03:52The main thing is to know whether it's by design, or unintentional.

Tuesday 2nd April, 2019

giesse

02:21anybody has any example usage for the then keyword? R3 examples are fine, just really hard to search on Google.

ne1uno

03:01https://github.com/revault/rebol-wiki/wiki/Parse-Project

giesse

06:40hmm, I see

06:40thanks.

rebolek

06:43The example seems strange:

any [[#"a" | #"b"] then [end skip] | skip]

9214

09:33@giesse then in Red is broken IIRC, since no one could really understand its semantics.

pekr

09:49Not sure the old R3 parse project / proposal helps, but: https://github.com/gchiu/rebol.net/blob/master/wikipedia/Parse%20Project.wiki#THEN

09:51It would be probably really better being called either, but most probably I am missing something ....

rebolek

10:37I don't see how then is different from normal parsing:

parse "ab" [#"a" then #"b" | #"c"]

works same as

parse "ab" [#"a" #"b" | #"c"]

giesse

18:25@rebolek I think you need three alternatives at least for then to work.
@9214 I vote that it should be removed rather than fixed.

9214

18:27@giesse I might be wrong, but then, as it is described in Rebol docs, might be a by-product of recursive-descent Parse implementation that R3 uses.

18:28https://gitter.im/red/help?at=59d205da7b7d98d30d606143

giesse

18:31I doubt that it has anything to do with the implementation details. Wouldn't you say that topaz-parse is recursive-descent as well? (Not that I see how implementing it as a state machine makes any difference - it's just a matter of whether you use the stack implicitly or explicitly.)

9214

18:36I'm in favor of then removal, but jumping over alternate rule that it makes (or supposed to make) is a nice feature.

giesse

18:41I'm confident that any real-world examples can be reworked into something that is more readable and understandable than then. :P

rebolek

18:43@giesse if you read the docs, it says *if rule1 then rule2 or rule3* which to me looks like rule1 then rule2 | rule3, but I may be of course wrong, the docs are not very clear.

18:45The dynamic rule described there is something I was using very often in Lest, but I'm not sure how to translate it to then. I have to look if I ever used it.

18:47Nope, Lest doesn't use it.

9214

18:49

[rule1 (cont: rule2) | (cont: rule3)] cont

[rule1 rule2 | rule3]

These two are equivalent, no? If rule1 matched, match rule2. If rule1 failed, backtrack and match rule3.

rebolek

18:50Right, I think it's bad example.

9214

18:52Rebol wiki refers to GTDPL with the following remark:
> In GTDPL, in place of TDPL's recursive rule form A ← BC/D, we instead use the alternate rule form A ← B[C,D], which is interpreted as follows. When nonterminal A is invoked on some input string, it first recursively invokes B. If B succeeds, then A subsequently invokes C on the remainder of the input left unconsumed by B, and returns the result of C to the original caller. If B fails, on the other hand, then A invokes D on the original input string, and passes the result back to the caller.

> The important difference between this rule form and the A ← BC/D rule form used in TDPL is that C and D are never both invoked in the same call to A: that is, the GTDPL rule acts more like a "pure" if/then/else construct using B as the condition.

18:53That's why I've mentioned recursive-descent R3 implementation. then tries to mimic A ← B[C,D] rule, which is different from A ← BC/D only in some obscure detail related to recursive call.

giesse

19:07@rebolek I believe the idea is: [rule1 then rule2 | rule3 | rule4]
* if rule1 fails, backtracks to rule3
* if rule1 succeeds, but rule2 fails, backtracks to rule4 instead of rule3
* if both rule1 and rule2 succeed, then of course it does not backtrack

19:08I doubt I will even remember this one year from now... so I don't think it's a good name at the very least.

19:09on a separate topic, another bug:

>> parse ['|] ['|]
== true

9214

19:14@giesse https://github.com/red/red/issues/3554 ?

giesse

19:14indeed.

19:14good to see it's already there :)

19:15luckily I can work around it with ahead

rebolek

19:18@giesse Ah! It's an interesting construct.

19:19With very detailed help, someone may actually find it useful.

giesse

19:35I would need to see a real use case. And I bet $1 I could come up with a better solution for it. (One that doesn't have us parse old-timers stumble to figure it out.)

rebolek

19:40Maybe it can be useful in some rules generator? But I'm not going to bet on it, there are better things I can spend $1 on :smile:

giesse

19:46I'm not saying that it *can't* be useful. I'm saying that there's gotta be a better way to solve the problem.

20:44Updated topaz-parse in preparation for much bigger changes to come: https://github.com/giesse/red-topaz-parse

Wednesday 3rd April, 2019

toomasv

07:16Here's how then actually works on W10:

parse [x y d 1][['x 'y then word! "comment here" | (probe "2") 'k 'l 'm | (probe "3") 'o 'p] then integer! "and final remarks"]
;== true

parse [k l m 1][['x 'y then word! 'comment 'here | (probe "2") 'k 'l 'm | (probe "3") 'o 'p] then integer! "and final remarks"]
;"2"
;== true

parse [o p 1][['x 'y then word! comment here | (probe "2") 'k 'l 'm | (probe "3") 'o 'p] then integer! "and final remarks"]
;"2"
;"3"
;== true

parse [x y ta-daa! 42][['x 'y then 'x (do something) | (probe "2") 'k 'l 'm | (probe "3") 'o 'p] then integer! "and final remarks"]
;"2"
;"3"
;== false

parse [x y ta-daa! 42][['x 'y then word! and whatever else | (probe "2") 'k 'l 'm | (probe "3") 'o 'p] then integer! "and final remarks"]
;== true

That is in case of rule1 then rule2 rule3 | rule4 if either rule1 or rule2 fails, rule4 is tried; rule3 is always ignored.
This effectively means that after then rule2 you may write anything that is loadable. Seems usable for in-rule comments only?

rebolek

07:50@toomasv that doesn't make any sense.

toomasv

07:52I don't claim it makes sense. I report how it seems to work currently on W10.

rebolek

07:52@toomasv yes, that's not pointed at you, but at then

07:53then : regardless of failure or success of what follows, skip the next alternate rule.

07:54what follows word! in your examples is not alternate rule.

toomasv

07:55Yes, its working doesn't match description.

rebolek

07:56and what does it mean anyway? If I want to ignore next alternate rule, regardless of failure or success, I simply wouldn't write such rule.

toomasv

08:00I understand it means what @giesse described, i.e. in case of rule1 then rule2 | rule3 | rule4, in case rule1 fails rule3 is tried, in case rule1 succeeds but rule2 fails, rule rule3 is skipped and rule4 is tried.

But also, shouldn't it mean that in case of r1 then r2 r3 | r4 | r5, if both r1 and r2 succeed but r3 fails, r4 is skipped?

rebolek

08:05From your examples it seems that r3 is never checked.

toomasv

08:06Currently yes, and next alternate rule is never skipped.

hiiamboris

12:17then in R3:

>> parse "ac" ["a" "q" | "ac"]
== true
>> parse "ac" ["a" then "q" | "ac"]
== false

(in Red both are true)

12:21And please count my vote for removal of this abomination. The rule it's supposed to replace ([rule1 (cont: rule2) | (cont: rule3)] cont) is way cleaner and easier to follow than then.
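
For comparison, a sketch of that dynamic-rule idiom applied to the "ac" example above. It relies on Parse not backtracking into a group that has already succeeded, and the result names are made up for illustration:

```red
; plain alternatives: "a" "q" fails, Parse backtracks and tries "ac"
plain: parse "ac" ["a" "q" | "ac"]

; committed choice: once "a" has matched, only ["q"] may follow
committed: parse "ac" [["a" (cont: ["q"]) | (cont: ["ac"])] cont]

; when the first rule fails, the other continuation is selected
fallback: parse "ac" [["b" (cont: ["q"]) | (cont: ["ac"])] cont]

probe plain
probe committed
probe fallback
```

The second call should give the same answer as R3's `then` version (false), without needing the keyword.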

rebolek

12:24dynamic rules ftw!

9214

12:28> removal of this abomination

[Kill meh... Kill meh!](https://www.youtube.com/watch?v=SBzDU9dfpGQ)

hiiamboris

12:38:D

giesse

17:55@hiiamboris or at least, it should be written as [either rule1 [rule2] [rule3]], or perhaps either rule1 [rule2 | rule3], though I find that just needlessly confusing.

17:56But in my experience, dynamic rules like that are the result of some other missing feature in parse, like ahead or not which didn't exist in R2.

hiiamboris

18:21> But in my experience, dynamic rules like that are the result of some other missing feature in parse, like ahead or not which didn't exist in R2.

I believe so as well. Although ahead has a bit of a [performance] problem: double matching.

Thursday 4th April, 2019

greggirwin

19:33After reading this chat, I vote for removal of then as well.

19:34Would someone summarize and point to this chat and create a ticket for that request? Thanks.

9214

19:36:hammer:

19:51@giesse while we are at it: how function's body can be used in Parse rules if there's a need for parameter substitution?
e.g. in foo: func [x] [thru x keep to x x], x either needs to be bound to a context (either object! or function's call frame which you need to explicitly keep intact) or substituted for a literal argument. Or you need to mimic call stack and function call by yourself, doing all associated bookkeeping.

19:52I also think that it would be nice to make internal Parse engine state directly accessible from rules. Right now this is possible only with parse/trace and some black magick trickery.

giesse

19:57Correct, in Topaz all the function call machinery was already available to me to tap into. With topaz-parse I'd have to do it myself.

9214

19:58That's cool! Direct stack access in Red is a wet dream of mine since a while.

giesse

19:59But also, since I am compiling the parse dialect anyway, the compiler could directly substitute x for its value. So long as it's not a paren or something like that.

19:59ie. [foo 'c] -> [keep to 'c 'c]

9214

19:59Macros, sort of?

rebolek

20:01@greggirwin I'm not against then, I just want it to make sense.

greggirwin

20:06Given the examples and chat, it seems more confusing (hence risky) than useful. It doesn't give us any new capabilities, just a different way, and not a better one it seems.

rebolek

20:08Current functionality doesn't make any sense, I believe all can agree on that. However then as a keyword can be useful. I'm not sure for what right now, but it certainly can.

moliad

20:08I'm surprised it's still in the official distro, Nenad and I had concluded that then is a redundant operator in PARSE since it's the default operator.

20:08it only makes sense in the context of a PEG spec parser with its set of operators.

20:10IIRC it should have been removed a long time ago (like 2 years or so)

rebolek

20:21@moliad is there an issue on Github tracking it?

giesse

20:29@9214 right, though, it can be smart enough not to evaluate the argument multiple times when it matters (I need to think about a real example though)

9214

20:38I don't think that creating dedicated datatype (which almost 100% mimicks function!) for Parse rules is really worth it.

20:38Same argument can be made for regular Parse rules, they are *just* blocks with some values in them.

20:40Anyone who gets confused by foo: func [x] [thru x keep to x x] would be equally confused by view [...] or draw [...] IMO, or any other dialect.

nedzadarek

20:47> Anyone who gets confused about foo: func [x] [thru x keep to x x] would be equally confused about view or draw IMO.

With view or draw you're using commands inside the dialect. If foo: func [x] [thru x keep to x x] can be typed outside parse then it can be confusing.

rebolek

20:48@nedzadarek if you can write for example parse/with blabla bleble [foo], then it's not so confusing.

9214

20:49Another point to consider are possible side-effects that called function can produce. I don't think it's really critical though, assuming that caller knows what he's doing.

rebolek

20:49As if Red ever cared about side effects.

9214

20:49Right, right.

nedzadarek

20:52@rebolek yes, something like parse/with [source] [rules] [function1: func [...] [...] function2: func [...][...]] wouldn't be a problem.

20:53> As if Red ever cared about side effects.

That's why I have function constructor that makes probably almost all things local (even x in set x). I have built it especially for parse.

9214

20:55I wonder if there's any gain in keeping such rules as run-time constructs (i.e. function!), because they can be just macros. @giesse does Topaz has anything else interesting in this regard?

giesse

20:55Ok, since there seems to be enough interest in this topic...

I've spent like a decade thinking about it (on and off, obviously, but well). There are many ways to do it and in the end it's about which compromise we like the most.

Most obvious thing is to just reuse the function! type just like we are just using block! for rules, as @rebolek says. There is no technical problem with it, it's just a matter of whether you care about possible confusion. Admittedly, most of what we do is confusing to newcomers anyway.

The benefit of a separate type is just that it would not let you call it outside of parse, as well as not letting you accidentally use a regular function inside parse (both would lead to error messages that may not be very descriptive of what the real problem is, not to mention that both may just accidentally not error out but work in some way unexpected to the programmer which then has to figure out where the bug is). This can also be solved with a flag in the function spec, and as a way to do it right now, with a refinement like @nedzadarek suggests.

20:56@9214 in Topaz I was thinking about function! because parse is an interpreter like in Red. But, topaz-parse being a compiler to parse right now, they can be just "macros" indeed.

20:57Another way to solve this problem is to let people define them *within* the dialect itself. Similarly to how @rebolek suggests /with, we could also do:

parse [...] [
    foo: rule [x] [keep to x x]
    ...  foo 'c ...
]

9214

20:58> The benefit of a separate type is just that it would not let you call it outside of parse

Mhm, but regular function still can be called from paren! expression.

20:58And we can construct dynamic rules that way, with a bit of trickery.

20:59(like some then example above demonstrates)

giesse

20:59This way they are contained within parse and most of the confusion goes away. But, of course, if you have multiple rule blocks, where do those definitions appear? Are they global or local? Do we want /with as @rebolek suggests? etc.

20:59@9214 If called from a paren it would still be called by the do interpreter and thus cause an error.

9214

21:00@giesse I mean function!, not theoretical parse-function! or whatnot.

21:01parse-function! can be called from rule but not from paren! expression, and the reverse holds for function! (any-function! in fact).

giesse

21:01As I said, this is all a matter of different compromises. Since we don't have the benefit of a few years of using this in practice, we can't say which compromise could end up being bad in practice.

21:01@9214 for function! right now I would probably use a refinement like @nedzadarek suggested, so that it's unlikely that it would be called accidentally from the wrong "context".

21:02Anyway, I'm thinking that I'll up the priority of this in topaz-parse quite a lot, so perhaps we can play with the idea and see if it works in practice or not?

9214

21:04@giesse what would be interesting is to be able to check somehow if Parse's call frame is on a stack during such function! calls. If it isn't there then function! simply bails out. This call frame provides enough "contextual" info, and there's no need for crude hacks.

21:05Although such introspection can be viewed as a crude hack itself.

21:05But, like you said, it's all about trade-offs.

giesse

21:08Some level of access to the internals is always cool... but the other side of the coin is that people will end up doing all sort of crazy stuff with it. :)

Alternatively making it easier to change the interpreter itself is a solution. That's why it was so important to me that Topaz be written in Topaz and not JS or even a lower level dialect (which was my initial plan, similarly to Red/System...)

9214

21:09> people will end up doing all sort of crazy stuff with it. :)

But that's the whole point! :neckbeard:

21:10> Alternatively making it easier to change the interpreter itself is a solution

Yes, tossing around different evaluation strategies (laziness and Scheme amb come to mind) or switching interpreter altogether.

nedzadarek

21:12@giesse
> flag in the function spec

It reminds me about [Rebol's attributes](http://rebol.com/docs/core23/rebolcore-9.html#section-7). I haven't used them but I think it would require some changes in the Red itself.

giesse

21:13@nedzadarek correct.

rebolek

21:49I must leave discussion for few minutes and how many interesting ideas! I believe (and can prove that with code), that basic rule with args can be implemented without going to R/S. Would it be cooler and faster? Certainly yes!

greggirwin

23:09Access to internals is a double-edged sword. It's better if you don't *have to* handle the edges (as R2 forced with View, before accessors were added, and even after that). "The goal of a func-rule! type is ..."
Let's start with that.

Friday 5th April, 2019

greggirwin: 00:24Finally caught up in red/help and see that rule args are, as expected, the driving force.
giesse: 05:22@greggirwin I'll add some level of support for this in topaz-parse so we can have a concrete basis for discussion.
moliad: 13:23@rebolek I have no idea. It was something we deduced when we were looking at the code of parse face to face. I guess it's something that Nenad might have forgotten to track; it was just a simple observation within a day-long meeting.
13:23(above is in reference to removing then)

Saturday 6th April, 2019

giesse

20:21There's a pattern I have to deal with all the time in various forms, an example being:

some-word: [...]
parse [some-word] [block! | set x word! if (block? get/any x)]

In other words, I'm expecting a block! value, but a word referring to a block! value should be accepted too.
This happens all the time when parsing parse rules, especially because block! above, for eg., it's not a datatype! but a word referring to a datatype!. So if I write ['into opt datatype! block!] it doesn't work in practice, it needs to be ['into opt [datatype! | set x word! if (datatype? get x)] block!], and so on. In fact, almost everywhere I'm matching a value directly that's not a keyword, I want to do word lookup.
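
The pattern can be sketched as a reusable rule for the Red console. The names `block-or-ref` and `not-a-block` are assumptions made for illustration:

```red
some-word: [a b c]
not-a-block: 42

; hypothetical name: accepts a block, or a word whose value is a block
block-or-ref: [block! | set x word! if (block? get/any x)]

literal:  parse [[1 2 3]] block-or-ref      ; literal block
indirect: parse [some-word] block-or-ref    ; word referring to a block
rejected: parse [not-a-block] block-or-ref  ; word referring to an integer

probe literal
probe indirect
probe rejected
```

Every value kind that needs this indirection requires its own `set ... if (...)` pair, which is the repetition the question is about.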

How should this pattern be expressed in parse?

20:24I'm thinking of something like [get block!] which will match a word! value so long as it refers to a block! value. In general, [get rule] would match a word! as long as the value it refers to matches rule; but the generalization gets tricky because rule could match more than one value, and so on. So perhaps it should only allow datatypes and typesets.

20:25Another possibility is something like [into word! rule], but I don't know if that is confusing.

9214

20:26@giesse it sounds like a "semantic" rule to me - you don't want to match by the looks of the value but by its properties.

20:27e.g. integer! > 5 or something like that would match all integers greater than 5.

giesse

20:27Another possibility is to use a path like word!/block! to indicate a word that refers to a block. This would allow something like [into word!/block! rule] to mean [into word! [into block! rule]], but that sounds like I'm going too far. :)

20:28@9214 I think that [set x integer! if (x > 5)] is a good enough solution for that.
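
As a quick illustration of that rule (the rule name `big?` is made up here):

```red
; matches a single integer, then checks it with a regular Red expression
big?: [set x integer! if (x > 5)]

above: parse [7] big?
below: parse [3] big?

probe above
probe below
```

`if` fails the rule when its expression evaluates to false or none, so the second call does not match.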

9214

20:28@giesse but it's essentially the same as your case, no?

giesse

20:28but with words it is so pervasive that I feel like having built in support would help a lot.

20:30I haven't really had to use if (x > 5) ever, but I have to lookup words basically every time I encounter one when compiling a topaz-parse rule.

20:30the dialects I use in %ast-tools.red also have to do the same thing

20:30and, one of my earliest proposals for parse in R2 was the do keyword that solved a generalization of this problem.

20:31See http://www.colellachiara.com/soft/PDFM2/compile-rules.r for eg.

9214

20:31I believe it's pervasive only when you work with Parse rules themselves. And it can be supported as a part of some larger feature-set, like these "semantic rules" or your parse-function! idea.

giesse

20:31I don't recall writing a dialect where I did *not* want this feature. :P

9214

20:31> generalization of this problem

Yes, exactly. We need a general mechanism, not case-by-case little keywords.

giesse

20:32do does full evaluation, and it cannot back-track. I love it because you don't need compose on dialects anymore, but it seems that most people don't feel that way.

20:33looking up words is pervasive, checking arbitrary expressions is rare enough that if is a good general mechanism for it.

20:33but, I guess it's only pervasive for me - good to know :)

9214

20:34Dunno, I might be wrong of course. Let's see what others have to say.

giesse

21:10Is this supposed to be an infinite loop?

parse ['collect | 'copy | 'object] [any [not '| skip | (x: none)]]

21:11Note that this is not:

>> parse [| 'copy | 'object] [any [not '| skip | (x: none)]]
== false

9214

21:22@giesse ?

>> parse ['collect | 'copy | 'object] [any [not '| skip | (x: none) skip]]
== true

21:22The latter might be infinite loop check kicking-in or any subtlety.

giesse

21:32my point is that it should stop when input doesn't advance, otherwise there's no point in having while. :)

21:32I can work around it with | (x: none) fail

21:34But perhaps it's not a bug and rules should be written to not always match.

Sunday 7th April, 2019

toomasv: 05:51@giesse How about block? word!, more generally [ | ], being rule that looks at next value and if it is word queries its reference value type (datatype! or typeset!)?
greggirwin: 20:06I see @giesse and @9214's targets as quite different, though related. Syntactic matching and evaluation/constraints are very different. Matching a word that refers to another type is a step of indirection, and nothing more. I've hit this same need myself. You can avoid it with basic dialects, but as soon as we want more expressive flexibility, it would be very nice to have.

We don't have a get-datatype! syntax (:block!), but if we agree on a ref! type (@ref syntax), that could be pressed into service in parse when it's added.
9214: 20:23@greggirwin where should ticket about then go, in REP or in main repo?
greggirwin: 20:27Main repo I think.

Monday 8th April, 2019

giesse: 00:32@greggirwin actually... we do have :block!, because in practice you'll never find an actual datatype in a rule, but a word like block! instead; so using a get-word to indicate a level of indirection might work.
00:34the only question is whether it is readable enough. get block! vs :block! vs not sure what else (sorry @toomasv I don't think I like block? word!, especially because it would force us to keyword-match all the possibilities like block? and string? and so on and so forth.)
greggirwin: 04:35One concern over the get-word! syntax is that normal get-words already have a purpose, and a very different one. Might still be OK though.
giesse: 06:32Not in topaz-parse though, at least not yet :) But, again, I don't know what this feature should look like. I'm using get right now, just hoping to find something better.
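
A minimal sketch of the indirection discussed above, using only stock Red parse: match a word in the input, then use if to check what that word refers to. The names word-to-block, w, data and total are illustrative assumptions, not part of topaz-parse or of any proposal in this thread.

```red
Red []

; Hypothetical rule: succeed on a word whose value is a block.
; GET/ANY avoids an error when the word has no value.
word-to-block: [set w word! if (block? get/any w)]

data:  [1 2 3]
total: 42

r1: parse [data]  word-to-block   ; `data` refers to a block
r2: parse [total] word-to-block   ; `total` refers to an integer
probe reduce [r1 r2]
```

A dedicated form such as get block! or :block! would replace the set/if pair and would not need the helper word w.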
9214: 08:07https://github.com/red/red/issues/3843

Tuesday 9th April, 2019

giesse

22:30I have an implementation of rule functions (ie. rules with arguments and locals): https://github.com/giesse/red-topaz-parse

There are a number of limitations (mostly due to the compilation process), but it's a start. If you guys like it and have real use cases, we can work on fixing any issues.

22:34

--== Red 0.6.4 ==-- 
Type HELP for starting information. 

>> do %topaz-parse.red

>> foo: rule [a b] [a ("A matched") | b ("B matched") | ("Neither matched")]
>> topaz-parse [1 2 3] [foo word! integer!]
== "B matched"
>> topaz-parse [a b c] [foo word! integer!]
== "A matched"
>> topaz-parse [a b c] [foo block! paren!]
== "Neither matched"

>> foo: rule [n [integer!]] [n integer!]
>> topaz-parse [1 2 3] [foo "String"]
*** Script Error: foo does not allow string! for its n argument
*** Where: do
*** Stack: topaz-parse cause-error  
>> topaz-parse [1 2 3] [foo 3]
== 3

>> foo: rule [] [n: integer! (n * 3)]
>> topaz-parse [4] [foo]
== 12
>> n
*** Script Error: n has no value
*** Where: catch
*** Stack:

Wednesday 10th April, 2019

nedzadarek

12:13@giesse my only problem is with the syntax (readability). With the current form there is no way to know without reading source(s) if you mean:
- foo (normal rule/block or 0-argument rule), a word! and an integer!
- foo rule that takes 1 argument (a word!) and an integer! or
- foo rule that takes 2 arguments - a word! and an integer!

PS. about topaz-parse itself:
end should match whole thing as per readme:
> The parsing succeeds if the rules match, even if they don't consume all the input. To achieve the same behavior as parse, just add end to your rules appropriately.

It works with rules not in a block:

topaz-parse [a b c] ['a word! end]
; none

However it does not work when I put things into a block:

foo: ['a] topaz-parse [a b c] [foo word!  end]
; a

giesse

18:45@nedzadarek the same argument is valid for Red functions.

18:46Thanks for the bug report, I'll fix it.

nedzadarek

20:51@giesse You are right.
With the current implementation: is it possible to make a rule that takes a block!/paren! and returns e.g. a first and a second element (something like r: rule [bl] [bl/1 bl/2] - topaz-parse [a a] [r [word! word!]] => equivalent of 2-argument rule: r2: rule [a b] [a b] - topaz-parse [a a] [r2 word! word!])

I thought it was my error. Well, thank you.

giesse

22:03You can see it's clearly a bug from this:

>> topaz-parse [a b] [word! word!]
== b
>> foo: [word!] topaz-parse [a b] [foo word!]
== a

22:06Using paths is possible, but there are a couple issues:
* Red parse does not handle paths, so the compiler has to work around it
* The compiler won't know at compile time what bl/1 and bl/2 are, so it can only guess. This will make it fail if you try to use this for anything more complicated than just your example above. It may be possible to get most cases to work with enough static analysis, but that's a lot of work; maybe one day. :)

nedzadarek

22:07@giesse I see, thank you.

giesse

22:08The whole point of this experiment so far is to figure out if this can be indeed useful enough to be worth extra effort. So if you can come up with real use cases and how it helps or fails, that can inform the design.

22:09I should also mention, another option for things like your bl/1 example is to have the compiler generate code for all possible cases and then select at runtime, or, just spit out the code for an interpreter, so when something can't be determined at compile time it just gets interpreted at runtime.

22:11I do want to do both things eventually - analyze as deeply as possible to figure out optimizations, and fall back to an interpreter if everything else fails; but it's going to take a while to get there I think.

Thursday 11th April, 2019

dander

06:48I was working on some parsing code that might be a helpful use case. The purpose was to strip out some extraneous sections of xml to make it smaller. In this example, the key is that several different elements are identified by the attribute enabled="false". Here are some relevant parts:

close-element: function [name][
    rejoin ["</" name ">"]
]

disabled-element: function [name][
    compose/deep [
        "<" (:name) to [{enabled=} | ">" | "/>" ]
        {enabled="false"} to [">" | "/>"]
        ["/>" | ">" thru (close-element :name)]
    ]
]

I generated some rules from these

disabled-filter: disabled-element "Filter"
disabled-stream: disabled-element "Stream"
disabled-component: disabled-element "Component"

then remove them while walking the document

parse action [some [
    remove [
        manual
        | disabled-filter
        | disabled-stream
        | disabled-component
        | param-option
    ]
    | skip]
]

I haven't tried porting it to the topaz method yet. It looks like it would clean things up a bit without changing them too significantly. Still, any improvement in clarity is nice to have. I'm fairly new to parse, and getting the to/thru behavior right was the really tricky part for me.

Incidentally, I'm also interested in if there is some better way of approaching this.

giesse

18:30I haven't worked on string parsing yet, and I'm not a fan of changing the input in place, but yes, this could be a real use case for rule arguments.

I think I would address this particular problem by loading the whole xml into some internal representation, then manipulating that, then outputting back xml; it's overkill for your simple case of wanting to remove a few elements, but most of the time it can be really useful.

dander

20:14Yeah, I figure eventually there will be a built-in xml processor at some point, and I would probably go that route if it were available. It would certainly be less error prone than my approach

nedzadarek

20:55@giesse
> So if you can come up with real use cases and how it helps or fails, that can inform the design.

I have found 2 small cases:
1) [ahead block! into return-parse-rule] => so I would like to type this: into-block: rule [...] [...] [into-block return-parse-rule]

r: rule [some-rule] [ahead block! into some-rule]
rule1: ['b (print "found b")] 
topaz-parse [a [b] ] ['a r rule1]
; found b
; == b

2) set refinement refinement! (append refinements to-word refinement) => name refinement is not really important. What is important is that I append matched type to some collection. I would like to do something like this:

create-name: does [to-word append copy "temp" random 9999]  ; copy, so the literal string does not grow on every call
r: rule [type collection ] [set (c: create-name) type (append collection c)]
arr: copy [] 
topaz-parse [42] [r integer! arr]

after I run this code arr should have 42 in it.
PS. is this doable with the current rule?

9214

21:09@giesse I still don't like your idea with function's body being a Parse rule. It's too limited and requires a custom rule function. And, no, I don't buy an argument about it being confusing - either way it's just a function that takes some arguments and returns a block that looks like a Parse rule, what's confusing in that?

BTW, do you support refinements?

21:15I also think it might be useful to pass such functions some extra info implicitly (unless there's direct access to it), e.g. for error reporting (series position, previous rule and its status, etc). Another question is whether they can backtrack to alternates outside of themselves.

21:28I mean, when you compile foo: rule [][... |] down to Parse, e.g.:

[a b foo c]

Does it act like

[a b [... |] c]

or like

[a b ... | c]

Friday 12th April, 2019

giesse

07:01former, [... |]. It's exactly like a normal rule, except you have arguments. Eg.:

foo: [... |]
[a b foo c]
; vs.
foo: rule [blah] [... |]
[a b foo c]

07:01You want it to be special, I want it to be *exactly* the same as current blocks, except with arguments and locals.

07:01I don't currently support refinements.

07:02@nedzadarek topaz-parse does not have ahead or set. You can use into block! [...] to specify what type into should accept.
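
For comparison, a minimal sketch of the stock Red parse idiom that into block! [...] stands in for: ahead block! checks the type without consuming input, and into then descends into the nested block. The name sub-rule is an illustrative assumption.

```red
Red []

sub-rule: ['b]   ; illustrative sub-rule

; AHEAD BLOCK! checks the type without consuming; INTO then descends.
r1: parse [a [b]] ['a ahead block! into sub-rule]
; A string at the same position is rejected before INTO is attempted.
r2: parse [a "b"] ['a ahead block! into sub-rule]
probe reduce [r1 r2]
```

The type guard matters because into can also descend into strings and other series, which is rarely what a block dialect wants.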

07:03I'll look more closely at your examples tomorrow and explain more.

18:14@nedzadarek
1) are you asking for the general case of allowing passing sub-rule blocks to rule functions? If that's the case, I think I just need to add some support for it. If instead you are asking specifically about into, as I briefly mentioned yesterday you can use into block! some-rule.

2) why not just

>> topaz-parse [42] [collect [keep integer!]]
== [42]

18:15eg.

>> input: [word /refinement some-other-word /some-other-refinement "string"]
== [word /refinement some-other-word /some-other-refinement "string"]
>> topaz-parse input [collect any [keep refinement! | skip]]
== [/refinement /some-other-refinement]

nedzadarek

20:28@giesse 1)
I guess I wasn't clear. I mean that ahead and into are used together in Red, so why not create a rule to make it shorter (as you have already done with into block! some-rule in topaz-parse; I would make it even shorter... well, by just a few characters). There might be other keywords used together, so why not join them into one rule.

2) I have forgotten another thing (it's in the first line but I missed it in the later code). I have to change matched thing in my code to other type (in my case to-word).
Let's say I need to convert to floats:

create-name: does [to-word append copy "temp" random 9999]  ; copy, so the literal string does not grow on every call
r: rule [type collection ] [set (c: create-name) type (append collection to-float c)]
arr: copy [] 
topaz-parse [42] [r integer! arr]

Result:

[42.0]

giesse

20:592)

>> input: [word /refinement some-other-word /some-other-refinement "string"]
== [word /refinement some-other-word /some-other-refinement "string"]
>> topaz-parse input [collect any [ref: refinement! keep (to word! ref) | skip]]
== [refinement some-other-refinement]

21:021) Ok, so I think it's just a matter of the compiler not being smart enough right now. I think I can fix this at least to the level that you need, but it will need more work to handle all possible cases. Eventually I'll get there. :) In fact, I think I have an idea on how to let you easily switch between the compiler and an interpreter, without adding too much code. Hopefully soon. :)

nedzadarek

22:48@giesse 2) collect + keep (fun X) is a nice combination. It's smaller but I still need to make a name for the word. I want to avoid creating unnecessary words yet keeping it "readable enough". Parse's functions might deal with naming such words leaving "core" to the user. It's something like [tacit programming](https://en.wikipedia.org/wiki/Tacit_programming).
ps. It's not my use case, but what if a user wants to collect things into different types, for example a binary!? In general (in harder cases), would you (or anyone) prefer to use parse's (future) functions or some external utility? For example:

do https://raw.githubusercontent.com/nedzadarek/cold.red/master/main.red
foo: func [bl init] [
    cold/fun/into bl func [key value] [
      init/(key): init/(key) + value
 ] init
]
foo [
    parse [1] [set i integer! (keep 1 i)]
] #{00}

Result:

== #{01}

Saturday 13th April, 2019

giesse

06:43Perhaps collect binary! [...]. Haven't thought much about this yet.

06:45Re: having to use a word for case 2, maybe this is how function! values should be used in topaz-parse:

topaz-parse input [collect any [keep to-word refinement!]]

06:45Provided the function only takes one argument, it gets the result of the following rule, and results in its return value.

06:48It would probably be very useful with dialect interpreters, for eg. [on-word word! ...] where on-word is your interpreter's function for dealing with words. Combined with collect and object it would make the parse dialect almost functional.

06:48so instead of [x: some-rule (do-something-with x)] you just write [do-something-with some-rule]. It could be hard to read though.
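
For reference, a minimal sketch of the pattern-action form [x: some-rule (do-something-with x)] written for stock Red parse, where set captures the match and keep takes a paren expression. The names source-block and ref are illustrative assumptions.

```red
Red []

source-block: [word /refinement some-other-word /some-other-refinement "string"]

; Pattern-action form: capture the match in `ref`, then keep a converted value.
result: parse source-block [
    collect any [set ref refinement! keep (to word! ref) | skip]
]
probe result
```

The functional form proposed above, keep to-word refinement!, would drop the helper word ref, at the cost of making it less obvious which words in a rule are functions.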

Monday 15th April, 2019

nedzadarek

11:11@giesse
[collect any [keep to-word refinement!]] and version with your topaz's object might be good addition for core parse.

> It could be hard to read though.

I would name my function-rules with some distinctive name (e.g. ending with *) but it's just a naming conventions.

giesse

18:26It may be worth playing with. It should be super easy for me to add it.

Rebol2Red

18:55

data: {

    <s>
		<d>15</d>
		<o>
			<f1>
        </o>
	</s>
	
    <s>
		<d>25</d>
		<o>
			<f2>
        </o>
		<o>
			<f3>
        </o>
   </s>

    <s>
		<d>37</d>
		<o>
			<f4>    
        </o>
   </s>
	
}

parse data [
	collect set res [
		some [
			thru {<s>} 
				thru {<d>}
				keep to {</d>}
				some [
					thru {<o>}
						thru {<f}
						keep to {>}
					to {</o>}
				]
			to {</s>}
		]
	]
]
probe res

;== ["15" #"1" #"2" #"3" #"4"]

Please explain why this does not work as expected.

i need to get
[[15 1][25 2 3][37 4]]

dander

20:09@Rebol2Red once you've reached your first </o>, the parser is still in the some block, so the next thing it looks for is thru {<o>}, not to {</s>}, and that will jump you further down in the document than you want to go. to / thru are pretty tricky to get right. Ideally, you should try to build up rules that define the whole document, but I don't know how much more complex it is than your example.

endo64

22:09@Rebol2Red A little confusing using that double collect but here is what you ask:

ws: charset "^/^- "
aws: [any ws]
digit: charset "0987654321"
probe parse data [
	collect [some [aws "<s>" aws "<d>" collect [keep any digit "</d>" aws some ["<o>" aws "<f" keep some digit ">" aws "</o>" aws] "</s>"]]]
]

Output:

[[
    "15" #"1"
] [
    "25" #"2" #"3"
] [
    "37" #"4"
]]

22:10You see the reason for double collect if you remove the first one.

Tuesday 16th April, 2019

nedzadarek: 08:13@giesse :+1:

Friday 19th April, 2019

Rebol2Red

10:06I still do not get it
I can parse this with '2 passes' but not with one pass

data: {
	<ul class="genres">
		<li title="Family">Family</li>
		<li title="Drama">Drama</li>
		<li title="Adventure">Adventure</li>
	</ul>				   
	<ul class="countries">
		<li title="The Netherlands">The Netherlands</li>
	</ul>                            
}
                                       
parse data [
	collect set temp [thru {<ul class="genres">} keep to {</ul>}]	
]
parse to-string temp [
	collect set genres [some [thru {<li title="} thru {">} keep to {</li>}]]
]
parse data [
	collect set temp [thru {<ul class="countries">}	keep to {</ul>}]	
]
parse to-string temp [
	collect set countries [some [thru {<li title="} thru {">} keep to {</li>}]
	]
]
probe genres
;==["Family" "Drama" "Adventure"]
probe countries
;==["The Netherlands"]

I want to get [["Family" "Drama" "Adventure"]["The Netherlands"]]
Countries and genres can be one or more.

9214

10:36@Rebol2Red

data: {
    <ul class="genres">
        <li title="Family">Family</li>
        <li title="Drama">Drama</li>
        <li title="Adventure">Adventure</li>
    </ul>                   
    <ul class="countries">
        <li title="The Netherlands">The Netherlands</li>
    </ul>                            
}

bl: charset reduce [space tab cr lf]
ul: [thru "<ul" thru ">" some [some bl </ul> break | li] | thru </ul>]
li: [thru "<li" thru ">" keep to </li> </li>]

probe load mold/flat parse data [collect some [to "<ul" collect ul]]

Rebol2Red

10:43Thank you, i will look into this.

toomasv

15:46@Rebol2Red Alternative experiment, albeit not pure parse:

block: parse data rule: [
	collect [some [
	  [{"genres"} | {"countries"}] rule
	| #"^"" keep to #"^"" skip 
	| skip
	]]
]
move back tail block/1 tail block
load mold/flat block
;== [["Family" "Drama" "Adventure"] ["The Netherlands"]]

Wednesday 24th April, 2019

giesse

20:28@nedzadarek updated topaz-parse:

>> foo: [word!] topaz-parse [a b] [foo word!]
== b
>> foo: ['a] topaz-parse [a b c] [foo word!  end]
== none

>> input: [word /refinement some-other-word /some-other-refinement "string"]
== [word /refinement some-other-word /some-other-refinement "string"]
>> topaz-parse input [collect any [keep to-word refinement! | skip]]
== [refinement some-other-refinement]

20:30On passing a block to a rule function and then using it as a sub-rule - I think I'll deal with this later on, when I can deal with all the edge cases.

21:48I also cleaned up the code a bit, it should be easier to look at.

nedzadarek

22:39@giesse :clap:

Thursday 25th April, 2019

greggirwin

17:58@giesse the more functional approach does seem like it could be harder to read. We know from history that the pattern-action dispatch model is accessible to a wide audience; the functional approach less so.

hiiamboris

19:25Why is it so that parse cannot switch to a different input series with :word syntax? Are there any known hacks to do that?

greggirwin

19:32I'm going mad just thinking about it @hiiamboris. =8^\

hiiamboris

19:33:D

19:33And I'm going mad while inventing a hack! :D

greggirwin

19:34That way lies madness to be sure. What are you trying to do?

19:36Internally, from a clean slate, you could have multiple FSMs, and each one would have state for a single input series, then the processor would "context switch" between them, like co-routines, if the input series is switched.

hiiamboris

19:38There's a dialect I'm experimenting with. I have 2 alternative implementation ideas:
- compile it to bare Red - very clean but I'm reinventing some parse wheels
- compile it to parse - thus leveraging some of its stuff but will certainly involve some dirty hacks I won't be proud of

So I asked just for the sake of consideration of the 2nd option.

greggirwin

19:39Go with option 1. :^)

19:40I remember having the same dilemma many years ago, when I worked up a CLI dialect.

19:48When you asked, backtracking first came to mind as an issue, but think about using set-words to mark locations, in combination with get-words to position the input. People who like goto and longjmp would probably like it. ;^)

hiiamboris

19:49haha, yeah

19:57So, the simplest hack looks like this now:

>> parse r: [[(change/only r [4 5 6]) (r: next r) into r end] [copy o collect some keep integer!]] r  ? o
O is a block! value.  length: 3  [4 5 6]

It modifies part of the rule, inserting a new input [4 5 6] into it, then using into on that input matching it against the rest of the rule.
But no, I don't like to continue this way ;)
Could've been better if into allowed to specify a custom series to go into.

greggirwin

20:30That's an interesting idea. Though I can see where it would make things harder to reason about. What I've done in the past is break things up, so if there are sub-dialects (e.g. sub commands in a CLI app) they are all independent. It means some redundancy at times, but makes maintenance a lot easier and safer.

giesse

20:46@hiiamboris any more details? Specifically, what does your dialect do, what does it look like. There might be a completely different way to look at it.

rebolek

21:05@hiiamboris Switching to a different series is not a bad idea, but I'm fine with modifying. It's easy to use and really powerful.

21:05

>> series1: [a b c]
== [a b c]
>> series2: [1 2 3]
== [1 2 3]
>> series: series1
== [a b c]
>> parse series [some word! p: (insert p series2) some integer!]
== true

21:09This does not refer to different series, but it's close :smile:

>> parse series [some word! (insert clear series series2) :series some integer!]
== true

hiiamboris

21:28@rebolek :D

@giesse Details.. well ☺ It's focused at general purpose series transformations. In the future I'll be able to define a relation between a pair of series and expect that any modification in either of these will be automatically reflected upon the other. I'll give you some examples once I settle on the design and test it.

The rationale is that I *hate* to deal with indices. It's monkey work that a human should never concern himself with. It's also a grave for declarativity. Unfortunately, using Red I often find myself deep in index arithmetic, so that's what I wanna change.

dander

22:07@hiiamboris using reactivity?

hiiamboris

22:30Yeah, when the morphism is declared as persistent. Otherwise, a single operation.

Thursday 16th May, 2019

hiiamboris

21:14Do you also think that opt rule is acting weird here?

>> parse "1234" [a: skip opt to "." a:] ? a
A is a string! value: ""

Whereas the following works:

>> parse "1234" [a: skip opt [to "."] a:] ? a
A is a string! value: "234"

>> parse "1234" [a: skip [to "." |] a:] ? a
A is a string! value: "234"

>> parse "1234.5" [a: skip opt to "." a:] ? a
A is a string! value: ".5"

greggirwin

21:36I'll let @dockimbel speak to it being by design, but the first case does look like a bug to me. R2 and R3 return different results for a ("1234" and "234" respectively).

9214

21:39@hiiamboris https://github.com/red/red/issues/3679 ?

greggirwin

21:46Thanks for finding that @9214.

hiiamboris

21:49I'll add it there then

Wednesday 19th June, 2019

rebolek

12:54Hm, this result is not something I expected:

>> parse x: "-" [change not space #"."]
== false
>> x
== ".-"

toomasv

13:30@rebolek

>> parse x: "-" [not space change skip dot]
== true
>> x
== "."

rebolek

13:32@toomasv Thanks, but I'm not interested in a workaround, I wonder why it's doing what it's doing.

toomasv

13:32Because not does not advance, IMO. It's look-ahead rule. (Hence, corrected my code: ahead not needed)

13:48It's clear from this:

>> parse x: "-" [not space s: (probe s)]
"-"
== false

14:34Thinking of it -- where should it advance when it does not match? Let's say you have

parse x: "-" [not [2 space] rule]

where should rule continue from if not from same place as before trying subrule?

14:38Above could have been written also as

parse x: "-" [change [not space skip] dot]

giesse

19:28@toomasv but, change should change not insert, no?

19:28though I guess if it's mapping to change/part it makes some sense.

19:29ie. @rebolek think of it this way:

>> x: "-"
== "-"
>> change/part x "." x
== "-"
>> x
== ".-"

toomasv

20:24Nice example!

Monday 24th June, 2019

hiiamboris

21:28On the old topic of preferring while over any - wanna mention another real world use case: I'm parsing a dialect and preprocessing it in place using logic like

parse input [while [input: 
    ...
    | (found a macro) ... (change/part input new-code 1) :input
    | ...
]]

any just stops since input did not advance after the substitution (by design). while gives me more control here.

Tuesday 25th June, 2019

greggirwin: 16:53Feel free to make notes in https://github.com/red/red/wiki/%5BDOC%5D-Parse as thoughts come up.

Wednesday 26th June, 2019

rebolek

08:23@giesse @toomasv thanks for insights.

08:25Now another problem. I want to check with ahead that there's colon on line. Something like:

ahead [copy value to newline if (find value colon)]

but without fallback to Red. Any ideas?

9214

08:55ahead [to #":" to newline]?

rebolek

08:57@9214 Thanks, that was easy. I guess my aversion to to makes some things unnecessarily hard :)

hiiamboris

09:46to #":" part may skip a few newlines on its way☺

rebolek

09:46@hiiamboris true! My aversion to to is right again! :smiley:

hiiamboris

09:46:D

9214

10:09Well, it checks if there's colon on line... but maybe it's not the line you want :smirk:

rebolek

10:10:)

9214

10:17This is a common pain-point with to and thru - it's hard to delineate up to which rightmost boundary search should be made.

rebolek

10:21Hm, I think I've got it:

not-colon: charset [not ":^/"]
rule: [ahead [some not-colon colon]]

9214

10:21Ah, so you needed to check for colon at the end of the line?

rebolek

10:22No, I need to check if there is one on the line

10:22like in key: value

9214

10:23Right, I see.

dander

16:45I think this is the way to do it with to:

line-contains-colon: [to [":" | newline | end] ":"]
rule: [ahead line-contains-colon]
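
A short usage sketch for the rule above, assuming string input and checking only the line at the current position. The wrapper has-colon? and the sample strings are illustrative assumptions.

```red
Red []

line-contains-colon: [to [":" | newline | end] ":"]

; Hypothetical wrapper: does the first line of `text` contain a colon?
; AHEAD tests the line without consuming it; TO END then finishes the parse.
has-colon?: func [text [string!]] [
    parse text [ahead line-contains-colon to end]
]

r1: has-colon? "key: value^/other"            ; colon before the first newline
r2: has-colon? "no colon here^/key: value"    ; colon only on the second line
probe reduce [r1 r2]
```

Because to stops at whichever of ":" or newline comes first, the search cannot run past the end of the current line, which is the boundary problem raised earlier with a bare to #":".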

toomasv

17:14@dander :+1: , but considering @rebolek's aversion towards to, it is no-no. :smile:

dander

17:28Maybe it's a case of knowing what the safe patterns are. I came by this by repeatedly shooting myself in the foot though... it looks understandable to me now, but I recognize the risks. I guess the biggest thing with to/thru is that they are deceptively simple looking

toomasv

17:30It is very close to @rebolek's solution actually:

[ahead [to [":" | newline] ":"]] ~ [ahead [some not-colon colon]]   ; not-colon: charset [not ":^/"]

to [":" | newline]     ~     some not-colon

dander

17:35yeah, that's an interesting point. I guess to rule is basically equivalent to any not rule ahead rule

17:37hmm, I think [some not-colon colon] should maybe be [any not-colon colon], but it probably doesn't matter for his example case

toomasv

17:38@dander Ah, you are right! any would be better.

17:50@dander But your suggestion to rule ~ any not rule ahead rule is not correct. Consider this:

rule: ["a"]
parse "bcda" [to rule rule]
;== true
parse "bcda" [any not rule ahead rule rule]
;== false

rebolek

17:50@dander nice, thanks!

17:51some not-colon colon makes more sense, because the key before colon should have some value, but you're right.

toomasv

17:53But

parse "bcda" [any [not rule skip] ahead rule rule]
;== true

rebolek

17:53@toomasv I did some tests some time ago and IIRC, explicit rules are actually even faster than using to - another reason not to use it :)

toomasv

17:56@rebolek I agree, but using to shouldn't be ruled out, IMO. In cases where performance does not matter, it can make rules simpler, although it must be used with much care.

rebolek

17:58Don't get me wrong, to is great for simple stuff, if you're looking for one value in HTML page, it's the best solution. But converting some xMB document in random format to Red cannot be done with to properly.

toomasv

17:59@rebolek jawohl

rebolek

18:00:)

dander

18:05@toomasv ooh, not rule doesn't advance? So is this a case where parse detects an infinite loop and fails out?

18:06oh, never mind. I get it. It matches "bcd", but doesn't move forward, so ahead is still looking at the start

toomasv

18:08@dander Don't know about failing out but see also this earlier discussion :point_up: [June 19, 2019 3:54 PM](https://gitter.im/red/parse?at=5d0a3094faf70031f94e67f3)

giesse

18:50@rebolek to and thru are useful *very* rarely, but sometimes they are (aside from quick parsing I mean) - ie. to #">" is more readable than any non-greater-than (which also requires defining a charset) or any not #">" etc. It is also easier to optimize. That being said, I don't know how to stop people from using them too much.

@dander not cannot advance as it succeeds when the rule does *not* match. :)

rebolek

18:50@giesse I know how to stop them, but it involves violence.

giesse

18:52:D

pekr

19:02@rebolek I dont believe to/thru naysayers, sorry 😀

dander

19:10@giesse it makes sense when you say it that way 😄. So what happens when you do any not rule? It seems like that would match and not advance forever, but is the loop detected and interrupted?

hiiamboris

19:21any/some stop when they don't advance

Thursday 27th June, 2019

giesse

06:23@pekr if you are good enough to avoid the kind of mistakes with to and thru, you probably have little need for them.

endo64

09:00I usually end up with infinite loop when using some or any

parse "bx" [some [not "b" | skip]]

rebolek

09:03@endo64 what should this rule do other than infinite loop? ;)

hiiamboris

09:11@endo64 looks like a bug. I think you should report it.

rebolek

09:12@hiiamboris how is it bug?

hiiamboris

09:13How it is not? [not "b" | skip] is always failing after the 1st iteration, yet some continues to match it.

09:13Look at parse-trace "bx" [4 [not "b" | skip]] output (I replaced some with a number of tries)

09:15Ah, I see. It matches "b" | skip instead. Still, some is supposed to stop, no?

09:17

>> parse "xx" [any [not "b" | skip]]
== false
>> parse "bb" [any [not "b" | skip]]
== true
>> parse "xx" [some [not "b" | skip]]
== false
>> parse "bb" [some [not "b" | skip]]
== true
>> parse "bx" [some [not "b" | skip]]
(hangs)

Is there a logic to this? :)

rebolek

09:19not "b" on bx -> false, move to skip
skip on bx -> true, run rule again
not "b" on x -> true, run rule again
and now it "hangs"

09:19remember that not does not advance

endo64

09:22So we can't say any/some stop when they don't advance, no?

rebolek

09:23We can say it, that's why there's fail and break.

09:24And reject of course.

endo64

09:25But why this doesn't hang: parse "x" [some [not "b"]]

hiiamboris

09:25@endo64 exactly. And this is a mystery:

>> parse-trace "xx" [3 [not "b" | skip]]
 -->
   match: [3 [not "b" | skip]] 
   input: "xx"   
   -->
     -->
       match: [not "b" | skip] 
       input: "xx"       
       -->
         ==> not matched
       <--
       match: ["b" | skip] 
       input: "xx"       
     <--
   <--
 <--
return: false
== false
>> parse-trace "bx" [3 [not "b" | skip]]
 -->
   match: [3 [not "b" | skip]] 
   input: "bx"   
   -->
     -->
       match: [not "b" | skip] 
       input: "bx"       
       -->
         ==> matched
       <--
       match: [| skip] 
       input: "bx"       
       ==> matched
     <--
     match: [[not "b" | skip]] 
     input: "x"     
     -->
       match: [not "b" | skip] 
       input: "x"       
       -->
         ==> not matched
       <--
       match: ["b" | skip] 
       input: "x"       
     <--
     match: [[not "b" | skip]] 
     input: "x"     
     -->
       match: [not "b" | skip] 
       input: "x"       
       -->
         ==> not matched
       <--
       match: ["b" | skip] 
       input: "x"       
     <--
   <--
 <--
return: false
== false

rebolek

09:26@endo64 hm, good question. I would expect it to hang.

endo64

09:29not "b" doesn't advance, neither some, so how does it stop?

>> parse "x" [some [not "b" p1:] p2:]
== false
>> p1
== "x"
>> p2
== "x"

rebolek

09:32IIRC there may be some checks to prevent some kind of infinite loops, this may be one of them (but I may be wrong).

endo64

09:35I asked that feature before, Nenad said that it could be added, but I didn't see a commit or message then.

hiiamboris

09:35> Iteration
any rule : repeat rule zero or more times until failure or **if input does not advance.**
some rule : repeat rule one or more times until failure or **if input does not advance.**
while rule : repeat rule zero or more times until failure regardless of input advancing.

09:35https://www.red-lang.org/2013/11/041-introducing-parse.html

rebolek

09:37@hiiamboris Aha!

endo64

09:44Thanks @hiiamboris , so parse "bx" [some [not "b" | skip]] should not hang, parse cannot detect input does not advance in this case. Worth a ticket?

rebolek

10:12Probably yes.

Saturday 13th July, 2019

dockimbel: 13:19Issue fixed.
endo64: 15:33I confirm, parse "bx" [some [not "b" | skip]] returns false and doesn't hang anymore. Thanks!

Tuesday 16th July, 2019

toomasv

05:45Could the action of keep pick be extended to expressions? Currently it works on matches only:

>> parse [a b c][collect some [keep ['b 'c] | skip]]
== [[b c]]
>> parse [a b c][collect some [keep pick ['b 'c] | skip]]
== [b c]
>> parse [a b c][collect some ['b keep ([add some stuff]) | skip]]
== [[add some stuff]]
>> parse [a b c][collect some ['b keep pick ([add some stuff]) | skip]]
== []

Same for keep copy .

Friday 19th July, 2019

abdllhygt

12:28@toomasv
the lexer works;

>> u ["mamazo" "mamaxo" "hama"]
== "ayah, ibu dan makanan"
>> uno ["mamazo" "mamaxo" "hama"]
== "ayah, ibu atau makanan"

12:28but I didn't make the parser

12:29it translates from my conlang to Indonesian

toomasv

12:32:+1:

Monday 22nd July, 2019

abdllhygt

12:49new sample

>> u [(ia[word: "mama" determiner: "ni" adjective: "hehaha"])(ia[word: "qaja"])]
== "orang tua lucu ini dan terjamah"

13:24

>> ieoa[[text: (oa[word: "o"])][text: (ia[word: "qima" determiner: "na"])]]["o" "i"]
== "iya, jeruk lemun itu"

(yes, that lemon)

greggirwin

17:58I'll say that's cool, even though the base lang and your conlang are foreign to me. :^)

abdllhygt

20:52thank you

20:53but it's not my conlang, it translates to Indonesian haha

20:53I want to make a parser for my conlang once I've finished the Indonesian lexer

Tuesday 23rd July, 2019

abdllhygt

14:42New Sample:

; %qslsamples/sample.red
ieoa[
  [
    text: (
      ia[
        word: "mamazo"
        determiner: "ni"
      ]
    )
  ]
  [
    text: (
      aa[
        word: "hama"
        tense: "za"
        negative: true
      ]
    )
  ]
]["i" "a"]

14:43

>> do %qaja/qslsamples/sample.red
== "ayah ini tidak sedang makan"

14:44english: this father isn't eating
conlang: i mamazo ni za no hama

15:24[sample2.red](https://github.com/abdllhygt/qaja/blob/master/qslsamples/sample2.red)

>> do %qaja/qslsamples/sample2.red
== "ayah saya belum terjamah bahasa ini"

english: my father didn't translate this language
conlang: i mamazom pa no qaja e qisa ni

15:44updated:

>> do %qaja/qslsamples/sample2.red
== "ayah saya belum terjamah bahasa ini dirumah"

english: my father didn't translate this language at home
conlang: i mamazom pa no qaja e qisa ni ze me

Wednesday 24th July, 2019

abdllhygt

08:17[new sample](https://github.com/abdllhygt/qaja/blob/master/qslsamples/su-zu.red)

>> print do %qaja/qslsamples/su-zu.red
ayah saya belum terjamah bahasa ini ketika kita jalan jalan dijalan

english: my father didn't translate this language when we walk on the road
conlang: i mamazom pa no qaja e qisa ni zu i kisa a jaja ze jaja

Sunday 28th July, 2019

abdllhygt

13:44hi again, I've started on the parser

13:45

qq>> ni nama hehaha
id== nama lucu ini

it means this funny name

13:48qq (translation language)
id (indonesian language)

Saturday 31st August, 2019

Rebol2Red

08:58

text: copy {<p></p> <p>6</p>}
res: copy []
parse text [
	some [
		thru "<p>" 	copy between to "</p>" (append res between)
	]
]
probe res/1
;==""
probe res/2
;=="6"

text: copy {<p></p> <p>6</p>}
parse text [
	collect set res [
		some [
			thru "<p>"	keep to "</p>"
		]
	]
]
probe res/1
;==#"6"
probe res/2
;==none

Why does the first code give the expected result while the second does not?
I want the same result using collect, one way or the other.

toomasv

09:11@Rebol2Red Try res: parse text [collect some [...]]

Rebol2Red

09:27@toomasv Thanks, but unfortunately it gives the same wrong result. Is this an issue? Does this work on your computer? If so, which version of Red are you using?

9214

10:52@Rebol2Red this is not an issue, but the difference between copy and collect with keep. The former copies matched input in a straightforward way, the latter is smarter and will ignore empty matches and coerce one-character strings to char! values (because, really, what you're parsing is a series of characters).

>> parse "abc" [collect some [keep skip]]
== [#"a" #"b" #"c"]
>> collect [parse "abc" [some [copy match skip (keep match)]]]
== ["a" "b" "c"]
>> collect [parse "abc" [some [set match skip (keep match)]]]
== [#"a" #"b" #"c"]

>> parse "xxx" [collect some [thru #"x" keep to #"x"]]
== []
>> collect [parse "xxx" [some [thru #"x" copy match to #"x" (keep match)]]]
== ["" ""]

Rebol2Red

11:49@9214 I see, thanks, it works, but I keep wondering whether parse shouldn't be more straightforward, I mean closer to the way people think it would work.
So I am asking whether there is a way to make your code compatible with my second code fragment, using a function, a redefinition, a macro or whatever?

9214

12:10> more the way like people think it would work

I personally expect copy to keep the "always return a series" invariant, which it does with empty string case. Ditto for keep - I don't expect it to collect empty garbage, and don't think that this is useful at all in use-cases where collect typically applies.

People think all kinds of things, but what matters are strong, concrete examples and thorough understanding of language design / semantics. Feel free to [propose your changes](https://github.com/red/red/wiki/%5BDOC%5D-Red-Should...-%28Feature-Wars%29) if you feel like it, but the onus is on you to convince us, not for us to argue with you why things are the way they are. OTOH, I got bitten by keep returning char! instead of string! myself, and agree that it can be more consistent and return string! in either case. Also see https://github.com/red/REP/issues/8

> So i am asking if there is a way to make your code compatible...

The closest I can get is:

>> parse "<a> <>" [collect some [thru #"<" keep to #">" | keep ("")]]
== [#"a" ""]

Single-char is still an edge-case that requires workaround.

Rebol2Red

16:20@9214 Ok, I agree that things are what they are. Thank you for taking the time to explain why and for the code.

toomasv

16:23@Rebol2Red Sorry, was on phone earlier and couldn't double-check. This works:

res: parse "<p></p><p>6</p>" [
    collect some [
        thru "<p>" keep copy x to "</p>" thru "</p>"
]]

9214

16:25@toomasv I'm not sure if this is a bug or edge-case feature :confused:

toomasv

16:26Which one would you suggest?

Rebol2Red

16:27@toomasv Yes, that works
@9214 I just executed your code and it gets stuck in some kind of loop.

9214

16:27@Rebol2Red

>> parse "<a> <>" [collect some [thru #"<" keep to #">" | keep ("")]]
== [#"a" ""]
>> about
Red 0.6.4 for Windows built 31-Aug-2019/17:47:43+05:00 commit #b28d8f5

16:38And while at it:

#do [match: none]
#macro grab: func [rule][
    compose/only [keep copy match (rule)]
]

probe parse "<p></p><p>6</p>" [
    collect some [thru <p> grab to </p>]
]

Rebol2Red

16:49@9214 Nice! My Red was not the latest. The reason is that some programs of mine from the past only work with older versions. Have to find a way to handle this.

9214

16:51> Have to find a way to handle this.

The best way to handle this is to provide a minimal reproducible example of your bug / crash, so that we can fix it :wink:

greggirwin

17:28:point_up: [August 31, 2019 6:10 AM](https://gitter.im/red/parse?at=5d6a63bb5b9a174fdf0fbe8d) :+1: @9214.

Would someone please recap the above chat and examples and add a section to https://github.com/red/red/wiki/%5BDOC%5D-Parse?

giesse

19:21@Rebol2Red @9214 compare with:

>> do %topaz-parse.red
== func [
    {Parse BLOCK according to RULES; return last result from RULES if it matches, NONE otherwise} 
    block [binary! any-block! an...
>> topaz-parse "<p></p> <p>text</p>" [collect some [thru <p> keep to </p> </p>]]
== [#"<" #"<"]
>> topaz-parse "<p></p> <p>text</p>" [collect some [thru <p> keep copy to </p> </p>]]
== ["" "text"]

toomasv

19:45@9214 :point_up: [August 31, 2019 7:25 PM](https://gitter.im/red/parse?at=5d6a9f84375cc34fdeefaeef) I am still wondering why you think it is a bug or an edge case?

9214

19:52@toomasv because of an inconsistency in how keep treats an empty match: without copy it fails, with copy it keeps an empty string. One can think that in both cases what was matched is an empty string (between <p> and </p>).

toomasv

19:59@9214 I don't think it is inconsistent. It keeps according to datatype - with copy it is forced to treat match as string, and it matches empty string, but with simple keep or keep pick it doesn't match anything there.

9214

20:14Uhm, in both cases input datatype is the same, so why should copy affect anything? The only side-effect of matched copy IMO should be extraction of matched input into a word. "Anything there" in both cases is an empty string, it's actually equivalent to keep none.

toomasv

20:28Surely it does. Compare these:

>> parse "<p></p><p>x</p><p>abc</p>" [collect some [thru "<p>" keep to "</p>" "</p>"]]
== [#"x" "abc"]
>> parse "<p></p><p>x</p><p>abc</p>" [collect some [thru "<p>" keep pick to "</p>" "</p>"]]
== [#"x" #"a" #"b" #"c"]
>> parse "<p></p><p>x</p><p>abc</p>" [collect some [thru "<p>" keep copy _ to "</p>" "</p>"]]
== ["" "x" "abc"]

Exactly according to [explanation](https://www.red-lang.org/2017/03/062-libred-and-macros.html):

> added pick option to keep, so user can control how keep captures the matched input:

> keep collects matched values as a series if many, or as a value if only one.
> keep pick collects all the matched values separately in a block.
> keep copy <word> collects all the matched values as a single series (of same type as input).
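
A minimal sketch of how the `keep copy <word>` form quoted above could be wrapped so that every match, including empty and single-character ones, comes back as a string. The function name `between-tags` and its argument names are hypothetical and exist only for this illustration. The rule itself is the third example above with the literals replaced by words.

```red
Red []

; Hypothetical helper: collect everything found between open-tag and
; close-tag. `keep copy chunk` forces each match to be kept as a series
; of the input type, so "" and "x" stay string! values.
between-tags: func [
    text      [string!]
    open-tag  [string!]
    close-tag [string!]
    /local chunk
][
    parse text [
        collect any [
            thru open-tag
            keep copy chunk to close-tag
            close-tag
        ]
    ]
]

probe between-tags "<p></p><p>x</p><p>abc</p>" "<p>" "</p>"
```

With the same input as the examples above, this should mirror the `keep copy _` result rather than the plain `keep` or `keep pick` ones.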

9214

20:35Unless I'm completely blind, all keep options "collect *matched value(s)*". How come first two examples haven't matched anything but the third one did, even though there's no *value* between < and > characters?

Previously I was confused that keep and copy are separate keywords, but now it turns out that keep copy is a dedicated keep option.

20:38By definition, string! is a series of char! *values*, and there's no such thing as #"". So, why is it that keep copy "collects matched values as a single series" when no *values* are actually being matched?

toomasv

20:41It matches "" (of same type as input).

20:47All keep variants match values of course, but of different types, i.e. strings or chars *in this case*. keep matches char! if one or string! if several. keep pick matches always chars, and keep copy matches always strings.

9214

21:09keep doesn't match, it collects values from already matched input, but can do it in a variety of ways: either one-by-one in a block or together in a series of respective datatype. Anyway, since #"" doesn't exist, it might make sense for keep and keep pick to ignore first match but keep copy to yield an empty string. And keep copy indeed preserves "copy always returns a series" invariant (which it isn't, by the way).

>> parse [][copy match skip] :match
>>

Don't see any point in continuing this debate. My main concern was about collect / keep idiosyncrasies and a need for their further improvement (as indicated by my comment in https://github.com/red/REP/issues/8 and elsewhere). Setting a dummy word to matched input just to keep empty string is a hacky solution.

greggirwin

21:14A key point, to me, is that if we have this kind of debate, the design may be confusing.

9214

21:23Ah, scratch that, my example above is incorrect, because skip fails.

>> parse [][copy match none] :match
== []

So yeah, keep copy follows copy semantics faithfully, but keep keeping empty strings is IMO questionable. keep only or something like that would be OK.

GiuseppeChillemi

21:26@9214
> OTOH, I got bitten by keep returning char! instead of string! myself, and agree that it can be more consistent and return string! in either case. Also see red/REP#8

Also I don't like type coercing in this situation:

>> a: "hello"
== "hello"
>> probe first a
#"h"
== #"h"
>> probe type? first a
char!
== char!
>>

greggirwin

21:27What coercing are you talking about @GiuseppeChillemi ?

9214

21:27@greggirwin collect is already confusing enough as it is now, and there are a couple of REP tickets pending in this regard. I see no harm in discussing this and coming to at least some form of consensus.

greggirwin

21:27@9214 agreed. Hashing things out here is good.

9214

21:28As long as they are wiki-ed.

GiuseppeChillemi

21:28@greggirwin first "hello" shouldn't be a string ?

9214

21:28Why should it be a string and what it has to do with Parse?

greggirwin

21:28No, it should not @GiuseppeChillemi. first returns the first value in a series. A string is a series of char! values.
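
A short sketch of the point above, using only core series functions. It shows what `first` returns for a string and two ordinary ways to get a one-character string instead, if that is what is wanted. The word `greeting` is just an illustrative name.

```red
Red []

greeting: "hello"

probe first greeting          ; the first element of the series, a char! value
probe type? first greeting    ; char!
probe form first greeting     ; a string! built from that char
probe copy/part greeting 1    ; a one-character string! copied from the series
```

`form` and `copy/part` leave the series semantics untouched. They just convert or slice after the fact.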

21:29Just a bit of group overlap @9214 .

GiuseppeChillemi

21:32@greggirwin The first time I used this code I expected to get a string. It was a natural expectation. No discussion about the inner reasons for this. I just report what I expected and I suppose it is the same reason why Vladimir expected it.

21:35Let's just wait and see whether other people come to Gitter or other discussion places reporting on this topic. If that happens, we have a common behaviour, and either the cause should be investigated or the documentation should clearly warn the user about this.

greggirwin

21:36Completely different scenarios though.

When you do something for the first time, and get an unexpected result, the thing to do is understand why you got the result you did. Red is doing exactly what it says, and which has worked that way since the dawn of Rebol. It's not going to change.

21:37Not only would it break a million things, but the design is correct as it is.

GiuseppeChillemi

21:41@greggirwin Greg, the design should not be changed, but *"the documentation should clearly warn the user about this"*, as I suppose it is a hole where most people will fall.

21:57And yes, people should understand before asking for a deep change. Building the architecture of a language is not easy.
I have told you before that those "holes" trigger in me the same *"the thing to do is understand why you got the result you did"* approach about why all newcomers fall into similar holes. I have some answers and I am building the "big picture", but it's not easy, as the visual decoding of symbols and languages by our neurons is still a new science. Also, I don't think that this branch of science has been included in all university computer science courses, but it will become more relevant as AI and machine/brain interfaces evolve.

9214

21:58> I just report what I expected and I suppose it is the same reason why Vladimir expected it.

That's a false implication. My original message was related to keep behaving differently depending on number of matched elements.

>> parse "ab" [collect keep 1 skip]
== [#"a"]
>> parse "ab" [collect keep 2 skip]
== ["ab"]

> the documentation should clearly warn the user about this

Documentation shouldn't try to counter godzillion of false assumptions and expectations that people may have. The fact that string! is a series of char! values is explicitly mentioned both in [Red](https://doc.red-lang.org/en/datatypes/string.html) and [Rebol](http://www.rebol.com/docs/core23/rebolcore-6.html) documentation. Mechanics of series itself is consistent and uniform across the whole language: if you ask for the first element of a series, you get the first element (or none, if there's none).

> it is a hole where most people will fall

Please don't speak for others when all you have is your own experience. So far, the only person who keeps claiming that such "holes" (which are, in fact, an applied common sense stemming from 20+ years of Rebol design) are unexpected and should be documented is you.

21:59@GiuseppeChillemi at this point you're derailing the room's topic. Case closed to me.

GiuseppeChillemi

22:22@9214 I suppose you have no knowledge of the topics I have introduced. So while I respect your deep knowledge of Red and computer science, please do the same for my knowledge of psychology, perception and knowledge building.
REBOL design has been influenced by neuroscience studies made by Carl. While the whole "vision" of Carl is still advanced in our times, some aspects are limited by 1990 knowledge. Now we are in 2020 and some things should be refined in the light of advancements in this area.
The fact that you can't see what I see doesn't imply that what I see does not exist. It could simply mean you do not have sufficient knowledge of this topic. If you wish to continue this exchange and learn something in a productive way, feel free to choose the proper room; otherwise we can stop here.

9214

22:28Ad hominem and straw man. Nice try.

Sunday 1st September, 2019

toomasv

06:21Stumbled on a little funny Flying Dutchman:

>> res: parse @-+-a+-bc+ [collect some [thru "-" keep copy _ to "+" "+"]]
== [ a bc]
>> length? res
== 3
>> _?: first res
== 
>> type? _?
== email!
>> length? _?
== 0

This is of course same as in:

_?: next @

greggirwin

18:29In R2 you could create hidden data in issues.

>> i: to issue! "a b c"
== #a
>> mold i
== "#a"
>> length? i
== 5
>> last i
== #"c"

18:29With your discovery, it's now clear we can build a better BrainF*** lang dialect. :^)

Sunday 8th September, 2019

ericguedespinto

01:36Hello everybody!
I am trying to convert some EBNF rules to parse rules but I stumbled into a problem with escaped unicode identifier and pattern syntax (ID_START and ID_CONTINUE).
Is there any way to put this into a rule other than passing a few thousand codepoints? Something like set notation.

greggirwin

02:55@ericguedespinto , charset, not, and union are your friends here.

>> dig=: charset [#"0" - #"9"]
== make bitset! #{000000000000FFC0}
>> alnum=: make bitset! [#"0" - #"9" #"a" - #"z"]
== make bitset! #{000000000000FFC0000000007FFFFFE0}
>> non-dig=: charset [not #"0" - #"9"]
== make bitset! [not #{000000000000FFC0}]
>> hex=: union dig= charset "ABCDEFabcdef"
== make bitset! #{000000000000FFC07E0000007E}

See also:
- https://www.red-lang.org/2013/11/041-introducing-parse.html
- https://doc.red-lang.org/en/datatypes/char.html
- https://doc.red-lang.org/en/datatypes/bitset.html
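
A minimal sketch of how the `charset` and `union` approach above could be applied to an identifier rule. The ranges below are a small illustrative subset (ASCII plus a few Latin-1 letters), not the real Unicode ID_START and ID_CONTINUE tables, and all word names are made up for this example.

```red
Red []

; Illustrative subsets only, NOT the full Unicode property tables.
ascii-letter:  charset [#"a" - #"z" #"A" - #"Z" #"_"]
latin1-letter: charset [#"À" - #"Ö" #"Ø" - #"ö" #"ø" - #"ÿ"]
digit:         charset [#"0" - #"9"]

; Build the larger sets from the smaller ones instead of listing codepoints.
id-start:    union ascii-letter latin1-letter
id-continue: union id-start digit

identifier: [id-start any id-continue]

probe parse "naïve_1" identifier   ; starts with a letter, so it should match
probe parse "1abc" identifier      ; starts with a digit, so it should not
```

For the real tables, the same pattern would apply with the ranges taken from the Unicode data files, one range per `charset` entry.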

02:57And we've started a [parse cookbook](https://github.com/red/red/wiki/Parse-Cookbook) page where you can add notes that may help others.

toomasv

06:00@ericguedespinto I also started with [EBNF parsing rules](https://github.com/toomasv/grammars/blob/master/ebnf.rule). So far it's a straightforward translation of [iso-14977](https://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf) but I have not tried it out much. Just parsed the EBNF grammar itself. E.g.:

do %ebnf.rule
ebnf: read %ebnf.ebnf
parse ebnf remove-gaps
parse ebnf remove-comments
parse ebnf syntax
;== true

ericguedespinto

09:52Thank you all

Tuesday 8th October, 2019

rebolek

13:28I would expect b instead of a:

>> parse [a a: b][quote a: set value skip to end]
== true
>> value
== a:

9214

13:29@rebolek quote a: matches a, set value skip matches a:, to end matches b.

rebolek

13:30@9214 the parse rule is bit wrong, but anyway, I used quote a: to match a:, not a.

pekr

13:31Never used quote myself, but why does it match a if a colon is used? (quote a:)

rebolek

13:32@pekr exactly

pekr

13:32With just a brief sight, it should fail on the first rule

13:32there should be skip imo

rebolek

13:33Right, I've simplified actual rule and haven't tested it before posting, but matching a instead of a: is the point.

pekr

13:33Eh, I don't understand the rest, so most probably I should stay silent :-)

rebolek

13:34Ok, fixed the rule. I expect this to show b:

>> value: none parse [c a a: b][some [quote a: set value skip | skip]] value
== a:

But instead I have to do this:

>> value: none parse [c a a: b][some [ahead set-word! quote a: set value skip | skip]] value
== b

Why?

pekr

13:36I don't know how set works. I've never used block parsing. Does it set the value at a certain (matched) position? Then it should imo return b, as you suggest

rebolek

13:37The thing is that quote a: matches a. set is basically like copy for one value.

9214

13:39@rebolek https://github.com/red/red/issues/3554 ?

13:40@pekr because that's what quote does - matches value that follows it literally. Except that in this case its a lax any-word! match.

rebolek

13:40@9214 thanks, I guess it's related.

9214

13:43

>> parse [a:][quote a]
== true
>> parse/case [a:][quote a]
== false
>> parse/case [a:][quote a:]
== true

13:43So it's actually https://github.com/red/red/issues/3029, mentioned in the ticket above. They all look related to one single regression.

giesse

18:45perhaps related to this as well? https://github.com/giesse/red-topaz-parse/blob/master/targets/parse.red#L72

Wednesday 9th October, 2019

rebolek: 06:39Certainly. There's probably some implicit conversion of any-word! values to word! that messes this up.

Monday 21st October, 2019

rebolek

13:24word! parsing:

>> parse [a][a]
*** Script Error: PARSE - invalid rule or usage of rule: a
*** Where: parse
*** Stack:  

>> parse [a]['a]
== true

path! parsing:

>> parse [a/b][a/b]
== true
>> parse [a/b]['a/b]
== false

I'm not sure about the first example in path! parsing, but shouldn't the second return true!?

endo64

13:32

>> b: ['a a a/b 'a/b]
== ['a a a/b 'a/b]

>> b/1 = b/2
== true
>> b/3 = b/4
== false

rebolek

13:32ok, but that's not parse

endo64

13:39In parse words (word! and lit-word!) are substituted by their values, but not paths. Otherwise a: 1 parse [a/b][a/b] would return an error and we would need to write parse [a][quote a] every time, so I think it is a reasonable design choice.

giesse

19:10@rebolek that's exactly what I posted above :P

19:11@endo64 IMHO it's a bug. [a] will match with the value of a (usually a sub-rule, but could be anything), while ['a] will match the word a (equivalent to [quote a]); in the same way, [a/b] should match with the value of a/b, while ['a/b] should match with the path a/b.
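
A small sketch of the word versus lit-word distinction described above, limited to the word! case that everyone in this thread agrees on. The path! case is deliberately left out because its behavior is the open question here. `digit` is just an illustrative name.

```red
Red []

digit: charset "0123456789"

; A plain word in a rule is replaced by its value (here, a bitset sub-rule).
probe parse "42" [some digit]

; A lit-word matches the word itself in block input...
probe parse [digit] ['digit]

; ...which is equivalent to quoting it.
probe parse [digit] [quote digit]
```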

19:13Note that the current behavior is incompatible with R2:

REBOL/Core 2.7.6.4.2 (14-Mar-2008)
...
>> parse [a/b] [a/b] 
** Script Error: a has no value
** Near: parse [a/b] [a/b]
>> parse [a/b] ['a/b]
== true

and, very possibly incompatible with future versions of Red (ie. the code we are writing today, such as the one I posted a while ago above, will break at some point).

rebolek

19:33@giesse as always, you're a few steps ahead :smile: And I agree with what you wrote, I also expect more R2-like behavior (lit-path! rule for matching path!).

endo64

20:05@giesse @rebolek If parse [a/b] [a/b] fails (and resolves the value of a/b) then I agree that parse [a/b] ['a/b] should behave like parse [a] ['a].
Do we have an issue on GitHub for this? Or should we discuss it in the /bugs room?

greggirwin

23:21It sounds like we all agree on this, so unless somebody else jumps in with a different view, let's open a ticket for it.

Tuesday 22nd October, 2019

giesse: 19:01https://github.com/red/red/issues/4101
greggirwin: 19:37Thanks @giesse.

Sunday 3rd November, 2019

cloutiy: 14:17Not sure if this room is the right place for this question, but is there such a thing as a C->Red or C->Red/System translator? Something that parses C and generates equivalent Red or Red/S code?
9214: 15:17> C->Red

No, because this doesn't make any earthly sense.

> C->Red/System

Only unfinished experiments ([1](https://github.com/rebolek/parse-c-header), [2](https://github.com/iceflow19/c2reds), [3](https://github.com/Pebaz/RGB)).
cloutiy: 17:12@9214 ok thanks for your response

Monday 4th November, 2019

rebolek: 09:07@9214 Thanks for mentioning parse-c-header. It's unfinished because C headers are just C in disguise. For Rebol, I did Reb-C, a translator for a subset of Rebol to C.
So the point is - Rebol/System to C is not that hard. It just takes one dedicated person and some time. But it's doable.
pekr: 11:06I don't know what Red/Pro is going to be about, but as it is supposed to bring us some 64-bit functionality, I would expect either a direct target or C emitters, so maybe it is worth the wait here?
11:07Well, not that any effort would be bad of course ....
rebolek: 12:08Red/Pro is certainly worth it, but I don't want to discourage anyone from creating alternatives. Generic C emitter would be useful (with minimal changes) for creating e.g. JS code. OTOH it seems easy but the devil is always in the details.
pekr: 12:35What I remember from the former Rebol past, was that someone tried to create a wrapper to C headers automatically. Not sure it was FFI or what it was. Something like that might be handy for R/S wrappers. But we are in a wrong group here anyway ...
rebolek: 12:39Parsing C headers requires full C parser. Anyway, we're still talking parsers, so I believe we're in right group ;)
greggirwin: 19:46I did a library dialect for R2, to make wrapping APIs easier. Also some code generators to map C %.h stuff. Many things, like constants, are simply grunt work and easy to port this way.

Red/Pro will not emit C. LLVM is the likely target.

Thursday 7th November, 2019

GiuseppeChillemi: 12:19@greggirwin While searching for REBOL documentation I have found the page of an old REBOL developer which is specialized in LLVM. Just FYI: http://www.complang.tuwien.ac.at/vpavlu/
greggirwin: 19:44I only met Viktor once, in 2005. Very nice guy. No idea where he is these days.
pekr: 20:02Those Austrian guys got somehow lost :-) Ditto for very talented Andreas Bolka (aka Earl) - very talented reboller ....
GiuseppeChillemi: 20:59If someone could make a list of all the people which have been in REBOL galaxy it would be good for getting back in contact with them. I remember Paolo Romano Tenca from Italy.

Monday 11th November, 2019

Oldes

12:00Unfortunately I think that Andreas must be dead as there is no activity on his Github since year 2016

pekr

12:08Or he just disappeared silently in the same way, BrianH did.

Oldes

12:32But BrianH is still somehow active https://github.com/thoughtbot/appraisal/pull/160 in Ruby community. Also on the mini Redcon in Brno years ago Earl was showing us his unreleased Rebol cross compiled to run in browser. He would not just disappear.

12:40I wonder if Brian was paid by Carl, because he was very active in R3 development and disappeared from Rebol together with Carl.

pekr

13:16The prominent member of Rebol community left for Ruby, how sad. In the past, he told me, that he was once asked by Ruby to create something like View for that.

greggirwin

17:40Moved community topic to :point_left: [red/chit-chat](https://gitter.im/red/chit-chat?at=5dc99cd04adf071a840f652b)

Respectech

18:08Here's something I'm missing in 'parse: Event loops. When I write a dialect, a lot of times I want something to happen over time or on a schedule when a certain condition is met in the dialect, or based on some outside input. VID would allow me to handle these conditions within the dialect, but I can't figure out how to easily do the same thing in 'parse rules and that limits my usage of 'parse.

greggirwin

18:25Can you give an example @Respectech? VID itself doesn't give you this feature, just a way to define timers. You can do the same with parse. In fact, @dockimbel wrote a scheduling dialect just for that, which I hacked on quite a bit for production use.

18:25The original is [here](https://www.softinnov.org/rebol/scheduler.shtml).

Respectech

18:52Something like this:

gpio [
		hardware odroid-c2 ;defines ids and features of pins for target hw
		
		legs: [5 4 7 6]
		
		servo-vals: read legs
		pin legs pwm rate 60 on-timer [
			write legs (gyro-correct/x)
		]
	]
	do-parse-events

greggirwin

18:57I don't see any problem there. We just don't have a non-View event loop yet, so you'd have to loop and check against timer intervals.
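
A rough sketch of the "loop and check against timer intervals" idea mentioned above, assuming a plain polling loop is acceptable. The words `interval`, `due` and `ticks` are hypothetical, and the `print` stands in for whatever an `on-timer` block in the dialect would do. This is not a real event loop and not how VID timers are implemented.

```red
Red []

interval: 0:00:00.5              ; hypothetical timer period
due: now/precise + interval      ; when the next tick should fire
ticks: 0

; Poll the clock until three ticks have fired.
while [ticks < 3][
    if now/precise >= due [
        ticks: ticks + 1
        print ["tick" ticks]     ; placeholder for the on-timer action
        due: due + interval      ; schedule the next tick
    ]
    wait 0.01                    ; sleep briefly between checks
]
```

A dialect could collect several such interval and action pairs while parsing and then hand them to one loop like this.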

Respectech

18:58Yeah, it's the event loop that I'm missing.

giesse

19:07@Respectech that's what ports are for :)

Wednesday 13th November, 2019

bubnenkoff

12:26How to translate this from Rebol to Red?

probe parse "The trip will take 21 days" none

hiiamboris

12:40split "The trip will take 21 days" sp

bubnenkoff

13:17@hiiamboris is it possible to do with parse?

13:17I am learning red by rebol docs

hiiamboris

13:33Sure, if you like parse ☻:
parse "The trip will take 21 days" [collect any [sp | keep to [sp | end]]]

rebolek

13:48:)

bubnenkoff

13:48Thanks! Do I understand correctly that we iterate over words separated by spaces? For example, how do I extract the image name from here?

a: "<img src='test.png'>"
parse a [to "="]

13:57

>> parse "He is a good man" [collect any [sp | keep to sp | end] ]
== ["He" "is" #"a" "good"]

Why am I getting # in the output?

rebolek

13:59It collects one letter as char!, not as string!. Also, it misses last word:

>> parse "He is a good man" [collect any [sp | keep to [sp | end]]]
== ["He" "is" #"a" "good" "man"]

14:00And to convert char! to string!:

>> parse "He is a good man" [collect any [sp | copy value to [sp | end] keep (form value)]]
== ["He" "is" "a" "good" "man"]

bubnenkoff

14:09Why

parse "aaa bbb" [collect any sp]
== []

rebolek

14:09@bubnenkoff collect starts collection, but needs keep to do actual collection of elements.

14:10any sp will match any space. You have more than just spaces in your input.

14:11

>> parse "aaa bbb" [collect [keep to sp skip keep to end]]
== ["aaa" "bbb"]

14:13To better illustrate, what any space matches:

>> parse "" [any space]
== true
>> parse "               " [any space]
== true

bubnenkoff

14:14

>> parse "" [any space]
== true

Why true?

rebolek

14:15Because any matches zero or more occurrences. If you want to match at least one space, you need to use some.

>> parse "" [some space]
== false
>> parse " " [some space]
== true

14:16Also, you can limit by count:

>> parse " " [1 3 space]
== true
>> parse "    " [1 3 space]
== false

14:16Here it matches one to three spaces. Not less, not more.

hiiamboris

14:17:point_up: [November 13, 2019 4:57 PM](https://gitter.im/red/parse?at=5dcc0bca52b73c7cb20e7491)

Okay, hotfix: parse "He is a good man" [collect any [sp | keep copy _ to [sp | end]]] ☻

bubnenkoff

14:19sp is alias to space?

dockimbel

14:19@bubnenkoff

>> parse "He is a good man" [collect any [sp | keep copy tmp to [sp | end]] ]
== ["He" "is" "a" "good" "man"]

Using a copy command you can force a series! as collected value.

>> series!
== make typeset! [block! paren! string! file! url! path! lit-path! set-path! get-path! vector! ha...

hiiamboris

14:21@bubnenkoff Do you know you can use ? in console?

>> ? sp
SP is a char! value: #" "

And there's a summary of Red parse here: https://www.red-lang.org/2013/11/041-introducing-parse.html

bubnenkoff

14:21thanks for tip!

14:22But I do not see any mention in ? that sp is the same as space

hiiamboris

14:25

>> ? space
SPACE is a char! value: #" "

bubnenkoff

14:32So Red has different words with the same meaning?

14:32

>> parse "aaa bbb" [collect [keep any sp]]
== []

I expect to get at least something in the result

rebolek

14:33You can't have anything, because your string starts with a, not with space.

14:35So your space rule does not match it and parse ends

14:35To continue with parsing, every rule must pass.

Thursday 14th November, 2019

bubnenkoff

06:48

>> parse "aaa bbb ccc ddd" [1 skip]
== false

Why false?

06:50

>> parse "aaa bbb ccc ddd" [1 skip]
== false
>> parse ["aaa bbb ccc ddd"] [1 skip]
== true

What does [] mean in Red?

toomasv

06:57@bubnenkoff In string parsing, elements are chars. You skip just one, then stop (1 not needed). In block parsing you skip the string and are in the end of input.
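
A short sketch of the distinction described above: the same `skip` rule consumes one character of string input but one whole value of block input. The inputs are made up for illustration.

```red
Red []

probe parse "aaa" [skip]                    ; false: two characters remain
probe parse "aaa" [3 skip]                  ; true: all three characters consumed
probe parse ["aaa"] [skip]                  ; true: the block holds a single value
probe parse ["aaa" 21] [string! integer!]   ; true: values can be matched by datatype
```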

bubnenkoff

07:02And what if I for example want to drop first two letters from string:
parse "aaa bbb ccc ddd" [collect [2 skip]] is wrong
a bbb ccc ddd is what I expected

toomasv

07:41collect is meaningful together with keep only:

>> first parse "aaa bbb ccc ddd" [collect [2 skip keep to end]]
== "a bbb ccc ddd"

But if that's all you want, it's easier to use just skip:

>> skip "aaa bbb ccc ddd" 2
== "a bbb ccc ddd"

hiiamboris

11:15@bubnenkoff you should really at least skim through http://www.rebol.com/docs/core23/rebolcore.html to understand the very basics of the language, instead of just doing random stuff with it. Will save you a lot of time in the end

bubnenkoff

11:39Red hangs on the following code: parse "He is a good man" [collect any [ keep to sp ] ]

toomasv

14:04@bubnenkoff Now think why it might do that:
1) It keeps string to first space
2) Now it is positioned before the first space
3) It searches for space
4) It finds the space just before its nose
5) It is at state (2) Gosh, I was here already! ...
:point_up: [November 13, 2019 4:19 PM](https://gitter.im/red/parse?at=5dcc110aea7d147cb370bbee)

PS. On Windows it doesn't hang but stops.

14:14Try thru instead of to. What happens? Why?
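
One way to make the loop advance, sketched under the assumption that the goal is to collect the words: consume the space explicitly after each `to sp`, and pick up the last word with `to end`.

```red
; `to sp` stops in front of the space, so the loop has to skip it itself
words: parse "He is a good man" [collect [any [keep to sp skip] keep to end]]
probe words    ; expected: ["He" "is" "a" "good" "man"]

; `thru sp` moves past the space, so the loop advances on its own,
; but then the space becomes part of every kept string
probe parse "He is a good man" [collect [any [keep thru sp] keep to end]]
```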

hiiamboris

14:38It's probably the stable 0.6.4 build that hangs ;)

greggirwin

17:23@bubnenkoff as @hiiamboris suggests, take some time to read and use some basic tools for learning. For example, in parse you can use set-word! values to mark a position in the input, and you can use paren! values as actions. With these two simple tools you can set markers and print them out in the console, to see where parsing stopped, how rules like some or any move through an input, and more.

Friday 15th November, 2019

bubnenkoff

06:58@greggirwin thanks! But could you give link with set-word! examples? I am about "set-word! values to mark a position in the input, and you can use paren! values as actions. With these two simple tools you can set markers and print them out in the console"

toomasv

07:18Here is a simple one:

>> parse load "first second third" [some [s: skip (print s/1)]]
first
second
third
== true

bubnenkoff

08:24Why s/1 and not simply s?

toomasv

08:37Because s refers to the whole input series at certain index, s/1 picks first element at this index.

08:40You can of course (print form s) or (probe s) too. Try. See?

pekr

08:43I am just on my cell phone, but I use marking to copy some portion of the input, mark start:, then end: and in parens (copy/part result start end)

08:45We should also mention :get-word to set new position

toomasv

08:48Yes, like marking (simplistic) start and end of parens here:

parse "first (second third) fourth" [
   some ["(" s: | ")" e: (print copy/part s back e) | skip]
]
;second third

09:39And, while playing with it, here is example of using get-word! to move back in series:

n: 0 
parse str: copy matrjonushka: "[beware (matrjonushka) here]" [
   some [
     "(" s: 
   | ")" e: if (5 > n: n + 1)(
         change/part back s matrjonushka e
      ) :s 
   | skip]
] str
== {[beware [beware [beware [beware [beware (matrjonushka) here] here] here] here] here]}

greggirwin

09:39@pekr I left out get-word! because it's more advanced, and @bubnenkoff is just starting out.

09:39Now there's an example though! :^)

09:40@bubnenkoff you also want to watch out for rules that can lead to infinite loops, because some don't advance the input. Start small and you'll get there.

bubnenkoff

13:41Thanks! Continue learning. New question, why collecting only second h1 tag?

page: {
<html> 
    <title> My Great Page</title>
    <h1>Big Heading A</h1>
        <p>Stuff in A</p>
    <h1>Big Heading B</h1>
        <p>Stuff in B</p>
</html>
    }

parse page [ collect any [thru <h1> | keep to "<" ]]
== ["Big Heading B"]

hiiamboris

13:49extra |

toomasv

14:02Adding to @hiiamboris' minimalistic explanation:
1) `thru <h1>` succeeds and brings you just behind `<h1>`
2) as last round succeeded `any` starts next round from new position
3) first subrule is `thru <h1>`, let's try this
4) goto (1); you are just behind second `<h1>`
5) do (2-3)
6) as there are no more `<h1>`s, first subrule fails
7) try next subrule, which is `keep to "<"`, success
8) new round, failure on first subrule, let's try next
9) infinite loop starts on second subrule as you are not advancing anymore
10) stop and return collected values
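
For comparison, a sketch of the same rule with the extra `|` removed, so that every successful `thru <h1>` is followed by a `keep`. The page is the one from the question.

```red
page: {
<html>
    <title> My Great Page</title>
    <h1>Big Heading A</h1>
        <p>Stuff in A</p>
    <h1>Big Heading B</h1>
        <p>Stuff in B</p>
</html>
}

; thru <h1> and keep to "<" now form one sequence instead of two alternatives
probe parse page [collect any [thru <h1> keep to "<"]]
; expected: ["Big Heading A" "Big Heading B"]
```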

greggirwin

17:29@toomasv these kinds of explanations are great. When we work on a parse IDE we can show matches and backtracking (even now, thanks to parse/trace). Figuring out how to visualize things such that they are as clear as this kind of explanation will be a fun challenge.

toomasv

17:55Yup, I thought a lot how to visualise it. Still not buried the thought.

18:14Actually, I'm not sure it is correct to say in (9) that "infinite loop starts". Rather, it stops because input does not advance.

GiuseppeChillemi

22:26@toomasv I have a couple of ideas too. I hope they will be complete in the future

22:49Is this correct ?

>> parse [a 1 b 2 c 3 d 4 e 5] [collect [keep any [keep word! | number! ]]]
== [a b c d e [a 1 b 2 c 3 d 4 e 5]]

I didn't expect another block inside the output block but duplicated word matches.

23:04Also, I don't understand the difference:

>> parse [a b c d e f] [to 'b collect [keep any word!]]
== [[b c d e f]]
>> parse [a b c d e f] [to 'b collect [ any keep word!]]
== [b c d e f]
>>

... and not expecting any keep word! working

Saturday 16th November, 2019

toomasv

05:51@GiuseppeChillemi First case:
1) collect initialises new collection-block
2) enter the block following collect
3) meet keep; now what shall we keep?
4) meet any; enter the block
5) any times you meet word!, keep it; skip number! (but both are successfully matched)
6) now first keep receives and keeps matched input from any, i.e. all of it
7) done, return collection
To get rid of block you can use pick after first keep.

GiuseppeChillemi

05:57@toomasv Why an inner block appears?

toomasv

05:58Because you keep the block matched by any. pick would pick elements of the block, but no block.

06:05To realise your wish in first case ("duplicated word matches"):

>> rule: [any [keep word! | number!]] parse [a 1 b 2 c 3 d 4 e 5] [collect [s: rule :s rule]]
== [a b c d e a b c d e]

06:11Second problem.

In first case you keep the block matched by any and you keep it as block.
In second case you keep words separately inside the any loop. (As keep word! is a rule, and any operates on the following rule, there is no reason why it should not be working.)

You can keep words without the block in first case if you use pick with keep:

>> parse [a b c d e f] [to 'b collect [keep pick any word!]]
== [b c d e f]

GiuseppeChillemi

09:12@toomasv @9214 pick is a new addition to Parse. It is not present either in the blog article introducing parse, or in any other page. Only red-by-example cites it, without examples or a full description.

09:14(I have not the knowledge needed to write about it, so one of you pro should do the job)

toomasv

09:26@GiuseppeChillemi It is introduced in this [blog article](https://www.red-lang.org/2017/03/062-libred-and-macros.html).

GiuseppeChillemi

09:27@toomasv
Here is the explanation for the first case:

> 2) enter the block following collect
> 3) meet keep; now what shall we keep?
> 4) meet any; enter the block

The expression "enter the block" could be bound to multiple subjects in my mind. What you mean for "enter the block" ? -Who is "entering?" (adding?) and where - or -Who is "entering?" (going inside the parsed block?) -or- "Going inside the block" , (block created by collect) -or- "Going inside the block" , (rules block).

toomasv

09:29Pick whatever makes sense for you.

GiuseppeChillemi

09:37@toomasv none of the above, it's me following the rule step by step.

09:41Also you solved my doubts with the [blog article](https://www.red-lang.org/2017/03/062-libred-and-macros.html) as other than not knowing about pick I didn't know it returned values as a block inside a block if multiple have been returned from the rule

09:45So 2 blocks are created from this code: the one by collect, the other one by KEEP which receives multiple values (or it is any management mechanism creating the block and passing it to KEEP)

toomasv

10:20keep is a keyword which makes things happen in Red's belly, so that whatever is matched next will be appended to the collection, either as series (if series is matched or copy keyword (together with a word) is used after keep) or as single value (if single non-series value is matched w/o copy) or multiple single values (if series is matched but pick is used after keep). To see how it *exactly* happens please look at [parse.reds](https://github.com/red/red/blob/master/runtime/parse.reds).
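
The three cases described above can be sketched side by side. The inputs are made up for illustration, and the comments give the result each case is expected to produce according to that description.

```red
; Block input, several values matched: kept as one nested block
probe parse [a b c] [collect [keep 2 word! keep word!]]        ; expected: [[a b] c]

; Same match with pick: the values are appended one by one
probe parse [a b c] [collect [keep pick 2 word! keep word!]]   ; expected: [a b c]

; String input: several chars are kept as a string, a single char as a char!
probe parse "abc" [collect [keep 2 skip keep skip]]            ; expected: ["ab" #"c"]
```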

GiuseppeChillemi

11:59@toomasv Thanks, I'll KEEP [parse.reds](https://github.com/red/red/blob/master/runtime/parse.reds) for future reading, as the topic is too advanced and I have not learnt R/S yet.

Monday 18th November, 2019

bubnenkoff

11:39I am trying to write a small app that parses XML and displays the result in VID.

view [ button "open" [ if file: request-file [ do [ x: parse file/text [ collect any [thru "<purchaseObject>" keep to "</purchaseObject" ]] ] ] ] area 900x700 x ]

The problem is that I can't understand how to display the result. It runs but does not print anything. I think the problem is with the last x, which holds the parse result data.

xqlab

12:21Just an example

view [ 
	button "open" [
		if file: request-file [ 
			a/text:   read file 
		] 
	] a: area 900x700 
]

You have to read the file before parsing the text. Give the area a name so you can access its fields.

toomasv

12:35Little addition:

a/text: mold parse read file [collect...]

bubnenkoff

12:37I want to display parsing result, not file content

12:37But let me few minutes to try, I understand idea

12:40Is it possible to do it like this:

view [ button "open" [ if file: request-file [  x: parse read file/text ....

toomasv

12:42After file: request-file, file is of type file!, i.e. it is filename. You have to read or load it.

bubnenkoff

12:51I made some improvements but still can't understand how to pass the result of parsing to the area:

view [ button "open" [ if file: read request-file [ x: parse file [ collect any [thru "<purchaseObject>" keep to "</purchaseObject" ]]  ] ] area 900x700 x ]

toomasv

12:53Combine what @xqlab and I proposed above...

bubnenkoff

12:54Why do I need mold?

toomasv

12:55Because collect from parse returns a block, mold will turn it into string. You can try form too, if it suits you.
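
A small sketch of the two conversions. The block is a made-up stand-in for what collect might return, not actual output from the XML file.

```red
result: ["Item one" "Item two"]   ; hypothetical collect result

probe mold result    ; {["Item one" "Item two"]}  brackets and quotes are kept
probe form result    ; "Item one Item two"        values joined with spaces
```

mold keeps the text loadable as Red data, while form is usually the more readable choice for showing the result in an area.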

bubnenkoff

12:56Wow! That works!

view [ button "open" [ if file: request-file [ a/text: mold parse read file [ collect any [thru "<purchaseObject>" keep to "</purchaseObject" ]]  ] ] a: area 900x700  ]

toomasv

12:56:+1:

12:58I don't see your result, but I guess you would want to use a: area 900x700 wrap if it is long enough.

bubnenkoff

13:21Yeah! Thanks!

13:22Is this the only way to put the parsed result into the area? Or are there other ways?

toomasv

13:27You mean to set areas' text?

bubnenkoff

13:29yes

toomasv

13:31From within a view itself I don't see another way as simple as that. But there surely are more ways to set areas' text.

bubnenkoff

13:34Ok, please explain what a/text: means. And why do we then write a: area to display the text? a: creates a word. Why do we need to create a word and set the area to it?

toomasv

13:41When interpreting the VID block with a: area, an area face is created with its facets (see [doc](https://doc.red-lang.org/en/view.html#_area)), and a refers to this face. By using the set-path a/text: you can set the text facet of this area, and with the path a/text you access that facet. If you just want to display the text and don't want to edit it, you can use text instead of area, e.g. t: text 900x700 wrap. Still, to set the text you use t/text:... and to access it t/text. You need to set a word to the area (or any face/style) if you want to access it directly later. But there are ways to access it indirectly too.

Tuesday 19th November, 2019

greggirwin: 01:36@bubnenkoff you can think of Red's path notation (a/text) like dot notation (a.text) in other langs.
GiuseppeChillemi: 21:02@bubnenkoff Which languages do you know?

Wednesday 20th November, 2019

bubnenkoff: 14:26@GiuseppeChillemi I am from D/Python. I am learning Red in my free time.

Friday 22nd November, 2019

bubnenkoff: 11:29What difference between:
action: ['buy | 'sell] and action: ["buy" | "sell"]
rebolek: 11:30@bubnenkoff first rule is for block parsing, second for string parsing.
11:30There is difference between [buy 1 kg of apples] and "buy 1 kg of apples".
xqlab: 11:41the first block contains **lit-word!**s, the second **string!**s as alternative
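
A short sketch of both rules at work. The rule names block-action and string-action are invented here to keep the two variants apart.

```red
block-action:  ['buy | 'sell]     ; lit-words match words inside a block
string-action: ["buy" | "sell"]   ; strings match characters inside a string

print parse [buy 1 kg of apples] [block-action integer! to end]   ; true
print parse "buy 1 kg of apples" [string-action to end]           ; true

; Mixing them up fails: the block holds the word buy, not the string "buy"
print parse [buy 1 kg of apples] [string-action to end]           ; false
```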

Monday 25th November, 2019

cloutiy

00:43Could someone explain a bit more on how to chop part of a parse subject. For example I have a start marker and stop marker and want to remove everything in between:

parse file [
    any [
        to separator start:
        to heading stop:
        ( change/part start stop "" )
        :stop
    ]
]

I think the second arg for change/part needs to be a length, but I'm not sure how to provide it. Trying stop - start says that - doesn't accept string!

01:51Ok, seems I got it to work....remove is what I wanted, not change. This works.

parse file [
    any [
        to separator start: (n1: index? start)
        to heading stop: (n2: index? stop)
        (n: n2 - n1)
        (remove/part start n)
        :stop
    ]
]

toomasv

04:30@cloutiy It is simpler:

parse file [any [to separator remove to heading]]

04:43Also, in your last example stop: records position when hitting heading, and when you continue from :stop, then parse continues from the index? of recorded stop: which generally is not where you want it to continue. You can check it if you insert (probe stop) just after :stop.

greggirwin

04:46Modifying the input you're parsing can be tricky, so consider doing the reverse. Collect everything you want to keep, simply skipping over the parts you don't want.
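
A sketch of that reverse approach. The separator and heading markers and the sample text are hypothetical, chosen only to make the snippet self-contained.

```red
; Hypothetical markers and input, for illustration only
separator: "---"
heading:   "# "
file: "keep me---drop me# Title and body"

parts: parse file [
    collect [
        any [keep to separator to heading]   ; keep text before a separator, jump to the heading
        keep to end                          ; keep whatever follows the last heading
    ]
]
probe rejoin parts    ; expected: "keep me# Title and body"
```

The input string is never modified, so there are no saved positions that can go stale.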

toomasv

10:21@cloutiy This would have worked too:

parse file [any [
   to separator start: 
   to heading stop: 
   (remove/part start stop) 
   :start
]]

bubnenkoff

14:29How do I print the alphabet? I generate it like this: z: charset [#"a" - #"z"]

14:30

>> foreach d z [print d]
*** Script Error: foreach does not allow bitset! for its series argument

dockimbel

14:43@bubnenkoff

>> z: charset [#"a" - #"z"]
== make bitset! #{0000000000000000000000007FFFFFE0}

A bitset is just an _array of bits_ used to represent here characters presence/absence _by convention_. Bitsets don't contain any other information than those bits, so they can be used to represent something else if needed. So in order to print characters corresponding to those bits, you have to generate the characters first, then check which ones are part of that bitset, like this for example:

>> repeat c 256 [if find z c: c - 1 [prin to-char c]] prin lf
abcdefghijklmnopqrstuvwxyz

bubnenkoff

15:03So bits: make bitset! [#"a" - #"b"] is 8 bits for a plus 8 bits for b?

15:13And how do I get the size of a bitset? length? shows me a strange number

rebolek

15:15no, it's one bit for **a**, one bit for **b**, etc

bubnenkoff

15:16one bit?!

15:17But it takes 8 bits to represent a single letter

rebolek

15:19no, in bitset each bit represents one letter

bubnenkoff

15:21A bit? A bit is 0 or 1. How can it represent a letter?

rebolek

15:22let's have a simple 10bit bitset: 0100000100. Here I have bits 2 and 8 set, so it can match for example numbers 2 and 8, or letters **b** and **h**.

cloutiy

17:13@toomasv regarding :stop, point well taken, I agree. In my file I actually put a comment to myself to see whether that was required or not. Regarding your simpler approach... brilliant! I like it. Thanks for pointing that out!

toomasv

17:14@cloutiy You are welcome!

GiuseppeChillemi

19:46Pardon, does it mean that bit n.1 correspond to letter in position 1 in bit set definition, and this sequence is stored in an ordered list, and so on?

22:32@rebolek I don't understand if you match the bits comparing to their hex value of the character or something else.

rebolek

22:33

>> enbase/base to binary! charset #"1" 2
== {00000000000000000000000000000000000000000000000001000000}
>> enbase/base to binary! charset #"2" 2
== {00000000000000000000000000000000000000000000000000100000}
>> enbase/base to binary! charset #"3" 2
== {00000000000000000000000000000000000000000000000000010000}
>> enbase/base to binary! charset #"4" 2
== {00000000000000000000000000000000000000000000000000001000}
>> enbase/base to binary! charset #"5" 2
== {00000000000000000000000000000000000000000000000000000100}
(etc)

22:34here you can see how the character is determined by the position

22:36if you wonder about the zeros in the beginning, it simply starts with ASCII character 0:

>> enbase/base to binary! charset to char! 0 2
== "10000000"
>> enbase/base to binary! charset to char! 1 2
== "01000000"
>> enbase/base to binary! charset to char! 2 2
== "00100000"
>> enbase/base to binary! charset to char! 3 2
== "00010000"
(etc)

Bitset is optimized to take shortest possible length.

greggirwin

23:17Bitsets are a generic "bitmap" of single bit values. In the context of parse or charsets in general, those bits are flags to indicate whether a particular character is part of the bitset. But bitsets are a powerful and space efficient structure that can be used for other purposes.
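
As a sketch of such another purpose, a bitset can serve as a compact set of small integers. The name seen and the numbers are made up for the example.

```red
seen: make bitset! 64      ; room for at least 64 flags, all switched off

seen/5: true               ; switch on flags 5 and 42
seen/42: true

print pick seen 5          ; true
print pick seen 6          ; false, this flag was never set
print pick seen 42         ; true
```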

Tuesday 26th November, 2019

bubnenkoff

07:21But how to code: "AB" letters? And How "BA"?
If for example А 000001 B 000010

07:57When should I use square brackets and when not?

tag: ["<" alphabet ">" skip thru "</" alphabet ">"]

Why can't I simply write:

tag: "<" alphabet ">" skip thru "</" alphabet ">"

rebolek

08:03@bubnenkoff
> But how to code: "AB" letters? And How "BA"?

You don't. A bitset matches both A and B, regardless of order.

08:07

>> halo: charset "halo"
== make bitset! #{0000000000000000000000004089}
>> parse "halo" [4 halo]
== true
>> parse "ahol" [4 halo]
== true
>> parse "aaaa" [4 halo]
== true

bubnenkoff

08:15ok! Understood! But what size of bitset? How can it grow?

08:16What kind of information are usually coding by bitset?

08:33And how to debug bitset? I need to know what range it include:
alphabet: union charset [#"a" - #"z"] charset [#"A" - #"Z"]
to be sure that all correct

Oldes

08:59@bubnenkoff

>> alphabet: charset [#"a" - #"z" #"A" - #"Z"]
== make bitset! #{00000000000000007FFFFFE07FFFFFE0}
>> length? alphabet
== 128
>> append alphabet #"ř"
== make bitset! #{00000000000000007FFFFFE07FFFFFE000000000000000000000000000000000000000000000000000000040}
>> length? alphabet
== 352
>> repeat i length? alphabet [ if find alphabet i [print [i to-char i]] ]
65 A
66 B
...
345 ř

bubnenkoff

09:46Thanks! Is there any difference between to char! and to-char? Or is it a shortcut?

10:16I can't understand how break works. "break out of a matching loop, returning success."

a: "<app><div>Hello</div></app>"
>> parse a ["<" thru ">" break]
== false

Why? I thought it would evaluate the first part as true, and then break would stop evaluation and return true

Oldes

10:21to-char is just a shortcut... don't use it in tight loops ;-)

>> ?? to-char
to-char: func ["Convert to char! value" value][to char! :value]

10:24Just use to end instead of break if you want to end and return true:

>> parse a ["<" thru ">" to end]
== true

10:27Btw.. in above bitset example, I could use: alphabet/:i instead of find alphabet i

bubnenkoff

10:29but what wrong with break?

Oldes

10:33Maybe someone else will answer... I'm not a break user. But it makes sense, that as there is a way how to end with true, there should be a way how to end with false too.

bubnenkoff

12:17There is example of usage break, but I do not understand what it do https://www.red-lang.org/2013/11/041-introducing-parse.html

Oldes

12:25It escapes from the recursive rule in the blog's example.

12:35@bubnenkoff maybe this will help you to understand:

>> parse "aabb" [some [#"a" break] copy rest to end] rest
== "abb"
>> parse "aabb" [some [#"a" ] copy rest to end] rest
== "bb"

12:36The break above escapes from the some rule, but still keeping successful result.
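
The earlier `false` is consistent with the general rule that parse returns true only when the rules succeed and the whole input has been consumed. A sketch with a position marker (the word stop is chosen here for illustration) shows where matching ended:

```red
a: "<app><div>Hello</div></app>"

; The rule itself matches, but input is left over, so parse returns false
print parse a ["<" thru ">" stop:]
probe stop        ; expected: "<div>Hello</div></app>"

; Consuming the rest of the input makes the same match return true
print parse a ["<" thru ">" to end]
```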

GiuseppeChillemi

13:18I have tried some simple instruction to understand how bitset works:

With the following, it seems you convert to a hexadecimal or octal value.

>> charset ["a"]
== make bitset! #{00000000000000000000000040}

And also:

>> enbase/base to binary! charset ["a"] 2
== {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000}

This is the equivalent of decimal 64, so the above one is a hexadecimal representation.

Well, first question: in which character set is a mapped to Hx0040 ?

toomasv

13:23@GiuseppeChillemi Wrong question :)

>> enbase/base to binary! charset ["a"] 16
== "00000000000000000000000040"

>> enbase/base to binary! charset ["a"] 64
== "AAAAAAAAAAAAAAAAQA=="

13:30

>> cs: charset "a"
== make bitset! #{00000000000000000000000040}
>> repeat i length? cs [if cs/:i [print i]]
97

Oldes

13:31@GiuseppeChillemi does this help to understand?

>> to-integer #"a"
== 97

>> b: enbase/base to binary! charset ["a"] 2
== {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000}

>> index? find b "1"
== 98 ;<--- 1-based indexing when using string series!

GiuseppeChillemi

17:32Well, 98 is the number of zeroes before number 1

17:32[![image.png](https://files.gitter.im/red/parse/KKxC/thumb/image.png)](https://files.gitter.im/red/parse/KKxC/image.png)

17:32I have manually counted them !

17:34But I don't understand the source of number 97, why a converts to 97 ?

17:37Found: U+0061 a 61 LATIN SMALL LETTER A [Source](https://www.utf8-chartable.de/)

rebolek

17:45@GiuseppeChillemi it's easy to find:

>> to char! 97
== #"a"

17:45or

>> to integer! #"a"
== 97

GiuseppeChillemi

17:46@rebolek So, position is not number 97, I see UTF starts from zero, so a is the 98th character

rebolek

17:47right

GiuseppeChillemi

17:57For to integer! #"a" the number a is decimal 97, while in the charset the bit is ON in position 98 because UTF8 starts from 00 ....

18:00I am not able to understand how everything correlates, but If I consider only the position of a in the UTF8 table, I could imagine Red converts the letter to its absolute position and then check that flag.

18:01So it does for any letter mentioned in the charset range

18:02We will have as many 1 for each corresponding absolute position of the letter mentioned in the range.

greggirwin

18:21Imagine bitsets as a bunch of toggle switches, on or off. #"a" is switch 98. This, as you note, is because characters use their UCP (Unicode Code Point) to determine which switch applies to them. If you look at a charset, pretend going down the list of "letters" in it and flipping each corresponding switch up. Then you can look at your switches and see, at a glance, which are on or off.

As to #"a" specifically, it's because unicode used the old ASCII codes for compatibility.

Oldes

18:28Forget 98, #"a" is 97!

>> alphabet/97
== true
>> to-char 0
== #"^@"
>> alphabet/0: true
== true
>> find alphabet #"^@"
== true

18:30Bitset is using zero based indexing... the 98 was there, because classic series use 1-based indexing!

18:31@GiuseppeChillemi I don't understand why you don't understand what bitset is. It is so easy.

18:33If you forget charset and look at it just as a collection of bits:

>> b: make bitset! []
== make bitset! #{00}
>> length? b
== 8
>> b/0: true
== true
>> enbase/base to-binary b 2
== "10000000"

18:34The only thing that may be confusing is the zero based indexing in bitsets... but you are not expected to convert bits to string and count 1 characters.
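
A compact way to check the zero-based indexing without counting digits, sketched with a one-character charset:

```red
cs: charset "a"

print to integer! #"a"    ; 97, the code point of a
print pick cs 97          ; true, bit 97 is the one that is set
print pick cs 98          ; false
print pick cs #"a"        ; true, a char! indexes by its code point
```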

greggirwin

18:44Thanks for clarifying @Oldes.

GiuseppeChillemi

19:21@Oldes > I don't understand why you don't understand what bitset is. It is so easy.

19:22I will reply to you later, I have taken notes about the thought process.

21:53@Oldes

The first question I asked myself was: how is a character converted to a bitset? @bubnenkoff and I were convinced that a character is converted to the bit representation of its hex number in the ASCII or UTF8 map. We started searching for confirmation, but the first examples did not help because it is very difficult to make a correlation from a full charset to 000…7FFFFFE0

>> z: charset [#"a" - #"z"]
== make bitset! #{0000000000000000000000007FFFFFE0}

I have started thinking about ASCII to Hex conversion, as:

>> charset ["a"]
== make bitset! #{00000000000000000000000040}

Hx0040 is 64 decimal and UTF8 tables reported:

21:53[![image.png](https://files.gitter.im/red/parse/s0Iu/thumb/image.png)](https://files.gitter.im/red/parse/s0Iu/image.png)

21:53

Having 1 integer distance could mean a different index start. (Only Later I realized than you wrote a and not A )

Another problem trying to find the was rapresented from the most and least significant bits:

**How do you read this binary number ?**

>> enbase/base to binary! charset to char! 3 2
== "00010000"

Is the least significant number on the left or on the right ? I supposed it was on the right as:

>> enbase/base to binary! charset ["a"] 2 == {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000}

Is 64, as Hex0040! So you would never think the **relevant information** is all the zeroes on the left, **you think that it is on the right side.**

Your last example put me on the right track:

>> to-integer #"a"
== 97

>> index? find b "1"
== 98 ;<--- 1-based indexing when using string series!

Red does not convert the ASCII position to a binary number but to a bit string with N zeroes before the 1. The 1 is put on the **nth** position from the left, based on the position of the character in the UTF8 definition, whose table is considered an array starting from 00 as position 1, while in Red a series usually starts at 01.

22:05Correction
* *Another problem was trying to find where you have represented the most and least significant bits:*

hiiamboris

22:07Such an adventure ☻

rebolek

22:08You're making it sound really complicated. A bitset is a simple series of bits. Each character is represented by one bit. They are ordered from left to right, according to the char's position in the ASCII/Unicode table. That's all.
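
That ordering can be made visible with a small helper that walks the bits from index 0 upward. chars-of is a hypothetical name, not something Red provides; this is only a sketch.

```red
; chars-of is a hypothetical helper name, not part of Red
chars-of: func [
    "Return the characters whose bits are set, lowest code point first"
    bs [bitset!]
    /local out n
][
    out: copy ""
    repeat i length? bs [
        n: i - 1                              ; bit indexes start at 0, repeat starts at 1
        if pick bs n [append out to char! n]
    ]
    out
]

probe chars-of charset "halo"           ; expected: "ahlo"
probe chars-of charset [#"a" - #"e"]    ; expected: "abcde"
```

The result comes out in code point order whatever order the characters were given in, which is why a bitset cannot tell "AB" from "BA".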

Oldes

22:26Btw... Red/Rebol 's bitset implementation is trivial... Petr Kobalicek is taking more care about bitsets in his Blend2D -> https://gitter.im/blend2d/blend2d/archives/2019/08/28

GiuseppeChillemi

22:30Your explanation needs an example to be added to the end:

Let's build a simple charset:

charset [#"a" - #"b"]

take the position number of the character a in the UTF-8 table, it is 97 but consider its position is 98 as the UTF-8 table starts from 0. Then add 97 zeroes and one 1 in position 98.

This is the resulting bitset

== {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000}

Character b is position 98 in the UTF-8 table, so it is 99 in our bitset, add 98 zeroes and set the 99th bit to 1.

== {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000}

Merge the 2 bitsets and result is:

== {00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001100000}

When red encounters the letter a will look at 98th bit and if set to 1, as it is, will return as a matching character in parse.

22:45However, thank you to everyone who helped. Personally, bitsets were obscure to me and now they are very clear. Another step forward in knowing Red.

22:48@hiiamboris Such an adventure ☻

22:48Learning Red is a great adventure!

23:21@rebolek your..

23:21> Bitset is simple series of bits

23:22Let me think that "bitseries" better describes the real mechanism.

Wednesday 27th November, 2019

toomasv

04:45@GiuseppeChillemi This may explain your initial confusion:

>> to-binary "a"
== #{61}
>> to-binary to-bitset "a"
== #{00000000000000000000000040}

greggirwin

05:41:point_up: [November 26, 2019 11:21 AM](https://gitter.im/red/parse?at=5ddd6d1c32df1245cb96acf5) @GiuseppeChillemi (or anyone else), did my explanation make any sense, or how can it be improved?

@GiuseppeChillemi "bitseries" does not describe the mechanism. Bitsets are not series. They are *exactly* what the name says: sets of bits.

GiuseppeChillemi

06:23@greggirwin my difficulty has been on the correlation mechanism between the source and the bitset.

I was imagining converting the single character to its bit representation. #"a", which is decimal 97, would be converted to "01100001".

> Imagine bitsets as a bunch of toggle switches, on or off. #"a" is switch 98

I think adding a little phrase to the end would help:

*Imagine bitsets as a bunch of toggle switches, on or off. #"a" is switch 98 because on UCP8 #"a" is located on position 98 of the table.*

Then a note:

*"Decimal value is different than its position in the UCP table as it starts from 0, so #"a" is located in position 98 while its decimal value in the table is 97*

06:24As you can see, this conversion had multiple subtle difficulties finding the correct mechanism.

06:29@toomasv

> @GiuseppeChillemi This may explain your initial confusion:
>

> >> to-binary "a"
> == #{61}
> >> to-binary to-bitset "a"
> == #{00000000000000000000000040}
>

Yes, it is confusing right now:

#{61} and #{00000000000000000000000040}

For me, these numbers are not a binary representation. I expect 00101010100 style, but maybe they are and I am missing something.

toomasv

07:01to-binary results in binary series represented in base 16. To convert it to base 2 use enbase/base 2.

07:03(But this is string, so not actually a conversion)

07:31This would be the base 2 binary representation:

>> print head insert back tail "2#{}" enbase/base to-binary "a" 2
2#{01100001}

>> print head insert back tail "2#{}" enbase/base to-binary charset "a" 2
2#{00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000}

GiuseppeChillemi

07:37@toomasv You have touched a point I intentionally skipped, deferring it to a further investigation: the lack of visual information about the numeric base used. I did not know if it was me lacking this knowledge or if there really was no such info.

toomasv

07:40See binary! in [doc](https://doc.red-lang.org/en/datatypes/binary.html) or in [spec](https://github.com/meijeru/red.specs-public/blob/master/specs.adoc#binary).

bubnenkoff

09:29> Thanks! Continue learning. New question, why collecting only second h1 tag?
>

> page: {
> <html> 
>     <title> My Great Page</title>
>     <h1>Big Heading A</h1>
>         <p>Stuff in A</p>
>     <h1>Big Heading B</h1>
>         <p>Stuff in B</p>
> </html>
>     }
> 
> parse page [ collect any [thru <h1> | keep to "<" ]]
> == ["Big Heading B"]
>

I solve it with break:

>> parse page [ collect any [thru <h1> keep to "<" break]]
== ["Big Heading A"]

toomasv

09:42@GiuseppeChillemi
Crash course in binary and charset (Sorry @rebolek!)

;Let's convert a string to binary
bn: to-binary "Březovský"
;== #{42C599657A6F76736BC3BD}

;Now try to convert each byte back to characters
foreach b bn [prin to-char b]
;BÅezovskÃ½

;But if we convert it to string
to-string bn
;== "Březovský"

;Let's see the chars
to-binary "B"
;== #{42}
to-binary #"ř"
;== #{C599}

;Ha!
forall bn [
    either bn/1 < 128 [
        prin to-char bn/1
    ][  prin to-char copy/part bn 2 bn: next bn]
]()
;Březovský


;Now charset
cs: charset "Březovský"
;== make bitset! #{0000000000000000200000000411122000000000000000000000000000000004000000000000000000000040}

;Here they are in "alphabetic" (or rather "utfic") order
repeat i length? cs [if cs/:i [prin to-char i]]()
;Bekosvzýř

;Nice thing about charsets (well, bitsets actually) is that we can do set operations with these
cs2: charset "Boleslav"
;== make bitset! #{000000000000000020000000440912}

cs-union: union cs cs2
;== make bitset! #{0000000000000000200000004419122000000000000000000000000000000004000000000000000000000040}
repeat i length? cs-union [if cs-union/:i [prin to-char i]]()
;Baeklosvzýř

cs-excl: exclude cs cs2
;== make bitset! #{0000000000000000000000000010002000000000000000000000000000000004000000000000000000000040}
repeat i length? cs-excl [if cs-excl/:i [prin to-char i]]()
;kzýř

cs-diff: difference cs cs2
;== make bitset! #{0000000000000000000000004018002000000000000000000000000000000004000000000000000000000040}
repeat i length? cs-diff [if cs-diff/:i [prin to-char i]]()
;aklzýř

cs-compl: complement cs2
;== make bitset! [not #{000000000000000020000000440912}]
repeat i length? cs-compl [if cs-compl/:i [prin to-char i]]()
	

 !"#$%&'()*+,-./0123456789:;<=>?@ACDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`bcdfghijkmnpqrtuwx

Quick [intro](https://medium.com/@apiltamang/unicode-utf-8-and-ascii-encodings-made-easy-5bfbe3a1c45a) to Unicode, UTF and ASCII.
Have a look at [utf/unicode chartable](https://www.utf8-chartable.de/unicode-utf8-table.pl)

to-string #{e28880 78 e28883 79 20 28 42 78 20 e288a7 20 47 79 29 20 e28a83 20 4C 78 79}
;== "∀x∃y (Bx ∧ Gy) ⊃ Lxy"

;You can use unicode code points to get characters if you convert these to integers first:
to-char to-integer #{01F609}
;== #"😉"
to-char to-integer #{01F475}
;== #"👵"

;But to get multibyte characters (also chars with decimal value > 127) in string you should use utf encoding
print to-string #{f09f91b5 20 f09f9889}
👵 😉

10:17@bubnenkoff If you want first heading only then you don't need any loop:

>> parse page [ collect [thru <h1> keep to "<"]]
== ["Big Heading A"]

rebolek

11:25@toomasv :smile:

xqlab

12:47@bubnenkoff you get only the second block because of your | with the meaning **or**. As the first condition is fulfilled, the second one with keep will not be processed as long as the first condition is sufficient.

>> parse page [ collect any [thru <h1>  keep to "<" ]]
== ["Big Heading A" "Big Heading B"]

endo64

13:21Great explanations as always @toomasv, just a small addition to see what is going on (I've shortened the output so as not to flood your screens):

repeat i length? cs [either cs/:i [prin to-char i] [prin "."]]
...................B..................................e.....k...o...s..v...z.........................................................ý...................................................ř.......

rebolek

13:22@endo64 great visualisation!

endo64

13:28Should we create a wiki page for bitset!s and add those conversation notes + Toomas' explanations there?

rebolek

13:31It makes sense, so next time we can just point people to that wiki.

toomasv

14:45@endo64 :+1: Nice!

GiuseppeChillemi

17:16I am not in front of my pc but I see I have new Red lessons ready for this evening.

17:16Thanks @toomasv!

toomasv

17:32:smile:

GiuseppeChillemi

21:17@toomasv Lesson read. Some things to try but everything is taking a shape.
@endo64 Nice visualization.

greggirwin

21:39Wiki page FTW! :^)

Thursday 28th November, 2019

bubnenkoff

08:03ws: charset reduce [space tab cr lf]
Why reduce is here? Why not simple: ws: [space tab cr lf]?

xqlab

09:19because words in a block are shielded against evaluation / automatic reducing. Just see the block in the command line with and without reducing.
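
A minimal sketch of the point above, assuming a recent Red console (the sample input string is made up for illustration): a literal block keeps its words as words, while reduce replaces each word with the value it refers to, which is what charset needs here.

```red
Red []

; A literal block keeps its words unevaluated
probe [space tab cr lf]            ; [space tab cr lf]

; reduce evaluates each word to its char! value
probe reduce [space tab cr lf]     ; [#" " #"^-" #"^M" #"^/"]

; The reduced block is a list of characters that charset can use
ws: charset reduce [space tab cr lf]
probe parse " ^-^/" [some ws]      ; true
```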

bubnenkoff

10:09"find simply checks if the field exists, returning true or none"
:

>> b
== ["name" "price" "age"]
>> find b ["age"]
== ["age"]

Why I am not getting true\false?

10:26There is bug in latest compiler. I am getting exception on next code:

obj: object [a: 44]
print find obj 'a

10:30> "find simply checks if the field exists, returning true or none"
> :
>
> >> b
> == ["name" "price" "age"]
> >> find b ["age"]
> == ["age"]
>
> Why I am not getting true\false?

The same behavior on latest and stable Red

endo64

10:33> find simply checks if the field exists, returning true or none

@bubnenkoff Where did you read this? find used to work on objects and returned true/false, but not anymore; that feature was removed in recent versions.
Now you should get the error *** Script Error: find does not allow object! for its series argument.

10:35For the blocks find returns the given block in the position where it found the searched value (or none if it can't find). So find b ["age"] ; == ["age"] is correct.

bubnenkoff

11:23I read it in Helpin' Red.pdf. Sorry, I missed the mention of object at the top.

11:24@endo64 how to evaluate find as true\false for the block checking?

endo64

11:26

b: ["name" "price" "age"]
not none? find b "age" ; returns true/false
; or
if find b "age" [print "found!"]

xqlab

12:09or short to-logic

12:10

>> to-logic find ["name" "price" "age"]  'a
== false
>> to-logic find ["name" "price" "age"]  "name"
== true

bubnenkoff

13:16thanks!

toomasv

14:02:point_up: [November 27, 2019 11:39 PM](https://gitter.im/red/parse?at=5ddeecf426eeb8518f2a4832)

> One should use caution when using the term to older people who are unaware of the new connotation.

[Source](https://turbofuture.com/internet/ftw-meaning*)

endo64

14:59@toomasv I thought the same thing :)

greggirwin

18:22Wow! My apologies. I had never seen the other meaning. The one time I try to keep up with the times...

18:25There's a comedian named (An)Drew Tarvin who does a bit on this.

> I got back from Switzerland and texted my Grandma. She wrote back "Switzerland? WTF!". I had to call her and ask what she thought WTF meant. "Oh, a lady in my bridge club said it means 'Wow, that's fun!'", and I told my Grandma...that's *exactly* what it means.

18:28@bubnenkoff on true/false testing, in Red only 2 values are considered "false": the logic! value false (use help logic! to see all aliases), and none. Everything else, including unset! is considered true. So you don't have to cast to logic! values if all you care about is truthiness.
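
A small sketch of that truthiness rule, assuming the block b from the earlier messages: find can be used directly as a condition, because a series position is truthy and none is falsy.

```red
Red []

b: ["name" "price" "age"]

; find returns a series position (truthy) or none (falsy),
; so it can be used as a condition without casting to logic!
either find b "age" [print "found"] [print "missing"]    ; prints "found"
either find b "zip" [print "found"] [print "missing"]    ; prints "missing"

; zero and the empty string are also truthy in Red
if 0  [print "0 is truthy"]
if "" [print "empty string is truthy"]
```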

Friday 29th November, 2019

bubnenkoff

10:11I described simple open\closing tag construction:
tg: [any ["<" thru ">" opt lf "<" opt "/" thru ">"]]
How I can iterate by simple tags:

I tried to do something like:

>> foreach tg "<div><title></title></div>" [print "hello"]
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello
hello

But can't understand why so many hello printed.

10:24It's seems that it's doing print after every check of rule. But I need to do print only 4 times on: "div" "title" "title" and "div"

Oldes

10:39@bubnenkoff try this:

foreach tg "<div><title></title></div>" [print tg]

and you will understand.

10:40You are expecting tg to contain tags, but actually you are getting each char from the given string series.

pekr

10:41The code does not make much sense, most probably you wanted to use your tg rule as a rule in parser?

Oldes

10:48@bubnenkoff you probably want something like this:

>> tg: [#"<" opt #"/" thru #">"]
== [#"<" opt #"/" thru #">"]
>> parse "<div><title></title></div>" [any [s: tg e: (probe copy/part s e)]]
"<div>"
"<title>"
"</title>"
"</div>"

10:50Or:

parse "<div><title>blabla</title></div>" [any [s: tg e: (probe copy/part s e) | 1 skip]]

if you want do deal with other content, not just tags.

10:53Or:

>> parse "<div><title>blabla</title></div>" [collect any [keep tg | 1 skip]]
== ["<div>" "<title>" "</title>" "</div>"]

bubnenkoff

12:10@Oldes please explain me this part:

s: tg e: (probe copy/part s e)

Oldes

12:13s and e hold series at positions (start and end) of the input. Just add probe s probe e and see.

12:15copy/part s e creates a new substring from the input.

12:19@bubnenkoff and when you are in the learning phase, it is good to know, that you can change the input using get-word!.. for example:

>> parse str: "aabbaa" [any [some #"a" | s: some #"b" e: (e: change/part s #"c" e) :e]]
== true
>> str
== "aacaa"

bubnenkoff

12:35@Oldes You wrote:

>> tg: [#"<" opt #"/" thru #">"]

My code is:

tg: [any ["<" thru ">" opt lf "<" opt "/" thru ">"]]

Is they are same? Should I use # for single chars as in your example?

13:28> s and e hold series at positions (start and end) of the input. Just add probe s probe e and see.

Not fully understand. It's standard any behavior? Could you provide another example of Handling start and end position with any

13:35I am trying to figure out how to work any with start and end. I tried to do small example but failed with it logic and result:

>> parse "aabbcc" [any [s: "aa" (print s) "bb" e: "cc" (print e) ] ]
aabbcc
cc

I expect to get printed s and e values

pekr

13:35I think that it would be good to read about Rebol's parse here: http://www.rebol.com/docs/core23/rebolcore-15.html

13:44set-words marks the position during the parsing. If you mark two positions, it means 2 index positions, so copy/part start end will copy the string between the start and end marks:

>> string: "This is some text."
== "This is some text."
>> start: skip string 4
== " is some text."
>> end: skip string 8
== "some text."
>> copy/part start end
== " is "

13:48In your example above, you did get the s and e values printed. Just remember: those are just a kind of pointer/marker at a certain string position.

Oldes

14:15I think it is good to read [all chapters](http://www.rebol.com/docs/core23/rebolcore.html), not just about parsing;-)

14:34@bubnenkoff the set-words in the parse rules are not related to any. Try this:

>> str: "abc" (parse str [1 skip bc:] bc) = skip str 1
== true

Tuesday 3rd December, 2019

bubnenkoff

11:57> s and e hold series at positions (start and end) of the input. Just add probe s probe e and see.

Still can't understand why you are saying that it's hold position, but not data

12:10> set-words marks the position during the parsing. If you mark two positions, it means 2 index positions, so copy/part start end will copy the string between the start and end marks:
>
> >> string: "This is some text."
> == "This is some text."
> >> start: skip string 4
> == " is some text."
> >> end: skip string 8
> == "some text."
> >> copy/part start end
> == " is "

Am I right understand that here start and end store not simple value, but value and pointer to data?

toomasv

12:29They all point to the same string, their own data is index in that string.
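
A short sketch of that idea outside of parse, reusing the string from the quoted example (the names start-pos and end-pos are made up for illustration): both words refer to the same string and differ only in their index, so a change to the string is visible through either of them.

```red
Red []

str: "This is some text."
start-pos: skip str 4
end-pos:   skip str 8

; Each word carries its own index into the one shared string
probe index? start-pos              ; 5
probe index? end-pos                ; 9
probe same? head start-pos str      ; true

; Appending to the string is visible through every position
append str " More"
probe end-pos                       ; "some text. More"

; copy/part makes a new string from the span between two positions
probe copy/part start-pos end-pos   ; " is "
```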

bubnenkoff

12:33I am not understanding:

>> x
== "aaabbbcccddd"
>> parse x [any [s: "aaa" (print s) e: "ccc" (print e) ] ]
aaabbbcccddd

I expected:

aaabbbccc

toomasv

12:56s: will point to x at index 1. When rule reaches (print s) it prints s, which is basically same as x. As it doesn't match anything further, it fails.

>> s = x
== true

13:07To get what you expected:

>> parse x [s: "aaa" thru "ccc" e: (print copy/part s e)]
aaabbbccc
== false
>> parse x ["aaa" thru "ccc" e: (print copy/part x e)]
aaabbbccc
== false
>> parse x ["aaa" thru "ccc" e:] print copy/part x e
aaabbbccc

greggirwin

15:57@bubnenkoff leave parse for the moment and play with series values directly. Use navigation funcs like [head tail next back skip at index? pick poke] and things like copy/part. Get a feel for those, then come back to parse, and I think it will make more sense.

toomasv

16:25@bubnenkoff One more funny thing to ponder:

>> parse x ["aaa" thru "ccc" e:] clear e print x
aaabbbccc

Wednesday 4th December, 2019

bubnenkoff

13:19@Oldes I am not understanding why you wrote:

tg: [#"<" opt #"/" thru #">"]

opt skips one element, right? So I should write something else, like: opt any alphabet. Why does your code work?

Oldes

13:20to accept opening tags and also closing tags... yes, there is the thru, so I could avoid it, but there could also be something more sophisticated instead of the thru.

rebolek

13:20opt is not skip one element. opt is for optionally matching it.

13:21"this element may be here or may not, both is fine"

bubnenkoff

13:25> to accept opening tags and also closing tags... yes, there is the thru, so I could avoid it, but there could also be something more sophisticated instead of the thru.

Oh! I understood!

rebolek

13:31to and thru are great way to shoot yourself in the foot

bubnenkoff

14:19@Oldes I am still trying to understand. You wrote:

parse "<div><title></title></div>" [any [s: tg e: (probe copy/part s e)]]

> s and e hold series at positions (start and end) of the input. Just add probe s probe e and see.

I have trouble with understanding start and end.
So if I am parsing string:

parse "aabbcc"

the end of it is cc. So parsing rule will be:
parse "aabbcc" [s: "aa" e: thru "cc"]
s - will hold start, e - end

right?

15:26Or here, in any [...], can there be more than two positions:

parse "<div><title></title></div>" [any [s: tg e: (probe copy/part s e)]]

Oldes

16:29

>> parse "aabbcc" [s: "aa" e: thru "cc"]
== true
>> s
== "aabbcc"
>> index? s
== 1
>> e
== "bbcc"
>> index? e
== 3
>> copy/part s e
== "aa"
>> copy/part "aabbcc" (3 - 1)
== "aa"

16:32when you use set-word! as a parse rule, it stores current input's position.

Thursday 5th December, 2019

bubnenkoff

12:07@Oldes big thanks, sorry for many simple questions about same. Yes now I am understanding more. But still have problem with:

e: (probe copy/part s e)

we are creating word e and what happens than? What value have last e (before closing paren)

12:13How it will evaluate? the last e will automatically point to end of match? and than we will do copy/part from start to end?
I can't understand how end index will be placed there.

12:18Wow! I did not know that it's possible!

parse "aaabbbccc" [s: "aaa" thru "ccc" e: ]

e is getting ccc index!

13:00

>> print ("aa" "bb")
bb

How to print both?

13:03I am playing with index and want to print them:

parse a [any [s: tg e: (probe (index? s index? e) ) ]]

13:14What difference between join and rejoin? Is it good to write like:

>> parse a [any [s: tg e: (probe rejoin [index? s " - " index? e] ) ]]
"1 - 6"
"6 - 11"
"11 - 18"
"18 - 26"
"26 - 32"
"32 - 38"

GiuseppeChillemi

13:58@bubnenkoff Red has no join

bubnenkoff

14:09@GiuseppeChillemi join is rudimental?

>> ? join
     rejoin          function!     Reduces and joins a block of values.

GiuseppeChillemi

14:22[![image.png](https://files.gitter.im/red/parse/j62N/thumb/image.png)](https://files.gitter.im/red/parse/j62N/image.png)

bubnenkoff

14:22oh! thanks!

GiuseppeChillemi

14:22There is no join, only REJOIN

14:22Ok

14:24you are welcome

xqlab

14:45@bubnenkoff

>> print ["aa" "bb"]
aa bb

("aa" "bb") is the same as do ["aa" "bb"], which gives the last value

bubnenkoff

14:47Is it's possible to do?

d: charset [#"0" - #"9"]

I am asking because I have empty result:

>> repeat c length? d [if find to-string d c [ print to-char c]]
== none

Oldes

14:58@bubnenkoff just: repeat c length? d [if find d c [ print to-char c]]

GiuseppeChillemi

18:42 A definition appeared in my mind today: "fuzzy parse".

18:43Let's search if it exists.

greggirwin

21:14:point_up: [December 5, 2019 5:07 AM](https://gitter.im/red/parse?at=5de8f3141659720ca8021036) @bubnenkoff there is a *very important* thing to note here. Set-word!s work differently in the parse dialect than in normal Red. In Red's standard evaluator e: (probe copy/part s e) would set e to refer to the result of the paren evaluation, but in parse e: marks a location in the input, which you can refer to (e.g. in the paren), but the evaluation of the paren is ignored. Everything *inside* the paren is evaluated normally, not as parse dialect.
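
A minimal sketch of that difference (the word len is introduced only for illustration): the same e: sets a value in ordinary Red but marks an input position inside a parse rule, while the paren body still runs as ordinary Red.

```red
Red []

; Ordinary Red: the set-word takes the value of the paren
e: (1 + 2)
probe e                 ; 3

; Parse dialect: s: and e: mark input positions; the paren is
; evaluated as normal Red code and its result is discarded by parse
probe parse "aabb" [s: "aa" e: (len: offset? s e) "bb"]   ; true
probe s                 ; "aabb"
probe e                 ; "bb"
probe len               ; 2
```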

Friday 6th December, 2019

meijeru

10:21@bubnenkoff Allow me to quote from the [spec document](https://github.com/meijeru/red.specs-public/blob/master/specs.adoc), section 2.5
Red makes available a large number of different value types. The evaluation rules describe the interpretation of these values when they occur in a Red _program_ which is nothing more or less than a sequence of values. The users may furthermore use and interpret Red values, when considered as _data_, in ways of their own, and thus create dialects or Domain Specific Languages (DSLs). Implementing the interpretation of dialects in Red is greatly facilitated by the parse facility.
In fact, Red itself contains a number of dialects where blocks of data are interpreted in a specific way; this includes the preprocessor dialect, the parse dialect, the visual interface dialect (VID), which also uses the draw dialect and its shape sub-dialect, the various spec dialects involved in defining vectors, images, bitsets, objects, errors, ports, maps, functions and routines, furthermore the compose and construct dialects and the system dialect (Red/System).

bubnenkoff

13:26

a: "<app><div><title>Hello</title></div></app>"
alphabet: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
>> parse a [ any [s: tg e: (print copy/part e s) | collect keep alphabet ]  ] 
<app>
<div>
<title>
</title>
</div>
</app>
== [#"H" [#"e"] [#"l"] [#"l"] [#"o"]]

How to extract Hello not as separate charsets?

13:33I tried collect into, but got very strange result:

>> parse a [ any [s: tg e: (print copy/part e s) | collect into q keep alphabet ]  ] 
<app>
<div>
<title>
</title>
</div>
</app>
== true
>> q
== "olleHaaab"

GiuseppeChillemi

14:16@meijeru are you the author of the spec document?

meijeru

14:28I have written 99% of it. Tell me what you think of it.

cloutiy

14:32@bubnenkoff maybe try keep some alphabet?

GiuseppeChillemi

16:49@meijeru Its really complete and I am learning lot of concepts and terms. I am having only difficulties on few specific topics.

greggirwin

18:49@bubnenkoff I think there's an example that does exactly what you want (or very close) on https://www.red-lang.org/2013/11/041-introducing-parse.html

toomasv

18:58@bubnenkoff Three examples:

>> first parse a [collect any ["><" | #">" keep to #"<" | skip]]
== "Hello"
>> first parse a [collect any [#">" not #"<" keep to #"<" | skip]]
== "Hello"
>> form first parse load a [collect any [keep word! | skip]]
== "Hello"

dockimbel

20:18@bubnenkoff

parse a [ collect any [s: tg e: (print copy/part e s) | keep copy tmp [any alphabet]]]

parse a [ collect any [s: tg e: (print copy/part e s) | keep thru [any alphabet]]]

Saturday 7th December, 2019

meijeru

13:52@GiuseppeChillemi Feel free to ask! Perhaps better in red/help.

GiuseppeChillemi

14:19@meijeru I am reading your document again and again and if I won't find what I was searching for I will ask there.

Monday 9th December, 2019

ericguedespinto

13:11Hello all
I need help!
I am trying to parse a long list of references (loaded from a txt file as a string) in order to reformat them into something more useful, and I am getting a strange error where the parsing stops in the middle of the file. When I run parse-trace I get an error, although with parse-trace it actually happens before the point where it would happen with parse.
I thought that this could be memory related, so I tried to cut the file and go at it in chunks, but it always breaks at the same points.
Does anyone have any thoughts?

results: read %pubmed_result.txt
output: ""
search_tag: [ "search tags(\[[a-zA-Z-&]*\])^/^/" ]
digit: charset "0123456789"
index: [ some digit #":" ]
letters: charset [#"a" - #"z" #"A" - #"Z"]
linebreak: charset [ #"^/" ]
chars: union union letters charset[ #"," #" " #":" "-" #"[" #"]" #"^"" #"ş" #"ı" #"Ç" #";" #"'" #"í" #"ó" #"(" #")" #"ö" ] digit
authors: [ index some [ chars ] #"." any [ #" " ] ]
title: [ collect [ some [ keep some [ chars ] opt linebreak ] keep [ #"." | #"?" ] ] ]
ref: [ thru "^/^/^/" ]
article: [ authors opt linebreak title ref ]
file: [ search_tag some article end ]
print parse-trace results file

The actual file is about 60000 lines long, but it doesn't get past line 279... and it breaks with this line

'Sonographic diagnosis of thyroid cancer with support of AI. '

hiiamboris

13:13What Red build are you using?

ericguedespinto

13:15I am on win10, just downloaded the new build today
%gui-console-2019-12-3-34266

hiiamboris

13:16Can you upload this file with 280 lines so we can reproduce the glitch?

13:17That error message rings some bell slightly..

ericguedespinto

13:23[pubmed_result.txt](https://files.gitter.im/red/parse/8eEG/pubmed_result.txt)

hiiamboris

13:25Just tried. No Script Error with this file.

ericguedespinto

13:27does it process the whole file? I can't get past entry 65 on line 279

hiiamboris

13:29Wait. I reproduced it. Without --cli I see *** Script Error: reset-buffer does not allow vector! for its <anon> argument

13:30With --cli I get parse to finish with false and last words it prints are " Sonographic diagnosis of Sonographic diagnosis of"

13:31Definitely a memory related error. I have lots of these when I use parse and View at the same time.

13:34By the way, that was line 89 for me:

match: [some [chars] #"." any [#" "]] 
                                       input: { Mahmoudzadeh AP, Malkov S, Fan B, Greenwood H, J                                       
                                       -->
*** Script Error: reset-buffer does not allow vector! for its <anon> argument
*** Where: reset-buffer
*** Stack: parse-trace

ericguedespinto

13:34that is where it stops with parse instead of parse-trace, but I can't understand why it would stop there...
Curiously enough, if I delete the first part of the file up to that line, it still stops at the same point with "Sonographic diagnosis of", which makes me wonder if it really is memory related

13:35Yes, with parse-trace it stops in line 89 (entry 25) for me as well, but with parse, it prints half of line 279

hiiamboris

13:35I see. Interesting.

ericguedespinto

13:40and if I just delete that entry 65, it goes further than before

ne1uno

13:40bit of unicode there

hiiamboris

13:40Why I believe it's memory related is because with --cli the behavior is different.

ne1uno

13:4300000000 73 75 70 70 6F 72 74 C2 A0 6F 66 20 41 49 support of AI

ericguedespinto

13:46Yes, unicode would be a good explanation.
I just rewrote by hand that line and now it parses easily.
So is there a bit of hidden information in that txt file? Or maybe something to do with the encoding of the string?
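
The bytes C2 A0 in the dump above are the UTF-8 encoding of U+00A0, the no-break space. Here is a hedged sketch of how such a character can stop a rule built from an explicit charset; it uses a cut-down stand-in for the chars set, not the real rules, and it says nothing about the reset-buffer error itself.

```red
Red []

; "support of AI" with a no-break space (U+00A0) after "support"
line: "support^(A0)of AI"

letters: charset [#"a" - #"z" #"A" - #"Z"]
chars:   union letters charset " "

; The no-break space is not in chars, so the rule stops before it
probe parse line [some chars]        ; false

; Locate the character, then allow it explicitly
nbsp: #"^(A0)"
probe index? find line nbsp          ; 8
chars2: union chars charset reduce [nbsp]
probe parse line [some chars2]       ; true
```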

hiiamboris

13:50There's a lot of unicode there, so what? Looks like valid UTF8. If it wasn't, read would have failed.

ericguedespinto

13:58fair point and the error does refer to reset-buffer.
If I delete the entry 65, it goes on so it's not running out-of-memory, but it could be a bug with memory-management (reset-buffer)

hiiamboris

13:58Indeed. I didn't mean out-of-memory, but rather memory corruption.

ericguedespinto

13:59Thanks:

hiiamboris

13:59Also, older builds' (June 1st, 2019 and before) parse-trace is a bit different. Fails on the 1st line, returning false. parse stops with vector thing again on 289 line.

ericguedespinto

14:00I'll go and check on the issues if this is a possible bug worth reporting.

14:00Thanks a lot!

hiiamboris

14:00Yes, please raise an issue about it :)

14:01And refer to this chat there.

14:02Thanks for finding a reproducible case :)
As a temporary solution, I suggest you split the input on double line breaks, and parse each entry separately.

Tuesday 10th December, 2019

greggirwin

01:57Thanks for the analysis all. Good find.

Wednesday 11th December, 2019

bubnenkoff

08:49Could you help me? Why is false?

a: "<app><div><title>Hello</title></div></app>"
alphabet: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
tg: [any [#"<" opt "/" thru #">"]]
>> parse a [any tg | skip any alphabet]
== false

08:51And what difference between:

parse a [any tg | skip [any alphabet] ]

and

parse a [any tg | skip any alphabet]

hiiamboris

09:20[any tg | skip any alphabet] <- how do you understand this rule? what should it do, in your opinion?

endo64

09:26You can also add a: just before the skip and b: after the alphabet, then see what a & b shows you to understand where parsing stuck.

bubnenkoff

09:37> [any tg | skip any alphabet] <- how do you understand this rule? what should it do, in your opinion?

Accept any tag or skip any alphabet

endo64

09:41skip doesn't work that way. Hint 5 skip skips 5 characters.

bubnenkoff

09:48@endo64

>> parse a [any tg | a: skip [any alphabet] b: ]
== false
>> a
== "<app><div><title>Hello</title></div></app>"
>> b
*** Script Error: b has no value
*** Where: catch
*** Stack:

rebolek

10:09@bubnenkoff
[any tg | whatever] will accept any number of tags. Any is zero or more, so basically anything is fine for this rule. That means that the alternate rule is never checked.

bubnenkoff

14:42is there any pro/const of debugging with print or parse-trace?

14:52I still can't understand how to debug code:

parse-trace a [any tg skip [any alphabet] ]

Could you show me an example?

15:01> skip doesn't work that way. Hint 5 skip skips 5 characters.

Am I right understand that my rule do not work because any alphabet return true on first symbol and do not moving forward because "Repeat rule zero or more times until failure or if input does not advance"?

15:16

>> parse a [[any tg q: skip thru alphabet] ]
== false
>> q
== "Hello</title></div></app>"

>> parse a [[any tg q: skip thru any alphabet] ]
== false
>> q
== "Hello</title></div></app>"

Why it's not skipping thru alphabet?

hiiamboris

15:23@ericguedespinto I made the ticket https://github.com/red/red/issues/4184

15:27@bubnenkoff perhaps programming is just not your thing? or at least not yet ☺

dander

19:05@bubnenkoff a few comments on tg: [any [#"<" opt "/" thru #">"]]
The way any works is if the input matches the rule, it will advance the input until it stops matching, then it will move to the next rule. If you have any rule1 rule2, rule1 will always match, then parse will advance to rule2. If you have any rule1 | rule2, rule2 will never be hit because any rule1 never fails

19:05skip is sort of like . from regex
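
A small sketch of the two points above, using a simplified tg without the inner any (the sample string is made up): because any never fails, an alternative placed after it is never tried, whereas putting the alternation inside the any lets skip consume whatever is not a tag.

```red
Red []

tg: [#"<" opt #"/" thru #">"]
a:  "<app>Hello</app>"

; any tg matches "<app>", stops at "Hello" and still succeeds,
; so the alternative after | is never tried and input is left over
probe parse a [any tg | skip]        ; false

; With the alternation inside any, skip eats the non-tag characters
probe parse a [any [tg | skip]]      ; true
```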

GiuseppeChillemi

22:29@bubnenkoff
>> parse a [[any tg q: skip thru alphabet] ]

Why are you using double square brackets ?

22:37@bubnenkoff

parse-trace a [any tg skip [any alphabet] ]

Your parse instructions:

any tg  ;<---  right associative. It's argument is "tg" 
skip ;<--- no associative or  left associative: skip = skip one position; 5 skip = skip 5 positions.

skip [any alphabet] ;skip open position [any alphabet] is not a SKIP argument

22:42@dander
>> The way any works is if the input matches the rule, it will advance the input until it stops matching, then it will move to the next rule.

So, if any will match until the END position, the next rules will receive END as start position ?

hiiamboris

22:49Yes.

GiuseppeChillemi

23:02*skip open position = skips one position

Thursday 12th December, 2019

bubnenkoff

08:18

>> parse a [any tg to tg e: ]
== false
>> e
== "Hello</title></div></app>"

Why e not:

" ?

GiuseppeChillemi

08:29The first ANY has eaten all your tags. Then, TO TG has nothing to process.

bubnenkoff

08:30Why it's not stopped on Hello?

GiuseppeChillemi

08:31Pardon, I have not read properly your post

08:32First any fails

08:32Then you ask TO TG to move just before the first tag

08:33Then you set the input there

08:36ANY checks the first character of the string. It won't advance as it fails immediately. Remember, it advances if it finds something matching its argument and until it fails.

bubnenkoff

08:37> First any fails

Why?!

GiuseppeChillemi

08:39Pardon, I have not used the proper words.

08:42First ANY does not find a TG, so it stops searching immediately. It won't fail, it returns true.

bubnenkoff

08:45%) "First ANY does not find a TG"

>> probe a
"<app><div><title>Hello</title></div></app>"

So it should find TG:

>> tg
== [any [#"<" opt "/" thru #">"]]

GiuseppeChillemi

09:07Ok, found the my mistake. I have not seen the original A

09:11*my

09:16Try to remove ANY from the TG definition, and the outer square brackets.

09:20I suppose the inner ANY fails so TO fails immediately at the starting of HELLO. The series position remains just before H, so E: is set to it.

09:22(please pardon any mistake, I am on mobile)

bubnenkoff

09:22No problem))

GiuseppeChillemi

09:30Has it worked?

bubnenkoff

09:32No, but I am looking deep in tg, and trying understand what can be wrong with it

greggirwin

09:37@bubnenkoff it can help others if you include all the code with a question. Otherwise we have to sift messages and don't know if earlier messages contain your exact code in new versions.

bubnenkoff

09:37@greggirwin ok!

11:42How to get work skip not with chars but with words? I want to skip first tag.

>> a: "<app><div><title></title></div></app>"
== "<app><div><title></title></div></app>"
>> probe tg
[any [#"<" opt "/" thru #">"]]
== [any [#"<" opt "/" thru #">"]]

How to write it? parse a [ skip tg ] is skipping just the first letter

GiuseppeChillemi

12:00@bubnenkoff I am still newbie at parse. Other will help but I will try to resolve your questions too.

12:01Note, you can use parse-trace instead of parse to debug your rules. It is very helpful.

bubnenkoff

12:10The docs says that skip have args:

ARGUMENTS:
     series       [series! port!] 
     offset       [integer! pair!]

But I can't understand how to pass them. Or it's impossible?

meijeru

12:15There is a built-in function skip which you can use in "normal" Red programming, and there is a skip word in the _parse dialect_ . These are not to be confused. Concerning the use of Red values (including words) in dialects, may I refer to the spec document, [section 2.5](https://github.com/meijeru/red.specs-public/blob/master/specs.adoc#25-dialects).

bubnenkoff

12:28I should know the dictionary of every dialect to prevent confusing?

meijeru

12:40Only the ones you use... But yes, Red is rich and there is much to learn if you want to use it to advantage.

bubnenkoff

12:54So if skip (correct me if I am wrong) in parse dialect works not as I expected I should create new own word with proper behavior?

meijeru

13:03You should first learn how skip works in parse dialect. And what is, according to you, proper behaviour?

bubnenkoff

13:09By the docs it can skip only series. Do I understand docs correctly?

GiuseppeChillemi

13:47String series and block series. I don't know if you can parse other series too, just try.

meijeru

14:57"skipping series" applies to "normal" Red programming. skip in parse applies to elements of the string being parsed. Please stop confusing the two.

bubnenkoff

15:10@meijeru where can I read it? Or you use your experience when answering?

meijeru

15:22The best description of Parse is [here](http://www.red-lang.org/2013/11/041-introducing-parse.html)

GiuseppeChillemi

17:32@bubnenkoff Dmitry, consider PARSE as having its own language. It has similarities with Red, but words work differently. Skip in parse is n SKIP, where n is an integer and optional.
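
A short side-by-side sketch of the two meanings of skip (the sample string and the word mark are made up for illustration): the function takes a series and an offset, while the parse keyword takes an optional count in front of it.

```red
Red []

; skip as a Red function: series first, then the offset
probe skip "abcdef" 2                          ; "cdef"

; skip as a parse keyword: optional count first, default is one element
probe parse "abcdef" [2 skip mark: to end]     ; true
probe mark                                     ; "cdef"

parse "abcdef" [skip mark: to end]
probe mark                                     ; "bcdef"
```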

dander

17:58@bubnenkoff

>> a: "<app><div><title></title></div></app>"
== "<app><div><title></title></div></app>"
>> tg: [#"<" thru #">"]
== [#"<" thru #">"]
>> parse a [tg mark:]
== false
>> mark
== "<div><title></title></div></app>"

GiuseppeChillemi

18:22I have read that IF *evaluate the Red expression*, could the success of a rule be evaluated with or without moving the input ?

18:24I mean something like IF [RULE] then [RULE] or THEN (code).

dander

18:25@GiuseppeChillemi if you want to check for a rule without advancing the input, you can use ahead rule

GiuseppeChillemi

18:38@dander could you please show me an example of an alternate rule evaluation path taken after an ahead check, and of setting a word to true after a successful ahead match?

dander

18:46@GiuseppeChillemi, Advent of code has had some good parse challenges. [This one](https://github.com/dander/advent-of-code/blob/master/2016/day-7.red) had some ahead usage, though I'm not sure if it's the sort of example you are looking for

hiiamboris

19:28> I mean something like IF [RULE] then [RULE] or THEN (code).

opt [rule (code)] or opt [ahead rule (code)] depending on your needs
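
A minimal sketch of the second form, with a made-up digit rule and the hypothetical words found? and mark: ahead checks the rule without moving the input, the paren records the result, and opt keeps the overall rule from failing when the check does not match.

```red
Red []

digit: charset "0123456789"

; Input starts with a digit: the flag is set, the input has not moved
found?: no
probe parse "42abc" [opt [ahead digit (found?: yes)] mark: to end]   ; true
probe found?    ; true
probe mark      ; "42abc"

; Input does not start with a digit: opt lets parsing continue
found?: no
probe parse "abc" [opt [ahead digit (found?: yes)] to end]           ; true
probe found?    ; false
```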

GiuseppeChillemi

19:36@hiiamboris I am experimenting with this.

19:37A simple question:

is there any difference between:

parse "a b c" [e: any ["a b c"]] and parse "a b c" [any e: ["a b c"]] ?

19:39I understand that the index is set before any in the first example but if you put the set word between ANY and the rule, is e set multiple times ? Is the set just one ?

hiiamboris

19:48> parse "a b c" [any e: ["a b c"]]

Same as parse "a b c" [[any e:] ["a b c"]]

GiuseppeChillemi

20:33In my example the first E is set before...

Monday 16th December, 2019

bubnenkoff

09:26Could anybody show an example of: ":word : resume input at the position referenced by the word"

toomasv

10:21@bubnenkoff Silly example:

>> parse str: "This is get-word example" [some [" " s: | "get-word" :s change "g" "s" | skip]]
== true
>> str
== "This is set-word example"

bubnenkoff

13:57I have got function:

>> f: t: func [w][print ["--> " w " <--"]]

I want to call it when I will be on "bb" in next code:

>> parse "aabbcc" ["aa" s: "bb" (f s) "cc"]

But I am getting:
--> bbcc <--
But I need
--> bb <--

pekr

14:04Because your s: is just a marker of the position in the whole string. Try to use something like parse "aabbcc" ["aa" s: "bb" e: (f copy/part s e) "cc"]

14:06Simply using two markers, at the start position, and the end of the position and copying what's between them ....

bubnenkoff

14:06Thanks!

pekr

14:06Or you can modify your func to accept two params and do a copy/part in terms of your function, to keep a parse call a bit cleaner.

bubnenkoff

14:07f: func [a b][print ["--> " a b " <--"]] ?

14:10Am I right understand that parse *always* to process parsing by single symbol?

pekr

14:10copy/part a b, or it will print two string "bbcc" and "cc" imo

14:12Yes, you are mostly correct. You can use skip for that as an alternate rule. But parse also can move "faster", using the to or thru keywords. In R3 and Red (in contrast to R2), something like to [a | b | c] should work too. But be careful about what you get. Do some printing like you already do, that keeps you learning ....

bubnenkoff

14:15@pekr Do you mean by a and b start and end (like s, e)?
I can't understand the last idea about "you can modify your func to accept two params and do a copy/part in terms of your function, to keep a parse call a bit cleaner."

14:19

>> parse "aabbcc" ["aa" s: "bb" e: (f copy/part s e) "cc"]
*** Script Error: f is missing its b argument

Oldes

14:34

f: func [a b][print ["--> " copy/part a b " <--"]]
parse "aabbcc" ["aa" s: "bb" e: (f s e) "cc"]

pekr

14:34exactly like that ....

Oldes

14:52@bubnenkoff but in real life, you would probably use this instead:

>> parse "aabbcc" ["aa" copy b "bb" (probe b) "cc"]
"bb"
== true
>> b
== "bb"

15:13@bubnenkoff for example this raw-example of data mining:

page: read https://medium.com/topic/visual-design
parse page [
	some [
		thru {<h3 class="ap} thru {>} [
			{<a href="} copy url to {"}
			thru {>} copy text to {</a}
			( print [text lf url lf] )
			| 1 skip
		]
	]
]

bubnenkoff

15:13Why instead of quotes used {}?

ne1uno

15:19so you won't have to escape the double quote

bubnenkoff

15:19When we are writing:

parse "aabbcc" ["aa" s: "bb" e: (print copy/part e s)]

How Red understand that e is data and s is digit?
here all explicit:

>> copy/part "aabbcc" 4 ; first element is data, second is index
== "aabb"

Oldes

15:20It is not a digit... it is series at given index.

15:21But I thought this was explained here many times already.

15:21

>> s: "abc"
== "abc"
>> index? s
== 1
>> e: next s
== "bc"
>> index? e
== 2
>> copy/part s e
== "a"

bubnenkoff

15:23So under the hood:

copy/part "aabbcc" 4

is doing same as you shown in your example? It's move index of series?

Oldes

15:24

>> copy/part "aabbcc" 4
== "aabb"
>> copy/part s: "aabbcc" skip s 4
== "aabb"

bubnenkoff

15:27So what I wrote above it's correct? That it's just shortcut?

Oldes

15:28Yes

bubnenkoff

15:28Thanks!

pekr

15:31I suggest finding the Rebol/Core 2.3 manual, chapter 6 - Series - http://www.rebol.com/docs/core23/rebolcore.html

15:32One really needs to understand Redbol series principles first ....

greggirwin

20:53@bubnenkoff, as @pekr says, and I suggested some time ago, it will probably help for you to learn Red fundamentals outside the scope of parse. You're doing well, but learning how series work is key, because parse operates on them (though in a special way).

Tuesday 17th December, 2019

bubnenkoff

06:32@greggirwin thanks! I understand, but some syntax features confuse me. For example, I have not seen any mention in the docs of the shortcut mentioned above. I will reread the docs, I hope that will help. But it's pretty hard to think differently and to find good and simple examples

07:13Why head is print all block content, but tail is not?

>> d: [aa bb cc]
== [aa bb cc]
>> tail d
== []
>> head d
== [aa bb cc]

But the doc head is pointer to first element, so I expect that it will print not all block, but only first element

greggirwin

07:17Don't think of the two copy/part syntaxes as shortcuts, either way. It's two *ways* to do it, either of which may be more convenient in a given circumstance. That is, sometime you know how *many* items you want to copy, and sometimes you know a position in a series you want to copy *to*.
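A small sketch of the two forms side by side, reusing the test string from later in the discussion; the words `by-count` and `by-position` are made up for illustration:

```red
Red []

s: "aabbccddee"

; Form 1: the /part argument is a count of items to copy.
by-count: copy/part s 4

; Form 2: the /part argument is a position in the same series to copy up to.
by-position: copy/part s find s "cc"

probe by-count      ; "aabb"
probe by-position   ; "aabb"
```

Both calls copy the same four characters; which form is more convenient depends on whether you already hold a length or a position.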

07:18Have you used ? head and ? tail?

07:19To continue general series questions, we can move to the red/help room.

bubnenkoff

07:34ok, one moment I need to play with examples

greggirwin

07:35I posted in red/help

bubnenkoff

14:59

s1: "aabbccddee"
s2: next next s1

>> copy/part s2 s1
== "aa"
>> copy/part s1 s2
== "aa"

Why the same result?

pekr

15:14Checked the R2, the results are the same. Not sure, how the function does it internally ....

ne1uno

15:19use latest? try s1: "abcdef"

pekr

15:19We would have to study series.reds file, copy function source. Can't read the code properly to find out, why it is so like that ...

bubnenkoff

15:20> use latest? try s1: "abcdef"

please explain what I should to try? Just single string?

ne1uno

15:21you're not using a good test string to see what is going on
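A sketch of the suggested experiment, assuming the same setup as above but with distinct characters so that it is visible which span gets copied. The words `forward`, `backward` and `counted` are made up for illustration:

```red
Red []

s1: "abcdef"
s2: next next s1              ; same string, positioned at "c"

forward:  copy/part s1 s2     ; from s1 up to s2
backward: copy/part s2 s1     ; from s2 back to s1: the same span
counted:  copy/part s2 2      ; two items starting at s2

probe forward    ; "ab"
probe backward   ; "ab"
probe counted    ; "cd"
```

With a position as the /part argument the copy covers whatever lies between the two positions, regardless of which one comes first, which is consistent with the identical results reported above.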

greggirwin

21:56@bubnenkoff I responded in red/help

Wednesday 18th December, 2019

toomasv

04:17Fantasy wish: list keyword that enable shortcut for writing e.g. element any [separator element] as list element separator.

giesse

05:58

>> do %topaz-parse.red
>> list: rule [element separator] [element any [separator element]]
>> topaz-parse "a,a,a,a" [list #"a" #","] 
== #"a"
>> topaz-parse "a,a,a,a" [copy list #"a" #","] 
== "a,a,a,a"

toomasv

06:44:+1: No fantasy any more :smile:

06:46Xmas wish granted! Thanks, Santa!

bubnenkoff

12:15Why next code always stops on second value:

alphabet: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
tg: [{<}  opt {/} thru {>} opt lf] 
parse-trace a [any tg some alphabet any tg]

data for parsing:

a: {<root>
<lots>
<lot>
<name>Foo1</name>
<price>100</price>
</lot>
<lot>
<name>Bar1</name>
<price>202</price>
</lot>
</lots>
</root>}

it will always stops on 100. If remove it, it will stop on Bar1

pekr

12:45I don't use opt parse directive myself and can't follow your code properly (have just few minutes to react), but if you use any/some, you should enclose the rest in the block, to make it iterative - something like :

parse a [any [s: tg e: (print ["tag:" copy/part s e]) | s: some alphabet e: (print ["txt: " copy/part s e])| skip]]

12:45not sure why it prints empty line, it comes from the tag rule ....

12:47

>> tg: [{<}  opt {/} thru {>}] 
== ["<" opt "/" thru ">"]
>> parse a [any [s: tg e: (print ["tag:" copy/part s e]) | s: some alphabet e: (print ["txt: " copy/part s e])| skip]]
tag: <root>
tag: <lots>
tag: <lot>
tag: <name>
txt:  Foo1
tag: </name>
tag: <price>
txt:  100
tag: </price>
tag: </lot>
tag: <lot>
tag: <name>
txt:  Bar1
tag: </name>
tag: <price>
txt:  202
tag: </price>
tag: </lot>
tag: </lots>
tag: </root>
== true

12:49Removing the "debugging" print info: parse a [any [tg | some alphabet| skip]]

bubnenkoff

12:56"but if you use any/some, you should enclose the rest in the block"
@pekr so it's rule do not work?

parse a [any tg some alphabet any tg ]

pekr

13:05Your rule does work in a specific way. It is basically 3 rules, followed one after each other. So first it tries to identify any tag. It works so far, as there is any tag. What if your string would start with text? It would fail right there and not print anything ....

13:06any [a | b | c] simply means: repeat zero or more times, a or b or c
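A minimal sketch of the difference between the two rule shapes, using a shortened, single-line version of the data (an assumption, to keep the example small) and the tg rule without the trailing lf:

```red
Red []

alphabet: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
tg: ["<" opt "/" thru ">"]
data: "<name>Foo1</name><price>100</price>"

; Three rules in sequence: tags, then ONE run of text, then tags.
; The last `any tg` stops in front of "100", so the input is not consumed.
in-sequence: parse data [any tg some alphabet any tg]

; One repeated choice: on each pass take a tag or a run of text.
as-choice: parse data [any [tg | some alphabet]]

probe in-sequence   ; false
probe as-choice     ; true
```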

bubnenkoff

13:09@pekr big thanks!

pekr

13:36Just keep your work, parse is great. Once you grasp it, it is going to be a very useful tool for you!

greggirwin

21:09Thanks for assisting @pekr !

pekr

22:08It was just a luck, I got it kind of working at first try :-) But also a refreshing reminder how good tool parse can be ....

Thursday 19th December, 2019

bubnenkoff

10:35My tg is described as:

["<" opt "/" thru ">" opt lf]

I can't understand why next rule do not work with data:

a: {<root>
<lots> 
</lots>
</root>}

I am still debugging with parse-trace because I do not see how print can help me here.

parse-trace a [any [tg | some alphabet | tg ] ]

If I add skip at end of rule all works. But I do not understand whats it's does

10:39As I understand it should moving line by line because every line is tg. And this part of rule should not works: some alphabet | tg

hiiamboris

10:42Look closer here ☻:

>> a: {<root>
{    <lots> 
{    </lots>
{    </root>}
== {<root>^/<lots> ^/</lots>^/</root>}

bubnenkoff

10:43The space?!

hiiamboris

10:43Yep :)

bubnenkoff

10:43omg! Thanks!!

10:45@hiiamboris is it good to write tg as:

tg: [{<}  opt {/} thru {>} opt lf opt space]

hiiamboris

10:58you might have a space before newline, or a few spaces ;)

10:59or a number of newlines ;)

pekr

11:24Why are you adding lf there at all? I have showed you a working example, removing exactly that element. If you want to skip anything else, just use - skip :-)

11:25parse a [any [tg | some alphabet| skip]]

bubnenkoff

11:36Yes, now I understanding why you have add skip, but till I am learning I want to declare all explicitly

pekr

11:39well, if you want to cover all possible cases, you should add more than one possibiliy of space or a newline

11:40something like any [lf | space], or creating a bitset called spacers, containing various options

GiuseppeChillemi

11:48@bubnenkoff If I remember correctly when working in block mode, space is not considered. So when you will use parse in that way you should remember this.

pekr

11:55He uses a string mode though ....

GiuseppeChillemi

14:19@pekr I am giving him the information that when he will try block mode there will be this difference. I have learned that if you are not informed of rules changes you will try to use the old one in similar contexts.

bubnenkoff

14:45@GiuseppeChillemi block mode is wrapping brackets like: a: { ... }?

pekr

14:51No, what you mention is a string mode. "This is a string" is equivalent to {This is a string}, those curly braces allow you to use multiline strings, eg. in console ....

14:51[This is a block mode]

14:52Parsing blocks and dialects: http://www.rebol.com/docs/core23/rebolcore-15.html#section-9

greggirwin

17:48@bubnenkoff

>> a: {<root>^/<lots> ^/</lots>^/</root>}
== {<root>^/<lots> ^/</lots>^/</root>}
>> b: load a
== [<root> 
    <lots> 
    </lots> 
    </root>
]
>> parse b [some tag!]
== true

17:48The catch is that your data needs to be valid Red. Don't expect to load any random input and have it work.

dander

17:59@bubnenkoff I would suggest removing the whitespace from your tg definition, unless it really is part of the tag. Then in the rule which contains tgs, you can define where your whitespace can appear. For example, instead of using | skip (which is like "something else can appear here, but I don't know what"), you could have | some whitespace

18:02whitespace: [crlf | lf | cr (print "how old is this text file?!") | tab | space]

18:04It could have more or less than that... you would know best what the constraints on your input would be

pekr

18:53It could be also done as bitset, instead of rule.

dander

18:55@pekr true, unless you specifically need to match crlf instead of either cr or lf

pekr

18:56Yes, you are of course right :-)
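A sketch of the bitset variant, assuming the four whitespace characters below are the only ones that occur in the input. The word `ws` and the result word `ok?` are made up for illustration:

```red
Red []

ws: charset reduce [space tab lf cr]     ; bitset matching one whitespace character
tg: ["<" opt "/" thru ">"]               ; the tag only, no trailing whitespace
a: {<root>^/<lots> ^/</lots>^/</root>}   ; note the space after <lots>

; Whitespace is allowed between tags, in any amount and any mix.
ok?: parse a [any [tg | some ws]]

probe ok?   ; true
```

Since `some ws` accepts cr and lf individually, it also accepts a crlf pair, but it cannot insist on crlf as a unit; that case still needs a rule like the one shown above.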

cloutiy

21:42Hi all, I've been trying for the last few hours to do something which in principle is simple enough, but for some reason it is not giving the expected results. I would love some input on what is wrong with my rule.

The objective: a) insert <p> before a #"^/" followed by a string!; b) insert a </p> after a string! followed by a #"^/".

For some reason only b gets done.

doctree: [
    [<h3> ["Preparation for Code Generation"] </h3>]
    #"^/"
    #"^/"
    {Before code can be generated, it is generally necessary tomanipulate and change the internal fprogram in some way. Runtimestorage must be allocated to variables. In FORTRAN, COMMON andEQUIVALENCE statements must be prccessed. One important pointincluded here is the optimization of the program in order toreduce the execution time of the object program.}
    #"^/"
    #"^/"
    [<h3> ["Code Generation"] </h3>]
    #"^/"
    #"^/"
    {This is the actual translation of the internal source progranminto assembly language or machine language. This is perhaps themessiest and most detailed part, but the easiest to understand.Assuming we have an internal form of quadruples as outlinedabove, we generate code for each quadruple in order. For thethree quadruples listed above we could generate, on the IBM 360,the assembly language}
    #"^/"
    #"^/"
]

parse doctree [ 
   any [ 
         to #"^/" string! insert <p>
        | thru string! #"^/" insert </p> insert <p>
        | skip ] 
  ]

Oldes

21:47I'm just on mobile, but when you use to, it does not skip through the match, so the first line always fails. @cloutiy

hiiamboris

21:51Also unify your alternate to/thru rules, otherwise you skip them with to/thru
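A minimal sketch of the to/thru difference on a made-up three-character input; the result words are hypothetical names:

```red
Red []

; `to` stops IN FRONT of the match, so the next rule still sees "b".
r-to: parse "abc" [to "b" "c"]

; `thru` stops AFTER the match, so the next rule sees "c".
r-thru: parse "abc" [thru "b" "c"]

; After `to`, the matched value has to be consumed explicitly.
r-both: parse "abc" [to "b" "b" "c"]

probe r-to     ; false
probe r-thru   ; true
probe r-both   ; true
```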

cloutiy

21:58@hiiamboris @Oldes I just tried something after posting and got it to work. I needed to do to [#"^/" string!].

22:00This is my updated rules that gets me to where I wanted to be:

parse doctree [ any [
         to [#"^/" string!] insert <p> to [string! #"^/"] thru [string! #"^/"] insert </p>
        | skip ]  ]

remove-each item doctree [ item = #"^/" ]

hiiamboris

22:02:+1:

cloutiy

22:06Not gonna lie. Feels kind of great to figure things out ;)

greggirwin

22:08I know that feeling well...some days. ;^)

Friday 20th December, 2019

toomasv

04:18@cloutiy You can leave out to [string! #"^/"].

Monday 23rd December, 2019

bubnenkoff

11:41Why next code add line break?

parse a [any [s: tg e: (checktag copy/part e s ) | some alphabet | tg ] ]

checktag: func [tag] [
    if find tag "<lots>" [print [tag " - " length? tag]]
]

result:

>> parse a [any [s: tg e: (checktag copy/part e s ) | some alphabet | tg ] ]
<lots>
 -  7
== true

I expect: - 7 on one line

11:54it's seems that there is line break. I tried to add trim, but did not get result:
parse a [any [s: tg e: (checktag trim copy/part e s ) | some alphabet | tg ] ]

hiiamboris

11:56Try ? trim ;)

bubnenkoff

12:02thanks! Is there any way to print non-printable symbols like lf on the console, to simplify debugging?

pekr

12:03print mold

bubnenkoff

12:04Yeah!

toomasv

12:24Or just probe

bubnenkoff

13:00How I can continue parsing inside function?

parse a [thru "<lots>" s: to  "</lots>" e: (checktag copy/part s e)]

checktag: func [tag] [
    tag: trim/lines tag
    any [tg (print "hello")]
]

I just want to move some parsing to function. But current code do not print nothing. It's not possible or I have some error?
data:

a: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lots>
<lot>
<name>Foo1</name>
<price>101</price>
</lot>
<lot>
<name>Bar2</name>
<price>201</price>
</lot>    
<lot>
<name>Baz3</name>
<price>302</price>
</lot>
</lots>
</root>}

13:44Am I right understand that I should not/can not delegate parse rules to separate functions?

toomasv

14:06@bubnenkoff The body of the checktag function is regular Red code, not the parse dialect. checktag returns the result of the any (native!) function, not of the parse dialect's any keyword.

pekr

14:06Parse rules work only inside of the parse.

toomasv

14:08But you can split your parse rule into several subrules and refer to these subrules in your rule.

14:39Something like this:

checktag: [some [not </lots> [remove [#" " | #"^/"] | skip]]] 
parse a [thru <lots> checktag </lots>] 
print a
<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lots><lot><name>Foo1</name><price>101</price></lot><lot><name>Bar2</name><price>201</price></lot><lot><name>Baz3</name><price>302</price></lot></lots>
</root>

Tuesday 24th December, 2019

bubnenkoff

09:42Big thanks! Is next rule are good written? Or it can be simplified?

>> parse "<bb><aa><bb><aa><aa>" [ any [to "<aa>" s: thru "<aa>" e: (print copy/part e s) ] ]
<aa>
<aa>
<aa>

I am trying to print all tags.

Btw @toomasv, do I understand correctly that I can specify some letters in rules without quote marks? Like you wrote </lots> but not "</lots>"

toomasv

09:44@bubnenkoff

> am I right understand that I can specify some letters in rules without quote mark? Like you wrote </lots> but not "</lots>"

This concerns complete tags only.

>> print parse "<bb><aa><bb><aa><aa>" [collect any [keep <aa> | skip] ]
<aa> <aa> <aa>
>> parse "<bb><aa><bb><aa><aa>" [any [copy _ <aa> (print _) | skip] ]
<aa>
<aa>
<aa>
== true

bubnenkoff

10:45Thanks! What is next code doing? (from Rebol doc)

NAMES: copy []

Why name is initialize with copy?

toomasv

10:47To be used later without being affected by other [] probably. Have you [read](https://github.com/red/red/wiki/%5BDOC%5D-Why-you-have-to-copy-series-values) about importance of copy?

11:20Consider this:

list: ["Abel" "Cain" "Seth"] 
add-names: func [/local names][names: [] append names list]
add-names
;== ["Abel" "Cain" "Seth"]
add-names
;== ["Abel" "Cain" "Seth" "Abel" "Cain" "Seth"]
add-names
;== ["Abel" "Cain" "Seth" "Abel" "Cain" "Seth" "Abel" "Cain" "Seth"]
;-----------
add-names: func [/local names][names: copy [] append names list]
add-names
;== ["Abel" "Cain" "Seth"]
add-names
;== ["Abel" "Cain" "Seth"]
add-names
;== ["Abel" "Cain" "Seth"]

bubnenkoff

12:39I want to generate json.

parse "<bb><aa>123</aa><bb><aa>642</aa>" [ any [ thru "<aa>" copy x to "</aa>" (append vals x)  ] ]

The next code is doing similar thing:

>> to-json vals
== {["123","642"]}

But how I can create key-value pairs? For example I want to get:

{
	"vals": [123, 642]
}

pekr

12:45You could as well use Rebolek's json codec ....

12:47

>> data: [name [123 642]]
== [name [123 642]]
>> data/name
== [123 642]
>> data/name/1
== 123
>> select data 'name
== [123 642]
>> pick select data 'name 1
== 123

bubnenkoff

12:51Do I need to use an external codec? Is it impossible to do with the default one?

toomasv

12:57You can do it the hard way :)

print json: rejoin collect [
    vals: [] 
    parse "<bb><aa>123</aa><bb><aa>642</aa>" [ 
        collect into vals any [ thru "<aa>" keep to "</aa>" ]
    ] 
    keep {^{^/    "vals": [} 
    forall vals [
        if not head? vals [
            insert vals comma 
            vals: next vals
        ]
    ] 
    keep rejoin vals 
    keep "]^/}"
]
{
    "vals": [123,642]
}

bubnenkoff

12:59Yes, it's a little bit hard for me now. Could you link codec?

toomasv

13:03? to-json

bubnenkoff

13:04I used help, but did not understand how to do what I want

toomasv

13:07https://github.com/red/red/blob/master/environment/codecs/json.red

bubnenkoff

13:10"Rebolek's json codec"
it's name of default codec?

toomasv

13:12

>> print to-json object [vals: [123 345]]
{"vals":[123,345]}

>> print to-json/pretty object [vals: [123 345]] "    "
{
    "vals": [
        123,
        345
    ]
}

13:14May be he meant this: https://github.com/rebolek/red-tools/blob/master/codecs/json.red

13:17With the last (rebolek's):

>> print json/encode object [vals: [123 456]]
{"vals":[123,456]}

13:25So you could do something like this with default codec:

xml: "<bb><aa>123</aa><bb><aa>642</aa>" 
print to-json object [vals: parse xml [collect [any [<aa> keep to </aa> | skip]]]]
{"vals":["123","642"]}

13:38Sorry, copy _ to keep (load _)

bubnenkoff

14:34@toomasv what load is doing here?

toomasv

14:34loading value from string

14:35With load it results as {"vals":[123,642]} instead of {"vals":["123","642"]}

14:36to-integer might be used too of course
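A sketch of the corrected rule next to the original one, so the effect of load is visible. The words `txt`, `as-strings` and `as-integers` are made up for illustration, and the tags are written as quoted strings:

```red
Red []

xml: "<bb><aa>123</aa><bb><aa>642</aa>"

; `keep to ...` keeps the matched text, so the values stay strings.
as-strings: parse xml [collect [any ["<aa>" keep to "</aa>" | skip]]]

; `copy` the text first, then keep the value that `load` produces from it.
as-integers: parse xml [collect [any ["<aa>" copy txt to "</aa>" keep (load txt) | skip]]]

probe as-strings    ; ["123" "642"]
probe as-integers   ; [123 642]
```

load turns the text into whatever Red value it spells, so non-numeric content would come back as some other datatype; to-integer is the stricter choice when only integers are expected.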

bubnenkoff

15:07Why on the next code work only one rule, but not both?

prices: []
names: []
parse a [thru "<lots>" 
    collect [
         any [
           ws |
           some [ 
               collect set prices any [ thru "<price>" keep to "</price>" | skip ] 
               collect set names any [ thru "<name>" keep to "</name>" | skip ]
                ] 
        ]
    ]
 "</lots>" ]

result:

>> to-json object [_prices: :prices]
== {{"_prices":["101","201","302"]}}
>> to-json object [_names: :names]
== {{"_names":[]}}

data:

a: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lots>
<lot>
<name>Foo1</name>
<price>101</price>
</lot>
<lot>
<name>Bar2</name>
<price>201</price>
</lot>    
<lot>
<name>Baz3</name>
<price>302</price>
</lot>
</lots>
</root>}

some and nested rules should process both of sub-rule

toomasv

15:19Think hard, what does the rule any [ thru "<price>" keep to "</price>" | skip ] do?
...
...
Correct, it finishes through input before next one even starts.
Here is how they work:

parse a [
	thru <lots> 
	some [ 
		</lots> to end
	|	collect into prices [ "<price>" keep to "</price>" ] 
	|	collect into names [ "<name>" keep to "</name>" ]
	|	skip
	] 
]

to-json object [_prices: :prices]
;== {{"_prices":["302","201","101"]}}
to-json object [_names: :names]
;== {{"_names":["Baz3","Bar2","Foo1"]}}

As you were warned already, thru-to pair is tricky and treacherous, especially combined with any or some quantifier.

Nomarian

23:12Is there a way to inherit an object?

23:12I have things like digit: charset ...

23:12constantly remaking such things is rather silly

greggirwin

23:38

>> o: object [digit: charset [#"0" - #"9"]]
== make object! [
    digit: make bitset! #{000000000000FFC0}
]
>> oo: make o [alpha: charset "abc..."]
== make object! [
    digit: make bitset! #{000000000000FFC0}
    alpha: make bitset! #{00000000000200000000000070}
]

Wednesday 25th December, 2019

Nomarian

00:15thanks gregg!

bubnenkoff

09:52@toomasv thanks! Do I understand correctly that if I change collect into to collect set, set will create a new variable that will be reset on every iteration (because it is inside some)?

09:55I can't understand why with set prices is empty

10:40I did simple example for question above:

>> parse "aabbccddaa" [some [collect into b keep "aa" | skip ] ]
== true
>> b
== ["aa" "aa"]
>> 
>> parse "aabbccddaa" [some [collect set b keep "aa" | skip ] ]
== true
>> b
== ["aa"]

pekr

11:14You are right - on each pass of the some loop, your b is being re-set to the current value. I am not used to using set, I prefer the greater freedom of the paren (code).

bubnenkoff

11:48Why does adding thru break collecting both prices and names (without thru both of them are collected)?

id: []
prices: []
names: []
parse a [
    ; collect into id [thru "<id>"  keep to  "</id>"]
    thru "<lots>"
    any [
        collect into prices [thru "<price>"  keep to  "</price>"] |
        collect into names [ thru "<name>" keep to "</name>" ] | 
        skip
    ]
 "</lots>" ]

to-json object [_prices: :prices]
to-json object [_names: :names]

>> to-json object [_prices: :prices]
== {{"_prices":["302","201","101"]}}
>> to-json object [_names: :names]
== {{"_names":[]}}

I do not see any reason why it work different

12:49Now I am facing a new problem. It seems that collect does not work with objects:

data: object [
    id: [] 
    prices: [] 
    names: []
    ]
parse a [
    collect into id [thru "<id>"  keep to  "</id>"]
    thru "<lots>"
    any [
        collect into data/prices [thru "<price>"  keep to  "</price>"] |
        collect into data/names [ thru "<name>" keep to "</name>" ] | 
        skip
    ]
 "</lots>" ]

to-json object [data]

I am getting error:
Script Error: PARSE - unexpected end of rule after: collect

13:07Can't edit post: thru above should be removed, because they are part of different question

toomasv

13:08> Why is adding thru is breaking collecting

Because with thru you are jumping to the next <price> on each iteration, and the second rule gets its chance only when there are no more <price> tags (alas, by then there are no more <name> tags either), but without thru the rules are advancing orderly by little steps.
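A sketch of this effect on a shortened, single-line input (an assumption, to keep it small). It uses copy plus append instead of collect so that the order of the results plays no role; all word names are made up for illustration:

```red
Red []

data: "<name>Foo1</name><price>101</price><name>Bar2</name><price>201</price>"

; With thru: the first alternative jumps straight to the next <price>,
; leaping over every <name> on the way.
prices: copy []
names: copy []
parse data [any [
      thru "<price>" copy p to "</price>" (append prices p)
    | thru "<name>"  copy n to "</name>"  (append names n)
    | skip
]]
probe prices   ; ["101" "201"]
probe names    ; []

; Without thru: each alternative only matches at the current position,
; and skip moves forward one character at a time.
prices2: copy []
names2: copy []
parse data [any [
      "<price>" copy p to "</price>" (append prices2 p)
    | "<name>"  copy n to "</name>"  (append names2 n)
    | skip
]]
probe prices2  ; ["101" "201"]
probe names2   ; ["Foo1" "Bar2"]
```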

bubnenkoff

13:09so with thru keep to do not working? Mmm.... yes I see it's logic

pekr

13:28You imo need general thru rule with options. But then I am not sure you can collect to two different blocks?

bubnenkoff

13:39I did not understand your idea about general rule for thru... If I understand Toomas correctly keep to working, but thru is force to jumping to next price block

13:46> Now I am facing a new problem. It seems that collect does not work with objects:
>

> data: object [
>     id: [] 
>     prices: [] 
>     names: []
>     ]
> parse a [
>     collect into id [thru "<id>"  keep to  "</id>"]
>     thru "<lots>"
>     any [
>         collect into data/prices [thru "<price>"  keep to  "</price>"] |
>         collect into data/names [ thru "<name>" keep to "</name>" ] | 
>         skip
>     ]
>  "</lots>" ]
> 
> to-json object [data]
>

>
> I am getting error:
> Script Error: PARSE - unexpected end of rule after: collect

I found one way to do what I need. But if anybody will show how to work with collect into object I will be thankful.
working code:

[ "<price>"  copy price to  "</price>"] (append data/prices price ) |

pekr

13:51I have never used collect myself, am mentally stuck with the R2 parse and doing stuff in parens, like you just did :-), but here's the code, which seems to work:

id: [] 
prices: [] 
names: []

parse a [
    collect into id [thru "<id>"  keep to  "</id>"]
    thru "<lots>"
    any [
       thru [
           "<price>" collect into prices keep to "</price>"
         | "<name>"  collect into names  keep to "</name>"
       ] | skip
    ]
 to end
]
== true
>> id
== ["19160099"]
>> prices
== ["302" "201" "101"]
>> names
== ["Baz3" "Bar2" "Foo1"]

13:531) I made thru an alternating rule. Not two of them, just one with options. 2) the problem also seems to be your objects. It seems parse can't use something like collect into data/prices, most probably it considers it being a path. When I moved those subobjects out of the data object, it seems to work ....

toomasv

15:12As far as I know, paths are not supported by parse yet.

pekr

15:25Maybe you could use some bind trickery to get those blocks being part of the object :-)
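One possible workaround, sketched under the assumption that a plain word is acceptable in place of the path: point a word at the object's block and hand that word to collect into. The word `prices` outside the object and the one-tag input are made up for illustration:

```red
Red []

data: object [prices: copy []]

; A plain word that refers to the SAME block as data/prices (no copy is made).
prices: data/prices

parse "<price>101</price>" [
    collect into prices ["<price>" keep to "</price>"]
    to end
]

probe data/prices   ; ["101"]
```

Because both names refer to one block, whatever the rule keeps shows up in the object as well. The append in a paren shown above remains the more explicit alternative.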

bubnenkoff

15:43Heh, I got more complex problem. Now I am trying to find solution. I need to get next data structure:

{{"id":["19160099"],"lots":[ {name: "", price: "" } ] }}

Without creation nested object I would not able to access to elements by name. So I should write something like:

data: object [
    id: [] 
	lots: object [
        prices: [] 
        names: []		
	]
   ]

but to-json generates lots not as an array, but as a dict:

{{"id":[],"lots":{"prices":[] ,"names":[] }}}

I will continue in next day, but could you give any advice what I should to try? Maybe some methods for object?

toomasv

16:08@bubnenkoff How about this:

id: [] coll: [] 
parse a [
    collect into id [
        thru <id> keep to </id>
    ] 
    thru <lots> 
    collect into coll any [
          </lots> to end 
    |     <price> copy p to </price> thru <name> copy n to </name> 
          keep (object compose [price: (p) name: (n)]) 
    |     skip
    ]
]
to-json object compose/only [id: (id) lots: (coll)]
;== {{"id":["19160099"],"lots":[{"price":"101","name":"Bar2"},{"price":"201","name":"Baz3"}]}}

cloutiy

16:12Hello reducers, here with another question. But first, I just want to say how amazing the world becomes once you start to understand Red/Rebol and parse a bit more!

Now the question. I ran parse on a text file using collect to give me a block!. The output is as follows.:

[
 chapter "Title" 
p ["Some text with @i[inline @b[formatting]]"] 
p ["Another paragraph"] 
h2 "A Level 2 Heading"
]

What I'd like to do now is run this block through a second parse, using different rules. Specifically the block! following 'p is just a string!. What I want to do is find every sequence of 'p block! and parse the string! that is in the block, then replace that block with the result of parsing its contents (collected using collect.)

Is this where one would use into? Something along the lines of:

parse doctree rules: [
   'p mark: into block! (
         poke mark parse mark collect [ 
              the.rules.to.parse.the.paragraph.string ] )
]

Overlooking the details of my crude example, is that the general idea?

hiiamboris

16:15ahead block! change into [set s string!] (do something with s, return new result)

16:17or ahead block! into [set s string! (modify s buffer in place)] if you still need the block
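A sketch of the second variant on a cut-down doctree (made up for illustration), with uppercase standing in for whatever in-place processing the paragraph string really needs:

```red
Red []

doctree: [p ["Some text"] h2 "Heading" p ["Another paragraph"]]

parse doctree [
    any [
        ; After the word p, make sure a block follows, then descend into it.
        'p ahead block! into [set s string! (uppercase s)]
      | skip
    ]
]

probe doctree   ; [p ["SOME TEXT"] h2 "Heading" p ["ANOTHER PARAGRAPH"]]
```

The `ahead block!` guard matters because into would otherwise also descend into a string such as the one after h2.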

toomasv

16:29@bubnenkoff How to get all three pieces correct? Homework for you.

cloutiy

18:29@hiiamboris thanks for that. That gives me something to experiment with. I do want to keep the block, but I was going to use collect to collect the result of the parse, and then replace the original block altogether. Well, at least that's how I imagined doing it with my current level of knowledge/comfort. I'm sure I'll make a few mistakes along the way. But that's the best way to learn!

hiiamboris

18:36Indeed ;)

Thursday 26th December, 2019

bubnenkoff

07:08@toomasv Red is awesome! Every day I am starting with "Wow!". Thanks, I will try your approach and maybe some others!

11:59@toomasv after some attempts I did:

data: object [
    id: [] 
	lots: [ 
	]
   ]
   
parse a [
       thru "<lots>"
       collect any [
           "<price>" copy p to "</price>" thru "<name>" copy n to "</name>" ( append data/lots object compose [ price: (p) name: (n) ] ) | skip
       ]
        "</lots>" 
    ]

to-json data

Not sure if it's better that yours, but at last I did it myself)

toomasv

12:02:+1: BUT, the result is still {{"id":[],"lots":[{"price":"101","name":"Bar2"},{"price":"201","name":"Baz3"}]}}, which is not the correct result. Homework exercise still not done :) You should get

[{"price":"101","name":"Foo1"},{"price":"201","name":"Bar2"},{"price":"302","name":"Baz3"}]

And you don't need collect if you are not keeping anything.

12:13Also, try to run your parse several times. See what happens to data.

bubnenkoff

13:09Yes, I have seen it :( I read about copying, but still not fully understand. Need more time for my mind. Because I do not simply copy-past without understanding.
Look. I have next data structure. And I need to fill it.

data: object [
    id: [] 
    lots: [
    maxPrice: []    
    purchaseObjects: [ 
        ]
    ]
   ]

append data/lots/purchaseObjects object compose [ price: (2 + 2) name: ("Mike") ]

But I am getting:

make object! [
    id: []
    lots: [
        maxPrice: [] 
        purchaseObjects: [make object! [
            price: 4
            name: "Mike"
        ]]
    ]
]

I do not need make object! here...

toomasv

13:37@bubnenkoff

> I do not need make object! here...

Hmm.. What can I say... don't make it then.

bubnenkoff

13:38Is to-json to-block data good idea?

toomasv

13:40You mean, to get something like this?:

{["id:",[],"lots:",["maxPrice:",[],"purchaseObjects:",[{"price":4,"name":"Mike"}]]]}

bubnenkoff

13:40Yes

toomasv

13:41Don't make your data object. Everything else is same.

bubnenkoff

13:47It seems that I should use to-block in my code, because in my variant of the code I use object compose to create {"price":4,"name":"Mike"}. But now it's not a problem for me. This function is doing what I want

toomasv

13:49I meant you can use data as a block if you don't want it to be an object:

data: [
    id: [] 
    lots: [
    maxPrice: []    
    purchaseObjects: [ 
        ]
    ]
   ]

13:52Why bother making it object if you'll convert it into block at once?

13:54> Yes, I have seen it :( I read about copying, but still not fully understand. Need more time for my mind. Because I do not simply copy-past without understanding.

:+1: Good attitude! And it's definitely about copying. (And ordering your parse rules.)

bubnenkoff

13:59@toomasv object is needed to get correct json. Look:

a: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
    <purchaseObject>
    <name>Foo1</name>
    <price>111</price>
    </purchaseObject>
    <purchaseObject>
    <name>Bar2</name>
    <price>222</price>
    </purchaseObject>
    <purchaseObject>
    <name>Baz3</name>
    <price>333</price>
    </purchaseObject>        
</purchaseObjects>
</lot>
</root>}

data: [
    id: []
    lots: [
        maxPrice: []
        purchaseObjects: []
    ]
]

parse a [
    thru "<id>" copy id to "</id>" (append data/id id)
    thru "<purchaseObjects>"
    collect any [
        "<price>" copy p to "</price>" thru "<name>" copy n to "</name>" (
            append data/lots/purchaseObjects object compose [price: (p) name: (n)]
        )
        | skip
    ]
    "</purchaseObjects>"
]

write %file.txt to-json data

I am getting the wrong result:

["id:",["19160099"],"lots:",["maxPrice:",[],"purchaseObjects:",[{"price":"111","name":"Bar2"},{"price":"222","name":"Baz3"}]]]

I get "id:",["19160099"], instead of "id:": ["19160099"], etc.

toomasv

14:01Then why did [you answer "Yes"](https://gitter.im/red/parse?at=5e04b859260751301c126228) to my question? ;)

bubnenkoff

14:04Oh sorry, I did not notice the difference there

toomasv

14:05With your to-block the result would be exactly as if you had made data a block in the first place.

bubnenkoff

14:05thanks!

toomasv

14:06You are welcome!

14:14@bubnenkoff Try this:

data: object [
    id: [] 
    lots: [
        maxPrice: []    
        purchaseObjects: []
    ]
]
parse a [
    (clear data/id clear data/lots/purchaseObjects)
    thru "<id>" copy id to "</id>" (append data/id id )
    thru "<purchaseObjects>"
    collect any [
        "<name>" copy n to "</name>" thru "<price>" copy p to "</price>" (
            append data/lots/purchaseObjects object compose [ price: (p) name: (n) ] 
        ) 
    | skip
    ]
     "</purchaseObjects>" 
]
probe to-json data
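
A side note on the JSON shape: in the snippet above only data itself is an object, while lots is still a block, and the output shown earlier in this thread suggests that a block of set-words is serialised as an array of "key:" strings. A minimal sketch, assuming the same to-json used in this conversation is available and that it serialises nested objects recursively (not verified here), is to make lots an object as well. The sample values are taken from the XML above.

```red
; Sketch only: every level that should become a JSON object is an object!,
; and every level that should become a JSON array is a block!.
data: object [
    id: copy []
    lots: object [
        maxPrice: copy []
        purchaseObjects: copy []
    ]
]

append data/id "19160099"
append data/lots/purchaseObjects object [price: "111" name: "Foo1"]

; Assumes to-json is defined, as it is in the messages above
print to-json data
```

The copy [] calls are there so that each run starts from fresh, empty series instead of reusing the literal ones.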

Friday 27th December, 2019

bubnenkoff

08:02@toomasv thanks!

13:33Is there any principal difference between:

parse "<bb><aa><bb><aa><aa>" [any [copy _ <aa> (print _) | skip] ]

and

parse "<bb><aa><bb><aa><aa>" [any [to _ <aa> (print _) ] ]

or are they absolutely the same? The output is the same, I checked.

hiiamboris

13:48Try 2nd rule in a freshly started console

endo64

15:06@bubnenkoff Don't be confused: _ is nothing special, just an ordinary word, so to _ is like to my-word-without-a-value, hence it will fail.

Monday 30th December, 2019

bubnenkoff

09:49@hiiamboris why doesn't it work in a fresh console?

hiiamboris

11:06What does it say?

bubnenkoff

11:33

>> parse "<bb><aa><bb><aa><aa>" [any [to _ <aa> (print _) ] ]
*** Script Error: PARSE - invalid rule or usage of rule: _
*** Where: parse
*** Stack:

hiiamboris

11:34I see. Yes, quite cryptic message ;)

11:34What it tries to tell you is that word _ is unset (and it expects this word to hold a rule).

bubnenkoff

11:35Why does it work in a non-empty console?

toomasv

13:15Because if [you have copied it before](https://gitter.im/red/parse?at=5e060818b4ed68096ee9024d), then it is not unset! but "", whereas if you haven't (in a fresh console), it has no value.
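
To make the point above concrete: to needs a value to search for, so a word placed after it must already hold one. A minimal sketch that jumps to each match and then collects it, using the string "<aa>" instead of the tag and the hypothetical words src, hit and found in place of _ (all chosen for illustration only):

```red
src: "<bb><aa><bb><aa><aa>"
found: copy []

parse src [
    any [
        to "<aa>"           ; jump to the next occurrence, fail if there is none
        copy hit "<aa>"     ; consume it and store the matched text in hit
        (append found hit)
    ]
]

probe found    ; expected: ["<aa>" "<aa>" "<aa>"]
```

Here hit is set by copy before it is ever read, so the rule does not depend on whatever was left in the console from earlier runs.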

Archived messages from: gitter.im/red/parse from year: 2019
