insert. My goal: I have a string and a list of targets. I want to insert something before each matched target and something else after it. I've tried a few different ways. One of them used a set-word before and after the match and then (some code) to attempt the insert. The second approach I tried is as follows:
keywords: [ "chapter" | "caption" | "figure" | "image" ]
parse content [ any [
insert "<b>"
keyword
insert "</b>" ] | skip ]
`keyword` is likely unset.
`insert` (like all modifying rules) is not cancelled on backtracking.
Use `ahead keyword` before the `insert`, since `to`/`thru` can introduce their own challenges:
parse content [
to keyword
any [ ahead keyword insert "<b>" thru keyword insert "</b>" to keyword ]]
[change copy _ keyword (rejoin ["<b>" _ "</b>"]) | skip]
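A minimal sketch of the look-ahead idea from the messages above, not a definitive solution. The sample text in `content` is made up for illustration, and the rule is named `keyword` so that it matches the word used inside the parse rule. Because `ahead keyword` is checked first, nothing is inserted unless a keyword really starts at the current position.

```red
Red []

; hypothetical sample input; `keyword` is the rule word used in the snippets above
keyword: ["chapter" | "caption" | "figure" | "image"]
content: copy "see figure 1 in chapter 2"

parse content [
    any [
        ; look ahead first, so `insert` only runs when a keyword starts here
        ahead keyword insert "<b>" thru keyword insert "</b>"
        | skip
    ]
]

print content
```

The `| skip` alternative only advances over characters that do not start a keyword, so no modification has to be undone on backtracking.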
parse "<bb><aa><bb><aa><aa>" [any [to "<aa>" | skip ] ]
`to`. But I can't understand why it breaks.
parse "<bb><aa><bb><aa><aa>" [any [copy _ <aa> (print _) | skip] ]
In `copy _ (print _)`, does `copy` create a temporary word named `_` and copy the match into it, or what?
`to` moves just to that tag and stays there. On the next iteration it finds the same position. So after matching the tag, you have to skip it somehow:
>> parse "<bb><aa><bb><aa><aa>" [any [to "<aa>" s: (print s) | skip ] ]
<aa><bb><aa><aa>
<aa><bb><aa><aa>
== false
>> parse "<bb><aa><bb><aa><aa>" [any [to "<aa>" s: "<aa>" (print s) | skip ] ]
<aa><bb><aa><aa>
<aa><aa>
<aa>
== true
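To make the difference visible, here is a small sketch that records where the cursor ends up after `to` and after `thru`. The words `pos-to` and `pos-thru` and the sample string are hypothetical names chosen for this illustration.

```red
Red []

text: "<bb><aa><bb>"

; `to` stops in front of the match, so the tag is still ahead of the cursor
parse text [to "<aa>" pos-to: to end]

; `thru` stops behind the match
parse text [thru "<aa>" pos-thru: to end]

print pos-to      ; <aa><bb>
print pos-thru    ; <bb>
```

This is why a loop built on `to` needs an explicit match of the tag (or `thru`) to make progress.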
`parse`: `?` shows help in the context of Red. How do I get a command description in the context of Red Parse?
`parse` doesn't have commands but keywords, which are interpreted as described in various [articles](https://github.com/red/red/wiki/%5BDOC%5D-Parse). There is also a new draft doc (https://github.com/red/docs/pull/204).
> parse content [
>     to keyword
>     any [ ahead keyword insert "<b>" thru keyword insert "</b>" to keyword ]]
What does `to keyword` do? Move to the next one?
`to` or `thru`
Changes the first elements of a series and returns the series after the change.
>> change "aabbcc" "g"
== "abbcc"
>> head change "aabbcc" "g"
== "gabbcc"
>> s: change "aabbcc" "g"
== "abbcc"
>> s
== "abbcc"
Why is `s` not getting the value?
`s: head change`
`s: head change "aabbcc" "g"` works. But I do not understand why `head` is needed.
>> head s: change "aabbcc" "g"
== "gabbcc"
>> s
== "abbcc"
>> s: head change "aabbcc" "g"
== "gabbcc"
>> s
== "gabbcc"
`head` to get the series from its head.
`change` (or any other transform or move) does not return a new copy of the series (unless you use `copy`, of course), but a new position in the series (index) and a pointer to the same series.
`s1` and `s2` point to the same series but at different indexes:
>> s1: "aabbcc"
== "aabbcc"
>> s2: change s1 "g"
== "abbcc"
>> s1
== "gabbcc"
>> s2
== "abbcc"
>> head s2
== "gabbcc"
>> same? s1 s2
== false
>> same? s1 head s2
== true
>> same? next s1 s2
== true
So `change` changes the index position, right? And it places the new value at the start position, right?
`change` changes the series at the current position and returns the same series at the position after the changed element.
`head?` is moved from the start?
>> s: change "aabbcc" "g"
== "abbcc"
>> head? s
== false
>> head? back s
== true
>> back s
== "gabbcc"
>> index? s
== 2
`change` does not move the index position back to the start? What is the logic?
>> a: "aabbcc"
== "aabbcc"
>> change a "z"
== "abbcc"
>> a
== "zabbcc"
>> change change a "1" "2"
== "bbcc"
>> a
== "12bbcc"
change (change a "1") "2"
colors: [blue red green gold]
change next colors 'yellow
>> colors
== [blue yellow green gold]
index position, and I was very confused.
`next colors` moves the index to the second position (`red` in this case) and `change` replaces that value with `yellow`.
>> colors: [blue red green gold]
== [blue red green gold]
>> change colors 'foo
== [red green gold]
>> colors
== [foo red green gold]
>> colors
== [red green gold]
`change` returns the series after the change but does not modify the index; you would need to assign the result back to `colors`.
"abcd" has the same length as [red green blue yellow]
> list: ["Abel" "Cain" "Seth"]
> add-names: func [/local names][names: [] append names list]
> add-names
> ;== ["Abel" "Cain" "Seth"]
> add-names
> ;== ["Abel" "Cain" "Seth" "Abel" "Cain" "Seth"]
> add-names
> ;== ["Abel" "Cain" "Seth" "Abel" "Cain" "Seth" "Abel" "Cain" "Seth"]
> ;-----------
> add-names: func [/local names][names: copy [] append names list]
> add-names
> ;== ["Abel" "Cain" "Seth"]
> add-names
> ;== ["Abel" "Cain" "Seth"]
> add-names
> ;== ["Abel" "Cain" "Seth"]
>> t: func [/local n] [n: [] append n "aa"]
== func [/local n][n: [] append n "aa"]
>> t
== ["aa"]
>> t
== ["aa" "aa"]
`n` is created inside the function (not in the global dict). Where is `n` stored, and why does it keep appending more and more? Is `t` creating a new dict inside itself that continues to live after each call? How can I inspect `t` to see its inner structure?
>> f: func [/local b][b: [] append b "aa"]
== func [/local b][b: [] append b "aa"]
>> f
== ["aa"]
>> f
== ["aa" "aa"]
>> body: body-of :f
== [b: ["aa" "aa"] append b "aa"]
>> body/5: "bb"
== "bb"
>> f
== ["aa" "aa" "bb"]
`body-of`?
`:t` returns the whole function, `body-of` returns just the function's body, just like `spec-of` returns the function's spec.
> >> f: func [/local b][b: [] append b "aa"]
> == func [/local b][b: [] append b "aa"]
> >> f
> == ["aa"]
> >> f
> == ["aa" "aa"]
> >> body: body-of :f
> == [b: ["aa" "aa"] append b "aa"]
> >> body/5: "bb"
> == "bb"
> >> f
> == ["aa" "aa" "bb"]
`b: []` creates a block. But why does `b: copy []` prevent the duplication?
`b` and thanks to `copy`, it's initialized each time to an empty block.
With `b: []`, `b` is a reference to an empty block. So if you modify `b`, you are modifying the original block.
With `b: copy []`, `b` is a **copy** of an empty block, and by adding to `b` you're not modifying the original block.
So `b: []` is the same as writing `b: a`? The only difference being that here it is not a word but a block that works like a word? But when we write `[]` on a new line, it's just a block declaration. `[]` begins to work as an accumulator of appended values, like other words do.
>> z: []
== []
>> f: func [/local b] [append copy z "bb"]
== func [/local b][append copy z "bb"] ; z is local copy of global `z`, right? if yes what is the name of word where will be placed `bb`. It will be in `[]` block without name?
`z` in your func is global; `copy z` is also global but anonymous and lost. You can catch it with e.g. `b: f`.
`input-stdin` is an internal function and may not be available later, but currently the code below works on Win10:
Red []
;Current dir is /bin
#include %../../environment/console/CLI/input.red
probe input-stdin
C:\...\red\build\bin>echo test | pipe.exe
"test ^M"
`input-stdin` and operate against that.
data: object [
id: ""
lots: [
maxPrice: ""
purchaseObjects: [
]
]
]
parse a [
(clear data/id clear data/lots/purchaseObjects clear data/lots/maxPrice) ; <--- this line
thru "<id>" copy _ to "</id>" (append data/id _ )
thru "<maxPrice>" copy _ to "</maxPrice>" (append data/lots/maxPrice _ )
thru "<purchaseObjects>"
collect any [
"<OKPD2>" [
thru "<code>" copy p to "</code>" thru "<name>" copy n to "</name>" ( append data/lots/purchaseObjects object compose [ code: (p) name: (n) ] )
] | skip
]
"</purchaseObjects>"
]
write %file.txt to-json data
`to-json to-block data`
Is `(clear data/id clear data/lots/purchaseObjects clear data/lots/maxPrice)` a hack?
thru "<id>" copy _ to "</id>" (copy [] append data/id _ )
`(clear data/id clear data/lots/purchaseObjects clear data/lots/maxPrice)` outside the parser?
`copy []` in this code:
thru "<id>" copy _ to "</id>" (copy [] append data/id _ )
`copy _`, because it is already copied by your parse rule.
`clear data/lots/maxPrice` as it is a block where you append multiple values.
`purchaseObjects`. Instead of:
append data/lots/purchaseObjects object compose [ code: (p) name: (n) ]
repend data/lots/purchaseObjects [ p n ]
foreach [code name] data/lots/purchaseObjects [ print ["Code:" code "name:" name] ]
`data` object, consider not using `id: ""`, because it will be replaced anyway. Using `id: none` is better, as you don't create an unused empty string.
`compose` in your code... the `object` constructor already _reduces_ the values. Check this:
>> p: "foo" object [code: p]
== make object! [
code: "foo"
]
`parse`. The first videos are up. More to come. https://youtu.be/k7VYAFPDnXc and https://youtu.be/1riJ1PYYOfQ
`purchaseObjects`. Instead of:
> append data/lots/purchaseObjects object compose [ code: (p) name: (n) ]
> repend data/lots/purchaseObjects [ p n ]
[{code: 123, name: "sdf"}{code: 321, name: "zxc"}]
a: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lots>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</lots>
</root>}
`lots` should always be a list. Every `purchaseObjects` in it is also a list.
{
"lots": [{
"maxPrice": 8186313.66,
"purchaseObjects": [{
"code": "11.131.11",
"name": "Foo111"
}, {
"code": "22.12.55",
"name": "Bar222"
}, {
"code": "33.322.41",
"name": "Baz333"
}]
}]
}
Red. Because the `lots` content is multiple. I am sure that the following code is wrong:
data: object [
id: ""
lots: [
maxPrice: ""
purchaseObjects: [
]
]
]
parse a [
; some clearing need here
thru "<maxPrice>" copy _ to "</maxPrice>" ( append data/lots/maxPrice _ )
thru "<lots>"
collect any [
to "<OKPD2>" [
thru "<code>" copy p to "</code>" thru "<name>" copy n to "</name>" ( append data/lots/purchaseObjects object [ code: (p) name: (n) ] )
] | skip
]
"</lots>"
]
write %file.txt to-json data
`lot` and collect all objects from it, then move to another `lot`
parse a [
thru "<maxPrice>" copy _ to "</" ( data/lots/maxPrice: _ )
any [
thru "<code>" copy p to "</"
thru "<name>" copy n to "</" (
append data/lots/purchaseObjects object [ code: p name: n ]
)
]
]
... you would need to take more care to handle multiple lots.
{
"id": "",
"lots": ["maxPrice:", "8186313.668186313.66", "purchaseObjects:", [{
"code": "11.131.11",
"name": "Foo111"
}, {
"code": "666",
"name": "Bar222"
}, {
"code": "33.322.41",
"name": "Baz333"
}
]]
}
{
"id": "",
"lots": [{
"maxPrice:", "8186313.668186313.66", "purchaseObjects:", [{
"code": "11.131.11",
"name": "Foo111"
}, {
"code": "666",
"name": "Bar222"
}, {
"code": "33.322.41",
"name": "Baz333"
}
] }
]
}
po-name: [<name> keep to </name> </name>]
po-code: [<code> keep to </code> </code>]
po: [<purchaseObject> collect [
any [
not ahead </purchaseObject>
[po-name | po-code | skip]
] </purchaseObject>]
]
>> parse a [collect any [po | skip]]
== [["11.131.11" "Foo111" "666"] ["22.12.55" "Bar222"] ["33.322.41" "Baz3...
data: object [
id: none
lots: []
]
parse-lot: func[code /local price purchases p n][
purchases: copy []
parse code [
thru <maxPrice> copy price to </maxPrice>
any [
thru <code> copy p to </code>
thru <name> copy n to </name> (
append purchases object [ code: p name: n ]
)
]
]
object [maxPrice: price purchaseObjects: purchases]
]
parse-lots: func[code /local tmp][
parse code [
any [
thru <lot> copy tmp to </lot> (append data/lots parse-lot tmp)
]
]
]
parse a [
thru <id> copy id to </id>
thru <lots> copy tmp to </lots> (
data/id: id
clear data/lots
parse-lots tmp
)
]
data
make object! [
id: "19160099"
lots: [make object! [
maxPrice: "8186313.66"
purchaseObjects: [make object! [
code: "11.131.11"
name: "Foo111"
] make object! [
code: "666"
name: "Bar222"
] make object! [
code: "33.322.41"
name: "Baz333"
]]
]]
]
parse user_.
`copy` is working. Now I can use it. But it's not clear to me what happens when I write:
f: func[w] [d: copy [] append d w]
`d` is a global word here, yes?
What does `copy` do? In the next code, `d` should get an empty block `[]` on every call. But it does not! Is it not overwriting the global word `d`? Or what?
f: func[w] [d: [] append d w]
>> f: func[w] [d: copy [] append d w]
== func [w][d: copy [] append d w]
>> f 1
== [1]
>> f 2
== [2]
>> f 3
== [3]
>> f1: func[w] [d: copy [] append d w d]
== func [w][d: copy [] append d w d]
>> f2: func[w] [d: [] append d w d]
== func [w][d: [] append d w d]
>> f1 "a"
== ["a"]
>> f1 "a"
== ["a"]
>> f2 "a"
== ["a"]
>> f2 "a"
== ["a" "a"]
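One way to see what is going on is to compare the identity of the returned blocks with `same?`. This is only an illustrative sketch; the function names `keep-literal` and `keep-copy` are hypothetical, and `d` is declared `/local` here so the example does not touch a global word.

```red
Red []

; the literal [] lives in the function body, so every call reuses that one block
keep-literal: func [w /local d][d: [] append d w d]

; copy [] makes a fresh block on every call
keep-copy: func [w /local d][d: copy [] append d w d]

a: keep-literal 1
b: keep-literal 2
print same? a b     ; true
probe b             ; [1 2]

c: keep-copy 1
e: keep-copy 2
print same? c e     ; false
probe e             ; [2]
```

So `copy` does not clear anything; it simply leaves the literal block in the function body untouched.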
So `copy` clears the global value?
`d`
>> f: func[w] [d: [] append d w]
== func [w][d: [] append d w]
>> f 1
== [1]
>> second body-of :f
== [1]
`clear`). But this is off topic in the /parse room. Just remember that when you `copy`, you are safe.
`d: make block! 8`, which will also always produce a new block, preallocated to the given size.
`clear` here: https://gist.github.com/greggirwin/b08ffb5c9fa54a9b9387248387baf46d#file-url-parser-red-L157
>> f3: func[w] [d: clear [] append d w d]
== func [w][d: clear [] append d w d]
>> f3 "a"
== ["a"]
>> f3 "b"
== ["b"]
`copy` https://gist.github.com/greggirwin/b08ffb5c9fa54a9b9387248387baf46d#file-url-parser-red-L195
result: clear url-buffer:// does two other things, which are more important: 1) it sets the datatype of the result. 2) it documents the purpose, as you would with a comment, but without needing a separate comment.
`copy` the result at the end.
data: object [
id: none
lots: []
]
parse-lot: func[lot /local purchaseObjects] [
single_lot: object [
maxPrice: ""
purchaseObjects: []
]
parse lot [
thru "<maxPrice>" copy _ to "</maxPrice>" (single_lot/maxPrice: _ )
any [
thru "<purchaseObjects>" copy obj to "</purchaseObjects>" (append single_lot/purchaseObjects parse-obj obj )
]
]
single_lot
]
parse-obj: func[obj ] [
purchaseObjects: copy []
parse obj
[
any [
thru "<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
to "</OKPD2>"
]
]
purchaseObjects
]
parse a [
(clear data/lots)
thru "<id>" copy id to "</id>" (data/id: id)
any [
thru "<lot>" copy lot to "</lot>" (append data/lots parse-lot lot )
]
]
write %file.txt to-json data
{
"id": "19160099",
"lots": [{
"maxPrice": "8186313.66",
"purchaseObjects": [{
"code": "11.131.11",
"name": "Foo111"
}, {
"code": "22.12.55",
"name": "Bar222"
}, {
"code": "33.322.41",
"name": "Baz333"
}
]
}
]
}
Is `/local purchaseObjects` needed in the code? Or can it be dropped?
`/local` too.
`set` or `copy` data to these, as you can accidentally overwrite important parts elsewhere, like in this example:
>> foo: "hello"
== "hello"
>> f: func[][ parse "foo" [copy foo to end] ] f
== true
>> foo
== "foo" ;<-------- not a "hello" anymore!
`set` and `copy` in parse rules are not recognized by the `function` constructor, which normally collects set-words and makes them local for you.
>> a: b: 0 f: function[][ a: 1 parse "foo" [copy b to end] ] f
== true
>> a
== 0 ;<--- was collected as a local
>> b
== "foo" ;<--- was not collected as a local
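A small sketch of the safe variant, with the parse target declared explicitly. The function name `grab-all` is hypothetical and exists only for this illustration.

```red
Red []

foo: "hello"

; declaring `foo` after /local keeps the parse `copy` away from the global word
grab-all: func [text /local foo][
    parse text [copy foo to end]
    foo
]

print grab-all "bar"   ; bar
print foo              ; hello
```

The same explicit declaration works with `function` too, since words listed after `/local` are always local regardless of what the automatic collection finds.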
`/local`! So it's good practice to declare all words inside a function as local?
`true` for every document that the grammar can parse (now it parses but returns `false` as the result)?
`false`? Now I have a few functions and I need to learn the best debugging practice.
`true` before starting work on the next?
`parse-trace` is also useful sometimes, but its output is more difficult to understand and follow.
>> parse [a b c 1 d] [some [word! (prin ".")]]
...== false ; it's the fourth item
>> parse [a b c 1 d] [some [p: word! (probe first p)]]
a
b
c ; fails after c
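The marker trick shown above can also be used without printing on every step: keep the last position in a word and inspect it after `parse` returns. A minimal sketch, where `items`, `mark` and `ok?` are hypothetical names for this example:

```red
Red []

items: [a b c 1 d]
mark: none

; `mark:` is updated each time the inner rule is tried
ok?: parse items [some [mark: word!]]

print ok?            ; false
print index? mark    ; 4
probe first mark     ; 1
```

After a failed parse, `mark` points at the first item the rule could not match, which is usually the quickest hint about what went wrong.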
8 replies, but I have trouble accessing them.
Red []
;file: read %/C/current_month/notice/notification_Moskva_2019020200_2019020300_001.xml/fcsNotificationEA44_0373200101018000262_19160099.xml
data: object [
id: none
lots: []
]
parse-lot: func[lot /local single_lot price obj] [
single_lot: object [
maxPrice: ""
purchaseObjects: []
]
parse lot [
thru "<maxPrice>" copy price to "</maxPrice>" (single_lot/maxPrice: price )
any [
thru "<purchaseObjects>" copy obj to "</purchaseObjects>" (append single_lot/purchaseObjects parse-obj obj ) | skip
]
]
single_lot
]
parse-obj: func[obj /local c n] [
purchaseObjects: []
parse obj
[
any [
thru "<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
to "</OKPD2>"
]
]
purchaseObjects
]
parse file [
(clear data/lots)
thru "<id>" copy id to "</id>" (data/id: id)
any [
thru "<lot>" copy lot to "</lot>" (append data/lots parse-lot lot ) | skip
]
]
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
`false`, no hangup here.
`parse` recently.
`any [thru "<lot>" copy lot to "</lot>" (append data/lots parse-lot lot ) | skip]` and if it's not found, it skips to the next character.
, then moves to the next character and scans again. Then moves to the next character and scans again. Etc., until it finally hits the end.
`thru` and see the speedup.
any [
"<lot>" copy lot to "</lot>" (append data/lots parse-lot lot ) | skip ; thru removed
]
>> do %test.red
== true
Should I remove `thru` in the other functions too?
`thru` :) Your input file is ~700 kB, so it checks almost 2450000000000 characters instead of 700000. In your other functions you are checking a much smaller string, so it runs faster. Another optimization would be to use not three functions but one with nested parse rules.
`any [thru something | skip]` with `any [something | skip]`, as the second variant is much faster and the code is also smaller.
`any` works without `thru`. Or does it move?
>> parse "aabbccdd" [any "bb" s:]
== false
>> index? s
== 1
`any [thru something | skip]` is nonsense... because if `thru something` fails, the `skip` rule never helps with finding `something` after skipping the input by one char.
`any` in itself doesn't do any moves. If you read the [documentation](https://www.red-lang.org/2013/11/041-introducing-parse.html), you'll find that `any` *repeats rule zero or more times until failure or if input does not advance*
`bb`
`thru` checks the whole file for `<lot>`. If there isn't one, `skip` moves one char forward and then `thru` **checks the whole file again**. This is repeated until the end is finally hit. So inefficient that it looks like Red hangs.
`thru`?
`some [thru something]` or `some [something | skip]`, but never ever `some [thru something | skip]`
`thru` there, because you expect that. Of course if that is the case ...
`thru`, how would you write it? Of course as `some [something | skip]`. So `some [thru something | skip]` is `some [some [something | skip] | skip]`. As you can see, the complexity of this rule is not linear but quadratic.
>> parse a [any [<name> copy n to #"<" thru #">" (probe n) | skip]]
"Foo111"
"Bar222"
"Baz333"
== true
parse a [any [<name> copy name to #"<" thru #">" (probe name) | to #"<"] ]
>> dt [parse append/dup "" #"0" 10000 [some [#"1" | skip]]]
== 0:00:00.002302
>> dt [parse append/dup "" #"0" 10000 [some [thru #"1" | skip]]]
== 0:00:03.20431
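The two forms collect the same matches; only the amount of scanning differs. The sketch below checks that on a tiny, made-up input (the words `text`, `slow` and `fast` are hypothetical names for this example).

```red
Red []

text: "x<lot>1</lot>yy<lot>2</lot>zzz"
slow: copy []
fast: copy []

; after the last match, every `skip` is followed by another full scan to the tail
parse text [any [thru "<lot>" copy lot to "</lot>" (append slow lot) | skip]]

; here each position is tested once
parse text [any ["<lot>" copy lot to "</lot>" (append fast lot) | skip]]

probe fast               ; ["1" "2"]
print equal? slow fast   ; true
```

On a few dozen characters the extra scanning is invisible; on a large file with a long tail after the last match it dominates the run time, as the timings above show.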
`skip` itself is also sometimes not what you want... compare the number of skips evaluated here:
>> s: 0 parse a [any [<name> copy name to #"<" thru #">" (probe name) | skip (s: s + 1)] ] s
"Foo111"
"Bar222"
"Baz333"
== 811
>> s: 0 parse a [any [to #"<" [<name> copy n to "<" thru ">" (probe n) | skip (s: s + 1)]]] s
"Foo111"
"Bar222"
"Baz333"
== 42
`to` has to do anyway.
`to #"<"` says that you want to get to the start of a tag and ignores the other characters, which would be tested with the first version.
`<` doesn't match in the first case.
relatively soon and then tons of uninteresting garbage, which is basically the same as my example. It's the garbage at the end that was causing the supposed hangup.
`dt` is a simple one-liner I wrote just for this test. I would post Red's numbers with your examples, but I don't have `a` in my Red :(
`a`
>> dt [loop 10000 [ parse a [any [thru <name> copy name to #"<" thru #">" | skip]] ]]
== 0:00:01.007076
>> dt [loop 10000 [ parse a [any [<name> copy name to #"<" thru #">" | skip]] ]]
== 0:00:02.348254
>> dt [loop 10000 [ parse a [any [to #"<" [<name> copy n to "<" thru ">" | skip]]] ]]
== 0:00:00.261976
parse a [
any [ ;- find any tags...
to #"<" [ ;- seek the char used to start a tag
[
<name> copy n to "<" (print ["n:" n]) ;- process <name> tag...
| ;- .. or ..
<code> copy c to "<" (print ["c:" c]) ;- process <code> tag...
]
| skip ;- skips the "<" char so we can find another tag
]
]
to end ;- just to let parse return TRUE when there is some left content
]
`thru` (and also `to`) rule, you should be aware that you may skip content which you don't want to skip, so it is always good to be careful in complex rules. But of course it is useful in cases where, for example, you just want the title of a web page:
>> parse read https://www.red-lang.org [thru <title> copy t to "<"] t
== "Red Programming Language"
> `thru` checks the whole file for `<lot>`. If there isn't one, `skip` moves one char forward and then `thru` **checks the whole file again**. This is repeated until the end is finally hit. So inefficient that it looks like Red hangs.
Why is `thru` not being called?
>> parse "aabbccdd" [any [ thru "cc" print ("cc") | skip (print "skiped") ] ]
skiped
skiped
skiped
skiped
skiped
skiped
skiped
skiped
== true
"`thru` **checks the whole file again**"
>> parse "aabbccdd" [any [ thru "cc" (print "cc") | skip (print "skiped") ] ]
cc
skiped
skiped
== true
`thru` is running only one time.
> `dt` is a simple one-liner I wrote just for this test. I would post Red's numbers with your examples, but I don't have `a` in my Red :(
What is `dt`?
dt: func [block /local t][t: now/time/precise do block now/time/precise - t]
`print` *before* `thru`. You are checking if the rule succeeded, not if it was called.
`a`. But what about a much simpler case?
parse "aabbccdd" [any [ (print "cc") thru "cc" | skip (print "skiped") ] ]
After `skip`, `thru` should re-scan the string from index+1 (aka `next`).
thru [(print "cc") "cc"]
`skiped` is printed for each char after the last `cc` found:
>> s: 0 parse "aabbccdd" [any [ thru "cc" | skip (s: s + 1) ] ] s
== 2
>> s: 0 parse "aabbccdddd" [any [ thru "cc" | skip (s: s + 1) ] ] s
== 4
it checks the whole file again.. it checks from the current position to the tail.
> `thru` :) Your input file is ~700 kB, so it checks almost 2450000000000 characters instead of 700000. In your other functions you are checking a much smaller string, so it runs faster. Another optimization would be to use not three functions but one with nested parse rules.
> `any [thru something | skip]` with `any [something | skip]`, as the second variant is much faster and the code is also smaller.
`thru "lot"` moves the index to the position after the first `lot` it finds. Then we run the function that collects the body of the `lot`; if the current position is not a `lot`, we just skip a symbol. What am I missing?
`thru`, but how will it change the logic?
`something` and `thru something`. The first checks only whether `something` is at the current position, while the second traverses to the end.
`any [something | skip]` as: _check if something is at current position, if not, skip the position to next char_
`thru` is just setting the cursor to the position after the match is found.
`skip` changes the position to the next char, but you are again seeking from that position to the end.
>> parse "aabbccdd" [ (print "cc") thru "cc" ]
cc
>> parse "aabbccdd" [ thru [(print "cc") "cc" ]]
cc
cc
cc
cc
cc
>> parse "aabbccdd" [ thru [p: (print ["cc" mold p]) "cc" ]]
cc "aabbccdd"
cc "abbccdd"
cc "bbccdd"
cc "bccdd"
cc "ccdd"
`thru` on a block always skips. I'm not sure if it is by design.
>> parse "aabbccdd" [some [p: (probe p) "cc" break | skip ]]
"aabbccdd"
"abbccdd"
"bbccdd"
"bccdd"
"ccdd"
`print` before the content check, you see the output even when the match is not found!
a: "aabbccdd"
parse a [ thru "cc" (print index? a) ]
1
`parse`.
ports?
`parse` doesn't have all the functionality of Red's one), but it's more or less the same in this case.
`true`/`false` and the second is the parsed data. How can I do it the right way?
do-parsing: function[file] [
; ..... some code skipped
parse file [ ; result of parsing
(clear data/lots)
thru "<id>" copy id to "</id>" (data/id: id)
any [
"<lot>" copy lot to "</lot>" (append data/lots parse-lot lot ) | skip
]
]
data-for-return: to-json data
]
`parse file` to `false` if any of the top parse functions evaluated as failed?
parse-result: parse file [ ... ]
...
data-for-return: reduce [parse-result to-json data]
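A self-contained sketch of the "return both values" idea. It is deliberately simplified: the input strings are made up, and only the `<id>` tag is extracted. `function` collects the set-words `id:` and `ok?:`, so both stay local, including the `copy id` target inside the rule.

```red
Red []

do-parsing: function [text][
    id: none                     ; set-word, so `function` makes `id` local
    ok?: parse text [thru "<id>" copy id to "</id>" to end]
    reduce [ok? id]              ; first item: parse result, second: parsed data
]

probe do-parsing "<root><id>42</id></root>"   ; [true "42"]
probe do-parsing "<root>no id here</root>"    ; [false none]
```

The caller can then pick the pieces apart with `first` and `second`, or with `set [ok? id] do-parsing text`.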
f: function[v] [ parse v ["somewrongrule"] ]
parse "aabbccdd" [ thru "aa" copy v to "dd" 2 skip (f v)]
`if` keyword:
>> parse "abc" ["a" "bc"]
== true
>> parse "abc" ["a" if (true) "bc"]
== true
>> parse "abc" ["a" if (false) "bc"]
== false
"<purchaseObjects>" copy obj to "</purchaseObjects>" (append single_lot/purchaseObjects if(parse-obj obj) ) | skip
*** Script Error: if is missing its then-blk argument
*** Where: if
*** Stack: do-parsing
`if` as a parse rule *inside* the code block. Move it out.
>> bb-rule: ["bb"]
== ["bb"]
>> parse-bb: func [value][parse value bb-rule]
== func [value][parse value bb-rule]
>> parse "aabbcc" ["aa" copy value to "cc" (parse-bb value) to end]
== true
>> parse "aabbcc" ["aa" bb-rule "cc" to end]
== true
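To let an inner `parse` decide whether the outer rule continues, the parse-dialect `if` goes outside the paren and the call goes inside it. A minimal sketch, where `inner-ok?` is a hypothetical helper written for this example:

```red
Red []

; returns true only when the copied part is exactly "bb"
inner-ok?: func [value][parse value ["bb"]]

; parse-dialect `if` fails the rule when the paren evaluates to false or none
print parse "aabbcc" ["aa" copy v to "cc" if (inner-ok? v) "cc"]   ; true
print parse "aaxxcc" ["aa" copy v to "cc" if (inner-ok? v) "cc"]   ; false
```

Note that the paren must return a logic value for this to be useful; an expression such as `append`, which returns a series, would always count as success.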
`(return f v)` inside your rule.
> `if` as a parse rule *inside* the code block. Move it out.
"<purchaseObjects>" copy obj to "</purchaseObjects>" if (append single_lot/purchaseObjects parse-obj obj ) | skip
`true`:
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
data: object [
id: none
lots: []
]
parse-lot: func[lot /local single_lot price obj] [
single_lot: object [
maxPrice: ""
purchaseObjects: []
]
parse lot [
thru "<maxPrice>" copy price to "</maxPrice>" (single_lot/maxPrice: price )
any [
"<purchaseObjects>" copy obj to "</purchaseObjects>" if (append single_lot/purchaseObjects parse-obj obj ) | skip
]
]
single_lot
]
parse-obj: func[obj /local c n] [
purchaseObjects: []
parse obj
[
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
to "</OKPD2>"
]
]
purchaseObjects
]
parse file [
(clear data/lots)
thru "<id>" copy id to "</id>" (data/id: id)
any [
"<lot>" copy lot to "</lot>" if (append data/lots parse-lot lot ) | skip
]
]
`if` as a keyword in the parse dialect and `if` as a keyword for the main evaluation.
`thru` and `to`, so it will escape.
`if (append data/lots parse-lot lot )` is nonsense.
`true`
do https://raw.githubusercontent.com/rebolek/red-tools/master/codecs/xml.red
data: xml/decode file
`parse`, then it is really good to start with the mentioned simple HTML parser from the article.
`data`? But how do I make it look like the structure that I need?
`parse`
> `parse`, then it is really good to start with the mentioned simple HTML parser from the article.
>> data: xml/decode data
*** Syntax Error: invalid character in: "ns2:export"
ws: charset reduce [space tab cr lf]
rule-lot: [
any ws <maxPrice> copy mp to "</" </maxPrice> (probe mp)
any ws <purchaseObjects> thru </purchaseObjects>
any ws
]
parse file [
any ws
<root> [
any ws <id> copy id to "<" (probe id) </id>
any ws <purchaseNumber> copy pn to "<" (probe pn) </purchaseNumber>
any ws <lot> rule-lot </lot>
]
any ws
</root>
]
`purchaseObjects` as it is now just skipped.
`id`, `purchaseNumber` and `lot` in exact order!
`any ws` parts, as it was dealing with white spaces automatically and one had to use `parse/all` to handle the spaces manually. @dockimbel I wonder if this should not be available in Red too!
`parse/all` is the default mode. The definition of whitespace is so subjective that I used `parse/all` all the time.
`parse/all` is more common.
`parse/all` -> Red's `parse`. Wasn't it the same in R3? I may be wrong though.
`/all` only for parse splitting:
>> parse "a | b" "|"
== ["a" "b"]
>> parse/all "a | b" "|"
== ["a " " b"]
`parse` is working like Red's one?
`lot` with `purchaseObjects`:
parse file [
thru "<id>" copy id to "</id>" (data/id: id)
any [
"<lot>"
any [
"<purchaseObjects>"
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( purchaseObjects: copy [] append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
"</purchaseObjects>" | skip
]
"</lot>" | skip
]
]
`*id`?
>> id: 1 object [id: id]
*** Script Error: id has no value
*** Where: id
*** Stack: object
>> *id: 1 object [id: *id]
== make object! [
id: 1
]
`any ws` ;-)
`false` when the input is malformed. Not like the initial version.
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" (...)
"</OKPD2>" | skip
`to "</name>"` and then expects `"</OKPD2>"`, which is not there yet!
Parse is an advanced feature, and I think it will benefit you greatly to get very comfortable with the basic functions, how series work, what helps you debug code, etc.
`parse` can be a *lot* of work. We plan to build tools to help with this, but right now, it can mean a lot of `probe` calls, temp markers, or using `parse/trace`. Making rules robust in the face of real world data is not easy.
parse-object: [
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
]
parse-lot: [
(purchaseObjects: copy [])
any [
"<purchaseObjects>" parse-object to "</purchaseObjects>" | skip
]
]
parse file [
(lots: copy [])
any [
"<lot>" parse-lot (print "hello") "</lot>" | skip
]
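]
`(print "hello")` works only if it is placed before `"</lot>"`. And I can't understand why it's not working if I write it as:
You have `to "</purchaseObjects>"` in your `parse-lot` rule, so it ends before `</purchaseObjects>`. Change it to either `to "</purchaseObjects>" "</purchaseObjects>"` or `thru "</purchaseObjects>"`.
parse-object: [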
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
]
parse-lot: [
(purchaseObjects: copy [])
any [
"<purchaseObjects>" parse-object to "</purchaseObjects>" "</purchaseObjects>" | skip
]
]
parse file [
(lots: copy [])
any [
"<lot>" parse-lot "</lot>" (print "hello") | skip
]
]
`</lot>` isn't right after `</purchaseObjects>`, so change it to `to "</lot>"`; it should help.
</purchaseObjects> </lot>
`fcsNotificationEA44_0373200101018000262_19160099.xml` file and after there are . Even if it's
</purchaseObjects> </lot>
`| skip` in my last version?
With `to` it works. I tried to replace `to` with `ws` and it stops working. Why?
-------------
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
ws: charset reduce [space tab cr lf]
parse-object: [
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
]
parse-lot: [
(purchaseObjects: copy [])
any [
"<purchaseObjects>" parse-object to "</purchaseObjects>" "</purchaseObjects>"
]
]
parse file [
(lots: copy [])
any [
"<lot>" parse-lot ws "</lot>" (print "hello") | skip
]
]
`| skip` in my last version?
upd: Yes, I just mentioned it.
`ws`. But I am not sure that `ws` is good, because real data can vary a lot. For example, there can be different sections like etc in different places. So maybe `skip` will be more reliable.
`ws` is just for white spaces, so it will not handle unknown tags.
`ws` should work, because I do not have any other tags between:
</purchaseObjects> </lot>
"<lot>" parse-lot ws "</lot>" (print "hello") | skip
</purchaseObjects> </lot>
to work:
"<lot>" parse-lot to "</lot>" (print "hello") | skip
parse-object as I mentioned earlier.
whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
( ) ?
[ purchaseNumber "0373200101018000262" id "19160099" lot [ purchaseObjects [...] maxPrice "4343" garbage "..." ] foo "...." ]
"hello" because your parse-lot fails. Definitely in the parse-object part.
parse/trace
[sub1 opt sub2 sub3]. To know how far I got, I've inserted a few set-words there: [p1: sub1 p2: opt sub2 p3: sub3] Now when the rule ends, I can check the p1-p3 values to see exactly where the parse was.
parse-object rule, you have this:
data: "<a><b>a</b></a>"
parse data [ <a> thru <b> copy b to </b> </a> ]
to does not skip the tag for you.
whitespace: charset reduce [space tab cr lf]
data: "<a> <b>a</b> </a>"
parse data [ <a> thru <b> copy b to "<" </b> any whitespace </a> ]
thru may skip out of your tag content!
parse data [ <a> any whitespace <b> copy b to "<" </b> any whitespace </a> ]
contents (using skip):
data: "<a> <b>x</b> </a> <c/> <a> <b>y</b> </a>"
parse data [
    any [
        <a> any whitespace <b> copy b to "<" </b> any whitespace </a> ( probe b )
        | skip
    ]
]
data: "<a> <b>b1</b><c>c1</c> </a> <c/> <a> <c>c2</c><b>b2</b> </a>"
parse data [
    any [
        <a> any [
            any whitespace <b> copy b to "<" </b>
            | any whitespace <c> copy c to "<" </c>
        ]
        any whitespace </a> ( print [b c] )
        | skip
    ]
]
data: "<a> <b>b1</b><c>c1</c> </a> <d/> <a> <c>c2</c><b/> </a>"
parse data [
    any [
        <a> any [
            any whitespace <b> copy b to "<" </b>
            | any whitespace <c> copy c to "<" </c>
            | any whitespace <b/> (b: none)
            | any whitespace <c/> (c: none)
        ]
        any whitespace </a> ( print [b c] )
        | skip
    ]
]
, so you will ask us, and we will show you this:
data: "<a> <b>b1</b><c>c1</c> fooo </a> <d/> <a> <c>c2</c><b/> </a>"
parse data [
    any [
        <a> any [
            any whitespace <b> copy b to "<" </b>
            | any whitespace <c> copy c to "<" </c>
            | any whitespace <b/> (b: none)
            | any whitespace <c/> (c: none)
            | any whitespace </a> break
            | skip
        ] ( print [b c] )
        | skip
    ]
]
... so you will learn a new level:
data: "<a> <b>b1</b><c>c1</c> fooo </a> <d/> <a> <c>c2</c><b/> <a><b>b3</b></a>"
parse data [
    any [
        <a> (b: c: none)
        any [
            any whitespace <b> copy b to "<" </b>
            | any whitespace <c> copy c to "<" </c>
            | any whitespace <b/> (b: none)
            | any whitespace <c/> (c: none)
            | any whitespace ahead </a> break ;<--- check if there will be closing tag
            | skip ;<--- so this skip will not go until end
        ] ( print [b c] )
        | skip
    ]
]
false. I am not sure that I will succeed, because I have problems even with a single file, but at least I will try)))
> "<OKPD2>"
> thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" (...)
> "</OKPD2>" | skip
to "</name>" and then expects "</OKPD2>", which is not there yet!
whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
parse-object: [
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" "</name>" whitespace ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
]
whitespace - are you absolutely sure there is some whitespace involved? Maybe it would be more flexible to have it as any whitespace?
whitespace
probe/prints to be sure that they are completed?
parse-object: [
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" whitespace ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
]
parse-lot: [
(purchaseObjects: copy [])
any [
"<purchaseObjects>" parse-object to "</purchaseObjects>" "</purchaseObjects>" ; maybe here?
]
]
to "" "" and use to "<" instead, because you are adding additional computation if you use to "" instead of just to "<". Even more optimal is to #"<", which I'm not using with you, because you are confused enough even without knowing the difference between string and char.
thru and to in your code, and use something from the explained example above... here is what you want for <OKPD2>:
OKPD2-rule: [
<OKPD2> (c: n: none)
any [
any whitespace <code> copy c to "<" </code>
| any whitespace <name> copy n to "<" </name>
| any whitespace ahead </OKPD2> break
| skip
]
</OKPD2> (
print ["code:" c "name:" n]
)
]
file: {
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
<name>Foo222</name>
</OKPD2>
<OKPD2>
<name>Bar222</name>
<code>22.131.22</code>
</OKPD2>
<OKPD2><code>Bla</code><name/><foo>moo</foo></OKPD2>
}
parse file [ any [OKPD2-rule | skip] ]
to & thru, but you must trust the input data you parse... else you will fail if, for example, you have swapped <code> and <name>, as I have in the above test.
>> parse "<a><b/></a>" [thru <a> copy x to </a> position:]
== false
false, because the position of the input is not at tail!
>> position
== "</a>"
>> parse "<a><b/></a>" [thru <a> copy x to </a> position: </a> position2:]
== true
>> position2
== ""
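The difference between stopping before a marker and consuming it can be seen side by side. This is only a small sketch reusing the input from the console session above; nothing in it goes beyond the `to`, `thru` and `copy` keywords already discussed.

```red
Red []

data: "<a><b/></a>"

; `to` stops *before* the marker, so "</a>" is still unconsumed and
; parse reports false even though every rule matched
probe parse data [thru <a> copy x to </a>]             ; false

; consuming the closing tag explicitly brings the input to its tail
probe parse data [thru <a> copy x to </a> </a>]        ; true
probe x                                                ; "<b/>"

; `thru` moves past the marker, which has the same effect here
probe parse data [thru <a> copy x to </a> thru </a>]   ; true
```

In all three calls `x` holds the same text; only the final position of the input decides whether parse returns true.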
to end in your code? parse runs at the same speed when compiled.
position if my parse has separate rules like:
parse-object: [
any [
"<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" whitespace ( append purchaseObjects object [ code: c name: n ] )
"</OKPD2>" | skip
]
pos:
]
Script Error: pos has no value
pos has no value, because your rule always fails and never reaches the pos: part. I'm probably a bad teacher... better to do something I know.
formula: "2d6"
print parse formula [integer! "d" integer!]
>> print parse formula [integer! "d" integer!]
** Script error: PARSE - invalid rule or usage of rule: integer!
fast-lexer branch, but only on binary! series and only strict literal values are supported:
>> print parse to-binary "2 d 6" [integer! " d " integer!]
true
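For string input, the usual workaround is to match digit characters with a charset and convert afterwards. The following is a minimal sketch under that assumption; `digit`, `count`, `sides` and `ok?` are names made up for this illustration and do not come from the discussion.

```red
Red []

digit: charset "0123456789"
formula: "2d6"

; `some digit` matches one or more digit characters;
; `copy` captures the matched part as a string
ok?: parse formula [copy count some digit "d" copy sides some digit]

probe ok?                  ; true
probe to-integer count     ; 2
probe to-integer sides     ; 6
```

The captured values are strings, so the conversion to integer! happens outside the parse rule.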
some [ "<lot>" to "</lot>" | skip ]
skip. But without skip I would not be able to go to the next
some
[
thru "<lot>" thru "</lot>" to end
]
some\any with thru
false. While it should be true. I think something is wrong with to end, but I can't understand what exactly:
parse-lot: [
(purchaseObjects: copy [])
some [
thru "<purchaseObjects>" to "</purchaseObjects>" "</purchaseObjects>" to end ; I think problem is here
]
]
parse file [
(lots: copy [])
some
[
thru "<lot>" parse-lot "</lot>" to end
]
]
some, which is 1 or more .... try to use any, which is 0 or more. What if the first thru fails in there? Then the whole rule fails ....
pos:
thru "<purchaseObjects>" to "</purchaseObjects>" "</purchaseObjects>" to end pos:
>> pos
== ""
> some, which is 1 or more .... try to use any, which is 0 or more. What if the first thru fails in there? Then the whole rule fails ....
any? There should be at least 1 section with such a name
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
fail if some sections do not exist
"" to end
to continue with ... and it fails, as already being at the end ...
to end move it to the end of file? Not to the end of section?
""? Does it not jump to the end of file? For example I can have something like:
</purchaseObjects>
</someanothertag> ; <--
</lot>
</root>}
skip but it will skip the section if it is not found. But I should be 100% sure that it exists or return fail
false. My xml files are very complex and may have any sections in any place. For example, adding one section will break Oldes' example:
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<foo> ; <------------------------------------ only one foo make it return false
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
rule-object: [
ws <purchaseObject> [
ws <OKPD2> [
ws <code> copy *cd to "<" </code>
ws <name> copy *nm to "<" </name>
]
ws </OKPD2>
ws opt [<currency> thru </currency>] ;- currency is not used
ws <price> thru </price> ;- price is not used
( append purchases object [code: *cd name: *nm] )
]
ws </purchaseObject> ws
]
rule-lot: [
(purchases: copy [])
ws <maxPrice> copy *mp to "<" </maxPrice>
ws <purchaseObjects> any rule-object </purchaseObjects>
ws
]
parse file [
(data: none)
ws
<root> [
ws <id> copy *id to "<" </id>
ws <purchaseNumber> copy *pn to "<" </purchaseNumber>
ws <lot> rule-lot </lot>
(
data: object [
id: *id
lots: object [
maxPrice: *mp
purchaseObjects: purchases
]
]
)
]
ws
</root>
]
thru "<purchaseObjects>" to "</purchaseObjects>" "</purchaseObjects>" to end pos:
tag: [any ["<" thru ">"] | ws] ; skip any tag or space\linebreak
not_tag: [not any [ "" ]] ; tags that I **should not** skip
tag: [any ["<" thru ">"] | ws]
not sure it is correct. Just stop fearing the general skip! What if there is no white-space contained in the charset, but just a regular char?:
stack: clear []
parse file [some [
    "</" copy _ to #">" [if (_ = take stack) | reject]
    | #"<" copy _ to #">" (insert stack _)
    | skip
]]
== true
added:
...
== false
>> stack
== ["/lot" "foo" "root"]
["root"] ["id" "root"] ["root"] ["purchaseNumber" "root"] ["root"] ["lot" "root"] ["maxPrice" "lot" "root"] ["lot" "root"] ["purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["code" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["name" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["currency" "purchaseObject" "purchaseObjects" "lot" "root"] ["code" "currency" "purchaseObject" "purchaseObjects" "lot" "root"] ["currency" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["price" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["code" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["name" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["price" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["code" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["name" "OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["OKPD2" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] ["price" "purchaseObject" "purchaseObjects" "lot" "root"] ["purchaseObject" "purchaseObjects" "lot" "root"] 
["purchaseObjects" "lot" "root"] ["lot" "root"] ["root"] [] == true
stack: clear []
parse file [some [
    "</" copy _ to #">" [if (_ = take stack) (probe stack) | reject]
    | not "</" #"<" copy _ to #">" (insert stack _) (probe stack)
    | skip
]]
included):
...
["purchaseObjects" "lot" "foo" "root"]
["lot" "foo" "root"]
["foo" "root"]
== false
is last in file. Then the parse result is true but stack is not empty:
...
== true
>> stack
== ["foo"]
reject does not return immediately, IIUC. To remedy this:
stack: clear []
parse file [
some [
"</" copy _ to #">" [if (_ = first stack) (remove stack) | thru end]
| not "</" #"<" copy _ to #">" (insert stack _)
| skip
]
if (empty? stack)
]
== false
>> stack
== ["foo"]
to and thru seem simpler/faster to everyone while they are first starting with it (it certainly was for me), and everyone just needs to spend some target practice with their foot to eventually realize that they make things more brittle and harder to troubleshoot. Especially with something as complex as xml.
parse/trace in here, but no parse-trace? It provides its own callback, which is a bit verbose, but *extremely* useful for understanding what parse is doing.
To/Thru can be very helpful for certain formats, and chunking, but as soon as the markers can appear in content elsewhere, they lose their value. They can also be handy when doing exploratory parsing on large inputs. I've also used find and non-parse approaches there to good effect.
not does not advance.
>> parse "aa" [not "bb" (print "ok")]
ok
== false
>> parse "bb" [not "bb" (print "ok")]
== false
false.
>> rejoin parse "aabca" [collect some [not "b" keep skip | skip]]
== "aaca"
>> rejoin parse "aabca" [collect some ["b" | keep skip]]
== "aaca"
not is negative lookahead, i.e. like an excluding condition: not a b means "b except a". :)
not inverts the rule. But how to make it advance? not does not advance and correct rules do, right.
not know how far to advance? Let's say you have parse "abcdefgh" [not #"z"]. Where do you expect the not rule to stop?
> do https://raw.githubusercontent.com/rebolek/red-tools/master/codecs/xml.red
> data: xml/decode file
file: {<root>
<id>19160099</id>
<purchaseNumber>0373200101018000262</purchaseNumber>
<lot>
<maxPrice>8186313.66</maxPrice>
<purchaseObjects>
<purchaseObject>
<OKPD2>
<code>11.131.11</code>
<name>Foo111</name>
</OKPD2>
<currency>
<code>666</code>
</currency>
<price>111</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>22.12.55</code>
<name>Bar222</name>
</OKPD2>
<price>222</price>
</purchaseObject>
<purchaseObject>
<OKPD2>
<code>33.322.41</code>
<name>Baz333</name>
</OKPD2>
<price>333</price>
</purchaseObject>
</purchaseObjects>
</lot>
</root>}
do https://raw.githubusercontent.com/rebolek/red-tools/master/html-tools.red, you'll get the foreach-node function and then you can:
>> foreach-node xml/decode {<parent><child id="1">Bob</child><child id="2">Alice</child></parent>} [print [tag mold content attributes]]
parent [child [none "Bob" #()] #(
"id" "1"
) child [none "Alice" #()] #(
"id" "2"
)]
child [none "Bob" #()] "id" "1"
none "Bob"
child [none "Alice" #()] "id" "2"
none "Alice"
>> foreach-node xml/decode {<parent><child id="1">Bob</child><child id="2">Alice</child></parent>} [if tag = 'child [print [second content select attributes "id"]]]
Bob 1
Alice 2
tn: function[tag] [ parse tag [thru "<" copy tag_name to ">" ] tag_name ]
extract_data: function[tag] [ parse tag [ mytag_name: tn tag (print tag_name) ] ]
mytag_name here will work as a pointer, and will not receive the value from tn
parse dialect with Red code.
parse dialect?
parse "abcd" [copy x to #"c"] print x
parse dialect with Red code. And to be honest, I cannot read your extract_data function. I don't understand what you want to achieve.
parse function, then use its result in another parse expression
parse with own rules\words?
skip implemented. Where is it placed in the source tree? https://github.com/red/red
(mycode) ?
parse [mydata] [ set argument any [word! | integer! | paren! (x: do argument) insert x] ]
>> parse b: [start (probe 3 * 5) end] [any [word! | integer! | p: paren! (x: do first p) insert x skip]] b
15
== [start (probe 3 * 5) 15 end]
>> parse b: [start (3 * 5) end] [any [word! | set arg paren! insert (do arg)]]
== true
>> b
== [start (3 * 5) 15 end]
change:
>> parse b: [start (3 * 5) end] [any [word! | change set arg paren! (do arg)]] b
== [start 15 end]
>> parse b: [start (3 * 5) end] [any [word! | change s: paren! (do first s)]] b
== [start 15 end]
insert, it is better to do it so:
>> parse b: [start (3 * 5) end] [any [word! | p: paren! insert (do first p)]] b
== [start (3 * 5) 15 end]
x in parens:
parse b: [start (3 * 5) end] [any [word! | p: paren! (x: do first p) insert (x)]]
*** Script Error: PARSE - invalid rule or usage of rule: 15
>> parse b: [start (2 * 3) a a a a a a end] [any [word! | p: paren! (x: do first p) insert x 'a]]
== true
>> b
== [start (2 * 3) 6 a a a a a a end]
15 skip but as there are not so many elements, it fails. If you leave out skip you'll get an "Invalid rule" error, because an integer rule needs something to loop over.
>> parse b: [start (3 * 5) end] [any [word! | p: paren! q: (x: do first p) (r: insert q x) :r]] b
== [start (3 * 5) 15 end]
>> parse b: [start (3 * 5) end] [any [word! | p: paren! (change p do first p)]] b
== [start 15 end]
true if every section exists and is parseable, plus the data from them, and false if any of them is not found.
whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
parse file [
(
lots: copy []
id: none
purchaseNumber: none
responsibleOrg_regNum: none
responsibleOrg_inn: none
responsibleOrg_kpp: none
responsibleOrg_fullName: none
responsibleOrg_postAddress: none
responsibleRole: none
)
thru "<id>" copy _id to "</id>" (id: _id)
thru "<purchaseNumber>" copy _purchaseNumber to "</purchaseNumber>" (purchaseNumber: _purchaseNumber)
thru "<responsibleOrg>"
thru "<regNum>" copy _responsibleOrg_regNum to "</regNum>" (responsibleOrg_regNum: _responsibleOrg_regNum)
thru "<inn>" copy _responsibleOrg_inn to "</inn>" (responsibleOrg_inn: _responsibleOrg_inn)
thru "</responsibleOrg>"
some [
(
lot: copy []
maxPrice: none
)
thru "<lot>"
thru "<maxPrice>" copy _price to "</maxPrice>" (maxPrice: _price )
some [
thru "<purchaseObjects>"
(purchaseObjects: copy [])
some [
thru "<OKPD2>"
thru "<code>" copy c to "</code>" thru "<name>" copy n to "</name>" "</name>" ( append purchaseObjects object [ code: c name: n ] ) | ws
"</OKPD2>"
]
thru "</purchaseObjects>" (print "Hello")
]
( append lot object [ price: maxPrice Objects: purchaseObjects])
thru "</lot>"
( append lots object [
id: _id
purchaseNumber: _purchaseNumber
responsibleOrg_regNum: _responsibleOrg_regNum
responsibleOrg_inn: _responsibleOrg_inn
lots: lot
]
)
] to end
]
write %file.txt to-json lots
thru "<purchaseNumber>" copy _purchaseNumber to "</purchaseNumber>" (purchaseNumber: _purchaseNumber)
thru "<purchaseNumber>" copy purchaseNumber to "</purchaseNumber>"
skip to prevent the error :)opening-header_random-data-and-lenght ---- header1 [a-random-name] ---- random-stuff-here ending-formula ---- header2 [a-random-name] ---- random-stuff-here ending-formula ---- header3 [a-random-name] ---- random-stuff-here ending-formula ---- header2 [a-random-name] ---- random-stuff-here ending-formula ---- header2 [a-random-name] ---- random-stuff-here [a-random-name] ending-formula ---- header1 [a-random-name] ---- random-stuff-here ending-formula ---- header1 [a-random-name] ---- random-stuff-here ending-formula ---- header2 [a-random-name] ---- random-stuff-here ending-formula ---- header3 [a-random-name] ---- random-stuff-here ending-formula ---- header4 [a-random-name] ---- random-stuff-here ending-formula ---- header4 [a-random-name] ---- random-stuff-here ending-formula ---- header4 [a-random-name] ---- random-stuff-here ending-formula ---- header5 [a-random-name] ---- random-stuff-here ending-formula ---- header6 [a-random-name] ---- random-stuff-here ending-formula ---- header7 [a-random-name] ---- random-stuff-here ending-formula ---- header6 [a-random-name] ---- random-stuff-here ending-formula ---- header6 [a-random-name] ---- random-stuff-here ending-formula ---- header7 [a-random-name] ---- random-stuff-here ending-formula ---- header6 [a-random-name] ---- random-stuff-here ending-formula ---- header5 [a-random-name] ---- random-stuff-here ending-formula closing formula
parse/all file-to-analyze [ some [ header-marker1 to ending-marker | header-marker2 to ending-marker | header-marker3 to ending-marker | header-marker4 to ending-marker | header-marker5 to ending-marker | header-marker6 to ending-marker | header-marker7 to ending-marker | skip ] copy script (code to be added here) ]
[a-random-name] means: two brackets with one or more words inside.
[a-rand-name_content header-type content-of-blocks_including-headers-and-footer] (do not consider the previous one)
[
dotes check-constraint {
-- =======================================================
-- Check Constraint Nochecks's for Table: [dbo].[DOTes]
-- =======================================================
Print 'Check Constraint Nochecks''s for Table: [dbo].[DOTes]'
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_AccontoPerc]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_Cambio]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_Cd_DoSottoCommessa]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_Cd_LS_2]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DoTes_Cd_LS_C]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_DataDocRif]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DoTes_DataPag]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_IvaSplit_IvaSospesa]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_NumeroDocRif]
ALTER TABLE [dbo].[DOTes] NOCHECK CONSTRAINT [CK_DOTes_ScontoCassa]
GO
-- TRANSACTION HANDLING
IF @@ERROR<>0 OR @@TRANCOUNT=0 BEGIN IF @@TRANCOUNT>0 ROLLBACK SET NOEXEC ON END
GO
}
]
-- =======================================================
-- Trigger Disable's for Table: [dbo].[DOTes]
-- =======================================================
Print 'Trigger Disable''s for Table: [dbo].[DOTes]'
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_U1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_D1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [DOTes_atrg_brd]
GO
-- TRANSACTION HANDLING
IF @@ERROR<>0 OR @@TRANCOUNT=0 BEGIN IF @@TRANCOUNT>0 ROLLBACK SET NOEXEC ON END
GO
-- =======================================================
-- Trigger Disable's for Table: [dbo].[DORig]
-- =======================================================
Print 'Trigger Disable''s for Table: [dbo].[DORig]'
ALTER TABLE [dbo].[DORig] DISABLE TRIGGER [xDoRig_BOW_D1]
ALTER TABLE [dbo].[DORig] DISABLE TRIGGER [DORig_atrg_brd]
GO
-- TRANSACTION HANDLING
IF @@ERROR<>0 OR @@TRANCOUNT=0 BEGIN IF @@TRANCOUNT>0 ROLLBACK SET NOEXEC ON END
GO
Trigger Disable's = header-type (there are 6 other types like Check Constraints, etc.)
a-rand-name_Content = DOTes, taken from: [dbo].[DOTes]
content-of-blocks_including-headers-and-footer =
-- =======================================================
-- Trigger Disable's for Table: [dbo].[DOTes]
-- =======================================================
Print 'Trigger Disable''s for Table: [dbo].[DOTes]'
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_U1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_D1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [DOTes_atrg_brd]
GO
-- TRANSACTION HANDLING
IF @@ERROR<>0 OR @@TRANCOUNT=0 BEGIN IF @@TRANCOUNT>0 ROLLBACK SET NOEXEC ON END
GO
[
a-rand-name_Content header-type content-of-blocks_including-headers-and-footer
a-rand-name_Content header-type content-of-blocks_including-headers-and-footer
a-rand-name_Content header-type content-of-blocks_including-headers-and-footer
...
]
out: parse read %input.txt [
collect some [
thru [s: "^--- =" thru "-- "]
copy header-type to " for Table:"
thru "].["
keep to #"]"
keep (header-type)
2 thru ["^-GO"] e:
keep (copy/part s e)
]]
foreach [name type content] out [
print to-word lowercase replace/all name " " "-"
print to-word lowercase replace/all type " " "-"
print content print ""
]
dotes
trigger-disable's
-- =======================================================
-- Trigger Disable's for Table: [dbo].[DOTes]
-- =======================================================
Print 'Trigger Disable''s for Table: [dbo].[DOTes]'
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_U1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [xDoTes_BOW_D1]
ALTER TABLE [dbo].[DOTes] DISABLE TRIGGER [DOTes_atrg_brd]
GO
-- TRANSACTION HANDLING
IF @@ERROR<>0 OR @@TRANCOUNT=0 BEGIN IF @@TRANCOUNT>0 ROLLBACK SET NOEXEC ON END
GO
...
out: parse read %input.txt [
    collect some [
        thru [s: "^--- =" thru "-- "]
        copy type to " for Table:"
        thru "].["
        copy name to #"]"
        keep (to-word lowercase replace/all name " " "-")
        keep (to-word lowercase replace/all type " " "-")
        2 thru ["^-GO"] e:
        keep (copy/part s e)
    ]]
foreach [name type content] out [print [name type] print content print ""]
new-line/all/skip out true 3
>> probe out
[
dotes trigger-disable's {^--- =======================================================^/^--- Trigger Disable's for Table: ...
dotes foreign-key-constraint-nochecks's {^--- =======================================================^/^--- Foreign ...
dotes check-constraint-nochecks's {^--- =======================================================^/^--- Check Constraint...
...
]
Newline is a new instruction to me
thru [s: here you mark the input to the starting point...
thru "-- "] copy type to " for Table:" thru [
out: clear []
parse read %input.txt [
some [
to "^--- =" s: thru "^/^--- "
copy type to " for Table:"
thru "].["
copy name to #"]"
(append out to-word lowercase replace/all name " " "-")
(append out to-word lowercase replace/all type " " "-")
2 thru "^-GO"
e: (append out copy/part s e)
| skip
]]
| skip in the end isn't actually needed.
x: "<id>123</id>"
tag_name: parse x ["<" collect keep to ">" to end]
parse_rule: rejoin ["thru <" tag_name "> to </" tag_name ">" ]
>> parse "<id>123</id>" [#"<" copy tag-name to #">" thru "</" tag-name #">"]
== true
collect keep with copy, you're getting one value. And making a parse rule with rejoin is wrong. You're constructing code, not a string.
>> x: "<id>123</id>"
== "<id>123</id>"
>> tag_name: parse x ["<" collect keep to ">" to end]
== ["id"]
>> parse_result: rejoin ["thru <" tag_name "> copy " tag_name " to </" tag_name ">" " </" tag_name ">" ]
== "thru <id> copy id to </id> </id>"
parse x [#"<" copy tag-name to #">"] directly
load that string to get the actual parse rule. Why not generate the parse rule directly? compose or reduce (or any other method).
> >> parse "<id>123</id>" [#"<" copy tag-name to #">" thru "</" tag-name #">"]
> == true
some thru and then it simply jumps to the end?
>> parse "aabbccaagg" [some thru "aa"]
== true
>> parse "aabbccaagg" [some [thru "aa"]]
== false
>> parse "aabbccaagg" [some thru [x: (print ["x:" x]) "aa" y: (print ["y:" y])] z: (print ["z:" z])]
x: aabbccaagg
y: bbccaagg
x: bbccaagg
x: bccaagg
x: ccaagg
x: caagg
x: aagg
y: gg
x: gg
x: g
x:
z:
== true
>> parse "aabbccaagg" [some [thru [x: (print ["x:" x]) "aa" y: (print ["y:" y])]] z: (print ["z:" z])]
x: aabbccaagg
y: bbccaagg
x: bbccaagg
x: bccaagg
x: ccaagg
x: caagg
x: aagg
y: gg
x: gg
x: g
x:
z: gg
== false
>> parse "aabbccaagg" [some thru ["aa" y:] z: if (z = y)]
== false
>> parse "aabbccaagg" [some thru ["gg" y:] z: if (z = y)]
== true
if inside parse. At least how to use it for branching
if:
>> parse "afirstbsecondcthird" [
some [s:
if (find "ac" s/1) 6 skip
| if (s/1 = #"b") 7 skip
]]
== true
>> parse "afirstbsecondcthird" [
some [
["a" | "c"] 5 skip
| "b" 6 skip
]]
== true
> collect keep with copy, you're getting one value. And making a parse rule with rejoin is wrong. You're constructing code, not a string.
thru will skip only the first letter in the first case?
>> parse "<id>123</id>" [thru "<" "id" ">" copy tag_name to "</" "id" ">" thru "</" "id" ">" ]
== false
>> parse "<id>123</id>" [thru "<id>" copy tag_name to "</id>" thru "</id>" ]
== true
copy tag_name to """" <> "id"
to "" will set it before "</". So why are you saying that I need skip "</"? I think I need to "" "" (but it does not work))
to "" "" should work. skip takes an integer argument **before** itself, so something like 2 skip.
> parse "<id>123</id>" [thru "<" "id" ">" copy tag_name to "</" "</" "id" ">" thru "</" "id" ">" ]
== false
> to "" "" should work. skip takes an integer argument **before** itself, so something like 2 skip.
>> parse "<id>123</id>" [thru "<" "id" ">" copy tag-name to "</" "</" "id" ">"]
== true
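Putting the two pieces together (stop before the closing tag with `to`, then consume it explicitly), a single rule can capture both the tag name and its content. This is only a sketch for a flat, attribute-free element; `content` and `ok?` are hypothetical names introduced here, while reusing the copied `tag-name` word as a rule follows the earlier console example.

```red
Red []

data: "<id>123</id>"

ok?: parse data [
    #"<" copy tag-name to #">" skip     ; tag-name: "id", then step over ">"
    copy content to "</"                ; stop right before the closing tag
    "</" tag-name #">"                  ; consume "</id>" so the input reaches its tail
]

probe ok?          ; true
probe tag-name     ; "id"
probe content      ; "123"
```

Because `tag-name` is looked up when the rule reaches it, the closing tag is checked against whatever name the opening tag had.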
copy or creating words *always* make the word global?
parse "123" [ thru ">" copy x to "<" to end] ; x is not global
parse operation, then yes.
>> f: func[/local x][parse "123" [copy x to end] probe x]
== func [/local x][parse "123" [copy x to end] probe x]
>> f
"123"
== "123"
>> x
*** Script Error: x has no value
*** Where: catch
*** Stack:
>> c: context [x: none parse "123" [copy x to end] probe x]
"123"
== make object! [
x: "123"
]
>> x
*** Script Error: x has no value
*** Where: catch
*** Stack:
function construct, which normally collects set-words and makes them _local_.
>> f: function[][y: 1 parse "123" [copy x to end] probe x] f
"123"
== "123"
>> y
*** Script Error: y has no value
*** Where: catch
*** Stack:
>> x
== "123"
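Two ways of keeping such words from leaking can be sketched as follows: declaring them explicitly after /local, or holding the rule and its words in a context. The names `grab`, `parser`, `rule` and `run` are made up for this illustration.

```red
Red []

; explicit /local: `function` collects set-words only, so the word set
; by parse's `copy` has to be declared by hand
grab: function [data [string!] /local x][
    parse data [copy x to end]
    x
]

; alternative: keep the rule and the words it sets inside a context
parser: context [
    x: none
    rule: [copy x to end]
    run: func [data [string!]][parse data rule x]
]

probe grab "123"           ; "123"
probe parser/run "456"     ; "456"
```

In both variants the `x` that parse writes to is bound locally, so a global `x` is neither created nor overwritten by these calls.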
context [a: b: none parse [... copy a ... set b ...] ] is useful in that case.
context [a: b: none rules: [... copy a ... set b ... ] parser: func [data] [parse data rules]] is preferred to parser: func [data /local a b rules][rules: [... copy a ... set b ... ] parse data rules]
closure should simplify this.
>> about
Red 0.6.4 for Windows built 7-Mar-2020/6:07:29+02:00 commit #87d8f52
>> non: none parse [a]['a non]
== false
>> non: [none] parse [a]['a non]
== true
>> about
REBOL/Core 2.7.8.3.1 1-Jan-2011
>> non: none parse [a]['a non]
== true
>> non: [none] parse [a]['a non]
== true
none example in the docs, it's all there.
R2:
>> x: 'skip
== skip
>> parse [a b] [x x]
== true
Red:
>> x: 'skip
== skip
>> parse [a b] [x x]
*** Script Error: PARSE - invalid rule or usage of rule: skip
*** Where: parse
*** Stack:
>> n: none
== none
>> parse [a #[none]]['a n]
== true
>> parse [a #[none]]['a #[none]]
== true
>> parse [a #[none]]['a none #[none]]
== true
>> parse [a #[none]]['a [none] #[none]]
== true
skip above and signals incorrect error, but some rules can be used that way.
>> foo: quote (print "bar")
== (print "bar")
>> parse [a b c][3 word! foo]
bar
== true
non: none be treated as a none value or none keyword?
: is covered by "Values on the following datatypes" [here](https://doc.red-lang.org/en/parse.html#_parse_rules). Which means that you can do stuff like ↑ paren! above and more:
>> mark: quote mark:
== mark:
>> parse [a b c][2 word! mark to end]
== true
>> mark
== [c]
>> x: 2 y: 'x z: 'char! t: char!
== char!
>> parse [x x] [x y]
*** Script Error: PARSE - invalid rule or usage of rule: x
*** Where: parse
*** Stack:
>> parse [x x] [x z]
== false
>> parse [x x] [x t]
== false
>> parse [x x] [2 t]
== false
>> parse [x x] [2 z]
== false
char!?
word! though ;)
>> x: 2 y: 'x z: 'word! t: word!
>> parse [x x] [2 t]
== true
>> parse [x x] [x t]
== true
[x y] example shows. And maybe something else as well
>> x: 2 y: 'x z: 'word! t: word!
>> parse [x x] [x y]
*** Script Error: PARSE - invalid rule or usage of rule: x
*** Where: parse
*** Stack:
>> parse [x x] [x z]
== false
z and y are evaluating to words
parse vs xpath? I understand that parse is more powerful, but how to summarize its advantages? Like 1... 2... 3... ?
>> xml: [a [b node]]
== [a [b node]]
>> xml/a/b
== node
>> first parse xml [collect ['a into ['b keep skip]]]
== node
parse the Crown Jewel of Rebol, but I don't think you can understand its true power without block parsing and all the datatypes.
collect, and also make block!. An approach I'm experimenting with (while scanning a string, and extracting things into a collect) is to put a [ when I want to start a new block, and ] when I want to end the block. But so far none of the ways I've tried have been successful. I tried keep (#"[") and keep ('[). So am I right to conclude the only way to create a block is with collect or make block!? I guess another alternative is to insert [ and ] directly into the string I'm parsing, then load the entire thing after it's done parsing.
load form [ my-word #"[" 1 2 3 #"]" ]. This is how I could do it if I keep (#"[") and keep (#"]") while collecting. Think I answered my own question.
collect in parse:
>> parse [1 2 3][collect [keep integer! collect [some keep integer!]]]
== [1 [2 3]]
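The same nesting idea should carry over to string input, which is the case asked about above. The sketch below is an assumption-laden illustration, not something taken from the discussion: the input "ab,cd" and the word `result` are invented, and the intent is that each inner `collect` opens a new sub-block instead of bracket characters being kept by hand.

```red
Red []

; each inner collect starts a new block inside the outer one
result: parse "ab,cd" [
    collect [
        collect [keep to ","] skip      ; first sub-block, then step over ","
        collect [keep to end]           ; second sub-block
    ]
]

probe result
```

With a top-level `collect`, parse returns the collected block rather than a logic value, so `result` is the nested structure itself.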
some [
(lot: copy [])
thru "<lot>"
thru "<lotNumber>" copy _lotNumber to "</lotNumber>" (lotNumber: _lotNumber )
thru "<maxPrice>" copy _price to "</maxPrice>" (price: _price )
(print _lotNumber)
thru "<OKPD2>"
thru "<code>" copy _code to "</code>" (code: _code)
thru "<name>" copy _name to "</name>" (name: _name)
thru "</OKPD2>"
thru "</lot>"
(append lot object [lot_number: lotNumber maxPrice: price _code: code _name: name ])
(append _lots lot)
]
some become true. Is there any way to make some true only if every condition inside the block is true?
some [ ... ] will ONLY execute if everything matches.
...| rules here, so if I understand the question, that should be the case, but remember that thru is greedy, so your last thru "" might skip a whole bunch of data. Also, if you get at least one match for all those rules, that indicates success for the opening some rule.
thru is success (every thru tag exists: "" "" "" etc.).
or expression (|) to make it look like:
thru "<lot>" |
thru "<lotNumber>" copy _lotNumber to "</lotNumber>" (lotNumber: _lotNumber ) |
thru "<maxPrice>" copy _price to "</maxPrice>" (price: _price ) ||"ahead to check if the text ahead has required structure. Like this:ws: charset " ^/^-" ws*: [any ws] guard: [ ws* ahead [ "<lot>" ws* "<lotNumber>" any [not "</lotNumber>" skip] "</lotNumber>" ws* "<maxPrice>" any [not "</maxPrice>" skip] "</maxPrice>" ws* "<OKPD2>" ws* "<code>" any [not "</code>" skip] "</code>" ws* "<name>" any [not "</name>" skip] "</name>" ws* "</OKPD2>" ws* "</lot>" ] ]
lots: clear [] lot: clear []
_lotNumber: _price: _code: _name: none
rule: [
some [
(clear lot)
thru "<lot>"
thru "<lotNumber>" copy _lotNumber to "</lotNumber>"
thru "<maxPrice>" copy _price to "</maxPrice>"
thru "<OKPD2>"
thru "<code>" copy _code to "</code>"
thru "<name>" copy _name to "</name>"
thru "</OKPD2>"
thru "</lot>"
(append lot object [lot_number: _lotNumber maxPrice: _price code: _code name: _name ])
(append lots lot)
]
]
text: {
<lot>
<lotNumber>1</lotNumber>
<maxPrice>10</maxPrice>
<OKPD2>
<code>c</code>
<name>ABC</name>
</OKPD2>
</lot>
<lot>
<lotNumber>2</lotNumber>
<maxPrice>20</maxPrice>
<OKPD2>
<code>d</code>
<name>XYZ</name>
</OKPD2>
</lot>
}parse text [(clear lots) some [guard rule]]
probe lots
[make object! [
lot_number: "1"
maxPrice: "10"
code: "c"
name: "ABC"
] make object! [
lot_number: "2"
maxPrice: "20"
code: "d"
name: "XYZ"
]]rule: [ (clear lot) thru "<lot>" thru "<lotNumber>" copy _lotNumber to "</lotNumber>" thru "<maxPrice>" copy _price to "</maxPrice>" thru "<OKPD2>" thru "<code>" copy _code to "</code>" thru "<name>" copy _name to "</name>" thru "</OKPD2>" thru "</lot>" (append lots object [lot_number: _lotNumber maxPrice: _price code: _code name: _name ]) ] parse text [(clear lots) some [guard rule | skip]]
s: to guard:guard: [
ws* ahead [
s: "<lot>" ws*
s: "<lotNumber>" any [not "</lotNumber>" skip] "</lotNumber>" ws*
s: "<maxPrice>" any [not "</maxPrice>" skip] "</maxPrice>" ws*
s: "<OKPD2>" ws*
s: "<code>" any [not "</code>" skip] "</code>" ws*
s: "<name>" any [not "</name>" skip] "</name>" ws*
s: "</OKPD2>" ws*
s: "</lot>"
]
]skipping: no error: [opt [if (not any [skipping empty? s]) (print ["Problem at: " probe copy/part s 30 "..."] skipping: yes)]] parse text [(clear lots) some [guard rule (skipping: no) | error skip]]
thru with a block of options. That way it might return the closest match, but not sure right now, would have to try ...
thru although we tried to explain to you that it is a bad way to go!
and in but in future I may find a case where I will also need to extract "<currency>". But any other part of the doc will have another structure. For example "lotNumber" may be named lotPosition or it may be absent. Same with any other field. It's better to say that I am trying to write a parser for a "floating doc structure".
So I need to prevent regressions for every case. I can add some code to a rule only if I am totally sure that this case is possible and I can test that everything works.
So I found only one way: to write rules for every case I find, and if two or more rules are valid, to take the result that has more collected data than the others %)
to and thru. And if your data are so variable, I would probably use some general solution.
>> text: "Here is some text to parse."
== "Here is some text to parse."
>> parse text [any [thru ["parse" m: (print [3 m]) | "is" m: (print [1 m]) | "text" m: (print [2 m])]]]
1 some text to parse.
2 to parse.
3 .
lot [
lotNumber [1]
maxPrice [10]
OKPD2 [
code [c]
name [ABC]
]
]
lot [
lotNumber [2]
maxPrice [20]
OKPD2 [
code [d]
name [XYZ]
]
]parse-lots: function [ text ] [
letter: charset [ #"a" - #"z" #"A" - #"Z" ]
letters: [ some letter ]
number: charset [ #"0" - #"9" ]
numbers: [ some number ]
end-tag: [ "</" some [letters | numbers] ">" ]
start-tag: [ "<" copy tag-name [some [letters | numbers]] ">" ]
parse text [
any [
change end-tag "]"
| change start-tag (rejoin [tag-name " ["])
| skip
]
]
print text
comment {
1. Substitute print for load if you want to return a block of data.
2. If you return a block of data, you could make functions for each tag.
3. Each function doing something with the data passed to it.
ex: lot: function [ data ] [ some code that will create an object from the data arg]
lotNumber: function [ data ] [ ]
OKPD2: function [ data ] [ reduce data ]
code: function [ data ] [ data ]
name: function [ data ] [ data ]
}
]
change) will not result in the fastest solution if the source data are huge. But yes... I was already recommending that he should first build an intermediate structure.
do would create said objects:
parse-lots: function [ text ] [
letter: charset [ #"a" - #"z" #"A" - #"Z" ]
letters: [ some letter ]
number: charset [ #"0" - #"9" ]
numbers: [ some number ]
end-tag: [ "</" some [letters | numbers] ">" ]
start-tag: [ "<" copy tag-name [some [letters | numbers]] ">" ]
parse text [
any [
remove {<OKPD2>}
| remove {</OKPD2>}
| change "<lot>" "make object! ["
| change "</lot>" "]"
| change end-tag "}"
| change start-tag (rejoin [tag-name ": {"])
| skip
]
]
load text
]make object! [
lotNumber: {1}
maxPrice: {10}
code: {c}
name: {ABC}
]
make object! [
lotNumber: {2}
maxPrice: {20}
code: {d}
name: {XYZ}
]>> parse-lots "<lot><name>a</name></lot>"
== [make object! [name: "a"]]
>> parse-lots "<lot><name>{</name></lot>"
*** Syntax Error: (line 1) invalid string at ]
*** Where: transcode
*** Stack: parse-lots loadparse/all. Is there some equivalent in red parse? Otherwise I have to stick a lot of extra any space, opt space etc... everywhere, turning elegant and clean rules into something with a lot of visual noise._: [any space]if (1 = 2)>> parse "aabbcc" [copy x to "bb" if (1 = 1) "bbcc" ] == true >> parse "aabbcc" [copy x to "bb" if (1 = 2) "bbcc" ] == false
thru "<value>" copy _value to "</value>"
if (_value = "false") [reject] ; stop parsing return false>> parse "<value>false</value>" [thru "<value>" copy _ to "</value>" if (_ <> "false") to end] == false >> parse "<value>test</value>" [thru "<value>" copy _ to "</value>" if (_ <> "false") to end] == true
to end of course. I just wanted to show that it stops there as you asked.reject is irrelevant with if:>> parse [] [] == true >> parse [] [reject] == false
>> parse [There should be no integer here] [any [integer! reject | any-type!]] == true >> parse [There should be 0 integer here] [any [integer! reject | any-type!]] == false
reject can be tricky (#3478). Playing with it I noticed that a construct I expected to behave identically, does not:>> status: false parse [There should be 0 integer here] [any [set s skip opt [if (integer? s) reject]]] == true >> status: false parse [There should be 0 integer here] [any [set s skip [if (integer? s) reject |]]] == false
status: false is superfluous. It remained from earlier experiments.
[any [integer! reject | any-type!]] => [any [not integer! skip]] is enough
>> parse [There should be integer here 0] [any [integer! reject | any-type!] p:]
== true
>> ? p
P is a block! value. length: 0 index: 7 []
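A small sketch of the `not integer! skip` alternative mentioned above, wrapped in a helper so the three inputs from this discussion can be compared side by side. The `no-integers?` name is made up for this example.

```red
Red []

; Succeeds only if the whole block can be consumed without meeting an integer.
no-integers?: func [data [block!]][
    parse data [any [not integer! skip]]
]

probe no-integers? [There should be no integer here]   ; true
probe no-integers? [There should be 0 integer here]    ; false
probe no-integers? [There should be integer here 0]    ; false
```

Here `any` simply stops at the first integer, so `parse` fails because the input was not consumed to the end, and no `reject` is needed.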
print statement causing a stack overflow there?
>> rule: [quote 1]
== [quote 1]
>> parse [1] [some rule]
== true
>> rule: [quote 1 (print rule)]
== [quote 1 (print rule)]
>> parse [1] [some rule]
*** Internal Error: stack overflow
*** Where: print
*** Stack:
>> rule: [print rule]
== [print rule]
>> do rule
*** Internal Error: stack overflow
*** Where: print
*** Stack:
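One way to read the overflow: `print` reduces a block argument before printing it, so `print rule` evaluates the `(print rule)` paren inside the block again, and so on without end. A minimal sketch that prints the rule without evaluating it, using `mold` to turn the block into a string first:

```red
Red []

rule: [quote 1 (print mold rule)]   ; mold gives a string, so nothing is re-evaluated

ok?: parse [1] [some rule]          ; prints: [quote 1 (print mold rule)]
probe ok?                           ; true
```

`probe rule` would work in the same spot, since `probe` also shows the value without reducing it.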
whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
d: {
<apps>
<app>
<id>1</id>
<good>true</good>
</app>
<app>
<id>2</id>
<good>true</good>
</app>
<app>
<id>3</id>
<good>true</good>
</app>
<app>
<id>4</id>
<foo>foo can be optional</foo>
<rejected>true</rejected>
</app>
</apps>
}[
{id: 1, good: true},
{id: 2, good: true},
{id: 3, good: true},
{id: 4, good: false}
]appsome [
[
thru "<app>"
thru "<id>" copy _id to "</id>"
thru "<good>" copy _good to "</good>"
]
|
[
thru "<app>"
thru "<id>" copy _id to "</id>"
thru "<rejected>" copy _rejected to "</rejected>"
]
thru "</app>"
]thru if you don't have different data. Something like this should work:parse b [collect [ <apps> some [ <app> some [ <id> keep integer! </id> | <good> keep word! </good> | <rejected> keep word! </rejected> | skip ] </app> ] </apps>] ] ; == [1 true 2 true 3 true 4 true]
[app id], those can be a single, shared rule.parse? I tried a couple different approaches but not elegant and not specific enough. Here is one example:program: does [
CST: parse source-code [
collect [
any [
A
| B
| C
| D
| failure: (print [ "Parser error. Some problem here >>" failure ] quit)
]
]
]expect "if" or expect "end" where unless it meets the expectation, an error is produced to point that out.parse?[(error: "your input must look like xxx" error?: true) the-actual-rule (error?: false)]
failure rule can look like:[failure: (if error? [print ["Error:" error "at" failure]])]
error: [p: (do make error! rejoin ["Parser error. Some problem here >> " mold/part p 100])]parse ... [... A [B | error] opt [C [D | error]]...]parse ... [... A [B | (expected "B")] opt [C [D | (expected "D")]]...] where expected is your own "complaining" function(foo (bar (baz))) is valid whereas (bar() is invalid?>> parse "(bar())" rule: ["(" while [end (error) | ")" break | rule | skip]]
== true
>> parse "(bar()" rule
*** Script Error: error has no value
*** Where: parse
*** Stack:>> parse "(bar()))" rule: ["(" while [end (error) | ")" break | rule | skip]]
== falsenested-curly-braces: [
(cnt: 1)
any [
c-comment
| #"{" (cnt: cnt + 1)
| __end: #"}" if (zero? cnt: cnt - 1) break
| not end skip
]
]
rule-action: [
#"{" __start: nested-curly-braces (
if not zero? cnt [
missing: form pick "}{" positive? cnt
cause-error 'syntax 'missing [missing __start]
]
)
]false though.paren: [#"(" paren-items #")"]
paren-items: [any [ws | some word-chars | paren]]
ws: [space | tab | new-line | crlf]
word-chars: charset [#"a" - #"z"]parse "(bar())" parens: [#"(" any [parens | not #")" skip] #")"]parens: [#"(" any [parens | [escape | not #")"] skip] #")"] ()
escape: #"\" parse "(bar(\)))" parens
;== true
parse "(bar(\))))" parens
;== false
escape: #"^^" parse %{(bar(^)))}% parens
;== true
parse %{(bar(^))}% parens
;== false
parse %{(bar(^))))}% parens
;== falseerr: func [msg][cause-error 'user 'message rejoin msg]
rule: [any [
s: [escape | not [#"(" | #")"]] skip
| ahead #")" (err ["Unmatched closing paren at " mold s "!"])
| parens
]]
parens: [
#"(" any [parens | [escape | not #")"] skip]
[#")" | end (err ["Unclosed paren at " mold s "!"])]
]>> parse %{foo (bar()) baz}% rule
== true
>> parse %{foo (bar()) (baz}% rule
*** User Error: Unclosed paren at "(baz"!
*** Where: do
*** Stack: err cause-error
>> parse %{foo (bar())) baz}% rule
*** User Error: Unmatched closing paren at ") baz"!
*** Where: do
*** Stack: err cause-error
>> parse %{foo (bar())^) baz}% rule
== truetranscode to check your strings for balanced parens:>> transcode/trace %{a(b(c)d)e}% func [e i t l o][[open close] print [e mold i]]
open "(b(c)d)e"
open "(c)d)e"
close ")d)e"
close ")e"
== [a (b (c) d) e]
>> transcode/trace %{a(b((c)d)e}% func [e i t l o][[open close] print [e mold i]]
open "(b((c)d)e"
open "((c)d)e"
open "(c)d)e"
close ")d)e"
close ")e"
*** Syntax Error: (line 1) missing ) at (b((c)d)e
*** Where: transcode
*** Stack:
>> transcode/trace %{a(b(c)d))e}% func [e i t l o][[open close] print [e mold i]]
open "(b(c)d))e"
open "(c)d))e"
close ")d))e"
close "))e"
close ")e"
*** Syntax Error: (line 1) missing ( at )e
*** Where: transcode
*** Stack:[event input type line token].{11 22 33 }{11 33 22 }tag: [open ahead [to close] data]
open: ["<" copy opening to ">" if (find tags opening) ">"]
close: ["</" copy closing to ">" if (opening = closing) ">" mark:]
data: [keep copy match to close :mark]
tags: split "aa bb cc" space
examples: [
{<aa>11</aa><bb>22</bb><cc>33</cc>}
{<aa>11</aa><cc>33</cc><bb>22</bb>}
{<aa>11</bb>}
{<xx>22</xx>}
]
foreach example examples [
probe parse example [collect some tag]
:mark is doing?
mark:. See the [docs](https://github.com/red/docs/blob/master/en/parse.adoc#restoring) for details.
[to close] is placed in brackets?
ahead in this case? From what I understand, ahead checks to see if the rule that follows is true, and if so then goes on to do the next rule. If the rule after ahead is false, then it won't? If so, isn't this how rules move forward anyhow (only move forward if the rule matches)? This is an area of parse where I would love more examples. Examples of when/how to use if, ahead, while, reject, then, fail.
open ahead to close data check that opening tag is followed by a matching closing tag; if that's true then data between these tags can be safely collected.
open data close, it can turn out that there is no closing tag, or that it doesn't match the opening one (see the 3rd example), but at that point the data was already collected and you'll get the wrong result.
if for "semantic matching", i.e. when I matched some value and want to check if it satisfies certain properties. [Here](https://github.com/9214/7guis-red/blob/master/tasks/cells.red#L15) I match a word and [check](https://github.com/9214/7guis-red/blob/master/tasks/cells.red#L10) if it is set to an object which was derived from a specific [prototype](https://github.com/9214/7guis-red/blob/master/tasks/cells.red#L37).
ahead is essentially "match but don't advance the input if it matches", literally "ahead of the current position". fail is bugged out currently, but [fail] is an always failing rule.
math dialect:
math: function [
    "Evaluates expression using math precedence rules"
    datum [block! paren!] "Expression to evaluate"
    /local operator match
][
    order: ['** [* / % //] [+ -]]

    redex: [skip infix [enter | skip]]
    infix: [set operator word! (do gauge) if (infix?)]
    gauge: [right?: attempt [lit-word? first do check]]
    check: [infix?: find to block! group operator]
    emend: [parse datum tally]

    tally: [any [enter [fail] | recur [fail] | count [fail] | skip]]
    enter: [ahead paren! into tally]
    recur: [ahead redex if (right?) 2 skip tally]
    count: [while ahead change only copy match redex (do match)]

    do also datum: copy/deep datum foreach group order emend
]
if (infix?) I match a word and check if it's one of the arithmetic operators, in if (right?) I check that matched infix operator is right-associative.ahead paren! preemptively checks that there's a paren! right under the current "cursor", and only then recurs into it with into. Same with ahead redex: it's a preemptive check that gives me an opportunity to analyze the input before I decide on how to process it, in particular, it sets infix? and right? flags on which I can dispatch inside if.[any [enter [fail] | recur [fail] | count [fail] | skip]] is the heart of the interpreter. Conceptually, it means "match the rule, and, _regardless of the result_, backtrack and match the next rule". [fail] is an always failing rule which forces Parse to retract its input "cursor" a few steps back (to the point where "failed" portion of the input started) and to go look for |, which delineates the next alternate rule. then was removed https://github.com/red/red/issues/3843 quite a while ago because it didn't make any earthly sense, but here I noticed that it can be re-introduced to cover this idiom:[any [enter then recur then count then skip]]
while ahead change only copy match redex (do match) is a mind-bender. What it does can be described as: match the infix expression and substitute it for the result of its evaluation, but _do so without advancing the input_. If you ever saw car's wheel skidding in the mud, that's the Parse equivalent of it: the wheel rotates, but the car stands still, stuck in the mud. while is used here to indicate that the rule will continue to match regardless of the input advancing, but in fact some and any can be used as well, because the input "advances" when change modifies it.data: {<lots>
<lot>
<lotnum>1</lotnum>
<objs>
<obj>
<name>Foo</name>
</obj>
<obj>
<name>Bar</name>
</obj>
<obj>
<name>Baz</name>
</obj>
</objs>
</lot>
<lot>
<lotnum>2</lotnum>
<objs>
<obj>
<name>Red</name>
</obj>
<obj>
<name>Green</name>
</obj>
<obj>
<name>Blue</name>
</obj>
</objs>
</lot>
</lots>}
parse data [
( _lots: copy [] )
some [ thru "<lots>"
(_lot: copy [])
some [ thru "<lot>"
thru "<lotNum>" copy _lotNumber to "</lotNum>"
(_objs: copy [])
some [thru "<objs>"
(_obj: copy [])
some [ thru "<obj>"
thru "<name>" copy _name to "</name>"
thru "</obj>"
]
thru "</objs>"
]
(append _obj object [lot_number: copy _lotNumber name: _name ])
]
(append _objs copy _obj)
(append _lot object [lot_number: to-integer copy _lotNumber objs: _objs ])
]
(append _lots _lot)
]
probe to-json _lots[{lot: 1, objs: [ {name: "Foo"}, {name: "Bar"}, {name: "Baz"}] }, { lot: 2, objs: [ {name: "Red"}, {name: "Green"}, {name: "Blue"}]} ]to and thru in such situations have been explained several times. Plus, err... , other things.lots: copy []
parse data [some [
<lotnum> copy _num to </lotnum> (_objs: copy [])
| <name> copy _name to </name> (append _objs object [name: _name])
| </objs> (append lots object [lot: _num objs: _objs])
| skip
]]
to-json lots
== {[{"lot":"1","objs":[{"name":"Foo"},{"name":"Bar"},{"name":"Baz"}]},{"lot":"2","objs":[{"name":"Red"},{"name":"Green"},{"name"...code tags but I need to collect only codes inside OKPD2data: read %fcsNotificationZA44_0173200001419002048_22649697.xml collect?: no parse data [collect some [<OKPD2> (collect?: yes) | </OKPD2> (collect?: no) | <code> if (collect?) keep to </code> | skip]] ;== ["0101010.90.33.120" "11.111.1111111111111110" "27.90.33.120" "27.90.33.120" "27.90.33.120" "27.90.33.120" "27.90.33.120" "27.90.33.120" "27.90....
parse! :+1:ahead and friends. My understanding based on the docs is that 1) ahead does not advance. It's more like a lookahead. 2) If it is a match, perform the next rule. Is this correct? I'm playing with a simple example to help me understand it:letters: charset {abcdefghijklmnopqrstuvwxyz}
parse "{ a b { d d d } c }" list: [
ahead not "{" [(print "Expecting {." quit)]
"{"
collect [
any [ space | keep letters | list ]
]
]
ahead not "}" [(print "Expecting }." quit)]
"}"
]ahead, things work as expected. However with ahead it is not. Although now as I was writing this, I'm thinking this might be because it doesn't go any further than the first rule, since the first char *is* {, and so ahead not "{" is false, and so doesn't keep going.>> parse "a" [ahead not "a" | (probe 'failed!)] failed! == false >> parse "a" [not "a" | (probe 'failed!)] failed! == false
Ahead only knows the single rule it's given, and whether it succeeds or fails.
ahead as a yo-yo. You throw it _forward_ while _standing_ and then pull it _back_. Sometimes you fail to pull it back (the rule given to ahead fails and then ahead itself fails) and switch to the alternate mode (jump over |) where you e.g. wind the thread by hand.
into it:
>> rule: [some [ahead block! into rule | keep string!]]
== [some [ahead block! into rule | keep string!]]
>> parse ["we" ["need" ["to" ["go" ["deeper"]]]]] [collect rule]
== ["we" "need" "to" "go" "deeper"]
ahead block! you'd go both into string! and block!.>> rule: [some [into rule | keep string!]] == [some [into rule | keep string!]] >> parse ["we" ["need" ["to" ["go" ["deeper"]]]]] [collect rule] *** Script Error: PARSE - input must be of any-block! type: we *** Where: parse *** Stack:
>> rule: [mark: skip (probe mark)] == [mark: skip (probe mark)] >> parse [a b c d][ahead [rule ahead [rule ahead [rule ahead rule rule] rule] rule] rule 'b 'c 'd] [a b c d] [b c d] [c d] [d] [d] [c d] [b c d] [a b c d] == true
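The two behaviours described in this exchange (a successful `ahead` leaves the input where it was, a failed one sends Parse to the next alternative) can be sketched on a string. The `here`, `first-ok?` and `second-ok?` names are only for illustration.

```red
Red []

; ahead checks "ab" but does not consume it, so `here` still marks the head
first-ok?: parse "abc" [ahead "ab" here: "abc"]
probe first-ok?        ; true
probe index? here      ; 1

; when the looked-ahead rule fails, ahead fails and the alternative after | is tried
second-ok?: parse "xyz" [ahead "ab" "abc" | "xyz"]
probe second-ok?       ; true
```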
yes; otherwise say no. "case?if-statement: [ "if" ... ]
case-statement: [ "case" ... ]
let-statement: [ "let" ... ]
statement-list: [
any [
space
| newline
| ahead "if" if-statement
| ahead "case" case-statement
| ahead "let" let-statement
| (print "Expecting if, case or let expression." quit)
]
]ahead?ahead with keep (and if). if you don't use ahead you would already kept the value even if it is not the value you want. You already kept while you are checking it.ahead usage. I was using these as a way of experimenting with different ways of doing things, so some of the code is a bit more obtuse than it needs to be, but I think the parse rules were pretty reasonable.not ahead, which should just be notahead, or is it more of a conceptual block for you?parse, whether it would make things easier/cleaner (or not).ahead usage is to parse stuff like Markdown for example, where you need to decide if e.g. an asterisk marks emphasis or is simply an asterisk. You can look ahead for matching asterisk to determine that it's emphasis:parse markdown-text [
; match the start of emphasis
#"*"
; look for emphasis end
ahead [to #"*"]
; if the end was found, the rule will continue with the actual content
emphasis-content
; otherwise, parse all stuff as normal content
| normal-content
]>> parse [1 2] [any [fail | (print "A") skip]] == false >> >> parse [1 2] [any [[fail] | (print "A") skip]] A A == true
fail that way? "force current rule to fail, backtrack " (taken from the R3 docs), so for me, it is something like break? not outside?
a: {
<apps>
<app>1</app>
<app>2</app>
<app>3</app>
</apps>
<app>4</app>
}
parse a [
any [
thru "<app>" copy _ to "</" (print _)
]
]4 .a: {
<apps>
<app>1</app>
<app>2</app>
<app>3</app>
</apps>
<foo>
<app>4</app>
</foo>
}a: {
<apps>
<some>
<app>1</app>
<app>2</app>
</some>
<app>3</app>
</apps>
<foo>
<app>4</app>
</foo>
}whitespace: charset reduce [space tab cr lf]
ws: [any whitespace]
a: {
<apps>
<app>1</app>
<app>2</app>
<app>3</app>
</apps>
<app>4</app>
}
parse a [
any [
thru "<app>" copy _ to "</" (print _) not [ahead ws "</apps>"]
]
]3 because ahead is 4 you meant, right?not ahead, not is fine by itself (version with whitespaces):>> a
== {<apps><app>1</app><app>2</app><app>3</app></apps><app>4</app>}
>> parse a [<apps> any [<app> copy _ to "</" thru #">" (print _) not </apps>]]
1
2
3
== falseapp except app that have ahead apps". I am playing in attempts to understand how to process different cases. I expected that it should not process 3 because ahead of 3 is "</apps>"3 is already printed when you are doing the check.fail fails [... | (print "a") skip] block and makes any to instantly stop, in the latter it fails enclosing [...] and backtracks to an alternate rule. It's all described in the documentation, FYI.apps: [<apps> some app </apps>]
app: [<app> keep copy match to "<" </app>]
parse trim/all {
<apps>
<app>1</app>
<app>2</app>
<app>3</app>
</apps>
<app>4</app>
}[
collect apps
]app at the top-level of apps". In that case:apps: [<apps> some [app | junk] </apps>]
app: [<app> keep copy match to "<" </app>]
junk: ["<" copy name to #">" thru ["</" name ">"]]
parse trim/all {
<apps>
<some>
<app>1</app>
<app>2</app>
</some>
<app>3</app>
</apps>
<foo>
<app>4</app>
</foo>
}[
collect apps
]ahead ok I think I get it now. As a general summary...; Lookahead for some-pattern. Does not advance the index. ahead some-pattern ; if true, continue with all the rules after 'ahead'. next-rule1 next-rule2 next-rule3 ; otherwise... | rule4 rule5 rule6
parse trim/all a [ any [ thru <app> copy _ to </app> </app> not </apps> (print _) ] ]
f: func[] ["aa"] parse "aa" [f]
parse ["aa"] compose [(f)]parse probe reduce [:f] [f] ; == truex:t: "bb" parse "<aa>111</aa><bb>222</bb>" compose [ thru "<" (t) ">" copy x to "</" (t) ">" ]
t: "aa"
parse "111 222 " probe compose/deep [thru ["<" (t) ">"] copy x to ["" (t) ">"]] I think.
compose is changing the collecting behavior?
>> parse "<aa>11</aa><bb>22</bb><dd>55</dd><cc>33</cc>" [
    some [
        thru ">"
        copy x to "</" (print x)
    ]
]
11
<bb>22
<dd>55
<cc>33
== false
>> parse "<aa>11</aa><bb>22</bb><dd>55</dd><cc>33</cc>" compose/deep [
    some [
        thru ">"
        copy x to "</" (print x)
    ]
]
<cc>33
== false
print is not . (print x) is evaluated and prints x, which is set to the result from the last usage of Parse, i.e. to "33" .( )print is not . (print x) is evaluated and prints x, which is set to the result from the last usage of Parse, i.e. to "33" .result from the last usage of Parse? Why not for example first result?probe x between parse "..." [...] and parse "..." compose/deep [...] in the example above. Then try to understand from where the result comes from.probe :x before parse "..." [...]. Then keep narrowing down the gap between the two probes until it clicks :smiley_cat: if (find tags opening)o: object[ list: ["aa" "bb" "cc"] is-found: false to-next: func[] [ list: next list probe reduce ["now list: " list] ] is-found: func[tag value] [ print ["tag: " tag " value: " value] ] f: func[x] [ print ["==> " x] ] ] parse "<aa>11</aa><bb>22</bb><dd>55</dd><cc>33</cc>" [ (o/is-found: false) some [ thru ["<" (first o/list) ">"] (o/is-found: true) copy x to ["</" (first o/list) ">"] (print ["=>" x] o/to-next) ] ]
(first o/list). collected: false). But I need to check hundred files to their struct. So I need more elegant way.price but only if it's inside data) then you first need to convert linear XML to a tree. I already gave you a working prototype for that yesterday.probe x between parse "..." [...] and parse "..." compose/deep [...] in the example above. Then try to understand from where the result comes from.parse "<aa>11</aa><bb>22</bb><dd>55</dd><cc>33</cc>" [ some [ thru ">" copy x to "</" ] ] probe x parse "<aa>11</aa><bb>22</bb><dd>55</dd><cc>33</cc>" compose/deep [ some [ thru ">" copy x to "</" ] ]
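A short sketch of what `compose/deep` does to a rule, assuming the two-tag input used earlier in this exchange. The point is only that parens are evaluated once, when the rule block is built, and not while `parse` is running; the `rule` and `ok?` names are illustrative.

```red
Red []

t: "bb"

; compose/deep runs before parse starts: every (t) is replaced by "bb" right here
rule: compose/deep [thru ["<" (t) ">"] copy x to ["</" (t) ">"] to end]
probe rule

ok?: parse "<aa>111</aa><bb>222</bb>" rule
probe ok?
probe x        ; the text between <bb> and </bb>
```

Probing the composed rule before parsing makes it visible that no paren is left inside it, which is why a `(print x)` placed in a composed rule runs once, ahead of the parse, and prints whatever `x` held at that moment.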
[. How to escape this symbol?[ { to get: [{ ... }]:data: [[x: none] ] parse data [...]
into?>> data: [[x: none] ] parse data rule: [any [ahead block! into rule | change 'none 'something | skip]] data == [[x: 'something]]
ahead block! before into is important, otherwise into will also go into strings (and other series types) and you'll wonder why your rule doesn't work:
>> parse ["abc" [abc]][some [into [word!] | skip]]
*** Script Error: PARSE - matching by datatype not supported for any-string! input
*** Where: parse
*** Stack:
>> parse ["abc" [abc]][some [ahead block! into [word!] | skip]]
== true
>> data: [[x: none] ] >> parse data rule: [any [ahead block! (inside?: yes) into rule (inside?: no) | if (inside?) s: copy _ to end :s change to end (mold/only _) | skip]] data == [["x: none"]]
>> data: mold [[x: none] ] parse data rule: [any [change #"[" "[{" | change #"]" "}]" | skip]] data
== "[{[{x: none}]}]">> data: mold [[x: none] ] parse data rule: [skip any [change #"[" "[{" | #"]" end | change #"]" "}]" | skip]] data
== "[[{x: none}]]"data: [ name: none lots: [ lotNumber: none price: none objects: [ code: none ] ] ]
data: [ name: none lots: [ lotNumber: none price: none objects: [ code: none code: none code: none ] ] ]
data: [ name: none lots: [ lotNumber: none price: none objects: [ code: none ] lotNumber: none price: none objects: [ code: none ] ] ]
data: [ name: none lots: [ lotNumber: none price: none objects: [ code: none ] lotNumber: none price: none objects: [ code: none code: none code: none ] ] ]
jsonname tag new name and lots will be addedlots will be added in bunches of lotNumber, price and objectsobjects will contain only code fieldsdata: [
name: none
lots: [
lotNumber: none
price: none
objects: []
]
]
walk-data: func [/local out found] [
out: make block! 30
foreach [tag value] list [
switch/default tag [
name [
head out: skip insert tail out copy/deep data -4
out/name: copy value
]
code [
append first find/last/tail out/lots 'objects reduce [quote code: copy/deep value]
]
][
either 'none = first found: find/last/tail out/lots tag [
found/1: value
][
append out/lots copy/deep data/lots
change find/last/tail out/lots tag value
]
]
]
head out
]
list: [
name "Apples" lotNumber 1 price $13 code [print "first"] code [do [something]] lotNumber 2 price $32
name "Bananas" lotNumber 3 price $7 code [print "Bananas are only $7"] price $99.99 code [print "Price goes up"]
]
probe walk-data[
name: "Apples"
lots: [
lotNumber: 1
price: $13.00
objects: [code: [print "first"] code: [do [something]]]
lotNumber: 2
price: $32.00
objects: []
]
name: "Bananas"
lots: [
lotNumber: 3
price: $7.00
objects: [code: [print "Bananas are only $7"]]
lotNumber: none
price: $99.99
objects: [code: [print "Price goes up"]]
]
]name tag new name and lots will be addedname have only one value, but lots content is multiple. [] can\should be multiplicate. Other than root (data) because data is container.lots. I have very complex data structure and can't hardcode anything... lotNumber copy all parent lots to another place and fill it. Or something like it. reduce word after lots data: [ id: 123 date: "2020" lots: [lot: [lotNumber: 1 price: ] ] ]
data: [ id: 123 date: "2020" lots: reduce [lot: [lotNumber: 1 price: ] ] ]
quote in next code do not work:parse data [thru quote lots insert quote reduce] probe data
reduce because later I will add object to make nested object evaluated (I am just playing here)insert is enough here:insert next find data quote lots: 'reduce
parse data [thru quote lots: insert ('reduce)]insert is weird. lot: to object (without colon):parse data [to quote lot: change 'lot: 'object]
'lit-set-word! type'lot: doesn't make sensecode. I want to change lot: to word object to make data look:data: [ id: 123 date: "2020" lots: [object [ lotNumber: 1 price: 777 ] ] ]
lots. I will integrate it laterto block! into?some it's change only first lot::data: [
id: 123
date: "2020" lots: [
lot: [ lotNumber: 1 price: 777 ]
lot: [ lotNumber: 2 price: 888 ]
]
]
code: [object]
parse data [ some [
to block! into [change quote lot: code]
]
]
probe datadata: [
id: 123
date: "2020" lots: [
lot: [ lotNumber: 1 price: 777 ]
lot: [ lotNumber: 2 price: 888 ]
]
]
parse mold data [result: some [
to "lot:" change "lot:" "object"
]
]
probe to-block result[[
id: 123
date: "2020" lots: [
object [lotNumber: 1 price: 777]
object [lotNumber: 2 price: 888]
]
]][ :( >> code: [object]
== [object]
>> parse data [to block! into [some [change quote lot: code | skip]]]
== true
>> data
== [
id: 123 date: "2020" lots: [object [lotNumber: 1 price: 777] object [lotNumber: 2 price: 888]]
]skip thanks!object: to object) and it do not work:data: [ id: 123 INN: "123213142" lots: [lot: [ lotNumber: 1 price: 777 objects: [object: [ name: "Apples" ] object: [ name: "Bananas" ] ] ] ] ] code: [object] parse data [ to block! into [ some [change quote object: code | skip] ] ] probe data
to block!) and then jumps into that block (into ...). It doesn't look for other nested blocks, but you can change the rule so that it is recursive.
some
parse data rule: [some [change quote object: code | ahead block! into rule | skip]]
some will just make the rule match one or more times.SET a matched element to a word in a context? SET does not seem to work on paths. I need to write to OBJ/WORD when parsing. Even a SET in OBJ WORD would be welcome.set in obj 'a "val" ?do [ obj/word: your-val ]text >> foo: object [bar: 'qux] () >> parse [baz][set match word! (set/any 'foo/bar :match)] == true >> foo/bar == baz
parse [x] [set value word! (dd/value: value)] but I was asking if we can do it directly in PARSE SET without using a global word.text >> foo: object [bar: 'qux] () >> parse [baz][set match word! (set/any 'foo/bar :match unset 'match)] == true >> foo/bar == baz >> value? 'match == false
>> parse [baz] bind [set bar word! (set/any 'foo/bar :bar)] foo: object [bar: 'qux]
== true
>> foo/bar
== baz
>> value? 'bar
== false
'match content, shouldn't it?
SET with custom containers in 15 minutes. Redbol makes me say "that's the joy of coding"
case, as a record extractor, as a for loop, as a special switch, as a for-case loop, as foreach. It's incredible how flexible it is!
parse so slow when I use it like this?
data: read/binary http://avatars-04.gitter.im/gh/uv/4/oldes
dt [loop 10000 [find/match data #{89504E47}]]
;== 0:00:00.0039912
dt [loop 10000 [parse data [#{89504E47} to end]]]
to.
to end shouldn't be that slow... :)
>> profile/show [[find/match data #{89504E47}] [parse data [#{89504E47} (found?: true)]] [parse data [#{89504E47} to end]]]
Time | Memory | Code
1.0x (331ns) | 512 | [find/match data #{89504E47}]
2.59x (858ns) | 168 | [parse data [#{89504E47} (found?: true)]]
5201.75x (2ms) | 168 | [parse data [#{89504E47} to end]]
to end operation looks suspicious :-)
end presence.
>> ws: " " cs: charset "abcde"
== make bitset! #{0000000000000000000000007C}
>> parse "ab c de" [collect any [keep some cs some ws]]
== ["ab" #"c" "de"]
>> parse "ab c de" [collect any [keep copy some cs some ws]]
== ["a"]
>> parse "ab c de" [collect any [keep [copy some cs] some ws]]
== [#"a"]
["ab" "c" "de"] output.
keep copy poop wart](https://github.com/red/docs/blob/master/en/parse.adoc#collect) as I love to call it.
rule: [ (x: make object! [a: b: c: none]) set a word! set b word! set c! number! (append container x) ]
rule: [ (x: make object! [a: b: c: none] bind segment-rule: [set a word! set b word! set c! number!] 'x) segment-rule (append container x) ]
rule: [collect any [set _a word! set _b word! set _c number! keep (object [a: _a b: _b c: _c])]]
container: parse [x y 1 z w 2] rule
== [make object! [
a: 'x
b: 'y
c: 1
] make object! [
a: 'z
b: 'w
c: 2
]]
rule: [set /local a skip]
SET, and OBJECTS I use to store those words too with (myobject/data: WORD-SET-BY-PARSE), are unique and isolated.
parse examples. If nobody else knows where they are, I'll try to find them.
SET, so if you execute a rule multiple times, when you come back to the previous level, after the continuation point, the subsequent code would use the word belonging to its level. Like scoping works in classic languages.
= to the end of production rule names, and *prepended* = to the name for the word that referred to the data it collected.
use, but I may try to rewrite it.
use-local: func [locals blk] [ do reduce [ func compose [/local (locals)] blk] ]
math: func [a _ c d][a _ c = d]
parse [1 + 2 3 2 * 4 8 3 - 1 2] [some [s: if (math s/1 get s/2 s/3 s/4) (s: skip s 4) :s]]
== true
RULES1 if needed by further code.
RULES1, if the grammar is matched by the input, then the structure elements are set.
rules1: [set arg1 word! set arg2 word! set arg3 integer! (struct-to-set/a: arg1 struct-to-set/b: arg2 struct-to-set/c: arg3)]
decode: func [phrase struct-to-set] [ parse phrase [rules1] ]
struct-to-set ARG was not bound in the RULES1 block, where it is just a global unset word.
decode function context. Then I realized that, as PARSE reduces each word in a nested way, BIND would only bind the words before any reduction. As a solution, I then thought about binding each rule *on the fly* when used.
After thinking about this, a simpler solution came to mind: wrapping everything in a big context containing all the rules:
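A minimal sketch of that idea, not the original author's code: the names `decoder` and `target` are hypothetical, and the object passed in is assumed to have fields `a`, `b` and `c`. Because the rule block is written inside the context body, its words are bound to the context fields when the context is created, so no separate BIND call is needed.

```red
Red []

; Hypothetical wrapper context: every word the rules touch is a field here,
; so the rule block is bound once, when the context is created.
decoder: context [
    arg1: arg2: arg3: none
    struct-to-set: none

    rules1: [
        set arg1 word! set arg2 word! set arg3 integer!
        (struct-to-set/a: arg1  struct-to-set/b: arg2  struct-to-set/c: arg3)
    ]

    ; `target` is an assumed parameter name. It is stored in the context
    ; field that rules1 already refers to.
    decode: func [phrase [block!] target [object!]] [
        struct-to-set: target
        parse phrase rules1
    ]
]

result: object [a: b: c: none]
probe decoder/decode [x y 1] result   ; true when the phrase matches rules1
probe result                          ; a, b and c now hold x, y and 1
```

The trade-off of this sketch is that `arg1`, `arg2`, `arg3` and `struct-to-set` are shared by every call, so a nested or repeated use of `rules1` overwrites the values from the previous round.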
Once the basics were working, I moved on to the next task: keeping a version of the `SET` variables `ARG1` and `ARG2`, or of `STRUCT-TO-SET`, for each round of the rule. So, what should I do? Because if I add `ANY` in `parse`, like `parse phrase [ANY rules1]`,
RULE1 would overwrite the previous arguments and print [ARG1 ARG2 STRUCT-TO-SET]
rules1 end with a PULL one is needed after next level RULES1 return.
ARGS and STRUCT-TO-SET in a new context each round, and adding a KEY, would open the opportunity to create an indexed storage for random retrieval instead of PUSH/PULL operations.
structs: clear []
rules1: [any [set arg1 word! set arg2 word! set arg3 integer! (append structs make struct-to-set [a: arg1 b: arg2 c: arg3])]]
decode: func [phrase struct-to-set] [parse phrase bind rules1 :decode]
>> decode [a b 1 c d 2 e f 3] object [a: b: c: none]
== true
>> structs
== [make object! [
a: 'a
b: 'b
c: 1
] make object! [
a: 'c
b: 'd
c: 2
] make object!...
struct-to-set:
structs: clear []
rules1: [any [set arg1 word! set arg2 word! set arg3 integer! (append structs object [a: arg1 b: arg2 c: arg3])]]
decode: func [phrase struct-to-set] [parse phrase rules1]
struct-to-set was there for a reason: in other scenarios, some grammars could return at some point and then be resumed later. Struct-to-set would carry some extra data like the dialect block where interrupted, and all the dialect parameters and values collected by the grammar processing: switches, operating modes, refinements; they are needed for interpreting the rest of the Phrase.
rules: object [
    top: [
        (clear args)
        some [
            current: gather [
                if (args/3 >= stop) interrupt
                | (append result object [a: take args b: take args c: take args])
            ]
        ]
    ]
    stop: 3
    interrupt: [thru end]
    continue: [:current top]
    gather: [collect into args [word word number]]
    word: [keep word!]
    number: [keep number!]
    result: clear []
    args: clear []
    current: none
]
data: [a b 1 c d 2 e f 3 g h 4 i j 5 k l 6]
parse data rules/top
;== true
rules/result
;== [make object! [
;    a: 'a
;    b: 'b
;    c: 1
;] make object! [
;    a: 'c
;    b: 'd
;    c: 2
;]]
;...Do something else
rules/stop: 5
parse data rules/continue
;== true
;... Interrupted again
length? rules/result
;== 4
rules/stop: 7
parse data rules/continue
;== true
last rules/result
;== make object! [
;    a: 'k
;    b: 'l
;    c: 6
;]
length? rules/result
;== 6
parse:
; String based approach, not tree based
insert-parens: function [src /right][
op: charset "+-*/"
blk: parse src [
collect [
some [
keep copy w to [op | end]
keep copy o skip
]
]
]
dest: copy ""
len: to integer! divide length? blk 2
either right [
append dest #"("
foreach [w o] blk [
repend dest either none? o [[w]][[w o #"("]]
]
append/dup dest #")" len
][
append/dup dest #"(" len - 1
repend dest [#"(" blk/1]
foreach [o w] next blk [repend dest [o w #")"]]
]
]
print insert-parens "a+b-c*d/e"
print insert-parens/right "a+b-c*d/e"
[word op (insert paren incr count)] then insert count alt-paren at the end, but this helped me think about it, and was fun.
/right handling, where the last value is solely parenthesized.
(e) edge case, but found another one. I won't spoil it for others yet. The trick is what you don't know, when taking a naive string mod approach. What seems simpler at a glance...isn't.
enparen: function [str [string!] /right][
op: charset "+-*/"
el: [some [not op skip]]
n: 0
rule: either right [
[s: el opt [op e: :s insert #"(" :e skip rule insert #")"]]
][
[s: el any [op el insert #")" (n: n + 1)] :s n insert #"("]
;[s: el any [op el insert #")" e: :s insert #"(" :e skip]] ;alternatively
]
parse str: copy str rule
str
]
enparen "a+a+a+a+a+a+a"
;== "((((((a+a)+a)+a)+a)+a)+a)"
enparen/right "a+a+a+a+a+a+a"
;== "(a+(a+(a+(a+(a+(a+a))))))"
[s: el any [insert #" " op insert #" " el insert #")" (n: n + 1)] :s n insert #"("]
.... not sure if I should use some ahead or so parse keyword, so that the first space is inserted only after the op is matched with the next rule ....
== "((((((a + a) + a) + a) + a) + a) + a)"
op: ... in my code should be replaced by
sep: charset "+-*/"
op: [change [any space set o sep any space] (rejoin [space o space])]
>> enparen "a+b-c*d/f"
== "((((a + b) - c) * d) / f)"
>> enparen/right "a+b-c*d/f"
== "(a + (b - (c * (d / f))))"
enparen "a + a + a + a + a + a + a"
parse - half imperative solution:
enparen: function [
str [string!]
/right
][
op: charset "+-*/"
el: [some [not op skip]]
b: parse str [
collect [ keep any el some [ keep op keep any el ] ]
]
par: "()"
if right [reverse b reverse par]
par-str: copy par
insert next par-str take/part b 3
foreach [op el] b [
par-str: head insert next copy par reduce [par-str op el]
]
if right [reverse par-str]
par-str
]
s: "a+b-c*d/f"
print ['enparen mold s '-> enparen s]
print ['enparen/right mold s '-> enparen/right s]
enparen "a+b-c*d/f" -> ((((a+b)-c)*d)/f)
enparen/right "a+b-c*d/f" -> (a+(b-(c*(d/f))))
any el in rule?
some
enparen: function [
str [string!]
/right
][
op: charset "+-*/"
el: [some [not op skip]]
b: parse str [
collect [ keep el some [ keep op keep el ] ]
]
par: "()"
if right [reverse b reverse par]
par-str: copy par
insert next par-str take/part b 3
foreach [op el] b [
par-str: head insert next copy par reduce [par-str op el]
]
if right [reverse par-str]
par-str
]
+ and -, right-associate * and /, and multiplicative ops do have higher precedence, e.g.:
a+b+c*d*e*f -> ((a+b)+(c*(d*(e*f))))
a*b*c+d+e+f -> ((((a*(b*c))+d)+e)+f)
a+b*c+d*e+f -> (((a+(b*c))+(d*e))+f)
>> enparen/right "a+"
== ")a+("
-x+3y2 -> (-1 * x) + (3 * (y ** 2)).
enparen2: function [ str [string!]][
op: charset "+-*/"
el: [some [not op skip]]
arg: [ word! | number! | paren! ]
addt: [ quote '+ | quote '- ]
mult: [ quote '* | quote '/ ]
b: copy []
parse str [
any [ copy t [ el | op ] (append b load t) ]
]
parse reverse b [
some [
p: remove copy t [arg mult arg]
(insert/only p to-paren reverse t)
| skip
]
]
parse reverse b [
some [
p: remove copy t [arg addt arg]
(insert/only b to-paren t)
| skip
]
]
either paren? first b [mold first b] [form b]
]
probe enparen2 "a+b+c*d*e*f"
probe enparen2 "a*b*c+d+e+f"
probe enparen2 "a+b*c+d*e+f"
probe enparen2 "a+"
"((a + b) + (c * (d * (e * f))))"
"((((a * (b * c)) + d) + e) + f)"
"(((a + (b * c)) + (d * e)) + f)"
"a +"
enparen: function [str [string!]][
left-op: charset "+-"
right-op: charset "*/"
op: union left-op right-op
el: [some [not op skip]]
term: [t: el opt [right-op [end | m: :t insert #"(" :m skip term insert #")"]]]
parse str: copy str [s: term any [left-op term insert #")" e: :s insert #"(" :e skip]]
str
]
enparen "a+b+c*d*e*f"
;== "((a+b)+(c*(d*(e*f))))"
enparen "a*b*c+d+e+f"
;== "((((a*(b*c))+d)+e)+f)"
enparen "a+b*c+d*e+f"
;== "(((a+(b*c))+(d*e))+f)"
enparen "a+12*c+d*e+3.14"
;== "(((a+(12*c))+(d*e))+3.14)"
expr, to separate it from the actual parse call. I can see others wanting to play with this.
^ with yet higher precedence, e.g.:
a*b^2+4*c/d-3 -> (a*(b^2))+((4*c)/d)-3
math: function [
    "Evaluates expression using math precedence rules"
    datum [block! paren!] "Expression to evaluate"
    /local operator match
][
    order: ['** [* / % //] [+ -]]
    redex: [skip infix [enter | skip]]
    infix: [set operator word! (do gauge) if (infix?)]
    gauge: [right?: lit-word? attempt check]
    check: [first infix?: find to block! group operator]
    emend: [parse datum tally]
    tally: [any [enter [fail] | recur [fail] | count [fail] | skip]]
    enter: [ahead paren! into tally]
    recur: [ahead redex if (right?) 2 skip tally]
    count: [while ahead change only copy match redex (do match)]
    do also datum: copy/deep datum foreach group order emend
]
context [
op: union union
add: charset "+-"
mul: charset "*/"
pow: charset #"^^"
el: [some [not op skip]]
expr: [term any [add [term | (probe "Missing term!") fail]]]
term: [t1: factor any [mul [
factor insert #")" t2: :t1 insert #"(" :t2 skip
| (probe "Missing factor!") fail]]]
factor: [f1: el opt [pow [
ahead el f2: :f1 insert #"(" :f2 skip factor insert #")"
| (probe "Missing exponent!") fail]]]
set 'enparen function [str [string!]][
parse str: copy str expr
str
]
]
>> enparen %{2*a^3^b*4+c/5*d-e}%
== "((2*(a^^(3^^b)))*4)+((c/5)*d)-e"
>> enparen %{2*a^3^b*4+c/5*d-}%
"Missing term!"
== "((2*(a^^(3^^b)))*4)+((c/5)*d)-"
>> enparen %{2*a^}%
"Missing exponent!"
== "(2*a)^^"
>> enparen %{2*^b}%
"Missing factor!"
== "2*^^b"
>> enparen %{What/s+that^?}%
== "(What/s)+(that^^?)"
ab+3c2 -> (a * b) + (3 * (c ** 2))
ab+3cd2^d/(2.5*-(e+f)) -> (a * b) + (3 * c * (d ** (2 ** d)) / ((2.5 * (-1 * (e + f)))))
(3a2+1.5b)/c -> ((3 * (a ** 2)) + (1.5 * b)) / c
expression "(3a2+1.5b)/-c"
;== [((3 * (a ** 2)) + (1.5 * b)) / (-1 * c)]
expression "ab+3c2"
;== [(a * b) + (3 * (c ** 2))]
expression %{ab+3cd2^d/(2.5*-(e+f))}%
;== [(a * b) + (3 * c * (d ** (2 ** d)) / ((2.5 * (-1 * (e + f)))))]
expression "3a2-2(b+c)3"
;== [(3 * (a ** 2)) - (2 * ((b + c) ** 3))]
expression/eval/with "(3a2+1.5b)/c" [a: 3 b: 2 c: 10]
;== 3.0
expression "(3a2+1.5b)/-c"
;== [((3 * (a ** 2)) + (1.5 * b)) / (-1 * c)]
expression %{(3a2+1.5b)^}%
;Expected exponent in the end!
expression "3a2+*1.5b"
;Expected term at *1.5b!
save/as %test.json object [a: [1 2]] 'json like this you mean?
>> save %test.json object [a: [1 2]]
>> read %test.json
== {{"a":[1,2]}}
>> load %test.json
== #(
a: [1 2]
)
*.json extension)
url: https://raw.githubusercontent.com/wkentaro/labelme/master/examples/bbox_detection/data_annotated/2011_000003.json
obj: load/as read url 'json
>> data: load/as https://raw.githubusercontent.com/wkentaro/labelme/master/examples/semantic_segmentation/data_annotated/2011_000003.json 'json
== #(
version: "4.0.0"
flags: #()
shapes: [#(
label: "person"
points: [[250.8142292490119 106.96696468940974] [229.8142292490119 118.96696468940974] [22...
>> probe words-of data
[version flags shapes imagePath imageData imageHeight imageWidth]
version is easy, for flags I need to find another example, shapes is a block of maps, which I can easily probe:
>> data/shapes/1
== #(
label: "person"
points: [[250.8142292490119 106.96696468940974] [229.8142292490119 118.96696468940974] [221.8142292490119 134.96696468940974] [223.8142292490119 147.9...
>> probe words-of data/shapes/1
[label points group_id shape_type flags]
[(x * 3) + y] to [(x/result * 3) + y/result]
>> resultify [(x * 3) + y / z][x y]
== [(x/result * 3) + y/result / z]
>> resultify [(x * 3) + y / z][y z]
== [(x * 3) + y/result / z/result]
>> resultify/by [(x * 3) + y / z][y] 'my-path-to-result
== [(x * 3) + y/my-path-to-result / z]
>> resultify/by [(x * 3) + y / z][y] 'my/path/to/result
== [(x * 3) + y/my/path/to/result / z]
x: vector [100. 0.] ; vector makes a graph node, here x/result = [100. 0.]
y: vector [0. -100.]
point1: vector [[( x * 3 ) + (y * 3)] ; * and + are custom, and I'd like to evaluate the result here
get-parents: func [expr[block!]] [unique collect [
parse expr rule: [
any [
word: path! (if object! = type? get word/1/1 [keep word/1/1])
| into rule
| skip
]
]
]]
copy word word! keep (if object! = type? get word/1 [to-path [word/1 'result] ]) will give me none on words that don't refer to objects
resultify: function [expr vars /by ref][
ref: any [ref 'result]
o: clear []
forall vars [
append o to-lit-word vars/1
if not last? vars [append o '|]
]
parse expr rule: [any [
ahead any-block! into rule
| ahead o change only set w word! (to-path append to-block w ref)
| skip
]]
expr
]
SRC is a string! value: "button1: button {TWO^/LINES} loose"
>> to-block src
== [button1: button
"TWO^/LINES" loose
]
>>
TWO^/LINES ?
string! value, which may be molded in this or that form depending on its length.
"" and {} are the same. It's only that long strings are molded with curly braces, and they allow you to write multiline strings without the literal chars for CRLF. So if you need to identify strings by curly braces, you'll have to do that without loading.
>> parse {"" hi} [ some [ [ {""} | "" ] " " "hi" ]]
== true
>> parse {"" hi} [ some [ [ "" | {""} ] " " "hi" ]]
== false
"" rule matches an empty input, so {""} that matches two quotes is never used. Then it looks for " ", but you're still at the first quote in your input, because matching an empty input doesn't advance.
>> parse {""} [some "" {""}]
== true
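A small sketch of the same point, assuming nothing beyond the behaviour described above: the empty-string rule succeeds without consuming input, so a position marked right after it is still the head of the string, and the order of the alternatives decides whether the two-quote literal is ever tried. The word `pos` is only an illustrative name.

```red
Red []

; An empty-string rule succeeds without consuming anything, so the
; position captured right after it is still the head of the input.
parse {""} ["" pos: to end]
print index? pos

; With the longer literal listed first it is tried (and matches) first;
; with "" listed first the two-quote alternative is never reached.
print parse {"" hi} [[{""} | ""] " " "hi"]
print parse {"" hi} [["" | {""}] " " "hi"]
```

Putting the more specific alternative first is the usual way to keep an always-matching rule from shadowing it.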