Author: "Gabriele Santilli" in the header ;)
to-json because of a number of issues, load-json is mostly the original code.
change-dir to-red-file "C:\dev\giesse\red-json\tests\data"
big-100k: [load %really-big-1.json]
big-700k: [load %really-big-2.json]
big-2.5m: [load %really-big-3.json]
canada-5m: [load %canada.json]
twitter-600k: [load %twitter.json]
profile/show [canada-5m twitter-600k big-100k big-700k big-2.5m]
; Original codec
Time         | Time (Per)   | Memory   | Code
0:00:00.09   | 0:00:00.09   | 1486144  | big-100k
0:00:00.378  | 0:00:00.378  | 12368428 | twitter-600k
0:00:00.523  | 0:00:00.523  | 8389480  | big-700k
0:00:00.938  | 0:00:00.938  | 46031464 | canada-5m
0:00:01.994  | 0:00:01.994  | 32086040 | big-2.5m
; Updated codec
Time         | Time (Per)   | Memory   | Code
0:00:00.036  | 0:00:00.036  | 1486644  | big-100k
0:00:00.206  | 0:00:00.206  | 8389480  | big-700k
0:00:00.783  | 0:00:00.783  | 32086040 | big-2.5m
0:00:00.992  | 0:00:00.992  | 46031464 | canada-5m
{
"checksum": "1e54fbb25d92a354f7aeaf576726429e",
"roots": {
"bookmark_bar": {
"children": [ ],
"date_added": "13250024635730629",
"date_modified": "13250024643208696",
"guid": "00000000-0000-4000-a000-000000000002",
"id": "1",
"name": "Bookmarks bar",
"type": "folder"
},
"other": {
"children": [ ],
"date_added": "13250024635730662",
"date_modified": "0",
"guid": "00000000-0000-4000-a000-000000000003",
"id": "2",
"name": "Other bookmarks",
"type": "folder"
},
"synced": {
"children": [ ],
"date_added": "13250024635730666",
"date_modified": "0",
"guid": "00000000-0000-4000-a000-000000000004",
"id": "3",
"name": "Mobile bookmarks",
"type": "folder"
}
},
"version": 1
}
s: {{
"checksum": "1e54fbb25d92a354f7aeaf576726429e"
}}
j: load/as s 'json
{
"location": "関西 ↓詳しいプロ↓"
}
s: read-clip
j: load/as s 'json
\uXXXX format chars.
\u escapes are just an alternative.
to-json just failed all tests. Eventually @dockimbel fixed it so I went back to an easier to read recursive version.
to-json. It is up to 2x slower on very small data (e.g. to-json none); however, it is up to 2x faster on bigger data (like some of the files in the test suite).
/pretty and /ascii refinements to to-json."
; Original
Time         | Time (Per)   | Memory    | Code
0:00:00.001  | 0:00:00.001  | 12288     | [load/as bf-nano 'json]        ; ~1K
0:00:00.714  | 0:00:00.714  | 24276068  | [load/as bf-sm 'json]          ; ~1.5M
0:00:02.439  | 0:00:02.439  | 118880976 | [load/as bf-lg-no-meta 'json]  ; ~5M
0:00:05.148  | 0:00:05.148  | 173835384 | [load/as bf-lg 'json]          ; ~10M
; New
Count: 1
Time         | Time (Per)   | Memory    | Code
0:00:00      | 0:00:00      | 12288     | [load/as bf-nano 'json]
0:00:00.157  | 0:00:00.157  | 24274472  | [load/as bf-sm 'json]
0:00:00.954  | 0:00:00.954  | 118866608 | [load/as bf-lg-no-meta 'json]
0:00:01.091  | 0:00:01.091  | 173821016 | [load/as bf-lg 'json]
profile
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/22:45:30+01:00
21-Jan-2021/22:45:46+01:00
Red 0.6.4 for Windows built 21-Jan-2021/13:10:58+01:00 commit #e0bb1d5
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/22:54:18+01:00
21-Jan-2021/22:54:40+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/22:54:50+01:00
21-Jan-2021/22:55:20+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/22:55:28+01:00
21-Jan-2021/22:55:59+01:00
dt as delta-time function:
>> dt [wait 0:0:0.11]
== 0:00:00.11713
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/23:02:43+01:00
21-Jan-2021/23:04:03+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/23:04:13+01:00
21-Jan-2021/23:05:38+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/23:05:43+01:00
21-Jan-2021/23:07:15+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/23:07:42+01:00
21-Jan-2021/23:09:17+01:00
>> probe now b: load/as %prj-bookmarks/bookmarks 'json probe now
21-Jan-2021/23:15:23+01:00
21-Jan-2021/23:16:58+01:00
delta-time)
>> loop 5 [probe dt [b: load/as %prj-bookmarks/bookmarks 'json]]
0:00:17.1552
0:00:22.8579
0:00:32.3664
0:00:32.0836
0:00:31.8588
recycle.
>> loop 5 [probe dt [b: load/as %prj-bookmarks/bookmarks 'json]]
0:00:18.7711
0:00:28.5016
0:00:41.1943
0:00:43.6485
0:00:42.7504
== 0:00:42.7504
>> recycle/off loop 5 [probe dt [b: load/as %prj-bookmarks/bookmarks 'json] recycle]
0:00:18.7421
0:00:18.4631
0:00:18.3451
0:00:18.306
0:00:18.3311
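The last run above (GC off, one manual `recycle` per load) could be wrapped in a small helper. This is only a sketch: `without-gc` is a made-up name, not something Red provides, and it assumes the body returns a value and does not raise an error (if it does, the collector stays off).

```red
Red []

; Hypothetical helper (not part of Red): evaluate BODY with the garbage
; collector paused, then switch it back on and run one collection.
without-gc: func [body [block!] /local result][
    recycle/off          ; pause automatic collection
    result: do body      ; run the expensive code
    recycle/on           ; re-enable automatic collection
    recycle              ; collect once, at a moment we chose
    result
]

; Usage sketch, reusing the file name from the log above:
; loop 5 [probe dt [b: without-gc [load/as %prj-bookmarks/bookmarks 'json]]]
```

Whether this helps in a real app depends on how much garbage builds up while the collector is paused.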
profile to see how much memory your file occupies
recycle manually when your app can breathe a bit.
!attributes and set it to a list of words that *are attributes*.
map! for attributes, which can't have repeating elements or deeper structures. We can also name it of course, but if each element is a block, the only time you have a map! is if there are attributes for it, and it's always the first value in the block. Don't get clever beyond that. If we name it, e.g. as @hiiamboris suggests, then each element is also a well-formed key-value block.
[name "value" name "value" unique-name "value"] so you can still use path access when addressing a unique element (without constant /1/ clutter), but with a dedicated mezz can collect every name's value.
extract btw: extract by key.
find or directly refer to something in a [name attr [..children..] uri] tree. By writing a tree traversal mezz.. and using callbacks?
get-node: func[dom path /flat /local tags node-name cont? result][
    ;probe path
    tags: parse path "/"
    node-name: last tags
    remove back tail tags
    foreach tag tags [
        cont?: false
        foreach node dom [
            ;probe node
            if all [
                block? node
                node/1 = tag
            ][
                dom: node/3
                cont?: true
                ;dom: third cont
                break
            ]
        ]
    ]
    result: copy []
    foreach node dom [
        if all [block? node node/1 = node-name][
            either flat [
                repend result node
            ][
                append/only result node
            ]
        ]
    ]
    result
]
get-node-content: func[dom path][
    third get-node/flat dom path
]
get-nodes: func[dom path /flat /local tags node-name cont? result][
    ;probe path
    tags: parse path "/"
    node-name: take/last tags
    foreach tag tags [
        cont?: false
        foreach node dom [
            ;probe node
            if all [
                block? node
                node/1 = tag
            ][
                dom: node/3
                cont?: true
                ;dom: third cont
                break
            ]
        ]
        unless cont? [return none]
    ]
    result: copy []
    foreach node dom [
        if all [block? node node/1 = node-name][
            either flat [
                repend result node
            ][
                append/only result node
            ]
        ]
    ]
    result
]
shpNode: first get-nodes symbol %DOMSymbolItem/timeline/DOMTimeline/layers/DOMLayer/frames/DOMFrame/elements/DOMShape
new-line markers when parsing, or normalize as we do JSON.
parse implementation, which is instantly accessible as code. On the other hand, we have a dialect and advanced system that lets you think in terms of the content (tree, children, etc.).
>> xml/decode {<a b="v"></a>}
== [a [] #(
"b" "v"
)]
>> xml/decode {<a><b>v</b></a>}
== [a [b [none "v" #()] #()] #()]
none, it's a bit more work.
>> parse-markup {<a b="v"></a>}
== [<a> #(
b: "v"
) </a> none]
>> parse-markup {<a><b>v</b></a>}
== [<a> #() <b> #() text "v" </b> none </a> none]
>> parse-markup read https://www.red-lang.org/
== [declaration <!DOCTYPE html> whitespace "^/" <html> #(
class: "v2"
dir: "ltr"
xmlns: "http://www.w3.org/1999/xh
parse-markup: :parse-markup/parse-markup :^)
Make just keeps asking me for the path to Topaz parse. Doesn't like anything I tell it apparently. But now I see that %compiled-rules.red was created.
Make.red doesn't show the error, just loops, but this is the problem.
*** Script Error: PARSE - invalid rule or usage of rule: keep/only
*** Where: do
*** Stack: do-file
topaz-parse?)
compiled-rules: #include %compiled-rules.red, rather than the prebuilt version, in the zip I DL'd. :^\
load/as.. json one converts json object to Red objects, the other to maps. Is there any option to convert to objects using load/as? And why has map been chosen in place of objects?
>> #(a: 1 A: 2)
== #(
a: 1
A: 2
)
>> object [a: 1 A: 2]
== make object! [
a: 2
]
object! instead of MAP!
>> make object! to block! #(map: 1)
== make object! [
map: 1
]
>> make object! to block! #(a: 1 A: 2)
== make object! [
a: 2
]
>> m: #(a: 1) m/b
== none
>> o: object [a: 1] o/b
*** Script Error: cannot access b in path o/b
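To make the difference above easier to live with, here is a rough sketch of an accessor that returns `none` for a missing word on both maps and objects. `get-field` is a hypothetical name, not an existing Red function, and it uses `make map!` rather than a literal so it doesn't depend on map literal syntax.

```red
Red []

; Hypothetical helper: look up WORD in an object or a map and return
; none when it is missing, instead of raising an error for objects.
get-field: func [obj [object! map!] word [word!]][
    either map? obj [
        select obj word                     ; maps already return none
    ][
        if in obj word [get in obj word]    ; IN returns none for unknown words
    ]
]

m: make map! [a: 1]
o: object [a: 1]
probe get-field m 'b    ; none
probe get-field o 'b    ; none, no script error
probe get-field o 'a    ; 1
```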
[arg1: value arg2: value]
Object
>> blk: [a: 1 b: 2]
== [a: 1 b: 2]
>> blk/b
== 2
find/select
>> blk: [a: 1 b: 2]
== [a: 1 b: 2]
>> find blk 'b
== [b: 2]
>> find blk quote b:
== [b: 2]
find/select just words or set-words in Red?
find/case.
>> blk: [a: 22 C: b b: 3]
== [a: 22 C: b b: 3]
>> w: 'b
== b
>> blk/(w)
== b:
>>
arguments-block: [a: 50 b: 22]
;--- This is the checks block ---
checks: make object! [
    a: func [arg [number!]] [if arg > 10 [raise-error]]
    b: func [arg [number!]] [if arg < 2 [raise-error]]
]
f: func [args /local checks] [
    foreach wrd words-of checks [
        checks/(wrd) args/(wrd)
    ]
]
f arguments-block
;Alternate solution
f: func [args /local ff checks] [
    foreach wrd words-of checks [
        checks/(wrd) find/tail/case args wrd
    ]
]
/skip 2 or /case. Tradeoffs.
/case can't be used and you have to drop words the use of words as set-word arguments !!!!!
word/[options]/path/path I will experiment with it making a custom operator to evaluate the path.
set-words and words in find and paths [here](https://github.com/red/red/wiki/%5BDOC%5D-Differences-between-Red-and-Rebol)
maps while arrays are converted to blocks. Is it right?
>> xx: ["a" %myfile.r 'a /hello true false none 1 3x2 make object! [c: 22 d: 33]]
== ["a" %myfile.r 'a /hello true false none 1 3x2 make object! [c: 22 d: 33]]
>> load/as save/as "" reduce xx 'json 'json
== ["a" "myfile.r" "a" "/hello" true false none 1 "3x2" #(
c: 22
d: 33
)]
from-XML-date: func [val [string!]] [to-date replace val "T" "/"]
to-XML-date: func [val [date!]] [to-ISO8601-date/T/no-zone val]
MSProject-type-map: [
    ; XML type    REBOL type    conversion funcs
    "integer"     integer!      [to-REBOL to-integer         to-XML form]
    "string"      string!       [to-REBOL to-string          to-XML form]
    "boolean"     logic!        [to-REBOL from-XML-boolean   to-XML to-XML-boolean]
    "float"       decimal!      [to-REBOL to-decimal         to-XML form]
    "decimal"     decimal!      [to-REBOL to-decimal         to-XML form]
    "time"        time!         [to-REBOL to-time            to-XML form]
    "duration"    time!         [to-REBOL from-XML-duration  to-XML to-XML-duration]
    "dateTime"    date!         [to-REBOL from-XML-date      to-XML to-XML-date]
    ; other name means it's an object type
]
"TimephasedDataType" [
"Type" "integer"
"UID" "integer"
"Start" "dateTime"
"Finish" "dateTime"
"Unit" "integer"
"Value" "string" ; may contain a duration value that requires parsing
]
foreach to iterate MSProject-type-map and then a path on the third element to select the appropriate conversion function?
load it.
load-json. Looking at https://github.com/greggirwin/red-json/commit/d034df6268f81ad7b1dbc84e595ac51d055f141d#diff-7a2689d7c2ae95425eac90483c2810220659bf626bd012d97fe409a6d62dd2d7 I'm going to blame Gregg :P
xpath API.
node!](https://github.com/red/REP/issues/57) thoughts), and integration with iterators for imperative use. It would also be nice if we can make things sensible WRT other features in Red. e.g. the new split may be able to partition by function(s), so it overlaps with HOFs and DOM filters/selectors in some ways.
find, select, append won't change in this regard. At least not anytime soon. I started an HOF project, and @hiiamboris took some ideas to the next level, but this will be new mezzanines for the most part.
dns:// built in as well. I actually used it quite a bit as a debugging tool, making sure names went where I thought they did. Handy for creating server dashboards as well.
dns:// too, what I meant is that it should be implemented at R/S level using OS calls instead of writing a complete DNS resolver
split and format. Fortunately, there's a clear separation with codecs. The parser will be done with parse, and produce a data structure. The interface to that, and the implementation to support it, is the second part. They meet at the data structure, but are otherwise not connected. The data structure is also important, because it doesn't have to come from a parsed markup document.
input: {
<tag-1 attr-1="t-1">
<tag-2>
<a attr-a="aa-1">aaaa</a>
<b>bbbb</b>
cccccc
</tag-2>
</tag-1>
}
xml: [
tag-1 [
#attr-1 [! "t-1"]
tag-2 [
! "cccccc"
a [
! "aaaa"
#attr-a [! "aa-1"]
]
b [! "bbbb"]
]
]
]
xml/tag-1/#attr-1/! ;) "t-1"
xml/tag-1/tag-2/! ;) "cccccc"
xml/tag-1/tag-2/a/#attr-a/!  ;) "aa-1"
/! implies we want content, not a branch, and short enough not to be annoying
! "" for each value (for readability)
xml: [
    tag-1 [
        #attr-1 "t-1"
        tag-2 [
            ! "cccccc"
            a [
                ! "aaaa"
                #attr-a "aa-1"
            ]
            b [! "bbbb"]
        ]
    ]
]
/! here is for branches only, attrs do not require it
xml/tag-1/#attr-1  ;) "t-1"
xml/tag-1/tag-2/!  ;) "cccccc"
xml/tag-1/tag-2/a/#attr-a  ;) "aa-1"
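Path access on this key/value layout only reaches the first match, so here is a rough sketch of the "dedicated mezz" idea: collect every value stored under a name, at any depth. `collect-all` and `sample` are hypothetical names, and the sketch assumes the tree is strictly key/value pairs as in the `xml` block above. It compares with `==` so a tag `a` and an attribute `#a` are never confused.

```red
Red []

; Hypothetical helper: gather every value stored under NAME at any depth
; of a key/value tree shaped like the `xml` block above.
collect-all: function [tree [block!] name [word! issue!]][
    out: copy []
    foreach [key value] tree [
        if key == name [append/only out value]                ; strict: same type and spelling
        if block? value [append out collect-all value name]   ; descend into branches
    ]
    out
]

sample: [
    tag-1 [
        #attr-1 "t-1"
        tag-2 [
            ! "cccccc"
            a [! "aaaa" #attr-a "aa-1"]
            b [! "bbbb"]
        ]
    ]
]
probe collect-all sample #attr-a
probe collect-all sample 'b
```

It scans the whole tree on every call, which is fine for small documents but is the cost of a `//name`-style lookup.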
! is interesting. Do you think multiple text fields should be combined into one, or appear multiple times in the output? And if each one appears separately, would they use the same or a different marker?
ref! also be feasible for that, or is issue! superior for this? I guess issue! seems like it might be closer to the constraints of xml attributes...
item [[first...] [second...]]
! content between other child elements any number of times.
... ?>) and stuff like CDATA ().
?> are so called processing information, they are used for ... hm ... processing.
 are basically some metadata.
 is basically DTD embedded
 is instructions for specific programs:
 appearing in element/attribute names.
/meta refinement that will include processing instructions and other metadata into the result. These metadata have the same structure as normal XML data (I use tag-name content attributes format, but that's easy to change if we decide to go with another one), there's one difference, meta names use issue! instead of word! as normal tags.
xml.red right now (the older version) and it's completely impossible to work with (just saying)
"^/^-^-^-^-" represented as *content* is very harmful too ;)
xml.red was made to parse both xml and html. It wasn’t a good idea, so I made separate html-codec.red that is now in castr and started working on xml2.red that I’m improving currently.
true :)
true
sift](https://gitlab.com/hiiamboris/red-mezz-warehouse/-/blob/master/sift-locate.md) will be of help given a tree iterator? or if something we could add to it for these use cases?
probe load-xml {<?xml version="1.0" encoding="UTF-8" ?>
<symbols numberSystem="arab">
<decimal>٫</decimal>
<group>٬</group>
<list>؛</list>
<percentSign>٪</percentSign>
<plusSign>+</plusSign>
<minusSign>-</minusSign>
<approximatelySign>~</approximatelySign>
<exponential>اس</exponential>
<superscriptingExponent>×</superscriptingExponent>
<perMille>؉</perMille>
<infinity>∞</infinity>
<nan>NaN</nan>
<timeSeparator>:</timeSeparator>
</symbols>
}
[symbols [decimal [] #() group [] #() list [] #() percentSign [] #() plusSign [] #() minusSign [] #() approximatelySign [] #() exponential [] #() superscriptingExponent [] #() perMille [] #()
infinity [] #() nan [] #() timeSeparator [] #()] #(
numberSystem: "arab"
)]
<?xml version="1.0" encoding="UTF-8" ?>
<cyclicNameSet type="dayParts">
    <cyclicNameContext type="format">
        <cyclicNameWidth type="abbreviated">
            <cyclicName type="1">zi</cyclicName>
            <cyclicName type="2">chou</cyclicName>
            <cyclicName type="3">yin</cyclicName>
            <cyclicName type="4">mao</cyclicName>
            <cyclicName type="5">chen</cyclicName>
            <cyclicName type="6">si</cyclicName>
            <cyclicName type="7">wu</cyclicName>
            <cyclicName type="8">wei</cyclicName>
            <cyclicName type="9">shen</cyclicName>
            <cyclicName type="10">you</cyclicName>
            <cyclicName type="11">xu</cyclicName>
            <cyclicName type="12">hai</cyclicName>
        </cyclicNameWidth>
        <cyclicNameWidth type="narrow">
            <alias source="locale" path="../cyclicNameWidth[@type='abbreviated']"/>
        </cyclicNameWidth>
        <cyclicNameWidth type="wide">
            <alias source="locale" path="../cyclicNameWidth[@type='abbreviated']"/>
        </cyclicNameWidth>
    </cyclicNameContext>
</cyclicNameSet>
EmptyElemTag: [ #"<" copy value Name any [S Attribute] S? "/>" (value?: cont-val: copy "") push-stack pop-stack ]
>unxml.exe 1.xml
ldml [
cyclicNameSet [
#type "dayParts"
cyclicNameContext [
#type "format"
cyclicNameWidth [
#type "abbreviated"
cyclicName [#type "1" ! "zi"]
cyclicName [#type "2" ! "chou"]
cyclicName [#type "3" ! "yin"]
cyclicName [#type "4" ! "mao"]
cyclicName [#type "5" ! "chen"]
cyclicName [#type "6" ! "si"]
cyclicName [#type "7" ! "wu"]
cyclicName [#type "8" ! "wei"]
cyclicName [#type "9" ! "shen"]
cyclicName [#type "10" ! "you"]
cyclicName [#type "11" ! "xu"]
cyclicName [#type "12" ! "hai"]
]
cyclicNameWidth [
#type "narrow"
alias [#source "locale" #path "../cyclicNameWidth[@type='abbreviated']"]
]
cyclicNameWidth [
#type "wide"
alias [#source "locale" #path "../cyclicNameWidth[@type='abbreviated']"]
]
]
]
]
STag: [#"<" copy value Name any [S Attribute] S? #">" push-stack]
<attributeValues
elements='dateFormatLength timeFormatLength dateTimeFormatLength decimalFormatLength scientificFormatLength percentFormatLength currencyFormatLength'
attributes='type' order='given'>full long medium short</attributeValues>
<numberingSystem id="adlm" type="numeric" digits="𞥐𞥑𞥒𞥓𞥔𞥕𞥖𞥗𞥘𞥙"/>
<numberingSystem id="ahom" type="numeric" digits="𑜰𑜱𑜲𑜳𑜴𑜵𑜶𑜷𑜸𑜹"/>
<numberingSystem id="arab" type="numeric" digits="٠١٢٣٤٥٦٧٨٩"/>
<numberingSystem id="arabext" type="numeric" digits="۰۱۲۳۴۵۶۷۸۹"/>
to integer! to issue! stuff.
unxml, but maybe RexML (pronounced Wrecks-ML) would offer a small, subliminal commentary. ;^)
xml2.red for a reason, until it is on par with xml.red.
S? is opt and S* is any so they shouldn’t be aliases. But maybe using S* everywhere where the specs have S? would be a good idea.
opt any rule? any is already optional
opt any rule? If you look at the current version, it says : S?: [opt S] S*: [any S].
S* everywhere and forget about it.
S* (or whatever name you choose for the whitespace rule) basically everywhere.
opt S is deeply wrong because it makes spaces significant. One space - syntax works, 2 spaces - it's broken :)
any S
S* and we'll see.
S and S? have been replaced with S*. @hiiamboris can you please test it with the problematic data?
/pretty for more readable output.
pretty is not used anywhere in Red yet, so let's give it a few thoughts, and as I know @greggirwin, he tends to be very careful in introducing new naming schemas :-)
path!. This has one downside, in some cases you won't be able to access values using path notation. However, the path notation isn't that useful anyway, let's for example look at this:
<menu>
<item id="1">one</item>
<item id="2">two</item>
</menu>
menu/item would return only the first item anyway, so some XPath-like dialect is certainly a much better solution to access values.
map! where you obviously cannot use path! as a key.
sift data/menu [/item] is all it takes to extract those items
sift seems to be used to filter a flat data structure, while xpath is sort of like filesystem globbing + predicate filtering, so it's tuned for a tree structure.
sift is good at walking trees like that, or if it would make sense to have a sibling method that would handle like a path with patterns at different layers. I still want to spend some more time looking at them, but haven't been able to yet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" this defines a 'prefix' xsl which corresponds to the http... namespace
xsl:output it is saying that the output element is in the xsl transform namespace specified above
p1, p1, etc
foo:bar is a valid url! type, but maybe not the best thing to work with in paths
>> test: [element1: [@ "ns-prefix" ! "text" element2: ["text2"]]]
>> test/element1/@
== "ns-prefix"
@ and ! it starts to look like an alphabet soup, Perl or Regex. If we’re going to use this format I wonder if using more verbose words, like namespace and content, would be better. They would take the same amount of memory but would directly imply what they represent.
! is. And with the introduction of another sigil, I’m getting lost completely.
item in my menu example above), path access is useless.
/#xmlns accessor? I think valid xml elements can't start with xml
tree/ldml/numbers/currencyformats/currencyformatlength/currencyformat/pattern/count? It occupies the whole editor line, leaving no space for functions or anything.
[#()]. I don't think we need to be afraid of the other datatypes, if it ends up looking clear enough
/tree//count. We need a similar mechanism.
count that are children or (grand)*children of tree. So yes, it needs to scan the whole tree.
! as a content accessor. It's not Reddish in that we *try* to use words when we can. ! also already has a strong meaning as a datatype suffix, so has to be learned as not type related in this context. If it does appear in the data structure as a key, that softens the blow. @hiiamboris mentioned that the lexer would need to be updated for one of his ideas. We should not do anything that requires such changes, unless they are lexer issues to begin with.
path!, select/only can be used for easy access
block! instead of map!, that way attribute names with namespace can be converted to path! also
block!. I believe it’s a small price to pay.
[
"menu" [none "234567^/ " none
"item" "one" #(
"id" "1"
) none "123456^/ " none
"item" "two" #(
"id" "2"
)
] #()
]
[
menu [
! "234567^/"
item [#id "1" ! "one"]
! "123456^/"
item [#id "2" ! "two"]
! "012345^/"
]
]
! sigil to get data from elements (we can of course use none like you're doing today, but I don't think none is any better in readability and usability)
xml2.red all values were disappearing):
<?xml version="1.0" encoding="UTF-8" ?>
<menu>
234567
<item id="1">one</item>
123456
<item id="2">two</item>
012345
</menu>
! is really confusing.
none isn't?
content: instead of ! and namespace: so tag names won’t have to be path!s if you don’t like them?
issue! for attribute names, they can’t have a namespace.
namespace [ #x "1" #y "1" ]
#namespace/x "1" #namespace/y "1"
b/x/y more Reddish than b/('x/y):
>> b: [x/y 1]
== [x/y 1]
>> b/('x/y)
== 1
<ns1:a>1</ns1:a><ns2:a>2</ns2:a><ns1:a>3</ns1:a><ns2:a>4</ns2:a>
ns1 [ a 1 a 3 ] ns2 [ a 2 a 4 ]
! with something verbose and sane.
@text, @ns? Could also use that for things like @cdata, @comment if those are wanted
<xml xmlns:ns1="foo"><ns1:bar/></xml>
<xml xmlns:ns2="foo"><ns2:bar/></xml>
xml/ns1/bar or xml/ns2/bar. It's possible that you would need to differentiate via namespace anyway, but in my experience, it's almost always fine to just ignore it
ref! is a string type. Since attributes are used throughout a document, the same names may appear a *lot*. In that case, word types should offer a significant benefit; as long as we accept word syntax limitations, which I think was an earlier conversation.
any-word! type instead of any-string!. I should have a prototype of the alternative output today, so we can play with it then
@text is a win over !text or just !. I'm just *okay* with all these variants.
@ref yes
! is used in XML for comments and CDATA, which may make it confusing if we repurpose it. There will be a lot of people coming from that side, rather than the Red side, and they may think it's used as XML would.
"html:a" would end up as a /html. I know it looks strange backwards, but it uses a special type for the namespace that is also word-based, so more efficient than a string. And it doesn’t need some special markup such as @ or namespace:.
a in path and yes, it would ruin the ordering in case we want to have key name first.
tag-name #namespace /attribute "att value" "content"
>> print read %data/namespace1.xml
<?xml version="1.1"?>
<html:html xmlns:html='http://www.w3.org/1999/xhtml'>
<html:head><html:title>Frobnostication</html:title></html:head>
<html:body><html:p>Moved to
<html:a href='http://frob.example.com'>here.</html:a>
</html:p>
<html:p>Done by
<html:a href='http://examle.com'>example.com</html:a>
</html:body>
</html:html>
>> print mold load-xml %data/namespace1.xml
[
head /html [
title /html "Frobnostication"
]
body /html [
p /html ["Moved to ^/ "
a /html #href "http://frob.example.com" "here."
]
p /html ["Done by ^/ "
a /html #href "http://examle.com" "example.com"
]
]
]
/html head would work though
body/p/a won’t return a’s value. And of course it’s usable for first a only, in case there would be multiple tags.
a is fine for the majority of use cases
a [..] like I proposed originally
text:?
tree/ldml/dates/calendars/calendar/gregorian/months/monthcontext/stand-alone/monthwidth/abbreviated/month
paren! for attributes, that way I don’t need some markup to make it obvious it’s not content.
issue! for namespaces and refinement! for attribute names as it looks more logical this way. It also puts attributes after the content so path access works (for first item only):
[
#html head [
#html title "Frobnostication"
]
#html body [
#html p ["Moved to ^/ "
#html a "here." /href "http://frob.example.com"
]
#html p ["Done by ^/ "
#html a "example.com" /href "http://examle.com"
]
]
]
body/p/(/href) to get the attribute contents.
body has children, so the content is block!. a doesn’t have children, so the content is string!
/tag-namespace tag-name "content" /att-namespace #att-name "att-value"
xml/output-type. Supported values are wbm (word, block, map - I’ll probably remove it as it’s too limited), wbb (word, block, block) and boris (I wonder why it’s called that).
months [
monthContext [
format [
monthWidth [
abbreviated [
alias [source "locale" path "../monthWidth[@type='wide']"]
]
narrow [
alias [source "locale" path {../../monthContext[@type='stand-alone']/monthWidth[@type='narrow']}]
]
wide [
month [
1 "M01"
2 "M02"
3 "M03"
4 "M04"
5 "M05"
6 "M06"
7 "M07"
8 "M08"
9 "M09"
10 "M10"
11 "M11"
12 "M12"
]
]
]
]
stand-alone [
monthWidth [
abbreviated [
alias [source "locale" path {../../monthContext[@type='format']/monthWidth[@type='abbreviated']}]
]
narrow [
month [
1 "1"
2 "2"
3 "3"
4 "4"
5 "5"
6 "6"
7 "7"
8 "8"
9 "9"
10 "10"
11 "11"
12 "12"
]
]
wide [
alias [source "locale" path {../../monthContext[@type='format']/monthWidth[@type='wide']}]
]
]
]
]
]
stand-alone/abbreviated links to format/abbreviated which links to /format/wide
alias in the specs so it must be their own invention which I believe they are very proud of.
<?xml version="1.0" encoding="UTF-8" ?>
<identity>
<version number="$Revision$"/>
<language type="root"/>
</identity>
[
identity [
version ""
#number "$Revision$"
"" language
"" #type
"root"
]
]
none would be better in such case?
<?xml version="1.0" encoding="UTF-8" ?>
<identity>
<version number="$Revision$">1.0</version>
<language type="root"/>
</identity>
version tag.
identity [
version [#number "$Revision$" ! "1.0"]
language [#type "root"]
]
none instead of the empty string!.
! is vastly preferable to numeric index or none.
here -> a "here". It's not gonna work.
list [a "here" a [#type "b" @bolek "there"]]
list [a [@bolek "here"] a [#type "b" @bolek "there"]]
! in it ;-)
!
for-each [('a) b [block!]] list [..] anymore
list [a ["here"] a ["there" #type "b"]]
! is it.
wbb because its structure is word-block-block. The other emitter has more free form structure, so I can’t choose some similar simple name. As always, I’m open to suggestions.
Boris emitter a few days before you coded it :D
types [
type [#key "calendar" #type "buddhist" ! "буддийский календарь"]
type [#key "calendar" #type "chinese" ! "китайский календарь"]
type [#key "calendar" #type "coptic" ! "Коптский календарь"]
type [#key "calendar" #type "dangi" ! "календарь данги"]
type [#key "calendar" #type "ethiopic" ! "эфиопский календарь"]
type [#key "calendar" #type "ethiopic-amete-alem" ! "Эфиопский календарь "Амете Алем""]
type [#key "calendar" #type "gregorian" ! "григорианский календарь"]
type [#key "calendar" #type "hebrew" ! "еврейский календарь"]
type [#key "calendar" #type "indian" ! "Национальный календарь Индии"]
type [#key "calendar" #type "islamic" ! "исламский календарь"]
type [#key "calendar" #type "islamic-civil" ! "Исламский гражданский календарь"]
type [#key "calendar" #type "islamic-rgsa" #draft "contributed" ! "исламский календарь (Саудовская Аравия)"]
type [#key "calendar" #type "islamic-tbla" #draft "contributed" ! {исламский календарь (табличный, астрономическая эпоха)}]
type [#key "calendar" #type "islamic-umalqura" #draft "contributed" ! "исламский календарь (Умм аль-Кура)"]
type [#key "calendar" #type "iso8601" ! "календарь ISO-8601"]
type [#key "calendar" #type "japanese" ! "японский календарь"]
type [#key "calendar" #type "persian" ! "персидский календарь"]
type [#key "calendar" #type "roc" ! "календарь Миньго"]
type [#key "cf" #type "account" ! "финансовый формат"]
type [#key "cf" #type "standard" ! "стандартный формат"]
type [#key "colAlternate" #type "non-ignorable" #draft "contributed" ! "Сортировка символов"]
type [#key "colAlternate" #type "shifted" #draft "contributed" ! "Сортировка без учета символов"]
(...)
prettify experiment of mine. It was written for *code*. But now that I think of it, maybe it can work with data separately too.
trace now
prettify/data
select map 'column and it has not been used pick map 'column?
PICK - Returns the series value at a given index.
[
cyclicNameSet [#type "dayParts"
cyclicNameContext [#type "format"
cyclicNameWidth [#type "abbreviated"
cyclicName [#type "1" ! "zi"]
cyclicName [#type "2" ! "chou"]
cyclicName [#type "3" ! "yin"]
cyclicName [#type "4" ! "mao"]
cyclicName [#type "5" ! "chen"]
cyclicName [#type "6" ! "si"]
cyclicName [#type "7" ! "wu"]
cyclicName [#type "8" ! "wei"]
cyclicName [#type "9" ! "shen"]
cyclicName [#type "10" ! "you"]
cyclicName [#type "11" ! "xu"]
cyclicName [#type "12" ! "hai"]
]
cyclicNameWidth [#type "narrow"
alias [#path "../cyclicNameWidth[@type='abbreviated']" ! ""]
]
cyclicNameWidth [#type "wide"
alias [#path "../cyclicNameWidth[@type='abbreviated']" ! ""]
]
]
]
]
! "" though I guess.
index? find words-of table column
/meta to get them anyway.
@various-metadata so they can be distinguished by type.
!, what about text. It's always going to be text, because we're parsing it from XML text, and then if we have a semi-mapped loader there could be data like GUI elements have for the duality of text and data content. ! is just not worth it, long term.
editor in R2. Maybe seems a bit heavy, but if it works across data structures it packs a lot of punch.
a (viewable mapping) is also nice, but may also be best served by a diff-like viewer. Maybe the XML module includes all these things, because if you need one, you probably don't care about another 50K in your app. ;^)
c and d, which is all about how we use and access the data structure. parse-based access may be the way to go.
parse-doc which is a dialected func where you can define actions for elements by tag name, ..."
text
text is that it can be used as a tag name and then you wouldn’t be able to distinguish if it’s a tag name or a keyword.
<book author="J.R.R.Tolkien" name="Return of the King"/>
<book author="J.R.R.Tolkien"> <name>Return of the King</name> </book>
<book>
<author>J.R.R.Tolkien
<name>Return of the King</name>
</author>
</book>
<book> <author>J.R.R.Tolkien</author> <name>Return of the King</name> </book>
[
identity [
version [#number "$Revision$"] ! ""
language [#type "root"]
]
]
<?xml version="1.0" encoding="UTF-8" ?>
<identity>
    <version number="$Revision$"/>
    <language type="root"/>
</identity>
select on maps returns just the value, is there a command to have the values together with the key?
>> m: make map! [a: 22 b: 33 d: 44]
== #(
a: 22
b: 33
d: 44
)
>> select m 'a
== 22
>> select-col m 'a
== #(a: 22)
>> select-col m [a b]
== #(a: 22 b: 33)
select-col: func[m [map!] key][
    make map! reduce [key select m key]
]
select a, b from table where table has columns a, b, c, d
select-col: func[m [map!] key][
    unless block? key [key: to block! key]
    collect [
        forall key [
            keep key/1
            keep select m key/1
        ]
    ]
]
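The second `select-col` above returns a flat block, while the console examples show maps. A rough sketch that accepts one key or a block of keys and always gives back a `map!` could look like this. `select-cols` is a hypothetical name, and it assumes a key whose value is `none` can be treated as absent.

```red
Red []

; Hypothetical variant of select-col: accepts a single key or a block of
; keys and always returns a map!, skipping keys that M does not hold.
select-cols: function [m [map!] keys [word! block!]][
    out: make map! []
    foreach key compose [(keys)] [      ; a lone word becomes a one-item block
        unless none? select m key [put out key select m key]
    ]
    out
]

m: make map! [a: 22 b: 33 d: 44]
probe select-cols m 'a
probe select-cols m [a b]
```

As noted just below, building many small maps has a cost, so for hot paths the block-returning version may still be the better fit.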
#(a: 1)), internally the construction of maps is quite complex and you should avoid using too many small maps. At least until maps are treated as simple blocks internally for a small number of keys (not building hash tables).
[
calendar [#type "buddhist"
months [
alias [#path "../../calendar[@type='gregorian']/months"]
]
days [
alias [#path "../../calendar[@type='gregorian']/days"]
]
]
]
<?xml version="1.0" encoding="UTF-8" ?>
<calendar type="buddhist">
<months>
<alias source="locale" path="../../calendar[@type='gregorian']/months"/>
</months>
<days>
<alias source="locale" path="../../calendar[@type='gregorian']/days"/>
</days>
</calendar>
remove-words/keep-words. Along with spec block support funcs. Useful for sanitizing and minimizing objects, making from templates, etc.
[tag attr value] or [tag value attr]. I lean toward the latter for path use. But if 90% of access is via our API to it and HOFs, it really doesn't matter.
! though. I remember that much. :^)
! is that it's short, and it's a word which can't be the name of an xml element or attribute. We could as well use a different data type, or other word which contains a character that xml won't allow. text! would be another option, or @text was also suggested (for example, if we wanted to have a common data type for various xml structural parts). I would be fine with any of these, really. The first time you deserialize an xml, it should be pretty clear what they are for. Being a little more verbose might help with other people reading the source if they aren't familiar with it, or non-reducers
<?xml version="1.0" encoding="UTF-8" ?>
<currencyFormats numberSystem="latn">
<currencySpacing>
<beforeCurrency>
<currencyMatch>[[:^S:]&amp;[:^Z:]]</currencyMatch>
<surroundingMatch>[:digit:]</surroundingMatch>
<insertBetween> </insertBetween>
</beforeCurrency>
<afterCurrency>
<currencyMatch>[[:^S:]&amp;[:^Z:]]</currencyMatch>
<surroundingMatch>[:digit:]</surroundingMatch>
<insertBetween> </insertBetween>
</afterCurrency>
</currencySpacing>
<currencyFormatLength>
<currencyFormat type="standard">
<pattern>¤ #,##0.00</pattern>
</currencyFormat>
<currencyFormat type="accounting">
<alias source="locale" path="../currencyFormat[@type='standard']"/>
</currencyFormat>
</currencyFormatLength>
<currencyFormatLength type="short">
<currencyFormat type="standard">
<pattern type="1000" count="other">¤ 0K</pattern>
<pattern type="10000" count="other">¤ 00K</pattern>
<pattern type="100000" count="other">¤ 000K</pattern>
<pattern type="1000000" count="other">¤ 0M</pattern>
<pattern type="10000000" count="other">¤ 00M</pattern>
<pattern type="100000000" count="other">¤ 000M</pattern>
</currencyFormat>
</currencyFormatLength>
<unitPattern count="other">{0} {1}</unitPattern>
</currencyFormats>
boris format:
[
currencyFormats [#numberSystem "latn"
currencySpacing [
beforeCurrency [
currencyMatch [! "[[:^^S:]&[:^^Z:]]"]
surroundingMatch [! "[:digit:]"]
insertBetween [! " "]
]
afterCurrency [
currencyMatch [! "[[:^^S:]&[:^^Z:]]"]
surroundingMatch [! "[:digit:]"]
insertBetween [! " "]
]
]
currencyFormatLength [
currencyFormat [#type "standard"
pattern [! "¤ #,##0.00"]
]
currencyFormat [#type "accounting"
alias [#path "../currencyFormat[@type='standard']"]
]
]
currencyFormatLength [#type "short"
currencyFormat [#type "standard"
pattern [#type "1000" #count "other" ! "¤ 0K"]
pattern [#type "10000" #count "other" ! "¤ 00K"]
pattern [#type "100000" #count "other" ! "¤ 000K"]
pattern [#type "1000000" #count "other" ! "¤ 0M"]
pattern [#type "10000000" #count "other" ! "¤ 00M"]
pattern [#type "100000000" #count "other" ! "¤ 000M"]
]
]
unitPattern [#count "other" ! "{0} {1}"]
]
]
word-block-block format:
[
currencyFormats [
currencySpacing [
beforeCurrency [
currencyMatch "[[:^^S:]&[:^^Z:]]" []
surroundingMatch "[:digit:]" []
insertBetween " " []
] []
afterCurrency [
currencyMatch "[[:^^S:]&[:^^Z:]]" []
surroundingMatch "[:digit:]" []
insertBetween " " []
] []
] []
currencyFormatLength [
currencyFormat [
pattern "¤ #,##0.00" []
] [type "standard"]
currencyFormat [
alias "" [path "../currencyFormat[@type='standard']"]
] [type "accounting"]
] []
currencyFormatLength [
currencyFormat [
pattern "¤ 0K" [type "1000" count "other"]
pattern "¤ 00K" [type "10000" count "other"]
pattern "¤ 000K" [type "100000" count "other"]
pattern "¤ 0M" [type "1000000" count "other"]
pattern "¤ 00M" [type "10000000" count "other"]
pattern "¤ 000M" [type "100000000" count "other"]
] [type "standard"]
] [type "short"]
unitPattern "{0} {1}" [count "other"]
] [numberSystem "latn"]
]
source locale missing is a known issue, correct?
word block block between these two. If we do end up with a special key, I bang my design gavel in favor of a decorated word, like text! or .text rather than !. If we *don't* have a special key, we can still define a named value to use in paths, e.g. a/b/c/:XTEXT, but that's just another tradeoff.
none rather than an empty string, but not sure it's worth it, or how common they are. My guess is not, so the efficiency win isn't there. Attrs can be none, rather than an empty block. It does mean users have to create the block if they want to add them, but that's OK.
boris format doesn't work for HTML (due to [ordering](https://gitter.im/red/codecs?at=6197ff60197fa95a1c7e37e4)), do we want two different formats? I say No. Technically, XML is ordered as well, isn't it? Though many cases will work fine without preserving it. I haven't gotten back to the spec, sorry.
boris format breaks it. Is there an example?
parse against internally and provide a nice set of APIs, but (I hope) also imperatively or with HOFs when the data is simple and predictable.
wbb one is a bit harder for me to follow.
>> xml/output-type: 'wbb
>> xml/decode read %data/simple.html
== [
body [none "some " none
b "bold" [] none "and " none
i "italic" []
] []
]
>> xml/output-type: 'boris
>> xml/decode %data/simple.html
== [
body [! "some"
b [! "bold"] ! "and"
i [! "italic"]
]
]
[
currencyFormats [
; .text "" ;?? Omit empty .text values?
.attr [numberSystem "latn"] ;?? Can .attrs always safely group to the top?
currencySpacing [
beforeCurrency [
; Always include .attr ?
currencyMatch [.text "[[:^^S:]&[:^^Z:]]" .attr []]
surroundingMatch [.text "[:digit:]" .attr []]
insertBetween [.text " " .attr []]
]
afterCurrency [
; Omit empty .attrs ?
currencyMatch [.text "[[:^^S:]&[:^^Z:]]"]
surroundingMatch [.text "[:digit:]"]
insertBetween [.text " "]
]
]
currencyFormatLength [
currencyFormat [
.attr [type "standard"]
pattern [.text "¤ #,##0.00"]
]
currencyFormat [
.text ""
.attr [type "accounting"]
alias [
.text none ;?? to distinguish auto-closed tags? Probably not.
.attr [path "../currencyFormat[@type='standard']"]
]
]
]
currencyFormatLength [
.attr [type "short"]
currencyFormat [
; Can .attrs always safely group to the top with .text?
; Should they always come first?
.attr [type "standard"]
.text ""
pattern [.text "¤ 0K" .attr [type "1000" count "other"]]
pattern [.text "¤ 00K" .attr [type "10000" count "other"]]
pattern [.text "¤ 000K" .attr [type "100000" count "other"]]
pattern [.text "¤ 0M" .attr [type "1000000" count "other"]]
pattern [.text "¤ 00M" .attr [type "10000000" count "other"]]
pattern [.text "¤ 000M" .attr [type "100000000" count "other"]]
]
]
unitPattern [.text "{0} {1}" .attr [count "other"]]
]
]
boris, but attrs are in a keyed block.
cdata (if differentiating from text), comment, and other weird things like processing instructions
none for no value, value for just one and [value1 value2] for multiple ones. But I can understand you don't want to have code full of [] or "" ..../.attr is preferable to /3 magic number, though it doesn't leverage type richness, and the biggest downside after extra verbosity is that .attr [] will always have to be present; you can't omit it, or your paths b/.attr/x will error out on a missing .attr in a scenario like this:
a [.attr [x "1"] .text "1"]
b [.text "2"]
c [.attr [x "3"] .text "3"]
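One way around the failing path, sketched here under the assumption that nodes are plain key/value blocks as in the scenario above. `attr-of` is a hypothetical helper name, not part of any codec.

```red
Red []

; Hypothetical helper: returns the attribute value, or none when the
; node has no .attr block or the attribute itself is missing.
attr-of: func [node [block!] name [word!] /local attrs][
    all [
        attrs: select node '.attr
        select attrs name
    ]
]

a: [.attr [x "1"] .text "1"]
b: [.text "2"]

probe attr-of a 'x    ; the value stored under x
probe attr-of b 'x    ; none, instead of a path error
```

This keeps `.attr` optional in the data at the cost of going through a function instead of a bare path.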
;; select a subtree
srcdays: cull/fetch 'tree/ldml/dates/calendars/calendar/(#type = "gregorian")/days
;; extract the days data
for-each [('daycontext) b] srcdays [
type: select [format format stand-alone standalone] to word! b/#type
cal/:type/days: days: copy #()
for-each [('daywidth) b2 [block!]] b [
width: to word! b2/#type
key: select [narrow char abbreviated abbr wide full short short] width
days/:key: m: copy #()
for-each [('day) b3] b2 [put m to word! b3/#type b3/!]
]
]
;; some days may be inherited from the other type, so I reflect both sides to one another for completeness
foreach w [char abbr full short] [
unless format/days/:w [format/days/:w: standalone/days/:w]
unless standalone/days/:w [standalone/days/:w: format/days/:w]
]
cull is a helper I made to filter out things from the tree, cull/fetch returns only the /days leaf rather than the whole tree. I don't like it and it has big design issues, and is only occasionally helpful, but it helps me avoid the fourth for-each loop here. Better ideas how to handle this are welcome.
morph could have been just the tool for it, but it's too early a prototype to be convenient
tag content attributes model you can traverse the tree easily as there are always three values. With your approach, you can’t rely upon that, as the element’s length is always different.
#attr/x is currently a problem in the lexer that Gregg said isn't going to be addressed.
attr is absent, it complicates traversal? Yes.
[tag content attributes] structure you can traverse the tree with a simple recursive foreach loop.
.attr will always have to be present. For automated access and for less error handling. And readability will be sacrificed.
attribute-block then?
if a/attribute-block/x <> b/attribute-block/x [a/attribute-block/x: b/attribute-block/x]
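The "simple recursive foreach loop" over a [tag content attributes] structure mentioned above could look roughly like this. It is only a sketch: `collect-text` is a made-up name, and the sample data is hand-written in the triples shape rather than produced by the codec.

```red
Red []

; Hypothetical walker over the triples format: every node is
; tag, content, attributes. Content is either a string or a nested
; block of triples; anything else is skipped.
collect-text: func [tree [block!] /local out tag content attrs][
    out: copy []
    foreach [tag content attrs] tree [
        case [
            string? content [append out content]
            block? content  [append out collect-text content]
        ]
    ]
    out
]

doc: [
    example [
        none "Some" none
        bold "words" []
        none "in" none
        italic "this" []
    ] []
]
probe collect-text doc
```

The fixed step of three is what makes this short; with a variable-length node the loop would need a parse rule or explicit key checks instead.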
#attr "value" syntax is because having /attr/ everywhere in the paths distracts the eye from attribute and element *names* to the completely unimportant detail of how we name the attribute group. And also the reason I chose ! over text!: it's *syntactic noise*, and I prefer to keep it to a minimum. I exaggerated the example above only to show that strictly abiding by the "never abbreviate" law, and having attribute, is a step towards maximizing syntactic noise. I would prefer common sense to prevail in all decisions we make, and laws being taken as only guidelines, and every decision should be weighed in its context (and that is true not only for the world of code).
flags. So what if it's not an official term?
/strict refinement.
#attr/x lexer issue, it isn't a problem with @attr/x (which also conveniently matches up more closely to xpath syntax). However, would someone give a bit more explanation about the tradeoffs of using a word-based type vs string-based? I don't think I understand Red internals well enough for it. Is it mainly to do with memory usage or other performance characteristics? Like are the words just an index into a context somewhere, vs a string would be duplicated each time it appears?
>> type? <attr>/x
== /x
>> blk: [<attr> "test"]
== [<attr> "test"]
>> blk/<attr>
== "test"
/x instead of a type?
type? returns type, but it gets lost as you are not using it. Put probe before it to see.
word! vs. string! is the memory usage. However, we can cheat a little bit here and reuse the same string, I guess it shouldn’t be a problem, as we aren’t going to modify the string during the lifetime of the data structure. But I need to test it to see if there aren’t some problems I’m not aware of.
text and attr that can't conflict with what may appear in the data. That means they have to be illegal in XML. We can't safely use refinement! or issue! types, due to paths. Using lit/get/set words as a hack doesn't solve the problem. This leaves us with a sigilled word or a string type. As I said [before](https://gitter.im/red/codecs?at=61aa5d92197fa95a1ca4b8ae), *not* copying strings is a recipe for disaster. So we either accept the overhead of copied strings for something like @text/@attr (tag! is a bad choice, as we are coming *out of* markup and don't want confusion there) or use a sigil on a word. I vote for word, and *prefer* . as the sigil not because I *like* it but because the other options are worse. :^) Vote for your sigil.
.attr and .text being noise, I disagree. They aren't *implicit*, but I think that's a good thing in this case. This is *my* priority sense for the design right now. The reason being that this format, fingers crossed, will never change. If it does, it will be at the edges and not at the core for text and attrs. If it's a long-lived format, it should be as plain and obvious as possible. If there is a *more* plain and obvious format, please post it with the rationale for it.
tag content attributes model you can traverse the tree easily as there are always three values.
gregg emitter yet. What I posted is not 3-value based, but recursive some [key value] all the way down.
`.text` is always a string. More than one may appear throughout a tag's (node's) block.
`.attr` is always a block and may only appear once.
Any other key is a tag name, and is a node.
.text/.attr, which means we can never take them away. If we spec that they may be omitted, people have to write code to handle that. If we get lots of complaints, or find that code is massively uglified by it, we can later force them to always be included without it being a breaking change. And, No, I don't care about the argument that someone will load a doc now that later fails because empty values are forcibly included.
tag content attributes. It's a fixed structure that makes sense and does not need magic keywords.
tag content attributes is more of a fixed structure than the alternatives. It is very consistent when text appears just in leaf nodes, but when that is not the case, you still need to repeat the structure multiple times, and it adds a lot of extraneous none and empty blocks in those cases. So how much of a difference is it really to loop through elements in groups of 3 vs 1 at a time? Are you primarily considering the data transformation scenario where it would be a nicer structure?
split available). But there are some problems with it too:
source: {<example>Some <bold>words</bold> in <italic>this</italic> string</example>}
wbb: load-xml source
== [
example [none "Some" none
bold "words" [] none "in" none
italic "this" [] ; <-- Hey, why is "string" missing?
] []
]
boris: load-xml source
== [
example [! "Some"
bold [! "words"] ! "in"
italic [! "this"]
]
]
gregg: load-xml source
== [
example [.text "Some"
bold .text "words" .attr [] .text "in"
italic .text "this" .attr []
] .attr []
]
extract/index wbb/example 3 2
== ["Some" "words" "in" "this"]
parse boris/example rule: [collect any [ahead block! into rule | '! keep skip | skip]]
== ["Some" ["words"] "in" ["this"]]
parse gregg/example [collect any ['.text keep skip | skip]]
== ["Some" "words" "in" "this"]
pick split wbb/example 3 2
== [
bold "words" []
]
pick split boris/example 2 2
== [
bold [! "words"]
]
;gregg ??
" in " and not just "in" for example!
/trim to trim the spaces.
[tag-name attr-block-or-none content-block] where attributes are just [key1 value1 ...]
source: {<example>Some <bold:rt>words</bold>^/in <italic:rt>this</italic> string</example>}
wbb: load-xml source
== [
example [none "Some" none
bold/rt "words" [] none "in" none
italic/rt "this" []
] []
]
get to-path [wbb example bold/rt]
== "words"
<example> _ [ "Some " [<bold> _ "words"] " in " [<italic> _ "this"] " string" ]
tag! it could be just:
<example> [ "Some " [<bold> "words"] " in " [<italic> "this"] " string" ]
tag! is a tag, which is not just another string syntax, but can also handle attributes internally, if there are any)
boris -> compact, bolek -> triplet, gregg -> dotted, oldes -> tagged :)
<example> [ "Some " <bold> "words" " in " <italic> "this" " string" ]
shape-to-symbol: func[
"Converts existing shape into new symbol"
shp [block!] "Shape's DOM"
id [any-string! integer!] "New symbol's name id"
/local name symbol shpNode file
][
name: join "__symbol_" id ;checksum mold shp/3 'crc32
symbol: load-xml [
{<DOMSymbolItem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://ns.adobe.com/xfl/2008/" name="} name {" itemID="} make-id {" sourceLibraryItemHRef="} name {" symbolType="graphic" lastModified="} to-timestamp now {">
<timeline>
<DOMTimeline name="} name {">
<layers>
<DOMLayer name="Layer 1" color="#4FFF4F" current="true" isSelected="true" >
<frames>
<DOMFrame index="0" keyMode="9728">
<elements>
<DOMShape/>
</elements>
</DOMFrame>
</frames>
</DOMLayer>
</layers>
</DOMTimeline>
</timeline>
</DOMSymbolItem>}]
shpNode: get-node symbol %DOMSymbolItem/timeline/DOMTimeline/layers/DOMLayer/frames/DOMFrame/elements/DOMShape
if shpNode [
shpNode/3: shp/3
]
;probe shpNode
repend/only dom-symbols compose/deep ["Include" ["href" ( join name %.xml ) "itemIcon" "1" "loadImmediate" "false"] none]
;ask ""
write file: xfl-target-dir/LIBRARY/(join encode-filename name %.xml) form-xfl symbol
;append files-to-parse file
name
]
make-symbol-dom: func[name][
first load-xml [
{<DOMSymbolInstance libraryItemName="} name {" name="" symbolType="graphic">
<matrix>
<Matrix/>
</matrix>
<transformationPoint>
<Point/>
</transformationPoint>
</DOMSymbolInstance>}
]
]
tag! would solve *a lot* of problems
attributes-of etc. But that of course would be slow.
parse? foreach loops? Or filtering with something like [sift / locate](https://gitlab.com/hiiamboris/red-mezz-warehouse/-/blob/master/sift-locate.md)?
>tag!. <gavel>
[tag content attr] order for triples. <gavel> (unless someone makes a *really* strong, concise, and concrete case for [tag attr content]).
! in boris needs to be a sigiled word, not just !, if it's going to be included as a standard emitter. <gavel>
[triples compact key-val]? Compact is a little tricky, because it may still be prettified, so it's not flat or minified in any way. I don't love dialect or rich-type but will let others weigh in.
boris are problematic only in two use cases (correct me @hiiamboris), which is 1) setting them, because you can't use them literally in a set-path. They can't nest, so will always be the last thing in a path. 2) is accessing an index in them. e.g. .../#name/1. You can still use indirection for them. Is that acceptable, or do we need to use another type?
output-type as the emitter control name? Since codec implies 2 directions, I'm unsure. Format is no better in that case.
_ as a none placeholder in this context. But we could also do NO_ATTRS or something along those lines. If a human is going to read it, our viewer could have an option for what to display for none values. In the same vein, replacement of unique keys is easy. e.g. change .text to !
File: %data/simple.html
Format: wbb
[
example [none "Some" none
bold/rt "words" [] none "^^/in" none
italic/rt "this" []
] []
]
File: %data/simple.html
Format: boris
[
example [! "Some"
/bold rt [! "words"] ! "^^/in"
/italic rt [! "this"]
]
]
File: %data/simple.html
Format: gregg
[
example [.text "Some"
bold/rt .text "words" .attr [] .text "^^/in"
italic/rt .text "this" .attr []
] .attr []
]
/bold/rt, in boris @rebolek.
gregg.
bold/rt .text "words" .attr [] .text "^^/in"
bold/rt [.text "words" .attr [] .text "^^/in" ]
../#x/1 and ../#x: 2 are lexer bugs. Rather a grey design area where nobody thought about issues in paths. But since : and / are illegal for issues, I find it much more useful to support sub-paths and set-paths than to raise an error.
rt being a common namespace for related elements, and bold, italic being the 'local element names'). I just made up a namespace definition for it. One note is that the xmlns prefix is one of the rare places where you will see a namespace prefix on an attribute.
source: {<example xlmns:rt="Red/rich-text">Some <rt:bold>words</rt:bold>^/in <rt:italic>this</rt:italic> string</example>}
<keyword>
<key extension="t" name="k0" description="Keyboard transform:
Used to indicate a keyboard transformation, such as one used by a client-side virtual keyboard.
The first subfield in a sequence would typically be a 'platform' designation,
representing the platform that the keyboard is intended for.
The keyboard might or might not correspond to a keyboard mapping shipped by the vendor for the platform.
One or more subsequent fields may occur, but are only added where needed to distinguish from others." since="21.0.2">
<type name="osx" description="Mac OSX keyboard." since="21.0.2" />
<type name="windows" description="Windows keyboard." since="21.0.2" />
<attributeValues dtds='supplementalData' elements='subdivisionAlias' attributes='replacement' type='choice'>AS AW BL BQ CW GF GP GU HK MF MO
MP MQ NC PF PM PR RE SJ SX TF TW UM VI WF YT</attributeValues>
<firstDay day="sat" territories="AE AF BH DJ DZ EG IQ IR JO KW LY OM QA SD SY"/>
<firstDay day="sun" territories="
AG AS
BD BR BS BT BW BZ
CA CN CO
DM DO
ET
...
SA SG SV
TH TT TW
UM US
VE VI
WS
YE
ZA ZW"
/>
<variable id='$oldLanguages' type='choice'>
aa ace ada ady ain ale alt an anp arn arp ars av awa ay
ba ban bho bi bin bla bug byn
ch chk chm cho chy crs cv
dak dar dgr dv dzg
efi eka
fj fon
gaa gan gez gil gn gor gwi
hak hil hsn hup hz
...
udm umb
ve
wa wal war wuu
xal
ybb
zun zza
</variable>
<variable id='$scriptNonUnicode' type='choice'>Afak Aran Blis Cirt Cyrs Egyd Egyh Geok Inds Jurc Kitl Kpel Latf Latg Loma Maya Moon
Nkgb Phlv Roro Sara Syre Syrj Syrn Teng Visp Wole
</variable>
[
example [none "Some" none
rt/bold "words" [] none "in" none
rt/italic "this" []
] [xlmns/rt "Red/rich-text"]
]
[
example [/xlmns #rt "Red/rich-text" ! "Some"
/rt bold [! "words"] ! "in"
/rt italic [! "this"]
]
]
source into the console and load it, but not if it's in my test file run by my script.
<example xlmns:rt="Red/rich-text">Some <rt:bold>words</rt:bold> in <rt:italic>this</rt:italic> string </example>
/preserve refinement or something, which is no easier for users than a trim-strings helper.
! takes a few seconds - just enough to read the sentence explaining its meaning and glance at the output example. And then I immediately can reap the benefits of it not getting on my nerves. But some people will look at the output and think OMG what does this even mean?? Without a textual hint they will feel lost. So our preferences come down to our differences :)
<type>42</type> <x>1</x> <x a="a">2</x> <x b="b">3</x>
"type": ["42"],
"x": [
{"text!": "1"},
{"text!": "2", "attr!": {"a": "a"}},
{"text!": "3", "attr!": {"b": "b"}}
]
output-type was a temporary solution. I’ll add /format to both to-xml and load-xml, with triples being the default format. For people who don’t like that default and don’t want to use the refinement every time, there would be an option to override it directly by setting xml/format. I guess this is the most sensible solution, but I’m of course open to suggestions.
>> load-json {{"a": 1}}
== #(
a: 1
)
>> load-json {{"+1": 1}}
*** Script Error: contains invalid characters
[#"+" | #"-"] some digits end
attempt [to word! str] will be much faster than parse there.
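A rough sketch of that `attempt` idea. It assumes, as the error above suggests, that `to word!` rejects a key such as "+1"; the helper name `key-to-value` and the fall-back to the original string are illustrative choices, not what the JSON codec actually does.

```red
Red []

; Hypothetical helper: turn a JSON key into a word! when possible,
; otherwise keep the string so unusual keys can still be loaded.
key-to-value: func [str [string!]][
    any [
        attempt [to word! str]
        str
    ]
]

probe key-to-value "a"
probe key-to-value "+1"
```

Falling back to a string means a map can end up with mixed key types, which callers would need to be aware of.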