`!` but Gregg vetoes it :)
`!` is still in code ;)
`text!`, you can override it.
`!`. Please change it. As @rebolek notes, people can override it if they want for their own uses. For standard Red work, it should not be overridden.
`path!` and not in `set-path!`? What’s the rationale here?
`compact` format changed to `text!` and keywords in `key-val` format are now `issue!`
`XML` context. Other codecs are in anonymous contexts. This context has a name so it’s possible to change some settings. I can probably add a function, something like `xml-settings` that would take care of it and let the context be anonymous. What do you think?
`system/codecs/xml` I think
`system/codecs/json` uses capitalized names in it
`/as` catches the error earlier, as it's checked when set, but setting the format name produces the same *kind* of error, just when the data is loaded, correct? That is, you don't get a different error or crash.
`red-keys?` sounds funny to me. If that's the default, and the norm, would it be better to call it out as `str-keys?`? Then `unless red-keys? ...` (the two places it's used) become `if str-keys? ...` which is also cleaner. Also make sure it is doc'd outside `triples`, since it applies to `key-val` as well.
`triples` section, since it's used for all of them. Or include it in the other sections for direct comparison of input and output.
`text!` instead of `!`, correct? The code does have that now. Also probably important to note why the sigil is used, because `!` can't appear in XML content, so it avoids conflicts.
`text-sigil`? Simply because "sigil" denotes a single symbol, rather than a name, key, or marker. This one does apply only to `compact`, so `text-key` is *slightly* confusing as it looks like it applies to `key-val`.
`include-meta?` instead of just `meta?`.
`key-val` is not quite right in the example.
```
version #text none #attr [number "$Revision$"]
language #text "Czech" #attr [type "root"]
markup #text none #attr none
```
```
version [#text none #attr [number "$Revision$"]]
language [#text "Czech" #attr [type "root"]]
markup [#text none #attr none]
```
`none` values would exclude their associated key (and the `none`). e.g.
```
version [#attr [number "$Revision$"]]
language [#text "Czech" #attr [type "root"]]
markup []
```
`x/#attr/z` may fail if `#attr` isn't there, but it will also fail if it's `none`.
load/to encode/decode consistent with the others. :+1: I was looking at the old code and going to comment, but then checked the PR code and it's good there.
`xml-settings`, but we should be consistent. In the other codecs, you need to use load/to to spec alternative options. However we decide to do it, we should be consistent. Each codec can have an options area, but that means changing CSV and JSON now too. Doing it with refinements on load/to means a couple more for the XML loader. A third option is an `/options` refinement for load/to funcs, to subsume others. Not as nice for `help` that way.
`compact` or `key-val` is a big win for Red internally, in how we want to standardize access, we can change that if we do it before too long.
`key-val` format issue [here](https://gitter.im/red/codecs?at=61bcf8369a9ec834fbd51c45) before.
`#text/#attr` (issues as keys) are used, rather than words for `key-val`? It's something we'll be asked, so need to doc it, and I don't remember. We need to note it for `compact` too, but I know why they're used there.
`red-keys?` ...
`words?`
`mark`? I don’t know, if anyone has a better idea, I would be glad.
`/meta` and for the refinement I prefer the shorter variant.
`.text` but I don’t find it very Reddish. I can change it back but I still prefer a different type for keywords.
```
[
    identity [
        #attr none
        #text "Version and language^/"
        version [
            #attr [number "$Revision$"]
            #text none
        ]
        language [
            #attr [type "root"]
            #text "Czech"
        ]
        markup [
            #attr none
            #text none
        ]
        #text "Ending^/"
    ]
]
```
```
[
    identity [
        #attr none
        #text "Version and language^/"
        #children [
            version [
                #attr [number "$Revision$"]
                #text none
            ]
            language [
                #attr [type "root"]
                #text "Czech"
            ]
            markup [
                #attr none
                #text none
            ]
        ]
        #text "Ending^/"
    ]
]
```
`red-keys?` or `words?`, what is the *downside* to `str-keys?`?
`text-mark` as well. Any thoughts @hiiamboris? `compact-text-key` says it exactly, but is a bit long. It's only used about 4 times though. Maybe `text-mark` is good enough.
`/meta` for the refinement name, but `meta?` alone reads like "Is there metadata in the content?". Again, only used a few places.
`text!/attr!`? That matches `compact`, where issues are used only for attr names, though also isn't idiomatic because they look like datatype names. OTOH, they *are* keys for specific types of data in the content, so it's justifiable that way. @rebolek if you and @hiiamboris really want issues here, see if Nenad agrees that issues in paths are lexer-bugged. It still gnaws at me that we have to do this:
```
>> blk/identity/version/(#attr)/number
== "$Revision$"
>> blk/identity/version/(#attr)/number: "$Rev$"
== xyz
```
`key-val`, there is no technical problem caused by removing empty keywords, correct? If so, let's do that.
`#children`? I get that it makes all keys for an element one of `[text! attr! children!]`, and makes the child nodes explicit, but it's going to make path access quite a bit uglier I think. Imagine just one more level than in the examples. We go from `blk/identity/version/detail` to `blk/identity/children!/version/children!/detail`.
`text-marker` too
`symbol` comes from association with [red-symbol](https://github.com/red/red/blob/43609421b8b3612d8d882719650d5150f657d17f/runtime/datatypes/structures.reds#L162)
`[key word field name etc.]`, let's go with what fits this context best.
`blk/identity/version/(#attr)/number` syntax shouldn't be much of a concern. I think it might actually make the xml invalid if they do.
`#text/#attr` are used (outside a trailing read), because of how issues are currently lexed. We'll get used to it, and we could argue that those values are special, and it makes them stand out in the path, so it's not the end of the world. It just leaves a funny taste in my mouth at the moment.
`string-keys?` would be fine IMO.
`text-mark` is ok with me, as well as `text-symbol`
`text-symbol` is it.
`meta?` ...
`ref!` also. It’s buggy in paths too but in a different way than `issue!`:
```
>> a: [@a [a]]
>> a/@a
== [a]
>> a/@a/1
== none
```
`string-keys?`, but `str` is almost always an acceptable abbreviation IMO because it's so common. Same with `obj` or `blk`.
`!` makes `text!/attr!` illegal in XML, per earlier chat, but good to confirm.
`str-keys?` - why not, I don’t have any strong opinions here.
`make` them.
`load/save` don't work transparently on file/url values, but `/as` is already used there for the source format, not the resulting data structure. :^\
options to make that a win over direct refinements.
`text!`, unless this is a real datatype.
`/options`, including sample calls.
`/options` is that it's future-proof and extensible by anyone. OTOH if there are 2-3 similar options, adding a refinement shortcut for that option makes sense.
`/options` is that `help` won't know what to do with it. Hence, I suggest mocking things up.
`/options` refinement for `load-xml`, etc.
`make codec` concept, or how that would work with `load`, while still providing option locality for safety.
`system/options` ¿-)
`help` smarter, so rich interfaces like objects and dialects become first class citizens like specs and refinements are today. That will be *very* important in distributed message based systems, APIs, and interrogative interfaces (related to, but much more than autocomplete).
`!` is a regular word char. As far as type `text!` does not return `datatype!` (which it does not), I am OK with that.
`system/standard`, which holds some templates. It was IIRC mainly used by networking - e-mail header, etc.
`datatype!` use exclamation mark at the end is just a convention, `datatype!` has no literal form, it looks like a `word!` and you need to check for a type to distinguish it.
`/options` vs refinements, or are you waiting for me to do that? I will make time today if that's the case.
`/with` options variant.
```
/meta
/strict
/str-keys
/mark txt-mark ; this one is for COMPACT format only
```
`/with` refinement followed by block or map of changed options. Options are first initialized to default values and then overridden using refinements or user options block (map).
```
data: [tag "content" none]
<tag>content</tag>
```
`none` is not `none!` but a `word!`.
`none!` here or should we be relaxed and accept `word!` also? In this particular case it’s not a problem, there can’t be `word!` in place of attributes, but I can think of a case where it would be problematic, if you’re a really contemptible person and would name your tag .
`to-xml` so I could test it. :^)
`none!` seems like a good idea here. It has more semantic meaning, and is safer. It's also what we get from the loader, so it's more consistent.
`put system/codecs` is at the bottom for XML and JSON, but at the top for CSV.
`strict?` be in options?
`/with` consistently in codecs, that will change CSV, which uses it just for `dlm` today.
`default-opts` uses lit-words and logic words in a map. I actually
`load-xml` to work using `/with` for format no matter what
`#[true]/#[false]`.
`Strict` option yet. Just a TBD note.
`/meta` is used with `compact`, meta `issue!` vals change to
`ref!` vals. With `key-val`, using `%menu.xml` test data, the first
`#xml` changes to `.xml` but `#PI` stays `#PI`.
`/options` approach, how do we make it clear which go with `to-xml`
`load-xml`? Or that indentation is ignored if
`pretty?` is not yes? Similarly, `text-mark` only applies to
`compact`.
`/meta`, that namespace or DTD support might add?
`to-csv` doesn't need `/as` because the datatype tells us the format.
`key-val`, but that means changing that loaded format to `map!`, which we didn't want to do for the overhead.
`/with` to `/delim` or something, so `/with` is available as a future option. It's more specific in any case. CSV has a lot of options, and it's unlikely we'll add more. Similarly, XML having 3 emitter options and meta+strict covers a lot. I'm also OK with removing `/mark` for now. I think we need `/str-keys`, but @dander can weigh in, or @hiiamboris based on the ICU work, to say how often it might be needed.
```
print mold load-xml/as file 'triples
print mold load-xml/with file #(format: triples)
print mold load-xml/with file #(
    format: triples
)
```
```
print mold load-xml/as/mark file 'triples 'text!
print mold load-xml/with file #(format: triples text-mark: text!)
print mold load-xml/with file #(
    format: triples
    text-mark: text!
)
```
```
print mold load-xml/strict/meta/str-keys/as/mark file 'triples 'text!
print mold load-xml/with file #(
    strict?: #[true]
    include-meta?: #[true]
    str-keys?: #[true]
    format: triples
    text-mark: text!
)
```
`/with ... format` working, which I can't so far, per my message on Telegram.
`/with` options might be the way to go. But I think it makes the calls more verbose while at the same time being less clear and specific. The fact that it's bugged (or so I think) also tells me we have some more map/serialized/config thinking to do, and this turned out to be a timely, real-world, example.
`Bh` is not a word. In ICU data I haven't encountered invalid words, but when I tried loading their .txt format (so, not XML-related, even though it's the same data) `Bh` was a problem for me.
`/with` options because I would hate to pollute general `load` with refinements from all codecs. Though `/as` should be there as both option (`format`) and refinement `/as` because it's just so often going to be used. OTOH it conflicts with `load/as` so maybe another name.
`load/as .. 'xml`?
`load-xml` itself works, either way is fine)
`load-xml`.
`load` to its receiving ends.
`decode` to something internal for that codec.
`load` to its receiving ends.
`load`. Not in the func spec, but in being completely general, where do you look for options, etc.
`load` already has 7 refinements.
`markup` repo or in the Red PR? Because the Red PR is updated less often. I am using the `markup` repo as the working place and occasionally I push the changes to the PR as I don’t want to pollute that branch with too many commits. If you want to test the latest version, use the `markup` repo, please.
`/strict` certainly works, you are probably looking at an older version.
`issue!` is reserved for attributes in `compact` format. So I used `ref!` for meta tags. Maybe we can change it in other formats to use `ref!` too so it’s consistent. What do you think?
`strict?` option for `/with`.
`/with ...` options, and only @hiiamboris complained a little.
`text!/attr!` entries if they are `none`. Any objections or technical problems with that?
`/Strict` makes the loader require it, and `/meta` is nicely wrapped so only `meta-action` (and one other place at line 602: `store-xml-decl`) care about it.
All `/strict` does is add `opt` to the grammar, making it strict vs loose.
`/strict` it means you get an error if it's not there. Then we'd add `/valid` or a `validate-xml` feature if/when DTDs are supported. But the pertinent question is whether we'd rather use `if error? load-xml/strict ...` or remove `/strict` and let users check for prolog info if they care. Note that `/strict` *only* applies to the `XMLDecl` part of the prolog, as other prolog parts are still optional.
`/strict`?
`/mark`. We can always add it later, but can't remove it once it's there. A simple `replace/deep` is all that's needed if someone wants a different marker, and should be fast if it's word-for-word.
`!` is good with path access but `text!` is good for defining field mappings in a DSL. E.g. `value: !` is not good there. So the use case dictates the sigil.
`/mark`, but make sure that Boris, er, "the user" can set `text-mark` manually in the XML context to avoid the hit. And we should also note that "feature", and the `replace/deep` option in the docs. The latter keeps global codec state out of the picture. <gavel>
`prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?` to parse.
```
prolog: [ XMLDecl opt [doctypedecl any Misc] any Misc ]
```
`/strict`).
`standalone` in `XMLDecl` isn't captured.
```
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE data SYSTEM "./data/menu.dtd">
<?xml-stylesheet type="text/css" href="/style/design"?>
<?xml-stylesheet type="text/xsl" href="menu.xsl"?>
<!-- This is a comment -->
<?welcome to the real world?>
<?welcome x?>
<?welcome ?>
<customer_list>
    <customer>
        <name> Sanjay</name>
        <location> Mumbai</location>
    </customer>
    <customer>
        <name> Micheal</name>
        <location> Washington</location>
    </customer>
</customer_list>
```
`/with`
`XMLDecl` is optional in specs, let’s make it optional.
`ref!` to be consistent across output formats.
`!` even less, and `/mark` as a general feature is really only there from our own little legacy issues and one specific alternative marker (legacy happens fast sometimes :^). So I will still say to remove it, even with an anonymous context, and then @hiiamboris can tell me how much time he loses each day, working on L10N and ICU data, because he wants to use `!` and has to post-process it.
```
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE data SYSTEM "./data/menu.dtd">
<?xml-stylesheet type="text/css" href="/style/design"?>
<?xml-stylesheet type="text/xsl" href="menu.xsl"?>
<!-- Prolog comment 1 -->
<?welcome to the real world?>
<?welcome x?>
<?welcome ?>
<!-- Prolog comment 2 -->
<customer_list>
    <customer>
        <![CDATA[Customer 1 CDATA]]>
        <name> Sanjay</name>
        <!-- Customer 1 comment -->
        <location> Mumbai</location>
        <?cust-1-PI type="app/red" href="/root/path"?>
    </customer>
    <customer>
        <?cust-2-PI-1 type="app/red" href="/root/path/1"?>
        <?cust-2-PI-2 type="app/red" href="/root/path/2"?>
        <name> Micheal</name>
        <!-- Customer 2 comment 1 -->
        <![CDATA[customer 2 CDATA 1]]>
        <location> Washington</location>
        <![CDATA[customer 2 CDATA 2]]>
        <!-- Customer 2 comment 2 -->
    </customer>
</customer_list>
```
```
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE data SYSTEM "./data/menu.dtd">
<?xml-stylesheet type="text/css" href="/style/design"?>
<?xml-stylesheet type="text/xsl" href="menu.xsl"?>
<!-- Prolog comment 1 -->
<?welcome to the real world?>
<?welcome x?>
<?welcome ?>
<!-- Prolog comment 2 -->
<customer_list>
    <customer name="myVar1">
        <![CDATA[Customer 1 CDATA]]>
        <name type="first"> Sanjay</name>
        <!-- Customer 1 comment -->
        <location type="geo"> Mumbai</location>
        <?cust-1-PI type="app/red" href="/root/path"?>
    </customer>
    <customer name="myVar2">
        <?cust-2-PI-1 type="app/red" href="/root/path/1"?>
        <?cust-2-PI-2 type="app/red" href="/root/path/2"?>
        <name type="first"> Micheal</name>
        <!-- Customer 2 comment 1 -->
        <![CDATA[customer 2 CDATA 1]]>
        <location type="geo"> Washington</location>
        <![CDATA[customer 2 CDATA 2]]>
        <!-- Customer 2 comment 2 -->
    </customer>
</customer_list>
```
`compact`'s choice to use issues leaks out if we have to change the other formats to avoid it.
`key-val`, PIs need to be adjusted, as they are 3 values right now (key-val-val).
```
#PI "cust-1-PI" {type="app/red" href="/root/path"}
```
```
#PI ["cust-1-PI" {type="app/red" href="/root/path"}]
```
`[cdata comment pi]`. Not that anyone in the real world would ever do that, just to check for conflicts as we look at syntax options.
`#PI`
`<? ... ?>` format to identify them.
`PI` is not meaningful, but `processing-instruction` is long. Process, no. Proc-instr, still not great. Given that "PIs are not part of the document's character data, but MUST be passed through to the application." what about `pass-thru`? I'm open to thoughts, and if @dander says `PI` makes the most sense in context, so be it.
`proc-info` maybe
`triples` format, I’ll fix it.
```
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE data SYSTEM "./data/menu.dtd">
<?xml-stylesheet type="text/css" href="/style/design"?>
<?xml-stylesheet type="text/xsl" href="menu.xsl"?>
<!-- Prolog comment 1 -->
<?welcome to the real world?>
<?welcome x?>
<?welcome ?>
<!-- Prolog comment 2 -->
<customer_list comment="comment-attr" cdata="cdata-attr" PI="PI-attr">
    <customer name="myVar1">
        <![CDATA[Customer 1 CDATA]]>
        <name type="first"> Sanjay</name>
        <!-- Customer 1 comment -->
        <location type="geo"> Mumbai</location>
        <?cust-1-PI type="app/red" href="/root/path"?>
    </customer>
    <customer name="myVar2">
        <?cust-2-PI-1 type="app/red" href="/root/path/1"?>
        <?cust-2-PI-2 type="app/red" href="/root/path/2"?>
        <name type="first"> Micheal</name>
        <!-- Customer 2 comment 1 -->
        <![CDATA[customer 2 CDATA 1]]>
        <location type="geo"> Washington</location>
        <![CDATA[customer 2 CDATA 2]]>
        <!-- Customer 2 comment 2 -->
    </customer>
</customer_list>
```
`text!` and `attr!`. I mocked it up in this file:
`!` can't be in XML names. So `!` acts like we use types today.
`compact`). This mocked data is the worst things could ever look WRT the effect the syntax has. Massive data in attrs, or other issues will dwarf it in any raw readability comparison.
`!` instead of `#` or `@` too. The latter two variants seem to be quite busy, overloaded. Can't explain, it is just a feeling after all ....
`ref!` would be better because it’s a different beast than `text!` and `attr!`, but I can live with that.
`/mark` as discussed because there is a way for accessing the anonymous XML context:
```
xml-ctx: context? first body-of :load-xml
xml-ctx/text-mark: '!
```
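For comparison, the `replace/deep` post-processing route mentioned in the discussion might look roughly like the sketch below. The data is a mocked-up `compact`-style block (an assumption, not real loader output), and the sketch assumes `replace` accepts `/all` and `/deep` together in the Red build at hand.

```red
Red [title: "Swap the text marker after loading (sketch)"]

; Mocked data for illustration only; the exact shape produced by
; load-xml for the compact format may differ.
data: [
    identity [
        text! "Version and language"
        language [#type "root" text! "Czech"]
    ]
]

; Replace every text! marker word, including those in nested blocks, with !
replace/all/deep data 'text! '!

probe data
```

This keeps codec state untouched, at the cost of one extra pass over the loaded block.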
`CDATA` as it is, don’t introduce new terminology when it’s not needed.
`CDATA` it is, unless @dander screams "NOOoooo!". <gavel>
`[PI pass-thru proc-info]`?
`doctype` be `doc-type`?
`load` has to support a generic mechanism for it (as I understand it). And that has all the issues we talked about for it here.
`text!` and use `!` there is a way even with the anonymous object as I've shown, so IMO it’s not a problem currently.
`load-*` functions.
```
<?xml-stylesheet type="text/xsl" href="menu.xsl"?>
<?welcome to the real world?>
```
```
PI! [#xml-stylesheet {type="text/xsl" href="menu.xsl"}]
PI! [#welcome "to the real world"]
```
`load-*` functions.
`help` works for functions today. Otherwise there is nothing more than docs for a user to learn from. Then there's `help` itself. If there are no extended funcs, and we *can* do that, you can't do `help load-` and get a list of them. You just get `load`'s help, that it has `/as` and a list of those formats. If it gets to *tons* of them, you overload that doc string. But also, that doc string won't know about any non-standard codecs, so won't list them. If you're the only one working on your code, you know that you included the XAML codec, but on a team, or with shared code of any kind, you might not.
`help`. But who's going to do it? How important is it relative to other things?
```
>> ? system/codecs
SYSTEM/CODECS is an object of value:
    text            object!  [name type title suffixes entry]
    markup          object!  [name type title suffixes entry]
    qoi             object!  [name type title suffixes entry size?]
    pkix            object!  [name type title suffixes decode identify]
    der             object!  [name type title suffixes decode identify DER-tags decode-OID verbose]
    mobileprovision object!  [name type title suffixes decode]
    crt             object!  [name type title suffixes decode fingerprint verbose]
    ppk             object!  [name type title suffixes decode]
    ssh-key         object!  [name type title suffixes decode]
    utc-time        object!  [name type title decode]
    unixtime        object!  [name type title suffixes decode encode]
    ar              object!  [name type title suffixes decode identify]
    gzip            object!  [name type title suffixes decode encode identify verbose level]
    zip             object!  [name type title suffixes decode encode identify decompress-file de-extra-fields ...
    tar             object!  [name type title suffixes decode identify verbose]
    dng             object!  [name type title suffixes comment decode identify]
    jpegxr          object!  [name type title suffixes decode encode identify]
    dds             object!  [name type title suffixes decode encode identify size?]
    tiff            object!  [name type title suffixes decode encode identify]
    gif             object!  [name type title suffixes decode encode identify size?]
    bmp             object!  [name type title suffixes decode encode identify size?]
    jpeg            object!  [name type title suffixes decode encode identify size?]
    png             object!  [name type title suffixes decode encode identify size? chunks]
    wav             object!  [name type title suffixes decode encode identify verbose]
    ico             object!  [name type title suffixes decode encode identify]
    swf             object!  [name type title suffixes decode decode-tag identify verbose swf-tags]
    pdf             object!  [name type title suffixes decode encode identify]
    bbcode          object!  [name type title suffixes decode]
    html-entities   object!  [name type title decode]
    json            object!  [name type title suffixes encode decode]
    xml             object!  [name type title suffixes decode verbose options xml-parse-handler echo-handler b...
```
>> ? codecs
TIME CODECS:
UNIXTIME
Unix time stamp converter
Includes: decode encode
UTC-TIME
UTC time as used in ASN.1 data structures (BER/DER)
Includes: decode
TEXT CODECS:
BBCODE
Bulletin Board Code
Suffixes: .bbcode
Includes: decode
HTML-ENTITIES
Reserved characters in HTML
Includes: decode
JSON
JavaScript Object Notation
Suffixes: .json
Includes: encode decode
MARKUP
Internal codec for markup media type
Suffixes: .html .htm .xsl .wml .sgml .asp .php
TEXT
Internal codec for text media type
Suffixes: .txt .cgi
XML
Extensible Markup Language
Suffixes: .xml .pom
Includes: decode verbose options xml-parse-handler echo-handler block-handler ns-block-handler parser
CRYPTOGRAPHY CODECS:
CRT
Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile
Suffixes: .crt
Includes: decode fingerprint verbose
DER
Distinguished Encoding Rules
Suffixes: .p12 .pfx .cer .der .jks
Includes: decode identify DER-tags decode-OID verbose
MOBILEPROVISION
Apple's mobileprovision file
Suffixes: .mobileprovision
Includes: decode
PKIX
Public-Key Infrastructure (X.509)
Suffixes: .pem .ssh .certSigningRequest
Includes: decode identify
PPK
PuTTY Private Key
Suffixes: .ppk
Includes: decode
SSH-KEY
Secure Shell Key
Suffixes: .key
Includes: decode
COMPRESSION CODECS:
AR
Unix archive file
Suffixes: .ar .a .lib .deb
Includes: decode identify
GZIP
Lossless compressed data format compatible with GZIP utility.
Suffixes: .gz
Includes: decode encode identify verbose level
TAR
TAR File Format
Suffixes: .tar
Includes: decode identify verbose
ZIP
ZIP File Format
Suffixes: .zip .aar .jar .apk .zipx .appx .epub
Includes: decode encode identify decompress-file de-extra-fields validate-crc? verbose level
SOUND CODECS:
WAV
Waveform Audio File Format
Suffixes: .wav .wave
Includes: decode encode identify verbose
IMAGE CODECS:
BMP
Portable Bitmap
Suffixes: .bmp
Includes: decode encode identify size?
DDS
DirectDraw Surface
Suffixes: .dds
Includes: decode encode identify size?
DNG
Digital Negative
Suffixes: .dng
Includes: comment decode identify
GIF
Graphics Interchange Format
Suffixes: .gif
Includes: decode encode identify size?
ICO
Windows icon or cursor file
Suffixes: .ico .cur
Includes: decode encode identify
JPEG
Joint Photographic Experts Group
Suffixes: .jpg .jpeg
Includes: decode encode identify size?
JPEGXR
JPEG extended range
Suffixes: .jxr .hdp .wdp
Includes: decode encode identify
PNG
Portable Network Graphics
Suffixes: .png
Includes: decode encode identify size? chunks
TIFF
Tagged Image File Format
Suffixes: .tif .tiff
Includes: decode encode identify
OTHER CODECS:
PDF
Portable Document Format
Suffixes: .pdf
Includes: decode encode identify
QOI
Internal codec for qoi media type
Suffixes: .qoi
Includes: size?
SWF
ShockWave Flash
Suffixes: .swf
Includes: decode decode-tag identify verbose swf-tags
TIP: use for example help system/codecs/zip to see more info.
image type)
>> extend object [] [x: 2]
*** Internal Error: reserved for future use (or not yet implemented)
`load-*` functions.
`round` in Rebol, and am doing so now for `split` (where we are now at the stage of deciding on a dialect vs refinements) and `format`. It then comes to how things work in practice, and finding the best balance, because it is possible to overload a small set of interfaces too far. That's part of the magic of Rebol's design for me. It's not that it's perfect, but it found a nice balance.
`pass-thru` or `proc-info`, as they are more word-like, and have nice meaning. But when dealing with XML, that domain uses PI. So I will defer to that, and nobody except me has complained about `PI` that I know of, so `PI` it remains. <gavel>
`init` function, so the codec won’t eat memory when not used, this `PI! [#xml-stylesheet {type="text/xsl" href="menu.xsl"}]` (for the `compact` format) is not done yet. But that’s going to be very fast, and that’s all I believe. I’ll finish these two things and re-read the conversation to see if I haven’t forgotten about something.
`PI! [#xml-stylesheet {type="text/xsl" href="menu.xsl"}]` is done already.
`init` is also finished.
`set [prolog body] load-xml/meta ...`
`cont-val` had a large chunk of data in it. It only gets cleared in `store-char-data`. I wonder if, as a general rule, we want to clear things like that when the main func is done. On the one hand, they can be handy for debugging. OTOH, leaving data lying around for others to sniff isn't great.
`store-char-data` but there shouldn’t be anything left after the function ends. If there is, I may add `clear`, that would be probably wise, I agree.
`find xml word!` to figure it out
`doctype` OK. Easy. :^) <gavel>
`/meta` and there is none. Metadata could also be `none` in that case. If it's not in a block, and you use `/meta`, how do you determine if any exists?
`prolog` rule), even if there's not a syntactical delimiter marking the end of it. That is, it's part of the logical structure, not the physical structure.
```
document ::= prolog element Misc*
```
but we have
```
document: [ opt prolog element ]
```
`misc` elements, but would like @dander to weigh in on that. If we *really* want to follow the spec, we should have them. Now we have another question: the epilog. ;^)
`find xml word!` to figure it out
`xml! doctype! PI! comment!`) are also words, so you have to find the first word that's not one of those. I agree that most users will act on specific docs of known, or semi-known structure. It just means a bit more work for anyone writing more general purpose tools. Not a showstopper there though.
`/meta`, how do you determine if any exists?
`prolog! [...]` inside?
`prolog! [...]` inside?
`find xml word!` to figure it out
`xml! doctype! PI! comment!`) are also words, so you have to find the first word that's not one of those
`/meta` and always include metadata
`/meta`. Adding separate prolog wouldn’t serve any purpose IMO.
`/meta`. It does mean, however, that everybody then has to follow the instructions @rebolek writes for how to ignore it. ;^) The question of value is how often people use it versus how often they need to ignore it. If people need to ignore it explicitly a good portion of the time, we should keep the refinement to make the default case easier. Just the doc Ma'am.
`/meta` in place, and release v1 with the prolog as it stands (as long as the docs are written for how to deal with it).
`set [prolog content] load-xml/meta` . What does easy look like the way it is now?
`to-xml/as 'key-val`, if there are no further problems, then I need just to update the docs.
`#delayed`.
`set [prolog content] load-xml/meta` . What does easy look like the way it is now?
`triples` format textual data have this format:
```
none "Some text" none
```
`none` to `text!`:
```
text! "Some text" none
```
`profile` against? In the blog announcement, I thought it might be nice to show a comparison between formats.
`not enough memory`
`bytes` formatter @hiiamboris. :^)
```
Time         | Time (Per)   | Memory      | Code
0:00:15.714  | 0:00:15.714  | 232'350'964 | triples
0:00:16.503  | 0:00:16.503  | 54'074'716  | compact
0:00:17.003  | 0:00:17.003  | 71'993'260  | key-val
```
```
Time         | Time (Per)   | Memory      | Code
0:00:00.808  | 0:00:00.808  | 15'945'432  | triples
0:00:00.811  | 0:00:00.811  | 6'357'364   | compact
0:00:00.832  | 0:00:00.832  | 7'155'460   | key-val
```
```
Time         | Time (Per)   | Memory      | Code
0:00:00.839  | 0:00:00.002  | 6'084'772   | triples
0:00:00.839  | 0:00:00.002  | 6'250'580   | compact
0:00:00.852  | 0:00:00.002  | 5'170'256   | key-val
```
`!` and 2 `text!`:
`get-meta` function, to get the metadata.
`/meta`, that would be how right now; don't use `/meta`. A few messages later I said this:
`set [prolog content] load-xml/meta`
`get-meta` and see if it's empty. But then you still can't tell if it's in the doc part, because `get-meta` doesn't remove it, so question 2 still stands.
`get-meta` returns only the prolog metadata, not all metadata for the document. And if there is no metadata it returns an empty block, just like Bolek's first point says. If people use `/meta`, are we concerned about the overhead of that empty block, or that it's confusing? I'm not. 1) It's XML and not small or efficient, 2) we doc how it works, returning two blocks.
`prolog` rule and moving everything that is in the document block so far into a new block, removing it from the document block. Can't say for sure, and @rebolek may see a better way, e.g. an `in-prolog?` flag that `meta-action` uses to decide where to put meta values.
`set [prolog content] load-xml/meta` dance and they're done.
`Get-meta/as` lets users get the metadata in a different format than the doc. What is the use case for that @rebolek?
`Get-meta` is basically doing the same work the `meta-action` related bits in the main XML parser do, but against the Red structure, not the raw XML. So we're doing the same work twice. Prologs will be small, and I'm not concerned about performance overhead. Just that less code = fewer bugs and doing the same thing in two different places can lead to different results.
```
;1) Get %xml-tools.red from...
#include %xml-tools.red
doc: load-xml/meta <file>
meta: get-meta doc
;A) Select known root word from doc
;- or -
;B) ??? How do you skip the metadata?
```
`get-meta` nor having a block seems very useful.
`/meta` includes it everywhere, and not using it leaves it out for the entire doc?
`parse data [to [set w word! if (#"!" <> last form w)] data:]` work for all formats?
`/meta` on files that had it for testing, and never *not* used it on those.
`find-root` helper it's fine.
`locate data [w [word!] .. #"!" <> last form w]`.
`split` into runtime! :)
`find-element` instead of `find-root` though, since it's not just specific to the root of the doc.
`Set` will still work to make it global, but then we have a naming issue. :^\ Nothing is easy.
`to-xml`. That's why `/as` is needed. We probably can add autodetection but I'm not sure it's worth it.
`/meta`, maybe I should have included metadata always and nobody would probably ever think about separating them from the rest of the content.
`/meta` *may* be useful, but we can't say for sure. We did talk about always including it. And we can change our minds once we get some feedback (not too hopeful on that though ;^). While it's wasteful to post-process and remove it, that's an option.
>> print mold get-meta load-xml/meta %data/prolog-pi-cdata-cmt.xml
[
xml! none [
version: "1.0"
encoding: "UTF-8"
standalone: true
]
doctype! {data SYSTEM "./data/menu.dtd"} none
PI! xml-stylesheet {type="text/css" href="/style/design"}
PI! xml-stylesheet {type="text/xsl" href="menu.xsl"}
comment! "Prolog comment 1" none
PI! welcome "to the real world"
PI! welcome "x"
PI! welcome ""
comment! "Prolog comment 2" none
cdata! "Customer 1 CDATA" none
comment! "Customer 1 comment" none
PI! cust-1-PI {type="app/red" href="/root/path"}
PI! cust-2-PI-1 {type="app/red" href="/root/path/1"}
PI! cust-2-PI-2 {type="app/red" href="/root/path/2"}
comment! "Customer 2 comment 1" none
cdata! "customer 2 CDATA 1" none
cdata! "customer 2 CDATA 2" none
comment! "Customer 2 comment 2" none
]
triples, not for other formats. That's what I was seeing.
>> print mold get-meta load-xml/meta/as %data/prolog+pi+cdata+cmt.xml 'key-val
[
xml! [
version: "1.0"
encoding: "UTF-8"
standalone: true
] doctype!
PI! ["xml-stylesheet" {type="text/xsl" href="menu.xsl"}] comment!
PI! ["welcome" "x"] PI!
]
>> print mold get-meta/as load-xml/meta/as %data/prolog+pi+cdata+cmt.xml 'key-val 'key-val
*** Script Error: PARSE - unexpected end of rule after: into
*** Where: parse
*** Near : [collect into output [any [keep pick [['xml! ] ] ]]]
*** Stack: get-meta key-val
>> print mold get-meta/as load-xml/meta/as %data/prolog+pi+cdata+cmt.xml 'compact 'compact
*** Script Error: PARSE - unexpected end of rule after: into
*** Where: parse
*** Near : [collect into output [any [keep pick ['xml! ] ]]]
*** Stack: get-meta compact
print mold get-meta load-xml/meta/as %data/prolog+pi+cdata+cmt.xml 'key-val
key-val data as triples.
get-meta/as ... is the proper code but runs into a bug.
forming a word, but that's just an implementation detail.
find-element as @dander proposed, unless a better name comes up.
find-root/find-element
find-element for finding *any* element in the doc (actually all occurrences of the element, IMO) and find-root as the specialized version that would skip prolog.
foreach element find-element xml-data element-name [do code]?
for-each. But it's still very limited, driven by one use case only.
loop overhead question. If the only thing you're doing is loading XML we're really slow. As soon as you start doing real work with that data, what percentage of the time is that, compared to loading?
greg-format should be gregg-format, etc. ;^)
element? that would tell whether a particular word represents an element or something else.
find-root or find-element are simple or can be considered temporary, it seems fine to add them. I still don't quite understand the distinction between them though, unless the idea is to make find-element much more powerful.
find-root is a nice name, and more meaningful than something like find-element vs find-element/deep. It could be find-element/root to reverse that logic of course. The root is the starting point, and there can be only one. That makes it easy. If we do find-element it raises other questions (First? All? params, etc.), so it will be only an *example* of how to write something like that. A strawman if you will.
find-element/deep - I don't think it makes sense to have flat search, XML structure is a tree, it should always search in nodes. So find-element/deep should be find-element.
first find-element ...?
find-element takes the element's name as an argument and you don't know what the root element is going to be. That's why there's also find-root.
find-root is a quick hack, yes. But it's OK for now. Also, you can just not add /meta and then the root element is the first element.
[DOC] prefix?
load-xml is not copied.
Not sure how I missed this before. Is that fixed in the PR version? e.g.
>> compact-data: load-xml/as src 'compact
== [
Migration [
Session [#Name "Main gather (Machine Independent)" #Type "Online"
Platform [#Typ...
>> kv-data: load-xml/as src 'key-val
== [
Migration [
Session [
attr! [Name "Main gather (Machine Independent)" Type "Online"]
...
>> compact-data
== [
Migration [
Session [
attr! [Name "Main gather (Machine Independent)" Type "Online"]
...
codec in the wiki and they should show up.
clear target is used.
load-csv/as-records:
>> load-csv/as-records {"A","B"^/"1","2"}
*** Script Error: append does not allow map! for its series argument
*** Where: append
*** Near : value char
*** Stack: load-csv
>> load-csv/as-records/header {"A","B"^/"1","2"}
== [#(
"A" "1"
"B" "2"
)]
>> load-csv/as-records {"A","B"^/1,"2"}
== [#(
"A" "A"
"B" "B"
) #(
"A" "1"
"B" "2"
)]
%xml.red and %html-tools.red and codepage.red with no success
save/as %tmp layout [panel [field button]] 'redbin
*** Access Error: cannot decode or encode (no codec): handle!
*** Where: encode
*** Near : codec/encode :value dst
*** Stack: save
layout is connecting the root face generated to the screen face. It should not, only view or show should do that.
layout/only to avoid the creation of a window root face attached to the screen face, the serialization still chokes, but on routines. The culprit is the /option block created from a [template](https://github.com/red/red/blob/master/modules/view/VID.red#L700) bound to system/view which contains routines.
on-change* mostly, which contains references to system/view context or other contexts like [system/reactivity](https://github.com/red/red/blob/master/modules/view/view.red#L435).
load call.
parent: make object! [...] parts) and then reallocating resources and recording references again throughout the tree.