I love the idea of what you're proposing. I just hate the idea of what Microsoft, Google, and all of the other big vendors would turn it into - assuming you can get their attention at all.
For all its faults, XML has given us a fantastic structured data format with excellent tools (XPath, XQuery, XSLT) for data retrieval and transformation.
JSON is nice for when you need to pass simple data structures to a user interface, especially if that UI is written in JavaScript.
YAML gives us nice whitespace issues and a very awkward place for storing commands.
SignalR and Proto have given us high-performance wire transfers that make RPC calls fast and scalable.
XML could certainly stand to be improved - I don't see why we couldn't have something like <item Toothbrush /> where the schema would define that there's a string defined inside item. Maybe a mashup of Relax NG is in order.
Another useful mashup might be introducing DolDoc, but I don't think the web kids are ready for that.
I have changed my previous opinion, because I think I read your article too quickly!
I focused on the differences between pXML and JSON, and not on HTML, where indeed everything is a character string.
The format that you offer does not guarantee the restitution of the original data
I'm sorry, but I don't understand what you mean. Could you give us an example please, and explain exactly what you mean by "does not guarantee the restitution of the original data".
DidierO wrote:
no guarantee on the type
All values in XML documents are just strings. That's how XML works, and therefore the same is true in pXML. There is no native way in standard XML to specify 'types'. You can add 'type information' with metadata, and you can define XML schemas to validate string values. But it's not like in JSON, where native values can be strings, numbers, booleans, or null. It seems that you are not aware of the fundamental differences between XML and other formats.
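To make the difference concrete, here is a minimal sketch using only Python's standard library. The sample data (an item with a name, a price, and a stock flag) is made up for illustration and does not come from the article. It parses the same record as XML and as JSON and compares the types the parsers hand back.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample record, written once as XML and once as JSON.
xml_doc = "<item><name>Toothbrush</name><price>2.50</price><in_stock>true</in_stock></item>"
json_doc = '{"name": "Toothbrush", "price": 2.50, "in_stock": true}'

# XML: the parser returns the text of every element as a plain string.
root = ET.fromstring(xml_doc)
xml_values = {child.tag: child.text for child in root}

# JSON: the parser returns native strings, numbers and booleans.
json_values = json.loads(json_doc)

xml_types = {key: type(value).__name__ for key, value in xml_values.items()}
json_types = {key: type(value).__name__ for key, value in json_values.items()}

print(xml_types)   # {'name': 'str', 'price': 'str', 'in_stock': 'str'}
print(json_types)  # {'name': 'str', 'price': 'float', 'in_stock': 'bool'}
```

Any typing on the XML side has to come from an extra layer, such as a schema or metadata, applied after parsing.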
DidierO wrote:
loss of white characters at the ends of the values
That's simply not true, unless I totally misunderstand your point. If you write [name foo ] in pXML, then the trailing space after "foo" is part of the value of name. Please provide an example if this is not what you are talking about.
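As a rough illustration on the XML side only: assuming that [name foo ] corresponds to the XML element <name>foo </name> (an assumption, since no pXML parser is used here), a standard XML parser keeps the trailing space as part of the element's text. The sketch below uses Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Assumed XML equivalent of the pXML snippet [name foo ].
element = ET.fromstring("<name>foo </name>")

# The element text is returned unchanged, including the trailing space.
value = element.text
print(repr(value))  # 'foo '
```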
I honestly think that your vote is totally unjustified (because your arguments are wrong). You might consider reevaluating your arguments and vote.
Isn't that true for any format when you serialize for storage?
Yes, true for most formats.
My intention is to (later) add types as an optional extension to pXML. Besides predefined types like boolean, number variations, date, time, list, map, etc. it must be easy for a user to add customized types. I have a very concrete idea about how to do that (without changing pXML's syntax), and I might publish a "Suggestion for types in pXML" article in the future, and consider feedback from the community.
Most large documents are created by WYSIWYG editors. Style is preset by the developers of the editor and difficult to change. The current solution is Cascading Style Sheets (CSS). These can easily become a maintenance nightmare. What is needed is named blocks — sort of like subroutines in code. A syntax is needed to define a name and its pXML code block, both with and without the use of an external style file.
An important feature to simplify maintenance is to prevent redefinition of a block name using different code within a document. This prevents a block named StyleFoo from being redefined in a sub-sub-document and screwing up the formatting from that point on. This problem often arises when multiple documents are merged into a larger document, such as short stories into an anthology or chapters into a user manual.
In my experience, the designers of XML documents design a style sheet which they know and understand and use very effectively. Years later, maintenance must modify the document, but the time to understand the style sheets is not available, so the maintainers use local formatting for the modifications. When the style sheets change, such as happens when two companies merge or the company's graphics change, the document becomes an instant mess. I have never seen management budget for the time required to fix these document issues.
Cascading Style Sheets is a good example. They were not mentioned in the description of the proposed syntax.
The problem with CSS is that styles can be redefined, causing the document to screw up after the redefinition. For a style that is used only occasionally, finding the redefinition can be time-consuming, and management never allocates sufficient (if any) time for document modification.
I suggest that if a named block definition is repeated identically, a warning should be displayed. If the definition differs, an error should be displayed and the original definition should be retained.
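The rule suggested above could be sketched roughly as follows. This is only an illustration of the idea, not part of pXML: the names BlockRegistry and BlockRedefinitionError are hypothetical, and the block bodies are treated as opaque strings.

```python
import warnings


class BlockRedefinitionError(Exception):
    """Raised when a block name is redefined with different content."""


class BlockRegistry:
    """Hypothetical registry of named blocks for one merged document."""

    def __init__(self):
        self._blocks = {}

    def define(self, name, body):
        if name not in self._blocks:
            # First definition: simply record it.
            self._blocks[name] = body
        elif self._blocks[name] == body:
            # Identical repeat: harmless, but worth a warning.
            warnings.warn(f"block {name!r} is defined again with identical content")
        else:
            # Different content: report an error; the original stays in place.
            raise BlockRedefinitionError(
                f"block {name!r} is already defined with different content"
            )

    def get(self, name):
        return self._blocks[name]


registry = BlockRegistry()
registry.define("StyleFoo", "original definition")   # placeholder body
try:
    registry.define("StyleFoo", "conflicting definition")
except BlockRedefinitionError as error:
    print(error)
print(registry.get("StyleFoo"))  # original definition
```

Raising an exception is one possible way to report the error. A real processor might instead collect all conflicts and report them together once the whole document has been read.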
I have seen a hierarchy of CSS files redefine the style for the same element — usually <title> or <hn> and, of course, various table elements — multiple times. Of course, changing a CSS file to fix one document may well break another document that relies on the same file.
Disclaimer:
I am a software developer and maintainer who uses XML markup in documentation. I do not have the time to study the chain of CSS files used by the existing documents that I have to modify. I am not, by any stretch of the imagination, an expert in XML document tags.
XML is definitely not terse. If just representing data is the goal there are, as you mention, many other syntaxes to use.
But the reason to use XML is because there can be a schema or (for the old school) a DTD. These definitions can describe in very great detail the structure of XML instance documents (the ones with tags and data). This allows the creator of an instance document to check that it contains valid content which covers not just structure but element values. The recipient of the document can also verify the document is valid.
Because the XML specification is as old as the hills, most languages include features to validate an XML instance document against a schema document.
the reason to use XML is because there can be a schema
Yes, that's one of the very useful additions to XML. As said in the article, an XML schema can also be applied to a document using the pXML syntax. Once a pXML document is parsed into an XML structure, all these great XML additions and tools can still be used (including XML schema). I plan to publish a follow-up article to show examples of how XML technology can be used with pXML as well.