⚓ T14974 The newline added to a template, magic word, variable, or parser function that returns line-start wikicode formatting (*#:; {|) causes unexpected parsing


There is also the problem that MediaWiki still incorrectly wraps a <bdi></bdi> element (used as a *mixed-content* element whose content may be inline or block) in a paragraph, i.e. in a dummy and undesired <p></p> HTML element, possibly also causing another block element containing it to be terminated too early. This causes problems for content that should be purely inline and even totally invisible in the rendered page.

Some templates use <div style="display:none">...</div> on the assumption that it is always invisible, but then those templates cannot be used inline (e.g. they break paragraphs or list items, and the list containing them, by inserting a separating paragraph that also adds vertical margins). We should avoid "div" elements for that case, but "span" is not as versatile because its content model is not mixed. Now if we want to use "bdi" (the only element allowed in MediaWiki to have mixed content, i.e. capable of containing inline or block elements), MediaWiki assumes that it is an inline element... and wants to make it part of a block by inserting it in a paragraph.

Examples of this are templates inserting "tags" that store data intended to be processed by machines for Wikidata. Shouldn't we have a MediaWiki-specific element that is completely invisible (it would generate a <bdi style="display:none" data="..."></bdi> element) and that the MediaWiki parser (or its last "HTML Tidy" step) leaves as is, i.e. never embeds inside any other automatically added element, since it is guaranteed to be always invisible?

Side note: a "bdi" element can occur anywhere, even in the middle of a word; it does not have any effect on the directionality of surrounding text or on line wraps. Even copy-pasting text containing a bdi element without any text content (only attributes) into a plain-text editor will not add anything: it is really invisible.

So that would be a use case for some <data someID="..."> element in MediaWiki that would pass through the "HTML Tidy" step and be converted into <bdi style="display:none" data-someID="..."></bdi>. The data would go into the data attribute whose name contains the given id, and any backslash, newline or double quote in the data would be escaped in the attribute value; HTML escaping with character entities is also possible for HTML transparency. Such embedded data is never supposed to be read by humans, only by machines, so basic escaping will not be a problem. That data element would accept no CSS style or class attributes, as it is purely intended to be invisible: the style="display:none" attribute is automatically added to the generated <bdi> element by the HTML generator in the last phase (but it could also generate a class="mw-data" attribute instead).
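
To make the escaping idea concrete, here is a minimal sketch in Python (not MediaWiki code; the function names `escape_data_value` and `render_invisible` are hypothetical). It assumes backslashes and newlines are backslash-escaped first, and that double quotes are then turned into character entities, since a backslash alone would not stop a quote from closing an HTML attribute. The id is lowercased on the assumption that HTML parsers lowercase attribute names anyway.

```python
import html


def escape_data_value(raw: str) -> str:
    """Escape a machine-only payload for a double-quoted HTML attribute."""
    # Backslash-escape the backslash itself first, then line breaks,
    # so the payload fits on one line and can be decoded unambiguously.
    escaped = raw.replace("\\", "\\\\").replace("\r", "\\r").replace("\n", "\\n")
    # Quotes, ampersands and angle brackets become character entities
    # so that the attribute value cannot terminate early.
    return html.escape(escaped, quote=True)


def render_invisible(some_id: str, raw: str) -> str:
    """Build the hidden <bdi> carrying the payload in a data-<id> attribute."""
    # Assumption: the id is already a valid identifier; it is only lowercased.
    name = "data-" + some_id.lower()
    return f'<bdi style="display:none" {name}="{escape_data_value(raw)}"></bdi>'


if __name__ == "__main__":
    print(render_invisible("someID", 'say "hi"\nthen stop'))
```

A real implementation would also have to validate the id and decide which of the two escaping schemes consumers are expected to decode.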

Such a thing would be useful for various purposes, including "micro-tagging" for semantics; it could also be used as internal tracking metadata, or could facilitate the work of wiki editors. The "data" attributes can be used on any valid HTML element; these attributes can have a freely chosen extension (as long as it is a valid HTML identifier) appended after a hyphen, and they are also usable in CSS selectors if needed to perform efficient queries inside the DOM (e.g. for use in JavaScript with jQuery). For embedding really invisible data and generating valid HTML5, only the <bdi> element is valid and suitable for that purpose, and it is the only one accepted and (partially) supported by MediaWiki. But using a MediaWiki-specific <data> element would make things easier to handle in the MediaWiki parser and its HTML generator.

Such a data element could also be used as a debugging tool for templates/modules, to contain some tracing info.

Note: this data element is also not equivalent to an invisible <input type="..." value="..."> element (which is visible to HTML input forms) or to a <meta> element (meant for web page headers, not freely insertable in the page content, but used by the HTML generator). It is a requirement that the chosen HTML element (the one the data element will map to) has a "mixed" content model, for full HTML5 conformance; "bdi" is the only one well supported in browsers that has all the nice features for being fully invisible to human readers, especially when its inner content is empty (no child elements; only HTML attributes are permitted).

The MediaWiki <data> element (or equivalently {{#tag:data}}) should probably not accept any common HTML layout attributes like dir, style or class (they would have no effect at all anyway, and the last two could conflict with the attributes generated by the parser when converting the MediaWiki <data> element into an HTML5 <bdi> element). If such attributes are passed, they would become data-dir, data-style, data-class. However, I can see a use case for accepting the lang attribute specifically as a reserved identifier treated differently (and passed "as is"), as a possible way to offer some functionality in jQuery selectors that is not possible with 'data-someID' attributes.

It could possibly also accept an id attribute, only for such selectors, but that could cause conflicts with anchors used in page navigation and would make the elements partly "visible" to the human user, unless the id is converted to a data-id attribute, which is also usable in jQuery selectors via a simple adjustment of the selector syntax.

Besides that, we could use multiple attributes in the same data element for different but simultaneous tagging purposes. E.g. <data lang="en" a="x" b="y"/> would become <bdi class="mw-data" lang="en" data-a="x" data-b="y"></bdi>.

If the MediaWiki data element has content, that content would be remapped into the (unextended) data attribute of the generated HTML element (with proper escaping). E.g. <data lang="en" a="x" b="y">{ "text", 2 }</data> would become <bdi class="mw-data" lang="en" data-a="x" data-b="y" data="{ &quot;text&quot;, 2 }"></bdi>.
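
The attribute mapping described above could be sketched as follows, again in Python and purely as an illustration of the proposal, not as anything MediaWiki currently implements. The names `render_data_element` and `PASS_THROUGH` are hypothetical, and the sketch assumes the class="mw-data" variant, with lang as the only attribute passed through unchanged.

```python
import html
from typing import Dict, Optional

# Assumption: "lang" is the only reserved attribute copied unchanged;
# everything else (including dir, style, class, id) gets a data- prefix.
PASS_THROUGH = {"lang"}


def render_data_element(attrs: Dict[str, str], content: Optional[str] = None) -> str:
    """Convert a proposed <data ...> element into an empty, hidden <bdi>."""
    parts = ['class="mw-data"']
    for name, value in attrs.items():
        out_name = name if name in PASS_THROUGH else "data-" + name
        parts.append(f'{out_name}="{html.escape(value, quote=True)}"')
    if content is not None:
        # Element content moves into the plain (unextended) data attribute.
        parts.append(f'data="{html.escape(content, quote=True)}"')
    return "<bdi " + " ".join(parts) + "></bdi>"


if __name__ == "__main__":
    print(render_data_element({"lang": "en", "a": "x", "b": "y"}))
    print(render_data_element({"lang": "en", "a": "x", "b": "y"}, '{ "text", 2 }'))
```

Run on the two examples above, this reproduces the <bdi> output given in the text; an attribute such as style would come out as data-style rather than being applied.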

So in summary, such a "data" element will be a MediaWiki-safe replacement for an empty "bdi" element that only holds invisible data. It will never be visible but will be processable by a machine (or by client-side JavaScript tools that may transform such elements to make them visible on demand: they will have enough attributes to do all we want, including jQuery selectors on all these attributes). MediaWiki will recognize this data element easily, will consider it neither an inline nor a block element, will not embed it into any undesired block element, and will just HTML-ize it in the final conversion step into an empty "bdi" element with converted data attributes.