shuppy (@delan) [archived]

Belter Creole (belta), language codes “qbc” (iso 639-3) and “art-x-belter” (ietf)

belter creole has two language codes, “qbc” and “art-x-belter”. but how?

“qbc” is from the “reserved for local use” area (“qaa” through “qtz”) of the three-letter iso 639-3 language codes, like “eng” for english, “cmn” for mandarin chinese, or “ckb” for sorani. conlang code registry is a project to informally coordinate “assignments” in this area for conlangs, and that’s why belter creole is “qbc”. who knows, maybe someday it will transcend the expanse (2015) and upgrade to a standard code, like toki pona did from “qtk” to “tok”.
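
the local-use range is easy to check mechanically, too. a quick sketch in javascript (just an illustration, not any official registry tooling):

// is a three-letter code inside iso 639-3’s local-use range (“qaa”–“qtz”)?
const isLocalUse = (code) => /^q[a-t][a-z]$/.test(code);
isLocalUse("qbc"); // true
isLocalUse("tok"); // false — toki pona’s standard code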

the conlang code registry reminds me of cambridge g-rin, a project to informally coordinate “assignments” in the rfc 1918 address spaces, that is, private ipv4 addresses like 10.0.0.1 and 192.168.0.1. in both cases, a public registry lets anyone who knows about it avoid taking someone else’s spot.

“art-x-belter” starts with “art”, the iso 639-2 collective code for “artificial languages”. the “-x-belter” part uses the ietf language tag standard (bcp 47), where anything after an “-x-” is a “private-use” subtag carrying extra details, and that’s why belter creole is “art-x-belter”. internet standards love “x-” for private use. see also http headers like “X-Real-IP” or “X-Forwarded-For”, or email headers like “X-Spam-Score”.
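
and since bcp 47 tags are machine-readable, software that parses language tags should cope with “art-x-belter” just fine. a sketch, assuming a browser or node.js with Intl support:

// “art-x-belter” is structurally valid bcp 47, so Intl can parse it
const belta = new Intl.Locale("art-x-belter");
console.log(belta.language);   // "art"
console.log(belta.toString()); // "art-x-belter"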

by the way, another language code with private-use extra details is in the html spec, of all places. thanks hixie :)

<!DOCTYPE html>
<html class="split index" lang=en-US-x-hixie>
<script src=/link-fixup.js defer=""></script>
<meta charset=utf-8>
[...]
made with @nex3's syntax highlighter
#the expanse #the expanse (2015) #toki pona #conlang #The Cohost Global Feed
tef (@tef) [archived]

https://datatracker.ietf.org/doc/html/rfc6648

i am sorry to report but the internet no longer loves x-headers

shuppy (@delan) [archived]

probably for the best

The primary problem with the "X-" convention is that unstandardized parameters have a tendency to leak into the protected space of standardized parameters, thus introducing the need for migration from the "X-" name to a standardized name. Migration, in turn, introduces interoperability issues (and sometimes security issues) because older implementations will support only the "X-" name and newer implementations might support only the standardized name. To preserve interoperability, newer implementations simply support the "X-" name forever, which means that the unstandardized name has become a de facto standard (thus obviating the need for segregation of the name space into standardized and unstandardized areas in the first place).

a related example of these problems is on the web platform, with vendor prefixes in css and javascript. they sound like a good way to mark something as experimental, but in practice they just made a mess that forces browsers and webdevs to write everything five times:

-webkit-transition: all 4s ease;
-moz-transition: all 4s ease;
-ms-transition: all 4s ease;
-o-transition: all 4s ease;
transition: all 4s ease;
made with @nex3's syntax highlighter
window.requestAnimationFrame =
  window.requestAnimationFrame ||
  window.mozRequestAnimationFrame ||
  window.webkitRequestAnimationFrame ||
  window.oRequestAnimationFrame ||
  window.msRequestAnimationFrame;
made with @nex3's syntax highlighter

in some cases, we’ve even come full circle with the permanence of vendor prefixes: the standard (and in fact only) way to set the stroke or fill color of text in html is to use -webkit-text-stroke and -webkit-text-fill-color.
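
for example, here’s a sketch of “hollow” outlined text, set from javascript (the element is hypothetical; the camel-cased property names map onto the prefixed css ones):

// a sketch: outline the text and hollow out its fill, via the
// webkit-prefixed (yet standard-in-practice) properties
const heading = document.querySelector("h1"); // hypothetical element
heading.style.webkitTextStroke = "2px black";
heading.style.webkitTextFillColor = "transparent";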

nowadays we prefer to put experimental features behind feature flags, requiring either the user to opt in, or a specific website to opt in for a limited time only. this allows webdevs to experiment with these new features as they evolve, without allowing them to rely on the feature in its current state in production. what are you gonna do, tell your customers that they need to change some weird browser setting to use your bank website?
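
chrome’s origin trials are one concrete shape of the “limited time” website opt-in: the site serves a signed token that expires when the trial ends. a sketch (the token here is a placeholder):

// a sketch of a chrome origin trial opt-in: inject the trial token
// via a meta tag (real tokens are signed and expire with the trial)
const ot = document.createElement("meta");
ot.httpEquiv = "origin-trial";
ot.content = "PLACEHOLDER_TOKEN";
document.head.append(ot);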

the internet is the only thing in computing we’ve ever built at such a large scale, and this has had fascinating implications for language and protocol engineering.

postel’s law (“be conservative in what you do, be liberal in what you accept from others”) was once considered a fundamental principle of the internet, but we’ve since found that it can cause some gnarly problems with interoperability, and it can even interfere with our ability to make intentional changes and improvements down the line.

speaking of interfering with our ability to make intentional changes and improvements, even well-intentioned extension mechanisms can prove impossible to use in practice. this is known as protocol ossification.

dns is the protocol for looking up domain names like “cohost.org”. names consist of labels, like “cohost” and “org”, and there’s even a special type of label to help with compressing dns messages. but the other two label types, while reserved for future use, are actually impossible to ever use, because we forgot to specify how long they are. since no existing dns software knows how many bytes to skip, any message containing such a label is unreadable from that point onwards.
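
a sketch of why, assuming a raw dns message in a Uint8Array: the top two bits of each label’s first byte give its type (rfc 1035), and for the reserved types there’s nothing safe a parser can do:

// skip over a dns name starting at offset i in a raw message
function skipName(msg, i) {
  for (;;) {
    const first = msg[i];
    if (first === 0) return i + 1;   // 0x00: root label, end of name
    switch (first >> 6) {            // label type: top two bits
      case 0b00:
        i += 1 + (first & 0x3f);     // ordinary label: 6-bit length, then bytes
        break;
      case 0b11:
        return i + 2;                // compression pointer: two bytes total
      default:                       // 0b01 or 0b10: reserved, and…
        throw new Error("no idea how many bytes this label takes up");
    }
  }
}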

cryptographic agility, by adam langley, is another great discussion of these problems. tls (née ssl) is an old and complex protocol, with too many ways to extend it and too much flexibility. you can send “extensions” in any order, except you can’t, because too much software wrongly expects them in a certain order. you can negotiate from a set of dozens of cryptographic algorithms, except that also means a third party can force a downgrade to broken or even more broken algorithms.

langley argues that tls should have had a single, simple extension mechanism, ideally one that controls everything else (which algorithms, what to send, in what order). tls has a version number, which could have been that! except we didn’t bump the version number for ten years, so now tls 1.3 has to pretend to be tls 1.2 because too much software wrongly expects the version to forever be 1.2.
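
concretely (byte values from rfc 8446): the ClientHello’s legacy_version field stays frozen at “tls 1.2”, and the real version rides in the supported_versions extension.

// a sketch of the tls 1.3 disguise, with byte values from rfc 8446
const legacyVersion = [0x03, 0x03];     // 0x0303 = “tls 1.2”, forever
const supportedVersions = [0x03, 0x04]; // 0x0304 = the actual tls 1.3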

internet engineering is hard.