Specialized Syntax Systems

dm, other Style command line tools, and even some of our GUI tools make use of specialized syntax systems to help bridge the gap between classic 8-bit concepts and modern computing, including things like PETSCII and Commodore filenames. There are currently three such systems and they are described below.

pet{asc}

pet{asc} is a specification for serializing PETSCII characters into US-ASCII. It is a superset of bastext, which is itself derived from TOK64, the first software and specification for representing PETSCII characters in an ASCII context. Like TOK64, pet{asc} uses paired "{" and "}" delimiters, with the ASCII characters inside each pair interpreted as a representation of one or more PETSCII characters. These are henceforth referred to as pet{asc} tokens. The following token forms are defined:

pet{asc} token	interpretation
`{000}`	a three-digit zero-padded decimal value, 000-255, is directly translated to the same PETSCII code point (as in the original TOK64 syntax).
`{10}`	a two-digit decimal value, 10-99, is directly translated to the same PETSCII code point.
`{$8f}`	a hexadecimal value, $0-$ff, is directly translated to the same PETSCII code point. Note that this is redundant with tagged serialized hexadecimal described below, but included here separately for parity with the other single byte to code point styles. Note also that it is not necessary to pad the value: `{$f}` and `{$0f}` are equivalent.
`{0x8f}`	an alternative form (C-style) for a hexadecimal value is directly translated to the same PETSCII code point.
`{%10001111}`	a binary value is directly translated to the same PETSCII code point.
`{$313233}`	a multi-byte tagged (preceded by '$') serialized hexadecimal string is translated into multiple PETSCII code points. See "Serialized Hexadecimal" below.
`{~LowerCase}`	a '~' as the first character indicates that each character in the sequence that follows should be translated to PETSCII as if input to a C64 operating in lower/upper case mode. Keep in mind that the default interpretation for alphabetic characters is upper/graphics mode, so for example, `{~text}` is equivalent to `TEXT`: both translate to the PETSCII bytes $54,$45,$58,$54, which if CHROUT'ed on a real C64 would display '`TEXT`' in upper/graphics mode and '`text`' in lower/upper mode. This token form is useful in a GEOS context since GEOS defaults to writing and displaying filenames in lower/upper.
`{1}`	a single US-ASCII character in the range $20 to $7e is translated according to the mapping below (as in the original TOK64 syntax).
`{cm 1}`	any multi-character token that doesn't match one of the above forms is assumed to be a string code that is mapped to a single PETSCII character. pet{asc} includes all tokens defined by TOK64 and bastext, all tokens produced as output by petcat, and a number of additional tokens. Here, `{cm 1}` is a token that maps to PETSCII $81 (as in the original TOK64 syntax).
`{CM 1}`	pet{asc} defaults to case-insensitive with regards to matching multi-character tokens; therefore the tokens `{cm 1}` and `{CM 1}` both map to PETSCII $81.
`{cm1}`	pet{asc} treats space characters in tokens as optional; therefore the tokens `{cm 1}` and `{cm1}` both map to PETSCII $81. This can make using pet{asc} tokens in the context of command line options or other contexts where whitespace is significant a little easier.
`{c:$1}`	a token of the form "c:" followed by a decimal, hexadecimal, or binary number in the range 0-15 is translated to the PETSCII color control code that corresponds with that color index (e.g. 0 = $90/black, 1 = $05/white, etc).
`{down*10}`	any token that is immediately followed by an asterisk and a decimal, hexadecimal, or binary number is interpreted as a repeated token and translated to that many copies of the PETSCII code point(s) the token maps to (as in the original TOK64 syntax).
`{$11*$10}`	the repetition syntax works not only with string code tokens but also with any of the other forms above; for example, this token translates to a sequence of 16 cursor down ($11) PETSCII control characters, i.e. the same control character as in the preceding example, repeated 16 times.
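
The numeric, color, and repetition forms in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how dm or any other tool implements pet{asc}: the names `parse_number`, `expand_token`, and `COLOR_CODES` are hypothetical, the color table beyond indices 0 and 1 is filled in from the standard C64 color control codes, and string code tokens such as `{cm 1}`, the `~` form, single-character tokens, and multi-byte serialized hexadecimal are deliberately left out.

```python
import re

# PETSCII color control codes indexed by C64 color number 0-15.
# Only 0 = $90 and 1 = $05 are stated in the table above; the remaining
# entries are the standard C64 values and are an assumption of this sketch.
COLOR_CODES = [0x90, 0x05, 0x1C, 0x9F, 0x9C, 0x1E, 0x1F, 0x9E,
               0x81, 0x95, 0x96, 0x97, 0x98, 0x99, 0x9A, 0x9B]

# Single-byte numeric forms: $hex, 0xhex, %binary, or 2-3 digit decimal.
NUMERIC_FORM = re.compile(r"(\$|0[xX])[0-9a-fA-F]{1,2}|%[01]{1,8}|\d{2,3}")


def parse_number(text):
    """Parse a decimal, $hex, 0xhex, or %binary number."""
    text = text.strip().lower()
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text, 10)


def expand_token(token):
    """Translate one numeric or color token, with optional *count, to bytes."""
    token = token.strip()
    if not (token.startswith("{") and token.endswith("}")):
        raise ValueError("token must be wrapped in { and }")
    body = token[1:-1]
    count = 1
    if "*" in body:
        # Everything after the last asterisk is the repeat count.
        body, _, count_text = body.rpartition("*")
        count = parse_number(count_text)
    body = body.strip()
    if body.lower().startswith("c:"):
        value = COLOR_CODES[parse_number(body[2:])]
    elif NUMERIC_FORM.fullmatch(body):
        value = parse_number(body)
    else:
        raise ValueError("token form not handled by this sketch: " + token)
    if not 0 <= value <= 255:
        raise ValueError("value out of range for a single byte")
    return bytes([value]) * count
```

With these definitions, `expand_token("{$11*$10}")` yields sixteen $11 bytes, and `expand_token("{c:$1}")` yields the single byte $05.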

Serialized Hexadecimal

A serialized hexadecimal string represents a sequence of bytes in hexadecimal form that follows these rules:

Serialized Hexadecimal strings

1. all non-hexadecimal characters are treated as white space
2. hexadecimal digits (0-9, a-f, A-F) are parsed from left to right
3. each pair of consecutive hexadecimal characters is interpreted as a byte
4. single hexadecimal characters that are delimited by white space (including white space as defined in rule 1) and/or cannot be paired with a successive hexadecimal character are interpreted as a byte with only the lower nybble value

Tagged Serialized Hexadecimal strings

All of the above rules apply, plus one more:

The string must start with the '$' character.

So for example, the following are all equivalent serialized hexadecimal strings that resolve to the same four bytes:

"$0102030f", "$1 2 3 f", "$01, $02; 03, 0f", "$1,2,030f"

And these resolve to the same two bytes:

"$120f", "$12f", "12 f"

The main purpose of defining a specification for "Serialized Hexadecimal" is that software tools ought to be able to work with flexibly written multi-byte hexadecimal strings; this in turn would make for better interoperability with other tools like hex editors or hex dumpers that display sequences of hexadecimal data in various ways or copy data to the text clipboard in different forms. xxd's default display looks like this: 3c3f 7068 700a; copying from HxD looks like this: 3c 3f 70 68 70 0a; copying from the DirMaster track/sector view looks like this: $3c, $3f, $70, $68, $70, $0a. But all three strings resolve to the same data bytes using the Serialized Hexadecimal definition above.
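
As a rough illustration of how little code the Serialized Hexadecimal rules require, here is a minimal Python sketch. The function name `parse_serialized_hex` and its `tagged` parameter are hypothetical and chosen for this example only; the sketch follows the rules as written above and makes no claim about how dm itself parses these strings.

```python
import re


def parse_serialized_hex(text, tagged=False):
    """Convert a serialized hexadecimal string to bytes.

    Non-hexadecimal characters act as white space. Each run of hex digits
    is read left to right in pairs, and a leftover single digit becomes a
    byte with only the lower nybble set.
    """
    if tagged and not text.startswith("$"):
        raise ValueError("a tagged string must start with '$'")
    out = bytearray()
    # Every maximal run of hex digits is delimited by "white space".
    for run in re.findall(r"[0-9a-fA-F]+", text):
        for i in range(0, len(run), 2):
            out.append(int(run[i:i + 2], 16))
    return bytes(out)
```

Under this sketch, the xxd, HxD, and DirMaster strings quoted above all parse to the same six bytes, and `"$12f"` parses to the two bytes $12 and $0f.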

CBM $electors

CBM $electors are a concise way to represent specific files or data (bytes) on/in Commodore disk/archive container formats, particularly aimed at command line interfaces. One or more selectors comprise a selection. Any program that allows a user to select specific files or data using text (like from a command line) could use CBM $elector syntax, which also incorporates pet{asc} as defined above. Depending on context, only certain selectors may be applicable. A program that allows you to copy files should terminate with feedback if you try to select data from a disk block instead of a file. A program that only works on a single file should reject a file index range selection, and so on.

Basic selector syntax follows the pattern OSFILE:CBMFILE where OSFILE is the container file stored on the OS native file system and CBMFILE is the actual selector. The following syntax examples demonstrate the current range of supported selectors:

single file selection

selector	interpretation
`disk.d64:#1`	by index (0-based)
`disk.d64:FILE`	by name
`disk.d64:{~GeosFile}`	by name; using a pet{asc} lower/upper conversion token
`disk.d64:{$8f,01,02}`	by name; using a pet{asc} tagged serialized hexadecimal token
`disk.d64:{%11110000}`	by name; using a pet{asc} binary token
`disk.d64:{cm1}abc`	by name; using a pet{asc} string code token
`disk.d64:{c:0x0f}`	by name; using a pet{asc} indexed color token
`disk.d64:{cm1*2}`	by name; using a pet{asc} repetition token

multiple file selection

selector	interpretation
`disk.d64:#5-8`	by index range
`disk.d64:#1,5,8`	by a list of indices, comma separated
`disk.d64:FILE;{cm1};#1`	by a list of selectors, semi-colon separated; multiple selectors are grouped as a logical OR and a file in the container that matches any single selector will be selected
`disk.d64:{cm 1}*`	by wildcard (*)
`disk.d64:F?LE`	by wildcard (?)

exclude one or more files from selection

selector	interpretation
`disk.d64:^{cm1}*`	by any valid selector that follows the '^'; any file in the container that matches the negated selector will be excluded from the selection even if it matches a different selector
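
The way name selectors, wildcards, and exclusions combine can be sketched as follows. This is a simplified Python illustration, not the actual implementation: the function names `wildcard_to_regex` and `select` are hypothetical, file names are assumed to be plain strings already, and index selectors and pet{asc} tokens (which could themselves contain a ';') are not handled. The sketch also assumes that a selection consisting only of exclusions starts from all files in the container.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a selector with '*' and '?' wildcards into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, including none
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def select(names, selection):
    """Return the names matched by a semi-colon separated selection."""
    include, exclude = [], []
    for selector in selection.split(";"):
        if not selector:
            continue
        if selector.startswith("^"):
            exclude.append(wildcard_to_regex(selector[1:]))
        else:
            include.append(wildcard_to_regex(selector))
    result = []
    for name in names:
        # Positive selectors are a logical OR; none at all means "all files".
        if include and not any(rx.fullmatch(name) for rx in include):
            continue
        # An exclusion wins even if a positive selector matched.
        if any(rx.fullmatch(name) for rx in exclude):
            continue
        result.append(name)
    return result
```

For a directory containing `KOALA`, `PIC.`, `PAINT`, `ZED`, and `K.`, the selection `K*;P*;^*.` keeps `KOALA` and `PAINT` in this sketch.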

data selection

selector	interpretation
`disk.d64:$`	by following the chain starting at the first directory block; full 256b blocks
`disk.d64:$~`	by following the chain starting at the first directory block; 254b blocks
`disk.d64:%18.0`	by one full 256b block
`disk.d64:%18`	by track; full 256b blocks in order starting at sector 0
`disk.d64:%18.0,1,4,7`	by compact chain notation; full 256b blocks, in chain order
`disk.d64:%~18.0`	by one 254b block (omitting the first two link bytes)
`disk.d64:%~18`	by track; 254b blocks in order starting at sector 0
`disk.d64:%~18.0,1,4,7`	by compact chain notation; 254b blocks, in order
`disk.d64:%%18.0`	by following the link chain starting at tr.sr; full 256b blocks
`disk.d64:%%~18.0`	by following the link chain starting at tr.sr; 254b blocks
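
To make the difference between full 256b blocks, 254b blocks, and chain following concrete, here is a hedged Python sketch for a standard 35-track d64 image held in memory. The function names are hypothetical, the track geometry is the standard 1541 layout rather than anything defined in this document, and the sketch ignores the "last used byte" value stored in the final block of a chain, so it always returns whole blocks as the table above describes.

```python
# Sectors per track for a standard 35-track d64 (an assumption of this
# sketch): tracks 1-17 have 21, 18-24 have 19, 25-30 have 18, 31-35 have 17.
SECTORS_PER_TRACK = [21] * 17 + [19] * 7 + [18] * 6 + [17] * 5


def block_offset(track, sector):
    """Byte offset of a block within the image; tracks are 1-based."""
    if not 1 <= track <= len(SECTORS_PER_TRACK):
        raise ValueError("track out of range")
    if not 0 <= sector < SECTORS_PER_TRACK[track - 1]:
        raise ValueError("sector out of range for this track")
    return (sum(SECTORS_PER_TRACK[:track - 1]) + sector) * 256


def read_block(image, track, sector, strip_link=False):
    """Return one block: 256 bytes, or 254 without the two link bytes."""
    start = block_offset(track, sector)
    block = image[start:start + 256]
    return block[2:] if strip_link else block


def follow_chain(image, track, sector, strip_link=False):
    """Concatenate blocks by following link bytes until the track is 0."""
    out = bytearray()
    seen = set()
    while track != 0:
        if (track, sector) in seen:
            raise ValueError("circular block chain")
        seen.add((track, sector))
        block = read_block(image, track, sector)
        out += block[2:] if strip_link else block
        # The first two bytes of each block link to the next track/sector.
        track, sector = block[0], block[1]
    return bytes(out)
```

In these terms, `read_block(image, 18, 0)` corresponds roughly to `%18.0`, `read_block(image, 18, 0, strip_link=True)` to `%~18.0`, and `follow_chain(image, 18, 0)` to `%%18.0`.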

nested container selection

selector	interpretation
`disk.d64:FILE.LNX:#1`	by descending into subdirs, partitions, or archive containers. In this example, the selection is the file at index 1 that is inside the named lynx archive that is on the d64.
`d.d64:A.LNX:#1:#1::#2:#1`	a double colon ascends from a subdir, partition, or archive container. In this example, the selection is the file at index 1 in each of the containers at index 1 and index 2 inside the lynx.

OS specific issues

Depending on the OS and/or shell used, certain characters may need some care. For example, on Windows CMD the '^' is an escape character, so you'd need to either write the selector in double quotes or use "^^". When using fish shell, you'd need to prefix an '*' with a '\', and so on.

more examples

List all files on the disk that have the traditional Koala Painter file name scheme:

dm -Cl koala.d64:{cm1}*

List all files on the disk except those that have the traditional Koala Painter file name scheme:

dm -Cl koala.d64:"^{cm1}*"

List files that start with K or P but not if they end with a '.':

dm -Cl koala.d64:"K*;P*;^*."

Stream the 256 bytes at track 18, sector 0 and pipe them into petxd:

dm cat koala.d64:%18.0 | petxd