Dvorak encoding is a nonstandard way of encoding plain-text documents. It is similar in principle to rot13, except that instead of replacing every letter with the letter 13 places away from it in the alphabet, it replaces each character with the character found on the same key of the Dvorak keyboard.
To illustrate this idea, let's take the top letter row of the QWERTY keyboard:
q w e r t y u i o p
and the top letter row of the Dvorak keyboard:
' , . p y f g c r l
Given this information, we replace all occurrences of q with ', and so on. So "you" encoded in this fashion is "frg".
But it does not stop here: what if you typed "you" using the Dvorak layout on a QWERTY keyboard? You'd get "tsf". This is the same idea, reversed.
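As a rough sketch of the two translations in Python (this is not the dvenc program itself), both layouts can be written out key by key and handed to `str.translate`. It assumes the modern US Dvorak layout and only covers the letter rows plus the punctuation keys that move; the names `QWERTY`, `DVORAK`, `qwdv` and `dvqw` are made up for illustration.

```python
# Keys listed in the same physical order for both layouts:
# unshifted characters first, then their shifted counterparts.
# Assumption: modern US Dvorak; digits and all other characters are left alone.
QWERTY = (
    "-=" "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,./"
    "_+" "QWERTYUIOP{}" 'ASDFGHJKL:"' "ZXCVBNM<>?"
)
DVORAK = (
    "[]" "',.pyfgcrl/=" "aoeuidhtns-" ";qjkxbmwvz"
    "{}" '"<>PYFGCRL?+' "AOEUIDHTNS_" ":QJKXBMWVZ"
)

# One translation table per direction.
QWDV_TABLE = str.maketrans(QWERTY, DVORAK)
DVQW_TABLE = str.maketrans(DVORAK, QWERTY)


def qwdv(text):
    """Replace each QWERTY character with the Dvorak character on the same key."""
    return text.translate(QWDV_TABLE)


def dvqw(text):
    """Replace each Dvorak character with the QWERTY character on the same key."""
    return text.translate(DVQW_TABLE)


print(qwdv("you"))  # frg
print(dvqw("you"))  # tsf
```

Characters that appear in neither string (spaces, digits, newlines) pass through unchanged, which is one possible choice rather than a rule of the encoding.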
I have named the two methods of encoding to make them easier to identify. "QWDv", which stands for "QWertyDvorak", is the method in which the characters are translated as if you had typed QWERTY on a Dvorak keyboard.
"DvQW", standing for "DvorakQWerty", is the opposite: Dvorak typed on a QWERTY keyboard.
You might not think of this at first, but encoding QWDv-encoded data with DvQW effectively decodes it.
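The reason one method undoes the other is that both layouts contain exactly the same characters, just on different keys, so the translation is a one-to-one rearrangement. A small Python sketch of that round trip follows, assuming the modern US Dvorak layout; the variable names are illustrative only.

```python
# Same physical key order in both strings (unshifted, then shifted).
# Assumption: modern US Dvorak; characters not listed are left untouched.
QWERTY = (
    "-=" "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,./"
    "_+" "QWERTYUIOP{}" 'ASDFGHJKL:"' "ZXCVBNM<>?"
)
DVORAK = (
    "[]" "',.pyfgcrl/=" "aoeuidhtns-" ";qjkxbmwvz"
    "{}" '"<>PYFGCRL?+' "AOEUIDHTNS_" ":QJKXBMWVZ"
)

qwdv_table = str.maketrans(QWERTY, DVORAK)
dvqw_table = str.maketrans(DVORAK, QWERTY)

# Both layouts use the same set of characters, so every character has
# exactly one image and exactly one preimage.
is_rearrangement = sorted(QWERTY) == sorted(DVORAK)

message = "This is a testing string"
encoded = message.translate(qwdv_table)  # QWDv
decoded = encoded.translate(dvqw_table)  # DvQW undoes it

print(encoded)             # Ydco co a y.oycbi oypcbi
print(decoded == message)  # True
```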
It doesn't stop there, though. There are three Dvorak layouts: right-handed, standard, and left-handed. The left-handed layout changes the most characters because it changes the numbers as well as the letters. Not only this, but you can re-encode already-encoded data multiple times, with the same or another class of encoding, to obscure it further. There are many ways this encoding can be applied; so many that decoding it is really not a simple task, if possible at all, without knowing which methods were used. If you apply QWDv encoding to some data three times, the result is termed QWDv3.
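Repeated encoding can be sketched in Python as a loop over the same translation table. This example covers only the standard layout (the left- and right-handed tables are not included), assumes the modern US Dvorak layout, and uses a hypothetical helper called `apply`.

```python
# Same physical key order in both strings (unshifted, then shifted).
# Assumption: modern US Dvorak; characters not listed are left untouched.
QWERTY = (
    "-=" "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,./"
    "_+" "QWERTYUIOP{}" 'ASDFGHJKL:"' "ZXCVBNM<>?"
)
DVORAK = (
    "[]" "',.pyfgcrl/=" "aoeuidhtns-" ";qjkxbmwvz"
    "{}" '"<>PYFGCRL?+' "AOEUIDHTNS_" ":QJKXBMWVZ"
)

TABLES = {
    "QWDv": str.maketrans(QWERTY, DVORAK),
    "DvQW": str.maketrans(DVORAK, QWERTY),
}


def apply(text, method, times=1):
    """Run the named translation over the text the given number of times."""
    for _ in range(times):
        text = text.translate(TABLES[method])
    return text


layered = apply("you", "QWDv", 3)  # QWDv3: you -> frg -> upi -> glc
print(layered)                     # glc
print(apply(layered, "DvQW", 3))   # you
```

Undoing a layered encoding means applying the opposite method the same number of times; when several different methods are stacked, they have to be undone in reverse order.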
Sometimes (not always!) it is preferable to include in the encoded data exactly which encoding methods were used. I'm no cryptography expert, but I believe a simple and not-so-secure way of doing this is to put a tag at the beginning of the encoded stream listing the steps the data has gone through to reach its current state. For example, say I encode a text file three times with QWDv, and then three more times with QWDvl to obscure it even more. I would put 'a3c3|' in the header of the data. The a3 means that the encoding with ID a was used three times (see the table below for encoding IDs), the c3 means that encoding c was then used three times, and the | separates the header from the actual data. Obviously this header defeats the purpose if a wrongdoer knows the format, because it gives them the exact method by which to decode the data. This is just a concept and could be improved; I'm still contemplating a better way to do it.
|ID||Translation||Encoding||Decoded with|
|a||QWERTY → Dvorak Standard||QWDv||DvQW|
|b||QWERTY → Dvorak Right||QWDvr||DvrQW|
|c||QWERTY → Dvorak Left||QWDvl||DvlQW|
|d||Dvorak → QWERTY||DvQW||QWDv|
|e||Dvorak Right → QWERTY||DvrQW||QWDvr|
|f||Dvorak Left → QWERTY||DvlQW||QWDvl|
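To make the header idea concrete, here is a hedged Python sketch that builds and parses a tag such as 'a3c3|' and works out the steps needed to decode it. The function names are invented for this example, and the inverse pairs (a/d, b/e, c/f) are read off the table above; it only handles the header, not the translation itself.

```python
import re

# Each encoding ID paired with the ID that undoes it, per the table.
INVERSE = {"a": "d", "b": "e", "c": "f", "d": "a", "e": "b", "f": "c"}


def build_header(steps):
    """Turn [("a", 3), ("c", 3)] into the tag "a3c3|"."""
    return "".join(f"{enc_id}{count}" for enc_id, count in steps) + "|"


def parse_header(stream):
    """Split a tagged stream into its list of (ID, count) steps and the data."""
    header, sep, data = stream.partition("|")  # the first | ends the header
    if not sep or not re.fullmatch(r"(?:[a-f]\d+)+", header):
        raise ValueError("missing or malformed header")
    steps = [(m.group(1), int(m.group(2)))
             for m in re.finditer(r"([a-f])(\d+)", header)]
    return steps, data


def decoding_steps(steps):
    """Undo the steps last-to-first, swapping each ID for its inverse."""
    return [(INVERSE[enc_id], count) for enc_id, count in reversed(steps)]


steps, data = parse_header("a3c3|encoded text goes here")
print(steps)                  # [('a', 3), ('c', 3)]
print(decoding_steps(steps))  # [('f', 3), ('d', 3)]
```

Because the header can only contain the letters a to f and digits, the first | in the stream is always the separator, even if the encoded data contains | characters of its own.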
I have made a Perl program that translates standard input between all of the listed encodings.
Usage is as follows:
% echo "This is a testing string" | dvenc qwdv
Ydco co a y.oycbi oypcbi
% echo "Ydco co a y.oycbi oypcbi" | dvenc dvqw
This is a testing string
Layered encodings can be produced by passing the data through the program multiple times.
Download: dvenc (updated 2006-05-06; some characters were not translated correctly, producing invalid results)
I'm also working on a PHP version, which should be available shortly.
Dvorak encoding was named on November 13th, 2005, by me, Andrew Keyser, in the IRC channel #mm2c on GameSurge. It was then posted to Wikipedia, with my authorization, by another member of the channel going by the name of 'spoop'. After some notes were made on that Wikipedia article, and it was pointed out that original works cannot be posted on Wikipedia (see this Wikipedia policy), I decided to make this page. This page is not final, and the encoding may be changed. Dvorak encoding may have been around much longer than is currently known, but this is the first time, to my knowledge, that it has been named as such.