yEnc - the new encoding format

What is yEnc?
Where does yEnc come from?
Will it work?
Will yEnc become the preferred encoding format?
What do I do if the attachment is corrupt?
Links to yEnc Resources


What is yEnc?

yEnc is a new encoding scheme for binary attachments that uses 8-bit encoding to reduce the amount of data being sent and received. The traditional UUencode and base64 encodings incur an overhead of a third or more, because they convert all binary data into "safe" 7-bit characters that can be transported over Usenet.

At its inception, Usenet had a variety of problems with 8-bit data, resulting from the fact that a large number of disparate subnets had been combined into the "internet". Some of these subnets could only pass 7-bit data. Some of them used different (non-ASCII) character sets, which sometimes resulted in translation problems. Some of them treated certain characters or sequences as "special".

These days, with news carried over TCP/IP, the vast majority of our servers can handle 8-bit data to a large degree. There are still some "forbidden" cases that can cause odd behaviour. yEnc makes a good compromise: all but a handful of characters are represented by a single 8-bit value, and each character in that handful is encoded by prefixing its value with an "escape" byte. Thus, a yEncoded binary file is only slightly larger than the original attachment (how much depends on the content), whereas UU/base64 encoded binaries are at least a third larger than the original. yEnc represents a significant saving in upload/download time and server file space.
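The escape mechanism described above can be sketched in a few lines of Python. This is only an illustration, not the reference implementation. The offset of 42, the four "critical" characters (NUL, LF, CR and "="), and the extra offset of 64 applied to an escaped character come from my reading of the published yEnc specification rather than from this page. Line wrapping and the =ybegin/=yend header lines are left out, and the function names are made up for this example.

```python
import base64

ESCAPE = 0x3D                            # '=' is the escape byte
CRITICAL = {0x00, 0x0A, 0x0D, ESCAPE}    # NUL, LF, CR and '=' itself


def yenc_encode(data: bytes) -> bytes:
    """Encode raw bytes; no line wrapping or header lines (sketch only)."""
    out = bytearray()
    for b in data:
        c = (b + 42) % 256
        if c in CRITICAL:
            # Escape the critical character: '=' followed by value + 64.
            out.append(ESCAPE)
            c = (c + 64) % 256
        out.append(c)
    return bytes(out)


def yenc_decode(encoded: bytes) -> bytes:
    """Inverse of yenc_encode for data without line breaks."""
    out = bytearray()
    it = iter(encoded)
    for c in it:
        if c == ESCAPE:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("dangling escape byte at end of data")
            c = (nxt - 64) % 256
        out.append((c - 42) % 256)
    return bytes(out)


# Compare encoded sizes for a payload that contains every byte value.
payload = bytes(range(256)) * 4
encoded = yenc_encode(payload)
print("original:", len(payload))
print("yEnc    :", len(encoded))
print("base64  :", len(base64.b64encode(payload)))
```

On real posts the yEnc figure is slightly higher than shown here, because each line also carries a line terminator and each part carries header and trailer lines.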


Where does yEnc come from?

Links to the resources mentioned in this section are located at the bottom of this page.

The homepage for yEnc is http://www.yenc.org/. Since its introduction, the traffic to this website has become so heavy that http://www.winews.net/yenc/index.htm was opened as a mirror.

To encourage its adoption as a standard, the yEnc specification (which is still undergoing revisions) has been published and placed in the public domain. Also provided are reference implementations of encoder and decoder source code, and a variety of information useful to developers of yEnc-enabled software.

A freeware decoder, yDec, that implements the full power of the yEnc protocol is available as a Windows program and in full source code. Again, everything has been placed in the public domain.

Local copies of these programs are available via links at the bottom of this page. You should first check out the yEnc site, since the local copies may not be current. As of March 2002, yEnc seems to be changing on a weekly basis.


Will it work?

I don't know. So far, my admittedly limited experience is generally favorable. The biggest problem that I see is that the life cycle of yEnc is just beginning. Although reference implementations are published and distributed, newsreader/newsposter developers (myself included) are rewriting the algorithms to fit their programming paradigm. In doing so, bugs are introduced and some concepts may be overlooked.

The CRC algorithms are one of the most glaring issues. Although yEnc has published a good, fast CRC32 algorithm and demonstrated how to use it, I have encountered several posting programs that do something different. Currently, AllNews calculates 5 different CRCs during decoding in an attempt to match what the poster calculated (so 5 of the 2^32 possible values are accepted instead of 1). As there may be more CRC algorithms that I have not yet encountered, I had to relax the checking of the CRC to allow suspicious files to pass.
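For reference, the standard check can be sketched as follows. This assumes the published CRC32 is the common CRC-32 that Python exposes as zlib.crc32, and that the value is carried as hexadecimal in a "crc32=" field on the =yend trailer line, which is my understanding of the specification. The function names are made up for this example, and the variant CRCs that AllNews tries are not described on this page and are not reproduced here.

```python
import re
import zlib
from typing import Optional


def crc32_of(data: bytes) -> int:
    """Standard CRC-32 of the decoded bytes, as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def trailer_crc(yend_line: str) -> Optional[int]:
    """Extract the crc32= field from a trailer line, if there is one.

    The leading word boundary keeps this from matching a pcrc32= field.
    """
    m = re.search(r"\bcrc32=([0-9a-fA-F]{1,8})\b", yend_line)
    return int(m.group(1), 16) if m else None


def crc_matches(decoded: bytes, yend_line: str) -> Optional[bool]:
    """True or False when the trailer has a crc32= field, None when it has none."""
    expected = trailer_crc(yend_line)
    if expected is None:
        return None
    return crc32_of(decoded) == expected


print(crc_matches(b"123456789", "=yend size=9 crc32=cbf43926"))
```

Returning None when no CRC is present leaves it to the caller to decide whether an unchecked file should be treated as suspicious.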

Fine details of the yEncode encoding technique are still under discussion. As the "standard" for yEncode "moves", certainly some decoders will fail from time to time. See below for help when AllNews fails to decode yEnc correctly.


Will yEnc become the preferred encoding format?

I don't know. The arguments taking place in the newsgroups remind me of the HTML Wars that broke out a few years back when HTML-enabled newsreaders made their debut. In that case, the HTML added significant overhead to the posts (as well as making them unreadable), whereas yEnc significantly decreases overhead. In both cases, there is bitter dispute between the "haves" and "have-nots" (those people whose newsreaders can and cannot support the new format), with a third camp of people who simply resist change. At any rate, newsreader developers will be forced to implement some form of yEnc support to placate their disgruntled users, or risk losing their audience.

The philosophy behind AllNews is that it should be able to recover the binary attachments from ALL posts that contain the complete set of data, including binaries posted as followups, and a variety of less well-known encodings such as ROT-13, quoted-printable, and xxencode.

In the first rollout of yEnc, AllNews takes a middle-of-the-road approach. Posting in yEnc is not yet supported. It may be supported later, but it was left out in order to make the decoding function available sooner. Full and complete posts are correctly decoded. That is, all parts must be fully present with no transmission errors, as currently required with UU/base64 posts. The yEnc protocol offers the bonus feature that parts from different posts of the same binary attachment can be easily combined to fill the holes left by missing parts, even when the two (or more) posters post with different settings for number of lines or bytes per part. This and other advanced features will be implemented as time and demand dictate.
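The idea of combining parts from unlike posts can be sketched as follows. This is not how AllNews does it (the feature is not implemented yet). The sketch assumes each part declares the position of its data in the original file, which as I understand the specification is given by the begin= and end= fields (1-based) of the =ypart line. The function name, part sizes and sample data are made up for this example.

```python
from typing import Iterable, List, Tuple

# (begin offset, 1-based as on the =ypart line; decoded bytes of that part)
Segment = Tuple[int, bytes]


def assemble(total_size: int,
             segments: Iterable[Segment]) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Place decoded parts by offset; return the data and any remaining holes.

    Holes are reported as (first, last) byte positions, 1-based inclusive.
    """
    buf = bytearray(total_size)
    have = bytearray(total_size)          # 1 where a byte has been filled
    for begin, data in segments:
        start = begin - 1
        end = start + len(data)
        if start < 0 or end > total_size:
            raise ValueError(f"part at offset {begin} does not fit in the file")
        buf[start:end] = data
        have[start:end] = b"\x01" * len(data)

    holes: List[Tuple[int, int]] = []
    pos = 0
    while pos < total_size:
        if have[pos]:
            pos += 1
            continue
        hole_start = pos
        while pos < total_size and not have[pos]:
            pos += 1
        holes.append((hole_start + 1, pos))
    return bytes(buf), holes


# Poster A used 4-byte parts and part 2 (bytes 5-8) never arrived.
poster_a = [(1, b"ABCD"), (9, b"IJ")]
# Poster B used 3-byte parts; two of them happen to cover the hole.
poster_b = [(4, b"DEF"), (7, b"GHI")]

data, holes = assemble(10, poster_a + poster_b)
print(data, holes)
```

Because placement is by byte offset rather than by part number, the differing part sizes of the two posters do not matter. A real implementation would also need to confirm that both posts carry the same file, for example by comparing name, size and CRC.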

In summary, UUencode will remain the format used to reach the largest audience, as its encoders have generally stood the test of time. base64 also has a large audience. The posters of yEnc today are trying to force their particular flavor of decoder (a particular newsreader client, an external decoder, or some other add-on) down the throat of their audience. This does not sit well with the typical Usenet anarchist. Further, additional post-processing is a nuisance. Only when the popular newsreaders support this protocol seamlessly will it become widely accepted. Good luck "selling" this package to Microsoft!


What do I do if the attachment is corrupt?

First, you must determine whether the problem is with AllNews, the content of the post on the usenet server(s), or with the posting program. If the problem is with AllNews, I will fix it. In the interim, you can use the reference decoder yDec to extract the attachment. If the problem is with the posting program, I will investigate whether there is some hack that will allow successful decoding of the attachment. If the problem is on the server, you MAY be able to overcome it using AllNews' features to acquire the damaged segments from another server.

In the Articles list, select one or more of the parts of the post to decode, and click on Decode. If there is a problem with the content of the post, AllNews should give you the subject line of the segment that is corrupt and the line number in the segment where the corruption was encountered. You may attempt to retrieve this one part from other servers to see if you can recover the attachment. This is a standard feature of AllNews. Check the "Path:" header and try to get as close as possible to the server where the post originated. Note that it may already have expired from that server, and you may have to try to chase it downstream.

If AllNews decodes the attachment, but it is reported as corrupt by an SFV checker, PAR checker, or archive decompressor (e.g., WinRAR), the post needs to be checked against the reference decoder. In the Articles list, select all of the parts of the post in question and click on Save. Don't worry about mixing parts of posts up in a single file; yDec will sort it all out for you.

If you are in Windows (or have a buddy in Windows), you may use yDec, downloaded from Links to yEnc Resources, below. If you are on a different platform, you can download the source code for yDec and adapt it to your environment. If you downloaded yDec some time ago (in Mar 2002 this means a week ago or more), it is worthwhile to check the yEnc site to see if the reference decoder has been changed. You will likely see various yDecoders floating around the newsgroups; beware of viruses, trojans, etc, as you would with any other type of executable from an unknown origin.

If yDec reports problems similar to those reported by AllNews, the problem is either on the Usenet server or in the posting program. If all attachments from a particular poster have the same type of problem, it is likely the posting program. If only one part of one attachment (or of just a few attachments) is affected, it is probably a regular Usenet transmission error. If yDec decodes the attachment from the saved file, and it is accepted by the PAR/SFV checker, etc., the problem lies with AllNews.

If you DO find a problem with AllNews yDecoding (for example, a poster whose posts are undecodable), first check the AllNews homepage to make sure you have the latest release. If you do, please ZIP the file into which you Saved the attachment and either email it (or a URL from which it can be downloaded) to the author, or use the bug reporting form. If you include an email address, you will receive a reply when the problem has been fixed.

Happy yDecoding!


Links to yEnc Resources    Updated Mar 9 2002

http://www.yenc.org/ yEnc homepage.
http://www.winews.net/yenc/index.htm yEnc site mirror.
yEnc specification
http://www.winews.net/yenc/develop.htm reference implementations of encoder and decoder source code, and a variety of information useful to developers of yEnc-enabled software.
yDec - The freeware decoder for yEncode
yDec source code    Local copy
yDec Windows program file    Local copy