Factors Driving the Design

Several factors drove the design. Some of them were covered earlier in the section on stateful codecs: while implementing these stateful decoders we must still adhere to certain rules so we do not undermine the purpose of implementing stateful decoders in the first place. Other design decisions come from the way the BER encoding is devised. These driving factors are all covered here to explain the present design while justifying it and the decisions that forged it.

TLV Nesting

A constructed TLV tuple contains other TLV tuples nested within the Value field, and so on, recursively. The nesting depth is usually indefinite and dependent on the data structures being transformed. This fact drastically affects the way the decoder is designed and hence operates.

It's always good form to tackle the base case(s) of any recursive problem first. The simplest base case is the processing of a simple primitive TLV tuple. So for the time being presume we are implementing a primitive BER TLV decoder without any nesting.

While decoding a primitive tuple the decoder must maintain state. For this simple decoder, state is an indicator representing whether the Tag, Length, or Value field was being processed at the end of the last chunk of substrate data. This indicator, along with any data already captured for these fields, is required to continue processing where the last chunk left off. Value field data is actually not "accumulated" to maintain state, but for the time being we'll ignore that fact.

The primitive TLV tuple decoder is easily designed. Build three stateful sub-decoders, one for each field: the Tag, the Length, and the Value. A top level decoder delegates the processing of each field to its sub-decoder and switches sub-decoders when the indicator changes from Tag, to Length, to Value and then back to Tag, and so on. When a Value completes processing and before another Tag is read, the decoder triggers a callback event. Here's what the hypothetical primitive TLV tuple decoder would look like:
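Since the original illustration is not reproduced here, the following minimal sketch captures the idea in code instead. Everything in it is illustrative rather than the actual codec API: the class name PrimitiveTlvDecoder is made up, tags are assumed to fit in one octet, lengths are assumed to use the short form, and a simple println stands in for the completion callback.

    import java.nio.ByteBuffer;

    /**
     * A toy primitive-only TLV decoder: it remembers whether it stopped in the
     * TAG, LENGTH or VALUE field so the next chunk can resume where the
     * previous one left off.
     */
    public class PrimitiveTlvDecoder
    {
        private enum State { TAG, LENGTH, VALUE }

        private State state = State.TAG;
        private int tag;          // single-octet tags only, to keep the sketch small
        private int length;       // short-form lengths only (0..127)
        private int valueRead;    // how many Value octets have been consumed so far

        /** Feed an arbitrary chunk of substrate; it may contain partial tuples. */
        public void decode( ByteBuffer chunk )
        {
            while ( chunk.hasRemaining() )
            {
                switch ( state )
                {
                    case TAG:
                        tag = chunk.get() & 0xFF;
                        state = State.LENGTH;
                        break;

                    case LENGTH:
                        length = chunk.get() & 0xFF;   // assume short form
                        valueRead = 0;
                        state = ( length == 0 ) ? State.TAG : State.VALUE;
                        if ( length == 0 ) { fireDecodeOccurred(); }
                        break;

                    case VALUE:
                        int take = Math.min( length - valueRead, chunk.remaining() );
                        chunk.position( chunk.position() + take );  // hand off, don't accumulate
                        valueRead += take;
                        if ( valueRead == length )
                        {
                            fireDecodeOccurred();
                            state = State.TAG;
                        }
                        break;
                }
            }
        }

        private void fireDecodeOccurred()
        {
            // stand-in for the completion callback event
            System.out.println( "TLV done: tag=0x" + Integer.toHexString( tag )
                + " length=" + length );
        }
    }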

Now let's figure out how to handle constructed TLV tuples, which recursively nest other tuples indefinitely. State is now more than just where you left off in the current tuple being processed. When processing an inner tuple the decoder must also know where it left off in the outer tuple so it can resume processing there. More accurately, the decoder must maintain the state of every parent and ancestor tuple of the tuple currently being processed, in the same manner the primitive TLV tuple decoder did for a single tuple. Hence the state of the decoder is a stack of all ancestor TLV tuple states plus the state of the tuple currently being processed.

While processing input for a nested TLV tuple, the state of all tuple ancestors must also be updated with the same data so the decoder can determine when their processing is about to complete. This way the decoder does not read TLV tuples adjacent to the constructed TLV tuple and incorrectly presume that they are part of it.

When the last inner tuple in a constructed TLV tuple completes, it triggers a callback for itself, then the stack is popped and another callback event is triggered for the completion of the constructed TLV tuple.

In conclusion, the state of a BER decoder used to process both primitive and constructed TLV tuples must take into account the processing state of every tuple ancestor in the stack of nested tuples. Otherwise state cannot be maintained. Just how this is efficiently managed is the topic of the next few subsections.

Value Processing

If the decoder accumulates encountered Value field octets to maintain state then we have a problem. First off, the size of the Value could be massive and often varies. We want the decoder to have a fixed maximum memory footprint, and that goes out the window if Value field content is accumulated within buffers to maintain tuple processing state. Furthermore, with nesting, every ancestor tuple in the nesting stack would maintain a copy of the topmost tuple's Value field when that tuple is about to complete processing. The number of copies is a function of the nesting depth, so the deeper the nesting, the more memory is wastefully consumed. This is totally unacceptable and it undermines the reason for devising stateful codecs in the first place.

To avoid this problem we must not accumulate Value field octets (bytes) while maintaining state. Unlike the Value field, the Tag and Length fields are bounded and usually account for only a few bytes within a TLV tuple. To maintain state, however, the decoder still has to perform some accounting to determine when outer tuples in the nesting stack complete processing. The decoder does this by counting the Value bytes processed rather than storing them in an accumulator. This way the decoder can compare a tuple's Length field with its Value byte counter to determine when processing is complete.
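A hedged sketch of that accounting for the tuple in scope: nothing here is the real Tuple API, and the field and method names are made up for illustration.

    /** Illustrative accounting: Value octets are counted, never stored. */
    class ValueAccounting
    {
        int valueLength;   // taken from the decoded Length field
        int valueIndex;    // Value octets consumed so far

        /** Called for each chunk of Value octets handed off to the callback. */
        boolean valueChunkConsumed( int octetsInChunk )
        {
            valueIndex += octetsInChunk;
            return valueIndex == valueLength;   // true when the tuple completes
        }
    }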

Extending DecoderCallback

The next question is how the decoder propagates TLV tuple Values to the target receiving the TLV tuple stream. If the standard DecoderCallback.decodeOccurred() method is designed to be called upon TLV processing completion, how do we avoid collecting the Value yet still deliver the Value bytes to the decoder's user via callbacks?

The answer is to use yet another callback. The DecoderCallback interface is extended with three extra callback methods, one for each field. The BERDecoderCallback interface extends the DecoderCallback interface and adds the following methods:

  • void tagDecoded( Tuple tlv )
  • void lengthDecoded( Tuple tlv )
  • void partialValueDecoded( Tuple tlv )

The following diagram shows the decoder interfaces and a do-nothing adapter implementation provided for convenience:
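In code form the extension looks roughly like this; the method list comes straight from above, while the comments are only my own paraphrase of this section:

    /** Extended callback delivering field level events for each TLV. */
    public interface BERDecoderCallback extends DecoderCallback
    {
        /** Fired once when the Tag field of the TLV in scope has been decoded. */
        void tagDecoded( Tuple tlv );

        /** Fired once when the Length field of the TLV in scope has been decoded. */
        void lengthDecoded( Tuple tlv );

        /** Fired zero or more times as raw Value octets arrive chunk by chunk. */
        void partialValueDecoded( Tuple tlv );
    }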

For a single TLV, all methods except partialValueDecoded() are invoked at most once. As its name suggests, pieces of the Value, encapsulated by the Tuple argument, are delivered rather than the entire Value field. Hence the method can be called zero or more times while processing a TLV.

The extended decoder callback interface allows the decoder to chunk Value fields and hence maintain a fixed maximum footprint. The name partialValueDecoded is a bit misleading, however. The decoder does not actually decode any Value bytes based on the Tag of the TLV; it simply hands off the raw Value bytes to the callback. This part of the decode is left to higher level decoders built on top of the BERDecoder. All primitive type decode operations, however, are provided by the BER codec.
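To make the hand-off concrete, here is an illustrative consumer of the chunked value events. It is only a sketch: getValueChunk() is a made-up accessor for the raw bytes of the current chunk, and decodeOccurred() is given a guessed signature since the base DecoderCallback interface is not reproduced in this section.

    import java.io.ByteArrayOutputStream;

    /** Application level callback that gathers the raw Value octets it cares about. */
    class ValueCollector implements BERDecoderCallback
    {
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();

        public void tagDecoded( Tuple tlv ) {}
        public void lengthDecoded( Tuple tlv ) {}

        public void partialValueDecoded( Tuple tlv )
        {
            byte[] chunk = tlv.getValueChunk();   // hypothetical accessor
            out.write( chunk, 0, chunk.length );  // the application, not the decoder, keeps it
        }

        public void decodeOccurred( Object decoded )   // guessed base-interface signature
        {
            byte[] wholeValue = out.toByteArray();
            out.reset();
            // hand wholeValue to higher level decoding logic here
        }
    }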

Constructed Values

The Values of constructed TLV tuples are other tuples. Those Values are already decoded by the BER decoder, which triggers TLV events for the nested TLV tuples, so calls to the partialValueDecoded() method are never made for constructed tuples. Furthermore, the decoder transitions from the Length processing state directly to the Tag state just after completing the decode of a constructed tuple's Length field. This is because the next tuple to process is a nested tuple whose Tag follows the constructed tuple's Length field.

Constructed TLV tuples never have partialValueDecoded() called; only primitive TLV tuples have Value octets delivered to this callback. This makes state handling within the decoder a bit tricky, but the complexity is well worth the rewards reaped.

State Management

There are two parts to managing state for the BERDecoder: stack based ancestor Value accounting state, and current tuple processing state.

Current TLV State Management

While processing the tuple in scope, a type safe enumeration is used to track the current tuple processing state, which is one of the Tag, Length, or Value processing states. Subordinate decoders are used to decode the Tag and Length fields. Unlike these fields, the Value field does not have a corresponding decoder: it does not need one since primitive TLV Values are not decoded but returned raw. The sub-decoders for Tag and Length manage the accumulation of field bytes between chunked decode operations. The following diagram displays the helper classes used to manage the current TLV processing state:

Color    Group
Red      Generic Helper Classes
Yellow   Tag Specific Classes
Purple   Length Specific Classes

The Tag specific classes include the Tag and TagDecoder classes. The Tag class handles the extraction of the various fields embedded in the Tag, like the constructed bit and the tag's type class. It also collects tag octets, up to 4 octets only, using a special TagOctetCollector. The TagDecoder is just a stateful decoder implementation that wraps the Tag methods.

The TypeClass class is a type safe enumeration of the four type classes of Tags:

  • UNIVERSAL
  • APPLICATION
  • CONTEXT_SPECIFIC
  • PRIVATE

Once the Tag accumulator collects all tag octets it determines and sets the TypeClass corresponding to the tag.
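The type class itself lives in the two most significant bits of the first tag octet (per X.690). A sketch of how the determination could be made; the method name is illustrative and not necessarily how the Tag class exposes it:

    // 00 = UNIVERSAL, 01 = APPLICATION, 10 = CONTEXT_SPECIFIC, 11 = PRIVATE
    static TypeClass getTypeClass( int firstTagOctet )
    {
        switch ( ( firstTagOctet & 0xC0 ) >> 6 )
        {
            case 0:  return TypeClass.UNIVERSAL;
            case 1:  return TypeClass.APPLICATION;
            case 2:  return TypeClass.CONTEXT_SPECIFIC;
            default: return TypeClass.PRIVATE;
        }
    }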

The TagEnum class is an abstract base class for type safe tag enumerations. This special type safe enumeration associates a tag label with two integers: the tag value and the tag id. The tag value is an integer representation of the whole tag, whereas the id is just the id field of the tag. This, by the way, is the main reason why the TagOctetCollector only accepts four bytes for building the tag: an integer is essentially used as the backing store for the tag data. The reasons for this are explained within the tag handling section to follow.

The Length and LengthDecoder operate very much in the same fashion as the Tag and TagDecoder; the same pattern is applied to both pairs of classes. The primary difference is the use of a ByteBuffer within the Length class, rather than a custom data structure like the TagOctetCollector, to accumulate Length bytes (octets). The main reason is that the 4 octet limit imposed on tags (which in fact breaks with the BER specification) cannot be imposed on lengths: Length values well above what fits in a 4 byte integer are certainly possible for TLV Values, although improbable.
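For reference, BER defines two definite length forms (X.690 8.1.3): a short form that fits in one octet and a long form that spreads the length over subsequent octets. The sketch below is not the Length class API, just an illustration of those two forms that ignores the indefinite form:

    static long decodeDefiniteLength( byte[] octets )
    {
        int first = octets[0] & 0xFF;

        if ( ( first & 0x80 ) == 0 )
        {
            return first;                      // short form: lengths 0..127
        }

        int subsequent = first & 0x7F;         // long form: count of length octets
        long length = 0;

        for ( int ii = 1; ii <= subsequent; ii++ )
        {
            length = ( length << 8 ) | ( octets[ii] & 0xFF );
        }

        return length;
    }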

The BERDecoderState class is another type safe enumeration with the following values: TAG, LENGTH and VALUE. It obviously represents the processing state of the TLV tuple currently in scope.
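Only the three constants are taken from the text above; everything else in this sketch, including the getNext() helper, is illustrative of how such a pre-Java-5 type safe enumeration could look:

    public final class BERDecoderState
    {
        public static final BERDecoderState TAG    = new BERDecoderState( "TAG" );
        public static final BERDecoderState LENGTH = new BERDecoderState( "LENGTH" );
        public static final BERDecoderState VALUE  = new BERDecoderState( "VALUE" );

        private final String name;

        private BERDecoderState( String name )
        {
            this.name = name;
        }

        /** The state that logically follows this one while decoding a TLV. */
        public BERDecoderState getNext( boolean isConstructed )
        {
            if ( this == TAG )    return LENGTH;
            if ( this == LENGTH ) return isConstructed ? TAG : VALUE;  // see Constructed Values
            return TAG;           // after VALUE the decoder reads the next Tag
        }

        public String toString()
        {
            return name;
        }
    }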

Stack State Management

The BERDecoder UML is displayed below to show some of the members and operations available. Pay special attention to the tlvStack member and the getTupleStack() package friendly Stack accessor used for stack state management:

The tlvStack is a Stack of Tuple instances. The last subsection contains a UML diagram with the Tuple class. Tuple objects are the objects handed to the decodeOccurred() method of the callback. They basically encapsulate, in one object, a bunch of information associated with a TLV tuple. This includes accounting information used to determine the processing state of constructed TLVs. The Stack of Tuples hence stores the state information associated with ancestor Tuples currently out of scope.

With every chunk of substrate processed for the tuple currently in scope, the accounting information in every Tuple on the Stack is updated. Again, this tracks how much of each ancestor's Value field has been processed. Specifically, the length and index fields of Tuple objects are used to determine how much of the TLV has been read.
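A hedged sketch of that per-chunk accounting pass: every ancestor on the stack has consumed the same Value octets as the tuple in scope, so their counters advance together. The field names follow the length and index fields mentioned above, but the direct field access and method name are illustrative only.

    void updateAncestors( java.util.Stack tlvStack, int octetsConsumed )
    {
        for ( int ii = 0; ii < tlvStack.size(); ii++ )
        {
            Tuple ancestor = ( Tuple ) tlvStack.get( ii );
            ancestor.index += octetsConsumed;

            // when index reaches length the constructed ancestor is complete
            // and can be popped with a decodeOccurred() event of its own
        }
    }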

Tuple Recycling

TLV Tuple Density

BER TLV streams contain varying densities of TLV tuples. The density of the tuples depends on the nature of the content. Streams with many small primitive types crammed together will generate TLV tuples very rapidly while processing the encoded substrate: every few bytes, or even every couple of bytes, might produce a new tuple.

If we instantiated and populated a new Tuple instance for every few bytes in the stream, then performance would degrade significantly while processing streams with high TLV tuple densities. Furthermore, rapid object creation rates would seriously tax the garbage collector. To avoid these negative effects we reuse the same Tuple, allowing interested parties to either clone or copy the contained information while processing the tuple. More often than not, most tuples will be ignored, and it would be wasteful to create a new Tuple object for every TLV tuple encountered when some or most might be ignored.

Problem With Recycling a Tuple

If we avoid instantiating new TLV Tuples and reuse the same Tuple object, we run into a problem: we lose data when we attempt to push the tuple onto the tlvStack as the next TLV is processed.

One solution to this problem is to clone constructed Tuples before pushing them onto the tlvStack; hence only primitives reuse the same Tuple. This works well because primitive tuple data does not need to be maintained past its scope. If the data needs to be copied, it can be copied by the application using the decoder, which makes sense since the application determines which Tuple values to store or ignore.
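A hedged sketch of that recycling policy: the single working Tuple is reused for primitives, while constructed tuples are cloned before being pushed so their accounting outlives the current TLV. The isConstructed() accessor and a public clone() on Tuple are assumed here for illustration.

    void tupleReadyForNesting( Tuple workingTuple, java.util.Stack tlvStack )
    {
        if ( workingTuple.isConstructed() )   // assumed accessor
        {
            // clone so the pushed ancestor keeps its own length/index accounting
            tlvStack.push( ( Tuple ) workingTuple.clone() );
        }
        // primitive tuples are not pushed; the same working Tuple is simply
        // reused for the next TLV, and applications copy any Value data they
        // need during the callbacks
    }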

This solves the performance bottleneck for substrates dense with primitive tuples. However, the problem re-emerges if the substrate is dense with deeply nested primitive tuples. If every primitive is embedded deep within its own cavern of nested TLV tuples then we are back near instantiating a Tuple object for almost every TLV encountered. The perfect substrate under this scheme, of course, would be a single primitive element; beyond that it would be flat nesting patterns where as many primitive TLV tuples as possible are stuffed into every constructed TLV tuple.

A deeply nested, high density of constructed TLV tuples is highly unlikely, although possible for recursive ASN.1 definitions. Even in situations producing a high density of constructed TLV tuples, the nesting structures will often share the same parents, so the TLV tuple to Tuple object instantiation ratio would rarely approach 1:1.

Overall we cannot determine the ratio of constructed to primitive TLV tuples encountered within a substrate. However, one would like to believe that complex structures do not predominate, and that protocol designers opt for simpler structures whenever possible. With this sensible hypothesis, reuse of primitive TLV tuples and cloning of constructed TLV tuples seems like a viable strategy for avoiding excessive object instantiation.

Integer Representation For Tags

According to the BER encoding specification, X.690, a Tag id can be any arbitrary value: there is no limitation to the size of an id. In practice ids are claimed incrementally by ASN.1 modules from the CONTEXT_SPECIFIC and APPLICATION type classes. These values for any reasonable protocol are far less than 100 ids. Experts like Larmouth claim they have never seen Tag ids larger than a thousand. So we don't bother representing Tags within a buffer for the full expressivity of the specification when we know of reasonable soft limits to the Tag id.

Four Tag Octets Are Enough

In most cases one or two octets suffice to encode a tag and its identifier. Three octets may rarely be needed. It's highly improbable that we'll ever see four or more octets used to encode a tag: even the experts have never seen this. The only way I can conceive of it is if computers begin devising or generating protocols :-).

According to the specification the long form can encode the following maximum identifier sizes with various octet lengths:

Octets   Maximum Tag Id   Calculation
1        30               2^5 - 2 (the value 31 is reserved to signal the long form)
2        127              2^7 - 1
3        16,383           2^14 - 1
4        2,097,151        2^21 - 1
5        268,435,455      2^28 - 1

As we can see, 3-4 octets encode a maximum tag id we can live with. One might expect the max tag id for, say, 4 octets to be 2^(4*8)-1 but it's not. We lose some bits in order to encode a variable length tag with the long form. In the long form all the bits of the first octet are unavailable for the id, and one bit of each octet thereafter is lost to terminate the tag field. Hence if we start out with 4 bytes, or 32 bits, we're actually using 32-8-3, or 21, of the original bits to store the value of an id. This yields a max id value of 2^21-1 for 32 bits or 4 octets.

Integer Encoding

Tags are used to match TLV tuples, and nothing matches faster than an integer in a switch statement. It therefore makes sense to store and manage the raw Tag octets within the bytes of a primitive Java integer rather than within a byte buffer. This way switch statements can be used to quickly match TLV tuples based on the integer encoding of their first four tag bytes. Furthermore, the stub compiler can prefabricate type safe enums whose values equal the integer encoding of a tag's four octets. Matching TLV tuples by tag is then as fast as it can get. This, by the way, is the sole reason why we have the abstract class TagEnum, which extends ValuedEnum: it's a type safe enumeration for Tag octets encoded as integers.
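A hypothetical illustration of why the integer form pays off: a higher level decoder can dispatch on the int encoding of the first four tag octets with a plain switch. The constant names and their hex values below are made up for this example (an APPLICATION class, constructed tag with ids 0 and 3 packed into the most significant byte).

    class TagDispatcher
    {
        static final int BIND_REQUEST_TAG   = 0x60000000;   // hypothetical [APPLICATION 0], constructed
        static final int SEARCH_REQUEST_TAG = 0x63000000;   // hypothetical [APPLICATION 3], constructed

        void dispatch( int tagAsInt )
        {
            switch ( tagAsInt )
            {
                case BIND_REQUEST_TAG:
                    // push the decoding rules registered for this pattern
                    break;

                case SEARCH_REQUEST_TAG:
                    // push a different set of rules
                    break;

                default:
                    // unrecognized tag: ignore it or report an error
            }
        }
    }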

Encoding only the first four octets of the raw tag limits the maximum id value that a TLV's tag field can represent to 2^21-1. This was the reason for the discussion in the section above: we simply will not need an id bigger than this. So we decided to break with the specification and bound the max value of the tag id at 2^21-1 rather than leave it unbounded. This limitation allows us to represent the first four octets of the tag field as an integer, thereby speeding up TLV pattern matching considerably.

The TagOctetCollector is specifically designed to accumulate the four octets of the Tag. It stores the first octet in the most significant byte of the int, the second in the next most significant and so on until the last of the four octets is stored within the least significant byte of the integer. The diagram below shows just how 4 bytes are assembled into the integer:

Note that if only 3 tag octets were collected, then the bits for octet 4 would all be zero: bits 0-7 of the integer would be zeros. Likewise, if only one octet were used then bits 23-0 would be zeroed out within the 32-bit integer.
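Since the diagram is not reproduced here, the packing can be sketched as follows; the method name is made up and this is not the TagOctetCollector API, just the byte-to-int assembly it describes:

    static int packTagOctets( byte[] tagOctets )   // 1 to 4 octets
    {
        int value = 0;

        for ( int ii = 0; ii < tagOctets.length; ii++ )
        {
            // octet 0 -> bits 31..24, octet 1 -> bits 23..16, and so on;
            // positions for missing octets simply stay zero
            value |= ( tagOctets[ii] & 0xFF ) << ( 24 - ( ii * 8 ) );
        }

        return value;
    }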

The integer encoding for tags is not leveraged at the level of the BERDecoder. At this low level the decoder does not care about tags other than those in the UNIVERSAL type class reserved for detecting TLV tuple termination sequences within the stream. Later, within the BERDigester, where Tag pattern matching is used to make sense of these TLV tuple streams, the integer encoding and the TagEnum are used heavily. Rather than add more complexity to the BERDecoder, we stop here and build upon it by stacking another decoder, the BERDigester, on top. The BERDecoder decodes encoded substrate streams into TLV Tuples and announces their arrival via callbacks which are received by the BERDigester. It is then up to the BERDigester to process these TLV tuples using decoding rules triggered by tag nesting patterns. How appropriate! Data encoded using Basic Encoding Rules is decoded using rules that process a TLV tuple stream. More information regarding the design of the BERDigester can be found here.