Monthly Archives: February 2019

SerentityData: Variable-byte, Statistically-based Entity Serialization & Field Encoding

Michael Herman (Toronto/Calgary/Seattle)
Hyperonomy Business Blockchain Project / Parallelspace Corporation
February 2019

The SerentityData Entity Serialization and Field Encoding library fulfills Blockchain Requirement 1 for the Universal UBL (UUBL) extensions to the UBL 2.2 business document schema specification:

Requirement 1. Compact and Efficient Binary Serialization

The requirement for extremely compact and efficient binary serialization of each entity (and subentity) can be fulfilled by various software serialization solutions including SerentityData Variable-byte, Statistically-based Entity Serialization. The SerentityData project can be found on GitHub. SerentityData is the author’s preferred solution.

Design and Implementation Strategy

Design Principles

  1. Statistically-based Encodings (STE): For a particular datatype (e.g. Signed Int 16 or Unsigned Int 64), some data values representable by this datatype will be used more frequently than others (e.g. 0 (zero), 1, small integer values, etc.) and there should be a way to encode these statistically more frequent values using as few bytes as possible (compared to larger data values).
  2. Variable-byte Encodings (VBE): Over the lifespan of a persisted field (e.g. an Unsigned Int 64 blockchain block or transaction serial number), the field will initially hold small values (0, 1, 2, 3, …) and, over years or decades, grow to hold very large values. Small values should be represented using as few bytes as possible, and using fewer bytes than very large values require.
  3. Application-adaptive Encodings (AAE): Every application or application data domain will have:
    1. A different subset of the available datatypes, and
    2. A different statistical distribution of application-dependent data values for each datatype.
  4. Single-byte Encodings (SBE): Very common (application-dependent) data values, on a datatype-by-datatype basis, should only require 1 (one) byte of storage. For example, for specific applications, the following values would be candidates for single-byte encodings:
    1. Data values True and False of datatype Boolean
    2. Specific data values of small whole numbers (e.g. 0, 1, 2, 3, 4, 5, … n) where, depending on the application and datatype, n might be 10, 31 (days of the month), 60 (seconds in a minute), etc. The values do not have to be contiguous.
    3. Specific data values of negative numbers (e.g. -1, -2, -3, -4, -5, … m) where, depending on the application and datatype, m might be -10, -20, etc. The values do not have to be contiguous.
    4. Specific data values of Enum datatypes. The values are usually contiguous but there is no requirement for them to be contiguous.
  5. Support All Datatypes (SAD): It should be possible to encode all possible datatypes and all possible data values for the selected data types.  The current list of supported data types includes:
    1. Signed Integer 16
    2. Signed Integer 32
    3. Signed Integer 64
    4. Unsigned Integer 16
    5. Unsigned Integer 32
    6. Unsigned Integer 64
    7. Byte
    8. Signed Byte
    9. Enum
    10. Byte Array
    11. Boolean
    12. Char
    13. String
    14. ASCIIString
    15. Address
    16. Guid
  6. Standalone, Compact, and Efficient (SCE) Implementation
    1. Standalone: The entity serialization and field encoding libraries will not rely on any external source code or callable binary libraries.
    2. Compact: The entity serialization and field encoding libraries will be compact enough to be useful and desirable to execute:
      1. Off-chain in a traditional computing environment including server, PC, tablet, and mobile device, as well as
      2. On-chain in a smart contract (virtual machine) execution environment
    3. Efficient: Highly efficient code execution to be useful and desirable to use in a fee-based, smart contract (virtual machine) execution environment
  7. Storage Compatibility (SC): Compatible with any data storage technology capable of storing variable-length arrays of bytes representing the field encoded and entity serialized data.
  8. Runtime Compatibility (RC): The current design and implementation of the entity serialization and field encoding libraries is compatible with .NET Core 2.0 or later.
  9. Versioning Support (VS): Support for versioning at the entity serialization as well as entity encoding levels; including support for multiple applications, serializations and encodings in the same library.
  10. Entity Extensibility and Backward Compatibility (EEBC): Entity declarations can be extended through the addition of fields without losing backward compatibility with previously serialized entities.
  11. Support for ByteArray and String Null-Valued References (NULLS): Support for null-valued references to ByteArrays and Strings – in addition to zero-length and non-zero length ByteArrays and Strings.
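The Statistically-based and Variable-byte principles above can be illustrated with a generic base-128 ("LEB128-style") variable-length integer encoding. This is a stand-in technique chosen for brevity, written in Python rather than the library's C#; it is not SerentityData's actual wire format, and the function names are hypothetical.

```python
from typing import Tuple


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte (LEB128 style)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: this is the last byte
            return bytes(out)


def decode_uvarint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset) for the varint that starts at data[offset]."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return value, offset
        shift += 7
```

With this sketch, a block serial number of 3 occupies one byte and 300 occupies two, while the largest Unsigned Int 64 value occupies ten bytes, two more than a fixed-width encoding. That trade-off only pays off when small values dominate, which is the statistical assumption the design principles rely on.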

Assumptions

  1. There exists a code generation tool that takes as input a description of an entity, its fields, and their datatypes, and that creates the sequence of calls into the entity serialization and field encoding libraries that performs serialization/deserialization and field encoding/decoding for a target entity declaration (e.g. a C# class declaration).

Implementation Notes

  1. The reference implementation of the entity serialization and field encoding libraries is called SerentityData and is implemented using .NET Core 2.0, C#, and Visual Studio 2017 Community Edition.

Roadmap

Features that are unsupported in the current release:

  1. Collections and Mappings: next on the queue
  2. Nested Entity Declarations: An entity declaration that has a field whose datatype is another entity declaration.

Field Encoding Strategy

TODO

Byte 0 Mapping

In the Variable-byte Encoding scheme used to represent the value of a datatype, the first byte of an encoded field value (Byte 0) is the most important. For Single-byte Encodings, Byte 0 needs to encode both the datatype of a particular value and the value itself.

Because Byte 0 (like any single byte) can take only 256 distinct values, trade-offs must be made between the range of application-driven datatypes required and the number of single-byte, double-byte, and triple-byte optimizations that are possible. This is where statistical knowledge of the range of values that an application actually uses for a particular datatype is important.
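As a rough illustration of how a single Byte 0 can carry both a datatype and a value, the sketch below uses a small lookup table. The op code numbers and datatype names are invented for this example and are not the library's default mapping; Python is used only for brevity.

```python
# Hypothetical Byte 0 assignments, invented for this sketch.
# 0x00 is deliberately left unassigned here.
SINGLE_BYTE_CONSTANTS = {
    0x01: ("Boolean", False),
    0x02: ("Boolean", True),
    0x10: ("Int32", 0),
    0x11: ("Int32", 1),
    0x12: ("Int32", -1),
}

# Reverse table used when encoding: (datatype, value) -> Byte 0.
ENCODE_TABLE = {pair: code for code, pair in SINGLE_BYTE_CONSTANTS.items()}


def try_encode_single_byte(datatype, value):
    """Return a 1-byte encoding, or None if this value has no single-byte slot."""
    code = ENCODE_TABLE.get((datatype, value))
    return None if code is None else bytes([code])


def decode_single_byte(byte0):
    """Return the (datatype, value) pair that a single-byte encoding stands for."""
    return SINGLE_BYTE_CONSTANTS[byte0]
```

Every entry in such a table consumes one of the 256 available slots, so an application would reserve slots only for the values it expects to see most often and fall back to a longer encoding when `try_encode_single_byte` returns None.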

Byte 0 Default Mapping

Figure 1 below is an example of the current default set of Byte 0 mappings.

Figure 1. Byte 0 Default Mapping Table

Data value 0 (zero) is currently not assigned.

Single-Byte Encodings

TODO

Two-Byte Encodings

TODO

Three-Byte Encodings

TODO

Longer Byte Encodings

TODO

Performance

TODO

Sample Use Case

An example of a distributed business application designed to be used with SerentityData is the following SerentityDapp.Perfmon for on-chain [blockchain] performance monitoring and recording.

Figure 2. SerentityDapp.Perfmon Data Model – On-chain [blockchain] Performance Monitoring and Recording Example

Appendix A – SerentityData Field Encoding Details

Supported Data Types

[Figure: 02-datatypes]

Special Use Cases

[Figure: 21-specialcases]

Field Buffer Configurations

A. Buffered Values

  • 3 Fields = FieldEncodingType + Buffer Length + Buffer Bytes

Use Case 0. 16-bit Buffered Value

[Figure: 33-usecase0]

Use Case 1. 32-bit Buffered Value

[Figure: 48-usecase1]

Use Case 2. 64-bit Buffered Value

[Figure: 53-usecase2]
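The three-field layout of a buffered value can be sketched as follows. This assumes that "16-bit", "32-bit", and "64-bit" refer to the width of the Buffer Length field, that multi-byte fields are little-endian, and that the op codes 0xA0 to 0xA2 stand in for real FieldEncodingType values; none of these details are confirmed by the text above.

```python
import struct

# Hypothetical FieldEncodingType op codes -> struct format of the Buffer Length field.
BUFFERED_FORMATS = {
    0xA0: "<H",  # 16-bit length (assumed little-endian)
    0xA1: "<I",  # 32-bit length
    0xA2: "<Q",  # 64-bit length
}


def encode_buffered(op_code: int, payload: bytes) -> bytes:
    """Emit FieldEncodingType + Buffer Length + Buffer Bytes."""
    fmt = BUFFERED_FORMATS[op_code]
    return bytes([op_code]) + struct.pack(fmt, len(payload)) + payload


def decode_buffered(data: bytes, offset: int = 0):
    """Return (op_code, payload, next_offset) for the field at data[offset]."""
    op_code = data[offset]
    fmt = BUFFERED_FORMATS[op_code]
    (length,) = struct.unpack_from(fmt, data, offset + 1)
    start = offset + 1 + struct.calcsize(fmt)
    return op_code, data[start:start + length], start + length
```

Under these assumptions a 3-byte payload with a 16-bit length costs 6 bytes in total, and the same payload with a 64-bit length costs 12, so an encoder would normally pick the narrowest length field that fits.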

B. Headered Values

  • 2 Fields = FieldEncodingType + Value (stored in the 16-bit, 32-bit, or 64-bit Buffer Length field)

Use Case 3. 16-bit Headered Value

[Figure: 59-usecase3]

Use Case 4. 32-bit Headered Value

[Figure: 73-usecase4]

Use Case 5. 64-bit Headered Value

[Figure: 78-usecase5]
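A headered value drops the separate buffer and stores the value directly in the slot that a buffered value would use for its length. A minimal sketch, assuming signed little-endian integers and hypothetical op codes 0xB0 to 0xB2 (the actual FieldEncodingType values and byte order are not stated here):

```python
import struct

# Hypothetical FieldEncodingType op codes -> struct format of the value slot.
HEADERED_FORMATS = {
    0xB0: "<h",  # 16-bit signed value (assumed little-endian)
    0xB1: "<i",  # 32-bit signed value
    0xB2: "<q",  # 64-bit signed value
}


def encode_headered(op_code: int, value: int) -> bytes:
    """Emit FieldEncodingType + Value, with no separate buffer bytes."""
    return bytes([op_code]) + struct.pack(HEADERED_FORMATS[op_code], value)


def decode_headered(data: bytes, offset: int = 0):
    """Return (op_code, value, next_offset) for the field at data[offset]."""
    op_code = data[offset]
    fmt = HEADERED_FORMATS[op_code]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return op_code, value, offset + 1 + struct.calcsize(fmt)
```

In this sketch a 16-bit headered value always occupies 3 bytes and a 64-bit one always occupies 9, which makes headered values a natural fallback for numbers that have no shorter encoding.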

C. Value Constants

  • 1 Field = FieldEncodingType (a specific constant value for a particular datatype, represented as a FieldEncodingType op code)

Use Case 6. 8-bit Value Constants

[Figure: 85-usecase6]

D. Null ByteArrays

Use Case 7. Null-valued reference to a ByteArray

[Figure: 92-usecase7]

E. Short Length ByteArrays

  • 0, 1, 2, 3 bytes in length

Use Case 8. 0-byte ByteArray

[Figure: 100-usecase8]

Use Case 9. 1-byte ByteArray

[Figure: 105-usecase9]

Use Case 10. 2-byte ByteArray

[Figure: 110-usecase10]

Use Case 11. 3-byte ByteArray

[Figure: 115-usecase11]
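Use Cases 7 through 11 separate a null-valued reference from an empty ByteArray and from very short ByteArrays. One way to picture this is to give each case its own op code so that no length field is needed. The op code values below are hypothetical and chosen only for this sketch.

```python
from typing import Optional, Tuple

# Hypothetical op codes, invented for this sketch.
OP_NULL_BYTEARRAY = 0xC0
OP_SHORT_BYTEARRAY_BASE = 0xC1  # 0xC1..0xC4 mean lengths 0..3


def encode_short_bytearray(value: Optional[bytes]) -> bytes:
    """Encode None, or a ByteArray of 0 to 3 bytes, without a length field."""
    if value is None:
        return bytes([OP_NULL_BYTEARRAY])
    if len(value) > 3:
        raise ValueError("longer arrays need a buffered encoding")
    return bytes([OP_SHORT_BYTEARRAY_BASE + len(value)]) + value


def decode_short_bytearray(data: bytes, offset: int = 0) -> Tuple[Optional[bytes], int]:
    """Return (value, next_offset); value is None for a null reference."""
    op_code = data[offset]
    if op_code == OP_NULL_BYTEARRAY:
        return None, offset + 1
    length = op_code - OP_SHORT_BYTEARRAY_BASE
    if not 0 <= length <= 3:
        raise ValueError("not a short ByteArray op code")
    start = offset + 1
    return data[start:start + length], start + length
```

Here a null reference and an empty array both cost a single byte but decode to different results (None versus `b""`), which is the distinction the NULLS design principle asks for.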

 

Best regards,

Michael Herman (Toronto/Calgary/Seattle)


Software Reliability Engineering (SRE): Faults, Errors, Failures and Symptom Chain

Michael Herman (Toronto/Calgary/Seattle)
Hyperonomy Business Blockchain Project / Parallelspace Corporation
February 2019

Draft document for discussion purposes.

Update cycle: As required – sometimes several times in a single day.

 

Figure 1. Fault (Defect), Error, Failure, and Symptom Chain

Principles

P1. A Fault (or Defect) is a physical, design, or software flaw [Tschudin].

A Fault is the mechanical or algorithmic cause of an Error, while a Potential Fault is a mechanical or algorithmic construction within a system such that (under some circumstances within the specification of use of the system) that construction will cause the system to assume an erroneous state [Millier-Smith].

A Fault is a condition that causes a system to fail in performing its required function [Agarwal].

P2. An Error is an incorrect behavior caused by a Fault [Tschudin].

The term Error is used to designate that part of the [system] state which is “incorrect” [Millier-Smith].

An Error is a discrepancy between the actual value of the output given by the software and the specified value of the output for a given input. That is, Error refers to the difference between the actual output of the software and the correct output. An Error is also used to refer to the wrong decision in a given case as compared to what is expected to be the right one. Error also refers to human actions that result in software containing the defect or fault [Agarwal].

P3. A Failure is the inability to perform according to a specification because of an Error [Tschudin].

A Failure of a system occurs when that system does not perform its service in the manner specified, whether it is unable to perform the service at all, or because the results and the external state are not in accordance with the specifications. [Millier-Smith].

Failure is the inability of the software system to perform a required function to its specification [Agarwal].

P4. Symptoms are external manifestations of failures [Steinder].
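The chain in P1 through P4 can be traced through a small, deliberately buggy function. This is an illustrative sketch only; the function names and the specification in the docstring are invented for the example.

```python
def mean(values):
    """Specification: return the arithmetic mean of a non-empty list."""
    total = 0
    for i in range(1, len(values)):  # FAULT: the loop skips index 0
        total += values[i]           # ERROR: total is wrong whenever values[0] != 0
    return total / len(values)


def check(values, expected):
    """Return (failed, symptom) for one call to mean()."""
    actual = mean(values)
    failed = actual != expected      # FAILURE: the output deviates from the specification
    # SYMPTOM: the externally visible report of the failure
    symptom = f"expected {expected}, got {actual}" if failed else None
    return failed, symptom
```

For the input `[0, 2, 4]` the fault is present but produces no erroneous state, so `check([0, 2, 4], 2.0)` reports no failure. For `[3, 3, 3]` the same fault produces an incorrect total, the result violates the specification, and the symptom is the message `expected 3.0, got 2.0`.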

References

[Agarwal] B.B. Agarwal, M. Gupta, and S.P. Tayal, “Software Engineering and Testing”, Jones & Bartlett Learning, 2010.

[Millier-Smith] P.M. Melliar-Smith and B. Randell, “Software Reliability: The Role of Programmed Exception Handling”, ACM, 1976.

[Steinder] Małgorzata Steinder and Adarshpal S. Sethi, “A survey of fault localization techniques in computer networks”, Science of Computer Programming, Volume 53, Issue 2, November 2004, Pages 165-194.

[Tschudin] Christian Tschudin, CS321/CS221: Autonomic Computer Systems, University of Basel, 2006.

[Herman] Michael Herman, “Fault-Error-Failure Chains”, https://www.facebook.com/notes/michael-herman/fault-error-failure-chains/493110544121/, Nov. 14, 2010.

 


#iDIDit: [OLD] An Architecture-driven Taxonomy for SSI Agents v0.5

IMPORTANT: Don’t read this version of this article about the INDY-AGENT ARM.  The official (draft) version of the INDY-AGENT ARM has moved to GitHub: Appendix B – Indy Agent Architecture Reference Model (INDY-AGENT-ARM).

Michael Herman (Toronto/Calgary/Seattle)
Hyperonomy Business Blockchain Project / Parallelspace Corporation
February 2019

Draft document for discussion purposes.
Update cycle: As required – sometimes several times in a single day.

The chart below was inspired by Daniel Hardman’s 0002: Agents HIPE dated 2017-11-01 (and published 2019-01-31).  Hopefully the chart speaks for itself.  The original goal was to take Daniel’s prose and visualize it as a 2×2 matrix. I #almostDIDit :-).

Please send me your comments and feedback.  Click on each chart to enlarge and/or download it.

Recent Changes

  • Added Relay Agent.
  • Removed Thin “complexity” agent from Partial Function Agent category.
    • Thin “complexity” agent remains in the Lightweight Agent category.
  • Clarified that Thin “complexity” agent has no local state (i.e. no wallet and no local config state/files).
  • Removed Static “complexity” agent from Partial Function Agent category because it doesn’t have a Local Wallet.
    • Static “complexity” agent remains in the Lightweight Agent category.
  • Clarified that Static “complexity” agent has Local Config state.
  • Clarified that the Relay agent has a Local Wallet.
  • Added history of previous versions of this chart back to v0.3

Version 0.5

Figure 0.5. An Architecture-driven Taxonomy for SSI Agents v0.5

Version 0.4

Figure 0.4. An Architecture-driven Taxonomy for SSI Agents v0.4

Version 0.3

Figure 0.3. An Architecture-driven Taxonomy for SSI Agents v0.3

Best regards,
Michael Herman (Toronto/Calgary/Seattle)

 
