Software Reliability Engineering (SRE): Faults, Errors, Failures and Symptom Chain

Michael Herman (Toronto/Calgary/Seattle)
Hyperonomy Business Blockchain Project / Parallelspace Corporation
February 2019

Draft document for discussion purposes.
Update cycle: As required – sometimes several times in a single day.

 

PSN-SRE-Fault-Error-Failure-Symptom-Model v0.1

Figure 1. Faults, Errors, Failures and Symptom Chain

Principles

P1. A Fault (or Defect) is a physical, design, or software flaw [Tschudin].

A Fault is the mechanical or algorithmic cause of an Error, while a Potential Fault is a mechanical or algorithmic construction within a system such that (under some circumstances within the specification of use of the system) that construction will cause the system to assume an erroneous state [Millier-Smith].

A Fault is a condition that causes a system to fail in performing its required function [Agarwal].
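The distinction between a Potential Fault and a Fault that actually causes an Error can be illustrated with a minimal Python sketch. The `average` function and its specification are hypothetical, invented here for illustration only. The flawed construction (floor division) is always present in the code, but it drives the result away from the specification only for some inputs.

```python
def average(values):
    """Hypothetical specification: return the arithmetic mean of a non-empty list."""
    # Potential Fault (assumed for illustration): floor division (//)
    # was written where true division (/) was intended.
    return sum(values) // len(values)


# The fault stays dormant: 12 // 3 equals the specified mean of 4.
dormant_result = average([2, 4, 6])

# The fault is activated: 3 // 2 yields 1, while the specification requires 1.5.
activated_result = average([1, 2])

print(dormant_result, activated_result)
```

Both calls execute the same flawed line; only the second input falls within the circumstances under which the construction produces an incorrect value.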

P2. An Error is an incorrect behavior caused by a Fault [Tschudin].

The term Error is used to designate that part of the [system] state which is “incorrect” [Millier-Smith].

An Error is a discrepancy between the actual value of the output given by the software and the specified value of the output for a given input. That is, Error refers to the difference between the actual output of the software and the correct output. The term Error is also used to refer to a wrong decision in a given case, as compared to what is expected to be the right one. Error can also refer to the human actions that result in software containing a defect or fault [Agarwal].

P3. A Failure is the inability to perform according to a specification because of an Error [Tschudin].

A Failure of a system occurs when that system does not perform its service in the manner specified, whether because it is unable to perform the service at all or because the results and the external state are not in accordance with the specifications [Millier-Smith].

Failure is the inability of the software system to perform a required function according to its specification [Agarwal].

P4. Symptoms are external manifestations of failures [Steinder].
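Principles P1 to P4 can be walked through end to end in a short Python sketch. Everything in it is an assumption made for illustration: the `Inventory` class, the double-subtraction flaw, the specified values and the symptom text are hypothetical and are not taken from the cited sources. The sketch also shows one possible reading of the definitions, in which an Error (incorrect internal state) can exist for a while before it leads to a Failure (service that deviates from the specification).

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical system whose service is answering 'is this item available?'."""
    stock: dict = field(default_factory=dict)

    def add(self, item, qty):
        self.stock[item] = self.stock.get(item, 0) + qty

    def remove(self, item, qty):
        # FAULT (assumed design flaw): the quantity is subtracted twice.
        self.stock[item] = self.stock.get(item, 0) - 2 * qty

    def is_available(self, item):
        # The externally visible service.
        return self.stock.get(item, 0) > 0


def walk_chain():
    inv = Inventory()
    inv.add("widget", 3)

    # First removal: the specification says 3 - 1 = 2 remain; the system holds 1.
    inv.remove("widget", 1)
    error_1 = inv.stock["widget"] != 2          # ERROR: part of the state is incorrect
    failure_1 = inv.is_available("widget") != True  # service still matches the spec

    # Second removal: the specification says 1 remains; the system holds -1.
    inv.remove("widget", 1)
    failure_2 = inv.is_available("widget") != True  # FAILURE: service deviates from spec

    # SYMPTOM: what an outside observer sees of the failure.
    symptom = "order rejected: widget reported out of stock" if failure_2 else None

    return {
        "error_after_first_removal": error_1,
        "failure_after_first_removal": failure_1,
        "failure_after_second_removal": failure_2,
        "symptom": symptom,
    }


print(walk_chain())
```

An observer who sees only the symptom has to work backwards through the failure and the erroneous state to locate the flawed `remove` method, which is the direction in which the chain is usually traversed during diagnosis.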

References

[Agarwal] B.B. Agarwal, M. Gupta, and S.P. Tayal, “Software Engineering and Testing”, Jones & Bartlett Learning, 2010.

[Millier-Smith] P.M. Millier-Smith and B. Randell, “Software Reliability: The Role of Programmed Exception Handling”, ACM, 1976.

[Steinder] Małgorzata Steinder and Adarshpal S. Sethi, “A survey of fault localization techniques in computer networks”, Science of Computer Programming, Volume 53, Issue 2, November 2004, Pages 165-194.

[Tschudin] Christian Tschudin, CS321/CS221: Autonomic Computer Systems, University of Basel, 2006.

 
