Notes on software abstraction
I.
The concept of abstraction - The term abstraction has a long history of senses, it derives from the Latin abstractus originally connoting asceticism and meaning "to withdraw from worldly affairs". Etymologically it is derived from the combination of the word-forming element "ab-" meaning "off, away from" and the verb "trahere" meaning "to drag, draw" and translating as "to detach, to pull away".
In software design we typically think of abstraction as a design tool that detaches from particulars that which they share in common. Iterative application of this process forms a hierarchy or tree structure.
The hierarchical notion of abstraction used in zoological classification as well as in software design is defined most simply by Barabara Liskov in "Abstraction and specification in program development" as the following process:
... to forget information ... [in order] to treat things that are
different as if they were the same
This method of abstraction traces its roots back to Aristotle's Categoriae as the genus-differentia style of definition. Aristotle's positive definition of abstraction can be derived from his definition of subjects (specific instances of an abstraction) as follows:
... that which is either predicable of a subject or present in a
subject
That is, abstraction maintains either an inheritable (predicable) or
compositional (present in) relationship with those things it
abstracts. An Admin
is a kind of User
,
a User
has a username
. However, for
Aristotle "present in" has the very specific meaning:
... incapable of existence apart from said subject
For example, a User
is capable of existence without
an Address
as it likely still contains other information (an id, a
username, etc) that is more fundamental to the concept of
a User
. Therefore, an Address
is not inseparably part
of a User
though a User
may have an address,
they may even have many.
When we are looking to define abstractions we can conduct this subtraction
exercise to separate those things which are necessarily part of a given
abstraction from those things which are only optionally part of it. It often
makes more sense to define Address
as a separate abstraction that
maintains a relation to a User
.
These two definitions - the process definition and the positive definition - explain how we abstract and what separates an abstraction from the individual things it covers. An abstraction is an object in its own right, one that covers a heterogeneous set of particular objects. But what of this covering?
II.
Comprehension and extension - Debatably introduced in "The Port-Royal Logic" by Antoine Arnauld and Pierre Nicole. The comprehension of an abstraction is defined as:
... those attributes which ... cannot be taken away without destroying
it
The comprehension is the concise definition of an abstraction with all superfluous details removed. The extension on the other hand is:
... those subjects to which the [abstraction] applies ...
That is, all of the individual objects covered by an abstraction. Comprehension and extension correspond closely to what in software is called the interface and its implementations. When we say that an interface is extensible, we mean that we can freely increase its extension by adding new implementations.
III.
Inverse relationship - Deleuze points out in "Difference and Repetition" as did many before him such as Cardinal Mercier that extension and comprehension maintain an inverse relationship. The greater the number of individuals covered by an abstraction, the lower the comprehension or number of attributes that abstraction can have. Conversely, the lower the number of individuals covered, the more attributes they can share within a single abstraction.
In software terms, the more specific implementations we try to cover with an interface, the fewer things we can actually do with those implementations through that interface. Interfaces trade off access to specific functionality with applicability to a broader set of implementations.
For example, if you define an interface to take an Object
or void*
you have gained a maximum of extension (you can
operate on any kind of entity). However, this maximum extension is
paid for with a minimum of comprehension (you can't do much with the
things passed in). Conversely, if you revise the interface to take
an int64_t, you have attained a maximum of comprehension (you
have access to all the operations defined for int64_t) in exchange for
a minimum of extension (you can only operate on instances
of int64_t). Between the two lies a spectrum of degrees
of extension and comprehension, but always in this inverse
relationship.
IV.
Interface intensity - De Morgan says in his Formal Logic:
According to such statements as I have seen, "man residing in Europe,
drawing breath north of the equator, seeing the sun rise before those in
America," would be a more intensively quantified notion than "man residing in
Europe"; but certainly not less extensive, for the third and fourth elements of
the notion must belong to those men to whom the first and second
belong.
The intensity of an interface is a measure of how overdetermined that interface is. Intensity, as a rule, does not reduce the extension of an abstraction but merely incorporates more fields, methods, or qualifications within it. The more of these we introduce the more intensive the interface becomes.
For example, we might define an integer interface with methods
like add()
, sub()
, mul()
,
and div()
. This would extend over all basic machine
types: 8-, 16-, 32-, and 64-bit integers. An intesification of this interface
could be made by introducing concepts
like saturating_add()
, overflowing_add()
, wrapping_add()
and so on for all the operations. The abstraction has not lost any of its
extension - it still applies to all the basic machine types - but
these additions represent an intesification of the overall interface
and its semantics.
V.
Blockage - Reality and thought both provide unique problems and stumbling blocks for abstraction. In the most general terms, the extension of an abstraction becomes blocked whenever the differences between the things abstracted reassert themselves in a way that is dissonant with the original abstraction. You know that the machine is capable of doing (or not doing) a particular thing, but the abstraction as it exists blocks us from doing (or not doing) it. This can happen in thought as a new or revised assumption that the abstraction now violates or fails to incorporate. It can happen in practice as a change in the state of affairs the abstraction was created to encapsulate or bring about.
When the extension of an interface becomes blocked you are left with four options:
1. Remake the interface
Perhaps you have created an abstraction called File
with
methods read()
and write()
, but you now realize you
need some way to change the position you're reading from or writing to. In this
case, you may simply patch the original interface by adding a
new seek()
method. In a more complicated scenario, you might remake
the abstraction entirely, moving to a model where blocks of data are
asynchronously read or written at specific locations using
a transfer()
method and allowing users to poll for the status of
their transfers.
2. Break apart the interface
Maybe you decide that File
s that allow seeking and those that
do not are really two separate things. This might lead us to break these apart
into a File
abstraction for seekable things and
a Stream
abstraction for non-seekable things.
3. Introduce new semantics into the interface
Perhaps instead of splitting the file abstraction apart you add
a seek()
method everywhere, and establish the convention that for
non-seekable implementations calling seek()
simply
succeeds without doing anything. This would prevent the need for downstream
code modifications, but complicates the semantics for our users as they must
now be aware that sometimes seek()
may not do any seeking.
4. Introduce a means of accessing the implementations through the interface
Linux device drivers provide a nice example of this
with ioctl(2). Because hardware is so difficult to abstract as
devices evolve and manufacturers add extended and specialized functionality,
some sort of "escape hatch" is needed for accessing device-specific
functionality through the linux file-system interface (for example, querying
disk drive geometry). ioctl()
provides an extension point for
individual device drivers to satisfy additional, device-specific requests
through the file descriptor interface. ioctl()
itself takes only
the device's file descriptor, an integer value representing the request type,
and an arbitrary set of arguments constituting the request. The downside of
this approach is that clients of ioctl()
cannot learn about the
additional functionality except by discovering and incorporating knowledge of
specific devices into their programs. There is also nothing
preventing ioctl()
interfaces from changing or breaking client
programs and no way for client programs to learn about this breakage except by
observing their programs fail or keeping an eye on changes to the relevant
device drivers. A new set of conventions and pseudo-interfaces has grown
around ioctl()
requests for specific classes of devices as an
attempt to alleviate these problems.
VI.
Leakage - Whereas blockage occurs in the interface,
abstraction leakage occurs in the implementations. Specifically,
leakage occurs when details of the implementation are revealed through
the interface when it is applied in a particular way - that which was
believed buried has come back to haunt us. An example, given
by Kiczales, is the attempt to use a Window
abstraction to handle the cells in a spreadsheet by simply creating a set of
child windows as follows:
for(int i = 0; i < 100; ++i) { for(int j = 0; j < 100; ++j) { mkwindow(parent, 20, 10, i*20, j*10); } }
The hope is that this would allow for the display of a rectangular grid of cells and the intercepting of keyboard and mouse events on individual cells using functionality provided by the standard window interface. However, in practice this approach is never used as window abstractions tend to be too heavy-weight to remain performant when there are lots of windows. Details about the implementation of the window abstraction leak through the interface in the slowdown produced when it is used in this way. It is revealed that there are semantics and states of affairs outside of the abstraction that must now be accounted for.
In the case of a leak there are several options:
1. Patch the leak
To patch the leak you might try to gain and exploit knowledge of the
problematic implementation. In the spreadsheet case above you might
learn about the internal structure of the Window
objects
themselves. For example, you might find an internal flag is_modal
that you can set to prevent certain costly operations when rendering or
updating. Such patches tend to be brittle in the face of changes in the
underlying implementation and brittle in usage beyond tested limits
since they largely rely on misusing knowledge about
the implementation.
2. Repair the leak
To repair a leaky abstraction generally requires a drop down into the
problematic implementation itself. In the Window
case this might
involve modifying the Window
data structure in order to optimize
the particularly slow functionality.
3. Build a new abstraction
Finally, you can rebuild the abstraction to fix the leakage exposed. For
example, in the spreadsheet case you might simply choose to implement your own
Cell
interface with a lighter-weight data structure, foregoing the
reuse of the Window
code entirely and likely duplicating much of
it in the process.
VII.
The general problem of software design - The function of software, as noted most concisely by Jackson, is described as follows:
The machine is physically embodied in a general-purpose computer built by
hardware engineers ... the software describes the particular machine needed for
the purpose in hand, and transforms the computer into that machine.
All design involves the determination of how best to specialize a general purpose computer to satisfy a particular set of needs. During the execution of the program the machine is transformed into a specialized machine for carrying out a particular project.
Under this lens the problems of blockage and leakage can be seen as a battle between the Turing-completeness of our general purpose computer and the Turing-incompleteness introduced by our abstractions. As stated concisely by Engler:
'Infinite' extensibility requires Turing completeness, Turing
completeness gives infinite extensibility ... [infinite] extensibility requires
supporting all unanticipated use cases.
The most flexible abstractions are Turing-complete, Turing-incomplete abstractions always have some blockage or leakage. However, blockage and leakage may also result from operating system and hardware abstractions or when the reality of modern hardware reasserts itself.
VIII.
The material reality of the computer - Far from being pure thought stuff, software is always tasked with running on some actually existing machine and utilizing some actually existing hardware. All abstractions eventually resolve back through the operating system and hardware interfaces to physical effects produced in hardware.
The operating system is an abstraction (see Arpaci-Dusseau) that allows every running process to behave as if it had exclusive access to the total machine. It does this by providing hardware, file system, and virtual memory abstractions and by multiplexing access to hardware. Processes request access to these shared resources using OS-provided interfaces defined via syscalls. The difference between syscall interfaces produce incompatabilities that require programs to be separately compiled for each OS, even when targeting the same hardware. In exchange the OS hides hardware details, provides some safety, and also performs global optimizations on resource requests including scheduling and caching.
Hardware itself provides abstractions in the form of CPU instructions, memory-mapped I/O, and interrupts. The ideal of a CPU linearly executing a program instruction-by-instruction is also an abstraction provided by modern CPUs as they increasingly utilize out-of-order execution, branch prediction, cache prefetching, and distributed consensus protocols for cache coherency.
This shifting substrate means that idealized notions about performance as
embodied in big-O notation may no longer hold. In practice, it is not uncommon
to find that an
O(n) = n + m
algorithm (ex. deletion in an array) is faster than
an O(1) = m
algorithm (ex. deletion in a linked-list) as
the m
in the latter big-O model becomes increasingly large on
modern hardware. See, for example,
Wicht on how linked-lists can underperform simple arrays
for random insertion/deletion due to the extensive caching on modern
CPUs. Big-O notation itself is an abstraction that relies on an idealized model
of the computer, it too is prone to blockage and leakage.
IX.
Semantic compression and leverage - Software abstraction proceeds
mechanically by substitution and spatial and temporal subdivision. The process
to be carried out by a program is temporally subdivided into its moments, their
sequential combination produces the desired effect. The data and code is
spatially subdivided in memory. The call
or jump
as
well as the pointer or reference are the facilitators of substitution. Linear
organization in memory is the facilitator of spatial and temporal
subdivision.
Substituting a call
to a subroutine for the contents of a
subroutine builds a lexicon which can be used to reduce duplication and
heterogeneity. Using a pointer or reference instead of a value reduces
duplication and copying. Both facilitate compression of data as well as
compression of semantics. Both of these give us leverage - the ability
to do a lot with a little.
It is precisely here that a risk arises. We can view the whole purpose of abstraction as the dissection of a program into its conceptual categories. This was the profound error and misstep of object-oriented programming (OOP). This style of abstraction leads to the traditional problems of reason, logic, and language as thoroughly covered by Beiser. Alternatively, we can view the purpose of abstraction as compression guided by resemblance and analogy. This style of abstraction has poetry and literature as its exemplary arts. We are given these techniques of substitution in order find the boundaries and limits of resemblance and analogy in pursuit of compression. The choice of path on this fork is the momentous decision that determines the whole of the method we pursue in construction.
Bibliography
- Aristotle. Categories
- Arnauld, Antoine & Nicole, Pierre. The Port-Royal Logic
- Arpaci-Dusseau, Remzi & Arpaci-Dusseau, Andrea. Operating Systems: Three Easy Pieces
- Beiser, Frederick C. The Fate of Reason
- Deleuze, Gilles. Difference and Repetition
- Deleuze, Gilles. What Is Grounding?
- De Morgan, Augustus. Formal Logic
- Egyed, Bela. Deleuze's Transcendental Empiricism
- Engler, Dawson R. The Exokernel Operating System Architecture
- Hegel, G.W.F. Who Thinks Abstractly?
- Jackson, M. A. A Discipline of Description
- Kiczales, Gregor. Towards a New Model of Abstraction in the Engineering of Software
- Kiczales, Gregor. (Video) Why Black Boxes are so Hard to Reuse
- Liskov, Barbara & Wing, Jeannette. Behavioral Subtyping Using Invariants and Constraints
- Liskov, Barbara & Guttag, John V. Abstraction and Specification in Program Development
- Maimon, Salomon. Essay on Transcendental Philosophy
- Mercier. Elements of Logic
- Myers, Glenford J. Reliable Software Through Composite Design
- Peirce, C.S. Upon Logical Comprehension and Extension
- Wicht, Baptiste. C++ benchmark – std::vector VS std::list VS std::deque
incoming(2): memory allocatorswriting
Last update on 7E5108, edited 6 times. 17/18thh
- 7E5108 - published notes on abstraction