Friday, November 16, 2018

Miser Project: Representing Data as Obs

2019-11-05 Update: Complete migration from the WordPress version
2019-11-03 Update: Minor touch-ups as part of initial migration to Blogger

2018-10-18 Update: Strengthening the notion that obs can be data representations, via encoding, with such representations preserved by computational-interpretation of the obs.  Expressing canonical form is now aligned with the updated introduction in Denumerability and Effectiveness
2018-10-20 Update: Smoothing the narrative on data representation along with additional consistent use of “canonical form.”
2018-10-30 Update: Cover some loose ends about computational-interpretation of functional-notation in the OB.sig.sml selection.
The Miser Project conception is composed of a mathematical formulation along with demonstration of computational interpretations that satisfy the formulation via careful engineering.

The foundation of the conception is the mathematical structure ‹ob› = 〈Ob,Of,Ot〉that circumscribes the domain Ob with its primitive functions and an useful supply of individuals to establish mathematical counterparts of computational data.

Here continues the advance toward mathematical formulation of a theory of computation by considering the (computational) interpretation of obs as data.

Preliminaries

  1. Hark! Is That a Theory I See Before Me?
  2. ‹ob› Primitive Functions
  3. Narrowing ‹ob› for Computational Interpretation

1. Standing in Abstraction

It is useful to entertain the notion that the ‹ob› = 〈Ob,Of,Ot〉universe of discourse consists of intangible, abstract entities: the obs (in Ob) and the (abstract) functions/predicates on obs (treated informally and jointly in Of).   Those entities have no material/concrete existence and any abstract “existence” is entirely given by ‹ob› through the logical theory, Ot.  Consider that the entities (so-called) arise entirely in language with their essential nature given by the formulation of ‹ob› expressed in the preceding installments.  Having attention on particular obs (and particular functions and predicates on them) is simply a way of speaking and not any kind of separation from ‹ob›.
  
This way of looking at the abstract nature of  ‹ob›  emphasizes that one cannot infer anything about the mathematical entities that cannot be deduced via Ot, the explicit logical-theory formulation of ‹ob›.  In particular, however closely a concrete computational interpretation manifests – gives the appearance of – ‹ob›-ness, it is important to not confuse engineered artifacts and processes with the abstractions.  Recognition of the difference is important in comprehension of how apparent layering of abstractions is achieved by computational means and software engineering.

This separation between the abstract and the apparently-operational is illustrated by the conscription of obs for interpretation as data representations.

2. Demonstrating Computational Interpretation

A demonstration computational interpretation of ‹ob› using the programming language SML/NJ has been prepared as an SML/NJ Abstract Data Type.  Possible computational interpretations are confined behind an SML/NJ signature declaration that limits what can be exercised of the computational interpretation and through which manifestation is achieved. 

The SML/NJ signature declaration is insufficient by itself.  The additional requirements for valid computational interpretations are specified in narrative comments.


The use of SML/NJ is incidental and extremely useful because of the ease with which this kind of computational interpretation can be demonstrated.   Other programming languages and methodologies will also be used.  Using this particular signature with SML/NJ and not entertaining equivalent SML/NJ alternatives is simply by convention for the SML/NJ demonstration of the ‹ob› computational model. 

To nail down this demonstration of computational interpretation, the signature is combined with definition of an SML/NJ structure that completes establishment of the valid interpretation.


Verifying validity of the interpretation depends upon knowledge of the SML/NJ language as it is applied here and upon correct operation of the SML/NJ software and the computer used.

The condition of validity is that for any deduction concerning canonical forms in Ot, the corresponding computational procedure in the interpretation is aligned by yielding the corresponding interpretation of the deduced consequence.

Even when everything is correct, it can be impacted by operational malfunctions and by resource exhaustion.   To this extent, the interpretation is contingent and empirical.  It is an accomplishment of engineering and not logic, despite the remarkable fidelity of valid interpretations in practice.

It should be clear that the establishment of interpretation validity is, in this case, dependent on inspection of the SML/NJ signature and structure and arriving at empirical agreement on their validity as a computational interpretation.  In this instance, there is a prima facie case, absent any contradictory evidence.

One subject of the Miser Project is demonstration of  validation support using mechanical deductive procedures that are themselves verified to be sound.   Understanding how that can be achieved will depend, at bottom, on a successful empirical foundation, such as that offered at the oMiser level.   It is important to confine the foundation to something whose empirical validity is readily verifiable before bootstrapping to higher levels.   The SML/NJ demonstration is exemplary in that respect.

3. Concerning Data Representation

Canonical forms can be taken as interpretations of data via an encoding – some arrangement of an ob that corresponds to a defined encoding of data.  We say the understood encoding is an interpretation of the (abstracted) data with the happy consequence that a suitably-restricted computational interpretation of the obs will preserve an interpretation as a particular representation of data. 
  • That an ob has the form of a defined encoding does not imply that it is intended to be understood as such a representation ; however, when an ob fails to satisfy a defined encoding, one can conclude the ob is not such a representation.
  • Whether operations on obs having the form of a defined encoding preserve the obs being such such an interpretation as data depend on how the operations honor data-interpretation-preserving restraints. 
  • One way to assure that a data-interpretation is valid is by establishing functions that encapsulate the interpretation, not unlike how the interpretation of ‹ob› is accomplished via the general-purpose language, SML/NJ, using signature and structure agreements to encapsulate a computational interpretation of ‹ob›.
By way of example, one data interpretation could be as representation of numbers.  
  • Let ob.NIL represent number 0,
  • ob.e(ob.NIL) represent number 1,
  • ob.e(ob.e(ob.NIL))) represent number 2, and so on.  
That’s mathematically adequate; it’s rather clumsy as a way of recording numeric data for any general practical purpose although we will see such elementary forms of counting in simple applications of oMiser.

A more suitable interpretation might be as a representation of numbers in a binary notation.  Let ob.NIL represent the digit 0 and let ob.e(ob.NIL) represent the digit 1, the two binary bits.  Then use
  • ob.NIL as number 0,
  • ob.e(ob.NIL) as number 1,
  • ob.c(ob.NIL, ob.e(ob.NIL)) as number 2,
  • ob.c(ob.e(ob.NIL), ob.e(ob.NIL)) as number 3, and
  • ob.c(ob.NIL, ob.c(ob.NIL, ob.e(ob.NIL))) as number 4, and so on.
Here the binary representation proceeds 0, 1, 01, 11, 001, … the reverse of the usual way that binary numbers are written.  That’s still an encoding; one must look elsewhere for preference of one encoding over another.

There are many ways that encodings can be established in this manner.  Knowing what is being represented and the rule of encoding is not inherent to the obs.  Additional information on the intended interpretation is required.

It is intentional that the preceding examples be unfamiliar, exaggerating the difference between intended data and data interpretation in ‹ob›.  This is akin to the accomplishment of data representations in digital computers where the representations of data all devolve to organizations of and operations on binary bits.
  

4. Do the Lindy Hop

Although it is inescapable that additional information is required to know the intended representation for an ob’s interpretation as data, there is a simple measure to gain greater expressiveness in encodings.  We will introduce, on behalf of the computational model and for convenience, literal individuals (lindies).


Lindies are a particular class of individuals adjoined to Ob in ‹ob›.  They are simply distinct individuals with no other significance.  Any intention for them and choices among them are of no significance in ‹ob›.   Their computational interpretation, in the oMiser model, is simply as interpretations of individuals distinguished by their specific character strings.  Their intended significance in an encoding situation is similarly not addressed by ‹ob›.  Lindies are freely available to improve the expressiveness of data representations.
  

5. Streamlined Notation

Interpreting individual obs as data is accomplished on individual canonical forms, no matter how those obs might be formulated on some general pattern.   The straightforward expression of those canonical forms is aided using a specialized notation.
  • A primitive individual, with name of form ending in .NAME, such as ob.NIL, is expressed as simply .NAME (e.g., .NIL).
  • A lindy, designated by Ʃ's', is expressed simply s.
  • ob.e( x ) is expressed ‵ ( x ), where the parentheses can be omitted when x is not a pair.
  • ob.c( x, y) is expressed ( x ) :: ( y ) , where the parentheses can be omitted when x is not a pair and where the parentheses on ( y ) are always optional.
  • In this notation, :: associates to the right, so abc :: .NIL :: ‵ GHI expresses the same ob as abc :: (.NIL :: ‵ GHI), which expresses ob.c(Ʃ'abc', ob.c(ob.NIL, ob.e(Ʃ'GHI') ) ).
When all optional parentheses are removed, this notation provides a direct counterpart of canonical forms which are more compact and, with practice, more understandable.  This equivalent canonical form is identified CFob.

Here is the CFob expression for an ob that might be intended as a small dictionary, with all of the terms being lindies.
    
       (English::en) :: (Русский::ru) :: (日本語::ja) :: (Français::fr) :: ‵ default

The previous examples of encodings can be streamlined using CFob expressions:

  • .NIL for 0, ` .NIL for 1, ` ` .NIL for 2, … etc.
  • b0 for 0, b1 for 1, b0 :: b1 for 10 (2), b1 :: b1 for 11 (3), b0 :: b0 :: b1 for 100 (4), etc., with the order of the bits in the binary-numeral representation reversed as before.
    Observe that there are two levels (at least) of representation here: first as sequences of bits and secondly the interpretations of those as (decimal) numerals.

6. In Sight of the Objective: Models of Computation

The illustration of ob-representation of data has just scratched the surface.   It is time to consider how one reasons about functions of ‹ob› = 〈Ob,Of,Ot〉and how the are considered to be effectively calculable in general and in service of data representations.  That supports unfolding the Miser Project’s applicative model of computation, oMiser.

Although there are many unexplained technicalities and additional steps to take, there is a glimpse of the objective in how intended data representations are achievable.

The mathematical structure ‹ob› = 〈Ob,Of,Ot〉incorporates an abstract model, to be introduced in subsequent articles, of algorithmic computation that demonstrates the nature of universal computation and also emphasizes how the stored-program model affords cascading manifestation -- boot-strapping -- of higher-level abstractions via software engineering.   There is a hint of that in consideration of (layers of) data representation.

We forego, for now, ideas of efficiency and expressiveness in order to first capture the essential character of stored-program computation, including how data representation and representation-honoring operations are accomplished with computational interpretations of ‹ob›.
  

Continuation

  1. Representing Functions in ‹ob›’s Abstract World
    Reasoning about functions on obs and then about interpretation of other structures by representation of functions in interpretations.
  2. Interpretation, Representation, Computation, and Manifestation
    Reprise and account for specialized concepts in Miser Project, setting up for representation of the computation model