Friday, April 30, 2010

Orders of Infinity

Originally posted 14 April 2010 on my temporary blogspace...

Eighteen months ago I posted a blog entry describing a Mass-Parity-Distance-invariant Universe. I got that idea while reading Roger Penrose's The Road to Reality during a month of vacation time away from online and other distractions. I couldn't see anything in the book that contradicted the idea. I knew String Theory is the most popular theory on how to unite the four forces, explain the Universe, and all that, but whenever I read up about it I sensed a certain inelegance about it all. Because I knew Penrose felt the same way, I was keen to plow through his book when I had the time. I haven't read up on physics much since writing that blog, but it's hard not to think about the maths when I'm out for a walk and daydreaming, and so I've had some further ideas for the maths...

Physics recap

I explained how by using a Mass-Parity-Distance (MPD) symmetry, similar to the well-known Charge-Parity-Time (CPT) symmetry, we would have a Universe where equal amounts of positive and negative energy fly off in opposite directions at the big bang, each side self-attracting but mutually repelling. At the Big Bang, the left-handed gravitons and left-handed neutrinos fly off in one direction, thus equating positive mass with normal matter in the large, while their right-handed counterparts would fly off in the other direction, equating negative mass with anti-matter. Matter and antimatter created from positive energy in our side of the Universe would both have positive mass, but a virtual particle-antiparticle pair in a vacuum would have an overall energy of zero, one of the pair having positive energy, the other, negative.

I also explained how such an MPD-symmetric Universe could explain dark energy, though I suspect for it to have its observed strength and timing, the observable Universe would be a very tiny proportion of the actual Universe. Just as our sun is one of about 100 billion in the Milky Way, and our galaxy one of about 100 billion in the observable Universe, so our observable Universe could also be a 100-billionth of the actual Universe. Homogeneity and isotropy would be a more local effect in this scenario. To picture it all using the common 2D-curved-space picture for general relativity, the positive matter would be on top of the sheet sinking downwards, but the negative matter would be under the sheet to indicate negative distances, floating upwards to indicate the negative mass. The positive and negative matter would both self-gravitate, but repel each other.

Cantor's hierarchy of infinities

I suggested the dual CPT and MPD symmetries suggest a mathematical model where the Universe is made up of two complex planes rather than four real lines. A complex plane is different to a 2-real-D plane in not having reflectional symmetry about the real line: two complex planes would have two such asymmetries, perhaps explaining the "parity" in both the CPT and MPD symmetries. But to see why our Universe would be two complex planes related in some way instead of some other structure, we'd need to understand what would make such a structure special. Why two complex planes and not three? And why use complex planes instead of real lines? What might make such a structure special could be the same thing that makes a single complex plane special. I suggest we look at its position in Cantor's hierarchy of infinities, as that seems more foundational than the other branches of mathematics, and move on up from there...

We know the zeroth order of infinity (a.k.a. finiteness) can define a logic system, by using two or more ordered finite values, i.e. false and true for boolean logic, more for other logic systems. The first order of infinity (a.k.a. aleph-0) can define an arithmetic system. The simplest is the natural numbers, created by starting at number 1 and applying induction. Extensions to this such as the integers and rational numbers are also aleph-0. Gödel showed this arithmetic system cannot be both consistent and complete.

Things get more interesting at the second order of infinity. A higher order of infinity can always be reached by taking the power set of a lower one. The real numbers, introducing the square root of 2, extend the integers. We can prove they're at some higher order than the first order (integers, etc.), but can't prove at what order they are. We therefore speculate they're at the second order (a.k.a. aleph-1), calling this the Continuum Hypothesis. Other number systems with the property of continuity (e.g. complex numbers, n-D manifolds) would then also be aleph-1, but complex numbers, which introduce the square root of -1, are "algebraically closed", not requiring any further extensions.
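A finite analogue may make the power-set step concrete. The sketch below (the helper name `power_set` is just an illustration) lists every subset of a small set and shows there are 2^n of them, always more than n. Cantor's theorem extends that strict inequality to infinite sets; whether the power set lands on the very next order up is what the Continuum Hypothesis and its generalization assert.

```python
from itertools import combinations

def power_set(items):
    """Return every subset of `items` as a list of frozensets."""
    items = list(items)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

for n in range(5):
    subsets = power_set(range(n))
    # A set with n elements has 2**n subsets, strictly more than n.
    print(n, len(subsets))
```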

And what of aleph-2, the third order of infinity? The set of all curves (including fractal ones) is known to be at some order of infinity above aleph-1, hypothesized to be aleph-2. When looking at the curves, maybe it's best to consider the most complex curves first, such as 1D-lines with a (Hausdorff-) dimension of 2. The most famous of these are the Mandelbrot and Julia sets, which both happen to be defined on the complex plane. When we look at computer simulations of them, we see many concentric closed curves of various colors, reflecting the different integral-valued accuracies of calculation. If we could mathematically define a real-valued accuracy of calculation, would we still see concentric fractal curves? I suspect so, and that they would merge into a continuously-varying fractal structure with (Hausdorff-) dimension of 3, on the 2-dimensional complex plane. All of the Julia sets for the standard Mandelbrot look like they have this same concentricity property, though only some of them seem to have (Hausdorff-) dimension 3, the rest (on inspection) seeming to have dimension of less than 2. One is even a perfect circle, with only (Hausdorff-) dimension 1. And of course we could consider the nonstandard Mandelbrot views for all these Julia sets.
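Assuming the "accuracy of calculation" behind the colored bands is the usual escape-time iteration count (the post doesn't name the algorithm), a real-valued version is already used in fractal rendering: the normalized iteration count, which adds a fractional correction to the integer count. The sketch below compares the two for the standard Mandelbrot iteration; the function names, iteration cap, and bailout radii are illustrative choices, and this says nothing about the Hausdorff-dimension conjecture itself.

```python
import math

def escape_count(c, max_iter=200, bailout=2.0):
    """Integer level: steps before the orbit of 0 leaves |z| <= bailout."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > bailout:
            return n
        z = z * z + c
    return max_iter  # treated as "never escaped"

def smooth_escape_count(c, max_iter=200, bailout=1e3):
    """Real-valued variant: the normalized iteration count."""
    z = 0j
    for n in range(max_iter):
        r = abs(z)
        if r > bailout:
            # Fractional correction from how far past the bailout the orbit landed.
            return n + 1 - math.log(math.log(r)) / math.log(2)
        z = z * z + c
    return float(max_iter)

for c in (1 + 0j, 1.001 + 0j, 0.3 + 0.5j, -1 + 0j):
    print(c, escape_count(c), smooth_escape_count(c))
```

Nearby parameters that share an integer level get slightly different real-valued levels, so the bands blend into a continuous gradient.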

The Julibrot Set

Let's use these two fractals in a candidate mathematical model for our Universe. The Julibrot Set is defined as the topologically-4D union of all the Julia sets of the Mandelbrot set. If we consider real-valued accuracies of calculation for the entire Julibrot, we have a (Hausdorff-) dimensionality of somewhere between 4 and 6, embedded in the 4 topological dimensions. Now I suspect there could be a theorem concerning the (Hausdorff-) dimension in such a structure: I'll speculate it's 5 and see where that leads. If such a 2-complex-plane structure explains the 4D-spacetime of our Universe, then there's an extra non-expansive dimension supplied by considering the fractal curves at different positive-real-valued degrees of accuracy. This would explain the phenomenon of mass in our 4D-spacetime, along with certain rules of distribution within, which could be the law of Einsteinian gravity.
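As one concrete reading of that definition, the sketch below treats a point of the Julibrot as a pair (c, z0) of complex numbers and asks whether the orbit of z0 under z → z² + c stays bounded. Fixing z0 = 0 gives the Mandelbrot slice; fixing c gives the filled Julia set for that c. The function names and the iteration cap are illustrative, and a finite cap can only approximate membership.

```python
def in_julibrot(c, z0, max_iter=500):
    """Approximate test: does the orbit of z0 under z -> z*z + c stay bounded?"""
    # Once |z| exceeds max(2, |c|) the orbit is guaranteed to escape.
    bailout = max(2.0, abs(c))
    z = z0
    for _ in range(max_iter):
        if abs(z) > bailout:
            return False
        z = z * z + c
    return True

def mandelbrot_slice(c):
    """The slice z0 = 0 of the 4D set."""
    return in_julibrot(c, 0j)

def julia_slice(c, z0):
    """The slice with c held fixed: the filled Julia set for c."""
    return in_julibrot(c, z0)

print(mandelbrot_slice(-1 + 0j))    # orbit 0, -1, 0, -1, ... stays bounded
print(mandelbrot_slice(1 + 0j))     # orbit 0, 1, 2, 5, ... escapes
print(julia_slice(0j, 0.5 + 0j))    # inside the unit disk, the Julia set for c = 0
```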

But where would this positive real number come from? In my other blog entry, I suggested we could relate each complex plane to the other probabilistically! Each position in a Julia set would correspond only probabilistically to the positions in the associated nonstandard Mandelbrots. This would explain why the said mass in our Universe doesn't conform fully deterministically to the large-scale gravity-based rules of distribution, but has degrees of freedom as ultimately enabled by the law of quantum physics, as powered by Planck's constant, being the measure of probability relating the two complex planes together. Such a structure consisting of 2 complex planes related probabilistically therefore would be the minimally complex structure that can squeeze in a fifth real-valued dimension, the one we know as Mass, into the four expansive dimensions.

What would be the order of infinity for this Julibrot-based Universe? There could be a theorem saying this structure is necessary and sufficient to contain all curves, and is therefore at the third order of infinity, aleph-2. Our Universe could then be the minimally complex structure that can exist at aleph-2. Furthermore, just as the complex numbers are algebraically closed at aleph-1-infinity, so also our 4D-spacetime with its inbuilt phenomenon of Mass, obeying rules both of gravitation and of quantum physics, could be complete in some way at aleph-2-infinity.

The directed dimension and inbuilt polarity

Is this probabilistically-defined Julibrot set the best model for our Universe? Any recursively-applied polynomial equation seems to give the basic Mandelbrotly-edged shape on the computer, so presumably they all have high enough Hausdorff dimension to be a candidate model. But only simple Julibrot Sets are reflectionally or rotationally symmetric in 3 dimensions while being asymmetric in the fourth, giving a dimension that looks like Time.

By looking at the Julibrot's constituent Mandelbrot and Julia sets on the computer, including their colored accuracy levels, we see that the Mandelbrot-real dimension is the asymmetric dimension, i.e. the Time dimension. As a bonus, the Mandelbrot-imaginary dimension is the only one that's reflectionally symmetric, therefore the one along which positive and negative matter flew apart, i.e. the Axial Space dimension. The Julia sets at each point below the standard Mandelbrot Time axis (where y=0 on the plane) are similar to the corresponding one above, except for being reflected through their Julia-real axis. This gives the appearance of the Mass rotating in opposite directions in each half of the structure, perhaps suggesting the opposite handedness of gravitons and neutrinos in each half of the mass distribution in our Universe. But in each half of the Mandelbrot Mass distribution, when we look toward the Time axis, the Mass appears to rotate in the same direction, perhaps suggesting why space appears to have an inbuilt polarity.
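The reflection property can be checked numerically. Complex conjugation commutes with z → z² + c, so the orbit of conj(z0) under conj(c) is the mirror image of the orbit of z0 under c, and the two escape at exactly the same step. The sketch below compares escape counts for a few sample points; the helper names and sample values are arbitrary choices for illustration.

```python
def escape_count(c, z0, max_iter=200):
    """Steps before the orbit of z0 under z -> z*z + c leaves the escape radius."""
    bailout = max(2.0, abs(c))
    z = z0
    for n in range(max_iter):
        if abs(z) > bailout:
            return n
        z = z * z + c
    return max_iter

def mirrored_escape_count(c, z0, max_iter=200):
    """Same count for the point reflected through both real axes."""
    return escape_count(c.conjugate(), z0.conjugate(), max_iter)

samples = [(-0.4 + 0.6j, 0.3 + 0.2j),
           (0.285 + 0.01j, -0.5 + 0.5j),
           (-0.8 + 0.156j, 0.1 - 0.7j)]
for c, z0 in samples:
    # The two counts agree for every sample.
    print(c, z0, escape_count(c, z0), mirrored_escape_count(c, z0))
```

With z0 = 0 the same check shows the Mandelbrot set's own mirror symmetry about its real axis.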

The Julibrot generated by the standard parameter plane doesn't bear an obvious visual resemblance to the MPD-symmetric Universe I've been describing, but by using a non-standard parameter plane for the Mandelbrot, we can shape it more like our Universe, such as the 1/μ-plane for a Big Crunch universe, or the 1/(μ + 0.25) plane for a Heat Death one, as pictured on this page.
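A sketch of how such parameter planes can be computed, assuming the 1/μ-plane means substituting c = 1/μ into z → z² + c and the 1/(μ + 0.25) plane means c = 1/(μ + 0.25). The linked page may define them differently, and all helper names here are hypothetical.

```python
def bounded(c, max_iter=500):
    """Approximate Mandelbrot membership for the standard parameter c."""
    z = 0j
    for _ in range(max_iter):
        if abs(z) > 2.0:
            return False
        z = z * z + c
    return True

def inverse_plane(mu):
    """Assumed 1/mu plane: c = 1/mu (None at the singular point mu = 0)."""
    try:
        return 1 / mu
    except ZeroDivisionError:
        return None

def shifted_inverse_plane(mu):
    """Assumed 1/(mu + 0.25) plane: c = 1/(mu + 0.25)."""
    return inverse_plane(mu + 0.25)

def in_reparametrized_set(mu, plane):
    """Test a point mu of the chosen plane by mapping it back to c."""
    c = plane(mu)
    return c is not None and bounded(c)

print(in_reparametrized_set(4 + 0j, inverse_plane))    # c = 0.25, the cusp
print(in_reparametrized_set(0.5 + 0j, inverse_plane))  # c = 2, escapes
```

Under these maps the far exterior of the standard set is folded in around the singular point, so the set surrounds its own complement.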

We can also see suggestions of increasing entropy of Mass by looking at the Mandelbrot set. Because of the second law of thermodynamics, the entropy of the Universe at the Big Bang was at its minimum. In a Big Crunch Universe, the likely death scenario in both the positive matter and negative matter sides of the Universe is as two black holes spinning around each other for trillions of years before eventually smashing together, a high entropy end-state. We can consider the directedness of the dimension of Time as being caused by the increasing entropy of the Mass inside the Universe along the Time dimension. The standard Mandelbrot set on various parameter planes (i.e. the Time and Axial Space dimensions) seems to have one "creation" point, where the (Hausdorff-) dimensionality is 1 just at that point, situated on the Time axis, suggesting the low-entropy Big Bang.

So we could say that by increasing the mapping uncertainty between the Mandelbrot and Julia sets of complex planes (i.e. Planck's constant) up from zero, we create a fifth, non-expansive, dimension called Mass, which causes the Mandelbrot-real dimension to be directed and become Time, and the Mandelbrot-imaginary dimension to become a spatial axis around which the Julia planes rotate.


In this probabilistic Julibrot model I've described, Charge is seemingly absent. We first considered this type of model because of the CPT and MPD dual symmetries suggesting two complex planes, but have only derived the entity of MPD-symmetric Mass following certain gravitational and quantum rules. We can consider the forces without an infinite range (the strong and weak forces) to be more minor details for filling in later, but can't consider electromagnetism that way.

We can put electromagnetism into this model by introducing an additional relationship into the structure, besides the probabilistic one between the two complex planes. The basic apparent difference between MPD-symmetric gravity and CPT-symmetric electromagnetism is that with gravity, like masses attract while unlike ones repel, whereas with electromagnetism, like charges repel while unlike ones attract. The logical effect of this is that gravity's masses are real numbers, enabling aggregation of mass, while charges must be discrete. These look like big differences, but there's a simple structural property we can introduce into the model which makes gravity and electromagnetism be exactly the same force!

Someone once asked me how we know the Universe isn't like at the end of the first Men in Black movie, where it's all just a speck of dust on another creature's back. I've since realized that unless the Universe looks exactly the same at its largest scale as it does at its smallest, then it's not really a self-contained system. The generally-agreed smallest scale of the Universe is a particle and anti-particle splitting apart, then perhaps coming back together. In an MPD-symmetric Universe, the largest scale is positive matter and negative matter splitting apart in opposite directions along one dimension of space, then, in some cosmological theories, coming back together again. If the very first particles to split apart at the Big Bang were some form of particle/graviton coupling, splitting away from their opposite forms, then the negative mass will be mainly antimatter, just as the positive mass we observe around us is mainly normal matter. And a virtual particle and anti-particle splitting apart and coming back together would have matching positive mass and negative mass.

So, the only difference between gravity and electromagnetism in such an MPD-symmetric neutrino/graviton-origined Universe is that gravity is what we see when we're on the inside looking outwards, and electromagnetism is what we see when we're on the outside looking inwards. On the inside looking outwards, there's only one instance to look at, but on the outside looking in, we see many instances. From the inside looking out, it looks like MPD-symmetric positive and negative Mass obeying the laws of gravity, but from the outside looking in, it looks like CPT-symmetric positive and negative Charge obeying electromagnetic laws.

Black holes could fit naturally into this model. Some people speculate they're inherently similar to particles, others say they're the outside of other Universes. And when we look at a computer simulation of the Mandelbrot set, we see many near-similar instances of the large-scale Mandelbrot at smaller and smaller scales, ditto with the Julia sets, and perhaps this mathematical fact suggests in some way this physical fact about the Universe, that the Universe must always look the same at its largest scale as it does at its smallest. And in fact, this mathematical fact suggests perhaps we don't need to introduce this physical fact as an additional relationship into our structure, but perhaps it falls naturally out of it.

At the Big Bang

What might the Universe look like under this model in the first instant after the Big Bang? Because the neutrino is the particle defining matter/antimatter, I'll call the first particle to split from its antiparticle at the Big Bang a protoneutrino. When the first protoneutrino/graviton coupling split apart from its opposite at the Big Bang, but before either the protoneutrino or the anti-protoneutrino decayed into more particles, each half of the Universe looked like an expanding outer balloon with one decaying inner balloon inside it. Both outer and inner balloons had the same inherent structure: from the viewpoint of the space between them, the outer balloon appeared to obey the laws of MPD-symmetric Einsteinian gravity, and the inner balloon appeared to obey the laws of quantum electrodynamics. As the protoneutrino decayed into further different particles, the outermost scale and the innermost (quantum) scale always kept the same mathematical structure.

The protoneutrino further decayed into other particles, and in our own positive matter side of the Universe, in our own observable portion of the Universe, the particles are presently our well-known leptons, quarks, weak-force bosons, and whatever dark matter particles there are. In unobservable portions of the positive matter side of the Universe, perhaps the particles are different. Perhaps the masses of particles change depending on when and where they are in the Universe. Perhaps other dimensionless constants, such as the fine structure constant, also change. In the negative matter side of the Universe, perhaps the first few quantum rolls of the dice caused the anti-protoneutrino to decay differently, resulting in a totally different life story there, but the overall mass distribution would be the same as in the positive matter side of the Universe, and there would also be a direct identity lineage from the protoneutrino to our present-day neutrino, ditto in the negative matter Universe. But despite what other changes there are in the structure of the Mass, the smallest scale would always have the same structure as the largest scale, and any changes to the structure of the Mass, such as values of particle masses or the fine structure constant, would actually be caused by this self-similarity requirement.

Higher orders of infinity?

So we've seen a possible model of our physical Universe by regarding Mathematics not as something that simply describes the Universe, but as something which the Universe is at some order of infinity. I suspect if my conjectures above are correct, then the core theorem deriving the axioms of General and Special relativity and quantum electrodynamics from an uncertainty-based Julibrot set would be as significant at aleph-2-infinity as the Cauchy-Riemann theorem is at aleph-1-infinity.

If our Universe is what exists at the third order of infinity, then what might constitute the next higher order of infinity? Using Cantor's theorem, by considering the power set of our Universe, we might say it's the set of everything that could have happened, could yet happen, and would yet have happened in our Universe, except for the number of dimensions. But how would we place our own specific Universe of actual happenings in this picture, with its unique quantum reductions into actualities? One person might say the power set is at a higher order of infinity than our own specific Universe of actual happenings, because of Cantor's theorem. But someone else might say the Universe of happenings is at a higher order, being the real, intentioned actualization, while the possibilities are at a lower order, being just the canvas for the actualization, so to speak. It sounds like an argument between atheists and theists, and perhaps never able to be proven either way.

So possibly our own consciousnesses can't really comprehend very high up the ladders of infinity. At the first order, we can't logically define a consistent and complete system. At the next order, we must hypothesize its actuality as The Continuum in our own Universe. And at another order or two higher, where our own consciousnesses dwell, we can't logically prove which of the two orders is the higher and which is the lower. If we could look higher up the hierarchy of infinities to the infinite level, then the hierarchy itself would be able to be counted by an induction-based counting scheme, and so become self-referenced, and even itself an inconsistent and incomplete arithmetic system, which it isn't. In fact, because different instances of our human consciousnesses can't agree which of the third and fourth orders of infinity is higher than the other, we can't even make the known physical representations of the orders of infinity into a propositional logic system, not knowing which to call True.

Moving each order of infinity up the ladder seems to require utilizing some new well-known mathematical concept. Moving from finiteness to aleph-0 requires Induction, and from aleph-0 to aleph-1 requires Continuity. If an MPD-invariant Universe is at aleph-2, then moving there requires Probability. This would be the lowest order of infinity which contains the entities of Time and Mass, as distinct from Space. And such entities, Time and Mass, are required for the mathematics of computational complexity, Time as a resource and Mass for building Turing machines. Perhaps the next order of infinity, aleph-3, requires the concept of Computation, which could explain the phenomena of consciousness. Such Computation is also required for calculating fractal curves in aleph-2 spacetime.

Computational complexity

Perhaps computational complexity will play an important role in a theory of the Universe. There are now many known complexity classes in the zoo. If we can slot the theory of computational complexity directly onto a mathematical structure which defines a space-distinct Time, which behaves according to the laws of our Universe, then many of these complexity classes, such as PSPACE and P(TIME), may meld into one when we factor in the effects of Special and General Relativity, such as time dilation and space contraction. PSPACE problems require more computation "power" than P(TIME) problems, but if space can become time due to high acceleration or a nearby strong gravitational field, perhaps ultimately they're really the same complexity class.

Similarly, the distinction between the P(TIME) and NP(TIME) complexity classes may not exist at Planck scales because of the quantum nondeterminism. Perhaps the electric field generated by human brain neural structure makes use of such quantum nondeterminism to produce the effect of consciousness. Perhaps both large-scale (relativistic) and small-scale (quantum) effects together reduce the many complexity classes down to a mere few. They seem to fall into four broad groups: logarithmic, polynomial, exponential, and recursive.

Do these complexity groups each match up somehow to the various orders of infinity I've described? Do aleph-0-infinite structures like the integers relate somehow to logarithmic-space computation? Do aleph-1-infinite structures like the complex numbers relate to polynomial-resource computation? Is our Universe an aleph-2-infinite structure? Is it an MPD-invariant "canvas" for our Universe of actual happenings, related somehow to exponential-resource computation? Are the quantum reductions that form the actual happenings in our Universe, including our own consciousnesses, an aleph-3-infinite structure? Related somehow to recursively-enumerable computation? And what could possibly lie beyond that?

Programming Language Structure

Originally posted 16 January 2010 on my temporary blogspace...

Programming languages have their origin in natural language, so to understand the structure of computer languages, we need to understand natural ones. According to Systemic Functional Grammar (SFG) theory, to understand the structure of language, we need to consider its use: language is as it is because of the functions it's required to serve. Much analysis of the English language has been performed using these principles, but I haven't found much on programming languages.

Functional grammar of natural languages

According to M.A.K. Halliday's SFG, the vast numbers of options for meaning potential embodied in language combine into three relatively independent components, and each of these components corresponds to a certain basic function of language. Within each component, the networks of options are closely interconnected, while between components, the connections are few. He identifies the "representational" and "interactional" functions of language, and a third, the "textual" function, which is instrumental to the other two, linking with them, with itself, and with features of the situation in which it's used.

To understand these three components in natural languages, we need to understand the stages of encoding. Two principal encodings occur when speech is produced: the first converts semantic concepts into a lexical-syntactic encoding; the second converts this into spoken sounds. A secondary encoding converts some semantics directly into the vocal system, being overlaid onto the output of the lexical-syntactic encoding. Programming languages have the same three-level encoding: at the top is the semantic, in the middle is the language syntax, and at the bottom are the lexical tokens.

The representational function of language involves encoding our experience of the outside world, and of our own consciousness. It's often encoded in as neutral a way as possible for example's sake: "The Groovy Language was first officially announced by James Strachan on Friday 29 August 2003, causing some to rejoice and others to tremble."

We can analyze this as two related processes. The first has actor "James Strachan", process "to officially announce", goal "the Groovy Language", instance circumstance "first", and temporal circumstance "Friday 29 August 2003"; the second process is related as an effect in a cause-and-effect relationship, being two further equally conjoined processes: one with process "to rejoice" and actor "some"; the other with process "to tremble" and actor "others".

The interactional function of language involves injecting the language participants into the encoding. A contrived example showing many types of injects: "The Groovy Language was first announced by, of all people, creator James Strachan, sometime in August 2003. Was it on Friday 29th? Could you tell me if it was? Must have been. That august August day made some happy chappies like me rejoice, didn't it?, yeehaaaah, and probably some other unfortunates to tuh-rem-ble, ha-haaah!"

We see an informal tone, implying the relationship between speaker and listener. There are glosses added, i.e. "of all people", "august", "happy chappies like me", "unfortunates", semantic words added, i.e. "creator", semantic words removed, i.e. "officially", sounds inserted, i.e. "yeehaaaah", "ha-haaah", prepended expressions of politeness, i.e. "Could you tell me if", and words spoken differently, e.g. "tuh-rem-ble". Mood is added, i.e. a sequence of (indicative, interrogative, indicative). Probability modality is added, i.e. "must have", "probably". We could have added other modality, such as obligation, permission, or ability. We've added a tag, i.e. "didn't it?". We could have added polarity in the main predicate. What we can't indicate in this written encoding of speech is the attitudinal intonation overlaid onto each clause, of which English has hundreds. Neither can we show the body language, also part of the interactional function of speech.

Natural language in the human brain

A recent article in Scientific American says biologists now believe the specialization of the human brain’s two cerebral hemispheres was already in place when vertebrates arose 500 million years ago, and that "the left hemisphere originally seems to have focused in general on controlling well-established patterns of behavior; the right specialized in detecting and responding to unexpected stimuli. Both speech and right-handedness may have evolved from a specialization for the control of routine behavior. Face recognition and the processing of spatial relations may trace their heritage to a need to sense predators quickly."

I suspect the representational function of language is that which is produced by the left hemisphere of the brain, and the interactional function by the right hemisphere. Because the right side of the brain is responsible for unexpected stimuli, from both friend and foe, then perhaps interactional language in vertebrates began as body language and facial expressions to denote conditions relevant to others, e.g. anger, fear, affection, humidity, rain, danger, etc. Later, vocal sounds arose as the voice box developed in various species, and in humans, increasingly complex sounds became possible. The left side of the brain is responsible for dealing with regular behavior, and so allowed people to use their right hand to make sign language to communicate. Chimpanzees and gorillas use their right hands to communicate with each other, often in gestures that also incorporate the head and mouth. The article hypothesizes that the evolution of the syllable in humans triggered the ability to form sentences describing processes involving people, things, places, times, etc. Proto-representational language was probably a series of one-syllable sounds similar to what some chimps can do nowadays with sign language, e.g. "Cat eat son night". Later, these two separate functions of natural language intertwined onto human speech.

Programming language structure

When looking at programming languages, we can see the representational function easily. It maps closely to that for natural languages. The process is like a function, and the actor, goal, recipient, and other entities in the transitive structure of natural language are like the function parameters. In the object-oriented paradigm, one entity, the actor, is like the object. The circumstances are the surrounding static scope, and the relationships between processes are the sequencing of statements. Of course, the semantic domains of natural and programming languages are different: natural languages talk about a wider variety of things, themselves more vague, than programming languages. But the encoding systems are similar: the functional and object-oriented paradigms became popular for programming because between them it's easy for programmers to code about certain aspects of things they use natural language to talk about. The example in pseudocode:

Date("2003-8-29").events += {
  def a = new Instances();
  a[1] = jamesStrachan.officiallyAnnounce(Language.GROOVY);
  a[1].effect = [some: s => s.rejoice(), others: o => o.tremble()];
}
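For readers who want something executable, here is one possible rendering of the same analysis in Python. It is only a sketch: the `Process` class and its field names are invented for illustration, mirroring the actor, process, goal, and circumstance roles from the analysis of the example sentence.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Process:
    """One clause of the transitive structure (hypothetical model)."""
    actor: str
    verb: str
    goal: Optional[str] = None
    circumstances: dict = field(default_factory=dict)
    effects: list = field(default_factory=list)  # cause-and-effect links

announcement = Process(
    actor="James Strachan",
    verb="officially announce",
    goal="the Groovy Language",
    circumstances={"instance": "first", "time": date(2003, 8, 29)},
    # The two equally conjoined effect processes.
    effects=[Process(actor="some", verb="rejoice"),
             Process(actor="others", verb="tremble")],
)

for effect in announcement.effects:
    print(effect.actor, effect.verb)
```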

The similarities between the interactional functions of natural and programming languages are more difficult to comprehend. The major complication is the extra participants in programming languages. In natural language, one person speaks, maybe one, maybe more people listen, perhaps immediately, perhaps later. Occasionally it's intended someone overhears. In programming languages, one person writes. The computer reads, but good programming practice is that other people read the code later. Commenting, use of whitespace, and variable naming partly enable this interactional function. So does including test scripts with code. Java/C#-style exception-handling enables programmer-to-programmer interaction similar to the probability-modality of English verbal phrases, e.g. will/definitely, should/probably, might/could/possibly, won't, probably won't.

Many programming systems allow some interactional code to be separated from the representational code. One way is using system-wide aspects. A security aspect will control the pathway between various humans and different functions of the program while it's running. Aspects can control communication between the running program and different facets of the computer equipment, e.g. a logging aspect comes between the program and recording medium, a persistence aspect between the program and some storage mechanism, an execution performance aspect between the program and CPU, a concurrency aspect between the program and many CPUs, a distribution aspect between the program and another executing somewhere else. Here, we are considering these different facets of the computer equipment to be participants in the communication, just like the programmer. Aspects can also split out code for I/O actions and the program entry point, which are program-to-human interactions. This can also be done by monads in "pure functional" languages like Haskell. Representational function in Haskell is always kept separate from interactional functions like I/O and program entry, with monads enabling the intertwining between them. Monads also control all access between the program and modifiable state in the computer, another example of an interactional function.
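A small sketch of the logging case, using a Python decorator as a lightweight stand-in for an aspect (this is not a full aspect-oriented framework, and the names `logged`, `audit_log`, and `announce` are made up for the example). The representational code in `announce` says nothing about logging; the interactional concern is attached from outside.

```python
import functools

def logged(log):
    """Return a decorator that records calls and results in `log`."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log.append(f"call {func.__name__}{args}")
            result = func(*args, **kwargs)
            log.append(f"return {result!r}")
            return result
        return wrapper
    return decorate

audit_log = []  # stands in for the recording medium

@logged(audit_log)
def announce(language):
    # Purely representational: no mention of logging here.
    return f"{language} announced"

print(announce("Groovy"))
print(audit_log)
```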

Textual function of language

The textual function of language in SFG is that which concerns the language medium itself. In spoken natural language, this is primarily the sequential nature of voice, and in written language, the 2-D form of the page. Whereas in natural language theory, the voice-carrying atmosphere and the ink-carrying paper are obviously mediums and not participants, it's more difficult to categorize the difference between them in programming language theory. Because a program is written as much for the CPU as for other human readers, if not more so, we could call the CPU a participant. But then why can't the CPU cache, computer memory, hard-disk storage, and comms lines also be called participants? Perhaps the participants and the transmission medium for natural languages are also more similar than different.

The textual function of language is made up of the thematic, informational, and cohesive structures. Although mainly medium-oriented, they also involve the participants. The thematic structure is speaker-oriented, the informational structure is listener-oriented. The thematic structure is overlaid onto the clause. In English, what the speaker regards as the heading to what they're saying, the theme, is put in first position. Not only clauses, but also sentences, speech acts, written paragraphs, spoken discourses, and even entire novels have themes. Some examples using lexical items James, to give, programmers, Groovy, and 2003, with theme in italics:

  • James Strachan gave programmers Groovy in 2003.
  • Programmers are who James gave Groovy to in 2003.
  • The Groovy Language is what James gave programmers in 2003.
  • 2003 is when James gave programmers Groovy.
  • Given was Groovy by James to programmers in 2003.

In English, the Actor of the representational function's transitive structure is more likely to be separated from the interactional function's Subject and from the Theme in a clause than those two are from each other. I think the textual functions of natural language are far more closely linked to the interactional function than to the representational. Perhaps the right side of the brain also processes such textual structure.

The informational structure jumps from the top (i.e. semantic) encoding level directly to the bottom (i.e. phonological) one in English, skipping the middle (i.e. lexical/syntactic) level. This is mirrored by how programming languages such as Python use the lexical tokens to directly determine semantic meaning. In English, the speech is broken into tone units, separated by short pauses. Each tone unit has the stress on some part of it to indicate the new information. For example, each of these sentences has a different informational meaning (the bold indicates the stresses):

  • James gave programmers Groovy in 2003.
  • James gave programmers the Groovy Language in 2003.
  • James gave programmers Groovy in 2003.
  • James gave programmers Groovy in 2003.
  • James Strachan gave programmers Groovy in 2003.

Unlike the thematic structure, the informational structure organizes the tone unit by relating it to what has gone before, reflecting what the speaker assumes is the status of the information in the mind of the listener. The informational structure usually uses the same structure used in the thematic, but needn't. English grammar allows the lexical items to be arranged in any order to enable them to be broken up in any combination into tone units. For example, these examples restructure the clause so it can be divided into two tone units (shown by the comma), each with its own stress, so two items of new information can be introduced in one clause:

  • James gave Groovy to programmers, in 2003.
  • As for Groovy, James gave it to programmers in 2003.
  • In 2003, James gave programmers Groovy.

Programming languages should follow the example of natural languages, and allow developers to structure their code to show both thematic and informational structure. The final textual function, the cohesive structure, enables links between clauses, using various techniques, such as reference, pronouns, and conjunctions. Imperative programming languages rely heavily on reference, i.e. temporary variables, but don't use pronouns very much. Programming languages should also provide developers with many pronouns.
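One loose analogue of thematic variation in existing languages, not necessarily the mechanism intended here, is keyword arguments: the writer can put whichever participant they regard as the heading first, and the meaning of the call stays the same. The sketch below is in Python rather than Groovy, and the function `give` and its parameter names are hypothetical, chosen only to echo the example sentences above.

```python
def give(*, giver, recipient, gift, year):
    """Hypothetical process 'to give'; every participant must be named."""
    return (giver, recipient, gift, year)

# "James gave programmers Groovy in 2003."
actor_first = give(giver="James", recipient="programmers", gift="Groovy", year=2003)

# "Groovy is what James gave programmers in 2003."
gift_first = give(gift="Groovy", giver="James", recipient="programmers", year=2003)

# "2003 is when James gave programmers Groovy."
year_first = give(year=2003, giver="James", recipient="programmers", gift="Groovy")

# Same representational content, three different themes.
print(actor_first == gift_first == year_first)
```

The ordering carries no meaning for the computer, so it is free to carry meaning for the human reader, which is the role theme plays in an English clause.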


Programming languages initially represented information in the same way humans do, using transitive structures such as function calls, joined by logical relationships such as blocks and class definitions. Interactional aspects of code were initially intertwined, but could be separated out using aspects and monads. Enabling different textual structures in programs isn't very widespread, so far limited to providing different views of an AST in an IDE, only occasionally allowing "more than one way to do things" at the lexical level. When used well, textual structures in code enable someone later on to more easily read and understand the program.

In promoting the benefits of programming languages enabling different textual structures, I think it's useful to narrow down to two primary structures: the transitive and the thematic, as these two are easiest to communicate to programmers. See my earlier thoughts on how a programming language can enable more thematic variation. Programming languages of the future should provide the same functions for programmers that natural languages provide for humans.

And of course, I'm building Groovy 2.0, which will both enable thematic variation in the language syntax/morphology, and supply a vast vocabulary of Unicode tokens for names. The first iteration will use Groovy 1.x's SwingBuilder and ASTBuilder, along with my own Scala-based combinator parsers, to turn Groovy 2.0 source into Groovy 1.x bytecode. The accompanying Strach IME will enable programmers to enter the Unicode tokens intuitively. Groovy 2.0 will break the chains of the Antlr/Eclipse syntactic bottleneck over Groovy 1.x!!!

Thursday, April 23, 2009

Interactional function of English and Groovy

Michael A.K. Halliday writes in his 1970 paper Language Structure and Language Function that we should analyze language in terms of its use, considering both its structure and function in so doing. He's found the vast numbers of options embodied in it combine into three relatively independent components, and they each correspond to a certain basic function of language: representational (a.k.a. ideational), interactional (a.k.a. interpersonal), and textual. Within each component, the networks of options are closely interconnected, while between components, the connections are few.

For natural language, the representational component represents our experience of the outside world, and of our consciousness within us. The representational similarities between natural and computer languages are most easily noticed:
Mary.have(@Little lamb)
lamb.fleece.Color = Color.SNOW_WHITE
synchronized{ place-> Mary.go(place); lamb.go(place) }

Computer languages' increasing use of abstraction over the years was no doubt based on the representational component of natural languages, giving rise to the functional and object-oriented paradigms. The ideas represented in computer language must be more precise than those in natural language.

Interactional component of English
The interactional component of language involves its producer and receiver/s, and the relationship between them. For natural language, there's one or more human receivers, and for computer language, one or more electronic producers and/or receivers as well as the human one/s.

In English, the interactional component accounts for:

  • many adverbs of opinion, e.g. “That's an incredibly interesting piece of code!”

  • interjections within a clause, e.g. I'm hoping to, er, well, go back sometime, or even in the middle of words, e.g. abso-bloomin'-lutely

  • expressions of politeness we prepend to English sentences, e.g. “Are you able to...” in front of “Tell me the time”

  • the hundreds of different attitudinal intonations we overlay onto our speech, e.g. “Dunno!” (can you native English speakers hear that intonation?)

  • the mood, whether indicative e.g. “He's gone.”, interrogative e.g. “Is she there?”, imperative e.g. “Go now!”, or exclamative e.g. “How clever!”

  • the modal structure of English grammar, i.e. verbal phrases have certainty e.g. “I might see him”, ability e.g. “I can see her”, allowability e.g. “Can he do that?”, polarity e.g. “They didn't know”, and/or tense e.g. “We did make it”

Natural language offers many choices regarding how closely to intertwine the interactional component with the representational.
An example... for closely intertwined reported speech: She said that she had already visited her brother, that the day before she'd been with her teacher, and that at that moment she was shopping with her friend.
and using quoted speech to reduce the tangling between interactional and representational components: She said "I've already visited my brother, yesterday I was with my teacher, and right now I'm shopping with my friend."
Another example of keeping these two components disjoint: I'm going to tell the following story exactly as she told it, the way she said it, not how I'd say it...

The original human languages long ago, just like chimpanzee language today, were perhaps mainly interactional, with the representational component slowly added on afterwards.

Interactional component of computer languages
For computer languages, the interactional component determines how people interact with the program, and how other programs interact with it. As with natural languages, the interactional component came first, and representational abstractions were added on later. Many have tried to create a representational-only computer language; perhaps the most successful is Haskell. But the Haskell language creators went to great trouble to tack on the minimally required interactional component, that of Input/Output. They introduced monads to add the I/O capability onto the “purer” underlying functional-paradigm function. Perhaps some functional-paradigm language creators don't appreciate the centrality of the interactional component in language.

Siobhan Clarke et al. write about the tyranny of the dominant decomposition:
Current object-oriented design methods suffer from the “tyranny of the dominant decomposition” problem, where the dominant decomposition dimension is by object. As a result, designs are caught in the middle of a significant structural misalignment between requirements and code. The units of abstraction and decomposition of object-oriented designs align well with object-oriented code, as both are written in the object-oriented paradigm, and focus on interfaces, classes and methods. However, requirements specifications tend to relate to major concepts in the end user domain, or capabilities like synchronisation, persistence, and failure handling, etc., all of which are unsuited to the object-oriented paradigm.

The object-paradigm is a representational one. The other user-domain capabilities are interactional ones, either human-to-computer or computer-to-computer. Some examples:
  • I/O actions, i.e. between computer and human/s

  • logging, i.e. between processor and recording medium

  • persistence, database access, i.e. between computer and storage unit/s

  • security, i.e. between computer and certain humans only

  • execution performance, i.e. how to maximize use of computing resources

  • entry point to program, i.e. between processor and external scheduler

  • concurrency, synchronization, i.e. between two processors in one computer

  • distribution, i.e. between two geographically separated computers

  • exceptions, failure handling, i.e. between results of different human-expected certainties

  • testing, i.e. interaction between two different external humans

These capabilities are often interwoven into the programming code, just as mood and modality are overlaid onto all the finite verbal phrases in English. And just as in English, where various interactional functions can be disentangled from the representation functions, e.g. quoted speech above, so also in computer languages, such user-domain capabilities can be extracted as system-wide aspects in aspect-oriented programming.
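As a rough illustration of extracting one such capability, here is a minimal sketch of a logging concern kept apart from the representational code. It is written in Python and uses a decorator, which is only an approximation of a true system-wide aspect; the names `logged`, `log`, and `fleece_colour` are hypothetical.

```python
import functools

# Stands in for the "recording medium" that the logging concern talks to.
log = []

def logged(func):
    """Hypothetical logging concern: records each call and its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        log.append(f"{func.__name__}{args} -> {result}")
        return result
    return wrapper

@logged
def fleece_colour(animal):
    # Purely representational: it describes the lamb and knows nothing of logging.
    return "snow white" if animal == "lamb" else "unknown"

print(fleece_colour("lamb"))
print(log)
```

The body of `fleece_colour` stays free of interactional detail; the decorator line is the single point where the two components touch, much as quotation marks isolate quoted speech.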

AspectJ is a well-known attempt to let each aspect use the same syntax, that of the base language. But the idea of limited AOP is much older; often a different syntax is used for each different user-domain capability.

I've already blogged about some aspects of the textual component of English and Groovy. Whereas the other two components of language exist for reasons independent of the medium itself, the textual component comes into being because the other two components exist, and refers both to those other two components and to itself self-referentially. The textual component ensures every degree of freedom available in the medium itself is utilized.

In computer languages, the textual component is often called “syntactic sugar”. Often computer language designers scorn the use of lots of syntactic sugar, but natural language designers, i.e. the speakers of natural languages, use all the syntactic sugar available in the communication medium. Programming language designers should do the same. In the DLR-targeted Groovy I'm working on, I'm focusing on this aspect of the Groovy Language.

Monday, February 23, 2009

Groovy 1.6 Released

Groovy 1.6 has recently been released, the first update to the Groovy Language since version 1.5 over a year ago. Version 1.6 brings Python-style tuples to Groovy, allowing multi-value assignments. This was the last unfulfilled item from James Strachan's original 29 August 2003 manifesto for Groovy.
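Since the feature is described as Python-style, a short Python sketch shows the idiom being borrowed; Groovy 1.6 spells the equivalent roughly as `def (a, b) = [1, 2]`. The function `lat_long` and its lookup table are hypothetical, for illustration only.

```python
def lat_long(city):
    # Hypothetical lookup table standing in for a real geocoding call.
    coords = {"Paris": (48.86, 2.35)}
    return coords[city]

# One assignment binds two names from a single returned tuple.
lat, lon = lat_long("Paris")

# The same mechanism swaps two values without a temporary variable.
a, b = 1, 2
a, b = b, a

print(lat, lon, a, b)
```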

But many items remain unfulfilled from Groovy's submission as JSR 241 on 16 March 2004 and subsequent approval 2 weeks later. James expanded on his vision at that time. Let's look at some of his points...

(1) James wrote: “One area we've not yet looked at is merging a Java and Groovy compiler together so Groovy and Java code can be compiled together in the same compilation unit.” The Groovy developers added this capability in their implementation of Groovy 1.5 a year ago.

(2) He also wrote: “The JSR allows people to [implement Groovy] if they wish - whether its a complete rewrite, replacing a part of it or just some tinkering, and provides a TCK to know if they correctly implement the JSR. Even just embedding the RI inside a container can sometimes affect things so having a JSR & TCK helps even if we're sharing the same codebase but just configuring & deploying it in different ways..” A spec and test kit encourage other developers to come along and contribute to the language correctly, improving the whole or just the parts that need it. This was the vision in place when I first started tinkering with Groovy.

(3) Also: “Just to be clear, its the expert group's responsibility to make a great spec and a reference implementation and TCK. It's up to others to create different implementations if they want to.” (his emphasis) Not having a JSR provides only one avenue for contributing to the Groovy Language, that of joining the development team, submitting to the commercial interests and internal politics of Codehaus and SpringSource and whoever else might come along, something very few developers want to do. But having a completed JSR provides many avenues to contribute to Groovy. A spec and test kit are like the market rules for a bazaar, but the present Groovy development is still a centrally-planned cathedral. The only really committed developers are those with shares in the business, and even then, when they're not working on Spring or Grails.

Having tight control over the initial stages of an open-source software package keeps development focused. A year after development began, Groovy creator James Strachan thought the time was right for creating a spec and test kit. After some initial activity, work on the spec halted; the reason given was “possible copyright violations with Sun's Java language spec”. The Groovy developers since then have focused on creating the fastest possible JVM meta-object engine, and on melding their implementation of Groovy closely into Grails, both good activities but not an end in themselves.

(4) Finally, James also added: “I'm hoping Groovy goes along the same way - that one day maybe Jikes / gcj implement a Groovy compiler or maybe IDEs (say in Eclipse / IDEA / NetBeans / Workshop) do their own Groovy compiler, reusing their internal Java compilers - maybe reusing the Groovy AST compiler, just replacing the bytecode generation with something else, like Java AST generation, Java code generation or whatever.” James Strachan envisioned Groovy as a loose collection of language technologies, just as the Spring technologies are. A Groovy spec, therefore, needs to define not only the language syntax, but also the meta-object interface, the default Groovy method interface, the AST interface, with clearly-defined visitation states, and so forth.

The Groovy developers, to their credit, have now completed all the items in James' original manifesto, as well as adding a joint Groovy/Java compiler. But the JSR hasn't moved an inch in the past 5 years. Let's hope Groovy 1.7 (or whatever it's called) can address this issue.

Thursday, February 12, 2009

The Rise of Unicode

The next version of Unicode is v.5.2, the latest of a unified character set now with over 100,000 current tokens. One notable addition to v.5.2 will be the Egyptian hieroglyphs, the earliest known system of human writing. Perhaps they will mark Unicode's coming of age, it being another huge step in representing language with graphical symbols. Let's look at a consolidated short history of writing systems, courtesy of various Wikipedia pages, to see Unicode's rise in perspective...

Egyptian hieroglyphs were invented around 4000-3000 BC. The earliest type of hieroglyph was the logogram, where a common noun (such as sun or mountain) is represented by a simple picture. These existing hieroglyphs were then used as phonograms, to denote more abstract ideas with the same sound. Later, these were modified by extra trailing hieroglyphs, called semagrams, to clarify their meaning in context. About 5000 Egyptian hieroglyphs existed by Roman times. When papyrus replaced stone tablets, the hieroglyphs were simplified to accommodate the new medium, sometimes losing their resemblance to the original picture.

The idea of such hieroglyphic writing quickly spread to Sumeria, and eventually to ancient China. The ancient Egyptian and Sumerian hieroglyphs are no longer used, but modern Chinese characters are descended directly from the ancient Chinese ones. Because Chinese characters spread to Japan and ancient Korea, they're now called CJK characters. By looking at such CJK characters, we can get some idea of how Egyptian hieroglyphs worked. Many CJK characters were originally pictures, such as 日 for sun, 月 for moon, 田 for field, 水 for water, 山 for mountain, 女 for woman, and 子 for child. Some pictures have meanings composed of other meanings, such as 女 (woman) and 子 (child) combining into 好, meaning good. About 80% of Chinese characters are phonetic, consisting of two parts, one semantic, the other primarily phonetic, e.g. 土 sounds like tu, and 口 means mouth, so 吐 also sounds like tu, and means to spit (with the mouth). The phonetic part of many phonetic characters often also provides secondary semantics to the character, e.g. the phonetic 土 (in 吐) means ground, where the spit ends up.

Eventually in Egypt, a set of 24 hieroglyphs called uniliterals evolved, each denoting one consonant sound in ancient Egyptian speech, though they were probably only used for transliterating foreign names. This idea was copied by the Phoenicians by 1200BC, and their symbols spread around the Middle East into various other languages' writing systems, having a major social effect. It's the base of almost all alphabets used in the world today, except CJK characters. These Phoenician symbols for consonants were copied by the ancient Hebrews and for Arabic, but when the Greeks copied them, they adapted the symbols of unused consonants for vowel sounds, becoming the first writing system to represent both consonants and vowels.

Over time, cursive versions of letters evolved for the Latin, Greek, and Cyrillic alphabets so people could write them easily on paper. They used either the block or the cursive letters, but not both, in one document. The Carolingian minuscule became the standard cursive script for the Latin alphabet in Europe from 800AD. Soon after, it became common to mix block (uppercase) and cursive (lowercase) letters in the same document. The most common system was to capitalize the first letter of each sentence and of each noun. Chinese characters have only one case, but that may change soon. Simplified characters were invented in 1950's mainland China, replacing the more complex characters still used in Hong Kong, Taiwan, and western countries. Nowadays in mainland China though, both complex and simplified Chinese are sometimes used in the same document, the complex ones for more formal parts of the document. Perhaps one day complex characters will sometimes mix with simplified ones in the same sentence, turning Chinese into another two-case writing system.

Punctuation was popularized in Europe around the same time as cursive letters. Punctuation is chiefly used to indicate stress, pause, and tone when reading aloud. Underlining is a common way of indicating stress. In English, the comma, semicolon, colon, and period (,;:.) indicated pauses of varying degrees, though nowadays, only the comma and period are used much in writing. The question mark (?) replaces the period to indicate a question, of either rising or falling tone; the exclamation mark (!) indicates a sharp falling tone.

The idea of separating words with a special mark also began with the Phoenicians. Irish monks began using spaces in 600-700AD, and this quickly spread throughout Europe. Nowadays, the CJK languages are the only major languages not using some form of word separation. Until recently, the Chinese didn't recognize the concept of word in their language, only of (syllabic) character.

The bracketing function of spoken English is usually performed by saying something at a higher or lower pitch, between two pauses. At first, only the pauses were shown in writing, perhaps by pairs of commas. Hyphens might replace spaces between words to show which ones are grouped together. Eventually, explicit bracketing symbols were introduced at the beginning and end of the bracketed text. Sometimes the same symbol was used to show both the beginning and the end, such as pairs of dashes to indicate appositives, and pairs of quotes, either single or double, to indicate speech. Sometimes different paired symbols were used, such as parentheses ( and ). In the 1700's, Spanish introduced inverted ? and ! at the beginning of clauses, in addition to the right-way-up ones at the end, to bracket questions and exclamations. Paragraphs are another bracketing technique, being indicated by indentation.

Around 1050, movable-type printing was invented in China. Instead of carving an entire page on one block as in block printing, each character was on a separate tiny block. These were fastened together into a plate to reflect a page of a book, and after printing, the plate was broken up and the characters reused. But because thousands of characters needed to be stored and manipulated, making movable-type printing difficult, it never replaced block printing in China. European alphabets, by contrast, need fewer than a hundred letters and symbols, which are much easier to manipulate. So when movable-type printing reached Europe, the printing revolution began.

With printing a new type of language matured, one that couldn't be spoken very well, only written: the language of mathematics. Mathematics, unlike natural languages, needs to be precisely represented. Natural languages are very expressive, but can also be quite vague. Numbers were represented by many symbols in ancient Egypt and Sumeria, and had reduced to a mere 10 by the Renaissance. But from then on, mathematics started requiring many more symbols than merely two cases of 26 letters, 10 digits, and some operators. Many symbols were imported from other alphabets, different fonts introduced for Latin letters, and many more symbols invented to accommodate the requirements of writing mathematics. Mathematical symbols are now almost standardized throughout the world. Many other symbol systems, such as those for chemistry, music, and architecture, also require precise representation. Existing writing systems changed to utilize the extra expressiveness that came with movable-type printing. Underlining in handwriting was supplemented with bolding and italics. Parentheses were supplemented with brackets [] and curlies {}.

Fifty years ago, yet another type of language arose, for specifying algorithms: computer languages. The first computer languages were easy to parse, requiring little backtracking, but the most popular syntax, that of C and its descendants, requires more complex logic and greater resources to parse. Most programming languages used a small repertoire of letters, digits, punctuation, and symbols, being limited by the keyboard. Other languages, most notably APL, attempted to use many more, but this never became popular. Unlike mathematics, computer languages relied on parsing syntax, rather than a large variety of tokens, to represent algorithms. Computer programs generally copied natural language writing systems, using letters, numbers, bracketing, separators, punctuation, and symbols in similar ways. One notable innovation of computer languages, though, is camel case, popularized for names in C-like language syntaxes.

The natural language that spread around the world in modern times, English, doesn't use a strict pronunciation-spelling correspondence, perhaps one of the many reasons it spread so rapidly. English writing therefore caters for people who speak English with widely differing vowel sounds and stress, pause, and tone patterns. In this way, English words are a little like Chinese ideographs. As Asian economies developed, techniques for quickly entering large-character-set natural languages were invented, known as IME's (input method editors). But these Asian countries still use English for computer programming.

Around 1990 Unicode was born, unifying the character sets of the world. Initially, there was only room for about 60,000 tokens in Unicode, so the CJK characters of China, Japan, and Korea were unified to save space. Unicode is also bidirectional, catering to Arabic and Hebrew. Top-down languages such as Mongolian and traditional Chinese script can be simulated with left-to-right or right-to-left directioning by using a special sideways font. However, Unicode didn't become very popular until its UTF-8 encoding, devised in 1992, came into wide use, allowing backwards compatibility with ASCII. The Unicode code space has also since been extended to about one million characters, allowing less commonly used scripts such as Egyptian hieroglyphs to be encoded.
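The backwards compatibility with ASCII can be checked directly. The sketch below, in Python, encodes a few sample characters (my own choice, not from any particular source) and shows that plain ASCII text is byte-for-byte unchanged in UTF-8, while other characters take two to four bytes. U+13000 is the first code point of the Egyptian hieroglyphs block.

```python
ascii_text = "Groovy"

# Pure ASCII text encodes to exactly the same bytes in UTF-8.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Sample characters: Latin letter, accented letter, CJK character, hieroglyph.
samples = ["A", "\u00e9", "\u597d", "\U00013000"]
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

print(same_bytes)
for ch, n in zip(samples, byte_lengths):
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```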

Many programming languages have recently adopted different policies for using Unicode tokens in names and operators. The display tokens in Unicode are divided into various categories and subcategories, mirroring their use in natural language writing systems. Examples of such subcategories are: uppercase letters (Lu), lowercase ones (Ll), digits (Nd), non-spacing combining marks, e.g. accents (Mn), spacing combining marks, e.g. Eastern vowel signs (Mc), enclosing marks (Me), invisible separators that take up space (Zs), math symbols (Sm), currency symbols (Sc), start bracketing punctuation (Ps), end bracketing (Pe), initial quote (Pi), final quote (Pf), and connector punctuation, e.g. underscore (Pc).
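These subcategory codes can be inspected programmatically. A minimal sketch using Python's standard `unicodedata` module follows; the sample characters are my own picks, one per subcategory named above.

```python
import unicodedata

# One sample character per subcategory, written as escapes where not plain ASCII.
samples = {
    "A": "Lu",        # uppercase letter
    "a": "Ll",        # lowercase letter
    "7": "Nd",        # decimal digit
    "\u0301": "Mn",   # combining acute accent
    "\u093e": "Mc",   # Devanagari vowel sign AA
    "\u20dd": "Me",   # combining enclosing circle
    " ": "Zs",        # space
    "+": "Sm",        # math symbol
    "$": "Sc",        # currency symbol
    "(": "Ps",        # start bracketing punctuation
    ")": "Pe",        # end bracketing punctuation
    "\u00ab": "Pi",   # left-pointing double angle quotation mark
    "\u00bb": "Pf",   # right-pointing double angle quotation mark
    "_": "Pc",        # connector punctuation
}

for ch, expected in samples.items():
    actual = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} {actual} (listed above as {expected})")
```

A language that admits, say, any Sm token as an operator and any Lu or Ll token in a name could make that decision with a lookup of this kind.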

For it to become popular to use a greater variety of Unicode tokens in computer programs, there must be a commonly available IME for their entry with keyboards. Sun's Fortress provides keystroke sequences for entering mathematical symbols in programs, but leaves it vague whether the Unicode tokens or the ASCII keys used to enter them are the true tokens in the program text. And of course there must be a commonly available font representing every token. Perhaps because of the large number of CJK characters, and the recent technological development of mainland China, a large number of programmers may one day suddenly begin using them in computer programming to make their programs terser.

Language representation using graphical symbols has taken many huge leaps in history: Egyptian hieroglyphs to represent speech around 5000 years ago, an alphabet to represent consonant and vowel sounds by the Phoenicians and Greeks around 3000 years ago, movable-type printing in Europe around 500 years ago, and unifying the world's alphabets and symbols into Unicode a mere 20 years ago. And who knows what the full impact of this latest huge leap will be?

Saturday, December 06, 2008

The Thematic Structure of English and Groovy

After working as a programmer for many years, I tossed it in to teach English in China. I spent a few years reading the many books on language and linguistics in the bookshops up here, before returning to programming as a hobby. I then started to see many similarities between natural and computer languages, which I'm intermittently blogging about. Here's today's installment...

Of the books on language I've read, M.A.K. Halliday's ideas make a lot of sense. He suggests we should analyse language in terms of what it's used for, rather than its inherent structure. From this basis, he's isolated three basic functions of natural language, and their corresponding structural subsystems: the ideational, the interpersonal, and the textual.

The ideational function is a representation of experience of the outside world, and of our consciousness within us. It has two main components: the experiential and the logical. The experiential component embodies single processes, with their participants and circumstances, in a transitivity structure. For example, “At quarter past four, the train from Newcastle will arrive at the central station.” has a transitive structure with process to arrive, participants train from Newcastle and central station, and circumstance quarter past four. The primary participant is called the actor, here, the train from Newcastle. Computer languages have a structure paralleling the transitivity structure of natural languages, e.g. train.arrive(station, injectedCircumstance) for object-oriented languages. The logical component of ideational function concerns links between the experiential components, attained with English words such as and, which, and while. These have obvious parallels in programming languages.
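The parallel with the transitivity structure can be made concrete. The following is a minimal sketch in Python rather than Groovy; the classes `Train` and `Station` and the parameter `at` are hypothetical names chosen to mirror the example sentence.

```python
from dataclasses import dataclass

@dataclass
class Station:
    name: str

@dataclass
class Train:
    origin: str

    def arrive(self, station, at):
        # Process: arrive. Actor: self. Other participant: station. Circumstance: at.
        return f"At {at}, the train from {self.origin} arrives at {station.name}."

train = Train(origin="Newcastle")
print(train.arrive(Station("the central station"), at="quarter past four"))
```

The receiver of the method call plays the actor, positional arguments play the other participants, and the keyword argument plays the circumstance.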

The interpersonal function involves the producer and receiver/s of language, and the relationship between them. This function accounts for the hundreds of different attitudinal intonations we overlay onto our speech, interjections, expressions of politeness we prepend to English sentences, e.g. “Are you able to...”, many adverbs of opinion, the mood (whether indicative, interrogative, imperative, or exclamative), and the modal structure of English grammar. The mood structure causes verbal phrases to have certainty, ability, allowability, polarity, and/or tense prepended in English, and can be repeated in the question tag, e.g. isn't he?, can't we?, should they?. The interpersonal function gives the grammatical subject-and-predicate structure to English. In programming languages, the interpersonal function determines how people interact with the program, and how other programs interact with it. The interpersonal functions are what would normally be extracted into aspects in aspect-oriented programming. They generally disrupt the “purer” transitivity structure of the languages.

The textual function brings context to the language through different subsystems. The informational subsystem divides the speech or text into tone units using pauses, then gives stress/es to that part of the unit that is new information. The cohesive subsystem enables links between sentences, using conjunctions and pronouns, substitution, ellipsis, etc. The thematic subsystem makes it easy for receivers to follow the flow of thought. Comparing this structure of the English and Groovy languages is the topic of today's blog post...

Thematic structure of English
Theme in English is overlaid onto the clause, a product of the transitive and modal structures. The theme is the first position in the clause. English grammar allows any lexical item from the clause to be placed in first position. (In fact, English allows the lexical items to be arranged in any order to enable them to be broken up in any combination into tone units.) Some examples, using lexical items to give, Alan, me, and the book, with theme bolded:
  Alan gave me that book in London.
  Alan gave that book to me in London. (putting indirect object into prepositional phrase)
  To me Alan gave that book in London. (fronting indirect object)
  I am who Alan gave that book to in London. (fronting indirect object, with extra emphasis)
  To me that book was given in London. (using passive voice to front indirect object)
  That book was given in London. (using passive voice to omit indirect object)
  That book Alan gave me in London. (fronting direct object as topic)
  That book is the one Alan gave me in London. (fronting direct object in more formal register)
  In London, Alan gave me that book. (fronting adverbial, into separate tone unit)
  London is where Alan gave me that book. (fronting adverbial in the same tone unit)
  There is a book given by Alan to me in London. (null topic)

Although not common, English also allows the verb to be put in first position as theme:
  Give the book Alan did to me in London.
  Give me the book did Alan in London.
  Give did Alan of the book to me in London.
  Given was the book by Alan to me in London.

First position is merely the way English indicates what the theme is, not the definition of it. Japanese indicates the theme in the grammatical structure (with the inflection はwa), while Chinese (I think) uses a combination of first position and grammatical structure (prepending with 是shi).

Thematic structure of Groovy
One way of indicating theme could be to bold it, assuming the text editor had rich text capabilities. This would be similar to Japanese. For example, for thematic variable a
  def b = a * 2; def c = a / 2;.
Another way is to use first position, how English indicates it. This would be an Anglo-centric thematic structure to programming languages, which generally already have an Anglo-centric naming system. Perhaps the best way is a combination of both front position and bolding.

Let's look at how Groovy could enable front-position thematic structure. We'll start with something simple, the lowest precedence operator: a = b. If we want to front the b, we can't. We would need some syntax like =:, the reverse of Algol's :=
  b =: a

We'd need to provide the same facility for the other precedence operators at the same level += -= *= /= %= <<= >>= >>>= &= ^= |=. Therefore, we'd have operators =+: =-: =*: =/: =%: =<<: =>>: =>>>: =&: =^: =|:.

At the next higher precedence level are the conditional and Elvis operators. Many programming languages, such as Perl and Ruby, enable unless as statement suffix, allowing the action to be fronted as the theme. Groovy users frequently request this feature of Groovy on the user mailing list. An unless keyword would be useful, but we could also make the ? : and ?: operators multi-theme-enabling by reversing them, i.e. : ? and :?, with opposite (leftwards) associativity. The right-associative ones would have higher precedence over these new ones, so, for example:
  a ? b : c ? d : e would associate like a ? b : (c ? d : e)
  a : b ? c : d ? e would associate like (a : b ? c) : d ? e
  a : b : c ? d ? e would associate like a : (b : c ? d) ? e
  and a ? b ? c : d : e would associate like a ? (b ? c : d) : e
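To see why the grouping matters for the existing right-associative operator, here is a small Python sketch. The helper cond is a hypothetical stand-in for Groovy's ? : (it evaluates both branches eagerly, which the real operator does not), and the values are arbitrary. It only models the first rule above, not the proposed reversed operators.

```python
def cond(test, then, otherwise):
    """Stand-in for Groovy's  test ? then : otherwise  (eager, for illustration only)."""
    return then if test else otherwise


a, b, c, d, e = True, "B", False, "D", "E"

# Grouping the language actually uses: a ? b : (c ? d : e)
right_grouped = cond(a, b, cond(c, d, e))

# The other conceivable grouping: (a ? b : c) ? d : e
left_grouped = cond(cond(a, b, c), d, e)

print(right_grouped)  # B
print(left_grouped)   # D
```

The two groupings give different answers for the same operands, which is why each new operator needs its associativity and relative precedence spelled out.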

On a similar note: Groovy dropped the do while statement because of parser ambiguities. It should be renamed do until to overcome the ambiguities.

Next up the precedence hierarchy, we need shortcut boolean operators ||: and &&:, which evaluate, associate, and shortcut rightwards. Most of the next few operators up the hierarchy | ^ & == != <=> < <= > >= + * don't need reverse versions, but these do: =~ ==~ << >> >>> - / % **. It's good Groovy supplies the ..< operator so we can emphasize an endpoint in a range without actually processing it. We'll also provide the >.. and >..< operators.
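A quick Python sketch of why only some operators need a reversed form. The helper flipped and the names reverse_sub and reverse_pow are made up for the illustration; they are not part of Groovy or of any proposal here.

```python
import operator


def flipped(op):
    """Wrap a binary operator so it takes its operands in the opposite order."""
    return lambda first, second: op(second, first)


# Commutative operators read the same either way round,
# so either operand can already be fronted.
assert operator.add(2, 10) == operator.add(10, 2)
assert operator.mul(2, 10) == operator.mul(10, 2)

# Non-commutative operators change meaning when the operands swap.
assert operator.sub(10, 2) != operator.sub(2, 10)

reverse_sub = flipped(operator.sub)  # lets the subtrahend come first
reverse_pow = flipped(operator.pow)  # lets the exponent come first

print(reverse_sub(2, 10))  # 8, i.e. 10 - 2
print(reverse_pow(2, 10))  # 100, i.e. 10 ** 2
```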

Just as in English we have the choice of saying the king's men or the men of the king, depending on what we want to make thematic, we should have that choice in Groovy too.
We can easily encode reverse-associating versions of *. ?. .& .@ *.@ as .* .? &. @. @.*. To encode the standard path operator ., we could use .:.

A positive by-product of having these reverse-associative versions of the Groovy operators is they'll work nicely with names in right-directional alphabets, such as Arabic and Hebrew, when we eventually enable that.

When defining methods in Groovy, we should have the choice to put return values and modifiers after the method name and parameters, like in Pascal. This would cater to speakers of Romance languages, e.g. French, who generally put the adjectives after the nouns.

Groovy, like most programming languages, doesn't enable programmers to supply their own thematic structure to code, only the transitive structure. When used well, thematic structure in code enables someone later on to more easily read and understand the program. Perl was a brave attempt at providing “more than one way to do things”, but most programming languages haven't learnt from it. I'm working on a preprocessor for the Groovy Language, experimenting with some of these ideas. If it looks practical, I'll release it one day, as GroovyScript. It will make Perl code look like utter verbosity.

Saturday, November 22, 2008

Stress and Unstress in Computer Languages

Computer languages could learn a few things from natural languages in their design...

Natural Language
Many natural languages, such as English, make a distinction between stressed and unstressed words. In general, nouns, verbs, and adjectives (incl adverbs ending in -ly) are stressed, while grammar words are unstressed.

For example: “I walked the spotty dog to the shop, quickly bought some bread, and returned home”. (I've bolded the syllables we stress during speech in this and following examples.)

We stress the nouns (dog, shop, bread, home), adjectives (spotty, quick), and verbs (walk, buy, return), and don't stress the grammar words (I, the, to, -ly, some, and). (Note: In Transformational Grammar, adverbs ending in -ly are considered to be a specific inflectional form of the corresponding adjectives.)

Examples of unstressed grammar words in English are conjunctions (and, or, but), conjunctive adverbs (while, because), pronouns (this, you, which), determiners (any, his), auxiliary verbs (is, may), prepositions (to, on, after), and other unclassed words (existential there, infinitive to), as well as many inflectional morphemes (-s, -'s, -ing, -ly).

Verbs are often only half-stressed instead of fully stressed, and prepositions half-stressed instead of unstressed, depending on the surrounding context, e.g. “The teacher saw the book behind the desk.” (Here, I've bold-italicized the half-stressed words.)

English has a clear distinction between grammar words and lexical words (nouns, adjectives/adverbs, and verbs) in speech.

Many languages distinguish between lexical and grammar words in their writing systems. German capitalizes the first letter of each noun. (Dutch stopped doing this in 1948, and English in the 1700's). Japanese uses Chinese characters for nouns and many adjectives, and the Japanese alphabet for grammar words and many verbs.

When using grammar words in a lexical capacity, we stress them when speaking, e.g. “I put an 'is' followed by an 'on', before the 'desk' with a 'the' before it, to make a predicate.” And when writing, we put the grammar words we're using as lexical ones inside quotes.

Using stress and unstress to separate lexical and grammar words enables English, and probably all natural languages, to be self-referential.

Computer Languages
Virtually every computer language differentiates between lexical words and grammar words.

Assembler and Cobol used indentation and leading keywords to distinguish different types of statements, and space and comma to separate items. Like many languages after them, the limited set of keywords couldn't be used for user-defined names. Fortran introduced a simple infix expression syntax for math calculations, using special symbols (+ - * etc) for the precedenced infix operators, and ( ) for bracketing. Lisp removed the indentation and keywords completely, making everything use bracketing, with space for separation, and a prefix syntax. APL removed the precedences, but introduced many more symbols for the operators. The experimentation continued until C became widespread.

C uses 3 different types of symbols for bracketing, ( ) [ ] { }. C++, Java, and C# added < > for bracketing. C uses space and , ; . for separators, and a large number of operators, organized via a complex precedence system. Java has 53 keywords; C# has 77.

The lexical words of computer languages are clear. Classes and variables are nouns. Functions and methods are verbs. Keywords beginning a statement are imperative verbs, and in some languages are indistinguishable from functions. Modifiers, interfaces, and annotations are adjectives/adverbs. The operators (+ - * / % etc) bear a similarity to prepositions, some of them (+= -= *= etc), to verbs. And I'd suggest the tokens used for bracketing and separators are clear examples of grammar words in computer languages, being similar to conjunctions and conjunctive adverbs.

In general, computer languages use some tokens (e.g. A-Z a-z 0-9 _) for naming lexical words, and others (e.g. symbols and punctuation) for grammar. Occasionally, there are exceptions, such as new and instanceof in Java. Some computer languages use other means. Perl and PHP put a sigil (such as $) before variable names, enabling any word, even a keyword, to be used as a name. This is similar to capitalizing all nouns in German. C# allows @ before any lexical word, but only requires it before those which double as keywords. This is similar to quoting grammar words to use them as lexical ones in English.

Newer programming languages have different ways to use Unicode tokens in names and operators. The display tokens in Unicode fall into six basic categories: letters (L), marks (M), numbers (N), symbols (S), punctuation (P), and separators (Z). Python 3.0 names can begin with any Unicode letter (L), numeric letter (in N), or the underscore (in P); subsequent tokens can also be combining marks (in M), digits (in N), and connector punctuation (in P). Scala names can begin with an upper- or lowercase Unicode letter (in L), the underscore (in P), or the dollar sign (in S); subsequent tokens can also be certain other letters (in L), numeric letters (in N), and digits (in N). Scala operators can include math and other symbols (in S). Almost all languages have the same format for numbers, beginning with a number (in N), perhaps with letters (in L) as subsequent tokens.
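The categories can be inspected with Python's standard unicodedata module, whose category() function returns a two-letter code beginning with the major class letter. The sample characters below are arbitrary picks for illustration.

```python
import unicodedata

samples = [
    "a",       # Ll: lowercase letter
    "\u0301",  # Mn: combining acute accent, a mark
    "7",       # Nd: decimal digit
    "\u2167",  # Nl: ROMAN NUMERAL EIGHT, a numeric letter
    "_",       # Pc: connector punctuation
    "$",       # Sc: currency symbol
    "+",       # Sm: math symbol
    " ",       # Zs: space separator
]

categories = {ch: unicodedata.category(ch) for ch in samples}
for ch, cat in categories.items():
    print(f"U+{ord(ch):04X} {cat}")

# Python 3 identifier rules, checked with the built-in str.isidentifier()
print("x\u0301".isidentifier())  # True: a combining mark may follow a letter
print("_x".isidentifier())       # True: the underscore may start a name
print("7x".isidentifier())       # False: a digit may not start a name
```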

Perhaps the easiest way to distinguish between lexical and grammar words in GrerlVy is to use Unicode letters (L), marks (M), and numbers (N) exclusively for lexical words, and symbols (S), punctuation (P), and separators (Z) exclusively for grammar words. Of course, we still have a difficulty with the borderline case: infix operators and prefix methods, which correspond roughly to prepositions and verbs, the half-stressed words in English. I'm still thinking about that one.
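A minimal Python sketch of that rule, assuming nothing beyond the L/M/N versus S/P/Z split described above. The function names word_kind and split_runs are hypothetical, and the "other" bucket for control characters is an addition needed only to make the function total.

```python
import unicodedata
from itertools import groupby

LEXICAL = {"L", "M", "N"}  # letters, marks, numbers
GRAMMAR = {"S", "P", "Z"}  # symbols, punctuation, separators


def word_kind(ch):
    """Classify one character by its major Unicode category."""
    major = unicodedata.category(ch)[0]
    if major in LEXICAL:
        return "lexical"
    if major in GRAMMAR:
        return "grammar"
    return "other"  # control characters and the like (category C)


def split_runs(source):
    """Split source text into consecutive runs of the same kind."""
    return [(kind, "".join(chars)) for kind, chars in groupby(source, key=word_kind)]


print(split_runs("total+=price*2"))
```

Note that under this rule the underscore, being connector punctuation, would land on the grammar side, which is one of the places where the scheme departs from current naming habits.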

Saturday, September 06, 2008

Mass-Parity-Distance Invariance

During July and August, I took a break from China and Tesol and Groovy, visiting my home country, New Zealand. I hoed into a copy of Roger Penrose's The Road to Reality, and came up with an idea to explain Dark Energy...

Negative Mass
Negative mass is usually defined in such a way that Einstein's equivalence principle still holds, where gravitational mass is proportional to inertial mass. This results in some bizarre effects. But while reading Penrose's book, I got an idea on how to define negative mass so that all the positive matter and all the negative fly off in two opposite directions at the Big Bang, with the equivalence principle still holding.

The key is how we calculate the (scalar) distance with respect to some mass. For positive matter, we would continue to use the positive solution to the formula where we square root the sum of the squares of the three spatial coordinates. But we'd introduce an invariance, known as the Mass-Distance Invariance, where we'd use the negative solution to the square root for scalar distances measured with respect to negative masses.

Some consequences of this invariance are:
  • The same vector values for velocity and acceleration would be used for negative mass as for positive mass, but their scalar values would depend on whether positive matter was referenced, or negative matter. Negative matter would use negative speeds and, to indicate increasing speeds, negative acceleration values.

  • A positive-valued g-force (created by positive matter) would still mean attraction for positive matter, but repulsion for negative matter. However, a negative g-force (created by negative matter) would mean attraction for negative matter, but repulsion for positive.

  • When calculating the (scalar) gravitational force between two objects, the square of the distance between them would always be positive, but a positive force is attraction, and a negative force is repulsion. This means two negative masses attract, as do two positive masses, but positive and negative masses repel each other.

  • Such scalar values for force involving negative matter would use negative distance again when calculating energies, resulting in negative energies. Penrose mentions negative energies mess with quantum mechanical calculations, but in the real Universe, this would be OK because positive and negative energies would be partitioned off due to the gravitational effects of the Big Bang.
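The attraction and repulsion rule in the third point can be checked with a few lines of Python. This is only a sketch of the sign convention described above (positive force means attraction), built on the ordinary Newtonian formula; the function names and test masses are illustrative.

```python
G = 6.674e-11  # Newton's gravitational constant, N m^2 / kg^2


def scalar_force(m1, m2, distance):
    # The distance is squared, so its sign (negative when measured
    # with respect to negative mass) drops out of the result.
    return G * m1 * m2 / distance ** 2


def effect(force):
    # Convention from the text: positive force attracts, negative repels.
    return "attraction" if force > 0 else "repulsion"


print(effect(scalar_force(5.0, 3.0, 2.0)))     # attraction
print(effect(scalar_force(-5.0, -3.0, -2.0)))  # attraction
print(effect(scalar_force(5.0, -3.0, 2.0)))    # repulsion
```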

Therefore, when calculating scalar values in the negatively-massed side of the Universe, we'd use (1) negative distances, (2) divided by positive time to give negative-valued speed, (3) divided by positive time again to give negative acceleration values to indicate increasing speeds, (4) multiplied by negative mass to give positive-valued scalar forces to indicate attraction, (5) multiplied by negative distances to give negative values for energy.
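The sign bookkeeping in that chain can be followed step by step in Python. The numbers are arbitrary and the formulas are the crudest possible (speed as distance over time, and so on); only the signs are the point.

```python
distance = -12.0  # metres, negative because measured with respect to negative mass
time = 2.0        # seconds, always positive
mass = -3.0       # kilograms, negative mass

speed = distance / time      # -6.0: negative-valued speed
acceleration = speed / time  # -3.0: negative value, meaning increasing speed
force = mass * acceleration  #  9.0: positive scalar force, meaning attraction
energy = force * distance    # -108.0: negative energy

print(speed, acceleration, force, energy)
```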

Picturing All This
When picturing such a scenario using the common "matter bends space which moves matter" 2D curved-space picture to model the 3+1D reality in general relativity, the positive matter would be on top of the sheet sinking downwards as before, but the negative matter would be under the sheet, to indicate negative distances, floating upwards, to indicate the negative mass. We can then visualize positive and negative matter each self-gravitating, but repelling each other.

The positive matter would act via left-handed gravitons as before, but the negative matter would act via right-handed gravitons. Penrose, in his description of Twistor Theory, says that there's a problem in the calculations getting left-handed and right-handed gravitons to interact with each other to enable graviton plane polarization, similar to what's possible with electromagnetism. But in my theory, it would be a requirement that left-handed and right-handed gravitons don't interact in any way. This enables both attractive gravity and repulsive gravity to operate at different scales in the same spacetime.

This graviton-handedness has a counterpart in neutrinos, responsible for the vast excess of matter over antimatter in the observable Universe. So we need to follow the lead of Charge-Parity-Time (CPT) Invariance, and likewise introduce parity invariance, resulting in what I'm now calling Mass-Parity-Distance Invariance, or MPD-invariance.

Dark Energy
Observational evidence of such MPD-invariant negative matter would be an expected after-effect of the inflation of the very early Universe. The modified version of the Big Bang is that the Universe's overall zero energy fractures into equal Planck-distance-separated positive and negative amounts in the first quantum instant of the Universe, then their respective gravitational fields repel the positive and negative away from each other, resulting in a Big Bang in two different directions along one spatial axis. The actual reason for the Big Bang can therefore be explained by quantum effects.

After the faster-than-light inflation stopped, the right-handed gravitons from the negative matter would be travelling towards the positive matter at the speed of light only, resulting in a time lag between inflation ending and the gravitational repulsion of the negative mass beginning to affect the positive mass with a renewed expansion. This is exactly what happened after about 10 billion years, what's called Dark Energy.

Negative-Frequency Electromagnetism
The photon would behave differently to the graviton. Planck's famous equation states photon energy equals Planck's constant multiplied by the frequency. Negative-energy photons would then have negative frequency, but for a photon this is not the same as changing the handedness (helicity), because photons have both electric and magnetic vectors. Both left-handed and right-handed photons have positive energy, and can polarize. Photons of negative energy/frequency, whether left-handed or right-handed, would have their electric and magnetic vectors swapped around.

Negative matter and antimatter are two separate concepts. Matter and antimatter created from positive energy in normal particle interactions would both have positive mass, similarly negative mass for negative energy. But a virtual particle-antiparticle pair in a vacuum would not only have an overall charge of zero, but also an overall energy of zero, one of the pair having positive energy, the other, negative. Perhaps the particle has negative mass, or perhaps the antiparticle does. This fact could provide a solution to the "hierarchy problem", there no longer being any need for supersymmetric particles to adjust quantum energy values.

The first quantum event of the Big Bang would determine how much energy, positive or negative, is in each side of the Universe. The left-handed gravitons and left-handed neutrinos go one way, their right-handed counterparts, the other. So one half of the Universe is matter with positive mass, the other half, antimatter with negative mass. One spatial dimension of the Universe is thus different to the other two, with homogeneity and isotropy being more local effects.

An alternative shape of the Universe is a four-partitioned one, where positive matter, positive antimatter, negative matter, and negative antimatter fly off in 4 different directions on a plane. This can be visualized with the 2-D saddle-shape for a hyperbolic Universe, with positive matter on top of the sheet, its matter going one way and its antimatter the other, both down the saddle on each side, and negative matter underneath the sheet, its matter and antimatter each flying off up the saddle, at ninety-degree angles to the positive matter and antimatter.

It's been two decades since I finished my undergrad maths degree, and I haven't used it since, so I'm rusty. And although I basically followed the maths in Penrose's book, I didn't get all the intricacies of manifold calculus and bundles and Lagrangians. If anyone out there fills my wordy explanation of MPD-Invariance with numbers, let me know if it works or if it's rubbish. But there are more follow-on ideas I've had...

The Universe as Two Complex Planes
There's an eerie similarity between the well-known Charge-Parity-Time (CPT) Invariance and my proposed Mass-Parity-Distance (MPD) Invariance. I think it suggests a certain structure to the Universe alluded to by Penrose in his Twistor Theory. He suggests the Universe can be modelled as three complex planes (i.e. 6 real dimensions), the "imaginary" dimension being as physically real as a "real" one. But elsewhere Penrose says if there are only 4 observational dimensions of spacetime, we shouldn't try to model them with 11 or 26 dimensions. I'd suggest the Universe can be modelled as only 2 complex planes to match the 4 observational dimensions of spacetime. The extra 2 dimensions required by Penrose's model could come from the fractal dimensions created by those 2 complex planes.

A curve on a complex plane usually has a (Hausdorff) dimension of 1, but fractal curves have a dimension higher than 1, but less than or equal to 2. Only very special fractals, such as the Mandelbrot set and Julia sets, have Hausdorff dimension of 2. If there exist on any complex plane an (aleph-zero-)infinite number of concentric Hausdorff-dimension-2 sets, then I suspect the plane itself would have Hausdorff dimension 3. The union into a manifold of two such complex planes would have Hausdorff dimension 6, while only having topological dimension 4, thus satisfying Penrose's minimal number of dimensions to model our Universe.

We can create such an arrangement on both our complex planes by relating them together using an uncertainty relation. Because the Mandelbrot and Julia sets are the only sets I know of with Hausdorff dimension high enough to be valid in this model, I'll use the Mandelbrot set as an example. The basic set is only one connected curve on the complex plane, but when a computer calculates it, many circles of various colors are usually displayed to reflect different accuracies of calculation. These circles are concentric. Although only the infinitely accurate Mandelbrot set normally has any mathematical significance, when relating two complex planes together in an uncertainty relationship, the curve generated from each accuracy level takes on significance.
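The colored bands come from the usual escape-time calculation: each point c is iterated through z → z² + c until |z| exceeds 2 or an iteration cap is reached, and the cap plays the role of the "accuracy". A minimal Python sketch follows; the function name and the default cap of 50 are illustrative choices, not anything fixed by the model.

```python
def escape_time(c, max_iter=50):
    """Return the number of iterations before z = z*z + c leaves the disc |z| <= 2.

    Points that never leave within max_iter steps return max_iter and are
    treated as belonging to the set at that accuracy level.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


print(escape_time(0j))       # 50: stays at zero forever
print(escape_time(-1 + 0j))  # 50: cycles between 0 and -1
print(escape_time(1 + 0j))   # 3: 0, 1, 2, 5 and out
print(escape_time(3 + 0j))   # 1: out after one step
```

The region of points with escape time of at least n shrinks as n grows, so the boundaries between bands are nested one inside another, closing in on the set itself as the cap goes to infinity.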

Relationship Between CPT and MPD Invariances
Mass would be modelled as one of the fractal dimensions, while charge modelled as the other. The two invariances, CPT and MPD, both of them having parity (i.e. space reflection) included, bear a vague resemblance to the requirements for 2n-D real manifolds to be treated as n-D complex manifolds under the Newlander-Nirenberg theorem, in this case 4 real dimensions as 2 complex planes. One plane, required to be CPT-invariant, would have time as one dimension, say, the real. The imaginary dimension would be a dimension of space, and the fractal dimension, charge. The other complex plane, required to be MPD-invariant, would have the other two dimensions of space for its real and imaginary dimensions, and mass for its fractal dimension.

Planck's constant defines the uncertainty relationship between time (i.e. the real dimension of one complex plane) and energy (i.e. a proxy for mass, the fractal dimension of the other complex plane). This would be the uncertainty relationship that makes the complex planes have (Hausdorff-)dimension-3.
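For reference, the textbook forms being appealed to here are the energy-time uncertainty relation and mass-energy equivalence, the latter being what lets energy stand in as a proxy for mass. How they would attach to the two complex planes is the speculative part and is not captured by the equations themselves.

```latex
\Delta E \, \Delta t \gtrsim \frac{\hbar}{2}, \qquad E = m c^{2}
```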

The other fundamental constants of nature could be interpreted as observational coordinate mappings between dimensions on these two complex planes. The speed of light is a mapping between the time and space dimensions on the same plane. Newton's gravitational constant is a mapping between the (fractal) mass dimension and a space dimension. Coulomb's constant is a mapping between the (fractal) charge dimension and space dimension. The three space dimensions wouldn't need mapping between one another, as their differences from one another are only apparent in the helicity of the graviton and neutrino. So these four constants (Planck's, the speed of light, Newton's, and Coulomb's) would be sufficient mappings for the two planes.

Everyday Observation
I've ignored the forces without an infinite range (the strong and weak forces) in this model. The basic difference between MPD-invariant gravity and CPT-invariant electromagnetism is that in gravity, like masses attract while unlike ones repel, whereas in electromagnetism, like charges repel while unlike ones attract. The logical effect of this (ignoring finite-range forces) is that gravity's masses are real numbers, while charges are polar.

So we have two complex planes, each with three dimensions, i.e. real-imaginary-fractal. The first has Time-Distance-Charge, the second, Distance-Distance-Mass. Perhaps, in our own everyday observation of these planes, the charge, having polar (i.e. 0 or +1 or -1) values only, doesn't require its total dimensional freedom to operate, and only needs a Planck-distance portion of the (fractal) charge and (imaginary) distance dimensions. So the second complex plane "takes" the excess distance dimension from the first plane to create 3D-space, and the mass, having aggregative values, also "takes" the excess fractal freedom of the charge. So we end up with 1D-time, 3D-space, polar-charge, and aggregative-mass.

If I had the time, I'd be looking at the maths for relating two complex planes together, each with (Hausdorff-)2D fractal curves, using an uncertainty relationship, trying to derive relativity axioms and asymmetric time and such stuff. But I've now got other demands on my time, hence this blog entry. If my description rings a bell with anyone, let me know how it goes.