Palling around.

Pall (n.): pawl.

I couldn't write last week, and my upgrade to QL has progressed no further.
For reference, I stalled before comparing the efficiency of nested Objects to
that of nested Arrays, which I must test before experimenting further with the
prototype compiler or even refining the design. I intend to do that this month.
In the meantime, here's a snapshot of MLPTK with new experiments included.

http://www.mediafire.com/download/566ln3t1bc5jujp/mlptk-p9k-08apr2016.zip

And a correction to my brief about the grammar ("Saddlebread"): actually, the
InchoateConjugation sequence does not cause differentiation, because the OP_CAT
prevents the original from reducing. Other parts may be inaccurate. I'll revise
the grammar brief and post a new one as soon as I have fixed the QL speed bug.

I took some time out from writing Quadrare Lexema to write some code I've been
meaning to write for a very long time: pal9000, the dissociated companion.
This software design is remarkably similar to the venerable "Eggdrop," whose C
source code is available for download at various locations on the Internets.
Obviously, my code is free and within the Public Domain (as open as open source
can be); you can find pal9000 bundled with today's edition of MLPTK, beneath the
/reference/ directory.

The chatbot is a hardy perennial computer program.
People sometimes say chatbots are artificial intelligence; although they aren't,
exactly, or at least this one isn't, because it doesn't know where it is or what
it's doing (actually it makes some assumptions about itself that are perfectly
wrong) and it doesn't apply the compiler-like technique of categorical learning
because I half-baked the project. Soon, though, I hope...

Nevertheless, mathematics allows us to simulate natural language.
Even a simplistic algorithm like Dissociated Press (see "Internet Jargon File,"
maintained somewhere on the World Wide Web, possibly at Thyrsus Enterprises by
Eric Steven Raymond) can produce humanoid phrases that are like real writing.
Where DisPress fails, naturally, is paragraphs and coherence: as you'll see when
you've researched, it loses track of what it was saying after a few words.

Of course, that can be alleviated with any number of clever tricks; such as:
	1. Use a compiler.
	2. Use a compiler.
	3. Use a compiler.
I haven't done that with p9k, yet, but you can if you want.

Of meaningful significance to chat robots is the Markov chain.
That is a mathematical model, used to describe some physical processes (such as
diffusion), describing a state machine in which the probability of any given
state occurring is dependent only on the next or previous state of the system,
without regard to how that state was encountered.
Natural language, especially that language which occurs during a dream state or
drugged rhapsody (frequently and too often with malicious intent, these are
misinterpreted as the ravings of madmen), can also be modeled with something
like a Markov chain because of the diffusive nature of tangential thought.

The Markov-chain chat robot applies the principle that the state of a finite
automaton can be described in terms of a set of states foregoing the present;
that is, the state of the machine is a sliding window, in which is recorded some
number of states that were encountered before the state existent at the moment.
Each such state is a word (or phrase / sentence / paragraph if you fancy a more
precise approach to artificial intelligence), and the words are strung together
one after another with respect to the few words that fit in the sliding window.
So, it's sort of like a compression algorithm in reverse, and similar to the way
we memorize concepts by relating them to other concepts. "It's a brain. Sorta."

One problem with Markov robots, and another reason why compilers are of import
in the scientific examination of artificial intelligence, is that of bananas.
The Banana Problem describes the fact that, when a Markov chain is traversed, it
"forgets" what state it occupied before the sliding window moved.
Therefore, for any window of width W < 6, the input B A N A N A first produces
state B, then states A and N sequentially forever.
Obviously, the Banana Problem can be solved by widening the window; however, if
you do that, the automaton's memory consumption increases proportionately.

Additionally, very long inputs tend to throw a Markov-'bot for a loop.
You can sorta fix this by increasing the width of the sliding window signifying
which state the automaton presently occupies, but then you run into problems
when the sliding window is too big and it can't think of any suitable phrase
because no known windows (phrases corresponding to the decision tree's depth)
fit the trailing portion of the input.
It's a sticky problem, which is why I mentioned compilers; they're of import to
artificial intelligence, which is news to absolutely no one, because compilers
(and grammar, generally) describe everything we know about the learning process
of everyone on Earth: namely, that intelligent beings construct semantic meaning
by observing their environments and deducing progressively more abstract ideas
via synthesis of observations with abstractions already deduced.
Nevertheless, you'd be hard-pressed to find even a simple random-walk chatbot
that isn't at least amusing.
(See the "dp" module in MLPTK, which implements the vanilla DisPress algorithm.)

My chatbot, pal9000, is inspired by the Dissociated Press & Eggdrop algorithms;
the copy rights of which are held by their authors, who aren't me.
Although p9k was crafted with regard only to the mathematics and not the code,
if my work is an infringement, I'd be happy to expunge it if you want.

Dissociated Press works like this:
	1. Print the first N words (letters? phonemes?) of a body of text.
	2. Then, search for a random occurrence of a word in the corpus
	   which follows the most recently printed N words, and print it.
	3. Ad potentially infinitum, where "last N words" are round-robin.
It is random: therefore, humorously disjointed.

And Eggdrop works like this (AFAICR):
	1. For a given coherence factor, N:
	2. Build a decision tree of depth N from a body of text.
	3. Then, for a given input text:
	4. Feed the input to the decision tree (mmm, roots), and then
	5. Print the least likely response to follow the last N words
	   by applying the Dissociated Press algorithm non-randomly.
	6. Terminate response after its length exceeds some threshold;
	   the exact computation of which I can't recall at the moment.
It is not random: therefore, eerily humanoid. (Cue theremin riff, thundercrash.)

A compiler, such as I imagined above, could probably employ sliding windows (of
width N) to isolate recurring phrases or sentences. Thereby it may automatically
learn how to construct meaningful language without human interaction.
Although I think you'll agree that the simplistic method is pretty effective on
its own; notwithstanding, I'll experiment with a learning design once I've done
QL's code generation method sufficiently that it can translate itself to Python.

Or possibly I'll nick one of the Python compiler compilers that already exists.
(Although that would take all the fun out of it.)

I'll parsimoniously describe how pal9000 blends the two:

First of all, it doesn't (not exactly), but it's close.
Pal9000 learns the exact words you input, then generates a response within some
extinction threshold, with a sliding window whose width is variable and bounded.
Its response is bounded by a maximum length (to solve the Banana Problem).
Because it must by some means know when a response ends "properly," it also
counts the newline character as a word.
These former are departures from Eggdrop.
It also learns from itself (to avoid saying something twice), as does Eggdrop.

In addition, p9k's response isn't necessarily random.
If you use the database I included, or choose the experimental "generator"
response method, p9k produces a response that is simply the most surprising word
it encountered subsequent to the preceding state chain.
This produces responses more often, and they are closer to something you said
before, but of course this is far less surprising and therefore less amusing.
The classical Eggdrop method takes a bit longer to generate any reply; but, when
it does, it drinks Dos Equis.
... Uh, I mean... when it does, the reply is more likely to be worth reading.
After I have experimented to my satisfaction, I'll switch the response method
back to the classic Eggdrop algorithm. Until then, if you'd prefer the Eggdrop
experience, you must delete the included database and regenerate it with the
default values and input a screenplay or something. I think Eggdrop's Web site
has the script for Alien, if you want to use that. Game over, man; game over!

In case you're curious, the algorithmic time complexity of PAL 9000 is somewhere
in the ballpark of O(((1 + MAX_COHERENCE - MIN_COHERENCE) * N) ^ X) per reply,
where N is every unique word ever learnt and X is the extinction threshold.
"It's _SLOW._" It asymptotically approaches O(1) in the best case.

For additional detail, consult /mlptk/reference/PAL9000/readme.txt.

Pal9000 is a prototypical design that implements some strange ideas about how,
exactly, a Markov-'bot should work. As such, some parts are nonfunctional (or,
indeed, malfunction actually) and vestigial. "Oops... I broke the algorithm."
While designing, I altered multiple good ideas that Eggdrop and DisPress did
right the first time, and actually made the whole thing worse on the whole. For
a more classical computer science dish, try downloading & compiling Eggdrop.
Advertisements

Pie Jesu domine, dona eis requiem.

Breaking news: according to Shape magazine (March, 2016; volume 35, no. 6), which incorporates Fitness magazine, forty winks shouldn’t be.

Mirel Ketchiff writes: “enlightening new research is challenging [the notion, suggested by the National Sleep Foundation, that we need eight hours of sleep every night].” This enlightening new research indicates that prehistoric cavemen slept six and a half hours each night (possibly because they couldn’t get to sleep while the stalactites dripped on their heads); how anthropologists learnt the crepuscular habits of people who existed before the advent of recorded history is, evidently, left as an exercise to the reader.

Exactly how much sleep does anyone need, anyway? Someone once told me that children need about ten hours a night. Then the National Sleep Foundation told me that adolescents and adults require about eight. Now Shape magazine says I need six and a half. What’ll it be next; maybe I don’t need any sleep at all! Methamphetamine addicts have been known not to sleep for extended periods of time, and to become fashionably slender no matter how gluttonous their eating habits. Perhaps that is fitness, Fitness?

Soon we can all abandon our outmoded, unfashionable and inefficient nightly nonce of unconsciousness. Ascending from our benighted evolution, we’ll first return to our prehistoric habits (as though we ought ever to have abandoned them in the first place), and then do away with sleep altogether. Employing methamphetamine and a thousand other compounds we’re taught in school are bad for our bodies and minds, we’ll become Übermensch — harder, better, faster, and stronger than those other Nation-Brands.

Now unencumbered by our need to rest our minds each night so that we can demarcate the border line of fantasy and reality (and, vicariously, of right and wrong), and and thoroughly brain-damaged as a result, we’ll spring forward into a new age of crime, misconduct, and rampant procreation.

Promiscuity is a citizen’s duty.

I will gladly pay you Tuesday for a Hnakkerbröd today.

Beware: when speaking to Trolls, listen carefully. It could save your ice hole.
        (Trolls are known for their skill in wintertime fishing.)

My software, courtesy of MediaFire:

https://www.mediafire.com/?ebbyj47e35mmytg (http://www.mediafire.com/download/ebbyj47e35mmytg/mlptk-qlupgradecomplete-awaitingspeedfix-27mar2016.zip)

This update (unlike the foregoing, which fixed the Opera bug) regards only QL.
Again: Quadrare Lexema is similar to GNU Bison. If an infringement, I'll delete
it as soon as I hear or see from you. BTW, Bison is time-tested; QL isn't.

Oh, and a correction to "First Rate?": Bison can indeed unshift multiple tokens
back onto its input stream, even though productions can't be multiple symbols in
length, by unshifting directly onto its input during reduction (which is how QL
does it too, during deferment, which amounts to the same exact thing because no
reason exists to compute anything during deferment -- otherwise, there'd be more
race conditions than the Kentucky Derby, which is very silly).

QL is now "kinda-sorta" debugged and functioning to my specification AFAICT. Now
the API has changed considerably from how it was last month (the argument vector
to reduction deferment class constructors has been modified, some new faculties
now exist, and some were removed); this necessitated additional instantiations
of "new Array()," and interolably reduces efficiency when operating on very long
inputs, but I wanted to hurry-up this design iteration. (That was one sentence.)

The token agglutination mechanism of the parser logic is the same as before.
Code to determine precedence & blocking has been abridged; toddling steps toward
a method to generate the parser as in-line code. (As you see, that isn't yet.)

I'm tempted to redo the infrastructure to reduce the number of new Array()s that
are instantiated during the parse phase, but I'm pretty sure I can do that by
rewriting the underlying code without changing the API. The interface between
the parser's stack extremum and its input stream is passed to reductions as an
Array(), but that doesn't mean it always has to be allocated anew.

Remember: the old Command Line Interpretation Translator scaffold isn't decided;
I left ::toTheLimit() where it was, pending a ::hatten() that shall suit you; if
you'd like to use the horrifying monstrosity that is my software architecture,
you can see Mr. Skeleton awaiting you in clit.js -- asking where is his flesh, &
rapidly becoming impatient with my poking and prodding him all day. Soon, Mr.
Skeleton; soon shall be the day when the hat is yours at last, & your calcareous
projection from within my library becomes a fully fledged automaton unto itself.

For the meantime I'm satisfied with the half-measure. I think the API is okay to
start building upon, so I'll start building. Overhaul of the back-end late this
year or early in the next, & it's looking good for me to furnish the CLIT before
the third quarter. Therefore I'd say: expect full CLIT functionality in 2016.

Before I apprise you of my progress so far, let's take a moment for a thoroughly
detailed technical analysis of Mr. Skeleton's bony protrusion.

Phoneme    <=    (EscapeSequence | AnyCharacter | Number | String)
                 (EscapeSequence | AnyCharacter | Number | String | Phoneme | )
	Concatenate a "word" that is one argument in an argument vector.

ISLDQ    <=    '\"'
	Open a <String>.

InchoateString    <=    (ISLDQ | InchoateString)
                        (OP_CAT | OP_CONJ | EscapeSequence | AnyCharacter
                         | Number | Space)
	Make strings out of any symbol following an open string.
	(As you can see, this rule must be rewritten...)

String    <=    InchoateString '\"'
	Close a <String>.

Argument    <=    Phoneme (Space | ) | Argument Argument
	Concatenate the argument vector comprising an executable MLPTK command.
	That bit with "(Space | )" should probably be just "Space".

Catenation    <=    (Argument | Group | Conjugation) OP_CAT
	Concatenate the output of commands.

MalformedGroupCohesion    <=    (Argument | Group | Conjugation) OP_CLPAR
	Automatically correct the user's malformed syntax where the last
	command in a parenthetical sub-grouping was not followed by a ";".

ExecutableInchoateConjugation    <=    Argument OP_CONJ | Blargument
	Signify that a command can be executed as part of a <Conjugation>.

InchoateConjugation    <=    Group OP_CONJ | Conjugation
	Convert a conjugated <Group>, or the output of a <Conjugation>,
	to an <InchoateConjugation> token that can form the left-hand
	part of a further <Conjugation>.
	This reduction causes parser stack differentiation, because it
	conflicts with "Catenation <= Conjugation OP_CAT".
	In that circumstance, the sequence "<Conjugation> <OP_CAT> ..."
	is both a "<Catenation> ..." and a "<InchoateConjugation> <OP_CAT> ...".
	Observe that the latter always produces a syntax error.
	I'm pretty sure I could rewrite the grammar of the <Conjugation> rule to
	fix this; IDK why I didn't. (Maybe a bug elsewhere makes it impossible.)

Conjugation    <=    (ExecutableInchoateConjugation | InchoateConjugation)
                     ExecutableInchoateConjugation
	Execute the command in the <ExecutableInchoateConjugation> at right,
	supplying on its standard input the standard output of that at left.

InchoateGroup    <=    (OP_OPPAR | InchoateGroup) Catenation
	Concatenate the contents of a parenthesized command sub-grouping.

Group    <=    InchoateGroup (OP_CLPAR | MalformedGroupCohesion)
	Close an <InchoateGroup>. Concatenate the contents of a
	<MalformedGroupCohesion> if it trailed.

CommandLine    <=    (CommandLine | ) Catenation
	Concatenate the output of <Catenation>s into one Array.
	This one actually doesn't differentiate. Either a <CommandLine> waits at
	left to consume a Catenation when it reduces, or something else does, &
	<Catenations> in mid-parse never reduce to <CommandLine>s except when
	fatal syntax errors occur, in which case the parser belches brimstone.
	
Blargument    <=    Argument (OP_CAT | OP_CLPAR)
    Duplicate the trailing concatenation operator or close parenthesis following
    an <Argument>, so that a <Conjugation> doesn't conflict with a <Catenation>
    or an <InchoateGroupCohesion>. I think this can be specified formally in a
    proper grammar, without the multiple-symbol unshift, but IDK how just yet --
    because (without lookahead) the parser can't know when the argument vector
    ends without seeing a trailing operator, so execution of the last command in
    the conjugation sequence <InchoateConjugation> <Argument> <OP_CAT> would
    occur when <Argument> <OP_CAT> reduces & executes, disregarding its standard
    input (the contents of the foregoing <InchoateConjugation>.
    "Blargument <= Argument" can never happen and "ExecutableInchoateConjugation
    <= Argument" would grab the <Argument> before it could concatenate with the
    next <Argument>, so I'm at a loss for how I should accomplish this formally.
    BTW, <Blargument> is the <WalkingBassline>, with trivial alterations.
    The <Blargument> reduction causes parser stack differentiation, because it
    conflicts with both <Catenation> and <MalformedGroupCohesion>. In either
    case, the <Blargument> branch encounters a syntax error & disappears
    when <Blargument> didn't immediately follow an inchoate conjugation; the
    other branch disappears in the inverse circumstance.

Token identifier   Operator precedence  Associativity
"Space", - - - - - 2, - - - - - - - - - "wrong",
"OP_CAT",  - - - - 0, - - - - - - - - - "wrong",
"EscapeSequence",  0, - - - - - - - - - "right",
"AnyCharacter",  - 0, - - - - - - - - - "right", (the sequence to the right of
"Number",  - - - - 0, - - - - - - - - - "right",  a right-associative token is
"String",  - - - - 0, - - - - - - - - - "right",  reduced first, except...)
"OP_OPPAR",  - - - 0, - - - - - - - - - "left",
"OP_CLPAR",  - - - 0, - - - - - - - - - "wrong", (... when wrong-associativity
"OP_CONJ", - - - - 0, - - - - - - - - - "wrong",  forces QL to reduce the right-
"QL_FINALIZE", - - 0, - - - - - - - - - "wrong"   associative sequence.)

The avid reader shall observe that my "wrong-associativity" specifier, when used
to define runs of right-associative tokens that stick to one another, is similar
to the lexical comparison & matching algorithm of a lexical analyzer (scanner).
In fact, as written, it _is_ a scanner. For an amusing diversion, try excerpting
the portions of the semantical analyzer (parser) that can be made into lexical
analyzer rules, then put them into the scanner; or wait a few months and I will.

But that's enough bony baloney.
If you preferred Mr. Skeleton as he was, see mlptk/old/clit.js.21mar2016.

As of probably a few days before I posted this brief, my upgrade to QL is now
sufficient to function vice its predecessor. I spruced-up Mr. Skeleton, so that
the test scaffold in clit.js now functions with ::hattenArDin() in QL, and now
everything looks all ready to go for shell arithmetic & such frills as that.

Ironically, it seems that I've made it slower by attempting to make it faster. I
 should have known better. I'll try to fix the speed problem soon; however,
until then, I've abandoned work on the Translator due to intolerable slowness.
I'm sorry for the inconvenience. If it's any help, I think the problem is due to
 too many new Array() invocations or too many nested Arrays, one of the two.
Either way, I intend to fix it this year by rewriting the whole parser (again)
as a static code generator.

I was also thinking of writing a windowing operating system for EmotoSys, but I
am uncertain how or whether to put clickable windows in the text console. I mean
-- it'd be simple now that QL's rolling, but maybe a more Spartan design? I will
post some flowcharts here when I've exhausted my ministrations to the CLIT.

I’m happy to relate that puns about stacks & state have become out of date.

More of QL, courtesy of MediaFire, with the MLPTK framework included as usual:

https://www.mediafire.com/?xaqp7cq9ziqjkdz (http://www.mediafire.com/download/xaqp7cq9ziqjkdz/mlptk-operafix-21mar2016.zip)

I have, at last, made time to fix the input bug in Opera!
I'm sorry for the delay and inconvenience. I didn't think it'd be that tricky.

I fixed an infinite loop bug in "rep" (where I didn't notice that someone might
type in a syntax error, because I didn't have a syntax error in my test cases),
and added to the bibliography the names of all my teachers that I could recall.
I added a quip to the bootstrap sequence. More frills.

My work on QL has been painfully slow, due to pain, but it'll all be over soon.
I see zero fatal flaws in ::hattenArDin. I guess debugging'll be about a month,
and then it's back to the Command Line Interpretation Translator. Sorry for the
delay, but I have to rewrite it right before I can right the written writing.
The improved QL (and, hopefully, the CLIT) is my goal for second quarter, 2016.
After those, I'll take one of two options: either a windowing operating system
for EmotoSys or overhaul of QL. Naturally, I favor the prior, but I must first
review QL to see if the API needs to change again. (Doubtful.)

MLPTK, and QL, are free of charge within the Public Domain.

Last time I said I needed to make the input stream object simpler so that it can
behave like a buffer rather than a memory hog. I have achieved this by returning
to my original design for the input stream. Although that design is ironically
less memory-efficient in action than the virtual file allocation table, it leans
upon the Javascript interpreter's garbage collector so that I mustn't spend as
much time managing the fragmented free space. (That would've been more tedious
than I should afford at this point in the software design, which still awaits an
overhaul to improve its efficiency and concision of code.) And, like I thought,
it'll be too strenuous for me to obviate the precedence heuristic by generating
static code. In fact: unless tokens are identified by enumeration, dropping that
heuristic actually reduces efficiency, because string comparisons; also, I was
wrong about the additional states needed to obviate it: there're none, but a new
"switch" statement (these, like "if" and "while," are heuristics) is needed, and
it's still slower (even best-case) than "if"ing the precedence integer per node.

Here's an illustration of the "old" input stream object (whither I now return):

	 _________________________________
	| ISNode: inherits from Array     | -> [ ISNode in thread 0 ]
	|    ::value = data (i.e., token) |
	|    ::[N] = next node (thread N) | -> [ ISNode in thread 1 ]
	|    (Methods...)                 |
	|    (State....)                  | -> [ etc ... ]
	-----------------------------------

As you see, each token on input is represented by a node in a linked list.
Each node requires memory to store the token, plus the list's state & methods.
Nodes in the istream link forward. I wanted to add back-links, but nightmare.
For further illustration, see my foregoing briefs.

Technically the input stream can be processed in mid-parse by using the "thread"
metaphor (see "Stacking states on dated stacks"); however, by the time you need
a parser that complex, you are probably trying to work a problem that requires a
nested Turing machine instead of using only one. Don't let my software design
lure you into the Turing tar-pit: before using the non-Bison-like parts, check &
see if you can accomplish your goal by doing something similar to what you'd do
in Yacc or Bison. (At least then your grammar & algorithm may be portable.)
QL uses only thread zero by default... I hope.

GNU Bison is among the tools available from the Free Software Foundation.
Oh, yeah, and Unix was made by Bell Laboratories, not Sun Microsystems. (Oops.)

Here's an illustration of the "new" input stream object (whence I departed):

	Vector table 0: file of length 2. Contents: "HI".
		       (table index in first column, data position in second column)
		0 -> 0 (first datum in vector 0 is at data position 0)
		1 -> 1 (second datum, vector 0, is at data position 1)
	Vector table 1: file of length 3. Contents: "AYE".
		0 -> 3
		1 -> 4
		2 -> 7
	Vector table 2: file of length 3. Contents: "H2O".
		0 -> 0
		1 -> 5
		2 -> 10
	Vector table 3: file of length 6. Contents: "HOHOHO".
		0 -> 0
		1 -> 10
		2 -> 0
		3 -> 10
		4 -> 0
		5 -> 10
	Data table: (data position: first row, token: second row)
		0     1     2     3     4     5     6     7     8     9    10
		H     I     W     A     Y     2     H     E     L     L    O

In that depreciated scheme, tokens were stored in a virtual file.
The file was partitioned by a route vector table, similar to vectabs everywhere.
Nodes required negligible additional memory, but fragmentation was troublesome.
Observe that, if data repeats itself, the data table need not increase in size:
lossless data compression, like a zip file or PNG image, operates with a similar
principle whereby sequences of numbers that repeat themselves are squeezed into
smaller sequences by creating a table of sequences that repeat themselves. GIF89
employs run-length encoding, which is a similar algorithm.

There is again some space remaining, and not much left to say about the update.
I'll spend a few paragraphs to walk you through how Quadrare Lexema parses.

QL is based upon a simple theory (actually, axiom) of how to make sense of data:
    1. Shift a token from input onto lookahead; the former LA becomes active.
    2. If the lookahead's operator precedence is higher, block the sequence.
    3. If the active token proceeds in sequence, shift it onto the sequence.
    4. Otherwise, if the sequence reduces, pop it and unshift the reduction.
    5. Otherwise, block the sequence and wait for the expected token to reduce.
And, indeed: when writing a parser in a high-level language such as Javascript,
everything really is exactly that simple. The thousands lines of code are merely
a deceptive veil of illusion, hiding the truth from your third eye.

Whaddayamean, "you don't got a third eye"?

QL doesn't itself do the tokenization. I wrote a separate scanner for that. It's
trivial to accomplish. Essentially the scanner reads a string of text to find a
portion that starts at the beginning of the string and matches a pattern (like a
regular expression, which can be done with my ReGhexp library, but I used native
Javascript reg exps to save time & memory and make the code less nonstandard);
then the longest matching pattern (or, if tied, the first or last or poka-dotted
one: whichever suits your fancy) is chopped off the beginning of the string and
stored in a token object who contains an identifier and a semantic value. Value
of a token is computed by the scanner in similar fashion as parser's reductions.
The scanner doesn't need to do any semantic analysis, but some is OK in HLLs.

Now that your input has ascended from the scanning chakra, let's manifest
another technical modal analysis. I wrote all my code while facing the upstream
direction of a flowing river, sitting under a waterfall, and meditating on the
prismatic refraction of a magical crystal when exposed to a candle dipped at the
eostre of the dawn on the vernal solstice -- which caused a population explosion
among the nearby forest rabbits, enraged a Zen monk who had reserved a parking
space, divested a Wiccan coven, & reanimated the zombie spirits of my ancestors;
although did not result in an appreciable decrease in algorithmic complexity --
but, holy cow, thou may of course ruminate your datum howsoever pleases thee.

The parser begins with one extremum, situated at the root node. The root is that
node whence every possible complete sequence begins; IOW, it's a sentry value.
Thence, symbols are read from input as described above. The parse stack -- that
is, the extrema -- builds on itself, branching anywhen more than one reduction
results from a complete sequence. As soon as a symbol can't go on the stack: if
the sequence is complete, the stack is popped to gather the symbols comprising
the completed sequence; that occurs by traversing the stack backward til arrival
at the root node, which is also popped. (I say "pop()" although, as written, the
stack doesn't implement proper pushing and popping. Actually, it isn't even a
stack: it's a LIFO queue. Yes, yes: I know queues are FIFO; but I could not FIFO
the FIFO because the parser doesn't know where the sequence began until it walks
backwards through the LIFO. So now it's a queue and a stack at the same time;
which incited such anger in the national programming community that California
seceded from the union and floated across the Pacific to hang out with Nippon,
prompting Donald Trump to reconsider his business model. WTF, m8?) Afterward,
the symbols (now rearranged in the proper, forward, order) are trundled along to
the reduction constructor, which is an object class I generated with GenProtoTok
(a function that assembles heritable classes). That constructor makes an object
which represents _a reduction that's about to occur;_ id est: a deferment, which
contains the symbols that await computation. The deferment is sent to the parse
stack (in the implementation via ::toTheLimit) or the input stream (::hatten);
when only one differential parse branch remains, computation is executed and the
deferred reduction becomes a reduction proper. Sequences compound as described.

Stacking states to sublimate or cultivate an emotrait leads me to etiolate.

Emotrait: a portmanteau from emotion and trait (L. tractus, trahere, to draw),
          signifying what is thought of an individual when judging by emotion.
Etiolate: to blanch, or become white; as, by cloistering away from the sun.

IDK whether there is precisely twenty percent more QL by now, but I've tightened
the parser code and QL is now better than before. Nothing much new in MLPTK: I
fixed a few typos; that's about all. Today's installment is, courtesy MediaFire:

https://www.mediafire.com/?5my42sl41rywzsg (http://www.mediafire.com/download/5my42sl41rywzsg/mlptk-qlds14mar2016.zip)

As usual, my work is free of charge within the Public Domain.

The parser's state logic is exactly the same as it was before; but, in addition
to the changes I described (extremum-input interface differentiation) two posts
ago in the foregoing brief, I've altered the format of sequence reductions such
that the argument vector is now two args wide, comprising the interface triplet
(a parser stack node, a thread index, and a file handle into the thread) and a
reference to the parser object. Oh, and I depreciated the "mate" and "stop"
associativity specifiers, because making them work the way I wanted would have
been a nightmare.

And, even though Luna feels better after eating forty Snickers, that's terrible.

Anyway, let me bend your ear (eye?) for a while about the interface and handles.

The parse stack is more slender, but mostly the same, except that nodes are now
Arrays whose members point backward and forward in the stack. The parser's input
is stored (although I call this sometimes, erroneously, "buffering;" buffers do
not work that way) in a data structure named the input stream, which is a random
access virtual file that stores each token as an array element in itself. Again,
the input stream is not a buffer; it is a file that grows in size as tokens are
unshifted. This makes it unsuitable for very large inputs. I'll fix it soon. For
now, you'll have to put up with the memory leak. Maybe you can find some use for
it as a volatile file system or something, but it's useless as a buffer; what it
should have been in the first place. To fix the problem shall require only that
the object is made simpler, so I expect to have it improved by next update. In
the meantime, it functions by mapping an Array to a vector table whose indices
correspond to the Nth item in the data and whose values are the indices of that
item in the Array (which has certainly become fragmented).

That's about all that's developed since last time. As you can see @ hattenArDin,
I'm crafting QL as a quasi static code generator, with fewer heuristics. Instead
of storing within the parser stack nodes the information necessary to transition
the parser's state, Hatten walks the state map to determine exactly what symbols
transition, block, or reduce and whether they are right-associative. I could
also have done this for the precedence decision, but that would require that the
parser generator creates a significant number of additional states: like about
the number of symbols whose precedence is specified multiplied by the number of
symbols that can possibly occur anywhere in any sequence. In other words, such a
computational speed gain (one if-statement) would be about the square of present
memory consumption, or vaguely proximal to that. So that design decision was due
to my failure to ponder the math before writing, and not due to impossibility.
I'll work that problem, too, before I think I'm done improving the blueprint.

I could excerpt some code and walk you through it or something, but it is plain
language to me, and I have no idea how to make it any plainer in English unless
someone asks a question. I think WordPress is set to email me if you comment on
one of my posts, so I should see your letters if they ever arrive.

And, sadly, the tightened code is yet again "sorta functional." If you require a
testbed for your grammar, refer to the test sequence in clit.js & ::toTheLimit()
in quadrare-lexema.js; both of which work well enough, except for recursion.

Tentatively: expect the New & Improved Quadrare Lexema sometime around June.

Stacking states on dated stacks to reinstate Of Stack & State: first rate?

I have accomplished the first major portion of the code I set out to write last
Christmas. My milestone for Quadrare Lexema, first quarter 2016, is available
at MediaFire here:

http://www.mediafire.com/download/26qa4xnidxaoq46/mlptk-2016q1-typeset-full.zip

It's a bit larger than my other snapshots this year, because I included another
full PDF typesetting of my software (which, today, runs into hundreds of pages)
and two audio recordings evoking my most beautiful childhood memories. These
added a few Mbytes. I'll remove them in the next edition.
(No, really; Tuna Loaf truly does encode smaller as a PCM .WAV than an MP3.)

Don't bother downloading this one unless you really want to see how QL's shaping
up or need a preformatted typesetting of progress since the prospectus. There is
nothing much new: all my work in the first quarter has been in Quadrare Lexema &
these updates, and the command line interpreter upgrade isn't yet operational.

My source code is, as usual, free of charge and Public Domain.
Some of the graphics and frills don't belong to me, and I plan to remove them
(either at some unspecified future time or upon any such notification/demand),
but please restrict yourself to my signed code if copying & pasting for profit.

Today's edition of MLPTK includes an upgraded QL. I haven't had time to overhaul
yet, but I did manage to shoehorn in a somewhat improved parser mechanism. This
should make things less of a nightmare for those of you presently experimenting
with the beta version of QL, which is at least a _stable_ beta by now.

Of particular interest are MLPTK's text-mode console, which I employed to debug
Quadrare Lexema in alpha, and which is now stable/maintenance (doesn't crash);
and QL itself, now a stable Beta (doesn't crash much).
My tentative additions to the Bison parser's "left/right" associativity model &
their demonstration in the Command Line Interpretation Translator's library file
are of most interest to experienced programmers. When I've finished writing the
manuscript in its native language, I'll write an additional reference sheet in
English for the novitiate. (See the old AFURACC supplemental schematic reference
for an idea of what this shall look like.)

I've altered QL's schematic slightly, to permit the programmer (that's you!) to
encode a minor degree of context in your grammar, and my code is so easy to read
& alter that you can really have a field day with this thing if you've the mind.

Several of my foregoing briefs, which cover some of these already; consider also
the Blargument rule, which is what makes the CLIT so fascinating. Utilizing QL's
mechanism to unshift multiple tokens back onto the input stream when executing a
reduction (which is really remarkably clever; because, IIRC, Bison can't unshift
multiple tokens during a reduction deferment, so multiple-symbol productions are
very difficult to achieve by using the industry-standard technology), Blargument
consumes 1 less token than it "actually" does by unshifting the trailing token.
Because of how I wrote QL, the parser differentiates its input stream (virtually
or actually) each time a reduction produces symbols; so, even when a deferred
reduction pushes symbols back onto the input stream, these don't interfere with
any of the other deferred reductions (which exist as though the input stream was
not modified by anything that didn't happen during their "timeline").

In addition, my upgrade to QL's programming interface permits differentiation at
any point within the input stream. Although somewhat wobbly, this serves to show
how it's possible for even a deferred computation to polymorph the code - before
the parser has even arrived there, & without affecting any of the other quanta.
Or at least it will, if I can figure out how to apply it before the next time I
overhaul the code; otherwise, I think I'll drop it from the schematic, because
this input stream metaphor should probably be optimized due to large inputs.

The old Capsule Parser generator is still there, via the ::toTheLimit() method,
and the upgraded Ettercap Parser is generated by ::hattenArDin().

Here is an abridged comparison & contrast:
1. Stack nodes are generic; their place taken by hard-coded transition methods.
   Some alterations to stack node constructor argument vector, algorithm.
   There is more hard code and less heuristics. Actually, the whole parser could
   be generated as static code, and I hope to implement that sometime this year.
2. Input is now a linked list; previously an Array.
   Arrays of tokens are preprocessed to generate this list, which functions as
   a stream buffer with a sequential-access read head.
3. The input stream links forward and the parse stack links backward.
4. Stack extrema now interface with the input stream differentially; that is,
   the input stream itself is differentiated (obviating recursion), instead of
   the recursive virtual differentiation that occurred before.
5. Differentiation of the input stream occurs actually, not virtually, and can
   be programmed at any point in the input stream (although it's such a pain).

Here's an illustration of the extremum-to-read-head interface juncture:

                    -> [ EXTREMUM A ] -> [ READ HEAD A ] -> [ DIFFERENTIAL A ]
                   /                                               v
 -> [ PARSE STACK ] -> [ EXTREMUM B ] -> [ READ HEAD B ] -> [ INPUT BUFFER ] ->
                   \                                               ^
                    -> [ EXTREMUM C ] -> [ READ HEAD C ] -> [ DIFFERENTIAL C ]

Actually it is a little more complicated than that, but that is the basic idea.
As you can see, my parser models a Turing machine that sort of unravels itself
at the interface between what it has already seen and what it presently sees.
Then it zips itself back up when reduce/reduce conflicts resolve themselves.
I imagine it like a strand of DNA that is reconstituting itself in karyokinesis.

The extrema A through C are, actually, probably exactly the same extremum.
However, when the next parse iteration steps forward to the location within the
input stream that is pointed-to by the read head, the parse stack branches; so,
it's easiest to imagine that the extrema have already branched at the point of
interface, because if they haven't branched yet then they are about to.

Here's an illustration of timeline differentiation within the input stream:

                        -> [ DIFFERENTIAL A ] -.       -> [ DIFFERENTIAL C ]
                       /                   v---¯     /                       \
    -> [ READ HEAD(S) ] -> [ UNMODIFIED INPUT ] -> [ TOKEN W ] -> [ TOKEN Z ] ->
                       \                   ^-----------------------------------.
                        -> [ DIFFERENTIAL B ] -> [ DIFFERENTIAL B CONTINUES ] -¯

The parser selects differentials based on either of two criteria: the read head,
which points at the "beginning" of the input stream buffer as it is perceived by
the extremum in that timeline; or a timeline identifier ("thread ID") that tells
the algorithm when it should, for example, jump from token W to differential C
and therefore skip token Z. Thus, multiple inputs are "the same program."
Again: parser state becomes unzipped at differential points, then re-zips itself
after the differentiated timelines have resolved themselves by eliminating those
who encounter a parse error until only one possible timeline remains.

So your code can polymorph itself. In fact, it can polymorph itself as much as
you want it to, because the parser returns ambiguous parses as long as they have
produced the origin symbol during its finalization phase. However, I haven't yet
tested this and it probably causes more difficulty than it alleviates.

However, if you've experimented with AFURACC, you might be disappointed to see
that I have failed to write an interface for you to polymorph the grammar at run
time using QL. Probably that's possible to do, but the scaffold (as it is at the
moment) is rigid and inflexible. I want to work it into a future revision; but,
because a heuristic parser is more-or-less sine qua non to the whole idea of a
polymorphic parser, I'm pretty sure I'll stick to just polymorphing the code.

As I work to reduce the time complexity of the parser algorithm, similarity to
the Free Software Foundation's compiler-compiler, GNU Bison, becomes pronounced.
I'm trying to add some new ideas, but Bison is pretty much perfect as it is, so
I haven't had much luck - things that aren't Bison seem to break or complicate.
I still haven't heard anything from the FSF about this, so I guess it's OK?

However, the scaffold seems stable, so I'll work on the command-line interpreter
for a while and formalize its grammar. After that, probably yet another improved
static code generator, and if that doesn't malfunction I'll try video games IDK.

I intend to complete the formal commandline grammar within second quarter 2016.
Goals: command substitution, javascript substitution (a la 'csh'), environment
variables and shell parameter expansion, and shell arithmetic.
Additionally, if there's time, better loops and a shell scripting language.

Om. Wherefore ore? Om. Om. Ore… art thou ore?

Dynamic recompilers: gold? Maybe. But what is the significance of recompilation
in the fast-paced, high-tech world of today? Does, indeed, my idea present any
useful information -- or am I merely setting forth in search of lost CPU time?
What am I doing? What am I saying? Where am I going? What am I computing? I've
built the swing, but what about the swang? After the swang, whither the swung?
(Paraphrased from sketches performed in "Monty Python's Flying Circus.")

This lecture provides, in abstract, my concept design of a specialized Turing
machine: the compound compiler/disassembler. It is feasible via my published
work, Quadrare Lexema, which is in the public domain as is my essay.

(Actually, I think these are more appropriately called "dynamic recompilers.")

If you're just joining me, Quadrare Lexema is a Bison-like parser generator
written in the Javascript programming language for use in Web browsers (FF3.6).
It is portable to any browser employing a modern, standards-compliant Javascript
interpreter; such as Chromium, Chrome, Opera, and (hypothetically) Konqueror.
I have not yet had an opportunity to acquire a copy of Microsoft's Web browser,
Internet Explorer, because I've had no opportunity to acquire a copy of Windows
(and couldn't pay for it, even if I did). QL produces a parser like GNU Bison;
which is to say, an LALR parser for Backus-Naur Format context-free grammars.
For more information, visit the Free Software Foundation's Web site; and/or the
Wikipedia articles treating Backus-Naur Format, regular grammars, LR parsers,
and Turing machines.

Because I have already described my work with the hundreds of pages of code and
the few pages of abstract condensed lectures, this is curtate. However, observe
that my foregoing proof constructs the basis upon which this concept rests.
Besides my work, bachelors of computer science worldwide have demonstrated in
their own dissertations similar proofs to that I have presented you by now.

Again: concept, not implementation. Toward that end lies the specification of a
Turing-complete virtual machine, wherein there be dragons. I will write you one
soon, but I am pending some subsequent blueprints on my acquisition of better
laboratory equipment. In the meantime, I'm pretty sure you can write one w/ QL.

Obviously: a compiler translates the syntax of a human-interface language into
machine instructions; parser generators can be employed to construct compilers;
and such generated parsers can be attached to other parsers in a daisy-chain.

So, make a compiler that converts the abstract instructions it reads into some
progressively simpler set of other instructions, by unshifting multiple tokens
back onto the input stream during the reduction step. You could either read them
again and thereby convert them to yet other computations, or simply skip-ahead
within the input stream until it has been entirely disassembled.

Consider: you're writing an artificial intelligence script, somewhat like this:
	FOR EACH actor IN village:
		WAKE actor,
		actor ACTS SCRIPT FOR TIME OF DAY IN GAME WORLD.
(This syntax is very similar to the A.I. language specified for the Official
 Hamster Republic Role-Playing Game Creation Engine (O.H.R.RPG.C.E.).)
Your context-free grammar is walking along der strasse minding its own business;
when, suddenly, to your surprise, it encounters the ambiguous TIME OF DAY symbol
and the parser does the mash. It does the monster mash. It does the mash, and it
is indeed a graveyard smash in the sense that your carefully-crafted artificial
intelligence -- the very stuff and pith of whose being you have painstakingly
specified as a set of atomic operations in an excruciating mathematical grammar
-- has now encountered a fatal syntax error and you would like nothing better
than to percuss the computer until it does what you say because you've used each
!@#$ free minute learning how to program this thing for the past fifteen !@#$
years and God Forfend(TM) that it should deny you the fruit of your labor.

Well, actually, the OHRRPGCE does solve that problem for you, which is nice; and
maybe it even solves it in exactly this way, which is cool; but I'll continue...
Let's say your parser lets you unshift multiple symbols upon reduction, like QL
(or like any parser that does a similar thing, of which there are surely many);
then "TIME OF DAY" could reduce to an algorithm that does nothing more than find
out what time of day it is and then unshift a symbol corresponding to the script
that most closely matches what time of day it is.

In other words, you've specified the possible computations that the language can
execute as a set of atomic operations, then decomposed the abstract instructions
into those atomic operations before executing them. The functionality is similar
to that of the C preprocessor's #include directive, which directs the CPP to add
the source code from the #included header verbatim like a copy-paste operation.
That's the most straightforward application of this thought, anyway: copy-and-
pasting artificial intelligence scripts into other AI scripts.

Another thought: let's say your compiler is, in addition to its named capacity,
also a virtual machine that's keeping track of state in some artificial world.
So, it's two virtual machines in one. Rather silly, I know, but why not follow
me a bit deeper into the Turing tar-pit where everything is possible and nothing
is NP-complete? After all, "What if?" makes everything meaningless, like magic.

So, again, it's trucking along and hits some instruction like "SLAY THE DRAGON;"
but the dragon is Pete, who cleverly hides from the unworthy. Now, you could say
just ignore this instruction and carry on with the rest of the script, and maybe
even try to slay the dragon again later after you have slung Chalupas out the TB
drive-through window for a few hours. You could even say travel back in time and
slay the dragon while there's yet a chance -- because wholesale violation of the
law of causality is okay when you can't afford to lose your heroic street cred.
But why not check the part of the script you've yet to execute, to see if maybe
there's something else you need to do while you're here, and then try doing it
while you keep an eye out in case Pete shows up?
I mean, you could write that check as part of the "SLAY" atom; or as part of the
next action in your script that happens after the "SLAY" atom; but why not write
it as part of what happens before the "SLAY" atom even executes?

Also, what if you wanted the compiler to try two entirely different sequences of
symbols upon computation of a symbol reduction? Or one reduction modifies the
input stream somehow, while another does not? Or to reorganize/polymorph itself?

All this and more is certainly possible. QL does a big chunk of it the wrong way
by gluing-together all sorts of stuff that could have been done by making your
symbols each a parser instead. Oh, and iterators in Icon also do it better. And
let us not forget the honorable Haskell, whose compiler hails from Scotland and
is nae so much a man as a blancmange. "Learn you a Haskell for great good!" --
this is both the title of a book and an amusing pearl of wisdom.

The only thing that bothers me about this idea is that the procedure necessary
to polymorph the code at runtime seems to be (a) something that I ought to be
doing by nesting context-free parsers, rather than writing a contextual one; and
(b) probably requires a heuristic algorithm, which would be awfully slow.

I haven't developed this idea very well and there's some space left.
Here is a verbatim excerpt with an example Haskell algorithm, for no reason:

  From The Free On-line Dictionary of Computing (30 January 2010) [foldoc]:

  Quicksort
  
     A sorting {algorithm} with O(n log n) average time
     {complexity}.
  
     One element, x of the list to be sorted is chosen and the
     other elements are split into those elements less than x and
     those greater than or equal to x.  These two lists are then
     sorted {recursive}ly using the same algorithm until there is
     only one element in each list, at which point the sublists are
     recursively recombined in order yielding the sorted list.
  
     This can be written in {Haskell}:
  
        qsort               :: Ord a => [a] -> [a]
        qsort []             = []
        qsort (x:xs)         = qsort [ u | u<-xs, u<x ] ++
                               [ x ] ++
                               qsort [ u | u<-xs, u>=x ]

     [Mark Jones, Gofer prelude.]

Now that I've burst my metaphorical payload, in regard to theory, I think future
lectures shall be of a mind to walk novices through some of the smaller bits and
pieces of the algorithms I've already written. There aren't many, but they're a
real pain to translate from source code to English (which is why I haven't yet).
I might also try to explain QL in further detail, but IDK if I can make it much
more straightforward than it is (no questions == no idea if you comprehend)...
and, besides, I've already written more than ten pages about it & it's tedious.

Depending on how much Pokemon I must play to clear my head after overhauling the
parser this year, and how much time is left after that to finish the shell's new
scripting language, I'll aim to write some of those simplified lectures by July.