Pall (n.): pawl.

I couldn't write last week, and my upgrade to QL has progressed no further. For reference, I stalled before comparing the efficiency of nested Objects to that of nested Arrays, which I must test before experimenting further with the prototype compiler or even refining the design. I intend to do that this month. In the meantime, here's a snapshot of MLPTK with new experiments included:

http://www.mediafire.com/download/566ln3t1bc5jujp/mlptk-p9k-08apr2016.zip

And a correction to my brief about the grammar ("Saddlebread"): actually, the InchoateConjugation sequence does not cause differentiation, because the OP_CAT prevents the original from reducing. Other parts may be inaccurate. I'll revise the grammar brief and post a new one as soon as I have fixed the QL speed bug.

I took some time out from writing Quadrare Lexema to write some code I've been meaning to write for a very long time: pal9000, the dissociated companion. This software design is remarkably similar to the venerable "Eggdrop," whose C source code is available for download at various locations on the Internets. Obviously, my code is free and within the Public Domain (as open as open source can be); you can find pal9000 bundled with today's edition of MLPTK, beneath the /reference/ directory.

The chatbot is a hardy perennial computer program. People sometimes say chatbots are artificial intelligence; they aren't, exactly, or at least this one isn't, because it doesn't know where it is or what it's doing (actually, it makes some assumptions about itself that are perfectly wrong), and it doesn't apply the compiler-like technique of categorical learning, because I half-baked the project. Soon, though, I hope...

Nevertheless, mathematics allows us to simulate natural language. Even a simplistic algorithm like Dissociated Press (see the "Internet Jargon File," maintained somewhere on the World Wide Web, possibly at Thyrsus Enterprises by Eric Steven Raymond) can produce humanoid phrases that read like real writing. Where DisPress fails, naturally, is paragraphs and coherence: as you'll see if you research it, it loses track of what it was saying after a few words. Of course, that can be alleviated with any number of clever tricks, such as:

1. Use a compiler.
2. Use a compiler.
3. Use a compiler.

I haven't done that with p9k yet, but you can if you want.

Of meaningful significance to chat robots is the Markov chain. That is a mathematical model, used to describe some physical processes (such as diffusion), describing a state machine in which the probability of the next state depends only on the current state of the system, without regard to how that state was encountered. Natural language, especially language that occurs during a dream state or drugged rhapsody (too often, and with malicious intent, misinterpreted as the ravings of madmen), can also be modeled with something like a Markov chain, because of the diffusive nature of tangential thought. The Markov-chain chat robot applies the principle that the state of a finite automaton can be described in terms of a set of states foregoing the present; that is, the state of the machine is a sliding window, in which is recorded some number of states that were encountered before the state existent at the moment. Each such state is a word (or phrase / sentence / paragraph, if you fancy a more precise approach to artificial intelligence), and the words are strung together one after another with respect to the few words that fit in the sliding window. So, it's sort of like a compression algorithm in reverse, and similar to the way we memorize concepts by relating them to other concepts. "It's a brain. Sorta."
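Here's a minimal sketch, in JavaScript, of the sliding-window bookkeeping I just described. (The names are mine, invented for illustration; pal9000's actual code differs.)

    // Map each window of N consecutive words to the words observed after it.
    function buildModel(text, N) {
        var words = text.split(/\s+/);
        var model = Object.create(null); // bare object: no prototype keys in the way
        for (var i = 0; i + N < words.length; i++) {
            var state = words.slice(i, i + N).join(" "); // the sliding window
            if (!model[state]) model[state] = [];
            model[state].push(words[i + N]); // a successor of this state
        }
        return model;
    }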
One problem with Markov robots, and another reason why compilers are of import in the scientific examination of artificial intelligence, is that of bananas. The Banana Problem describes the fact that, when a Markov chain is traversed, it "forgets" what state it occupied before the sliding window moved. Therefore, for any window of width W < 6, the input B A N A N A first produces state B, then states A and N alternating forever. Obviously, the Banana Problem can be solved by widening the window; however, if you do that, the automaton's memory consumption increases proportionately.

Additionally, very long inputs tend to throw a Markov-'bot for a loop. You can sorta fix this by increasing the width of the sliding window signifying which state the automaton presently occupies, but then you run into problems when the sliding window is too big and the 'bot can't think of any suitable phrase, because no known window (phrase corresponding to the decision tree's depth) fits the trailing portion of the input. It's a sticky problem, which is why I mentioned compilers. They're of import to artificial intelligence, which is news to absolutely no one, because compilers (and grammar, generally) describe everything we know about the learning process of everyone on Earth: namely, that intelligent beings construct semantic meaning by observing their environments and deducing progressively more abstract ideas via synthesis of observations with abstractions already deduced.

Nevertheless, you'd be hard-pressed to find even a simple random-walk chatbot that isn't at least amusing. (See the "dp" module in MLPTK, which implements the vanilla DisPress algorithm.) My chatbot, pal9000, is inspired by the Dissociated Press & Eggdrop algorithms, the copyrights of which are held by their authors, who aren't me. Although p9k was crafted with regard only to the mathematics and not the code, if my work is an infringement, I'd be happy to expunge it if you want.

Dissociated Press works like this:

1. Print the first N words (letters? phonemes?) of a body of text.
2. Then, search for a random occurrence of a word in the corpus which follows the most recently printed N words, and print it.
3. Ad potentially infinitum, where "last N words" are round-robin.

It is random: therefore, humorously disjointed. (A sketch of this loop appears below.)

And Eggdrop works like this (AFAICR):

1. For a given coherence factor, N:
2. Build a decision tree of depth N from a body of text.
3. Then, for a given input text:
4. Feed the input to the decision tree (mmm, roots), and then
5. Print the least likely response to follow the last N words, by applying the Dissociated Press algorithm non-randomly.
6. Terminate the response after its length exceeds some threshold, the exact computation of which I can't recall at the moment.

It is not random: therefore, eerily humanoid. (Cue theremin riff, thundercrash.)

A compiler, such as I imagined above, could probably employ sliding windows (of width N) to isolate recurring phrases or sentences. Thereby it may automatically learn how to construct meaningful language without human interaction.
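As promised, the DisPress loop as a toy in JavaScript, reusing the buildModel() sketch from above. (The genuine article is MLPTK's "dp" module; this is only a paraphrase, and the names are mine.)

    // Dissociated Press, approximately: emit the first N seed words, then
    // repeatedly append a random recorded successor of the last N words.
    function dispress(model, seed, N, limit) {
        var out = seed.split(/\s+/).slice(0, N);
        while (out.length < limit) {
            var state = out.slice(out.length - N).join(" "); // "last N words," round-robin
            var successors = model[state];
            if (!successors) break; // no known window fits; the 'bot falls silent
            out.push(successors[Math.floor(Math.random() * successors.length)]);
        }
        return out.join(" ");
    }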
Although I think you'll agree that the simplistic method is pretty effective on its own; notwithstanding, I'll experiment with a learning design once I've developed QL's code generation method sufficiently that it can translate itself to Python. Or possibly I'll nick one of the Python compiler-compilers that already exist. (Although that would take all the fun out of it.)

I'll parsimoniously describe how pal9000 blends the two. First of all, it doesn't (not exactly), but it's close. Pal9000 learns the exact words you input, then generates a response within some extinction threshold, with a sliding window whose width is variable and bounded. Its response is bounded by a maximum length (to solve the Banana Problem). Because it must by some means know when a response ends "properly," it also counts the newline character as a word. The foregoing are departures from Eggdrop. It also learns from itself (to avoid saying something twice), as does Eggdrop.

In addition, p9k's response isn't necessarily random. If you use the database I included, or choose the experimental "generator" response method, p9k produces a response that is simply the most surprising word it encountered subsequent to the preceding state chain (sketched below). This produces responses more often, and they are closer to something you said before, but of course this is far less surprising and therefore less amusing. The classical Eggdrop method takes a bit longer to generate any reply; but, when it does, it drinks Dos Equis. ... Uh, I mean... when it does, the reply is more likely to be worth reading. After I have experimented to my satisfaction, I'll switch the response method back to the classic Eggdrop algorithm. Until then, if you'd prefer the Eggdrop experience, you must delete the included database, regenerate it with the default values, and input a screenplay or something. I think Eggdrop's Web site has the script for Alien, if you want to use that. Game over, man; game over!

In case you're curious, the algorithmic time complexity of PAL 9000 is somewhere in the ballpark of O(((1 + MAX_COHERENCE - MIN_COHERENCE) * N) ^ X) per reply, where N is every unique word ever learnt and X is the extinction threshold. "It's _SLOW._" It asymptotically approaches O(1) in the best case. For additional detail, consult /mlptk/reference/PAL9000/readme.txt.

Pal9000 is a prototypical design that implements some strange ideas about how, exactly, a Markov-'bot should work. As such, some parts are nonfunctional (or, indeed, actually malfunction) and vestigial. "Oops... I broke the algorithm." While designing, I altered multiple good ideas that Eggdrop and DisPress did right the first time, and made the whole thing worse on the whole. For a more classical computer science dish, try downloading & compiling Eggdrop.
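If you're wondering what "most surprising" means in code, it amounts to something like the following sketch. (Hedge: pal9000's real selection logic is more involved; consult /mlptk/reference/PAL9000/readme.txt. Here, tallies is an assumed map from each successor word to the number of times it followed the current window.)

    // Choose the least likely continuation ever observed after a window:
    // the rarest word is the most surprising, hence the most amusing.
    function mostSurprising(tallies) {
        var best = null, fewest = Infinity;
        for (var word in tallies) {
            if (tallies[word] < fewest) {
                fewest = tallies[word];
                best = word;
            }
        }
        return best;
    }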
Breaking news: according to Shape magazine (March, 2016; volume 35, no. 6), which incorporates Fitness magazine, forty winks shouldn’t be.
Mirel Ketchiff writes: “enlightening new research is challenging [the notion, suggested by the National Sleep Foundation, that we need eight hours of sleep every night].” This enlightening new research indicates that prehistoric cavemen slept six and a half hours each night (possibly because they couldn’t get to sleep while the stalactites dripped on their heads); how anthropologists learnt the crepuscular habits of people who existed before the advent of recorded history is, evidently, left as an exercise to the reader.
Exactly how much sleep does anyone need, anyway? Someone once told me that children need about ten hours a night. Then the National Sleep Foundation told me that adolescents and adults require about eight. Now Shape magazine says I need six and a half. What’ll it be next? Maybe I don’t need any sleep at all! Methamphetamine addicts have been known not to sleep for extended periods of time, and to become fashionably slender no matter how gluttonous their eating habits. Perhaps that is fitness, Fitness?
Soon we can all abandon our outmoded, unfashionable and inefficient nightly nonce of unconsciousness. Ascending from our benighted evolution, we’ll first return to our prehistoric habits (as though we ought ever to have abandoned them in the first place), and then do away with sleep altogether. Employing methamphetamine and a thousand other compounds we’re taught in school are bad for our bodies and minds, we’ll become Übermensch — harder, better, faster, and stronger than those other Nation-Brands.
Now unencumbered by our need to rest our minds each night so that we can demarcate the border line of fantasy and reality (and, vicariously, of right and wrong), and thoroughly brain-damaged as a result, we’ll spring forward into a new age of crime, misconduct, and rampant procreation.
Promiscuity is a citizen’s duty.
Beware: when speaking to Trolls, listen carefully. It could save your ice hole. (Trolls are known for their skill in wintertime fishing.)

My software, courtesy of MediaFire: https://www.mediafire.com/?ebbyj47e35mmytg (http://www.mediafire.com/download/ebbyj47e35mmytg/mlptk-qlupgradecomplete-awaitingspeedfix-27mar2016.zip)

This update (unlike the foregoing, which fixed the Opera bug) regards only QL. Again: Quadrare Lexema is similar to GNU Bison. If it is an infringement, I'll delete it as soon as I hear or see from you. BTW, Bison is time-tested; QL isn't.

Oh, and a correction to "First Rate?": Bison can indeed unshift multiple tokens back onto its input stream, even though productions can't be multiple symbols in length, by unshifting directly onto its input during reduction (which is how QL does it too, during deferment; this amounts to the same exact thing, because no reason exists to compute anything during deferment -- otherwise, there'd be more race conditions than the Kentucky Derby, which is very silly).

QL is now "kinda-sorta" debugged and functioning to my specification, AFAICT. Now the API has changed considerably from how it was last month (the argument vector to reduction deferment class constructors has been modified, some new faculties now exist, and some were removed); this necessitated additional instantiations of "new Array()," and intolerably reduces efficiency when operating on very long inputs, but I wanted to hurry up this design iteration. (That was one sentence.) The token agglutination mechanism of the parser logic is the same as before. Code to determine precedence & blocking has been abridged; these are toddling steps toward a method to generate the parser as in-line code. (As you see, that isn't done yet.)

I'm tempted to redo the infrastructure to reduce the number of new Array()s that are instantiated during the parse phase, but I'm pretty sure I can do that by rewriting the underlying code without changing the API. The interface between the parser's stack extremum and its input stream is passed to reductions as an Array(), but that doesn't mean it always has to be allocated anew.

Remember: the old Command Line Interpretation Translator scaffold isn't decided; I left ::toTheLimit() where it was, pending a ::hatten() that shall suit you; if you'd like to use the horrifying monstrosity that is my software architecture, you can see Mr. Skeleton awaiting you in clit.js -- asking where is his flesh, & rapidly becoming impatient with my poking and prodding him all day. Soon, Mr. Skeleton; soon shall be the day when the hat is yours at last, & your calcareous projection from within my library becomes a fully fledged automaton unto itself.

For the meantime, I'm satisfied with the half-measure. I think the API is okay to start building upon, so I'll start building. Overhaul of the back-end late this year or early in the next, & it's looking good for me to furnish the CLIT before the third quarter. Therefore I'd say: expect full CLIT functionality in 2016.

Before I apprise you of my progress so far, let's take a moment for a thoroughly detailed technical analysis of Mr. Skeleton's bony protrusion.

Phoneme <= (EscapeSequence | AnyCharacter | Number | String) (EscapeSequence | AnyCharacter | Number | String | Phoneme | )
    Concatenate a "word" that is one argument in an argument vector.

ISLDQ <= '\"'
    Open a <String>.

InchoateString <= (ISLDQ | InchoateString) (OP_CAT | OP_CONJ | EscapeSequence | AnyCharacter | Number | Space)
    Make strings out of any symbol following an open string.
    (As you can see, this rule must be rewritten...)

String <= InchoateString '\"'
    Close a <String>.

Argument <= Phoneme (Space | ) | Argument Argument
    Concatenate the argument vector comprising an executable MLPTK command. That bit with "(Space | )" should probably be just "Space".

Catenation <= (Argument | Group | Conjugation) OP_CAT
    Concatenate the output of commands.

MalformedGroupCohesion <= (Argument | Group | Conjugation) OP_CLPAR
    Automatically correct the user's malformed syntax where the last command in a parenthetical sub-grouping was not followed by a ";".

ExecutableInchoateConjugation <= Argument OP_CONJ | Blargument
    Signify that a command can be executed as part of a <Conjugation>.

InchoateConjugation <= Group OP_CONJ | Conjugation
    Convert a conjugated <Group>, or the output of a <Conjugation>, to an <InchoateConjugation> token that can form the left-hand part of a further <Conjugation>. This reduction causes parser stack differentiation, because it conflicts with "Catenation <= Conjugation OP_CAT". In that circumstance, the sequence "<Conjugation> <OP_CAT> ..." is both a "<Catenation> ..." and an "<InchoateConjugation> <OP_CAT> ...". Observe that the latter always produces a syntax error. I'm pretty sure I could rewrite the grammar of the <Conjugation> rule to fix this; IDK why I didn't. (Maybe a bug elsewhere makes it impossible.)

Conjugation <= (ExecutableInchoateConjugation | InchoateConjugation) ExecutableInchoateConjugation
    Execute the command in the <ExecutableInchoateConjugation> at right, supplying on its standard input the standard output of that at left.

InchoateGroup <= (OP_OPPAR | InchoateGroup) Catenation
    Concatenate the contents of a parenthesized command sub-grouping.

Group <= InchoateGroup (OP_CLPAR | MalformedGroupCohesion)
    Close an <InchoateGroup>. Concatenate the contents of a <MalformedGroupCohesion> if it trailed.

CommandLine <= (CommandLine | ) Catenation
    Concatenate the output of <Catenation>s into one Array. This one actually doesn't differentiate. Either a <CommandLine> waits at left to consume a <Catenation> when it reduces, or something else does, & <Catenation>s in mid-parse never reduce to <CommandLine>s except when fatal syntax errors occur, in which case the parser belches brimstone.

Blargument <= Argument (OP_CAT | OP_CLPAR)
    Duplicate the trailing concatenation operator or close parenthesis following an <Argument>, so that a <Conjugation> doesn't conflict with a <Catenation> or a <MalformedGroupCohesion>. I think this can be specified formally in a proper grammar, without the multiple-symbol unshift, but IDK how just yet -- because (without lookahead) the parser can't know when the argument vector ends without seeing a trailing operator, so execution of the last command in the conjugation sequence <InchoateConjugation> <Argument> <OP_CAT> would occur when <Argument> <OP_CAT> reduces & executes, disregarding its standard input (the contents of the foregoing <InchoateConjugation>). "Blargument <= Argument" can never happen, and "ExecutableInchoateConjugation <= Argument" would grab the <Argument> before it could concatenate with the next <Argument>, so I'm at a loss for how I should accomplish this formally. BTW, <Blargument> is the <WalkingBassline>, with trivial alterations. The <Blargument> reduction causes parser stack differentiation, because it conflicts with both <Catenation> and <MalformedGroupCohesion>.
In either case, the <Blargument> branch encounters a syntax error & disappears when <Blargument> didn't immediately follow an inchoate conjugation; the other branch disappears in the inverse circumstance.

Token identifier    Operator precedence    Associativity
"Space"             2                      "wrong"
"OP_CAT"            0                      "wrong"
"EscapeSequence"    0                      "right"
"AnyCharacter"      0                      "right"
"Number"            0                      "right"
"String"            0                      "right"
"OP_OPPAR"          0                      "left"
"OP_CLPAR"          0                      "wrong"
"OP_CONJ"           0                      "wrong"
"QL_FINALIZE"       0                      "wrong"

(The sequence to the right of a right-associative token is reduced first, except when wrong-associativity forces QL to reduce the right-associative sequence. The same table appears as a JavaScript literal at the end of this post.)

The avid reader shall observe that my "wrong-associativity" specifier, when used to define runs of right-associative tokens that stick to one another, is similar to the lexical comparison & matching algorithm of a lexical analyzer (scanner). In fact, as written, it _is_ a scanner. For an amusing diversion, try excerpting the portions of the semantical analyzer (parser) that can be made into lexical analyzer rules, then put them into the scanner; or wait a few months and I will.

But that's enough bony baloney. If you preferred Mr. Skeleton as he was, see mlptk/old/clit.js.21mar2016.

As of probably a few days before I posted this brief, my upgrade to QL is now sufficient to function vice its predecessor. I spruced up Mr. Skeleton, so that the test scaffold in clit.js now functions with ::hattenArDin() in QL, and now everything looks all ready to go for shell arithmetic & such frills as that.

Ironically, it seems that I've made it slower by attempting to make it faster. I should have known better. I'll try to fix the speed problem soon; however, until then, I've abandoned work on the Translator due to intolerable slowness. I'm sorry for the inconvenience. If it's any help, I think the problem is due to too many new Array() invocations or too many nested Arrays, one of the two. Either way, I intend to fix it this year by rewriting the whole parser (again) as a static code generator.

I was also thinking of writing a windowing operating system for EmotoSys, but I am uncertain how or whether to put clickable windows in the text console. I mean -- it'd be simple now that QL's rolling, but maybe a more Spartan design? I will post some flowcharts here when I've exhausted my ministrations to the CLIT.
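As promised above, the token table again, transliterated into a JavaScript literal in case the tabular form is hard on the eyes. (A paraphrase for readability; this is not QL's actual constructor argument format.)

    // The token table, as data: "wrong" associativity makes runs of
    // right-associative tokens stick together, scanner-fashion.
    var tokens = [
        { id: "Space",          precedence: 2, associativity: "wrong" },
        { id: "OP_CAT",         precedence: 0, associativity: "wrong" },
        { id: "EscapeSequence", precedence: 0, associativity: "right" },
        { id: "AnyCharacter",   precedence: 0, associativity: "right" },
        { id: "Number",         precedence: 0, associativity: "right" },
        { id: "String",         precedence: 0, associativity: "right" },
        { id: "OP_OPPAR",       precedence: 0, associativity: "left"  },
        { id: "OP_CLPAR",       precedence: 0, associativity: "wrong" },
        { id: "OP_CONJ",        precedence: 0, associativity: "wrong" },
        { id: "QL_FINALIZE",    precedence: 0, associativity: "wrong" }
    ];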
Emotrait: a portmanteau of emotion and trait (L. tractus, trahere, to draw), signifying what is thought of an individual when judging by emotion. Etiolate: to blanch, or become white; as, by cloistering away from the sun.

IDK whether there is precisely twenty percent more QL by now, but I've tightened the parser code and QL is now better than before. Nothing much new in MLPTK: I fixed a few typos; that's about all. Today's installment is, courtesy MediaFire: https://www.mediafire.com/?5my42sl41rywzsg (http://www.mediafire.com/download/5my42sl41rywzsg/mlptk-qlds14mar2016.zip) As usual, my work is free of charge within the Public Domain.

The parser's state logic is exactly the same as it was before; but, in addition to the changes I described (extremum-input interface differentiation) two posts ago in the foregoing brief, I've altered the format of sequence reductions such that the argument vector is now two args wide, comprising the interface triplet (a parser stack node, a thread index, and a file handle into the thread) and a reference to the parser object. Oh, and I deprecated the "mate" and "stop" associativity specifiers, because making them work the way I wanted would have been a nightmare. And, even though Luna feels better after eating forty Snickers, that's terrible.

Anyway, let me bend your ear (eye?) for a while about the interface and handles. The parse stack is more slender, but mostly the same, except that nodes are now Arrays whose members point backward and forward in the stack. The parser's input is stored (although I sometimes, erroneously, call this "buffering;" buffers do not work that way) in a data structure named the input stream, which is a random-access virtual file that stores each token as an array element in itself. Again, the input stream is not a buffer; it is a file that grows in size as tokens are unshifted. This makes it unsuitable for very large inputs. I'll fix it soon. For now, you'll have to put up with the memory leak. Maybe you can find some use for it as a volatile file system or something, but it's useless as a buffer, which is what it should have been in the first place. To fix the problem shall require only that the object is made simpler, so I expect to have it improved by next update. In the meantime, it functions by mapping an Array to a vector table whose indices correspond to the Nth item in the data and whose values are the indices of that item in the Array (which has certainly become fragmented). (I've sketched this arrangement below.)

That's about all that's developed since last time. As you can see @ hattenArDin, I'm crafting QL as a quasi-static code generator, with fewer heuristics. Instead of storing within the parser stack nodes the information necessary to transition the parser's state, Hatten walks the state map to determine exactly which symbols transition, block, or reduce, and whether they are right-associative. I could also have done this for the precedence decision, but that would require the parser generator to create a significant number of additional states: about the number of symbols whose precedence is specified, multiplied by the number of symbols that can possibly occur anywhere in any sequence. In other words, such a computational speed gain (one if-statement) would cost about the square of present memory consumption, or something vaguely proximal to that. So that design decision was due to my failure to ponder the math before writing, and not due to impossibility. I'll work that problem, too, before I think I'm done improving the blueprint.
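As promised, a simplified sketch of the input stream's vector-table arrangement. (This is not the actual object in quadrare-lexema.js; the names are illustrative.)

    // A grow-only "virtual file" of tokens. Logical order lives in the vector
    // table; the backing store fragments, but its elements never move.
    function InputStream() {
        this.store = [];  // every token ever written, in arrival order (hence the leak)
        this.vector = []; // vector[n] = index into store of the Nth logical token
    }
    InputStream.prototype.append = function (token) {
        this.vector.push(this.store.push(token) - 1);
    };
    InputStream.prototype.unshift = function (token) {
        // an unshifted token also lands at the end of the store, fragmenting it;
        // only the vector table knows it logically comes first
        this.vector.unshift(this.store.push(token) - 1);
    };
    InputStream.prototype.read = function (n) {
        return this.store[this.vector[n]]; // random access by logical position
    };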
I could excerpt some code and walk you through it or something, but it is plain language to me, and I have no idea how to make it any plainer in English unless someone asks a question. I think WordPress is set to email me if you comment on one of my posts, so I should see your letters if they ever arrive. And, sadly, the tightened code is yet again "sorta functional." If you require a testbed for your grammar, refer to the test sequence in clit.js & ::toTheLimit() in quadrare-lexema.js, both of which work well enough, except for recursion. Tentatively: expect the New & Improved Quadrare Lexema sometime around June.