Pall (n.): pawl.

I couldn't write last week, and my upgrade to QL has progressed no further. For reference, I stalled before comparing the efficiency of nested Objects to that of nested Arrays, which I must test before experimenting further with the prototype compiler or even refining the design. I intend to do that this month. In the meantime, here's a snapshot of MLPTK with the new experiments included:

http://www.mediafire.com/download/566ln3t1bc5jujp/mlptk-p9k-08apr2016.zip

And a correction to my brief about the grammar ("Saddlebread"): the InchoateConjugation sequence does not, in fact, cause differentiation, because the OP_CAT prevents the original from reducing. Other parts may also be inaccurate. I'll revise the grammar brief and post a new one as soon as I've fixed the QL speed bug.

I took some time out from writing Quadrare Lexema to write some code I've been meaning to write for a very long time: pal9000, the dissociated companion. Its design is remarkably similar to that of the venerable Eggdrop, whose C source code is available for download at various locations on the Internets. Obviously, my code is free and in the Public Domain (as open as open source can be); you can find pal9000 bundled with today's edition of MLPTK, beneath the /reference/ directory.

The chatbot is a hardy perennial among computer programs. People sometimes say chatbots are artificial intelligence. They aren't, exactly, or at least this one isn't: it doesn't know where it is or what it's doing (in fact, it makes some assumptions about itself that are perfectly wrong), and it doesn't apply the compiler-like technique of categorical learning, because I half-baked the project. Soon, though, I hope... Nevertheless, mathematics allows us to simulate natural language.
Even a simplistic algorithm like Dissociated Press (see the Jargon File, maintained somewhere on the World Wide Web, possibly at Thyrsus Enterprises by Eric Steven Raymond) can produce humanoid phrases that read like real writing. Where DisPress fails, naturally, is paragraphs and coherence: as you'll see when you've researched it, it loses track of what it was saying after a few words. Of course, that can be alleviated with any number of clever tricks, such as: 1. Use a compiler. 2. Use a compiler. 3. Use a compiler. I haven't done that with p9k yet, but you can if you want.

Of meaningful significance to chat robots is the Markov chain: a mathematical model, used to describe some physical processes (such as diffusion), of a state machine in which the probability of entering any given state depends only on the state immediately before it, without regard to how that state was reached. Natural language, especially the language that occurs during a dream state or drugged rhapsody (frequently, and too often with malicious intent, misinterpreted as the ravings of madmen), can also be modeled with something like a Markov chain, because of the diffusive nature of tangential thought.

The Markov-chain chat robot applies the principle that the state of a finite automaton can be described in terms of the set of states foregoing the present. That is, the state of the machine is a sliding window, in which are recorded some number of the states encountered before the present one. Each such state is a word (or a phrase / sentence / paragraph, if you fancy a more precise approach to artificial intelligence), and the words are strung together one after another with respect to the few words that fit in the sliding window. So it's sort of like a compression algorithm in reverse, and similar to the way we memorize concepts by relating them to other concepts. "It's a brain. Sorta."
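To make the sliding-window idea concrete, here's a minimal sketch in Python. This is not pal9000's actual code; the corpus, the window width, and the dict-of-lists representation are my own illustrative assumptions.

```python
import random
from collections import defaultdict

def build_chain(words, window=2):
    """Map each sliding window of `window` words to the words seen after it."""
    chain = defaultdict(list)
    for i in range(len(words) - window):
        key = tuple(words[i:i + window])
        chain[key].append(words[i + window])
    return chain

def babble(chain, seed, length=20):
    """Random-walk the chain, Dissociated-Press style."""
    window = list(seed)
    out = list(seed)
    for _ in range(length):
        successors = chain.get(tuple(window))
        if not successors:
            break  # dead end: no known window fits what we've said so far
        word = random.choice(successors)
        out.append(word)
        window = window[1:] + [word]  # slide the window forward one word
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ate the rat".split()
chain = build_chain(corpus, window=2)
print(babble(chain, seed=("the", "cat")))
```

Because the walk is random wherever a window has more than one recorded successor, each run can splice the corpus differently, which is exactly the humorously disjointed effect described above.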
One problem with Markov robots, and another reason why compilers are of import in the scientific examination of artificial intelligence, is that of bananas. The Banana Problem describes the fact that, when a Markov chain is traversed, it "forgets" what state it occupied before the sliding window moved. Therefore, for any window of width W < 4, the input B A N A N A first produces state B, then states A and N alternately forever. Obviously, the Banana Problem can be solved by widening the window; however, if you do that, the automaton's memory consumption increases proportionately. Additionally, very long inputs tend to throw a Markov 'bot for a loop. You can sorta fix this by increasing the width of the sliding window signifying which state the automaton presently occupies, but then you run into trouble when the window is too big: the 'bot can't think of any suitable phrase, because no known window (no phrase corresponding to the decision tree's depth) fits the trailing portion of the input.

It's a sticky problem, which is why I mentioned compilers. They're of import to artificial intelligence, which is news to absolutely no one, because compilers (and grammar, generally) describe everything we know about the learning process of everyone on Earth: namely, that intelligent beings construct semantic meaning by observing their environments and deducing progressively more abstract ideas via synthesis of observations with abstractions already deduced. Nevertheless, you'd be hard-pressed to find even a simple random-walk chatbot that isn't at least amusing. (See the "dp" module in MLPTK, which implements the vanilla DisPress algorithm.)

My chatbot, pal9000, is inspired by the Dissociated Press and Eggdrop algorithms, the copyrights of which are held by their authors, who aren't me. Although p9k was crafted with regard only to the mathematics and not the code, if my work is an infringement, I'd be happy to expunge it if you want.

Dissociated Press works like this: 1.
Print the first N words (letters? phonemes?) of a body of text. 2. Then, search for a random occurrence of a word in the corpus that follows the most recently printed N words, and print it. 3. Ad potentially infinitum, with the "last N words" treated round-robin. It is random: therefore, humorously disjointed.

And Eggdrop works like this (AFAICR): 1. For a given coherence factor N, 2. build a decision tree of depth N from a body of text. 3. Then, for a given input text, 4. feed the input to the decision tree (mmm, roots), and 5. print the least likely response to follow the last N words by applying the Dissociated Press algorithm non-randomly. 6. Terminate the response after its length exceeds some threshold, the exact computation of which I can't recall at the moment. It is not random: therefore, eerily humanoid. (Cue theremin riff, thundercrash.)

A compiler, such as I imagined above, could probably employ sliding windows (of width N) to isolate recurring phrases or sentences, and thereby automatically learn how to construct meaningful language without human interaction. Although I think you'll agree the simplistic method is pretty effective on its own; notwithstanding, I'll experiment with a learning design once I've developed QL's code-generation method sufficiently that it can translate itself to Python. Or possibly I'll nick one of the Python compiler-compilers that already exist. (Although that would take all the fun out of it.)

I'll parsimoniously describe how pal9000 blends the two. First of all, it doesn't (not exactly), but it's close. Pal9000 learns the exact words you input, then generates a response within some extinction threshold, using a sliding window whose width is variable and bounded. Its response is bounded by a maximum length (to solve the Banana Problem). Because it must by some means know when a response ends "properly," it also counts the newline character as a word. These are departures from Eggdrop.
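The "least likely response" step is the heart of the Eggdrop method. Here's a toy reconstruction from the description above; it is neither Eggdrop's nor pal9000's code, and the frequency-counting representation and the `max_len` parameter are my own assumptions.

```python
from collections import Counter, defaultdict

def build_counts(words, window=2):
    """Count, for each sliding window, how often each successor word follows it."""
    counts = defaultdict(Counter)
    for i in range(len(words) - window):
        counts[tuple(words[i:i + window])][words[i + window]] += 1
    return counts

def least_likely_reply(counts, seed, max_len=12):
    """Extend the seed with the rarest recorded successor at each step,
    stopping at a hard length cap (a crude guard against Banana-style loops)."""
    window, out = list(seed), list(seed)
    while len(out) < max_len:
        successors = counts.get(tuple(window))
        if not successors:
            break  # no known window fits the trailing words
        word = min(successors, key=successors.get)  # least frequent = most surprising
        out.append(word)
        window = window[1:] + [word]
    return " ".join(out)

corpus = "a b c a b c a b d".split()
counts = build_counts(corpus, window=2)
print(least_likely_reply(counts, seed=("a", "b"), max_len=6))  # prints "a b d"
```

A hard cap like `max_len` plays the same role as pal9000's maximum response length described in the paragraph above.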
It also learns from itself (to avoid saying something twice), as does Eggdrop. In addition, p9k's response isn't necessarily random. If you use the database I included, or choose the experimental "generator" response method, p9k produces a response that is simply the most surprising word it encountered subsequent to the preceding state chain. This produces responses more often, and they hew closer to something you said before, but of course that is far less surprising and therefore less amusing. The classical Eggdrop method takes a bit longer to generate any reply; but, when it does, it drinks Dos Equis. ... Uh, I mean... when it does, the reply is more likely to be worth reading. After I've experimented to my satisfaction, I'll switch the response method back to the classic Eggdrop algorithm. Until then, if you'd prefer the Eggdrop experience, delete the included database, regenerate it with the default values, and input a screenplay or something. I think Eggdrop's Web site has the script for Alien, if you want to use that. Game over, man; game over!

In case you're curious, the algorithmic time complexity of PAL 9000 is somewhere in the ballpark of O(((1 + MAX_COHERENCE - MIN_COHERENCE) * N) ^ X) per reply, where N is the number of unique words ever learnt and X is the extinction threshold. "It's _SLOW._" It asymptotically approaches O(1) in the best case. For additional detail, consult /mlptk/reference/PAL9000/readme.txt.

Pal9000 is a prototypical design that implements some strange ideas about how, exactly, a Markov 'bot should work. As such, some parts are nonfunctional (or, indeed, actively malfunctioning) and vestigial. "Oops... I broke the algorithm." While designing, I altered several good ideas that Eggdrop and DisPress got right the first time, and made the whole thing worse on the whole. For a more classical computer-science dish, try downloading & compiling Eggdrop.
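A parting footnote on the Banana Problem. These few throwaway lines (my own illustration, nothing to do with p9k's internals) show both failure modes at once: the narrow-window loop and the wide-window dead end.

```python
from collections import defaultdict

def transitions(symbols, window):
    """Record, for each sliding window, the symbols observed to follow it."""
    chain = defaultdict(list)
    for i in range(len(symbols) - window):
        chain[tuple(symbols[i:i + window])].append(symbols[i + window])
    return dict(chain)

banana = list("BANANA")

# Width 1: after B, the only recorded transitions are A -> N and N -> A,
# so a traversal alternates A, N forever (the loop).
print(transitions(banana, 1))
# {('B',): ['A'], ('A',): ['N', 'N'], ('N',): ['A', 'A']}

# Width 4: ('A','N','A','N') -> 'A' is the last recorded transition; the
# next window, ('N','A','N','A'), was never seen with a successor, so the
# traversal halts (the dead end).
print(transitions(banana, 4))
```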