Category Archives: Lecture

Give Me That NP-Time Religion.

"Give Me That NP-Time Religion."
"What was Good Enough For Granddad is Good Enough For Me."
"Day-Old Panglossary."
"Babel: the New Fishing Sensation!"

"NP Religion" is a pun on nondeterministic polynomial time & language jihads.
"Granddad" is a pun on the title of a song by the Squirrel Nut Zippers.
"Panglossary" is a pun on linguistics and bread.
"Speakeasy" is a pun on methamphetamine during the Prohibition era.
"Babel" is a pun on Babel Fish, in D. Adams' _Hitchhiker's Guide to the Galaxy._

NB: This lecture is a summary, or tour de force, _not_ a tutorial.
    Until such time as I write some of those, I suggest you seek them elsewhere.
    As with all my publication, it's free of charge to the Public Domain.

I've sometimes encountered the question: "what's the best language to study?"

The answer is as varied as the races of man, but boils down to "whatever."

Nevertheless, I have picked one for the moment: Python 3, a C wrapper.
Python is easy to learn, as powerful as C, and implements many fine metaphors
such as generators, list comprehensions, and generator comprehensions. If you
haven't a clue what I've said, that's fine, because Python's still great for you
as you begin with a computer programming curriculum. See also the Icon language.

In the field of information technology, all have their own preferred flavor of
computing. If you'd like some language I've never written, such as Haskell, you
are probably better off to study that instead of Py3. Prefer specially languages
your friends write and will help you learn. A human being is a better tutor than
a reference manual -- whence I've learnt much.

Briefly, I'll summarize languages I know or with which I'm familiar in passing.
I'll describe the category (What & How), utility (Where & When), and specialty
(Why & Who) of each such language.

There are hundreds of computer programming & markup languages, of which only a
few I've reviewed here. See also The Daily WTF (Worse Than Failure) and your
friendly neighborhood University's computer science department, as well as text-
books like Structure & Interpretation of Computer Programs and tutorials.

I'll not limit these to programming languages, properly; also included shall be
descriptions of markup languages and perhaps a few esoteric mini-languages like
NullSoft's Scriptable Install System. One noteworthy limitation: ubiquity.

(The difference between programming & markup language is loops, i.e. iteration.)

NB: any language can, hypothetically, be compiled. Interpreted languages usually
assume that the speed loss and overhead from interpretation are tolerable, or
there's some caveat (dynamic typing or execution environment) that requires an
interpreter even when the language is "compiled" (such as, IDK, Java).

In no particular order:


What: ECMAscript (Javascript) is a procedural language used in Web pages.
      Javascript is also used in Web browser design, usually supplementing C++,
      for certain runtime code such as Firefox's ill fated tab gallery.
      The Windows Script Host will execute Javascript outside a browser.
How:  Much like C, without Malloc, but quite a bit slower because "everything is
      an object" and for other amusing reasons not limited to the interpreter.
      Browser-based implementations can access the Document Object Model.
      Javascript is an interpreted language.

Where: Dynamic HTML pages for the Worldwide Web. In-browser apps like my MLPTK.
When:  Although Javascript fails at three-dimensional rendering, it is opportune
       for such tasks as moving parts of Web pages (where Cascading Stylesheets
       cannot suffice), making them appear and disappear, and sending beacon
       queries at specified intervals so pages update without clicking "Reload."

Why: Javascript requires no compiler. Compatible Web browsers exist for any OS.
     Modern browsers implement a recursion limit, due to heap and stack overflow
     exploits, which hinders traditionally-generated reentrant parsers; other
     than this caveat, and the speed problem, JS is good for simple computation.
Who: Web developers, browser programmers, casual programming, portability.

Summary: Great for Dynamic HTML! The recursion limit can be a pain, however.


What: An interpreted language executed by Web servers such as Apache or nginx.
How:  Like other languages compatible with the Common Gateway Interface, which
      are many and varied, it's executed by the server.

Where: PHP can be written in conjunction with HTML, within the same document.
       Dynamic Web forms and dynamically generated pages "feel" more natural.
When:  PHP is useful for the server-side ("back end") of Web applications,
       especially where the user must never know how passwords are validated and
       where the server can take some bandwidth load off of the transaction by
       computing something on its side rather than transmitting a massive JS lib
       to the client side and asking the client to compute for it.

Why: Because it's server-side, the end user never sees what it's doing, and so
     it's fantastic for bookkeeping on your side of the transaction.
     HOWEVER! Always remember that you need to keep a careful eye on security,
     especially where the user supplies any input at all. You'll need to escape
     and/or sanitize this input in other ways, to avoid clever little hacks like
     Johnny "; DROP TABLE *; ".
Who: Writers of CAPTCHAs, Web shopping carts, members-only galleries, etc.

Summary: Like other CGI tongues, PHP is a Webmaster's vade mecum.


What: A formatting language used for writing HTML pages on the Worldwide Web.
How:  HTML marks up a hypertext document for rendering by a Web browser.

Where: On the WorldWideWeb, or a mirror on your local file system, etc.
When:  HTML documents are the markup format of choice for Web browser rendering.

Why: Web pages, GUIs for DHTML apps, front ends for business.
Who: Web developers, casual rich text writers, authors of technical documents.

Summary: Good for quick GUIs, reference manuals, and writing (w/o MS Office).


What: CSS (not to be confused with the DVD content encryption scheme of the same
      name) is a formatting (markup) language with programming-like abilities.
How:  It is an interpreted language. Its facilities exclude iteration/recursion.

Where: Web pages, résumés, and other such documents.
When:  Whenever HTML formatting is insufficient to the task.
       CSS, AFAIK, lacks loops, and can't use HTTP. Supplement it w/Javascript.

Why: CSS is a faster, prettier way to mark up Web pages.
Who: Web developers, publishers, résumé formatting, etc.

Summary: Useful to format your curriculum vitae, if LaTeX is too much trouble.


What: C is a procedural programming language that's near to machine language.
      C is "sorta" object-oriented. C++ is more conducive to OO design. I use C.
How:  The programmer works with bytes & system interfaces to effect his design.
      C is compiled to assembly language and then assembled to machine code.

Where: Unix and Unix-like systems, such as MacOS 10+ and Linux.
       C can also be compiled on other systems, like Windows: Dev-C++, mingw.
       The Linux Programmer's Manual, included with Ubuntu's "build-essential"
       package, contains the complete Posix specification described by the ISO
       538 Committee Draft, August 3, 1998 WG14/N843, titled C '98, as well as
       by the Single Unix Specification. C is, substantially, Unix.
When:  The C programming language is close to machine language in many respects:
       therefore, and because compilers have been written and optimized much, it
       is blazing fast. C is most fruitfully employed in operating system design
       because the computer's hardware must be spoken to in machine code.

Why: If I need a fast program or a portable one, C is the lingua franca.
Who: Operating system designers, simulation/game programmers, hackers.

Summary: C and C++ are the Speedy Gonzales of the programming world.


What: Python is a procedural programming language with functional capabilities.
      It contains all of C++'s object-orientation, with additional benefits.
      Python is an interpreted language, and can be compiled.
How:  Using its substantial standard library, and extra OO benefits, Python is
      quite able to solve the vast majority of information technology woes.
      Python's an interpreted language. Its interpreter runs on Windows & Linux.

Where: Python's extensive manual (& library reference) describe convenient fixes
       for most IT-related problems. My only problem with it: unlike, e.g., the
       Haskell language, programs can't be described in pseudo-EBNF.
When:  Just about anytime for fun & profit. However, if you use it commercially,
       Python's author would like you to contact him. If you'd like to write a
       program more like a flowchart than an algorithmic procedure -- e.g., if
       recursion & parsing are your cup of tea -- you might try Haskell?

Why: Like C, but don't like malloc? Python's for you! :) See also:
Who: Information technologists, discrete mathematicians, C programmers, etc.

Summary: Easier than C, with all the phenomenal cosmic power, in a box.
         Python also executes nearly as quickly as C, even when interpreted.


Actually I know next to nothing about this language. Ask Glasgow University?
Also you might try the book "Learn You A Haskell For Great Good!"


Java is not Javascript. That's about all I know. Read more @ Sun Microsystems.


What: Functional programming languages, suitable for lambda calculus etc.
      I believe these are interpreted, but can probably also be compiled.
How:  Unlike object-oriented languages, in which the programmer arranges data in
      groups (structures, objects) that are sensible to the human mind's spatial
      orientation perceptions and its imagination, functional programming uses a
      calculus-like notation to make all programs function()s of other programs.

Where: Because all programs are functions of other programs, the transformative
       idea of the computer program (parser) as a finite state automaton that
       manipulates other such automatons finds an easy home here.
When:  The program itself is also data and object-orientation is (to some minds)
       easier to understand from a functional perspective. So, lisp is a good
       language for programs that have to change their code a lot.

Why: Calculus, I think, and iteration.
Who: The Knights of the Lambda Calculus.

Summary: I haven't much used functional languages. This entry is guesswork.


What: Several varieties of mnemonic languages used to write machine code.
How:  Mnemonics are "assembled" to machine language, or some form of executable
      code, based on the assembler's translation table.

Where: Assemblers are like compilers; but, instead of parsers, they're scanners.
When:  Therefore, they're easier to write. Also, see "MACHINE CODE" below.

Why: Because you don't want to write machine code yourself.
     After all, only a simplistic scanner is required to do this for you.
Who: Manly men, womanly women, and all sorts in between.

Summary: Just write this part after you write your compiler. (It's easier.)


What: All computermachines speak electronically via binary ones and zeroes.
How:  "Machine language" is the coded sequences of binary numbers that are used
      to instruct a computer's computational circuitry (its ALU, or CPU).

Where: Primitive machinery, new interfaces, low-level hacking & intrusion.
When:  Whenever flowcharts and buzzwords just aren't enough to get you hired.

Why: Compilers do most of this work for us these days. Sometimes needed, though.
Who: "Real Programmers," electrical engineers, chipset architects.

Summary: Don't bother unless you're an electrician. (It's tedious / dangerous.)


What: A stack-heavy programming language similar to assembly language.
      FlAS is an interpreted language, and it's pretty slow.
How:  A FlAS is embedded in a Shockwave (Adobe) Flash file to animate parts and
      provide for user interaction. 

Where: Because Javascript is no good for complicated rendering tasks, Flash AS
       is often employed to write Web games to be played in browsers using the
       Macromedia Shockwave -- ahem -- Adobe Flash plugin.
When:  If you need a Web game in a browser, or if you must format your gallery
       in such manner as to make it less scrutable to text-mode scraping 'bots.
       Scrapers can still take your gallery to pieces even if you use encrypted
       Javascript or amusing cryptographic puzzles in Flash, but I'll bet SWF is
       just about the end of the line for the casually-scraping archivist.

Why: Probably just because Javascript is no good for Web games.
     Flash's "tween motions" are very amusing, too, and I keep wondering whether
     Adobe will implement a Disney-like strech-and-squish. Technically, so could
     any programmer, and perhaps you too might like to take a crack at Mickey?
Who: Game programmers, paranoid gallerias, art students.

Summary: I've used this language less than Lisp, but it's popular.


What: A formatting language used for writing technical manuals and textbooks.
How:  DocBook is converted via DB -> TeX -> DVI pipeline, then rendered as PDF.

Where: Dissertations (M.S., Ph.D., ...), portfolios, scientific documents.
When:  Particularly opportune for automatic compilation of reports, such as in
       my, where formatting several thousand documents as individual
       PDFs and then merging them all would be a waste of time. Of course,
       markup languages are always apropos of computer-generated output.

Why: Professional-quality formatting to PDF, in a markup language, without any
     such editor as Adobe's Acrobat and others. DocBook also has a 
     tag for formatting simplistic chemical equations, although many other
     symbols aren't rendered by dblatex -- so, write in pure LaTeX instead!
Who: Scientists, yuppies, and anyone who can't afford Adobe's software... :)

Summary: LaTeX for newbies, and there's nothing wrong with newbies: they learn.
         As DocBook parsers and rendering technology improve, it may even grow
         until it supplants LaTeX entirely. (DocBook _is_ a favored contender.)


What: A programming language that is also a formatting (markup) language.
How:  LaTeX is rendered by a TeX -> DVI -> PDF pipeline involving several tools
      in Linux (and, probably, because Linux is Posix-compliant, also Unix).

Where: Unique among programming languages, LaTeX is used to format rich text.
When:  See DocBook, above, and also where extensive fine formatting is needed.

Why: Conventional tools, such as MS Office, typically lack somewhat in rich text
     formatting facilities -- equations with strange symbols and subscripts are
     particularly difficult to write. LaTeX solves these problems, and also has
     fine-grained control over where and how items appear on the printed page.
     It's only a step away from PostScript (a printer language), in that regard,
     and writing in LaTeX is much easier than writing in PostScript.
Who: See DocBook, above, and also especially mathematicians.

Summary: DocBook for oldbies, and naught is wrong with oldbies: they're elite.


This is a control language for inkjets & laser printers used to write on paper.
It is also a programming language (has loops), I believe, and can be used to
exploit printers in horrifying ways (BTW, kids, don't try that at home).
Because PostScript is a part of Adobe's Portable Document Format (PDF), there've
been computer viruses transmitted via PDF files. Technically, this caveat cannot
be avoided, which is why your boss advises you to only open trustworthy PDFs.


What: Such languages are used to automate repetitive invocation of commands as
      may be encountered during the operation of computer systems like Unix.
How:  Shell scripts simply automate a shell: something like Bash or MLPTK, or
      Microsoft's "Command Prompt" (which used to be MS-DOS).

Where: Although (except for MS-DOS, now outdated) a shell isn't an underlying
       part of the operating system, it can invoke any command compatible with
       its standard input & output.
When:  Shells are therefore suitable to batch processing of data, such as report
       generation, transcoding, and file system rearrangement.

Why: For example, my compiling my book and massaging all the parts
     into DocBook markup for dblatex would take too long if I did so by hand.
Who: Anyone who uses a computermachine to compute, archive, and collate data.

Summary: Great for computing. Probably anathema to intellectual slavery laws.


What: Although Perl is a fully fledged programming language, Sed & Awk are both
      more useful for text processing than anything else.
How:  Using regular expressions (regular grammars that describe alphabets), the
      Stream EDitor and Mawk administer Swedish massage to the output of unit
      tests and other experimental data. Additionally, to text-based protocols
      such as HTTP, sed is the PaperMate of the information technology world.

Where: Tabulating the results of experiments, directory listings, reams of data,
       rewriting the Web as you browse it (see my HTTPyre) and such tasks.
When:  As lifetime Unix gurus will tell you, "every [EXPLETIVE] day."

Why: Any text that makes sense in any legible way can be quickly massaged and
     reformatted by the stream editor. Perl's a bit more powerful, but I hate it
     for irrelevant reasons. I do mean _any_ text, btw: even computer programs.
Who: Virus scanners, information technologists, Web router programmers, cads.

Summary: Sed is to IT as the slide rule is to mechanical engineers.
         The combination of sed, awk, and grep suffice to effect many simplistic
         Web robots (but mind you ROBOTS.TXT); if you're careful to format your
         computer programs' output as legible text, you'll find them handy too.


What: The Nullsoft Scriptable Install System is a mini-language for installers.
How:  It's like a shell script that unpacks a ZIP file, moves the contents to
      specified directories, and modifies the Windows registry accordingly.

Where: Similarly to the InstallShield Wizard, NSIS effects "installation" of the
       kind that Windows users expect. It's much like dpkg / apt-get on Linux
       distributions such as Debian and Ubuntu.
When:  NSIS is free of charge, although InstallShield might be too? :)

Why: Windows users don't want to unpack an archive and move stuff themselves.
Who: Anyone writing a program for Windows. See also: InstallShield.

Summary: One of several ways to write self-contained installers.

“Windows cannot find a critical file.” Current Rage Level: Omicron.

Ubuntu Linux: the Wal-Mart(TM) Frontier. These are the voyages of the Spacecar
Grosvenor. Its continuing mission: to allocate new structs & new classes, unite
all people within its nation, and leak where memory has never leaked before.

Of the numerous Linux installations ("distributions"), I've used Ubuntu Linux (published by Canonical Inc.)
most. It contains the Linux kernel, the GNU core utilities, and several other
items of interest such as an automagically-configured graphical user interface.
It is extraordinarily user-friendly, to the point of feeling constrictive. (The
desktop environment has changed since version 11: users now cannot reconfigure
the taskbar or workspaces. The repository wants to be a dime-store, too, and
although a potentially lucrative storefront I miss the simplicity of Synaptic.)

Its installation procedure is simple: download a Live CD image from Canonical's
Web site, burn it to a CD-R or RW (these days, you might even need a DVD), and
reboot your machine with the disk inserted. (Don't forget to tell the BIOS -- er
whatchamacallit, the Extended Firmware Interface -- to boot from CD.) You'll be
presented with an operable temporary login. Thence you can install the OS. Also
available from this interface was an option to create a USB startup disk, but it
has been removed in recent revisions of Ubuntu: previously, using VirtualBox or
any similar virtual machine, the user could run the LiveCD & make a startup USB
without even rebooting out of their present operating environment, which was
useful on old machines whose optical drives had failed. You can still "Install"
to the USB key, but it boots slowly & you can't install it from there to a box.

The installation wizard is a no brainer: "install alongside Windows." Easy! And
it usually doesn't cause your existing Windows system to go up in smoke, either.
However, to install Ubuntu more than once per box, you must repartition manually
(and may also need to change grub: see /boot/grub and /etc/grub.d). Gparted is
included within the live disc images, but must be retrieved again after install.

If you'd like to make intimate friends with the manual pages, and discover where
primary partitions go when they die, you can install with less assistance. This
lets you specify partitions in which to mount your home & system directories, in
case you'd like to keep them segregated. (That's probably a great idea, but I
never do.) You can also create and specify swap partitions: which are employed
as virtual memory and, I suspect, for hibernation and hybrid suspension.

About file systems: I typically use FAT32, NTFS, ext4, and ext2. (Total newbie.)
FAT32 is elderly and fragile. It's used for boot/EFI partitions, 3DS & 3GPs.
NTFS is Microsoft's modern FS. Withstands some crashes, but has no fsck module.
ext2 & ext4 are Linux's. ext4 journals. ext2 permits file undeletion (PhotoRec).
The extended 4 system is harder to kill than a cockroach on steroids, so I tend
to prefer it anywhere near the heart of my archives. I use ext2 | NTFS for USBs.

Be very careful not to destroy your existing data when repartitioning the drive.
Any such operation carries some risk; back up anything important beforehand. One
way to backup is to prepare an empty HDD (or any medium of equal / greater size)
and dump the complete contents of the populated disk into the empty one:
 dd if=/dev/sda of=/dev/sdb status=progress
 (Where sda is the populated disk, and sdb the empty backup disk.)
Similar can be accomplished by dd'ing one of your partitions (/dev/sda1) into a
disk or a file, then dd'ing the image back onto a partition of equal size. Disk
image flashing is a simple and popular backup method for local machines, sparing
you the time to learn rsync (which is more useful in long term remote backups).
Far from being an annoying elder sister, dd is the Linux troll's best friend.

Beware the dreaded "write a new boot/system partition" prompt. It bricked me.
The problem was because I had set the system to "Legacy Support" boot mode, but
the original (now unrecognized) installation was in Extended Firmware Interface
mode. I was unable to recover until I had re-flashed several partitions.

The usual "new car smell" applies: you'll want to configure whatever settings
haven't yet been forbidden to you by your GUI-toting overlords. In Ubuntu 16,
access them by clicking the gear and wrench icon on the launcher panel. You can
also search for something you're missing by using the Dash (Super, or Windows,
key pulls it up: then type), which functions similarly to the apropos command:
e.g., instead of Ctrl + Alt + T and then "man -k image", Super key then "image".
It will also search your files (and, after plugins, several social media sites).

Although the newfangled Dash is convenient, don't forget your terminal emulator:
you can easily spend the vast majority of your working time using bash by way of
gnome-terminal, without ever clicking your treasured Microsoft IntelliMouse 1.1.
In Ubuntu 16, as it has been since Ubuntu 11, Ctrl + Alt + T opens the terminal.

Under the directory /usr/share/man/, you will find the on line (interactive)
manual. This describes the tools available to you. Begin reading it by opening
a terminal window (using Control + Alt + T, or the Super / Windows key and then
typing "terminal"), keying the command 'man name_of_manual_page', and pressing
the Enter key. In this case, the name of the manual page is the page's archive's
filename before the .[0-9].gz extension.
Of particular interest: telinit, dd, printf, cat, less, sed, tee, gvfs-trash,
mawk, grep, bash (if you're using the Bourne Again Shell, which is default on
Ubuntu 16), cp, rm, mv, make, sudo, chroot, chown, chmod, chgrp, touch, gunzip,
gzip, zip, unzip, python, g++, apt-get (especially `apt-get source ...`), mount,
kpartx, date, diff, charmap (same name on Windows!), basename, zipinfo, md5sum,
pdftotext, gnome-terminal (which is _how_ you're using bash), fortune, ffmpeg,
aview, dblatex, find, cut, uniq, wc, caesar, rot13, curl, wget, sort, vim, man,
tr, du, nautilus, tac, column, head, tail, stat, ls, pwd, pushd, popd, gedit,
source-highlight, libreoffice (a Microsoft Office killer), base64, flex, bison,
regex, perl, firefox, opera, chromium-browser, konqueror, lynx, virtualbox,
apropos, od, hexdump, bless, more, pg, pr, echo, rmdir, mkdir, fsck, fdisk (same
name, but different function, in Windows), ln, gdm, gnome-session, dhelp,
baobab, gparted, kill, locate, ps, photorec, testdisk, update-grub...
(If you haven't some of the above, don't worry. You should already have all you
 need. Keep in mind that the Ubuntu repository's software is divided in sections
 some of which contain potentially harmful or non-free software. When venturing
 beyond the fortified walls of <main>, be cautious: you may be eaten by a grue.)
Beneath /usr/share/doc/ or /usr/share/help/ are sometimes additional manuals.

If you use Linux, you will have to memorize several manuals, and name many more;
especially those of the GNU core utilities, which are a great aid to computing.
There's also a software repository to assist you with various computing tasks:

To acquire additional software: gnome-software (the orange shopping bag to your
left, above the icon), the friendly storefront, will assist you. If
you prefer a compact heads-up-display, try the Synaptic Package Manager instead.
`apt-get install package-name` works well if you know what you're looking for,
as does apt-get source package-name for the ponderously masculine.

And, speaking of ponderous masculinity, if you retrieve source code for any of
Ubuntu's mainline packages, typically all you need to do is 'cd' into the folder
containing the top level of the source tree and then invoke the following:
 1. ./
 (You shouldn't need to chmod u+x ./ to accomplish this.)
 2. make
 (You may need to install additional packages or correct minor errors.)
 3. sudo make install
This can be abbreviated: ./ && make && sudo make install
Beware that sudo is a potentially dangerous operation. Avoid it if unsure.
The && operator, in bash, will only execute the next command if the past command
exited with a successful status code (i.e., zero).

But I digress.

You'll occasionally want to mount your other partitions on Linux's file system,
so that you can browse the files you've stored there. With external drives this
is as simple as connecting them (watch the output of `tail -f /var/log/*` in a
console window to observe the log messages about the procedure), but partitions
on fixed disks (or others, 'cause reasons) may not be mounted automagically. So:
 mount -t fs_type -o option,option,... /dev/sd?? path/to/mount/point/
where the mount point is a directory somewhere in your file system. BTW, mounts
that occurred automatically will be on points beneath /media/your_username/.

On a dual boot Windows system, I mount -t ntfs -o ro /dev/sda3 ~/Desktop/wintmp
often because the NTFS partition is in an unsafe state and won't mount writable.
In that case, rebooting to Windows and running chkdsk /f C: from Command Prompt
with Administrative privileges will sometimes clear the dirty flag if performed
multiple times. (How many times before ntfs-3g mounts writable, seems to vary.)

When you've attached external media, via USB etc, safely remove them after use:
use the "Safely Remove" menu option in the right-click context menu in Nautilus'
sidebar (be careful not to accidentally format the disk). You can also, from a
shell (gnome-terminal), `sync && umount /dev/sdb*` (if sdb is the medium).

Now that you've got a firm foothold in Ubuntu territory, I hope you can see your
house from here 'cause Windows seems to be dying a miserable death of attrition.
Don't count it out, though: all the Linuxes are terrible at Flight Simulator.

こんにちわ い-よ, わんこ いいんちよ…

Title translates loosely, if at all, as "I ain't say hi to the First Dog."

Discrete mathematics treats of sets (groups of items, of known size) and logic.
The operations on sets are union (OR), intersection (AND), difference (-), and
symmetric difference (XOR). I often call AND conjunction and XOR disjunction.

As for logic, the usual Boolean operations with which programmers are familiar:
 AND: Only the items ("1" bits) in both sets (bytes). 01001 & 01100 == 01000
 OR: All the items ("1" bits) in either set (byte). 01001 | 01100 == 01101
 XOR: Everything in one set or the other, _not_ both. 01001 ^ 01100 == 00101
 NOT: Logical inversion. (Not conversion.) Same as ¬. ! 01001 == 10110
(NOT is a unary operator. The binary operators are commutative. The exclusive OR
 (XOR) operator has this additional property: given, axiomatically, A ^ B = C,
 then C ^ B = A and C ^ A = B. I think that makes XOR evanescent, but I can't
 remember if that's transitivity. Anyhow: symmetric difference self-reverses.
 See also memfrob() @ string.h, & especially Python manual section 6.7 (sets).)

I may have misrepresented some of the details: consult a discrete math text too.
Salient trivia: don't confuse the bitwise intersection (&) & union (|) with the
 boolean operators && and ||. Although equivalent for the values
 1 and 0, they behave differently for all others. For example:
 2 & 5 == 0; although both (bool)2 and (bool)5 typecast to true,
 and therefore (bool)2 & (bool)5 == (bool)2 && (bool)5 == true,
 bitwise & is valid on integers and the compiler will produce the
 integer bitwise & operation which results in false.
Et trivia: xor can be used (cautiously) to logically invert comparisons (^ 1).
(See also: GNU Bison, Extended Backus-Naur Format, formal grammar (maths).)

Combinatorics (the science of counting) is discrete mathematics: in which the
delightful factorial (a unary operator, !, whose operand is to the left, which
distinguishes it from unary logical NOT) makes many an appearance. Combinatorial
explosion is a task for any mathematician who wishes to optimize (by trading
memory for repeat computation) such procedures as decision trees, which are NP.
(See also: algorithmic complexity, non-deterministic polynomial time).

Combinations of a set, because order makes no difference, are fewer than its
permutations. A combiner can be written with sliding windows, as in Python's
"itertools" library. (I've also implemented a heuristic version in Javascript.)
Combinations represent all the ways you'd choose some number of items from a set
such that no two combinations contain exactly the same items.
There are S! / (M! * (S - M)!) M-length combinations chosen from a S-length set.
(X! is "X factorial." Don't confuse it with !X, which is "not X," i.e. "¬X." )

Contrariwise: permutations, in which the same items chosen in a different order
constitute a different permutation than any other, represent all the ways you'd
choose items from a set such that the time when you chose each item matters.
Permutation is most simply and directly implemented by way of recursion.
(Combinations can also be written this way, by narrowing the slices, as I have.)
There are S! / (S - M)! permutations of length M chosen from a set of length S.

The number of "same items, different order" permutations can be arrived upon via
subtraction of the combination formula from the permutation formula:

S! S! M! * S! S! (M! - 1) * S!
-------- - ------------- == ------------- - ------------- == ------------- 
(S - M)! M! * (S - M)! M! * (S - M)! M! * (S - M)! M! * (S - M)!

Because the "N-ary Counter" appears frequently in computer programs that behave
in a combinatorial way, and because I implemented them too in Mutate-o-Matic:
the items in any set, when numbered in monotonic integral progression from 0,
can represent the numerals of numbers in base N (where N == S, to reuse the "set
length" nomial from above). Arithmetic with these figures rarely makes sense,
although hash tables and cryptographic checksums sometimes behave similarly.
There are S ** M (S to the power M; ** is exponentiation in Python) N-ary
"numbers" of length M that are possible by counting in base S.

Combinations and permutations are not defined when the length M of the mutation
exceeds the length of the set. N-ary counters are defined for any M, however.

Today I'll describe a mutator I wrote in C. It permutes, combines, and counts.
It is somewhat more complex than the recursive one-liner you'd sensibly write,
but not much different once you see past all the pointers.

Such a one-liner, before I get carried away, would look something like this...
 def anagrams (w): # Pseudocode (read: I haven't tested it) anagram generator
 if len(w) == 1: # <-- Recursion always requires a sentinel condition,
 yield list(w) # unless you are extremely clever indeed.
 for i in range(len(w)): yield [ w[i] ] + anagrams(w[0 : i] + w[i + 1 :])
... in Python, though my own rendition in Python is somewhat more acrobatic (it
foolishly employs generator expressions); as for combinations, see PyDoc 10.7.1
(itertools.combinations). Python is fast, because it's just a shell around C, &
Python's "itertools" module will do the discrete math acrobatics in record time.
Itertools also contains many of the discrete math axioms I rewrote above -- see
its page in the Python manual, section 10.7, for the details.

Not content to waste nanoseconds, however, I resolved to write a significantly
less intelligible implementation using the C programming language. I ought to
point out that the savings in algorithmic complexity are vanishingly negligible:
it actually probably uses more time than the Python implementation, owing to its
prodigal invocation of malloc() (among my other C vices, such as **indirection).

Anyway, here's the Mutate-o-Matic in Stone Tablet Format courtesy of my C-hisel:
 typedef enum { PERM, COMB, NARY } mmode_t;
 // ^- Respectively, "permutation," "combination," or "N-ary counter" mode.
 void mutate (
 char *set[], // set of items to permute. An array of C-strings.
 size_t slen, // length of the set.
 size_t plen, // length of each permutation.
 mmode_t mode, // kind of mutation I'm executing.
 char delim[], // delimiter, per permutation item.
 char *prefix[] = NULL, // characters chosen already.
 size_t pflen = 0 // length of the prefix.
 ) {
 size_t subset_length = (mode == NARY) ? slen : slen - 1;
 char **next_set = (char**) calloc(subset_length, sizeof(char*));
 char **next_prefix = (char**) calloc(pflen + 1, sizeof(char*));
 if (next_set == NULL || next_prefix == NULL) { exit(-1); }
 for (int i = 0; i < pflen; next_prefix[i] = prefix[i], i++) ;
 for (int i = slen - 1, j, k; i >= ((mode == COMB) ? (int)plen : 0); i--) {
 next_prefix[last_item] = set[i];
 if (plen > 0) { // Either descend the recursion ...
 for (k = 0, j = 0; j < i; next_set[k++] = set[j++]) ;
 switch (mode) {// ^- the above are elements common to every mode
 case NARY: next_set[k++] = set[j]; // XXX NB: fallthrough
 case PERM: for (j++; j < slen; next_set[k++] = set[j++]); break;
 } // (^- conditionally copy elements not common to all three modes)
 mutate(next_set, k, plen, mode, delim, next_prefix, pflen);
 } else { // ... or print the mutation (selected set items).
 for (j = 0; j < last_item; printf("%s%s", next_prefix[j++], delim));
 printf("%s\n", next_prefix[last_item]);
 } // *Sound of cash register bell chiming.*
 } // end for (Select items and mutate subsets)
 free(next_set); free(next_prefix); // Paying tribute to Malloc, plz wait

(To download the C file, plus the rest of my works, retrieve my portfolio by way
 of the links to MediaFire supplied in the top post at my WordPress blog.)

It seems so simple: only an idiot couldn't see how to compute mutations in this
manner, right? However, although I first encountered the problem when I was but
a wee lass, & although I finally discovered the formula for permutations and
N-ary counters after the better part of two !@#$ decades, I yet fail to grok
_why_ the "bounded sliding windows" illustrated by itertools successfully choose
only the combinations. (The best I can do is "because they're in sort order?")

Anyway, the procedure is woefully uncomplicated:
append on the prefix an unchosen item from the set, then either...
 [remaining choices > 0] recurse, having culled chosen item (unless in N-ary
 mode); and, if in Combination mode, having culled
 items subsequent to the sliding window (set[i]),
... or ...
 [choices == 0] print chosen items, which are a mutation, & continue

I'm sure you can see that this whole procedure would be a lot faster if I wasn't
spending precious clock ticks spelunking the dreaded caves of malloc(). A global
array of arrays with entries for each prefix length would really speed things up
-- as would splitting the mutater into three different functions, to eliminate
the branching instructions ("if-else," "switch," and "(predicate) ? a : b").

However, I recommend you don't bother trying to write it faster in C. What with
the huge cache memories CPUs have these days, it's probably plenty zippy. Even
though it _can_ be faster in C, it could be even faster in Assembler, so learn
an instruction set and assemble a core image if that's what you really want.

Until then, for all its inefficiency, Mutate-o-Matic renders about 100,000
mutations per second. That's faster than those sci-fi lockpicks in the movies.

Status? Or Update?


Since my last edition in March, I've been refining the design of Videlicet: a
script I hope will aid in maintenance of digital art collections.

Although I haven't yet finished work, it is near completion, as you can see in
this abridged snapshot:

Presently, the only available functionality is a label-maker. Soon, it will be a
label maker with tentacles. (Full disclosure: tentacles are metaphorical in
nature, and the program is so described for the sole purpose of setting up this
joke about octo-pythons.) Hopefully the finished work will be available to you
by this July, at which time I will issue the complete edition.

In the meantime, here are some computer-related trivia.

Factual computer tidbits, each in 80 columns -- the canonical console width.
(These are with considerable reference to FOLDOC and the Jargon File.)

We say "computer" about machines these days, but it once meant one who computes.

Computer machines are a kind of difference engine. They calculate math rapidly.

The first such difference engine is attributed to Charles Babbage.

The first reprogrammable machine, however, is reputed to be the Jacquard loom.

Source code is a series of instructions telling a computer what to compute.

Source code is interpreted by a compiler that translates it to assembler code.

An electronic calculator is, in abstract, an infix notation algebra compiler.

Assemblers translate human-legible mnemonics into computers' "machine language."

Machine language, an abstraction of electrical potential (EMF), is binary code.

The Volt, defined by the IEC in 1983, is the unit of electro-motive force (EMF).

Metal oxide semiconductor field-effect transistors (MOSFETs) are logic circuits.

Logic circuits encode inverse Boolean algebra: NAND, NOR, and NOT.

Computer programming languages evolved to assembler mnemonics from machine code.

From assembler, programs further evolved to high-level language (such as C).

Very high level language (whatever that means - perhaps interpreters?) was next.

Modern computer programming reads much like calculus. (C.f. ASM = arithmetic.)

Object-oriented programming arranges data in nested structures, then computes.

Functional programming computes with nested functions, then arranges the data.

File systems are tree-like data structures encoding allocation of disk memory.

Compressed archives use something like LZMA to squeeze redundant bytes in files.

Non-volatile memory (disk space) lasts longer than volatile (RAM).

Harddisk capacity is measured in Gigabytes. They're magnetic platters or EAPROM.

Magnetic storage functions by "reading" and "writing" magnetic fields.

Electrically alterable programmable read only memory blows fuses & antifuses.

Operating systems handle tasks, as process scheduling, incidental to human use.

Mainframes ("clouds") have one central OS; the clients are dumb terminals.

Dumb terminals have no independent processing or storage capability.

Personal computers, by contrast, have processing, storage, and an OS.

Graphical user interfaces (GUIs) are the modern point-and-click metaphor.

Command line interfaces (CLIs) are the "antiquated" terminal console metaphor.

Computer networks are any set of computermachines "speaking" to one another.

Sessions are by way of protocol. The Worldwide Web uses HTTP over TCP/IP.

The Internet & WWW evolved via cookoffs: see the Requests for Comment.

Bandwidth on the Internet has increased from baud to megabytes per second.

All computing resources can be served on a network, bandwidth permitting.

^- Senator Ted Stevens' famous "series of tubes" quote was accurate, BTW.

Human interface devices are a material tool humans use to interact w/ computers.

Graphical computer displays evolved from oscilloscopes etc, to CRTs, to LCDs.

Cathode ray tubes work by shooting electrons at a phosphorescent matrix.

Modern television-size liquid crystal displays contain millions of circuits.

CPUs handle arithmetic and logic. These days, there are 4 of them on one chip.

Frequency of a CPU's oscillation is measured in Gigahertz. For example, 2.5 GHz.

The chip's clock speed (oscillation) determines how fast it computes.

All arithmetic can be computed by adding: by 1 only, too, I think.

Ones Complement and Twos Complement are binary encodings for negative numbers.

Microchips are printed circuit boards (PCBs) that execute various functions.

The chips on a mainboard are connected to one another by a bus ("omnibus bar").

Computer hardware is the aforementioned assemblage of circuit boards.

"Firmware" (between hardware & software) is on-board (on-chip?) control logic.

Computer software is some instructions compiled & ready to run: a core image.

"Bootstrapping" a computer refers to a story about a man who flew in the air.

The Basic I/O System of yore was supplanted by the Extended Firmware Interface.

Personal computer workstations are sometimes called "boxes," due to their shape.

Alan Turing's model of a finite state automaton is a supposed computer.

Emulators, or virtual machines, are also logical computers.


Automaton Empyreum: the Key to Pygnition. (Trivial File Transfer Protocol edition.)

(I have implemented the Trivial File Transfer Protocol, revision 2, in this milestone snapshot. If you have dealt with reprogramming your home router, you may have encountered TFTP. Although other clients presently exist on Linux and elsewhere, I have implemented the protocol with a pair of Python scripts. You’ll need a Python interpreter, and possibly Administrator privileges (if the server requires them to open port 69), to run them. They can transfer files of size up to 32 Megabytes between any two computers communicating via UDP/IP. Warning: you may need to pull out your metaphorical monkey wrench and tweak the network timeout, or other parameters, in both the client and server before they work to your specification. You can also use TFTP to copy files on your local machine, if for whatever reason you need some replacement for the cp command. Links, courtesy of MediaFire, follow:

Executable source code (the programs themselves, ready to run on your computer):

Candy-colored source code (the pretty colors help me read, maybe they’ll help you too?):

My life in a book (this is what YOUR book can look like, if you learn to use my automatic typesetter and tweak it to make it your own!):


Title is a tediously long pun on "Pan-Seared Programming" from the last lecture.
Key: mechanism to operate an electric circuit, as in a keyboard.
Emporium: ein handelsplatz; or, perhaps, the brain.
Empyreuma: the smell/taste of organic matter burnt in a close vessel (as, pans).
Lignite: intermediate between peat & bituminous coal. Empyreumatic odor.
Pignite: Pokémon from Black/White. Related to Emboar & Tepig (ember & tepid).
Pygmalion (Greek myth): a king; sculptor of Galatea, who Aphrodite animated.

A few more ideas that pop up often in the study of computer programming: which,
by the way, is not computer science. (Science isn't as much artifice as record-
keeping, and the records themselves are the artifact.)

As Eric Steven Raymond of Thyrsus Enterprises writes in "The Art of Unix
Programming," "keep it simple, stupid." If you can take your programs apart, and
then put them back together like Lego(TM) blocks, you can craft reusable parts.

A kind of object with methods (functions) attached. These are an idiom that lets
you lump together all your program's logic with all of its data: then you can
take the class out of the program it's in, to put it in another one. _However,_
I have been writing occasionally for nearly twenty years (since I was thirteen)
and here's my advice: don't bother with classes unless you're preparing somewhat
for a team effort (in which case you're a "class" actor: the other programmers
are working on other classes, or methods you aren't), think your code would gain
from the encapsulation (perhaps you find it easier to read?), or figure there's
a burning need for a standardized interface to whatever you've written (unlikely
because you've probably written something to suit one of your immediate needs:
standards rarely evolve on their own from individual effort; they're written to
the specifications of consortia because one alone doesn't see what others need).
Just write your code however works, and save the labels and diagrams for some
time when you have time to doodle pictures in the margins of your notebook, or
when you _absolutely cannot_ comprehend the whole at once.

This is a kind of data structure in C. I bet you're thinking "oh, those fuddy-
duddy old C dinosaurs, they don't know what progress is really about!" Ah, but
you'll see this ancient relic time and again. Even if your language doesn't let
you handle the bytes themselves, you've got some sort of interface to them, and
even if you don't need to convert between an integer and four ASCII characters
with zero processing time, you'll still need to convert various data of course.
Classes then arise which simulate the behavior of unions, storing the same datum
in multiple different formats or converting back and forth between them.
(Cue the scene from _Jurassic Park,_ the film based on Michael Crichton's book,
 where the velociraptor peeks its head through the curtains at a half-scaffolded
 tourist resort. Those damn dinosaurs just don't know when to quit!)

The most amusing use of void*s I've imagined is to implement the type definition
for parser tokens in a LALR parser. Suppose the parser is from a BNF grammar:
then the productions are functions receiving tokens as arguments and returning a
token. Of course nothing's stopping you from knowing their return types already,
but what if you want to (slow the algorithm down) add a layer of indirection to
wrap the subroutines, perhaps by routing everything via a vector table, and now
for whatever reason you actually _can't_ know the return types ahead of time?
Then of course you cast the return value of the function as whatever type fits.

Washing brights vs darks, convenience, convenience, & convenience, respectively.
Don't forget: convenience helps you later, _when_ you review your code.

These are a treelike structure, or should I say a grasslike structure.
I covered binary trees at some length in my fourth post, titled "On Loggin'."

The reason why you need recursion is to execute depth-first searches, basically.
You want to get partway through the breadth of whatever you're doing at this
level of recursion, then set that stuff aside until you've dealt with something
immensely more important that you encountered partway through the breadth. Don't
confuse this with realtime operating systems (different than realtime priority)
or with interrupt handling, because depth-first searching is far different than
those other three topics (which each deserve lectures I don't plan to write).

Jet airplanes, video games versus file indexing, & how not to save your sanity.

A paradigm appearing in such pleasant languages as Python and Icon.
Generators are functions that yield, instead of return: they act "pause-able,"
and that is plausible because sometimes you really don't want to copy-and-paste
a block of code to compute intermediate values without losing execution context.
Generators are the breadth-first search to recursion's depth-first search, but
of course search algorithms aren't all these idioms are good for.
Suppose you wanted to iterate an N-ary counter over its permutations. (This is
similar to how you configure anagrams of a word, although those are combinations
-- for which, see itertools.combinations in the Python documentation, or any of
the texts on discrete mathematics that deal with combinatorics.) Now, an N-ary
counter looks a lot like this, but you probably don't want a bunch of these...
    var items = new Array(A, B, C, D, ...);       // ... tedious ...
    var L = items.length;                         // ... lines ...
    var nary = new Array(L);                      // ... of code ...
    for (var i = 0; i < L; nary[i++] = 0) ;       // ... cluttering ...
    for (var i = L - 1; i >= 0 && ++nary[i] == L; // ... all ...
        nary[i--] = ((i < 0) ? undefined : 0)     // ... your other ...
    ) ; // end for (incrementation)               // ... computations ...
... in the middle of some other code that's doing somewhat tangentially related.
So, you write a generator: it takes the N-ary counter by reference, then runs an
incrementation loop to update it as desired. The counter is incremented, where-
upon control returns to whatever you were doing in the first place. Voila!
(This might not seem important, but it is when your screen size is 80 by 24.)

(Boodle (v.t.): swindle, con, deceive. Boodle (n.): gimmick, device, strategy.)
Because this lecture consumed only about a half of the available ten thousand
characters permissible in a WordPress article, here's a PowerPoint-like summary
that I was doodling in the margins because I couldn't concentrate on real work.
Modularity: perhaps w/ especial ref to The Art of Unix Programming. "K.I.S.S."
Why modularity is important: take programs apart, put them together like legos.
Data structures: unions, classes.
Why structures are important: atomicity, op overloading, typedefs, wrappers.
linked lists: single, double, circular. Trees. Binary trees covered in wp04??
recursion: tree traversal, data aggregation, regular expressions -- "bookmarks"
Generators. Perhaps illustrate by reference to an N-ary counter?

Suppose someone is in a coma and their standing directive requests you to play
some music for them at a certain time of day. How can you be sure the music is
not what is keeping them in a coma, or that they even like it at all? Having
experienced death firsthand, when I cut myself & bled with comical inefficiency,
I can tell you that only the dying was worth it. The pain was not, and I assure
you that my entire sensorium was painful for a while there -- even though I had
only a few small lacerations. Death was less unpleasant with less sensory input.
I even got sick of the lightbulb -- imagine that! I dragged myself out of the
lukewarm bathtub to switch the thing off, and then realized that I was probably
not going to die of exsanguination any time soon and went for a snack instead.

"You need help! You are insane!"
My 1,000 pages of analytical logic versus your plaintive bleat.

Pan Fried Programming

(Here's the update -- nothing much is new:
Source Highlight:

Remedial (adj.): intended to rectify, amend, heal.
Panacea (n., myth): goddess of healing, daughter of Aesculapius.
Pansear (n.): Chili's Pokémon.

This remedial lecture will tersely cover a semester's curriculum,
similar to what you have learnt in your high school algebra class,
comprising the fundamentals of programming with synthetic languages
(those that are above machine code).

If you don't know what computer programming is, I would recommend that you study
some tutorials & encyclopedia articles. Much is available on the WWW (Worldwide
Web). The Web is a part of the Internet, and it is the Web you access from your
Web browser when you navigate to a Web page. You could also try one'a them there
"<name of programming language> For Dummies" textbooks: the "For Dummies" books
are excellent "Cliff's Notes"-style crash courses, and each aims to be similar
to a "101" course in the topic advertised.

To make a beginning with any programming language, all you must know is that a
computer computes: your instructions, issued in the program you write, tell the
machine how to progress from its input or initial state to a resultant output or
final state. These instructions look different in different languages -- some
languages require more or fewer -- but every computer program is an algorithm,
or "recipe for computation."

Computers and computer programs can be characterized as finite state automata.
They're like assembly-line robots who know where to weld each sheet of metal.
Also like assembly-line robots, they are "blind" in the sense that they'll kill
you with the soldering iron should you step between it and the sheet.
Computing machines do what they're told, even when it is particularly stupid:
that's why computer viruses, worms, and computer espionage exist.

In simplest terms, the computer's processing units receive some numbers and an
instruction that says what mathematical operation to execute, then operates:
like a calculator. High-level programming languages are more synthetic, like a
human language is, and comprise such ideas as objects (amalgamations of data) &
functions (modular sub-routines). Compilers or interpreters read these languages
and translate them into machine instructions, simplifying the lengthy series of
instructions necessary to make the calculator execute these difficult tasks.

In a high-level language, there are few technical concerns.
You can begin immediately with the abstract concepts.
Here are some:

As in algebra, a variable is a name that represents a value.
As in solving a system of equations, values are typically assigned to some vars
and the value of the other variables is computed using the values given.
For example, in Javascript:
    var a = 2;
    var b = a + 2;
The variable <b> is now equal to 2 + 2. Similar operations function similarly.
In Javascript and other very-high-level languages, variables aren't only scalars
and can point at any object. They're like placeholders for procedure.
Although "variable" implies a value stored in memory, and "identifier" only its
mnemonic, the words "variable" & "identifier" used loosely mean about the same.
    "Just don't try that with the Captain."
        -- Geordi LaForge, to Data, _Star Trek: the Next Generation._

These are important ideas that are abstracted away in VHLLs. A pointer stores an
address in memory, for a later indirect read/write or similar operation. In HLLs
a pointer/reference accesses an object directly instead of copying its value.
You'll rarely have to make the distinction in Javascript; but, for example:
    var a = new Array(1, 2, 3); // a[0] == 1, a[1] == 2, a[2] == 3
    var b = a; // Incidentally, b === a, and that is why in the next line...
    b[0] = 555; // ... b[0] == 555, and now a[0] also == 555!
As opposed to:
    var c = new Array(3); // c is a new array of length 3
    c[0] = b[0]; c[1] = b[1]; c[2] = b[2]; // copy scalar values one-by-one
    c[0] = 0; // c[0] == 0, but b[0] remains == a[0], which remains == 555.
    var d = 2;
    var e = d;
    e = 4; // e == 4, but d is still == 2.
As you can see, operating on an object (such as via array subscript operation)
changes the object, even if the object is pointed by multiple variables.
Likewise, objects passed as the argument of a function are passed by reference:
they aren't simply copied, and operating on the argument within the function is
equivalent to changing the object, whose scope is above that of the function.
Some high-level languages, like C, permit you to explicitly specify what is a
pointer or reference, which eliminates some of this confusion but requires more
exacting attention to detail in your design specification.

The state of a program is the value of all its variables, the current location
within the instruction set, and the environment of the operating system (or the
interpreter). In Javascript, within Web browsers, the browser typically provides
access to some of its state via the Document Object Model.

Heuristics, or "guesswork," could not exist if there were no way to execute some
different code depending on the state of the program. Furthermore there are some
mathematics you can't write as exactly one set of instructions that produces one
semantic value: for instance, a function defined only on an interval, or an even
root of a positive number. In this circumstance, you are writing branches:
    if (5 > 10) { /* of course, the code in this block never happens. */ }
    else if (2 < 0) { /* neither does this, b/c cond'n is also false. */ }
    else { /* but all of this code happens, because the others didn't. */ }
... One of the branches executes, and the others don't.
The part in parentheses is the "conditional statement:" it's evaluated as either
"true" or "false," like in Boolean logic. 

Identifiers are only valid within the block (curly brackets, or { ... }) where
they were declared. Well, they're supposed to, anyway. Therefore, if you declare
a variable inside a function, you can't use it outside of the function or within
another function. Why would you want to, anyway? The next time you invoked the
function, the value of the variables you were using in there would change again.

Computers are great at repetition. Loops repeat a set of instructions: they are
typically written as a prefix, conditional, postfix, and body. For example:
    for (var T = 10; T > 0; T--) { alert("T minus " + T); }
... which counts down from ten to one with annoying alert popups.
While or do-while loops have only conditions & bodies.
A loop is an example of an "iterative algorithm." Each time the loop's body is
executed, it's called an "iteration." In computing fractal geometric patterns,
"iteration" means more like "recursion:" which, see below.

A function is a modular segment of your program: a sequence of computation that
is repeated a few times, or can be reused as part of another computation.
Functions are "invoked," or called by name, with values supplied as arguments,
and return a value, similarly to how functions behave in algebra. When declaring
a function, you'd typically write the name of the function followed by its args
in parentheses and then the function body. For example, again in Javascript:
    function intp (N) { return (N % 1) == 0; } // integer predicate
... which returns true if N is probably an integer, or whole number:
    if (intp(5)) { alert("Yes. 5 is probably an integer."); }
    if (intp(5.55)) { alert("This box never appears..."); }
    else { alert("... but this one does, because 5.55 is a floater."); }
(Floating-point numbers are inaccurate, in Javascript as likewise elsewhere.)

A function that invokes itself is a recursive function. Any function invoking an
other function, which subsequently causes the original function to be invoked
again, causes a recursion-like situation that I think is called "re-entrancy."
It is essential to note that _any_ and _every_ recursive function you can write
for a computer to execute can be rewritten as an iterative algorithm. The proof
of this is complex: it follows from Alan Turing's model of finite state automata
and the read-execute model of arithmetic and logic units (CPUs), and basically
asserts that you'd never be able to execute recursion if you couldn't do it by
reading one instruction at a time. In other words, each time your function calls
itself again, it's simply storing its state in memory temporarily while the
machine executes a fresh copy: after the copy is done, the former state is re-
loaded, and execution proceeds from there. This is achieved with stacks: data
structures that grow in size as more is "pushed" onto them, and shrink when some
is "popped" off of the top.

An object is a collection of data that comprises a several datum. That is, when
data are related to one another, they can be envisioned as a "shape" or "motion"
that is the sum of its parts. For example, a vector has direction and magnitude;
an individual has a first and last name; a parser has an input stream, a stack,
and a procedure. In Javascript, you'd write something like this:
    function Blah (newz) { if (newz) { this.z = newz; } return this; }
    Blah.prototype = new Object();
    Blah.prototype.z = 555;
    Blah.prototype.tell_me_z = function () { return this.z; }
    var a = new Blah(222), b = new Blah(); // a.z == 222; b.z = 555.
... syntax varies among languages. Essentially an object is a data structure
containing some members ("variables" attached to the object, like Blah::z above)
and, if the object is a class, some methods (functions, like ::tell_me_z).

Palling around.

Pall (n.): pawl.

I couldn't write last week, and my upgrade to QL has progressed no further.
For reference, I stalled before comparing the efficiency of nested Objects to
that of nested Arrays, which I must test before experimenting further with the
prototype compiler or even refining the design. I intend to do that this month.
In the meantime, here's a snapshot of MLPTK with new experiments included.

And a correction to my brief about the grammar ("Saddlebread"): actually, the
InchoateConjugation sequence does not cause differentiation, because the OP_CAT
prevents the original from reducing. Other parts may be inaccurate. I'll revise
the grammar brief and post a new one as soon as I have fixed the QL speed bug.

I took some time out from writing Quadrare Lexema to write some code I've been
meaning to write for a very long time: pal9000, the dissociated companion.
This software design is remarkably similar to the venerable "Eggdrop," whose C
source code is available for download at various locations on the Internets.
Obviously, my code is free and within the Public Domain (as open as open source
can be); you can find pal9000 bundled with today's edition of MLPTK, beneath the
/reference/ directory.

The chatbot is a hardy perennial computer program.
People sometimes say chatbots are artificial intelligence; although they aren't,
exactly, or at least this one isn't, because it doesn't know where it is or what
it's doing (actually it makes some assumptions about itself that are perfectly
wrong) and it doesn't apply the compiler-like technique of categorical learning
because I half-baked the project. Soon, though, I hope...

Nevertheless, mathematics allows us to simulate natural language.
Even a simplistic algorithm like Dissociated Press (see "Internet Jargon File,"
maintained somewhere on the World Wide Web, possibly at Thyrsus Enterprises by
Eric Steven Raymond) can produce humanoid phrases that are like real writing.
Where DisPress fails, naturally, is paragraphs and coherence: as you'll see when
you've researched, it loses track of what it was saying after a few words.

Of course, that can be alleviated with any number of clever tricks; such as:
	1. Use a compiler.
	2. Use a compiler.
	3. Use a compiler.
I haven't done that with p9k, yet, but you can if you want.

Of meaningful significance to chat robots is the Markov chain.
That is a mathematical model, used to describe some physical processes (such as
diffusion), describing a state machine in which the probability of any given
state occurring is dependent only on the next or previous state of the system,
without regard to how that state was encountered.
Natural language, especially that language which occurs during a dream state or
drugged rhapsody (frequently and too often with malicious intent, these are
misinterpreted as the ravings of madmen), can also be modeled with something
like a Markov chain because of the diffusive nature of tangential thought.

The Markov-chain chat robot applies the principle that the state of a finite
automaton can be described in terms of a set of states foregoing the present;
that is, the state of the machine is a sliding window, in which is recorded some
number of states that were encountered before the state existent at the moment.
Each such state is a word (or phrase / sentence / paragraph if you fancy a more
precise approach to artificial intelligence), and the words are strung together
one after another with respect to the few words that fit in the sliding window.
So, it's sort of like a compression algorithm in reverse, and similar to the way
we memorize concepts by relating them to other concepts. "It's a brain. Sorta."

One problem with Markov robots, and another reason why compilers are of import
in the scientific examination of artificial intelligence, is that of bananas.
The Banana Problem describes the fact that, when a Markov chain is traversed, it
"forgets" what state it occupied before the sliding window moved.
Therefore, for any window of width W < 6, the input B A N A N A first produces
state B, then states A and N sequentially forever.
Obviously, the Banana Problem can be solved by widening the window; however, if
you do that, the automaton's memory consumption increases proportionately.

Additionally, very long inputs tend to throw a Markov-'bot for a loop.
You can sorta fix this by increasing the width of the sliding window signifying
which state the automaton presently occupies, but then you run into problems
when the sliding window is too big and it can't think of any suitable phrase
because no known windows (phrases corresponding to the decision tree's depth)
fit the trailing portion of the input.
It's a sticky problem, which is why I mentioned compilers; they're of import to
artificial intelligence, which is news to absolutely no one, because compilers
(and grammar, generally) describe everything we know about the learning process
of everyone on Earth: namely, that intelligent beings construct semantic meaning
by observing their environments and deducing progressively more abstract ideas
via synthesis of observations with abstractions already deduced.
Nevertheless, you'd be hard-pressed to find even a simple random-walk chatbot
that isn't at least amusing.
(See the "dp" module in MLPTK, which implements the vanilla DisPress algorithm.)

My chatbot, pal9000, is inspired by the Dissociated Press & Eggdrop algorithms;
the copy rights of which are held by their authors, who aren't me.
Although p9k was crafted with regard only to the mathematics and not the code,
if my work is an infringement, I'd be happy to expunge it if you want.

Dissociated Press works like this:
	1. Print the first N words (letters? phonemes?) of a body of text.
	2. Then, search for a random occurrence of a word in the corpus
	   which follows the most recently printed N words, and print it.
	3. Ad potentially infinitum, where "last N words" are round-robin.
It is random: therefore, humorously disjointed.

And Eggdrop works like this (AFAICR):
	1. For a given coherence factor, N:
	2. Build a decision tree of depth N from a body of text.
	3. Then, for a given input text:
	4. Feed the input to the decision tree (mmm, roots), and then
	5. Print the least likely response to follow the last N words
	   by applying the Dissociated Press algorithm non-randomly.
	6. Terminate response after its length exceeds some threshold;
	   the exact computation of which I can't recall at the moment.
It is not random: therefore, eerily humanoid. (Cue theremin riff, thundercrash.)

A compiler, such as I imagined above, could probably employ sliding windows (of
width N) to isolate recurring phrases or sentences. Thereby it may automatically
learn how to construct meaningful language without human interaction.
Although I think you'll agree that the simplistic method is pretty effective on
its own; notwithstanding, I'll experiment with a learning design once I've done
QL's code generation method sufficiently that it can translate itself to Python.

Or possibly I'll nick one of the Python compiler compilers that already exists.
(Although that would take all the fun out of it.)

I'll parsimoniously describe how pal9000 blends the two:

First of all, it doesn't (not exactly), but it's close.
Pal9000 learns the exact words you input, then generates a response within some
extinction threshold, with a sliding window whose width is variable and bounded.
Its response is bounded by a maximum length (to solve the Banana Problem).
Because it must by some means know when a response ends "properly," it also
counts the newline character as a word.
These former are departures from Eggdrop.
It also learns from itself (to avoid saying something twice), as does Eggdrop.

In addition, p9k's response isn't necessarily random.
If you use the database I included, or choose the experimental "generator"
response method, p9k produces a response that is simply the most surprising word
it encountered subsequent to the preceding state chain.
This produces responses more often, and they are closer to something you said
before, but of course this is far less surprising and therefore less amusing.
The classical Eggdrop method takes a bit longer to generate any reply; but, when
it does, it drinks Dos Equis.
... Uh, I mean... when it does, the reply is more likely to be worth reading.
After I have experimented to my satisfaction, I'll switch the response method
back to the classic Eggdrop algorithm. Until then, if you'd prefer the Eggdrop
experience, you must delete the included database and regenerate it with the
default values and input a screenplay or something. I think Eggdrop's Web site
has the script for Alien, if you want to use that. Game over, man; game over!

In case you're curious, the algorithmic time complexity of PAL 9000 is somewhere
in the ballpark of O(((1 + MAX_COHERENCE - MIN_COHERENCE) * N) ^ X) per reply,
where N is every unique word ever learnt and X is the extinction threshold.
"It's _SLOW._" It asymptotically approaches O(1) in the best case.

For additional detail, consult /mlptk/reference/PAL9000/readme.txt.

Pal9000 is a prototypical design that implements some strange ideas about how,
exactly, a Markov-'bot should work. As such, some parts are nonfunctional (or,
indeed, malfunction actually) and vestigial. "Oops... I broke the algorithm."
While designing, I altered multiple good ideas that Eggdrop and DisPress did
right the first time, and actually made the whole thing worse on the whole. For
a more classical computer science dish, try downloading & compiling Eggdrop.