Briefcase: a special kind of folder in Microsoft Windows, which synchronizes its contents with another briefcase of the same name when detected. Used to keep volatile documents on floppy disks & USB flash drives without constantly copying and pasting the contents of the whole disk every time it moves from workstation to workstation. Like an rsync daemon.

Use case: in the study of software design & architecture, a storyboard sketch, or a supposition about what a user expects or how s/he'll behave: e.g., the initial node of a flowchart, a branch in main(), GUI dialog panes, or some interaction of user with program. Additionally, the analysis of hostile users and newbies ("misuse case").

I am still moribund ("deadlocked," or "sick to death") by a headache that has become cerebral palsy. I have been unable to concentrate on my plans this year.

Speaking of contributions to science, you can find my (literally) auriferous portfolio at the magnanimous MediaFire (they're not just for pirates!): https://www.mediafire.com/folder/kr2bjyn1k3gjr/mlptk-recent (Download & read the CARGO-MANIFEST.TXT to ascertain the contents of the archives you seek.) WARNING: ADULTS ONLY. (Explicit sexual content.) Videlicet is still kind of broken. DiffWalk, too, may be faulty.

The hyperlink will lead you to a MediaFire directory. I have added new archives (for bandwidth conservationists). The file CARGO-MANIFEST.TXT describes all the contents: _download and read it first_ if you want to know what's in them there archives, which total over one hundred Megabytes, &/or to retrieve your preference.

What's new: kanamo & transl8 (in MLPTK), Mutate-o-Matic, Videlicet, & DiffWalk. (I said MLPTK was officially dead, but will I let it rest? How about no...) Archivists curating art galleries downloaded from social networks will love Videlicet, which solves the vexing twin problems of automatic attribution and re-configurable data mining. (For those pesky copy protection mechanisms.
Videlicet.py easily cuts through Web galleries and markup up to 1/4" thick.) I even threw in the experimental upnnas: yea, truly this is an epic day. (^- That line alludes to one of the _Juicy Cerebellum_'s author's asides.)

The remainder of this briefing describes the salient points of a Python script I wrote to automatically collate issues of my portfolio. Long story short: "diff." Because the large size of the archives I upload has become problematic, I have established a ramshackle mechanism to prepare smaller files for anyone concerned about bandwidth conservation. (MediaFire reports only two Gigabytes since last year, which is no big deal, but I certainly wasn't helping. Also I couldn't think of much else to do.)

In case you cared, the usual issues with bandwidth are constriction & latency: to reuse Senator Ted Stevens' "tubes" metaphor, how wide the tube is and how long it is; either of these can alter an observer's perception of the pressure of fluid forced through the pipe. "When the tubes get full, things can't get through" -- like dead bodies, or the new episode of Veep.

Metaphorically, one half of this mechanism is a portable diff utility: DiffWalk. The other half is a shell script that identifies changes to the directory tree. Neither is aught remarkable, but why don't I talk your ear off about them anyway?

Diff is a program, similar to cmp, used to compare two files and describe their discrepancies. In common use on Unix-like systems, it is employed to create patch files that require less time to transmit via point-to-point telecommunication than would be needed to transmit the whole file whenever it changed. Because it is so useful an algorithm, and because I've never seen one for Windows (except in the Berkeley Utilities), I made (but didn't test) a portable one in Python. DiffWalk is a walking collater that creates patches similar to diff's.
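For comparison's sake: Python's standard library already ships a diff engine, difflib, whose output is the familiar `diff -u` flavour (not DiffWalk's format). A minimal demonstration:

```python
import difflib

old = ["the quick brown fox\n", "jumps over\n", "the lazy dog\n"]
new = ["the quick brown fox\n", "leaps over\n", "the lazy dog\n"]

# unified_diff yields header lines, then hunks marked with -/+ prefixes.
patch = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))
print("".join(patch))
```

The removed line appears prefixed with "-", the inserted line with "+", just as `diff -u` would print them.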
Although the two are not interoperable, they operate in the same manner: by determination of where the files differ and description of the differences. Therewith, a "new" file can be reconstructed from an "old" file plus a patch -- hypothetically, with an according decrease of network bandwidth load. Although the script is a few hundred lines long, the scanner (the part that goes through the file looking for the interesting bits: in this case, the positions where the new file differs from the old) is one tenth that size. As you've observed in my other software, I do without proper parsers & grammars. This renders my work brief, vulgar, and full of bugs, but sometimes legible.

    import hashlib

    def diff (old_lines, new_lines): # hunk fmt: old_offset \t old_tally \t new_tally \n new lines...
        patch_file = [patch_copacetic_leadin] # leadin string, defined elsewhere in the script
        # Compute MD5 checksums for both files...
        old_md5sum = hashlib.md5()
        for line in old_lines:
            old_md5sum.update(line.encode()) # .encode() for Python 3; omit if lines are bytes
        old_md5sum = old_md5sum.hexdigest()
        scan_line = "%s\t" % (old_md5sum)
        new_md5sum = hashlib.md5()
        for line in new_lines:
            new_md5sum.update(line.encode())
        new_md5sum = new_md5sum.hexdigest()
        if new_md5sum == old_md5sum:
            return None # same file? then no patch req'd.
        scan_line += "%s\n" % (new_md5sum)
        patch_file.append(scan_line) # Second line: old_md5 new_md5
        ol = len(old_lines); nl = len(new_lines)
        ni = 0
        unique_new_lines = set(new_lines) - set(old_lines)
        while ni < nl: # 2 phases: scan "same" lines, then diff lines
            oi = 0; tally = 0
            while oi < ol and old_lines[oi] != new_lines[ni]:
                oi += 1
            scan_line = "%d\t" % (oi) # Index in "old" file whence to cat some of its lines
            while oi < ol and ni < nl and old_lines[oi] == new_lines[ni]:
                tally += 1; ni += 1; oi += 1
            scan_line += "%d\t" % (tally) # Number of lines to cat from "old" file
            tally = 0; next_ni = ni
            while next_ni < nl and new_lines[next_ni] in unique_new_lines: # was "ni < nl", which could index past the end
                tally += 1; next_ni += 1
            scan_line += "%d\n" % (tally) # Number of lines to cat from "new" file
            patch_file.append(scan_line)
            patch_file.extend(new_lines[ni : next_ni])
            ni = next_ni
        # end while (scan the files, outputting the patch protocol format)
        return patch_file # returns a diff-style patch as a writelines()-compatible list

Concise and transpicuous:

1. Tally runs of lines that already existed in the old file. (Scan phase.)
2. Tally runs of lines that do not exist in the old file. (Diff phase.)
3. Print a patch format that permits ordered reconstitution of the lines.
4. Repeat until the entire new file can be reconstructed from patch + old.

Here, Python's set()s abstract away a tedious series of repetitive scans. Without set or a like data type, I'd have to either hash the "old" file's lines myself (and waste time writing another binary tree) or loop through the old file again and again for each line of the new file. (If lines had been moved about, instead of simply moved apart by interjection, a lockstep scanner would mistakenly skip some and the patch file would be larger.) There is no capacity to patch binary files, but DW still detects when they have changed, and will write a copy into the patch directory.
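To make the "ordered reconstitution" concrete, here is a hypothetical apply step -- my sketch, not part of DiffWalk -- assuming the hunk format described above: an offset into the old file, a count of lines to copy from it, a count of fresh lines, then the fresh lines themselves. The "leadin" and checksum strings in the example patch are placeholders.

```python
def apply_patch(old_lines, patch_file):
    # patch_file[0] is the leadin, patch_file[1] holds the two MD5 sums; hunks follow.
    new_lines = []
    i = 2
    while i < len(patch_file):
        oi, n_old, n_new = (int(f) for f in patch_file[i].split("\t"))
        new_lines.extend(old_lines[oi : oi + n_old])          # cat a run from the old file
        new_lines.extend(patch_file[i + 1 : i + 1 + n_new])   # then the fresh lines
        i += 1 + n_new
    return new_lines

# A hand-worked example: one line, "x", interjected after "a".
old = ["a\n", "b\n", "c\n"]
patch = ["leadin\n", "oldmd5\tnewmd5\n", "0\t1\t1\n", "x\n", "1\t2\t0\n"]
print(apply_patch(old, patch))  # → ['a\n', 'x\n', 'b\n', 'c\n']
```

In a real tool you would also recompute the MD5 of the result and compare it against the second field of patch_file[1] before trusting the reconstruction.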
I assume that changes to binary files are due to transcoding, and that the patch would therefore be just as big as the file -- some kinds of binary files, such as SQL databases, don't behave this way and could be patched in the same manner as I patch text files, but I don't use them. (If you extend the algorithm to databases or executables, don't forget to review the pertinent file formats and open the files in binary mode. :) The rest of the script is a wrapper handling directory traversal and file I/O.

As `info diff` artfully states, "computer users often find occasion to ask how two files differ." The utility of a script like DiffWalk is therefore not limited to patching, but preparing compact update archives is its primary employment on my system. (I still use `diff` for quotidian difference queries, because DW isn't in my $PATH.) Likewise, the automatic collation of updates, such as moved and deleted files, is a pleasant amelioration of the task of finding what's changed in an archive since the last published edition. DiffWalk now handles these tasks for me.

If you'd like a better solution to the "Briefcase Problem" (how to synchronize files across multiple installations with minimal time and fuss), don't forget to stop by the manual pages for "diff", "patch", and "rsync".
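The traversal wrapper itself is not reproduced here, but its job -- deciding which files in the new tree deserve a patch at all -- can be sketched with the standard library. This is my hypothetical sketch, assuming two parallel directory trees old_dir and new_dir, not DiffWalk's actual code:

```python
import filecmp
import os

def changed_files(old_dir, new_dir):
    """Walk new_dir and report relative paths that are new or differ from old_dir."""
    changed = []
    for root, _dirs, files in os.walk(new_dir):
        for name in files:
            new_path = os.path.join(root, name)
            rel = os.path.relpath(new_path, new_dir)
            old_path = os.path.join(old_dir, rel)
            # A file is "changed" if it's absent from the old tree or its bytes differ.
            if not os.path.exists(old_path) or not filecmp.cmp(old_path, new_path, shallow=False):
                changed.append(rel)
    return sorted(changed)
```

Each path this returns would then be handed to diff() (for text) or copied wholesale (for binaries); files present only in the old tree would be noted as deletions by a symmetrical walk in the other direction.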