Book review: Python for Kids

Alright y’all, so I’ve been continuing to read intro-to-programming and programming-for-kids books as research for the text I’m writing myself, and this time I actually have a very positive review!

The other day I read through Python for Kids and I really liked it. I have criticisms, but they are very few.

To start with, it’s an incredibly well-formatted book. Maybe that’s an odd thing to gush over, but formatting can make such a big difference in how fast and how effectively you can read a text. I don’t actually know if this book was formatted in LaTeX, but it definitely had the feel of a book put together by the kind of LaTeX nerd who actually knows what all the features in the memoir class do without looking them up. The layout was easy on the eyes, the syntax highlighting was good, and everything just looked very nice. The visual aids were also really good. The color-coded diagram on page 55 showing the connection between blocks and indentation in Python just looked so nice.

On the tone front, I think it struck a really good balance of serious and silly, clear and entertaining. There were really charming drawings breaking up the text, like the one of the dog wearing pants and slippers or the sleeping bear or the helicopter turtles. Sometimes they have almost nothing to do with the code but they’re cute and small and well placed to keep the formatting interesting and easy on the eyes. There were doofy jokes in the code examples that didn’t obscure the point, including gratuitous pandering with the word “butt”. It’s all very cute but never cloying.

As for the content, I thought all of it was very clear and straightforward. With the exception of using methods and functions in an example before explaining what they are, the order in which the concepts were presented was logical and flowed well. I’m also going to cut the author a little slack on that one, because it’s actually really hard to show non-trivial examples without first explaining basically the entire language. That’s a problem I haven’t figured out yet either.

There are some good little asides I was surprised to see, like the note about using pass for dummy methods as you plan out your classes, the importance of modularity in large programs, and simple serialization. The diagram explaining the % operator was also really good and I think requires barely any maths knowledge to understand.
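
To make those two asides concrete, here’s a tiny sketch of my own (not the book’s actual code) showing pass as a placeholder and % as remainder:

    # My own illustrative sketch, not an example from Python for Kids.

    class Robot:
        """A class you're still planning out."""
        def walk(self):
            pass   # placeholder body so the program runs while you design the rest

        def speak(self):
            pass

    # The % (modulo) operator gives the remainder after division:
    print(17 % 5)   # 2, because 17 = 3 * 5 + 2
    print(16 % 2)   # 0, which is how you check that 16 is even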

Like a lot of books aimed at young people, it goes for tried-and-true examples involving Logo-style turtle graphics and making simple games. They’re good, though, and the code is pretty well explained.

I did have a few small criticisms. I was slightly bothered by the book introducing eval and exec without mentioning anything about their safety. Honestly, I didn’t really think they needed to be introduced at all if they were only going to be discussed very briefly and, IIRC, never used again in any of the examples. I also thought the part describing copy and deepcopy could have used a better explanation of how they’re connected to references and memory.
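
For reference, here’s the kind of explanation I was hoping for, as a minimal sketch of my own (not taken from the book): the difference only shows up once you mutate something shared.

    import copy

    # My own minimal sketch of the copy/deepcopy/reference story.
    outer = [[1, 2], [3, 4]]

    alias = outer                 # not a copy at all: a second name for the same list
    shallow = copy.copy(outer)    # new outer list, but it shares the inner lists
    deep = copy.deepcopy(outer)   # new outer list and new inner lists

    outer[0].append(99)

    print(alias[0])    # [1, 2, 99]  -- the alias sees every change
    print(shallow[0])  # [1, 2, 99]  -- the shallow copy shares the inner list
    print(deep[0])     # [1, 2]      -- the deep copy duplicated it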

Some of the code explanations towards the very end felt a little dodgier. The explanation of how the move function works got muddied because the idea that objects have IDs in Tk wasn’t explained until after several examples had already used an ID as an unexplained argument to the function. It also wasn’t clear from the immediate context what the updateIdle function did when it was introduced.

In the grand scheme of things, though, these are really small criticisms and the overall book was really nice.

Book review: Automate the Boring Stuff with Python

Well, continuing our reviews of introductory programming books, let’s move on to Automate the Boring Stuff with Python.

The tl;dr version first: I liked it, but I think it reads a little dry and, for lack of a better phrase, a little “programming as magic”. I’ll try to make my criticism clear.

So first off, this is a very practical, projects-based book, which I always like in a textbook. Two of my favorite programming texts ever were Seibel’s Practical Common Lisp and Hudak’s Haskell School of Expression, both extremely focused on projects. On the other hand, the projects don’t really start being non-trivial and interesting until we get to chapter 11, the chapter on HTML scraping and using Selenium to programmatically traverse websites.

The front half of the book is clear but dry. I was really excited when the author started using flow charts to describe control flow and explained how expressions can be evaluated step by step, the way you’d work through an arithmetic problem. They could have gone further, though, in teaching how to evaluate code in that calculational style. I’ll fully admit that’s my own bias toward operational semantics as a teaching tool, but I was sad to see the reduce-the-expression style dropped so quickly. As for the dryness, I know that after my review of Ruby Wizardry I may sound like Goldilocks looking for the narrative that’s Just Right, but I think there has to be a middle ground between a narrative that’s entertaining-but-obscuring and narrativeless prose that’s straight to the point but hard to focus on because there isn’t enough motivation for the concepts, or enough analogies to break up the code examples. I know that when I write instructional materials for intro programming I tend very much toward the dry-but-clear side of things, and that’s something I’m currently working on.
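
To be concrete about the reduce-the-expression style I mean, here’s a sketch of my own (not an example from the book; the variable is just a placeholder):

    # Evaluating a boolean expression one step at a time, the way you'd
    # work through an arithmetic problem by hand.
    spam = 42
    spam > 10 and spam % 2 == 0
    # step 1:  42 > 10  and  42 % 2 == 0
    # step 2:  True     and  42 % 2 == 0
    # step 3:  True     and  0 == 0
    # step 4:  True     and  True
    # step 5:  True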

The projects are all well thought out, but my main complaint is that it would have been helpful to spend more time explaining how the libraries behind the projects work. They’re cute, they’re interesting, and they’re well motivated, but they’re pretty much all of the form “let’s rely on this feature-rich library to write simple code that does a thing”. Now, that’s not inherently a bad approach to educational projects, but I think it needs to be supplemented by more explanation of how one might build small versions of these libraries. For example, the section that deals with CSV files could have talked about the process of reading the definition of a file format and building a small library that helps you manipulate it. This is kinda what I mean about programming being magical. I know that when I was first learning programming many years ago, I’d get frustrated by these kinds of programs because there was this big conceptual leap from “use these libraries in the way described in the documentation” to actually figuring out what they’re doing under the hood. There was just this big gap between “follow the directions” and feeling like I could, at least in principle, re-engineer the code by understanding what it’s doing. That’s the feeling of programs-as-incantations that I think we should try to avoid when teaching how to code.
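
As a sketch of what I mean, here’s a hypothetical, deliberately naive CSV reader built straight from the format description; it ignores quoting and escaping, and the filename in the comment is made up:

    def read_csv(path, delimiter=","):
        """Return a list of dicts, one per data row, keyed by the header row."""
        with open(path, encoding="utf-8") as f:
            lines = [line.rstrip("\n") for line in f if line.strip()]
        header = lines[0].split(delimiter)
        return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]

    # rows = read_csv("expenses.csv")  # hypothetical file

The point isn’t that this is good code (the csv module exists for a reason); it’s that a learner can see there’s no magic between the format description and the library.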

Domains and topology part I: motivating topology

This is going to be a multi-part, nearly stream-of-consciousness series of posts as I try to understand for myself the topological character of domain theory and figure out intuitive, understandable ways to explain it.

To start, what is point-set (sometimes called general) topology really about? Being evocative, you could say that general topology is the study of all possible shapes: this includes things that we normally think of as shapes, like spheres and cubes, tori and Klein bottles, but also just about anything you can conceive of that involves sets of objects. These shapes are studied not by measurements (that’s more the ken of differential geometry) but rather by the ways in which the “points”, the objects that make up the shape, are related to each other by a very general idea of “closeness” that doesn’t even need numbers and measurements.

To this end, we introduce our basic terms. A topological space is a set, which is going to be all the “points” of our shape, along with a collection of subsets that represents which points are related to each other. This collection of subsets, called the topology on the set, consists of the open sets.

These open sets have a few conditions that they need to satisfy. First, the entire set needs to be an open set, because at the top level every two points are related by being part of the shape. Next, if you take the union of any number of open sets you should always get an open set. This makes sense because it’s a kind of transitive property of being related: if A is related to B and B is related to C, then A is related to C. We also insist that we can always take the intersection of any two open sets and get a new open set. Why? Because if there are two ways of relating two things, A and B, then we can create a more fine-grained distinction relating A and B by the overlap between the two open sets. The final condition is that the set with no elements, the empty set, is also an open set. Why? Well, it’s because the set with no elements is the thing that proves no points are related. That sounds slightly silly, but it’s also true. It’s very similar to the way that the conjunction of no clauses should be true.

This has all been a bit abstract, but we can think of the open sets as describing closeness in the following way: the more open sets two points have in common, the “closer” they are to each other. An imperfect analogy might be addressing. We’re all connected at the level that we live on the same planet. If we share the same country as well as the same planet, we’re even closer. If we share not only the same country and planet but the same state/province, we’re closer still. If we share all the way down to the same address, then we’re very close indeed and, in fact, this is as fine-grained a distinction as our “address topology” can make.

At this point, let’s take this intuition and slightly fuzzy language and turn it into precise mathematics.

A topological space is a set X together with a topology \mathcal{O} on X. A topology on X is a collection \mathcal{O} of subsets of X with the following properties (a small worked example follows the list):

  1. X \in \mathcal{O}
  2. \varnothing \in \mathcal{O}
  3. \mathcal{O} is closed under arbitrary unions
  4. \mathcal{O} is closed under finite intersections
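
To make the definition concrete, here’s a tiny brute-force check of the axioms on a three-point set; the example topology is my own, not from any particular text:

    from itertools import chain, combinations

    # X = {a, b, c} with a candidate collection of open sets (my toy example).
    X = frozenset({"a", "b", "c"})
    opens = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}), X}

    def subcollections(sets):
        """Every sub-collection of the given collection of sets."""
        items = list(sets)
        return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

    def is_topology(X, opens):
        if X not in opens or frozenset() not in opens:
            return False                                 # conditions 1 and 2
        for sub in subcollections(opens):                # condition 3 (all unions; finite here)
            if frozenset().union(*sub) not in opens:
                return False
        for U, V in combinations(opens, 2):              # condition 4 (pairwise intersections)
            if U & V not in opens:
                return False
        return True

    print(is_topology(X, opens))  # True; swapping {"a", "b"} for {"b"} would break closure under unions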

Also, let’s try to explain one thing that may not be obvious: how open sets actually describe shape and what distinctions they can make. So you may have heard the joke that a topologist is a person who dunks their coffee cup in their doughnut. The idea is that at this coarse level of “relatedness” the exact locations of points are meaningless; only their relations matter. Both a sphere and a cube have the property that you can “go through the center” to connect points on either side by open sets. Indeed, there are open sets that relate points all over the sphere and cube to each other without having to travel along the surface. The fact that one has corners but the other is round doesn’t affect anything about how points are related to each other in the topology. On the other hand, a torus (doughnut shape) has a hole that limits the ways points on either side of the hole are allowed to be related to each other. No matter how you stretch and pull, you still have that limitation in place.

The other concepts I want to introduce before we start talking about domains in earnest are the separation axioms and the idea of compactness in topology. We’ll start with compactness first.

Compactness is about covers. A cover of a subset A of X, our topological space, is a set of open sets whose union contains A. Thinking of open sets as “relatedness”, we could say that a cover of a subset A of X is a collection of related points that includes at least every point in A. So for our example of the addressing topology, a cover of the United States would be all of its postal codes.

A subset A of X is called compact when every cover of A has a finite subcover, that is, you can choose only a finite number of open sets from the cover and still cover A. Now, this doesn’t say “there exists a finite cover of A”, because that’s actually a trivial property: X itself is an open set that covers any subset. No, compactness actually means something a little more interesting: if we think of each open set in the cover as a “distinction” that relates some points to each other, then compactness means that no matter how you break A apart into a set of distinctions, you actually only need a finite number of those distinctions to explain the entire set A. Finite sets are trivially compact, even as subsets of an infinite space, by the following algorithm:

  1. Choose any point x \in A, then choose an open set from the cover of A that includes x.
  2. Choose a point that hasn’t been chosen yet, and choose an open set that includes it and add it to the finite cover. If there aren’t any unchosen points in A then stop, otherwise repeat step 2.

This doesn’t choose any kind of minimal finite cover, but it’s finite nonetheless since we have at most a number of open sets equal to the number of points in A.
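
Here’s that argument as a toy code sketch (purely illustrative, my own): for finite A we just pick one covering open set per point.

    def finite_subcover(A, cover):
        """A is a finite set; cover is a list of sets whose union contains A."""
        chosen = []
        for point in A:
            # any open set from the cover that contains this point will do
            chosen.append(next(U for U in cover if point in U))
        return chosen   # at most len(A) sets, possibly with repeats, and they cover A

    A = {1, 2, 3}
    cover = [{0, 1}, {1, 2}, {2, 3, 4}, {5, 6}]
    print(finite_subcover(A, cover))   # e.g. [{0, 1}, {1, 2}, {2, 3, 4}]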

That’s still a bit abstract, so let’s try another analogy: look around you at any object, say the device you’re reading this post on. This device has a finite shape and size. It has definite dimensions of height, width, and depth. An open cover would literally be a collection of (borderless) regions in space that overlap to include the entire object. We can choose a finite subcover with the following pseudo-algorithm:

  1. Start by picking the largest region in the open cover.
  2. Pick the largest region that overlaps with the space not yet covered; if the whole object is covered then stop, otherwise repeat step 2.

Now, it might feel like this algorithm should terminate but you might have some anxiety over the possibility of a Zeno-like situation where you manage to only cover a small enough fraction in each step that it takes an infinite number of regions to cover the entire space. This is guaranteed to not happen by a property described in the Heine-Borel theorem.

A rough intuition for how Heine-Borel saves us is that the fact that your object in question has edges is actually what keeps this infinite regress from happening. Since you need to reach the edge, not just get arbitrarily close to it, there must be some neighborhood that crosses the gap and actually touches it. That’s what keeps this situation from being like Zeno’s paradox in which you never quite get there at any finite step.
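
The standard example that makes the edge point concrete (a textbook fact, not part of the analogy above): cover the open interval (0, 1) by the open sets U_n = (1/n, 1) for n = 2, 3, 4, \ldots. Every point x \in (0, 1) lies in some U_n (just take n > 1/x), but any finite sub-collection only reaches down to (1/N, 1) for the largest N chosen, so points near 0 are always missed and (0, 1) is not compact. Add the endpoints to get [0, 1], which is closed and bounded and hence compact by Heine-Borel, and that cover isn’t even a cover anymore: nothing in it contains 0, so any genuine cover of [0, 1] has to include an open set that actually reaches the edge.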

This post has gotten a little long, so next time we’ll get to the separation axioms and then the Scott topology.

Book Review: Ruby Wizardry

While I’m still cleaning up deeper mathier posts, let’s talk more about books.

So writing about technical topics for non-technical audiences is pretty dang hard. Teaching kids is, I think, the expert level version of this challenge. Not because kids are less intelligent than adults, but simply because kids have a smaller frame of reference and a lower threshold for boredom. This means you have to be extra sharp and concise and interesting to make a case for why something challenging is worth understanding.

A really successful example of good writing for young people is The Way Things Work. I read this book as a kid, and while it was silly and weird and I laughed at all the doofy mammoths as a small child, it was also really, really good at explaining a number of scientific and engineering concepts. It never felt condescending. The silliness was never obtuse. It was very clear and very memorable.

A lot of other books, though, tend to be really uncomfortable in their efforts to try and feel NEAT and COOL. I just always picture the author as Steve Buscemi announcing “How do you do, fellow kids?”. When I was a kid I hated when books or TV shows were condescending or tried to Make Learning Fun in some heavy-handed manner. It drove me up the wall, and I think most people can remember times someone talked to them like they were stupid when they were a kid and recall with perfect clarity how infuriating it was.

All of that preamble is to explain that I mentally chart writing aimed at kids on the Buscemi-Mammoth scale so we can talk about the book Ruby Wizardry.

First, I want to say that I don’t think the book is bad. I understand that this was probably not an easy book to write and for the most part it’s a clear book, which is a victory in its own right.

On the other hand, I do have some criticisms that put it a little closer to Buscemi-holding-a-skateboard than to mammoths playing around illustrated gears.

First, the overall narrative felt distracting. Now, I want to be clear, I’m not against writing a serious book in dialogue form. That may not be a terribly fashionable approach to explanatory writing, but it can work and flow well and really break up the density of a topic. This narrative, though, just had a lot of text that felt extraneous but also not really interesting or entertaining either. I’m not saying this as an adult; I honestly tried to read this book the way I can best guess 8-y/o Clarissa would have read it. I would have been pretty bored by what I thought was really blatant “hey, isn’t this fun!” writing. I think it’s even more egregious given that I just double-checked and the book technically says it’s aimed at 10+, which is older than I thought. There were also Ruby jokes and _why puns that felt like in-jokes, which stick out as a bit random if you don’t get why (no pun intended) they’d talk about chunky bacon. The humorous parts didn’t really add anything to the explanation in a lot of places.

There were also some pedagogical things that weren’t hugely problematic, but a little off. Mostly the order of topic introduction and mentioning topics but deferring any real discussion. For example, the concept of object IDs was introduced a good 30 pages before objects themselves. Blocks, which are Ruby’s anonymous functions, were introduced chapters before functions were introduced. Even a little thing like mentioning that you want to edit code with a text editor, but don’t use a word processor, without explaining what that even means. I know what it means and I’m sure if you’re reading this you know what it means, but someone completely new to programming wouldn’t. Overall it’s a relatively small problem, but geez that’s the kind of thing that used to bother me when I was reading technical books as a kid. I’d come across a concept and think “oh man, I don’t know what this is, did I miss something? Do I need to reread a section? Am I just stupid?”. Alright, most people probably don’t have that dramatic of a self-effacing reaction but, still, it’s a stumbling block that just doesn’t need to be there.

I also thought it was maybe a little odd to have a discussion about using symbols instead of strings for optimization in a book that’s aimed at complete newcomers to programming. I’m not saying it’s not an important point, but it felt digression-y for a book that was just barely describing how loops work.

On the other hand, and I know this might sound like me contradicting myself a little, I did think the chapter on refactoring was actually really good. Mostly because it explained the idea of looking at code and looking at how to improve it by understanding it. That’s a great lesson for a newbie.

Before the final part of this post, I want to make a side rant, a sub-rant if you will. Please, can we as a people stop talking about languages as being interpreted or compiled, and can we please stop talking about “scripts” like they’re different from “programs”? Implementations are compilers or interpreters. The language is merely the language. While the most official implementation of Ruby is an interpreter, people have written Ruby compilers. It’s a silly distinction to try and make and it just confuses the heck out of people on the periphery. Not even that long ago someone tried to start an actual argument with me over the idea that JavaScript is a scripting language, which they believed was inherently different from a programming language. Please. No more.

Now, it probably sounds like I hated this book. I swear I didn’t. If anything, this is half an exercise for me to work out my own thoughts about writing on programming for young people. If I were to summarize my complaints, though, I’d say the core issue is one of motivation. Being “fun” and trying to tell stories doesn’t inherently motivate you to care, nor does it explain why each piece is important, and that’s where I was disappointed. The story of wizards and royalty and jealous Pythonistas just wouldn’t have helped me learn the material or stick with it when I was a kid.

Book Review: Game Theory: An Introduction by Steven Tadelis

I’ve been reading a lot about game theory lately. Partly because I have an interest in games as a way to understand aggregate behavior, but also I just think it’s an interesting subject in its own right.

One of the better books I’ve read so far is Tadelis’s undergrad textbook. It’s not very mathematically difficult, reducing most problems down to very basic calculus or probability calculations without many proofs, but it’s probably one of the better textbooks I’ve read in a long time in terms of motivation. I’ve read pop-math books on game theory that explained the philosophic assumptions with only a fraction of the clarity.

In particular, the first half of the book rather meticulously lays out the different ideas of rationality and equilibrium under consideration and examines simple games in terms of them. It starts from more obvious ideas like “if there is a strictly best possible strategy for all cases, a rational player will choose it”, which only leads to stable strategies in a small number of cases, and refines what counts as rational until you get to better-known equilibrium concepts such as Nash equilibrium and subgame perfect equilibrium.
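
The canonical illustration of that first, most obvious notion is the prisoner’s dilemma. The payoff numbers below are my own (any numbers with the same ordering work), not Tadelis’s specific presentation:

                   Cooperate   Defect
    Cooperate       (3, 3)     (0, 4)
    Defect          (4, 0)     (1, 1)

Defect gives each player a strictly higher payoff than Cooperate no matter what the other player does (4 > 3 against a cooperator, 1 > 0 against a defector), so eliminating strictly dominated strategies already pins the outcome to (Defect, Defect). Most interesting games have no such strategy, which is exactly why the book keeps refining the notion of rationality until it reaches concepts like Nash equilibrium.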

So I’d recommend this book if you have any real interest in game theory at all. It’s a pretty light and breezy text, and the exercises, at least all the ones I did, were structured to lead you toward the lesson they were trying to impart. In terms of time commitment, if you’re comfortable with calculus and probability densities, then each chapter, with most of its exercises, will probably take only about an hour.

Org mode as bullet journal

So bullet journals, if you haven’t heard of them before, have grown into a popular way to keep yourself organized.

The basic idea is simple: you have a bulleted todo list that you write down every day, periodically migrating over the things you still need to do. There are various shorthands that assign different semantics to different bullet marks, such as an open task, a closed task, an event, or a migrated task.

So the most basic workflow is, every night or every morning, making your current list of tasks for the day and modifying the bullets as needed to show state changes. Then, every week or month, you migrate over tasks you still care about while tossing out the ones you’ve decided just aren’t relevant anymore or just won’t get done.

The other part of the story, though, is that people develop all sorts of “modules” and “trackers” for recording additional information. These include habit trackers, monthly calendars, indexes by topic, diary entries, etc. People get really elaborate and creative with their additions to the bullet journal system, as you can see from a quick glance over Pinterest.

Now, I had two basic thoughts when I first binge-read all about bullet journals. First, that I think they’re a cute idea. Second, that I’d never really want to use one.

I like the idea of having a habit of writing down tasks and diary entries and habit tracking in a time indexed system you can read back through, but the biggest problem with a paper system is that you can’t search it. Sometimes I brain dump a bunch of ideas and tasks at once and I won’t necessarily remember when I wrote them down after the fact or which journal they might be in. Now, you might say that indexing is the right solution for this, but wouldn’t searching your documents be even easier?

The other thing is that I like including links in the middle of my text, because keeping to-dos/diary entries/etc. together with all the additional context around them is really useful, especially when combined with org-ref as a citation manager. To state the obvious, you can’t include working links on paper.

So what’s the solution? Well, I love org-mode, which is already searchable and makes including links to files and sites trivial. Org mode also includes the agenda, which lets you see all of your scheduled tasks and automatically handles migration of tasks over time. So what makes org-mode different from an electronic bullet journal? Mostly that org mode files aren’t organized temporally by default in a way that allows you to “flip through” your past writing and accomplishments and ideas. At first I was just using capture templates filing into a date-tree (as described here), but that’s not easy to read back through for old entries. I found it was just barely cumbersome enough that I was using it as a write-only system, which while still somewhat useful isn’t ideal.

As usual for the org/emacs ecosystem, I found a library that did exactly what I needed! Org-journal is an awesome little library that allows you to

  • create files for each day
  • with timestamped entries within each file
  • navigate through them with keyboard shortcuts and a calendar interface

I’ve been having really great luck so far using it like a bullet journal. I can separate out “modules” by tag names, so I can have daily and monthly logs, true journal entries, collections of events happening, etc., and still easily go back and re-read old entries when I want to remind myself of thoughts I’ve had in the past.

On Free Will and AI: A Rant and/or Rave

Free will. I know that its existence and nature have been debated to death, but given the prevalence of pop-science articles about AI and whether evil super-intelligences will spell the doom of humanity, it’s a topic I want to ramble about for a bit.

First off, we need to define what we mean by “free will”. I think the easiest definition is that to have free will means to have the ability to make a deliberate choice between alternatives. If you haven’t spent many insomnia-laden nights worrying about this topic, you might be wondering “okay, that sounds good, why is that controversial?”. Fundamentally, the issue is that physics as we understand it doesn’t leave a lot of room for “choice”. In the classical, Newtonian, view of things the motion of every particle is deterministic. Once you have the initial conditions, you can turn the crank on the equations of motion and get the position and momentum of every particle in the universe at any point in the future. Thus, in classical mechanics, there can’t be any ability to make a choice since everything that happens follows exactly from everything that happened before it.

Now, folks sometimes try to dig into quantum mechanics hoping that it will save us and provide a clue as to how consciousness and choice work. The problem there is that probabilistic physics doesn’t allow choice any more than deterministic physics. There is still no reflective mechanism that allows for a conscious entity to deliberate and choose.

Do we then conclude that free will is an illusion? I don’t think so. I believe we have free will. I think we have the ability to make choices. In part, because a world where we

  • seem to make choices
  • can debate on our ability to make choices
  • are able to think we make choices
  • but don’t actually make choices

seems very, very unlikely. Of course, if I’m wrong then I don’t suppose I have any choice in my beliefs anyway (1). I’m not even necessarily appealing to a supernatural cause for free will. My best guess is that there’s something deeply strange about the universe that we have yet to discover. But who really knows? I certainly don’t. For the rest of this piece, though, I will be assuming that free will somehow exists.

Where does this fit in with AI, though? It’s a long-standing anxiety in science fiction that machines will be sapient just like us and rise up to destroy us, and it’s an anxiety that’s finding its way into tech journalism as AI becomes more and more prevalent and powerful, but is it even possible? I’d argue that we don’t know whether a machine can ever be conscious. If my assumption is correct and we do have free will, then I imagine we’ll need to understand how this choice works physically before we can make a machine that uses it. There are a few potential showstoppers here. We might not find a physical mechanism for free will. Even if we find one, it’s not obvious to me that it would be easily harnessed in a computer-as-we-know-it. As much as I’d love for them to exist, I don’t particularly see sentient machines happening in the near future.

We can’t talk about thinking machines without discussing the Turing test. The Turing test doesn’t make any reference to free will or choice as a requirement for being a thinking machine, just the ability to convince a person, after a five-minute conversation, that they’re speaking with a human (Turing’s prediction was that machines would do this well enough that an average interrogator would have no more than a 70% chance of making the right identification). I think that Turing’s test isn’t necessarily helpful given our current computational power. What I mean is that I agree with the extensional point of view that if a machine isn’t distinguishable in behavior from a person, then there’s no useful distinction made between the machine and us, and thus it’s completely reasonable to call the machine “thinking”. I also believe that Turing didn’t realize how quickly our computational power and storage size would grow. Chatbots have become fairly commonplace at this point, but the facade is rather thin. Even if a program is capable, with enough AIML rules, processing power, and clever heuristics, of convincing a person that they are speaking with another human being in the short term, it doesn’t tell us anything about the ability to make choices.

If our AI cannot make deliberate decisions, should we then be concerned with AI-as-we-know-it running out of control? I don’t think so, at least not any more concerned than we should be about any large and important software system. This may seem like a facile point, but without choice there’s no room for AI to surprise us if we’ve properly proved properties of their behavior. If anything, I think this is just a strong argument that we need to invest further in careful program construction and better theorem-proving and formal-methods technologies, so that we no longer have to pay an overhead of many person-years to analyze a large system.

(1) I’m being facetious, but I thought it would be too much of a digression to discuss compatibilist views of free will with determinism. It’s probably obvious that I disagree with them, but I’m not dismissive of them.

TIS-100

I actually want to talk about a video game today: TIS-100. It’s not terribly new, but it’s new to me and I don’t remember seeing people talk about it very often.

I had never imagined that programming puzzle games could be a genre, but now I want dozens of games like this because of how much fun it’s been.

It’s a great puzzle game because it’s programming assembly for a vaguely made-up architecture of independent communicating nodes that are, for the sake of the puzzle, very limited in their features*. Each node can hold only a small number of instructions and has channels to communicate with its geometric neighbors, channels which block on empty reads or full writes. Each node has two “registers” that can hold an integer, there are basic instructions for register manipulation, and one of the registers is specially used for the control-flow constructs and arithmetic instructions. It’s a very simple but very cute architecture.
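
To give a flavor of that node-and-channel model, here’s a rough Python sketch of my own; it has nothing to do with the game’s actual implementation, it just mimics two nodes joined by a capacity-one channel that blocks on empty reads and full writes:

    import queue
    import threading

    # Capacity-1 channel: put() blocks when full, get() blocks when empty.
    channel = queue.Queue(maxsize=1)

    def upstream_node():
        # Push a stream of values to the neighbor below.
        for n in range(5):
            channel.put(n)        # waits here until the value has been consumed

    def downstream_node():
        # Read each value into a single working "register" and double it.
        for _ in range(5):
            acc = channel.get()   # waits here until a value is available
            print(acc + acc)

    threads = [threading.Thread(target=upstream_node),
               threading.Thread(target=downstream_node)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()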

I’m not done with the game, being almost halfway through with all the puzzles (unless more become available that don’t have a placeholder?), but it’s generally been a lot of fun. The puzzles are well-designed in that, at least so far, each one has had a particular programming concept that you’ve needed to internalize to finish it and the lessons have built on each other in a logical progression.

In fact, I’m enjoying this so much I’ve started a little project on GitHub where I’m working on a simulation of a similar kind of hardware. My goal is to eventually compile higher-level languages down to a configuration of programmed nodes. The main difference is that I’ve removed the instruction limit, because I feel like it was largely there for the puzzle factor. The intermediate goal is to have a slightly-higher-level language that is basically an abstracted version of the assembly code for individual nodes, where rather than geometry there’s just a general notion of naming entities in your file and letting them communicate; then with a little graph munging it can either be compiled down to a solution of connected nodes or throw an error because it would require a non-planar graph. That’s at least my current thought; we’ll see when I get around to actually trying it.

*Honestly, it reminds me a lot of how I’ve heard folks describe data flow in GPUs for general-purpose programming, but I know so little about that topic that I don’t know if that’s an accurate assessment.

A New Year

So my medical leave is done, I’m getting back to grad school, and it’s the start of a new year.

It was a really good year for me. I got a lot of rest. My health improved. I’ve even been medicating my mental illness more, which in the past few months means that I’ve obtained a rather healthy GitHub streak (115 days and counting at the time of writing). I spent a few months writing curriculum for a code school. I’ve made progress on several long-term writing projects, including a book of essays and an intro-to-computer-science book, as well as picking away a little at my “working while mentally ill” project, and I’ve managed to work on a lot of little side projects, including my rough sketches of what a truly privacy-oriented social network should look like.

I also finally gave a talk at the Open Source Bridge conference and got a magazine article published, both of which were about explaining the fascinating mathematical limits, and open questions, of computation to a general audience. There are a lot of pragmatic, useful things that are good to know: that computation exists abstracted away from the particulars of any implementation, that there are hard theoretical limits to what a program can do, etc. But one of the completely impractical things I love to bring up is that the relationship between computability and physics has some very weird and fascinating implications for 3D printing. Essentially, if the laws of physics aren’t computable, then there are clear limits to the complexity of a thing that can be constructed in a 3D printer. If the laws of physics are computable, however, then in principle 3D printing has no limits beyond physics itself: any physical system could be created programmatically. Now, this is possibly of more interest for science fiction than engineering any time in the next century, but it’s engaging nonetheless.

One of the big things that happened over this medical leave was realizing just how bad off I had been before. I could fake functionality for brief periods, but often I was having trouble with reading and listening comprehension and with keeping anything in my head for more than a few seconds. A couple of months into this year off I realized that (shock, gasp) I could actually concentrate. I could sit down with a book and read without having to reread the same sentence five times to actually digest it. I could work on something for an hour or so and actually think about it. I could even work on something for several days in a row without forgetting everything I’d been thinking about in the first place. I’ve been wrung out and ill for practically my entire life, so this whole “I can function like a healthy person” thing has just been a revelation. Medicating my panic and my “reality and I aren’t on speaking terms” days has contributed significantly to this discovery. My worst mental health episodes now last for maybe an hour or two at a time instead of days or weeks of being unable to function. It’s still not fun, mind you, but at least I don’t feel like I’m a bad dice-roll away from losing a month or two of my life at any given moment. It’s just a major improvement to be able to have a bad evening or a bad day instead of a bad [arbitrary length of time].

The next two terms are going to be incredibly busy. I’m both getting back to teaching courses and actually giving my thesis proposal. I’m not going to lie, I’m probably way more terrified of my thesis proposal than I need to be for someone who already has a clear outline of (1) what I’ve already done, (2) what I want to do, (3) how I’m going to do it, and (4) how it contributes to my field. I keep feeling like I practically need to have my dissertation done before I even give my proposal. Part of making my proposal this spring is that I’m actually rebuilding some of my old unpublished work on higher dimensional type theory (now that I can do something about the panic attacks it can give me). I already started doing that this last month and it’s been going great. I think I’m actually going to get these papers out there, finally finally finally. This is work I haven’t been able to touch in over three years because I was working on it around the time I was assaulted in 2012, the last real time I was productive until this year off. The panic attacks have come every time I’ve tried to reconstruct all of it because it pushes bad buttons associated with how suicidal and scared and messed up I was at that time, because brains are stupid.

In regards to this blog, I’m going to try and have a Tue/Thu publishing schedule. I’m going to avoid making the same mistakes as before and just publish something on my scheduled days, even if it’s not as in-depth as I’d like. To that end, by the time this post is published I’ve already built up a bit of a buffer in case I have some extra busy weeks.

Virtues and Type Theory

So for today I thought I’d write a short post reviewing the paper The Seven Virtues of Simple Type Theory (Farmer 2008). I’d read it before, perhaps a year ago, but decided to reread it this week. On a fundamental level, this paper is an advertisement for Higher Order Logic/Simple Type Theory as a logical framework worth using. Now, to be clear, it’s not proposing that HOL is the one true logic or anything that grandiose, but rather makes a case for it being taught as part of standard curricula by expounding its merits.

So what are those merits? Perhaps my only complaint about the writing of this paper is that the “seven virtues” aren’t laid out clearly at the beginning but are interleaved throughout. For ease of reference, then, they are:

  1. simple and uniform syntax
    • propositions are just expressions of type *
    • there are very few syntactic constructs needed
      • abstraction
      • application
      • equality
      • definite choice (i.e. an operator that chooses the unique element that satisfies a proposition, if it exists)
  2. it has a fairly simple semantics because the syntax is very sparse
    • the type of propositions is the two point set of true and false
    • the base types are modeled by sets
    • function types are just the total functions between sets
    • the lambda terms are interpreted the standard way
    • equality is interpreted as equality
    • definite choice is the only slightly odd one, but is interpreted as an error value if there isn’t a unique element satisfying the proposition
  3. HOL is extremely expressive
    • quantifiers and the like can all be built up from those base elements of syntax
    • anything that can be expressed in an n-th order logic can be embedded into HOL
  4. HOL is categorical (i.e. uniquely “pins down” to a single model) for theories such as Peano Arithmetic
  5. it has a simple but very powerful proof system
    • six axioms
    • one inference rule, which is effectively simplification via equality
  6. Henkin semantics!
    • a different notion of models where the “function spaces” aren’t all the total functions
    • Henkin semantics provides many more models than the standard semantics
    • so many that we get completeness back for our proof system!
    • so many that HOL becomes fancy syntax for 1st order logic, in this perspective
  7. it has a number of neat extensions for data types, partial functions, subtypes, etc.

and that’s more or less a summary of the paper.

I really enjoyed reading this, though, as it’s very well written, easy to read, and summarizes quite a few things that I hadn’t really seen presented quite as cleanly. For example, the reduction of all the normal quantifiers and logical connectives to just the lambda calculus + equality operator and definite choice was the clearest I’d ever seen it presented. I thought the presentation in Lambek and Scott’s book (Lambek and Scott 1988) was good, but I thought this paper made it even more obvious with the simple progression at the beginning of section 4.
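
For the curious, the flavor of that reduction (recalled from memory in generic HOL style, so not necessarily Farmer’s exact notation) is that everything bottoms out in \lambda and =:

    T                    :=  (\lambda x.\, x) = (\lambda x.\, x)
    \forall x.\, P\, x   :=  (\lambda x.\, P\, x) = (\lambda x.\, T)
    F                    :=  \forall x.\, x
    \neg p               :=  p = F
    p \wedge q           :=  (\lambda f.\, f\, p\, q) = (\lambda f.\, f\, T\, T)
    p \Rightarrow q      :=  (p \wedge q) = p

Existential quantification then comes by dualizing: \exists x.\, P\, x := \neg \forall x.\, \neg P\, x.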

This paper was also the first time I’d heard of Henkin semantics, and while strictly speaking I “get it”, on a philosophic level the idea fundamentally disturbed me. I think I still have a nasty streak of platonism that I haven’t managed to expel, because the idea that there could be two very different classes of models that I’m “proving theorems about” felt very strange. It was a reminder that notions such as “completeness” are, in a sense, loosely connected to the real formal system. The logic + proof theory is capable of proving theorems. These theorems are, by our intended models, connected to “real” mathematics (whatever that’s supposed to mean). On the other hand, there are biases in our intentions, as discussed by Girard in papers such as Normativity in Logic (Girard 2012). I suppose the more I think about this and about things such as the Löwenheim-Skolem theorem in first-order logic, the more I get nervous about what the connection between our formal systems and mathematics even is.

Yes, I know, there are many practical matters where such concerns are unwarranted, but this is the part of me that I think still acts like a theologian even after all these years: constantly worried about what the ultimate truth of things is and doubting my certainty in anything.

References

Farmer, William M. 2008. “The Seven Virtues of Simple Type Theory.” Journal of Applied Logic 6 (3). Elsevier: 267–86.

Girard, Jean-Yves. 2012. Normativity in Logic. Springer.

Lambek, Joachim, and Philip J Scott. 1988. Introduction to Higher-Order Categorical Logic. Vol. 7. Cambridge University Press.