CS skills in the real world — debugging

This is the first episode of what I hope to be a mini-series in the practical uses of skills learned through computer science.

It’s August of 2008, and you’re playing around on the Internet. Maybe you’re on Miniclip playing flash games. (I mean, I hardly remember what I did on the Internet in 2008. Exponential growth, am I right?) Maybe you’re refreshing your Facebook news feed, since this was back before the days of AJAX updates. Either way, you’re surfing the web, and you see reports that Russia has invaded Georgia. I can’t really imagine why a high school world history class would teach anything about Georgia — so if you’re like me, you’re a bit puzzled as to how/why Russia invaded the U.S.

I'm twelve years old and what is this?

You don’t want to be Jessica B. Jessica B doesn’t know how to debug.

Debugging is a computer science term for discovering and fixing an error in your code. For those in the know, the word debugging conjures up demonic visions of NullPointerExceptions, application traces, and segfaults. Yes, all of us that have taken a systems programming class have spent hours chasing down a single memory leak the night before a homework was due.

However, my assertion is that the hours spent painfully isolating and correcting those reclusive little bugs gives you a valuable skill applicable throughout your entire life.

Proper use of reference materials

When we teach new CS students at Penn how to debug, we emphasize “Google it” more than anything else. Somebody out there has seen and solved your problem before, especially for large and interconnected frameworks like Rails and iOS. (The number of hours I spent last semester figuring out why it was so damn hard to do nontrivial things with a UITableView…unbelievable.)

I specifically remember hearing about the Russia-Georgia War of 2008, seriously wondering what beef Russia had with America, and Googling for Georgia to figure out that Georgia was indeed a Eurasian country that had nothing to do with the Braves and the Falcons.

CS also teaches you about more directed searches. Javadocs, man pages, and Hoogle are all incredible inventions that don’t get praised enough. Ever tried coding non-trivial Haskell without Wifi, and thus without access to Hoogle? You can’t do it.

Predictably, computer scientists understand the value of information. We use this in all walks of life, from scouring through electronics manuals to saving face when unfamiliar with a “common” fact.

Thinking outside of the box

I recently had a problem set due for a class in theoretical regression. One of the problems was to show that Cov(α-hat, β-hat) = -xbar/Sxx. Not a hard problem, but I kept getting that Cov(α-hat, β-hat) = -xbar*σ^2/Sxx, and I had no idea how to get rid of that extra sigma term.

Some students might have fudged the derivation, hoping that the TAs wouldn’t notice the mathematical gap. I personally refused to do this on principle. Other students might have gone to the teacher to ask for help. While I’m a big proponent of asking for help after immersing one’s self in a problem, this was at 1am the night/morning before the problem set was due. So, what would you do here?

Maybe, through my CS experience, I’m familiar with the problem of data being out-of-sync. I’ve got Apple-Shift-R (force-refresh, ignoring your browser’s cache) on muscle memory speed dial. For whatever reason, the first thought that jumped into my head was what if I’m looking at an old version of the problem set?

Sure enough, when I revisited the course website and checked the problem set PDF, the derivation in question had been changed to reflect the answer I was getting.

I couldn’t show that Cov(α-hat, β-hat) equals -xbar/Sxx because Cov(α-hat, β-hat) doesn’t equal -xbar/Sxx.

I wouldn’t have thought that I’d downloaded an incorrect version of the problem set without my background in CS. That was definitely an unorthodox way to figure out my issue.

This parallels computer science debugging in that programmers encounter really weird stuff all of the time. Programming languages and runtime environments are complicated pieces of engineering; not all bugs can be logiced out the standard way. Sometimes, one must be creative to fix a bug.

Inductive reasoning

A difficult, hidden bug is a riveting puzzle to solve, a true exercise in logic. Programmers maintain a mental list of likely reasons for the error and systematically check different symptoms to determine why the computer is not doing what they want.

This process and method of thought is similar to what a lawyer uses in oral argument or what a doctor uses in diagnosing a patient. (Extending the analogy even further, you could easily think of a patient’s sickness as a bug in their system that must be fixed.)

This may be the most powerful point in this blog because the mentality of inductive reasoning is not easily taught. The context of computer science and programming is one of the most effective ways to build true inductive reasoning skills, skills that transfer to almost any other profession.

My friends and acquaintances that have moved into investment banking (and, more generally, high finance) tell me that their computer science skills give them a huge leg up on their coworkers — not because banking necessarily requires coding skills, but because they know how to diagnose and solve an issue. They have the critical thinking skills that their finance and econ major brethren do not. Programming occurs in an open system, whereas one performs finance / econ problem sets in a closed system. Open systems introduce bugs that require these skills.

Consequently, those exposed to programming come armed with real-world abilities that give them a true competitive advantage in the workforce.

Conclusion

If everyone became a software engineer, the world would stop. Accordingly and obviously, this blog is not a call for everybody to enter the field of software engineering. However, you owe it to yourself to study programming — if not to gain a sense of technical literacy, to properly learn how to solve unexpected problems.

Attracting bright students to computer science

I stumbled upon computer science by accident. In a literal quest to 5 as many AP exams as humanely possible, I took AP Computer Science A on Florida Virtual School. Before that class, I had never considered computer science as a major. After that class, I knew it was something I wanted to do.

The absolute upper echelon of students entering university programmed throughout middle and high school. However, there is absolutely no reason why computer science should be a major reserved for these top gunners. My experiences at Penn have shown me that computer science is a field that many people can relate to and succeed in, but a field that most people would not consider studying since they are never exposed to it in high school.

I attended this high school, a school full of bright and creative students bogged down in the procedural world of AP Calculus, AP Physics, and AP Biology. At no point are they ever told about popular and useful majors such as computer science, finance, and statistics. My gut feeling is that most public schools offer this curriculum of traditional APs — sure, useful background knowledge to know, but a syllabus truly a mole wide and a molecule deep. Here, students are never informed of collegiate and career opportunities that fall outside the archetypal AP exams.

My question to the reader is — how do we attract these intelligent students to the field of computer science?

I personally believe in the power of grassroots activism, which is why I took it upon myself to visit my high school and deliver a guest lecture in my old calculus teacher’s 5th period AP Calculus BC class. This group of bright sophomores, juniors, and seniors is both curious enough to learn for fun and capable enough to learn on the fly.

My guest lecture had three structural components: explain what computer science is, show why it is useful, and field further questions from the students.

What is computer science?

When I asked this question to the class, one girl raised her hand and said, “My mom teaches computer science at a local college, C and C++ I think.” I smiled and shook my head. “Those are programming languages. Programming is an important application of computer science, but there’s much more to it than that.” And indeed, there is.

Computer science is the art of taking stupidly easy problems and spending a stupid amount of time figuring out how to solve them.

Obviously, this is a massive generalization, and an incorrect one at that. Extremely important computer science work is done every day on challenging, (NP-)hard problems. However, the initial appeal in computer science is that many of the algorithms taught in a traditional undergraduate sequence are solutions to simple problems. One could come up with these just by playing. That’s the angle I wanted to take with these students.

“For example, let’s say that I have a list of ten numbers.” I wrote down ten arbitrary numbers. “I want you guys to put these numbers in increasing order. Surely an easy task you guys do all the time, in one way or another. Have you ever stopped and asked yourselves how exactly you sorted these?”

A student raised his hand. “I took the smallest number and put it first, then the second-smallest number and put it second, and so on and so forth.” Ah, selection sort, exactly the answer I was hoping for!

I grinned and said, “Great answer.” I proceeded to discuss the number of “steps” we’d have to take to run that algorithm, essentially exploring the intuition behind Big-Oh notation and asymptotic growth.

Why is it useful?

I posed to the class, “Let’s say that we’re Facebook, and we need to spit out a list of your friends in alphabetical order. Can we do this any quicker?” Of course, I wasn’t expecting anyone to answer with “merge sort,” so I led them through how to derive merge sort and how to prove that it is fundamentally quicker than selection sort. For a class that had never heard of an inductive proof before today, I was impressed with how quickly they picked it up. Moreover, they understood the rationale behind quick algorithms – if your website or application takes too long, nobody’s going to use it.

Any further questions?

I ended up fielding a ton of questions that the students had, from how to count in binary to how Anonymous takes down websites. I explained what hackathons were, what good hacking versus bad hacking was, and what some of the hacks I’ve done at hackathons were. At the end of the class period, I definitely think I turned some students onto the idea of computer science. They were able to see its relations to math and its interesting applications.

Conclusion

While I don’t subscribe to the belief that everyone should code, I definitely think that a lot of bright kids that go pre-med or major in a pure science or other engineering discipline could really succeed in computer science, had they known more about it. In an attempt to expose more students to this material, I gave a similar lecture last year on the RSA algorithm and how they can use it to send secret messages. This is academic and practical content that most public schools simply will not ever explore with their top students. I think it’s a travesty.

How else can we encourage bright students to explore computer science and programming? Leave a comment with what you’ve done or what you think.

Fold as a chain of endomorphisms

A few months ago, Tom Ellis showed how to write foldr' as a chain of endomorphisms. The relevant code is pasted below, with a couple of changes I’ve made for clarity’s sake:

compose :: [b -> b] -> b -> b
compose [] = id
compose (f:fs) = f . compose fs

foldr' :: (a -> b -> b) -> b -> [a] -> b
foldr' combine base xs = compose (map combine xs) base

I originally saw Ellis’s blog post on Reddit, tried to qualitatively understand it, shrugged my shoulders, and moved on. I acknowledged that his implementation worked, but I definitely missed the intuition behind it. I’d Googled the term “endomorphism” (essentially a function of type a -> a), but I failed to make the connection between concept and code.

By coincidence, I recently had a short conversation with Andrew Braunstein about Haskell, and the notion of folding came up. He mentioned that the idea of folding could be understood by imagining replacing every (:) in a list with combine.

It was at that point that I put two and two together! First, I’ll show by example. Let’s say that you have a list of integers 1:2:3:4:5:[], and you want to sum them. You can conceptually think about summing the list by replacing every (:) with (+) (and replacing the empty list with the additive identity, zero). You obviously get 1+2+3+4+5+0.

Ellis’s code represents this idea of mapping one function into another, albeit in an abstract way. compose itself takes a list of endomorphisms and a starting value, and it feeds the result of one function into another. Remember, since an endomorphism is a function with the type a -> a (or in the sample code’s case, b -> b), we can easily achieve this chained computation.

All we have to do is map combine over our input list. In the example of summing integers, we’ll have a new list of functions that can be thought of abstractly as 1+ : 2+ : 3+ : 4+ : 5+ : []. Note that each function in the list takes an integer and returns an integer, hence we have a list of endomorphisms.

Finally, when we call compose (map combine xs) 0, we’ll recurse our way down to the end of this list of functions. When we reach the empty list at the end, we’ll return the base case, 0, and feed that into the last function of the list, 5+. Of course, 5 + 0 == 5, so we feed that result into the previous function, 4+, and so on and so forth.

In practice, this is the same thing as replacing each (:) with `combine` and evaluating right to left.

I plan on using this idea, in a less formal way, to teach students new to functional programming how folding works.

Pretty cool stuff, if you ask me!

A round of golf

Last weekend, I played a round of golf with Lewis.

Out of context, this activity isn’t surprising; I’m originally from South Florida, and Lewis is originally from near Coeur d’Alene, Idaho.

However, the savvy reader might wonder exactly how two students living in Philadelphia in early February could play a round of golf. I suppose I should, in that case, clarify — we didn’t play the sport of golf, but instead engaged in code golf.

Code golf is a game in which a programmer tries to fully write a function in as few characters as possible. Comments don’t count toward length, nor does whitespace. (Brevity and readability are two entirely different concepts.) Lewis had a code golf homework assignment for CIS 194, Penn’s introductory Haskell class, and he invited me to code alongside him.

I tend to be a pretty conservative Haskell programer; my top priority is writing understandable code. I’m not personally a huge fan of pushing complex one-liners into production code; eventually, someone else will read your code, and they shouldn’t have to spend ten minutes figuring out exactly what one line of Haskell does.

However, playing code golf last weekend showed me some of its benefits that I think apply universally to programming.

Refactoring

Nobody breaks par on their first attempt at a challenging golf hole. Likewise, nobody writes a function with optimal brevity on the first try. Therefore, code golf is a useful practice because it forces a programmer to think about efficient code re-writes. Specifically in functional languages like Haskell, refactoring code into more modular, uncoupled parts is a great way to shorten your code.

As an example, let’s examine the Haskell function transpose. It takes a list of lists and runs a 2D matrix transposition on it. Here’s how it’s implemented in Data.List:

transpose []             = []
transpose ([]   : xss)   = transpose xss
transpose ((x:xs) : xss) = (x : [h | (h:_) <- xss]) : transpose (xs : [ t | (_:t) <- xss])

This is pretty verbose, don’t you agree? Lewis thought so. Here’s his first shot at a one-line implementation.

transpose m = foldr (\r n -> map (\(x,y) -> x:y) (zip r n)) (replicate (length $ head m) []) m

Lewis’s line of logic was that he should make a list of empty matrices of the output length — replicate (length $ head m) — and then iterate over the input list, consing one element at a time on to each output list.

He then realized that, by mapping over a call to zip, he had explicitly coded a common higher-order function, zipWith.

transpose m = foldr (zipWith (\x y -> x:y)) (replicate (length $ head m) []) m

Finally, a cryptographer and assistant professor that Lewis knew showed him an even shorter solution by encapsulating the replication and fold into a second map:

transpose m = map (\i -> (map (\v -> v !! i)) m) [0..((length $ head m) - 1)]

The point here is not to belabor the transpose function — clearly this gets a bit hard to understand. However, playing code golf forces you to think about how to refactor your code to significantly shorten it. Refactoring out common design patterns is a much better way to reduce character count in a function than trying to play with parentheses, currying, and flipping. Thus, code golf builds your skills at identifying reusable patterns in code, specifically in a functional language that allows such abstractions to be made.

Test cases

Students don’t write test cases. Put yourselves in the shoes of Billy, a first-year CS student. Billy is a reasonably intelligent 18-year old, and he can code decently well. He also knows that after this week, he’s never going to touch his homework again, so why bother testing the functions? I can’t say I blame him for that. However, not writing test cases is obviously an awful habit to get into.

Code golf forces you to write test cases because you’re rapidly iterating on a function, and you need to ensure its correctness after making drastic changes. Given that code golf involves making relatively radical changes to how you’ve written a function, you need some test cases to verify that your new implementation is correct. It is incredibly difficult to eyeball a code-golfed function and determine its correctness. Hell, I wouldn’t believe that the shortest transpose implementation made any sense had I not tested it myself.

Testing your code is especially satisfying in Haskell, where you can easily use QuickCheck properties to rapidly test hundreds of different inputs for you. This would be a great way to find out that Lewis’s code does not work for asymmetric 2D matrices.

Competition

I am firmly of the belief that the power of the human mind is augmented by a healthy dose of competition. (One could call me an Objectivist.) Humans thrive on competition; this is why we play sports, this is why we found start-ups, and this is why we surround ourselves with intelligent and talented people.

Lewis has more experience with computer science than me, and he can definitely beat me in a game of code golf most days. However, I’ll never pass up the opportunity to play. The competitive aspect of code golf — trying to write a function in fewer characters than anyone else — encourages our creative juices to flow. You won’t sit back and say, “This is good enough”; you’ll iterate and contemplate until you are positive you can’t excise another character from your implementation.

Speaking personally, this precise notion is what enabled me to give Lewis a run for his money in code golf. I wanted to beat him. As a result, I found new heuristics I could use in the future (for example, y > max x z is a cleaner way of saying x < y && y > z). Moreover, when we finally stopped playing and compared our scores and implementations, we were both able to learn from each other’s successes and errors.

Conclusion

Code golf isn’t directly useful in that the code you write isn’t put in real-world production code. Other people have to read your production code, and it’s obviously more efficient for them to read a three-liner in 30 seconds than a one-liner in five minutes.

However, code golf gives you a better eye for patterns of abstraction and encapsulation that a high-level language will allow you to abstract away. Our example of transpose showed incremental abstractions from explicit recursion to folding over a map and zip to folding over a zipWith. We wrote rigorous test cases for our function to provide strong evidence for correctness of code. By making this a game against each other, we mutually learned from each other.

My recommendation — play a round of golf this weekend, no matter which language you pick.

Max and Lewis are sophomores at the University of Pennsylvania. Lewis would like to add that list comprehensions are a great abstraction to shorten and clarify your code. Follow them both on Twitter @MaxScheiber and @LewisJEllis.

Successful failure at a hackathon — my PennApps story

At the beginning of every semester, the University of Pennsylvania hosts a 40-hour hackathon called PennApps. While I didn’t feel confident enough about my hacking skills as a freshman to compete, I started doing hackathons this year. After winning an award from Yahoo! for best mobile hack at PennApps fall 2012 and finishing in second place overall at hackRU fall 2012 (a hackathon hosted at Rugers), I felt pretty confident about my abilities to come up with a viable technologically challenging product in a limited period of time.

I, along with Jeff, Teddy, and Ashu, wanted to create an iPhone app called Merge that facilitated meeting up with your friends for impromptu events like dinner or lifting. Teddy’s an excellent UI guy, Jeff’s a rockstar iOS developer with strong industry experience, and Ashu also had strong industry experience with Python (which we wrote our backend in). I saw myself as the most versatile coder on the team and a good people manager to boot.

We’d sat down before the hackathon and planned out the logistics of the app. We had wireframes of each screen, we all knew the general and specific details of the idea, and we agreed on the exact set of features we should have.

And yet, by the time that our 1pm demo came around, our app wasn’t finished. We had to demo an incomplete app. Also, we’d wanted to make the top 20 teams, and we’d wanted to win one of several sponsor prizes. In my eyes, we’d failed at reaching all of our goals.

I think that the problems we encountered were universal ones that other people can learn from. Here are what I think these problems were and how we can avoid them in the future.

Time commitment

At a hackathon, every minute counts. You need all the time you can get to fix bugs, polish features, and pitch to sponsors. However, our team was not able to attend all of PennApps. Jeff missed about six hours of hacking due to fraternity commitments, and Ashu missed a couple of hours due to a model UN meeting.

I don’t mean to imply that Jeff and Ashu made the wrong choice, or that they should be shamed, or that they didn’t contribute to the team. Both had conflicts that they could not avoid. However, it is undeniable that the missing man-hours would have really helped us complete our project. For the future, our team will let all clubs and organizations know far ahead of schedule that we will be completely unavailable for that period of time.

Arrogance

On the topic of using every second to its fullest, our team walked into PennApps extremely confident that we’d make the top 20 demos. How couldn’t we, after all? We had a solid team, a stellar idea, and mental toughness. Unfortunately, this confidence became our downfall. I’m convinced that we didn’t work urgently enough until Saturday night / Sunday morning. We didn’t foresee any technological challenges — it’s an iPhone app with a custom API! — so we definitely trod along at a comfortable pace. As a result, when we did run into unexpected problems and kicked up the urgency level, we didn’t have enough time to compensate.

Our mentality should have been that of the underdog, not the pampered white horse. There were over 100 teams competing at PennApps, all of which had rockstar teams, great ideas, and energy drinks. We should not have let our arrogance cloud our vision. Hell, I wrote the exact same thing in the Daily Pennsylvanian about PennApps fall 2012! You must move rapidly and diligently.

Idea generation

In retrospect, I don’t think that our team’s idea was best-suited for a hackathon. It was a solid idea for an actual application in the app store (and we had many people over the weekend tell us that they would use our app if submitted), but hackathons aren’t all about marketability.

Successful hackathon ideas are both technically challenging and impressive to the everyday person. This year’s winner, Inventory, combined the technical challenge of RFID scanners with the layman’s allure of an intelligent backpack. (Hell, the idea was so beautiful that my students at M&TSI last summer designed and executed the same idea, albeit less gracefully.) Last year’s winner, JAM, had these two features as well — the technical challenge of music transcription and an awesome demo that blew the crowd away.

Electioneering, my team’s app at PennApps fall 2012, combined the technical challenge of data mining with the layman’s allure of political comparison. DBIDE, a hack that I made with a different team at hackRU fall 2012, was extremely technically impressive, and that by itself projected us to second place. However, Merge wasn’t anything super impressive technically speaking. Moreover, as Mish Awadah put it, “I personally love the idea, but your problem is that it appeals to technically inclined people like ourselves. It’s going to be hard to capture the general public.”

My teammates and I agreed that a better way to come up with a clutch hackathon idea would be to take a look at the various API prizes being offered, using those as inspiration. Indeed, some of the best hacks were hacks that utilized sponsor APIs. This is because external innovation is much easier than internal innovation. In the business world, this is why large companies acquire start-ups; in the hackathon world, this is why top teams use sponsor APIs to hone and focus their stream of ideas. Every hacker knows a million ways to improve the world; however, since sponsor APIs are tried and true, using them to spark the idea generating engine increases the ratio of strong ideas to weak ones.

As a final point, using a sponsor API lets you spend more time making an impressive hack and less time hand-rolling your own technology from scratch. Remember, time is valuable!

Familiarity with the technology

As implied earlier, Jeff knew only Objective-C, Ashu knew only Python, Teddy knew neither, and I knew some Objective-C and little Python. Admittedly, the Python was not a hindrance for me. Since I’m fluent in Ruby, I was able to pick up Python on the fly with little hassle. (Think of it like a native Spanish speaker learning Italian or Portuguese.) However, while my iOS knowledge helped me debug a couple of things in Jeff’s code, such as consistency in our hand-rolled TimePicker, the brunt of the iOS work was still up to Jeff. He essentially wasn’t able to sleep because we were relying on him to construct the mobile app.

For the future, our team should use technologies that two or more people are fluent in. I would like to see this solely because it would allow for us to cycle in our sleep, thus using all 40 hours to code. Hypothetically, Jeff could have coded the iOS part for 4 hours, then napped for 4 hours while I took over. Ashu and I essentially did this for our Python API, and it worked very well.

Moreover, we ran into some serious issues integrating Facebook into our iPhone app. Despite the fact that Jeff and I had both used the Facebook SDK before and we had both acknowledged that it was hard to use, neither of us had thought to practice with it before PennApps. As a result of our collective short-sightedness, we lost precious hours trying to make things work.

Conclusions

I love my team and everyone on it. I look forward to working with them again; they’re all great, talented guys that bring something different to the table. However, I think that we suffered from confirmation bias as a result of our Electioneering hack. We didn’t think that Electioneering was done well or an especially good idea, so we kind of assumed that we’d just kill it out of the water with Merge, a more marketable idea with a prettier UI. It turns out that we definitely missed some things that hindered our extrinsic success at PennApps.

At the end of the day, though, what’s important is the intrinsic success that we derived. Not only did we encounter obstacles we’d never seen before, but we all learned a LOT about the nature of hackathons and how we can become better programmers as a result. Even though I said that we’d failed at meeting all of our goals for PennApps, that doesn’t mean we failed. I suppose that, to quote Jim Lovell of the Apollo 13 mission, this was definitely a successful failure of a weekend.

Constructing a parser combinator in Haskell, part 1

This is my first ever blog post, and I’m definitely looking for constructive feedback on how I can improve in future posts. Feel free to let me know in the comments or on Twitter @MaxScheiber!

The new CS student has had a limited exposure to parsing, usually in a procedural language. Here at the University of Pennsylvania, we have a couple of introductory programming assignments that require this, such as reading in a CSV dictionary file in CIS 120 and simulating an object file in CIS 240.

Ostensibly, students are taught that parsing should take place in a giant loop — for example, in Java:

Reader r = new BufferedReader (new FileReader ("file.txt"));
for (int c = r.read (); c != -1; c = r.read ()) {
  // do stuff
}

This ingrains into students’ minds that the “right” way to parse files is one token at a time in a giant loop. However, this encourages code repetition and does not really provide a good level of abstraction. We spend too much time worrying about things like ASCII values, maintaining boolean flags, and ensuring proper control flow through nested loops and helper methods. In fact, this is why I think performing any kind of nontrivial file IO is one of the ways we write poorly encapsulated spaghetti code that ends up being a giant state machine.

Enter parser combinators. Parser combinators are parsing functions that are, well, combinations of other parsers. You can think of parser combinators as being recursively built; our base case is a parser that accepts one character, and we can build other parsers from that, like a parser for a word.

A functional programming language such as Haskell allows us to write a very clear parser combinator; in fact, this is how the powerful Haskell parsing library Parsec is built.

Let’s construct a basic parser combinator from scratch. Credit where credit is due — I adapted this from Dr. Stephanie Weirich‘s wonderful CIS 552 course at Penn. I originally learned the material from her lectures. I’m hoping to go into more depth with the explanations here.

Core Library

Good functional programming practices dictate that we keep IO as close to the top of our stack as possible, so we will write this library assuming that we have already read some file into a String. Moreover, let us assume that we are turning said String into some formatted type a. Said a can be anything from a String to an algebraic datatype of our choosing (which we will explore in a future post). Finally, let us assume that there may be multiple possible parses — i.e. we are parsing nondeterministically.

We can encapsulate all of this into a type definition!

newtype Parser a = P (String -> [ (a, String) ])

By our definition, a Parser is a function that takes a String and returns a list of all possible parses. A parse itself is simply our formatted type, plus whatever our parser hasn’t touched yet. Keep in mind that our nondeterministic approach allows us to easily represent a failed parse — the empty list! (If we were to write a deterministic parser, we would have to use an alternative method of representing a failed parsing attempt, such as Maybe or Either.)

We can’t avoid the basic idea of reading in one character at a time, and we even identified earlier that this is essentially the base case of our parser combinator. If we were writing normal Haskell and wanted to grab one character at a time, we would write code like this:

getChar :: String -> Char
getChar (c:_) = c
getChar [] = error "empty String does not have a head"

(This is obviously no different from the head function on generic Lists!) Since we’re in the context of a Parser, we can grab a character without worrying about failing!

getChar :: Parser Char
getChar = P $ \cs -> case cs of
  (x:xs) -> [ (x, xs) ]
  []     -> []

As you can tell, the getChar Parser takes a character and keeps it. If there’s no possible character to grab, instead of throwing an error, we fail gracefully by returning the empty list.

Now, as implicit in the type definition of Parser, we aren’t actually running anything — a Parser is, internally, a function waiting for a String. We can define a simple runner function to extract the function from the type and apply it to an argument.

parse :: Parser a -> String -> [ (a, String) ]
parse (P p) s = p s

This function is but a simple pattern match on the structure of a Parser. Pretty cool, if you ask me! Given this, we can construct our next core function, an “either” Parser. The premise behind this is that we cannot predict in advance what we are parsing. Take the hypothetical example of parsing code into an abstract syntax tree. We cannot know if we’re parsing, say, an operator versus an atom. As a result, we have to try both parsers! If the first succeeds, we know we’ve parsed an operator. If it fails, we can try the next one safely.

The either Parser, which we will call <|>, is also our first to represent a joining of Parser blocks!

(<|>) :: Parser a -> Parser a -> Parser a
p1 <|> p2 = P $ \cs -> case parse p1 cs of
  [] -> parse p2 cs
  ps -> ps

Our type signature makes sense — <|> takes two Parsers and returns a new Parser that will try the second Parser iff the first fails. The implementation itself makes sense — if the first Parser succeeded, return what it gives us. If it failed, return what the second Parser gives us (which may fail itself).

Now, <|> ran one Parser, looked at its result, and (possibly) ran a second Parser. This fits nicely into the bind paradigm, implying that Parsers are a Monad. Indeed, we can define them as such.

instance Monad Parser where
  p1 >>= fp2 = P $ \cs -> do (a, cs') <- parse p1 cs 
                             parse (fp2 a) cs'
  return x   = P $ \cs -> [ (x, cs) ]
  fail _     = P $ \_ ->  []

bind is always tricky to think about, but as customary with Haskell, we let the types direct us. The type of bind is Parser a -> (a -> Parser b) -> Parser b, and as a reminder, we constructed our Parser as newtype Parser a = P (String -> [ (a, String) ]). The Monad should jump out at us! A parse gives us a formatted result of type a, which we can feed into fp2 to give us a Parser b that we can then run on the remaining String cs'.

Note: we’ve availed ourselves of the List Monad to take care of the nondeterminism; indeed, that is what the List Monad is for! As a result, our very own Parser Monad is itself nondeterministic.

return is a lot easier to reason about and write; to shove an a into a Parser, just pretend that we’ve already parsed it. fail (optional but convenient in Monads) is even easier, since we’ve established earlier that the empty list represents a failed parse.

While we’re at it, we might as well write a Functor on Parsers.

instance Functor Parser where
  fmap f p = p >>= \x -> return (f x)

In other words, we can map over a Parser simply by running a Parser p and then mapping the formatted type with f.

That’s actually all we need to do with our core library! We don’t need to ever open up or pattern match on the structure of a Parser again. That’s, to me, the beauty of parser combinators. The simple notion that we can build large Parsers from small ones automatically gives us another level of abstraction and safety!

Combinators

Our next part of this will be writing some specific combinators simply from the convenient base case of getChar and helper case of <|>.

Using a predicate

We may want to parse if a character satisfies some predicate. For example, if we’re parsing code, we might want to specifically parse a semicolon to indicate the end of a statement, or we might want to parse anything that’s alphanumeric to represent a character of a variable name. We will creatively call this function satisfy, and we can use our new Parser Monad to help us out a ton. The basic idea of how satisfy works is that we will eagerly parse a character and check after the fact to see if it matches what we were expecting.

satisfy :: (Char -> Bool) -> Parser Char
satisfy p = do 
  c <- getChar
  if p c then return c else fail "Did not satisfy boolean predicate"

satisfy builds off of getChar. We can use it to build some useful Char checkers. (All boolean functions are imported from Data.Char.)

alpha, digit, upper, lower, space :: Parser Char
alpha = satisfy isAlpha
digit = satisfy isDigit
upper = satisfy isUpper
lower = satisfy isLower
space = satisfy isSpace

We can specifically check for Char equality with a useful combinator:

char :: Char -> Parser Char
char = satisfy . (==) -- non-FPN — char c = satisfy (== c)

Going from Chars to Strings

How do we make the jump from reading a Char to a String? In procedural-style parsing, we would use concatenation or a StringBuilder. In Haskell, we can rely on Monads!

string :: String -> Parser String
string = mapM char

Again, let the types guide us.

mapM   :: Monad m => (a -> m b) -> [a] -> m [b]
char   :: Char -> Parser Char
string :: String -> Parser String

char fits as the first argument to mapM, where a == b == Char and m == Parser. Thus, the second argument to mapM, an inputted String, returns a Parser String — which is exactly what string tell us!

Intuitively, what we're doing is taking a predicate String and checking each character of the predicate in order — hence, mapping is proper here.

More complex parses of unknown length

Let's say we want to run a specific Parser multiple times. Our gut instinct may be to try this:

manyNaive :: Parser a -> Parser [a]
manyNaive p = do x  <- p
                 xs <- manyNaive p
                 return (x:xs)

However, once we fail, we will not recursively return anything we've already parsed — we will fail! We thus can't parse anything finite, which is nigh-useless. The way to combat this is with mutually-recursive functions.

-- | Parses zero or more instances of p.
many :: Parser a -> Parser [a]
many p = rest p <|> (return [])
                    
-- | Parses one or more instances of p.
rest :: Parser a -> Parser [a]
rest p = do x  <- p
            xs <- many p
            return (x:xs)

many will try to parse once. If that fails, it will return an empty list. This is NOT the same as failing! Remember, return [] will box the empty list INTO a successful parser. On the other hand, rest will DEFINITELY parse once and WILL return a failure otherwise. If it parses once, it will call on many to search for the possibility of more input.

This is a bit tricky, so let's analyze the control flow a bit more. If we run (many digit) on "14", we'll try to parse a digit in rest. We can parse a '1', so we keep that around and call many on the remaining string, "4". We'll parse the '4' in rest, keep it around, and call many on the empty string, which calls back to rest. That returns a FAILURE back to many. By the nature of the <|> function, we recognize said failure and move to the second branch, which simply returns a "successful" parse that has no characters. We then recursively go up the stack of rest calls, which returns our successfully parsed String of digits.

Converting to other formatting types

Keeping everything in Strings limits us as programmers. Let's say we want to parse an integer from our input, and we want to store it as an Int. We can easily do this using rest.

int :: Parser Int
int = do
  s <- string "-" <|> return [] -- picks up the negative sign
  d <- rest digit -- guarantees at least one digit, so no naked negative signs
  return (read (s ++ d) :: Int)

While the read function is normally unsafe to use because of its potential to throw errors, we've safely guaranteed that we will ONLY ever call it on actual digits thanks to our digit combinator. We will fail gracefully with a Parser error before we ever fail from trying to convert non-digit Chars into Ints.

You can also use algebraic datatype constructors to push bare components, such as Strings and Ints, into more complex types. We will explore this when parsing an assembly language.

Trying many potential Parsers

Real-world use cases may have a ton of nondeterminism. (In the next installment, for example, I will be examining a very basic assembly language, which will have several arithmetic and logical operations, all of which must be eagerly guessed.) It will not be uncommon to need to chain together many Parsers with <|>. To save your eyes, we can construct an unlist function to take care of this for us:

unlist :: [Parser a] -> Parser a
unlist = foldr (<|>) (fail "Could not parse")

Conclusion

You have just seen the basics of monadic parsing via combinators in Haskell as a safer, cleaner, and simpler alternative to procedural parsing. For simplicity's sake, we limited this method of parsing to Strings specifically. In a future post, we will expand our Parser to read from an arbitrary holding type, such as Strings, ByteStrings, and lists of lexed tokens. We will also show how to parse a basic assembly language into an algebraic datatype. For now, I hope that this post was able to make very clear how to construct a monadic parser combinator from scratch, how to use it in unique ways, and how functional programming can make parsing easier to understand.