Reflections on 18-845 Internet Services

18-845 Internet Services is one of my favorite courses at CMU, ever. It's primarily a reading and discussion group for interesting papers in computer systems for networked services, taught by Dave O'Hallaron (who literally wrote the textbook on computer systems). There's also some coding, with an individual project on scaling a webserver and a more open-ended group project on whatever in the field interests you - I wrote about my group project experience in a separate post.

Reading papers

It turns out that reading papers is not really hard! This was news to me before this course. One process that really helped me understand the papers was writing critiques, which was required for the class. We were expected to have four parts to our critique, and limit the whole critique to a single page:

  1. A summary of the paper
  2. A few concrete things we liked about the paper
  3. A few concrete things we disliked
  4. Research questions and points for discussion

Having all of these parts really shifted my perspective on reading a paper to be more... critical. In particular, generating dislikes and research questions requires you to question the paper - why did they choose this method of evaluation? What about X concern or performance under Y conditions? Often in an academic (and by that I mean class/non-research) setting, I find myself reading uncritically and accepting claims at face value. The critique format forces you out of that mindset.

The one-page limit is also helpful, because to hit it you'll need to limit the summary of the paper to just the most important ideas. For example, the GFS paper is full of cool implementation details: you could easily fill a page just by describing how a write is replicated across chunkservers in GFS. But besides replication (and how it enables use of commodity hardware despite high failure rates), there are other fundamentally important ideas, like:

  • A centralized server can avoid becoming a bottleneck (in terms of network bandwidth) if it only handles metadata
  • Knowledge about your workload gives you a lot of room to optimize:
    • You can prioritize throughput over latency for a search index
    • You can provide a non-standard append operation, to allow clients who don't care about exact position to append concurrently without locking.
    • Since you control both GFS and all of its applications, you can shuffle complexity between the systems; for example, at-least-once appends may cause duplicates, and this complexity is handled in the application level rather than in GFS.
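To make that last point concrete, here is a minimal sketch of what "handle duplicates in the application" could look like. It is not the GFS API: `FlakyAppendLog`, `append_with_retry`, and `read_deduplicated` are hypothetical names for a toy in-memory log, and the only idea borrowed from the paper is that writers tag each record with a unique identifier so readers can filter out repeats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str  # unique ID chosen by the writer (assumption of this sketch)
    payload: str


class FlakyAppendLog:
    """Toy stand-in for an at-least-once append log (hypothetical, not GFS)."""

    def __init__(self, fail_acks):
        self.records = []
        # One flag per append call: True means "the write lands, but the ack is lost".
        self._fail_acks = list(fail_acks)

    def append(self, record):
        self.records.append(record)
        if self._fail_acks and self._fail_acks.pop(0):
            raise TimeoutError("ack lost")


def append_with_retry(log, record, max_attempts=5):
    """Retry until an append is acknowledged; this is what creates duplicates."""
    for _ in range(max_attempts):
        try:
            log.append(record)
            return
        except TimeoutError:
            continue
    raise RuntimeError("append was never acknowledged")


def read_deduplicated(log):
    """Application-level reader: keep the first copy of each record ID."""
    seen = set()
    unique = []
    for record in log.records:
        if record.record_id in seen:
            continue
        seen.add(record.record_id)
        unique.append(record)
    return unique


log = FlakyAppendLog(fail_acks=[True, False, False])
append_with_retry(log, Record("a", "first"))   # lands twice: the first ack is lost
append_with_retry(log, Record("b", "second"))  # lands once
print(len(log.records), len(read_deduplicated(log)))
```

The log itself stays simple (it never checks for duplicates), and the cost of that simplicity shows up as a few extra lines in the reader.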

Over the semester, I ended up developing the habit of reading the papers multiple times. First, a quick skim just to get the gist of the paper and draft the summary; then, a more careful re-read to get a deeper understanding and capture more details for the critique. Afterwards, maybe add more re-reads if I still don't understand or didn't get enough material for the critique.

There's an interesting side effect of this critique format combined with the limited number of papers we read in the course. Since there aren't very many of them, most of the papers we read are very good. This meant that for at least a couple of papers, I was going back and forth (often in the evaluation section) looking for something to nitpick, just because I needed to write about something else I disliked.

This course also dovetailed pretty nicely with the Distributed Systems course, which I was TAing for the first time. A good amount of the content overlaps (caching, Paxos, etc.) and when it did, I had the confidence to dive into the original papers to help me answer students' questions that went deeper than the lecture slides.

Discussion groups

I also learned a bit about myself during the course. At the beginning of the course, I paid less attention to what people said if their spoken English was not native-level. So I listened more to the native English speakers, who were mostly CMU undergrads or fifth-years, than to the grad students, who had often done undergrad in a different country. I speculate that this was also influenced by comfort levels: people who had been at CMU for much longer were more willing to speak up at the beginning of the semester, and I interpreted their confidence as expertise.

But as the course went on, I realized that I agreed a lot with the people from CMU undergrad out of familiarity alone - "Back in the Distributed Systems class, we did this...", and I would nod along because I also had that experience. Looking back, these callbacks to other courses were probably too vague to be useful to people in the room who hadn't taken them; there wasn't enough context given to really understand if you hadn't done it yourself. Yet if full context were given, it would definitely be boring to the people who had taken the course, so making these references is difficult.

It takes a lot more effort (beyond processing any accent) to listen to what people with different technical backgrounds are saying: those backgrounds assume knowledge that is much less familiar to me - for example, I tend to zone out during any in-depth reference to machine learning. The discussions from this course showed me that it really takes a lot of conscious effort to overcome that barrier of unfamiliarity.

Props to our professor Dave O'Hallaron and our TA Andrew Yang for managing the transition to online classes well. Despite Zoom not being a full substitute for face-to-face discussion, it felt very similar and the discussions continued to be helpful. The most difficult parts: not being able to read the room for reactions (since most people had their video off), and having to remember to unmute to laugh at jokes.


There are lots of cool papers out there! If you're looking for something to read, Adrian Colyer's The Morning Paper is a great resource with summaries so you can get the gist before reading the actual paper. Hacker News also has interesting technical content, sometimes as blog posts rather than papers. The comments are often as interesting as the links themselves. And if you have something cool to share, I'm always looking for new things to read - let me know!

Thanks to Andy Tsai for feedback on this post.

reflection
Bobbie Chen