Algorithms interviews: theory vs. practice

When I ask people at trendy big tech companies why algorithms quizzes are mandatory, the most common answer I get is something like "we have so much scale, we can't afford to have someone accidentally write an O(n^2) algorithm and bring the site down"1. One thing I find funny about this is, even though a decent fraction of the value I've provided for companies has been solving phone-screen level algorithms problems on the job, I can't pass algorithms interviews! When I say that, people often think I mean that I fail half my interviews or something. It's more than half.

When I wrote a draft blog post of my interview experiences, draft readers panned it as too boring and repetitive because I'd failed too many interviews. I should summarize my failures as a table because no one's going to want to read a 10k word blog post that's just a series of failures, they said (which is good advice; I'm working on a version with a table). I’ve done maybe 40-ish "real" software interviews and passed maybe one or two of them (arguably zero)2.

Let's look at a few examples to make it clear what I mean by "phone-screen level algorithms problem", above.

At one big company I worked for, a team wrote a core library that implemented a resizable array for its own purposes. On each resize that overflowed the array's backing store, the implementation added a constant number of elements and then copied the old array to the newly allocated, slightly larger, array. This is a classic example of how not to implement a resizable array since it results in linear time resizing instead of amortized constant time resizing. It's such a classic example that it's often used as the canonical example when demonstrating amortized analysis.
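To make the cost difference concrete, here's a short Python sketch (an illustration, not the company's actual code) that counts how many element copies each growth policy performs when appending items one at a time:

```python
def copies_needed(n, grow):
    """Count element copies performed while appending n items to an
    array that starts empty and resizes its backing store with `grow`."""
    cap, size, copies = 0, 0, 0
    for _ in range(n):
        if size == cap:
            cap = grow(cap)
            copies += size  # copy existing elements into the new store
        size += 1
    return copies

# Adding a constant number of slots per resize does O(n^2) total copy
# work; doubling the capacity does O(n), i.e., amortized constant time
# per append.
additive = copies_needed(100_000, lambda cap: cap + 16)
doubling = copies_needed(100_000, lambda cap: max(1, cap * 2))
```

For 100,000 appends, the additive policy performs thousands of times more copies than the doubling policy, which is exactly the gap between linear-time and amortized-constant-time resizing.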

For people who aren't used to big tech company phone screens, typical phone screens that I've received are one of:

  • an "easy" coding/algorithms question, maybe with a "very easy" warm-up question in front,
  • a series of "very easy" coding/algorithms questions, or
  • a bunch of trivia (rare for generalist roles, but not uncommon for low-level or performance-related roles).

This array implementation problem is considered to be so easy that it falls into the "very easy" category and is either a warm-up for the "real" phone screen question or is bundled up with a bunch of similarly easy questions. And yet, this resizable array was responsible for roughly 1% of all GC pressure across all JVM code at the company (it was the second largest source of allocations across all code) as well as a significant fraction of CPU. Luckily, the resizable array implementation wasn't used as a generic resizable array and it was only instantiated by a semi-special-purpose wrapper, which is what allowed this to "only" be responsible for 1% of all GC pressure at the company. If asked as an interview question, it's overwhelmingly likely that most members of the team would've implemented this correctly in an interview. My fixing this made my employer more money annually than I've made in my life.

That was the second largest source of allocations; the number one source was converting a pair of long values to byte arrays in the same core library. It appears that this was done because someone wrote or copy-pasted a hash function that took a byte array as input, then modified it to take two inputs by taking two byte arrays and operating on them in sequence, which left the hash function interface as (byte[], byte[]). In order to call this function on two longs, they used a handy long to byte[] conversion function in a widely used utility library. That function, in addition to allocating a byte[] and stuffing a long into it, also reverses the endianness of the long (the function appears to have been intended to convert long values to network byte order).

Unfortunately, switching to a more appropriate hash function would've been a major change, so my fix for this was to change the hash function interface to take a pair of longs instead of a pair of byte arrays and have the hash function do the endianness reversal instead of doing it as a separate step (since the hash function was already shuffling bytes around, this didn't create additional work). Removing these unnecessary allocations made my employer more money annually than I've made in my life.
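To illustrate the shape of that fix (the hash here is a made-up stand-in, not the company's actual function), here's a Python sketch of moving from a (byte[], byte[]) interface to one that takes the longs directly and folds the big-endian byte walk into the hash itself:

```python
import struct

def hash_bytes(a: bytes, b: bytes) -> int:
    """Stand-in for the original hash with a (byte[], byte[]) interface."""
    h = 0
    for byte in a + b:
        h = (h * 31 + byte) & 0xFFFFFFFFFFFFFFFF
    return h

def hash_longs_allocating(x: int, y: int) -> int:
    # The problematic pattern: convert each long to a big-endian
    # (network byte order) byte array just to call the hash.
    return hash_bytes(struct.pack(">q", x), struct.pack(">q", y))

def hash_longs(x: int, y: int) -> int:
    # The fix: take the longs directly and fold the byte-order
    # shuffle into the hash itself -- no intermediate arrays.
    h = 0
    for v in (x, y):
        for shift in range(56, -8, -8):  # walk bytes in big-endian order
            h = (h * 31 + ((v >> shift) & 0xFF)) & 0xFFFFFFFFFFFFFFFF
    return h
```

The two versions produce identical hashes, but the second one never materializes the intermediate byte arrays.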

Finding a constant factor speedup isn't technically an algorithms question, but it's also something you see in algorithms interviews. As a follow-up to an algorithms question, I commonly get asked "can you make this faster?" The answer to these often involves doing a simple optimization that results in a constant factor improvement.

A concrete example that I've been asked twice in interviews is: you're storing IDs as ints, but you already have some context in the question that lets you know that the IDs are densely packed, so you can store them as a bitfield instead. The difference between the bitfield interview question and the real-world superfluous array is that the real-world existing solution is so far afield from the expected answer that you probably wouldn’t be asked to find a constant factor speedup. More likely, you would've failed the interview at that point.
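A minimal sketch of the bitfield answer, assuming non-negative IDs that are densely packed under a known upper bound:

```python
class IdBitfield:
    """Membership set for densely packed non-negative integer IDs,
    stored as one bit per possible ID."""

    def __init__(self, max_id):
        self.bits = bytearray(max_id // 8 + 1)

    def add(self, id_):
        self.bits[id_ // 8] |= 1 << (id_ % 8)

    def __contains__(self, id_):
        return bool(self.bits[id_ // 8] & (1 << (id_ % 8)))

ids = IdBitfield(max_id=1_000_000)
ids.add(42)
```

One bit per possible ID works out to roughly 125 KB for a million IDs, versus the tens of megabytes a hash set of boxed integers can consume.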

To pick an example from another company, the configuration for BitFunnel, a search index used in Bing, is another example of an interview-level algorithms question3.

The full context necessary to describe the solution is a bit much for this blog post, but basically, there's a set of bloom filters that needs to be configured. One way to do this (which I'm told was being done) is to write a black-box optimization function that uses gradient descent to try to find an optimal solution. I'm told this always produced configurations with strange properties, and the resulting non-idealities were worked around by making the backing bloom filters less dense, i.e. throwing more resources (and therefore money) at the problem.

To create a more optimized solution, you can observe that the fundamental operation in BitFunnel is equivalent to multiplying probabilities together, so, for any particular configuration, you can just multiply some probabilities together to determine how a configuration will perform. Since the configuration space isn't all that large, you can then put this inside a few for loops and iterate over the space of possible configurations and then pick out the best set of configurations. This isn't quite right because multiplying probabilities assumes a kind of independence that doesn't hold in reality, but that seems to work ok for the same reason that naive Bayesian spam filtering worked pretty well when it was introduced even though it incorrectly assumes the probability of any two words appearing in an email are independent. And if you want the full solution, you can work out the non-independent details, although that's probably beyond the scope of an interview.
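BitFunnel's actual cost model is more involved than this, but as a sketch of the "multiply probabilities inside a few for loops" approach, here's a brute-force search over bloom filter parameters using the textbook false-positive approximation, which makes the same independence assumption discussed above:

```python
import math
from itertools import product

def false_positive_rate(m_bits, k_hashes, n_items):
    # Textbook approximation: each of the k probed bits is set with
    # probability 1 - e^(-kn/m), and the probes are treated as independent.
    return (1 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

def best_config(n_items, max_bits):
    # The configuration space is small enough to enumerate directly
    # instead of running a black-box optimizer over it.
    best = None
    for m, k in product(range(max_bits // 4, max_bits + 1, max_bits // 4),
                        range(1, 9)):
        rate = false_positive_rate(m, k, n_items)
        if best is None or rate < best[0]:
            best = (rate, m, k)
    return best

rate, m, k = best_config(n_items=1000, max_bits=16000)
```

For 1,000 items and a 16,000-bit budget, the search picks the largest filter and the most hash functions in its range, for a false-positive rate under 0.1%.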

Those are just three examples that came to mind; I run into this kind of thing all the time and could come up with tens of examples off the top of my head, perhaps more than a hundred if I sat down and tried to list every example I've worked on, and certainly more than a hundred if I list examples I know of that someone else (or no one) has worked on. Both the examples in this post and the ones I haven't included have these properties:

  • The example could be phrased as an interview question
  • If phrased as an interview question, you'd expect most (and probably all) people on the relevant team to get the right answer in the timeframe of an interview
  • The cost savings from fixing the example is worth more annually than my lifetime earnings to date
  • The example persisted for long enough that it's reasonable to assume that it wouldn't have been discovered otherwise

At the start of this post, we noted that people at big tech companies commonly claim that they have to do algorithms interviews since it's so costly to have inefficiencies at scale. My experience is that these examples are legion at every company I've worked for that does algorithms interviews. Trying to get people to solve algorithms problems on the job by asking algorithms questions in interviews doesn't work.

One reason is that even though big companies try to make sure that the people they hire can solve algorithms puzzles, they also incentivize many or most developers to avoid deploying that kind of reasoning to make money.

Of the three solutions for the examples above, two are in production and one isn't. That's about my normal hit rate if I go to a random team with a diff and don't persistently follow up (as opposed to a team that I have reason to believe will be receptive, or a team that's asked for help, or if I keep pestering a team until the fix gets taken).

If you're very cynical, you could argue that it's surprising the success rate is that high. If I go to a random team, it's overwhelmingly likely that efficiency is in neither the team's objectives nor their org's objectives. The company is likely to have spent a decent amount of effort incentivizing teams to hit their objectives -- what's the point of having objectives otherwise? Accepting my diff will require them to test, integrate, and deploy the change and will create risk (because all deployments have non-zero risk). Basically, I'm asking teams to do some work and take on some risk to do something that's worthless to them. Despite incentives, people will usually take the diff, but they're not very likely to spend a lot of their own spare time trying to find efficiency improvements (and their normal work time will be spent on things that are aligned with the team's objectives)4.

Hypothetically, let's say a company didn't try to ensure that its developers could pass algorithms quizzes but did incentivize developers to use relatively efficient algorithms. I don't think any of the three examples above could have survived, undiscovered, for years, nor could they have remained unfixed. Some hypothetical developer working at a company where people profile their code would likely have looked at the hottest items in the profile for the most computationally intensive library at the company. The "trick" for the first two examples isn't any kind of algorithms wizardry, it's just looking at the profile at all, which is something incentives can fix. The third example is less inevitable since there isn't a standard tool that will tell you to look at the problem. It would also be easy to try to spin the result as some kind of wizardry -- that example formed the core part of a paper that won "best paper award" at the top conference in its field (IR), but the reality is that the "trick" was applying high school math, which means the real trick was having enough time to look for places where high school math might be applicable.

I actually worked at a company that used the strategy of "don't ask algorithms questions in interviews, but do incentivize things that are globally good for the company". During my time there, I found only a single fix that nearly met the criteria for the examples above (if the company had more scale, it would've met all of the criteria, but due to the company's size, increases in efficiency were worth much less than at big companies -- much more than I was making at the time, but the annual return was still less than my total lifetime earnings to date).

I think the main reason that I only found one near-example is that enough people viewed making the company better as their job, so straightforward high-value fixes tended not to exist because systems were usually designed such that they didn't really have easy-to-spot improvements in the first place. In the rare instances where that wasn't the case, there were enough people who were trying to do the right thing for the company (instead of being forced into obeying local incentives that are quite different from what's globally beneficial to the company) that someone else was probably going to fix the issue before I ever ran into it.

The algorithms/coding part of that company's interview (initial screen plus onsite combined) was easier than the phone screen at major tech companies and we basically didn't do a system design interview.

For a while, we tried an algorithmic onsite interview question that was on the hard side but in the normal range of what you might see in a BigCo phone screen (but still easier than you'd expect to see at an onsite interview). We stopped asking the question because every new grad we interviewed failed the question (we didn't give experienced candidates that kind of question). We simply weren't prestigious enough to get candidates who can easily answer those questions, so it was impossible to hire using the same trendy hiring filters that everybody else had. In contemporary discussions on interviews, what we did is often called "lowering the bar", but it's unclear to me why we should care how high of a bar someone can jump over when little (and in some cases none) of the job they're being hired to do involves jumping over bars. And, in the cases where you do want them to jump over bars, they're maybe 2" high and can easily be walked over.

When measured on actual productivity, that was the most productive company I've worked for. I believe the reasons for that are cultural and too complex to fully explore in this post, but I think it helped that we didn't filter out perfectly good candidates with algorithms quizzes and assumed people could pick that stuff up on the job if we had a culture of people generally doing the right thing instead of focusing on local objectives.

If other companies want people to solve interview-level algorithms problems on the job perhaps they could try incentivizing people to solve algorithms problems (when relevant). That could be done in addition to or even instead of filtering for people who can whiteboard algorithms problems.

Appendix: how did we get here?

Way back in the day, interviews often involved "trivia" questions. Modern versions of these might look like the following:

  • What's MSI? MESI? MOESI? MESIF? What's the advantage of MESIF over MOESI?
  • What happens when you throw in a destructor? What if it's C++11? What if a sub-object's destructor that's being called by a top-level destructor throws, which other sub-object destructors will execute? What if you throw during stack unwinding? Under what circumstances would that not cause std::terminate to get called?

I heard about this practice back when I was in school and even saw it with some "old school" companies. This was back when Microsoft was the biggest game in town and people who wanted to copy a successful company were likely to copy Microsoft. The most widely read programming blogger around (Joel Spolsky) was telling people they needed to adopt software practice X because Microsoft was doing it and they couldn't compete without adopting the same practices. For example, in one of the most influential programming blog posts of the era, Joel Spolsky advocates for what he called the Joel test in part by saying that you have to do these things to keep up with companies like Microsoft:

A score of 12 is perfect, 11 is tolerable, but 10 or lower and you’ve got serious problems. The truth is that most software organizations are running with a score of 2 or 3, and they need serious help, because companies like Microsoft run at 12 full-time.

At the time, popular lore was that Microsoft asked people questions like the following (and I was actually asked one of these brainteasers during my own interview with Microsoft around 2001, along with precisely zero algorithms or coding questions):

  • how would you escape from a blender if you were half an inch tall?
  • why are manhole covers round?
  • a windowless room has 3 lights, each of which is controlled by a switch outside of the room. You are outside the room. You can only enter the room once. How can you determine which switch controls which lightbulb?

Since I was interviewing during the era when this change was happening, I got asked plenty of trivia questions as well as plenty of brainteasers (including all of the above brainteasers). Some other questions popular at the time that aren't technically brainteasers were Fermi problems. Another trend at the time was for behavioral interviews, and a number of companies I interviewed with had 100% behavioral interviews with zero technical interviews.

Anyway, back then, people needed a rationalization for copying Microsoft-style interviews. When I asked people why they thought brainteasers or Fermi questions were good, the convenient rationalization people told me was usually that they tell you if a candidate can really think, unlike those silly trivia questions, which only tell you if people have memorized some trivia. What we really need to hire are candidates who can really think!

Looking back, people now realize that this wasn't effective and cargo culting Microsoft's every decision won't make you as successful as Microsoft because Microsoft's success came down to a few key things plus network effects, so copying how they interview can't possibly turn you into Microsoft. Instead, it's going to turn you into a company that interviews like Microsoft but isn't in a position to take advantage of the network effects that Microsoft was able to take advantage of.

For interviewees, the process with brainteasers was basically as it is now with algorithms questions, except that you'd review How Would You Move Mount Fuji before interviews instead of Cracking the Coding Interview to pick up a bunch of brainteaser knowledge that you'll never use on the job instead of algorithms knowledge you'll never use on the job.

Back then, interviewers would learn about questions specifically from interview prep books like "How Would You Move Mount Fuji?" and then ask them of candidates who'd learned the answers from the same books. When I talk to people who are ten years younger than me, they think this is ridiculous -- those questions obviously have nothing to do with the job, and being able to answer them well is much more strongly correlated with having done some interview prep than with being competent at the job. Hillel Wayne has discussed how people come up with interview questions today (and I've also seen it firsthand at a few different companies) and, outside of groups that are testing for knowledge that's considered specialized, it doesn't seem all that different today.

At this point, we've gone through a few decades of programming interview fads, each one of which looks ridiculous in retrospect. Either we've finally found the real secret to interviewing effectively and have reasoned our way past whatever roadblocks were causing everybody in the past to use obviously bogus fad interview techniques, or we're in the middle of another fad, one which will seem equally ridiculous to people looking back a decade or two from now.

Without knowing anything about the effectiveness of interviews, at a meta level, since the way people get interview techniques is the same (crib the high-level technique from the most prestigious company around), I think it would be pretty surprising if this wasn't a fad. I would be less surprised to discover that current techniques were not a fad if people were doing or referring to empirical research or had independently discovered what works.

Inspired by a comment by Wesley Aptekar-Cassels, the last time I was looking for work, I asked some people how they checked the effectiveness of their interview process and how they tried to reduce bias in their process. The answers I got (grouped together when similar, in decreasing order of frequency) were:

  • Huh? We don't do that and/or why would we do that?
  • We don't really know if our process is effective
  • I/we just know that it works
  • I/we aren't biased
  • I/we would notice bias in the process if it existed
  • Someone looked into it and/or did a study, but no one who tells me this can ever tell me anything concrete about how it was looked into or what the study's methodology was

Appendix: training

As with most real world problems, when trying to figure out why seven, eight, or even nine figure per year interview-level algorithms bugs are lying around waiting to be fixed, there isn't a single "root cause" you can point to. Instead, there's a kind of hedgehog defense of misaligned incentives. Another part of this is that training is woefully underappreciated.

We've discussed that, at all but one company I've worked for, there are incentive systems in place that cause developers to feel like they shouldn't spend time looking at efficiency gains even when a simple calculation shows that there are tens or hundreds of millions of dollars in waste that could easily be fixed. And then because this isn't incentivized, developers tend to not have experience doing this kind of thing, making it unfamiliar, which makes it feel harder than it is. So even when a day of work could return $1m/yr in savings or profit (quite common at large companies, in my experience), people don't realize that it's only a day of work and could be done with only a small compromise to velocity. One way to solve this latter problem is with training, but that's even harder to get credit for than efficiency gains that aren't in your objectives!

Just for example, I once wrote a moderate length tutorial (4500 words, shorter than this post by word count, though probably longer if you add images) on how to find various inefficiencies (how to use an allocation or CPU time profiler, how to do service-specific GC tuning for the GCs we use, how to use some tooling I built that will automatically find inefficiencies in your JVM or container configs, etc., basically things that are simple and often high impact that it's easy to write a runbook for; if you're at Twitter, you can read this at http://go/easy-perf). I've had a couple people who would've previously come to me for help with an issue tell me that they were able to debug and fix an issue on their own and, secondhand, I heard that a couple other people who I don't know were able to go off and increase the efficiency of their service. I'd be surprised if I've heard about even 10% of cases where this tutorial helped someone, so I'd guess that this has helped tens of engineers, and possibly quite a few more.
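The tutorial itself isn't public, but the "look at the profile at all" step it starts from is cheap to demonstrate. Here's a sketch using Python's built-in profiler; the workload is a hypothetical stand-in, not anything from the tutorial:

```python
import cProfile
import io
import pstats

def hot_path():
    # Hypothetical stand-in for a service's inner loop, with a
    # deliberately wasteful per-iteration string conversion.
    total = 0
    for i in range(50_000):
        total += sum(int(c) for c in str(i))
    return total

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

# Dump the top functions by cumulative time; the point of the advice
# is simply that someone looks at this list.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
```

In a real service the same motion (run the profiler, read the top of the sorted output) is usually all it takes to find the kind of hot spots described earlier in this post.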

If I'd spent a week doing "real" work instead of writing a tutorial, I'd have something concrete, with quantifiable value, that I could easily put into a promo packet or performance review. Instead, I have this nebulous thing that, at best, counts as a bit of "extra credit". I'm not complaining about this in particular -- this is exactly the outcome I expected. But, on average, companies get what they incentivize. If they expect training to come from developers (as opposed to hiring people to produce training materials, which tends to be very poorly funded compared to engineering) but don't value it as much as they value dev work, then there's going to be a shortage of training.

I believe you can also see training under-incentivized in public educational materials due to the relative difficulty of monetizing education and training. If you want to monetize explaining things, there are a few techniques that seem to work very well. If it's something that's directly obviously valuable, selling a video course that's priced "very high" (hundreds or thousands of dollars for a short course) seems to work. Doing corporate training, where companies fly you in to talk to a room of 30 people and you charge $3k per head also works pretty well.

If you want to reach (and potentially help) a lot of people, putting text on the internet and giving it away works pretty well, but monetization for that works poorly. For technical topics, I'm not sure the non-ad-blocking audience is really large enough to monetize via ads (as opposed to a pay wall).

Just for example, Julia Evans can support herself from her zine income, which she's said has brought in roughly $100k/yr for the past two years. Someone who does very well in corporate training can pull that in with a one or two day training course and, from what I've heard of corporate speaking rates, some highly paid tech speakers can pull that in with two engagements. Those are significantly above average rates, especially for speaking engagements, but since we're comparing to Julia Evans, I don't think it's unfair to use an above average rate.

Appendix: misaligned incentive hedgehog defense, part 3

Of the three examples above, I found one on a team where it was clearly worth zero to me to do anything that was actually valuable to the company, and the other two on a team where it was valuable to me to do things that were good for the company, regardless of what they were. In my experience, that's very unusual for a team at a big company, but even on that team, incentive alignment was still quite poor. At one point, after getting a promotion and a raise, I computed the ratio of the amount of money my changes made the company vs. my raise and found that my raise was 0.03% of the money that I made the company, only counting easily quantifiable and totally indisputable impact to the bottom line. The vast majority of my work was related to tooling that had a difficult to quantify value that I suspect was actually larger than the value of the quantifiable impact, so I probably received well under 0.01% of the marginal value I was providing. And that's really an overestimate of how incentivized I was to do the work -- at the margin, I strongly suspect that anything I did was worth zero to me. After the first $10m/yr or maybe $20m/yr, there's basically no difference in terms of performance reviews, promotions, raises, etc. Because there was no upside to doing work and there's some downside (could get into a political fight, could bring the site down, etc.), the marginal return to me of doing more than "enough" work was probably negative.

Some companies will give very large out-of-band bonuses to people regularly, but that work wasn't for a company that does a lot of that, so there's nothing the company could do to indicate that it valued additional work once someone did "enough" work to get the best possible rating on a performance review. From a mechanism design point of view, the company was basically asking employees to stop working once they did "enough" work for the year.

So even on this team, which was relatively well aligned with the company's success compared to most teams, the company's compensation system imposed a low ceiling on how well the team could be aligned.

This also happened in another way. As is common at a lot of companies, managers were given a team-wide budget for raises that was mainly a function of headcount, which was then doled out to team members in a zero-sum way. Unfortunately for each team member (at least in terms of compensation), the team pretty much only had productive engineers, meaning that no one was going to do particularly well in the zero-sum raise game. The team had very low turnover because people like working with good co-workers, but the company was applying one of the biggest levers it has, compensation, to try to get people to leave the team and join less effective teams.

Because this is such a common setup, I've heard of managers at multiple companies who try to retain people who are harmless but ineffective to try to work around this problem. If you were to ask someone, abstractly, if the company wants to hire and retain people who are ineffective, I suspect they'd tell you no. But insofar as a company can be said to want anything, it wants what it incentivizes.

Thanks to Leah Hanson, Heath Borders, Lifan Zeng, Justin Findlay, Kevin Burke, @chordowl, Peter Alexander, Niels Olson, Kris Shamloo, and Solomon Boulos for comments/corrections/discussion

  1. For one thing, most companies that copy the Google interview don't have that much scale. But even for companies that do, most people don't have jobs where they're designing high-scale algorithms (maybe they did at Google circa 2003, but from what I've seen at three different big tech companies, most people's jobs are pretty light on algorithms work). [return]
  2. Real is in quotes because I've passed a number of interviews for reasons outside of the interview process. Maybe I had a very strong internal recommendation that could override my interview performance, maybe someone read my blog and assumed that I can do reasonable work based on my writing, maybe someone got a backchannel reference from a former co-worker of mine, or maybe someone read some of my open source code and judged me on that instead of a whiteboard coding question (and as far as I know, that last one has only happened once or twice). I'll usually ask why I got a job offer in cases where I pretty clearly failed the technical interview, so I have a collection of these reasons from folks.

    The reason it's arguably zero is that the only software interview where I inarguably got a "real" interview and was coming in cold was at Google, but that only happened because the interviewers that were assigned to me interviewed me for the wrong ladder -- I was interviewing for a hardware position, but I was being interviewed by software folks, so I got what was basically a standard software interview except that one interviewer asked me some questions about state machines and cache coherence (or something like that). After they realized that they'd interviewed me for the wrong ladder, I had a follow-up phone interview from a hardware engineer to make sure I wasn't totally faking having worked at a hardware startup from 2005 to 2013. It's possible that I failed the software part of the interview and was basically hired on the strength of the follow-up phone screen.

    Note that this refers only to software -- I'm actually pretty good at hardware interviews. At this point, I'm pretty out of practice at hardware and would probably need a fair amount of time to ramp up on an actual hardware job, but the interviews are a piece of cake for me. One person who knows me pretty well thinks this is because I "talk like a hardware engineer" and both say things that make hardware folks think I'm legit as well as say things that sound incredibly stupid to most programmers in a way that's more about shibboleths than actual knowledge or skills.

  3. This one is a bit harder than you'd expect to get in a phone screen, but it wouldn't be out of line in an onsite interview (although a friend of mine once got a Google Code Jam World Finals question in a phone interview with Google, so you might get something this hard or harder, depending on who you draw as an interviewer).

    BTW, if you're wondering what my friend did when they got that question, it turns out they actually knew the answer because they'd seen and attempted the problem during Google Code Jam. They didn't get the right answer at the time, but they figured it out later just for fun. However, my friend didn't think it was reasonable to give that as a phone screen question and asked the interviewer for another question. The interviewer refused, so my friend failed the phone screen. At the time, I doubt there were more than a few hundred people in the world who would've gotten the right answer to the question in a phone screen, and almost all of them probably would've realized that it was an absurd phone screen question. After failing the interview, my friend ended up looking for work for almost six months before passing an interview for a startup where he ended up building a number of core systems (in terms of both business impact and engineering difficulty). My friend is still there after the mid-10-figure IPO -- the company understands how hard it would be to replace this person and treats them very well. None of the other companies that interviewed this person wanted to hire them, and they actually had a hard time getting a job.

  4. Outside of egregious architectural issues that will simply cause a service to fall over, the most common way I see teams fix efficiency issues is to ask for more capacity. Some companies try to counterbalance this in some way (e.g., I've heard that at FB, a lot of the teams that work on efficiency improvements report into the capacity org, which gives them the ability to block capacity requests if they observe that a team has extreme inefficiencies that they refuse to fix), but I haven't personally worked in an environment where there's an effective system fix to this. Google had a system that was intended to address this problem that, among other things, involved making headcount fungible with compute resources, but I've heard that was rolled back in favor of a more traditional system for reasons.

Iridescent Clouds over Sweden

Iridescent Clouds over Sweden Why would these clouds be multi-colored? A relatively rare phenomenon in clouds known as iridescence can bring up unusual colors vividly or even a whole spectrum of colors simultaneously. These polar stratospheric clouds, also known as nacreous and mother-of-pearl clouds, are formed of small water droplets of nearly uniform size. When the Sun is in the right position and, typically, hidden from direct view, these thin clouds can be seen significantly diffracting sunlight in a nearly coherent manner, with different colors being deflected by different amounts. Therefore, different colors will come to the observer from slightly different directions. Many clouds start with uniform regions that could show iridescence but quickly become too thick, too mixed, or too angularly far from the Sun to exhibit striking colors. The featured image and an accompanying video were taken late last year over Ostersund, Sweden.

The Best Code BEAM SF talks from the 2010s


Preparations for Code BEAM SF 2020 are well and truly underway. This year marks the 9th anniversary of the conference, meaning that Code BEAM has been bringing the best of the BEAM to the Bay Area for the best part of a decade. To whet your appetite for this year’s event, and to say goodbye to the decade gone by, we thought it was a timely opportunity to look back at the talks that have made the biggest waves in the community every year since its launch in 2012. From the launch of Elixir to lessons from WhatsApp, we’ve got all the major BEAM events covered. So sit back, relax and get excited with this comprehensive selection of top-class talks.

Highlights from Code BEAM SF 2019

Operable Erlang and Elixir by Fred Hebert

Successful systems grow, and as they grow, they become more complex. It’s not just the code that gets messier either; it is also the people who are part of the system who need to handle a growing level of complexity. In his talk, Fred Hebert explains why it is not enough to take a code-centric approach. Instead, he argues for a holistic approach to making Erlang and Elixir systems truly operator-friendly. This encompasses how our mental models work, what makes for best-practice automation, and what tools exist on the Erlang VM to help us deal with the unexpected. To learn more, head to the page for Fred Hebert’s talk on the Code Sync website.

Announcing Broadway - keynote speech by José Valim

The development of Broadway was a significant step forward for Elixir in 2019. The library produced by Plataformatec streamlines data processing, making concurrent, multi-stage pipelines easier than ever. In his talk, José Valim explained how Broadway leverages GenStage to provide back-pressure and how Elixir depends on OTP for its famed fault-tolerance. Learn more about José’s talk at Code BEAM SF 2019 on the Code Sync website.

Highlights from Code BEAM SF 2018

The Forgotten Ideas in Computer Science - keynote speech by Joe Armstrong

Some things are just ahead of their time. In his 2018 keynote, Joe Armstrong looks at ideas from the early days of computing and reflects upon the good ideas that could be helpful to revisit and the bad ideas that we can still learn something from. This thought-provoking talk is one that still holds relevance today and is worth revisiting. Interested in what the forgotten ideas of computer science are? Learn more about Joe Armstrong’s talk at Code BEAM SF 2018.

A Reflection on Building the WhatsApp Server by Anton Lavrik

WhatsApp is arguably the BEAM’s most famous success story. When Facebook purchased the company in 2014, stories of 10 server-side engineers managing a platform with 450 million active users sending 54 billion messages a day were shared far and wide.
In his talk, Anton Lavrik described some of the tools they use for developing reliable and scalable servers in Erlang. The talk included tools that were not widely used at the time and methods that went against conventional Erlang practices. Want to find out what those tools and practices were? Learn more about Anton Lavrik’s talk at Code BEAM SF.

Highlights from Erlang Factory SF Bay Area 2017

Building a web app in Erlang by Garrett Smith

The long-held belief has been that Erlang is not suitable for web applications. The arrival of LiveView may have given Elixir the edge, but that doesn’t mean you can’t build web applications in Erlang. In this talk, Garrett Smith shows that Erlang not only can be used to build web applications, but is actually great for it. The talk is inspired by a presentation from Ian Bicking entitled “Building a Web Framework from Scratch”. It shows that web apps can be built without monolithic frameworks, starting with nothing and gradually adding functionality using abstractions that closely mirror the Web’s underlying protocols. This particular example may be built in Erlang, but the lessons and principles in the talk apply to developing web applications in general. Learn more about Garrett Smith’s talk at Erlang Factory 2017.

Highlights from Erlang Factory 2016

Why Functional Programming Matters, keynote speech by John Hughes

Nearly a decade before Erlang was released as open source, John Hughes published “Why Functional Programming Matters”, a manifesto for functional programming. In this talk, John takes a deep dive into the history of functional programming and explains why functional programming is more important than ever. Get a more detailed insight into John Hughes’ talk at Erlang Factory 2016.

Highlights from Erlang Factory 2015

Building And Releasing A Massively Multiplayer Online Game With Elixir by Jamie Winsor

The team from Undead Labs launched State of Decay to rave reviews, but there was one piece of persistent criticism: why was there no multiplayer option? Building the infrastructure to handle the massive scale of concurrent users required for massively multiplayer online games has traditionally demanded large engineering and support teams, as well as significant time and financial investment. In this talk, Jamie Winsor explains how they decided on Erlang as the right tool for the job, empowering them to fast-track their multiplayer offering without jeopardising company culture or their product. Learn more about Jamie Winsor’s talk at Erlang Factory 2015.

Highlights from Erlang Factory 2014

That’s 'Billion’ with a 'B’: Scaling to the Next Level at WhatsApp by Rick Reed

In 2014, WhatsApp was the toast of the tech community with its $19 billion sale to Facebook. The news brought with it an influx of interest in Erlang, as people looked into how such an agile team of developers were able to build such a robust system. In this talk, Rick Reed explains how the team were able to meet the challenge of running hundreds of nodes, thousands of cores and hundreds of terabytes of RAM to scale and handle billions of users. Get more information on Rick Reed’s talk at Erlang Factory 2014.

Highlights from Erlang Factory 2013

The How and Why of Fitting Things Together by Joe Armstrong

Software is complex, and things get complicated when the parts don’t fit together. But how does this happen? And what can we do to prevent this happening? In his talk, Joe Armstrong answers these questions and explains where Erlang comes in to save the day. See the talk descriptions and more details about Joe’s talk from Erlang Factory 2013.

Highlights from Erlang Factory 2012

Scaling to Millions of Simultaneous Connections by Rick Reed

Just two years before they had to scale to billions, WhatsApp needed to figure out how to scale to millions. In this talk, Rick Reed explains how Erlang helps them meet the growing user-demand while continuing to keep a small server footprint. You can see more details from Rick Reed’s talk at Erlang Factory 2012 here.


For nearly a decade, Code BEAM SF has been the beacon of BEAM-related news in the U.S.A. It provides a space for the Erlang and Elixir community to share great ideas, be inspired, connect and increase their knowledge base. As we enter a new decade, we are excited to see how these technologies will grow and become important players in growth sectors such as IoT, FinTech and machine learning. The first Code BEAM SF of the decade takes place in March; tickets are on sale now, and you can see the full line-up of speakers at


Testing the boundaries of collaboration – Increment: Testing


It’s 2030. A programmer in Lagos extracts a helper method. Seconds later, the code of every developer working on the program around the world updates to reflect the change. Seconds later, each of the thousands of servers running the software updates. Seconds later, the device in my pocket in Berlin updates, along with hundreds of millions of other devices across the globe.

Perhaps the most absurd assumption in this story is that I’ll still have a pocket in 10 years. (Or that I’ll “carry” a “device.”) But the instant deployment of tiny changes from thousands of developers is just the continuation of decades-old trends in software development collaboration.

Welcome to collaborative software development, Limbo-style: a future of software development collaboration. (I’m not arrogant enough to call it the future; I’ve seen too much by now to assume that I know where all this is going.)

It’s 2016. I’m in my fifth year at Facebook coaching programmers. I’ve seen the engineering organization grow from 700 to 5,000 developers. Scale is the hard part about Facebook. We took a leading role in Mercurial development just so that we could be sure it could handle all our code in one repository.

I want to make an impact, but I know that I’ll have trouble doing so if I do what everyone else is doing. Plenty of people in the organization are looking at how to get from 5,000 to 10,000 developers. However, I can find no one trying to figure out how to handle 100,000 developers working on the same system. (Protip: When looking for juicy problems, follow an established trend further than any sensible person would.) Ideas need time to ripen, so I stick the 100,000-developer problem in the back of my mind and get back to coaching engineers.

While coaching, I notice just how much time students lose because of the latency and variance created by our blocking, asynchronous code review workflow. In what becomes the industry-standard collaboration workflow, every change to Facebook code requires an independent reviewer to approve it before it can go into production.

Some students juggle several projects at once so that they can continue coding while waiting for a review. Others stack up several changes, betting that the first one won’t have to be revised in ways that will ripple forward. Others just pack more and more into a change to amortize the cost of review delay.

Everyone agrees that small changes are the way to go, but the overhead per change forces programmers into a trade-off. Make the changes too small and you spend all your time waiting for code reviewers. Make the changes too big and you also spend all your time waiting for someone to review and confidently endorse a gigantic change. This phenomenon isn’t scrutinized: It’s just the price we pay for working together.

But my engineer nose twitches, as it does when a problem is about to go nonlinear. The bigger the organization, it seems to me, the bigger the hidden costs of blocking, asynchronous code review.

How low can Limbo go

It’s 2017. I’m still coaching at Facebook when two experiences pull the question of scaling collaborative workflows out of the dusty back rooms of my brain.

First, I measure that the distribution of diff sizes is the same regardless of programming language (confounding my hypothesis). The only outlier is one service where each change is rolled out as a separate deployment. (Changes are batched in the deployment of other services.) Those changes tend to be smaller. Small is good, remember? Hmm . . .

Second, I run a personal project, a Smalltalk virtual machine, where I keep changes tiny. I notice that almost all of the changes are safe. Faced with a hard behavioral change, I spend significant effort changing the structure in small, safe steps, so that the hard change becomes easy. If most changes are safe, then they can be deployed immediately, saving the heavy validation horsepower for a small percentage of changes.

As I program my Smalltalk project, I begin to have literal visions of software development collaboration as Conway’s Game of Life, the cellular automaton created by the mathematician John Horton Conway. In my version, each cell is a programmer. When a programmer makes a change, their cell changes color. Changes ripple outward to other developers, sometimes crashing and interfering but usually just going off the edge of the map.

Madness. Chaos warning. Changes propagating, joining, merging, crashing. How could a programmer work in this environment?

As I live with the discomfort of this model, I realize that I haven’t gone far enough. What if, instead, not just programmers but all production machines are on the game board? (And I thought I was outrageous before!)

I tell my friend Saurav Mohapatra, a comics author and software engineer, about this model. The smaller the changes, the better the whole scheme seems to work. He suggests the name “Limbo” because, as the song asks, how low can you go? That is, how small a change is too small? I don’t have an answer, but at this stage of the idea, that’s okay.

Limbo is clearly impractical: a model where all changes are tiny and instantly deployed. But “impractical” is a compliment when applied to a new idea. Revolutionary innovation comes of making the impractical practical. (Practical work is also necessary for evolution, but it’s not really my bag.)

Chewing on this idea calls forth related and supporting ideas. (As writer Basil King puts it, “Go at it boldly, and you’ll find unexpected forces closing round and coming to your aid.”) I have grand ideas for rebuilding the programming toolchain based on abstract syntax tree transformations. I even build an editor based on it, called Prune, with fellow Facebook software engineer Thiago Hirai.

Rewriting editors and indexers and version control and build tools and deployment tools is a lot of work, though. If I want to see Limbo in my lifetime, I need to get lucky.

Test, commit, revert

It’s 2018. My longtime collaborators at Iterate, an innovation consultancy in Oslo, Norway, invite me to host a code camp that November: a week of nothing but coding, and a way for geeks to stretch their legs. As we sip coffee in our borrowed conference room on the first Monday of camp, I explain my vision for Limbo. Then I show the students (Lars Barlindhaug, Oddmund Strømme, and Ole Johannessen) a workflow that I’ve been using for years. Every time the tests pass, I create a little commit so that I can easily get back to a known good state.

Strømme, one of the few programmers I’ve ever met who is as obsessed with symmetry as I am, says, “If we commit when the tests pass, then we must revert when the tests fail.”

I hate this idea. “You mean if I make one little mistake, all of my changes would just, *poof*, disappear?” Yes, I hate this idea. It’s also cheap to try, so we do. There’s a particular shiver I get when encountering an idea so bad that it might be good. I feel that shiver now.

Before we start our coding project, we implement our new workflow, which we call test && commit || revert, or TCR:

python && git commit -am "working" || git reset --hard 
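Spelled out as a reusable shell function, the same workflow might look like the sketch below. The `true`/`false` stand-ins replace the real test command, whose exact invocation in the camp's Python project I'm assuming rather than reporting:

```shell
# tcr: run the given test command; commit on success, revert on failure.
tcr() {
  if "$@"; then
    git commit -am "working"
  else
    git reset --hard
  fi
}

# Demo in a throwaway repository, so the sketch is self-contained:
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email tcr@example.com
git config user.name tcr
echo base > file.txt
git add file.txt
git commit -qm init

echo good >> file.txt
tcr true > /dev/null      # tests "pass": the change is committed
echo bad >> file.txt
tcr false > /dev/null     # tests "fail": *poof*, back to the last commit
tail -1 file.txt          # prints "good"
```

The one-liner above and this function are equivalent; `&&` and `||` simply inline the if/else on the test command's exit status.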

Then we code a sample project: a testing framework. Every time we make a mistake, as expected, *poof*, our changes disappear. At first, the disappearing code startles us. Then we notice our programming style changing.

Initially, we just make the same change again, but the computer eventually out-stubborns us. Then, if we’ve been making changes and we aren’t sure if we’ve broken anything, we just run the command line. If we disagree with the computer about whether we’ve made a mistake, we figure out how to make the change in several smaller steps.

We know that we’re on to something with this new workflow. Despite its simplicity and similarity to test-driven development, it creates intense incentives for making changes in small steps.

With a couple days’ practice, we gain confidence in our shiny new TCR skills. TCR incentivizes us to create an endless stream of tiny changes, each of which results in working software. Remind you of anything? My idea for Limbo also relies on an endless stream of tiny changes, each of which results in working software. Does this mean that we’re ready to Limbo?

Turns out (the most exciting words in engineering): Yes. We create a repository on GitHub, clone it to two machines, then execute a loop on each machine:

while(true); do git pull --rebase; git push; done; 

We start working on our sample program separately, as two pairs. In spite of deliberately avoiding any explicit coordination, and in spite of the tiny codebase, we find merge conflicts rare and cheap. Every once in a while, we’re writing some code and, *poof*, it disappears, to be replaced by whatever the other pair has changed. We only lose a few seconds of work, so it never feels like an imposition; instead, it feels more like an update that we appreciate seeing before we get any further.

The incentive to make changes in tiny steps (built into TCR) is amplified in Limbo-style collaboration. The first pair to finish and commit doesn’t risk having their code poofed. And if you don’t want your changes to disappear because of someone else’s activity, make your changes in even smaller steps.
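For concreteness, here is one way the two pieces might compose into a single step. This is a sketch, not what we ran at camp: the local bare repository standing in for GitHub and the `true` stand-in test command are both illustrative assumptions:

```shell
# limbo_step: one iteration of a combined TCR + sync loop.
# Pull others' changes, run the given test command, then
# either commit-and-push or revert.
limbo_step() {
  git pull --rebase -q
  if "$@"; then
    git commit -qam "working" && git push -q
  else
    git reset --hard -q
  fi
}

# Demo against a local bare repository standing in for the shared remote:
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git clone -q "$work/origin.git" "$work/clone"
cd "$work/clone"
git config user.email limbo@example.com
git config user.name limbo
echo start > app.txt
git add app.txt
git commit -qm init
git push -q -u origin HEAD

echo change >> app.txt
limbo_step true               # tests "pass": the change reaches the remote
git rev-list --count @{u}     # prints "2": init plus the pushed change
```

Running `limbo_step` in a loop on every machine recovers both behaviors described above: passing changes propagate immediately, and failing or conflicting ones go *poof*.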

The disaster that wasn’t

It’s 2019. Blocking, asynchronous code reviews are the dominant method for collaboratively developing software. If I draw one certain conclusion from my experiments with TCR and Limbo, it’s that blocking, asynchronous code reviews are not the only effective workflow for collaboration. While I don’t know if TCR and/or Limbo will be the future, I think that something different is coming.

A handful of bloggers and screencasters have replicated TCR in various languages and programming environments. Usage, so far, is confined to wild-eyed pioneers, but momentum is gathering. If you’re the sort of person who likes to try out new programming workflows for fun, TCR awaits you. If you like smooth, professional tool support, then you’ll need to write your own.

As of this writing, Limbo has only been tried to the degree I’ve described here. (Please let me know if you’ve tried it with more independent streams of changes!) It should have been horrible and a disaster, and it wasn’t. I get more excited by ideas that should be disasters and aren’t than by ideas that should work and do. But that’s me.

If you want to contribute to the real-time software development collaboration wave, here’s where to begin, with some of the reasons why Limbo is impractical:

  • Instability. Constant changes mean breaking all the time. Right?
  • Bandwidth. Constant changes mean too much bandwidth consumed distributing changes. Right?
  • Security. Constant changes mean bad actors slipping in bad changes. Right?
  • Latency. Constant changes mean spending forever waiting for feedback. Right?
  • Tools. Constant changes mean completely new toolchains. Right?
  • Training. Constant changes mean new design, testing, tooling, and collaboration skills. Right?
  • Culture. Constant changes mean new social structures of programming, and programmers aren’t social. Right?

Transform the impracticalities into practicalities and you’ll really have something special. Make even a little progress and it will be progress worth the cost.

That program I’m using in 2030? It has been written by 100,000 programmers around the world. Some programmers came to it as experts and contributed immediately. Others used their work on it to fuel learning. Along the way, new social and economic structures evolved to support continuing work. One thing, though, was constant: tiny changes, instantly deployed.


Technology à la carte


I ran across an article this morning about the booming market for low-tech tractors. Farmers are buying up old tractors at auction because the older tractors are cheaper to buy, cheaper to operate, and cheaper to repair.

The article opens with the story of Kris Folland, a Minnesota farmer who bought a 1979 John Deere tractor. The article says he “retrofitted it with automatic steering guided by satellite.” That’s the part of the article I wanted to comment on.

If you suggest that anything new isn’t better, someone will label you a Luddite. But Folland is not a Luddite. He wants satellite technology, though he also wants a machine he can repair himself. He didn’t buy a 1979 tractor out of nostalgia. He bought it because it was an order of magnitude cheaper to own and operate.


1 public comment
Similar to this is cloud hosting for services. I was recently told that two cloud servers for a company I was doing work for cost $1600 per month to operate, so they were trying to get down to one server for their application. These weren't particularly high-end servers, being only quad-core boxes. You could buy an entire rack at a data center for $800 and put your baseload on that, the parts that you always need to run even if you're basically idle, then offload to the cloud any spike handling where the traffic exceeds what your traditional rack can handle. If you wanted to, you could get two half racks in different areas of the country for some geographic diversity, and host multiple 16- to 64-core servers in them. All of this XaaS (tractor as a service, software as a service, etc.) stuff is way more expensive than if you actually really owned something. The only time it's of any benefit is if it's something you're going to use very rarely, not something you need access to frequently.

Self-Knowledge by Looking at Others


I've published quite a lot on people's poor self-knowledge of their own stream of experience (e.g. this and this), and also a bit on our often poor self-knowledge of our attitudes, traits, and moral character. I've increasingly become convinced that an important but relatively neglected source of self-knowledge derives from one's assessment of the outside world -- especially one's assessment of other people.

I am unaware of empirical evidence of the effectiveness of the sort of thing I have in mind (I welcome suggestions!), but here's the intuitive case.

When I'm feeling grumpy, for example, that grumpiness is almost invisible to me. In fact, to say that grumpiness is a feeling doesn't quite get things right: There isn't, I suspect, a way that it feels from the inside to be in a grumpy mood. Grumpiness, rather, is a disposition to respond to the world in a certain way, and one can have that disposition while one feels, inside, rather neutral or even happy.

When I come home from work, stepping through the front door, I usually feel (I think) neutral to positive. Then I see my wife Pauline and daughter Kate -- and how I evaluate them reveals whether in fact I came through that door grumpy. Suppose the first thing out of Pauline's mouth when I come through the door is, "Hi, Honey! Where did you leave the keys for the van?" I could see this as an annoying way of being greeted, I could take it neutrally in stride, or I could appreciate how Pauline is still juggling chores even as I come home ready to relax. As I strode through that door, I was already disposed to react one way or another to stimuli that might or might not be interpreted as annoying; but that mood-constituting disposition didn't reveal itself until I actually encountered my family. Casual introspection of my feelings as I approached the front door might not have revealed this disposition to me in any reliable way.

Even after I react grumpily or not, I tend to lack self-knowledge. If I react with annoyance to a small request, my first instinct is to turn the blame outward: It is the request that is annoying. That's just a fact about the world! I either ignore my mood or blame Pauline for it. My annoyed reaction seems to me, in the moment, to be the appropriate response to the objective annoyingness of the situation.

Another example: Generally, on my ten-minute drive into work, I listen to classic rock or alternative rock. Some mornings, every song seems trite and bad, and I cycle through the stations disappointed that there's nothing good to listen to. Other mornings, I'm like "Whoa, this Billy Idol song is such a classic!" Only slowly have I learned that this probably says more about my mood than about the real quality of the songs that are either pleasing or displeasing me. Introspectively, before I turn on the radio and notice this pattern of reactions, there's not much I can discover that clues me in to my mood. Maybe I could introspect better and find that mood in there somewhere, but over the years I've become convinced that my song assessment is a better mood thermometer, now that I've learned to think of it that way.

One more example: Elsewhere, I've suggested that probably the best way to discover whether one is a jerk is not by introspective reflection ("hm, how much of a jerk am I?") but rather by noticing whether one regularly sees the world through "jerk goggles". Everywhere you turn, are you surrounded by fools and losers, faceless schmoes, boring nonentities? Are you the only reasonable, competent, and interesting person to be found? If so....

As I was drafting this post yesterday, Pauline interrupted me to ask if I wanted to RSVP to a Christmas music singalong in a few weeks. Ugh! How utterly annoying I felt that interruption to be! And then my daughter's phone, plugged into the computer there, wouldn't stop buzzing with text messages. Grrr. Before those interruptions, I would probably have judged that I was in a middling-to-good mood, enjoying being in the flow of drafting out this post. Of course, as those interruptions happened, I thought of how suitable they were to the topic of this post (and indeed I drafted out this very paragraph in response). Now, a day later, my mood is better, and the whole thing strikes me as such a lovely coincidence!

If I sit too long at my desk at work, my energy level falls. Every couple of hours, I try to get up and stroll around campus a bit. Doing so, I can judge my mood by noticing others' faces. If everyone looks beautiful to me, but in a kind of distant, unapproachable way, I am feeling depressed or blue. Every wart or seeming flaw manifests a beautiful uniqueness that I will never know. (Does this match others' phenomenology of depression? Before having noticed this pattern in my reactions to people, I might not have thought this would be how depression feels.) If I am grumpy, others are annoying obstacles. If I am soaring high, others all look like potential friends.

My mood will change as I walk, my energy rising. By the time I loop back around to the Humanities and Social Sciences building, the crowds of students look different than they did when I first stepped out of my office. It seems like they have changed, but of course I'm the one who has changed.

