Code reviews need to shift-left (sourcery-ai.medium.com)
37 points by rekahrv on Jan 24, 2023 | hide | past | favorite | 61 comments


I don't really buy the two main criticisms around code review. They're stated as intractable downsides but are really just cultural issues around how you perform code reviews.

If your coworkers are sending changes for review that don't explain why the changes are being made (and therefore provide no context), reject them until they write a proper summary/test plan that does explain it. You should review the "metadata" of changes before you ever review the code itself, and if done well, you'll probably be able to guess what a lot of the incoming changes are before reviewing the code. This makes it easier to spot code that deviates from the stated purpose of the changes and can help identify faults in understanding of either the overall problem or the system being built, especially in more junior engineers.

If your coworkers are taking too long to review changes, work with your team to incentivize expedient code reviews, or talk to the people who aren't reviewing enough code to help everyone else get more done. You should also use a workflow that lets you continue progressing with your work even while waiting for reviews (like stacked diffs).


The first criticism ("Context Gaps. The reviewer doesn’t have the full context of how and why decisions were made for the code they were reviewing.") is one of the things I consider an _advantage_ of code reviews. The article just dismisses the idea that having the reasons for decisions written down is useful, which I find extremely strange.


'You should review the "metadata" of changes before you ever review the code itself'

That's a great approach. It sounds logical, but I certainly don't do it consistently. :-)


The context is not just "why the change was made". That point is usually clear from the ticket in the first place. The context is the "do other places in the code use the same tactic" question. It is also the "I tried these four things and three of them failed because of X" or "I had to choose between these two solutions and picked this one".


Having the “why” in the same place as the code is, I think, critically important. Who knows what ticketing system you’ll use in the future; it’s fine to link to it, but it shouldn’t be where all the context is stored.

If you think about a pull request as a story, you have who, what, where, when, and why. Most tools handle the first four, but the why is critically important to anyone, including yourself, who needs to understand the nuance of the change that can’t easily be expressed by the code itself without a novel-sized comment block.

I find that a lot of the time people forget the “why” when submitting a change set and simply restate the “what”. At the time you make the change it’s all fresh in your mind and you trick yourself into believing you’ll remember all the details, but in reality that’s not my experience.


Do those things matter once the code is actually written though? They're important during the dev process, particularly if you're deciding between two different-but-valid approaches, which is why teams should be communicating and discussing things, but once the decision has been made it ceases to be anything worth worrying about. It's a sunk cost. It shouldn't need to be a part of a code review unless the comms are failing ahead of the review process. If that's the case then you should try to fix the comms issue.


A lot of the things we devs think are vitally important... just aren't. Or they have a steep time decay on how useful they are.

I do have the longest code review summaries on my team, amusingly. So, I clearly like the idea of it being a discussion and capturing some of that. I would be challenged to say what part is most important overall, though. In large part because not all code changes are the same.


"Do those things matter once the code is actually written though?"

That's a great question. I guess it depends on how much the code will change in the future. These things CAN get relevant again when you consider changing that code. E.g. if it turns out later that the library I've chosen isn't maintained anymore. Or we would need some more functionality, but this library doesn't provide it. In these situations, it's useful to see which other alternatives we've considered.

That said: I can't really estimate how often this turns out to be relevant, indeed. I guess the answer might be very different for different projects.


I think it does not make sense to write them down as comments, in the pull request, or anywhere else. It makes sense to talk about them during the review process, where the reviewer might have a different opinion - or might think the other option is better because they don't know about the issue.


This is the kind of stuff I would add in the commit message (and PR description): "I ended up choosing solution A because B, I also tried C and D, but that didn't work so well because E".

Everything you say that's not written down will be lost in the aether within weeks, months at best. I'm a huge fan of async communication because everything will be written down.

If there's a discussion "why did you go with A and not B", in an (async) code review then anyone can come back to it a year later and read up on why it was done in this particular way. I've found this useful on a number of occasions.

It certainly beats "Alice, you changed this last year, but why is it done like that? Wouldn't X be much more convenient because Y? It's hard to add Z now.", "ehm, I discussed it with Bob, but he since left the company. I don't quite remember, I think it was A? Or maybe B? Or was it C?"


Generally, those are all good things to include in the commit message.


I don't think so. Especially not the "do other places in code use the same tactic" part. This is something an ideal reviewer would check; consistency is important. But it does not make sense to write it anywhere.


The issue with code reviews is that most of them should be prototype reviews before you are even half way done, or design reviews before you even start. Or just dialog around some code or problem, along the way. Looking for quality issues or obvious mistakes after the fact is very rarely useful. I still think it's a good idea to have them as a sign off (e.g. so you at least have 2 people knowing what's happened), but other than that I don't think they provide that much value, at least not for people who are many years into a project and code base.

But still we do these after-the-fact code reviews because I don't want to bother my colleague N times in the process. So I churn away until I'm done. And then, not to bother them twice, I also polish it. And THEN I ask for a review. The reviewer sees something polished and finished, and assumes that a) since it's polished, I must have spent a lot of thought on whether it's the right thing, and obviously a review, which is never going to take more than 2% of the dev time, is rarely going to result in them second-guessing that, and b) since it's polished, they know they'll probably step on some toes if they say "throw it out and do something completely different". They can see the effort. The relationship with a colleague is more important than even the health of the project. So they approve it.

The problem of course then is: how do you get from too-late code reviews that are rubber stamps to a process of dialog, halfway reviews, and iteration? Other than pair/mob programming I don't know. We say we don't want more processes, but whenever there isn't a process for this iterative behavior, the result is people developing for too long on their own and producing results that are too far gone to be shot down.


Agree that waiting until the very end of the development process for a code review is troublesome. It’s the most expensive time to request changes too. As you point out, asking for something completely different is going to put the author back at square one and frustrate a lot of people.

It’s best, I find, to get that kind of design-level feedback early and review it when needed as progress is made. No sense waiting until you’ve polished it to hear that you wasted your time building the wrong thing.

Another strategy to cope: small changes merged early and often. This requires a bit more work on the part of the author to keep the continuity of the plan moving in the right direction over several PRs. However, it has been demonstrated empirically that humans can’t review much more than a couple hundred lines of code per hour [0].

And lastly, as a reviewer, demand evidence as to why the change is correct. Don’t bother yourself with reading every line like you’re some compiler that’s better than your compiler. Instead, validate the reasoning of the author and read the tests/evidence/proof. You need a fairly experienced team that is sophisticated enough to write good specifications, but even good unit tests go a long way. This way you only have to check the author’s argument.

If you’re using languages with manual memory management, make sure you’re using the right testing strategy that will catch errors in your program. Write unit tests at the bare minimum. Property tests would be preferred. Fuzz tests, integration tests, etc. on top. Whatever it takes: catching errors in programs is surprisingly difficult for humans regardless of experience or training. The JVM had a critical error in its binary search implementation that hid for nearly a decade; OpenSSL has had similar long-lived bugs; etc.
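As a sketch of the property-testing idea above (my own illustration, not from the thread): check a hand-written binary search against a trivial linear-scan oracle on many random inputs. Python ints don't overflow, so this can't reproduce the JVM's famous `(low + high) / 2` overflow bug, but the pattern is the kind of test that would have caught it.

```python
import random

def binary_search(arr, target):
    """Return an index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # in fixed-width arithmetic, prefer lo + (hi - lo) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def check_against_oracle(trials=1000, seed=0):
    """Property: binary_search agrees with a linear-scan oracle on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        arr = sorted(rng.sample(range(100), rng.randint(0, 20)))
        target = rng.randrange(100)
        found = binary_search(arr, target)
        assert (found != -1) == (target in arr)
        if found != -1:
            assert arr[found] == target
    return True
```

A library like Hypothesis automates the input generation and shrinking, but even this hand-rolled loop exercises far more cases than a couple of example-based asserts.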

Code review is not about reading every line and playing, “spot the error.” You’ll miss some. Your team mates will miss some. You need to think above the code to catch those. Time is precious and life is short. Spend review time effectively by making sure you understand the specifications and that your colleagues have done their homework.

Happy reviewing.

[0] https://www.researchgate.net/profile/Ahmed-E-Hassan-2/public...


I think that one issue here is that the early feedback is not code review and there is zero reason to try to force it into code review framework. Vague high level design-level feedback session can be organized and named as such.


Definitely! Code review is a bad time to be making design crit.


A big contributor to this is humans bending over backwards to fit a creative and dynamic process to rigid tools like JIRA.


> I don't want to bother my colleague N times in the process.

Think of it this way: the maintenance burden of your (lower-quality) work if you don't bother your colleagues throughout the process will be way more bothersome than checking in up front.


Exactly. So how do you align everyone on that? The status quo is that people wait too long to make reviews (they are a formal process step at the very end). It's very hard to make it a formal step half way.

At the very end, they are of limited value and seen as a chore. So they become rubber stamps. The quality of the code base suffers from this, making development slower.

Everyone knows what the end goal is, yet "Code review finished code" is easy to adopt, but not very efficient unless the org is extremely good at what they are doing.

"Pause for thought and collaboration half way" is much much harder to achieve, without taking it to the extreme (pair sessions enforced etc).


Train reviewers to look critically at the big picture before they dive into details. That's going to hurt at first: almost finished code is going to be rejected and have to be reworked heavily or deleted entirely. But after a while, people will learn how to avoid it: seek feedback at an earlier stage.


Talking to your colleagues about design decisions and WIP reviews, plus small iterative PRs, helps a lot.

I have no idea why people think review of final code should be the default, as you say it is rarely useful.


As was pointed out in the article a late async review can ensure that somebody else understands what happens in the code. In a few cases this alone justifies such review even if it is too late to change the architecture.

Plus, besides problematic architecture, there are a lot of bugs that are easy to fix but that one does not notice when developing the code. It is much easier to discover those in polished code than in a first working prototype.


>review time which is never going to exceed 2% of the dev time

What, why?


Because it’s seen as pure overhead, so is expected to take perhaps a few minutes for a large change (that’s my experience at least). The fact that it isn’t is of course not lost on anyone - but everyone accepts the status quo as preferable. That is: people don’t want to spend a lot of time doing code review because they don’t like it. So they rather spend time working on a worse code base.


> We say we don't want more processes

Hum... You mean you don't want more meetings? You can't have more or less process; its amount is completely defined by the diversity of your work.

If it's about meetings, yes, synchronicity is always a problem. But you have synchronicity on the review already, it can't be too bad to just move it into another time.


I encourage people to put up draft PRs, and to request reviews early. Draft PRs don't have to pass CI/CD or be complete, and they let you see what everyone is currently working on.


How do you get into process of dialogue?

Years of working together in a team on the same thing to gain trust.


This article has a "The Issues with Async Reviews" section but not a "The Issues With Pair Programming and Mobbing" one, so it's not a particularly considered piece.

I have found that pair programming destroys the ability to coherently think about a large portion of work for a large portion of developers. It is not easy to be "on" all the time, and a lot of people work at a pace or in a structure that is not compatible with that.


I haven't personally found this to be the case at all, but pair programming doesn't necessarily mean always working with someone _at all times_. I'm a big pairing advocate and I think it's important to pair off every day, but there's absolutely nothing wrong with deciding with your pair that a particular problem needs some alone time.


An intriguing article, although my opinion on async reviews is much more positive. :-)

There are many arguments for both sides.

Pro pair / mob programming:

* Faster feedback.
* Less time spent on solutions that might be rejected.
* Better knowledge sharing (for those who participate).

Pro async review:

* A more neutral, "external" view. (Smaller chance of groupthink.)
* More empathy with future readers of the code.
* Encourages making assumptions and decisions explicit. => Better knowledge sharing for those who weren't involved in writing the code.


We don’t do code reviews anymore. We have a good quality gate (e2e tests) and pair programming. All development happens on main and gets directly deployed to prod when the tests succeed …

I hope I never have to use git flow or the like again in the future!


What's the size of your team? Remote or on-site? How many timezones does it span? How senior are the team's members?


I'm on a data team of 10, and after about 6 months of onboarding we have permissions to push directly to main pending passing tests. We generally do this for small changes, or changes where we are the context owner. With the caveat that reviews happen after the code is deployed, and usually within a few days.

Personally, I like the process. It allows us to move quickly, and focus on blocking changes. We can still get reviews prior to pushing code if it is sensible (for large changes), but most (80%?) changes tend to be quite small.


That's an intriguing approach. It makes sense that not all changes are equal and they don't require the same review process. (And it certainly makes sense to encourage small changes.)

What qualifies as a "small change"? Do you have some numbers to measure it? Or is it the developer's call?


Not the OP but I've worked very successfully pair-programming full-time on a fully remote team of 6 devs, 1 tester, 1 designer, and 1 product manager. All devs were quite senior except for one junior (and that junior leveled up faaaaaasssssst). We usually rotated daily so if a story took three days, the two people who started it weren't the same two who finished it. We all developed a similar coding style for the particular project, everyone knew every part of the codebase to some degree, and we felt a collective ownership over it--there was none of that "Oh, I don't want to change this because Sam wrote it and they'll be mad if I change it" nonsense.

However, our timezones only spanned three hours.


I’ll try to answer your questions:

- the team size varied, but in general we are quite small, with around four developers

- all within one time zone

- all on senior level

- not an answer but just for context: we didn’t have any prod incident within 3 years

- we work fully remote

- we do mob sessions most of the day

- developing the e2e tests for the modules we wrote took not that much time. Maybe 2-3 weeks for each product/module. Most of the time it’s "put something on S3 here and check the API or DB there"… most modules are trivial data transformations, tbh, but in the end most software I ever touched was…


That sounds very interesting.

Can you tell a bit more about those mob sessions? Do all 4 devs participate? Do you follow some structure?


Yes, of course: those sessions happen online using Teams or Slack. The day starts with the standup in the morning, where we quickly socialize and make a plan for the day and discuss important topics like triggered monitors, security findings, or the like. When we’ve made the plan, we decide who starts and just start working, all in one session. Most of the time that’s 3 devs in one meeting. We switch active roles mostly 2 or 3 times a day, but it can also happen that one dev is coding the whole day if he feels like it or the others don’t feel like it. It’s very opportunistic, so to say.


Thanks a lot, that's very insightful. I've worked in various environments, but this is super different from all of them. :-)

How often do the non-coding participants comment on something? Is it more like a continuous conversation? Or rather one person coding and the others occasionally chiming in? How long is such a mob session?

I'm also wondering what kind of team setup makes such a model feasible and successful... By the way, congrats :-) 3 years without a prod incident sounds impressive, indeed.

How long has the team been working together? You've mentioned that all devs are on senior level. Do you have some kind of hierarchy within the team?


Thanks.

Hm, the team has been working together like this for maybe 2 years. Those sessions go the whole day, with a few breaks (lunch, coffee, bathroom).

No hierarchy and we are all involved during the sessions but if you have a bad day it’s ok to be more passive…

We were told that we are quite performant in comparison to other teams, but that is maybe also due to the fact that we are all seniors and try to be pragmatic with our decisions.


When you say you do pair programming, do you mean all of the time for everything? I'd argue you actually have a lot more code review in this case. You effectively have someone reviewing every line of code as it's written.


Who reviews tests then? Empty tests and assertEquals(1, 1) always show green.


Pair programming solves this.


How long did it take to develop enough tests for that to work?


(I am biased: I am one of the founders of https://graphite.dev )

Flip side of this, you can tackle both of their issues with better tooling and not throw the baby out with the bathwater.

Time lag: Stacking (https://graphite.dev/stacking) is a powerful technique that allows developers to keep developing even while others read through their code.
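For readers unfamiliar with the technique, here is a minimal sketch of stacking with plain git (hypothetical branch names; tools like Graphite automate the rebasing bookkeeping):

```shell
set -eu
# Throwaway repo so the sketch is self-contained.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
echo base > README && git add README && git commit -q -m "base"

# Part 1 of the feature goes up for review...
git checkout -q -b feature-part1
echo one > part1.txt && git add part1.txt && git commit -q -m "part 1"

# ...while you keep developing part 2 on top of it, without waiting.
git checkout -q -b feature-part2 feature-part1
echo two > part2.txt && git add part2.txt && git commit -q -m "part 2"

# Review feedback lands on part 1: fix it, then carry the fix up the stack.
git checkout -q feature-part1
echo fixed > part1.txt && git commit -q -am "address review on part 1"
git checkout -q feature-part2
git rebase -q feature-part1   # part 2 now sits on the reviewed, fixed part 1
```

Each branch in the stack becomes its own small PR, so review of part 1 never blocks work on part 2.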

Context gap: you encounter the same issue in an IDE when exploring a new part of a codebase, and there are tools to help you there (click to see a function def, open relevant tests, AI-summarize what it's doing - we've been trialing the last one internally and it's so cool). Your CR tool should offer all of that, and at many large companies (Google, FB, MSFT, even Jane Street) it does. GitHub is just deficient here.

I agree with the other comments in this thread that the main problem is that what should have been an architecture/prototype/etc. review happens in CR. But I think moving away from CR worsens that problem rather than improving it.


"Context gap: you encounter the same issue in an IDE when exploring a new part of a codebase"

Yes, that's one of my major arguments for async reviews, especially in large organizations: future readers of the code will have roughly the same context as the async reviewer. (Contrary to a pair programmer, who has much more context.)

"there are tools to help you there" That sounds very interesting. Can you elaborate on that? Which tools do you use when getting familiar with a new part of a codebase?

Function defs and relevant tests are very useful indeed. An AI summary of what the code is doing sounds intriguing. Which tools do you use for that?

How do you get the broader context? E.g. how do you decide which other parts of the codebase are closely related (and should be included in your exploration)?


I was intrigued by your post but unfortunately after visiting the site I see no indication of pricing. Since an account is required to submit a stack, I did not proceed without a clear idea of potential cost.


Am I the only one who thinks it would be incredibly rude to ask for a code review before I'd actually fleshed out my ideas and also cleaned up to make it presentable? For that matter, if early code review is for design decisions, doesn't that mean tasks are large enough to require plans and reviews of those?

Shifting code reviews left seems to be either (or both of):

- look, I made a poopie!

- let's do Big Planning Up Front (agile has failed)


This reads far too basic, like someone read some other blogs and tried to write one without blatant plagiarism. It's just code-advice clichés.


My biggest issue with code review is that people rely on it and trust it too much. It does certain things in the "better than nothing, I guess" way, but gets hyped as if it were all there is to teamwork.

Yes, it does a little bit to spread knowledge - but it is not very effective at that. An actual occasional architecture session where ideas are explained is much better. Yes, it does a little bit to catch bugs, but having actual testing in place does much more. Yes, it does something to teach juniors, but for Christ's sake, currently it leads to "give them a task with no guidance, let them figure it out, and then tell them about everything they have done wrong". That is just about the worst way of teaching people.

And it creates social issues we don't want to acknowledge, because code review is sacred and can't be criticized. Which leads to kindergarten-level advice to people who run into issues, like "don't take it personally, be nice". Which is next to useless if you have an actual social issue going on in your team.


"My biggest issue with code review is that people rely on it and trust it too much."

That's a very good way to put it. Code review is useful, but the expectations around it often exceed what's reasonable.

I agree on your points regarding quality and knowledge sharing: Code review is beneficial for both, but definitely not sufficient as the only or main tool.

"give them task with no guidance, let them figure it out and then tell them about everything they have done wrong" It sounds blunt. But it's often not far from what's happening. :-)

"And it creates social issues we don't want to acknowledge" Yes, that's another longer topic. :-)


I think most issues brought up here would be addressed by writing RFCs and having those go through a review cycle, potentially with attached code that is clearly understood to be a proof of concept. It leverages async by giving someone the autonomy to do clear thinking and clear coding, while avoiding pushing the feedback cycle to when all the work has been done and everybody is stuck with a “lgtm, can you reindent the last line” review of a fait accompli.

Added benefits are proper top level documentation, and a log of the discussions and compromises made to cut short on future design discussions.

Edit: typos on mobile


Agreed, the happy path is that code implements a spec that has already been agreed upon, anything less means guess work for both the authors and reviewers of code.

Life is about compromise so this doesn't have to be true 100% of the time, but this should be in your team's list of core competencies.

Also, RFC doesn't HAVE to mean a big, pedantically formatted and worded monospace document. Templates help make sure common mistakes aren't repeated, but using normal-person English and easy-to-understand code examples can be very effective.


> a traditional async code review process

This is not a very old "tradition." We started code reviews back in the 2000s with in-person small group code reviews, and only later moved to the tool-based async code reviews that are ubiquitous today. We adopted good ideas from a web article "Effective Code Reviews Without the Pain" ( https://www.developer.com/guides/effective-code-reviews-with... ). We had folks enter comments in advance but went through them as a group, using social customs like starting questions with "did you consider..." to keep it cordial and non-confrontational. A lot of these reviews generated good discussions about code quality, clarity, etc. We didn't review 100% of code using this process, and we didn't always require the review to be completed before merging - the main goal was more to improve all developers' habits than to improve one particular chunk of code.

Moving to the current tool-based async code reviews has pros and cons, but now it takes effort to ensure they stay positive, and the async process eliminated some of the great team discussions that the old-style reviews had.

Perhaps the ideal solution is a mix of both approaches?


I am biased here since I am the creator of an async code review tool (codeapprove.com) but I think that many teams don't take the time to invest in their code review process and tools enough. For something that most of us do every day (often multiple times per day) there's surprisingly little focus on how to do it well.

When I joined my last team they had a bad code review process. Basically once a day or so the CTO and one or two other senior people would look at all the ready-to-merge PRs and give them a thumbs up or thumbs down. I pushed for a much more rigorous but inclusive process. Everyone would do code reviews, every PR would have two approvers, we'd talk about how to do review well and we'd invest in some more automations to solve common pain points. Not only did we get faster as a team (contrary to popular belief, more reviews != slower) but we also got much more thorough feedback AND we were able to consciously propagate the right patterns throughout our codebase in an organic way.

It's no exaggeration to say that good code review leveled up the team more than any other engineering process change we made. More than testing or sprint improvements, more than our management changes, etc. It's highly underrated!


> Decide if the right structure was chosen: Even if the code is doing the right thing, its structure may be ineffective or inefficient.

I can't help thinking that if you're waiting until code review to determine whether someone on the team wrote the code in a way that will pass review, then you have some serious comms issues on your team. If this is a problem, then you should be doing mini-reviews during the dev process as well as a code review at the end.


This is the code review model I've arrived at after going through a number of different processes throughout my career:

Code review is a health check from your teammates who don't have the authorship bias. It is not a replacement for testing, both manual and automated. Testing is solely the responsibility of the author, code review is just a design lint check and/or knowledge sharing opportunity. You can still have a QA team of course, but that is usually decoupled from application development and is done post-merge anyway.

Every shop is different so YMMV, this is just a personal take.


There are multiple good points.

Which I would address first with tooling, of course - linters and automated code checks - so the low-hanging fruit is out of the way.

The second part of shifting left is what I call “developers alignment” - which is, yes, a meeting where people regularly discuss! things, instead of coming up with stuff ad hoc during review or expecting others to get the context from thin air. Then repeating “obvious” rules regularly, because people forget, and not expecting people to remember things by heart.


pair programming is not a replacement for code reviews; the two people who worked on the code will have the same blind spots (even if the quality of the code is likely going to be better) and a fresh pair of eyes will offer the same benefits as in the regular case


Yeah, I agree with that. A pair often produces better code than an individual. But now, they have common blind spots. :-)


What's wrong with leaving the last hour of the day for team code reviews?



