This is, of course, pretty controversial stuff, and in all honesty, Arlo only has his own data and the feedback of six teams that tried it as evidence that it works. All six teams saw productivity boosts; four continued with the practice, two stopped. The reasons they stopped were interesting, though: they liked specialization, and were happy with slightly less productivity in exchange.
So what about code reviews? Arlo's view was that pairing, compared to code reviews, teaches you more about the way developers think and work within the system. If you pair 100% of the time, you usually don't need code reviews at all.
What about bigger teams? Arlo would split them into smaller ones; e.g., a team of 21 would become three teams of seven. To improve inter-team communication and spread experience, developers would be rotated between the teams.
How do they deal with code debt? Eventually there shouldn't be any, but you may start out with some, or a lot. Clients don't usually appreciate tasks that are pure refactoring, though over time they can be educated to appreciate them, and that's of course best. Arlo's team spends the second half of every Friday on refactoring: the team gets together, runs a short planning game for purely internal refactoring tasks, and then does them. Once the code base is clean, this hopefully won't be needed anymore.
What kind of iteration size do they use? Arlo uses one-week iterations/sprints, like us. He has tried much shorter iterations, even 90-minute ones, but the cost of deployment and refactoring becomes a problem. In some ways, though, the pairing sessions themselves resemble short iterations, and it's of course possible to hold a scrum at the start and a demo at the end.
How has Arlo's team evolved over the last two to three years? They started out following many XP practices, but these days they look more to lean software development for inspiration. The metrics they use have also moved up a level, as lean recommends: revenue (clients put a fictional-but-representative dollar amount on features), cycle time (how long the team needs to resolve issues of varying criticality; a measure of responsiveness), and throughput (velocity).
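To make those three metrics concrete, here is a minimal illustrative sketch; the `Issue` record and the sample numbers are hypothetical, not from Arlo's talk:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Issue:
    """Hypothetical issue record for illustrating the three lean metrics."""
    opened: datetime
    resolved: datetime
    value: float  # fictional-but-representative dollar amount set by the client

def cycle_time(issues):
    """Average time from opening an issue to resolving it (responsiveness)."""
    deltas = [i.resolved - i.opened for i in issues]
    return sum(deltas, timedelta()) / len(deltas)

def throughput(issues, period):
    """Issues resolved per week over the given period (velocity)."""
    return len(issues) / (period / timedelta(weeks=1))

def revenue(issues):
    """Total client-assigned value of the resolved issues."""
    return sum(i.value for i in issues)

# Invented sample data, purely for illustration.
issues = [
    Issue(datetime(2024, 1, 1), datetime(2024, 1, 3), 500.0),
    Issue(datetime(2024, 1, 2), datetime(2024, 1, 6), 1200.0),
]
print(cycle_time(issues))                      # average of 2 and 4 days -> 3 days
print(throughput(issues, timedelta(weeks=1)))  # 2 issues in 1 week -> 2.0
print(revenue(issues))                         # 500 + 1200 -> 1700.0
```

The point of tracking all three together is that each one alone can be gamed: throughput can be inflated with trivial issues, which the revenue and cycle-time numbers would expose.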
What should we expect if we adopt these practices? Arlo says productivity will drop for the first three weeks, but then increase tremendously. After two to three months, people's skill gaps will be gone.
In any case, the results will likely differ slightly for each team, so it's just something we have to try.