[Twisted-Python] major changes, release engineering, and learning cost

Sun May 23 20:35:32 MDT 2010

On 23 May, 12:26 am, glyph at twistedmatrix.com wrote:
>The nice thing about Twisted's compatibility policy is that developers, 
>and even users, very rarely have problems when installing a new version 
>of Twisted.  While this is a nice benefit, the current strategy of 
>developing features in a compatible way does have a couple of costs, 
>and I'd like to see if we can address them without giving up the 
>benefit.  I have a suggestion for a process tweak which would hopefully 
>mitigate some of the difficulties which arise due to the compatibility 
>policy.
>
>When we add a new feature that supersedes an older one, or fix a bug in 
>Twisted that involves changing behavior, the developer fixing it has to 
>come up with a new name.  If we have several behavior-changing bugfixes 
>in the same subsystem, that means that developers using Twisted may 
>have to learn about 3 different symbol names.  Since we tend to avoid 
>just suffixing names with numbers (for good reason, I think), they 
>won't have to learn Bicycle, Bicycle2, Bicycle3, they'll have to learn 
>Bicycle, then Footcycle, and finally Velocipede, and somehow infer that 
>Velocipede is the newest/best name that they should be using, by 
>reading the (hopefully clear, concise) warnings that come out of their 
>unit tests.
>
>This came up again recently on a ticket about URLPath, 
><http://twistedmatrix.com/trac/ticket/2625#comment:16>, where a 
>contributor suggested that it would be better to make a whole new 
>module because it's easier for external developers to learn about that 
>then learn about an individual method change.  This of course raises 
>the question: if we're going to have a whole new URL class, shouldn't 
>it fix the (numerous) *other* bugs that we know about in URLPath?
>
>Up until now the objection to doing things this way is that it results 
>in gigantic branches which are intractable to review.  That's a good 
>objection, but it leaves us with a false dichotomy; reliable software 
>and painless upgrades with a random scattershot of new features that 
>are hard to understand, or coherent iterations of technology which 
>can't be effectively reviewed, and therefore can't be effectively 
>quality controlled.
>
>I propose that we get the best of both worlds by changing the way we do 
>reviews slightly.  Right now every code review needs to be done on an 
>entire feature going to trunk, and _all_ of the code going to trunk 
>needs to be reviewed at once.  I suggest that instead, we create 
>"integration branches" for sensible constellations of features, and 
>have a two-stage review process.
>
>For example, let's say I want to work on making URLPath good.  There 
>are several tickets addressing it:
>
><http://twistedmatrix.com/trac/ticket/2093>
><http://twistedmatrix.com/trac/ticket/2094>
><http://twistedmatrix.com/trac/ticket/2625>
>
>For the sake of argument, let's all pretend these are all deeply 
>interrelated and involve changes to behavior of existing methods.  I 
>think that is sort of true of most of these, but it would be far too 
>verbose to talk about *how*, and get bogged down in that discussion.
>
>First, I'd make an integration ticket, let's call it #ABCD, describing 
>how these features are related and a brief outline of the new API I 
>propose which resolves them.
>
>Then I'd create an integration branch from trunk, for that ticket. 
> From the #ABCD branch, I'd create a branch for #2093, and put it up for 
>review.  The reviewer would review #2093 as usual, citing any test 
>coverage issues, documentation issues, etc.  After the usual review 
>process, when I get an "OK to merge", I would merge #2093 *to the #ABCD 
>branch*, not trunk.
>
>I would repeat this process for #2094 and #2625, merging each to the 
>#ABCD branch as they passed review.
>
>Finally, I'd put #ABCD itself up for review.  At this point the process 
>would differ a bit.  Reviewers would be able to assume, at this point, 
>that the potentially large body of code in #ABCD had already been 
>reviewed, that the test cases were good, the documentation was 
>reasonably clear, and the logic made sense.  This final review would be 
>a quick sanity check, to make sure the tests still pass and that there 
>are no conflicts.
>
>I would like to strongly emphasize that this point in the process would 
>be an inappropriate time for reviewers to start arguing with each other 
>over what is required for the branch to land, disputing the original 
>specification, etc; this is just an opportunity to spot potential 
>regressions before it lands.  Each ticket review for a component of the 
>larger feature should be an opportunity to draw attention to the 
>direction of the larger feature development and prompt discussion. 
>This *might* be an appropriate point to note that some other behavior- 
>changing feature had been left out, though.
>
>In the case that there *were* conflicts, this would be an opportunity 
>to review the conflict resolution itself.
>
>(We saw a nascent version of this approach on some stuff related to 
><http://twistedmatrix.com/trac/ticket/886> and it was hugely painful 
>because nobody was really sure what the process was supposed to be.  So 
>let's not do it like that again.)
>
>So: thoughts?  Does this make sense as a policy change for facilitating 
>the development of major new features or consolidating behavior- 
>changing fixes into easier-to-understand units?

So, to summarize, we could stage our code using more than just two 
branches (trunk + feature branch) in order to make larger changes easier 
to understand for reviewers while still making each change to trunk a 
coherent unit.

This sounds fine to me.  We need to work out some details (like, for 
example, I'm not sure trying to do this using subversion is such a good 
idea, and we want the process to be documented somewhere so we don't 
have a repeat of #886), but I think we should try it and see what 
happens.

Of course, someone needs to work on something big before we'll have a 
chance to try it.  I'm not yet convinced that `URLPath` is a good case 
for this, though.  It's very little code, and a complete 
reimplementation (if even such a thing is needed) will likewise be very 
little code.  Also, I don't think a complete reimplementation is needed 
here.

Going back to the proposed workflow change, we should also be sure 
there's a clear condition under which the integration branch should be 
merged to trunk.  And ideally we should still try to keep the lifespan 
of these things as short as possible.

Jean-Paul