For the record, I merged Tim's patch
<https://gerrit.wikimedia.org/r/c/mediawiki/core/+/708880/> last week and
was unaware of this email thread.

My thinking was as follows:

1. The implementation does not depend on the goto statement.

That is, it is not used to write overly-clever or complicated logic. If you
remove the goto statement, the method behaves the exact same way. And thus
the moment it ceases to serve its use (performance optimisation) it can be
safely removed without further thought. think this is essential to keeping
the code easy to understand and maintain. This one principle actually
covers it all for me. The next three points are implied by this:
1a. This use of goto only jumps downward. Jumping backwards (up) would
likely violate point 1, and either way would imho be too complicated to
think about when debugging or maintaining the code in the future.
Especially the potential for an infinite loop.
1b. This use of goto only jumps to a statement within the same function.
(In fact, jumping to another file, class, or function is not supported by
PHP in the first place. This is literally the only way it can be used.
There is some sanity in the language after all!).
1c. This use of goto serves as a performance optimisation for a hot code
path. Similarly implied by point 1: If it doesn't change behaviour and
doesn't improve performance where it matters, we shouldn't bother using it.

2. An inline comment clearly stays it is a performance optimisation, and
explains why it is safe, and how we know the destination is where we would
end up regardless. (e.g. "we're inside a condition for X2, so we can skip
to the else of X1, and no other statements would run between here and
there").

-- Timo


On Sat, 31 Jul 2021 at 05:10, Tim Starling <tstarl...@wikimedia.org> wrote:

> For performance sensitive tight loops, such as parsing and HTML
> construction, to get the best performance it's necessary to think about
> what PHP is doing on an opcode by opcode basis.
>
> Certain flow control patterns cannot be implemented efficiently in PHP
> without using "goto". The current example in Gerrit 708880
> <https://gerrit.wikimedia.org/r/c/mediawiki/core/+/708880/5/includes/Html.php#545>
> comes down to:
>
> if ( $x == 1 ) {
>       action1();
> } else {
>       action_not_1();
> }
> if ( $x == 2 ) {
>       action2();
> } else {
>       action_not_2();
> }
>
> If $x==1 is true, we know that the $x==2 comparison is unnecessary and is
> a waste of a couple of VM operations.
>
> It's not feasible to just duplicate the actions, they are not as simple as
> portrayed here and splitting them out to a separate function would incur a
> function call overhead exceeding the proposed benefit.
>
> I am proposing
>
> if ( $x == 1 ) {
>       action1();
>       goto not_2; // avoid unnecessary comparison $x == 2
> } else {
>       action_not_1();
> }
> if ( $x == 2 ) {
>       action2();
> } else {
>       not_2:
>       action_not_2();
> }
>
> I'm familiar with the cultivated distaste for goto. Some people are just
> parotting the textbook or their preferred authority, and others are scarred
> by experience with other languages such as old BASIC dialects. But I don't
> think either rationale really holds up to scrutiny.
>
> I think goto is often easier to read than workarounds for the lack of
> goto. For example, maybe you could do the current example with break:
>
> do {
>       do {
>               if ( $x === 1 ) {
>                       action1();
>                       break;
>               } else {
>                       action_not_1();
>               }
>               if ( $x === 2 ) {
>                       action2();
>                       break 2;
>               }
>       } while ( false );
>       action_not_2();
> } while ( false );
>
> But I don't think that's an improvement for readability.
>
> You can certainly use goto in a way that makes things unreadable, but that
> goes for a lot of things.
>
> I am requesting that goto be considered acceptable for micro-optimisation.
>
> When performance is not a concern, abstractions can be introduced which
> restructure the code so that it flows in a more conventional way. I
> understand that you might do a double-take when you see "goto" in a
> function. Unfamiliarity slows down comprehension. That's why I'm suggesting
> that it only be used when there is a performance justification.
>
> -- Tim Starling
> _______________________________________________
> Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
> To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

Reply via email to