[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17011005#comment-17011005
 ] 

Eric Badger commented on YARN-9052:
-----------------------------------

I'm sure we all agree that there's a balancing art of invasive code cleanup vs 
increasing tech debt. And I'm also sure that everyone will have their slightly 
own differing opinion on which of those is more important. It might be a good 
idea to get more eyes on this broud issue by posting to the larger Hadoop 
mailing list. But I'll add my 2 cents here. 

I generally like to follow the principal of "if something is hard, do it 
frequently" or whatever version of that phrase that you prefer. If someone 
wants to make changes that are good for the long term health of Hadoop, I don't 
want to discourage that. I understand that that means increased pain in the 
short term, but I believe that it promotes health in the long term. The code 
cleanup is the "something that is hard" in this case. If we wait for years for 
stuff to pile up and then try to deal with everything all at once, then every 
committer will basically have a full time job of refactoring code for a few 
months until we're back into a reasonable state. However, if we do it more 
frequently, we can fix some things, but have each chunk much smaller and more 
tenable. But I'm also not going to say that filing 10+ JIRAs in succession is a 
reasonable rate to do code cleanups. Maybe once-in-a-while it isn't that bad, 
though.

I do believe that code cleanup/refactoring is important for the health of 
longterm code, even if the end result of each individual patch has a net result 
of not changing functionality. It makes the code easier to work with in the 
future, and oftentimes makes it simpler or less confusing, which leads to a 
decreased likelihood of bugs being introduced. When code is written poorly, 
trying to modify that code is a nightmare and bugs are created because it's so 
hard to follow what's going on in the code. So while the immediate effect is 
purely negative (patches don't apply anymore, branches diverge), the long-term 
effect (easier code to manage/debug) is positive. It all depends on what you're 
optimizing for. 

I agree with the sentiment that we should treat each code cleanup as a bug 
fix/feature in that we need to test the code vigorously even if _nothing 
changed, we just refactored_. Especially when dealing with non-test code, 
cleanups and refactors need to be done with a huge amount of detail and care., 
since they risk breaking existing functionality. But this is no different than 
any other patch that someone could put up and commit. There is always a risk of 
breaking existing functionality. We need to trust committers to do proper 
testing and not commit code that is too risky for little benefit. 

All of what I've said above assumes that the code cleanup/refactoring is 
actually making the cleaner, more readable, and easier to modify/fix. Any 
changes that do not meet that criteria should be closed as Won't Fix since they 
do not give short term or long term benefit.

> Replace all MockRM submit method definitions with a builder
> -----------------------------------------------------------
>
>                 Key: YARN-9052
>                 URL: https://issues.apache.org/jira/browse/YARN-9052
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Minor
>             Fix For: 3.3.0
>
>         Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.008.patch, YARN-9052.009.patch, 
> YARN-9052.009.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them having more than 
> acceptable number of parameters, ranging from 2 to even 22 parameters, which 
> makes the code completely unreadable.
> On top of unreadability, it's very hard to follow what RmApp will be produced 
> for tests as they often pass a lot of empty / null values as parameters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to