[jira] [Commented] (YARN-5167) Escaping occurences of encodedValues

Joep Rottinghuis (JIRA) Thu, 26 May 2016 17:47:08 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303270#comment-15303270
 ]


Joep Rottinghuis commented on YARN-5167:
----------------------------------------

In 
https://issues.apache.org/jira/browse/YARN-5109?focusedCommentId=15302672&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15302672
[~varun_saxena] pointed out:
"In Separator#encode, we are using String#replace which in turn uses Pattern. 
Why dont we use StringUtils#replace instead ?
I think former would be slower.
StringUtils#replace uses indexOf and would return the passed string if indexOf 
returns -1(which would be most of the cases)"

While this is a good point, it may be moot after this jira: we may have to 
roll our own replace with indexOf, because we need to check whether a 
backslash precedes the sequence we're looking to replace.
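
As a rough illustration of what such a hand-rolled replace might look like, 
here is a minimal sketch. The class and method names (UnescapedReplace, 
replaceUnescaped) are hypothetical and not part of Separator; the sketch only 
assumes that "escaped" means "immediately preceded by a backslash". Like 
StringUtils#replace, it returns the passed string when indexOf returns -1.

```java
final class UnescapedReplace {
  // Assumption: a single backslash directly before the target marks it as escaped.
  private static final char ESCAPE = '\\';

  /**
   * Replaces every occurrence of target that is not immediately preceded by
   * a backslash. Escaped occurrences are copied through unchanged.
   */
  static String replaceUnescaped(String s, String target, String replacement) {
    if (target.isEmpty()) {
      return s;
    }
    int hit = s.indexOf(target);
    if (hit < 0) {
      return s; // common case: nothing to replace, return the same instance
    }
    StringBuilder out = new StringBuilder(s.length());
    int from = 0;
    while (hit >= 0) {
      boolean escaped = hit > 0 && s.charAt(hit - 1) == ESCAPE;
      out.append(s, from, hit);                   // text before the match
      out.append(escaped ? target : replacement); // keep or replace the match
      from = hit + target.length();
      hit = s.indexOf(target, from);
    }
    out.append(s, from, s.length());              // trailing text
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(replaceUnescaped("a%2$b", "%2$", " "));
    System.out.println(replaceUnescaped("a\\%2$b", "%2$", " "));
  }
}
```

The first call replaces the sequence; the second leaves it alone because a 
backslash precedes it.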

> Escaping occurences of encodedValues
> ------------------------------------
>
>                 Key: YARN-5167
>                 URL: https://issues.apache.org/jira/browse/YARN-5167
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Joep Rottinghuis
>            Assignee: Sangjin Lee
>            Priority: Critical
>
> We had earlier decided to punt on this, but in discussing YARN-5109 we 
> thought it would be best to just be safe rather than sorry later on.
> Encoded sequences can occur in the original string, especially in case of 
> "foreign key" if we decide to have lookups.
> For example, space is encoded as %2$.
> Encoding "String with %2$ in it" and then decoding it would yield 
> "String with   in it".
> We thought we should first escape existing occurrences of encoded strings by 
> prefixing a backslash (even if there is already a backslash, that should be 
> ok). Then we should replace all unencoded values with their encoded sequences.
> On the way out, we should replace all occurrences of our encoded string with 
> the original, except when it is prefixed by an escape character. Lastly we 
> should strip off the one additional backslash in front of each remaining 
> (escaped) sequence.
> Adding the following entry to TestSeparator#testEncodeDecode() 
> demonstrates what this jira should accomplish:
> {code}
>     testEncodeDecode("Double-escape %2$ and %3$ or \\%2$ or \\%3$, nor  "
>         + "\\\\%2$ = no problem!", Separator.QUALIFIERS,
>         Separator.VALUES, Separator.SPACE, Separator.TAB);
> {code}
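
The scheme in the description quoted above could be sketched roughly as 
follows, for a single separator only (space encoded as %2$). The class name 
EscapingSeparatorSketch and its methods are hypothetical and are not the 
Separator API. The decode step here folds "replace unescaped sequences" and 
"strip one backslash from escaped ones" into a single indexOf pass, which is 
one possible reading of the description rather than the agreed implementation.

```java
final class EscapingSeparatorSketch {
  private static final String RAW = " ";       // value to be encoded
  private static final String ENCODED = "%2$"; // encoded form, per the description
  private static final char ESCAPE = '\\';     // assumed escape character

  static String encode(String s) {
    // Step 1: prefix a backslash to sequences that already look encoded.
    // Step 2: encode the raw value. The order matters.
    return s.replace(ENCODED, ESCAPE + ENCODED).replace(RAW, ENCODED);
  }

  static String decode(String s) {
    int hit = s.indexOf(ENCODED);
    if (hit < 0) {
      return s; // nothing encoded or escaped
    }
    StringBuilder out = new StringBuilder(s.length());
    int from = 0;
    while (hit >= 0) {
      if (hit > from && s.charAt(hit - 1) == ESCAPE) {
        // Escaped: drop exactly one backslash and keep the sequence literally.
        out.append(s, from, hit - 1).append(ENCODED);
      } else {
        // Unescaped: restore the original value.
        out.append(s, from, hit).append(RAW);
      }
      from = hit + ENCODED.length();
      hit = s.indexOf(ENCODED, from);
    }
    out.append(s, from, s.length());
    return out.toString();
  }

  public static void main(String[] args) {
    String original = "String with %2$ in it";
    String encoded = encode(original);
    System.out.println(encoded);
    System.out.println(decode(encoded).equals(original));
  }
}
```

One caveat with this naive version: an input in which a backslash directly 
precedes a raw space does not round-trip, because the encoded space then 
looks escaped on the way out.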



