Re: QName prefix and localpart handling

John Kaputin Fri, 02 Jun 2006 04:21:26 -0700

> I think this model helps parsing - QNames can be constructed which
> represent the valid or invalid data in XML. Then the QName is there
> for the validator to check semantically later. Perhaps the NCName
> should work to the same model instead of validating the input
> parameters to the ctor?

The Woden validator is a WSDL validator. I don't think it should do XML
data type validation and I don't think we should develop an XML data type
validator.

Given that Woden allows invalid WSDL to be parsed and returned in an object
model along with validation errors, I agree with Jeremy about the need to
capture the QName, NCName, etc, even if the underlying attribute values are
invalid.

I think the requirement for parsing XML attributes is to:
1. validate the attribute's string value against the expected XML data type
and report any errors
2. convert the string to the appropriate Java type if valid
3. capture (for use post-parse time) whether or not the attribute is valid
4. capture (for use post-parse time) the original string value of the
attribute
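
A minimal sketch of those four steps for a URI-typed attribute. The class
name ParsedUriAttr and its methods are hypothetical, chosen for illustration
only; they are not part of the Woden API. It assumes a non-null attribute
string.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical holder for a parsed URI attribute; not a Woden class. */
final class ParsedUriAttr {
    private final String original; // step 4: original string value
    private final URI value;       // step 2: converted value, null if invalid
    private final String error;    // step 1: reported error, null if valid

    private ParsedUriAttr(String original, URI value, String error) {
        this.original = original;
        this.value = value;
        this.error = error;
    }

    /** Validates and converts, but always returns an object. */
    static ParsedUriAttr parse(String attrValue) {
        try {
            return new ParsedUriAttr(attrValue, new URI(attrValue), null);
        } catch (URISyntaxException e) {
            return new ParsedUriAttr(attrValue, null, e.getMessage());
        }
    }

    boolean isValid() { return error == null; } // step 3: validity flag
    URI getValue() { return value; }
    String getOriginal() { return original; }
    String getError() { return error; }

    public static void main(String[] args) {
        ParsedUriAttr good = parse("http://example.org/svc");
        ParsedUriAttr bad = parse("http://example.org/a b"); // space is illegal
        System.out.println(good.isValid() + " " + good.getValue());
        System.out.println(bad.isValid() + " " + bad.getOriginal());
    }
}
```

The point of the sketch is that the invalid case still yields an object, so
the original string and the error remain available after parsing.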

I have some thoughts on this based on combining the approach used with
XMLAttr subtypes in org.apache.woden.xml and with the existing
validation/conversion logic in org.apache.axis.types. I'll capture
something on a Wiki page. In the meantime you could take a look at these
two packages.

The classes in org.apache.axis.types seem to provide the validation and
conversion logic, but they don't store the validity. They do provide a
static isValid(String) method, but I think you can only create these
objects for valid attribute values (i.e. you can detect an error with
isValid(String), but cannot create an object to retain the invalid string).

Woden uses the XMLAttr subtypes in org.apache.woden.xml for extension
attributes. These are wrappers for existing Java classes where appropriate,
such as java.net.URI or javax.xml.namespace.QName. These wrappers provide
additional behaviour such as deserialization and validation, a validity
indicator and access to the original string value. They could be used for
default serialization logic too. Currently there are classes only for the
attribute data types used for the WSDL extensions defined in the spec.

For example:
QNameAttr for 'xs:QName'
ListOfQNameAttr for 'list of xs:QName'
TokenAttr for 'xs:token'
IntOrTokenAttr for 'xs:integer or xs:token'

These XMLAttr subtypes are registered as extension attributes in the
ExtensionRegistry. They contain deserialization and conversion logic
similar to that in the reader.parseXXX methods for WSDL elements or the
XXXExtensionDeserializer.unmarshall methods for extension elements.

These XMLAttr subclasses do the following:
1. convert the infoset attribute value to an object of the appropriate type
(URI, QName, Boolean, QName[], etc)
2. report any conversion/validation errors and set an isValid() boolean
3. return the object via the XMLAttr.getContent() method or a type specific
getter such as QNameAttr.getQName()
4. return a String containing the original infoset attribute value

For the infoset attributes in the WSDL namespace, Woden mostly uses
standard classes such as java.lang.String, javax.xml.namespace.QName and
java.net.URI. However, I have copied NCName from Axis2 into
org.apache.woden.types. The disadvantage with Woden's approach to WSDL
attributes is that you cannot instantiate an invalid URI or NCName, so you
lose the original value, and you can only tell if a QName is valid by
examining its contents. The XMLAttr approach solves these problems.

There are various options for what to expose where in the Woden API
depending on the use case, but these require more thought.

regards,
John Kaputin

From: "Jeremy Hughes" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
Date: 01/06/2006 20:56
To: woden-dev@ws.apache.org
Subject: Re: QName prefix and localpart handling
Please respond to: [EMAIL PROTECTED]

On 6/1/06, John Kaputin <[EMAIL PROTECTED]> wrote:
> > It turns out that Woden DOMWSDLParser assumes the name attribute is a
> > well formed NCName and just passes its value [Attr.getValue()] into
> > the localpart param of the QName ctor.
> >
> > So I think we need to fix Woden to report an error at this point.
>
> Remember that Woden validation is performed by syntax validation via XML
> Schema validation, then semantic validation via the WSDLDocumentValidator
> and WSDLComponentValidator. So with the WSDLReader validation feature
> enabled, this type of error will be reported by schema validation (e.g. if
> the value of the 'name' attribute of <service> is 'xyz:reservationService'
> the error reported is: 'xyz:reservationService' is not a valid value for
> 'NCName'.).
>
> However, Woden will still return a DescriptionElement so even if this error
> is reported by schema validation, a QName will still be created with
> "xyz:reservationService" as its local part. This is not a QName error (at
> least, it doesn't seem to violate javax.xml.namespace.QName as defined by
> J2EE 1.4 or J2SE 1.5). Rather than create such a QName we could trap this
> 'error' at parse-time by creating an NCName object (which contains NCName
> validation logic) from the attribute value and only create the QName if
> there are no NCName errors. If there are NCName errors no QName will be
> created and the relevant API method like Service.getName() will return
> null. And of course, the null QName error will get reported by
> WSDLComponentValidator.
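
A rough sketch of the parse-time check described in the quoted paragraph:
only build the QName when the 'name' attribute passes an NCName check,
otherwise return null. The class and method names are hypothetical, and the
pattern is an ASCII-only approximation of NCName (the real production in
Namespaces in XML allows many more characters), so treat it as an
illustration rather than a validator.

```java
import java.util.regex.Pattern;
import javax.xml.namespace.QName;

class NameCheckSketch {
    // ASSUMPTION: ASCII-only approximation of NCName. It rejects colons,
    // which is the case under discussion, but also rejects legal
    // non-ASCII name characters.
    private static final Pattern ASCII_NCNAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9._-]*");

    static boolean looksLikeNCName(String s) {
        return s != null && ASCII_NCNAME.matcher(s).matches();
    }

    /** Returns a QName only if nameAttr passes the check, else null. */
    static QName toQNameOrNull(String targetNamespace, String nameAttr) {
        return looksLikeNCName(nameAttr)
                ? new QName(targetNamespace, nameAttr)
                : null;
    }

    public static void main(String[] args) {
        String tns = "http://example.org/ns"; // made-up namespace
        System.out.println(toQNameOrNull(tns, "reservationService"));
        System.out.println(toQNameOrNull(tns, "xyz:reservationService"));
    }
}
```

With this shape the invalid string is lost once null is returned, which is
the trade-off raised elsewhere in this thread.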

I'm curious as to why NCName operates differently to QName w.r.t.
validation in the ctor. If the localpart param of the QName ctor
contains a colon, QName doesn't complain and a (surely) invalid QName
is constructed. The J2SE 5 javadoc for QName.valueOf() admits as much:

"This method does not do full validation of the resulting QName. In
particular, the local part is not validated as a NCName as specified
in Namespaces in XML." [1]
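
A small check of that behaviour, using only javax.xml.namespace.QName. The
namespace URI is made up for the example.

```java
import javax.xml.namespace.QName;

class QNameColonDemo {
    public static void main(String[] args) {
        // The two-argument ctor accepts a local part containing a colon.
        QName q = new QName("http://example.org/ns", "xyz:reservationService");
        System.out.println(q.getLocalPart()); // prints xyz:reservationService

        // valueOf() takes everything after '}' as the local part, unchecked.
        QName v = QName.valueOf("{http://example.org/ns}xyz:reservationService");
        System.out.println(q.equals(v)); // prints true
    }
}
```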

I think this model helps parsing - QNames can be constructed which
represent the valid or invalid data in XML. Then the QName is there
for the validator to check semantically later. Perhaps the NCName
should work to the same model instead of validating the input
parameters to the ctor?

> So, I don't think we need to 'fix' Woden to report an error - the error
> already gets reported, but I do think we should check that we have a valid
> NCName before attempting to use it as the local part in the QName ctor.
> Unless anyone thinks differently, I suggest you raise a JIRA on this.

I guess I'm suggesting above that we could have an invalid NCName
created in the Element model which leads to an invalid QName in the
component model. However, this is minor compared to having consistency
between the way NCName and QName classes work. I'll open a JIRA.

> > Surely it should have a signature like this in
> > BindingElement:
> > public NCName getName();
>
> I agree. The Element API is meant to reflect the WSDL infoset, so
> BindingElement.getName() should return an NCName, not a QName. In the
> Component API, Binding.getName() will still return a QName as it should.
> Other accessor methods will need to change too to reflect this such as
> DescriptionElement.getBindingElement(NCName).  Another JIRA?

I've realised there's a problem here. BindingImpl implements both the
Binding and BindingElement interfaces. If we change the return type of
BindingElement's getName() method to NCName, BindingImpl won't be able to
implement both interfaces, because Java does not allow two methods that
differ only in return type. It can't have both of these methods:

NCName getName()
QName getName()

I have a proposal though - how about splitting BindingImpl into two
classes: BindingImpl implements Binding, and BindingElementImpl
implements BindingElement? There is then the issue of tying the two
impl classes together. When you create a BindingElementImpl it would
contain a null reference to a BindingImpl. When you add the
BindingElementImpl to a DescriptionElement a BindingImpl will get
created, referred to from the BindingElementImpl and added to the
Description.

If a BindingImpl were created first then the opposite would happen when
it is added to a Description, i.e. a BindingElementImpl would be created,
referred to from the BindingImpl and added to the DescriptionElement.
Phew. I think I need to write some code - there's probably a gotcha in
there.

By doing all this, whenever you have an instance of a Binding you can
call QName getName(). Whenever you have an instance of a
BindingElement you can call NCName getName().
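
To make the proposal concrete, here is a rough sketch of the split, showing
only the element-first direction. Everything in it is an assumption for
illustration: the NCName stand-in does no validation, DescriptionSketch
stands in for both DescriptionElement and Description, and none of these
signatures are taken from the Woden code.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;

class SplitBindingSketch {
    /** Stand-in for an NCName type; performs no validation here. */
    static final class NCName {
        private final String value;
        NCName(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    interface Binding { QName getName(); }         // component model
    interface BindingElement { NCName getName(); } // element (infoset) model

    static final class BindingImpl implements Binding {
        private final QName name;
        private final BindingElementImpl element;
        BindingImpl(QName name, BindingElementImpl element) {
            this.name = name;
            this.element = element;
        }
        @Override public QName getName() { return name; }
        BindingElementImpl getElement() { return element; }
    }

    static final class BindingElementImpl implements BindingElement {
        private final NCName name;
        private BindingImpl component; // null until added to a description
        BindingElementImpl(NCName name) { this.name = name; }
        @Override public NCName getName() { return name; }
        BindingImpl getComponent() { return component; }
    }

    /** Hypothetical container playing both description roles. */
    static final class DescriptionSketch {
        private final String targetNamespace;
        private final List<BindingElementImpl> elements = new ArrayList<>();
        private final List<BindingImpl> components = new ArrayList<>();
        DescriptionSketch(String targetNamespace) {
            this.targetNamespace = targetNamespace;
        }
        /** Adding an element creates and links the matching component. */
        void addBindingElement(BindingElementImpl e) {
            QName qname = new QName(targetNamespace, e.getName().toString());
            BindingImpl c = new BindingImpl(qname, e);
            e.component = c;
            elements.add(e);
            components.add(c);
        }
        List<BindingImpl> getBindings() { return components; }
    }

    public static void main(String[] args) {
        DescriptionSketch desc = new DescriptionSketch("http://example.org/ns");
        BindingElementImpl el = new BindingElementImpl(new NCName("resBinding"));
        desc.addBindingElement(el);
        System.out.println(el.getName());                // NCName view
        System.out.println(el.getComponent().getName()); // QName view
    }
}
```

The reverse direction (component created first) would mirror
addBindingElement, and that symmetry is where I'd expect any gotcha to
show up.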

Like I said ... I need to write a bit of code to make sure I haven't
made a mistake.

[1]
http://java.sun.com/webservices/docs/1.5/api/javax/xml/namespace/QName.html

Cheers,
Jeremy

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]