Rules for Inferring Schema Node Types and Structure
This topic describes how the schema inference process translates the node types in an XML document to an XML Schema definition language (XSD) structure.
Element Inference Rules
This section describes the inference rules for element declarations. There are eight structures of element declarations that will be inferred:
Element of simple type
Empty element
Empty element with attributes
Element with attributes and simple content
Element with a sequence of child elements
Element with a sequence of child elements and attributes
Element with a sequence of choices of child elements
Element with a sequence of choices of child elements and attributes
Note
All complexType declarations are inferred as anonymous types. The only global element inferred is the root element; all other elements are local.
For more information about the schema inference process, see Inferring Schemas from XML Documents.
Simple Typed Element
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema shows what is inferred for the simple typed element.
XML | Schema |
---|---|
<?xml version="1.0"?> <root>text</root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root" type="xs:string" /> </xs:schema> |
Empty Element
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema shows what is inferred for the empty element.
XML | Schema |
---|---|
<?xml version="1.0"?> <empty/> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="empty" /> </xs:schema> |
Empty Element with Attributes
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for the empty element with attributes.
XML | Schema |
---|---|
<?xml version="1.0"?> <empty attribute1="text"/> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="empty"> <xs:complexType> <xs:attribute name="attribute1" type="xs:string" use="required" /> </xs:complexType> </xs:element> </xs:schema> |
Element with Attributes and Simple Content
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for an element with attributes and simple content.
XML | Schema |
---|---|
<?xml version="1.0"?> <root attribute1="text">value</root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="attribute1" type="xs:string" use="required" /> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element> </xs:schema> |
Element with a Sequence of Child Elements
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for an element with a sequence of child elements.
Note
Even if an element has only one child element, it is still treated as a sequence.
XML | Schema |
---|---|
<?xml version="1.0"?> <root> <subElement/> </root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:sequence> <xs:element name="subElement" /> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> |
Element with a Sequence of Child Elements and Attributes
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for an element with a sequence of child elements and attributes.
Note
Even if an element has only one child element, it is still treated as a sequence.
XML | Schema |
---|---|
<?xml version="1.0"?> <root attribute1="text"> <subElement1/> <subElement2/> </root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:sequence> <xs:element name="subElement1" /> <xs:element name="subElement2" /> </xs:sequence> <xs:attribute name="attribute1" type="xs:string" use="required" /> </xs:complexType> </xs:element> </xs:schema> |
Element with a Sequence and Choices of Child Elements
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for an element with a sequence and choice of child elements.
Note
The maxOccurs attribute of the xs:choice element is set to "unbounded" in the inferred schema.
XML | Schema |
---|---|
<?xml version="1.0"?> <root> <subElement1/> <subElement2/> <subElement1/> </root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:sequence> <xs:choice maxOccurs="unbounded"> <xs:element name="subElement1" /> <xs:element name="subElement2" /> </xs:choice> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> |
Element with a Sequence and Choice of Child Elements and Attributes
The following table shows the XML input to the InferSchema method and the XML schema generated. The xs:element declaration in the schema, including its nested xs:complexType, shows what is inferred for an element with a sequence and choice of child elements and attributes.
Note
The maxOccurs attribute of the xs:choice element is set to "unbounded" in the inferred schema.
XML | Schema |
---|---|
<?xml version="1.0"?> <root attribute1="text"> <subElement1/> <subElement2/> <subElement1/> </root> | <?xml version="1.0" encoding="utf-8"?> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="root"> <xs:complexType> <xs:sequence> <xs:choice maxOccurs="unbounded"> <xs:element name="subElement1" /> <xs:element name="subElement2" /> </xs:choice> </xs:sequence> <xs:attribute name="attribute1" type="xs:string" use="required" /> </xs:complexType> </xs:element> </xs:schema> |
Attribute Processing
When an attribute is encountered on the first occurrence of a node, it is added to the inferred definition of the node with use="required". The next time the same node is found in the instance, the inference process compares the attributes of the current occurrence with the ones already inferred. If any already inferred attributes are missing from the current occurrence, their definitions are changed to use="optional". Attributes that first appear on a later occurrence are added to the existing declaration with use="optional".
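The rule above can be sketched in a few lines of Python. This is only a toy model of the described behavior, not the XmlSchemaInference implementation; the function name infer_attribute_use and the sample document are hypothetical and exist purely for illustration.

```python
import xml.etree.ElementTree as ET


def infer_attribute_use(occurrences):
    """Return {attribute name: "required" | "optional"} for one element.

    `occurrences` is an iterable of attribute-name collections, one per
    occurrence of the same element, in document order. (Hypothetical helper.)
    """
    inferred = {}
    for index, attrs in enumerate(occurrences):
        attrs = set(attrs)
        if index == 0:
            # First occurrence: every attribute starts out as required.
            inferred = {name: "required" for name in attrs}
            continue
        # Attributes inferred earlier but missing here become optional.
        for name in inferred:
            if name not in attrs:
                inferred[name] = "optional"
        # Attributes first seen on a later occurrence are optional from the start.
        for name in attrs:
            inferred.setdefault(name, "optional")
    return inferred


# Illustrative input: three occurrences of the same <item> element.
doc = ET.fromstring(
    "<root>"
    '<item id="1" color="red"/>'
    '<item id="2"/>'
    '<item id="3" size="L"/>'
    "</root>"
)
result = infer_attribute_use(item.attrib.keys() for item in doc.iter("item"))
print(result)
```

In this sample, id appears on every occurrence and stays required, color is dropped on the second occurrence and becomes optional, and size first appears on the third occurrence and is therefore optional from the start.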
Occurrence Constraints
During the schema inference process, the minOccurs and maxOccurs attributes are generated for inferred components of a schema: minOccurs with the value "0" or "1", and maxOccurs with the value "1" or "unbounded". The values "1" and "unbounded" are used only when the values "0" and "1" cannot validate the XML document (for example, if minOccurs="0" does not accurately describe an element, minOccurs="1" is used).
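As a rough illustration, the following Python sketch derives occurrence constraints for child elements from several instances of the same parent. It is a simplification under stated assumptions, not the XmlSchemaInference algorithm: it assumes a child gets minOccurs="0" only when at least one parent instance omits it, and maxOccurs="unbounded" only when some parent instance contains it more than once. It also ignores ordering and the sequence-versus-choice decision. The name infer_occurrences and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_occurrences(parents):
    """Return {child name: (minOccurs, maxOccurs)} across all parent instances.

    Simplified model (an assumption, not the .NET algorithm): counts each
    child name per parent and picks the constraint values from the counts.
    """
    counts = [Counter(child.tag for child in parent) for parent in parents]
    names = set().union(*counts) if counts else set()
    constraints = {}
    for name in sorted(names):
        per_parent = [c[name] for c in counts]  # a Counter yields 0 when absent
        min_occurs = "0" if min(per_parent) == 0 else "1"
        max_occurs = "unbounded" if max(per_parent) > 1 else "1"
        constraints[name] = (min_occurs, max_occurs)
    return constraints


# Illustrative input: two <order> instances with differing children.
doc = ET.fromstring(
    "<orders>"
    "<order><id/><note/><line/><line/></order>"
    "<order><id/><line/></order>"
    "</orders>"
)
occurrences = infer_occurrences(doc.findall("order"))
print(occurrences)
```

Under these assumptions, id is present exactly once in every order, note is missing from the second order and so may occur zero times, and line repeats in the first order and so is unbounded.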
Mixed Content
If an element contains mixed content (for example, text interspersed with elements), the mixed="true" attribute is generated for the inferred complex type definition.
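A minimal Python sketch of what "mixed content" means here is shown below. It assumes that whitespace-only text between child elements does not count as content; that assumption, the helper name has_mixed_content, and the sample inputs are illustrative and are not taken from the XmlSchemaInference class itself.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(element):
    """True if the element has child elements and non-whitespace text around them."""
    children = list(element)
    if not children:
        # Text-only or empty elements are not mixed content.
        return False
    # Text before the first child, plus the text that follows each child.
    pieces = [element.text] + [child.tail for child in children]
    return any(piece and piece.strip() for piece in pieces)


mixed = ET.fromstring("<p>Hello <b>world</b>!</p>")
elements_only = ET.fromstring("<root>\n  <a/>\n  <b/>\n</root>")
text_only = ET.fromstring("<root>text</root>")

print(has_mixed_content(mixed))          # text interspersed with a child element
print(has_mixed_content(elements_only))  # only indentation between children
print(has_mixed_content(text_only))      # simple content, no children
```

Only the first input would qualify for mixed="true" under this reading; the other two correspond to the element-only and simple-content structures described earlier.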
Other Node Type Inference Rules
The following table describes the inference rules for processing instruction, comment, entity reference, CDATA, document type, and namespace nodes.
Node Type | Translation |
---|---|
Processing instruction | Ignored. |
Comment | Ignored. |
Entity reference | The XmlSchemaInference class does not handle entity references. If an XML document contains entity references, you need to use a reader that expands the entities. For example, you can pass an XmlTextReader with the EntityHandling property set to ExpandEntities as a parameter. If entity references are encountered and the reader does not expand entities, an exception is thrown. |
CDATA | Any <![CDATA[ … ]]> sections in an XML document are inferred as xs:string. |
Document type | Ignored. |
Namespaces | Ignored. |