public interface LSSerializer

An LSSerializer provides an API for serializing (writing) a DOM document out into XML. The XML data is written to a string or an output stream. Any changes or fixups made during the serialization affect only the serialized data. The Document object and its children are never altered by the serialization operation.

During serialization of XML data, namespace fixup is done as defined in [DOM Level 3 Core], Appendix B. [DOM Level 2 Core] allows empty strings as a real namespace URI. If the namespaceURI of a Node is an empty string, the serialization will treat it as null, ignoring the prefix if any.

LSSerializer accepts any node type for serialization. For nodes of type Document or Entity, well-formed XML will be created when possible (well-formedness is guaranteed if the document or entity comes from a parse operation and is unchanged since it was created). The serialized output for these node types is either an XML document or an External XML Entity, respectively, and is acceptable input for an XML parser. For all other types of nodes the serialized form is implementation dependent.

Within a Document, DocumentFragment, or Entity being serialized, Nodes are processed as follows:
- Document nodes are written, including the XML declaration (unless the parameter "xml-declaration" is set to false) and a DTD subset, if one exists in the DOM. Writing a Document node serializes the entire document.
- Entity nodes, when written directly by LSSerializer.write, output the entity expansion, but no namespace fixup is done. The resulting output will be valid as an external entity.
- If the parameter "entities" is set to true, EntityReference nodes are serialized as an entity reference of the form "&entityName;" in the output. Child nodes (the expansion) of the entity reference are ignored. If the parameter "entities" is set to false, only the children of the entity reference are serialized. EntityReference nodes with no children (no corresponding Entity node or the corresponding Entity nodes have no children) are always serialized.
- CDATASections containing content characters that cannot be represented in the specified output encoding are handled according to the "split-cdata-sections" parameter. If the parameter is set to true, CDATASections are split, and the unrepresentable characters are serialized as numeric character references in ordinary content. The exact position and number of splits is not specified. If the parameter is set to false, unrepresentable characters in a CDATASection are reported as "wf-invalid-character" errors if the parameter "well-formed" is set to true. The error is not recoverable: there is no mechanism for supplying alternative characters and continuing with the serialization.
- DocumentFragment nodes are serialized by serializing the children of the document fragment in the order they appear in the document fragment.
- All other node types (Element, Text, etc.) are serialized to their corresponding XML source form.
Note: The serialization of a Node does not always generate a well-formed XML document, i.e. an LSParser might throw fatal errors when parsing the resulting serialization.

Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references. Occurrences of '<' and '&' are replaced by the predefined entities &lt; and &amp;. The other predefined entities (&gt;, &apos;, and &quot;) might not be used, except where needed (e.g. using &gt; in cases such as ']]>'). Any characters that cannot be represented directly in the output character encoding are serialized as numeric character references (and since character encoding standards commonly use hexadecimal representations of characters, using the hexadecimal representation when serializing character references is encouraged).

To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;". New line characters and other characters that cannot be represented directly in attribute values in the output character encoding are serialized as a numeric character reference.
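
The escaping rules above can be observed with a small in-memory document. This is a minimal sketch, assuming the DOM implementation returned by DocumentBuilderFactory supports the "LS" feature (the JDK's built-in one does); the class name EscapingSketch and its helper method are hypothetical. How the double quotes inside the attribute value are written is left to the implementation, so the sketch only relies on the '<' and '&' replacement in character data.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class EscapingSketch {

    /** Builds a one-element document whose text contains '<' and '&', then serializes it. */
    static String serializeSample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }

        // Namespace-aware creation methods keep namespace fixup straightforward.
        Element note = doc.createElementNS(null, "note");
        note.setAttributeNS(null, "title", "say \"hi\"");
        note.appendChild(doc.createTextNode("a < b & c"));
        doc.appendChild(note);

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        // The DOM still holds the raw text; only the serialized form is escaped.
        System.out.println(serializeSample());
    }
}
```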
Within markup, but outside of attributes, any occurrence of a character that cannot be represented in the output character encoding is reported as a DOMError fatal error. An example would be serializing the element <LaCañada/> with encoding="us-ascii". This will result in the generation of a DOMError "wf-invalid-character-in-node-name" (as proposed in "well-formed").

When requested by setting the parameter "normalize-characters" on LSSerializer to true, character normalization is performed according to the definition of fully normalized characters included in appendix E of [XML 1.1] on all data to be serialized, both markup and character data. The character normalization process affects only the data as it is being written; it does not alter the DOM's view of the document after serialization has completed.

Implementations are required to support the encodings "UTF-8", "UTF-16", "UTF-16BE", and "UTF-16LE" to guarantee that data is serializable in all encodings that are required to be supported by all XML parsers. When the encoding is UTF-8, whether or not a byte order mark is serialized, or if the output is big-endian or little-endian, is implementation dependent. When the encoding is UTF-16, whether or not the output is big-endian or little-endian is implementation dependent, but a Byte Order Mark must be generated for non-character outputs, such as LSOutput.byteStream or LSOutput.systemId. If the Byte Order Mark is not generated, a "byte-order-mark-needed" warning is reported. When the encoding is UTF-16LE or UTF-16BE, the output is big-endian (UTF-16BE) or little-endian (UTF-16LE) and the Byte Order Mark is not generated. In all cases, the encoding declaration, if generated, will correspond to the encoding used during the serialization (e.g. encoding="UTF-16" will appear if UTF-16 was requested).

Namespaces are fixed up during serialization: the serialization process will verify that namespace declarations, namespace prefixes and the namespace URI associated with elements and attributes are consistent. If inconsistencies are found, the serialized form of the document will be altered to remove them. The method used for doing the namespace fixup while serializing a document is the algorithm defined in Appendix B.1, "Namespace normalization", of [DOM Level 3 Core].
While serializing a document, the parameter "discard-default-content" controls whether or not non-specified data is serialized.
While serializing, errors and warnings are reported to the application through the error handler (LSSerializer.domConfig's "error-handler" parameter). This specification does not try to define all possible errors and warnings that can occur while serializing a DOM node, but some common error and warning cases are defined. The types (DOMError.type) of errors and warnings defined by this specification are:
- "no-output-specified" [fatal]: Raised when writing to an LSOutput if no output is specified in the LSOutput.
- "unbound-prefix-in-entity-reference" [fatal]: Raised if the configuration parameter "namespaces" is set to true and an entity whose replacement text contains unbound namespace prefixes is referenced in a location where there are no bindings for the namespace prefixes.
- "unsupported-encoding" [fatal]: Raised if an unsupported encoding is encountered.
In addition to raising the defined errors and warnings, implementations are expected to raise implementation specific errors and warnings for any other error and warning cases such as IO errors (file not found, permission denied,...) and so on.
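
As an illustration of the reporting mechanism described above, the following sketch registers a DOMErrorHandler through the "error-handler" parameter and records the type and message of whatever is reported. It is a sketch under assumptions: the class name ErrorHandlerSketch and the list-collecting approach are hypothetical, and which errors or warnings (if any) are raised for a given document depends on the implementation.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class ErrorHandlerSketch {

    /** Serializes a one-element document, appending any reported problems to the list. */
    static String serializeCollecting(List<String> problems) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        doc.appendChild(doc.createElementNS(null, "root"));

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        // DOMError.getType() carries strings such as "unsupported-encoding".
        DOMErrorHandler handler = error -> {
            problems.add(error.getType() + ": " + error.getMessage());
            return true; // let processing continue where the severity allows it
        };
        serializer.getDomConfig().setParameter("error-handler", handler);

        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        List<String> problems = new ArrayList<>();
        System.out.println(serializeCollecting(problems));
        System.out.println("reported: " + problems);
    }
}
```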
See also the Document Object Model (DOM) Level 3 Load and Save Specification.
- Since:
- 1.5
Method Summary
- DOMConfiguration getDomConfig(): The DOMConfiguration object used by the LSSerializer when serializing a DOM node.
- LSSerializerFilter getFilter(): When the application provides a filter, the serializer will call out to the filter before serializing each Node.
- String getNewLine(): The end-of-line sequence of characters to be used in the XML being written out.
- void setFilter(LSSerializerFilter filter): When the application provides a filter, the serializer will call out to the filter before serializing each Node.
- void setNewLine(String newLine): The end-of-line sequence of characters to be used in the XML being written out.
- boolean write(Node nodeArg, LSOutput destination): Serialize the specified node as described above in the general description of the LSSerializer interface.
- String writeToString(Node nodeArg): Serialize the specified node as described above in the general description of the LSSerializer interface.
- boolean writeToURI(Node nodeArg, String uri): A convenience method that acts as if LSSerializer.write was called with an LSOutput with no encoding specified and LSOutput.systemId set to the uri argument.
Method Detail
-
getDomConfig
DOMConfiguration getDomConfig()
The DOMConfiguration object used by the LSSerializer when serializing a DOM node.
In addition to the parameters recognized by the DOMConfiguration interface defined in [DOM Level 3 Core], the DOMConfiguration objects for LSSerializer add, or modify, the following parameters:
"canonical-form"
- true [optional]: Writes the document according to the rules specified in [Canonical XML]. In addition to the behavior described in "canonical-form" [DOM Level 3 Core], setting this parameter to true will set the parameters "format-pretty-print", "discard-default-content", and "xml-declaration" to false. Setting one of those parameters to true will set this parameter to false. Serializing an XML 1.1 document when "canonical-form" is true will generate a fatal error.
- false [required] (default): Do not canonicalize the output.
"discard-default-content"
- true [required] (default): Use the Attr.specified attribute to decide what attributes should be discarded. Note that some implementations might use whatever information is available to the implementation (i.e. XML schema, DTD, the Attr.specified attribute, and so on) to determine what attributes and content to discard if this parameter is set to true.
- false [required]: Keep all attributes and all content.
"format-pretty-print"
- true [optional]: Format the output by adding whitespace to produce a pretty-printed, indented, human-readable form. The exact form of the transformations is not specified by this specification. Pretty-printing changes the content of the document and may affect the validity of the document; validating implementations should preserve validity.
- false [required] (default): Don't pretty-print the result.
"ignore-unknown-character-denormalizations"
- true [required] (default): If, while verifying full normalization when [XML 1.1] is supported, a character is encountered for which the normalization properties cannot be determined, then raise an "unknown-character-denormalization" warning (instead of raising an error, if this parameter is not set) and ignore any possible denormalizations caused by these characters.
- false [optional]: Report a fatal error if a character is encountered for which the processor cannot determine the normalization properties.
"normalize-characters"
This parameter is equivalent to the one defined by DOMConfiguration in [DOM Level 3 Core]. Unlike in the Core, the default value for this parameter is true. While DOM implementations are not required to support fully normalizing the characters in the document according to appendix E of [XML 1.1], this parameter must be activated by default if supported.
"xml-declaration"
- true [required] (default): If a Document, Element, or Entity node is serialized, the XML declaration, or text declaration, should be included. The version (Document.xmlVersion if the document is a Level 3 document and the version is non-null, otherwise use the value "1.0"), and the output encoding (see LSSerializer.write for details on how to find the output encoding) are specified in the serialized XML declaration.
- false [required]: Do not serialize the XML and text declarations. Report an "xml-declaration-needed" warning if this will cause problems (i.e. the serialized data is of an XML version other than [XML 1.0], or an encoding would be needed to be able to re-parse the serialized data).
-
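
The parameters listed for getDomConfig can be set as in the sketch below. It assumes an "LS"-capable DOM implementation; the class name ConfigSketch is hypothetical. Because "format-pretty-print" set to true is marked [optional], the sketch probes it with canSetParameter first, while "xml-declaration" set to false is marked [required] and is set directly.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class ConfigSketch {

    /** Serializes a small document with the XML declaration suppressed. */
    static String serializeWithoutDeclaration() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElementNS(null, "root");
        root.appendChild(doc.createElementNS(null, "child"));
        doc.appendChild(root);

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        DOMConfiguration config = serializer.getDomConfig();

        // Supporting the value false for "xml-declaration" is required.
        config.setParameter("xml-declaration", Boolean.FALSE);

        // Supporting the value true for "format-pretty-print" is optional, so probe first.
        if (config.canSetParameter("format-pretty-print", Boolean.TRUE)) {
            config.setParameter("format-pretty-print", Boolean.TRUE);
        }
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        System.out.println(serializeWithoutDeclaration());
    }
}
```

The exact whitespace produced by pretty-printing is not specified, so callers should not depend on a particular indentation.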
getNewLine
String getNewLine()
The end-of-line sequence of characters to be used in the XML being written out. Any string is supported, but XML treats only a certain set of character sequences as end-of-line (see section 2.11, "End-of-Line Handling" in [XML 1.0], if the serialized content is XML 1.0, or section 2.11, "End-of-Line Handling" in [XML 1.1], if the serialized content is XML 1.1). Using character sequences other than the recommended ones can result in a document that is either not serializable or not well-formed.
On retrieval, the default value of this attribute is the implementation-specific default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by XML 1.0 or XML 1.1, depending on the serialized content. Setting this attribute to null will reset its value to the default value.
-
setNewLine
void setNewLine(String newLine)
The end-of-line sequence of characters to be used in the XML being written out. Any string is supported, but XML treats only a certain set of character sequences as end-of-line (see section 2.11, "End-of-Line Handling" in [XML 1.0], if the serialized content is XML 1.0, or section 2.11, "End-of-Line Handling" in [XML 1.1], if the serialized content is XML 1.1). Using character sequences other than the recommended ones can result in a document that is either not serializable or not well-formed.
On retrieval, the default value of this attribute is the implementation-specific default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by XML 1.0 or XML 1.1, depending on the serialized content. Setting this attribute to null will reset its value to the default value.
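
A short sketch of the set/reset behavior described above, assuming an "LS"-capable DOM implementation. The class NewLineSketch and its helper are hypothetical names, and the default sequence returned after resetting with null is implementation specific.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class NewLineSketch {

    /** Sets the end-of-line sequence on a fresh serializer and reads it back. */
    static String newLineAfterSetting(String newLine) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        serializer.setNewLine(newLine); // passing null resets to the default sequence
        return serializer.getNewLine();
    }

    public static void main(String[] args) {
        // A line feed is one of the sequences XML recognizes as end-of-line.
        System.out.println("\n".equals(newLineAfterSetting("\n")));
        // After a reset, the implementation-specific default is reported.
        System.out.println(newLineAfterSetting(null).length());
    }
}
```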
-
getFilter
LSSerializerFilter getFilter()
When the application provides a filter, the serializer will call out to the filter before serializing each Node. The filter implementation can choose to remove the node from the stream or to terminate the serialization early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, CDATA sections won't be passed to the filter if "cdata-sections" is set to false.
-
setFilter
void setFilter(LSSerializerFilter filter)
When the application provides a filter, the serializer will call out to the filter before serializing each Node. The filter implementation can choose to remove the node from the stream or to terminate the serialization early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, CDATA sections won't be passed to the filter if "cdata-sections" is set to false.
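
As a sketch of how a filter can remove nodes from the output, the following hypothetical DropComments filter asks to be shown comment nodes only and rejects each one. The class names are illustrative, an "LS"-capable DOM implementation is assumed, and the "comments" DOMConfiguration parameter is left at its default.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.w3c.dom.ls.LSSerializerFilter;
import org.w3c.dom.traversal.NodeFilter;

public class CommentFilterSketch {

    /** Hypothetical filter: consulted for comment nodes only, and rejects each of them. */
    static final class DropComments implements LSSerializerFilter {
        @Override
        public short acceptNode(Node node) {
            return node.getNodeType() == Node.COMMENT_NODE
                    ? NodeFilter.FILTER_REJECT
                    : NodeFilter.FILTER_ACCEPT;
        }

        @Override
        public int getWhatToShow() {
            return NodeFilter.SHOW_COMMENT;
        }
    }

    /** Serializes a document containing a comment, with the filter installed. */
    static String serializeWithoutComments() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElementNS(null, "root");
        root.appendChild(doc.createComment("internal note"));
        root.appendChild(doc.createTextNode("payload"));
        doc.appendChild(root);

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.setFilter(new DropComments());
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        // The comment stays in the DOM; it is only left out of the serialized form.
        System.out.println(serializeWithoutComments());
    }
}
```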
-
write
boolean write(Node nodeArg, LSOutput destination) throws LSException
Serialize the specified node as described above in the general description of the LSSerializer interface. The output is written to the supplied LSOutput.
When writing to an LSOutput, the encoding is found by looking at the encoding information that is reachable through the LSOutput and the item to be written (or its owner document) in this order:
1. LSOutput.encoding,
2. Document.inputEncoding,
3. Document.xmlEncoding.
If no encoding is reachable through the above properties, a default encoding of "UTF-8" will be used. If the specified encoding is not supported, an "unsupported-encoding" fatal error is raised.
If no output is specified in the LSOutput, a "no-output-specified" fatal error is raised.
The implementation is responsible for associating the appropriate media type with the serialized data.
When writing to an HTTP URI, an HTTP PUT is performed. When writing to other types of URIs, the mechanism for writing the data to the URI is implementation dependent.
- Parameters:
nodeArg - The node to serialize.
destination - The destination for the serialized DOM.
- Returns:
- Returns true if the node was successfully serialized. Returns false in case the normal processing stopped but the implementation kept serializing the document; the result of the serialization is then implementation dependent.
- Throws:
LSException - SERIALIZE_ERR: Raised if the LSSerializer was unable to serialize the node. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
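
The encoding lookup can be exercised by setting LSOutput.encoding explicitly, since that is the first place consulted. The sketch below writes to an in-memory byte stream; the class name WriteSketch is hypothetical and an "LS"-capable DOM implementation is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;

public class WriteSketch {

    /** Writes a one-element document to a byte stream as UTF-8 and returns it decoded. */
    static String writeAsUtf8() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        doc.appendChild(doc.createElementNS(null, "root"));

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        LSOutput output = ls.createLSOutput();
        output.setByteStream(bytes);   // without any output, "no-output-specified" is raised
        output.setEncoding("UTF-8");   // LSOutput.encoding is consulted first

        boolean ok = ls.createLSSerializer().write(doc, output);
        if (!ok) {
            throw new IllegalStateException("serialization did not complete normally");
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(writeAsUtf8());
    }
}
```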
-
-
writeToURI
boolean writeToURI(Node nodeArg, String uri) throws LSException
A convenience method that acts as if LSSerializer.write was called with an LSOutput with no encoding specified and LSOutput.systemId set to the uri argument.
- Parameters:
nodeArg - The node to serialize.
uri - The URI to write to.
- Returns:
- Returns true if the node was successfully serialized. Returns false in case the normal processing stopped but the implementation kept serializing the document; the result of the serialization is then implementation dependent.
- Throws:
LSException - SERIALIZE_ERR: Raised if the LSSerializer was unable to serialize the node. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
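
A possible use of writeToURI with a local file URI is sketched below. The class name WriteToUriSketch and the temporary-file approach are assumptions for illustration; how non-HTTP URIs are written is implementation dependent, so this relies on the implementation handling file: URIs. A freshly created document has no input or XML encoding, so the UTF-8 default is assumed when reading the file back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class WriteToUriSketch {

    /** Writes a one-element document to a temporary file and returns the file's text. */
    static String writeToTempFile() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        doc.appendChild(doc.createElementNS(null, "root"));

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        try {
            Path target = Files.createTempFile("ls-serializer", ".xml");
            target.toFile().deleteOnExit();

            // The URI string plays the role of LSOutput.systemId.
            boolean ok = serializer.writeToURI(doc, target.toUri().toString());
            if (!ok) {
                throw new IllegalStateException("serialization did not complete normally");
            }
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeToTempFile());
    }
}
```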
-
writeToString
String writeToString(Node nodeArg) throws DOMException, LSException
Serialize the specified node as described above in the general description of the LSSerializer interface. The output is written to a DOMString that is returned to the caller. The encoding used is the encoding of the DOMString type, i.e. UTF-16. Note that no Byte Order Mark is generated in a DOMString object.
- Parameters:
nodeArg - The node to serialize.
- Returns:
- Returns the serialized data.
- Throws:
DOMException - DOMSTRING_SIZE_ERR: Raised if the resulting string is too long to fit in a DOMString.
LSException - SERIALIZE_ERR: Raised if the LSSerializer was unable to serialize the node. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
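
Since the serializer accepts any node type, writeToString can also be handed a single element rather than the whole document. The sketch below serializes only one child subtree; the class name SubtreeSketch and the element names are hypothetical, and an "LS"-capable DOM implementation is assumed. Whether a declaration precedes the element depends on the "xml-declaration" parameter.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class SubtreeSketch {

    /** Serializes only the <item> element of a <catalog><item>...</item></catalog> document. */
    static String serializeItemOnly() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element catalog = doc.createElementNS(null, "catalog");
        Element item = doc.createElementNS(null, "item");
        item.appendChild(doc.createTextNode("first"));
        catalog.appendChild(item);
        doc.appendChild(catalog);

        // Assumption: the implementation exposes the Load and Save ("LS") feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();

        // Passing the element, not the document, limits the output to that subtree.
        return serializer.writeToString(item);
    }

    public static void main(String[] args) {
        System.out.println(serializeItemOnly());
    }
}
```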
-
-