Xerces 3.1.1 API: Class HTMLSerializer
Xerces 3.1.1


Class HTMLSerializer

java.lang.Object
  |
        |

Implements an HTML/XHTML serializer supporting both DOM and SAX pretty serializing. HTML/XHTML mode is determined in the constructor.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.
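The encoding caveat for writers can be illustrated with plain JDK classes. The sketch below does not call the serializer; the class and method names (WriterEncodingSketch, encode) are hypothetical. It only shows that a Writer fixes the byte encoding when it is constructed, which is why that encoding has to agree with the one declared in the output format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the serializer API.
class WriterEncodingSketch {

    /** Writes text through a Writer bound to the given charset and returns the raw bytes. */
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // The charset is chosen here, when the Writer is created.
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // four characters, the last one outside ASCII
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // prints 5
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // prints 4
    }
}
```

If the declared encoding says UTF-8 while the writer emits ISO-8859-1 bytes (or the reverse), a consumer that trusts the declaration will misread every non-ASCII character.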

The serializer supports both DOM and SAX. DOM serializing is done by calling serialize with the DOM document, and SAX serializing is done by firing SAX events and using the serializer as a document handler.

If an I/O exception occurs while serializing, the serializer will not throw an exception directly, but only throw it at the end of serializing (either DOM or SAX's endDocument).
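The deferred error reporting can be pictured with a small SAX handler that remembers the first I/O failure and reports it only when the document ends. This is a sketch of the general pattern, not the serializer's own code: DeferredIOHandler is a hypothetical class, and surfacing the error from endDocument as a SAXException is an assumption made for the illustration.

```java
import java.io.IOException;
import java.io.Writer;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler illustrating deferred I/O error reporting.
class DeferredIOHandler extends DefaultHandler {
    private final Writer out;
    private IOException pending; // first I/O failure seen, if any

    DeferredIOHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (pending != null) {
            return; // output already failed; skip further writes
        }
        try {
            out.write(ch, start, length);
        } catch (IOException e) {
            pending = e; // remember the failure instead of throwing now
        }
    }

    @Override
    public void endDocument() throws SAXException {
        if (pending != null) {
            throw new SAXException(pending); // report once, at the end
        }
    }
}
```

A caller therefore has to treat the end of serialization as the point where output failures become visible; a quiet characters call does not prove that the text was written.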

For elements that are not specified as whitespace preserving, the serializer will potentially break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators will be regarded as spaces, and spaces at beginning of line will be stripped.

XHTML is slightly different than HTML:

  • Element/attribute names are lower case and case matters
  • Attributes must specify value, even if empty string
  • Empty elements must have '/' in empty tag
  • Contents of SCRIPT and STYLE elements serialized as CDATA
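The first three differences can be made concrete with a small formatting helper. This is an illustration of the rules listed above, not the serializer's implementation and not a guarantee of its exact output; TagStyleSketch and emptyElement are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper that renders one empty element in either style.
class TagStyleSketch {

    /** Renders an empty element carrying one valueless attribute. */
    static String emptyElement(String name, String attr, boolean xhtml) {
        if (xhtml) {
            // XHTML: lower-case names, an explicit (possibly empty) value,
            // and '/' in the empty tag.
            return "<" + name.toLowerCase(Locale.ROOT) + " "
                    + attr.toLowerCase(Locale.ROOT) + "=\"\" />";
        }
        // HTML: names are not case sensitive, the attribute may appear
        // without a value, and the empty tag has no '/'.
        return "<" + name + " " + attr + ">";
    }

    public static void main(String[] args) {
        System.out.println(emptyElement("INPUT", "DISABLED", false)); // prints <INPUT DISABLED>
        System.out.println(emptyElement("INPUT", "DISABLED", true));  // prints <input disabled="" />
    }
}
```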

Version:
$Revision: 1.13 $ $Date: 2000/08/30 18:59:21 $


           
Field Summary
static java.lang.String  XHTMLNamespace

Constructor Summary
           HTMLSerializer()
          Constructs a new serializer.
protected  HTMLSerializer(boolean xhtml,
          Constructs a new HTML/XHTML serializer depending on the value of xhtml.
           HTMLSerializer
          Constructs a new serializer.
           HTMLSerializer(java.io.OutputStream output,
          Constructs a new serializer that writes to the specified output stream using the specified output format.
           HTMLSerializer(java.io.Writer writer,
          Constructs a new serializer that writes to the specified writer using the specified output format.

Method Summary
 void  characters(char[] chars, int start, int length)
          Receive notification of character data.
protected  void  characters(java.lang.String text)
          Called to print the text contents in the prevailing element format.
 void  endElement(java.lang.String tagName)
          Receive notification of the end of an element.
 void  endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName)
          Receive notification of the end of an element.
protected  java.lang.String  escapeURI(java.lang.String uri)
protected  java.lang.String  getEntityRef(char ch)
          Returns the suitable entity reference for this character value, or null if no such entity exists.
protected  void  serializeElement(Element elem)
          Called to serialize a DOM element.
 void  setOutputFormat
          Specifies an output format for this serializer.
protected  void  startDocument(java.lang.String rootTagName)
          Called to serialize the document's DOCTYPE by the root element.
 void  startElement(java.lang.String tagName,
          Receive notification of the beginning of an element.
 void  startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName,
          Receive notification of the beginning of an element.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

XHTMLNamespace

public static java.lang.String XHTMLNamespace
Constructor Detail
HTMLSerializer
protected HTMLSerializer(boolean xhtml,
Constructs a new HTML/XHTML serializer depending on the value of xhtml. The serializer cannot be used without calling init first.
Parameters:
xhtml - True if XHTML serializing

HTMLSerializer

public HTMLSerializer()
Constructs a new serializer. The serializer cannot be used without first specifying an output stream or writer.

HTMLSerializer
Constructs a new serializer. The serializer cannot be used without first specifying an output stream or writer.

HTMLSerializer
public HTMLSerializer(java.io.Writer writer,
Constructs a new serializer that writes to the specified writer using the specified output format. If format is null, will use a default output format.
Parameters:
writer - The writer to use
format - The output format to use, null for the default

HTMLSerializer
public HTMLSerializer(java.io.OutputStream output,
Constructs a new serializer that writes to the specified output stream using the specified output format. If format is null, will use a default output format.
Parameters:
output - The output stream to use
format - The output format to use, null for the default
Method Detail
setOutputFormat
Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.
Parameters:
format - The output format to use

startElement
public void startElement(java.lang.String namespaceURI,
                         java.lang.String localName,
                         java.lang.String rawName,
Receive notification of the beginning of an element.

The Parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement event for every startElement event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement event.

This event allows up to three name components for each element:

  1. the Namespace URI;
  2. the local name; and
  3. the qualified (prefixed) name.

Any or all of these may be provided, depending on the properties:

  • the Namespace URI and local name are required when the namespaces property is true (the default), and are optional when the namespaces property is false (if one is specified, both must be);
  • the qualified name is required when the namespace-prefixes property is true, and is optional when the namespace-prefixes property is false (the default).

Note that the attribute list provided will contain only attributes with explicit values (specified or defaulted): #IMPLIED attributes will be omitted. The attribute list will contain attributes used for Namespace declarations (xmlns* attributes) only if the namespace-prefixes property is true (it is false by default, and support for a true value is optional).

Parameters:
uri - The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName - The local name (without prefix), or the empty string if Namespace processing is not being performed.
qName - The qualified name (with prefix), or the empty string if qualified names are not available.
atts - The attributes attached to the element. If there are no attributes, it shall be an empty attribute list.
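To see which name components actually arrive, the following sketch feeds a small namespaced document through the JDK's own SAX parser and records the start-element events. It uses only JDK classes and does not involve the serializer; ElementNameSketch and startEvents are hypothetical names. The parser is made namespace-aware and the namespace-prefixes property is left at its default, so the qualified name is treated as optional and is not recorded.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper that records what startElement receives.
class ElementNameSketch {

    /** Parses xml and returns one "{uri}localName attrs=N" entry per start tag. */
    static List<String> startEvents(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // turns Namespace processing on
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // uri and localName are required with Namespace processing on;
                // qName may be empty, so it is left out of the record.
                events.add("{" + uri + "}" + localName + " attrs=" + atts.getLength());
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<h:html xmlns:h=\"http://www.w3.org/1999/xhtml\">"
                + "<h:br class=\"x\"/></h:html>";
        for (String event : startEvents(xml)) {
            System.out.println(event);
        }
    }
}
```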

endElement

public void endElement(java.lang.String namespaceURI,
                       java.lang.String localName,
                       java.lang.String rawName)
Receive notification of the end of an element.

The SAX parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement event for every endElement event (even when the element is empty).

Parameters:
uri - The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
localName - The local name (without prefix), or the empty string if Namespace processing is not being performed.
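qName - The qualified XML 1.0 name (with prefix), or the empty string if qualified names are not available.
Throws:
SAXException - Any SAX exception, possibly wrapping another exception.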

characters

public void characters(char[] chars,
                       int start,
                       int length)
Receive notification of character data.

The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.

The application must not attempt to read from the array outside of the specified range.

Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so).
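Because character data may arrive in several chunks, and because only the given range of the array is valid, a receiver typically appends each range to a buffer. The sketch below shows that pattern with a plain SAX handler; TextCollector is a hypothetical class, not part of the serializer.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that reassembles chunked character data.
class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Read only ch[start] .. ch[start + length - 1]; the rest of the
        // array belongs to the parser and must not be touched.
        text.append(ch, start, length);
    }

    String text() {
        return text.toString();
    }

    public static void main(String[] args) {
        TextCollector collector = new TextCollector();
        char[] buffer = "xxHello, worldyy".toCharArray();
        collector.characters(buffer, 2, 7); // "Hello, "
        collector.characters(buffer, 9, 5); // "world"
        System.out.println(collector.text()); // prints Hello, world
    }
}
```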


startElement
public void startElement(java.lang.String tagName,
Receive notification of the beginning of an element.

The Parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement() event for every startElement() event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement() event.

If the element name has a namespace prefix, the prefix will still be attached. Note that the attribute list provided will contain only attributes with explicit values (specified or defaulted): #IMPLIED attributes will be omitted.


endElement

public void endElement(java.lang.String tagName)
Receive notification of the end of an element.

The SAX parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement() event for every endElement() event (even when the element is empty).

If the element name has a namespace prefix, the prefix will still be attached to the name.

Throws:
SAXException - Any SAX exception, possibly wrapping another exception.

startDocument

protected void startDocument(java.lang.String rootTagName)
Called to serialize the document's DOCTYPE by the root element. The document type declaration must name the root element, but the root element is only known when that element is serialized, and not at the start of the document.

This method will serialize the document type declaration, and will serialize all pre-root comments and PIs that were accumulated in the document. Pre-root content will be serialized even if this is not the first root element of the document.


serializeElement

protected void serializeElement(Element elem)
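Called to serialize a DOM element. Equivalent to calling startElement, endElement and serializing everything in between, but better optimized.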
Parameters:
elem - The element to serialize

characters

protected void characters(java.lang.String text)
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.
Parameters:
text - The text to print
unescaped - True if the text should be printed unescaped

getEntityRef

protected java.lang.String getEntityRef(char ch)
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".
Parameters:
ch - Character value
Returns:
Character entity name, or null
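A character-to-entity lookup of this kind can be sketched with a small table. The code below is an illustration only: EntityRefSketch, its five-entry table, and the escape helper are hypothetical, and the real method draws on the full HTML entity set. The text above does not settle whether the return value is the bare entity name or the complete reference; the sketch assumes the complete reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table; a real serializer knows many more entities.
class EntityRefSketch {
    private static final Map<Character, String> NAMES = new HashMap<>();
    static {
        NAMES.put('&', "amp");
        NAMES.put('<', "lt");
        NAMES.put('>', "gt");
        NAMES.put('"', "quot");
        NAMES.put('\u00a0', "nbsp");
    }

    /** Returns the full reference such as "&amp;", or null if none is known. */
    static String entityRef(char ch) {
        String name = NAMES.get(ch);
        return name == null ? null : "&" + name + ";";
    }

    /** Replaces every character that has a known reference. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            String ref = entityRef(ch);
            if (ref == null) {
                out.append(ch); // no entity: copy the character unchanged
            } else {
                out.append(ref);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & c")); // prints a &lt; b &amp; c
    }
}
```

Returning null for unknown characters lets the caller decide between writing the character as is and falling back to a numeric reference.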

escapeURI

protected java.lang.String escapeURI(java.lang.String uri)
