net.jxta.search.util
Class XmlParser

java.lang.Object
  |
  +--net.jxta.search.util.XmlParser

public class XmlParser
extends java.lang.Object

A very lightweight, non-validating XML parser. The callback mechanism is flexible enough to be used for constructing a stack of tags, or for more specific tasks such as link extraction or indexing of both text and hrefs.

Usage:

      XmlParser parser = new XmlParser ();

      XmlParser.ParserCallback callback = new XmlParser.ParserCallback () {

          // Called for each opening tag found in the input.
          public void startTag (byte[] chars, int start, int len) {
              System.out.println ("Start tag: " +
                                  new String (chars, start, len));
          }

          // Called for each run of text between tags.
          public void chars (byte[] chars, int start, int len) {
              System.out.println ("Chars: ." +
                                  new String (chars, start, len) +
                                  ".");
          }

          // Called for each closing tag found in the input.
          public void endTag (byte[] chars, int start, int len) {
              System.out.println ("End tag: " +
                                  new String (chars, start, len));
          }
      };

      // The buffer must be a byte array to match parse()'s signature.
      byte[] buf = new byte[256];

      InputStream reader = System.in;

      parser.parse (reader, buf, callback);
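
The description above mentions constructing a stack of tags. The following is a minimal sketch of such a callback, not code from the library. TagCallback is a hypothetical stand-in that mirrors the three callback methods used in the usage example, so that the sketch compiles on its own; in real code the class would implement XmlParser.ParserCallback instead. It also assumes that the byte range passed to startTag identifies the tag, and simply stores whatever it receives.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for XmlParser.ParserCallback (same three methods
// as in the usage example); it exists only to keep this sketch self-contained.
interface TagCallback {
    void startTag(byte[] chars, int start, int len);
    void chars(byte[] chars, int start, int len);
    void endTag(byte[] chars, int start, int len);
}

// Tracks the currently open tags as a stack.
final class TagStackCallback implements TagCallback {
    private final Deque<String> open = new ArrayDeque<>();

    public void startTag(byte[] chars, int start, int len) {
        // Copy the slice: the parser is assumed to reuse its buffer.
        open.push(new String(chars, start, len, StandardCharsets.ISO_8859_1));
    }

    public void chars(byte[] chars, int start, int len) {
        // Text content does not affect the tag stack.
    }

    public void endTag(byte[] chars, int start, int len) {
        // Tolerate unbalanced input, since the parser is non-validating.
        if (!open.isEmpty()) {
            open.pop();
        }
    }

    int depth() { return open.size(); }

    // Innermost open tag, or null when no tag is open.
    String current() { return open.peek(); }
}

final class TagStackDemo {
    public static void main(String[] args) {
        TagStackCallback cb = new TagStackCallback();
        byte[] buf = "docitem".getBytes(StandardCharsets.ISO_8859_1);

        // Events fired by hand here; a parser would normally fire them.
        cb.startTag(buf, 0, 3);  // "doc"
        cb.startTag(buf, 3, 4);  // "item"
        cb.endTag(buf, 3, 4);    // closes "item"

        // prints "depth=1 current=doc"
        System.out.println("depth=" + cb.depth() + " current=" + cb.current());
    }
}
```

Copying each slice into a String is a deliberate choice in this sketch: the callback receives a view into a shared buffer, so holding on to the offsets alone would not be safe if the buffer is refilled.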
 


Inner Class Summary
static class XmlParser.Exception
           
static interface XmlParser.ParserCallback
          Methods of an object implementing this interface are called at the appropriate times during the parsing of the HTML.
 
Constructor Summary
XmlParser()
           
 
Method Summary
static java.lang.String getAttributeValue(byte[] attributeName, byte[] tag, int start, int len)
          Parse the attribute value of a given attribute name in a tag.
static int getIntAttribute(byte[] attributeName, byte[] tag, int start, int len, int defaultValue)
           
static void main(java.lang.String[] argv)
           
static void parse(java.io.InputStream reader, byte[] chars, XmlParser.ParserCallback callback)
          Parse the input stream as HTML.
static boolean startsWith(byte[] what, byte[] chars, int start, int len)
          Check whether a given character buffer starts with a certain set of characters.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XmlParser

public XmlParser()
Method Detail

parse

public static void parse(java.io.InputStream reader,
                         byte[] chars,
                         XmlParser.ParserCallback callback)
                  throws java.io.IOException,
                         XmlParser.Exception
Parse the input stream as HTML. Note that tags longer than the character buffer are thrown away. If your buffer is sufficiently large, say 8k, this should rarely matter.
Parameters:
reader - the input stream (preferably unbuffered) from which to read HTML
chars - the character buffer to use when reading
callback - the callback interface implementation

getAttributeValue

public static java.lang.String getAttributeValue(byte[] attributeName,
                                                 byte[] tag,
                                                 int start,
                                                 int len)
Parse the attribute value of a given attribute name in a tag.
Parameters:
attributeName - the name of the attribute
tag - the tag in which to look for the attribute
Returns:
the attribute's value or null if the tag doesn't contain that attribute.
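
To illustrate the contract described above (look up a named attribute inside a byte range of a tag, return null when it is absent), here is a minimal sketch. It is a hypothetical re-implementation, not the library's code: the class name AttributeLookup, the method name attributeValue, the handling of single-quoted, double-quoted and unquoted values, and the ISO-8859-1 decoding are all assumptions. It also assumes that start and len delimit the tag within the array, as they do in the callback methods.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of an attribute lookup over a byte range.
final class AttributeLookup {

    private static boolean isSpace(byte b) {
        return b == ' ' || b == '\t' || b == '\n' || b == '\r';
    }

    // True if name occurs in tag starting at offset at (caller checks bounds).
    private static boolean regionEquals(byte[] name, byte[] tag, int at) {
        for (int i = 0; i < name.length; i++) {
            if (tag[at + i] != name[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the value of the attribute, or null if the tag lacks it.
    static String attributeValue(byte[] name, byte[] tag, int start, int len) {
        int end = start + len;
        for (int i = start + 1; i + name.length < end; i++) {
            // The name must follow whitespace and be followed directly by '='.
            if (!isSpace(tag[i - 1])
                    || tag[i + name.length] != '='
                    || !regionEquals(name, tag, i)) {
                continue;
            }
            int from = i + name.length + 1; // first byte after '='
            if (from >= end) {
                return "";
            }
            int to;
            byte quote = tag[from];
            if (quote == '"' || quote == '\'') {
                from++;
                to = from;
                while (to < end && tag[to] != quote) {
                    to++;
                }
            } else {
                // Unquoted value: runs until whitespace or the closing '>'.
                to = from;
                while (to < end && !isSpace(tag[to]) && tag[to] != '>') {
                    to++;
                }
            }
            return new String(tag, from, to - from, StandardCharsets.ISO_8859_1);
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] tag = "<a href=\"index.html\" rel=nofollow>"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] href = "href".getBytes(StandardCharsets.ISO_8859_1);
        byte[] id = "id".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(attributeValue(href, tag, 0, tag.length)); // prints "index.html"
        System.out.println(attributeValue(id, tag, 0, tag.length));   // prints "null"
    }
}
```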

getIntAttribute

public static int getIntAttribute(byte[] attributeName,
                                  byte[] tag,
                                  int start,
                                  int len,
                                  int defaultValue)

startsWith

public static boolean startsWith(byte[] what,
                                 byte[] chars,
                                 int start,
                                 int len)
Check whether a given character buffer starts with a certain set of characters.
Parameters:
chars - the character buffer
what - the set of characters the buffer might start with.
Returns:
true if the character buffer starts with what
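
The check described above can be sketched as follows. This is a hypothetical re-implementation with the same parameter order, not the library's code. It assumes that the region examined is chars[start] through chars[start + len - 1] and that the comparison is case-sensitive; the class name BytePrefix is made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a prefix check over a byte range.
final class BytePrefix {

    // True if the region chars[start .. start+len) begins with what.
    static boolean startsWith(byte[] what, byte[] chars, int start, int len) {
        if (what.length > len) {
            return false; // the prefix cannot fit in the region
        }
        for (int i = 0; i < what.length; i++) {
            if (chars[start + i] != what[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] tag = "<a href=\"x\">".getBytes(StandardCharsets.ISO_8859_1);
        byte[] anchor = "<a".getBytes(StandardCharsets.ISO_8859_1);
        byte[] bold = "<b".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(startsWith(anchor, tag, 0, tag.length)); // prints "true"
        System.out.println(startsWith(bold, tag, 0, tag.length));   // prints "false"
    }
}
```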

main

public static void main(java.lang.String[] argv)
                 throws java.io.IOException,
                        XmlParser.Exception