Class Summary |
AbstractTokenStream<T extends TokenType> |
A TokenStream that lazily fetches one token at a time. |
BufferBackedSequence |
|
Chardet |
Utilities for converting byte streams with unknown character
encodings to character streams. |
ChardetTest |
|
CharProducer |
A character reader that tracks character file position information. |
CharProducer.ChainCharProducer |
|
CharProducer.Factory |
Convenience methods for creating producers. |
CharProducer.Factory.CharProducerImpl |
|
CharProducerTest |
|
CharProducerTest.StreamState |
|
CssLexer |
A lexer that recognizes the
CSS 2.1 Grammar
plus line comments as interpreted by most browsers. |
CssLexerTest |
|
CssSplitter |
|
DecodingCharProducer |
|
DecodingCharProducer.Decoder |
|
DecodingCharProducerTest |
|
ExternalReference |
A reference to an external resource from an input. |
FetchedData |
Encapsulates a unit of content fetched from some remote location, including
some basic metadata about the content. |
FetchedData.BinaryFetchedData |
|
FetchedData.TextualFetchedData |
|
FetchedDataTest |
|
FetchedDataTest.TestURLConnection |
|
FilePosition |
The range of characters in a source file occupied by a token or a group of
tokens. |
FilePositionTest |
|
GuessContentType |
Guesses the content type based on the reported MIME type, the file name, and the file's content. |
GuessContentTypeTest |
|
HtmlEntities |
|
HtmlInputSplitter |
A token stream that breaks a character stream into
HtmlTokenType.{TEXT,TAGBEGIN,TAGEND,DIRECTIVE,COMMENT,CDATA}
tokens. |
HtmlLexer |
A flexible lexer for HTML, GXP, and related document types. |
HtmlLexerTest |
|
InputElementSplitter |
Splits lines into strings, comments, regular expression literals, and
blocks of non-whitespace. |
InputElementSplitter.ParsedNumber |
|
InputSource |
A file of source code. |
JsLexer |
Tokenizes JavaScript source. |
JsLexer.WordClassifier |
|
JsLexerTest |
Test cases for JsLexer. |
JsTokenQueue |
A token queue for JavaScript. |
NumberRecognizer |
A state machine that keeps track of whether a run of word characters and
dots might be a part of a number. |
PositionInferer |
Does some simple constraint solving to assign reasonable position values
to generated parse tree nodes. |
PositionInferer.Boundary |
An edge of a node descriptor's position. |
PositionInferer.EqualRelation |
A relation between two boundaries that should be at the same actual
position. |
PositionInferer.LessThanRelation |
A relation specifying that one boundary appears at or before another boundary. |
PositionInferer.Region |
A start boundary and an end boundary. |
PositionInferer.Relation |
A relationship between two boundaries that constrains the positions of the
set of boundaries. |
PositionInfererTest |
|
PunctuationTrie<T> |
A trie used to separate punctuation tokens in a run of non-whitespace
characters by preferring the longest punctuation string possible in a
greedy left-to-right scan. |
PunctuationTrieTest |
Test cases for PunctuationTrie. |
SourceBreaks |
Encapsulates an InputSource and the positions of newlines in that
source file. |
SourceBreaksTest |
|
Token<T extends TokenType> |
A lexical token. |
TokenQueue<T extends TokenType> |
A queue of tokens extracted from a Lexer, together with convenience
methods for parsing. |
TokenQueue.Mark |
Allows rewinding to a known position in the token queue. |
TokenQueue.TokenList<TT extends TokenType> |
A singly linked list of tokens. |
UriDecoder |
Decodes URL-encoded content as specified in section 2.4 of
RFC 2396. |
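Some entries above name a concrete algorithm. PunctuationTrie, for instance, is described as preferring the longest punctuation string possible in a greedy left-to-right scan. A minimal sketch of that technique, with hypothetical class and method names rather than the actual PunctuationTrie API, might look like:

```java
import java.util.*;

/** Sketch of greedy longest-match punctuation splitting; the class,
 *  method, and token names here are illustrative assumptions, not the
 *  real PunctuationTrie API. */
public class PunctuationSplit {
  /** One trie node: children keyed by character, plus a flag marking
   *  nodes that complete a known punctuation token. */
  static final class Node {
    final Map<Character, Node> kids = new HashMap<>();
    boolean terminal;
  }

  /** Builds a trie over the given punctuation tokens. */
  static Node build(String... tokens) {
    Node root = new Node();
    for (String tok : tokens) {
      Node t = root;
      for (char c : tok.toCharArray()) {
        t = t.kids.computeIfAbsent(c, k -> new Node());
      }
      t.terminal = true;
    }
    return root;
  }

  /** Greedy left-to-right scan: at each position, walk the trie as far
   *  as possible and emit the longest token seen, then resume after it. */
  static List<String> split(String run, Node root) {
    List<String> out = new ArrayList<>();
    int i = 0;
    while (i < run.length()) {
      Node t = root;
      int end = -1;  // end index of the longest match found so far
      for (int j = i; j < run.length(); ++j) {
        t = t.kids.get(run.charAt(j));
        if (t == null) { break; }
        if (t.terminal) { end = j + 1; }
      }
      if (end < 0) { throw new IllegalArgumentException("no token at " + i); }
      out.add(run.substring(i, end));
      i = end;
    }
    return out;
  }

  public static void main(String[] args) {
    Node ops = build("=", "==", "===", "=>", "+", "++", "!", "!=", "!==");
    System.out.println(split("=>++!==", ops));  // [=>, ++, !==]
  }
}
```

The greedy preference matters: given `"!=="`, the scan emits one `!==` token rather than `!` followed by `==`.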
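Similarly, the UriDecoder entry points at the escaped-encoding rules of RFC 2396 section 2.4. A minimal sketch of that decoding, again with hypothetical names and not the real UriDecoder, could be:

```java
/** Sketch of RFC 2396 section 2.4 percent-decoding. Hypothetical names;
 *  this sketch decodes each %XX escape to a single char and passes
 *  malformed escapes through unchanged, so unlike a full decoder it does
 *  not reassemble multi-byte UTF-8 escape sequences. */
public class PercentDecode {
  static String decode(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ++i) {
      char c = s.charAt(i);
      if (c == '%' && i + 2 < s.length()) {
        // A valid escape is '%' followed by two hex digits.
        int hi = Character.digit(s.charAt(i + 1), 16);
        int lo = Character.digit(s.charAt(i + 2), 16);
        if (hi >= 0 && lo >= 0) {
          sb.append((char) ((hi << 4) | lo));
          i += 2;  // consume the two hex digits
          continue;
        }
      }
      sb.append(c);  // ordinary char, or a malformed escape left as-is
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(decode("a%20b%2Fc"));  // a b/c
  }
}
```

Passing malformed escapes through (rather than throwing) mirrors how lenient URL decoders typically behave, but it is a design choice of this sketch, not something the class summary specifies.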