com.google.caja.lang.css
Class CssPropertyPatterns

java.lang.Object
  extended by com.google.caja.lang.css.CssPropertyPatterns

public class CssPropertyPatterns
extends java.lang.Object

Operates on CSS property signatures to come up with a simple regular expression that validates values where possible.

This class produces a JavaScript file like

   var css = {
       "background-image": /url\("[^\"\\\(\)]+"\)\s+/i,
       clear: /(?:none|left|right|both|inherit)\s+/i,
       color: /(?:blue|red|green|fuchsia)\s+/i
   };
 

The css map does not contain every property in the given CssSchema, since the values of some properties cannot be matched efficiently by a regular expression. See comments on [ a || b ]* constructs inline below.

Differences from Server Side CSS

The css map also does not contain every option matched by the server-side CSS processor. Specifically, it does not currently match

Whitespace Between Tokens

The patterns in the example above all end in \s+. This simplifies many of the patterns, since a signature like foo* would otherwise strictly translate to the regular expression /(foo(\s+foo)*)?/i. Even if we repeated subexpressions, we would run into problems with the signature a [b | c? d]? e, which could be translated to a regular expression, but only with non-local handling of whitespace around sub-expressions that can be empty.

Instead, we require every literal to be followed by one or more spaces. We can then match against CSS padded with spaces, as in

   var isValid = css[cssPropertyName].test(cssText + ' ');
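
The same padding trick can be sketched in Java with java.util.regex. This is an illustration only: the PaddedCssMatch class and its isValid helper are hypothetical names, the clear pattern is copied from the example map above, and the use of matches() (which anchors the pattern to the whole padded value) is an assumption rather than something the generated JavaScript is known to do.

```java
import java.util.regex.Pattern;

class PaddedCssMatch {
    // Pattern for "clear", copied from the example map above.
    static final Pattern CLEAR =
        Pattern.compile("(?:none|left|right|both|inherit)\\s+", Pattern.CASE_INSENSITIVE);

    // When every literal carries its own trailing \s+, a signature like foo*
    // needs no special whitespace handling between repetitions.
    static final Pattern FOO_STAR =
        Pattern.compile("(?:foo\\s+)*", Pattern.CASE_INSENSITIVE);

    // Pads the value with one space so that the last literal is also
    // followed by whitespace, then requires the whole padded value to match.
    static boolean isValid(Pattern pattern, String cssText) {
        return pattern.matcher(cssText + " ").matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(CLEAR, "left"));
        System.out.println(isValid(CLEAR, "leftward"));
        System.out.println(isValid(FOO_STAR, "foo foo"));
    }
}
```

Because each keyword must be followed by whitespace, a value such as leftward is rejected without any explicit word-boundary check.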
 

Program Flow

This class examines a schema and builds a list of all allowed CSS properties. It then tries to convert each property's signature to a regular expression pattern. It may fail for some patterns, especially the aggregate ones like background which combine background-image, background-style, etc.

Next it optimizes the patterns it found. This includes flattening concatenation and union operators, and moving the \s+ out of unions. Optimizing /((blue\s+|red\s+|green\s+)|inherit\s+)/ might yield the simpler expression /(blue|red|green|inherit)\s+/.
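
The \s+ hoisting step described above can be sketched on plain strings. This is a simplified illustration, not the optimizer this class actually uses: UnionOptimizer and hoistTrailingSpaces are hypothetical names, and the sketch assumes the union has already been flattened into a list of alternatives whose text does not end in an escaped backslash followed by s+.

```java
import java.util.ArrayList;
import java.util.List;

class UnionOptimizer {
    // The three-character regex fragment \s+ as plain text.
    static final String SPACES = "\\s+";

    // If every alternative ends in \s+, strips it from each branch and
    // appends it once after the group. Otherwise the union is emitted as is.
    static String hoistTrailingSpaces(List<String> alternatives) {
        List<String> stripped = new ArrayList<>();
        for (String alt : alternatives) {
            if (!alt.endsWith(SPACES)) {
                // At least one branch has no trailing \s+, so nothing can be hoisted.
                return "(?:" + String.join("|", alternatives) + ")";
            }
            stripped.add(alt.substring(0, alt.length() - SPACES.length()));
        }
        return "(?:" + String.join("|", stripped) + ")" + SPACES;
    }
}
```

For the flattened alternatives blue\s+, red\s+, green\s+ and inherit\s+, the helper returns (?:blue|red|green|inherit)\s+.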

Once it has a mapping of property names to regular expressions it builds a constant pool by hashing on regular expression text. This allows properties with identical patterns such as border-top and border-bottom to share an instance.
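
A constant pool keyed on regular expression text might look like the following sketch. RegexConstantPool and intern are hypothetical names, and returning an integer slot index is an assumption made for illustration; the class's real pooling code may be organized differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RegexConstantPool {
    // Maps regular expression text to the index of its slot in the pool.
    // LinkedHashMap keeps slots in first-seen order.
    private final Map<String, Integer> indexByText = new LinkedHashMap<>();

    // Returns the existing slot for this text, or allocates a new one.
    int intern(String regexText) {
        Integer existing = indexByText.get(regexText);
        if (existing != null) {
            return existing;
        }
        int index = indexByText.size();
        indexByText.put(regexText, index);
        return index;
    }

    // Number of distinct patterns seen so far.
    int size() {
        return indexByText.size();
    }
}
```

Two properties whose patterns have identical text receive the same slot, so the generated output needs only one regular expression instance for both.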

Finally it builds a JavaScript parse tree that assigns the css namespace to an object whose keys are CSS property names and whose values are regular expressions.

Caveats

Some of the regular expressions do match URLs. If valid CSS text contains the string 'url' case-insensitively, then a client may need to extract and rewrite URLs. Since all strings are double quoted, this should be doable without lexing CSS.
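
Extracting and rewriting URLs without a CSS lexer could be sketched as follows. CssUrlRewriter, rewriteUrls, and the proxyPrefix parameter are hypothetical; the pattern mirrors the character class from the background-image example and assumes the text has already been validated. A real rewriter would also need to encode the URL before appending it to a prefix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CssUrlRewriter {
    // Matches url("...") where the quoted body contains no double quotes,
    // backslashes, or parentheses.
    private static final Pattern URL =
        Pattern.compile("url\\(\"([^\"\\\\()]+)\"\\)", Pattern.CASE_INSENSITIVE);

    // Replaces each url("x") with url("<proxyPrefix>x") and leaves all other text alone.
    static String rewriteUrls(String cssText, String proxyPrefix) {
        Matcher m = URL.matcher(cssText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rewritten = "url(\"" + proxyPrefix + m.group(1) + "\")";
            // quoteReplacement keeps '$' and '\' in the URL from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(rewritten));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```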

Author:
mikesamuel@gmail.com

Nested Class Summary
static class CssPropertyPatterns.Builder
           
private static class CssPropertyPatterns.JSREBuilder
           
 
Field Summary
private static java.util.Map<java.lang.String,JSRE> BUILTINS
           
private static Name COLOR
           
static java.util.Set<Name> HISTORY_INSENSITIVE_STYLE_WHITELIST
          Set of properties accessible on computed style of an anchor (<A>) element or some element nested within an anchor.
private static JSRE OPT_SPACES
           
private  Bag<java.lang.String> refsUsed
           
private  CssSchema schema
           
private static JSRE SPACES
           
private static Name STANDARD_COLOR
           
 
Constructor Summary
CssPropertyPatterns(CssSchema schema)
           
 
Method Summary
private  JSRE builtinToPattern(Name name)
           
private  CssPropertyPatterns.JSREBuilder callToPattern(boolean identBefore, CssPropertySignature.CallSignature sig)
           
 java.util.regex.Pattern cssPropertyToJavaRegex(CssPropertySignature sig)
           
private  JSRE cssPropertyToJSRE(CssPropertySignature sig)
           
 java.lang.String cssPropertyToPattern(CssPropertySignature sig)
          Generates a regular expression for the given signature if a simple regular expression exists.
private  CssPropertyPatterns.JSREBuilder exclusiveToPattern(boolean identBefore, CssPropertySignature sig)
           
static void generatePatterns(CssSchema schema, java.lang.Appendable out)
           
private static boolean isIdentChar(char ch)
           
private static CssPropertyPatterns.JSREBuilder litToPattern(boolean identBefore, CssPropertySignature.LiteralSignature lit)
           
static void main(java.lang.String[] args)
           
private static Expression makeRegexp(java.util.Map<java.lang.String,java.lang.Integer> commonSubstrings, java.lang.String regex)
           
private static void makeRegexpOnto(java.util.List<Pair<java.lang.String,Expression>> strs, java.lang.String pattern, int index, java.util.List<Expression> parts)
           
(package private) static java.lang.String propertyNameToDom2Property(Name cssPropertyName)
          Converts a CSS property name to a JavaScript identifier, e.g.
private  CssPropertyPatterns.JSREBuilder refToPattern(boolean identBefore, CssPropertySignature.PropertyRefSignature sig)
           
private  CssPropertyPatterns.JSREBuilder repToPattern(boolean identBefore, CssPropertySignature.RepeatedSignature sig)
           
private  CssPropertyPatterns.JSREBuilder seriesToPattern(boolean identBefore, CssPropertySignature.SeriesSignature sig)
           
private  CssPropertyPatterns.JSREBuilder setToPattern(boolean identBefore, CssPropertySignature sig)
           
private  CssPropertyPatterns.JSREBuilder sigToPattern(boolean identBefore, CssPropertySignature sig)
           
private  CssPropertyPatterns.JSREBuilder symbolToPattern(boolean identBefore, CssPropertySignature.SymbolSignature sig)
           
private static JSRE withoutSpacesOrZero(JSRE p)
          Spaces and zero tend to get moved/merged frequently during regex optimization so don't consider them when doing common substring matching.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

schema

private final CssSchema schema

HISTORY_INSENSITIVE_STYLE_WHITELIST

public static java.util.Set<Name> HISTORY_INSENSITIVE_STYLE_WHITELIST
Set of properties accessible on computed style of an anchor (<A>) element or some element nested within an anchor. This list is a conservative one based on the ability to do visibility, containment, and layout calculations. It REQUIRES that user CSS is prevented from specifying ANY of these properties in a history sensitive manner (i.e., in a rule with a ":link" or ":visited" predicate). Otherwise, it would allow an attacker to probe the user's history as described at https://bugzilla.mozilla.org/show_bug.cgi?id=147777 .


SPACES

private static final JSRE SPACES

OPT_SPACES

private static final JSRE OPT_SPACES

refsUsed

private final Bag<java.lang.String> refsUsed

COLOR

private static final Name COLOR

STANDARD_COLOR

private static final Name STANDARD_COLOR

BUILTINS

private static final java.util.Map<java.lang.String,JSRE> BUILTINS
Constructor Detail

CssPropertyPatterns

public CssPropertyPatterns(CssSchema schema)
Method Detail

cssPropertyToPattern

public java.lang.String cssPropertyToPattern(CssPropertySignature sig)
Generates a regular expression for the given signature if a simple regular expression exists.

Returns:
null if no simple regular expression exists; otherwise the text of a JavaScript regular expression, like "/foo\s+/i", that matches values of the given signature followed by one or more trailing whitespace characters. If the color property only matched the literal "blue", the resulting pattern would match "blue ".
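
To experiment with a returned pattern from Java, the JavaScript literal text can be converted to a java.util.regex.Pattern. The sketch below is not how cssPropertyToJavaRegex is known to work; JsRegexLiteral and toJavaPattern are hypothetical names, only the /body/ and /body/i forms are accepted, and it assumes the body uses syntax that JavaScript and Java regular expressions share.

```java
import java.util.regex.Pattern;

class JsRegexLiteral {
    // Converts text such as "/blue\s+/i" into a Java Pattern.
    static Pattern toJavaPattern(String literal) {
        int lastSlash = literal.lastIndexOf('/');
        if (!literal.startsWith("/") || lastSlash <= 0) {
            throw new IllegalArgumentException("Not a regex literal: " + literal);
        }
        String body = literal.substring(1, lastSlash);
        String flags = literal.substring(lastSlash + 1);
        int javaFlags;
        if (flags.isEmpty()) {
            javaFlags = 0;
        } else if (flags.equals("i")) {
            javaFlags = Pattern.CASE_INSENSITIVE;
        } else {
            throw new IllegalArgumentException("Unsupported flags: " + flags);
        }
        return Pattern.compile(body, javaFlags);
    }
}
```

With the literal "/blue\s+/i", the resulting pattern matches "blue " and "BLUE " in full, but not "blue" without the trailing space.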

cssPropertyToJavaRegex

public java.util.regex.Pattern cssPropertyToJavaRegex(CssPropertySignature sig)

cssPropertyToJSRE

private JSRE cssPropertyToJSRE(CssPropertySignature sig)

sigToPattern

private CssPropertyPatterns.JSREBuilder sigToPattern(boolean identBefore,
                                                     CssPropertySignature sig)

litToPattern

private static CssPropertyPatterns.JSREBuilder litToPattern(boolean identBefore,
                                                            CssPropertySignature.LiteralSignature lit)

isIdentChar

private static boolean isIdentChar(char ch)

repToPattern

private CssPropertyPatterns.JSREBuilder repToPattern(boolean identBefore,
                                                     CssPropertySignature.RepeatedSignature sig)

refToPattern

private CssPropertyPatterns.JSREBuilder refToPattern(boolean identBefore,
                                                     CssPropertySignature.PropertyRefSignature sig)

seriesToPattern

private CssPropertyPatterns.JSREBuilder seriesToPattern(boolean identBefore,
                                                        CssPropertySignature.SeriesSignature sig)

symbolToPattern

private CssPropertyPatterns.JSREBuilder symbolToPattern(boolean identBefore,
                                                        CssPropertySignature.SymbolSignature sig)

setToPattern

private CssPropertyPatterns.JSREBuilder setToPattern(boolean identBefore,
                                                     CssPropertySignature sig)

callToPattern

private CssPropertyPatterns.JSREBuilder callToPattern(boolean identBefore,
                                                      CssPropertySignature.CallSignature sig)

exclusiveToPattern

private CssPropertyPatterns.JSREBuilder exclusiveToPattern(boolean identBefore,
                                                           CssPropertySignature sig)

builtinToPattern

private JSRE builtinToPattern(Name name)

generatePatterns

public static void generatePatterns(CssSchema schema,
                                    java.lang.Appendable out)
                             throws java.io.IOException
Throws:
java.io.IOException

makeRegexp

private static Expression makeRegexp(java.util.Map<java.lang.String,java.lang.Integer> commonSubstrings,
                                     java.lang.String regex)

makeRegexpOnto

private static void makeRegexpOnto(java.util.List<Pair<java.lang.String,Expression>> strs,
                                   java.lang.String pattern,
                                   int index,
                                   java.util.List<Expression> parts)

withoutSpacesOrZero

private static JSRE withoutSpacesOrZero(JSRE p)
Spaces and zero tend to get moved/merged frequently during regex optimization so don't consider them when doing common substring matching.


propertyNameToDom2Property

static java.lang.String propertyNameToDom2Property(Name cssPropertyName)
Converts a CSS property name to a JavaScript identifier, e.g. background-color => backgroundColor.
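
The conversion in the example can be sketched as a hyphen-to-camelCase rewrite. Dom2Names and toDom2Property are hypothetical names operating on plain strings rather than Name; whether the real method special-cases any property names is not stated here, so this sketch handles none.

```java
class Dom2Names {
    // Drops each hyphen and upper-cases the character that follows it.
    static String toDom2Property(String cssPropertyName) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (int i = 0; i < cssPropertyName.length(); i++) {
            char ch = cssPropertyName.charAt(i);
            if (ch == '-') {
                upperNext = true;
            } else if (upperNext) {
                sb.append(Character.toUpperCase(ch));
                upperNext = false;
            } else {
                sb.append(ch);
            }
        }
        return sb.toString();
    }
}
```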


main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Throws:
java.io.IOException


Copyright (C) 2008 Google Inc.
Licensed under the Apache License, Version 2.0