Class URI
- java.lang.Object
-
- org.exoplatform.common.http.client.URI
-
public class URI extends Object
This class represents a generic URI, as defined in RFC-2396. This is similar to java.net.URL, with the following enhancements:
- it doesn't require a URLStreamHandler to exist for the scheme; this allows this class to be used to hold any URI, construct absolute URIs from relative ones, etc.
- it handles escapes correctly
- equals() works correctly
- relative URIs are correctly constructed
- it has methods for accessing various fields such as userinfo, fragment, params, etc.
- it handles less common forms of resources such as the "*" used in http URLs.
The elements are always stored in escaped form.
While RFC-2396 distinguishes between just two forms of URIs, those that follow the generic syntax and those that don't, this class knows about a third form, named semi-generic, used by quite a few popular schemes. Semi-generic syntax treats the path part as opaque, i.e. it has the form <scheme>://<authority>/<opaque>. Relative URIs of this type are only resolved as far as absolute paths; relative paths do not exist.
Ideally, java.net.URL should subclass URI.
- Since:
- V0.3-1
- Version:
- 0.3-3 06/05/2001
- Author:
- Ronald Tschalär
- See Also:
- rfc-2396
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected static BitSet alphanumChar
- protected static Hashtable defaultPorts
- static boolean ENABLE_BACKWARDS_COMPATIBILITY: If true, then the parser will resolve certain URIs in a backwards compatible (but technically incorrect) manner.
- static BitSet escpdFragChar: list of characters which must not be escaped when escaping a fragment identifier
- static BitSet escpdPathChar: list of characters which must not be escaped when escaping a path
- static BitSet escpdQueryChar: list of characters which must not be escaped when escaping a query string
- protected String fragment
- protected static int GENERIC
- protected String host
- protected static BitSet hostChar
- protected static BitSet markChar
- protected String opaque
- protected static int OPAQUE
- protected static BitSet opaqueChar
- protected String path
- protected static BitSet pcharChar
- protected int port
- protected String query
- protected static BitSet reg_nameChar
- protected static BitSet reservedChar
- static BitSet resvdHostChar: list of characters which must not be unescaped when unescaping a host
- static BitSet resvdPathChar: list of characters which must not be unescaped when unescaping a path
- static BitSet resvdQueryChar: list of characters which must not be unescaped when unescaping a query string
- static BitSet resvdSchemeChar: list of characters which must not be unescaped when unescaping a scheme
- static BitSet resvdUIChar: list of characters which must not be unescaped when unescaping a userinfo
- protected String scheme
- protected static BitSet schemeChar
- protected static int SEMI_GENERIC
- protected int type
- protected static BitSet unreservedChar
- protected static BitSet uricChar
- protected URL url
- protected String userinfo
- protected static BitSet userinfoChar
- protected static Hashtable usesGenericSyntax
- protected static Hashtable usesSemiGenericSyntax
-
Constructor Summary
Constructors (Constructor, Description):
- URI(String uri): Constructs a URI from the given string representation.
- URI(String scheme, String opaque): Constructs an opaque URI from the given parts.
- URI(String scheme, String host, int port, String path): Constructs a URI from the given parts.
- URI(String scheme, String host, String path): Constructs a URI from the given parts, using the default port for this scheme (if known).
- URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment): Constructs a URI from the given parts.
- URI(URL url): Construct a URI from the given URL.
- URI(URI base, String rel_uri): Constructs a URI from the given string representation, relative to the given base URI.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- static String canonicalizePath(String path): Remove all "/../" and "/./" from path, where possible.
- static int defaultPort(String protocol): Return the default port used by a given protocol.
- boolean equals(Object other)
- static char[] escape(char[] elem, BitSet allowed_char, boolean utf8): Escape any character not in the given character class.
- static String escape(String elem, BitSet allowed_char, boolean utf8): Escape any character not in the given character class.
- String getFragment()
- String getHost()
- String getOpaque()
- String getPath()
- String getPathAndQuery()
- int getPort()
- String getQueryString()
- String getScheme()
- String getUserinfo()
- int hashCode(): The hash code is calculated over scheme, host, path, and query.
- boolean isGenericURI(): Does the scheme specific part of this URI use the generic-URI syntax?
- boolean isSemiGenericURI(): Does the scheme specific part of this URI use the semi-generic-URI syntax?
- static void main(String[] args): Run test set.
- String toExternalForm()
- String toString(): Return the URI as string.
- URL toURL(): Will try to create a java.net.URL object from this URI.
- static String unescape(String str, BitSet reserved): Unescape escaped characters (i.e. %xx) except reserved ones.
- static boolean usesGenericSyntax(String scheme)
- static boolean usesSemiGenericSyntax(String scheme)
-
-
-
Field Detail
-
ENABLE_BACKWARDS_COMPATIBILITY
public static final boolean ENABLE_BACKWARDS_COMPATIBILITY
If true, then the parser will resolve certain URIs in a backwards compatible (but technically incorrect) manner. Example:
base = http://a/b/c/d;p?q
rel = http:g
result = http:g (correct)
result = http://a/b/c/g (backwards compatible)
See rfc-2396, section 5.2, step 3, second paragraph.
- See Also:
- Constant Field Values
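The strict ("correct") result in the example above can be reproduced with the JDK's own java.net.URI, which is used here only as a stand-in: it also treats http:g as an absolute URI and returns it unchanged. This is a minimal sketch; the class name StrictResolutionSketch and its helper are made up for illustration, and the class documented on this page is not used.

```java
import java.net.URI;

// Hypothetical demo class; java.net.URI stands in for the class documented here.
final class StrictResolutionSketch {
    // Resolve rel against base using the JDK resolver and return the result as a string.
    static String resolve(String base, String rel) {
        return URI.create(base).resolve(rel).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        // "http:g" carries its own scheme, so strict resolution returns it as-is.
        System.out.println(resolve(base, "http:g"));
        // A plain relative reference is merged with the base path.
        System.out.println(resolve(base, "g"));
    }
}
```

A backwards compatible parser would instead drop the repeated scheme and resolve http:g as if it were the plain reference g.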
-
defaultPorts
protected static final Hashtable defaultPorts
-
usesGenericSyntax
protected static final Hashtable usesGenericSyntax
-
usesSemiGenericSyntax
protected static final Hashtable usesSemiGenericSyntax
-
alphanumChar
protected static final BitSet alphanumChar
-
markChar
protected static final BitSet markChar
-
reservedChar
protected static final BitSet reservedChar
-
unreservedChar
protected static final BitSet unreservedChar
-
uricChar
protected static final BitSet uricChar
-
pcharChar
protected static final BitSet pcharChar
-
userinfoChar
protected static final BitSet userinfoChar
-
schemeChar
protected static final BitSet schemeChar
-
hostChar
protected static final BitSet hostChar
-
opaqueChar
protected static final BitSet opaqueChar
-
reg_nameChar
protected static final BitSet reg_nameChar
-
resvdSchemeChar
public static final BitSet resvdSchemeChar
list of characters which must not be unescaped when unescaping a scheme
-
resvdUIChar
public static final BitSet resvdUIChar
list of characters which must not be unescaped when unescaping a userinfo
-
resvdHostChar
public static final BitSet resvdHostChar
list of characters which must not be unescaped when unescaping a host
-
resvdPathChar
public static final BitSet resvdPathChar
list of characters which must not be unescaped when unescaping a path
-
resvdQueryChar
public static final BitSet resvdQueryChar
list of characters which must not be unescaped when unescaping a query string
-
escpdPathChar
public static final BitSet escpdPathChar
list of characters which must not be escaped when escaping a path
-
escpdQueryChar
public static final BitSet escpdQueryChar
list of characters which must not be escaped when escaping a query string
-
escpdFragChar
public static final BitSet escpdFragChar
list of characters which must not be escaped when escaping a fragment identifier
-
OPAQUE
protected static final int OPAQUE
- See Also:
- Constant Field Values
-
SEMI_GENERIC
protected static final int SEMI_GENERIC
- See Also:
- Constant Field Values
-
GENERIC
protected static final int GENERIC
- See Also:
- Constant Field Values
-
type
protected int type
-
scheme
protected String scheme
-
opaque
protected String opaque
-
userinfo
protected String userinfo
-
host
protected String host
-
port
protected int port
-
path
protected String path
-
query
protected String query
-
fragment
protected String fragment
-
url
protected URL url
-
-
Constructor Detail
-
URI
public URI(String uri) throws ParseException
Constructs a URI from the given string representation. The string must be an absolute URI.
- Parameters:
uri - a String containing an absolute URI
- Throws:
ParseException - if no scheme can be found or a specified port cannot be parsed as a number
-
URI
public URI(URI base, String rel_uri) throws ParseException
Constructs a URI from the given string representation, relative to the given base URI.
- Parameters:
base - the base URI, relative to which rel_uri is to be parsed
rel_uri - a String containing a relative or absolute URI
- Throws:
ParseException - if base is null and rel_uri is not an absolute URI, or if base is not null and the scheme is not known to use the generic syntax, or if a given port cannot be parsed as a number
-
URI
public URI(URL url) throws ParseException
Construct a URI from the given URL.
- Parameters:
url - the URL
- Throws:
ParseException - if url.toExternalForm() generates an invalid string representation
-
URI
public URI(String scheme, String host, String path) throws ParseException
Constructs a URI from the given parts, using the default port for this scheme (if known). The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
path - the path part
- Throws:
ParseException - if scheme is null
-
URI
public URI(String scheme, String host, int port, String path) throws ParseException
Constructs a URI from the given parts. The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
port - the port
path - the path part
- Throws:
ParseException - if scheme is null
-
URI
public URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment) throws ParseException
Constructs a URI from the given parts. Any part except for the scheme may be null. The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
userinfo - the userinfo
host - the host
port - the port
path - the path part
query - the query string
fragment - the fragment identifier
- Throws:
ParseException - if scheme is null
-
URI
public URI(String scheme, String opaque) throws ParseException
Constructs an opaque URI from the given parts.
- Parameters:
scheme - the scheme (sometimes known as protocol)
opaque - the opaque part
- Throws:
ParseException - if scheme is null
-
-
Method Detail
-
canonicalizePath
public static String canonicalizePath(String path)
Remove all "/../" and "/./" from path, where possible. Leading "/../"'s are not removed.
- Parameters:
path - the path to canonicalize
- Returns:
- the canonicalized path
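The following is a minimal sketch of this kind of path canonicalization, not the library's actual implementation. It works segment by segment, drops "." segments, lets ".." cancel the preceding segment, and keeps leading ".." segments that have nothing to cancel. The class name CanonicalizePathSketch is hypothetical, and collapsing empty segments ("//") is a simplification assumed here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper illustrating "/./" and "/../" removal on a path string.
final class CanonicalizePathSketch {
    static String canonicalize(String path) {
        boolean absolute = path.startsWith("/");
        boolean trailingSlash = path.endsWith("/");
        Deque<String> kept = new ArrayDeque<>();
        for (String segment : path.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue; // "." and empty segments contribute nothing
            }
            boolean canPop = !kept.isEmpty() && !kept.peekLast().equals("..");
            if (segment.equals("..") && canPop) {
                kept.removeLast(); // ".." cancels the previous real segment
            } else {
                kept.addLast(segment); // leading ".." segments are kept
            }
        }
        String result = (absolute ? "/" : "") + String.join("/", kept);
        if (trailingSlash && !kept.isEmpty()) {
            result += "/"; // preserve a trailing slash of the input
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("/a/./b/../c"));
        System.out.println(canonicalize("/../a/b"));
    }
}
```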
-
usesGenericSyntax
public static boolean usesGenericSyntax(String scheme)
- Returns:
- true if the scheme should be parsed according to the generic-URI syntax
-
usesSemiGenericSyntax
public static boolean usesSemiGenericSyntax(String scheme)
- Returns:
- true if the scheme should be parsed according to a semi-generic-URI syntax <scheme>://<hostport>/<opaque>
-
defaultPort
public static final int defaultPort(String protocol)
Return the default port used by a given protocol.
- Parameters:
protocol - the protocol
- Returns:
- the port number, or 0 if unknown
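A table lookup with 0 as the "unknown" result could look roughly like the sketch below. The table contents (the well-known ports for http, https and ftp), the lower-casing of the protocol name, and the class name DefaultPortSketch are assumptions made for illustration; the actual contents of defaultPorts are not listed on this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a default-port table; entries are assumed, not taken from the library.
final class DefaultPortSketch {
    private static final Map<String, Integer> DEFAULT_PORTS = new HashMap<>();

    static {
        DEFAULT_PORTS.put("http", 80);
        DEFAULT_PORTS.put("https", 443);
        DEFAULT_PORTS.put("ftp", 21);
    }

    // Returns the default port for the protocol, or 0 if the protocol is not in the table.
    static int defaultPort(String protocol) {
        Integer port = DEFAULT_PORTS.get(protocol.trim().toLowerCase(Locale.ROOT));
        return port == null ? 0 : port;
    }

    public static void main(String[] args) {
        System.out.println(defaultPort("http"));
        System.out.println(defaultPort("no-such-scheme"));
    }
}
```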
-
getScheme
public String getScheme()
- Returns:
- the scheme (often also referred to as protocol)
-
getOpaque
public String getOpaque()
- Returns:
- the opaque part, or null if this URI is generic
-
getHost
public String getHost()
- Returns:
- the host
-
getPort
public int getPort()
- Returns:
- the port, or -1 if it's the default port, or 0 if unknown
-
getUserinfo
public String getUserinfo()
- Returns:
- the user info
-
getPath
public String getPath()
- Returns:
- the path
-
getQueryString
public String getQueryString()
- Returns:
- the query string
-
getPathAndQuery
public String getPathAndQuery()
- Returns:
- the path and query
-
getFragment
public String getFragment()
- Returns:
- the fragment
-
isGenericURI
public boolean isGenericURI()
Does the scheme specific part of this URI use the generic-URI syntax?
In general, URIs are split into two categories: opaque-URI and generic-URI. The generic-URI syntax is the syntax most are familiar with from URLs such as ftp- and http-URLs, which is roughly:
generic-URI = scheme ":" [ "//" server ] [ "/" ] [ path_segments ] [ "?" query ]
(see RFC-2396 for exact syntax). Only URLs using the generic-URI syntax can be used to create and resolve relative URIs.
Whether a given scheme is parsed according to the generic-URI syntax or whether it is treated as opaque is determined by an internal table of URI schemes.
- See Also:
- rfc-2396
-
isSemiGenericURI
public boolean isSemiGenericURI()
Does the scheme specific part of this URI use the semi-generic-URI syntax?
Many schemes which don't follow the full generic syntax actually follow a reduced form where the path part is treated as opaque. This is used for example by ldap, smtp, pop, etc., and is roughly:
semi-generic-URI = scheme ":" [ "//" server ] [ "/" [ opaque_path ] ]
I.e. parsing is identical to the generic-syntax, except that the path part is not further parsed. URLs using the semi-generic-URI syntax can be used to create and resolve relative URIs with the restriction that all paths are treated as absolute.
Whether a given scheme is parsed according to the semi-generic-URI syntax is determined by an internal table of URI schemes.
- See Also:
isGenericURI()
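The table-driven classification described for isGenericURI() and isSemiGenericURI() might be sketched as follows. The scheme names are the examples mentioned in the text above (http and ftp as generic; ldap, smtp and pop as semi-generic); the enum, the class name SchemeSyntaxSketch, the case-insensitive lookup, and the fallback to opaque for schemes in neither table are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical classifier; the real tables are internal to the library and not listed here.
final class SchemeSyntaxSketch {
    enum Syntax { GENERIC, SEMI_GENERIC, OPAQUE }

    private static final Set<String> GENERIC_SCHEMES =
            new HashSet<>(Arrays.asList("http", "ftp"));
    private static final Set<String> SEMI_GENERIC_SCHEMES =
            new HashSet<>(Arrays.asList("ldap", "smtp", "pop"));

    // Look the scheme up in both tables; anything else is treated as opaque (assumed fallback).
    static Syntax classify(String scheme) {
        String key = scheme.toLowerCase(Locale.ROOT);
        if (GENERIC_SCHEMES.contains(key)) {
            return Syntax.GENERIC;
        }
        if (SEMI_GENERIC_SCHEMES.contains(key)) {
            return Syntax.SEMI_GENERIC;
        }
        return Syntax.OPAQUE;
    }

    public static void main(String[] args) {
        System.out.println(classify("http"));
        System.out.println(classify("ldap"));
        System.out.println(classify("mailto"));
    }
}
```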
-
toURL
public URL toURL() throws MalformedURLException
Will try to create a java.net.URL object from this URI.
- Returns:
- the URL
- Throws:
MalformedURLException - if no handler is available for the scheme
-
toExternalForm
public String toExternalForm()
- Returns:
- a string representation of this URI suitable for use in links, headers, etc.
-
toString
public String toString()
Return the URI as string. This differs from toExternalForm() in that all elements are unescaped before assembly. This is not suitable for passing to other apps or in header fields and such, and is usually not what you want.
- Overrides:
toString in class Object
- Returns:
- the URI as a string
- See Also:
toExternalForm()
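The difference between an escaped form (safe for links and headers) and an unescaped form can be illustrated with the JDK's java.net.URI, used here purely as a stand-in since it exposes both views of a path. The class name EscapedFormSketch, the host example.org, and the helper method are made up for this sketch; the method names of the class documented on this page are not exercised.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo; java.net.URI stands in for the class documented here.
final class EscapedFormSketch {
    // Build an http URI from unescaped parts; the JDK escapes illegal characters itself.
    static URI build(String host, String path) {
        try {
            return new URI("http", null, host, -1, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = build("example.org", "/a b");
        System.out.println(uri.getRawPath()); // escaped view of the path
        System.out.println(uri.getPath());    // unescaped view of the path
    }
}
```

Only the escaped view is safe to embed in another URI or a header, which mirrors the warning given for toString() above.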
-
equals
public boolean equals(Object other)
-
hashCode
public int hashCode()
The hash code is calculated over scheme, host, path, and query.
-
escape
public static String escape(String elem, BitSet allowed_char, boolean utf8)
Escape any character not in the given character class. Characters greater than 255 are always escaped according to ??? .
- Parameters:
elem - the string to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode disallowed characters
- Returns:
- the string with all characters not in allowed_char escaped
-
escape
public static char[] escape(char[] elem, BitSet allowed_char, boolean utf8)
Escape any character not in the given character class. Characters greater than 255 are always escaped according to ??? .
- Parameters:
elem - the array of characters to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode disallowed characters
- Returns:
- the elem array with all characters not in allowed_char escaped
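A BitSet-driven escaper along these lines might look like the sketch below. It corresponds only to the utf8 = true case: every character outside the allowed class is UTF-8 encoded and each resulting byte is written as %XX. The class name EscapeSketch, the charClass helper, and the choice of upper-case hex digits are assumptions for illustration, not the library's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical escaper illustrating the allowed-character BitSet idea (UTF-8 mode only).
final class EscapeSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Build a character class from the characters of a string.
    static BitSet charClass(String chars) {
        BitSet set = new BitSet(128);
        for (char c : chars.toCharArray()) {
            set.set(c);
        }
        return set;
    }

    static String escape(String elem, BitSet allowed) {
        StringBuilder out = new StringBuilder(elem.length());
        for (int i = 0; i < elem.length(); ) {
            int cp = elem.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 128 && allowed.get(cp)) {
                out.append((char) cp); // allowed ASCII character passes through
                continue;
            }
            // Everything else is UTF-8 encoded and percent-escaped byte by byte.
            byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("/a b/\u00e9", charClass("ab/")));
    }
}
```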
-
unescape
public static final String unescape(String str, BitSet reserved) throws ParseException
Unescape escaped characters (i.e. %xx) except reserved ones.
- Parameters:
str - the string to unescape
reserved - the characters which may not be unescaped, or null
- Returns:
- the unescaped string
- Throws:
ParseException- if the two digits following a `%' are not a valid hex number
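The reserved-set behaviour can be sketched as follows: each %xx sequence is decoded unless the decoded character is in the reserved BitSet, in which case the escape is left untouched. This is an illustrative sketch under stated assumptions: the class name UnescapeSketch is hypothetical, IllegalArgumentException stands in for the library's ParseException, and decoded bytes are treated as single characters with no UTF-8 decoding.

```java
import java.util.BitSet;

// Hypothetical unescaper; IllegalArgumentException replaces the library's ParseException.
final class UnescapeSketch {
    static String unescape(String str, BitSet reserved) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != '%') {
                out.append(c);
                continue;
            }
            if (i + 2 >= str.length()) {
                throw new IllegalArgumentException("Incomplete escape at index " + i);
            }
            int hi = Character.digit(str.charAt(i + 1), 16);
            int lo = Character.digit(str.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digits at index " + i);
            }
            int value = hi * 16 + lo;
            if (reserved != null && reserved.get(value)) {
                out.append(str, i, i + 3); // keep the escape for reserved characters
            } else {
                out.append((char) value);
            }
            i += 2; // skip the two hex digits just consumed
        }
        return out.toString();
    }

    public static void main(String[] args) {
        BitSet reserved = new BitSet();
        reserved.set('/');
        System.out.println(unescape("a%20b%2Fc", reserved));
    }
}
```

Keeping %2F escaped in a path matters because decoding it would turn data into a path separator and change the meaning of the URI.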
-
-