Class URI
- it doesn't require a URLStreamHandler to exist for the scheme; this allows the class to be used to hold any URI, construct absolute URIs from relative ones, etc.
- it handles escapes correctly
- equals() works correctly
- relative URIs are correctly constructed
- it has methods for accessing various fields such as userinfo, fragment, params, etc.
- it handles less common forms of resources such as the "*" used in http URLs.
The elements are always stored in escaped form.
While RFC-2396 distinguishes between just two forms of URIs, those that follow the generic syntax and those that don't, this class knows about a third form, named semi-generic, used by quite a few popular schemes. Semi-generic syntax treats the path part as opaque, i.e. it has the form <scheme>://<authority>/<opaque>. Relative URIs of this type are only resolved as far as absolute paths; relative paths do not exist.
Ideally, java.net.URL should subclass URI.
- Since:
- V0.3-1
- Version:
- 0.3-3 06/05/2001
- Author:
- Ronald Tschalär
-
Field Summary
Fields (modifier and type, field, description):
- protected static final BitSet alphanumChar
- protected static final Hashtable defaultPorts
- static final boolean ENABLE_BACKWARDS_COMPATIBILITY: If true, then the parser will resolve certain URIs in a backwards-compatible (but technically incorrect) manner.
- static final BitSet escpdFragChar: list of characters which must not be escaped when escaping a fragment identifier
- static final BitSet escpdPathChar: list of characters which must not be escaped when escaping a path
- static final BitSet escpdQueryChar: list of characters which must not be escaped when escaping a query string
- protected String fragment
- protected static final int GENERIC
- protected String host
- protected static final BitSet hostChar
- protected static final BitSet markChar
- protected String opaque
- protected static final int OPAQUE
- protected static final BitSet opaqueChar
- protected String path
- protected static final BitSet pcharChar
- protected int port
- protected String query
- protected static final BitSet reg_nameChar
- protected static final BitSet reservedChar
- static final BitSet resvdHostChar: list of characters which must not be unescaped when unescaping a host
- static final BitSet resvdPathChar: list of characters which must not be unescaped when unescaping a path
- static final BitSet resvdQueryChar: list of characters which must not be unescaped when unescaping a query string
- static final BitSet resvdSchemeChar: list of characters which must not be unescaped when unescaping a scheme
- static final BitSet resvdUIChar: list of characters which must not be unescaped when unescaping a userinfo
- protected String scheme
- protected static final BitSet schemeChar
- protected static final int SEMI_GENERIC
- protected int type
- protected static final BitSet unreservedChar
- protected static final BitSet uricChar
- protected URL url
- protected String userinfo
- protected static final BitSet userinfoChar
- protected static final Hashtable usesGenericSyntax
- protected static final Hashtable usesSemiGenericSyntax
-
Constructor Summary
Constructors (constructor, description):
- URI(String uri): Constructs a URI from the given string representation.
- URI(String scheme, String opaque): Constructs an opaque URI from the given parts.
- URI(String scheme, String host, int port, String path): Constructs a URI from the given parts.
- URI(String scheme, String host, String path): Constructs a URI from the given parts, using the default port for this scheme (if known).
- URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment): Constructs a URI from the given parts.
- URI(URL url): Constructs a URI from the given URL.
- URI(URI base, String rel_uri): Constructs a URI from the given string representation, relative to the given base URI.
-
Method Summary
Methods (modifier and type, method, description):
- static String canonicalizePath(String path): Remove all "/../" and "/./" from path, where possible.
- static final int defaultPort(String protocol): Return the default port used by a given protocol.
- boolean equals
- static char[] escape: Escape any character not in the given character class.
- static String escape: Escape any character not in the given character class.
- getFragment()
- getHost()
- getOpaque()
- getPath()
- getPathAndQuery()
- int getPort()
- getQueryString()
- getScheme()
- getUserinfo()
- int hashCode(): The hash code is calculated over scheme, host, path, and query.
- boolean isGenericURI(): Does the scheme-specific part of this URI use the generic-URI syntax?
- boolean isSemiGenericURI(): Does the scheme-specific part of this URI use the semi-generic-URI syntax?
- static void main: Run test set.
- toExternalForm()
- toString(): Return the URI as a string.
- toURL(): Will try to create a java.net.URL object from this URI.
- static final String unescape: Unescape escaped characters (i.e. %xx) except reserved ones.
- static boolean usesGenericSyntax(String scheme)
- static boolean usesSemiGenericSyntax(String scheme)
-
Field Details
-
ENABLE_BACKWARDS_COMPATIBILITY
public static final boolean ENABLE_BACKWARDS_COMPATIBILITY
If true, then the parser will resolve certain URIs in a backwards-compatible (but technically incorrect) manner. Example:
base = http://a/b/c/d;p?q
rel = http:g
result = http:g (correct)
result = http://a/b/c/g (backwards compatible)
See RFC-2396, section 5.2, step 3, second paragraph.
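The strict result in the example above can be reproduced with the JDK's java.net.URI, which also follows RFC-2396. The sketch below uses that JDK class purely as a stand-in; it is not the class documented on this page, and the class name StrictResolutionSketch is made up for illustration. Because "http:g" carries its own scheme, strict resolution returns it unchanged, whereas the plain relative reference "g" is merged with the base path.

```java
import java.net.URI;

final class StrictResolutionSketch {
    // Resolves rel against base with java.net.URI (a stand-in, not the class on this page).
    static String resolve(String base, String rel) {
        return URI.create(base).resolve(rel).toString();
    }

    public static void main(String[] args) {
        String base = "http://a/b/c/d;p?q";
        // A reference with a scheme is already absolute, so it comes back as-is.
        System.out.println(resolve(base, "http:g")); // prints "http:g"
        // A reference without a scheme is merged with the base path.
        System.out.println(resolve(base, "g"));      // prints "http://a/b/c/g"
    }
}
```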
-
defaultPorts
-
usesGenericSyntax
-
usesSemiGenericSyntax
-
alphanumChar
-
markChar
-
reservedChar
-
unreservedChar
-
uricChar
-
pcharChar
-
userinfoChar
-
schemeChar
-
hostChar
-
opaqueChar
-
reg_nameChar
-
resvdSchemeChar
list of characters which must not be unescaped when unescaping a scheme -
resvdUIChar
list of characters which must not be unescaped when unescaping a userinfo -
resvdHostChar
list of characters which must not be unescaped when unescaping a host -
resvdPathChar
list of characters which must not be unescaped when unescaping a path -
resvdQueryChar
list of characters which must not be unescaped when unescaping a query string -
escpdPathChar
list of characters which must not be escaped when escaping a path -
escpdQueryChar
list of characters which must not be escaped when escaping a query string -
escpdFragChar
list of characters which must not be escaped when escaping a fragment identifier -
OPAQUE
protected static final int OPAQUE
-
SEMI_GENERIC
protected static final int SEMI_GENERIC
-
GENERIC
protected static final int GENERIC
-
type
protected int type -
scheme
-
opaque
-
userinfo
-
host
-
port
protected int port -
path
-
query
-
fragment
-
url
-
-
Constructor Details
-
URI
Constructs a URI from the given string representation. The string must be an absolute URI.
- Parameters:
uri - a String containing an absolute URI
- Throws:
ParseException - if no scheme can be found or a specified port cannot be parsed as a number
-
URI
Constructs a URI from the given string representation, relative to the given base URI.
- Parameters:
base - the base URI, relative to which rel_uri is to be parsed
rel_uri - a String containing a relative or absolute URI
- Throws:
ParseException - if base is null and rel_uri is not an absolute URI, or if base is not null and the scheme is not known to use the generic syntax, or if a given port cannot be parsed as a number
-
URI
Constructs a URI from the given URL.
- Parameters:
url - the URL
- Throws:
ParseException - if url.toExternalForm() generates an invalid string representation
-
URI
Constructs a URI from the given parts, using the default port for this scheme (if known). The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
path - the path part
- Throws:
ParseException - if scheme is null
-
URI
Constructs a URI from the given parts. The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
host - the host
port - the port
path - the path part
- Throws:
ParseException - if scheme is null
-
URI
public URI(String scheme, String userinfo, String host, int port, String path, String query, String fragment) throws ParseException
Constructs a URI from the given parts. Any part except for the scheme may be null. The parts must be in unescaped form.
- Parameters:
scheme - the scheme (sometimes known as protocol)
userinfo - the userinfo
host - the host
port - the port
path - the path part
query - the query string
fragment - the fragment identifier
- Throws:
ParseException - if scheme is null
-
URI
Constructs an opaque URI from the given parts.
- Parameters:
scheme - the scheme (sometimes known as protocol)
opaque - the opaque part
- Throws:
ParseException - if scheme is null
-
-
Method Details
-
canonicalizePath
Remove all "/../" and "/./" from path, where possible. Leading "/../" sequences are not removed.
- Parameters:
path - the path to canonicalize
- Returns:
- the canonicalized path
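As a rough illustration of the rule described for canonicalizePath, the following self-contained sketch works segment by segment. It is not the class's actual implementation: the class name CanonicalizePathSketch and the helper canPop are hypothetical, and the sketch deliberately leaves a trailing "." or ".." (one not followed by a slash) untouched, since only the "/./" and "/../" forms are described here.

```java
import java.util.ArrayList;
import java.util.List;

final class CanonicalizePathSketch {
    // Drops "." segments and collapses "segment/.." pairs, but only when the
    // "." or ".." is followed by a slash; leading ".." segments are kept.
    static String canonicalize(String path) {
        String[] segments = path.split("/", -1); // -1 keeps trailing empty segments
        List<String> out = new ArrayList<>();
        for (int i = 0; i < segments.length; i++) {
            String seg = segments[i];
            boolean followedBySlash = i < segments.length - 1;
            if (seg.equals(".") && followedBySlash) {
                continue; // "/./" contributes nothing
            }
            if (seg.equals("..") && followedBySlash && canPop(out)) {
                out.remove(out.size() - 1); // "/x/../" cancels the segment x
                continue;
            }
            out.add(seg);
        }
        return String.join("/", out);
    }

    // A segment can be cancelled unless it is the empty marker in front of the
    // leading slash or an earlier ".." that could not be resolved itself.
    private static boolean canPop(List<String> out) {
        if (out.isEmpty()) {
            return false;
        }
        String last = out.get(out.size() - 1);
        boolean isRootMarker = out.size() == 1 && last.isEmpty();
        return !isRootMarker && !last.equals("..");
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("/a/./b/../c")); // prints "/a/c"
        System.out.println(canonicalize("/../a"));       // prints "/../a"
    }
}
```

Keeping a leading "/../" in place mirrors the statement above that such sequences are not removed, because there is no earlier segment for them to cancel.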
-
usesGenericSyntax
- Returns:
- true if the scheme should be parsed according to the generic-URI syntax
-
usesSemiGenericSyntax
- Returns:
- true if the scheme should be parsed according to a semi-generic-URI syntax <scheme>://<hostport>/<opaque>
-
defaultPort
Return the default port used by a given protocol.
- Parameters:
protocol - the protocol
- Returns:
- the port number, or 0 if unknown
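The contract of defaultPort (a table lookup that yields 0 for unknown protocols) might look roughly like the sketch below. The class name DefaultPortSketch, the three table entries, and the case-insensitive lookup are assumptions made for illustration; the contents of the real defaultPorts table are not listed on this page.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class DefaultPortSketch {
    // Illustrative table only; the documented class keeps its own table.
    private static final Map<String, Integer> DEFAULT_PORTS = new HashMap<>();
    static {
        DEFAULT_PORTS.put("http", 80);
        DEFAULT_PORTS.put("https", 443);
        DEFAULT_PORTS.put("ftp", 21);
    }

    // Returns the well-known port for the protocol, or 0 if it is not in the table.
    static int defaultPort(String protocol) {
        Integer port = DEFAULT_PORTS.get(protocol.trim().toLowerCase(Locale.ROOT));
        return port != null ? port : 0;
    }

    public static void main(String[] args) {
        System.out.println(defaultPort("HTTP"));   // prints 80
        System.out.println(defaultPort("gopher")); // prints 0 (not in this sample table)
    }
}
```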
-
getScheme
- Returns:
- the scheme (often also referred to as protocol)
-
getOpaque
- Returns:
- the opaque part, or null if this URI is generic
-
getHost
- Returns:
- the host
-
getPort
public int getPort()
- Returns:
- the port, or -1 if it's the default port, or 0 if unknown
-
getUserinfo
- Returns:
- the user info
-
getPath
- Returns:
- the path
-
getQueryString
- Returns:
- the query string
-
getPathAndQuery
- Returns:
- the path and query
-
getFragment
- Returns:
- the fragment
-
isGenericURI
public boolean isGenericURI()
Does the scheme-specific part of this URI use the generic-URI syntax?
In general, URIs are split into two categories: opaque-URI and generic-URI. The generic-URI syntax is the syntax most are familiar with from URLs such as ftp- and http-URLs, which is roughly:
generic-URI = scheme ":" [ "//" server ] [ "/" ] [ path_segments ] [ "?" query ]
(see RFC-2396 for exact syntax). Only URLs using the generic-URI syntax can be used to create and resolve relative URIs.
Whether a given scheme is parsed according to the generic-URI syntax or whether it is treated as opaque is determined by an internal table of URI schemes.
-
isSemiGenericURI
public boolean isSemiGenericURI()
Does the scheme-specific part of this URI use the semi-generic-URI syntax?
Many schemes which don't follow the full generic syntax actually follow a reduced form where the path part is treated as opaque. This is used for example by ldap, smtp, pop, etc., and is roughly
semi-generic-URI = scheme ":" [ "//" server ] [ "/" [ opaque_path ] ]
I.e. parsing is identical to the generic syntax, except that the path part is not further parsed. URLs using the semi-generic-URI syntax can be used to create and resolve relative URIs with the restriction that all paths are treated as absolute.
Whether a given scheme is parsed according to the semi-generic-URI syntax is determined by an internal table of URI schemes.
-
toURL
Will try to create a java.net.URL object from this URI.
- Returns:
- the URL
- Throws:
MalformedURLException - if no handler is available for the scheme
-
toExternalForm
- Returns:
- a string representation of this URI suitable for use in links, headers, etc.
-
toString
Return the URI as a string. This differs from toExternalForm() in that all elements are unescaped before assembly. The result is not suitable for passing to other applications or for use in header fields and the like, and is usually not what you want. -
equals
-
hashCode
public int hashCode()
The hash code is calculated over scheme, host, path, and query. -
escape
Escape any character not in the given character class. Characters greater than 255 are always escaped according to ??? .
- Parameters:
elem - the string to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode unallowed characters
- Returns:
- the string with all characters not in allowed_char escaped
-
escape
Escape any character not in the given character class. Characters greater than 255 are always escaped according to ??? .
- Parameters:
elem - the array of characters to escape
allowed_char - the BitSet of all allowed characters
utf8 - if true, will first UTF-8 encode unallowed characters
- Returns:
- the elem array with all characters not in allowed_char escaped
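To make the BitSet-driven escaping concrete, here is a minimal sketch of the idea, not the class's own code. It always UTF-8 encodes disallowed characters (roughly the utf8 = true case) and does not model whatever rule the class applies to characters above 255. The names EscapeSketch and samplePathChars, and the particular allowed set, are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

final class EscapeSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // A sample "allowed" class: ASCII letters, digits, the RFC-2396 mark characters, and '/'.
    static BitSet samplePathChars() {
        BitSet set = new BitSet(128);
        for (char c = 'a'; c <= 'z'; c++) set.set(c);
        for (char c = 'A'; c <= 'Z'; c++) set.set(c);
        for (char c = '0'; c <= '9'; c++) set.set(c);
        for (char c : "-_.!~*'()/".toCharArray()) set.set(c);
        return set;
    }

    // Copies allowed ASCII characters through; every other character is UTF-8
    // encoded and each resulting byte is written as %XX.
    static String escape(String elem, BitSet allowed) {
        StringBuilder sb = new StringBuilder(elem.length());
        for (int i = 0; i < elem.length(); ) {
            int cp = elem.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 128 && allowed.get(cp)) {
                sb.append((char) cp);
            } else {
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    sb.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("/my docs/a+b", samplePathChars())); // prints "/my%20docs/a%2Bb"
    }
}
```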
-
unescape
Unescape escaped characters (i.e. %xx) except reserved ones.
- Parameters:
str - the string to unescape
reserved - the characters which may not be unescaped, or null
- Returns:
- the unescaped string
- Throws:
ParseException - if the two digits following a '%' are not a valid hex number
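The interplay between unescaping and the reserved set can be sketched as follows. This is an illustrative approximation rather than the class's implementation: UnescapeSketch is a made-up name, java.text.ParseException stands in for the library's own ParseException, and each %xx is turned into a single character without any UTF-8 decoding.

```java
import java.text.ParseException;
import java.util.BitSet;

final class UnescapeSketch {
    // Replaces each %xx with the character it encodes, unless that character is
    // in the reserved set, in which case the escape sequence is kept verbatim.
    static String unescape(String str, BitSet reserved) throws ParseException {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != '%') {
                sb.append(c);
                continue;
            }
            if (i + 2 >= str.length()) {
                throw new ParseException("Incomplete escape sequence in: " + str, i);
            }
            int hi = Character.digit(str.charAt(i + 1), 16);
            int lo = Character.digit(str.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new ParseException("Invalid hex digits in: " + str, i);
            }
            int value = (hi << 4) | lo;
            if (reserved != null && reserved.get(value)) {
                sb.append(str, i, i + 3); // keep "%xx" as-is
            } else {
                sb.append((char) value);
            }
            i += 2; // skip the two hex digits just consumed
        }
        return sb.toString();
    }

    public static void main(String[] args) throws ParseException {
        BitSet reserved = new BitSet(128);
        reserved.set('/'); // an escaped slash must stay escaped inside a path
        System.out.println(unescape("a%20b%2Fc", reserved)); // prints "a b%2Fc"
        System.out.println(unescape("a%20b%2Fc", null));     // prints "a b/c"
    }
}
```

Keeping reserved characters escaped matters because unescaping them would change how the element is split; an unescaped "/" inside a path segment would read as a segment boundary.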
-
main
Run test set.
- Throws:
Exception - if any test fails
-