4. Usage of the IRI Module
4.1 Parsing of IRI references:
IRI references can be parsed using parseIRIref/2
and parseIRIref/3. For efficiency, the programmer should prefer
parseIRIref/2, which requires lists of Unicode character codes
terminated with -1.
- parseIRIref( + TermUCSList, IRIref )
Given a list of Unicode character codes, terminated with -1,
parseIRIref/2 returns the term representation of the IRI reference.
Fails if the first argument is not a syntactically correct IRI
reference. Notice that the separators between the IRI components do
not appear in the IRI reference term representation.
The ihostname production of Internationalized Resource Identifiers is
not fully implemented: the only check is that the ihostname part contains
no illegal characters. The syntax of IPv6 addresses is not checked,
and IPv4 addresses are checked only if UserInfo is present.
Example:
| ?- append( "http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING", [-1], _Codes ),
     parseIRIref( _Codes, Ref ).
Ref = iriref(scheme([104,116,116,112]),
authority([],[119,119,119,46,105,99,115,46,117,99,105,46,101,100,117],[]),
path(abs,[segment([112,117,98]),
segment([105,101,116,102]),
segment([117,114,105]),
segment([104,105,115,116,111,114,105,99,97,108,46,104,116,109,108])
]
),
[],
fragment([87,65,82,78,73,78,71])
);
- parseIRIref( + Terminated, + ListOfCodes, IRIref )
The first argument Terminated may take the values yes or no, indicating
respectively whether or not the list of Unicode character codes in the
second argument is terminated with -1. A call of the form
parseIRIref( yes, ListOfCodes, IRIref ) is equivalent to
parseIRIref( ListOfCodes, IRIref ). If the first argument is no, then
the symbol -1 is appended to the second argument and parseIRIref/2 is
called. Because of this extra append, the three-argument form should be
used sparingly.
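The relation between the two forms can be sketched as follows. This is only an illustration: parse_unterminated/2 and parse_by_hand/2 are hypothetical helper names, the IRI module predicates are assumed to be already loaded, and append/3 is assumed to come from the XSB basics module.

```prolog
% Assumption: the IRI module is already loaded, and double-quoted
% strings denote lists of character codes (as in the examples here).
:- import append/3 from basics.

% parse_unterminated(+Codes, -IRIref): hypothetical helper.
% Parses a plain (unterminated) code list with the three-argument form.
parse_unterminated(Codes, IRIref) :-
    parseIRIref(no, Codes, IRIref).

% parse_by_hand(+Codes, -IRIref): hypothetical helper.
% Does the same work explicitly: append the -1 terminator,
% then call parseIRIref/2.
parse_by_hand(Codes, IRIref) :-
    append(Codes, [-1], Terminated),
    parseIRIref(Terminated, IRIref).
```

Going by the description above, both helpers should produce the same term for the same input, so a query such as parse_unterminated( "http://www.example.com/a", R1 ), parse_by_hand( "http://www.example.com/a", R2 ), R1 == R2 would be expected to succeed.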
4.2 Testing and Inspection of IRI reference terms:
The following predicates determine the type of an IRI reference that
has been parsed or constructed:
- isIRIref(+ IRIref )
This predicate succeeds when its argument is a term with the IRI
reference function symbol. For efficiency, it does not check that the
component arguments are correct.
- isIRI(+ IRIref )
This predicate succeeds when its argument is an IRI, i.e. an IRI
reference with a non-empty scheme component.
- isAbsoluteIRI(+ IRIref )
This predicate succeeds when its argument is an absolute IRI, i.e. an
IRI without a fragment part.
- isRelativeIRI(+ IRIref )
This predicate succeeds when its argument is a relative IRI, i.e. an IRI
reference with an empty scheme component.
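As an illustration of how these tests combine, the sketch below classifies an IRI reference term. The helper iriref_kind/2 and the atoms it returns are hypothetical, and the ordering of the tests assumes that isAbsoluteIRI/1 succeeds only for references that also satisfy isIRI/1, as the definitions above suggest.

```prolog
% iriref_kind(+IRIref, -Kind): hypothetical helper, not part of the module.
% Assumes the IRI module predicates are already loaded.
iriref_kind(IRIref, Kind) :-
    isIRIref(IRIref),                       % cheap functor check only
    (   isAbsoluteIRI(IRIref) -> Kind = absolute_iri        % scheme, no fragment
    ;   isIRI(IRIref)         -> Kind = iri_with_fragment   % scheme and fragment
    ;   isRelativeIRI(IRIref) -> Kind = relative            % empty scheme
    ).
```

Under these assumptions, the reference parsed in the example of Section 4.1, which has both a scheme and a fragment, would be classified as iri_with_fragment.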
To obtain the individual components of an IRI reference term, the
following predicates may be used:
- getIRIrefScheme(+ IRIref,Scheme)
Obtains the scheme component of a given IRI reference. The scheme
component is a term of the form scheme( ListOfCodes ) or an empty
list, as described in Section 2.
- getIRIrefAuthority(+ IRIref,Authority)
Obtains the authority component of a given IRI reference. The
authority component is a term of the form authority( UserInfo, Host,
Port) or an empty list, as described in Section 2.
- getIRIrefPath(+ IRIref,Path)
Obtains the path component of a given IRI reference. The path
component is a term of the form path( AbsRel, Segments ) or an
empty list, as described in Section 2.
- getIRIrefQuery(+ IRIref,Query)
Obtains the query component of a given IRI reference. The query
component is a term of the form query( ListOfCodes ) or an empty
list, as described in Section 2.
- getIRIrefFragment(+ IRIref,Fragment)
Obtains the fragment component of a given IRI reference. The
fragment component is a term of the form fragment( ListOfCodes )
or an empty list, as described in Section 2.
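The accessors can be combined with ordinary unification to reach inside a component. The following is a minimal sketch, assuming the component terms have the shapes shown in the example of Section 4.1; iriref_scheme_host/3 is a hypothetical helper name.

```prolog
% iriref_scheme_host(+IRIref, -SchemeAtom, -HostAtom): hypothetical helper.
% Assumes the IRI module predicates are already loaded.
% Fails when the scheme or the authority component is the empty list,
% because [] does not unify with scheme(_) or authority(_,_,_).
iriref_scheme_host(IRIref, SchemeAtom, HostAtom) :-
    getIRIrefScheme(IRIref, scheme(SchemeCodes)),
    getIRIrefAuthority(IRIref, authority(_UserInfo, HostCodes, _Port)),
    atom_codes(SchemeAtom, SchemeCodes),
    atom_codes(HostAtom, HostCodes).
```

Applied to the term Ref from the example in Section 4.1, this should bind SchemeAtom to http and HostAtom to 'www.ics.uci.edu', since those are the code lists held in its scheme and authority components.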
4.3 Construction of IRI references:
The following predicates provide mechanisms to construct IRI references
dynamically. The recommended way to construct IRIs is to parse them from
lists of Unicode character codes. The predicates described in this section
should be used with care, since their arguments are not checked.
- createEmptyIRIref( IRIref )
This predicate creates an empty IRI reference.
- createIRIref( + Scheme, + Authority, + Path, + Query, + Fragment, IRIref )
This predicate creates an IRI reference from its several components.
The input arguments are either empty lists or component terms as
described in Section 2 above.
- setIRIrefScheme( + OldIRIref, + Scheme, NewIRIref )
The predicate setIRIrefScheme/3 replaces the scheme component in the
IRI reference term OldIRIref by the list of Unicode character codes in
argument Scheme, returning the new IRI reference term in the last
argument NewIRIref.
- setIRIrefAuthority( + OldIRIref, + UserInfo, + Host, + Port, NewIRIref )
The predicate setIRIrefAuthority/5 replaces the authority component in
the IRI term OldIRIref by the authority term constructed from the lists
of Unicode character codes in arguments UserInfo, Host and Port.
The new IRI reference term is returned in the last argument NewIRIref.
- setIRIrefPath( + OldIRIref, + AbsRel, + Path, NewIRIref )
The predicate setIRIrefPath/4 replaces the path component in the IRI
term OldIRIref by the path term constructed from the list of segments
in argument Path and the flag AbsRel, which may take the values abs or
rel. The new IRI reference term is returned in the last argument
NewIRIref.
- setIRIrefQuery( + OldIRIref, + Query, NewIRIref )
The predicate setIRIrefQuery/3 replaces the query component in the IRI
reference term OldIRIref by the list of Unicode character codes in
argument Query, returning the new IRI reference term in the last
argument NewIRIref.
- setIRIrefFragment( + OldIRIref, + Fragment, NewIRIref )
The predicate setIRIrefFragment/3 replaces the fragment component in
the IRI reference term OldIRIref by the list of Unicode character codes
in argument Fragment, returning the new IRI reference term in the last
argument NewIRIref.
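A small construction sketch follows. It assumes that the component terms accepted by createIRIref/6 have the same shapes as those produced by the parser in the examples of this section (scheme/1, authority/3, path/2 with segment/1 terms), and that double-quoted strings denote lists of character codes; build_example/1 is a hypothetical helper name. Since no checking of arguments is performed, a mistake here would go unnoticed.

```prolog
% build_example(-IRIref): hypothetical helper, not part of the module.
% Assumes the IRI module predicates are already loaded.
build_example(IRIref) :-
    createIRIref(scheme("http"),
                 authority([], "www.example.com", []),   % no user info, no port
                 path(abs, [segment("a"), segment("b")]),
                 [],                                     % no query
                 [],                                     % no fragment yet
                 IRIref0),
    % The setters take plain code lists, not component terms.
    setIRIrefQuery(IRIref0, "lang=en", IRIref1),
    setIRIrefFragment(IRIref1, "top", IRIref).
```

Each setter returns a new term and leaves the old one unchanged, which is why the intermediate references are threaded through separate variables.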
4.4 Resolution of IRI references:
The IRI module implements resolution of IRI references according to the
algorithms described in RFC 2396 bis. Therefore, empty references are
allowed and abnormal relative path ".." segments are removed from the
resulting IRI.
- resolveIRIref( + IRIref, + BaseIRI, ResIRI)
The first argument of resolveIRIref/3 is an arbitrary IRI reference
term, while BaseIRI should be an IRI term, i.e. one with a non-empty
scheme component. The resolved IRI is returned in the last argument.
Example:
| ?- atom2iriref( 'http://www.example.com:8080/a/b/c', BaseIRI ),
     atom2iriref( '../x/y&query#123', RelIRI ),
     resolveIRIref( RelIRI, BaseIRI, ResIRI ),
     iriref2atom( ResIRI, Resolved ).
BaseIRI = iriref(scheme([104,116,116,112]),
authority([],[119,119,119,46,101,120,97,109,112,108,101,46,99,111,109],[56,48,56,48]),
path(abs,[segment([97]),segment([98]),segment([99])]),
[],
[]
)
RelIRI = iriref([],
[],
path(rel,[segment([46,46]),segment([120]),segment([121,38,113,117,101,114,121])]),
[],
fragment([49,50,51])
)
ResIRI = iriref(scheme([104,116,116,112]),
authority([],[119,119,119,46,101,120,97,109,112,108,101,46,99,111,109],[56,48,56,48]),
path(abs,[segment([97]),segment([120]),segment([121,38,113,117,101,114,121])]),
[],
fragment([49,50,51])
)
Resolved = http://www.example.com:8080/a/x/y&query#123;
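When only atoms are of interest, the steps of the example above can be wrapped in one predicate. The sketch below is a hypothetical convenience helper (resolve_atom/3 is not part of the module) and assumes the IRI module predicates are already loaded; it also checks the requirement that the base carries a scheme.

```prolog
% resolve_atom(+RefAtom, +BaseAtom, -ResolvedAtom): hypothetical helper.
% Fails if either atom is not a syntactically correct IRI reference,
% or if the base has an empty scheme component.
resolve_atom(RefAtom, BaseAtom, ResolvedAtom) :-
    atom2iriref(BaseAtom, BaseIRI),
    isIRI(BaseIRI),                         % the base must be an IRI
    atom2iriref(RefAtom, RefIRI),
    resolveIRIref(RefIRI, BaseIRI, ResIRI),
    iriref2atom(ResIRI, ResolvedAtom).
```

With the inputs of the example above, resolve_atom( '../x/y&query#123', 'http://www.example.com:8080/a/b/c', R ) should bind R to the same atom as Resolved.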
4.5 Conversion and mapping of IRI references:
- atom2iriref( + AtomInUTF8, IRIref )
This predicate converts an IRI reference represented as a UTF-8 sequence
of octets to the IRI reference term representation. It fails if the atom
is not a syntactically correct IRI reference.
- iriref2atom( + IRIref, AtomInUTF8 )
The predicate iriref2atom/2 converts the IRI reference term
representation to an atom in UTF-8 encoding.
Example:
| ?- atom2iriref( 'mailto:Carlos.Damasio@di.fct.unl.pt', IRIref ),
     iriref2atom( IRIref, Atom ).
IRIref = iriref(scheme([109,97,105,108,116,111]),
[],
path(rel,[segment([67,97,114,108,111,115,46,68,97,109,97,115,105,111,
64,100,105,46,102,99,116,46,117,110,108,46,112,116])]
),
[],
[])
Atom = mailto:Carlos.Damasio@di.fct.unl.pt;
- iriref2string( + IRIref, StringInUTF8 )
iriref2string( + IRIref, StringInUTF8, RestStringInUTF8 )
The iriref2string predicates convert an IRI reference term
representation to a list of Unicode characters in UTF-8 encoding. The
three-argument version returns an incomplete list, where
RestStringInUTF8 is the variable tail.
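The variable tail of the three-argument version makes it possible to extend the output without calling append/3. The sketch below shows two uses; both helper names are hypothetical, the IRI module predicates are assumed to be loaded, and it is assumed that the tail argument may already be bound when iriref2string/3 is called, by analogy with the filename2uri/3 examples in this section.

```prolog
% iriref_terminated_codes(+IRIref, -Codes): hypothetical helper.
% Produces the codes of IRIref followed directly by the -1 terminator.
% For pure ASCII references this list has the form parseIRIref/2 expects.
iriref_terminated_codes(IRIref, Codes) :-
    iriref2string(IRIref, Codes, [-1]).

% two_irirefs_codes(+A, +B, -Codes): hypothetical helper.
% Serializes A, a space (code 32), then B into one closed list.
two_irirefs_codes(A, B, Codes) :-
    iriref2string(A, Codes, [32|Rest]),
    iriref2string(B, Rest, []).
```

The same pattern should apply to the three-argument versions of iri2uri and filename2uri, which also return a list with a variable tail.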
- iri2uri( + UCSList, URIList )
iri2uri( + UCSList, URIList, RestURIList )
The iri2uri predicates convert an IRI reference represented by a list of
Unicode character codes to a proper Uniform Resource Identifier, using
the algorithm described in Internationalized Resource Identifiers.
The three-argument version returns an incomplete list, where RestURIList
is the variable tail.
Example:
| ?- iri2uri( "mailto://Carlos.Damásio@di.fct.unl.pt", L ),
     atom_codes( URI, L ).
L =
[109,97,105,108,116,111,58,47,47,67,97,114,108,111,115,46,68,97,109,37,67,
50,37,65,48,115,105,111,64,100,105,46,102,99,116,46,117,110,108,46,112,116]
URI = mailto://Carlos.Dam%C2%A0sio@di.fct.unl.pt;
- filename2uri( + UCSList, URIList )
filename2uri( + UCSList, URIList, RestURIList )
The filename2uri predicates convert an absolute file path, represented
by a list of ASCII character codes, to a Uniform Resource Identifier,
escaping excluded characters. The three-argument version returns an
incomplete list, where RestURIList is the variable tail. These
predicates use specific built-in XSB predicates to detect the
underlying operating system in order to recognize path separators: '\'
on Windows-based systems.
On Windows operating systems, the absolute file path must contain the
drive letter. On non-Windows operating systems, the path must start
with '/'.
Example (Windows):
| ?- filename2uri( "C:\My Documents\Jo%A0o", L, [-1] ),
parseIRIref( L, _IRI ),
iriref2atom( _IRI, FilePath ).
L =
[102,105,108,101,58,47,47,67,58,47,77,121,37,50,48,68,111,99,117,109,101,110,116,115,47,74,111,37,65,48,111,-1]
FilePath = file://C:/My%20Documents/Jo%A0o;
no
| ?- filename2uri( "C:/My Documents/Jo%A0o", L, [-1] ),
parseIRIref( L, _IRI ),
iriref2atom( _IRI, FilePath ).
L =
[102,105,108,101,58,47,47,67,58,47,77,121,37,50,48,68,111,99,117,109,101,110,116,115,47,74,111,37,65,48,111,-1]
FilePath = file://C:/My%20Documents/Jo%A0o
Example (Non-Windows):
| ?- filename2uri( "/My Documents/Jo%A0o", L, [-1] ),
parseIRIref( L, _IRI ),
iriref2atom( _IRI, FilePath ).
L =
[102,105,108,101,58,47,77,121,37,50,48,68,111,99,117,109,101,110,116,115,47,74,111,37,65,48,111,-1]
FilePath = file:/My%20Documents/Jo%A0o;
no