NYTProf Performance Profile (line view)
For svc/members/upsert
  Run on Tue Jan 13 11:50:22 2015
Reported on Tue Jan 13 12:09:51 2015

Filename: /usr/share/perl5/URI/Escape.pm
Statements: Executed 272 statements in 784µs
Subroutines
Calls  P  F  Exclusive Time  Inclusive Time  Subroutine
1      1  1  14µs            30µs            URI::Escape::BEGIN@3
1      1  1  10µs            20µs            URI::Escape::BEGIN@4
1      1  1  9µs             19µs            URI::Escape::BEGIN@140
1      1  1  4µs             4µs             URI::Escape::BEGIN@146
2      1  1  3µs             3µs             URI::Escape::CORE:qr (opcode)
0      0  0  0s              0s              URI::Escape::_fail_hi
0      0  0  0s              0s              URI::Escape::escape_char
0      0  0  0s              0s              URI::Escape::uri_escape
0      0  0  0s              0s              URI::Escape::uri_escape_utf8
0      0  0  0s              0s              URI::Escape::uri_unescape
Line  Statements  Time on line  Calls  Time in subs  Code
1package URI::Escape;
2
3  2  27µs  2  45µs
# spent 30µs (14+15) within URI::Escape::BEGIN@3 which was called: # once (14µs+15µs) by C4::Auth::BEGIN@24 at line 3
use strict;
# spent 30µs making 1 call to URI::Escape::BEGIN@3 # spent 16µs making 1 call to strict::import
4  2  77µs  2  29µs
# spent 20µs (10+9) within URI::Escape::BEGIN@4 which was called: # once (10µs+9µs) by C4::Auth::BEGIN@24 at line 4
use warnings;
# spent 20µs making 1 call to URI::Escape::BEGIN@4 # spent 9µs making 1 call to warnings::import
5
6=head1 NAME
7
8URI::Escape - Percent-encode and percent-decode unsafe characters
9
10=head1 SYNOPSIS
11
12 use URI::Escape;
13 $safe = uri_escape("10% is enough\n");
14 $verysafe = uri_escape("foo", "\0-\377");
15 $str = uri_unescape($safe);
16
17=head1 DESCRIPTION
18
19This module provides functions to percent-encode and percent-decode URI strings as
20defined by RFC 3986. Percent-encoding URIs is informally called "URI escaping".
21This is the terminology used by this module, which predates the formalization of the
22terms by the RFC by several years.
23
24A URI consists of a restricted set of characters. The restricted set
25of characters consists of digits, letters, and a few graphic symbols
26chosen from those common to most of the character encodings and input
27facilities available to Internet users. They are made up of the
28"unreserved" and "reserved" character sets as defined in RFC 3986.
29
30 unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
31 reserved = ":" / "/" / "?" / "#" / "[" / "]" / "@"
32 "!" / "$" / "&" / "'" / "(" / ")"
33 / "*" / "+" / "," / ";" / "="
34
35In addition, any byte (octet) can be represented in a URI by an escape
36sequence: a triplet consisting of the character "%" followed by two
37hexadecimal digits. A byte can also be represented directly by a
38character, using the US-ASCII character for that octet.
39
40Some of the characters are I<reserved> for use as delimiters or as
41part of certain URI components. These must be escaped if they are to
42be treated as ordinary data. Read RFC 3986 for further details.
43
44The functions provided (and exported by default) from this module are:
45
46=over 4
47
48=item uri_escape( $string )
49
50=item uri_escape( $string, $unsafe )
51
52Replaces each unsafe character in the $string with the corresponding
53escape sequence and returns the result. The $string argument should
54be a string of bytes. The uri_escape() function will croak if given
55a character with a code above 255. Use uri_escape_utf8() if you know you
56have such chars and/or want chars in the 128 .. 255 range treated as
57UTF-8.
58
59The uri_escape() function takes an optional second argument that
60overrides the set of characters that are to be escaped. The set is
61specified as a string that can be used in a regular expression
62character class (between [ ]). E.g.:
63
64 "\x00-\x1f\x7f-\xff" # all control and hi-bit characters
65 "a-z" # all lower case characters
66 "^A-Za-z" # everything not a letter
67
68The default set of characters to be escaped is all those that are
69I<not> part of the C<unreserved> character class shown above, so the
70reserved characters are escaped too. I.e. the default is:
71
72 "^A-Za-z0-9\-\._~"
73
74=item uri_escape_utf8( $string )
75
76=item uri_escape_utf8( $string, $unsafe )
77
78Works like uri_escape(), but will encode chars as UTF-8 before
79escaping them. This makes this function able to deal with characters
80with code above 255 in $string. Note that chars in the 128 .. 255
81range will be escaped differently by this function compared to what
82uri_escape() would. For chars in the 0 .. 127 range there is no
83difference.
84
85Equivalent to:
86
87 utf8::encode($string);
88 my $uri = uri_escape($string);
89
90Note: JavaScript has a function called escape() that produces the
91sequence "%uXXXX" for chars in the 256 .. 65535 range. This function
92has really nothing to do with URI escaping but some folks got confused
93since it "does the right thing" in the 0 .. 255 range. Because of
94this you sometimes see "URIs" with these kinds of escapes. The
95JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
96
97=item uri_unescape($string,...)
98
99Returns a string with each %XX sequence replaced with the actual byte
100(octet).
101
102This does the same as:
103
104 $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
105
106but does not modify the string in-place as this RE would. Using the
107uri_unescape() function instead of the RE might make the code look
108cleaner and is a few characters less to type.
109
110In a simple benchmark test I did, calling the function (instead of the
111inline RE above) was something like 40% slower when a few chars were
112unescaped, and something like 700% slower when none were. If
113you are going to unescape a lot of times it might be a good idea to
114inline the RE.
115
116If the uri_unescape() function is passed multiple strings, then each
117one is returned unescaped.
118
119=back
120
121The module can also export the C<%escapes> hash, which contains the
122mapping from all 256 bytes to the corresponding escape codes. Lookup
123in this hash is faster than evaluating C<sprintf("%%%02X", ord($byte))>
124each time.
125
126=head1 SEE ALSO
127
128L<URI>
129
130
131=head1 COPYRIGHT
132
133Copyright 1995-2004 Gisle Aas.
134
135This program is free software; you can redistribute it and/or modify
136it under the same terms as Perl itself.
137
138=cut
139
140  2  56µs  2  29µs
# spent 19µs (9+10) within URI::Escape::BEGIN@140 which was called: # once (9µs+10µs) by C4::Auth::BEGIN@24 at line 140
use Exporter 'import';
# spent 19µs making 1 call to URI::Escape::BEGIN@140 # spent 10µs making 1 call to Exporter::import
141  1  200ns  our %escapes;
142  1  1µs  our @EXPORT = qw(uri_escape uri_unescape uri_escape_utf8);
143  1  200ns  our @EXPORT_OK = qw(%escapes);
144  1  200ns  our $VERSION = "3.31";
145
146  2  406µs  1  4µs
# spent 4µs within URI::Escape::BEGIN@146 which was called: # once (4µs+0s) by C4::Auth::BEGIN@24 at line 146
use Carp ();
# spent 4µs making 1 call to URI::Escape::BEGIN@146
147
148# Build a char->hex map
149  1  1µs  for (0..255) {
150  256  199µs   $escapes{chr($_)} = sprintf("%%%02X", $_);
151}
152
153  1  200ns  my %subst; # compiled patterns
154
155  1  12µs  2  3µs  my %Unsafe = (
# spent 3µs making 2 calls to URI::Escape::CORE:qr, avg 2µs/call
156 RFC2732 => qr/[^A-Za-z0-9\-_.!~*'()]/,
157 RFC3986 => qr/[^A-Za-z0-9\-\._~]/,
158);
159
160sub uri_escape {
161 my($text, $patn) = @_;
162 return undef unless defined $text;
163 if (defined $patn){
164 unless (exists $subst{$patn}) {
165 # Because we can't compile the regex we fake it with a cached sub
166 (my $tmp = $patn) =~ s,/,\\/,g;
167 eval "\$subst{\$patn} = sub {\$_[0] =~ s/([$tmp])/\$escapes{\$1} || _fail_hi(\$1)/ge; }";
168 Carp::croak("uri_escape: $@") if $@;
169 }
170 &{$subst{$patn}}($text);
171 } else {
172 $text =~ s/($Unsafe{RFC3986})/$escapes{$1} || _fail_hi($1)/ge;
173 }
174 $text;
175}
176
177sub _fail_hi {
178 my $chr = shift;
179 Carp::croak(sprintf "Can't escape \\x{%04X}, try uri_escape_utf8() instead", ord($chr));
180}
181
182sub uri_escape_utf8 {
183 my $text = shift;
184 utf8::encode($text);
185 return uri_escape($text, @_);
186}
187
188sub uri_unescape {
189 # Note from RFC1630: "Sequences which start with a percent sign
190 # but are not followed by two hexadecimal characters are reserved
191 # for future extension"
192 my $str = shift;
193 if (@_ && wantarray) {
194 # not executed for the common case of a single argument
195 my @str = ($str, @_); # need to copy
196 for (@str) {
197 s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
198 }
199 return @str;
200 }
201 $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg if defined $str;
202 $str;
203}
204
205# XXX FIXME escape_char is buggy as it assigns meaning to the string's storage format.
206sub escape_char {
207 # Old versions of utf8::is_utf8() didn't properly handle magical vars (e.g. $1).
208 # The following forces a fetch to occur beforehand.
209 my $dummy = substr($_[0], 0, 0);
210
211 if (utf8::is_utf8($_[0])) {
212 my $s = shift;
213 utf8::encode($s);
214 unshift(@_, $s);
215 }
216
217 return join '', @URI::Escape::escapes{split //, $_[0]};
218}
219
220  1  5µs  1;
 
# spent 3µs within URI::Escape::CORE:qr which was called 2 times, avg 2µs/call: # 2 times (3µs+0s) by C4::Auth::BEGIN@24 at line 155, avg 2µs/call
sub URI::Escape::CORE:qr; # opcode
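
The POD in the listing notes that chars in the 128 .. 255 range are escaped differently by uri_escape() and uri_escape_utf8(). The following is a minimal sketch of that difference, using only core Perl so it runs without the module installed. It rebuilds a lookup table in the same way as the listing's %escapes loop and applies the RFC 3986 pattern from %Unsafe. The helper name escape_bytes is hypothetical and is not part of URI::Escape; unlike uri_escape(), it does no error handling for code points above 255.

```perl
use strict;
use warnings;

# Same idea as the %escapes loop in the listing: map each byte to its %XX form.
my %escapes = map { chr($_) => sprintf("%%%02X", $_) } 0 .. 255;

# Hypothetical helper (not part of URI::Escape): escape everything outside
# the RFC 3986 unreserved set. Assumes $bytes holds only code points 0..255.
sub escape_bytes {
    my ($bytes) = @_;
    (my $out = $bytes) =~ s/([^A-Za-z0-9\-._~])/$escapes{$1}/g;
    return $out;
}

# "caf" followed by e-acute, stored as the single character 0xE9.
my $latin1 = "caf" . chr(0xE9);

# Escaping the string as-is treats 0xE9 as one byte.
my $as_bytes = escape_bytes($latin1);

# Encoding to UTF-8 first turns 0xE9 into the two bytes 0xC3 0xA9,
# which is the approach the listing's uri_escape_utf8() takes.
my $utf8 = $latin1;
utf8::encode($utf8);
my $as_utf8 = escape_bytes($utf8);

# Unescape a copy with the regex given in the POD, then decode the UTF-8 bytes.
(my $decoded = $as_utf8) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
utf8::decode($decoded);

print "$as_bytes\n";    # caf%E9
print "$as_utf8\n";     # caf%C3%A9
print "round trip ok\n" if $decoded eq $latin1;
```

The two outputs differ only for the non-ASCII character, which matches the POD's statement that there is no difference for chars in the 0 .. 127 range.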