Bush hid the facts
Bush hid the facts is a common name for a bug present in some versions of Microsoft Windows, which causes text encoded in ASCII to be interpreted as if it were UTF-16LE, resulting in garbled text. When the string «Bush hid the facts», without quotes, was put in a new Notepad document and saved, closed, and reopened, the nonsensical sequence of Chinese characters «畂桳栠摩琠敨映捡獴» would appear instead.
While «Bush hid the facts» is the sentence most commonly presented on the Internet to induce the error, the bug can be triggered by other strings with letters and spaces in the same positions, for example «hhhh hhh hhh hhhhh» [1] or «this app can break». [2] Other sequences trigger the bug as well, including simply the text «a ». (The most commonly used sentence is a reference to U.S. President George W. Bush's statements about nuclear weapons in Iraq.)
The bug occurs when the string is passed to the Win32 charset detection function IsTextUnicode. IsTextUnicode sees that the bytes match the UTF-16LE encoding of assigned Unicode code points, concludes that the text is valid UTF-16LE, and returns true, and the application then incorrectly interprets the text as UTF-16LE. [3]
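The misreading itself can be reproduced on any platform. The following is a minimal Python sketch, not a model of IsTextUnicode: it only shows what an application ends up displaying once it has decided to treat the ASCII bytes as UTF-16LE. The helper name `misread_as_utf16le` is made up for this illustration.

```python
def misread_as_utf16le(text: str) -> str:
    """Encode text as ASCII, then decode the very same bytes as UTF-16LE."""
    raw = text.encode("ascii")
    # UTF-16LE reads two bytes per code unit, low byte first,
    # so each pair of ASCII characters collapses into one character.
    return raw.decode("utf-16-le")


garbled = misread_as_utf16le("Bush hid the facts")
print(garbled)
# Show the code point that each pair of ASCII bytes turned into.
print([hex(ord(ch)) for ch in garbled])
```

The 18 ASCII bytes become 9 characters. For example, «B» (0x42) and «u» (0x75) combine into U+7542, which falls in the CJK Unified Ideographs block, which is why the result looks like Chinese text.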
The bug had existed since IsTextUnicode was introduced with Windows NT 3.5 in 1994, but was not discovered until early 2004. [4] Many text editors and tools exhibit this behavior on Windows because they use IsTextUnicode to determine the encoding of text files. As of Windows Vista, Notepad has been modified to use a different detection algorithm that does not exhibit the bug, but IsTextUnicode remains unchanged in the operating system, so any other tools that use the function are still affected. [5]
Bush hid the facts
Bush hid the facts is a common name for a bug present in some Microsoft Windows applications, which causes a file of text encoded in ASCII or its superset (such as in a Windows code page) to be interpreted as if it were UTF-16LE, resulting in mojibake. When «Bush hid the facts» (without newline or quotes) is put in a new (pre-Vista) Notepad document and saved, closed, and reopened, the nonsensical Chinese characters «畂桳栠摩琠敨映捡獴» appear instead.
While «Bush hid the facts» is the sentence most commonly presented on the Internet to induce the error, the bug can be triggered by many sentences with characters and spaces in a particular order so that the bytes match the UTF-16LE encoding of valid (if nonsensical) Chinese Unicode characters. Other popular strings are «this app can break», «acre vai pra globo» (Portuguese for «Acre goes to Rede Globo»), and «aaaa aaa aaa aaaaa». [1] The bug is triggered even by the text «a ».
The bug occurs when the string is passed to the Win32 charset detection function IsTextUnicode with no other characters. IsTextUnicode sees what it thinks is valid UTF-16LE Chinese and returns true, and the application then incorrectly interprets the text as UTF-16LE. [2]
Many text editors and tools exhibit this behavior because they use IsTextUnicode as well.
Discovery
The bug has existed since IsTextUnicode was introduced with Windows NT 3.5 in 1994, but was not discovered until early 2004. [3]
Workarounds
Notepad in Windows Vista SP1 and later includes a workaround for the IsTextUnicode bug. [1]
Editing the text to not be a pattern that triggers this bug will avoid it. For instance, adding a new line in the first 20 characters will work.
If the file is saved as «UTF-8» rather than «ANSI» the text loads correctly, because Notepad prepends a UTF-8 byte order mark, which is a pattern that does not trigger the bug. UTF-8 without the byte order mark would still trigger the bug, as this sequence is represented identically in UTF-8 as in ASCII.
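The byte-level difference between these save options can be checked with a short Python sketch. It does not model Notepad or IsTextUnicode; it only compares the bytes each encoding produces. `utf-8-sig` is Python's codec name for UTF-8 with a leading byte order mark.

```python
import codecs

text = "Bush hid the facts"

ascii_bytes = text.encode("ascii")
utf8_plain = text.encode("utf-8")          # no byte order mark
utf8_with_bom = text.encode("utf-8-sig")   # prepends the bytes EF BB BF

# For pure ASCII text, UTF-8 without a BOM is byte-for-byte identical to ASCII,
# so a detector sees exactly the same input in both cases.
same_as_ascii = utf8_plain == ascii_bytes

# With the BOM, the file starts with three extra bytes.
starts_with_bom = utf8_with_bom.startswith(codecs.BOM_UTF8)

print(same_as_ascii, starts_with_bom, len(ascii_bytes), len(utf8_with_bom))
```

The BOM also changes the length from 18 bytes to 21, an odd number, so the file can no longer be split cleanly into two-byte UTF-16 code units.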
The bug is also avoided by saving as «Unicode», which in Microsoft Windows means UTF-16LE. When loading this text IsTextUnicode should (and does) return true and the text is correct.
To retrieve the original text using Notepad, bring up the «Open a file» dialog box, select the file, select «ANSI» or «UTF-8» in the «Encoding» list box, and click Open. Under Windows 2000, Notepad lacks the «Encoding» list box. Notepad2 also lacks this. WordPad appears to load the text correctly without choosing the encoding, since it uses its own encoding detection.
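The file on disk still holds the original bytes; only their interpretation is wrong. If only the garbled text is at hand (for example, copied out of the editor), the same idea can be applied in reverse. This is a minimal Python sketch, assuming the garbled text was produced purely by reading ASCII bytes as UTF-16LE; the helper name `recover_ascii` is hypothetical.

```python
def recover_ascii(garbled: str) -> str:
    """Undo an ASCII-read-as-UTF-16LE mix-up by reversing the decode step."""
    # Re-encoding as UTF-16LE restores the original byte sequence...
    raw = garbled.encode("utf-16-le")
    # ...which can then be decoded with the encoding that was actually used.
    return raw.decode("ascii")


recovered = recover_ascii("畂桳栠摩琠敨映捡獴")
print(recovered)
```

If the original file used a Windows code page rather than plain ASCII, the final `decode` call would need that code page's name instead of `"ascii"`.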
BUSH HID THE FACTS
This is one of the most popular Notepad tricks because of its mysterious nature. To get an idea of what this trick does, just follow the steps given below:
Open Notepad.
Type “BUSH HID THE FACTS” or “this app can break” (without quotes).
Save that file with any name and close it.
Open it again to see the magic.
Reason for this behavior: It is known as the 4335 Rule. If we enter four words separated by spaces, where the first word has four letters, the next two have three letters each, and the last word has five letters, Notepad automatically turns the text into unreadable characters.
Bush Hid The Facts — Notepad Conspiracy Claim
Summary:
Message claims that the strange result when a Windows Notepad file with the text «Bush hid the facts» is re-opened may represent a deliberate anti-Bush political statement by Microsoft. (Full commentary below.)
Status:
False
Example: (Submitted, June 2006)
Subject: political conspiracy?
hey this is really weird!!
open notepad
type «bush hid the facts» without quotation marks
don’t press «enter» save the file
close notepad
open the file again
what do you think?
Commentary:
This little Windows Notepad «trick» is often posted to online forums and blogs and also travels via email. When the phrase «Bush hid the facts» is typed into the Windows XP or Windows NT/2000 versions of Notepad as instructed above, the re-opened file displays an unreadable line of squares or Chinese-style characters.
The first image below shows the text before closing the Notepad file. The second image shows the text as it is displayed after the file is re-opened:
Some of the more wide-eyed conspiracy theorists postulate that this result is a form of political commentary directed against US President Bush and was knowingly and deliberately programmed into Notepad by Microsoft.
Alas, the truth is far less compelling. It appears that a lot of other character strings in the pattern 4 letters, 3 letters, 3 letters and 5 letters will give the same result. For example, the phrase «Bill fed the goats» also displays the garbled text as shown below:
In fact, even a line of text such as «hhhh hhh hhh hhhhh» will elicit the same results.
Since I first published this article, a few readers have pointed out that some character strings that fit the «4,3,3,5» pattern do not generate the error. For example, the phrase «Bush hid the truth» is displayed normally. However, conspiracy theorists should not take this as aiding their argument. «Fred led the brats», «brad ate the trees» and other strings also escape the error.
Thus, any hint of political conspiracy fades into oblivion and is replaced by a rather mundane programming bug. It seems probable that a certain combination and/or frequency of letters in the character string causes Notepad to misinterpret the encoding of the file when it is re-opened. If the file is originally saved as «Unicode» rather than «ANSI», the text displays correctly. Older versions of Notepad, such as those that came with Windows 95, 98 or ME, do not include Unicode support, so the error does not occur.
So, nothing weird here at all, except perhaps for the fact that someone, somewhere had nothing better to do than turn a simple software glitch into another lame conspiracy theory.