Some bytes have been replaced with the Unicode substitution character

Question

Thursday, September 18, 2014 6:13 AM

I am running English Windows 7 with Japanese fonts installed and the IME enabled, and I am using Visual Studio 2012 (details at the end of this note). I needed to enter Shift-JIS kanji into my program files, so I had to get the IME to switch from Unicode (the default, of course) to Shift-JIS. To do this, I changed the system locale by setting "Language for non-Unicode programs" in the Control Panel from English to Japanese. It worked, but it triggers alarming behaviour in the Visual Studio 2012 editor. Now, when I open any program source code file by double-clicking it, I get a pop-up box that says "Some bytes have been replaced with the Unicode substitution character while loading file <filename> with Japanese Shift-JIS encoding.  Saving the file will not preserve the contents."

A little research shows that this pop-up is triggered when a file contains a byte with a value higher than x'7F' (decimal 127). Apparently, if even one character in the file has the high-order bit set, the file is immediately branded as a Unicode file, and my source code is corrupted automatically(!), without my asking and with no way to stop it.
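To make the symptom concrete, here is a minimal Python sketch of what a decoder does with such a file. It is only an illustration, not what Visual Studio does internally: the sample bytes, the function names, and the guess that the files were originally saved in a single-byte code page such as Windows-1252 are all assumptions. It lists the bytes above 0x7F and then decodes the same bytes as Shift-JIS with replacement enabled.

```python
# Hypothetical sample: a comment line holding one 0xE4 byte
# (a small "a" with two dots in Windows-1252), followed by a space.
SAMPLE = b"// \xe4 marks the spot\n"


def high_byte_offsets(data: bytes) -> list[int]:
    """Return the offsets of all bytes with the high-order bit set."""
    return [i for i, b in enumerate(data) if b > 0x7F]


def decode_with_replacement(data: bytes, encoding: str) -> str:
    """Decode data, turning undecodable bytes into U+FFFD instead of failing."""
    return data.decode(encoding, errors="replace")


if __name__ == "__main__":
    print(high_byte_offsets(SAMPLE))
    # As Shift-JIS, 0xE4 is a lead byte that needs a valid trail byte;
    # a space is not one, so the decoder substitutes U+FFFD.
    print(repr(decode_with_replacement(SAMPLE, "shift_jis")))
    # As Windows-1252 (assumed original encoding), the same byte decodes cleanly.
    print(repr(decode_with_replacement(SAMPLE, "cp1252")))
```

Note that not every byte above 0x7F is invalid in Shift-JIS: bytes in the range 0xA1 to 0xDF, for example, decode as single half-width katakana characters, so a file can contain high bytes and still load without the warning.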

I would really like to know how to turn off this behavior in Visual Studio 2012 when I'm in Shift-JIS mode.

ecbyahoo

English Windows 7 Enterprise SP1

Japanese Fonts Installed

IME active

Language for non-Unicode programs:  Japanese (Japan)

Microsoft Visual Studio Professional 2012
Version 11.0.61030.00 Update 4
Microsoft .NET Framework
Version 4.5.50938

Installed Version: Professional

All replies (2)

Thursday, September 18, 2014 7:48 AM ✅ Answered

I had no idea such a facility existed.  I will try it.  Be aware, though, that I have many existing program source files that already contain extended graphics characters, such as a small "a" with two circles over it.  Apparently it isn't saving the file but merely double-clicking it to open it for editing that triggers this behaviour.  Clearly, when in Shift-JIS input mode, Visual Studio, or Windows, is performing a Unicode determination check.  I turned off auto-detection of UTF-8 files on the general settings tab of my Visual Studio tool settings, but no luck.  The UTF-8 check still occurs, along with that automatic and (in my opinion) unwanted corruption of my program source code.

Update: It worked!  If I pick an editor with encoding, I can avoid the Unicode purity police!
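For anyone wondering why choosing the encoding explicitly helps, here is a rough Python sketch of the idea. It assumes, hypothetically, that the existing files were written in Windows-1252; the sample bytes and the function name are made up for illustration. A file survives a load-and-save cycle only if every byte decodes and re-encodes to itself in the chosen encoding.

```python
# Hypothetical file contents: ASCII source text plus one 0xE4 byte
# followed by a space.
ORIGINAL = b"x = 1  # \xe4 \n"


def round_trips(data: bytes, encoding: str) -> bool:
    """Return True if decoding and re-encoding reproduces the same bytes."""
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace") == data


if __name__ == "__main__":
    # Assumed original encoding: every byte maps to a character and back.
    print("cp1252 preserves the file:", round_trips(ORIGINAL, "cp1252"))
    # Shift-JIS: 0xE4 followed by a space is not a valid sequence, so it
    # becomes U+FFFD on load and cannot be written back as the original byte.
    print("shift_jis preserves the file:", round_trips(ORIGINAL, "shift_jis"))
```

This matches the warning text: once a byte has been replaced by the substitution character, saving cannot restore it, whereas opening with the encoding the file was actually written in avoids the replacement in the first place.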

Thank you thank you thank you! :)

ecbyahoo


Thursday, September 18, 2014 7:33 AM | 1 vote

Have you tried saving the file with UTF-8 or Unicode encoding? See http://msdn.microsoft.com/en-us/library/vstudio/w11571b4(v=vs.100).aspx (if applicable in your case).