Explicitly declare execution character set

Consider this program:

  #include <cstdio>

  int main(void) {
    const char *example_string = "ディセント3";
    for (size_t i = 0; example_string[i] != '\0'; ++i) {
      printf("%02hhx ", example_string[i]);
    }
    puts("");

    return 0;
  }

What will that program output? The answer is: it depends. If that
program is compiled with a UTF-8 execution character set, then it will
print this:

  e3 83 87 e3 82 a3 e3 82 bb e3 83 b3 e3 83 88 33

If that program is compiled with a Shift JIS execution character set,
then it will print this:

  83 66 83 42 83 5a 83 93 83 67 33

This is especially a problem with MSVC, which doesn’t necessarily
default to using UTF-8 as a program’s execution character set [1].
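
The dependence on the execution character set can also be made visible
at build time. The following is only a minimal sketch and is not part of
this commit; the names sample and exec_charset_is_utf8 are hypothetical.
It spells the first character of the example string as a universal
character name, so the result should depend only on the execution
character set and not on how the source file itself is encoded:

```cpp
// Hypothetical sample literal (not part of Descent 3): U+30C7, the first
// character of the example string, written as a universal character name.
constexpr char sample[] = "\u30c7";

// Hypothetical helper: true if the compiler encoded U+30C7 as the three
// UTF-8 bytes e3 83 87 (plus the terminating NUL).
constexpr bool exec_charset_is_utf8() {
  return sizeof(sample) == 4 &&
         static_cast<unsigned char>(sample[0]) == 0xe3 &&
         static_cast<unsigned char>(sample[1]) == 0x83 &&
         static_cast<unsigned char>(sample[2]) == 0x87;
}

// Stops the build if the literal was encoded some other way, for example
// as the two Shift JIS bytes 83 66.
static_assert(exec_charset_is_utf8(),
              "execution character set is not UTF-8");
```

With a check like this, a build that uses a Shift JIS execution
character set would fail to compile instead of quietly producing
different bytes.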

---

Before this change, Descent 3 would use whatever the default execution
character set was. This commit ensures that the execution character set
is UTF-8 as long as Descent 3 gets compiled with MSVC, GCC or Clang. If
Descent 3 is compiled with a different compiler, then a different
execution character set might get used, but as far as I know, we only
support MSVC, GCC and Clang.

I’m not sure whether or not this change has any noticeable effects. If
using different execution character sets does have noticeable effects,
then this change will hopefully ensure that those effects are the same
for everyone.
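
One way to check whether two builds really agree is to dump the bytes of
the same literal on each compiler. The following is a rough sketch under
that assumption, not code from this commit; hex_bytes is a hypothetical
helper. It mirrors the loop from the example program, but converts each
char to unsigned char before formatting it, so that bytes above 0x7f are
never sign-extended:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Descent 3: formats each byte of a
// NUL-terminated string as two lowercase hex digits separated by spaces.
std::string hex_bytes(const char *text) {
  std::string out;
  char buffer[4];
  for (std::size_t i = 0; text[i] != '\0'; ++i) {
    // Go through unsigned char so that a negative char value cannot be
    // sign-extended when it is widened for formatting.
    const unsigned int byte = static_cast<unsigned char>(text[i]);
    std::snprintf(buffer, sizeof(buffer), "%02x", byte);
    if (!out.empty()) {
      out += ' ';
    }
    out += buffer;
  }
  return out;
}
```

Calling hex_bytes("\u30c73") should give "e3 83 87 33" when the
execution character set is UTF-8 and "83 66 33" when it is Shift JIS,
matching the first and last bytes of the two outputs shown above.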

[1]: <https://learn.microsoft.com/en-us/answers/questions/1805730/what-is-msvc-s-default-execution-character-set>
Jason Yundt 2024-07-07 12:54:52 -04:00
parent dd757e9034
commit adf58eca81

@@ -29,9 +29,18 @@ set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
 set_property(GLOBAL PROPERTY USE_FOLDERS ON)
 
 if(MSVC)
-  add_compile_options(/source-charset:UTF-8)
+  add_compile_options(/source-charset:UTF-8 /execution-charset:UTF-8)
 else()
   add_compile_options(-finput-charset=UTF-8)
+  # Unfortunately, Clang doesn’t support -fexec-charset yet so this next part
+  # is GCC only. Luckily, Clang defaults to using UTF-8 for the execution
+  # character set [1], so we’re fine. Once Clang gets support for
+  # -fexec-charset, we should probably start using it.
+  #
+  # [1]: <https://discourse.llvm.org/t/rfc-enabling-fexec-charset-support-to-llvm-and-clang-reposting/71512>
+  if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
+    add_compile_options(-fexec-charset=UTF-8)
+  endif()
 endif()
 
 if(FORCE_COLORED_OUTPUT)