Default character encoding for java console output


I'm assuming that your console still runs under cmd.exe. I doubt your console is really expecting UTF-8 - more likely it is using an OEM DOS encoding (e.g. 850 or 437).

Java will encode bytes using the default encoding set during JVM initialization.
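
If you want to see which default a given JVM picked up, a minimal sketch like the following will print it (the class name is just for illustration; the output depends on your platform, locale and any -Dfile.encoding flag):

import java.nio.charset.Charset;

class ShowEncoding {
  public static void main(String[] args) {
    // The charset the JVM selected at start-up (the default encoding mentioned above)
    System.out.println(Charset.defaultCharset());
    // The system property it is normally derived from
    System.out.println(System.getProperty("file.encoding"));
  }
}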

Reproducing on my PC:

java Foo

Java encodes as windows-1252; console decodes as IBM850. Result: mojibake.

java -Dfile.encoding=UTF-8 Foo

Java encodes as UTF-8; console decodes as IBM850. Result: mojibake.

cat test.txt

cat decodes the file as UTF-8; cat encodes as IBM850; console decodes as IBM850. Result: correct output.

java Foo | cat

Java encodes as windows-1252; cat decodes as windows-1252; cat encodes as IBM850; console decodes as IBM850. Result: correct output.

java -Dfile.encoding=UTF-8 Foo | cat

Java encodes as UTF-8; cat decodes as UTF-8; cat encodes as IBM850; console decodes as IBM850. Result: correct output.
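
(The source of Foo is not shown in the question; a minimal class that would reproduce the behaviour could be as simple as the following - the class name and the string are only assumptions.)

class Foo {
  public static void main(String[] args) {
    // Print a string containing non-ASCII characters; the bytes written
    // depend on the JVM's default encoding described above.
    System.out.println("xxäñxx");
  }
}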

This implementation of cat must use heuristics to determine whether the character data is UTF-8 or not, and then transcode the data from either UTF-8 or ANSI (e.g. windows-1252) to the console encoding (e.g. IBM850).

This can be confirmed with the following commands:

$ java HexDump utf8.txt
78 78 c3 a4 c3 b1 78 78
$ cat utf8.txt
xxäñxx
$ java HexDump ansi.txt
78 78 e4 f1 78 78
$ cat ansi.txt
xxäñxx

The cat command can make this determination because e4 f1 is not a valid UTF-8 sequence.
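
As an illustration only (this is not how that cat is implemented), the same kind of check can be expressed in Java with a CharsetDecoder, which rejects malformed input such as e4 f1:

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

class Utf8Check {
  public static void main(String[] args) {
    byte[] utf8 = {0x78, 0x78, (byte) 0xc3, (byte) 0xa4, (byte) 0xc3, (byte) 0xb1, 0x78, 0x78};
    byte[] ansi = {0x78, 0x78, (byte) 0xe4, (byte) 0xf1, 0x78, 0x78};
    System.out.println(isValidUtf8(utf8)); // true
    System.out.println(isValidUtf8(ansi)); // false: e4 starts a multi-byte sequence, f1 is not a valid continuation byte
  }

  static boolean isValidUtf8(byte[] data) {
    try {
      // A freshly created decoder reports malformed input as an exception by default
      StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(data));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }
}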

You can correct the Java output by making the encoding Java uses for System.out match the encoding the console actually decodes.
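
For example, one way to do that (a sketch; "IBM850" is an assumption - use whatever code page your console reports, e.g. via chcp) is to replace System.out with a PrintStream that encodes to the console's code page:

import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ConsoleOut {
  public static void main(String[] args) throws UnsupportedEncodingException {
    // Re-wrap standard out so that it encodes to the console's code page
    System.setOut(new PrintStream(new FileOutputStream(FileDescriptor.out), true, "IBM850"));
    System.out.println("xxäñxx");
  }
}

When the program is attached to an interactive console, java.io.Console (obtained via System.console()) also provides a writer intended to use the console's own encoding.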

HexDump is a trivial Java application:

import java.io.*;

class HexDump {
  public static void main(String[] args) throws IOException {
    try (InputStream in = new FileInputStream(args[0])) {
      int r;
      // Print every byte of the file as two hex digits
      while ((r = in.read()) != -1) {
        System.out.format("%02x ", 0xFF & r);
      }
      System.out.println();
    }
  }
}