Java Clipboard: Paste HTML from Firefox on Linux

java html linux firefox clipboard

I believe the problem is that the clipboard is read as US-ASCII and the result is then converted to Unicode, with the expectation that German umlauts stay intact. US-ASCII is a 7-bit charset, so it does not contain German umlauts; they are already lost once the clipboard has been read as US-ASCII.

public class CharsetDemo {

    public static void main(String[] args) throws Exception {
        byte[] bytes;

        // convert the German umlaut to bytes in US-ASCII charset
        bytes = "ö".getBytes("US-ASCII");
        System.out.println("US-ASCII");
        System.out.println("bytes : " + asHexString(bytes));
        System.out.println("string: " + new String(bytes, "US-ASCII"));
        System.out.println();

        // create a unicode string from the US-ASCII bytes
        String utf8String = new String(bytes, "UTF-8");
        bytes = utf8String.getBytes("UTF-8");
        System.out.println("UTF-8");
        System.out.println("bytes : " + asHexString(bytes));
        System.out.println("string: " + utf8String);
        System.out.println();

        // convert the German umlaut to bytes in ISO-8859-1 charset
        bytes = "ö".getBytes("ISO-8859-1");
        System.out.println("ISO 8859-1");
        System.out.println("bytes : " + asHexString(bytes));
        System.out.println("string: " + new String(bytes, "ISO-8859-1"));
        System.out.println();

        // create a unicode string from the ISO-8859-1 bytes
        utf8String = new String(bytes, "UTF-8");
        bytes = utf8String.getBytes("UTF-8");
        System.out.println("UTF-8");
        System.out.println("bytes : " + asHexString(bytes));
        System.out.println("string: " + utf8String);
        System.out.println();

        // bytes of the "REPLACEMENT CHARACTER" (U+FFFD) in UTF-8
        System.out.println("replacement character bytes: "
            + asHexString("\uFFFD".getBytes("UTF-8")));
    }

    // format each byte as upper-case hex followed by a space
    static String asHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%X ", b));
        }
        return sb.toString();
    }
}

Output:

US-ASCII
bytes : 3F
string: ?  <--- the encoder substitutes "?" (0x3F) for the umlaut, which US-ASCII cannot represent

UTF-8
bytes : 3F
string: ?

ISO 8859-1
bytes : F6
string: ö

UTF-8
bytes : EF BF BD  <-- the "REPLACEMENT CHARACTER", as a lone "F6" byte is not a valid UTF-8 sequence
string: �

replacement character bytes: EF BF BD
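The demo above shows the encoding direction. Reading the clipboard goes the other way, from bytes to a string, so the following is a minimal sketch of that direction. It assumes a hypothetical payload in which "ö" arrives as the UTF-8 bytes C3 B6; the actual charset Firefox uses for its clipboard flavors is not stated here. The class and method names (`ClipboardDecodeDemo`, `decodeLenient`, `decodeStrict`) are made up for illustration, and `StandardCharsets` requires Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ClipboardDecodeDemo {

    // Lenient decoding: bytes the charset cannot decode are silently
    // replaced with U+FFFD, so the loss goes unnoticed.
    public static String decodeLenient(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    // Strict decoding: bytes the charset cannot decode raise an exception
    // instead of being replaced.
    public static String decodeStrict(byte[] data, Charset charset)
            throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data))
                .toString();
    }

    public static void main(String[] args) {
        // Assumed payload: U+00F6 ("o" with diaeresis) encoded as UTF-8 (C3 B6).
        byte[] payload = "\u00F6".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decodeLenient(payload, StandardCharsets.UTF_8);
        String asAscii = decodeLenient(payload, StandardCharsets.US_ASCII);

        System.out.println("UTF-8 decode keeps the umlaut: "
                + asUtf8.equals("\u00F6"));
        System.out.println("US-ASCII decode yields two U+FFFD: "
                + asAscii.equals("\uFFFD\uFFFD"));

        try {
            decodeStrict(payload, StandardCharsets.US_ASCII);
            System.out.println("strict US-ASCII decode succeeded");
        } catch (CharacterCodingException e) {
            System.out.println("strict US-ASCII decode failed");
        }
    }
}
```

Decoding with the charset the data was actually written in preserves the umlaut, while decoding as US-ASCII turns each non-ASCII byte into a replacement character that no later conversion can undo. The strict variant is one way to detect such a mismatch early instead of passing damaged text along.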


Java 6 is no longer supported, so the question is obsolete.
