Strange problem printing Unicode characters
Help! My students are trying to implement the Huffman coding algorithm and
one of them is having a bizarre error that I can't figure out. Here's the
problem code segment:
int tmpDec = convertBinToDec(tmpSub);
char tmpChar = (char)tmpDec;
System.out.print(tmpChar);
pr.print(tmpChar);
The specific problem is that one of his calls to convertBinToDec() is
returning the decimal value 135, which codes to an unprintable control
character. For some reason, the program is converting that to character
value = 63, which is the question mark (?). Obviously, this results in an
incorrectly encoded message.
I didn't see this happen in my program, but maybe my test file never hit
any of these unprintable control characters. Can someone please explain to
me what's going on, and better yet, how to fix it?
Thanks!
Roger
Re: Strange problem printing Unicode characters
If you want help, you'll have to provide an SSCCE that demonstrates the problem. This should only be a few lines in a main method of a bare-bones class.
And we're not going to copy anything to your personal email. We aren't your underlings, and quite frankly, requesting we do something to make your life easier is rude and will decrease your chances of getting help.
Re: Strange problem printing Unicode characters
Sorry about the last part. I was in a hurry and had copied and pasted a message to a listserv. My bad.
Re: Strange problem printing Unicode characters
That's fine, but I guess in that case you should read this: http://www.javaprogrammingforums.com...s-posting.html
But like I said, we can't really help you until you show us an SSCCE. More specifically, what is tmpSub?
Re: Strange problem printing Unicode characters
Quote: Originally Posted by KevinWorkman
If you want help, you'll have to provide an SSCCE that demonstrates the problem. This should only be a few lines in a main method of a bare-bones class.
How about this?
public class WhatIsTheBug
{
    public static void main(String[] args)
    {
        int tmpDec = 135; // that's the problem value
        char tmpChar = (char)tmpDec;
        System.out.println(tmpChar); // why does this print '?' (char value 63) instead of char value 135?
    }
}
Re: Strange problem printing Unicode characters
What would you expect it to print anyway?
To get a better understanding of what's going on, I suggest you run something like this:
Code java:
public class Main {
    public static void main(String[] args)
    {
        for(int i = 0; i < 200; i++){
            System.out.println(i + ": " + (char)i);
        }
    }
}
Re: Strange problem printing Unicode characters
Maybe that's my question. In Huffman coding, after you convert the original characters to their Huffman codes and generate a long binary string (like "11011010010001"), you then need to break up that string into eight-bit chunks and translate those chunks back into their corresponding character values.
It seems like if one of those character values happens to be a non-printing control character (like 135), then the program punts and converts it into some printable character like '?' that doesn't really match. I wonder, is there any way of getting the actual Unicode character into the file? Otherwise that seems like a major roadblock to implementing the algorithm.
This is my first time trying this project with my students, so I haven't had to deal with some of these problems before. Thanks for all your replies and also for tolerating my newbish behavior on the forums. FWIW, I've read the rules and guidelines article now and I think I'm trying to follow all the rules. The only other place I've posted this question is to the AP Computer Science teachers listserv.
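For reference, the eight-bit chunking step described above can be sketched without going through char at all. This is only a minimal illustration, not the student's code: the class name BitPacker and the method pack are hypothetical, and padding the last partial chunk with zeros on the right is an assumption (a real Huffman encoder also has to record how many bits are valid).

```java
public class BitPacker {
    // Packs a string of '0'/'1' characters into bytes, eight bits per byte.
    // Assumption: a final partial chunk is padded on the right with zeros.
    public static byte[] pack(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                // Set the bit for position i, most significant bit first.
                out[i / 8] |= (byte) (0x80 >> (i % 8));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] packed = pack("1000011101");
        // Java bytes are signed, so mask with 0xFF to see values 0..255.
        for (byte b : packed) {
            System.out.println(b & 0xFF); // prints 135, then 64
        }
    }
}
```

Keeping the chunks as byte values means 135 never has to be interpreted as a character, so no charset gets a chance to substitute anything.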
Re: Strange problem printing Unicode characters
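What encoding are you trying to use? As far as I can tell, the cast itself doesn't involve an encoding: (char)135 is just the char with value 135 (U+0087). The '?' most likely appears when the print stream encodes that char using the platform's default charset and finds no mapping for it. ISO-8859-1, for instance, doesn't assign a printable character to 135: ISO/IEC 8859-1 - Wikipedia, the free encyclopedia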
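One way to check where the 63 comes from is to encode the char explicitly with a few charsets and look at the resulting bytes. This is a rough sketch; the class and method names are made up for illustration, and which charset your console actually uses depends on the platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WhereTheQuestionMarkComesFrom {
    // Encodes one char with the given charset; returns the bytes as values 0..255.
    public static int[] encode(char c, Charset charset) {
        byte[] raw = String.valueOf(c).getBytes(charset);
        int[] unsigned = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            unsigned[i] = raw[i] & 0xFF;
        }
        return unsigned;
    }

    public static void main(String[] args) {
        char c = (char) 135;
        System.out.println((int) c); // 135: the cast keeps the value

        // US-ASCII cannot represent U+0087, so getBytes substitutes '?' (63).
        System.out.println(Arrays.toString(encode(c, StandardCharsets.US_ASCII)));   // [63]
        // Java's ISO-8859-1 charset maps chars 0..255 straight to one byte.
        System.out.println(Arrays.toString(encode(c, StandardCharsets.ISO_8859_1))); // [135]
        // UTF-8 needs two bytes for this char.
        System.out.println(Arrays.toString(encode(c, StandardCharsets.UTF_8)));      // [194, 135]

        // Platform dependent: the charset used when none is specified.
        System.out.println("default charset: " + Charset.defaultCharset());
    }
}
```

If the default charset printed on the last line is one that cannot represent U+0087, that would explain a '?' showing up in both the console and the file.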
Re: Strange problem printing Unicode characters
Quote: Originally Posted by KevinWorkman
*blush* I have to admit, I never thought to use anything other than the default encoding. You're right, apparently 8859-1 doesn't have encodings for some hexadecimal values, including 135 (0x87).
Hmm. I suppose that the only solution here is to read and write the file bit by bit, instead of trying to convert those chunks into encoded characters. I'll try that.
Thanks!
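A possible middle ground, offered as a sketch rather than the definitive fix: Java's byte streams write whole bytes with no character encoding involved, so it may be enough to write each eight-bit chunk through an OutputStream instead of a print stream or writer. The class name, method name, and temp-file name below are invented for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RawByteRoundTrip {
    // Writes the values to a temporary file as raw bytes, then reads them back.
    public static byte[] roundTrip(int[] values) {
        try {
            Path tmp = Files.createTempFile("huffman", ".bin");
            try (OutputStream out = Files.newOutputStream(tmp)) {
                for (int v : values) {
                    out.write(v); // writes only the low eight bits of v
                }
            }
            byte[] back = Files.readAllBytes(tmp);
            Files.deleteIfExists(tmp);
            return back;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] back = roundTrip(new int[] {135, 65});
        for (byte b : back) {
            System.out.println(b & 0xFF); // prints 135, then 65
        }
    }
}
```

The decoder would then read with an InputStream (or Files.readAllBytes, as here) and mask each byte with 0xFF before turning it back into bits.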
Re: Strange problem printing Unicode characters
The first answer in this posting gives a pretty good explanation and possible solution: conversion - Converting int to char in java - Stack Overflow