Discussion:
Question about MUC and xmpp4r (encoding issue)
Björn Herzig
2010-01-02 00:40:47 UTC
OK, I was able to pinpoint the problem.

It's basically an encoding problem. Whenever I receive a German umlaut, an exception is
thrown and I get disconnected from the server:

W, [2010-01-02T01:37:17.139421 #2071] WARN -- : EXCEPTION:
REXML::ParseException
#<Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)>
/Users/raichoo/.local/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/raichoo/.local/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/raichoo/.local/lib/ruby/1.9.1/rexml/parsers/baseparser.rb:432:in `pull'
/Users/raichoo/.local/lib/ruby/1.9.1/rexml/parsers/sax2parser.rb:92:in `parse'
/UConnecting to localhost:6667

I'm not a Ruby expert. How can I work around this?

Regards,
Björn
Steve Gibson
2010-01-02 02:14:51 UTC
Björn,

Here's what I ended up doing:

msg.gsub!(/\x01/,'*')
msg.gsub!(/[\x02-\x1F]|[\x7F-\xFF]/,'')

Not a good long-term solution, but it works for me for now...
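
For reference, the same filter can be sketched so that it operates on raw bytes regardless of how the incoming string is tagged. This is only an illustration under assumptions: `strip_to_printable_ascii` is a hypothetical helper name, not part of xmpp4r, and the `/n` flag is used so the byte range `\x7F-\xFF` is read as single bytes. Like the two lines above, it throws away every umlaut.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of xmpp4r): same filtering as the two
# gsub! calls above, applied to a binary-tagged copy of the message.
def strip_to_printable_ascii(msg)
  # Work on a copy tagged as binary so byte-wise patterns are safe.
  bytes = msg.dup.force_encoding(Encoding::ASCII_8BIT)
  # Replace the \x01 marker byte with '*'.
  bytes.gsub!(/\x01/, '*')
  # Drop remaining control bytes and everything outside 7-bit ASCII.
  bytes.gsub!(/[\x02-\x1F\x7F-\xFF]/n, '')
  # Only ASCII bytes are left, so tag the result accordingly.
  bytes.force_encoding(Encoding::US_ASCII)
end

sample  = "Gr\xC3\xBC\xC3\x9Fe\x01\x02!"
cleaned = strip_to_printable_ascii(sample)
```

The input is left untouched because the helper works on a copy.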

-Steve
_______________________________________________
Xmpp4r-devel mailing list
https://mail.gna.org/listinfo/xmpp4r-devel
--
Steve Gibson
Email & XMPP: steve-l+ZVnuUkFBdXqviUI+***@public.gmane.org
Andreas Wiese
2010-01-03 14:09:38 UTC
That's a Ruby 1.9 issue (Ruby 1.9 now supports different file and
data encodings). Ruby seems to be unsure about the encoding of the
String you want to match, but knows that it isn't plain 7-bit ASCII, and
thus interprets it simply as binary (a.k.a. ASCII-8BIT). You'll have to
force the String you want to match to be interpreted as UTF-8. This
can be achieved with

str = str.force_encoding('utf-8')
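
A minimal sketch of the effect, assuming the bytes really are UTF-8: `as_utf8` is a hypothetical helper and the sample string is made up for illustration; neither comes from xmpp4r or REXML. It reproduces the kind of error shown in the trace by matching a UTF-8 regexp against a binary-tagged string, then retags the bytes and matches again.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of xmpp4r or REXML): retag raw bytes as
# UTF-8 without changing them, and refuse input that is not valid UTF-8.
def as_utf8(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'input is not valid UTF-8' unless text.valid_encoding?
  text
end

# UTF-8 bytes for a word with umlauts, tagged as binary (ASCII-8BIT),
# similar to data read straight from a socket.
raw = "Gr\xC3\xBC\xC3\x9Fe".force_encoding(Encoding::ASCII_8BIT)

# A regexp containing a non-ASCII character has a fixed UTF-8 encoding.
pattern = /\u00FC/

# Matching it against the binary string raises the error from the trace.
match_error = begin
  raw =~ pattern
  nil
rescue Encoding::CompatibilityError => e
  e
end

# After retagging, the same match succeeds.
text = as_utf8(raw)
position = text =~ pattern
```

Note that `force_encoding` only changes the encoding tag, not the bytes, so it helps only if the data was UTF-8 to begin with. Where exactly to apply it inside the xmpp4r/REXML read path is not covered here.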

HTH.

HAND & LG -- aw
np: nothing
--
Marriage (3): You only learn the truth about marriage by observing it
up close: at home with your parents. Or too late: three years after
the wedding.
— Janosch / »Wörterbuch der Lebenskunst(-griffe)«