Croatian characters 8-bit encoding

We all speak UTF-8 these days, don't we? Well, not really... I got a CSV file export and couldn't guess its encoding just by looking at it any more. So I wrote a gist that decodes the file from each of the Croatian 8-bit encodings and dumps the result as UTF-8:

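#!/bin/sh -x
# -x traces each command to stderr as it runs

# file to inspect
file=$1

# print the encoding name, then the first lines of the file
# decoded from that encoding and converted to UTF-8
# (POSIX function syntax: the "function" keyword fails under dash)
encoding() {
        echo "# $1"
        head "$file" | iconv -f "$1" -t utf-8
}

encoding cp850
encoding cp852
encoding cp1250
encoding cp1252
encoding iso-8859-1
encoding iso-8859-2
encoding mac
encoding MAC-CENTRALEUROPE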

Example usage:

./test-8bit-encodings.sh data/ESB_izvadak-tekuci.csv | vi -R -c 'set nowrap' -
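
Eyeballing eight decodings still takes a moment, so here is a rough sketch of a follow-up that is not part of the original gist: count how many Croatian letters (č, ć, ž, š, đ and their capitals) each decoding produces and sort the candidates by that count. The helpers count_croatian and rank_encodings are hypothetical names made up for this illustration, and the heuristic assumes the file really does contain Croatian text and that the script itself is saved as UTF-8. If iconv gives up part-way on an encoding, only the part it managed to convert is counted.

```shell
#!/bin/sh
# Sketch only: rank candidate encodings by the number of Croatian letters
# each decoding yields. Helper names are hypothetical, not from the gist.

# $1 = file, $2 = source encoding; prints the number of Croatian diacritics found
count_croatian() {
        iconv -f "$2" -t utf-8 "$1" 2>/dev/null |
                grep -o -F -e č -e ć -e ž -e š -e đ -e Č -e Ć -e Ž -e Š -e Đ |
                wc -l | tr -d ' '
}

# $1 = file; prints "count encoding" for every candidate, highest count first
rank_encodings() {
        for enc in cp850 cp852 cp1250 cp1252 iso-8859-1 iso-8859-2 mac MAC-CENTRALEUROPE; do
                echo "$(count_croatian "$1" "$enc") $enc"
        done | sort -rn
}

# run only when a file argument is given
if [ -n "${1:-}" ]; then
        rank_encodings "$1"
fi
```

The encoding at the top of the list is only a hint: several code pages share some of these letters at the same byte positions, so it is still worth checking the winner by eye with the script above.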

About

This page contains a single entry from the blog posted on June 23, 2010 5:53 PM.

Creative Commons License
This weblog is licensed under a Creative Commons License.