DOC - dialects of China
Dylan Sung
Posted: Mon Jan 19, 2004 1:43 pm
A while ago, I downloaded the text version of the Dialects of China project,
which derives most of its data from Hanyu Fangyin Zihui, a list of readings
of 2700 characters across almost 20 dialects. I've converted the file and
zipped it up. Unzipped it is 5105 KB; zipped it is 460 KB. The original
file, docmas9.txt, was over 55,000 lines long. A bit of programming went a
long way.

http://www.dylanwhs.ukgateway.net/download/doc.htm

You will also need Big5 fonts, plus a font that can display IPA characters,
such as Lucida Sans Unicode.

Cheers,
Dyl.
Geoff
Posted: Wed Jan 21, 2004 10:17 am
Nice work.
Dylan Sung
Posted: Wed Jan 21, 2004 3:42 pm
"Dylan Sung" <dylanwhs.tsktsktsk@pacific.net.hk> wrote in message
news:bumdq9$jkvn1$1@ID-119091.news.uni-berlin.de...
Quote:
http://www.dylanwhs.ukgateway.net/download/doc.htm

Thanks, but the credit goes to DOC for making their txt format so easy to
manipulate. It looks like a four- or five-fold increase in size with all the
HTML tags. There are several entries where the tone of a character is marked
as X and the yin-yang split is given as H. I'm still investigating what this
refers to. The X H data isn't confined to one dialect. Perhaps some folks in
the know can shed some light on this...
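
For anyone who wants to hunt for these entries themselves, here is a minimal
Python sketch of the kind of filter involved. It is only an illustration: it
assumes a hypothetical layout of five whitespace-separated fields per line
(character, dialect, reading, tone, yin-yang split), which may well differ
from the real docmas9.txt layout, so the field positions would need adjusting.
The function names and the sample rows are invented placeholders, not DOC data.

```python
from collections import Counter


def find_x_h_entries(lines):
    """Return (dialect, character) pairs whose tone is X and split is H.

    Assumes a hypothetical record layout of five whitespace-separated
    fields: character, dialect, reading, tone, split.
    """
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            # Skip blank lines and anything that doesn't fit the assumed layout.
            continue
        char, dialect, _reading, tone, split = fields
        if tone == "X" and split == "H":
            hits.append((dialect, char))
    return hits


def count_by_dialect(hits):
    """Tally the X H entries per dialect."""
    return Counter(dialect for dialect, _char in hits)


# Invented placeholder rows, only to exercise the filter.
sample = [
    "char1 DialectA aa 1 Y",
    "char2 DialectB bb X H",
    "char3 DialectC cc X H",
    "char4 DialectB dd X H",
    "",
]

hits = find_x_h_entries(sample)
for dialect, total in sorted(count_by_dialect(hits).items()):
    print(dialect, total)
```

Counting per dialect is a quick way to check whether the X H entries cluster
in one dialect or are spread across several.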


The characters I've found with no tone or splitting are now appended to the
bottom of my info page on the DOC listing.

http://www.dylanwhs.ukgateway.net/download/doc-more.htm

I've spent the evening looking for bugs, spotted a couple, and even made the
download smaller. Please re-download the corrected version.

Cheers,
Dyl.
 
All times are GMT - 5 Hours