The Quintessential Japanese Problem

I should be happy about the job I got today. It is simple: build a few regexes and simplify the interface to a CMS.

Simple – yet stupid.

First, why is this difficult system being used? (Politics of a sunk cost)

Second, why is the design of the interface so utterly bad? (A Japan-only product, so it doesn’t have to compete in an unbiased market)

Finally, is it morally right to sell this band-aid as a “solution” when the problem is actually a deeper malaise? (No)

I feel dirty.

How many radicals are there for the basic 2000 Japanese Kanji anyway?

I am working on a kanji dictionary program that helps sort out similar-looking (but different-meaning) words. I found an open-source dictionary (kradfile) and was analyzing it when I counted a total of 252 radicals (shown below). However, a popular online list has 214 radicals. What’s up?

[0] => 匕 [1] => ノ [2] => 勹 [3] => ヨ [4] => 亠 [5] => 厶 [6] => 川 [7] => 一 [8] => | [9] => 二 [10] => ハ [11] => 衣 [12] => ⺅ [13] => 口 [14] => 大 [15] => 矢 [16] => 廾 [17] => 文 [18] => 斉 [19] => 并 [20] => 木 [21] => 王 [22] => 羊 [23] => 耒 [24] => 日 [25] => 白 [26] => 入 [27] => 冂 [28] => 凵 [29] => 儿 [30] => 冖 [31] => 冫 [32] => 刀 [33] => 工 [34] => 丶 [35] => 力 [36] => 九 [37] => 生 [38] => 十 [39] => 又 [40] => 尸 [41] => 巾 [42] => 父 [43] => 田 [44] => 彑 [45] => 水 [46] => 月 [47] => 西 [48] => 門 [49] => 亅 [50] => 彳 [51] => 金 [52] => 土 [53] => 長 [54] => 女 [55] => 宀 [56] => 耳 [57] => 禸 [58] => 虫 [59] => 夂 [60] => ⻖ [61] => 人 [62] => 寸 [63] => 厂 [64] => 山 [65] => 干 [66] => 而 [67] => 鬼 [68] => 小 [69] => 示 [70] => 灬 [71] => 鳥 [72] => 几 [73] => 夕 [74] => 豆 [75] => 心 [76] => 忄 [77] => 爪 [78] => 扌 [79] => 手 [80] => 臼 [81] => 隹 [82] => 卜 [83] => 子 [84] => 幺 [85] => 攵 [86] => 至 [87] => 戈 [88] => 刂 [89] => 𠆢 [90] => 穴 [91] => 谷 [92] => 屮 [93] => 爿 [94] => 目 [95] => 罒 [96] => 自 [97] => 竹 [98] => 高 [99] => 欠 [100] => 氵 [101] => 广 [102] => 羽 [103] => 革 [104] => 火 [105] => 卩 [106] => 士 [107] => 牛 [108] => 乙 [109] => 犭 [110] => 屯 [111] => ⺾ [112] => 疒 [113] => 貝 [114] => 禾 [115] => 頁 [116] => 缶 [117] => 見 [118] => 毛 [119] => 亀 [120] => 世 [121] => 米 [122] => 戸 [123] => 糸 [124] => 聿 [125] => ⺌ [126] => 匚 [127] => 臣 [128] => ⺹ [129] => 氏 [130] => 舌 [131] => 舟 [132] => 青 [133] => 風 [134] => 乃 [135] => 馬 [136] => 隶 [137] => 止 [138] => ⻏ [139] => 皿 [140] => 血 [141] => 衤 [142] => 雨 [143] => 角 [144] => 言 [145] => 豸 [146] => 足 [147] => 車 [148] => 酉 [149] => 食 [150] => 骨 [151] => 比 [152] => 黽 [153] => 弓 [154] => 疋 [155] => 斤 [156] => 彡 [157] => 井 [158] => 久 [159] => 廴 [160] => 也 [161] => 立 [162] => 辛 [163] => マ [164] => 五 [165] => 亡 [166] => 曰 [167] => 囗 [168] => 弋 [169] => 方 [170] => 及 [171] => 支 [172] => 犬 [173] => 用 [174] => 艮 [175] => 玄 [176] => 歹 [177] => ユ [178] => 冊 [179] => 母 [180] => 毋 [181] => 辰 [182] => 已 [183] => 巴 [184] => 里 [185] => 免 [186] => 非 [187] => 奄 [188] => 虍 [189] => 甘 [190] => 走 [191] => 韋 [192] => 勿 [193] => 音 [194] => 面 [195] => 舛 [196] => 豕 [197] => 品 [198] => 黄 [199] => 癶 [200] => 尢 [201] => 尤 [202] => 无 [203] => 齊 [204] => 辶 [205] => 無 [206] => 黒 [207] => 鹿 [208] => 元 [209] => 牙 [210] => 巛 [211] => 岡 [212] => 矛 [213] => 鼻 [214] => 石 [215] => 身 [216] => 殳 [217] => 瓜 [218] => 肉 [219] => 行 [220] => 鬲 [221] => 麻 [222] => 歯 [223] => 首 [224] => 赤 [225] => 魚 [226] => 竜 [227] => 爻 [228] => 斗 [229] => 皮 [230] => 片 [231] => 滴 [232] => 韭 [233] => 釆 [234] => 巨 [235] => 邑 [236] => 气 [237] => 鼠 [238] => 礻 [239] => 色 [240] => 香 [241] => 鹵 [242] => 龠 [243] => 瓦 [244] => 鼓 [245] => 黍 [246] => 飛 [247] => 髟 [248] => 鬥 [249] => 鬯 [250] => 麦 [251] => 黹 [252] => 鼎

If you want to duplicate the result yourself, here is the PHP code to use.


<?php
// Read the radical dictionary. krad.txt is assumed to be a UTF-8 copy of kradfile.
$krad_input = file_get_contents(dirname(__FILE__) . "/krad.txt");
$krad_array = explode("\n", $krad_input);
unset($krad_input);

$radicalarray = array(); // unique radicals, in order of first appearance
$stored = array();       // radicals already seen, used as a lookup set

foreach ($krad_array as $string) {
    // Skip blank lines and comment lines.
    if (trim($string) === "" || substr($string, 0, 1) == "#") {
        continue;
    }
    $kanji_and_radicals = extract_kanji_and_radicals($string);
    //print_r($kanji_and_radicals);
    foreach ($kanji_and_radicals['radicals'] as $radical) {
        if (!isset($stored[$radical])) {
            $stored[$radical] = 1;
            $radicalarray[] = $radical;
        }
    }
    // We know the Kanji. We know its radicals. We can insert it into the database.
}
print_r($radicalarray);

// Split one dictionary line into the kanji (first piece) and its radicals (the rest).
function extract_kanji_and_radicals($string){
    $regex = "`(\s:\s|\s+)`imus";
    // PREG_SPLIT_NO_EMPTY drops the empty piece that trailing whitespace would leave.
    $matches = preg_split($regex, $string, -1, PREG_SPLIT_NO_EMPTY);
    $return = array();
    $return['kanji'] = array_shift($matches);
    $return['radicals'] = $matches;
    return $return;
}

You can get the dictionary file from Kanjicafe.com.
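
To make the line-splitting step in the script above concrete, here is a minimal sketch run on a single line. The function name split_krad_line and the sample line are my own illustrative assumptions; the layout is taken to be a kanji, a colon, then space-separated components.

```php
<?php
// Hypothetical stand-in for the splitting step, shown on one line only.
function split_krad_line(string $line): array {
    // Split on " : " or on any run of whitespace, dropping empty pieces.
    $parts = preg_split('/\s:\s|\s+/u', $line, -1, PREG_SPLIT_NO_EMPTY);
    // The first piece is the kanji; whatever remains are its components.
    $kanji = array_shift($parts);
    return array('kanji' => $kanji, 'radicals' => $parts);
}

// Assumed sample line in the "kanji : component component ..." layout.
$result = split_krad_line("明 : 日 月");
print_r($result);
```

Because the first piece is shifted off as the kanji, everything left in the array can go straight into the deduplication loop.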

-edit-
Curiouser and curiouser: it seems that the classic number is 212. There are a bunch of radicals in the kradfile list that simply do not exist classically; perhaps they are an amalgamation of the radicals available in all the various Japanese-English dictionaries? (In some cases the radicals were altered to more closely suit the kanji itself.)
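
One way to check that hunch would be to diff the extracted list against a reference list of classical radicals. The sketch below is only an illustration: the helper name non_classical_components is hypothetical, and both arrays are tiny samples rather than the full lists, which you would have to supply yourself.

```php
<?php
// Hypothetical helper: list the components that are missing from a reference radical list.
function non_classical_components(array $found, array $reference): array {
    // array_diff preserves the keys of $found, so reindex for a clean list.
    return array_values(array_diff($found, $reference));
}

// Tiny illustrative samples, not the full lists.
$found = array("一", "口", "マ", "ユ", "木");
$reference = array("一", "口", "木");

$extra = non_classical_components($found, $reference);
print_r($extra);
```

With the full lists in place, whatever comes back is the set of entries that need explaining. Note that a plain string comparison like this treats variant forms as different characters from their parent radicals.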