Searching a character range - Hidemaru Editor Macro Authors' Forum

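Hello, this is rosegardenyk, a macro beginner. (I am posting here rather than in the Hidemaru Editor forum.)
Questions
1. In ①, "あ" is outside the range [㍾-∩], so why is the result ○ (a match)? (Not only "あ": every full-width character gives ○.)
2. ② gives ○ with [㍼-≒], so why do ③ [㍼-㍼] and ④ [≒-≒] both give × (no match)?
Is something wrong with how I wrote the macro? (This is quite likely. Please correct me if so.)
If not, is this a problem with the Shift-JIS character encoding, or one specific to Hidemaru?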

//Sample macro//////////////////////////////////////
//Order in the Shift-JIS code table: ㍾ ㍽ ㍼ ≒ ≡ ∫ ∮ ∑ √ ⊥ ∠ ∟ ⊿ ∵ ∩ ∪
loaddll "hmjre.dll";
$$str = "abcdあ";
##pos = dllfunc("FindRegular", "[㍾-∩]", $$str, 0);
// question shows a Yes/No dialog; answering No ends the macro
question "① ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[㍼-≒]", $$str, 0);
question "② ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[㍼-㍼]", $$str, 0);
question "③ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[≒-≒]", $$str, 0);
question "④ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

endmacro;
//End of sample macro//////////////////////////////////////

RE:07200 Searching a character range

No.07201

h-tom  13/03/07 00:15

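This is h-tom.

Row 13 there holds NEC platform-dependent characters, and it is troublesome in several ways; for one, some character codes are duplicated.
Looking at the character codes of "[㍾-∩]", it appears to be [0x878d-0x81bf].
If you want to specify exactly that range, why not specify it by character code?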
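
The codes quoted above can be checked with a minimal Python sketch. It is written in Python rather than as a Hidemaru macro, it assumes Python's `cp932` codec matches the Windows Shift-JIS code page, and `sjis_code` is a helper name made up for this illustration. It encodes each endpoint as typed and compares the resulting codes.

```python
def sjis_code(ch: str) -> int:
    """Return the Shift-JIS (cp932) code of a single character as an integer."""
    return int.from_bytes(ch.encode("cp932"), "big")

# Endpoints of the range exactly as typed in the pattern "[㍾-∩]"
start, end = sjis_code("㍾"), sjis_code("∩")
print(f"[㍾-∩] -> [0x{start:04x}-0x{end:04x}]")

# The start code is larger than the end code, so the range runs backwards
print("start > end:", start > end)

# For comparison, the code of the character in the test string
print(f"あ -> 0x{sjis_code('あ'):04x}")
```

Under that assumption the typed ∩ is encoded with its row 2 code, which is lower than the code of ㍾, so the range is reversed. How hmjre.dll treats a reversed range is not something this sketch can show.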

RE:07201 Searching a character range

No.07202

rosegardenyk  13/03/07 01:28

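Thank you, h-tom.
>Row 13 there holds NEC platform-dependent characters, and it is troublesome in several ways; for one, some character codes are duplicated.
>Looking at the character codes of "[㍾-∩]", it appears to be [0x878d-0x81bf].
>If you want to specify exactly that range, why not specify it by character code?

∩ also exists at 0x81BF, but for the range to make sense it should be greater than ㍾ (0x878D), so
according to the Shift-JIS code table (http://charset.7jp.net/sjis.html) the ∩ I mean appears to be 0x879B.
However, I do not know how to write a range by character code in a macro, and in the macro below every case came out ×.
I would appreciate it if you could show me the correct way. Changing [0x878D-0x879B] to [\x878D-\x879B] does not work either.
Thank you in advance for your guidance.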

//Sample macro//////////////////////////////////////
//Order in the Shift-JIS code table: ㍾ ㍽ ㍼ ≒ ≡ ∫ ∮ ∑ √ ⊥ ∠ ∟ ⊿ ∵ ∩ ∪
loaddll "hmjre.dll";
$$str = "abcdあ㍼";
##pos = dllfunc("FindRegular", "[x878D-x879B]", $$str, 0);// [㍾-∩]; per the code table, ∩ appears to be 0x879B
question "① ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[0x878F-0x81E0]", $$str, 0);// [㍼-≒]
question "② ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[0x878F-0x878F]", $$str, 0);// [㍼-㍼]
question "③ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[0x81E0-0x81E0]", $$str, 0);// [≒-≒]
question "④ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[0x82A0-0x82A0]", $$str, 0);// [あ-あ]
question "⑤ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[0x61-0x62]", $$str, 0);// [a-b]
question "⑥ ×\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

endmacro;
//End of sample macro//////////////////////////////////////

RE:07202 Searching a character range

No.07203

colder  13/03/07 02:05

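This is colder.
>according to the Shift-JIS code table (http://charset.7jp.net/sjis.html) the ∩ I mean
>appears to be 0x879B.
No, on Windows you cannot enter the character with code 0x879B by normal means.
It gets converted to code 0x81BF on its own.

>I would appreciate it if you could show me the correct way. Changing [0x878D-0x879B] to [\x878D-\x879B]
>does not work either.
>Thank you in advance for your guidance.
How about [\x87\x8D-\x87\x9B]?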
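
The suggestion above writes each byte of the two-byte code as its own `\x` escape. A small Python sketch of that conversion follows; `byte_escapes` and `byte_range` are hypothetical helper names, and the sketch only builds the pattern text. It makes no claim about how hmjre.dll parses it.

```python
def byte_escapes(code: int) -> str:
    """Spell a Shift-JIS code as one \\xNN escape per byte, e.g. 0x878D -> \\x87\\x8D."""
    n_bytes = 1 if code <= 0xFF else 2
    return "".join(f"\\x{b:02X}" for b in code.to_bytes(n_bytes, "big"))

def byte_range(first: int, last: int) -> str:
    """Build a character-class pattern [first-last] from two Shift-JIS codes."""
    return f"[{byte_escapes(first)}-{byte_escapes(last)}]"

print(byte_range(0x878D, 0x879B))  # [\x87\x8D-\x87\x9B]
print(byte_range(0x61, 0x62))      # [\x61-\x62]
```

Single-byte codes such as 0x61 get one escape and double-byte codes get two, which is the difference between the `[\x878D-\x879B]` attempt and the form proposed here.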

RE:07203 Searching a character range

No.07204

rosegardenyk  13/03/07 06:24

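colder,
>How about [\x87\x8D-\x87\x9B]?

Thank you, colder.
It worked when I wrote the macro as shown below.

Of the row 13 characters  ≒ ≡ ∫ ∮ ∑ √ ⊥ ∠ ∟ ⊿ ∵ ∩ ∪
these                     ≒ ≡ ∫ √ ⊥ ∠ ∵ ∩ ∪
also exist in row 2, and on Windows they apparently end up with the row 2 code.
Indeed, when I copy both copies of a character from the Shift-JIS code table (http://charset.7jp.net/sjis.html)
into Hidemaru Editor and display the character code (Ctrl+Shift+M in my setup), both show the row 2 code.
I use ∫ often, so I had wondered why the same character appears twice in Shift-JIS.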
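
The split described above can be reproduced with a short Python sketch, again assuming Python's `cp932` codec mirrors what Windows does; `roundtrip` is a name invented for this illustration. It decodes each row 13 code from ≒ (0x8790) to ∪ (0x879C) and encodes the character back, then sorts the characters by whether the code changed.

```python
def roundtrip(code: int) -> int:
    """Decode a two-byte cp932 code to a character, then encode it back."""
    ch = code.to_bytes(2, "big").decode("cp932")
    return int.from_bytes(ch.encode("cp932"), "big")

remapped, kept = [], []
for code in range(0x8790, 0x879D):  # ≒ (0x8790) through ∪ (0x879C)
    ch = code.to_bytes(2, "big").decode("cp932")
    # A changed code means the character is re-encoded with a different code
    (remapped if roundtrip(code) != code else kept).append(ch)

print("re-encoded with another code:", " ".join(remapped))
print("kept the row 13 code:        ", " ".join(kept))
```

If the codec agrees with Windows here, the first list should contain the nine characters that also exist in row 2, and the second list the four that exist only in row 13.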

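Question: are NEC PCs still a special case, with their own so-called environment-dependent characters?
For example, "①" [0x8740] is flagged as an environment-dependent character when I press the convert key. Considering people who use NEC PCs,
does this mean that row 13 characters, from ①-⑳ through ㍻-㍼, are better avoided in macros that other people might use?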

//Sample macro//////////////////////////////////////
//Order in the Shift-JIS code table: ㍾ ㍽ ㍼ ≒ ≡ ∫ ∮ ∑ √ ⊥ ∠ ∟ ⊿ ∵ ∩ ∪
loaddll "hmjre.dll";
$$str = "abcdあ㍼㍾≒";
##pos = dllfunc("FindRegular", "[\x87\x8D-\x87\x9B]+", $$str, 0);// [㍾-∩]; per the code table, ∩ appears to be 0x879B
question "① ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[\x87\x8F-\x87\x90]+", $$str, 0);// [㍼-≒]
question "② ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[\x87\x8F-\x87\x8F]+", $$str, 0);// [㍼-㍼]
question "③ ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[\x81\xE0-\x81\xE0]+", $$str, 0);// [≒-≒]; [\x87\x90-\x87\x90] gives ×
question "④ ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[\x82\xA0-\x82\xA0]+", $$str, 0);// [あ-あ]
question "⑤ ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

##pos = dllfunc("FindRegular", "[\x61-\x62]+", $$str, 0);// [a-b]
question "⑥ ○\n##pos = " + str(##pos) + "\nchar = " + midstr($$str, ##pos, dllfunc("GetLastMatchLength"));
if(!result) endmacro;

endmacro;
//End of sample macro//////////////////////////////////////
