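Here is a test file you can use (from http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[^[:c:][:z:][:di:]]&abb=on&ucd=on).
Each line below gives a single code point or an inclusive range of code points in field 0 (hexadecimal, before the semicolon), followed by the name of the last code point in that range. Every code point within one of these ranges should succeed, and every code point outside of these ranges should fail. Even just testing that the first code point of each range succeeds and that last+1 fails would be a good test. Note that some adjacent code points are listed on separate lines (for example 0589 and 058A), so merge adjacent entries before applying the last+1 check.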
0021..007E ; TILDE
00A1..00AC ; NOT SIGN
00AE..034E ; COMBINING UPWARDS ARROW BELOW
0350..0377 ; GREEK SMALL LETTER PAMPHYLIAN DIGAMMA
037A..037F ; GREEK CAPITAL LETTER YOT
0384..038A ; GREEK CAPITAL LETTER IOTA WITH TONOS
038C ; GREEK CAPITAL LETTER OMICRON WITH TONOS
038E..03A1 ; GREEK CAPITAL LETTER RHO
03A3..052F ; CYRILLIC SMALL LETTER EL WITH DESCENDER
0531..0556 ; ARMENIAN CAPITAL LETTER FEH
0559..055F ; ARMENIAN ABBREVIATION MARK
0561..0587 ; ARMENIAN SMALL LIGATURE ECH YIWN
0589 ; ARMENIAN FULL STOP
058A ; ARMENIAN HYPHEN
058D..058F ; ARMENIAN DRAM SIGN
0591..05C7 ; HEBREW POINT QAMATS QATAN
05D0..05EA ; HEBREW LETTER TAV
05F0..05F4 ; HEBREW PUNCTUATION GERSHAYIM
0606..061B ; ARABIC SEMICOLON
061E..06DC ; ARABIC SMALL HIGH SEEN
06DE..070D ; SYRIAC HARKLEAN ASTERISCUS
0710..074A ; SYRIAC BARREKH
074D..07B1 ; THAANA LETTER NAA
07C0..07FA ; NKO LAJANYALAN
0800..082D ; SAMARITAN MARK NEQUDAA
0830..083E ; SAMARITAN PUNCTUATION ANNAAU
0840..085B ; MANDAIC GEMINATION MARK
085E ; MANDAIC PUNCTUATION
08A0..08B4 ; ARABIC LETTER KAF WITH DOT BELOW
08E3..0983 ; BENGALI SIGN VISARGA
0985..098C ; BENGALI LETTER VOCALIC L
098F ; BENGALI LETTER E
0990 ; BENGALI LETTER AI
0993..09A8 ; BENGALI LETTER NA
09AA..09B0 ; BENGALI LETTER RA
09B2 ; BENGALI LETTER LA
09B6..09B9 ; BENGALI LETTER HA
09BC..09C4 ; BENGALI VOWEL SIGN VOCALIC RR
09C7 ; BENGALI VOWEL SIGN E
09C8 ; BENGALI VOWEL SIGN AI
09CB..09CE ; BENGALI LETTER KHANDA TA
09D7 ; BENGALI AU LENGTH MARK
09DC ; BENGALI LETTER RRA
09DD ; BENGALI LETTER RHA
09DF..09E3 ; BENGALI VOWEL SIGN VOCALIC LL
09E6..09FB ; BENGALI GANDA MARK
0A01..0A03 ; GURMUKHI SIGN VISARGA
0A05..0A0A ; GURMUKHI LETTER UU
0A0F ; GURMUKHI LETTER EE
0A10 ; GURMUKHI LETTER AI
0A13..0A28 ; GURMUKHI LETTER NA
0A2A..0A30 ; GURMUKHI LETTER RA
0A32 ; GURMUKHI LETTER LA
0A33 ; GURMUKHI LETTER LLA
0A35 ; GURMUKHI LETTER VA
0A36 ; GURMUKHI LETTER SHA
0A38 ; GURMUKHI LETTER SA
0A39 ; GURMUKHI LETTER HA
0A3C ; GURMUKHI SIGN NUKTA
0A3E..0A42 ; GURMUKHI VOWEL SIGN UU
0A47 ; GURMUKHI VOWEL SIGN EE
0A48 ; GURMUKHI VOWEL SIGN AI
0A4B..0A4D ; GURMUKHI SIGN VIRAMA
0A51 ; GURMUKHI SIGN UDAAT
0A59..0A5C ; GURMUKHI LETTER RRA
0A5E ; GURMUKHI LETTER FA
0A66..0A75 ; GURMUKHI SIGN YAKASH
0A81..0A83 ; GUJARATI SIGN VISARGA
0A85..0A8D ; GUJARATI VOWEL CANDRA E
0A8F..0A91 ; GUJARATI VOWEL CANDRA O
0A93..0AA8 ; GUJARATI LETTER NA
0AAA..0AB0 ; GUJARATI LETTER RA
0AB2 ; GUJARATI LETTER LA
0AB3 ; GUJARATI LETTER LLA
0AB5..0AB9 ; GUJARATI LETTER HA
0ABC..0AC5 ; GUJARATI VOWEL SIGN CANDRA E
0AC7..0AC9 ; GUJARATI VOWEL SIGN CANDRA O
0ACB..0ACD ; GUJARATI SIGN VIRAMA
0AD0 ; GUJARATI OM
0AE0..0AE3 ; GUJARATI VOWEL SIGN VOCALIC LL
0AE6..0AF1 ; GUJARATI RUPEE SIGN
0AF9 ; GUJARATI LETTER ZHA
0B01..0B03 ; ORIYA SIGN VISARGA
0B05..0B0C ; ORIYA LETTER VOCALIC L
0B0F ; ORIYA LETTER E
0B10 ; ORIYA LETTER AI
0B13..0B28 ; ORIYA LETTER NA
0B2A..0B30 ; ORIYA LETTER RA
0B32 ; ORIYA LETTER LA
0B33 ; ORIYA LETTER LLA
0B35..0B39 ; ORIYA LETTER HA
0B3C..0B44 ; ORIYA VOWEL SIGN VOCALIC RR
0B47 ; ORIYA VOWEL SIGN E
0B48 ; ORIYA VOWEL SIGN AI
0B4B..0B4D ; ORIYA SIGN VIRAMA
0B56 ; ORIYA AI LENGTH MARK
0B57 ; ORIYA AU LENGTH MARK
0B5C ; ORIYA LETTER RRA
0B5D ; ORIYA LETTER RHA
0B5F..0B63 ; ORIYA VOWEL SIGN VOCALIC LL
0B66..0B77 ; ORIYA FRACTION THREE SIXTEENTHS
0B82 ; TAMIL SIGN ANUSVARA
0B83 ; TAMIL SIGN VISARGA
0B85..0B8A ; TAMIL LETTER UU
0B8E..0B90 ; TAMIL LETTER AI
0B92..0B95 ; TAMIL LETTER KA
0B99 ; TAMIL LETTER NGA
0B9A ; TAMIL LETTER CA
0B9C ; TAMIL LETTER JA
0B9E ; TAMIL LETTER NYA
0B9F ; TAMIL LETTER TTA
0BA3 ; TAMIL LETTER NNA
0BA4 ; TAMIL LETTER TA
0BA8..0BAA ; TAMIL LETTER PA
0BAE..0BB9 ; TAMIL LETTER HA
0BBE..0BC2 ; TAMIL VOWEL SIGN UU
0BC6..0BC8 ; TAMIL VOWEL SIGN AI
0BCA..0BCD ; TAMIL SIGN VIRAMA
0BD0 ; TAMIL OM
0BD7 ; TAMIL AU LENGTH MARK
0BE6..0BFA ; TAMIL NUMBER SIGN
0C00..0C03 ; TELUGU SIGN VISARGA
0C05..0C0C ; TELUGU LETTER VOCALIC L
0C0E..0C10 ; TELUGU LETTER AI
0C12..0C28 ; TELUGU LETTER NA
0C2A..0C39 ; TELUGU LETTER HA
0C3D..0C44 ; TELUGU VOWEL SIGN VOCALIC RR
0C46..0C48 ; TELUGU VOWEL SIGN AI
0C4A..0C4D ; TELUGU SIGN VIRAMA
0C55 ; TELUGU LENGTH MARK
0C56 ; TELUGU AI LENGTH MARK
0C58..0C5A ; TELUGU LETTER RRRA
0C60..0C63 ; TELUGU VOWEL SIGN VOCALIC LL
0C66..0C6F ; TELUGU DIGIT NINE
0C78..0C7F ; TELUGU SIGN TUUMU
0C81..0C83 ; KANNADA SIGN VISARGA
0C85..0C8C ; KANNADA LETTER VOCALIC L
0C8E..0C90 ; KANNADA LETTER AI
0C92..0CA8 ; KANNADA LETTER NA
0CAA..0CB3 ; KANNADA LETTER LLA
0CB5..0CB9 ; KANNADA LETTER HA
0CBC..0CC4 ; KANNADA VOWEL SIGN VOCALIC RR
0CC6..0CC8 ; KANNADA VOWEL SIGN AI
0CCA..0CCD ; KANNADA SIGN VIRAMA
0CD5 ; KANNADA LENGTH MARK
0CD6 ; KANNADA AI LENGTH MARK
0CDE ; KANNADA LETTER FA
0CE0..0CE3 ; KANNADA VOWEL SIGN VOCALIC LL
0CE6..0CEF ; KANNADA DIGIT NINE
0CF1 ; KANNADA SIGN JIHVAMULIYA
0CF2 ; KANNADA SIGN UPADHMANIYA
0D01..0D03 ; MALAYALAM SIGN VISARGA
0D05..0D0C ; MALAYALAM LETTER VOCALIC L
0D0E..0D10 ; MALAYALAM LETTER AI
0D12..0D3A ; MALAYALAM LETTER TTTA
0D3D..0D44 ; MALAYALAM VOWEL SIGN VOCALIC RR
0D46..0D48 ; MALAYALAM VOWEL SIGN AI
0D4A..0D4E ; MALAYALAM LETTER DOT REPH
0D57 ; MALAYALAM AU LENGTH MARK
0D5F..0D63 ; MALAYALAM VOWEL SIGN VOCALIC LL
0D66..0D75 ; MALAYALAM FRACTION THREE QUARTERS
0D79..0D7F ; MALAYALAM LETTER CHILLU K
0D82 ; SINHALA SIGN ANUSVARAYA
0D83 ; SINHALA SIGN VISARGAYA
0D85..0D96 ; SINHALA LETTER AUYANNA
0D9A..0DB1 ; SINHALA LETTER DANTAJA NAYANNA
0DB3..0DBB ; SINHALA LETTER RAYANNA
0DBD ; SINHALA LETTER DANTAJA LAYANNA
0DC0..0DC6 ; SINHALA LETTER FAYANNA
0DCA ; SINHALA SIGN AL-LAKUNA
0DCF..0DD4 ; SINHALA VOWEL SIGN KETTI PAA-PILLA
0DD6 ; SINHALA VOWEL SIGN DIGA PAA-PILLA
0DD8..0DDF ; SINHALA VOWEL SIGN GAYANUKITTA
0DE6..0DEF ; SINHALA LITH DIGIT NINE
0DF2..0DF4 ; SINHALA PUNCTUATION KUNDDALIYA
0E01..0E3A ; THAI CHARACTER PHINTHU
0E3F..0E5B ; THAI CHARACTER KHOMUT
0E81 ; LAO LETTER KO
0E82 ; LAO LETTER KHO SUNG
0E84 ; LAO LETTER KHO TAM
0E87 ; LAO LETTER NGO
0E88 ; LAO LETTER CO
0E8A ; LAO LETTER SO TAM
0E8D ; LAO LETTER NYO
0E94..0E97 ; LAO LETTER THO TAM
0E99..0E9F ; LAO LETTER FO SUNG
0EA1..0EA3 ; LAO LETTER LO LING
0EA5 ; LAO LETTER LO LOOT
0EA7 ; LAO LETTER WO
0EAA ; LAO LETTER SO SUNG
0EAB ; LAO LETTER HO SUNG
0EAD..0EB9 ; LAO VOWEL SIGN UU
0EBB..0EBD ; LAO SEMIVOWEL SIGN NYO
0EC0..0EC4 ; LAO VOWEL SIGN AI
0EC6 ; LAO KO LA
0EC8..0ECD ; LAO NIGGAHITA
0ED0..0ED9 ; LAO DIGIT NINE
0EDC..0EDF ; LAO LETTER KHMU NYO
0F00..0F47 ; TIBETAN LETTER JA
0F49..0F6C ; TIBETAN LETTER RRA
0F71..0F97 ; TIBETAN SUBJOINED LETTER JA
0F99..0FBC ; TIBETAN SUBJOINED LETTER FIXED-FORM RA
0FBE..0FCC ; TIBETAN SYMBOL NOR BU BZHI -KHYIL
0FCE..0FDA ; TIBETAN MARK TRAILING MCHAN RTAGS
1000..10C5 ; GEORGIAN CAPITAL LETTER HOE
10C7 ; GEORGIAN CAPITAL LETTER YN
10CD ; GEORGIAN CAPITAL LETTER AEN
10D0..115E ; HANGUL CHOSEONG TIKEUT-RIEUL
1161..1248 ; ETHIOPIC SYLLABLE QWA
124A..124D ; ETHIOPIC SYLLABLE QWE
1250..1256 ; ETHIOPIC SYLLABLE QHO
1258 ; ETHIOPIC SYLLABLE QHWA
125A..125D ; ETHIOPIC SYLLABLE QHWE
1260..1288 ; ETHIOPIC SYLLABLE XWA
128A..128D ; ETHIOPIC SYLLABLE XWE
1290..12B0 ; ETHIOPIC SYLLABLE KWA
12B2..12B5 ; ETHIOPIC SYLLABLE KWE
12B8..12BE ; ETHIOPIC SYLLABLE KXO
12C0 ; ETHIOPIC SYLLABLE KXWA
12C2..12C5 ; ETHIOPIC SYLLABLE KXWE
12C8..12D6 ; ETHIOPIC SYLLABLE PHARYNGEAL O
12D8..1310 ; ETHIOPIC SYLLABLE GWA
1312..1315 ; ETHIOPIC SYLLABLE GWE
1318..135A ; ETHIOPIC SYLLABLE FYA
135D..137C ; ETHIOPIC NUMBER TEN THOUSAND
1380..1399 ; ETHIOPIC TONAL MARK KURT
13A0..13F5 ; CHEROKEE LETTER MV
13F8..13FD ; CHEROKEE SMALL LETTER MV
1400..167F ; CANADIAN SYLLABICS BLACKFOOT W
1681..169C ; OGHAM REVERSED FEATHER MARK
16A0..16F8 ; RUNIC LETTER FRANKS CASKET AESC
1700..170C ; TAGALOG LETTER YA
170E..1714 ; TAGALOG SIGN VIRAMA
1720..1736 ; PHILIPPINE DOUBLE PUNCTUATION
1740..1753 ; BUHID VOWEL SIGN U
1760..176C ; TAGBANWA LETTER YA
176E..1770 ; TAGBANWA LETTER SA
1772 ; TAGBANWA VOWEL SIGN I
1773 ; TAGBANWA VOWEL SIGN U
1780..17B3 ; KHMER INDEPENDENT VOWEL QAU
17B6..17DD ; KHMER SIGN ATTHACAN
17E0..17E9 ; KHMER DIGIT NINE
17F0..17F9 ; KHMER SYMBOL LEK ATTAK PRAM-BUON
1800..180A ; MONGOLIAN NIRUGU
1810..1819 ; MONGOLIAN DIGIT NINE
1820..1877 ; MONGOLIAN LETTER MANCHU ZHA
1880..18AA ; MONGOLIAN LETTER MANCHU ALI GALI LHA
18B0..18F5 ; CANADIAN SYLLABICS CARRIER DENTAL S
1900..191E ; LIMBU LETTER TRA
1920..192B ; LIMBU SUBJOINED LETTER WA
1930..193B ; LIMBU SIGN SA-I
1940 ; LIMBU SIGN LOO
1944..196D ; TAI LE LETTER AI
1970..1974 ; TAI LE LETTER TONE-6
1980..19AB ; NEW TAI LUE LETTER LOW SUA
19B0..19C9 ; NEW TAI LUE TONE MARK-2
19D0..19DA ; NEW TAI LUE THAM DIGIT ONE
19DE..1A1B ; BUGINESE VOWEL SIGN AE
1A1E..1A5E ; TAI THAM CONSONANT SIGN SA
1A60..1A7C ; TAI THAM SIGN KHUEN-LUE KARAN
1A7F..1A89 ; TAI THAM HORA DIGIT NINE
1A90..1A99 ; TAI THAM THAM DIGIT NINE
1AA0..1AAD ; TAI THAM SIGN CAANG
1AB0..1ABE ; COMBINING PARENTHESES OVERLAY
1B00..1B4B ; BALINESE LETTER ASYURA SASAK
1B50..1B7C ; BALINESE MUSICAL SYMBOL LEFT-HAND OPEN PING
1B80..1BF3 ; BATAK PANONGONAN
1BFC..1C37 ; LEPCHA SIGN NUKTA
1C3B..1C49 ; LEPCHA DIGIT NINE
1C4D..1C7F ; OL CHIKI PUNCTUATION DOUBLE MUCAAD
1CC0..1CC7 ; SUNDANESE PUNCTUATION BINDU BA SATANGA
1CD0..1CF6 ; VEDIC SIGN UPADHMANIYA
1CF8 ; VEDIC TONE RING ABOVE
1CF9 ; VEDIC TONE DOUBLE RING ABOVE
1D00..1DF5 ; COMBINING UP TACK ABOVE
1DFC..1F15 ; GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
1F18..1F1D ; GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
1F20..1F45 ; GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA
1F48..1F4D ; GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA
1F50..1F57 ; GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI
1F59 ; GREEK CAPITAL LETTER UPSILON WITH DASIA
1F5B ; GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA
1F5D ; GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA
1F5F..1F7D ; GREEK SMALL LETTER OMEGA WITH OXIA
1F80..1FB4 ; GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
1FB6..1FC4 ; GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
1FC6..1FD3 ; GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
1FD6..1FDB ; GREEK CAPITAL LETTER IOTA WITH OXIA
1FDD..1FEF ; GREEK VARIA
1FF2..1FF4 ; GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
1FF6..1FFE ; GREEK DASIA
2010..2027 ; HYPHENATION POINT
2030..205E ; VERTICAL FOUR DOTS
2070 ; SUPERSCRIPT ZERO
2071 ; SUPERSCRIPT LATIN SMALL LETTER I
2074..208E ; SUBSCRIPT RIGHT PARENTHESIS
2090..209C ; LATIN SUBSCRIPT SMALL LETTER T
20A0..20BE ; LARI SIGN
20D0..20F0 ; COMBINING ASTERISK ABOVE
2100..218B ; TURNED DIGIT THREE
2190..23FA ; BLACK CIRCLE FOR RECORD
2400..2426 ; SYMBOL FOR SUBSTITUTE FORM TWO
2440..244A ; OCR DOUBLE BACKSLASH
2460..2B73 ; DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
2B76..2B95 ; RIGHTWARDS BLACK ARROW
2B98..2BB9 ; UP ARROWHEAD IN A RECTANGLE BOX
2BBD..2BC8 ; BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED
2BCA..2BD1 ; UNCERTAINTY SIGN
2BEC..2BEF ; DOWNWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS
2C00..2C2E ; GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
2C30..2C5E ; GLAGOLITIC SMALL LETTER LATINATE MYSLITE
2C60..2CF3 ; COPTIC SMALL LETTER BOHAIRIC KHEI
2CF9..2D25 ; GEORGIAN SMALL LETTER HOE
2D27 ; GEORGIAN SMALL LETTER YN
2D2D ; GEORGIAN SMALL LETTER AEN
2D30..2D67 ; TIFINAGH LETTER YO
2D6F ; TIFINAGH MODIFIER LETTER LABIALIZATION MARK
2D70 ; TIFINAGH SEPARATOR MARK
2D7F..2D96 ; ETHIOPIC SYLLABLE GGWE
2DA0..2DA6 ; ETHIOPIC SYLLABLE SSO
2DA8..2DAE ; ETHIOPIC SYLLABLE CCO
2DB0..2DB6 ; ETHIOPIC SYLLABLE ZZO
2DB8..2DBE ; ETHIOPIC SYLLABLE CCHO
2DC0..2DC6 ; ETHIOPIC SYLLABLE QYO
2DC8..2DCE ; ETHIOPIC SYLLABLE KYO
2DD0..2DD6 ; ETHIOPIC SYLLABLE XYO
2DD8..2DDE ; ETHIOPIC SYLLABLE GYO
2DE0..2E42 ; DOUBLE LOW-REVERSED-9 QUOTATION MARK
2E80..2E99 ; CJK RADICAL RAP
2E9B..2EF3 ; CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5 ; KANGXI RADICAL FLUTE
2FF0..2FFB ; IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
3001..303F ; IDEOGRAPHIC HALF FILL SPACE
3041..3096 ; HIRAGANA LETTER SMALL KE
3099..30FF ; KATAKANA DIGRAPH KOTO
3105..312D ; BOPOMOFO LETTER IH
3131..3163 ; HANGUL LETTER I
3165..318E ; HANGUL LETTER ARAEAE
3190..31BA ; BOPOMOFO LETTER ZY
31C0..31E3 ; CJK STROKE Q
31F0..321E ; PARENTHESIZED KOREAN CHARACTER O HU
3220..32FE ; CIRCLED KATAKANA WO
3300..4DB5 ; CJK UNIFIED IDEOGRAPH-4DB5
4DC0..9FD5 ; CJK UNIFIED IDEOGRAPH-9FD5
A000..A48C ; YI SYLLABLE YYR
A490..A4C6 ; YI RADICAL KE
A4D0..A62B ; VAI SYLLABLE NDOLE DO
A640..A6F7 ; BAMUM QUESTION MARK
A700..A7AD ; LATIN CAPITAL LETTER L WITH BELT
A7B0..A7B7 ; LATIN SMALL LETTER OMEGA
A7F7..A82B ; SYLOTI NAGRI POETRY MARK-4
A830..A839 ; NORTH INDIC QUANTITY MARK
A840..A877 ; PHAGS-PA MARK DOUBLE SHAD
A880..A8C4 ; SAURASHTRA SIGN VIRAMA
A8CE..A8D9 ; SAURASHTRA DIGIT NINE
A8E0..A8FD ; DEVANAGARI JAIN OM
A900..A953 ; REJANG VIRAMA
A95F..A97C ; HANGUL CHOSEONG SSANGYEORINHIEUH
A980..A9CD ; JAVANESE TURNED PADA PISELEH
A9CF..A9D9 ; JAVANESE DIGIT NINE
A9DE..A9FE ; MYANMAR LETTER TAI LAING BHA
AA00..AA36 ; CHAM CONSONANT SIGN WA
AA40..AA4D ; CHAM CONSONANT SIGN FINAL H
AA50..AA59 ; CHAM DIGIT NINE
AA5C..AAC2 ; TAI VIET TONE MAI SONG
AADB..AAF6 ; MEETEI MAYEK VIRAMA
AB01..AB06 ; ETHIOPIC SYLLABLE TTHO
AB09..AB0E ; ETHIOPIC SYLLABLE DDHO
AB11..AB16 ; ETHIOPIC SYLLABLE DZO
AB20..AB26 ; ETHIOPIC SYLLABLE CCHHO
AB28..AB2E ; ETHIOPIC SYLLABLE BBO
AB30..AB65 ; GREEK LETTER SMALL CAPITAL OMEGA
AB70..ABED ; MEETEI MAYEK APUN IYEK
ABF0..ABF9 ; MEETEI MAYEK DIGIT NINE
AC00..D7A3 ; HANGUL SYLLABLE HIH
D7B0..D7C6 ; HANGUL JUNGSEONG ARAEA-E
D7CB..D7FB ; HANGUL JONGSEONG PHIEUPH-THIEUTH
F900..FA6D ; CJK COMPATIBILITY IDEOGRAPH-FA6D
FA70..FAD9 ; CJK COMPATIBILITY IDEOGRAPH-FAD9
FB00..FB06 ; LATIN SMALL LIGATURE ST
FB13..FB17 ; ARMENIAN SMALL LIGATURE MEN XEH
FB1D..FB36 ; HEBREW LETTER ZAYIN WITH DAGESH
FB38..FB3C ; HEBREW LETTER LAMED WITH DAGESH
FB3E ; HEBREW LETTER MEM WITH DAGESH
FB40 ; HEBREW LETTER NUN WITH DAGESH
FB41 ; HEBREW LETTER SAMEKH WITH DAGESH
FB43 ; HEBREW LETTER FINAL PE WITH DAGESH
FB44 ; HEBREW LETTER PE WITH DAGESH
FB46..FBC1 ; ARABIC SYMBOL SMALL TAH BELOW
FBD3..FD3F ; ORNATE RIGHT PARENTHESIS
FD50..FD8F ; ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM
FD92..FDC7 ; ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM
FDF0..FDFD ; ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM
FE10..FE19 ; PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS
FE20..FE52 ; SMALL FULL STOP
FE54..FE66 ; SMALL EQUALS SIGN
FE68..FE6B ; SMALL COMMERCIAL AT
FE70..FE74 ; ARABIC KASRATAN ISOLATED FORM
FE76..FEFC ; ARABIC LIGATURE LAM WITH ALEF FINAL FORM
FF01..FF9F ; HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
FFA1..FFBE ; HALFWIDTH HANGUL LETTER HIEUH
FFC2..FFC7 ; HALFWIDTH HANGUL LETTER E
FFCA..FFCF ; HALFWIDTH HANGUL LETTER OE
FFD2..FFD7 ; HALFWIDTH HANGUL LETTER YU
FFDA..FFDC ; HALFWIDTH HANGUL LETTER I
FFE0..FFE6 ; FULLWIDTH WON SIGN
FFE8..FFEE ; HALFWIDTH WHITE CIRCLE
FFFC ; OBJECT REPLACEMENT CHARACTER
FFFD ; REPLACEMENT CHARACTER
10000..1000B ; LINEAR B SYLLABLE B046 JE
1000D..10026 ; LINEAR B SYLLABLE B032 QO
10028..1003A ; LINEAR B SYLLABLE B042 WO
1003C ; LINEAR B SYLLABLE B017 ZA
1003D ; LINEAR B SYLLABLE B074 ZE
1003F..1004D ; LINEAR B SYLLABLE B091 TWO
10050..1005D ; LINEAR B SYMBOL B089
10080..100FA ; LINEAR B IDEOGRAM VESSEL B305
10100..10102 ; AEGEAN CHECK MARK
10107..10133 ; AEGEAN NUMBER NINETY THOUSAND
10137..1018C ; GREEK SINUSOID SIGN
10190..1019B ; ROMAN CENTURIAL SIGN
101A0 ; GREEK SYMBOL TAU RHO
101D0..101FD ; PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE
10280..1029C ; LYCIAN LETTER X
102A0..102D0 ; CARIAN LETTER UUU3
102E0..102FB ; COPTIC EPACT NUMBER NINE HUNDRED
10300..10323 ; OLD ITALIC NUMERAL FIFTY
10330..1034A ; GOTHIC LETTER NINE HUNDRED
10350..1037A ; COMBINING OLD PERMIC LETTER SII
10380..1039D ; UGARITIC LETTER SSU
1039F..103C3 ; OLD PERSIAN SIGN HA
103C8..103D5 ; OLD PERSIAN NUMBER HUNDRED
10400..1049D ; OSMANYA LETTER OO
104A0..104A9 ; OSMANYA DIGIT NINE
10500..10527 ; ELBASAN LETTER KHE
10530..10563 ; CAUCASIAN ALBANIAN LETTER KIW
1056F ; CAUCASIAN ALBANIAN CITATION MARK
10600..10736 ; LINEAR A SIGN A664
10740..10755 ; LINEAR A SIGN A732 JE
10760..10767 ; LINEAR A SIGN A807
10800..10805 ; CYPRIOT SYLLABLE JA
10808 ; CYPRIOT SYLLABLE JO
1080A..10835 ; CYPRIOT SYLLABLE WO
10837 ; CYPRIOT SYLLABLE XA
10838 ; CYPRIOT SYLLABLE XE
1083C ; CYPRIOT SYLLABLE ZA
1083F..10855 ; IMPERIAL ARAMAIC LETTER TAW
10857..1089E ; NABATAEAN LETTER TAW
108A7..108AF ; NABATAEAN NUMBER ONE HUNDRED
108E0..108F2 ; HATRAN LETTER QOPH
108F4 ; HATRAN LETTER SHIN
108F5 ; HATRAN LETTER TAW
108FB..1091B ; PHOENICIAN NUMBER THREE
1091F..10939 ; LYDIAN LETTER C
1093F ; LYDIAN TRIANGULAR MARK
10980..109B7 ; MEROITIC CURSIVE LETTER DA
109BC..109CF ; MEROITIC CURSIVE NUMBER SEVENTY
109D2..10A03 ; KHAROSHTHI VOWEL SIGN VOCALIC R
10A05 ; KHAROSHTHI VOWEL SIGN E
10A06 ; KHAROSHTHI VOWEL SIGN O
10A0C..10A13 ; KHAROSHTHI LETTER GHA
10A15..10A17 ; KHAROSHTHI LETTER JA
10A19..10A33 ; KHAROSHTHI LETTER TTTHA
10A38..10A3A ; KHAROSHTHI SIGN DOT BELOW
10A3F..10A47 ; KHAROSHTHI NUMBER ONE THOUSAND
10A50..10A58 ; KHAROSHTHI PUNCTUATION LINES
10A60..10A9F ; OLD NORTH ARABIAN NUMBER TWENTY
10AC0..10AE6 ; MANICHAEAN ABBREVIATION MARK BELOW
10AEB..10AF6 ; MANICHAEAN PUNCTUATION LINE FILLER
10B00..10B35 ; AVESTAN LETTER HE
10B39..10B55 ; INSCRIPTIONAL PARTHIAN LETTER TAW
10B58..10B72 ; INSCRIPTIONAL PAHLAVI LETTER TAW
10B78..10B91 ; PSALTER PAHLAVI LETTER TAW
10B99..10B9C ; PSALTER PAHLAVI FOUR DOTS WITH DOT
10BA9..10BAF ; PSALTER PAHLAVI NUMBER ONE HUNDRED
10C00..10C48 ; OLD TURKIC LETTER ORKHON BASH
10C80..10CB2 ; OLD HUNGARIAN CAPITAL LETTER US
10CC0..10CF2 ; OLD HUNGARIAN SMALL LETTER US
10CFA..10CFF ; OLD HUNGARIAN NUMBER ONE THOUSAND
10E60..10E7E ; RUMI FRACTION TWO THIRDS
11000..1104D ; BRAHMI PUNCTUATION LOTUS
11052..1106F ; BRAHMI DIGIT NINE
1107F..110BC ; KAITHI ENUMERATION SIGN
110BE..110C1 ; KAITHI DOUBLE DANDA
110D0..110E8 ; SORA SOMPENG LETTER MAE
110F0..110F9 ; SORA SOMPENG DIGIT NINE
11100..11134 ; CHAKMA MAAYYAA
11136..11143 ; CHAKMA QUESTION MARK
11150..11176 ; MAHAJANI LIGATURE SHRI
11180..111CD ; SHARADA SUTRA MARK
111D0..111DF ; SHARADA SECTION MARK-2
111E1..111F4 ; SINHALA ARCHAIC NUMBER ONE THOUSAND
11200..11211 ; KHOJKI LETTER JJA
11213..1123D ; KHOJKI ABBREVIATION SIGN
11280..11286 ; MULTANI LETTER GA
11288 ; MULTANI LETTER GHA
1128A..1128D ; MULTANI LETTER JJA
1128F..1129D ; MULTANI LETTER BA
1129F..112A9 ; MULTANI SECTION MARK
112B0..112EA ; KHUDAWADI SIGN VIRAMA
112F0..112F9 ; KHUDAWADI DIGIT NINE
11300..11303 ; GRANTHA SIGN VISARGA
11305..1130C ; GRANTHA LETTER VOCALIC L
1130F ; GRANTHA LETTER EE
11310 ; GRANTHA LETTER AI
11313..11328 ; GRANTHA LETTER NA
1132A..11330 ; GRANTHA LETTER RA
11332 ; GRANTHA LETTER LA
11333 ; GRANTHA LETTER LLA
11335..11339 ; GRANTHA LETTER HA
1133C..11344 ; GRANTHA VOWEL SIGN VOCALIC RR
11347 ; GRANTHA VOWEL SIGN EE
11348 ; GRANTHA VOWEL SIGN AI
1134B..1134D ; GRANTHA SIGN VIRAMA
11350 ; GRANTHA OM
11357 ; GRANTHA AU LENGTH MARK
1135D..11363 ; GRANTHA VOWEL SIGN VOCALIC LL
11366..1136C ; COMBINING GRANTHA DIGIT SIX
11370..11374 ; COMBINING GRANTHA LETTER PA
11480..114C7 ; TIRHUTA OM
114D0..114D9 ; TIRHUTA DIGIT NINE
11580..115B5 ; SIDDHAM VOWEL SIGN VOCALIC RR
115B8..115DD ; SIDDHAM VOWEL SIGN ALTERNATE UU
11600..11644 ; MODI SIGN HUVA
11650..11659 ; MODI DIGIT NINE
11680..116B7 ; TAKRI SIGN NUKTA
116C0..116C9 ; TAKRI DIGIT NINE
11700..11719 ; AHOM LETTER JHA
1171D..1172B ; AHOM SIGN KILLER
11730..1173F ; AHOM SYMBOL VI
118A0..118F2 ; WARANG CITI NUMBER NINETY
118FF ; WARANG CITI OM
11AC0..11AF8 ; PAU CIN HAU GLOTTAL STOP FINAL
12000..12399 ; CUNEIFORM SIGN U U
12400..1246E ; CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
12470..12474 ; CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
12480..12543 ; CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
13000..1342E ; EGYPTIAN HIEROGLYPH AA032
14400..14646 ; ANATOLIAN HIEROGLYPH A530
16800..16A38 ; BAMUM LETTER PHASE-F VUEQ
16A40..16A5E ; MRO LETTER TEK
16A60..16A69 ; MRO DIGIT NINE
16A6E ; MRO DANDA
16A6F ; MRO DOUBLE DANDA
16AD0..16AED ; BASSA VAH LETTER I
16AF0..16AF5 ; BASSA VAH FULL STOP
16B00..16B45 ; PAHAWH HMONG SIGN CIM TSOV ROG
16B50..16B59 ; PAHAWH HMONG DIGIT NINE
16B5B..16B61 ; PAHAWH HMONG NUMBER TRILLIONS
16B63..16B77 ; PAHAWH HMONG SIGN CIM NRES TOS
16B7D..16B8F ; PAHAWH HMONG CLAN SIGN VWJ
16F00..16F44 ; MIAO LETTER HHA
16F50..16F7E ; MIAO VOWEL SIGN NG
16F8F..16F9F ; MIAO LETTER REFORMED TONE-8
1B000 ; KATAKANA LETTER ARCHAIC E
1B001 ; HIRAGANA LETTER ARCHAIC YE
1BC00..1BC6A ; DUPLOYAN LETTER VOCALIC M
1BC70..1BC7C ; DUPLOYAN AFFIX ATTACHED TANGENT HOOK
1BC80..1BC88 ; DUPLOYAN AFFIX HIGH VERTICAL
1BC90..1BC99 ; DUPLOYAN AFFIX LOW ARROW
1BC9C..1BC9F ; DUPLOYAN PUNCTUATION CHINOOK FULL STOP
1D000..1D0F5 ; BYZANTINE MUSICAL SYMBOL GORGON NEO KATO
1D100..1D126 ; MUSICAL SYMBOL DRUM CLEF-2
1D129..1D172 ; MUSICAL SYMBOL COMBINING FLAG-5
1D17B..1D1E8 ; MUSICAL SYMBOL KIEVAN FLAT SIGN
1D200..1D245 ; GREEK MUSICAL LEIMMA
1D300..1D356 ; TETRAGRAM FOR FOSTERING
1D360..1D371 ; COUNTING ROD TENS DIGIT NINE
1D400..1D454 ; MATHEMATICAL ITALIC SMALL G
1D456..1D49C ; MATHEMATICAL SCRIPT CAPITAL A
1D49E ; MATHEMATICAL SCRIPT CAPITAL C
1D49F ; MATHEMATICAL SCRIPT CAPITAL D
1D4A2 ; MATHEMATICAL SCRIPT CAPITAL G
1D4A5 ; MATHEMATICAL SCRIPT CAPITAL J
1D4A6 ; MATHEMATICAL SCRIPT CAPITAL K
1D4A9..1D4AC ; MATHEMATICAL SCRIPT CAPITAL Q
1D4AE..1D4B9 ; MATHEMATICAL SCRIPT SMALL D
1D4BB ; MATHEMATICAL SCRIPT SMALL F
1D4BD..1D4C3 ; MATHEMATICAL SCRIPT SMALL N
1D4C5..1D505 ; MATHEMATICAL FRAKTUR CAPITAL B
1D507..1D50A ; MATHEMATICAL FRAKTUR CAPITAL G
1D50D..1D514 ; MATHEMATICAL FRAKTUR CAPITAL Q
1D516..1D51C ; MATHEMATICAL FRAKTUR CAPITAL Y
1D51E..1D539 ; MATHEMATICAL DOUBLE-STRUCK CAPITAL B
1D53B..1D53E ; MATHEMATICAL DOUBLE-STRUCK CAPITAL G
1D540..1D544 ; MATHEMATICAL DOUBLE-STRUCK CAPITAL M
1D546 ; MATHEMATICAL DOUBLE-STRUCK CAPITAL O
1D54A..1D550 ; MATHEMATICAL DOUBLE-STRUCK CAPITAL Y
1D552..1D6A5 ; MATHEMATICAL ITALIC SMALL DOTLESS J
1D6A8..1D7CB ; MATHEMATICAL BOLD SMALL DIGAMMA
1D7CE..1DA8B ; SIGNWRITING PARENTHESIS
1DA9B..1DA9F ; SIGNWRITING FILL MODIFIER-6
1DAA1..1DAAF ; SIGNWRITING ROTATION MODIFIER-16
1E800..1E8C4 ; MENDE KIKAKUI SYLLABLE M060 NYON
1E8C7..1E8D6 ; MENDE KIKAKUI COMBINING NUMBER MILLIONS
1EE00..1EE03 ; ARABIC MATHEMATICAL DAL
1EE05..1EE1F ; ARABIC MATHEMATICAL DOTLESS QAF
1EE21 ; ARABIC MATHEMATICAL INITIAL BEH
1EE22 ; ARABIC MATHEMATICAL INITIAL JEEM
1EE24 ; ARABIC MATHEMATICAL INITIAL HEH
1EE27 ; ARABIC MATHEMATICAL INITIAL HAH
1EE29..1EE32 ; ARABIC MATHEMATICAL INITIAL QAF
1EE34..1EE37 ; ARABIC MATHEMATICAL INITIAL KHAH
1EE39 ; ARABIC MATHEMATICAL INITIAL DAD
1EE3B ; ARABIC MATHEMATICAL INITIAL GHAIN
1EE42 ; ARABIC MATHEMATICAL TAILED JEEM
1EE47 ; ARABIC MATHEMATICAL TAILED HAH
1EE49 ; ARABIC MATHEMATICAL TAILED YEH
1EE4B ; ARABIC MATHEMATICAL TAILED LAM
1EE4D..1EE4F ; ARABIC MATHEMATICAL TAILED AIN
1EE51 ; ARABIC MATHEMATICAL TAILED SAD
1EE52 ; ARABIC MATHEMATICAL TAILED QAF
1EE54 ; ARABIC MATHEMATICAL TAILED SHEEN
1EE57 ; ARABIC MATHEMATICAL TAILED KHAH
1EE59 ; ARABIC MATHEMATICAL TAILED DAD
1EE5B ; ARABIC MATHEMATICAL TAILED GHAIN
1EE5D ; ARABIC MATHEMATICAL TAILED DOTLESS NOON
1EE5F ; ARABIC MATHEMATICAL TAILED DOTLESS QAF
1EE61 ; ARABIC MATHEMATICAL STRETCHED BEH
1EE62 ; ARABIC MATHEMATICAL STRETCHED JEEM
1EE64 ; ARABIC MATHEMATICAL STRETCHED HEH
1EE67..1EE6A ; ARABIC MATHEMATICAL STRETCHED KAF
1EE6C..1EE72 ; ARABIC MATHEMATICAL STRETCHED QAF
1EE74..1EE77 ; ARABIC MATHEMATICAL STRETCHED KHAH
1EE79..1EE7C ; ARABIC MATHEMATICAL STRETCHED DOTLESS BEH
1EE7E ; ARABIC MATHEMATICAL STRETCHED DOTLESS FEH
1EE80..1EE89 ; ARABIC MATHEMATICAL LOOPED YEH
1EE8B..1EE9B ; ARABIC MATHEMATICAL LOOPED GHAIN
1EEA1..1EEA3 ; ARABIC MATHEMATICAL DOUBLE-STRUCK DAL
1EEA5..1EEA9 ; ARABIC MATHEMATICAL DOUBLE-STRUCK YEH
1EEAB..1EEBB ; ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN
1EEF0 ; ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WITH TATWEEL
1EEF1 ; ARABIC MATHEMATICAL OPERATOR HAH WITH DAL
1F000..1F02B ; MAHJONG TILE BACK
1F030..1F093 ; DOMINO TILE VERTICAL-06-06
1F0A0..1F0AE ; PLAYING CARD KING OF SPADES
1F0B1..1F0BF ; PLAYING CARD RED JOKER
1F0C1..1F0CF ; PLAYING CARD BLACK JOKER
1F0D1..1F0F5 ; PLAYING CARD TRUMP-21
1F100..1F10C ; DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
1F110..1F12E ; CIRCLED WZ
1F130..1F16B ; RAISED MD SIGN
1F170..1F19A ; SQUARED VS
1F1E6..1F202 ; SQUARED KATAKANA SA
1F210..1F23A ; SQUARED CJK UNIFIED IDEOGRAPH-55B6
1F240..1F248 ; TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557
1F250 ; CIRCLED IDEOGRAPH ADVANTAGE
1F251 ; CIRCLED IDEOGRAPH ACCEPT
1F300..1F579 ; JOYSTICK
1F57B..1F5A3 ; BLACK DOWN POINTING BACKHAND INDEX
1F5A5..1F6D0 ; PLACE OF WORSHIP
1F6E0..1F6EC ; AIRPLANE ARRIVING
1F6F0..1F6F3 ; PASSENGER SHIP
1F700..1F773 ; ALCHEMICAL SYMBOL FOR HALF OUNCE
1F780..1F7D4 ; HEAVY TWELVE POINTED PINWHEEL STAR
1F800..1F80B ; DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
1F810..1F847 ; DOWNWARDS HEAVY ARROW
1F850..1F859 ; UP DOWN SANS-SERIF ARROW
1F860..1F887 ; WIDE-HEADED SOUTH WEST VERY HEAVY BARB ARROW
1F890..1F8AD ; WHITE ARROW SHAFT WIDTH TWO THIRDS
1F910..1F918 ; SIGN OF THE HORNS
1F980..1F984 ; UNICORN FACE
1F9C0 ; CHEESE WEDGE
20000..2A6D6 ; CJK UNIFIED IDEOGRAPH-2A6D6
2A700..2B734 ; CJK UNIFIED IDEOGRAPH-2B734
2B740..2B81D ; CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; CJK UNIFIED IDEOGRAPH-2CEA1
2F800..2FA1D ; CJK COMPATIBILITY IDEOGRAPH-2FA1D
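
The boundary test described at the top of this file could be sketched roughly as follows. This is only an illustration, not part of the original data: the names `parse_ranges`, `merge_adjacent`, `boundary_cases`, and `run_boundary_test` are hypothetical, and `is_allowed` stands in for whatever predicate is actually under test. The sketch assumes field 0 is uppercase hexadecimal, as it is in this file, and it merges adjacent entries first so that "last+1" is never a code point that the next line allows.

```python
import re

# Matches "XXXX ; NAME" or "XXXX..YYYY ; NAME"; any other line (prose) is skipped.
LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")


def parse_ranges(text):
    """Return a list of inclusive (first, last) code point pairs from field 0."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        ranges.append((first, last))
    return ranges


def merge_adjacent(ranges):
    """Merge ranges that touch or overlap, e.g. 0589 and 058A become 0589..058A."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def boundary_cases(ranges):
    """Yield (code_point, expected) pairs: first must succeed, last+1 must fail."""
    for first, last in merge_adjacent(ranges):
        yield first, True
        yield last + 1, False


def run_boundary_test(is_allowed, text):
    """Return the code points where is_allowed (hypothetical) disagrees with the data."""
    return [cp for cp, expected in boundary_cases(parse_ranges(text))
            if is_allowed(cp) != expected]


# A few lines copied from this file, used as a small self-check.
SAMPLE = """\
0021..007E ; TILDE
0589 ; ARMENIAN FULL STOP
058A ; ARMENIAN HYPHEN
058D..058F ; ARMENIAN DRAM SIGN
"""

if __name__ == "__main__":
    merged = merge_adjacent(parse_ranges(SAMPLE))
    # Reference predicate built from the data itself; a real test would
    # pass the implementation being checked instead.
    reference = lambda cp: any(lo <= cp <= hi for lo, hi in merged)
    print(run_boundary_test(reference, SAMPLE))
```

To check a real implementation, read the whole file into a string and pass the implementation's predicate as `is_allowed`; an empty result means every boundary case agreed with the data. Testing every code point, as the header also suggests, would be a straightforward extension of the same parsing step.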