.. include:: ./global.rst

###################
Unicode Code Points
###################

The "characters" in strings (and as standalone values) are Unicode
code points, normally written as ``#U+`` followed by as many
hexadecimal digits as the code point needs.  Leading zeroes are not
required (but may be necessary, see below).
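
The optional leading zeroes can be checked directly.  The following is
a minimal sketch that uses only forms found elsewhere in this section
(``#U+...``, ``:=``, ``printf`` and ``eqv?``).  The names ``bare`` and
``padded`` are illustrative, and the expectation that both spellings
give the same code point follows from the statement above rather than
from a recorded run.

.. code-block:: idio

   ;; two spellings of U+0127, without and with a leading zero
   bare := #U+127
   padded := #U+0127

   ;; both spellings should name the same code point
   printf "Does <<%s>> eqv? <<%s>>? %s\n" bare padded (eqv? bare padded)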

You can input a specific character with ``#\X`` where *X* is the UTF-8
encoding of the code point.

There is also a limited number of *named* characters, written like
``#\{newline}``.

.. code-block:: idio
   :caption: :file:`code-points.idio`

   ;; ħ is U+0127 LATIN SMALL LETTER H WITH STROKE
   c1 := #U+127

   c2 := #\ħ

   ;; the unicode type is much like fixnum and can be compared with eqv?
   printf "Does <<%s>> eqv? <<%s>>? %s\n" c1 c2 (eqv? c1 c2)

   ;; SPACE
   c1 = #U+20

   ;; or using a named character
   c2 = #\{space}

   printf "Does <<%s>> eqv? <<%s>>? %s\n" c1 c2 (eqv? c1 c2)

.. code-block:: console

   $ idio code-points
   Does <<ħ>> eqv? <<ħ>>? #t
   Does << >> eqv? << >>? #t

There are a number of Unicode-derived *Category* and *Property*
predicates and a very limited set of conversion functions.

.. code-block:: idio
   :caption: :file:`unicode-functions.idio`

   c1 := #U+127

   if (Lowercase? c1) {
     printf "%s ->Uppercase %s\n" c1 (->Uppercase c1)
   }

   ;; tell me what you know!
   unicode/describe c1

.. code-block:: console

   $ idio unicode-functions
   ħ ->Uppercase Ħ
   0127;;Ll;;;;;;;;;;0126;;0126 # Letter Lowercase Alphabetic Uppercase=0126 Titlecase=0126 

.. include:: ./commit.rst