Commit Graph

27 Commits

Author SHA1 Message Date
Yonatan Goldschmidt
df5c3bd976 py/unicode: Add unichar_isalnum(). 2020-01-12 13:03:57 +11:00
Damien George
7c85c7c210 py/unicode: Fix check for valid utf8 being stricter about contn chars. 2018-11-26 16:13:08 +11:00
Damien George
19aee9438a py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions.
This patch provides inline versions of the utf8 helper functions for the
case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0).
This saves code size.

The unichar_charlen function is also renamed to utf8_charlen to match the
other utf8 helper functions, and the signature of this function is adjusted
for consistency (const char* -> const byte*, mp_uint_t -> size_t).
2018-02-14 18:19:22 +11:00
tll
68c28174d0 py/objstr: Add check for valid UTF-8 when making a str from bytes.
This patch adds a function utf8_check() to check for a valid UTF-8 encoded
string, and calls it when constructing a str from raw bytes.  The feature
is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and
is enabled if unicode is enabled.  It costs about 110 bytes on Thumb-2, 150
bytes on Xtensa and 170 bytes on x86-64.
2017-09-06 16:43:09 +10:00
Alexander Steffen
55f33240f3 all: Use the name MicroPython consistently in comments
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George
afc5063539 py/unicode: Comment-out unused function unichar_isprint. 2016-12-28 17:50:10 +11:00
Alex March
69d9e7d27d py/repl: Check for an identifier char after the keyword.
- As described in the #1850.
- Add cmdline tests.
2016-02-17 08:56:15 +00:00
Dave Hylands
afaa66b657 py: Minor improvement to unichar_isxdigit
This drops the size of unicode_isxdigit from 0x1e + 0x02 filler to
0x14 bytes (so net code reduction of 12 bytes) and will make
unicode_is_xdigit perform slightly faster.
2015-05-20 09:31:22 +01:00
Dave Hylands
3ad94d6072 extmod: Add ubinascii.unhexlify
This also pulls out hex_digit from py/lexer.c and makes unichar_hex_digit
2015-05-20 09:29:22 +01:00
Damien George
4dea922610 py: Adjust some spaces in code style/format, purely for consistency. 2015-04-09 15:29:54 +00:00
Damien George
51dfcb4bb7 py: Move to guarded includes, everywhere in py/ core.
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George
5318cc028a py: Tidy up a few function declarations. 2014-12-10 22:37:07 +00:00
Damien George
40f3c02682 Rename machine_(u)int_t to mp_(u)int_t.
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky
9e215fa4c2 py: Make unichar_charlen() accept/return machine_uint_t. 2014-06-28 23:15:29 +03:00
Damien George
e04a44e2f6 py: Small comments, name changes, use of machine_int_t. 2014-06-28 10:27:23 +01:00
Paul Sokolovsky
1044c3dfe6 unicode: Make get_char()/next_char()/charlen() be 8-bit compatible.
Based on config define.
2014-06-27 00:04:19 +03:00
Paul Sokolovsky
46d31e9ca9 unicode: Add utf8_ptr_to_index().
Useful when we have pointer to char inside string, but need to return char
index. (E.g. str.find()).
2014-06-27 00:04:19 +03:00
Chris Angelico
c88987c1af py: Implement basic unicode functions. 2014-06-27 00:04:17 +03:00
Paul Sokolovsky
59c675a64c py: Include mpconfig.h before all other includes.
It defines types used by all other headers.

Fixes #691.
2014-06-21 22:43:22 +03:00
Paul Sokolovsky
b0bb458810 unicode: String API is const byte*.
We still have that char vs byte dichotomy, but majority of string operations
now use byte.
2014-06-14 06:22:11 +03:00
Damien George
c59af52e84 py: Rename some unichar functions for consistency. 2014-05-11 17:53:11 +01:00
Paul Sokolovsky
6913521911 objstr: Implement .lower() and .upper(). 2014-05-10 19:49:07 +03:00
Damien George
04b9147e15 Add license header to (almost) all files.
Blanket wide to all .c and .h files.  Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.

Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George
175cecfa87 py: Make form-feed character a space (following C isspace).
Eg, in CPython stdlib, email/header.py has a form-feed character.
2014-04-10 11:39:36 +01:00
Paul Sokolovsky
520e2f58a5 Replace global "static" -> "STATIC", to allow "analysis builds". Part 2. 2014-02-12 18:31:30 +02:00
Paul Sokolovsky
0b7184dcb8 Implement octal and hex escapes in strings. 2014-01-22 22:48:25 +02:00
Damien George
8cc96a35e5 Put unicode functions in unicode.c, and tidy their names. 2013-12-30 18:23:50 +00:00