Re: sscanf: implement basic character sets

From: Jessica Yu
Date: Mon Mar 07 2016 - 18:09:51 EST


+++ Rasmus Villemoes [03/03/16 00:49 +0100]:
On Fri, Feb 26 2016, Jessica Yu <jeyu@xxxxxxxxxx> wrote:

@@ -2714,6 +2718,57 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
num++;
}
continue;
+ /*
+ * Warning: This implementation of the '[' conversion specifier
+ * deviates from its glibc counterpart in the following ways:
+ * (1) It does NOT support ranges i.e. '-' is NOT a special character
+ * (2) It cannot match the closing bracket ']' itself
+ * (3) A field width is required
+ * (4) '%*[' (discard matching input) is currently not supported
+ *
+ * Example usage:
+ * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]", buf1, buf2, buf3);
+ * if (ret < 3)
+ * // etc..
+ */
+ case '[':
+ {
+ char *s = (char *)va_arg(args, char *);
+ DECLARE_BITMAP(set, 256) = {0};
+ unsigned int len = 0;
+ bool negate = (*fmt == '^');
+
+ /* field width is required */
+ if (field_width == -1)
+ return num;
+
+ if (negate)
+ ++fmt;
+
+ for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
+ set_bit((u8)*fmt, set);
+
+ /* no ']' or no character set found */
+ if (!*fmt || !len)
+ return num;
+ ++fmt;
+

I think it might be useful to be able to do [^] to match any sequence of
characters. If the user passed [] the code below won't match anything,
so we'll return num anyway. In other words, I'd just omit the test for
empty character set. Other than that, LGTM.

Thanks for the review. My only concern is that this behavior
(i.e., having [^] match any sequence of characters) would also deviate
from glibc sscanf behavior (which matches nothing), and would need to
be documented as well. Perhaps it is best to keep these differences
to a minimum, so as to avoid surprises.

Jessica
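
For reference, the two behaviors discussed above can be compared with a
small userspace sketch. This is not the kernel code: scan_set() and its
allow_empty flag are hypothetical names made up for illustration, a plain
bool array stands in for DECLARE_BITMAP()/set_bit(), and the matching loop
is an assumption, since the quoted hunk ends before that part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the '[' conversion in the quoted patch.
 *
 * fmt         points just past the '[' in the format string
 * in          input to scan
 * width       maximum number of characters to copy (the required field width)
 * allow_empty false: reject an empty set, as the patch does
 *             true:  accept it, as suggested in the review
 * out         destination, must hold at least width + 1 bytes
 *
 * Returns the number of characters copied, or -1 if the format is rejected.
 */
static int scan_set(const char *fmt, const char *in, int width,
		    bool allow_empty, char *out)
{
	bool set[256] = { false };
	bool negate = (*fmt == '^');
	size_t len = 0;
	int n = 0;

	if (negate)
		++fmt;

	/* Every character up to ']' is a literal member; no ranges. */
	for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
		set[(unsigned char)*fmt] = true;

	/* No closing ']', or an empty set when that is not allowed. */
	if (!*fmt || (!len && !allow_empty))
		return -1;

	/* Assumed matching step: copy while the character is accepted. */
	while (*in && n < width && set[(unsigned char)*in] != negate)
		out[n++] = *in++;
	out[n] = '\0';

	return n;
}
```

In this model, scan_set("^]", "abc", 8, false, buf) is rejected with -1,
which corresponds to the patch as posted, while passing true for
allow_empty copies "abc". An empty non-negated set ("]") with allow_empty
set to true is accepted but copies nothing.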