/**
  *  \file format.cc
  *  \brief Typesafe "sprintf" workalike
  *
  *  The main advantage of a format-string-driven system over ad-hoc
  *  output and string concatenation is that format strings can be put
  *  in locale databases and translated as a whole. That is not as
  *  easy with `cout' snippets:
  *  `cout << "Price: EUR " << n << " plus " << tax << "% VAT"' would
  *  require the three strings to be translated separately, which is
  *  more work and causes problems when one snippet occurs multiple times
  *  in a program, at least when a `gettext'-like approach is used.
  *
  *  With `format', this becomes
  *     `cout << (format("Price: EUR %d plus %d%% VAT") << n << tax)',
  *  which is easier to translate.
  *
  *  The disadvantage of `sprintf' is that it is neither typesafe nor
  *  extensible. How do we solve that? Instead of relying on the format
  *  string for types, we take type information from the compiler and
  *  use the format string only as a guide. For example, using `%d'
  *  to output a number gets you the number in decimal, `%x' gets you
  *  lower-case hex. Unlike `sprintf', however, `%s' does not try to
  *  interpret the number as a string; it gets you decimal as well.
  *  And extensibility? Add your own function
  *    string_t toString(const MyObject&, char typ, int width, int prec,
  *                      int flags, format& fmt)
  *  which turns MyObjects into strings. typ is the format code, width
  *  is the field width (zero if not specified), prec is the precision
  *  (negative if not specified), flags is a bitfield of flags, and fmt
  *  is the formatter object (for example, to use setState/clearState
  *  to affect conditionals, see below). You needn't honor width: the
  *  formatter pads the returned string to the field width itself.
  *
  *  We support up to 10 (MAX_ARGS) arguments. An alternative to a
  *  static limit would have been a linked list, which is less
  *  efficient. The above `format' expression gets translated by
  *  g++ into 6 `movl' instructions (and two `leal' to get the
  *  addresses of `n' and `tax'), followed by the call to
  *  format::operator string_t(). Using a linked list would require
  *  more run-time list walking, and more `leal's.
  *
  *  The format string contains escapes like this:
  *     \%\<index$>\<flags>\<width>\<.prec>\<code>
  *  All parts except code are optional.
  *
  *  <pre>
  *  \<index$>  set argument pointer to /index/ (counting from zero),
  *             e.g., "%5$d" means format 6th argument with type 'd'
  *  \<flags>   contains
  *             "#"  -- alternate format
  *             "0"  -- pad with zero, not space; overrides '-'
  *             "-"  -- left justify (default: right justify)
  *             " "  -- (space) prepend blank to positive numbers
  *             "+"  -- show positive sign if nonnegative, overrides " "
  *             "'"  -- use 1000-separator. Can't be used with '0'
  *             "!"  -- do not actually output value (i.e. only set
  *                     condition codes)
  *  \<width>   positive number, output gets at least that many places,
  *             padded with spaces/zeroes to the left/right if needed.
  *             E.g., "%-10d" left justify to 10 places
  *  \<.prec>   precision. differs for each type.
  *  \<code>    modifier. differs for each type.
  *
  *  </pre>
  *
  *  Formats for strings (string_t):
  *  - precision is maximum number of characters to copy from string
  *    (e.g. "%.3s" copies at most 3 characters)
  *  - the modifier is ignored; any code works
  *
  *  Formats for C strings (const char*):
  *  - precision works like for strings. String need not be 0-terminated
  *    when this is used.
  *  - modifier 'p' causes output in pointer format (void*), any other
  *    causes output as string.
  *
  *  Formats for integers (signed/unsigned char/short/int/long):
  *  - precision is minimum number of digits. Output is zero-padded
  *    when needed
  *  - modifier selects base: 'o' octal, 'x' hex (lower-case), 'X'
  *    hex (upper-case), others decimal.
  *  - flag '#' causes "0" or "0x" to be prepended to oct/hex numbers
  *
  *  Formats for floating point (float/double/long double):
  *  - precision is number of decimals after period, defaults to 6.
  *  - modifier selects output format: 'e'/'E' always exponential format
  *    (with 'e'/'E' for the exponent), 'f'/'F' always decimal format,
  *    'g'/'G' auto-select format and omit period if possible
  *  - flag '#' causes 'g'/'G' not to remove trailing zeroes
  *  - flag ''' is currently not supported
  *  - currently, after a floating-point output, condition codes 0 and 1
  *    are unspecified.
  *
  *  Formats for pointers (void*)
  *  - non-null pointers are output as unsigned longs with the '#'
  *    flag set and type 'x', hence the other integer things work, too.
  *
  *  Format strings can contain conditional constructs.
  *  - \%\<n>{\<value-if-true>\%}
  *  - \%\<n>{\<value-if-true>\%|\<value-if-false>\%}
  *
  *  \<n> is the number of the condition we're testing, 0 to 31. If
  *  omitted, it is zero. Currently, there are the conditions 0
  *  (true iff the last number output was zero) and 1 (true iff the
  *  last number output was 1). This allows for things like
  *    "%d %1{item%|items%}"
  *  Conditions can be nested up to 31 levels. Conditions can be negated
  *  using "\%!\<n>{". To test a value without outputting it, use the '!'
  *  flag for it:
  *    "%!d%1{one item%|%0$d items%}"
  *  Note the use of "%0$d" to reset the argument pointer to the first
  *  argument.
  *
  *  To extend format for your own types, define a function
  *  printHelper() which converts your object into a string. In
  *  addition, you can also define a format_traits<> specialisation.
  *
  *  Currently, 'format' is a class. It might also be a function or
  *  some other callable entity in the future. We only guarantee that
  *  expressions of the form 'format("string") << arg' or
  *  'format("string", arg)' will work.
  *
  *  \todo
  *  - support %0'd (zeropad with grouping)
  *  - support %'f (floating point with grouping)
  *  - support infinities/NaNs in floating point. Needs C99 or BSD.
  *  - should FP set condition codes?
  *
  *  (c) 2001-2006 Stefan Reuther <Streu@gmx.de>
  *
  *  This program is free software; you can redistribute it and/or
  *  modify it under the terms of file `COPYING' that comes with the
  *  source code.
  *
  *  This program is distributed in the hope that it will be useful,
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  */
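
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. The escape syntax from the file header,
   %<index$><flags><width><.prec><code>, can be taken apart in one
   left-to-right scan. The sketch below is a simplified standalone variant
   of the scan in format::operator string_t(); the names Spec and
   parseSpec are hypothetical and not part of cpluslib.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: one parsed escape.
struct Spec {
    int index;          // argument index, -1 if no "<index$>" was given
    std::string flags;  // flag characters in the order they appeared
    int width;          // field width, 0 if not specified
    int prec;           // precision, -1 if not specified
    char code;          // format code, 0 if the escape is incomplete
};

// Parse the escape that starts at s[0] == '%'.
static Spec parseSpec(const char* s)
{
    Spec r = { -1, "", 0, -1, 0 };
    const char* p = s + 1;
    for (;;) {
        // Flags come first. '0' counts as a flag, so a width never starts with '0'.
        while (*p != 0 && std::string("#0- +'!").find(*p) != std::string::npos)
            r.flags += *p++;
        int number = 0;
        while (*p >= '0' && *p <= '9')
            number = 10*number + (*p++ - '0');
        if (*p != '$') {
            r.width = number;
            break;
        }
        // "<number>$" selects an argument; flags and width start over.
        r.index = number;
        r.flags.clear();
        ++p;
    }
    if (*p == '.') {
        r.prec = 0;
        ++p;
        while (*p >= '0' && *p <= '9')
            r.prec = 10*r.prec + (*p++ - '0');
    }
    r.code = *p;
    return r;
}
```

   For example, parseSpec("%-10d") yields flags "-", width 10 and code 'd',
   and parseSpec("%0$d") yields index 0 with no flags left, which matches
   the header's remark about resetting the argument pointer. */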

#include <cstring>
#include <climits>
#include <cmath>
#include <iostream>
#include "cpluslib/format.h"

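/** Maximum size of an unsigned long formatted in any supported base,
    plus one for the terminating null character. Very generous
    approximation: one place per bit. Grouping separators are added
    later on the string_t, not in this buffer. */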
#define DIGIT_BUF_SIZE (CHAR_BIT * sizeof(unsigned long) + 1)

/** Format an unsigned long into a buffer.
    \param c    Value
    \param typ  Format specifier (o/d/u/x/X)
    \param p    Pointer one-past-end of sufficiently large buffer */
static char*
formatUnsigned(unsigned long c, char typ, char* p)
{
    int base = (typ == 'o') ? 8 : (typ == 'x' || typ == 'X') ? 16 : 10;
    const char*const alpha = (typ == 'X') ? "0123456789ABCDEF" : "0123456789abcdef";
    *--p = 0;
    while (c) {
        *--p = alpha[c % base];
        c /= base;
    }
    if (!*p)
        *--p = '0';
    return p;
}
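
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. formatUnsigned() fills its buffer from the end towards
   the front, because repeated division produces the least significant
   digit first. A minimal standalone sketch of the same idea follows; the
   name toBase and its signature are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Hypothetical helper: render v in the given base (2..16) as a std::string.
static std::string toBase(unsigned long v, unsigned base, bool upper)
{
    const char* const alpha = upper ? "0123456789ABCDEF" : "0123456789abcdef";
    // One slot per bit is enough for every base >= 2.
    char buffer[CHAR_BIT * sizeof(unsigned long)];
    char* const end = buffer + sizeof buffer;
    char* p = end;
    do {
        *--p = alpha[v % base];   // least significant digit is stored last
        v /= base;
    } while (v != 0);
    return std::string(p, end);
}
```

   The do/while form emits a single "0" for the value zero without the
   extra check that formatUnsigned() performs after its loop. */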

/** Add grouping to a decimal number */
static void
addGrouping(string_t& s, string_t::size_type start)
{
    enum { GROUP = 3 };
    while (start > GROUP) {
        start -= GROUP;
        s.insert(start, ",", 1);
    }
}
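
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. addGrouping() walks from the end of the digits towards
   the front in steps of three. The standalone sketch below does the same
   on a std::string; the name groupThousands is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: insert ',' between groups of three digits.
static std::string groupThousands(std::string digits)
{
    std::string::size_type pos = digits.length();
    while (pos > 3) {
        pos -= 3;
        digits.insert(pos, 1, ',');
    }
    return digits;
}
```

   Working backwards means each insertion leaves the positions that are
   still to be visited unchanged. */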

/** Convert unsigned long to string.
    \param c   Value
    \param typ Type. Recognized types are 'c' (character), 'o', 'x'
               and 'X'; anything else yields decimal
    \param wi  Field width; only used here for zero-padding
    \param pre Precision, minimum number of digits
    \param flags Most flags recognized
    \param fmt Format object this function is called for */
string_t
toString(unsigned long c, char typ, int wi, int pre, int flags, format& fmt)
{
    fmt.clearState(format::STATE_NULL);
    fmt.clearState(format::STATE_ONE);
    if (c == 0)
        fmt.setState(format::STATE_NULL);
    if (c == 1)
        fmt.setState(format::STATE_ONE);

    if (typ == 'c') {
        return string_t(1, char(c));
    } else {
        char buffer[DIGIT_BUF_SIZE];
        char* p = formatUnsigned(c, typ, buffer + sizeof buffer);

        /* Now: p -> formatted value */
        string_t value(p);

        /* pre = minimum # of digits */
        if (pre >= 0 && unsigned(pre) > value.length())
            value.insert(string_t::size_type(0), pre - value.length(), '0');

        if (flags & format::FORMAT_GROUP)
            addGrouping(value, value.length());

        if (flags & format::FORMAT_ALTERNATE) {
            if (typ == 'o') {
                if (value[0] != '0')
                    value.insert(0, "0", 1);
            } else if (typ == 'x') {
                value.insert(0, "0x", 2);
            } else if (typ == 'X') {
                value.insert(0, "0X", 2);
            }
        }

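        if (flags & (format::FORMAT_BLANK + format::FORMAT_SIGN))
            --wi;
        /* wi can be zero or negative here; compare as signed so that
           we never compute a huge unsigned padding count */
        if (wi > 0 && value.length() < string_t::size_type(wi))
            if (flags & format::FORMAT_ZEROPAD)
                value.insert(string_t::size_type(0), wi - value.length(), '0');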
        if (flags & format::FORMAT_SIGN)
            value.insert(0, "+", 1);
        else if (flags & format::FORMAT_BLANK)
            value.insert(0, " ", 1);
        return value;
    }
}

/** Convert character to string. Characters are always treated as if
    they were unsigned, even if they are signed. */
string_t
toString(char c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<unsigned long>(static_cast<unsigned char>(c)),
                    typ, wi, pre, flags, fmt);
}

/** Convert unsigned char to string. */
string_t
toString(unsigned char c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<unsigned long>(c), typ, wi, pre, flags, fmt);
}

/** Convert unsigned short to string. */
string_t
toString(unsigned short c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<unsigned long>(c), typ, wi, pre, flags, fmt);
}

/** Convert unsigned int to string. */
string_t
toString(unsigned int c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<unsigned long>(c), typ, wi, pre, flags, fmt);
}

/** Convert signed char to string. */
string_t
toString(signed char c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<long>(c), typ, wi, pre, flags, fmt);
}

/** Convert signed short to string. */
string_t
toString(signed short c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<long>(c), typ, wi, pre, flags, fmt);
}

/** Convert signed int to string. */
string_t
toString(signed int c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<long>(c), typ, wi, pre, flags, fmt);
}

/** Convert signed long to string. */
string_t
toString(signed long c, char typ, int wi, int pre, int flags, format& fmt)
{
    if (typ != 'd' || c >= 0) {
        return toString(static_cast<unsigned long>(c), typ, wi, pre, flags, fmt);
    } else {
        /* Format the magnitude with a forced '+' sign, then turn that
           sign into '-'. Negating after the conversion to unsigned
           avoids signed overflow for LONG_MIN. */
        string_t result = toString(0UL - static_cast<unsigned long>(c), typ, wi, pre, flags | format::FORMAT_SIGN, fmt);
        string_t::size_type t = result.find('+');
        if (t != string_t::npos)
            result[t] = '-';
        return result;
    }
}

/** Convert float to string. */
string_t
toString(float c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<long double>(c), typ, wi, pre, flags, fmt);
}

/** Convert double to string. */
string_t
toString(double c, char typ, int wi, int pre, int flags, format& fmt)
{
    return toString(static_cast<long double>(c), typ, wi, pre, flags, fmt);
}

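/** Remove trailing zeroes after the decimal point, and the decimal
    point itself if nothing remains behind it. Zeroes in front of the
    decimal point are kept ("100.000" becomes "100", not "1"). */
static inline void
trimZeroes(string_t& s)
{
    string_t::size_type dot = s.find('.');
    if (dot == string_t::npos)
        return;
    /* n cannot be npos because the '.' itself is not a zero */
    string_t::size_type n = s.find_last_not_of('0');
    s.erase(n == dot ? dot : n+1);
}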

#define MAX_DIGITS 100

/** Compute v^exp. */
static long double
ipow(long double v, int exp)
{
    long double rv = 1;
    if (exp > 0) {
        while (exp) {
            if (exp & 1) {
                rv *= v;
                --exp;
            }
            v *= v;
            exp /= 2;
        }
    } else {
        exp = -exp;
        while (exp) {
            if (exp & 1) {
                rv /= v;
                --exp;
            }
            v *= v;
            exp /= 2;
        }
    }
    return rv;
}
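
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. ipow() uses binary exponentiation (square and
   multiply), which needs a number of multiplications proportional to the
   number of bits in the exponent rather than to the exponent itself.
   A minimal standalone sketch for non-negative exponents follows; the
   name powBySquaring is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: compute base^exp for exp >= 0.
static long double powBySquaring(long double base, unsigned exp)
{
    long double result = 1;
    while (exp != 0) {
        if (exp & 1)
            result *= base;   // this bit of the exponent is set
        base *= base;         // base^1, base^2, base^4, ...
        exp >>= 1;
    }
    return result;
}
```

   ipow() handles negative exponents with the same loop, dividing by the
   running power instead of multiplying. */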

/** Round string so that it contains pre+1 digits.
    \param s     [in/out] String to round. Represents a number in [0,10).
    \param pre   [in] Requested precision; number of fractional digits
                 of s.
    \param expon [in/out] Exponent. Actual value is s*10^expon.
    \param allow_expand [in] Allow increase in precision if rounding
                 requires that.

    If rounding causes the value to become exactly 10.0, this will
    reduce it to 1.0 and increase the exponent in return. If
    allow_expand is true, the precision will be increased as well to
    keep the same number of effective fractional digits for 'f'
    formats. With 'e' formats, precision should not be increased as
    the shift in magnitude will be represented by the 'e+NN' part. */
static void
roundString(string_t& s, int pre, int& expon, bool allow_expand)
{
    ++pre;
    if (s.length() > pre) {
        string_t::size_type pos = pre;
        if (s[pos] >= '5') {
            while (pos > 0 && s[pos-1] == '9') {
                s[pos-1] = '0';
                --pos;
            }
            if (pos == 0) {
                s.insert(string_t::size_type(0), 1, '1');
                ++expon;
                if (allow_expand)
                    ++pre;
            } else {
                s[pos-1]++;
            }
        }
        s.erase(pre);
    }
}
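
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. roundString() rounds a string of decimal digits half
   up and lets the carry ripple through trailing nines. The standalone
   sketch below shows only that core step, without the exponent
   bookkeeping; the name roundDigits is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: round `digits` to its first `keep` digits (half up).
// Returns true if the carry ran out of the front, e.g. "9996" -> "1000".
static bool roundDigits(std::string& digits, std::string::size_type keep)
{
    bool carried = false;
    if (digits.length() > keep) {
        if (digits[keep] >= '5') {
            std::string::size_type pos = keep;
            while (pos > 0 && digits[pos-1] == '9')
                digits[--pos] = '0';          // 9 + carry = 0, carry moves on
            if (pos == 0) {
                digits.insert(std::string::size_type(0), 1, '1');
                carried = true;
            } else {
                ++digits[pos-1];
            }
        }
        // After a carry out of the front, the string is one digit longer.
        digits.erase(carried ? keep + 1 : keep);
    }
    return carried;
}
```

   In roundString() the carried case corresponds to incrementing the
   exponent, because the value has moved up by one order of magnitude. */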

/** Returns digits of c, so that c = 'return'*10^'exp_'. */
static string_t
getDigits(long double c, int& exp_)
{
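    /* figure out exponent */
    int exp2, exp10;
    std::frexp(c, &exp2);
    exp10 = exp2 * 19728L / 65536;     // 65536/19728 = 3.3219 = log2(10)
    long double divi = ipow(10, exp10);
    /* The estimate can be too large by more than one for small values
       because the division truncates towards zero, hence a loop */
    while (divi > c && c) {
        divi /= 10;
        --exp10;
    }
    c /= divi;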

    /* now, c \in [0,10); format it */
    string_t rv;
    rv.reserve(MAX_DIGITS);
    for (int i = 1; i < MAX_DIGITS; ++i) {
        int digit = int(c);
        rv.append(1, char(digit + '0'));
        c = 10*(c - digit);
    }
    exp_ = exp10;
    return rv;
}
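
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. getDigits() estimates the decimal exponent from the
   binary exponent by multiplying with 19728/65536, an approximation of
   log10(2), and then corrects the estimate downwards. The standalone
   sketch below isolates that step for positive doubles; the name
   decimalExponent is hypothetical, and std::pow stands in for ipow().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: largest e with 10^e <= v, for finite v > 0.
static int decimalExponent(double v)
{
    int exp2;
    std::frexp(v, &exp2);                    // v = m * 2^exp2 with m in [0.5, 1)
    int exp10 = exp2 * 19728L / 65536;       // exp2 * log10(2), truncated
    double scale = std::pow(10.0, exp10);
    while (scale > v) {                      // estimate was too large
        scale /= 10;
        --exp10;
    }
    return exp10;
}
```

   For v = 7e-5 the first estimate is -3, so two correction steps are
   needed to reach -5. */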

/** Convert long double to string.
    \param c      Value to convert
    \param typ    Format. Accepted values are 'f' (9.9999), 'e' (9.99e+99),
                  'g' (auto-select), and 'E'/'G' (same, but with upper-case
                  'e'). Default is 'f'.
    \param wi     Minimum width.
    \param pre    Precision (number of decimals after point). Defaults to 6 if negative.
    \param flags  Flags. Most flags recognized.
    \param fmt    Format object this function is called for */
string_t
toString(long double c, char typ, int wi, int pre, int flags, format& fmt)
{
    using namespace std;

    bool negative = (c < 0);
    bool trim     = false;
    if (negative)
        c = -c;

    /* figure out format to use */
    if (pre < 0)
        pre = 6;
    if (typ == 'g' || typ == 'G') {
        if (pre == 0)
            pre = 1;
        if (c < 1.0E-4 || c > ipow(10, pre))
            typ = (typ == 'g' ? 'e' : 'E');
        trim = !(flags & format::FORMAT_ALTERNATE);
    }
    /* figure out exponent */
    int   expon;
    string_t rv = getDigits(c, expon);

    if (typ == 'e' || typ == 'E') {
        /* First, round to requested precision */
        roundString(rv, pre, expon, false);
        rv.insert(1, 1, '.');
        if (trim)
            trimZeroes(rv);
        
        if (expon || !trim) {
            rv.append(1, typ);
            rv += toString(expon, 'd', 3, 0,
                           format::FORMAT_SIGN | format::FORMAT_ZEROPAD | (flags & format::FORMAT_GROUP),
                           fmt);
        }
    } else {
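        /* Non-exponential format */
        if (expon >= 0) {
            /* need expon+1 integer digits plus pre fractional digits */
            if (rv.length() < string_t::size_type(pre + expon + 1))
                rv.append(pre + expon + 1 - rv.length(), '0');
            roundString(rv, pre + expon, expon, true);
            rv.insert(expon+1, 1, '.');
            if (trim)
                trimZeroes(rv);
        } else {
            /* Value is below 1. Prepend the leading zeroes first so that
               roundString() never sees a negative digit count, which
               would leave the string unrounded. */
            rv.insert(string_t::size_type(0), -expon, '0');
            expon = 0;
            if (rv.length() < string_t::size_type(pre + 1))
                rv.append(pre + 1 - rv.length(), '0');
            roundString(rv, pre, expon, true);
            rv.insert(expon+1, 1, '.');
            if (trim)
                trimZeroes(rv);
        }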
    }

    /* zeropad */
    if (flags & format::FORMAT_ZEROPAD) {
        int n = wi;
        if (negative || (flags & (format::FORMAT_BLANK | format::FORMAT_SIGN)))
            --n;
        /* n can be zero or negative; compare as signed */
        if (n > 0 && rv.length() < string_t::size_type(n))
            rv.insert(string_t::size_type(0), n - rv.length(), '0');
    }

    /* add sign */
    if (negative)
        rv.insert(0, "-", 1);
    else if (flags & format::FORMAT_SIGN)
        rv.insert(0, "+", 1);
    else if (flags & format::FORMAT_BLANK)
        rv.insert(0, " ", 1);

    return rv;
}

/** Convert C string to string. */
string_t
toString(const char* c, char typ, int wi, int pre, int flags, format& fmt)
{
    if (!c)
        return "(nil)";

    if (typ == 'p')
        return toString((const void*) c, typ, wi, pre, flags, fmt);

    if (pre < 0) {
        return string_t(c);
    } else {
        int len = 0;
        while (len < pre && c[len])
            ++len;
        return string_t(c, len);
    }
}

/** Convert pointer to string. */
string_t
toString(const void* c, char typ, int wi, int pre, int flags, format& fmt)
{
    if (!c)
        return "(nil)";
    else
        /* FIXME? */
        return toString(reinterpret_cast<unsigned long>(c), 'x', wi, pre,
                        flags | format::FORMAT_ALTERNATE, fmt);
}

/** Format string for output. */
string_t
toString(const string_t& s, char typ, int wi, int pre, int flags, format& fmt)
{
    if (pre >= 0 && s.length() > pre)
        return s.substr(0, pre);
    else
        return s;
}

/** Convert to string. This is the function where the actual
    formatting happens. */
format::operator string_t()
{
    string_t result;
    const char* p = fmt;
    unsigned arg_index = 0;
    int level = 0;
    unsigned conditions = 0;
    state = 0;

    while (const char* q = std::strchr(p, '%')) {
        if (!conditions)
            result.append(p, q - p);
        /* Format sequence: `%<index$><flags><width><.prec><code>' */
        int width = 0, prec = -1, flags = 0;
        bool show = true;
        p = q;
     again:                      // restart parsing after `$'
        while (1) {
            ++p;
            if (*p == '#')
                flags |= FORMAT_ALTERNATE;
            else if (*p == '0')
                flags |= FORMAT_ZEROPAD;
            else if (*p == '-')
                flags |= FORMAT_LEFTJUST;
            else if (*p == ' ')
                flags |= FORMAT_BLANK;
            else if (*p == '+')
                flags |= FORMAT_SIGN;
            else if (*p == '\'')
                flags |= FORMAT_GROUP;
            else if (*p == '!')
                show = false;
            else
                break;
        }
        /* now parse width */
        while (*p >= '0' && *p <= '9')
            width = 10*width + (*p++ - '0');

        if (*p == '$') {
            /* %index$.... == insert named parameter, indexing starts at 0 */
            arg_index = width;
            width = 0;
            flags = 0;
            goto again;
        }

        /* precision specified? */
        if (*p == '.') {
            prec = 0;
            ++p;
            while (*p >= '0' && *p <= '9')
                prec = 10*prec + (*p++ - '0');
        }

        /* Convert argument */
        if (*p == 0)
            break;              // invalid format string

        if (*p == '{') {
            /* condition test */
            /*    "%d %1{item%|items%}" */
            ++level;
            conditions <<= 1;
            if (hasState(width & 31))
                show = !show;
            if (show) {
                /* condition is false: suppress output until `%|' or `%}' */
                conditions |= 1;
            }
        } else if (*p == '|') {
            if (level)
                conditions ^= 1;
        } else if (*p == '}') {
            if (level) {
                conditions >>= 1;
                --level;
            }
        } else if (!conditions) {
            string_t here;
            if (*p == '%') {
                here = *p;
            } else {
                if (arg_index >= n) {
                    here = "<invalid>";
                } else {
                    here = fac[arg_index].func(fac[arg_index].data, *p, width, prec, flags, *this);
                    ++arg_index;
                }
            }

            /* postprocess it */
            if (show) {
                if (here.length() < width) {
                    if (flags & FORMAT_LEFTJUST)
                        here.append(width - here.length(), ' ');
                    else
                        here.insert(string_t::size_type(0), width - here.length(), ' ');
                }
                result += here;
            }
        }
        ++p;
    }
    result.append(p);
    return result;
}
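
/* Illustration only, kept inside a comment so that it is not compiled
   into this file. The conditional constructs are tracked with one bit per
   open level: text is emitted only while every bit is clear. The
   standalone sketch below shows that bookkeeping on its own, without
   argument formatting. The name expandConditions is hypothetical, and
   representing the condition codes as bit <n> of an unsigned value is an
   assumption made for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: expand "%<n>{ ... %| ... %}" using bit <n> of `state`.
static std::string expandConditions(const std::string& text, unsigned state)
{
    std::string out;
    unsigned suppressed = 0;   // one bit per open conditional, 1 = skip
    int level = 0;
    for (std::string::size_type i = 0; i < text.length(); ++i) {
        if (text[i] != '%' || i + 1 == text.length()) {
            if (!suppressed)
                out += text[i];
            continue;
        }
        ++i;
        unsigned n = 0;
        while (i < text.length() && text[i] >= '0' && text[i] <= '9')
            n = 10*n + (text[i++] - '0');
        if (i == text.length())
            break;                      // incomplete escape at end of text
        if (text[i] == '{') {
            ++level;
            suppressed <<= 1;
            if (!(state & (1u << (n & 31))))
                suppressed |= 1;        // condition false: skip first branch
        } else if (text[i] == '|') {
            if (level)
                suppressed ^= 1;        // switch to the other branch
        } else if (text[i] == '}') {
            if (level) {
                suppressed >>= 1;
                --level;
            }
        } else if (!suppressed) {
            out += text[i];             // other escapes: copy the code as is
        }
    }
    return out;
}
```

   With state 2 (condition 1 set), "%1{item%|items%}" expands to "item";
   with state 0 it expands to "items". Because an outer bit stays set
   while an inner conditional toggles its own bit, a suppressed outer
   branch hides everything nested inside it. */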

/** Output formatted string. This writes the result of the format
    operation to the output stream on the left-hand side of `<<'. */
std::ostream&
operator<<(std::ostream& os, format& fmt)
{
    return os << string_t(fmt);
}
