VirtualBox

Changeset 36428 in vbox for trunk


Timestamp:
Mar 25, 2011 12:46:45 PM (14 years ago)
Author:
vboxsync
Message:

com/string.h: AssertLogRel when encountering an invalid encoding in the copyFrom*() methods doing UTF-16/8 conversions. The ASSUMPTION is that all input strings are correctly encoded and that this is enforced by VirtualBox border code before things get down to Utf8Str or Bstr.
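The commit message assumes that "border code" rejects badly encoded strings before they reach Utf8Str or Bstr. As a rough illustration of what such a check could look like, here is a minimal standalone UTF-8 validator. The name isValidUtf8 is hypothetical and is not taken from VirtualBox or IPRT; real border code would presumably rely on the project's own runtime helpers instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical border-code helper (NOT part of VirtualBox or IPRT).
// Returns true if 's' is well-formed UTF-8: no stray continuation bytes,
// no truncated or overlong sequences, no surrogates, nothing above U+10FFFF.
bool isValidUtf8(const std::string &s)
{
    const size_t n = s.size();
    size_t i = 0;
    while (i < n)
    {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        size_t   cFollow; // number of continuation bytes expected
        uint32_t cp;      // code point being assembled
        uint32_t cpMin;   // smallest code point allowed for this length
        if (b0 < 0x80)                { cFollow = 0; cp = b0;        cpMin = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cFollow = 1; cp = b0 & 0x1F; cpMin = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cFollow = 2; cp = b0 & 0x0F; cpMin = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cFollow = 3; cp = b0 & 0x07; cpMin = 0x10000; }
        else
            return false; // stray continuation byte or invalid lead byte

        if (n - i - 1 < cFollow)
            return false; // sequence is cut off at the end of the input
        assert(i + cFollow < n);

        for (size_t k = 1; k <= cFollow; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < cpMin)
            return false; // overlong encoding
        if (cp > 0x10FFFF)
            return false; // beyond the Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false; // UTF-16 surrogate encoded as UTF-8

        i += cFollow + 1;
    }
    return true;
}
```

A caller at the API boundary would run a check of this kind on each externally supplied string and refuse the request on failure, so that the conversions touched by this changeset only ever see valid input.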

Location:
trunk
Files:
2 edited

  • trunk/include/VBox/com/string.h

    r35128 r36428  
    5858/**
    5959 *  String class used universally in Main for COM-style Utf-16 strings.
    60  *  Unfortunately COM on Windows uses UTF-16 everywhere, requiring conversions
    61  *  back and forth since most of VirtualBox and our libraries use UTF-8.
    62  *
    63  *  To make things more obscure, on Windows, a COM-style BSTR is not just a
    64  *  pointer to a null-terminated wide character array, but the four bytes
    65  *  (32 bits) BEFORE the memory that the pointer points to are a length
    66  *  DWORD. One must therefore avoid pointer arithmetic and always use
    67  *  SysAllocString and the like to deal with BSTR pointers, which manage
    68  *  that DWORD correctly.
    69  *
    70  *  For platforms other than Windows, we provide our own versions of the
    71  *  Sys* functions in Main/xpcom/helpers.cpp which do NOT use length
    72  *  prefixes though to be compatible with how XPCOM allocates string
    73  *  parameters to public functions.
    74  *
    75  *  The Bstr class hides all this handling behind a std::string-like interface
    76  *  and also provides automatic conversions to MiniString and Utf8Str instances.
    77  *
    78  *  The one advantage of using the SysString* routines is that this makes it
    79  *  possible to use it as a type of member variables of COM/XPCOM components
    80  *  and pass their values to callers through component methods' output parameters
    81  *  using the #cloneTo() operation. Also, the class can adopt (take ownership of)
    82  *  string buffers returned in output parameters of COM methods using the
    83  *  #asOutParam() operation and correctly free them afterwards.
    84  *
    85  *  Starting with VirtualBox 3.2, like Utf8Str, Bstr no longer differentiates
    86  *  between NULL strings and empty strings. In other words, Bstr("") and
    87  *  Bstr(NULL) behave the same. In both cases, Bstr allocates no memory,
    88  *  reports a zero length and zero allocated bytes for both, and returns an
    89  *  empty C wide string from raw().
     60 *
     61 * Unfortunately COM on Windows uses UTF-16 everywhere, requiring conversions
     62 * back and forth since most of VirtualBox and our libraries use UTF-8.
     63 *
     64 * To make things more obscure, on Windows, a COM-style BSTR is not just a
     65 * pointer to a null-terminated wide character array, but the four bytes (32
     66 * bits) BEFORE the memory that the pointer points to are a length DWORD. One
     67 * must therefore avoid pointer arithmetic and always use SysAllocString and
     68 * the like to deal with BSTR pointers, which manage that DWORD correctly.
     69 *
     70 * For platforms other than Windows, we provide our own versions of the Sys*
     71 * functions in Main/xpcom/helpers.cpp which do NOT use length prefixes though
     72 * to be compatible with how XPCOM allocates string parameters to public
     73 * functions.
     74 *
     75 * The Bstr class hides all this handling behind a std::string-like interface
     76 * and also provides automatic conversions to MiniString and Utf8Str instances.
     77 *
     78 * The one advantage of using the SysString* routines is that this makes it
     79 * possible to use it as a type of member variables of COM/XPCOM components and
     80 * pass their values to callers through component methods' output parameters
     81 * using the #cloneTo() operation.  Also, the class can adopt (take ownership
     82 * of) string buffers returned in output parameters of COM methods using the
     83 * #asOutParam() operation and correctly free them afterwards.
     84 *
     85 * Starting with VirtualBox 3.2, like Utf8Str, Bstr no longer differentiates
     86 * between NULL strings and empty strings. In other words, Bstr("") and
     87 * Bstr(NULL) behave the same. In both cases, Bstr allocates no memory,
     88 * reports a zero length and zero allocated bytes for both, and returns an
     89 * empty C wide string from raw().
     90 *
     91 * @note    All Bstr methods ASSUMES valid UTF-16 or UTF-8 input strings.
     92 *          The VirtualBox policy in this regard is to validate strings coming
     93 *          from external sources before passing them to Bstr or Utf8Str.
    9094 */
    9195class Bstr
     
    301305     *  If the member string is empty, this allocates an empty BSTR in *pstr
    302306     *  (i.e. makes it point to a new buffer with a null byte).
    303      */
    304     void detachTo(BSTR *pstr)
     307     *
     308     * @param   pbstrDst        The BSTR variable to detach the string to.
     309     *
     310     * @throws  std::bad_alloc if we failed to allocate a new empty string.
     311     */
     312    void detachTo(BSTR *pbstrDst)
    305313    {
    306314        if (m_bstr)
    307             *pstr = m_bstr;
     315            *pbstrDst = m_bstr;
    308316        else
    309317        {
    310318            // allocate null BSTR
    311             *pstr = ::SysAllocString((const OLECHAR *)g_bstrEmpty);
     319            *pbstrDst = ::SysAllocString((const OLECHAR *)g_bstrEmpty);
    312320#ifdef RT_EXCEPTIONS_ENABLED
    313             if (!*pstr)
     321            if (!*pbstrDst)
    314322                throw std::bad_alloc();
    315323#endif
     
    322330     *  Takes the ownership of the returned data.
    323331     */
    324     BSTR* asOutParam()
     332    BSTR *asOutParam()
    325333    {
    326334        cleanup();
     
    352360     *
    353361     * If the source is empty, this sets the member string to NULL.
    354      * @param rs
    355      */
    356     void copyFrom(const OLECHAR *rs)
    357     {
    358         if (rs && *rs)
     362     *
     363     * @param   a_bstrSrc           The source string.  The caller guarantees
     364     *                              that this is valid UTF-16.
     365     *
     366     * @throws  std::bad_alloc - the object is representing an empty string.
     367     */
     368    void copyFrom(const OLECHAR *a_bstrSrc)
     369    {
     370        if (a_bstrSrc && *a_bstrSrc)
    359371        {
    360             m_bstr = ::SysAllocString(rs);
     372            m_bstr = ::SysAllocString(a_bstrSrc);
    361373#ifdef RT_EXCEPTIONS_ENABLED
    362374            if (!m_bstr)
     
    375387     *
    376388     * If the source is empty, this sets the member string to NULL.
    377      * @param rs
    378      */
    379     void copyFrom(const char *rs)
    380     {
    381         if (rs && *rs)
    382         {
    383             PRTUTF16 s = NULL;
    384             ::RTStrToUtf16(rs, &s);
    385 #ifdef RT_EXCEPTIONS_ENABLED
    386             if (!s)
    387                 throw std::bad_alloc();
    388 #endif
    389             copyFrom((const OLECHAR *)s);            // allocates BSTR from zero-terminated input string
    390             ::RTUtf16Free(s);
    391         }
    392         else
    393             m_bstr = NULL;
     389     *
     390     * @param   a_pszSrc            The source string.  The caller guarantees
     391     *                              that this is valid UTF-8.
     392     *
     393     * @throws  std::bad_alloc - the object is representing an empty string.
     394     */
     395    void copyFrom(const char *a_pszSrc)
     396    {
     397        copyFromN(a_pszSrc, RTSTR_MAX);
    394398    }
    395399
     
    397401     * Variant of copyFrom for sub-string constructors.
    398402     *
    399      * @param   a_pszSrc            The source string.
     403     * @param   a_pszSrc            The source string.  The caller guarantees
     404     *                              that this is valid UTF-8.
    400405     * @param   a_cchMax            The maximum number of chars (not
    401406     *                              codepoints) to copy.  If you pass RTSTR_MAX
    402407     *                              it'll be exactly like copyFrom().
    403      * @throws  std::bad_alloc
     408     *
     409     * @throws  std::bad_alloc - the object is representing an empty string.
    404410     */
    405411    void copyFromN(const char *a_pszSrc, size_t a_cchSrc);
     
    417423
    418424
    419 ////////////////////////////////////////////////////////////////////////////////
     425
    420426
    421427/**
    422  *  String class used universally in Main for UTF-8 strings.
    423  *
    424  *  This is based on iprt::MiniString, to which some functionality has been
    425  *  moved. Here we keep things that are specific to Main, such as conversions
    426  *  with UTF-16 strings (Bstr).
    427  *
    428  *  Like iprt::MiniString, Utf8Str does not differentiate between NULL strings
    429  *  and empty strings. In other words, Utf8Str("") and Utf8Str(NULL)
    430  *  behave the same. In both cases, MiniString allocates no memory, reports
    431  *  a zero length and zero allocated bytes for both, and returns an empty
    432  *  C string from c_str().
     428 * String class used universally in Main for UTF-8 strings.
     429 *
     430 * This is based on iprt::MiniString, to which some functionality has been
     431 * moved.  Here we keep things that are specific to Main, such as conversions
     432 * with UTF-16 strings (Bstr).
     433 *
     434 * Like iprt::MiniString, Utf8Str does not differentiate between NULL strings
     435 * and empty strings.  In other words, Utf8Str("") and Utf8Str(NULL) behave the
     436 * same.  In both cases, MiniString allocates no memory, reports
     437 * a zero length and zero allocated bytes for both, and returns an empty
     438 * C string from c_str().
     439 *
     440 * @note    All Utf8Str methods ASSUMES valid UTF-8 or UTF-16 input strings.
     441 *          The VirtualBox policy in this regard is to validate strings coming
     442 *          from external sources before passing them to Utf8Str or Bstr.
    433443 */
    434444class Utf8Str : public iprt::MiniString
     
    558568protected:
    559569
    560     void copyFrom(CBSTR s);
     570    void copyFrom(CBSTR a_pbstr);
    561571
    562572    friend class Bstr; /* to access our raw_copy() */
  • trunk/src/VBox/Main/glue/string.cpp

    r35128 r36428  
    5353    size_t cwc;
    5454    int vrc = ::RTStrCalcUtf16LenEx(a_pszSrc, a_cchMax, &cwc);
    55     AssertRCReturnVoid(vrc); /* throw instead? */
     55    if (RT_FAILURE(vrc))
     56    {
     57        /* ASSUME: input is valid Utf-8. Fake out of memory error. */
     58        AssertLogRelMsgFailed(("%Rrc %.*Rhxs\n", vrc, RTStrNLen(a_pszSrc, a_cchMax), a_pszSrc));
     59        throw std::bad_alloc();
     60    }
    5661
    5762    m_bstr = ::SysAllocStringByteLen(NULL, cwc * sizeof(OLECHAR));
    58     if (m_bstr)
     63    if (RT_UNLIKELY(!m_bstr))
     64        throw std::bad_alloc();
     65
     66    PRTUTF16 pwsz = (PRTUTF16)m_bstr;
     67    vrc = ::RTStrToUtf16Ex(a_pszSrc, a_cchMax, &pwsz, cwc + 1, NULL);
     68    if (RT_FAILURE(vrc))
    5969    {
    60         PRTUTF16 pwsz = (PRTUTF16)m_bstr;
    61         vrc = ::RTStrToUtf16Ex(a_pszSrc, a_cchMax, &pwsz, cwc + 1, NULL);
    62         if (RT_FAILURE(vrc))
    63         {
    64             /* This should not happen! */
    65             AssertRC(vrc);
    66             cleanup();
    67         }
     70        /* This should not happen! */
     71        AssertRC(vrc);
     72        cleanup();
     73        throw std::bad_alloc();
    6874    }
    69     else
    70         throw std::bad_alloc();
    7175}
    7276
     
    8084    size_t cb = length() + 1;
    8185    *pstr = (char*)nsMemory::Alloc(cb);
    82     if (!*pstr)
     86    if (RT_UNLIKELY(!*pstr))
    8387        throw std::bad_alloc();
    8488    memcpy(*pstr, c_str(), cb);
     
    137141 * copying from a UTF-16 string.
    138142 *
    139  * As with the iprt::ministring::copyFrom() variants, this unconditionally
    140  * sets the members to a copy of the given other strings and makes
    141  * no assumptions about previous contents. This can therefore be used
    142  * both in copy constructors, when member variables have no defined
    143  * value, and in assignments after having called cleanup().
     143 * As with the iprt::ministring::copyFrom() variants, this unconditionally sets
     144 * the members to a copy of the given other strings and makes no assumptions
     145 * about previous contents.  This can therefore be used both in copy
     146 * constructors, when member variables have no defined value, and in
     147 * assignments after having called cleanup().
    144148 *
    145149 * This variant converts from a UTF-16 string, most probably from
    146150 * a Bstr assignment.
    147151 *
    148  * @param s
     152 * @param   a_pbstr         The source string.  The caller guarantees that this
     153 *                          is valid UTF-16.
     154 *
     155 * @sa      iprt::MiniString::copyFromN
    149156 */
    150 void Utf8Str::copyFrom(CBSTR s)
     157void Utf8Str::copyFrom(CBSTR a_pbstr)
    151158{
    152     if (s && *s)
     159    if (a_pbstr && *a_pbstr)
    153160    {
    154         int vrc = RTUtf16ToUtf8Ex((PRTUTF16)s,      // PCRTUTF16 pwszString
     161        int vrc = RTUtf16ToUtf8Ex((PCRTUTF16)a_pbstr,
    155162                                  RTSTR_MAX,        // size_t cwcString: translate entire string
    156163                                  &m_psz,           // char **ppsz: output buffer
    157164                                  0,                // size_t cch: if 0, func allocates buffer in *ppsz
    158165                                  &m_cch);          // size_t *pcch: receives the size of the output string, excluding the terminator.
    159         if (RT_FAILURE(vrc))
     166        if (RT_SUCCESS(vrc))
     167            m_cbAllocated = m_cch + 1;
     168        else
    160169        {
    161             if (    vrc == VERR_NO_STR_MEMORY
    162                  || vrc == VERR_NO_MEMORY
    163                )
    164                 throw std::bad_alloc();
     170            if (   vrc != VERR_NO_STR_MEMORY
     171                && vrc != VERR_NO_MEMORY)
     172            {
     173                /* ASSUME: input is valid Utf-16. Fake out of memory error. */
     174                AssertLogRelMsgFailed(("%Rrc %.*Rhxs\n", vrc, RTUtf16Len(a_pbstr) * sizeof(RTUTF16), a_pbstr));
     175            }
    165176
    166             // @todo what do we do with bad input strings? throw also? for now just keep an empty string
    167177            m_cch = 0;
    168178            m_cbAllocated = 0;
    169179            m_psz = NULL;
     180
     181            throw std::bad_alloc();
    170182        }
    171         else
    172             m_cbAllocated = m_cch + 1;
    173183    }
    174184    else
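The reworked Utf8Str::copyFrom(CBSTR) shown in this diff treats an invalid UTF-16 source as a caller bug: it logs the failure, leaves the object empty, and then reports the problem as a faked out-of-memory error. The following standalone sketch imitates that pattern with standard library types only. The function name utf16ToUtf8OrThrow is hypothetical, std::cerr stands in for the release log, and the conversion loop is a simplified substitute for RTUtf16ToUtf8Ex, not a copy of it.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <new>
#include <string>

// Hypothetical stand-in for the conversion path (NOT VirtualBox code).
// Invalid UTF-16 is logged and then reported as std::bad_alloc, mirroring
// the "fake out of memory error" approach taken in the changeset.
std::string utf16ToUtf8OrThrow(const std::u16string &src)
{
    std::string out;
    for (size_t i = 0; i < src.size(); ++i)
    {
        uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= src.size() || src[i + 1] < 0xDC00 || src[i + 1] > 0xDFFF)
            {
                std::cerr << "invalid UTF-16: unpaired high surrogate\n";
                throw std::bad_alloc();
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            std::cerr << "invalid UTF-16: unpaired low surrogate\n";
            throw std::bad_alloc();
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80)
            out += static_cast<char>(cp);
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Because the result is built in a local string and only returned on success, a failed conversion leaves nothing half-written behind, which plays the same role as the explicit reset of m_psz, m_cch and m_cbAllocated before the throw in the diff.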