VirtualBox

Changeset 36428 in vbox for trunk/include/VBox


Timestamp:
Mar 25, 2011 12:46:45 PM (14 years ago)
Author:
vboxsync
svn:sync-xref-src-repo-rev:
70790
Message:

com/string.h: AssertLogRel when encountering an invalid encoding in the copyFrom*() methods doing UTF-16/8 conversions. The ASSUMPTION is that all input strings are correctly encoded and that this is enforced by VirtualBox border code before things get down to Utf8Str or Bstr.
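The policy in the message above (validate at the border, assume validity inside) can be illustrated with a small standalone check. This is only a sketch: `isValidUtf8` is a hypothetical helper written for this illustration, not a function from VirtualBox or IPRT, and the changeset does not show what the real border code calls.

```cpp
#include <cstddef>
#include <string>

// Hypothetical border check (not VirtualBox code): returns true if the bytes
// form well-formed UTF-8, rejecting truncated sequences, overlong forms,
// surrogate code points and anything above U+10FFFF.
bool isValidUtf8(const std::string &s)
{
    const unsigned char *p = reinterpret_cast<const unsigned char *>(s.data());
    const size_t n = s.size();
    // Smallest code point that legitimately needs 0, 1, 2 or 3 follow bytes.
    static const unsigned long s_aMin[4] = { 0x0, 0x80, 0x800, 0x10000 };

    size_t i = 0;
    while (i < n)
    {
        const unsigned char b = p[i];
        size_t cFollow;      // number of continuation bytes expected
        unsigned long cp;    // code point being assembled
        if (b < 0x80)                { cFollow = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { cFollow = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { cFollow = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { cFollow = 3; cp = b & 0x07; }
        else
            return false;    // stray continuation byte or invalid lead byte

        if (cFollow > n - i - 1)
            return false;    // sequence runs past the end of the input
        for (size_t k = 1; k <= cFollow; ++k)
        {
            const unsigned char c = p[i + k];
            if ((c & 0xC0) != 0x80)
                return false;    // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < s_aMin[cFollow])
            return false;    // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;    // UTF-16 surrogate encoded as UTF-8
        if (cp > 0x10FFFF)
            return false;    // beyond the Unicode range
        i += cFollow + 1;
    }
    return true;
}
```

Border code following this policy would reject the input when the check fails and only then construct a Utf8Str or Bstr from it, so the assertion added by this changeset should never fire in practice.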

File:
1 edited

  • trunk/include/VBox/com/string.h

    r35128 r36428  
@@ -58,34 +58,38 @@
 /**
  *  String class used universally in Main for COM-style Utf-16 strings.
- *  Unfortunately COM on Windows uses UTF-16 everywhere, requiring conversions
- *  back and forth since most of VirtualBox and our libraries use UTF-8.
- *
- *  To make things more obscure, on Windows, a COM-style BSTR is not just a
- *  pointer to a null-terminated wide character array, but the four bytes
- *  (32 bits) BEFORE the memory that the pointer points to are a length
- *  DWORD. One must therefore avoid pointer arithmetic and always use
- *  SysAllocString and the like to deal with BSTR pointers, which manage
- *  that DWORD correctly.
- *
- *  For platforms other than Windows, we provide our own versions of the
- *  Sys* functions in Main/xpcom/helpers.cpp which do NOT use length
- *  prefixes though to be compatible with how XPCOM allocates string
- *  parameters to public functions.
- *
- *  The Bstr class hides all this handling behind a std::string-like interface
- *  and also provides automatic conversions to MiniString and Utf8Str instances.
- *
- *  The one advantage of using the SysString* routines is that this makes it
- *  possible to use it as a type of member variables of COM/XPCOM components
- *  and pass their values to callers through component methods' output parameters
- *  using the #cloneTo() operation. Also, the class can adopt (take ownership of)
- *  string buffers returned in output parameters of COM methods using the
- *  #asOutParam() operation and correctly free them afterwards.
- *
- *  Starting with VirtualBox 3.2, like Utf8Str, Bstr no longer differentiates
- *  between NULL strings and empty strings. In other words, Bstr("") and
- *  Bstr(NULL) behave the same. In both cases, Bstr allocates no memory,
- *  reports a zero length and zero allocated bytes for both, and returns an
- *  empty C wide string from raw().
+ *
+ * Unfortunately COM on Windows uses UTF-16 everywhere, requiring conversions
+ * back and forth since most of VirtualBox and our libraries use UTF-8.
+ *
+ * To make things more obscure, on Windows, a COM-style BSTR is not just a
+ * pointer to a null-terminated wide character array, but the four bytes (32
+ * bits) BEFORE the memory that the pointer points to are a length DWORD. One
+ * must therefore avoid pointer arithmetic and always use SysAllocString and
+ * the like to deal with BSTR pointers, which manage that DWORD correctly.
+ *
+ * For platforms other than Windows, we provide our own versions of the Sys*
+ * functions in Main/xpcom/helpers.cpp which do NOT use length prefixes though
+ * to be compatible with how XPCOM allocates string parameters to public
+ * functions.
+ *
+ * The Bstr class hides all this handling behind a std::string-like interface
+ * and also provides automatic conversions to MiniString and Utf8Str instances.
+ *
+ * The one advantage of using the SysString* routines is that this makes it
+ * possible to use it as a type of member variables of COM/XPCOM components and
+ * pass their values to callers through component methods' output parameters
+ * using the #cloneTo() operation.  Also, the class can adopt (take ownership
+ * of) string buffers returned in output parameters of COM methods using the
+ * #asOutParam() operation and correctly free them afterwards.
+ *
+ * Starting with VirtualBox 3.2, like Utf8Str, Bstr no longer differentiates
+ * between NULL strings and empty strings. In other words, Bstr("") and
+ * Bstr(NULL) behave the same. In both cases, Bstr allocates no memory,
+ * reports a zero length and zero allocated bytes for both, and returns an
+ * empty C wide string from raw().
+ *
+ * @note    All Bstr methods ASSUMES valid UTF-16 or UTF-8 input strings.
+ *          The VirtualBox policy in this regard is to validate strings coming
+ *          from external sources before passing them to Bstr or Utf8Str.
  */
 class Bstr
     
@@ -301,15 +305,19 @@
      *  If the member string is empty, this allocates an empty BSTR in *pstr
      *  (i.e. makes it point to a new buffer with a null byte).
-     */
-    void detachTo(BSTR *pstr)
+     *
+     * @param   pbstrDst        The BSTR variable to detach the string to.
+     *
+     * @throws  std::bad_alloc if we failed to allocate a new empty string.
+     */
+    void detachTo(BSTR *pbstrDst)
     {
         if (m_bstr)
-            *pstr = m_bstr;
+            *pbstrDst = m_bstr;
         else
         {
             // allocate null BSTR
-            *pstr = ::SysAllocString((const OLECHAR *)g_bstrEmpty);
+            *pbstrDst = ::SysAllocString((const OLECHAR *)g_bstrEmpty);
 #ifdef RT_EXCEPTIONS_ENABLED
-            if (!*pstr)
+            if (!*pbstrDst)
                 throw std::bad_alloc();
 #endif
     
@@ -322,5 +330,5 @@
      *  Takes the ownership of the returned data.
      */
-    BSTR* asOutParam()
+    BSTR *asOutParam()
     {
         cleanup();
     
@@ -352,11 +360,15 @@
      *
      * If the source is empty, this sets the member string to NULL.
-     * @param rs
-     */
-    void copyFrom(const OLECHAR *rs)
-    {
-        if (rs && *rs)
+     *
+     * @param   a_bstrSrc           The source string.  The caller guarantees
+     *                              that this is valid UTF-16.
+     *
+     * @throws  std::bad_alloc - the object is representing an empty string.
+     */
+    void copyFrom(const OLECHAR *a_bstrSrc)
+    {
+        if (a_bstrSrc && *a_bstrSrc)
         {
-            m_bstr = ::SysAllocString(rs);
+            m_bstr = ::SysAllocString(a_bstrSrc);
 #ifdef RT_EXCEPTIONS_ENABLED
             if (!m_bstr)
     
@@ -375,21 +387,13 @@
      *
      * If the source is empty, this sets the member string to NULL.
-     * @param rs
-     */
-    void copyFrom(const char *rs)
-    {
-        if (rs && *rs)
-        {
-            PRTUTF16 s = NULL;
-            ::RTStrToUtf16(rs, &s);
-#ifdef RT_EXCEPTIONS_ENABLED
-            if (!s)
-                throw std::bad_alloc();
-#endif
-            copyFrom((const OLECHAR *)s);            // allocates BSTR from zero-terminated input string
-            ::RTUtf16Free(s);
-        }
-        else
-            m_bstr = NULL;
+     *
+     * @param   a_pszSrc            The source string.  The caller guarantees
+     *                              that this is valid UTF-8.
+     *
+     * @throws  std::bad_alloc - the object is representing an empty string.
+     */
+    void copyFrom(const char *a_pszSrc)
+    {
+        copyFromN(a_pszSrc, RTSTR_MAX);
     }
 
     
@@ -397,9 +401,11 @@
      * Variant of copyFrom for sub-string constructors.
      *
-     * @param   a_pszSrc            The source string.
+     * @param   a_pszSrc            The source string.  The caller guarantees
+     *                              that this is valid UTF-8.
      * @param   a_cchMax            The maximum number of chars (not
      *                              codepoints) to copy.  If you pass RTSTR_MAX
      *                              it'll be exactly like copyFrom().
-     * @throws  std::bad_alloc
+     *
+     * @throws  std::bad_alloc - the object is representing an empty string.
      */
     void copyFromN(const char *a_pszSrc, size_t a_cchSrc);
     
@@ -417,18 +423,22 @@
 
 
-////////////////////////////////////////////////////////////////////////////////
+
 
 /**
- *  String class used universally in Main for UTF-8 strings.
- *
- *  This is based on iprt::MiniString, to which some functionality has been
- *  moved. Here we keep things that are specific to Main, such as conversions
- *  with UTF-16 strings (Bstr).
- *
- *  Like iprt::MiniString, Utf8Str does not differentiate between NULL strings
- *  and empty strings. In other words, Utf8Str("") and Utf8Str(NULL)
- *  behave the same. In both cases, MiniString allocates no memory, reports
- *  a zero length and zero allocated bytes for both, and returns an empty
- *  C string from c_str().
+ * String class used universally in Main for UTF-8 strings.
+ *
+ * This is based on iprt::MiniString, to which some functionality has been
+ * moved.  Here we keep things that are specific to Main, such as conversions
+ * with UTF-16 strings (Bstr).
+ *
+ * Like iprt::MiniString, Utf8Str does not differentiate between NULL strings
+ * and empty strings.  In other words, Utf8Str("") and Utf8Str(NULL) behave the
+ * same.  In both cases, MiniString allocates no memory, reports
+ * a zero length and zero allocated bytes for both, and returns an empty
+ * C string from c_str().
+ *
+ * @note    All Utf8Str methods ASSUMES valid UTF-8 or UTF-16 input strings.
+ *          The VirtualBox policy in this regard is to validate strings coming
+ *          from external sources before passing them to Utf8Str or Bstr.
  */
 class Utf8Str : public iprt::MiniString
     
@@ -558,5 +568,5 @@
 protected:
 
-    void copyFrom(CBSTR s);
+    void copyFrom(CBSTR a_pbstr);
 
     friend class Bstr; /* to access our raw_copy() */
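
The two behaviours this changeset documents for Bstr (NULL and empty input are indistinguishable, and copyFrom(const char *) is simply copyFromN() with no length limit) can be sketched in a toy class. Everything below is an assumption made for illustration: `SketchBstr`, `kNoLimit` and `invalidInput` are hypothetical names, the storage is a `std::u16string` rather than a real BSTR, a plain `assert` stands in for AssertLogRel, and the actual copyFromN() body is not part of this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for RTSTR_MAX ("no length limit"); the name is hypothetical.
static const size_t kNoLimit = static_cast<size_t>(-1);

// Toy UTF-16 string holder, NOT the real Bstr: it keeps a std::u16string
// instead of a BSTR so that the sketch builds without COM or XPCOM.
class SketchBstr
{
public:
    SketchBstr() {}
    explicit SketchBstr(const char *pszSrc) { copyFrom(pszSrc); }

    // Whole-string copy is the bounded copy with no limit.
    void copyFrom(const char *pszSrc) { copyFromN(pszSrc, kNoLimit); }

    // Converts at most cchMax bytes (not code points) of UTF-8 to UTF-16.
    // The caller is assumed to pass valid UTF-8; overlong forms are not
    // checked here because validation belongs to the border code.
    void copyFromN(const char *pszSrc, size_t cchMax)
    {
        m_str.clear();
        if (!pszSrc)
            return;                         // NULL behaves exactly like ""
        size_t cch = 0;
        while (cch < cchMax && pszSrc[cch] != '\0')
            ++cch;

        size_t i = 0;
        while (i < cch)
        {
            const unsigned char b = static_cast<unsigned char>(pszSrc[i]);
            size_t cFollow;                 // continuation bytes expected
            char32_t cp;                    // code point being assembled
            if (b < 0x80)                { cFollow = 0; cp = b; }
            else if ((b & 0xE0) == 0xC0) { cFollow = 1; cp = b & 0x1F; }
            else if ((b & 0xF0) == 0xE0) { cFollow = 2; cp = b & 0x0F; }
            else if ((b & 0xF8) == 0xF0) { cFollow = 3; cp = b & 0x07; }
            else                         { return invalidInput(); }
            if (cFollow > cch - i - 1)
                return invalidInput();      // sequence cut off by the limit
            for (size_t k = 1; k <= cFollow; ++k)
            {
                const unsigned char c = static_cast<unsigned char>(pszSrc[i + k]);
                if ((c & 0xC0) != 0x80)
                    return invalidInput();
                cp = (cp << 6) | (c & 0x3F);
            }
            if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
                return invalidInput();
            if (cp < 0x10000)
                m_str.push_back(static_cast<char16_t>(cp));
            else
            {   // encode as a surrogate pair
                cp -= 0x10000;
                m_str.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                m_str.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
            i += cFollow + 1;
        }
    }

    size_t length() const { return m_str.size(); }
    bool isEmpty() const { return m_str.empty(); }
    const char16_t *raw() const { return m_str.c_str(); }  // never NULL

private:
    // Complain loudly in debug builds, then fall back to the empty string.
    void invalidInput()
    {
        assert(false && "invalid UTF-8 passed to SketchBstr");
        m_str.clear();
    }

    std::u16string m_str;
};
```

With this sketch, `SketchBstr(NULL)` and `SketchBstr("")` both report a zero length and return an empty wide string from `raw()`, and `copyFromN("hello", 3)` keeps only the first three characters.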