Knowing if the variable is wchar_t* or a char*.

Question

jaepi 22 Practically a Master Poster

17 Years Ago

Hello there, I would just like to ask what to use if you want to know whether a variable is of type wchar_t* or char*, so that the result can be used in a flow control statement.

Would this work?

void Check(void* pVarToCheck){
    if(sizeof(pVarToCheck) == sizeof(wchar_t)){
        cout << "Variable is wchar_t*" << endl;
    }else if(sizeof(pVarToCheck) == sizeof(char)){
        cout << "Variable is char*" << endl;
    }else cout << "Mofo!" << endl;
}

Thanks!

c++

Ancient Dragon 5,243 Achieved Level 70

17 Years Ago

Most win32 api functions have two versions -- one for char* and another for wchar_t*. If you are looking at a function that takes TCHAR*, you can safely assume it means char* when the program is not compiled for UNICODE.

Ancient Dragon 5,243 Achieved Level 70

17 Years Ago

In that case assume every TCHAR* is a wchar_t*. You have to call special conversion functions to convert wchar_t* to char*. Which win32 api functions do you want to change?

Ancient Dragon 5,243 Achieved Level 70

17 Years Ago

>>return *( static_cast< const char* >(pointer_to_non_null_cstr) + 1 ) == 0 ;
As a general rule that will not work because characters in many languages take up two or more bytes. In *nix the wchar_t is a long integer (4 bytes) while in MS-Windows it is a short integer (2 bytes). And the last time I read the UNICODE standards board there was some talk about expanding it to 128 bits so that it can hold the graphics used for Chinese characters.

This site has some good information you may want to browse.

Salem 5,265 Posting Sage · Answer 1 · 2007-10-02T14:21:23+00:00

void* specifically washes away any concept of type.

Check("string");
Check(&myDouble);
All mean the same thing.
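
Since the type is gone once a pointer has been converted to void*, the distinction has to be made before that conversion, at compile time. A minimal sketch, assuming the call sites still know the real pointer type; the name describe is hypothetical and not from this thread:

```cpp
#include <string>

// Hypothetical helper: overload resolution picks the matching version at
// compile time, so the pointer is never inspected at run time.
inline std::string describe(const char*)    { return "char*"; }
inline std::string describe(const wchar_t*) { return "wchar_t*"; }
```

Calling describe("string") selects the first overload and describe(L"string") the second; passing a void* does not compile without a cast, which is the point.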

Are you trying to do something with the UNICODE in the win32 API ?

jaepi 22 Practically a Master Poster · Answer 2 · 2007-10-02T15:38:54+00:00

I'm actually converting win32 api to standard library. Win32 api's are using wchar_t* which is hard for me since most of the standard library are using const char* or char*.

jaepi 22 Practically a Master Poster · Answer 3 · 2007-10-02T18:30:56+00:00

It's compiled in UNICODE, I've been having a hard time porting these win32 apis to standard library. *sigh*. I'm actually making my own interface so that it would be easy for me to debug and modify.

vijayan121 1,152 Posting Virtuoso · Answer 4 · 2007-10-02T21:45:32+00:00

> Win32 api's are using wchar_t* which is hard for me since most of the standard library are using const char* or char*.
> I've been having a hard time porting these win32 apis to standard library. *sigh*.

most (not all) of the c++ part of the standard library has good support for multiple character types. for wchar_t characters, you could use wcout (instead of cout), wifstream (instead of ifstream), wstring (instead of string) etc. you would also have to handle C literal escape sequences correctly eg.

#include <fstream>

void put_newline( std::wofstream& stm )
{
   stm << stm.widen( '\n' ) ; // widen can be used on *all* streams
}
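
The same idea carries over to in-memory wide streams. A small sketch, assuming only standard headers; make_line is a hypothetical name used for illustration:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds one line of text using wide types throughout.
inline std::wstring make_line(const std::wstring& text)
{
    std::wostringstream out;
    // widen() converts the narrow '\n' to the stream's own character type.
    out << text << out.widen('\n');
    return out.str();
}
```

The result is a std::wstring, so it can be written to wcout or a wofstream without any narrowing step.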

the problem remains in the places where you use the posix api

here is a desperate hack; really desperate, do not use unless you have no other option:

#include <iostream>
#include <cassert>

// *** do not use except as a desperate hack
// ***
// *** invariants:
// *** pointer_to_non_null_cstr *must* point to a c-style string
// *** of char having a length of *at least two* characters
// *** or of wchar_t having a length of *at least one* character
// *** char type in use must conform to ISO/IEC 646 encoding (eg. ASCII)
// *** results are *undefined* if any of these are not met
// ***
// *** written for endianness on intel (i386) platform
inline bool is_it_wchar_t( const void* pointer_to_non_null_cstr )
{
  union
  {
    char ch[ sizeof(wchar_t) ] ;
    wchar_t wch ;
  } ;
  wch = L'a' ;
  assert( ch[1] == 0 ) ; // endianness
  assert( pointer_to_non_null_cstr != 0 ) ;
  
  return *( static_cast< const char* >(pointer_to_non_null_cstr) + 1 ) == 0 ;
}

int main()
{
  std::cout << std::boolalpha
            << is_it_wchar_t( "hello world" ) << '\n' // false
            << is_it_wchar_t( L"hello world" ) << '\n' // true
            << is_it_wchar_t( "" ) << '\n' // *** undefined
            << is_it_wchar_t( "1" ) << '\n' // *** undefined
            << is_it_wchar_t( "12" ) << '\n' // false
            << is_it_wchar_t( L"" ) << '\n' // *** undefined
            << is_it_wchar_t( L"1" ) << '\n' // true
            << is_it_wchar_t( &std::cout ) << '\n' ; // *** undefined
}

vijayan121 1,152 Posting Virtuoso · Answer 5 · 2007-10-03T09:48:15+00:00

> As a general rule that will not work because many languages take up two or more bytes.
yes, it is not a general rule and will not work for any language taking up more than one byte per character. it will only work for single byte encodings (conforming to 'Basic Latin'). in this case, the code points for the equivalent unicode characters are in the range 0x0000 to 0x007f
for example, in ASCII and similar char sets, the code point for a lower case 'c' is 0x63; it would have the same value in a unicode encoding. http://unicode.org/charts/PDF/U0000.pdf
intel's x86 processors use the little-endian format (increasing numeric significance with increasing memory addresses) for data. so, 0x0063 as a two byte value would be stored in memory from lower to higher address as:
-------------------
| 0x63 | 0x00 |
-------------------
and 0x00000063 as a four byte value would be stored in memory from lower to higher address as
-------------------------------------
| 0x63 | 0x00 | 0x00 | 0x00 |
-------------------------------------
for these char sets, for a character string containing at least two characters, the byte at the address immediately higher than the start address would be non-zero for char, zero for wchar_t.
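
The byte layout described above can be inspected directly. A minimal sketch, assuming nothing beyond the standard library; bytes_of is a hypothetical helper name:

```cpp
#include <cstring>
#include <vector>

// Hypothetical helper: copies the object representation of one wchar_t
// into a byte vector, lowest memory address first.
inline std::vector<unsigned char> bytes_of(wchar_t value)
{
    std::vector<unsigned char> bytes(sizeof(wchar_t));
    std::memcpy(bytes.data(), &value, sizeof(wchar_t));
    return bytes;
}
```

On a little-endian machine bytes_of(L'c') should begin with 0x63 followed by zero bytes; on a big-endian machine the 0x63 would come last, which is why the hack depends on endianness.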

jaepi 22 Practically a Master Poster · Answer 6 · 2007-10-03T14:58:44+00:00

@AncientDragon - GetFileAttributes, GetDiskFreeSpace, ReadFile, WriteFile, SetFilePointer, SetFilePointerEx...and so on...I only lack the first two functions. I'm still working on the next two functions (ReadFile and WriteFile).

@vijayan121 - I'm not that desperate yet. I do have some solutions (like converting wchar_t* to char* using the wcstombs before passing it to the standard lib functions) but I think it's not that effective.
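
A conversion of that kind might look roughly like the sketch below. It assumes the current C locale can represent the text; the helper name narrow is hypothetical:

```cpp
#include <cstdlib>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: converts a wide C string to a multibyte std::string
// using the current C locale. Returns an empty string if some character
// cannot be represented in that locale.
inline std::string narrow(const wchar_t* wide)
{
    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the null.
    std::vector<char> buffer(std::wcslen(wide) * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wide, buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return std::string();
    return std::string(buffer.data(), written);
}
```

The extra copy on every call is the cost of this approach, and characters outside the locale's encoding are lost entirely.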

@all - Thank you! :)

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 7 · 2007-10-03T18:56:23+00:00

I am still confused -- you are compiling your program for UNICODE but want to use char* instead of wchar_t* ? Then why compile for UNICODE ? Why not just compile the normal way ?

As I mentioned before, all those functions have two versions -- one for wchar_t* and the other for char*. The function names you normally use are just macros that expand to the name with either W or A appended, depending on whether UNICODE is defined. You can explicitly do the same thing in your code and actually use both versions of the function in the same program. Example:

winbase.h

#ifdef UNICODE
#define GetFileAttributes  GetFileAttributesW
#else
#define GetFileAttributes  GetFileAttributesA
#endif // !UNICODE

So you see you are just wasting your efforts attempting to rewrite the win32 api functions, unless you are replacing them with standard C/C++ library functions for portability.
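
The selection scheme in the winbase.h excerpt can be imitated with ordinary functions. This is only a portable sketch: path_kind_a, path_kind_w and the path_kind macro are hypothetical stand-ins, not Win32 names.

```cpp
#include <string>

// Hypothetical stand-ins for an "A" (char) and a "W" (wchar_t) function pair.
inline std::string  path_kind_a(const char*)    { return "narrow"; }
inline std::wstring path_kind_w(const wchar_t*) { return L"wide"; }

// Same selection scheme as the header excerpt: the unsuffixed name is a
// macro that resolves to one of the two real functions.
#ifdef UNICODE
#define path_kind path_kind_w
#else
#define path_kind path_kind_a
#endif
```

Either suffixed name can be called directly regardless of how UNICODE is set, which is what allows both versions to be used in one program.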

vijayan121 1,152 Posting Virtuoso · Answer 8 · 2007-10-03T23:14:46+00:00

> I'm not that desperate yet. I do have some solutions...
i'm quite relieved. i was not happy about it after having impulsively made that post; it is not something that should be used.

jaepi 22 Practically a Master Poster · Answer 9 · 2007-10-04T09:05:51+00:00

@AncientDragon - My purpose in rewriting some of the win32 apis is that I'm using it for version changes. So I don't have to scan the entire code for these apis or change their parameter types just to make them fit in the standard library. Yeah, I agree, it's really a waste of effort and time, but in my situation, I might have the advantage or I could even use it for future projects. Porting issues are really hard to deal with. Oh well. Hey, thanks for the enlightenment. If this interface does not work, I'll just try to replace them instead of burning my brain trying to figure out how to rewrite. :)

@vijayan121 - And someone here in the forums would just copy-paste and give it a try. Or maybe I could try it if I'm really out of options. :) . Thanks man.
