Is there a way to pre-specify the size of the underlying buffer of a string to which
repeated concatenations will be made in a loop, so as to eliminate time-wasting
re-allocations and block movements of characters from one buffer to another? I have
tried the string::reserve() member, but in my trials at least it appears useless; the
first assignment of a string to the object on which reserve() was called results in
string::capacity() being reset to the approximate length of the string assigned, and
whatever string::reserve() originally set is lost. Perhaps there is some kind of 'lock'
on the capacity, but I don't see anything like that. Here is an example program showing
my problem, along with its output, which shows that my initial reserve() of 256 bytes
is being completely ignored and lost...

#include <stdio.h>
#include <string>
using namespace std;

int main()
{
 string s1[]={"Zero ","One ","Two ","Three ","Four ","Five ","Six "};
 string s2;

 s2.reserve(256);   //Set buffer to 256???
 printf("s2.capacity() = %u\n\n\n",(unsigned)s2.capacity());
 printf("i\ts1.c_str()\n");
 printf("==================\n");
 for(unsigned i=0; i<sizeof(s1)/sizeof(s1[0]); i++)
     printf("%u\t%s\n",i,s1[i].c_str());
 printf("\n\n");
 printf("i\ts2.capacity()\ts2.size()\ts2.c_str()\n");
 printf("=========================================================================\n");
 for(unsigned i=0; i<sizeof(s1)/sizeof(s1[0]); i++)
 {
     s2=s2+s1[i];  //reserve doesn't maintain 256???
     printf("%u\t%u\t\t%u\t\t%s\n",i,(unsigned)s2.capacity(),(unsigned)s2.size(),s2.c_str());
 }
 getchar();

 return 0;
}
s2.capacity() = 256


i       s1.c_str()
==================
0       Zero
1       One
2       Two
3       Three
4       Four
5       Five
6       Six


i       s2.capacity()   s2.size()       s2.c_str()
=========================================================================
0       5               5               Zero
1       10              9               Zero One
2       20              13              Zero One Two
3       19              19              Zero One Two Three
4       38              24              Zero One Two Three Four
5       29              29              Zero One Two Three Four Five
6       58              33              Zero One Two Three Four Five Six

As can be seen above, the very first line of the program (after variable declarations)
does a string::reserve() on s2 for 256 bytes. Immediately after that an output
statement shows that the capacity is indeed 256. However, the string array s1 is
concatenated into s2 in the bottom for loop, and prior reserve settings appear to be
completely ignored/lost as the output clearly shows.

I have to admit I've never used the Standard C++ Library String Class (I have my own
which seems to be a bit more efficient), so I'm struggling here. Help would be greatly
appreciated.

Change s2=s2+s1[i]; to s2 += s1[i]; and you would get the behaviour you expect. (why?)

Thank You very, very much vijayan21! I've been struggling with this for some time, and that accomplished exactly what I needed!

I might add that that really surprised me. In terms of why, my only possible explanation is that there is a different implementation for operator+ and operator+=??? I wouldn't have thought that!

To be perfectly honest, I never use those compound operators. I just don't like the notation. From seeing this though, if my explanation is anywhere near correct, I'd better start using them if I'm going to use other folks' classes.

In the problem I'm working on (don't know if you are interested), pre-allocating the buffer like that and having it 'stick' decreased my tick count from 402985 to 287922 on this algorithm, which allocates and creates a 2MB buffer of '-' chars, replaces every 7th char with a 'P'; then replaces every 'P' with a 'PU'; then replaces every '-' with an '8'; then copies this string in 90 byte chunks to another buffer appending a CrLf to each 90 byte line, and finally outputs the last 4000 bytes to a MessageBox(). Whew! I know! Fairly heavy duty string manipulation....

#include "windows.h"
#include <stdio.h>
#include <string>
#define  NUMBER 2000000  
using namespace std;

string& ReplaceAll(string& context, const string& from, const string& to)
{
 size_t lookHere=0;
 size_t foundHere;

 while((foundHere=context.find(from,lookHere)) != string::npos)
 {
  context.replace(foundHere, from.size(), to);
  lookHere=foundHere+to.size();
 }

 return context;
}

int main(void)
{
 unsigned tick;
 int iCount=0;
 string s2;

 tick=GetTickCount();
 string s1(NUMBER,'-');
 s2.reserve(2200000);
 for(int i=0; i<NUMBER; i++)
 {
     iCount++;
     if(iCount%7==0)
        s1[iCount-1]='P';
 }
 ReplaceAll(s1,"P","PU");
 ReplaceAll(s1,"-","8");
 s2.erase(0);
 for(int i=0; i<NUMBER; i=i+90)
     s2+=s1.substr(i,90)+"\r\n";
 s1=s2.substr(s2.length()-4000,4000);
 tick=GetTickCount()-tick;
 printf("tick = %u\n",(unsigned)tick);
 printf("%u\n",(unsigned)s2.size());
 MessageBox(NULL,s1.c_str(),"Here Is Your String John!",MB_OK);
 printf("%u\n",(unsigned)s2.size());
 getchar();

 return 0;
}

This compiles to about 49 K for me. Using my own string class I'm coming in around 124000 ticks with a 28 K executable, so it appears I'm beating it every way. For that loop on the bottom if I eliminate usage of String classes entirely and just use C isms such as _tcsncpy(), _tcscat(), I increase speed by a factor of 100 or 1000 (I forget which, I'd have to check).
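A possible contributor to the std::string tick count (this is an assumption, nothing in the thread measures it) is that string::replace() with a longer replacement has to shift everything after the match, and ReplaceAll() does that once per 'P'. A minimal sketch of a single-pass alternative that builds the result in a second string follows; the name replace_all_copy is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical alternative to ReplaceAll(): copies src into a new string,
// substituting 'to' for every occurrence of 'from'. Each character is copied
// once instead of shifting the tail of the string on every replacement.
std::string replace_all_copy(const std::string& src,
                             const std::string& from,
                             const std::string& to)
{
    assert(!from.empty());              // an empty pattern would match everywhere

    std::string out;
    out.reserve(src.size());            // starting guess; out still grows if needed

    std::size_t pos = 0;
    std::size_t found;
    while ((found = src.find(from, pos)) != std::string::npos)
    {
        out.append(src, pos, found - pos);    // text before the match
        out += to;                            // the replacement
        pos = found + from.size();            // continue after the match
    }
    out.append(src, pos, std::string::npos);  // whatever follows the last match
    return out;
}
```

In the program above this would be used as s1 = replace_all_copy(s1, "P", "PU"); whether it actually helps on a given compiler would have to be timed.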

In terms of why, my only possible explanation is that there is a different implementation for operator+ and operator+=???

Consider that operator+= returns a reference to the current object while operator+ returns a new string object.
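To make that concrete, here is a minimal sketch of the append-in-place pattern. The function name append_all and its parameters are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins the pieces into one string, appending in place.
// total_hint is the caller's estimate of the final length.
std::string append_all(const std::vector<std::string>& pieces,
                       std::size_t total_hint)
{
    std::string out;
    out.reserve(total_hint);
    assert(out.capacity() >= total_hint);   // what reserve() promises

    for (std::size_t i = 0; i < pieces.size(); ++i)
    {
        // operator+= appends into the buffer out already owns.
        // Writing out = out + pieces[i] instead builds a temporary string
        // and assigns it to out; what capacity out has after that
        // assignment is left to the implementation.
        out += pieces[i];
    }
    return out;
}
```

Called as append_all(pieces, 256) with the seven strings from the first program, the loop appends into the buffer obtained by reserve(), which matches the behaviour reported after switching to s2 += s1[i].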

Thanks for the info Narue! That is indeed a critical issue I hadn't thought of!

allocates and creates a 2MB buffer of '-' chars, replaces every 7th char with a 'P'; then replaces every 'P' with a 'PU'; then replaces every '-' with an '8';

This will yield the eight chars '888888PU' repeated over and over again to fill 2M + 2M/7 chars

then copies this string in 90 byte chunks to another buffer appending a CrLf to each 90 byte line

The first 92 bytes: 888888PU repeated 11 times followed by 88Cr-Lf
The next 92 bytes : 8888PU followed by 888888PU repeated 10 times followed by 8888Cr-Lf
Then: 88PU followed by 888888PU repeated 10 times followed by 888888Cr-Lf
Then: PU followed by 888888PU repeated 11 times followed by Cr-Lf

And now: 888888PU repeated 11 times followed by 88Cr-Lf
The pattern repeats after every 360 (LCM of 90 and 8) bytes in the original buffer; every 368 bytes in the second buffer.
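The repetition described above can be checked with a small sketch that computes any full line of the second buffer straight from the 8-character unit, without building either large buffer. The function name pattern_line is hypothetical, and a short final line at the very end of the buffer is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns line number 'line' (zero based) of the second
// buffer, i.e. 90 characters of the repeating "888888PU" pattern followed by
// Cr-Lf. Assumes the first buffer is exactly that pattern repeated.
std::string pattern_line(std::size_t line)
{
    static const char unit[] = "888888PU";   // the 8-character repeating unit
    const std::size_t unit_len = 8;
    const std::size_t chunk = 90;

    std::string out;
    out.reserve(chunk + 2);
    const std::size_t start = line * chunk;  // offset of this chunk in the first buffer
    for (std::size_t k = 0; k < chunk; ++k)
        out += unit[(start + k) % unit_len];
    out += "\r\n";

    assert(out.size() == chunk + 2);         // 90 characters plus Cr-Lf
    return out;
}
```

Since 360 is a multiple of both 90 and 8, pattern_line(4) comes out identical to pattern_line(0), and likewise for every fourth line after that.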

and finally outputs the last 4000 bytes to a MessageBox()

You should now be able to do away with the 2M + 2M/7 buffer altogether and directly generate just the last 4000 bytes needed for the output in a 4000 byte buffer.

I need to ‘come clean’ vijayan; I’m working on benchmarking various C++ compilers, programming languages, and algorithms, so just generating the last 4000 bytes of that series isn’t my goal. Also, I’m using this exercise to try to improve my knowledge of C++ and particularly String Class construction; String Classes are a particular interest of mine, since I’m an application programmer. My background is more in C and other programming languages than C++. Developing my String Class over the past several years has been a very good learning experience for me, and I have used it very successfully in my Windows CE work.

I’m working very hard on this right now, and if anyone is interested in commenting on what I’m doing or my code I’d appreciate it (maybe there are like minded folks out there!). Here is the PowerBASIC program I’m trying to implement and match as close as possible speed wise in either C or C++. It’ll do this 2MB buffer thing in 0.078 seconds (78 ticks) on my old laptop…

'1)Create a 2MB string of spaces; John Gleason’s original had 15 MB
'2)Change every 7th space to a "P"
'3)replace every "P" with a "PU" (hehehe)
'4)replace every space with an "8"
'5)Put in a carriage return & line feed every 90 characters
'DONE 0.078 secs!

#Compile Exe
#Dim All
Declare Function GetTickCount Lib "KERNEL32.DLL" Alias "GetTickCount" () As Dword

Function PBMain() As Long
  Local i, count7, currPos As Long
  Local s, s1 As String
  Local tick As Dword

  tick = getTickCount
  s = String$(2000000, " ")     '1)
  For i = 1 To 2000000
    Incr count7
    If count7 = 7 Then
       count7 = 0
       Asc(s, i) = &h50         '2) "P"
    End If
  Next
  Replace "P" With "PU" In s    '3) "PU"
  Replace Any " " With "8" In s '4) spaces changed to "8"
  s1 = String$(2200000, $NUL)   '5)2nd string for CRLF's a little bigger
  currPos = 1                   'for s1 position tracking
  For i = 1 To 2000000 Step 90
    Mid$(s1, currPos) = Mid$(s, i, 90) & $CRLF
    currPos = currPos + 92
  Next i
  s = RTrim$(s1, $NUL)
  tick = getTickCount - tick
  Msgbox "Done in" & Str$(tick/1000) & _
  " seconds, and here is the last part of the answer:" & $CRLF & _
  Right$(s, 4000) & $CrLf & $CrLf & "Len(s) = " & Str$(Len(s))

  PBMain=0
End Function

Here are the results I’ve come up with so far in my tests…

                                                          Ansi          Unicode      Executable Size

PowerBASIC Windows 9                                        0.078 secs  ------------    10240 bytes
GNU GCC My String Class, C Runtime I/O (stdio)            126.328 secs  245.125 secs    29184 bytes
GNU GCC Std C++ Lib String Class, C Runtime I/O (stdio)   287.922 secs  ------------    46592 bytes
GNU GCC Std C++ Lib String Class And I/O (iostream)       288.312 secs  ------------   470528 bytes
VC9 C++ VStudio 2008 Pro, My String Class, stdio I/O      148.188 secs  174.109 secs    95232 bytes

Of course, these results don’t look very favorable to C++, and I’m not sure I’m doing everything right. Narue’s reply kind of jogged something in my mind, and further reflection on these statements in the PowerBASIC program…

For i = 1 To 2000000 Step 90
  Mid$(s1, currPos) = Mid$(s, i, 90) & $CRLF
  currPos = currPos + 92
Next i

made me realize that what I ought to do is create an external Mid() function of my own that returns a reference to a String object, and inside that function perhaps do the strncpy() and strcat() calls from the C runtime, or maybe even try some of the Windows memory block transfer calls, or, further, get hold of some asm doing the same. I don’t know if I posted it, but using those C library calls instead of the String Class operator+ and operator= calls reduces my tick count to 300 or 400, which is still four or five times slower than the PowerBASIC code, but at least it’s movement in the right direction and better than 128000 ticks!
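The idea of writing each 90 character slice straight into a destination that was sized once, as the PowerBASIC Mid$ statement does, might be sketched in standard C++ as follows. The function wrap_lines and its parameters are hypothetical, and it uses std::string rather than the String class from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies the first 'count' characters of src into a new
// string in 'chunk'-sized lines, each followed by CR-LF. The destination is
// sized once up front and filled with memcpy, so the loop creates no
// temporary strings.
std::string wrap_lines(const std::string& src, std::size_t count, std::size_t chunk)
{
    assert(chunk > 0 && count <= src.size());

    const std::size_t lines = (count + chunk - 1) / chunk;  // last line may be short
    std::string dst(count + 2 * lines, '\0');               // final size, allocated once

    std::size_t out = 0;                                    // write position in dst
    for (std::size_t in = 0; in < count; in += chunk)
    {
        const std::size_t n = (count - in < chunk) ? count - in : chunk;
        std::memcpy(&dst[out], src.data() + in, n);         // the slice itself
        out += n;
        dst[out++] = '\r';
        dst[out++] = '\n';
    }
    return dst;
}
```

The equivalent of the loop in the benchmark would be wrap_lines(s1, 2000000, 90). No timing is claimed for it here; it only shows the shape of the copy.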

I’ll be working on this. That you can believe! Anyway, I’ll post the code with My String Class (it’s a lot!)…

//Main.cpp
#include <windows.h> 
#include <tchar.h>
#include <stdio.h>
#include "Strings.h"                               //<<< My String Class!
#define NUMBER 2000000                             //2 MB String

int main()
{
 DWORD tick;                                       //for timing with GetTickCount()
 String s2;                                        //exchange/slop/extra buffer

 tick = GetTickCount();                            //Get initial tick count
 String s1(_T(' '),NUMBER);                        //make String containing NUMBER of space chars
 for(int i=1; i<=NUMBER; i++)                      //loop through s1 converting every seventh char to 'P'
     if(i%7==0) s1.SetChar(i,_T('P'));             //my String::SetChar() is one based. % is mod operator.
 s2=s1.Replace((TCHAR*)_T("P"),(TCHAR*)_T("PU"));  //Replace every 'P' with "PU" which expands String
 s1=s2.Replace((TCHAR*)_T(" "),(TCHAR*)_T("8"));   //one char with every insert.  then replace blanks with 8
 s2.SetChar(1,'\0');                               //Reduce String::LenStr() to zero by setting NULL
 for(int i=0; i<NUMBER; i=i+90)                    //Finally, copy 90 byte chunks from s1 to s2, appending
     s2=s2+s1.Mid(i+1,90)+_T("\r\n");              //a CrLf to each line.  then display the last 4000     
 s1=s2.Right(4000);                                //characters in a MessageBox() and display final tick
 tick = GetTickCount() - tick;                     //difference in console
 printf("tick = %u\n",(unsigned)tick);
 printf("s2.LenStr() = %u\n",(unsigned)s2.LenStr());
 MessageBox(0,s1.lpStr(),_T("Here's Your String John!"),MB_OK);
 getchar();

 return 0;
}
//Strings.h
#if !defined(STRINGS_H)
#define STRINGS_H
#define EXPANSION_FACTOR      2
#define MINIMUM_ALLOCATION    8
//#define UNICODE
//#define _UNICODE

class __declspec(dllexport) String
{
 public:
 String();                                     //Uninitialized Constructor
 String(const TCHAR);                          //Constructor Initializes String With TCHAR
 String(const TCHAR*);                         //Constructor Initializes String With TCHAR*
 String(const String&);                        //Constructor Initializes String With Another String (Copy Constructor)
 String(int);                                  //Constructor Initializes Buffer To Specific Size
 String(const TCHAR ch, const int iCount);     //Constructor initializes String with int # of chars
 String& operator=(const TCHAR);               //Assigns TCHAR To String
 String& operator=(const TCHAR*);              //Assigns TCHAR* To String
 String& operator=(const String&);             //Assigns one String to another (this one)
 String& operator=(int);                       //Converts And Assigns An Integer to A String
 String& operator=(unsigned int);              //Converts And Assigns An Unsigned Integer to A String
 String& operator=(long);                      //Converts And Assigns A Long to A String
 String& operator=(DWORD);                     //Converts And Assigns A DWORD to A String
 String& operator=(double);                    //Converts And Assigns A double to A String
 String& operator+(const TCHAR);               //For adding TCHAR to String
 String& operator+(const TCHAR*);              //For adding null terminated TCHAR array to String
 String& operator+(const String&);             //For adding one String to Another
 bool operator==(const String);                //For comparing Strings
 String Left(unsigned int);                    //Returns String of iNum Left Most TCHARs of this
 String Right(unsigned int);                   //Returns String of iNum Right Most TCHARs of this
 String Mid(unsigned int, unsigned int);       //Returns String consisting of number of TCHARs from some offset
 String& Make(const TCHAR ch, int iCount);     //Creates (Makes) a String with iCount TCHARs
 String Remove(const TCHAR*, bool);            //Returns A String With A Specified TCHAR* Removed
 String Remove(TCHAR* pStr);                   //Returns A String With All The TCHARs In A TCHAR* Removed (Individual char removal)
 String Retain(TCHAR* pStr);                   //Returns A String Keeping Only The TCHARs That Appear In pStr
 String Replace(TCHAR* pMatch, TCHAR* pNew);   //Replace a match string found in this with a new string
 int InStr(const TCHAR);                       //Returns one based offset of a specific TCHAR in a String
 int InStr(const TCHAR*, bool);                //Returns one based offset of a particular TCHAR pStr in a String
 int InStr(const String&, bool);               //Returns one based offset of where a particular String is in another String
 void LTrim();                                 //Removes leading spaces/tabs from this String in place
 void RTrim();                                 //Removes trailing spaces/tabs/CR/LF from this String in place
 void Trim();                                  //Removes both leading and trailing whitespace in place
 unsigned int ParseCount(const TCHAR);         //Returns count of Strings delimited by a TCHAR passed as a parameter
 void Parse(String*, TCHAR);                   //Returns array of Strings in first parameter as delimited by 2nd TCHAR delimiter
 int iVal();                                   //Returns int value of a String
 int LenStr(void);                             //Returns length of string
 TCHAR* lpStr();                               //Returns address of pStrBuffer member variable
 TCHAR GetChar(unsigned int);                  //Returns TCHAR at one based index
 void SetChar(unsigned int, TCHAR);            //Sets TCHAR at one based index
 void Print(bool);                             //Outputs String to Console with or without CrLf
 ~String();                                    //String Destructor

 private:
 TCHAR* pStrBuffer;
 int    iAllowableCharacterCount;
};

String operator+(TCHAR* lhs, String& rhs);     //global function
#endif  //#if !defined(STRINGS_H)

//Strings.cpp

#include  <windows.h>
#include  <tchar.h>
#include  <stdlib.h>
#include  <stdio.h>
#include  <math.h>
#include  <string.h>
#include  "Strings.h"


String operator+(TCHAR* lhs, String& rhs)         //global function
{
 String sr=lhs;
 sr=sr+rhs;
 return sr;
}


String::String()    //Uninitialized Constructor
{
 pStrBuffer=new TCHAR[MINIMUM_ALLOCATION];
 pStrBuffer[0]=_T('\0');
 this->iAllowableCharacterCount=MINIMUM_ALLOCATION-1;
}


String::String(const TCHAR ch)  //Constructor: Initializes with TCHAR
{
 pStrBuffer=new TCHAR[MINIMUM_ALLOCATION];
 pStrBuffer[0]=ch;
 pStrBuffer[1]=_T('\0');
 iAllowableCharacterCount=MINIMUM_ALLOCATION-1;
}


String::String(const TCHAR* pStr)  //Constructor: Initializes with TCHAR*
{
 int iLen,iNewSize;

 iLen=_tcslen(pStr);
 iNewSize=(iLen/16+1)*16;
 pStrBuffer=new TCHAR[iNewSize];
 this->iAllowableCharacterCount=iNewSize-1;
 _tcscpy(pStrBuffer,pStr);
}


String::String(const String& s)  //Constructor Initializes With Another String, i.e., Copy Constructor
{
 int iLen,iNewSize;

 iLen=_tcslen(s.pStrBuffer);
 iNewSize=(iLen/16+1)*16;
 this->pStrBuffer=new TCHAR[iNewSize];
 this->iAllowableCharacterCount=iNewSize-1;
 _tcscpy(this->pStrBuffer,s.pStrBuffer);
}


String::String(int iSize)        //Constructor Creates String With Custom Sized
{                                //Buffer (rounded up to paragraph boundary)
 int iNewSize;

 iNewSize=(iSize/16+1)*16;
 pStrBuffer=new TCHAR[iNewSize];
 this->iAllowableCharacterCount=iNewSize-1;
 this->pStrBuffer[0]=_T('\0');
}


String::String(const TCHAR ch, int iCount)
{
 int iNewSize;

 iNewSize=(iCount/16+1)*16;
 pStrBuffer=new TCHAR[iNewSize];
 this->iAllowableCharacterCount=iNewSize-1;
 for(int i=0; i<iCount; i++)
     pStrBuffer[i]=ch;
 pStrBuffer[iCount]=_T('\0');
}


String& String::operator=(const TCHAR ch)  //Overloaded operator = for assigning a TCHAR to a String
{
 this->pStrBuffer[0]=ch;
 this->pStrBuffer[1]=_T('\0');

 return *this;
}


String& String::operator=(const TCHAR* pStr)   //Overloaded operator = for assigning an Asciiz string to a String
{
 int iLen,iNewSize;

 iLen=_tcslen(pStr);
 if(iLen<this->iAllowableCharacterCount)
    _tcscpy(pStrBuffer,pStr);
 else
 {
    delete [] pStrBuffer;
    iNewSize=(iLen/16+1)*16;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
    _tcscpy(pStrBuffer,pStr);
 }

 return *this;
}


String& String::operator=(const String& strRight)  //Overloaded operator = for
{                                                  //assigning another String to
 int iRightLen,iNewSize;                           //a String

 if(this==&strRight)
    return *this;
 iRightLen=_tcslen(strRight.pStrBuffer);
 if(iRightLen < this->iAllowableCharacterCount)
    _tcscpy(pStrBuffer,strRight.pStrBuffer);
 else
 {
    iNewSize=(iRightLen/16+1)*16;
    delete [] this->pStrBuffer;
    this->pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
    _tcscpy(pStrBuffer,strRight.pStrBuffer);
 }

 return *this;
}



bool String::operator==(const String strCompare)
{
 if(_tcscmp(this->pStrBuffer,strCompare.pStrBuffer)==0)  //_tcscmp
    return true;
 else
    return false;
}


String& String::operator+(const TCHAR ch)      //Overloaded operator + (Puts TCHAR in String)
{
 int iLen,iNewSize;
 TCHAR* pNew;

 iLen=_tcslen(this->pStrBuffer);
 if(iLen<this->iAllowableCharacterCount)
 {
    this->pStrBuffer[iLen]=ch;
    this->pStrBuffer[iLen+1]='\0';
 }
 else
 {
    iNewSize=((this->iAllowableCharacterCount*EXPANSION_FACTOR)/16+1)*16;
    pNew=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
    _tcscpy(pNew,this->pStrBuffer);
    delete [] this->pStrBuffer;
    this->pStrBuffer=pNew;
    this->pStrBuffer[iLen]=ch;
    this->pStrBuffer[iLen+1]='\0';
 }

 return *this;
}


String& String::operator+(const TCHAR* pChar) //Overloaded operator + (Adds TCHAR literals
{                                             //or pointers to Asciiz Strings)
 int iLen,iNewSize;
 TCHAR* pNew;

 iLen=_tcslen(this->pStrBuffer)+_tcslen(pChar);
 if(iLen<this->iAllowableCharacterCount)
 {
    if(this->pStrBuffer)
       _tcscat(this->pStrBuffer,pChar);
    else
       _tcscpy(this->pStrBuffer, pChar);
 }
 else
 {
    iNewSize=(iLen*EXPANSION_FACTOR/16+1)*16;
    pNew=new TCHAR[iNewSize];
    this->iAllowableCharacterCount = iNewSize-1;
    if(this->pStrBuffer)
    {
       _tcscpy(pNew,this->pStrBuffer);
       delete [] pStrBuffer;
       _tcscat(pNew,pChar);
    }
    else
       _tcscpy(pNew,pChar);
    this->pStrBuffer=pNew;
 }

 return *this;
}


String& String::operator+(const String& strRight)  //Overloaded operator + Adds
{                                                  //Another String to the left
 int iLen,iNewSize;                                //operand
 TCHAR* pNew;

 iLen=_tcslen(this->pStrBuffer) + _tcslen(strRight.pStrBuffer);
 if(iLen < this->iAllowableCharacterCount)
 {
    if(this->pStrBuffer)
       _tcscat(this->pStrBuffer,strRight.pStrBuffer);
    else
       _tcscpy(this->pStrBuffer,strRight.pStrBuffer);
 }
 else
 {
    iNewSize=(iLen*EXPANSION_FACTOR/16+1)*16;
    pNew=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
    if(this->pStrBuffer)
    {
       _tcscpy(pNew,this->pStrBuffer);
       delete [] pStrBuffer;
       _tcscat(pNew,strRight.pStrBuffer);
    }
    else
       _tcscpy(pNew,strRight.pStrBuffer);
    this->pStrBuffer=pNew;
 }

 return *this;
}


String String::Left(unsigned int iNum)
{
 unsigned int iLen,i,iNewSize;
 String sr;

 iLen=_tcslen(this->pStrBuffer);
 if(iNum<iLen)
 {
    iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    delete [] sr.pStrBuffer;                  //free buffer from default constructor (was leaked)
    sr.iAllowableCharacterCount=iNewSize-1;
    sr.pStrBuffer=new TCHAR[iNewSize];
    for(i=0;i<iNum;i++)
        sr.pStrBuffer[i]=this->pStrBuffer[i];
    sr.pStrBuffer[iNum]='\0';
    return sr;
 }
 else
 {
    sr=*this;
    return sr;
 }
}


String String::Right(unsigned int iNum)  //Returns Right$(strMain,iNum)
{
 unsigned int iLen,iNewSize;
 String sr;

 iLen=_tcslen(this->pStrBuffer);
 if(iNum<iLen)
 {
    iNewSize=(iNum*EXPANSION_FACTOR/16+1)*16;
    delete [] sr.pStrBuffer;                  //free buffer from default constructor (was leaked)
    sr.iAllowableCharacterCount=iNewSize-1;
    sr.pStrBuffer=new TCHAR[iNewSize];
    _tcsncpy(sr.pStrBuffer,this->pStrBuffer+iLen-iNum,iNum);
    sr.pStrBuffer[iNum]='\0';
    return sr;
 }
 else
 {
    sr=*this;
    return sr;
 }
}


String String::Mid(unsigned int iStart, unsigned int iCount)
{
 unsigned int iLen,iNewSize;
 String sr;

 iLen=_tcslen(this->pStrBuffer);
 if(iStart && iStart<=iLen)
 {
    if(iCount && iStart+iCount-1<=iLen)
    {
       iNewSize=(iCount*EXPANSION_FACTOR/16+1)*16;
       delete [] sr.pStrBuffer;               //free buffer from default constructor (was leaked)
       sr.iAllowableCharacterCount=iNewSize-1;
       sr.pStrBuffer=new TCHAR[iNewSize];
       _tcsncpy(sr.pStrBuffer,this->pStrBuffer+iStart-1,iCount);
       sr.pStrBuffer[iCount]='\0';
       return sr;
    }
    else
    {
       sr=*this;
       return sr;
    }
 }
 else
 {
    sr=*this;
    return sr;
 }
}


String& String::Make(const TCHAR ch, int iCount)    //Creates (Makes) a String with iCount TCHARs
{
 if(iCount>this->iAllowableCharacterCount)
 {
    delete [] pStrBuffer;
    int iNewSize=(iCount*EXPANSION_FACTOR/16+1)*16;
    this->pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 for(int i=0; i<iCount; i++)
     pStrBuffer[i]=ch;
 pStrBuffer[iCount]=_T('\0');

 return *this;
}





String String::Remove(const TCHAR* pToRemove, bool blnCaseSensitive)
{
 int i,j,iParamLen,iReturn=0;

 if(*pToRemove==0)
    return *this;
 String strNew(this->LenStr());
 iParamLen=_tcslen(pToRemove);
 i=0, j=0;
 do
 {
  if(pStrBuffer[i]==0)
     break;
  if(blnCaseSensitive)
     iReturn=_tcsncmp(pStrBuffer+i,pToRemove,iParamLen);  //_tcsncmp
  else
     iReturn=_tcsnicmp(pStrBuffer+i,pToRemove,iParamLen); //__tcsnicmp
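  if(iReturn==0)
  {
     i=i+iParamLen;   //skip the match, then re-test at the new position
     continue;        //(avoids copying, and stepping past, the terminating null)
  }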
  strNew.pStrBuffer[j]=pStrBuffer[i];
  j++, i++;
  strNew.pStrBuffer[j]=_T('\0');
 }while(1);

 return strNew;
}


String String::Remove(TCHAR* pStr)
{
 unsigned int i,j,iStrLen,iParamLen;
 TCHAR *pThis, *pThat, *p;
 bool blnFoundBadChar;

 iStrLen=this->LenStr();    //The length of this
 String sr((int)iStrLen);   //Create new String big enough to contain original String (this)
 iParamLen=_tcslen(pStr);   //Get length of parameter (pStr) which contains chars to be removed
 pThis=this->pStrBuffer;
 p=sr.lpStr();
 for(i=0; i<iStrLen; i++)
 {
     pThat=pStr;
     blnFoundBadChar=false;
     for(j=0; j<iParamLen; j++)
     {
         if(*pThis==*pThat)
         {
            blnFoundBadChar=true;
            break;
         }
         pThat++;
     }
     if(!blnFoundBadChar)
     {
        *p=*pThis;
         p++;
        *p=_T('\0');
     }
     pThis++;
 }

 return sr;
}


String String::Retain(TCHAR* pStr)
{
 unsigned int i,j,iStrLen,iParamLen;
 TCHAR *pThis, *pThat, *p;
 bool blnFoundGoodChar;

 iStrLen=this->LenStr();    //The length of this
 String sr((int)iStrLen);   //Create new String big enough to contain original String (this)
 iParamLen=_tcslen(pStr);   //Get length of parameter (pStr) which contains chars to be retained
 pThis=this->pStrBuffer;    //pThis will point to this String's buffer, and will increment through string.
 p=sr.lpStr();              //p will start by pointing to new String's buffer and will increment through new string
 for(i=0; i<iStrLen; i++)
 {
     pThat=pStr;
     blnFoundGoodChar=false;
     for(j=0; j<iParamLen; j++)
     {
         if(*pThis==*pThat)
         {
            blnFoundGoodChar=true;
            break;
         }
         pThat++;
     }
     if(blnFoundGoodChar)
     {
        *p=*pThis;
         p++;
        *p=_T('\0');
     }
     pThis++;
 }

 return sr;
}
String String::Replace(TCHAR* pMatch, TCHAR* pNew)
{
 int iLenMatch,iLenNew,iLenMainString,iCountMatches,iExtra,iExtraLengthNeeded,iAllocation,iCtr;
 String sr;

 iCountMatches=0, iAllocation=0, iCtr=0;
 iLenMatch=_tcslen(pMatch);
 iLenNew=_tcslen(pNew);
 iLenMainString=this->LenStr();
 if(iLenNew==0)
    sr=this->Remove(pMatch,true); //remove the whole match string, not its individual TCHARs
 else
 {
    iExtra=iLenNew-iLenMatch;
    for(int i=0; i<iLenMainString; i++)
        if(_tcsncmp(pStrBuffer+i,pMatch,iLenMatch)==0) iCountMatches++;
    iExtraLengthNeeded=iCountMatches*iExtra;
    iAllocation=iLenMainString+iExtraLengthNeeded;
    String strNew(iAllocation);
    for(int i=0; i<iLenMainString; i++)
    {
        if(_tcsncmp(pStrBuffer+i,pMatch,iLenMatch)==0)
        {
           _tcscpy(strNew.pStrBuffer+iCtr,pNew);
           iCtr=iCtr+iLenNew;
           i=i+iLenMatch-1;
        }
        else
        {
           strNew.pStrBuffer[iCtr]=this->pStrBuffer[i];
           iCtr++;
        }
        strNew.pStrBuffer[iCtr]=_T('\0');
    }
    sr=strNew;
 }

 return sr;
}


int String::InStr(const TCHAR ch)
{
 int iLen,i;

 iLen=_tcslen(this->pStrBuffer);
 for(i=0;i<iLen;i++)
 {
     if(this->pStrBuffer[i]==ch)
        return (i+1);
 }

 return 0;
}


int String::InStr(const TCHAR* pStr, bool blnCaseSensitive)
{
 int i,iParamLen,iRange;

 if(*pStr==0)
    return 0;
 iParamLen=_tcslen(pStr);
 iRange=_tcslen(pStrBuffer)-iParamLen;
 if(iRange>=0)
 {
    for(i=0;i<=iRange;i++)
    {
        if(blnCaseSensitive)
        {
           if(_tcsncmp(pStrBuffer+i,pStr,iParamLen)==0)   //_tcsncmp
              return i+1;
        }
        else
        {
           if(_tcsnicmp(pStrBuffer+i,pStr,iParamLen)==0)  //__tcsnicmp
              return i+1;
        }
    }
 }

 return 0;
}


int String::InStr(const String& s, bool blnCaseSensitive)
{
 int i,iParamLen,iRange,iLen;

 iLen=_tcslen(s.pStrBuffer);
 if(iLen==0)
    return 0;
 iParamLen=iLen;
 iRange=_tcslen(pStrBuffer)-iParamLen;
 if(iRange>=0)
 {
    for(i=0;i<=iRange;i++)
    {
        if(blnCaseSensitive)
        {
           if(_tcsncmp(pStrBuffer+i,s.pStrBuffer,iParamLen)==0)  //_tcsncmp
              return i+1;
        }
        else
        {
           if(_tcsnicmp(pStrBuffer+i,s.pStrBuffer,iParamLen)==0) //__tcsnicmp
              return i+1;
        }
    }
 }

 return 0;
}


void String::LTrim()
{
 unsigned int i,iCt=0,iLenStr;

 iLenStr=this->LenStr();
 for(i=0;i<iLenStr;i++)
 {
     if(pStrBuffer[i]==32||pStrBuffer[i]==9)
        iCt++;
     else
        break;
 }
 if(iCt)
 {
    for(i=iCt;i<=iLenStr;i++)
        pStrBuffer[i-iCt]=pStrBuffer[i];
 }
}


void String::RTrim()
{
 int iLen=this->LenStr();

 while(iLen>0)                    //also safe for an empty String
 {
  TCHAR ch=this->pStrBuffer[iLen-1];
  if(ch==9||ch==10||ch==13||ch==32)
     iLen--;                      //drop trailing tab, LF, CR or space
  else
     break;
 }
 this->pStrBuffer[iLen]=0;
}


void String::Trim()
{
 this->LTrim();
 this->RTrim();
}


unsigned int String::ParseCount(const TCHAR c) //returns one more than # of
{                                              //delimiters so it accurately
 unsigned int iCtr=0;                          //reflects # of strings delimited
 TCHAR* p;                                     //by delimiter.

 p=this->pStrBuffer;
 while(*p)
 {
  if(*p==c)
     iCtr++;
  p++;
 }

 return ++iCtr;
}


void String::Parse(String* pStr, TCHAR delimiter)
{
 unsigned int i=0;
 TCHAR* pBuffer=0;
 TCHAR* c;
 TCHAR* p;

 pBuffer=new TCHAR[this->LenStr()+1];
 if(pBuffer)
 {
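    p=pBuffer;
    *p=0;                 //start with an empty field
    c=this->pStrBuffer;
    while(*c)
    {
     if(*c==delimiter)
     {
        pStr[i]=pBuffer;
        p=pBuffer;
        *p=0;             //reset so an empty field is not a copy of the previous one
        i++;
     }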
     else
     {
        *p=*c;
        p++;
        *p=0;
     }
     c++;
    }
    pStr[i]=pBuffer;
    delete [] pBuffer;
 }
}


int String::iVal()
{
 return _ttoi(this->pStrBuffer);  //_ttoi
}


String& String::operator=(int iNum)
{
 if(this->iAllowableCharacterCount<16)
 {
    int iNewSize;
    delete [] this->pStrBuffer;
    iNewSize=16;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 _stprintf(this->pStrBuffer,_T("%d"),iNum);

 return *this;
}


String& String::operator=(unsigned int iNum)
{
 if(this->iAllowableCharacterCount<16)
 {
    int iNewSize;
    delete [] this->pStrBuffer;
    iNewSize=16;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 _stprintf(this->pStrBuffer,_T("%u"),iNum);

 return *this;
}


String& String::operator=(long iNum)
{
 if(this->iAllowableCharacterCount<16)
 {
    int iNewSize;
    delete [] this->pStrBuffer;
    iNewSize=16;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 _stprintf(this->pStrBuffer,_T("%ld"),iNum);

 return *this;
}


String& String::operator=(DWORD iNum)
{
 if(this->iAllowableCharacterCount<16)
 {
    int iNewSize;
    delete [] this->pStrBuffer;
    iNewSize=16;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 _stprintf(this->pStrBuffer,_T("%u"),(unsigned)iNum);

 return *this;
}


String& String::operator=(double dblNum)
{
 if(this->iAllowableCharacterCount<16)
 {
    int iNewSize;
    delete [] this->pStrBuffer;
    iNewSize=32;
    pStrBuffer=new TCHAR[iNewSize];
    this->iAllowableCharacterCount=iNewSize-1;
 }
 _stprintf(this->pStrBuffer,_T("%10.14f"),dblNum);

 return *this;
}


int String::LenStr(void)
{
 return _tcslen(this->pStrBuffer);
}


TCHAR* String::lpStr()
{
 return pStrBuffer;
}


TCHAR String::GetChar(unsigned int iOffset)
{
 return this->pStrBuffer[iOffset-1];
}


void String::SetChar(unsigned int iOneBasedOffset, TCHAR tcChar)
{
 if((int)iOneBasedOffset<=this->iAllowableCharacterCount)
    this->pStrBuffer[iOneBasedOffset-1]=tcChar;
}


void String::Print(bool blnCrLf)
{
 _tprintf(_T("%s"),pStrBuffer);
 if(blnCrLf)
    _tprintf(_T("\n"));
}


String::~String()   //String Destructor
{
 delete [] pStrBuffer;
 pStrBuffer=0;
}

Here’s the version that gives me the lowest tick count…

//Test driver (includes Strings.h)
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
#include "Strings.h"
#define NUMBER 2000000      //Ansi 406, 421       UNICODE 281

int main()
{
 String s1(_T(' '),NUMBER);
 DWORD tick;
 String s2;

 tick = GetTickCount();
 for(int i=1; i<=NUMBER; i++)
     if(i%7==0)  s1.SetChar(i,_T('P'));
 s2=s1.Replace((TCHAR*)_T("P"),(TCHAR*)_T("PU"));
 s1=s2.Replace((TCHAR*)_T(" "),(TCHAR*)_T("8"));
 s2.Make(_T('\0'),2200000);
 TCHAR* pBuf1=s1.lpStr();
 TCHAR* pBuf2=s2.lpStr();
 for(int i=0; i<NUMBER; i=i+90)   //Forget String Classes and resurrect good old C!
 {
     _tcsncpy(pBuf2,pBuf1,90);
     _tcscat(pBuf2,(TCHAR*)_T("\r\n"));
     pBuf2=pBuf2+92;
     pBuf1=pBuf1+90;
 }
 s1=s2.Right(4000);
 tick = GetTickCount()-tick;
 printf("tick = %u\n",(unsigned)tick);
 MessageBox(NULL,s1.lpStr(),_T("Here's Your String John!"),MB_OK);
 getchar();

 return 0;
}

Here is the PowerBASIC program I’m trying to implement and match as closely as possible, speed-wise, in either C or C++. It’ll do this 2MB buffer thing in 0.078 seconds (78 ticks) on my old laptop…

This is how I would translate the PowerBASIC code to C++:

#include <string>
#include <algorithm>
#include <iostream>

// BEWARE OF DOG. SLIPPERY WHEN WET.

int main()
{
    enum { K = 7, N = 2*1024*1024, N1 = N + N/7, BLOCKSZ = 90, M = N1 + 2*N1/BLOCKSZ, TAIL = 4000 } ;
    const char* const crnl = "\r\n" ;
    typedef std::string::size_type size_type ;
    typedef std::string::iterator iterator ;

    // create a string containing N ' '
    std::string str( N, ' ' ) ;

    // replace every Kth ' ' with a 'P'
    for( size_type i = K-1 ; i < N ; i += K ) str[i] = 'P' ;

    // replace every 'P' with a 'PU'

    // we could do this in quadratic time by:
    // for( size_type i = K ; i < N1 ; i += K+1 ) str.insert( str.begin()+i, 'U' ) ;

    // however, by using some temporary memory, we can do it in linear time by:
    {
        std::string temp ;
        temp.reserve(N1) ;
        iterator i = str.begin() ; // start at the first block, so that its 'P' also gets a 'U'
        const iterator end = str.end() - K ;
        for(  ; i < end ; i += K )
        {
            temp.insert( temp.end(), i, i+K ) ;
            temp += 'U' ;
        }
        temp.insert( temp.end(), i, str.end() ) ; // copy the tail fragment

        str = temp ; // and finally modify the original str with a single assignment

    } // this is presumably what PowerBASIC would have done for: Replace "P" With "PU" In s

    // replace every ' ' with an '8'
    std::replace( str.begin(), str.end(), ' ', '8' ) ; // in place; replace_copy must not write into its own input range

    std::string dest ;
    dest.reserve(M) ;

    // copy blocks of BLOCKSZ chars to dest, appending a cr-nl to each copied block
    iterator i = str.begin() ;
    const iterator last = str.end() - BLOCKSZ ;
    for( ; i < last ; i += BLOCKSZ )
    {
        dest.insert( dest.end(), i, i+BLOCKSZ ) ;
        dest += crnl ;
    }
    dest.insert( dest.end(), i, str.end() ) ; // copy what is left

    // using stdout in place of MessageBox()
    std::cout << dest.substr( dest.size() - TAIL ) ;
}

On my laptop (FreeBSD 8.1-STABLE i386, Core2 Duo T7100 @1.80GHz, GCC 4.5) the 10K executable runs in about 0.04 seconds. As expected, about the same order of magnitude time as PowerBASIC. Curiously, also an almost identical executable size (though one is on Unix and the other is on Windows).

>c++ -O3 -march=core2 -fomit-frame-pointer strings.cc && ll a.out && time ./a.out > /dev/null
-rwxr-xr-x 1 vijayan vijayan 10182 Jan 10 14:38 a.out*
0.037u 0.007s 0:00.04 100.0% 16+1712k 0+0io 0pf+0w

and inside that function perhaps do the strncpy() and strcat() calls from the C runtime, or maybe even try some of the Windows memory byte block transfer calls, or further, get ahold of some asm doing the same.

The C++ compiler can do these kinds of optimizations if your code gives it a chance.

Wow Vijayan! Thanks! I'm going to really study that code and see what I can learn from it. I'll see if I can get it to run now.

It compiles and runs vijayan! I'm coming in around 47 ticks. By far best yet. That's awesome code. I'm going to try to figure it out now. I'm very grateful for your time and help. Very.
