Problem: Split a string on a delimiter and return a vector of the resulting substrings, none of which contains the delimiter.

1. Use stringstream getline

Signature [4]: istream& getline(istream& is, string& str, char delim);

Usage [1]:

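#include <sstream>
#include <string>
#include <vector>
using namespace std;

vector<string> split(const string &s, char delim) {
    stringstream ss(s);
    string item;
    vector<string> tokens;
    // getline reads up to (and discards) the next delim; it fails at end of input
    while (getline(ss, item, delim)) {
        tokens.push_back(item);
    }
    return tokens;
}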

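Note that this solution does not skip empty tokens, so the following call returns 4 items, one of which is empty:

std::vector<std::string> x = split("one:two::three", ':');

A trailing delimiter, however, does not produce a trailing empty token, because getline fails once it reaches the end of the input without extracting anything.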
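If empty tokens are not wanted, they can be filtered out as they are read. The following is a minimal sketch under that assumption; the name split_skip_empty is hypothetical and not part of the original solution.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of split() that drops empty tokens.
std::vector<std::string> split_skip_empty(const std::string& s, char delim) {
    std::stringstream ss(s);
    std::string item;
    std::vector<std::string> tokens;
    while (std::getline(ss, item, delim)) {
        if (!item.empty()) {  // ignore "" produced by adjacent or leading delimiters
            tokens.push_back(item);
        }
    }
    return tokens;
}
```

With this variant, split_skip_empty("one:two::three", ':') yields the three tokens "one", "two" and "three".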

2. Use string’s find function

Be careful with substr: its second argument is a length, not an end position, so the token between index i and index pos is s.substr(i, pos - i).

Usage [2]:

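#include <string>
#include <vector>
using namespace std;

void split(const string& s, char delim, vector<string>& v) {
    string::size_type i = 0;              // start of the current token
    string::size_type pos = s.find(delim);
    while (pos != string::npos) {
        v.push_back(s.substr(i, pos - i)); // substr takes (start, length)
        i = pos + 1;                       // skip past the delimiter
        pos = s.find(delim, i);
    }
    // Text after the last delimiter, or the whole string if none was found
    v.push_back(s.substr(i));
}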
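The same find/substr pattern extends to delimiters longer than one character, which getline with a char delimiter cannot handle. The sketch below assumes a hypothetical helper named split_on that returns the tokens by value; the only change from the single-character case is that the start index advances by the delimiter's length.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split on a multi-character delimiter such as ", ".
std::vector<std::string> split_on(const std::string& s, const std::string& delim) {
    std::vector<std::string> tokens;
    if (delim.empty()) {  // an empty delimiter would never advance the search
        tokens.push_back(s);
        return tokens;
    }
    std::string::size_type start = 0;
    std::string::size_type pos = s.find(delim);
    while (pos != std::string::npos) {
        tokens.push_back(s.substr(start, pos - start));  // (start, length)
        start = pos + delim.size();                      // step past the delimiter
        pos = s.find(delim, start);
    }
    tokens.push_back(s.substr(start));  // tail after the last delimiter
    return tokens;
}
```

For example, split_on("a, b, c", ", ") produces "a", "b" and "c", and a string that does not contain the delimiter comes back as a single token.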

3. Use std::strtok

Signature char* strtok( char* str, const char* delim );

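Returns a pointer to the beginning of the next token, or NULL if there are no more tokens. Every character in delim is treated as a separator, and runs of consecutive separators are skipped, so strtok never yields an empty token. It also overwrites separators in the input with '\0' and keeps internal state between calls, so it needs a writable buffer and is not reentrant. Usage [3][5]: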

#include <cstdlib>   // free
#include <cstring>   // strtok, strdup (strdup is POSIX, not standard C++)
#include <string>
#include <vector>
using namespace std;

void split(const string& s, const char* delim, vector<string>& v) {
    // strtok writes '\0' into its input, so tokenize a heap copy of s
    // instead of the original string.
    char* dup = strdup(s.c_str());
    if (dup == NULL) return;  // allocation failed
    char* token = strtok(dup, delim);
    while (token != NULL) {
        v.push_back(string(token));
        // Passing NULL makes strtok resume where the previous call stopped.
        token = strtok(NULL, delim);
    }
    free(dup);  // release the copy made by strdup
}
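Two strtok behaviours are easy to miss: the delim argument is a set of separator characters, and consecutive separators are collapsed. The sketch below illustrates both; the wrapper name split_strtok is hypothetical, and it copies the input into a std::vector<char> rather than using strdup, as one way to stay within standard C++.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical wrapper: tokenizes a private copy of s with std::strtok.
std::vector<std::string> split_strtok(const std::string& s, const char* delims) {
    // Writable, NUL-terminated copy; strtok overwrites separators with '\0'.
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

Here split_strtok("one:two::three", ":") returns three tokens rather than four, and split_strtok("a,b;c", ",;") splits on either ',' or ';'.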

4. Use boost tokenizer

See the Boost.Tokenizer documentation for details.

References

  1. split a string in C
  2. splitting a string
  3. using strtok with a string
  4. std::getline
  5. std::strtok

Yang Song

Ph.D. Student in Robotics