Ruby Strings


A String object in Ruby holds and manipulates an arbitrary sequence of bytes, typically representing characters of human-language text.

The simplest string literals are enclosed in single quotes (the apostrophe character). The text within the quote marks is the value of the string:

      'This is a simple Ruby string literal'
      

If you need to place an apostrophe within a single-quoted string literal, precede it with a backslash so that the Ruby interpreter does not think that it terminates the string:

      'Won\'t you read O\'Reilly\'s book?'
      

The backslash also works to escape another backslash, so that the second backslash is not itself interpreted as an escape character.
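
As a small illustration of the two escapes described above, the following sketch shows how single-quoted literals treat backslashes. The variable names (path, title, raw) are made up for this example and do not come from the original text:

```ruby
# Single-quoted strings recognize only two escapes: \' and \\
path  = 'C:\\temp'      # two backslashes in the source, one in the string
title = 'O\'Reilly'     # escaped apostrophe
raw   = 'line\nbreak'   # \n is NOT an escape here; it stays as two characters

puts path          # C:\temp
puts path.length   # 7
puts title         # O'Reilly
puts raw           # line\nbreak
puts raw.length    # 11
```

Any other backslash sequence inside single quotes is kept literally, which is why raw has 11 characters rather than 10.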

The following sections cover Ruby's string-related features.

String Concatenation:

The simplest way to concatenate strings is the "+" operator (which is really a method call on the left-hand string):

      a = 'We can '
      b = 'concatenate '
      c = 'strings'
      
      puts a + b + c
      

This will produce the following result:

      We can concatenate strings
      

We can also append to an existing string in place with the "<<" operator:

      a = 'We can '
      a << 'concatenate '
      a << 'strings'
      
      puts a
      

This produces the same result:

      We can concatenate strings
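
The two operators differ in one important way: "+" builds a new string, while "<<" changes the receiver itself. The sketch below makes this visible; the names base, same_object, and joined are illustrative only, and String.new is used so the example also works when string literals are frozen:

```ruby
base = String.new('We can ')
same_object = base              # second reference to the same String object

joined = base + 'concatenate'   # + returns a new String; base is untouched
puts base                       # We can
puts joined                     # We can concatenate
puts joined.equal?(base)        # false

base << 'append'                # << modifies base in place
puts same_object                # We can append
puts same_object.equal?(base)   # true
```

Because "<<" mutates the string, every variable that refers to the same object sees the change.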
      

Expression Substitution:

Expression substitution, a.k.a. string interpolation, is a means of embedding the value of any Ruby expression into a double-quoted string using #{ and }:

      x = 12
      y = 36
      z = 72
      puts "The value of x is #{ x }."
      puts "The sum of x and y is #{ x + y }."
      puts "The average was #{ (x + y + z)/3 }."
      

This will produce the following result:

      The value of x is 12.
      The sum of x and y is 48.
      The average was 40.
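
Two details of the example above are easy to miss: interpolation does not happen inside single quotes, and dividing integers discards the remainder. The following sketch illustrates both, using a different value for z (73, chosen here only so that the division is not exact):

```ruby
x = 12
y = 36
z = 73

literal      = 'x is #{x}'    # single quotes: no interpolation
interpolated = "x is #{x}"    # double quotes: expression is evaluated

int_avg   = "#{(x + y + z) / 3}"               # integer division: 121 / 3 is 40
float_avg = "#{((x + y + z) / 3.0).round(2)}"  # float division, rounded

puts literal        # x is #{x}
puts interpolated   # x is 12
puts int_avg        # 40
puts float_avg      # 40.33
```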
      

 

General Delimited Strings:

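With general delimited strings, you can create strings inside any pair of matching delimiter characters, e.g., !, (, {, <, etc., preceded by a percent character (%). The letters Q, q, and x after the percent sign have special meanings: %Q behaves like a double-quoted string, %q like a single-quoted string, and %x runs a shell command and returns its output. A bare % behaves like %Q. Bracket-style delimiters can be nested inside the string as long as they are balanced: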

      %{Ruby is fun.}    # equivalent to "Ruby is fun."
      %Q{ Ruby is fun. } # equivalent to " Ruby is fun. "
      %q[Ruby is fun.]   # equivalent to a single-quoted string
      %x!ls!             # equivalent to back tick command output `ls`
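
The nesting and quoting rules can be checked with a short sketch such as the one below. The variable names are invented for this illustration:

```ruby
nested = %q(outer (inner) outer)   # balanced parentheses may nest
double = %Q{1 + 2 = #{1 + 2}}      # %Q interpolates like double quotes
single = %q{1 + 2 = #{1 + 2}}      # %q keeps the text literally
quoted = %!"double" and 'single' quotes need no escaping!

puts nested   # outer (inner) outer
puts double   # 1 + 2 = 3
puts single   # 1 + 2 = #{1 + 2}
puts quoted   # "double" and 'single' quotes need no escaping
```

General delimited strings are mostly useful when the text itself contains many quote characters.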
      

 

Character Encoding:

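In Ruby 1.8, the default character set is ASCII, whose characters may be represented by single bytes. If you use UTF-8, or another modern character set, characters may be represented in one to four bytes. Since Ruby 1.9, every string carries its own encoding, and since Ruby 2.0 the default source encoding is UTF-8.

$KCODE has no effect in Ruby 1.9 and later, where an optional magic comment (# encoding: utf-8) on the first line of the source file selects the source encoding. In Ruby 1.8, you change the character set by setting $KCODE at the beginning of your program, like this: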

      $KCODE = 'u'
      # Following are the possible values for $KCODE:
      # a - ASCII
      # e - EUC.
      # n - None (same as ASCII) 
      # u - UTF-8
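
For Ruby 1.9 and later, where each string knows its own encoding, a sketch along the following lines shows the difference between characters and bytes. It assumes a modern Ruby; the sample word and the variable names are chosen only for illustration:

```ruby
s = "caf\u00e9"          # "café"; the \u escape yields a UTF-8 string

puts s.encoding          # UTF-8
puts s.length            # 4 characters
puts s.bytesize          # 5 bytes, because é takes two bytes in UTF-8

latin1 = s.encode('ISO-8859-1')   # transcode to a single-byte encoding
puts latin1.encoding     # ISO-8859-1
puts latin1.bytesize     # 4 bytes
```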
      

String Built-in Methods:

We need an instance of a String object to call a String method. The following are two ways to create one:

      str = ''
      str = String.new('')
      

String.new returns a new string object containing a copy of the string passed to it. Using the str object, we can now call any available instance method. For example:

      str = String.new("THIS IS TEST")
      foo = str.downcase
      
      puts foo
      

This will produce the following result:

      this is test
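
To make the "copy" behaviour of String.new concrete, and to contrast downcase with its in-place variant downcase!, here is a minimal sketch. The variable names original, copy, and lower are illustrative:

```ruby
original = String.new("THIS IS TEST")
copy = String.new(original)      # same content, different object

puts copy == original            # true  (equal content)
puts copy.equal?(original)       # false (distinct objects)

lower = original.downcase        # returns a new string
puts original                    # THIS IS TEST
puts lower                       # this is test

original.downcase!               # the ! variant changes the receiver
puts original                    # this is test
puts copy                        # THIS IS TEST (the copy is unaffected)
```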
      

The slice Method:

Sometimes you need to extract characters from a certain offset within a string.

In Ruby, this is done using the slice method of the String class. The slice method is an alias for [], so the two forms are interchangeable. In Ruby 1.9 and later, a single integer index returns a one-character string (Ruby 1.8 returned the character code instead, 107 in this example); use ord to get the code and chr to convert it back:

      s = 'My kingdom for a string!'
      puts s.slice(3)          # => "k"
      puts s[3]                # => "k"
      puts s[3].ord            # => 107
      puts 107.chr             # => "k"
      puts s.slice(3,1)        # => "k"
      puts s[3,1]              # => "k"
      puts s[23,1]             # => "!"
      puts s[24].inspect       # => nil
      

This will produce the following result:

      k
      k
      107
      k
      k
      k
      !
      nil
      

If compared to substr in other languages, this may come as a surprise: a single index returns only the character at that position, not the rest of the string. To extract a longer substring, we can use ranges instead of specific indices:

      s = 'My kingdom for a string!'
      
      puts s.slice(3..9)
      puts s[0..1] + s[17..22] + s[11..13] + s[15..15] + s[3..9] + s[23..-1]
      puts s.slice(0..-15)
      puts s[17..-1]
      

This will produce the following result:

      kingdom
      Mystringforakingdom!
      My kingdom
      string!
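
A few more forms of [] are worth trying alongside the ranges above: a start index with a length, an exclusive range, negative indices, and offsets beyond the end of the string. The following sketch assumes Ruby 1.9 or later and reuses the same sample string:

```ruby
s = 'My kingdom for a string!'

puts s[3, 7]      # kingdom  (start at index 3, take 7 characters)
puts s[3...10]    # kingdom  (three dots exclude the end index)
puts s[-7..-1]    # string!  (negative indices count from the end)
puts s[-1]        # !

p s[24, 2]        # ""   (start equal to the length gives an empty string)
p s[100, 2]       # nil  (start past the end gives nil)
```

Checking for nil is the usual way to detect that an offset was out of range.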
      

The split Method:

Taking a string and splitting it with a delimiter is a very common task in Ruby. The official documentation states that String#split divides str into substrings based on a delimiter, returning an array of these substrings.

The delimiter itself can be a string or regular expression:

      # string delimiter
      puts "hello".split('').inspect 
      puts "hello".split('ll').inspect
      
      # regular expression delimiter
      puts "hello".split(//).inspect
      puts "hello".split(/l+/).inspect
      

This will produce the following result:

      ["h", "e", "l", "l", "o"]
      ["he", "o"]
      ["h", "e", "l", "l", "o"]
      ["he", "o"]
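
In everyday code, split is most often called with no argument (which splits on runs of whitespace), with a short string, or with a character-class regular expression. A brief sketch, with sample inputs invented for this illustration:

```ruby
words  = "  the quick   brown fox ".split       # no argument: split on whitespace
fields = "a,b;c".split(/[,;]/)                  # regexp: split on comma or semicolon
date   = "2024-01-15".split('-').map(&:to_i)    # split, then convert each part

p words    # ["the", "quick", "brown", "fox"]
p fields   # ["a", "b", "c"]
p date     # [2024, 1, 15]
```

With no argument, leading and trailing whitespace is ignored, so no empty strings appear in the result.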
      

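String#split takes an optional second parameter representing a limit. If the limit is omitted, trailing empty strings are dropped from the result. A positive limit returns at most that many substrings, with the remainder of the string left unsplit in the last one. A negative limit places no cap on the number of substrings and keeps trailing empty strings: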

      # omitting the limit
      puts "hallo hello hillo hollo".split('o').inspect 
      
      # positive limit
      puts "hallo hello hillo hollo".split('o', 1).inspect 
      puts "hallo hello hillo hollo".split('o', 2).inspect 
      puts "hallo hello hillo hollo".split('o', 3).inspect 
      puts "hallo hello hillo hollo".split('o', 4).inspect 
      puts "hallo hello hillo hollo".split('o', 5).inspect 
      
      # negative limit
      puts "hallo hello hillo hollo".split('o', -1).inspect 
      

This will produce the following result:

      ["hall", " hell", " hill", " h", "ll"]
      ["hallo hello hillo hollo"]
      ["hall", " hello hillo hollo"]
      ["hall", " hell", " hillo hollo"]
      ["hall", " hell", " hill", " hollo"]
      ["hall", " hell", " hill", " h", "llo"]
      ["hall", " hell", " hill", " h", "ll", ""]
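
Two practical uses of the limit are splitting a "key=value" pair only at the first separator and preserving empty trailing fields in delimited data. The inputs below are hypothetical and serve only to illustrate the behaviour:

```ruby
# A limit of 2 splits at the first '=' only, so the value may contain '='.
key, value = "path=/usr/bin=x".split('=', 2)
puts key      # path
puts value    # /usr/bin=x

# Trailing empty fields are dropped by default and kept with a negative limit.
trimmed = "a,b,,".split(',')
full    = "a,b,,".split(',', -1)
p trimmed     # ["a", "b"]
p full        # ["a", "b", "", ""]
```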
      

String Pattern Formatting:

You can format your data immediately, using the sprintf method:

      puts sprintf("%d %04x", 123, 123)   
      puts sprintf("%08b '%4s'", 123, 123)      
      puts sprintf("%1$*2$s %2$d %1$s", "hello", 8) 
      puts sprintf("%1$*2$s %2$d", "hello", -8)    
      puts sprintf("%+g:% g:%-g", 1.23, 1.23, 1.23)  
      puts sprintf("%u", -123)             # %u behaves like %d in Ruby 1.9 and later
      

This will produce the following result:

      123 007b
      01111011 ' 123'
         hello 8 hello
      hello    -8
      +1.23: 1.23:1.23
      -123
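
The format directives used most often are simpler than the ones above. The sketch below uses Kernel#format, which is an alias for sprintf, to show zero padding, left and right alignment, and named references; the values and variable names are made up for this example:

```ruby
padded = format('%05.2f', 3.14159)   # width 5, two decimals, zero padded
left   = format('%-10s|', 'left')    # left-aligned in a 10-character field
right  = format('%10s|', 'right')    # right-aligned in a 10-character field
named  = format('%<name>s is %<age>d', name: 'Ann', age: 30)  # named references
hex    = format('%x', 255)           # hexadecimal

puts padded   # 03.14
puts left     # left      |
puts right    #      right|
puts named    # Ann is 30
puts hex      # ff
```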
      

Alternatively, you can create a template and interpolate data into it later with the % operator:

      template = '%s has always been in %s with %s.'
      
      puts template % ['Oceania', 'war', 'Eurasia']
      
      puts template % ['Luke Skywalker', 'love', 'Leia Organa']
      

This will produce the following result:

      Oceania has always been in war with Eurasia.
      Luke Skywalker has always been in love with Leia Organa.
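
The % operator also accepts a single value or a hash of named values on its right-hand side. The following sketch shows these variations; the templates and names are invented for illustration:

```ruby
single = '%05d' % 42                         # one value needs no array
pair   = '%s scored %d' % ['Ann', 97]        # several values go in an array
by_key = '%{winner} beat %{loser}' % { winner: 'Red', loser: 'Blue' }  # named slots
pct    = '%.1f%%' % 99.5                     # %% produces a literal percent sign

puts single   # 00042
puts pair     # Ann scored 97
puts by_key   # Red beat Blue
puts pct      # 99.5%
```

Named slots tend to keep longer templates readable, since the order of the values no longer matters.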