Ruby Regular Expressions

This chapter explores Ruby Regular Expressions (Regex), a powerful tool for pattern matching and text processing. Regular expressions are essential for tasks like validating input, searching, and manipulating strings. Ruby provides robust support for regular expressions, enabling efficient and concise operations on text.

Chapter Goals

  • Understand the purpose and syntax of regular expressions in Ruby.
  • Learn how to use regex for pattern matching, substitution, and validation.
  • Explore common regex patterns and their applications.
  • Implement best practices for writing and using regular expressions.

Key Characteristics of Ruby Regular Expressions

  • Pattern Matching: Identify patterns within strings using regex.
  • Search and Replace: Modify text based on regex matches.
  • Integration: Seamlessly integrates with Ruby’s string and text-processing methods.
  • Efficiency: Provides concise solutions for complex text operations.

Basic Rules for Regular Expressions

  • Enclose patterns in forward slashes (/pattern/) or use %r{pattern}.
  • Use regex methods like match, scan, sub, and gsub for text processing.
  • Escape special characters with a backslash (\) if they need to be matched literally.
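
As a brief illustration of the rules above, the following sketch writes the same pattern with both literal forms and shows what escaping changes. The sample strings and variable names are made up for the example.

```ruby
# Two ways to write a pattern that contains forward slashes.
slash_form   = /\/usr\/local/   # each / must be escaped inside /.../
percent_form = %r{/usr/local}   # no escaping needed inside %r{...}

path = "/usr/local/bin"
puts slash_form.match?(path)     # true
puts percent_form.match?(path)   # true

# An unescaped dot matches any character; an escaped dot matches only ".".
puts "3x14".match?(/3.14/)       # true
puts "3x14".match?(/3\.14/)      # false
puts "3.14".match?(/3\.14/)      # true
```

The %r{...} form tends to be easier to read for paths and URLs, where slashes are common.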

Best Practices

  • Write clear and concise regex patterns for readability.
  • Use named captures for clarity when extracting groups.
  • Test regex patterns on various inputs to ensure reliability.
  • Document complex regex patterns for maintainability.
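
One way to follow these practices is Ruby's extended mode (the x flag), which ignores whitespace inside the pattern and allows comments. The sketch below is only an illustration: the date format, the variable names, and the sample inputs are assumptions chosen for the example.

```ruby
# Extended mode (x flag): whitespace is ignored and # starts a comment.
date_pattern = /
  \A
  (?<year>\d{4})  -   # four-digit year
  (?<month>\d{2}) -   # two-digit month
  (?<day>\d{2})       # two-digit day
  \z
/x

# Try the pattern on inputs that should and should not match.
samples = ["2024-01-15", "2024-1-15", "15-01-2024", "2024-01-15 extra"]
samples.each do |text|
  puts "#{text.inspect}: #{date_pattern.match?(text)}"
end
```

Only the first sample matches; the others fail on digit counts or trailing text, which is the kind of edge case worth checking before relying on a pattern.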

Syntax Table

No. | Method/Pattern    | Syntax/Example                 | Description
1   | Match             | /pattern/ =~ string            | Checks if the string matches the pattern.
2   | Match Object      | string.match(/pattern/)        | Returns a MatchData object for matches.
3   | Scan              | string.scan(/pattern/)         | Returns all matches in an array.
4   | Substitute        | string.sub(/pattern/, 'new')   | Replaces the first match in the string.
5   | Global Substitute | string.gsub(/pattern/, 'new')  | Replaces all matches in the string.
6   | Named Captures    | /(?<name>pattern)/             | Names a capturing group for clarity.

Syntax Explanation

Match with Regular Expressions

What is Matching?

Matching checks if a string contains a substring that fits a regex pattern.

Syntax

if /pattern/ =~ string
  puts "Match found!"
end

Detailed Explanation

  • The =~ operator returns the starting index of the first match or nil if no match is found.
  • Can be used in conditional statements for decision-making.

Additional Notes

  • Supports inline matching for compact code: puts "Match!" if /Ruby/ =~ "I love Ruby."
  • Matching is case-sensitive by default; append the i flag for case-insensitive matching: /pattern/i.
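
The return value of =~ and the effect of the i flag can be seen in a short sketch. The sample sentence and the variable name text are chosen only for illustration.

```ruby
text = "I love Ruby."

# =~ returns the index where the first match starts, or nil when nothing matches.
p(/Ruby/ =~ text)     # 7
p(/Python/ =~ text)   # nil

# Matching is case-sensitive unless the i flag is given.
p(/ruby/ =~ text)     # nil
p(/ruby/i =~ text)    # 7
```

Because nil is falsy and every integer (including 0) is truthy in Ruby, the result can be used directly in an if condition.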

Example

if /Ruby/ =~ "I love Ruby programming!"
  puts "Ruby is mentioned."
end

Example Explanation

  • Outputs “Ruby is mentioned.” because the string contains “Ruby”.

Using Match Objects

What is a Match Object?

A MatchData object stores details about the match, including matched groups.

Syntax

match = string.match(/pattern/)
if match
  puts match[0]
end

Detailed Explanation

  • The match method returns a MatchData object if the pattern matches.
  • Provides access to matched groups using array indexing or named captures.
  • Useful for extracting multiple parts of a match.

Example

match = "Ruby programming".match(/Ruby/)
puts match[0] if match

Example Explanation

  • Outputs “Ruby” as the first match.

Advanced Example

match = "Hello 123 World".match(/(\d+)/)
puts "Number: #{match[1]}" if match

Advanced Example Explanation

  • Outputs “Number: 123” by capturing the digit sequence.
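
Beyond indexing, a MatchData object also exposes the captured groups as an array and the text on either side of the match. The sketch below uses a made-up input string and shows why the result should be checked for nil before use.

```ruby
m = "Order 42 shipped".match(/(\d+)/)

if m
  puts m[0]          # whole match: 42
  puts m[1]          # first capture group: 42
  p m.captures       # ["42"]
  p m.pre_match      # "Order "
  p m.post_match     # " shipped"
end

# match returns nil when the pattern is not found, so guard before indexing.
missing = "no digits here".match(/\d+/)
p missing            # nil
```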

Scanning for Matches

What is Scanning?

Scanning finds all occurrences of a pattern in a string.

Syntax

matches = string.scan(/pattern/)
matches.each { |match| puts match }

Detailed Explanation

  • The scan method returns an array of all non-overlapping matches.
  • Without capture groups, each element is the matched string; with capture groups, each element is an array of that match's captures.
  • Provides a structured way to extract multiple occurrences of data.

Example

matches = "123-456-789".scan(/\d+/)
matches.each { |num| puts num }

Example Explanation

  • Outputs 123, 456, and 789 as separate matches.

Grouped Example

matches = "(1,2), (3,4)".scan(/\((\d+),(\d+)\)/)
matches.each { |x, y| puts "Coordinates: #{x}, #{y}" }

Grouped Example Explanation

  • Outputs each pair of numbers as coordinates.
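
The shape of the array returned by scan depends on whether the pattern contains capture groups. The following sketch, using the same coordinate string, makes the difference visible; the variable names are illustrative.

```ruby
text = "(1,2), (3,4)"

# Without groups, scan returns an array of matched strings.
p text.scan(/\d+/)               # ["1", "2", "3", "4"]

# With groups, scan returns one array of captures per match.
pairs = text.scan(/\((\d+),(\d+)\)/)
p pairs                          # [["1", "2"], ["3", "4"]]

# Captures are strings; convert them before doing arithmetic.
sums = pairs.map { |x, y| x.to_i + y.to_i }
p sums                           # [3, 7]
```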

Substitution with sub and gsub

What is Substitution?

Substitution replaces matched patterns in a string.

Syntax

result = string.sub(/pattern/, 'new')
result = string.gsub(/pattern/, 'new')

Detailed Explanation

  • sub replaces the first match, while gsub replaces all matches.
  • Backreferences such as \1 in the replacement string insert the text of the corresponding captured group.

Example

puts "abc-123".sub(/\d+/, "XYZ")
puts "abc-123-456".gsub(/\d+/, "XYZ")

Example Explanation

  • sub outputs “abc-XYZ”.
  • gsub outputs “abc-XYZ-XYZ”.

Advanced Example

puts "file123.txt".gsub(/(\d+)/, '<\1>')

Advanced Example Explanation

  • Wraps the digits in angle brackets, outputting “file<123>.txt”.
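
Backreferences in replacement strings are sensitive to quoting, and gsub also accepts a block when the replacement needs to be computed. The sketch below illustrates both under the assumption of a simple file name; the variable names are made up for the example.

```ruby
name = "file123.txt"

# In a single-quoted replacement, \1 refers to the first capture group.
puts name.gsub(/(\d+)/, '<\1>')     # file<123>.txt

# In a double-quoted replacement the backslash itself must be escaped.
puts name.gsub(/(\d+)/, "<\\1>")    # file<123>.txt

# A block receives each match and returns its replacement.
puts name.gsub(/\d+/) { |digits| (digits.to_i + 1).to_s }   # file124.txt

# sub changes only the first match; gsub changes every match.
puts "a1b2c3".sub(/\d/, "_")        # a_b2c3
puts "a1b2c3".gsub(/\d/, "_")       # a_b_c_
```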

Named Captures

What are Named Captures?

Named captures label groups within a regex for easier access.

Syntax

pattern = /(?<area>\d{3})-(?<local>\d{4})/
if match = "123-4567".match(pattern)
  puts "Area: #{match[:area]}, Local: #{match[:local]}"
end

Detailed Explanation

  • Use (?<name>pattern) to define named groups.
  • Access named groups using match[:name].
  • Improves readability and maintainability of regex patterns.

Example Explanation

  • Outputs “Area: 123, Local: 4567” by extracting named groups.
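
Named groups can also be inspected as a whole, and Ruby assigns them to local variables when a regex literal appears on the left side of =~. The sketch below reuses the phone-number pattern; it is an illustration of these two features rather than a recommended style.

```ruby
pattern = /(?<area>\d{3})-(?<local>\d{4})/
m = pattern.match("123-4567")

p m.names            # ["area", "local"]
p m.named_captures   # a Hash mapping each group name to its matched text
p m[:area]           # "123"
p m["local"]         # "4567"

# With a regex literal on the left of =~, named groups become local variables.
if /(?<area>\d{3})-(?<local>\d{4})/ =~ "123-4567"
  puts "#{area} / #{local}"   # 123 / 4567
end
```

The local-variable form works only with a literal pattern written directly on the left of =~, not with a pattern stored in a variable.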

Advanced Usage

pattern = /(?<key>\w+): (?<value>\w+)/
"id: 123 name: Alice".scan(pattern) do |key, value|
  puts "#{key.capitalize}: #{value}"
end

Advanced Example Explanation

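  • Outputs “Id: 123” and “Name: Alice”. scan passes each match's captures to the block in group order, so the block parameters key and value receive the two named groups positionally.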

Real-Life Project

Project Name: Email Validator

Project Goal

Create a program to validate email addresses using regular expressions.

Code for This Project

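def valid_email?(email)
  # \A and \z anchor to the whole string; ^ and $ would only anchor to a line.
  pattern = /\A[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\z/
  # =~ returns an index or nil; !! converts that to true or false.
  !!(email =~ pattern)
end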

emails = ["test@example.com", "invalid-email", "user@domain.org"]
emails.each do |email|
  puts "#{email} is valid: #{valid_email?(email)}"
end

Steps

  1. Define a regex pattern for valid email addresses.
  2. Create a method valid_email? to check emails against the pattern.
  3. Test the method with a list of email addresses.

Expected Output

test@example.com is valid: true
invalid-email is valid: false
user@domain.org is valid: true

Project Explanation

  • Demonstrates pattern matching for validating email addresses.
  • Highlights the use of regex in real-world scenarios.
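
The choice of anchors matters when a pattern is used for validation. In Ruby, ^ and $ match at the start and end of each line, while \A and \z match only at the start and end of the whole string. The sketch below compares the two on a multi-line input; both patterns and the input are illustrative assumptions.

```ruby
# Same character classes, different anchors.
line_anchored   = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
string_anchored = /\A[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\z/

multiline_input = "not an email\nuser@example.com"

# ^ and $ match at line boundaries, so one valid line is enough.
p line_anchored.match?(multiline_input)        # true

# \A and \z match only at the start and end of the whole string.
p string_anchored.match?(multiline_input)      # false
p string_anchored.match?("user@example.com")   # true
```

A pattern like this checks only the general shape of an address; it does not confirm that the address exists or cover every form that mail systems accept.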

Insights

Ruby Regular Expressions are a powerful tool for text processing. Understanding their syntax and methods enables efficient pattern matching and string manipulation.

Key Takeaways

  • Use regex for matching, extracting, and manipulating text.
  • Leverage methods like match, scan, sub, and gsub for regex operations.
  • Write clear and maintainable regex patterns to ensure readability.
  • Test regex patterns thoroughly to handle edge cases effectively.