This chapter explores Ruby Regular Expressions (Regex), a powerful tool for pattern matching and text processing. Regular expressions are essential for tasks like validating input, searching, and manipulating strings. Ruby provides robust support for regular expressions, enabling efficient and concise operations on text.
Chapter Goals
- Understand the purpose and syntax of regular expressions in Ruby.
- Learn how to use regex for pattern matching, substitution, and validation.
- Explore common regex patterns and their applications.
- Implement best practices for writing and using regular expressions.
Key Characteristics of Ruby Regular Expressions
- Pattern Matching: Identify patterns within strings using regex.
- Search and Replace: Modify text based on regex matches.
- Integration: Seamlessly integrates with Ruby’s string and text-processing methods.
- Efficiency: Provides concise solutions for complex text operations.
Basic Rules for Regular Expressions
- Enclose patterns in forward slashes (/pattern/) or use %r{pattern}.
- Use regex methods like match, scan, sub, and gsub for text processing.
- Escape special characters with a backslash (\) if they need to be matched literally.
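As a rough illustration of these rules, the sketch below writes one pattern in both literal forms and shows what escaping changes. The helper name contains_literal? and the sample strings are made up for this example.

```ruby
# Hypothetical helper: true when text contains literal exactly as written.
def contains_literal?(text, literal)
  # Regexp.escape backslash-escapes characters such as "$" and ".".
  text.match?(Regexp.new(Regexp.escape(literal)))
end

# Two literal forms for the same pattern; %r{} avoids escaping the slashes.
slash_form   = /https:\/\/example\.com\/docs/
percent_form = %r{https://example\.com/docs}

url = "See https://example.com/docs for details."
puts slash_form.match?(url)     # true
puts percent_form.match?(url)   # true

# An unescaped dot matches any character; an escaped dot matches only ".".
puts "exampleXcom".match?(/example.com/)    # true
puts "exampleXcom".match?(/example\.com/)   # false

puts contains_literal?("Total: $4.99", "$4.99")   # true
puts contains_literal?("Total: $4x99", "$4.99")   # false
```

The %r{} form tends to read better when the pattern itself contains forward slashes, as in URLs and file paths.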
Best Practices
- Write clear and concise regex patterns for readability.
- Use named captures for clarity when extracting groups.
- Test regex patterns on various inputs to ensure reliability.
- Document complex regex patterns for maintainability.
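One possible way to combine these practices is Ruby's extended mode (the x flag), which lets a pattern carry its own comments. The sketch below assumes a made-up date format and a hypothetical parse_date helper, then runs the pattern against inputs that should and should not match.

```ruby
# Hypothetical date parser, shown only to illustrate the practices above.
def parse_date(text)
  # The x flag ignores whitespace in the pattern and allows # comments.
  pattern = /
    \A
    (?<year>\d{4})  -   # four-digit year
    (?<month>\d{2}) -   # two-digit month
    (?<day>\d{2})       # two-digit day
    \z
  /x
  match = pattern.match(text)
  match && match.named_captures   # hash of group names, or nil
end

# Try inputs that should match and inputs that should not.
["2024-03-15", "2024-3-15", "15-03-2024"].each do |sample|
  puts "#{sample}: #{parse_date(sample).inspect}"
end
```

Only the first sample matches here; the other two return nil because the digit counts do not fit the pattern.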
Syntax Table
Serial No | Method/Pattern | Syntax/Example | Description |
1 | Match | /pattern/ =~ string | Returns the index of the first match, or nil if there is none. |
2 | Match Object | string.match(/pattern/) | Returns a MatchData object for matches. |
3 | Scan | string.scan(/pattern/) | Returns all matches in an array. |
4 | Substitute | string.sub(/pattern/, 'new') | Replaces the first match in the string. |
5 | Global Substitute | string.gsub(/pattern/, 'new') | Replaces all matches in the string. |
6 | Named Captures | /(?<name>pattern)/ | Names a capturing group for clarity. |
Syntax Explanation
Match with Regular Expressions
What is Matching?
Matching checks if a string contains a substring that fits a regex pattern.
Syntax
if /pattern/ =~ string
  puts "Match found!"
end
Detailed Explanation
- The =~ operator returns the starting index of the first match or nil if no match is found.
- Can be used in conditional statements for decision-making.
Additional Notes
- Supports inline matching for compact code: puts "Match!" if /Ruby/ =~ "I love Ruby."
- Matching is case-sensitive by default; add the i flag to ignore case: /pattern/i.
Example
if /Ruby/ =~ "I love Ruby programming!"
  puts "Ruby is mentioned."
end
Example Explanation
- Outputs “Ruby is mentioned.” because the string contains “Ruby”.
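To make the return value of =~ and the effect of the i flag concrete, here is a small sketch. The helper name ruby_position is invented for this example.

```ruby
# Hypothetical helper: where does "ruby" first appear, ignoring case?
def ruby_position(text)
  /ruby/i =~ text   # index of the first match, or nil
end

puts ruby_position("I love Ruby programming!").inspect   # 7
puts ruby_position("I love RUBY").inspect                # 7
puts ruby_position("I love Python").inspect              # nil

# match? answers true or false when the index is not needed.
puts "I love Ruby".match?(/ruby/)    # false, case-sensitive by default
puts "I love Ruby".match?(/ruby/i)   # true
```

Because 0 is truthy in Ruby, a match at the very start of the string still passes an if test.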
Using Match Objects
What is a Match Object?
A MatchData object stores details about the match, including matched groups.
Syntax
match = string.match(/pattern/)
if match
  puts match[0]
end
Detailed Explanation
- The match method returns a MatchData object if the pattern matches.
- Provides access to matched groups using array indexing or named captures.
- Useful for extracting multiple parts of a match.
Example
match = "Ruby programming".match(/Ruby/)
puts match[0] if match
Example Explanation
- Outputs “Ruby”; match[0] holds the full text of the match.
Advanced Example
match = "Hello 123 World".match(/(\d+)/)
puts "Number: #{match[1]}" if match
Advanced Example Explanation
- Outputs “Number: 123” by capturing the digit sequence.
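Beyond indexing, a MatchData object exposes a few other accessors. The sketch below collects some of them in a hash; the helper name order_details and the sample log line are assumptions made for illustration.

```ruby
# Hypothetical helper that pulls the parts of a match out of MatchData.
def order_details(line)
  match = line.match(/Order (\d+) shipped to (\w+)/)
  return nil unless match   # match returns nil when nothing matches

  {
    full:     match[0],          # entire matched text
    captures: match.captures,    # groups only, as an array of strings
    before:   match.pre_match,   # text before the match
    after:    match.post_match,  # text after the match
    id_start: match.begin(1)     # index where group 1 starts
  }
end

details = order_details("Log: Order 42 shipped to Berlin today")
puts details[:full]       # Order 42 shipped to Berlin
puts details[:id_start]   # 11
```

Checking for nil before using the result avoids a NoMethodError when the input does not match.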
Scanning for Matches
What is Scanning?
Scanning finds all occurrences of a pattern in a string.
Syntax
matches = string.scan(/pattern/)
matches.each { |match| puts match }
Detailed Explanation
- The scan method returns an array of all matches.
- Returns an array of strings for a pattern without groups, and an array of arrays (one inner array of captures per match) for a pattern with groups.
- Provides a structured way to extract multiple occurrences of data.
Example
matches = "123-456-789".scan(/\d+/)
matches.each { |num| puts num }
Example Explanation
- Outputs 123, 456, and 789 as separate matches.
Grouped Example
matches = "(1,2), (3,4)".scan(/\((\d+),(\d+)\)/)
matches.each { |x, y| puts "Coordinates: #{x}, #{y}" }
Grouped Example Explanation
- Outputs each pair of numbers as coordinates.
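The shape of the array that scan returns depends on whether the pattern has groups. The following sketch shows both shapes and converts the captured strings to integers; parse_points is a hypothetical helper name.

```ruby
# Hypothetical helper: turn "(1,2), (3,4)" into [[1, 2], [3, 4]].
def parse_points(text)
  # With groups, scan returns one inner array of captures per match.
  text.scan(/\((\d+),(\d+)\)/).map { |x, y| [x.to_i, y.to_i] }
end

text = "(1,2), (3,4)"
p text.scan(/\d+/)                # ["1", "2", "3", "4"]
p text.scan(/\((\d+),(\d+)\)/)    # [["1", "2"], ["3", "4"]]
p parse_points(text)              # [[1, 2], [3, 4]]
```

Captures are always strings, so a conversion such as to_i is needed before doing arithmetic on them.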
Substitution with sub and gsub
What is Substitution?
Substitution replaces matched patterns in a string.
Syntax
result = string.sub(/pattern/, 'new')
result = string.gsub(/pattern/, 'new')
Detailed Explanation
- sub replaces the first match, while gsub replaces all matches.
- Can use backreferences such as \1 (in a single-quoted replacement string) to include matched groups in the replacement.
Example
puts "abc-123".sub(/\d+/, "XYZ")
puts "abc-123-456".gsub(/\d+/, "XYZ")
Example Explanation
- sub outputs “abc-XYZ”.
- gsub outputs “abc-XYZ-XYZ”.
Advanced Example
puts "file123.txt".gsub(/(\d+)/, '<\1>')
Advanced Example Explanation
- Wraps the digits in angle brackets, outputting “file<123>.txt”.
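Replacements can also reorder groups, refer to groups by name, or be computed in a block. The sketch below shows one example of each; the three helper names and the sample strings are invented for illustration.

```ruby
# Reorder groups with numbered backreferences (\1, \2, \3).
def to_day_first(date)
  date.sub(/(\d{4})-(\d{2})-(\d{2})/, '\3/\2/\1')
end

# Refer to named groups in the replacement with \k<name>.
def first_name_first(name)
  name.sub(/(?<last>\w+), (?<first>\w+)/, '\k<first> \k<last>')
end

# Compute each replacement in a block instead of using a fixed string.
def double_numbers(text)
  text.gsub(/\d+/) { |digits| (digits.to_i * 2).to_s }
end

puts to_day_first("2024-03-15")              # 15/03/2024
puts first_name_first("Doe, Jane")           # Jane Doe
puts double_numbers("2 apples and 3 pears")  # 4 apples and 6 pears
```

Single-quoted replacement strings keep the backslash intact; in a double-quoted string the same backreference would have to be written as "\\1".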
Named Captures
What are Named Captures?
Named captures label groups within a regex for easier access.
Syntax
pattern = /(?<area>\d{3})-(?<local>\d{4})/
if match = "123-4567".match(pattern)
  puts "Area: #{match[:area]}, Local: #{match[:local]}"
end
Detailed Explanation
- Use (?<name>pattern) to define named groups.
- Access named groups using match[:name].
- Improves readability and maintainability of regex patterns.
Example Explanation
- Outputs “Area: 123, Local: 4567” by extracting named groups.
Advanced Usage
pattern = /(?<key>\w+): (?<value>\w+)/
"id: 123 name: Alice".scan(pattern) do |key, value|
  puts "#{key.capitalize}: #{value}"
end
Advanced Example Explanation
- Outputs “Id: 123” and “Name: Alice”; scan passes the captured groups to the block in the order they appear in the pattern.
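Two further ways to read named groups are sketched below: MatchData#named_captures, which returns a hash, and the =~ operator with a regex literal on its left side, which assigns each named group to a local variable. The helper names phone_parts and phone_summary are hypothetical.

```ruby
# Hypothetical helper: return the named groups as a hash, or nil.
def phone_parts(text)
  match = /(?<area>\d{3})-(?<local>\d{4})/.match(text)
  match ? match.named_captures : nil
end

# Hypothetical helper: read the named groups as local variables.
def phone_summary(text)
  # With a regex literal on the left of =~, each named group
  # becomes a local variable of the same name.
  if /(?<area>\d{3})-(?<local>\d{4})/ =~ text
    "Area #{area}, local #{local}"
  else
    "no phone number"
  end
end

p phone_parts("Call 123-4567 today")
puts phone_summary("Call 123-4567 today")   # Area 123, local 4567
puts phone_summary("no digits here")        # no phone number
```

The local-variable form works only when the regex literal is on the left of =~ and contains no interpolation; a pattern stored in a variable does not assign locals.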
Real-Life Project
Project Name: Email Validator
Project Goal
Create a program to validate email addresses using regular expressions.
Code for This Project
def valid_email?(email)
  # \A and \z anchor the pattern to the whole string; ^ and $ would only anchor to a line.
  pattern = /\A[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\z/
  pattern.match?(email)
end

emails = ["test@example.com", "invalid-email", "user@domain.org"]
emails.each do |email|
  puts "#{email} is valid: #{valid_email?(email)}"
end
Steps
- Define a regex pattern for valid email addresses.
- Create a method valid_email? to check emails against the pattern.
- Test the method with a list of email addresses.
Expected Output
test@example.com is valid: true
invalid-email is valid: false
user@domain.org is valid: true
Project Explanation
- Demonstrates pattern matching for validating email addresses.
- Highlights the use of regex in real-world scenarios.
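One detail worth checking in any validator is the choice of anchors. In Ruby, ^ and $ match at line boundaries, while \A and \z match only at the start and end of the whole string. The sketch below compares the two on a multi-line input; both method names are made up for this comparison, and the character classes are the ones used in the project.

```ruby
# Anchored with ^ and $: a match on any single line is enough.
def line_anchored_email?(email)
  /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.match?(email)
end

# Anchored with \A and \z: the entire string must fit the pattern.
def string_anchored_email?(email)
  /\A[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\z/.match?(email)
end

tricky = "user@example.com\nsecond line"
puts line_anchored_email?(tricky)     # true
puts string_anchored_email?(tricky)   # false

puts string_anchored_email?("user@example.com")   # true
```

For validation, the string anchors are usually the safer choice, since they reject input that merely contains a valid-looking line.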
Insights
Ruby Regular Expressions are a powerful tool for text processing. Understanding their syntax and methods enables efficient pattern matching and string manipulation.
Key Takeaways
- Use regex for matching, extracting, and manipulating text.
- Leverage methods like match, scan, sub, and gsub for regex operations.
- Write clear and maintainable regex patterns to ensure readability.
- Test regex patterns thoroughly to handle edge cases effectively.