- •Programming Ruby The Pragmatic Programmer's Guide
- •Foreword
- •Preface
- •Ruby Sparkles
- •What Kind of Language Is Ruby?
- •Is Ruby for Me?
- •Why Did We Write This Book?
- •Ruby Versions
- •Installing Ruby
- •Building Ruby
- •Running Ruby
- •Interactive Ruby
- •Ruby Programs
- •Resources
- •Acknowledgments
- •Notation Conventions
- •Roadmap
- •Ruby.New
- •Ruby Is an Object-Oriented Language
- •Some Basic Ruby
- •Arrays and Hashes
- •Control Structures
- •Regular Expressions
- •Blocks and Iterators
- •Reading and 'Riting
- •Onward and Upward
- •Classes, Objects, and Variables
- •Inheritance and Messages
- •Inheritance and Mixins
- •Objects and Attributes
- •Writable Attributes
- •Virtual Attributes
- •Class Variables and Class Methods
- •Class Variables
- •Class Methods
- •Singletons and Other Constructors
- •Access Control
- •Specifying Access Control
- •Variables
- •Containers, Blocks, and Iterators
- •Containers
- •Implementing a SongList Container
- •Blocks and Iterators
- •Implementing Iterators
- •Blocks for Transactions
- •Blocks Can Be Closures
- •Standard Types
- •Numbers
- •Strings
- •Working with Strings
- •Ranges as Sequences
- •Ranges as Conditions
- •Ranges as Intervals
- •Regular Expressions
- •Patterns
- •Anchors
- •Character Classes
- •Repetition
- •Alternation
- •Grouping
- •Pattern-Based Substitution
- •Backslash Sequences in the Substitution
- •Object-Oriented Regular Expressions
- •More About Methods
- •Defining a Method
- •Variable-Length Argument Lists
- •Methods and Blocks
- •Calling a Method
- •Expanding Arrays in Method Calls
- •Making Blocks More Dynamic
- •Collecting Hash Arguments
- •Expressions
- •Operator Expressions
- •Miscellaneous Expressions
- •Command Expansion
- •Backquotes Are Soft
- •Assignment
- •Parallel Assignment
- •Nested Assignments
- •Other Forms of Assignment
- •Conditional Execution
- •Boolean Expressions
- •Defined?, And, Or, and Not
- •If and Unless Expressions
- •If and Unless Modifiers
- •Case Expressions
- •Iterators
- •Break, Redo, and Next
- •Variable Scope and Loops
- •Exceptions, Catch, and Throw
- •The Exception Class
- •Handling Exceptions
- •Tidying Up
- •Play It Again
- •Raising Exceptions
- •Adding Information to Exceptions
- •Catch and Throw
- •Modules
- •Namespaces
- •Instance Variables in Mixins
- •Iterators and the Enumerable Module
- •Including Other Files
- •Basic Input and Output
- •What Is an io Object?
- •Opening and Closing Files
- •Reading and Writing Files
- •Iterators for Reading
- •Writing to Files
- •Talking to Networks
- •Threads and Processes
- •Multithreading
- •Creating Ruby Threads
- •Manipulating Threads
- •Thread Variables
- •Threads and Exceptions
- •Controlling the Thread Scheduler
- •Mutual Exclusion
- •The Mutex Class
- •Condition Variables
- •Running Multiple Processes
- •Spawning New Processes
- •Independent Children
- •Blocks and Subprocesses
- •When Trouble Strikes
- •Ruby Debugger
- •Interactive Ruby
- •Editor Support
- •But It Doesn't Work!
- •But It's Too Slow!
- •Create Locals Outside Blocks
- •Use the Profiler
- •Ruby and Its World
- •Command-Line Arguments
- •Command-Line Options
- •Program Termination
- •Environment Variables
- •Writing to Environment Variables
- •Where Ruby Finds Its Modules
- •Build Environment
- •Ruby and the Web
- •Writing cgi Scripts
- •Using cgi.Rb
- •Quoting
- •Creating Forms and html
- •Cookies
- •Sessions
- •Embedding Ruby in html
- •Using eruby
- •Installing eruby in Apache
- •Improving Performance
- •Ruby Tk
- •Simple Tk Application
- •Widgets
- •Setting Widget Options
- •Getting Widget Data
- •Setting/Getting Options Dynamically
- •Sample Application
- •Binding Events
- •Scrolling
- •Just One More Thing
- •Translating from Perl/Tk Documentation
- •Object Creation
- •Running Ruby Under Windows
- •Windows Automation
- •Getting and Setting Properties
- •Named Arguments
- •For each
- •An Example
- •Optimizing
- •Extending Ruby
- •Ruby Objects in c
- •Value as a Pointer
- •Value as an Immediate Object
- •Writing Ruby in c
- •Evaluating Ruby Expressions in c
- •Sharing Data Between Ruby and c
- •Directly Sharing Variables
- •Wrapping c Structures
- •An Example
- •Memory Allocation
- •Creating an Extension
- •Creating a Makefile with extconf.Rb
- •Static Linking
- •Embedding a Ruby Interpreter
- •Bridging Ruby to Other Languages
- •Ruby c Language api
- •The Ruby Language
- •Source Layout
- •Begin and end Blocks
- •General Delimited Input
- •The Basic Types
- •Integer and Floating Point Numbers
- •Strings
- •Requirements for a Hash Key
- •Symbols
- •Regular Expressions
- •Regular Expression Options
- •Regular Expression Patterns
- •Substitutions
- •Extensions
- •Variable/Method Ambiguity
- •Variables and Constants
- •Scope of Constants and Variables
- •Predefined Variables
- •Exception Information
- •Pattern Matching Variables
- •Input/Output Variables
- •Execution Environment Variables
- •Standard Objects
- •Global Constants
- •Expressions Single Terms
- •Operator Expressions
- •More on Assignment
- •Parallel Assignment
- •Block Expressions
- •Boolean Expressions
- •Truth Values
- •And, Or, Not, and Defined?
- •Comparison Operators
- •Ranges in Boolean Expressions
- •Regular Expressions in Boolean Expressions
- •While and Until Modifiers
- •Break, Redo, Next, and Retry
- •Method Definition
- •Method Arguments
- •Invoking a Method
- •Class Definition
- •Creating Objects from Classes
- •Class Attribute Declarations
- •Module Definitions
- •Mixins---Including Modules
- •Module Functions
- •Access Control
- •Blocks, Closures, and Proc Objects
- •Proc Objects
- •Exceptions
- •Raising Exceptions
- •Handling Exceptions
- •Retrying a Block
- •Catch and Throw
- •Classes and Objects
- •How Classes and Objects Interact
- •Your Basic, Everyday Object
- •Object-Specific Classes
- •Mixin Modules
- •Extending Objects
- •Class and Module Definitions
- •Class Names Are Constants
- •Inheritance and Visibility
- •Freezing Objects
- •Locking Ruby in the Safe
- •Safe Levels
- •Tainted Objects
- •Reflection, ObjectSpace, and Distributed Ruby
- •Looking at Objects
- •Looking Inside Objects
- •Looking at Classes
- •Looking Inside Classes
- •Calling Methods Dynamically
- •Performance Considerations
- •System Hooks
- •Runtime Callbacks
- •Tracing Your Program's Execution
- •How Did We Get Here?
- •Marshaling and Distributed Ruby
- •Custom Serialization Strategy
- •Distributed Ruby
- •Compile Time? Runtime? Anytime!
- •Standard Library
Pattern-Based Substitution
Sometimes finding a pattern in a string is good enough. If a friend challenges you to find a word that contains the letters a, b, c, d, and e in order, you could search a word list with the pattern /a.*b.*c.*d.*e/and find ``absconded'' and ``ambuscade.'' That has to be worth something.
However, there are times when you need to change things based on a pattern match. Let's go back to our song list file. Whoever created it entered all the artists' names in lowercase. When we display them on our jukebox's screen, they'd look better in mixed case. How can we change the first character of each word to uppercase?
The methods String#sub andString#gsub look for a portion of a string matching their first argument and replace it with their second argument.String#sub performs one replacement, whileString#gsub replaces every occurrence of the match. Both routines return a new copy of theStringcontaining the substitutions. Mutator versionsString#sub! andString#gsub! modify the original string.
a = "the quick brown fox" | ||
a.sub(/[aeiou]/, '*') |
» |
"th* quick brown fox" |
a.gsub(/[aeiou]/, '*') |
» |
"th* q**ck br*wn f*x" |
a.sub(/\s\S+/, '') |
» |
"the brown fox" |
a.gsub(/\s\S+/, '') |
» |
"the" |
The second argument to both functions can be either a Stringor a block. If a block is used, the block's value is substituted into theString.
a = "the quick brown fox" | ||
a.sub(/^./) { $&.upcase } |
» |
"The quick brown fox" |
a.gsub(/[aeiou]/) { $&.upcase } |
» |
"thE qUIck brOwn fOx" |
So, this looks like the answer to converting our artists' names. The pattern that matches the first character of a word is \b\w---look for a word boundary followed by a word character. Combine this withgsuband we can hack the artists' names.
def mixedCase(aName) | ||
aName.gsub(/\b\w/) { $&.upcase } | ||
end | ||
| ||
mixedCase("fats waller") |
» |
"Fats Waller" |
mixedCase("louis armstrong") |
» |
"Louis Armstrong" |
mixedCase("strength in numbers") |
» |
"Strength In Numbers" |
Backslash Sequences in the Substitution
Earlier we noted that the sequences \1,\2, and so on are available in the pattern, standing for thenth group matched so far. The same sequences are available in the second argument ofsubandgsub.
"fred:smith".sub(/(\w+):(\w+)/, '\2, \1') |
» |
"smith, fred" |
"nercpyitno".gsub(/(.)(.)/, '\2\1') |
» |
"encryption" |
There are additional backslash sequences that work in substitution strings: \&(last match),\+(last matched group),\`(string prior to match),\'(string after match), and\\(a literal backslash). It gets confusing if you want to include a literal backslash in a substitution. The obvious thing is to write
str.gsub(/\\/, '\\\\') |
Clearly, this code is trying to replace each backslash in strwith two. The programmer doubled up the backslashes in the replacement text, knowing that they'd be converted to ``\\'' in syntax analysis. However, when the substitution occurs, the regular expression engine performs another pass through the string, converting ``\\'' to ``\'', so the net effect is to replace each single backslash with another single backslash. You need to writegsub(/\\/, '\\\\\\\\')!
str = 'a\b\c' |
» |
"a\b\c" |
str.gsub(/\\/, '\\\\\\\\') |
» |
"a\\b\\c" |
However, using the fact that \&is replaced by the matched string, you could also write
str = 'a\b\c' |
» |
"a\b\c" |
str.gsub(/\\/, '\&\&') |
» |
"a\\b\\c" |
If you use the block form of gsub, the string for substitution is analyzed only once (during the syntax pass) and the result is what you intended.
str = 'a\b\c' |
» |
"a\b\c" |
str.gsub(/\\/) { '\\\\' } |
» |
"a\\b\\c" |
Finally, as an example of the wonderful expressiveness of combining regular expressions with code blocks, consider the following code fragment from the CGI library module, written by Wakou Aoyama. The code takes a string containing HTMLescape sequences and converts it into normal ASCII. Because it was written for a Japanese audience, it uses the ``n'' modifier on the regular expressions, which turns off wide-character processing. It also illustrates Ruby'scaseexpression, which we discuss starting on page 81.
def unescapeHTML(string) str = string.dup str.gsub!(/&(.*?);/n) { match = $1.dup case match when /\Aamp\z/ni then '&' when /\Aquot\z/ni then '"' when /\Agt\z/ni then '>' when /\Alt\z/ni then '<' when /\A#(\d+)\z/n then Integer($1).chr when /\A#x([0-9a-f]+)\z/ni then $1.hex.chr end } str end
puts unescapeHTML("1<2 && 4>3") puts unescapeHTML(""A" = A = A") |
produces:
1<2 && 4>3 "A" = A = A |