- •Programming Ruby The Pragmatic Programmer's Guide
- •Foreword
- •Preface
- •Ruby Sparkles
- •What Kind of Language Is Ruby?
- •Is Ruby for Me?
- •Why Did We Write This Book?
- •Ruby Versions
- •Installing Ruby
- •Building Ruby
- •Running Ruby
- •Interactive Ruby
- •Ruby Programs
- •Resources
- •Acknowledgments
- •Notation Conventions
- •Roadmap
- •Ruby.New
- •Ruby Is an Object-Oriented Language
- •Some Basic Ruby
- •Arrays and Hashes
- •Control Structures
- •Regular Expressions
- •Blocks and Iterators
- •Reading and 'Riting
- •Onward and Upward
- •Classes, Objects, and Variables
- •Inheritance and Messages
- •Inheritance and Mixins
- •Objects and Attributes
- •Writable Attributes
- •Virtual Attributes
- •Class Variables and Class Methods
- •Class Variables
- •Class Methods
- •Singletons and Other Constructors
- •Access Control
- •Specifying Access Control
- •Variables
- •Containers, Blocks, and Iterators
- •Containers
- •Implementing a SongList Container
- •Blocks and Iterators
- •Implementing Iterators
- •Blocks for Transactions
- •Blocks Can Be Closures
- •Standard Types
- •Numbers
- •Strings
- •Working with Strings
- •Ranges as Sequences
- •Ranges as Conditions
- •Ranges as Intervals
- •Regular Expressions
- •Patterns
- •Anchors
- •Character Classes
- •Repetition
- •Alternation
- •Grouping
- •Pattern-Based Substitution
- •Backslash Sequences in the Substitution
- •Object-Oriented Regular Expressions
- •More About Methods
- •Defining a Method
- •Variable-Length Argument Lists
- •Methods and Blocks
- •Calling a Method
- •Expanding Arrays in Method Calls
- •Making Blocks More Dynamic
- •Collecting Hash Arguments
- •Expressions
- •Operator Expressions
- •Miscellaneous Expressions
- •Command Expansion
- •Backquotes Are Soft
- •Assignment
- •Parallel Assignment
- •Nested Assignments
- •Other Forms of Assignment
- •Conditional Execution
- •Boolean Expressions
- •Defined?, And, Or, and Not
- •If and Unless Expressions
- •If and Unless Modifiers
- •Case Expressions
- •Iterators
- •Break, Redo, and Next
- •Variable Scope and Loops
- •Exceptions, Catch, and Throw
- •The Exception Class
- •Handling Exceptions
- •Tidying Up
- •Play It Again
- •Raising Exceptions
- •Adding Information to Exceptions
- •Catch and Throw
- •Modules
- •Namespaces
- •Instance Variables in Mixins
- •Iterators and the Enumerable Module
- •Including Other Files
- •Basic Input and Output
- •What Is an io Object?
- •Opening and Closing Files
- •Reading and Writing Files
- •Iterators for Reading
- •Writing to Files
- •Talking to Networks
- •Threads and Processes
- •Multithreading
- •Creating Ruby Threads
- •Manipulating Threads
- •Thread Variables
- •Threads and Exceptions
- •Controlling the Thread Scheduler
- •Mutual Exclusion
- •The Mutex Class
- •Condition Variables
- •Running Multiple Processes
- •Spawning New Processes
- •Independent Children
- •Blocks and Subprocesses
- •When Trouble Strikes
- •Ruby Debugger
- •Interactive Ruby
- •Editor Support
- •But It Doesn't Work!
- •But It's Too Slow!
- •Create Locals Outside Blocks
- •Use the Profiler
- •Ruby and Its World
- •Command-Line Arguments
- •Command-Line Options
- •Program Termination
- •Environment Variables
- •Writing to Environment Variables
- •Where Ruby Finds Its Modules
- •Build Environment
- •Ruby and the Web
- •Writing cgi Scripts
- •Using cgi.Rb
- •Quoting
- •Creating Forms and html
- •Cookies
- •Sessions
- •Embedding Ruby in html
- •Using eruby
- •Installing eruby in Apache
- •Improving Performance
- •Ruby Tk
- •Simple Tk Application
- •Widgets
- •Setting Widget Options
- •Getting Widget Data
- •Setting/Getting Options Dynamically
- •Sample Application
- •Binding Events
- •Scrolling
- •Just One More Thing
- •Translating from Perl/Tk Documentation
- •Object Creation
- •Running Ruby Under Windows
- •Windows Automation
- •Getting and Setting Properties
- •Named Arguments
- •For each
- •An Example
- •Optimizing
- •Extending Ruby
- •Ruby Objects in c
- •Value as a Pointer
- •Value as an Immediate Object
- •Writing Ruby in c
- •Evaluating Ruby Expressions in c
- •Sharing Data Between Ruby and c
- •Directly Sharing Variables
- •Wrapping c Structures
- •An Example
- •Memory Allocation
- •Creating an Extension
- •Creating a Makefile with extconf.Rb
- •Static Linking
- •Embedding a Ruby Interpreter
- •Bridging Ruby to Other Languages
- •Ruby c Language api
- •The Ruby Language
- •Source Layout
- •Begin and end Blocks
- •General Delimited Input
- •The Basic Types
- •Integer and Floating Point Numbers
- •Strings
- •Requirements for a Hash Key
- •Symbols
- •Regular Expressions
- •Regular Expression Options
- •Regular Expression Patterns
- •Substitutions
- •Extensions
- •Variable/Method Ambiguity
- •Variables and Constants
- •Scope of Constants and Variables
- •Predefined Variables
- •Exception Information
- •Pattern Matching Variables
- •Input/Output Variables
- •Execution Environment Variables
- •Standard Objects
- •Global Constants
- •Expressions Single Terms
- •Operator Expressions
- •More on Assignment
- •Parallel Assignment
- •Block Expressions
- •Boolean Expressions
- •Truth Values
- •And, Or, Not, and Defined?
- •Comparison Operators
- •Ranges in Boolean Expressions
- •Regular Expressions in Boolean Expressions
- •While and Until Modifiers
- •Break, Redo, Next, and Retry
- •Method Definition
- •Method Arguments
- •Invoking a Method
- •Class Definition
- •Creating Objects from Classes
- •Class Attribute Declarations
- •Module Definitions
- •Mixins---Including Modules
- •Module Functions
- •Access Control
- •Blocks, Closures, and Proc Objects
- •Proc Objects
- •Exceptions
- •Raising Exceptions
- •Handling Exceptions
- •Retrying a Block
- •Catch and Throw
- •Classes and Objects
- •How Classes and Objects Interact
- •Your Basic, Everyday Object
- •Object-Specific Classes
- •Mixin Modules
- •Extending Objects
- •Class and Module Definitions
- •Class Names Are Constants
- •Inheritance and Visibility
- •Freezing Objects
- •Locking Ruby in the Safe
- •Safe Levels
- •Tainted Objects
- •Reflection, ObjectSpace, and Distributed Ruby
- •Looking at Objects
- •Looking Inside Objects
- •Looking at Classes
- •Looking Inside Classes
- •Calling Methods Dynamically
- •Performance Considerations
- •System Hooks
- •Runtime Callbacks
- •Tracing Your Program's Execution
- •How Did We Get Here?
- •Marshaling and Distributed Ruby
- •Custom Serialization Strategy
- •Distributed Ruby
- •Compile Time? Runtime? Anytime!
- •Standard Library
Repetition
When we specified the pattern that split the song list line, /\s*\|\s*/, we said we wanted to match a vertical bar surrounded by an arbitrary amount of whitespace. We now know that the\ssequences match a single whitespace character, so it seems likely that the asterisks somehow mean ``an arbitrary amount.'' In fact, the asterisk is one of a number of modifiers that allow you to match multiple occurrences of a pattern.
If rstands for the immediately preceding regular expression within a pattern, then:
r* |
matches zero or more occurrences of r. |
r+ |
matches one or more occurrences of r. |
r? |
matches zero or one occurrence of r. |
r{m,n} |
matches at least ``m'' and at most ``n'' occurrences of r. |
r{m,} |
matches at least ``m'' occurrences of r. |
These repetition constructs have a high precedence---they bind only to the immediately preceding regular expression in the pattern. /ab+/matches an ``a'' followed by one or more ``b''s, not a sequence of ``ab''s. You have to be careful with the*construct too---the pattern /a*/ will match any string; every string has zero or more ``a''s.
These patterns are called greedy, because by default they will match as much of the string as they can. You can alter this behavior, and have them match the minimum, by adding a question mark suffix.
a = "The moon is made of cheese" | ||
showRE(a, /\w+/) |
» |
<<The>> moon is made of cheese |
showRE(a, /\s.*\s/) |
» |
The<< moon is made of >>cheese |
showRE(a, /\s.*?\s/) |
» |
The<< moon >>is made of cheese |
showRE(a, /[aeiou]{2,99}/) |
» |
The m<<oo>>n is made of cheese |
showRE(a, /mo?o/) |
» |
The <<moo>>n is made of cheese |
Alternation
We know that the vertical bar is special, because our line splitting pattern had to escape it with a backslash. That's because an unescaped vertical bar ``|'' matches either the regular expression that precedes it or the regular expression that follows it.
a = "red ball blue sky" | ||
showRE(a, /d|e/) |
» |
r<<e>>d ball blue sky |
showRE(a, /al|lu/) |
» |
red b<<al>>l blue sky |
showRE(a, /red ball|angry sky/) |
» |
<<red ball>> blue sky |
There's a trap for the unwary here, as ``|'' has a very low precedence. The last example above matches ``red ball'' or ``angry sky'', not ``red ball sky'' or ``red angry sky''. To match ``red ball sky'' or ``red angry sky'', you'd need to override the default precedence using grouping.
Grouping
You can use parentheses to group terms within a regular expression. Everything within the group is treated as a single regular expression.
showRE('banana', /an*/) |
» |
b<<an>>ana |
showRE('banana', /(an)*/) |
» |
<<>>banana |
showRE('banana', /(an)+/) |
» |
b<<anan>>a |
a = 'red ball blue sky' | ||
showRE(a, /blue|red/) |
» |
<<red>> ball blue sky |
showRE(a, /(blue|red) \w+/) |
» |
<<red ball>> blue sky |
showRE(a, /(red|blue) \w+/) |
» |
<<red ball>> blue sky |
showRE(a, /red|blue \w+/) |
» |
<<red>> ball blue sky |
showRE(a, /red (ball|angry) sky/) |
» |
no match |
a = 'the red angry sky' | ||
showRE(a, /red (ball|angry) sky/) |
» |
the <<red angry sky>> |
Parentheses are also used to collect the results of pattern matching. Ruby counts opening parentheses, and for each stores the result of the partial match between it and the corresponding closing parenthesis. You can use this partial match both within the remainder of the pattern and in your Ruby program. Within the pattern, the sequence \1refers to the match of the first group,\2the second group, and so on. Outside the pattern, the special variables$1,$2, and so on, serve the same purpose.
"12:50am" =~ /(\d\d):(\d\d)(..)/ |
» |
0 |
"Hour is #$1, minute #$2" |
» |
"Hour is 12, minute 50" |
"12:50am" =~ /((\d\d):(\d\d))(..)/ |
» |
0 |
"Time is #$1" |
» |
"Time is 12:50" |
"Hour is #$2, minute #$3" |
» |
"Hour is 12, minute 50" |
"AM/PM is #$4" |
» |
"AM/PM is am" |
The ability to use part of the current match later in that match allows you to look for various forms of repetition.
# match duplicated letter | ||
showRE('He said "Hello"', /(\w)\1/) |
» |
He said "He<<ll>>o" |
# match duplicated substrings | ||
showRE('Mississippi', /(\w+)\1/) |
» |
M<<ississ>>ippi |
You can also use back references to match delimiters.
showRE('He said "Hello"', /(["']).*?\1/) |
» |
He said <<"Hello">> |
showRE("He said 'Hello'", /(["']).*?\1/) |
» |
He said <<'Hello'>> |