Database Articles
  Home arrow Database Articles arrow Pattern Matching with Strings
Codewalker Forums 
  Tutorials  
Database Articles  
Miscellaneous  
Navigation Usability  
PEAR Articles  
Programming Basics  
Server Administration  
XML Tutorials  
  Reviews  
Database Book Reviews  
Linux Book Reviews  
Miscellaneous Reviews  
PHP Book Reviews  
PHP Software Reviews  
Server Admin Reviews  
SQL Tool Reviews  
  Code Gallery  
Content Management Code  
Contest Code  
Counters Code  
Database Code  
Date Time Code  
Discussion Board Code  
Email Code  
File Manipulation Code  
GUI Code  
Link Farm Code  
Miscellaneous Code  
Search Code  
Site Navigation Code  
User Management Code  
Mobile Linux 
App Generation ROI 
IBM® developerWorks 
Download TestComplete 
Forums Sitemap 
Weekly Newsletter 
 
Developer Updates  
Free Website Content 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid 
Request Media Kit
Contact Us 
Site Map 
Privacy Policy 
Support 
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
DATABASE ARTICLES

Pattern Matching with Strings
By: O'Reilly Media
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: 5 stars5 stars5 stars5 stars5 stars / 2
    2007-09-13

    Table of Contents:
  • Pattern Matching with Strings
  • Pattern Matching with Regular Expressions continued
  • 5.12 Controlling Case Sensitivity in Pattern Matching
  • 5.13 Breaking Apart or Combining Strings
  • Breaking Apart or Combining Strings

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      Del.ici.ous Digg
      Blink Simpy
      Google Spurl
      Y! MyWeb Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article
     
     
    ADVERTISEMENT


    Pattern Matching with Strings


    (Page 1 of 5 )

    In this fourth part of a series of articles that explain how to work with strings and string data, you'll learn about pattern matching and more. It is excerpted from chapter five of the MySQL Cookbook, Second Edition, written by Paul DuBois (O'Reilly; ISBN: 059652708X). Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

    5.11  Pattern Matching with Regular Expressions

    Problem

    You want to data type perform a pattern match rather than a literal comparison.

    Solution

    Use the REGEXP operator and a regular expression pattern, described in this section. Or use an SQL pattern, described in Recipe 5.10.

    Discussion

    SQL patterns (see Recipe 5.10 ) are likely to be implemented by other database systems,so they’re reasonably portable beyond MySQL. On the other hand, they’re somewhat limited. For example, you can easily write an SQL pattern %abc% to find strings that contain abc, but you cannot write a single SQL pattern to identify strings that contain any of the characters a, b, or c. Nor can you match string content based on character types such as letters or digits. For such operations, MySQL supports another type of pattern matching operation based on regular expressions and the REGEXP operator (or NOT REGEXP to reverse the sense of the match). REGEXP matching uses the pattern elements shown in the following table.

    Pattern

    What the pattern matches

    ^

    Beginning of string

    $

    End of string

    .

    Any single character

    [...]

    Any character listed between the square brackets

    [^...]

    Any character not listed between the square brackets

    p1|p2|p3

    Alternation; matches any of the patterns p1, p2, or p3

    *

    Zero or more instances of preceding element

    +

    One or more instances of preceding element

    {n}

    ninstances of preceding element

    {m,n}

    mthrough ninstances of preceding element

    You may already be familiar with these regular expression pattern characters, because many of them are the same as those used by vi, grep, sed, and other Unix utilities that support regular expressions. Most of them are used also in the regular expressions understood by programming languages. ( Chapter 10 discusses the use of pattern matching in programs for data validation and transformation.)

    Recipe 5.10 showed how to use SQL patterns to match substrings at the beginning or end of a string, or at an arbitrary or specific position within a string. You can do the same things with regular expressions:

    • Strings that begin with a particular substring:

        mysql> SELECT name FROM metal WHERE name REGEXP '^co';
       
      +--------+
        | name   |
        +--------+
        | copper |
        +--------+ 
    • Strings that end with a particular substring:

        mysql> SELECT name FROM metal WHERE name REGEXP 'er$';
       
      +--------+
        | name   |
        +--------+
        | copper |
        | silver |
        +--------+
       
    • Strings that contain a particular substring at any position:

        mysql> SELECT name FROM metal WHERE name REGEXP 'er';
       
      +---------+
        | name    |
        +---------+
        | copper  |
        | mercury |
        | silver  |
        +---------+

       
    • Strings that contain a particular substring at a specific position:

        mysql> SELECT name FROM metal WHERE name REGEXP '^..pp';
       
      +--------+
        | name   |
        +--------+
        | copper |
        +--------+

    In addition, regular expressions have other capabilities and can perform kinds of matches that SQL patterns cannot. For example, regular expressions can contain character classes, which match any character in the class:

    • To write a character class, use square brackets and list the characters you want the class to match inside the brackets. Thus, the pattern [abc] matches either a, b, or c.
    • Classes may indicate ranges of characters by using a dash between the beginning and end of the range. [a-z] matches any letter, [0-9] matches digits, and [a-z0-9] matches letters or digits.
    1. To negate a character class (“match any character but these”), begin the list with a ^ character. For example, [^0-9] matches anything but digits.

    MySQL’s regular-expression capabilities also support POSIX character classes. These match specific character sets, as described in the following table.

    POSIX class

    What the class matches

    [:alnum:]

    Alphabetic and numeric characters

    [:alpha:]

    Alphabetic characters

    [:blank:]

    Whitespace (space or tab characters)

    [:cntrl:]

    Control characters

    [:digit:]

    Digits

    [:graph:]

    Graphic (nonblank) characters

    [:lower:]

    Lowercase alphabetic characters

    [:print:]

    Graphic or space characters

    [:punct:]

    Punctuation characters

    [:space:]

    Space, tab, newline, carriage return

    [:upper:]

    Uppercase alphabetic characters

    [:xdigit:]

    Hexadecimal digits (0-9, a-f, A-F)

     

    POSIX classes are intended for use within character classes, so you use them within square brackets. The following expression matches values that contain any hexadecimal digit character:

      mysql> SELECT name, name REGEXP '[[:xdigit:]]' FROM metal;

     

    name

     name REGEXP '[[:xdigit:]]' 

     

    copper

     

     gold

     

     iron

     

    0

    lead

     

    mercury

     

    platinum 

    silver

     

     

     tin

     

     

     


    Regular expressions can contain alternations. The syntax looks like this:

      alternative1|alternative2|...

    An alternation is similar to a character class in the sense that it matches if any of the alternatives match. But unlike a character class, the alternatives are not limited to single characters. They can be multiple-character strings or even patterns. The following alternation matches strings that begin with a vowel or end with er:

      mysql> SELECT name FROM metal WHERE name REGEXP '^[aeiou]|er$';
     
    +--------+
      | name   |
      +--------+
      | copper |
      | iron   |
      | silver |
      +--------+

    Parentheses can be used to group alternations. For example, if you want to match strings that consist entirely of digits or entirely of letters, you might try this pattern, using an alternation:

      mysql> SELECT '0m' REGEXP '^[[:digit:]]+|[[:alpha:]]+$';

     

    '0m' REGEXP '^[[:digit:]]+|[[:alpha:]]+$'

     

     

    However, as the query result shows, the pattern doesn’t work. That’s because the ^ groups with the first alternative, and the $ groups with the second alternative. So the pattern actually matches strings that begin with one or more digits, or strings that end with one or more letters. If you group the alternatives within parentheses, the ^ and $ apply to both of them, and the pattern acts as you expect:

      mysql> SELECT '0m' REGEXP '^([[:digit:]]+|[[:alpha:]]+)$';

     

     '0m' REGEXP '^([[:digit:]]+|[[:alpha:]]+)$'

     

     0

     


    Unlike SQL pattern matches, which are successful only if the pattern matches the entire comparison value, regular expressions are successful if the pattern matches anywhere within the value. The following two pattern matches are equivalent in the sense that each one succeeds only for strings that contain a b character, but the first is more efficient because the pattern is simpler:

      'abc' REGEXP 'b'
      'abc' REGEXP '^.*b.*$'

    More Database Articles Articles
    More By O'Reilly Media


     

    Buy this book now. This article is excerpted from chapter five of the MySQL Cookbook, Second Edition, written by Paul DuBois (O'Reilly; ISBN: 059652708X). Check it out today at your favorite bookstore. Buy this book now.

    DATABASE ARTICLES ARTICLES

    - More on Query Optimization for Oracle Databa...
    - Query Optimization in Oracle
    - Clusters and Other Data Structures for Oracle
    - Using Indexes with an Oracle Database
    - The Basics of Data Structures in Oracle
    - Oracle Data Structures
    - Best Practices for PL/SQL Variables
    - What`s Code Without Variables?
    - Clauses, Sorting, and SQL Queries
    - The From Clause and SQL Queries
    - Query Primer
    - Full Text Searches and Strings
    - Searching with Strings
    - Pattern Matching with Strings
    - Working with Cases of Strings





    © 2003-2009 by Developer Shed. All rights reserved. DS Cluster 6 Hosted by Hostway
    Stay green...Green IT