Let the Mac Speak for Me

The Problem

I often lose my voice temporarily due to allergies. During that time, my main mode of communication is a notepad or its electronic equivalent, Zen Brush. I was looking for a solution that could convert what I type into spoken words.

The Solution

Since I use a Mac both at home and at work, and the Mac has the say command, which is useful for this purpose, I decided to roll my own solution. It turned out to be a very simple bash script, which I named speak4me.sh and saved in my ~/bin directory:

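#!/bin/bash
# Read standard input one line at a time and speak each line.
while IFS= read -r line
do
    # Skip empty lines: there is nothing to speak
    [ -z "$line" ] && continue
    # Quote the variable so the text reaches say unmodified
    say -v Alex "$line"
done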

To make the script executable, I issued the following command:

$ chmod +x ~/bin/speak4me.sh

Using speak4me.sh

To use speak4me.sh, issue the following command from the terminal (this assumes ~/bin is in your PATH):

$ speak4me.sh

After that, start typing your message. As soon as you hit the return key, the script will “say” what you typed. You can keep on typing and hitting return. To end the script, just press Ctrl+C.

Discussion

The script is very simple, pragmatic, and free from bells and whistles, but it works the way I like it. The -v Alex part of the say command selects the voice named Alex. OS X comes with a number of voices, which you can experiment with yourself. To list the voices your system has, issue the following command:

$ say -v '?'
Agnes               en_US    # Isn't it nice to have a computer that will talk to you?
Albert              en_US    #  I have a frog in my throat. No, I mean a real frog!
Alex                en_US    # Most people recognize me by my voice.
Bad News            en_US    # The light you see at the end of the tunnel is the headlamp of a fast approaching train.
Bahh                en_US    # Do not pull the wool over my eyes.
Bells               en_US    # Time flies when you are having fun.
Boing               en_US    # Spring has sprung, fall has fell, winter's here and it's colder than usual.
Bruce               en_US    # I sure like being inside this fancy computer
Bubbles             en_US    # Pull the plug! I'm drowning!
Cellos              en_US    # Doo da doo da dum dee dee doodly doo dum dum dum doo da doo da doo da doo da doo da doo da doo
Deranged            en_US    # I need to go on a really long vacation.
Fred                en_US    # I sure like being inside this fancy computer
Good News           en_US    # Congratulations you just won the sweepstakes and you don't have to pay income tax again.
Hysterical          en_US    # Please stop tickling me!
Junior              en_US    # My favorite food is pizza.
Kathy               en_US    # Isn't it nice to have a computer that will talk to you?
Pipe Organ          en_US    # We must rejoice in this morbid voice.
Princess            en_US    # When I grow up I'm going to be a scientist.
Ralph               en_US    # The sum of the squares of the legs of a right triangle is equal to the square of the hypotenuse.
Trinoids            en_US    # We cannot communicate with these carbon units.
Vicki               en_US    # Isn't it nice to have a computer that will talk to you?
Victoria            en_US    # Isn't it nice to have a computer that will talk to you?
Whisper             en_US    # Pssssst, hey you, Yeah you, Who do ya think I'm talking to, the mouse?
Zarvox              en_US    # That looks like a peaceful planet.
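
For readers who would rather script this in Python, here is a minimal sketch of the same read-and-speak loop with the voice as a parameter, so any name from the list above can be tried. The function names build_say_command and speak_lines are my own inventions for illustration, and the run parameter exists only so the loop can be exercised without producing sound; this is not a replacement for the bash script, just one way it might look.

```python
import subprocess


def build_say_command(text, voice="Alex"):
    """Return the argument list for one call to the macOS say command."""
    return ["say", "-v", voice, text]


def speak_lines(lines, voice="Alex", run=subprocess.run):
    """Speak each non-empty line and return how many lines were spoken.

    `run` is a hypothetical hook: pass a different callable to try the
    loop without audio.
    """
    spoken = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # nothing to say for a blank line
        run(build_say_command(text, voice))
        spoken += 1
    return spoken


# Dry run: collect the commands instead of executing them.
commands = []
count = speak_lines(["Hello there\n", "\n", "I lost my voice today\n"],
                    voice="Vicki", run=commands.append)
print(count)
print(commands[0])
```

To speak for real on a Mac, the idea would be to import sys and call speak_lines(sys.stdin), which assumes the say command is on the PATH.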

Python: Making Complex Regular Expressions Easier to Read

In my last post, I shared a way to create regular expressions with embedded comments in the Tcl scripting language. It turns out that Python offers a similar feature.

The Problem

I often need to deal with complex regular expressions while scripting in Python. The problem is that the expression syntax is terse, cryptic, and hard to understand and debug. There must be a better way to deal with regular expressions; a way to add comments would be nice.

The Solution

As with my last post, I will use the same example: fishing out email addresses from a chunk of text. Below is the Python counterpart of my previous solution:

import re

if __name__ == '__main__':
    test_data = '''
            This is a bunch of text
            within it, there are some emails such as foo@bar.com
            or one@two.three.net
            What about mixed case: John.Doe@services.company.ws...
            Let see if we can extract them out
            '''
    email_pattern = r'''
            # The part before the @
            [a-z0-9._%-]+

            # The at sign itself
            @

            # The domain, not including the last dot
            [a-z0-9.-]+

            # The last dot
            \.

            # The top-level domain (TLD), which ranges from 
            # 2 to 4 characters
            [a-z]{2,4}
            '''
    # print() with parentheses runs under both Python 2 and Python 3
    print('START')
    result = re.findall(email_pattern,
                        test_data,
                        re.IGNORECASE | re.VERBOSE)
    print('\n'.join(result))
    print('END')

The output:

START
foo@bar.com
one@two.three.net
John.Doe@services.company.ws
END

Conclusion

With the re.VERBOSE flag, I can embed whitespace and comments in the regular expression, making it easier to read and understand.
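
One thing to watch for: because re.VERBOSE ignores whitespace and treats # as the start of a comment, a pattern that needs a literal space or a literal # has to escape it or put it in a character class. The sketch below shows both forms on a made-up input string; the group names and the sample text are mine, chosen only for illustration.

```python
import re

# In verbose mode, a literal space or # must be escaped or put in a class.
name_pattern = re.compile(r'''
    (?P<first>[A-Za-z]+)   # first name
    \ +                    # one or more literal spaces (escaped)
    (?P<last>[A-Za-z]+)    # last name
    [ ]\#                  # a space inside a class, then an escaped hash
    (?P<id>\d+)            # numeric id
    ''', re.VERBOSE)

match = name_pattern.search('Contact: John Doe #42')
print(match.group('first'), match.group('last'), match.group('id'))
```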

A Less AWKward For Loop

When writing an AWK script, many people who come from a C background write the for loop for accessing an array like this:

for (i=1; i <= length(myarray); i++) {
    # Do something with myarray[i]
}

There is nothing wrong with the above code: it gets the job done. However, the loop feels less AWK-like and a little … AWKward. Besides, the above pattern does not work with arrays whose indices are strings rather than numbers. A better way to write the same for loop is:

for (i in myarray) {
    # Do something with myarray[i]
}

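So, stop being AWKward and start coding your for loops the AWK way. Just keep in mind that for (i in myarray) visits the indices in an unspecified order, so stick with the counting loop when the order matters.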
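
As a rough analogy for readers who know Python better than AWK, a dict with string keys behaves much like an AWK array with string indices. The sketch below, using a made-up word_count table, shows why counting from 1 fails and iterating over the keys does not. It is only an analogy, not AWK code.

```python
# A dict stands in for an AWK associative array with string indices.
word_count = {"apple": 3, "banana": 1, "cherry": 2}

# Counting from 1 to len() finds nothing: no entry is stored under 1, 2 or 3.
missing = [i for i in range(1, len(word_count) + 1) if i not in word_count]
print(missing)

# Iterating over the keys works whatever type the keys have.
for key in word_count:
    print(key, word_count[key])
```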

Tcl: Making Complex Regular Expressions Easier to Read

The Problem

I often need to deal with complex regular expressions while scripting in Tcl (or other languages, for that matter). The problem is that the expression syntax is terse, cryptic, and hard to understand and debug. There must be a better way to deal with regular expressions; a way to add comments would be nice.

For example, below is the expression to extract email addresses:

set email_pattern {[a-z0-9._%-]+@[a-z0-9.-]+\.[a-z]{2,4}}

Below is some code that uses this pattern to extract email addresses from a block of text:

set test_data {
	This is a bunch of text
	within it, there are some emails such as foo@bar.com
	or one@two.three.net
	What about mixed case: John.Doe@services.company.ws...
	Let see if we can extract them out
}

set email_pattern {[a-z0-9._%-]+@[a-z0-9.-]+\.[a-z]{2,4}}

puts "START"
set result [regexp -inline -all -nocase $email_pattern $test_data]
puts [join $result "\n"]
puts "END"

The output:

START
foo@bar.com
one@two.three.net
John.Doe@services.company.ws
END

While this code gets the job done, the lack of documentation for the regular expression makes the code hard to debug. Don’t you wish you could include comments to make it easier to read?

The Solutions

My first instinct was to break up the regular expression into parts, then glue them together:

# The part before the @ (the local part)
set local_part {[a-z0-9._%-]+}
set domain {[a-z0-9.-]+}
set tld {\.[a-z]{2,4}}
set email_pattern ""
append email_pattern $local_part @ $domain $tld

That’s better: the code is now self-documenting, and I have broken up the long expression into manageable pieces. The drawback is that I need several variables to accomplish my goal.

After digging into Tcl’s regexp documentation, I discovered the -expanded flag, which does what I want: it allows me to add whitespace and comments to the regular expression. Now the code becomes:

set test_data {
	This is a bunch of text
	within it, there are some emails such as foo@bar.com
	or one@two.three.net
	What about mixed case: John.Doe@services.company.ws...
	Let see if we can extract them out
}

set email_pattern {
	# The part before the @
	[a-z0-9._%-]+

	# The at sign itself
	@

	# The domain, not including the last dot
	[a-z0-9.-]+

	# The last dot
	\.

	# The top-level domain (TLD), which ranges from 2 to 4 characters
	[a-z]{2,4}
}

puts "START"
set result [regexp -expanded -inline -all -nocase $email_pattern $test_data]
puts [join $result "\n"]
puts "END"

The above code accomplishes the same goal as before. While it is longer, it is better documented and easier to understand and debug. For those who code in Python, a similar feature exists: the re.VERBOSE flag.

Drobo Saved My Day for the Third Time

A few hours ago, I came home to find one of the lights on my Drobo blinking red: one of the hard drives in it had died. For those who don’t know, a Drobo is an external hard drive enclosure which houses five (in my model) 3.5-inch hard drives. It offers protection against drive failures such as the one I am currently experiencing.

In the old days, a failed hard drive meant data loss; that is how I lost several photographs and video clips of my family. After I lost my first batch of files to a hard drive failure, I got wiser and backed up my desktop’s hard drive. Sadly, when my hard drive failed the second time, I found out that my backup CDs (this was a long time ago) had also gone bad. For years, I struggled with backup solutions, none of which were reliable enough for my use. Then came Drobo.

The first time I saw this black box, I knew it should be part of my data storage and backup plan. The initial price for the empty Drobo S, my model, was nearly 700 USD, a very steep price indeed. However, the more I thought about my past incidents, the more sense the Drobo made. Finally, I bit the bullet and purchased one. Let me cut right to the point: I have experienced today’s episode twice before and have not lost a single file to hard drive failure. I believe that was the best $680 I have ever spent.

Tonight, I will replace the failed drive, knowing that my Drobo is working hard to ensure the integrity of my data. My Drobo has saved my day yet again.

How I Name My Printers

At home, we have two printers that I set up so the whole family can print. Their names were HP LaserJet xxx and Epson xxx, the defaults my OS gave them. Things were fine until my wife and kids started asking me which one was the color printer.

What? Don’t they know that the Epson printer is the color one? The truth is, people who want to print don’t care whether the printer is laser or inkjet, or made by HP or Epson. All they care about is which one can print in color and which one cannot.

Finally, I renamed them Black & White Printer and Color Printer. This makes a lot of sense to them: they don’t have to wonder which one is which. It also relieves me of answering duty.