Techalicious Academy / 2026-01-15-regex-therapy

(Visit our meetup for more great tutorials)

PRACTICE EXERCISES

Time to cement what you've learned. Each exercise builds on the previous one.

SETUP

Create a fresh practice directory:

mkdir ~/regex-workshop
cd ~/regex-workshop

EXERCISE 1: Create Test Files

Task: Create these files for the exercises.

# Create log files
echo "2024-01-15 10:30:22 INFO Server started on port 8080" > server.log
echo "2024-01-15 10:31:05 WARNING High memory usage: 85%" >> server.log
echo "2024-01-15 10:32:17 ERROR Connection refused to database" >> server.log
echo "2024-01-15 10:33:44 INFO User 'admin' logged in" >> server.log
echo "2024-01-15 10:35:02 ERROR File not found: /tmp/cache.dat" >> server.log

# Create messy filenames
touch "IMG 2024 01 15 photo.jpg"
touch "IMG 2024 01 16 sunset.jpg"
touch "IMG 2024 02 20 vacation.jpg"
touch "Document (1).txt"
touch "Document (2).txt"
touch "final_FINAL_v2.docx"

# Create a simple CSV
echo "name,email,score" > scores.csv
echo "Alice,alice@example.com,95" >> scores.csv
echo "Bob,bob@test.org,87" >> scores.csv
echo "Carol,carol@demo.net,92" >> scores.csv
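Before moving on, it's worth confirming the log file came out right. Here's a quick self-contained check (it recreates server.log in a scratch directory, so you can paste it anywhere without touching your workshop files):

```shell
# Work in a scratch directory so this check can't clobber real files
cd "$(mktemp -d)"
printf '%s\n' \
  "2024-01-15 10:30:22 INFO Server started on port 8080" \
  "2024-01-15 10:31:05 WARNING High memory usage: 85%" \
  "2024-01-15 10:32:17 ERROR Connection refused to database" \
  "2024-01-15 10:33:44 INFO User 'admin' logged in" \
  "2024-01-15 10:35:02 ERROR File not found: /tmp/cache.dat" > server.log
total=$(grep -c "" server.log)       # counts all lines; expect 5
errors=$(grep -c "ERROR" server.log) # expect 2
echo "lines=$total errors=$errors"
```

You should see lines=5 errors=2 before starting the search exercises.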

EXERCISE 2: Basic ack Searches

Task: Use ack to find things in server.log.

# Find all ERROR lines
ack "ERROR" server.log

# Find all lines with timestamps starting with 10:3
ack "10:3\d" server.log

# Find lines containing either WARNING or ERROR
ack "WARNING|ERROR" server.log

Challenge: Find lines that mention a port number.

EXERCISE 3: Case-Insensitive Search

Task: Search case-insensitively.

# Create a test file
echo "The Quick Brown Fox" > animals.txt
echo "the quick brown fox" >> animals.txt
echo "THE QUICK BROWN FOX" >> animals.txt

# Find all variations of "quick"
ack -i "quick" animals.txt

Challenge: Find all variations of "fox" as a whole word only.

EXERCISE 4: Extract Data with ack

Task: Pull specific data from server.log.

# Print only the timestamps
ack -o "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}" server.log

# Print only the port number
ack -o "port \d+" server.log

Challenge: Extract all usernames (hint: look for quoted strings).
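ack's -o ("print only the match") has a close cousin in grep's -oE, useful on machines without ack. Note that POSIX ERE has no \d shorthand, so the class is spelled [0-9]:

```shell
cd "$(mktemp -d)"
echo "2024-01-15 10:30:22 INFO Server started on port 8080" > server.log
# -o prints only the matched text, one match per line
ts=$(grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}' server.log)
echo "$ts"   # 2024-01-15 10:30:22
```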

EXERCISE 5: rename - Basic

Task: Clean up filenames (always test with -n first!).

Note: these exercises assume the Perl-based rename, packaged on some systems as prename or file-rename. The util-linux rename shipped by other distros takes plain strings, not s/// expressions.

# Preview: Replace spaces with underscores in IMG files
rename -n 's/ /_/g' IMG*

# If it looks right, do it
rename 's/ /_/g' IMG*

# Preview: Remove "(1)" and "(2)" from document names
rename -n 's/\s*\(\d+\)//' Document*

# If it looks right, do it
rename 's/\s*\(\d+\)//' Document*

Challenge: Clean up "final_FINAL_v2.docx" to just "final.docx".

EXERCISE 6: rename - Capture Groups

Task: Restructure the IMG filenames.

Current (after Exercise 5's cleanup):  IMG_2024_01_15_photo.jpg
Target:   2024-01-15_photo.jpg

# Preview the transformation
rename -n 's/IMG_(\d{4})_(\d{2})_(\d{2})_(.+)/$1-$2-$3_$4/' IMG*

# Apply it
rename 's/IMG_(\d{4})_(\d{2})_(\d{2})_(.+)/$1-$2-$3_$4/' IMG*

Challenge: Now pad the day with a leading zero if needed. (Hint: \d{2} already requires two digits, but think about how you'd handle "IMG_2024_1_5_photo.jpg".)

EXERCISE 7: Perl One-Liners - Basic

Task: Transform text with Perl.

# Convert ERROR to CRITICAL in server.log
perl -pe 's/ERROR/CRITICAL/' server.log

# Make it case-insensitive and global
perl -pe 's/error/CRITICAL/gi' server.log

# Print only lines containing ERROR
perl -ne 'print if /ERROR/' server.log

Challenge: Print lines that do NOT contain INFO.

EXERCISE 8: Perl - CSV Processing

Task: Work with the scores.csv file.

# Print only names (first column)
perl -F',' -ane 'print "$F[0]\n"' scores.csv

# Skip the header
perl -F',' -ane 'print "$F[0]\n" if $. > 1' scores.csv

# Print name and score only
perl -F',' -ane 'print "$F[0],$F[2]" if $. > 1' scores.csv

Challenge: Calculate the average score.

EXERCISE 9: Perl - Data Extraction

Task: Extract specific values.

# Extract all email addresses from scores.csv
perl -ne 'print "$1\n" if /([a-z0-9.]+@[a-z0-9.]+)/' scores.csv

# Extract the memory percentage from server.log
perl -ne 'print "$1\n" if /(\d+)%/' server.log

Challenge: Extract all numbers from server.log.

EXERCISE 10: Combined Workflow

Here's a realistic scenario: you have a batch of AI output files to process.

Setup:

mkdir ai_output
echo "PASS: Clean image" > ai_output/img001_result.txt
echo "REJECT: Nudity detected" > ai_output/img002_result.txt
echo "PASS: Clean image" > ai_output/img003_result.txt
echo "REJECT: Violence detected" > ai_output/img004_result.txt
echo "PASS: Clean image" > ai_output/img005_result.txt

Tasks:

# 1. Find all rejected images
ack "REJECT" ai_output/

# 2. List only the filenames of rejected images
ack -l "REJECT" ai_output/

# 3. Count passes vs rejects
ack -c "PASS" ai_output/
ack -c "REJECT" ai_output/

# 4. Extract rejection reasons
ack -o "REJECT: .*" ai_output/

# 5. Get just the reason part
perl -ne 'print "$1\n" if /REJECT: (.+)/' ai_output/*.txt

Challenge: Create a summary report showing counts by rejection type.

BONUS EXERCISE: The Real World

Create a script that processes AI vision results:

#!/bin/bash
# process_results.sh

echo "=== AI Results Summary ==="
echo ""

PASSES=$(ack -c "PASS" ai_output/ | perl -ne '$s += (split/:/)[1]; END{print $s}')
REJECTS=$(ack -c "REJECT" ai_output/ | perl -ne '$s += (split/:/)[1]; END{print $s}')

echo "Total PASS:   $PASSES"
echo "Total REJECT: $REJECTS"
echo ""
echo "=== Rejection Reasons ==="
perl -ne 'print "$1\n" if /REJECT: (.+)/' ai_output/*.txt | sort | uniq -c

Run with: bash process_results.sh

SOLUTIONS HINT

Stuck? Here are patterns for the challenges:

Ex 2: ack "port \d+" server.log
Ex 3: ack -iw "fox" animals.txt
Ex 4: ack -o "'[^']+'" server.log
Ex 5: rename 's/final_FINAL_v2/final/' final*
Ex 7: perl -ne 'print unless /INFO/' server.log
Ex 8: perl -F',' -ane '$s+=$F[2] if $.>1; END{print $s/3,"\n"}' scores.csv
Ex 9: perl -ne 'print "$1\n" while /(\d+)/g' server.log
Ex 10: perl -ne 'print "$1\n" if /REJECT: (.+)/' ai_output/*.txt | sort | uniq -c

WHAT'S NEXT?

You now have the fundamentals. To go deeper:

  1. Practice daily. Use regex for real tasks, not just exercises.
  2. Learn more Perl. One-liners are the gateway drug.
  3. Explore ripgrep (rg) for even faster searching.
  4. Check out sed and awk - different tools, overlapping uses.
  5. Bookmark regex101.com - great for testing patterns.
  6. Join the Techalicious sessions - we use these skills constantly.