Written by Mark Pringle | Last Updated on Tuesday, January 10, 2023
About 18 years ago, I built an online poetry community that is still very popular today. This community has the most features and tools of any poetry community worldwide. However, the tool I am most proud of is its syllable counter. This syllable counter counts the syllables in single words or paragraphs and was built entirely using the C# programming language.
How Does it Work?
First, I split the submitted paragraphs or sentences into individual words and put them in an array.
string[] word_partsDB = Regex.Split(words, " ");
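Splitting on a single literal space leaves empty entries wherever the text contains double spaces, and it keeps punctuation and line breaks attached to the words. Below is a minimal sketch of a more forgiving split, assuming it is acceptable to split on any run of whitespace and to trim common punctuation. The `Tokenizer` class and `SplitWords` method are hypothetical names used for illustration and are not part of the original program.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class Tokenizer
{
    // Hypothetical helper: splits text on runs of whitespace, trims
    // surrounding punctuation, lowercases, and drops empty entries.
    public static string[] SplitWords(string text)
    {
        return Regex.Split(text ?? string.Empty, @"\s+")
            .Select(w => w.Trim('.', ',', ';', ':', '!', '?', '"', '(', ')').ToLowerInvariant())
            .Where(w => w.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        string[] parts = SplitWords("Roses are  red,\nviolets are blue.");
        Console.WriteLine(string.Join("|", parts));
        // prints "roses|are|red|violets|are|blue"
    }
}
```

Apostrophes are deliberately left alone so that contractions such as "don't" survive as single words.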
After splitting the submitted text into an array, I use a foreach loop to query my MS SQL Server database for each word and its syllable count. This is not the algorithm; it is a simple query against my existing syllable counter dictionary. The syllable counter algorithm kicks in only when a word and its syllable count are not found in the dictionary. In other words, this syllable counter combines a 240,364-word U.S. English syllable count dictionary with a syllable counter algorithm.
// Begin database query: look up each word in the syllable dictionary
foreach (string i in word_partsDB)
{
    using (SqlConnection con = new SqlConnection(ConfigurationManager.ConnectionStrings["XXXConnectionString"].ConnectionString))
    {
        con.Open();
        // Create the parameterized SELECT command
        SqlDataAdapter dad = new SqlDataAdapter("SELECT * FROM SyllableDictionary WHERE Word = @Word AND Syllables IS NOT NULL", con);
        dad.SelectCommand.Parameters.AddWithValue("@Word", i.Trim());
        // Add data to DataTable
        DataTable dtblPMStuff = new DataTable();
        dad.Fill(dtblPMStuff);
        if (dtblPMStuff.Rows.Count != 0)
        {
            numSyllables = numSyllables + Int32.Parse(dtblPMStuff.Rows[0]["Syllables"].ToString());
            // Subtract database words from string to use in algorithm.
            // Regex.Escape keeps punctuation in the word from being read as regex syntax.
            word = Regex.Replace(word, "\\b" + Regex.Escape(i) + "\\b", "");
            word = Regex.Replace(word, "\\b" + Regex.Escape(i) + "$", "");
        }
    } // leaving the using block closes and disposes the connection
}
As the foreach loop iterates through the array, each word found in the database is removed from the submitted text. The words that remain are sent to the syllable counter algorithm, which counts their syllables programmatically.
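The dictionary-first flow can be sketched without a database. The example below is an illustration under stated assumptions, not the original implementation: an in-memory `Dictionary<string, int>` stands in for the SyllableDictionary table, its three entries are made up for the demo, and the `DictionaryFirstCounter` class and `CountKnown` method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryFirstCounter
{
    // Stand-in for the SyllableDictionary table; the entries are illustrative only.
    private static readonly Dictionary<string, int> KnownWords =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "poetry", 3 },
            { "quiet", 2 },
            { "rhythm", 2 }
        };

    // Adds up syllables for words found in the dictionary and collects
    // every other word so the algorithm can handle it afterwards.
    public static int CountKnown(IEnumerable<string> words, List<string> unknownWords)
    {
        int numSyllables = 0;
        foreach (string w in words)
        {
            string key = w.Trim();
            if (key.Length == 0) continue; // skip empty entries left by the split

            if (KnownWords.TryGetValue(key, out int syllables))
                numSyllables += syllables;
            else
                unknownWords.Add(key);
        }
        return numSyllables;
    }

    public static void Main()
    {
        var unknown = new List<string>();
        int total = CountKnown(new[] { "quiet", "poetry", "zyzzyva" }, unknown);
        Console.WriteLine(total);                      // prints 5
        Console.WriteLine(string.Join(",", unknown));  // prints zyzzyva
    }
}
```

Collecting the unknown words in a separate list avoids editing the submitted text with regular expressions, at the cost of holding a second collection in memory.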
Syllable Counting Rules
There are some general rules for counting the number of syllables in a word that must be understood before building an algorithm in C#.
- Count the vowels in the word
- Subtract any silent vowels
- Subtract one vowel when two vowel sounds form one speech sound (diphthong)
- Add syllables for anomalies that do not adhere to the standard patterns
Generally speaking, the number of vowel sounds remaining should be the number of syllables.
More Syllable Division Rules
- Divide between two middle consonants. Example: hap/py
- Usually divide before a single middle consonant. Example: o/ver
- Divide before a consonant that is immediately before an "-le" word part. Example: bub/ble
- Divide off any compound words, prefixes, suffixes, and roots. Example: sea/weed
A syllable is typically made up of a syllable nucleus or vowel with optional opening and closing consonants.
Word | Nucleus | # Syllables |
---|---|---|
cat [kæt] | [æ] | 1 |
bed [bɛd] | [ɛ] | 1 |
ode [oʊd] | [oʊ] | 1 |
beet [bit] | [i] | 1 |
bite [baɪt] | [aɪ] | 1 |
rain [reɪn] | [eɪ] | 1 |
bitten [ˈbɪt.ən] or [ˈbɪt.n] | [ɪ] [ə] or [n] | 2 |
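The nucleus idea suggests a first approximation: treat each run of consecutive vowel letters as one syllable. The sketch below is only an approximation, because it looks at letters rather than sounds, and the `VowelGroups` class name is hypothetical. Running it on the words from the table shows where the approximation holds and where it breaks.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class VowelGroups
{
    // Counts runs of consecutive vowel letters (a, e, i, o, u, y) in a word.
    public static int Count(string word)
    {
        return Regex.Split(word.ToLowerInvariant(), "[^aeiouy]+")
            .Count(part => part.Length > 0);
    }

    public static void Main()
    {
        foreach (string w in new[] { "cat", "bed", "beet", "rain", "bitten", "ode", "bite" })
        {
            Console.WriteLine(w + ": " + Count(w));
        }
        // cat, bed, beet, and rain give 1 and bitten gives 2, matching the table.
        // ode and bite give 2 because the silent final "e" forms its own vowel group.
    }
}
```

The overcount for "ode" and "bite" is the reason the rules above call for subtracting silent vowels.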
The C# Syllable Counting Algorithm
Now, I will show you excerpts of our syllable counting algorithm. This will also help you see the rules of syllable division.
Start by putting the non-database words into an array.
string[] word_partsAfterDB = Regex.Split(word, " ");
Loop through each word in the array using a foreach loop and add or subtract syllables using nested foreach loops based on the syllable counter rules mentioned above. These loops will:
- Count a syllable for each vowel split
- Subtract syllables for each silent vowel regex match in the array
- Add syllables for each syllabic anomaly regex match in the array
foreach (string x in word_partsAfterDB)
{
    // split words at vowels; count a syllable for each split
    string[] word_partsAfterDB_Split = Regex.Split(x, "[^aeiouy]+");
    // count syllables for vowel split
    foreach (string vs in word_partsAfterDB_Split)
    {
        if (vs.Trim() != "")
        {
            numSyllables++;
            numSyllablesPer++;
        }
    }
    // subtract syllables for each silent vowel regex match in array
    foreach (string xx in SubtractSyllables)
    {
        if ((Regex.IsMatch(x, xx)) && (x.Trim() != ""))
        {
            numSyllables--;
            numSyllablesPer--;
        }
    }
    // add syllable for each anomaly regex match in array
    foreach (string xx in AddSyllables)
    {
        if ((Regex.IsMatch(x, xx)) && (x.Trim() != ""))
        {
            numSyllables++;
            numSyllablesPer++;
        }
    }
}
In the code above, once we split each word on its consonants and count a syllable for every vowel group that remains...
string[] word_partsAfterDB_Split = Regex.Split(x, "[^aeiouy]+");
...we subtract a syllable for each word part that signals a diphthong or silent vowel. Below are a few examples.
ArrayList SubtractSyllables = new ArrayList();
SubtractSyllables.Add("cial");
SubtractSyllables.Add("tia");
SubtractSyllables.Add("cius");
SubtractSyllables.Add("cious");
SubtractSyllables.Add("uiet");
SubtractSyllables.Add("gious");
SubtractSyllables.Add("geous");
SubtractSyllables.Add("priest");
SubtractSyllables.Add("giu");
SubtractSyllables.Add("dge");
SubtractSyllables.Add("ion");
SubtractSyllables.Add("iou");
SubtractSyllables.Add("rhy");
SubtractSyllables.Add("n't");
...
We then take into consideration the numerous syllable anomalies in U.S. English, which deviate from the standard patterns for dividing syllables. In the syllable counting program, I account for these deviations by adding a syllable for each one that matches. Here are a few:
ArrayList AddSyllables = new ArrayList();
AddSyllables.Add("ia");
AddSyllables.Add("riet");
AddSyllables.Add("dien");
AddSyllables.Add("ien");
AddSyllables.Add("iet");
AddSyllables.Add("iu");
AddSyllables.Add("iest");
AddSyllables.Add("io");
AddSyllables.Add("ii");
AddSyllables.Add("ily");
AddSyllables.Add(@".oala\b");
AddSyllables.Add(@".iara\b");
AddSyllables.Add(@".ying\b");
AddSyllables.Add(".earest");
AddSyllables.Add(".arer");
AddSyllables.Add(".aress");
AddSyllables.Add(@".eate\b");
AddSyllables.Add(@".eation\b");
AddSyllables.Add(@"[aeiouym]bl\b");
When you see \b (above), it marks a word boundary in regular expressions, so those word parts must come at the end of a word. The other word parts can be found anywhere within the word.
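The three steps can be combined into one small function. The sketch below is a simplified illustration, not the production algorithm: it uses only the word parts excerpted above, so it miscounts many words that the full lists handle, and the `SyllableSketch` class and `Count` method are hypothetical names. It also assumes the input is lowercased first, since the vowel pattern is case-sensitive.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SyllableSketch
{
    // Only the word parts excerpted in the article; the full lists are not public.
    private static readonly string[] SubtractSyllables =
    {
        "cial", "tia", "cius", "cious", "uiet", "gious", "geous",
        "priest", "giu", "dge", "ion", "iou", "rhy", "n't"
    };

    private static readonly string[] AddSyllables =
    {
        "ia", "riet", "dien", "ien", "iet", "iu", "iest", "io", "ii", "ily",
        @".oala\b", @".iara\b", @".ying\b", ".earest", ".arer", ".aress",
        @".eate\b", @".eation\b", @"[aeiouym]bl\b"
    };

    public static int Count(string word)
    {
        string w = word.Trim().ToLowerInvariant();
        if (w.Length == 0) return 0;

        // One syllable per run of vowel letters.
        int n = Regex.Split(w, "[^aeiouy]+").Count(p => p.Length > 0);
        // Minus one per matching "subtract" pattern, plus one per matching "add" pattern.
        n -= SubtractSyllables.Count(p => Regex.IsMatch(w, p));
        n += AddSyllables.Count(p => Regex.IsMatch(w, p));
        return n;
    }

    public static void Main()
    {
        Console.WriteLine(Count("nation"));  // 2: groups "a" and "io", -1 for "ion", +1 for "io"
        Console.WriteLine(Count("koala"));   // 3: groups "oa" and "a", +1 for ".oala\b"
        Console.WriteLine(Count("koalas"));  // 2: "\b" keeps ".oala\b" from matching mid-word (an undercount)
    }
}
```

With only the excerpted lists, "quiet" comes out as 1 rather than 2, which shows how much of the accuracy depends on the word parts that are not published here and on the dictionary lookup that runs first.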
There are many word parts and syllabic anomalies that I have not shown here (I can't give away all of the secrets). They have been added to the C# algorithm. Additionally, there's more to the program, but you can see the final product in action at https://www.syllablecount.com/ or https://www.poetrysoup.com/syllables/syllable_counter.aspx.