Posted: Mon Nov 17, 2008 10:55 pm Post subject: Word Count in C
Hi,
I was wondering if anyone has any helpful resources on how to do word counts in C. I have been trying to find simple, easy-to-understand C examples of how to do this, but to no avail. I tried Googling too, but can't seem to find any good examples.
If anyone can help, please help and point me to great examples!
Many thanks,
deltatux
P.S: To XKCD readers, no pointers please, you know what I mean =P.
A.J
Posted: Mon Nov 17, 2008 11:27 pm Post subject: Re: Word Count in C
deltatux wrote:
P.S To XKCD readers, no pointers please, you know what I mean =P.
Don't worry... no pointers.
By word count, do you mean counting the number of words in a sentence?
If yes, all you have to do is keep a variable that stores the number of words you have so far (it should start at 0), and increase it by one every time you read in a string...
I have a feeling that this isn't what you want... sorry if this didn't help.
Please clarify what you mean by 'word count'.
deltatux
Posted: Tue Nov 18, 2008 12:55 am Post subject: Re: Word Count in C
for example:
Quote:
Hello I am deltatux
The software should say that there are 4 words.
Thanks,
deltatux
Euphoracle
Posted: Tue Nov 18, 2008 7:06 am Post subject: RE:Word Count in C
Well, you can count the number of spaces in the sequence, and add 1 for each complete sentence, assuming you're not counting hyphenated words as two individual words, or putting two spaces after punctuation. If the latter is true, feel free to keep track of whether the last character you checked was a space or punctuation, and if it was, ignore further spaces until it isn't. You can reset your "add 1 'counter'" by determining if you've passed over punctuation. The general structure of a simple starting sentence is:
<word1><space><word2><space><word3><punctuation>
and for a simple sentence further in:
<punctuation_from_last><space><space><word1><space><word2><space><word3><punctuation>
code:
I have a blue dog.  His name is Cody.
4 spaces in first sentence, add 1 = 5 words.
5 spaces in second sentence, remove 2 from start, add 1 = 4 words.
Total = 5 words + 4 words = 9 words.
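A rough C sketch of the space-counting idea above, in case it helps. The function name `count_words_by_spaces` is made up for illustration, and it simplifies the punctuation bookkeeping: a space is only counted when it directly follows a non-space character, so two spaces after a period (or a leading space) are not counted twice. Hyphenated words count as one word, as assumed above.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): counts words by counting
   the spaces that separate them, then adding 1 for the final word. */
size_t count_words_by_spaces(const char text[])
{
    size_t count = 0;
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        /* Count a space only when it directly follows a non-space, so a
           run of spaces or a leading space is not counted repeatedly. */
        if (text[i] == ' ' && i > 0 && text[i - 1] != ' ')
            count++;
    }

    /* The last word has no space after it unless the text ends in one. */
    if (i > 0 && text[i - 1] != ' ')
        count++;

    return count;
}
```

Called on the two-sentence example above (with two spaces after the first period), this should return 9.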
A.J
Posted: Tue Nov 18, 2008 11:51 am Post subject: Re: Word Count in C
That doesn't necessarily work. What if the sentence is:
code:
 Hi, I am A.J.
This has a space before the sentence...
I would say FIRST remove the spaces before and after the sentence, remove all punctuation, then count the spaces and add 1.
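Here is a minimal sketch of the trim-then-count idea, under a few assumptions: the name `count_words_trimmed` is hypothetical, the "trimming" is done with two indexes instead of changing the string, and the punctuation-removal step is left out because punctuation attached to a word does not add any spaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: ignore leading and trailing spaces, then count
   the remaining spaces and add 1. */
size_t count_words_trimmed(const char text[])
{
    size_t start = 0;
    size_t end = strlen(text);   /* one past the last character */
    size_t spaces = 0;
    size_t i;

    /* Move the two indexes inward past any spaces at either end. */
    while (start < end && text[start] == ' ')
        start++;
    while (end > start && text[end - 1] == ' ')
        end--;

    if (start == end)
        return 0;                /* empty, or nothing but spaces */

    for (i = start; i < end; i++) {
        if (text[i] == ' ')
            spaces++;
    }
    return spaces + 1;
}
```

Note that this still over-counts if there are two spaces in a row between words, since every space in the middle is counted.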
deltatux
Posted: Tue Nov 18, 2008 1:19 pm Post subject: RE:Word Count in C
This is very confusing ... can someone explain it a bit simpler?
Many thanks,
deltatux
md
Posted: Tue Nov 18, 2008 2:09 pm Post subject: RE:Word Count in C
code:
read string
index = 0
while index < length of string
    while character in string at index is not a space
        index++
    word count += 1
    while character in string at index is a space
        index++
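The pseudocode above translates to C fairly directly. This is only a sketch: the function name `count_words_md` is made up, and two things are additions that the pseudocode does not spell out, namely the `index < length` checks in the inner loops (so the index cannot run past the end of the string) and an initial skip over leading spaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C version of the pseudocode: step over a word, count it,
   then step over the spaces that follow it. */
size_t count_words_md(const char text[])
{
    size_t length = strlen(text);
    size_t index = 0;
    size_t word_count = 0;

    /* Added guard: skip leading spaces so they are not seen as a word. */
    while (index < length && text[index] == ' ')
        index++;

    while (index < length) {
        /* Walk to the end of the current word. */
        while (index < length && text[index] != ' ')
            index++;
        word_count += 1;

        /* Walk over the spaces separating it from the next word. */
        while (index < length && text[index] == ' ')
            index++;
    }
    return word_count;
}
```

The string is only read, never modified, and "Hello I am deltatux" gives 4.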
A.J
Posted: Tue Nov 18, 2008 3:53 pm Post subject: Re: Word Count in C
Then again, that counts the spaces before and after the string.
Do exactly what md did, but first replace all spaces.
Euphoracle
Posted: Tue Nov 18, 2008 4:19 pm Post subject: Re: Word Count in C
A.J @ Tue Nov 18, 2008 11:51 am wrote:
That doesn't necessarily work. What if the sentence is:
code:
 Hi, I am A.J.
This has a space before the sentence...
I would say FIRST remove the spaces before and after the sentence, remove all punctuation, then count the spaces and add 1.
That was assuming that you're not giving it silly data. Also...
0 + 1 = 1; (Hi,)
2 + 1 = 3; (, I am A.)
0 + 1 = 1; (.J.)
= 5.
How to count A.J. comes down to preference, imo. It can be one or two words, I guess; I don't know how you want it. I'd count it as two, just because it stands for two words, and it's not a recognized acronym like RADAR or NASA or COBOL and the like.
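If you do want punctuation to split words, so that A.J. counts as two, one possible sketch is to treat every run of letters and digits as a word and everything else as a separator. The name `count_alnum_runs` is hypothetical, and the rule is an assumption for illustration: it also splits "don't" and hyphenated words in two, which may not be what you want.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: a "word" is any unbroken run of letters or
   digits; spaces and punctuation both act as separators. */
size_t count_alnum_runs(const char text[])
{
    size_t words = 0;
    int in_word = 0;   /* 1 while inside a run of letters or digits */
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (isalnum((unsigned char)text[i])) {
            if (!in_word) {
                words++;       /* first character of a new word */
                in_word = 1;
            }
        } else {
            in_word = 0;       /* separator: the current word has ended */
        }
    }
    return words;
}
```

For "Hi, I am A.J." this gives 5, with or without the leading space.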
A.J
Posted: Tue Nov 18, 2008 6:16 pm Post subject: Re: Word Count in C
Well, technically I put a space before the 'Hi, I am A.J' (look at it again)...
But if you don't get messed-up test cases, then Euphoracle is right...
Sorry for the confusion, deltatux.
md
Posted: Tue Nov 18, 2008 6:54 pm Post subject: Re: Word Count in C
A.J @ 2008-11-18, 3:53 pm wrote:
Then again, that counts the spaces before and after the string.
Do exactly what md did, but first replace all spaces.
In fact, that algorithm does not count spaces twice. And replacing spaces first (with what?) will break it entirely, since it requires spaces as word separators.
A.J
Posted: Tue Nov 18, 2008 7:35 pm Post subject: Re: Word Count in C
What I meant was to remove the spaces before the words... but nevertheless, it would still work, md.
md
Posted: Tue Nov 18, 2008 10:15 pm Post subject: RE:Word Count in C
The space after one word is the space before another
Incidentally, why mangle a string which you probably need somewhere else just to count words? Algorithms that destroy data do have their place; but it is not here.
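To illustrate the non-destructive point, here is a small sketch that takes the string as `const`, so the compiler rejects any attempt to write to it inside the function. The name `count_words_const` is made up, and using `isspace` as the separator test (so tabs and newlines also separate words) is an assumption on top of what was discussed above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts whitespace-separated words while only
   reading the string. The const qualifier makes writes a compile error. */
size_t count_words_const(const char text[])
{
    size_t words = 0;
    int in_word = 0;   /* 1 while inside a word, 0 while between words */
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        if (isspace((unsigned char)text[i])) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;   /* just stepped from whitespace into a word */
        }
    }
    return words;
}
```

After the call, the caller's string is exactly as it was, so it can still be printed or passed along elsewhere.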
Vermette
Posted: Wed Nov 19, 2008 10:21 am Post subject: RE:Word Count in C
Publisher style word count:
Book.length/5
Euphoracle
Posted: Wed Nov 19, 2008 3:14 pm Post subject: Re: Word Count in C
A.J @ Tue Nov 18, 2008 6:16 pm wrote:
Well, technically I put a space before the 'Hi, I am A.J' (look at it again)...
But if you don't get messed-up test cases, then Euphoracle is right...
Sorry for the confusion, deltatux.
Which is why I said "That was assuming that you're not giving it silly data." I removed the space.