Posted: Mon Aug 17, 2009 3:43 am Post subject: sorting vowels from a given sentence
Hi,i have a sentence and i wanted to know the number of vowels present in that particular sentence.can anyone help me with the code?
Sponsor Sponsor
DtY
Posted: Mon Aug 17, 2009 8:34 am Post subject: RE:sorting vowels from a given sentence
Iterate over each byte, if it's a vowel increase the vowel counter
gianni
Posted: Mon Aug 17, 2009 9:53 am Post subject: RE:sorting vowels from a given sentence
Another way to do this, which is both fast and short, is to replace all non-vowels in the string (with a regular expression, of course), and get the length of the resulting string.
OneOffDriveByPoster
Posted: Mon Aug 17, 2009 2:55 pm Post subject: Re: RE:sorting vowels from a given sentence
gianni @ Mon Aug 17, 2009 9:53 am wrote:
Another way to do this, which is both fast and short, is to replace all non-vowels in the string (with a regular expression, of course), and get the length of the resulting string.
Fast to write, not to run. Plus with Java this means creating extra objects.
gianni
Posted: Mon Aug 17, 2009 3:16 pm Post subject: Re: RE:sorting vowels from a given sentence
OneOffDriveByPoster @ Mon Aug 17, 2009 2:55 pm wrote:
Fast to write, not to run. Plus with Java this means creating extra objects.
I beg to differ. Perhaps regular expressions are slow in Java, but in almost all other languages they are blazing fast.
Here's a benchmark comparison run of both methods in ruby:
user system total real
0.120000 0.000000 0.120000 ( 0.120379)
As you can see, the regular expression method is over twice as fast. Of course this is in Ruby, I don't know how fast Regexs would be in Java.
DtY
Posted: Mon Aug 17, 2009 9:00 pm Post subject: RE:sorting vowels from a given sentence
I think the problem is that regular expressions aren't that fast, but that ruby is slow, meaning that regular expressions (which aren't implemented in ruby) will run comparitively quite fast
wtd
Posted: Mon Aug 17, 2009 9:17 pm Post subject: RE:sorting vowels from a given sentence
In any given language, regular expressions should be faster for this task than a naive loop and increment-count approach, assuming a decent regex implementation.
A decent regular expression engine has spent years being optimized for the task of looping over a string and doing things to it. A basic loop cannot (in the absence of insanely complex optimizing compilers) be optimized in the same way.
Similarly, object creation should have a negligible impact on runtime, or the language is poorly implemented.
rizzix
Posted: Mon Aug 17, 2009 10:59 pm Post subject: RE:sorting vowels from a given sentence
Regardless of the arguments for using regular expressions, the algorithm DtY provided is as optimal as it gets.
The regex would involve creating a DFA or NFA regex engine and running the input string through it. On the other hand DtY's algorithm does not have this overhead (no matter how small the overhead may be).
Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY.
Hence really.... using regex for such a trivial problem is really an overkill and it WILL affect performance (but perhaps negligibly).
Sponsor Sponsor
gianni
Posted: Tue Aug 18, 2009 9:01 pm Post subject: Re: RE:sorting vowels from a given sentence
rizzix @ Mon Aug 17, 2009 10:59 pm wrote:
Regardless of the arguments for using regular expressions, the algorithm DtY provided is as optimal as it gets.
You're right, in Java, the first method is indeed faster. Sample benchmark of 10,000 iterations on a relatively short sentence:
Iteration:
code:
real 0m0.372s
user 0m0.353s
sys 0m0.060s
Regex:
code:
real 0m0.728s
user 0m0.814s
sys 0m0.084s
rizzix @ Mon Aug 17, 2009 10:59 pm wrote:
Hence really.... using regex for such a trivial problem is really an overkill and it WILL affect performance (but perhaps negligibly).
I don't agree that using a regex is overkill, as the regex solution is only one line and the iteration method is five lines (not including braces). Of course with just a single iteration the difference in time is (as you stated) negligible.
rizzix
Posted: Wed Aug 19, 2009 12:21 am Post subject: RE:sorting vowels from a given sentence
If you're going to provide a Java benchmark, please exclude the JVM startup time. It is an unreliable variable in any Java benchmark.
Here's code that excludes the JVM startup-time:
code:
class test {
public static void main(String [] args) {
final int RUNS = 10000;
String s = "Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY.";
long start = 0, size = 0, avg = 0;
for (int i = 0; i < RUNS; i++) {
start = System.nanoTime();
String t = s.replaceAll("[^aeiou]", "");
size = t.length();
avg += System.nanoTime() - start;
}
Posted: Wed Aug 19, 2009 12:51 am Post subject: Re: sorting vowels from a given sentence
I decided to try out something similar in Ruby:
code:
RUNS = 10000
s = "Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY."
size = 0
avg = 0.0
RUNS.times do
start = Time.now
t = s.gsub /[^aeiou]/, ""
size = t.length
avg += Time.now - start
end
puts "size: #{size} time: #{avg.to_f/RUNS}"
size = 0
avg = 0.0
RUNS.times do
size = 0
start = Time.now
s.each_char do |x|
#if x == 'a' || x == 'e' || x == 'i' || x == 'o' || x == 'u' then
if "aeiou".index(x) then #funny how this is faster
size += 1
end
end
avg += Time.now - start
end
puts "size: #{size} time: #{avg.to_f/RUNS}"
Qualitatively speaking, DtY was correct in saying that Ruby is simply slower at processing ordinary things. Hence its regular expression engine which is basically wrapper code around c-functions are faster than any other computative code. This is not necessarily a bad thing as it forces you to use the bundled functions, which generally results in cleaner code.
Further proof is in the fact that if you replace the if-conditional with the comment above, you'll notice an order of magnitude in runtime difference.
Vermette
Posted: Wed Aug 19, 2009 8:19 am Post subject: Re: sorting vowels from a given sentence
rizzix @ August 19th 2009, 00:51 wrote:
I decided to try out something similar in Ruby:
code:
#if x == 'a' || x == 'e' || x == 'i' || x == 'o' || x == 'u' then
if "aeiou".index(x) then #funny how this is faster
size += 1
end
end
Reorder the terminals to be evaluated in order of their frequency for the given language and on a big enough piece of text your code should run even a bit faster (in the case of english, eaoiu).
Tony
Posted: Wed Aug 19, 2009 10:18 am Post subject: RE:sorting vowels from a given sentence
before we get carried away in "Ruby is too slow" direction, it should be pointed out that in a practical application, the source string will likely come from a database; access of which will likely be an order of magnitude above even the slow regex.
Going along with Vermette's micro-optimization, the terminal order could even be tuned to specific text (or type of text). Rizzix's sample text, for example, is actually "eoiau".
Posted: Wed Aug 19, 2009 10:40 am Post subject: Re: RE:sorting vowels from a given sentence
Tony @ Wed Aug 19, 2009 10:18 am wrote:
before we get carried away
I think Robin got scared away after the first reply.
gianni
Posted: Wed Aug 19, 2009 11:16 am Post subject: Re: sorting vowels from a given sentence
Just for fun I thought I'd try the scripts in Jruby, my code is similar to rizzix's:
Ruby:
require'benchmark'
runs = 10_000
text = "Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY."
def benchmark(repetitions, &block)
Benchmark.bmdo |b|
b.report{ repetitions.times &block } end
end
# Regex
benchmark(runs)do
count = text.gsub(/[^aeiouy]/i, '').length end
# Iteration
benchmark(runs)do
count = 0
vowels = "aeiouy"
text.downcase.each_chardo |c|
count += 1if vowels.include? c
end
end
Jruby in 1.9 mode:
code:
user system total real
1.208000 0.000000 1.208000 ( 1.125000) (average: 0.0001208)
user system total real
2.054000 0.000000 2.054000 ( 2.054000) (average: 0.0002054)
As you can see, the JRuby regex version just as fast as the actual Java regex version. And the iteration method doesn't even compare to Java's.