
-----------------------------------
Robin
Mon Aug 17, 2009 3:43 am

sorting vowels from a given sentence
-----------------------------------
Hi,i have a sentence and i wanted to know the number of vowels present in that particular sentence.can anyone help me with the code?

-----------------------------------
DtY
Mon Aug 17, 2009 8:34 am

RE:sorting vowels from a given sentence
-----------------------------------
Iterate over each byte, if it's a vowel increase the vowel counter

-----------------------------------
gianni
Mon Aug 17, 2009 9:53 am

RE:sorting vowels from a given sentence
-----------------------------------
Another way to do this, which is both fast and short, is to replace all non-vowels in the string (with a regular expression, of course), and get the length of the resulting string.

-----------------------------------
OneOffDriveByPoster
Mon Aug 17, 2009 2:55 pm

Re: RE:sorting vowels from a given sentence
-----------------------------------
Another way to do this, which is both fast and short, is to replace all non-vowels in the string (with a regular expression, of course), and get the length of the resulting string.Fast to write, not to run.  Plus with Java this means creating extra objects.

-----------------------------------
gianni
Mon Aug 17, 2009 3:16 pm

Re: RE:sorting vowels from a given sentence
-----------------------------------
Fast to write, not to run.  Plus with Java this means creating extra objects.

I beg to differ. Perhaps regular expressions are slow in Java, but in almost all other languages they are blazing fast.

Here's a benchmark comparison run of both methods in ruby:

Iteration:
[code]Rehearsal ------------------------------------
   0.270000   0.010000   0.280000 (  0.275242)
--------------------------- total: 0.280000sec

       user     system      total        real
   0.270000   0.000000   0.270000 (  0.272689)[/code]

Regex:
[code]Rehearsal ------------------------------------
   0.110000   0.000000   0.110000 (  0.119916)
--------------------------- total: 0.110000sec

       user     system      total        real
   0.120000   0.000000   0.120000 (  0.120379)[/code]

As you can see, the regular expression method is over twice as fast. Of course this is in Ruby, I don't know how fast Regexs would be in Java.

-----------------------------------
DtY
Mon Aug 17, 2009 9:00 pm

RE:sorting vowels from a given sentence
-----------------------------------
I think the problem is that regular expressions aren't that fast, but that ruby is slow, meaning that regular expressions (which aren't implemented in ruby) will run comparitively quite fast

-----------------------------------
wtd
Mon Aug 17, 2009 9:17 pm

RE:sorting vowels from a given sentence
-----------------------------------
In any given language, regular expressions should be faster for this task than a naive loop and increment-count approach, assuming a decent regex implementation.

A decent regular expression engine has spent years being optimized for the task of looping over a string and doing things to it.  A basic loop cannot (in the absence of insanely complex optimizing compilers) be optimized in the same way.

Similarly, object creation should have a negligible impact on runtime, or the language is poorly implemented.

-----------------------------------
rizzix
Mon Aug 17, 2009 10:59 pm

RE:sorting vowels from a given sentence
-----------------------------------
Regardless of the arguments for using regular expressions, the algorithm DtY provided is as optimal as it gets.

The regex would involve creating a DFA or NFA regex engine and running the input string through it. On the other hand DtY's algorithm does not have this overhead (no matter how small the overhead may be).

Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY.

Hence really.... using regex for such a trivial problem is really an overkill and it WILL affect performance (but perhaps negligibly).

-----------------------------------
gianni
Tue Aug 18, 2009 9:01 pm

Re: RE:sorting vowels from a given sentence
-----------------------------------
Regardless of the arguments for using regular expressions, the algorithm DtY provided is as optimal as it gets.

You're right, in Java, the first method is indeed faster. Sample benchmark of 10,000 iterations on a relatively short sentence:

Iteration:
Hence really.... using regex for such a trivial problem is really an overkill and it WILL affect performance (but perhaps negligibly).

I don't agree that using a regex is overkill, as the regex solution is only one line and the iteration method is five lines (not including braces). Of course with just a single iteration the difference in time is (as you stated) negligible.

-----------------------------------
rizzix
Wed Aug 19, 2009 12:21 am

RE:sorting vowels from a given sentence
-----------------------------------
If you're going to provide a Java benchmark, please exclude the JVM startup time. It is an unreliable variable in any Java benchmark.

Here's code that excludes the JVM startup-time:
[code]class test {
    public static void main(String [] args) {
        final int RUNS = 10000;

        String s = "Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY.";

        long start = 0, size = 0, avg = 0;

        for (int i = 0; i < RUNS; i++) {
            start = System.nanoTime();
            String t = s.replaceAll("[^aeiou]", "");
            size = t.length();
            avg += System.nanoTime() - start;
        }

        System.out.printf("size: %d, time: %f\n", size, avg*1e-9/RUNS);
        avg = 0;

        for (int i = 0; i < RUNS; i++) {
            size = 0;
            start = System.nanoTime();
            char[] chrs = s.toCharArray();
            for (int j = 0; j < chrs.length; j++) {
                if ( chrs[j] == 'a'
                  || chrs[j] == 'e'
                  || chrs[j] == 'i'
                  || chrs[j] == 'o'
                  || chrs[j] == 'u')
                      size++;
            }
            avg += System.nanoTime() - start;
        }

        System.out.printf("size: %d, time: %f\n", size, avg*1e-9/RUNS);
    }
}
[/code]

Results I get (time in seconds):[code]size: 155, time: 0.000151
size: 155, time: 0.000004[/code]

-----------------------------------
rizzix
Wed Aug 19, 2009 12:51 am

Re: sorting vowels from a given sentence
-----------------------------------
I decided to try out something similar in Ruby:
Qualitatively speaking, DtY was correct in saying that Ruby is simply slower at processing ordinary things. Hence its regular expression engine which is basically wrapper code around c-functions are faster than any other computative code. This is not necessarily a bad thing as it forces you to use the bundled functions, which generally results in cleaner code.

Further proof is in the fact that if you replace the if-conditional with the comment above, you'll notice an order of magnitude in runtime difference.

-----------------------------------
Vermette
Wed Aug 19, 2009 8:19 am

Re: sorting vowels from a given sentence
-----------------------------------
I decided to try out something similar in Ruby:


Reorder the terminals to be evaluated in order of their frequency for the given language and on a big enough piece of text your code should run even a bit faster (in the case of english, eaoiu).

-----------------------------------
Tony
Wed Aug 19, 2009 10:18 am

RE:sorting vowels from a given sentence
-----------------------------------
before we get carried away in "Ruby is too slow" direction, it should be pointed out that in a practical application, the source string will likely come from a database; access of which will likely be an order of magnitude above even the slow regex.

Going along with Vermette's micro-optimization, the terminal order could even be tuned to specific text (or type of text). Rizzix's sample text, for example, is actually "eoiau".

-----------------------------------
chrisbrown
Wed Aug 19, 2009 10:40 am

Re: RE:sorting vowels from a given sentence
-----------------------------------
before we get carried away
I think Robin got scared away after the first reply.

-----------------------------------
gianni
Wed Aug 19, 2009 11:16 am

Re: sorting vowels from a given sentence
-----------------------------------
Just for fun I thought I'd try the scripts in Jruby, my code is similar to rizzix's:

require 'benchmark'

runs = 10_000
text = "Finally considering you need to count the number of vowels, which presumably can occur (effectively) radomly in the input string, you necessarily have to go through all characters of the input string. Thus what ever approch you take, whether it invovles elimnating non-vowels from the input and then counting the length of the remaining string or whether you run through it once counting the vowels on the way, you need to go through all the characters in the string. Thus the most optmal algorithm would be an O( n) algorithm which was provided by DtY."

def benchmark(repetitions, &block)
  Benchmark.bm do |b|
    b.report { repetitions.times &block }
  end
end

# Regex
benchmark(runs) do
  count = text.gsub(/

Jruby in 1.9 mode:
[code]      user     system      total        real
  1.208000   0.000000   1.208000 (  1.125000)     (average: 0.0001208)


      user     system      total        real
  2.054000   0.000000   2.054000 (  2.054000)     (average: 0.0002054)[/code]

As you can see, the JRuby regex version just as fast as the actual Java regex version. And the iteration method doesn't even compare to Java's.

-----------------------------------
rizzix
Wed Aug 19, 2009 2:01 pm

Re: sorting vowels from a given sentence
-----------------------------------
As you can see, the JRuby regex version just as fast as the actual Java regex version. And the iteration method doesn't even compare to Java's. Interesting find; I'm not sure why they didn't optimize it enough. Do you have a mac? It would be nice to try it on macruby and see if there's any significant difference.
