Programming C, C++, Java, PHP, Ruby, Turing, VB
Computer Science Canada 
Programming C, C++, Java, PHP, Ruby, Turing, VB  

Username:   Password: 
 RegisterRegister   
 Why is it that...
Index -> Programming, C++ -> C++ Help
View previous topic Printable versionDownload TopicSubscribe to this topicPrivate MessagesRefresh page View next topic
Author Message
SJ




PostPosted: Wed Jul 30, 2008 6:56 am   Post subject: Why is it that...

Why is it that when I try to make an array (array, not vector) of say, 1,000,000,000 bools the program runs, but 2,000,000,000 bools it won't run?

Also, why is that when I have an array of 200,000,000 bools it takes only about 500K of memory, but a vector of 200,000,000 bools takes up to 25000K of memory?
Sponsor
Sponsor
Sponsor
sponsor
DemonWasp




PostPosted: Wed Jul 30, 2008 8:12 am   Post subject: RE:Why is it that...

1. 1B bools ~= 1GB of memory consumed, 2B bools ~= 2GB memory consumed. How much RAM do you have available (including swap space, subtracting what your OS is already using)?

2. Arrays and Vectors are fundamentally different. Arrays are a static, contiguous block of RAM allocated for one purpose, and indexed by simple arithmetic; they contain NO additional information, and are fast to access, but cannot be dynamically resized, and are a poor implementation of "lists". Vectors are a construct made to implement a "list", which requires dynamic resizing, insertion between existing elements, and so forth; it consumes additional RAM in order to be able to provide these functions, and may also have to expend four or more bytes per bool (I can't say for sure, it may be required by the C++ template structure that Vectors use).

Even so, your numbers seem WAY off. First, 200M bools ~= 200MB of RAM, not 500k. Perhaps the operating system / compiler are smart enough to only use what you actually access...? Secondly, a vector of 200M bools at 25MB of RAM is exceptionally low (it may not have allocated much of the list at all; in fact, it probably only allocated some small initial part of it). Try setting every entry to some number, then check the usage again. If it still reports the same, let us know how you're measuring it.
Dan




PostPosted: Wed Jul 30, 2008 11:24 am   Post subject: RE:Why is it that...

DemonWasp seem to have given a good awnser, tho the size of a boolean depends on the C++ implementation you are using and the CPU. Ideally it could be 1 bit, however some impmenations of C have a bool data type that is relay a 4 bit integer on 32 bit CPUs.

So if it is 1bit then 2B would be 238.418579 megabytes, if it is 1 byte, 2B would be 1.86264515 gigabytes and if it was 4 bytes it would be 7.4505806 gigabytes.

To find out you can do sizeof(bool) or the name of the data type you are using, it is likely 1 byte or 1 bit.

Also I am very curious to know why you need 2 billion booleans in an array.
Computer Science Canada Help with programming in C, C++, Java, PHP, Ruby, Turing, VB and more!
DemonWasp




PostPosted: Wed Jul 30, 2008 1:12 pm   Post subject: RE:Why is it that...

Yes, ideally a boolean could be 1 bit, but in practicality it isn't (variety of reasons I can't recall well enough at the moment). It's generally a byte, though I suppose 4 bytes makes sense too (as that could be the size of a memory-fetch).

According to Wikipedia, though: "sizeof is a compile-time operator that returns the size, in multiples of the size of char, of the variable or parenthesized type-specifier that it precedes. The size of char on most architectures is 1 byte (8 bits) so for all practical purposes sizeof effectively returns the size in bytes"

So, sadly, sizeof ( bool ) will return 1 on most computers, and it means "1 byte", not the 1 bit you might hope for. This is why, if you have the need for a huge array of booleans, you do it by hand, and bit-shift for yourself; I did this...I've now got a program that can find all the prime numbers from 1 to a hideous number (I think it was 100,000,000) in about 5 seconds on my laptop. If anyone's interested, I could post source code.

Sometimes the low-level specifics of computers are awesome, other times they just make my head hurt.
SJ




PostPosted: Wed Jul 30, 2008 4:27 pm   Post subject: Re: RE:Why is it that...

DemonWasp @ Wed Jul 30, 2008 8:12 am wrote:
1. 1B bools ~= 1GB of memory consumed, 2B bools ~= 2GB memory consumed. How much RAM do you have available (including swap space, subtracting what your OS is already using)?


I have 2gig ram. This make sense now, thanks Smile

DemonWasp @ Wed Jul 30, 2008 8:12 am wrote:
Even so, your numbers seem WAY off. First, 200M bools ~= 200MB of RAM, not 500k. Perhaps the operating system / compiler are smart enough to only use what you actually access...? Secondly, a vector of 200M bools at 25MB of RAM is exceptionally low (it may not have allocated much of the list at all; in fact, it probably only allocated some small initial part of it). Try setting every entry to some number, then check the usage again. If it still reports the same, let us know how you're measuring it.


Yep they are! Surprised me before. I took your advice and set every entry to 1. here are the results:

normal array: 196000 K memory, takes a second to set everything
vector: 25000 K, takes about 15 seconds
bitset: 25000 K, around 10 seconds

so the normal array turns out normal. The vector seems to take very little space. The bitset appears to be the way to go.

Dan @ Wed Jul 30, 2008 11:24 am wrote:
Also I am very curious to know why you need 2 billion booleans in an array.


I was testing my limits Twisted Evil
DemonWasp




PostPosted: Thu Jul 31, 2008 8:26 am   Post subject: Re: RE:Why is it that...

SJ @ Wed Jul 30, 2008 4:27 pm wrote:
DemonWasp @ Wed Jul 30, 2008 8:12 am wrote:
Even so, your numbers seem WAY off. First, 200M bools ~= 200MB of RAM, not 500k. Perhaps the operating system / compiler are smart enough to only use what you actually access...? Secondly, a vector of 200M bools at 25MB of RAM is exceptionally low (it may not have allocated much of the list at all; in fact, it probably only allocated some small initial part of it). Try setting every entry to some number, then check the usage again. If it still reports the same, let us know how you're measuring it.


Yep they are! Surprised me before. I took your advice and set every entry to 1. here are the results:

normal array: 196000 K memory, takes a second to set everything
vector: 25000 K, takes about 15 seconds
bitset: 25000 K, around 10 seconds


Well, they appear to be taking the correct amount of time to initialise, but the RAM usage is still funny for the vector. While there are ways of storing that many booleans in less space (for example, run-length encoding could do it in some cases), they are not implemented in the array, and are unlikely to be there in the vector (I don't know what a Bitset is, having never used one). The array usage looks about right (since Windows reports KiB as KB, 196000KiB == 200704KB), but the vector is doing something screwy.

How are you measuring RAM usage?
jeffgreco13




PostPosted: Thu Jul 31, 2008 9:47 am   Post subject: RE:Why is it that...

I know this is the C++ forum but does anyone know if the RAM consumption is similar throughout different languages?
DemonWasp




PostPosted: Thu Jul 31, 2008 10:02 am   Post subject: RE:Why is it that...

RAM consumption will depend in many ways upon the language, but the bigger differences will be in the algorithm / data structures used. So, you can probably expect that languages like C will have the lowest RAM consumption, while things like Javascript will have hideous RAM consumption. Object-oriented stuff requires additional RAM (v-tables and so forth), as would untyped code (I'm not 100% sure of that one).

However, you can expect that the right data structure will generally trump language. That is, if you code something the naive way in C, and the better way in Javascript, there's a decent chance that the JS will use less RAM than the C code will (the same is also true of processing time and so forth).

Also worth noting is that a fair number of languages run inside a virtual machine or interpreter (including Java, but I couldn't enumerate others without looking stupid). These virtual machines will instantly add a fairly large amount of RAM usage to the application, something Java has been notorious for in the past (they're getting a lot better about it, but it's still eating a fair bit of space).
Sponsor
Sponsor
Sponsor
sponsor
SJ




PostPosted: Thu Jul 31, 2008 4:07 pm   Post subject: RE:Why is it that...

I'm just measuring memory by looking at it from the task manager. What's a better way?
DemonWasp




PostPosted: Thu Jul 31, 2008 7:24 pm   Post subject: RE:Why is it that...

If you're on a Unix-based system, then there's a lot of better ways. On Windows, however, there's a way to easily find the amount of RAM the program is actually holding onto (if not actively using):

In the Task Manager, View -> Select Columns and enable "Virtual Memory Size". Take the value in that column and add it to the "Mem Usage" column. This is the operating system being clever and paging out parts of the array that aren't frequently used, as far as I can tell.

If you want to do it really properly, then start looking for a software tool called a "profiler". These let you look at what your program is doing as it's running, far better than simple cout/printf calls would let you; they're incredibly helpful in plugging memory leaks, hunting down segfaults, and other common C++ debugging (particularly on large applications). Most will cost money, but there should be something out there that's free.
SJ




PostPosted: Sat Aug 02, 2008 7:38 pm   Post subject: RE:Why is it that...

I see. Thanks a lot DemonWasp. I'm using vista so there isn't an option called "virtual memory size". There is, however, "memory - working set" , "memory - commit size", and others totaling 7 memory options... confusing but i'll play around them as well as a profiler, which sounds pretty fun Smile
Rigby5




PostPosted: Thu Oct 09, 2008 11:25 pm   Post subject: Re: RE:Why is it that...

SJ @ Sat Aug 02, 2008 7:38 pm wrote:
I see. Thanks a lot DemonWasp. I'm using vista so there isn't an option called "virtual memory size". There is, however, "memory - working set" , "memory - commit size", and others totaling 7 memory options... confusing but i'll play around them as well as a profiler, which sounds pretty fun Smile


With Windows, "vitual memory" is when you need more memory than you actually have, so the system uses some hard drive space as a buffer, to act like real memory.

The main types of actual memory used by a running program are heap and stack.
The heap is permanent memory that either the loader calculates you need at the beginning, or you add with a "new" call.
The stack is temporary memory that grows and shrinks as you call and return from functions.
The reason there are terms like "working set" and "commit size" is because even though you may have requested memory off the heap, until you actually write to it, the system will not commit it to cache, and therefore allows other operations to use cache and work faster.
Cache is not really main memory, but a small and fact buffer that can shadow any part of real memory, for speed improvement.
jbking




PostPosted: Fri Oct 10, 2008 9:24 am   Post subject: Re: Why is it that...

Process Explorer would be my suggestion instead of Task Manager for looking at the total size of memory in use by a process as well as seeing if that process has child processes which can happen at times.
md




PostPosted: Fri Oct 10, 2008 11:37 am   Post subject: RE:Why is it that...

This thread is old. Do not revive old threads.
Display posts from previous:   
   Index -> Programming, C++ -> C++ Help
View previous topic Tell A FriendPrintable versionDownload TopicSubscribe to this topicPrivate MessagesRefresh page View next topic

Page 1 of 1  [ 14 Posts ]
Jump to:   


Style:  
Search: