| Linux Forum Index » Linux Development - System » thread memory size... |
| Jan Helgesen... |
Posted: Mon Oct 05, 2009 2:41 am |
|
|
|
Guest
|
Hi
I can't find any concrete information about this, or I used the wrong
search terms on Google, but
how much memory does a thread in Linux take up, if you exclude the stack
size of the thread? And what is the default thread stack size?
regards
jan
| Rainer Weikusat... |
Posted: Mon Oct 05, 2009 4:47 am |
|
|
|
Guest
|
Jan Helgesen <spam at (no spam) nospam.com> writes:
[...]
Quote: how much memory does a thread in linux take up, if you exclude the
stack size of the thread. And what is the default thread stack size?
Have you considered writing a multi-threaded program and looking?
--------
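#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Thread body: block until a signal arrives, so the thread stays alive
 * and its stack mapping stays visible to pmap. */
void *do_nothing(void *unused)
{
    (void)unused;
    pause();
    return NULL;
}

int main(void)
{
    pthread_t tid;
    int rc;

    /* NULL attributes: the thread gets the default stack size. */
    rc = pthread_create(&tid, NULL, do_nothing, NULL);
    if (rc) {
        /* pthread_create returns the error number instead of setting errno. */
        errno = rc;
        perror("Something went terribly wrong");
        exit(1);
    }

    /* Keep the process around for inspection. */
    pause();
    return 0;
}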
--------
After having been compiled and started, the runtime virtual memory
layout can be inspected with the pmap command. The relevant part is
this (pmap -d):
16172:   ./a.out
Address   Kbytes Mode  Offset           Device    Mapping
08048000       4 r-x-- 0000000000000000 016:00007 a.out
08049000       4 rw--- 0000000000000000 016:00007 a.out
0804a000     132 rw--- 0000000000000000 000:00000   [ anon ]
b7648000       4 ----- 0000000000000000 000:00000   [ anon ]
b7649000    8192 rw--- 0000000000000000 000:00000   [ anon ]
The 8M is address space reserved for the stack of the
do_nothing-thread. Actual memory is assigned to the addresses in this
area as needed. The stack grows downward and the 'no access' page
which sits just below the lower bound of the stack is supposed to
cause a trap in case of a stack overrun.
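The default stack size can also be queried from the threads library instead of being read off the pmap output. The following is a minimal sketch, assuming a POSIX threads implementation such as glibc's; `default_thread_sizes` is a hypothetical helper name that does not come from this thread. On glibc the reported default typically follows the soft `RLIMIT_STACK` limit (`ulimit -s`), which is commonly 8 MiB and would match the 8192 Kbytes mapping, and the guard size is normally one page, which would match the 4 Kbytes 'no access' mapping.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: report the stack and guard sizes that a freshly
 * initialised attribute object carries, i.e. the library defaults.
 * Returns 0 on success or a pthread error number. */
int default_thread_sizes(size_t *stack_size, size_t *guard_size)
{
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);      /* fill attr with the defaults */
    if (rc)
        return rc;

    rc = pthread_attr_getstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_attr_getguardsize(&attr, guard_size);

    pthread_attr_destroy(&attr);
    return rc;
}
```

Called from a small `main()` and printed with `%zu`, this gives both numbers in bytes; compile with `-pthread`.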
As to the other part of your question: What are you referring to,
virtual address space? Physical memory used by the kernel? Both? |
| Jan Helgesen... |
Posted: Mon Oct 05, 2009 6:05 am |
|
|
|
Guest
|
Rainer Weikusat wrote:
Quote: As to the other part of your question: What are you referring to,
virtual address space? Physical memory used by the kernel? Both?
It depends on which parts are relevant to answer the question. That is
why I am asking such an open-ended question, because my knowledge of
these details in Linux specifically is quite minimal.
To be more precise, I am probably talking about the amount of kernel
memory required to keep a thread. In other words, not the amount of
runtime memory the code needs.
So my follow-up question would then be: given 1GB of RAM, how many
threads can run at the same time in Linux.
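On the user-space side, the stack reservation is usually the largest per-thread cost, and it can be lowered per thread. Here is a hedged sketch of how one might probe this: the helper names `parked_worker` and `spawn_small_stack_threads` are made up for illustration, and the sketch says nothing about kernel memory per thread. Each worker parks on a condition variable, so all started threads exist at the same time before they are released.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open;

/* Worker: block until the creator opens the gate, then exit. */
static void *parked_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Hypothetical helper: try to start `count` threads that each reserve
 * only `stack_bytes` of stack. Returns how many existed concurrently. */
size_t spawn_small_stack_threads(size_t count, size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t *tids;
    size_t started = 0, i;

    tids = malloc(count * sizeof *tids);
    if (tids == NULL)
        return 0;
    if (pthread_attr_init(&attr) != 0) {
        free(tids);
        return 0;
    }

    pthread_mutex_lock(&gate_lock);
    gate_open = 0;
    pthread_mutex_unlock(&gate_lock);

    /* Fails with EINVAL when stack_bytes is below PTHREAD_STACK_MIN. */
    if (pthread_attr_setstacksize(&attr, stack_bytes) == 0) {
        while (started < count &&
               pthread_create(&tids[started], &attr, parked_worker, NULL) == 0)
            started++;
    }

    /* Release all parked workers and wait for them to finish. */
    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_attr_destroy(&attr);
    free(tids);
    return started;
}
```

Raising `count` until the return value stops growing gives a rough, system-dependent upper bound; the result also depends on limits such as the maximum number of processes per user, not only on RAM.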
I was watching this intro about Erlang, in which the speaker states that
Erlang has native and super lightweight threads. Erlang threads take up
about 300-350 bytes. This could make it possible to run 1 million threads
at the same time, on the correct hardware. A processor called Tilera
Tile64 was mentioned, which is a "network chip", which means it could
possibly scale to 1 million cores (across a number of processors) in the
future.
So my question was then: what has to be done with Linux to be able to
operate on such a system? Or would it not even be possible with the
current Linux architecture?
regards
jan
| David Schwartz... |
Posted: Mon Oct 05, 2009 8:40 am |
|
|
|
Guest
|
On Oct 5, 5:05 am, Jan Helgesen <s... at (no spam) nospam.com> wrote:
Quote: I was watching this intro about Erlang, in which he states that Erlang
has native and super lightweight threads. Erlang threads takes up about
300-350 bytes. This could make it possible to run 1 million threads at
the same time, on the correct hardware. It was mentioned a processor
called Tilera, Tile64, which is a "network chip", which means it could
possibly scale to 1 million cores (across a number of processors) in the
future.
If you had that many cores, you would not want them running threads.
That would make no sense at all.
Quote: So my question was then, what has to be done with Linux to be able to
operate on such a system? or would it not even be possible with the
current linux architecture
You wouldn't use threads in that case, at least not in the sense of
the types of threads that Linux creates. If you had a million cores,
in any foreseeable architecture within the next several dozen years or
so, you would not want SMP.
DS |
| David Schwartz... |
Posted: Mon Oct 05, 2009 11:03 am |
|
|
|
Guest
|
On Oct 5, 1:19 pm, Jan Helgesen <s... at (no spam) nospam.com> wrote:
Quote: Could you explain why you think so? Because I am pretty sure there are
more ways than the one, or perhaps two ways that are traditional today,
to do high performance computing. Maybe you don't know of those ways yet?
Threads all share a vm. If you ran a million cores with a million
threads, anything that changed the vm would require coordination among
all the threads. That would be a pretty hideous synchronization
bottleneck.
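To make "threads all share a vm" concrete, here is a small sketch (an illustration only, with hypothetical helper names): one thread creates a new mapping, and another thread can use the returned address directly, because both run in the same address space. The flip side is that changes to the mappings, for example an unmap, have to become visible to every CPU that runs a thread of the process, which is the coordination cost described above. The 4096-byte length is an assumed page size.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <pthread.h>
#include <string.h>
#include <sys/mman.h>

#define DEMO_MAP_LEN 4096       /* assumed page size, for illustration */

/* Worker: create a fresh anonymous mapping and write into it. */
static void *map_and_write(void *unused)
{
    char *page;

    (void)unused;
    page = mmap(NULL, DEMO_MAP_LEN, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    strcpy(page, "mapped by worker");
    return page;                /* the address is valid in every thread */
}

/* Returns 1 if the calling thread can read the worker's mapping. */
int mapping_is_shared_between_threads(void)
{
    pthread_t tid;
    void *result = NULL;
    int ok;

    if (pthread_create(&tid, NULL, map_and_write, NULL) != 0)
        return 0;
    if (pthread_join(tid, &result) != 0 || result == NULL)
        return 0;

    ok = strcmp(result, "mapped by worker") == 0;
    munmap(result, DEMO_MAP_LEN);   /* this change affects all threads too */
    return ok;
}
```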
DS |
| Jan Helgesen... |
Posted: Mon Oct 05, 2009 2:19 pm |
|
|
|
Guest
|
David Schwartz wrote:
Quote:
If you had that many cores, you would not want them running threads.
That would make no sense at all.
You wouldn't use threads in that case, at least not in the sense of
the types of threads that Linux creates. If you had a million cores,
in any foreseeable architecture within the next several dozen years or
so, you would not want SMP.
Could you explain why you think so? Because I am pretty sure there are
more ways than the one, or perhaps two ways that are traditional today,
to do high performance computing. Maybe you don't know of those ways yet?
regards
Jan |
| David Schwartz... |
Posted: Mon Oct 05, 2009 10:27 pm |
|
|
|
Guest
|
On Oct 6, 1:12 am, Jan Helgesen <s... at (no spam) nospam.com> wrote:
Quote: That's only a problem if you use the Shared Memory concurrency model.
Right, which these threads do.
Quote: If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if you
don't share the access to the data, you don't need to use a lock, nor
synchronisation.
Right, so it would make sense to ask this question about threads that
use that model, but these don't.
Quote: This is age-old knowledge, which was implemented in the Erlang
language 20 years ago. But for some reason computer people insist on
using the Shared Memory model instead. I think it's because the Message
Passing model requires a complete shift in the thinking of how we
program our code. And as the optimists we humans are, we think we know
better. So we think that we can solve the Shared Memory model and hence
focus our efforts there. That's like trying to save a sinking ship.
But I digress.
So back to my question: how much kernel memory does a thread require?
It doesn't matter, since that would only be a sensible question to ask
about threads that don't use the shared memory model. But these do.
These are not the kind of threads where your question makes sense. If
you were going to set up message passing threads, they'd look nothing
like Linux KSEs, so how much memory a Linux KSE takes up wouldn't
matter to you.
DS |
| David Schwartz... |
Posted: Mon Oct 05, 2009 11:23 pm |
|
|
|
Guest
|
On Oct 6, 1:55 am, Jan Helgesen <s... at (no spam) nospam.com> wrote:
Quote: Right, so it would make sense to ask this question about threads that
use that model, but these don't.
I think you misunderstand. Shared memory is not the same as the Shared
Memory concurrency model. Yes, the threads share the same memory space,
but that does not mean they have to share the data using that shared
memory. The program can still pass data as messages, using pipes or
other messaging software.
What a mind-bogglingly stupid thing to do. You pay all the costs of
shared memory and get none of the benefits.
If you have shared memory threads, implementing the concurrency through
shared memory would be the most efficient way to do it. The whole
point of not using shared memory concurrency is that it makes it
possible to use threads that don't share memory, saving you the cost
of shared memory threads.
Quote: But as I say, it requires a leap in thinking
because of the fact that the structure of the program has to be
completely different than it is for today's sequential program structure.
I agree. And the biggest difference -- you wouldn't use shared memory
threads. That way, you get rid of all the overhead of keeping the vm
in sync.
Quote: Three very important aspects of this are:
1) The programs or components are not allowed to use global variables or
memory, just local variables and data structures
2) every piece of the program defined as an independent component has to
communicate its data to the next component by way of a message with
data, instead of by sending a pointer to the data. This means data will
be copied more than normally, but with proper support in the operating
system and a million cores, or even just 256 cores, you won't know the
difference.
3) The programmer has to study the data and the processing to determine
a way to parallelise the execution so it can leverage the number of
threads and cores.
And the whole point of going to all that trouble is that you can get
your concurrency even if your threads don't share memory. So you
*don't* pay the overhead of shared memory threads but still get fast
concurrency.
If you used shared memory threads though, any sensible implementation
would implement the message passing in shared memory. If threads share
memory, that shared memory will almost always be the fastest IPC
mechanism.
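As an illustration of message passing implemented in shared memory, here is a minimal sketch of a bounded mailbox between threads of one process. It is one possible design, not the one any particular runtime uses; all names (`struct mailbox`, `mailbox_send`, `mailbox_receive`, `mailbox_demo_sum`) are hypothetical. The messages are plain `int` values that live in memory both threads can see, so no kernel pipe or extra copy through the kernel is involved.

```c
#include <pthread.h>

#define MAILBOX_SLOTS 8

struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
    int slots[MAILBOX_SLOTS];
    unsigned head;      /* index of the oldest queued message */
    unsigned count;     /* number of queued messages */
};

void mailbox_init(struct mailbox *mb)
{
    pthread_mutex_init(&mb->lock, NULL);
    pthread_cond_init(&mb->not_empty, NULL);
    pthread_cond_init(&mb->not_full, NULL);
    mb->head = 0;
    mb->count = 0;
}

/* Blocks while the mailbox is full. */
void mailbox_send(struct mailbox *mb, int msg)
{
    pthread_mutex_lock(&mb->lock);
    while (mb->count == MAILBOX_SLOTS)
        pthread_cond_wait(&mb->not_full, &mb->lock);
    mb->slots[(mb->head + mb->count) % MAILBOX_SLOTS] = msg;
    mb->count++;
    pthread_cond_signal(&mb->not_empty);
    pthread_mutex_unlock(&mb->lock);
}

/* Blocks while the mailbox is empty; messages arrive in FIFO order. */
int mailbox_receive(struct mailbox *mb)
{
    int msg;

    pthread_mutex_lock(&mb->lock);
    while (mb->count == 0)
        pthread_cond_wait(&mb->not_empty, &mb->lock);
    msg = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_SLOTS;
    mb->count--;
    pthread_cond_signal(&mb->not_full);
    pthread_mutex_unlock(&mb->lock);
    return msg;
}

struct producer_args {
    struct mailbox *mb;
    int n;
};

/* Producer thread: send the messages 1..n. */
static void *producer(void *p)
{
    struct producer_args *args = p;
    int i;

    for (i = 1; i <= args->n; i++)
        mailbox_send(args->mb, i);
    return NULL;
}

/* Demo: a producer thread sends 1..n, the caller receives and sums them.
 * Returns the sum, or -1 if the producer thread could not be started. */
int mailbox_demo_sum(int n)
{
    struct mailbox mb;
    struct producer_args args;
    pthread_t tid;
    int i, sum = 0;

    mailbox_init(&mb);
    args.mb = &mb;
    args.n = n;
    if (pthread_create(&tid, NULL, producer, &args) != 0)
        return -1;
    for (i = 0; i < n; i++)
        sum += mailbox_receive(&mb);
    pthread_join(tid, NULL);
    return sum;
}
```

Note that the mailbox itself still needs a lock, because sender and receiver both touch `head` and `count`.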
DS |
| David Schwartz... |
Posted: Tue Oct 06, 2009 1:50 am |
|
|
|
Guest
|
On Oct 6, 2:44 am, Rainer Weikusat <rweiku... at (no spam) mssgmbh.com> wrote:
Quote: Jan Helgesen <s... at (no spam) nospam.com> writes:
That's only a problem if you use the Shared Memory concurrency model.
If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if
you don't share the access to the data, you don't need to use a lock,
nor synchronisation.
This is nonsense. 'Synchronization' is always (and only) an issue
when shared resources are used in potentially conflicting ways, for
instance, to provide communication channels with more than one sender
and one receiver (n:m access, n * m != 0 && (n > 1 || m > 1), please
note that negative numbers of senders or receivers make no
sense). And try to attend an introductory course on
linguistics. Using a nonsense definition ('message passing' is only and
exactly what our code does) is ok for marketing, but not for science.
He is trying to repeat something he doesn't fully understand.
What he wants to say is this:
If you do all your synchronization using message passing, you scale to
more execution vehicles than can reasonably be put in an SMP domain
because you don't have to share an entire vm. Synchronization will be
a bit more expensive than with a shared-memory system, but (hopefully)
there will be so much less synchronization that it will be a net win.
But to use message passing synchronization between threads that share a
vm is just piling a loss on top of a loss. And if your message passing
is sensible, it will use shared memory to emulate the pipe anyway.
What he's missing is that the whole point of limiting all your
inter-'thread' communication to message passing is that your threads
then don't have to share a vm. So he's trying to add message passing
to threads that share a vm and imagine scaling that to thousands of
cores.
This totally defeats the point of message passing.
DS |
| Jan Helgesen... |
Posted: Tue Oct 06, 2009 2:12 am |
|
|
|
Guest
|
David Schwartz wrote:
Quote:
Threads all share a vm. If you ran a million cores with a million
threads, anything that changed the vm would require coordination among
all the threads. That would be a pretty hideous synchronization
bottleneck.
That's only a problem if you use the Shared Memory concurrency model.
If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if you
don't share the access to the data, you don't need to use a lock, nor
synchronisation.
This is age-old knowledge, which was implemented in the Erlang
language 20 years ago. But for some reason computer people insist on
using the Shared Memory model instead. I think it's because the Message
Passing model requires a complete shift in the thinking of how we
program our code. And as the optimists we humans are, we think we know
better. So we think that we can solve the Shared Memory model and hence
focus our efforts there. That's like trying to save a sinking ship.
But I digress.
So back to my question: how much kernel memory does a thread require?
regards
Jan |
| Jan Helgesen... |
Posted: Tue Oct 06, 2009 2:55 am |
|
|
|
Guest
|
David Schwartz wrote:
Quote: On Oct 6, 1:12 am, Jan Helgesen <s... at (no spam) nospam.com> wrote:
That's only a problem if you use the Shared Memory concurrency model.
Right, which these threads do.
If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if you
don't share the access to the data, you don't need to use a lock, nor
synchronisation.
Right, so it would make sense to ask this question about threads that
use that model, but these don't.
I think you misunderstand. Shared memory is not the same as the Shared
Memory concurrency model. Yes, the threads share the same memory space,
but that does not mean they have to share the data using that shared
memory. The program can still pass data as messages, using pipes or
other messaging software. But as I say, it requires a leap in thinking
because the structure of the program has to be
completely different from today's sequential program structure.
Three very important aspects of this are:
1) The programs or components are not allowed to use global variables or
memory, just local variables and data structures.
2) Every piece of the program defined as an independent component has to
communicate its data to the next component by way of a message with
data, instead of by sending a pointer to the data. This means data will
be copied more than normally, but with proper support in the operating
system and a million cores, or even just 256 cores, you won't know the
difference.
3) The programmer has to study the data and the processing to determine
a way to parallelise the execution so it can leverage the number of
threads and cores.
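The pipe-based variant described in this post can be sketched as follows. This is only an illustration under simple assumptions (one sender, one receiver, one `int` per message); the names `struct pipe_message`, `pipe_sender` and `pass_int_through_pipe` are hypothetical. The value is copied into the kernel's pipe buffer by `write()` and copied out again by `read()`, so the message is passed as data and not as a pointer.

```c
#include <pthread.h>
#include <unistd.h>

struct pipe_message {
    int write_fd;   /* write end of the pipe */
    int value;      /* payload to send */
};

/* Sender thread: copy the value into the pipe, then close the write end. */
static void *pipe_sender(void *arg)
{
    struct pipe_message *msg = arg;
    ssize_t sent = write(msg->write_fd, &msg->value, sizeof msg->value);

    (void)sent;     /* a failed write shows up as a short read below */
    close(msg->write_fd);
    return NULL;
}

/* Send `value` from a new thread to the caller through a pipe.
 * Returns 0 and stores the value in *received on success, -1 on error. */
int pass_int_through_pipe(int value, int *received)
{
    int fds[2];
    struct pipe_message msg;
    pthread_t tid;
    ssize_t got;

    if (pipe(fds) != 0)
        return -1;
    msg.write_fd = fds[1];
    msg.value = value;
    if (pthread_create(&tid, NULL, pipe_sender, &msg) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Blocks until the sender has written or closed its end. */
    got = read(fds[0], received, sizeof *received);
    pthread_join(tid, NULL);
    close(fds[0]);
    return got == (ssize_t)sizeof *received ? 0 : -1;
}
```

Each message costs two system calls and two copies here, which is the overhead the replies in this thread are arguing about.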
regards
Jan |
| Rainer Weikusat... |
Posted: Tue Oct 06, 2009 3:44 am |
|
|
|
Guest
|
Jan Helgesen <spam at (no spam) nospam.com> writes:
Quote: David Schwartz wrote:
Threads all share a vm. If you ran a million cores with a million
threads, anything that changed the vm would require coordination among
all the threads. That would be a pretty hideous synchronization
bottleneck.
That's only a problem if you use the Shared Memory concurrency model.
If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if
you don't share the access to the data, you don't need to use a lock,
nor synchronisation.
This is nonsense. 'Synchronization' is always (and only) an issue
when shared resources are used in potentially conflicting ways, for
instance, to provide communication channels with more than one sender
and one receiver (n:m access, n * m != 0 && (n > 1 || m > 1), please
note that negative numbers of senders or receivers make no
sense). And try to attend an introductory course on
linguistics. Using a nonsense definition ('message passing' is only and
exactly what our code does) is ok for marketing, but not for science. |
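The n:m condition given in this post can be written down as a small predicate. This is just a literal transcription of that one example, with a hypothetical function name; it is not a general test for when synchronization is needed. Unsigned counts are used because, as noted, negative numbers of senders or receivers make no sense.

```c
#include <stdbool.h>

/* True when a channel has at least one sender and one receiver and more
 * than one of either, i.e. n * m != 0 && (n > 1 || m > 1).
 * The product is replaced by two comparisons to avoid unsigned overflow. */
bool channel_needs_synchronization(unsigned senders, unsigned receivers)
{
    return senders != 0 && receivers != 0 &&
           (senders > 1 || receivers > 1);
}
```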
| Rainer Weikusat... |
Posted: Tue Oct 06, 2009 3:54 am |
|
|
|
Guest
|
David Schwartz <davids at (no spam) webmaster.com> writes:
[...]
Quote: If you used shared memory threads though, any sensible implementation
would implement the message passing in shared memory. If threads share
memory, that shared memory will almost always be the fastest IPC
mechanism.
Of course it is. That's why these clowns are using shared memory. And
they have so far 'tested' their theories on quadcore-machines (IIRC)
and already run into 'queueing and dispatching problems' while having
provided the usual performance drop associated with 'one data-copying
message passing library to rule them all'.
A note to the OP: Your universe may resemble a Windows PC very
closely, and even 'PCs' have had SMP support for quite some
time. But that PC SMP has recently become a lot more popular does not imply
that other people haven't already been using 'large multiprocessors'
for a fairly long time. The Tilera chip has 64 cores (I cannot presently
check how this compares with a T2) and this is by no means excessive.
And neither is SoC an innovative new concept. Even Intel has been
building these (ARM-based, e.g. XScale) for years.
| Jan Helgesen... |
Posted: Tue Oct 06, 2009 6:17 am |
|
|
|
Guest
|
David Schwartz wrote:
Quote: If you used shared memory threads though, any sensible implementation
would implement the message passing in shared memory. If threads share
memory, that shared memory will almost always be the fastest IPC
mechanism.
Do you think there is only one way to efficiently implement message
passing? And do you think that the solution you describe is the only
solution?
Please see my other reply to Weikusat for more detailed answers.
regards
Jan |
| Rainer Weikusat... |
Posted: Tue Oct 06, 2009 6:34 am |
|
|
|
Guest
|
David Schwartz <davids at (no spam) webmaster.com> writes:
Quote: On Oct 6, 2:44 am, Rainer Weikusat <rweiku... at (no spam) mssgmbh.com> wrote:
Jan Helgesen <s... at (no spam) nospam.com> writes:
That's only a problem if you use the Shared Memory concurrency model.
If you use the Message Passing concurrency model or other non-shared
data models, then synchronisation is no longer an issue. Because if
you don't share the access to the data, you don't need to use a lock,
nor synchronisation.
This is nonsense. 'Synchronization' is always (and only) an issue
when shared resources are used in potentially conflicting ways, for
instance, to provide communication channels with more than one sender
and one receiver (n:m access, n * m != 0 && (n > 1 || m > 1), please
note that negative numbers of senders or receivers make no
sense). And try to attend an introductory course on
linguistics. Using a nonsense definition ('message passing' is only and
exactly what our code does) is ok for marketing, but not for science.
He is trying to repeat something he doesn't fully understand.
What he wants to say is this:
If you do all your synchronization using message passing, you scale to
more execution vehicles than can reasonably be put in an SMP domain
because you don't have to share an entire vm. Synchronization will be
a bit more expensive than with a shared-memory system, but (hopefully)
there will be so much less synchronization that it will be a net
win.
That's your merciful assumption :->. I am fairly convinced that this
is part of the stern wave of this
http://www.barrelfish.org/
Ok, 'ETH Zuerich' using C and a POSIX-compatible API is in itself a small
revolution; apparently, 'we' are slowly getting past the 'CS research
nuclear winter' which basically commenced after 'the Europeans' and
'the Americans' decided to rather ignore each other something like
thirty years ago. But it is somewhat late and more than somewhat
weird: Insofar as people need a massive amount of 'independent execution
units', they use clustered SMP machines with fast interconnects,
reaping the benefits of both approaches. SGI has[*] and is still
building SMP computers with 1024 cores, cf
http://www.sgi.com/products/servers/altix/4000/features.html
and people have been using these to build _large_ clusters, cf
http://www.sgi.com/company_info/features/2004/oct/columbia/
[I apologize that this becomes somewhat 'advertisy' but I don't see how
I could avoid that].
And this guy (the OP) is writing about 'unsurmountable difficulties'
which will certainly appear in a 256-way SMP machine. That's just one
(binary) order of magnitude more than what 'standard' high-end
UNIX(*)-servers provide and this is not the limit, see links above.
Lastly, the 'original idea' to build asynchronous multiprocessors is
old enough that U. Vahalia considered it to be outdated when writing
UNIX(*) Internals (published 1995).
I am starting to become fairly convinced that something like 'a
Microverse', being a time-space-continuum of its own, not sharing any
memory with 'known space', and only communicating with it by
occasionally sending messages (and never receiving anything) actually
exists ...
All times are GMT - 5 Hours