Science Forum Index  »  Math - Symbolic Forum  »  benchmarking CAS
Vladimir Bondarenko
Posted: Sun Mar 30, 2008 5:57 am
Guest
Thank you for describing the options.

We have thought about some of them. I hope to give you many more details
in private.

I received your letter several days ago and sent two replies to
this address. Neither of them bounced, yet you apparently have not seen them.

Could they have been caught by your spam filter, if you have one?


On Mar 30, 7:34 am, caveman <dev.n...@dev.null.com> wrote:
Quote:
Vladimir Bondarenko wrote:
Hello Nasser,

Thanks for your idea.

Technically, we could do this. Anyways, now we have a small amount of
resource we can dedicate to a concrete system; but we here hope that
the things could look better reasonably soon, hopefully, this year.

As you understand, if we here dedicate something to SAGE, we subtract
this from the above list I quoted.

Not necessarily - you have many options.

1) Release the code under the GPL, letting others adapt it to SAGE.

Of course, if you want to sell your code, which I think you do, this
might not be too attractive.

(As I have said before, my personal belief is your biggest barrier to
commercial exploitation is your unorthodox approach.)

2) Open source the code, but license it under a restrictive license that
expressly forbids usage by commercial users such as Wolfram Research
without payment of a license fee, but allows its free use testing
open-source programs such as SAGE.

Of course WRI could technically ignore your license, but I very much
doubt they would. No WRI manager is going to ask an employee in WRI to
ignore your license.

However, you may be exploited in some ways.

* An employee of WRI could compile your code at home, report bugs at
WRI, not telling the truth how he found them. He would appear a good
employee and gain promotion.

* Normal users would find bugs, report them to WRI, so WRI gain while
others do the work for them.

3) Adapt your code to SAGE (some work on your part), then release a
*binary* package that only works with SAGE and is written in such a way that it
would not work with the commercial programs Mathematica, Maple, MATLAB, etc.

Obviously you could couple this technical restriction with a license
which prevents its adoption by commercial applications.

I've no idea how difficult it might be for someone to write a
SAGE->Mathematica translator, then use your binary code to test
Mathematica by fooling your code.

Again, nobody in WRI is going to officially ask another employee to
write such a translator, but there might be an incentive for some
employees to do this at home.

4) Release a binary package for testing SAGE and use a client/server
approach so that the code needs an internet connection and gets a
license valid for one day only - much the same as FlexLM works. Then if
you thought you were being exploited you could quickly put a stop to it.

PS - you asked me to email you, which I did, but you never replied.

Dave (from the UK).
Jaap Spies
Posted: Sun Mar 30, 2008 7:14 am
Guest
SzH wrote:
Quote:
On Mar 30, 8:57 am, "Nasser Abbasi" <n...@12000.org> wrote:
It would be interesting if your VM machine can add SAGE to the systems it
tests. SAGE is described here:

http://en.wikipedia.org/wiki/Software_for_Algebra_and_Geometry_Experi...

I played with SAGE a little, but I find its interface on Windows, having to
use a browser, a bit awkward to say the least. I have not tried its
text-based interface.

AFAIK Sage is primarily a browser-based notebook interface to other
packages such as Maxima and SciPy. So it is those other packages that
would benefit from testing and not SAGE itself.

This is certainly not true! Sage has its own code base and is much, much
more than only a notebook interface.

See http://www.sagemath.org/
Browse through the code:
http://www.sagemath.org/hg/sage-main?cmd=manifest;manifest=-1;path=/sage/


Jaap
Nasser Abbasi
Posted: Sun Mar 30, 2008 7:46 am
Guest
"Vladimir Bondarenko" <vb@cybertester.com> wrote in message
news:d2159b9f-595a-46d6-b15f-e8a8aa3a7bc6@m3g2000hsc.googlegroups.com...

Quote:
Could you please tell why do you feel that testing SAGE could be really
useful? Useful for whom?

Well, Sage is, I think, the first open-source GPL'ed CAS, so it would be
useful to see how its quality stacks up against commercial software. Maybe
it would help those who are thinking of using an open-source CAS but are not
sure whether Sage is good enough to invest in learning it? Being able to
read the source code and experiment with it is very important for
learning about the algorithms used in computer algebra, all other things
being equal.

Quote:
By the way, maybe you know, approximately, how many persons use SAGE?
Occasionally? On a constant basis?

From
http://thedaily.washington.edu/2008/1/14/sage-insights-open-source-technology/
(dated January 14, 2008); I cut and pasted the relevant lines:

"Typically SAGE gets 1,000 hits per month and our user base is 10,000; about
300 developers are on our [developer] e-mail list. Maple [a similar program]
has about a million users, he added. Stein attributes low usage among
professionals to the fact that some professors and engineers find it
difficult to transfer their calculations to a new program. He said students
appreciate the advantages SAGE provides, such as compatibility with non-open
source programs and its ease of use."

Nasser
rjf
Posted: Sun Mar 30, 2008 7:50 am
Guest
Well, VB managed both to post to an additional distracting newsgroup
(sci.math) and (deliberately?) misinterpret the main question. The
question was whether we could come up with benchmarks that tested the
long-term performance of a CAS (and maybe, whether that mattered or
not). How long it takes to provoke a fatal error by some heuristic
search is a different question.


Thomas Richard actually provided info regarding Maple 11 behavior in
returning memory. It still doesn't answer the more pointed question,
which is... can Maple 11 get into a situation where its memory usage
is so scattered that it runs at "disk speed" even though the actual
data in use is not so voluminous.

Whether SAGE should be tested independently of its components is
hardly a question. Why should SAGE not have bugs in addition to the
bugs in the libraries it uses? On the main question here, as to
whether there are benchmarks that would test efficiency, I assume that
the benchmarks for Maxima (etc) would be relevant to SAGE to the
extent that SAGE is just calling Maxima. To the extent that SAGE has
its own inner loops and memory allocation, it could have its own
problems or successes. I understand that it is essentially a python
program, and thus the python implementation may be a boon or a
disaster, and maybe that is the most prominent factor that can be
identified. (Is there a compacting garbage collector for python?)

I don't understand NA's comment that SAGE is the first GPL CAS, since
it depends on several other predecessors. Or the comment that large
problems in Maple are interpreted.

If we don't raise questions when people continue to compare CAS on
essentially irrelevant grounds, we are in the position of essentially
allowing others who read this newsgroup to accept these comments
as valid.
Jaap Spies
Posted: Sun Mar 30, 2008 8:54 am
Guest
SzH wrote:

Quote:

See http://www.sagemath.org/
Browse through the code: http://www.sagemath.org/hg/sage-main?cmd=manifest;manifest=-1;path=/s...

I don't think that the right way to introduce SAGE to a newbie is to
ask him to browse the system's source code ...


This was certainly not an introduction to Sage, but just showing
the existence of the code base!

Try the tutorial: http://www.sagemath.org/doc/html/tut/index.html

Quote:
SAGE might be a useful program, but the current website is somehow not
convincing (and not informative) enough to persuade me to attempt a
500 MB download of something that will only run on a linux system
running inside an emulator ... Just compare it with e.g. the orders
of magnitude better Yacas website to see what I mean.

You are right! There is no 'about Sage' page on the web site.

Some Sage developers are working hard on a native Windows port of
Sage. This is not a trivial undertaking, but there is support from
Microsoft Research.

Jaap
Mike Hansen
Posted: Sun Mar 30, 2008 9:04 am
Guest
Quote:
I understand that it is essentially a python
program, and thus the python implementation may be a boon or a
disaster, and maybe that is the most prominent factor that can be
identified. (Is there a compacting garbage collector for python?)

Python 2.5 will release memory back to the system, but the CPython
implementation guarantees that it will not move objects around.
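
The two points above (memory being released, objects not being moved) can be observed in a small Python sketch. It uses the standard `tracemalloc` and `gc` modules from current Python 3, not the Python 2.5 discussed here. Note that `tracemalloc` reports what the interpreter's allocator currently holds, which is not the same as what the operating system has taken back. The function name `allocate_and_release` is made up for illustration.

```python
import gc
import tracemalloc


def allocate_and_release(n_items=200_000):
    """Allocate a large list, drop it, and report traced memory at each stage."""
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()

    keeper = object()             # a long-lived object
    address_before = id(keeper)   # in CPython, id() is the object's address

    data = [str(i) for i in range(n_items)]   # temporary bulk allocation
    during, _ = tracemalloc.get_traced_memory()

    del data
    gc.collect()                  # frees garbage; CPython does not relocate objects
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "baseline": baseline,
        "during": during,
        "after": after,
        "address_stable": id(keeper) == address_before,
    }


if __name__ == "__main__":
    print(allocate_and_release())
```

The `after` figure should fall well below `during`, while the address of `keeper` stays the same, which is consistent with a non-compacting collector.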

Quote:
If we don't raise questions when people continue to compare CAS on
essentially irrelevant grounds, we are in the position of essentially
allowing others who read this newsgroup from accepting these comments
as valid.

Agreed.

--Mike
mabshoff
Posted: Sun Mar 30, 2008 9:17 am
Guest
rjf wrote:

Hello,

Quote:
Well, VB managed both to post to an additional distracting newsgroup
(sci.math) and (deliberately?) misinterpret the main question. The
question was whether we could come up with benchmarks that tested the
long-term performance of a CAS (and maybe, whether that mattered or
not). How long it takes to provoke a fatal error by some heuristic
search is a different question.

I agree with you on that point, i.e. the use of toy examples to
benchmark is wrong. It is amazing how much code out there
[non-mathematical and mathematical alike] falls over when you start pushing
it. And the vast majority of benchmarks do not take long-term effects
or even memory consumption into account. The most famous example I can
come up with is Gröbner basis computations comparing the Buchberger
algorithm with F4 or F5.

Quote:
Thomas Richard actually provided info regarding Maple 11 behavior in
returning memory. It still doesn't answer the more pointed question,
which is... can Maple 11 get into a situation where its memory usage
is so scattered that it runs at "disk speed" even though the actual
data in use is not so voluminous.

Well, I think what mostly matters is whether that case actually happens in
real life or whether you have to construct some malicious example to prove
your point. I know next to nothing about the internal memory management
systems of Maple or Mathematica, but the choice isn't really between
"dumb" malloc and a sophisticated gc-enabled Lisp with compaction.
People have been using slab allocators in code for decades, so the
fragmentation issue that potentially happens with some "dumb" heap-based
allocation algorithm has less of an impact in the real world,
IMHO. Heap fragmentation is a serious issue, but if it really is a
problem with the algorithms one wants to implement, one needs to adapt
the algorithm or use various tricks like slab allocators. Those aren't
a magic bullet, but on my list of things to do heap fragmentation is
not very high. I can easily construct an example where I exhaust all
available memory using slab allocators, but I have never hit an
example like that in real life.
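
For readers unfamiliar with the term, the idea behind a slab allocator can be sketched as a toy fixed-size slot pool in Python. This only illustrates the reuse principle (freed slots of one size are handed out again before any new memory is requested). It is not how Maple, Mathematica or Sage manage memory, and the class `SlabPool` and its methods are hypothetical names.

```python
class SlabPool:
    """Toy fixed-size slot pool: freed slots are reused before the pool grows."""

    def __init__(self, slot_size, slots_per_slab=64):
        self.slot_size = slot_size
        self.slots_per_slab = slots_per_slab
        self.slabs = []   # each slab is one contiguous bytearray
        self.free = []    # stack of (slab_index, slot_index) ready for reuse

    def _grow(self):
        """Add one slab and mark all of its slots as free."""
        self.slabs.append(bytearray(self.slot_size * self.slots_per_slab))
        slab_index = len(self.slabs) - 1
        # Push in reverse order so that slot 0 is handed out first.
        self.free.extend(
            (slab_index, i) for i in reversed(range(self.slots_per_slab))
        )

    def alloc(self):
        """Return a handle to a free slot, growing only if none is left."""
        if not self.free:
            self._grow()
        return self.free.pop()

    def release(self, handle):
        """Give a slot back; it becomes the next one to be reused."""
        self.free.append(handle)

    def view(self, handle):
        """Writable view of the bytes belonging to one slot."""
        slab_index, slot_index = handle
        start = slot_index * self.slot_size
        return memoryview(self.slabs[slab_index])[start:start + self.slot_size]


if __name__ == "__main__":
    pool = SlabPool(slot_size=32)
    handles = [pool.alloc() for _ in range(100)]
    for h in handles:
        pool.release(h)
    handles = [pool.alloc() for _ in range(100)]
    print(len(pool.slabs))   # the second round reuses the existing slabs
```

Because every slot in a slab has the same size, a freed slot always fits the next request, so holes of unusable size do not accumulate the way they can on a general-purpose heap.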

Quote:
Whether SAGE should be tested independently of its components is
hardly a question. Why should SAGE not have bugs in addition to the
bugs in the libraries it uses?

Why is that question even relevant to this discussion? Given the
choice between GAP to do some computation related to group theory or
everything else out there what would you choose? Given some
computation related to L-functions would you use John Cremona's eclib,
sympow or lcalc or something else? I am sure that most of the above
computations matter little to most people, but Sage's main goal is
mathematical research in number theory, combinatorics, graph theory
and cryptography and a couple other things. So people tend to push the
code a little harder than your average college student computing some
integrals to do his or her homework.

Sage is more than the sum of its parts and I am under the impression
from previous comments you made in public and private correspondence
that you do not understand the concept behind Sage. Does Sage have
bugs? Yes. But the important thing is that we are fixing those bugs
and I tend to believe that we are moving much more quickly than the
Open Source competition out there.

And regarding memory consumption: I do valgrind the complete Sage test
suite at least once a week and analyze the logs and also investigate
every time somebody does report a potential memory leak. After every
two or three patches I merge I run a test suite consisting of 50,000+
[and steadily growing] inputs consuming about an hour and a half of
CPU time and if things get broken we do fix them. People have been
running computations on sage.math for months at a time with the same
Sage instance and in those particular cases the memory consumption did
not grow. So Sage is not some toy system.
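
The kind of check described above, running the same inputs repeatedly and confirming that memory does not grow, can be approximated at the Python level with the standard `tracemalloc` module. This is only a sketch: it does not see leaks inside C, C++ or Fortran libraries (which is what valgrind is for), and the names `memory_growth`, `clean_workload` and `leaky_workload` are invented for the example.

```python
import gc
import tracemalloc


def memory_growth(workload, warmup=3, rounds=20):
    """Return the growth in traced bytes across `rounds` calls of `workload`."""
    for _ in range(warmup):   # let caches and lazy initialisation settle
        workload()
    gc.collect()
    tracemalloc.start()
    start, _ = tracemalloc.get_traced_memory()
    for _ in range(rounds):
        workload()
    gc.collect()
    end, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return end - start


_retained = []   # hypothetical global used to simulate a leak


def clean_workload():
    """Does some work and keeps nothing afterwards."""
    sum(i * i for i in range(10_000))


def leaky_workload():
    """Keeps a 100 kB buffer alive on every call."""
    _retained.append(bytearray(100_000))


if __name__ == "__main__":
    print("clean:", memory_growth(clean_workload))
    print("leaky:", memory_growth(leaky_workload))
```

A workload whose growth stays near zero over many rounds behaves like the long-running sessions mentioned above; steady growth proportional to the number of rounds points to something being retained.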

Every component in Sage is audited for memory leaks via the doctests
and I have been slowly but surely auditing all components of Sage with
their own test suite with valgrind for memory leaks. I am not done
yet, but will finish this year. All the issues I found have been fixed
upstream. We take memory leaks and bugs in general very seriously. If
a patch is found to cause memory leaks during review, that code is not
merged, regardless of how desirable that feature is. The Sage project did not
set out to do an alright job or come in second best. The declared goal
is to beat Magma on performance and memory efficiency and while we are
not there yet across the board we have made tremendous progress in the
last year and have overtaken Magma in some areas. See

http://sagemath.blogspot.com/2008/02/benchmarketing-modular-hermite-normal.html

for example.

Sage certainly still has its weaknesses (mv poly factorization, no F4/F5
for anything but boolean rings and a couple other things), but we
are aware of those issues and on the way to fixing them. Magma is still
better in a lot of areas, but let's see how things are in two years.

Quote:
On the main question here, as to
whether there are benchmarks that would test efficiency, I assume that
the benchmarks for Maxima (etc) would be relevant to SAGE to the
extent that SAGE is just calling Maxima. To the extent that SAGE has
its own inner loops and memory allocation, it could have its own
problems or successes. I understand that it is essentially a python
program, and thus the python implementation may be a boon or a
disaster,

Sage combines Python, Cython, C and C++ code (and some Fortran),
and depending on the computation you do, the vast majority of memory is
not allocated in the Python instance but in the lower layers. For
example, if you allocate a large number of matrices and delete them, the
memory is returned to the system since Python's garbage collector is
barely involved in that case. Some high-level structures are involved
since we use Python's C API. That raises the potential for heap
fragmentation, but with today's 64-bit CPUs I am more concerned about
the eventual demise of our Sun than the exhaustion of VM space.

Quote:
and maybe that is the most prominent factor that can be
identified. (Is there a compacting garbage collector for python?)

I don't think Python itself can return memory to the system, but it
doesn't matter too much for the reasons cited above.

Quote:
If we don't raise questions when people continue to compare CAS on
essentially irrelevant grounds, we are in the position of essentially
allowing others who read this newsgroup from accepting these comments
as valid.

Cheers,

Michael
Jaap Spies
Posted: Sun Mar 30, 2008 9:31 am
Guest
Vladimir Bondarenko wrote:
Quote:
Michael Abshoff writes



MA> Sage farms out most of those tasks [integration, differentiation,
MA> ODE] to Maxima, which is already tested.

Why sure, I meant precisely this!


Is this true? How far do you test Maxima? What shows up in the newsgroups
are failures of Mathematica and Maple. But maybe I'm wrong and you are doing
extensive tests in Maxima.

Jaap
caveman
Posted: Sun Mar 30, 2008 9:34 am
Guest
Vladimir Bondarenko wrote:
Quote:
Hello Nasser,

Thanks for your idea.

Technically, we could do this. Anyways, now we have a small amount of
resource we can dedicate to a concrete system; but we here hope that
the things could look better reasonably soon, hopefully, this year.

As you understand, if we here dedicate something to SAGE, we subtract
this from the above list I quoted.

Not necessarily - you have many options.

1) Release the code under the GPL, letting others adapt it to SAGE.

Of course, if you want to sell your code, which I think you do, this
might not be too attractive.

(As I have said before, my personal belief is your biggest barrier to
commercial exploitation is your unorthodox approach.)

2) Open source the code, but license it under a restrictive license that
expressly forbids usage by commercial users such as Wolfram Research
without payment of a license fee, but allows its free use testing
open-source programs such as SAGE.

Of course WRI could technically ignore your license, but I very much
doubt they would. No WRI manager is going to ask an employee in WRI to
ignore your license.

However, you may be exploited in some ways.

* An employee of WRI could compile your code at home, report bugs at
WRI, not telling the truth how he found them. He would appear a good
employee and gain promotion.

* Normal users would find bugs, report them to WRI, so WRI gain while
others do the work for them.

3) Adapt your code to SAGE (some work on your part), then release a
*binary* package that only works with SAGE and is written in such a way that it
would not work with the commercial programs Mathematica, Maple, MATLAB, etc.

Obviously you could couple this technical restriction with a license
which prevents its adoption by commercial applications.

I've no idea how difficult it might be for someone to write a
SAGE->Mathematica translator, then use your binary code to test
Mathematica by fooling your code.

Again, nobody in WRI is going to officially ask another employee to
write such a translator, but there might be an incentive for some
employees to do this at home.

4) Release a binary package for testing SAGE and use a client/server
approach so that the code needs an internet connection and gets a
license valid for one day only - much the same as FlexLM works. Then if
you thought you were being exploited you could quickly put a stop to it.
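
As a rough illustration of option 4, a one-day licence could be a signed token that carries its own expiry time. The sketch below uses only Python's standard `hmac`, `hashlib` and `time` modules. It is not a description of how FlexLM works, and the secret and function names are hypothetical. A real scheme would more likely use a public-key signature, so that the shipped binary holds only a verification key rather than the signing secret.

```python
import hashlib
import hmac
import time

SECRET = b"hypothetical-server-secret"   # assumption: held by the licence server
ONE_DAY = 24 * 60 * 60


def _sign(payload):
    """Hex HMAC-SHA256 signature of the payload string."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user, now=None, lifetime=ONE_DAY):
    """Server side: build 'user|expiry|signature', valid for `lifetime` seconds."""
    issued = int(now if now is not None else time.time())
    payload = f"{user}|{issued + lifetime}"
    return f"{payload}|{_sign(payload)}"


def token_is_valid(token, now=None):
    """Check the signature and that the expiry time has not yet passed."""
    try:
        user, expiry, signature = token.rsplit("|", 2)
        expiry_ts = int(expiry)
    except ValueError:
        return False   # malformed token
    expected = _sign(f"{user}|{expiry}")
    current = now if now is not None else time.time()
    untampered = hmac.compare_digest(signature.encode(), expected.encode())
    return untampered and current < expiry_ts


if __name__ == "__main__":
    token = issue_token("tester")
    print(token_is_valid(token))                             # fresh token
    print(token_is_valid(token, now=time.time() + ONE_DAY))  # a day later
```

Stopping misuse then amounts to the server declining to issue the next day's token.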

PS - you asked me to email you, which I did, but you never replied.

Dave (from the UK).
caveman
Posted: Sun Mar 30, 2008 9:47 am
Guest
mabshoff wrote:

Quote:
It runs on various flavors of Linux and also on OSX. Microsoft
Research is paying for a native Windows port and Solaris is getting
close to running. There are many other worthwhile platforms left after
that in my book.


What is your problem with Solaris? I was at one point working with
William on trying to do some work with Solaris and have spent quite a
bit of time attempting compilation. The problem seems to be the large
number of external packages you use, all of which are written by
different people using very different approaches.


Not sure what your latest status is, but if you need hardware, rather
than man-hours, I can probably help with access via SSH to one or more
multi-processor Suns. I know at one time you were limited by hardware
availability. If that is still so, I can help.
Vladimir Bondarenko
Posted: Sun Mar 30, 2008 9:50 am
Guest
RJF> whether we could come up with benchmarks that tested the
RJF> long-term performance of a CAS (and maybe, whether that
RJF> mattered or not).

A very good question. Long-term performance is something the
customers need. Personally, I hate it when, after using Maple 11 for
several days, I must reboot because my machine gets tortoise-like.

In contrast, if Mathematica 6 runs for days, and Maple 11 is not run
over this span of time, there is no tangible performance degradation.
Please note that I am speaking only about the *classic* version, as using
Standard Worksheets quickly makes my machine run like a
sleepy Bradypus.

...

The long-term performance does matter as it is not impossible to
imagine a symbolic/hybrid calculation task that takes a week or
a month.

RJF> and (deliberately?) misinterpret the main question.

I am shocked to hear this viewpoint... I hope you are not
serious...


On Mar 30, 10:50 am, rjf <fate...@gmail.com> wrote:
Quote:
Well, VB managed both to post to an additional distracting newsgroup
(sci.math) and (deliberately?) misinterpret the main question.  The
question was whether we could come up with benchmarks that tested the
long-term performance of a CAS (and maybe, whether that mattered or
not). How long it takes to provoke a fatal error by some heuristic
search is a different question.

Thomas Richard actually provided info regarding Maple 11 behavior in
returning memory.  It still doesn't answer the more pointed question,
which is... can Maple 11 get into a situation where its memory usage
is so scattered that it runs at "disk speed" even though the actual
data in use is not so voluminous.

Whether SAGE should be tested independently of its components is
hardly a question.  Why should SAGE not have bugs in addition to the
bugs in the libraries it uses?  On the main question here, as to
whether there are benchmarks that would test efficiency, I assume that
the benchmarks for Maxima (etc) would be relevant to SAGE to the
extent that SAGE is just calling Maxima. To the extent that SAGE has
its own inner loops and memory allocation, it could have its own
problems or successes.  I understand that it is essentially a python
program, and thus the python implementation may be a boon or a
disaster, and maybe that is the most prominent factor that can be
identified.  (Is there a compacting garbage collector for python?)

I don't understand NA's comment that SAGE is the first GPL CAS, since
it depends on several other predecessors. Or the comment that large
problems in Maple are interpreted.

If we don't raise questions when people continue to compare CAS on
essentially irrelevant grounds, we are in the position of essentially
allowing others who read this newsgroup from accepting these comments
as valid.
Thomas Richard
Posted: Sun Mar 30, 2008 9:54 am
Guest
Roman Pearce <rpearcea@gmail.com> wrote:

Quote:
In Maple memory is allocated but not returned to the operating system.

I think this is no longer correct in general for Maple 11. Earlier versions
did have this problem, AFAIK; maybe the change was already in 10.
Upon a restart (button press or command), you can see the effect in
the Windows task manager or the Unix equivalents of such a tool.

--
Thomas Richard
Maple Support
Scientific Computers GmbH
http://www.scientific.de
SzH
Posted: Sun Mar 30, 2008 10:05 am
Guest
On Mar 30, 9:50 pm, Vladimir Bondarenko <v...@cybertester.com> wrote:
Quote:
RJF> whether we could come up with benchmarks that tested the
RJF> long-term performance of a CAS (and maybe, whether that
RJF> mattered or not).

A very good question. The long-term performance is something the
customers need. Personally, I hate when after using Maple 11 for
several days, I must reboot as my machine gets tortoise-like.

In contrast, if Mathematica 6 runs for days, and Maple 11 is not run
over this span of time, there is no tangible performance degradation.

Here's a very simple experiment with Mathematica 6.0.2 on WinXP. The
results are repeatable on my computer (I ran this several times). I
have 1 GB of RAM.

In[1]:= $HistoryLength = 0
Out[1]= 0

In[2]:= Timing[x = Expand[(a + b + c + d + e + f + g)^34];]
Out[2]= {61.86, Null}

In[3]:= Timing[x =.]
Out[3]= {1.984, Null}

In[4]:= Timing[x = Expand[(a + b + c + d + e + f + g)^34];]
Out[4]= {84.766, Null}

In[5]:= Timing[x =.]
Out[5]= {2.843, Null}

In[6]:= Timing[x = Expand[(a + b + c + d + e + f + g)^34];]
Out[6]= {94.219, Null}

In[7]:= Timing[x =.]
Out[7]= {2.922, Null}

In[8]:= Timing[x = Expand[(a + b + c + d + e + f + g)^34];]
Out[8]= {95.484, Null}

In[9]:= Timing[x =.]
Out[9]= {2.938, Null}


Note how the timing of the Expand[...] increases from ~60 s to ~95 s
in the subsequent evaluations. I don't want to speculate about why
this happens, but I would like to know what other people think.

Two things to watch out for if you start experimenting yourself (I
have /not/ tested these thoroughly, so they might not be repeatable):

1. Each input should be put in a separate input cell. If the
Expand[...] and the x=. are in the same input cell, Mathematica does
not release memory and the system starts swapping ...

2. It appears that if Share[] is executed after Expand[...], but
/before/ x=., then the timing value does not increase.
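
The same repeat-and-time pattern can be applied to other systems. Below is a minimal Python sketch of it, using only the standard `time` and `gc` modules. The workload `build_terms` is a hypothetical stand-in and does not reproduce the Expand[...] computation; `repeat_timing` and `drift_ratio` are likewise names made up for this example.

```python
import gc
import time


def repeat_timing(workload, repeats=4):
    """Time `workload` several times, discarding its result after each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        result = workload()
        timings.append(time.perf_counter() - start)
        del result        # plays the role of `x =.` in the session above
        gc.collect()
    return timings


def drift_ratio(timings):
    """Last timing divided by the first; well above 1 suggests degradation."""
    return timings[-1] / timings[0]


def build_terms(n=200_000):
    """Hypothetical stand-in workload: build a large dictionary of 'terms'."""
    return {(i, i % 7): i for i in range(n)}


if __name__ == "__main__":
    t = repeat_timing(build_terms)
    print(t, drift_ratio(t))
```

A single run of this kind says little; as with the session above, the result is only meaningful if the drift shows up repeatedly on an otherwise idle machine.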

Szabolcs
Dave
Posted: Sun Mar 30, 2008 12:15 pm
Guest
mabshoff wrote:
Quote:
On Mar 30, 4:47 pm, caveman <dev.n...@dev.null.com> wrote:
mabshoff wrote:
It runs on various flavors of Linux and also on OSX. Microsoft
Research is paying for a native Windows port and Solaris is getting
close to running. There are many other worthwhile platforms left after
that in my book.

Hi,

What is your problem with Solaris?

It compiles in 32 bit mode with small modifications, but the test
suite doesn't pass yet.


Quote:
So if you have some
more test machines with say Indiana on Sparc I would be more than
happy to take you up on that offer. Even build and test feedback would
be nice. Since this is getting OT we should take this offlist from now
on or go over to gg:sage-devel.


I'll drop you a private email.
 
All times are GMT - 5 Hours