Syntax for user-defined infix operators...
Page 2 of 2

Dmitry A. Kazakov... (Guest)
Posted: Thu Aug 27, 2009 11:55 am

On Wed, 26 Aug 2009 20:20:31 GMT, bartc wrote:
Quote: I have also looked at numeric constants following dot operators, and it
seemed to cause problems:
seq.12.34

Once I designed a language close to what James proposes with regard to the
operation dot. (The idea is somehow infectious.)
In particular it has the operations "." and ":", which are used to extract
substrings (and numeric slices as well). ":" takes the string prefix, "."
takes its suffix. For example:
"abcdef".3:2 = "cd"
it works as follows:
("abcdef".3) = "cdef"
"cdef":2 = "cd"
After some years of using it, I must concede that it was rather an unwise
choice because of confusion with numeric literals. Exactly as you said.
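
A minimal Python sketch of the two operators and of the literal clash. It assumes 1-based positions, that "." keeps everything from the given position onward, and that ":" keeps the given number of leading characters. The function names and the toy tokenizer are illustrative only and are not taken from the language described.

```python
import re

def suffix_from(s, start):
    """Model of  s.start : the suffix beginning at 1-based position start."""
    return s[start - 1:]

def prefix_of(s, length):
    """Model of  s:length : the first `length` characters."""
    return s[:length]

# "abcdef".3:2 evaluated left to right
step = suffix_from("abcdef", 3)
result = prefix_of(step, 2)
print(step, result)

# A toy lexer that tries float literals first shows why seq.12.34 is awkward:
# the text after the first dot is read as one float, not as two selections.
TOKEN = re.compile(r"\d+\.\d+|\d+|[A-Za-z_]\w*|\.")
tokens = TOKEN.findall("seq.12.34")
print(tokens)
```

A lexer that only knows integer literals after a dot would avoid the clash, at the cost of treating numbers differently depending on position.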
BTW, my motivation was to attempt getting rid of index/function
parentheses. The language has only ordering parentheses. All operations are
either unary or infix. For example: "sin (x)" is merely "sin x"
Quote: Yes, I think I would insist on the parentheses!
Agreed.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Rod Pemberton... (Guest)
Posted: Thu Aug 27, 2009 12:40 pm

"Dmitry A. Kazakov" <mailbox at (no spam) dmitry-kazakov.de> wrote in message
news:zm5nrfntir2z$.212ivhpy3lmn.dlg at (no spam) 40tude.net...
Quote: On Wed, 26 Aug 2009 20:20:31 GMT, bartc wrote:
Once I designed a language close to what James proposes with regard to the
operation dot. (The idea is somehow infectious.)
In particular it has the operations "." and ":", which are used to extract
substrings (and numeric slices as well). ":" takes the string prefix, "."
does its suffix. For example:
"abcdef".3:2 = "cd"
it works as follows:
("abcdef".3) = "cdef"
"cdef":2 = "cd"
Heh! You just reminded me of BASIC. BASIC had/has three substring
operators, i.e., LEFT$, RIGHT$, MID$, and one operator for string
concatenation, +. Even after years of C programming with the immensely
useful and powerful C string functions, I'm always amazed by how much those
four operations could do in BASIC. Unfortunately, C needs a function to do
string concatenation, i.e., strcat()...
Rod Pemberton

bartc... (Guest)
Posted: Thu Aug 27, 2009 2:44 pm

James Harris wrote:
Quote: On 27 Aug, 09:40, "Rod Pemberton" <do_not_h... at (no spam) nohavenot.cmm> wrote:
"Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de> wrote in
message news:zm5nrfntir2z$.212ivhpy3lmn.dlg at (no spam) 40tude.net...
On Wed, 26 Aug 2009 20:20:31 GMT, bartc wrote:
Once I designed a language close to what James proposes with regard
to the operation dot. (The idea is somehow infectious.)
In particular it has the operations "." and ":", which are used to
extract substrings (and numeric slices as well). ":" takes the
string prefix, "." does its suffix. For example:
"abcdef".3:2 = "cd"
it works as follows:
("abcdef".3) = "cdef"
"cdef":2 = "cd"
Heh! You just reminded me of BASIC. BASIC had/has three substring
operators, i.e., LEFT$, RIGHT$, MID$, and one operator for string
concatenation, +. Even after years of C programming with the
immensely useful and powerful C string functions, I'm always amazed
by how much those four operations could do in BASIC. Unfortunately,
C needs a function to do string concatenation, i.e., strcat()...
One could do a lot with mid$ and its friends but they were seriously
horrible, weren't they?
*Far* better, IMHO, is simple string slicing treating the string as an
array of characters.

Suppose you had a string say s="ABCDEF", and you indexed it using:
s[3]
would the result be a character, or a string of length 1?
(For years I've been using a language with the latter approach, and it's
worked well (after all why should s[3] be that different from the slice
s[3..4]), with asc(s[3]) to get the character value.)
But which is better?
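
For comparison, Python takes the second approach: indexing a string yields a string of length 1, and the character code needs an explicit call. A short sketch follows. Python indices are 0-based, so `s[2]` is the third character; the variable names are only for illustration.

```python
s = "ABCDEF"

one = s[2]        # indexing yields a str of length 1, not a separate char type
same = s[2:3]     # the one-element slice is the very same value
code = ord(one)   # explicit conversion to the character code, like asc()

print(type(one).__name__, len(one), one == same, code)
```

The price of this design is that there is no distinct character type to overload on; the benefit is that indexing and slicing compose without conversions.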
--
Bartc

Dmitry A. Kazakov... (Guest)
Posted: Thu Aug 27, 2009 4:09 pm

On Thu, 27 Aug 2009 02:10:47 -0700 (PDT), tm wrote:
Quote: On 26 Aug., 11:02, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
They do not fit into scoped languages,
where different scopes and the same scope may contain identically named
objects with identical / equivalent signatures.
Two objects with identical / equivalent signature in the same
scope? That would mean that there are ambiguous expressions which can
only be resolved by context.

Not even by context, if the signature includes the result, as it should,
then the only way to resolve ambiguity is by using fully qualified names,
which is a reason to have them.
Quote: Ada supports this kind of expression, but I don't think it is a
good idea. In Ada the + operator can be overloaded with the same
argument types and different result type. E.g.: Two + operators, one
with an integer and one with a float result. This way 1+2 may have
an integer or a float result and the context decides which operator
should be used.
Well this is not really ambiguous. In Ada you can have a context where two
objects of equivalent signatures are visible. These cannot be distinguished
otherwise than qualifying the names.
Quote: Seed7 does not allow this kind of overloading. The Seed7 overloading
resolution does not take the result of an expression into account.
This way the overload resolution algorithm works strictly bottom up.
This also makes reading expressions easier for humans.
Humans use both forms. In English you cannot tell if "set" is a noun or a
verb without the context. Fully inflectional languages do exist (sort of
"bottom up"), but they are far more complex to learn and use than English.
Quote: That is why Ada and to a
lesser extent C++ deploy nominal equivalence.
Seed7 - The extensible programming language: User defined statements
Ada and to a lesser extent C++ have ambiguous expressions
resolved by context. Seed7 tries to avoid this problem by using
unambiguous expressions. In some rare cases this looks complicated,
but in the common case it improves the readability.
Well, that depends. Forcing the user to invent names for language reasons is
bad practice. Hungarian notation is the worst example of this plague.
Technically you cannot enforce unique names in a large program. You have to
be able to resolve name clashes in the context. There are only two
thinkable ways: context-local renaming and canonic qualifying. The former
is very obtrusive and totally unreadable when it comes to large-scale
software.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

tm... (Guest)
Posted: Thu Aug 27, 2009 4:59 pm

On 26 Aug., 18:22, James Harris <james.harri... at (no spam) googlemail.com> wrote:
Quote: On 21 Aug, 15:49, Robbert Haarman <comp.lang.m... at (no spam) inglorion.net
wrote:
On Fri, Aug 21, 2009 at 05:54:15AM -0700, James Harris wrote:
I would like to offer to programmers the ability to use the same
syntax as is available for built-in operations so instead of
op(a, b)
the programmer could code
a op b
for a user-defined binary operator, op.
You may want to take a look at how Haskell does it. In Haskell, any function
whose name consists entirely of "symbols" (characters you would normally
expect to find in infix operators) is an infix operator. Other names are
prefix by default. E.g.
12 + 4
div 12 4
You can use an infix operator in prefix position by surrounding it with
parentheses (this is an instance of currying), and you can use a prefix
function in infix position by surrounding it with backticks:
(+) 12 4
12 `div` 4
Thanks, Bob. This is the kind of thing I was looking for. I have my
doubts about the specific syntax used but it shows the same constructs
as I have in mind: turning prefix into infix (and vice versa which is
additional). The result is easy to read though the mechanisms seem
fairly arbitrary.
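
The same two conversions can be sketched in Python, which has no user-defined infix words. The prefix form of a built-in operator comes from the standard `operator` module; the infix form of a named function is imitated here with a small wrapper class, `Infix`, that borrows the `|` operator. `Infix` and `div` are hypothetical names for this sketch, not part of any library.

```python
import operator

class Infix:
    """Wrap a two-argument function so it can be written  a |op| b ."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Handles  a | op : remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Handles  (a | op) | b : supply the right operand and call.
        return self.func(right)

div = Infix(operator.floordiv)

prefix_sum = operator.add(12, 4)   # prefix use of an infix operator, like (+) 12 4
infix_div = 12 |div| 4             # infix use of a named function, like 12 `div` 4
print(prefix_sum, infix_div)
```

Because the wrapper rides on `|`, every such operator inherits the priority and left associativity of `|`, which illustrates the concern that infix notation without its own priority and associativity is less handy than built-in operators.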
So you suggest that all functions with two parameters
can be called prefix and infix?
Some random thoughts:
Why do you want two notations for the same thing?
I am not sure that inconsistent use of infix and
prefix notation will improve readability.
Or do you propose different application areas, such
as calling the function or referring to the function
object itself, for infix and prefix notation?
Functions with three or more parameters probably
don't have this infix/prefix possibility.
Infix notation with stropping and without priority
and associativity is probably not as handy as usual.
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Charles Lindsey... (Guest)
Posted: Thu Aug 27, 2009 8:00 pm

In <6c22a3e0-4f63-49b6-a2b6-4e7f88fb4f52 at (no spam) c29g2000yqd.googlegroups.com> James Harris <james.harris.1 at (no spam) googlemail.com> writes:
Quote: On 21 Aug, 15:49, Robbert Haarman <comp.lang.m... at (no spam) inglorion.net
wrote:
On Fri, Aug 21, 2009 at 05:54:15AM -0700, James Harris wrote:
You can use an infix operator in prefix position by surrounding it with
parentheses (this is an instance of currying), and you can use a prefix
function in infix position by surrounding it with backticks:
(+) 12 4
12 `div` 4
Thanks, Bob. This is the kind of thing I was looking for. I have my
doubts about the specific syntax used but it shows the same constructs
as I have in mind: turning prefix into infix (and vice versa which is
additional). The result is easy to read though the mechanisms seem
fairly arbitrary.
Ah! At last we have got there. Algol 68 solved this whole problem 40 years
ago. Syntactically, it is no problem, but you DO need two alphabets of
letters. So you have 26 letters which you use to construct identifiers (NO
reserved words needed) and you have another 26 letters which you use for
Types and Operators (and maybe a few reserved words like BEGIN, END,
DO...).
So how to distinguish them? Various alternatives exist:
For Publication in pretty Journals, you use Bold for the 2nd alphabet (a
good old ALGOL convention).
For practical programming, you use the Upper Case letters.
And if you are still stuck in a world without any upper/lower case
distinction (as was common in 1968) you use a "stropping convention"
(usually apostrophes).
So you can have a DIV b
or A 'DIV' B
and you can define Types (ALGOL 68 called them 'Modes') with
MODE LINK = STRUCT(REF LINK head, tail)
which does away with all the trouble you get if you mis-spell some
'typedef' in C.
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Web: http://www.cs.man.ac.uk/~chl
Email: chl at (no spam) clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

robin... (Guest)
Posted: Thu Aug 27, 2009 8:07 pm

"James Harris" <james.harris.1 at (no spam) googlemail.com> wrote in message
news:a20767c7-b537-4e6b-9caa-b8db7746e14a at (no spam) o32g2000yqm.googlegroups.com...
Quote: Opinions sought....
Many (maybe most) languages accept symbols as infix operators for
binary (two-operand) operations such as
x + y
Some also predefine words as infix operators such as Pascal's
i div j
I would like to offer to programmers the ability to use the same
syntax as is available for built-in operations so instead of
op(a, b)
the programmer could code
a op b
for a user-defined binary operator, op.
The problem with this is that we have, effectively, three adjacent
words.
You might like to look at Fortran, which offers that facility
(namely to define new operators).

tm... (Guest)
Posted: Thu Aug 27, 2009 10:22 pm

On 27 Aug., 20:53, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de>
wrote:
Quote: On Thu, 27 Aug 2009 06:29:32 -0700 (PDT), tm wrote:
On 27 Aug., 14:09, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
On Thu, 27 Aug 2009 02:10:47 -0700 (PDT), tm wrote:
On 26 Aug., 11:02, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
They do not fit into scoped languages,
where different scopes and the same scope may contain identically named
objects with identical / equivalent signatures.
Two objects with identical / equivalent signature in the same
scope? That would mean that there are ambiguous expressions which can
only be resolved by context.
Not even by context, if the signature includes the result, as it should,
IMHO a function should be identified by its name and its parameters.
The type of the result should not be needed to identify a function.
Counterexample is represented by parameterless functions and named
constants. Numeric and string literals are such things. If you have several
numeric types you need to overload their literals as well as operations.
Maybe for Ada these are counterexamples, but for Seed7 they are not.
In Seed7 a parameterless function and a named constant are both
identified just with the name (but attribute parameters can be used
to attach these names to a type or even several types). The type of
a Seed7 literal is also unambiguous:
5 ... integer literal
'a' ... character literal
1.2 ... float literal
"ab" ... string literal
3_ ... bigInt literal
Please keep in mind that I don't talk about a 'dream' language as
many in this group do. Seed7 is implemented, it can be downloaded
and these concepts work. Just try it.
Quote: OTOH a bottom up overloading resolution algorithm is easy to
implement and easy to understand for humans. This way it is also
easy to see why a compiler complains. With a nontrivial overloading
resolution algorithm it can happen that humans think that something
is unambiguous, but the compiler complains and the reason for the
error is not obvious.
But anybody would expect:
A : Array (1..20) := (others => 0);
Without the context (array of 20 elements), you cannot resolve the array
aggregate of indefinite bounds, unknown index type and unknown type of the
elements.
This is a typical Ada construct. For Ada your reasoning is probably
ok, but for Seed7 a different view is necessary.
I assume that the Ada example above declares an array of integer
elements and when an element outside the allowed index range is
accessed the value 0 should be returned.
In Seed7 an array is declared this way:
var array integer: A is 20 times 0;
The expression '20 times 0' creates the array value of 20 integer
elements with the value 0. This value is assigned to the variable
'A' which has the type 'array integer'. Accessing elements outside
the allowed range of 1 to 20 results in a RANGE_ERROR exception. To
support your Ada example it would be necessary to define an improved
array type which allows a value to be assigned for indices outside the
array elements. Since Seed7 supports abstract data types, such an improved
array type can be defined with them. I just prefer to omit the
implementation here and continue as if it were already defined. The
'times' operator could be extended to something like
20 times 1 others 0
which specifies 20 integer elements with value 1 and the value
0 for all elements outside the allowed range. The declaration would
look like:
var improvedArray integer: A is 20 times 1 others 0;
Assigning an 'others' value later could be done with
A.others := 3;
As you can see: A slight shift in focus and the Seed7 world can
adapt to such needs without ambiguous expressions.
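
The behaviour described for the improved array can be sketched in Python rather than Seed7. `ImprovedArray` is a hypothetical class written for this illustration; it assumes indices 1..count hold the ordinary elements and that every other index reports the `others` value.

```python
class ImprovedArray:
    """Hypothetical stand-in for the 'improvedArray' type described above."""

    def __init__(self, count, value, others):
        self.items = [value] * count   # elements for indices 1..count
        self.others = others           # value reported outside that range

    def __getitem__(self, index):
        if 1 <= index <= len(self.items):
            return self.items[index - 1]
        return self.others

a = ImprovedArray(20, 1, others=0)     # models: 20 times 1 others 0
inside, outside = a[20], a[21]

a.others = 3                           # models: A.others := 3
outside_later = a[21]
print(inside, outside, outside_later)
```

Every argument of the constructor has a known type before the array is built, so nothing about the expression depends on the declaration it is assigned to.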
Quote: then the only way to resolve ambiguity is by using fully qualified names,
which is a reason to have them.
Ada supports this kind of expression, but I don't think it is a
good idea. In Ada the + operator can be overloaded with the same
argument types and different result type. E.g.: Two + operators, one
with an integer and one with a float result. This way 1+2 may have
an integer or a float result and the context decides which operator
should be used.
Well this is not really ambiguous.
So you know the result type of 1+2 in the example above?
Why should I know it? Technically in Ada it is Universal_Integer,
Correct when + is only defined for integers. But what happens when
+ has been overloaded with:
function "+"(LEFT, RIGHT: INTEGER) return REAL;
In this case you don't know. The Ada compiler will take this
function or the original one depending on the context.
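
The difference between the two resolution styles can be sketched with a toy overload table in Python. Everything here is an assumption made for illustration: the table layout, the type names as strings, and both function names. In this sketch the bottom-up resolver reports the conflict at the use site, whereas the discussion says Seed7 does not allow such a pair of declarations at all.

```python
# Hypothetical overload table: (name, argument types) -> possible result types.
# The second result for integer "+" models the extra "+" that returns REAL.
OVERLOADS = {
    ("+", ("integer", "integer")): ["integer", "real"],
    ("+", ("real", "real")): ["real"],
}

def resolve_bottom_up(name, arg_types):
    """Arguments alone must select exactly one function."""
    candidates = OVERLOADS.get((name, tuple(arg_types)), [])
    if len(candidates) != 1:
        raise TypeError(f"{name}: {len(candidates)} candidates for {arg_types}")
    return candidates[0]

def resolve_with_context(name, arg_types, expected):
    """The expected result type takes part in the selection."""
    candidates = [r for r in OVERLOADS.get((name, tuple(arg_types)), [])
                  if r == expected]
    if len(candidates) != 1:
        raise TypeError(f"{name}: {len(candidates)} candidates for {arg_types}")
    return candidates[0]

as_real = resolve_with_context("+", ["integer", "integer"], "real")
as_int = resolve_with_context("+", ["integer", "integer"], "integer")
try:
    resolve_bottom_up("+", ["integer", "integer"])
    bottom_up_ok = True
except TypeError:
    bottom_up_ok = False
print(as_real, as_int, bottom_up_ok)
```

With the context available both uses of 1+2 resolve; without it the same sub-expression has two candidates, which is the point under dispute.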
Quote: but for
the sake of argument, there is no reason to define it in absence of the
target.
In Ada you can have a context where two
objects of equivalent signatures are visible. These cannot be distinguished
otherwise than qualifying the names.
Exactly for this reason I think the Ada way of overloading is wrong.
Actually it is never a problem. In 99% cases when qualified names are used
then with generic instances. But generics is an abomination by itself in
any language. C++ is strictly bottom up, yet templates there are a sheer
horror.
Seed7 supports functions with type parameters and type result. They
are executed at compile time and have the power of templates/generics
without introducing a special syntax. An example "template" which
defines for loops for a given type is here:
http://seed7.sourceforge.net/examples/for_decl.htm
Quote: In case of overloading resolution the Ada one is far more
complex to learn and use than the Seed7 one.
No need to learn them, if you are not a compiler designer. The programmer
just uses the names he wants. The compiler rejects illegal choices.
And you don't know why the compiler rejects it. Maybe another
compiler does accept it.
Quote: This is
automatically more friendly, because the body of name clashes is smaller in
Ada than seems to be in Seed7.
I have probably more knowledge about Ada than you about Seed7
and I have a different view. "Do what I mean" concepts which are
not understood by the programmer are a possible source of undetected
errors. The programmer has one concept but the compiler has
a different view and this may lead to erroneous behaviour.
Quote: The golden rule of language design - do not introduce arbitrary
constraints.
Come on, this is not an arbitrary constraint. Doing overloading
without taking the result of a function into account is a natural
concept. Ask people about the result type of
1 + 2
and
1.5 + 3
They will tell you that the first expression has an 'integer'
result and the second one has a 'float' result. Nobody will assume
that there is another + operator which adds two 'integers' but has
a 'float' result. Therefore they will not ask you:
I can only tell you when I know where the expressions are used.
People use the same bottom up algorithm for overload resolution as
Seed7. They identify the + in '1 + 2' as integer addition and the +
in '1.5 + 3' as float addition.
People are instinctively aware that the bottom up overloading
resolution determines the type of every expression and subexpression
unambiguously.
The bottom up overloading resolution just seems arbitrary when you
are influenced by the Ada overloading concept.
IMHO the bottom up overloading organizes the concept of overloading
just the same way as structured statements organize the flow of
control. Statements like 'while' are seemingly less powerful than a
spaghetti program with 'goto' statements, nevertheless most
programmers prefer structured statements.
Quote: Technically you cannot enforce unique names in a large program. You have to
be able to resolve name clashes in the context. There are only two
thinkable ways: context-local renaming and canonic qualifying. The former
is very obtrusive and totally unreadable when it comes to a large scale
software.
I agree, but different libraries usually work with different types so
most things are resolved by normal overloading without the need to
invent strange names. Seed7 has also a feature called attribute
parameter which can be used to attach functions to a type. This
further reduces the need to invent strange names. Attribute
parameters are explained together with class methods here:
http://seed7.sourceforge.net/manual/objects.htm#class_methods
An example of an object declared with an attribute parameter is:
const char: (attr char) . value is ' ';
This attaches '.value' to the type char. To use this constant just
write:
char.value
Seed7 uses this concept to attach default values to all types.
Attribute parameters are not reduced to expressions with '.'.
They can be used in normal functions:
const func circle: create (in integer: radius, attr circle) is
return circle(radius);
This attaches the function 'create' to the type 'circle'. This
function can be called with:
create(10, circle)
Overloading 'create' with several other attribute types is also
possible. As you can see the author of a library has ways to
avoid name clashes to some extent when the library is designed.
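
A rough Python analogue of the attribute parameter: the type itself is passed as an ordinary argument and selects which `create` runs. The registry `_CREATORS` and the `Square` class are hypothetical additions for this sketch; Seed7 resolves the overload at compile time, while this version looks it up at run time.

```python
from dataclasses import dataclass

@dataclass
class Circle:
    radius: int

@dataclass
class Square:
    side: int

# Hypothetical registry: the type passed as the last argument picks the overload.
_CREATORS = {
    Circle: lambda size: Circle(radius=size),
    Square: lambda size: Square(side=size),
}

def create(size, kind):
    """Model of create(10, circle): `kind` plays the role of the attr parameter."""
    return _CREATORS[kind](size)

my_circle = create(10, Circle)
my_square = create(10, Square)
print(my_circle, my_square)
```

The call site names the wanted type explicitly, so the choice never depends on where the result is assigned.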
This move redresses the lack of the result type in the form of a fake
parameter, ...
The attribute parameter allows class functions in a more
elegant way.
Quote: ... just to make it *overloadable*, as it should have been right
from the start.
What is right and what is wrong depends on the point of view
(see above).
Quote: Why is this better than obvious:
My_Circle : Circle := Create (10);
1. Because this is obviously only obvious for Ada people.
2. For a "procedure" (which has no result and therefore cannot be
assigned) this approach is not possible.
Quote: Back to the qualifying as you see it. Seed7 does qualifying of
objects without parameters as
myModule.anElement
This is supported in structs and will be also done this way in the
(to be implemented) modules/packages. For objects with parameters
I prefer
myModule.(1+2)
or
myModule.((in integer param) + (in integer param))
over
1 myModule.+ 2
In Ada it is, in the case of a conflict or when + is not directly visible:
My_Module."+" (1, 2)
Ah, the "infix operator is a function and vice versa" concept.
Btw.: I am interested in a critical view of my concepts.
In case I don't sound like it, let me repeat: Criticism is welcome.
Maybe you should take a look at the Seed7 homepage to show weak
points in my concepts even better.
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

Dmitry A. Kazakov... (Guest)
Posted: Thu Aug 27, 2009 10:53 pm

On Thu, 27 Aug 2009 06:29:32 -0700 (PDT), tm wrote:
Quote: On 27 Aug., 14:09, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
On Thu, 27 Aug 2009 02:10:47 -0700 (PDT), tm wrote:
On 26 Aug., 11:02, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
They do not fit into scoped languages,
where different scopes and the same scope may contain identically named
objects with identical / equivalent signatures.
Two objects with identical / equivalent signature in the same
scope? That would mean that there are ambiguous expressions which can
only be resolved by context.
Not even by context, if the signature includes the result, as it should,
IMHO a function should be identified by its name and its parameters.
The type of the result should not be needed to identify a function.
Counterexample is represented by parameterless functions and named
constants. Numeric and string literals are such things. If you have several
numeric types you need to overload their literals as well as operations.
Quote: OTOH a bottom up overloading resolution algorithm is easy to
implement and easy to understand for humans. This way it is also
easy to see why a compiler complains. With a nontrivial overloading
resolution algorithm it can happen that humans think that something
is unambiguous, but the compiler complains and the reason for the
error is not obvious.
But anybody would expect:
A : Array (1..20) := (others => 0);
Without the context (array of 20 elements), you cannot resolve the array
aggregate of indefinite bounds, unknown index type and unknown type of the
elements.
Quote: then the only way to resolve ambiguity is by using fully qualified names,
which is a reason to have them.
Ada supports this kind of expression, but I don't think it is a
good idea. In Ada the + operator can be overloaded with the same
argument types and different result type. E.g.: Two + operators, one
with an integer and one with a float result. This way 1+2 may have
an integer or a float result and the context decides which operator
should be used.
Well this is not really ambiguous.
So you know the result type of 1+2 in the example above?
Why should I know it? Technically in Ada it is Universal_Integer, but for
the sake of argument, there is no reason to define it in absence of the
target.
Quote: In Ada you can have a context where two
objects of equivalent signatures are visible. These cannot be distinguished
otherwise than qualifying the names.
Exactly for this reason I think the Ada way of overloading is wrong.
Actually it is never a problem. In 99% cases when qualified names are used
then with generic instances. But generics is an abomination by itself in
any language. C++ is strictly bottom up, yet templates there are a sheer
horror.
Quote: In case of overloading resolution the Ada one is far more
complex to learn and use than the Seed7 one.
No need to learn them, if you are not a compiler designer. The programmer
just uses the names he wants. The compiler rejects illegal choices. This is
automatically more friendly, because the body of name clashes is smaller in
Ada than seems to be in Seed7.
The golden rule of language design - do not introduce arbitrary
constraints.
Quote: Technically you cannot enforce unique names in a large program. You have to
be able to resolve name clashes in the context. There are only two
thinkable ways: context-local renaming and canonic qualifying. The former
is very obtrusive and totally unreadable when it comes to a large scale
software.
I agree, but different libraries usually work with different types so
most things are resolved by normal overloading without the need to
invent strange names. Seed7 has also a feature called attribute
parameter which can be used to attach functions to a type. This
further reduces the need to invent strange names. Attribute
parameters are explained together with class methods here:
http://seed7.sourceforge.net/manual/objects.htm#class_methods
An example of an object declared with an attribute parameter is:
const char: (attr char) . value is ' ';
This attaches '.value' to the type char. To use this constant just
write:
char.value
Seed7 uses this concept to attach default values to all types.
Attribute parameters are not reduced to expressions with '.'.
They can be used in normal functions:
const func circle: create (in integer: radius, attr circle) is
return circle(radius);
This attaches the function 'create' to the type 'circle'. This
function can be called with:
create(10, circle)
Overloading 'create' with several other attribute types is also
possible. As you can see the author of a library has ways to
avoid name clashes to some extent when the library is designed.
This move redresses the lack of the result type in the form of a fake
parameter, just to make it *overloadable*, as it should have been right
from the start.
Why is this better than obvious:
My_Circle : Circle := Create (10);
Quote: Back to the qualifying as you see it. Seed7 does qualifying of
objects without parameters as
myModule.anElement
This is supported in structs and will be also done this way in the
(to be implemented) modules/packages. For objects with parameters
I prefer
myModule.(1+2)
or
myModule.((in integer param) + (in integer param))
over
1 myModule.+ 2
In Ada it is, in the case of a conflict or when + is not directly visible:
My_Module."+" (1, 2)
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Dmitry A. Kazakov... (Guest)
Posted: Sat Aug 29, 2009 12:20 pm

On Fri, 28 Aug 2009 14:01:34 -0700 (PDT), tm wrote:
Quote: On 28 Aug., 18:48, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
To me user-defined scalar types are paramount.
Seed7 supports user defined enumeration types and subtypes of scalar
(and other) types (see below for a link to an example).
But no integers, reals etc.
This is wrong. Seed7 supports subtypes of integers and floats also.
I wrote about types. If user-defined integer *types* are supported, do they
have literals? If they do, then there shall be contexts where 1 may mean
more than one type.
Quote: The expression '20 times 0' creates the array value of 20 integer
elements with the value 0.
But already this is ambiguous in a language like Ada.
But not in Seed7. In the expression '20 times 0' the 'times'
operator uses the index type 'integer' and the lower bound 1.
Why Integer and not Byte, Priority_Level, Identity_No etc?
I did not say that the index must be 'integer'. I wrote:
Seed7 also supports arrays where the
index type is not 'integer'.
which includes Byte, Priority_Level, Identity_No etc.
In that case 20 times 0 is "ambiguous" because 20 may refer to
Priority_Levels Idle, Very_Low, Low etc, or Bytes from 0 to 19, or
whatever.
Quote: Ada is a strongly typed language you
cannot get anything wrong because of overloading. Any ambiguity is treated
as an error.
As I showed above a sub-expression like 1+2 can be ambiguous.
Ambiguous in Ada = illegal. You cannot compile an illegal program. No harm
can happen.
You obviously don't try to follow my arguments.
No, "ambiguity" is a language term. You are using it in a psychological
context of some imaginary layman, who would consider some language
constructs "ambiguous" or not. This is fruitless, because in order to make
statements about psychology or sociology, one should conduct scientific
experiments. Otherwise it is all a matter of taste.
As an OO programmer I am accustomed to generic programming, that is
programming in terms of sets of types (AKA classes). So an expression like
1+2 renders to me to a class of additive types probably of the structure of
a ring or a group, with an operation + and elements 1 and 2. This is quite
enough to grasp the idea (semantics) of the program. The concrete types
involved are of no interest so long as my strongly typed language has checked
them OK.
----------------
Anyway. The language design point is rather simple. In presence of
user-defined scalar types (=types that have literals), you have
semantically overloaded literals. Period.
Quote: This is
automatically more friendly, because the body of name clashes is smaller in
Ada than seems to be in Seed7.
I have probably more knowledge about Ada than you about Seed7
and I have a different view. "Do what I mean" concepts which are
not understood by the programmer are a possible source of undetected
errors.
I don't see how it applies here. If there are conflicting interpretations
of a construct, the program is rejected in Ada.
When the programmer does not know how the overload resolution works
he might think that 'a' has type Latin-1 instead of UCS-2. Here I am
referring to your example
('a' => 'a', 'b' => 'b')
where you said:
"where 'a' on the left is not 'a' on the right"
The programmer can think anything he wants. What is the problem? So long as
there is no ambiguity, everything is OK.
You think that a program that compiles without errors is OK?
Certainly yes, if its semantics corresponds to the programmer's intention. The
converse is wrong.
If the programmer's intention was to index UCS-2 string using Latin-1
index, why should the language forbid this? Consider it as a task:
The task: create a map of Latin-1 characters to the Unicode UCS-2 code
positions. Then create a To_Lower map.
In Ada this task can be accomplished in a way I used above:
type Latin_to_UCS is array (Character) of Wide_Character;
To_Lower_Case : Latin_to_UCS :=
( 'a' | 'A' => 'a', 'b' | 'B' => 'b', -- and so on
);
It is natural, obvious and elegant to denote Latin-1 'a' and Unicode 'a'
using the same literal. Why should it be otherwise?
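For readers who want to try the same task outside Ada, here is a minimal Python sketch. It is not from the thread: the name build_to_lower_map is hypothetical, and Python has no overloaded character literals, so the Latin-1 side is modelled as an octet value (0..255) and the Unicode side as a str.

```python
# Illustrative sketch only; build_to_lower_map is a hypothetical name.
def build_to_lower_map():
    """Map each Latin-1 code (0..255) to its lower-case Unicode character."""
    table = {}
    for code in range(256):
        # Latin-1 codes coincide with the first 256 Unicode code points.
        character = bytes([code]).decode("latin-1")
        table[code] = character.lower()
    return table


to_lower_case = build_to_lower_map()

# The index is a Latin-1 octet, the value is a Unicode string.
print(to_lower_case[b"A"[0]])  # prints a
```

The point of the Ada version survives the translation: the key and the value denote "the same letter" in two different types, and the context decides which one is meant.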
Quote: People are instinctively aware that the bottom up overloading
resolution determines the type of every expression and subexpression
unambiguously.
This is an unsupported claim.
Proof: When I asked you for the type of 1+2 you answered:
"Technically in Ada it is Universal_Integer"
You probably used a bottom up approach.
No, I did not. Ada declares all integer literals to be of the type
Universal_Integer, which is automatically converted to the particular
integer or modular type. The effect is as if types had literals of their
own. That does not change the semantics. You can treat 1+2 as
Unsigned_64'(1) + Unsigned_64'(2).
When the + is overloaded with
function "+"(LEFT, RIGHT: INTEGER) return REAL;
the expression 1+2 may return a 'REAL' result as well as an
'INTEGER' one.
And Unsigned_8, and Integer_16, and Long_Integer, and a potentially infinite
number of types. Why should I care without a context?
Quote: There is a subtle but important difference, which can illustrate the
advantages of Ada's model. The standard requires all static numeric
expressions to be evaluated exactly. Consider the following:
type T is range 1..2; -- Has only 1 and 2 values
X : T := 1024 / 512; -- This is OK!
Though neither 1024 nor 512 belongs to T, the compiler is required to accept
this program because 1024 / 512 is statically 2, which is in T. Observe that
an attempt to qualify the types involved in a bottom-up manner would
produce an illegal program,
There is a cast involved when the integer 1024 / 512 is assigned
to T. Seed7 would just require that this cast is explicit instead
of implicit.
So casting to a known type is supposed to be readable? I prefer a language
where I am not forced to cast obvious expressions.
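The difference between the two readings of X : T := 1024 / 512 can be modelled in a few lines of Python. This is only a sketch under stated assumptions: RangeType, top_down and bottom_up are hypothetical names, and a run-time range check stands in for what an Ada compiler decides statically.

```python
# Illustrative model only; RangeType is a hypothetical stand-in for
# "type T is range 1..2" and checks at run time, not at compile time.
class RangeType:
    def __init__(self, first, last):
        self.first = first
        self.last = last

    def convert(self, value):
        """Return value if it lies in first..last, otherwise raise ValueError."""
        if not self.first <= value <= self.last:
            raise ValueError(f"{value} not in {self.first}..{self.last}")
        return value


T = RangeType(1, 2)


def top_down():
    # Evaluate the whole expression exactly, then check only the result.
    return T.convert(1024 // 512)


def bottom_up():
    # Give every literal the type T first; 1024 is already out of range.
    return T.convert(1024) // T.convert(512)


print(top_down())  # prints 2
```

Calling bottom_up() raises ValueError on the first literal, which mirrors the "illegal program" mentioned in the quoted text.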
Quote: 1+2 perfectly fits the concept. You want to add things at the higher
abstraction level (on the TOP). If the compiler grasps your idea,
everything is OK and you both are happy. If it complains, you descend one
level below (DOWN) and consider what types are involved, etc. It is
just a matter of productivity, comfort and SAFETY. Consider the awful integer
literals of different lengths in C. You have to specify 1L, and if the
type is changed later you will have to revise the program. That is error prone.
A strongly typed language would at least tell you where you need
to change something.
Yes, and this argument equally refutes what you wrote about "ambiguities".
Quote: Attribute parameters are a feature that can be used for different
purposes.
Overloading is such a feature. Programmers are not advised to extensively
use it, but in certain cases they have to.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
| James Harris... |
Posted: Sun Aug 30, 2009 10:58 am |
|
|
|
Guest
|
On 27 Aug, 11:44, "bartc" <ba... at (no spam) freeuk.com> wrote:
Quote: James Harris wrote:
On 27 Aug, 09:40, "Rod Pemberton" <do_not_h... at (no spam) nohavenot.cmm> wrote:
"Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de> wrote in
message news:zm5nrfntir2z$.212ivhpy3lmn.dlg at (no spam) 40tude.net...
On Wed, 26 Aug 2009 20:20:31 GMT, bartc wrote:
Once I designed a language close to what James proposes with regard
to the operation dot. (The idea is somehow infectious.)
In particular it has the operations "." and ":", which are used to
extract substrings (and numeric slices as well). ":" takes the
string prefix, "." does its suffix. For example:
"abcdef".3:2 = "cd"
it works as follows:
("abcdef".3) = "cdef"
"cdef":2 = "cd"
Heh! You just reminded me of BASIC. BASIC had/has three substring
operators, i.e., LEFT$, RIGHT$, MID$, and one operator for string
concatenation, +. Even after years of C programming with the immensely
useful and powerful C string functions, I'm always amazed by how much
those four operations could do in BASIC. Unfortunately, C needs a
function to do string concatenation, i.e., strcat()...
One could do a lot with mid$ and its friends but they were seriously
horrible, weren't they?
*Far* better, IMHO, is simple string slicing treating the string as an
array of characters.
Suppose you had a string say s="ABCDEF", and you indexed it using:
s[3]
would the result be a character, or a string of length 1?
(For years I've been using a language with the latter approach, and it's
worked well (after all why should s[3] be that different from the slice
s[3..4]), with asc(s[3]) to get the character value.)
But which is better?
This could be a big question. As no replies yet I've started a new
thread with the query above.
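As one data point for the question, Python takes the "string of length 1" route. The short sketch below only shows that behaviour; note that Python indexes from 0, so the post's s[3] corresponds to s[2] here, and ord plays the role of the post's asc.

```python
s = "ABCDEF"

third = s[2]        # Python counts from 0, so this is the third character
as_slice = s[2:3]   # a slice covering just that one position
code = ord(third)   # plays the role of asc(s[3]) in the post

# Indexing yields a str of length 1, equal to the one-element slice.
print(type(third).__name__, third == as_slice, code)  # prints str True 67
```

There is no separate character type at all in this design, which is the "latter approach" described in the quoted post.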
James
| Marco van de Voort... |
Posted: Sun Oct 04, 2009 10:15 pm |
|
|
|
Guest
|
On 2009-08-27, bartc <bartc at (no spam) freeuk.com> wrote:
Quote: *Far* better, IMHO, is simple string slicing treating the string as an
array of characters.
Suppose you had a string say s="ABCDEF", and you indexed it using:
s[3]
would the result be a character, or a string of length 1?
(For years I've been using a language with the latter approach, and it's
worked well (after all why should s[3] be that different from the slice
s[3..4]), with asc(s[3]) to get the character value.)
But which is better?
Depends on your definition of character. Is s[3] really a character, a
codepoint or just the granularity of your encoding (e.g. 2 with UTF-16,
while characters can be multiple 32-bit codepoints in theory)?
| Marco van de Voort... |
Posted: Mon Oct 05, 2009 8:07 am |
|
|
|
Guest
|
On 2009-10-05, Dmitry A. Kazakov <mailbox at (no spam) dmitry-kazakov.de> wrote:
Quote:
But which is better?
Depends on your definition of character. Is s[3] really a character, a
codepoint or just the granularity of your encoding (e.g. 2 with UTF-16,
while characters can be multiple 32-bit codepoints in theory)
I think that a character string should obviously consist of characters, where
each character is a code point independently of the encoding. Or better to
say, it does not matter which encoding String has. That is an implementation
detail.
But in Unicode, (printable) characters can be multiple codepoints. Especially
languages that allow combining accents need this, IIRC.
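A small Python sketch makes the three candidate meanings of "length" concrete. The helper name sizes is hypothetical; the library calls (str.encode and unicodedata.normalize) are standard.

```python
import unicodedata


def sizes(text):
    """Count the same text in code points, UTF-8 octets and UTF-16 code units."""
    return {
        "code_points": len(text),
        "utf8_octets": len(text.encode("utf-8")),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9
g_clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range

for sample in (decomposed, composed, g_clef):
    print(sizes(sample))
```

The first two samples are displayed as the same accented letter, yet they differ in every count, and the third is one code point that needs two UTF-16 code units.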
The trouble with using code points as characters is that to access s[n] you
have to parse the string until you find character n. Might be fine for
a scripting language, but can be a performance killer.
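The cost described here shows up as soon as the storage is UTF-8. The following is a minimal sketch, assuming valid UTF-8 input; utf8_offset_of and codepoint_at are hypothetical names, not functions from any library.

```python
def utf8_offset_of(data, n):
    """Return the octet offset of code point number n (0-based) in UTF-8 data."""
    count = 0
    for offset, octet in enumerate(data):
        # Continuation octets look like 0b10xxxxxx; any other octet
        # starts a new code point.
        if (octet & 0xC0) != 0x80:
            if count == n:
                return offset
            count += 1
    raise IndexError("code point index out of range")


def codepoint_at(data, n):
    """Return code point number n as a str; the scan is linear in the offset."""
    start = utf8_offset_of(data, n)
    end = start + 1
    while end < len(data) and (data[end] & 0xC0) == 0x80:
        end += 1
    return data[start:end].decode("utf-8")


data = "a\u00e9\u20acz".encode("utf-8")  # 1 + 2 + 3 + 1 octets
print(utf8_offset_of(data, 3))  # prints 6
```

Every lookup starts again from the first octet, so a loop over all indices is quadratic in the length of the string.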
Quote: To bring a particular encoding into the picture one should have strings of
octets (for UTF-8), strings of words (UTF-16) etc. These are different types
which basically have nothing to do with String.
I'm talking about any realistic choice to be used as internal storage inside
"String", and how it translates to String's properties. Other types not
included.
| Dmitry A. Kazakov... |
Posted: Mon Oct 05, 2009 12:52 pm |
|
|
|
Guest
|
On Mon, 5 Oct 2009 08:07:11 +0000 (UTC), Marco van de Voort wrote:
Quote: On 2009-10-05, Dmitry A. Kazakov <mailbox at (no spam) dmitry-kazakov.de> wrote:
But which is better?
Depends on your definition of character. Is s[3] really a character, a
codepoint or just the granularity of your encoding (e.g. 2 with UTF-16,
while characters can be multiple 32-bit codepoints in theory)
I think that a character string should obviously consist of characters, where
each character is a code point independently of the encoding. Or better to
say, it does not matter which encoding String has. That is an implementation
detail.
But in Unicode, (printable) characters can be multiple codepoints. Especially
languages that allow combining accents need this, IIRC.
You are right, but that is an insanity the Unicode guys have inflicted
upon us. IMO it is hopeless to maintain the "character = glyph" idea. I
would ignore that stuff and stop at the code point.
Quote: The trouble with using code points as characters is that to access
s[n] you have to parse the string until you find character n. Might be fine
for a scripting language, but can be a performance killer.
To bring a particular encoding into the picture one should have strings of
octets (for UTF-8), strings of words (UTF-16) etc. These are different types
which basically have nothing to do with String.
I'm talking about any realistic choice to be used as internal storage inside
"String", and how it translates to String's properties. Other types not
included.
The internal representation can be UCS-4 (memory is cheap) or UTF-8 with an
index to speed up search. The language should not specify this. In practice
I would favor a UTF-8 internal representation with a cached octet index of
the last fetched character. This will suffice for almost all cases of
indexing. Normally characters are scanned either forward or backward.
Random access to string characters is practically never used.
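A rough Python sketch of this idea follows. It is one possible reading of the proposal, not a definitive implementation: the class name CachedUtf8String and the steps counter are hypothetical, and the cache is not protected against concurrent access.

```python
class CachedUtf8String:
    """UTF-8 storage that remembers where the last fetched character starts."""

    def __init__(self, text):
        self._data = text.encode("utf-8")
        self._length = len(text)  # number of code points
        self._cached_index = 0    # code point index of the last fetch
        self._cached_offset = 0   # octet offset of that code point
        self.steps = 0            # octets examined while seeking (illustration)

    def _starts_code_point(self, offset):
        # Continuation octets have the bit pattern 0b10xxxxxx.
        return (self._data[offset] & 0xC0) != 0x80

    def __len__(self):
        return self._length

    def __getitem__(self, n):
        if not 0 <= n < self._length:
            raise IndexError("string index out of range")
        index, offset = self._cached_index, self._cached_offset
        while index < n:  # walk forward from the cached position
            offset += 1
            self.steps += 1
            if self._starts_code_point(offset):
                index += 1
        while index > n:  # or walk backward
            offset -= 1
            self.steps += 1
            if self._starts_code_point(offset):
                index -= 1
        self._cached_index, self._cached_offset = index, offset
        end = offset + 1
        while end < len(self._data) and not self._starts_code_point(end):
            end += 1
        return self._data[offset:end].decode("utf-8")


s = CachedUtf8String("a\u00e9\u20acz")
forward = [s[i] for i in range(len(s))]
print(forward, s.steps)
```

With the cache, a forward or backward scan examines each octet once in total, while a jump to a distant index still costs a walk from the cached position.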
There could be issues with concurrent access to the string cache, but I
presume all objects, including strings, to be allocated on the stack;
otherwise the language should handle shared objects properly anyway.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

All times are GMT