 |
|
| Computers Forum Index » Computer Languages (Misc) » Syntax for user-defined infix operators... |
|
|
| Author |
Message |
| James Harris... |
Posted: Fri Aug 21, 2009 12:54 pm |
|
|
|
Guest
|
Opinions sought....
Many (maybe most) languages accept symbols as infix operators for
binary (two-operand) operations such as
x + y
Some also predefine words as infix operators such as Pascal's
i div j
I would like to offer to programmers the ability to use the same
syntax as is available for built-in operations so instead of
op(a, b)
the programmer could code
a op b
for a user-defined binary operator, op.
The problem with this is that we have, effectively, three adjacent
words. The Pascal example only looks right because 1) i and j look
like variable names, 2) div sounds a little like an operation, 3)
Pascal reserves div so it is a known word.
Of course a programmer defining an operation could ensure that
something like a verb was used and could also ensure variables are
something like nouns but is that enough?
I'm thinking of using a modifier symbol. Some languages modify the
variables such as
$a op $b
It may be better to modify the operation such as
a $op b
where $ as a prefix indicates that the word is an operation. Maybe a
suffix or both prefix and suffix would be better. Maybe a different
symbol should be used.
Anyone have suggestions for a syntax that makes the operation clear? I
should say this is intended to be for both unary and binary operators.
Operators with more operands would have to come before the operands
such as op(a, b, c). Whatever notation is used may be best defined as
optional - i.e. just used for clarity where needed. What do you think?
James |
|
|
|
|
|
| Robbert Haarman... |
Posted: Fri Aug 21, 2009 6:49 pm |
|
|
|
Guest
|
On Fri, Aug 21, 2009 at 05:54:15AM -0700, James Harris wrote:
Quote:
I would like to offer to programmers the ability to use the same
syntax as is available for built-in operations so instead of
op(a, b)
the programmer could code
a op b
for a user-defined binary operator, op.
You may want to take a look at how Haskell does it. In Haskell, any function
whose name consists entirely of "symbols" (characters you would normally
expect to find in infix operators) is an infix operator. Other names are
prefix by default. E.g.
12 + 4
div 12 4
You can use an infix operator in prefix position by surrounding it with
parentheses (this is an instance of currying), and you can use a prefix
function in infix position by surrounding it with backticks:
(+) 12 4
12 `div` 4
Moreover, you can define associativity and priority using "fixity
declarations", where you declare whether your function is left-associative
(infixl), right-associative (infixr), or non-associative (infix), and how strongly
it binds (0 being weakest, 9 being strongest; normal function application
has a strength of 10). E.g.
infixr 6 +++
declares +++ to be right-associative with a binding strength of 6.
A little example of definition and usage:
-- | Kripke's quus function.
-- Behaves like +, but returns 5 if either operand is 56 or greater
x `quus` y
| x < 56 && y < 56 = x + y
| otherwise = 5
-- | Tests if x is more or less equal to y
x +- y = (x >= y * 0.95) && (x <= y * 1.05)
main = do
print (4 `quus` 5)
print (56 `quus` 3)
print (10 +- 11)
print (105 +- 100)
Disclaimer: I am not a Haskell programmer, so these examples may not be
idiomatic Haskell.
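For anyone who wants to experiment with the idea outside Haskell, here is a minimal Python sketch of a parser driven by a fixity table. It uses precedence climbing, which is simply my choice for the illustration and not a claim about how any Haskell compiler works. The names FIXITY, tokenize, parse and parse_expression are made up for the example.

```python
import re

# Hypothetical fixity table: operator -> (binding strength, associativity).
# Adding an entry plays the role of a fixity declaration.
FIXITY = {
    "+": (6, "left"),
    "-": (6, "left"),
    "*": (7, "left"),
    "^": (8, "right"),
}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/^<>=!&|]+|[()])")


def tokenize(text):
    """Split text into numbers, names, runs of operator symbols and parentheses."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_atom(tokens):
    """Parse a number, a name or a parenthesised expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if tok == ")" or tok in FIXITY:
        raise SyntaxError(f"unexpected {tok!r}")
    return int(tok) if tok.isdigit() else tok


def parse(tokens, min_strength=0):
    """Precedence climbing; returns nested tuples of the form (op, left, right)."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in FIXITY:
        strength, assoc = FIXITY[tokens[0]]
        if strength < min_strength:
            break
        op = tokens.pop(0)
        # A left-associative operator must not absorb an operator of equal
        # strength on its right; a right-associative one may.
        next_min = strength + 1 if assoc == "left" else strength
        left = (op, left, parse(tokens, next_min))
    return left


def parse_expression(text):
    """Parse a whole string and reject leftover tokens."""
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected {tokens[0]!r}")
    return tree


# A user-defined operator only needs a new table entry.
FIXITY["+++"] = (6, "right")
print(parse_expression("a +++ b +++ c"))
```

The parser itself never changes when an operator is added; only the table does. The sketch does not reject mixing left- and right-associative operators of the same strength, which a real implementation would probably want to do.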
Regards,
Bob
--
But I ask you, what can a mathematician do without a sponge? |
|
|
|
|
|
| Dmitry A. Kazakov... |
Posted: Fri Aug 21, 2009 10:19 pm |
|
|
|
Guest
|
On Fri, 21 Aug 2009 07:43:22 -0700 (PDT), tm wrote:
Quote: On 21 Aug., 15:21, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
And what is the problem when the user defines new operator symbols
with priority and associativity?
But you need the priorities in the parser to make it work.
Seed7 has solved this problem with a table driven LL(1) parser.
It is possible to reconstruct the parser dynamically each time the programmer
defines an operator, but that will open a can of worms.
It is not necessary to reconstruct the parser dynamically. Just the
tables need to be changed.
For example nested
operations declarations will change the syntax of their scope.
Nesting of syntax declarations is not supported in Seed7.
This is the point.
Quote: Nobody will
be able to understand such a program or errors spilled by the compiler.
While there is always room to improve, the error messages of the
'hi' (Seed7) interpreter are useful.
It is impossible to produce reasonable error messages if syntax is fluid.
(That is the reason why in natural languages syntax is the most
conservative part. Otherwise we would be unable to understand each other in
the presence of errors and uncertainties.)
Quote: A pragmatic approach is to fix all operations and their priorities. I.e.
there is predefined symbol + with the priority lower than *, etc.
No, it is just not necessary to fix all operator symbols and their
priorities in the compiler.
If you don't support scoped declarations.
Quote: You can introduce a special syntax for op(x,y,z).
Are you suggesting an additional notation used when the operator
is defined?
Not only. It is also necessary for fully qualified names. Again, I assume
scoping. I also assume that operations are treated equivalently, so that
brackets, commas and the membership operation . have the same treatment in the
language as + or *.
Quote: Is 'y' the operator symbol and 'op' used for all
operators? An additional function style notation for operator
symbols is IMHO a bad idea.
If operation is not a proper name, then, trivially, you cannot use it in
the contexts where a proper name is expected. You cannot get rid of all
such contexts. Consider referencing an operation as an object rather than
calling it. So if you maintain a distinction, you have to be able to get a
proper name of an operation.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de |
|
|
|
|
|
| James Harris... |
Posted: Wed Aug 26, 2009 11:55 am |
|
|
|
Guest
|
On 23 Aug, 09:03, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de>
wrote:
....
Quote: These priorities and associations are NOT necessary. The plan is to
define all such user-defined infix operators as unspecified precedence
and association - at least for now. Then they would need parentheses
to explicitly define the order of application.
OK, but that would take the charm of infix notation away, you would have
parentheses anyway.
Parentheses would be there but they would appear differently. Instead
of
binop.(binop.(a, b), c)
we would have
(a binop b) binop c
which, I think, is more readable.
Quote:
However many levels of priorities become a problem, like in C. I think a
reasonable solution is 4-6 levels plus association rules that forbid
certain combination of the operations of equal priority. Ada deploys this
technique. For example:
x and y or z -- Illegal
Logical "and" may not share operands with "or" (they have the same priority).
x and (y or z) -- This is OK
Yes. And releasing the initial version of the language with few
precedences (thus requiring parentheses) allows a few additional
precedences to be supported in later versions of the language, if that
appears sensible.
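To make the "no precedence, so parenthesise" rule concrete, here is a small Python sketch. It is stricter than Ada (which does allow "x and y and z"): every operator word has unspecified precedence and association, so any chain of two operators without parentheses is rejected. The operator set and function names are assumptions for the example only.

```python
import re

# Hypothetical operator words, all with unspecified precedence and association.
OPERATORS = {"and", "or", "binop"}

# Names, numbers and parentheses; anything else is silently skipped in this sketch.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[()]")


def tokenize(text):
    return TOKEN.findall(text)


def parse_operand(tokens):
    """operand := name | number | '(' expr ')'"""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if tok == ")" or tok in OPERATORS:
        raise SyntaxError(f"unexpected {tok!r}")
    return tok


def parse(tokens):
    """expr := operand [ OP operand ]  (at most one operator per nesting level)"""
    left = parse_operand(tokens)
    if tokens and tokens[0] in OPERATORS:
        op = tokens.pop(0)
        right = parse_operand(tokens)
        if tokens and tokens[0] in OPERATORS:
            raise SyntaxError(
                f"'{op}' and '{tokens[0]}' have no relative precedence; add parentheses"
            )
        return (op, left, right)
    return left


print(parse(tokenize("(a binop b) binop c")))
print(parse(tokenize("x and (y or z)")))
```

Adding precedence levels later only relaxes the check, so programs written against the strict rule stay valid.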
....
Quote: You can
introduce a special syntax for op(x,y,z). For example, C++ uses operator op
to construct a proper name of op. Ada uses "op". I think these are good
pragmatic solutions, especially because the syntax op(x,y,z) is rarely used
with operations.
Not sure what you mean here. By a proper name of op you mean a name
for a symbol?
I think you already have this in the form <op><dot>. I.e. if "+" is an
operation then its proper name is "+.". So
1 + 2
+.(1, 2)
integrate (array_of_data, +., *.) // Passing operations as parameters
I don't think that dot suffix is a good choice. You probably wanted to use
dot as an operation. Traditionally the record member extraction operation
is denoted as dot.
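For comparison, Python gives each operator a proper name through its standard operator module, so an operation can be passed as an object without being called. The integrate function below is a hypothetical stand-in for the integrate(array_of_data, +., *.) call above; its rectangle-rule behaviour and the step parameter are my assumptions.

```python
import operator
from functools import reduce


def integrate(data, add, mul, step=1.0):
    """Hypothetical example: sum of value * step, using the operations passed in."""
    return reduce(add, (mul(value, step) for value in data))


# operator.add and operator.mul are the proper names of + and *.
print(integrate([1, 2, 3], operator.add, operator.mul))
print(integrate([1, 2, 3], operator.add, operator.mul, step=0.5))
```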
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
James |
|
|
|
|
|
| James Harris... |
Posted: Wed Aug 26, 2009 1:52 pm |
|
|
|
Guest
|
On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
Quote: James Harris wrote:
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Ah - good question. I'll need to explain a bit more.
The first reason is for simple consistency.
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
Sequences are mappings from the index to the element. In a similar
way, function calls can be seen as mappings. They may not return the
same value each time (nor do arrays) but functions do map inputs to
outputs. So they too get the same format for element reference.
function.argument or arguments
In any of the above if an element reference is an expression or a
tuple or a range it needs to be enclosed in parentheses but otherwise
the parens are optional.
The second reason is that it seems best for data structures to be
constructable by features rather than predefined by names such as
"list" or "vector". Some structures will be simple - such as an array
or a record. Others will be arbitrarily complex. In all cases the idea
is that a dot binds the structure to the subcomponent specification.
This allows arbitrary data structures to be treated as simple ones.
For example, take a FAT-formatted floppy disk. It has a boot sector,
two FATs, a root directory and a data area. If the floppy disk is
represented by variable f we might specify some of its components as
f.fat2 selects all of the second FAT
f.root_dir selects all of the root directory
f.data_sector.(15) selects sector 15 of the data area
As the last example shows, components can themselves be composites.
Field data_sector is part of f. It is also subscripted showing it has
subcomponents. It refers to sector 15 in the data area.
To implement the above the underlying floppy structure, f, may be a
simple array of sectors. It would offset its data_sector field to the
correct starting sector.
Notably, this is intended to work in exactly the same way whether we
have a real floppy disk or just a floppy disk image, and whether there
is caching or not.
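A rough Python sketch of the floppy example, in case it helps: the disk is any indexable sequence of sectors and each named component is a view with its own starting offset. The sector numbers assume a standard 1.44 MB FAT12 layout (1 boot sector, two 9-sector FATs, a 14-sector root directory, data from sector 33); the class names Region and Floppy are invented for the illustration.

```python
SECTOR_SIZE = 512


class Region:
    """A view onto a run of sectors inside a larger sector sequence."""

    def __init__(self, sectors, start, count):
        self._sectors = sectors
        self._start = start
        self._count = count

    def __len__(self):
        return self._count

    def __getitem__(self, index):
        if not 0 <= index < self._count:
            raise IndexError(index)
        # The offset is applied here, so callers index from 0 within the region.
        return self._sectors[self._start + index]


class Floppy:
    """Assumed 1.44 MB FAT12 layout; 'sectors' may be a real device or an image."""

    def __init__(self, sectors):
        self.boot = Region(sectors, 0, 1)
        self.fat1 = Region(sectors, 1, 9)
        self.fat2 = Region(sectors, 10, 9)
        self.root_dir = Region(sectors, 19, 14)
        self.data_sector = Region(sectors, 33, len(sectors) - 33)


# A fake image: 2880 sectors, each filled with a byte derived from its number.
image = [bytes([n % 256]) * SECTOR_SIZE for n in range(2880)]
f = Floppy(image)

sector = f.data_sector[15]   # corresponds to f.data_sector.(15)
print(len(f.fat2), sector[0])
```

Because Floppy only relies on indexing, swapping the list for a caching wrapper or a device object would not change the code that uses f.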
The third reason is to do with keeping open the option to parse what I
call command format but hopefully the above is enough to show why I
use dot in seq.(index) and mapping.(arg...).
How does this look? Dotty? :-(
James |
|
|
|
|
|
| bartc... |
Posted: Wed Aug 26, 2009 4:14 pm |
|
|
|
Guest
|
James Harris wrote:
Quote: Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
--
bartc |
|
|
|
|
|
| James Harris... |
Posted: Wed Aug 26, 2009 4:22 pm |
|
|
|
Guest
|
On 21 Aug, 15:49, Robbert Haarman <comp.lang.m... at (no spam) inglorion.net>
wrote:
Quote: On Fri, Aug 21, 2009 at 05:54:15AM -0700, James Harris wrote:
I would like to offer to programmers the ability to use the same
syntax as is available for built-in operations so instead of
op(a, b)
the programmer could code
a op b
for a user-defined binary operator, op.
You may want to take a look at how Haskell does it. In Haskell, any function
whose name consists entirely of "symbols" (characters you would normally
expect to find in infix operators) is an infix operator. Other names are
prefix by default. E.g.
12 + 4
div 12 4
You can use an infix operator in prefix position by surrounding it with
parentheses (this is an instance of currying), and you can use a prefix
function in infix position by surrounding it with backticks:
(+) 12 4
12 `div` 4
Thanks, Bob. This is the kind of thing I was looking for. I have my
doubts about the specific syntax used but it shows the same constructs
as I have in mind: turning prefix into infix (and vice versa which is
additional). The result is easy to read though the mechanisms seem
fairly arbitrary.
If the + sign was a symbol which meant the add operation my intention
was that these should be equivalent
12 add 4
12 + 4
And if add was part of the "integer" library they should be equivalent
to
12 integer.add 4
Turning that into Haskell-esque and adding a couple of other operations
12 integer.`add` 4
12 integer.`sub` 4
12 integer.`mul` 4
Other options: first, an asterisk prefix
12 integer.*add 4
12 integer.*sub 4
12 integer.*mul 4
An asterisk suffix
12 integer.add* 4
12 integer.sub* 4
12 integer.mul* 4
Or maybe using both looks better
12 integer.*add* 4
12 integer.*sub* 4
12 integer.*mul* 4
I'm not sure if any of these would fall foul of the parser and the
asterisk be recognised as a multiplication symbol....
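A quick way to probe that worry is a toy tokenizer. The Python sketch below tries the hypothetical marked-operator form *name* before a plain asterisk; the token names and the rule order are my assumptions, not part of the proposal.

```python
import re

# Rule order matters: the marked-operator pattern is tried before plain '*'.
TOKEN = re.compile(r"""
    (?P<marked>\*[A-Za-z_]\w*\*)   # hypothetical marked operator, e.g. *add*
  | (?P<name>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<symbol>[.*+\-()])
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, text) pairs; whitespace between tokens is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


print(tokenize("12 integer.*add* 4"))
print(tokenize("a * b * c"))
print(tokenize("a*b*c"))
```

With these rules the intended form tokenizes fine, and "a * b * c" is still two multiplications, but "a*b*c" written without spaces is read as the operator *b* applied to a and c. So the both-sides asterisk form would need either mandatory whitespace around multiplication or a different marker.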
....
Quote: A little example of definition and usage:
-- | Kripke's quus function.
-- Behaves like +, but returns 5 if either operand is 56 or greater
x `quus` y
| x < 56 && y < 56 = x + y
| otherwise = 5
-- | Tests if x is more or less equal to y
x +- y = (x >= y * 0.95) && (x <= y * 1.05)
Quote:
main = do
print (4 `quus` 5)
print (56 `quus` 3)
print (10 +- 11)
print (105 +- 100)
Disclaimer: I am not a Haskell programmer, so these examples may not be
idiomatic Haskell.
No need. That's very neat.
James |
|
|
|
|
|
| Dmitry A. Kazakov... |
Posted: Wed Aug 26, 2009 6:22 pm |
|
|
|
Guest
|
On Wed, 26 Aug 2009 06:52:02 -0700 (PDT), James Harris wrote:
Quote: On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
James Harris wrote:
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Ah - good question. I'll need to explain a bit more.
The first reason is for simple consistency.
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
Yes, but for a record the field is a name, while for an array the index is an expression.
So, record selection should probably be:
record.field.
because
record.field
in your notation could rather mean - take a variable named field and index
record by the value of field.
Traditionally record member is considered itself an operation. So it is the
operation <dot><field-name> which is applied to record, rather than the
operation <select> applied to the arguments record and field. The
distinction is important. Because the former can give birth to methods, all
distinct according to the names of the fields. With <select> you have only
one method, which limits design to flat containers. Further the signature
of <select> has statically same result type (or no type), so statically all
record components would have one type (or none). This type would
dynamically be resolved to the actual specific types at run-time. I.e. you
force yourself to dynamic typing and only dynamic typing.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de |
|
|
|
|
|
| James Harris... |
Posted: Wed Aug 26, 2009 7:51 pm |
|
|
|
Guest
|
On 26 Aug, 19:46, "bartc" <ba... at (no spam) freeuk.com> wrote:
Quote: "James Harris" <james.harri... at (no spam) googlemail.com> wrote in message
news:169f6574-7d8d-4b06-8e14-96916468b0d8 at (no spam) b14g2000yqd.googlegroups.com...
On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
James Harris wrote:
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Ah - good question. I'll need to explain a bit more.
The first reason is for simple consistency.
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
So the dot selects a field or array element, with the parentheses needed for
array elements because otherwise there would be confusion (but you go on to
say these are optional).
(The confusion, assuming you allow seq.12 even though it looks odd, is that
seq.i could be selecting a field i, or indexing an array with i. With my
languages that would be ambiguous because both kinds of i are allowed into
the namespace at the same time, and a record can be both field selected and
indexed!)
The expression "seq.12" is OK and would be the same as "seq.(12)" (the
expression in parens, 12, resolves to, er, 12 so is fine) but if i is
an integer the expression would need to be "seq.(i)". On the other
hand "seq.i" would try to select a field called i (which wouldn't
exist in an array) and wouldn't compile. To be clear, if i is 12
seq.12
seq.(12)
seq.(i)
seq.(8 + 4)
would all mean the same thing.
seq.i
would be invalid for an array.
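The rule can be sketched in a few lines of Python. Here arrays are modelled as lists and records as dicts, a bare name after the dot is always a field name, and only a literal or a parenthesised expression is evaluated as an index. The resolve function and the use of eval are shortcuts for the illustration, not a suggested implementation.

```python
import re

BINDING = re.compile(
    r"^(?P<base>[A-Za-z_]\w*)\."
    r"(?:(?P<literal>\d+)|\((?P<expr>.+)\)|(?P<field>[A-Za-z_]\w*))$"
)


def resolve(source, env):
    """Evaluate 'base.selector' against the variables in env."""
    match = BINDING.match(source)
    if match is None:
        raise SyntaxError(source)
    base = env[match.group("base")]
    if match.group("literal") is not None:
        return base[int(match.group("literal"))]
    if match.group("expr") is not None:
        # eval keeps the sketch short; a real compiler would parse the expression.
        index = eval(match.group("expr"), {"__builtins__": {}}, dict(env))
        return base[index]
    # A bare name is a field name and is never looked up as a variable.
    field = match.group("field")
    if isinstance(base, dict) and field in base:
        return base[field]
    raise TypeError(f"{match.group('base')} has no field {field!r}")


env = {"seq": list(range(100, 120)), "i": 12, "point": {"x": 3, "y": 4}}
for text in ("seq.12", "seq.(12)", "seq.(i)", "seq.(8 + 4)", "point.x"):
    print(text, "->", resolve(text, env))
```

Calling resolve("seq.i", env) raises an error here, mirroring the "wouldn't compile" case.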
Quote:
That's OK although most array syntax looks like seq[i] or seq(i).
(I did have vaguely similar ideas once, where I grouped my complex objects
into two: multiple values (lists and arrays), and compound values usually
considered a single object (such as records and strings).
I'm with you up until not regarding strings as multiple objects. In
terms of indexing I don't see how they are different from arrays
and lists.
Quote: So I used two
indexing methods:
array[i] i'th element
record.[i] i'th field (as well as regular fields)
string.[i] i'th character
integer.[i] i'th bit
Having the two methods caused some subtle problems however so now I just
have regular a[i] indexing for everything)
OK.
James |
|
|
|
|
|
| James Harris... |
Posted: Wed Aug 26, 2009 9:56 pm |
|
|
|
Guest
|
On 26 Aug, 21:20, "bartc" <ba... at (no spam) freeuk.com> wrote:
....
Quote: seq.12
seq.(12)
seq.(i)
seq.(8 + 4)
would all mean the same thing.
seq.i
would be invalid for an array.
It seems then that indexing in this language will usually be seq.(index), so
that the ability to do seq.intconst would be an exception.
I have also looked at numeric constants following dot operators, and it
seemed to cause problems:
seq.12.34
Is this seq.(12).(34), or seq.(12.34)? And at the lexical level, abc.123
looks at first like a name followed by a floating point constant value. For
that matter, does seq .12 work? What about seq..950 (seq.(0.950))? (I'm
assuming floating indices will be converted to integers). Or:
define twelve=12
seq.twelve?
Yes, I think I would insist on the parentheses!
Thanks for explaining your findings. They are useful. In fact the
integer literal suffix of a sequence is not meant to be a virtue,
merely an effort towards consistency and a consequence of certain
design decisions. My main concern was whether people would find the
language had too many dots. You are right that seq.(12).(34) would be
valid - a reference to a two-dimensional structure or a reference to a
one-dimensional component of another one-dimensional component.
Plain floating point literals - i.e. those without an exponent - would
need at least one digit before and one after the decimal point so
there would be no ambiguity. That said, I'm undecided as yet whether
in general to permit whitespace before a binding dot. On one hand I
don't want to make the language too whitespace-sensitive - which
suggests allowing and ignoring any whitespace. On the other hand some
restrictions can make the language safer - which suggests outlawing
whitespace before the dot.
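Here is a small Python check of the "one digit before and one after the point" rule, using a naive longest-match tokenizer. The token names are mine, and the tokenizer ignores whitespace, which is one of the two options discussed above.

```python
import re

# A plain float literal needs at least one digit on each side of the point.
PLAIN_FLOAT = re.compile(r"\d+\.\d+")

TOKEN = re.compile(
    r"(?P<float>\d+\.\d+)|(?P<int>\d+)|(?P<name>[A-Za-z_]\w*)|(?P<dot>\.)"
)


def is_plain_float(text):
    return PLAIN_FLOAT.fullmatch(text) is not None


def tokenize(text):
    """Return (kind, text) pairs; whitespace is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


for text in ("seq .12", "seq..950", "seq.12.34"):
    print(text, "->", tokenize(text))
```

Under this rule ".12" and ".950" are never floats, so "seq .12" and "seq..950" cause no float confusion. The naive tokenizer does still read "seq.12.34" as seq followed by the float 12.34, so reading it as seq.(12).(34) would need the lexer to know it is just after a binding dot, or the parentheses to be required.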
I'm also not decided but am tending away from making any implicit data
conversion that loses information (including the typical conversion
from 32-bit integer to 32-bit float which loses precision) so I may
disallow floating point indices unless they have been truncated or
rounded.
Finally, seq.twelve would be a field reference or, more generally, a
reference to a variable called twelve in the namespace called seq. It
could not refer to a variable called twelve in the active
namespace..... so it would need to be seq.(twelve).
Again, thanks for the pointers. I'm happier now about including the
dots!
James |
|
|
|
|
|
| bartc... |
Posted: Wed Aug 26, 2009 10:46 pm |
|
|
|
Guest
|
"James Harris" <james.harris.1 at (no spam) googlemail.com> wrote in message
news:169f6574-7d8d-4b06-8e14-96916468b0d8 at (no spam) b14g2000yqd.googlegroups.com...
Quote: On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
James Harris wrote:
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Ah - good question. I'll need to explain a bit more.
The first reason is for simple consistency.
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
So the dot selects a field or array element, with the parentheses needed for
array elements because otherwise there would be confusion (but you go on to
say these are optional).
(The confusion, assuming you allow seq.12 even though it looks odd, is that
seq.i could be selecting a field i, or indexing an array with i. With my
languages that would be ambiguous because both kinds of i are allowed into
the namespace at the same time, and a record can be both field selected and
indexed!)
That's OK although most array syntax looks like seq[i] or seq(i).
(I did have vaguely similar ideas once, where I grouped my complex objects
into two: multiple values (lists and arrays), and compound values usually
considered a single object (such as records and strings). So I used two
indexing methods:
array[i] i'th element
record.[i] i'th field (as well as regular fields)
string.[i] i'th character
integer.[i] i'th bit
Having the two methods caused some subtle problems however so now I just
have regular a[i] indexing for everything)
--
Bartc |
|
|
|
|
|
| bartc... |
Posted: Thu Aug 27, 2009 12:20 am |
|
|
|
Guest
|
James Harris wrote:
Quote: On 26 Aug, 19:46, "bartc" <ba... at (no spam) freeuk.com> wrote:
"James Harris" <james.harri... at (no spam) googlemail.com> wrote in message
news:169f6574-7d8d-4b06-8e14-96916468b0d8 at (no spam) b14g2000yqd.googlegroups.com...
On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
James Harris wrote:
I have dot as a "subordinacy" or "binding" operator. For records
the notation is record.field. For indexable sequences or any
other type of mapping including function calls the notation is
seq.(index) or mapping.(arg1, arg2, ... argN). Of course, a dot
is also used in floating point numbers.
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
So the dot selects a field or array element, with the parentheses
needed for array elements because otherwise there would be confusion
(but you go on to say these are optional).
(The confusion, assuming you allow seq.12 even though it looks odd,
is that seq.i could be selecting a field i, or indexing an array
with i. With my languages that would be ambiguous because both kinds
of i are allowed into the namespace at the same time, and a record
can be both field selected and indexed!)
The expression "seq.12" is OK and would be the same as "seq.(12)" (the
expression in parens, 12, resolves to, er, 12 so is fine) but if i is
an integer the expression would need to be "seq.(i)". On the other
hand "seq.i" would try to select a field called i (which wouldn't
exist in an array) and wouldn't compile. To be clear, if i is 12
seq.12
seq.(12)
seq.(i)
seq.(8 + 4)
would all mean the same thing.
seq.i
would be invalid for an array.
It seems then that indexing in this language will usually be seq.(index), so
that the ability to do seq.intconst would be an exception.
I have also looked at numeric constants following dot operators, and it
seemed to cause problems:
seq.12.34
Is this seq.(12).(34), or seq.(12.34)? And at the lexical level, abc.123
looks at first like a name followed by a floating point constant value. For
that matter, does seq .12 work? What about seq..950 (seq.(0.950))? (I'm
assuming floating indices will be converted to integers). Or:
define twelve=12
seq.twelve?
Yes, I think I would insist on the parentheses!
--
Bartc |
|
|
|
|
|
| James Harris... |
Posted: Thu Aug 27, 2009 10:01 am |
|
|
|
Guest
|
On 27 Aug, 09:40, "Rod Pemberton" <do_not_h... at (no spam) nohavenot.cmm> wrote:
Quote: "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de> wrote in message news:zm5nrfntir2z$.212ivhpy3lmn.dlg at (no spam) 40tude.net...
On Wed, 26 Aug 2009 20:20:31 GMT, bartc wrote:
Once I designed a language close to what James proposes with regard to the
operation dot. (The idea is somehow infectious.)
In particular it has the operations "." and ":", which are used to extract
substrings (and numeric slices as well). ":" takes the string prefix, "."
takes its suffix. For example:
"abcdef".3:2 = "cd"
it works as follows:
("abcdef".3) = "cdef"
"cdef":2 = "cd"
Heh! You just reminded me of BASIC. BASIC had/has three substring
operators, i.e., LEFT$, RIGHT$, MID$, and one operator for string
concatenation, +. Even after years of C programming with the immensely
useful and powerful C string functions, I'm always amazed by how much those
four operations could do in BASIC. Unfortunately, C needs a function to do
string concatenation, i.e., strcat()...
One could do a lot with mid$ and its friends but they were seriously
horrible, weren't they?
*Far* better, IMHO, is simple string slicing treating the string as an
array of characters.
Concatenation is fine though.
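To show what I mean, here are BASIC's three substring functions written as plain slices in Python, plus the "abcdef".3:2 example from above as two slices. The function names left, right and mid are just lower-case stand-ins for LEFT$, RIGHT$ and MID$.

```python
def left(s, n):
    """LEFT$(s, n): the first n characters."""
    return s[:n]


def right(s, n):
    """RIGHT$(s, n): the last n characters (s[-0:] would be the whole string)."""
    return s[-n:] if n > 0 else ""


def mid(s, start, n):
    """MID$(s, start, n): n characters from 1-based position start."""
    return s[start - 1:start - 1 + n]


text = "abcdef"
print(left(text, 2), right(text, 2), mid(text, 3, 2))

# "abcdef".3:2 read as: suffix from position 3, then a prefix of length 2.
print(text[2:][:2])
```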
James |
|
|
|
|
|
| James Harris... |
Posted: Thu Aug 27, 2009 10:11 am |
|
|
|
Guest
|
On 27 Aug, 08:39, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de>
wrote:
....
I note from the other subthread that you and bartc would insist on the
parentheses. I'll take that on board.
Quote: Classes are types and are effectively records with field protection
combined with a pseudo-executable inheritance. (I really don't want to
get into the pseudo-executable part of that just now as that is way
off topic. It's probably enough to just ignore the pseudo-executable
part of it and say that classes are types and are effectively records
with field protection combined with inheritance.)
Methods are effectively executable fields of classes with their own
types which includes the types of results and the types of parameters.
OK, this is the "standard model", which is slow (due to redispatch),
asymmetric (cannot handle integers etc), excludes multiple dispatch (cannot
handle + as a method). I dislike it.
"Standard model" is good, "slow" is bad. As mentioned, I'm not at the
stage of implementing this yet. I'll see what I can do for performance
when I get there.
Thanks for the input.
James |
|
|
|
|
|
| Dmitry A. Kazakov... |
Posted: Thu Aug 27, 2009 11:39 am |
|
|
|
Guest
|
On Wed, 26 Aug 2009 08:29:28 -0700 (PDT), James Harris wrote:
Quote: On 26 Aug, 15:22, "Dmitry A. Kazakov" <mail... at (no spam) dmitry-kazakov.de
wrote:
On Wed, 26 Aug 2009 06:52:02 -0700 (PDT), James Harris wrote:
On 26 Aug, 13:14, "bartc" <ba... at (no spam) freeuk.com> wrote:
James Harris wrote:
Well, here are my current plans for the humble dot:
I have dot as a "subordinacy" or "binding" operator. For records the
notation is record.field. For indexable sequences or any other type of
mapping including function calls the notation is seq.(index) or
mapping.(arg1, arg2, ... argN). Of course, a dot is also used in
floating point numbers.
Dots are therefore used in many places. Perhaps too many. I hope they
don't become as irritating as Lisp's parentheses....
Anyone see a problem with using dot in these places?
Why is the dot necessary in seq.(index) or mapping.(arg...)?
Ah - good question. I'll need to explain a bit more.
The first reason is for simple consistency.
Fields are components of records.
Elements are components of arrays.
For example,
record.field selects an element of the record
seq.index selects an element of the sequence
In both cases the dot allows selection of a subcomponent.
Yes, but for a record the field is a name, while for an array the index is an expression.
So, record selection should probably be:
record.field.
because
record.field
in your notation could rather mean - take a variable named field and index
record by the value of field.
Not quite. If the subcomponent is an expression it would need
parentheses. Let me show specific examples. Say we had a boring old
employee record. Its fields might include
employee.id
employee.surname
employee.initial
For an array, if we had an array called scores indexed from 0 up to 3
its elements would be
scores.0
scores.1
scores.2
scores.3
In the above, because they are constants, parentheses are optional so
scores.2
scores.(2)
mean exactly the same. If we wanted to access that array with
subscripts which were expressions - say "i" and "i + 2" we would
require parentheses so we get
scores.(i)
scores.(i + 2)
If we had an associative array p_tab its elements might be accessed by
p_tab."id"
p_tab.(name_type + "name")
p_tab.(field_name)
Parens would be mandatory for the last two entries as they are
expressions. Parentheses would be optional for "id" as it is a
constant.
You have already started explaining this in a subthread. So if I correctly
understood the idea, employee.id is equivalent to employee."id". I.e. each
name is also a string literal of itself. Considering an example with named
constants, variables and parameterless functions, let id be a variable with
the value "surname", then:
employee.id = employee."id"
employee.(id) = employee."surname"
The latter case dereferences id, the former case does not. It might turn
very confusing.
Quote: Traditionally record member is considered itself an operation. So it is the
operation <dot><field-name> which is applied to record, rather than the
operation <select> applied to the arguments record and field.
I'm not sure I follow. Are you talking about implementation? My
intention is that the source code express the algorithm but is as
ignorant as possible of the implementation. The idea is that the
implementation can change - perhaps to something faster or to a
debugging version - but the application logic does not need to change.
I meant that dot followed by an identifier usually denotes an operation
"get member named as the identifier tells". I.e. there is a compound
operation (let's name it <.id>) defined on the employee type. Formally:
<.id> : employee_type -> id_type
so if employee is of employee_type then employee.id is <.id> called on
employee:
<.id> (employee)
rather than an operation <.> defined on the Cartesian product of the
employee and string types:
<.> : employee_type x string -> component_type (class of)
called on a tuple:
<.> (employee, "id")
These are two semantically different interpretations of the syntax sugar
employee.id with far-reaching consequences. The first one is that you
should go straight to classes of components bound as late as at run-time.
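The two readings can be put side by side in Python with type annotations. This is only an illustration of the distinction; Employee, dot_id, dot_surname and select are names invented for it.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Employee:
    id: int
    surname: str


# Reading 1: one operation per field name, each with its own static result type.
def dot_id(employee: Employee) -> int:
    return employee.id


def dot_surname(employee: Employee) -> str:
    return employee.surname


# Reading 2: a single operation on the pair (record, name). Its signature can
# only promise "some component"; the actual type is known at run time.
def select(record: Any, field_name: str) -> Any:
    return getattr(record, field_name)


e = Employee(id=7, surname="Smith")
print(dot_id(e), dot_surname(e))
print(select(e, "id"), select(e, "surname"))
```

Both readings return the same values; what differs is how much a static type checker can know about each call.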
Quote: The
distinction is important. Because the former can give birth to methods, all
distinct according to the names of the fields. With <select> you have only
one method, which limits design to flat containers. Further the signature
of <select> has statically same result type (or no type), so statically all
record components would have one type (or none). This type would
dynamically be resolved to the actual specific types at run-time. I.e. you
force yourself to dynamic typing and only dynamic typing.
Interesting. I've not considered object orientation much as I'm
waiting to see what is readily implementable and what would be too
slow. What I have in mind at this early stage is:
Classes are types and are effectively records with field protection
combined with a pseudo-executable inheritance. (I really don't want to
get into the pseudo-executable part of that just now as that is way
off topic. It's probably enough to just ignore the pseudo-executable
part of it and say that classes are types and are effectively records
with field protection combined with inheritance.)
Methods are effectively executable fields of classes with their own
types which includes the types of results and the types of parameters.
OK, this is the "standard model", which is slow (due to redispatch),
asymmetric (cannot handle integers etc), excludes multiple dispatch (cannot
handle + as a method). I dislike it.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de |
|
|
|
|
|
|
|
All times are GMT
|
|