Languages with well-integrated Foreign Function...

Christoffer Lernö...
Posted: Tue Jul 21, 2009 7:09 am
Guest
I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.
It's likely that I will restrict myself to interfacing with C, so if
there are elegant solutions integrating with C, and more clunky but
flexible that are more general, I probably prefer the ones that
exclusively target C.

What I'm looking for is syntax, to what extent automatic conversion of
arguments is done, how to handle callbacks, memory management, how to
create structured data (i.e. structs in the case of C) etc.

/Christoffer
[Someone asked roughly the same question in 1997, but got no answers. Poking
around on the net, all the FFI seems rather ad-hoc and language specific. -John]
 
Barry Kelly...
Posted: Tue Jul 21, 2009 4:33 pm
Guest
Christoffer Lernö wrote:

Quote:
I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.

I think the current trends are towards what could be described as a mix
of aspect orientation and dynamic code generation. I've seen this in
.NET, Java, and Ruby.

.NET: P/Invoke

http://www.pinvoke.net/ has lots of example syntax and declarations.

These are basically external static method signatures with attributes
(user-definable metadata) attached that describe to the runtime
environment how to marshal arguments and what name to import from the
linked-to library.
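
A rough analogue of this declaration style can be sketched in Python with the standard `ctypes` module. This is not P/Invoke itself: the `c_import` decorator is a hypothetical name invented for the illustration, and the sketch assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library's symbols. The stub function carries no code; the metadata passed to the decorator says which name to import and how to marshal arguments and results.

```python
import ctypes

# Assumption: POSIX. CDLL(None) opens the running process, which exposes libc.
libc = ctypes.CDLL(None)


def c_import(name, argtypes, restype):
    """Hypothetical decorator: replace a stub with the C symbol `name`."""
    def bind(stub):
        func = getattr(libc, name)   # symbol lookup by name
        func.argtypes = argtypes     # how to marshal each argument
        func.restype = restype       # how to convert the return value
        return func
    return bind


@c_import("strlen", argtypes=[ctypes.c_char_p], restype=ctypes.c_size_t)
def strlen(s):
    """size_t strlen(const char *s)"""


@c_import("abs", argtypes=[ctypes.c_int], restype=ctypes.c_int)
def c_abs(n):
    """int abs(int n)"""


print(strlen(b"hello"), c_abs(-7))
```

The body of each stub is never executed; only the attached metadata matters, which is the same division of labour as an attribute-decorated extern method.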

Even though in practice it's the CLR, i.e. the actual runtime, which
does this linking and marshaling, in principle it could be done entirely
by a third-party library using a small handful of primitives. Since the
(full-version, non-browser/mobile) platform can generate code
dynamically at runtime, efficient wrappers can be generated that grab
the arguments, juggle them about as necessary (guided by metadata), and
dispatch the final call. Such wrappers could be written almost entirely
in the language itself, in a replaceable and extendable way.

Java: JNA

https://jna.dev.java.net/

The idea is you create an interface that represents the exported
routines you want to access, and let a library take care of mapping
through reflection and annotations (user-definable metadata). Much like
P/Invoke and far easier to use than JNI (though it uses JNI under the
hood from what I understand).

Ruby: FFI

http://blog.headius.com/2008/10/ffi-for-ruby-now-available.html

Quote:
What I'm looking for is syntax, to what extent automatic conversion of
arguments are done, how to handle callbacks, memory management, how to
create structured data (i.e. structs in the case of C) etc.

* Conversion of arguments: when interacting with C, certain cases
almost certainly need special handling, such as strings and untyped
polymorphic return buffers (e.g. Windows APIs where there's a cbSize at
the start of the struct which determines which version will be
returned). The details of correspondences with C-level primitive types
depend on the type system and library, of course.

For example, C# uses mutable StringBuilder instances to represent C
non-const char*, and uses attributes on the parameter / return value in
the P/Invoke declaration to indicate the encoding. Similarly, structures
which are used for interop must be blittable (no GC'able references) and
can have explicit layout (again, done with attributes) for cases like
unions.

* Callbacks: this is "just" the inverse of imports, but may require
dynamic stubs for things like function pointer -> closure/method-pointer
conversion, so that both self/environment and code address can be
exported.

* Memory management: in a precise GC environment, there normally needs
to be either some mechanism for pinning memory (prevent move or
collection), or an "out" to manual memory allocation, or else all
transfer must be copies both ways. For example, .NET supports all three;
pinning via the C# 'fixed' keyword and GCHandle framework type, manual
allocation via the Marshal framework class, and generally defaults to
copying.

* Structs: some kind of way of declaratively specifying the layout of
the C-visible type, possibly using attributes / annotations or perhaps
even dynamic imperative specification at runtime like Ruby/FFI. With
runtime support, attribute metadata can be used to make the internal
type match the external type, but it's not absolutely necessary, if the
FFI library is going to do the conversion work.
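
The four points above can be illustrated in one place with Python's standard `ctypes` module. This is only a sketch of the concepts, not the .NET, JNA or Ruby/FFI mechanisms discussed in this post; the names `Point`, `IntOrFloat`, `IntBinOp` and `make_adder` are invented for the example.

```python
import ctypes


# Struct: the layout is specified declaratively, much like attributes.
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


# Union: every member starts at offset 0 (the "explicit layout" case).
class IntOrFloat(ctypes.Union):
    _fields_ = [("i", ctypes.c_uint32), ("f", ctypes.c_float)]


# Argument conversion: a mutable buffer stands in for a non-const char*.
buf = ctypes.create_string_buffer(16)
ctypes.memmove(buf, b"hello", 5)

# Callback: a C function pointer type that can wrap a Python closure.
IntBinOp = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def make_adder(offset):
    def add(a, b):
        return a + b + offset    # `offset` is the captured environment
    return IntBinOp(add)         # generated stub: code address + environment


# Memory management: the stub lives only as long as this object is
# referenced, so it must be kept alive while C code may still call it.
add10 = make_adder(10)

p = Point(3, 4)
u = IntOrFloat()
u.f = 1.0
print(p.x, p.y, hex(u.i), buf.value, add10(1, 2))
```

Here the buffer and the struct are allocated outside the garbage-collected heap, which corresponds to the manual-allocation "out" rather than to pinning.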

If I were you, I'd focus on ensuring that the FFI work can be done
using libraries, possibly extendable for certain user scenarios (using
some kind of metadata), rather than ad-hoc tooling that must be written
in the same language as the runtime (typically C) and compiled and
linked against the runtime for every FFI linkage. To make rich
interaction with the platform easy, make the FFI very easy to use and
play with.

-- Barry

--
http://barrkel.blogspot.com/
 
Hans Aberg...
Posted: Tue Jul 21, 2009 6:05 pm
Guest
Christoffer Lernö wrote:
Quote:
I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.
It's likely that I will restrict myself to interfacing with C, so if
there are elegant solutions integrating with C, and more clunky but
flexible that are more general, I probably prefer the ones that
exclusively target C.

Haskell <http://haskell.org/> has FFI to C. It is probably GHC you
should look at. Hugs is an interpreter, good for running at a console,
and also has some FFI to C.

Hans
 
...
Posted: Wed Jul 22, 2009 12:50 am
Guest
In article <09-07-074 at (no spam) comp.compilers>, lerno at (no spam) dragonascendant.com
(Christoffer Lernö) wrote:

Quote:
I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.
It's likely that I will restrict myself to interfacing with C, so if
there are elegant solutions integrating with C, and more clunky but
flexible that are more general, I probably prefer the ones that
exclusively target C.

The native code invocation facilities of C# (Platform Invoke, or
P/Invoke) are pretty useful, and worth studying. But in this field quite
a bit depends on how the language you're calling from is implemented, as
regards callbacks, resource management, and so on. So a bit more detail
about the language you're creating would help people make suggestions.

--
John Dallman jgd at (no spam) cix.co.uk
"Any sufficiently advanced technology is indistinguishable from a
well-rigged demo"
 
Nathan Seese...
Posted: Wed Jul 22, 2009 6:16 am
Guest
I remember hearing that Ruby's was good (I've never used it, though).

--
Moore's Law of Mad Science: Every eighteen months, the minimum IQ
necessary to destroy the world drops by one point.
 
Gene...
Posted: Thu Jul 23, 2009 4:08 am
Guest
On Jul 21, 3:09 am, Christoffer Lernö <le... at (no spam) dragonascendant.com>
wrote:
Quote:
I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.
It's likely that I will restrict myself to interfacing with C, so if
there are elegant solutions integrating with C, and more clunky but
flexible that are more general, I probably prefer the ones that
exclusively target C.

What I'm looking for is syntax, to what extent automatic conversion of
arguments are done, how to handle callbacks, memory management, how to
create structured data (i.e. structs in the case of C) etc.

/Christoffer
[Someone asked roughly the same question in 1997, but got no answers. Poking
around on the net, all the FFI seems rather ad-hoc and language specific.-John]

I've looked at quite a few. John is right. Because of the double
dependence on source and target languages and runtime environments, ad
hoc is an understatement. There is very little commonality. I think
the least worst approach is to define a distinct glue language and
process that into C code that provides clean interfaces. Perl does
this, though IMO there is much room to improve the approach. Think
also about the other direction: Calls from a foreign language program
into code written in yours.
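
As a toy illustration of the glue-language idea (not Perl's actual tooling), the sketch below parses a made-up one-line declaration syntax and emits C forwarding wrappers. The syntax, the `glue_` prefix and all function names are assumptions invented for the example; a real generator would also emit the conversions between the host language's values and C values.

```python
# Hypothetical glue syntax, one declaration per line:  name: ret(arg, arg)
GLUE = """
hypot: double(double, double)
abs: int(int)
"""


def parse(line):
    """Split 'name: ret(arg, arg)' into (name, ret, [args])."""
    name, sig = (part.strip() for part in line.split(":", 1))
    ret, args = sig.rstrip(")").split("(", 1)
    return name, ret.strip(), [a.strip() for a in args.split(",") if a.strip()]


def emit_wrapper(name, ret, args):
    """Emit a C wrapper that forwards to `name` (non-void returns only)."""
    params = ", ".join(f"{t} a{i}" for i, t in enumerate(args)) or "void"
    call = ", ".join(f"a{i}" for i in range(len(args)))
    return f"{ret} glue_{name}({params}) {{ return {name}({call}); }}"


def generate(glue):
    """Translate every non-blank glue line into one line of C."""
    return "\n".join(emit_wrapper(*parse(line))
                     for line in glue.splitlines() if line.strip())


print(generate(GLUE))
```

The same declarations could drive a second generator for the other direction, i.e. wrappers that let C code call into the host language.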

Gene
 
Tony Finch...
Posted: Sun Jul 26, 2009 5:55 pm
Guest
Christoffer Lerno <lerno at (no spam) dragonascendant.com> wrote:
Quote:

I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.

Here's a link to a fairly useful survey/comparison of a number of
embeddable/extensible dynamic/scripting language FFIs. It mostly covers
Lua, Perl, Python, and Ruby, and Tcl also gets a few mentions.

Hisham Muhammad and Roberto Ierusalimschy. C APIs in extension and
extensible languages. In XI Brazilian Symposium on Programming Languages,
Natal, May 2007.

http://www.inf.puc-rio.br/%7Eroberto/docs/jucs-c-apis.pdf

Tony.
--
f.anthony.n.finch <dot at (no spam) dotat.at> http://dotat.at/
MALIN HEBRIDES: SOUTH OR SOUTHEAST, VEERING SOUTHWEST, 5 TO 7, OCCASIONALLY
GALE 8 AT FIRST. ROUGH OR VERY ROUGH. SQUALLY SHOWERS. GOOD.
 
Paul Biggar...
Posted: Tue Jul 28, 2009 3:09 pm
Guest
Hi Christoffer, Tony,

On Sun, Jul 26, 2009 at 6:55 PM, Tony Finch<dot at (no spam) dotat.at> wrote:
Quote:
Christoffer Lerno <lerno at (no spam) dragonascendant.com> wrote:

I'd like to research FFI in various languages, basically to find the
best FFI-solution and copy from that one.

Here's a link to a fairly useful survey/comparison of a number of
embeddable/extensible dynamic/scripting language FFIs. It mostly covers
Lua, Perl, Python, and Ruby, and Tcl also gets a few mentions.

Hisham Muhammad and Roberto Ierusalimschy. C APIs in extension and
extensible languages. In XI Brazilian Symposium on Programming Languages,
Natal, May 2007.

http://www.inf.puc-rio.br/%7Eroberto/docs/jucs-c-apis.pdf

In my opinion, the style of FFI used by scripting languages (Perl,
Python, Lua, PHP, Ruby) is awful, and should never be emulated. It is
particularly bad since most standard libraries are written using the
FFI (as opposed to being written in user-code), which cannot be reused
by other implementations of the scripting language.

Consider PHP. Its FFI is simply an API which exposes quite a lot of
the internals of the PHP interpreter. If you wish to provide another
implementation of PHP, you must copy or wrap the API (which is the
approach of Phalanger, a .NET version of PHP), ignore it (Roadsend)
and miss out on or reimplement all the libraries, or try horrendous
preprocessor and linker tricks to provide the same API (ProjectZero).

Every scripting language I've come across has the same style, as the
cited paper describes (although Ruby has at least put thought into
their API, and made it quite attractive). Some have come up with
better approaches, but they are not standard (Ruby's libFFI
[http://blog.headius.com/2008/10/ffi-for-ruby-now-available.html] or
Python's cython/pyrex [http://cython.org/])


In my opinion, the best approach is to provide a
domain-specific-language which allows libraries to call C functions
directly, along with a data type to wrap void-pointers. This is
similar to Cython/pyrex, which I consider the current best-of-breed
for scripting language FFIs.
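
The "data type to wrap void-pointers" can be pictured roughly as follows. This is a sketch using Python's `ctypes`, not Cython/pyrex or the proposed DSL; the `Opaque` class and `c_alloc` helper are hypothetical names, and the code assumes a POSIX system where `ctypes.CDLL(None)` exposes `malloc` and `free`.

```python
import ctypes

# Assumption: POSIX. CDLL(None) exposes the C library's malloc and free.
libc = ctypes.CDLL(None)
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


class Opaque:
    """Hypothetical script-level value that wraps a C void*."""

    def __init__(self, address, release):
        self._ptr = ctypes.c_void_p(address)
        self._release = release          # C function that frees the resource

    @property
    def address(self):
        return self._ptr.value           # None once released

    def close(self):
        if self._ptr.value:              # safe to call more than once
            self._release(self._ptr)
            self._ptr = ctypes.c_void_p(None)


def c_alloc(size):
    """Call malloc directly and hand the result back as an Opaque."""
    address = libc.malloc(size)
    if not address:
        raise MemoryError("malloc failed")
    return Opaque(address, libc.free)


block = c_alloc(64)
```

Script code never looks inside the pointer; it only passes the wrapper back to other C functions, so nothing about the interpreter's internals leaks into the library.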


Some more reading, if you're interested:
- The problems faced by PHP in this regard, see
http://wiki.php.net/rfc/remove_zend_api,
- ProjectZero describe a lot of the FFI problems they faced in
http://wiki.php.net/rfc/remove_zend_api/scratchpad,
- I give an overview of the "best approach" from above in
http://wiki.php.net/rfc/php_native_interface,
- Finally, I go into more detail about the problems with scripting
language FFIs in my SAC '09 paper:
https://www.cs.tcd.ie/~pbiggar/#sac-2009.


Thanks,
Paul

--
Paul Biggar
paul.biggar at (no spam) gmail.com
 
Paul Biggar...
Posted: Mon Aug 31, 2009 12:43 pm
Guest
On Sun, Aug 30, 2009 at 10:37 PM, BGB / cr88192 <cr88192 at (no spam) hotmail.com> wrote:
Quote:
In my opinion, the best approach is to provide a
domain-specific-language which allows libraries to call C functions
directly, along with a data type to wrap void-pointers. This is
similar to Cython/pyrex, which I consider the current best-of-breed
for scripting language FFIs.


I once did this (long ago) to allow one of my languages (in this case,
Scheme based), to interface with C. the cost was, however, that it was
limited to statically compiled code, and had all sorts of integration issues
with the interpreted version (for example, they could not directly share
binding environments, ...).

For scripting languages, these are all existing problems in their
FFIs. The DSL approach I suggested would be a simple portable
replacement for the current C APIs, and would (intentionally) support
only statically compiled code. The integration with the interpreted
environment would be no worse than currently exists.


Quote:
more recently, my project has taken a very different approach:
both the VM and native code are C.
<snip>
however, all this effort does have a payoff:
plain C to plain C integration.

If I understand you correctly, it seems that this interface wouldn't
be portable to other VMs, and that the standard libraries could not be
shared with another implementation which did not use a C virtual
machine?


Quote:
but, this does bring up another "generic" option: VMs can generally
go and implement JNI as their FFI (even if the VM has little to do
with the JVM), since one can at least pretend that JNI is a "sort
of" standardized C-side FFI...

(I think) I don't like this idea, since it requires a reimplementation
of the same language to abide by some design principles of the
original implementation. In the approach I outlined above, the only
requirement is some way to statically link to C libraries, which I
believe to be as portable as is possible.


Paul
--
Paul Biggar
paul.biggar at (no spam) gmail.com
 
BGB / cr88192...
Posted: Tue Sep 01, 2009 3:12 am
Guest
"Paul Biggar" <paul.biggar at (no spam) gmail.com> wrote in message
Quote:
On Sun, Aug 30, 2009 at 10:37 PM, BGB / cr88192 <cr88192 at (no spam) hotmail.com> wrote:
In my opinion, the best approach is to provide a
domain-specific-language which allows libraries to call C functions
directly, along with a data type to wrap void-pointers. This is
similar to Cython/pyrex, which I consider the current best-of-breed
for scripting language FFIs.


I once did this (long ago) to allow one of my languages (in this case,
Scheme based), to interface with C. the cost was, however, that it was
limited to statically compiled code, and had all sorts of integration issues
with the interpreted version (for example, they could not directly share
binding environments, ...).

For scripting languages, these are all existing problems in their
FFIs. The DSL approach I suggested would be a simple portable
replacement for the current C APIs, and would (intentionally) support
only statically compiled code. The integration with the interpreted
environment would be no worse than currently exists.

ok.

as can be noted, dynamic code generation allows dynamic interfacing as well
(grab a DLL and plug into it), but has a complexity cost...


Quote:
more recently, my project has taken a very different approach:
both the VM and native code are C.
<snip>
however, all this effort does have a payoff:
plain C to plain C integration.

If I understand you correctly, it seems that this interface wouldn't
be portable to other VMs, and that the standard libraries could not be
shared with another implementation which did not use a C virtual
machine?

another VM could use a similar strategy, but granted, to use plain C<->C
interfacing, the VM in question would need to be able to support C.

the great problem is that this quickly rules out most simple /
language-specific VMs...

basically, to handle both a dynamic language, and C, the VM would need a
similar level of complexity to that of .NET ...


another possibility would be to have several interconnected, but different,
VMs:
one which did C, and another which did the dynamic language;
then the main cost is the effort required to hook these together.

note:
just running bytecoded C will not buy anything, as one will need much of the
same types of metadata which is often available only to the compiler (AKA:
lots of very specific info about the typesystem, memory and stack layout,
...).

so, the C compiler would essentially serve several purposes:
A, to compile C;
B, to mine lots of information about pretty much everything visible at this
point.

with B, it is often sufficient just to try to "compile" a bunch of headers.

note that gathering, managing, and processing all this data, is its own set
of challenges...
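
a very reduced picture of "mining" headers is sketched below: a regular expression pulls flat prototypes out of header text and records them as `ctypes` signatures. this is only an illustration under strong assumptions (one prototype per line, named parameters, a handful of primitive types); the approach described above relies on a real C compiler front end, and the names `C_TYPES`, `PROTO` and `parse_header` are invented here.

```python
import ctypes
import re

# Assumption: only these spellings of a few primitive types are recognised.
C_TYPES = {
    "int": ctypes.c_int,
    "double": ctypes.c_double,
    "size_t": ctypes.c_size_t,
    "const char *": ctypes.c_char_p,
}

# return type, function name, parameter list, terminating semicolon
PROTO = re.compile(r"\s*(.+?)\s*(\w+)\s*\(([^)]*)\)\s*;")

HEADER = """
size_t strlen(const char *s);
double hypot(double x, double y);
int rand(void);
"""


def norm(text):
    """Collapse runs of whitespace so type spellings compare equal."""
    return " ".join(text.split())


def parse_header(text):
    """Map each function name to (restype, [argtypes])."""
    table = {}
    for line in text.splitlines():
        m = PROTO.match(line)
        if not m:
            continue
        ret, name, args = m.groups()
        params = [] if args.strip() in ("", "void") else args.split(",")
        # Drop the trailing parameter name to leave just the type.
        argtypes = [C_TYPES[norm(re.sub(r"\w+\s*$", "", p))] for p in params]
        table[name] = (C_TYPES[norm(ret)], argtypes)
    return table


for name, (restype, argtypes) in parse_header(HEADER).items():
    print(name, restype.__name__, [t.__name__ for t in argtypes])
```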



but, at this point, one finds they can glue damn near anything to
anything...

for example, it would not be out of the question to glue C, for example, to
a fully textual set of interfaces and data representations. I had actually
considered something for this recently.

I can already more-or-less glue dynamic typing to C-style data
representations, ...


thus, my ideas for how to do JS in my present framework...
if done well, JS "should" be able to achieve close to 1:1 performance with C
(if compiled to statically-typed native code via lots of internal
trickery...).

however, a plain dynamically-typed version is also possible, but will likely
be a little slower.


Quote:

but, this does bring up another "generic" option: VMs can generally
go and implement JNI as their FFI (even if the VM has little to do
with the JVM), since one can at least pretend that JNI is a "sort
of" standardized C-side FFI...

(I think) I don't like this idea, since it requires a reimplementation
of the same language to abide by some design principles of the
original implementation. In the approach I outlined above, the only
requirement is some way to statically link to C libraries, which I
believe to be as portable as is possible.


JNI allows representing a lot more, and there are a lot of JVM-based
libraries which use it.
this only requires then that the VM be able to fake a JVM-like model (at
least for the external interface).

granted, this would be a little more awkward for dynamically-typed VM's, ...
 
Hans-Peter Diettrich...
Posted: Wed Sep 02, 2009 5:09 pm
Guest
BGB / cr88192 wrote:

Quote:
If I understand you correctly, it seems that this interface wouldn't
be portable to other VMs, and that the standard libraries could not be
shared with another implementation which did not use a C virtual
machine?

another VM could use a similar strategy, but granted, to use plain C<->C
interfacing, the VM in question would need to be able to support C.

the great problem is that this quickly rules out most simple /
language-specific VMs...

basically, to handle both a dynamic language, and C, the VM would need a
similar level of complexity to that of .NET ...

Where do you see such a requirement?

IMO the VM only must support the C calling convention and data types
(ABI) - not different from interfacing any other "language".
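
The data-type half of that requirement can be made concrete with a small sketch using Python's `ctypes`, which reproduces the platform's C layout rules. The `Record` struct and the `layout` helper are invented for the example, and the calling-convention half is not shown.

```python
import ctypes


class Record(ctypes.Structure):
    # Equivalent to: struct record { char tag; int value; double weight; };
    _fields_ = [("tag", ctypes.c_char),
                ("value", ctypes.c_int),
                ("weight", ctypes.c_double)]


def layout(struct):
    """Return (name, offset, size) per field, as the platform ABI lays it out."""
    return [(name, getattr(struct, name).offset, getattr(struct, name).size)
            for name, _ in struct._fields_]


for name, offset, size in layout(Record):
    print(f"{name:7} offset={offset:2} size={size}")
print("sizeof =", ctypes.sizeof(Record),
      "alignment =", ctypes.alignment(Record))
```

The padding after `tag` and any tail padding come from the ABI's alignment rules; a VM that wants to exchange such a struct with C has to reproduce exactly these offsets.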

DoDi
 
Paul Biggar...
Posted: Wed Sep 02, 2009 6:34 pm
Guest
On Tue, Sep 1, 2009 at 12:12 AM, BGB / cr88192<cr88192 at (no spam) hotmail.com> wrote:
Quote:
more recently, my project has taken a very different approach:
both the VM and native code are C.
<snip>
however, all this effort does have a payoff:
plain C to plain C integration.

If I understand you correctly, it seems that this interface wouldn't
be portable to other VMs, and that the standard libraries could not be
shared with another implementation which did not use a C virtual
machine?

another VM could use a similar strategy, but granted, to use plain C<->C
interfacing, the VM in question would need to be able to support C.

the great problem is that this quickly rules out most simple /
language-specific VMs...

basically, to handle both a dynamic language, and C, the VM would need a
similar level of complexity to that of .NET ...


But only if you need to 'handle' C in your VM. I am at a loss as to
why this might be useful. To make a dynamic VM, with a useful FFI, you
don't need to handle C in any meaningful way, except:
- it might be useful to parse some C subset to generate glue code
(if using the ideas I outlined in my first mail)
- to link to C libraries, which isn't hard (no dynamic linking requirements).

You certainly don't need a JIT, an 'unsafe' environment, to process C
as bytecode (does .NET even do this?) or any of the rest of the
complexity of .NET.



Quote:
I can already more-or-less glue dynamic typing to C-style data
representations, ...

Maybe, but why would you want to? In my opinion, all you do here is
prevent people from reimplementing your language, which is the same
mistake that all existing scripting languages have made.



Quote:
thus, my ideas for how to do JS in my present framework...
if done well, JS "should" be able to achieve close to 1:1 performance with C
(if compiled to statically-typed native code via lots of internal
trickery...).

I wouldn't assume this for a second. You can't statically compile
Javascript to native code at all, primarily because of the existence
of AJAX, which fetches code at run-time from the server, and eval()s
it. In the absence of this, and other means of run-time code
generation, Jensen's work (http://www.cs.au.dk/~amoeller/papers/tajs/)
shows that, yes, JS is very statically type-able. But the application
is not compilation - none of the recent Javascript implementations
(SquirrelFish, V8 and TraceMonkey) are statically compiled, nor could
they be if they wanted to process real-life Javascript.

Paul
--
Paul Biggar
paul.biggar at (no spam) gmail.com
 
BGB / cr88192...
Posted: Thu Sep 03, 2009 6:43 pm
Guest
Quote:
the great problem is that this quickly rules out most simple /
language-specific VMs...

basically, to handle both a dynamic language, and C, the VM would need a
similar level of complexity to that of .NET ...

Where do you see such a requirement?


personal experience...

having the VM run a scripting language is one thing;
having the VM run C is another.

trying to do both at the same time, with the same VM, well, this "opens a
whole new can of worms".

granted, one could take a shortcut and use dynamic typing for implementing
C, but why not ignore this possibility, and instead assume that the VM core
is statically typed. similarly, I can exclude the shortcut of building an
interpreter on top of the VM (this was an option I personally rejected).


after being faced with a lot of this, one realizes 2 things:
in the grand scope of things, the .NET VM is not "that" complicated by
comparison;
OTOH, one finds that all of this stuff is a horrible PITA...

(I am not claiming one needs all of .NET's runtimes/libraries, rather, I am
focusing on the core of the VM itself...).


it is generality which is the hard part, and it is surprising how simple
each "step" may seem along the way, or, at least, until one attempts to
actually do them.


it is a similar issue to how, when compiling C, one may find it surprisingly
difficult to make their compiler be fast. one can easily get near instant
compilation with a scripting language, but not so with C. it has its ways of
eating up time.


Quote:
IMO the VM only must support the C calling convention and data types
(ABI) - not different from interfacing any other "language".


to run C, or to interface with it?...

I was concerning myself with running it in this context.

interfacing is an easier task, after all...


granted, yes, one can implement C on a VM which does not internally use
static typing, which could save some work, but I had excluded this (partly,
I don't know of any real way to do so without somewhat hurting
performance...).
 
 