Is it really correct to talk about compiled and interpreted languages? [duplicate]

https://softwareengineering.stackexchange.com/questions/385367

18-02-2021

Question

It is pretty obvious that any interpreted language CAN also be compiled. For a long time I thought that it was not necessarily the other way around. Then I discovered Ch which is an interpreter that can interpret the whole C language. It also supports parts of C++, Java, Matlab, Fortran and C-shell.

This made me draw the conclusion that whether a language is a compiled or interpreted language is not a property of the language itself, but rather a convention. Is this correct?

Solution

You are 100% right. There is no such thing as a "compiled language" or an "interpreted language". Those terms are not even wrong; they are nonsensical.

Programming languages are sets of abstract mathematical rules, definitions, and restrictions. Programming languages aren't compiled or interpreted. Programming languages just are. [Credit goes to Shriram Krishnamurthi who said this in an interview on Channel9 years ago (at about 51:37-52:20).]

In fact, a programming language can exist perfectly well without having any interpreter or compiler! For example, Konrad Zuse's Plankalkül, which he designed in the 1940s, had no implementation at all for decades. You could still write programs in it, you could analyze those programs, reason about them, prove properties about them … you just couldn't execute them. (Well, actually, even that is wrong: you can of course run them in your head or with pen and paper.)

Compilation and interpretation are traits of the compiler or interpreter (duh!), not the programming language. Compilation and interpretation live on a different level of abstraction than programming languages: a programming language is an abstract concept, a specification, a piece of paper. A compiler or interpreter is a concrete piece of software (or hardware) that implements that specification. If English were a typed language, the terms "compiled language" and "interpreted language" would be type errors. [Again, credit to Shriram Krishnamurthi.]

Every programming language can be implemented by a compiler. Every programming language can be implemented by an interpreter. Many modern mainstream programming languages have both interpreted and compiled implementations. Many modern mainstream high-performance programming language implementations have both compilers and interpreters.

As you have noticed, there are interpreters for C and for C++. On the other hand, every single current major mainstream implementation of ECMAScript, PHP, Python, Ruby, and Lua has a compiler. The original version of Google's V8 ECMAScript engine was a pure native machine code compiler. (They went through several different designs, and the current version does have an interpreter, but for many years, it didn't have one.) XRuby and Ruby.NET were purely compiled Ruby implementations. IronRuby started out as a purely compiled Ruby implementation, then added an interpreter later in order to improve performance. Opal is a purely compiled Ruby implementation.

Some people might say that the terms "interpreted language" or "compiled language" make sense to apply to programming languages that can only be implemented by an interpreter or by a compiler. But, no such programming language exists. Every programming language can be implemented by an interpreter and by a compiler.
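To make this concrete, here is a minimal sketch of one toy language with two implementations. The tuple-based expression format and the names interpret and compile_to_python are assumptions made up for this illustration; they do not come from any real language.

```python
# Hypothetical toy language: expressions are nested tuples such as
# ("add", ("num", 1), ("var", "x")).

def interpret(expr, env):
    """Interpreter: walk the expression and compute its value directly."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return interpret(expr[1], env) + interpret(expr[2], env)
    if kind == "mul":
        return interpret(expr[1], env) * interpret(expr[2], env)
    raise ValueError(f"unknown node: {kind}")


def compile_to_python(expr):
    """Compiler: translate the expression into Python source text, without running it."""
    kind = expr[0]
    if kind == "num":
        return repr(expr[1])
    if kind == "var":
        return f"env[{expr[1]!r}]"
    if kind == "add":
        return f"({compile_to_python(expr[1])} + {compile_to_python(expr[2])})"
    if kind == "mul":
        return f"({compile_to_python(expr[1])} * {compile_to_python(expr[2])})"
    raise ValueError(f"unknown node: {kind}")


program = ("add", ("num", 1), ("mul", ("var", "x"), ("num", 2)))

# Implementation 1: interpret the program.
interpreted_result = interpret(program, {"x": 5})

# Implementation 2: compile the program, then hand the output to some other
# interpreter (here Python's own eval) to actually get a result.
target = compile_to_python(program)
compiled_result = eval(target, {"env": {"x": 5}})
```

Both routes give the same answer for the same program. Nothing about the toy language itself decides which implementation you use.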

For example, you can automatically and mechanically derive a compiler from an interpreter using the Second Futamura Projection. It was first described by Prof. Yoshihiko Futamura in his 1971 paper Partial Evaluation of Computation Process – An approach to a Compiler-Compiler (Japanese), an English version of which was republished 28 years later. It uses Partial Evaluation, by partially evaluating the partial evaluator itself with respect to the interpreter, thus yielding a compiler.
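The shape of the projections can be sketched in a few lines, under one big assumption: functools.partial stands in for the partial evaluator. A real partial evaluator would do actual specialization work and emit optimized code; partial merely fixes the first argument. The toy interpreter interp and its program format are hypothetical.

```python
from functools import partial


def interp(program, x):
    """Hypothetical toy interpreter: apply a list of (op, constant) steps to x."""
    for op, constant in program:
        if op == "add":
            x = x + constant
        elif op == "mul":
            x = x * constant
        else:
            raise ValueError(f"unknown op: {op}")
    return x


# Stand-in specializer (an assumption): fixes an argument, performs no optimization.
mix = partial

program = [("add", 1), ("mul", 10)]

# First projection: specialize the interpreter with respect to one program.
# The result plays the role of a compiled version of that program.
target = mix(interp, program)

# Second projection: specialize the specializer with respect to the interpreter.
# The result plays the role of a compiler: it maps programs to targets.
compiler = mix(mix, interp)
target_again = compiler(program)
```

Calling target(4) or target_again(4) gives the same result as interp(program, 4). With a real partial evaluator in place of partial, the interpretive overhead would be removed at specialization time.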

But even without such complex highly-academic transformations, you can create something that is functionally indistinguishable from compilation in a much simpler way: just bundle together the interpreter with the program to be interpreted into a single executable.
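As a rough sketch of that bundling idea, the following glues the source of a toy interpreter to one fixed program, producing a single standalone script. The stack language, the bundle function and the variable names are all assumptions for illustration; real bundlers are considerably more involved.

```python
# Source of a hypothetical interpreter for a tiny stack language:
# integers are pushed, "add" and "mul" pop two values and push the result.
INTERPRETER_SOURCE = '''
def run(program):
    stack = []
    for token in program.split():
        if token == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            stack.append(int(token))
    return stack.pop()
'''


def bundle(program):
    """Return the text of one standalone script: the interpreter plus the embedded program."""
    return INTERPRETER_SOURCE + f"\nPROGRAM = {program!r}\nRESULT = run(PROGRAM)\n"


bundled = bundle("2 3 add 4 mul")

# Executing the bundled text stands in for running the single "executable".
namespace = {}
exec(bundled, namespace)
```

From the outside, the bundled script takes no program as input and just computes its result, which is what you would expect from a compiled artifact.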

Another possibility is the idea of a "meta-JIT". (This is related in spirit to the Futamura Projections.) This is e.g. used in the RPython framework for implementing programming languages. In RPython, you write an interpreter for your language, and then the RPython framework will JIT-compile your interpreter while it is interpreting the program, thus producing a specialized compiled version of the interpreter which can only interpret that one single program – which is again indistinguishable from compiling that program. So, in some sense, RPython dynamically generates JIT compilers from interpreters.

The other way around, you can wrap a compiler into a wrapper that first compiles the program and then directly executes it, making this wrapped compiler indistinguishable from an interpreter. This is, in fact, how the Scala REPL, the C♯ REPL (both in Mono and .NET), the Clojure REPL, the interactive GHC REPL, and many other REPLs are implemented. They simply take one line / one statement / one expression, compile it, immediately run it, and print the result. This mode of interacting with the compiler is so indistinguishable from an interpreter that some people actually use the existence of a REPL for the programming language as the defining characteristic of what it means to be an "interpreted programming language".
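The wrapper shape can be sketched with Python's built-in compile, eval and exec. The function name compile_and_run is made up for this example, and the sketch only shows the compile-then-run loop, not how any of the REPLs mentioned above are actually built.

```python
def compile_and_run(line, namespace):
    """Compile one line to a code object, run it immediately, and return its value if any."""
    try:
        # Try to compile the line as an expression so that it yields a value.
        code = compile(line, "<repl>", "eval")
    except SyntaxError:
        # Not an expression (for example an assignment): compile it as a statement.
        code = compile(line, "<repl>", "exec")
        exec(code, namespace)
        return None
    return eval(code, namespace)


session = {}
compile_and_run("x = 6 * 7", session)       # statement: compiled, then executed
answer = compile_and_run("x + 1", session)  # expression: compiled, then evaluated
```

Every line goes through an explicit compilation step, yet to the person typing, it behaves exactly like an interpreter.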

Note, however, that you can't run a program without an interpreter. A compiler simply translates a program from one language to another. But that's it. Now you have the same program, just in a different language. The only way to actually get a result of the program is to interpret it. Sometimes, the language is an extremely simple binary machine language, and the interpreter is actually hardcoded in silicon (and we call it a "CPU"), but that's still interpretation.

You might also be interested in this answer of mine, which explains the differences and the different means of combining interpreters, JIT compilers and AOT compilers and this answer dealing with the differences between an AOT compiler and a JIT compiler.

Other answers

Yes, kind of.

It is useful to remind people that interpreted or compiled or JITed or run in a VM via bytecode are all implementation details and not part of the language itself.

But it's also useful to accept that there may not be (and potentially cannot be) a production-quality compiler or interpreter for a language. And that reality will impact performance, usability, user experience, and a raft of other very practical concerns that programmers need to take into consideration.

You cannot have an equally potent compiler for every interpreted language. The statement that "compilation is just conversion to another language" is essentially correct, but that can be a problem in itself. Think of a program that generates code and later interprets it. This is not at all uncommon in the scripting world. "You should not do that!" No, it is really annoying when people do that, especially for those who have to migrate some legacy scripting system to a contemporary platform. In those cases, conversion/compilation does not work. Interpretation is more potent because with interpretation it is never too late to gather the information you need to perform the task at hand. With conversion/compilation you have to know everything you need to know upfront, and sometimes that is just impossible.
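A small sketch of the kind of script meant here: the code it runs does not exist until the script itself is running. The function make_rule and the record format are hypothetical, and the operator string is assumed to come from a trusted source (running exec on untrusted input is unsafe).

```python
def make_rule(field, operator, threshold):
    """Build the source of a function at run time, then execute that source."""
    # The text of this function only exists once the three values are known,
    # which may be as late as when a user or a config file supplies them.
    source = (
        "def rule(record):\n"
        f"    return record[{field!r}] {operator} {threshold!r}\n"
    )
    namespace = {}
    exec(source, namespace)
    return namespace["rule"]


# In a real script these values might arrive from user input or a database.
rule = make_rule("age", ">=", 18)
adult = rule({"age": 21})
minor = rule({"age": 10})
```

A translator that runs ahead of time only sees make_rule, not the code it will later produce, so it has to ship something that can process that generated code at run time anyway.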

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange