HRM Resident
2022-07-23 16:15:31 UTC
James and I were discussing (offline) some aspects of Python;
specifically, James opined that Python is 100 times slower than FORTRAN
doing the same thing. So I thought it better to bring the discussion
here if Mike (or anyone else) wants to weigh in.
I suspect "100 times" is an exaggeration, but anyone who knows how
computer languages work understands an interpreted language (going back
to BASIC in the 60s and 70s) is far, far slower than a compiled
language. FORTRAN will *always* beat Python in terms of speed by at
least an order of magnitude.
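If anyone wants to see the interpreter overhead for themselves, here
is a quick sketch of mine (not gospel; the exact ratio depends on your
machine and Python version). It times a pure-Python summation loop
against the same sum done through NumPy, whose sum() runs as compiled C
code, so the ratio is a rough stand-in for interpreted vs. compiled:

import timeit
import numpy as np

N = 1_000_000

def pure_python_sum():
    # Every iteration goes through the interpreter's
    # fetch-decode-dispatch loop on boxed integer objects.
    total = 0
    for i in range(N):
        total += i
    return total

def compiled_sum():
    # One call that drops straight into compiled C code.
    return np.arange(N).sum()

t_py = timeit.timeit(pure_python_sum, number=10)
t_c = timeit.timeit(compiled_sum, number=10)
print(f"pure Python: {t_py:.3f} s")
print(f"via compiled C (NumPy): {t_c:.3f} s")
print(f"ratio: {t_py / t_c:.0f}x")

Ratios anywhere from the tens into the low hundreds are typical for
tight numeric loops like this, which is roughly James's claim.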
I really had to dig, because the Internet is not flush with
explanations of why there is no Python compiler. People drone on that
Python is interpreted**, but it's hard to find out why. Eventually, I
came across two reasons that make sense. I had already guessed (1), and
I see that (2) makes sense once you start understanding dynamic objects.
=================
(1) "Mostly, because machine code is heavily machine–dependent. So, if
you compiled Python code to x86 code, it would only run on x86 machines.
That's not only that — assuming you're not running your program on a
bare machine, you'll want to perform system calls, as they're
fundamental to interact with the system to do things like accessing the
file system, networks, graphics, among the others. And system calls
depend on the OS, which means if you targeted Windows x86, your program
would only run on Windows on x86 machines. To support most common
architectures and operating systems, the best you could do is as trivial
and dumb (and potentially misleading for inexpert end–users) as
providing users with multiple variants of your program."
(2) "The short answer is “late binding” meaning Python programs do not
have to decide their own final form at design time. They're able to
change themselves in fundamental ways at execution time.
For example, many Python objects do not have to declare anywhere in
advance how much memory they will be needing, and at the time of
compiling to byte code, there's no way to know that either.
Python has to stick around at runtime to dynamically create and destroy
memory as its objects grow and shrink without advance warning, as the
program requires.
This was all work the Python programmer did not have to do when writing
the program. The program gets an intelligent babysitter for supervised
operations."
=================
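To make (2) concrete, here is a small illustration of my own that you
can paste into a Python prompt. Nothing declares a size or a type up
front; the runtime has to track all of it as the program goes:

import sys

# The list never says in advance how big it will get; its
# memory footprint grows as elements are appended.
items = []
for n in (0, 10, 1000):
    while len(items) < n:
        items.append(len(items))
    print(f"{n:>5} elements -> {sys.getsizeof(items):>6} bytes")

# Late binding: the same name can be rebound to a completely
# different kind of object while the program is running.
x = 42          # an integer object
x = "now text"  # now a string; no compiler could know this in advance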
I will add a third guess of my own. Most programs are not heavy
data-analysis beasts. Exceptions are environmental models, programs
that grind away doing statistical analysis for hours or even days, and
the greedy folks mining cryptocurrency. These are all written in
compiled languages like FORTRAN, C, and C++. These are specialized
requirements . . . most programs (compiled or interpreted) finish in a
few seconds.
Some versions of these languages are designed to run on multi-node
systems . . . the compiler breaks the task into "chunks" that run in
parallel on many CPUs at the same time, then combines the results at
the end. I did a bit of this in 2008 using parallelized FORTRAN on a
256-node system. I was just fooling around calculating the area under
the top half of a sine wave. The narrower the "slices" I picked, the
more accurate the results were when added together. Also, the more
nodes I added, the quicker it finished. I couldn't hog all 256 nodes
because it was a shared system, but I could grab 10, 20, 30, or so. The
results were enlightening, and it was intriguing to see the trade-off
between speed and accuracy as you tweaked the number of nodes and the
"width" of the slices.
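For the curious, here is roughly what that 2008 experiment looks like
in modern Python (my own sketch, using the standard multiprocessing
module on one machine rather than a 256-node system). Each "node"
integrates its own chunk of the half sine wave with the midpoint rule,
and the partial areas are added at the end; the exact answer is 2.
More slices sharpen the answer, and more workers shorten the
wall-clock time, the same trade-off as back then:

import math
from multiprocessing import Pool

def chunk_area(args):
    # Midpoint-rule area for one worker's chunk of [0, pi].
    start, stop, slices = args
    width = (stop - start) / slices
    return sum(math.sin(start + (i + 0.5) * width) * width
               for i in range(slices))

def half_sine_area(workers=4, slices_per_worker=100_000):
    # Split [0, pi] into one chunk per worker, integrate the
    # chunks in parallel, then combine the results at the end.
    step = math.pi / workers
    jobs = [(k * step, (k + 1) * step, slices_per_worker)
            for k in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(chunk_area, jobs))

if __name__ == "__main__":
    print(half_sine_area())  # prints something very close to 2.0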
I'm sure systems exist today with thousands of nodes (or CPUs)
dedicated to these sorts of applications.
Finally, in 2022, hardware is way cheaper than programmers. In some
scenarios it's likely better to throw faster (or more) CPUs at the
problem than to pay teams of programmers $150-$200 an hour to optimize
the heck out of a program. Do you hire one programmer to code it in
Python in 30 minutes and run it on a more expensive computer (or a
group of hundreds of computers)? Or do you hire 10 programmers to do it
on one with just a few CPUs? Or both, if it's super important? It
depends.
It's the same thought pattern as deciding which vehicles to buy. Are
you going to want a fleet of taxis going 24/7, or to haul 10 tons of
gravel once a week?
** There are tools (Cython and Nuitka, for example) that will "compile"
Python by translating it to C, etc. They don't appear to be widely
used, for reasons (1) and (2) above.
--
HRM Resident