How Python simplifies programming
Python’s syntax is meant to be readable and clean, with little pretense. A standard “hello world” in Python 3.x is nothing more than:
print("Hello world!")
Python provides many syntactical elements to concisely express common program flows. The following sample program reads lines from a text file into a list object while stripping each line of its terminating newline character along the way:
with open("myfile.txt") as my_file:
    file_lines = [x.rstrip("\n") for x in my_file]
The with/as construction uses a context manager, which provides an efficient way to instantiate an object for a block of code and then dispose of it outside that block. In this case, the object is my_file, instantiated with the open() function. This replaces several lines of boilerplate to open the file, read individual lines from it, then close it up.
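To make the saved boilerplate concrete, here is a minimal sketch comparing the manual open/close pattern with the context-manager version. The temporary file and the names sample_path, manual_lines, and managed_lines are illustrative assumptions, not part of the original sample.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
# (sample_path is a hypothetical name used only for this illustration.)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    sample_path = tmp.name

# Without a context manager: open, read, and close by hand.
my_file = open(sample_path)
try:
    manual_lines = [x.rstrip("\n") for x in my_file]
finally:
    my_file.close()  # runs even if reading raises an exception

# With a context manager: the file is closed when the block exits.
with open(sample_path) as my_file:
    managed_lines = [x.rstrip("\n") for x in my_file]

print(manual_lines == managed_lines)  # both approaches read the same lines
print(my_file.closed)                 # the with block has closed the file

os.remove(sample_path)  # clean up the throwaway file
```

Both versions produce the same list; the with form simply folds the try/finally bookkeeping into one line.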
The [x … for x in my_file] construction is another Python idiosyncrasy, the list comprehension. It lets an item that contains other items (in this case, my_file and the lines it contains) be iterated through, and it lets each iterated element (that is, each x) be processed and automatically appended to a list.
You could write such a thing as a formal for… loop in Python, much as you would in another language. The point is that Python has a way to economically express things like loops that iterate over multiple objects and perform a simple operation on each element in the loop, or to work with things that require explicit instantiation and disposal.
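For comparison, here is a rough sketch of the explicit loop next to the equivalent list comprehension. It uses io.StringIO as a stand-in for an open text file, and the names loop_lines and comprehension_lines are made up for the illustration.

```python
import io

# io.StringIO stands in for an open text file, so no file on disk is needed.
my_file = io.StringIO("alpha\nbeta\ngamma\n")

# The explicit loop: create a list, then append each processed line.
loop_lines = []
for x in my_file:
    loop_lines.append(x.rstrip("\n"))

my_file.seek(0)  # rewind so the same lines can be read a second time

# The list comprehension expresses the same loop in a single line.
comprehension_lines = [x.rstrip("\n") for x in my_file]

print(loop_lines == comprehension_lines)
```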
Constructions like this let Python developers balance terseness and readability.
Python’s other language features are meant to complement common use cases. Most modern object types—Unicode strings, for example—are built directly into the language. Data structures—like lists, dictionaries (i.e., hashmaps or key-value stores), tuples (for storing immutable collections of objects), and sets (for storing collections of unique objects)—are available as standard-issue items.
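As a quick illustration of those built-in types, the following sketch creates one of each. All of the variable names and sample values are invented for the example.

```python
# A list: an ordered, mutable sequence.
languages = ["Python", "C", "Java"]
languages.append("Go")

# A dictionary: maps keys to values (a hashmap).
first_released = {"Python": 1991, "Java": 1995}
first_released["Go"] = 2009

# A tuple: an immutable collection of objects.
point = (3, 4)

# A set: keeps only unique items.
unique_letters = set("hello")

# Strings are Unicode by default in Python 3.
# "\u00e9" is the single code point for an accented e.
word = "caf\u00e9"

print(len(languages), len(unique_letters), len(word))
```

No imports are needed for any of these; they are part of the language itself.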
Python 2 vs. Python 3
Python is available in two versions, which are different enough to trip up many new users. Python 2.x, the older “legacy” branch, was supported through 2020, long after it was planned to be retired. Python 3.x, the current and future incarnation of the language, has many useful and important features not found in Python 2.x, such as new syntax features (e.g., the “walrus operator”), better concurrency controls, and a more efficient interpreter.
Python 3 adoption was slowed for the longest time by the relative lack of third-party library support. Many Python libraries supported only Python 2, making it difficult to switch. But as Python 2's termination date loomed, the number of libraries supporting only Python 2 dwindled, and shortly before Python 2 reached its end-of-life, all the most popular libraries had become Python 3-compatible. If you are stuck with Python 2 in a legacy environment, you have various strategies at your disposal.
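For readers who haven't met the "walrus operator" mentioned above, here is a small sketch of it. It assumes Python 3.8 or later and uses io.StringIO as a stand-in for a file or socket read in chunks; the names stream, chunk, and chunks are illustrative.

```python
import io

# io.StringIO stands in for any source that is read a piece at a time.
stream = io.StringIO("abcdefghij")

chunks = []
# The walrus operator (:=) assigns a value and tests it in one expression.
# The loop ends when read() returns an empty string.
while (chunk := stream.read(4)):
    chunks.append(chunk)

print(chunks)
```

Without the operator, the read would have to appear twice: once before the loop and once at the end of its body.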
Python's tools and libraries
Python's success rests on a rich ecosystem of first- and third-party software. The language benefits from both a strong standard library and a generous assortment of libraries that are easily obtained from third-party developers. Its ecosystem has been enriched by decades of expansion and contribution.
The standard library provides modules for common programming tasks: math, string handling, file and directory access, networking, asynchronous operations, threading, multiprocess management, and so on. But it also includes modules for managing the common, high-level programming tasks required for modern applications. These include reading and writing structured file formats like JSON and XML, manipulating compressed files, working with internet protocols and data formats (web pages, URLs, email), and more. Most any external code that exposes a C-compatible foreign function interface can be accessed with Python’s ctypes module.
The default Python distribution also provides a rudimentary but useful cross-platform GUI library via Tkinter, and an embedded copy of the SQLite 3 database.
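Here is a minimal sketch of two of those "batteries included" pieces working together: the json module parses a document, and the bundled SQLite engine stores and queries the result. The table name, column names, and sample record are assumptions made up for the example.

```python
import json
import sqlite3

# Parse a JSON document into ordinary Python objects.
record = json.loads('{"name": "Ada", "langs": ["Python", "C"]}')

# Store and query the data with the bundled SQLite engine.
# ":memory:" creates a temporary database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, lang_count INTEGER)")
conn.execute(
    "INSERT INTO people VALUES (?, ?)",
    (record["name"], len(record["langs"])),
)
rows = conn.execute("SELECT name, lang_count FROM people").fetchall()
conn.close()

print(rows)  # [('Ada', 2)]
```

Nothing here needs to be installed separately; both modules ship with the default distribution.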
The thousands of third-party libraries, available through the Python Package Index (PyPI), constitute the strongest showcase for Python’s popularity and versatility. Here are some examples:
- The BeautifulSoup library provides an all-in-one toolbox for scraping HTML—even tricky, broken HTML—and extracting data from it.
- Requests and httpx make working with HTTP requests at scale painless and simple.
- Frameworks like Flask, Django, and FastAPI allow rapid development of web services that encompass both simple and advanced use cases.
- NumPy, Pandas, and Matplotlib accelerate math and statistical operations, and make it easy to create visualizations of data.
- Multiple cloud services can be managed through Python’s object model using Apache Libcloud.
Python’s compromises
Like C#, Java, and Go, Python has automatic memory management, meaning the programmer doesn’t have to implement code to track and release objects. Normally, garbage collection (for objects that don't clean themselves up correctly) happens automatically in the background, but if that poses a performance problem, you can trigger it manually or disable it entirely. You can even declare whole regions of objects exempt from garbage collection as a performance enhancement.
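The controls described above live in the standard library's gc module. The sketch below shows the calls involved, assuming Python 3.7 or later for gc.freeze(); the variable names are illustrative, and whether any of this helps performance depends entirely on the workload.

```python
import gc

gc.disable()                   # turn off automatic garbage collection
was_enabled = gc.isenabled()   # False while collection is disabled

# ... allocation-heavy work would go here ...

unreachable = gc.collect()     # trigger a full collection manually
gc.enable()                    # restore automatic collection

# gc.freeze() moves every object the collector currently tracks into a
# permanent generation that later collections ignore.
gc.freeze()
frozen = gc.get_freeze_count()  # number of objects now exempt
gc.unfreeze()                   # make them eligible for collection again

print(was_enabled, unreachable, frozen)
```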
An important aspect of Python is its dynamism. Everything in the language, including functions and modules themselves, is handled as an object. This comes at the expense of speed (more on that later), but makes it far easier to write high-level code. Developers can perform complex object manipulations with only a few instructions, and even treat parts of an application as abstractions that can be altered if needed.
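A short sketch of what "everything is an object" means in practice follows. The names operations, results, greet, and call_note are invented for the illustration.

```python
import math

# Functions are objects: they can be stored in a dictionary and called later.
operations = {"sqrt": math.sqrt, "floor": math.floor}
results = {name: func(10.5) for name, func in operations.items()}

# Objects can be altered at runtime: attach a new attribute to a function.
def greet(name):
    return f"Hello, {name}!"

greet.call_note = "added at runtime"

# Modules are objects too, so their attributes can be looked up by name.
pi_value = getattr(math, "pi")

print(results["floor"], greet("world"), greet.call_note)
```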
Python’s use of significant whitespace has been cited as both one of its best and worst attributes. The indentation on the second line below isn’t just for readability; it's part of Python’s syntax. Python interpreters will reject programs that don’t use proper indentation to indicate control flow.
with open('myfile.txt') as my_file:
    file_lines = [x.rstrip('\n') for x in my_file]
Syntactical whitespace might cause noses to wrinkle, and some programmers do reject Python for this reason. But strict indentation rules are far less obtrusive in practice than they might seem in theory, even with the most minimal of code editors, and the result is code that is cleaner and more readable.
Another potential turnoff, especially for those coming from languages like C or Java, is how Python handles variable typing. By default, Python uses dynamic or “duck” typing, which is great for quick coding but potentially problematic in large codebases. Names don't have types, but the objects they refer to do. That said, Python supports optional type hints, which external tools can check before the program runs, so projects that might benefit from static typing can use it.
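The sketch below illustrates both halves of that point: a name being rebound to objects of different types, and a function annotated with optional type hints. It assumes Python 3.9 or later for the list[float] syntax; the function average and the other names are hypothetical.

```python
def average(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# The name `result` has no fixed type; the objects bound to it do.
result = average([1.0, 2.0, 3.0])
first_type = type(result).__name__
result = "now a string"
second_type = type(result).__name__

# The interpreter does not enforce hints at runtime. It stores them as
# metadata that external checkers (mypy is one example) can verify.
hints = average.__annotations__

print(first_type, second_type, sorted(hints))
```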
Is Python really that slow?
One common caveat about Python is that it’s slow. Objectively, it’s true. Python programs generally run much more slowly than corresponding programs in C/C++ or Java. Some Python programs will be slower by an order of magnitude or more.
Why so slow? It isn’t just because most Python runtimes are interpreters rather than compilers. It is also due to the fact that the inherent dynamism and the malleability of objects in Python make it difficult to optimize the language for speed, even when it is compiled. That said, Python’s speed may not be as much of an issue as it seems, and there are ways to alleviate it.
Python performance optimizations
A slow Python program isn't necessarily fated to be forever slow. Many Python programs are slow because they don’t properly use the functionality in Python or its standard library. Novice Python programmers often write Python as if it were C or Java, and leave potential performance optimizations unexplored. As an example, you can speed up math and statistics operations dramatically by using libraries such as NumPy and Pandas.
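As a small illustration that needs nothing beyond the standard library, the sketch below computes the same mean three ways. The data and variable names are invented, and statistics.fmean assumes Python 3.8 or later. In CPython the built-in versions are typically faster because the loop runs inside the interpreter's compiled code, though the actual gain varies by workload.

```python
import statistics

readings = [float(n) for n in range(1, 1001)]

# C-style: a manual index loop with a running total.
total = 0.0
for i in range(len(readings)):
    total = total + readings[i]
manual_mean = total / len(readings)

# Idiomatic: the built-in sum() does the looping internally.
builtin_mean = sum(readings) / len(readings)

# The standard library also offers a ready-made function for the task.
library_mean = statistics.fmean(readings)

print(manual_mean, builtin_mean, library_mean)
```

The timeit module in the standard library is one way to measure the difference on your own machine.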
A common adage of software development is that 90 percent of the activity for a program tends to be in 10 percent of the code, so optimizing that 10 percent can yield major improvements. With Python, you can selectively convert that 10 percent to C or even assembly, using projects like Cython or Numba. The result is often a program that runs within striking distance of a counterpart written entirely in C, but without being cluttered with C’s memory-micromanagement details.
Finally, alternative Python runtimes have speed optimizations that the stock CPython runtime lacks. PyPy, for instance, is a just-in-time (JIT) Python compiler that converts Python to native machine code on the fly. PyPy can provide orders-of-magnitude speedups for many common operations.
Ongoing Python performance improvements
The core developers for CPython, the default Python implementation, have historically favored keeping the implementation simple over trying to make elaborate performance improvements. But over time, there's been a greater push to make Python perform better, and those efforts are now paying off.
For instance, the bytecode generated by CPython from a program's source code can be analyzed at runtime and "specialized." If a given operation consistently involves the same types, it can be optimized for those types. Many more optimizations like this are in the offing.
Another major project, still in its infancy, is removing CPython's Global Interpreter Lock (GIL), a thread-synchronization mechanism that's kept Python threads from being properly concurrent. Removing the GIL is a complex project, since one of the requirements imposed is that it can't make single-threaded programs slower. But a new implementation of the idea is now in testing, and shows great promise. It's set to be phased in over the next few versions of Python.
Developer time often trumps machine time
For many tasks in Python, the speed of development beats the speed of execution.
A given Python program might take six seconds to execute versus a fraction of a second in another language. But it might take only 10 minutes for a developer to put that Python program together, versus an hour or more of development time in another language. The amount of time lost in the execution of the Python program is more than gained back by the time saved in the development process.
Obviously, this is less true when you’re writing software that has high-throughput, high-concurrency demands, such as a trading application or a database. But for many real-world applications, in domains ranging from systems management to machine learning, Python will prove to be fast enough.
Plus, the flexibility and pace of development that Python enables may allow for innovation that would be more difficult and time-consuming to achieve in other languages. And the language's development team has an eye turned toward making everything faster with time.
When speed of development and programmer comfort are more important than shaving a few seconds off the machine clock, Python may well be the best tool for the job.