A Horrified Haskeller's Descent into Python Gradual Static Typing
About Me
- excited by Programming Languages
- for work, mostly: Haskell, Python
... but also: Fortran, PHP, Java
- for fun: Rust, Idris, Lua, (and LEDs)
- github.com/benclifford
About this talk
- An experience report of moving from mostly Haskell to Python
- Not going to try to convince you static types are good or bad
- This is an FP meetup so I'm expecting you to form your own opinions
- This is not a tutorial - I hope to give you a taste of what
I have encountered.
About the project
Parsl - parsl-project.org
Library to help you run Python code on (mostly) supercomputers
(eg 1000 nodes x 68 cores)
Development mostly around University of Chicago
Academia - lots of exploratory features/projects that are made public
Python - because "everybody" uses it in our target fields
My interest in statically typing Parsl
- initially, to understand existing codebase
- and to understand how static typing works in Python
- more recently, a decent tool for reliability
- especially for code that is supercomputer-specific and so
has no CI integration tests
(dynamic) types in Python
Values have types. Variables do not.
x = 3
type(x) ==> <class 'int'>
x = {}
type(x) ==> <class 'dict'>
Type syntax
# untyped
def square(y):
    return y*y
x = square(1.41)
# typed (annotations are not checked at runtime)
def square(y: float) -> float:
    return y*y
x: float = square([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in square
TypeError: can't multiply sequence by non-int of type 'list'
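A small sketch of what the traceback above is showing: the annotations are kept as metadata on the function, but Python itself does not check them when the function is called. The variable name `result` is just for illustration.

```python
def square(y: float) -> float:
    return y * y

# The annotations are stored on the function object...
print(square.__annotations__)

# ...but nothing enforces them at call time. A str is accepted for the
# "float" parameter and only fails when the body tries to multiply.
try:
    square("ab")
except TypeError as e:
    print(f"error raised by the body, not by the annotation: {e}")

# An int is not a float at runtime either, yet this call runs fine.
result = square(3)
```

A static checker such as mypy is what would flag the bad calls before the program runs.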
Runtime checking
import typeguard
@typeguard.typechecked
def square(y: float) -> float:
    return y*y
x: float = square([])
Traceback (most recent call last):
...
TypeError: type of argument "y" must be either float or int; got list instead
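To give a feel for what a runtime checker does, here is a deliberately tiny, hand-rolled decorator. It is not how typeguard is implemented: `checked` is a hypothetical name, and this sketch only understands plain classes as annotations (no generics, unions, or return-value checks).

```python
import functools
import inspect

def checked(func):
    """Hypothetical, much-simplified stand-in for a runtime type checker."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            expected = param.annotation
            # Skip unannotated parameters, *args/**kwargs, and anything
            # that is not a plain class (generics, unions, strings, ...).
            if expected is inspect.Parameter.empty:
                continue
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            if not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f'argument "{name}" must be {expected.__name__}; '
                    f"got {type(value).__name__} instead"
                )
        return func(*args, **kwargs)

    return wrapper

@checked
def square(y: float) -> float:
    return y * y

ok = square(1.5)  # passes the check
```

Unlike the typeguard message above, this sketch does not accept an int where a float is annotated; a real checker handles that and much more.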
"If it walks like a duck and it quacks like a duck, then it must be a duck"
def print_len(x):
    print(len(x))
print_len([]) => 0 # empty list
print_len({}) => 0 # empty dict
print_len("hello") => 5 # str
print_len(1.23) => TypeError: object of type 'float' has no len()
Duck typing, statically
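from typing import Protocol, runtime_checkable
@runtime_checkable     # needed for the isinstance checks below
class Sized(Protocol): # (based on real Python impl)
    def __len__(self) -> int:
        pass
def print_len(x: Sized):
    print(len(x))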
print_len([]) => 0
print_len({}) => 0
print_len("hello") => 5
isinstance({}, Sized) ==> True
print_len(100)
s.py:13: error: Argument 1 to "print_len" has incompatible type "int";
expected "Sized"
Duck typing, statically
class A():
    def __len__(self):
        return 128
a = A()
print_len(a) ==> 128
isinstance(a, Sized) ==> True
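A self-contained sketch of the same idea with a user-defined protocol. `SupportsLen` and `describe` are hypothetical names for illustration. For a protocol you write yourself, `isinstance` checks like the ones above only work if the class is decorated with `typing.runtime_checkable`.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsLen(Protocol):  # hypothetical name, mirrors the Sized idea
    def __len__(self) -> int: ...

class A:
    # No inheritance from SupportsLen: having __len__ is enough.
    def __len__(self) -> int:
        return 128

def describe(x: SupportsLen) -> str:
    return f"length {len(x)}"

print(describe(A()))
print(describe({}))

# Structural checks at runtime: only the presence of __len__ matters.
a_matches = isinstance(A(), SupportsLen)
int_matches = isinstance(100, SupportsLen)
```

The runtime check only looks for the method's presence; it is the static checker that compares signatures.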
Dynamic arguments
def f(*args, **kwargs):
    print(f"There are {len(args)} regular args")
    print(f"There are {len(kwargs)} keyword args")
f() =>
There are 0 regular args
There are 0 keyword args
f(1,2,3) =>
There are 3 regular args
There are 0 keyword args
f(8, greeting="hello") =>
There are 1 regular args
There are 1 keyword args
from typing import TypeVar
Sig = TypeVar('Sig')
def mydecorator(func: Sig) -> Sig:
    return func
def internal_f(x: int) -> int:
    return x+1
f = mydecorator(internal_f)
Decorator typing
@parsl.python_app
def f(x: int) -> str:
    return str(x)
# should have type
# f(x: int) -> Future[str]
but
Sig = TypeVar('Sig')
def mydecorator(func: Sig) -> Sig:
    ...
is not expressive enough (in Python <=3.9)
Co-/contra-variance
class Animal():
    pass
class Dog(Animal):
    pass
# Dog <= Animal <= object
animals: List[Animal] = []
def add_dog(l: List[Dog]):
    my_dog: Dog = ...
    l.append(my_dog)
add_dog(animals) # valid?
Co-/contra-variance
class Animal():
    pass
class Dog(Animal):
    pass
# Dog <= Animal <= object
animals: List[Animal] = [Cat(), Dog(), Dog(), Cow()]
def count_dogs(l: List[Dog]):
    print(f"There are {len(l)} dogs")
count_dogs(animals) # valid?
Co-variance
Dog <= Animal
implies
Sequence[Dog] <= Sequence[Animal]
(Sequence[X] is a read only List/tuple/...)
Contra-variance
Dog <= Animal
implies
Callable[[Animal], str] <= Callable[[Dog], str]
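A small sketch of why function arguments go "the other way". The method and function names (`name`, `bark`, `describe_animal`, `describe_dog`, `greet_dog`) are made up for illustration.

```python
from typing import Callable

class Animal:
    def name(self) -> str:
        return "animal"

class Dog(Animal):
    def name(self) -> str:
        return "dog"

    def bark(self) -> str:
        return "woof"

def describe_animal(a: Animal) -> str:
    # Works for any Animal, so it certainly works for a Dog.
    return f"this is a {a.name()}"

def describe_dog(d: Dog) -> str:
    # Relies on a Dog-only method, so it cannot handle every Animal.
    return f"this dog says {d.bark()}"

def greet_dog(describe: Callable[[Dog], str]) -> str:
    return describe(Dog())

# Accepted: a Callable[[Animal], str] can stand in for a Callable[[Dog], str].
greeting = greet_dog(describe_animal)

# The reverse is unsafe: passing describe_dog where Callable[[Animal], str]
# is expected would let it be called with a plain Animal, which has no bark().
```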
Co-/contra-variance
* Co-variance eg. (read only) Sequence
or
* Contra-variance eg. function args
or if neither:
* invariant eg. List
In practice, I hit problems with List often; one fix is to replace it with Sequence
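A sketch of the List-to-Sequence fix, with hypothetical function names. Both functions behave the same at runtime; the difference is only in what a checker such as mypy is expected to accept.

```python
from typing import List, Sequence

class Animal:
    pass

class Dog(Animal):
    pass

def count_list(animals: List[Animal]) -> int:
    return len(animals)

def count_seq(animals: Sequence[Animal]) -> int:
    return len(animals)

dogs: List[Dog] = [Dog(), Dog()]

# count_list(dogs)   # expected to be rejected statically: List is invariant,
#                    # because count_list could append a non-Dog Animal.
n = count_seq(dogs)  # accepted: Sequence is read-only, hence covariant.
```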
Parsl development considerations
* Easy stuff
- Can go into master
- type annotations with none of the nonsense that I've
just talked about; gradual typing/Any elsewhere
- easy for everyone to understand simple typing (c.f. Haskell98 crowd)
- high payoff in poorly-tested code like error handling
- typeguard at user boundaries (runtime checking)
- mypy within Parsl codebase (static checking)
* Hard stuff
- separate branch for my exploration
- discover bugs to fix on master
without necessarily adding types to master
- avoid forcing complication onto other dynamic Python programmers
- free to do madness (eg no backwards compatibility,
type checker plugins)