A Horrified Haskeller's Descent into Python Gradual Static Typing

About Me

- excited by Programming Languages
- for work, mostly: Haskell, Python
      ... but also: Fortran, PHP, Java
- for fun: Rust, Idris, Lua, (and LEDs)
- github.com/benclifford

About this talk

- An experience report of moving from mostly Haskell to Python
- Not going to try to convince you static types are good or bad
- This is an FP meetup so I'm expecting you to form your own opinions
- This is not a tutorial - I hope to give you a taste of what
  I have encountered.

About the project

Parsl - parsl-project.org

Library to help you run Python code on (mostly) supercomputers
(eg 1000 nodes x 68 cores)

Development mostly around University of Chicago

Academia - lots of exploratory features/projects that are made public

Python - because "everybody" uses it in our target fields

My interest in static typing Parsl

- initially, to understand existing codebase
- and to understand how static typing works in Python
- more recently, a decent tool for reliability
   - especially for code that is supercomputer-specific and so
     has no CI integration tests

(dynamic) types in Python

Values have types. Variables do not.

  x = 3
  type(x)  ==> <class 'int'>
  x = {}
  type(x)  ==> <class 'dict'>

Type syntax

# untyped
def square(y):
  return y*y

x = square(1.41)

# typed
def square(y: float) -> float:
  return y*y

x: float = square(1.41)

Type annotations have no effect!

def square(y: float) -> float:
  return y*y

x: float = square([])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in square
TypeError: can't multiply sequence by non-int of type 'list'
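
A minimal sketch (not from the original slides) of what "no effect" means at
runtime: the annotations are kept as metadata on the function object, and
nothing consults them when the function is called.

```python
def square(y: float) -> float:
    return y * y

# The annotations are stored as plain metadata on the function object...
print(square.__annotations__)
# ==> {'y': <class 'float'>, 'return': <class 'float'>}

# ...but nothing enforces them: an int or a complex number goes straight through.
print(square(3))   # ==> 9
print(square(2j))  # ==> (-4+0j)
```

The call only fails when the body itself cannot cope with the value, as with
the list above.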

Runtime checking

@typeguard.typechecked
def square(y: float) -> float:
  return y*y

x: float = square([])

Traceback (most recent call last):
...
TypeError: type of argument "y" must be either float or int; got list instead
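
To give a feel for what a runtime checker does, here is a hand-rolled and much
simplified stand-in. The name `checked` is hypothetical and this is not how
typeguard is implemented: it only handles annotations that are plain classes,
and unlike typeguard it would reject an int where a float is expected.

```python
import functools
import inspect
from typing import get_type_hints

def checked(func):
    """Hypothetical, much-simplified stand-in for typeguard.typechecked."""
    hints = get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Match the actual arguments up with the parameter names.
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; Union, List[int] etc. are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f'argument "{name}" must be {expected.__name__}; '
                    f'got {type(value).__name__} instead')
        return func(*args, **kwargs)
    return wrapper

@checked
def square(y: float) -> float:
    return y * y

print(square(1.5))  # ==> 2.25
try:
    square([])
except TypeError as e:
    print(e)        # ==> argument "y" must be float; got list instead
```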

Static checking

def square(y: float) -> float:
  return y*y

x: float = square([])

$ mypy source.py
source.py:4: error: Argument 1 to "square" has incompatible type "List[<nothing>]";
  expected "float"

Type hierarchy

>>> float.__mro__
(<class 'float'>, <class 'object'>)

>>> type([]).__mro__
(<class 'list'>, <class 'object'>)

>>> isinstance(1.23, object)
True
>>> isinstance([], object)
True
>>> isinstance(1.23, list)
False

Type hierarchy

def f(x: object):
  print(x)

x: float = 1.23

f(x) # typechecks ok, because float <= object

mypy in CI

- mypy - a static typechecker for python
- no static checking at runtime
- mypy happens entirely within test/CI cycle
- same time as lint tools

Gradual typing

from typing import Any

def f(x: float):
  print(x + 1)

y: Any = []

f(y)

# list ~ Any ~ float   (!)

# but at runtime

TypeError: can only concatenate list (not "int") to list

Gradual Typing

a = planet
a = a.pickCountry()
a = a.pickCity()
a = a.pickCoordinates()

- Rewrite this in more amenable style...
- or    a: Any
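
A sketch of the "more amenable style": give each intermediate value its own
name, so every variable keeps one static type and no Any is needed. The
classes below are hypothetical stand-ins, invented only so the example runs.

```python
class Coordinates:
    def __init__(self, lat: float, lon: float) -> None:
        self.lat = lat
        self.lon = lon

class City:
    def pickCoordinates(self) -> Coordinates:
        return Coordinates(41.88, -87.63)

class Country:
    def pickCity(self) -> City:
        return City()

class Planet:
    def pickCountry(self) -> Country:
        return Country()

planet = Planet()

# One name per step: a checker infers Country, City and Coordinates in turn,
# instead of complaining that "a" changes type on every line.
country = planet.pickCountry()
city = country.pickCity()
coordinates = city.pickCoordinates()

print(coordinates.lat, coordinates.lon)  # ==> 41.88 -87.63
```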

Union types

from typing import Union

def f(x: Union[float, str]):
    if isinstance(x, float):
        print(x*2)
    else:
        print("not a float")

y: float = 1.23

f(y)  ==> 2.46

# float <= Union[float, str]
# str <= Union[float, str]

Optional

Optional[X] is equivalent to Union[X, None]
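
A small sketch of how this plays out in practice (the function `describe` is
made up for illustration): a checker makes you deal with None before using
the value as an X, in the same way as the isinstance check for Union.

```python
from typing import Optional, Union

def describe(x: Optional[float]) -> str:
    # Here x is Optional[float]; a checker would reject x * 2 at this point.
    if x is None:
        return "nothing"
    # After the None check, x is narrowed to float.
    return str(x * 2)

print(describe(1.5))   # ==> 3.0
print(describe(None))  # ==> nothing

# The two spellings really are the same type:
print(Optional[float] == Union[float, None])  # ==> True
```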

Duck typing, statically

"If it walks like a duck and it quacks like a duck, then it must be a duck"

def print_len(x):
    print(len(x))

print_len([]) => 0       # empty list
print_len({}) => 0       # empty dict
print_len("hello") => 5  # str

print_len(1.23) => TypeError: object of type 'float' has no len()

Duck typing, statically

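from typing import Protocol, runtime_checkable

@runtime_checkable      # needed for the isinstance() check below
class Sized(Protocol):  # (based on real Python impl)
    def __len__(self) -> int:
        pass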

def print_len(x: Sized):
    print(len(x))

print_len([]) => 0
print_len({}) => 0
print_len("hello") => 5

isinstance({}, Sized) ==> True

print_len(100)
s.py:13: error: Argument 1 to "print_len" has incompatible type "int";
                expected "Sized"

Duck typing, statically

class A():
  def __len__(self):
    return 128

a = A()

print_len(a)  ==> 128

isinstance(a, Sized)  ==> True

Dynamic arguments

def f(*args, **kwargs):
    print(f"There are {len(args)} regular args")
    print(f"There are {len(kwargs)} keyword args")

f() => 
There are 0 regular args
There are 0 keyword args

f(1,2,3) =>
There are 3 regular args
There are 0 keyword args

f(8, greeting="hello") =>
There are 1 regular args
There are 1 keyword args
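
A sketch of how these can be annotated: the annotation on *args and **kwargs
describes each individual value, not the tuple or dict as a whole.

```python
def f(*args: int, **kwargs: str) -> None:
    # Inside the function, args is Tuple[int, ...] and kwargs is Dict[str, str].
    print(f"There are {len(args)} regular args")
    print(f"There are {len(kwargs)} keyword args")

f(1, 2, 3)
f(8, greeting="hello")

# A static checker should reject f("x") and f(greeting=1);
# at runtime both would still run, because annotations have no effect.
```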

Decorators

@typeguard.typechecked
def square(y: float) -> float:
  return y*y

# from the Flask quickstart
@app.route('/post/<int:post_id>')
def show_post(post_id):
    return 'Post %d' % post_id

Decorators

@mydecorator
def f(x):
    return x+1

desugars to (approx):

def internal_f(x):
    return x+1

f = mydecorator(internal_f)
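
For concreteness, a sketch of one possible mydecorator (a made-up
call-counting example, not taken from any library): it takes a function and
returns a replacement function that wraps it.

```python
import functools

def mydecorator(func):
    # Hypothetical decorator: count how many times func is called.
    @functools.wraps(func)  # copy func's name and docstring onto the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@mydecorator
def f(x):
    return x + 1

print(f(1))     # ==> 2
print(f(10))    # ==> 11
print(f.calls)  # ==> 2
```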

Decorator typing


@mydecorator
def f(x: int) -> int:
  return x+1

# aka:

def internal_f(x: int) -> int:
    return x+1

f = mydecorator(internal_f)

def mydecorator(function: ??) -> ??:
    ...

Decorator typing

from typing import TypeVar

Sig = TypeVar('Sig')

def mydecorator(func: Sig) -> Sig:
    return func

def internal_f(x: int) -> int:
    return x+1

f = mydecorator(internal_f)
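
A slightly tighter sketch of the same idea, assuming a decorator that returns
the very function it was given. The names `register` and `registry` are
hypothetical. Bounding the TypeVar by Callable stops the decorator being
applied to things that are not functions.

```python
from typing import Any, Callable, List, TypeVar

# Any callable at all, but nothing that is not callable.
F = TypeVar('F', bound=Callable[..., Any])

# Hypothetical registry, in the style of a route table or plugin list.
registry: List[Callable[..., Any]] = []

def register(func: F) -> F:
    # The same object goes out as came in, so "F -> F" is an honest type.
    registry.append(func)
    return func

@register
def f(x: int) -> int:
    return x + 1

print(f(1))             # ==> 2
print(registry == [f])  # ==> True
```

This only works because the signature is unchanged; a decorator that changes
the return type needs more than a single TypeVar.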

Decorator typing

@parsl.python_app
def f(x: int) -> str:
  return str(x)

# should have type
#  f(x: int) -> Future[str]

but

Sig = TypeVar('Sig')

def mydecorator(func: Sig) -> Sig:
    ...

is not expressive enough (in Python <= 3.9)

Co-/contra-variance

from typing import List

class Animal():
    pass

class Dog(Animal):
    pass

# Dog <= Animal <= object

animals: List[Animal] = []

def add_dog(l: List[Dog]):
  my_dog: Dog = Dog()
  l.append(my_dog)

add_dog(animals)    # valid?

Co-/contra-variance

from typing import List

class Animal():
    pass

class Dog(Animal):
    pass

class Cat(Animal):
    pass

class Cow(Animal):
    pass

# Dog <= Animal <= object

animals: List[Animal] = [Cat(), Dog(), Dog(), Cow()]

def count_dogs(l: List[Dog]):
    print(f"There are {len(l)} dogs")

count_dogs(animals)    # valid?

Co-variance


Dog <= Animal

implies

Sequence[Dog] <= Sequence[Animal]

(Sequence[X] is a read only List/tuple/...)
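
A sketch of why read-only makes this safe (`count_animals` is a made-up
function): it can be handed a list of Dogs, because through the Sequence
interface it has no way to put a non-Dog into that list.

```python
from typing import List, Sequence

class Animal:
    pass

class Dog(Animal):
    pass

def count_animals(s: Sequence[Animal]) -> int:
    # Sequence has no append(), so the caller's list cannot be corrupted.
    return len(s)

dogs: List[Dog] = [Dog(), Dog()]

# Accepted by a checker: List[Dog] <= Sequence[Dog] <= Sequence[Animal]
print(count_animals(dogs))  # ==> 2
```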

Contra-variance

Dog <= Animal

implies

Callable[[Animal], str] <= Callable[[Dog], str]
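
A sketch of the same statement as code (the function names are made up):
a function that can describe any Animal is acceptable wherever a function
that describes Dogs is wanted, because it will only ever be handed Dogs.

```python
from typing import Callable

class Animal:
    name = "generic animal"

class Dog(Animal):
    name = "dog"

def describe_animal(a: Animal) -> str:
    return f"an animal called {a.name}"

def greet_dog(describe: Callable[[Dog], str]) -> str:
    # greet_dog only ever passes a Dog, which any Animal-handler can cope with.
    return "hello, " + describe(Dog())

# Accepted by a checker: Callable[[Animal], str] <= Callable[[Dog], str]
print(greet_dog(describe_animal))  # ==> hello, an animal called dog
```

The reverse direction is the unsafe one: a function that needs a Dog cannot
stand in where arbitrary Animals will be passed.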

Co-/contra-variance

* Co-variance  eg. (read only) Sequence
or
* Contra-variance  eg. function args
or if neither:
* invariant   eg. List

In practice, I often hit problems with List. Fix: e.g. replace it with Sequence.

Parsl development considerations

* Easy stuff
  - Can go into master
  - type annotations with none of the nonsense that I've
    just talked about; gradual typing/Any elsewhere
  - easy for everyone to understand simple typing (c.f. Haskell98 crowd)
  - high payoff in poorly-tested code like error handling
  - typeguard at user boundaries (runtime checking)
  - mypy within Parsl codebase (static checking)

* Hard stuff
  - separate branch for my exploration
  - discover bugs to fix on master
    without necessarily adding types to master
  - avoid forcing complication onto other dynamic Python programmers
  - free to do madness (eg no backwards compatibility,
    type checker plugins)
- End -