acme-smuggler

Ben Clifford

London Haskell, August 2017

benc@hawaga.org.uk

Press 's' for speaker notes


smuggle :: Typeable a =>   a -> ()
discover :: Typeable a => () -> Maybe a

This is impossible. But it is possible.

Demo

> import Acme.Smuggler
> let x = smuggle (32 :: Float)
> let y = smuggle "Hello"
> :t x
x :: ()
> :t y
y :: ()
> discover x :: Maybe Float
Just 32.0
> discover x :: Maybe String
Nothing
> discover y :: Maybe String
Just "Hello"

Injecting


type T = Either String Int

inj = Left :: String -> T

Projecting


prj :: T -> Maybe String
prj = either Just (const Nothing)
> prj (inj "hi")
Just "hi"

> prj (Right 123)
Nothing

prj . int round trip rule


  (prj . inj) = Just

Other examples

Open unions:

type S = Union '[Char, Int, [()]]

Word8 into Double:

> fromIntegral (255 :: Word8) :: Double
255.0

> round (255.0 :: Double) :: Word8
255
> round (3.1415 :: Double) :: Word8
3

read . show

Cardinality


type T = Either String Int

inj :: String -> T
inj = Left

Injection rule:

inj a == inj b => a == b

https://en.wikipedia.org/wiki/Cardinality

smuggle :: a -> ()
discover :: () -> Maybe a
or lets rename and specialise:
inj :: String -> ()
prj :: () -> Maybe String

Can an injection exist?

Can inj :: String -> () exist?

inj "hello" = ()
inj "goodbye" = ()

Injection rule:

() == () => "hello" == "goodbye"

... doesn't hold

So acme-smuggler is impossible!

... but it isn't impossible. Everything I've just told you is wrong.

Getting dirty


type T = Either String Int

fromLeft :: T -> String
fromLeft (Left s) = s
fromLeft (Right 7) = ??????

type T = Either String Int

fromLeft :: T -> String
fromLeft (Left s) = s
fromLeft (Right _) = undefined

Smuggling Bool


inj :: Bool -> ()
inj True = ()
inj False = undefined

Laziness

> let x = inj False 
-- no error
> (const 3) x
3
> x
*** Exception: Prelude.undefined

human-prj

> inj True
()
> inj False
*** Exception: Prelude.undefined

but...

case x of 
  () -> True
  ???? -> False

if x == undefined then False else True

Observing bottom programmatically


try :: IO a -> IO (Either e a)

h :: () -> IO (Either SomeException Bool)
h v = try $ pure (v == ())

prj' :: () -> IO Bool
prj' v = case h v of
  Right _ -> True
  Left _ -> False

unsafePerformIO :: IO a -> a

prj :: () -> Bool
prj = unsafePerformIO . prj'

Is unsafePerformIO safe here?

Is this referentially transparent? YES

Can execute this multiple times (or no times?) YES

But https://hackage.haskell.org/package/base-4.10.0.0/docs/System-IO-Unsafe.html

For this to be safe, the IO computation should be free of side effects and independent of its environment.

Boolean Smuggler


inj :: Bool -> ()
prj :: () -> Bool

(prj . inj) = id

-- ????
inj a == inj b => a == b

Extensible exceptions


h :: () -> IO (Either SomeException Bool)
h v = try $ pure (v == ())

prj' v = case h v of
  Right _ -> True
  Left _ -> False

class Exception where ...

instance Exception IOException
instance Exception BlockedIndefinitelyOnSTM
instance Exception AssertionFailed
instance Exception PatternMatchFail

instance Exception SomethingToSmuggle - ????

Data.Dynamic


toDyn :: Typeable a => a -> Dynamic -- aka inj
fromDynamic :: Typeable a	    -- aka prj
              => Dynamic -> Maybe a

instance Exception Dynamic

Putting the pieces together


-- inj
smuggle :: Typeable t => t -> ()
smuggle = throw . toDyn

-- prj
discover :: Typeable t => () -> Maybe t
discover = either (fromDynamic) (const Nothing)
  . unsafePerformIO
  . try
  . (pure $!)

smuggle :: Typeable t => t -> ()
smuggle = throw  .  toDyn
--     ()  <-   Dyn   <-   t
--     a

discover :: Typeable t => () -> Maybe t
discover =
   either (fromDynamic) (const Nothing) -- Maybe t
 . unsafePerformIO              -- Either Dynamic ()
 . try                          -- IO (Either Dynamic ())
 . (pure $!)                    -- IO ()
                                -- ()
- fin -
TODO: about how I got thinking about this through that juicy pixels timeout bug
acme-smuggler talk: half hour talk to london hug on acme smuggler ** old notes (green IO: could draw green kleisli arrows for IO rather than a green outer wrapper? that would fit the model of a chain of function applications that matches the implementation better? or a squiggly arrow so colour isn't needed? (or both...) ) primary presentation mode should be presenting set/function diagrams so some kind of diagram space (eg slidy with pre-cut up pngs?) but with another screen loaded with ghci and all relevant imports, for showing snippets of evaluation by hand to go with what is being shown in the diagram. figure out a clear visual style for those diagrams that encapsulates what I want to show throughout (eg later on, IO actions and unevaluated thunks) diagrams probably too detailed to fit in a half window, although if they could occupy 240 pixels height max, they could occupy the top half of the screen and the bottom half could be used for showing interactivity. or something like that. 0. introduce this as an acme- "fun" package that resulted from some serious work trying to understand real world stuff for real money in a real job. (don't reveal that it is error handling that I was trying to understand in case I reveal clues). because it's acme, there's a trick. but don't shout out the answer! 1. introduce inj/prj from acme-smuggler with a two line demo. 2. Why is this "wrong"? introduce inj/prj more generally - sets that are bigger/smaller - with set diagrams and value arrows and a big function arrow for inj or prj then show inj/prj diagram with () - so that you can see all values must go to () and so you can't then project out your original value - all the information must have been lost. so now I've proved that it can't work but demonstrated that it does... what's going on? 3. haskell types have another "sort of" value in all of them - bottom - that represents computations that did not work. so we have: let u = undefined :: () :t u u :: () but we can't observe that bottom - the whole computation that it is used in fails. > length ("hello " ++ show u) *** Exception: Prelude.undefined 4 ... or can we? In IO we can observe exceptions - i.e. catch exceptions `try` with inject all of the correct values into Right and our exception message into Left. Show a diagram here - don't make it too obvious that there will be many different Left values, but make it obvious that there are many right values. Diagram is Int projected into Either SomeException Int Left . Also it's all inside IO but that doesn't matter too much here for now. But perhaps use some notation like a red box around 7 meaning "a computation that will return 7"? That red box notation will be useful for thunks later? Or perhaps green cloud notation around it all, labelled IO. But when we run this: > import Control.Exception > try $ return $ length ("hello " ++ show u) :: IO (Either SomeException Int) Right *** Exception: Prelude.undefined We didn't get an exception so we got the Right constructor... but then we did get an exception. ... what happened there? try only catches exceptions during IO. But the only IO action we have was "return", and that always succeeds, in this case returning a thunk to an integer. Diagram is: red thunk (labelled :: ()), green IO wrapper: an IO action that returns an unevaluated thunk. but we run the IO and the green IO wrapper and the return disappears, giving us just the red-outline thunk... and then we try to print it and we have to evaluate it, and discover it is bottom. (click-through animation sequence) It succeeds so we get a Right constructor containing that thunk. Then when we try to print that thunk, at the repl outside of IO, we force the evaluation and our exception happens. But we're done with `try` by then. 5. forcing evaluation We can be a bit stricter, to flush out any exceptions a bit earlier, by using a suitably placed $! > try $ return $! length ("hello " ++ show u) :: IO (Either SomeException Int) (highlight that $! in green/blue) Left Prelude.undefined Animation sequence: show the red thunk being evaluated to bottom, and then the return being replaced by bottom, leaving an IO green cloud with just bottom inside it. (and explain that we're now going to try to run a bottom IO action, which is different from what we had before). > let v = () So now we can tell the difference between u (undefined) and v (()) if we're inside the IO monad: We can Left (something) for one and Right (something) for the other. Reshow previous diagram to say this is what we can do now: LHS = { (), bottom } , RHS = {Right 8, Left _} and draw lines from left to right to show we have that map. annotate diagram with IO in a big cloud roudn the outside to show this only works inside IO. 6. There's more than one bottom. `undefined` gives us bottom. But there are other "functions" (in quotes) like `undefined`. > :t error error :: [Char] -> a error "foo" error "bar" both are "bottom" values. but in IO, we can tell them apart with this technique! > let u' = error "foo" > let u'' = error "bar" Demonstrate same long line from before, using u' and u''. Extend diagram to be { (), error "foo", error "bar" } and {Right 8, Left "foo", Left "bar") Now cross out the () bit and say we can forget about () and just concentrate on these bottom values... we can "smuggle" any `String` into what looks like a value of type (), and *as long as we are in IO* we can recover it. 6.5 Introduce Dynamic as a type with injection and projection, giving a diagram as mentioned below of Int and Bool into Dyn and then from Dyn to Maybe Int and Maybe Bool Need to show how Dyn 7 goes to Just 7 for the first and Nothing for the 2nd. Don't need to show full connectivity here - just the path of a single value from 7 :: Int to Just 7 :: Maybe Int 7. Extensible exceptions But there are *even more* bottom values. Extensible exceptions mean that anything that implements the Exception (?) typeclass is an exception, not just Strings smuggled by `error`. In particular, the Dynamic type implements exceptions. So we can inject any value into a Dynamic, throw that instead of calling `error`. Diagram should now show: Two or three types (Int, Bool, for example) as separate sets on LHS. Next, a single Dyn set containing {Dyn 1, Dyn 2, Dyn 3, ..., Dyn True, Dyn False} and arrows from the LHS type sets into this single set. and a big bottom arrow labelled "toDyn injection" and then another set, (), containing {(), and then `bottom` Dyn 1, `bottom` Dyn 2, `bottom` Dyn 3, `bottom` Dyn True, `bottom` Dyn False} and arrows linking from Dyn into that set, with the big bottom label being `throw` expliclty note that this is still the unit type with its single proper value () here at the top, slowly being pushed into obscurity. next arrow is "try . (pure $!)" which leads us to Either Dyn () all still inside a big green squiggle cloud round everything labelled IO. and then outside of that big green squiggle, we can have a pure function `(either (fromDynamic) (const Nothing))` leading to Maybe Int: draw lines for: 7 (via Dyn 7 to Just 7) True (via Dyn True to Left (Dyn True) to Nothing) and () (via () to Right () to Nothing) 8. and finally we can remove this big green IO cloud squiggle using a "function" with type signature like this, that can project away the IO-ness: f :: IO a -> a which is of course (next slide) unsafePerformIO :: IO a -> a which is safe in the sense that the IO computation you're passing in will always return the same value; but unsafe in the sense that it lets you detect bottom. (a new variant of unsafety I hadn't though of before). 9. Conclusion of acme-smuggler Can I fit the full width encoding and decoding diagram on screen? (as used, perhaps, near the end of the previous section?) "et voila" here's our input values ... here we've squished everything into () in the middle with only a single value and never mind all these bottoms... and out the other side to our Maybe input set again. (relate that to the bigger type signatures inj -> () and prj () -> by indicating which chunk of the sequence is inj and which chunk is prj) if I wanted to, could (I think?) remove () and go to the Void type with no values at all (except all the bottoms) 10. Relation to reality Lazy values leaking exceptions out of try: discovered in a jpeg decoding bug (URL) with thomas dietert - was infinite-looping, so we put a timeout in, but the timeout wasn't firing... exactly (I think) that strictness forcing needed... so a usecase for strictness that isn't just for space purposes: we have this effect, exceptions / bottom, built into the language (rather than declared as for example a monad) which we're trying to manage. spj comments on how you can't have effects in lazy functional stuff because when is that effect going to happen? and this is that happening! (in a video eg the one he did at Churchill college?) (any others?) (I looked at errors/exceptions a lot at BD but does it relate to that?) OTHER NOTES: pure / unsafePerform form something like a prj . inj relationship: pure takes a into a bigger set 'IO a', and then unsafePerformIO takes one of those elements back to the a... there are other elements in 'IO a' - it is "bigger" - for example, some thing that reads from the console. And unsafePerformIO doesn't project those out to Nothing - instead "weird" stuff happens (i.e. IO...) a "function" has a set of "possible input values" and "potential output values" -- https://www.youtube.com/watch?v=6t6bsWVOIzs 7m30s previous to thinking about this i'd have justified a use of unsafePerformIO by saying "the action always does the same thing so it is safe". (eg reads the same file with the same content) - if we can externally prove/believe enough "sameness" then that is sufficient to be allowed to use unsafePerformIO. Thinking about what is in this talk has convinced me that isn't quite enough. is there a more interesting unsafety example that arises from someone using acme-smuggler? (or similar examples?) the practical experience of this was timeout problem with an infinite loop in JuicyPixels. but that didn't use acme-smuggler. but maybe that's the "practical" side of this rather than the practical side being "here's how to ruin your program with acme-smuggler"? in haskell (because it's a lazy language) we blur the distinction between a "program that computes a value x" and "a value x". maybe if we were being really strict in our typing, nothing in haskell would be a -> b, but would always be a -> Exc b (or some similar effect). in which case, we wouldn't need to write the Exc. and so we're back where we started... so imagine a ghost Exc effect in every function. except we have no >>= to sequence evaluation. which is how we can get away with playing with evaluation order here. although Exc doesn't capture "non-termination", a different form of bottom. show how a naive expression can't "catch" an exception happening deep inside. Prelude> 1 + 2 + error "three" + 4 *** Exception: three note difference between * and # kinds. Kind # (like Int#) probably can't be used in acme-smuggler - they have no bottom to smuggle things in! ** can introduce bottom with some haskell code: a partial function: fromJust :: Maybe a -> a fromJust (Just x) = x "what is the value of 'fromJust Nothing'" ? It "doesn't have a value". If we really want to give it something that looks like a value, we can write _|_ bottom. An explicit way to get bottom is to evaluate 'undefined'. But wait, this 'undefined' looks different in GHCi to 'fromJust Nothing'. and so now we can have different "bottoms", at least as far as a human looking at the GHCi prompt... Can we tell them apart? Yes, in IO... some code using try to extract a string (from error "hi" and error "bye" perhaps? or using 'show' to extract the error message from 'undefined' and 'fromJust Nothing'). So now we can distinguish (in IO) between these two bottoms... so now we can project Bool into any Haskell (boxed?) value. ** discussion of unsafePerformIO: how this seems an "unsafeness" that I wasn't expecting: I was somehow expecting that unsafeIO was unsafe because the value might change on evaluations - that is, there isn't referential transparency. (example, get-current-time) - or because the IO effects might have some observably-bad-if-repeated effect (launchMissiles) But that isn't the case here: if I smuggle a value into 'x', then 'discover x' is referentially transparent. And there is no effect on the world. This really does look like a regular projection/injection function - except for messing with the type system. So the unsafeness is "observing bottom" ? ** how did I find this? bug in graphics decoder library. was looping "forever". "timeout" function (inside unsafePerformIO?) similar kind of unsafeness - here we can observe computation time.