.. Parsl wide event observability prototype report documentation master file, created by
   sphinx-quickstart on Sun Oct 26 10:13:01 2025. You can adapt this file completely to
   your liking, but it should at least contain the root `toctree` directive.

Wide event observability prototype report
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

Introduction
============

.. index:: observability

These are notes about my current iteration of a Parsl and Academy *observability*
prototype. It is intended to help with plugin style integration between those two
components and an open collection of friends including Globus Compute, Diaspora and
Chronolog.

As an abstract concept: "Observability is a measure of how well internal states of a
system can be inferred from knowledge of its external outputs."
(https://en.wikipedia.org/wiki/Observability).
In the context of this project, it means outputting enough information about the system
to understand why bugs happen.

There is plenty to read about observability on the web: google around for more info.

Observability is often neglected as part of the core functionality of a research
prototype, as demonstrations are run in controlled environments with the original
authors both ready to respond to the slightest problem with copious time, and ready to
restart everything from scratch repeatedly until the desired outcome is achieved.

As soon as that prototype is forced into production, those two properties evaporate and
the need for observability manifests: both users and legacy developers need to
understand what is happening in this suddenly wider and more hostile world.

In the Parsl world, observability exists now as two separate systems: log files and
Parsl Monitoring. This report will explore ways in which they can be usefully unified
and extended.

.. index::
   single: DESC
   single: CZI
   single: NSF

This project builds on experiences debugging Parsl within the DESC project, as well as
work sponsored by NSF and CZI understanding pluggability and code maturity as they
affect architectural decisions.
As a concurrent activity, I have used some of this experience to push changes into the
Academy codebase to support its move towards production.

A distant vision is a project-wide or personal-space-wide observability system -- but
it is important to acknowledge that this is a distant and vague vision, and that what I
actually want to happen is work on the scale of weeks to months that is usable on that
timescale, with others left to take up that distant vision if desired.

This report attempts to describe abstract concepts but ground them in practice and in
concrete code-driven examples. It also tries to give open questions and opportunities
that might be interesting for other people to work on.

How to try this out? Because I want you to try this out. It's in my Parsl
``benc-observability`` branch. I will try to label use cases as expected to work or
not, and in what context. There will also be some Academy-related stuff in that
branch, with the intention that it moves elsewhere as it is productionised.

What exists in Parsl now?
-------------------------

Parsl has two observability approaches: file-based logging and Parsl Monitoring.

File-based logging is very loosely structured. Log lines are intended for direct human
consumption, with minimal automated processing: for example, "grepping the logs".
Within Parsl there is a variety of log formats, usually depending on the component
which generated the log. Logs are directed to a filesystem accessible by the
particular component, which in practice makes them especially awkward to work with
without a shared file system.
It is easy to add a new log line, by writing what is effectively a glorified ``print``
statement.

.. index:: Parsl monitoring; database manager

Parsl Monitoring generates monitoring events that are not intended to be seen by
humans. Instead they are conveyed to a Monitoring Database Manager which munges them
into a relational schema. This schema is then typically accessed by users via further
processing for pre-prepared visualization, or via ad-hoc queries designed by
data-science-aware power users. These monitoring events can be conveyed by a pluggable
interface, for example over the network, and in contrast to the logging approach
above, distributed filesystem access (in the broadest sense) is not required. The
strict SQL schema makes the data model extremely hard to extend ad-hoc.

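As a sketch of the contrast (illustrative shapes only - neither the log line nor the
table layout is copied from real Parsl output or the real monitoring schema), the same
underlying fact might appear once as a human-readable line and once as a relational
row::

    # file-based logging: a human-readable line, effectively a glorified print
    2025-10-26 10:13:01 parsl.dataflow.dflow INFO: Task 7 launched

    # Parsl Monitoring: a similar fact munged into a relational schema
    INSERT INTO status (task_id, task_status_name, timestamp)
    VALUES (7, 'launched', '2025-10-26 10:13:01');
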
Parsl Monitoring was also implemented with a fixed queries / dashboard mindset: one
set of views that is expected to be sufficient. As time has shown, people like to make
other outputs from this data.

This report builds on both of these approaches. I'll talk about more details in later
sections.

Diagram
-------

of the components/flow.

to distinguish the pieces of my work, and also to distinguish the pieces of what might
be substituted where.

specific emphasis that these are common techniques, not a single implementation or
protocol standard or single anything.

::

    Python logger API ----> JSON structured logs      \
                                                       |--> log movement       --> Python-based query model --> graphs/reports
                            non-JSON structured logs  /     to one place       --> post facto schema normalisation
                            (eg. WQ, Parsl monitoring)      classically files, --> data-structure based queries
                                                            but eg ryan/kafka
                                                            logan demo/agent polling

Concept: Universal personal logging
-----------------------------------

Imagine *all* of my academy/globus endpoint/parsl/application runs going into a single
log space. permanently. no matter what the project, location, etc.

here's a logging use case/notion: universal personal log. all my GC endpoints, parsl
runs, academy runs, application submits, go into a single log space that has
everything I am running everywhere in the science cloud, by default - eg. identified
by my globus credential ID. no separation whatsoever. no project distinction, etc.

what does that look like to work with on the query side? what does that look like to
query?

Target audience
---------------

This project is mainly aimed at systems integrators and application builders who are
expecting to perform serious debugging and profiling work at a deep technical level.
It should support other use-cases such as management-friendly dashboards.

These users will often have integrated several research-quality projects: for
example, Academy submitting into Globus Compute.
As systems integrators and application builders, they aren't directly interested in
the borders these individual projects have built around themselves, but want to
understand (for example) where their simulation task is inside the whole ad-hoc
stack. This mirrors the microcosm of Parsl existing as a pile of configurable and
pluggable components, each with their own observability options.

Many of the target audience do not, in my experience, come asking directly for
observability as a feature. Instead they come with questions such as "Parsl is slow -
how can I make it faster?". Without understanding the statement "Parsl is slow"
(which as often as not turns out to be "my application code is slow"), it is hard to
make progress on "how can I make it faster?"

Modularity
----------

This report emphasises modularity as a core tenet, to the extent that a single
product codebase is not particularly an end goal.

Modularity as a requirement for a rich research landscape
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A "rich research landscape" means many components, each with competing priorities for
production-worthiness vs research.
TODO: cite the work done as part of NSF/CZI sustainability grants about recognising
the difference between goals rather than ignoring them.

Expecting a single observability system to provide all needs is unlikely to succeed
in such a varied research-style environment: while some users are tolerant of
appalling code quality in exchange for interesting research results, those same users
require production quality from other components in the same stack; and those
tolerances vary with every use case.

Hourglass model with several waists
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

the hourglass model is intended to provide a small number of plugin/integration
points, in the same way that the Internet Protocol does for applications/application
protocols vs networking technologies (for example, implementing HTTP over the mobile
phone network and telnet over ARPANET is then sufficient integration for telnet over
mobile phone and HTTP over ARPANET to run without more work).

The hourglass waists are:

.. index::
   single: Python; logging
   single: C programming language
   single: Programming languages; C

* python ``logging``
  system: any Python code can send log messages to the built-in ``logging`` system
  and any Python code can register to receive any log messages. This can support
  components that live in the Python ecosystem. That includes enough of the current
  ecosystem to consider specially, but not enough to be universal: for example, when
  running tasks through Parsl's Work Queue executor, a substantial piece of execution
  happens in code written in the C programming language.

* JSON records: a second point of modularity is representing observability
  information as JSON objects. This is a flexible data format which complements the
  Python code approach of the previous waist. Often observability information which
  cannot flow through the Python ``logging`` API can flow as JSON records. A sketch
  of how the two waists compose follows this list.

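As a minimal sketch of how the two waists compose (my own illustration, not code from
the prototype branch; the choice of fields is an assumption), any component logs
through the stdlib ``logging`` API and a handler flattens each LogRecord into one
JSON object per line::

    import json
    import logging

    # attribute names present on every LogRecord by default; anything else was
    # attached at a log site via extra= and belongs in the wide event
    _DEFAULT_KEYS = set(logging.makeLogRecord({}).__dict__)

    class JSONLineHandler(logging.Handler):
        """Emit each LogRecord as a flat JSON object on one line."""

        def emit(self, record: logging.LogRecord) -> None:
            event = {
                "created": record.created,      # unix timestamp
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
            }
            for key, value in record.__dict__.items():
                if key not in _DEFAULT_KEYS:
                    event[key] = value
            print(json.dumps(event, default=str))

    root = logging.getLogger()
    root.addHandler(JSONLineHandler())
    root.setLevel(logging.INFO)
    logging.getLogger("demo").info("task state change", extra={"task_id": 7})
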
High level structure of this project
------------------------------------

This report breaks observability into four rough parts:

* A data model of wide event records: :ref:`datamodel`

* Creating wide records: :ref:`creating`

* Moving those event records around: :ref:`moving`

* Analysing those records: :ref:`analysing`

.. _datamodel:

The data model
==============

Introduction to wide events
---------------------------

as JSON objects, as Python LogRecords, as roughly isomorphic structures.

wide in the style of denormalised data warehouses, rather than heavily normalised
like the more traditional relational model.

they should be wide and flat: do not create elaborate object graphs. key/value, with
values being simple.

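For example, a wide, flat event for a task state change might look like this (every
field name here is invented for illustration): everything known at the log site is
attached as simple key/value pairs, rather than referenced through an object graph::

    {
      "created": 1761473581.342,
      "message": "task state change",
      "parsl_run_id": "5f0c6a1e-58d4-4f55-9e1a-3c2b7d9e0a11",
      "parsl_task_id": 7,
      "parsl_try_id": 0,
      "task_state": "launched",
      "hostname": "login01",
      "pid": 41172
    }
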
the data is often ad-hoc: people are writing code to run tasks, not building data
models to represent the observable state of their tasks. so don't bake that into the
system too much, and expect to be flexible.

What exists now: Parsl python logs vs Parsl monitoring records
--------------------------------------------------------------

Especially for this chapter: how both of those can be embedded as wide events.

Parsl monitoring records are equally valid examples of existing structured records,
alongside logging's differently structured records, and of equal value.

The original Parsl monitoring prototype was focused on what is happening with Parsl
user-level concepts: tasks and blocks, for example, as they move through simple
states. Anything deeper is part of the idea of "Parsl makes it so you don't have to
think about anything happening inside". Which is not how things are in reality:
neither for code reliability nor for performance.

often want to debug/profile what's happening *inside parsl* rather than *inside the
user workflow* - and the distinction between the two is often unclear.

.. _partialdata:

Optional and missing data in observability
------------------------------------------

log levels - INFO vs DEBUG.

missing log files - eg. start with parsl.log, add in more files for more detail.

Security - not so much the Parsl core use case, but eg. GC executor vs GC endpoint
logs vs GC central services have different security properties. Or in Academy, the
hosted HTTP exchange.

The observability approach needs to accommodate that, for any/all reasons, some
events won't be there. There can't be a "complete set of events" to complain about
being incomplete.

With less data, the reports, in whatever form, are less informative, to the extent
that the lack of data makes them so.

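On the query side, that means treating every field and every event as optional. A
small sketch (assuming events are Python dicts, with invented field names): pair up
start and end events per task, and silently skip tasks where either half is missing,
rather than treating the gap as an error::

    def durations_by_task(events):
        """Compute per-task durations from whichever events happen to exist."""
        starts, ends = {}, {}
        for e in events:
            task = e.get("parsl_task_id")   # this field may be absent entirely
            if task is None:
                continue
            if e.get("event") == "task_start":
                starts[task] = e["created"]
            elif e.get("event") == "task_end":
                ends[task] = e["created"]
        # only tasks observed at both ends produce a duration
        return {t: ends[t] - starts[t] for t in starts.keys() & ends.keys()}
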
This optionality aligns with different components adding their own logs, if they
happen to be there.

Adding (or removing) a log field is a lightweight operation.

Data types
----------

data types don't matter much for human observation. but for machine processing they
do. so this section has some relevance when thinking about the Analysis section
later.

Both the JSON and Python representations can support a range of types. But richer
data types can exist. What's interesting for analysis is primarily relations like
equality and ordering, which are implicitly not string-like in a lot of cases. For
example:

  * string vs int: '0' vs 0 as a Parsl task ID. Even Parsl source code is not
    entirely clear on when a task ID is a string and when it is an int, I think.
    Normalisation example: "03" vs "3". Rundir IDs are usually at least 3 digits
    long. Is rundir "000" the same as rundir 0?

  * UUID as a 128-bit number, vs UUID as a case-sensitive/padding-sensitive
    ASCII/UTF-8 string.
    uuids should be used *more* in this work - they were invented for the purpose of
    this kind of distributed identification:
    https://en.wikipedia.org/wiki/Universally_unique_identifier#History
    https://www.rfc-editor.org/rfc/rfc9562.html
    Normalisation example of string form: case differences.

  * ordinal relations of text-named log levels (WARN, WARNING, INFO, ERROR, ...) in
    various enumerations (although for querying, an overarching schema is maybe
    possible for read-only ordering use)

What's the right canonicalisation attitude here? open question for me.

Cannot expect emitters to conform to some defined canonical form.

perhaps I should use rewrite_by_lambda(logs, keyname, lambda) to change known fields
into a suitable object representation: int for some task IDs, UUIDs, log levels?
then the type system deals with it? some of those could happen in the importers,
some in the query, as it's modular. I already do a rewrite to shift the created time
down to 0 base, in one of my plots. So the notion is there already.

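A sketch of that hypothetical helper (``rewrite_by_lambda`` and the field names here
are assumptions, not code from the branch)::

    import uuid

    def rewrite_by_lambda(events, keyname, fn):
        """Yield events with events[keyname] rewritten by fn, where present."""
        for e in events:
            if keyname in e:
                e = {**e, keyname: fn(e[keyname])}
            yield e

    events = [{"parsl_task_id": "03"},
              {"run_uuid": "5F0C6A1E-58D4-4F55-9E1A-3C2B7D9E0A11"}]

    # normalise string task IDs ("03" vs "3") to ints, and string UUIDs (case and
    # format variants) to uuid.UUID objects, so that equality and ordering behave
    # as intended rather than string-wise:
    events = rewrite_by_lambda(events, "parsl_task_id", int)
    events = list(rewrite_by_lambda(events, "run_uuid", uuid.UUID))
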
Distributed state machines - parsl issue #4021
----------------------------------------------

"distributed" state machines are hard. see parsl-visualize bug #4021 for an example.

don't try to make the data model force this. the events happen when they happen.
handle it on the query/processing side: you make whatever sense of it that you can.
it's not the job of the recording side to force a model that isn't true.

for example, in the context of #4021, we might want to project some external opinion
that running should "override" launched for a task, that isn't reflected in the
emitting code/emitting event model at all, based on an artificial "force to single
thread" concept of task execution. in the same vein, the #4021 suspect tasks have
*negative* time in launched state. which sounds very weird for a non-distributed
state machine model.

or we might only want to visualize the running/end-of-running times and forget
overlaying any other state model onto things: the running and running_ended times
should at least be consistent wrt each other, as they happen to come from a
single-threaded bit of code.

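A query-side sketch of "making whatever sense of it that you can" (event and field
names are assumed): report the interval between two observed states without asserting
any legal ordering, so a negative interval is reported as-is rather than rejected::

    def time_in_state(events, task_id, state, next_state):
        """Interval between two observed states of one task, if both were
        observed. May legitimately be negative: the emitting side is
        distributed, and nothing forces these timestamps to be ordered."""
        times = {e["state"]: e["created"]
                 for e in events
                 if e.get("parsl_task_id") == task_id and "state" in e}
        if state in times and next_state in times:
            return times[next_state] - times[state]
        return None   # partial data: one or both events never arrived
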
see also that the parsl TaskRecord never records a state of running or running_ended,
despite those being valid states. this already only exists elsewhere in the system as
a reconstructed state machine - not a real single-threaded/non-distributed state
machine. so parsl monitoring is already a demonstration of the violation of this
model.

commercial observability vendors
--------------------------------

.. index::
   single: Cloudwatch
   single: Honeycomb

honeycomb, or built into AWS as cloudwatch.

more ad-hoc construction, less buy-in from components, rather than all working
together to build a single platform, which is often how the commercial observability
use cases are described.

.. index:: OpenTelemetry

OpenTelemetry as a standard. How does this relate to that standard?

TODO: maybe opentelemetry is better in :ref:`moving`

The argument for templating log messages
----------------------------------------

previous argument: avoids string interpolation if the message will be discarded.

new argument: we can use the template to find "the same" log message even when its
interpolations vary.

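With stdlib ``logging``, %-style calls already keep the template separate from its
interpolations: ``record.msg`` holds the uninterpolated template, so it can be used
to group "the same" message. A minimal sketch::

    import logging
    from collections import Counter

    class TemplateCounter(logging.Handler):
        """Count occurrences of each log template, ignoring interpolations."""
        def __init__(self):
            super().__init__()
            self.counts = Counter()

        def emit(self, record):
            self.counts[record.msg] += 1   # the template, not the rendered text

    counter = TemplateCounter()
    logger = logging.getLogger("demo")
    logger.addHandler(counter)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    for task, state in [(1, "launched"), (2, "launched"), (1, "running")]:
        logger.info("Task %s changed state to %s", task, state)

    # all three calls share one template:
    assert counter.counts["Task %s changed state to %s"] == 3
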
advanced level question: when a task changes state, should the template be
interpolated with the state names or not? because in my example query, it is relevant
to see those changes *not* templated away.

Objects and spans
-----------------

it is a "thing we want to talk about", as a very weak notion.

a weak notion of "object", but it does exist: for example, logs that are about a
particular Parsl task, or a particular HTEX worker, or a particular batch job.

multi-attribute keys - sometimes hierarchical, but that isn't required.

eg. contrasting parsl task IDs vs parsl checkpoint IDs: in the parsl checkpoint
world, tasks are identified by their hashsum. there might be many tasks that run to
compute that result. when doing cross-DFK checkpointing, the cross-DFK sort-of-task
ID is that hash sum, in the sense of correlating tasks that are elided due to
memoization with their original exec_done task in another run. (so hashsum is one
form of task ID, parsl_dfk/parsl_task_id is another - both legitimate but both
different)

cross-ref the "span" concept from other places in broader Observability.

.. index:: Entity component system

Compare with https://en.wikipedia.org/wiki/Entity_component_system which has
identified entities, where the entity only has identity and no other substance,
along with components as "characterising an entity as having a particular aspect"
and the "system" which deals with all the entities having particular components.
This fits the partial data model fairly well, I think: the notion of identifying
entities, without those entities having any further structure; and then the data you
might expect to find about certain entities being orthogonal to that.

Entity keys
-----------

This is probably the fundamental problem of JOIN here, compared to traditional
observability which passes request IDs around up front.

In a traditional distributed object model system, you would use something like UUIDs
everywhere. However, this observability work is not observing a traditional
distributed object model system.

note that in parsl some IDs are deliberately not known across the system at runtime,
because it would be expensive to correlate them in realtime, and that is not
necessary for the executing-tasks part of Parsl, even though it's necessary for the
understanding-how-that-task-was-executed part.

Other components
----------------

rewrite this to not be parsl-centric, but instead talk about integrating "objects"
from different components even though those components are not strongly aware of
each other.
wq vs parsl task id is a nice example: regularly used, but the log files are
different formats, the identifier space is different, the cardinality of tasks is
different: one parsl task != one wq task.

Some components aren't Parsl-aware: for example, Work Queue has no notion of a Parsl
task ID. and it runs its own logging system, which is not Python, and so not
amenable to Python monitoring radios. a sketch of correlating the two identifier
spaces follows below.

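A sketch of what that correlation could look like on the query side (all event shapes
and field names are assumptions): the key ingredient is a binding event, emitted at
submission time by the one component that knows both identifiers::

    def correlate(parsl_events, wq_events, bindings):
        """Group Parsl-side and Work Queue-side events by Parsl try.

        Each binding is an event like
        {"parsl_try": ("run1", 7, 0), "wq_task_id": 131}
        emitted by the component that knows both identifiers.
        """
        wq_to_try = {b["wq_task_id"]: b["parsl_try"] for b in bindings}
        by_try = {}
        for e in parsl_events:
            by_try.setdefault(e["parsl_try"], []).append(e)
        for e in wq_events:
            t = wq_to_try.get(e.get("wq_task_id"))
            if t is not None:       # unknown WQ tasks are partial data: skip
                by_try.setdefault(t, []).append(e)
        return by_try
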
.. index:: ZMQ

ZMQ generates log messages which have been useful sometimes, and these could be
gatewayed into observability.

inherently chaotic research prototypes can benefit from observability - as part of
building and debugging them, rather than as a post-completion 2nd-generation
feature - but that is impeded by requiring a strict SQL-like data model to exist,
when the research prototype is not ready for that. (see the attitude that monitoring
is something aimed at "users" later on, not something that is aimed at "developers"
understanding the behaviour of what they have created)

"user" applications adding their own events, and expecting those events to be
correlatable with everything else that is happening, is part of the model: just as
we might expect Globus Compute endpoint logs to be correlatable with parsl htex
logs, even though Globus Compute is a "mere" user of Parsl, not a "real part" of
Parsl.

it should be easy to add other events - the core observability model shouldn't be
prescriptive about what events exist or what they look like, even though someone
needs to know what their structure is.

to that end, there is no core schema, either formal or informal.

observability records do not even need to have a timestamp, in the sense of a log
message. for example, see relations imported from a relational database into
observability records, in the parsl monitoring import (crossref use case about
plotting from the monitoring database).

Parsl contains two "mini-workflow-systems" on top of core Parsl: parsl-perf and
pytest tests. It could be interesting to illustrate how those fit in without being a
core part of Parsl observability.

Parsl monitoring visualisation and Parsl logging are both completely unaware of the
application-level structure of the mini-workflows run by parsl-perf and pytest,
beyond what is expressed to the DFK as DAG fragments: there's nothing to separate
out parsl-perf iterations, or pytest tests.

In the context of pytest, see: :ref:`pytest-observes-logs`

.. index:: colmena

Colmena

.. _creating:

Generating wide records
=======================

What exists now
---------------

Parsl
~~~~~

Parsl generates a lot of observability-style events, but spread across many
different formats.

Parsl logs - not well structured: for example, overlapping DFKs are not well
represented, and actions to do with different tasks can be interleaved without being
clearly separated/identified.
Parsl logs - not well structured: for example, overlapping DFKs are not well represented, and actions to do with different tasks can be interleaved without being clearly separated/identified. Globus Compute deliberately makes them even less structured, by jumbling the file-based logs of multiple runs into one directory.

Parsl monitoring - well structured but very hard to modify. It is easy to query for questions that it can answer, and hard to use for anything more. Users are generally interested in using it when they discover it, but it suffers from a history of being built as student demo projects dropped into production. An example of a question it cannot answer: what is the ``parsl_resource_specification`` for a task?

.. index:: Work Queue

Work Queue logs - well structured, but ignored by Parsl monitoring and hard to correlate with monitoring.db: Work Queue uses work queue task IDs, but Parsl monitoring uses Parsl task and try IDs. Correlating those is a motivating use case for this observability project. [TODO: make that correlation an explicit use-case section, explaining what needs to change and how it is done manually now]

Academy
~~~~~~~

As a more in-development project, Academy is much better placed to make observability records from the start, as a first-order production feature.

New Python Code for log generation
----------------------------------

Acknowledging observability as a first-order feature means we can make big changes to code.
Every log message needs to be visited to add context. In many places much of that context can be added by helpers: for example, in my prototype, some module-level loggers are replaced by object-level loggers. There is a per-task logger (actually a LoggerAdapter) in the TaskRecord, and logging to that automatically adds on relevant DFK and task metadata: at most log sites, the change to add that metadata is to switch from invoking methods on the module-level logger object to invoking them on the new task-level logger instead.
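To make the LoggerAdapter idea concrete, here is a minimal sketch - the function name and the metadata keys are illustrative, not the prototype's actual API:

.. code-block:: python3

    import logging

    logger = logging.getLogger(__name__)

    def make_task_logger(dfk_id: str, task_id: int) -> logging.LoggerAdapter:
        # Wrap the module-level logger so that every record logged through
        # the adapter automatically carries DFK and task identifiers.
        return logging.LoggerAdapter(
            logger, {"parsl_dfk": dfk_id, "parsl_task_id": task_id})

    # At a log site, the only change is which logger object is invoked:
    task_logger = make_task_logger("some-dfk-uuid", 72)
    task_logger.info("Task launched")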
Some log lines bracket an operation, and to help with that, my prototype introduces a LexicalSpan context manager which can be used as part of a `with` block to identify the span of work starting and ending.

Move away from forming ad-hoc string templates and make log calls look more machine-readable. This is somewhat stylistic: with the task ID automatically logged, there is no need to substitute the task ID into some arbitrary subset of task-related logs.

TODO: describe the academy style that I tried out in PR #NNN:

.. code-block::

    extra=myobj.log_extra() | { "some": "more" }

Parsl config hook for arbitrary log initialization - actually it can do "anything" at process init, and maybe that's interesting from a different perspective (because it's a callback/plugin), but from the perspective of this report I don't care about non-log uses.

Be aware that there are non-Python bits of code generating various logs. Work Queue (still structured) logs are one example. The output from batch command submit scripts is another, less structured one, which looks much more like a traditional chaotic output file.

Python API on logging side
~~~~~~~~~~~~~~~~~~~~~~~~~~

Use the Python-provided logger interface, with the Python-provided ``extra`` API.

A per-class "log extras" method generates an ``extra`` dict about an object. That pushes (like ``repr``) towards it being the responsibility of the object to describe itself, rather than being someone else's responsibility.
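A minimal sketch of that convention, with a hypothetical class - only the ``log_extra`` name and the ``extra`` merge style come from this report:

.. code-block:: python3

    import logging

    logger = logging.getLogger(__name__)

    class BlockRecord:
        # Hypothetical object for illustration.
        def __init__(self, executor_label: str, block_id: str):
            self.executor_label = executor_label
            self.block_id = block_id

        def log_extra(self) -> dict:
            # Like __repr__, the object describes itself, but as
            # structured key/value pairs for the logging ``extra`` API.
            return {"executor": self.executor_label,
                    "block_id": self.block_id}

    block = BlockRecord("htex_local", "0")
    logger.info("Scaling out", extra=block.log_extra() | {"some": "more"})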
anonymous/temporary identified python objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Python objects don't have a global-over-time ID. id() exists, but it is reused over time, so it is awkward to use across a whole series of logs. So some objects should get a UUID "just" for observability - UUIDs were invented for this.

Likewise, for example, gc endpoints don't have a DFK ID, and endpoint id/executor/block 0 isn't a global-over-time ID: there's a new block 0 at every restart? Or is there a unique UEP ID each time that is enough? I don't think so, because I see overlapping block-0 entries.

repr-of-ID-object might not be the correct format for logging: I want stuff that is nice strings for values, but repr (although it is a string) is designed more to look like a Python code fragment than like the core value of an object. Maybe ``str`` is better, and maybe some other way of representing the ID is better? The point is to have values that work well in aggregate, database-style analysis, not values that are easy on the human eye.

Contributed: Modifying academy to generate wide events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

summarise the PRs I merged already

cross-ref the event graph in the analysis section as something enabled by this

Translating non-wide-event sources
----------------------------------

Part of this modularity work is that some modules produce event-like information that looks superficially very different, but that can be understood through the lens of structured event records.

Using Parsl monitoring events as wide logs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

two approaches:

monitoring.json: abandons the SQL database component of conventional Parsl monitoring and instead writes each monitoring message out to a json file, giving an event stream.
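The resulting file is easy to consume with nothing more than the standard library - a minimal sketch, with the path under the run directory assumed:

.. code-block:: python3

    import json

    # One JSON object per line: parse the stream back into a list of
    # event dicts for ad-hoc analysis.
    with open("runinfo/000/monitoring.json") as f:
        events = [json.loads(line) for line in f]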
replay-monitoring.db: turns a monitoring.db file into events. The status, resource and block tables already look like an event stream. This gives an easy way to take existing runs and turn them into event streams without needing to opt in to any of the other JSON logging, or changing anything at all at runtime: anything new is entirely post-facto, which fits the general concept of doing things post-facto in parsl observability.

The infrastructure for this already exists, which means that the query side of this project can be used without modification of the execution-side Parsl environment.

see earlier use case on priority visualization

Using Work Queue ``transaction_log`` as a wide log source
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index:: Work Queue

This is a core part of seeing beyond the pure-Parsl code. It's well structured but not JSON. Translation into JSON is mostly syntactic, and can be done line-by-line, aka streaming - see the sketch at the end of this section.

TODO: example log line

Note that work queue task IDs are not Parsl task IDs: data from the monitoring database cannot be correlated with data from the work queue transaction log! (without further help from the parsl JSON log files...)
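As a sketch of the line-by-line, streaming shape of such a translator - the real ``transaction_log`` column layout is not reproduced here, so the field names below are placeholders:

.. code-block:: python3

    import json
    import sys

    # Read a transaction_log-style file on stdin and emit one JSON
    # object per line on stdout.
    for line in sys.stdin:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment/header lines
        fields = line.split()
        print(json.dumps({"wq_time": fields[0],
                          "wq_event": fields[1],
                          "wq_args": fields[2:]}))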
Adventure: adding observability to a prototype: idris2interchange
-----------------------------------------------------------------

.. index::
   idris2
   High Throughput Executor; interchange
   Adventure; Adding observability to the idris2 interchange
   Programming languages; idris2

Another example: swap out the interchange implementation for a different one with a different internal model: a schema of events for task progress through the original interchange doesn't necessarily work for some other implementation.

idris2interchange - I want to debug stuff, not be told by the observability system "HAHA we don't support your prototyping". In some sense that's exactly the time I *need* the observability system to be helping me, not later on when it all works.

The idris2interchange project is not aimed at producing production code. *ever*. In that sense it is very similar to some student projects that interact with parsl.

mini-journal: what did I have to do to support idris2 logging?

* make log records JSON format instead of textual - the prior format was timestamp / string. There's a json library, but to start with these records are so simple I'll template them in.

* there was also already a simple log-of-value mechanism in there, which readily translates to logging a template, a full message, and the value as separate fields.

Now there are json records going to the console. I don't trust the string escaping, but I'll deal with that ad-hoc. But also: this needs to go to a file, and if I want it to interact with other log files, I need some common keys. htex_task_id is the obvious one there for task correlation. Manager ID is another.

To go to a file: lazy redirect of stdout to idris2interchange.log. This could be done more seriously, to avoid random prints going to the file, but this is a prototype so I don't care.

.. index:: tool; jq
Various iterations of }(hjh&hh'Nh)Nubj )}(h`jq`h]hjq}(hj4h&hh'Nh)Nubah}(h]h]h]h]h!]uh%j hjubh% vs formatting fixes to work towards }(hjh&hh'Nh)Nubj )}(h`jq`h]hjq}(hjFh&hh'Nh)Nubah}(h]h]h]h]h!]uh%j hjubh believing this is valid.}(hjh&hh'Nh)Nubeh}(h]jah]h]h]h!]uh%hh'h(h)Mhj]h&hh}h}jjsubh )}(h@code: parse error: Invalid numeric literal at line 1, column 11h]h@code: parse error: Invalid numeric literal at line 1, column 11}hj`sbah}(h]h]h]h]h!]h#h$uh%h hj]h&hh'h(h)Mubh)}(hX:That log escaping, which i implemented pretty quickly, seems to make logging extremely slow - especially outputting the pickle stack which is actually quite a big representation when it has a manager registration with all my installed python packages in there. but hey thats what log levels/log optionality is for.h]hX:That log escaping, which i implemented pretty quickly, seems to make logging extremely slow - especially outputting the pickle stack which is actually quite a big representation when it has a manager registration with all my installed python packages in there. but hey thats what log levels/log optionality is for.}(hjnh&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Mhj]h&hubh)}(hXILet's do some scripting to figure out which of these lines is so expensive - based on line length. one line is 49kb long! (its repeating the full pickled task state rather than a task id!). and similar with manager IDs. but this is probably the sort of changes I'll be needing to make to tie stuff in with other log files anyway.h]hXMLet’s do some scripting to figure out which of these lines is so expensive - based on line length. one line is 49kb long! (its repeating the full pickled task state rather than a task id!). and similar with manager IDs. but this is probably the sort of changes I’ll be needing to make to tie stuff in with other log files anyway.}(hj|h&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Mhj]h&hubh )}(hcode: with open("pytest-parsl/parsltest-current/runinfo/000/htex_local/idris2interchange.jsonlog", "r") as f: ls = f.readlines() ls.sort(key=lambda l: len(l)) print(ls[-1]) print(f"with size {len(ls[-1])} chars")h]hcode: with open("pytest-parsl/parsltest-current/runinfo/000/htex_local/idris2interchange.jsonlog", "r") as f: ls = f.readlines() ls.sort(key=lambda l: len(l)) print(ls[-1]) print(f"with size {len(ls[-1])} chars")}hjsbah}(h]h]h]h]h!]h#h$uh%h hj]h&hh'h(h)Mubh)}(hThis log volume has been a problem for me elsewhere, even without structured logging, filling up eg my root filesystem with docker stdout logs.h]hThis log volume has been a problem for me elsewhere, even without structured logging, filling up eg my root filesystem with docker stdout logs.}(hjh&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Mhj]h&hubh)}(h Now back to ``jq`` validation...h](h Now back to }(hjh&hh'Nh)Nubjn)}(h``jq``h]hjq}(hjh&hh'Nh)Nubah}(h]h]h]h]h!]uh%jmhjubh validation…}(hjh&hh'Nh)Nubeh}(h]h]h]h]h!]uh%hh'h(h)Mhj]h&hubh)}(hif i get that done... look for every logv call and report each one and how many times it logged a value. this is in the direction of logging metrics, without actually being that.h]hif i get that done… look for every logv call and report each one and how many times it logged a value. 
If I get that done... look for every logv call and report each one and how many times it logged a value. This is in the direction of logging metrics, without actually being that.

A pytest run now gives 92000 idris2interchange log lines.

And now jq accepts it all.

So let's see if parsl.observability.load_jsons can load it. It can, without further change.

Logs that have a v:

.. code-block:: python3

    import parsl.observability.getlogs as gl
    logs = gl.load_jsons("pytest-parsl/parsltest-current/runinfo/000/htex_local/idris2interchange.jsonlog")
    vals = [l for l in logs if 'v' in l]

.. code-block:: python3

    >>> vkeys = {l['msg'] for l in vals}
    >>> len(vkeys)
    52
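Counting how many times each of those 52 message templates logged a value is then a one-liner - a sketch, continuing from the ``vals`` list above:

.. code-block:: python3

    from collections import Counter

    # How many times did each logv call site log a value?
    per_template = Counter(l['msg'] for l in vals)
    print(per_template.most_common(3))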
(}(hjh&hh'Nh)Nubjn)}(h```widen_implication```h]h`widen_implication`}(hjh&hh'Nh)Nubah}(h]h]h]h]h!]uh%jmhjubh\ or some functional-dependency related name?). in this case, for the interchange log file, }(hjh&hh'Nh)Nubjn)}(h```submit_pass_id```h]h`submit_pass_id`}(hjh&hh'Nh)Nubah}(h]h]h]h]h!]uh%jmhjubh => }(hjh&hh'Nh)Nubjn)}(h```htex_task_id```h]h`htex_task_id`}(hjh&hh'Nh)Nubah}(h]h]h]h]h!]uh%jmhjubh#, or if doing so at a higher level }(hjh&hh'Nh)Nubjn)}(h1```(dfk,executor,submit_pass_id)=>htex_task_id```h]h-`(dfk,executor,submit_pass_id)=>htex_task_id`}(hj h&hh'Nh)Nubah}(h]h]h]h]h!]uh%jmhjubeh}(h]h]h]h]h!]uh%hh'h(h)Mmhj]h&hubh)}(hTODO: show task1 output before join. then implement join and show task1 output with the rest of the decode span in there - the deserialisation of the task and execution of the matchmaker is shown now.h]hTODO: show task1 output before join. then implement join and show task1 output with the rest of the decode span in there - the deserialisation of the task and execution of the matchmaker is shown now.}(hj4h&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Mqhj]h&hubh)}(h2TODO: add in result handling span in the same way.h]h2TODO: add in result handling span in the same way.}(hjBh&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Mshj]h&hubh)}(hXLwidening submit_pass_id using key implication widening after loading/processing all the logs normally, which is what I'd expect if adding in ad-hoc hack stuff outside of the core parsl log loaders... has revealed some fixpoint related stuff: widening to htex_task_id which is the actual known ID isn't sufficient because the widening of htex_task_id to parsl_task_id already happened. I can widen to parsl_task_id OK because that implication has happened on the two log lines that already have an htex task ID. Is that ok in general? do I need fixpoints in general? something to keep an eye on. I think: as long as there is one record to convey the join as having happened, then a subsequent join can flesh that out. but if the join involves facts that aren't represented incrementally like that, then no. probably I can contrive some examples.h]hXRwidening submit_pass_id using key implication widening after loading/processing all the logs normally, which is what I’d expect if adding in ad-hoc hack stuff outside of the core parsl log loaders… has revealed some fixpoint related stuff: widening to htex_task_id which is the actual known ID isn’t sufficient because the widening of htex_task_id to parsl_task_id already happened. I can widen to parsl_task_id OK because that implication has happened on the two log lines that already have an htex task ID. Is that ok in general? do I need fixpoints in general? something to keep an eye on. I think: as long as there is one record to convey the join as having happened, then a subsequent join can flesh that out. but if the join involves facts that aren’t represented incrementally like that, then no. probably I can contrive some examples.}(hjPh&hh'Nh)Nubah}(h]h]h]h]h!]uh%hh'h(h)Muhj]h&hubeh}(h]?adventure-adding-observability-to-a-prototype-idris2interchangeah]h]Aadventure: adding observability to a prototype: idris2interchangeah]h!]uh%h*hj h&hh'h(h)Mubh+)}(hhh](h0)}(h4Performance measurement of patch stack on 2025-10-27h]h4Performance measurement of patch stack on 2025-10-27}(hjih&hh'Nh)Nubah}(h]h]h]h]h!]uh%h/hjfh&hh'h(h)MyubjL)}(h_pip install -e . && parsl-perf --config parsl/tests/configs/htex_local.py --iterate=1,1,1,10000h]h_pip install -e . 
Performance measurement of patch stack on 2025-10-27
-----------------------------------------------------

.. code-block::

    pip install -e . && parsl-perf --config parsl/tests/configs/htex_local.py --iterate=1,1,1,10000

Running parsl-perf with constant block sizes (to avoid queue-length speed changes):

- master branch (165fdc5bf663ab7fd0d3ea7c2d8d177b02d731c5): 1139 tps

- more-task-tied-logs: 1024 tps

- json-wide-log-records: 537 tps

  - but without initializing the JSONHandler: 1122 tps

- end of branch with all changes up to now: 385 tps

Idea: Parsl resource monitoring on a host-wide basis
-----------------------------------------------------

.. index:: idea; host-wide monitoring

Ignore Parsl Monitoring per-task resource monitoring and do something else that generates similar observability records. There was always some disappointment with getting WQ resource monitoring into the Parsl monitoring database: what exists there that could be imported?
Likewise, host-wide stuff doesn't fit well into the current Parsl Monitoring model, but might fit better into an observability model.

Idea: worker node dmesg
-----------------------

.. index::
   idea; kernel events
   OOM Killer
   dmesg

Especially for catching the OOM Killer and other interesting kernel events that affect processes without giving user-expected stack traces.

Is dmesg available to users on Aurora worker nodes?

``dmesg`` already outputs JSON, if run with the right parameter.
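For example, something like this - assuming a reasonably recent util-linux ``dmesg``, where ``--json`` wraps the records in a top-level ``dmesg`` array with a ``msg`` field per record:

.. code-block::

    dmesg --json | jq '.dmesg[] | select(.msg | test("[Oo]ut of memory"))'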
That should be an hour to prototype alongside workers. Cross-reference host-wide process monitoring as thematically related.

Idea: automatic instrumentation
-------------------------------

.. index::
   OpenTelemetry
   idea; automatic instrumentation

Projects like OpenTelemetry offer automatic instrumentation. That would be interesting to experiment with here.

.. _moving:

Moving wide records around
==========================

Events are generated in various places (for example, in application code running on HPC cluster worker nodes) and usually the user wants them somewhere else - for some kind of analysis, in the broad sense (:ref:`analysing`).

Concrete mechanisms for moving events around to a place of analysis should not be baked into the architecture. Realtime options, file-based options, ... - that is an area for experimentation, and this work should facilitate that rather than being prescriptive.

This chapter talks about different mechanisms that are on offer, and about how configuration might work. It tries to stay away from implementing much new mechanism, and instead tries to focus on integrating what already exists.

Comparison to Parsl logging
---------------------------

(also followed by Academy at time of writing)

The expectation is that you send your debugging expert a tarball of logs for them to pore over - this is extremely asynchronous, but a very effective way of moving those records around. It has a relatively low effect on performance behaviour: get the logs onto some filesystem while the performance-critical bit is running, and move them from there later on.

- poring over these logs "later" - there's no need for those logs to accumulate in real time in one place for post-facto analysis. And in practice, when doing log analysis rather than monitoring analysis, "send me a tarball of your runinfo" is a standard technique.

Async movement is much easier than synchronous/realtime movement.

Comparison to Parsl Monitoring
------------------------------
The transmission model is real-time. Even with recent radio plugins, the assumption is still that messages will arrive soon after being sent.

The almost-real-time data transmission model is especially awkward when combined with SQL: distributed system events will arrive at different times, or in the original UDP model perhaps not at all, and the "first" message that creates a task (for the purposes of the database) might arrive after some secondary data that requires that primary key to exist. Yes, it's nice for the SQL database to follow foreign key rules, especially when looking at the data "afterwards", but that's not realistic for distributed, unreliable events.

- UDP: sends UDP packets at the submit side. UDP is unreliable, so events *do* get lost. I think the assumption at implementation time was that UDP packet loss is just some thing your professor tells you about, but clearly doesn't happen.

- filesystem: needs a shared filesystem. One file = one monitoring event. If your filesystem is slow, which it often is, this is slow too.

- HTEX: the most efficient, but only works with the High Throughput Executor. Uses the existing HTEX result channel to send back monitoring events.

- ZMQ: this is over TCP. Like the UDP radio, it needs to be able to connect to the submit side. Probably better than UDP, although there's a TCP and ZMQ session initialization needed at the start of every task, because this radio does not persist connections across tasks.
  Unlike UDP, on the submit side this is yet another per-worker file descriptor in use, I think, which is a serious scalability limitation.

- Python multiprocessing: for sending monitoring events within the same cluster of Python `multiprocessing` processes: roughly the set of processes that were forked locally by the submit script using `multiprocessing`, so in htex: not the workers and not the interchange.

None of these are suitable for cloud-style environments where there is neither a shared filesystem nor a clean IP network. So I also prototyped an Academy radio for use with GC+Parsl - although I would rather use something like Octopus or Chronolog.

Python Configurability
----------------------

.. index::
   Configurability
   Python; Configurability

A soft start in Parsl is to let people opt in to observability-style logs - with most of the performance hit coming from turning on json output, I think, it doesn't matter too much performance-wise to add in the extra stuff on log calls.

The current parsl stuff is not set up for arbitrary log configuration outside of the submit-side process: for example, the worker helpers don't do any log config at all and rely on their enclosing per-executor environments to do it, which I think some do not.

htex interchange and worker logs have a hardcoded log config with a single debug boolean.
I'd like to do something a bit more flexible than adding more parameters, reflecting that in the future people might want to configure their handlers differently rather than using the JSONHandler.

.. index:: Chronolog

e.g. chronolog. pytest metrics observation in another section.

See the Parsl monitoring radios configuration model; start prototyping that. Note that it doesn't magically make arbitrary components that aren't compliant+Python redirectable. But that's fine in the modular approach.

See the existing initialize_logging, which allows arbitrary user configurability at the submit-process side, by getting parsl completely out of the way and allowing the user to run whatever code they want.

Adventure: Wide records stored as JSON in files
-----------------------------------------------

This prototype stores Parsl logs that have been sent into the Python ``logging`` system as JSON objects, one per line.

This is one of the initial use cases for the above configurability.

This was implemented as a straightforward Python logging `Handler`, similar to the existing log handlers, the difference being how the output line is formatted.
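A minimal sketch of the shape of such a handler - this is illustrative, not the prototype's actual JSONHandler:

.. code-block:: python3

    import json
    import logging

    class JSONLinesHandler(logging.FileHandler):
        def format(self, record: logging.LogRecord) -> str:
            # Standard fields, plus whatever arrived via the ``extra``
            # API: extra kwargs become attributes on the LogRecord, so
            # anything not present on a default record is extra context.
            standard = set(logging.makeLogRecord({}).__dict__)
            out = {"created": record.created,
                   "name": record.name,
                   "levelname": record.levelname,
                   "msg": record.getMessage()}
            out.update({k: v for k, v in record.__dict__.items()
                        if k not in standard})
            return json.dumps(out, default=str)

    logging.getLogger().addHandler(JSONLinesHandler("parsl.jsonlog"))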
The files are then moveable using traditional means: for example, the classic "send me a tarball of your run directory".

Moving in realtime
------------------

What does realtime mean in this case? It is mostly a case of what the ultimate consumers need, rather than any strong technical definition at this stage.

Inside the Python parts of Parsl, this data *is* available in realtime at the point of logging, as it goes to whatever LogHandler is running in each python process. That isn't true in general on the "event model" side of things, though.

Parsl already moves some event stuff around the network in realtime: that is the purpose of the monitoring radio system.

The following two sections, octopus and chronolog, will talk about doing that.

My initial log-related work was post-facto: copy log files around. But there are plenty of mechanisms that should be able to deliver and analyse live, e.g. built around Diaspora Octopus.

Adventure: Diaspora Octopus
---------------------------

.. index::
   Diaspora; Octopus
   Octopus
   Kafka
   Globus Hosted Services; Diaspora

This is an obvious follow-on to file-based JSON logs: the developers still kinda exist, and are friendly.

.. index::
   people; Ryan
   people; Haochen

with Ryan and Haochen

This turned into a monster debugging and restructuring session around Octopus reliability.

Ryan has a specific use case he's trying to implement, that I am helping him with:

    i mostly want to know when my agents perform their loop so i can hackily use this as a heartbeat to determine if my agents alive, and, when the agent decides to call the llm, i want to know the outcome of that call

    -- Ryan, on Slack
Idea: Chronolog
---------------

.. index::
   Chronolog
   people; Nishchay
   idea; Chronolog

Nishchay did some stuff here. I don't know what the state of it is.

https://grc.iit.edu/research/projects/chronolog/

.. _pytest-observes-logs:

Adventure: pytest observing interchange variables
-------------------------------------------------

.. index::
   pytest
   Python; pytest

The pytest htex task priority test wants to wait for the interchange to have all the submitted tasks - which happens asynchronously to submit calls returning. It does that by logfile parsing. How does that fit into this observability story? There's a metric in my prototype for this value (which I used in one of the other use cases here).

Can do this by re-parsing the interchange log value. Could also (with suitable configuration) attach a "pytest can see only metrics" log writer that runs over a unix socket? In some sense, that is injecting the relevant observability path into the interchange code as a configured log handler. That gives some motivation for the configurability section.

Also: attaching a JSON log file to the interchange, and having a tail reader of that; that also needs special configuration of the interchange, I think.
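A sketch of what that tail-reader side might look like in a test - the polling helper here is hypothetical, assuming the interchange writes one JSON object per line:

.. code-block:: python3

    import json
    import time

    def wait_for_metric(path, key, target, timeout=60):
        # Poll a JSON-lines log until some metric field reaches the
        # value the test is waiting for, instead of parsing free text.
        deadline = time.time() + timeout
        while time.time() < deadline:
            with open(path) as f:
                for line in f:
                    event = json.loads(line)
                    if event.get(key) == target:
                        return event
            time.sleep(0.5)
        raise TimeoutError(f"{key} never reached {target}")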
Adventure: Academy agents can report their own relevant logs via action
------------------------------------------------------------------------

A prototype I made for Logan, and also showed to Ryan.

This ties in with Ryan's Diaspora use case for examining what individual agents are up to.
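A minimal sketch of the shape of that prototype (not its actual code), using the Academy agent style that appears later in this report; the buffering handler and the ``report_logs`` action name are made up here:

.. code-block:: python

   import logging

   import academy.agent as aa

   # Buffer this process's log records in memory as they are emitted.
   _records: list[str] = []

   class _BufferHandler(logging.Handler):
       def emit(self, record: logging.LogRecord) -> None:
           _records.append(f"{record.created} {record.name} {record.getMessage()}")

   logging.getLogger().addHandler(_BufferHandler())

   class SelfReportingAgent(aa.Agent):
       """Serves the log records captured in its own process via an action."""

       @aa.action
       async def report_logs(self) -> list[str]:
           # A real version might filter to records relevant to this agent,
           # or clear the buffer after reporting.
           return list(_records)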
.. _analysing:

Analysing wide records
======================

At small enough scale - which is actually quite a large number of tasks, given the volumes of data involved - parsing logs into a Python session and using standard Python tools like list comprehensions is a legitimate way to analyse things, rather than treating this approach as something awkwardly shameful that will be replaced by The Real Thing later. This is especially appealing to Parsl users, who tend to be Python data science literate anyway.

That isn't the only approach, and this is also modular: you get the records and can analyse them using whatever tools you personally find appropriate.

Adventure: All events for a task, in two aspects/presentations
--------------------------------------------------------------

.. index::
   aspect
   presentation

TODO: move this from elsewhere, tidy up/modularise the code so it is presentable as a good first example. Show what it looks like: with monitoring.db import only; with the full observability prototype.

emphasise: the first one is available now without needing to modify Parsl core.

emphasise: the same analytics code gives different results without needing much modification, given different *aspects* / *presentations* of the same run - see :ref:`partialdata`

emphasise: integration: there are resource records for tasks via monitoring, and htex internals via JSON logs: neither is a superset of the other.

Task events from the monitoring.db presentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Import the monitoring.db created by unmodified master Parsl, and see how many event records there are:

.. code-block:: python3

   >>> from parsl.observability.import_monitoring_db import import_db
   >>> l=import_db("runinfo/monitoring.db")
   >>> len(l)
   14596

Here's an example event record:

.. code-block:: python3

   >>> l[0]
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 0, 'parsl_try_id': 0, 'parsl_task_status': 'pending', 'created': 1764589563.205463}

Now identify all the tasks, keyed hierarchically by DFK ID and task number:

.. code-block:: python3

   >>> tasks = { (event['parsl_dfk'], event['parsl_task_id']) for event in l if 'parsl_dfk' in event and 'parsl_task_id' in event }
   >>> len(tasks)
   2432

and pick one randomly:

.. code-block:: python3

   >>> import random
   >>> random.choice(list(tasks))
   ('a08cc383-927a-4ce8-926b-f31e52e6edc2', 72)
That's task 72 of run ``a08cc383-927a-4ce8-926b-f31e52e6edc2``.

Now pick out all the records that are labelled as part of that task, and print them nicely:

.. code-block:: python3

   >>> events = [event for event in l if event.get('parsl_dfk', None) == 'a08cc383-927a-4ce8-926b-f31e52e6edc2' and event.get('parsl_task_id', None) == 72]
   >>> for event in sorted(events, key=lambda event: float(event['created'])): print(event)
   ...
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 72, 'parsl_try_id': 0, 'parsl_task_status': 'pending', 'created': 1764589576.775231}
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 72, 'parsl_try_id': 0, 'parsl_task_status': 'launched', 'created': 1764589577.435498}
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 72, 'parsl_try_id': 0, 'parsl_task_status': 'running', 'created': 1764589577.472009}
   {'parsl_try_id': 0, 'parsl_task_id': 72, 'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'created': 1764589577.514019, 'resource_monitoring_interval': 1.0, 'psutil_process_pid': 10680, 'psutil_process_memory_percent': 1.2360561455881223, 'psutil_process_children_count': 1.0, 'psutil_process_time_user': 1.26, 'psutil_process_time_system': 0.21000000000000002, 'psutil_process_memory_virtual': 416145408.0, 'psutil_process_memory_resident': 203653120.0, 'psutil_process_disk_read': 32585809.0, 'psutil_process_disk_write': 20895.0, 'psutil_process_status': 'sleeping', 'psutil_cpu_num': '3', 'psutil_process_num_ctx_switches_voluntary': 59.0, 'psutil_process_num_ctx_switches_involuntary': 396.0}
   {'parsl_try_id': 0, 'parsl_task_id': 72, 'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'created': 1764589577.566039, 'resource_monitoring_interval': 1.0, 'psutil_process_pid': 10680, 'psutil_process_memory_percent': 1.2366030730861701, 'psutil_process_children_count': 1.0, 'psutil_process_time_user': 1.28, 'psutil_process_time_system': 0.22, 'psutil_process_memory_virtual': 416145408.0, 'psutil_process_memory_resident': 203661312.0, 'psutil_process_disk_read': 32995622.0, 'psutil_process_disk_write': 26441.0, 'psutil_process_status': 'sleeping', 'psutil_cpu_num': '3', 'psutil_process_num_ctx_switches_voluntary': 69.0, 'psutil_process_num_ctx_switches_involuntary': 454.0}
   {'parsl_try_id': 0, 'parsl_task_id': 72, 'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'created': 1764589577.600372, 'resource_monitoring_interval': 1.0, 'psutil_process_pid': 10680, 'psutil_process_memory_percent': 1.2366030730861701, 'psutil_process_children_count': 1.0, 'psutil_process_time_user': 1.3, 'psutil_process_time_system': 0.23, 'psutil_process_memory_virtual': 416145408.0, 'psutil_process_memory_resident': 203444224.0, 'psutil_process_disk_read': 33404930.0, 'psutil_process_disk_write': 33922.0, 'psutil_process_status': 'sleeping', 'psutil_cpu_num': '3', 'psutil_process_num_ctx_switches_voluntary': 74.0, 'psutil_process_num_ctx_switches_involuntary': 465.0}
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 72, 'parsl_try_id': 0, 'parsl_task_status': 'running_ended', 'created': 1764589577.646802}
   {'parsl_dfk': 'a08cc383-927a-4ce8-926b-f31e52e6edc2', 'parsl_task_id': 72, 'parsl_try_id': 0, 'parsl_task_status': 'exec_done', 'created': 1764589577.671692}
So what comes out is some records which reflect the change in the task status as seen by the monitoring system (which includes the ``running`` and ``running_ended`` statuses that aren't known to the DFK) and, while the task is running, some resource monitoring records.

This is basically a reformatting of records you could get by running SQL queries against ``monitoring.db``, which is unsurprising: the only data source used was that database file.

Task events from a JSON logging presentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'm going to look at that same task (task 72 of run ``a08cc383-927a-4ce8-926b-f31e52e6edc2``) from the perspective of JSON logs now -- Parsl modified to output richer log files in JSON format with more machine readable metadata. TODO: reference the relevant generating events section.

All I am going to change is the importer command. I'm going to use the same selector and printing code shown above. So what we should get is a different *presentation* or *aspect* of the same task.
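For concreteness, the swap looks something like this; the JSON-log importer name and path here are illustrative, not necessarily what the prototype calls them. The point is that only the import line of the session changes:

.. code-block:: python3

   >>> # hypothetical JSON-log importer name, standing in for import_db above;
   >>> # the selector and printing code from the previous section is unchanged
   >>> from parsl.observability.import_json_logs import import_logs
   >>> l = import_logs("runinfo/000/")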
Adventure: The minimal change necessary to get htex task logs into the above task trace
----------------------------------------------------------------------------------------

This is probably one log line - to perform the join between IDs - plus JSON logs.

TODO: present the changes I've done, but minimised to only this one change.

Adventure: blog: Visualization for task prioritisation
-------------------------------------------------------

(two graphs that are already in parsl-visualize but probably-buggy - see #4021)

This uses the replay-monitoring.db approach with no runtime changes, because the work I did there was in parsl master, but I want to do custom visualizations.

[TODO: link to blog post]

prioritisation part 2: by task type
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Work towards a second blog post here, now most of the mechanics are worked out.

Step 2 of that: this was a second requirement on prioritisation from DESC.

Use an A->B1/B2->C three step diamond-dag, because it's a bit less trivial.

Visualization of task types, for Jim's follow-on question: how can we adapt step 1 to colour by app name? This is not well presented in parsl-visualize, because that focuses on state transitions rather than on app identity as the primary colour-key.

Visualisation also coloured by task-chain/task-cluster, to show a cluster based visualization.

Priority modes: natural (submit-to-htex order, "as unblocked" order), random (priority=random.random()), chain priority by chain depth, chain priority by cluster. The last two should be "the same" in plot 4, I hope. It is unclear what random mode will do, if anything - I guess get more later-unlocked tasks randomly in there? Random is always interesting to me as pushing things away from degenerate cases - in this case, "Cs run last".
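As a sketch of what "chain priority by chain depth" means over the diamond DAG, in plain Python (the dependency table and priority convention here are mine, for illustration):

.. code-block:: python

   from functools import cache

   # Diamond DAG: A -> B1, A -> B2, B1 -> C, B2 -> C
   deps = {"A": [], "B1": ["A"], "B2": ["A"], "C": ["B1", "B2"]}

   @cache
   def depth(task: str) -> int:
       """Chain depth: 1 + the longest chain among the task's dependencies."""
       return 1 + max((depth(d) for d in deps[task]), default=0)

   # Deeper tasks get higher priority, so chains are driven through to their
   # C tasks rather than everything fanning out breadth-first.
   priorities = {task: depth(task) for task in deps}
   assert priorities == {"A": 1, "B1": 2, "B2": 2, "C": 3}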
plot 1: task run/running_ended individual tasks, coloured by parsl app name

plot 2: tasks of each of two kinds, coloured by parsl app name

plot 3: tasks running by type, with no priority and with two different priority schemes.

plot 4: visualisation of end-result completion - i.e. how many C tasks have completed over time, ignoring everything else about the inside - without prioritisation and with my two prioritisation schemes.

Plot 4 should be the top level plot set, because it is an example "goal" of the prioritisation, I think (might be because you want results sooner, might be because C completing means you can delete a load of intermediate temporary data sooner).

From an observability perspective: the task chain identity is not known to Parsl. This is additional metadata that, in observability concepts, is added on by a "higher level system" and joined on at analysis time. The application knows about it, and the querier knows about it; none of the intermediate execution or observability infrastructure knows about it.

1. the status table rerun gives runtimes for plotting based on Parsl level dfk/task/try, but doesn't give any metadata about those, such as app name. In SQL this is added on as a JOIN, and so it is here too (see the sketch after this list) - rerun the tasks table as a sequence of log records. Note that these records don't have a notion of "created", because they are records but aren't from a point in time: they are an already aggregated set of information. Don't let that scare you - observability records don't have to look like the output of a printf!
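A sketch of that analysis-time join, using the record shapes from the monitoring.db session earlier; the ``parsl_app_name`` field and the aggregated metadata record are illustrative:

.. code-block:: python

   # Status events (points in time) and per-task metadata records (not points
   # in time) from the same import; shapes follow the earlier session.
   status_events = [
       {'parsl_dfk': 'dfk-1', 'parsl_task_id': 72, 'parsl_task_status': 'running', 'created': 1764589577.47},
   ]
   task_metadata = [
       {'parsl_dfk': 'dfk-1', 'parsl_task_id': 72, 'parsl_app_name': 'random_uuid'},
   ]

   # Index the metadata by its identity key, then widen each status event,
   # exactly where SQL would use a JOIN.
   by_task = {(m['parsl_dfk'], m['parsl_task_id']): m for m in task_metadata}
   widened = [
       {**event, **by_task.get((event['parsl_dfk'], event['parsl_task_id']), {})}
       for event in status_events
   ]
   assert widened[0]['parsl_app_name'] == 'random_uuid'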
Task flow logs through the whole system
---------------------------------------

Here's a use case that is hard with what exists in master-branch Parsl right now.

I want to know, for a particular arbitrary task, the timings of the task as it is submitted by the user workflow, flows through the DFK, into the htex interchange and worker pool, executes on an htex worker, and flows back to the user, with the timing of each step.

What exists in master Parsl right now is some information in monitoring, and some information in log files. The monitoring information is focused on the high level task model, not on what is happening inside Parsl to run that high level model. Logs as they exist now are extremely ad-hoc, spread around in at least 4 different places, and poorly integrated: for example, log messages sometimes do not contain context about which task they refer to, do not represent that context uniformly (e.g. in a greppable way), and are ambiguous about context (e.g. some places refer to task 1 meaning the DFK-level task 1, and some places refer to task 1 meaning the HTEX-level task 1, which could be something completely different).

As a contrast, an example output of this prototype (as of 2025-10-26) is:
.. code-block:: none

   === About task 358 ===
   2025-10-26 10:29:46.467298 MainThread@117098 Task 358: will be sent to executor htex_Local (parsl.log)
   2025-10-26 10:29:46.467412 MainThread@117098 Task 358: Adding output dependencies (parsl.log)
   2025-10-26 10:29:46.467484 MainThread@117098 Task 358: Added output dependencies (parsl.log)
   2025-10-26 10:29:46.467550 MainThread@117098 Task 358: Gathering dependencies: start (parsl.log)
   2025-10-26 10:29:46.467620 MainThread@117098 Task 358: Gathering dependencies: end (parsl.log)
   2025-10-26 10:29:46.467685 MainThread@117098 Task 358: submitted for App random_uuid, not waiting on any dependency (parsl.log)
   2025-10-26 10:29:46.467752 MainThread@117098 Task 358: has AppFuture: (parsl.log)
   2025-10-26 10:29:46.467818 MainThread@117098 Task 358: initializing state to pending (parsl.log)
   2025-10-26 10:29:46.469992 Task-Launch_0@117098 Task 358: changing state from pending to launched (parsl.log)
   2025-10-26 10:29:46.470113 Task-Launch_0@117098 Task 358: try 0 launched on executor htex_Local with executor id 340 (parsl.log)
   2025-10-26 10:29:46.470240 Task-Launch_0@117098 Task 358: Standard out will not be redirected. (parsl.log)
   2025-10-26 10:29:46.470310 Task-Launch_0@117098 Task 358: Standard error will not be redirected. (parsl.log)
   2025-10-26 10:29:46.470336 MainThread@117129 HTEX task 340: putting onto pending_task_queue (interchange log)
   2025-10-26 10:29:46.470404 MainThread@117129 HTEX task 340: fetched task (interchange log)
   2025-10-26 10:29:46.470815 Interchange-Communicator@117144 Putting HTEX task 340 into scheduler (Pool manager log)
   2025-10-26 10:29:46.471166 MainThread@117162 HTEX task 340: received executor task (Pool worker log)
   2025-10-26 10:29:46.492449 MainThread@117162 HTEX task 340: Completed task (Pool worker log)
   2025-10-26 10:29:46.492742 MainThread@117162 HTEX task 340: All processing finished for task (Pool worker log)
   2025-10-26 10:29:46.493508 MainThread@117129 HTEX task 340: Manager b'4f65802901c6': Removing task from manager (interchange log)
   2025-10-26 10:29:46.493948 HTEX-Result-Queue-Thread@117098 Task 358: changing state from launched to exec_done (parsl.log)
   2025-10-26 10:29:46.494729 HTEX-Result-Queue-Thread@117098 Task 358: Standard out will not be redirected. (parsl.log)
   2025-10-26 10:29:46.494905 HTEX-Result-Queue-Thread@117098 Task 358: Standard error will not be redirected. (parsl.log)

This integrates four log files and two task identifier systems into a single sequence of events.

Algebra of rearranging and querying wide events
-----------------------------------------------

These are some of the standard patterns I've found useful enough, and straightforward enough, to turn into library functions in the `parsl.observability` module.

Look at relational algebra for phrasing and concepts.

functorial
~~~~~~~~~~

These operations are *functorial* in the sense that they operate on individual event records without regard for the context those records are in.

Widening:

widen-by-constant: if we import a new log file but we know its broader context some other way, perhaps because it came from a known directory inside a parsl rundir (eg work queue's )

relabelling - to make names align from multiple sources, or to add distinction from multiple sources
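A sketch of those two functorial operations over the dict-shaped records used throughout this report; the function names are mine, not necessarily the prototype's:

.. code-block:: python

   def widen_by_constant(events, extra):
       """Attach the same known context (e.g. which rundir a file came from)
       to every record from one source."""
       return [{**event, **extra} for event in events]

   def relabel(events, mapping):
       """Rename keys, e.g. to make two sources' names align, or to keep
       two sources' similarly-named fields distinct."""
       return [{mapping.get(k, k): v for k, v in event.items()} for event in events]

   wq_events = [{"task": 5, "state": "done"}]
   widened = widen_by_constant(wq_events, {"parsl_dfk": "a08cc383-..."})
   aligned = relabel(widened, {"task": "wq_task_id"})
   assert aligned == [{"wq_task_id": 5, "state": "done", "parsl_dfk": "a08cc383-..."}]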
non-functorial
~~~~~~~~~~~~~~

These operations are *not functorial*. For example, widening-by-implication copies information between event records that are somehow grouped into a collection together, so cannot be implemented on a record-by-record basis.

post-facto relationship establishment using grouping and key implication

used where you might use a ``JOIN`` in SQL.

.. index::
   relational algebra

notion of identity and key-sequences: eg. parsl_dfk/parsl_task_id is a globally unique identifier for a parsl task across time and space, and so is parsl_dfk/executor_label/block_number or parsl_dfk/executor_label/manager_id/worker_number -- although manager ID is also (in short form) globally unique. This is distinct from the hierarchical relations between entities - although hierarchical identity keys will often line up with execution hierarchy.

Peter Buneman's XML keys work did nested sequences of keys for identifying XML fragments, c. year 2000.

Joins can send info back in time: if we have a span but don't know which parsl task it belongs to at the start, only later, we can use joins to bring that information from the future.

keys imply key operator
^^^^^^^^^^^^^^^^^^^^^^^

[l_keys] implies [r_key] over [collection]:

* if any log selected by l_keys contains an r_key, that r_key is unique (auto-check that) and should be attached to every log record selected by l_keys - see the sketch after this list. Use case: widening the task reception span in idris2interchange to be labelled with htex_task_id.

* this is functional dependency: https://en.wikipedia.org/wiki/Functional_dependency
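A sketch of that operator, with the uniqueness auto-check; ``implies`` is my naming for the pattern as it might appear in the library, assuming plain-dict records:

.. code-block:: python

   def implies(events, l_keys, r_key):
       """Functional-dependency widening: within each group of records sharing
       the same l_keys values, the r_key value must be unique; attach it to
       every record of the group."""
       found = {}
       for event in events:
           if all(k in event for k in l_keys) and r_key in event:
               group = tuple(event[k] for k in l_keys)
               if found.setdefault(group, event[r_key]) != event[r_key]:
                   raise ValueError(f"{l_keys} does not determine {r_key} for {group}")
       widened = []
       for event in events:
           if all(k in event for k in l_keys):
               group = tuple(event[k] for k in l_keys)
               if group in found:
                   event = {**event, r_key: found[group]}
           widened.append(event)
       return widened

   # e.g. a span that only learns its htex task id in a later record:
   span = [
       {"span_id": "s1", "event": "start"},
       {"span_id": "s1", "event": "got_task", "htex_task_id": 340},
   ]
   assert all(e["htex_task_id"] == 340 for e in implies(span, ["span_id"], "htex_task_id"))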
Fixpoint notions might need to be incorporated into the query model in Python code (so that a fixpoint can be converged to across non-local widening queries - see the idris2interchange usecase notes).

There are lots of different identifier spaces, loosely structured and not necessarily hierarchical: for example, an htex task is not necessarily "inside" a Parsl task, as htex can be used outside of a Parsl DFK (which is where the notion of a Parsl task lives). An htex task often runs in a unix process, but that process also runs many htex tasks, and an htex task also has extent outside of that worker process: there's no containment relationship either way.

Adventure: Browser UI
---------------------

.. index::
   web browser

What might a browser UI look like for this?

Compare parsl-visualize. Compare scrolling through logs, but with some more interactivity (eg. click / choose "show me logs from same dfk/task_id").

But the parsl-visualize UI is so limited, it only has a handful of graphs to recreate. And some of them do not make sense to me, so I would not recreate them.
I am not super excited about building UIs, but it would probably be interesting to build something simple that can do a few queries and graphs, to demonstrate log analysis in clickable form.

And then I could put in the analyses I have made (other graphs/reports) too, and also have it work with academy logs right away, and be ready to pull in other JSON log files as a more advanced implementation motivating JSON/wide events.

Use Python and matplotlib, no web-specific stuff, to promote people who have done local scripting putting new plots into the UI, and to promote using the graph code from the visualiser in their own local scripting.
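A sketch of the kind of plot code that should move freely between a local script and such a UI, working over the event-record lists used earlier (field names follow the monitoring.db session above):

.. code-block:: python

   import matplotlib.pyplot as plt

   def plot_status_counts(events, status):
       """Cumulative count of tasks reaching a given status over time."""
       times = sorted(
           event["created"]
           for event in events
           if event.get("parsl_task_status") == status
       )
       plt.step(times, range(1, len(times) + 1), where="post", label=status)

   # e.g. against the monitoring.db import from earlier:
   #   l = import_db("runinfo/monitoring.db")
   #   plot_status_counts(l, "exec_done")
   #   plt.xlabel("time (unix seconds)"); plt.ylabel("tasks"); plt.legend(); plt.show()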
Make it able to address a whole collection of monitoring.db runs at once - not only one chosen workflow.

Use a parsl-aware list of quasi-hierarchical key names to drive a narrow-down UI:

eg: pick dfk. Then pick: parsl task, then parsl try; or executor, then executor task; or executor, then block, then task.

htex instance/block/manager/worker/htex_task

htex instance/block/manager/worker is an execution location - how an htex instance is identified is different between real Parsl and GC: in real parsl, it's a dfk id/executor label.

Are all graphs relevant for all key selections? Or should eg. a block duration/count graph only appear in certain situations? eg. if we've focused on one task-try, does that mean... no block status info and so no block graph? Graphs could be enabled by: "if you see records like this, this graph is relevant". That would allow eg. enabling htex or WQ specific plots if we see (with more merged info) some htex or WQ specific data. If we only see academy or GC logs, the UI should only report about them.

Recreate the block vs task count graph from Matthew's paper.

Aim for the first iteration to work against the current monitoring.db format, so it can be tried out in a separate install against production runs, distinct from all other observability work. Extensibility there right from the start, to allow that to extend for importing new data and plugging in plots and reports about new data.

Two obvious non-monitoring.db extensions: what's happening with managers in blocks; what's happening with work queue. These are both executor specific, and don't fit the monitoring.db schema so well - so they are clear demos of what could be done better.

Idea: Streaming-fold web UI
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. index::
   idea; streaming-fold web UI

What operators can be built with a streaming-fold, to give live updates as logs come in (eg. tail from a filesystem in the simplest case)?

Joins are the hard bit there, I think - but a fundep operator is at least constrained in its behaviour: cache keyed-but-unjoined blocks, and if we see a key record, emit the whole block and forget it.
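A sketch of that streaming behaviour, reusing the functional-dependency idea from the algebra section, again with dict records and made-up names:

.. code-block:: python

   def streaming_implies(records, l_keys, r_key):
       """Generator form of the fundep widening operator: buffer records whose
       r_key is not yet known; when a record supplies it, flush the buffered
       block widened and forget it; pass everything else straight through."""
       pending = {}   # group key -> list of buffered records
       known = {}     # group key -> r_key value, once seen
       for record in records:
           group = tuple(record.get(k) for k in l_keys)
           if r_key in record:
               known.setdefault(group, record[r_key])
               # flush anything buffered for this group, widened with r_key
               for buffered in pending.pop(group, []):
                   yield {**buffered, r_key: known[group]}
               yield record
           elif group in known:
               yield {**record, r_key: known[group]}
           else:
               pending.setdefault(group, []).append(record)

   stream = [
       {"span_id": "s1", "event": "start"},
       {"span_id": "s1", "event": "got_task", "htex_task_id": 340},
       {"span_id": "s1", "event": "end"},
   ]
   assert [r.get("htex_task_id") for r in streaming_implies(stream, ["span_id"], "htex_task_id")] == [340, 340, 340]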
Spend 1 day prototyping this.

Getting started
---------------

.. code-block:: python

   >>> import globus_compute_sdk as s
   >>> e=s.Executor(endpoint_id='5c7202b5-d022-4534-a399-21d4356129be')
   >>> # authorization happens here
   >>> assert e.submit(abs, -7).result() == 7

In this environment, the only task reference is in ``~/.globus_compute/uep.5c7202b5-d022-4534-a399-21d4356129be.bac77271-9a60-cad7-c4e9-0eb689ddf4d1/GlobusComputeEngine-HighThroughputExecutor/block-0/beb8bc8bb003/worker_0.log``, using htex task numbers. This is not correlated with anything visible on the submit side -- which is a UUID globus compute task ID:

.. code-block:: python

   >>> f=e.submit(abs, -8)
   >>> f.task_id
   '8b3da38f-f889-4ae1-8d31-68ff2f830876'

Suggestion: work out how to correlate GC task IDs with htex task IDs. Note that htex task IDs are not unique within the endpoint directory, because multiple HTEXs (over time) log into the same directory.
Block IDs are also not unique, because of the same log directory conflation.

Parsl work with extra debug info is likely to give more task information here, but all still correlated by htex task ID, which as mentioned above is not even unique within an endpoint.

Launching an academy agent
--------------------------

There are no academy agents defined by default in an Academy install. If I manually define one in my submit container (in a disposable place, not shared), what happens when I try to launch it?

This works to a remote endpoint, from one cloudlump container to another, which surprises me a bit because the MyAgent code must be being conveyed by some bowels of GC serialization...

.. code-block:: python

   import asyncio

   import globus_compute_sdk as gc

   import academy.agent as aa
   import academy.manager as am
   import academy.exchange as ae

   import concurrent.futures as cf

   print("importing myagent main")


   class MyAgent(aa.Agent):
       @aa.action
       async def seven(self):
           print(f"something for stdout from MyAgent {self!r}")
           import os
           return (7, os.getpid(), os.uname())


   if __name__ == "__main__":
       async def main():
           # async with await am.Manager.from_exchange_factory(factory=ae.HttpExchangeFactory(auth_method='globus', url="https://exchange.academy-agents.org"), executors=cf.ProcessPoolExecutor()) as m:
           async with await am.Manager.from_exchange_factory(factory=ae.HttpExchangeFactory(auth_method='globus', url="https://exchange.academy-agents.org"), executors=gc.Executor(endpoint_id='5c7202b5-d022-4534-a399-21d4356129be')) as m:
               print(f"with manager {m}")
               h = await m.launch(MyAgent())
               print(f"launched agent with handle {h}")
               s = await h.seven()
               print(f"agent seven result is {s}")
               assert s[0] == 7

       asyncio.run(main())
But if I put the agent in its own agentcode.py module, then I get a remote deserialization error:

.. code-block:: none

   ...
   File "/venv/lib/python3.13/site-packages/dill/_dill.py", line 452, in load
     obj = StockUnpickler.load(self)
   File "/venv/lib/python3.13/site-packages/dill/_dill.py", line 442, in find_class
     return StockUnpickler.find_class(self, module, name)
            ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
   ModuleNotFoundError: No module named 'agentcode'

which is consistent: in the first example, dill is used to convey the definitions; in the second case, pickle thinks it can do an ``import`` and so never gets to the point of dill conveying the definitions.
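That split can be seen directly with dill, without Globus Compute in the way - a minimal sketch, assuming dill's default by-value handling of ``__main__`` definitions:

.. code-block:: python

   import dill

   # Run as a script: MyAgent's module is __main__, which dill serializes
   # by value, so the receiving side needs no matching module on disk.
   class MyAgent:
       pass

   payload = dill.dumps(MyAgent)
   assert dill.loads(payload) is not None

   # By contrast (assumption about dill's default behaviour): a class imported
   # from agentcode.py is pickled as a module/name reference; loads() then
   # effectively does "import agentcode" on the receiving side, giving
   # ModuleNotFoundError if that file isn't installed there.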
Looking at GC-endpoint-side academy logs
----------------------------------------

For example, I want to see the agent action invocations. There are two paths here I might expect:

- some kind of endpoint/worker level, as the academy docs suggest, running as a worker initialization in the process worker pool. GC workers (and indeed, HTEX workers) don't have a configuration interface that supports that well, although in pure Parsl this observability project is working towards that -- see the configurability section nearer the start. I might expect that as part of general observability of *the whole system*, rather than hoping that the *other components* are themselves separately debuggable.

- agent-level log routing: start something at agent start, shut it down at agent end. There are two existing approaches:

  - I've prototyped making agents able to capture "their own" logs and report them via an agent action. I prototyped this with Logan, and mentioned it elsewhere in this report.

  - Alok added an initialize logging feature to manager launching of agents to insert a log file capture. There is no facility there for conveying the log file anywhere else.

These different approaches are not contradictory: the Python logging mechanism can cope with multiple log handlers.
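A minimal sketch of that composition, using only the stdlib ``logging`` module (the logger name and handler choices are illustrative, not Academy's actual configuration):

.. code-block:: python

    import logging

    logger = logging.getLogger("academy.agent.demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)

    # e.g. a manager-installed log file capture...
    logger.addHandler(logging.FileHandler("agent.log"))

    # ...alongside an agent-side handler that could feed log reporting
    logger.addHandler(logging.StreamHandler())

    # one record, delivered to both handlers
    logger.info("this record reaches both handlers")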
A goal: I want to look at agent activity - logs or visualization? - of stuff on the submit side and the remote side.

For example: I want to run my fibonacci agent test and see the agent logging its internal state as it changes, as well as seeing the client reporting what it sees.

I should be able to use multiple approaches to demonstrate how these events can be retrieved and give the same (or similar) output on the analysis side, and the differences in characteristics of the approaches would be interesting to comment on - with the logging and analysis code the same for all event-movement approaches.

Academy manager has an option to initialize remote logging at the start of agent execution -- that's one of two remote logging hooks that exist already (the other is on the user side, in agent startup). So let's see how they compare. The main questions: how much ahead-of-startup logging do I get from one but not the other? How much do I want my own free-coding configurability? e.g. for formatting or octopus-style?

e.g. ``manager.launch(init_logging)`` wants to only run once, even though it has logfiles and log levels specified per-agent. That's a process vs "entity" inconsistency to think about. It also doesn't have an extralog specifier. Rather than wiring in yet another one, look at my more general log configuration.

This doesn't have a configurable format, and it captures bits of logs from other stuff (such as the parsl worker_log entries), which is desirable sometimes, but perhaps more on an endpoint-wide basis.

It looks like it also doesn't *deinitialize* logging - because it's a hack scoped around the process, rather than a principled log bracketing.

Now look at what happens if I make the agent initialize its own logging - especially addressing the rough edges around deinitialization and parameterisation.
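A hedged sketch of what that could look like - the hook names and the per-agent log file are assumptions for illustration, not Academy's actual API:

.. code-block:: python

    import logging

    import academy.agent as aa


    class SelfLoggingAgent(aa.Agent):
        async def agent_on_startup(self) -> None:
            # assumed startup hook: install a parameterisable per-agent handler
            self._log_handler = logging.FileHandler("agent.log")  # hypothetical path
            self._log_handler.setFormatter(
                logging.Formatter("%(created)f %(name)s %(levelname)s %(message)s"))
            logging.getLogger().addHandler(self._log_handler)

        async def agent_on_shutdown(self) -> None:
            # the deinitialization step that process-scoped log init lacks:
            # a principled bracket around the agent's lifetime
            logging.getLogger().removeHandler(self._log_handler)
            self._log_handler.close()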
The rest
========

Debugging monitoring performance as part of developing this prototype
----------------------------------------------------------------------

findcommon tool - finds the common task sequence for templated logs and outputs their sequence, like this:

First run parsl-perf like this:

.. code-block:: none

    parsl-perf --config parsl/tests/configs/htex_local.py
    [...]
    ==== Iteration 3 ====
    Will run 58179 tasks to target 120 seconds runtime
    Submitting tasks / invoking apps
    All 58179 tasks submitted ... waiting for completion
    Submission took 103.880 seconds = 560.059 tasks/second
    Runtime: actual 137.225s vs target 120s
    Tasks per second: 423.967

    Tests complete - leaving DFK block

which executes a total of around 60000 tasks.

First, note that this prototype benchmarks significantly slower on my laptop than the contemporaneous master branch.

That's perhaps unsurprising: this benchmark is incredibly log sensitive, as my previous posts have noted (TODO: link to blog post and to R-performance report) - around 900 tasks per second on a 120 second benchmark - and this prototype adds a lot of log output. Part of the path to productionisation would be understanding and constraining this.
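To give a feel for why the benchmark is so log sensitive (my own illustrative measurement, not part of the prototype): each task in the trace below generates roughly 20 log records, so per-record handler overhead multiplies quickly at hundreds of tasks per second.

.. code-block:: python

    import logging
    import time

    logging.basicConfig(filename="/tmp/overhead.log", level=logging.DEBUG)
    log = logging.getLogger("overhead")

    n = 100_000
    t0 = time.perf_counter()
    for i in range(n):
        log.debug("Task %s: synthetic event", i)
    elapsed = time.perf_counter() - t0

    # at ~20 records per task, achievable tasks/second is bounded by
    # roughly 1/20th of this figure
    print(f"{n / elapsed:.0f} records/second")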
From that output above, it is clear that the submission loop is taking a long time: 100 seconds, with about 35 seconds of execution happening afterwards. The Parsl core should be able to process task submissions much faster than 560 tasks per second. So what's taking up time there?

Run findcommon (a could-be-modular-but-isn't helper from this observability prototype) on the result:

.. code-block:: none

    0.0: Task %s: will be sent to executor htex_local
    0.00023320618468031343: Task %s: Adding output dependencies
    0.0004515730863634116: Task %s: Added output dependencies
    0.000672943356177761: Task %s: Gathering dependencies: start
    0.0008952160973877195: Task %s: Gathering dependencies: end
    0.0011054732824941516: Task %s: submitted for App app, not waiting on any dependency
    0.001316777690507145: Task %s: has AppFuture: %s
    0.0015680651123983979: Task %s: initializing state to pending
    23.684763520758917: HTEX task %s: putting onto pending_task_queue
    23.68483662049256: HTEX task %s: fetched task
    23.684863335335613: Task %s: changing state from pending to launched
    23.6850573607536: Task %s: try %s launched on executor %s with executor id %s
    23.685248910492184: Task %s: Standard out will not be redirected.
    23.685424046734745: Task %s: Standard error will not be redirected.
    23.686276226995773: Putting HTEX task %s into scheduler
    23.686777094898495: HTEX task %s: received executor task
    23.687025900194147: HTEX task %s: Completed task
    23.687268549254735: HTEX task %s: All processing finished for task
    23.687837933843614: HTEX task %s: Manager %r: Removing task from manager
    23.688483699079185: Task %s: changing state from launched to exec_done

In this stylised synthetic task trace, a task takes an average of 23 seconds to go from the first event (choosing executor) to the final mark as done. That's fairly consistent with the parsl-perf output - I would expect the average here to be around half the time from parsl-perf's submission time to completion time (30 seconds).

What's useful with findcommon's output is that it shows the insides of Parsl's workings in more depth: 20 states instead of parsl-perf's start, submitted, end. And the potential exists to calculate other statistics on these events.
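A hedged sketch of the kind of computation findcommon performs (my reconstruction, not the tool's actual code): given wide-event records carrying a message template, a task ID and a timestamp, average each template's offset from its task's first event, then print the templates in offset order.

.. code-block:: python

    from collections import defaultdict


    def common_sequence(records):
        """records: dicts with 'created' (float seconds), 'template', 'task_id'."""
        first_seen = {}              # task_id -> timestamp of that task's first event
        offsets = defaultdict(list)  # template -> offsets from task start
        for r in sorted(records, key=lambda r: r["created"]):
            t0 = first_seen.setdefault(r["task_id"], r["created"])
            offsets[r["template"]].append(r["created"] - t0)
        mean = lambda xs: sum(xs) / len(xs)
        for template, offs in sorted(offsets.items(), key=lambda kv: mean(kv[1])):
            print(f"{mean(offs)}: {template}")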
So in this average case, there's something slow happening between setting the task to pending, and then the task "simultaneously" being marked as launched on the submit side and the interchange receiving it and placing it in the pending task queue.

That's a bit surprising - tasks are meant to accumulate in the interchange, not before the interchange.

So let's perform some deeper investigations -- observability is for Serious Investigators, and so it is fine to be hacking on the Parsl source code to understand this more. (By hacking, I mean making temporary changes for the investigation that will likely be thrown away rather than integrated into master.)

Let's flesh out the whole submission process with some more log lines. On the DFK side, that's pretty straightforward: the observability prototype has a per-task logger which, if you have the task record, will attach log messages to the task.

For example, here are the changes to add a log around the first call to launch_if_ready, which is probably the call that is launching the task.

.. code-block:: none

    +        task_logger.debug("TMP: dependencies added, calling launch_if_ready")
             self.launch_if_ready(task_record)
    +        task_logger.debug("TMP: launch_if_ready returned")

My suspicion is that this is around the htex submission queues, with a secondary submission around the launch executor, so to start with I'm going to add more logging around that.

Then rerun parsl-perf and findcommon, without modifying either, and it turns out to be that secondary submission, the launch executor:

.. code-block:: none

    0.0020453477688227: Task %s: TMP: submitted into launch pool executor
    0.002256870306434224: Task %s: TMP: launch_if_ready returned
    14.073021359217009: Task %s: TMP: before submitter lock
    [...]
    14.078550367412324: Task %s: changing state from launched to exec_done

Don't worry too much about the final time (14s) changing from 23s in the earlier run -- that's a characteristic of parsl-perf batch sizes that I'm working on in another branch.

If that's the case, I'd expect the thread pool executor, previously much faster than htex, to show similar characteristics:
Surprisingly, although the throughput is not much higher... the trace looks very different timewise. The bulk of the time still happens at the same place, but there isn't so much waiting there - less than a second on average. That's possibly because the executor can get through tasks much faster, so the queue doesn't build up so much?

.. code-block:: none

    ==== Iteration 2 ====
    Will run 68976 tasks to target 120 seconds runtime
    Submitting tasks / invoking apps
    All 68976 tasks submitted ... waiting for completion
    Submission took 117.915 seconds = 584.965 tasks/second
    Runtime: actual 118.417s vs target 120s
    Tasks per second: 582.485

.. code-block:: none

    0.0: Task %s: will be sent to executor threads
    0.00014157412110423425: Task %s: Adding output dependencies
    0.0002898652725047201: Task %s: Added output dependencies
    0.000425118042214259: Task %s: Gathering dependencies: start
    0.0005696294991521399: Task %s: Gathering dependencies: end
    0.0006999648174108608: Task %s: submitted for App app, not waiting on any dependency
    0.0008433702196425292: Task %s: has AppFuture: %s
    0.0010710284919573986: Task %s: initializing state to pending
    0.0011652027385929428: Task %s: TMP: dependencies added, calling launch_if_ready
    0.0012973675719411494: Task %s: submitting into launch pool executor
    0.0014397921284467212: Task %s: submitted into launch pool executor
    0.0015767665501452072: Task %s: TMP: launch_if_ready returned
    0.3143575128217656: Task %s: before submitter lock
    0.31448896150771743: Task %s: after submitter lock, before executor.submit
    0.3146383380777917: Task %s: after before executor.submit
    0.3147926810507091: Task %s: changing state from pending to launched
    0.3149239369413048: Task %s: try 0 launched on executor threads
    0.31504996538376506: Task %s: Standard out will not be redirected.
    0.31504996538376506: Task %s: Standard out will not be redirected.
    0.3151759985402679: Task %s: Standard error will not be redirected.
    0.3151759985402679: Task %s: Standard error will not be redirected.
    0.315319734920821: Task %s: changing state from launched to exec_done

So maybe I can do some graphing of events to give more insight than these averages are showing. A favourite of mine from previous monitoring work is how many tasks are in each state at each moment in time. I'll have to implement that for this observability prototype, because it's not done already, but once it's done it should be reusable, and it should share most infrastructure with ``findcommon``. Especially relevant is discovering where bottlenecks are: it looks like this is a Parsl-affecting performance regression that might be keeping workers idle. For example, we could ask: does the interchange have "enough" tasks at all times to keep dispatching? With 8 cores on my laptop, I'd like it to have at least 8 tasks or so inside htex at any one time, but this looks like it might not be true. Hopefully graphing will reveal more. It's also important to note that this findcommon output shows latency, not throughput -- though high latency at particular points is an indication of throughput problems.
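A hedged sketch of that tasks-per-state plot over the same assumed wide-event records (a reconstruction, not existing prototype code): replay the "changing state from X to Y" events and keep a running count of tasks in each state.

.. code-block:: python

    from collections import Counter


    def state_occupancy(events):
        """events: dicts with 'created', 'old_state', 'new_state'."""
        counts = Counter()
        series = []                         # (timestamp, {state: task count})
        for e in sorted(events, key=lambda e: e["created"]):
            if e["old_state"] is not None:  # a task's first event has no predecessor state
                counts[e["old_state"]] -= 1
            counts[e["new_state"]] += 1
            series.append((e["created"], dict(counts)))
        return series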
Or, I can look at how many tasks are in the interchange over time: there either is, or straightforwardly can be, a log line for that. That will fit a different model to the above log lines, which are per-task. Instead it's a metric on the state of one thing only: the interchange, of which there is only one, at least for the purposes of this investigation.

Add a new log line like this into the interchange at a suitable point (after task queueing, for example):

.. code-block:: none

    +        ql = len(self.pending_task_queue)
    +        logger.info(f"TMP: there are {ql} tasks in the pending task queue",
    +                    extra={"metric": "pending_task_queue_length", "queued_tasks": ql})

Now we can either look through the logs by hand to see the value manually, or extract it programmatically and plot it with matplotlib, in an ad-hoc script:

.. code-block:: python

    import matplotlib.pyplot as plt

    from parsl.observability.getlogs import getlogs

    logs = getlogs()

    # looking for these logs:
    #   "metric": "pending_task_queue_length", "queued_tasks": ql
    metrics = [(float(l['created']), int(l['queued_tasks']))
               for l in logs
               if 'metric' in l and l['metric'] == "pending_task_queue_length"]

    plt.scatter(x=[m[0] for m in metrics], y=[m[1] for m in metrics])
    plt.show()

and indeed that shows that the interchange queue length almost never goes above length 1, and never above length 10.

That's enough for now, but it's a usecase that shows partial understanding of throughput: we can see from this observability data that the conceptual 50000-task queue, which begins in parsl-perf as a ``for``-loop, doesn't progress fast enough into the interchange's internal queue, and so performance effort should probably be focused on understanding and improving the code path around launch and getting into the interchange queue.
With an almost empty interchange queue, anything happening on the worker side is probably not too relevant, at least for that parsl-perf use case.

This "understand the queue lengths (or implicit queue lengths) towards execution" investigation style has been useful in understanding Parsl performance limitations in the past.

See also
--------

NetLogger - https://dst.lbl.gov/publications/NetLogger.tech-report.pdf

my dnpc work, an earlier iteration of this: more focused on human log parsing, and so very fragile in the face of improving log messages, with not enough context in the human component.

syslog, systemd logging, the Linux kernel ring buffer/dmesg

Buneman XML keys (mentioned above, c. 2000)

Microsoft Power BI: a simple example of how we get this data into something actually novel for academia. Dashboard friendly.
Where's the bottleneck - visualization
--------------------------------------

Based on template analysis - but this could be based on anything that can be grouped and identified.

Review of changes made so far to Parsl and Academy
--------------------------------------------------

This should be part of understanding what sort of code changes I am proposing.

Applying this approach for academy
----------------------------------

As an extreme "data might not be there" case -- perhaps Parsl isn't there at all. What do this code and these techniques look like applied to a similar but very different codebase, Academy, which doesn't have any distributed monitoring at all at the moment? There are ~100 log lines in the academy codebase right now.
How much can this be converted in a few hours, and then analysed in similar ways?

The point here is both considering this as a real logging direction for academy, and as a proof of generality beyond Parsl.

Thoughts:

Academy logging so far has focused on looking pretty on the console: e.g. ANSI colour - that's at the opposite end of the spectrum from what this observability project is trying to log.

Rule of thumb for the initial conversion: whatever is substituted into the human message should also be added as an extras field.
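A minimal sketch of that rule of thumb applied to a hypothetical Academy-style log line (the logger name and field names are illustrative):

.. code-block:: python

    import logging

    logger = logging.getLogger("academy.demo")
    agent_id, action = "agent-123", "seven"  # hypothetical values

    # before: the values exist only inside the human-readable string
    logger.info(f"agent {agent_id} invoked action {action}")

    # after: the same message, with the substituted values also attached as
    # machine-readable extras that analysis tooling can filter on directly
    logger.info(
        f"agent {agent_id} invoked action {action}",
        extra={"agent_id": agent_id, "action": action},
    )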
Acknowledgements
================

chronolog: nishchay, inna

desc: esp david adams, tom glanzman, jim chiang

uiuc: ved

gc: kevin

academy: alok, greg, logan

diaspora: Ryan, Haochen

parsl: matthew