BUGS:

- ThreadScope DEADLOCKs occasionally, more often with --debug; why?
- X Window System error sometimes?
- the background of some widgets on Windows is white where it is grey on Linux
- make ^C work on Windows
- resizing the panes causes a grab lockup? Happened to Mikolaj on 7.11.2011, too.
- fix, rewrite or disable the partial redrawing of the graphs pane that causes
  many of the graphical bugs reported below
- rewrite lots of drawing code to sidestep the fixed-point precision problem
  of cairo
- scrolling to the right, we get some over-rendering at the boundary,
  causing a thick line
- (probably the same as above) the grey vertical lines sometimes get randomly
  darker when scrolling by moving the scrollbar handle slowly to the right
  (probably caused by testing an x coordinate only up to integral division by
  the slice size, or without taking the hadj_value into account, so the line
  is drawn many times while scrolling)
- scrolling with event labels enabled chops off some labels
- rendering bug when scrolling: we need to clip to the rectangle being drawn,
  otherwise we overwrite other parts of the window when long bars are drawn,
  which makes some events disappear
- sideways scrolling leaves curve rendering artifacts (e.g., the thicker
  fragments of the flat line at the end of the Activity graph)
- sometimes 2 labels are written on top of each other even at max zoom,
  e.g. "thread 3 yielding" and "thread 3 is runnable"
- some sequences of enabling/disabling labels and traces leave the timeline
  too short to display all traces; refreshing fixes this
- may be a feature: graphs are filled with colour from line 1 upwards, not
  from line 0, so lines at level 0 appear under the filled area, not level
  with it
- a few levels of zoom in and then zoom out sometimes result in only the
  rightmost fragment of the timeline being shown, with no indication in the
  scrollbar that scrolling is possible, although scrolling does fix the view
- ticking the trace boxes off sometimes shows a black rectangle in place of
  the switched-off graph

OTHER:

- sort the bookmark view by time?
- Delete key deletes bookmarks?
- hotkey to add a bookmark?
- 25%/50%/75% percentiles for pool size, see
  http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.6199 figure 15
  WARNING: unlike the mean, the median and other percentiles can't be computed
  from the percentiles of subnodes of the trees --- they need the whole data
  for the interval in question at each level of the tree. This increases the
  cost from O(n) to O(n * log n), where n is the total number of samples. The
  additional log n factor, in comparison with the mean, is probably inevitable
  unless we put the data in an array, because otherwise we have to sort the
  data for each interval to find the k-th element. An extra problem is that to
  get accurate percentiles for slices that do not match a single subtree node,
  we have to fetch the whole data for the slice again, completely repeating
  the calculations. Preprocessing by building the tree would then only be
  useful if the tree stored the whole data at each tree level, already
  partitioned, so that the data for each slice could be gathered cheaply (but
  a bit inaccurately, see the use of SparkStats.rescale in the current code)
  by looking at only a few nodes of the tree at a single level, instead of
  traversing a very long list. There are better data structures than the spark
  tree for quick lookup of sorted data, so let's remove the pool sizes from
  the spark tree altogether and hack them separately (or use the better
  structures for everything).
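  As a reference point for the cost estimate above, a minimal Haskell sketch
  (placeholder names, not actual ThreadScope code) of the exact per-interval
  percentile, which needs the full raw data of the interval and a sort:

    import Data.List (sort)

    -- Exact q-th percentile (0 <= q <= 1) of the raw samples of one interval.
    -- The sort is the source of the extra O(n log n) cost mentioned above,
    -- and the result cannot be assembled from percentiles of sub-intervals:
    -- e.g. the median of [1,2,3,100] is not recoverable from the medians of
    -- [1,2] and [3,100].
    exactPercentile :: Double -> [Double] -> Double
    exactPercentile _ [] = error "exactPercentile: empty interval"
    exactPercentile q xs = sort xs !! k
      where
        k = min (length xs - 1) (floor (q * fromIntegral (length xs)))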
- use the trick of calculating percentiles from all the raw data, but after
  quantizing it into a histogram, to make it tractable (a sketch appears
  further below); the trick is orthogonal to changing the data representation
  to multi-level array aggregates, but with the old data representation it may
  be too slow
- resample the data (morally) uniformly, unless the sampling is changed from
  GC time points to equal intervals (note that with resampling, with enough
  extra inserted sample points, the median approaches the mean, so calculating
  the median for pool size does not make sense; however, percentiles still
  make sense --- they are not just mean*(n/m), e.g., for y=5 all percentiles
  are 5, while for y=x from 0 to 10 they differ, regardless of sample density
  and uniformity, except in trivial cases)
- remove the grey halo in trivial cases of pool size, like the line ___/----,
  but keep min/max for ___/\___
- make sure the user understands that the _areas_ are proportional to the
  total number and the curve points to rates, and that the green area in
  spark creation and the total area in spark conversion are equal; perhaps
  say so in the help tab or in the timeline text summary tab
- use the same aggregate style for the activity graph, to see min/max (the
  green area does convey the total runtime, so perhaps mark min just as a
  line, with no grey shadow); or just redo it completely like the pool graphs
- test, in particular the quality of sampling, on the parfib suite; ideally
  generate sampled events from accurate events and compare visually and/or
  numerically
- either change scale together with zoom level (and keep a coloured box
  showing how much picture space corresponds to how many sparks (this should
  probably be constant)), or (as implemented currently) start with the best
  scale for the complete view, let the peaks be clipped, and let the user
  manually readjust the scale afterwards
- limit the use of save/restore and similar crude hacks
- at zoom out, activity and spark graphs seem cut off at the ends, which is
  mathematically correct, but one more slice or even pixel on each end would
  make it look better, without compromising correctness too much (just make
  sure the extra space does not grow with zoom level!); the sparks and
  Activity are already rendered with (up to?) one extra slice at the ends,
  but somehow this does not show (perhaps one is not enough)
- perhaps enlarge the main timeline canvas to the right, up to the nearest
  tick
- perhaps draw the trace labels vertically and reduce the size of the Y axis
  area
- click and drag the view (or the selected interval, perhaps shift-click or
  control-click when starting to drag)
- draw the detailed view in the background
- bookmarks
  - save
  - measure the time between two markers
- search for events in the event window, also filter events using regexps;
  alternatively, adding more categories of events to the RTS could help, so
  that the user can enable only the needed events
- indicate when one thread wakes up another thread, or a thread is migrated;
  perhaps draw lines/arrows between the events?
- event list view
  - respond to page up / page down / arrows (FOCUS)
  - interact with bookmarks
- left pane for
  - bookmarks
  - list of traces
  - traces would be a tree-view, e.g.
    * HECs
      * 0
      * 1
      etc.
    * Threads
      * 0
      * 1
      etc.
    * RTS
      * live heap
    * ETW / Linux performance counters
      * cache misses
      * stalls
      * etc.
- some way to reorder the traces? dragging and dropping?
- when rendering threads, we want some way to indicate the time between
  creation and death
  - colour grey?
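A minimal sketch of the percentile quantization trick mentioned at the top of
this list (the names and the equal-width bucketing are assumptions for
illustration): build a histogram of the raw samples once, then read
approximate percentiles from cumulative bucket counts, so each percentile
query costs only O(number of buckets) instead of a sort of the raw data:

    import qualified Data.Map.Strict as M

    type Histogram = M.Map Int Int   -- bucket index -> sample count

    -- Quantize raw samples into nBuckets equal-width buckets over [lo, hi].
    quantize :: Int -> Double -> Double -> [Double] -> Histogram
    quantize nBuckets lo hi xs =
      M.fromListWith (+) [ (bucket x, 1) | x <- xs ]
      where
        width    = (hi - lo) / fromIntegral nBuckets
        bucket x = max 0 (min (nBuckets - 1) (floor ((x - lo) / width)))

    -- Approximate q-th percentile (0 <= q <= 1): walk the cumulative counts
    -- and return the midpoint of the bucket where the q-th sample falls.
    approxPercentile :: Int -> Double -> Double -> Double -> Histogram -> Double
    approxPercentile nBuckets lo hi q hist = go 0 (M.toAscList hist)
      where
        total   = sum (M.elems hist)
        target  = max 1 (ceiling (q * fromIntegral total)) :: Int
        width   = (hi - lo) / fromIntegral nBuckets
        mid i   = lo + (fromIntegral i + 0.5) * width
        go _ [] = hi
        go acc ((i, c) : rest)
          | acc + c >= target = mid i
          | otherwise         = go (acc + c) rest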
- a better progress bar
- animate zoom level transitions: find ways to make zoom in/out less confusing
  for users (e.g. the sudden appearance of spikiness once thresholds are
  crossed); animating the transitions would make it clearer: generate the
  bitmap of the higher-resolution new view and animate it expanding to cover
  the new view; it'd be quick since it's just bitmap scaling, and the user can
  see the region in the old view that is expanding out to become the new view
- let the user set the interval size/scale, as an advanced option,
  _separately_ for each graph stack, each HEC, each visible region at the
  current zoom level, each selected region (if they are implemented)
- a button for vertical zoom (clip graphs if they don't fit at that zoom)
  and/or select regions of time in the view and zoom only that region of the
  display
- an option to automatically change scale at zoom in, taking into account only
  the visible and smoothed part of the graphs
- label coloured areas with a mouseover, according to what they represent
- move the text extents stuff out of drawing ThreadRuns
- overlay ETW events
- integrate strace events into the event view on Linux (note that Linux perf
  events are already available in TS)
- colour threads differently
- thread names rather than numbers
- live heap graph
- summary analysis
- perhaps draw the graphs even if only the fully-accurate, per-spark events
  are present in the logs (by transforming them to the sampled format)
- and/or extrapolate data for large zoom levels with scarce samples using
  slanted lines, not just horizontal lines (which work great with numerous
  enough samples, though)
- merge adjacent 0 samples: if we have equal adjacent samples, we just keep
  the second/last; this can be done while the data is still a list, before the
  tree is built; a simple linear pass, and lazy too (a sketch follows below);
  it does not make sense in per-spark events mode
- use the extra info about when the spark pool gets full and empty; we know
  when the spark pool becomes empty because we can observe that the threads
  evaluating sparks terminate; similarly, we know overflow only occurs when
  the pool is full; but note the following: "dcoutts: also noticed that we
  cannot currently reliably discover when the spark queue is empty. I had
  thought that "the" spark thread terminating implied that the queue is empty,
  however there can be multiple spark threads and they can also terminate when
  there are other threads on that capability's run queue (normal forkIO work
  takes precedence over evaluating sparks)"
- consider making the graphs more accurate by drilling down the tree to the
  base events at each slice border: currently they go down to the base events
  only at the two viewport borders, not at each slice boundary (hence the
  extra slices at the ends dcoutts noticed). In the current state this
  generally results in smoothing the curve, but a side-effect is that the
  graphs grow visually higher (the max is higher) as the zoom level increases.
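A minimal sketch of the linear, lazy pass from the "merge adjacent 0 samples"
item above (the type and function names are placeholders, not the actual
ThreadScope types):

    import Data.Word (Word64)

    type Timestamp = Word64
    type Sample    = (Timestamp, Double)   -- (time, pool size); placeholder

    -- Collapse runs of adjacent samples with equal values, keeping the last
    -- sample of each run: a single lazy pass over the list, run before the
    -- aggregate tree is built.
    mergeEqual :: [Sample] -> [Sample]
    mergeEqual (a@(_, va) : b@(_, vb) : rest)
      | va == vb  = mergeEqual (b : rest)   -- drop the earlier equal sample
      | otherwise = a : mergeEqual (b : rest)
    mergeEqual xs = xs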
- or increase the accuracy by dividing increments not by the slice length
  (implicitly), but by the total length of the tree node spans that cover the
  slice, similarly to the gnuplot graphs
- perhaps we should change the spark event sampling to emit events every N
  sparks rather than at GC; but only if experiments (do more accurate sampling
  and compare the general look of the graphs) show that linear extrapolation
  of the GC data is not correct (large spark transitions happen in the long
  periods between GCs and we don't know exactly when) (the 4K pool size
  guarantees that, at least with large visible absolute spark transitions, the
  invisible transitions can't be huge in proportion to the visible ones, so
  linear extrapolation is then correct)
- use an adaptive interval, depending on the sample density in the viewed
  fragment
- perhaps, depending on sample density, alternate between raw data, min/max,
  and percentiles; so the raw data line slowly explodes into a band, a big
  smudge, like a string of beads, that gets even more detailed and perhaps
  wider when data density/uniformity allows it; in other words: a thin line
  means we can only guess that's where the data might be; a thicker one, with
  min/max, means we have some data, but too irregular/scarce to say more; and
  a full-thickness line, with percentiles, means we have enough data, or
  evenly distributed enough, to say it all
- have *configurable* colours, like in Eden; where to save them?
- Eden TV has a good visualisation for messages between processes & nodes
  (steal it when we do more work on Cloud Haskell)
- show the time corresponding to eventlog buffer flushing, once an event for
  this is introduced
- verify the correctness of the input eventlog before visualizing; define a
  method of passing the error report from ghc-events to TS and displaying it;
  have a grace period during which this can be disabled via an option, until
  the RTS and validation stabilize
- perhaps the far left and far right bars on the histogram should be in red,
  since extreme spark sizes are usually bad (or mark them in some other way)
- or colour the stacked bars of the histogram according to strategy label
- determine which bars represent too-small and too-large sparks based on an
  estimate of spark overhead divided by spark duration, etc.
- dynamic strategy labelling (lots of stuff has to be done elsewhere first)
- mousing over or clicking a bar should say "1.3s worth of sparks (28 total)
  that took between 16--64ms to evaluate"; people want to know the number of
  sparks as well as the total eval time, in particular kosmikus wants to know
  this (a sketch of such a label follows below)
- either show "Loading" when calculating the spark duration histogram for a
  selected range on the timeline graph, or store precomputed data in the data
  structures for speedy drawing of the main timeline graphs, however they are
  (re)implemented in the end (mipmaps or trees or anything)
- manage histogram interval selection as gimp does: drag to create a
  selection, then move the selection, etc.
- extend the selection stuff, e.g. keep bookmarks sorted and let the user
  select multiple bookmarks, which would then give a range selection
- add a new info / stats tab giving numeric info about the current selection,
  e.g., the amount of CPU time used on each HEC, mutator, GC, also sparks,
  basically all the things that ./prog +RTS --blah can give you, but for any
  time period, not just the whole run
- perhaps the current "reset zoom" button could change the x and y scale of
  the timeline and histogram to fit the current region (which initially, and
  most of the time later on, is just the whole eventlog)?
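A minimal sketch of the mouse-over label proposed above, assuming (these
names and the Bucket record are hypothetical) that each histogram bar carries
its duration bounds, spark count and total evaluation time:

    import Text.Printf (printf)

    -- Hypothetical per-bar data: lower/upper duration bounds in seconds,
    -- number of sparks, and their total evaluation time in seconds.
    data Bucket = Bucket
      { bucketLo    :: Double
      , bucketHi    :: Double
      , bucketCount :: Int
      , bucketTotal :: Double
      }

    -- barLabel (Bucket 0.016 0.064 28 1.3) ==
    --   "1.3s worth of sparks (28 total) that took between 16ms and 64ms to evaluate"
    barLabel :: Bucket -> String
    barLabel b =
      printf "%.1fs worth of sparks (%d total) that took between %.0fms and %.0fms to evaluate"
             (bucketTotal b) (bucketCount b) (bucketLo b * 1000) (bucketHi b * 1000)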
- perhaps we need a floating "resize me" button to pop up on the graph
  whenever the current view ought to be resized, e.g. clipping at the top, or
  the tallest bar taking up less than half the vertical space; or put that in
  red letters into the popups that we'll have at some point; or mark the upper
  or lower edge of the graph red (depending on whether it's clipped or too
  small) and blink the y-zoom reset button (to the right of the x-zoom reset
  button) in red, too
- select a region and display it in a separate tab (e.g. next to the events
  tab)
- scroll around the graph image via a small zoomed-out window, see "The
  Information Mural: A Technique for Displaying and Navigating Large
  Information Spaces"
- zoom the histogram for granularity: increase the number of bars, keeping the
  min/max unchanged; this requires a log with a base other than 2 (see the
  bucketing sketch below)
- zoom the histogram for detail: keep the number of bars the same, but change
  the scale to only show sparks between a narrower min/max; for this log base
  2 is enough, with a constant offset
- compare any of the graphs for the whole time and for the interval: either by
  opening/detaching a window/pane dedicated to the interval data, or by a
  permanent tab in the bottom pane with selectable traces that only ever show
  the data from the selected interval, where zooming it changes the interval
  in the upper pane (but the histogram should not be among those traces,
  because it does not scroll, does not have the time X axis, and generally
  does not fit and is confusing)
- perhaps use the tab with traces to guide what is exported to PDF/PNG; also,
  since the histogram is not among the traces, use some other UI element to
  specify whether the histogram should be shown (currently disabled
  permanently)
- events in the info pane should be searchable, and possibly numbered
- selecting a range of events (via a shift-click) in the info pane should
  create a selection, as might be done with the mouse
- select and copy events from the info pane to the OS clipboard
- perhaps, on the X scale of the timeline, show only offsets, to save space:
  that is, only give the full time for the left point of the view and all
  other times relative to that, e.g., 2.356 s, +1ms, +2ms, etc.
- and add a way to add a bookmark for a user-specified event, or a kind of
  event
- medium-term: let the user configure which instant events are shown in the
  trace
- medium-term: user-defined visualisation frequency for the new trace (at
  least) and any other tricks needed to visualize instant events well (show
  density even when many events fall at almost the same spot and so the
  shades-of-green cue is not possible) and fast (in particular, don't draw
  many lines at the same spot); some random user feedback on the "lots of
  events" use case of traceEvents:
  1. I don't need/use bookmarking (I can use the visual clues to jump), or
     even grep the event log
  2. I find the visualization helpful, but disturbing, because it hides other
     information
  3. it would be great to be able to visually distinguish different "messages"
  4. it would be nice if drawing wouldn't slow down significantly even for a
     relatively small number of events (i.e., 100 is already a noticeable
     slowdown)
- make TS useful for par-monad: generate and visualize user code / library
  events, make the same kind of granularity histogram as for sparks, etc.
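A minimal sketch of the bucket indexing behind the two histogram-zoom items
above (an assumed formula, not the current ThreadScope code): with nBars bars
covering fixed bounds [lo, hi] on a logarithmic scale, the bar for a spark
duration d is found with a logarithm whose base depends on nBars, so bases
other than 2 appear as soon as the bar count changes while the bounds stay
fixed.

    -- Bucket index for a spark duration d, with nBars bars covering the
    -- fixed range [lo, hi] on a logarithmic scale (hypothetical helper).
    -- When nBars equals the number of powers of 2 between lo and hi, this is
    -- the usual log-2 bucketing; more bars over the same range means the
    -- base drops below 2.
    bucketIndex :: Int -> Double -> Double -> Double -> Int
    bucketIndex nBars lo hi d =
      max 0 (min (nBars - 1) (floor (logBase base (d / lo))))
      where
        -- base chosen so that lo * base ^ nBars == hi
        base = (hi / lo) ** (1 / fromIntegral nBars)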
- make TS useful for concurrent code: forkIO concurrency events, properly
  implement the event merging for the cluster use case
- make TS useful for sequential code, in addition to showing the GC pattern
- make TS useful for the Cloud Haskell use cases
- add finite-state machines for the GC and other events (after new events are
  added and/or extended) and use validation profiles to compute some of the
  summary data that is calculated by hand right now
- another possibility for visualizing sparks when there are not many of them
  would be to show an activity graph purely for spark evaluation, i.e., only
  counting those threads that are doing spark evaluation
- a button to snap the scale to a power of 10; that'd solve the following:
  "I need a way to show two different eventlog files with the same time scale.
  The attached picture was generated from two different files, but I couldn't
  find a way to make the time scale consistent. I expect this is because the
  'zoom' button scales the plot based on the total length of the eventlog
  file, instead of using a fixed pixels/second ratio, which is what I really
  need." Alternatively, have a dialog to set the time scale explicitly, as on
  an electronic oscilloscope, but in our case, with zoom levels that
  multiply/divide the scale by 2, detailed settings are rather irrelevant.
- another idea inspired by an electronic oscilloscope: folding a graph at all
  occurrences of an event and showing the overlaid plot. Given a regex
  matching events named, e.g., "compute iteration N start", find all instances
  of this event within a user-defined time interval, and overlay the plots on
  top of each other. Ideally, each plot would be drawn partially transparent,
  so the colour would be brighter where most instances were active at the same
  time since the event start. Alternatively, we could just average all the
  instances. I want to see if different iterations behave similarly. On an
  electronic oscilloscope this would be done with the "trigger" setting. On an
  oscilloscope the beam traces from left to right on the display. When it runs
  off the right-hand side it waits for a particular event in the signal before
  it starts tracing again. For a continuous signal the event would be that
  signal reaching a given threshold -- the trigger knob on the bottom
  right-hand side sets this threshold. Due to persistence of vision, you'd see
  many separate traces overlaid on the display. For ThreadScope the trigger
  should be a particular event selected with a regex.
- show a histogram of the time difference between non-overlapping event pairs
  (a sketch follows below). Specifically, if I have events named "image filter
  iteration N start" and "image filter iteration N end", I want a histogram of
  how long those iterations took.
- highlight the HEC of the current event selected in the Raw Events view (in
  the timeline)
- for small screens with many HECs, make it possible to zoom the canvas
  vertically (perhaps via M-mouse wheel?)
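A minimal sketch of the start/end pairing needed for the duration-histogram
item above (the label convention and all names are assumptions for
illustration; the user events themselves would typically come from traceEvent
messages in the eventlog). The resulting durations can then be fed to the
same bucketing used for the spark duration histogram:

    import Data.List (isSuffixOf, sortOn)
    import qualified Data.Map.Strict as M
    import Data.Word (Word64)

    type Timestamp = Word64

    -- Pair up "... start" / "... end" user events that share the same label
    -- prefix and return the duration of each pair.  Overlapping pairs with
    -- the same label are not handled in this sketch.
    pairDurations :: [(Timestamp, String)] -> [Word64]
    pairDurations evs = go M.empty (sortOn fst evs)
      where
        go _ [] = []
        go open ((t, lbl) : rest)
          | " start" `isSuffixOf` lbl =
              go (M.insert (dropSuffix " start" lbl) t open) rest
          | " end" `isSuffixOf` lbl
          , Just t0 <- M.lookup (dropSuffix " end" lbl) open =
              (t - t0) : go (M.delete (dropSuffix " end" lbl) open) rest
          | otherwise = go open rest
        dropSuffix sfx s = take (length s - length sfx) s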