GlassWire Graph is Meaningless


#1

I’ve used GlassWire a few times over the past few years but always uninstalled it because I couldn’t make sense of the graph. Something seemed not-quite-right about it. I looked into this further and have realized that the way data is presented on the graph appears to be meaningless.

First I’d like to show a case where the graph is almost correct. In the current version when I select the “5 Minutes” interval the top-left summary value changes to a “megabytes per second” reading. I can see that a new point is being added to the graph every second and the height of that point equates to the total amount of data transferred over the last second. This is a bandwidth graph and is quite useful. This graph would actually be correct if it were a bar graph. Since GlassWire is sampling total data transfer at discrete intervals it’s meaningless to show a continuous line (spline) between points. If it were a bar graph each bar would clearly show the amount of data transferred over that bar’s width.

Here’s an example of me downloading a Linux ISO. This graph shows I am downloading at my network connection’s full capacity (~4MB/s) and, line graph issue aside, is a meaningful graph:

Note how the “5 Minutes” graph doesn’t re-scale itself when you expand or contract the width of the window. This is because this graph doesn’t aggregate samples. This is a good thing.

Moving on to the other graphs, such as the “3 Hours” graph:

This is where the logic behind the graph really deteriorates into a mess. The “3 Hours” graph and all larger-scale graphs are completely meaningless. Here’s why:

Firstly, these should be bar graphs rather than line graphs as discussed earlier. For now, we’ll just imagine that they are bar graphs so we can reason more clearly about the errors in these graphs.

GlassWire appears to aggregate individual samples to generate this graph. This means the meaningful megabytes-per-second samples that you can see in the “5 Minutes” graph are added together to show an indeterminate “megabytes-per-arbitrary-unit-of-time” reading. There is no visual indication as to how much time each sample corresponds to. Looking at the top-left corner of the graph you’ll see the Y axis maximum set to an arbitrary number of megabytes or gigabytes without any unit of time component. This makes the graph even harder to interpret; the label should show the time interval used to aggregate the data. For example, instead of “10GB” it should show “10GB/30 Minutes” (depending on the width of the window and the time period).

In its current state, the “Month”, “Week”, “24 Hours”, and “3 Hours” graphs have no meaning whatsoever. The “5 Minutes” graph is incorrect (should be a bar graph) but is otherwise good.

There are several things GlassWire needs to change to make these graphs useful:

  1. Replace the line graph with a bar graph.
  2. Dispose of the aggregation logic in the “Month” to “3 Hours” views. Instead, average the 1-second samples into time periods large enough so that you show a reasonable number of bars (this would be based on the window’s width and the time interval)
  3. Always show “megabytes-per-second” in the top left corner
  4. Add a feature to allow a user to select a time period in the graph, and have GlassWire sum each bandwidth sample to get a total data usage number (in megabytes or gigabytes)
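To illustrate points 2 and 4, here is a minimal sketch of the difference between summing and averaging 1-second samples. The `bucket` function and the sample data are hypothetical and only stand in for whatever GlassWire does internally, which I can't see. The point is that summed buckets change height with the bucket width, while averaged buckets stay in megabytes per second.

```python
def bucket(samples_mb, bucket_seconds, mode):
    """Group 1-second samples (MB transferred in each second) into buckets.

    mode="sum"  -> MB per bucket (value depends on the bucket width)
    mode="mean" -> average MB/s (value does not depend on the bucket width)
    """
    out = []
    for start in range(0, len(samples_mb), bucket_seconds):
        chunk = samples_mb[start:start + bucket_seconds]
        total = sum(chunk)
        out.append(total if mode == "sum" else total / len(chunk))
    return out


# Hypothetical data: one hour of a steady 4 MB/s download.
samples = [4.0] * 3600

sum_1min = bucket(samples, 60, "sum")    # every value is 240.0 (MB per minute)
sum_5min = bucket(samples, 300, "sum")   # every value is 1200.0 (MB per 5 minutes)
avg_1min = bucket(samples, 60, "mean")   # every value is 4.0 (MB/s)
avg_5min = bucket(samples, 300, "mean")  # every value is 4.0 (MB/s)

# Point 4: total usage for a selected period is the sum of its samples.
total_first_10_min = sum(samples[0:600])  # 2400.0 MB
```

The same download shows up as 240 or 1200 when summed, depending only on the bucket width, but as 4.0 MB/s either way when averaged.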

My only guess as to why GlassWire works this way is because the graphs are more visually appealing when they have large variations in height.

I also want to raise issue with some quotes from this blog post which tried to explain how the GlassWire graph works: https://blog.glasswire.com/2016/05/09/understanding-glasswires-graph-and-scaling-how-much-data-are-you-using-on-your-pc/

“This example shows that over the past 5 minutes you have downloaded 20 KB worth of data.”

Incorrect. There is actually no way to tell how much data you have downloaded over the entire time period using the GlassWire graph. Looking at the three peaks in that graph I would estimate about 50KB to 80KB of data transfer in the last 5 minutes.

“There is also a bandwidth label at the bottom left of the window. Bandwidth means the number of bytes divided to the time of transfer. So if the 15 KB file was downloaded during 10 seconds, the bandwidth will be 15 KB per second.”

This is only a bandwidth label in the “5 Minutes” graph. In other graphs it becomes a meaningless sum of data transferred over some not-disclosed time period which is a function of the width of the window and the overall time period.

I’ll also add that the math is wrong: if a 15KB file were downloaded over 10 seconds, the bandwidth would be 1.5KB per second.
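As a quick check of that arithmetic (the function name is just for illustration):

```python
def bandwidth_kb_per_s(kilobytes, seconds):
    """Average bandwidth is the amount transferred divided by the transfer time."""
    return kilobytes / seconds


rate = bandwidth_kb_per_s(15, 10)  # 1.5 KB/s, not 15 KB/s
```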

“With regards to our graph this will mean that we have to summarize all the transferred bytes during 3 hours…”

This entire paragraph seems like a non-sequitur and refers to a 3 hour interval when the rest of the post and all screenshots refer to a 5 minute interval.

The rest of the post refers to scale recalculation without really understanding that the height of the graph is a function of the largest aggregated bandwidth sum which in turn is a function of sustained bandwidth use and of the time period selected, which, as discussed earlier, is meaningless.

It’s a bit of a concern when GlassWire’s own staff don’t understand how the graph works. GlassWire could be a great application with a few minor tweaks to the graph which I have itemized in this post. Until GlassWire makes those changes it appears to be favoring form over function.

In closing, I’d recommend searching Google Images for “bandwidth graph”. You’ll find they all follow the advice I’ve given in this post. Here’s one example: http://i.imgur.com/L7psKLA.png

I am making this long-winded because I like GlassWire and I want it to be successful.


#2

Thank you for your feedback. We’ll take this into consideration for future updates.


#3

I disagree that the “Glasswire Graph is Meaningless”.

I have had my struggles with the graphs (see the topics linked below), in part because of bugs, but really I like the main features of the current graphing and would not want to lose them.

Two main reasons for supporting the current graphs.

  • GlassWire doesn’t follow the convention of ugly and hard-to-read column graphs. Having said that, a column graph option for shorter time periods would be useful, particularly for those who like that format, want to compare the data with other programs, or depend more on peak statistics. That is why so many people are concerned about the scaling, and a column graph for bandwidth monitoring would simplify that.
  • As far as I can tell, the GlassWire graph is not a line graph but an area graph. The graph aggregation allows two different spikes/curves to be compared by the area of each. I really like that because I don’t have to spend time looking at a series of columns to work out the average. Anyway, averages/smoothing are normally shown as line graphs superimposed over bars/columns. That’s why I don’t have any problem with GlassWire defaulting to an area graph (or even a line graph).

Here are some of the past topics on these issues:




#4

Remah,

I am having a hard time understanding how you can find any value in the GlassWire graphs aside from the “5 Minutes” graph, so what I’d like to do is take one of your screenshots and ask what information you can glean from the graph. If you’d like to use a different screenshot or make new screenshots please go ahead:

This is one of your “3 Hour” graphs which I believe is meaningless. Here are a few questions for you:

  • What is the total amount of data transferred (in Megabytes) between 9:15 and 10:00?
  • What was the average network throughput (in Megabytes per second) between 9:15 and 10:00?
  • At which point was your download speed saturated (i.e. when did you reach your network’s maximum download speed)?
  • What is the significance of the main “spikes” (one at 9:45 and the other at 10:30)? Was there more total data transferred in one than the other?

#5

As I said, while I disagree that the graph is meaningless and that I wouldn’t want to lose the current graph, there are good reasons why bandwidth monitoring would be easier for many people with the more conventional column graph over short time periods. That would also make it easier to tell when a link is saturated which is one of your questions.

As for answering the other questions, you get bandwidth usage statistics from the Usage view not the Graph view. That way you can specify the start and end times for the exact period being analyzed.


#6

“I disagree that the graph is meaningless and that I wouldn’t want to lose the current graph”
“you get bandwidth usage statistics from the Usage view not the Graph view”

If you have to go to the usage statistics page to glean any information isn’t that conceding that the graph is meaningless? What information can you get from the graph?

I want to drive the point home with one more example. I’ve taken screenshots at the “3 Hours” view with the GlassWire window quite thin, then taken another screenshot with the GlassWire window expanded across two monitors then shrunk back down to size in an image editor:

This shows how GlassWire appears to integrate the underlying data then lay an approximate spline curve over it, giving a completely inaccurate and meaningless representation of the data. The area under the curve means nothing.
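Here is a rough sketch of why the window width matters. It assumes the number of plotted points grows with the window width. The point counts below are made up, and `peak_height` is a hypothetical stand-in, not GlassWire's actual code. The underlying data is identical in both cases.

```python
def peak_height(samples_mb, n_points, mode):
    """Split 1-second samples into n_points equal buckets; return the tallest bucket."""
    size = len(samples_mb) // n_points
    heights = []
    for i in range(n_points):
        chunk = samples_mb[i * size:(i + 1) * size]
        heights.append(sum(chunk) if mode == "sum" else sum(chunk) / len(chunk))
    return max(heights)


# Hypothetical "3 Hours" of data: idle except for a 10-minute burst at 4 MB/s.
samples = [0.0] * 10800
for second in range(3600, 4200):
    samples[second] = 4.0

narrow_points = 18   # assumed narrow window: 600-second buckets
wide_points = 180    # assumed wide window: 60-second buckets

narrow_sum = peak_height(samples, narrow_points, "sum")    # 2400.0
wide_sum = peak_height(samples, wide_points, "sum")        # 240.0
narrow_mean = peak_height(samples, narrow_points, "mean")  # 4.0
wide_mean = peak_height(samples, wide_points, "mean")      # 4.0
```

With summed buckets the same burst peaks ten times higher in the narrow window. With averaged buckets the peak reads 4 MB/s in both. Averaging would still blur bursts shorter than one bucket, but the unit would stay megabytes per second.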

I’ll also overlay some fake bars over one of your screenshots to show two different and valid interpretations of the graph. Note how these interpretations contradict each other, and how one interpretation shows the larger “spike” actually corresponding to lower average bandwidth and data transfer. This really drives home how the graphs mean nothing. You might look at a graph and feel like you’re seeing something valuable about your network but in reality it is truly meaningless.

Here’s “interpretation 1”, in this interpretation the larger spike corresponds to less bandwidth and less data transfer:

And here’s “interpretation 2”, where the larger spike corresponds to more bandwidth and more data transfer:

Notice how the interpretations contradict each other. Notice also that I fabricated the number of samples, these could have been literally any number of samples and the interpretations could still have been valid. You might have transferred gigabytes in each of those spikes, and you might have downloaded a factor of 10x more in the second spike. There’s no way to know. The graphs are completely meaningless. They tell us nothing and are simply misleading. GlassWire can fix this by removing the aggregation of data in the “3 Hours” and greater views and instead make them behave like the “5 Minutes” graph. Until then, you have no way of knowing what’s happening on your network looking at any of the other graphs.

Again, I’m posting this because I like GlassWire and I want to be able to use and recommend this product to other people. GlassWire is pretty close to being a useful tool. They just need to fix these graphs. I think I’ve driven the point home enough at this point. My suggestion to all GlassWire users is to stick with the “5 Minutes” graph until GlassWire solves the underlying issues with the other graphs.


#7

I definitely agree about the changing windows size being misleading. The graphs should not be different just because someone has a different size monitor:

GlassWire already provides us with a view of the data used for the graphs and we should be using that Usage view to determine whether the graph is misleading. What you have done instead is to estimate that data from the graph. That seems meaningless because your assumptions (sampling intervals, data points, etc) will determine what is contradictory.

If I get the time tomorrow, and no one has done it beforehand, then I will have a look at comparing the Usage stats with the Graph.


#8

“That seems meaningless because your assumptions (sampling intervals, data points, etc) will determine what is contradictory.”

What do you mean by that? I’m showing that it’s impossible to make any valid assumptions about the graph because of the way it is designed. The fact that I can make multiple contradictory and valid interpretations of the graph shows the issue lies in the graph, not in my assumptions.

I asked you to interpret a graph earlier and you deferred to the Usage tab. The fact that neither you nor I can correctly interpret the graph is the whole crux of the issue. I want you to understand something clearly: If you’re looking at a “spike” or an “area” in the graph, what you see bears no resemblance to the reality of what’s happening on your network connection. It’s entirely meaningless. The heights of the spikes and the overall area under the curve are essentially random, meaningless data. It’s just eye candy. I can’t make it any clearer. The only reason you haven’t realized this is you probably haven’t run a real bandwidth monitoring tool alongside it.

I would like to see what the GlassWire team think about this; and I mean to specifically address the issue of the “3 Hour” and larger graphs being meaningless. Are the points I have made sufficient for you to understand what needs to be done in the next GlassWire release? In short, make all graphs behave exactly like the “5 Minutes” graph. That’s the main thing that needs to be done for GlassWire to become a tool that can be used by people who want to see more than a pretty graph.


#9

The GlassWire graph is based on the data. The data that is available to end users is summarized in the Usage view. So to determine if the graph really is meaningless then we would need to look at the relationship between the usage data and the graph.

Another alternative would be to use another program to generate the statistics and compare them with the Graph view. If you had followed my links you would see that I have used other products like Networx.


#10

“So to determine if the graph really is meaningless then we would need to look at the relationship between the usage data and the graph.”

I agree completely. In fact I was almost going to suggest you try doing this. If you do you’ll find no meaningful way to translate the graph to the numbers seen on the usage table and I think you’ll find my earlier comments make a lot more sense when you start trying to interpret the graph. Could you please give this a go and return with your results?


#11

Hopefully, I’ll get it done tomorrow.


#12

I’m going to include my findings in a number of posts as I get the time. There was more than one thing I wanted to check.

Notes

  1. I recorded to the nearest 0.1MB to save time. This did introduce cumulative rounding errors because of many small readings just less than 0.05MB. But I can always do this again recording exact stats now that I know I can get comparable data.

  2. Graph data is for a time period which is centered on the time displayed. Here are some examples:
    14:23:30 is 2:23 to 2:24pm
    14:24:30 is 2:24 to 2:25pm
    14:25:30 is 2:25 to 2:26pm

  3. Usage data is from a start to end time so I didn’t have to work out when a period starts and ends.
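Note 2 can be sketched as a small helper that turns a displayed (centered) time into the period it covers. The function name and the date are hypothetical. Only the times of day come from the examples above.

```python
from datetime import datetime, timedelta


def period_for_label(label, width):
    """Return (start, end) of the period whose midpoint is the displayed time."""
    half = width / 2
    return label - half, label + half


# Hypothetical date; the graph shows 14:23:30 for a 1-minute interval.
label = datetime(2017, 1, 1, 14, 23, 30)
start, end = period_for_label(label, timedelta(minutes=1))
# start is 14:23:00 and end is 14:24:00, i.e. 2:23 to 2:24pm
```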

A. Check the consistency of graphs in different Graph views

Consistency between the various Graph views means that I can analyse data in a short-term graph like “3 Hours” and then come back at another time or on another day and look at the same data.

Here’s the “3 Hours” graph I worked with.

Here is the equivalent “24 hours” view. They look and act the same:

Using the “Week” view, I couldn’t get a matching graph because I couldn’t get a 3 hour time period. The best I could get was 6 hours but the graphs still look similar.

I matched data by clicking on each graph to discover the intervals being used for the time periods I selected:

  • 1 minute intervals on the “3 Hours” graph
  • 5 minute intervals on the “24 Hours” graph
  • 10 minute intervals on the “Week” graph

I summed the data as required so I had 10 minute blocks. They matched.
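The comparison amounts to something like the sketch below. The readings are invented for illustration and are not the values I recorded. Results are rounded to 0.1MB, as in Note 1.

```python
def to_blocks(readings_mb, per_block):
    """Sum consecutive readings into larger blocks, rounded to 0.1 MB."""
    return [
        round(sum(readings_mb[i:i + per_block]), 1)
        for i in range(0, len(readings_mb), per_block)
    ]


# Hypothetical readings (MB) covering the same 20 minutes in each view.
one_min = [0.1, 0.0, 0.3, 0.2, 0.0, 1.0, 0.5, 0.0, 0.1, 0.2,
           0.0, 0.0, 0.4, 0.1, 0.0, 0.2, 0.2, 0.0, 0.0, 0.3]
five_min = [0.6, 1.8, 0.5, 0.7]
ten_min = [2.4, 1.2]

from_one_min = to_blocks(one_min, 10)   # ten 1-minute readings per block
from_five_min = to_blocks(five_min, 2)  # two 5-minute readings per block

views_match = from_one_min == from_five_min == ten_min  # True for this data
```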


#13

Since you have said you are doing a multi-part post I will withhold commentary until you are done.

However I do suggest two things:

  • Always make a comparison against the “5 Minutes” graph. It’s the only one known to be valid.
  • When calculating the total data transfer, take the area under the graph rather than by summing the points. My arguments revolve around the area of the graph (i.e. what the graph looks like to the human eye) as opposed to the points which act as anchors for the spline curve.

#14

The data I matched was the reported bandwidth when you click in the graph as shown within the red highlight:

Sorry if it wasn’t clear that I wasn’t actually getting any data from the plotted lines or areas. :blush:

I simply tried to check that the graphs were logically consistent in the data they used. That also established a measure of validity for the various graphs.

I tried to make the graphs look visibly the same before I extracted all the underlying data by clicking on the graph. This way I could see the impact of different time scales:

This leads on to the next check:

B. Check how easy it is to compare Usage data with the Graph data

I took the above data from the Graph view to compare it with the Usage statistics. It is not straightforward to do, although it should be.

One problem with the Usage statistics is that one hour was the shortest time period I could get using the sliders on the shortest view, the Day view. So if I wanted Usage for 15 minute intervals I had to collect the hourly stats every 15 minutes and then derive the differences for the 15 minute intervals.

For serious analysis of actual usage, I should be able to get Usage intervals down to 1 minute as I could in the Graph view.

Another problem is having to calculate 15 minute traffic. Here’s an example of my calculation in an Excel worksheet:

The calculation is easier to start with no traffic. Any change in the traffic (in MB) would always come in the fourth (and latest) quarter. Many users probably would not find it easy to work this out and might think that there is a problem if they didn’t time-shift the usage differences to match the Graph data which also has to be manipulated to line it up.
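A minimal sketch of that calculation, not a copy of the worksheet. It assumes each Usage reading is the total for the trailing hour, that a reading is taken every 15 minutes, and that there was no traffic in the hour before the first reading. The function name and the numbers are hypothetical.

```python
def quarters_from_hourly(hourly_mb):
    """Derive 15-minute traffic from trailing-hour totals read every 15 minutes.

    Each new reading gains the latest quarter and loses the quarter from
    an hour earlier, so:
        latest = this_reading - previous_reading + quarter_that_dropped_out
    """
    quarters = []
    previous = 0.0  # assumption: no traffic before the first reading
    for i, total in enumerate(hourly_mb):
        dropped = quarters[i - 4] if i >= 4 else 0.0
        quarters.append(round(total - previous + dropped, 1))
        previous = total
    return quarters


# Hypothetical trailing-hour readings (MB), one every 15 minutes.
hourly = [1.0, 1.0, 3.5, 4.0, 6.0, 6.0]
quarters = quarters_from_hourly(hourly)
# quarters == [1.0, 0.0, 2.5, 0.5, 3.0, 0.0]
```

During the first hour the difference between readings is the latest quarter on its own, which is why starting with no traffic is easier. After that, the quarter that dropped out of the hour has to be added back.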


#15

Flipper, while I agree that I would like to see comments from Glasswire, I’m much more in agreement with Remah than with you. First of all, your bold statement that the Glasswire Graph is Meaningless is simply your exaggerated perception. I find it very useful for many reasons and hope we won’t see major changes to the existing graph resulting from your comments. (Adding different graph views may be worthwhile depending on the Glasswire perception.)

Secondly, with all your beating on this subject, there have been no “likes” nor any replies stating agreement. If you perceive the Graph is meaningless or not useful, then perhaps you should act on that basis. There have been a few others who have chosen not to use Glasswire.

I do expect that based on the dissection that you and Remah have provided, the Glasswire team will consider how the product may be improved. So I’m not interested in stopping this debate, but I do point out that you stand on a rather lonely looking island – at least until someone else finds your commentary useful.


#16

We are working on a mobile version of GlassWire currently and this version allows us to play around with the graph and make it more usable, so we appreciate all your comments and feedback. After releasing a mobile version with a revised graph we plan to come back to Win/Mac and figure out ways to revise our desktop software graph and make it more usable.

We don’t feel our graph is meaningless but we appreciate all comments about the graph and GlassWire and we’ll continue to work hard to improve. Our whole team reads and discusses the message board often.


#17

“statement that the Glasswire Graph is Meaningless is simply your exaggerated perception”

Which of the several examples I gave to support my perception are invalid?

“Secondly, with all your beating on this subject, there have been no “likes””

Trial by social media doesn’t apply to statistics nor anything for that matter.


#18

C. Check that the graph data points are drawn to match the graph data

This is simply whether there is a point on the line that represents the data when I click on the graph.

This is not the issue of whether a line chart or bar/column chart should be used.

I ran an Internet connection speed test to separate download and upload graphs.

I marked the data I read from the GlassWire graph onto the graph itself as black columns. The results were almost exact and within the tolerances expected due to rounding in GlassWire and my plotting the lines manually:

D. Other points raised by @Flipper that I agree with

@Ken_GlassWire, I’d appreciate some feedback on the following points.

The loss of usable information after the “5 Minutes” Graph
I think that I’ve finally realized one reason why @Flipper (correct me if I am wrong) thinks the longer graphs are meaningless. It is the loss of meaningful data which is why he/she says:

There appears to be no way to get data sampled at 1 second intervals after the first five minutes.

If I was analyzing bandwidth usage then I would want to be able to see the 1 second data for longer. But I presume that it is discarded once summarized and eventually only exists as hourly aggregates.

Provide an option to use a bar/column chart
As @Flipper says, the columns on the above graph are easier to read than the line and it is easier to see the sampling intervals. The latter reason alone would be a good reason to provide one or more display options:

  • bar/column chart
  • add vertical lines like this:

Display the sample interval

The ability for GlassWire to summarize data over a selected time period

But I would also add minimum, average and maximum statistics for that period.
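As a rough idea of what such a summary could report, assuming 1-second samples in MB/s were still available for the selected period (the function name and the data are hypothetical):

```python
def summarize(samples_mb_per_s, start, end):
    """Summarize the 1-second samples in [start, end) of a selected period."""
    window = samples_mb_per_s[start:end]
    return {
        "total_mb": sum(window),                # each sample covers one second
        "min_mb_per_s": min(window),
        "avg_mb_per_s": sum(window) / len(window),
        "max_mb_per_s": max(window),
    }


# Hypothetical samples; the selection covers seconds 1 to 4.
samples = [0.0, 1.0, 4.0, 4.0, 2.0, 0.0]
stats = summarize(samples, 1, 5)
# total 11.0 MB, min 1.0, average 2.75, max 4.0 MB/s
```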


#19

@Remah Yes, I was thinking that adding a bar option might be useful, but it may be easier said than done.


#20

It’s been over a year since I posted this thread. Has there been any progress by the GlassWire team in replacing the graph with something that shows meaningful information?

To my mind this should have been fixed within days of being reported. It seems unusual that a business whose premise is to display network traffic knowingly produces a graph that is practically meaningless. It also says volumes about the companies listed on the homepage that none of them bothered to check the correctness of the data being presented by the application.

I’m still ready to buy this product (yes, even a year later) as soon as someone redevelops the graph so it’s more than a pretty picture.