Some thoughts and suggestions after a (highly likely) database corruption

Background:
Two days ago, GW started crashing constantly on almost any trigger. A click on the mini viewer? Yes. A click on the settings button? That too. And so on. After searching the forum, I reinstalled GW without restoring the database. Now everything runs smoothly, but the data I had collected is left behind.
Around the time of the incident, no system-level changes were made: no Windows upgrade, no driver upgrade, and no large software installation. I have no idea what is going on.
I tried to debug the dump file using windbg.exe but couldn’t load the proper symbols. I used the symbol path srv*c:\symbols*https://msdl.microsoft.com/download/symbols, but the result was always that a connection with the server could not be established.

Some thoughts and suggestions:
Based on what I see, this is a problem related to database corruption. Could you please consider these options in the future?

  1. A database integrity check.
  2. The ability to import and export the database.
    I know this one strays a little from the core paid/unpaid difference, but you could offer this function to paid users only, maybe with the same time limit.
  3. If No. 2 is not possible, then please consider this: a built-in incremental backup function, so that I can back up the database without re-copying every file every time. This could be the next version of what you already have: https://www.glasswire.com/userguide/#Backup_Settings
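
For suggestion 1, here is a minimal sketch of what an integrity check could look like, assuming the history is stored in a SQLite file (this thread does not confirm the storage format, so treat that as an assumption). `check_integrity` is a hypothetical helper name; `PRAGMA integrity_check` is SQLite's own built-in self-check. It should only be run on a copy of the database, or with the GlassWire service stopped.

```python
import sqlite3
from pathlib import Path


def check_integrity(db_path):
    """Return a list of problems found; an empty list means the check passed.

    Assumption: db_path points at a SQLite database file.
    """
    # Open read-only so the check itself cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exc:
        return [f"cannot open: {exc}"]
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Raised, for example, when the file header is damaged.
        return [f"cannot read: {exc}"]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    # SQLite reports a single row containing "ok" when nothing is wrong.
    return [] if messages == ["ok"] else messages
```

An empty list only means SQLite found no structural damage; it does not prove that the stored values themselves are correct.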

Thank you for considering.


@Ken_GlassWire, this topic is timely - thanks @euclidean - as I was about to post on this very topic because of the likelihood of database problems occurring in GlassWire.

I recently had reason to use some results of research into real-world DRAM errors. This got me thinking about GlassWire and the likelihood of DRAM errors causing database corruption.

The basic point of the research is that DRAM errors are much more prevalent than we expect.

We already know that the GlassWire database can be corrupted by disk and CPU errors. The relevance of this research for GlassWire is that GlassWire’s logging of history is, compared with other software, exceptionally exposed to memory errors that will look just like the problem reported in this topic: an unexplainable fault that requires the database to be deleted.

The solution is that GlassWire needs to have the ability to correct database errors and eliminate the undesirable requirement to delete the database to resolve corruption problems. I consider this to be the most important feature that GlassWire needs right now. Personally, I don’t care about my GlassWire data history, but so many users do want to retain their history and this feature is promoted as one of the main benefits of the GlassWire graph view.

The research

I looked at two papers:

  • A study published in 2009 looked at DRAM errors in Google datacentres using predominantly ECC DRAM. They found that in any one year about one-third of computers and more than 8% of DRAM had errors.
    DRAM Errors in the Wild: A Large-Scale Field Study

  • A 2011 study of hardware failures in one million consumer PCs, based on Windows error reports (WER) uploaded to Microsoft. The DRAM is mostly non-ECC, so the errors were from the small part of kernel memory (averaging about 1.5% of total memory available to Windows) where Windows error correction can detect changed data… These errors are only from computers that could still operate, because they had to be able to upload the error report to Microsoft.
    Cycles, Cells and Platters: An Empirical Analysis of Hardware Failures on a million consumer PCs

The research is particularly weighty because of the scale of both studies. The researchers had data from so many more computers than earlier studies. The datacentre study used hundreds of parallel processors just to preprocess the many terabytes of data:

Each one of many ten-thousands of machines in the [Google] fleet logs every ten minutes hundreds of parameters

The implications for GlassWire

I consider that GlassWire is particularly susceptible to errors in the logged history:

  • GlassWire doesn’t usually run on hardware and operating systems where ECC can detect and correct such errors.
  • GlassWire keeps a database of logged history which becomes more susceptible to memory errors. In other words, the probability of database corruption keeps increasing with time:
    • The file size increases over time which, as an aside, also increases the likelihood of a disk error corrupting it.
    • Continuous processing increases the likelihood that an error will eventually be encountered.
    • The scope of the impact increases with time. In other words, the more days of data I have, the more days will be affected by an error. And the more data I have collected, the more I will care if that history is lost.

If we calculated a rough likelihood of such errors occurring then I expect that we’d all be surprised how significant it would be for GlassWire users.
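
As a rough illustration only: if the one-third-per-year figure from the datacentre study applied to a given PC (an assumption; that figure is for Google servers), and if each year were independent of the others (also an assumption), the chance of seeing at least one memory error over n years would be 1 - (1 - p)^n. A short sketch in Python, with `prob_at_least_one` as a hypothetical helper name:

```python
def prob_at_least_one(annual_rate, years):
    """Chance of at least one error within `years`.

    Assumes each year is independent and has the same `annual_rate`.
    """
    return 1.0 - (1.0 - annual_rate) ** years


# Illustrative input: the "about one-third of computers per year" figure.
for years in (1, 3, 5):
    print(years, round(prob_at_least_one(1 / 3, years), 3))
```

Under those assumptions the figure passes two-thirds within three years. This is the chance of any memory error on the machine, not the chance that GlassWire's database is the data that gets hit, so it is at best a loose upper bound.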


Thank you for focusing on this problem. There are some updates, and my data is back (almost).
I mentioned this backup guide above: https://www.glasswire.com/userguide/#Backup_Settings
It turned out that it saved my data (not 100%, and I’m not sure everything is really fine).
I had been trying different backup software recently and ran some tests, one of which was on GW. So I recovered my data from a backup I made on Feb. 1, 2019, and lost only about one week of data.
For now it seems I have dodged the corruption, although it is still a major weakness for long-term data processing.
If anything awkward happens again, I will update this post.


Hopefully, the GlassWire team will focus on this as a higher priority.

It’s good that you could keep your history. I guess that the one-year bug that you reported got fixed so you’ve got more than a year of history now.

Wow, thank you and everyone working behind this! Can’t wait to try it.

I think you’ve misunderstood me. I’m not part of the GlassWire team so I don’t even know if they will make it a priority.

:grinning: Either way, thank you for your concern.

Shared with the team, thank you @Remah!

@Ken_GlassWire @Remah
Turned out I was so naive to say

After one week, the same old disaster happened again. This time, I had no choice but to do a clean reinstallation. :expressionless:

@euclidean

I have run GlassWire since 2014 and I have never experienced this myself yet. Do you use some ‘cleaning’ software or something that could be somehow deleting our database or accidentally corrupting it? If so please white list our directory. Our database file locations are listed here.

https://www.glasswire.com/userguide/#Backup_Settings

None of that, as far as I can remember. Maybe the database was corrupted a long time ago, which makes it impossible to establish the facts now. The behavior only started recently. I did make some backups recently, but they were performed with the GlassWire service shut down, and no warnings of any kind were shown.

There was a certain point where GlassWire switched to a different database structure. When we did that our software would have to upgrade the previous database to the new format. Perhaps this bug is due to older versions of GlassWire with that older less efficient database structure.

Our QA will continue to test and try to reproduce this. Meanwhile our entire team also runs GlassWire all the time so we’re always looking for this kind of stuff ourselves too. We use all different hardware types.

This problem only appears to be happening for your GlassWire install. That makes it difficult for the GlassWire team to reproduce the problem because there is not enough information to indicate the cause.

I think that there are two options you could consider if you have the time. The GlassWire team could say whether they think these will help, @Ken_GlassWire?

  1. Eliminate system problems as the cause:

    • Check the Windows error reporting logs.
    • Run a basic memory test.
    • Run a comprehensive disk test.
    • Run a comprehensive memory test.
  2. Change your GlassWire configuration each time that you reinstall. Statistically speaking, your current problem has a sample size of one configuration. Changing your GlassWire configuration each time that you reinstall effectively increases the sample size. It also increases the likelihood that the situation will change, i.e. the problem might disappear or more information could be revealed about the possible source of the problem. Here are three actions that would provide more information:

    • Post more details about your hardware and Windows configuration and the settings you are changing in GlassWire. Then we could see what might be different and other users could duplicate your GlassWire configuration and see if we can reproduce the same problem.
    • Revert to the default configuration, i.e. reinstall with a clean install and don’t change any settings.
    • Change your GlassWire configuration as much as possible. Maybe even move the database to a different folder or drive. Sometimes making more changes will expose the source of a problem on your system.

Thank you for the suggestion. I’ll just try the clean version. If nothing happens in the next couple of weeks, that means the problem is 100% related to my database. Otherwise, I’ll check my system manually.


My firewall settings went bad causing GlassWire to repeatedly re-prompt and block apps. GlassWire was seeing everything as new and kept adding multiple rule entries in the Windows Firewall. Initially I blamed a Windows Update, but that was a red herring.

In the end I nuked the lot and started over with all my configuration. Whilst it is useful to have the manual backup and restore steps, the process is rather clunky and hardly user friendly. This function really needs to be wrapped up into a little utility.
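
A little utility along those lines could be sketched as below. This is only an illustration, not how GlassWire does it: `incremental_backup` and `file_digest` are hypothetical names, the source and destination folders are whatever you pass in (the real locations are the ones listed in the backup guide), and the GlassWire service should be stopped before running it. It copies a file only when it is new or its contents differ from the copy already in the backup folder.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def incremental_backup(src_dir, dest_dir):
    """Copy new or changed files from src_dir into dest_dir.

    Returns the relative paths (POSIX style) of the files that were copied.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    copied = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_dir)
        dest = dest_dir / rel
        # Skip files whose backup copy already has identical contents.
        if dest.exists() and file_digest(dest) == file_digest(src):
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(rel.as_posix())
    return copied
```

Note that this overwrites the previous backup copy of a changed file, so a corrupted source would replace a good backup. Keeping a few dated backup folders rather than a single one would guard against that.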

Latest update on the issue: I tried one more time to use the original backup I mentioned, to see whether there was any chance I would get lucky.

After one month, without new backups or anything else that would touch the database files, it seems I survived this time. I just lost some weeks’ data in the whole trial-and-error process. I hope there is no “surprise” again in the future…


Is there any plan to reduce the likelihood of database corruption other than not logging some or all connections?
Or is there a plan to provide a feature to repair a corrupted database?

This issue is too important to be ignored. :persevere:

I’m also adding these links to reported problems which are only solved by deleting the corrupted database:


We are redesigning how the service/database works completely for a future major update. :+1:


I am totally in favor of making the GlassWire database backup/recovery a much simpler and more robust process! I recently had to clean install Pro due to my firewall rules going haywire.

But any time you experience a data corruption with no known cause, it wouldn’t hurt to test your hardware. @Remah made a good point earlier about this issue.

If anyone is interested in testing their RAM, there is a comprehensive memory test available for free, called MemTest86 (for both 32 and 64-bit PCs).

https://www.memtest86.com/
You can boot it from a USB flash drive and let your PC run the full test standalone on your hardware overnight. No point watching it go round and round, unless you are into that! :wink:

Troubleshooting Memory Errors
https://www.memtest86.com/troubleshooting.htm

"All valid memory errors should be corrected. It is possible that a particular error will never show up in normal operation. However, operating with marginal memory is risky and can result in data loss and even disk corruption. Even if there is no overt indication of problems you cannot assume that your system is unaffected. Sometimes intermittent errors can cause problems that do not show up for a long time. You can be sure that Murphy will get you if you know about a memory error and ignore it."
