Friday, October 10, 2008

Stress Testing FileSure 2.0

One of the things I've come to realize about software development is that most of the hard work we do is hidden from you, much like most of an iceberg is hidden. I thought I'd tell you about one of the many tests we've been running on the new 2.0 version of FileSure.

This test created 100 unique files every second for hours on end. That's extreme - each hour we'd be creating 360,000 files. Your servers would probably only see something like that if there were a major problem. Anyway, we discovered that after several hours FileSure would run out of memory and crash. So what to do?
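If you want to picture the load, here's a rough C++ sketch of that kind of test. The directory, naming, and payload are made up for illustration and aren't what our harness actually uses.

#include <chrono>
#include <cstdint>
#include <fstream>
#include <string>
#include <thread>

// Rough sketch of the load generator described above: create 100
// uniquely named files every second, indefinitely.
int main() {
    std::uint64_t counter = 0;
    for (;;) {
        for (int i = 0; i < 100; ++i) {
            std::ofstream out("C:\\stress\\file_" +
                              std::to_string(counter++) + ".txt");
            out << "stress test payload\n";   // tiny unique file
        }
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}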

Well, we could ignore it and hope it never happened. But our years of experience tell us that we have to fix these kinds of issues.

The root problem we had to address was the consolidation store. It's an in-memory structure that collapses duplicate events to reduce auditing noise, which is one of the coolest features of FileSure. The store itself is a very efficient data structure, but in this test every file is unique, so nothing ever gets consolidated and the store just keeps growing.
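To make the idea concrete, here's a very simplified sketch of a consolidation store - not our real code, just enough to show how duplicate events collapse into counts.

#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Simplified sketch of a consolidation store: repeated events on the
// same file collapse into per-operation counters instead of piling up
// as separate records.
class ConsolidationStore {
public:
    // Record an event; duplicates just bump a counter.
    void Record(const std::string& path, const std::string& op) {
        ++files_[path][op];
    }

    // How many unique files are currently tracked in memory.
    std::size_t FileCount() const { return files_.size(); }

private:
    // file path -> (operation -> number of times it was seen)
    std::map<std::string, std::map<std::string, std::uint64_t>> files_;
};

With duplicate-heavy traffic this stays tiny, but when every file is unique the outer map gains an entry per file and never shrinks - which is exactly what the stress test exposed.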

Since other tables held indexes into the file list, we couldn't just zap the old file entries without walking through all the related structures and cleaning those up as well. What a mess.

Well, like many things in life, you find a compromise. Here is what we decided, and while we don't like it much, it works well: even running our very ugly stress test, the FileSure Service doesn't get over 100 MB.

And the solution is nice and simple: if the store has more than 100,000 files, delete it and recreate it. The downside? After 100,000 unique files have been audited, it is possible that a few events might not get consolidated correctly.
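In terms of the sketch above, the guard is just a size check before each record. The constant and wrapper name here are made up for this post.

// Cap-and-reset guard around the sketch store from earlier: once more
// than 100,000 unique files are tracked, throw the store away and
// start fresh. Memory stays bounded; the trade-off is that a few
// events right after the reset may not get consolidated.
constexpr std::size_t kMaxTrackedFiles = 100000;

void RecordWithCap(ConsolidationStore& store,
                   const std::string& path, const std::string& op) {
    if (store.FileCount() > kMaxTrackedFiles) {
        store = ConsolidationStore{};   // delete and recreate the store
    }
    store.Record(path, op);
}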

Let me know if we worried too much about this!

Referenced product details: http://www.bystorm.com/Products/FileSure
