Archiving Old Data

From Observer GigaFlow Support | VIAVI Solutions Inc.
Revision as of 12:26, 29 July 2019


Summary and Scope

This note is about how to use the new archive data feature in Observer GigaFlow.

Archiving

Overview

Observer GigaFlow is designed to fill available storage space, i.e. data storage is based on storage capacity rather than time.

On a system with unlimited storage, all data would be retained forever.

In practice, the quantity of data stored in the working database is limited by the capacity of the disk drive where the database resides. Additionally, Observer GigaFlow can set aside space on the drive for health monitoring. We recommend that this is always enabled.

As the disk drive reaches capacity, Observer GigaFlow must remove data. This is done by age, i.e. the oldest data are removed first. Until recently, when flow data was removed from the database, it was simply deleted.
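The oldest-first removal described above can be sketched in a few lines. This is an illustration only, assuming hypothetical table names and sizes, not GigaFlow's actual internals:

```python
# Sketch of oldest-first pruning: drop the oldest day's data until the
# free-space target is met. Table names and sizes are hypothetical.
from collections import OrderedDict

def prune_oldest(tables, free_space_gb, min_free_gb):
    """tables: OrderedDict of {date_string: size_gb}, oldest first."""
    removed = []
    while free_space_gb < min_free_gb and tables:
        day, size_gb = next(iter(tables.items()))  # oldest entry first
        del tables[day]            # until recently, data was simply deleted
        free_space_gb += size_gb   # space reclaimed
        removed.append(day)
    return removed, free_space_gb

tables = OrderedDict([("2019-07-01", 3.0), ("2019-07-02", 2.5), ("2019-07-03", 4.0)])
removed, free = prune_oldest(tables, free_space_gb=4.0, min_free_gb=8.0)
# removes 2019-07-01 (free=7.0), then 2019-07-02 (free=9.5)
```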

However, long term storage of flow data may be useful in some situations, e.g. to meet compliance objectives or to support investigations.

In the newest builds, we have introduced an archiving feature that allows indefinite storage of the most detailed Forensics data.

Storage Settings and the new Archiving Feature

You can find these settings in the application at System > Global > Storage Settings

In the Storage settings box, you can:

  • Enable or disable drive space monitoring.
  • Set the drive to monitor, e.g. C:/.
  • Set the minimum free space allowed (GB).
  • Set the default device storage space (GB).
  • Set the minimum forensics storage (Days), e.g. 21. After this time, flowsec records will be deleted.
  • Set the IP search storage (Days), e.g. 21. After this time, IP address history will be deleted.
  • Set the forensics table cache size, e.g. 10,000. This is the number of entries cached before writing to disk.
  • Set the forensic table cache age (milliseconds), e.g. 10,000. After 10 seconds, the forensic data is written to disk.
  • Set the forensic cache storage size, e.g. 40,000.
  • Enter forensics indexes. This is a comma-delimited list of forensics table field names, e.g. "srcadd,dstadd,appid". See Reports > Forensics in the Reference Manual for more.
  • Set the forensic rollup age (Days), e.g. 4 days, i.e. the period after which data is rolled into daily tables.
  • Set the event storage period (Days), e.g. 100 days, how long events should be recorded for.
  • Set the ARP storage period (Days), e.g. 100 days, how long ARP entries should be recorded for.
  • Set the CAM storage period (Days), e.g. 100 days, how long CAM entries should be recorded for.
  • Set the event summary storage period (Days), e.g. 200 days.
  • Set the interface summary storage period (Days), e.g. 200 days.
  • Enable or disable Auto Tune of the Postgres database. Yes or No.
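The forensics cache settings above (cache size, cache age in milliseconds) describe a write-behind buffer that is flushed to disk when either threshold is reached. A minimal sketch of that flush policy, as an illustration only and not GigaFlow's actual implementation:

```python
# Sketch of a size-or-age flush policy like the forensics cache settings:
# buffered entries are written to disk when the entry count reaches the
# cache size, or when the oldest buffered entry exceeds the cache age.
import time

class WriteBehindCache:
    def __init__(self, max_entries=10_000, max_age_ms=10_000, writer=print):
        self.max_entries = max_entries
        self.max_age_ms = max_age_ms
        self.writer = writer          # called with the batch on each flush
        self.buffer = []
        self.oldest_ts = None

    def add(self, record, now_ms=None):
        now_ms = now_ms if now_ms is not None else time.time() * 1000
        if self.oldest_ts is None:
            self.oldest_ts = now_ms   # timestamp of oldest buffered entry
        self.buffer.append(record)
        if (len(self.buffer) >= self.max_entries
                or now_ms - self.oldest_ts >= self.max_age_ms):
            self.flush()

    def flush(self):
        if self.buffer:
            self.writer(self.buffer)  # write the whole batch to disk
            self.buffer = []
            self.oldest_ts = None
```

With the defaults above, a batch is written either when 10,000 entries have accumulated or 10 seconds after the first entry arrived, whichever comes first.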

And, in the newest builds of Observer GigaFlow:

  • Set the archive folder location; the default is c:\temp\
  • Import existing archive(s).
  • Enable archiving.

Data Retention and Rollup and the new Archiving Feature

You can find this information in the manual at System > Global > Data Retention and Rollup

The minimum Forensics data storage is 21 days. Forensics data is the lowest level flow data stored by Observer GigaFlow. Forensics data is stored in tables for up to four hours to speed up search and reporting. These tables are rolled into one-day tables after the Forensics Rollup Age period; this is four days by default (see System > Global).
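The rollup rule above can be sketched as a simple date cutoff. The table naming here is hypothetical; only the four-day default comes from the text:

```python
# Sketch of the rollup rule: short-interval forensics tables older than
# the rollup age (4 days by default) are merged into one-day tables.
from datetime import date, timedelta

def tables_due_for_rollup(table_dates, today, rollup_age_days=4):
    """Return the dates whose short-interval tables should be rolled up."""
    cutoff = today - timedelta(days=rollup_age_days)
    return sorted(d for d in table_dates if d < cutoff)

dates = [date(2019, 7, 20) + timedelta(days=i) for i in range(8)]  # 20th-27th
due = tables_due_for_rollup(dates, today=date(2019, 7, 27))
# cutoff is 23 July, so the 20th, 21st and 22nd are due
```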

With drive monitoring enabled, additional space is set aside. GigaFlow will fill the disk drive(s) until the pre-defined minimum amount of free space is left. GigaFlow caps the storage that any particular device is using for forensics data; this is 2 GB per device by default. This cap can be changed globally and on a per device basis which in turn sets an overall cap on the amount of space used by GigaFlow.
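Because each device's forensics storage is capped (2 GB by default), the per-device caps give a rough upper bound on total forensics usage, and adding the minimum free space gives a rough drive-sizing figure. The arithmetic below is illustrative only; real usage also includes other tables and overheads:

```python
# Rough capacity estimate from the caps described above. Each device is
# capped (2 GB by default), so forensics data is bounded by the sum of
# per-device caps, and the drive keeps at least min_free_gb free.
def forensics_bound_gb(device_caps_gb):
    """Overall cap on forensics storage: sum of the per-device caps."""
    return sum(device_caps_gb)

def required_drive_gb(device_caps_gb, min_free_gb, overhead_gb=0.0):
    """Minimum drive size: capped forensics data plus reserved free space."""
    return forensics_bound_gb(device_caps_gb) + min_free_gb + overhead_gb

# e.g. 50 devices at the default 2 GB cap, 20 GB minimum free space:
caps = [2.0] * 50
# forensics_bound_gb(caps) -> 100.0; required_drive_gb(caps, 20.0) -> 120.0
```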

Type                 Resolution   Table Duration (default)  Retention  Setting Involved
Raw Flows            millisecond  1-hour to 1-day           21 days    Min Free Space, Default Device Storage Space, Min Forensics Storage, Forensics Rollup Age
IP Search            millisecond  1-day                     21 days    IP Search Duration
Events               millisecond  4-hour                    100 days   Event Storage Period, Event Summary Storage Period
ARP                  millisecond  1-day                     100 days   ARP Storage Period
CAM                  millisecond  1-day                     100 days   CAM Storage Period
Interface Summaries  Minute       2-day                     200 days   Interface Summary Storage Period
Traffic Summaries    Hour         7-day                     200 days   Interface Summary Storage Period

With the new archiving feature enabled, old data can be exported to an archive before being removed from the working database, i.e. when device or drive storage limits have been reached.
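The export-then-remove behaviour can be sketched as follows. The folder layout, file naming, and JSON format here are hypothetical illustrations, not GigaFlow's actual archive format:

```python
# Sketch of archive-before-delete: when storage limits are hit, the
# expiring data is written to an archive file first, then removed from
# the working database. Naming and format are hypothetical.
import json, os, tempfile

def archive_then_remove(working_db, day, archive_dir):
    """Export one day's records to its own archive file, then delete them."""
    records = working_db.pop(day)               # remove from working database
    path = os.path.join(archive_dir, f"archive-{day}.json")
    with open(path, "w") as f:
        json.dump(records, f)                   # each export is its own file
    return path

db = {"2019-07-01": [{"srcadd": "10.0.0.1", "dstadd": "10.0.0.2"}]}
with tempfile.TemporaryDirectory() as d:
    p = archive_then_remove(db, "2019-07-01", d)
    # the day is gone from the working db and present in the archive file
```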

Storage Notes and Requirements

For performance and stability, we recommend that the data archive is stored on a different disk drive to the main working database.

The archive location must have sufficient storage space.

Additionally, we recommend that the archive drive is backed-up periodically.

Each data export is stored as a separate archive file.

The archive feature can be used together with, or separately from, other archive arrangements. For example, the Observer GigaFlow working database may be located on storage that supports 'snapshot' features.

Finally, the archived flow data retains all 'enrichment' provided by Observer GigaFlow during its initial processing.

Importing Archive Data

Archive data can be reimported into Observer GigaFlow. Reinstated archive data is restored to the database and clearly labelled. Reinstated archive data is sticky, i.e. it is not automatically removed from the database and must be manually deleted.

For this reason, we recommend that a separate Observer GigaFlow instance is used to work with retrieved archive data.

Summary

In summary, the new archive feature provides an efficient way to store Observer GigaFlow data indefinitely.